Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.1k views
in Technique[技术] by (71.8m points)

amazon web services - Cloudfront TTL not working

I'm having a problem and tried to follow answers here in forum, but with no success whatsoever.

In order to generate thumbnails, I have set up the following schema: S3 Account for original images Ubuntu Server using NGINX and Thumbor Cloudfront

The user uploads original images to S3, which will be pulled through Ubuntu Server with Cloudfront in front of the request:

http://cloudfront.account/thumbor-server/http://s3.aws...

The big deal is, that we often loose objects in Cloudfront, I want them to stay 360 days in cache. I get following response through Cloudfront URL:

Cache-Control:max-age=31536000
Connection:keep-alive
Content-Length:4362
Content-Type:image/jpeg
Date:Sun, 26 Oct 2014 09:18:31 GMT
ETag:"cc095261a9340535996fad26a9a882e9fdfc6b47"
Expires:Mon, 26 Oct 2015 09:18:31 GMT
Server:nginx/1.4.6 (Ubuntu)
Via:1.1 5e0a3a528dab62c5edfcdd8b8e4af060.cloudfront.net (CloudFront)
X-Amz-Cf-Id:B43x2w80SzQqvH-pDmLAmCZl2CY1AjBtHLjN4kG0_XmEIPk4AdiIOw==
X-Cache:Miss from cloudfront

After a new refresh, I get:

Age:50
Cache-Control:max-age=31536000
Connection:keep-alive
Date:Sun, 26 Oct 2014 09:19:21 GMT
ETag:"cc095261a9340535996fad26a9a882e9fdfc6b47"
Expires:Mon, 26 Oct 2015 09:18:31 GMT
Server:nginx/1.4.6 (Ubuntu)
Via:1.1 5e0a3a528dab62c5edfcdd8b8e4af060.cloudfront.net (CloudFront)
X-Amz-Cf-Id:slWyJ95Cw2F5LQr7hQFhgonG6oEsu4jdIo1KBkTjM5fitj-4kCtL3w==
X-Cache:Hit from cloudfront

My Nginx responses as following:

Cache-Control:max-age=31536000
Content-Length:4362
Content-Type:image/jpeg
Date:Sun, 26 Oct 2014 09:18:11 GMT
Etag:"cc095261a9340535996fad26a9a882e9fdfc6b47"
Expires:Mon, 26 Oct 2015 09:18:11 GMT
Server:nginx/1.4.6 (Ubuntu)

Why does Cloudfront not store my objects as indicated? Max-Age is set? Many thanks in advance.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Your second request shows that the object was indeed cached. I assume you see that, but the question doesn't make it clear.

The Cache-Control: max-age only specifies the maximum age of your objects in the Cloudfront Cache at any particular edge location. There is no minimum time interval for which your objects are guaranteed to persist... after all, Cloudfront is a cache, which is volatile by definition.

If an object in an edge location isn't frequently requested, CloudFront might evict the object—remove the object before its expiration date—to make room for objects that are more popular.

http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html

Additionally, there is no concept of Cloudfront as a whole having a copy of your object. Each edge location's cache appears to operate independently of the others, so it's not uncommon to see multiple requests for relatively popular objects coming from different Cloudfront edge locations.

If you are trying to mediate the load on your back-end server, it might make sense to place some kind of cache that you control, in front of it, like varnish, squid, another nginx or a custom solution, which is how I'm accomplishing this in my systems.

Alternately, you could store every result in S3 after processing, and then configure your existing server to check S3, first, before attempting the work of resizing the object again.


Then why is there a documented "minimum" TTL?

On the same page quoted above, you'll also find this:

For web distributions, if you add?Cache-Control?or?Expires?headers to your objects, you can also specify the minimum amount of time that CloudFront keeps an object in the cache before forwarding another request to the origin.

I can see why this, and the tip phrase cited on the comment, below...

The minimum amount of time (in seconds) that an object is in a CloudFront cache before CloudFront forwards another request to your origin to determine whether an updated version is available.?

...would seem to contradict my answer. There is no contradiction, however.

The minimum ttl, in simple terms, establishes a lower boundary for the internal interpretation of Cache-Control: max-age, overriding -- within Cloudfront -- any smaller value sent by the origin server. Server says cache it for 1 day, max, but configured minimum ttl is 2 days? Cloudfront forgets about what it saw in the max-age header and may not check the origin again on subsequent requests for the next 2 days, rather than checking again after 1 day.

The nature of a cache dictates the correct interpretation of all of the apparent ambiguity:

Your configuration limits how long Cloudfront MAY serve up cached copies of an object, and the point after which it SHOULD NOT continue to return the object from its cache. They do not mandate how long Cloudfront MUST maintain the cached copy, because Cloudfront MAY evict an object at any time.

If you set the Cache-Control: header correctly, Cloudfront will consider the larger of max-age or your Minimum TTL as the longest amount of time you want them to serve up the cached copy without consulting the origin server again.

As your site traffic increases, this should become less of an issue, since your objects will be more "popular," but fundamentally there is no way to mandate that Cloudfront maintain a copy of an object.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...