How caching and CDNs work

Fastly is a content delivery network, or CDN. CDNs work on the principle that once a piece of content has been generated, it doesn't need to be generated again for a while, so a copy can be kept around in a cache. Cache machines are optimized to serve small resources very quickly, and CDNs typically place caches in data centers all around the world. When a user requests information from a customer's site, the request is actually routed to the set of cache machines closest to the user instead of to the customer's own servers. This means that a European user visiting an American site gets their content anywhere from 200 to 500 ms faster. CDNs also minimize the effects of a cache miss, which occurs when a user requests a piece of content that isn't in the cache at that moment (because it expired, because no one has asked for it before, or because the cache got too full and old content was evicted).
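Header names vary by CDN, but the hit/miss distinction is usually visible in the response. As an illustrative sketch (the X-Cache header value and URL here are assumptions for the example), two successive requests for the same resource might look like:

```http
# First request: the nearest cache has no copy and fetches from the origin (slower)
GET /logo.png HTTP/1.1
Host: www.example.com

HTTP/1.1 200 OK
X-Cache: MISS

# Second request shortly after: served straight from the edge cache (faster)
HTTP/1.1 200 OK
X-Cache: HIT
```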

What can be cached?

CDNs are good at managing a cache of small, static resources (for example, static images, CSS files, JavaScript files, and animated GIFs). CDNs are also popular for offloading expensive-to-serve files like video and audio media.

At Fastly, our architecture (known as a reverse proxy) is designed to enable customers to go a step further and cache entire web pages for even more efficient handling of your traffic.

Managing the cache

Caching serves as a powerful weapon in your make-the-site-faster arsenal. However, most objects in your cache aren't going to stay there permanently. They'll need to expire so that fresh content can be served. How long content should stay in the cache might be mere seconds, a few minutes, or even a year or more.

How can you manage which of your content is cached, where, and for how long? By setting policies that control the cached data. Most caching policies are implemented as a set of HTTP headers sent with your content by the web server (as specified in its configuration or by the application). These headers were designed with the client (the browser) in mind, but CDNs like Fastly also use them as a guide for caching policy.


Expires

The Expires header is the original cache-related HTTP header. It tells the cache (typically a browser cache) how long to hang onto a piece of content; thereafter, the browser will re-request the content from its source. The downside is that it's a static date: if you don't update it, the date will pass and the browser will start requesting that resource from the source every time it sees it.
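For example, a response might pin a resource's freshness to a fixed date (the date below is, of course, purely illustrative):

```http
HTTP/1.1 200 OK
Content-Type: text/css
Expires: Thu, 01 Jan 2026 00:00:00 GMT
```

Because Expires takes an absolute HTTP-date rather than a relative lifetime, it goes stale permanently if never updated.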

Fastly will respect the Expires header value only if no Surrogate-Control or Cache-Control headers are found in the response.


Cache-Control

The Cache-Control header (introduced in the HTTP/1.1 specification) covers browser caches and, in most cases, intermediate caches as well, as defined by section 5.2 of RFC 7234:

  • Cache-Control: public - Any cache can store a copy of the content.
  • Cache-Control: private - Only the user's browser cache may store the content; shared caches must not.
  • Cache-Control: no-cache - Re-validate with the origin before serving a cached copy of this content.
  • Cache-Control: no-store - Never store this content in any cache.
  • Cache-Control: public, max-age=[seconds] - Caches can store this content for the given number of seconds.
  • Cache-Control: s-maxage=[seconds] - Same as max-age but applies only to shared (proxy) caches.

Only the max-age, s-maxage, and private Cache-Control directives influence Fastly's caching. All other Cache-Control directives do not, but are passed through to the browser. For more in-depth information about how Fastly responds to these directives and how they interact with Expires and Surrogate-Control, check out our documentation on cache freshness.
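A few common combinations illustrate how these directives work together (the lifetimes below are arbitrary examples):

```http
# Any cache, including Fastly, may store this for one hour
Cache-Control: public, max-age=3600

# Browsers cache for one minute; shared caches like Fastly for one day
Cache-Control: max-age=60, s-maxage=86400

# Per-user content: the browser may cache it briefly, but Fastly will not store it
Cache-Control: private, max-age=300
```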

Surrogate Headers

Surrogate headers are a relatively new addition to the cache management vocabulary, described in a W3C tech note. These headers provide a cache policy specifically for proxy caches in the processing path. Surrogate-Control accepts many of the same values as Cache-Control, plus some more esoteric ones (read the tech note for the full list of options).

One use of this technique is to send conservative cache instructions to the browser (for example, Cache-Control: no-cache). This causes the browser to re-validate with the source on every request, ensuring the user gets the freshest possible content. Simultaneously, a Surrogate-Control header with a longer max-age lets a proxy cache in front of the source handle most of the browser traffic, only passing requests to the source when the proxy's copy expires.
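A sketch of that split policy, with illustrative values, might look like this on a response from the origin (comments added for annotation):

```http
HTTP/1.1 200 OK
# Browsers must re-validate with the source on every request...
Cache-Control: no-cache
# ...but a proxy cache may serve this copy for a full day
Surrogate-Control: max-age=86400
```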

With Fastly, one of the most useful surrogate headers is Surrogate-Key. When Fastly processes a response and sees a Surrogate-Key header, it uses the space-separated value as a list of tags to associate with the cached URL. Combined with Fastly's Purge API, an entire collection of URLs can be expired from the cache in a single API call (which typically completes in around 1 ms).
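For example, an origin response might tag a page with several keys, and a later purge-by-key call would invalidate every URL sharing that tag (the key names are invented for this sketch, and the service ID and token are placeholders):

```http
# Origin response: this URL is now associated with three tags
HTTP/1.1 200 OK
Surrogate-Key: article-123 author-42 homepage

# Later: expire every cached URL tagged author-42 in one call
POST /service/<service_id>/purge/author-42 HTTP/1.1
Host: api.fastly.com
Fastly-Key: <api_token>
```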

Fastly and Cache Control Headers

Fastly looks for caching information in each of these headers as described in our documentation on cache freshness. In order of preference:

  • Surrogate-Control
  • Cache-Control: s-maxage
  • Cache-Control: max-age
  • Expires
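So if a response happens to carry all of these headers (the values below are illustrative), only the most preferred one determines Fastly's TTL:

```http
HTTP/1.1 200 OK
Surrogate-Control: max-age=3600         # Fastly caches for one hour (wins)
Cache-Control: max-age=60               # browsers still cache for one minute
Expires: Thu, 01 Jan 2026 00:00:00 GMT  # ignored by Fastly when the above are present
```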

HTTP status codes cached by default

Fastly caches the following response status codes by default. In addition to these, you can force an object to cache for other status codes using conditions and responses.

Code Message
200 OK
203 Non-Authoritative Information
300 Multiple Choices
301 Moved Permanently
302 Moved Temporarily
404 Not Found
410 Gone

To cache status codes other than those listed above, set beresp.cacheable = true; in vcl_fetch. This tells Varnish to obey backend HTTP caching headers and any other custom TTL logic. A common pattern is to make all 2XX responses cacheable:

sub vcl_fetch {
  # ...
  if (beresp.status >= 200 && beresp.status < 300) {
    set beresp.cacheable = true;
  }
  # ...
}


When an object or collection of objects in the cache expires, the next time any of those objects are requested, the request is going to get passed through to your application. Generally, with a good caching strategy, this won't break things. However, when a popular object or collection of objects expires from the cache, your backend can be hit with a large influx of traffic as the cache nodes refetch the objects from the source.

In most cases, the object being fetched is not going to differ between requests, so why should every cache node have to get its own copy from the backend? With shield nodes, they don't have to. Shielding, configured through the Fastly web interface, allows you to select a specific data center (ideally one geographically close to your application) to act as a shield node. When objects in the cache expire, the shield node is the only node that gets the content from your source application. All other cache nodes fetch from the shield node, reducing source traffic dramatically.
