Improving caching performance with large files

  Last updated August 09, 2018

Fastly provides two features to enhance performance specifically for large files up to 5GB: Streaming Miss and Large File Support.

Streaming Miss

When fetching an object from the origin, Streaming Miss ensures the response is streamed back to the client immediately and is written to cache only after the whole object has been fetched. This reduces the first-byte latency, which is the time that the client must wait before it starts receiving the response body. The larger the response body, the more pronounced the benefit of streaming.


To enable Streaming Miss in VCL, set beresp.do_stream to true in vcl_fetch:

  sub vcl_fetch {
    # Stream the response to the client while it is still being fetched.
    set beresp.do_stream = true;
  }

The same can be achieved by creating a new header of type Cache, action Set, destination do_stream and source true (this can, of course, be controlled with conditions).
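
As a minimal sketch of the condition-controlled case, the VCL below turns on Streaming Miss only for requests whose URL ends in one of a few file extensions. The extension list is an illustrative assumption, not a Fastly recommendation, and the fragment is meant to be merged into an existing vcl_fetch rather than used as a complete subroutine.

```vcl
sub vcl_fetch {
  # Assumption: only these extensions are worth streaming on this service.
  if (req.url ~ "\.(iso|mp4|zip)(\?.*|)$") {
    set beresp.do_stream = true;
  }
}
```

Responses for all other URLs keep the default behavior and are delivered only after the whole object has been fetched.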

There are several limitations to using Streaming Miss.

Origins cannot use TLS and object size will be limited

Streaming Miss currently supports HTTPS (TLS) origin servers only on a limited availability (LA) basis. Contact your Technical Account Manager or support@fastly.com to see if you qualify for early access to this program.

Until you've been accepted into the LA program, Streaming Miss will not work with HTTPS (TLS) origin servers. The requested content can still be served to the client over HTTPS, but it won't be fetched from an HTTPS origin with Streaming Miss. Objects fetched from HTTPS origins will therefore be limited to the non-Streaming Miss size of 2GB.

Streaming Miss is not available to HTTP/1.0 clients

If an HTTP/1.0 request triggers a fetch, and the response header from the origin does not contain a Content-Length field, then Streaming Miss will be disabled for the fetch and the fetched object will be subject to the non-streaming-miss object size limit.

If an HTTP/1.0 request is received while a Streaming Miss for an object is in progress, the HTTP/1.0 request must wait until the whole response body has been downloaded before it receives the response header and body, as if the object were being fetched without Streaming Miss.

Cache hits are not affected. An HTTP/1.0 client can receive a large object served from cache, just like an HTTP/1.1 client.

Streaming Miss is not compatible with on-the-fly gzip compression of the fetched object

Streaming Miss can handle large files whether or not they are compressed. However, on-the-fly compression of objects that are not already compressed is not compatible with Streaming Miss. If the VCL sets beresp.gzip to true, Streaming Miss will be disabled.
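
Because the two features exclude each other, a service has to choose one per response. A minimal sketch follows, assuming that uncompressed text responses should be compressed at the edge and everything else should be streamed. The Content-Type and Accept-Encoding tests are illustrative assumptions; adapt them to your own compression rules.

```vcl
sub vcl_fetch {
  if (req.http.Accept-Encoding == "gzip" &&
      beresp.http.Content-Type ~ "^text/" &&
      !beresp.http.Content-Encoding) {
    # Compress on the fly; Streaming Miss is disabled for this response.
    set beresp.gzip = true;
  } else {
    # Already compressed, or not selected for compression: stream it instead.
    set beresp.do_stream = true;
  }
}
```

Keeping the two settings in separate branches makes it explicit which responses give up streaming in exchange for compression.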

Streaming Miss is not compatible with ESI (Edge-Side Includes)

Responses that are processed through ESI cannot be streamed. Responses that are included from an ESI template cannot be streamed. When ESI is enabled for the response or when the response is fetched using <esi:include>, then Streaming Miss will be disabled and the fetched object will be subject to the non-streaming-miss object size limit of 2GB.

Large File Support

Large File Support is automatically enabled for all clients, so there's no need to configure anything manually. You should, however, be aware that there are maximum file sizes and several failure modes.

Maximum file size

If Streaming Miss is enabled, then the maximum size is slightly below 5GB (specifically 5,368,578,048 bytes, which is 128 KiB short of 5 GiB). With Streaming Miss disabled, the maximum size is still higher than the previous maximum, but is limited to a little under 2GB (specifically 2,147,352,576 bytes, which is 128 KiB short of 2 GiB).
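
One way to apply these limits, sketched under assumptions: enable Streaming Miss only when the origin declares a Content-Length above the non-streaming limit, so that smaller objects keep the default behavior. The threshold is the 2,147,352,576-byte figure given above, and std.atoi is used to parse the header. Responses without a Content-Length header are left unstreamed in this sketch, which is a design choice and not a Fastly requirement.

```vcl
sub vcl_fetch {
  # 2147352576 bytes is the maximum object size without Streaming Miss.
  if (beresp.http.Content-Length &&
      std.atoi(beresp.http.Content-Length) > 2147352576) {
    set beresp.do_stream = true;
  }
}
```

Objects above the 5,368,578,048-byte streaming limit still fail as described under the failure modes.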

Failure modes

There are several failure modes you may encounter while using Large File Support.

What happens when the maximum object size limit is exceeded?

If the response from the origin has a Content-Length header field whose value exceeds the maximum object size, Fastly will immediately generate a 503 response to the client unless specific VCL is put in place to act on the error.

If no Content-Length header field is returned, Fastly will start to fetch the response body. If Fastly determines while fetching the response body that the object exceeds the maximum object size, it will generate a 503 response to the client (again, unless specific VCL is in place to act on the error).

If no Content-Length header field is present and Streaming Miss is in effect, Fastly will stream the content back to the client. However, if while streaming the response body Fastly determines that the object exceeds the maximum object size, it will terminate the client connection abruptly. The client will detect a protocol violation, because it will see its connection close without a properly terminating 0-length chunk.
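
For the 503 cases described above, the "specific VCL" would typically live in vcl_error. The following is a hedged sketch, assuming these 503 responses reach vcl_error in the same way as other origin fetch errors. The response text is illustrative, and the condition matches every 503, not only size-limit failures.

```vcl
sub vcl_error {
  if (obj.status == 503) {
    # Replace the default error page with a short plain-text message.
    set obj.http.Content-Type = "text/plain; charset=utf-8";
    synthetic {"The requested file could not be delivered."};
    return(deliver);
  }
}
```

This does not help in the streaming case, where the response header has already been sent and the connection is simply closed.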

What happens when an origin read fails?

A failure to read the response header from the origin, regardless of Streaming Miss, causes a 503 response (which can be acted on in VCL).

If reading the response body from the origin fails or times out, the problem will be reported differently depending on whether Streaming Miss is in effect for the fetch. Without Streaming Miss, a 503 response will be generated. With Streaming Miss, however, it is already too late to send an error response since the header will already have been sent. In this case, Fastly will again abruptly terminate the client connection and the client will detect a protocol violation. If the response was chunked, it will see its connection close without a properly terminating 0-length chunk. If Content-Length was known, it will see the connection close before the number of bytes given.

Incidentally, this is the reason why HTTP/1.0 clients cannot be supported by Streaming Miss in the cases when the Content-Length is not yet known or available. Without the client receiving a Content-Length and without support for chunking, the client cannot distinguish the proper end of the download from an abrupt connection breakage anywhere upstream from it.
