code etc.

Varnish pipe for large files

I have some servers running Varnish 4 with small cache sizes, a few hundred megabytes each. The servers host a handful of large files, the largest over a gigabyte. When varnish serves a request for one of these files, the entire backend response is held in varnish’s memory until the client has downloaded it. The configured cache size has no effect on this use of memory; varnish has to hold the response in memory until it’s served. A single download can put immediate and critical memory strain on the server. Actual memory use for varnish (as reported in the RES column of the top command output) can spike from 300MB to 1.3GB in less than a second.

The issue could be addressed by removing varnish from the loop. Static files could be placed on their own domain with nginx handling requests — which I understand to be both faster and more memory efficient than varnish.

But how do you solve this problem while keeping varnish as your front-end?

Use a pipe.

There is a minimal overhead for varnish to handle a dumb pipe between the client and the backend, but it is nothing that should cause an issue in terms of memory management.

In varnish 3, vcl_fetch provided access to both req and beresp. See this mailing list message for how the content-length header of the response can be checked and the response piped instead of served through the normal varnish process if it’s a large file:

    // Varnish 3 config
    sub vcl_recv {
        /* Bypass cache for large files. The x-pipe header is set in
           vcl_fetch when a too large file is detected. */
        if (req.http.x-pipe && req.restarts > 0) {
            remove req.http.x-pipe;
            return (pipe);
        }
    }

    sub vcl_fetch {
        // don't cache files larger than 10MB
        if (beresp.http.Content-Length ~ "[0-9]{8,}") {
            set req.http.x-pipe = "1";
            return (restart);
        }
    }

Unfortunately, varnish 4 has a completely different architecture. The vcl_fetch routine has been replaced by vcl_backend_response, which doesn’t have access to the req object. It is also impossible to restart the entire request once it reaches one of the vcl_backend_* routines. This means the content-length header, which is only known after the backend has replied, is of no use here, and the decision to pipe or not has to be made entirely within vcl_recv.
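
As a rough illustration of making that decision in vcl_recv, here is a minimal Varnish 4 sketch that pipes requests based on the URL alone, since that is all that is known at that point. The backend address, the /downloads/ path and the list of file extensions are assumptions made up for this example, not taken from any real setup; adjust them to wherever your large files actually live.

```vcl
vcl 4.0;

# Hypothetical backend; replace host and port with your own.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Assumption: large files live under /downloads/ or use one of
    # these extensions. The response size is not known yet, so the
    # URL is the only thing we can match on.
    if (req.url ~ "^/downloads/" || req.url ~ "\.(iso|zip|tar\.gz)(\?.*)?$") {
        # Hand the connection straight to the backend; varnish just
        # shuffles bytes and does not store the response.
        return (pipe);
    }
}
```

The trade-off is that this matches on naming conventions instead of actual size, so a small file under /downloads/ gets piped too, and a large file outside those patterns still goes through the normal varnish path.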