The focus of Varnish has always been performance and flexibility.
Varnish is designed for hardware that you buy today, not the hardware you bought 15 years ago.
This is a trade-off to gain a simpler design and focus resources on modern hardware.
Varnish is designed to run on 64-bit architectures and scales almost proportionally with the number of CPU cores available.
CPU power is rarely a problem, though.

Compared to 64-bit systems, 32-bit systems let you allocate less virtual memory and fewer threads.
The theoretical maximum depends on the operating system (OS) kernel, but a 32-bit address space is usually limited to 4 GB (2^32 bytes).
In practice you may get only about 3 GB, because the OS reserves part of that space for the kernel.

Varnish uses a workspace-oriented memory-model instead of allocating the exact amount of space it needs at run-time.
Varnish does not manage its allocated memory itself; it delegates this task to the OS, because the kernel can normally do it better than a user-space program.
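The workspace idea can be illustrated with a small bump allocator: a block of memory is reserved once, slices are handed out from it, and everything is released in one step. This is a minimal sketch of the general technique, not Varnish's implementation; the class name `Workspace`, its methods and the sizes used are hypothetical.

```python
class Workspace:
    """Fixed-size scratch area handed out in slices (a bump allocator)."""

    def __init__(self, size: int) -> None:
        self._buf = bytearray(size)   # reserved once, up front
        self._used = 0                # offset of the next free byte

    def alloc(self, nbytes: int) -> memoryview:
        """Return a writable view of nbytes, or raise if the workspace is full."""
        if nbytes < 0 or self._used + nbytes > len(self._buf):
            raise MemoryError("workspace overflow")
        start = self._used
        self._used += nbytes
        return memoryview(self._buf)[start:start + nbytes]

    def reset(self) -> None:
        """Release everything at once, e.g. when a request is finished."""
        self._used = 0

    @property
    def free(self) -> int:
        return len(self._buf) - self._used


ws = Workspace(64)
header = ws.alloc(16)      # no per-allocation bookkeeping beyond one offset
header[:4] = b"Host"
print(ws.free)             # 48
ws.reset()
print(ws.free)             # 64
```

The point of the design is that allocation costs one addition and a bounds check, and cleanup costs a single assignment, at the price of reserving more memory than a request may actually use.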

Event notification facilities such as epoll and kqueue are advanced OS features designed for high-performance services like Varnish.
By using them, Varnish moves much of the complexity into the OS kernel, which is also better positioned to decide which threads are ready to execute and when.
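To give a feel for how a program hands readiness tracking to the kernel, the sketch below uses Python's standard `selectors` module, which wraps epoll or kqueue where the platform provides them. It only illustrates the mechanism; the socket pair and the label `client-1` are stand-ins invented for the example, and Varnish itself uses these facilities from C.

```python
import selectors
import socket

# DefaultSelector picks the most efficient mechanism the platform offers
# (epoll on Linux, kqueue on BSD/macOS, falling back to poll or select).
sel = selectors.DefaultSelector()

# A connected socket pair stands in for a client connection.
client, server = socket.socketpair()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, data="client-1")

client.sendall(b"GET / HTTP/1.1\r\n\r\n")

# The kernel reports which registered sockets are readable;
# the program does not poll each connection itself.
ready = []
for key, events in sel.select(timeout=1.0):
    ready.append((key.data, key.fileobj.recv(1024)))

print(ready)

sel.unregister(server)
sel.close()
client.close()
server.close()
```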

Varnish uses the Varnish Configuration Language (VCL) that allows you to specify exactly how to use and combine the features of Varnish.
VCL is translated to C code.
This code is compiled with a standard C compiler and then dynamically linked directly into Varnish at run-time.

When you need functionality that VCL does not provide, for example looking up an IP address in a database, you can write raw C code in your VCL.
This is known as in-line C.
However, in-line C is strongly discouraged because it is harder to debug, to maintain and to develop together with other developers.
Instead of adding in-line C, you should modularize your C code in Varnish modules, also known as VMODs.

VMODs are typically coded in VCL and C.
In practice, a VMOD is a shared library with functions that can be called from VCL code.

The standard (std) VMOD, included in Varnish Cache, extends the functionality of VCL.
The std VMOD includes non-standard header manipulation, complex header normalization and access to memcached, among other functionality.
Appendix D: VMOD Development explains in more detail how VMODs work and how to develop your own.

The Varnish Shared memory Log (VSL) allows Varnish to log large amounts of information at almost no cost by having other applications parse the data and extract the useful bits.
This design and other mechanisms decrease lock-contention in the heavily threaded environment of Varnish.
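The idea of a writer that logs cheaply and leaves parsing to other programs can be sketched as a ring of fixed-size records. This assumes a circular buffer with a single writer; the record layout, the slot size and the name `LogRing` are made up for illustration and do not reflect the real VSL format.

```python
import struct

SLOT_SIZE = 64                        # bytes per record (assumed for the sketch)
HEADER = struct.Struct("<IH")         # sequence number, payload length


class LogRing:
    """Single-writer ring of fixed-size slots; readers never block the writer."""

    def __init__(self, slots: int) -> None:
        self.slots = slots
        self.buf = bytearray(slots * SLOT_SIZE)
        self.next_seq = 1             # sequence numbers start at 1

    def write(self, payload: bytes) -> None:
        """Store one record, overwriting the oldest slot when the ring is full."""
        payload = payload[:SLOT_SIZE - HEADER.size]   # truncate oversized records
        offset = ((self.next_seq - 1) % self.slots) * SLOT_SIZE
        HEADER.pack_into(self.buf, offset, self.next_seq, len(payload))
        start = offset + HEADER.size
        self.buf[start:start + len(payload)] = payload
        self.next_seq += 1

    def read_from(self, seq: int):
        """Yield (seq, payload) for records >= seq that are still in the ring."""
        oldest = max(1, self.next_seq - self.slots)
        for s in range(max(seq, oldest), self.next_seq):
            offset = ((s - 1) % self.slots) * SLOT_SIZE
            stored_seq, length = HEADER.unpack_from(self.buf, offset)
            start = offset + HEADER.size
            yield stored_seq, bytes(self.buf[start:start + length])


log = LogRing(slots=4)
for i in range(6):
    log.write(f"ReqURL /page/{i}".encode())   # the writer never waits for readers

# A separate tool would do this parsing; the two oldest records are gone.
records = list(log.read_from(1))
print(records[0])    # (3, b'ReqURL /page/2')
```

In this sketch the writer only copies bytes and advances a counter. Readers that fall behind lose old records rather than slow the writer down.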

To summarize: Varnish is designed to run on modern hardware
under real work-loads and to solve real problems. Varnish does not
cater to the “I want to make Varnish run on my 486 just
because”-crowd. If it does work on your 486, then that’s fine, but
that’s not where you will see our focus. Nor will you see us
sacrifice performance or simplicity for the sake of niche use-cases
that can easily be solved by other means – like using a 64-bit OS.

Objects are local stores of response messages as defined in https://tools.ietf.org/html/rfc7234.
They are mapped with a hash key and they are stored in memory.
References to objects in memory are kept in a hash tree.

A distinctive feature of Varnish is that it lets you control the input to the hashing algorithm.
By default, the key is built from the HTTP Host header and the URL, which is sufficient and recommended for typical cases.
However, you can build the key from other request data.
For example, you can use cookies or the user-agent of a client request to create a hash key.
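The effect of changing the hash input can be sketched in a few lines. The function `hash_key`, the choice of SHA-256 and the `"mobile"` device label are assumptions made for this example; in Varnish the hash input is controlled from VCL.

```python
import hashlib


def hash_key(host: str, url: str, *extra: str) -> str:
    """Digest of the URL and Host header, plus any extra inputs."""
    h = hashlib.sha256()
    for part in (url, host, *extra):
        h.update(part.encode("utf-8"))
        h.update(b"\0")            # separator so ("ab", "c") differs from ("a", "bc")
    return h.hexdigest()


# Default behaviour: one object per Host and URL combination.
default_key = hash_key("example.com", "/index.html")

# Adding a value derived from the User-Agent yields a separate cache entry.
mobile_key = hash_key("example.com", "/index.html", "mobile")

print(default_key != mobile_key)   # True
```

Every extra input multiplies the number of possible cache entries for the same URL, which is one reason the default key is recommended for typical cases.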

HTTP specifies that multiple objects can be served from the same URL depending on the preferences of the client.
For instance, content in gzip format is sent only to clients that indicate gzip support.
Varnish stores a single compressed object under one hash key.

Upon a client request, Varnish checks the Accept-Encoding header field.
If the client does not accept gzip objects, Varnish decompresses the object on the fly and sends it to the client.
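The delivery decision described above can be sketched as follows. This is a simplified illustration: the helper names `accepts_gzip` and `deliver` are hypothetical, and the Accept-Encoding parsing ignores details such as the `*` wildcard.

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """True if the Accept-Encoding value lists gzip with a non-zero q-value."""
    for item in accept_encoding.split(","):
        name, _, params = item.strip().partition(";")
        if name.strip().lower() != "gzip":
            continue
        params = params.strip().lower().replace(" ", "")
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def deliver(stored_gzip_body: bytes, accept_encoding: str):
    """Return (headers, body) for one stored, compressed object."""
    if accepts_gzip(accept_encoding):
        # Send the stored bytes unchanged.
        return {"Content-Encoding": "gzip"}, stored_gzip_body
    # Decompress on the fly for clients without gzip support.
    return {}, gzip.decompress(stored_gzip_body)


stored = gzip.compress(b"<html>hello</html>")
print(deliver(stored, "")[1])   # b'<html>hello</html>'
```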

Fig. 2 shows the lifetime of cached objects.
A cached object has an origin timestamp t_origin and three duration attributes: 1) TTL, 2) grace, and 3) keep.
t_origin is the time when an object was created in the backend.
An object lives in cache until TTL+grace+keep elapses.
After that time, the object is removed by the Varnish daemon.

On a timeline, objects whose age is less than the TTL are considered fresh.
Stale objects are those whose age falls between TTL and TTL+grace.
Objects within the period from t_origin to the end of keep can be used for conditional requests with the HTTP header field If-Modified-Since.
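The timeline can be expressed as a small classification function. This is a sketch under the assumption that the three durations follow one another, as in TTL+grace+keep. The function name, the state labels `"keep"` and `"expired"`, and the example durations are chosen for illustration only.

```python
def object_state(now: float, t_origin: float, ttl: float,
                 grace: float, keep: float) -> str:
    """Classify a cached object by its age relative to TTL, grace and keep."""
    age = now - t_origin
    if age < ttl:
        return "fresh"      # can be served directly
    if age < ttl + grace:
        return "stale"      # past TTL but still within grace
    if age < ttl + grace + keep:
        return "keep"       # retained, e.g. for If-Modified-Since conditions
    return "expired"        # TTL+grace+keep has elapsed; object is removed


# Hypothetical durations in seconds: TTL 120, grace 10, keep 60.
for age in (60, 125, 150, 200):
    print(age, object_state(now=age, t_origin=0, ttl=120, grace=10, keep=60))
```

With these example values, ages 60, 125, 150 and 200 fall into the fresh, stale, keep and expired ranges respectively.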