A review of the technology, basis, and features of open-source and commercial web caching proxies.

Open-source

Squid

The development version of Squid is Squid 3.0; the latest stable release, however, is 2.5. Squid 3.0 is written in C++ and appears to (mostly) follow an object-oriented paradigm, in contrast to Squid 2.x, which was written in C.

ACLs

A large portion of the current Squid architecture is devoted to the configuration of Access Control Lists (ACLs). An ACL configuration has two types of lines: class definitions and operators. The reference for both is the Squid Configuration Manual.

Classes

A class is defined by a line of the form "acl name type string1 string2 ...". The 'name' is a unique descriptive string that identifies the class. There are several ACL 'types', which can define the class as corresponding to a particular set of IP addresses, a particular time of day, a particular web browser, or any of a large number of other criteria. The 'strings' are the parameters that describe the class.

Some of the most common ACL class types are listed below. Several also have a variant that takes regular expressions:

Source/Destination IP address

Source/Destination Domain

Words in the requested URL

Words in the source or destination domain

Current day/time

Destination port

Protocol (FTP, HTTP, SSL)

Method (HTTP GET or HTTP POST)

Browser type

MIME type

Name (according to the Ident protocol)

Autonomous System (AS) number

Username/Password pair

SNMP Community
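To make the idea of a class concrete, here is a minimal Python sketch of two class types from the list above: source IP address and destination domain. It is a model for illustration, not Squid's implementation. The helper names (make_src_acl, make_dstdomain_acl), the request dictionary, and the class names localnet and blocked_sites are all hypothetical.

```python
import ipaddress

def make_src_acl(*networks):
    """Matcher for a source-IP class: true if the client is in any listed network."""
    nets = [ipaddress.ip_network(n) for n in networks]
    def matches(request):
        client = ipaddress.ip_address(request["client_ip"])
        return any(client in net for net in nets)
    return matches

def make_dstdomain_acl(*domains):
    """Matcher for a destination-domain class; a leading dot also matches subdomains."""
    def matches(request):
        host = request["host"].lower()
        for d in domains:
            d = d.lower()
            if d.startswith("."):
                if host == d[1:] or host.endswith(d):
                    return True
            elif host == d:
                return True
        return False
    return matches

# Hypothetical classes: each name maps to a matcher built from a type and its strings.
acls = {
    "localnet": make_src_acl("192.168.0.0/16", "10.0.0.0/8"),
    "blocked_sites": make_dstdomain_acl(".example.com"),
}

request = {"client_ip": "192.168.1.20", "host": "www.example.com"}
matched = sorted(name for name, m in acls.items() if m(request))
print(matched)
```

Each class only answers "does this request match?"; what happens to a matching request is decided separately by the operators.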

Operators

Operators apply an action to requests based on the ACL classes they match. The most common is http_access, which allows or denies a request. The Squid documentation suggests a minimum http_access configuration as a starting point. Other operators include:

tcp_outgoing_address - maps requests to different outgoing IP addresses based on ACL

reply_body_max_size - limits the size of reply bodies, which stops large file downloads

log_access - controls whether a request is logged
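The way an operator such as http_access combines classes can be sketched in Python. In Squid, access lines are checked in order, the first line whose classes all match decides the outcome, and a class prefixed with "!" is negated. The function below models only that behaviour. The rule list and the class names (Safe_ports, localnet, workhours, all) are illustrative assumptions, and the final fall-through to "deny" is a simplification of this sketch.

```python
def evaluate(rules, matched_acls):
    """Return 'allow' or 'deny' given the set of class names a request matches.

    rules is an ordered list of (action, [terms]); a term starting with '!' is negated.
    """
    for action, terms in rules:
        # Every term on a line must hold (logical AND).
        if all(
            (t[1:] not in matched_acls) if t.startswith("!") else (t in matched_acls)
            for t in terms
        ):
            return action  # the first matching line wins
    return "deny"  # simplification: deny when no line matches

# Hypothetical rule set, most specific lines first, catch-all last.
rules = [
    ("deny", ["!Safe_ports"]),
    ("allow", ["localnet", "workhours"]),
    ("deny", ["all"]),
]

print(evaluate(rules, {"all", "Safe_ports", "localnet", "workhours"}))
print(evaluate(rules, {"all", "Safe_ports", "localnet"}))
```

Because evaluation stops at the first match, the order of the lines matters as much as their content.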

Other config options:

Delay pools:

Delay pools are bandwidth limiters that can limit bandwidth based on ACLs. It is possible to create several numbered delay pools, each of which has a class (5 classes in total). The following descriptions are taken verbatim from the Squid documentation:

class 1 -- Everything is limited by a single aggregate bucket.

class 2 -- Everything is limited by a single aggregate bucket as well as an "individual" bucket chosen from bits 25 through 32 of the IP address.

class 3 -- Everything is limited by a single aggregate bucket as well as a "network" bucket chosen from bits 17 through 24 of the IP address and a "individual" bucket chosen from bits 17 through 32 of the IP address.

class 4 -- Everything in a class 3 delay pool, with an additional limit on a per user basis. This only takes effect if the username is established in advance - by forcing authentication in your http_access rules.
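The bit ranges in the class descriptions above can be illustrated with a short Python sketch for an IPv4 client address. Bits are counted from the most significant bit, so bits 25 through 32 are the last octet and bits 17 through 24 are the third octet. The function name bucket_keys and the returned dictionary layout are assumptions made for this example, and class 4 is left out because it depends on the authenticated username rather than the address.

```python
import ipaddress

def bucket_keys(ip, delay_class):
    """Return the bucket indexes an IPv4 client falls into for delay pool classes 1 to 3."""
    addr = int(ipaddress.IPv4Address(ip))
    if delay_class == 1:
        return {"aggregate": 0}
    if delay_class == 2:
        # Individual bucket: bits 25-32, i.e. the last octet.
        return {"aggregate": 0, "individual": addr & 0xFF}
    if delay_class == 3:
        return {
            "aggregate": 0,
            "network": (addr >> 8) & 0xFF,   # bits 17-24, the third octet
            "individual": addr & 0xFFFF,     # bits 17-32, the last two octets
        }
    raise ValueError("only classes 1 to 3 are modelled in this sketch")

print(bucket_keys("192.168.5.77", 2))
print(bucket_keys("192.168.5.77", 3))
```

A consequence of this scheme is that two clients whose addresses share the same low-order bits share a bucket, so a class 2 pool suits a single /24 network and a class 3 pool suits a /16.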

RabbIT

RabbIT is touted as a "dial-up acceleration" proxy written in Java, designed to increase web browsing speed on low-bandwidth connections. It is focused on features, implemented through a "filters" system:

GZip compression of HTML pages (those not already compressed)

Image recompression to a low JPEG quality level

Ad removal

HTTP/1.1 pipelining support

Because application of these filters is considered "heavy", RabbIT can also serve as a simple cache.
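As an illustration of the first filter listed above, the following Python sketch compresses an HTML response body only when it is not already encoded. RabbIT itself is written in Java and its filter interface is not shown here; the function name gzip_filter and the plain dictionary of headers are assumptions for this example, and header names are compared case-sensitively for simplicity.

```python
import gzip

def gzip_filter(headers, body):
    """Compress HTML bodies that are not already encoded; return (headers, body)."""
    content_type = headers.get("Content-Type", "")
    if not content_type.startswith("text/html") or "Content-Encoding" in headers:
        return headers, body  # leave non-HTML and already-encoded responses alone
    compressed = gzip.compress(body)
    new_headers = dict(headers)
    new_headers["Content-Encoding"] = "gzip"
    new_headers["Content-Length"] = str(len(compressed))
    return new_headers, compressed

page = b"<html>" + b"<p>hello</p>" * 200 + b"</html>"
out_headers, out_body = gzip_filter({"Content-Type": "text/html; charset=utf-8"}, page)
print(len(page), len(out_body), out_headers["Content-Encoding"])
```

Caching the filtered result means this work is done once per object rather than once per request.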

Commercial

Blue Coat

Blue Coat's SG appliances are general-purpose proxies that seem to be targeted mainly at businesses looking to block users from accessing particular content. The appliances run on a custom operating system, kernel, and filesystem, which supposedly has no ties to Windows or Unix.

The appliances use a patent-pending algorithm to decide when to serve stale objects and when to refresh them, in order to avoid overloading origin web servers with too many requests. The invalidation API can specify a "grace period" during which stale objects may still be served, and can mark objects that must never be served stale.
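The general idea of a grace period can be sketched in Python as follows. This is not Blue Coat's algorithm, whose details are not public; it is a generic model under stated assumptions. The function name cache_decision, its parameters, and the three result labels are all hypothetical.

```python
def cache_decision(age, ttl, grace, never_stale=False):
    """Decide how to answer a request for a cached object.

    age, ttl and grace are in seconds; never_stale marks objects that
    must always be revalidated once their ttl has passed.
    """
    if age <= ttl:
        return "serve_fresh"
    if not never_stale and age <= ttl + grace:
        # Answer from cache now and refresh in the background, so the
        # origin server sees one refresh rather than one per client.
        return "serve_stale_and_refresh"
    return "refresh_first"

print(cache_decision(age=30, ttl=60, grace=20))
print(cache_decision(age=70, ttl=60, grace=20))
print(cache_decision(age=70, ttl=60, grace=20, never_stale=True))
```

The grace period trades a bounded amount of staleness for fewer simultaneous requests to the origin server.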

Stratacore

Stratacache's Stratacore offers multiple black-box caching devices, divided into "tiers" that can be selected by price/performance ratio as well as by the number of users served. The software appears to be the same across the range, from the low-end sub-$1k device to the $125k+ device.