Eliminating 502 Proxy Errors

While working on an infrastructure refresh and consolidation project for one of my clients, I had to find a new home for their legacy archive of public data, which consisted of several hundred gigabytes. There are a few approaches to handling this, each with its advantages and disadvantages.

Decisions

Move the data into the web container

Advantages

Local data is easy to manage with standard tools

Disadvantages

Bloated containers

Synchronization requirement for each container

Cost

Move the data into a central NFS server

Disadvantages

Managing additional resources (HA NFS)

Cost

Advantages

A single location to manage data

Local data is easy to manage with standard tools

Move the data into a public bucket on object storage

Disadvantages

Data stored in object storage is harder to manage.
It requires additional software to provide quasi-POSIX access, or a web interface.

Advantages

No additional resources to deploy/manage

A single location to manage data

Cost

This was the perfect use case for offloading data to object storage in the cloud. One of my favourite object storage companies is Backblaze. Their B2 service offers S3-compatible object storage at a fraction of the cost of Amazon S3, Google Cloud Storage, or Azure Blob Storage. Seriously, check out their calculator.

The First Crack

I added the following directives to the Apache configuration and it worked, well, mostly!

SSLProxyEngine: B2 is only available over HTTPS, so we need to enable the SSL proxy engine in Apache.

Location: You’re likely familiar with the <Directory> directive; <Location> is similar but operates on request URLs instead of filesystem paths. In this case, it matches any request we receive for /archive/content/*.

ProxyPreserveHost: This directive determines whether we pass the incoming Host header, which matches our hostname, through to the backend. This is important as the B2 certificates don’t list our hostname as a CN/SAN, so we’ll get a certificate error if it’s enabled. Disable it.

ProxyPass: This directive takes any request for /archive/content/ and proxies it to Backblaze.
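Put together, the four directives described above might look like the following sketch. The upstream hostname and bucket name are placeholders chosen for illustration, not the values from the original setup, and the block assumes mod_proxy, mod_proxy_http, and mod_ssl are loaded.

```apache
# Sketch only: the upstream URL is a hypothetical placeholder.
# Substitute the download URL for your own public B2 bucket.

# B2 is HTTPS-only, so the proxy must be allowed to speak TLS upstream.
SSLProxyEngine On

# Match request URLs (not filesystem paths) under /archive/content/.
<Location "/archive/content/">
    # Send the backend's hostname, not ours, in the Host header,
    # so it matches the name on the upstream certificate.
    ProxyPreserveHost Off

    # Inside <Location> the local path is implied; only the target is given.
    ProxyPass "https://f000.backblazeb2.com/file/example-bucket/"
</Location>
```

With a configuration along these lines, a request for /archive/content/report.pdf would be fetched from /file/example-bucket/report.pdf on the upstream host.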

That Was Too Easy

It didn’t take long for Apache to start complaining about connection errors to Backblaze. The errors affected a small percentage of requests, mostly during high concurrency. This led to the following messages in the Apache error log.

The Second Crack

I suspected that Backblaze was terminating connections once they went idle. When Apache tried to reuse one of those connections, the above errors were logged and a 502 Proxy Error was returned to the client. Let’s disable keepalives and downgrade our protocol from HTTP/1.1 to HTTP/1.0:
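One way to express this in mod_proxy, offered as a sketch rather than a record of the exact change, is with the proxy-nokeepalive and force-proxy-request-1.0 environment variables, set inside the same <Location> block. The upstream URL is again a hypothetical placeholder, and SetEnv assumes mod_env is loaded.

```apache
<Location "/archive/content/">
    ProxyPreserveHost Off
    ProxyPass "https://f000.backblazeb2.com/file/example-bucket/"

    # Ask mod_proxy to close the backend connection after each request
    # instead of keeping it alive for reuse.
    SetEnv proxy-nokeepalive 1

    # Send proxied requests as HTTP/1.0 rather than HTTP/1.1.
    SetEnv force-proxy-request-1.0 1
</Location>
```

The trade-off is that every proxied request now opens a fresh connection, including a new TLS handshake, in exchange for never reusing a connection the remote side has already closed.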

That’s it. No more connection errors.

Vince Hillier is the President and Founder of Revenni Inc.
He is an open source advocate specializing in system engineering and infrastructure. Outside of building solid architecture that doesn't break the bank, he's interested in information security, privacy, and performance.