Are we optimizing our images like cavemen?

Years from now, developers will look back on how we use images today and have a good laugh at our expense. How we handle images now is the equivalent of caveman drilling holes in skulls to release demons.

The problem: Images are getting fatter, which is hurting performance.

The web is obese. This isn’t news. Images are the biggest culprit. This isn’t news, either. Pages are, by many accounts, getting slower. Again, not news.

According to the HTTP Archive, the average web page is 1292 KB, with 801 KB of that page weight — more than 60% — being taken up by images. The proliferation of images on the web isn’t going to change and will likely continue to increase. People like high-quality pictures, and that’s not going to change, either, especially in light of the advent of retina displays.

The solution: We need to get a lot smarter about how we treat images online.

There are some obvious things you should be doing to optimize images — choose the right format, make sure they’re progressive — but in my opinion, these are primitive first-gen approaches. Today, inspired by my recent podcast chat with Ilya Grigorik, I want to explore some big-picture ideas. (No pun intended… okay, pun sort of intended.) I chatted with some of the great minds here at Strangeloop, notably Shawn Bissell, one of our senior software architects.

We might not have all the answers, but let’s start by asking the right questions.

Why not just automate how images are formatted, so that they’re always in the optimal format?

Inappropriate image formatting is a common performance culprit. We see this literally all the time. You could spend a lot of time educating every single person in your company about the best way to optimize every image type, or you could do what Ilya proposes: somehow automate how images are formatted, so that they’re always saved in the optimal format (PNG, GIF, JPEG, etc.).

At Strangeloop, we’ve built this feature into our products with a treatment called Image Compression. We recompress the image into either a JPEG and a PNG based on a set of default settings (which are user-configurable, as well as automatic), and then we use the smallest version. The smallest version is almost always a JPEG, except for small PNGs or GIFs, which are usually smaller in PNG-8 format.

The challenge, of course, is that the definition of “optimal” is, of course, highly subjective. Using a quality setting that is too low can produce artifacts (blurriness or “jaggies”). Shawn tells me that he recently discovered a cool new algorithm for determining the structural similarity (SSIM) index, which mathematically computes the difference between the original and compressed images. This is something we are currently considering adding to our products.

If we could add/popularize more image formats, what should they be?

What we’re currently using:

Photos – JPEG, PNG-24

Low complexity (few colors) – GIF, PNG-8

Low complexity with transparency – GIF, PNG-8

High complexity with transparency – PNG-24

Line art – SVG

What we should also be using, but aren’t:

JPEG 2000 – This is a great format for variable compression for different regions of interest

WebP – Offers better compression for high-resolution images

What we need to develop:

A format similar to WebP that can produce even smaller image sizes for images with a extremely high PPI (pixels per inch). This would help a lot with the huge image quality/performance challenges that we’re going to start hearing a lot more about now that the new retina displays are hitting the market.

Why is it so hard to add brand-new, better image formats?

This is a big, hairy, complicated beast. People have been working to improve image formats for years, with mixed success. One of the problems with new formats is that they’re often proprietary and require a license. Another is that they’re sometimes subject to legal issues because of unclear intellectual property rights. Browser vendors understandably don’t want to pay for the license or get caught in a legal battle.

JPEF 2000 and WebP, which I mentioned above, are classic examples of useful-yet-neglected formats:

JPEG 2000 (introduced in the year 2000) is a good example of a format that is actually superior to JPEG. Compared with the regular JPEG format, JPEG 2000 offers advantages such as support for higher bit depths, more advanced compression, and a lossless compression option.The problem: nobody uses it. This is at least partially due to the fact that, while the JPEG 2000 license is royalty free, there may be “submarine patents” on the actual IP, which are problematic.

By the somewhat opposite token, WebP (introduced in 2010) is an open standard that gets great results (according to Google, WebP can reduce filesizes by up to 45%, which is phenomenal), yet it’s still only supported by Chrome and Opera — no doubt due to the fact that it was developed by Google.

The lesson to be learned here is that if you want to create the next great image format, you need to make it free, open, and totally vendor neutral.

Why is the HTTP Vary header a problem for CDNs?

Ilya and I touched on this very briefly in our chat, and a few people have since asked me to explain this further, so I want to take a minute to elaborate here. The Vary header is a problem for CDNs or caching proxies because they are required to keep a different version of the resource for each different value of the header specified in the Vary header.

The classic usage of the Vary header is to vary by Accept-Encoding so that the CDN would keep different copies of the webpage in its cache for different compression encodings (gzip, deflate, plain). This case isn’t a problem, but in order to serve a different image to different browsers, the Vary header needs to vary based on something browser specific. “Browser specific” usually means varying by the User-Agent header, but there are so many different versions of User-Agent headers (even for the same browser!) that CDNs are afraid that varying by User-Agent will cause severe cache fragmentation. So they only like varying by Accept-Encoding, and not by anything else. And some CDNs have even done away with that by implementing their own compression engines.

(At Strangeloop, we avoid this problem by having our product use the User-Agent header on the HTML request — typically not cached in the CDN — to rewrite all the URLs to the images to point at different versions of the images. This way the CDN cache only has one copy per version (which is optimal), and the correct version is controlled by the URL in the HTML, so the browser always downloads this correct version.)

Takeaways

This post just grazes the surface of this topic. How we handle images on the web is a massive, complex issue, but our main challenges can be pared down to this:

We need to get a lot smarter about how we handle images on the web, because they’re only going to keep getting fatter.

We need to embrace existing formats, such as WebP, that will make our images leaner.

We need to develop a new format that can handle the emerging demands of retina displays.