For a project I’ve been working on, I wanted my Sidekiq worker (which is part of an RSS crawler) to discover the favicon for a web site and cache it for later display. It was fun figuring out a way to do this, so I just had to share.

A Brief History of Favicons

Favicons, or “shortcut icons,” can be defined in multiple ways. Like all too many things in web design, browsers handle them in slightly different and mildly incompatible ways, meaning there’s plenty of redundancy. Favicons came to be when Microsoft added them to Internet Explorer 5 in 1999, implementing a feature where the browser would check the server for a file named favicon.ico and display it in certain parts of the UI. The following year, the W3C published a standard method for defining a favicon. Rather than simply having the browser look for a file in the root directory, an HTML document should specify a file in the header with a <link> tag, just like with stylesheets.

Fast forward to the present, and you have a bit of screwiness.

All major web browsers check for the link tag first, and fall back to favicon.ico if it’s not found.

You can define multiple icons in the HTML header. You can have ICO/PNG/GIF formats, as well as different sizes.

Some browsers support larger 32×32 favicons, while others will only use the 16×16 ones. Chrome for Mac prefers the 32×32 ones, and scales them down to 16×16 on Macs without Retina displays.

The most compatible way to set up your favicon is to define both 32×32 and 16×16 icons in your header, using the PNG format, and make a 16×16 ICO formatted one to name “favicon.ico” and drop into your web root. Browsers that play nicely will use the PNG ones in whatever dimensions they prefer, and IE will fall back to the ICO file.

Writing the Class

Now that the history lesson is out of the way, you can see why there’s a bit of a challenge here. Depending on how badly you want to find and display that icon, you may have to write logic for the different methods. For this tutorial, I will focus on two: the simplest, which is checking whether there’s a favicon.ico in the web root, and a basic implementation of checking for a link tag that defines a shortcut icon.

Before we do anything else, we need to install the dependencies, the httparty and nokogiri gems. Either add them to your Gemfile and run bundle install, or use the gem install command to install them manually.

Now require the necessary libraries at the top of a new Ruby file and we can get going.

require "httparty"
require "nokogiri"
require "base64"

We can define a class to make a nice, clean interface for this to keep it modular and easier to reuse. As you can see below, I’ve made a Favicon class and added some accessors for instance variables, as well as an initialize method that assigns the parameter it receives to the @host instance variable before calling the method we will be defining next.
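A minimal sketch of that skeleton might look like the following. The reader names (host, uri, base64) are assumptions inferred from the instance variables used later in this post, and check_for_ico_file is only a placeholder here so the snippet runs on its own.

```ruby
class Favicon
  # Readers for the instance variables (names assumed from the later code)
  attr_reader :host, :uri, :base64

  def initialize(host)
    @host = host
    check_for_ico_file
  end

  # Placeholder so this sketch runs by itself; the real lookup comes next
  def check_for_ico_file
  end
end
```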

We’ll be implementing the simplest part first. The check_for_ico_file method will send an HTTP GET request to /favicon.ico on the server specified in @host and check to see if a file exists. (The server will send a 200 OK response if it does, and a 404 Not Found error otherwise.) If it does, the URL will be saved to the @uri instance variable, and the icon file’s contents will be Base64-encoded and saved to @base64.

The HTTParty gem is great for this, since it drastically simplifies basic HTTP requests like this one.

If you want, you could go ahead and instantiate the class to try out what we have so far. If you pass it the domain name of a site that uses the /favicon.ico convention, the object should find it without issue.

Now let’s handle link tags! The process for that is a little bit more in-depth. First we need to request a web page from the server, such as the index page, and parse it for tags that resemble <link rel="shortcut icon" href="..." />. Then we have to evaluate the contents of href to make sure it’s an absolute URL, and prepend the domain name if it is not. After that, we can finally make a request to get the icon itself and save it.
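The absolute-versus-relative step is the fiddly part, so here is a small standalone sketch of it using only Ruby's URI module. The helper name absolute_icon_url and the example.com URLs are made up for illustration.

```ruby
require "uri"

# Hypothetical helper: turn whatever was in href into an absolute URL.
# URI.join leaves an absolute href alone and resolves a relative one
# against the page it was found on.
def absolute_icon_url(page_url, href)
  URI.join(page_url, href).to_s
end

page = "http://example.com/"
puts absolute_icon_url(page, "/images/favicon.png")
# prints "http://example.com/images/favicon.png"
puts absolute_icon_url(page, "http://cdn.example.com/favicon.ico")
# prints "http://cdn.example.com/favicon.ico"
```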

Still with me? Excellent, now here’s the code to do that. I’ll comment it a little more thoroughly, since it looks messier at a glance.

# Check for a "shortcut icon" link tag
def check_for_html_tag
  # Load the index page with HTTParty and pass the body to Nokogiri for parsing
  uri = URI::HTTP.build(host: @host, path: "/").to_s
  res = HTTParty.get(uri)
  doc = Nokogiri::HTML(res.body)
  # Use an XPath expression to tell Nokogiri what to look for
  doc.xpath('//link[@rel="shortcut icon"]').each do |tag|
    # Skip tags without an "href" attribute, since URI() cannot parse nil
    next if tag["href"].to_s.empty?
    # This is the contents of the "href" attribute, which we pass to Ruby's URI module for analysis
    taguri = URI(tag["href"])
    if taguri.absolute?
      # taguri has a scheme and a domain name, so we're good
      iconuri = taguri.to_s
    else
      # taguri is relative (or protocol-relative, starting with "//"),
      # so we join it with the index URL we built at the beginning of the method
      iconuri = URI.join(uri, taguri).to_s
    end
    # Grab the icon and set the instance variables
    icon = HTTParty.get(iconuri)
    if icon.code == 200
      @base64 = Base64.encode64(icon.body)
      @uri = iconuri
      # Stop at the first icon that downloads successfully
      break
    end
  end
end

Now there’s one more thing to do before we’re done. The initialize method needs to be tweaked so it calls our newest method:
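One way to wire it up is sketched below. Skipping the HTML lookup when /favicon.ico was already found is my assumption here, made to avoid a second round of requests; calling both methods unconditionally would also work, with the link tag's icon winning when both exist.

```ruby
class Favicon
  def initialize(host)
    @host = host
    check_for_ico_file
    # Assumption: only parse the index page when /favicon.ico was not found
    check_for_html_tag if @uri.nil?
  end
end
```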

Now…what of that “base64-encoded gibberish?” It’s the perfect format for a little trick called Data URIs, which you can read all about over at CSS-Tricks. If you cache that base64 string somewhere, probably in a database, you can output it like so:
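In HTML terms, that means an img tag whose src attribute is a data: URL instead of a normal address. As a minimal sketch, a helper like the one below could build that tag from the cached string. The name favicon_img_tag and the default MIME type are assumptions for illustration; PNG or GIF icons would need "image/png" or "image/gif" instead.

```ruby
require "base64"

# Hypothetical helper: build an <img> tag with the icon embedded as a Data URI
def favicon_img_tag(base64_data, mime_type = "image/x-icon")
  # Base64.encode64 inserts line breaks, so strip whitespace before embedding
  data = base64_data.gsub(/\s+/, "")
  %(<img src="data:#{mime_type};base64,#{data}" width="16" height="16" alt="">)
end

encoded = Base64.encode64("pretend these are icon bytes")
puts favicon_img_tag(encoded)
```

Using Base64.strict_encode64 when caching the icon would avoid the line breaks in the first place.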

It will display like any other image, but won’t use an additional HTTP request, because the image data is already embedded on the page. This makes it perfect for a list of web sites with icons beside them. Instead of kicking off several HTTP requests for individual tiny images, you just embed them right in the page.

If you’re unfortunate enough that you must support antique versions of Internet Explorer (version seven or prior) then you can’t use Data URIs, as they were not supported. However, all is not lost. You could conceivably adapt the class and have it write the image data to files on the server instead of base64-encoding them.
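As a rough sketch of that adaptation, a helper along these lines could write the raw response body to disk. The name save_icon, the directory layout, and the ".ico" extension are all assumptions made for illustration.

```ruby
require "fileutils"

# Hypothetical helper: save raw icon bytes under dir, named after the host
def save_icon(dir, host, body)
  FileUtils.mkdir_p(dir)
  # Keep only characters that are safe in a file name
  safe_host = host.gsub(/[^A-Za-z0-9.\-]/, "_")
  path = File.join(dir, "#{safe_host}.ico")
  # binwrite avoids newline or encoding translation of the image data
  File.binwrite(path, body)
  path
end
```

Inside the class, a call such as save_icon("public/favicons", @host, res.body) could take the place of the Base64.encode64 line, with the returned path stored instead of the encoded string.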
