DNS Prefetching Implications

Recently I received a call from our $2/month DNS service provider explaining that our DNS queries had increased to 400 million authoritative queries per month (peaking at 120 queries/second), putting us outside the scope of the basic service, and that we were required to upgrade to a premium service in the range of $1600/month. Welcome to some tech at Pinkbike.

After some investigation we determined that browser DNS prefetching was causing an increase in our authoritative DNS queries of a whopping 800%.

A single meta tag to control prefetching reduced Pinkbike DNS queries by 350 million per month. The same implementation on a larger site, deviantART, reduced authoritative DNS queries by 10 billion per month.

DNS Basics

DNS resolution is the process of converting a domain/hostname to the IP address required to access a resource. This process takes time and adds to the perceived page load time. DNS preresolution/prefetching is the process of resolving the IP address of every link on a page before you even click on it, with the goal of saving the DNS resolution time when a link is clicked. DNS prefetching is a fairly recent enhancement (added to Safari seven months ago) now present in all the major browsers. After a page loads, the browser looks at all the hosts in the links on the page and, in the background, issues DNS queries to resolve those hostnames.

Investigation

I set up our DNS a number of years ago, before DNS prefetching existed, so I never thought about its implications. I knew just enough to use a good DNS service with multiple geo-located DNS servers to minimize authoritative query request times, and to make sure that TTLs were set high (in our case a week) in order to keep caches around longer. We used dyndns, which has great service, performance, and an easy to use UI.

After receiving the information about the large amount of DNS queries that we were generating I investigated all the obvious elements.

1. DNS TTLs. DNS queries are cached at many levels: caches may exist in your browser, OS, router, and at your ISP. How long a cached record remains valid is controlled by the TTL (time to live) setting in your DNS. In our case our TTLs were set at the maximum of 1 day, so this was surely not the issue.

2. Lots of links/images on other sites. It is possible that a large number of images embedded on other sites, and users viewing those sites, were causing additional load on our DNS. Pinkbike uses a separate domain for static content, pinkbike.org, and since that domain was not generating high DNS query volume, this was not the problem.

3. Misconfigured internal services hitting the DNS. Another potential issue was a misconfigured server. Pinkbike has a bunch of servers that communicate internally for all sorts of reasons; perhaps one of them was not caching DNS. You can easily check this with the following command on a Linux box, which shows all the DNS queries your server is making:

dnstop -l 4 eth0

This was not the issue.

The basic DNS service we were using did not have any reporting to allow us to determine what hostnames the DNS queries were for and where they were coming from. This made it difficult to figure out what the issue might be.

Upgrading to Dynect

I need to give Bobby and the guys at Dyn a plug for being forthcoming with their help investigating this issue. The first thing we were able to do was get a trial account on the Dynect platform, which is the “enterprise” version of dyndns. Since the same company runs both services, transitioning from dyndns was a one-click transfer of all the data into Dynect. In addition to better performing DNS using anycast and more geo servers, you get all sorts of reporting, data, and control.

Looking at the www hostname, we could see that it was not the culprit of the 120 q/s.

Now that we were able to see which hosts were getting the bulk of the queries, I was able to narrow down the problem. The bulk of the queries were the wildcard DNS queries used for user subdomains such as username.pinkbike.com. I had always expected this to be a source of additional queries, but the volume still did not make sense. After checking our analytics further, there were more DNS queries per subdomain than actual subdomain page loads. Pinkbike serves about 100 dynamic HTML pages per second, so 120 DNS queries per second made no sense, especially taking into account multiple layers of DNS caching.

This finally led me to look in the direction of prefetching.

Browser DNS Prefetching

I was aware of the idea of DNS prefetching but never really understood how the mechanics worked in the browser. The first thing I did was fire up wireshark to take a look at exactly what was going on.

Some of our pages have comments, and each comment has a link to the author’s subdomain. When you have 100s of comments on a page, this means 100s of potential subdomains that can be prefetched. Of course the browsers seem to blindly prefetch all the subdomains.

Safari and Chrome prefetch DNS after the page loads. Prefetch can last many seconds after the page loads.

Firefox 3.6.x bloats prefetching even further: it fetches the AAAA (IPv6) record in addition to the A (IPv4) record for every prefetch, further increasing DNS queries. This may also explain the large number of AAAA requests on all our domains; perhaps FF does this for every hostname. I did try the FF4 beta, and it seems this is corrected in the upcoming release.

IE8 and the early IE9 beta do not appear to do DNS prefetching.

Solution - Prefetch Control

Prefetching seems to be enabled by default in every browser, but it can be disabled in the settings. You could ask all your users to disable prefetching in their browsers, but that would just be a waste of goodwill. Luckily, the browsers have standardized on a prefetch control meta tag that can be used on your page to disable prefetching.

The DNS prefetch control allows you to turn prefetching on and off for your whole page or for certain parts of it. Additionally, you can force lookups of specific hostnames.
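For reference, this is what the control looks like in markup; the hostname in the last line is just an example:

```html
<!-- turn DNS prefetching off for the whole page -->
<meta http-equiv="x-dns-prefetch-control" content="off">

<!-- turn it back on for a section where prefetching helps -->
<meta http-equiv="x-dns-prefetch-control" content="on">

<!-- force resolution of a specific hostname that will be needed -->
<link rel="dns-prefetch" href="//pinkbike.org">
```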

Initially I disabled prefetch on a subset of the pages, and the result was dramatic and instant.

Additional tuning resulted in a further drop.

Steady state

After realizing this was the cause of our bloated DNS queries, I called up my buddies at deviantART to let them know about the discovery and asked them to implement the basic fix. Their result was also instant.

On deviantART this adds up to a decrease of about 10 billion DNS queries per month.

When I first heard of DNS prefetching I thought it was a great idea and assumed it was an optimization with absolutely no downsides. I did not fully understand the mechanics and the impact it may have on an organization. If 350 million queries cost Pinkbike.com $1600/month, then the cost of a few billion queries per month starts to be something to consider carefully.

Before you run off to disable prefetching on your site, please realize that the increase in queries is directly proportional to the number of subdomains you link to on your page. Arguably, prefetching is an effective way to decrease latency for most sites, but there are implications. I hope this article makes you aware of, and helps you understand, the potential impact and behaviour of this technology.

Additional Thoughts

- Browser and OS DNS caches typically hold on the order of a few hundred hostnames. If a user visits sites with lots of subdomain hostnames, do valuable entries in these caches get evicted frequently by prefetched entries, so that prefetching ends up causing increased latency?

- If I'm paying for an enterprise DNS solution which has DNS servers all over the world to minimize query time, is browser prefetching less valuable to me?

- If you're thinking of building a site with the pattern of username.example.com, this data may sway you away from it.

27 Comments

Great article, love the detail! Another option would be adding the DNS prefetch control setting in the response header rather than a meta tag in the theme. Essentially the http-equiv meta tag is the same as adding a header, although it becomes a dependency of the theme rather than a system-level configuration.
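Concretely, the header approach this commenter describes can live in the web server config instead of the markup; a minimal Apache sketch (assumes mod_headers is enabled):

```apache
# Send the prefetch control as a response header instead of a
# meta tag, so it is server configuration rather than part of the theme
Header set X-DNS-Prefetch-Control "off"
```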

I actually am in the middle of writing a rather large blog post on DNS, which will cover everything from optimizing internal DNS queries to DNS prefetching, caching, and how to improve resolution as a whole.

For the most part, I'm not sure why you're too concerned with prefetching--it's not a bad thing, and only a good thing.

As far as Windows goes, the DNS client caches non-existent records for 15 minutes. It will cache records it finds for TTL or 24 hours, whichever is shorter (by default). These values can be tweaked in the Windows registry.

Chrome does DNS prefetching and also makes known invalid queries in an attempt to see if your ISP is hijacking invalid responses, which can break some of Chrome's features.

I'll be sure to keep this site in mind when I finish the articles. Also, feel free to contact me at my e-mail address so we can discuss.

Most linux distributions by design do not cache DNS queries, although DNS server software usually does when it's being done recursively. I'm not sure of any default aging in BIND, I think it completely honors the TTL. Clients however do not.

"For the most part, I'm not sure why you're too concerned with prefetching--it's not a bad thing, and only a good thing."

But his whole article is about how prefetching can be bad on the provider side. So while it's good for the user, it created a DNS mess. I get how it would speed things up for the user if and when they choose to click on one of the links where the FQDN was prefetched... but if 100 FQDNs are prefetched and the user only ultimately clicks on 1 of them, then couldn't the other 99 be considered a waste of network resources?

Exactly. Prefetching may be effective on pages which have fewer linked subdomains. The effectiveness is based on the click probability of a prefetched link. If you have a large number of low-probability links, prefetching can be a problem in several ways:

1. Users generate more internal traffic.

2. You pay for low-probability speculative DNS resolutions.

3. Prefetched entries likely evict higher-probability cached hostnames from users' browser/OS caches, and potentially even ISP caches, effectively incurring the lookup costs again. (Will an ISP cache hold, say, 10 million subdomains of a site while honouring the full TTL?)

Do you REALLY need to pay for DNS solutions? You sit here and concern yourself with DNS prefetching yet your SOA record for pinkbike.com shows TTL is 1 hour. I'd argue that having such a low TTL is such an unnecessary thing unless you're doing round robin DNS.

You could DRASTICALLY decrease the amount of queries you receive by simply tuning up the TTL value. Hell, Windows clients alone cache DNS for 24 hours or TTL (whichever is shorter). So you already know that if your TTL goes beyond 24 hours you've already got at least that from Windows clients.

Linux clients on the other hand do not do client-side DNS caching.

I'm reading about the Dynect DNS solution--do you REALLY need anycasted DNS? I mean...seriously....I know people running massive numbers of sites on 1 to 4 authoritative name servers. The cost? Nothing more than it would cost to either host the servers or use the registrar's name servers (and with companies like godaddy, using their nameservers is free).

Of course, you lose the ability to dynamically update your DNS records like you're doing with Dynect.

But why? Seriously? You're seriously spending money on the simple fact that you want subdomains for every user of your site? If I were the IT guy I'd laugh at you if you approached me with that solution, especially if you mentioned that we'd be spending any money on it outside of infrastructure costs.

The subdomain architecture of the sites mentioned can't be taken back. That is something that was done many years ago for whatever reason. These sites now have millions of users/subdomains, and you can't really undo that. Also, users actually pay to have a more personalized portfolio/subdomain, so there is a business case there. Hey, dA is a top-100 site, and maybe one of the things that contributed to that is user subdomains. No point arguing about this; that's a constant. I would probably not do it like that on something new, or there would really have to be a good business reason.

I'm pretty sure the subdomains have a TTL of 24 hours, not 1 hour, and those are the records that matter here.

Yeah, I do want multiple servers around the world. Traffic to these sites is very global, so minimizing DNS lookup time, given the other constants, is a nice thing.

The point of the article is to make those who may be in this situation aware of prefetching and prefetch control.

Do I need to pay for DNS? No. Would I pay a fair price to someone providing the service, yes. We used to have our own email servers once too, but at some point someone did it better for a fair price, so we use that service.

How about implementing an alternative form of the user subdomain URL pattern for places like the comments section that doesn't use the subdomain pattern. Something like (pinkbike.com/user/radek) and have that redirect to radek.pinkbike.com so you maintain compatibility with the current scheme?
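A sketch of that redirect idea for Apache (the path pattern and flags are illustrative; assumes mod_rewrite is enabled):

```apache
RewriteEngine On
# Map pinkbike.com/user/radek to a 301 redirect to radek.pinkbike.com,
# so on-page comment links can use the path form and browsers have no
# per-user hostnames to prefetch
RewriteRule ^user/([A-Za-z0-9_-]+)/?$ http://$1.pinkbike.com/ [R=301,L]
```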

Looks like they use that DNS service to have DNS nodes around the world. Sounds fair to me, when I first learned about this stuff I read that big companies will do this to provide better service to their clients.

Heldt, they said it's $2 a month, no biggy. MySQL is free, you get unlimited databases. I've never done a large scale project like this but my guess is that they have a row for each user profile, not an entire db or table. If you are paying per database, someone is ripping you off.

Right on for the tech talk PB! I was stoked to see "DNS" on the main page

You really don't need to pay for any sort of anycasted DNS solution. That is, you're better served by setting a reasonable TTL on your zone. The only time it would matter is if you're doing round-robin DNS, and it doesn't look like you're doing that. In the case of RR DNS, TTLs are set especially low to try and get users to load balance between IPs. It has its place, but I suspect your load balancing issues would be better solved elsewhere in your infrastructure long before you have to deal with pointing users to different servers. You can use various caching front ends to drastically increase the load capabilities of your site.

If you set your TTL > 24 hours, Windows will always attempt to fetch a new record after 24 hours, however most servers should have already cached your information for the TTL value--so it's very likely their configured DNS server will respond in your place.

It's highly unusual when DNS queries will actually have to come to your authoritative service from the same client repeatedly in such a fashion.

And no offense to your relationship with DYNDNS (or for any person with OpenDNS relationships), but their services are usually not needed--and you're mostly paying for the front end, and some other snake oil. If I were looking to save money, your DNS service should be the first thing you evaluate.

You will not find another site with this large an interacting user volume set up like this online, as each subdomain requires a DNS lookup.

All photos, messages, and personal user data reside on their own subdomains, making some pages need to perform upwards of 40 different DNS lookups under pinkbike if you are browsing the forums or Buy/Sell.

This whole thing could likely be fixed with a single mod_rewrite line in your .htaccess file, and proper use of a CDN.

I've always been a huge unfan of using a domain prefix for things that really should be part of the URI. This just adds to that dislike.
Good article and a good explanation of how DNS precaching works.

Rather than reducing the web browsing performance of all users of the site (by disabling prefetching), why not simply use a 10-line HTML fix and rename all user comment links so they don't use a subdomain? They could link to an internal page which reroutes to a subdomain, or possibly a rewrite rule could do it. Or are you only turning off prefetching within the comments?