Monday, July 10, 2017

It's been a while since I've posted anything here. I've been busy with work and various other projects, and nothing has quite risen to the "gotta write about it" level, until now.

This morning, someone at work drew my attention to a post on The Hacker Blog that they thought warranted a closer look. For those who haven't seen it, it purports to describe how the author managed to almost hijack most of the DNS traffic for the .io Top Level Domain. The post has started to receive a bit of attention elsewhere, and while it describes a definite mistake on the part of the Backend Registry Operator for the .io TLD, it definitely does not constitute the catastrophe implied by the article.

The problem with the article stems from the author's misunderstanding of how delegations in DNS actually work, and of the part that the behaviour of both recursive and authoritative name servers plays in the described "hijack." The author assumes that, because he's able to register a domain name that matches several of the authoritative name server names for the .io TLD, it is "likely that clients will randomly select our hijacked nameservers over any of the legitimate nameservers..." This is wrong.

The author demonstrates his "hijack" by pasting these results to a DNS query:

In order to poison a DNS server, you must understand the normal queries that would typically be sent by that server and be in a position to answer one of them with a crafted response, or be in a position to trigger specific abnormal queries that will elicit your poisoning response. The problem with the example query is that it would never be sent by a typical client, without some sort of abnormal prompting.

Since the author doesn't claim any ability to trigger unusual queries in arbitrary recursive servers, to evaluate the attack we should look at the typical queries a recursive name server would send when trying to look up an .io domain. Let's see what would actually happen if someone were to look up the A record for 'bit.io' (the first .io domain that popped into my head). We'll assume an empty cache in order to give the attacker the greatest advantage, but skip the root priming query for simplicity, and because it isn't relevant here.

The first query that will be sent is to the root. Because recursive servers are normally trying to get the most work done with the least effort, they send the query for the information they're actually interested in, and deal with whatever response they get (the exception is a newer option in the lookup algorithm called QNAME minimisation, described in RFC 7816, but it would have no effect on this). Therefore, the server does not ask the root for the .io name servers; instead, it asks for the A record for bit.io.

The name server responds with the list of name servers for the bit.io domain. Note the very important difference between this and the example given in the original article. The list of .io name servers is nowhere to be seen. The client will go on and ask one of the Dreamhost name servers the same question:

It turns out there is no A record for bit.io, so the client is going to be disappointed. More importantly, it will not be poisoned with the attacker's NS set. The only set of name servers in this query chain to give the list of .io name servers was the root, and the contents of the root zone are unaffected by what you can convince your Registrar to pass up to your Registry, or your Registry to put in their zone.
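The query chain described above can be sketched as a toy model. This is only an illustration under stated assumptions: the server labels ("root-server", "io-tld-server", "bit-io-server") and the referral table are hypothetical stand-ins, not real .io data, and the model ignores caching, retries, and QNAME minimisation. What it shows is that every server in the chain is asked the same question, and that no NS query for io. is ever sent.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical referral data: for each server, the child zones it delegates
# and the server label each referral points at. Not real .io data.
REFERRALS: Dict[str, Dict[str, List[str]]] = {
    "root-server": {"io.": ["io-tld-server"]},
    "io-tld-server": {"bit.io.": ["bit-io-server"]},
    "bit-io-server": {},  # authoritative for bit.io., delegates nothing
}


def closest_delegation(server: str, qname: str) -> Optional[str]:
    """Return the longest zone delegated by `server` that contains qname."""
    best = None
    for zone in REFERRALS[server]:
        if qname == zone or qname.endswith("." + zone):
            if best is None or len(zone) > len(best):
                best = zone
    return best


def resolve(qname: str, qtype: str = "A") -> List[Tuple[str, str, str]]:
    """Follow referrals from the root, logging every question that is sent."""
    sent = []
    server = "root-server"
    while True:
        # The full question goes to every server, including the root.
        sent.append((server, qname, qtype))
        zone = closest_delegation(server, qname)
        if zone is None:
            return sent  # this server answers authoritatively (or with no data)
        server = REFERRALS[server][zone][0]


for server, qname, qtype in resolve("bit.io."):
    print(f"ask {server}: {qname} {qtype}")
```

In this model the only questions on the wire are for bit.io. A, so a response carrying the .io apex NS set has no query to ride in on.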

The key element here is that the name servers for the .io TLD don't respond with their own NS set in their response. The only way you're likely to get that response out of those servers is to specifically ask, and that's a query rarely performed by your typical recursive DNS server. Even then, the author's attack doesn't work.

Let's use a concrete example. Here's a nearly-empty zone I just created for the TLD "myTLD".

The zone contains only those things necessary to make it a valid zone, plus one delegation for example.myTLD. The Hacker Blog author's attack was to add a delegation for one of the name servers. Here's what that would look like in the above zone:

Note that this creates a conflict in the zone. There is both a delegation and an A record for ns1.localhost.myTLD. In all authoritative DNS servers this converts that A record from an authoritative record in the zone to an "occluded name." In simple terms, the A record is hidden by the presence of a delegation at the same point in the tree.
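As a rough illustration of occlusion, here is a minimal sketch of how an authoritative server might pick its response. The zone contents are assumptions made up for the example (the 192.0.2.1 address, ns1.example.com., and the ns.attacker.example. target are all hypothetical), and the lookup is heavily simplified: it ignores wildcards, CNAMEs, DNSSEC, and the additional section.

```python
from typing import Dict, List, Tuple

Records = Dict[Tuple[str, str], List[str]]

APEX = "mytld."

# Hypothetical zone data, lowercased and fully qualified.
ZONE: Records = {
    (APEX, "NS"): ["ns1.localhost.mytld."],
    ("ns1.localhost.mytld.", "A"): ["192.0.2.1"],
    ("example.mytld.", "NS"): ["ns1.example.com."],
}

# The same zone with the attacker's delegation added at the name server's name.
ATTACKED: Records = dict(ZONE)
ATTACKED[("ns1.localhost.mytld.", "NS")] = ["ns.attacker.example."]


def names_below_apex(qname: str, apex: str) -> List[str]:
    """List the names between the apex and qname, top down, apex excluded."""
    if qname == apex:
        return []
    if not qname.endswith("." + apex):
        raise ValueError(f"{qname} is not inside {apex}")
    labels = qname[: -len(apex)].rstrip(".").split(".")
    return [".".join(labels[i:]) + "." + apex
            for i in range(len(labels) - 1, -1, -1)]


def lookup(records: Records, apex: str, qname: str, qtype: str):
    """Return (kind, owner, rdata), where kind is answer, referral or nodata."""
    # A delegation at or above qname (below the apex) wins over any other
    # data at or beneath it: that data is occluded.
    for name in names_below_apex(qname, apex):
        if (name, "NS") in records:
            return ("referral", name, records[(name, "NS")])
    if (qname, qtype) in records:
        return ("answer", qname, records[(qname, qtype)])
    return ("nodata", qname, [])


print(lookup(ZONE, APEX, "ns1.localhost.mytld.", "A"))      # authoritative A record
print(lookup(ATTACKED, APEX, "ns1.localhost.mytld.", "A"))  # referral; A is occluded
print(lookup(ATTACKED, APEX, APEX, "NS"))                   # apex NS set is unchanged
```

In this sketch the delegation hides the A record, but it never puts a new address into the zone, and the apex NS set is still answered from the zone's own authoritative data.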

Querying this zone for the A record for www.example.myTLD has the same results as for bit.io in the above tests, but what happens if we query this TLD name server for its own NS set?

And again, this is not a query that is normally going to be sent by any recursive name server trying to look up 'www.example.myTLD.' That client just wouldn't care. What the author succeeds in doing is to add a delegation to the .io zone. What he needs to do, to redirect any traffic at all, is to get an address into the .io zone. This is not going to happen with the method he's used.

In theory, getting an address for one of the TLD's name servers into the TLD zone might be possible if he can register a host record with his registrar, but success assumes several things:

That the .io TLD uses host records at all:

Unbound by the same restrictions as the gTLDs, some ccTLDs don't bother with all of the bells and whistles of the EPP standards. I don't know about .io specifically, but there are some ccTLDs that do not allow the registration of host records.

That the same restrictions are absent for host records as were for delegations:

Assuming the Registry for .io uses host records, they may have a different set of criteria for allowing or disallowing the registration of host records than they do delegations. For example, it's unlikely you can register a host record that is not a subdomain of a domain you already have the delegation for, and it's not unreasonable to expect that the registry might require the host record to actually be a subdomain, and not be equal to the domain name.

That the registry doesn't have safeguards in its publishing that prevent duplicate records for its name server set:

Depending on how the Registry publishes its zone, it's possible (even likely) that the minimal set of records (the SOA, the apex NS set, and glue for the apex NS set) are handled differently during zone generation than delegations and their glue. The former is likely a static set, and the latter would typically come out of a database. Most Registries have safeguards in place to check the validity of a newly generated zone, and would be wise to include checks of the former "minimal set" in those tests.
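The second and third assumptions above can be illustrated with a small sketch of the kind of checks a registry might run. This is not how any particular registry works, .io included; the function names, the conflict labels, and the sample names are all hypothetical, and names are assumed to be lowercased and fully qualified with a trailing dot.

```python
from typing import Iterable, List, Set, Tuple


def is_strict_subdomain(host: str, domain: str) -> bool:
    """True only if host is below domain, not equal to it."""
    return host != domain and host.endswith("." + domain)


def host_record_allowed(host: str, registrant_domains: Iterable[str]) -> bool:
    """Hypothetical policy: a host record must sit strictly beneath a
    domain the registrant already holds."""
    return any(is_strict_subdomain(host, d) for d in registrant_domains)


def find_apex_conflicts(apex_ns: Set[str],
                        delegations: Set[str],
                        db_hosts: Set[str]) -> List[Tuple[str, str, str]]:
    """Compare database-generated data against the static apex NS names.

    Flags a delegation at or above an apex name server's name (which would
    occlude its glue) and a database host record duplicating that name.
    """
    conflicts = []
    for ns in sorted(apex_ns):
        for owner in sorted(delegations):
            if ns == owner or ns.endswith("." + owner):
                conflicts.append(("delegation-occludes-apex-ns", owner, ns))
        if ns in db_hosts:
            conflicts.append(("duplicate-apex-glue", ns, ns))
    return conflicts


# Example run against made-up data mirroring the myTLD zone above.
problems = find_apex_conflicts(
    apex_ns={"ns1.localhost.mytld."},
    delegations={"example.mytld.", "ns1.localhost.mytld."},
    db_hosts=set(),
)
for kind, owner, ns in problems:
    print(f"{kind}: {owner} conflicts with apex name server {ns}")
```

A check like this, run before a newly generated zone is published, would have flagged the delegation described in the original article.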

During the day, and often at night, Matt works as a DNS infrastructure operator. He is the former DNS Operations Manager for the Canadian Internet Registration Authority (the organization that manages the .ca Internet domain), has worked DNS operations at Afilias (.info, .org, and several others), has served on the board of directors of DNS-OARC, and currently works as the Sr. DNS Engineer for Rightside (.ninja, .rocks, and many others) ... though he doesn't speak for any of the above here.

He enjoys live music, good beer, and speaking about himself in the third person.