Handling Endless Delegation Chains in Windows DNS Server

DNS servers that act as caching resolvers send queries to authoritative DNS servers. One of the possible responses that an authoritative server can send is a ‘referral response’. The most common reason for an authoritative server to send a referral is that it hosts the parent zone, which contains the delegation to the child domain that the query was for. Such referral responses normally contain a name server (NS) record pointing to the DNS server that hosts the delegated domain and an address record (aka glue) giving the address of that name server.
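To see what a referral looks like, a non-recursive query can be sent straight to a parent-zone server. The following is only a sketch: the query name www.example.com and the root server address 198.41.0.4 (a.root-servers.net) are illustrative choices, not taken from this article, and the exact output depends on the environment.

```powershell
# Ask a root server directly, without recursion, for a name it is not authoritative for.
# www.example.com and 198.41.0.4 are illustrative assumptions.
$response = Resolve-DnsName -Name www.example.com -Type A -Server 198.41.0.4 -NoRecursion -DnsOnly

# A referral is expected to carry NS records (the delegation) and address records (the glue)
# rather than an answer; listing the section of each record makes this visible.
$response | Select-Object Name, Type, Section
```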

Recursive resolvers that receive such a response are expected to follow the referral: find the address of the name server of the delegated domain (either from the supplied glue or via a new query) and send the original query to that name server. They are expected to repeat this ‘recursion’ until they get an authoritative response from one of the authoritative servers.

This way of answering queries exposes a possible vulnerability: a set of rogue DNS servers can create very long or endless delegation chains and force recursive resolvers into unbounded recursion. This can exhaust the resolver's resources and result in denial of service. Such a vulnerability has recently been reported in some DNS solutions.

Windows DNS Server addresses this scenario with an upper limit on the ‘effort’ that the server makes to fetch responses from authoritative servers. This limit is imposed by the ‘recursion timeout’ setting, which can be set to a maximum of 15 seconds. In effect, once the server has waited the configured number of seconds for a query to be processed, it gives up on the query and returns a server failure to the client. By default, the recursion timeout on Windows DNS Server is 8 seconds.
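As a minimal sketch, the recursion timeout can be inspected and changed with the DnsServer PowerShell module. This assumes the commands run in an elevated session on the DNS server itself; the value 10 is an arbitrary illustration, not a recommendation.

```powershell
# Show the current recursion settings, including the recursion timeout
Get-DnsServerRecursion

# Set the recursion timeout to 10 seconds (illustrative value; the stated maximum is 15)
Set-DnsServerRecursion -Timeout 10

# Display the settings again to confirm the change
Get-DnsServerRecursion
```

To manage a remote server instead, both cmdlets accept a -ComputerName parameter.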

Users need to be judicious in setting this value, taking into account network latency and other characteristics of their environment. Setting it too low can cause valid queries to time out as well if the network or the authoritative server is slow. Apart from this setting, users can also change the time the server spends chasing records in the additional section of a response by adjusting the recursion “additional timeout” property, which is set to 4 seconds by default. For example, the following command sets it to 3 seconds:

Set-DnsServerRecursion -AdditionalTimeout 3

The same logic applies when the recursive resolver is chasing a CNAME chain. In this case, besides the timeout, Windows DNS Server also places an upper limit on the number of links it will follow in a CNAME chain.

A related vulnerability has been found in some resolvers that follow multiple referrals at once, which can cause large bursts of network traffic. Windows DNS Server does not query multiple referrals simultaneously, so it does not exhibit this vulnerability either.