This article explains how to set up DNSSEC validation with the Unbound DNS resolver. A companion article on BIND also exists. Note that Unbound was written with security in mind from the ground up, and carries less historical baggage than BIND.

Install. We used Unbound 1.4.5 on Debian Linux. Variations should work; there is even a prebuilt executable installer for Windows. Aside from the general practice of always running the latest Unbound for security reasons, it is specifically good to use 1.4.0 or later, because from that version on Unbound can keep up to date with root zone keys as they roll over.

The best option on Linux is to build it from source code (some distributions offer pre-built packages, and Unbound is included in several BSD ports trees). The source can be obtained from http://unbound.net/ or https://unbound.net/, where the latter is protected by a CAcert-signed certificate in the name of the maker’s primary domain, nlnetlabs.nl.

The build is straightforward; by default everything is installed under /usr/local, which we will assume here, since you may not want to override an existing setup in /etc and /usr. So you would do:

./configure
make
make install

Configure trust. First, find the trust anchor for the root zone and verify that it is reliable. Then edit the configuration file for Unbound in /usr/local/etc/unbound/unbound.conf. At the very least, set auto-trust-anchor-file: /usr/local/etc/unbound/root-trust-anchor. This file will be automatically updated by Unbound when the root zone publishes updates, so make sure that the user Unbound runs as can find and modify it, for example by transferring ownership of the file with chown.
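A minimal server clause along these lines (assuming the /usr/local paths from the build above) would be:

```
server:
    # must be readable and writable by the user Unbound runs as,
    # so that root key rollovers can be tracked automatically
    auto-trust-anchor-file: "/usr/local/etc/unbound/root-trust-anchor"
```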

Fill /usr/local/etc/unbound/root-trust-anchor with the initial DS record(s) for the root zone. This will usually be needed once, only to be repeated after extended periods of downtime of all your resolvers. What you do is download and possibly validate the XML file as described in https://data.iana.org/root-anchors/draft-icann-dnssec-trust-anchor.html and then you construct a DS record by filling in this pattern:

$ZONE IN DS $KEYTAG $ALGORITHM $DIGESTTYPE $DIGEST

This will end up looking like this (assuming you do this before the root signing key is rolled):
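The root zone’s DS record as published by IANA in 2010 — do verify it yourself against the root-anchors data rather than trusting this copy:

```
. IN DS 19036 8 2 49AAC11D7B6F6446702E54A1607371607A1A41855200FD2CE1CDDE32F24E8FB5
```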

You probably want to ensure that the older DLV option dlv-anchor-file is disabled, unless you have decided to mix this alternate channel with the trust paths from the root zone. Also check that any other trust-anchor-file entries are gone, as should be the case in a pristine Unbound setup. If you have been relying on ITAR in the past, this is the time to clean up that temporary trust anchor.

Configure Unbound. Second, instruct Unbound to actually require DNSSEC on all zones that are signed with a chain of trust from the root down. Note that this does not apply to the “islands of trust” that may hang somewhere under the root zone without a trust link all the way up; these can still only be validated through ISC DLV.

Requiring DNSSEC is set up with the following option in the configuration file:

module-config: "validator iterator"

Run. Now fire up Unbound:

/usr/local/sbin/unbound

Unbound should now pick up the DNSKEY for the configured trust anchor and overwrite the file with it. In doing so, it has used the configured DS record to validate the key. This is a good test to see whether updating the file works, and whether Unbound accepted your homegrown DS record.

A successful DNS reply will include the Authenticated Data or AD flag, which serves as an assurance to stub resolvers that are not DNSSEC-aware, in this case human eyeballs. A quick way to see it is to query your resolver with dig and look for “ad” in the flags line of the reply.
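The AD bit itself is easy to inspect programmatically. As a small illustrative sketch (not part of the original setup), this is where the flag sits in the header of a raw DNS response:

```python
def has_ad_flag(dns_message: bytes) -> bool:
    """Return True if the Authenticated Data (AD) bit is set.

    The flags live in bytes 2-3 of the DNS header; the AD bit is
    bit 5 of the low flags byte (mask 0x0020).
    """
    flags = int.from_bytes(dns_message[2:4], "big")
    return bool(flags & 0x0020)

# A header with flags 0x81a0 (qr rd ra ad) versus 0x8180 (qr rd ra):
validated = bytes.fromhex("000281a00001000100000000")  # AD set
plain = bytes.fromhex("000281800001000100000000")      # AD clear
print(has_ad_flag(validated), has_ad_flag(plain))  # → True False
```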

Why query for DS, you wonder? Like any parent/child transition in DNS, the TLDs in the root zone are present in both parent and child name servers. The DS record is the only one that is normally only present in the parent, so this answer is certain to come from the parent, which is the root zone because we are asking for a TLD name. Further down, things start depending on more complex constructions. More on that later!

And that’s all, you’re done. The only thing you may want to ensure is that new signing keys are pulled in when the root zone rolls over its keys, specifically its Key Signing Key. The procedure should be automatic with the setup given here, but it’s probably better to be safe than sorry.

Before you start using the root trust anchor, it is very important to verify it. ICANN has specified several methods for doing this. We relied on the PGP signature made by one of the trusted community representatives, Olaf Kolkman of NLnet Labs.

Again, please make sure that you validate the trust anchor before you start using it; it has no value whatsoever if this important step is not taken.

For those who have my (Roland van Rijswijk) PGP key, you can find a signed statement that I have validated and trust Olaf’s signature on the root trust anchor here.

UPDATE: Jakob Schlyter and Fredrik Ljunggren of Kirei (one of the organisations that contributed to the root zone DNSSEC design team) have also published a signed statement on their website.

Today was a big milestone in the deployment of DNSSEC on the Internet with the signing of the root zone. For system administrators of recursive caching name servers – or as they are colloquially known, resolvers – this is good news. For the first time ever, they can configure a trust anchor for the root zone in their resolver and start validating based on the actual DNS infrastructure instead of having to rely on interim solutions like DLV or ITAR.

We received a question about configuring this trust anchor earlier today: “how do I configure this in my server”. We are going to address this in two blog posts in the coming week, and will include a step-by-step guide on how to do this for both Unbound as well as for BIND.

One final note: even though the trust anchor for the root is available, this does not mean that you can achieve the same level of validation as is now possible with DLV. This is due to the fact that islands of trust still exist (for instance in the form of signed .net or .com second-level domains). We are therefore going to be using the root trust anchor and DLV in parallel for some time.

UPDATE: Wolfgang Nagele has written a mini HOWTO on using the root trust anchor, you can find that here.

This blog deals with lots of details on how DNSSEC can be rolled out, but it should also touch upon why it is a good idea. It will be clear that rolling it out as a professional service is by no means straightforward, so it is a fair question what advantages it brings.

The common but unimaginative answer is that it thwarts the Kaminsky attack. This is an attack that can be used to fill a cache with falsified DNS data, for example to redirect website traffic or email to an attacker’s server. Kaminsky devised a practically achievable version of an attack that had long been known to be theoretically possible: a cache poisoning attack. While it is absolutely necessary to overcome this problem in our modern internet with its many commercial and privacy-related uses, this is not the only benefit to be gained from using DNSSEC.

A general problem on the Internet is that we deal with remote parties that we may never have met before, which generally raises concerns of trust in such a person. There are parties that believe that assured identities can solve these matters, but if you have ever tried to talk the police into arresting a distant crook who somehow tricked you, you will know that this is not the whole answer.

Of course there are some situations where online identities are grounds for trust; you might trust someone with an email-address under sec.gov with financial details that you would not freely share with others; you might send your credit card information to amazon.com but not just anywhere, and so on. Online identities such as domain names can be useful to gain trust in a distant person, even if you never spoke to such a person before. But even then, we need ways of establishing that a person represents a domain that we trust.

The general challenge in first-contact situations over a distance is to establish a secure link. For instance, how to be sure that an email came from the domain it claims to come from, or how to send an encrypted email that can only be read within that domain. Cryptography can help with that, but only after getting hold of a public key that is somehow linked to that domain.
The current approach for this is reliance on trusted third parties, which are paid quite a bit of money to establish the identity and domain ownership of the party you never met. Unfortunately, to reduce the cost of such services, they often rely on mechanisms that are susceptible to traditional fraud, so they are not very reliable. In the end, even a trusted third party often boils down to legal or moral grounds for trust; this is simply the best that we currently have.

But then DNSSEC enters the scene. What it does is sign the content that any domain owner can publish in their own zone, with trust links from the root down. The path from the root to a privately owned domain traverses infrastructure that is maintained by technicians, not commercial operators. These technicians already have ways of establishing domain ownership, and are therefore rather attractive candidates for the trusted-party role.

What remains to be done now is to publish information securely in DNS, sign it with DNSSEC and setup client software to use that as a trust foundation. A few highly appealing things have already been standardised.

There is an SSHFP record that stores a server’s Secure Shell fingerprint. This adds value because it avoids the single weak point in the SSH protocol, namely first access between a client and server machine. When properly maintained, SSHFP can make the security of SSH absolute. And quite interestingly, it is already very powerful when used internally in a company!
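As a sketch of what such a fingerprint is (an illustration of the idea, with a made-up key blob): an SSHFP record with fingerprint type 1 stores the SHA-1 hash of the server’s raw public key blob, which a client can recompute and compare:

```python
import base64
import hashlib

def sshfp_fingerprint(pubkey_b64: str) -> str:
    """SHA-1 fingerprint of an SSH public key blob, as stored in an
    SSHFP record with fingerprint type 1."""
    return hashlib.sha1(base64.b64decode(pubkey_b64)).hexdigest()

# A structurally valid (but made-up) ssh-rsa style blob for illustration:
blob = b"\x00\x00\x00\x07ssh-rsa" + b"\x00" * 16
example_key = base64.b64encode(blob).decode()
print(sshfp_fingerprint(example_key))  # 40 hex characters
```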

There is a CERT record that can store identity information falling under a domain. Both X.509 and OpenPGP certificates are possible. Although there currently does not seem to be a mechanism to use this for trust in self-signed server certificates, this is definitely possible. Plus, given the abundance of IP addresses that the future has in store with IPv6, secure servers can easily become the default practice of tomorrow’s Internet!

There is an ENUM infrastructure to store personal contact information under one’s phone number. Such details should not be sidetracked, for risk of tapping your conversations. The obvious answer to protect against that is DNSSEC.

I’m sure this is just the beginning. DNSSEC may not be pretty on a bit level (but then again, what part of DNS is?) but at least it brings us fundamentally new possibilities. If you are not satisfied with “just” solving the ability of attackers to sidetrack your domain, then there are plenty of places where the overall security of your online operations benefit from the introduction of DNSSEC. Beating the Kaminsky attack is a must, but there is enough to make DNSSEC worthwhile on its own account. DNS is everywhere, so protecting it through DNSSEC has a lot to offer to all of us net-savvies.

If any design principle has been leading our architectural work around resilience for DNSSEC, it has been idempotence. It is one of those algebraic concepts that really helps to beat sense into a complex set of choices.

Idempotence means that doing the same thing twice is no different from doing it once. Painting orange on an orange wall still delivers an orange wall, to give an example.

The architecture that we defined for OpenDNSSEC signing involves redundant signers. And signers keep state. If one crashes, we are not necessarily aware of how much of the state has reached the secondary server; so how to handle that?

The solution is simple, and derives from the idea of idempotence: rather than stating which zones to add to or remove from the signing facilities, we simply state what the desired situation is. Concretely, we list the zones that we want to have signed, and the key repository to use for each. The underlying system is then required to determine changes with respect to the state it is aware of, and effect any such changes in the usual, automated ways that are part of OpenDNSSEC.
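A minimal sketch of this declarative approach (hypothetical names, not the actual OpenDNSSEC interface): the caller always submits the full desired zone list, and the system derives the changes itself, so submitting the same list twice is a no-op:

```python
def plan_changes(current: set, desired: set) -> tuple:
    """Compare the known state with the desired state and return
    (zones_to_add, zones_to_remove)."""
    return desired - current, current - desired

current = {"example.nl", "old.example.org"}
desired = {"example.nl", "new.example.com"}

to_add, to_remove = plan_changes(current, desired)
print(to_add, to_remove)  # → {'new.example.com'} {'old.example.org'}

# Idempotence: applying the same desired state again yields no changes.
current = (current | to_add) - to_remove
print(plan_changes(current, desired))  # → (set(), set())
```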

The great benefit derived from this is that the web interface (which we call SURFdomeinen) can be blissfully unaware of its communication partner’s state: Instead of knowing whether it is talking to the master or a slave that is preparing to become a master, it will simply upload the signed-zone list to whoever happens to be interested at that time. Master/slave rollovers need not be bothered with in SURFdomeinen, because the signers themselves care for the differentiation between the current and intended situations.

As it turns out, OpenDNSSEC is a great help with this approach. It does support commands to add or delete zones and policies, but under the hood these commands edit files and then ask the KASP Enforcer to determine what has changed, and update the database accordingly. Basically, OpenDNSSEC supports idempotence at heart! So instead of using the add/delete commands, what we do is generate parts of the configuration and ask the KASP Enforcer directly to update the database with changes. The two configuration files that we generate are:

/etc/opendnssec/zonelist.xml with the list of zones received. We use a transaction to pass in the zones and after having received all successfully we explicitly require a command to pass the zones on to the signer. This means that we will always get the full list, generate the complete zonelist.xml from it and let OpenDNSSEC sort out what changes there are.
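A zonelist.xml generated this way might look roughly as follows (zone names and paths are illustrative; see the OpenDNSSEC documentation for the authoritative format):

```xml
<ZoneList>
  <Zone name="example.nl">
    <Policy>institution-x</Policy>
    <SignerConfiguration>/var/opendnssec/signconf/example.nl.xml</SignerConfiguration>
    <Adapters>
      <Input><File>/var/opendnssec/unsigned/example.nl</File></Input>
      <Output><File>/var/opendnssec/signed/example.nl</File></Output>
    </Adapters>
  </Zone>
</ZoneList>
```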

/etc/opendnssec/kasp.xml with a policy for each of the institutions that we serve. As we upload zones, we annotate it with a textual handle for the institution, and use that to assign a policy to each zone as we enter it into the zonelist. The policy per institution is not used to vary cryptographic parameters, but it serves as the scope for a key sharing discipline. Knowing that HSMs generally limit the number of objects/keys they can hold, we decided to share keys within each institution.
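In kasp.xml terms, key sharing is switched on per policy, roughly like this (an abridged, illustrative fragment, one policy per institution):

```xml
<Policy name="institution-x">
  <Keys>
    <!-- share keys between all zones under this policy -->
    <ShareKeys/>
    <!-- key timing and algorithm parameters omitted -->
  </Keys>
  <!-- Signatures, Zone and Parent sections omitted -->
</Policy>
```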

If ever our master signer crashes, we assume the best we can, namely that any metadata on DNSSEC keys is in our redundantly shared database. Then all we need to do is upload the zonelist from SURFdomeinen to this slave, generate the configuration files from it and ask OpenDNSSEC to process them and continue signing.

Idempotence is a mathematical statement: a function f applied to the outcome of f always yields the same result as applying f only once, or f(f(x)) = f(x). As abstract as that may be, it clearly is a concept that has direct impact on our understanding of the real world, and on how to carve out its mechanics in computer science constructs. All the way down to the level of programming, we have established how the concept can not only structure, but also simplify our reasoning and understanding of what goes on and what needs to be done.

In a previous post we addressed access control on the network level. This post will focus on access control in various ways on the signer machine.

User access control

The most basic – but nevertheless important – way of controlling access is by determining which users need access to the signer machine and the potentially sensitive data stored on it. In our case, this is limited to two categories of human users:

System/application administrators – these need to access the system to keep it up-to-date and to troubleshoot potential problems

Backup officers – these need to access the system in order to tell OpenDNSSEC that a backup of key material has been performed (this is necessary before newly generated keys are taken into production, more on that in a later post)

Human users can only access the machine through SSH or on the console.

Apart from human users, there are also ‘application’ users. In our setup we have two application users:

SURFdomeinen – the SURFdomeinen system needs to be able to tell the signer if a new zone needs to be signed and other administrativia. This is achieved by having a user account with a very limited shell that is only accessible over SSH. The SURFdomeinen system can then interact with the signer using a very limited set of commands. Authentication of this user is achieved using public key authentication.
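One common way to build such a limited, key-authenticated account is an OpenSSH authorized_keys entry with a forced command; the command path and comment below are hypothetical:

```
command="/usr/local/bin/surfdomeinen-cmd",no-port-forwarding,no-pty,no-X11-forwarding ssh-rsa AAAA... surfdomeinen@portal
```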

Backup users – a nightly backup of the signer system is performed using rsync; a special backup user requires access in order to be able to do this (more on backups below).

Protecting sensitive data on the signer

Most data stored on the signer is not sensitive in our case, but there is one piece of information that warrants special attention: the user PIN for the HSM.


The user PIN grants access to all the key material stored on the HSM and can thus be misused to forge signatures. Although access to the signer is highly restricted, there is one other problem that we are facing: an evildoer could inspect the disks of the signer machine (if he or she managed to gain access) and in our case this problem is very real. We run our signers as virtual machines, which are running on large virtualisation platforms with centralized storage. It is possible to access the VM’s disks from outside the virtual machine, making it very easy to snoop the user PIN from the disk.

To prevent this from happening, we store all configuration files that contain the user PIN in encrypted form. We use Red Hat Enterprise Linux on our signers, which comes packaged with an encrypted file system called ecryptfs. One of the best features of ecryptfs is that it allows you to do ‘overlay’ mounts: we can remount a directory onto itself (roughly, mount -t ecryptfs /some/dir /some/dir), replacing the encrypted version with the decrypted version from a user’s perspective. The encrypted volume is protected by a strong, randomly generated passphrase.

Finally, we have created a policy which states that the encrypted volume must not be mounted automatically; instead, the passphrase needs to be entered manually by a system administrator during activation of the signer.

Security includes backups

One final note: always keep in mind that security includes backups. This has two sides to it:

Backups ensure that you can restore your system in case of catastrophic failure or hacking

Backups should never include sensitive data in the clear – thus – in our case – we must never create an rsync based backup that includes data from the mounted encrypted volume.

Introduction

A big part of the security of our infrastructure is determined by the access control we enforce on all the components that form the DNSSEC signer infrastructure. Access control is important on several levels:

Network level

Access to machines and user privileges on these machines

Access to sensitive data on the signer

HSM roles

In this post, we’ll zoom in on how we intend to set up network level access control in our DNSSEC deployment; the other levels will be addressed in upcoming posts.

Network level

On the network level, we have singled out all components and enforce strict ACL rules. The diagram below shows our network setup:


The top-half of the diagram shows the core DNS and DNSSEC infrastructure that we maintain in each of the two colocations that have a signer instance. Each colocation has a DNS VLAN; this VLAN contains both the local authoritative nameserver(s) as well as the local OpenDNSSEC signer instance. Access to the VLAN is controlled by a firewall on the local router (shown as a separate component in the diagram). Each colocation also has a specific VLAN for the local HSM; again, access to this VLAN is controlled by a firewall in the local router.

The right hand side of the diagram shows an administrative VLAN containing systems from which system administration may be performed on both the HSM as well as on the DNS infrastructure.

Finally, the bottom right hand side of the diagram shows the SURFdomeinen server, which is the actual web portal in which users can manage their DNS zones.

The dotted red arrows in the diagram show the allowed flow of network data between the various components in the infrastructure. Let’s go through them one by one.

Inside the DNS VLAN, communication is allowed between the authoritative DNS server and the signer. The host-based firewall on the signer, however, will only allow SSH access and access on port 53 to the hidden DNS master running on the signer. On top of that (not shown in the diagram) all other authoritative nameservers that SURFnet operates also have access to this hidden master running on the signer.

The other red arrow going into the DNS VLAN comes from the SURFdomeinen server. This arrow signifies the DNS transfers that SURFdomeinen does to the zone input on the signer.

The red arrow leaving the DNS VLAN for the HSM VLAN is a link that allows the signer – and only the signer – access to the SSL port that the HSM uses to accept commands. On top of this link being shielded by ACL rules in the router, the link is also mutually authenticated using certificates, thus guaranteeing that only the signer can use the HSM. There is also a link from the signer in the other colocation to the SSL port on the HSM (not shown in the diagram); both signers can access both HSMs because the HSMs run in a redundant high-availability setup.

The final red arrow goes from the administrative VLAN to the administrative port on the HSM, allowing authorised system administrators access to the HSM to perform – for instance – backups.

As you can see, there is no direct access for administrators to the signer. System administrators do have access to the authoritative DNS servers (not shown in the figure), and from these they can then access the signer for system administration. This two-hop route was deliberately chosen to make it harder for attackers to access the signer.

Note that none of this data ever travels over the Internet; it stays within the SURFnet WAN.

What we want to show with this post is that it is important to consider access control on a network level. This is your first line of defense against possible evildoers. We recommend that you carefully consider which entities require access to certain resources, and that you design your infrastructure such that, by default, all access is prohibited, opening up only specific ports for particular entities to access particular resources. We also recommend that you document your network design so that all parties that play a role in your DNSSEC deployment are aware of the restrictions that have been imposed and the rationale behind them.
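As a tiny illustration of the default-deny idea (addresses and ports are made up, and in our setup the real ACLs live in the routers rather than in a single host file), a host firewall in iptables-save format might look like:

```
*filter
:INPUT DROP [0:0]
# only the signer may reach the HSM's SSL command port
-A INPUT -s 192.0.2.10/32 -p tcp --dport 443 -j ACCEPT
# administrative SSH only from the admin VLAN
-A INPUT -s 192.0.2.0/28 -p tcp --dport 22 -j ACCEPT
COMMIT
```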

Our previous posts on our DNSSEC push-button service (with some delay) gave an idea of how we want to present DNSSEC to our end users, namely as a flag that can be toggled on and off for each domain managed in the “SURFdomeinen” DNS management tool. Aside from delays in actually processing these requests, we do not see a need for the end user to make any choices about cryptographic or other operational issues.

Our architecture for DNSSEC revolves around a signer which receives unsigned domains from the SURFdomeinen environment, and the usual slaves (the public authoritative name servers) that publish the domains once they are signed. In support of the signer there is a Hardware Security Module (or HSM) that securely stores signing keys and that constructs signatures with them. Furthermore, there is a node that monitors the signing process as well as normal DNS operation.

It is important that the signing process runs without a single point of failure; this is because, unlike plain DNS, the signatures that are created on each zone have a validity period with an absolute end time. If that time passes without fresh signatures being published in the zone, then the domain cannot be validated and it will disappear from the Internet until a new signature has been created. Redundant signers can help to mitigate this risk.

Signing itself is not an extremely heavy task, and the OpenDNSSEC signer software (version 1.1) was not developed with redundancy in mind, so the best solution is a master/slave setup, where the master actively signs and the slave keeps its databases up to date so that it can immediately take over from the master if the need arises. The slave will stay up to date with any changes to the master’s state, but only when the master is administratively removed will the slave become the new master and take part in normal signing. The change of master and slave roles will be administered manually, but largely scripted. The manual nature of the change allows us to employ quick fixes to the master in cases where this is sufficient. We will need to define a window of time within which either master recovery or slave-to-master promotion must have taken place.

As for the HSM, a single point of failure is avoided through a redundant setup. The client PKCS #11 library accessing the HSMs runs on the signer machine and should ideally access both HSMs simultaneously. Each signer can access each (and, usually, every) HSM at any time.

DNS is currently a “once it runs, never touch it again” infrastructure. This changes with the introduction of DNSSEC. Managing a DNSSEC signed zone involves a continuous effort of resigning zones and generating key material. Apart from that, DNS is a fundamental Internet protocol, thus the changes required to implement DNSSEC have an impact at many levels of the Internet infrastructure. In turn, DNSSEC is affected by many network elements. The result of this is that there are potentially some operational issues that might affect a DNSSEC signed zone.

Four categories of operational DNSSEC issues can be distinguished:

Network related issues, such as firewall problems

Trust issues, such as incorrect secure parent-to-child delegations

Zone related issues, such as time/duration problems (TTL of a record vs. signature validity)

DNSSEC choices, such as NSEC vs. NSEC3

Most of the known operational issues can be found by either monitoring actively (fully automated online monitoring) or by running an integrity check (non-realtime checking of zone integrity and sanity).

Some efforts to develop a DNSSEC monitoring solution have been made. These include SecSpider (a distributed polling system that crawls the Internet to monitor worldwide DNSSEC deployment), DNSCheck.se (a web-based zone checking tool for DNS zones that includes some support for DNSSEC) and some tool suites from NLnet Labs and SPARTA Inc. None of these efforts, however, comprehensively addresses most known operational issues, nor do they provide active online monitoring solutions for organisations that deploy DNSSEC signed zones.

To address this gap in monitoring capabilities, SURFnet introduces a DNSSEC monitoring plugin which checks for the known operational issues. This plugin can be used with Nagios or as a standalone web application. The Nagios-based plugin can warn a DNS operator when something is wrong, while the standalone web application can be used to manually validate whether or not a zone is operating properly.

Our monitoring plugin solution, which is based on unbound (by NLnet Labs) and dnspython, consists of four basic tests that, together, check the operational issues that affect a DNSSEC signed zone.

DNSSEC uses public key cryptography to build a chain of trust between various parent and child name servers. The first test is a chain check, which checks and validates the complete chain of trust, starting at the secure entry point and ending with the signature of a signed record. It is important to keep monitoring this chain of trust, because the chain can change during key rollovers and when signatures are updated.

Before the introduction of DNSSEC, the DNS data portion of regular DNS UDP packets was limited to 512 bytes. This meant that if the data in the response to a UDP request did not fit in 512 bytes, a truncation flag bit was set in the response and the resolver had to try again using TCP. TCP has substantially higher setup and teardown overhead and is therefore not preferred. Since responses became a lot larger with DNSSEC, the 512-byte limitation no longer held, and DNS was enhanced with new features (EDNS0) while maintaining compatibility with earlier versions of the protocol. Some non-updated routers and/or firewalls do not support EDNS0 and/or block DNS UDP packets larger than 512 bytes. This is why our second test is an EDNS0 check, which uses a binary search algorithm to determine the minimal packet size for DNSKEY records, and in the course of this detects potential network maximum transmission unit problems caused by, e.g., firewalls.
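The core of such a check can be sketched as follows (a simplified illustration; `probe` stands in for the actual DNS query logic, which is not reproduced here):

```python
def largest_working_size(probe, lo: int = 512, hi: int = 4096) -> int:
    """Binary search for the largest advertised EDNS0 buffer size for
    which probe(size) still reports a complete UDP answer."""
    while lo < hi:
        mid = (lo + hi + 1) // 2  # bias upward so the loop terminates
        if probe(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo

# Simulate a path that silently drops DNS/UDP packets above 1472 bytes:
print(largest_working_size(lambda size: size <= 1472))  # → 1472
```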

To keep DNSSEC fully secure, the Next Secure (NSEC) record was defined to provide authenticated denial of existence, so that evildoers cannot return an NXDOMAIN for a record that actually does exist. This NSEC record, which links each existing name to the next existing name, is returned by an authoritative server when a non-existing record is requested. The requested name should fall between the two names in the NSEC record, proving that it truly does not exist. Because this feature made it possible to enumerate a complete zone, NSEC3 was introduced. NSEC3 does exactly the same, but hashes the names, so that the actual domain names can no longer be read off. Our third test is an NSEC(3) test, which checks whether a zone is using NSEC or NSEC3.

The final check is a TTL check, which verifies whether the TTL parameters used in the zone comply with the recommendations in RFC 4641bis. This includes:

The TTL value of an RR has to be equal to the TTL of the RRSIG that belongs to it. The reason for this is that once a resolver has cached both values and either one of them times out earlier, a non-matching RR or RRSIG could be fetched from the authoritative server.

When the maximum zone TTL is, for example, equal to the signature validity period, all signatures will be cached until the signature expiration time. This can cause a high load on the authoritative servers, because all resolvers will request updates at the same time.

Re-signing a zone shortly before the end of the signature validity period may cause simultaneous expiration of data from caches, which can again lead to peak loads on authoritative servers.

A validator should be able to complete validation before a record has expired; therefore the maximum zone TTL should not be smaller than 5 minutes.

If a secondary authoritative server serves a DNSSEC zone and it is impossible to get updates from its primary, it may happen that the signatures expire before the SOA expire timer counts down to zero. While it is not possible to completely prevent this from happening, the effects can be minimized by making the SOA expire time equal to or shorter than the signature validity period.
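The recommendations above can be sketched as a handful of simple comparisons (times in seconds; the function and message names are ours, the recommendations themselves come from RFC 4641bis):

```python
def ttl_warnings(rr_ttl: int, rrsig_ttl: int, max_zone_ttl: int,
                 sig_validity: int, soa_expire: int) -> list:
    """Flag TTL configurations that violate the recommendations above."""
    warnings = []
    if rr_ttl != rrsig_ttl:
        warnings.append("RR TTL differs from its RRSIG TTL")
    if max_zone_ttl >= sig_validity:
        warnings.append("zone TTL reaches into the signature validity period")
    if max_zone_ttl < 300:
        warnings.append("zone TTL below 5 minutes")
    if soa_expire > sig_validity:
        warnings.append("SOA expire longer than signature validity")
    return warnings

# A healthy configuration produces no warnings:
print(ttl_warnings(rr_ttl=3600, rrsig_ttl=3600, max_zone_ttl=3600,
                   sig_validity=14 * 86400, soa_expire=7 * 86400))  # → []
```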

The SURFnet DNSSEC monitor can be tested live at http://www.dnssecmonitor.org/. It is also possible to download the source code and use the DNSSEC monitor within your own Nagios environment. We encourage you to try it out and give us feedback!