The Five Minute Interview – Dynamic Network Services

This article is one in a series of quick-hit interviews with companies using Apache Cassandra and DataStax Enterprise for key parts of their business. For this interview, we spoke with Tim Chadwick, Principal Engineer at Dynamic Network Services (Dyn).

“DataStax Enterprise with Cassandra is a real sweet spot for us, because it scales well. We can add and remove nodes easily, and there’s no need for strong transaction semantics for our initial use-cases.”

Tim: Dyn is a traffic management company; our offerings include the well-known DynDNS service used in home routers and elsewhere. I work on the DynECT platform, which is our enterprise DNS platform. I serve as one of the leads for our Infrastructure team, so we do a lot of work to make sure our features scale accordingly. We also conduct capacity planning to ensure that we can remain operational even when facing malicious attack vectors or other challenges.

DataStax: How do you plan for capacity when you are probably flying blind – meaning, not knowing when a huge surge of traffic might occur?

Tim: A good rule of thumb is to always plan an order of magnitude larger than you expect. Of course, the physical limit of an attack is really whatever network transit can provide, and it’s very, very rare to see anything that even approaches that amount of bandwidth, because generating an attack of that size is a very hard task. When we do see one, we can basically push it to what we call a “soaker site”: we route all of that bandwidth away so it doesn’t affect our customers.

To answer your question directly, we plan for it by conducting drills that simulate attacks. Then we execute procedures in order to address those attacks. The simulations give us a decent sense of how much bandwidth capacity we need, and what the implications are for our downstream systems.

DataStax: How does DataStax Enterprise help Dyn support its customers?

Tim: We use DataStax Enterprise primarily for storing post-query data. As DNS providers, we’re constantly serving queries all day, and so every time that happens we log it and keep track of it, which results in a lot of interesting data. We take all that data, process it, and put it into Cassandra.

We’re still transitioning out of an older transactional system, and so because it is log data, it is naturally denormalized; and because of how we process it and serve it, it doesn’t need to be transactional at all. That’s where DataStax Enterprise with Cassandra is a real sweet spot for us, because it scales well. We can add and remove nodes easily, and there’s no need for strong transaction semantics for our initial use-cases.
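As a rough illustration of the data model Tim describes, the sketch below (in Python, one of the languages Dyn uses; all names and structure here are hypothetical, not Dyn’s actual code) groups raw DNS query-log entries into per-customer, per-day partitions, mirroring how a denormalized Cassandra table might be keyed so that reads need no joins and writes need no transactions:

```python
from collections import defaultdict
from datetime import datetime, timezone

def denormalize(log_entries):
    """Group raw DNS query-log entries into (customer, day) partitions.

    This mimics a denormalized Cassandra layout: each partition holds one
    customer's queries for one day, so serving a customer's daily log is a
    single-partition read with no joins and no transactional semantics.
    """
    partitions = defaultdict(list)
    for entry in log_entries:
        day = datetime.fromtimestamp(entry["ts"], tz=timezone.utc).strftime("%Y-%m-%d")
        partitions[(entry["customer"], day)].append(
            {"ts": entry["ts"], "qname": entry["qname"], "qtype": entry["qtype"]}
        )
    return dict(partitions)

# Two hypothetical log entries for the same customer on the same day.
logs = [
    {"ts": 1700000000, "customer": "acme", "qname": "www.example.com", "qtype": "A"},
    {"ts": 1700000060, "customer": "acme", "qname": "mail.example.com", "qtype": "MX"},
]
parts = denormalize(logs)
```

Because the log data arrives already denormalized, append-only writes like these map naturally onto Cassandra’s partitioned, eventually consistent model, which is the “sweet spot” Tim refers to.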

DataStax: How long have you been using DataStax Enterprise, and what other database technologies did you evaluate beforehand?

Tim: We’ve used it for more than a year, and before that we evaluated Riak and HBase. We chose Cassandra because it hit the sweet spot for us: it offers a very cohesive toolkit, and we are heavily invested in the open source community.

DataStax: How have you engaged with the community so far? How have they responded if you approach them with questions?

Tim: The Cassandra community is very helpful, and we’ve been involved with the client libraries. Between the folks at Netflix, everyone at DataStax, and people from Rackspace and elsewhere, we’ve been very pleased. We also attend the Cassandra conferences, and everyone’s eager to help out. It’s nice because the client libraries are rich: we primarily use Python and Java, with a little bit of C++, so everything we need is there.

DataStax: Can you walk us through Dyn’s Cassandra environment?

Tim: From a volume perspective, we have many hundreds of thousands of QPS going on throughout the DNS network, so the data volume is quite large. We deploy across multiple datacenters with about 12 nodes that currently hold around a terabyte of data, and we expect it to grow up to 15 terabytes.

Our primary implementation is for core QPS data that supports billing purposes as well as supplying full query logs to specific customers. We also have a few other analytical use cases, and we expect to add more clusters as these database approaches spread across our organization.
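To make the billing use case concrete, here is a minimal sketch of the kind of rollup a billing job might perform over that core QPS data; the function name, record shape, and numbers are assumptions for illustration only, not Dyn’s implementation:

```python
from collections import Counter

def qps_billing_rollup(query_log, window_seconds):
    """Roll raw per-query records up into average QPS per customer.

    This is the kind of aggregate a billing job could compute from query-log
    rows stored in Cassandra: count each customer's queries in the window,
    then divide by the window length to get an average rate.
    """
    counts = Counter(entry["customer"] for entry in query_log)
    return {customer: n / window_seconds for customer, n in counts.items()}

# Hypothetical one-minute window: 120 queries for one customer, 60 for another.
log = [{"customer": "acme"}] * 120 + [{"customer": "globex"}] * 60
rates = qps_billing_rollup(log, window_seconds=60)
# acme averaged 2 QPS and globex 1 QPS over the window
```

In practice the per-query rows would be read back out of the cluster rather than held in memory, but the shape of the aggregation is the same.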