Running Consul at scale: Service discovery in the cloud

I recently sat down with Darron Froese, site reliability engineer at Datadog, to discuss service discovery and what he hopes to see on the horizon for IT infrastructure. Here are some highlights from our conversation.

What is service discovery?

Service discovery tools manage how processes and services in a cluster can find and talk to one another. Service discovery involves a directory of services, registering services in that directory, and then being able to look up and connect to the services in that directory.
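The three pieces described here, a directory, registration, and lookup, can be sketched with a minimal in-memory registry. This is purely illustrative; the class and method names are invented for this sketch, and a real tool like Consul adds health checking, distribution, and consensus on top of the same core idea.

```python
# Minimal in-memory sketch of the three pieces of service discovery:
# a directory of services, registering into it, and looking services up.
# Names here are illustrative, not part of any real tool's API.

class ServiceDirectory:
    def __init__(self):
        # service name -> list of (host, port) instances
        self._services = {}

    def register(self, name, host, port):
        """Add an instance of a service to the directory."""
        self._services.setdefault(name, []).append((host, port))

    def deregister(self, name, host, port):
        """Remove an instance, e.g. when it fails a health check."""
        self._services[name].remove((host, port))

    def lookup(self, name):
        """Return all known instances of a service."""
        return self._services.get(name, [])


# Usage: two instances of a "web" service register themselves,
# and a client looks them up instead of hard-coding addresses.
directory = ServiceDirectory()
directory.register("web", "10.0.0.5", 8080)
directory.register("web", "10.0.0.6", 8080)
print(directory.lookup("web"))  # [('10.0.0.5', 8080), ('10.0.0.6', 8080)]
```

The key design point is that clients ask the directory at connection time rather than baking addresses into configuration, which is what lets instances come and go freely.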

Why is service discovery so important?

Service discovery is so important now because we’re moving websites and web applications beyond what a single computer or virtual machine can handle.

As sites scale up, become more popular, and need more resources, you need tools and a mechanism to handle the registering and locating of services that your application needs to function.

It’s also becoming more important in cloud and container-based environments, because a static mapping of service to IP address and port is no longer possible with those systems.

Why is Consul such a popular tool for this?

Consul is a popular tool for this because:

1. It’s opinionated and has sensible defaults—you don’t have to build absolutely everything from scratch.

2. It has very modest runtime requirements.

3. It’s backed by HashiCorp, which builds lots of very handy tools and has gone above and beyond to help us—even as a non-paying customer.

4. It’s awesome. Seriously.

What do you see as the future of IT infrastructure, say in five years?

I’m hoping that in five years we have reliable infrastructure that abstracts away more of the underlying OS, so that we no longer need to care about it.

I want to launch my app and I want my app to connect to services. I don’t want to have to care what it’s running on. I don’t want to have to care about arcane incantations to make it run faster. I just want to push my app code into place and have the “system” help to scale it out.

Computers make decisions faster than I do; I want them to make, in real time and with minimal human intervention, some of the decisions I currently have to make myself.

Brian Anderson, Infrastructure and Operations Editor at O’Reilly Media, covers topics essential to the delivery of software, from traditional system administration to cloud computing, web performance, Docker, and DevOps. He has been working in online education and serving the needs of working learners for more than ten years.

Darron Froese has been building things for the Internet since the early ‘90s, when he first discovered Mixmaster remailers and Usenet. In 2014, after running nonfiction studios for 12 years, he moved to Datadog to be a site reliability engineer. Darron enjoys short build times, resilient infrastructure, clusters that keep their quorum, and breathing compressed gases underwater.