Watch the Interview

Listen to the Podcast

Show Notes

[01:20] Tell people who aren't familiar with Deis yet, what is Deis?

Deis is a Platform as a Service built on Docker, so it makes Docker application development very easy and very familiar if you know something like Heroku. It is very much a Heroku-like system.

[01:43] What has changed since going 1.0, what's the biggest difference?

Well, there are really a few things. I was speaking with someone recently who made a point about 1.0 that I liked, which was: "You wave the 1.0 flag when you run out of reasons why you shouldn't."

For us, we have had the project out there being used in production by a number of companies for a long while, we have APIs that are stable, and the stability of the project has proven itself out. So we were running out of reasons why we shouldn't call it 1.0.

We really spent some time putting in the last changes to the public-facing APIs that we wanted, to make sure we were future-proofing the 1.0 API. Once that was done we went ahead and cut the release, and it has been exciting ever since.

[02:45] What are some of the features and functionality that a developer gets access to?

One of the best ways to think of Deis, besides being a private Heroku, is that there is a lot of churn in the container eco-system. The main thing that Deis provides is a workflow for software teams that are looking to adopt this stuff that is stable, that isn't going to change.

Yes, there is a 'git push' component to this that typically mimics what you are going to get from Heroku. There's also the ability to pull in native Docker images and deploy those images using our Deis "pull" functionality.
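As a rough sketch, the two deploy paths look like this (the app and image names here are made up for illustration, and exact CLI syntax may differ between Deis versions):

```shell
# Buildpack-style deploy: push source and let Deis build it,
# mirroring the Heroku workflow.
deis create myapp            # hypothetical app name
git push deis master         # Deis builds the app and deploys it

# Docker-image deploy: pull an existing image into the platform instead.
deis pull example/worker:latest   # hypothetical image reference
```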

That workflow, along with the ability to scale containers, view aggregated logs, manage runtime configuration, and collaborate with a team, is the core feature set we've really hardened and solidified, and it's proving to work for the software teams that have deployed Deis today.
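In CLI terms, that core workflow looks roughly like the following (app values and usernames are hypothetical, and subcommand syntax may vary by release):

```shell
# Day-to-day operations against a deployed application:
deis scale web=3             # scale the web process type to 3 containers
deis logs                    # view the app's aggregated logs
deis config:set WORKERS=4    # manage runtime configuration (hypothetical var)
deis perms:create alice      # share the app with a teammate (hypothetical user)
```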

[03:44] What about the backend database systems, what do you do for persistence with Deis in 1.0?

First of all, taking a step back, persistence (statefulness), as most of us in this space know, is one of the hardest problems in the distributed systems game. Though we still recommend that the types of applications you deploy on Deis are stateless, following the Heroku and 12 Factor model, Deis itself has to store state. We needed to solve that in a way that was resilient to host failures for our production 1.0 release.

The way that we did that was we actually containerized Ceph, the distributed storage system. It's a very interesting project, and one of the things that was attractive to us about Ceph was that it provides three different access modes.

It provides a blob store API along the lines of S3 or Swift which was useful to us for integrating the Docker registry that we ship as part of the platform. There's also a block store API which allows you to mount block devices that are actually backed by the distributed Ceph database. And there's also a true filesystem called CephFS, support for CephFS landed in the 3.1.7 kernel and CoreOS included it very recently.

That's important for us: our log aggregation system has concurrent readers and writers, and that was a pretty important point. We have our Postgres database, the log aggregation system, and the Docker registry all using this highly available, containerized deployment of Ceph as part of the platform control plane. Another thing that is sort of interesting along those lines: we actually have some other users who are using our containerized version of Ceph outside of Deis.

So, if you are potentially interested in rolling your own Ceph cluster and want to do it inside Docker containers, you should take a look at the Deis store containers. There is also a blog post by a fellow named Ian Blenke, an active member of the Deis community, in which he shows how to use the Ceph containers we created outside of Deis.

[06:21] How is something like Ceph different from an etcd key-value store?

etcd is really based around the Raft protocol, and it's there to achieve consensus across a distributed system. etcd is not designed to store large amounts of data, the likes of which you would use Ceph for. Similar to ZooKeeper, you are going to write important configuration bits, the bits that are going to drive service discovery on the system, but you're not going to be writing massive amounts of data. The quorum is too expensive and the replication costs are too expensive.
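To make the contrast concrete, the kind of small, critical writes etcd is built for look like this (etcdctl syntax from the CoreOS era of this interview; the key names are illustrative, not necessarily the keys Deis uses):

```shell
# Small configuration bits that drive service discovery:
etcdctl set /deis/platform/domain example.com   # write a config key
etcdctl get /deis/platform/domain               # read it back
etcdctl ls --recursive /deis                    # browse the config tree
```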

[07:04] Do you think Docker is more mature and more ready to run in a production environment?

Yeah, I think so. The way I like to judge that is by taking a look at past release notes, and if you look at some of the stuff Docker has been working on, there hasn't been a whole ton of really disruptive churn in the project, maybe with the exception of a few security-related changes that have gone in recently.

I think that Docker as a whole has proven itself, at least the container engine portion of Docker; Docker is expanding beyond that, but the container engine portion seems definitely suitable for production.

[07:46] Can you comment on some of the recent things that have happened around CoreOS, Deis uses CoreOS and Panamax from CenturyLink uses CoreOS, can you comment on what your take is on some of the recent announcements around Rocket and CoreOS?

These are interesting times in this eco-system. We were kind of caught in the middle of this: we use CoreOS as you mentioned, we use Docker heavily as you mentioned, we have relationships with the engineering teams, and we also have business relationships with both companies.

I definitely see the idea of competing technologies on the container engine front as being a good thing overall; I think competition is good. Yet at the same time a lot of this stuff is still relatively immature, so it's kind of difficult to predict where things are going to shake out. We are definitely interested in the ACI spec and are interested in contributing to that, as are others in the space. But for the foreseeable future we are going to be using Docker, especially now that we are 1.0 with a Docker container engine.

[09:08] What do you think about the Docker orchestration stuff coming out, and the clustering technology?

I definitely think it's interesting, and I think Docker has earned the right to do what they want as a company. I personally am very interested in what I like to refer to as layered architecture. I tweeted something out recently that got a decent amount of play, which was a diagram that kind of outlines the strata of the container eco-system.

Really, if you look at the container engine, it is but one small portion of the strata. I see no reason why companies can't play at different points of the eco-system as long as the lines between what is scheduling versus what is orchestration versus what is the container engine are well defined architecturally. That's what I'm most interested in.

[10:05] Now that Deis is 1.0, what does a regular developer need to do to get it up and running? Is it something that is easy to run on a cloud environment?

Absolutely, and I think this is one of the key differentiators of Deis. We have a lot of folks ask us, "What are the differences between Deis and Cloud Foundry?" for example. The main thing I try to express is that, besides the fact that we were architected after Docker, which is sort of a soft answer to that, Deis is an order of magnitude easier to install and easier to operate.

All you really need to get Deis running is an existing cluster of CoreOS machines. We've just moved to the CoreOS stable channel, so anything on stable or any later release should suffice just fine. And there are really just a few commands that you need to run. There is a command-line utility called 'deisctl' that is used for the operational side of this: 'deisctl install platform', 'deisctl start platform', and you're up and running. It takes a few minutes to download some Docker images, but that's about it.
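Assuming a running CoreOS cluster, the provisioning steps described above can be sketched as follows (the host address is hypothetical):

```shell
# Point deisctl at any node in the CoreOS cluster, then install and start.
export DEISCTL_TUNNEL=core-1.example.com   # hypothetical node address
deisctl install platform                   # schedule the platform components
deisctl start platform                     # start them; downloads Docker images
```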

[11:15] Do you run those same commands on every node in your CoreOS cluster?

No. Actually, you can run the deisctl command on nodes in the cluster, but it is really designed as a tool to run on your workstation that operates across a cluster of CoreOS machines.

Another way of thinking about it, at a lower level, is that deisctl wraps the Fleet and etcd APIs. Deis is scheduling the platform control plane and routing mesh via the Fleet APIs, and it's performing configuration functions via the etcd APIs. You can do that from your workstation over an SSH tunnel to one of the nodes in the CoreOS cluster, and those changes will replicate across the other nodes.
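In other words, deisctl and the lower-level CoreOS tools are looking at the same cluster state. A sketch, with an illustrative node address:

```shell
# deisctl tunnels over SSH to a node and talks to the Fleet and etcd APIs:
export DEISCTL_TUNNEL=10.21.12.2          # hypothetical cluster node
deisctl list                              # platform units, via the Fleet API
fleetctl --tunnel 10.21.12.2 list-units   # roughly the same view from fleetctl
```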

[11:59] Where would somebody go to get started? Is there an easy way to get started with Deis?

Yes, if you go to deis.io you can find our documentation. One of the things that I am very proud of with our project is that we've done, I think, a really great job fleshing out the documentation and making sure it stays updated as the project evolves; we don't accept pull requests that don't have documentation, and the like.

You can find an Installing Deis section of our website, find a Quick Start guide, and find guides for specific platforms: EC2, DigitalOcean, Google Compute Engine, bare metal, and the like to get you up and running. If you look on Twitter and on GitHub you'll see lots of companies have been able to stand up Deis quite easily, and they are quite happy with the process.

[12:57] There's also been a recent announcement about a partnership with you and the Dokku creator Jeff Lindsay, can you tell us more about that?

A little bit of background. Deis has evolved from being a platform that could support a single-host configuration to a platform that is highly available by default; Deis now has a minimum of three nodes. That was kind of new before we announced 1.0, and as part of the lead-up to that I know we had a number of folks who were using Deis on a single node.

To try to provide a better answer for those folks who really wanted these small setups: Jeff and I are friends and we talk regularly, and it turned out that Dokku was looking for sponsorships. So we thought it was a good opportunity to give back to the open source community in the form of Dokku, but also to tie the two projects together a little more closely, so that folks who are using Dokku can keep the same workflow, or a similar workflow, when they outgrow it, and know their applications will work on Dokku as well as on Deis when they start to need to achieve scale.

I'm really excited about it, and I'm always excited to work with Jeff; Jeff is always churning out great things. I am definitely excited about some of the stuff we can do around standardizing on buildpacks, Docker integration, example applications, and test suites that can prove out whether problems are in buildpacks or in some of the build infrastructure. Over time, the goal is to share the build stage of Dokku and Deis, and maybe other projects like Flynn down the road.

[14:51] What do you think about buildpacks versus using the standard, canonical images from Docker, the ones that have Ruby or PHP in them, and using the ONBUILD functionality of Dockerfiles instead of a buildpack?

I love it. I think there are a lot of benefits; the biggest one is that you can really shrink the image sizes pretty significantly. The Heroku Cedar stack, which is kind of the requirement for buildpack-based deploys, is one of the main reasons it takes about 30 minutes to provision a Deis cluster, because we have to download it and sort of prep it. If that could be eliminated from the process it would be great.
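To illustrate the ONBUILD approach mentioned in the question, here is a minimal sketch. The base image tags are assumptions; official *-onbuild images bundle triggers similar to the parent Dockerfile shown here:

```shell
# A parent image defines ONBUILD triggers that fire when a child image
# is built FROM it:
cat > Dockerfile.parent <<'EOF'
FROM ruby:2.1
ONBUILD COPY . /usr/src/app
ONBUILD RUN bundle install
EOF

# The child Dockerfile then stays tiny, because the parent's triggers
# copy the app and install dependencies at build time:
cat > Dockerfile <<'EOF'
FROM ruby:2.1-onbuild
CMD ["ruby", "app.rb"]
EOF

# Building the child would run the parent's ONBUILD steps first:
# docker build -t myapp .   (requires a Docker daemon)
```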

Moreover, the general idea of it is pretty exciting to me. However, we are interested in shipping stuff that works for teams today, and buildpacks are proven and reliable and very comfortable for folks migrating off Heroku. We see a lot of companies that have been successful on Heroku but need to move to metal or to their own EC2 instances, for example, and we want to offer them a great migration story.

[16:02] What happens if you try and setup Deis and you don't have three nodes in your CoreOS cluster?

That's an interesting question. We should probably fail fast and let you know right up front; I'm not sure we do today. I believe what might happen is that the Ceph store won't actually come up, but that's a good point. Right now we warn you as much as we can throughout the documentation; the idea is that you should follow those warnings (laughs).

[16:44] So if you add a node to your CoreOS cluster and register it with etcd, will Deis automatically take advantage of the new infrastructure that is coming up?

Yes, absolutely. And that is really a function of the scheduler recognizing that the new host is part of the cluster. Really, going forward, for some of these larger deploys that we are working on: if you look at the Deis architecture diagrams, there's kind of the control plane of Deis and the data plane of Deis.

The control plane is designed to be three to five hosts, where you're really maintaining the Raft quorum and that sort of thing; the platform control components like the Ceph cluster reside there. The data plane is designed to be kept really lightweight, a minimal set of components with minimal memory requirements, and that's where you run the containers that power the applications for your team.

We see those two as kind of distinct. The idea is that you set up a more or less static control plane, and you really achieve scale by adding and removing hosts on the data plane side of the equation.

[17:56] Can you tell us about some of the biggest Deis clusters that you've seen? How large has this been tested to scale?

That's a great question. It's difficult for us to know; we're working with customers well above the 20-host mark at this point. We're also working very closely with the folks at Mesosphere on achieving scale that is on the order of thousands of nodes.

For us, we are a scheduling consumer, so there's not really much besides the scheduler itself that is going to affect how large the platform can scale. So we are really excited to work with the folks at Mesosphere; we actually have an alpha of this that we're testing that uses the Marathon API from Mesos. It's pretty exciting to us; pretty soon you are going to start seeing posts from us about Deis clusters that are over a thousand nodes.

[18:55] Is the current default scheduler Fleet?

Correct, the default scheduler is Fleet. I just gave a talk at QCon recently on cluster schedulers and specifically on cluster schedulers as they relate to Docker. One of the interesting things about cluster schedulers is that most of them are designed to do their own process isolation. One of the tricks with Docker is that Docker is also designed to do process isolation. What you have is cluster schedulers and Docker sort of fighting over who gets control over the cgroups for the platform.

That's a tricky place; there are only a few schedulers out there that support Docker controlling the cgroups, and Mesos is one of them. Because we are based on CoreOS, Fleet was kind of free, right? We ship with Fleet and we use Fleet to schedule the control plane, so it is an easy choice, sort of a zero-config default that allows you to schedule with Fleet. Over time, as we achieve scale, we think other scheduling solutions like Mesos are going to prove better.

[20:16] What about the base operating system? Have you been looking at options outside of CoreOS, is Atomic (from Red Hat) on the roadmap?

Atomic is not on the roadmap, as we are really focused on making CoreOS work well. We work very, very closely with the team over there and I think they are doing great things. We're really interested in getting users that are not super concerned about what the underlying operating system is; if you are deeply concerned about that and you want to be tinkering at that level, then Deis may not be right for you. And that is perfectly OK; not every project is a fit for every team.

Does that mean that we are going to be on CoreOS forever? No, absolutely not. We certainly reserve the right to choose some other operating system down the line. That brings up an interesting question. One of the things that I would really love to see shake out in the eco-system relates to that diagram I mentioned before about the strata of the container eco-system. Really, where we are sitting with Deis, and I would argue Panamax also, is at the top, at the workflow layer: this is kind of "what are you exposing to the end software teams?" Below that I see an orchestration layer, along the lines of something like Kubernetes; below that a scheduling layer, which is doing placement across the distributed system; below that, the scheduling layer talks to a container engine, which could be Rocket or Docker and so forth. And below that you have an operating system.

Of course, these lines can be blurred and it's not 100% clear... But what I would love, as someone who is building tools in this space, is to have some well defined boundaries between these layers, along the lines of the OSI model. For those of you who aren't familiar, the OSI model defines the layers of the networking stack, and the real beauty is that if you are building a solution at layer 3 for IP connectivity, you only have to worry about the layer beneath you.

As a platform builder, I would love nothing more than to build my workflow layer, worry about an integration with Kubernetes or something that exposes similar APIs, and then not have to worry all the way down the rest of the stack about the container engine and what the scheduling system looks like and all that. With a relatively immature eco-system like this, the truth is that we are not there yet, and these strata are still going to take some time to solidify.

[23:07] What's next for Deis?

Yeah, that's really exciting. There are a couple of things that I can share, and there's other stuff that I can't share, but all of it is very exciting. On the things I can share: if you follow the project these won't be big surprises. We are definitely interested in shipping a production-ready Mesos integration to achieve scale. We're also very interested in providing a service gateway implementation that will allow explicit service attachments.

One of the things that Deis is used for in the field is microservice architectures. Having explicit attachment between different services that is backed by authorization and access controls, and that can be audited and that sort of thing, is incredibly important, as opposed to some of the implicit, service-discovery-based attachment systems that you'll find a lot of work on in the Docker eco-system today. Beyond that there are a lot of interesting proposals; one that I'll mention is that we had a community member write up a whole pull request to implement SSL support at the per-application level.

So each deployed application can actually ship its own certificate and private key, which will integrate with the custom domains that are added and allow users to granularly control SSL at a per-app level. That is really exciting to me because it's a feature we have wanted for a long time but haven't had the resources to build, and someone from the community came along and shipped it for us.
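Based on the proposal described above, the per-app workflow might look something like this (the command names, domain, and file names are assumptions about the proposed feature, not a confirmed interface):

```shell
# Attach a custom domain and an app-specific certificate:
deis domains:add www.myapp.example.com   # hypothetical custom domain
deis certs:add server.crt server.key     # the app's own cert and private key
```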