But there’s an important thing that Swarm needs to be able to do to take your apps to production: it needs to scale. We believed Swarm could scale up tremendously, so we looked around for a benchmark and found one here. We decided to recreate the Kubernetes test with Swarm. Like the team at Google, we wanted to make sure that as we launched more containers, Swarm would keep scheduling them quickly.

What did we measure?

We wanted to stress test a single Swarm manager to see how capable it would be, so we used one Swarm manager to manage all our nodes, placing fifty containers on each node. Each command was run 1,000 times against Swarm, and we generated percentiles for 1) API response time and 2) scheduling delay. We found that we were able to scale up to 1,000 nodes running 30,000 containers. 99% of the time, each container took less than half a second to launch, and there was no noticeable difference between the launch time of the 1st and the 30,000th container.

We used docker info to measure API response time, and docker run -dit ubuntu bash to measure scheduling delay.
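A minimal sketch of what a measurement loop like this could look like. This is not the script the Swarm team used; it simply times the two commands named above over repeated runs and reports nearest-rank percentiles. The run count and helper names are illustrative.

```python
# Hypothetical benchmark sketch: time `docker info` (API response) and
# `docker run -dit ubuntu bash` (scheduling delay), then report percentiles.
import math
import subprocess
import time

def percentile(samples, pct):
    """Nearest-rank percentile of a list of numeric samples."""
    ranked = sorted(samples)
    k = max(0, min(len(ranked) - 1, math.ceil(pct / 100.0 * len(ranked)) - 1))
    return ranked[k]

def time_command(cmd, runs):
    """Run `cmd` `runs` times, returning wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.monotonic()
        subprocess.run(cmd, check=True, capture_output=True)
        durations.append(time.monotonic() - start)
    return durations

if __name__ == "__main__":
    api = time_command(["docker", "info"], 1000)
    sched = time_command(["docker", "run", "-dit", "ubuntu", "bash"], 1000)
    for name, samples in (("API response", api), ("scheduling delay", sched)):
        print(f"{name}: p50={percentile(samples, 50):.3f}s "
              f"p99={percentile(samples, 99):.3f}s")
```

Reporting percentiles rather than averages matches the post's framing: the half-second claim is about the 99th percentile, which an average would hide.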

15 Responses to “Scale Testing Docker Swarm to 30,000 Containers”

Maybe someone could explain, then, how I can share this Swarm manager between multiple users on my team? I mean, all of them want to push some containers out there.. docker-machine seems unable to use a custom ssh key, or the ssh keys from my Digital Ocean profile.. sharing the ssh key generated under ~/.docker/ is insane.. any solution for that?

Martin Bent

I was at DockerCon and saw this demonstrated with a great visual representation of all the nodes and containers in the Swarm cluster arranged in a circle. I was wondering what this visualisation was and whether we could use it in some of our demos. It may have been Mesosphere, but I’m looking for something for my own on-premises Swarm.

Kevin Tao

Allen McPherson

Nice work, but those t2.micros are not real nodes. My understanding is that they are "nodes" (virtual) mapped onto real nodes (hardware). So, given a not-unreasonable 64-core node, 1,000 "nodes" would consume only ~16 hardware nodes, with the attendant savings in network traffic, etc.

It would be interesting to see how things scale on 1000 real nodes. Of course, you'd need millions of containers for that test.