PuppetDB architecture questions

A couple of people have asked me this question, so I thought I'd capture it here for everyone's benefit. If you're going to set up PuppetDB with the PostgreSQL backend, what is the best way to lay out your architecture?

The three main components are:

The Puppet master (or multiple masters)

The PuppetDB service

PostgreSQL database server(s)

Is it better to group components together (e.g. the Puppet master AND the PuppetDB service on one host, separate from the PostgreSQL database servers)? What are the benefits and downsides of each scenario?

2 Answers

If you have only a handful of nodes that you manage, then there isn't much of a reason not to put it all on one system.

If you have 30-50 nodes, or use a lot of exported resources, you will probably want to at a minimum move the PostgreSQL server to its own VM (or bare metal).

If you have more than 75-100 nodes that will be reporting to the master, or make heavy use of exported resources, you will probably want to have the Puppet master(s), PuppetDB(s) and PostgreSQL ...(more)
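The rough tiers in this answer can be written down as a small lookup, which may help when comparing several environments. This is only a sketch of the rule of thumb above, not anything PuppetDB ships: the function name `suggest_layout`, the `Layout` enum, and the `"low"` / `"lot"` / `"heavy"` labels are all made up for illustration. It also assumes that "a handful" means fewer than 30 nodes and that the unmentioned 51-74 range falls into the middle tier. The recommendation for the largest tier is cut off in the answer, so the sketch only flags that tier rather than guessing the layout.

```python
from enum import Enum


class Layout(Enum):
    """Hypothetical labels for the tiers described in the answer."""
    ALL_IN_ONE = "everything on one system"
    SEPARATE_POSTGRES = "PostgreSQL on its own VM or bare metal"
    # The answer's advice for this tier is truncated, so it is only flagged.
    LARGE_TIER = "largest tier (recommendation truncated in the answer)"


VALID_USAGE = ("low", "lot", "heavy")


def suggest_layout(node_count: int, exported_resources: str = "low") -> Layout:
    """Map a node count and exported-resource usage to a rough tier.

    Thresholds follow the answer: 30 nodes for the middle tier and 75
    (the low end of "75-100") for the largest tier. Treating fewer than
    30 nodes as "a handful" and 51-74 as middle tier are assumptions.
    """
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    if exported_resources not in VALID_USAGE:
        raise ValueError(f"exported_resources must be one of {VALID_USAGE}")

    if node_count >= 75 or exported_resources == "heavy":
        return Layout.LARGE_TIER
    if node_count >= 30 or exported_resources == "lot":
        return Layout.SEPARATE_POSTGRES
    return Layout.ALL_IN_ONE


if __name__ == "__main__":
    for nodes, usage in [(5, "low"), (40, "low"), (10, "lot"), (120, "low")]:
        print(nodes, usage, "->", suggest_layout(nodes, usage).value)
```

Treat the numbers as starting points only. As the second answer points out, the right cutoffs depend heavily on the hardware you have available.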

It's a bit tricky to give a general-purpose answer for this, but it basically comes down to a tradeoff between the hardware resources you need on a single host and the extra network traffic you take on by splitting the components across hosts.

If you have an absolute beast of a machine and you can deploy all three of those on the same host, then you'll minimize network I/O and latency, so you'll probably get the best performance. However, the definition of "beast" will vary depending on the size of your Puppet node population.

Of the three components, the most likely bottleneck is going to be the Puppet master ...(more)