Extlookup has solved the basic problem of loading data into Puppet. This was done 3 years ago at a time before Puppet supported complex data in Hashes or things like Parametrized classes. Now as Puppet has improved a new solution is needed. I believe the combination of Hiera and the Puppet plugin goes a very long way to solving this and making parametrized classes much more bearable.

I will highlight a sample use case where a module author places a module on the Puppet Forge and a module user downloads and use the module. Both actors need to create data – the author needs default data to create a just-works experience and the module user wants to configure the module behavior either in YAML, JSON, Puppet files or anything else he can code.

Module Author

The most basic NTP module can be seen below. It has a ntp::config class that uses Hiera to read default data from ntp::data:

This is your most basic NTP module. By using hiera(“ntpserver”) you load $ntpserver from these variables, the first one that exists gets used. In this case the last one.

$ntp::config::data::ntpservers

$ntp::data:ntpservers

This would be an abstract from a forge module, anyone who use it will be configured to use the ntp.org base NTP servers.

Module User

As a user I really want to use this NTP module from the Forge and not write my own. But what I also need is flexibility over what NTP servers I use. Generally that means forking the module and making local edits. Parametrized Classes are supposed to make this better but sadly the design decisions means you need an ENC or a flexible data store. The data store was missing thus far and I really would not recommend their use without it.

Given that the NTP module above is using Hiera as a user I now have various options to override its use. I configure Hiera to use the (default) YAML backend for data but to also load in the Puppet backend should the YAML one not provide an answer. I also configure it to allow me to create per-location data that gives me the flexibility I need to pick NTP servers I need.

:backends: - yaml
- puppet
:hierarchy: - %{location}
- common

I now need to decide how best to override the data from the NTP module:

I want:

Per datacenter values when needed. The NOC at the data center can change this data without change control.

Company wide policy the should apply over the module defaults. This is company policy and should be subject to change control like my manifests.

Given these constraints I think the per-datacenter policy can go into data files that is controlled outside of my VCS like with a web application or simple editor. The common data that should apply company wide need to be controlled under my VCS and managed by the change control board.

Hiera makes this easy. By configuring it as above the new Puppet data search path – for a machine in dc1 – would be:

$data::dc1::ntpservers – based on the Hiera configuration, user specific

$data::common::ntpservers – based on the Hiera configuration, user specific

As Hiera will query this prior to querying any in-module data this will effectively prevent any downloaded module from supplying NTP servers other than ours. This is a company wide policy that applies to all machines unless specifically configured otherwise. This lives with your code in your SCM.

Next we create the data file for machines with fact $location=dc1. Note this data is created in a YAML file outside of the manifests. You can use JSON or any other Hiera backend so if you had this data in a CMDB in MySQL you could easily query the data from there:

hieradb/dc1.yaml

---
ntpservers: - ntp1.dc1.example.com
- ntp2.dc1.example.com

And this is the really huge win. You can create Hiera plugins to get this data from anywhere you like that magically override your in-manifest data.

These 3 nodes will have different NTP configurations based on their location – you should really make $location a fact:

web1.prod will use ntp1.dc1.example.com and ntp2.dc1.example.com

web1.dev will use the data from class data::common

oneoff.example.com is a complete weird case and you can still use the normal parametrized syntax – in this case Hiera wont be involved at all.

And so we have a very easy to use a natural blend between using param classes from an ENC for full programmatic control without sacrificing the usability for beginner users who can not or do not want to invest the time to create an ENC.