After using Puppet with an external node classifier for a while, one starts to wonder what else it could generate besides the YAML fed to the puppetmaster. When supervisor was being rolled out, a large number of near-identical config files needed to be generated, but the special-case information in those configs had no real place in Puppet. The solution was to have the Django app generate the config files and have Puppet pull them down with a custom parser function.

In /var/lib/puppet/lib/puppet/parser/functions lives the file webcontent.rb, which implements that custom parser function.

Using the Ruby module open-uri, content is grabbed by the puppetmaster and placed into the catalog. Using the following Django model, view, and template, a config file is easily generated and passed along to Puppet.

<?php
// Include the Genius config file
require_once dirname(dirname(__FILE__)) . '/Core/gosConfig.inc.php';

/**
 * A function to get single values from a database table
 */
function getThingFromDB($id)
{
    $db = gosDB_Helper::getDBByName('main');
    return $db->getOne("SELECT s1 FROM fixture_test WHERE i1 = " . $id);
}

In a nearby test file:

<?php
// Include the Genius fixture configuration. This generates a database
// and applies the schema to it. See fixtureTestConfig.inc.php
// for more details.
require_once(GOS_ROOT . 'Fixture/fixtureTestConfig.inc.php');

function testGetThingFromDB()
{
    // Create a fixture
    $fixture = gosTest_Fixture_Controller::getByDBName('main');
    // Load the fixture into the database
    $fixture->parseFixtureFile(GOS_ROOT . 'Fixture/example_fixture.yaml');
    // Directly access the fixture, which is identical to what the database contains
    $idToGet = $fixture->get('fixtureName.fixture_test.i1');
    // Pull the value we want directly from the fixture
    $fixtureThing = $fixture->get('fixtureName.fixture_test.s1');
    // Pull the value we want from the DB via the function we're testing
    $thing = getThingFromDB($idToGet);
    echo "The fixture put $fixtureThing into the DB.\n";
    echo "Our function selected $thing from the DB.\n";
}

// Run the test
testGetThingFromDB();

The first magical line here is that require_once: fixtureTestConfig.inc.php uses the database connection information defined in Fixture/fixtureConfig.yaml to create a database that is used for the duration of this PHP process only. Subsequent test runs (or invocations of this example script) will generate an entirely new database. See gosDB_AutoDBGenerator for details on these ephemeral databases. The takeaway is that you must have a user listed in that fixture config YAML file that can create databases.

In the test function itself, we create a fixture on the appropriate database, in this case our main database, and load the data from our fixture file, a simple YAML file defining what data we want in the database:

# The name of this fixture
fixtureName:
  # A table we will insert data into
  fixture_test:
    i1: 11
    s1: str1

At this point, we’re ready to start testing. First we get the ID that we need to pass to getThingFromDB, which we simply pull from the fixture. Next, we get the value that we are expecting from the same table in the fixture. Now that we have the ID we want to give to the function we’re testing, and the value we expect to get back, we can execute that function and compare the results. Go ahead and try this all for yourself; the above example works all on its own. The easiest way is to download the example.php and the necessary YAML file.

The gosFixture class automatically cleans up the database tables that it touched (see Core/lib/gosTest/Framework/TestCase.acls.php), so you need to re-apply the fixture for every test. The easiest way to do this is to put the fixture initialization in the setUpExtension() of a test class, giving you completely fresh fixture data for each and every test.

So go forth and test your code with better isolation and known clean data.

In Puppet, the initial method of holding information about your machines is the site.pp config file; this rapidly becomes tiresome once you have more than five servers. This is where an external node classifier comes in as a handy tool.
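Mechanically, an external node classifier is just an executable that Puppet invokes with a node name as its only argument, expecting YAML describing that node on stdout. A minimal, hypothetical sketch (hostnames and class names here are made up, not from the real setup):

```python
#!/usr/bin/env python
import sys

# Toy "truth database": a real classifier would query Django's ORM instead.
NODES = {
    'web01.example.com': {'environment': 'production',
                          'classes': ['base', 'apache']},
}
DEFAULT = {'environment': 'production', 'classes': ['base']}

def classify(hostname):
    # Build the YAML document Puppet expects from an ENC.
    node = NODES.get(hostname, DEFAULT)
    lines = ['---', 'environment: %s' % node['environment'], 'classes:']
    lines.extend('  - %s' % c for c in node['classes'])
    return '\n'.join(lines)

if __name__ == '__main__' and len(sys.argv) > 1:
    print(classify(sys.argv[1]))
```

Puppet is then pointed at this script via the external_nodes setting, and every unknown host still gets a sane default classification.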

The first step in setting up an external node classifier is developing what will generate the YAML for the puppetmaster to read. In this situation Django was chosen as the web framework because its built-in admin interface saved a bit of development time on the management side. An additional thought behind using an ORM framework was cutting down development time on other projects that might need access to the truth database and could simply pull YAML from it. Within the Django application there are classes describing hosts, environments, server classes, puppet classes, and hard drive and filesystem layout.
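As a rough illustration, the kind of fields the class describing a host might carry can be sketched with a plain dataclass (field names are illustrative, not the real schema, and the actual code is a Django model rather than a dataclass):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Host:
    hostname: str
    environment: str                 # e.g. production / staging / development
    serverclass: str                 # the machine's role, e.g. "web"
    puppet_classes: List[str] = field(default_factory=list)

    def to_node(self):
        # Shape of the per-node data the classifier later dumps as YAML.
        return {'environment': self.environment,
                'parameters': {'serverclass': self.serverclass},
                'classes': list(self.puppet_classes)}
```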

One issue that was encountered when exporting data from Django to YAML was that strings come out as unicode objects, hence the continual usage of str(). The result of the function is dumped out through the following incredibly complex template:

---
{{ yaml }}
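To illustrate those str() calls, here is a hypothetical helper (not from the original post) that flattens whatever the ORM hands back into plain built-in types before the YAML dump, so the dumper emits bare strings rather than python-specific tags:

```python
def to_plain(value):
    # Recursively coerce ORM output (lazy/unicode string subclasses,
    # tuples, etc.) into plain dicts, lists, strings, and numbers.
    if isinstance(value, dict):
        return dict((to_plain(k), to_plain(v)) for k, v in value.items())
    if isinstance(value, (list, tuple, set)):
        return [to_plain(v) for v in value]
    if isinstance(value, (bool, int, float)) or value is None:
        return value
    # Everything else gets the str() treatment the post describes.
    return str(value)
```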

At the end of all of this we finally get data that comes out as something similar to the following:
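A node document in the format Puppet expects from an external node classifier looks something like this (class and parameter names are illustrative):

```yaml
---
classes:
  - base
  - supervisor
parameters:
  serverclass: web
environment: production
```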

Puppet is a configuration management system created with the goal of making more system administration tasks reusable. At Genius, Puppet is used in all of our environments: production, staging, and development. Within this setup each environment has its own puppetmaster, with each master pulling against its own version of the Puppet classes. All of the masters in turn pull their node information from a central external node classifier.

The masters act as their own clients, syncing against themselves with a Puppet class that controls checkouts of an svn repository holding all of the Puppet classes. All of the servers have been configured with custom defines to handle subversion checkouts and configs for supervisor; additionally, a custom parser function has been written to allow pulling down webpages as part of a content description. The masters run on a 5 minute interval and the clients on a 15 minute interval, with some special exceptions for DNS servers. Additionally, the web app that powers the external node classifier also handles generating config files and maintaining a truth database for other tasks, like imaging.

This is the first in a series of posts regarding our setup; following posts will cover our external node classifier, truth database, custom parser, and custom defines.

HornetQ is JBoss’s latest messaging product. It provides a JMS implementation as well as its own alternate API. JMS is the Java Message Service: a set of APIs built around asynchronous messaging through topics (think bulletin boards) and queues (self-explanatory). For more, see the standard JMS tutorial.

This tutorial shows how to access HornetQ JMS resources via JNDI without needing a separate JNDI server. It also provides a test utility to run an embedded HornetQ server (perfect for unit tests that need a JMS broker running!).

Why JNDI?

Since JMS is part of Java EE, you’ll often see tutorials in which JMS resources (e.g. connection factories or queues) are provided to your code via JNDI. This isn’t the only choice, though: you could instead manually instantiate your preferred provider’s implementations of the resources you need.

If you choose the latter approach, you might use the following to get a Queue if you were using ActiveMQ (another JMS implementation):

Queue queue = new ActiveMQQueue("queueName");

This works fine until you switch to another provider, at which point you need to change all your queues (and connection factories and topics and…) to look like this:

Queue queue = new HornetQQueue("queueName");

This is why it’s recommended to instead pull those objects out of JNDI. If you get the queue out of JNDI instead, for example:

Queue queue = (Queue) new InitialContext().lookup("queueName");

then switching providers requires no code changes beyond how the JNDI context gets populated.

This isn’t meant to be a JNDI tutorial, but it should be clear why getting JMS resources (as well as things like JDBC DataSource objects and other “managed” objects) out of JNDI is a good thing: it makes it easier to program to interfaces, not implementations, and maintain vendor neutrality.

HornetQ’s JNDI support

HornetQ comes with all you need to be able to connect to a JNDI server and pull JMS resources out of that server. You can set up the HornetQ server to serve JNDI in addition to JMS by enabling the appropriate bean in hornetq-beans.xml.

This approach is straightforward, but it becomes problematic if you want your HornetQ servers to be in a HA (high availability) pair. It’s hard enough to configure and test HA without also needing JNDI service to fail over as well. Though there is some documentation on how to set up HA JNDI, it would be simpler (and surely faster as well) if there was no need to talk to a server at all to get JMS resources. This approach makes sense if your deployment setup is simple enough that you know which queues and JMS servers you will be using and can put that information in a few config files and bundle it with your code during deployment. This is the case for many uses of JMS. If, on the other hand, you’re already tied to using a Java EE server that provides JNDI, then you might as well use that.

Local JNDI Configuration

The starting point to accessing JNDI objects is creating a new InitialContext. The specific way that the resulting Context gets populated with data is controlled by the implementation of InitialContextFactory that you’re using. You can change the implementation by specifying the java.naming.factory.initial property in jndi.properties to be the fully qualified class name of an implementation of InitialContextFactory.

We can use this mechanism to specify an implementation of InitialContextFactory that reads HornetQ’s config files. More sophisticated implementations might produce a Context that accesses data on some remote server, but our goal here is something that reads data out of config files and populates a memory-only Context. We’ll need to read the core HornetQ config file to get connection factory information and the JMS HornetQ config file to get the JNDI names to assign to queues, topics, etc. We’ll also need something to actually create a Context implementation since we don’t want to have to write that ourselves. Implementing Context requires a non-trivial amount of work to do well. simple-jndi provides a basic in-memory InitialContextFactory, so we’ll use that.

The java.naming.factory.initial tells InitialContext which class to instantiate to return the underlying Context (look at the source that comes with the JDK if you’re curious how). In this case, we want InitialContext to use our custom implementation that reads HornetQ XML files.

The other properties are only relevant to our custom InitialContextFactory. See the HornetQ documentation for more about what to put in the XML config files.

hornetq.jndi.wrapped.initialcontextfactory.impl defines which InitialContextFactory we’ll use to actually create a Context object. In this case, it’s the simple-jndi memory-only InitialContextFactory.

hornetq.xml.jms.path is the path in the classpath to the HornetQ JMS config file. This is typically “hornetq-jms.xml”.

hornetq.xml.config.path is the path in the classpath to the HornetQ core config file. This is typically “hornetq-configuration.xml”.
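Put together, jndi.properties might look like the following. The initial factory class name is hypothetical (it stands in for the custom implementation shown below); the wrapped factory is the memory-only one that simple-jndi ships:

```properties
java.naming.factory.initial=com.example.jndi.HornetQXmlInitialContextFactory
hornetq.jndi.wrapped.initialcontextfactory.impl=org.osjava.sj.memory.MemoryContextFactory
hornetq.xml.jms.path=hornetq-jms.xml
hornetq.xml.config.path=hornetq-configuration.xml
```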

Now that we’ve got jndi.properties ready, I’ve added the custom InitialContextFactory source below. The only tricky part is handling HornetQ’s LogDelegateFactory selection. The LogDelegateFactory system lets you change the logging system used by HornetQ and is configurable via the core config file. (I’d link to the documentation, but this option is undocumented. See FileConfigurationParser#parseMainConfig() in the HornetQ source.) Unfortunately, a bug in the way the LogDelegateFactory instance is set causes some classes to not properly pick up your custom implementation, so if you wish to use a custom LogDelegateFactory, note the commented out call to Logger.setDelegateFactory. Other than that, it’s pretty simple. We use HornetQ’s configuration parsing code to get the information we need out of the config files, then populate the Context we get from simple-jndi with the resulting JMS objects.

Testing JMS code

As a bonus for reading this far, here’s JmsTestServerManager. This class makes it easy to start and stop an embedded HornetQ server. This is ideal for testing code that reads from or writes to JMS queues or topics, for instance. You should create one instance per test class (via @BeforeClass if you’re using JUnit 4) and call start() in setup (@Before) and stop() in teardown (@After). This means you’ll have a fresh server instance for each test. No more needing to clean up leftover messages in queues between each test! Fortunately, HornetQ stops and starts quite quickly so this is unlikely to be a performance problem for your tests.

You may have noticed ReadOnlyContextProxy being used in JmsTestServerManager. That’s a rather boring Context implementation that proxies all read-only method calls to an internal Context instance and has a no-op close() implementation. This is to prevent HornetQ from closing the Context since we want to keep the Context open through all test method invocations to prevent pointlessly re-creating an InitialContext every time.

The rather uninteresting source of ReadOnlyContextProxy follows to save you from writing your own.

PHP employs Perl Compatible Regular Expressions (PCRE) in the built-in collection of preg_* functions, such as preg_match(). While PCRE is certainly the preferred regular expression library, PHP’s implementation allows the functions to fail without any explicit warning—the user must check preg_last_error() to know that an error occurred. Often, the return of a regular expression match is checked, and different operations are performed if the regex matched or not.
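The PHP example in question used the well-known “unary” prime-testing regex. As a sketch of the trick (shown here in Python for illustration; note that Python’s re module does not enforce PCRE’s backtracking limit, so this version won’t exhibit the silent failure described below):

```python
import re

# Matches a string of n ones when n is 0, 1, or composite:
# (11+?)\1+ succeeds exactly when the length has a factor >= 2.
PRIME_RE = re.compile(r'^1?$|^(11+?)\1+$')

def is_prime(n):
    # A *failed* match means n is prime.
    return PRIME_RE.match('1' * n) is None
```

For example, is_prime(13) is True and is_prime(15) is False, because fifteen ones can be split into five repeated groups of "111".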

This looks perfectly sensible: through some mathematical regex trickery, we determine whether or not a number is prime. For reasons beyond the scope of this article, this regex fails under default PHP configurations beginning at the number 22201, because PHP’s regular expression backtracking limit is exceeded. While the documentation for preg_match() claims it will return boolean false if a PREG_BACKTRACK_LIMIT_ERROR occurs, the function actually returns integer 0. As a result, such a function will start calling everything above 22200 a prime number. Even if the documentation were correct, we wouldn’t be much better off: every number would be classified as a composite number.

How do we deal with this? You must check preg_last_error() every time a PCRE function is used. That warning bears repeating: the results of failing to check preg_last_error() can be even more destructive than improperly classifying integers. The function preg_replace() returns null when an error occurs, which PHP will happily coerce to 0 or the empty string depending on context. It is very easy to assume that your regular expression replacement went through successfully and keep trucking along, but your users will not be happy when that null value ends up in a string context.

The solution to these ails is the newly released gosRegex module of the Genius Open Source library. This new module provides simple wrappers for all of the PCRE functions in PHP, checking preg_last_error() for you and turning any errors into exceptions.