Tempest

In this third article in the series, we discuss adding one or more Jenkins slave nodes to the external OpenStack testing platform that you (hopefully) set up in the second article in the series. The Jenkins slave nodes we create today will run Devstack and execute a set of Tempest integration tests against that Devstack environment.

Add a Credentials Record on the Jenkins Master

Before we can add a new slave node record on the Jenkins master, we need to create a set of credentials for the master to use when communicating with the slave nodes. Head over to the Jenkins web UI, which by default will be located at http://$MASTER_IP:8080/. Once there, follow these steps:

Click the Credentials link on the left side panel

Click the link for the Global domain:

Click the Add credentials link

Select SSH username with private key from the dropdown labeled “Kind”

Enter “jenkins” in the Username textbox

Select the “From a file on Jenkins master” radio button and enter /var/lib/jenkins/.ssh/id_rsa in the File textbox:

Click the OK button

Construct a Jenkins Slave Node

We will now install Puppet and the software necessary for running Devstack and Jenkins slave agents on a node.

Slave Node Requirements

As with the Jenkins master node, the host or virtual machine that you have selected as your Jenkins slave node needs to meet the following requirements:

These basic packages are installed:

wget

openssl

ssl-cert

ca-certificates

Have the SSH keys you use with GitHub in ~/.ssh/. It also helps to bring over your ~/.ssh/known_hosts and ~/.ssh/config files.

Have at least 40G of available disk space
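The disk space requirement is easy to verify before starting the long installation. The following is a minimal sketch, assuming that "40G" means 40 GiB and that the root filesystem is the one Devstack will fill; the function names are illustrative and not part of any of the tools discussed here.

```python
import shutil

GIB = 1024 ** 3  # assumption: "40G" is read as 40 GiB


def free_gib(path="/"):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / GIB


def has_enough_space(path="/", required_gib=40):
    """Return True when at least `required_gib` GiB are free at `path`."""
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    print("free GiB on /: %.1f" % free_gib("/"))
    print("enough for a Devstack slave:", has_enough_space("/"))
```

Run it on the candidate slave node; if Devstack's working directories live on a different mount, pass that path instead of "/".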

IMPORTANT NOTE: If you were considering using LXC containers for your Jenkins slave nodes (as I originally struggled to do), don’t. Use a KVM or other non-shared-kernel virtual machine for the Devstack-running Jenkins slaves. Bugs like the inability to run open-iscsi in an LXC container make it impossible to run Devstack inside an LXC container.

Download Your Config Data Repository

In the second article in this series, we went over the need for a data repository and, if you followed along in that article, you created a Git repository and stored an SSH key pair in that repository for Jenkins to use. Let’s get that data repository onto the slave node:

Once Puppet is done, a set of Nodepool scripts runs to cache upstream OpenStack Git repositories on the node and prepare Devstack. Part of the process of preparing Devstack involves downloading images that are used by Devstack for testing. Note that this step takes a long time! Go have a beer or other beverage and work on something else for a couple of hours.

Adding a Slave Node on the Jenkins Master

In order to “register” our slave node with the Jenkins master, we need to create a new node record on the master. First, go to the Jenkins web UI, and then follow these steps:

Click the Manage Jenkins link on the left

Scroll down and click the Manage Nodes link

Click the New Node link on the left:

Enter “devstack_slave1” in the Node name textbox

Select the Dumb Slave radio button:

Click the OK button

Enter 2 in the Executors textbox

Enter “/home/jenkins/workspaces” in the Remote FS root textbox

Enter “devstack_slave” in the Labels textbox

Enter the IP Address of your slave host or VM in the Host textbox

Select jenkins from the Credentials dropdown:

Click the Save button

Click the Log link on the left. The log should show the master connecting to the slave, and at the end of the log should be: “Slave successfully connected and online”:
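If the log shows the connection failing instead, a quick reachability test from the master can tell you whether the problem is the network or the credentials. This is a minimal sketch that only checks whether TCP port 22 on the slave accepts connections; the address in the usage example is a placeholder, and the function name is illustrative.

```python
import socket


def port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    slave_ip = "192.0.2.10"  # placeholder: replace with your slave's address
    print("SSH port reachable:", port_open(slave_ip))
```

A True result means the port is reachable, so any remaining failure is more likely to involve the jenkins user or its SSH key than a firewall rule.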

Test the dsvm-tempest-full Jenkins job

Now we are ready to have our Jenkins slave execute the long-running Jenkins job that uses Devstack to install an OpenStack environment on the Jenkins slave node, and run a set of Tempest tests against that environment. We want to test that the master can successfully run this long-running job before we set the job to be triggered by the upstream Gerrit event stream.

Go to the Jenkins web UI, click on the dsvm-tempest-full link in the jobs listing, and then click the Build Now link. You will notice an executor start up and a link to a newly-running job will appear in the Build History box on the left:

Build History panel in Jenkins

Click on the link to the new job, then click Console Output in the left panel. You should see the job executing, with Bash output showing up on the right:

Manually running the dsvm-tempest-full Jenkins job

Troubleshooting

If you see errors pop up, you will need to address those issues. In my testing, issues generally were around:

Firewall/networking issues: Make sure that the Jenkins master node can properly communicate over SSH port 22 to the slave nodes. If you are using virtual machines to run the master or slave nodes, make sure you don’t have any iptables rules that are preventing traffic from master to slave.

Missing files like “No file found: /opt/nodepool-scripts/…”: Make sure that the install_slave.sh Bash script completed successfully. This script takes a long time to execute, as it pulls down a bunch of images for Devstack caching.

LXC: See above about why you cannot currently use LXC containers for Jenkins slaves that run Devstack.

Zuul processes borked: In order to have jobs triggered from upstream, both the zuul-server and zuul-merger processes need to be running, connecting to Gearman, and firing job events properly. First, make sure the right processes are running.
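One way to check for the two Zuul processes is sketched below. It assumes a Linux host where `ps -eo args` lists process command lines and that the command lines of the daemons contain the strings zuul-server and zuul-merger; the helper names are illustrative.

```python
import subprocess

EXPECTED = ("zuul-server", "zuul-merger")


def missing_processes(ps_output, expected=EXPECTED):
    """Return the expected names that appear in no line of `ps_output`."""
    lines = ps_output.splitlines()
    return [name for name in expected
            if not any(name in line for line in lines)]


def current_command_lines():
    """Return the command line of every running process, one per line."""
    return subprocess.run(
        ["ps", "-eo", "args"],
        check=True, capture_output=True, text=True,
    ).stdout


if __name__ == "__main__":
    missing = missing_processes(current_command_lines())
    print("missing:", ", ".join(missing) if missing else "none")
```

This only confirms that the processes exist. Whether they are connected to Gearman and firing job events still has to be confirmed from the Zuul logs.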

Enabling the dsvm-tempest-full Job in the Zuul Pipelines

Once you’ve successfully run the dsvm-tempest-full job manually, you should enable this job in the appropriate Zuul pipelines. To do so, on the Jenkins master node, edit the etc/zuul/layout.yaml file in your data repository (don’t forget to git commit your changes after you’ve made them and push them to your data repository’s canonical location).

If you used the example layout.yaml from my data repository and you’ve been following along with this tutorial series, the projects section of your file will look like this:

projects:
  - name: openstack-dev/sandbox
    check:
      # Remove this after successfully verifying communication with upstream
      # and seeing a posted successful review.
      - noop-check-communication
      # Uncomment this job when you have a jenkins slave running and want to
      # test a full Tempest run within devstack.
      #- dsvm-tempest-full
    gate:
      # Remove this after successfully verifying communication with upstream
      # and seeing a posted successful review.
      - noop-check-communication
      # Uncomment this job when you have a jenkins slave running and want to
      # test a full Tempest run within devstack.
      #- dsvm-tempest-full

To enable the dsvm-tempest-full Jenkins job to run in the check pipeline when a patch is received (or recheck comment added) to the openstack-dev/sandbox project, simply uncomment the line:

#- dsvm-tempest-full

And then reload Zuul and Zuul-merger:

sudo service zuul reload
sudo service zuul-merger reload
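If you would rather script the layout.yaml edit that precedes the reload, the sketch below uncomments the job in one pipeline section only and leaves the other untouched. It is a plain text transformation that assumes the section layout shown earlier (a `check:` or `gate:` line followed by its job entries); the function name is illustrative and it is not part of Zuul.

```python
import re


def enable_job(layout_text, pipeline, job):
    """Uncomment `#- job` inside the named pipeline section of a layout."""
    current = None
    out = []
    for line in layout_text.splitlines():
        # A bare "key:" line (for example "check:") starts a new section.
        section = re.match(r"\s*([\w-]+):\s*$", line)
        if section:
            current = section.group(1)
        elif current == pipeline and line.strip() == "#- " + job:
            line = line.replace("#- ", "- ", 1)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = (
        "projects:\n"
        "  - name: openstack-dev/sandbox\n"
        "    check:\n"
        "      - noop-check-communication\n"
        "      #- dsvm-tempest-full\n"
        "    gate:\n"
        "      - noop-check-communication\n"
        "      #- dsvm-tempest-full\n"
    )
    print(enable_job(example, "check", "dsvm-tempest-full"))
```

Editing the file by hand is just as good; the point of the sketch is that only the entry under check changes, so the gate pipeline keeps its commented-out job.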

From now on, new patches and recheck comments on the openstack-dev/sandbox project will fire the dsvm-tempest-full Jenkins job on your devstack slave node. If your test run was successful, you will see something like this in your Jenkins console for the job run:

\o/ Steve Holt!

And you will note that the patch that triggered your Jenkins job shows a success comment and a +1 Verified vote:

A comment showing external job successful runs

What Next?

From here, the changes you make to your Jenkins Job configuration files are up to you. The first place to look for ideas is the devstack-vm-gate.sh script. Look near the bottom of that script for a number of environment variables that you can set in order to tinker with what the script will execute.

If you are a Cinder storage vendor looking to test your hardware and associated Cinder driver against OpenStack, you will want to either make changes to the example dsvm-tempest-full job or create a copy of that example job definition and customize it to your needs. You will want to make sure that Cinder is configured to use your storage driver in the cinder.conf file. You may want to create a script that copies most of what the devstack-vm-gate.sh script does, calls the Devstack ini_set function to configure your storage driver, and then runs Devstack and Tempest.
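To illustrate what setting the driver amounts to, here is a Python stand-in for that kind of INI edit. It is not Devstack's ini_set (which is a Bash function); it is a sketch using the standard configparser module. The volume_driver option is the Cinder setting for the driver class, while the driver path in the example is hypothetical.

```python
import configparser
import io


def set_ini_option(text, section, option, value):
    """Return INI `text` with `option` set to `value` in `section`."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if section != parser.default_section and not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, option, value)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    conf = "[DEFAULT]\nverbose = True\n"
    # Hypothetical driver class path, for illustration only.
    driver = "cinder.volume.drivers.example.ExampleDriver"
    print(set_ini_option(conf, "DEFAULT", "volume_driver", driver))
```

Note that configparser drops comments when it rewrites a file, so for a real cinder.conf the Devstack function remains the better tool.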

Publishing Console and Devstack Logs

Finally, you will want to get the log files that are collected by both Jenkins and the Devstack run published to some external site. Folks at Arista have used dropbox.com to do this. I’ll leave it as an exercise for the reader to set this up. Hint: you will want to set the PUBLISH_HOST variable in your data repository’s vars.sh to a host that you have SCP rights to, and uncomment the publishers section in the example dsvm-tempest-full job.

Final Thoughts

I hope this three-part article series has been helpful for you to understand the upstream OpenStack continuous integration platform, and instructional in helping you set up your own external testing platform using Jenkins, Zuul, Jenkins Job Builder, and Devstack-Gate. Please do let me know if you run into issues. I will post some updates to the Troubleshooting section above when I hear from you and (hopefully) help you resolve any problems.

This post describes in detail the upstream OpenStack continuous integration platform. In the process, I’ll be describing the code flow in the upstream system — from the time the contributor submits a patch to Gerrit, all the way through the creation of a devstack environment in a virtual machine, the running of the Tempest test suite against the devstack installation, and finally the reporting of test results and archival of test artifacts. Hopefully, with a good understanding of how the upstream tooling works, setting up your own linked external testing platform will be easier.

Some History and Concepts

Over the past four years, there has been a steady evolution in the way that the source code of OpenStack projects is tested and reviewed. I remember when we used Bazaar for source control and Launchpad merge proposals for code review. There was no automated or continuous testing to speak of in those early days, which put pressure on core reviewers to do testing of proposed patches locally. There was also no standardized integration test suite, so often a change in one project would inadvertently break another project.

Thanks to the work of many contributors, particularly those patient souls in the OpenStack Infrastructure team, today there is a robust platform supporting continuous integration testing for OpenStack and Stackforge projects. At the center of this platform are the Jenkins CI servers, the Gerrit git and patch review server, and the Zuul gating system.

The Code Review System

When a contributor submits a patch to one of the OpenStack projects, they push their code to the git server managed by Gerrit running on review.openstack.org. Typically, contributors use the git-review Git plugin, which simplifies submitting to a git server managed by Gerrit. Gerrit controls which users or groups are allowed to propose code, merge code, and administer code repositories under its management. When a contributor pushes code to review.openstack.org, Gerrit creates a Changeset representing the proposed code. The original submitter and any other contributors can push additional amendments to that Changeset, and Gerrit collects all of the changes into the Changeset record. Here is a screenshot of a Changeset under review. You can see a number of patches (changes) listed in the review screen. Each of those patches was an amendment to the original commit.

Individual patches amend the changeset

For each patch in Gerrit, there are three sets of “labels” that may be applied to the patch. Anyone can comment on a Changeset and/or review the code. A review is shown on the patch in the Code-Review column in the patch “labels matrix”:

The “label matrix” on a Gerrit patch

Non-core team members may give the patch a Code-Review label of +1 (Looks good to me), 0 (No strong opinion), or -1 (I would prefer you didn’t merge this). Core team members can give any of those values, plus +2 (Looks good to me, approved) and -2 (Do not submit).

The other columns in the label matrix are Verified and Approved. Only non-interactive users of Gerrit, such as Jenkins, are allowed to add a Verified label to a patch. The external testing platform you will set up is one of these non-interactive users. The value of the Verified label will be +1 (check pipeline tests passed), -1 (check pipeline tests failed), +2 (gate pipeline tests passed), or -2 (gate pipeline tests failed).
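The four Verified values described above form a small lookup table keyed on the pipeline and the test outcome. The sketch below simply restates that table in code; the names are illustrative and are not taken from Gerrit or Zuul.

```python
# Verified label values as described in the text:
# (pipeline, tests passed?) -> vote
VERIFIED_VOTES = {
    ("check", True): 1,
    ("check", False): -1,
    ("gate", True): 2,
    ("gate", False): -2,
}


def verified_vote(pipeline, passed):
    """Return the Verified vote for a pipeline run with the given outcome."""
    return VERIFIED_VOTES[(pipeline, passed)]


if __name__ == "__main__":
    for (pipeline, passed), vote in sorted(VERIFIED_VOTES.items()):
        outcome = "passed" if passed else "failed"
        print("%s pipeline %s: Verified %+d" % (pipeline, outcome, vote))
```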

Only members of the OpenStack project’s core team can add an Approved label to a patch. It is either a +1 (Approved) value or not, appearing as a check mark in the Approved column of the label matrix:

An approved patch.

Continuous Integration Testing

Continuous integration (CI) testing is the act of running tests that validate a full application environment on a continual basis — i.e. when any change is proposed to the application. Typically, when talking about CI, we are referring to tests that are run against a full, real-world installation of the project. This type of testing, called integration testing, ensures that proposed changes to one component do not cause failures in other components. This is especially important for complex multi-project systems like OpenStack, with non-trivial dependencies between subsystems.

When code is pushed to Gerrit, a series of jobs are triggered that run a series of tests against the proposed code. Jenkins is the server that executes and manages these jobs. It is a Java application with an extensible architecture that supports plugins that add functionality to the base server.

Each job in Jenkins is configured separately. Behind the scenes, Jenkins stores this configuration information in an XML file in its data directory. You may manually edit a Jenkins job as an administrator in Jenkins. However, in a testing platform as large as the upstream OpenStack CI system, doing so manually would be virtually impossible and fraught with errors. Luckily, there is a helper tool called Jenkins Job Builder (JJB) that constructs these XML configuration files after reading a set of YAML files and job templating rules. We will describe JJB later in the article.

The “Gate”

When we talk about “the gate”, we are talking about the process by which code is kept out of a set of source code branches if certain conditions are not met.

OpenStack projects use a method of controlling merges into certain branches of their source trees called the Non-Human Gatekeeper model [1]. Gerrit (the non-human) is configured to allow merges by users in a group called “Non-Interactive Users” to the master and stable branches of git repositories under its control. The upstream main Jenkins CI server, as well as Jenkins CI systems running at third party locations, are the users in this group.

So, how do these non-interactive users actually decide whether to merge a proposed patch into the target branch? Well, there is a set of tests (different for each project) — unit, functional, integration, upgrade, style/linting — that is marked as “gating” that particular project’s source trees. For most of the OpenStack projects, there are unit tests (run in a variety of different supported versions of Python) and style checker tests for HACKING and PEP8 compliance. These unit and style tests are run in Python virtualenvs managed by the tox testing utility.

In addition to the Python unit and style tests, there are a number of integration tests that are executed against full installations of OpenStack. The integration tests are simply subsets of the Tempest integration test suite. Finally, many projects also include upgrade and schema migration tests in their gate tests.

How Upstream Testing Works

Graphically, the upstream continuous integration gate testing system works like this:

We step through this event flow in detail below, referencing the numbered steps in bold.

The Gerrit Event Stream and Zuul

After a contributor has pushed (1a) a new patch to a changeset or a core team member has reviewed the patch and added an Approved +1 label (1b), Gerrit pushes out a notification event to its event stream (2). This event stream can have a number of subscribers, including the Gerrit Jenkins plugin and Zuul. Zuul was developed to manage the many complex graphs of interdependent branch merge proposals in the upstream system. It monitors in-progress jobs for a set of related patches and will pre-emptively cancel any dependent test jobs that would not succeed due to a failure in a dependent patch [2].

In addition to this dependency monitoring, Zuul is responsible for constructing the pipelines of jobs that should be executed on various events. One of these pipelines is called the “gate” pipeline, appropriately named for the set of jobs that must succeed in order for a proposed patch to be merged into a target branch.

Zuul listens to the Gerrit event stream (3), and matches the type of event to one or more pipelines (4). The matching conditions for the gate pipeline are configured in the trigger:gerrit: section of that pipeline’s definition in Zuul’s layout.yaml file.

That configuration indicates that Zuul should fire the gate pipeline when it sees reviews with an Approved +1 label, or any comment on the review that contains “reverify”, with or without a bug identifier. Note that there is a similar pipeline, called the check pipeline, that is fired when a new patchset is created or when a review comment is made with the word “recheck”. Look in the layout.yaml file for the configuration of the check pipeline.
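The matching rules just described can be sketched as a small function. This is not Zuul's implementation: the event is modelled as a plain dictionary whose keys (type, comment, approved) are invented for the illustration, and the two regular expressions are simplified stand-ins for the patterns in the upstream configuration. The patchset-created event type is the name Gerrit uses in its event stream.

```python
import re

# Simplified stand-ins for the comment filters in the upstream layout.
RECHECK = re.compile(r"\brecheck\b", re.IGNORECASE)
REVERIFY = re.compile(r"\breverify\b", re.IGNORECASE)


def matching_pipelines(event):
    """Return the pipelines that a (hypothetical) event dict should trigger."""
    pipelines = []
    comment = event.get("comment", "")
    # check: a new patchset, or a comment containing "recheck".
    if event.get("type") == "patchset-created" or RECHECK.search(comment):
        pipelines.append("check")
    # gate: an Approved +1 label, or a comment containing "reverify".
    if event.get("approved") == 1 or REVERIFY.search(comment):
        pipelines.append("gate")
    return pipelines


if __name__ == "__main__":
    print(matching_pipelines({"type": "patchset-created"}))
    print(matching_pipelines({"type": "comment-added",
                              "comment": "reverify bug 1234"}))
```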

Once the appropriate pipeline is matched, Zuul executes (5) that particular pipeline for the project that had a patch proposed.

“But wait, hold up…”, you may be asking yourself, “how does Zuul know which Jenkins jobs to execute for a particular project and pipeline?”. Great question!

Also in the layout.yaml file, there is a section that configures which Jenkins jobs should be run for each project. Let’s take a look at the configuration of the gate pipeline for the Cinder project:

Each of the lines in the gate: section indicates a specific Jenkins job that should be run in the gate pipeline for Cinder. In addition, there is the python-jobs item in the template: section. Project templates are a way that Zuul consolidates configuration of many similar jobs into a simple template configuration. The project template definition for python-jobs looks like this (still in layout.yaml):

So, on determining which Jenkins jobs should be executed for a particular pipeline, Zuul sees the python-jobs project template in the Cinder configuration and expands that to execute the following Jenkins jobs:

gate-cinder-docs

gate-cinder-pep8

gate-cinder-python26

gate-cinder-python27
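The expansion is a plain name substitution. As a sketch, assuming the python-jobs template holds the four job name patterns implied by the list above (the variable and function names are illustrative, not Zuul's):

```python
# Job name patterns implied by the expanded job list for Cinder.
PYTHON_JOBS = [
    "gate-{name}-docs",
    "gate-{name}-pep8",
    "gate-{name}-python26",
    "gate-{name}-python27",
]


def expand_template(job_patterns, name):
    """Substitute the project name into each job name pattern."""
    return [pattern.format(name=name) for pattern in job_patterns]


if __name__ == "__main__":
    for job in expand_template(PYTHON_JOBS, "cinder"):
        print(job)
```

Passing a different project name yields that project's job names from the same template, which is how one template serves many projects.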

Jenkins Job Creation and Configuration

I previously mentioned that the configuration of an individual Jenkins job is stored in a config.xml file in the Jenkins data directory. Now, at last count, the upstream OpenStack Jenkins CI system has just shy of 2,000 jobs. It would be virtually impossible to manage the configuration of so many jobs using human-based processes. To solve this dilemma, the Jenkins Job Builder (JJB) Python tool was created. JJB consumes YAML files that describe both individual Jenkins jobs and templates for parameterized Jenkins jobs, and writes the config.xml files for all Jenkins jobs that are produced from those templates. Important: Note that Zuul does not construct Jenkins jobs. JJB does that. Zuul simply configures which Jenkins jobs should run for a project and a pipeline.

There is a master projects.yaml file in the same directory that lists the “top-level” definitions of jobs for all projects, and it is in this file that many of the variables that are used in job template instantiation are defined (including the {name} variable, which corresponds to the name of the project).

When JJB constructs the set of all Jenkins jobs, it reads the projects.yaml file, and for each project, it sees the “name” attribute of the project, and substitutes that name attribute value wherever it sees {name} in any of the jobs that are defined for that project. Let’s take a look at the Cinder project’s definition in the projects.yaml file here:

You will note that one of the items in the jobs section is called python-jobs. This is not a single Jenkins job, but a job group. A job group definition is merely a list of jobs or job templates. Let’s take a look at the definition of the python-jobs job group:

Each of the items listed in the jobs section of the python-jobs job group definition above is a job template. Job templates are expanded in the same way as Zuul project templates and JJB job groups are expanded. Let’s take a look at one such job template in the list above, called gate-{name}-python27.

The python-jobs.yaml file in the modules/openstack_project/files/jenkins_job_builder/config directory contains the definition of common Python project Jenkins job templates. One of those job templates is gate-{name}-python27:

Looking through the above job template definition, you will see a section called “builders”. The builders section of a job template lists (in sequential order of expected execution) the executable sections or scripts of the Jenkins job. The first executable section in the gate-{name}-python27 job template is called “gerrit-git-prep”. This executable section is defined in macros.yaml, which contains a number of commonly-run scriptlets. Here’s the entire gerrit-git-prep macro definition:

So, gerrit-git-prep is simply executing a Bash script called “gerrit-git-prep.sh” that is stored in the /usr/local/jenkins/slave_scripts/ directory. Let’s take a look at that file. You can find it in the /modules/jenkins/files/slave_scripts/ [3] directory in the same OpenStack Infra Config project:

The purpose of the script above is simple: Check out the source code of the proposed Gerrit changeset and ensure that the source tree is clean of any cruft from a previous run of a Jenkins job that may have run in the same Jenkins workspace. The concept of a workspace is important. When Jenkins runs a job, it must execute that job from within a workspace. The workspace is really just an isolated shell environment and filesystem directory that has a set of shell variables export’d inside it that indicate a variety of important identifiers, such as the Jenkins job ID, the name of the source code project that has triggered a job, the SHA1 git commit ID of the particular proposed changeset that is being tested, etc [4].

The next builder in the job template is the “python27” builder, which has two variables injected into itself:

- python27:
    github-org: '{github-org}'
    project: '{name}'

The github-org variable receives, as a string, the value of the already existing {github-org} variable. The project variable is populated with the value of the {name} variable. Here’s how the python27 builder is defined (in macros.yaml):

In short, for the Python 2.7 builder, the above runs the command tox -epy27 and then runs a prettifying script and gzips up the results of the unit test run. And that’s really the meat of the Jenkins job. We will discuss the publishing of the job artifacts a little later in this article, but if you’ve gotten this far, you have delved deep into the mines of the OpenStack CI system. Congratulations!

Devstack-Gate and Running Tempest Against a Real Environment

OK, so unit tests running in a simple Jenkins slave workspace are one thing. But what about Jenkins jobs that run integration tests against a full set of OpenStack endpoints, interacting with real database and message queue services? For these types of Jenkins jobs, things are more complicated. Yes, I know. You probably think things have been complicated up until this point, and you’re right! But the simple unit test jobs above are just the tip of the proverbial iceberg when it comes to the OpenStack CI platform.

For these complex Jenkins jobs, an additional set of tools is added to the mix:

Devstack-Gate — Scripts that create an OpenStack environment with Devstack, run tests against that environment, and archive logs and results

Assignment of a Node to Run a Job

Different Jenkins jobs require different workspaces, or environments, in which to run. For basic unit or style-checking test jobs, like the gate-{name}-python27 job template we dug into above, not much more is needed than a tox-managed virtualenv running in a source checkout of the project with a proposed change. However, for Jenkins jobs that run a series of integration tests against a full OpenStack installation, a workspace with significantly more resources and isolation is necessary. For these latter types of jobs, the upstream CI platform uses a pool of virtual machine instances. This pool of virtual machine instances is managed by a tool called nodepool. The virtual machines run in both HP Cloud and Rackspace Cloud, who graciously donate these instances for the upstream CI system to use. You can see the configuration of the Nodepool-managed set of instances here.

Instances that are created by Nodepool run Jenkins slave software, so that they can communicate with the upstream Jenkins CI master servers. A script called prepare_node.sh runs on each Nodepool instance. This script just git clones the OpenStack Infra config project to the node, installs Puppet, and runs a Puppet manifest that sets up the node based on the type of node it is. There are bare nodes, nodes that are meant to run Devstack to install OpenStack, and nodes specific to the Triple-O project. The node type that we will focus on here is the node that is meant to run Devstack. The script that runs to prepare one of these nodes is prepare_devstack_node.sh, which in turn calls prepare_devstack.sh. This script caches all of the repositories needed by Devstack, along with Devstack itself, in a workspace cache on the node. This workspace cache is used to enable fast reset of the workspace that is used during the running of a Jenkins job that uses Devstack to construct an OpenStack environment.

Devstack-Gate

The Devstack-Gate project is a set of scripts that are executed by certain Jenkins jobs that need to run integration or upgrade tests against a realistic OpenStack environment. Going back to the Cinder project configuration in the Zuul layout.yaml file:

Note the highlighted line. That Jenkins job template is one such job that needs an isolated workspace that has a full OpenStack environment running on it. Note that “dsvm” stands for “Devstack virtual machine”.

Not all that complicated. It exports some environment variables and copies the devstack-vm-gate-wrap.sh script out of the devstack-gate repo that was clone’d in the devstack-checkout macro to the work directory and then runs that script.

Construction of OpenStack Environment with Devstack

The devstack-vm-gate.sh script is responsible for constructing a full OpenStack environment and running integration tests against that environment. To construct this OpenStack environment, it uses the excellent Devstack project. Devstack is an elaborate series of Bash scripts and functions that clones each OpenStack project’s source code into /opt/stack/new/$project [5], runs python setup.py install in each project checkout, and starts each relevant OpenStack service (e.g. nova-compute, nova-scheduler, etc.) in a separate Linux screen session.

Devstack’s creation script (stack.sh) is called from the script after creating the localrc file that stack.sh uses when constructing the Devstack environment.

You will note that the $DEVSTACK_GATE_TEMPEST_FULL Bash environment variable was set to “1” in the gate-tempest-dsvm-full Jenkins job builder scriptlet.

sudo -H -u tempest tox -efull triggers the execution of Tempest’s integration test suite. Tempest is the collection of canonical OpenStack integration tests that are used to validate that OpenStack APIs work according to spec and that patches to one OpenStack service do not inadvertently cause failures in another service.

If you are curious what actual commands are run, you can check out the tox.ini file in Tempest:

[testenv:full]
# The regex below is used to select which tests to run and exclude the slow tag:
# See the testrepostiory bug: https://bugs.launchpad.net/testrepository/+bug/1208610
commands =
  bash tools/pretty_tox.sh '(?!.*\[.*\bslow\b.*\])(^tempest\.(api|scenario|thirdparty|cli)) {posargs}'
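To see what that selection regex does, it can be tried against a few test identifiers. The regex below is copied from the tox.ini excerpt (without the {posargs} placeholder); the test ids are made up for the illustration, and Python's re module stands in for the test runner's own matching.

```python
import re

# Regex copied from the [testenv:full] section of Tempest's tox.ini.
FULL_SELECTOR = re.compile(
    r"(?!.*\[.*\bslow\b.*\])(^tempest\.(api|scenario|thirdparty|cli))"
)


def selected(test_id):
    """Return True if the 'full' run would select this test id."""
    return FULL_SELECTOR.search(test_id) is not None


# Hypothetical test ids; the bracketed suffix carries the test's tags.
SAMPLES = [
    "tempest.api.compute.test_servers.ServersTest.test_list[gate,smoke]",
    "tempest.scenario.test_big.TestBig.test_it[slow]",
    "tempest.stress.test_load.TestLoad.test_many",
    "tempest.cli.simple.TestHelp.test_help",
]

if __name__ == "__main__":
    for test_id in SAMPLES:
        print("%-5s %s" % (selected(test_id), test_id))
```

The negative lookahead rejects any id whose bracketed tag list contains the word slow, and the anchored group keeps only tests under the api, scenario, thirdparty, and cli packages.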

Archival of Test Artifacts

The final piece of the puzzle is archiving all of the artifacts from the Jenkins job execution. These artifacts include log files from each individual OpenStack service running in Devstack’s screen sessions, the results of the Tempest test suite runs, as well as echo’d output from the devstack-vm-gate* scripts themselves.

Conclusion

I hope this article has helped you understand a bit more about how the OpenStack continuous integration platform works. We’ve stepped through the event flow across the various components of the platform, including which events trigger what actions in each component. You should now have a good idea of how the various parts of the upstream CI infrastructure are configured and where to look in the source code for more information.

The next article in this series discusses how to construct your own external testing platform that is linked with the upstream OpenStack CI platform. Hopefully, this article will provide you most of the background information you need to understand the steps and tools involved in that external testing platform construction.

[1] The link describes and illustrates the non-human gatekeeper model with Bazaar, but the same concept is applicable to Git. See the OpenStack GitWorkflow pages for an illustration of the OpenStack-specific model.

[2] Zuul really is a pretty awesome bit of code. Jim Blair, the author, does an excellent job of explaining the merge proposal dependency graph and how Zuul can “trim” dead-end branches of the dependency graph in the Zuul documentation.

[3] Looking for where a lot of the “magic” in the upstream gate happens? Take an afternoon to investigate the scripts in this directory.

[4] The Gerrit Jenkins plugin and Zuul export a variety of workspace environment variables into the Jenkins jobs that they trigger. If you are curious what these variables are, check out the Zuul documentation on parameters.

[5] The reason the projects are installed into /opt/stack/new/$project is that the current HEAD of the target git branch for the project is installed into /opt/stack/old/$project. This is to allow an upgrade test tool called Grenade to test upgrade paths.