Setting Up an External OpenStack Testing System – Part 1

This post is intended to walk someone through the process of establishing an external testing platform that is linked with the upstream OpenStack continuous integration platform. If you haven’t already, please do read the first article in this series, which discusses the upstream OpenStack CI platform in detail. By the end of that article, you should have all the background information on the tools needed to establish your own linked external testing platform.

EXTREMELY IMPORTANT NOTE: The upstream Puppet modules used in this article have changed dramatically since writing this. I am in the process of updating this blog entry, but at this time, some important steps do not work properly!

What Does an External Test Platform Do?

In short, an external testing platform enables third parties to run tests — ostensibly against an OpenStack environment that is configured with that third party’s drivers or hardware — and report the results of those tests on the code review of a proposed patch. It’s easy to see the benefit of this real-time feedback by taking a look at a code review that shows a number of these external platforms providing feedback. In this screenshot, you can see a number of Verified +1 labels and one Verified -1 label added by external Neutron vendor test platforms on a proposed patch to Neutron:

Each of these systems adds its Verified label to a review by posting a comment. These comments contain links to artifacts from the external testing system’s test run for the proposed patch, as shown here:

Comments added to a review by the vendor testing platforms

The developer submitting a patch can use those links to investigate why their patch caused test failures on that external test platform.

Why Set Up an External Test Platform?

The benefits of external testing integration with upstream code review are numerous:

A tight feedback loop

The third party gets quick notification that a proposed patch to the upstream code has caused a failure in their driver or configuration. The tighter the feedback loop, the faster fixes can be identified.

Better code coverage

Drivers and plugins that may not be used in the default configuration for a project can be tested with the same rigor and frequency as drivers that are enabled in the upstream devstack VM gate tests. This prevents bitrot and encourages developers to maintain code that is housed in the main source trees.

Increased consistency and standards

Determining a standard set of tests that prove a driver implements the full or partial API of a project means that drivers can be verified to work with a particular release of OpenStack. If you’ve ever had a conversation with a potential deployer of OpenStack who wonders how they know that their choice of storage or networking vendor, or underlying hypervisor, actually works with the version of OpenStack they plan to deploy, then you know why this is a critical thing!

Why might you be thinking about how to set up an external testing platform? Well, a number of OpenStack projects have had discussions already about requirements for vendors to complete integration of their testing platforms with the upstream OpenStack CI platform. The Neutron developer community is ahead of the game, with more than half a dozen vendors already providing linked testing that appears on Neutron code reviews.

The Cinder project also has had discussions around enforcing a policy that any driver that is in the Cinder source tree have tests run on each commit to validate the driver is working properly. Similarly, the Nova community has discussed the same policy for hypervisor drivers in that project’s source tree. So, while this may be old news for some teams, hopefully this post will help vendors that are new to the OpenStack contribution world get integrated quickly and smoothly.

The Tools You Will Need

The components involved in building a simple linked external testing system that can listen to and notify the upstream OpenStack continuous integration platform are as follows:

Jenkins CI

The server that is responsible for executing jobs that run tests for a project

Zuul

A system that configures and manages event pipelines that launch Jenkins jobs

Devstack-Gate

A collection of scripts that constructs an OpenStack environment from source checkouts

I’ll be covering how to install and configure the above components to build your own testing platform using a set of scripts and Puppet modules. Of course, there are a number of ways to install and configure any of these components. You could install each component manually by following the install instructions in its documentation. However, I do not recommend that. The problem with manual installation and configuration is twofold:

If something goes wrong, you have to re-install everything from scratch. If you haven’t backed up your configuration somewhere, you will have to re-configure everything from memory.

You cannot launch a new configuration or instance of your testing platform easily, since you have to manually set everything up again.

A better solution is to use a configuration management system, such as Puppet, Chef, Ansible or SaltStack to manage the deployment of these components, along with a Git repository to store configuration data. In this article, I will show you how to install an external testing system on multiple hosts or virtual machines using a set of Bash scripts and Puppet modules I have collected into a source repository on GitHub. If you don’t like Puppet or would just prefer to use a different configuration management tool, that’s totally fine. You can look at the Puppet modules in this repo for inspiration (and eventually I will write some Ansible scripts in the OpenStack External Testing project, too).

Preparation

Before I go into the installation instructions, you will need to take care of a few things. Follow these detailed steps and you should be all good.

Getting an Upstream Service Account

In order for your testing platform to post review comments to Gerrit code reviews on openstack.org, you will need to have a service account registered with the OpenStack Infra team. See this link for instructions on getting this account.

Don’t have an SSH key pair for your Gerrit service account? You can create one like so:

ssh-keygen -t rsa -b 1024 -N '' -f gerrit_key

The above will produce the key pair: a pair of files called gerrit_key and gerrit_key.pub. Copy the contents of gerrit_key.pub into the email you send to the OpenStack Infra mailing list. Keep both files handy for use in the next step.

Create a Git Repository to Store Configuration Data

When we install our external testing platform, the Puppet modules are fed a set of configuration options and files that are specific to your environment, including the SSH private key for the Gerrit service account. You will need a place to store this private configuration data, and the ideal place is a Git repository, since additions and changes to this data will be tracked just like changes to source code.

I created a source repository on GitHub that you can use as an example. Instead of forking the repository, as you might normally do, I recommend simply git clone’ing the repository to some local directory and making it your own data repository:
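As a sketch, that looks like the following (the repository URL is assumed from the os-ext-testing-data repository name referenced later in this article, and mydatarepo is just an example directory name):

```shell
# Clone the example data repository, then disconnect it from the
# example and turn it into your own independent repository
git clone https://github.com/jaypipes/os-ext-testing-data.git mydatarepo
cd mydatarepo
rm -rf .git
git init
git add .
git -c user.name="CI Admin" -c user.email="ci@example.com" \
    commit -m "Initial import of my CI data repository"
```

From here you can add a remote pointing at your private Git server and push.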

Now you’ve got your own data repository to store your private configuration data, and you can put it up in some private location — perhaps a private organization on GitHub, perhaps a Git server of your own.

Put Your Gerrit Service Account Private Key Into the Data Repository

The next thing you will want to do is add to this data repository the SSH key pair that you used in the step above, when you registered an upstream Gerrit service account.

If you created a new key pair using the ssh-keygen command above, copy the gerrit_key file into your data repository.

If you did not create a new key pair (you used an existing key pair) or you created a key pair that wasn’t called gerrit_key, simply copy that key pair into the data repository, then open up the file called vars.sh, and change the following line in it:

export UPSTREAM_GERRIT_SSH_KEY_PATH=gerrit_key

And change gerrit_key to the name of your SSH private key.

Set Your Gerrit Account Username

Next, open up the file vars.sh in your data repository (if you haven’t already), and change the following line in it:

export UPSTREAM_GERRIT_USER=jaypipes-testing

And replace jaypipes-testing with your Gerrit service account username.

Set Your Vendor Name in the Test Jenkins Job

(Optional) Create Your Own Jenkins SSH Key Pair

I have a private/public SSH key pair (named jenkins_key[.pub]) in the example data repository. Because I’ve published the private key there, it’s no longer useful as anything other than an example, so you will want to create your own:
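Mirroring the gerrit_key command from earlier (the jenkins_key filename matches the example data repository):

```shell
# Generate a passphrase-less key pair for the Jenkins master
ssh-keygen -t rsa -b 1024 -N '' -f jenkins_key
```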

Running the install_master.sh script should create a self-signed SSL certificate for the Apache server that fronts the Jenkins UI, and then install Jenkins, Jenkins Job Builder, Zuul, the Nodepool scripts, and a number of supporting packages.

Important Note: Since publishing this article, the upstream Zuul system underwent a bit of refactoring, with the Zuul git-related activities now executed by a separate Zuul worker process called zuul-merger. I’ve updated the Puppet modules in the os-ext-testing repository accordingly, but if you installed the Jenkins master with Zuul from the Puppet modules before Tuesday, February 18th, 2014, you will need to do the following on your master node to get everything reconfigured properly:

When Puppet completes, go ahead and open up the Jenkins web UI, which by default will be at http://$HOST_IP:8080. You will need to enable the Gearman workers that Zuul and Jenkins use to interact. To do this:

Click the “Manage Jenkins” link on the left

Click the “Configure System” link

Scroll down until you see “Gearman Plugin Config”. Check the “Enable Gearman” checkbox.

Note: Darragh O’Reilly noticed when he first did this on his machine, that the Gearman plugin was not actually enabled (though it was installed). He mentioned that simply restarting the Jenkins service fixed this problem, and the Gearman Plugin Config section then appeared on the Manage Jenkins -> Configure System page.

Once you are done with that, it’s time to load up your Jenkins jobs and start Zuul:
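The exact invocations come from the scripts in the os-ext-testing repository, but conceptually (a sketch — the config paths here are assumptions based on the data repository layout mentioned below) it amounts to:

```shell
# Build/refresh the Jenkins jobs from the Jenkins Job Builder
# YAML definitions stored in the data repository
sudo jenkins-jobs --conf /etc/jenkins_jobs/jenkins_jobs.ini update /etc/jenkins_jobs/config/

# Start the Zuul scheduler and its merger worker
sudo service zuul start
sudo service zuul-merger start
```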

If you refresh the main Jenkins web UI front page, you should now see two jobs show up:

Jenkins Master Web UI Showing Sandbox Jenkins Jobs Created by JJB

Testing Communication Between Upstream and Your Master

Congratulations. You’ve successfully set up your Jenkins master. Let’s now test connectivity between upstream and our external testing platform using the simple sandbox-noop-check-communication job. By default, I set this Jenkins job to execute on the master node for the openstack-dev/sandbox project [1]. Here is the project configuration in the example data repository’s etc/jenkins_jobs/config/projects.yaml file:
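The contents may have drifted since writing, but the entry looks roughly like this (standard Jenkins Job Builder project syntax; the job names match the two jobs shown in the Jenkins UI above):

```yaml
- project:
    name: sandbox
    github-org: openstack-dev
    node: master
    jobs:
      - sandbox-noop-check-communication
      - sandbox-dsvm-tempest-full:
          node: devstack_slave
```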

Note that the node is master by default. The sandbox-dsvm-tempest-full Jenkins Job is configured to run on a node labeled devstack_slave, but we will cover that later when we bring up our Jenkins slave.

By default, the only job that is enabled is the sandbox-noop-check-communication Jenkins job, and it will get run whenever a patchset is created in the upstream openstack-dev/sandbox project, as well as any time someone adds a comment with the words “recheck no bug” or “recheck bug XXXXX”. So, let us create a sample patch to that project and check to see if the sandbox-noop-check-communication job fires properly.
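Those trigger rules live in Zuul’s layout.yaml. A simplified sketch of such a check pipeline in Zuul v2 syntax (the exact events and comment regex in the data repository may differ):

```yaml
pipelines:
  - name: check
    manager: IndependentPipelineManager
    trigger:
      gerrit:
        # Fire on every new patchset...
        - event: patchset-created
        # ...and on "recheck no bug" / "recheck bug XXXXX" comments
        - event: comment-added
          comment_filter: '(?i)^\s*recheck( (bug \d+|no bug))?\s*$'
    success:
      gerrit:
        verified: 1
    failure:
      gerrit:
        verified: -1
```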

Before we do that, let’s go ahead and tail the Zuul debug log, grepping for the term “sandbox”. This will show messages if communication is working properly.

sudo tail -f /var/log/zuul/debug.log | grep sandbox

OK, now create a simple test patch in sandbox. Do this on your development workstation, not your Jenkins master:
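A sketch of such a test patch (the mytest filename is just an example; this assumes you have git-review installed and your Gerrit account configured):

```shell
# Clone the sandbox project and push a trivial change up for review
git clone https://git.openstack.org/openstack-dev/sandbox
cd sandbox
git checkout -b test-external-ci
echo "testing external CI communication" > mytest
git add mytest
git commit -m "Testing external CI communication"
git review
```

On success, git review prints the URL of the new code review in Gerrit.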

If you go to the link to the code review in Gerrit (the link that was output after you ran git review), you will see your Gerrit testing account has added a +1 Verified vote in the code review:

Successful communication between upstream and our external system

Congratulations. You now have an external testing platform that is receiving events from the upstream Gerrit system, triggering Jenkins jobs on your master Jenkins server, and writing reviews back to the upstream Gerrit system. The next article goes over adding a Jenkins slave to your system, which is necessary to run real Jenkins jobs that run devstack-based gate tests. Please do let me know what you think of both this article and the source repository of scripts to set things up. I’m eager for feedback and critique.

[1]— The OpenStack Sandbox project is a project that can be used for testing the integration of external testing systems with upstream. By creating a patch against this project, you can trigger the Jenkins jobs that are created during this tutorial.

Thank you, Trinath. Second article coming shortly. It turns out that LXC cannot run Devstack (due to things like https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1226855), and therefore I had to redo my test instances using KVM. So, that slowed me down a bit. Hopefully I will be pushing the second article out as soon as I have a full successful devstack Tempest run on the slave.

Scroll down until you see “Gearman Plugin Config”. Check the “Enable Gearman” checkbox.
Click the “Test Connection” button and verify Jenkins connects to Gearman

I had no gearman server installed, and “Test Connection” failed for me.

I installed “gearman-job-server” and this step then passed.

Can you update the document with this troubleshooting?

http://joinfu.com/ Jay Pipes

Hi Trinath!

OK, so it sounds like you just need to update the openstack-infra/config code. On your master, do:

sudo -i
cd /root/config
git pull
exit

and then re-run the install_master.sh script:

bash install_master.sh

Note that gearman does not need to be manually installed. Gearman’s libraries are installed by the openstack-infra/config’s zuul::init.pp Puppet manifest, which is included by the os_ext_testing::master Puppet manifest.

The LOST build message simply means that communication to and from your CI server was working, but Zuul was not able to determine what happened to the Jenkins job that was triggered. This is likely because the Jenkins gearman plugin was not activated.

Best,
-jay

Trinath Somanchi

How can we check whether the gearman plugin is successfully activated with Jenkins? Other than the Jenkins GUI, is there any place where I can monitor things?

Help me in this regard.
–
Trinath

http://joinfu.com/ Jay Pipes

You can check the Jenkins log file (in /var/log/jenkins). There should be a line in there saying that the Gearman plugin is enabled. As for knowing whether it is active (and successfully communicating with Gearman), I don’t know of any other way.

Trinath Somanchi

Done, Jay! Your comments helped me. I just did a “save” once again in the Jenkins config, and it worked.

Got a +1

Thanks a lot for the article. It really helps.

http://joinfu.com/ Jay Pipes

Excellent news, Trinath!

Pattabi

I face the same problem with the Gearman plugin not appearing in the Jenkins UI after I run install_master.sh. I followed the additional steps of updating the openstack-infra/config code and re-ran install_master.sh.

Still, I do not see the Gearman plugin option in the Jenkins UI.

Not sure if I am missing anything else. I do not find any log entries in the Jenkins log file other than the Jenkins started message.

Any help on this is highly appreciated.

Regards.
Pattabi

http://joinfu.com/ Jay Pipes

Hi Pattabi,

If you look in the Jenkins main log (in /var/log/jenkins/), grep for “gearman” and let me know if you see a line in the log file about the plugin. I’m curious to see if you see errors in there.

As a last resort, you can always install the Gearman plugin manually. Go to Manage Jenkins -> Manage Plugins -> Available tab, and install the Gearman plugin…

Best,
-jay

Pattabi

Hi Jay,

Thanks for the response. In fact, I was able to proceed further by manually installing “gearman-job-server” and restarting Jenkins.

I was able to proceed until the last step in terms of committing a change on sandbox project.

However, I do not see the messages in the Zuul log file. I see the following periodic errors in the zuul.log file. Any pointers?

That indicates either a proxy issue (can you ping review.openstack.org properly from that host/VM, and if you can, does running ssh -T $CI_USER_NAME@review.openstack.org -p29418 as the Zuul user successfully SSH you to the Gerrit host?)

Or there may be an issue with your Gerrit SSH key. Are you sure you have added your Gerrit SSH private key to your data repository?

Best,

-jay

Pattabi

Hi Jay,

– I can ping review.openstack.org properly from the VM
– I can do ssh -T $CI_USER_NAME@review.openstack.org -p29418 as the Zuul user
– The Gerrit SSH Key is in my data repository

I still have the same error trace on the zuul debug log.
Any other pointers on how to go about debugging the issue and/or other alternatives ?

Thanks in advance.

Regards,
Pattabi

http://joinfu.com/ Jay Pipes

Unfortunately, I’m kind of out of ideas here. The only thing I can think of is service zuul-merger stop; service zuul stop; and then rm -rf /var/log/zuul/*, and then restart both Zuul services. Then, recheck the Zuul logs to see if there’s any more information in there…

Other than that, I’m really not sure.

Best,
-jay

Ramy Asselin

Make sure you’re not using Corkscrew or similar tools when testing the ssh command, because Zuul uses direct ssh without any proxies (comment out the relevant lines from your ~/.ssh/config or /etc/ssh/ssh_config). If it still doesn’t work, then there’s a (corporate) firewall blocking the connection.

mayu

Great job, it is a guide for my ci construction. Thanks a lot.

Bharat Kumar Kobagana

Hi Mayu,

I am facing a few issues when following this guide. I am trying it on Ubuntu 12.04.

I have an FTP server to post the logs to. Can you guide me, with an article, on submitting the logs to FTP and publishing the link in Gerrit?

Kindly help me

http://joinfu.com/ Jay Pipes

I left that as an exercise for the reader to do in my last article. I will try to add another article that shows how to set this up, but you can look at the upstream JJB and Puppet configurations for some clues.

Before that article, I need to complete an article about adding nodepool as the devstack slave VM manager…

John Griffith

Nice work Jay! I was running into issues with gearman as well. The plugin was missing from the manage interface, but I just manually installed without issue.

The other issue was that when I tried to enable Gearman, it failed to connect. I verified telnetd was installed and disabled the firewall on my Precise system. Then I restarted Zuul (it wasn’t running) and that seemed to solve the issues.

The only outstanding issue is why my sandbox check gave a -1 for “Unable to merge”?

John,
Did you figure out why it was posting a -1? My master is also posting a ‘-1’ on my sandbox test.

Trinath Somanchi

Hi Jay-

I noticed an issue here: when I configure two jobs with Jenkins and Zuul for a project and restart Zuul, the first patchset that arrives runs both jobs, but when a stream of patchsets arrives, it only runs the first job and never posts the result to Gerrit again.

I have a multi node setup with a Jenkins master (Jenkins, Zuul and Gearman) and 2 slave nodes (devstack).

For the configuration of multiple jobs, is there any more configuration that needs to be taken care of?

please help me.

http://joinfu.com/ Jay Pipes

Hi Trinath, apologies for missing your above comment. Were you able to resolve your issues?

-jay

Richard Hedlind

Jay,
A question about layout.yaml. In the data repository, the project definition looks like this:

projects:
  - name: openstack-dev/sandbox
    check:
      - sandbox-noop-check-communication
Your example has ‘sandbox’ added to the front of the job string. Should it be there or not?

Still trying to understand why my CI gives a ‘-1’ in the sandbox test.

thx
Richard

http://joinfu.com/ Jay Pipes

Hi Richard,

The name of the test job changed after the article was written, sorry. The correct job name is the one in the current version of the os-ext-testing-data repository: noop-check-communication. We generalized the job name to remove the sandbox- prefix.

Best,
-jay

Sunil

Is this method still supposed to work? I am trying this on a fresh VM with an Ubuntu 14.04 install, following all the steps. I get an error toward the end, and there is no Jenkins and no Zuul on the system. Any ideas about what’s going on?

You are in ‘detached HEAD’ state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

Puppet does not like layout_dir to be ''. Why is layout_dir empty? What can I set it to?

http://joinfu.com/ Jay Pipes

Hi Sunil,

Apparently, the puppet manifests that install Zuul changed since I wrote this. You can set layout_dir to ‘/etc/zuul/’ by modifying the install_master.sh script to supply a class arg, like on this line:
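I haven’t re-verified this against the current manifests, but the change amounts to passing layout_dir where install_master.sh declares the zuul class in its Puppet manifest, something like:

```puppet
class { '::zuul':
  # ...keep the other parameters install_master.sh already supplies...
  layout_dir => '/etc/zuul/',
}
```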

After some manual intervention and dealing with errors like “Error: Could not initialize global default settings: Certificate names must be lower case; see #1168”, where the hostname had capital letters in it, I have the Jenkins and Zuul servers up and I can see the two jobs.

The question is how serious these remaining three errors are:

…
Notice: /Stage[main]/Zuul::Server/File[/etc/zuul/layout/layout.yaml]/ensure: defined content as '{md5}843951cf96796cb5d614669fb4d01609'
Error: /Stage[main]/Os_ext_testing::Master/File[/etc/zuul/openstack_functions.py]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/openstack_project/zuul/openstack_functions.py
…
Info: /Stage[main]/Zuul/Apache::Vhost[zuul]/File[50-zuul.conf]: Scheduling refresh of Service[httpd]
Error: Execution of '/usr/sbin/a2enmod mem_cache' returned 1: ERROR: Module mem_cache does not exist!
Error: /Stage[main]/Zuul/A2mod[mem_cache]/ensure: change from absent to present failed: Execution of '/usr/sbin/a2enmod mem_cache' returned 1: ERROR: Module mem_cache does not exist!
…
Info: Could not find filesystem info for file 'modules/openstack_project/nodepool/scripts' in environment production
Error: /Stage[main]/Os_ext_testing::Base/File[/opt/nodepool-scripts]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/openstack_project/nodepool/scripts
Error: /Stage[main]/Os_ext_testing::Base/File[/etc/apt/preferences.d/00-puppet.pref]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/openstack_project/00-puppet.pref

Sunil

Sorry for flooding this blog with my issue, but this looked like it would work without much hassle.

I got it to generate Gerrit events in Zuul. But it resulted in a merge conflict and did not even fire the Jenkins job. How did it run into a merge conflict? I just removed the file ‘mytest’ from sandbox. Where is the workspace where it’s trying to merge the changes?

This change was unable to be automatically merged with the current state of the repository. Please rebase your change and upload a new patchset.

http://joinfu.com/ Jay Pipes

Sunil, my apologies for you running into so many issues. It looks like there have been a number of breaking changes in the upstream Puppet modules. I am going to need to go back through my os-ext-testing repo and try to piece together what needs to change. Unfortunately, I don’t have time to do that this week.

-jay

Sunil

Jay,

No need for an apology…:) This has been a great help!

I think if I can understand where that merge conflict is coming from, I would be able to move forward.

As for the breaking changes, thinking out loud, maybe it’s dogfood time, conceptually :-) i.e. detect changes in the Puppet modules and fire up an Ubuntu VM that runs the install_master.sh script, then see if the sandbox test job gets created and can be run. If it fails, the least it can do is inform you, providing feedback to the Puppet module writers that they broke your application. Of course, this is an oversimplification, because I don’t know anything about Puppet’s source and CI management.