Monthly Archives: November 2013

Yesterday we tried to upgrade our mozmill-ci cluster to the previously released Mozmill 2.0.1. Sadly we failed on the OS X 10.6 machines and had to revert this change. After some investigation I found out that incompatibilities between Python 2.6 and 2.7.3 were causing this problem in mozprofile. Given the unclear status of Python 2.6 support in mozbase, and after a conversation in the #ateam IRC channel, I was advised to upgrade those machines to Python 2.7. I did so after some testing, also because all other machines are already running Python 2.7.3, so I didn’t expect any fallout. The first post-upgrade tests have confirmed this.

The interesting fact I would like to highlight here is that our tests now run noticeably faster. Previously a functional testrun on 10.6 took about 15 minutes. After the upgrade it went down to only 11 minutes. That’s an improvement of nearly 27% with Mozmill 1.5.24. With Mozmill 2.0.1 there is a similar drop, from 8 minutes to 6 minutes.

Given all that and the upcoming upgrade (hopefully soon) of our mozmill-ci system to Mozmill 2.0.1, we will see an overall improvement of 60% (15 minutes -> 6 minutes) per testrun! This is totally stunning and allows us to run 2.5 times as many tests in the same timespan. With it we can further increase our locale coverage from 20 to 40 for beta and release candidate builds as a next step.
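The percentages above follow directly from the quoted run times. As a quick sanity check, here is a minimal sketch; the function names are made up for this illustration and are not part of Mozmill or mozmill-ci.

```python
def reduction_percent(before_min, after_min):
    """Return by how many percent the run time dropped."""
    return (before_min - after_min) / before_min * 100


def throughput_factor(before_min, after_min):
    """Return how many times as many testruns fit into the same timespan."""
    return before_min / after_min


# Run times in minutes, taken from the numbers quoted in this post.
runs = [
    ("Mozmill 1.5.24, Python 2.6 -> 2.7", 15, 11),
    ("Mozmill 2.0.1, Python 2.6 -> 2.7", 8, 6),
    ("Mozmill 1.5.24 on 2.6 -> Mozmill 2.0.1 on 2.7", 15, 6),
]

for label, before, after in runs:
    print("%s: %.1f%% faster, %.2fx throughput" % (
        label,
        reduction_percent(before, after),
        throughput_factor(before, after),
    ))
```

The last entry combines both changes, which is where the 60% and the factor of 2.5 come from.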

After the last report covered a two-week cycle, I have to follow up with another one for weeks 45 and 46. Due to my move I had limited availability and also had to fix some other important stuff. So hopefully this is the last report covering a two-week period in the near future.

Highlights

To be able to release Mozmill 2.0.1 as soon as possible, Henrik had to fix a lot of existing bugs in mozprofile related to its add-on manager class. Those fixes were necessary because our restart tests were broken by an improper clean-up of add-ons after closing Firefox. In the end, 10 bugs were fixed.

Dave and Henrik were both working on a couple of Mozmill-CI issues, which will help us to better diagnose the memory issues and random crashes of the Jenkins Java process. Everything has been merged to our staging server and has to bake a bit before a push to production will happen.

As of now we have released a new version of mozdownload. The 1.10 release contains a couple of new features, bug fixes, and, worth a special mention, a suite of tests. A big thank you goes to Johannes and Jarek. Both again spent a lot of time on making it a successful and less bug-prone release.

Here are some major items to highlight:

Addition of tests for all types of scrapers and many for the command line options

Output of all candidate builds found if a build number has been specified

Given that I was partly away over the last two weeks, I haven’t had the time to write another summary report of our work yet. So let’s combine the last two weeks in a single blog post this time.

Highlights

To be able to continue our work on speeding up testruns in Mozmill CI, Henrik upgraded our Mozmill-CI system to Jenkins 1.509.4. This step was necessary so that we can investigate the slowness in sending out emails for finished testruns, which could take up to 25 minutes! Besides that, Henrik also disabled the sending of emails for successful testruns, given that those were mostly blocking us and no one really uses them. That was already a real boost, and our system was able to execute all of the 245 update tests for the 25.0 release in 25 minutes, across all platforms!

As noted in the last couple of Automation Development reports, our Windows 8.1 64bit nodes were really unstable and restarted at random times. With a lot of research Henrik was able to identify the problem. In the end it turned out to be a bug in VMware vSphere, which can be circumvented by tweaking the CPUID mask of the machine. All the details about the investigation and the fix can be found in bug 916746.

Together with Adrian from the IT team, Henrik set up another slave node for each supported platform of the Mozmill-CI production instance. That means for the 16 different platforms (including 32bit vs. 64bit) we are running 64 slaves now. That is a lot, and the need for Puppet support is greater than ever before. Manually keeping those machines up to date is a real pain for me.

More and more often we are facing a really nasty behavior of Jenkins, where it cannot delete the workspace for all kinds of jobs on Windows. It forces us to log into the machines, delete the whole Jenkins folder, and reconnect the slave. As of now we have no idea what causes this problem, but it might be related to a restart of the slave and some permission problems. Has anyone else seen the same and can shed some light on it for us?

Speaking of Mozmill, we are still blocked on some add-on related issues in mozprofile before we can release version 2.0.1. Henrik is working hard on those bugs, and hopefully we will have a new mozprofile release next week.

Dave participated in the Mozilla Festival this year, and while he was really busy over that weekend, he found the time to get started on the Mozmill CI configuration generator for ondemand tests. We will collect the requirements for the first version soon and then continue hacking on it.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 43 and week 44.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda and notes from the last two Automation Development meetings of week 43 and week 44.