Posted
by
CmdrTaco
on Sunday June 05, 2005 @11:45AM
from the just-like-a-real-project dept.

An anonymous reader writes "The Linux kernel is now being automatically tested within 15 minutes of a new version being released, across a variety of hardware, and the results are being published for all to see. Martin Bligh
announced this yesterday, running on top of IBM's internal test automation system. Maybe this will enable the kernel developers to keep up with the 2.6 kernel's rapid pace of change. Looks like it caught one new problem with last night's build already ..."

I automatically test every nightly -git snapshot release, so it's fairly well tied in anyway. This also means my heaviest usage of our machines is at night, when most of the (US) developers are asleep.

So it's fairly well tied in already, and the whole -rc cycle should let us catch a lot of stuff.

Red Hat (and probably Novell/SuSE, since they carry over a thousand kernel patches) has been running a myriad of tests on each of its own nightly kernel builds for years, and on more than just the three architectures covered by this test.

That said, pushing tests upstream is a great idea. Just not revolutionary or anything.

It compiles, boots, and runs dbench, tbench, kernbench, reaim, and fsx. If one test fails, it'll highlight it in yellow rather than green or red. I have a few of those in the internal tests, but not in the external set.
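The green/yellow/red convention described above can be sketched roughly like this. The test list comes from the post, but the function and data structures are purely illustrative, not the actual harness code:

```python
# Hypothetical sketch: classify one machine's test run for the results matrix.
# Green = all passed, yellow = partial failure, red = nothing worked.

TESTS = ["compile", "boot", "dbench", "tbench", "kernbench", "reaim", "fsx"]

def cell_colour(results):
    """results maps test name -> True (pass) / False (fail)."""
    failures = [t for t in TESTS if not results.get(t, False)]
    if not failures:
        return "green"              # everything passed
    if len(failures) < len(TESTS):
        return "yellow"             # partial failure, worth a closer look
    return "red"                    # total failure

run = {t: True for t in TESTS}
run["fsx"] = False
print(cell_colour(run))             # a single failing test shows as yellow
```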

This is only the tip of the iceberg as to what can be done. We're already running LTP and several other tests internally. Some have licensing restrictions on releasing results (SPEC). LTP is a pain because some tests always fail, so I have to work out the differential against a baseline; that will come later.
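The baseline-differential idea for LTP might look something like this sketch, where only failures that weren't already in the known-bad baseline get reported. The file format and the test names are assumptions, not the real harness layout:

```python
# Hypothetical sketch: some LTP tests always fail on a given machine, so the
# interesting signal is the set of *new* failures relative to a baseline run.

def read_failures(path):
    """Read one failing test name per line from a results file (assumed format)."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}

def new_failures(baseline, current):
    """Tests that fail now but did not fail in the baseline run."""
    return sorted(current - baseline)

baseline = {"msgctl08", "fcntl17"}          # known-bad on this box (illustrative)
current = {"msgctl08", "fcntl17", "mmap09"}
print(new_failures(baseline, current))      # only the regression: ['mmap09']
```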

Indeed. The automation system I wrote is just a wrapper around an internal harness called ABAT that has a massive amount of work behind it. If systems crash it can detect that, power cycle them, etc.
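A crash-detect-and-recover loop of the kind described could be sketched like this. ABAT itself is internal, so the ping check, timeout values, and power-control hook here are all assumptions about the general shape, not the real code:

```python
# Hypothetical sketch: if a machine stops responding mid-test, power cycle it
# and wait for it to come back before rescheduling work on it.

import subprocess
import time

BOOT_TIMEOUT = 600        # seconds to wait for a machine to reboot (assumed)

def is_alive(host):
    """Treat a machine as alive if it answers a single ping."""
    return subprocess.call(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL) == 0

def recover(host, power_cycle):
    """Detect a crashed machine, power cycle it, and wait for it to return."""
    if is_alive(host):
        return True
    power_cycle(host)                     # e.g. remote power switch or IPMI
    deadline = time.time() + BOOT_TIMEOUT
    while time.time() < deadline:
        if is_alive(host):
            return True
        time.sleep(10)
    return False                          # mark the machine down, skip its tests
```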

Going from 90% working to 99.9% working is frigging hard. I had all this working 3-6 months ago, but the results weren't good enough quality to be published. Several people internally put a massive amount of work into improving the quality and stability of the harness.

There is indeed an internal self-test suite for the harness. It's not desperately sophisticated, and I wouldn't dare show it to anyone ;-) However, it does catch a lot of stupid bugs. It requires some manual intervention and inspection to work.

Plus, there's a separate development grid where we test new test-harness code before it's put onto the production grid.

The results are all there if anyone wants to play with them. Go to the results matrix, and click on the numerical part of the green box. Pick a test, and drill down to the results directory.

The numbers are all there; it's just a question of drawing graphs, etc. I have some for kernbench already, but I haven't finished automating them. If anyone wants to email me code to generate them from the directory structure published there, feel free ;-) Preferably Python or Perl into gnuplot.
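For anyone tempted to take him up on that, here's a minimal Python sketch of the idea: walk a results tree, pull one number per kernel version, and emit a gnuplot data file. The directory layout and the per-version metric file name (`elapsed`) are assumptions, not the published structure:

```python
# Hypothetical sketch: results_dir/<kernel-version>/elapsed holds one number
# per run; collect them in version order and write a gnuplot-friendly file.

import os

def collect(results_dir, metric_file="elapsed"):
    points = []
    for version in sorted(os.listdir(results_dir)):
        path = os.path.join(results_dir, version, metric_file)
        if os.path.isfile(path):
            with open(path) as f:
                points.append((version, float(f.read().strip())))
    return points

def write_gnuplot(points, out="kernbench.dat"):
    with open(out, "w") as f:
        for i, (version, value) in enumerate(points):
            f.write(f"{i} {value} # {version}\n")
    # then: gnuplot> plot "kernbench.dat" using 1:2 with linespoints
```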

Which would mean, for the last several 2.6.x releases, that you are always using a version with a known root hole in it. Here's an idea: use your vendor's QA-tested kernel that they package for your distribution.