He was keen to investigate the problem himself so I thought now would be a good time to try the git bisect feature for the first time. There were approximately 5500 patches merged between 2.6.14 and 2.6.15, and this method found the offending patch in only 13 reboots (OK – still a high number, I said this guy was keen though!).

This directory looks like an ordinary kernel source distribution as you might expect in /usr/src/linux. It also has a hidden .git subdirectory which is where git stores its magical data. Tell git that we want to find a buggy patch through bisection:

# git bisect start

Next, tell git which kernel was the last known-working kernel, and which kernel is known to be not working. To identify these kernels, you can either use the long hexadecimal commit numbers, or you can abbreviate those numbers, or you can refer to tags directly. Linus tags every release with the version number, so the next step is as simple as:

# git bisect bad v2.6.15
# git bisect good v2.6.14

git now runs off and locates the commit exactly halfway between 2.6.14 and 2.6.15. It then “checks out” this tree, so the kernel in front of you is effectively a snapshot from halfway between 2.6.14 and 2.6.15.

Bisecting: 2705 revisions left to test after this
[dd0314f7bb407bc4bdb3ea769b9c8a3a5d39ffd7] fbcon: Initialize new driver when old driver is released

Build that kernel in the normal way, and reboot into it. It works, so we know the bug was introduced after this point. We inform git of this:

# git bisect good

git now discards the whole first half of the commits between 2.6.14 and 2.6.15 and bisects the remaining half (the changes after the point we just tested, but before 2.6.15). This is just a simple binary search. git now presents us with a new kernel snapshot (in this case, three quarters of the way between 2.6.14 and 2.6.15) and we have to test this.
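To make the halving concrete, here is a minimal sketch of the same search over plain patch numbers rather than real commits. FIRST_BAD, is_bad and bisect_range are names invented for this illustration, and the position 3907 is made up. In real life the is_bad step is the build-and-reboot test, and git works on the actual commit graph rather than a simple numbered list.

```shell
# Assumed position of the buggy patch (a made-up number for the demo).
FIRST_BAD=3907

# Hypothetical oracle: stands in for "build the kernel, reboot, test".
is_bad() {
    [ "$1" -ge "$FIRST_BAD" ]
}

# bisect_range GOOD BAD: GOOD is a patch known to work, BAD one known to fail.
# Prints the first bad patch number and the number of tests needed.
bisect_range() {
    good=$1
    bad=$2
    tests=0
    while [ $(( bad - good )) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))      # snapshot halfway through what is left
        tests=$(( tests + 1 ))
        if is_bad "$mid"; then
            bad=$mid                     # like "git bisect bad"
        else
            good=$mid                    # like "git bisect good"
        fi
    done
    echo "$bad $tests"
}

bisect_range 0 5500
```

With these made-up numbers the first two tests land on patches 2750 and 4125, which corresponds to the halfway and three-quarter snapshots described above.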

A couple of kernels later, the search gives us a kernel which exhibits the bug. Telling git about this isn’t any harder:

# git bisect bad

The search continues, with the user telling git if the kernel was “good” or “bad” each time, and several reboots later we end up with the exact patch that introduced the bug:

Ronald filed kernel bug 5930 about this. Usually this stuff is a nightmare to debug, and even though this did require many reboots to locate, it’s definitely a step in the right direction. You need fewer bisections if you have a smaller range to test (e.g. if he’d known that 2.6.15-rc1 was OK and 2.6.15-rc4 was bad, some time would have been saved).
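If you want to try the commands without rebuilding kernels, the following sketch runs the same start/bad/good sequence on a throwaway repository. Everything in it is invented for the demo: the 16 commits, the status and counter files, the known-good tag, and the "bug" introduced in commit 11. It also uses git bisect run, which was not used in the session above, to let a script answer good/bad instead of a human. That only helps when the test can be automated, which a reboot-and-see kernel bug usually cannot.

```shell
# Runs in a subshell so the caller's working directory is left alone.
demo_bisect() (
    repo=$(mktemp -d) || exit 1
    cd "$repo" || exit 1
    git init -q .

    # Create 16 commits; from commit 11 onwards the "bug" is present.
    i=1
    while [ "$i" -le 16 ]; do
        if [ "$i" -ge 11 ]; then echo "BUG" > status; else echo "ok" > status; fi
        echo "$i" > counter
        git add status counter
        git -c user.name=demo -c user.email=demo@example.invalid \
            commit -q -m "commit $i"
        if [ "$i" -eq 1 ]; then git tag known-good; fi
        i=$(( i + 1 ))
    done

    git bisect start >/dev/null
    git bisect bad HEAD >/dev/null
    git bisect good known-good >/dev/null

    # The test command exits 0 for a good tree and 1 for a bad one.
    git bisect run sh -c '! grep -q BUG status' >/dev/null

    # refs/bisect/bad now points at the first bad commit; print its subject.
    git log -1 --format=%s refs/bisect/bad

    git bisect reset >/dev/null 2>&1     # return to the original checkout
    cd / && rm -rf "$repo"
)

demo_bisect
```

On this toy history the script should print the subject of the commit that introduced the marker, "commit 11". The git bisect reset at the end is also worth running after a manual session, since it puts the tree back where it was before bisecting started.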

I agree. If I had been able to shave 75% off the commit range, I would have saved a mere 2 reboots. I think rc1 to rc4 accounts for more than 25% of the changesets between the 2.6.14 and 2.6.15 releases.

The same applies the other way around: if I had only known that 2.6.11 was good and 2.6.15 was bad, that would have added a mere 2 more reboots (every boot rules out around half of the remaining changesets).
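The arithmetic in these two comments can be checked with a few lines of shell. This is only a rough model, assuming each test rules out about half of the remaining changesets. bisect_steps is a name made up for the illustration, and 1375 and 22000 are simply 5500 divided and multiplied by four, not real commit counts.

```shell
# bisect_steps N: roughly how many tests are needed to narrow N candidate
# changesets down to one, halving (rounding up) on every test.
bisect_steps() {
    remaining=$1
    steps=0
    while [ "$remaining" -gt 1 ]; do
        remaining=$(( (remaining + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

bisect_steps 5500     # the approximate 2.6.14 to 2.6.15 range
bisect_steps 1375     # the same range with 75% shaved off
bisect_steps 22000    # a hypothetical range four times as large
```

Under this model the three calls print 13, 11 and 15, so quartering or quadrupling the range changes the count by only 2 reboots either way.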

I started to use this approach to fix these infernal crashes I’ve been experiencing for nearly a year… and I came to the horrible realisation that it wasn’t going to work because the last good kernel was 2.6.9 – before git was created. I’m doomed! I really have tried everything. I’m not alone though, another guy with a similar Vaio has the same problem.

After bisecting I would like to compile a specific kernel version. How can I switch to a specific commit? With cogito I could do “$> cg-seek v2.6.18-rc3”, but the cogito scripts seem to be buggy, don’t they?

Last year, I ran into a v4l problem, and, as my first experience with git, I tried bisecting, but I ran into problems (don’t remember what, but it was more of an issue between me and git, than between me and the kernel), and decided just to download a bunch of tarballs and narrow it down. I managed to narrow it down to a 4-number release, so I had figured I’d go back into git and try to narrow it down further, but apparently there are no tags for the 4-number versions, and the commit hash (from the release notes, apparently written by Greg KH) was unknown to git on a clone of Linus’ repo.

So I gave up and just rewrote the app to use v4l2 (which fixed my issue), but I’m nevertheless curious what I really should have done with the additional info I had about the 4-number release… obviously if it happened now, I wouldn’t even bother with the tarballs, but it’d be nice not to have to build another 10+ kernels above and beyond the minimum if I ever did decide to pick up where I left off…