Torvalds Blows Stack Over Buggy New Kernel

Linux creator Linus Torvalds this week apologized for including in the just-released Linux 4.8 kernel a bug fix that crashed it.

"I'm really sorry I applied that last series from Andrew just before doing the 4.8 release, because they cause problems, and now it is in 4.8 (and that buggy crap is marked for stable too)," he wrote in a message to the Linux kernel mailing list. "In particular, I just got this -- kernel BUG at ./include/linux/swap.h:276 -- and the end result was a dead kernel."

The bug the dev was trying to fix has existed since Linux 3.15, "but the fix is clearly worse than the bug ... since that original bug has never killed my machine," Torvalds wrote.

The message became increasingly acrimonious, as Torvalds displayed the temper for which he's notorious.

No Excuse

"I should have reacted to the damn added BUG_ON() lines. I suspect I will have to finally just remove the idiotic BUG_ON() concept once and for all, because there is NO F*CKING EXCUSE to knowingly kill the kernel. Why the hell was that not a warning?" he fumed.

Torvalds acknowledged he was "grumpy," adding that "this went in very late in the release candidates, and I had higher expectations of things coming in through Andrew."

The reference presumably was to Andrew Morton, one of the Linux kernel's lead developers.

"Adding random BUG_ON()s to code that clearly hasn't had sufficient testing is *not* acceptable, and it's definitely not acceptable to send that to me after rc8 unless it has gotten a *lot* of testing, which it clearly must not have had," Torvalds continued.

"I've ranted against people using BUG_ON() for debugging in the past. Why the f*ck does this still happen? And Andrew - please stop taking those kinds of patches!"

The Bug Fix That Wasn't

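BUG() and BUG_ON() are closely related kernel macros rather than separate instructions: BUG() fails unconditionally, while BUG_ON() takes a condition and triggers BUG() only if that condition is true.

On x86, BUG() is implemented with a deliberately invalid instruction, which leads the CPU to throw an invalid opcode exception.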

When a BUG_ON() assertion fails, or the code takes a branch with BUG() in it, the kernel will print out the contents of the registers and a stack trace -- then the current process will die.
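The difference between a fatal assertion and the warning Torvalds asked for can be sketched in ordinary userspace C. This is only an illustration, not the kernel's code: SKETCH_BUG_ON, SKETCH_WARN_ON, warn_count and the two divide helpers are hypothetical names invented here, and abort() merely stands in for the kernel's architecture-specific trap.

```c
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-ins only; these are NOT the kernel's definitions. */
static int warn_count; /* hypothetical counter so callers can observe warnings */

/* Fatal path: report where the check failed, then kill the program. */
static void sketch_bug(const char *expr, const char *file, int line)
{
    fprintf(stderr, "BUG: (%s) at %s:%d\n", expr, file, line);
    abort(); /* assumption: stands in for the kernel's invalid-opcode trap */
}

/* Non-fatal path: report the failure and hand the result back to the caller. */
static int sketch_warn(int failed, const char *expr, const char *file, int line)
{
    if (failed) {
        warn_count++;
        fprintf(stderr, "WARNING: (%s) at %s:%d\n", expr, file, line);
    }
    return failed;
}

/* Dies if the condition is true. */
#define SKETCH_BUG_ON(cond) \
    do { if (cond) sketch_bug(#cond, __FILE__, __LINE__); } while (0)

/* Complains if the condition is true, and evaluates to 1 or 0. */
#define SKETCH_WARN_ON(cond) \
    sketch_warn(!!(cond), #cond, __FILE__, __LINE__)

/* Hypothetical caller that treats bad input as fatal. */
static int divide_or_die(int a, int b)
{
    SKETCH_BUG_ON(b == 0);
    return a / b;
}

/* Hypothetical caller that logs bad input, returns an error and keeps running. */
static int divide_or_warn(int a, int b, int *out)
{
    if (SKETCH_WARN_ON(b == 0))
        return -1;
    *out = a / b;
    return 0;
}
```

Calling divide_or_warn(1, 0, &q) prints a warning and returns -1, leaving the program alive to handle the error, whereas divide_or_die(1, 0) terminates it on the spot. That is the trade-off at the heart of Torvalds' complaint.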

"This type of situation, while rare, is common enough in smaller and less visible projects, where testing processes and protocol are typically less sophisticated than those used by Linus and his team," noted Al Hilwa, a research program director at IDC.

The Grinch Who Rules Linux

Linux has grown dramatically over the 25 years of its existence, but its creator apparently hasn't seen the need to alter his trademark style of communication with its core developers.

Software development "remains a highly detailed and error-prone process, especially at these lowest levels of systems. Kernel work is [the equivalent of] brain surgery, and it's not a surprise that it's still not foolproof," Hilwa told LinuxInsider.

"Almost all engineers get grumpy about things like this," he said. "It's just that this team operates in the open, which is generally a wonderful thing -- but we get to see real human emotions in action."

Richard Adhikari has written about high-tech for leading industry publications since the 1990s and wonders where it's all leading to. Will implanted RFID chips in humans be the Mark of the Beast? Will nanotech solve our coming food crisis? Does Sturgeon's Law still hold true?