A kernel panic occurs when the Linux kernel enters an unrecoverable failure state. The state typically originates from buggy hardware drivers resulting in the machine being deadlocked, non-responsive, and requiring a reboot. Just prior to deadlock, a diagnostic message is generated, consisting of: the machine state when the failure ocurred, a call trace leading to the kernel function that recognized the failure, and a listing of currently loaded modules. Thankfully, kernel panics don't happen very often using mainline versions of the kernel--such as those supplied by the official repositories--but when they do happen, you need to know how to deal with them.

Note: Kernel panics are sometimes referred to as oops or kernel oops. While both panics and oops occur as the result of a failure state, an oops is more general in that it does not necessarily result in a deadlocked machine--sometimes the kernel can recover from an oops by killing the offending task and carrying on.

Tip: Pass the kernel parameter oops=panic at boot or write 1 to /proc/sys/kernel/panic_on_oops to force a recoverable oops to issue a panic instead. This is advisable is you are concerned about the small chance of system instability resulting from an oops recovery which may make future errors difficult to diagnose.

Examine panic message

If a kernel panic occurs very early in the boot process, you may see a message on the console containing "Kernel panic - not syncing:", but once Systemd is running, kernel messages will typically be captured and written to the system log. However, when a panic occurs, the diagnostic message output by the kernel is almost never written to the log file on disk because the machine deadlocks before system-journald gets the chance. Therefore, the only way to examine the panic message is to view it on the console as it happens (without resorting to setting up a kdump crashkernel). You can do this by booting with the following kernel parameters and attempting to reproduce the panic on tty1:

systemd.journald.forward_to_console=1 console=tty1

Tip: In the event that the panic message scrolls away too quickly to examine, try passing the kernel parameter pause_on_oops=seconds at boot.

Example scenario: bad module

It is possible to make a best guess as to what subsystem or module is causing the panic using the information in the diagnostic message. In this scenario, we have a panic on some imaginary machine during boot. Pay attention to the lines highlighted in bold:

[1] Indicates the type of error that caused the panic. In this case it was a programmer bug.

[2] Indicates that the panic happened in a function called fw_core_init in module firewire_core.

[3] Indicates that firewire_core was the latest module to be loaded.

[4] Indicates that the function that called function fw_core_init was do_one_initcall.

[5] Indicates that this oops message is, in fact, a kernel panic and the system is now deadlocked.

We can surmise then, that the panic occurred during the initialization routine of module firewire_core as it was loaded. (We might assume then, that the machine's firewire hardware is incompatible with this version of the firewire driver module due to a programmer error, and will have to wait for a new release.) In the meantime, the easiest way to get the machine running again is to prevent the module from being loaded. We can do this in one of two ways:

If the module is being loaded during the execution of the initramfs, reboot with the kernel parameter rd.blacklist=firewire_core.