Executive summary: it is fairly easy to recover from certain
situations where your machine won’t boot the way you want (or at all).

As discussed in section 2, there are lots of ways the
MBR (master boot record) can be damaged or changed in undesirable ways.
Section 1.1 will tell you how to reinstall or recreate a
well-behaved MBR, without having to reinstall everything else.

Less commonly, the first stage of grub works OK, but
later stages fail, due to a damaged grub.cfg file.

Various scenarios that could cause this situation are discussed in
section 2, but for now let’s concentrate on getting out of
the situation.

You need to know which drive partition holds the
target system, i.e. the Linux system you want to boot. For
clarity, let’s discuss things using the shell variables $partition
and $drive. An example might be: partition=/dev/sda6 ;
drive=/dev/sda
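
As a concrete sketch, the two variables could be set as follows. The helper name drive_of is made up for illustration, and it assumes partition names of the form /dev/sdXN or /dev/hdXN; other naming schemes (NVMe devices, for example) would need different handling.

```shell
# Hypothetical helper: strip the trailing partition number from a name
# such as /dev/sda6 to get the whole-drive name /dev/sda.
# Assumes sdXN/hdXN-style names only.
drive_of() {
    printf '%s\n' "$1" | sed 's/[0-9]*$//'
}

partition=/dev/sda6             # example value from the text
drive=$(drive_of "$partition")  # /dev/sda for this example
```

Setting both variables by hand, as in the example above, works just as well; the helper merely keeps the two values consistent.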

If you happen to know, based on experience, where the target system
lives, define $partition and $drive accordingly, and skip to
step 7. If you need to figure it out, proceed as
follows:

Let’s assume you have already carried out the steps in
section 1.1. Various possible next steps include:

The system boots up fine. See section 1.4 for some
housecleaning suggestions. Then resume normal operations.

Grub prints a normal-looking menu, but when you try to boot
Linux, the boot does not succeed. This might be a problem with the
grub configuration. Try rebuilding the grub.cfg file according to
the instructions in section 1.3.

Grub doesn’t even print a menu, but instead prints a “grub
rescue” prompt. This might also be a problem with the grub
configuration. Try rebuilding the grub.cfg file according to the
instructions in section 1.3.

Sometimes it isn’t a grub problem at all. For example, if you
build a 64-bit kernel and install it on a machine with 32-bit
utilities, it will not boot successfully, and there is nothing grub
can do about it. You have to install a correct, consistent set of
software. You may need to reconfigure grub also, but that happens
after you have solved the other problems.

The procedures in section 1.1 were meant to get the system
functioning again as quickly as possible. Now that the system is up
and running, so that the time pressure is off, we can do some
housekeeping:

Optional: You may want to make sure your copy of the software is not
corrupted:
apt-get install --reinstall grub # (optional)

Install the latest and greatest grub in the MBR:
grub-install --recheck /dev/hda

In ideal situations, the work described in this section doesn’t
accomplish much, because it duplicates the work done in section 1.1
and section 1.3. However, consider the situation where the Live Demo
system you used to restore the MBR is using a different version of
grub. Maybe one system is out of date, or maybe you just exercised
the option to use a different version. This is your chance to install
the grub version that your system thinks should be installed. If you
don’t do this, you risk having some ugly problems later.

There are several scenarios that can lead to an MBR being overwritten
or otherwise rendered unsatisfactory. Examples include:

On a dual-boot system, every time you install (or reinstall)
Windows, it will almost certainly overwrite your MBR. See
section 2.1.

A failed upgrade can leave grub in a bad state. In particular,
if the system was using Grub Version 1 before the upgrade and wants
to use Grub Version 2 afterwards, sometimes things get confused.
I’ve seen it happen.

Suppose you have a dual boot system, i.e. one that sometimes
boots Linux and sometimes boots Windows. Every time you install (or
reinstall) Windows, it installs its own boot loader into the MBR.
This is a problem, because the MS boot loader will not load anything
except the MS operating system ... in contrast to grub, which will
happily allow you to boot almost anything: Linux, memtest86, various
MS products, et cetera.

Some folks recommend installing MS before installing Linux, so that
the Linux installation process will set up the MBR for you. This is
fine as far as it goes, but it is not always possible. For instance,
sometimes it is necessary to reinstall or upgrade the MS
stuff, days or months or years after Linux was installed.

The grub-reinstall procedure described in this document takes only a
few minutes, so feel free to install MS after Linux if you find it
necessary or convenient to do so. MS will trash the MBR, but you can
restore it using the techniques described here.

If you have two or more Linux systems, use system "1" to store the
backups pertaining to system "2" and vice versa.

If you have only one system, store the backups on floppy ... and
don’t forget where you put the floppy.

Feel free to keep another copy of sector 0 on the same
drive as the sector you are backing up. This is useful in cases where
part of sector 0 is messed up but the partition table remains correct.
It is useless in cases where the partition table is trashed.
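
The backup command itself is not shown here, so the following is a minimal sketch, assuming sector 0 is the first 512 bytes of the drive. The function name backup_mbr is made up for illustration; the file name follows the host1-sda.mbr pattern used in this document.

```shell
# Hypothetical helper: copy sector 0 (512 bytes) of a drive into a file.
#   $1 = source drive, e.g. /dev/sda (reading it requires superuser privileges)
#   $2 = backup file,  e.g. host1-sda.mbr
backup_mbr() {
    dd if="$1" of="$2" bs=512 count=1 2>/dev/null
}

# Example (run as superuser):
#   backup_mbr /dev/sda host1-sda.mbr
```

Because the function only reads from the drive, it is safe to try; the risky direction is writing a backup back to sector 0.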

Keep in mind that sector zero contains both the stage-0 boot code and
the primary partition table. Therefore, before restoring the boot
sector, you have to make a decision:

In the scenario where something trashes sector 0 including the
partition table, then you want to restore the whole thing. This
can rescue you from what would otherwise be a very bad situation.

dd if=host1-sda.mbr of=/dev/sda count=1

In the scenario where the partition table is not trashed, and
has possibly changed since you backed up the MBR, you want to restore
the boot code without disturbing the current partition table. You
need to splice the backed-up boot code onto the current partition
table before writing anything to sector 0. The procedure is:
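
One way such a splice could be done is sketched here. It assumes the classic MBR layout: the first 446 bytes of sector 0 are treated as boot code, followed by the 64-byte partition table and the 2-byte signature. The function name splice_mbr and the file names current.mbr and new.mbr are made up for illustration.

```shell
# Hypothetical helper: build a new sector-0 image from
#   $1 = backed-up sector 0 (source of the boot code)
#   $2 = copy of the current sector 0 (source of the partition table)
#   $3 = output file holding the spliced 512-byte sector
splice_mbr() {
    # Bytes 0..445: boot code area, taken from the backup.
    dd if="$1" of="$3" bs=1 count=446 2>/dev/null
    # Bytes 446..511: partition table plus signature, taken from the
    # current sector; notrunc keeps the boot code already written.
    dd if="$2" of="$3" bs=1 skip=446 seek=446 count=66 conv=notrunc 2>/dev/null
}

# Example (run as superuser; inspect new.mbr before the final write):
#   dd if=/dev/sda of=current.mbr bs=512 count=1
#   splice_mbr host1-sda.mbr current.mbr new.mbr
#   dd if=new.mbr of=/dev/sda bs=512 count=1
```

Building the spliced sector in a file first means nothing is written to the drive until the result has been checked.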

Ubuntu: The Ubuntu Live USB drive (or DVD) that you used to
install Ubuntu also serves as a nice Live Demo image, suitable for
many purposes including the grub reinstallation process described
here. So be sure to keep that USB drive (or DVD) handy. If you need
to download a new copy, see reference 2.

Debian: The usual Debian install disk is not, alas, a
fully-featured Live Demo. A rundown of the various Debian Live images
can be found in reference 3.

Slackware: RIP (reference 4) is a Slackware Live Demo,
suitable for tasks such as grub reinstallation.

For good reasons, when you fire up a typical live CD, you
are logged in as an ordinary user, not the superuser.

You can exert superuser privileges on a command-by-command basis by
prefixing each command with "sudo" ... but since every command we are
about to do requires superuser privileges, it is easier to just become
superuser once and for all by saying sudo su

This creates a new empty directory named x. The name is
arbitrary, made up just for this purpose. You could use any other name
if you wanted, so long as you used the name consistently in all steps
in the grub-reinstall procedure ... but x is as good as any.
It’s just some empty directory. It serves the following purpose: In a
moment we are going to want to mount a filesystem. Linux mounts
things by mounting them onto a directory. The newly mounted
filesystem has to attach to the rest of the filesystem somewhere, and
Linux uses a pre-existing directory as a point of attachment.

The --root-directory=/x option tells grub where to look
for the grub directory during the installation process. The
grub directory is /x/boot/grub on typical distributions
such as Ubuntu and Debian, but may be /x/grub on some
*bsd setups.

The grub-install program uses the grub directory in several
ways during the installation process. Among other things, it goes
there to read the device.map file. It also goes there
to write the core.img file. A new core.img file
gets written each time you run grub-install.
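
Put together, the sequence might look like the following sketch. It uses the example values /dev/sda6 and /dev/sda from the text and the mount point /x. The run wrapper, the DRY_RUN switch, and the final umount are additions made for this illustration, not part of the original procedure. Some newer grub versions offer --boot-directory as an alternative to --root-directory, so check grub-install --help on your Live system.

```shell
# Sketch only: with DRY_RUN=1 (the default here) the commands are printed,
# not executed. Set DRY_RUN=0 and run as superuser to actually perform them.
partition=${partition:-/dev/sda6}   # example values from the text
drive=${drive:-/dev/sda}
mnt=${mnt:-/x}                      # arbitrary empty mount point

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

reinstall_grub() {
    run mkdir -p "$mnt" &&
    run mount "$partition" "$mnt" &&
    run grub-install --root-directory="$mnt" "$drive" &&
    run umount "$mnt"    # tidy-up step added in this sketch
}

reinstall_grub
```

Printing the commands first makes it easy to confirm that $partition and $drive point where you intend before anything touches the MBR.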

Keep in mind that the Unix file system is essentially a graph (in the
sense of graph theory) with edges and nodes. The edges are the paths,
i.e. directory names and file names. The nodes do not have names.
The nodes are where the data is stored. So: the inode of interest
will be reached by the path "/x" during the installation
process. Grub assumes this inode will be reached by the simple path
"/" later, when the system on /dev/sda6 is actually booting and
running.

The idea that the same inode could be reached by one path now and a
different path later makes perfect sense if you think about it the
right way. The grub-install program understands the
distinction between the two, which is what makes it possible to
reinstall grub using the easy procedure described in this document.

This distinction is, alas, not well documented. You could read the
grub manpage all day and not learn anything about this distinction.
The grub-install --help message says

--root-directory=DIR install GRUB images under the directory DIR
instead of the root directory

which seems somewhere between incomprehensible and self-contradictory.
Is DIR the root directory (as suggested by the equation
root-directory=DIR) ... or is DIR used "instead of the root directory"
(as stated in the explanatory message)? Gaaack.

I hope you never need to know this. Usually the procedures described
in section 1.1 make this unnecessary.

Imagine a scenario where grub is installed in the MBR correctly, but
the grub configuration files are messed up, so all you get is the
grub> prompt (rather than a menu of kernels that can be
booted). Further imagine that you can’t fix it using the methods
described in section 1.1.

You may be able to recover using the following procedure:

At the grub> prompt, type root (hd0,<tab>

This will give you a listing of all the partitions on the hd0 drive,
along with their UUID, filesystem type, and modification date.

Note that you generally have to add the root=... option to
the linux command line.

Beware that the way grub numbers disk drives {hd0, hd1, hd2, etc.}
may be different from the way Linux does it {sda, sdb, sdc, etc.}
... and the difference is not systematic. I have one system where hd0
corresponds to /dev/hde. This is commonly an annoyance on systems
that have a mixture of SATA and PATA drives.

The numbering of partitions is also different, but the difference is
systematic: grub numbers them starting from 0, while Linux numbers
them starting from 1, so grub partition (...,2) corresponds to Linux
partition /dev/...3 and so on. (This applies to Grub Version 1; Grub
Version 2 numbers partitions starting from 1, the same as Linux.)

At the grub> prompt, type initrd /boot/init<tab>

This will give you a listing of all the initrd files. Pick the one
that corresponds to your kernel, and issue the complete command:
initrd /boot/initrd.img-2.6.35.10 or whatever.

Issue the boot command. The kernel should boot.

If the kernel panics because it could not mount the root fs, it
means you guessed wrong about the root=... command-line argument.
Maybe it is /dev/hda3 or /dev/sda3 or /dev/sde3. However ...

Remember that the kernel needs to know the root drive twice,
once when it is reading the initrd (initial ramdisk), and once again
when it is starting the system for real. I have seen situations
where the drive is named differently in the two cases, in which case
any drive name you pick is going to be wrong in one context or the
other, and the system will not boot correctly.

The only way to handle this case is to refer to the
disk by its UUID, using a construction of the form
root=UUID=4240ce68-802b-4a41-8345-543fad0ec20f

That is an obnoxious amount of typing, but with any luck you only
have to do it once.

Grub will tell you the UUID; see the first item in this list.

Once the system is booted, clean up the mess using
the methods described in section 1.4.
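
Two of the bookkeeping details in the procedure above lend themselves to small shell helpers, which could be run on any working Linux system (a Live Demo system, for instance) before you sit down at the grub> prompt. This is a sketch only: both function names are made up for illustration, and the partition conversion assumes the count-from-0 numbering of Grub Version 1 described above.

```shell
# Hypothetical helper: convert a grub partition spec such as "(hd0,2)"
# to the Linux partition number (grub counts from 0, Linux from 1).
# The drive part (hd0 versus sda, hde, ...) is NOT mapped, because
# that correspondence is not systematic.
grub_partnum_to_linux() {
    n=${1##*,}      # "(hd0,2)" -> "2)"
    n=${n%\)}       # "2)"      -> "2"
    echo $((n + 1))
}

# Hypothetical helper: build the root=UUID=... kernel argument for a
# partition, using blkid to look up the filesystem UUID.
root_uuid_arg() {
    uuid=$(blkid -s UUID -o value "$1") || return 1
    printf 'root=UUID=%s\n' "$uuid"
}

# Examples:
#   grub_partnum_to_linux '(hd0,2)'   # prints 3
#   root_uuid_arg /dev/sda6           # usually needs superuser privileges
```

Writing the resulting root=UUID=... string down beforehand saves some of the typing at the grub> prompt.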