Do not Panic! Remote Server (Hetzner) not rebooting any more – A Solution

I went through this experience recently. First of all, don’t panic! I panicked, and because of this, I made a mistake: I didn’t wait long enough for it to come online. Had I waited up to 60 minutes, it would probably have come online (see reason below). The story:

I had broken packages on my Ubuntu 10.04 server and decided to fix them by

apt-get update
apt-get upgrade
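If you want to know in advance which packages such an upgrade would touch, apt-get can simulate it with the -s option. The sketch below filters that simulated output for packages that matter at boot time. The function name boot_critical_upgrades and the list of package prefixes are my own assumptions, not something from the original procedure, so adjust them to your system.

```shell
# Read the output of `apt-get -s upgrade` on stdin and print the names of
# boot-critical packages that would be installed or upgraded.
# The package prefixes below are an illustrative guess, not a complete list.
boot_critical_upgrades() {
    awk '/^Inst (grub|linux-image|mdadm|initramfs-tools)/ { print $2 }'
}

# Usage (simulation only, nothing is changed on the system):
#   apt-get -s upgrade | boot_critical_upgrades
```

If this prints anything, it may be worth scheduling the reboot for a time when some downtime is acceptable.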

While updating, I noticed that the package grub-pc was also upgraded; apparently a new bootloader (or bootloader configuration) was installed. This made me feel uncomfortable, since I didn’t know whether the server would still reboot after this upgrade. So, because of the saying “The devil you know is better than the devil you don’t know” (and a desire to sleep peacefully at night!), I decided to reboot the server and see what would happen. To my great dismay, it did not come online. SSH connections failed with "Port 22: Connection refused".

I panicked and asked Hetzner Support (which is very responsive and supportive btw!) to install a LARA Remote Console so that I could see the text output of the boot screen. After some regular startup text, the screen went blank. I panicked even more and took immediate steps to move all data to a new server. It took me 6 hours to complete the most important parts, and another full week of restless work to finish it. We will see that it was not necessary.

Most regular servers at the Hetzner datacenter are running Software RAID. It seems that after a reboot (especially if you send a Hardware Reset) the OS needs some time to re-sync the array or check the file system. I am not sure what caused the delay in my case. Re-syncing the entire RAID array can take 1-2 hours, depending on your hardware and disk space.
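If you can still get a shell (for example through the rescue system or the remote console), the kernel's /proc/mdstat file shows whether a software RAID array is currently resyncing and how far along it is. Here is a minimal sketch, assuming Linux md software RAID. The function name md_sync_status is made up for illustration.

```shell
# Print the progress lines of any md array that is resyncing, recovering,
# or being checked. Reads /proc/mdstat by default; another file can be
# passed as the first argument (handy for trying the function out).
md_sync_status() {
    mdstat_file=${1:-/proc/mdstat}
    if grep -E '(resync|recovery|check) *=' "$mdstat_file"; then
        return 0
    fi
    echo "no resync in progress"
    return 1
}

# Usage:
#   md_sync_status
```

A progress line typically contains a percentage and an estimated finish time, which gives you a rough idea of how long to wait.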

So, wait at least 1 hour for it to come online, especially after a hardware reset! When you activate Hetzner’s Rescue system (which is very good btw!) it will stay active for a minimum of 1 hour, so your server will be down for 1 hour at least, in any event. So you are not losing much by waiting a bit longer.
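To make the waiting less nerve-racking, you can let a small loop on your own machine do it for you. The following is only a sketch: wait_until is a hypothetical helper name, server.example.com is a placeholder, and the nc options shown in the usage comment (-z and -w) are an assumption about the netcat variant installed locally.

```shell
# Run a probe command repeatedly until it succeeds or the timeout expires.
#   $1 = timeout in seconds, $2 = pause between attempts in seconds,
#   remaining arguments = the probe command itself.
wait_until() {
    timeout_s=$1
    interval_s=$2
    shift 2
    waited=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep "$interval_s"
        waited=$((waited + interval_s))
    done
    return 0
}

# Usage: probe the SSH port every 30 seconds for up to 60 minutes.
#   wait_until 3600 30 nc -z -w 5 server.example.com 22 && echo "SSH is back"
```

Only reach for the hardware reset or the rescue system once a loop like this has given up.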

Now, in my case, I assumed that Grub2 was broken. So I activated the Hetzner Rescue System, booted into it, and reinstalled Grub2. I found the following method here and it worked for me. First you have to mount the regular RAID filesystem under /mnt:

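# Root and /boot arrays (device names from my setup; yours may differ)
mount /dev/md2 /mnt
mount /dev/md1 /mnt/boot
# Bind-mount the pseudo-filesystems the chroot needs
mount -o bind /dev /mnt/dev
mount -o bind /proc /mnt/proc
mount -o bind /sys /mnt/sys
# Switch into the installed system
chroot /mnt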
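Since typing mount commands under stress invites typos, the mount steps can be wrapped in a small script with a dry-run mode that only prints what it would do. This is a sketch under assumptions: the function names run, prepare_chroot and release_chroot are invented for illustration, and the device names must be replaced with the ones from your own system.

```shell
# Execute a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount the installed system under a target directory and bind-mount the
# pseudo-filesystems a chroot needs.
#   $1 = root device, $2 = boot device, $3 = target (default /mnt)
prepare_chroot() {
    root_dev=$1
    boot_dev=$2
    target=${3:-/mnt}
    run mount "$root_dev" "$target" || return 1
    run mount "$boot_dev" "$target/boot" || return 1
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$target/$fs" || return 1
    done
}

# Undo the mounts in reverse order after you have left the chroot.
release_chroot() {
    target=${1:-/mnt}
    for sub in sys proc dev boot; do
        run umount "$target/$sub" || return 1
    done
    run umount "$target"
}

# Usage:
#   DRY_RUN=1; prepare_chroot /dev/md2 /dev/md1 /mnt   # preview only
#   DRY_RUN=0; prepare_chroot /dev/md2 /dev/md1 /mnt && chroot /mnt
#   release_chroot /mnt                                 # after "exit"
```

Running it once with DRY_RUN=1 lets you check the device names before anything is actually mounted.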

At this point, you are in your regular root directory. To reinstall Grub2 the Debian Way, I did:

apt-get install --reinstall grub-pc

To make really sure, I reconfigured the package:

dpkg-reconfigure grub-pc

It will ask you where to install the bootloader. I selected:

[*] /dev/sda
[*] /dev/sdb

No errors were reported. I rebooted again and it did not come online immediately, for the reasons previously mentioned. I waited long enough (in my case, 15 minutes) and it did come online. So, rule number one is: Don’t panic!
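Before a reboot like the one above, a rough check that GRUB's boot code really landed on both disks can calm the nerves. The sketch below is only a heuristic: it assumes a BIOS/MBR setup (as used by grub-pc) and simply looks for the string "GRUB" in the first 512-byte sector. The function name has_grub_in_mbr is hypothetical, and a match does not prove that the system will boot.

```shell
# Return 0 if the first sector of the given disk (or image file) contains
# the string "GRUB", which GRUB's boot code embeds; return non-zero otherwise.
has_grub_in_mbr() {
    dd if="$1" bs=512 count=1 2>/dev/null | LC_ALL=C grep -a -q 'GRUB'
}

# Usage (as root, from the rescue system or inside the chroot):
#   for disk in /dev/sda /dev/sdb; do
#       if has_grub_in_mbr "$disk"; then
#           echo "$disk: GRUB found"
#       else
#           echo "$disk: GRUB NOT found"
#       fi
#   done
```

With Software RAID 1 it matters that both disks pass, because the machine may boot from either one.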

Thanks,
You provided more details than Hetzner’s wiki, which is sometimes not detailed at all, so it doesn’t help much.
BTW, I used this method to reinstall Grub on CentOS 6 from Hetzner’s Debian rescue mode, and it worked.

WOOHOO!! Thanks for this!
I had a server with Ubuntu + Software RAID 1. Drive sda died and was replaced, but the machine wouldn’t boot. Host temporarily set it to boot from sdb, which worked. After resync, I just did the steps starting at:
dpkg-reconfigure grub-pc
Picked both the new sda + sdb for GRUB to install on, and rebooted. DONE! 🙂

Thanks, much appreciated. My hard drive died and I didn’t eject it from the system in time (Hetzner support was damn quick to replace it).

I booted the rescue system and partitioned the hard drive as explained in the Hetzner wiki, but wasn’t sure about the whole chroot part at the end. I followed your chroot instructions and am currently resyncing the drives. Fingers crossed it works.

If anyone reads this: when your drive dies, make sure to eject it from the system before calling support. It saves you a lot of headache, as you won’t need to fiddle around with the rescue system etc.

Totally saved my bacon on this one. The Hetzner Wiki is not clear on this point, as it makes it sound like the rescue system goes away when you reboot. Also, there is the added issue of having to sync the raid drives if you made the mistake of using a hardware reboot. Always try the soft reboot first!
