Recover from disk failure

I was having problems on an older computer, with some hints that it might be a disk failure. The only way to be sure was to replace the hard drive and see if that corrected the problems. The old drive was a WD 320G SATA 1 drive. I purchased a WD 750G SATA 3 drive as a replacement. I looked at smaller drives on Amazon, but some of them appeared to be refurbished, so I went with a clearly new 750G drive.

Replacing the drive

The computer itself is a Dell Dimension C521, purchased in 2007. Replacing the drive turned out to be relatively easy. No tools were required. I had to pull on a latch to take off the panel. Next, pressing a lever, I could lift out the DVD drive and unplug its cables. Then I unplugged the connector on the SD card reader, pressed the same lever, and slid the reader back so that I could lift it out. And then I could slide the disk drive back enough to lift it out, disconnecting the cables as I did so. I reversed those steps to install the new disk drive and reinstall the other components.

I was now ready to test. The new hard drive was unpartitioned, with no software installed.

Booting up the system

I first had to connect the various cables (monitor, keyboard, mouse, sound amplifier, ethernet cable). Next I inserted a USB stick containing the 64-bit live Rescue system for opensuse 12.3, so that I would have something to boot.

I powered the system up, and hit F2. That got me into the BIOS settings, which I wanted to review. They showed that the new hard drive was properly recognized. So I proceeded to boot the computer.

As expected, the computer booted from the USB. The BIOS should have determined that there was no operating system on the disk and no CD/DVD in the DVD reader, so the next possibility was booting from a USB device. This computer did not have a floppy drive.

Looking around with the live rescue system, things looked good. The “fdisk” command recognized the disk (and recognized that it was not partitioned). I had an IP address on the home network. There were no obvious errors.

Restore from backup

The next step was to restore Windows Vista. I inserted my Acronis 2013 recovery CD. I located the external drive where I had last imaged the system.

I then told the rescue system to reboot. As soon as I saw the BIOS prompt, I unplugged the rescue USB, and plugged in the external drive with the image. The computer seemed to hang at that point. That might be a problem. Or it might be that I confused the BIOS by plugging/unplugging USB devices while it was checking them — I have had that happen before on this box. So I used CTRL-ALT-DEL to reboot. And, this time, the Acronis recovery media came up.

I told Acronis to recover. I pointed to the stored image. Since that image was encrypted, I supplied the encryption key. Acronis seemed happy and was ready to proceed.

At this point, I was a little confused because I had not previously done a restore to an unpartitioned disk. I went into Acronis tools and utilities, and created a partition for future use as “/boot”. I then returned to the Acronis restore screen. It still remembered the encryption key. I asked it to restore the Vista partition. It told me that it wanted to put it just after the “/boot” that I had created. And it wanted to set the partition size to 40G, which was what I had been using. I increased that to 60G, since I had plenty of space on that disk. The restore seemed to go well. Next, I restored a DATA partition that I had been using. It defaulted to putting that in the first available space, and making it a logical partition (as it had been on the old disk). That restore also went well.

Next, I tried to reboot. The BIOS said “No operating system”. So I reinserted the live rescue USB, hit the F12 key during boot to select booting from the USB, and was soon back running the rescue system.

That showed the problem. Acronis had set the first created partition as the active partition. But that partition was empty. There was probably a way to fix that in Acronis, but it was easy enough to fix with “fdisk”. For good measure, I re-installed the MBR boot code (saved as “/boot/backup_mbr” on another box). I restored with:

# dd if=backup_mbr of=/dev/sda bs=440 count=1

which wrote just the first 440 bytes from the backup_mbr file, leaving the partition table later in that sector untouched.
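The reason for bs=440 is the layout of the MBR sector: the first 440 bytes hold boot code, while the disk signature and the partition table sit later in the same 512-byte sector. Here is a minimal sketch that repeats the command on a scratch file instead of a real disk, to show that only those 440 bytes change. The files disk.img and backup_mbr in a temporary directory are stand-ins, and conv=notrunc is added only because the target here is a regular file.

```shell
#!/bin/sh
# Sketch only: everything happens on scratch files, never on /dev/sda.
set -eu

work=$(mktemp -d)
disk="$work/disk.img"       # stand-in for the first sector of the disk
backup="$work/backup_mbr"   # stand-in for the saved MBR

# Fake disk sector: 512 bytes of 0xff, so untouched bytes are easy to spot.
dd if=/dev/zero bs=512 count=1 2>/dev/null | LC_ALL=C tr '\000' '\377' > "$disk"
# Fake backup: 512 bytes of 0x00.
dd if=/dev/zero of="$backup" bs=512 count=1 2>/dev/null

# Same command as above, plus conv=notrunc so the scratch file keeps its size.
dd if="$backup" of="$disk" bs=440 count=1 conv=notrunc 2>/dev/null

# Show the 8 bytes that straddle offset 440: four overwritten, four untouched.
boundary=$(od -An -tx1 -j436 -N8 "$disk" | tr -d ' \n')
echo "bytes 436-443: $boundary"
```

The four bytes before offset 440 come from the backup, and the four bytes after it keep their old contents.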

Another reboot, and the system booted into Windows Vista, which seemed to be in good shape except that I needed to catch up with Windows updates since the backup was taken.
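Looking back at the "No operating system" failure, the active flag is a single byte at the start of each primary partition entry in the MBR. Here is a minimal sketch of how that byte could be inspected without fdisk. The function name mbr_status_bytes and the scratch file are made up for illustration; on a real system the input would be a copy of the first sector of the disk.

```shell
#!/bin/sh
# Sketch: report the status byte of each primary partition entry in an MBR.
set -eu

# $1: a file holding the first 512 bytes of a disk. On a real system that
# could be made with: dd if=/dev/sda of=mbr.bin bs=512 count=1
mbr_status_bytes() {
    n=1
    while [ "$n" -le 4 ]; do
        # The partition table starts at offset 446; each entry is 16 bytes
        # and its first byte is 80 (hex) when the entry is marked active.
        offset=$((446 + 16 * (n - 1)))
        status=$(od -An -tx1 -j"$offset" -N1 "$1" | tr -d ' \n')
        echo "partition $n: status $status"
        n=$((n + 1))
    done
}

# Demonstration on a scratch sector in which entry 2 is marked active.
mbr=$(mktemp)
dd if=/dev/zero of="$mbr" bs=512 count=1 2>/dev/null
printf '\200' | dd of="$mbr" bs=1 seek=462 conv=notrunc 2>/dev/null
mbr_status_bytes "$mbr"
```

In my case the flag was on the empty first partition rather than on the Vista partition, which is what fdisk let me change.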

Reinstalling opensuse

I did not have a full backup of opensuse. My choice had been to reinstall in case of disaster. I did have a full backup of the “/home” file system.

I first booted the live rescue USB, and there I created the partitions I wanted for opensuse. I had already created the “/boot” partition (at 500M). I created an additional 500M partition for a second “/boot” (for testing), and I set up an encrypted LVM for the install. The encrypted LVM had two root volumes, one for the main system and one for testing. I also gave it a swap volume and a home volume.
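The exact commands are not recorded here. As a rough sketch, one way to build such a layout with cryptsetup and the LVM tools might look like the following. The device name /dev/sdXN, the volume group name system, the mapper name cr_lvm, the volume names, and every size are placeholders chosen for illustration, not the values used on this box. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: print (and optionally run) the commands for an encrypted LVM layout.
# Device name, volume group name, and all sizes are made-up placeholders.
set -eu

DRY_RUN=${DRY_RUN:-1}   # default: only print the commands

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = 0 ]; then "$@"; fi
}

setup_encrypted_lvm() {
    part=$1    # partition to encrypt, e.g. /dev/sdXN
    vg=$2      # volume group name
    run cryptsetup luksFormat "$part"
    run cryptsetup luksOpen "$part" cr_lvm
    run pvcreate /dev/mapper/cr_lvm
    run vgcreate "$vg" /dev/mapper/cr_lvm
    run lvcreate -L 20G -n root "$vg"        # main system
    run lvcreate -L 20G -n root_test "$vg"   # test system
    run lvcreate -L 4G -n swap "$vg"
    run lvcreate -l 100%FREE -n home "$vg"   # whatever space is left
}

setup_encrypted_lvm /dev/sdXN system
```

Printing first makes it possible to review the plan before anything touches the disk, since luksFormat destroys whatever is on the partition.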

Next, I booted the 13.1 install media (a USB with the 64-bit DVD image). The install was reasonably straightforward. I went into the expert partitioner mode to specify use of the encrypted LVM volumes that I had already set up.

Once 13.1 was properly set up and running, I restored my “/home” from the backup. I had used “dar” for the backup, and “dar” was not on the live rescue system, so I had to wait until I had completed the install before I could restore.

I restored “/home” to a directory that I created as “/home/x”. And then I moved most of what had been recovered back into “/home”. I did it that way to avoid overwriting the home directory of the user that I had logged in as. The alternative would have been to log in as root for the restore. I restored while running “Icewm” as user “support” (the user I set up for administrative use).
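The staging approach can be sketched as follows, run here on a scratch directory rather than the real “/home”. The user name alice is made up, the archive path in the comment is a placeholder, and the sketch assumes the backup was rooted at “/home” so that each user's directory lands directly under the staging directory.

```shell
#!/bin/sh
# Sketch: move restored home directories out of a staging directory,
# skipping any name that already exists in the target.
set -eu

home=$(mktemp -d)   # scratch stand-in for /home
mkdir -p "$home/support" "$home/x/alice" "$home/x/support"

# The real restore into the staging directory would be something like:
#   dar -x /path/to/backup_basename -R /home/x

for dir in "$home"/x/*; do
    name=$(basename "$dir")
    if [ -e "$home/$name" ]; then
        # Do not clobber a directory that is already in use.
        echo "left in staging: $name"
    else
        mv "$dir" "$home/$name"
        echo "moved: $name"
    fi
done
```

Anything left in staging, such as the directory of the logged-in user, can then be merged by hand.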

Once “/home” was restored, I had the saved settings that I normally keep there. That made it easy to do my usual post-install system tweaks.

One remaining step was to install the Nvidia driver. I installed that the hard way. While the nouveau driver works on this box, it does not work very well.

Final notes

The recovery went about as expected. The system is now up and running well. The new hard drive is SATA 3, but running at SATA 1 speeds.
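One way to check the negotiated link speed is to look at the kernel messages, where SATA 1 shows up as 1.5 Gbps. Here is a minimal sketch, assuming the kernel logs its usual "SATA link up" line. The function name is made up, and the sample line is illustrative only, not captured from this machine.

```shell
#!/bin/sh
# Sketch: pick out the negotiated SATA link speed from kernel messages.
set -eu

# Reads kernel log text on stdin and keeps only the link-up lines.
sata_link_lines() {
    grep -i 'SATA link up' || true
}

# Typical use on a running system (may need root):
#   dmesg | sata_link_lines
# Illustrative input only, not captured from this machine:
printf '%s\n' 'ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)' | sata_link_lines
```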

I have since tested the old drive further. I installed it in an external disk enclosure. When I first powered it on, the disk failed to spin up. It took about four tries for it to work. And then it seemed good for a while. Later, I began to hear many clicks and to see slow response, indicating that it was having trouble finding and reading sectors and having to retry. These problems looked as if they could be responsible for the symptoms that I had previously been seeing on that computer.