My hard drive has some bad sectors, and they obviously are affecting system performance: I haven't found any of my valued personal files to be corrupted or lost, but the distro installation can't shut down properly and reinstallation doesn't solve the problem. (Can't believe it took me at least a week to realize the bad sectors were the likely reason.) Is there any way to isolate the bad sectors and make sure the drive never uses them? (No, I can't run out and buy a new hard drive.)

Guttorm

02-26-2013 10:28 AM

Hi

You could use fsck.ext2/3/4 with -c twice. I think you'll need to use a live CD so it can be run while unmounted. And it will take a long time if the disk is big.

Quote:

-c This option causes e2fsck to use badblocks(8) program to do a read-only scan of the device in order to find any bad blocks. If any bad blocks are
found, they are added to the bad block inode to prevent them from being allocated to a file or directory. If this option is specified twice, then
the bad block scan will be done using a non-destructive read-write test.
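
For anyone who wants to see what that looks like as an actual command line, here is a minimal Python sketch that only assembles and prints the command; it does not run anything. The device name `/dev/sdXN` is a placeholder, the helper name `build_e2fsck_command` is made up for illustration, and the `-f` (force a check) and `-k` (keep the existing bad-block list) flags are my own additions to the `-c -c` advice above.

```python
import shlex


def build_e2fsck_command(device, read_write=True, keep_existing=True):
    """Assemble (but do not run) an e2fsck bad-block scan command.

    `device` is a placeholder for an UNMOUNTED ext2/3/4 partition.
    """
    cmd = ["e2fsck", "-f"]   # -f: force a check even if the filesystem looks clean
    cmd.append("-c")         # -c: read-only scan via badblocks(8)
    if read_write:
        cmd.append("-c")     # second -c: non-destructive read-write test
    if keep_existing:
        cmd.append("-k")     # -k: keep blocks already in the bad-block list
    cmd.append(device)
    return cmd


if __name__ == "__main__":
    # Print the command so it can be reviewed before being run by hand
    # (as root, from a live CD, with the partition unmounted).
    print(shlex.join(build_e2fsck_command("/dev/sdXN")))
```

Printing rather than executing is deliberate: the scan can take hours on a big disk, so it is worth double-checking the device name first.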

newbiesforever

02-26-2013 10:48 AM

Quote:

Originally Posted by Guttorm
(Post 4900153)

Hi

You could use fsck.ext2/3/4 with -c twice. I think you'll need to use a live CD so it can be run while unmounted. And it will take a long time if the disk is big.

Thank you. Some of the operating system appears to be written on the bad sectors already; if so, will adding the sectors to the bad-block inode make the system useless and necessitate reinstalling?

Guttorm

02-26-2013 11:01 AM

Not sure about this, but if a critical OS file uses a bad block, it will most likely crash anyway, either because the file is corrupt or because it simply cannot be read.

The mkfs.ext4 command also has this option, so if you ever have to reinstall, that option should at least prevent the bad blocks from being used in the future.

TobiSGD

02-26-2013 11:10 AM

You may be able to mark the bad sectors so they are not used, but this will not solve your problem. When your OS is already affected by bad sectors, it usually means the disk is already out of spare sectors to replace bad blocks. This means your disk is dying, and even if you can mark the current bad sectors as unusable, more will appear over time, possibly rendering your OS unusable.

I know that you don't want to hear that, but there is no reliable way to bring this disk to a state where you can safely use it again. The disk has to be replaced, but in the meantime you can try to avoid the parts of the disk where most of the bad sectors appear by leaving them unpartitioned. Use the disk as rarely as possible (mount /tmp to RAM, use another disk or Flash drive for /home, ...). Don't trust it at all with important data.

newbiesforever

02-26-2013 11:21 AM

No, I don't mind hearing it. Thank you. I'm not sure offhand how big the hole is (I think around 145 bad sectors, but out of what total, I don't remember), but should I start checking every one of my data files to see if anything's been lost? Only the operating system is obviously affected. Also, is it pointless to back the hard drive up (because I could be backing up many corrupted files)?

TobiSGD

02-26-2013 11:38 AM

Quote:

Originally Posted by newbiesforever
(Post 4900198)

but should I start checking every one of my data files to see if anything's been lost?

If you are able to do that, then yes, you should.

Quote:

Also, is it pointless to back the hard drive up (because I could be backing up many corrupted files)?

More or less pointless, yes, unless you have a reliable way to determine whether the data is corrupted. This is why you make a backup before a disk fails, to prevent data loss in case of failure. It is the same as fastening the seatbelt in your car in case you have an accident, not after you hit the wall.
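
One rough way to get part of the way there is to read every file from start to finish and note which ones the drive refuses to return. The Python sketch below does that and records a SHA-256 digest for each readable file; the names `scan_tree` and `changed_files` are made up for this example. It is only a sketch under a big assumption: a file that reads without error may still have been silently damaged earlier, so the digests are only useful for comparing against a manifest taken at another time (or against a copy on another disk).

```python
import hashlib
import os


def scan_tree(root):
    """Read every regular file under `root`.

    Returns (digests, unreadable): a dict mapping path -> SHA-256 hex digest,
    and a list of (path, error message) for files that raised a read error.
    """
    digests, unreadable = {}, []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files (FIFOs, sockets, devices).
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            h = hashlib.sha256()
            try:
                with open(path, "rb") as f:
                    # Read in 1 MiB chunks so large files do not fill memory.
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        h.update(chunk)
            except OSError as exc:
                # An I/O error here often points at a bad sector under the file.
                unreadable.append((path, str(exc)))
                continue
            digests[path] = h.hexdigest()
    return digests, unreadable


def changed_files(old, new):
    """Paths present in both manifests whose digests differ."""
    return sorted(p for p in old if p in new and old[p] != new[p])
```

Calling `scan_tree` on a data directory and looking at the second return value gives the list of files to worry about first; keeping the first return value around lets a later scan be compared with `changed_files`.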

oskar

02-27-2013 03:29 PM

--edit-- given your last post it seems like this won't help you, but it might be relevant for someone else --edit--

How do you know there are bad sectors?
If there's a nagging "I/O error bad-sector blah blah" on your command line, it could be that your kernel is misbehaving and trying to read from a device that isn't there. Happened to me more than once. Make sure the error message relates to an actual physical drive.

jefro

02-27-2013 07:49 PM

I'd get the OEM's drive diagnostics. They tend to offer a feature that tests the drive fully, in some cases with write/read tests. Keep in mind that the result reflects the whole chain of hardware the drive is connected through, not just the drive itself. For example, you could have a bad drive controller and a perfectly good drive, and it would look as if the drive had issues.