After a power failure my OpenBSD server halts on boot. One of my drives has an "inconsistency" according to the prompt. This is what I get after the file system check is done:
...
CANNOT READ: BLK 28196704
/dev/rwd1i: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
...

You have one or more sectors that cannot be read. In other words, a damaged drive. If there are sufficient spare sectors, the disk will reassign those sectors to the spares, once they are rewritten. But not until they are rewritten.

You can force mount the partition while in this state, and attempt to copy off everything you want saved, to another drive. If you mount it "dirty", you should do so read-only. Please see the mount(8) manual for details.
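For example (a sketch -- the mount point /mnt and the rescue destination are assumptions, and the option spelling is the one mentioned later in this thread; check mount(8) on your release):

```shell
# Force-mount the dirty filesystem read-only, then copy off what you
# can. /mnt and /otherdrive/rescue are assumed paths -- adjust them.
mount -o rdonly,force /dev/wd1i /mnt
cp -Rp /mnt/. /otherdrive/rescue/
```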

If it were my drive, I would take it out of service, then write to, and read back, every sector on the hard drive, to ensure that spare sectors have successfully replaced all the bad sectors, before restoring data and returning it to service. Of course, I have backups of all my drives. You should, too. But you know that....now.

(The "badblocks" program is included in the e2fsprogs package, which I prefer; or you could use dd(1) with /dev/rwd1c.)

Writing to the drive will, of course, destroy what is there, so use dd or badblocks only after recovering what you can.

You do not have other partitions on the drive, only "i". But if you did have other partitions, you should expect you might have bad sectors among them as well.

Kernel messages regarding sector numbers will use the physical sector numbers of the drive. Userland programs, such as fsck_ffs, will report sector numbers within the partition. They will not be identical, unless you are running a program against the "c" partition, which covers the whole drive.

Use raw partitions with dd or badblocks, to maximize performance.
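A dd(1) sketch of the write-then-read pass described above, against the raw "c" partition. This is DESTRUCTIVE -- it erases the whole drive -- and the block size is just a reasonable assumption:

```shell
# Overwrite every sector so the drive remaps unwritable ones to spares...
dd if=/dev/zero of=/dev/rwd1c bs=64k
# ...then read every sector back to confirm the remapping took.
dd if=/dev/rwd1c of=/dev/null bs=64k
```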

You can communicate with the drive electronics (SMART) using atactl(8); I prefer the smartmontools package, as I find it easier to use. This may be able to provide you with a better understanding of the state of your drive. What comes from SMART is up to the drive vendor, some manufacturers produce more information than others.
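With smartmontools installed, a couple of starting points (the raw device name here is my assumption for this drive):

```shell
# Print the full SMART report: health status, attributes, error log.
smartctl -a /dev/rwd1c
# Kick off a short self-test; results appear in the self-test log.
smartctl -t short /dev/rwd1c
```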

This drive is my backup drive. However, I wasn't finished with my custom backup system yet, so I wasn't ready for this, and this drive has unique files.

Hmm, this was tricky. Do I have to mount it successfully once in order to be able to reboot (like marking the file system OK)?
Would it work to just edit fstab to set the drive as read-only and then just reboot to get the server running again?

The other drive is OK and contains all the important system stuff (root, usr, var...).

One last thought -- if you cannot mount successfully with -o force,rdonly due to a "bad superblock" -- the first block of metadata about the filesystem, usually sector #8 within the partition -- the alternate/spare superblock may be usable. This is usually sector #32. See the -b option.
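A sketch of that, assuming the alternate superblock really is at sector 32 on this filesystem:

```shell
# Point fsck_ffs at the alternate superblock instead of the damaged
# primary one; 32 is the usual location, not a guarantee.
fsck_ffs -b 32 /dev/rwd1i
```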

I'd mentioned SMART -- the drive electronics standard -- and smartmontools. On all my systems, I set up daily "short offline" tests to have the electronics do self-tests and test the data bus, and weekly "long offline" (also called "extended offline") tests to have the electronics read every sector.
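With smartmontools, that schedule can be expressed in smartd.conf; a sketch (the device name is an assumption -- adjust to your drive):

```
# /etc/smartd.conf -- sketch; device name assumed.
# S = short self-test daily at 02:00, L = long self-test Saturdays at 03:00.
/dev/wd1c -a -s (S/../.././02|L/../../6/03)
```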

On my RAID systems, it's easy enough to take a problem drive out of service and run badblocks on it, then put it back in service if it is still usable.

On my single-drive systems, what I do will depend on which sectors have failed. And that's a manual process. I would map the drive sectors to partition sectors, then map those to block numbers, then determine if the blocks are unassigned, or if assigned, to which inodes. Not easy, but dumpfs(8) can help. I haven't done this in several years, because my single-drive OBSD systems are now down to a grand total of one, and so far .... no tests have reported any errors.
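A minimal sketch of the first two mapping steps, using the sector number from the kernel message above and ASSUMED example values for the partition offset and fragment size (read your real ones from disklabel(8) and dumpfs(8)):

```shell
# Map a kernel-reported drive sector to a partition-relative sector,
# then to a filesystem block. Offset and fragment size are assumed.
DRIVE_SECTOR=28196704   # sector from the kernel error message
PART_OFFSET=63          # partition start sector (assumed; see disklabel)
SECTOR_SIZE=512         # bytes per sector
FRAG_SIZE=2048          # filesystem fragment size in bytes (assumed)

PART_SECTOR=$((DRIVE_SECTOR - PART_OFFSET))
FS_BLOCK=$((PART_SECTOR * SECTOR_SIZE / FRAG_SIZE))
echo "partition sector $PART_SECTOR -> filesystem block $FS_BLOCK"
```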

Now I have a stupid question, because I was just thinking...this is a single partition drive, and that partition is "i". Is this a foreign filesystem, and not FFS? That could be the root cause of fsck barfing up errors.

If what you describe is accurate -- and that is only what I can determine from your posting here -- your drive's electronics are not operating properly. If your power failure included either a power surge or a period of low voltage (or missing phases of AC), this might be the cause. But your drive's electronics are at least functioning, as the device does respond to I/O requests, sometimes successfully.

As for your question on "mounts" -- Whatever is occurring has nothing to do with a mount. Mounts are merely logical attachments of filesystems to the OS.

---

I do not clearly understand how you ended up with an "i" partition on wd1, but based on your technical background as presented in this thread, I will assume you are using a foreign MBR partition rather than a BSD disklabel on the drive -- the OS will create a virtual disklabel and assign the foreign partitions it recognizes to BSD partition letters beginning with "i". And whether or not this partition was ever actually an FFS filesystem is now immaterial -- your kernel messages alone are proof of hardware problems: an inability to read some sectors.

Since your forced, read-only mount succeeded, the drive was able to return the sectors containing the primary superblock, which begins at sector #8.

When attempting to read the root directory (inode #2, if this is an FFS filesystem), your "ls" command appeared to hang. Understand that as the drive spins, the electronics may attempt to read the same sector repeatedly, in an attempt to extract valid information. Dozens, or hundreds of times. It must wait for a complete rotation of the drive each time it tries, and that is relatively slow. Eventually, kernel messages will be produced, showing timeouts (from retrying reads over and over, and the OS gives up waiting) and read errors (when the electronics on the drive gives up before the OS does). If you issue the "ls" command from the console, you would see these kernel messages appear while you waited.

Unfortunately, with the root inode unreadable, there is not much further you yourself will be able to do to extract useful data from the drive. If the root directory were available, you might be able to extract undamaged files, and traverse other undamaged directories. But it is not.

A skilled technician may be able to copy undamaged sectors from the drive, and reassemble some of the content into meaningful files. But that would be a manual, difficult, and long effort, with no guarantees.

As for your unreadable sectors, some of them might be readable by commercial laboratories that specialize in data recovery from disk drives. This would be many thousands of Dollars or Euros, and of course there are no guarantees, depending upon the underlying physical damage to the media.

---

If you wish to give up on the existing data on the drive, you may start destructive testing, and see if the drive can be returned to useful function. To do that, install e2fsprogs, unmount /backup, and run the badblocks program against the entire drive -- rwd1c or wd1c, I can't remember which badblocks prefers. Use -p 1, so that badblocks continues to run until no new failures are discovered -- that is, until all bad blocks have been successfully replaced with spare sectors -- and -w, so that badblocks writes and tests various bit patterns. See badblocks(8) for details.
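Something like this (a sketch -- destructive, and the raw device is my assumption; try the block device if badblocks complains):

```shell
# Write-mode test (-w) over the whole drive, showing progress (-s),
# repeating until a pass finds no new bad blocks (-p 1).
# THIS DESTROYS ALL DATA ON wd1.
badblocks -s -w -p 1 /dev/rwd1c
```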

It was a thunderstorm that caused the power failure, and the lights blinked for a few seconds like in a horror movie. I could understand if some files were lost due to uneven current, but a complete disk destroyed!

I don't know what to do now. I'll try fiddling with it a bit more and see if I can get it to list some files.

Thank you very much for your help!!! I'll post again later (perhaps in a few days) once I have tried some more.

/Quaxo

Edit: Oh the letter "i". I don't remember but this was done during installation and I didn't make any notes about this particular part. So it's probably as you say, a virtual disklabel. If it helps my fstab looks as follows: