Ext3 filesystem failed

We have three Seagate drives (ST360021A) installed in a RH7.2 system with dual Celeron 533MHz CPUs. The motherboard has two HPT366 UltraDMA/66 controllers.

The ext3 filesystem keeps failing.
The first time, we were using two of the drives for RAID and it failed. We then put the third drive in and copied all the data (some of it missing or unreadable) to that drive.

After about half a month, the filesystem failed again and we lost some data.

Now, this is the third time: with a single drive, the ext3 filesystem has failed again. We have lost important data; when we change (cd) into the affected directory, it says "Input/output error". Please help!
Tell me how to get the data back and what the likely cause is. Thanks
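An "Input/output error" on cd usually means the kernel hit a bad block or a corrupt inode while reading that directory. A first diagnostic step is to check the kernel log and then fsck the unmounted filesystem. The sketch below is hedged: device names are placeholders, and a scratch ext2 image file stands in for the real partition so the commands are safe to try anywhere.

```shell
# Check the kernel log for the underlying error (bad sector, DMA timeout, ...):
#   dmesg | tail
# Then unmount the filesystem and check it; on the real box this would be
# something like `e2fsck -f /dev/hde1` (device name is a placeholder).
# Here a scratch ext2 image stands in for the partition:
dd if=/dev/zero of=fs.img bs=1024 count=1024 2>/dev/null  # 1 MB scratch file
mke2fs -F -q fs.img   # put an ext2 filesystem on it (-F: it's not a device)
e2fsck -f -y fs.img   # force a full check; -y answers yes to every repair
```

Note that `-y` accepts every repair e2fsck proposes; on a disk holding important data, image the partition with dd first so a bad repair attempt can be retried against a fresh copy.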


There is no other way than fsck to repair a corrupted disk, apart from professional data-recovery services.
Well, you can use dd to dump the partition to a file and then search through the dump to see if you can find your data... all 47 GB of it.
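The dd-and-search idea can be sketched like this. On the real system the input would be the raw partition, e.g. `dd if=/dev/hdc1 ... conv=noerror,sync` so read errors are skipped instead of aborting the dump (the device name is a placeholder). Here a scratch file with a planted string stands in, and `IMPORTANT-RECORD` is just an example of text known to be in a lost file:

```shell
# 1. Image the partition. On the real box:
#      dd if=/dev/hdc1 of=/backup/part.img bs=64k conv=noerror,sync
#    For this sketch, build a small scratch image with a known string in it:
dd if=/dev/zero of=disk.img bs=1024 count=512 2>/dev/null
printf 'IMPORTANT-RECORD' | dd of=disk.img bs=1 seek=200000 conv=notrunc 2>/dev/null

# 2. Search the image for text you know was in a lost file.
#    -a treats the binary image as text, -b -o prints the match's byte offset.
grep -a -b -o 'IMPORTANT-RECORD' disk.img   # prints 200000:IMPORTANT-RECORD

# 3. Carve the data back out at that offset (here: the 16 matched bytes).
dd if=disk.img of=recovered.bin bs=1 skip=200000 count=16 2>/dev/null
cat recovered.bin   # -> IMPORTANT-RECORD
```

This only recovers file contents, not names or directory structure, but for a filesystem too damaged for fsck it is often the last resort short of a recovery service.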



Have all of the failures involved a particular drive? If so, that might point to a problem with that drive, its cable, or its controller. Also keep in mind that failing hardware (motherboard, one or both CPUs, or memory) can write bad data to the drive, corrupting the filesystem it was writing to.

FYI, I had a similar problem with a racked server last winter. The problem turned out to be a marginal cooling fan that only became a problem in the coldest weather when the AC system essentially shut down due to low outside temperatures. The system would begin to run a bit on the hot side and eventually crash, writing garbage to some of its mounted file systems in the process. Since the coldest time was in the wee hours of the morning no one was there to see the beginning of the failure and the next morning we'd be presented (some of the time) with a corrupt file system and data loss. Regular (daily) backups are a wonderful thing...

So, is there no other way to recover the data except hiring a data-recovery company to do it?

I also know the importance of backups. However, because I used the backup drive to replace the first failed RAID drive, and I don't want to format the failed drives, there are no drives left in our company to do backups with. Too bad.

In fact, when I first installed the RAID drives, they may have been placed too close together, overheated, and failed. However, I checked the drives with the Seagate diagnostic tool and it found nothing wrong with them.

jlevie, what OS was the failed system running? Was it Linux too? Is ext3 or ext2 not as good as FAT32 or NTFS? It seems like nobody has these problems with NTFS or FAT32.

The server that I was referring to was a w2k box, but it really doesn't matter what OS is being used when you have that kind of hardware problem. If the OS writes garbage to the disk, you wind up with a corrupt file system (be it FAT32, NTFS, UFS, XFS, ext2, etc). The solution, of course, is to eliminate the hardware problem.

On the subject of backups... Reliable backups of a server require a good schedule and history. By that I mean that one does a full backup at some scheduled interval and incrementals on all days between full backups. History is important, and I wouldn't want less than two complete cycles (full & incrementals) on hand at all times.

Unless you have some mega RAID volume on another system, this sort of backup strategy requires backup to tapes. Tape backup systems are expensive, but well worth the cost when something like this happens. If you can't afford a high-capacity tape drive and must back up to disk, be darn sure to make the backup repository reside on a different system, and make sure that you have enough disk on that box for at least two full cycles plus the current cycle.

A corrupt or missing file might not be noticed immediately, and if the history in your backups doesn't go back far enough you won't be able to recover. FWIW, on the servers that I manage I keep at least 6 months of backup history (monthly full & daily incrementals). On the really, really important servers I have a year's worth of history. These are all backed up to DLT tapes, with the tapes stored in a fire-resistant safe and bi-annual full backups stored off site.
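The full-plus-incremental cycle described above can be sketched with GNU tar's `--listed-incremental` (`-g`) snapshot files. The paths here (`data`, `backup`) are placeholders for the real data and backup trees:

```shell
# Day 0: full backup. The snapshot file records what was saved and when.
mkdir -p data backup
echo 'day one' > data/report.txt
tar -cf backup/full.tar -g backup/snapshot data

# Day 1: only new or changed files go into the incremental archive.
echo 'day two' > data/notes.txt
tar -cf backup/incr1.tar -g backup/snapshot data

# Restore = extract the full backup, then each incremental in order.
# (-g /dev/null tells tar these are incremental archives but no
#  snapshot state needs updating during extraction.)
#   tar -xf backup/full.tar -g /dev/null
#   tar -xf backup/incr1.tar -g /dev/null
```

Keeping at least two complete cycles means keeping two generations of the full archive plus their incrementals, so a corruption that went unnoticed for a while can still be rolled back past.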


