After a while I'm here again, this time with an urgent plea for help. Lately the following problem has started appearing on some of my computers, and on my friends' computers too:

After a long idle period (usually overnight) the system crashes with several "Delayed write failed..." error messages. The files involved are usually $Mft and C:\Windows\system32\config, but random others as well.

When this happens, USB and the network card always stop working. Sometimes it's possible to perform a standard restart, sometimes a hard reset is necessary. The disk can be repaired using chkdsk with relatively little harm.

The computers are quite different, hardware-wise and software-wise; one of them even has an SSD. Needless to say the disks are in a perfect state. I even replaced two of them, but to no avail. This happened last night on three computers again. Some are on the same LAN, one is at a completely separate location.

I am now pretty sure it is not a hardware failure. As for the software, the only common things are that they all use Windows XP SP3 (but different language versions), and they all use BitDefender Free Edition. There are no other substantial similarities.

I spent a lot of time reading different forums, but the only outcome of this was that I'm not alone. No solution whatsoever, though. I read about some virus, but could that be true? I used MalwareBytes and Kaspersky's TDSSkiller for additional scanning, but they found nothing. I also heard rumors that MS put something in their latest updates to force XP users to upgrade, but I'm very skeptical regarding such conspiracy theories.

Take a look at the event viewer. Although not as extensive with information as modern versions of Windows, even the one in Windows XP should be able to give you a hint of where to look for the error.

Do these errors occur at the same time? Are the computers you mention in close proximity (for example: same housing block)? Then I would try a voltage stabilizer or UPS or high quality PC power supply on at least one of the computers and check if that makes a difference. If so, you are getting bad power from the grid, which could cause the hard disk controller on your MoBo to act up, giving you these kinds of errors. That could also happen when the hardware you are using is getting old or got too hot too many times. This can even create bad blocks on brand new hard disks almost directly after these are connected and used for the first time.

I learned a long time ago that software like MHDD is much more reliable in telling you the true state of your hard disk, more so than anything Windows can offer. Windows also keeps telling you there are no problems when there clearly are. MHDD is dangerous software to use and should only be used by people who really know what they are doing and don't mind the spartan interface.

[story-time] Where I live in Paraguay electricity can be very problematic. Now I have a Windows PC that I use as a server and it automatically reboots after a power failure. At some point in time Windows wouldn't boot at all anymore because it couldn't read/write the registry files. After restoring working copies of these registry files by hand in the recovery console several times I got tired of it. After restoration Windows would boot, and chkdsk would fix whatever else was wrong and the system remained working until the next power failure. Mind you, there was no problem rebooting the server if it was stopped and started the normal (Windows) way.

Now I divide my hard disks into partitions. At least 3 but I prefer 4. In other threads in this forum I explained why (in a very opinionated way), so search for those posts if you want to know, I won't bore you with this here and now. The 1st partition only contains the Windows installation (no user data, temp files or page file) and in this way the first partition was only 5 GByte in size and still had over 20% of free space.

To solve my dilemma I shrank the second partition by 5 GByte and moved the 1st partition into the "liberated" space on the hard disk (MiniTool partition software is freely available and works very well). Now the PC starts up without a hitch after a power failure as well as after the normal reboot procedure. Point is that Windows/chkdsk deemed the first 5 GByte of the hard disk to be good, while it clearly wasn't.

Problems with the flaky power grid also became a lot less frequent after I installed a better class of power supplies in all of my PCs (80% efficiency Gold rated), much more so than by using UPSes. I have 7 or 8, all of them with fried electronics (some even came from the USA), so after a while I stopped repairing them. Too much hassle for hardly any gain (in my local situation; your location cannot be as bad as here, so use a UPS!). [/story-time]

$MFT is the Master File Table where the NTFS file system keeps track of the files on the hard disk/partition. The other one is a part of the Windows registry that isn't written correctly. As mentioned earlier, there are ways to restore these registry files, but that will require working with the recovery console. Sorry, too tired to google it for you...but you should be able to find these instructions.

Changing anti-virus software might help as well. Also, check in the Windows Device Manager if your network card and USB controller have an option enabled that allows Windows to turn these off to conserve power. If that is the case, turn this off and see if there is a difference. If so, your OS was too eager to conserve power.

Those are the first things that sprang to mind after reading your post. If the above works out for you...great! If not...back to the drawing board then.

Of course I already looked into the event viewer. The only suspect messages I found there, related to HD, were repeated warnings from PerfDisk: Unable to read the disk performance information from the system (Event ID: 2001). A series of these is followed by assorted crashes of random apps and disk-related errors.

These errors occur solely after some hours of inactivity. So far they never appeared while working on the machines. Even the machine left idle for a couple of days doesn't crash every night, though. So far I was unable to track down any pattern.

As this happened just recently, and as the symptoms are always the same (I mean the same files, disabling USB and network interface), I don't believe it is hardware related. On one of the affected PCs I successively replaced the motherboard, power supply, cables, RAM, and finally the hard disk. So, just the CPU remained, but the results are the same. Hard to believe that the CPU causes this, and on other machines almost simultaneously, too.

As for the system settings, I've experimented with tweaking things like LargeSystemCache values, as suggested throughout some forums, but without any change whatsoever.

Well, this is going to take some time. At the moment I'm experimenting with disabling automatic updates. So far, so good, but it's too early. Next, the substitution of the antivirus will follow. I'll post the result after a few days of experimenting here.

If not already done, then I'd suggest you consider getting a free trial version (if available) of HD Sentinel installed - refer Hard Disk Sentinel PRO - Mini-Review. This will confirm the detailed health status of the disk(s) involved. (You need more facts one way or the other.) Also check the Write-caching policy settings for the disk(s) involved. Have you had any power fluctuations?

I'm just making a wild guess. But if your disk monitor program cannot gather performance info and delayed writes are failing I wonder if it could be that the disks are asleep?

Check if all HDs are in Performance Mode with sleep/spin down settings at Never. Do not allow network cards to shut down to save energy. I've never been much of a fan of sleep energy saving settings. To save energy I shut the system off overnight. I realize in a corporate environment sometimes you gadda' do what you gadda' do though.

@IainB: I just can't believe that exactly the same error would appear on seven disks in the span of two or three weeks. I replaced two of them, one is brand new. Exactly the same behavior. All the disks were thoroughly tortured on other machines. No errors, top performance. This must be something in software, I swear. As for the power, all the machines have a UPS, and none of them reported any failures recently.

@MilesAhead: This was one of the first things I verified, even though I'd never use the sleep feature on any device, let alone on WinXP. None of the disks or network cards is set this way. (I don't even use a screen saver.)

Just for chuckles I would run some task every hour during the otherwise inactive time. See if it prevents the glitch. Also I would run chkdsk /f on all disks. If they come up clean, and if you can spare the downtime, run chkdsk /r on one to see what it shows.
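A minimal sketch of such an hourly task, assuming a Python interpreter is available on the machine (Python is just my pick for illustration, nothing in this thread depends on it). The function names and the log path are hypothetical. Each beat appends a timestamp and forces it to disk, so after a crash the last line in the log shows roughly when the disk stopped accepting writes.

```python
import os
import time
from datetime import datetime


def write_heartbeat(path):
    """Append one timestamp line to the log and force it out to the disk."""
    stamp = datetime.now().isoformat()
    with open(path, "a") as fh:
        fh.write(stamp + "\n")
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to flush its write cache for this file
    return stamp


def run_heartbeat(path, interval_seconds=3600, beats=None):
    """Write a heartbeat every interval; beats=None keeps going until interrupted."""
    count = 0
    while beats is None or count < beats:
        write_heartbeat(path)
        count += 1
        if beats is None or count < beats:
            time.sleep(interval_seconds)
    return count
```

Calling `run_heartbeat(r"C:\heartbeat.log")` would write one line per hour until the script is stopped; the path is only an example. If the crashes stop while this is running, idle time is a likely factor; if they continue, the log at least narrows down the time of failure.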

Well, thanks for your support. Of course, I've run chkdsk /f many times. The disks were also verified by DRevitalize. After the crash, some minor errors usually appear, but otherwise the disks are in excellent shape.

So far I tried disabling the automatic system updates, but to no avail. The machine crashed the same way. So this is out of the game now.

I also tried to replace Bitdefender with a trial version of Malwarebytes, but it crashed again, though in a different way: it performed a reboot. This all takes so much time... I will continue anyway.

Besides, I just realized another similarity in all the involved systems: they all use LogMeIn. But I suspect this could be a culprit.

@yksyks: Well, I'm stumped. I can only think of a few questions, and I suspect they have already been covered by your good self. Reading through this, and to summarise and make a few assumptions (so please correct me if I am wrong), it seems that:

1. The error symptoms are:

After a long period of idle (usually overnight) the system crashes with several error messages "Delayed write failed..." The files involved are usually $Mft and C:\Windows\system32\config, but may include seemingly random others as well.

In all instances where this happens, the USB ports and the network card stop working.

Replacement and stress testing of the devices affected indicates no evidence of actual hardware errors.

___________________________________
Question: Where any affected hardware has been removed and found to be OK, has it been subsequently reinstalled to the same computer and had its performance monitored/observed, and has it subsequently been affected by the same or a different error?

2. Common/differentiating factors in the population of computers affected:

All of the affected computers use Windows XP SP3 (but with differing language versions), and they all use BitDefender Free Edition. There are no other substantial/significant similarities.

The affected computers do not share the same AC power supply. (Is this true?)

The affected computers do not share the same DC power supply. (Is this true?)

LogMeIn may/may not be a common factor for the affected computers. (Needs verification.)
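One way to keep track of the common factors in point 2 is to write down what is installed on each machine and intersect the lists. A rough sketch in Python follows; the function name is made up, and the inventories in the example are placeholders, not the actual configurations of the machines discussed here.

```python
def common_factors(affected, unaffected=()):
    """Return items present on every affected machine and on no unaffected one.

    Both arguments are iterables of per-machine inventories (any iterable of
    strings, e.g. installed software names).
    """
    affected_sets = [set(items) for items in affected]
    if not affected_sets:
        return set()
    shared = set.intersection(*affected_sets)
    for items in unaffected:
        shared -= set(items)  # anything a healthy machine also has is less suspicious
    return shared


# Hypothetical inventories, for illustration only.
affected_machines = [
    {"os-x", "tool-a", "tool-b", "browser-1"},
    {"os-x", "tool-a", "tool-b"},
]
healthy_machines = [
    {"os-x", "tool-b", "tool-c"},
]
suspects = common_factors(affected_machines, healthy_machines)
```

With these placeholder lists `suspects` ends up containing only `tool-a`. The result is only as good as the inventories, so it narrows the search rather than proving anything.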

3. Population of computers affected:

The error symptoms have spread amongst what appears to be a gradually widening population of computers, some of which share a network and some of which are discrete systems - i.e., not interconnected or intercommunicating. (Is this true?)

___________________________________
Observation: This sort of thing could look like the spread of a virus or bug introduced at a common point to all computers affected - e.g., maybe (say) at the point of a system/software update.

___________________________________
Questions:

Have the update (change management) logs for all computers - including those computers affected and not affected - been compared to establish what specific updates were done and when, on the run up to and prior to any manifestations of the errors?

Were any system updates derived from the same media or update data file(s)?

Does a checksum comparison of all of the "same" update files (and all/any other files on the media used) - for all computers affected and not affected - show any differences in the files? (There should be no difference.)
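The checksum comparison in the last question could be sketched roughly as below, assuming Python is available and that the update files from two machines have been copied into two folders. The function names are hypothetical, and SHA-256 is my choice here; any reliable hash would do.

```python
import hashlib
import os


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_digest(full)
    return manifest


def diff_manifests(a, b):
    """Return (files whose digests differ, files only in a, files only in b)."""
    differing = sorted(k for k in a.keys() & b.keys() if a[k] != b[k])
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    return differing, only_a, only_b
```

Running `diff_manifests(build_manifest(folder_1), build_manifest(folder_2))` on the two copies should give three empty lists if the "same" update files really are the same.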

4. Conclusions thus far:

After a great deal of investigation and analysis so far, none of the errors have been deliberately repeatable, and so the cause(s) remain unknown.

The causal problem seems very likely to be software-related, not hardware.

Delayed write failure errors only occur when a device isn't responding in the allotted time to signal the file system that the write action can take place. There are many reasons this can happen, and they have already been discussed. However, these are practically always hardware related (in my personal/anecdotal experience at least).

@yksyks: Are the systems also properly cooled? All the time? Are you sure? The reason I ask is that here in Paraguay there can be very high ambient temperatures. Because of that I need hard disk coolers, which are screwed onto the bottom of the drive and have 2 fans (one to blow air onto the device, the other to suck the hot air away). Without these, I can run these disks for only a few hours (in spring and summer) and then the operating system/file system quite simply "loses" the drive.

Running hard disks at high temperature seriously shortens their life span and damages them in the meantime.

If you do work with hard disk coolers, are these functioning properly? A fan that is stuck or not moving smoothly draws much more power than you expect and in essence becomes a heating element...residing under your hard disk, making the drive hotter more quickly.

Do you use solutions that make any of your PC fans slow down after a while? Are these coolers perhaps connected to chassis fan connectors? (I see such crap here when people bring me computers to repair.)

Software can do crazy things if the hardware supplies a '1' when the software expects a '0'. When that happens on a slightly bigger scale a cascading effect occurs that makes your computer behave erratically. With the densely packed (magnetically/electrically) hard drives of today, there isn't much margin for error anymore on the hardware side, and in combination with an intertwined operating system such as Windows those errors can easily create havoc.

You have a vague problem and it is good of IainB to ask/confirm the details of your setup. Without a good description our guesses are as good as yours.

...Delayed write failure errors only occur when a device isn't responding in the allotted time to signal the file system that the write action can take place. And there are many reasons for this to happen and have been already discussed. However, these are practically always hardware related (in my personal/anecdotal experience at least). ...
_______________________________

Of course I looked already into the event viewer. The only suspect messages I found there, related to HD, were repeated warnings from PerfDisk: Unable to read the disk performance information from the system (Event ID: 2001). After a series of these follow assorted crashes of random apps and disk-related errors. These errors occur solely after some hours of inactivity. So far they never appeared while working on the machines. Even the machine left idle for a couple of days doesn't crash every night, though. So far I was unable to track down any pattern.
_______________________________

These comments do not seem easy to reconcile. I was reminded of something when I read of "disk performance information" above - it reminds me of the following, but I am not sure whether/how this could be relevant:

EDIT 2012-09-17:

Hooray! This seems to be an effective fix to the episodic real-time performance monitoring issue (for more info., refer HDS FAQ page http://www.hdsentinel.com/faq.php):

The real time performance monitoring worked per the Registry settings workaround (see earlier edit below), but after some time (for example after connecting/removing external hard disk, pendrive or similar storage device) it stopped working and I periodically had to reset the Registry settings - i.e., the Registry settings change did not "stick". This was apparently caused by a function in HDS which provides for performance monitoring when a new device - e.g., an external hard disk - is connected/detected. When this happens, Hard Disk Sentinel has a function that clears the performance object cache and re-detects the performance objects. On some systems (regardless of hardware configuration) this function apparently causes the Windows performance monitoring settings in the Registry to be disabled.

If this happens, you can disable this HDS function as follows:

1. click the "Start" (Windows) button and enter REGEDIT in the search field

2. open REGEDIT

3. navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\HD Sentinel (or HKEY_LOCAL_MACHINE\SOFTWARE\HD Sentinel under 32 bit Windows), where you will see a lot of keys.

4. create a new STRING value named DisablePerfCacheClear and set it to 1.

Then restart HDS, which now will not issue this special function to clear the performance object cache when it detects the change of configuration, so the performance counters will continue working normally - once reset in the Registry. Those Registry settings should now "stick" and not need to be reset again.
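For anyone who prefers importing a file over clicking through REGEDIT, here is a hedged sketch in Python that only generates the text of a .reg file for the steps above. The helper name is hypothetical. I am assuming REGEDIT will accept a plain-text file with this header (it usually saves such files as Unicode itself), so check the generated file by eye before importing it.

```python
def reg_file_text(key_path, value_name, value_data):
    """Build .reg file text that sets one string (REG_SZ) value under key_path."""

    def escape(text):
        # Inside quoted names/data, backslashes and double quotes must be escaped.
        return text.replace("\\", "\\\\").replace('"', '\\"')

    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[%s]" % key_path,
        '"%s"="%s"' % (escape(value_name), escape(value_data)),
        "",
    ]
    return "\r\n".join(lines)


# Key path as given in the steps above for 32 bit Windows;
# use the Wow6432Node variant on 64 bit systems.
HDS_KEY_32BIT = r"HKEY_LOCAL_MACHINE\SOFTWARE\HD Sentinel"
reg_text = reg_file_text(HDS_KEY_32BIT, "DisablePerfCacheClear", "1")
```

Writing `reg_text` to a file with a `.reg` extension and double-clicking it should have the same effect as step 4, provided the assumption about the file format holds on your system.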

Just to add to this, I've had the same experience here with Delayed Write Failure and in almost every case it's always come down to one piece of hardware - the SATA or USB data cables.

For SATA - those cheap, non-locking ones that were so prevalent a few years ago. Low contact pressure combined with the heating/cooling cycles within a PC case doesn't make for a happy combination.

For USB - the constant plug/unplug or the mechanics of the plug/socket sometimes result in a marginal electrical contact. (eg. On my netbook there's one USB port where if I plug a cable all the way in, the device isn't recognised - pull it back 1 mm or 2 and it's fine. Thus any decent vibration will result in the plugged-in device suddenly disappearing from the system.)

I suppose that any hardware cause is ruled out now. Temperatures are okay, cables replaced, the disks repeatedly tested on other machines, some replaced too (with cloned contents).

As I mentioned LogMeIn, please forget it. I meant TeamViewer, sorry, I mixed the two.

And as for the PerfDisk warning: it appeared only on one machine. On others there were different problems reported, like unreachable paging file, and so on.

As for HD Sentinel, I don't have anything like this anywhere in the registry.

Now, some news: On one PC I replaced Bitdefender with a trial version of Malwarebytes. The behavior changed dramatically: now it doesn't report anything, just crashes to a hard reboot. The reboot always fails, as no HD is found, even in BIOS. A reset doesn't help; only after switching the power off and on again does the system start normally. Another change is that it now happens every couple of minutes, regardless of being idle or not.

So, my question is: Could there be a virus so powerful that it would be capable of disabling the disk controller? I still think of some software collisions, but...

I could be cursed as well. Believe it or not, but since yesterday the only working machine in the house (notebook with Vista), shows just a pitch black screen. It's working normally, I can reach it using the TeamViewer, just its display died. Funny.

Was the laptop used (not by you) with a beamer, and did someone forget to put the original video configuration back? I have no real experience with Vista, so I am not sure it automatically readjusts output to the notebook monitor when it detects the beamer isn't connected anymore.

It sure sounds like something is attacking the hardware. If possible I would try an AV with a real time shield that you haven't tried as yet. Since the machines are networked it stands to reason the virus has spread across the LAN. Perhaps one of those Linux boot AV scan discs will turn up something with a complete scan.

I'm back again, still alive, and my computers, too. It was not a discourtesy, I just wanted to test everything thoroughly before posting, and it all takes so much time...

So, my first step was a workaround: I restart all the computers automatically overnight. This solved a lot of problems, but not all, and I'm aware that it is not a solution. However, rebooting Windows machines often has generally proved to be a good idea.

Now, I'm almost 100 % sure that the culprit was the Bitdefender Free Edition. I experimented with some other AV solutions, and ended up with AVG Free. To make it short, since I replaced Bitdefender with AVG, the following symptoms never appeared again:

Delayed write fail error

Disconnecting USB and network cards

Occasional blocked access to Scheduled Tasks

Repeatedly erased cookies

Random crashes of browser (using Opera 12)

This all now stopped appearing. I left the automatic rebooting in effect, just to be sure. So, I'm not sure I really found the culprit, but I suspect that I did. So far, so good, and thanks to you all for your support.

Thank you for taking the time to report -- it makes some sense that it could have been caused by antivirus.

I note that Shades, in the first reply to your first post, did include in his post these wise words:

"Changing anti-virus software might help as well."

I myself have experienced Blue Screens of Death caused by an antivirus tool which I otherwise was quite happy with. I suppose it's always good to be suspicious of these very low-level tools that hook into the underlying file system and intercept all file access.

Well, in fact, besides the OS itself, and TeamViewer, which I consider completely innocent in this regard, the Bitdefender was the only similarity between the machines, which suddenly started to exhibit almost identical symptoms. So it should have been evident from the beginning. I just hesitated too long...

But I can tell you, installing and then properly removing different AV products is a real challenge, especially time-wise. But that would deserve a stand-alone thread.