I have a Windows Server 2003 Small Business Server that has begun crashing more and more often. The machine does not physically power off (so I don't think it is a power supply issue), but it does not respond to any network activity. When I force a restart by holding down the power button (at 11:43:20 AM), the only relevant Event Log entry says "The previous system shutdown at 3:45:11 AM on 7/21/2009 was unexpected."

This first started about a year ago, when the machine would do this once every 3 or 4 months. Over the last month, it has become more and more frequent: twice in 4 weeks, then twice in 2 weeks.

So, two questions:
1) Any idea what would cause crashes like this to occur? and
2) Are there any ways to get more detailed Event Log entries from within Windows?

I have full current backups of all the data on the server, so I am not worried about data loss. I have regularly updated and scanned Symantec Virus protection managed by the domain.

You say it does not respond to network traffic, but if you physically go to the console, what do you see? Has the server locked up, restarted, etc.?
– Sam, Jul 27 '09 at 14:45

Unfortunately, it is in a data center and there has been no monitor attached. I attached one (and a keyboard and mouse) last time it crashed so I could get more info next time it crashes.
– minamhere, Jul 27 '09 at 14:52

3 Answers

It could be a million things from a faulty driver down to hardware problems.

Before going any further, the best thing to try would be to open System Properties (right-click My Computer and click Properties), go to the Advanced tab, click Settings under Startup and Recovery, untick the "Automatically restart" checkbox, and choose to keep a kernel memory dump.
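As a rough sketch, the same two settings can also be checked from a script. The dialog stores them under `HKLM\SYSTEM\CurrentControlSet\Control\CrashControl` as the `CrashDumpEnabled` and `AutoReboot` values. The function names below are made up for illustration, and the sketch assumes a Python interpreter is available on the server, which may not be the case on a Windows Server 2003 machine.

```python
# Registry location of the Startup and Recovery settings.
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

# Meaning of the CrashDumpEnabled DWORD.
DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump",
}


def read_crash_control():
    """Read the two values from the registry (Windows only)."""
    import winreg  # imported here so the rest of the sketch loads anywhere

    settings = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        for name in ("CrashDumpEnabled", "AutoReboot"):
            try:
                settings[name] = winreg.QueryValueEx(key, name)[0]
            except FileNotFoundError:
                settings[name] = None  # value is not present
    return settings


def check_crash_control(settings):
    """Return a list of warnings; an empty list means the advice is followed."""
    warnings = []
    dump = settings.get("CrashDumpEnabled")
    if dump not in (1, 2):
        kind = DUMP_TYPES.get(dump, "unknown")
        warnings.append(f"dump type is '{kind}'; expected kernel or complete")
    if settings.get("AutoReboot") != 0:
        warnings.append("automatic restart is not disabled; a BSOD will not stay on screen")
    return warnings
```

On the server itself, `check_crash_control(read_crash_control())` would list anything that still differs from the settings described above.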

Next time this occurs, you should be able to see whether the problem is related to memory, hardware or a driver based on what it says on the BSOD. If the BSOD does not help, you should have a kernel dump that you can diagnose (if you can be bothered!)

Of course, it is possible that it won't even reach a Blue Screen of Death, and if this is the case, it most likely means a hardware fault such as a failing power supply.

Conveniently, I had this option checked already, but the MEMORY.DMP file was last modified in 2007. I am about to examine it anyway, in case it contains any useful information. I turned off the automatic restart option so that next time it crashes, I can find out if it was a BSOD. Thanks.
– minamhere, Jul 27 '09 at 15:53

The server just crashed again. It did not create a MEMORY.DMP file, and the monitor did not detect the computer; it just stayed in standby mode no matter whether I moved the mouse or typed on the keyboard (which also did not respond to Num Lock or Caps Lock). I could not see the screen to read any BSOD info.
– minamhere, Aug 28 '09 at 21:58

OK, if this were me (and nothing critical were running on the server), I would download Prime95 (files.extremeoverclocking.com/file.php?f=103) and run several instances of it. Using Task Manager, I would set the affinity of each instance to a different core and try to max the CPU out at 100%. If it crashes after about 15-20 minutes, I would swap the power supply for sure.
– William Hilsum, Aug 29 '09 at 20:08

I swapped the entire server for an identical model on Friday night. So far no crashes. I will run Prime95 on the old one; I would love to isolate this to a PSU issue. Thanks.
– minamhere, Aug 30 '09 at 13:39

Swapping the server seems to have fixed all my problems. All hardware is identical and it has not crashed in 2 weeks. Thanks a lot.
– minamhere, Sep 11 '09 at 19:05

The server finally crashed again today. The keyboard, mouse, and monitor did not respond at all. The keyboard got no power (Caps Lock and Num Lock would not light up), and the monitor did not recognize that it was connected to anything. The network card was still lit and flashed every so often. The power LEDs were all still on as well. I forced a restart and Windows had the same error log entry as previous times. It did not create a MEMORY.DMP file as I had specified in the settings. I will try memtest and then see if I can swap the PSU.
– minamhere, Aug 28 '09 at 21:57

Forcing a shutdown because your network is no longer responding isn't a smart thing to do; what's worse is that it leaves you unable to diagnose the problem.

I don't think your server is locking up or crashing.

It sounds like you have problems with your network configuration or network card, or possibly even your switch. See if you can get some out-of-band management! Plugging in a keyboard and mouse is a good start. You need to get in front of the server physically the next time it happens.

If the server has a spare NIC, configure it and plug it into another server or computer (a crossover cable may be required), give both ends static IPs, and see if the server is still accessible via that network the next time it crashes.
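A minimal sketch of how the second machine could keep a timestamped record of which interface still answers, so the log shows whether only one network path went quiet. It uses a plain TCP connect as the probe. The addresses and port in the usage note are placeholders, not values from this thread, and the function names are invented for illustration.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def probe_line(host, port, timeout=3.0):
    """Build one log line with a timestamp and the result for a single target."""
    state = "up" if is_reachable(host, port, timeout) else "DOWN"
    stamp = datetime.now().isoformat(timespec="seconds")
    return f"{stamp} {host}:{port} {state}"


def monitor(targets, interval=60, rounds=1, log_path="reachability.log"):
    """Probe every (host, port) target `rounds` times, appending to log_path."""
    for i in range(rounds):
        with open(log_path, "a") as log:
            for host, port in targets:
                log.write(probe_line(host, port) + "\n")
        if i < rounds - 1:
            time.sleep(interval)
```

For example, `monitor([("192.168.0.10", 3389), ("10.0.0.2", 3389)], interval=60, rounds=1440)` would check a hypothetical primary address and a hypothetical crossover address once a minute for a day. Port 3389 assumes Remote Desktop is enabled on the server; any port the server is known to listen on would do.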