I purchased a used DL560 G1 server with dual PSUs. Within 30 minutes of powering it up I heard a "pop" and smelt some smoke. The system indicated a PSU failure but kept running on the remaining PSU. The PSU was definitely faulty: I tried it by itself, and the server status for that PSU also stated "faulty".

I ordered a replacement PSU, plugged it in, and got redundancy again.

After another 20 hours or so I heard another "pop" and smelt some smoke again (just like the first time but maybe not as loud).

The server shut down with a "PSU fault" red light (at least I think it was the PSU light and not the server health light). I thought damn.. have both PSUs gone??

After cycling the power, however, the system booted up fine with both PSUs working....

Any ideas what happened? Something definitely blew and the smell is quite strong!

I should add I had this system in a poorly ventilated area.... (this has now been changed and it is somewhere with good ventilation).

I suspect this may have played a part in this story... but it doesn't explain the second occurrence, or why the system came back up after a "smoking" event with a strong smell of plastic... just like the first time it failed.

It's a decoupling/filter type cap (tantalum "blasting cap") to make things nice and smooth in the power department. Companies still generally allow their engineers to put on a surfeit of these, so you can usually get away with a few failing (that might change soon; after all they're probably a nickel apiece and only help reliability, so it's not that important, right?). When they blow they generally let a pretty big current surge through, causing temporary voltage oddities that can trip monitoring and make things behave oddly. Sometimes (often) things straighten out after a reset, 'cause the short is self-clearing in most cases.

When it blew, the server immediately went dead... until I cycled the power. I suppose it's not critical then? Not sure why it could have happened though!

Tantalum "blasting caps" under stress tend to have a failure mode where they short out. It's a known problem, and usually affects older machines than yours. The good news is that it rarely damages much, and is an easy fix (and sometimes can even be ignored).

In other fields, engineers are having the problem that they design with bypass caps in place, and when the boards go to assembly the company "optimizes" all the bypass caps out of the design. When the boards come back they have strange noise and interference issues, and everything has to be sent back to manufacturing.... hope computers don't go that route.

Facebook says no to tantalum on the custom servers for their custom datacenter. Tantalum is considered a conflict mineral tied to the war in Congo, so tantalum caps are often avoided to stay away from conflict materials. MLCCs (multi-layer ceramic capacitors) have gotten better, now offer higher values in small packages, and are common as bypass caps.

japes wrote: In other fields engineers are having the problem that they design with bypass caps in place and when the boards go to assembly the company "optimizes" all the bypass caps out of the design. When the boards come back they have strange noise and interference issues and need to send everything back to manufacturing.... hope computers don't go that route.

Why wouldn't they? It would be a business goldmine for makers of "consumer" machines: 3/4 of users will think that they just need to buy a new one (more sales), the rest can have the usual "it's a software problem and will be fixed in a future upgrade" line given to them. The "upgrade", of course, never materializes.