Hello. I've got two servers. On one of them the BMC has stopped responding to IPMI queries and commands, which caused a problem the other day when the OS crashed and the cluster was unable to fence that server.

I can get to the BMC web interface, and ipmiping gets a response. It just refuses to work with ipmitool. The other node works just fine.

Is there any way of working out/fixing this problem without opening up the server? It's a live production server and I'd rather not take it offline unless it's vital.

Also, I can't see a way of doing that without powering the server down. I was hoping to be able to solve the problem without doing that as I'd have to arrange a maintenance window to switch the live website over to the other node.

Yes, I'm logged in at admin level. On the web interface all the chassis functions work: I can see the power state and use power-off, restart, etc.

It's only via IPMI (using ipmitool) that I can't run chassis commands, and that's also connecting as the admin user. This is for one node in an SR1670HV server. The other node's BMC works as expected. The chassis commands used to work on the problem node until a few weeks ago. The firmware re-flash didn't work, nor did removing power to the server for a while.

On further investigation, it turns out that the problem is confined to the ipmitool command. Other tools, such as ipmi-chassis and ipmi-sensors, work. Unfortunately the cluster uses ipmitool to fence the node. I don't know why it's just ipmitool (and I've tried it from other servers and distributions) that doesn't work. Perhaps this is a protocol issue?
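
One way to test the protocol theory might be to run the same chassis query through ipmitool's two LAN interfaces, `lan` (IPMI v1.5) and `lanplus` (IPMI v2.0), and compare the results. Below is a minimal Python sketch of that idea. The host address and user name are placeholders, and the helper names `ipmitool_cmd`, `run` and `compare_interfaces` are made up for illustration. It assumes ipmitool is installed and that the password is exported in the `IPMI_PASSWORD` environment variable (read by ipmitool's `-E` option).

```python
import subprocess

# Placeholder connection details (assumptions, not from the real setup).
BMC_HOST = "192.0.2.10"
BMC_USER = "admin"


def ipmitool_cmd(interface, host=BMC_HOST, user=BMC_USER):
    """Build an ipmitool 'chassis status' command for one LAN interface.

    interface is "lan" (IPMI v1.5) or "lanplus" (IPMI v2.0 / RMCP+).
    -E makes ipmitool read the password from IPMI_PASSWORD.
    -v asks for verbose output, which may hint at where the session fails.
    """
    return [
        "ipmitool", "-I", interface,
        "-H", host, "-U", user, "-E", "-v",
        "chassis", "status",
    ]


def run(cmd, timeout=30):
    """Run a command and return (exit_code, combined_output).

    exit_code is None if the tool is missing or the call timed out.
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except FileNotFoundError:
        return None, f"{cmd[0]} is not installed"
    except subprocess.TimeoutExpired:
        return None, f"timed out after {timeout}s"
    return proc.returncode, proc.stdout + proc.stderr


def compare_interfaces(host=BMC_HOST, user=BMC_USER):
    """Query the BMC over both interfaces and print each result."""
    for interface in ("lan", "lanplus"):
        code, output = run(ipmitool_cmd(interface, host, user))
        print(f"--- {interface}: exit code {code}")
        print(output.strip())
```

Calling `compare_interfaces()` from a machine that can reach the BMC prints both results side by side. If one interface succeeds and the other fails, that would point towards a protocol-version mismatch rather than a dead BMC, though it would not prove it.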

The workaround for now would be to configure the cluster to use a different module for node fencing.