Re: P4300 brings store down after 1 disk fails

Do you have a failover manager for that particular management group? We have had several disk failures over the past few months, and right after two or three of them the affected node was offline for a minute or two. We have a setup with 4 nodes and a dedicated failover manager, so all volumes were still available. It seems that the RAID controller sometimes takes a while to deal with a failed disk and doesn't respond to requests in time, so the SAN/iQ software takes the node offline because of the high latency.

If you only have two nodes in your setup and don't have a failover manager to provide quorum, the volumes will be unavailable for as long as the node is offline. The same of course applies to all configurations with a single node.
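
To make the quorum point concrete, here is a rough sketch of the arithmetic. It assumes that every storage node runs one manager and that quorum is a simple majority of all managers in the management group; the function name is made up for illustration and is not part of any HP tool.

```python
def has_quorum(total_managers: int, reachable_managers: int) -> bool:
    """Assumed rule: strictly more than half of all managers must be reachable."""
    return reachable_managers > total_managers // 2


# Two nodes, no failover manager: losing one node loses quorum.
print(has_quorum(total_managers=2, reachable_managers=1))

# Two nodes plus a failover manager: one node can drop out.
print(has_quorum(total_managers=3, reachable_managers=2))

# Single node: any outage takes the volumes with it.
print(has_quorum(total_managers=1, reachable_managers=0))
```

Under that assumption the failover manager is what turns a two-node setup from "one outage stops everything" into "one outage is survivable".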

Re: P4300 brings store down after 1 disk fails

We are running on 9.0. I can only assume that this behavior is designed for setups with more than one node, where there is no real danger in losing a node for a short period. Because Network RAID spreads all accesses over all nodes in a management group, a node with high latency is going to affect all volumes and all sessions, so the software decides to take the node offline to avoid clogging up the request queue. A sensible choice for setups with at least two nodes, fatal for setups without failover.