On Fri, Apr 27, 2007 at 01:21:29PM +0800, garylua singnet com sg wrote:
> Hi,
>
> I'm currently running a cluster (RHEL4, Cluster Suite Update 3) with 2 nodes. There is a virtual IP and a shared storage resource defined, together with other scripts that are supposed to be executed.
>
> During failover, I discovered that the time taken is a little long (around 25 seconds). I realised that the "bottleneck" behind this long failover time is the unmounting/mounting of the shared storage filesystem and the failover of the virtual IP to the other node.

Each IP address teardown adds 10 seconds.

> I'm just wondering how I can set up the shared storage so that BOTH nodes can access the shared storage filesystem, but only the active node has write privileges. That way there is no risk of data corruption, and yet the failover time can be reduced (hopefully?).

I don't think you can do this safely. You could use GFS if you wanted
to.

> All in all, I'm trying to reduce the time of the failover to as short a period as possible. I've already changed the status monitoring (heartbeat) interval to 5 seconds. And on top of the shared storage and virtual IP, I have 4 scripts that need to be failed over. I configured the cluster in such a way that when any of the scripts fails, the rest of the scripts will fail over too.

If you're not doing NFS as part of your service (e.g. with the
"nfsexport/nfsclient setup"), kill the 'sleep 10' in the
/usr/share/cluster/ip.sh script. That will speed things up a bit.

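
A minimal sketch of that edit, assuming the delay sits on a line of its
own that reads exactly 'sleep 10' (check your copy of the script first,
since its layout may differ). The helper name disable_sleep10 is made up
for illustration, and it relies on GNU sed's -i option:

```shell
#!/bin/sh
# Hypothetical helper (not part of Cluster Suite): comment out standalone
# "sleep 10" lines in a script, keeping a backup copy next to it.
disable_sleep10() {
    script="$1"
    [ -f "$script" ] || { echo "no such file: $script" >&2; return 1; }

    # Keep an untouched copy so the change can be reverted.
    cp -p "$script" "$script.orig" || return 1

    # Show the lines that are about to be changed, with line numbers.
    grep -n '^[[:space:]]*sleep 10[[:space:]]*$' "$script" ||
        echo "no standalone 'sleep 10' line found in $script" >&2

    # Comment out the matching lines, preserving indentation (GNU sed -i).
    sed -i 's/^\([[:space:]]*\)sleep 10[[:space:]]*$/\1# sleep 10/' "$script"
}
```

Run it as root on each node, e.g. disable_sleep10 /usr/share/cluster/ip.sh.
The original is kept as ip.sh.orig, and a package update may put the
stock script back, so re-check after updating.
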
> As such, there are a lot of dependencies among the resources and I'm trying to reduce the time of the failover to maybe about 10 seconds, if possible.
--
Lon Hohberger - Software Engineer - Red Hat, Inc.