First off, I don't understand why zmmtaconfig and zmmtaconfigctl were not running, or what other logs I could check to figure it out.

Second, what's with the "zmmtaconfig: zmmtaconfig started" message that comes in 2 seconds before it shows zmmtaconfig as not running? What is starting this, and why? Maybe this is some kind of automatic restart, and the zmcluctl script ran at exactly the wrong time, so it thought zmmtaconfig was not running? If so, what would have triggered this restart? Could it be a log rotation or something?

Thanks guys! As per my signature, I'm running 5.0.20_GA_3128.RHEL4_20091102090733. From the sounds of that bug report, this issue has NOT yet been fixed in the latest version (according to Mike Cathey), so upgrading might not help.

Any idea how long zmmtaconfig is down during these log rotations? If it's not long, I think the easiest solution might be to just make a wrapper script, something like:
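A minimal sketch of such a wrapper, not a tested drop-in. The zmcluctl path, the `ZMCLUCTL` and `RETRY_DELAY` variables, and the `check_with_retry` function name are all assumptions for illustration; adjust them to match your install. It retries on any non-zero exit from the first attempt and passes its arguments straight through to zmcluctl:

```shell
#!/bin/sh
# Hypothetical wrapper around zmcluctl: retry once before reporting failure.
# ASSUMPTION: the path below is a guess; point it at your real zmcluctl.
ZMCLUCTL="${ZMCLUCTL:-/opt/zimbra-cluster/bin/zmcluctl}"
# Seconds to wait between the first and second attempt.
RETRY_DELAY="${RETRY_DELAY:-30}"

check_with_retry() {
    # First attempt: if it succeeds, report success immediately.
    "$ZMCLUCTL" "$@" && return 0

    # First attempt failed; wait in case a service is mid-restart.
    sleep "$RETRY_DELAY"

    # Second attempt: succeed if it passes now, otherwise return 1.
    "$ZMCLUCTL" "$@" && return 0
    return 1
}

# Run only when an action (for example "status") was passed in.
# With no arguments, this file just defines the function.
if [ "$#" -gt 0 ]; then
    check_with_retry "$@"
    exit $?
fi
```

The cluster service would then point at this wrapper instead of at zmcluctl directly, leaving the original script untouched.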

This would run zmcluctl and, if it exits with status 1, sleep 30 seconds and try a second time, returning 1 only if both attempts fail. I think this would cut down on a lot of false positives. The only downside is waiting an extra 30 seconds before failing over, but I can live with that.

I prefer making a wrapper script over modifying the Zimbra script, as the Zimbra script will get overwritten if I upgrade to a newer version. It also allows me to run the original script unedited if I so desire.

Moral of the story: this is a bug in 5.x that apparently hasn't been fixed yet. Marking this thread as solved. Thanks everyone!