7 Tips for MOM

Server management is critical in nearly any shop, but even more so in larger environments. The larger the environment, the more critical it becomes.

Here at the Kentucky Department of Education Office of Education Technology (OET), we provide technical standards and services to all 1,400 K-12 public schools, for nearly 700,000 student and staff users throughout the Commonwealth. Our infrastructure consists of 180 fully managed and monitored domains ranging in size from 200 to 110,000 users.

For the past two years, our three-member OET Directory Services Team has had great success using Microsoft Operations Manager (MOM) to monitor this infrastructure. Since implementing MOM, we've reduced the number of break/fix help desk tickets by more than 90 percent for monitored machines and related services. Just the fact that we can monitor and maintain an environment of about 400 servers and nearly three-quarters of a million users with three people speaks to MOM's abilities in massive enterprise settings. During that time, we've learned a thing or two about using MOM. We hope you can benefit from these tips for getting the most out of MOM.

Tip #1: Take Advantage of the Management Packs
Microsoft currently lists 132 management packs and 13 product connectors. Management packs contain scripts, performance-gathering tools and Knowledge Base information for components MOM can monitor (more about the Knowledge Base later). Product connectors allow MOM information to be forwarded to other management products such as HP OpenView or Tivoli TEC for consolidated alerting.

Bonus Tip

You’ll need to determine which management packs fit into your environment, but be careful to install only the minimum number of packs necessary to fulfill your monitoring requirements. Every management pack adds work to your management servers and adds size to the agents deployed on your managed machines.

The Active Directory management pack has been worth its weight in gold to the OET Directory Services Team. On several occasions MOM has alerted the team to replication problems that were quickly resolved using its Knowledge Base.

Tip #2: Know Your Ports to Head off the Storm
Firewalls are an integral part of any organization's security infrastructure, but they can also wreak havoc on a MOM deployment. OET found this out the hard way when a rogue firewall rule produced a communications failure between the MOM management servers and a number of their managed servers. Alerts destined for the management servers were dropped by the firewall due to port restrictions, so the MOM operators never knew the alerts were happening.

In the meantime, those same firewall rules were blocking replication. The result was an ugly mess of replication failures that took several days to reconcile once the rogue rule was discovered and corrected. The MOM 2005 Security Guide details all the ports needed for MOM to function properly.
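A quick connectivity check, run from a managed server, can reveal this kind of silent blockage before alerts start going missing. The following is a minimal sketch, not part of MOM itself. It assumes TCP 1270, the port commonly documented for MOM 2005 agent traffic; confirm the full port list against the MOM 2005 Security Guide. The function names are hypothetical.

```python
import socket

# Assumption: TCP 1270 is the port commonly documented for MOM 2005
# agent-to-management-server traffic. Verify it against the Security Guide.
MOM_AGENT_PORT = 1270


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def find_blocked(hosts, port=MOM_AGENT_PORT, timeout=3.0):
    """Return the hosts that could not be reached on the given TCP port."""
    return [host for host in hosts if not can_connect(host, port, timeout)]
```

Calling `find_blocked(["mom-mgmt01.example.org"])` (a placeholder server name) returns the management servers this machine cannot reach. A successful TCP connection only shows that the firewall permits the port, not that the agent itself is healthy.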

Bonus Tip

MOM 2005 is a more pleasant experience right out of the box than the previous version, as many of the noisiest rules have been eliminated. Before you make any rule changes, document and test each individually. If you find yourself making several new rules, create a folder specifically for your rules so that other administrators can easily find them. We’ve found that creating a folder for each MOM administrator is helpful. An example is shown in Figure 1.

Tip #3: Play by the Rules
Once you've established communication between the individual MOM components and successfully deployed the agents, you can begin tweaking the MOM rules and scripts. Depending on the size of your environment, this can take 10 minutes or 10 months.

The directory services team at OET added nearly 20 new rules and turned off several noisy rules while running MOM 2000 SP1. Noisy rules are those that spit out events or alerts en masse or unnecessarily. Examples in MOM 2000 SP1 include rules that send successful Netlogon events to the management servers. In an environment with a large number of users, this can grow your MOM database tremendously. We also significantly tuned performance monitoring rules to reduce the size of the database.
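One way to spot noisy rules is to tally events per rule and see which ones dominate the total. The sketch below assumes you have exported event data to a CSV file with a header row; the `rule_name` column and all function names are hypothetical, not a MOM export format.

```python
import csv
from collections import Counter


def load_rows(path):
    """Read a CSV export with a header row into a list of dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def count_events_by_rule(rows, rule_field="rule_name"):
    """Tally how many events each rule produced."""
    return Counter(row[rule_field] for row in rows)


def noisiest_rules(rows, top_n=5, rule_field="rule_name"):
    """Return (rule, count, share_of_total) for the top_n rules by volume."""
    counts = count_events_by_rule(rows, rule_field)
    total = sum(counts.values())
    # most_common() is empty when there are no rows, so no division by zero.
    return [(rule, n, n / total) for rule, n in counts.most_common(top_n)]
```

A rule that accounts for a large share of all events, such as one forwarding successful logons, is a candidate for review. Whether to disable or tune it still depends on your monitoring requirements.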

Figure 1. Creating a Rule Group Folder makes it easier for other administrators to find and use rules.

Tip #4: Increase Your Knowledge Base
As you create new rules and groups of rules, MOM lets you add them to its database. When the Operator Console raises alerts, you can add your problem resolution steps into MOM 2005 by selecting the alert, right-clicking on the Company Knowledge Base tab, clicking Edit and entering the properly formatted information.

This has proven very beneficial for OET. It reduces the number of Tier 3 support calls, which translates into lower support costs. Adding the name of the person entering the information (Figure 2) and the date to the Knowledge Base gives the MOM operator a person to contact if there are questions about the solution.

Figure 2. The Office of Education Technology formats Knowledge Base information so it can recall that data for troubleshooting.

To help keep the security folks happy, MOM 2005 agents can run under a reduced security context on domain controllers without impacting their effectiveness. This is accomplished using a "MOM Action Account."

That account, which you can use to install agents, run scripts and gather data from managed machines, must be a member of the Local Administrators (not Domain Admins) and Performance Monitor Users groups. It must also hold the "Log on Locally" and "Manage Auditing and Security Log" rights in the Default Domain Controller Security Policy, which the local Administrators group holds by default in Windows 2003. All of the security settings and permissions required for properly operating MOM are detailed in the MOM 2005 Security Guide.

Tip #6: Eliminate Replication Headaches
MOM 2005 suffers from some of its predecessor's ailments. Microsoft Knowledge Base article 889054 describes a problem that occurs when replprov.dll tries to access an invalid pointer. It generates error messages when the file can't determine the replication status of the domain controller.

This alert can cause major headaches whether you're monitoring a handful of domain controllers or hundreds, but fortunately a hotfix is available and works well. If you see the alert (as presented in Figure 3), you're a prime candidate for this hotfix, which is applicable to both MOM 2000 and 2005.

Figure 3. If you see this alert, KB article 889054 is where you need to look for answers.

Tip #7: Consider Trading Up
If your business only requires "best effort" uptime, then don't worry about purchasing a monitoring product. However, if your customers are as finicky as mine, MOM is a solid tool regardless of the size of your computing environment. With all the changes and new features MOM 2005 has to offer, an upgrade from MOM 2000 SP1 is a must.