Exchange Analyzer Tools Success Stories

Topic Last Modified: 2006-10-23

For months now, different people at Microsoft® have been writing articles, blogs, and newsgroup posts about how to use the Microsoft Exchange Analyzer Tools to help troubleshoot problems. In case you’ve missed these, let me give you a quick summary of what these tools do.

The Microsoft Exchange Best Practices Analyzer is designed for administrators who want to determine the overall health of their Exchange servers and topology. The tool scans Exchange servers and identifies items that do not comply with Microsoft best practices.

In addition to being available for download, both the Exchange Best Practices Analyzer and the Exchange Troubleshooting Assistant will be in the Exchange Server 2007 Toolbox, accessed from the Exchange Management Console.

There is a lot of enthusiasm around these tools and we’d like it to be infectious, so I’d like to share some of the Product Support Services success stories in using these tools to help troubleshoot support requests. This month, I’ll tell you about some support cases in which Microsoft Product Support Services used the Exchange Troubleshooting Assistant components, and next month I’ll talk about some Exchange Best Practices Analyzer success stories.

Please keep in mind that these examples are taken from cases in which we in Product Support Services were still learning how to use these tools ourselves and, in some cases, were only just becoming aware of their existence. That’s why you’ll see some comments that say “after days of trying to solve the problem, I ran the Analyzer tool and solved the problem immediately.” In these cases, the Support Engineer was probably just learning of the tool’s existence.

The Exchange Troubleshooting Assistant (ExTRA) consists of the following three components.

Exchange Performance Troubleshooting Analyzer (ExPTA)

Exchange Mail Flow Analyzer (ExMFA)

Exchange Disaster Recovery Analyzer (ExDRA)

When you open the Microsoft Exchange Troubleshooting Assistant, you can choose between Performance Troubleshooter, Mail Flow Troubleshooter, or Database Recovery Management. I’ve grouped these success stories by the ExTRA component that was used to help in troubleshooting the problem.

Performance problems can be the most troublesome and time-consuming problems to fix. Support Engineers do not generally take one case and work on it exclusively until it is completed. In most cases there is some back-and-forth between the customer and the Support Engineer as they try different solutions and check different settings. Performance cases can take weeks to resolve.

The Performance Troubleshooting Analyzer has had a big effect on how long it takes to resolve these problems. Without the Exchange Troubleshooting Assistant, a Support Engineer records an average of 42 minutes working on a performance case. With the Exchange Troubleshooting Assistant, that average is reduced to 12 minutes.

The following table shows the effect that ExPTA has had on troubleshooting performance problems. The times depicted are the amounts of time that the Support Engineer recorded working on a case.

Times or cases                         Using ExPTA          Manual
Average time                           12 minutes           42 minutes
Minimum time                           4 minutes            3 minutes
Maximum time                           1 hour 16 minutes    8 hours 23 minutes
Median time                            10 minutes           33 minutes
Solution delivered on first contact    12 cases             3 cases
Unresolved cases                       1 case               19 cases

One thing that really stands out to me is that the maximum time to solve a case using ExPTA was only about 30 minutes longer than the average time to solve a case without ExPTA. The other thing that really stands out is the number of times that we can solve the problem the first time that the customer and the Support Engineer make contact. I think that this shows that as ExTRA is used more frequently before customers call Product Support Services, we’ll see the number of performance cases decrease.
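The comparison can be checked with a little arithmetic. The sketch below converts the recorded times from the table to minutes; the dictionary layout and function name are only illustrative.

```python
# Recorded times from the table, converted to minutes.
TIMES = {
    "average": {"expta": 12, "manual": 42},
    "minimum": {"expta": 4, "manual": 3},
    "maximum": {"expta": 1 * 60 + 16, "manual": 8 * 60 + 23},
    "median": {"expta": 10, "manual": 33},
}


def minutes_saved(measure):
    """Manual time minus ExPTA time for one row of the table."""
    row = TIMES[measure]
    return row["manual"] - row["expta"]


# Longest ExPTA case compared with the average manual case.
gap = TIMES["maximum"]["expta"] - TIMES["average"]["manual"]
print(gap)  # 34, the "about 30 minutes" mentioned in the text

for measure in TIMES:
    print(measure, minutes_saved(measure))
```

Note that the minimum row is the one place where the manual time is shorter, by a single minute.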

Symptom: Outlook RPC pop-up box

Root Cause: Client Restrict operation

Comments from the Support Engineer: The ExPTA report pointed to the client’s use of the Restrict operation, which is part of the process of requesting that Exchange create a view on a folder or set of folders (effectively, a database table with associated criteria). If the folder or set of folders already has a view with a matching restriction, Exchange uses the existing view to satisfy the user request. If no view has a matching restriction, Exchange creates a new view. Creating a view is more costly than using an existing view. The issue was isolated within two days of ExPTA being run.
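The reuse-or-create behavior described in this comment can be illustrated with a small sketch. This is not how Exchange implements views; the `Folder` class, its `restrict` method, and the `views_created` counter are hypothetical names used only to show why a restriction that matches an existing view is cheap while a new restriction is not.

```python
class Folder:
    """Hypothetical folder that keeps one cached view per restriction."""

    def __init__(self, items):
        self.items = list(items)
        self._views = {}        # restriction -> cached view
        self.views_created = 0  # counts the costly operations

    def restrict(self, field, value):
        """Return the view for (field, value), building it only if needed."""
        key = (field, value)
        if key not in self._views:
            # No matching restriction yet: build a new view (the costly path).
            self._views[key] = [i for i in self.items if i.get(field) == value]
            self.views_created += 1
        # Matching restriction: reuse the existing view.
        return self._views[key]


inbox = Folder([
    {"sender": "alice", "unread": True},
    {"sender": "bob", "unread": False},
    {"sender": "alice", "unread": False},
])
inbox.restrict("sender", "alice")  # creates a view
inbox.restrict("sender", "alice")  # reuses it
inbox.restrict("unread", True)     # different restriction, new view
print(inbox.views_created)         # 2
```

A client that keeps issuing restrictions with different criteria lands on the costly path each time, which is the pattern the report pointed to.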

Symptom: Outlook RPC pop-up box

Root Cause: High Database Average Read and Write times

Comments from the Support Engineer: I used ExPTA when onsite in front of the customer. Having the customer see this output in real time was excellent, as typically you would have to collect the perfmon data, manually examine it and compare each counter with our recommended thresholds in the performance white paper and then present this back to the customer. This tool helps save a massive amount of man hours that are typically lost to isolating performance-based issues.
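The manual step described here, comparing each collected counter with a threshold, looks roughly like the following sketch. The counter names and threshold values are placeholders chosen for illustration, not the figures from the performance white paper, and this is not how ExPTA itself works.

```python
# Placeholder thresholds in seconds. These are NOT the white paper's
# values; substitute the documented thresholds before relying on this.
THRESHOLDS = {
    "Database Average Read": 0.020,
    "Database Average Write": 0.020,
}


def flag_counters(samples, thresholds=THRESHOLDS):
    """Return {counter: average} for counters whose average exceeds its threshold."""
    flagged = {}
    for name, values in samples.items():
        limit = thresholds.get(name)
        if limit is None or not values:
            continue  # no threshold known, or nothing collected
        average = sum(values) / len(values)
        if average > limit:
            flagged[name] = average
    return flagged


# Hypothetical samples, as if exported from a perfmon log.
samples = {
    "Database Average Read": [0.030, 0.050],
    "Database Average Write": [0.005, 0.010],
}
print(sorted(flag_counters(samples)))
```

Doing this by hand for every counter on every server is the work that the Support Engineer says the tool saves.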

Symptom: Outlook RPC pop-up box

Root Cause: SAN disk latencies

Comments from the Support Engineer: I requested that the customer run ExPTA as soon as I got the case, and the problem was isolated in one day.

Symptom: Outlook RPC pop-up box

Root Cause: Disk bottlenecks on the Exchange database server drive

Comments from the Support Engineer: I requested that the customer run ExPTA as soon as I got the case, and the problem was isolated in three days.

Unfortunately, I do not have the same data on actual time savings for mail flow and database recovery issues that I have for performance issues. Overall there are fewer of these cases to choose from, and we have not had a chance to run the same study that we did for performance issues.

Symptom: Mail was queuing up over the routing group connector between two routing groups

Root Cause: FQDN of remote servers was incorrect

Comments from the Support Engineer: All four Exchange servers in the first routing group had queues to both servers in the second routing group. The application log showed event ID 4000, "unable to bind to the remote destination server in DNS." ExMFA helped us resolve this issue in a timely manner. After we corrected the FQDN on all SMTP servers involved and restarted the Routing service, the mail queues cleared.
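A wrong FQDN of this kind shows up as a name that does not resolve in DNS. The sketch below checks a list of names with Python's standard `socket.getaddrinfo`. It covers only the DNS lookup, not the other checks that ExMFA performs, and the host names in the usage example are hypothetical.

```python
import socket


def unresolvable_hosts(fqdns, resolve=socket.getaddrinfo, port=25):
    """Return {fqdn: error text} for every name that fails to resolve.

    `resolve` defaults to socket.getaddrinfo; it is a parameter so that
    the check can be exercised without a live DNS server.
    """
    failures = {}
    for name in fqdns:
        try:
            resolve(name, port)
        except socket.gaierror as exc:
            failures[name] = str(exc)
    return failures


if __name__ == "__main__":
    # Hypothetical names; replace them with the FQDNs configured on the
    # SMTP virtual servers in the remote routing group.
    for host, error in unresolvable_hosts(
        ["mail1.example.com", "mail2.example.com"]
    ).items():
        print(f"{host}: {error}")
```

Any name reported here is worth comparing against the FQDN that the remote server actually registers in DNS.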

Symptom: Mail stuck in post categorizer queue, InetInfo at 99 percent

Root Cause: Non-standard SMTP sinks

Comments from the Support Engineer: ExMFA enabled us to easily see non-standard SMTP event sinks on the server. It pointed us to the root cause of the problem.

Symptom: Mail stuck in queue

Root Cause: Default SMTP domain name change

Comments from the Support Engineer: ExMFA identified the name change for the default SMTP virtual server domain.

Symptom: Mailbox and public folder stores dismounted and could not mount successfully

Root Cause: E00.log was missing

Comments from the Support Engineer: I started by asking the customer to download the ExTRA tool and run the ExDRA wizard. He sent me the XML output, which clearly stated that E00.log was missing and was required for the stores to mount successfully. ExDRA suggested locating the missing log, restoring from backup, or repairing the databases.

We located the missing log file in the Antivirus quarantine folder and restored it to its original path. The stores mounted successfully and the whole recovery took less than an hour.

ExDRA saved time and effort in analyzing the database headers and finding the missing logs. It also provided several suggestions for recovering from the disaster. The customer was very happy with ExDRA and decided to use it and its other functions for any future issues.

I will use ExDRA in all disaster recovery cases to help more customers recover from disasters faster than before.

Symptom: Database not mounting

Root Cause: Corrupted log file

Comments from the Support Engineer: ExDRA gave us the correct options with which to proceed. The server was up and running on the same day.

Symptom: Database not mounting

Root Cause: Log file sequence has reached the limit (E00FFFFF.log)

Comments from the Support Engineer: ExDRA gave the correct options with which to proceed. The server was up and running within 50 minutes.
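The file name in this root cause shows where the limit comes from. Assuming that the five hexadecimal digits after the E00 prefix are the log sequence number, as the name E00FFFFF.log suggests, the highest possible value is 0xFFFFF. The helper names in this sketch are hypothetical.

```python
MAX_GENERATION = 0xFFFFF  # 1,048,575 in decimal


def log_file_name(generation, prefix="E00"):
    """Build a log file name from a sequence number (five hex digits assumed)."""
    if not 1 <= generation <= MAX_GENERATION:
        raise ValueError("sequence number outside the five-digit hex range")
    return f"{prefix}{generation:05X}.log"


def generations_remaining(file_name, prefix="E00"):
    """How many more log files fit before the sequence reaches the limit."""
    hex_part = file_name[len(prefix):-len(".log")]
    return MAX_GENERATION - int(hex_part, 16)


print(log_file_name(MAX_GENERATION))          # E00FFFFF.log
print(generations_remaining("E00FFF00.log"))  # 255
```

Checking the highest numbered log file against this limit from time to time gives early warning before the database stops mounting.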

As you can see, we’ve found the Exchange Analyzer tools to be very valuable in Exchange troubleshooting. So the next time that your Exchange server starts acting up, give these tools a try before you call Product Support Services. At a minimum, you’ll save some time because you can give us the reports when you open the Support Request. At best, you’ll find the problem yourself and save time and money getting it resolved.