Windows Error Reporting: Hoping for Fixes

"Hope." I kept seeing that word as I was reading responses to our reader survey questions about Microsoft Office and Windows Error Reporting (WER, formerly Dr. Watson, which the Office development team created and other Microsoft products are now adopting). For example, one IT pro said, "My hope is that Microsoft actually compiles the reports and then attempts to resolve the problems." Our reader survey indicates that most IT pros don't respond to WER requests to send crash information to Microsoft and don't encourage end users to respond. IT pros don't know what happens to the information they send, don't hear back from Microsoft, and can only hope the data helps improve the product.

"Windows Error Reporting: Elementary, My Dear Watson," August 2005, InstantDoc ID 46982, gave you the opportunity to judge how Microsoft's Ben Canning (group program manager, Office Trustworthy Computing) described the data that Microsoft collects and explained how the data is secured and how the Office development team uses WER reports to fix problems before and after the product's release. But what about IT pros' concerns with how end users perceive WER and how those perceptions affect IT support costs? This month, I give you Ben's take on those questions as well as his comments about IT's need to collect and analyze error data and the role of Microsoft's Corporate Error Reporting (CER). Check out Ben's examples and decide whether Microsoft is satisfying IT pros' hopes that WER data results in specific improvements to Office. Then see whether you're satisfied with Ben's explanation of how Microsoft is addressing your demands for the company to get back to you about reported crashes.

IT Resource Costs

A key reason why 74 percent of IT pros don't encourage users to submit crash reports is the impact on IT resources and associated costs. One respondent said that sending WER reports "is not worth end users' time. They would think they were responding to my Help desk" instead of to Microsoft. Another person backed up that perception with data: "Users kept calling our Help desk asking what to do when the WER box pops up. So we disabled it and reduced Help desk calls by 15 percent." One respondent had a different cost concern: "I don't know whether Microsoft charges support fees to respond to WER."

These comments surprised Ben: "Several respondents were concerned about the support costs of error reporting—both in terms of end users not knowing what to do so they call the Help desk (which is something we absolutely have to look into), and also worrying about the external support costs. Someone asked, if you click Send, is that a support incident that will cost money? It's absolutely not. In fact, we don't know who you are—you're just someone sending us data."

Ben was also interested in survey comments such as "We maintain our own error reporting infrastructure. Is there any value in providing this data to Microsoft?"

Ben remarked, "Several people said they don't need to send the report to Microsoft because their IT department can fix the problem or find the workaround. I found that very interesting. These are errors in the software. There's nothing the IT department can do to fix them. Microsoft has to fix them."

Ben emphasized, "These are low-level failures, so typically there's not a specific set of steps to prevent the crash. Customers often feel like they did something wrong. Well, you didn't do anything wrong. Microsoft did something wrong. We have a problem in our software that we need to fix."

If Microsoft takes action items from this survey's data, I'd suggest that the company address how end users' reactions to WER drive Help desk calls and consume IT resources. I'll report back on any efforts that the Office team undertakes to deal with IT resource costs associated with WER.

CER: A Better WER, but Not for Everyone

Survey respondents identified one way to help IT pros control resources and costs: "It would be helpful to log errors internally. If the same information that is sent via WER was saved on the local machine, we could track specific problems internally." As Microsoft's Software Assurance (SA) customers know, such a tool does exist: "We use Microsoft CER. It gets all of the crashes and compiles them to a central location, helping our IT staff identify issues on our desktops."

Ben elaborated on CER: "The corporation sets up its CER server and then uses Group Policy to inform all the machines in the organization that this is where error reports should go. There are several benefits: You can see all the information before it goes to Microsoft, you can choose which things to send and not send, you can see how many hits are occurring in your corporation for a particular issue, and you can control responses via policy (e.g., never allow users to send documents; just send hit statistics and never send minidumps). CER lets Microsoft hear corporations' voices without compromising their data, and IT maintains complete control over what's sent to Microsoft."
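
To make Ben's description concrete, here is a minimal Python sketch of the idea, not CER's actual implementation or API. It shows a central collector counting hits per issue and applying a policy that decides what leaves the company. Every name in it (CrashReport, SendPolicy, the bucket labels, the payload fields) is hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CrashReport:
    """Hypothetical crash record collected on an internal server."""
    bucket: str         # assumed label that groups identical crashes
    has_minidump: bool  # report carries a memory minidump
    has_document: bool  # report carries the user's document


@dataclass(frozen=True)
class SendPolicy:
    """Hypothetical policy; the defaults send hit statistics only."""
    allow_minidumps: bool = False
    allow_documents: bool = False


def hit_counts(reports):
    """Count how many times each issue occurred inside the organization."""
    return Counter(r.bucket for r in reports)


def outbound_payload(reports, policy):
    """Build what would be forwarded, withholding anything the policy forbids."""
    payload = []
    for bucket, hits in sorted(hit_counts(reports).items()):
        group = [r for r in reports if r.bucket == bucket]
        minidumps = sum(r.has_minidump for r in group)
        documents = sum(r.has_document for r in group)
        payload.append({
            "bucket": bucket,
            "hits": hits,
            "minidumps": minidumps if policy.allow_minidumps else 0,
            "documents": documents if policy.allow_documents else 0,
        })
    return payload


reports = [
    CrashReport("outlook-addin", has_minidump=True, has_document=False),
    CrashReport("outlook-addin", has_minidump=True, has_document=True),
    CrashReport("powerpoint-open", has_minidump=False, has_document=True),
]

# Default policy: hit statistics only, no minidumps and no documents.
payload = outbound_payload(reports, SendPolicy())
```

In this sketch, IT sees every report internally through hit_counts, while the default SendPolicy forwards only the per-issue hit statistics, which mirrors the "just send hit statistics and never send minidumps" example in Ben's description.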

But not everyone has access to CER, right? Ben admitted that "today, anyone in the SA program can have CER for free, and it's not available outside of SA. The need for CER is definitely something we've heard from customers, and we're looking into it."

Hoping for Fixes from WER

Regardless of whether crash reports come from CER or WER, Ben wants you to know that Microsoft is using the data to make specific improvements. "Service packs now are primarily driven from real customer information: not anecdotal information but real customer data sent by WER that identifies the biggest issues that customers are facing."

I've asked many Microsoft development teams for concrete examples of product features driven by customer feedback. Up to now, few have provided specifics, but Ben gave real examples: "Right after we shipped Office XP, Adobe shipped a new PDF Maker plug-in. We saw a huge spike of crashes coming into WER. We used the Watson data to identify that everyone hitting this problem had this PDF Maker plug-in. We contacted Adobe, figured out a fix, and used the Watson system to push out a response to customers telling them to update the PDF Maker add-in."

I asked Ben for another example of a WER-driven fix. "The first critical update to Office 2003 was driven by WER information. We took a security fix late in the Office 2003 development cycle to prevent opening documents that appeared to have been hacked in a particular way. But we got the fix slightly wrong, and a subtle document corruption that looked like this hack happened in prior versions of PowerPoint and other Office applications. So after we released Office 2003, customer documents suddenly failed to open and were labeled 'security threats.' Suddenly we were seeing an error message that we had expected to come up rarely, if ever. So we investigated and issued a critical update to fix that problem. We attached a response to that error message that told the customer to get this critical patch and provided the link to it."

Ben continued with yet another example: "Back in Office XP beta, we were having a hard time debugging an Outlook crash. So the developer put a response on the error message with his cell phone number, saying 'If you get this, please call me.' Just 15 minutes later he had a call from a customer and they were able to debug that problem. It just shows there's huge hunger in the development teams: They want to eliminate customer pain, and if we can get them the information to do that, it's incredibly powerful."

Those examples are fascinating, but about 83 percent of survey respondents have never (42 percent) or rarely (41 percent) seen a response like that from Microsoft. Ben replied, "When we have a specific response, like in the PDF Maker case, we absolutely hook up a response. Often fixes get rolled up into the next service pack. So if you hit a crash, the response is that you should upgrade to SP1 because that will probably fix this problem—and if it doesn't, it will give us data for the next service pack."

We'll Get Back to You?

Not letting us know how the data we send through WER is improving Office and other Microsoft products is a good example of how Microsoft sometimes hurts itself even when it's trying to do the right thing. As one IT pro said, "I have not seen or heard that Microsoft is improving anything because of WER. Maybe they should release stats." In fact, many survey respondents had good suggestions for sharing information about WER, including a Knowledge Base page listing reported crashes and fixes, a blog to discuss WER issues, and email acknowledgments.

Ben expressed appreciation for these suggestions and said, "A lot of thinking is going on about that. We need to do more education and provide better documentation and better statistics on what we're doing so customers can understand how we're using WER information. We've been somewhat reluctant to do that so far, but I think we really are going to push more. We have ideas for future versions of WER to immediately show customers how we're making improvements."