The SCOE collaborates with customers, partners, industry, and other teams at Microsoft to increase awareness, foster innovation, and expand the reach of its support on security topics.

Since the Slammer and Blaster worms hit in 2003, members of the SCOE have worked with many Microsoft corporate customers, dealing with difficult security problems. To put the problems in perspective, it's important to understand how Microsoft and our customers arrived in the situation that we found ourselves in last year.

A Little History

Five years ago, there was a strong push by our enterprise customers to achieve the highest levels of availability possible on Windows®-based systems. This effort, combined with the improvements in Windows 2000, resulted in a major shift toward actual availability targets. To support these targets, IT departments created Service Level Agreements (SLAs) for their Windows-based systems and worked diligently to document the level of availability that users could expect from their servers and workstations that run Windows. As Windows 2000 technology matured, along with the continual drive for efficiency in enterprise IT operations, Windows operating systems became mission-critical for many Microsoft customers.

For Microsoft customers, SLAs were also the catalyst that let them see IT as a cost that could be outsourced. This resulted in many companies investigating alternative options to drive efficiency. The mission-critical nature of Windows operating systems, combined with the increased expectation for availability and the push toward outsourcing, created an odd situation where the secure design, deployment, and maintenance of Windows systems were often an afterthought. Worse, security was often viewed as an impediment, especially by those in the eBusiness space.

In July 2001, the Code Red worm was unleashed, followed by Slammer, Blaster, and the more recent Sasser worm. The advent of automated, self-propagating worms that could exploit vulnerabilities in unprotected and unpatched systems wreaked havoc on many business networks. What is most interesting about these worms is that in every large-scale worm scenario, the technology existed to prevent the worms from succeeding. The worms generally spread due to process failures. In every case where customers were not impacted by these worms, their resiliency can be attributed to a robust IT security program that focuses on process rather than on technology.

Note that OS vulnerabilities have been exploited for a long time, but the wide acceptance of Windows as an enterprise platform and the advent of the Internet caught Microsoft in the cross hairs. It also served as a wake-up call to anyone working with Windows operating systems. Companies realized they had sacrificed security for the sake of convenient networking.

At Microsoft, the repercussions of Code Red spurred many of the security-related improvements that were seen in Service Pack 3 for Windows 2000, Service Pack 1 for Windows XP, and the secure-by-default design of Windows Server 2003. The consequences of worm-related security incidents served as a rallying point for many security professionals. Unfortunately, for those working with Windows operating systems, it took events on the scale of these insidious worms to call attention to the importance of secure system design, deployment, and maintenance.

Those of us focused on security at Microsoft know that the root cause of Blaster and other worms targeting Microsoft applications is software vulnerabilities. At the same time, we have learned from our own internal Microsoft IT organization, as well as from customers who have dedicated time and talent to exploring the risks posed by security incidents, that software-based vulnerabilities can be mitigated through an effective IT security program. Here are some of the lessons.

1. Identify and Classify Assets

The principles of security risk management have been recognized since the earliest days of mainframe computing. A simple inventory is a great place to start to determine exactly which systems exist and how they are connected to your network. There are several ways to perform an inventory. They include rudimentary network scanning tools, more complex log review techniques that draw on information in Active Directory®, Dynamic Host Configuration Protocol (DHCP), and DNS servers, and dedicated systems management tools.

It's important to note that in nearly every catastrophic incident that affected our business customers, the problem was not caused by a managed system but was introduced by a system outside of the IT group's control. It is imperative to create a process where you can quickly and effectively verify which systems are under your control, and then identify any unmanaged systems. The identification of these systems generally represents the most difficult task for the IT staff to accomplish, but it is a necessary effort.

Several Microsoft customers that successfully identified their unmanaged systems relied on a tightly integrated virtual team composed of members from IT operations and the networking group.

To begin, the IT operations staff requested that the networking team identify all IP addresses in use on the company's networks. The IT team then began subtracting known IP addresses of managed systems from the list. The addresses that remained were not accounted for and thus identified the systems that needed to be addressed first. After all systems were identified, they were classified into subgroups to make them more manageable. The first step was separating the managed systems from the unmanaged ones.
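The subtraction step described above can be sketched as a simple set difference. This is a minimal illustration only: the function name and the sample addresses are hypothetical, and a real environment would feed in lists exported from the networking group and from a systems management tool.

```python
from ipaddress import ip_address


def find_unaccounted(in_use, managed):
    """Return addresses seen on the network that no managed system claims."""
    in_use_set = {ip_address(a) for a in in_use}
    managed_set = {ip_address(a) for a in managed}
    leftover = in_use_set - managed_set
    # Sort by IP version, then numeric value, so output is stable and readable.
    return sorted(leftover, key=lambda a: (a.version, int(a)))


# Hypothetical sample data: addresses reported by the networking team,
# and addresses of systems that IT operations knows it manages.
in_use = ["10.0.0.5", "10.0.0.12", "10.0.0.40", "10.0.1.7"]
managed = ["10.0.0.5", "10.0.1.7"]

unaccounted = [str(a) for a in find_unaccounted(in_use, managed)]
print(unaccounted)
```

Parsing each entry with `ip_address` rather than comparing raw strings means that a malformed entry raises an error instead of silently appearing as an unmanaged system.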

Subgroups were defined based on a system's role, its location, or other attributes that could be consolidated to eliminate the unwieldy problem of attempting to secure each individual system.
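Grouping systems by shared attributes might look like the following sketch. The attribute names (`managed`, `role`, `location`) and the sample systems are assumptions chosen for illustration; an actual inventory would define its own fields.

```python
from collections import defaultdict


def classify(systems, keys=("managed", "role", "location")):
    """Group system names by the tuple of attribute values named in keys."""
    groups = defaultdict(list)
    for system in systems:
        group_key = tuple(system[k] for k in keys)
        groups[group_key].append(system["name"])
    return dict(groups)


# Hypothetical inventory records.
systems = [
    {"name": "web01", "managed": True, "role": "web", "location": "HQ"},
    {"name": "web02", "managed": True, "role": "web", "location": "HQ"},
    {"name": "lab-pc", "managed": False, "role": "workstation", "location": "Lab"},
]

groups = classify(systems)
for key, names in groups.items():
    print(key, names)
```

Placing `managed` first in the key keeps the managed and unmanaged systems in separate groups, and each resulting group can then be secured as a unit rather than one system at a time.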

2. Establish Asset Ownership

Usually the asset owner is either the system administrator or the support group tasked with system maintenance. In the SCOE's experience, a successful IT asset ownership assignment rarely results in an IT staff member being assigned as the person responsible for system maintenance. Crossing the boundary from IT staff to business group ownership and responsibility for IT systems marks the beginning of a successful ownership assignment program. Based on experience, most IT professionals can easily identify examples where a line-of-business (LOB) application served as a blocker to deploying a critical system update. Shifting the responsibility for securing the IT asset to the business group that relies on the system is an important step in ensuring the success of this process.

Key Lessons

For IT professionals, the central principles of any effective program designed to manage the risk associated with software-based vulnerabilities are:

1 Identify and classify assets

2 Establish asset ownership

3 Define baseline system requirements

4 Measure compliance

5 Enforce compliance

Microsoft, like most large global organizations, faced a situation where business owners would resist system updates. At one point, there were possibly thousands of our internal systems that could not be properly secured or maintained because of the reluctance to stop services long enough to change security settings or perform system updates. Over time, these exceptions were reduced to hundreds, and eventually reduced to tens. Currently a business-group exception for a system must be escalated to the highest levels of a business unit in order to be approved.

Creating a Responsible, Accountable, Consulted, Informed (RACI) matrix for assets is another great exercise in ensuring that assets are properly accounted for and key stakeholders identified. Integrating this information into a Configuration Management Database (CMDB) results in a very powerful data set that can help IT staff and business leaders make informed decisions about when to force the issue of updating a particular system or establishing improved system maintenance processes for line-of-business applications.
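One way to picture a RACI matrix as data is a mapping from each asset to the parties holding each role, which can then be checked for gaps. The sketch below is illustrative only: the asset and group names are hypothetical, and the rule that each asset has exactly one Accountable party is a common RACI convention assumed here, not something prescribed by the SCOE.

```python
def raci_gaps(raci):
    """Return a dict of assets whose RACI entries look incomplete."""
    gaps = {}
    for asset, roles in raci.items():
        problems = []
        # Assumed convention: exactly one Accountable party per asset.
        if len(roles.get("A", [])) != 1:
            problems.append("needs exactly one Accountable")
        if not roles.get("R"):
            problems.append("needs at least one Responsible")
        if problems:
            gaps[asset] = problems
    return gaps


# Hypothetical matrix: R/A/C/I lists per asset.
raci = {
    "payroll-db": {"R": ["IT Ops"], "A": ["Finance"], "C": ["Security"], "I": ["HR"]},
    "legacy-lob": {"R": [], "A": [], "C": ["Security"], "I": []},
}

gaps = raci_gaps(raci)
print(gaps)
```

Records like these could be stored alongside configuration data so that an asset with no accountable business owner stands out before an incident rather than during one.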

3. Define Baseline System Requirements

Many IT professionals work in companies that have clearly written security policies, but relatively few confirm that those policies are applied uniformly on production systems. The alignment between security policies, system build standards, and actual implementation is extremely important. The key requirements that need to be identified can be information such as required settings, required software and services, and required updates. There must also be key prohibitions that outline what is not an acceptable configuration.
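A baseline of this kind can be expressed as data and compared against what a system actually reports. The following is a minimal sketch under stated assumptions: the `Baseline` class, the field names, and the sample values (including the update identifier) are all hypothetical and do not correspond to any particular Microsoft tool.

```python
from dataclasses import dataclass, field


@dataclass
class Baseline:
    required_settings: dict
    required_software: set
    required_updates: set
    prohibited_software: set = field(default_factory=set)


def check(system, baseline):
    """Return a list of human-readable deviations from the baseline."""
    findings = []
    for key, want in sorted(baseline.required_settings.items()):
        got = system["settings"].get(key)
        if got != want:
            findings.append(f"setting {key}: expected {want!r}, found {got!r}")
    for name in sorted(baseline.required_software - system["software"]):
        findings.append(f"missing software: {name}")
    for update in sorted(baseline.required_updates - system["updates"]):
        findings.append(f"missing update: {update}")
    for name in sorted(baseline.prohibited_software & system["software"]):
        findings.append(f"prohibited software: {name}")
    return findings


# Hypothetical baseline and system report.
baseline = Baseline(
    required_settings={"firewall_enabled": True},
    required_software={"antivirus"},
    required_updates={"UPDATE-001"},
    prohibited_software={"p2p-client"},
)
system = {
    "settings": {"firewall_enabled": False},
    "software": {"p2p-client"},
    "updates": set(),
}

findings = check(system, baseline)
for line in findings:
    print(line)
```

Keeping requirements and prohibitions in one structure makes it easier to confirm that the written policy, the build standard, and the deployed configuration all describe the same thing.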

As with asset ownership, accountability and responsibility for the baseline system requirements should lie with the business group that relies on the IT system. If exceptions are sought, each business group should follow a process to ensure that the appropriate risk is analyzed and communicated to all participants and business leaders before they are granted.

However, this should not be interpreted to mean that the only solution is the default model for system builds. Within Microsoft, we have standardized builds that employees may use for their workstations, but anyone can install other platforms or versions of software. The onus is then placed on the user to ensure that the system meets established baseline requirements for configuration settings, required software (such as antivirus protection), and required updates. Flexibility is possible, but should only be allowed within a carefully monitored and maintained environment.

4. Measure Compliance

In the SCOE's experience, effective security programs for large businesses rely on an automated toolset for the reporting of managed-system configuration status, the identification of unmanaged systems, and the mapping of each system to an appropriate business owner who is ultimately responsible for the system's adherence to the established system baseline. As with any complex process, measuring compliance requires periodic audits with formal targets to monitor the effectiveness of the program. The reports should officially serve as an indication of a company's IT health in addition to a warning light, helping those responsible for IT systems make informed decisions about which systems are at greatest risk and which settings are inconsistent with the baseline system configuration.
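A report that maps systems to business owners and compares each owner's compliance rate against a formal target could be sketched as follows. The record layout, the owner names, and the 90 percent target are assumptions made for illustration; they are not figures from the SCOE.

```python
from collections import Counter


def compliance_by_owner(results):
    """Return the fraction of compliant systems for each business owner."""
    totals, passing = Counter(), Counter()
    for record in results:
        totals[record["owner"]] += 1
        if not record["findings"]:  # no deviations means compliant
            passing[record["owner"]] += 1
    return {owner: passing[owner] / totals[owner] for owner in totals}


def below_target(rates, target):
    """Return owners whose compliance rate falls short of the target."""
    return sorted(owner for owner, rate in rates.items() if rate < target)


# Hypothetical audit results, one record per system.
results = [
    {"system": "fin-01", "owner": "Finance", "findings": []},
    {"system": "fin-02", "owner": "Finance", "findings": ["missing update"]},
    {"system": "sales-01", "owner": "Sales", "findings": []},
]

rates = compliance_by_owner(results)
flagged = below_target(rates, target=0.9)  # assumed target of 90 percent
print(rates, flagged)
```

Aggregating by owner rather than by system puts the result in front of the business group that is accountable for it, which is the point of the ownership mapping.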

5. Enforce Compliance

In the IT culture that predominated before the appearance of automated worms, the thought of "pulling the plug" on any business-critical system would have been considered heretical. When businesses focused more on availability than robustness, shutting down an LOB server to electively bolster security would result in a lot of flak. Heaven forbid that a CEO's laptop should temporarily lose connectivity due to maintenance, leaving him unable to read e-mail.

The SCOE handled just such a case recently with a customer. In a follow-up meeting with the customer's IT staff, there were tense moments when the customer's CEO expressed his frustration about being unable to connect to the intranet while technicians worked on resolving the problem. However, after reading the news headlines the next day about the impact of the same virus on several of his company's competitors, he appreciated that his business had been protected by the procedures that he had cursed the day before.

The benefits of enforcing security policies and assuring adherence to baseline standards may not always be apparent, but with relatively few exceptions, new security processes have improved IT departments and changed how they are perceived and used within large organizations.

Securing Your Environment

A myriad of technologies exist to help you protect your environments, but without a robust procedure to ensure the integrity of systems connected to your networks, none will be effective.

Working toward the establishment of a well-managed IT environment takes discipline and consistency. The Security Center of Excellence is committed to sharing with all of our customers the lessons that we learn from our own experiences, as well as the best practices that we prove within our internal IT environments. For more information on security for Microsoft systems, talk with your Microsoft relationship manager or visit the Microsoft Security Home Page.

Aaron Turner is the Microsoft Security Center of Excellence Delivery Manager, and works with Microsoft Services to coordinate security consulting activities around the world. He has worked with customers of all sizes to develop effective IT security programs.