NIST/SEMATECH Engineering Statistics Handbook

Summary:

The NIST/SEMATECH e-Handbook of Statistical Methods is a Web-based book written to help scientists and engineers incorporate statistical methods into their work as efficiently as possible. Ideally, it will serve as a reference which will help scientists and engineers design their own experiments and carry out the appropriate analyses when a statistician is not available to help. It is also hoped that it will serve as a useful educational tool that will help users of statistical methods and consumers of statistical information better understand statistical procedures and their underlying assumptions, and more clearly interpret scientific and engineering results stated in statistical terms.

Description:

The project began with a request from SEMATECH, a consortium of major U.S. semiconductor manufacturers, to update the National Bureau of Standards (NBS) Handbook 91, Experimental Statistics. Handbook 91, written by Mary Natrella of the NBS Statistical Engineering Lab, was a best-selling NBS publication for many years. Engineers and scientists in a variety of fields appreciated it because of its problem-oriented approach to statistics and its detailed examples. The examples of each statistical procedure recommended in the text were also accompanied by fill-in-the-blank worksheets, allowing the reader to quickly and easily repeat the calculations with his or her own data. By the 1990s, however, the emphasis on hand calculations was too dated to be practical and many modern statistical methods were missing from the text, prompting SEMATECH's interest in updating Handbook 91 for the use of its member companies.

A joint NIST/SEMATECH project team was assembled to explore the idea, develop a formal project proposal, and carry out the project. Because the Internet was growing rapidly while the project proposal was under development, the project quickly evolved from the publication of a new edition of a traditional book to the development of an online handbook for distribution via the World Wide Web. The advantages of Web distribution included easy access by users all over the world, the ability to integrate the software necessary to use the different statistical methods right into the text, and the opportunity to create an easily expandable resource.

The development of the e-Handbook's new format and content was carried out using a top-down approach. The team first laid out the scope of the new handbook and a detailed outline of its content. The outline was designed to lead the user hierarchically from the general topics covered to the specific information needed, avoiding statistical jargon as much as possible. The eight chapters in the top level of the outline are:

Exploratory Data Analysis

Measurement Process Characterization

Production Process Characterization

Process Modeling

Process/Product Improvement

Process/Product Monitoring

Process/Product Comparisons

Assessing Product Reliability

In addition to the main outline, several other methods of accessing the text were laid out to try to make the information in the e-Handbook as accessible as possible for users unfamiliar with the traditional organization of information in the statistical literature. Some of these alternative access methods include engineering questions linked to flow charts showing the steps necessary to complete a statistical analysis appropriate to the question, along with indexes of examples and search capabilities. Since another major goal of the new Handbook was to maintain a practical, problem-oriented approach to statistics, common structures such as a section of detailed case studies using real data from the semiconductor industry and the NIST laboratories were included in each chapter. Standard page formats for each type of page in the Handbook were also carefully developed to improve readability and to make navigation transparent.

Finally, after completing the high-level layout of the entire book, individual team members were assigned to each chapter to fill in the framework developed by the team. Of course, developing a stylistically coherent technical publication with multiple authors, while efficient in some ways, is quite a challenge in others. Fortunately, the team found an appropriate editor in Tom Ryan, who diligently read and marked up the entire text to help ensure that all the chapters of the e-Handbook read with a (reasonably) common voice. Readers of the beta version of the e-Handbook also provided many useful comments and corrections.

The approach taken toward integrating statistical software with the Handbook was more bottom-up than that used for updating the Handbook itself. The project team realized from past experience in teaching and consulting that different users like different software. Persuading a user to switch from one package to another generally requires compelling reasons, since it costs the user not only money but, more importantly, time. The team also recognized that writing new statistical software that would be universally available across platforms (software written in Java, for example) would add greatly to an already ambitious project. As a result, the vision for software integration focused on development of an open system. Under this model, the project team integrated one statistical package with the Handbook and established a framework that other software providers can use to integrate their software as well.

The software chosen for integration with the Handbook was Dataplot, a free downloadable statistical package maintained by the NIST Statistical Engineering Division. One of the primary reasons that Dataplot was chosen as the prototype software was the ability of the Handbook team to easily make any changes in the source code needed to improve integration. Another important consideration was its wide accessibility, allowing almost any user to take full advantage of the Handbook. The team produced Dataplot macros to carry out the various analyses used in the case studies.

A current effort is to provide R scripts for the analyses and case studies given in the e-Handbook. R is an open source statistical software program that has been widely

Major Accomplishments:

Publicly released June 2003.

Approximately 8,000 CDs of the e-Handbook have been distributed.

The e-Handbook web site averaged approximately 800,000 hits per month from September 2009 through August 2010.

End Date:

Product publicly released; maintenance and updates are ongoing.

Lead Organizational Unit:

ITL

Customers/Contributors/Collaborators:

Barry Hembree, Jack Prins, Pat Spagon, Paul Tobias, and Chelli Zey, from International SEMATECH and International SEMATECH Member Companies.