NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible for the report were chosen for their special competences and with regard for appropriate balance.

This report has been reviewed by a group other than the authors according to procedures approved by a Report Review Committee consisting of members of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine.

The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Frank Press is president of the National Academy of Sciences.

The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievements of engineers. Dr. Robert M. White is president of the National Academy of Engineering.

The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Samuel O. Thier is president of the Institute of Medicine.

The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy's purposes of furthering knowledge and advising the federal government. Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Frank Press and Dr. Robert M. White are chairman and vice chairman, respectively, of the National Research Council.

This project was supported by the Office of the Assistant Secretary of Defense (Force Management and Personnel).

Preface

In 1981, the four military Services, in an effort to improve their control over manpower quality in the enlisted ranks, launched a pioneering research program to develop measures of job performance so that, for the first time, enlistment standards could be linked to performance on the job. The Joint-Service Job Performance Measurement/Enlistment Standards (JPM) Project, as it is called, is being carried out by each Service under the overall direction and coordination of the Office of the Assistant Secretary of Defense for Force Management and Personnel. In 1983, the Committee on the Performance of Military Personnel was established within the National Research Council to act as an independent adviser to the Department of Defense on the JPM Project. At the request of its sponsors, the committee has given attention to the potential usefulness of the JPM research for personnel decisions and manpower management.

The Department of Defense decided to undertake the JPM Project as a result of difficulties caused by a technical error in scoring the test that is used throughout the military to determine enlistment eligibility: the Armed Services Vocational Aptitude Battery (ASVAB). The error had the effect of inflating test scores, so that approximately 250,000 young men and women who did not actually meet the entrance standards were inducted between 1976 and 1980. The issue was complicated by concerns about the success of the all-volunteer force, because some of the Services had been having trouble meeting recruiting goals in the late 1970s. As a consequence of the test misnorming, policy makers in both Congress and the Department of Defense became interested in establishing the relationship of the ASVAB to actual job performance. The JPM Project was the Department's response to those concerns.

The first phase of the JPM Project was to determine whether accurate, valid, and reliable measures representative of job performance could be developed, and how well the current enlistment procedures, including the ASVAB, predict these approximations of job performance. The second phase, now well under way, is to develop ways to set enlistment standards using the new job performance data. More specifically, the Department is exploring the use of cost/performance trade-off models to give the standards-setting process a more solid empirical foundation.

The primary focus of the committee in the early years of the JPM Project was the overall research design and the development of instruments to measure job performance. Later, the focus turned to problems in hands-on test administration, controlling for unreliability of measurement, the relationships among the various new performance measures, and extending the research findings to a larger set of military jobs. In order to place the research in context, the committee also learned about military entrance processing, entry-level jobs in the military, technical training, and the general outlines of how entrance standards are set. Committee members made a series of site visits to Army, Navy, and Air Force bases to see enlisted personnel at work, to talk to their supervisors about the content of entry-level jobs, and to observe test administration procedures. Subgroups of committee members made a number of trips to military personnel research laboratories to gather information. To facilitate an interchange of ideas, the committee invited JPM Project scientists as well as other experts to explore solutions to specific technical problems in a series of workshops. And, as supplements to its activities, the committee has called on outside experts to prepare background materials on various aspects of the issues involved.

Since 1983, a series of reports has been delivered periodically to the Department of Defense on various aspects of the JPM Project. The final report, which is a companion to this volume, summarizes what the committee learned from analyzing the JPM experience. It begins with a historical overview of the criterion problem and a discussion of the conceptual approach and general research design of the project. It then looks closely at specific issues: the development of performance measures; sampling, logistical, and standardization problems; evaluating the quality of performance measurements in terms of reliability and content representativeness; the relationship between test scores and criterion measures; and management of human resources. The committee hopes that the insights and information contained therein will be of value to an audience wider than the military services, including policy makers, members of the testing community, employers concerned with performance assessment, and, given the new currency of performance assessment in the education arena, the many school officials, educators, and policy makers involved in education reform.

This volume contains some of the most valuable papers that were prepared for the committee. In them, the authors helped the committee and the JPM research scientists think through the technical challenges raised by attempts to develop criterion measures for a sample of jobs that could be made meaningful to the universe of jobs in the services. Some focus on approaches to performance measurement and analysis of the job performance data; others deal with broader issues involved in comparing multiple measures and generalizing from a small sample of jobs. Taken together, they provide those interested in the technical details of the JPM Project with a closer look at some of the problems, challenges, and possible solutions. We sound their themes in the paragraphs that follow.

Robert Glaser, Alan Lesgold, and Sherrie Gott, in a paper that looks to the next generation of performance measurement, discuss the methodology needed to measure the cognitive aspects of job performance. The large number of highly technical jobs and the short periods of enlistment in which both training and useful performance must take place make the problem especially severe for the services. Cognitive psychology has produced a variety of methods that can be sources of a new set of measurement methodologies; the authors' application of these techniques in developing a cognitive task analysis procedure for technical occupations in the Air Force is the basis for their conclusions. They present a cognitive account of the components of skill, discuss the specific measurement procedures employed, and consider which aspects of measurement in the services can best use these approaches.

The measurement method of greatest interest in the JPM Project is the work sample. Frederick D. Smith presents a review of the work sample literature, with particular attention paid to the theory underlying work sample testing, the use of work samples as criterion measures, the adverse impact of work samples, and measurement issues associated with such tests. Considering both their advantages and disadvantages, he concludes that the research on work sample testing suggests that work samples can produce high predictive validities and that, when used as criteria, they compare favorably with supervisor ratings and productivity measures.

In a paper drafted on behalf of the committee, we discuss the meaning of assessing competency or job mastery and consider ways of establishing such interpretations and using the results. We suggest that, to effectively communicate information about the performance of enlisted personnel and the implications of changing standards, the scoring scale of the job performance tests needs to be given some sort of absolute meaning. Policy makers would be better able to make informed judgments about what distribution of “quality” in the recruit cohort is acceptable and what is unacceptable if performance scores could be interpreted in terms of what the job incumbent who scores at each level is able to do. We illustrate this argument by analyzing a simple model for setting entrance standards.

Whereas many of the papers in the volume are concerned with developing more adequate measures of job performance, Linda S. Gottfredson explores strategies for evaluating alternative kinds of criterion measures. Today's challenge, she argues, is to develop procedures for comparing the relative utility of alternative measures for a given purpose. Gottfredson presents interesting suggestions for assessing the types and degrees of similarity and difference among criterion measures. Although she focuses on evaluating job performance measures in their role as criteria for developing personnel selection procedures, her approach has more general applicability.

Stephen B. Dunbar and Robert L. Linn provide an overview of standard procedures used to adjust correlations and regression parameters for the effects of selection, commonly referred to as corrections for range restriction. Technical issues related to the accuracy of these adjustments are considered, especially where they are likely to have implications for the types of adjustment procedures appropriate for large-scale predictive validity studies of an aptitude battery like the ASVAB. The authors conclude with a discussion of issues related to the implementation of a set of adjustment procedures for validation studies in the military, where the choice of the reference population, choice of selection variables for making adjustments, and choice of an analytical procedure all have important consequences for the assessment of the validity of the ASVAB for predicting performance on the job.

Linda J. Allred considers alternatives to the validity coefficient for reporting the relationship between test scores and performance. The validity coefficient indicates the overall strength of the test-criterion relationship for the groups being studied, but its meaning is obscure to a nontechnical audience. Using several sets of hypothetical data, Allred illustrates various display methods, including the scatter plot, the box-and-whisker plot, expectancy methods (chart, table, and plot), and the frequency table, and describes their strengths and weaknesses.

Richard J. Shavelson lays out a statistical theory of the multifaceted sources of error in a behavioral measurement. Called generalizability (G) theory, the theory has heretofore been applied to traditional measurements such as aptitude and achievement tests. Shavelson provides an example of how G theory can be applied to military job performance measurements, using hypothetical data, as well as specific applications of the theory, chosen to highlight the theory's flexibility in modeling a wide range of measurements.

associated with the use of hands-on tests of job performance. The first concerns methods for setting standards of minimally acceptable performance on the tests. The second involves procedures for eliciting and combining judgments of the values of enlistees' behaviors on military job performance tests. The third concerns procedures for using enlistees' predicted job performance test scores and the judged values associated with those test scores in classifying enlistees into military occupational specialties. The first, for which the most research is available, is discussed in considerable detail; discussion of the second is comparatively brief; and discussion of the third is illustrative rather than definitive.

Paul R. Sackett considers approaches to extending validity findings and empirically based predictor cutoffs beyond the 27 jobs chosen for inclusion in the JPM Project to the universe of military occupational specialties. He notes that the purpose for which a job analysis is being done or for which jobs are being compared is often ignored, and that the choice of job descriptor has an important impact on decisions about job similarity. No single approach is recommended; rather, a number of possibilities are examined.

To all these authors the committee is grateful for turning their knowledge and experience to a number of novel and exceedingly difficult technical issues confronting all of those who would address the criterion problem seriously. They have enriched the advice that the committee provides to the Department of Defense, applying the sciences of psychometrics, testing and performance measurement, and industrial psychology to the problems raised by the JPM Project. We commend this volume to those who wish to expand their understanding of the issues, challenges, and advances generated by one of the largest and most important studies of job performance on record.
