Book review: Fundamentals of Polygraph Practice

It has been said that when reviewers praise poor work, the mistake will eventually surface but when reviewers pan good work, the error may never be discovered. This assumes that reviewers have a significant influence over what is read, an unlikely assumption when it comes to technical writings. Fundamentals of Polygraph Practice, unequivocally, is a work all examiners can read with the expectation of learning something new. It is also a handy reference for numerous, basic polygraph issues.

The opening chapter, a history of “lie detection,” is a useful overview of much that has been written before with a number of interesting additions. This is followed by a chapter by Joel Reicherter on Anatomy and Physiology. This chapter is thorough and does not require any special educational prerequisites to easily understand.

Chapters 3, 4 and 5, dealing with test question construction, data collection and analysis of polygraph data, respectively, are straightforward presentations of material that would be found in most instructional documents used in training schools. In fact, it appears to us that much of what is found here was drawn directly from that material. For example, statements such as: “Adjust the gain so that the tracing amplitude is about three-quarters of an inch” and “The right amount of cuff pressure during testing is about 60-70mm Hg.” are representative of what is found in these chapters.

There are some evaluative and useful comments found in these chapters but in the main they are pretty prosaic stuff. However, some might find a special interest in explanations of the common problem of tracings that look like deceptive responses when they are not, e.g. how increasing the pressure in a blood pressure cuff, pressing down on the EDA electrodes or simply taking a deep breath result in significant EDA “responses.” While the photoplethysmograph is discussed as if it were a parameter equal to the standard three, it would have been helpful to mention the reason why it is not commonly used and why its use is not required by the APA: it simply doesn’t offer enough useful data to enhance correct test decisions, though this point is later generalized in subsequent material.

Chapter six, dealing with polygraph screening examinations, begins with a very unbalanced rebuke of how private examiners’ greed and abusive practices resulted in the EPPA. While those topics were mentioned in the dispute, it was the political pressure of labor unions and the severe, and mostly unsupported, assertions of academic critics of polygraph screening on this and other anti-polygraph policy, such as the “Police Officer Bill of Rights,” that were far more telling in the passage of the legislation. In this chapter the authors also indicate that multiple issue screening tests have not been shown to produce high accuracy. They suggest, therefore, that the terms “significant response” and “no significant response” be used in lieu of “deception” and “no deception indicated.” While this is accepted practice in some, but not all, situations, what is missing here is any explanation of why the grammatical artifice is needed. In screening exams generally there are no case facts, no complainants, no physical or other evidence nor any other diagnostically helpful information of the kind typically available in event-specific examinations. In short, and in general, screening examinations typically suffer from the lack of a helpful context. Moreover, the relevant questions are very general and may appear to be somewhat ambiguous, precisely the qualities desired in comparison questions. In fact, many of the relevant questions included in screening examinations could be used as comparison questions in specific issue testing. Moreover, how they are introduced and subsequently understood by the examinee varies widely from one situation to the next. It is issues such as these and the role they play in producing accurate testing outcomes that need attention; that attention is not provided in this chapter. The authors propose the use of a “Successive Hurdles” solution to the shortcomings in screening settings. While the idea seems to have some merit, it is demanding of time, staff, financial and other resources and is impractical in many field settings, perhaps with the exception of governmental environments. In addition, the value of such a solution is unknown and difficult to assess in real life situations since ground truth is usually not known or knowable.

Chapter seven covers the topic of specific issue testing techniques and provides the reader with a general overview of some standard protocols with a focus on three favored procedures: the Federal Zone Comparison Technique, the Utah Probable Lie Technique and the Air Force Modified General Question Technique.

In chapter eight, the use of recognition tests, applied in event-specific situations, is presented in an uncomplicated and easy to follow way. It is curious to note that in this chapter, the authors explain how the previous terminology for this approach, the Guilty Knowledge Test, evolved into the present day Concealed Information Test. They fail to follow this same pattern in explaining how the terminology for a stimulation test became the acquaintance test. While the term “stimulation” appears in the titles of cited references, this term is simply omitted as a test type from the text, Glossary and index. The same inconsistency appears in the use of the term “control” to describe a type of question in recognition testing without mentioning that throughout the polygraph literature the term “control question” was used far more often to refer to Reid’s original contribution even though Reid himself initially used a “comparative response” appellation. Finally, on the same point, the terms “global evaluation” and “global analysis” appeared in the literature at least as early as 1982 and have been discussed in detail at APA Seminars as recently as 2013. The authors not only fail to present this historical and traditional use of “global evaluation” but also neglect to use it to limn one of the principal points of difference in schools of thought about polygraph testing.

Chapter 9 is devoted to a short discussion of “scientific issues.” The coverage here is focused on what might be seen as primary concerns; these include validity, reliability, the effect of base rates and an overview of some of the extant “theories” regarding polygraph testing, particularly the Comparison Question Test.

Gordon Vaughn’s Legal Issues, Chapter 10, with a noted contribution from New Mexico Judge Charles Daniels, is concise, well annotated and succinct. It leads to the conclusion that polygraph admissibility in the United States has less to do with proof of scientific accuracy than with other factors primarily important to the judiciary. Unfortunately, the chapter lacks discussion of American employment law even though we learn in the text that screening tests are the most prevalent type and a section of the text (Appendix A) provides a complete overview of the Employee Polygraph Protection Act (EPPA). Certainly examiners would benefit from a presentation of options on how to reconcile conflicts among employment law case decisions, Equal Employment Opportunity Commission (EEOC) directives and the American Polygraph Association (APA) Standards of Practice. A basic understanding of the legality of employment practices is more important today than ever since pre-employment polygraph testing is increasing as more restrictions are being placed on alternative information gathering methodologies. In addition, many federal agencies previously exempt from the numerous federal employment laws are now being required by Executive Order or agency policy to conform to the same laws and restrictions other government and private employers have had to negotiate. It might also have been useful to forewarn examiners about possible legal attacks on polygraph scoring algorithms as is currently happening with certain forensic techniques.

While no text can possibly include everything about polygraph testing, there are some critical omissions as seen, for instance, in chapter 11 dealing with “advanced topics.” While going into great detail about how to address dwarfs or how to place attachments on subjects with a prosthesis, nowhere do the authors describe how to properly place and use audio/visual equipment, seemingly dismissing the topic as trivial. There are both positive and negative effects in the use of a/v equipment and, of course, there are legal restrictions under various state eavesdropping laws. Also, there are other important issues that examiners should be aware of in the use of a/v equipment. We would submit that some of these are more important in practice than some of the points made to seem significant in this chapter.

In their discussion of the Marin Protocol, a topic that seems to have little interest in recent days, the authors suggest a single method to establish examiner competency: cite a validity research study in which he or she has participated and demonstrated an ability of 86% or better at blind chart analysis. What’s missing here, among other things, is the more democratic option of simply having an examiner wishing to qualify as competent in “chart interpretation” achieve a specified accuracy level by analyzing a random sample of a number of verified charts, a procedure that one of the book’s authors actually administered for many years in promotion of the Marin Protocol.

In their discussion of using interpreters, the authors suggest a procedure requiring the examiner and interpreter to use question cards (cards on which test questions have been written down, word for word, indicating what the interpreter should ask) rather than simply to cite a letter or number referring to the desired question in a listing. The problem with what the authors recommend is that when the examiner unexpectedly needs to change the question order or needs to add an extra irrelevant question, the use of question cards is awkward. The chance of creating undesired artifacts increases when there are noises created behind the subject’s back or odd delays as the examiner and interpreter shuffle and pass cards between themselves. There is no reported evidence of undesired artifacts when an examiner first speaks a question reference letter or number before the interpreter reads the question, the time-tested method for conducting tests with interpreters. Also, it is difficult to square the authors’ explanation of placing interpreters who sign in front of the subject but language interpreters behind the subject. It has been our experience that locating the language interpreter out of the subject’s direct line of sight, to the side but not behind, produces more than satisfactory examination results.

There are several perplexing omissions in this book, not the least of which is any meaningful discussion of field studies and practices as they actually apply in field settings. This results in an overreliance on laboratory studies to justify conclusions. For example, in Chapter six, there is a detailed narrative suggesting use of pre-recorded, automated question presentation as a useful practice. While there might be merit in doing this, there is little evidence showing a significant advantage in field conditions. Similarly, while the authors provide a script for introducing “Directed Lie Comparison Questions”, there is no such script provided for the use of the far more common probable lie comparison questions. Clearly, as observant examiners know, there are critical differences between examiners in the way probable-lie comparison questions are introduced and “worked up,” a term the authors use but fail to define. Laboratory studies typically employ a very rote approach in an attempt at “standardization” while field studies detail a more clinical approach tailored to individual subjects and unique case facts. Such an approach is hinted at in this book but it is left to the reader to determine how, for example, one would determine whether the Goldilocks test has been met, that is, how one would know in advance of testing if a probable lie question is “too hot” or “too cold.”

Then there is the elephant in the room. While the authors warn against procedures unsupported by research, they proceed to recommend the opposite or at least to suggest that doing so is okay. For example, they point out, correctly, that the use of either a sacrifice relevant question or a symptomatic question is not supported by evidence in either case. Yet both of these question types are included in the recommended techniques without any notice of the possible effects of the included questions, positive or negative. Similarly, the authors clearly indicate that the directed lie procedure should be relegated only to screening tests where there are no diagnostic opinions but, in the glossary, they point out that one of their recommended “techniques” for diagnostic purposes makes use of the directed lie approach.

The closing chapter of this book might be the one of greatest interest to those who already know the mechanics of conducting a proper polygraph examination but don’t know what alternatives are now being considered. The 2003 report on “lie detection” by the National Research Council told us that in spite of the shortcomings in polygraph testing there does not seem to be anything on the horizon that is ready to replace it. Those technologies and methods that seem most likely to have that potential, though, are briefly reviewed in this chapter. Some of these might be seen as complements to and others as substitutes for the polygraph. Included here are such things as measures of brain activity (fMRI, ERPs), thermal imaging, and laser Doppler vibrometry, among others.

Because this book is devoted entirely to the topic of polygraph testing and it attempts to cover a range of topics related to the history, the underlying ‘theory’ and the processes involved in the administration of polygraph examinations, we feel compelled at the end of this review to offer a number of what are, to us, significant points with which we, and we think in some instances the evidence, disagree. In doing so we acknowledge that our training and understanding of some aspects of polygraph testing differ, or appear to differ, from that of the authors. We focus on only a few items of concern, those which to us represent points that should be of interest to persons new to the field of polygraph testing, the apparent intended audience of this book’s authors.

First, a small yet important correction. On page 16 the authors point out that the Frye decision in 1923 was a “case [that] was ultimately taken up to the US Supreme Court. On December 21, 1923, the Supreme Court rendered what became known as the Frye Decision (or General Acceptance Standard), denying Frye’s appeal and setting a standard for the admissibility of scientific evidence that would remain well in to the 1990’s.” This case is very well known in the polygraph community and, of course, across the forensic sciences. It has been widely discussed in recent years in light of the Daubert (1993) decision. Of importance here, though, is that the authors erroneously indicate that the Frye decision was handed down by the U.S. Supreme Court. This, of course, is not true. Fortunately, the careful reader will note that the correct information is provided in a subsequent chapter dealing with admissibility issues, though the conflict regarding the court decision is not evident. In our view the Frye case is so critical that readers ought not be misled as to its source.

One of our concerns regarding this volume has to do with the unevenness of the material that is covered. In some places the writing and the material is somewhat analytical and well-considered whereas in others, as we have pointed out, it is equivalent only to what might be found in examiner training documents. It is highly dogmatic and instructionally descriptive, often presented without a proper foundation or no foundation at all. Such a ‘how-to-do-it’ approach has a place but in this case it detracts from the text offered at a different level. The “polygraph-in-a-box” approach can be obtained from many sources online and while much of that may not be what is said to be “best practices,” without a proper foundation there is no reason to believe that the basics offered here are anything more than just accepted, not necessarily “best,” practices.

We have commented on this already but are compelled to follow up on what we’ve said because it is central to a foundational point regarding polygraph testing in real-life situations. The term “global” in one sense refers to relating to or embracing the whole of something, or a group of things. To us, it refers to a proper understanding of a polygraph examination, and how all of the major components that make it up (e.g., collection of factual information, examinee information, pre-test interview, polygraph testing, and, in some views, a post-test interview) fit together and interact with each other such that the basis for confident decision-making is evident. Even though it is the polygraph data themselves, properly collected, that are the principal source of data providing the basis for an outcome, they do not, without considering the context in which they are obtained, lead to the most accurate outcomes. One of us (SS), in fact, authored an entire article on this topic alone. In this article it was shown why global assessment is important, in contrast to these authors who use the term “global analysis” as a sort of disparaging term to refer to a desultory, unstructured, perhaps casual and informal review of collected polygraph data. The term in this reviewed book is defined in the glossary in two ways. First, as an “evaluation of the polygraph recordings as a whole, as opposed to making systematic comparisons among questions.” Second, global evaluation is also used to refer to a process that includes the “use of extra-polygraphic information…when rendering a polygraph decision.”

We don’t know of any procedure that makes use of the first method, although the authors state that “a form of global analysis” (p. 122) is applied when the testing involves Relevant-Irrelevant testing. Even here, however, there are typically systematic comparisons among questions, even though these may not be expressed in a formal way or with the use of numerical values.

We understand that some examiners assert that the use of extra-polygraphic information, data aside from what can be seen in the physiological data, ought not to be done, primarily because it is seen to be unscientific, unreplicable and subjective. We disagree; it is none of those. And, in our view, those who deliberately ignore such information are more likely to be in error in their outcomes and the empirical evidence, we think, clearly shows that. But, that is beside the point here. A book devoted entirely to the topic of polygraph testing that does not at the least consider what actually occurs during a polygraph examination, from the assessment of investigative information, the interaction between the examiner and the examinee, and how they relate to the outcome is not conveying what is fundamental to the process. While the prevailing view in the field might be that polygraph testing is strictly objective and “scientific”—relying exclusively on an assessment of collected physiological data—it is easily shown that that is not typically the case in fact in field settings.

With respect to the use of extra-polygraphic information it is commonly assumed that this includes observations of examinee behavior, often collected in what is referred to as a “structured pre-test interview” (SPI). The SPI makes use of stimulus items called “behavior provoking questions” or “behavioral observation questions,” terms used to refer to the same concept by different names depending upon which training facility is at issue. The SPI developed into what has now become known as the Behavior Analysis Interview (BAI), parts of which are taught in many polygraph training facilities, including the National Center for Credibility Assessment and the Canadian Police College polygraph training school. In spite of this widespread usage, this volume barely touches on the use or value of behavioral observations; in fact, it devotes approximately one of its 348 pages to the topic. We understand this omission in light of what we believe might be the authors’ preference for ignoring such information. Yet, as we have already stated, behavioral observations are part of our reason for preferring the idea of “global analysis” as a descriptive term referring to decision-making, not “chart” evaluation. To be clear, however, we do not advocate the use of global analysis to provide for a way to overrule what careful analysis of polygraph data reveals. We believe that properly applied global analysis is most useful for avoiding errors that sometimes occur even when polygraph data are analyzed as they should be but are for whatever reason misleading. In fact, one of the authors of this book (DK), along with another person, devoted an entire article to showing how in one case a serious error was avoided by careful attention to extra-polygraphic information. This, in our experience, can be seen as a regular observation in field testing. In addition, we note that almost all careful observers of the field research regarding CQ polygraph testing agree that field examiners’ decisions tend to be correct more often than those of blind evaluators of field-collected polygraph data. This, we believe, is because in actual field cases examiners make use of important diagnostic information that is not evident in analysis of polygraphic data alone.

We certainly favor the application of numerical scoring systems in polygraph testing for analysis of the collected physiological data. We also favor the use of automated computer scoring algorithms. In fact we welcome the use of any scoring-scheme that requires careful, assiduous attention to the data and that extracts diagnostic information from those data. In our view, though, the scoring of the data is a necessary but not sufficient basis for rendering a decision. Data evaluation and decision-making are two related but separate processes and when both are properly applied, the outcome is more apt to be correct than otherwise.

The authors write about “numerical scoring” as if it is a panacea for all shortcomings in field polygraph testing. They fail to note that such scoring, while valuable for some purposes, is not, in itself, sufficient justification for field decisions. Whatever method of numerical scoring is used in the field, the outcome, as we have already noted, ought to be guided by but not determined only by a “score” that reaches a specified threshold, as some advocate. Such scoring does not overcome the problems that surface when the test administration and the examinee’s perception of the situation are inconsistent with expected standards. In addition, the authors’ description of the genesis of numerical scoring is incomplete. They write as if what is now commonly known as “numerical scoring” simply emerged from nothing. In fact, what is not mentioned even in passing is that the idea of numerical scoring was derived from the work of the late Richard O. Arther. When Cleve Backster associated with Arther in the 1950s, Arther steadfastly advocated the use of a “check mark system” (which he learned from his association with John E. Reid) to “score” polygraph charts. This system requires an examiner to assess response data not with numbers but with “check marks,” each mark sized small, medium or large to indicate the intensity of a response to each test question to be “scored.” Backster simply modified this system by assigning numbers instead of check marks to indicate response intensity. He further developed a scale against which the numerical totals could be compared in order to render a “chart-based” outcome. In spite of the many shortcomings of this method it is still widely used and has been one of the developments that has enhanced the consistency with which physiological data are evaluated. However, we note that the scientific evidence, in spite of some claims to the contrary, is not clear with respect to the purported benefits of Backster’s system (or other similar systems) over other methods of assessing polygraph data.

In this book the authors present information relevant to three methods of specific issue CQT polygraph testing. These, they say, are their focus because they are “employed by the overwhelming majority of field examiners, and collectively have the most supporting research.” (p. 151). While we don’t know if this is true, we believe that to advance the idea (by implication) that some of the CQT methods of testing are distinct from others with respect to their accuracy and our knowledge regarding how CQT methods function is misleading.

Inspection of the APA’s meta-analytic report (it is worth noting that the two authors of this book were also co-authors of that report), included in this volume as Appendix 2 in a revised and summarized way, shows that the differences between procedures are actually small. There is no “technique” (as defined in the meta-analytic report) that is actually inherently more accurate than others. In fact, to our knowledge there is only one study in which two different Comparison Question “techniques” were assessed in the same conditions; the results showed no difference between them with respect to their accuracy. One of the things we have learned from the extant research, in our view, is that regardless of a “format” and a specified method of data analysis, as long as both are consistent with what has become accepted practice the outcomes don’t differ much; the way in which the testing is administered appears to us to be more determinative of differences than is the “technique” that is applied. Finally, one will find in this book’s glossary a definition of the term “technique” that we believe is far more consistent with our position on polygraph testing than is the way in which “technique” was defined in the APA’s meta-analytic report. This would suggest that the authors now see the situation differently from the way they did in their preparation of the meta-analytic report. This, we believe, is worth noting. If a reader is interested enough to explore this issue in some depth it will become evident that we need to know much more about polygraph testing than we do now in order to have much confidence in the dogmatic, doctrinal assertions found in this book as well as in other publications on “lie detection.”