Michigan State University

This is to certify that the thesis entitled THE EFFECT OF PLACEBOS AND FEEDBACK ON THE DETECTION OF DECEPTION presented by Howard William Timm has been accepted towards fulfillment of the requirements for Ph.D. degree in Social Science/Interdisciplinary

Major professor
Date June, 1979

THE EFFECT OF PLACEBOS AND FEEDBACK ON THE DETECTION OF DECEPTION

By
Howard William Timm

A DISSERTATION
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

College of Social Science/Interdisciplinary
1979

© Copyright by Howard William Timm 1979

ABSTRACT

The purpose of this study was to examine the effect of placebos and feedback on the detection of deception. The subjects were 270 volunteers enrolled in undergraduate Criminal Justice courses at Michigan State University during the Fall 1977 term. Each of the subjects committed a mock contract murder, after which the investigator administered a series of five lie detection tests in an attempt to ascertain the specific facts involved in the simulated murders. Subjects were awarded additional extra credit if they could successfully mislead the examiner on three out of the five tests.

Prior to the actual testing, 15 male and 15 female subjects were randomly assigned to each of the following groups: (1) placebo pass, feedback pass; (2) placebo pass, feedback fail; (3) placebo pass, feedback control; (4) placebo fail, feedback pass; (5) placebo fail, feedback fail; (6) placebo fail, feedback control; (7) placebo control, feedback pass; (8) placebo control, feedback fail; and (9) placebo control, feedback control.
Subjects assigned to the placebo pass and placebo fail subgroups were given a lactose placebo coupled with the suggestion that the "medication" would either help or hinder them in their endeavor to mislead the examiner, depending on the group to which they were assigned. Similarly, subjects assigned to the feedback pass and feedback fail subgroups were given arbitrary feedback concerning a "demonstration" card test in which they were led to believe they either "beat" the test or were correctly detected, depending on their respective subgroup. The placebo control and the feedback control subgroups did not receive those respective treatments.

The dissemination of placebos and the supervision of the mock murders were performed by research assistants, who worked independently from the polygraph examiner. The research assistants were informed that they were dispensing active medication to the subjects, and the examiner had no knowledge prior to testing regarding the specific facts involved with the subjects' mock murders.

A standard field polygraph was used to record the subjects' respiration and skin resistance responses (SRR). After the testing, the subjects' SRR and respiration patterns were scored using various objective procedures. Biographical, performance expectancy, and attitudinal data were also collected from the subjects during the experimental sessions.

The significance level for all statistical tests in the study was .05. Generally, the placebo and feedback conditions did not have a significant effect on the detection efficiency of the polygraph. Female subjects, however, did exhibit significantly less electrodermal activity than the males during the polygraph testing.

Several other findings were noted during the course of the analysis. Contrary to the results reported in some detection of deception studies, respiration was found to be as valid an indicator of deception as galvanic skin response.
It was concluded that the relatively high level of detection efficiency associated with respiration in this study may have resulted from the manner in which it was quantified. The levels of detection for the four physiological measures examined (respiration and three measures derived from SRR) were all significantly greater than chance levels. A low correlation (r ≯ |.1|) between respiration and the SRR measures supported the rationale for recording multiple physiological indices during deception testing.

It should be noted, however, that due to the nature of this study any inferences from it to field polygraph situations must be drawn with extreme caution.

DEDICATION

To my parents, teachers, and professors who encouraged and supported my research endeavors from my childhood years to the present.

ACKNOWLEDGMENTS

This study represents the culmination of a project I have longed to conduct since I was an undergraduate in college. I am deeply indebted to many individuals, without whom this project would still be nothing more than an elusive dream.

I appreciate the guidance and cooperation I received from the members of the University Committee on Research Involving Human Subjects at Michigan State University. I would be remiss if I did not mention Dr. Arthur Seagull, who selflessly donated several hours to training the research assistants to handle potential emotional problems related to this study. Dr. Thomas Adams graciously provided technical assistance concerning electrodermal physiology and measurement. Russell Carlson was of invaluable assistance in building the numerous mechanical devices used in this study. Dr. George Felkenes helped provide the necessary testing rooms when space was at a premium. W. Frank Pont, Neal Schmitt, and Bill Brown graciously provided computer and statistical assistance.
My close friend, Tom Austin, portrayed the priest in the slides used for the mock murders and provided both a place for me to stay and encouragement during much of the writing phase. The police officer, fireman, Army colonel, and hospital orderly who also volunteered to pose for the mock murder slides were extremely considerate and accommodating. The same is true of the professors who agreed to provide extra credit to their students who participated in this study. Sue Cooley deserves special recognition for her expert typing and editing of the manuscript. Vera Kean also deserves recognition for her conscientious handling of the administrative and clerical aspects associated with the project's financial matters.

I am particularly indebted to the numerous research assistants and the hundreds of subjects who participated in the study. The research assistants performed their tasks flawlessly and were punctual, dedicated, and cheerful throughout the entire project. The subjects who participated also helped to make the long hours of testing a very enjoyable aspect of the study.

I will be forever indebted to the members of my dissertation committee. Frank Horvath, E. James Potchen, Ralph Turner, Carl Frost, Charles Press, and John Hudzik will always represent more to me than just my dissertation committee. They were also among the faculty members I respected and admired the most throughout my doctoral program. Each of them provided guidance and knowledge during my course of study and most generously took time from their busy schedules to help during all facets of this research.

Special thanks are reserved for Frank Horvath, who served as the committee chairperson. Dr. Horvath provided a wealth of information concerning detection of deception testing throughout the project. He also served as my mentor and graduate assistantship supervisor for three years prior to the study, vastly improving my research and teaching skills. If it were not for Dr.
Horvath, I would not have achieved the degree of success I currently enjoy, nor would I have internalized many of his rigid standards that I will always strive for but probably never attain.

I wish to thank Vern Miller, Ed McGowan, and the Stoelting Company for providing the polygraph used in this study. I am also deeply indebted to the Office of Criminal Justice Education and Training within the U.S. Department of Justice's Law Enforcement Assistance Administration for providing the necessary funding to conduct this study.

Finally, I wish to thank my wife, Judee, who endured many sacrifices during the course of this project. If it were not for the love, assistance, and encouragement she provided, it is doubtful that the completion of this project would ever have been realized.

The material in this project was prepared under Grant Numbers 77-NI-99-0078 and 78-NI-AX-0028 from the Office of Criminal Justice Education and Training, Law Enforcement Assistance Administration, U.S. Department of Justice. Researchers engaging in such projects under government sponsorship are encouraged to express freely their professional judgment. Therefore, points of view or opinions stated in this document do not necessarily represent the official position or policy of the U.S. Department of Justice.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Chapter
I. INTRODUCTION AND REVIEW OF LITERATURE
    Introduction
    Review of Selected Literature
    Detection-of-Deception Experimental Designs
    Field and Experimental Questioning Procedures
    Other Important Methodological Considerations Pertaining to Detection-of-Deception Experiments
    Studies Having a Major Influence on the Issues Examined and the Procedures Employed in This Study
II. METHOD
    Selection of Subjects
    Apparatus
    Procedure
    Objective Scoring Procedures
    Respiration Total Length
    GSR Total Length
    GSR Amplitude
    GSR Maximum Height
    Summary
III. RESULTS
    Introduction
    Principal Statistical Techniques Employed
    MANOVA Results
    ANOVA Results
    Reliability of the Procedures Used to Measure GSR Total Length and Respiration Total Length
    The Effect of the Treatment Conditions on Performance Expectancy Scores
    Correlations Between the Dependent Variables
    The Incidence of GSR Maximum Height Downward Drift and GSR Amplitude Falling Patterns
    The Accuracy of the Different Physiological Indices in Differentiating Between Critical and Noncritical Items
    Detection Rates Attained Using the Scoring Procedure Developed by Lykken (1959)
    The Results Derived From the Biographical Data Sheet and the Follow-Up Questionnaire
    Age
    Sex
    Immediate Family Size
    Combined Family Income
    Subject's Year in School
    Subject's Grade Point Average
    Subject's Religious Preference
    Church Attendance
    Subject-Generated Polygraph Countermeasures
    Performance Expectancy Scores
    Attitudinal Responses
    ANOVA Results Examining the Effects of Three Placebo Conditions and Three Feedback Conditions on Both the Accuracy of the Polygraph and on Two Measures of Electrodermal Responsiveness
    Summary
IV. DISCUSSION
    Introduction
    The Effect of the Subject's Sex
    The Effect of the Feedback Conditions
    The Effect of the Placebo Conditions
    The Effect of Various Combinations of the Sex, Feedback, and Placebo Conditions
    The Reliability of the Procedures Used to Measure GSR Total Height and Respiration Total Height
    The Accuracy of the Different Physiological Responses in Detecting Deception
    A Comparison Between the Guilty-Knowledge Accuracy Levels Attained in This Study and the Levels Reported in Other Guilty-Knowledge Detection-of-Deception Experiments
    The Effect of Habituation
    The Relationship Between the Subjects' Responses to Certain Questions Contained on the Questionnaires and Both Polygraph Accuracy and Electrodermal Responsiveness
    Limitations of the Study

APPENDICES
    A. INFORMED CONSENT FORM
    B. MEDICAL RECORD RELEASE FORM
    C. MOCK MURDER CONTRACT
    D. BIOGRAPHICAL DATA SHEET
    E. QUESTIONNAIRE
    F. MANOVA AND ANOVA RESULTS
    G. ANOVA RESULTS EXAMINING THE EFFECTS OF SEX, PLACEBO CONDITION, FEEDBACK CONDITION, AND ALL OF THEIR POSSIBLE INTERACTIONS ON FOUR MEASURES OF POLYGRAPH DETECTION EFFICIENCY THAT WERE BASED ON THE LYKKEN (1959) SCORING PROCEDURE

BIBLIOGRAPHY

LIST OF TABLES

Table
1. Mean Ranks for Males and Females on GSR Maximum Height for the Five Polygraph Tests
2. The Absolute Value Frequency Distribution of the Differences Between the Original and Subsequent Measurements of GSR Total Length
3. The Absolute Value Frequency Distribution of the Differences Between the Original and Subsequent Determinations of GSR Total Length Ranks
4. The Absolute Value Frequency Distribution of the Differences Between the Original and Subsequent Respiration Measurements
5. The Absolute Value Frequency Distribution of the Differences Between the Original and Subsequent Determinations of Respiration Ranks
6. The Effect of Sex, Three Placebo Conditions, and Three Feedback Conditions on the Mean Performance Expectancy Scores Acquired Immediately After the "Demonstration" Test
7. Analysis of Variance: The Effect of Sex, Three Placebo Conditions, and Three Feedback Conditions on the Mean Performance Expectancy Scores Acquired Immediately After the "Demonstration" Test
8. The Effect of Sex, Three Placebo Conditions, and Three Feedback Conditions on the Mean Performance Expectancy Scores Acquired Immediately After the Actual Test
9. Analysis of Variance: The Effect of Sex, Three Placebo Conditions, and Three Feedback Conditions on the Mean Performance Expectancy Scores Provided by the Subjects Immediately After the Actual Test
10. Net Changes in Performance Expectancy Mean Scores Comparing Predictions Made Immediately Before and After the Actual Polygraph Tests for All Three Placebo Conditions and for Both Sexes
11. Pearson Correlation Coefficients Comparing the Ranks of the Dependent Variables on the Critical Items for Each Polygraph Test
12. Pearson Correlation Coefficients for the Ranks of the Critical Items Comparing the Different Polygraph Tests on Each Dependent Measure
13. The Frequency of GSR Maximum Height Downward Drift Patterns on the Demonstration and Five Actual Polygraph Tests
14. The Percentage of All GSR Amplitude Responses That Were Categorized as Falling Patterns on Each Polygraph Test
15. The Percentage of All Critical Items That Were Categorized as GSR Amplitude Falling Patterns on Each Test
16. The Percentage of Critical Items Ranked "One" (the Most Indicative of Deception) With Respect to Respiration, GSR Amplitude, GSR Maximum Height, and GSR Total Length for Each Polygraph Test
17. The Mean Rank of the Critical Items With Respect to Respiration, GSR Amplitude, GSR Maximum Height, and GSR Total Length for Each Polygraph Test
18. The Percentage of Critical Items Ranked as the Most Indicative of Deception With Respect to the Four Principal Dependent Measures for Each Polygraph Test
19. The Mean Rank of the Critical Items With Respect to the Four Principal Dependent Variables for Each Polygraph Test
20. The Estimated Proportion (Probability Distribution) of Innocent Subjects Attaining Each of the Possible Scores for the Testing Model Incorporated Into This Study Using the Lykken (1959) Scoring Procedure
21. The Actual Proportion of Subjects Attaining Each of the Possible Scores Derived From Scoring the Subjects' GSR Amplitude Values Using the Lykken (1959) Procedure
22. The Actual Proportion of Subjects Attaining Each of the Possible Scores Derived From Scoring the Subjects' GSR Amplitude Values Using the Lykken (1959) Procedure Excluding Subjects That Had Three or More GSR Amplitude Falling Patterns on Three or More of the Five Polygraph Tests
23. The Actual Proportion of Subjects Attaining Each of the Possible Scores Derived From Scoring the Subjects' Respiration Values Using the Lykken (1959) Procedure
24. The Actual Proportion of Subjects Attaining Each of the Possible Scores Derived From Scoring the Subjects' Composite Values With Respect to GSR Amplitude, GSR Maximum Height, and Respiration Using the Lykken (1959) Procedure
25. A Comparison of Means for Three Age Categories With Respect to One Measure of Polygraph Detection Efficiency and Two Measures of Electrodermal Responsiveness
26. A Comparison of Means for Males and Females With Respect to One Measure of Polygraph Detection Efficiency and Two Measures of Electrodermal Responsiveness
27. A Comparison of Means for Three Categories of Family Size With Respect to One Measure of Polygraph Detection Efficiency and Two Measures of Electrodermal Responsiveness
28. A Comparison of Means for Three Categories of Combined Family Income With Respect to One Measure of Polygraph Detection Efficiency and Two Measures of Electrodermal Responsiveness
29. A Comparison of Group Means for Underclassmen and Upperclassmen With Respect to One Measure of Polygraph Detection Efficiency and Two Measures of Electrodermal Responsiveness
30. A Comparison of Group Means for Three Categories of School Grade Point Average With Respect to One Measure of Polygraph Detection Efficiency and Two Measures of Electrodermal Responsiveness
31. A Comparison of Group Means for Four Religious Preference Categories With Respect to One Measure of Polygraph Detection Efficiency and Two Measures of Electrodermal Responsiveness
32. A Comparison of Group Means for Three Categories of Church Attendance With Respect to One Measure of Polygraph Detection Efficiency and Two Measures of Electrodermal Responsiveness
33. The Frequency and Description of Two Categories of Self-Initiated Countermeasures Employed by the Subjects During the Polygraph Tests
34. A Comparison of Group Means for Three Different Countermeasure Categories With Respect to One Measure of Polygraph Detection Efficiency and Two Measures of Electrodermal Responsiveness
35. A Comparison of Group Means for Three Different Categories of Performance Expectancy Responses Given Immediately After the "Demonstration" Test With Respect to One Measure of Polygraph Detection Efficiency and Two Measures of Electrodermal Responsiveness
36. A Comparison of Group Means for Three Different Categories of Performance Expectancy Responses Given After the Actual Polygraph Tests With Respect to One Measure of Polygraph Detection Efficiency and Two Measures of Electrodermal Responsiveness
37. A Comparison of Group Means for Three Different Categories of the Combined Performance Expectancy Scores From Before and After the Actual Polygraph Tests With Respect to One Measure of Polygraph Detection Efficiency and Two Measures of Electrodermal Responsiveness
38. The Order and Wording of Statements Contained in the Attitudinal Survey
39. A Comparison of the Sum of Critical Item Composite Rank Means for Three Categories of Responses to Statements Contained in the Follow-Up Questionnaire
40. A Comparison of GSR Maximum Height Downward Drift Means for Three Categories of Responses to Statements Contained in the Follow-Up Questionnaire
41. A Comparison of GSR Amplitude Falling Pattern Means for Three Categories of Responses to Statements Contained in the Follow-Up Questionnaire
42. Analysis of Variance: GSR Amplitude Falling Patterns by Sex, Three Feedback Conditions, and Three Placebo Conditions
43. Analysis of Variance: GSR Maximum Height Downward Drift Patterns by Sex, Three Feedback Conditions, and Three Placebo Conditions
44. Analysis of Variance: Sum of Critical Item Composite Rank Values by Sex, Three Feedback Conditions, and Three Placebo Conditions

LIST OF FIGURES

Figure
1. Treatment Matrix
2. Example of GSR Amplitude
3. A Simple Bivariate Dependent Variable MANOVA Situation, in Which the Differences Among the Three Populations Are "Real"
4. The Effect of Sex and Three Placebo Conditions on the Respiration, GSR Amplitude, GSR Maximum Height, and GSR Total Length Responses During the Fourth Polygraph Test
5. The Effect of the Three Placebo Conditions and the Three Feedback Conditions on GSR Maximum Height During Polygraph Test One
6. The Effect of the Three Placebo Conditions and the Three Feedback Conditions on GSR Maximum Height During Polygraph Test Four

CHAPTER I

INTRODUCTION AND REVIEW OF LITERATURE

Introduction

Lie detectors, as the public commonly refers to them, are instruments that measure various physiological responses. When using these instruments for deception analysis, the examiner is interested in monitoring the physiological changes that are either directly or indirectly affected by the subject's autonomic nervous system. The "lie detection" examiner attempts to control the subject's external environment so that he can examine the relationship between the statements the subject makes in response to stimulus questions and the subject's involuntary physiological responses. Most professional polygraph examiners (Reid & Inbau, 1966) believe that the physiological changes associated with deception stem from the subject's fear of detection.

The use of polygraphs and other similar devices is widespread in the United States. "In 1972 the American Polygraph Association estimated that between 200,000 and 300,000 polygraph tests would be given during that year alone" (Barefoot, 1974, p. 179). These tests are given for a wide variety of reasons, including such sensitive areas as industrial security, police corruption and brutality, criminal investigations, and national security. Depending on the circumstances, the examiner's findings usually play a key role in deciding whether an applicant is hired, an investigation is continued, or an employee is fired. Since such important decisions are based on the results of these tests, it is essential to determine whether there are any systematic ways of "beating" the test. In addition to pointing out the possible limitations of deception testing, this information could prove invaluable in developing procedures that are less susceptible to such measures.

The major purpose of the present study was to examine the effects of placebos and feedback on the detection of deception. Field polygraphists have reported incidents in which guilty individuals successfully avoided detection (Klump, 1965; Reid & Inbau, 1966; Barland & Raskin, 1973). In some of the cases described, the only explanation for the success appeared to have been a placebo effect produced by such seemingly innocuous measures as putting soap under the arms or putting bullets under the cuff used to record cardiovascular activity. Despite reports that indicate the placebo effect might substantially reduce the accuracy of deception tests, this hypothesis had not been scientifically tested.

Another phenomenon that might also affect the accuracy of deception tests is feedback. One study that directly examined the relationship between feedback and detection of deception was conducted by Gustafson and Orne (1965).
Those experimenters reported that the accuracy of deception testing was reduced when subjects who wanted to deceive the examiner were arbitrarily told that they had successfully done so on a previous test. A discussion of additional research that is pertinent to the current study is found in the next section, the review of selected literature.

Review of Selected Literature

The literature review presented in this chapter is structured to inform the reader about the issues and studies perceived to be most closely associated with the present research. Since several excellent general reviews of the literature in this area already exist (Abrams, 1973; Barland & Raskin, 1973; Horvath, 1974; Orne, Thackray, & Paskewitz, 1972; Timm, 1975), only the research that had a major influence on the direction and format of this experiment is presented. First, the review focuses on the considerations common to all laboratory detection-of-deception experiments. After laying this foundation, the review's orientation is changed to those studies that are most pertinent to the effects of placebos and feedback on detectability. Despite the limited research in the area, these studies illustrate the conceptual foundation from which the hypotheses to be tested in this experiment were derived.

Detection-of-Deception Experimental Designs

To examine the accuracy of detection of deception in an experimental context or the multitude of factors affecting its accuracy, it is necessary to design a situation in which some or all of the subjects will attempt to deceive the examiner. In some of these situations, the researcher attempts to differentiate between "innocent" and "guilty" subjects or to determine the nature of the subject's involvement (i.e., innocent, lookout, or perpetrator).
In other detection-of-deception experiments, the researcher knows in advance that all of the subjects will attempt to deceive him/her, but is interested in differentiating between the truthful and nontruthful statements made by each of these subjects. The former type of situation has been called the "guilty-person paradigm," whereas the latter has been referred to as a "guilty-information paradigm" (Gustafson & Orne, 1964).

Generally, the experimental designs structured to create the types of paradigms mentioned above can be classified in one of three different categories. The first category is most commonly referred to as a "card test" design. Usually the card test employs a guilty-information paradigm in which the subject is asked to select a card from a small deck (Alpert, Kurtzberg, & Friedhoff, 1963; Block, 1957; Block, Rouke, Salpeter, Tobach, Kubis, & Welch, 1952; Burtt, 1921; Geldreich, 1941; Horvath, 1978; Kubis, 1962; Kugelmass, 1967; Landis & Wiley, 1926; Langfeld, 1921; Obermann, 1939; Van Buskirk & Marcuse, 1954; Violante & Ross, 1964). The subject is then instructed to respond "no" each time the examiner asks if the subject selected a certain card, regardless of whether or not it was the card actually drawn. Obviously, by the time the researcher has asked about all of the cards contained in the original deck, the subject will have been forced to lie once during the test.

The card test design can also be transformed into a guilty-person paradigm by including blank cards in the deck. Gustafson and Orne (1964) examined the effect of differential subject perceptions emanating from these two paradigms in a card test detection-of-deception experiment. They reported that in their study the guilty-information paradigm was significantly less effective in detecting deception than the guilty-person paradigm.
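The logic of the guilty-information card test described above can be illustrated with a minimal sketch. It is not drawn from any of the cited studies: the five-card deck, the response magnitudes, and the function names are hypothetical, and the "largest response" decision rule is only one simple way an examiner's judgment might be modeled. The sketch shows that a subject who answers "no" to every card lies exactly once, and that an examiner guessing without physiological information would be correct with probability one over the deck size.

```python
def card_test_answers(deck, chosen_card):
    """List (card, answer, is_lie) for each question; the subject always answers "no"."""
    return [(card, "no", card == chosen_card) for card in deck]


def examiner_pick(responses):
    """Hypothetical decision rule: choose the card with the largest recorded response."""
    return max(responses, key=responses.get)


def chance_rate(deck_size):
    """Probability of naming the selected card by guessing alone."""
    return 1 / deck_size


deck = [1, 2, 3, 4, 5]  # illustrative five-card deck (assumption)
answers = card_test_answers(deck, chosen_card=3)
lies = [card for card, _, is_lie in answers if is_lie]

# Made-up response magnitudes, for illustration only.
responses = {1: 0.2, 2: 0.3, 3: 0.9, 4: 0.1, 5: 0.4}

print(lies)                      # the single card on which "no" was a lie
print(examiner_pick(responses))  # the card the decision rule would name
print(chance_rate(len(deck)))    # baseline hit rate for this deck size
```

Any detection rate reported for such a design is meaningful only in comparison with this chance baseline, which depends on the number of alternatives presented.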
The second most frequently used experimental model is one in which some or all of the subjects either observe or participate in a mock crime and then are given a detection-of-deception test regarding that scenario (Orne et al., 1972). This model can also involve either the guilty-information or guilty-person paradigm. Under the guilty-person paradigm, the researcher generally attempts to differentiate between subjects who have committed the mock crime and those who have not (Berrien, 1942; Berrien & Huntington, 1943; Chappell, 1929; Landis & Wiley, 1926; Marston, 1917; Obermann, 1939; Podlesny & Raskin, 1978; Raskin & Hare, 1978; Runkel, 1936). However, in some cases the examiner is also interested in determining the subject's degree of participation (i.e., participant, observer, lookout, those who had planned or attempted the crime, or those totally innocent [Baesen, Chung, & Yang, 1949; Davidson, 1969; Kubis, 1962, 1973]).

When the guilty-information paradigm is used in a mock crime context, the examiner generally knows in advance that all subjects were required to commit one of several different mock crimes; however, his/her task is to determine which particular crime the subject is guilty of committing. Burtt's (1939) study clearly illustrates the mock crime guilty-information paradigm. In that study, all subjects were required to open and examine the contents of one of two boxes containing miscellaneous objects. Each subject was guilty of peeking into one of the boxes, and the experimenter's task was to determine which one the subject had opened.

The third category of experimental detection-of-deception designs comprises studies in which some or all of the subjects are required to lie about certain information they possess that was not acquired through participating in a card test or mock crime. Lykken's (1960) study provides an excellent example of this type of design.
In that study, subjects were motivated to deceive the examiner about personal information (i.e., father's name, name of their former high school, etc.). The examiner's role was to determine which of five personal histories belonged to the subject being tested. Hence, the study represented a guilty-information paradigm. This design can also be modified to form a guilty-person paradigm by including subjects whose histories have not been given to the examiner.

Regardless of which type of laboratory design the researcher employs, the experimental situation is always structured so the experimenter will be able to discover ground truth at the conclusion of the study. The researcher also has complete control over the number and types of possible alternatives the subject could be "guilty" of committing. For example, in a guilty-information card test experiment, the researcher knows that the subject had to select one of the cards from the deck. Generally, the researcher also knows in advance the numbers on all of these cards. Thus the researcher can structure the questions to pertain only to the possible alternatives the subject may have selected. The same is true in mock crime situations. The researcher usually knows in advance all of the possible alternative elements of the crime with which the subject could have been involved and can structure the questioning procedure accordingly.

Field and Experimental Questioning Procedures

Since the researcher has absolute control over the number and types of possible alternatives the subject could be guilty of committing, it is not surprising that the types of questioning procedures used in laboratory studies differ to some extent from those generally used by field polygraph examiners. Rarely do field examiners have such clear-cut alternatives or know in advance the precise range of the suspect's possible involvement in the criminal act they are investigating.
The three most common types of questioning techniques used by field polygraph examiners are relevant-irrelevant, control question, and peak-of-tension tests. In using the relevant-irrelevant technique, the examiner asks a series of questions, some of which pertain to the matter under investigation (relevant) and others that do not (irrelevant). The first two questions are usually irrelevant (e.g., "Is today Friday?"), followed by a "Do you know who . . ." rather than a "Did you . . ." question (Barland & Raskin, 1973). After the third question is asked, any other relevant questions can be asked, with irrelevant questions inserted whenever the examiner wants the response to return to the basal level or after a fixed number of relevant questions has been asked (Harrelson, 1964; USAMPS, 1970).

The second major questioning procedure used in the field is the control-question technique. This technique differs from the relevant-irrelevant format in that the order in which the questions are presented is predetermined and control questions are incorporated into the series. A control question is one designed to capture the psychological set of the innocent subject (Barland & Raskin, 1973). Suspects are led to believe that the control questions are important to the resolution of the matter under investigation. These questions are also formulated in a fashion that would make it difficult for anyone to answer confidently and completely truthfully. For example, in a theft case a possible control question might be: "Other than what you mentioned [during the pretest interview], did you ever steal anything while you were in high school?" It is believed that innocent individuals will react more strongly to the control than they do to the relevant questions, whereas the opposite would hold for guilty suspects.
Naturally, formulating the control questions and establishing the psychological set that makes these questions more threatening than relevant ones to an innocent suspect is a skill that the examiner must develop (Raskin, 1978).

The peak-of-tension test is the third major technique used by polygraph examiners in the field. This test usually comprises between five and seven questions that are mutually exclusive and worded similarly (Harrelson, 1964; Reid & Inbau, 1966). The critical question, the one actually corresponding to the facts known about the crime, is placed approximately in the middle of the series. For example, if an individual took four dollars during an armed robbery, an appropriate peak-of-tension test sequence might be:

1. Regarding the amount of money taken, do you know if it was one dollar?
2. Do you know if it was two dollars?
3. Do you know if it was three dollars?
4. Do you know if it was four dollars?
5. Do you know if it was five dollars?
6. Do you know if it was six dollars?
7. Do you know if it was seven dollars?

To increase the guilty person's apprehension about the critical question, the suspect is usually either told in advance the order of the questions (Barland & Raskin, 1973) or the same question series is repeated so the suspect knows the questions and their order on the subsequent tests (Reid & Inbau, 1977). It is believed that a guilty person's physiological responses to the questions will peak at the critical question, and then return to normal after it has passed. Obviously, the use of this technique is limited to situations in which, of those to be tested, the guilty person, and only the guilty person, knows the correct response. The examiner also has to use extreme care in formulating these questions, so that the alternatives are equally plausible and of approximately equal emotional value to innocent suspects.
The conditions that make it feasible to use the peak-of-tension test are those most closely resembling the conditions present in laboratory detection-of-deception situations. However, it would be fairly rare for an examiner to possess information that he/she is certain the perpetrator knows but that no innocent person being tested could have learned or figured out. The examiner must also be alert to the possibility that the victim intentionally or unintentionally provided false information.

One of the more common questioning procedures employed in laboratory detection-of-deception situations is called the guilty-knowledge technique. This method was developed by Lykken (1959) for use in a mock-crime lie detection experiment. The guilty-knowledge method of deception analysis assumes that a guilty person knows certain facts pertaining to his crime that an innocent person does not know, and that the guilty individual's physiological responses during testing will differentiate between the relevant and irrelevant stimuli (questions) presented by the examiner. The technique is very similar to the peak-of-tension test. However, it differs in that the placement of the critical question is generally random and the subjects usually do not know the questions or their order in advance. The fact that the researcher normally does not know in advance which of the questions was the critical item also helps ensure that the experimenter does not in some way bias the subject's response to it.

Although both the relevant-irrelevant and control-question techniques can be and have been used in laboratory studies (Barland, 1972; Barland & Raskin, 1973; Orne et al., 1972), their use is generally more appropriate in guilty-person-paradigm situations, whereas the guilty-knowledge and peak-of-tension methods are appropriate for either the guilty-person or guilty-information designs.
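
The difference in item placement between the peak-of-tension and guilty-knowledge formats described above can be illustrated with a short sketch. It is offered only as an illustration under stated assumptions, not as a procedure taken from any of the cited studies: the function names are hypothetical, "approximately in the middle" is taken to mean the midpoint of the noncritical items, and the random placement is assumed to be uniform over all positions with the noncritical items kept in their given order.

```python
import random
from typing import List, Optional


def peak_of_tension_series(critical: str, foils: List[str]) -> List[str]:
    """Return a question series with the critical item near the middle.

    Assumption: the middle is the midpoint of the noncritical items (foils).
    """
    series = list(foils)
    series.insert(len(series) // 2, critical)
    return series


def guilty_knowledge_series(
    critical: str, foils: List[str], rng: Optional[random.Random] = None
) -> List[str]:
    """Return a question series with the critical item at a random position.

    Assumption: every position is equally likely and the foils keep their order.
    """
    rng = rng or random.Random()
    series = list(foils)
    series.insert(rng.randrange(len(series) + 1), critical)
    return series


if __name__ == "__main__":
    # Hypothetical alternatives modeled on the four-dollar robbery example.
    foils = ["one dollar", "two dollars", "three dollars",
             "five dollars", "six dollars", "seven dollars"]
    print(peak_of_tension_series("four dollars", foils))
    print(guilty_knowledge_series("four dollars", foils, random.Random(0)))
```

In the peak-of-tension sketch the position of the critical item is fixed and could be disclosed to the suspect in advance, whereas in the guilty-knowledge sketch the position varies from one administration to the next.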
Other Important Methodological Considerations Pertaining to Detection-of-Deception Experiments

Thus far, different types of experimental designs and questioning procedures pertaining to the detection of deception have been discussed. Although the aforementioned classification systems provide a basis from which the reader can differentiate and interpret experimental detection-of-deception studies, they do so on only two dimensions. Other factors are also extremely important in considering the nature of the study. Using Easton's (1965) model, one can break down these factors into those associated with inputs, process, and outputs.

The input factors in laboratory detection-of-deception experiments encompass a wide array of variables. The following examples are intended to illustrate the diversity and vastness of these factors. However, it should be noted that this list is not meant to be comprehensive; rather, it is provided to stimulate the reader's thoughts about and awareness of these issues.

1. The setting and layout of the experimental station
2. The physical appearance of the experimenter(s)
3. The intentional and unintentional verbal and nonverbal cues emanating from the experimenter(s)
4. The length of time for the subject, from the prospect of involvement with the experiment through its completion
5. Comments from former participants, prospective participants, and nonparticipants regarding the experiment
6. The degree and nature of the subject's involvement in the experiment
7. Explicit statements to motivate the subject
8. The nature of the treatment variables
9. The questioning procedure used by the examiner

Unfortunately, the variables affecting how the input variables are processed are far more complex, and the experimenter cannot easily manipulate them unless specific measures are taken to control for them.
Included among the variables that might affect how the subject perceives and processes input variables are prior experiences the subject associates with the context of the experiment, as well as genetic, personality, and cultural differences. Therefore, it is highly unlikely that any of the subjects will perceive their role and status in the experiment exactly the same, or that they will react identically to the stimuli presented.

The output variables also add to the complexity of understanding differences among laboratory detection-of-deception studies. Frequently, different physiological indices are measured, using different equipment or different procedures. Even when the same physiological responses are monitored, they are not always measured and/or scored consistently from study to study.

These comments concerning the differences between laboratory detection-of-deception studies were not intended to make the reader believe it is futile to attempt to comprehend the relationship between different studies, nor were they necessarily intended to plead for more consistency among future studies. Rather, their purpose was to sensitize the reader to these differences and to convey the message that our current knowledge regarding the factors affecting the detection of deception is analogous to a massive jigsaw puzzle in which a vast majority of the pieces are missing. However, on a more optimistic note, each new study in this area unveils another piece of the puzzle, which may bring us one step closer to seeing the total picture.

Studies Having a Major Influence on the Issues Examined and the Procedures Employed in This Study

In this section of the literature review, the research having the greatest influence on the direction of this study is presented.
Despite the limited amount of research pertaining to the effects of placebos and feedback on detectability, the studies illustrate the conceptual foundations relevant to both the issues examined and the procedures employed in this study.

The following studies examined the validity of the guilty-knowledge technique in detecting deception in mock crime situations.

Lykken (1959) reported a study in which he examined the effectiveness of the guilty-knowledge technique in detecting deception. His sample consisted of 49 male subjects, of whom 13 were randomly assigned to commit both a mock murder and a mock theft; 24 were randomly assigned to commit only one of these two crimes; and 12 were randomly assigned to commit neither of these two crimes. Each subject was blindfolded and informed that if his physiological responses during the polygraph test were indicative of guilt, he would be given an electric shock. To increase the subjects' motivation to escape detection, each subject was actually given electric shocks on an a priori schedule, irrespective of his physiological responses. The subjects were also instructed to remain silent during the polygraph testing.

Twelve question series were administered to each subject. Six pertained to the mock murder, and six pertained to the mock theft. The number of questions contained in each series varied, having a mean of 4.67 alternatives on the six murder question sets and a mean of 5.0 alternatives on the six theft question sets. Only one of the alternatives within each question set conformed directly to some aspect of one of the crimes; the other alternatives were structured to be equally plausible to an innocent person. For example, Lykken used the following alternatives regarding the desk drawer in which the mock murderer hid the weapon: (a) upper left, (b) lower right, (c) lower left, (d) upper right, (e) middle.
The guilty-knowledge technique is based on the premise that a guilty person will recognize the relevant alternative and have a different physiological response to it than to the other alternatives, whereas an innocent person, who is unaware which alternative is the relevant stimulus, should respond more strongly to the relevant alternative only at chance levels.

The only physiological response monitored during the testing was skin conductance. The skin conductance amplitudes associated with the various alternatives within each of the 12 question series were ranked. If the largest amplitude occurring during a question set was to the relevant alternative, it was given a score of 2. If the second largest amplitude was to the relevant alternative, it was given a score of 1. Thus, a perfect guilty score for each of the two crimes would be 12. Lykken analyzed each of the crimes independently, categorizing scores of 6 and less as indicative of innocence. Using that system, 88 percent (44 out of 50) of the guilt classifications were correct and 100 percent (all 48) of the innocent classifications were correct.

Davidson (1968) also used Lykken's (1959) scoring procedure in a detection-of-deception experiment. Davidson randomly assigned a total of 48 subjects into 12 groups, each containing 4 subjects. Three of the four subjects in each group were motivated to commit a self-planned mock murder; however, the experiment was designed so that of those three subjects in each group one succeeded, one attempted but failed, and one did not make an attempt. The fourth subject in each group had no knowledge of the mock murder. Thus, only 12 out of the 48 subjects were actually guilty of committing the mock murder. Davidson examined the effects of motivation stemming from monetary reward on the detection of deception by providing each of the 12 mock murder victims with an envelope containing a pay voucher.
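
The Lykken (1959) scoring rule summarized above (2 points when the relevant alternative draws the largest amplitude in a question set, 1 point when it draws the second largest, and totals of 6 or less classified as innocent) can be expressed as a brief sketch. This is only an illustration: the function names and the handling of tied amplitudes are assumptions and are not drawn from the original study. The final function adds a calculation that does not appear in the source: if every rank is equally likely for an innocent subject, the expected score on a set of k alternatives is 3/k.

```python
from typing import Sequence, Tuple


def item_score(amplitudes: Sequence[float], relevant: int) -> int:
    """Score one question set: 2 if the relevant alternative has the largest
    amplitude, 1 if it has the second largest, otherwise 0.

    Assumption: ties are resolved in favor of the relevant alternative.
    """
    larger = sum(1 for a in amplitudes if a > amplitudes[relevant])
    if larger == 0:
        return 2
    if larger == 1:
        return 1
    return 0


def classify(
    question_sets: Sequence[Tuple[Sequence[float], int]], cutoff: int = 6
) -> Tuple[int, str]:
    """Sum the item scores for one crime; totals at or below the cutoff
    are treated as indicative of innocence."""
    total = sum(item_score(amps, rel) for amps, rel in question_sets)
    return total, ("guilty" if total > cutoff else "innocent")


def chance_expected_total(set_sizes: Sequence[int]) -> float:
    """Expected total for a subject whose ranks are purely random.

    Assumption: each set has at least two alternatives, so the relevant item
    ranks first or second with probability 1/k each (2/k + 1/k = 3/k).
    """
    return sum(3.0 / k for k in set_sizes)


if __name__ == "__main__":
    # Hypothetical amplitudes; index 1 is the relevant alternative in each set.
    sets = [([0.2, 0.9, 0.5, 0.1, 0.3], 1)] * 6
    print(classify(sets))
    print(chance_expected_total([5] * 6))
```

Under these assumptions, six sets of five alternatives give an innocent subject an expected total of about 3.6, which falls below the cutoff of 6.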
The subjects were told about the voucher and informed that if they successfully committed their mock murder and successfully "beat" a polygraph test regarding their guilt in that incident, they could keep the pay voucher. Six of the vouchers ranged in value from $25 to $50, whereas the other six ranged in value from 1¢ to 10¢.

The questioning and scoring procedures used by Davidson in this study were essentially the same ones developed by Lykken (1959). Davidson did, however, report monitoring cardiovascular activity and respiration (these indices were not scored). Davidson reported that all 36 of the innocent subjects and 11 out of 12 (91.7 percent) of the guilty subjects were correctly classified using the same criteria reported by Lykken. The only misclassified guilty subject was in the low-amount pay voucher group; however, the detection rates between the two monetary motivation groups were not significantly different.

Another detection-of-deception study that incorporated Lykken's (1959) procedure was conducted by Podlesny and Raskin (1978). Their study also examined the effectiveness of the control-question technique in a laboratory situation; however, only the guilty-knowledge portion of that study will be addressed in this section. Twenty subjects took part in the guilty-knowledge experiment, of whom half were assigned to the guilty condition and half to the innocent condition. Guilty subjects committed a highly ego-involving mock theft, whereas the innocent subjects were told about the theft but neither enacted the crime nor were told any of the details pertaining to it. Subjects in both groups were informed that they would receive a $10 bonus if they appeared innocent on the lie detector test. Five question series were administered to each subject during the polygraph test, each consisting of one relevant and four irrelevant questions.
The subject's respiration, skin conductance, and cardiovascular activity were monitored during the polygraph test. Each of the three physiological indices was objectively quantified in several ways, then scored independently using the Lykken (1959) procedure. Only skin conductance and plethysmograph scores significantly discriminated between guilty and innocent subjects. However, guilt/innocence classifications based solely on the objective quantification of skin conductance responses were 90 percent correct, and all errors were false negatives.

The following studies attempted to determine whether the lie detection procedure was vulnerable to certain methods employed by subjects to deceive the examiner.

Lykken (1960) conducted a study to determine if the GSR could effectively be used to detect deception if the subjects attempted to "beat" the test. He trained 20 college students in the theory of using the GSR to detect guilty knowledge. He also allowed his subjects to practice inhibiting or producing false GSRs and informed them of the interrogation and scoring procedure that would be used. To ensure that the subjects were motivated to "beat the machine," they were offered ten dollars if they were successful.

Instead of using a mock crime situation, the subjects were tested on personally relevant information (i.e., father's name, name of high school, etc.). The subjects were divided into subsets consisting of five subjects. During the polygraph test the subjects were asked a series of questions containing one relevant and four irrelevant questions pertaining to each category of personally relevant information (e.g., "What is your father's name?" followed by the alternatives). The experimenter's task was to independently match the responses to the stimuli presented during the polygraph test for each subset of five subjects with the information contained on questionnaires those subjects completed prior to testing.
Lykken reported obtaining a 100 percent correct classification using objective scoring of the GSR protocol alone.

Kubis (1962) also examined the effectiveness of certain methods of trying to "beat the polygraph" in a laboratory study. Twenty subjects used muscle tension, exciting imagery, and yoga during different tests. Subjects using the exciting imagery were instructed to think of something exciting or upsetting at the appropriate times during the interrogation procedure. The subjects using the muscle-tension method attempted to induce reactions symptomatic of deception by pressing their toes against the floor. The yoga group tried to avoid detection by maintaining an abstract frame of mind that would allow them to separate themselves mentally from the outside stimuli. The examiner used a standard three-channel polygraph (GSR, respiration, and cardiovascular activity) in an attempt to determine which number a subject picked during a card test. It was found that the yoga method was not very successful; however, both the muscle-tension and the exciting-imagery methods reduced the examiner's effectiveness from a highly statistically significant level to the chance level.

Weinstein, Abrams, and Gibbons (1970) conducted a study in which they examined the effect of hypnotically induced repression and guilt. They selected six college students on the basis of their ability to enter deep hypnotic states. The subjects were divided into two groups. The three members of the first group were told to enter an office and take one of three bills ($1, $5, or $20). Afterwards, they were hypnotized and told that they would not recall taking the money. The second group of students did not take any money; however, they were told under hypnosis that they had stolen one of the bills and that they would experience considerable guilt because of this. The examiner was completely misled by the three innocent students.
In fact, he stated with certainty that each had taken the hypnotically suggested amount. The examiner was only partly convinced that the members of the guilty group had taken the money and only correctly identified the amount taken by one of them.

As illustrated by the preceding studies, attempts to "beat" the lie detector normally consist in the individual trying either to reduce his responses to the critical items incorporated in the test or to create accentuated responses to the noncritical items. The subject usually attempts to accomplish this chemically, mentally, or through some form of movement. The studies conducted by Kubis (1962) and Weinstein et al. (1970) suggested that it might be easier for the subject to create responses to noncritical items than to suppress responses to the critical items. However, in certain field-testing procedures (e.g., pre-employment screening, in which most of the questions could be considered "critical items"), the subject's ability to suppress his/her responses to the critical items is more crucial. In addition, a well-trained examiner would probably notice if the subject was engaging in certain methods of attempting to create accentuated responses to noncritical items during field polygraph tests.

An alert examiner should also be able to detect physical signs during the pretest interview if the individual has consumed a sufficient quantity of drugs (alcohol included) to markedly alter his/her responses. The pretest interview is a standard field procedure, during which the examiner conditions the subject for the test and tailors the test questions to the individual and the information he/she provides. If a highly drugged person was able to get through the pretest interview undetected, his/her pattern would probably be either so erratic or so flat (depending on the type of drug) that the examiner would probably request another test date or judge the test inconclusive.
Ferguson and Miller (1973) reported that an individual's GSR pattern can be used to differentiate between responses caused by physical movement as opposed to those caused by emotion. The examiner may also see that individual making obvious physical movements.

Since the responses indicative of deception are theoretically affected by the subject's degree of concern over the possibility of detection, a more productive method of attempting to deceive the examiner might be by reducing this concern. Three factors associated with the subject's degree of concern are: (1) his/her involvement in the matter being tested; (2) the magnitude of the sanctions contingent on the testing, as perceived by the subject; and (3) the subject's degree of certainty that the true status of his/her involvement in the offense will be correctly or incorrectly diagnosed by the examiner.

The importance of the perceived sanctions contingent on testing was demonstrated in a study by Gustafson and Orne (1963), who examined the effects of the subject's level of motivation to escape detection on the accuracy of detection of deception. Thirty-six college students were divided into two groups; one group was motivated to deceive the examiner, and the other was not. Subjects who were motivated to deceive the examiner listened to a recording that contained the following information: (1) the experiment was designed to see how well the subject could keep information away from the experimenter; (2) this was extremely difficult to do, and only people of superior intelligence and great emotional control were able to do it; (3) they were to try as hard as they could to beat the experimenter and the equipment; and (4) if they were successful, they would receive an extra dollar. Subjects who were motivated to deceive produced larger skin responses more frequently than did the other group.
The objective scoring procedure successfully detected the information possessed by members of the motivated group at a much greater than chance level, whereas detection occurred only at a chance level in the other group. The researchers concluded that the degree of autonomic response to significant stimuli appeared to be a function of motivation.

In field situations, however, the subject's degree of confidence in the outcome he/she expects seems to be the factor associated with the subject's degree of concern that is the most independent from the context of the testing situation. This appears true because both the subject's involvement in the matter being tested and the magnitude of the sanctions contingent upon the testing are to a large extent dictated by the actual circumstances.

One factor that might affect the subject's degree of certainty that his/her involvement in the offense will be correctly or incorrectly diagnosed is feedback from prior polygraph testing. Gustafson and Orne (1965) examined the effect of perceived role and role success on GSR for deception analysis. Sixty-four college students were divided into two groups: need to deceive and need to be detected. The need-to-deceive members listened to a tape that attributed positive qualities to those who could "beat the machine." The other group listened to a different tape, which, conversely, gave positive attributes to those normally detected by the "lie detector." After the subjects completed the initial test, and regardless of the actual results, half of each group were told that they had been detected; the other half were told that they had not been detected. The investigators found that if the subjects received information that was consistent with their perceived roles, they were detected significantly less frequently than were subjects who received information not consistent with their roles.
The following comments by Reid and Inbau (1966) illustrate that similar factors may affect polygraph examinations in the field:

A subject's concern over the possibility of detection appears to be the principal factor accounting for the physiological changes that are recorded and interpreted as symptoms of deception. . . . Conversely, a lack of concern over the possibility of detection may prevent a diagnosis of deception. . . . There is the rare subject who, because of the positive evidence against him, has developed an attitude of hopelessness; in other words, he has "given up" and abandoned any expectation of ultimate clearance of suspicion or accusation. As to him, too, the test results may be inconclusive (p. 168).

Another measure that might affect the subject's degree of certainty that the examiner will correctly or incorrectly diagnose his/her involvement in the offense is the placebo effect. This hypothesis appears to be supported by the following comments made by Reid and Inbau (1966):

An unwitting type of psychological evasion may result from a subject's belief, however unfounded it may be, that something he has done of a physical or medical nature will prevent a display of deception criteria during the test. For instance, if he has taken a sedative or some other drug which he fully believes to be effective in permitting him to evade detection, he may thereby be relieved of the necessary concern over possible detection and either avoid deception reactions or produce a polygraph record that will not permit a definite diagnosis one way or the other. Another example--and an actual one in our own experience--is that of a police officer of limited intelligence, who, immediately prior to the test, was observed placing bullets under the pneumograph tube and under the blood pressure-pulse cuff. He apparently believed that by doing so he would suppress whatever deception indications he would otherwise display during the test.
His polygraph records (either because of this belief or for some other reason) were devoid of deception criteria when asked the relevant as well as the control questions, and his deception would have remained undetected had the bullet stuffing efforts not been observed from the adjoining observation room (p. 167).

In an article on individuals who are nonreactors during polygraph examinations, Arther (1977) stated that this phenomenon could result from the person's taking drugs before the examination. His comments further supported the premise that in certain cases the placebo effect might seriously jeopardize the outcome of polygraph examinations:

There are two aspects of this problem--the physiological and the psychological. The physiological aspect deals with what drug(s) is involved, in what amounts it has been taken, what is the person's tolerance to that particular drug, when he last had it, what was already in his stomach when he took it, his overall physical condition. Just as important is the psychological aspect. That is: Does the person really believe that whatever he has taken will really result in his "beating the lie detector?" Of course, the more he is convinced that the drug will "beat the lie detector", the more likely he will be a non-reactor. This is even true when the drug supposedly should not cause the test to be affected. In fact, even a placebo can result in a non-reactor (p. 3).

Despite the belief held by many polygraph examiners that the placebo effect could seriously affect an individual's responses during a polygraph examination, this hypothesis had never been scientifically examined. However, numerous studies have dealt with the placebo effect in other situations. Several of these studies will now be discussed, since they indicate the strength of this effect and the factors affecting its magnitude.

The term placebo is by no means new, nor has its definition remained consistent throughout the years.
Shapiro (1968) traced the semantic changes associated with the word, from the Hebrew Bible to the present. Today the term placebo is generally used to connote a nonactive tablet or capsule, normally lactose, for use as a control measure in pharmacological research or as a therapeutic agent administered by a physician to promote or reinforce the patient's favorable expectancies. The term placebo effect is used in this study to refer to the change in the outcome of a given situation attributable solely to the psychological effect produced by some form of intervention taken by a person, which altered his/her set of expectancies regarding the probable outcome of that situation.

In this section of the literature review, selected studies that treated the placebo effect as their principal independent variable are presented. The first studies demonstrate the strength of the placebo effect on a variety of dependent measures. After laying that foundation, the studies presented focus on the independent variables that affect the magnitude of the placebo effect.

Evans (1969) examined the relationship between the placebo response and hypnotic susceptibility. Although he did not find a strong relationship between those two variables, he did report significant reductions in both subjective and objective measures of ischemic muscle pain after subjects ingested a placebo. The objective measures used in that experiment were the volume of water a subject could pump from one flask to another with a sphygmomanometer cuff around his/her arm inflated to 200 mm mercury above his/her systolic pressure, and the length of time it took to do this.

In another study examining the effectiveness of placebos in combating pain, Beecher (1965) reported placebos were far more effective in reducing pathological pain than pain generated experimentally.
He found that placebos' average effectiveness in cases pertaining to pathological pain of the tissue was 35%, whereas their effectiveness was only 3.2% with experimentally contrived pain produced by heat, tourniquet, etc. This difference suggests that when anxiety and stress are severe, placebos are more effective than when stress is of a lesser degree or absent (Stroebel, 1972).

In an experimental study, Gottschalk and Gleser (1969) examined the influence of placebos on achievement strivings. They reported that subjects who read a written statement saying that they would feel "more peppy and energetic" after ingesting their medication showed significant increases in achievement strivings (as measured by content analysis). This relationship was consistent, whether the subjects received a placebo, secobarbital (100 mg), or dextroamphetamine (10 mg). However, the group receiving the dextroamphetamine did have significantly higher achievement strivings than did the other two groups.

Weiner and Sierad (1975) conducted an experiment on the relationship between the placebo effect and achievement needs. In that study, 200 male subjects were classified as having either high or low achievement needs. The subjects in the treatment condition were given a placebo, coupled with the suggestion that the drug would interfere with their hand-eye coordination. All subjects were then given four trials at a digit-symbol substitution task that was structured in such a fashion that no one was able to complete it successfully; however, the number of digit-symbol substitutions was recorded for each trial. Compared with subjects in the control groups, ascription of failure to the pill augmented the performance of subjects low in achievement needs, whereas it decreased the performance of subjects high in achievement needs.

Sternbach (1964) examined the effects of placebos on stomach motility.
Each of six subjects was tested under the following three conditions: (1) stimulant, (2) relaxant, and (3) placebo. In each instance, the drug administered was a placebo containing only a small magnet, which was used to measure gastric peristaltic rate. The subjects were informed the "relaxant" would decrease stomach motility and the "stimulant" would increase it, whereas the placebo would have no effect. For four of the six subjects, stomach motility was highest after they took the "stimulant" placebo and lowest after they ingested the "relaxant" placebo. In one case the reverse order was found; in the other case the order was mixed. Overall, the placebo effects conformed to the suggestions with which they were paired and their mean differences were statistically significant (p < .05).

The studies presented above demonstrate that the placebo effect can have a dramatic influence on a wide range of dependent variables. Sternbach's (1964) article is of particular importance because the dependent measure (stomach motility) was primarily controlled by the autonomic nervous system. This suggests the placebo effect might also affect the autonomic functions monitored during polygraph examinations. The next articles reviewed depict the independent variables affecting the magnitude of the placebo effect.

Lasagna, Mosteller, von Felsinger, and Beecher (1954) conducted a study in which they examined differences between placebo reactors and nonreactors. Their sample comprised 162 postoperative patients, who were observed for their ability to receive pain relief from subcutaneous injections of both saline solution (placebo) and morphine. The researchers reported that reactors were more likely to (1) like "everyone," (2) report the hospital care was "wonderful," (3) exhibit somatic symptoms under stress, (4) be "talkers," (5) have less education, (6) be regular churchgoers, (7) be slightly older, and (8) have different Rorschach scores than the nonreactors.
No significant differences were found on the basis of sex or intelligence (as measured by the Wechsler-Bellevue scale).

Rickels, Hesbacher, Weise, Gray, and Feldman (1970) studied the placebo response in psychoneurotic outpatients. They reported clinical improvement was significantly correlated with the number of placebos taken daily for clinic and general practice patients; however, this relationship did not hold for private psychiatric practice patients. It was also found that the patient's social class seemed to influence the results. In comparison to other SES groups, patients in the lowest SES group were most likely to report improvement, regardless of the number of placebos taken. Patients in both the lowest and the highest SES groups also reported significantly greater improvement as the placebo dosage increased, whereas this relationship was not significant for patients in the middle SES group.

Buckalew (1972) conducted a study to analyze the experimental components in a placebo effect. Fifty subjects were required to complete both a pretest and posttest measuring their motor skill reaction time to a visual stimulus. The subjects were randomly assigned to the following conditions: (1) control, (2) placebo only, (3) placebo plus reinforcement, (4) placebo plus suggestion, and (5) placebo plus both reinforcement and suggestion. Subjects receiving only the placebo were not told how it would affect their performance, whereas those receiving the suggestion were told that the placebo would reduce their reaction time. The reinforcement consisted of the experimenter informing those subjects that their reaction time was improving. The order of the mean reaction times for the groups, from slowest to fastest, was: (1) placebo only, (2) control, (3) placebo plus reinforcement, (4) placebo plus suggestion, and (5) placebo plus both suggestion and reinforcement.
The last group performed the test significantly (p < .01) faster than all other groups except the placebo-plus-suggestion group, based on the results derived from Duncan's new multiple range test.

CHAPTER II

METHOD

This chapter contains a discussion of the selection of subjects for the study, their assignment to treatment groups, the apparatus employed, the procedures of the study, and the scoring methods used.

Selection of Subjects

The subjects were volunteers enrolled in selected Criminal Justice classes at Michigan State University during Fall term 1977. The courses from which subjects were drawn were: two sections of Introduction to Criminal Justice, Criminology, Police Process, and Juvenile Delinquency. These courses were selected on the basis of the following criteria: They were all large, undergraduate classes in which the instructor agreed to permit his/her students to participate in the experiment for extra credit. To maintain consistency, the extra credit was standardized for all classes, based on a set percentage of the total points for each class.

During the first week of the Fall (1977) term, the investigator visited each of the aforementioned classes. Students attending these classes were told the following information:

1. The purpose of this study is to examine the effects of certain drugs on the accuracy of lie detection tests.

2. The experiment itself will involve a mock contract murder. After the volunteer has been briefed about his/her intended victim, the volunteer will be required to shoot at an image of the victim shown on a movie screen. The volunteer will then be given a lie detection test concerning the mock murder. The person administering the lie detection test will attempt to ascertain the name of the victim, what the victim's occupation was, how many times the volunteer shot at him, where the Mafia family that hired the volunteer was located, and how much the volunteer was paid for the killing.

3. One percent of the total possible points in the student's respective class will be given to each subject who completes the experiment without breaking any of the rules. However, subjects who are able to "beat" the lie detector will be awarded a total of 5 percent of the total possible points. The number of points each subject will be awarded will be determined by objectively scoring his/her lie detection charts. If the individual who analyzes the charts is able to correctly identify the information pertaining to the subject on three or more of the five tests, the subject will receive only 1 percent extra credit. However, if the subject successfully deceives the examiner on three or more of the tests, he/she will be awarded the 5 percent extra credit.

4. The effects of two "drugs" on lie detection will be studied in this experiment. One of the drugs is believed to make it more difficult for a person to "beat" the lie detector, whereas the ingredients in the other "medication" should make it much easier. Volunteers will be randomly assigned to one of three treatment groups. One of the groups will receive the "drug" that should make it harder for them to successfully deceive the examiner. Another group will receive the "medication" that should help them "beat" the lie detector, whereas the third group will receive no medication at all.

5. For a student to participate in this study, all of the following conditions must be met:

a. The student has to volunteer for the experiment.

b. The student must sign an informed consent form.

c. The student must sign a medical release form, which will permit physicians at the Olin Health Center to examine the subject's medical history.

d. A physician must verify that there is nothing in the subject's medical records that indicates that either of the "drugs" would have a harmful effect on him/her.

e. The volunteer has to agree not to discuss this experiment with any other volunteers until the study has been completed.
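The extra-credit rule announced in item 3 can be restated as a short sketch. The function and parameter names below are illustrative assumptions, not part of the study materials; the zero-credit branch follows the stated condition that credit was given only to subjects who completed the experiment without breaking any of the rules.

```python
def extra_credit_percent(tests_detected: int,
                         total_tests: int = 5,
                         broke_rules: bool = False) -> int:
    """Extra credit, as a percentage of total course points, for one subject.

    tests_detected: number of tests on which the chart scorer correctly
    identified the subject's information (hypothetical input name).
    """
    if not 0 <= tests_detected <= total_tests:
        raise ValueError("tests_detected must lie between 0 and total_tests")
    if broke_rules:
        # Credit was promised only to subjects who broke none of the rules.
        return 0
    tests_beaten = total_tests - tests_detected
    # "Beating" three or more of the five tests earns 5 percent;
    # otherwise the subject receives the 1 percent participation credit.
    return 5 if tests_beaten >= 3 else 1
```

With five tests, the two outcomes are exhaustive: a subject detected on three or more tests cannot also have deceived the examiner on three or more.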
It is important for the reader to note that in order to realize the research objectives of this study, the subjects had to believe the above information was correct. Once again, however, the actual purpose of this study was to examine effects of placebos and feedback on the detection of deception. The two "drugs" that were administered were in reality two differently colored pharmaceutical placebos containing only lactose. Since the "drugs" were only placebos, no actual check was made of the volunteers' medical records. The medical release form was included to give the placebo additional credibility and to prevent any possible complaints of unauthorized disclosure of university medical records.

Immediately after the study was explained to the students, any questions they had regarding the experiment were answered. Students interested in participating in the study were given a copy of both the informed consent form (Appendix A) and the medical release form (Appendix B) and were asked to read them carefully. Students who still wanted to take part in the study were then asked to sign and return the forms.

Each of the volunteers was given a subject number and was randomly assigned to one of nine treatment groups. Since the possibility existed that the subject's sex might be associated with autonomic responsivity, the ratio between males and females was held constant for all groups. It may be of interest that although the investigator did not anticipate finding significant differences between males and females, the almost even split between male and female volunteers made examining this hypothesis too attractive to ignore.

In this experiment the two principal independent variables (placebos and prior feedback) were both manipulated to produce three levels of treatment for each variable, yielding the treatment matrix presented in Figure 1. A more detailed description of these groups is presented in the procedure section.
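As an illustration of the design just described, the following sketch assigns volunteers at random to the nine placebo-by-feedback cells while holding the sex ratio constant. It assumes 15 males and 15 females per cell, as reported for this study; the function name, the input format, and the shuffle-then-slice method are assumptions made for the example and are not a record of how the original assignment was carried out.

```python
import random
from itertools import product

PLACEBO_LEVELS = ("pass", "control", "fail")
FEEDBACK_LEVELS = ("pass", "control", "fail")


def assign_groups(subject_ids_by_sex, per_cell=15, seed=None):
    """Randomly assign subjects to the 3 x 3 cells, separately within each sex.

    subject_ids_by_sex: dict such as {"male": [...], "female": [...]}
    (hypothetical format). Returns {subject_id: (sex, placebo, feedback)}.
    """
    rng = random.Random(seed)
    cells = list(product(PLACEBO_LEVELS, FEEDBACK_LEVELS))  # nine cells
    assignment = {}
    for sex, ids in subject_ids_by_sex.items():
        if len(ids) != per_cell * len(cells):
            raise ValueError(f"expected {per_cell * len(cells)} {sex} subjects")
        shuffled = list(ids)
        rng.shuffle(shuffled)
        # Consecutive slices of the shuffled list fill each cell equally.
        for i, (placebo, feedback) in enumerate(cells):
            for sid in shuffled[i * per_cell:(i + 1) * per_cell]:
                assignment[sid] = (sex, placebo, feedback)
    return assignment
```

Shuffling within each sex before slicing is one simple way to obtain equal cell sizes; any other balanced randomization scheme would satisfy the same constraint.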
Each of the nine groups included 15 males and 15 females, producing an N of 270. The ages of the subjects ranged from 17 to 42 (M = 19.8; SD = 2.34). The breakdown of the subjects' year in school was as follows: freshmen, 91; sophomores, 78; juniors, 67; seniors, 38; graduates, 1.

                              PLACEBO CONDITION
                     PASS             CONTROL            FAIL
FEEDBACK
CONDITION        Male  Female      Male  Female      Male  Female

PASS             n=15   n=15       n=15   n=15       n=15   n=15
CONTROL          n=15   n=15       n=15   n=15       n=15   n=15
FAIL             n=15   n=15       n=15   n=15       n=15   n=15

Figure 1.--Treatment matrix.

Apparatus

A Stoelting field polygraph (model #22642) was used to record both the respiration and the skin resistance responses (SRR) of the subjects. Respiration was recorded by a pneumatic tube positioned around the subject's thoracic area. The SRR was recorded from two stainless steel electrodes attached to the volar surfaces of the first and third fingers of the subject's right hand. All SRR recordings were made with the instrument in the manual centering mode.

The instrument used to objectively score both respiration responses and one of the measures of SRR examined in this study was a modified map-distance measurer. The instrument was designed to measure curvilinear distances between two points on a sheet of paper.¹ The instrument's original 1/4"-diameter circular wheel, which came into contact with the line on the paper being measured, was replaced with a 10-tooth gear having an outer diameter of 2 mm. This modification made it easier to keep from deviating from the paths on the charts formed by the polygraph's ink pens. The gear was also less susceptible than the original wheel to sliding on the paper as opposed to turning, which was necessary to achieve an accurate measurement of the distances. The original map distance measurer was also filed in certain places to permit the aforementioned gear to have free contact with the surface of the polygraph charts.
Procedure

All subjects reported individually to the room where they were to commit their mock murder. When each subject arrived, he/she first met with a research assistant who worked independently of the polygraph examiner. Each of the research assistants who worked in this capacity had received training from the Director of the Psychology Clinic at Michigan State University on how to identify and interact with subjects who might have an adverse psychological reaction to the experiment.

When each subject arrived for the experiment, the research assistant greeted the subject at the door and shook his/her hand. The handshake served two purposes. First, it was intended to facilitate an open interaction between the subject and the research assistant. Second, it helped the research assistant ascertain whether the subject was overly anxious about participating in the study. Subjects who had cool, clammy, sweaty hands or who exhibited other behavioral signs symptomatic of excessive anxiety (e.g., fast talking, hyperactivity, or other overt signs of nervousness) received special attention from the research assistants. Basically, this entailed the research assistant spending additional time conversing with the subject on matters not related to the study, such as courses, sports, or weather.

¹The writer gratefully acknowledges the suggestion of Dr. Frank Horvath to measure physiological responses with this type of instrument.

After the subject appeared relatively calm, the research assistant summarized what the experiment would entail. The subject was then shown his/her mock murder contract (Appendix C). This document specified the following information: the name of the individual the subject was to simulate killing, the victim's occupation, the amount of play money the subject was to receive, the number of shots the subject had to fire, the location of the Mafia family that was purchasing his/her services, and a picture of the intended victim.
The contract also specified that the subject was required to say, "(victim's name), I am shooting you for betraying the (city where the Mafia family was located) branch of the Mafia," before he/she fired the pistol.

Each subject was asked to read the mock murder contract silently, while the research assistant read the document aloud. The research assistant also answered any questions the subject raised. After reviewing the contract, each subject was given the option of withdrawing from the study and still receiving 1 percent extra credit in his/her respective class. Only two subjects withdrew from the study at that point.

The research assistant was also responsible for administering the placebos to subjects assigned to certain groups. The assistant gave all subjects assigned to the placebo-fail group a yellow placebo and informed them that the drug should make it more difficult to "beat" the lie detector. Subjects in the placebo-pass condition were given an orange placebo and told that the "drug" should make it easier for them to "beat" the lie detector. Subjects assigned to the placebo-control group were not given a placebo. It is important for the reader to note that all of the research assistants were led to believe the "medication" they were dispensing was active medication, not placebos.

The research assistant removed a capsule for subjects assigned to the placebo-pass and placebo-fail groups from the appropriate university prescription-medication vial containing the respective placebos. The label on the vial containing the placebos for the placebo-pass group included the following information: "A Tranquilizing Agent to decrease emotional responses," and the medication number 1139. Conversely, the label on the vial containing the placebo-fail capsules contained the following: "An Adrenergic Agent to increase emotional responses," and the medication number 1134.
The following additional information appeared on the outside of both prescription vials: the prescribing physician's name, that the medication was for research purposes, and a warning that only one pill was to be taken.

The research assistant verified that all subjects who were to receive the placebo (1) did in fact take the placebo, (2) were told that the placebo should make it either easier or harder to beat the test (depending on their respective group), and (3) were told that the medication would take effect in approximately 15 minutes. The research assistant also warned all subjects that if they gave the examiner any indication of whether they had or had not received a "drug," they would be immediately disqualified from the study.

After administering the appropriate placebos to subjects assigned to either the placebo-pass or placebo-fail group, the research assistant showed the subject one of five sets of slides based on the occupation of his/her intended victim. An equal number of subjects within each of the nine placebo-feedback treatment conditions were randomly assigned to shoot at the image of either a fireman, policeman, soldier, priest, or surgeon. Each subject was shown a total of six slides portraying one of those occupational options.

The first slide in each of the occupational sets of slides depicted a building in the city in which the experiment took place that conformed to that particular occupational group (i.e., police station, church, ROTC headquarters, fire station, and University Health Service). Each building was photographed from a position that allowed the subjects to read, on the projected image, the identifying sign situated in front of the structure. The second, third, and fourth slides showed an individual working in various capacities congruous with the particular occupation he was portraying. The fifth slide projected a full-length image of that same individual.
In order to give the subject the impression that the image on the fifth slide was looking at him/her, each of the individuals photographed looked at the camera when that slide was taken. The sixth slide depicted the person lying on the ground simulating death. Each of the individuals who appeared in the aforementioned slides wore a distinctive uniform appropriate for someone working in the particular occupation they were selected to portray.

Although a different individual was portrayed in each of the five occupational sets of slides, the same individual appeared in all of the slides for each of the given occupational options. In order to maintain some consistency between the individuals photographed, all five were males and had similar physical characteristics. Photographs were reproduced from the fifth slide of each of the five possible occupational options. Of those photographs, the one that corresponded to the occupational option to which the subject was randomly assigned was attached to that subject's mock murder contract. Therefore, if the subject's mock murder contract specified that his/her victim was a fireman, the subject was shown the set of slides taken of the same individual dressed in the same fireman's apparel that was displayed on his/her mock murder contract.

An equal number of subjects in each of the nine placebo/feedback treatment combinations were also randomly assigned to one of five different options for each of the following categories: (1) victim's name, (2) number of shots to be fired, (3) Mafia family location, and (4) price of the contract. The specific options corresponding to each of the categories of information used in the mock murders are listed below:

Category I             Category II            Category III
Target's Occupation    Target's Name          No. of Shots to Be Fired

1. fireman             1. John Martin         1. 2 times
2. policeman           2. Michael Brown       2. 3 times
3. soldier             3. Edward Johnson      3. 4 times
4. priest              4. Henry Clark         4. 5 times
5. surgeon             5. Peter Miller        5. 6 times

Category IV                  Category V
Location of Mafia Family     Contract's Price

1. Kansas City               1. $20,000
2. Miami                     2. $30,000
3. Chicago                   3. $40,000
4. New York                  4. $50,000
5. Boston                    5. $60,000

Therefore, of the 30 subjects in each of the nine placebo/feedback treatment combinations, 6 were randomly assigned to each of the five options in each category. The information corresponding to the options to which the subject was assigned was filled in on that subject's mock murder contract. The same information was then used as the specifications for that subject's mock murder.

The six slides of the subject's intended victim were shown on a white paper screen situated directly in front of a pellet backstop consisting of a wall of boxes filled with paper. The equipment was designed to enable the research assistant to provide an unpunctured screen for each subject, which was supplied by a roll of wide paper fixed atop the frame of the screen. The slide projector was placed back far enough to project life-size figures onto the screen.

The research assistant gave each subject a loaded pellet gun, closely resembling a .38-caliber revolver. Every subject was required to stand on a spot to the left side of the screen, which was close enough to make relatively certain that each shot would strike the intended victim's image. After the subject was shown the fifth slide, which was structured to give the subject the impression the victim was looking directly at him/her, the subject was required to say, "__________, I am shooting you for betraying the __________ branch of the Mafia." The subject then fired at the victim the required number of times. After the subject was through firing at the image, the research assistant switched to the last slide, which portrayed the victim lying on the ground simulating death. Finally, the research assistant counted out the appropriate amount of play money and handed it to the subject, who was then also required to count it.
After the subject had finished counting the money, the research assistant warned the subject again that if he/she informed the lie detection examiner whether or not he/she had received one of the pills, or any details about the mock contract killing, he/she would immediately be disqualified from the experiment and would not receive any credit.

The subject was then sent to see the polygraph examiner, who was located in another office down the hall. The examiner spent approximately 30 minutes administering the Biographical Data Sheet (Appendix D), explaining to the subject the theory behind lie detection, and informing him/her how the equipment worked. After the explanation, the examiner gave the subject a "demonstration" of the instrument.

It is important to note that the actual purpose of this demonstration was to manipulate the nature of the feedback the subjects received from it. Subjects randomly assigned to the feedback-pass condition were led to believe they successfully deceived the examiner on the demonstration test, whereas subjects in the feedback-fail condition were led to believe they failed in that endeavor. Subjects in the feedback-control condition were not given any indication of how well they did on the demonstration test.

During the demonstration test, each subject was shown five cards that were placed face down, and was asked to shuffle them without turning any of the cards over. All of the cards given to subjects in the feedback-fail group had the number 15 on their face. Subjects assigned to the feedback-pass group were given a deck composed of the numbers 2, 4, 10, 10, and 19. Note that the number 10 appeared twice and the number 15 was omitted from that deck. Members of the feedback-control group were given a normal deck composed of the numbers 2, 4, 10, 15, and 19.

After the subject was satisfied that the cards had been adequately shuffled, he/she was asked to pull one of the cards aside, still keeping it face down.
The examiner then removed the remaining four cards without looking at them. After removing the four cards, the examiner turned his back and asked the subject to turn over the card he/she had selected and to memorize the number. The subject was also required to write the number down on a pad of paper that was placed directly in front of him/her, and then to turn over the pad of paper, placing it on top of the card so the examiner could not see which card had been selected.

Before beginning the "demonstration," the subject was told that he/she would be asked a series of questions regarding the possible number chosen and that he/she should respond "no" to each question, regardless of whether it mentioned the number actually drawn. Next, the examiner wiped off the subject's fingers with a tissue to remove any excess dirt and perspiration. He then placed the GSR electrodes on the first and third fingers of the subject's right hand and adjusted the instrument. The subject was asked to close his/her eyes and face straight ahead without moving during the test. After the examiner made certain the subject was following these directions, he asked the following questions:

1. Did you select card number 2?
2. Did you select card number 4?
3. Did you select card number 10?
4. Did you select card number 15?
5. Did you select card number 19?

During the "demonstration," the examiner increased the GSR sensitivity for subjects in the feedback-fail and feedback-pass groups immediately after he asked them if they had selected card number 15. The sensitivity was returned to its previous level before the subjects were asked the last question. This resulted in an increase in the GSR amplitude corresponding to the number 15. Individuals in the feedback-pass and feedback-fail groups were shown their charts. They were also informed about the method the examiner used to interpret the GSR patterns.
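The logic of the rigged demonstration can be sketched as follows. The deck contents and the boosted question come from the description above; the function name, the returned fields, and the simplifying assumption that the boosted response to question 15 is always the most prominent one on the chart are illustrative only.

```python
import random

# Deck composition by feedback group, as described in the text.
DECKS = {
    "fail": [15, 15, 15, 15, 15],
    "pass": [2, 4, 10, 10, 19],
    "control": [2, 4, 10, 15, 19],
}
BOOSTED_NUMBER = 15  # GSR sensitivity was raised for this question


def run_demonstration(feedback_group, rng):
    """Simulate one card demonstration for a subject in the given group."""
    deck = list(DECKS[feedback_group])
    rng.shuffle(deck)
    drawn = deck[0]  # the card the subject pulls aside
    if feedback_group == "control":
        # Control subjects were not shown the results of the demonstration.
        return {"drawn": drawn, "chart_shown": False, "appears_detected": None}
    # Assumption: the chart seems to single out 15, so the subject appears
    # detected only when 15 was actually the card drawn.
    return {"drawn": drawn, "chart_shown": True,
            "appears_detected": drawn == BOOSTED_NUMBER}
```

Under these assumptions a feedback-fail subject always appears to have been detected, and a feedback-pass subject never does, whichever card is drawn.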
However, the examiner did not turn the pad of paper over to confirm or disprove his interpretation of a subject's GSR responses until he was completely finished with that subject. Therefore, subjects in the feedback-fail group were intentionally led to believe that they had been detected, whereas those in the feedback-pass group were led to believe they had successfully deceived the examiner. Subjects in the feedback-control group were not shown the results of their "demonstration" test.

After discussing the "demonstration" test with the subjects, the examiner informed them that he was interested in determining whether or not individuals were cognizant of how well they did on polygraph tests. The subjects were also told that in order to resolve this problem the examiner would have to control for the subjects' preconception of how well they thought they would perform on the actual test. Each subject was questioned to make certain he/she understood that the examiner was interested in determining how well they thought they would do on the actual test, not how they had done on the "demonstration" test. The examiner also stressed that this aspect of the study would have no bearing on the number of extra-credit points the subject would be awarded.

After the subject appeared to understand the above information, he/she was asked to place an "X" next to the one statement contained in a performance expectancy self-report that most applied to him/her. The performance expectancy self-report contained the following series of statements:

1. I am almost positive I will "beat" three out of the five tests.
2. I am pretty sure I will "beat" three out of the five tests.
3. I have absolutely no idea how well I will do.
4. I am pretty sure I will not "beat" three out of the five tests.
5. I am almost positive I will not "beat" three out of the five tests.
To minimize the level of additional contamination that could have stemmed from this procedure, the examiner turned his back while the subject checked the appropriate number. The subject was also asked to turn the pad upside down after marking it so the examiner could not see how he/she had responded.

After subjects completed that procedure, the examiner administered the actual test. He asked each of the subjects the sets of questions listed below.

A. During the following series of questions you will be asked about the victim's occupation. Are you ready for me to begin?

1. Was the person you shot a doorman?
2. Was the person you shot a fireman?
3. Was the person you shot a soldier?
4. Was the person you shot a surgeon?
5. Was the person you shot a priest?
6. Was the person you shot a policeman?

B. During the following series of questions you will be asked about the victim's name. Are you ready for me to begin?

1. Was the person you shot named Thomas Wilson?
2. Was the person you shot named John Martin?
3. Was the person you shot named Michael Brown?
4. Was the person you shot named Edward Johnson?
5. Was the person you shot named Henry Clark?
6. Was the person you shot named Peter Miller?

C. During the following series of questions you will be asked about the number of times you shot the victim. Are you ready for me to begin?

1. Did you fire one shot at the victim?
2. Did you fire two shots at the victim?
3. Did you fire three shots at the victim?
4. Did you fire four shots at the victim?
5. Did you fire five shots at the victim?
6. Did you fire six shots at the victim?

D. During the following series of questions you will be asked about the location of the Mafia organization that hired you. Are you ready for me to begin?

1. Were you hired by the Los Angeles branch of the Mafia?
2. Were you hired by the New York branch of the Mafia?
3. Were you hired by the Miami branch of the Mafia?
4. Were you hired by the Chicago branch of the Mafia?
5. Were you hired by the Boston branch of the Mafia?
6. Were you hired by the Kansas City branch of the Mafia?

E. During the following series of questions you will be asked about how much you were paid for murdering the victim. Are you ready for me to begin?

1. Were you paid $10,000?
2. Were you paid $20,000?
3. Were you paid $30,000?
4. Were you paid $40,000?
5. Were you paid $50,000?
6. Were you paid $60,000?

It is important to note that question 1 in each of the above series of questions did not represent one of the possible options to which the subject could have been randomly assigned. These questions were included to serve as a buffer for the subject's initial physiological responses associated with the introduction of a new question series.

To avoid some of the possible experimenter contamination due to the examiner knowing the subject's feedback classification, the questions were tape-recorded. The questions asked on the tape were presented at 20-second intervals; there were 30-second intervals between the different test series.

Before testing, the subject's fingers were wiped off with tissue and he/she was asked to close his/her eyes and face forward without moving while responding to the taped questions. The subject was also instructed to respond "no" to each question he/she was asked during the test, except the questions asking the subject if he/she was ready to begin the new test series. These questions were included to help make sure the subject paid attention to the content of the questions.

After the testing was completed, the attachments were removed and the subject was asked to think about how well he/she had done in his/her attempt to deceive the polygraph examiner.
Once again, after being told that this information would have no bearing on the number of extra-credit points he/she would receive, the subject was asked to place an "X" beside the one statement contained in the second performance expectancy self-report that most applied to him/her. The second performance expectancy self-report instrument consisted of the following statements:

1. I am almost positive I "beat" three out of the five tests.
2. I am pretty sure I "beat" three out of the five tests.
3. I have absolutely no idea how well I did.
4. I am pretty sure I did not "beat" three out of the five tests.
5. I am almost positive I did not "beat" three out of the five tests.

To reduce the effect of the experimenter's presence on this aspect of the experiment, he turned his back while the subject checked the appropriate number. The subject was asked to turn the pad upside down after marking it so the examiner would not see the response until he turned the pad over.

Subjects were then thanked for participating in the study and were told that they would be informed later in the term how many extra-credit points they would receive. It is important to note that no subjects were permitted to see their charts or to find out how many points they had received until all of the subjects had been tested, since their feedback to other volunteers might have contaminated the study.

Immediately before the subjects were told how many extra-credit points they had received, they were asked to complete a brief questionnaire (Appendix E). This instrument was designed to query the subjects about their overall perceptions of the experiment, as well as about any methods they might have employed to assist them in their endeavor to beat the lie detector.

Objective Scoring Procedures

Two physiological parameters were recorded continuously during the polygraph examinations: thoracic respiration and skin resistance.
Described below are the methods used to objectively score these physiological patterns.

Respiration Total Length

The respiration patterns were objectively scored by measuring the total length of the pattern produced by the polygraph respiration pen from the instant the stimulus question was asked until 15 seconds had transpired. This distance was measured using the modified map distance measurer described in the Apparatus section. Since the nature of the respiration pattern is affected by many variables, such as the subject's degree of obesity, the tightness of the pneumograph tube, and individual breathing differences, comparisons for detection-of-deception purposes were restricted to differences within the same person.

The respiration patterns corresponding to the five questions associated with each test were ranked from 1 to 5, using the method described above. Since the suppression of breathing is generally associated with deception, the shortest pattern was assigned the rank of 1. The other four responses were then ranked from 2 to 5, using the same procedure.

GSR Total Length

One of the objective procedures used to score the subjects' electrodermal patterns was to measure the total length of the pattern formed by the polygraph's GSR pen from the instant the stimulus question was asked until 15 seconds had transpired. This distance was also measured using the aforementioned modified map distance measurer.

The recorded measures of electrodermal activity, like those for respiration, are affected by many other variables in addition to those associated with deception. In this case such factors as the level of sensitivity at which the instrument was set, the humidity in the room, and individual differences make direct comparisons across subjects meaningless. Therefore, all of the objective methods used to score skin resistance were eventually ranked from 1 to 5 based on a comparison of the patterns corresponding to the five questions for each test.
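The within-subject ranking of respiration total length described above can be illustrated with a minimal sketch. The function name, the `largest_first` flag, the sample lengths, and the averaging of tied ranks are assumptions made for illustration; the source specifies only that the shortest respiration pattern receives rank 1.

```python
def rank_within_test(values, largest_first=False):
    """Rank the responses from one test, starting at 1.

    By default the smallest value gets rank 1 (the rule stated for
    respiration total length). Set largest_first=True for a measure on
    which the largest response is ranked 1. Tied values share the mean
    of the ranks they occupy (an assumption, not stated in the source).
    """
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=largest_first)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value equals the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        shared = ((pos + 1) + (end + 1)) / 2  # mean of the 1-based positions
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


# Hypothetical respiration lengths (in measurer units) for five questions.
respiration_lengths = [27.4, 25.1, 28.0, 26.3, 29.2]
respiration_ranks = rank_within_test(respiration_lengths)
```

Because only the ranks within one subject's test are compared, differences in pneumograph tightness or instrument sensitivity between subjects do not enter the scores.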
A relatively large decrease in skin resistance is generally associated with deception. The polygraph used in this experiment was designed to show decreases in skin resistance as upward movements by the pen used to record electrodermal activity. Therefore, of the five electrodermal responses associated with the questions on each test, the largest response was assigned the rank of 1 (most indicative of deception), whereas the smallest was given the rank of 5 (least indicative of deception).

GSR Amplitude

Another objective procedure used to score the subjects' electrodermal patterns was to measure the vertical rise of the largest wave occurring from onset of the stimulus question until 15 seconds had transpired. The length of the vertical rise was measured from the lowest point prior to the wave's assuming a positive slope to the highest point it reached within the 15-second period. (See Figure 2.)

Figure 2.--Example of GSR amplitude.

To determine if more than one wave was present, the following method was employed: If the vertical rise from A to B was more than twice the vertical decline from B to C, then ABC was not treated as a separate wave. Therefore, the vertical increase from A to D would constitute the "GSR amplitude" in this type of situation.

If there was absolutely no positive rise during the 15-second interval, the response was called a GSR amplitude falling pattern. It was impossible to assign "falling patterns" a numerical value because the instrument used did not specifically indicate ohm levels and did not reflect a consistent decrease in ohms for equal mm of vertical decline (unless the starting points of the GSR pen were exactly the same and the pen was not mechanically raised). Unfortunately, this problem persisted even if the tangent error due to the curved path of the polygraph's ink pens was taken into account.
The values for the GSR amplitude for the five questions associated with each test were ranked from 1 (largest value) to 5 (smallest value). If "falling patterns" were included among the five, they were assigned equal ranks, which denoted the smallest measurements. Therefore, if only one "falling pattern" occurred among the five, it was assigned a rank of 5; if two occurred, they were both given the rank of 4.5; if three occurred, all three were ranked 4; and so on.

GSR Maximum Height

The last procedure used to objectively score the electrodermal patterns was to measure the highest point reached during the 15-second interval commencing the instant each stimulus question was asked. This was accomplished by measuring the length in mm of a vertical line drawn from the highest point reached by the pen (during each time interval) to the bottom of the chart paper. If it was necessary to mechanically adjust the position of the GSR pen during one of the tests, the amount of increase or decrease was subtracted or added, respectively, to all the responses in that series of questions that followed the pen adjustment on that test.

If the response to the buffer question was higher than the response to the first actual question and the response to the first actual question was higher than the responses to all four of the other questions on that test, the entire test was said to have exhibited a GSR maximum height "downward drift pattern." Essentially, a "downward drift pattern" indicated that the subject's GSR pattern was falling, which implied that GSR maximum height was not an appropriate measure for detection-of-deception purposes in the manner in which it was scored in this study. This phenomenon is often referred to as either a "falling galvo" or a "plunging galvo" by field polygraph examiners. When a "downward drift pattern" was present on a test, all five responses for that test were assigned a rank of 3 for their GSR maximum height value.
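The two special-case ranking rules just described (shared ranks for "falling patterns" and a uniform rank of 3 under a "downward drift pattern") can be sketched as follows. The function names, the use of `None` to mark a falling pattern, and the sample values are illustrative assumptions; the sketch also assumes the measured values within a test are distinct.

```python
def rank_gsr_amplitude(amplitudes):
    """Rank five GSR amplitudes, 1 = largest.

    A falling pattern (no positive rise) is passed as None. All falling
    patterns share the mean of the lowest ranks, so one falling pattern
    gets 5, two get 4.5 each, three get 4 each, and so on.
    """
    n = len(amplitudes)
    measured = [i for i, a in enumerate(amplitudes) if a is not None]
    falling = [i for i, a in enumerate(amplitudes) if a is None]
    ranks = [0.0] * n
    measured.sort(key=lambda i: amplitudes[i], reverse=True)
    for position, i in enumerate(measured, start=1):
        ranks[i] = float(position)
    if falling:
        first_low_rank = len(measured) + 1
        shared = (first_low_rank + n) / 2  # mean of the remaining ranks
        for i in falling:
            ranks[i] = shared
    return ranks


def rank_gsr_max_height(buffer_height, heights):
    """Rank five GSR maximum heights, 1 = largest.

    If the buffer response exceeds the first actual response and the
    first actual response exceeds all four others, the test shows a
    downward drift pattern and every response receives rank 3.
    """
    first, rest = heights[0], heights[1:]
    if buffer_height > first and all(first > h for h in rest):
        return [3.0] * len(heights)
    order = sorted(range(len(heights)), key=lambda i: heights[i],
                   reverse=True)
    ranks = [0.0] * len(heights)
    for position, i in enumerate(order, start=1):
        ranks[i] = float(position)
    return ranks


# Hypothetical measurements in mm for one test.
amplitude_ranks = rank_gsr_amplitude([12.0, None, 7.5, None, 3.0])
drift_ranks = rank_gsr_max_height(60, [55, 50, 48, 52, 40])
```

Assigning every response the middle rank of 3 under downward drift keeps the test in the summed scores without letting the drift favor any particular question.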
When no "downward drift pattern" was present, the values for the GSR maximum height for the five questions associated with each test were ranked from 1 (largest value) to 5 (smallest value).

Summary

This chapter contained a detailed explanation of the methodology used in conducting the study. Included was an account of how subjects were selected for the project, as well as their assignment to the various treatment groups. Following this, the procedures involved in carrying out the "demonstration" as well as the actual test were discussed. The final section of the chapter was an account of the objective scoring procedures applied to the polygraph examinations. Chapter III contains the results of those tests.

CHAPTER III

RESULTS

Introduction

In this section of the dissertation, the methods used to analyze the data and their results are presented. The section commences with a brief description of analysis of variance and multivariate analysis of variance. These two statistical techniques were the principal methods used to analyze the data collected in this study. After laying this foundation, a more comprehensive examination of the relationships between the specified variables ensues, drawing from the data collected on the subjects' polygraph charts, performance expectancy self-reports, and questionnaires.

As previously mentioned, the principal objective of this study was to examine the effects certain placebo, sex, and feedback conditions had on the detection of deception.
Other related issues that were examined include: the reliability of certain objective procedures used to quantify the data, the effect of the treatment conditions on the performance expectancy scores, the correlations between the different dependent variables, the degree of electrodermal nonresponsiveness, the detection efficiency of the different physiological indices in differentiating between critical and noncritical items, and the association between selected variables contained on both the biographical data sheet (Appendix D) and the follow-up questionnaire (Appendix E) and both polygraph detection efficiency and electrodermal responsiveness.

All statistical inferences presented in this chapter are treated as statistically significant when p ≤ .05. Two-tailed tests of significance were used whenever Z-tests or t-tests were conducted.

Principal Statistical Techniques Employed

The principal methods used to analyze the data collected in this study were analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA). The major independent variables examined were placebo treatments, feedback treatments, and sex differences. As noted in the preceding chapter, four dependent variables were extracted from the polygraph charts. They were GSR maximum height, GSR amplitude, GSR total length, and respiration total length.

Since the investigator was interested in testing the statistical significance of each independent variable by itself as well as the interaction effect between variables, it was necessary to employ a factorial design. Analysis of variance is a statistical technique used to analyze data obtained in a factorial design when only one dependent variable is being considered. As will be shown, the scores the different subjects obtained on the various dependent measures examined in this study reflected a certain degree of variability.
Part of the score variance was attributed to the fact that subjects received different placebo and feedback treatments. Other parts of the score variance were attributed to sex differences and to the interaction effects between these variables. Finally, a residual or error variance was produced, resulting from differences between subjects that were not accounted for by either the treatment or interaction effects.

Analysis of variance permitted the investigator to determine the degree of the total score variance attributable to each of these sources. It accomplishes this task by establishing a set of ratios, using the mean square for the residual component as the denominator and the mean squares of the other sources of variation as the numerators.¹ The number produced by each of these ratios is referred to as an "F" ratio. If the F ratio is sufficiently large, taking into account the number of factors and their levels (degrees of freedom, to be more precise) that are associated with those two sources of variation, it is said to be statistically significant at the particular level selected by the investigator. For example, if the F ratio for the main effect for sex is found to be significant at the .01 level, a difference between males' and females' scores at least that large would be expected less than 1 percent of the time if chance alone were operating. As with all inferential statistical tests, significant differences may not represent substantive differences, which are left to individual interpretation.

¹For a more detailed explanation, almost any intermediate statistics text can be consulted.

The specific type of analysis of variance that was employed in this study is called a three-factor, fixed-effect ANOVA. The three factors were sex, placebo treatments, and feedback treatments. Since either all levels of a factor were included in the study (i.e.,
male-female) or only the levels of particular interest were selected (i.e., placebo pass, fail, and control; feedback pass, fail, and control), the factors were all considered fixed. The fact that the factors were fixed (as opposed to some or all being randomly sampled from a population of levels) affected the generalizability of the experiment by limiting it to the treatments actually tested, and also affected the manner in which the ANOVAs were calculated.

A posteriori contrasts were employed when statistically significant ANOVA main effects were found. These contrasts were used to pinpoint which of the treatment means differed significantly from the other means included in that main effect analysis. It should be noted that one-way ANOVA and tests for ANOVA main effects only determine whether significant differences between the various means exist. Without a posteriori contrasts one would have to rely on educated guesses when attempting to determine which means actually differed significantly from the others. Tukey post hoc comparisons were used when the means being compared were based on equal cell sizes, whereas both Duncan's multiple-range test (Duncan) and Scheffe's test were utilized when the cell sizes were unequal. Of the three tests used, Duncan is the most liberal (most likely to indicate significant differences) and Scheffe's test is the most conservative (least likely to indicate significant differences).

As with all inferential statistics, the appropriateness of the results derived from analysis of variance in a factorial experiment is contingent on several assumptions. It is generally assumed that independent observations will be drawn from populations in which the dependent variable is normally distributed and that the dependent variable for these populations will have equal variances.
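To make the F ratio described earlier in this section concrete, the following minimal sketch recomputes two F values from the sums of squares and degrees of freedom that appear in Table 7 of this chapter. The function and variable names are illustrative assumptions; this is not the SPSS procedure used in the study.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Return the F ratio: effect mean square over residual mean square."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# Residual degrees of freedom for a 2 x 3 x 3 fixed-effect design with
# 270 subjects: total subjects minus the number of cells.
n_subjects = 270
n_cells = 2 * 3 * 3
df_error = n_subjects - n_cells

# Sums of squares as printed in Table 7 (rounded to two decimals there).
f_sex = f_ratio(11.20, 1, 158.27, df_error)
f_feedback = f_ratio(51.23, 2, 158.27, df_error)
print(round(f_sex, 2), round(f_feedback, 2))
```

The recomputed values fall close to the tabled F values of 17.84 and 40.79; the small discrepancies are what one would expect from working with sums of squares that were rounded for printing.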
Fortunately, ANOVA is relatively "robust" to certain violations of the normality and equal-variance assumptions (Glass & Stanley, 1970). This is an important feature, since the ordinal-level data that were used in calculating the ANOVAs in this study might jeopardize the normality assumption. It is also possible to use certain techniques to test whether some of the assumptions have been violated. For example, in this study Cochran's C and Bartlett's Box F are reported; they examine homogeneity of variance.

The second principal technique employed to analyze the data was multivariate analysis of variance (MANOVA). Essentially, MANOVA is a generalization of ANOVA that is used to test effects on several dependent variables simultaneously. Instead of a single measurement on each experimental unit, MANOVA integrates the individual measurements of all the dependent variables into a single vector of responses. In ANOVA the analysis is based primarily on the means of the individual variables; however, in MANOVA the analysis is based on vectors of means in which each element of the vector is a group's mean for a particular variable. The purpose of MANOVA is to test whether there are statistically significant differences among the mean vectors, which are also referred to as the groups' centroids. This concept is graphically illustrated in Figure 3, which depicts a simple bivariate dependent variable MANOVA situation in which the differences would be statistically significant.

[Figure 3: plot with axes X1 and X2.]

Figure 3.--A simple bivariate dependent variable MANOVA situation, in which the differences among the three populations are "real" (Cooley & Lohnes, 1971, p. 224).

As in univariate analysis of variance, the appropriateness of the inferences drawn from MANOVA is contingent upon a similar set of assumptions. The dependent variable vector is assumed to be multivariate normal in distribution with the same dispersion, or variance-covariance matrix, for each population.
Equality of dispersions is the MANOVA extension of the assumption of homogeneity of variances in ANOVA designs (Cooley & Lohnes, 1971, p. 224). This assumption was tested in this study using Box's M.¹ If a significant difference in dispersions is obtained, the F test for the differences between group centroids may be inflated.

¹For a more complete description of Box's M, see Amick and Walberg (1975).

The differences between the group centroids were examined using four different criteria to derive levels of significance. They were Wilks' lambda, Hotelling's trace criterion, Roy's largest root, and Pillai's criterion. All four criteria are a function of the eigenvalues (characteristic roots) of the ratio between the determinants of treatment sum-of-squares and cross-products matrix.¹ However, for the sake of brevity, only the value of the most liberal criterion on each of the multivariate tests will be presented in this chapter. When significant multivariate results are reported, readers interested in knowing the values for all four criteria will find them listed in Appendix F.

The MANOVAs were computed using Northwestern's version 7.0 SPSS MANOVA program at the Computer Center of Michigan State University. This program automatically calculated the approximate F ratio and its level of significance for Wilks' lambda, Hotelling's trace criterion, and Pillai's criterion. The level of significance associated with Roy's largest root was derived from Heck percentage-point charts contained in Timm (1975). In addition to calculating the MANOVAs, the version 7.00 MANOVA program calculated the univariate F test of significance for each response separately.

MANOVA Results

The complete results of the statistical analysis derived from the multivariate analysis of variance procedures are presented in Appendix F.
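When reading the multivariate results, it may help to see how the four criteria named in the preceding section relate to one set of eigenvalues. The sketch below uses the standard textbook definitions in terms of the eigenvalues of the product of the inverse error matrix and the treatment matrix; the function name and the sample eigenvalues are illustrative assumptions, and this is not a reproduction of the SPSS MANOVA program's internals. Roy's criterion is reported in two common forms because texts differ on which is tabled.

```python
import math


def manova_criteria(eigenvalues):
    """Compute four multivariate test criteria from the eigenvalues
    of inverse(E) * H, where H is the treatment and E the error
    sum-of-squares-and-cross-products matrix."""
    largest = max(eigenvalues)
    return {
        # Wilks' lambda: product of 1 / (1 + eigenvalue); smaller = stronger effect.
        "wilks_lambda": math.prod(1.0 / (1.0 + lam) for lam in eigenvalues),
        # Hotelling's trace: sum of the eigenvalues.
        "hotelling_trace": sum(eigenvalues),
        # Pillai's criterion: sum of eigenvalue / (1 + eigenvalue).
        "pillai_trace": sum(lam / (1.0 + lam) for lam in eigenvalues),
        # Roy's largest root, as the raw eigenvalue and as theta.
        "roy_largest_root": largest,
        "roy_theta": largest / (1.0 + largest),
    }


# Hypothetical eigenvalues for illustration only.
criteria = manova_criteria([1.0, 0.25])
```

With a single nonzero eigenvalue all four criteria lead to the same conclusion; they can diverge when group differences are spread across several dimensions.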
Six MANOVA "runs" were made: five pertaining to the ranks of the critical items (CI) on each of the polygraph tests and one in which the CI ranks from the five polygraph tests were added together for each of the four dependent variables.

¹The formulas for calculating all four criteria are presented in Cohen and Burns (1977).

As stated in the preceding section, Box's M is often employed to test whether the equality-of-dispersion assumption has been violated. This test was conducted on all six MANOVAs. A significant difference in dispersions was noted, corresponding to the third, fourth, and fifth polygraph tests (p = .04, .009, and .03, respectively). This finding suggests that the F value for the differences between group centroids may be inflated, especially for the values relating to the fourth polygraph test. Therefore, the validity of the multivariate significance values reported for those three polygraph tests should be viewed with a certain degree of skepticism.

None of the multivariate tests of significance for the main effects of feedback or placebo with respect to the critical item scores were significant for any of the polygraph tests or for their combined ranks on the four dependent measures. However, the multivariate tests for the main effect of sex were significant on three of the six analyses. The F value associated with the main effect of sex on the test analyzing the combined CI ranks of the five polygraph tests for the four dependent variables was significant at p = .0009. The corresponding significance levels for polygraph tests one through five were p = .02, .11, .07, .07, and .04, respectively.

There was also a significant sex x placebo multivariate interaction (p = .001) corresponding to the fourth polygraph test; however, this relationship was not found for any of the other polygraph tests or on the analysis conducted on their combined dependent values. The aforementioned significant interaction effect is illustrated in Figure 4.
Since feedback was not part of the significant interaction, the three levels of feedback were collapsed into their corresponding drug and sex categories, yielding a possible range of CI mean rank scores from 3 to 15 (three feedback levels, each with a possible mean rank of 1 to 5). The higher the mean composite rank score, the less detection efficiency the polygraph had in correctly differentiating the critical items.

The CI mean scores for males on GSR maximum height, GSR amplitude, and respiration indicated that the detection efficiency was the lowest in the tranquilizer placebo treatments, almost as low in the adrenergic placebo groups, and highest in the placebo control groups. However, males' GSR total length CI mean score indicated the lowest detection efficiency for the placebo control groups (X̄ = 7.3) and higher detection efficiency in the tranquilizer placebo and adrenergic placebo groups (X̄ = 6.7 and 6.6, respectively).

The CI mean scores for females on respiration and GSR amplitude suggested a relationship opposite to that found for males. Their CI mean scores for those dependent variables indicated the lowest detection efficiency in the placebo control groups and considerably higher efficiency levels in the tranquilizer and adrenergic placebo conditions. The detection efficiency for females on GSR total length and GSR maximum height was the lowest in the tranquilizer placebo conditions (X̄ = 7.9 and 7.9), higher in the placebo control groups (X̄ = 7.0 and 7.6), and highest in the adrenergic placebo conditions (X̄ = 6.1 and 7.3).
None of the other multivariate tests of significance for the sex x feedback interaction, the drug x feedback interaction, or the sex x drug x feedback interaction were significant for any of the polygraph tests.

[Figure 4: line plot of CI mean rank scores (vertical axis, approximately 5.3 to 8.0) across the tranquilizer placebo, no placebo (control), and adrenergic placebo conditions (horizontal axis).]

Key:
RM = Respiration mean values for males
RF = Respiration mean values for females
IM = GSR amplitude mean values for males
IF = GSR amplitude mean values for females
HM = GSR maximum height mean values for males
HF = GSR maximum height mean values for females
LM = GSR total length mean values for males
LF = GSR total length mean values for females

Note: The higher the mean rank score, the less detection efficiency the polygraph had in correctly differentiating the critical items from the noncritical items. Possible range: 3-15.

Figure 4.--The effect of sex and three placebo conditions on the respiration, GSR amplitude, GSR maximum height, and GSR total length responses during the fourth polygraph test.

ANOVA Results

The analysis of variance procedures examined the same relationships as discussed in the MANOVA section. However, instead of analyzing the treatment and interaction effects on all dependent variables simultaneously, ANOVA was used to examine these relationships separately for each dependent variable. The results from these procedures are presented in their entirety in Appendix F.

The most dramatic and consistent relationship found was the main effect for sex on GSR maximum height. On the total summation of CI ranks for GSR maximum height from the five polygraph tests, the main effect for sex was highly significant at p = .0002.
The main effect for sex on GSR maximum height was also significant on four out of the five polygraph tests, with p = .02, .15, .02, .01, and .004 for tests one through five, respectively. On all of these tests, the detection efficiency for females with respect to that dependent variable was consistently lower than that found for males. (See Table 1.)

The only other dependent variable that had a significant main effect for sex was GSR amplitude, which was only significant on the fifth polygraph test (p = .05). On that test, the CI mean score for females with respect to GSR amplitude was 2.76, whereas the CI mean score for males on that particular dependent variable was 2.45. Thus the detection efficiency on the fifth test was higher for males than for females with respect to GSR amplitude.

Table 1.--CI mean ranks for males and females on GSR maximum height for the five polygraph tests.

                       Polygraph Test
Sex                  One     Two     Three   Four    Five    Combined Tests
Males (N = 135)      1.86    1.99    2.19    2.21    2.23    2.10
Females (N = 135)    2.19    2.19    2.49    2.53    2.60    2.40

Note: The lower the mean ranks, the higher the detection efficiency.

No significant main effects for feedback were found for any of the dependent variables on any of the five polygraph tests or on the tests conducted on the summed CI ranks of all five polygraph tests for each dependent variable.

The only significant main effects found for the placebo conditions appeared on test five. On that test and only on that test, significant placebo main effects were indicated for both GSR total length (p = .04) and GSR amplitude (p = .03). The CI mean scores for GSR total length on that polygraph test were 2.49, 2.42, and 2.89 for the adrenergic, control, and tranquilizer placebo conditions, respectively. Once again, the higher the number, the lower the detection efficiency.
A Tukey post hoc comparison of those means indicated a significant difference between the placebo control and the tranquilizer placebo conditions. The CI mean scores for GSR amplitude on the fifth polygraph test also suggest a similar pattern, with the following values for the adrenergic, control, and tranquilizer placebo conditions: X̄ = 2.58, 2.37, and 2.88, respectively. A Tukey post hoc comparison of the means indicated a significant difference between the placebo control and the tranquilizer placebo conditions.

Two significant drug x feedback interactions were found, both of which were on GSR maximum height. These interaction effects appeared on polygraph tests one and four (p = .02 and .01, respectively). Figure 5 depicts the drug x feedback interaction on the first polygraph test and Figure 6 illustrates that interaction on the fourth polygraph test. Since sex was not an important element in these interaction effects, the two levels of sex were collapsed into their corresponding feedback and placebo conditions, producing a possible range from 2 to 10 on both interactions.

On polygraph test one, the detection efficiency for GSR maximum height on the three feedback-pass conditions was the lowest in the placebo control conditions (X̄ = 5.7) and considerably higher in the tranquilizer and adrenergic placebo conditions (X̄ = 4.2 and 4.1, respectively). These efficiency levels for the feedback control groups were also lowest in the placebo control condition (X̄ = 4.9); however, they were perceptibly higher in the tranquilizer placebo condition (X̄ = 4.1) than in the adrenergic placebo condition (X̄ = 4.7). The detection efficiency levels for GSR maximum height in the feedback-fail groups were lowest in the tranquilizer placebo groups (X̄ = 4.9), surpassed by the placebo control and adrenergic placebo conditions (X̄ = 4.0 and 3.6, respectively).
[Figure 5: line plot of GSR maximum height CI mean rank scores across the tranquilizer placebo, no placebo (control), and adrenergic placebo conditions.]

Key:
X = Feedback pass condition
Y = Feedback control condition
Z = Feedback fail condition

Note: The higher the mean, the less detection efficiency GSR maximum height had in properly differentiating the critical items. Possible range: 2-10.

Figure 5.--The effect of the three placebo conditions and the three feedback conditions on GSR maximum height during polygraph test one.

[Figure 6: line plot of GSR maximum height CI mean rank scores across the tranquilizer placebo, no placebo (control), and adrenergic placebo conditions.]

Key:
X = Feedback pass condition
Y = Feedback control condition
Z = Feedback fail condition

Note: The higher the mean, the less detection efficiency GSR maximum height had in correctly differentiating the critical items. Possible range: 2-10.

Figure 6.--The effect of the three placebo conditions and the three feedback conditions on GSR maximum height during polygraph test four.

On polygraph test four, the drug x feedback interaction for GSR maximum height was even more dramatic. The highest detection efficiency levels for both the feedback control (X̄ = 3.9) and feedback fail (X̄ = 4.1) groups were found in the placebo control conditions, whereas their lowest efficiency levels appeared in the tranquilizer placebo conditions (X̄ = 5.2 and 5.6, respectively), followed by the adrenergic placebo conditions (X̄ = 4.5 and 5.1, respectively). Conversely, the detection efficiency levels for GSR maximum height for the feedback pass groups were lowest in the placebo control conditions (X̄ = 5.5) and much higher in the tranquilizer and adrenergic groups (X̄ = 4.4 and 4.5, respectively).
No significant sex x feedback, sex x placebo, or sex x placebo x feedback interaction effects were found for any of the dependent variables on any of the five polygraph tests or on the tests conducted on the summed values for all five polygraph tests for each dependent variable.

Reliability of the Procedures Used to Measure GSR Total Length and Respiration Total Length

As stated in the Apparatus section, the investigator developed a modified map distance measurer to measure GSR total length and respiration total length. The reliability of the measurements obtained using the instrument was determined by randomly selecting 30 polygraph charts from the 270 charts produced in the study. The original measurements of GSR total length and respiration total length for all critical and noncritical items and their corresponding ranks on the selected charts were recorded. These numbers were then masked with black tape, making it impossible to see the original values. A different research assistant than the one who originally measured the responses remeasured the total lengths of the GSR and respiration patterns, following the same procedures as had been used to derive the original measurements. These new measurements were then ranked from one to five (see Method chapter) for each of the five tests included on the polygraph charts.

The reliability of the measurements for the total length of the GSR and respiration responses was calculated by comparing the original measurements with their corresponding values compiled from the second measurement. The absolute value frequencies of the difference between the GSR total length measurement-remeasurements are presented in Table 2. Approximately 23.5 percent of the values were exactly the same and 80.8 percent of the values were within ± 1.0 unit (inclusive; 1 unit = 3.896 mm).
The mean length of the GSR patterns found on the 30 randomly selected polygraph charts was originally 10.11 units, whereas the mean length computed on the remeasured values was 9.86 units. The Pearson correlation coefficient calculated on the two sets of values was r = .94.

The same procedure was used to compare the ranks assigned to the GSR total length measurements. The absolute value frequencies of the differences between the GSR total length rank and rerank determinations are presented in Table 3. Approximately 61 percent of the ranks based on the remeasured values corresponded exactly to their original ranks. Almost 87 percent of the corresponding ranks from the two data sets were within ± 1 rank of each other. The Pearson correlation coefficient comparing the original ranks of the critical and noncritical items to their corresponding second values was r = .73.

Table 2.--The absolute value frequency distribution of the differences between the original and subsequent measurements of GSR total length.

|X1 - X2|     Percentage of Comparisons     Cumulative Percentage
.0            23.5                          23.5
.1            10.2                          33.7
.2             5.1                          38.8
.3             4.5                          43.3
.4             4.8                          48.1
.5             9.2                          57.3
.6             4.3                          61.6
.7             3.5                          65.1
.8             3.2                          68.3
.9             5.3                          73.6
1.0            7.2                          80.8
1.1            3.9                          84.7
1.2            1.2                          85.9
1.3             .7                          86.5
1.4            2.3                          88.8
1.5            2.8                          91.6
1.6-2.0        4.4                          96.0
2.1-2.5        1.2                          97.2
over 2.5       2.8                         100.0

X̄ = .649; N = 750

Table 3.--The absolute value frequency distribution of the differences between the original and subsequent determinations of GSR total length ranks.

|X1 - X2|     Percentage of Comparisons     Cumulative Percentage
0.0           60.9                          60.9
1.0           26.0                          86.9
1.5             .3                          87.2
2.0            8.9                          96.1
2.5             .3                          96.4
3.0            2.8                          99.2
4.0             .8                         100.0

X̄ = .565; N = 750

The reliability of the respiration total length measurements and their corresponding ranks was calculated in the same manner as used for the GSR total length values.
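The reliability summary used for these comparisons (percentage of exact agreement, percentage within a tolerance, mean absolute difference, and Pearson correlation) can be sketched as below. The function names and sample values are illustrative assumptions, and rounding differences to one decimal place is an assumption based on the tenth-of-a-unit steps shown in Table 2.

```python
import math


def agreement_summary(original, remeasured, tolerance=1.0):
    """Return (percent exact, percent within tolerance, mean absolute
    difference) for paired original and repeat measurements."""
    # Round to tenths so floating-point noise does not affect the counts.
    diffs = [round(abs(a - b), 1) for a, b in zip(original, remeasured)]
    n = len(diffs)
    pct_exact = 100.0 * sum(d == 0 for d in diffs) / n
    pct_within = 100.0 * sum(d <= tolerance for d in diffs) / n
    mean_abs = sum(diffs) / n
    return pct_exact, pct_within, mean_abs


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical lengths in measurer units (1 unit = 3.896 mm).
first_pass = [10.0, 9.5, 12.0, 8.0]
second_pass = [10.0, 9.0, 13.5, 8.2]
pct_exact, pct_within, mean_abs = agreement_summary(first_pass, second_pass)
r = pearson_r(first_pass, second_pass)
```

The same functions apply to the rank comparisons by passing the two sets of ranks and a tolerance of 1 rank.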
The absolute value frequencies of the differences between the original respiration total length measurements and their corresponding remeasured values are presented in Table 4. Approximately 6.8 percent of the original measurements were exactly the same as their corresponding remeasured values, 54.3 percent were within ± 1.0 unit (inclusive), and 94.9 percent were within ± 3.5 units (inclusive). The average length of the respiration patterns was 26.56 units originally and 26.81 units after they were remeasured. The Pearson correlation coefficient comparing the two data sets was r = .95.

The absolute value frequencies of the differences between the respiration total length rank and rerank determinations are presented in Table 5. Almost 62 percent of the ranks based on the remeasured values corresponded exactly to their original ranks and 89.1 percent were within ± 1 rank (inclusive) of each other. The Pearson correlation coefficient comparing the original ranks to their corresponding reranked values was r = .78.

Table 4.--The absolute value frequency distribution of the differences between the original and the subsequent respiration measurements.

[Table 4 body: columns |X1 - X2|, Percentage of Comparisons, and Cumulative Percentage; entries illegible in the source.]

X̄ = 1.35; N = 750

Table 5.--The absolute value frequency distribution of the differences between the original and subsequent determinations of respiration ranks.

|X1 - X2|     Percentage of Comparisons     Cumulative Percentage
0.0           61.6                          61.6
1.0           27.5                          89.1
2.0            8.1                          97.2
3.0            2.4                          99.6
4.0             .4                         100.0

X̄ = .525; N = 750

The Effect of the Treatment Conditions on Performance Expectancy Scores

The subjects' mean performance expectancy scores obtained immediately after the "demonstration" tests are reported in Table 6.
The means ranged from 2.00 for males in the placebo pass-feedback pass group to 4.0 for females in the placebo fail-feedback fail group. The higher the mean, the less certain the subjects were that they could "beat" three out of the five polygraph tests they were to take after marking their predictions. The males' performance expectancies indicated they were more confident that they could beat three out of the five tests than the females in each of the treatment subgroups. The performance expectancies for the feedback conditions were hierarchically ordered, with the subjects in the feedback pass group the most optimistic in each case, the no feedback group in the middle, and the feedback fail group with the most pessimistic outlook in all cases when controlling for drug condition and sex.

Table 6.--The effect of sex, three placebo conditions, and three feedback conditions on the mean performance expectancy scores acquired immediately after the "demonstration" test.

[Table body not legible in the scanned source.]

Note: The higher the mean, the less certain the subjects were that they would "beat" three out of the five tests.

The effect of the drug conditions on performance expectancy was less stable. Generally, subjects taking the adrenergic placebo had the least optimistic predictions; their level of optimism was surpassed by the no placebo and the tranquilizer placebo conditions, respectively. However, this was not the case for males in the no feedback and feedback fail conditions or for females in the feedback fail condition. In those treatment categories, one of the three mean scores for the placebo treatments did not follow the hypothesized order.
The performance expectancy scores the subjects marked immediately after the demonstration test were analyzed using analysis of variance. The results derived from that technique are reported in Table 7. The main effects for both sex and feedback were highly significant (p = .00003 and p = .00001, respectively), whereas the main effect for placebo was not significant (p = .069). A Tukey post hoc comparison of the three means for the feedback conditions indicated that all three groups were significantly different from each other. Hence, males were significantly more certain they would beat three out of the five actual polygraph tests than females, and subjects in the feedback pass group were significantly more optimistic than members of the no feedback group, who were in turn significantly more optimistic concerning their ability to beat the polygraph than the subjects in the feedback fail group. None of the interaction effects stemming from those conditions was significant.

Table 7.--Analysis of variance: the effect of sex, three placebo conditions, and three feedback conditions on the mean performance expectancy scores acquired immediately after the "demonstration" test.

Source of Variation       Sum of    D.F.    Mean      F        Signif.
                          Squares           Square    Value    of F
Sex                        11.20      1     11.20     17.84    .00003
Drug                        3.39      2      1.69      2.70    .06949
Feedback                   51.23      2     25.61     40.79    .00001
Sex x drug                   .54      2       .27       .43    .65066
Sex x feedback               .12      2       .06       .09    .90999
Drug x feedback             1.31      4       .33       .52    .72187
Sex x drug x feedback       1.84      4       .46       .73    .57136
Within cells              158.27    252       .63

The mean performance expectancy scores obtained from the subjects after they had completed the five actual polygraph tests are reported in Table 8. In this situation, the lower scores indicate that the subjects were more certain they had successfully deceived the examiner on three out of the five polygraph tests.
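The F values in Table 7 follow from the sums of squares and degrees of freedom: each effect's mean square is divided by the within-cells mean square. The sketch below recomputes them from the tabled figures. The dictionary and function names are illustrative assumptions, and small discrepancies from the printed values are expected because the table reports rounded sums of squares. Significance levels are not recomputed here, since that would require an F distribution routine outside the standard library.

```python
# Sums of squares and degrees of freedom as printed in Table 7.
TABLE_7_EFFECTS = {
    "Sex": (11.20, 1),
    "Drug": (3.39, 2),
    "Feedback": (51.23, 2),
    "Sex x drug": (0.54, 2),
    "Sex x feedback": (0.12, 2),
    "Drug x feedback": (1.31, 4),
    "Sex x drug x feedback": (1.84, 4),
}
WITHIN_CELLS = (158.27, 252)


def f_ratios(effects, within):
    """Return {source: F}, where F = (SS_effect / df_effect) / (SS_within / df_within)."""
    ss_within, df_within = within
    ms_within = ss_within / df_within
    return {name: (ss / df) / ms_within for name, (ss, df) in effects.items()}


ratios = f_ratios(TABLE_7_EFFECTS, WITHIN_CELLS)
for source, f_value in ratios.items():
    print(f"{source:<24}{f_value:8.2f}")
```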
Once again, the females generally had more pessimistic predictions than the males within the levels of the other various treatment conditions. However, the mean scores for females were lower (more optimistic) in the feedback fail-no placebo and no feedback-tranquilizer placebo groups and the same as the mean scores for males in the feedback pass-adrenergic placebo group. The mean scores associated with the various feedback treatments continued the same hierarchical trend, with the feedback pass groups having the scores indicative of the highest degree of optimism and the feedback fail groups generally having the lowest when sex and placebo treatment were controlled. The only exceptions were the tie between the no feedback-tranquilizer placebo and the feedback fail-tranquilizer placebo groups for males and the lower mean score (X̄ = 2.93) for females in the feedback fail-no placebo group compared to their no feedback-no placebo condition (X̄ = 3.40). Although there was a trend for the mean scores in the adrenergic placebo groups to represent less optimistic predictions than those in the other two placebo groups and the tranquilizer groups to be the most optimistic when controlling for sex and feedback conditions, there were several exceptions to this order.

Table 8.--The effect of sex, three placebo conditions, and three feedback conditions on the mean performance expectancy scores acquired immediately after the actual test.

[Table body not legible in the scanned source.]

Note: The higher the mean, the less certain the subjects were that they "beat" three out of the five tests.
Once again, the performance expectancy scores the subjects marked after taking the five actual tests were analyzed using analysis of variance. The results of that procedure are reported in Table 9. This time only the main effect for feedback was significant (p = .00001). A Tukey post hoc comparison of the feedback means indicated that the subjects in the feedback pass group were more optimistic about the outcome of the polygraph test than subjects in the other two feedback conditions. The F value for the main effect for sex and the sex x drug x feedback interaction were the only other sources of variation that approached significance (p = .086 and p = .088, respectively).

Table 9.--Analysis of variance: the effect of sex, three placebo conditions, and three feedback conditions on the mean performance expectancy scores provided by the subjects immediately after the actual test.

Source of Variation       Sum of    D.F.    Mean      F        Signif.
                          Squares           Square    Value    of F
Sex                         2.50      1      2.50      2.97    .08612
Drug                        1.87      2       .93      1.11    .33227
Feedback                   33.16      2     16.58     19.66    .00001
Sex x drug                   .21      2       .10       .12    .88435
Sex x feedback               .56      2       .28       .33    .71655
Drug x feedback             4.11      4      1.03      1.22    .30335
Sex x drug x feedback       6.93      4      1.73      2.05    .08753
Within cells              212.53    252       .84

Table 10 depicts the net change in the subjects' mean predictions from how well they thought they would do after taking the "demonstration" test to how well they thought they had done after taking the five polygraph tests. The negative numbers indicate that the subjects thought their chances of "beating" three out of the five tests had improved, whereas the positive numbers indicate their second prediction was less optimistic. Overall, the females' scores tended to decrease, indicating they thought their chances improved, whereas the males' scores increased. The changes in the two predictions represented an interesting pattern when analyzed by feedback conditions.
All of the feedback fail groups' differences were negative or zero, indicating that they thought their chances of beating three out of the five tests had either remained the same as when they completed their first prediction or had improved after taking the actual tests. Conversely, all of the no feedback groups' differences were positive or zero with the sole exception of the no feedback-tranquilizer placebo for females, which decreased by .40. The differences in the feedback pass groups suggest an interaction with sex, since the females' scores were all negative, whereas all of the group differences for males were positive, excluding the feedback pass-adrenergic placebo category, which resulted in a slight decrease (X̄1 - X̄2 = -.07).

Table 10.--The changes in performance expectancy mean scores comparing predictions made immediately before and after the actual polygraph tests for all three placebo conditions and for both sexes.

[Table body not legible in the scanned source.]

Note: The negative numbers indicate that the subjects thought their chances of "beating" three out of the five tests had improved, whereas the positive numbers indicate their second predictions were less optimistic.

Correlations Between the Dependent Variables

The correlations between the dependent variables for each of the five polygraph tests are reported in Table 11. In each case, the ranks of the critical item on the two specified dependent measures were compared to each other for the 270 subjects. A high correlation between the two dependent variables indicated that those two ranks on the critical item for each subject tended to be the same or very close to it.
A high correlation also suggested that the discrimination value of the two variables might be limited, since to a large extent they would both be accounting for the same variance.

Table 11.--Pearson correlation coefficients comparing the ranks of the dependent variables on the critical items for each polygraph test.

Dependent Variables Correlated            Test    Test    Test    Test    Test
                                          One     Two     Three   Four    Five
Respiration x GSR maximum height          .096    .049    .039   -.074    .019
Respiration x GSR total length            .024    .028   -.076   -.014   -.005
Respiration x GSR amplitude               .023    .046    .057   -.006   -.035
GSR total length x GSR maximum height     .151    .295    .382    .377    .487
GSR maximum height x GSR amplitude        .333    .362    .382    .367    .460
GSR total length x GSR amplitude          .680    .709    .718    .784    .734

The correlation coefficients between respiration and all other dependent measures were very low for all five polygraph tests. They ranged from a maximum positive value of .096 to a maximum negative value of -.074. GSR maximum height was fairly highly correlated with both GSR total length and GSR amplitude, with correlations ranging from .151 for the GSR total length x GSR maximum height comparison on test one to .487 for those same dependent variables on test five. The variables showing the highest degree of correlation were GSR total length and GSR amplitude. Their coefficients from test one to test five were .680, .709, .718, .784, and .734, respectively.

The correlations for the ranks of the critical items between the different polygraph tests for each of the four dependent variables are reported in Table 12. A high correlation between two tests for a certain dependent variable would indicate that the subjects tended to have the same rank for the critical items on both tests, with respect to that particular physiological variable. The correlation coefficients for the various polygraph test combinations across all of the dependent measures were relatively low.
The lowest correlation was .018 for the test one x test four comparison using GSR amplitude as the dependent variable, whereas the highest correlation was only .371 for the same comparison using GSR maximum height as the dependent variable. GSR maximum height tended to have the highest correlations from test to test, followed by respiration, GSR amplitude, and GSR total length, respectively.

Table 12.--Pearson correlation coefficients for the ranks of the critical items comparing the different polygraph tests on each dependent measure.

Polygraph Tests                            GSR          GSR         GSR
Correlated                 Respiration     Amplitude    Maximum     Total
                                                        Height      Length
Test one x test two           .264           .073         .290        .049
Test one x test three         .144           .074         .184        .091
Test one x test four          .175           .018         .371        .041
Test one x test five          .111           .076         .218        .060
Test two x test three         .217           .140         .136        .170
Test two x test four          .196           .113         .168        .138
Test two x test five          .074           .103         .192        .069
Test three x test four        .142           .153         .243        .176
Test three x test five        .223           .185         .181        .127
Test four x test five         .048           .249         .282        .232

The Incidence of GSR Maximum Height Downward Drift and GSR Amplitude Falling Patterns

Table 13 depicts the number and percentage of GSR maximum height downward drift patterns that were found on each of the polygraph tests. Those figures are also presented for the "demonstration" test; however, only the three feedback control conditions were used in deriving the number and percentage of GSR maximum height downward drift patterns for that particular test. The various feedback pass and feedback fail conditions were excluded from these calculations since the experimenter intentionally manipulated their respective charts to yield the desired feedback effects.

The percentage of subjects producing GSR maximum height downward drift patterns remained relatively constant for each of the polygraph tests. The percentages ranged from 23.0 percent on the second test to 37.0 percent on the fourth test.
Although the percentage of charts containing GSR maximum height downward drift patterns tended to increase with each successive polygraph test, this relationship was not consistent enough to constitute a strong trend.

Table 13.--The frequency of GSR maximum height downward drift patterns on the demonstration and five actual polygraph tests.

Test                                 Absolute      Relative Frequency
                                     Frequency     (Percentage)
Demonstration test(a) (N = 90)          25             27.8
Test one (N = 270)                      83             30.7
Test two (N = 270)                      62             23.0
Test three (N = 270)                    85             31.5
Test four (N = 270)                    100             37.0
Test five (N = 270)                     98             36.3

(a) The calculations for the demonstration test were based only on charts produced by the 90 subjects in the feedback control conditions.

Table 14 shows the percentage of GSR amplitude falling patterns that were present for the five items on each of the polygraph tests. Once again, those figures presented for the demonstration test were based solely on the charts produced by subjects in the three feedback control conditions. Excluding the demonstration test, the frequency of GSR amplitude falling responses increased with each subsequent polygraph test. The percentage of GSR amplitude falling patterns ranged from 8.64 percent on the first test to 23.26 percent on the fifth test.

Table 14.--The percentage of all GSR amplitude responses that were categorized as falling patterns on each polygraph test.

Polygraph Test         Percentage of Responses
                       Categorized as Falling Patterns
Demonstration(a)            9.34
Test one                    8.64
Test two                   13.92
Test three                 17.12
Test four                  22.44
Test five                  23.26

(a) The calculations pertaining to the demonstration test were based only on the charts produced by the 90 subjects in the feedback control conditions.

The percentage of GSR amplitude falling patterns occurring on the critical items for each of the polygraph tests is presented in Table 15. Once again, the percentages increased for each subsequent polygraph test.
The percentage of critical items constituting a GSR amplitude falling pattern ranged from 0 percent on the demonstration test to 17.4 percent on the fifth polygraph test. These percentages, which were based solely on the critical items, were perceptibly lower than those presented in Table 14 for all five items on each polygraph test.

If there were no difference in the incidence of GSR amplitude falling patterns between the critical and noncritical items, one would expect 20 percent of the total number of falling patterns to have occurred on the critical items. The 20 percent figure represents the 1:5 ratio between the number of critical items and the total number of questions. However, significantly fewer than 20 percent of the total number of falling patterns were associated with the critical items (z = -9.21). This demonstrates that the subjects were less likely to have GSR amplitude falling patterns on critical items than on noncritical items during the polygraph tests.

Table 15.--The percentage of all critical items that were categorized as GSR amplitude falling patterns on each test.

Polygraph Test         Percentage of Critical Items
                       Categorized as Falling Patterns
Demonstration(a)            0
Test one                    2.6
Test two                    4.1
Test three                 11.1
Test four                  13.3
Test five                  17.4

(a) The calculations pertaining to the demonstration test were based only on the charts produced by the 90 subjects in the feedback control conditions.

The Accuracy of the Different Physiological Indices in Differentiating Between Critical and Noncritical Items

Table 16 shows the percentage of critical items for each polygraph test that were ranked "one" (the response most indicative of deception) with respect to respiration, GSR amplitude, GSR maximum height, and GSR total length. Since there were five items on each test, the chance expectancy that the critical item would be ranked "one" was 20 percent for each of the physiological indices monitored.
All of the percentages, regardless of the test or physiological measure on which they were based, were significantly more accurate than this chance level (z ≥ 2.74).

Table 16.--The percentage of critical items ranked "one" (the most indicative of deception) with respect to respiration, GSR amplitude, GSR maximum height, and GSR total length for each polygraph test.

                                   Physiological Index
Polygraph Test                        GSR          GSR         GSR
                      Respiration     Amplitude    Maximum     Total
                                                   Height      Length
Demonstration(a)         41.1           60.0         57.8        51.1
Test one                 45.2           46.3         40.0        37.8
Test two                 48.1           53.7         46.3        43.3
Test three               41.9           29.6         31.5        28.5
Test four                52.2           43.0         33.0        37.0
Test five                40.0           26.7         27.0        28.9

(a) The calculations pertaining to the demonstration test were based only on the charts produced by the 90 subjects in the feedback control conditions.

Excluding the demonstration test, respiration was generally the best or one of the best physiological indices in discriminating between the critical and noncritical items. However, on the demonstration test (based solely on the 90 subjects in the three feedback control conditions), the three GSR indices had higher detection efficiency levels than respiration.

The same relationships are demonstrated in Table 17, which depicts the mean rank of the critical item with respect to the aforementioned four dependent variables on each of the polygraph tests. The chance level for each of the dependent variable mean ranks on all of the polygraph tests was X̄ = 3.0, SD = 1.4 (based on their probability distributions); that mean is a significantly higher value than any of the actual mean rank scores attained (|z| ≥ 4.49). Thus the polygraph had a significantly higher detection efficiency than chance for each physiological parameter on all of the polygraph tests with regard to both the percentage of responses to the critical items that were scored as the most indicative of deception and for the mean score of the dependent variable ranks.
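The two comparisons against chance reported above can be illustrated with one-sample z statistics. The sketch below is an assumption about the form of the tests, since the text gives only the resulting z values: it uses a normal approximation for a proportion against the 20 percent chance level, and a z test for a mean rank against the mean and standard deviation of a uniform distribution over ranks one through five. The function names are hypothetical, and the count of 72 hits is inferred from the rounded 26.7 percent of 270 in Table 16 rather than stated in the text.

```python
from math import sqrt


def z_proportion(hits, n, p0=0.2):
    """z statistic for an observed proportion hits/n against chance level p0."""
    p_hat = hits / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def chance_rank_moments(k=5):
    """Mean and SD of a rank drawn uniformly from 1..k (the chance model)."""
    ranks = range(1, k + 1)
    mean = sum(ranks) / k
    variance = sum((rank - mean) ** 2 for rank in ranks) / k
    return mean, sqrt(variance)


def z_mean_rank(observed_mean, n, k=5):
    """z statistic for an observed mean rank against the chance mean rank."""
    mean, sd = chance_rank_moments(k)
    return (observed_mean - mean) / (sd / sqrt(n))


# Lowest accuracy among the five actual tests in Table 16:
# GSR amplitude on test five, 26.7 percent of 270 (assumed to be 72 hits).
z_lowest_pct = z_proportion(72, 270)

# Highest (least accurate) mean rank in Table 17: GSR amplitude on test five.
z_highest_mean = z_mean_rank(2.61, 270)

print(round(z_lowest_pct, 2), round(z_highest_mean, 2))
```

Under these assumptions the proportion statistic rounds to the 2.74 reported in the text. The mean rank statistic comes out close to, but not exactly at, the reported 4.49, which may reflect rounding of the tabled mean.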
Table 17.--The mean rank of the critical items with respect to respiration, GSR amplitude, GSR maximum height, and GSR total length for each polygraph test.

                                   Physiological Index
Polygraph Test                        GSR          GSR         GSR
                      Respiration     Amplitude    Maximum     Total
                                                   Height      Length
Demonstration(a)         2.33           1.76         1.79        1.97
Test one                 2.18           2.12         2.23        2.30
Test two                 2.10           1.86         2.09        2.00
Test three               2.14           2.45         2.34        2.49
Test four                1.95           2.11         2.37        2.31
Test five                2.25           2.61         2.42        2.60

Note: The smaller the mean, the higher the detection efficiency the variables had in identifying the critical items.

(a) The calculations pertaining to the demonstration test were based only on the charts produced by the 90 subjects in the feedback control conditions.

One of the factors reducing the detection efficiency of the three GSR indices was the lack of electrodermal responsiveness demonstrated by certain subjects. To control for this phenomenon, the GSR maximum height accuracy percentages and means were recalculated, excluding tests in which the subjects produced a GSR maximum height downward drift pattern. Similarly, these values for GSR amplitude and GSR total length were recalculated excluding charts in which all five of the responses on a particular test were GSR amplitude falling patterns. Since no related problems were associated with the respiration patterns, there was no need to recalculate their respective values.

Table 18 shows the recalculated percentages of critical items that were ranked "one" for the four dependent variables, taking into account the exclusions mentioned above. Table 19 depicts the recalculated mean ranks of the critical items for the polygraph tests with respect to each of the dependent variables.
When the tests containing a GSR maximum height downward drift pattern and/or five GSR amplitude falling patterns were excluded from the two measures of detection efficiency, GSR maximum height became the most accurate index of deception, as demonstrated by the highest percentage of critical items ranked "one" and the lowest critical item mean ranks for each polygraph test. Respiration, GSR amplitude, and GSR total length were fairly equivalent in their ability to discriminate between critical and noncritical items, when the charts with all five responses constituting GSR amplitude falling patterns were excluded from the accuracy calculations for GSR amplitude and GSR total length. However, of these three dependent variables GSR amplitude was the most valid measure on the demonstration and first two actual polygraph tests, whereas respiration was the more accurate on the last three polygraph tests.

Table 18.--The percentage of critical items ranked as the most indicative of deception with respect to the four principal dependent measures for each polygraph test.

                                   Physiological Index
Polygraph Test                        GSR          GSR         GSR
                      Respiration     Amplitude    Maximum     Total
                                                   Height      Length
Demonstration(a)         41.1           60.0         80.0        51.1
                        (n=90)         (n=90)       (n=65)      (n=90)
Test one                 45.2           47.4         57.8        38.6
                        (n=270)        (n=264)      (n=187)     (n=264)
Test two                 48.1           54.9         60.1        44.3
                        (n=270)        (n=264)      (n=208)     (n=264)
Test three               41.9           31.4         46.0        30.2
                        (n=270)        (n=255)      (n=185)     (n=255)
Test four                52.2           45.1         52.4        38.9
                        (n=270)        (n=257)      (n=170)     (n=257)
Test five                40.0           28.4         42.4        30.7
                        (n=270)        (n=254)      (n=172)     (n=254)

Note: These figures were calculated excluding tests containing GSR maximum height downward drift patterns in calculating the values associated with GSR maximum height and excluding tests containing GSR amplitude falling patterns on all five responses on a given test in deriving the values associated with GSR amplitude and GSR total length.
(a) The calculations pertaining to the demonstration test were based only on the charts produced by the 90 subjects in the feedback control conditions.

Table 19.--The mean rank of the critical items with respect to the four principal dependent variables for each polygraph test.

                                   Physiological Index
Polygraph Test                        GSR          GSR         GSR
                      Respiration     Amplitude    Maximum     Total
                                                   Height      Length
Demonstration(a)         2.33           1.76         1.32        1.97
                        (n=90)         (n=90)       (n=65)      (n=90)
Test one                 2.18           2.10         1.89        2.29
                        (n=270)        (n=264)      (n=187)     (n=264)
Test two                 2.10           1.83         1.81        1.97
                        (n=270)        (n=264)      (n=208)     (n=264)
Test three               2.14           2.42         2.03        2.46
                        (n=270)        (n=255)      (n=185)     (n=255)
Test four                1.95           2.07         2.01        2.27
                        (n=270)        (n=257)      (n=170)     (n=257)
Test five                2.25           2.58         2.08        2.57
                        (n=270)        (n=254)      (n=172)     (n=254)

Note: These figures were calculated excluding the tests containing GSR maximum height downward drift patterns in calculating the values associated with GSR maximum height and excluding charts containing GSR amplitude falling patterns on all five responses on a given test in deriving the values associated with GSR amplitude and GSR total length.

(a) The calculations pertaining to the demonstration test were based only on the charts produced by the 90 subjects in the feedback control conditions.

Detection Rates Attained Using the Scoring Procedure Developed by Lykken (1959)

The accuracy of the polygraph in this experiment was also analyzed using the scoring procedure developed by Lykken (1959). If the dependent variable associated with the critical item was ranked "one" (most indicative of deception), it was given a score of 2 on that test. If the dependent variable associated with the critical item was ranked "two," it was given a score of 1. Thus by summing the scores on the five polygraph tests a perfect guilty score for each of the dependent variables was 10.

Since none of the subjects in this study was innocent, it was impossible to make a direct comparison between the actual scores of innocent and guilty subjects.
However, it was possible to calculate the theoretical distribution to estimate the expected proportions of innocent subjects that would have achieved each of the various scores. For example, the probability that an innocent subject would have received a score of 10 would be (.2)^5, assuming that it was equally likely that the subject's largest response would have been to the critical item as it was to any of the four noncritical items on each of the five tests. Thus one would expect .032 percent of all innocent subjects to have a score of 10 if an infinite number of innocent subjects was tested.

The estimated proportions (probability distribution) of innocent subjects that would have obtained each of the scores possible in this study are presented in Table 20. The population mean of these scores based on their probability distribution is 3, with a standard deviation of 1.789. As indicated in Table 20, if a cut-off point of scores 5 or greater was selected as values indicative of guilt, theoretically approximately 20 percent of the innocent subjects would have been misclassified as guilty. It should be noted that the percentages depicted in Table 20 are actually probability values. Thus if innocent subjects had been included in this study, the actual percentage of innocent subjects misclassified would probably be either slightly higher or lower than indicated by the table. However, the more innocent subjects tested, the less deviation should occur between the actual percentage of misclassification and the estimated values given the aforementioned chance model.

Table 20.--The estimated proportion (probability distribution) of innocent subjects attaining each of the possible scores for the testing model incorporated into this study using the Lykken (1959) scoring procedure.
Score     Estimated Relative       Cumulative Relative
          Frequency (Percent)      Frequency (Percent)
 10             .032                     .032
  9             .160                     .192
  8             .800                     .992
  7            2.240                    3.232
  6            5.920                    9.152
  5           10.592                   19.744
  4           17.760                   37.504
  3           20.160                   57.664
  2           21.600                   79.264
  1           12.960                   92.224
  0            7.776                  100.000

The subjects' GSR amplitude values were scored using the aforementioned scoring procedure. The frequency distribution of those scores is presented in Table 21. The mean score for subjects with respect to that dependent variable was 5.07, with a standard deviation of 2.38. As indicated by Table 21, 57.8 percent of the subjects had a score of 5 or greater. Since several of the subjects demonstrated a low degree of electrodermal responsiveness on some of the tests, the frequency distribution was recalculated excluding subjects that had three or more GSR amplitude falling patterns on three or more of the five polygraph tests. This procedure eliminated 33 out of the 270 subjects included in this study. The frequency distribution of scores for the remaining 237 subjects is presented in Table 22. The mean score for those subjects was 5.25, with a standard deviation of 2.31. A comparison between the original mean with respect to GSR amplitude and the recalculated mean after the 33 subjects with a low degree of electrodermal responsiveness were eliminated indicated that the two means were not significantly different (|z| = .87). Table 22 shows that 60.8 percent of the remaining 237 subjects had scores of 5 or greater when their GSR amplitude responses were scored using the Lykken (1959) procedure.

Table 21.--The actual proportion of subjects attaining each of the possible scores derived from scoring the subjects' GSR amplitude values using the Lykken (1959) procedure.

Score     Absolute         Relative Frequency     Cumulative Relative
          Frequency (n)    (Percent)              Frequency (Percent)(a)
 10            7                 2.6                     2.6
  9           12                 4.4                     7.0
  8           30                11.1                    18.1
  7           31                11.5                    29.6
  6           31                11.5                    41.1
  5           45                16.7                    57.8
  4           50                18.5                    76.3
  3           25                 9.3                    85.6
  2           18                 6.7                    92.3
  1           11                 4.1                    96.4
  0           10                 3.7                   100.1

(a) Cumulative relative frequency column totals more than 100 percent due to rounding.

Table 22.--The actual proportion of subjects attaining each of the possible scores derived from scoring the subjects' GSR amplitude values using the Lykken (1959) procedure excluding subjects that had three or more GSR amplitude falling patterns on three or more of the five polygraph tests.

Score     Absolute         Relative Frequency     Cumulative Relative
          Frequency (n)    (Percent)              Frequency (Percent)(a)
 10            7                 3.0                     3.0
  9           11                 4.6                     7.6
  8           27                11.4                    19.0
  7           31                13.1                    32.1
  6           28                11.8                    43.9
  5           40                16.9                    60.8
  4           42                17.7                    78.5
  3           22                 9.3                    87.8
  2           16                 6.8                    94.6
  1            8                 3.4                    98.0
  0            5                 2.1                   100.1

(a) Cumulative relative frequency column totals more than 100 percent due to rounding.

The ranks [...] 19). Therefore, the polygraph testing procedure used in this study detected the deceptive responses made by the subjects significantly more frequently than chance expectancy levels.

A separate three-factor analysis of variance was conducted to examine the effects of sex, placebo condition, feedback condition, and all of their possible interactions on each of the four sets of scores presented above. The results derived from these statistical tests are presented in Appendix G. No significant main or interaction effects were found for any of the four dependent measures that were scored using the aforementioned scoring procedure. Hence, none of the principal independent variables examined in this study had a significant effect on polygraph detection efficiency when the Lykken (1959) scoring procedure was employed.

The Results Derived From the Biographical Data Sheet and the Follow-Up Questionnaire

The results derived from the biographical data sheet (Appendix D) and the follow-up questionnaire (Appendix E) are presented in this section. To convey this material in a systematic and comprehensible manner, the findings are broken down into several subsections.
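The chance model behind Table 20 in the preceding section can be reproduced by enumeration. On each of the five tests an innocent subject's critical item is assumed equally likely to hold any of the five ranks, so it earns a score of 2 with probability .2, a score of 1 with probability .2, and a score of 0 with probability .6. The sketch below convolves that per-test distribution five times using exact fractions. The function name and its parameters are illustrative assumptions, not part of the original analysis.

```python
from fractions import Fraction
from math import sqrt


def innocent_score_distribution(n_tests=5, n_items=5):
    """Exact distribution of the summed Lykken score under the chance model."""
    per_test = {
        2: Fraction(1, n_items),            # critical item ranked "one"
        1: Fraction(1, n_items),            # critical item ranked "two"
        0: Fraction(n_items - 2, n_items),  # any lower rank
    }
    dist = {0: Fraction(1)}
    for _ in range(n_tests):
        updated = {}
        for total, prob in dist.items():
            for score, p in per_test.items():
                updated[total + score] = updated.get(total + score, 0) + prob * p
        dist = updated
    return dist


dist = innocent_score_distribution()
mean = sum(score * prob for score, prob in dist.items())
variance = sum((score - mean) ** 2 * prob for score, prob in dist.items())
false_positive = sum(prob for score, prob in dist.items() if score >= 5)

for score in sorted(dist, reverse=True):
    print(f"{score:2d}  {float(dist[score]) * 100:7.3f}")
print(float(mean), round(sqrt(variance), 3), float(false_positive) * 100)
```

Under these assumptions the enumeration gives a mean of 3, a standard deviation of about 1.789, and 19.744 percent of innocent subjects at or above the cut-off of 5, in agreement with the values reported for Table 20.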
Each subsection commences with a report of the relative frequencies of subjects falling into the various categories described under that particular subheading. For example, one of the questions the subjects were asked related to their church attendance during the year preceding their involvement in the experiment. When that topic is presented, the various categories of church attendance on which the data were compiled (i.e., 0 times, 1-5 times, etc.) and the number of subjects falling into those particular categories are depicted.

After providing that descriptive information, the subsection focuses on the relationship between the variable being discussed and three dependent variables: (1) a refined measure of detection efficiency, which is referred to as the sum of critical item composite ranks (SCICR); (2) the frequency of GSR amplitude falling patterns; and (3) the frequency of downward drift patterns with respect to GSR maximum height.

The sum of critical item composite ranks dependent measure was designed to take into account some of the findings already noted in this section. To compensate for GSR maximum height downward drift patterns and/or when all five responses on a given polygraph test were scored as GSR amplitude falling patterns (in both situations, all five responses would have been assigned ranks of three for those dependent measures), the ranks of the dependent measures on each item were added together. The sums of the original ranks for the five items on each test were then ranked from one (smallest composite) to five (largest composite). Finally, the new ranks associated with critical items on each of the five polygraph tests were added together, yielding a possible range from 5 to 25. A sum of critical item composite rank value of five would indicate that the critical item was correctly differentiated from the noncritical items on each of the five polygraph tests.
Only three of the four dependent measures derived from the polygraph charts were used in compiling the sum of critical item composite rank values. The value for GSR total length was eliminated from these calculations because it was highly correlated with GSR amplitude and it provided little discriminatory value. The latter point was demonstrated when discriminant analysis was conducted to determine the relative contribution of each of the four dependent variables in discriminating between the critical and noncritical items. The values of the Beta weights associated with GSR total length were negligible for all five polygraph tests, indicating that it was not necessary to incorporate it as a major dependent variable. However, the Beta weights for the other three dependent measures were much higher and relatively equivalent, suggesting that weighting them equally was acceptable.

Thus the first step in determining the sum of critical item composite rank value involved adding the relative ranks for respiration, GSR maximum height, and GSR amplitude together for each of the five items on the polygraph tests. Next, the five sums for each polygraph test were ranked. Finally, the new ranks assigned to the five critical items were added together. For example, if there were absolutely no positive GSR responses on a given test, all five ranks with respect to both GSR maximum height and GSR amplitude would have been originally ranked as three. Using this new system, the two ranks of three for each of the five items would cancel each other out and the sole determinant of the critical item's relative rank on that particular test would be respiration. However, if no GSR maximum height downward drift or GSR amplitude falling patterns occurred, all three measures would be weighted equally in computing the rank of the critical item on that particular test.
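The three steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure as it was actually carried out: the function names (`average_ranks`, `critical_item_rank`, `scicr`) and the input layout are hypothetical, and the handling of ties when the five sums are re-ranked is an assumption. Tied sums are given their mean rank here, by analogy with the rank of three assigned to five tied responses.

```python
from typing import List, Sequence, Tuple

# One polygraph test: per-item ranks for respiration, GSR maximum height,
# and GSR amplitude (five items each), plus the index of the critical item.
Test = Tuple[Sequence[float], Sequence[float], Sequence[float], int]


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 (smallest) upward; tied values share their mean rank.

    The tie rule is an assumption made for this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # mean of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def critical_item_rank(respiration: Sequence[float],
                       gsr_max_height: Sequence[float],
                       gsr_amplitude: Sequence[float],
                       critical_index: int) -> float:
    """Add the three ranks per item, re-rank the five sums, return the critical item's rank."""
    sums = [r + h + a for r, h, a in zip(respiration, gsr_max_height, gsr_amplitude)]
    return average_ranks(sums)[critical_index]


def scicr(tests: Sequence[Test]) -> float:
    """Sum of critical item composite ranks over all tests (5 to 25 for five tests)."""
    return sum(critical_item_rank(*test) for test in tests)


if __name__ == "__main__":
    # A test with no positive GSR response: both GSR measures are ranked 3 on
    # every item, so respiration alone decides the critical item's rank.
    no_gsr: Test = ([1, 2, 3, 4, 5], [3, 3, 3, 3, 3], [3, 3, 3, 3, 3], 0)
    print(scicr([no_gsr] * 5))
```

In the usage example the two constant GSR rank vectors add the same amount to every item, so the re-ranked sums follow the respiration ranks alone, which is the cancellation described in the text.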
The other two dependent measures discussed in each of the remaining subsections are the frequency of GSR amplitude falling patterns and GSR maximum height downward drift patterns, which were both defined in the final section of the methodology chapter. These additional dependent measures are examined in each of the subsections for two reasons. First, since GSR maximum height is in effect excluded from the sum of critical item composite rank calculations on a test when downward drifting occurs and the same applies for GSR amplitude when falling patterns occur on all five responses for a given test, it is important to note the prevalence of these phenomena when considering the sum of critical item composite rank values. Second, because of the abnormally high frequency of subjects not showing any positive response on the GSR measures to many of the questions, it was of major importance to determine why this lack of electrodermal responsiveness occurred.

To accomplish the aforementioned tasks, three one-way ANOVAs were calculated, comparing the means of the different categories within each subsection with respect to each of the dependent variables. For example, in the subsection pertaining to church attendance, the subjects are classified as having either low, medium, or high church attendance. Then the three means for these groups are compared to each other with respect to each of the three dependent variables. This should indicate which of the variables derived from the biographical data sheet and the follow-up questionnaire appear to be related to the accuracy of the polygraph decisions and/or the lack of electrodermal responsiveness.

Age

The subjects ranged in age from 17 to 42 (X̄ = 19.8, SD = 2.54). To facilitate analyzing the data, the age groupings of the subjects were collapsed into the following three categories: (1) ages 17 to 18, (2) ages 19 to 21, and (3) ages 22 to 42.
Eighty-nine subjects fell into the 17 to 18 age group, 152 subjects into the 19 to 21 age category, and 29 into the last group. Table 25 depicts the means of those three age groups with respect to the sum of the critical item composite ranks, GSR maximum height downward drift, and GSR amplitude falling patterns. The table also reports the level of significance derived from the one-way analysis of variance calculations, which examined differences among the three age group means for each dependent variable. None of the three ANOVAs indicated there were any significant differences among the age group means for the three respective dependent variables.

Table 25.--A comparison of means for three age categories with respect to one measure of polygraph detection efficiency and two measures of electrodermal responsiveness.

                                                 Age Categories (group means)
Dependent Variable                               17-18 Years   19-21 Years   22-42 Years   One-Way ANOVA Results
                                                 (n=89)        (n=152)       (n=29)
Sum of critical item composite ranks(a)          9.96          9.16          9.41          F(2,267)=1.74, p=.18
GSR maximum height downward drift patterns(b)    1.54          1.70          1.10          F(2,267)=1.74, p=.18
GSR amplitude falling patterns(c)                4.16          4.51          3.34          F(2,267)= .58, p=.56

(a) The higher the mean, the less detection efficiency exhibited by the sum of critical item composite ranks. Possible range: 5-25.
(b) The higher the mean, the more frequently GSR maximum height downward drift patterns were produced. Possible range: 0-5.
(c) The higher the mean, the more frequently GSR amplitude falling patterns were produced. Possible range: 0-25.

Sex

As previously mentioned, 135 male and 135 female subjects participated in this study. Table 26 shows the differences between the means for males and females with regard to the sum of the critical item composite scores and the two measures of electrodermal responsiveness, as well as indicating their level of significance.
It is interesting that there was not a significant difference between males and females when their sums of critical item composite scores were compared (p = .44). However, females had significantly more GSR maximum height downward drift and GSR amplitude falling patterns than did males (p < .0001).

Table 26.--A comparison of means for males and females with respect to one measure of polygraph detection efficiency and two measures of electrodermal responsiveness.

                                                 Sex Categories (group means)
Dependent Variable                               Male       Female     Z-Test Results
                                                 (n=135)    (n=135)
Sum of critical item composite ranks(a)          9.30       9.60       z=-.773, p=.44
GSR maximum height downward drift patterns(b)    1.01       2.16       z=-6.23, p<.0001
GSR amplitude falling patterns(c)                2.47       6.07       z=-5.77, p<.0001

(a) The higher the mean, the less detection efficiency exhibited by the sum of critical item composite ranks. Possible range: 5-25.
(b) The higher the mean, the more frequently GSR maximum height downward drift patterns were produced. Possible range: 0-5.
(c) The higher the mean, the more frequently GSR amplitude falling patterns were produced. Possible range: 0-25.

Immediate Family Size

During the pretest interview, the subjects were asked how many individuals, including themselves, were in their immediate family. Their responses were collapsed into the following categories: 2 to 4 members, 5 to 7 members, and immediate families with more than 7 members. Table 27 presents the means of these categories with respect to the sum of the critical item composite score values and the numbers of both GSR maximum height downward drift and GSR amplitude falling patterns present on the polygraph tests for these respective groups. The table also shows the significance levels attained when the immediate family size category means were compared to each other for each of these three dependent variables.
Although none of the means were significantly different on any of the three dependent variables, they did suggest an interesting pattern. Both the sum of critical item composite ranks and electrodermal responsiveness decreased as family size increased; by footnote a of Table 27, the lower composite ranks correspond to greater detection efficiency.

Table 27.--A comparison of means for three categories of family size with respect to one measure of polygraph detection efficiency and two measures of electrodermal responsiveness.

                                                 Immediate Family Size (group means)
Dependent Variable                               2 to 4 Members   5 to 7 Members   Over 7 Members   One-Way ANOVA Results
                                                 (n=68)           (n=156)          (n=46)
Sum of critical item composite ranks(a)          9.92             9.48             8.66             F(2,267)=2.17, p=.12
GSR max. height downward drift patterns(b)       1.35             1.50             1.89             F(2,267)=1.53, p=.22
GSR amplitude falling patterns(c)                (illegible)      (illegible)      (illegible)      F(2,267)=1.01

(a) The higher the mean, the less detection efficiency exhibited by the sum of critical item composite ranks. Possible range: 5-25.
(b) The higher the mean, the more frequently GSR maximum height downward drift patterns were produced. Possible range: 0-5.
(c) The higher the mean, the more frequently GSR amplitude falling patterns were produced. Possible range: 0-25.

Combined Family Income

During their pretest interview, the subjects were also asked to indicate the combined income of their parents. For analysis purposes, responses were placed in one of the following categories: (1) less than $15,000, (2) $15,000-$24,999, and (3) over $24,999. The numbers of subjects falling into each of these categories were 34, 92, and 142, respectively. Table 28 compares the group means for these categories with respect to the sum of the critical item rank composite scores, GSR maximum height downward drift, and GSR amplitude falling patterns. None of the combined family income group means were significantly different from each other for any of the three dependent variables.
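The significance levels attached to the three-group ANOVAs in this section can be cross-checked without statistical tables. With two numerator degrees of freedom, the upper-tail probability of an F statistic has the closed form p = (1 + 2F/df2)^(-df2/2). The sketch below applies that identity to F values reported in Tables 25 and 27. The function name is hypothetical, and the check assumes the printed F values are accurate to two decimals, so agreement is only expected to within rounding.

```python
def p_value_f_two_numerator_df(f_stat: float, df_denominator: int) -> float:
    """Upper-tail probability of an F(2, df_denominator) statistic.

    Valid only when the numerator has exactly 2 degrees of freedom,
    i.e. a one-way ANOVA comparing three group means.
    """
    return (1.0 + 2.0 * f_stat / df_denominator) ** (-df_denominator / 2.0)


# (label, reported F, denominator df, reported p) as printed in the tables.
reported = [
    ("Table 25, sum of critical item composite ranks", 1.74, 267, 0.18),
    ("Table 25, GSR amplitude falling patterns", 0.58, 267, 0.56),
    ("Table 27, sum of critical item composite ranks", 2.17, 267, 0.12),
    ("Table 27, GSR max. height downward drift patterns", 1.53, 267, 0.22),
]

if __name__ == "__main__":
    for label, f_stat, df2, p_reported in reported:
        p_computed = p_value_f_two_numerator_df(f_stat, df2)
        print(f"{label}: reported p={p_reported:.2f}, computed p={p_computed:.4f}")
```

The same identity does not apply to the Z-tests used for the two-group comparisons, which follow the standard normal distribution instead.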
Table 28.--A comparison of means for three categories of combined family income with respect to one measure of polygraph detection efficiency and two measures of electrodermal responsiveness.

                                                 Combined Family Income (group means)
Dependent Variable                               Less Than $15,000   $15,000-24,999   More Than $24,999   One-Way ANOVA Results
                                                 (n=34)              (n=92)           (n=142)
Sum of critical item composite ranks(a)          9.34                9.30             9.51                F(2,265)= .12, p=.88
GSR max. height downward drift patterns(b)       1.41                1.53             1.68                F(2,265)= .50, p=.61
GSR amplitude falling patterns(c)                3.56                3.85             4.73                F(2,265)=1.07, p=.35

(a) The higher the mean, the less detection efficiency exhibited by the sum of critical item composite ranks. Possible range: 5-25.
(b) The higher the mean, the more frequently GSR maximum height downward drift patterns were produced. Possible range: 0-5.
(c) The higher the mean, the more frequently GSR amplitude falling patterns were produced. Possible range: 0-25.

Subject's Year in School

One hundred sixty-four subjects participating in this study were either freshmen or sophomores; 106 of the subjects had attained junior status or higher. A Z-test was conducted for each of the three dependent variables (sum of critical item composite ranks, GSR maximum height downward drift, and GSR amplitude falling patterns) to determine if the means for the two groups were significantly different. The means and their respective significance levels for those Z-tests are presented in Table 29. None of the three Z-tests indicated there were any significant differences between these two categories of subjects.

Table 29.--A comparison of group means for underclassmen and upperclassmen with respect to one measure of polygraph detection efficiency and two measures of electrodermal responsiveness.

                                                 Subject's Year in School (group means)
Dependent Variable                               Freshmen and Sophomores   Juniors or Above   Z-Test Results
                                                 (n=164)                   (n=106)
Sum of critical item composite ranks(a)          9.67                      9.11               z=1.11, p=.27
GSR max. height downward drift patterns(b)       (illegible)               1.55               z= .34, p=.73
GSR amplitude falling patterns(c)                4.27                      4.25               z= .04, p=.97

(a) The higher the mean, the less detection efficiency exhibited by the sum of critical item composite ranks. Possible range: 5-25.
(b) The higher the mean, the more frequently GSR maximum height downward drift patterns were produced. Possible range: 0-5.
(c) The higher the mean, the more frequently GSR amplitude falling patterns were produced. Possible range: 0-25.

Subject's Grade Point Average

Table 30 shows the levels of significance derived from three one-way ANOVAs, which compared different grade point average categories with respect to their means on the sum of critical item composite ranks, GSR maximum height downward drift, and GSR amplitude falling patterns. The grade point average categories selected were: less than 2.8, 2.8 to 3.3, and over 3.3. All of the subjects' grade point averages were based on a four-point scale (A = 4.0). Fifty-four subjects had a grade point average less than 2.8, 144 subjects' grade point averages were between 2.8 and 3.3, and 72 were 3.4 or above. The three ANOVAs indicated there were no significant differences among the subjects in the different grade point average categories for any of the three dependent variables.

Subject's Religious Preference

Each of the subjects was asked to indicate his/her religious preference. Fifty-two subjects responded none, 13 replied Jewish, 106 stated Catholic, 98 said Protestant, and one declined to state a religious preference. Three one-way ANOVAs were conducted to determine if there were any differences among the four religious preference means with respect to the sum of the critical item composite ranks, GSR maximum height downward drift, and GSR amplitude falling patterns. The different means and their respective levels of significance are presented in Table 31.
Although none of the aforementioned mean comparisons indicated significant differences, the mean for the Jewish subjects on GSR amplitude falling patterns (X̄ = 6.77) was higher than that for subjects indicating no religious preference (X̄ = 3.90), a preference for Catholicism (X̄ = 4.77), or preferring the Protestant denominations (X̄ = 3.62). The relatively high probability value associated with that ANOVA (p = .16), despite the perceptibly higher mean for GSR amplitude falling patterns associated with Jewish subjects, was in part a function of the small number of subjects stating a Jewish preference.

Table 30.--A comparison of group means for three categories of school grade point average with respect to one measure of polygraph detection efficiency and two measures of electrodermal responsiveness.

                                                 Subject's Grade Point Average (group means)
Dependent Variable                               Less Than 2.8   2.8 to 3.3   Greater Than 3.3   One-Way ANOVA Results
                                                 (n=54)          (n=144)      (n=72)
Sum of critical item composite ranks(a)          9.44            9.33         9.70               F(2,267)= .32, p=.73
GSR max. height downward drift patterns(b)       1.80            1.55         1.50               F(2,267)= .60, p=.55
GSR amplitude falling patterns(c)                4.31            4.28         4.19               F(2,267)= .01, p=.99

(a) The higher the mean, the less detection efficiency exhibited by the sum of critical item composite ranks. Possible range: 5-25.
(b) The higher the mean, the more frequently GSR maximum height downward drift patterns were produced. Possible range: 0-5.
(c) The higher the mean, the more frequently GSR amplitude falling patterns were produced. Possible range: 0-25.

Table 31.--A comparison of group means for four religious preference categories with respect to one measure of polygraph detection efficiency and two measures of electrodermal responsiveness. (The body of this table is illegible in the source.)