THE EFFECT OF VARYING TIME INTERVALS ON THE REPRODUCTION AND RECALL OF RORSCHACH RESPONSES ON RETEST

Thesis for the Degree of Ph. D.
MICHIGAN STATE COLLEGE
Bertram H. Schneider
1955

This is to certify that the thesis entitled The Effect of Varying Time Intervals on the Reproduction and Recall of Rorschach Responses on Retest presented by Bertram H. Schneider has been accepted towards fulfillment of the requirements for the Ph. D. degree in Psychology.

THE EFFECT OF VARYING TIME INTERVALS ON THE REPRODUCTION AND RECALL OF RORSCHACH RESPONSES ON RETEST

By Bertram H. Schneider

AN ABSTRACT

Submitted to the School of Graduate Studies of Michigan State College of Agriculture and Applied Sciences in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Psychology
Year 1955

The purpose of this study was to investigate the consistency of Rorschach results on retest where subjects were not exposed to any treatment other than the systematic varying of the time intervening between the tests. Specifically, it was to determine the differential effects of the passage of time on the persistence, change and recall of Rorschach responses on retest.

Two hypotheses were set forth: 1) performance on the Rorschach in terms of repeated responses on retest remains constant over varied short time intervals; 2) recall of those responses decreases as a function of the length of time between tests.

Sixty patients, screened to assure exclusion of those with neuropsychiatric conditions, were selected as subjects from a VA general medical and surgical hospital. The subjects were distributed into three groups of twenty, equated for age and intelligence.
Each group was retested with the Rorschach after the following approximate time intervals: group I, four hours; group II, two weeks; group III, two months. Following the retest of each subject, his responses were individually read back to him and he was asked if these responses had been given in the initial test. Following this recall procedure a questionnaire, designed to obtain a verbal report of the effect of recall on the retest, was administered to each subject.

Two techniques were used to obtain data to test the hypotheses. The response-comparison technique was a matching procedure, by means of which each pair of Rorschach protocols was compared for common or consistent responses. The recall technique was a scoring method by means of which accuracy in the identification of retest responses as new or repeated responses was determined. The two techniques yielded seven measures which were tested for significance of differences among the three groups by the t-test.

The results on the whole confirmed both hypotheses. It was found that the measures of consistency devised to test the first hypothesis did not yield significant differences among the three groups, regardless of the length of time elapsed before retest. The measures of recall devised to test the second hypothesis decreased as a function of the length of time between tests.

It is concluded on the basis of the results that retest consistency is not to be solely accounted for in terms of recall. The verbal reports of the subjects are compatible with this finding. Forty per cent of the subjects reported that it was the stimulus properties of the cards that seemed to elicit the same responses on retest. This may be compared to 13% who reported that it was recall that seemed to be of primary importance in eliciting the same responses.

The results also indicated that the percentage of new responses on retest was 25.0 after four hours, 32.2 after two weeks and 29.5 after two months.
The verbal reports suggest that ease of concentration, curiosity and desires to be more thorough were some conditions related to the production of new responses.

THE EFFECT OF VARYING TIME INTERVALS ON THE REPRODUCTION AND RECALL OF RORSCHACH RESPONSES ON RETEST

By Bertram H. Schneider

A THESIS

Submitted to the School of Graduate Studies of Michigan State College of Agriculture and Applied Sciences in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Psychology
Year 1955

ACKNOWLEDGMENTS

The author is most deeply indebted to Dr. Albert I. Rabin, under whose inspiration and constant encouragement this research was initially undertaken. He also wishes to express his sincere thanks to Dr. M. Ray Denny, who guided the completion of the study, and to other members of the doctoral committee, Drs. Carl Frost and Alfred Dietze, for their constructive criticisms and helpful suggestions.

Grateful acknowledgement for valuable assistance is also extended to the psychology staff at the Dearborn VA Hospital, including Drs. John J. Brownfain, Andrew S. Dibner, William A. Alexander, Herbert B. Males and, in particular, to Dr. Bernard Chodorkoff for his most generous help.

Deep appreciation is also due to the author's family, especially to his wife, Nelda, for her endurance and innumerable contributions throughout the course of the study and to Mrs. Jesse Goldberg for her typing assistance.

And, finally, thanks are accorded to friends and associates, who aided in one way or another, including Drs. Ned Papania and William Brett and to Rosemarie Szilagyi for typing the final manuscript.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF TABLES
I. INTRODUCTION
   A. Some Problems of Rorschach Reliability
   B. Previous Studies
      Split-Half Studies
      Parallel Test Studies
      Test-Retest Reliability Studies
      Experimental Test-Retest Studies
      Clinical Test-Retest Studies
II. HYPOTHESES
III. METHODOLOGY
   A. Subjects
   B. Procedure
   C. Methods and Techniques
      Response Comparison Technique
      Recall Technique
      Matching by Judges
      Questionnaire
   D. Treatment of Data
IV. RESULTS
V. DISCUSSION
   A. Consistency and Recall
   B. Some Conditions Related to Consistency
   C. Some Conditions Related to Inconsistency
   D. Judges' Verbal Reports
   E. A Theoretical Interpretation
   F. Methodology
   G. Implications for Further Research
VI. SUMMARY
APPENDIX A - QUESTIONNAIRE
APPENDIX B - TABLES
APPENDIX C - VERNON'S FORMULAE
BIBLIOGRAPHY
LIST OF TABLES

Table I - Description of Groups
Table II - Definitions of Recall Classifications
Table III - Summary of Scoring Measures
Table IV - Comparison of Groups on Differences in the Reproduction Measure
Table V - Comparison of Groups on Differences in the Reproduction-Recall Measure
Table VI - Comparison of Groups on Differences in the Total Reproduction Measure
Table VII - Comparison of Groups on Differences in the Total Reproduction-Recall Measure
Table VIII - Comparison of Groups on Differences in the New Response Measure
Table IX - Comparison of Groups on Differences in the New Response Identification Measure
Table X - Comparison of Groups on Differences in the Recall Measure
Table XI - Comparison of Groups on Mean Percentages in Each Measure
Table XII - Comparison of Groups on Replies to Questions 1, 3, 4, 5, 6, 7, 8, 9
Table XIII - Comparison of Groups on Question #2 of Questionnaire
Table XIV - Comparison of Groups on Question #3 of Questionnaire
Table XV - Replies of "Yes" to Question #4 of Questionnaire
Table XVI - Comparison of Groups on Question #5 of Questionnaire
Table XVII - Replies of "Yes" to Question #6 of Questionnaire
Table XVIII - Comparison of Groups on Question #10(a) of Questionnaire
Table XIX - Comparison of Groups on Question #10(b) of Questionnaire

I. INTRODUCTION

Although the Rorschach test has become firmly established in psychological clinics as a major diagnostic instrument, research with it has never been focused on the specific question of temporal reliability. Most clinical studies on the Rorschach which report changes in performance due to drugs, electroshock, psychotherapy and the like do not include untreated control groups. Experimental studies generally are of the "testing-the-limits" nature where the standard procedure is altered in order to check on the resistance of the Rorschach to artifacts of the situation.
Reliability studies employing the test-retest method have been few for fear that practice or memory effects would mask any instability in the test.

The present study investigates the stability of the Rorschach in a situation where the subjects are not exposed to any treatment other than the systematic varying of the time interval between the initial test and the retest. The use of a temporal dimension permits an evaluation of the differential effects of the passage of time on the persistence, change and recall of Rorschach responses in retest.

A. Some Problems of Rorschach Reliability

Test reliability refers to the consistency with which a test yields information. Two major sources of unreliability are recognized: a) lack of stability in the function which is tested and b) errors in measurement. To illustrate the former one might consider the measurement of an earthworm by means of a foot ruler. The errors in measurement would be minimal whereas fluctuations in the length of the earthworm would give rise to inconsistent results. To illustrate the latter, one might consider the reverse, the measurement of a foot ruler by means of an earthworm. Here the function is perfectly stable, whereas the measuring instrument itself is faulty.

This problem is particularly applicable to psychometrics where both fluctuations in function and errors in measurement are common. If one attempts to establish test reliability by comparing the results of two test administrations separated in time (test-retest), both sources of error are operating. As an alternative the split-half technique has come into general usage. This method makes use of the comparison of equivalent halves of a test (first half vs. second half; odd items vs. even items, etc.). The obvious advantage here is that an assumption of function stability does not have to be made in order to assess errors of measurement.
However, where changes in the function over a period of time are the subject of investigation, the retest technique is essential. In this case the function must be considered stable over a short period of time and the errors of measurement then estimated between the initial and the repeated tests.

The test-retest approach to reliability has long borne the stigma of practice or recall as a source of error that gives a false appearance of consistency. Still, surprisingly enough, although the concept of reliability has been known since 1904 and the literature dealing with test reliability has increased to sizeable proportions, there is little evidence that this assumption has ever been directly tested.

Jackson (24) reports a study investigating the effect of varying the time between tests on reliability by using the Revised Beta Examination. The time intervals were one-half day, one day, three days, one week and five weeks. He found no general pattern evident in the results and he was unable from a statistical point of view to determine the net effects of practice. It did appear, however, that changes were not related to the length of time elapsed between tests. This could be explained on the basis that Jackson tested a relatively stable function (intelligence) and hence errors of measurement would remain constant over the various intervals of time. If memory effects were significant in sustaining an impression of stability, changes should have increased with the passage of time.

With the advent of personality tests, the problem of determining consistency through retest led to a dilemma. On the one hand is the argument that lack of change in the retest may be accounted for by memory or practice. On the other hand is the possibility that this lack of change is due to the reflection of stable features in the personality and that any changes are a consequence of actual psychological changes in the subject.
Opposing points of view on this problem have been represented among Rorschach investigators. Thornton and Guilford (49) state that the importance of the memory factor precludes a repetition of the same test series. Consequently they argue that the split-half method is the only one possible with the Rorschach technique for reliability studies. However, Piotrowski (37) insists that the split-half technique is unfeasible because of the unitary nature of the test which makes the direct comparison between parts impossible. He further claims that there are no practice or memory effects, because there is no conscious effort. He feels that rather than be called mere repetitions, repeated percepts should be considered as representing stable personality trends typical for the time elapsed between examinations.

B. Previous Studies

Split-half studies. The pioneer study on Rorschach reliability can be attributed to Vernon (51). There were two earlier test-retest type studies by Mira (34) and Wertham and Bleuler (53) which will be described later. Neither of these two studies presented statistical findings and Vernon's comment on them was that owing to the effect of memory factors, any "repeat correlation coefficients" would be spurious. He, therefore, decided to use the split-half method by considering the test as consisting of two parallel series of five blots each. He correlated the responses from one series consisting of cards I, III, V, VI and X with cards II, IV, VII, VIII and IX. The Spearman-Brown formula was used to correct for the reduced length of the test. The results yielded an average correlation of .54 which Vernon considered unsatisfactory. The one exception was a correlation of .91 for the number of responses. His conclusion was that if the test is to have any claim to "objective validity" it must be modified in order to achieve a minimum reliability level of from .70 to .80.
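The Spearman-Brown correction mentioned in connection with Vernon's study can be illustrated with a minimal sketch. The function names and the numerical inputs below are assumptions chosen for illustration; they are not taken from Vernon's data. The sketch only shows the standard form of the correction, in which a correlation between two half-tests is stepped up to estimate the reliability of the full-length test, together with its algebraic inverse.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Estimate the reliability of a test lengthened by `length_factor`
    from the correlation `r_half` between its parts (standard
    Spearman-Brown prophecy formula; length_factor=2 for split halves)."""
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


def half_test_correlation(r_full: float, length_factor: float = 2.0) -> float:
    """Algebraic inverse of spearman_brown: the part-test correlation
    implied by a corrected full-test reliability `r_full`."""
    return r_full / (length_factor - (length_factor - 1.0) * r_full)


if __name__ == "__main__":
    # Hypothetical half-test correlation, for illustration only.
    print(round(spearman_brown(0.50), 3))

    # A corrected coefficient of .54 (the average reported for Vernon's
    # study) would imply, for a single category, a half-test correlation
    # of roughly this size. Averages across categories do not invert
    # exactly, so this is an illustration rather than a reconstruction.
    print(round(half_test_correlation(0.54), 3))
```

Because the correction assumes that the two halves are equivalent, it bears directly on the objections to the split-half method raised by the Rorschach workers cited in this section.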
Hertz (20) followed Vernon's study with one also using the split-half method in which she divided the test into odd and even numbered cards. She reports finding an average correlation coefficient of .829 as compared to Vernon's average of .54. She explained the higher reliability on the basis of more adequate standardization of the testing procedure and increased objectivity of the scoring. As a consequence of her findings Hertz maintained that the test factors were reliable and that hence personality traits in terms of the inter-relationships of the Rorschach factors tend to be consistent and follow a stable pattern.

A third investigation using the split-half technique was reported by Thornton and Guilford (49). They singled out the "Erlebnistypus" scores for a reliability study. The results were somewhat inconclusive. They stated that, under favorable but unknown conditions, reliability could be demonstrated for the M and C scores. This study touched off a blast by Piotrowski against "atomistic" studies carried out to avoid the memory factor. Shortly after this a general reaction against split-half studies set in among Rorschach workers. Hertz (21), herself, signalled the end of this method by pointing to the "global nature of the test" which excludes working with variables abstracted from the whole.

A further re-evaluation by objectors to the split-half method indicates its unsuitability for the following reasons: 1) There are unique stimulus values in the individual cards. Hence each card elicits varying frequencies of the responses that are summarized in the various scoring categories. 2) The split-half method assumes a relatively constant test performance throughout the test situation which is not met in practice. The cards have a sequential and mutual relation considered of interpretive importance which is not only of necessity ignored in split-half studies but also is contrary to the assumption of an unfluctuating function.
3) On statistical grounds Cronbach (10) points to the unfeasibility of using the Spearman-Brown correction, where ratio scores, such as those found in the Rorschach, vary with productivity. 4) The fact that five of the ten cards are colored precludes the equality of the stimulus values of the split-halves.

Parallel test studies. A different approach to the problem of the memory factor through the use of an alternate or parallel form of the inkblots has often been advocated. Rorschach, himself, indicated what he felt was the necessity of such a series, saying, "If the test is to be repeated with the same plates, conscious or unconscious memory may warp the results. Analogous series of plates... are necessary for these situations," (43). Of the number of alternate forms which have been devised, the Behn series has recently received the most attention.

Swift (48) carried out one of the first studies directed at measuring the degree of correspondence between the two test series. Using the Rorschach as the initial test, she retested preschool children with the Behn set after a seven day interval. Correlations of the scoring categories from both tests yielded some high coefficients, but a number of low ones led Swift to conclude that the Behn was not sufficiently comparable to the Rorschach.

Eichler (12) reports a more recent study involving the Behn series. He used three groups, retested after approximately three weeks, the first group receiving the Behn followed by the Rorschach, the second receiving the tests in the reverse order, while the third served as a comparison group with the Rorschach followed by the Rorschach. His findings were similar to Swift's in that while reliability coefficients from the Rorschach retest as compared with the Behn retest were satisfactory in some respects, consistent differences in other respects indicated the two tests were not sufficiently alike for use on the individual level.
Singer (46) used a more "global" approach to the Rorschach-Behn comparison. He gave the protocols from ten subjects who had been given the Behn and the Rorschach to six judges with instructions to pair them. Although the matchings were better than chance, Singer concluded that the Behn failed to meet standards of reliability demanded for individual prediction.

The consensus of these findings is that the Behn is not an entirely satisfactory alternate form, especially for use in the individual case. If one examines the alternate form from a theoretical point of view, its disadvantages become apparent. While it is true that memory effects are eliminated as a matter of concern for reliability, the alternate form contributes another source of error, i.e. the extent to which it falls short of being equivalent to the standard test. This lack of equivalence is especially prominent in projective techniques where the unstructured material is so difficult to duplicate. The more accurately the projective test is duplicated the closer the alternate form comes to being identical with the original test. It then becomes no more than a retest and subject to the possible effects of repetition.

Test-retest reliability studies. The persistent concern with the memory factor in retesting has led to two studies which attempt to control for it in unique ways. In one study, Kelley, Margulies and Barrera (20) chose patients after electroshock who had amnesia for the initial test which immediately preceded the shock. Twelve of these patients who were free from confusion were retested two hours later. The authors describe the changes as minor with few shifts of more than one response in the variables. The general personality picture of each appeared unchanged, although no statistical verification was reported. The fact, however, that there were some changes may be due, as Rabin (39) points out, to cerebral changes concomitant with electroshock treatment.
Griffith (18) made use of patients with Korsakoff's syndrome as a means of ruling out memory factors, since gross memory defects are an integral part of this disease entity. He found four patients who appeared to have no recall for the test upon the retest 24 hours later. A comparison of each pair of test-retest protocols showed similar features which reliably characterized each individual. Full statistical treatment of the results was precluded by the small number of subjects.

Despite the ingenuity of such studies as these, the results cannot be wholly definitive on the problem of memory. The subjects used are seldom encountered in practical clinical experience. Also, the actual effect of treatment or disease conditions on test performance must be discovered through further research. To put it more simply, the more practical question is the extent to which memory actually affects the test when it is repeated, since it is not the usual case for a subject's memory to be blotted out in the intervening time.

In spite of the contention that recall would invalidate reliability studies using test-retest, such studies have been attempted. One of the earliest studies of this nature was carried out by Mira (as reported by Vernon) (51). He administered the retest to a group of subjects two weeks after the initial test. Some consistency of responses was observed in some subjects. Mira considered the degree of consistency as an index of the stability of the individual. Since he did not present any statistical evidence of his results, the study is more of historical interest than of significance for this research.

Several studies involving children as subjects have been reported. Kerr (27) repeated the test with fifty elementary school children who had been first tested the previous year. She compared the first and second tests in terms of correlation coefficients for several scoring categories, which ranged from .001 to .74.
Color determinants fell in the lower ranges. Kerr explained this finding on the basis of the affect represented by color responses. It should not be surprising that color varies so greatly since the emotional state of the subject is similarly inconstant. On the other hand, the number of whole responses, which yielded the highest correlation coefficient, was said to be expected, since Rorschach indicated that it was highly correlated with intelligence. It should be noted here that Kerr's use of correlation coefficients of the scoring categories as the sole statistical criterion of reliability without considering the configurations of the Rorschach factors is only a partial approach to the determination of reliability.

A similar study, but with a preschool population of 55 subjects, was reported by Ford (13). While admitting the incomplete nature of a statistical approach of the type employed by Kerr, she could see no way out but to express the results in Pearson product-moment coefficients. The reliabilities ranged from .38 to .86 with each coefficient indicating a significant relationship between test and retest determinants. She stated that although these findings are not high, they can be considered as fairly satisfactory especially since the final synthesis depends on the balance and interrelationship of all the determinants.

Troup (50) provides an interesting and well designed study from a methodological standpoint, illustrating a means of sidestepping the limitations of the piecemeal correlational procedures. In this study, six judges were asked to match two Rorschach psychograms taken six months apart for each individual of ten pairs of twins of grammar school age. Comparing ten pairs of Rorschach psychograms at a time, three judges achieved 100% matchings, one, 90%, and two, 80%. Using a formula developed by Vernon (51), this yielded a contingency coefficient of .94.
A chi square test of significance indicated the chance expectation of this figure was less than .001. Troup's conclusion was that the "degree of reliability based on the consideration of the total personality picture appears significantly greater than estimates based upon...correlation coefficients of the separate categories."

The previously mentioned study by Swift (48) is also pertinent here and will be described in greater detail since it is particularly relevant to this investigation. At varying time intervals Swift tested preschool children under four conditions: 1) A Rorschach test-retest with a median interval of 30 days using 41 children. This interval was chosen in the belief that the memory factor would be minimized while developmental factors would remain constant. 2) A Rorschach test-retest with a median interval of 14 days. The Behn series was interpolated on the seventh day. The subjects numbered 49 of whom 19 participated in condition 1, given the previous year. 3) The Behn series of condition 2 with the 7-day interval. 4) A Rorschach test-retest with a ten month interval. This latter group included 20 subjects, all of whom had been used in conditions 1 or 2. The use of the same subjects in the various conditions was justified by Swift on the basis that they apparently did not recognize the test from the previous year.

The reliabilities of the various conditions involving Rorschach test-retest were reported in terms of product-moment correlations for those scoring categories which presented a continuous distribution of scores. For the 14 day interval the reliabilities ranged from .59 to .84 (corrected for attenuation) with nine of twelve categories over .70. The two month interval yielded a range of .15 to .87 while the ten month interval produced a range of .08 to .86 with all but one over .50. It would appear that the highest reliabilities are attained when the intervening period is brief.
However, it would be hazardous to generalize from data collected under the conditions reported. It will be recalled that some subjects were reused in the various testing situations. It is even likely that some of the subjects tested in condition 4 had already been tested in all of the other conditions, making a total of six separate test administrations for them. Furthermore, the statistical analysis by the standard correlational methods is subject to the same criticism applied to those studies mentioned above, i.e. ignoring the patterns of scores.

Another feature of Swift's study was an attempt to determine the extent of repetition in the retest. An analysis of the responses was made to determine the average percentage of identical responses in the initial and succeeding tests. For the 14 day interval Swift found the average percentage of responses given in the retest, which were identical with those of the original test, to be 57% with a range of 0% to 90%. The corresponding percentage for the 30 day interval was 51% with a range of 8% to 100%. Here it would seem that there is little variation in repetition from one time interval to the other. One should not assume, however, that an identical response is produced in retest because of recall. Although Swift cautions against considering these percentages as an index of memory factors, it has been so represented by Ainsworth (1).

A final study which might be added to this section is reported by Holzberg and Wexler (22). Working with 20 chronic schizophrenic subjects, they sought to determine the reliabilities of the Rorschach when used on a population clinically defined as "unpredictable". The underlying assumption was that the retest might be significantly changed without a corresponding change in the clinical picture because of the "unpredictability" of the schizophrenic. A three week period intervened between the two testings.
Statistical procedures used were correlations of scoring categories and tests of differences between means. On the whole the results revealed significant correlations between means. Since some qualitative variation was observed in a few pairs of protocols, the question was raised whether the statistical analysis masked important differences which might influence clinical judgement in identifying personality structure from test to retest. To answer this question tabulations of the data were submitted to two trained Rorschach workers. Matchings of pairs of tests were significant at less than the 1% level of chance occurrence. The authors concluded that "unpredictability" was not apparent in test-retest performance with a schizophrenic population which has become stabilized in terms of chronicity.

The studies listed above represent the bulk of the literature dealing exclusively with the retest criterion of reliability. A summarizing review of these studies reveals numerous shortcomings. The limitation of subject populations to children and schizophrenics does not provide information on the most representative individual of the population at large, the normal adult. Some data related to normal adults are available in the experimental studies, which are to be reviewed next. Choice of time intervals in the preceding studies has been largely arbitrary except in Swift's study (48) where the methodology was faulty. The statistical treatment of data in the earlier studies was a perseveration of the method used in split-half studies, dealing with the separate scoring categories instead of the interrelated patterns of scores which constitute the major interpretive unit. The question of recall and practice effects has received little attention, again with the exception of Swift. However, the design of her study did not involve assessing the net effect of memory in the reproduction of responses.

Experimental test-retest studies.
A number of experimental studies, using the retest technique, have been designed to take account of extraneous factors on Rorschach performance or, as one writer (1) put it, to "test the limits" of reliability. The usual method in these studies is to alter the standard test conditions in order to determine the possible effects on the stability of the test.

The historical forerunners of these studies were carried out by Fosberg (14, 15) in an attempt to check the resistance of the Rorschach to "faking". In a preliminary investigation (14) he used two subjects and in a later one, 50 subjects. The design of both studies involved the administration of the test four times to the same subjects: first, with standard instructions; second, with instructions to make the best possible impression; third, with instructions to make the worst possible impression; fourth, with standard instructions again. The retest intervals, with a range of from 0 to 700 days, were not held constant. In the first study Fosberg used a chi square to show that the psychograms for each person corresponded. In the second, a correlational technique was used. The results of both studies led him to conclude that the Rorschach could not be manipulated with the deliberate intention of presenting oneself in a favorable or unfavorable light. He also stated that the time interval apparently had no effect on the reliability coefficients in the second study. However, Cronbach (10) indicates that the statistical procedures used throughout both studies were "entirely unsound" and hence all the conclusions are open to question.

Carp and Shavzin (8) attempted to verify Fosberg's findings by further testing the susceptibility of the Rorschach to falsification. They tested 20 subjects twice, three weeks apart, with instructions to make a good impression on one test and a bad impression on the other.
A test comparison of each category in the two tests showed no group differences, although there were differences on the individual level. When the two tests of each subject were compared as units by chi square tests, it was found that for four subjects the probabilities were less than .10 that the two distributions came from the same population. They considered the results as refuting Fosberg's statement that the Rorschach resists all attempts at manipulation by the subject, since their evidence shows that some subjects can alter their personality picture as reflected by the Rorschach.

Another study employing altered instructions in the experimental conditions was reported by Hutt, Gibby, Milton and Pottharst (23). They investigated the extent of modification in specific scores on the Rorschach with specific instructions to alter these scores. Four groups of college subjects originally tested under standard instructions were retested two weeks later, each group under one of the following instructional conditions: 1) to pay particular attention to segmented areas of the blot; 2) to find as many human movement responses as possible; 3) to give only good form, but to combine color and form, and in addition human movement responses; 4) to report everything they saw (standard instructions). This fourth group served as a comparison group. The results of the experimental groups showed a general shift in the direction indicated by the instructions. The conclusion of the authors was that the variables in question were unstable as a result of the test conditions for their non-psychiatric population. As for the control group, which is of interest as a study of test-retest reliability, the findings showed great variability in correlation coefficients of some scoring categories. Surprisingly enough, the correlation coefficients for the control group were lower than those of the experimental group.
The authors suggest that this instability, both in the control and experimental population, may be a result of the lack of rigidity in the normal individual. One may well wonder, however, to what extent the college population is representative of the general population.

The above studies are all related in that they attempt to show the influence of "set", suggesting that determining factors in perception imposed by the examiner through the instructions are equivalent to the inner determining factors which the subject brings to a standard testing situation. This, of course, may not be the case. Evidence by Norman et al. (36) and Rabin et al. (40) indicates that subjects exposed to more indirect set-inducing experiences do not generally show the effects thereof. However, the implication of the studies using altered instructions is that the instructions must be standardized in order to produce comparable data. Differences in instructions may be responsible for the variance contributed to test scores by examiner differences as noted by Baughman (3). Certainly this must be a factor to be considered not only in diagnostic testing but also in any study employing the Rorschach test.

Other studies using the test-retest procedure have been designed to examine the influence of situational factors on the test. Kimble (28) studied the effect of a social setting by administering the test twice, once under standard conditions and another time in a college cafeteria with at least two other people present. Fourteen college students served as subjects. The time elapsing between tests was between one and two weeks. The social situation reportedly elicited significantly greater color responses. Kimble concluded that the friendly, intimate atmosphere evoked in the social setting was responsible for the increase in color, representing a response to pleasant stimuli.
Another study of this nature was made by Lord (33) who sought to assess the relative effects of retesting, of experimental alteration of the emotional climate in the test situation and of the examiner's personality. She used 36 college students tested under each of three conditions: 1) a standard test situation; 2) following other tests designed to make the subject feel rejected and a failure; and 3) following conditions designed to make him feel accepted and successful. Each examination was administered by a different one of three examiners. The results indicated that retesting effects were the least important. The test atmosphere produced more responsive (R), more imaginative (M), and less stereotyped (A%) results in the approving situation while tendencies in the opposite direction were observed in the disapproving situation. The most prominent effects were those of the three examiners themselves.

These studies also may be evaluated in terms of another possible source of unreliability, differences in subject-examiner relationships. They also indicate the advisability for the individual examiner in diagnostic work to apprise himself of his personal effect on a patient's performance.

The final pair of experimental studies using the retest technique are those which alter the test procedure or the stimulus itself. Rabin and Sanderson (41) investigated the effects of reversing the order of presentation of the Rorschach cards, thus testing the importance of the "temporal gestalt" in producing shock and its associated phenomena (delayed reaction time, decreased productivity, etc.). Two groups of fl: student nurses, 1? in each group, served as subjects. The design was counterbalanced so that in one group the normal order of presentation was followed in the initial test and the reverse order some two months later in the retest, while the converse was true of the other group. The results indicated a few changes as a consequence of reversing the order.
Some cards appeared to elicit fewer responses and to prolong response time, regardless of the order of presentation. The authors concluded that shock may be a function of the "greater difficulty of some cards and lesser potentiality of others to evoke responses". A further observation was the high stability of the experience balance from test to retest, an interesting finding in view of the susceptibility of this ratio to change under the artificially contrived test conditions of social setting (28) and emotional atmosphere (33).

A study by Allen, Manne and Stiff (2), primarily concerned with the influence of color on retest reliability, is of methodological importance for this study. Two groups of college students totaling 25 were tested with a standard series of cards in one group and with a set from which the color was removed in the other. They were retested six weeks later with the standard and achromatic sets alternated between the two groups. The responses on each card of both tests were compared for "consistency", defined as "the reappearance of a response in the retest protocol". The mean percent of consistency for the colored cards in the standard set was 30.4% while the percentage for the same cards in the achromatic set was 27%. The same statistic applied to the non-color cards of both sets, standard and achromatic, was 34.6% and 30.6% respectively. Since neither of the differences was statistically significant, the conclusion was that the presence or absence of color seems to have no influence on the degree of consistency. The question that might be asked here is to what extent the percentage of consistency is influenced by memory.

A general review of these experimental studies is in order here. The majority of these studies report shifts in various Rorschach factors under the impact of artificial manipulations of test conditions.
They serve to emphasize the importance of maintaining rigorously standardized conditions of administration in order to minimize errors of measurement. At the same time it should be kept in mind that the conditions with which the studies deal are not those encountered in the practical experience of the clinical psychologist. As contrasted with the test-retest reliability studies which largely dealt with children, the experimental studies generally have depended on student populations for subjects. Hence, in neither case are the results representative of the wider population. As in the reliability studies, the time intervening between the two test administrations has not received attention.

Clinical test-retest studies. The preceding studies may be described as test-oriented in that there is an underlying assumption of personality (function) stability and it is the test (measurement) itself that is studied. The studies which follow can be considered subject-oriented, since it is the conditions of the subjects themselves that are varied, the implicit assumption being that the test is stable. Because of this premise of test reliability, these studies rarely include comparison groups. Actually such investigations are more related to tests of validity than reliability. Their significance for this particular investigation is that they have a bearing on the important question in the theory of reliability of the stability of the function tested, in this case the personality.

There is a very large group of clinical studies using test-retest, too numerous to describe in detail. Only representative studies or those most relevant to the present investigation will be cited more than briefly.

The historical precedent for these studies is the one by Wertham and Bleuler (53). They administered the Rorschach as a method of investigating differences in the reactions of individuals under normal conditions and when under the influence of the drug, mescaline.
They found comparatively close agreement in the two sets of responses. The slight differences occurring did not materially affect the interpretation as a whole. However, statistical verification of the results was not demonstrated.

Many other studies have been reported on the use of Rorschach retests as a means of evaluating physical treatment through drugs or electroshock treatment: (insulin) Piotrowski (37), Halpern (19), Kisker (29), Bradway (6), Graham (1?), Beck (4); (sodium amytal) Kelley and Levine (25); (Metrazol) Kisker (29); (electroshock) Bradway (6), Kelley, Margulies and Barrera (26). Both Halpern (19) and Kisker (29) report that their schizophrenic populations after treatment seem to retain psychotic patterns in their retests along with evidence of improvement. Beck (4) states that after treatment the "main outlines of the Rorschach pattern are always recognizable belonging to the same individuals" but there are also "changes in important features". He emphasizes that such changes are at the periphery and not the core of the personality.

Some studies have used the Rorschach as a measure of psychotherapeutic changes: Brosin and Fromm (7), Krout, Krout and Dubin (30), Rioch (42). Brosin and Fromm (7) report that some Rorschach factors appear to remain stable over a course of psychotherapy. Rioch's study (42) included 36 patients undergoing analytically oriented intense psychotherapy. A comparison of the "before and after" protocols indicated some changes representing improvement, but on the whole, pairs of tests were more alike than different.

Another group of Rorschach retest studies that lies somewhat in the border zone between clinical and experimental studies are those that involve hypnosis as a method of artificially altering the emotional state of the subject: Sarbin (44), Levine, Grassi and Gerson (32), Lane (31), Counts and Mensh (9).
These studies are reported here because they are of the subject-oriented type according to the dichotomy indicated previously. The general pattern of these studies has been to retest a small number of subjects (one in several cases) under a variety of suggestions designed to vary emotional states. The most frequent impression reported here is of a stable core of personality running through all the records of an individual along with changes consistent with the hypnotic suggestions. Although the findings are not to be taken as conclusive because of the small number of subjects, they are congruent with findings in the other clinical studies, leading to a conclusion of changes occurring in response to treatment within a matrix of a stable pattern of personality.

As for the significance of the clinical studies for this research, they appear to justify an assumption of relative function stability. How relative is this function stability? The term "personality structure" would seem to imply by definition a continuous, consistent core of organization. Yet at the same time personality theory takes account of the changing nature of this organization, in terms of its dynamic aspects. Over the course of a lifetime an individual's personality would reflect the influences of major life experiences. If the Rorschach validly taps the personality structure, one should expect retesting after long periods of time to show both peripheral fluctuations along with a stable nucleus of personality. Such is the "relative" function stability found in the literature cited above. Conversely it would be expected that stability would be maximum in retests over short periods of time, where there is no reason to anticipate major personality changes. Hence, a comparison of Rorschach test and retest results over varying short time intervals should indicate consistency regardless of the time involved. This consistency should be accountable for in terms other than pure memory.
Studies of retention beginning with Ebbinghaus in 1885 have indicated that memory varies as a function of time. The typical retention curve has as its main characteristic a rapid decline immediately after learning and a gradual leveling off as time advances. In terms of the Rorschach this suggests that recall for the first test should decline as the intervening time between tests is prolonged. Consequently, in Rorschach testing, retention over short periods of time should be less pronounced than consistency.

II. HYPOTHESES

On the basis of the foregoing discussion the following hypotheses were formulated:

I. Performance on the Rorschach test remains constant over varied short time intervals between test and retest.

   a. Exact response reproductions (identical responses common to both the initial test and the retest) will not vary significantly as a function of the length of time between tests.

   b. Total response reproductions (both identical responses and responses with changes in locations upon retest) will not vary significantly as a function of the length of time between tests.

   c. New responses (responses appearing for the first time in the retest) will not vary significantly as a function of the length of time between tests.

   d. Successful matchings of test and retest from the same individuals will be independent of the length of time between tests.

II. The reflection of memory for the initial test in the retest will vary as a function of the length of time between tests.

   a. The correct recall of exact response reproductions will decrease as a function of the length of time between tests.

   b. The correct recall of all response reproductions (both identical responses and responses with changes in location upon retest) will decrease as a function of the length of time between tests.

   c. The correct identification of new responses will decrease as a function of the length of time between tests.
   d. The correct recall of all response reproductions and the correct identification of original responses will decrease as a function of the length of time between tests.

III. METHODOLOGY

A. Subjects

Sixty subjects were used in this study. They were selected from male patients at the VA General Medical and Surgical Hospital in Dearborn, Michigan. The subjects showed no evidence of a neuropsychiatric condition and none had taken the Rorschach previously. Absence of neuropsychiatric disorders was established on the basis of hospital records including reasons for admission and ward behavior, as observed by nurses and physicians. Thirty of the patients were admitted to the hospital for treatment of pulmonary tuberculosis while the other thirty patients were admitted for surgical, orthopedic or general medical treatment.

The sixty patients were drawn from a total of 83, who were originally tested for the purposes of this research. The 23 patients who were not included in the population of this study were eliminated for the following reasons: 1) discharge from the hospital before the stated interval between tests had elapsed - 19 patients; 2) psychiatric referral by the ward physician or the observation of a psychiatric condition after the initial test - three patients; 3) a pronounced lack of motivation upon retest, manifested by excessive rejection of cards - one patient.

The subjects were distributed into three groups of twenty each, differentiated on the basis of the time intervening between the initial test and the retest. The three groups were retested after the following approximate time intervals: Group I, four hours; Group II, two weeks; Group III, two months. Half of each group (ten subjects) was under treatment for pulmonary tuberculosis, while the other half was being treated for non-tubercular conditions.
Tubercular patients were included in the sample because the relatively short length of hospitalization of non-tubercular patients precluded retesting sufficient subjects for Group III in a reasonable period of time. Hence each group was devised so as to be comprised of an equal number of tubercular and non-tubercular patients as a balancing measure.

The three groups were equated for intelligence, as measured by the vocabulary scale of the Wechsler-Bellevue Form I, and for age. Table I is a comparison of the three groups on intelligence and age. In addition to the 83 subjects initially tested with the Rorschach, approximately twenty more were administered the vocabulary scale and excluded from the sample because their intelligence level was not comparable to that of Group III.

TABLE I
DESCRIPTION OF GROUPS

                                      Mean      Range          S. D.
Age
  Group I                             29.8      21 - 40        5.65
  Group II                            30.3      21 - 48        7.03
  Group III                           29.3      22 - 39        8.75
Wechsler-Bellevue Vocabulary (1)
  Group I                             28.65     15 - 35        5.84
  Group II                            23.6      17 - 34        4.96
  Group III                           28.2      18 - 33        6.52
Time elapsed between tests (2)
  Group I (hours)                     3:59      3:10 - 5       33.13 (3)
  Group II (days)                     15.3      12 - 18        1.77
  Group III (days)                    65.1      55 - 83        5.84

(1) Raw score.
(2) Time determined from completion of inquiry of initial test to beginning of retest.
(3) Minutes.

B. Procedure

Since there is evidence that inter-examiner differences contribute much of the variance to Rorschach results (3), (33), all subjects were individually tested and retested by the author. In order to avoid a research "set" the tests were presented as part of the "routine examinations" administered by the psychology personnel to all hospital patients. Consequently it may be assumed that the test administration elicited responses typical of the normal clinical testing situation. Personal observation and evidence derived from a structured verbal report following the retest tend to confirm this.
Before the administration of the initial Rorschach test the vocabulary scale of the Wechsler-Bellevue intelligence test was administered to each patient. As previously mentioned, approximately a score of patients were eliminated on this basis for the following reasons: 1) too great a deviation from the mean vocabulary score of Group III, the first group tested; 2) a bilingual background which rendered the vocabulary score unreliable as a measure of intelligence.

Following the administration of the vocabulary scale the standard Rorschach test was administered. Based on Beck's prescription (5) the instructions for the first test were as follows:

"You will be given a series of ten cards one at a time. On the cards are designs made up out of ink blots. Look at each card and tell me what you see on each card or anything that might be represented there. Look at each card as long as you like; only be sure to tell me everything you see on the card as you look at it. When you are finished with a card, give it to me as a sign that you are through with it."

The Rorschach test was readministered after the stated interval depending on the group to which the patient was assigned. The instructions for the retest closely followed the earlier instructions but were designed to take account of the obvious fact that the subject had already been exposed to the test. Another sentence was introduced before the last sentence of the above instructions and given as follows:

"Now you might see the same things as before and perhaps something different, but as before tell me everything you see on each card as you look at it."

Upon completion of the retest, the responses of the retest were read back one by one to the subject and he was asked if he had given these responses in the first test and if they were given in the same location.
The instructions were as follows:

"I am going to give you each card again and repeat the things you said you saw this time. I will also show you where you saw them. Please tell me whether you mentioned seeing them in the same place the first time you took this test. You may answer 'Yes', 'No' or 'Not sure'."

After completion of this recall procedure, the verbal report mentioned above was obtained in the form of responses to a questionnaire (see Appendix A). It was anticipated that the retest nature of the project might be disclosed by retested patients to other patients in their wards, who were awaiting retests. For this reason the purpose of this investigation was explained to each patient following the administration of the questionnaire and his strict confidence concerning the project was requested. A check on the extent to which this confidence was kept was provided by question #1 of the verbal report. The response to this question indicated that more than 95% of the patients did not anticipate being retested when called for the second examination. Of those (two patients) who reported some expectation of being retested, the evidence indicates that they were not so informed by other patients but rather surmised this without full certainty. Hence it may be assumed that there was no deliberate recall practice of the first test prior to the retest.

C. Methods and Techniques

Response-comparison technique. The principal method used for obtaining data to test the first major hypothesis was a matching technique, by means of which the pairs of Rorschach protocols were compared for common responses. The use of this technique has been cited above in the review of previous studies (2), (48).
The advantages of such an approach are: 1) it minimizes the extent of examiner influence since matchings are made solely on the basis of the free association; 2) it provides a basis for a study of the differential effect of memory; 3) it offers a more holistic approach to reliability than correlational techniques based on separate scoring categories; 4) it is possibly more relevant in view of a current trend toward increasing use of content from the free association in interpretation.

The following are definitions of the various categories developed for this study to classify the comparison of responses from a pair of tests obtained from the same person:

Response reproduction (r) is defined as a response on the retest which is elicited from the identical area of the same card, and which has the same specific (unelaborated) content as a response in the first test.

Response reproduction, location changed (rl) is defined as a response in the retest which has the same specific content, and which is elicited from the same card as a response in the initial test but with a change in the location of the area on the card that initially elicited the response.

A new response (nr) is defined as a response in the retest which did not appear on the same card in the initial test.

The chief criticism that might be directed at this comparison technique involves the question of reliability. The major source of unreliability in classifying responses in the retest according to the above outline appeared to be in the category, response reproduction, location changed. The problem here was whether a minor change in the location of a response reproduction automatically made it inconsistent with the response in the initial test and hence classifiable as a response reproduction, location changed. Experience indicated that minor differences in the location were generally due to an incomplete definition of the location
of the response by the subject on one of the tests. Furthermore, the interpretive nature of two identical responses with minor differences in location would be exactly the same. Accordingly a set of criteria was listed as an objective guide for discounting minor differences of location. Thus, a response in the retest was considered a response reproduction despite slight differences of location if the following conditions were met: 1) form quality remaining constant, e.g. F+ still F+; 2) location scoring remaining constant, e.g. D still D; 3) emphasis on primary content category remaining constant, e.g. A still A; 4) scoring of a "popular" remaining constant even if location scoring shifted, e.g. P still P, despite change from D scoring to W scoring or the reverse.

A test of the reliability of the response-comparison technique was performed by submitting a sample of ten pairs of records chosen at random with the above definitions and special criteria to another psychologist for matching and classification. The percent agreement with the original scoring was 90. This was considered a satisfactory level of scoring reliability.

Recall technique. The recall technique was designed to obtain data to test the second major hypothesis. As described in the previous section (1), this method involved reading back to the subject his retest responses and asking him if he had given these responses in the first test. One of three possible answers, "yes", "no" or "not sure", was secured for each response. This scoring combined with the three classes derived from the response comparison provided three subclasses of each category of the response matching.

(1) See page 34 for description of procedure and instructions.

The following are definitions of each classification employed in
the scoring of the recall technique (see Table II for a summary of these classifications):

Correctly recalled reproduction (cr) is defined as a reply of "yes" by a subject to a response reproduction (r), when he was asked by the examiner, repeating the responses of the retest, if he mentioned seeing it and in the same place in the initial test.

Questioned reproduction (qr) is defined as a reply of "not sure" to a response reproduction.

Incorrectly recalled reproduction (ir) is defined as a reply of "no" to a response reproduction.

Correctly recalled reproduction, location changed (crl) is defined as a reply of "no" to a response reproduction, location changed (rl).

Questioned reproduction, location changed (qrl) is defined as a reply of "not sure" to a response reproduction, location changed.

Incorrectly recalled reproduction, location changed (irl) is defined as a reply of "yes" to a response reproduction, location changed.

Correctly identified new response (cnr) is defined as a reply of "no" to a new response (nr).

Questioned new response (qnr) is defined as a reply of "not sure" to a new response.

Incorrectly identified new response (inr) is defined as a reply of "yes" to a new response.

Two additional classifications were found necessary to categorize "rejections" of cards. These are defined as follows:

Correctly recalled rejection (crj) is defined as a reply of "yes" by a subject when asked if he failed to see anything the first time on a card which was rejected both in the initial test and in the retest and as a reply of "no" if the card was not rejected in the initial test.

Incorrectly recalled rejection (irj) is defined as a reply of "no" by a subject to the same question as above when the card was rejected in the initial test and as a reply of "yes" when the card was not rejected in the initial test.

TABLE II
DEFINITIONS OF RECALL CLASSIFICATIONS

Code   Classification                                        Subject's reply   Reply should be
cr     Correctly recalled reproduction                       Yes               Yes
qr     Questioned reproduction                               ?                 Yes
ir     Incorrectly recalled reproduction                     No                Yes
crl    Correctly recalled reproduction, location changed     No                No
qrl    Questioned reproduction, location changed             ?                 No
irl    Incorrectly recalled reproduction, location changed   Yes               No
cnr    Correctly identified new response                     No                No
qnr    Questioned new response                               ?                 No
inr    Incorrectly recalled new response                     Yes               No
crj    Correctly identified rejection                        Yes               Yes
                                                             No                No
irj    Incorrectly identified rejection                      Yes               No
                                                             No                Yes

The scoring of classifications in the recall technique, based as it was on the three possible replies of a subject, was entirely mechanical and should not raise any question of reliability.

Matching by judges. A supplementary method used to obtain data to test the first major hypothesis (specifically Hypothesis I (d)) was a matching technique, in which judges were requested to blind-match two Rorschach records for each person. The use of this method has also been cited above in the review of previous studies (22), (50). The blind-matching procedure has been a favorite method for evaluating Rorschach data because it permits study of each record as a molar unit. Cronbach states that, "We can now obtain adequate evidence on the stability of Rorschach patterns only by such a method...". The liabilities of this method lie largely in the "human limitations of judges". As Cronbach points out, mismatching may occur because of minor false elements. "Matching, on the other hand, might be excellent, even perfect; the study would still not guarantee that each element...was correct, especially if the subjects were quite different from each other" (10).
The purpose of using this method in the present research was to take advantage of the more holistic approach available through judges' matchings and, more importantly, to demonstrate that the consistency provided by the response-comparison technique reflects significant aspects of the personality. It was felt that the deficiencies of this method as indicated by Cronbach might be resolved by careful design of the judge matching method. It was assumed that mismatching because of "minor false elements" would be distributed by chance and hence should be equally distributed among the three groups tested, if as Hypothesis I (d) states, "Successful matchings...will be independent of the length of time between tests". In other words the emphasis here was on differences among three coefficients of contingency derived from judges' successes in matching pairs in the three groups rather than on one specific coefficient for the entire sample. On the other hand, care was taken to avoid having matchings made solely because of obvious differences among the subjects. This was done by submitting records of several patients in one set with the productivity (total number of responses) approximately equal for each patient in the set. Since productivity is one of the more obvious differences among subjects, e.g. it would be relatively simple to differentiate one subject with a response total of twenty from another subject with a response total of sixty, it was felt that this means of control at least partially satisfied Cronbach's criticism.

The procedure followed in the judge-matching method was to submit the sixty pairs of records in twelve sets of five pairs each. Each set included at least one pair of records from each group. Each record contained only the responses, listed sequentially with their location specified according to Beck (5) and the position of the card specified for each response. The scoring of the responses was not included.
Only the specific unelaborated content of the response was listed, thus eliminating any cues based on idiosyncratic verbalizations. Furthermore, as indicated above, each set of five pairs was within a specified range of productivity. Four judges, each a staff member of the clinical psychology section of the Dearborn VA Hospital and with at least five years of Rorschach experience, performed the matchings. At the conclusion of the procedure each judge submitted a verbal report of his impressions of factors involved in the matchings.

Questionnaire. A supplementary method used to obtain further information related to the effects of practice or memory on the retest was a list of ten questions (see Appendix A). This questionnaire constituted a structured verbal report of each subject's own perception of the effect of his recall on his responses in the retest. The questions were not designed to provide quantitative data, but rather to give qualitative information for additional insight into the nature of the recall findings supplied by the recall technique.

D. Treatment of Data

In order to test the hypotheses of this investigation a series of measures was devised based on tabulations of the matching and recall data. Each measure will be discussed as it pertains to the particular hypothesis it was constructed to test. (See Table III for summary of measures).

The reproduction measure was designed to test Hypothesis I (a) and is a mean percentage derived by dividing each subject's total number of response reproductions by his total number of responses in the initial test; r/R(I). It represents the proportion of the total responses in a subject's initial test that he actually reproduced in the identical area in the retest. It may be considered a measure of consistency and is therefore hypothesized as being independent of the time variable.
The total reproduction measure was designed to test Hypothesis I (b) and is a mean percentage derived by dividing each subject's total number of reproductions, regardless of location, by his total number of responses in the initial test; (r + rl)/R(I). It represents the proportion of the total possible responses in a subject's initial test that he actually reproduces in the retest but not necessarily in the identical location. It also may be considered a measure of consistency and hence unvarying over the time intervals specified.

The new response measure was designed to test Hypothesis I (c) and is a mean percentage derived by dividing each subject's total number of new responses in the retest by his total number of responses in the retest; nr/R(II). It represents the proportion of the total number of responses in a subject's retest that were not similar in any way to those of the initial test. It may be considered a measure of inconsistency, but one hypothesized to remain constant over the varying short time intervals.

The reproduction-recall measure was designed to test Hypothesis II (a) and is a mean percentage derived by dividing each subject's correctly recalled reproductions by his total number of response reproductions; cr/r. It represents the proportion of the responses of the initial test which were reproduced in the identical area of the retest and were correctly recalled by the subject as having been given in the initial test. It may be considered a measure of recall and was therefore hypothesized to decrease as a function of the time between tests.

The total reproduction-recall measure was designed to test Hypothesis II (b) and is a mean percentage derived by dividing each subject's correctly recalled reproductions plus correctly recalled reproductions, location changed by his total number of response reproductions plus response reproductions, location changed; (cr + crl)/(r + rl).
It represents the proportion of the responses of the initial test which were reproduced, regardless of location, in the retest and were correctly recalled by the subject as having been given in the initial test. It is also a measure of recall and hence expected to decrease as a function of the length of time between tests.

The new response identification measure was designed to test Hypothesis II (c) and is a mean score derived by the sum of each subject's correctly identified new responses plus a constant of ten minus the sum of his questioned new responses plus incorrectly identified new responses; cnr + 10 - (qnr + inr). This measure varied from the mean percentages used for the measures described thus far because it was found that some subjects did not correctly identify any of their new responses. Hence the score was devised to take account of the varying totals of new responses which would not otherwise be indicated by zero percentage scores for these subjects. This score also measures recall and is similarly hypothesized to decrease as a function of the length of time between tests.

The recall measure was designed to test Hypothesis II (d) and is a mean percentage derived by dividing each subject's sum of correctly recalled reproductions plus correctly recalled reproductions, location changed plus correctly identified new responses by his total number of responses in the retest; (cr + crl + cnr)/R(II). It represents the proportion of the total responses in a subject's retest that he actually correctly recalled as having been given in the initial test, regardless of location, or correctly identified as new responses appearing only in the retest. It is a more general measure of recall, being based on all of the responses of the retest, and is also predicted to decrease as a function of the length of time between tests.

Hypothesis I (d) was tested by the use of Vernon's formula¹ (52) which yielded three coefficients of contingency, one for each group.
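As an illustration of the seven measures defined in this section (not of Vernon's formula), the per-subject scores can be sketched in Python. This is a minimal sketch under stated assumptions: the class name `SubjectCounts`, the function name `measures`, the field names, the use of 100 as the percentage multiplier, and the choice to return `None` when a denominator is zero are all illustrative and do not come from the thesis. The example counts are invented.

```python
from dataclasses import dataclass


@dataclass
class SubjectCounts:
    """Tallies for one subject (field names are hypothetical)."""
    R1: int   # R(I): total responses in the initial test
    R2: int   # R(II): total responses in the retest
    r: int    # reproductions in the identical location
    rl: int   # reproductions, location changed
    nr: int   # new responses in the retest
    cr: int   # correctly recalled reproductions
    crl: int  # correctly recalled reproductions, location changed
    cnr: int  # correctly identified new responses
    qnr: int  # questioned new responses
    inr: int  # incorrectly identified new responses


def pct(num, den):
    """Percentage score; None when the denominator is zero (an assumption)."""
    return 100.0 * num / den if den else None


def measures(s):
    """Return the seven scores for one subject, keyed by measure name."""
    return {
        "reproduction": pct(s.r, s.R1),                              # I (a)
        "total_reproduction": pct(s.r + s.rl, s.R1),                 # I (b)
        "new_response": pct(s.nr, s.R2),                             # I (c)
        "reproduction_recall": pct(s.cr, s.r),                       # II (a)
        "total_reproduction_recall": pct(s.cr + s.crl, s.r + s.rl),  # II (b)
        "new_response_identification": s.cnr + 10 - (s.qnr + s.inr), # II (c)
        "recall": pct(s.cr + s.crl + s.cnr, s.R2),                   # II (d)
    }


# Invented counts for a single hypothetical subject.
example = SubjectCounts(R1=20, R2=24, r=12, rl=3, nr=9,
                        cr=9, crl=2, cnr=5, qnr=2, inr=2)
m = measures(example)
for name, value in m.items():
    print(name, value)
```

Group means of these per-subject scores would then be the quantities compared across the three retest intervals.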
The three coefficients were tested for significance of differences from one another. Confirmation of the null hypothesis with respect to these differences would be considered as another indication of the stability of the Rorschach over the short time intervals specified.

¹ See Appendix C for Vernon's formula.

TABLE III
SUMMARY OF MEASURES

Hypothesis  Measure                      Score                    Definition
I (a)       Reproduction                 r/R(I)                   Response reproductions divided by total responses in initial test
I (b)       Total reproduction           (r + rl)/R(I)            All reproductions, regardless of location, divided by total responses in initial test
I (c)       New response                 nr/R(II)                 New responses divided by total responses in retest
II (a)      Reproduction-recall          cr/r                     Correctly recalled reproductions divided by all reproductions
II (b)      Total reproduction-recall    (cr + crl)/(r + rl)      Correctly recalled reproductions, regardless of location, divided by all reproductions, regardless of location
II (c)      New response identification  cnr + 10 - (qnr + inr)   Correctly identified new responses plus ten minus the sum of questioned new responses plus incorrectly identified new responses
II (d)      Recall                       (cr + crl + cnr)/R(II)   Correctly recalled reproductions, regardless of location, plus correctly identified new responses, divided by total responses in retest

IV. RESULTS

The statistical treatment of the seven measures derived from the response-comparison and recall data involved a comparison of group means for significant differences. The statistical technique employed for this comparison was the t-test. Since six out of the seven measures were based on percentage scores, a transformation was necessary to render the means and variances independent of one another (cf. Snedecor, p.##6).
The suitable transformation in this case is the inverse sine or angular transformation.¹

¹ Although conclusions reached by analysis of original data are usually the same as those reached with transformed data, the test is made more sensitive in terms of the probability attached to the t-score obtained with transformed data (cf. Edwards, p.166).

Although the t-test applied to psychological data has come to be viewed critically by many psychologists in recent years, its use was felt to be appropriate for this study. The most general criticism of this technique is directed at the assumption of normality required for its use, an assumption which is questionable when applied to much of psychological data. Edwards (11) points out, however, that the two-tail t-test is relatively little influenced by departures from normality.

For convenience, the results pertaining to Hypotheses I (a) and II (a), I (b) and II (b), and I (c) and II (c) will be presented together since the measures on which the tests of these hypotheses were based are related.

The confirmation of Hypothesis I (a) requires that there be no significant differences between the mean percentages of response reproductions of any pair of the three groups. In other words, the percentage of response reproductions should be constant with the passage of time. Table IV provides a comparison of groups I, II and III on differences in the reproduction measure.

TABLE IV
COMPARISON OF GROUPS ON DIFFERENCES IN THE REPRODUCTION MEASURE

Groups      t       P
I - II      1.916   <.10
I - III     2.458   <.02
II - III     .730   <.50

The findings indicated that the mean percentage (transformed) of response reproductions of group I tended to be different from that of group II (P<.1) and was significantly different from that of group III (P<.02). The means of groups II and III were not found significantly different (P<.5).
These findings may be interpreted as indicating that the percentage of response reproductions tends to decrease after the short time interval of four hours but apparently becomes stabilized after two weeks for the period of time covered by the design of this research. The results thus partially confirm Hypothesis I (a).

The confirmation of Hypothesis II (a) requires that there be significant differences between the mean percentages of correctly recalled reproductions of any pair of the three groups. It further requires that the differences be positive when the mean of the group with the longer time interval between test and retest is subtracted from the mean of the group with the shorter time interval, e.g. the mean of group I minus the mean of group II should bear a positive sign. In other words, the percentage of response reproductions which are correctly recalled should decrease with the passage of time. Table V provides a comparison of groups I, II and III on differences in the reproduction-recall measure.

TABLE V
COMPARISON OF GROUPS ON DIFFERENCES IN THE REPRODUCTION-RECALL MEASURE

Groups      t       P
I - II      3.137   <.01
I - III     5.821   <.001
II - III    23:75   <.02

The findings indicated that the mean percentage (transformed) of correctly recalled reproductions of group I was significantly different from that of group II (P<.01) and from that of group III (P<.001). The means of groups II and III were also significantly different (P<.02). The differences were all in the predicted direction. It seems to be a safe assumption that recall of response reproductions decreases as a function of time. The results thus confirm Hypothesis II (a).

When the findings of the two measures reported above are compared, one may observe that while response reproductions tended to remain constant from two weeks up to two months, the recall of those response reproductions continued to decrease.
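The group comparisons reported in Tables IV and V rest on two steps: an inverse sine (angular) transformation of each subject's percentage score, followed by a t-test between pairs of groups. The sketch below illustrates those two steps in Python. It assumes a pooled-variance Student's t for independent groups and an angle expressed in degrees; the thesis does not state which variant of the t-test or which angular unit was used, so both are assumptions, as are the function names `angular` and `pooled_t`. The scores are invented and are not data from the study.

```python
import math
from statistics import mean, variance


def angular(pct):
    """Inverse sine (angular) transformation of a percentage, in degrees."""
    return math.degrees(math.asin(math.sqrt(pct / 100.0)))


def pooled_t(a, b):
    """Student's t for two independent groups, using the pooled variance.

    Returns (t, degrees of freedom); t is positive when mean(a) > mean(b).
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical percentage scores for two small groups (not data from the study).
group_short = [80.0, 75.0, 90.0, 70.0]   # shorter test-retest interval
group_long = [60.0, 65.0, 55.0, 72.0]    # longer test-retest interval

t, df = pooled_t([angular(p) for p in group_short],
                 [angular(p) for p in group_long])
print(f"t = {t:.3f}, df = {df}")
```

Placing the shorter-interval group first makes a positive t correspond to the predicted direction for the recall measures. With twenty subjects in each group, as in the present design, this form of the test would have 38 degrees of freedom.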
The confirmation of Hypothesis I (b) requires that there be no significant differences between the mean percentages of all response reproductions, regardless of location, of any pair of the three groups. In other words, the percentage of all repeated responses should not vary with the passage of time. Table VI provides a comparison of groups I, II and III on differences in the total reproduction measure.

TABLE VI
COMPARISON OF GROUPS ON DIFFERENCES IN THE TOTAL REPRODUCTION MEASURE

Groups      t       P
I - II      1.518   <.20
I - III     1.594   <.20
II - III    -.076   >.90

The findings indicated that the mean percentage (transformed) of response reproductions plus response reproductions, location changed of group I was not significantly different from that of group II (P<.2) or group III (P<.2), nor were the means of groups II and III significantly different (P>.9). Despite the lack of statistical significance between the means of groups I and II, the differences were in the direction of decreasing mean percentages as time increases, the same direction as in the case of the response reproductions reported above. (There is, of course, some overlap between the two measures since both involve response reproductions). Thus, these results are also suggestive of a decreasing tendency after the four hour test-retest interval and show more stability from the two week retest interval up to two months. However, on a statistical basis the results may be considered as confirmatory of Hypothesis I (b).

The confirmation of Hypothesis II (b) [...]

[...] II - III 2.7“3 <.01 [...]

[...] the mean percentage (transformed) of correctly recalled reproductions plus correctly recalled reproductions, location changed of group I was significantly different from that of group II (P<.01) and from that of group III (P<.001). The means of groups II and III were also significantly different [...]

[...] This may be compared to 13% or eight of the subjects who reported that recall elicited the same responses on retest. Thus the subjects' own perception of their recall seems to support the claim that consistency of response is due to more than retention.

C. Some Conditions Related to Inconsistency

One of the purposes of this study was to evaluate the changes in the responses which are a function of the passage of time. Table XI presents findings related to this. It reveals what seems to be a surprisingly high number of new responses (mean percentage of 25) on retest after four hours. After two months this percentage is only 29.5. Here also the verbal reports of the subjects are helpful in determining the conditions related to inconsistency.

Table XVI gives information related to inconsistency represented by the deliberate attempt to produce new responses. Categories A and B combined reveal that 15 of group I subjects as compared to 14 of group II and eight of group III subjects reported making an effort to give new responses out of curiosity or in order to be more thorough.

Table XVIII, which indicates subjects' points of view on the effects of recall, reveals on the whole, a perception [...] situation, where practice and [...] Category B [...] subjects in group I as compared to six in group II and two in group III reported that having taken the initial test made it easier to concentrate and look for more percepts in the retest. The subjects also stressed the effects of recall in the retest and the reduction in the ambiguity of the test situation, both
of which tend to produce consistency. Thus, the altered situation of the retest gives rise to both inconsistency and consistency with the initial test.

Tables XV and XVII reveal that for a variety of reasons subjects deliberately suppress responses. Whatever the reason, the deliberate suppression of responses contributes to inconsistency. The preponderance of subjects who reported doing this appear in group I.

In conclusion, many of the conditions which seem to account for inconsistency appear to be most prominent in the four hour retest group. It is suggested that this may explain the relatively high degree of inconsistency after four hours where one would expect much less variability.

D. Judges' Verbal Reports

Despite the fact that the results of the judge matching procedure precluded any statistical treatment, the verbal reports of the judges are useful in shedding additional light on the nature of these findings. Although all of the judges reported a subjective impression of general reliability in the response patterns, they also indicated that there were a number of records that could be matched only by elimination. In other words, the reliability of certain response patterns appeared reduced from those found in the majority of the records.

It is suggested that these especially inconsistent records might be due to an extreme combination of the suppression of responses as mentioned previously, varying degrees of motivation, misinterpretations of instructions, degree of threat by the task or of the many other reasons listed in Tables XIII, XVI and XVIII. Such idiosyncratic reactions to testing emphasize the importance of attempting to gain insight into what Schachtel (45) calls the "subjective definition" given by the subject to the testing situation. He points out that every subject defines the test situation in terms of his own needs, wishes and fears.
He suggests that one might deal with the "subjective definition" either by attempting to minimize the influences which give rise to it or by evaluating it as an additional source of insight into the subject's personality and attitudes.

E. A Theoretical Interpretation

It should be mentioned that the findings of the present study do not necessarily support the contention of Piotrowski that consistency is due to the stability of the personality in the time intervening between tests. These findings simply suggest that recall is not the sole basis for consistency. Further studies of validity would be necessary to substantiate his claim.

On the other hand, an alternate approach to the results of this study might be pointed out. It may be said that in the Rorschach as in any test which is purported to tap personality, the findings of both consistency and inconsistency are anticipated by personality theory. The organization of personality is not static and hence undergoes shifts and changes of emphases even within its relatively stable framework. Hence, even fluctuations within the day, as represented by changes in mood, attitudes, etc., are to be found along with the continuous consistent core of organization.

F. Methodology

The methodology employed in this investigation might warrant some attention. The choice of subjects for a sample raises the question of the larger population represented by the sample to which the findings can be generalized. The choice here was a practical one and may be said to be representative of an extensive hospitalized veteran population which is the focus of the professional activity of a considerable number of clinical psychologists. It is doubtful whether a clinical instrument such as the Rorschach can ever be adequately tested as a clinical instrument in a sample designed to be representative of the general population. The difficulty lies in presenting the test as it normally would be in a clinical situation. As indicated above, the "subjective definition", which is of much importance, would be quite different if the test were presented in a "research" as opposed to a clinical setting. In the present research the test was presented to each subject in a clinical setting which was in no way different from that of a routine referral for diagnostic testing.

G. Implications for Further Research

A number of implications for further research are suggested by this study. The most obvious would be to extend the range of the temporal dimension so as to include much longer test-retest intervals. It would be valuable to determine where the retention curve in terms of the recall measures reaches a plateau and also the point in time where the consistency measure decreases.

Another suggestion for future investigation would be the application of the various measures employed in this study to the determinants. One might determine, for example, the extent to which recall is represented in the consistency of the use of color, form or combinations thereof. [...] the personality correlates of change on retest as indicated by the new response measure. These changes may be related to the "stability score" devised by Gibby (...) by means of which he was able to determine those individuals in experimental retest groups who were able to effect most change under experimentally altered conditions. The changes indicated in this study by the new response measure were of a spontaneous nature, stemming from subjects' own interpretation of the retest situation. Research interest would also be attached to the type of responses that persist, change or drop out and the personality correlates thereof.

VI. SUMMARY

A review of the literature on Rorschach test-retest studies indicated that research has never been focused on temporal reliability, i.e. the consistency of retest findings over varying periods of time. Furthermore, the test-retest approach to reliability has been considered questionable because of practice or memory effects. Rorschach writers have disagreed on the importance of recall effects. Some argue that retesting is meaningless because of memory whereas others insist that repeated responses are representative of stable personality trends rather than of recall.

The purpose of this study was to investigate the consistency of Rorschach results where subjects were not exposed to any treatment other than the systematic varying of time intervening between tests. Specifically, it was to determine the differential effects of the passage of time on the persistence, change and recall of Rorschach responses on retest. Two hypotheses were set forth: 1) performance on the Rorschach in terms of repeated responses on retest remains constant over varied short time intervals; 2) recall of those responses decreases as a function of the length of time between tests.

Sixty patients screened to assure exclusion of those with neuropsychiatric conditions were selected as subjects from a VA general medical and surgical hospital. The subjects were distributed into three groups of twenty, equated for age and intelligence. Each group was retested with the Rorschach after the following approximate time intervals: group I, four hours; group II, two weeks; and group III, two months. Following the retest of each subject, his responses were individually read back to him and he was asked if these responses had been given in the initial test.
Following the recall procedure a questionnaire, designed to obtain a verbal report of the effect of recall on the retest, was administered to each subject.

Two techniques were used to obtain data to test the hypotheses. The response-comparison technique was a matching procedure, by means of which each pair of Rorschach protocols was compared for common or consistent responses. The recall technique was a scoring method, by means of which accuracy in the identification of retest responses as new or repeated responses was determined. The two techniques yielded seven measures which were tested for significance of differences among the three groups by the t-test.

The results on the whole confirmed both hypotheses. It was found that the measures of consistency devised to test the first hypothesis did not yield significant differences among the three groups, regardless of the length of time elapsed before retest. The measures of recall devised to test the second hypothesis decreased as a function of the length of time between tests. It is concluded on the basis of the results that retest consistency is not to be solely accounted for in terms of recall.

It was also noted that measures of consistency tended to decrease in the same direction as the measures of recall for the time intervals from four hours to two weeks. The findings in terms of the percentages of consistent and new responses on retest were presented and discussed. It was pointed out that the percentages of response consistency were above those reported in the literature. A number of conditions were suggested as related to the enhanced test reliability indicated by the present findings. Also, some conditions related to inconsistency as suggested by the verbal reports were discussed.

The results were discussed from the point of view of their relevance to Rorschach reliability and to personality theory.
The choice of subjects used in this study also received attention. Finally, implications for further research, stemming from the present study, were reviewed.

APPENDIX A

QUESTIONNAIRE

1. Did you expect to take this test again?
2. Why do you think this test was given again?
3. Did you make an effort to see the same things this time that you saw the first time that you took the test? Why?
4. Did you avoid mentioning anything this time that you saw the first time? Why?
5. Did you make an effort to see anything new this time? Why?
6. Did you see and mention anything this time that you saw the first time that you took the test but did not mention at that time? What and why?
7. Did you mention anything the first time that you did not mention this time? Why?
8. Did you mention anything this time that you did not mention the first time? What?
9. Did you see more things the first time or this time?
10. (a) What do you feel are the effects of taking the first test on this test? (b) What do you think the effects of taking the first test would be on the second test if it were given ten years later instead of at this time?

APPENDIX B

TABLE XII
COMPARISON OF GROUPS ON REPLIES TO QUESTIONS 1, 3, 4, 5, 6, 7, 8, 9

Each cell lists groups I, II, III in that order.

No.  Question                                                          Yes                          No                    Don't know
1    Expect to take test again?                                        1, 0, 1                      19, 20, 19            0, 0, 0
3    Make effort to see same things?                                   [illegible], [illegible], 9  14, 12, 10            0, 0, 1
4    Avoid mentioning anything seen first time?                        8, 4, 2                      11, 16, 18            1, 0, 0
5    Make effort to see anything new?                                  16, 20, 15                   2, 0, 2               0, 0, 0
6    Mention anything this time, seen but not mentioned first time?    4, 3, 1                      16, 15, 18            0, 2, 1
7    Mention anything first time, not mentioned this time?             14, [illegible], 6           4, 10, [illegible]    2, [illegible], 9
8    Mention anything this time, not mentioned first time?             [illegible]                  [illegible]           [illegible]
9    See more things the second time?                                  18, 14, 7                    [illegible]           [illegible]*

*Subjects replied that the number of responses was the same in both tests.
TABLE XIIl COMPARISON LF GJOUPS Uh QELSflON #2 (Why test was given again?)h :—‘ ...-.-- - —= ‘— Categories of Reasons I Totgls III ‘ t A. Experimental 12 1 3 i 1 f 1 see if imagination depends on memory ; l 1 3 determine effect of medicine 1 , 3 see if there is the same reaction to the1 - ; test twice ‘ l 5 see if one is strucn by same initial i . ‘ impression . ; l ; determine different frames of mind ; : l g determine effects of hospitalization ' l _ I lie Clinical i5 g 4 Q__ check sanity f 1 check for lying { I thinking ability and concentration ( 1 . check mental functioning 1; 3 l :indication of instability f l l 1:est observance at first sight, paying attention 1 l checkup i l Wmorx 18 '3 lO 8 cietermine if one sees same things 7 g l H check memory ll i ‘7 3 learning ability H 3: l N —-~»-_~--._.. -----'.-'- ~*—%~‘-1—:"“‘* ..-... i Wages 16 , ‘ In 10 l Ccmnparison 1 § 8 # fietermine changes, differences, added things; changes of opinion 15 i 2 | e h l .. /I .. TABLE XIII (continued) . * Categories of Reasons I _£0II18 III I E1. Part of test procedure - L1___,QL,~“_ broke the ice (with the first test) 1 more conclusive l l t more certain of your information 1 ' obtain pattern (with both tests) 1 check accuracy of first test 1 see how good my first impression is l confirm first findings routine I part of original test clarification of first test _—-— ‘ l HHHH £1, Other or no ouinion :3; _5 r 8 no opinion , 2 H b might have fouled up last time , l something went wrong I l l test my sincerity i *Some subjects gave more than one reason which accounts for sums of totals edualing more than total n. of twenty for each group. TABLE XIV coMPAuIson or s: ups on QUESTICN #3 (Make of ort to see same things?) *—--.~ *~~~--‘-"' ---" --.»--4— am Categories of Reasons I Totals . i III II. 
No effort because of stimulus prOperties of cards #9 e _g7 "can't helm seeing same things" ”saw them whether I thought about it or not" "they were there, no effort, would see same things with- out memory" "just what came to me, tried to find something else, but it still looked the same, because it's there" "just saw same things without thinking about it, but it came to my and that I'd seen it last time" Ii. No effort because of memory - , __ '2 3_g1 "wanted to see how much I remembered, could have made it look like something else" "no snecial effort, but hard not to say them when you remember them" "same things came back easily, one would remind me of another and I'd go looking for it and see other things in the-process too" "had in my mind what I saw first time and naturally saw it right away again the second time" %W- W‘- W... (3° Merely carried out instructions to -._____ report everythinggseen ‘1#<_fig3 _5 "just said whatever came to my mind" "just tried to see ev rything I could see, made no snecial effort" "did‘recaii but did not let it lead me to see what I did this time" "I didn't purposefully attempt to, may have unconsciously" -70.. TABLE AlV (continued) r.._—....~—.—-——. —‘-*-- _. -- —-—~~.— - .- -..--m‘ --'v ---..~-o “u --- --.— ..n..- ..-— .....n....—--..--.—_ ”-.-v-‘ ”.r-7 I... 0-- --‘~.~ --.-—--..—m .- mun—v.- ...—.e'm-n“ Categories of Reasons I Totals III w--_' —-—~o~.-~ ...“ ~“~--v“ -..~.-- I),*mrjggggan.efiort to see as me t_hings_*_w, Igg_J_2 2 "thought it was a checkup and I we nted to be as near right as dossible" "realized it was he same test; should see same things, maybe more TCflliSi ical'v" "wanted to know if ‘ could and let you know; 1 shed for same things age in" "- ---- “v. —- - ‘- ..~'~ w , - — ...-.-....- r—v--.-..-. 
E. Looked for something different, and Other             [counts illegible]
     "made no attempt; wondered if I could see something different"
     "tried to look for more things; things don't look the same every
      time you look at them"
     "tried to see more for self satisfaction"
     "you asked me to see same things"

TABLE XV

[Title illegible] ... OF QUESTIONNAIRE

Group I
     "didn't think the second time that it was what I had the first time"
     "[illegible] of bat; didn't take the [illegible] the second time"
     "[illegible] that was a 'man' ... things looked different ... them
      but they didn't look [illegible]; I didn't mention them"

Group II
     "skeptical of one answer, too vague"
     "some didn't make sense this time"
     "some minor points of elaboration such as sexual organs; you didn't
      ask about them"

Group III
     "mentioned 'woman without a head' [first] time, seemed too silly"

TABLE XVI

COMPARISON OF GROUPS ON QUESTION #[illegible] OF QUESTIONNAIRE
(Yes - attempted to give new responses)

Categories of Reasons                                         Totals
                                                           I    II   III

A. Might have overlooked something; more thorough        [counts illegible]
     "wanted to see if I overlooked something"
     "felt I wasn't seeing enough, tried hard to concentrate"
     "didn't want to miss any hidden object"
     "I thought about last time, didn't see things I should have"

B. Curiosity; looked for something different or more     [counts illegible]
     "wanted to find something different, find what inkblots represented"
     "felt someone else would see something different"
     "mostly curiosity, to see if there was anything else"
     "thought it would be interesting to see something else"
     "tried to see if I was more alert, tested to see how much could be
      gotten out of [illegible]"

C. Complied with instructions to see everything          [counts illegible]
     "wanted to make sure I saw everything"
     "tried to see everything I could, as you said"
     "made an effort to see exactly what was on there"
     "tried to see all I could, didn't feel too good last time"

D. Instructed to see more                                  2    1    2
     "thought you asked me to see if there was anything I didn't see the
      first time"
     "see if missed anything last time, you said to do it"
     "felt that's what I was supposed to do"

TABLE XVI (continued)

Categories of Reasons                                         Totals
                                                           I    II   III

E. Didn't try or unable to see more                      [counts illegible]
     "seemed to be same things"
     "couldn't imagine anything new"
     "made an effort, but don't think it was successful"

F. Other reasons                                           0    1    3
     "yes, because test was given to see if I changed my opinions"
     "wanted to add to my intelligence or imagination, it might have some
      bearing on my intelligence"
     "wanted to increase artistry"
     "thought I should change my mind after two months"

TABLE XVII

REPLIES OF "YES" TO QUESTION #6 OF QUESTIONNAIRE
(Gave responses on retest that were seen but not mentioned in initial test)

Reasons for "yes" answers                                     Total

Group I                                                  [total illegible]
     "figured I had seen enough"
     "thought (a response) was silly"
     "thought (a response) was of no importance"
     "didn't see (a response) until after I gave back the card"

Group II                                                 [total illegible]
     "(response) wasn't as apparent the first time"

Group III                                                     0
     none

TABLE XVIII

COMPARISON OF GROUPS ON QUESTION #10(a) OF QUESTIONNAIRE
(Effects of first test on retest)

Categories of Replies                                         Totals
                                                           I    II   III

A. Familiarity; knew what to expect; could respond
   faster                                                  8    7    5
     "didn't take as long to grasp what was on the cards"
     "more familiar, more at ease, better idea of what to do"
     "more nervous first time, knew more about it this time"
     "nervous and afraid to say some things the first time, because it
      might be wrong or not really there"

B. Ease of concentration; could look for more things       6    2    1
     "helped a lot, concentrated on what I'd missed"
     "don't have to look too hard second time"
     "saw previous things first, went into a little more detail"
     "easier, knew some things already there, so looked for new ones"
     "paid more attention to outline and details"

C. Tended to see same things due to recall effects       [counts illegible]
     "what you see before stays in your mind"
     "made me look for things I saw last time rather than look for new
      ones"
     "automatically see again some things, fresh in my mind from last
      time"
     "popped into your mind that you'd seen it earlier"

TABLE XVIII (continued)

Categories of Replies                                         Totals
                                                           I    II   III

D. No effect; recall effects negligible             [illegible]  8    9
     "looked at cards as if for first time"
     "would have seen same things even if I had not taken test
      previously"
     "things popped out and I'm not sure whether I remembered them or saw
      them again"
     "very little difference, saw about same things automatically"
     "memory actually secondary since things looked the same, I seemed to
      be drawn to same things in same order"

E. Other                                                   1    2    1
     "didn't turn cards as much"
     "easier first time because subject was fresh"
     "forgotten all about it, put me to thinking about it"
     "didn't have to look for as many things"

TABLE XIX

COMPARISON OF GROUPS ON QUESTION #10(b) OF QUESTIONNAIRE
(Possible effects of first test on retest 10 years later)

Categories of Replies                                         Totals
                                                           I    II   III

A. Differences because of new experiences                 11    8    6
     "a lot would happen in ten years, change outlook; would imagine
      different things depending on intervening experiences"
     "probably see different things - things always look different in
      time, because of nature probably would take different forms"
     "could be a lot of things that would remind you of something
      different, you would associate them with things that will happen
      in the years from now"
     "big difference; I'd be older, seen more things, the things I saw
      were related to things I've seen or experienced; in ten years I'd
      see more and different things"
     "probably see different things; naturally when one sees an
      undescriptive picture he describes things from what he's familiar
      with; in ten years I'll be older and in a different situation or
      environment"

B. Differences because of no recall                      [counts illegible]
     "probably not remember, it would be like taking the test for the
      first time"
     "great deal of difference, doubt if I'd remember cards at all"
     "entirely different, like two different tests, would have forgotten
      what I saw ten years before, would be like new material"

TABLE XIX (continued)

Categories of Replies                                         Totals
                                                           I    II   III

C. Same things because of identical stimuli                2    3    3
     "resemble same things from the shapes, which were all I had to go
      by, they would be the same in ten years"
     "don't think it would make any difference, cards wouldn't change;
      I'd see same things in ten years if in same mental condition"
     "no changes unless cards fade, depends on health and vision"
     "I'll see same things because the shapes are outstanding and would
      convey the same meaning"

D. Same things because of recall                         [counts illegible]
     "not too different, I'll still remember it, I can remember well"
     "no differences, ten years is not too long, can certainly remember
      in ten years"
     "pretty close to the same things; it's always been that way for me,
      I get an impression and don't easily forget it"
     "pretty close to same results; once you have learned something, it
      settles in your mind"

E. Same things regardless of recall                      [counts illegible]
     "might still see same things, but might not remember taking test
      before"
     "be no retention at all; things you see are connected with your type
      of mind and mental health, if that didn't change, the things you'd
      see wouldn't change either"
     "might see same things, but wouldn't go looking for them"
     "no differences; probably forget first test and would be like taking
      it first time; differences depend on frame of mind"

TABLE XIX (continued)

Categories of Replies                                         Totals
                                                           I    II   III

F. Other replies                                         [counts illegible]
     "have same idea as in first test, trying to figure what is there
      instead of what actually is there; instead of looking, I'd try to
      recall first test and look for that"
     "no idea"
     "would see something different because the objects are not shaped
      true to form"

APPENDIX C

VERNON'S FORMULAE FOR EVALUATING [SUCCESS OF] MATCHINGS (52)

Coefficient of Contingency
     C = sqrt{ (St - 1)^2 / [ (St - 1)^2 + (t - 1) ] }

     [remainder of formula block illegible]

     where  t = number of elements to be matched
            n = total number of judgments or matches
            S = proportion of the judgments that are correct

BIBLIOGRAPHY

 1. Ainsworth, M. D., "Problems of Validation" in Klopfer, B., Ainsworth, M. D., Klopfer, W. G., and Holt, R. R., Developments in the Rorschach Technique, [publication details illegible]
 2. Allen, R. M., Manne, S. H., and Stiff, M., The influence [illegible] Rorschach [remainder illegible]
 3. Baughman, E. E., Rorschach scores as a function of examiner differences, J. Proj. Tech., 1951, 15, 243-249
 4. Beck, S. J., [Stabi]lity of the personality [structure], Psych. Bull., 1942, 39, 512 (abstract)
 5. ______, Rorschach's Test, Vol. I, Basic Processes, N. Y., Grune and Stratton, [year illegible]
 6. Bradway, K., Rorschach records of a schizophrenic patient before, during and after electric shock and insulin treatment, J. Proj. Tech., 1951, 15, 87-97
 7. Brosin, H. W., and Fromm, E. O., Some principles of Gestalt psychology in the Rorschach experiment, Ror. Res. Exch., 1942, 6, 1-15
 8. Carp, A. L., and Shavzin, A. R., The susceptibility to falsification of the Rorschach psychodiagnostic technique, J. Consult. Psychol., 1950, 14, 230-233
 9. Counts, R. M., and Mensh, I. N., Personality characteristics in hypnotically induced hostility, J. Clin. Psychol., 1950, 6, 323-330
10. Cronbach, L. J., Statistical methods applied to Rorschach scores, a review, Psych. Bull., 1949, 46, 393-429
11. Edwards, A. L., Experimental Design in Psychological Research, Rinehart & Co., N. Y., 1950
12. Eichler, R. M., A comparison of the Rorschach and Behn-Rorschach inkblot tests, J. Consult. Psychol., 1951, 15, 183-189
13. Ford, M., The Application of the Rorschach Test to Young Children, Univ. of Minn. Press, Minneapolis, 1946
14. Fosberg, I. A., Rorschach reactions under varied instructions, Ror. Res. Exch., 1938, 3, 12-30
15. ______, An experimental study of the reliability of the Rorschach psychodiagnostic technique, Ror. Res. Exch., 1941, 5, 72-84
16. Gibby, Robert G., The stability of certain Rorschach variables under conditions of experimentally induced sets: I. The intellectual variables, J. Proj. Tech., [year, volume and pages illegible]
17. Graham, Virginia T., Psychological studies of hypoglycemia therapy, J. Psychol., 1940, 10, 327-3[?]5
18. Griffith, Richard M., Test-retest similarity of the Rorschachs of patients without retention, Korsakoff, J. Proj. Tech., 1951, 15, 516-525
19. Halpern, F., Rorschach interpretation of the personality structure in schizophrenics who benefit from insulin therapy, Psychiat. Quart., 1940, 14, 826-833
20. Hertz, M., Reliability of the Rorschach inkblot test, J. Applied Psychol., 1934, 18, 461-477
21. ______, Current problems in Rorschach theory and technique, J. Proj. Tech., 1951, 15, 307-338
22. Holzberg, J. D., and Wexler, M., The predictability of schizophrenic performance on the Rorschach test, J. Consult. Psychol., 1950, 14, 395-399
23. Hutt, M. L., Gibby, R. G., Milton, E. O., and Pottharst, K., The effect of varied sets on Rorschach test performance, J. Proj. Tech., 1950, 14, 181-187
24. Jackson, R. W. B., and Ferguson, G. A., Studies on the Reliability of Tests, Univ. of Toronto Press, Toronto, 1951
25. Kelley, D. M., and Levine, K., Rorschach studies during sodium amytal narcoses (abstract), Ror. Res. Exch., [year, volume and pages illegible]
26. ______, Margulies, H., and Barrera, S. E., The stability of the Rorschach method as demonstrated in electric convulsive cases, Ror. Res. Exch., 1941, 5, [pages illegible]
27. Kerr, M., Temperamental differences in twins, Brit. J. Psychol., 1936, 27, 51-59
28. Kimble, G. A., Social influence on Rorschach records, J. Abn. Soc. Psychol., 1945, 40, 89-93
29. Kisker, G. W., A projective approach to personality patterns during insulin shock and metrazol convulsive therapy, J. Abn. Soc. Psychol., 1942, 37, 120-124
30. [Authors partly illegible], and Dubin, [initial illegible], Rorschach test-retest as a [guide] of progress in psychotherapy, [remainder illegible]
31. Lane, B. M., A validation test of the Rorschach movement interpretation, Amer. J. Orthopsychiat., 1948, 18, 292-296
32. Levine, K., Grassi, J. R., and Gerson, M. J., Hypnotically induced mood changes in the verbal and graphic Rorschach, Ror. Res. Exch., 1943, 7, 130-144
33. Lord, E., Experimentally induced variations in Rorschach performance, Psych. Monog., 1950, 64
34. Mira, E., Sobre el valor del psicodiagnostico de Rorschach, [journal and year illegible], 30, [pages illegible]
35. Mons, W., Principles and Practice of the Rorschach Personality Test, Lippincott, [year illegible]
36. Norman, R. D., Liverant, S., and Redlo, M., The influence of a superficial immediately preceding "set" upon responses to the Rorschach, J. Consult. Psychol., 1952, [volume and pages illegible]
37. Piotrowski, Z. A., Rorschach manifestations of improvement in insulin treated schizophrenics, Psychosom. Med., 1939, 1, 508-526
38. ______, [The reliabilit]y of Rorschach's Erlebnistypus, [journal illegible], 1937, 32, 439-445
39. Rabin, A. I., "Validating and Experimental Studies with the Rorschach Method" in Anderson, H. H., and Anderson, G. L., (eds.) An Introduction to Projective Techniques, N. Y., Prentice-Hall, 1951
40. ______, Nelson, W., and Clark, M., Rorschach content as a function of perceptual experience and sex of the examiner, J. Clin. Psychol., 1954, 10, 188-190
41. ______, and Sanderson, M. H., An experimental inquiry into some Rorschach procedures, J. Clin. Psychol., 1947, 3, 216-225
42. Rioch, M. J., The use of the Rorschach test in the assessment of change in patients under psychotherapy, Psychiatry, 1949, 12, 427-434
43. Rorschach, H., Psychodiagnostics (2nd ed.), N. Y., Grune and Stratton, 194[illegible]
44. Sarbin, T. R., Rorschach patterns under hypnosis, Amer. J. Orthopsychiat., 1939, 9, 315-319
45. Schachtel, E. G., Subjective definitions of the Rorschach test situation and their effect on test performance. Contributions to an understanding of Rorschach's test III, Psychiatry, 1945, 8, 419-448
46. Singer, J. L., The Behn-Rorschach inkblots: A preliminary comparison with the original Rorschach series, J. Proj. Tech., 1952, 16, 238-245
47. Snedecor, G. W., Statistical Methods, ed. 4, Iowa State College Press, Ames, Iowa, 1946
48. Swift, J. W., Reliabilities of Rorschach scoring categories with pre-school children, Child Develpm., 1944, 15, 207-216
49. Thornton, G. R., and Guilford, J. P., The reliability and meaning of Erlebnistypus scores in the Rorschach [remainder illegible]
50. Troup, E., A comparative study by means of the Rorschach method of personality development in twenty pairs of identical twins, Genet. Psychol. Monog., 1938, 20, 461-556
51. Vernon, P. E., Rorschach inkblot test, II, Brit. J. Med. Psychol., 1933, 13, 179-205
52. ______, The matching method applied to investigations of personality, Psych. Bull., 1936, 33, 149-177
53. Wertham, F., and Bleuler, M., Inconstancy of the formal structure of the personality; experimental study of the influence of mescaline on the Rorschach test, Arch. Neurol. Psychiat., 1932, 28, 52-70
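A numerical note on the coefficient of contingency named in Appendix C may help readers who wish to check matching results by hand. The sketch below is illustrative only. It assumes the usual reading of a matching experiment as a t-by-t table in which incorrect matchings are spread evenly over the off-diagonal cells, so that chi-square divided by the number of judgments equals (pt - 1)^2 / (t - 1). The function name and the argument names p_correct and t are hypothetical choices made for this example; they are not taken from the thesis, and the code is not offered as Vernon's own formulation.

```python
import math


def matching_contingency(p_correct: float, t: int) -> float:
    """Contingency coefficient C for a t-by-t matching table.

    Assumption (not from the thesis): incorrect matchings fall evenly
    across the off-diagonal cells, which gives
        chi2 / n = (p_correct * t - 1)**2 / (t - 1)
    and therefore C = sqrt(chi2 / (n + chi2)), independent of n.
    """
    if t < 2:
        raise ValueError("t must be at least 2")
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie between 0 and 1")
    # Chi-square per judgment under the even-spread assumption.
    chi2_per_judgment = (p_correct * t - 1.0) ** 2 / (t - 1)
    # C = sqrt(chi2 / (n + chi2)), with n divided out of both terms.
    return math.sqrt(chi2_per_judgment / (1.0 + chi2_per_judgment))


if __name__ == "__main__":
    # Hypothetical illustration: 4 elements to match, 70% matched correctly.
    print(round(matching_contingency(0.70, 4), 3))
```

Under these assumptions the coefficient is 0 when the proportion correct equals the chance level 1/t, and it reaches sqrt((t - 1)/t), the ceiling for a t-by-t table, when every matching is correct.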