CLARIFICATION OF THE ROLE AND CONFIGURATION OF LEARNING AS THEY ARE MANIFEST IN PERFORMANCE AT HALSTEAD'S CATEGORY TEST

By

Kenny William Bertram

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Psychology

1983

ABSTRACT

Review of the theoretical, clinical, and empirical histories of the Halstead Category Test (HCT) led the present investigator to argue that the test demands diverse and complex activities which recruit virtually all areas of the cerebral cortex. It was also reasoned that the HCT is best viewed as a test of learning. The predictions involved in the present study followed from this hypothesis. Subtests III through VI were treated analytically as independent records of item response behaviors, and within subtests, items were organized into groups, or sequences, on the basis of structural characteristics. Items were further grouped according to ordinal position within sequences, and by correct Option. Item incorrectness was construed as a dependent variable. The aforementioned factors, plus linear, quadratic, and cubic components for Trial and Sequence, their various interactions with one another, and with brain damage, defined by an Augmented Impairment Index, were entered as within-groups independent variables. It was predicted that when correct option had been partialed away, item correctness would increase across trials and sequences. The neuropsychological protocols of 159 referrals of diverse etiology at a Midwestern Veterans Administration Medical Center formed the sample for the study. A multiple linear regression strategy was applied, and the between-subjects effect for Augmented Impairment Index proved significant, as anticipated.
Within-subjects effects for Option, Sequence, and Trial also were significant, and though the results were more complex than anticipated, it was concluded that learning had been demonstrated. The occasional significance of the quadratic and cubic aspects of Trial and Sequence was traced to item characteristics and the incompletely balanced distribution of Option levels across levels of Trial and Sequence. Interactions were also sporadically significant, and these results were largely attributable to incomplete balance in the design, plus structural peculiarities among the item stimuli. It was concluded that while learning had obtained as predicted, it was a determinant of item behavior of modest importance. It was speculated that a more balanced analog of the HCT would permit a more definitive evaluation of the study's hypotheses.

ACKNOWLEDGEMENTS

I wish to express my great appreciation, my respect, and my liking to the sage and sane chairperson of my dissertation committee, Norman Abeles. Thanks, Norm, for much useful, helpful, and supportive counsel.

I wish to emphasize my gratitude to Neal Schmitt, a member of my dissertation committee, for having been the major of desperately few real sources of inspiration to me in recent years. When I wasn't busy feeling dull in your presence, Neal, I was learning more than I would have believed possible.

I remain grateful, as well, for the involvement of the other two committee members, John Hurley and Joseph Papsidero, who have affected me lastingly and positively. In particular, thanks for helping me learn to write, John. And Joe, thanks for your longstanding interest in helping me develop myself as a scientist and academician.

With great affection I also wish to acknowledge the guidance and friendship of J. Edwin Mason, my first mentor in Neuropsychology, and probably the person most responsible for my lasting interest in the field. Thanks Ed, and love.
Finally, I seek here to recognize the great influence which Ward Campbell Halstead has had upon my thinking in Neuropsychology, and upon the direction my career has taken, at this time. It seems to me that this brilliant and energetic man has received but a fraction of the acknowledgement he merits. Being a scientist of the first water, he devoted his years to goals other than material success and acclaim from his colleagues. As such, he avoided letting public attention affect the focus of his energies. It is a pity the same can not be so convincingly said of his theoretical progeny.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Section

Introduction
History and Development of the Halstead Category Test
Post-Halsteadian Development of the Category Test
Validity of the Halstead Category Test
Age, Education, and the Halstead Category Test
Intelligence and the Halstead Category Test
Additional Studies Concerning the Halstead Category Test
Formal, Psychometric, and Other Characteristics of the Halstead Category Test
Summary and Outline of the Focus of the Current Endeavor
Hypotheses
Method
    Subjects
    Examiners
    Procedures
    Design and Analysis
Results
    Option
    Sequence
    Trial
    Option by Sequence
    Option by Trial
    Trial by Sequence
    Damage
    Sequence by Damage
    Trial by Damage
    Trial by Sequence by Option
    Trial by Sequence by Damage
    Remaining Interactions
Discussion
Author's Notes
Appendix
References

LIST OF TABLES

Summary of Effects Considered
Between-Subjects Analytic Model
Within-Subjects Analytic Model for Polynomial Components
Summary of Sample Characteristics and Performance at Relevant Variables
Summary of Subject and Item Means, Variances, and Standard Deviations, by Subtest
Derivation of Between-Subjects and Within-Subjects Estimates of Variance
Between-Subjects Variance Analysis, or Regression of Brain Damage Upon Average HCT Item Response, Organized by Subtest
Subtest III: Basic Within-Subjects Regression Analysis
Subtest III: Polynomial Within-Subjects Regression Analysis
Subtest IV: Basic Within-Subjects Regression Analysis
Subtest IV: Polynomial Within-Subjects Regression Analysis
Subtest V: Basic Within-Subjects Regression Analysis
Subtest V: Polynomial Within-Subjects Regression Analysis
Subtest VI: Basic Within-Subjects Regression Analysis
16  Subtest VI: Polynomial Within-Subjects Regression Analysis
17  Item Means and Variances, Segregated by Option
18  Marginal Means for Sequence
19  Marginal Means for Trial
20  Subtest III: Cell and Marginal Means for Option x Sequence
21  Subtest IV: Cell and Marginal Means for Option x Sequence
22  Subtest V: Cell and Marginal Means for Option x Sequence
23  Subtest VI: Cell and Marginal Means for Option x Sequence
24  Subtest III: Cell and Marginal Means for Option x Trial
25  Subtest V: Cell and Marginal Means for Option x Trial
26  Subtest VI: Cell and Marginal Means for Option x Trial
27  Subtest III: Cell and Marginal Means for Trial x Sequence
28  Subtest V: Cell and Marginal Means for Trial x Sequence
29  Subtest V: Summary of Sequence X Damage
30  Subtest VI: Summary of Sequence X Damage
31  Subtest III: Summary of Trial X Damage
32  Subtest IV: Summary of Trial X Sequence X Option
33  Subtest V: Summary of Trial X Sequence X Damage

LIST OF FIGURES

1  Subtest III: Partialed Mean Errors as a Function of Sequence
2  Subtest IV: Partialed Mean Errors as a Function of Sequence
3  Subtest V: Partialed Mean Errors as a Function of Sequence
4  Subtest VI: Partialed Mean Errors as a Function of Sequence
5  Subtest III: Partialed Mean Errors as a Function of Trial
6  Subtest IV: Partialed Mean Errors as a Function of Trial
7  Subtest V: Partialed Mean Errors as a Function of Trial
8  Subtest VI: Partialed Mean Errors as a Function of Trial
9  Subtest III: Option-Segregated Curves as a Function of Sequence
10  Subtest IV: Option-Segregated Curves as a Function of Sequence
11  Subtest V: Option-Segregated Curves as a Function of Sequence
12  Subtest VI: Option-Segregated Curves as a Function of Sequence
13  Subtest III: Option-Segregated Curves as a Function of Trial
14  Subtest V: Option-Segregated Curves as a Function of Trial
15  Subtest VI: Option-Segregated Curves as a Function of Trial
16  Subtest III: Sequence-Segregated Curves as a Function of Trial
17  Subtest V: Sequence-Segregated Curves as a Function of Trial
Subtest I of the Halstead Category Test
Subtest II of the Halstead Category Test
Subtest III of the Halstead Category Test
Subtest IV of the Halstead Category Test
Subtest V of the Halstead Category Test
Subtest VI of the Halstead Category Test
Subtest VII of the Halstead Category Test
Original Version of the Halstead Category Test

Introduction

The scientific objective of this project was to investigate and document the operation of learning, as it is manifested in respondents' behavior at the Halstead Category Test. This purpose, however, was largely incidental.
The stimulus to undertaking this study was, that is to say, not the speculation that the Category Test was a measure of learning, but rather, the investigator's nagging awareness of his own ignorance of precisely, or even vaguely, what the Category Test was a measure of. Worse yet, even those mentors consulted, well-established and practicing neuropsychologists, could shed but little light on the matter. And so it eventually became imperative to consult the literature in hopes of developing an understanding of the test.

The introductory part of this project, consequently, involves a critical summary of much, though not all, of the research which has, for one reason or another, incorporated the Halstead Category Test. The literature itself quite obviously converged upon the theme of learning in Category Test-taking behavior.

The project can also be viewed as a clear step away from further evaluation of the discriminant validity of the test, a stride in the direction of task-analyzing the test. Or, it may as well be concluded, once and for all, that the Category Test is a superior indicator of brain damage, and it is high time to begin the process of discovering those perceptual and cognitive elements providing the basis for its capacity to so powerfully discriminate. Obviously, a single study can have but scratched the surface in this regard, but, in addition to a few other, similar efforts which have been made, perhaps an acceptable beginning now exists.

Finally, the project can be thought of as something of a newcomer's eulogy to Ward Campbell Halstead, the brilliant and meticulous scientist who almost singlehandedly developed the currently immensely popular battery which bears his name. It would seem, somewhat in contradiction to the uses to which his tests during the past 35 years have been put, that Halstead cared less for his Battery than for that construct they were designed to elucidate, "Biological Intelligence." Good for you, Dr.
Halstead, and may the current endeavor, in its own humble way, serve as a redirection of attention back upon Biological Intelligence, and as a tribute to you.

History and Development of the Halstead Category Test

In the latter half of the 1930's, Ward Campbell Halstead (1908-1969), having just completed his doctoral work in experimental psychology, had been rigorously pursuing clinical and empirical examination of brain-damaged humans. Though obviously a thinker of particular breadth, he was especially intrigued by what he called "grouping behavior" (Halstead, 1940), or more descriptively, the sorting of objects or stimuli into categories based on similarities and differences in one or more of their characteristics.

Halstead's work in this context was heavily influenced from two directions. The first of these was the thinking and research of the obscure scientist, Heinrich Kluver, originally a mentor and later a colleague to Halstead at the University of Chicago. Kluver's (1929, 1931, 1936) major impact upon Halstead's thinking concerned his "method of equivalent and non-equivalent stimuli," essentially an approach to the study of learning from the point of view of stimulus characteristics. Clearly a learning theorist, Kluver was interested in determining which aspects of a complex stimulus array were relevant insofar as eliciting a given response from an organism was concerned.1 He considered the stimuli "functionally equivalent" (Kluver, 1936) when, irrespective of their apparent differences, they elicited identical behaviors. Much of his research consisted of the extremely careful study of stimulus differences and similarities in search of precisely those aspects, in a given situation, which accounted for the facilitation or suppression of a conditioned response.
Though most of his work involved animals, and especially higher primates, Kluver (e.g., 1936) did from time to time discuss its applicability to the study of human learning, and even to such examples of abnormal human functioning as psychopathology and brain damage (i.e., to "breakdowns", as it were, in the usual operation of stimulus equivalence or non-equivalence).2 It is likely that Halstead found these speculations intriguing.

Following Kluver's (1929, 1931, 1936) lead, it is possible to view the operation, within the organism, of stimulus equivalence or non-equivalence, as the activity of abstract thought. Each function, that is to say, involves an ongoing process of solving a problem. Events or objects are in some cardinal sense or senses functionally equivalent or isomorphic, and in many other senses, functionally dissimilar, or distinct. The problem to be solved is that of detecting and cataloging similarities and differences, and of somehow sorting through the comparisons of attributes and selecting the salient similarity or similarities or differences. The solution process is ongoing because a new achievement of functional equivalence or non-equivalence must be sought at the addition of each bit of information.

Inferring a conceptual isomorphism between the notions of stimulus equivalence and abstract reasoning, as Halstead (1939) so obviously did, leads naturally to a consideration of the other important influence on Halstead's formulations: the German neurologist, Kurt Goldstein.

Like that of Kluver, and of very many others of the era, Goldstein's (1936, 1939, 1940, 1942, 1944) thinking was characteristic of the Mentalistic and Structuralistic traditions in that he was more than willing to speculate at length concerning the internal, intrapsychic machinations corresponding with empirically available, behavioral events.
During the hiatus between the first and second World Wars, Goldstein and his colleagues intensely studied the victims of brain injury sustained during the first of these Wars, ostensibly (Goldstein, 1940) with the hope of deriving an account of human behavior. The idea (and by no means a new one) was to develop an understanding of normal behavior by investigating the character of its disruption secondary to cerebral insult. Consistent with the Mentalistic and Structuralistic orientations, both the (inferred) internal experience, and the neurobehavioral characteristics, of overt behavior were stressed. Goldstein's work often turned in the direction of the highly inferential and the anecdotal, and he has been somewhat harshly criticized for this (e.g., Battersby, 1956; Reitan, 1958) in more recent times. Nonetheless, a great many of his more prominent theoretical tenets remain explicitly or implicitly popular today (Walsh, 1978, pp. 120 ff.).

Goldstein was particularly fascinated by what he saw as the impact of cortical damage upon the "abstract attitude" (Goldstein, 1940, 1942). This he defined as follows:

    The abstract attitude is the basis for the following conscious and volitional modes of behavior:
    1. To detach our ego from the outerworld or from inner experiences.
    2. To assume a mental set.
    3. To account for acts to oneself; to verbalize the account.
    4. To shift reflectively from one aspect of the situation to another.
    5. To hold in mind simultaneously various aspects.
    6. To grasp the essential of a given whole; to break up a given whole into parts, to isolate and to synthesize them.
    7. To abstract common properties reflectively; to form hierarchic concepts.
    8. To plan ahead, ideationally; to assume an attitude towards the "mere possible" and to think or perform symbolically.
    (Goldstein and Scheerer, 1941, p.
4)

Goldstein envisioned human thought as being capable of but two orientations or modalities: the abstract one, and the concrete one (Goldstein, 1940, 1942). The concrete attitude he defined as being the opposite of the abstract attitude, and hence being characterized by stimulus boundedness, rigidity, an absence of detached consideration or formulation, and so forth. Goldstein also argued that the two attitudes were functionally mutually exclusive, and, moreover, that the function of the abstract attitude was to oversee, and to deliberately and thoughtfully plan and control behavior, which Goldstein viewed as consisting of fundamentally concrete operational units (Goldstein and Scheerer, 1941). As might be expected, Goldstein's position was that injury to the brain compromised, or even completely abolished, the capacity to adopt the abstract attitude.3

Goldstein and his co-workers developed or improved upon several tests of the capacity to adopt the abstract attitude (Goldstein and Scheerer, 1941). The tests share the demand upon the respondent that he or she formulate an explicitly abstract, hierarchical solution to a given problem. For example, one test required subjects to sort skeins of yarn of various shades into groups. Correct test behavior depended upon the formulation of the abstract (i.e., transcendent, relatively intangible) concept, hue. More concrete, and incorrect, solutions included sorting according to redness, blueness, etc. (i.e., categorizing on a basis other than hierarchical). Goldstein was in no sense a psychometrician, and he consequently derived only qualitative scoring approaches for his tests. He has been somewhat heavily criticized for this (e.g., Battersby, 1956).

One of the tests Goldstein endorsed, the Gelb-Goldstein-Weigl-Scheerer Object Sorting Test, has been described in some detail by Weigl (1941).
Materials consisted of 30 common household objects (e.g., a pipe, candles, bell) selected such that they could be organized or sorted into groups on the basis of one or several hierarchical organizing principles. The objects were placed upon a tablecloth in "standard position," and subjects were directed to sort them into categories of their own design. Again, no quantitative mode of scoring existed, but brain-damaged people demonstrated less facility at the task than did normal controls. In particular, the cortically impaired: (1) tended to place objects together which would be used together (e.g., tools); (2) were often unable to decide that objects might be sorted according to more than a single organizing principle; (3) were typically uninsightful concerning the bases for sorting they were able to arrive at (Weigl, 1941).

The Gelb-Goldstein-Weigl-Scheerer Object Sorting Test is relevant not just because it typifies the means of assessing deficits in abstract thinking which were popular at the time. It was an important ancestor to one of the more widely used measures of brain damage today, the Halstead Category Test, or HCT (Halstead, 1940).

Halstead's (1939, 1940) earlier work with "sorting tasks" employed a test very much like the Gelb-etc., but with 62 objects rather than 30. For an illustration of the standard administration, consult the 1940 Halstead paper. The administration procedure was more psychometrically rigorous, and a careful, quantitative scoring procedure was developed.
Halstead (1940) was able to demonstrate, with 26 meticulously documented neurosurgical patients and 11 normal controls, that brain-damaged individuals tended to sort fewer objects than intact individuals, and that the presence or absence of frontal lesions "accounted" for the majority of the variance observed (i.e., frontally lesioned patients performed more poorly than those lesioned more posteriorly).4 In a test of immediate recall, it also was discovered that subjects with brain damage remembered fewer objects than normal subjects, and again, that frontally injured patients performed more poorly than other patients.

Halstead performed a rather thorough qualitative analysis of the strategies utilized in grouping objects by the various intact and lesioned subjects, in hopes of elucidating the operation of Kluver's (1931) principles of equivalence and non-equivalence in normal and brain-damaged functioning. In general, it obtained that frontal patients tended to group fewer objects, and to approach the test with less flexible and less stable organizing principles than did normal or posteriorly lesioned individuals. It might be concluded that judgments of equivalence and non-equivalence were more difficult to formulate for the frontally lesioned subjects, and were approached somewhat uninsightfully. It is important to emphasize, however, that the differences here were subtle, and the strategies adopted by lesioned and intact individuals were not really qualitatively distinct, as Goldstein (1939, 1944) might have predicted.

The work culminating in the previously summarized (Halstead, 1940) study apparently stimulated Halstead to develop a more elegant, standard, and controlled procedure (Halstead and Settlage, 1943). This procedure manifested the desideratum of systematically varying certain aspects of stimuli, in this case, "geometric figures," while assuring that other aspects remained immutable.
The procedure was the direct ancestor of what currently is known as the Halstead Category Test (Reitan and Davison, 1974, pp. 366-368), and was identical to the contemporary version save that it utilized nine subtests of 40 trials each, for a total of 360 trials, as opposed to seven subtests of various lengths (i.e., from eight to 40 items or trials), summing to a total length of 208 trials.5 The reader is directed to Appendix 1, which contains thorough verbal and depictive descriptions of the Halstead Category Test.

A remarkable, largely unprecedented, and, as yet, rarely replicated feature of this completely standard procedure was the set of demands placed upon the respondent that she or he: (1) develop, as hypotheses, organizing principles, or abstract concepts, with regard to items or trials within a subtest; (2) evaluate these hypotheses in accordance with positive or negative reinforcement; and (3) adjust the hypotheses in an ongoing or dynamic way on the basis of the reinforcements received.6 It can be readily seen that good performance at the test demands relatively high level skills in noting similarities and differences among stimuli, the capacity to incorporate feedback into ongoing cognitive and behavioral planning and organization, and, to some degree at least, a reasonable short term memory (Walsh, 1978, p. 121).

With a group of six carefully described neurosurgical patients, and a group of ten normal controls, Halstead and Settlage (1943) demonstrated that patients with frontal or prefrontal cortical ablations performed dramatically poorly at the test relative to intact subjects. Patients with cortex removed elsewhere were noted to perform about as well as normal controls (but see note 4). The authors, characteristically, inferred that frontally damaged individuals had lost their capacity to evaluate equivalence and non-equivalence of stimuli.
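
The three demands enumerated above (form a hypothesis, evaluate it against reinforcement, adjust it) can be pictured as a simple win-stay, lose-shift loop. The following is only a minimal sketch under stated assumptions and is not a model proposed by Halstead and Settlage: the three organizing principles, their names, the toy stimulus coding, and the function names are all hypothetical, and the simulated respondent is assumed never to forget a refuted hypothesis.

```python
import random

# Hypothetical organizing principles for a toy subtest. Each maps a stimulus
# (a dict of attributes coded 1-4) to an answer key 1-4. Illustrative only;
# these are not claimed to be the principles of any actual HCT subtest.
PRINCIPLES = {
    "number_of_figures": lambda item: item["number_of_figures"],
    "odd_figure_position": lambda item: item["odd_figure_position"],
    "missing_quadrant": lambda item: item["missing_quadrant"],
}


def make_item(rng):
    """Draw a toy stimulus: one value from 1 to 4 for each attribute."""
    return {name: rng.randint(1, 4) for name in PRINCIPLES}


def simulate_subtest(true_principle, n_trials=40, seed=0):
    """Return a list of per-trial errors (1 = wrong, 0 = right)."""
    rng = random.Random(seed)
    candidates = sorted(PRINCIPLES)        # hypotheses not yet refuted
    hypothesis = rng.choice(candidates)    # demand 1: form a hypothesis
    errors = []
    for _ in range(n_trials):
        item = make_item(rng)
        response = PRINCIPLES[hypothesis](item)
        correct = PRINCIPLES[true_principle](item)
        if response == correct:            # demand 2: positive feedback, stay
            errors.append(0)
        else:                              # demand 2: negative feedback
            errors.append(1)
            candidates.remove(hypothesis)  # demand 3: discard and shift
            hypothesis = rng.choice(candidates)
    return errors
```

Because the true principle never draws negative feedback, it is never discarded, so this idealized respondent can err at most once per wrong hypothesis. Real respondents, intact or lesioned, presumably depart from this ideal in memory and flexibility, which is where the sketch stops being informative.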
During the middle and late 1940's Halstead and a number of his colleagues were awarded a substantial, federally backed grant for the purpose of explicating the relevance of the frontal lobes to organized complex human behavior. This ambitious endeavor culminated in the completion of Halstead's (1947) book, the only one he ever wrote, in which appears the most comprehensive account of his theoretical conceptualizations.

The project involved the administration of 27 "quantitative indicators" of neuropsychological and psychological functioning to 207 subjects with known (and, as usual, quite carefully documented) brain damage, and 30 cortically intact controls. From this full array of 27 indicators, 13 were selected for subsequent intercorrelation and factor analysis, because of their amenability to parametric statistical treatment.7 This set of 13 variables was factor analyzed independently by Holzinger (Halstead, 1945) and by Thurstone (Halstead, 1947). Holzinger derived two alternate structural decompositions, one orthogonal, and one oblique. Thurstone apparently applied his principal components extraction (Thurstone, 1947) with a carefully executed oblique rotation. All three approaches yielded four factors. Though he (Halstead, 1945) published Holzinger's solutions, Halstead evidently preferred Thurstone's, and this is briefly discussed below.

Thurstone's first factor Halstead (1947, pp. 43-55) labeled "C", or "Central Integrative Field." He believed this construct to involve the matrix of overlearned behaviors and cognitions we all possess, garnered from having distilled the pertinent elements of thousands of experiences and situations. Halstead believed the operation of "C" to be characterized by the elicitation of the stored, relevant material, in response to the new situation, whose critical features are then subsequently integrated, themselves, into this central field.
He felt that this "organized experience of the individual" was roughly coextensive with the psychoanalytic term, "ego" (Halstead, 1946, 1948). He also saw the importance of memory to the Central Integrative Field (Halstead, 1951).

Thurstone's second factor, the "A" or "Abstraction" construct, Halstead believed to underlie or drive what he called "grouping behavior" (e.g., Halstead, 1940). Halstead had examined perhaps a dozen different grouping tests and techniques (Halstead, 1947), and carefully documented four distinct forms of grouping behavior (or, four grouping strategies). Without going into unnecessary detail, he felt that these strategies existed at different points along a continuum from what might be called "unaware," or "irrational" abstraction, to what might be thought of as "rational" or "conscious" abstraction.8 Both involve the selection, from among many, of a single property or aspect common to all members of a class of objects which differ significantly in other regards. Halstead (1951) argued that the operation of "A" obtained when the biologically "wired," or "irrational" abstraction was held in abeyance, and the consciousness sought other hypotheses, or "organizing principles" (Halstead, 1947) by means of which to group or categorize objects or events. He also believed the "A" factor operated to permit the discarding of an organizing principle, and its replacement with a more appropriate or powerful one as the need for this arose.

Halstead (1947, pp. 68-83) named Thurstone's third factor "P," or "Cerebral Power." He believed this construct represented the capacity to willfully direct concentration, to control otherwise disruptive affects or impulses, to delay gratification, and the like. He believed the operation of this factor, unlike the others, to occur as a function of cerebral metabolism.

The final factor Halstead (1947, pp. 84-90) labeled "D" or the "Directional" factor.
Its operation might be thought of as the avenue or modality by means of which any of the three other, more process-oriented factors emerges, occurs, or is "exteriorized" (Halstead, 1951). This construct might more clearly be understood as the behavioral or cognitive flexibility with which the other three factors are expressed or put into operation. Halstead (1947) viewed "D" as being particularly salient in situations demanding that the individual adopt unusual modalities for sensory or other activities (e.g., navigating in a completely darkened room).

Together, Halstead believed, the processes represented by these four factors produced "Biological Intelligence," the basic function of the central nervous system, and particularly of the frontal lobes (Halstead, 1948). Halstead (1951) held that the operation of Biological Intelligence was responsible for any and every adaptive and intelligent central nervous system activity, and that Biological Intelligence was the attribute of the individual which was compromised when the cerebral cortex, especially the frontal cortex, was damaged (Halstead, Carmichael, and Bucy, 1946).

Halstead's (1947) Category Test (HCT) loaded solidly upon factors "C" (.49) and "A" (.63). These two factors, though members of an obliquely rotated set, were essentially orthogonal (r = -.02). These results imply that the HCT demands both the careful, volitional abstraction of salient features of (visual) stimulus objects, and the capacity to integrate new, with previously existing, information. This is consistent with what various workers (Reitan and Davison, 1974, pp. 366-368; Walsh, 1978, pp. 294-295) have observed since Halstead's time.
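
To give a rough sense of what loadings of .49 and .63 on two nearly uncorrelated factors amount to, the arithmetic below computes the proportion of HCT variance those two factors would jointly reproduce. This is a hedged illustration only: it assumes the reported values can be treated as pattern coefficients, it ignores the HCT's loadings on "P" and "D" (not given in the text above), and the function name is hypothetical.

```python
# Loadings reported in the text for the HCT on factors "C" and "A",
# and the reported correlation between those two factors.
LOADING_C = 0.49
LOADING_A = 0.63
FACTOR_R = -0.02


def shared_variance(a1, a2, r):
    """Variance reproduced by two factors with loadings a1, a2 and
    inter-factor correlation r: a1^2 + a2^2 + 2*a1*a2*r.
    Assumes a1 and a2 are pattern coefficients (an assumption here)."""
    return a1 ** 2 + a2 ** 2 + 2 * a1 * a2 * r


orthogonal = shared_variance(LOADING_C, LOADING_A, 0.0)    # about 0.637
oblique = shared_variance(LOADING_C, LOADING_A, FACTOR_R)  # about 0.625
```

Under these assumptions the two factors together would account for a little under two-thirds of HCT variance, and the near-zero inter-factor correlation changes that figure by only about one percentage point, which is the practical sense in which the factors are "essentially orthogonal."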

Post-Halsteadian Development of the Category Test

Beyond the late 1940's and early 1950's, Halstead's interests apparently veered away from the business of validating and refining his battery of neuropsychological indicators, and he became consumed instead with the study of the effects of specific conditions (e.g., ablative surgery, hypertension, noise) upon brain function, or Biological Intelligence (Halstead, Apter, and Heimburger, 1951; Halstead and Chapman, 1954; Halstead, Chapman and Symmes, 1955). He remained convinced that the frontal lobes were especially endowed with this quality (e.g., Halstead and Shure, 1958), even when others began to challenge both his theoretical propositions and his empirical procedures (Chapman and Wolff, 1959; also see Walsh, 1978, pp. 113-117).

There have been several lines of research pursued with the HCT since Halstead's early work. These include: (1) its further validation; (2) correlating it with other variables of neuropsychological relevance (principally, intelligence and age); (3) attempting to shorten or simplify the test; and (4) developing an account of the measure's psychometric and neuropsychological characteristics. In the main, this conceptual and empirical work has been undertaken, or at very least heavily influenced, by one of Halstead's earliest graduate students, Ralph Reitan, the individual primarily responsible for the refinement and popularity the Halstead battery of tests has enjoyed during the past 30 years. In fact, so much involvement by Reitan has occurred that the battery now is generally known as the Halstead-Reitan. To the extent made pertinent by the objectives of the current project, each of the research focuses listed is briefly characterized, below.

Validity of the Halstead Category Test

The measure's validity has generally meant its capacity to distinguish between groups of people with and without brain damage, with as few false negatives and positives as can be managed.
The model for validity studies of the HCT, and, indeed, the entire Halstead-Reitan Battery, was established by Reitan's (1955a) now classic endeavor, in which he compared 50 pairs of subjects, carefully matched for age, sex, education, and ethnic origin, where one member of each pair manifested, as it were, "proved" (Reitan, 1955a, p. 29) brain damage, and the other remained apparently normal. The Halstead Battery was administered to all members of both groups. Of the total array of ten measures, the Halstead Category Test proved the most accurate discriminator between paired subjects, next to an index, the Impairment Index (Halstead, 1947), based upon all ten measures. With the HCT, only three subjects with cortical impairment produced better scores than their intact counterparts.

In a subsequent report Reitan (1956) documented that the HCT and the Impairment Index correlate substantially (r = .71 for 50 brain damaged subjects; r = .50 for 50 intact subjects). This not only underscores the validity or [illegible] of the HCT in this context, it also serves as evidence concerning the complexity of the measure and the demands it exerts upon the respondent's perceptual and cognitive faculties. Indeed, and in contradistinction to Halstead's (1947) frontal lobe "manifesto," it was demonstrated by Reitan (1955a) that HCT performance is rather unequivocally compromised by lesions anywhere upon the cerebral cortex, and, moreover, that the deleterious influence of cortical impairment in this respect increases directly as a function of lesion size (Chapman and Wolff, 1959). The determinants, or more relevantly, the perceptual and cognitive components of HCT performance are clearly complicated, and draw upon many different cortical sites.
A great number of other, nearly identical studies have similarly demonstrated the discriminant validity of the Halstead-Reitan Battery and, by necessity, of the HCT (e.g., Reitan, 1966; Shaw, 1966; Vega and Parsons, 1967; Russell, Neuringer and Goldstein, 1970; Reitan and Davison, 1974; Filskov and Goldstein, 1974). The frequency of this sort of validity study is apparently increasing (Hevern, 1980), and with the development and wider dissemination of multivariate statistical technology, accuracy in discrimination has improved (e.g., Wheeler, Burke, and Reitan, 1963).

The model for all of these studies and a great many more has been that of Reitan (1955a), after Halstead's (1947) pioneering work. The design may be referred to as an extreme, or static groups, approach (Campbell and Stanley, 1963). Of its many faults, one of the most disagreeable is the facilitation of the development and use of measures or entire batteries which are highly valid discriminators, but are entirely conceptually opaque. Diagnosis is rendered definite, but little headway is made concerning the nature of what is being diagnosed. The Impairment Index, a value probably useless for any purpose save indicating, with substantial accuracy, the odds of brain damage existing in a given case, best typifies the fruits of this strategy, called, by Rourke (1982), "static neuropsychology". The Halstead Category Test, as both the most important and probably the most complex measure in the Halstead-Reitan Battery, is perhaps the second most grievous offender in this regard. What would seem to be essential is to approach the HCT, and the remainder of Halstead's tests, from the point of reference of psychometric or "task" analysis (Rourke, 1982). Certain inroads have been made in this regard, and it is to these that attention must be directed.
Reitan (1955b) examined 180 subjects with known brain damage and 101 neurologically intact subjects, and discovered that for cortically normal subjects, age correlated with Impairment Index both for subjects under 45 (r = .54), and for subjects within the age range from 45 to 65 (r = .61). For brain damaged people, the relationships were dramatically weakened (r = .27, 45 to 65). These findings indicate that age and cortical impairment are positively related and that the relationship is amplified by age itself, and attenuated by occurrence of brain damage.

In a second study, Reitan (1956a) employed 190 known brain damaged individuals and 116 cortically intact subjects, and demonstrated that both psychometric intelligence (assessed by means of the Wechsler-Bellevue Form I) and Impairment Index correlated solidly with age. Correlations between total weighted score (WB-I) and age, and Impairment Index and age were .32 and .37 for the brain damaged group, and .35 and .60 for the non-brain damaged group. As before, brain damage operated to attenuate the relationship between age and Impairment Index.

Reitan (1957) similarly evaluated the relationship between HCT performance and age for a group of 138 "normally functioning, high-level subjects" (i.e., mean education was 16.52 years). A correlation of r = .45 obtained between the two variables. The author inferred that abstraction ability falls off with increasing age. Performance tended to decrease most dramatically after about age 30.
In another study, Reed and Reitan (1963a) demonstrated, with 40 matched pairs of intact and brain damaged subjects (mean age, 28 years; mean education, 11.8 years), and two groups of older subjects (n1 = 46, mean age of 44.7 years, mean education of 16.7 years; n2 = 29, mean age of 55.3 years, mean education of 13.9 years) that various neuropsychological indicators were as functional in discriminating between the two older groups as they were in discriminating between the two younger groups. The HCT was one such variable. The authors suggested the possibility that the degenerative impact of age upon CNS integrity was in some way analogous to the acute disruption in CNS functioning caused by brain insult or lesion.

A second study by the same investigators (Reed and Reitan, 1963b) involved groups of 40 young (mean age of 28.05 years, mean education of 11.82 years), and 29 older (mean age of 52.96 years, mean education of 12.45 years) subjects. Both groups were administered some 29 standard measures (including the Halstead-Reitan battery) which had been rank-ordered by three judges on a continuum from "heavily dependent on prior experiences" to "most dependent upon immediate adaptive ability (and) complexity of the problem-solving." The HCT was ranked "first" on this continuum (i.e., most dependent upon adaptive ability and capacity to solve complex problems). As well, the HCT proved to separate the two groups more clearly than any of the other measures. It was inferred that: (1) the HCT proved more dependent upon capacity to solve more complex problems than the other indicators (i.e., among Halstead's tests); (2) this capacity is impaired in the older brain, just as in the otherwise damaged brain.

In a similar study, Fitzhugh, Fitzhugh and Reitan (1964) compared groups formed by means of splitting a pool of 283 patients with chronic cerebral dysfunction (i.e.,
seizure disorders) at the median age (35.5 years) with respect to performance at the array of tests utilized in the previously described study (Reed and Reitan, 1963b), and rank-ordered, as in that study, in accordance with a "problem-solving--experiential background" continuum. Once again, the tests toward the "problem-solving" end (e.g., HCT) of the continuum proved more effective in discriminating between older and younger groups than the "experiential" (e.g., remote memory) tests. These results suggest that although both aging and brain damage impair the capacity to solve novel and complex problems, the two sources of deficits are also at least partially independent. That is to say, cognitive functioning which has been chronically impaired due to brain damage deteriorates, nonetheless, in much the same way as normal cognitive functioning, with the accumulation of age. The qualification appropriate to introduce here is that epileptic subjects are an extremely heterogeneous lot insofar as absolute degree of brain damage, as measured by the HCT or the Halstead-Reitan Battery, is concerned. Many test within the brain damaged range, yet many do not.

With 50 neurologist-confirmed, brain damaged patients (mean age of 41.7 years; mean education of 10.2 years), and 50 neurologically intact, though hospitalized, patients (mean age of 40.8 years; mean education of 11.1 years), Vega and Parsons (1967) correlated various indicators from the Halstead-Reitan Battery with age and education. The HCT, correlating more powerfully with age than any other variable used, produced, for the brain damaged group, coefficients of -.36¹¹ and .22 between HCT and, respectively, age and education. Analogous values, for the intact group, were -.63¹¹ and .45.
This effort was cross-validated several years later with samples of 35 brain damaged (mean age 34.6 years; mean education 11.2 years) and 25 neurologically intact but psychiatrically disordered (mean age 33.2 years; mean education 12.2 years) patients (Prigatano and Parsons, 1976). Correlations between HCT and, respectively, age and education, were: -.45 and .29 (brain damaged); and -.42 and .21 (psychiatric). Partialing education had no effect on the relationship between age and HCT performance for either group. Interestingly, the HCT failed to discriminate significantly between brain damaged and psychiatric patients, a finding which has been widely established, and which seems to affect not merely the HCT, but as well, the other tests of the Halstead-Reitan Battery (Watson et al., 1968; Lacks et al., 1970; Lewis, Nelson and Eggerston, 1979).¹²

A final study (Mack and Carlson, 1978) in this vein involved three groups, including 40 young and cortically intact volunteers (mean age 25.03 years; mean education, 15.43 years), 41 aged and cortically intact volunteers (mean age 69.76 years; mean education, 14.05 years), and 43 presumably neurologically impaired patients (mean age, 41.70 years; mean education, 13.00 years). An analysis of variance revealed that young subjects were superior to the statistically indistinguishable older and brain damaged subjects. When treated as a repeated measures, groups (i.e., age) by subtests (utilizing only subtests III, IV, and V) design, both main effects and the interaction proved significant. The effect for subtests was consistent with the notion that learning occurs as a function of experience with the HCT, as subtest performance improved for each succeeding subtest. The significant interaction was attributable to the fact that young subjects, in contradistinction to older and brain damaged subjects, performed more poorly on subtest V than on subtest IV. This finding was, and remains, unexplained.
In summary, these studies suggest that age, like cortical insult, impairs adaptive functioning and the capacity to address novel and complex problems. Relatively unaffected by age were those skills dependent upon old learning, or "prior experience," and this is in fundamental agreement with Wechsler's (1944) notions concerning skills which "hold" and "don't hold" with the accumulation of age. When the current set of studies is considered in conjunction with the previous discussion of the validity studies, it can also be readily inferred that adaptive ability, complex problem-solving, and the capacity to quickly develop new learning, as assessed by the HCT, require an intact cerebral cortex, and are vulnerable to any event or process which involves the loss of cells. As Halstead (1951) pointed out, Biological Intelligence is characteristic of the healthy nervous system.

The attenuation of correlations noted in the brain damaged groups in several of the studies summarized, but especially those of Vega and Parsons (1967) and Prigatano and Parsons (1976), merits some comment. One possible inference for the phenomenon is that both aging and (any other) brain damage exert a similar influence upon adaptive abilities and complex problem solving, and that the more rapidly developing influence of brain damage "preempts" the more gradual accumulation of influence secondary to normal aging. Statistically, this translates to a reduction in variance and covariance in the brain damaged groups, due to restriction in the range of HCT scores (Magnusson, 1967, pp. 144-147).

Finally, the studies herein reveal that education is modestly correlated with HCT performance, as it is with psychometric estimates of intelligence. This relationship is also somewhat mitigated by the appearance of brain damage.
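
The restriction-of-range argument invoked above (Magnusson, 1967) can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not a reanalysis of any of the studies reviewed: the standardized "age" and "HCT error" variables, the population correlation of .60, and the cutoff of one standard deviation are all hypothetical values chosen for illustration. The sketch shows that when only cases scoring in the upper (impaired) range of the error distribution are retained, the observed correlation with age falls well below its value in the unrestricted sample.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_range_restriction(n=20000, rho=0.6, cutoff=1.0, seed=1983):
    """Compare r in a full sample with r among cases above a score cutoff.

    All parameter values are hypothetical: `rho` is an assumed population
    correlation between standardized age and HCT errors, and `cutoff` is an
    assumed error score above which a case counts as "impaired".
    """
    rng = random.Random(seed)
    age = [rng.gauss(0, 1) for _ in range(n)]
    # Errors share variance with age so that their correlation is about rho.
    errors = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in age]
    full_r = pearson_r(age, errors)
    # Keep only the cases whose error scores fall in the restricted range.
    kept = [(a, e) for a, e in zip(age, errors) if e > cutoff]
    restricted_r = pearson_r([a for a, _ in kept], [e for _, e in kept])
    return full_r, restricted_r, len(kept)


if __name__ == "__main__":
    full_r, restricted_r, n_kept = simulate_range_restriction()
    print(f"full sample r = {full_r:.2f}")
    print(f"restricted r  = {restricted_r:.2f} (n = {n_kept})")
```

The selection rule here (a hard cutoff on the error score) is a deliberate simplification; actual brain damaged samples are selected on neurological criteria, so the degree of attenuation in practice would depend on how strongly those criteria constrain the spread of HCT scores.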
In an early study designed to flesh out Halstead's (1947) conceptual entity, "Biological Intelligence," or the adaptive qualities inherent in the healthy central nervous system, Reitan (1956b) correlated the Wechsler-Bellevue Scale (Wechsler, 1944), then still a relatively new measure, with Halstead's Tests, including the HCT. In his customary way, Reitan utilized two samples of 50 subjects each, one with, and one without, proven brain damage, and individually matched in pairs vis-a-vis race, sex, and education.¹³ Correlations were computed separately within samples.

The Halstead Category Test (HCT) correlated more dramatically with Verbal IQ (r = -.58, brain damaged; r = -.65, non-brain damaged), Performance IQ (r = -.64, brain damaged; r = -.67, non-brain damaged), and Full-Scale IQ (r = -.65, brain damaged; r = -.72, non-brain damaged) than did any other indicator in the battery (including Impairment Index), except the Speech Sounds Perception Test, which was roughly on a par with the HCT. In contrast to what was routinely observed in relationships between age and HCT (e.g., Vega and Parsons, 1967; Prigatano and Parsons, 1976), very little attenuation in the HCT-IQ relationships obtained in the brain damaged group, relative to the intact group. This would seem to suggest that while age and brain damage manifest overlap in influence, psychometric intelligence and brain damage (or biological intelligence) do not, insofar as HCT performance is concerned.

With the same groups of subjects, Reitan (1959) subsequently demonstrated that the Halstead Impairment Index more effectively discriminated between brain damaged and non-brain damaged subjects than did any of the Wechsler-Bellevue Subtests or the three IQ estimates. It bears mentioning that all of the Wechsler variables, save the Digit Span subtest, also significantly discriminated between the groups.
Also of interest, the presence of brain damage appeared to exert a more deleterious influence upon subjects' scores for Wechsler's (1944) "don't hold" subtests relative to his "hold" subtests. The "holds," of course, "hold" their own with age. "Holds" emphasize old learning; "don't holds" the capacity for new learning.

With 29 more or less neurologically normal subjects, Shore, Shore and Pihl (1971) obtained a correlation of -.87 between the Wechsler Adult Intelligence Scale's (Wechsler, 1955) age-equated sum of scale scores and the HCT. The authors also correlated Cohen's (1957) factors for the WAIS with HCT, and obtained coefficients of .84 with Verbal Comprehension, .72 with Perceptual Organization, 1.00 with Memory, and .76 with General Intellectual Functioning. These results would seem to indicate rather more substantial overlap between tested, or psychometric, intelligence and Halstead's Biological Intelligence, as represented by the HCT, than was discovered by Reitan (1956), in working with the Wechsler-Bellevue Scale (Wechsler, 1944). Note also the rather astonishing correlation between HCT and Cohen's Memory Factor, a variable defined by the simple linear combination of Arithmetic and Digit Span Subtests.

With two samples (n's of 177 and 62) of epileptics, Lin and Rennick (1974) replicated both Reitan's (1956, 1959) and Shore, Shore, and Pihl's (1971) endeavors. Thus, HCT total error score was correlated with estimates of Verbal, Performance, and Full Scale Intelligence, all component subtest scores, and three of Cohen's (1957) factors. For larger and smaller samples, respectively, correlations were obtained between HCT scores and Verbal IQ (-.51, -.68), Performance IQ (-.60, -.49), Full Scale IQ (-.59, -.65), Cohen's Verbal Factor (-.46, -.68), Cohen's Perceptual Factor (-.61, -.44), and Cohen's Memory Factor (-.48, -.55).
Results, in general, were like those obtained by Reitan (1956), and in contrast to Shore, Shore, and Pihl (1971), relationships between HCT and Cohen's factors were more modest. In view of the size of the Shore, Shore, and Pihl (1971) sample (n = 29), the results of Lin and Rennick are perhaps the more to be trusted.

Lansdell and Donnelly (1977) factor analyzed the WAIS subtests plus the HCT and the Halstead-Reitan Finger Oscillation Test, a measure of motor speed often useful in lateralizing brain damage (Russell, Neuringer, and Goldstein, 1970). Subjects (n = 94) included depressed and other psychiatric patients, and epileptic and other (unspecified) neurological patients (mean age, 39.5 years; mean IQ, 105.6). Their principal components, varimax-rotated solution resulted in four factors, the last two of which were rather minor. The first and largest factor was determined in general by the WAIS verbal subtests, and was labeled "verbal comprehension." The second, "visuomotor" factor was dominated by HCT (loading of .82), and included all of the WAIS performance subtests, save Digit Symbol, which, in combination with the Finger Oscillation Test, comprised the fourth, tiny factor, "manipulative speed."

The authors inferred that the HCT does not involve a skill distinct from nonverbal intelligence (i.e., as measured by the WAIS). Given the rather unusual character of their sample, however, this conclusion may be a bit premature. Reitan (1956) discovered substantial correlations between HCT and the verbal Wechsler-Bellevue subtests, which the current authors did not. As well, Shore, Shore, and Pihl (1971), and Lin and Rennick (1974) discovered substantial correlations between HCT performance and Cohen's (1957) Verbal Comprehension Factor, as well as its components, the verbal subtests, Information, Comprehension, Similarities, and Vocabulary, plus the overall estimate of Verbal Intelligence.
In summarizing the studies relating psychometric intelligence to HCT, it is perhaps appropriate to observe that while all the information is not yet available, the HCT draws substantially upon both verbal and nonverbal aspects of intellectual functioning. Phrased alternately, the HCT is a complex and demanding task, and it demands diverse cognitive operations for successful performance. There exist some indications that the performance, nonverbal, or spatial manipulative characteristics of intellectual activity are peculiarly relevant. This is not terribly surprising, in that performance/nonverbal skills are least dependent upon old learning, are most likely to succumb to the influence of aging, and generally require the novel solution of complex problems. However, as has already been discussed, the determinants of HCT performance are quite complex, and draw upon both verbal and nonverbal aspects of cognitive activity. A final comment demanded here is that the WAIS subtests are, themselves, complex tasks which have not, as yet, been adequately analyzed in the interest of revealing component perceptual, cognitive, and motor elements. This renders difficult attempts to meaningfully relate HCT with them.

As has been summarized above, it was Halstead's (1947) impression that his Category Test demanded, from the individual, the ability to volitionally engage in "grouping behavior" based upon a careful consideration and selection (from among potentially several foils) of the appropriate characteristic or characteristics of arrays of visual stimuli. The correct aspect or aspects he referred to as the "organizing principle" of the array. In order to emphasize the volitional, or the detached and analyzing, aspect of grouping behavior, Halstead incorporated reinforcement (buzzer and bell) contingencies.
The utilization of the feedback mechanism adds significantly to the complexity of the task and, as well, to the demands placed upon the subject, as it requires the respondent to delay responding in order to consider and integrate new information, the impact of which may involve shifting the response set or organizing principle.

Halstead (1947) was in substantial agreement with Weigl (1941) and Goldstein (1940, 1941, 1942), in that he believed that it was the capacity to engage reflectively in tasks requiring organization of stimuli on the basis of abstracted features that: (1) most characterized frontal lobe activity; (2) most effectively distinguished the cortically intact person from her or his brain damaged counterpart. Goldstein (1941) held that this distinction was qualitative, or rather that brain damage abolished the capacity to volitionally detach the focus of attention from a particular object (either external or internal in locus), and to analytically consider several such objects. It is unclear that Halstead entirely agreed with this qualitative interpretation. It is more likely that he viewed abstract (i.e., grouping) behavior as a continuum, bridging Goldstein's (1941) polar opposites, the abstract and concrete "attitudes." Indeed, at one time Halstead (1940) attempted to document and discriminate between different forms or strategies of object sorting, which he apparently believed reposed at various points along Goldstein's hypothetical continuum.

Reitan (1958) felt the issue remained confused as to whether cortically intact and brain damaged renditions of behavior were qualitatively discrepant or not. In this early study, he demonstrated that median intercorrelations between measures of the Halstead-Reitan battery did not differ significantly between brain damaged and normal groups.
In a second study, Reitan (1959) demonstrated that groups (n = 52, each) of brain damaged and intact subjects, matched for race, sex, chronological age, and education, both improved in performance on HCT subtest VI, relative to HCT subtest V, which manifests the same "organizing principle." The inference was drawn that new learning can occur even among those with "proven" brain damage. Reitan, further, argued that although brain damaged individuals may be quantitatively less able to abstract than normal controls (this, of course, is the basis for the claim that the HCT can discriminate between groups of brain damaged and intact subjects), there exists no difference in kind, or mode, of reasoning utilized. Reitan's inference is something of a presumptuous simplification, as it is impossible, upon the basis of HCT score alone, to determine whether or not the two groups utilized identical information-processing strategies, and varied solely in the degree of efficiency with which they did so, events which would seem essential to conclude qualitative identity.

Those findings were replicated and extended with a subsequent (Doehring & Reitan, 1962) investigation, in which it was shown that although brain damaged individuals score more poorly on each of the HCT subtests than intact individuals, the distribution of errors across subtests is essentially identical for both populations. Thus, both populations manifest maximal errors during Subtest III, and steadily improve across subsequent Subtests. These findings imply that the rates of relative familiarization, or learning, with the HCT, are similar for brain damaged and intact subjects.

In this same report, the authors demonstrated that although patients with right hemisphere lesions performed more poorly at the HCT than patients with lesions to the left hemisphere, the difference failed to attain significance.
This latter, marginal discovery was in disagreement with earlier work by McFie and Piercy (1952a, 1952b) who, using one of Goldstein's (1941) special sorting tasks, determined that impairment in abstraction ability was more often associated with left than right hemisphere damage. With grouping tasks other than the HCT, Halstead (1940) found lesion lateralization to be irrelevant. However, Halstead and Shure (1958) found left hemisphere damage to be slightly, though not significantly, more predictive of HCT impairment than right hemisphere damage. Chapman and Wolff (1959), in their careful and detailed analysis of performance, found the opposite to be true. Reitan (1960) found dysphasic patients, and those without dysphasia, though brain damaged, to perform equally poorly with the HCT. Finally, Doehring and Reitan (1961) discovered left visual field defects to be more predictive than right visual field defects of poor HCT performance. This finding suggests that disruption of the primary visual radiation, occipital lobe, right hemisphere, harms HCT success.

As previously, it is probably reasonable to conclude that HCT performance relies upon perceptual and cognitive processes of sufficient complexity that virtually all aspects of the cerebral cortex are utilized. The studies of this section also document quite nicely that: (1) the HCT does require new learning for successful performance; (2) this is true for both intact and brain damaged subjects. The controversy between Reitan and Goldstein cannot, of course, be resolved. There exists no way of determining, with recourse only to HCT responses, whether or not brain damaged and intact respondents arrived at solutions (organizing principles) in the same or in different ways, or, for that matter, if upon consistently applying the correct organizing principle, they do so with equal insight or lack thereof.
Discussion in this section is concerned with the few extant psychometric appraisals of the HCT, with the various tactics which have been utilized in order to render it easier or quicker to administer, and, finally, with the thorough conceptual analysis Simmel and Counts (1957) have graciously afforded it. The methodological criticisms and summarizing comments appearing herein will be seen to converge upon the substantive point of the current project.

The Halstead-Reitan Battery is well known to be long, expensive, and tedious to administer, score, and interpret (Erickson et al., 1978). This has encouraged a number of clinical researchers to somehow shorten the battery (e.g., Golden, 1976; Erickson et al., 1978; Mezzich and Moses, 1980; Barrett, Wheatley, and Laplant, 1982). Problems have arisen anew, however, in that shorter forms inevitably have led to the discarding of entire tests, producing not just a quantitative reduction in information, but a qualitative loss, as well. Additionally, few if any of the shortened versions or (more fashionably) "screening batteries" have been validated with independent samples.

Another strategy interesting those seeking to render the Halstead-Reitan Battery more time and cost efficient has been shortening various subtests (e.g., Golden & Anderson, 1977; Calsyn, O'Leary, and Chaney, 1980; Gregory, Paul, and Morrison, 1979). In this regard, there has been particular emphasis on Halstead's Category Test (HCT). Probably, this has been due to the HCT's lengthy administration (i.e., up to perhaps one hour with the incapacitated) and the substantial level of frustration subjects are frequently required to endure (Luria and Majovski, 1977). The test also has the virtue of being the best single indicator of brain damage among the lot of Halstead's tests (Reitan, 1955), second only to Halstead's Impairment Index, a summary quantity dependent on seven indices, including the HCT.
In addition, the test is highly reliable (Shaw, 1966; Matarazzo et al., 1976)¹⁴ and has been shown to correlate solidly with magnitude of cortical lesion, irrespective of location on the cortex (Chapman and Wolff, 1959).

Various approaches to designing a less noxious version of the HCT have been explored. The test has been shortened, both with attention having been paid to the fact that item scores are proactively dependent (i.e., HCT performance is dependent on learning, and it is appropriately viewed as a time process), and without. Thus, Kilpatrick (1970), Boyle (1975), and Gregory et al. (1979) have derived shortened versions, more or less item-analytically, which involve items taken out of sequence. Kilpatrick and Spreen (1973) have applied the same strategy, and with far more consideration to psychometric principles, to a version of the HCT Reitan (1974) has modified for use with children 9 to 15 years old. These authors also standardized their shortened version.

It is difficult to determine the utility of these shortened versions for two reasons. First, they have not (with the exception of the Kilpatrick and Spreen effort) been restandardized or independently validated. Second, very little is known about the role learning plays as a determinant of HCT performance, and consequently it is impossible to conclude whether taking items out of sequence will dramatically affect HCT test behavior or the ranking of individuals based on total HCT score. As well, these versions remove more items from certain subtests than others, apparently because it has been determined that HCT subtests are not all equally discriminative of brain damage (Boyle, 1975).¹⁵ Yet the removal of many items from a given subtest may exert an effect upon items in subsequent subtests manifest only upon independent validation.
Again, the role and the process of learning, as they develop during administration of the HCT, have not yet been studied, and it consequently remains difficult to comment definitively upon these shortened versions.

Calsyn (1980) has derived a short version (108 items) which, so he claims, does not take items out of sequence. The version consists of the first four (of the total, seven) subtests. This form has been independently validated (Golden et al., 1981) with promising results. Again, however, though the shortened form held up during this validation, it remains difficult to compare it with the previously mentioned forms, which have not been so validated. Moreover, Calsyn's (1980) approach was not psychometrically based: he merely divested the test of the last 100 of its items (or the last three of its subtests), without concern for differential item and scale validities.

Another approach which has been applied to the problem of the HCT is the strategy of deriving a version unchanged in length or order of trials¹⁶, but altered in administration so as to render the test more palatable to clinician, subject, or both. Thus, Beaumont (1975) developed an on-line program which administers and scores the HCT. Essentially, no examiner is required with this procedure, which also provides feedback (i.e., buzzer or bell) with invariable latency and is, of course, errorless. The problem, however, is that in general, and in particular with brain damaged subjects, it is essential sometimes to provide ongoing coaching to those being examined. As was said before, this is because the test is highly frustrating. Continual encouragement is often required to guarantee that the subject's best effort is being elicited.
Indeed, the vulnerability of the test to motivational and affective influence may be part of the reason why HCT errors are predictive, not just of brain damage, but also of various forms of psychopathology.¹⁷ In any event, there exists no means by which an on-line computer can, with appropriate flexibility and judgment, coach the subject who is having difficulty with the test and who is thus in danger of "giving up."

Another administration approach which has recently been developed involves the use of a latent image transfer sheet which provides subjects with visual information concerning the correctness or incorrectness of their responses (Wood and Strider, 1980). In a design which was counterbalanced for order of administration, visual feedback was alternated with the traditional auditory feedback procedure, from subtest to subtest. No significant differences were noted, although the effect for test form was obviously confounded with deviation from standard administration (i.e., what were being compared were two alternate forms, both different from the standard HCT). Moreover, even assuming that their comparisons were methodologically reasonable, the subjects involved were psychiatric, rather than brain damaged, patients. As was pointed out above, performance on the HCT is powerfully influenced by the presence of psychological disorders, for reasons as yet very poorly understood. It would seem inappropriate to generalize results based upon this sample to either "normal" or brain damaged populations. Finally, even assuming that these problems are trivial, their samples were small (i.e., two groups of 25), and the power of their tests was low (i.e., on the order of .30, assuming alpha of .05, and a two-tailed test), rendering it likely that even if the test forms were different, the authors would not have been able to detect it in the first place.
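
The order of magnitude of the power figure cited above can be checked with a rough calculation. The following is a minimal sketch, not the computation actually used for the estimate in the text: it relies on the normal approximation to the two-sample test rather than the exact noncentral t distribution, and the standardized effect size of d = 0.4 is an assumption introduced purely for illustration, since no effect size is stated for the comparison.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-tailed, two-sample test of means.

    Uses the normal approximation: the test statistic is treated as
    normal with mean d * sqrt(n / 2) and unit variance under the
    alternative. `d` is a standardized mean difference (an assumed value).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-tailed critical value
    shift = d * sqrt(n_per_group / 2)   # expected value of the statistic
    # Probability of rejecting in either tail under the alternative.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Two groups of 25, alpha = .05, two-tailed; d = 0.4 is hypothetical.
    print(round(approx_power_two_sample(0.4, 25), 2))
```

Under these assumptions the approximation gives a power of about .29 for two groups of 25, which is in line with the figure "on the order of .30" given above. A larger assumed effect would raise the estimate and a smaller one would lower it.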
Subjects in their (Wood and Strider, 1980) study were questioned, and generally preferred standard administration, because it did not require that they look away from the projected images (i.e., test stimuli) in order to respond.
Adams and Trenton (1981) utilized an identical visual feedback procedure, but supplanted the projector in their alternative administration with a deck of 3" x 5" cards with the HCT stimuli printed on them. Their professed aim was the development of a form of the HCT which would permit group (i.e., relatively unsupervised) administration. With two groups of 30 "normal" subjects each, the same counterbalanced design of Wood and Strider (1980) was adapted. Split-half reliabilities were computed for these groups (r = .79), and for a third group of 100 subjects representing an unspecified population who took the standard HCT (r = .82), and the reliabilities were found not to differ significantly. No means or standard deviations were reported for any of the three groups. The authors stated an interest in determining if test halves which differed in administration format would correlate differently. One wonders why the authors concerned themselves so exclusively with this line of inquiry, as it so clearly limits the relevance of their findings, whatever they were. At any rate, the same set of criticisms applies here as applied to the Wood and Strider (1980) study. As well, the comments made earlier concerning the importance of coaching and involvement on the part of the examiner militate against the utility of a group form of the HCT, so long as psychometric equivalence between forms has not been vigorously established.
Kimura (1981) has independently developed a card form of the HCT, in this case with stimuli printed upon 4" x 6" cards. Subjects are asked to verbalize item responses, and the administrator provides verbal feedback (i.e., says "right" or "wrong").
This form was compared with the standard administration with two groups of 15 neuropsychological referrals, and the two forms were found not to differ significantly. Power of the test applied to total HCT scores was effectively zero. Thus, assuming the test forms were different, it would have literally been impossible to demonstrate this statistically. Group means for total HCT score were on the order of 80, implying that the samples faithfully represented the population of usual neuropsychological referrals. A third group of 15 "neurologically impaired" subjects was administered both forms (standard form first) and compared with a group of 11 "neurologically impaired" subjects with the reverse order of administration. Test-retest correlations were essentially identical for both groups (r = .94, slides first; r = .96, cards first). These estimates of reliability were also nearly identical with those which have been reported elsewhere under varying circumstances (Matarazzo et al., 1976).19 The same criticisms mentioned in conjunction with the test-retest approach used by Adams and Trenton (1981) apply here, as well.
Finally, McCampbell and DeFilippis (1979) have developed a "booklet form" of the HCT. This version involves subjects pointing to a number (i.e., one through four) to indicate their response choice for each item. Verbal feedback ("correct" vs. "incorrect") is supplied by the examiner. In a preliminary report the booklet and standard form were compared with a counterbalanced design across testings with two groups of 15 college students each. Results of their two-way analysis of variance revealed a significant practice effect (i.e., the replication factor), but a lack of significance in the difference between forms, and the absence of a significant interaction (i.e., order of administration by practice). The test-retest correlations, irrespective of order, were high (r = .89, slides first; r = .95, booklet
first), as might be expected (Matarazzo et al., 1976).14 The same basic criticisms apply here. Power of their F-tests, for alpha of .05, was on the order of .15, implying that a significant difference probably would not have been detected, even if one existed. As well, "normal" undergraduate students served as subjects, as opposed to brain damaged individuals.
In summarizing these attempts to reduce the length of the HCT and/or render the test less noxious or expensive to administer, there are several pertinent criticisms. First, the derivation samples have been small, thus guaranteeing an inadequate test of the hypothesis that the two forms of administration are the same. Rather, the investigators are siding with the null hypothesis, since test significance is partly a function of sample size. Naturally, this circumstance has the effect of reducing the degree of trust one has in the results.
Another way of expressing the same problem is that beta (the likelihood of erroneously accepting the null hypothesis) is extremely large, and power (1-beta), the capacity to reject the null when it is false, extremely small. The researchers uniformly ignore this because the "wrong" null hypothesis is being focused upon. In deriving an alternate form for a test, it is more appropriate to test the hypothesis that the two forms are identical against the null that the two forms are distinct.
Modality of administration and feedback are other stimulus aspects which jeopardize the psychological equivalence of forms. It remains unknown how significant these changes are, insofar as they affect what the test measures.
Independent validation has been another vexing problem, in that it was appropriately carried out only in the case of the Golden et al. (1981) study of Calsyn's (1980) abbreviation of the HCT. This issue has in general been exacerbated by alternate forms having been derived with "normal" (i.e., essentially cortically intact) subjects.
Thus, what one possesses is a test of unknown comparability to the original, which has been neither derived nor validated as an indicator of brain damage. Though Reitan (1958, 1959) has emphasized that brain damage exerts a quantitative, rather than a qualitative, impact upon HCT performance, this does not obviate the necessity to examine the possibility that level of brain damage interacts with test form or test length.
The test-retest strategy of comparing alternate forms is suspect, too. This is because systematic differences in scores have no impact upon correlation coefficients (Cronbach and Gleser, 1953; Cronbach, 1953). As well, the capacity to store and retrieve information has been found to be inversely related to brain damage (Wechsler, 1945; Russel, 1975), and consequently test-retest correlations between alternate forms may themselves vary as a function of brain damage, thus rendering this index of equivalence in forms moot.
The issue essentially reduces to the matter of deciding what aspects of a stimulus array can be altered while maintaining some assurance that the response (both internally and externally, or empirically) is identical. As Brunswik (1956) has observed, this in itself can (and, from the point of view of valuing a scientific account, should) be rigorously studied. As has been emphasized, in a test of this type, the problems and confounds resulting are numerous and egregious. The most expedient means of resolving these problems would be simply to treat the alternate versions of the HCT as qualitatively distinct and potentially useful indices of brain damage and then to appraise them by proceeding in the usual way, namely attempting to predict brain damage with them, and relating them to other indices (including the standard HCT) of brain damage. In this way, the two salient questions, concerning equivalence of forms, and capacity of alternate forms to detect brain damage, can be directly and appropriately addressed.
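The point made above, that systematic differences in scores have no impact upon correlation coefficients, can be shown with a small numerical sketch. The scores below are invented for illustration and are not data from any study cited here; the helper name pearson_r is likewise an assumption of the sketch.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical error scores for six subjects on the standard form.
standard = [35, 52, 61, 78, 90, 104]
# A hypothetical alternate form on which every subject makes 20 more errors.
alternate = [score + 20 for score in standard]

r = pearson_r(standard, alternate)
mean_gap = sum(alternate) / len(alternate) - sum(standard) / len(standard)
print(r, mean_gap)
```

In this contrived case the two forms correlate perfectly although one is uniformly harder by 20 errors, which is why a high test-retest correlation cannot by itself establish equivalence of forms.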
Consideration of those forms involving the discarding of items presents additional problems, which, though obviously tractable by the sensible methodology just outlined, suggest several interesting questions. First, as the very description (Halstead, 1940, 1943) of the HCT implies, the capacity evaluated by the test is that of learning. The test, that is to say, is not a test of power (Anastasi, 1976), in the usual cross-sectional sense. Rather, the test appraises the individual's capacities in a dynamic, or longitudinal way.6 Conceptually, this is a minor problem in that it is not difficult to imagine a single, total score which represents the dynamic capacity of an individual to solve an ordered set of rather complex, yet formally similar problems. However, since the capacity is a dynamic one, the ordering of items is of critical importance. Without inviolate ordering, and unless it can be assumed that items are functionally equivalent with regard to one another, and that the length of the test does not interact with the learning process, then forms of the HCT based upon the discarding of items are of dubious equivalence with respect to the standard HCT.
The first of these assumptions, that the items are functionally equivalent, does not obtain in the HCT (Simmel and Counts, 1957). Rather, it seems that some items are more difficult than others, partially because of their stimulus characteristics, but also because of their position in the ordering within a subtest, and because of the interaction between the two factors, stimulus characteristics and position in the ordering. Thus, not only are item responses not experimentally independent, but their stochastic dependence is influenced by their formal characteristics.
In actuality, this is saying no more than that item difficulty, in the usual sense, varies in the HCT, but since HCT performance is a time process of sorts, item difficulty is not just a "main effect" insofar as test performance is concerned, but also creeps into an interaction effect.
The second assumption, that length of the test does not interact with the learning process, is clearly invalid. It is well known that at least up to a certain point, learning, as defined by the change in the probability of a response or a correct response, as a function of time, is not a linear process (Rachlin, 1976, pp. 180-190). Shortened versions of the HCT will likely tend not to be only linearly related to longer versions, assuming that the test measures learning.
Simmel and Counts (1957) have completed the most meticulous and thorough evaluation of the HCT to date. Their sample included 35 neurological patients, all but three of whom manifested psychomotor seizures with involvement of the anterior temporal lobe, either hemisphere. As well, 26 student nurses were included as their control group. Rank-order correlations computed between item number, within subtests, and number of subjects obtaining a correct response, suggested that performance improves as a function of familiarity with the subtest. Thus, for subtests III, IV, V, and VI,20 correlations, respectively, were .35, -.03, .35, and .10 for the sample of patients, and .70, .55, .41, and .08 for the normal group. As has consistently occurred, correlations were slighter for the impaired group. The authors also noted that although the likelihood of correct responses increased within subtests as a function of item number, the graph of the relationships hardly approximated the usual learning curve, because items are arranged in sequences or clusters of highly similar characteristics, within subtests.
Between last and first trials, or items, of adjacent clusters, the authors (Simmel and Counts, 1957) observed precipitous drops in probability of correctness. The emerging curves were upward trending, but sawtoothed, because of this intercluster phenomenon. Perhaps because of the irregularity of the process the authors made no attempt to further study or evaluate the operation of learning in HCT behavior. These investigators also obtained test-retest correlations over a three month interim of .70 for 21 temporal lobe patients, .74 for 26 student nurses, and .80 for all 47 individuals. For 20 patients, test-retest reliability was estimated at .88 over a hiatus of 15 months. These values are not unlike those reported by Matarazzo et al. (1976).14 The authors also noted the tendency for all subjects to manifest improved performance over time, again suggesting that learning was occurring. Kuder-Richardson (or, Cronbach's alpha, if preferred) estimates of internal consistency21 were also computed, and revealed values of .96 (subtest III), .96 (subtest IV), .89 (subtest V), .91 (subtest VI), .75 (subtest VII), and .96 (subtests III through VII, and I through VII). These values closely approximate the split-half estimate of reliability found by Shaw (1966).
The remainder of this (Simmel and Counts, 1957) monograph was devoted to a careful scrutiny of the distributions of respondents' errors and their relationships to item characteristics. Their work is both esthetically and scientifically pleasing, though not of great relevance to the current discussion. Certain details of their work will be referred to, as appropriate, in the presentation of the methodology. Intact and lesioned subjects were found not to differ appreciably insofar as the patterning of correct and incorrect responses was concerned, although the impaired sample fared more poorly at the test.
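For readers unfamiliar with the Kuder-Richardson estimate of internal consistency mentioned above, the following is a minimal sketch of the usual formula 20 computation for items scored correct (1) or incorrect (0). The response matrix is invented for illustration; it is not drawn from Simmel and Counts (1957), and the choice of the population form of the total-score variance is an assumption of the sketch.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for a subjects-by-items matrix of 0/1 scores.

    responses -- list of subjects, each a list of item scores (1 = correct)
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(subject) for subject in responses]
    mean_total = sum(totals) / n
    # Population variance of total scores (an assumption of this sketch).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    # Sum over items of p * q, where p is the proportion answering correctly.
    sum_pq = 0.0
    for i in range(k):
        p = sum(subject[i] for subject in responses) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses of five subjects to four items.
example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(kr20(example))
```

The coefficient rises as items covary more strongly relative to their individual variances, which is the sense in which values such as .96 indicate a highly homogeneous set of items.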
What has been covered thus far tends to converge upon the thesis that the HCT assesses a complex of phenomena having to do with abstract and conceptual thought, and the flexible solution of novel and complicated problems. HCT performance is exceedingly vulnerable to any form of cortical insult, demands substantial verbal, and perhaps especially nonverbal, intelligence, and is quite dramatically impaired by the accumulation of age beyond about 30 years. It is tenable to argue, consequently, that the HCT is acutely sensitive to the loss of cortical tissue occurring for any reason, and probably this is because the HCT demands the use of most aspects of the cortex, or, back to the original statement, the test assesses a complex of phenomena. It is here argued, perhaps more out of vehemence than originality, that the complex process evaluated by the HCT is nothing more mysterious than the operation of learning, the acquisition of new ways of organizing information, the formulation of new observational sets or proclivities, the forging of new connections between stimulus and response. Successful HCT performance is associated, then, with the ability to intentionally apply abstract thought and the capacity to integrate new information with old (Halstead's, 1947, "Central Integrative Field"), and also to incorporate response-strengthening and response-weakening information into one's cognitions, and ultimately, one's behavior.
It also is believed possible to conduct an exploration of the appropriateness and the veridicality of these assertions by means of studying, in the tradition established by Simmel and Counts (1957), the internal characteristics of the HCT. Actually, two connotations of this notion, "internal characteristics," may be appealed to.
The first sense is the traditional conception of internal structure (Cronbach, 1951), which emphasizes the factorial composition of tests or subtests on the basis of the item covariance (or correlation) structures. The aim is to derive a coherent image of the sources of variance important in producing the distribution of subtest or total test scores. The second sense is the characteristics of the item stimuli themselves. In the general case, the two versions of internal characteristics are more or less identical, because items can be reasonably fully represented by their difficulties (i.e., probabilities of correct responses), and their correlations, with one another, with total test (or subtest, as appropriate), and/or with some remote criterion. In the case of the HCT, it is argued that these aspects of test behavior are important, but it is also true that HCT items are unusually amenable to analysis and summary on the basis of their formal characteristics. This is because they are visual arrays which have been carefully and systematically designed and juxtaposed.
Prior to considering the hypotheses and focal predictions in any detail, it is essential that the reader become familiar with the structural characteristics Halstead (1943, 1945, 1947) has designed into the HCT. In no general description of this test (e.g., Reitan, 1955; Reitan, 1966; Reitan and Davison, 1974, pp. 366-368) are these characteristics emphasized or made apparent. Yet they are of critical importance in any attempt to evaluate the internal characteristics of the test. The reader is referred to the Appendix, in which the various stimuli (items) for subtests III through VII of the HCT are illustrated. The tables contained in this appendix have been adapted from Simmel and Counts (1957).
Subtests I and II are not considered here, because, as Simmel and Counts (1957) have pointed out, item variances for these subtests are diminishingly small, since virtually everyone achieves a perfect score.
From Figure I-C, it can be seen that the first 32 items of subtest III are organized into eight four-item clusters. Within clusters, the stimulus arrays are identical, save that the correct response migrates from item to item. The last eight items are not organized into clusters, but rather the stimulus array shifts after each item. The "organizing principle" for this subtest is that the most dissimilar of four geometric figures is the correct response.
Items may also be organized by the option which is correct. Each of the four options is used ten times in the 40 items, and sequencing of correct options appears to have been randomly selected. This is a critically salient attribute, in that throughout the test, some options are far more likely to be chosen than others (Simmel and Counts, 1957). This is because subjects tend to rely upon counting in preference to more complicated strategies for deriving item responses. This, in turn, is partly because subtest II embodies counting of figures as the appropriate "organizing principle", and once having established this set, subjects are loath to abandon it (Simmel and Counts, 1957). As an example of the ways in which counting operates in effecting item responses of one variety or another, consider item number five, subtest III. There are four objects (suggesting "four" as a response), arranged as three of one shape (suggesting "three" as a response) and one of a different shape (suggesting "one" as a response). None of these responses is correct, but all can be derived on the basis of a counting rationale. The rationale for considering correct option as a determinant of test behavior is that various counting biases account for an appreciable number of the correct (and incorrect) responses made, across items.
Before continuing with a presentation of the remaining subtests, it is critical to point out that subtest III is special. That is, it is generally the first subtest which gives subjects much difficulty.22 Consequently, it may be viewed as the first opportunity for learning to occur. This subtest, in a very real sense, displays the subject's baseline test behavior and early departures from it.
The "organizing principle" for subtest IV is the quadrant, in achromatic figures, which is either deviant or missing. This is in part analogous to the preceding subtest, in the sense that deviance (or difference) is the key. However, in this case, the correct response is based on a quadrant schema, with quadrants numbered from one to four, in a clockwise direction, beginning with the upper left quadrant. (Subtest III was based upon ordinal position of the distinctive member of four geometric figures.) Subtest IV is visually summarized in Figure I-D, in the Appendix.
Stimuli in subtest IV can be organized in various ways, in a fashion analogous to that used with subtest III. First, items are arranged into ten clusters of varying length (i.e., from three to six items). As in subtest III, the four response options were randomly assigned to items, and balanced, by guaranteeing that each option is the correct one on ten occasions.
Subtest V is organized on the basis of the proportion (i.e., one, two, three, or four fourths) of a figure which is composed of solid lines, as opposed to broken, or dotted lines. Figures are achromatic, and some are solid geometric figures, while others are merely lines or line segments. This subtest is depicted in Figure I-E, in the Appendix. Items are organized into six clusters, of length varying from three to nine items. Correct option is distributed as in the preceding two subtests.
Subtest VI utilizes the same organizing principle as subtest V, and may, in a sense, be viewed as a continuation of that subtest.
Stimuli are less regular in this subtest, particularly toward its end. Subtest VI is illustrated in Figure I-F, in the Appendix. The first six items are either identical to items in the preceding subtest, or unused representatives of cluster sets appearing in the preceding subtest. Items 7 through 30 are arranged in three eight-item clusters. Items 31 through 40 are quite explicitly unrelated to the other items in the subtest, though, of course, the same organizing principle is utilized. Correct response option is distributed as before.
Subtest VII consists of items drawn from subtests II through VI. In some cases, items are identical replications of previously seen items. In other cases, items may be viewed as previously unused representatives or members of cluster sets which were employed in other subtests. Subtest I is not represented in subtest VII. Each of the other subtests, II through VI, is represented equally (i.e., four items per subtest), except that subtest V is more heavily represented than subtest VI, with a combined total of eight items. Halstead (1943; 1945; 1947, p. 59) describes this last subtest as a test of "recognition" (i.e., items have been at least generically seen before). As far as the current investigator knows, it has not been correlated with measures of memory, though very clearly, at least one of the functions evaluated by this subtest is memory.23 This 20 item subtest is summarized in the Appendix, Figure I-G. There is no particular organization applicable to subtest VII, save that correct option number is distributed roughly as in previous subtests, except that option two is represented but four times, and option one, six times.
Hypotheses
As argued above, the current investigator proposes that learning is a relevant determinant of successful performance at the HCT. For analytic purposes, and as is spelled out below in a detailed way, "successful HCT performance" was operationally defined as making a correct response to HCT items.
Given that learning must imply the acquisition over time, and under systematic and relatively coherent environmental reinforcement contingencies, of behavior converging upon nearly continuous success, it was felt reasonable to operationally construe "learning" as the increase, across succeeding items, and within a given HCT subtest, in the probability of a correct item response. This, then, was the hypothesis of the current endeavor: Learning, as assessed by the positive change, over time, of the likelihood of responding correctly to HCT items of similar content, is expected to characterize arrays of items within HCT subtests, or alternately, subjects' behavior in response to succeeding items within HCT subtests. The specific predictions selected for evaluation of this hypothesis are outlined in detail in the section concerned with design and analysis.
A second principal hypothesis, and one which clearly follows from the original derivation of the HCT, was that brain damage, or cortical impairment, as defined by or inferred from performance at the Halstead-Reitan Neuropsychological Test Battery for adults, would exert a negative impact upon HCT success. It was also hypothesized that brain damage would impair the capacity to learn.
Method
Subjects
Subjects were 159 individuals referred for neuropsychological evaluation to the Psychology Service of a nearby Veterans Administration Medical Center. All were referred as inpatients. They proved rather a heterogeneous group in most regards, including intelligence, age, educational background, and severity of cognitive impairment. A substantial minority had previous and/or coexisting diagnoses of functional illness, with a history positive for psychological involvement (e.g., depression, schizophrenia), and the majority manifested a history positive for neuropsychological involvement (e.g., head injury, exposure to neurotoxins, alcoholism).
As is generally found with a population as complex as the one sampled here, it is frequently not possible to definitively exclude the functionally disordered and to focus exclusively, then, upon the cortically impaired. As Malec (1978), Lenzer (1980), and Tucker (1981) have pointed out, it is a tenable proposition that those with major psychological illness manifest cortical damage, or at least dysfunction, of some sort.12 Certainly it has been demonstrated that damage to the cerebral cortex can, and frequently does, result in disruptions in the personality's functioning which are for all intents and purposes indiscriminable from their analogs obtaining in the (presumable) absence of cortical insult. All but three of the subjects were males.
Another variable not controlled was that of psychoactive medication. As tends to be characteristic of inpatient samples, the current set of subjects were, by and large, recipients of medications intended to resolve or alleviate undesirable emotional and cognitive consequences or correlates of their functional disorders. These medications included the usual array of antipsychotic and antidepressant agents. An incisive and extensive review of the literature by Heaton and Crowley (1981) has revealed that once patients have been established on these preparations, performance at neuropsychological measures is not appreciably affected, relative, that is, to unmedicated performance in the same patients. Exceptions to this rule were that tasks demanding focused attention were actually performed better with a stabilized medication than without it, among schizophrenic patients.
In the present sample, testing with schizophrenic patients was deferred, if necessary, until psychotic manifestations had been resolved with a stabilized antipsychotic administration. Patients also were not evaluated in those circumstances in which they were apparently overwhelmed by medication.
Examiners
Neuropsychological evaluations were, in very large part, completed by one of two licensed, Ph.D. level Clinical Psychologists. Both were thoroughly trained in the administration, scoring, and interpretation of the measures used (these are described below, in the following section). A minority of the subjects involved were evaluated by one of two Psychological Technicians, who had been thoroughly trained, and who were carefully supervised, by the senior of the two clinical psychologists involved. A small number (perhaps a dozen) of the subjects were assessed by advanced doctoral graduate students in clinical psychology. These examiners, again, were carefully trained and fully supervised in the administration of the tests. The training of examiners, whether technicians or graduate students, occupied several weeks of full-time study and supervised practice. Examiners were not permitted to evaluate subjects until it was meticulously demonstrated that they adhered to standard techniques of test administration. Over the course of data collection, examiners were also periodically rechecked for "drift" from standard procedures.
The tests administered required of the administrators a reasonable capacity to adhere to clearly defined instructions, and relatively acceptable manual dexterity, in some cases. Virtually no clinical judgment was required, and interpersonal skills needed were no more sophisticated than those required for a pleasant demeanor. The tests were "objective", in the sense that measured behavior took the form of correct or incorrect responses, times (i.e., latencies) required to respond, and, in some cases, the presence or absence of clinical signs (e.g., dysphasia, dysstereognosis). None of the measures involved were "projective" in character, and no inferences were required in the scoring of the various tests.
Measures
Subjects were administered, among a broader variety of neuropsychological and more traditionally psychological measures, the complete Halstead-Reitan Neuropsychological Test Battery (HRNTB), with standard (Reitan, 1969) equipment and instructions. The HRNTB is described in detail elsewhere (Reitan and Davison, 1974, pp. 366-370). The HCT, of particular relevance here, is focused upon in the Appendix. The version employed is that which has been in use since the middle 1950's (e.g., Reitan, 1955; Simmel and Counts, 1957; Shore and Halstead, 1958), but not that which Halstead (1943, 1945, 1947, 1951) originally developed. The newer form, as is indicated in the Appendix, was shortened from 360 to 208 items, and from nine to seven subtests. The HCT was administered and scored in strict adherence with Reitan's (1969) instructions, which differ only slightly from those first devised by Halstead.
Design and Analysis
The major concern was with developing an account of HCT item behavior, on the part of the subjects. Specifically, it was undertaken to mathematically define HCT item behavior as a function of the various test-determined factors described above and below, with consideration also given to brain damage. In essence, the focus of analyses was to demonstrate convincingly, and statistically, that learning is a relevant determinant of HCT item response behavior.
The analytic designs used were complex, both because the structure of the test itself is complex, and because this structure is systematic enough to permit entering of its characteristics into prediction models. The general model chosen for the several analyses discussed below was multiple linear regression (MLR). The specific MLR orientation which was adopted was that espoused by Cohen and Cohen (1975). The rationale for selecting a generic MLR rather than a generic ANOVA treatment was based upon the complexity and frequent lack of balance in the designs.
The current investigator was aware that the two approaches coincide mathematically, but the computing algorithms available for ANOVA were nowhere near so flexible as those for MLR.
Four separate, though similar, analytic designs were employed during analyses. These involved treating subtests III through VI as separate records of item response behavior. Subtests I and II were not considered as they: (1) were essentially designed to introduce subjects to the HCT format; (2) rarely caused subjects any difficulty; and (3) therefore, were of extremely low variances, since nearly all subjects correctly responded to all items. Subtest VII was not included, either, because it appeared to be more a test of memory than of learning.24 It also lacked the systematic organization apparent in the other subtests.
The factor, 'Trial', figured prominently in each design. Trial refers to the item number within a cluster. For example, subtest III consists of eight clusters of four trials, and eight additional clusters of one trial. As a design factor, Trial is of special importance, in that it represents learning within clusters. The remaining factors have already been considered in detail above. Analytic designs were those developed for treatment of repeated measures (Cohen and Cohen, 1975, pp. 403-426; Winer, 1971, pp. 514-603) with multiple factors involved.
The dependent measure was item behavior (i.e., correctness or incorrectness in response). With repeated measures designs, between-subjects variance is first computed and partialed away, leaving a composite of systematic and random (i.e., "error") variance, within-subjects. Observations (in this case, items) are treated as m separate, distinct examples of the dependent measure, thus expanding the number of experimental units25 from n (i.e., number of subjects) to nm. Pertinent research factors were entered analytically by means of dummy-coded variables.
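The coding scheme just described can be made concrete with a brief sketch for subtest III. It is offered only as an illustration under stated assumptions, not as the coding actually used in the analyses: the function names, the choice of trial one as the dummy-coding reference level, and the assignment of trial-one polynomial codes to the eight single-item clusters are all hypothetical. The polynomial coefficients are the standard tabled values for a four-level factor. Correct option is omitted because its item-by-item sequence is not given in the text.

```python
# Standard orthogonal polynomial coefficients for a four-level factor.
POLY4 = {
    "linear": (-3, -1, 1, 3),
    "quadratic": (1, -1, -1, 1),
    "cubic": (-1, 3, -3, 1),
}


def dummy_code(level, n_levels):
    """Return n_levels - 1 indicator variables; level 1 is the reference."""
    return [1 if level == j else 0 for j in range(2, n_levels + 1)]


def subtest_iii_layout():
    """Return (item, sequence, trial) for the 40 items of subtest III."""
    layout = []
    item = 1
    for sequence in range(1, 9):       # eight four-item clusters
        for trial in range(1, 5):
            layout.append((item, sequence, trial))
            item += 1
    for sequence in range(9, 17):      # eight single-item clusters
        layout.append((item, sequence, 1))
        item += 1
    return layout


def long_format(n_subjects):
    """Expand n subjects by m items into n * m rows of coded predictors."""
    rows = []
    for subject in range(1, n_subjects + 1):
        for item, sequence, trial in subtest_iii_layout():
            rows.append({
                "subject": subject,
                "item": item,
                "sequence": sequence,
                "trial_dummies": dummy_code(trial, 4),
                # Single-item clusters receive trial-one codes (an assumption).
                "trial_linear": POLY4["linear"][trial - 1],
                "trial_quadratic": POLY4["quadratic"][trial - 1],
                "trial_cubic": POLY4["cubic"][trial - 1],
            })
    return rows


rows = long_format(159)
print(len(rows), rows[0])
```

With 159 subjects and the 40 items of subtest III, the expansion from n to nm experimental units yields 6,360 rows, each carrying one item response (not shown here) alongside its coded design factors.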
Analyses proceeded according to a strategy which was partly hierarchical, and partly simultaneous. This was because many of the factors, and the interactions among them, were more or less irrelevant to the purpose of demonstrating the operation of learning. However, these "irrelevant" sources of variance were nonetheless felt to be powerful determinants of test behavior. They were "removed" (i.e., partialed) first, in order that learning might be more clearly demonstrated.

Tests of significance of main effects and interactions were F-statistics with appropriate degrees of freedom. Since MLR was used, F-tests were derived from changes in squared multiple correlation estimates. Estimates of the error terms used were attributable in large part to Winer (1971, pp. 514-603), although the current investigator sometimes disagreed with Winer, and consequently adopted alternate definitions for error in certain cases.26

Although each of the four subtests involved was treated separately, the same "generic" design and analytic strategy was applied to all. This is summarized, in condensed form, in Tables 1, 2, 3, and 4. The ensuing discussion follows from the information contained in these four tables. Note that Table 1 lists 22 effects, or components of variance. Tables 2, 3, and 4, however, divide this array of effects into more meaningful sets.

The component 'Subjects', listed in Table 1, refers to variance attributable to differences between subjects. This variance was important both because it required partialing, or removal, from the within-subjects or replicated part of the design, and because an attempt was made to account for part of this variance by entering the factor 'Damage'. 'Damage' was defined as the Augmented Impairment Index. The HCT was removed from the array of indicators ordinarily utilized in computing the Impairment Index, and the Trail

Table 1. Summary of Effects Considered.
Effect                                   Level of Measurement
(1)  Subjects                            Nominalᵃ
(2)  Damage                              Intervalᵇ
(3)  Option                              Nominal
(4)  Sequence                            Nominal
(5)  Trial                               Nominal
(6)  Sequence power oneᶜ                 Ordinal
(7)  Trial power oneᶜ                    Ordinal
(8)  Sequence power twoᵈ                 Ordinal
(9)  Trial power twoᵈ                    Ordinal
(10) Sequence power threeᵉ               Ordinal
(11) Trial power threeᵉ                  Ordinal
(12) Option X Sequence                   Nominal
(13) Option X Trial                      Nominal
(14) Trial X Sequence                    Nominal
(15) Damage                              Interval
(16) Sequence X Damage                   Interval
(17) Trial X Damage                      Interval
(18) Option X Damage                     Interval
(19) Trial X Sequence X Option           Nominal
(20) Trial X Sequence X Damage           Interval
(21) Sequence X Option X Damage          Interval
(22) Trial X Option X Damage             Interval

ᵃ The term 'nominal' may be reasonably supplanted by 'qualitative'.
ᵇ Likewise, 'interval' may here be understood as roughly synonymous with 'quantitative'.
ᶜ Coded by means of power 1 orthogonal polynomials.
ᵈ Coded by means of power 2 orthogonal polynomials.
ᵉ Coded by means of power 3 orthogonal polynomials.

Table 2. Between-Subjects Analytic Model

Effect          Degrees of Freedomᵃ     Denominatorᵇ
(1) R²y.s       n-1                     none
(2) R²y.d       1/n-1                   (1)ᶜ - (2)

Note: The notation used in this table is largely consistent with that which Cohen and Cohen (1975) favor, and similar as well to that espoused by many others. Thus: R² = (any) squared multiple correlation; R²y.s = the proportion of variance in the dependent measure attributable to differences between subjects. The remaining "Effect" quantity is interpreted as the proportion of variance attributable to, or accounted for by, the various independent measures or arrays of same. The parenthesized numbers occurring to the left of the "Effect" quantities refer back to Table 1, and consequently ought to reduce confusion.
ᵃ The convention, df (numerator)/df (denominator), was adopted and adhered to throughout.
ᵇ "Denominator" may be taken to indicate the "error term," or the sum of squares eventually entering the F-ratio as the denominator.
ᶜ As implied by the "Note," above, the parenthesized numerals are actually abbreviations for their associated effects. Thus: (1) = R²y.s, and (1) - (2) = R²y.s - R²y.d.

Table 3. Basic Within-Subjects Analytic Model

Effect          Degrees of Freedomᵃ           Denominatorᵇ
(3) R²y.o       Co-1/nᶜ(m-1)-(Co-1)           1-(1+3)
(4) R²y.se      Cse-1/D(3)-(Cse-1)            1-(S(3)+4)
(5) R²y.t       Ct-1/D(4)-(Ct-1)

[The remainder of Table 3 and the tabled values of Tables 11 through 16 are not legible in this copy. The recoverable captions are:
Table 11. Subtest IV: Basic Within-Subjects Regression Analysis
Table 12. Subtest IV: Polynomial Within-Subjects Regression Analysis
Table 13. Subtest V: Basic Within-Subjects Regression Analysis
Table 14. Subtest V: Polynomial Within-Subjects Regression Analysis
Table 15. Subtest VI: Basic Within-Subjects Regression Analysis
Table 16. Subtest VI: Polynomial Within-Subjects Regression Analysis]

... being presented first, before the polynomial analysis. The power of each F-test computed was also estimated, and the resulting quantities were included in these tables. Power has been defined as the complement of beta, where beta is the probability of failing to reject the null when it is indeed false. Power, then, is the probability of rejecting the null when it is false. The quantity, power, increases as a function of sample size, effect size, and the size of alpha.
Specifically, other things being held constant, power will increase as the sample size is increased, as the effect size (or the difference between the populations) increases, and as alpha is set more liberally.

The purpose of high power is to assure that the null will be rejected when it "ought" to be, that is, when it is in fact false. Cohen and Cohen (1975, pp. 117-118) recommend selecting as appropriate some value between .70 and .90, say, .80, as the lowest acceptable value for power. Once this has been fixed, the investigator is in the position of manipulating the other three parameters, but chiefly effect and sample sizes, in her or his design. Alpha is generally set at .05 for reasons involving the relative costs of accepting or rejecting the null hypothesis when it is true.

In applying these considerations to the present set of within-subjects variance or regression analyses, the issue of power emerges as largely irrelevant, for the reason that sample sizes are so large as to render it highly unlikely that genuine departures from the null hypothesis, however slight, will go undetected, given a reasonable alpha of .01 or .05. In fact, alphas of .001 or even .0001 are fairly liberal, given the sample sizes involved. Under this circumstance, power approaches meaninglessness, because it stands as an indicant only of effect size. And, as can be readily seen in Tables 9 through 16, those analyses associated with diminishingly small effects, or R² estimates, are associated as well with reduced power of the F-tests. However, sample sizes were so large that the resulting F values tended to be gigantic, and consequently grossly significant, anyhow. Another means of expressing the same idea is that with alpha at .001 or smaller, and a significant F-test, power is unimportant, because the null has already been rather convincingly rejected.
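The dependence of the F-ratio on the denominator degrees of freedom can be made concrete with the usual formula for testing an increment in R². The sketch below is illustrative only: the function name is hypothetical, the error term defaults to one minus the R² of the model containing the tested set (which may differ from the error terms actually adopted in the analyses), and the degrees of freedom shown are invented for the comparison.

```python
# Hedged sketch: F-ratio for an increment in squared multiple correlation.
# Names and degrees of freedom are hypothetical, chosen only for illustration.

def f_for_r2_increment(r2_with, r2_without, df_num, df_den, r2_error_model=None):
    """Return F for the increment (r2_with - r2_without).

    The error proportion is 1 - r2_error_model; when no error model is
    supplied, the model that includes the tested set is used.
    """
    if r2_error_model is None:
        r2_error_model = r2_with
    increment = r2_with - r2_without
    return (increment / df_num) / ((1.0 - r2_error_model) / df_den)


# An effect accounting for 1% of the variance, tested on 3 numerator df.
f_small = f_for_r2_increment(0.01, 0.00, df_num=3, df_den=50)
f_large = f_for_r2_increment(0.01, 0.00, df_num=3, df_den=5000)
```

With the effect size held fixed, F grows in direct proportion to the denominator degrees of freedom, which is why even very small R² increments attain significance when item-level observations number in the thousands.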
For these reasons, although power was routinely computed with alpha fixed at .05, its interpretation was of little importance.

Overall, the results rather clearly indicated that the variables under consideration accounted for at most a moderate portion of the variance in subjects' item responses. Thus, for the basic analyses, the proportion of variance attributable to the full array of independent variables was only .113 for subtest III, .233 for subtest IV, .269 for subtest V, and .251 for subtest VI. The orthogonal polynomial components accounted for .057 in subtest III, .011 in subtest IV, .054 in subtest V, and .144 in subtest VI. Though statistical tests were highly significant in many instances, and were carefully evaluated and interpreted, it nonetheless was obvious that the substantial majority of the variance in within-subjects item behavior could not be accounted for on the basis of the factors selected for analytic consideration herein.

Option

The factor 'Option' proved dramatically significant for each subtest, and perhaps especially so for subtest IV, where it alone accounted for more than 4% of the total within-subjects variance. As was stated above, this result was anticipated, and an account of it has already been carefully detailed by Simmel and Counts (1957). While the finding was not of special interest to the present investigator, the item means and variances, segregated by option, were examined. These quantities appear in Table 17.

Table 17. Item Means and Variances, Segregated by Option

Derived With Raw Item Data
             Option 1       Option 2       Option 3       Option 4
Subtest      Mean   Var.    Mean   Var.    Mean   Var.    Mean   Var.
III          .541   .248    .548   .248    .460   .248    .583   .243
IV           .317   .217    .384   .237    .517   .250    .506   .250
V            .519   .250    .371   .233    .357   .234    .418   .243
VI           .289   .205    .194   .156    .258   .192    .159   .134

Derived With Between-Subjects Variance Partialed
             Option 1       Option 2       Option 3       Option 4
Subtest      Mean   Var.    Mean   Var.    Mean   Var.    Mean   Var.
IV           -.114  .158    -.048  .135    .085   .197    .074   .146
V            .093   .225    -.05?  .201    -.031  .216    -.008  .211

Note. Means derived with raw data are tantamount to conditional probabilities of making an error, rather than a correct response, as is more conventional.

With one exception, partialing away between-subjects variance had no effect upon relative item difficulties within subtests. The exception was that for subtest V, the items for options 2 and 3 reversed position after the between-subjects variance had been statistically extracted. The reason for this shift remained unclear.

For subtest III, option 4 items proved the most difficult, followed in order of descending difficulty by options 2, 1, and finally, 3. As Simmel and Counts (1957, pp. 27-50) point out, this is consistent with a bias in subjects' response sets toward selecting options 1 and 3 in preference to options 2 and 4. As can be seen in Figure I-C in the Appendix, subtest III items are all characterized by three similar stimuli and one distinct stimulus. Subjects tend to count, and to select option 3 on the basis of the three similar stimuli, or option 1 on the basis of the single unique stimulus in each item. This bias produces an enhanced likelihood of earning a correct response (although for the "wrong" reasons) for items in which option 1 or 3 is actually correct. From Table 9, it may be noted that this finding is not especially remarkable, as the factor 'Option' accounts for merely 1% of the within-subjects variance. Power of the F-test was also unacceptably low, at .420, and this value is largely attributable to the small effect size.

For subtest IV, 'Option' accounted for 4.2% of within-subjects variance, and the highly significant F-test had power of .971. For this subtest, items with option 3 correct proved the most difficult, followed in order by options 4, 2, and 1. As Simmel and Counts (1957, pp. 51-71) have pointed out, this likely obtained because subjects mistakenly assume that quadrants 3 and 4 in the item stimuli will occur in a left-right position, rather than a right-left, or "clockwise," position relative to one another. Options 1 and 2 do, in fact, occur in a left-right order, and consequently, items characterized by these correct options have a relatively high frequency of correct responses, as compared to items defined by correct options 3 and 4.

In subtest V, 'Option' accounted for 1.5% of within-subjects variance, and though significant, the effect was small enough to render power unacceptably low, at .564. For this subtest, items having option 1 correct were the most difficult, followed by options 4, 2, and 3. Though the determinants of the nonrandom error distribution across options are somewhat complex for this subtest, Simmel and Counts (1957, pp. 72-96) point out that items characterized by options 2 and 3 tend to be more reinforcing of the appropriate organizing principle, while items described by options 1 and 4 tend to support or suggest erroneous hypotheses on the part of subjects. In particular, for much of the subtest, option 1 items reinforce the response set which had been learned from subtest IV, and option 4 items suggest a single, unified whole stimulus, biasing subjects in favor of option 1. Figure I-E in the Appendix supports these contentions.

The factor 'Option' accounted for 2% of the within-subjects variance in subtest VI, a significant finding, the F-test of which was associated with power of .703. This estimate of power is somewhat low, and again, the reason for this is the rather minute magnitude of the effect. This subtest proved the easiest for subjects to master, as can be readily deduced from the values in Table 17.
Essentially, this was because subtests V and VI share the same organizing principle, or accurate response set, and consequently, subtest VI performance can be viewed as practiced, or profiting from prior experience with subtest V. Maximally difficult items were those having option 1 as correct, followed by options 3, 2, and finally, 4. According to Simmel and Counts (1957, pp. 97-116), the emergence of option 1 as the most difficult occurred for the same reason as was posited for subtest V. The remaining options can be seen from Table 17 to be characterized by rather low and nearly equal frequencies of error, and it was felt reasonable to argue that most subjects had correctly grasped the organizing principle by the time they began subtest VI. There was a slight tendency to favor option 4, and this in all likelihood was because the majority of the items analyzed in this subtest could be readily construed as stimuli consisting of four parts.

The analysis of the salience of 'Option' in determining within-subjects HCT item behavior was, again, of little importance to the current investigator. The component was built into the design in order to permit its influence to be partialed away for the analyses which were of interest.

Sequence

This effect proved dramatically significant for each of the subtests, and of the effects which were significant, sequence tended to account for more variance than most, hence the rather large estimates for power (more than .95, in all cases). Table 18 contains the means for items segregated by sequence, for each of the subtests. These were derived both from raw item data and from data from which variance attributable to between-subjects differences and to the factor 'Option' had been partialed away.

Table 18. Marginal Means for Sequence

Derived With Raw Item Data
                                     Sequence
Subtest     1      2      3      4      5      6      7      8      9      10
III        .711   .639   .472   .517   .530   .549   .452   .410    --     --
IV         .397   .472   .591   .430   .405   .334   .450   .366   .492   .430
V          .571   .356   .479   .151   .717   .361    --     --     --     --
VI         .?17   .12?   .121    --     --     --     --     --     --     --

Derived With Partialed Data
                                     Sequence
Subtest     1      2      3      4      5      6      7      8      9      10
IV        -.033   .066   .131  -.007  -.029  -.070  -.031  -.011   .065   .040
V          .134   .074   .056  -.272   .293  -.031    --     --     --     --
VI         .180  -.099  -.088    --     --     --     --     --     --     --

Note. Both between-subjects variance and that variance attributable to correct option were partialed from raw item data in preparing the second half of the tabled values. Tabled values, consequently, may be thought of as standard scores, rather than probabilities of erroneous responses.

The predicted relationship between sequence (that is, familiarity or experience with a given organizing principle) and probability of errors at HCT items was not unequivocally observed across HCT subtests. Sequence did indeed exert an impact upon the probability of making errors in response to HCT item stimuli, but the effect of sequence was dramatically more complex than had been anticipated. As much can be readily deduced from even a cursory glance at Figures 1 through 4, in which the partialed sequence means for the four relevant HCT subtests were graphed.

The curve depicted in Figure 1 was basically consistent with what was predicted, in that its slope, to a linear approximation, was negative. The linear component for sequence accounted for 3.4% of the within-subjects item variance, and this amount proved highly significant.

However, the linear form of the curve was somewhat disfigured by a perturbation occurring at sequences 5 and 6, where items apparently become more difficult. From examining the items of this subtest portrayed in Figure I-C, it was determined that this coincided with the strings of items 17 through 20 (sequence 5) and 21 through 24 (sequence 6).
These strings of items are characterized by the introduction of a "distractor" feature, as it were, in that suddenly the individual stimuli comprising items differ in two characteristics, only one of which is germane to the organizing principle. Previously, only a single stimulus distinction had existed.

[Figure 1. Subtest III: Partialed Mean Errors as a Function of Sequence. Vertical axis: partialed mean errors; horizontal axis: sequence.]
[Figure 2. Subtest IV: Partialed Mean Errors as a Function of Sequence. Vertical axis: partialed mean errors; horizontal axis: sequence.]
[Figure 3. Subtest V: Partialed Mean Errors as a Function of Sequence. Vertical axis: partialed mean errors; horizontal axis: sequence.]
[Figure 4. Subtest VI: Partialed Mean Errors as a Function of Sequence. Vertical axis: partialed mean errors; horizontal axis: sequence.]

The introduction of this distractor obviously rendered items temporarily more difficult, and after the eight trials represented by sequences 5 and 6, the predicted negative slope resumed across sequences 7 and 8. The perturbation apparent between sequences 4 and 7 gave the curve a form highly compatible with the function y = x³, and for this reason, the cubic component for sequence emerged as significant. For more or less the same reason, albeit to a minimal extent, the quadratic component for sequence also proved significant. That is to say, the relatively asymptotic character of the curve at sequences 3 and 4 essentially introduced a quadratic component, but because of the temporary upward turn in the curve, the cubic approximation demonstrated a superior fit.
In spite of the relevance and significance of the quadratic and cubic components, however, the linear approximation clearly manifested the better fit to the actual function.

Sequence accounted for but 1.9% of within-subjects variance in subtest IV, and though this was significant, it was not especially dramatic. Examination of Figure 2 revealed that subjects found the items of sequence 1 easier than those of the next two sequences, which were experienced as progressively more difficult. From Figure I-D, it was inferred that the first six items (sequence 1) held far more information from which the correct organizing principle could be derived than did the following seven items (sequences 2 and 3), which consequently were more difficult. The succeeding 11 items comprising sequences 4, 5, and 6 were quite similar to those of sequences 2 and 3, and the negatively sloping curve in Figure 2 suggests that subjects perceived them in this way, and gradually mastered the organizing principle. Then, the item design shifted again somewhat for sequence 7, and again for sequence 8, and both shifts made the items more difficult. Subjects, once more, apparently adapted to these changes, and items were experienced as somewhat easier in sequences 9 and 10.

The shifts in slope from positive to negative again permitted the introduction and significance of the cubic, though not the quadratic, component for sequence, although its importance was less dramatic than in subtest III. Because of the irregularity of the function as a whole, it was not possible to infer that the asymptotic characteristic of a learning curve had appeared at the locations on the graph at which its slope shifted.
The significance of the linear component of sequence was in agreement with the observation that slope was generally negative, although only slightly so, and this supported the inference that learning, as predicted, had obtained across sequences, although again, the finding was not so clear as its analog in subtest III.

Sequence accounted for 13.6% of the within-subjects variance in subtest V, far more than for subtests III and IV. The reason for this can be readily inferred from examining Figure 3: sequences differ radically in their average item difficulties. In a way analogous to subtest IV, though remarkably more pronounced, the shift in item means across sequences was not as predicted. Indeed, the moderately significant linear component, accounting for .5% of within-subjects variance, actually manifested a positive slope, in contradiction to what was predicted.

To the quadratic component was attributed 1.87% of the within-subjects variance, and again, this in no simple way demonstrated learning, but rather only emphasized the unusual irregularity of the curve. In this case, too, the slope of the curve was clearly in the direction opposite to that predicted. In a similar way, the cubic component attained significance, accounting for a minute .3% of within-subjects variance. Had higher order orthogonal polynomials been entered, it is highly likely that they would have captured sufficient variance to attain significance, as well.

By appealing to Figure I-E, in the Appendix, an interpretation of the rather complex curve in Figure 3 was made possible. The negative slope between sequences 1 and 2 was attributed to the structural similarity of their item stimuli, and this was in support of the hypothesis that learning would occur as a function of familiarity with items within the subtests.
Sequence 3 involved a change in item design, and this was associated, in a way by now quite predictable, with an increase in item difficulty. Sequence 4 again proved less difficult, and this was by far the simplest of the sequences. The reason for this was readily derived from scanning items 26 through 33 in subtest V, and noting how transparent the organizing principle was in these items. All that was required of the subject was that she or he count the number of solid line segments. Sequence 5 was attended by a sudden and remarkable increase in errors, and this was attributed to the complexity of items 34 through 37, relative to items 26 through 33. Counting solid line segments in sequence 5 did not readily produce a correct response. Finally, the three items in sequence 6 again permitted the success of a rather straightforward counting strategy, and this was associated with a clear decrease in errors.

In summary of subtest V, while examination of the structural features of the item stimuli permitted lucid interpretation of the remarkable, sequence-dependent alternations in item difficulty, only in the case of sequences 1 and 2 can learning, as hypothesized, be argued to have convincingly occurred.

In subtest VI, again, sequence accounted for rather a dramatic proportion of the within-subjects variance, in this case, 13.7%. From scrutinizing Figure 4 and the polynomial components in Table 16, it was readily concluded that both linear (10.0%) and quadratic (3.7%) components were salient and significant. Had it been possible to include a cubic component, it might well have accounted for additional variance. The slope of the function in Figure 4 was negative, or in the predicted direction, and an examination of Figure I-F, in the Appendix, supported the notion that even substantial alterations in the structural aspects of the item stimuli failed to increase item difficulty. It was inferred that learning had indeed obtained.
Trial

This effect was also strikingly significant for each of the subtests, although it accounted for only about one half to one tenth the amount of variance attributed to sequence. The more moderate effect sizes were associated with decreased estimates of power. Table 19 includes the marginal means for trial, for each of the subtests, computed both with raw item data and following the partialing away of both between-subjects and option-attributable variance.

Table 19. Marginal Means for Trial

Derived With Raw Item Data
                                  Trial
Subtest     1      2      3      4      5      6      7      8      9
III        .604   .515   .489   .533    --     --     --     --     --
IV         .469   .417   .484   .363   .329   .361    --     --     --
V          .498   .508   .468   .462   .370   .239   .411   .336   .436
VI         .300   .180   .253   .169   .229   .219   .260   .174    --

Derived With Partialed Data
                                  Trial
Subtest     1      2      3      4      5      6      7      8      9
III        .076  -.023  -.054   .000    --     --     --     --     --
IV         .033  -.001   .017  -.035  -.061  -.150    --     --     --
V          .0?0   .087   .0?0   .036  -.065  -.153  -.045   .08?   .043
VI         .085  -.002  -.015  -.012   .007  -.048   .047  -.061    --

Note. Variance attributable both to between-subjects differences and to correct option was partialed from raw item responses in preparing quantities in the second half of this table. Resulting values are standard scores rather than probabilities of erroneous responses.

As was noted with sequence, the predicted monotonic decrease in HCT item errors as a function of Trial did not unequivocally obtain. Learning did indeed occur, and this conclusion was carefully justified, but the phenomenon was more complex than anticipated, because of characteristics built into the HCT and, by implication, out of the control of the present investigator. The partialed means were graphed as a function of Trial, within subtests, and the resulting curves may be viewed in Figures 5 through 8.

[Figure 5. Subtest III: Partialed Mean Errors as a Function of Trial. Vertical axis: partialed mean errors; horizontal axis: trial.]
[Figure 6. Subtest IV: Partialed Mean Errors as a Function of Trial. Vertical axis: partialed mean errors; horizontal axis: trial.]
[Figure 7. Subtest V: Partialed Mean Errors as a Function of Trial. Vertical axis: partialed mean errors; horizontal axis: trial.]
[Figure 8. Subtest VI: Partialed Mean Errors as a Function of Trial. Vertical axis: partialed mean errors; horizontal axis: trial.]

On the basis of Table 19 and Figure 5, the likelihood of errors, as a function of trial, was concluded to have shifted in the direction predicted. Thus, the slope of the curve obtaining was negative, and it began to assume an asymptotic form between trials 2 and 3. The upturning of the curve between trials 3 and 4 was not, however, expected, and amounts to an increase in errors for trial 4 relative to trial 3 (or trial 2, for that matter). The reason for this upturning was very likely that the distribution of options across trials was nonrandom (see remarks concerning the interaction between option and trial, below), with trial 4 being loaded, so to speak, more heavily with options 2 and 4 than were the other trials. Because these options were more difficult than the other ones, trial 4 items were consequently rendered more likely to elicit errors than the others. The leveling and upturning character of the curve was consistent with a quadratic function, and for this reason both linear and second degree orthogonal polynomial components proved significant.
The quadratic component, however, augmented the more substantial (3.4%) linear element by a scant .9%. The cubic component added nothing whatsoever to the prediction.

As is depicted in Figure 6, the curve relating partialed errors to trial in subtest IV was, as predicted, of negative slope. For the basic within-subjects variance or regression analysis, trial accounted for .9% of the available variance, and as indicated by Table 12, the curve was very well approximated by a linear component, with slight cubic curvilinearity introduced by the perturbation occurring between trials 2 and 3. The quadratic component proved irrelevant, and the linear and cubic elements combined accounted for about 90% of the variance attributable to trial. The upward turn in the curve occurring at trial 3 was likely attributable to a preponderance, for this trial, of items with correct options 3 and 4, which were, for this subtest, the more difficult items. Nonetheless, it was clear that, apart from this perturbation, errors decreased monotonically as a function of trial, much in the way hypothesized.

Subtest V produced an astonishingly complex curve, and one rather difficult to reconcile with the present hypotheses. As can be seen in Figure 7, the curve was roughly as expected for trials 1 through 6, but then the slope suddenly became positive for trials 7, 8, and 9. Also not as predicted, trial 1 was characterized by fewer errors than trial 2. Because of the irregular form of the curve, the cubic component emerged as significant. On the whole, the effect captured 1.7% of the within-subjects variance, and this was predominantly linear in character, with some improvement in fit accomplished by the addition of the cubic component.
The departure of trial 1 from expectation could not be accounted for by appealing to the distribution of correct options, for, by examining Figure I-E in the Appendix, it was concluded that there were as many difficult items for this trial as easy ones, as determined, that is, on the basis of correct option. The partialed individual item means also were examined, but this, too, failed to clarify the finding.

The upward trend obtaining for trials 7, 8, and 9 was also rather difficult to interpret. Trial 7 was clearly heavily loaded with correct options 1 and 2, and was never defined by option 3, and consequently, it was not surprising that error frequency increased for this trial. Trial 8 manifested a decrease in errors relative to trial 7, but this was still a more difficult trial, on the average, than was predicted. The items comprising this marginal cell included two instances of correct option 3, and one each of correct options 1 and 2, and this composition did not seem to support its relative difficulty. Nor could further light be shed upon the matter by appealing to the structural characteristics of the relevant items, as they appear in Figure I-E. Finally, even an examination of Figure 3, with consideration being given to the sequences (1, 2, 3, and 4) having a trial 8, failed to clarify the matter. That is, though these are the earlier sequences in the subtest, they are by no means the most difficult. Trial 9 was readily explained, as it was represented solely by sequence 1, which was characterized by more errors than any other sequence save number 6.

Subtest VI was somewhat more coherent than subtest V with regard to trial, and this effect accounted for 1.7% of the within-subjects variance therein. Both linear and cubic components were important, as was also true for subtests IV and V.
However, in this case, each component accounted for but .4% of the within-subjects variance, or approximately 24% of the variance allocated to trial, indicating that 52% of the available variation was attributable to aspects of trial not predicted.

From the curve appearing in Figure 8, it can be seen that trial 1 was the most difficult of the lot, and that trials 2, 3, 4, and 5 were easier, and about equally difficult. Examination of the marginal means derived from raw item data revealed that not only was subtest VI by far the simplest of those analyzed, but also that trials 2, 3, and 4 were comprised of items with very low difficulty. It was believed possible that these trials illustrated the operation of an asymptotic process. Trial 6 manifested decreased difficulty, but this was attributed to the fact that of the three items composing it, two manifested correct option 3, and one, correct option 1, the easier two options for this subtest. Trial 7, on the other hand, included one item each of correct options 2 and 4, and consequently its frequency of errors increased. Trial 8, once again, was characterized by correct options 1 and 3. The decreases in errors for trials 6 and 8 were inferred to demonstrate further learning, relative to trials 1 through 5.

It was decided not to employ the curvilinear components of sequence and trial in subsequent consideration of the two and three variable interactions. The reasons for this were that the linear components of these variables tended to account for far more variance than the curvilinear components, and that even when the curvilinear components proved relevant, it was generally because they permitted a better fit to curves distorted from linearity for reasons other than those hypothesized. It was decided, consequently, to enter the dummy coded versions of trial and sequence into interactions.
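The linear, quadratic, and cubic components of trial referred to above can be illustrated with orthogonal polynomial contrasts. The following is a minimal sketch, not the analysis actually performed: it applies the contrasts to the six partialed trial means for subtest IV from Table 19, weights every trial equally, and ignores the item-level, unequal-cell structure of the regression, so its shares are not expected to reproduce the percentages reported in the text. The function names `orthogonal_polynomials` and `trend_shares` are hypothetical.

```python
# Illustrative sketch: equally weighted trial means, NOT the item-level
# regression reported in the text. Function names are hypothetical.

def orthogonal_polynomials(k, degree=3):
    """Orthonormal polynomial contrasts for k equally spaced levels (k > degree)."""
    x = [i - (k - 1) / 2 for i in range(k)]      # centered trial positions
    basis = [[1.0] * k]                          # constant term, dropped on return
    for d in range(1, degree + 1):
        v = [xi ** d for xi in x]
        for b in basis:                          # Gram-Schmidt: remove lower-order parts
            proj = sum(vi * bi for vi, bi in zip(v, b)) / sum(bi * bi for bi in b)
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        norm = sum(vi * vi for vi in v) ** 0.5
        basis.append([vi / norm for vi in v])
    return basis[1:]


def trend_shares(means, degree=3):
    """Contrast value and share of between-trial variation for each component."""
    k = len(means)
    grand = sum(means) / k
    total = sum((m - grand) ** 2 for m in means)
    names = ("linear", "quadratic", "cubic")
    result = {}
    for name, c in zip(names, orthogonal_polynomials(k, degree)):
        value = sum(ci * mi for ci, mi in zip(c, means))
        result[name] = (value, value ** 2 / total)
    return result


# Partialed marginal means for subtest IV, trials 1 through 6 (Table 19).
subtest_iv = [.033, -.001, .017, -.035, -.061, -.150]
shares = trend_shares(subtest_iv)
for name, (value, share) in shares.items():
    print(f"{name:9s} contrast = {value:+.3f}, share = {share:.1%}")
```

A negative linear contrast corresponds to errors falling across trials. Dummy coding, by contrast, gives each trial its own indicator and so imposes no particular curve shape, which is the property relied upon in the decision described above.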
Interactions

Both the option by sequence and the option by trial interactions were predicted to reach significance, and the rationale behind each prediction was that the impact of the item structures or characteristics of each subtest upon the types of errors made would shift in form over time, in a way consistent with learning. Insofar as error distributions across the various options were concerned, it was consequently anticipated that early in the subtests, some options would appear more difficult or more simple than others, and that these disparities would vanish as a function of time or familiarity with the subtests, as defined by the passage of sequences or trials. Convergence in apparent option difficulty levels was thus expected, over time.

Of the total set of interactions examined, option by sequence emerged as the more potent, in that it readily attained significance for all four subtests examined. For subtest III, it accounted for 2.1% of within-subjects variance; for subtest IV, it accounted for 1.6%; for subtest V, 4.0%; and for subtest VI, 1.2%. Next to sequence itself, this effect tended to operate as a more important determinant of HCT item behavior than any other effect studied at the level of the within-subjects design. It had been predicted that this interaction would attain significance. It was argued that as familiarity with a subtest of the HCT increased, item characteristics would prove less distracting and hence less important in determining responses made. This shift, it was reasoned, would produce a diminution in the salience of correct option as a predictor of HCT item behavior. Pertinent cell and marginal means have been reproduced in Table 20 for subtest III, Table 21 for subtest IV, Table 22 for subtest V, and Table 23 for subtest VI. As was the case previously, the strategy of partialing both between-subjects variance and that variance attributable to option was applied here as well.
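The partialing strategy mentioned above can be sketched as a two-step residualization of item error scores. This is an assumption-laden illustration rather than the procedure actually used: it removes subject means and then option means by simple centering (which coincides with regression on dummy-coded factors only in a balanced layout), it omits any rescaling to standard scores, and the toy data and the names `center_within` and `marginal_means` are hypothetical.

```python
from collections import defaultdict

# Illustrative sketch with hypothetical toy data; not the original analysis.

def center_within(rows, factor):
    """Subtract each level's mean error score from the rows at that level."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        sums[r[factor]] += r["error"]
        counts[r[factor]] += 1
    return [dict(r, error=r["error"] - sums[r[factor]] / counts[r[factor]])
            for r in rows]


def marginal_means(rows, factor):
    """Mean (partialed) error score at each level of a factor."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        sums[r[factor]] += r["error"]
        counts[r[factor]] += 1
    return {level: sums[level] / counts[level] for level in sums}


# Hypothetical design: each subject answers four items, one per
# (correct option, sequence) pair; 1 = error, 0 = correct.
design = [(1, 1), (2, 1), (1, 2), (2, 2)]
errors = {"s1": [1, 1, 0, 0], "s2": [1, 0, 0, 0], "s3": [1, 1, 1, 0]}
rows = [{"subject": s, "option": o, "sequence": q, "error": float(e)}
        for s, es in errors.items()
        for (o, q), e in zip(design, es)]

# Remove between-subjects differences first, then option-attributable differences.
partialed = center_within(center_within(rows, "subject"), "option")
sequence_means = marginal_means(partialed, "sequence")
print(sequence_means)
```

After both steps the entries are deviations around zero rather than error probabilities, which is why the tabled cell and marginal means can be negative.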
In order to facilitate the interpretation of these tabled values, partialed mean errors were graphed as a function of sequence for each of the subtests, and these curves were reproduced in Figures 9 through 12. Each figure contains four curves, one for each option.

Table 20. Subtest III: Cell and Marginal Means for Option X Sequence

                         Option
Sequence       1       2       3       4    Marginal
1            .243     --     .092    .141    .179
2           -.034    .132     --     .109    .085
3           -.006    .068   -.091     --    -.030
4             --    -.022   -.055   -.016   -.027
5           -.086     --     .136    .036    .000
6           -.092    .061     --    -.051   -.006
7             --    -.195    .00?   -.058   -.062
8           -.091   -.236     --    -.1??   -.142
Marginal     .005    .012   -.076    .048    .000

Note. Entries have been partialed with respect to variance attributable to between-subjects differences and to option.

Table 21. Subtest IV: Cell and Marginal Means for Option X Sequence

                         Option
Sequence       1       2       3       4    Marginal
1            .087   -.061   -.099   -.108   -.033
2           -.068    .085    .161     --     .066
3            .077     --     .173    .128    .131
4           -.078    .046   -.032    .037   -.007
5           -.022   -.055   -.117    .076   -.029
6           -.144   -.049     --    -.017   -.070
7             --    -.026   -.053    .008   -.031
8           -.012   -.021     --     .001   -.011
9            .109    .002    .075     --     .065
10            --    -.008   -.131   -.010   -.040
Marginal    -.114   -.048    .085    .074    .000

Note. Entries have been partialed with respect to variance attributable to between-subjects differences or to option.

Table 22. Subtest V: Cell and Marginal Means for Option X Sequence

                         Option
Sequence       1       2       3       4    Marginal
1            .140    .017    .133    .154    .134
2           -.051   -.080   -.075   -.088   -.074
3            .112   -.062   -.062    .233    .056
4           -.282   -.218   -.238   -.352   -.272
5            .019    .522    .506    .124    .293
6             --    -.061   -.035   -.003   -.031
Marginal     .093   -.054   -.031   -.008    .000

Note. Entries are based upon means with variance due to between-subjects differences and option having been partialed.

Table 23. Subtest VI: Cell and Marginal Means for Option X Sequence

                         Option
Sequence       1       2       3       4    Marginal
1            .211    .122    .136    .357    .180
2           -.130   -.069   -.1??   -.074   -.099
3           -.130   -.057   -.135   -.031   -.088
Marginal     .063   -.029    .035   -.063    .000

Note. Entries are means with variance due to between-subjects differences and option having been partialed.

[Figure 9. Subtest III: Option-Segregated Curves as a Function of Sequence]

[Figure 10. Subtest IV: Option-Segregated Curves as a Function of Sequence]

[Figure 11. Subtest V: Option-Segregated Curves as a Function of Sequence]

[Figure 12. Subtest VI: Option-Segregated Curves as a Function of Sequence]

The four curves depicted in Figure 9 all demonstrate the negative slope already noted in Figure 1. Variation very obviously occurred, but the majority of this was attributed to the lack of balance in the design or, more precisely, to the nonrandom representation of trial at the various points on the curves. As well, various aspects of the HCT items themselves may well have exerted an uncontrolled impact upon mean errors. These sources of influence were ignored in interpreting the interaction.

Consonant with the shape of the curve in Figure 1, three of the four curves in Figure 9 bend upward, in the direction of greater mean errors, as the more complex sequences 5 and 6 are encountered. The sole curve which failed to show this trend was that for option 1.
A rather convincing explanation for this was arrived at upon examining Figure I-C, in the Appendix. Items 17 through 20 form sequence 5. It will be recalled that options 1 and 3 were favored by subjects for this subtest, because stimuli tended to be divisible into two groups, one always containing three events, and one, a single event. Then, by counting, subjects tended to arrive at a response of either "one" or "three". This produced a bias in the direction of fewer errors for these options, although correct responses were made for inaccurate reasons. The negative slopes noted for the curves in Figure 9 across sequences 1 through 4 indicate that this uninsightful response set was relinquished by subjects as they were punished. Then, when sequence 5 was encountered, and it again became difficult to divine the accurate basis upon which to respond, subjects were very likely pushed, as it were, in the direction of counting once again. In this case, however, the first three items of the sequence offered only punishment as a consequence of choosing option 3. Option 1, on the other hand, was liberally reinforced, as it appears twice as the correct solution during the first three items of sequence 5. By the time correct option 3 arrived, with item 20, this response had been extinguished, and many subjects erroneously selected some other option, including, quite possibly, option 1. Sequence 6 contained no instance of correct option 3, and the rather high error rate noted for this option in sequence 7 probably indicates that subjects never recovered their faith in this response.

The combination of lack of trust in option 3, and the increased complexity of the items beyond sequence 4, also seemed to have affected the error rates for options 2 and 4.
On the one hand, that is, the punishment of option 3 encouraged subjects to try, instead, options 2 or 4, even though in the earlier sequences these alternatives could not be arrived at solely by utilizing the strategy of item counting. However, the more complex, later sequences also are characterized by stimuli which differ more from one another, and consequently this may well have encouraged subjects to arrive at a "four" response by counting. Finally, sequences 7 and 8 contain items which can quite readily be separated into two groups of stimuli, and in all likelihood this accounted for the great reduction of errors at sequence 8 for option 2.

In summarizing the option by sequence interaction for subtest III, it was concluded that its significance did not indicate that familiarity with the subtest had shifted the response bias introduced by item characteristics in the direction of more insightful behavior. Rather, this interaction revealed that, aside from the general decrease in errors as a function of familiarity with the subtest, the shifts in difficulty, over time, of the various options were attributable to alterations in the content or structure of items. In particular, as the accurate organizing principle again became obscure, counting was resorted to. Moreover, as alternate, though equally incorrect, counting strategies became available, they were used.

In comparing Figures 2 and 10, it can be deduced that to a fair approximation, with exceptions as noted below, the family of option-segregated curves for subtest IV behaved in a fairly coherent way. On the whole, the slopes of these curves were negative, indicating a general decrease in errors as a function of familiarity with the subtest, and largely irrespective of correct option. The first important exception was that for sequence 1, option 1 proved quite difficult relative to the remaining options, which were roughly equally difficult.
For sequence 2 the pattern shifted, with option 1 manifesting fewer mean errors than the others. The higher frequency of errors for correct option 1 items in sequence 1 was inferred, after Simmel and Counts (1957, pp. 51-71), to have come about as a consequence of the subjects' difficulty in arriving at a "one" response by counting some aspect of the stimuli in sequence 1. As much was deduced by examining Figure I-D. Shunning option 1, subjects made many errors when this alternative was actually the correct one. The relative decrease in errors noted for items in sequence 1 with correct options 3 and 4 was attributed to the "success" of counting strategies, albeit for inaccurate reasons. That option 2 fared so well in its error rate was probably due to its occurring rather late in the sequence, at the position of trial 5, by which time many subjects had divined the correct organizing principle.

Upon the arrival of sequence 2, the stimulus array shifted such that the cuing numbers apparent in sequence 1 items were no longer present. Errors for option 1 decreased significantly, but errors for the other options increased just as dramatically. Based on the appearance of the item stimuli, these changes were inferred to have come about because suddenly the stimuli were quite appropriately viewed as unitary constructs, calling for a "one" response.

Beyond this point, on the basis of the curves in Figure 10, it would appear that items with correct options 3 and 4 tended to remain more difficult than items with correct options 1 and 2. As well, whenever the stimulus figures were closed, or manifested an unbroken line completely enclosing an inner space, items with correct option 1 manifested decreased mean errors. The reason for the first of these trends was that, as was mentioned above while discussing the effect 'Option', quadrants 3 and 4, associated with correct options 3 and 4, were counterintuitively placed with respect to one another and to quadrants 1 and 2.
Thus, subjects invariably tended to confuse these quadrants, and the associated mean error values remained inflated. The second trend was explained as the predisposition by subjects to emit a "one" response when the stimulus figure could be viewed as a single, coherent event. This proved possible for sequences 2 (items 8 through 10), 4 (items 14 through 17), 6 (items 22 through 24), and to a lesser extent, 8 (items 29 through 32). For sequences which did not facilitate the perception of stimuli as single, coherent objects, items with correct option 1 tended to manifest somewhat inflated mean error scores, and this was concluded to indicate that subjects had abandoned the "one" response, with the consequence that items for which it was the correct choice manifested elevated errors.

As was already noted in discussing this interaction for subtest III, option by sequence could not be said to have demonstrated appreciable learning effects in the case of subtest IV. Rather, the interplay of option and sequence was concluded to have arisen as a consequence of alterations in item characteristics which tended to enhance or suppress the likelihood of response options, and which in turn inflated or deflated mean error scores in ways irrelevant to learning.

For subtest V, the array of findings for the option by sequence interaction was somewhat less complex to interpret than was true of subtests III and IV. In general, the curves in Figure 11 very neatly followed their option-unsegregated analog in Figure 3. The sole remarkable discrepancy was that items with correct options 1 and 4 tended to produce nearly identical mean error scores across sequences, and the same was true of items with correct options 2 and 3. The curves for these two distinct pairs of items converged, more or less, at sequences 1, 2, 4, and 6, and were sharply divergent at sequences 3 and 5.
At sequence 1, which was otherwise strongly convergent across options, items with correct option 2 manifested a lower mean error score than items with one of the other three options correct. In all likelihood, this was because those items with correct option 2 could not be responded to on the basis of the set which had been acquired during the previous subtest, number IV. From scanning items 1 through 9 in Figure I-E, it was determined that those items with option 2 as correct could not be solved by appealing to the "quadrant" schema as learned through contact with the preceding subtest. Items 4 and 6 also did not quite fit the previously learned organizing principle, but these probably strongly elicited "one" responses, as they so clearly were unitary, coherent stimuli. Items characterized by correct option 2 were most likely to be correctly solved, then, because they did not so readily elicit an erroneous principle from subjects.

The divergent locales on the curves obtaining at sequences 3 and 5 were explained, again, by appealing to the stimuli as depicted in Figure I-E. For sequence 3, the items with correct option 4 in all likelihood tended to be seen as a pair of line segments, and for this reason these elicited "two" responses (Simmel and Counts, 1957, pp. 72-96). Those items with correct option 1 tended to elicit either "two" or "three" responses. The explanation for the choice of option 2 was obvious. It proved more difficult to understand why in a situation of ambiguity a choice of "three" would prevail over a choice of "one", although this was also noted to have occurred in response to the majority of the items in subtest III. Sequence 5 produced the highest mean frequency of errors of any group of items in this subtest or any other. In fact, items 35 and 36 of this subtest were the most difficult items analyzed from this or any other subtest.
The reason for this was in all likelihood an overwhelming inclination experienced by subjects to view the stimuli in sequence 5 as whole objects, and to emit "one" responses. The response of "four" was also apparently encouraged, judging by its reduced mean error score, but this item was the last one in the sequence, and by that time many subjects had probably solved the special problem posed by sequence 5. A careful examination of items 35 and 36 also suggested that, in addition to "one" responses, these items probably encouraged "two" and "four" responses, respectively. With sequence 6, error frequencies again dropped for correct options 2 and 3.

The interaction involving option and sequence, for subtest V, was again concluded not to have supported the hypothesis that learning would operate to decrease the impact of item characteristics upon subjects' behavior. Instead, the interaction in this case was quite explicitly a function of the impact changes in item structure had upon the likelihood of one option being selected over another. This could not be said to have anything whatsoever to do with learning.

The family of curves depicted in Figure 12 follows its generic analog in Figure 4 very closely. There were few divergent aspects to the curves, and when present, these were readily attributed to item characteristics. Thus, option 1 items proved more difficult for sequence 1 than did option 2 items, and this pattern shifted for sequences 2 and 3. The early (i.e., sequence 1) juxtaposition of mean error scores for options 1 and 2 was probably attributable to a "resurfacing," as it were, of the organizing principle learned from subtest IV. Then, during sequence 2, item 16 probably was generative of erroneous "two" responses, while the items manifesting correct option 1 were probably less likely to facilitate errors.
For sequence 3, it was difficult to understand or offer an account of why items defined by correct options 2 and 4 proved more difficult, on the average, than those manifesting correct options 1 and 3. Rather, it would have seemed more plausible that items with correct options 1 and 3 would have been readily confused with one another, producing higher mean error scores. The sole reason the current investigator was able to arrive at was that items early in this sequence tended to be characterized by correct options 2 and 4, and consequently, perhaps more mistakes were made with these before the sequence had been mastered.

For all four of the subtests, it was concluded that although the option by sequence interactions were salient and significant determinants of HCT item behavior, in no circumstance could it be argued that the nature of this influence was in the direction of item characteristics proving less distracting or disruptive over time (i.e., sequences). Thus, it was inferred that learning could not be demonstrated to have manifested a moderating influence upon the relationships between item characteristics and mean error scores. On the other hand, the relationships between item characteristics, overall error rates, and option-dependent error rates were elucidated significantly by consideration of the option by sequence interactions, and this further emphasized the salience of item characteristics, and particularly item complexity, as determinants of HCT item errors.

Option by Trial

For three of the four HCT subtests considered, the option by trial interaction also emerged as significant. Thus, the interaction accounted for .2% of the within-subjects variance for subtest III, .7% for subtest V, and .3% for subtest VI. As was true of the main effect for trial relative to that for sequence, these interactions captured dramatically less variance than did their counterparts involving sequence.
Cell and marginal partialed mean error scores for the three subtests manifesting significance appear in Tables 24, 25, and 26. As was the practice with the option by sequence interaction, option-segregated families of curves were drawn, and these are included in Figures 13, 14, and 15. Results were not documented in detail for subtest IV, as the option by trial interaction failed, for this subtest, to reach significance, and an examination of the partialed mean error scores provided no more information concerning the operation of option and trial than had already been made available by considering the main effects for these variables.

Table 24. Subtest III: Cell and Marginal Means for Option X Trial

                         Option
Trial          1       2       3       4    Marginal
1            .088    .183   -.003   -.013    .076
2           -.092   -.064    .019    .000   -.023
3           -.002   -.129   -.087   -.039   -.054
4           -.083    .010    .027    .045    .000
Marginal     .055    .012   -.076    .048    .000

Note. Entries were partialed with regard to between-subjects and option variance.

Table 25. Subtest V: Cell and Marginal Means for Option X Trial

                         Option
Trial          1       2       3       4    Marginal
1            .090   -.094   -.035    .405    .060
2            .172     --     .221   -.161    .087
3           -.010    .231   -.103     --     .040
4           -.281    .029     --     .144    .036
5            .053    .180   -.085   -.372   -.055
6             --    -.220     --    -.087   -.153
7            .159   -.122     --     .063   -.045
8            .019    .034   -.196     --    -.084
9             --      --     .043     --     .043
Marginal     .093   -.054   -.031    .008    .000

Note. Entries are mean error scores with variance attributable to between-subjects differences or to option having been partialed.

Table 26. Subtest VI: Cell and Marginal Means for Option X Trial

                         Option
Trial          1       2       3       4    Marginal
1            .302    .022     --    -.071    .085
2             --     .014     --    -.032   -.002
3           -.095     --    -.025     --    -.015
4             --     .019     --    -.077   -.012
5           -.206     --     .160   -.086    .007
6            .120     --    -.132     --    -.048
7            .130   -.086     --     .357    .047
8           -.143     --     .053   -.093   -.061
Marginal     .063   -.029    .035   -.063    .000

Note. Entries are partialed mean error scores, with regard to between-subjects and option-attributable variance.
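One rough way to inspect the convergence in option difficulty that the interaction hypothesis anticipated is to compute, for each trial, the spread among the available option cell means. The sketch below does so for the subtest V entries of Table 25. The use of the range as the spread measure and the name `option_spread` are assumptions made for illustration; no such index was part of the original analysis, and empty cells are simply skipped.

```python
# Illustrative sketch: per-trial spread of partialed cell means, Table 25.
# The range is an assumed, informal index; None marks an empty cell.

table_25 = {
    1: [.090, -.094, -.035, .405],
    2: [.172, None, .221, -.161],
    3: [-.010, .231, -.103, None],
    4: [-.281, .029, None, .144],
    5: [.053, .180, -.085, -.372],
    6: [None, -.220, None, -.087],
    7: [.159, -.122, None, .063],
    8: [.019, .034, -.196, None],
    9: [None, None, .043, None],
}


def option_spread(cells):
    """Range of the non-empty option means; None if fewer than two cells."""
    present = [c for c in cells if c is not None]
    if len(present) < 2:
        return None
    return max(present) - min(present)


spreads = {trial: option_spread(cells) for trial, cells in table_25.items()}
for trial, spread in spreads.items():
    label = "n/a" if spread is None else f"{spread:.3f}"
    print(f"trial {trial}: spread across options = {label}")
```

Under the convergence hypothesis these spreads would be expected to shrink steadily with trial; because cells are unequally filled, any such reading is at best suggestive.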
[Figure 13. Subtest III: Option-Segregated Curves as a Function of Trial]