This is to certify that the dissertation entitled

DUAL CODING ITEM FORMATS FOR COMPUTERIZED ADAPTIVE TEST (CAT) ENVIRONMENTS

presented by Christine Bee Lan Chan has been accepted towards fulfillment of the requirements for the Doctoral degree in Educational Psychology.

Major Professor's Signature
July 8, 2005


DUAL CODING ITEM FORMATS FOR COMPUTERIZED ADAPTIVE TEST (CAT) ENVIRONMENTS

By

Christine Bee Lan Chan

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Educational Psychology and Special Education

2005

ABSTRACT

DUAL CODING ITEM FORMATS FOR COMPUTERIZED ADAPTIVE TEST (CAT) ENVIRONMENTS

By Christine Bee Lan Chan

Dual Coding Theory (Paivio, 1986, 1990, 1991) hypothesizes that information presented simultaneously in both text and visuals results in more efficient processing of that information. When Dual Coding item formats are used for computerized adaptive tests, they are hypothesized to be more efficient measures of a candidate's capabilities than traditional item formats. This hypothesis is suggested by research on Dual Coding theories of information (Paivio, 1986, 1990, 1991). Efficiency here means the accurate assessment of performance in the shortest time possible.
Test items from a past LSAT exam were used to develop two formats: (i) the DCT format, in which information was presented in paired visuals and text, and (ii) the LSAT format, a replication of the current pencil-and-paper exam. Participants of similar ability level were randomly selected and assigned to either the DCT group or the traditional LSAT group. Performance differences between the groups would thus indicate whether the formats differ in response time and proportion of correct responses. Results indicate that the DCT item format had a significant effect, with a higher median score of 5.75 compared to the traditional LSAT median of 4.75, which had a narrower range of 1.5 to 9.0. The data also showed differences in speededness among examinees: DCT participants had greater mean response times (MRT), with a slightly higher median of 80 seconds compared to 70 seconds in the LSAT group. MRT for the DCT group increased for later items, but with more accurate answers. These results support the Dual Coding hypothesis of the effectiveness of a visual-text presentation of information, as such presentations help preserve cognitive resources when higher-order complex tasks are engaged in immediate-delayed retention tests.

Copyright by
CHRISTINE BEE LAN CHAN
2005

ACKNOWLEDGMENTS

I would like to thank the following people: Brian M. Winn, for help on Director; Dr. Linda Chard, for the DIF and reliability analysis; the Episcopal-Anglican Chaplaincy at MSU; and all who have contributed to this study. I would especially like to thank my advisor, Dr. Mark D. Reckase, for his encouragement and astute guidance, pushing me toward the best of my ability. Without him, this would never have been possible. To my best friend Judith Brown-Clarke, Ph.D., for her support, encouragement and wisdom through difficult times; to my parents, who have sacrificed so much for me; and to God, for by His grace and strength, I can do all things.
TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF SAMPLES
KEY TO ABBREVIATIONS

INTRODUCTION
  STATEMENT OF THE PROBLEM
  MEMORY SYSTEMS: A BRIEF OVERVIEW
    The Psychometric Approach
    The Cognitive Approach
    Visual-Spatial Working Memory
  OBJECTIVES AND GOALS OF THIS STUDY

DUAL CODING THEORY (PAIVIO, 1986, 1990, 1991)
  THE BASIC PREMISE
  DEVELOPMENT OF DUAL CODING THEORY (DCT)
    The Conceptual Peg Hypothesis
    Imagery-Concreteness of Word-Picture Items
    Synchronous Organization
    Symmetry-Asymmetry of Associative Items
  OPPOSING THEORIES OF DUAL CODING THEORY
  CONCLUSION

FACTORS AFFECTING VISUAL SHORT TERM MEMORY
  COGNITIVE ANALYSIS OF VISUAL PROPERTIES
  VISUAL-SPATIAL LAYOUT
  TIME PREDICTION AND TASK COMPLETION PERCEPTIONS
    Experience-Rehearsed Sessions
    Nature of Task & Distractors
    Task Complexity & Duration
  TYPES OF MENTAL OPERATIONS
    Quantitative Reasoning
    Sentence Verification Task
    Maze, Copying and Object Manipulation Tasks
  INDIVIDUAL DIFFERENCES

VISUAL-TEXT ASSESSMENT FORMATS
  EARLY RESEARCH
  PAST RESEARCH
  CURRENT RESEARCH

THE LAW SCHOOL ADMISSIONS TEST (LSAT)
  ITEM TYPES-MEASURES
    Predictive Validity
    Analytical Reasoning (AR) Discrepant Subscores
  TIME SPEEDEDNESS

RESEARCH METHOD AND DESIGN
  PURPOSE OF THE STUDY
  PROCEDURE
    Definition of Terms
    Power Analysis
    Item Selection Process
    Kit of Factor-Reference Cognitive Tests
    Graphical User Interface Development (GUI)
    Participants
  TEST ADMINISTRATION

RESULTS AND DATA ANALYSIS
  ANALYSIS PROCEDURES
    Descriptive Statistics
    Differential Item Functioning
    Reliability of Tests
    Validity of Test Items
    Proportion of Correct Responses
    Average Response Times (RT) - Answers
    Time Correlations
    Multivariate Analysis of Variance (MANOVA)

SUMMARY AND DISCUSSION
  Item Location
  Response Times and Speededness
  Time Correlations

IMPLICATIONS AND FUTURE RESEARCH
  LIMITATIONS OF THE STUDY
  FUTURE RESEARCH

REFERENCES
APPENDICES

** NOTE: IMAGES IN THIS DISSERTATION ARE PRESENTED IN COLOR

LIST OF TABLES

Table 1. Immediate-Delayed Test Reliabilities for all Criterion Tests ..................... 32
Table 2.
Observed Correlations and Correlations After Correction For Attenuation: LSAT Section Scores, June 1991 Forms of the LSAT ..................... 37
Table 3. Descriptive Statistics For DCT and LSAT Formats ..................... 55
Table 4. DIF Indices: Traditional vs. DCT Format ..................... 56
Table 5. Reliability Estimates and Descriptive Statistics For All Items ..................... 58
Table 6. Correlation Coefficients for DCT and LSAT Correct Responses to Kit of Factor-Reference Test Items ..................... 59
Table 7. Mean Response Times to Proportion of Correct-Total Items Answered ..................... 61
Table 8. DCT Table of Theoretical & Empirical Assumptions ..................... 82
Table 9. Means & Standard Deviations on Verbal Test Forms ..................... 84
Table 10. Means & Standard Deviations on Visual Test Forms ..................... 84
Table 11. Summary Correlations Between and Among Predictor and Criterion Variables for Law Schools Participating in 1995-1996 Correlation Studies: Selected First-Year Student Results ..................... 85
Table 12. Incidence of Significant and Rare Differences for Each Pair of LSAT Subscores ..................... 86
Table 13. Incidence of Significant and Rare Differences for All Pairs of LSAT Subscores ..................... 86
Table 14. Correlation Coefficients of All Examinees for MRT (Mean Response Times) ..................... 93
Table 15. Multivariate Analysis of Variance (MANOVA) of Significant Items ..................... 95

LIST OF FIGURES

Figure 1. Stages of Information Processing (Norman, 1993) ..................... 19
Figure 2. Comparison of Two Solution Strategies in Terms of Their Speed-Accuracy Tradeoff Functions ..................... 40
Figure 3. Correct Responses to the Proportion of Items Answered for DCT & LSAT ..................... 60
Figure 4. Box Plots of Mean Response Times (MRT) of DCT and LSAT Groups ..................... 91
Figure 5. LSAT RT to the Proportion of Correct Responses ..................... 91
Figure 6. DCT RT to the Proportion of Correct Responses ..................... 92

LIST OF SAMPLES

Sample 1. Sample Format of McNeal's Verbal to Visual Test Comparisons ..................... 83
Sample 2. Kit of Factor-Reference Cognitive Test - Identical Pics Test ..................... 87
Sample 3. Kit of Factor-Reference Cognitive Test - Finding As Test ..................... 88
Sample 4. Sample Screen Shot of DCT & LSAT Item Formats Q1-Q5 ..................... 89
Sample 5. Sample Screen Shot of DCT & LSAT Item Formats Q11-Q17 ..................... 90
KEY TO ABBREVIATIONS

AI: Artificial Intelligence
AR: Analytical Reasoning
CAT: Computerized Adaptive Test
CBT: Computer Based Test
CDT: Computer Display Terminals
COGS: Council of Graduate Students
DCT: Dual Coding Theory
DIF: Differential Item Functioning
FYA: First Year Average Scores
GRE: Graduate Record Examination
GRE-A: Graduate Record Examination-Analytical Section
GRE-Q: Graduate Record Examination-Quantitative Section
GUI: Graphical User Interface
HCI: Human Computer Interaction
I: Item
ICC: Item Characteristic Curves
ID: Identification Number
IQ: Measure of Intelligence
LR: Logical Reasoning
LSAC: Law School Admissions Council
LSAT: Law School Admissions Test
MANOVA: Multivariate Analysis of Variance
MRT: Mean Response Time
Q: Question
R: Reliability
RC: Reading Comprehension
RT: Response Time
SAI: Signed Area Index
STM: Short-Term Memory
T: Time
TOEFL: Test of English as a Foreign Language
UEE: User-Experience Engineers
UGPA: Undergraduate Grade Point Average
UI: User Interface
VSTM: Visual Short-Term Memory
WAIS-R: Wechsler Adult Intelligence Scale-Revised
Z_SAI: Standardized Scores of Signed Area Index

INTRODUCTION

STATEMENT OF THE PROBLEM

Past Law School Admission Test (LSAT) research reports indicate that discrepant subscores, often very substantial ones, were frequently observed when comparing performance on the logical reasoning and analytical reasoning sections of the test. The analytical reasoning items are targeted to measure problem-solving abilities through the mental manipulation and organization of information. Similar phenomena have been reported for other tests, such as the Graduate Record Examination (GRE) (Bridgeman & Cline, 2000), the Wechsler Adult Intelligence Scale-Revised (WAIS-R) (Matarazzo, Daniel, Prifitera, & Herman, 1985), and other intelligence tests.
"The ubiquity of such discrepancies has led to the suggestion that differences in how people manifest intelligence are the norm rather than the exception" (Kaufman, 1990). Recognizing these individual differences, the objective of this study is to arrive at an understanding of information processing: its organization, filtering and retrieval; and to determine whether Dual Coding Theory (DCT) test formats, defined as the presentation of information in both text and visuals simultaneously, are a more effective and truer measure of problem-solving capabilities.

In its most general assumption, Dual Coding Theory (DCT) views cognition of visual or nonverbal information as an activity involving two specific symbolic representational systems: one responsible for the processing of images or visual objects, and the other for the processing of text (Paivio, 1990). According to Paivio's Dual Coding Theory of information processing (Paivio, 1986), cognitive efficiency in recall, comprehension, cognitive operations such as problem solving, and concept learning increases when information is presented simultaneously in both visual and textual form (Guilford, 1967; Paivio, 1986). However, it is not merely the act of inserting visuals at random that produces this effect, but rather the development and layout of visuals tailored to specific ergonomic guidelines.

With the advent of computer technology, the presentation of information on computer screens has become a challenging task. Displays of information confined to the parameters of a digital monitor "vary in their effect on the problem solvers' information processing activities and problem solving performance" (Woods, 1991, p. 171). It is from this ergonomic premise that the field of Human-Computer Interaction (HCI) is born, a field fundamentally interdisciplinary in nature (Olson & Olson, 2003), drawing its foundations from cognitive psychology, ergonomics, and computer science.
In-depth research in these applied domains is thus necessary to better understand human cognitive, perceptual, and physical processes during interaction with computers. In educational assessment and testing, there is a growing trend toward the use of computers for test construction, delivery and administration, in addition to tasks such as scoring, analyzing and reporting test results. This flexibility has resulted in the creation of new and innovative item types geared toward a more performance-based type of assessment. These items, including formats such as interactive video, audio, and vignettes, to name a few, are able to assess "cognitive skills that can be difficult to fully tap using traditional paper-and-pencil test formats" (Zenisky & Sireci, 2002, p. 338). Along lines similar to Paivio's Dual Coding Theory (DCT) of information, Zenisky and Sireci (2002) reiterate the importance of the format of response presentation as a critical component in the design of computerized adaptive test (CAT) item types. Harmes (1999) extends this argument by evaluating current test question types, particularly multiple-choice questions, which she believes do not provide an accurate assessment of higher-order cognitive skills because they are limited to a single fixed response. Innovative forms of performance-based assessment have since emerged, taking into account individual differences among candidates (Frederiksen & Ward, 1978; Haladyna, 1997). However, their construction, administration and delivery are no easy task. To date, the most practical form of assessment that closely approximates authentic measurement of these skills is the creation of innovative items themselves, defined by Harmes as "...new and better forms of assessment that incorporate features and functions not possible with conventional test administration" (Parshall et al., 1999, p. 1).
With all these emerging performance-based assessment formats being created and researched, Allan Paivio's (1981) DCT brings us back to the foundations of human cognitive processing of information, for an in-depth look at how different units of information representation are evaluated and processed. Thus, from a cognitive-psychological standpoint, before performance can even be evaluated and assessed, it is crucial to understand how specific units of information are processed. Within this seemingly simple theory lies an almost thirty-year study of the intricacies and complexities of the cognitive processing of visual and semantic imagery. To fully grasp its foundations and core elements, it is crucial to first have a broad understanding of memory systems and of how the cognitive processing of information affects human performance in retention, retrieval and transfer from one task to the next.

MEMORY SYSTEMS: A BRIEF OVERVIEW

One of the core areas in cognitive psychology is the study of memory, or memory systems. Past studies in the cognitive neuroscience of the working memory system have yielded two opposing theories: (1) one labeled the psychometric approach, in which working memory is regarded as a single unitary system; and (2) the other labeled the cognitive approach, in which working memory is regarded as comprising two or more subsystems. These subsystems include: (i) the central executive, the main controlling component; (ii) the visuospatial Sketchpad, which manipulates images; and (iii) the phonological loop, responsible for verbal information (Baddeley, 1992).

The Psychometric Approach

The premise of this approach, which has taken root most strongly in North America, "focuses on the extent to which performance on working memory tasks can predict individual differences in the relevant cognitive skills" (Baddeley, 1992, p. 556).
The essence of this approach is to develop tasks that require the combined storage and manipulation of information, and to correlate performance on these tasks with performance on practically and theoretically important cognitive skills. These tasks were devised specifically to measure reading comprehension and reasoning and their impact on working memory, in order to predict individual differences (Carpenter, Just, & Shell, 1990; Daneman & Carpenter, 1980; Kyllonen & Christal, 1990). The advantage of this approach is its focus on the central executive system, which is crucial to problems relating to reading comprehension or reasoning. Some criticize this approach, however, because of its reliance on complex memory tasks that may be arbitrary in construction, as they "do not readily lend themselves to a more detailed analysis of the memory component process" (Baddeley, 1992, p. 557).

The Cognitive Approach

The cognitive approach, which is the premise of Baddeley's theory of processing information via dual modalities, proposes a tripartite system comprising a central executive that controls two subsystems: (i) the articulatory or phonological loop, and (ii) the visuo-spatial Sketchpad. The phonological loop is assumed to be responsible for maintaining speech or verbal information, while the Sketchpad sets up and manipulates visual imagery. The use of this dual-task model to analyze the structure of the working memory system centers on these two sub-systems because researchers believe that more tasks with tractable problems involve them. As such, concurrent storage and processing of information is not the only aspect of working memory; what is crucial is the coordination of these resources (Barnard, 1986; Schneider & Detweiler, 1987). It is from this premise that Paivio's (1986, 1990, 1991) Dual Coding Theory (DCT) is founded.
In order to have a better understanding of Dual Coding Theory (DCT), a basic overview should be given of the visuo-spatial working memory component and of the factors that affect visual short-term memory (VSTM), such as type of task and time constraints.

Visual-Spatial Working Memory

Much of short-term memory (STM) research has focused on the phonological loop rather than on the visuo-spatial Sketchpad. Studies of visual memory date back to the 19th century, beginning with Sir Francis Galton (1885).

Visual Memory Capacity

Similar to the working memory capacity for verbal information, the capacity of VSTM is severely limited. Evidence of this has been documented in past studies in which individuals found it difficult to integrate information gathered from successive fixations on spatial-based coordinates. This suggests that very little information can be retained from previous fixations (Irwin, 1991; Irwin, Brown, & Sun, 1988; Irwin, Yantis, & Jonides, 1983; Rayner & Pollatsek, 1983), and that capacity is very poor for unattended information in scene perception and in social interactions (Levin & Simons, 1996; Rensink, O'Regan, & Clark, 1997; Simons & Levin, 1998). Capacity for visual memory is said to be approximately four items, but is contingent on the type of stimuli being processed (Luck & Vogel, 1997; Pashler, 1988; Phillips, 1974; Simons, 1996). Capacities in visual memory vary for different types of visuals. For letters or simple features, memory capacity is approximately four to five items (Luck & Vogel, 1997; Pashler, 1988). In contrast, memory capacity for spatial locations is far more variable, ranging from eight to thirty-two locations; recall is near perfect for five locations, but drops when more locations are added.
This occurs when only VSTM is used. When visual sensory memory, or iconic memory (the recognition of physical visual features such as color and shape, rather than semantic content), is used in addition to VSTM, capacity increases but with shorter durability (Neisser, 1967; Phillips, 1974; Simons, 1996; Sperling, 1960). The resolution, or detail, of physical attributes is also poorly retained (Intraub, 1997; Nickerson, 1965). The positioning of visuals, known as the visual recency effect, also has an impact on memory capacity and retrieval (Broadbent & Broadbent, 1981; Phillips & Christie, 1977a). This effect is contingent on variables such as the presence of secondary tasks or interpolation, the degree of difficulty of those tasks, the presence of target probes or cues, the lag time since previous visual information processing, and previous exposure to similar visuals. For example, researchers discovered that capacity is impaired after a delay when subjects are given a demanding task such as mathematical calculations (Phillips & Christie, 1977a, 1977b). This is not the case, however, if the demanding task occurs concurrently with the visual information processing. This suggests that visual short-term memory (VSTM) organization is based on: (i) a spatial configuration of the target, (ii) its relationship to the surrounding items in the display, and (iii) two sub-systems of memory (Doost & Turvey, 1971).

Visual-Spatial Organization

The processing and organization of visual information involves three integrating variables: (1) processing at the feature level, (2) processing at the representation or semantic level, and (3) processing at the space or location level. Accordingly, investigations of the unit of VSTM representation have shown that capacity can be enlarged by grouping visuals with similar features, meanings, functions, and other shared attributes into a single object (Chun & Jiang, 1998).
Thus, units of visual information are held both independently and in relation to each other. Numerous studies have found that the organization of VSTM based on spatial configurations occurs hierarchically, at multiple levels (Jiang, Olson, & Chun, 2000; Luck & Vogel, 1997). This concept is similar to relating words to the context of a passage in deriving its meaning, and to the top-down and bottom-up processing of words. The formation of spatial configurations is rapid (Chun & Jiang, 1998) and can be learned within five to ten repetitions. This is evident in graphical plotting, cartography, and the study of geographical maps, where the locations of cities, states and other landmarks are easily remembered and retrieved even over long lag times. These spatial configurations serve as a guide to contextual information in visual search tasks, counting, tracking, and book-marking; thus the individual does not need to rely on visual memory resources alone. In addition, the visuals need not be detailed or concrete; arbitrary visuals suffice for configural organization to occur. This is why the instructions in almost all analytical or logical test sections call for examinees to draw diagrams to help in selecting the correct response.

Visual Trace & Task-Interaction

Early studies of visual attention give evidence of the persistence of images in working memory in terms of size and visual trace (Baddeley, 1992). This persistence of visuals is based on associations between stored visual representations of objects and their semantic meanings (Humphreys, Riddoch, & Quinlan, 1988; Riddoch & Humphreys, 1987a, 1987b; Seymour, 1979), an explanation held by Paivio (1986, 1990, 1991). Other researchers attribute this dual-task capability to tasks that do not vie for similar cognitive resources (Marschark & Cornoldi, 1991; Marschark, Warner, Thomson, & Huffman, 1991).
As such, Paivio's (1991) extensive work on Dual Coding Theory gives an in-depth look at dual-task methodology, which will be further discussed in this study. The components of VSTM and the many variables affecting visual information processing have led other researchers to build on a modality model. Two examples are Sweller's (1976) Cognitive Load Theory, concerned with techniques for reducing working memory load to facilitate the changes in long-term memory associated with schema acquisition, and Anderson's (1981) Triple Coding System, a propositional theory of memory recognition. This study will investigate the effectiveness of Paivio's (1991) Dual Coding Theory (DCT) in the testing and assessment domains, testing the following hypotheses.

OBJECTIVES AND GOALS OF THIS STUDY

The overall hypothesis of this study is that when Dual Coding Theory (DCT) item formats are used for computerized adaptive tests (CAT), they will yield more efficient measures of a candidate's capabilities than traditional item formats, because they take advantage of the results of research on Dual Coding theories of information (Baddeley, 1997; Paivio, 1986; Sweller & Cooper, 1985). By efficient, we mean an accurate assessment of a candidate's performance that can be obtained in the shortest time possible. The objective of increasing the number of correct responses is to increase the accuracy of estimation of a person's capabilities; a decrease in the amount of time taken is fruitless, however, if the correctness of responses decreases as well. As such, the specific objectives of the study are to determine whether the use of Dual Coding Theory (DCT): (i) decreases the response time between the presentation of each item and the response to it (known as the response latency) while preserving correct responses, and (ii) decreases the average length of time for the number of correct responses relative to the proportion of the total number of items answered.
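The two efficiency criteria just stated can be made concrete with a small computational sketch. This is purely illustrative and not taken from the study's data or software: the function names, the per-examinee log structure (a list of response-time and correctness pairs), and the sample numbers are all hypothetical.

```python
# Illustrative sketch (not from the dissertation) of the two efficiency
# criteria above. Each examinee log is assumed to be a list of
# (response_time_seconds, answered_correctly) pairs; names and numbers
# are hypothetical.

def mean_response_latency(log):
    """Mean presentation-to-response time over correctly answered items."""
    times = [t for t, correct in log if correct]
    return sum(times) / len(times) if times else None

def time_per_correct_vs_answered(log, total_items):
    """Average time per correct response, and the proportion of items answered."""
    total_time = sum(t for t, _ in log)
    n_correct = sum(1 for _, correct in log if correct)
    avg = total_time / n_correct if n_correct else None
    return avg, len(log) / total_items

# A hypothetical examinee who answered 4 of 5 items:
log = [(70, True), (85, False), (60, True), (90, True)]
print(mean_response_latency(log))
print(time_per_correct_vs_answered(log, total_items=5))
```

Under these criteria, a format is "more efficient" when the first quantity falls without the correct-response count falling, and the second quantity falls relative to the proportion of items answered.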
In the following chapters, a more detailed account of Dual Coding Theory (DCT), its challenges and its current directions will be given. Variables that may affect authentic cognitive measures, such as types of mental operations, time perception, and task distractors, to name a few, will also be investigated. Research, though limited, on past investigations of testing formats that have utilized paired visuals and text in their test design and item formats will be reviewed. The analytical reasoning (AR) items of the Law School Admissions Test (LSAT) have been selected as the experimental assessment; accordingly, an investigation into the test's internal structure, validity, reliability and time speededness will be conducted. The research and methods section includes a description of the test administration and experimental procedures, with selected screenshots, to provide a clearer understanding of the processes involved in graphical user interface (GUI) development. Analysis of the results will be discussed, and future directions proposed.

DUAL CODING THEORY (PAIVIO, 1986, 1990, 1991)

THE BASIC PREMISE

The focus of DCT is the study of imagery and its functions, initiated by studies of individual differences in the vividness of imagery by Galton (1880). The Dual Coding approach hypothesizes that imagery can be objectively measured by procedures and is systematically related to performance in memory and other tasks. These independent imagery variables include: (a) image-invoking cues, such as the use of visuals as a stimulus to generate specific words, (b) procedures used to distract from or enhance the use of imagery, and (c) individual differences in the use of imagery (Paivio, 1991).
According to Paivio (1986), "human cognition is unique, in that it has become specialized for dealing simultaneously with language and with nonverbal objects and events." Moreover, the language system is peculiar in that it deals directly with input and output (in the form of speech and writing), with representational units for verbal entities known as 'logogens' and representational units for mental images known as 'imagens'. In addition, these units serve symbolic functions with respect to nonverbal objects, events, and behaviors. As such, any representational theory must take this dual functionality into account. Paivio's theory postulates that there are two sub-systems in the visuo-spatial Sketchpad, one for processing visual semantic information such as text, and the other for processing images such as objects. Three types of processing occur within these sub-systems: (1) representational, the direct processing of text or visuals; (2) referential, the activation of the verbal system (logogens) by the non-verbal system (imagens) and vice versa; and (3) associative, the activation of representations within the same system. When visual images are presented together with text, they serve two purposes: (i) to complement the text, to arrive at a better and more accurate understanding of what is being conveyed, and (ii) to alleviate the cognitive load of reading text. Past experiments have given evidence of the visuo-spatial working memory being engaged in the 'perceptive analyses' of illustrations together with the relational text (Gyselinck, Cornoldi, Dubois, DeBeni & Ehrlich, 2002). As such, when "illustrations are presented with text, the visuo-spatial working memory would be more involved, both in basic operations matching text and illustrations, and in the formation and storage of the visual traces, before the integration of the two types of information" (Gyselinck, Cornoldi, Dubois, DeBeni & Ehrlich, 2002, p. 682).
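The three types of processing described above can be sketched, purely as an illustrative aid and not as anything from the dissertation, with two tiny lookup tables standing in for the verbal (logogen) and non-verbal (imagen) systems. All unit names and links here are invented.

```python
# Toy model of DCT's two representational systems; entries are invented.
# Representational processing would be the direct mapping from a perceived
# word or picture onto a unit such as "dog" or "dog_image" in the first place.
logogens = {"dog": {"ref": "dog_image", "assoc": ["bark", "pet"]}}
imagens = {"dog_image": {"ref": "dog", "assoc": ["cat_image"]}}

def referential(unit, within, other):
    """Referential processing: cross-system activation via a logogen<->imagen link."""
    entry = within.get(unit)
    return entry["ref"] if entry and entry["ref"] in other else None

def associative(unit, within):
    """Associative processing: spreading activation within the same system."""
    entry = within.get(unit)
    return entry["assoc"] if entry else []

print(referential("dog", logogens, imagens))  # the imagen paired with the word
print(associative("dog", logogens))           # within-system verbal associates
```

The point of the sketch is only the distinction it encodes: referential processing crosses between the two tables, while associative processing stays inside one of them.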
DEVELOPMENT OF DUAL CODING THEORY (DCT)

The Conceptual-Peg Hypothesis

Research that led to the DCT was motivated by verbal associations via rhyming mnemonic techniques in human learning and thought (Noble, 1952). Mnemonic techniques explicitly require dual coding in that non-verbal images are initially generated from words during list learning, then generated from verbal cues during recall, and finally decoded back into words. For example, in the recall of a list of, say, twenty-four items, a mnemonic scheme of words that rhyme with numbers - one-run, two-shoe, three-tree, and so on - is used. The word ‘run’ elicits a mental picture of someone running, the second a pair of shoes, and so on. The technique implicates the following DCT processes: (i) verbal and imaginal referencing, (ii) verbal associations based on rhyming schemes, and (iii) imagery organization and integration. In short, investigations into DCT give evidence that: (i) imagery benefits associative learning through integrative processes of images and text, and (ii) recall of information occurs on a sliding scale from pictures to concrete words to abstract words.

Imagery-Concreteness of Word-Picture Items

DCT “distinguishes between nonverbal imagery and verbal symbolic processes, which involve independent but partially interconnected systems for encoding, storage, organization, and retrieval of stimulus information” (Csapo, 1991, p. 76). Efficient encoding of ‘logogens’ and ‘imagens’ is dependent on the type of words used, concrete or abstract, and the degree of similarity or relation between words and visuals. In short, DCT suggests that a mixture of integrative and independent encoding of both images and text serves for better recall. This is because a simultaneous text and visual presentation of information is encoded both as images and as verbal traces (Csapo, 1991).
Results from past studies indicate that recall depended on the concreteness of the retrieval cue (Paivio, 1971; Yarmey & O’Neill, 1969). For example, it would be easier to form a visual representation of a concrete word such as tiger than of an abstract word such as discipline. Concreteness is not merely limited to single-word nouns, verbs, or adjectives. Begg (1972) discovered that using concrete phrases (e.g. the white horse) versus abstract phrases (e.g. basic truth) increased the capacity for free recall. He attributed this to the integration of imaginal memory traces that are redintegrated to higher-imagery words. Other theories have challenged the concreteness hypothesis; however, to date, evidence supporting “the Dual Coding Theory view of concrete material being better recalled due to additive effects of independent verbal and nonverbal (imagery) codes when all else is equal” (Sadoski, Goetz & Avila, 1995) still holds.

Synchronous Organization

A major hypothesis of DCT is the integration of text and visual information presented simultaneously versus sequentially. This does not mean that all information is simultaneously processed, but that information is ‘available’ for processing simultaneously as needed (Paivio, 1986). In keeping with the associative elements between text and visuals, it is crucial that the pairs are presented as unitized compounds, i.e. having associative semantic meaning (Davidson & Adams, 1970; Epstein, Rock & Zuckerman, 1960; Reese, 1972; Rohwer, Lynch, Suzuki & Levin, 1967). There exist functional criteria for synchronous organization of information. These include the following: (i) memory for spatial relations, (ii) simultaneous availability of grouped component information, (iii) freedom from sequential constraints, and (iv) redintegration effects, when a component or element of a unit of information is used as a retrieval cue for the entire previous occurrence.
These effects, according to Paivio (1991), are “the occurrence of an idea which is simultaneously accompanied by other ideas that are derived from perceptual experiences in which the component elements occurred together” (p. 68).

Symmetrical-Asymmetrical Associative Items

Another important variable in the recall-retrieval of information is the issue of symmetrical properties among paired items. Forward and backward recall of items is dependent on the degree of concreteness of the items themselves. Smythe (1970) concluded that “picture and concrete noun pairs resulted in symmetrical forward and backward recall, whereas abstract noun pairs generally showed higher forward than backward recall” (Paivio, 1991, p. 67) (cf. Yarmey & O’Neill, 1969). His experiment also measured the latency of correct responses and discovered that for concrete noun-paired words or picture pairs, both forward and backward recall was equal. These findings support the DCT concept that recall of pictures and concrete noun pairs is mediated by visuals containing synchronously organized information and can be processed without sequential constraints. Sequential constraints are typical of verbal representations that cannot be easily and readily recoded into images. Theories opposing DCT are discussed in the following section.

OPPOSING THEORIES OF DUAL CODING THEORY (DCT)

The propositional theory of recognition memory holds that visual information is transformed into semantic form for storage in long-term memory (LTM). Although the propositional theory acknowledges the existence of visual processing in visual short-term memory (VSTM) or short-term memory (STM), it disputes the superiority of images over words. Some theories suggest that the superiority of images is due to people “process[ing] and rehears[ing] pictures more fully than words and sentences [which] results in more propositional information [. . .]
when visual representations are provided than when information is given only in verbal form” (Rieber, 1994, p. 114). Further studies, in particular by proponents of Artificial Intelligence (AI), have demonstrated that visuals are remembered by their meaning rather than their physical visual features. This unitary view of pictures and words implies that both text and images are stored in the same way, and that there is no difference in the storage of verbal and visual information. Many researchers in Artificial Intelligence (AI) hold this amodal theory of the abstract representation of knowledge (Driscoll, 1994; Molitor et al., 1989). Another argument regarding the superiority of visuals attributes differences in information processing to age differences. Simpson (1995) believes that age differences play a vital role in choosing specific modalities to use when processing various forms of information. He argues that younger individuals process information more in the visual modality, whereas older individuals favor the text-semantic mode. This could be attributed to the larger vocabulary of older individuals, who have built a broader word base. Other views opposing Paivio’s (1991) Dual Coding Theory have also emerged. The computational theory known as ‘connectionism’ (Potter & Faulconer, 1975; Seymour, 1973; Snodgrass, 1984; Theios & Amrhein, 1989), though gaining prominence in cognitive psychology and cognitive science, is only beginning to be applied to visual representation and imagery problems; its potential for handling a wider range of such phenomena still remains to be demonstrated. Others include implicit and explicit memory effects versus visual superiority (Mel, 1986; Roediger & Weldon, 1987; Weldon & Roediger, 1987) and relational and distinctive processing (Marschark & Hunt, 1989).
CONCLUSION

Though there have been theories that argue against Paivio’s (1986, 1991) theory of information processing, Dual Coding Theory presents a model that is conducive to the assessment and testing arena. These domains require the presentation and/or creation of visual-text representations, as they may be a more accurate and effective measure of human problem-solving capabilities. This is based on further studies conducted as an extension of the DCT by Marschark and Paivio (1977) to determine the superiority of visuals in recall tasks. Results from their studies indicate the following: (a) imagery was reported much more often than verbal strategies, (b) verbal strategies were predominant for abstract items, and (c) frequency of images correlated positively and significantly with free and cued recall of both concrete and abstract items (Paivio, 1991). A complete summary of theoretical and empirical assumptions and phenomenal domains of DCT is attached in the appendix (Appendix A). Though other views have offered alternative hypotheses to those of DCT proponents, their results have not held up across a variety of experimental conditions. Research findings from DCT studies and its proponents have proven to be the most credible under a variety of circumstances thus far. It is critical to note, however, that information processing of paired visuals and text may be affected by specific variables that may impede or enhance efficient processing. These variables are discussed in the following chapter.

FACTORS AFFECTING VISUAL SHORT-TERM MEMORY

COGNITIVE ANALYSIS OF VISUAL PROPERTIES

Diagrammatic properties and formats are important variables to consider in selecting the appropriate visual that matches the specific cognitive process. Diagrams are not a homogeneous class of representations, but have a variety of formats and uses.
As such, diagram features must be considered in relation to their objective goal and what they intend to represent. Classifications of images and graphs are categorized as either structural or functional. Functional categories focus on the intended use and purpose of these diagrams, such as ‘how-to’ manuals, whilst structural classifications focus on the representation or form of the image rather than its content, such as bar charts and pie charts. A set of functional roles has been identified from numerous research studies to serve as a framework for diagrammatic selection, as follows (Cheng, 1996).

1. Spatial Structure and Organization - Diagrams that depict spatial features and arrangements of their components are crucial in maintaining what in HCI is known as ‘white space’. This facilitates accurate discrimination among grouped visuals and text, and prevents the overlapping and misrepresentation of information. An example is the display of visuals on a menu bar as icons with clickable functions. They are separated from other visuals on the screen by ‘white space’ to allow for this discrimination.

2. Capturing Physical Relations - Diagrams are used at times to highlight specific physical relations that are important to the specific task; for example, an illustration of an electrical circuit would demonstrate the inter-connectivity and sequence of components.

3. Physical Assembly - Some diagrams illustrate how something is physically assembled from various components. These are similar to blueprints in engineering and architecture.

4. Identifying Variables, Terms and Components - Diagrams at times are used to define and identify specific components, variables, and features, such as the specific symbols used in electrical circuit diagrams for components like circuit breakers.

5. Displaying Values, States etc. - Diagrams are often used to represent quantitative data in the form of bar graphs, charts, etc. Some depict states or conditions, such as weather conditions.
6. Capturing Laws and Theories - Some diagrams embody theoretical laws and theorems, such as those of geometry and topography, within their structure; an example is the Item Characteristic Curve (ICC) in psychometrics.

7. Flows, Sequences and Processes - Diagrams are used to represent simple and complex flows of processes, both linear and non-linear, such as loops, cycles, and sequence stages.

The mode of displaying visuals is also crucial to their synthesis, processing, and understanding. Visuals displayed on computer screens are affected by different variables than visuals displayed on paper. In addition, even the screen size and resolution of Computer Display Terminals (CDTs) affect how visuals are processed. McCormick et al. (1987) define visualization on computers as “the study of mechanisms in computers and in humans which allow them in concert to perceive, use, and communicate visual information” (Lohse, Biolsi, Walker & Rueter, 1994, p. 36). Lohse et al. (1994) conducted a research study to investigate the organization and visualization of images among individuals based on three specific tasks: naming, rating, and sorting of visuals. The rating scale was scored on a 10-point scale of anchor-point phrases. In this study, visual displays are used as “data structures for expressing knowledge, which help facilitate problem solving and discovery by providing an efficient structure for expressing the data” (Larkin & Simon, 1987; Lohse, Biolsi, Walker & Rueter, 1994, p. 37; Rumelhart & Norman, 1988). Results of the study indicate a classification of approximately eleven visual types of representations, but with apparent inconsistencies. According to Lohse et al. (1994), these apparent inconsistencies are contingent on how well the graphic is represented, the type of task, and the display acreage. Computer Display Terminals (CDTs) are limited in their available acreage for displaying information.
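The Item Characteristic Curve mentioned above is worth making concrete: in item response theory, it plots the probability of a correct response against examinee ability. A minimal sketch of the widely used three-parameter logistic form follows; the parameter values are illustrative assumptions, not values estimated in this study:

```python
import math

def icc_3pl(theta, a=1.0, b=0.0, c=0.2):
    """Three-parameter logistic Item Characteristic Curve.

    theta: examinee ability; a: item discrimination;
    b: item difficulty; c: lower asymptote (guessing parameter).
    Returns the probability of a correct response.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# The curve rises monotonically from c toward 1 as ability increases.
low, middle, high = icc_3pl(-2.0), icc_3pl(0.0), icc_3pl(2.0)
```

With these illustrative parameters, an examinee whose ability matches the item difficulty (theta = b) answers correctly with probability c + (1 - c)/2 = 0.6; it is this lawful shape that makes the ICC a diagram embodying a theory rather than merely displaying data.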
Though computers are able to afford an environment for the dynamic display of information and interactive exchanges, the shortage of display space will affect the visual-spatial layout of information, which will undoubtedly affect how efficiently information is processed. As such, the layout of visuals within the specific parameters of Computer Display Terminals (CDTs) is crucial.

VISUAL-SPATIAL LAYOUT

According to Norman (1993), “solving a problem simply means representing it so as to make the solution transparent” (p. 53). His theory postulates that the degree of difficulty of a task is dependent on the presentation format of the problem. Information presented in technological environments could either change what may be a relatively simple task into a challenging one, or aid the individual engaged in a complex task by providing guides toward the correct and desired solution. Norman (1993) lists three stages of information processing and retrieval from User Interface (UI) designs on computer screens:

Organization of Information → Search of Information → Computation of Information

Figure 7. Stages of Information Processing (Norman, 1993)

This theory of task functionality is founded on a user-centered design philosophy in the creation of digital interfaces for information presentation. The theory follows two specific principles as guidelines for effective UI displays: the naturalness principle, the design of representations whose properties match the properties of everyday things (Norman, 1986), and the perceptual principle, the design of perceptual and spatial representations only if the representation and what it stands for is natural. Norman (1993) emphasizes two key interactive components that must always be at the forefront of any principle or concept, i.e. design to fit the person and the task at hand.
Since then, there has been a tremendous amount of research into task performance in complex interactive systems that employs theories from HCI and cognitive neuroscience. The selection of visuals, item types, and formats must also take into account the time constraints placed on a high-stakes assessment test, what in CAT environments is known as time speededness. Time perception and its effect on performance are discussed in the next section.

TIME PREDICTION AND TASK COMPLETION PERCEPTIONS

Time differs from most other dimensions of the environment in that there is no specific sensory organ for its perception (Repp & Penel, 2002). Past research has discovered that even experts at specific tasks make over-optimistic time predictions, despite having experienced similar tasks taking longer to complete than anticipated. This is known as the ‘planning fallacy’ (Kahneman & Tversky, 1979). The theory states that people tend to focus on the current task at hand during the planning stages rather than reflecting on the time taken to complete past similar tasks. From an HCI point of view, time perception is seen as a combination of man and machine interaction (Decortis, Keyser, Cacciabue, & Volta, 1991). The concept of time is broken down into six specific sections, each with sub-sections of variables influencing perception. They include:

1. Temporal structures of man-machine interaction.
2. Attributes of the structure and its relation to events in terms of sequence, nature, etc.
3. Key functions of controls in the system.
4. Adequate tuning of the operator to the system in order to arrive at a comfort zone of optimal performance.
5. Temporal errors.
6. Varying time perceptions from one operator to the next.

To fully understand the impact of time on project task completion and actual performance, it is important to know how specific lengths of time are allocated to tasks or particular series of tasks.
Various factors impact time prediction of task completion. They include: (1) previous experience of the task, (2) the structural and sequential nature of the task and distractors, and (3) the cognitive complexity and duration of the task (Thomas, Newstead & Handley, 2003).

Experience-Rehearsed Sessions

In their experiment, Thomas et al. (2003) found task experience to be an important determinant of the time prediction process. Participants used their initial experience with a task as an anchor for adjusting to the next similar task. Their study also highlighted the importance of task experience to prediction accuracy, but only contingent on temporal distance or lag period, i.e. the amount of time gone by between the first and second sessions of two similar tasks, and the time between previous and current non-similar tasks. In addition, time perceptions are not constant across a single task duration. Time experience and its impact in HCI have also shown similar effects on perception (Decortis, Keyser, Cacciabue, & Volta, 1991). Specific variables that need to be investigated include:

1. Time required for each action of the GUI, i.e. mouse clicks, mouse movement, and keyboard input.
2. Time in deciding which function is to be selected for specific purposes, taken together with the first variable and categorized as an explicit variable.
3. Transition time from one functional mechanism to another between subject and operator, classified here as an implicit variable.

Nature of Task & Distractors

“Time judgment performance may display a progressive deterioration as greater amounts of attention resources are diverted away from the timed task” (Brown & Boltz, 2002, p. 601). This is dependent on the nature of distractors during the task, their duration, the structure of the task, and mental workload, which requires more memory storage.
Brown and Boltz (2002) discovered that these variables had a definite effect on time judgment, independently as well as interactively. Results from their study give evidence of the following:

1. Mental Workload - Errors in timed judgments were more common in dual-task (timing plus target detection) than in single-task (timing only) conditions. With dual tasks, attentional resources are utilized more than during a single task. Hence DCT serves as a perfect model to preserve these resources.

2. Event Structure - Events that were inconsistent and disorganized produced more errors in judgment.

3. Duration - Based on Vierordt’s Law (1868), shorter intervals resulted in overestimation of time judgments versus underestimation for longer intervals (Bobko et al., 1977; Stevens & Greenbaum, 1966).

Task Complexity & Duration

The type of task and its duration are important determinants of performance, retrieval, and storage because of the limited capacity of short-term memory (STM) and visual short-term memory (VSTM) in particular. Types of mental tasks will be discussed and elaborated in the following section. Time estimation is important from both an artificial intelligence (AI) and a cognitive psychology perspective. In HCI, this allows one to predict temporal errors and improve functionality. In standardized tests, the time limits that have been imposed serve two specific functions. First, time is considered an inherent part of the construct, as it “reflects intellectual power primarily, rather than the rate at which examinees work” (Bridgeman, Cline, Hessinger, 2003). Second, it serves as a standardized measure for all examinees, i.e. the test being administered in the same way. The DCT model offers an efficient solution by affording a decrease in cognitive workload, such that unrealistic predictions of time on task and error-free estimation will be accommodated.
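The over- and underestimation pattern of Vierordt's Law described above can be sketched as a simple central-tendency model, in which judged durations regress toward an intermediate 'indifference' interval. The functional form and parameter values below are illustrative assumptions, not estimates from the cited studies:

```python
def judged_duration(actual, indifference=3.0, slope=0.7):
    """Central-tendency sketch of Vierordt's Law.

    Judged durations regress toward an 'indifference point':
    a slope below 1 makes short intervals overestimated and
    long intervals underestimated. Values here are illustrative.
    """
    return indifference + slope * (actual - indifference)

short_judged = judged_duration(1.0)   # above 1.0 s: overestimated
long_judged = judged_duration(10.0)   # below 10.0 s: underestimated
```

At the indifference point itself (here, 3 seconds) the judged and actual durations coincide, which is the crossover between over- and underestimation that the law describes.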
TYPES OF MENTAL OPERATIONS

Generally, human problem solving can be broken down into three distinct dimensions: (i) the task dimension, one’s interaction with the environment, (ii) the performance-learning-development dimension, differences among individuals performing a task, learning to do a task, and developing a task, and (iii) the individual-difference dimension, the variety of systematic ways each person arrives at the target solution. Thus, efficiency in problem-solving capacity and ability is contingent on an individual’s internal representation of the problem itself. Factors such as: (i) the nature of the problem, (ii) individual perception of the problem, (iii) matching the problem with past knowledge to arrive at a solution, and (iv) exploratory avenues of solutions not yet discovered, all exert a great deal of influence on problem-solving capacity and ability (Eisenstadt & Kareev, 1975). Though many studies have confirmed the multi-component structure of working memory (Baddeley, 1992), the advantages of the structure are contingent on the type of task required during the processing of information. The degree of involvement of the two subsystems is thus contingent upon the information format and layout of presentation, and the nature of the task to be measured.

Quantitative Reasoning

Hitch (1978) and Hayes (1973) suggested that in mental computations, similar to those written on scratch paper, mental notations such as intermediate sums or carries are held in the visuo-spatial sketchpad as imagery. Other investigations, however, attribute mental arithmetic tasks to a phonological coding process that retains operands in the phonological loop (Furst & Hitch, 2000; Heathcote, 1994). When mental computations were presented in a vertical format, participants responded more rapidly than when the problems were presented horizontally.
However, as the task load increased and became harder, the differences in performance between items presented vertically and those presented horizontally were smaller. “These tradeoffs suggest differential involvement of phonological and visual working memory as a function of problem format” (p. 742). It is therefore important to remember that the first step in any mathematical problem is to understand the representation, and then to evaluate the cognitive load of the arithmetic problem when analyzing performance. The degree of cognitive load, in turn, affects attentional and time-prediction resources and the management of cognitive goals (Trbovich & LeFevre, 2003).

Sentence Verification Tasks

Problem representations that require an analysis of sentences or words, such as naming, involve two additional processes: determining the meaning of the picture, and finding a name for it, which involves matching and filtering of semantic or verbal codes. Clarke and Chase (1972) were the first to embark on the sentence verification task, providing students with true-or-false test sentences with respect to a picture. Results demonstrated that subjects went through a series of ‘discrete stages’ in which both pictures and text were encoded into a common abstract representation. This resulted in the Multimodal Theory of picture-word processing. However, when the task of naming was changed to a comparison of a test sentence to either a picture or word input, format effects could not be predicted (Clarke & Chase, 1972). Further research (Glaser & Glaser, 1989; Glenberg & Langston, 1992; Smith & Magee, 1980; Theios & Amrhein, 1989b) discovered that a Multimodal Theory of pictures and words would not predict the effects of format on information processing, because access to the verbal or semantic network would be abstract rather than concrete, as proposed by DCT.
Text and diagrams containing similar information are not equal in terms of the processing required to extract the information. Goolkasian’s study (1996) examined whether format effects had any influence on four specific tasks: probability judgment with colors, probability judgment with shapes, category inclusion, and pragmatic inference. The findings are as follows:

1. Pictures had an overall advantage in terms of how information is extracted versus having access to semantic memory.
2. Pictures aid in the comprehension and retention of text through working memory management.
3. Pictures facilitate more efficient reasoning related to probability judgments with colors and shapes than to category inclusion and inferences.
4. Shapes were closer in visual detail when compared to other attributes.
5. All conditions demonstrated similar format effects but varied in response efficiency and effect of item type.
6. Compared to probability judgment, problem solving with inclusion and inference items was more affected by the test-text statement.
7. There is a performance advantage when pictures occur relative to when words occur, in a variety of conditions.

Though Goolkasian’s experiment showed that the advantage of pictures was based on how information was extracted, versus its access to the semantic memory system as hypothesized by the Multimodal and DCT models, the pictures used in the study did not have an associative or paired semantic relation to the accompanying text. Imagery, according to Paivio, can help mediate performance by serving as a reference or interactive relation to language.

Image and Object Manipulation Tasks

In visual perception, spatial knowledge is crucial in detecting object location, direction, and recall.
There are specifically four types of spatial relations: (i) direction relations describing order in space, (ii) topological relations describing neighborhood and incidence, (iii) distance relations, and (iv) ordinal relations that describe inclusion (Pullar & Egenhofer, 1988). Space-based theories often emphasize the distance between the target and distractor stimuli irrespective of retinal location, as this distance is an important issue in effectively processing visual stimuli. Visuals and the space they occupy, and the distance between the target object and distractors within the parameters of the retinal view, are crucial in the effective processing of visual information. This also permits effective visual manipulation of objects and transfer. This is even more of a pertinent issue when visual representations occur in the computer 2D environment, because of the limits imposed by screen size and resolution. Some of the categories of mental operations described constitute an overall description of basic higher-order cognitive skills of problem solving, analysis, manipulation, and transfer. Specific distinctions of different complex cognitive tasks are many and varied, requiring an interaction of all four cognitive skills. Much of the research on individual performance on complex tasks has been done by investigating specific cognitive skills while controlling for any probable interaction, and by examining the interactions themselves.

INDIVIDUAL DIFFERENCES

Many research studies have investigated individual differences in the cognitive processing of visual and verbal information and the effect on performance. According to Reinert (1976), “these cognitive abilities can be thought of as perceptual modalities, channels through which the individual receives, uses, and retains information.
Each person is ‘programmed’ in certain ways, so that one particular cognitive ability becomes more compatible in confronting and obtaining information, whereas other abilities may be less effective” (Van Dusen, Spach, Brown, & Hansen, 1999, p. 1030). The term ‘visual learner’ has often been used to describe one’s learning style. In recent years, however, researchers have obtained results that do not support a strong relation between individuals who rely primarily on images to perform cognitive tasks and performance on high-imagery tasks. On the contrary, current studies have indicated that the processing of visual-verbal information is not a unitary construct, as proposed by Paivio (1991), but involves an integration of problem-solving cognitive skills at varying levels among different individuals. The reason for this great variation is that “imagery is not general and undifferentiated but composed of different, relatively independent visual and spatial components” (Baddeley, 1992; Farah, Hammond, Levine, & Calvanio, 1988; Kosslyn, 1994; Kozhevnikov, Hegarty, & Mayer, 2002, p. 48; Logie, 1995). Several researchers (Moses, 1980; Suwarsono & Presmeg, 1986a, 1986b) proposed that visualization can be placed on a continuum, called ‘degree of visuality’, while solving mathematical problems. Results from this approach failed to connect the degree of visual ability to levels of spatial ability. Following these incongruences, researchers such as Kosslyn (1995) proposed categories made up of visual ability, either high or low, and spatial ability, either high or low. Imagery involving different types of visuals was identified by Presmeg (1986a, b) in mathematical problem-solving tasks. These include concrete pictorial imagery, pattern imagery, kinesthetic and dynamic imagery, and memory for formulas, with pattern imagery classified as playing the most important role in mathematical problem solving.
This is because pattern imagery disregards concrete details and focuses on pure relations. Finally, Hegarty and Kozhevnikov (1999) discovered that visual-spatial representations can be divided into primarily schematic or primarily pictorial, and found that “the use of schematic representations was significantly correlated with students’ spatial visualization ability” (Kozhevnikov, Hegarty, & Mayer, 2002, p. 51). In high-stakes assessment environments, reading to search for information with the intent of answering specific questions, under a time constraint, while familiarizing oneself with the functions of the interface, is a multi-task criterion for the candidate. Specific variables, as already described, will have an impact on this process. The following chapter will review past and current visual-text assessment formats and their results and conclusions.

VISUAL-TEXT ASSESSMENT FORMATS

EARLY RESEARCH

There exist only a limited number of studies that have concentrated on the use of visualized tests. The earliest research utilizing visuals and text in educational standardized tests was conducted by Brown (1947), entitled A Comparison of Verbal and Projected. Specifically, the objective of his study was to look at the student’s ability to “use principles to explain, to predict, and to arrange conditions to bring about a desired end” (p. 1). His argument lends support to the current study, in that in pencil and paper standardized tests where student ability is correlated to teacher scores:

1. Words used in verbal tests may not invoke similar meanings among examinees.
2. Intended meanings expressed by the test construct may not carry the same intended meaning among examinees.
3. The verbal constructs of words, phrases, and meanings may not allow for synchronous processing to occur, i.e. where information is not simultaneously presented, requiring examinees to piece together the problem, thereby causing a cognitive overload.
4. The quality of verbal information processing is influenced by an individual’s reading rate and overall comprehension speed.

A summary of the results from Brown’s (1947) study is as follows:

1. The scores were stable throughout the period of the performance test for each examinee.
2. There were no significant differences between the two test formats with regard to the number of items.
3. Both formats correlated with the performance test criterion to a degree sufficient to indicate good predictive power and generalizability.
4. The verbal-pictorial format, based on matched and paired items, is a more valid predictor of student performance for IQ levels of 100 and below.
5. The verbal-pictorial test format was significantly less difficult than the verbal test, while still maintaining its validity.

These earliest findings strongly suggest the supremacy of a DCT method of information processing as hypothesized by Paivio (1986, 1990, 1991). Building on Brown’s (1947) study of verbal-pictorial test formats, a section of Lefkowith’s (1955) research focused on the reliability and validity of pictorial tests in actual testing programs. A pool of 60 multiple choice questions, each with five answer choices, was administered to examinees in two different formats: a verbal or text-only format, and a pictorial-text format utilizing visuals as cues. The overall results from the study indicated that:

1. The correlation between examinee scores and the pictorial test method was higher as the pictorial stimuli became more iconic, a characteristic of the VSTM.
2. Pictorial tests were valid and reliable enough for use in actual testing programs that complement teaching and instruction in K-12 curriculums.

All three hypotheses were realized. Since Brown’s (1947) and Lefkowith’s (1955) visual-text studies, there have been sporadic attempts to build on these hypotheses, resulting in a variety of outcomes and results.
PAST RESEARCH

Dwyer and De Melo’s (1984) study, entitled Effects of Mode of Instruction, Testing, Order of Testing, and Cued Recall on Student Achievement, consisted of five types of evaluation formats: (1) a drawing test, the ability to re-create items in their appropriate context, (2) an identification test, used to measure the ability to discriminate one structure from another, (3) a terminology test, measuring specific domain knowledge, (4) a comprehension test, an evaluation of the application of learned information, and (5) a total criterion test, a combination of all the formats above. The content of their test was the functions of the human heart and its internal processes. The results indicate that using visuals to complement verbal instruction assists in recall, and that the higher mean scores among students who took the verbal test format disappeared on the delayed two-week retention tests. Overall, the visual testing format that was predicted to improve performance over the verbal format did not do so. However, the investigators attributed this to the following:

1. It was the participants’ first exposure to visual testing; rehearsal sessions would have altered the results significantly.
2. The visual format items were designed to be congruent to the verbal distractors (the non-correct answers used to re-direct the participant’s focus from the correct answer) of the verbal items. This is not a matter of merely translating a verbal format into a visual one, as visual images and their distractors are processed and filtered differently.
3. Only one type of testing format was used in the visual version of the test, matching the correct image to the multiple choice responses. The verbal category included a variety of formats, such as labeling, naming, and drawing.

A significant finding in Dwyer and De Melo’s (1984) study was the advantage of the verbal test format disappearing after two weeks in delayed retention tests.
Building on this premise, Richards (1987) revised their tests and focused on the aspect of immediate versus delayed test formats, both in a verbal and a visual version, this time on computer displays. In addition, he also investigated time spent on tests as a valid variable measure. A concise table of overall test item reliabilities is illustrated below.

Table 1. Immediate-Delayed Test Reliabilities for all Criterion Tests

Test                          Reliability
Drawing                       0.722
Identification                0.653
Terminology                   0.747
Comprehension                 0.743
Total (Id. + Term. + Comp.)   0.816

Using the Kuder-Richardson test reliability formula, estimates of parallel reliabilities for both the immediate and delayed tests, according to Richards, proved satisfactory. It is important to note, however, that his reliability estimates would not be satisfactory by today’s standards. A good estimate would be between 0.8 and 0.9, while anything at or below the 0.7 index level would be considered poor. After analysis of the results, there were no significant differences between testing modes among all tests, nor in time spent on all three types of categories and on both formats. In contrast to the past preliminary research studies of verbal-visual testing modes and formats, McNeal (1994) utilized Paivio’s (1976, 1981, 1991) DCT as the core theoretical foundation of her research in assessment and testing, using the following verbal test formats: (i) an identification test comprised of multiple choice responses, (ii) a terminology test made up of multiple choice and fill-in-the-blank items, and (iii) a comprehension test also made up of multiple choice responses. The following item types were used for the visual format: (i) an identification test with one visual and four to five text labels at any one time, (ii) a terminology test using at least two visuals to fill in the blanks for the part of the heart associated with the function, and (iii) a comprehension test offering visuals for four multiple choice options.
Finally, scores from each criterion test were combined to form a 62-item composite of the visuals-plus-text format. The visuals used in McNeal’s (1994) study were simple line drawings in black and white combined with text. A sample of a test format is attached in the appendix (Appendix B). Estimates of the reliability coefficients for all items in the various formats were within the range of 0.70 to 0.92. Results indicate the following:

1. Though the “means on the visual form of the criterion measures were not generally deviant from those on the verbal forms,” the “standard deviations were usually higher on the visual form of the criterion measures” (p. 54).
2. The mean achievement scores on the combined visual-text section of all categories were significantly higher. An overall comparison of composite scores across all formats is illustrated in the appendix (Appendix C).

McNeal (1994) attributed the higher achievement results among examinees in the composite visual-text format (T4) to the following DCT (Paivio, 1976, 1986, 1990, 1991) principles of information processing:

1. When concepts are stored in both a verbal and a visual code, they are retained in memory longer and are more easily accessible. Paivio (1976, 1981, 1990, 1991) identifies this as the code-additivity hypothesis: encoding in both visual and verbal forms facilitates memory (Mayer & Gallini, 1990; Park & Hopkins, 1993; Rieber & Kini, 1991; Sadoski, Goetz, & Fritz, 1993a, 1993b).
2. The notion of referential connections, i.e., associations between text and visuals in the DCT approach, allows for great flexibility in human cognition (Sadoski, Paivio, & Goetz, 1991).
3. Another dual coding principle is encoding-specificity, matching the assessment format to the instruction-learning situation.
4. The superior performance on the prose section of the visual-text format was attributed to deeper levels of information processing (Craik & Lockhart, 1972) of both visuals and text.
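The KR-20 internal consistency estimates cited for these studies can be reproduced directly from a matrix of dichotomously scored item responses. The following is a minimal sketch, with a hypothetical 5-examinee, 4-item score matrix (the data are illustrative, not from Richards' or McNeal's studies; this version uses the population variance of total scores, though some treatments use the sample variance):

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 internal consistency for dichotomously (0/1) scored items.

    responses: list of examinee response vectors, one list of 0/1 scores each.
    """
    k = len(responses[0])                       # number of items
    totals = [sum(person) for person in responses]
    var_total = pvariance(totals)               # variance of total scores
    # Sum of p*q over items, where p = proportion correct on the item
    pq_sum = 0.0
    for item in range(k):
        p = sum(person[item] for person in responses) / len(responses)
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / var_total)

# Hypothetical score matrix: 5 examinees x 4 items
data = [[1, 1, 1, 0], [1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
print(round(kr20(data), 2))  # → 0.79
```

An estimate near 0.79 would, by the standard discussed above, sit just at the boundary of acceptability.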
Most of the significant findings were confined to the combined visual-text test format versus the visual-only test format. McNeal (1994) attributed the significant differences in verbal test format scores in the comprehension and composite sections for the three instructional conditions to prior knowledge and familiarity with the test format.

CURRENT RESEARCH

More and more tests and assessments now utilize measures geared toward CAT environments, with in-depth research focusing on new and innovative measures. The listening-comprehension section of the Test of English as a Foreign Language (TOEFL) now includes visual accompaniments to verbal stimuli. Ginther’s (2001) report, entitled Effects of the Presence and Absence of Visuals on Performance on TOEFL CBT Listening-Comprehension Stimuli, sought to explore the following questions:

1. Do subjects perform better on test items when they are presented in a dual-modality format? (Baddeley, 1992)
2. Is there an interaction between visuals and other stimuli?
3. Is the effect on examinee performance a result of English proficiency, or is it related to visuals making the task easier?
4. Is there a clear preference for the dual-modality format or the audio-only format?

Results indicated that the effect of visuals was contingent on the experimental condition, as the only significant finding was in the ‘mini-talks with content visuals’ condition, with no significant change in the other conditions. Contrary to Ginther’s findings, Paivio and Desrochers (1980) examined the effectiveness of visuals in bilingual acquisition and second language learning by investigating results from relevant experimental studies. These experimental studies (Kellogg & Howe, 1971; Wimer & Lambert, 1959) have consistently shown that L2 (second language) responses are learned with fewer errors and in fewer trials if visual referents rather than L1 (first language) words are used as cues or stimuli.
In tests and assessments, the interaction effects between content and visual characteristics extend to what Cronbach (1975) refers to as “a hall of mirrors that extends into infinity.” To date, researchers in the field of assessment and testing are exploring new and innovative ways to develop more valid and reliable measures of an examinee’s ability, taking into account individual differences that exist in information processing. Researchers are finally beginning to understand the importance of cognitive foundations and their impact on tasks, something cognitive psychologists have long studied and investigated. In the next chapter, the LSAT and its internal structure will be assessed.

THE LSAT (THE LAW SCHOOL ADMISSIONS TEST)

ITEM TYPES AND MEASURES

The LSAT is currently the official admissions test for all candidates gaining entry into law school across the United States and Canada. Three types of test items make up the LSAT: Reading Comprehension (RC), Logical Reasoning (LR), and Analytical Reasoning (AR). The purpose of each item type is to measure specific cognitive abilities. Questions have been raised regarding the authenticity of LSAT item types with respect to the specific cognitive abilities they are intended to measure. Wilson and Powers (1994) investigated the internal structure of the LSAT to review the reliability and validity of the three specific item types and the abilities they measure. The specific skills to be assessed operationally by the three item type categories are as follows:

• Reading Comprehension (RC) - This section requires the examinee to read a passage so as to determine relationships among various parts of the passage and draw inferences from it. The cognitive abilities measured here include inferring, filtering, association, and transfer of applicable information.

• Logical Reasoning (LR) - An examinee is required to read and understand the argument or reasoning in a passage.
These questions test reasoning, logic, and the drawing of critical conclusions from given evidence or premises.

• Analytical Reasoning (AR) - A set of conditions or rules is presented, and the examinee is expected to draw conclusions using these rules. This section measures “the ability to understand a structure of relationships and to draw conclusions about the structure” (Wilson & Powers, 1994, p. 1).

The following table illustrates a breakdown of the correlations of the three item types.

Table 2. Observed Correlations and Correlations After Correction for Attenuation: LSAT (Law School Admissions Test) Section Scores, June 1991 and October 1991 Forms of the LSAT*

                  June 1991                       October 1991
Section    LR25   LR24   RC28   AR24       LR25   LR24   RC28   AR24
LR25      (.78)   .97    .91    .71       (.77)   .96    .89    .68
LR24       .76   (.79)   .89    .72        .74   (.77)   .87    .64
RC28       .72    .71   (.80)   .63        .69    .68   (.79)   .59
AR24       .55    .56    .50   (.77)       .52    .49    .46   (.76)

Note: Observed correlations are shown below the diagonal; corrected correlations are shown above the diagonal; diagonal elements are estimated KR-20 reliabilities. *Data from unpublished ETS internal test analyses for forms of the LSAT used in the present study; KR-20 = internal consistency estimates.

The pattern of correlations shown in Table 2 supports the specific skill each item section is intended to measure. Corrected correlations for the Reading Comprehension (RC) and Logical Reasoning (LR) items ranged between .87 and .91. This inter-correlation supports the conclusion that similar abilities are being measured in Reading Comprehension (RC) and Logical Reasoning (LR). In contrast, the Analytical Reasoning (AR) items have corrected correlations ranging from .59 to .63, evidence that Analytical Reasoning (AR) items have been developed to measure abilities that are not measured by Reading Comprehension (RC) and Logical Reasoning (LR).
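The corrected values above the diagonal in Table 2 follow the standard Spearman correction for attenuation, which divides an observed correlation by the geometric mean of the two reliabilities. A short sketch, checked against the June 1991 LR25/LR24 entry of the table (observed r = .76, KR-20 reliabilities of .78 and .79):

```python
from math import sqrt

def disattenuate(r_xy, rel_x, rel_y):
    """Spearman correction for attenuation: the estimated correlation
    between true scores, given an observed correlation and the
    reliability of each measure."""
    return r_xy / sqrt(rel_x * rel_y)

# June 1991 LR25 vs. LR24 from Table 2:
corrected = disattenuate(0.76, 0.78, 0.79)
print(round(corrected, 2))  # → 0.97, matching the entry above the diagonal
```

The same computation reproduces the other above-diagonal entries from the observed correlations and the diagonal reliabilities.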
Predictive Validity

Since the Wilson and Powers (1994) study, other studies have investigated the validity of specific item types of the LSAT as a predictor of success in law school. “The general concept of validity is a broad one, encompassing the accumulation of data to support a particular use of a test. The particular type of evidence obtained from the correlation studies is referred to as predictive validity” (Anthony, Harris, & Pashley, 1999, p. 2). In their report, Anthony et al. (1999) investigated the LSAT as a predictor of first-year law school average scores (FYA) for 1995-1996, known as the criterion variable. The LSAT and first-year average (FYA) data were gathered from 183 law schools throughout the nation. The results from the correlational study demonstrate that the LSAT score is a better predictor of first-year performance in law school than undergraduate grade point average (UGPA), and that a combination of the LSAT and the UGPA serves as an even better predictor than either individual measure (Appendix D).

Analytical Reasoning (AR) Discrepant Subscores

Statistically significant differences have been found to affect about a third of examinees in the AR section of the LSAT, while significant and rare differences involve about a tenth of test takers (Stricker, 1993). The findings suggest that there exist marked differences among examinees who take the LSAT, “reflecting variation in their development of the abilities tapped by the subtests” (Stricker, 1993, p. 11). Other tests, such as the GRE (Bridgeman & Cline, 2000) and the WAIS-R (Matarazzo, Daniel, Prifitera, & Herman, 1985), have also shown these discrepancies. The prevalence of discrepant scores in the AR section of the LSAT was greater for older examinees and for those who had higher total scores. Three types of discrepant subscores were obtained for each pair of subscore comparisons.
These were adapted from those utilized in intelligence tests (Kaufman, 1990; Sattler, 1988). They include: (i) an observed difference, the actual difference between a pair of subscores moving in the same direction; (ii) a significant difference at the .05 level; and (iii) a significant and rare difference, one occurring with a frequency of .05 or less. Overall results from the study are attached in the appendix (Appendix E). The overall significant finding is that substantial subscore differences were frequent among examinees. This reflects a variation in the abilities being measured, a common observation in intelligence tests (Chatman, Reynolds, & Willson, 1984; Kaufman, 1976a, 1976b; Matarazzo, Daniel, Prifitera, & Herman, 1988; Matarazzo & Herman, 1988; McLean, Kaufman, & Reynolds, 1989; Rosenthal & Kamphaus, 1988). It is precisely these subscore differences, especially in the AR section, that prompted the present investigation.

TIME SPEEDEDNESS

Time, as explained in Chapter 2 of this study, is both a variable and a constant that affects performance on tests, particularly within the context of a constraint such as the LSAT. According to Scrams and Schnipke (1999), response accuracy and response speed provide different measures of performance. Test speededness occurs when examinees receive lower scores as a result of a lack of time and not because of a lack of ability. Speededness, with regard to the LSAT, is currently measured by calculating the proportion of test takers who do not reach each item on the test (Schnipke & Scrams, 1999). According to Schnipke and Scrams (1999), the LSAT is partially speeded, as the proportion of test takers who reach each item decreases toward the end of the test. Schnipke and Scrams proposed a two-solution-strategy model to account for the relationship between response times and accuracy of responses.
The model is based on Thissen’s (1983) timed-testing model, which examines the relationship between ability and speed. The examinee is offered two solution strategies to choose from: a heuristic strategy and an algorithmic strategy. Her choice is determined by time limits. If strict time limits are imposed, she may choose the heuristic strategy to minimize her processing time. This strategy involves the management and allotment of time for questions and responses throughout the test (see Figure 2).

Figure 2: Comparison of Two Solution Strategies in Terms of Their Speed-Accuracy Tradeoff Functions. The Two Vertical Lines Represent Possible Processing Times.
[Figure: P(“Correct”) plotted against processing time for the algorithmic and heuristic strategies.]

The vertical line nearest the Y axis indicates her level of performance. If the algorithmic strategy is selected, the examinee concentrates on accuracy at the expense of longer processing times, increasing her asymptotic accuracy. Schnipke and Scrams (1999) also attributed shorter response times within tests that are speeded or have strict time limits to guessing. Results from the study, using data from the logical and analytical sections of the CAT version of the GRE, indicate the following:

1. Items located at the end of the test had faster responses, some under 10 seconds.
2. Rapid guessing behavior was independent of item content but contingent on item location.
3. Faster response times were associated with low accuracy levels.
4. Slower response times were associated with higher accuracy levels.
5. As time slowly increases during tests, a plateau is reached beyond which accuracy does not increase.

The premise of speed-accuracy relationships is that higher scoring examinees who are confronted with higher difficulty test items tend to take more time on them.
This interaction of level of item difficulty crossed with greater response time thus confounds observations of the examinee’s true ability level. Schnipke and Scrams (1999a, 1999b) concluded that other variables, such as strategies, content and context of information, and time management or pacing, are issues that affect performance on test scores.

RESEARCH METHOD & DESIGN

PURPOSE OF THE STUDY

The purpose of this study is to compare the performance of samples of students taking the AR section of the traditional format of the LSAT to those taking the AR DCT format of the LSAT. A comparison of the performance of the two samples will thus determine if there is a format by time interaction, i.e., whether the new dual coding format will serve as a more efficient measure of problem solving. A DCT test format was developed from a past LSAT Analytical Reasoning section. Each item from the traditional section was carefully designed to include visuals and text for question presentation and responses. Care was taken to ensure that the new information did not provide additional hints to the answers. The items for each format were then administered to the examinees. Each participant was randomly assigned to either the DCT or the traditional LSAT condition. Scores from the traditional section were compared to scores from the DCT test format and were analyzed and evaluated.

PROCEDURE

Definition of Terms

The overall hypothesis for this study is that DCT item formats are more efficient measures than the traditional item formats. Listed below are definitions of the specific variables:

Efficiency

For any candidate taking a high stakes assessment test, performance is almost always affected by time constraints. The effect of time on test performance is currently being investigated in many high stakes assessments, such as the GRE.
“Whether speediness is irrelevant or a relevant indicator of academic ability, the extent to which score is dependent on time is of interest to potential score users” (Bridgeman, Cline, & Hessinger, 2003). There cannot be complete assurance that a test was speeded. Efficiency, in this case, is the time related to the number of correct responses in proportion to the total number of items answered. The hypothesis is that the new procedures will take less time between items, and hence yield lower average response times than the previous item response process.

Time

Time here is specifically defined as follows: (1) the average time to complete a single question, and (2) the average time taken to complete the number of correct responses in proportion to the number of items answered. Explicitly, the measures will include the following: for each group, the average time taken to complete the number of items answered plus the number of correct answers, and the response latency, or time taken between the presentation of each item and the response. (Note: Response time is defined as the time from when the question first appears on the screen until an answer is given and confirmed.)

Power Analysis

A power analysis based on data from past research that investigated response time differences among examinees in CAT environments (Bridgeman & Cline, 2000) was conducted to determine an effective sample size for the study. A section of that study looked at the mean response times between two categories of the GRE, the Quantitative section and the Analytical Reasoning section. Preliminary studies indicated that 20% of examinees fail to complete the quantitative section and 35% fail to complete the analytical section. The analysis indicates that a sample size of approximately N = 128 (total) with a power of .8 and an alpha level of .05 at an effect size d = 0.5, or N = 90 (total) with a power of .8 and an alpha level of .05 at an effect size d = 0.62, will be used.
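Sample sizes like those quoted above can be approximated with the standard normal-approximation formula for a two-group comparison, n per group ≈ 2((z₁₋α/₂ + z_power)/d)². A minimal sketch in Python; note that the dissertation's totals of 128 and 90 were presumably produced with exact t-distribution software, so this approximation runs a few examinees lower:

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.8):
    """Normal-approximation sample size per group for a two-sided
    two-sample comparison at standardized effect size d (Cohen's d)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = .05
    z_power = NormalDist().inv_cdf(power)           # ~0.84 for power = .80
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

for d in (0.5, 0.62):
    n = n_per_group(d)
    print(f"d = {d}: {n} per group, {2 * n} total")
# → d = 0.5: 63 per group, 126 total
# → d = 0.62: 41 per group, 82 total
```

The approximation yields totals of 126 and 82, close to the N = 128 and N = 90 cited above.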
In Bridgeman and Cline’s study of time and position effects using GRE Quantitative and Analytical items (note that the GRE Analytical section contains items similar to the LSAT AR and LR items), an average response time of 78 seconds per item was observed. Therefore, a goal of this study will be to reduce the mean response time per item to ≤ 78 seconds over the full set of items, for the following reasons:

1. Bridgeman and Cline’s study was done with data from the GRE-Quantitative and GRE-Analytical sections, which serve as a good comparison to the LSAT AR section.
2. The response time difference was for separate categories, with a 20-second interval between the CAT GRE-Q and GRE-A items. This comparison was chosen because GRE-Q items include visuals in the trigonometry section, whereas GRE-A items have always been represented as text only.
3. By selecting ≤ 78 seconds as the minimum time difference to target, the DCT items would be shown to close the gap between the GRE-Q item response times and the traditional GRE-A response times.
4. GRE-A items consist of testlets, sets of 4 or 5 questions tied to one stimulus, which take more short-term memory (STM) space, versus GRE-Q items, which can exist as stand-alone questions. If the minimum time difference of ≤ 78 seconds is met or reduced, this would be a significant indication that DCT item formats help in increasing memory resources.
5. “LSAT is a univariate test designed to measure reasoning ability” (Henderson, 2004) that parallels the time constraints of actual law school in-class examinations. As such, it is a fairly robust predictor of law school in-class exams in terms of test-taking speed.

The format of this study has thus been selected to address the response time investigations.

Item Selection Process

The construction of innovative item types, i.e.,
DCT format items must take into account the following issues: (i) users are not just targeting information for information’s sake, but are answering specific questions within a limited time framework; (ii) maintaining the validity and reliability of items in a 2D environment; (iii) the differing levels of competence in technology use among candidates; and (iv) the proper use of items for the specific testing domain, i.e., Analytical Reasoning (AR). The following sections list the basic criteria that were used in selecting specific features crucial to the item selection process. A set of past LSAT AR questions from the PrepTest series (LSAC, 2002, 2003, 2004) was selected as the experimental item constructions for presentation on computers. Currently, the LSAT is offered only in pencil and paper format. Twenty-three questions from Section II (Analytical Reasoning) of the June 2003 PrepTest 40 were utilized for the study. Approximately five sets of questions were based on a set of conditions, each measuring the following: directional skills, ordering or ranking abilities, and selection, i.e., inclusion and exclusion. Unlike other research studies that use visuals and text in the questions themselves, only the set of conditions and multiple choice options were presented in visual-text format. Selection of visuals for the conditions included the following:

• Conditions that were directional utilized arrows.
• Conditions that occurred in order or were ranked utilized number sequences and series of periods or dots.
• Conditional events utilized the ‘if’ and ‘or’ words.
• Values that were excluded utilized a diagonal line struck across the value.
• Values that were included utilized a ‘plus’ or ‘addition’ symbol.
• The points of connection for directional visuals utilized red dots to represent connections and non-connections.

Selection criteria were based on the eleven-category visual classification list (Lohse, Biolsi, Walker, & Rueter, 1994).
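The condition-to-visual rules above amount to a small lookup scheme mapping condition types to visual encodings. A hypothetical sketch of how such an encoding table might be represented; all names and the fallback behavior are illustrative, not taken from the study's actual implementation:

```python
# Hypothetical encoding table mirroring the design rules described above.
CONDITION_VISUALS = {
    "directional": "arrow",
    "ordered/ranked": "number sequence with dots",
    "conditional": "'if' / 'or' keywords",
    "excluded": "diagonal strike through the value",
    "included": "plus symbol",
    "connection point": "red dot",
}

def visual_for(condition_type):
    """Look up the visual cue for a condition type (illustrative helper);
    unrecognized condition types fall back to plain text."""
    return CONDITION_VISUALS.get(condition_type, "plain text")

print(visual_for("excluded"))  # → diagonal strike through the value
print(visual_for("unknown"))   # → plain text
```

Keeping the mapping in one table makes the visual encodings easy to apply consistently across all condition sets.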
Kit of Factor-Referenced Cognitive Tests

The kit consists of 72 tests that have been demonstrated to be consistent measures in studies of 23 cognitive factors, such as reasoning, verbal ability, spatial ability, memory, and other cognitive processes. This tool was developed by Ekstrom et al. (1976) with the goal of assessing individual differences in cognitive abilities. Two tests of perceptual speed, the Identical Pictures Test for visuals and Finding A’s for text, were utilized. The Kit of Factor-Referenced Cognitive Tests was included to check the validity of the DCT and LSAT items in both formats, specifically whether the visuals in the DCT format contributed to a change in the validity of the test items. An example of some items from the kit is attached in the appendix (Appendix F).

Graphical User Interface (GUI) Development

Before engaging in prototyping, basic evaluation procedures were adopted that researched three important factors: (i) the performance to be measured, in this case problem solving skills; (ii) how this is to be measured and recorded on computer display terminal (CDT) displays; and (iii) a persona of a typical examinee who would take the LSAT in normal situations, i.e., after completion or near completion of an undergraduate degree. In the development of an intuitive Graphical User Interface (GUI) based on Norman’s (1986) general principle of task functionality, a prototype is a critical tool for design and evaluation. A prototype, as defined by Hackos and Redish (1998), “is an easily changeable draft or simulation of at least part of an interface” (p. 376). The following principles were taken into account: (i) the flow of screens for major tasks, (ii) the screen layout of the basic task screen, (iii) layouts for all screens, (iv) interactive functions for each screen for input and output data, and (v) matching the layout and screen to the task and mental model of problem solving in the AR section (Hackos & Redish, 1998).
Prototyping a GUI occurs at three stages: the pre-prototype phase, the prototype phase, and the post-prototype phase.

Pre-Prototype Phase

In the pre-prototype phase of development, core variables, such as domain complexity and applying a suitable technology type to the task, were investigated and researched. Once an evaluation of the items themselves was completed, paper-based mockups of each screen were drawn up. At this stage of GUI development, most of the prototyping is done on paper because of its versatility and easy editability. Sketches of each screen were done in black and white first to obtain the overall positioning and spatial layout of the items to be displayed. Each screen was then drawn in color and evaluated for connectivity and consistency.

Prototype Phase

The prototype phase of design investigates principles of the GUI that will determine information representation and presentation on the specified computer display terminals (CDTs). Core principles of HCI and the functions necessary for delivery and interaction, regardless of the domain, were incorporated in this phase. The overall layout design is essentially based on the Gestalt theory of visual organization (Wertheimer, 1925), which is concerned with the configuration and visual organization of graphical objects and their impact on human perception (Roth, 1995). Specific Gestalt concepts are briefly listed as follows:

1. Similarity - Visuals are grouped together based on similar shape, size, pattern, or color, representing similar functions or functions that consist of common interaction controls.
2. Proximity - Consistent measures of space between visuals reduce confusion and error, enhancing the comprehensibility of visuals and text.
3. Contrast - Emphasizes important information by capturing the individual’s attention, because objects in the surrounding vicinity compete for visual attention.
4. Figure-Ground - The foundation of layout in design, print or digital.
Screenshot samples of the GUI for both formats are attached in the appendix (Appendix G).

Spatial Relations

The visual spatial relations and locations of the selected objects and text were addressed next, after visual grouping. Three organizational methods for achieving screen design were utilized: (1) using a grid tool, (2) using item grouping strategies, and (3) standardizing the layout (Marcus, 1992). Specific spatial layout concepts such as balance, equilibrium, symmetry, sequence, and unity (Ngo, Teo, & Byrne, 2000) were addressed.

Icon Representation

Based on Norman’s (1999) theory of action-intention, an interface that can be directly manipulated is much easier to utilize than one that cannot. In short, any interface that involves the concept of automaticity (Logan, 1988), requiring no extra cognitive resources to understand what each icon does, constitutes a good design. Icons in particular have to be carefully designed so that at least 90% of users can fully comprehend their representation. Haramundanis (1996) and Horton (2001) defined icons as small pictorial symbols on computer menus, screens, and windows that: (1) serve as cues or reminders, (2) aid recognition, (3) save screen real estate, and (4) assist users whose native language is not English. Similar to the DCT approach, research on HCI issues illustrates that icons accompanied by text, versus text or icons alone, are more effective in affording faster and more accurate search times. As such, only black and white images of the menu icons were incorporated, to prevent visual overload, as the candidate has limited time to become familiar with their functions.

Interactors

Buttons and other clickable interactors function as avenues to links of information for the interface user. Bodner (1994) found that animated buttons yielded 85% correct responses, versus 67% for users who had to utilize static buttons.
Animation does not necessarily mean a constant movement of a visual within the parameters of an interface design. It includes any embedded clicking action when the button is clicked on, any highlighting effects, or any change of visual image on mouse roll-overs. The following specific guidelines for designing interactors have been incorporated.

- Mouse - The mouse is a simple point-and-click device that is not without its setbacks. Errors in mistaken target selection and intent are always issues of concern. Only the point-and-click function of the mouse was selected, to limit multiple interactive actions at any one time.

- Animation - As explained, buttons need a change of image status to indicate functionality and interactivity. A highlighting roll-over feature fulfills this status change.

- Limits on Interactors - Too many different types of interactors will cause confusion regarding their function and purpose. In a CAT, cognitive resources need to be preserved for the actual performance evaluation. Interactors have been limited to functions crucial only to the input of answers and navigation between screens.

Color Representation & Use

The use of color always centers around four specific characteristics: hue, brightness, saturation, and contrast. Hue is the general identification of a specific color. Brightness is the level of luminance within a color. Saturation is the interaction of hue and brightness, often referred to as the color depth. Contrast is the relative perceived brightness of two displayed colors based on the figure-ground theory, which enables the individual to separate and decipher two groupings (Misanchuk, Schwier, & Boling, in press). Color is crucial in displaying detail in visual information. As such, it is imperative to use color with discretion. To adhere to these guidelines, the use of color was limited to red for the alphanumeric letters enclosed in bounding boxes, and grey for the bounding boxes in the DCT format.
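The contrast characteristic can also be quantified. As a hedged illustration only (this uses the W3C relative-luminance formula, which is not part of this study), the black-on-white scheme described here achieves the maximum possible contrast ratio of about 21:1:

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light (W3C formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(r, g, b):
    """Perceived luminance of an sRGB color, weighted per channel."""
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)

def contrast_ratio(fg, bg):
    """W3C contrast ratio (from 1:1 up to 21:1) between two colors."""
    lighter, darker = sorted(
        (relative_luminance(*fg), relative_luminance(*bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

# Black text on a white background: (1.0 + 0.05) / (0.0 + 0.05) = 21:1.
black_on_white = contrast_ratio((0, 0, 0), (255, 255, 255))
```

Any foreground/background pairing in the design can be checked against such a figure-ground threshold before committing to it.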
The yellow roll-over color was utilized as a highlighting device for the menu interactors. All other visuals were in black and white, with the background in white, adhering to the contrast principle of color.

Font Styles and Size

The problem with presenting any text information on a computer screen is the limited display space. Eye movements are limited to the screen, versus the 'real environment' that allows the eye to saccade over a larger area. Tullis (1983) discovered that reading, which involves parallel cognitive processes, one for processing text and the other for the semantic processing of language, is not done in the traditional top-down and left-right manner as in the paper format. Instead, it occurs as a collective search for the required information. Snyder and Maddox (1978) conducted an investigation of text legibility on computer screens and discovered that smaller text sizes produced faster reading rates, as the cognitive resources needed for information processing are not spread over too large an area. The suggested size should be in line with the screen parameters, but from a general standpoint, the largest used for content text is approximately 14 point (Bernard, Mills, Frank, & McKnown, 2001). As such, text density is something that needs to be manipulated and addressed on computer display terminals (CDTs) to accommodate this problem. Lines too short will cause the candidate to skim through them with no information processing occurring, while lines too long will cause an overload. What is important is the spacing between lines of text and paragraphing to accommodate chunking. In addition, the type of font, serif or sans serif, needs to be taken into account. Sans serif, a category of typefaces that do not include serifs, or small lines at the end of characters, is generally harder to read than serif type. Sans serifs are generally good for use in small paragraphs or titles of text.
Serif fonts, on the other hand, aid in reading but not as titles, as the presence of small lines at the end of characters may make headlines appear too busy. In DCT formats, there is now the added complexity of presenting text as labels near or overlapping visual representations, or enclosed in specific parameters, such as a table cell or a bounding box label. Care was taken in the use of specific font styles, such as using solitary capital letters, combined with their locations, as iconic cues and response selections, versus as text or labels. Line and character spacing were taken into account when displaying the text layout, horizontally and vertically, and Garamond, a serif font, was used as the font style.

Designing For Error

Errors in mouse pointing normally occur for reasons such as the close proximity of icons, icon representations with no labeling, or too many similar icons. To avoid such errors, specific guidelines to accommodate human error were incorporated as follows:

- Redundancy - An HCI principle to ensure that users do not go off track in reaching the correct target; providing a variety of avenues toward the same outcome ensures that this is maintained. For example, representing a function via both an icon and a text label.

- Pop-Up - Boxes that inform the candidate they have not filled in an answer or not fully comprehended directions are essential to ensure that they get back on the right track.

- Confirmation - During an examination, anxiety levels of candidates are high because of the stakes involved in performing well. As such, pop-ups have been included that display a candidate's actual answer, asking if the answer they selected was what was intended.

Post-Prototype Phase

The post-prototype phase of GUI development involves a test run of the software as is.
At this stage of the interface development, "practitioners in the field of HCI called User-Experience Engineers (UEE) use a variety of methods to generate applications" (Olson & Olson, 2003). Examples of these methods include checklists, heuristic evaluation (Nielsen, 1993), cognitive walkthrough (Lewis et al., 1990), and claims analysis (Carroll & Rosson, 1992), all termed 'formative evaluation', with the similar goal of detecting any error or difficulty that may cause problems for the user. A complete usability test was conducted for the two formats of the LSAT using the cognitive walkthrough (Lewis et al., 1990) technique. The cognitive walkthrough is a methodology for performing theory-based usability evaluations of user interfaces that focuses on a user's cognitive activities; specifically, the goals and knowledge of a user while performing a specific task. The walkthrough consisted of three stages: the preparation, the evaluation, and the interpretation stage. The preparation stage is where information is collected about the specific tasks that examinees have to complete, what constraints are imposed, the examinee population themselves, and any other pertinent information prior to the evaluation stage. At the evaluation stage, questions regarding the reasons why specific functions have been implemented, and design features and their relevance to the user and tasks, are all recorded. The final stage, the interpretation stage, is the culmination of all recorded information from the evaluation stage, used to assess which information falls into the positive category and which into the negative category. The prototype is then edited based on the negative information collected.

Final Product - Second Prototype

After all edits and corrections had been applied to the software toward the second prototype, a final run-through was conducted in terms of functions, typos in text, arrangements, and layout at the actual screen resolution and screen size of the testing computers.
This was also done for the back end of the software, i.e., how the data were recorded for time taken between items and responses. It was decided that for each examinee, time data would be recorded in milliseconds, for a more accurate and detailed rate. It is important to note that this phase is not foolproof, however, and knowledge that some mistakes could have gone undetected was accepted.

Participants

Approval for the use of human subjects was sought, and the number of participants indicated by the power analysis was recruited for the study from a fairly large pool of graduate and undergraduate students attending Michigan State University. Requests for participants were made via academic listservs, advertisements by fellow teaching and graduate assistants in the classes they instruct, graduate school governing bodies such as the Council of Graduate Students (COGS), other graduate organizations, and word of mouth. Approximately 98 subjects participated; approximately 7.84% were undergraduate students, and graduate students comprised the remaining 92%, of which 19.6% were Masters students and 72.5% were Doctoral students.

TEST ADMINISTRATION

The innovative item types were developed using Director 8.5 and uploaded on two Dell computers with 18-inch screens at The Canterbury of MSU office, the Chaplaincy of the Episcopal and Anglican Student Ministry, in close proximity to campus. The time allocated for both CAT test formats was exactly the same as the time prescribed for the pencil and paper format of the past LSAT exam, 35 minutes. Each participant was required to read and then sign a consent or waiver form and go through a tutorial of the exam for approximately 10-15 minutes, to ensure that they became familiar with the test functionalities and layout. They were given pencils and paper to assist them in their tasks, regardless of which format they were assigned, and had the option of stopping at any time.
After completion of the tests, two tests from the Kit of Factor-Referenced Cognitive Tests (Ekstrom et al., 1976) were administered. The Identical Pictures Test had two parts; parts I and II consisted of 96 items in 28 rows of visuals per page, at 4 pages per part. Each examinee was given approximately 90 seconds to complete each part. They were instructed to go through the items and check off which visual from the row of 5 possibilities was identical to the visual on the left. Once completed, the next Kit test, the Finding A's Test, was administered, also in 2 parts, comprised of 5 columns of words per page, at 4 pages per part. In each column, candidates were asked to locate 5 words that contained the letter A and strike them out. Each examinee was given approximately 120 seconds to complete each part. Data from the Kit of Factor-Referenced Cognitive Tests (Ekstrom et al., 1976) were recorded for each candidate and stored. Results from the study are reported in the next chapter.

RESULTS & DATA ANALYSIS

ANALYSIS PROCEDURES

The following analyses were conducted, with response times (RT) for each item (i.e., response times between the presentation of each item and the next item) and correct response as the dependent variables and format as the independent variable: (i) descriptive statistics of total scores for both DCT and LSAT, (ii) Differential Item Functioning (DIF) to determine if versions of the items functioned differently, (iii) estimates of reliability for the tests, (iv) correlation of DCT and LSAT formats to the Kit of Factor-Referenced Tests to obtain a sense of differences in validity, (v) proportion of correct responses to the total number of items answered, (vi) average response times (AVR) to responses, (vii) correlations between response times (RT), and (viii) a multivariate analysis of variance (MANOVA) to determine if there were significant differences between item scores and response times for the two formats.
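The per-item dependent variables above can be illustrated with a short sketch of the summary computation. The record layout and function name are hypothetical, since the study's actual data-logging code (in Director 8.5) is not shown; RTs are assumed to be logged in milliseconds, as described in the test-administration section:

```python
import statistics

# Hypothetical per-response records: (item_number, answered_correctly, rt_ms).
records = [
    (1, True, 157_000), (1, False, 195_000),
    (2, True, 72_190), (2, False, 72_148),
]

def per_item_summary(rows):
    """Proportion of correct responses and mean RT (in seconds) per item."""
    by_item = {}
    for item, correct, rt_ms in rows:
        by_item.setdefault(item, []).append((correct, rt_ms))
    return {
        item: (
            sum(c for c, _ in vals) / len(vals),           # proportion correct
            statistics.mean(rt / 1000 for _, rt in vals),  # mean RT in seconds
        )
        for item, vals in by_item.items()
    }
```

With the sample records above, `per_item_summary(records)[1]` yields a proportion correct of 0.5 and a mean RT of 176.0 seconds for item 1.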
Descriptive Statistics

The means and standard deviations for the total scores of the DCT and LSAT formats are displayed in the table below. A two-sample test of means was computed to determine the p value. There were no significant differences between the two means. However, there was a sizeable difference in variance between the two formats of 6.972, almost 7.0.

Table 3. Descriptive Statistics for DCT and LSAT Formats

Format   No. Observed   Mean   Std. Deviation   Variance
DCT      47             9.96   4.344            18.868
LSAT     50             8.76   3.520            11.896

Differential Item Functioning (DIF)

DIF is defined as the systematic statistical process for detecting performance differences on items among groups of individuals with similar true cognitive ability, regardless of any other characteristic that is 'irrelevant' to the measure (Zumbo, 1999). In this case the irrelevant measure is the format of the item. DIF was conducted to compare the performance of the focal group (examinees taking the DCT format) to the reference group (examinees taking the LSAT format). A one-parameter model of Raju's signed area index (SAI), which is the area between two item characteristic curves (ICCs), was used. For items to display DIF, the values must be > .5 or < -.5 in the Z_SAI column (values in the SAI column were converted to reflect Z standardized values) (Raju, 1988, 1990).

Table 4. DIF Indices: Traditional vs.
DCT Format

Item   DCT Prop. Correct   LSAT Prop. Correct   SAI     Z_SAI
1      0.74                0.70                 -0.85   -1.18044
2      0.49                0.66                 -1.41   -1.97053
3      0.43                0.28                  0.10    0.13229
4      0.40                0.52                  0.04    0.05625
5      0.34                0.26                 -0.74   -0.87059
6      0.85                0.86                 -0.65   -0.77179
7      0.23                0.08                  0.92    0.96628
8      0.47                0.51                 -0.33   -0.46999
9      0.45                0.41                  0.55    0.76865
10     0.39                0.39                 -0.53   -0.71381
11     0.44                0.44                  0.01    0.01415
12     0.40                0.34                 -0.26   -0.34605
13     0.50                0.47                 -0.68   -0.89404
14     0.48                0.27                  0.68    0.88845
15     0.54                0.28                  1.37    1.78057
16     0.38                0.23                  0.76    0.96481
17     0.35                0.17                 -1.38   -1.04137
18     0.91                0.77                  1.17    0.97605
19     0.68                0.30                  3.13    2.54438
20     0.72                0.63                  0.52    0.51005
21     0.50                0.52                 -1.43   -1.23216
22     0.75                0.51                  3.22    1.61282
23     0.65                0.48                 -0.96   -0.67265

The results in Table 4 show that, with the exception of items 3, 4, 8, 11, and 12, all other items displayed DIF. A negative value indicates that an item was more difficult for the focal group (DCT), while a positive value indicates that an item was more difficult for the reference group (LSAT). Eleven items were more difficult for the DCT group, while 12 items were more difficult for the LSAT group. For the DCT group, the DIF analysis for item 2 showed the greatest degree of difference, followed by item 1. These two questions were located at the beginning of the exam and were the first set of analytical reasoning (AR) test items in the new format. Reasons that could have contributed to this include: (i) familiarity with the new format, (ii) initially processing the set of conditions, questions, and multiple choice options available, and (iii) deciding which strategy to undertake, the algorithmic or heuristic path of the two-solution strategy (Schnipke & Scrams, 1999), as information is being processed. DIF values decreased toward the end of the set of test items, as can be seen in items 3 and 4, which had low DIF; the level for item 5 increased, though not to the same degree as items 1 and 2.
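Under the one-parameter model used for this DIF analysis, the signed area between the two item characteristic curves reduces to the difference between the groups' estimated difficulty parameters. The following is a minimal sketch; the function name and the standard-error-based z-standardization are illustrative assumptions, not the study's actual computation:

```python
import math

def raju_sai_1pl(b_reference, b_focal, se_reference, se_focal):
    """Raju's signed area index under a 1PL (Rasch) model.

    With equal slopes, the signed area between the reference and focal
    ICCs equals the difference in difficulty parameters.  Following the
    sign convention in Table 4, a negative value means the item was
    harder for the focal (DCT) group.
    """
    sai = b_reference - b_focal
    # Standardize by the standard error of the difference (an assumed form).
    z_sai = sai / math.hypot(se_reference, se_focal)
    return sai, z_sai

# Example: the focal group finds the item harder (b_focal > b_reference),
# producing a negative index.
sai, z = raju_sai_1pl(b_reference=0.0, b_focal=0.5, se_reference=0.3, se_focal=0.4)
```

An item would then be flagged for DIF when the standardized value falls outside the > .5 or < -.5 band described above.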
As the test progressed, DIF measures for all other items seemed to reflect an upward and downward, sine-curve-like trend, with lower levels of DIF in comparison to the first set of items. This supports Richards' (1987) hypothesis of immediate-delayed retention tests in visual-text formats, with significance in performance occurring in later stages, the creation of visual traces that do not vie for cognitive resources, and the effects of practice or rehearsal on the first set of questions. For the LSAT group, the DIF measures for items increased steadily, with the first significant level at item 15 and the greatest difference occurring for item 19. Higher DIF levels reflect a trend where items located later seem to have greater DIF measures than earlier item sets. This would indicate one or both of the following: (i) a depletion of STM resources in an all-text environment, and (ii) time running out to complete each item accurately. On average, DIF measures for items in the LSAT group had greater values than those for the DCT group. These results will be discussed later in the chapter in relation to the MANOVA results and response time correlations. Items not congruent with the other analyses will also be discussed and the phenomena explored.

Reliability Tests

The reliability of a measurement procedure is defined as its consistency. Table 5 lists the reliabilities and descriptive statistics for the DCT and LSAT total scores and the two Kit of Factor-Referenced tests.

Table 5.
Reliability Estimates and Descriptive Statistics for All Items

DCT              Reliability   Mean     Std. Deviation   Variance
Total Score      0.86674        9.96     4.344            18.868
Finding A's      0.94467       64.04    14.8367          220.129
Identical Pics   0.30          77.596   14.1877          201.290

LSAT
Total Score      0.88705        8.76     3.420            11.696
Finding A's      0.92886       65.94    17.867           319.241
Identical Pics   0.23          73.714   14.2156          202.083

The data indicate that all tests used for the study, with the exception of the Identical Pictures test, had good reliability estimates. The lack of reliability for the Identical Pictures test prompted measures of variance, mean, and standard deviation to be computed, to investigate whether the difference reflects extremely small differences in variances between the two groups. A 41.22% coefficient of variation was estimated based on the combined means and variances. This was a sizeable variance difference in the Identical Pictures test between the two groups.

Validity Correlations

The Pearson Product-Moment Correlation Coefficient (r) was used to determine correlation coefficient estimates to investigate the relationship, if any, between the two Kit Reference tests and both the DCT and LSAT formats, to check the validity of the item constructs (see table below):

Table 6. Correlation Coefficients for DCT and LSAT Correct Responses to Kit of Factor-Referenced Tests

                 Identical Pics   Finding A's
LSAT Responses   .434**           .022
DCT Responses    .632**           -.040

Chi-Square   1.78071
P-Value      0.182

** Correlation is significant at the 0.01 level (p ≤ .01)
* Correlation is significant at the 0.05 level (p ≤ .05)

Both format responses had a positive correlation to the Identical Pictures test, with a slightly higher correlation estimate for the DCT. A comparison of the two independent correlation coefficients was calculated to determine significance. Results indicate that the correlations were not significantly different. As such, both item forms seem to have the same relationship with these variables.
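The comparison of the two independent correlations (DCT r = .632 with n = 47; LSAT r = .434 with n = 50, from Tables 3 and 6) can be reconstructed with Fisher's r-to-z transformation. This is a sketch of the standard procedure, not the study's actual code, but squaring the resulting z statistic lands close to the chi-square of 1.78 reported in Table 6:

```python
import math

def fisher_z(r):
    """Fisher r-to-z transformation of a correlation coefficient."""
    return 0.5 * math.log((1 + r) / (1 - r))

def compare_independent_rs(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    se_diff = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se_diff

z = compare_independent_rs(0.632, 47, 0.434, 50)  # approximately 1.33
# z**2 is approximately 1.78, well under the 3.84 needed for p < .05 on 1 df,
# consistent with the conclusion that the correlations do not differ.
```

This agreement suggests the reported chi-square is simply the squared Fisher z test of the two Identical Pictures correlations.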
Proportion of Correct Responses

The proportion of correct responses based on the number of items attempted was calculated. This proportion was computed because the LSAT does not penalize an examinee for a wrong answer. Figure 3 illustrates the proportion of correct scores to the number of items answered for both groups. A greater number of correct responses occurred in the 3.75 - 6.5 range for DCT and 3 - 5.75 for LSAT. The highest score in the LSAT was slightly lower at 9.25, compared to DCT at 10. The lowest score was 0.0 for DCT versus 1.5 for LSAT. An analysis of variance (ANOVA) was run to determine the level of significance. A p ≤ .07 level was taken as significant because of the small number of participants in the study.

Figure 3. Correct Responses to the Proportion of Items Answered for DCT & LSAT

Average Response Times (AVR) - Answers

The average response times for both groups were calculated to determine if the ≤ 78-second goal was reached over the full set of items (Bridgeman & Cline, 2000). Note, however, that the format and layout of the GRE-A and GRE-Q differ from those of the LSAT developed for CAT environments, and performance would undoubtedly be affected, for the following reasons:

1. The GRE-Q and GRE-A items were presented one question at a time with no option to review responses.
2. The GRE-Q had a time constraint of 45 min. (2700 sec) for 28 items vs. 23 items with a time constraint of 35 min. (2100 sec) for the LSAT. This works out to approximately 96 sec. at most per item for the GRE and 91 sec. per item for the LSAT.
3. There exist two sections of AR in the LSAT vs. one for the GRE-Q.
4. There is no penalty for wrong responses on the LSAT.
5. The LSAT is a partially speeded test.

The table below illustrates the Mean Response Times (MRT) of examinees in both groups. Approximately 12 items in the two format groups either reached the ≤ 78-second target goal or were below this margin.
Items 16, 23, and 22 had the fastest RTs, respectively, for both DCT and LSAT, with faster RTs recorded for LSAT. The greatest difference between DCT and LSAT was for item 17, with an RT difference of 19.192 and a higher proportion of correct responses to total answered at 0.35 for DCT versus 0.17 for LSAT. This supports Schnipke and Scrams's (1999) hypothesis of higher scores for difficult items having longer RTs among examinees. The following box plots are attached in the appendices: MRT to proportion of correct answers for both DCT and LSAT, and the MRT for both DCT and LSAT (Appendix H).

Table 7. Mean Response Times to Proportion of Correct-Total Items Answered

       DCT               LSAT
Item   RT        Ans     RT        Ans
1      157       0.74    195       0.70
2      72.19     0.49    72.148    0.66
3      70.502    0.43    56.55     0.28
4      98.353    0.40    112.497   0.52
5      112.119   0.34    125.244   0.26
6      185.95    0.85    165.381   0.86
7      137.549   0.23    134.231   0.08
8      74.009    0.47    81.828    0.51
9      138.07    0.45    119.556   0.41
10     120.162   0.39    103.205   0.39
11     249.86    0.44    225.667   0.44
12     126.298   0.40    96.486    0.34
13     52.427    0.50    54.476    0.47
14     76.805    0.48    75.217    0.27
15     79.512    0.54    64.455    0.28
16     40.582    0.38    35.428    0.23
17     75.743    0.35    56.551    0.17
18     99.094    0.91    103.785   0.77
19     81.588    0.68    76.182    0.30
20     60.311    0.72    55.671    0.63
21     71.38     0.50    70.972    0.52
22     49.567    0.75    41.26     0.51
23     46.71     0.65    39.393    0.48

Time Correlations

The Pearson Product-Moment Correlation Coefficients (r) were calculated for the response time taken for each item. The correlation table is attached in the appendix (Appendix I). Time correlations were investigated per set of item conditions: T1-T5, T6-T10, T11-T17, and T18-T23. Based on Scrams and Schnipke's (1999) theory of the importance of item location, the negative correlations between T1-T5 and T18-T23 are strong. Any increase in Set 1 reflects a decrease in the last set.
This probably reflects examinees running out of time, hence the faster RTs and the lack of time management allotted to each item, which lends support to 'the planning fallacy' theory (Kahneman & Tversky, 1979). T1, however, was the exception, as this is the first item at the beginning of the test, which takes examinees a longer time to process information and familiarize themselves with both formats. Only T1 and T12 had a correlation to format, a positive relationship for T1 and a negative one for T12. The increase in T1 reflects examinees trying to accustom themselves for the first time to both CAT formats. T12 is midway through the test, and any effects of the format are negligible at this point. The negative relationship to format reflects a depletion of STM and cognitive overload.

Multivariate Analysis of Variance (MANOVA)

An analysis was conducted on the test items for both DCT and LSAT to investigate the effect of format (DCT-LSAT) on response times (RT) and the proportion of correct responses. Effects of format were significant for the following responses: 2, 7, 14, 15, 17, 18, 19, and 22, and for response times taken for the following items: 1, 12, and 17. Though the alpha level was set at p ≤ .05, significance levels of 0.09 for item 2, 0.07 for item 17, and 0.09 for item 18 were included because of the small number of participants. Response times for items 1, 12, and 17 were significant at the .05 level, and two item-time combinations, I14-T14 and I17-T17, were also significant. The MANOVA table is attached in the appendix (Appendix J).

SUMMARY AND DISCUSSION

The main focus of this exploratory study was to determine whether DCT item formats would be a more efficient measure of problem solving capabilities in the AR section of the LSAT. Efficiency here is defined as a decrease in MRT combined with correct responses. The results and analyses reflect the following key points.
Item Location

The DIF levels among the DCT items and the significance of items from the MANOVA results point to the following items having significant differences: 2, 7, 12, 14, 15, 17, 19, and 22. These items, in combination with the MRT for the group, reflect a slow increase in MRT but with higher accuracy rates (see Table 7) when compared to the LSAT group, which reflects faster MRT rates but less accurate scores. The DIF measures of the later-placed items for the LSAT group would suggest a lack of time available to fully process the information, which resulted in guessing and higher perceived difficulty levels of items. The time correlation table also reflects a negative relationship between early items and later items. The DCT group seems to reflect the algorithmic strategy as proposed by Schnipke and Scrams (1999), with greater MRT rates but higher-accuracy responses. The DCT format allowed for either a more accurate 'guessing' strategy provided by the visuals-text, or faster recall of stored information because of the presence of paired visual-text cues that afford faster forward and backward recall (Paivio, 1986, 1990, 1991). This also gives evidence for studies of VSTM, with spatial locations aiding in the book-marking and referencing of objects in relation to surrounding objects. This relational processing of visual units increases VSTM capacity over periods of lag time (Chun & Jiang, 1998). As the test progressed, the immediate-delayed effect of visuals-text on performance (Richards, 1987) was supported and reflected in response accuracies, where significant increases in correct responses occurred in later items. This immediate-delayed effect may have resulted from examinees becoming familiar with the visual-text format (practice-rehearsal), faster processing rates of visual information (Neisser, 1967; Phillips, 1974; Simons, 1996; Sperling, 1960), and improved forward and backward recall of information (Paivio, 1986, 1990, 1991).
Response Times and Speededness

The target MRT rates were not realized here, as the RTs for the DCT format were higher than the LSAT format RTs. However, given Schnipke and Scrams's (1999) two-solution strategy (see Figure 2), where higher asymptotic levels of accuracy are reached with higher response times in a strictly time-speeded test, the goal of DCT format items having lower RTs and higher proportions of correct responses was unrealistic. The definition of efficiency would thus need to be changed: higher RTs with correct responses equals an overall increase in correct responses within the time constraints. Table 7 thus supports this hypothesis, where AVR for the LSAT group, when compared to the DCT group, was on the whole lower, with lower response accuracy. Differences in time taken for I1, I12, and I17 were significant in the MANOVA analysis. T1 was significant, as it is the time taken to respond to the first item of the AR LSAT test, where examinees had to process the directions, questions, and conditions. I12 is midway through the AR test, and Table 7 indicates that after I12, times for the LSAT group, with the exception of T13 and T18, were lower, along with lower accuracy rates. This would suggest a depletion of STM in a text-only format, and the occurrence of guessing in the last subset of items, 18-23. The DIF analyses for these items were higher than for the DCT group, which lends support to the lower asymptotic levels of accuracy when the heuristic strategy is selected. Both time and item were significant for item 17. The RT difference between the DCT and LSAT groups, interacting with the item, was largest here: approximately 20 seconds more time taken by the DCT group, with 0.18 greater response accuracy. In addition, the DIF analysis for I17, which was the last item of the 3rd set, showed the greatest difference in difficulty between the DCT group and the LSAT group.
Though it was more difficult for the DCT group, respondents had more accurate responses at a higher response time. After item 17, I18 to I23 seemed increasingly more difficult for the LSAT group, giving evidence for the Law of Diminishing Returns: any increase in time added would not result in much change in the accuracy of responses.

Time Correlations

Overall, I18 - I23 were correlated negatively with items before I12, i.e., the first half of the AR test. Higher RTs taken in the first half of the test are reflected in lower RTs in the second half of the test. This again supports Schnipke and Scrams's (1999) two-solution strategy in a time-speeded task. The heuristic strategy, which implies guessing at a lower RT with lower asymptotic accuracies, gives evidence for the algorithmic strategy as the better solution for a time-speeded task. The findings point toward the DCT format as having a significant impact on the performance of examinees, which will be discussed in the next chapter. A review of the implications and weaknesses of the study will be presented and suggestions given for future research.

IMPLICATIONS AND FUTURE RESEARCH

Results from the study indicate that there are significant findings on the impact of utilizing a DCT format of testing in CAT environments. Paivio's (1986, 1990, 1991) theory of the supremacy of visuals and text in information processing and recall is evident in the results. Using a DCT format in a domain that requires the examinee to multi-task within time constraints will preserve the examinees' cognitive resources and provide a more efficient instrument as a measure of problem solving capabilities. This would prevent lower scores attributed to fatigue, as much of the LSAT is text based. The following results were realized in this study:

1. Mean response times for the DCT format items were realized (≤ 78 secs).
Though times were very close or almost equal to RTs in the LSAT section, the results revealed that the advantages of a visual-text format had an effect on later items. This was attributed to examinees becoming familiar with the DCT format, the visual-text format aiding in preserving cognitive resources, and the immediate-delayed test studies with visuals-text conducted by Richards (1987).

2. The range of correct responses for DCT was wider, with a larger proportion of correct answers occurring between 4 - 6, versus the LSAT with the bulk of correct answers occurring between 2 - 5. Familiarity with the visual-text format again comes into play here. This also supports the visual recency effect (Phillips & Christie, 1977a; Broadbent & Broadbent, 1981), where performance of complex and demanding tasks occurring simultaneously was not affected, because of the visual-text format.

3. Location of test items, as purported by Scrams and Schnipke (1999), had an effect on RTs and performance for both formats.

4. Reliability estimates of DCT items were as high as or equal to the LSAT reliability measures, at 0.86 and 0.88 respectively.

5. The correlation coefficient measures indicate that DCT items maintained their validity, with visuals having no significant impact on the item constructs.

6. Higher accuracy scores were reached in the DCT format, but with higher RTs when compared to the LSAT group. Though lower RTs plus correct responses were predicted, this was found to be an unrealistic goal because of Schnipke and Scrams's (1999) study of time speededness and accuracy.

Limitations of the Study

This experimental study was conducted with a relatively small sample size. As such, a more sensitive study could have been conducted if the sample size were substantially increased.
An increased sample size would allow a more accurate investigation of whether the number of items reached or answered, correct responses, and response times reflect stronger evidence for the Dual Coding Theory hypothesis. In addition, as with past research on visual-text assessments, many discovered that the advantages of these formats were only realized (i) after familiarity or rehearsal with the formats or training sessions and (ii) after an extended test period. A recommendation would be to conduct the experiment with better examinee preparation, beyond the 10-15 minute tutorial, or in two experimental time phases. The test format could be developed over two sets of analytical reasoning (AR) sections, for a total of 35 mins, to: (a) detect the immediate-delayed effects of visuals over a greater number of items, (b) prevent any isolated or rare occurrences, (c) investigate the extent of the limits of a text-only format, and (d) preserve cognitive resources with a visual-text format.

The selection and design of visuals for the DCT format needs to be further researched by studying the problem solving diagrams drawn by examinees of past LSAT exams, to determine a closest-to-fit generalization of a problem solving mental model among individuals. This would produce a more efficient and accurate visual-text test construct.

Future Research

The object of this study was to determine if DCT item formats developed for CAT environments were more efficient measures of a candidate's cognitive capabilities of problem solving when compared to the traditional LSAT formats. Many of the objectives of this study have been realized with regard to accuracy of responses, supremacy of visuals, and location of items, with the exception of response times.
While a variety of current investigations focus on the creation of new and innovative items in the field of assessment, the critical issues involved in the information processing and problem solving of higher-order cognitive tasks need to be fully understood and researched prior to the development of any innovative or novel item type. Pertinent issues of GUI design and information architecture (IA) are additional domains that need to be investigated if tests are to be conducted in CAT environments. As such, research in the fields of cognitive psychology and HCI must occur alongside psychometrics to arrive at more authentic and effective measures of theta (θ).

REFERENCES

Anderson, J. R. & Bower, G. H. (1973). Human Associative Memory. Washington, DC: Winston and Sons.

Anthony, L. C., Harris, V. F. & Pashley, P. J. (1999). Predictive Validity Of The LSAT: A National Summary Of The 1995-1996 Correlation Studies (LSAC Research Report No. 97-01). Newtown, PA: LSAC (Law School Admissions Council).

Baddeley, A. (1992). Working Memory. Science, 255, 556-559.

Baddeley, A. D. (1997). Human Memory: Theory And Practice. Boston: Allyn and Bacon.

Barnard, P. (1986). Interacting Cognitive Subsystems: A Psycholinguistic Approach To Short-Term Memory. In A. Ellis (Ed.), Progress In The Psychology Of Language (pp. 197-258). London: Lawrence Erlbaum Associates.

Bernard, M., Mills, M., Frank, T. & McKnown, J. (2001). Which Fonts Do Children Prefer To Read Online? Usability News, Winter 2001. Retrieved June 8, 2001 from http://wsupsy.psy.twsu.edu/surl/usabilitynews/3W/fontJR.htm

Begg, I. (1972). Recall Of Meaningful Phrases. Journal Of Verbal Learning And Verbal Behavior, 19, 431-439.

Bobko, D. J., Schiffman, H. R., Castino, J. R. & Chiapetta, W. (1977). Contextual Effects On Duration Experience. American Journal Of Psychology, 90, 577-586.

Bodner, R. (1994). A Comparison Of Identification Rates Of Static And Animated Buttons. Dept.
Of Computer And Information Science, University of Guelph, Ontario, Canada.

Bridgeman, B. & Cline, F. (2000). Variations In Mean Response Times For Questions On The Computer-Adaptive GRE General Test: Implications For Fair Assessment (GRE Board Professional Report No. 96-20P; ETS RR 00-7). Princeton, NJ: ETS (Educational Testing Service).

Bridgeman, B., Cline, F. & Hessinger, J. (2003). Effect Of Extra Time On GRE Quantitative And Verbal Scores (ETS Rep. No. 03-13, GRE Rep. No. 00-03P). Princeton, NJ: ETS (Educational Testing Service).

Broadbent, D. E. & Broadbent, M. H. P. (1981). Recency Effects In Visual Memory. Quarterly Journal Of Experimental Psychology, 33A, 1-15.

Brown, J. W. (1948). A Comparison Of Verbal And Projected Verbal-Pictorial Tests As Measures Of The Ability To Apply Science Principles (Doctoral dissertation, The University of Chicago, 1948). Dissertation Abstracts International, ADD W1948, 217.

Brown, S. W. & Boltz, M. G. (2002). Attentional Processes In Time Perception: Effects Of Mental Workload And Event Structure. Journal Of Experimental Psychology: Human Perception And Performance, 28(3), 600-615.

Carpenter, P. A., Just, M. A. & Shell, P. (1990). What One Intelligence Test Measures: A Theoretical Account Of The Processing In The Raven Progressive Matrices Test. Psychological Review, 97(3), 404-431.

Carroll, J. M. & Rosson, M. B. (1992). Getting Around The Task-Artifact Cycle: How To Make Claims And Design By Scenario. ACM Transactions On Information Systems, 10(2), 181-212.

Chatman, S. P., Reynolds, C. R. & Wilson, V. L. (1984). Multiple Indexes Of Test Scatter On The Kaufman Assessment Battery For Children. Journal Of Learning Disabilities, 17, 523-531.

Cheng, P. C-H. (1996). Functional Roles For The Cognitive Analysis Of Diagrams In Problem Solving. In G. W. Cottrell (Ed.), Proceedings Of The Eighteenth Annual Conference Of The Cognitive Science Society (pp. 207-212). Hillsdale, NJ: Lawrence Erlbaum.

Chun, M. M.
& Jiang, Y. (1998). Contextual Cueing: Implicit Learning And Memory Of Visual Context Guides Spatial Attention. Cognitive Psychology, 36, 28-71.

Clarke, H. H. & Chase, W. G. (1972). On The Process Of Comparing Sentences Against Pictures. Cognitive Psychology, 3, 472-517.

Craik, F. & Lockhart, R. (1972). Levels Of Processing: A Framework For Memory Research. Journal Of Verbal Learning And Verbal Behavior, 11, 671-684.

Csapo, K. (1991). Picture Superiority In Free Recall: Imagery Of Dual Coding. In A. Paivio (Ed.), Images In Mind: The Evolution Of A Theory (pp. 76-106). New York: Harvester Wheatsheaf.

Cronbach, L. J. (1975). Beyond The Two Disciplines Of Scientific Psychology. American Psychologist, 12, 671-684.

Daneman, M. & Carpenter, P. (1980). Individual Differences In Working Memory And Reading. Journal Of Verbal Learning And Verbal Behavior, 19, 450-466.

Davidson, R. E. & Adams, J. F. (1970). Verbal And Imagery Processes In Children's Paired Associative Learning. Journal Of Experimental Child Psychology, 9, 429-435.

Decortis, F., de Keyser, V., Cacciabue, P. C. & Volta, G. (1991). The Temporal Dimension Of Man-Machine Interaction. In G. R. S. Weir & J. L. Alty (Eds.), Human-Computer Interaction And Complex Systems. Glasgow, UK: Academic Press.

Doost, R. & Turvey, M. T. (1971). Iconic Memory And Central Processing Capacity. Perception And Psychophysics, 9, 269-274.

Driscoll, M. P. (1994). Psychology Of Learning For Instruction. Needham, MA: Allyn and Bacon.

Dwyer, F. M. & De Melo, H. (1984). Effects Of Mode Of Instruction, Testing, Order Of Testing, And Cued Recall On Student Achievement. Journal Of Experimental Education, 52, 86-94.

Eisenstadt, M. & Kareev, Y. (1975). Aspects Of Human Problem Solving: The Use Of Internal Representations. In D. A. Norman & D. E. Rumelhart (Eds.), Explorations In Cognition (pp. 308-346). San Francisco, CA: Freeman.

Ekstrom, R. B., French, J. W., Harman, H. H. & Dermen, D. (1976). Kit Of Factor-Referenced Cognitive Tests.
Princeton, NJ: ETS (Educational Testing Service).

Epstein, W., Rock, I. & Zuckerman, C. B. (1960). Meaning And Familiarity In Associative Learning. Psychological Monographs, 74, 491.

Farah, M. J., Hammond, K. M., Levine, D. N. & Calvanio, R. (1988). Visual And Spatial Mental Imagery: Dissociable Systems Of Representation. Cognitive Psychology, 20, 439-462.

Frederiksen, N. & Ward, W. C. (1978). Measures For The Study Of Creativity In Scientific Problem-Solving. Applied Psychological Measurement, 2(1), 1-24.

Furst, A. J. & Hitch, G. J. (2000). Separate Roles For Executive And Phonological Components Of Working Memory In Mental Arithmetic. Memory And Cognition, 28, 774-782.

Galton, F. (1880a). Statistics On Mental Imagery. Mind, 5, 301-318.

Galton, F. (1880b). Visualised Numerals. Nature, 21, 252-256.

Galton, F. (1880c). Visualised Numerals. Nature, 21, 494-495.

Galton, F. (1880d). Visualised Numerals. Journal Of The Anthropological Institute, 10, 85-102.

Ginther, A. (2002). The Effects Of The Presence And Absence Of Visual Accompaniments On Performance On TOEFL Listening Comprehension Stimuli (ETS RR-66). Princeton, NJ: Educational Testing Service.

Glaser, W. R. & Glaser, M. O. (1989). Context Effects In Stroop-Like Word And Picture Processing. Journal Of Experimental Psychology: General, 118, 13-42.

Glenberg, A. M. & Langston, W. E. (1992). Comprehension Of Illustrated Text: Pictures Help To Build Mental Models. Journal Of Memory And Language, 31, 129-151.

Goolkasian, P. (1996). Picture-Word Differences In A Sentence Verification Task. Memory And Cognition, 24, 584-594.

Goolkasian, P. (1999). Retinal Location And Its Effect On The Spatial Distribution Of Visual Attention. American Journal Of Psychology, 112(2), 187-211.

Guilford, J. P. (1967). The Nature Of Human Intelligence. New York: McGraw-Hill.

Gyselinck, V., Cornoldi, C., Dubois, V. & Ehrlich, M-F. (2002). Visuospatial Memory And Phonological Loop In Learning From Multimedia.
Applied Cognitive Psychology, 16, 665-685.

Hackos, J. T. & Redish, J. C. (1998). User And Task Analysis For Interface Design. New York: Wiley Computer Publishing.

Haladyna, T. M. (1997). Writing Test Items To Evaluate Higher Order Thinking. Boston: Allyn and Bacon.

Haramundanis, K. (1996). Why Icons Cannot Stand Alone. Journal Of Computer Documentation, 22(1), 49-51.

Harmes, J. C. (1999). Computer-Based Testing: Toward The Design And Use Of Innovative Items. November 22, University of South Florida.

Hayes, J. R. (1973). On The Function Of Visual Imagery In Elementary Mathematics. In W. Chase (Ed.), Visual Information Processing (pp. 177-214). New York: Academic Press.

Heathcote, D. (1994). The Role Of Visuo-Spatial Working Memory In The Mental Addition Of Multi-Digit Addends. Current Psychology Of Cognition, 13, 207-245.

Hegarty, M. & Kozhevnikov, M. (1999). Types Of Visual-Spatial Representations And Mathematical Problem-Solving. Journal Of Educational Psychology, 91, 684-689.

Henderson, W. D. (2004). Speed As A Variable On The LSAT And Law School Exams (LSAC Research Report No. 03-03). Newtown, PA: LSAC (Law School Admissions Council).

Hitch, G. J. (1978). The Role Of Short-Term Memory In Mental Arithmetic. Cognitive Psychology, 10, 203-323.

Hornof, A. J. (2001). Visual Search And Mouse-Pointing In Labeled Versus Unlabeled Two-Dimensional Visual Hierarchies. ACM Transactions On Computer-Human Interaction, 8(3), 171-197.

Humphreys, G. W., Riddoch, M. J. & Quinlan, P. T. (1988). Cascade Processes In Picture Identification. Cognitive Neuropsychology, 5, 67-103.

Intraub, H. (1997). The Representation Of Visual Scenes. Trends In Cognitive Sciences, 1(6), 217-221.

Irwin, D. E., Yantis, S. & Jonides, J. (1983). Evidence Against Visual Integration Across Saccadic Eye Movements. Perception And Psychophysics, 34, 49-57.

Irwin, D. E., Brown, J. S. & Sun, J. S. (1988). Visual Masking And Visual Integration Across Saccadic Eye Movements.
Journal Of Experimental Psychology: General, 117, 276-287.

Irwin, D. E. (1991). Information Integration Across Saccadic Eye Movements. Cognitive Psychology, 23, 420-456.

Jiang, Y., Olson, I. R. & Chun, M. M. (2000). Organization Of Visual Short-Term Memory. Journal Of Experimental Psychology: Learning, Memory, And Cognition, 26, 683-702.

Kaufman, A. S. (1976a). A New Approach To The Interpretation Of Test Scatter On The WISC-R. Journal Of Learning Disabilities, 9, 33-41.

Kaufman, A. S. (1976b). Verbal-Performance IQ Discrepancies On The WISC-R. Journal Of Consulting And Clinical Psychology, 44, 739-744.

Kaufman, A. S. (1990). Assessing Adolescent And Adult Intelligence. Boston: Allyn and Bacon.

Kellogg, G. S. & Howe, M. J. A. (1971). Using Words And Pictures In Foreign Language Learning. Alberta Journal Of Educational Research, 17, 87-94.

Kosslyn, S. M. (1994). Image And Brain: The Resolution Of The Imagery Debate. Cambridge, MA: MIT Press.

Kosslyn, S. M. (1995). Mental Imagery. In S. M. Kosslyn & D. Osherson (Eds.), An Invitation To Cognitive Science: Visual Cognition (Vol. 2). Cambridge, MA: MIT Press.

Kozhevnikov, M., Hegarty, M. & Mayer, R. E. (2002). Revising The Visualizer/Verbalizer Dimension: Evidence For Two Types Of Visualizers. Cognition And Instruction, 20, 47-77.

Kyllonen, P. C. & Christal, R. E. (1990). Reasoning Ability Is (Little More Than) Working Memory Capacity?! Intelligence, 14, 389-433.

Larkin, J. J. & Simon, H. A. (1987). Why A Diagram Is (Sometimes) Worth Ten Thousand Words. Cognitive Science, 11, 65-99.

Lefkowith, E. F. (1955). The Effect Of Pictorial Stimuli Similarity In Teaching And Testing (Doctoral dissertation, The Pennsylvania State University, 1955). Dissertation Abstracts International, 18, 473.

Levin, D. T. & Simons, D. J. (1997). Failure To Detect Changes To Attended Objects In Motion Pictures. Psychonomic Bulletin And Review, 4(4), 501-506.

Lewis, C., Polson, P., Wharton, C. & Rieman, J.
(1990). Testing A Walkthrough Methodology For Theory-Based Design Of Walk-Up-And-Use Interfaces. Proceedings Of The ACM CHI '90 (pp. 235-242). Seattle, WA.

Logan, G. (1988). Toward An Instance Theory Of Automatization. Psychological Review, 95(4), 492-527.

Logie, R. H. (1995). Visuo-Spatial Working Memory: Issues In Cognitive Psychology. Hove, UK; Hillsdale, USA: Lawrence Erlbaum Associates.

Lohse, G., Biolsi, K., Walker, N. & Rueter, H. (1994). A Classification Of Visual Representations. Communications Of The ACM, 37(12), 36-49.

Luck, S. J. & Vogel, E. K. (1997). The Capacity Of Visual Working Memory For Features And Conjunctions. Nature, 390, 279-281.

Marschark, M. & Cornoldi, C. (1991). Imagery And Verbal Memory. In C. Cornoldi & M. A. McDaniel (Eds.), Imagery And Cognition (pp. 41-56). New York: Springer.

Marschark, M. & Hunt, R. R. (1989). A Reexamination Of The Role Of Imagery In Learning And Memory. Journal Of Experimental Psychology: Learning, Memory, And Cognition, 15, 710-720.

Marschark, M. & Paivio, A. (1977). Integrative Processing Of Concrete And Abstract Sentences. Journal Of Verbal Learning And Verbal Behavior, 16, 217-231.

Matarazzo, J. D., Daniel, M. H., Prifitera, A. & Herman, D. O. (1988). Inter-Subtest Scatter In The WAIS-R Standardization Sample. Journal Of Clinical Psychology, 44, 940-950.

Matarazzo, J. D. & Herman, D. O. (1985). Clinical Uses Of The WAIS-R: Base Rates Of Differences Between VIQ And PIQ In The WAIS-R Standardization Sample. In B. B. Wolman (Ed.), Handbook Of Intelligence (pp. 899-932). New York: Wiley.

Mayer, R. E. & Gallini, J. K. (1990). When Is An Illustration Worth Ten Thousand Words? Journal Of Educational Psychology, 82(4), 715-726.

McCormick, B. H., DeFanti, T. A. & Brown, M. D. (1987). Visualization In Scientific Computing - A Synopsis. IEEE Computer Graphics And Applications, 7(4), 61-70.

McDaniel, M. A. & Pressley, M. (Eds.). Imagery And Related Mnemonic Processes: Theories, Individual Differences, And Applications. New York: Springer-Verlag.

McLean, J. E., Kaufman, A. S.
& Reynolds, C. R. (1989). Base Rates Of WAIS-R Subtest Scatter As A Guide For Clinical And Neuropsychological Assessment. Journal Of Clinical Psychology, 45, 919-926.

McNeal, J. M. (1994). Effect Of Rehearsal Strategies And Testing Format On Student Achievement Of Different Educational Objectives (Doctoral dissertation, The Pennsylvania State University, 1994). Dissertation Abstracts International, DAI-A 55/12, 3820.

Mel, B. W. (1986). A Connectionist Learning Model For 3-Dimensional Mental Rotation, Zoom, And Pan. Proceedings Of The Eighth Annual Conference Of The Cognitive Science Society (pp. 562-571). Hillsdale, NJ: Lawrence Erlbaum Associates.

Misanchuk, E. R., Schwier, R. A. & Boling, E. (1999). Visual Design For Instructional Multimedia. Proceedings Of The World Conference On Educational Multimedia, Hypermedia And Telecommunications, Seattle, Washington.

Molitor, S. (1989). Developing And Manipulating Knowledge By Writing. In P. Boscolo (Ed.), Writing: Trends In European Research (pp. 160-171). Padova: UPSEL Editore.

Moses, B. E. (1980). The Relationship Between Visual Thinking Tasks And Problem-Solving Performance. Paper presented at the annual meeting of the American Educational Research Association, Boston.

Neisser, U. (1967). Cognitive Psychology. Englewood Cliffs, NJ: Prentice-Hall.

Ngo, D. C. L., Teo, L. S. & Byrne, J. G. (2000). A Mathematical Theory Of Interface Aesthetics. Visual Mathematics: Art And Science Electronic Journal, 2(4). Retrieved November 2003 from http://www.members.tripod.com/vismath4/ngo/

Nickerson, R. S. (1965). Short-Term Memory For Complex Meaningful Visual Configurations: A Demonstration Of Capacity. Canadian Journal Of Psychology, 19, 155-160.

Nielsen, J. (1993). Usability Engineering. Boston, MA: Academic Press.

Norman, D. A. (1986). Cognitive Engineering. In D. A. Norman & S.
Draper (Eds.), User Centered System Design: New Perspectives On Human-Computer Interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.

Norman, D. A. (1993). Things That Make Us Smart. Cambridge, MA: Perseus Publishing.

Norman, D. A. (1999). Affordance, Conventions, And Design. Interactions, 6(3), 38-43.

Olson, G. M. & Olson, J. S. (2003). Human-Computer Interaction: Psychological Aspects Of The Human Use Of Computing. Annual Review Of Psychology, 54, 491-516.

Paivio, A. (1986). Mental Representations (Oxford Psychology Series, Vol. 9). New York and Oxford: Oxford University Press.

Paivio, A. (1990). Mental Representations: A Dual Coding Approach. New York: Oxford University Press.

Paivio, A. (1991). Images In Mind: The Evolution Of A Theory. New York: Harvester Wheatsheaf.

Paivio, A. & Begg, I. (1981). Psychology Of Language. Englewood Cliffs, NJ: Prentice Hall.

Park, O. & Hopkins, R. (1993). Instructional Conditions For Using Dynamic Visual Displays. Instructional Science, 21, 427-449.

Parshall, C., Davey, T. & Pashley, P. (2000). Innovative Item Types For Computerized Testing. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized Adaptive Testing: Theory And Practice (pp. 129-149). The Netherlands: Kluwer Academic Publishers.

Pashler, H. (1988). Familiarity And Visual Change Detection. Perception And Psychophysics, 44(4), 369-378.

Phillips, W. A. (1974). On The Distinction Between Sensory Storage And Short-Term Visual Memory. Perception And Psychophysics, 16(2), 283-290.

Phillips, W. A. & Christie, D. F. M. (1977a). Components Of Visual Memory. Quarterly Journal Of Experimental Psychology, 29, 117-133.

Potter, M. C. & Faulconer, B. A. (1975). Time To Understand Pictures And Words. Nature, 253, 437-438.

Presmeg, N. C. (1986b). Visualization In High School Mathematics. For The Learning Of Mathematics, 6(3), 42-46.

Pullar, D. & Egenhofer, M. (1988). Towards Formal Definitions Of Topological Relations Among Spatial Objects.
Proceedings Of The Third International Symposium On Spatial Data Handling (pp. 225-241). Sydney, Australia.

Raju, N. S. (1988). The Area Between Two Item Characteristic Curves. Psychometrika, 53, 495-502.

Raju, N. S. (1990). Determining The Significance Of Estimated Signed And Unsigned Areas Between Two Item Response Functions. Applied Psychological Measurement, 14, 197-207.

Rayner, K. & Pollatsek, A. (1983). Is Visual Information Integrated Across Saccades? Perception And Psychophysics, 34(1), 39-48.

Reese, H. (1972). Imagery And Multiple-List Paired-Associative Learning In Young Children. Journal Of Experimental Child Psychology, 9, 310-323.

Reiber, L. P. & Kini, R. S. (1991). Theoretical Foundations Of Instructional Applications For Computer-Generated Animated Visuals. Journal Of Computer-Based Instruction, 17, 83-88.

Reinert, H. (1976). One Picture Is Worth A Thousand Words? Not Necessarily! Modern Language Journal, 60, 160-168.

Rensink, R. A., O'Regan, J. K. & Clark, J. J. (1997). To See Or Not To See: The Need For Attention To Perceive Changes In Scenes. Psychological Science, 8(5), 368-373.

Repp, B. H. & Penel, A. (2002). Auditory Dominance In Temporal Processing: New Evidence From Synchronization With Simultaneous Visual And Auditory Sequences. Journal Of Experimental Psychology: Human Perception And Performance, 28(5), 1085-1099.

Richards, D. R. (1987). An Experimental Assessment Of The Relative Effectiveness Of Varied Types Of Computer-Generated Feedback Strategies In Facilitating Achievement Of Different Educational Objectives As Measured By Verbal And Visual Tests (Doctoral dissertation, The Pennsylvania State University, 1987). Dissertation Abstracts International, 48(10), 2528.

Rieber, L. P. (1994). Computers, Graphics, And Learning. Madison, WI: WCB Brown and Benchmark.

Riddoch, M. J. & Humphreys, G. W. (1987a). Visual Object Processing In Optic Aphasia: A Case Of Semantic Access Agnosia. Cognitive Neuropsychology, 4, 131-185.

Riddoch, M. J. & Humphreys, G. W.
(1987b). Picture Naming. In G. W. Humphreys & M. J. Riddoch (Eds.), Visual Object Processing: A Cognitive Neuropsychological Approach (pp. 107-143). London: Erlbaum UK.

Roediger, H. L. & Weldon, M. S. (1987). Reversing The Picture Superiority Effect. In M. A. McDaniel & M. Pressley (Eds.), Imagery And Related Mnemonic Processes: Theories, Individual Differences, And Applications (pp. 151-174). New York: Springer-Verlag.

Rohwer, W. D., Lynch, S., Levin, J. R. & Suzuki, N. (1967). Pictorial And Verbal Factors In The Efficient Learning Of Paired Associates. Journal Of Educational Psychology, 58, 278-284.

Rosenthal, B. L. & Kamphaus, R. W. (1988). Interpretive Tables For Test Scatter On The Stanford-Binet Intelligence Scale: Fourth Edition. Journal Of Psychoeducational Assessment, 6, 359-370.

Roth, S. (1995). Visual Literacy And The Design Of Digital Media. Computer Graphics, 45-47.

Rumelhart, D. E. & Norman, D. A. (1988). Representation In Memory. In R. C. Atkinson, R. J. Herrnstein, G. Lindzey & R. D. Luce (Eds.), Stevens' Handbook Of Experimental Psychology. New York: Wiley.

Sadoski, M., Goetz, E. T. & Avila, E. (1995). Concreteness Effects In Text Recall: Dual Coding And Context Availability? Reading Research Quarterly, 30, 278-288.

Sadoski, M., Goetz, E. T. & Fritz, J. B. (1991). Impact Of Concreteness On Comprehensibility, Interest, And Memory For Text: Implications For Dual Coding Theory And Text Design. Journal Of Educational Psychology, 85(2), 291-304.

Sadoski, M., Paivio, A. & Goetz, E. T. (1991). A Critique Of Schema Theory In Reading And A Dual Coding Alternative. Reading Research Quarterly, 26(4), 463-485.

Sattler, J. M. (1988). Assessment Of Children's Intelligence And Special Abilities (3rd ed.). San Diego, CA: Author.

Schneider, W. & Detweiler, M. (1987). A Connectionist/Control Architecture For Working Memory. In G. H. Bower (Ed.), The Psychology Of Learning And Motivation: Advances In Research And Theory (pp. 53-119). San Diego, CA: Academic Press.

Schnipke, D.
L. & Scrams, D. J. (1999). Modeling Item Response Times With A Two-State Mixture Model: A New Approach To Measuring Speededness (LSAC Research Report No. 96-02). Newtown, PA: LSAC (Law School Admissions Council).

Schnipke, D. L. & Scrams, D. J. (1999). Exploring Issues Of Test Taker Behavior: Insights Gained From Response-Time Analyses (LSAC Research Report No. 98-09). Newtown, PA: LSAC (Law School Admissions Council).

Scrams, D. J. & Schnipke, D. L. (1999). Making Use Of Response Times In Standardized Tests: Are They Measuring The Same Thing? (LSAC Research Report No. 97-04). Newtown, PA: LSAC (Law School Admissions Council).

Seymour, P. H. K. (1979). Human Visual Cognition. London: Macmillan.

Simons, D. J. (1996). In Sight, Out Of Mind: When Object Representations Fail. Psychological Science, 7, 301-305.

Simons, D. J. & Levin, D. T. (1998). Failure To Detect Changes To People In A Real-World Interaction. Psychonomic Bulletin And Review, 5(4), 644-649.

Simpson, T. J. (1994). Message Into Medium: An Extension Of The Dual Coding Hypothesis. IVLA (Imagery And Visual Literacy) Annual Conference, 1994, 255-263.

Smith, M. C. & Magee, L. E. (1980). Tracing The Time Course Of Picture-Word Processing. Journal Of Experimental Psychology: General, 109, 373-392.

Smythe, P. C. (1970). Pair Concreteness And Mediation Instructions In Forward And Backward Paired Associative Recall. Unpublished doctoral dissertation, University of Western Ontario, London, Ontario, Canada.

Snodgrass, J. G. (1984). Concepts And Their Surface Representations. Journal Of Verbal Learning And Verbal Behavior, 23, 3-22.

Snyder, H. L. & Maddox, M. E. (1978). Information Transfer From Computer-Generated Dot-Matrix Displays (Final Rep. HFL-78-3/ARO-78-1; NTIS No. AD A063 505). Blacksburg, VA: VPI (Virginia Polytechnic Institute).

Sperling, G. (1960). Afterimage Without Prior Image. Science, 131, 1613-1614.

Stevens, S. S. & Greenbaum, H. B. (1966).
Regression Effect In Psychophysical Judgment. Perception And Psychophysics, 1, 439-446.

Stricker, L. J. (1993). Discrepant LSAT Subscores (LSAC Research Report No. 93-01). Newtown, PA: LSAC (Law School Admissions Council).

Suwarsono, S. (1982). Visual Imagery In The Mathematical Thinking Of Seventh Grade Students. Unpublished doctoral dissertation, Monash University, Melbourne.

Sweller, J. (1988). Cognitive Load During Problem Solving: Effects On Learning. Cognitive Science, 12, 257-285.

Sweller, J. & Cooper, G. A. (1985). The Use Of Worked Examples As A Substitute For Problem-Solving In Learning Algebra. Cognition And Instruction, 2, 59-89.

Theios, J. & Amrhein, P. C. (1989). Theoretical Analysis Of The Cognitive Processing Of Lexical And Pictorial Stimuli: Reading, Naming, And Visual And Conceptual Comparisons. Psychological Review, 96, 5-24.

Thissen, D. (1983). Timed Testing: An Approach Using Item Response Theory. In D. Weiss (Ed.), New Horizons In Testing: Latent Trait Theory And Computerized Adaptive Testing (pp. 179-203). New York: Academic Press.

Thomas, K. E., Newstead, S. E. & Handley, S. J. (2003). Exploring The Time Prediction Process: The Effects Of Task Experience And Complexity On Prediction Accuracy. Applied Cognitive Psychology, 17, 655-673.

Trbovich, P. L. & LeFevre, J. (2003). Phonological And Visual Working Memory In Mental Addition. Memory And Cognition, 31, 738-745.

Tullis, T. S. (1983). The Formatting Of Alphanumeric Displays. Human Factors, 25, 657-683.

Van Dusen, L. M., Spach, J. D., Brown, B. & Hansen (1999). TRIO: A New Measure Of Visual Processing Ability. Educational And Psychological Measurement, 59(6), 1030-1046.

Vierordt, K. (1868). Der Zeitsinn Nach Versuchen [Empirical Studies Of Time Experience]. Tübingen, Germany: Laupp.

Weldon, M. S. & Roediger, H. L., III (1987). Altering Retrieval Demands Reverses The Picture Effect. Memory And Cognition, 15, 269-280.

Wertheimer, M.
(1920, 1925). Reprinted in Philosophische Zeitschrift für Forschung und Aussprache, 1, 39-60 (1925), and as an offprint: Erlangen: Verlag der Philosophischen Akademie (1925). [Gestalt Theory] (1985), 7(2), 99-120. Opladen: Westdeutscher Verlag.

Wilson, K. M. & Powers, D. E. (1994). Factors In Performance On The Law School Admission Test (LSAC Research-Statistical Report No. 93-04). Newtown, PA: LSAC (Law School Admissions Council).

Woods, D. D. (1985). Coping With Complexity: The Psychology Of Human Behavior In Complex Systems. In L. P. Goodstein, H. B. Andersen & S. E. Olsen (Eds.), Tasks, Errors And Mental Models (pp. 128-148). London: Taylor and Francis.

Woods, D. D. (1991). The Cognitive Engineering Of Problem Representations. In G. R. S. Weir & J. L. Alty (Eds.), Human-Computer Interaction And Complex Systems (pp. 169-187). Glasgow, Scotland: Academic Press.

Yarmey, A. D. & O'Neill, B. J. (1969). S-R And R-S Paired Associative Learning As A Function Of Concreteness, Imagery, Specificity, And Association Value. Journal Of Psychology, 71, 95-109.

Zenisky, A. L. & Sireci, S. G. (2002). Technological Innovations In Large-Scale Assessment. Applied Measurement In Education, 15, 337-362.

Zumbo, B. D. (1999). A Handbook On The Theory And Methods Of Differential Item Functioning (DIF): Logistic Regression Modeling As A Unitary Framework For Binary And Likert-Type (Ordinal) Item Scores. Ottawa, Canada: Directorate of Human Resources Research and Evaluation, National Defense Headquarters.

APPENDICES

APPENDIX A: Table 8. DCT Table of Theoretical & Empirical Assumptions

General Empiricist Assumptions: Cognition is served by two modality-specific systems that are experientially derived and differentially specialized for representing and processing information concerning nonverbal objects, events, and language. Distinctions between symbolic and sensorimotor systems.

Unit Level
Properties: Representational units are modality specific, vary hierarchically in intraunit size, and are organized in synchronous vs. sequential structures.

System Level Properties: Functional independence and partial interconnectedness between and within systems. Interunit processing operations: 1. Activation of representations. 2. Representational, referential & associative. 3. Synchronous vs. sequential. 4. Transformational. 5. Conscious & automatic. Basic functions: evaluative, mnemonic, motivational & emotional.

Empirical Variables: Theoretical assumptions are linked to classes of operational indicators and procedures: stimulus attributes, experimental manipulations, individual differences in cognitive habits and skills, and subjective reports.

Phenomenal Domain: Processing of verbal & nonverbal information in perceptual memory, language, and complex problem-solving tasks; neuropsychology; issues in epistemology and philosophy of science.

APPENDIX B: Sample 1. Sample Format of McNeal's Verbal to Visual Test Comparisons

Plate 1. Terminology Test, Verbal Form
Blood from the right ventricle leaves the heart through the
a. veins
b. aortic artery
c. pulmonary artery
d. superior vena cava

Plate 2. Terminology Test, Visual Form
Select the letter which correctly represents the part or function of the heart described in each question. The vessel(s) through which the blood leaves the heart from the right ventricle:

Figure 3.6 Sample Questions from Terminology Test

APPENDIX C: Table 9. Means & Standard Deviations on Verbal Test Form

Form        Identification   Terminology   Comprehension   Composite
T1   M      14.77            13.34         11.02           39.14
     SD      3.44             3.31          2.77            6.83
T2   M      15.89            14.45         12.05           42.27
     SD      2.81             2.71          3.07            7.53
T3   M      15.23            13.36         11.02           39.61
     SD      2.85             2.77          2.82            6.66
T4   M      16.52            14.82         12.20           43.30
     SD      2.53             2.73          3.11            7.09

Table 10.
Means & Standard Deviations on Visual Test Form

Form        Identification   Terminology   Comprehension   Composite
T1   M      14.45            13.00         10.16           37.61
     SD      2.93             3.65          3.18            8.62
T2   M      15.30            13.20          9.86           38.59
     SD      3.31             3.40          3.17            8.27
T3   M      14.95            12.59         10.25           37.75
     SD      2.85             3.38          3.08            8.00
T4   M      16.59            14.25         10.98           41.82
     SD      3.22             3.18          3.15            7.26

APPENDIX D: Table 11. Summary correlations between and among predictor and criterion variables for law schools participating in 1995-1996 correlation studies: selected first-year results.

VAR           YR     MN      SD     25      50      75      MIN     MAX

Zero-Order Correlations
LSAT/FYA      1995   0.04    0.10   0.35    0.42    0.47    0.02    0.61
              1996   0.04    0.09   0.34    0.40    0.46    0.01    0.62
UGPA/FYA      1995   0.26    0.08   0.20    0.27    0.31    0.05    0.45
              1996   0.25    0.08   0.19    0.25    0.31    0.02    0.42
LSAT/UGPA     1995  -0.05    0.14  -0.13   -0.05    0.06   -0.44    0.31
              1996  -0.06    0.13  -0.15   -0.06    0.04   -0.46    0.24

Multiple Correlations
LSAT and      1995   0.49    0.08   0.44    0.50    0.55    0.18    0.68
UGPA/FYA      1996   0.48    0.08   0.44    0.49    0.53    0.11    0.68

APPENDIX E: Table 12. Incidence of Significant and Rare Differences for Each Pair of LSAT Subscores.

Subscore Pair                            Significant Differences    Rare Differences
                                         (-%)      (+%)             (-%)     (+%)
Analytical Reasoning vs.
  Reading Comprehension                   9.8       9.8              2.5      2.5
Analytical Reasoning vs.
  Logical Reasoning                      10.3      10.0              2.5      2.5
Reading Comprehension vs.
  Logical Reasoning                       5.1       5.1              0.25     0.25

Table 13. Incidence of Significant and Rare Differences for All Pairs of LSAT Subscores.

Frequency   Significant Difference (%)   Rare Difference (%)
0           66.1                         88.1
1           18.1                          8.8
2           15.5                          3.1
3            0.4                          0.0

APPENDIX F: Sample 2. Kit of Factor-Referenced Cognitive Tests - Identical Pictures Test

IDENTICAL PICTURES - P-3

How fast can you match a given object? This is a test of your ability to pick the correct object quickly. At the left of each row is an object. To the right are five test objects, one of which matches the object at the left.
Look at the example below:

[Example row of figures; the practice figures are not recoverable from the scanned image.]

The third test object has been marked by blackening the space below it, because it is the same as the object at the left. Your score on this test will be the number of objects marked correctly minus a fraction of the number marked incorrectly. Work as quickly as you can without sacrificing accuracy. You will have 1 1/2 minutes for each of the two parts of this test. Each part has two pages. Be sure to do both pages if you have time. When you have finished Part 1, STOP. Please do not go on to Part 2 until you are asked to do so. DO NOT TURN THIS PAGE UNTIL ASKED TO DO SO.

Copyright 1962, 1975 by Educational Testing Service. All rights reserved.

Sample 3. The Kit of Factor-Referenced Cognitive Tests - Finding A's Test

FINDING A's TEST - P-1

This is a test of your speed in finding the letter "a" in words. Your task is to put a line through any such word. Listed below are five columns of words. Each column has five words containing the letter "a". The first two columns have already been marked correctly. Now, on the other three columns, practice for speed in putting a line through the words with an "a".

[Five columns of practice words follow in the original; the words containing the letter "a" are struck through.]

Remember, in each column there are five words containing the letter "a". Your score on this test will be the number of words marked correctly.
Work as quickly as you can without sacrificing accuracy. You will have 2 minutes for each of the two parts of this test. Each part has four pages. When you have finished Part 1 (pages 2 to 5), STOP. Please do not go on to Part 2 until you are asked to do so.

DO NOT TURN THIS PAGE UNTIL ASKED TO DO SO.

Copyright 1962, 1975 by Educational Testing Service. All rights reserved.

APPENDIX G

Sample 4. Sample Screen Shot of DCT & LSAT item formats, Q1-Q5

[Screen shot image; the rotated on-screen text captured by the scanner is not recoverable.]

Sample 5. Sample Screen Shot of DCT & LSAT item formats, Q11-Q17

[Screen shot image; the rotated on-screen text captured by the scanner is not recoverable.]
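Both ETS instruments reproduced above use simple formula scoring: Identical Pictures is "rights minus a fraction of wrongs," while Finding A's counts rights only. The booklet excerpt does not state the fraction; a sketch below assumes the conventional guessing correction W/(k-1) for k answer choices (k = 5 here, since each row has five test objects). Function names are mine, not ETS's.

```python
def identical_pictures_score(num_right: int, num_wrong: int, choices: int = 5) -> float:
    """Rights minus a fraction of wrongs. Assumes the standard
    correction for guessing, W/(k-1), for k answer choices."""
    return num_right - num_wrong / (choices - 1)

def finding_as_score(num_right: int) -> int:
    """Finding A's is scored simply as the number of words marked correctly."""
    return num_right

# e.g. 40 rows right and 4 rows wrong on Identical Pictures
print(identical_pictures_score(40, 4))  # 40 - 4/4 = 39.0
```

Under this correction, random marking yields an expected score of zero, which is why speeded tests of this kind discourage guessing.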
APPENDIX H: Figure 4. Mean Response Times (MRT) of DCT & LSAT groups.

[Scatter plot of mean response times, 0-250 seconds.]
LSATRT = Traditional LSAT Mean Response Times
DCTRT = DCT Mean Response Times
X Axis = Total number of questions answered

Figure 5. LSAT RT to the Proportion of Correct Responses

[Scatter plot of LSATTC against LSATRT.]
LSATTC = Total number of correct responses
LSATRT = LSAT Response Times

Figure 6. DCT RT to the Proportion of Correct Responses

[Scatter plot of DCTTC against DCTRT.]
DCTTC = Total number of correct responses
DCTRT = DCT Response Times

APPENDIX I: Table 14.
Correlation Coefficients of All Examinees for Mean Response Times

           T1-I1    T2-I2    T3-I3    T4-I4    T5-I5    T6-I6    T7-I7    T8-I8
T1-I1       1.00    *0.25    -0.15   **0.30   **0.43   **0.29    0.04    -0.17
T2-I2      *0.25     1.00     0.06   **0.35    0.04    *0.21     0.01    -0.03
T3-I3      -0.15     0.06     1.00     0.10    0.17    -0.04   **0.39   **0.33
T4-I4     **0.30   **0.35     0.10     1.00  **0.33     0.02    *0.21    *0.23
T5-I5     **0.43     0.04     0.17   **0.33    1.00     0.03   **0.36    0.18
T6-I6     **0.29    *0.21    -0.04     0.02    0.03     1.00    -0.02    -0.10
T7-I7       0.04     0.01   **0.39    *0.21  **0.36    -0.02     1.00   **0.50
T8-I8      -0.17    -0.03   **0.33    *0.23    0.18    -0.10   **0.50     1.00
T9-I9     *-0.20    -0.02     0.13     0.04    0.23    -0.01    *0.26   **0.36
T10-I10    -0.05    -0.02     0.12     0.15    0.12     0.05    *0.21     0.16
T11-I11   *-0.24    -0.01     0.09    -0.08  **-0.27    0.18    -0.04     0.06
T12-I12  **-0.27    -0.03     0.16    -0.02   -0.13     0.03     0.05     0.09
T13-I13    -0.16    -0.05    -0.15   *-0.23  **-0.28    0.12    -0.17     0.07
T14-I14  **-0.28    -0.12     0.01   *-0.24  **-0.30   *-0.23   *-0.23    0.03
T15-I15   *-0.24   **0.27    -0.10    -0.11  **-0.31    -0.11    -0.20   -0.08
T16-I16    -0.19  **-0.27    -0.09    -0.08   *-0.26   *-0.21   *-0.25   -0.11
T17-I17   *-0.23   *-0.25    -0.14   *-0.21  **-0.31    -0.19  **-0.42   *-0.25
T18-I18    -0.16  **-0.30  **-0.26  **-0.44  **-0.40    -0.06  **-0.41  **-0.36
T19-I19    -0.19  **-0.26    -0.16  **-0.31  **-0.31     0.00   *-0.25   *-0.25
T20-I20   *-0.24  **-0.27   *-0.23   *-0.24  **-0.36    -0.19  **-0.37  **-0.27
T21-I21    -0.19  **-0.29   *-0.25  **-0.29  **-0.35    -0.14  **-0.41  **-0.32
T22-I22    -0.15   *-0.22  **-0.30  **-0.31  **-0.33   *-0.21  **-0.39  **-0.34
T23-I23    -0.10    -0.14  **-0.27   *-0.24  **-0.28   *-0.24  **-0.34  **-0.28
Format     *0.20     0.00    -0.15     0.10    0.12    -0.14    -0.03    -0.17

Table 14 (continued)

           T9-I9   T10-I10  T11-I11  T12-I12  T13-I13  T14-I14  T15-I15  T16-I16
T1-I1     *-0.20    -0.05   *-0.24  **-0.27    -0.16  **-0.28   *-0.24    -0.19
T2-I2      -0.02    -0.02    -0.01    -0.03    -0.05    -0.12   **0.27  **-0.27
T3-I3       0.13     0.12     0.09     0.16    -0.15     0.01    -0.10    -0.09
T4-I4       0.04     0.15    -0.08    -0.02   *-0.23   *-0.24    -0.11    -0.07
T5-I5      *0.23     0.12  **-0.27    -0.13  **-0.28   *-0.30  **-0.31   *-0.26
T6-I6      -0.01     0.05     0.18     0.03     0.12   *-0.23    -0.11   *-0.21
T7-I7      *0.26    *0.21    -0.04     0.05    -0.17   *-0.23    -0.20    -0.25
T8-I8     **0.36     0.16     0.06     0.09     0.07     0.03    -0.08    -0.10
T9-I9       1.00   **0.32     0.19     0.16     0.03    -0.05    -0.08    -0.13
T10-I10   **0.32     1.00     0.11     0.12    -0.16    -0.19    -0.08    -0.14
T11-I11     0.19     0.12     1.00   **0.30    -0.02     0.12     0.08    -0.10
T12-I12     0.16     0.11   **0.30     1.00    *0.21   **0.35    *0.20    *0.21
T13-I13     0.03    -0.16    -0.02    *0.21     1.00   **0.27     0.11     0.11
T14-I14    -0.05    -0.19     0.12   **0.35   **0.27     1.00   **0.45   **0.46
T15-I15    -0.08    -0.09     0.08    *0.20     0.11   **0.45     1.00   **0.56
T16-I16    -0.13    -0.14    -0.10    *0.21     0.11   **0.46   **0.56     1.00
T17-I17    -0.12    -0.18     0.01     0.07     0.19   **0.42   **0.41   **0.50
T18-I18  **-0.38  **-0.29    -0.19    -0.14    *0.21     0.09     0.08    *0.23
T19-I19  **-0.30  **-0.29    -0.14    -0.07   **0.31     0.12    *0.23    *0.26
T20-I20  **-0.39  **-0.36    -0.16    -0.19     0.11     0.05     0.15    *0.21
T21-I21  **-0.31  **-0.30    -0.16    -0.18     0.14     0.05     0.13     0.17
T22-I22  **-0.38  **-0.32    -0.19  **-0.27     0.07     0.04     0.04    *0.23
T23-I23  **-0.31  **-0.31    -0.19  **-0.29     0.05     0.04     0.10     0.12
Format     -0.15    -0.10    -0.11   *-0.26     0.02    -0.03    -0.14    -0.07

Table 14 (continued)

          T17-I17  T18-I18  T19-I19  T20-I20  T21-I21  T22-I22  T23-I23  Format
T1-I1     *-0.23    -0.12    -0.19   *-0.24    -0.19    -0.15    -0.10    *0.20
T2-I2     *-0.25  **-0.30  **-0.26  **-0.27  **-0.29   *-0.22    -0.14     0.00
T3-I3      -0.14  **-0.26    -0.16   *-0.23   *-0.25  **-0.30  **-0.27    -0.15
T4-I4     *-0.21  **-0.44   *-0.23   *-0.24  **-0.29  **-0.31   *-0.24     0.10
T5-I5    **-0.31  **-0.40  **-0.31  **-0.36  **-0.35  **-0.33  **-0.28     0.12
T6-I6      -0.19    -0.06     0.00    -0.19    -0.14   *-0.21   *-0.24    -0.14
T7-I7    **-0.42  **-0.41   *-0.25  **-0.37  **-0.41  **-0.39  **-0.34    -0.03
T8-I8     *-0.25  **-0.36   *-0.25  **-0.28  **-0.32  **-0.40  **-0.28    -0.17
T9-I9      -0.12  **-0.38  **-0.30  **-0.39  **-0.31  **-0.38  **-0.31    -0.15
T10-I10    -0.18  **-0.29  **-0.29  **-0.36  **-0.30  **-0.32  **-0.31    -0.10
T11-I11     0.01    -0.19    -0.14    -0.16    -0.16    -0.19    -0.19    -0.11
T12-I12     0.07    -0.14    -0.07    -0.19    -0.18  **-0.27  **-0.29   *-0.26
T13-I13     0.19    *0.21   **0.31     0.11     0.14     0.07     0.05     0.02
T14-I14   **0.42     0.09     0.12     0.05     0.05     0.04     0.04    -0.03
T15-I15   **0.41     0.08    *0.22     0.15     0.13     0.04     0.10    -0.14
T16-I16   **0.50    *0.23    *0.26    *0.21     0.17    *0.23     0.12    -0.07
T17-I17     1.00   **0.33   **0.33   **0.30   **0.28    *0.27   **0.29    -0.15
T18-I18   **0.33     1.00   **0.52   **0.49   **0.49   **0.51   **0.40     0.06
T19-I19   **0.33   **0.52     1.00   **0.55   **0.49   **0.32     0.20     0.02
T20-I20   **0.30   **0.49   **0.55     1.00   **0.72   **0.64   **0.62     0.02
T21-I21   **0.28   **0.49   **0.49   **0.72     1.00   **0.74   **0.50     0.08
T22-I22    *0.26   **0.51   **0.32   **0.64   **0.74     1.00   **0.67     0.08
T23-I23   **0.29   **0.40     0.20   **0.62   **0.50   **0.67     1.00     0.09
Format     -0.15     0.06     0.02     0.02     0.08     0.08     0.09     1.00

** p <= .01   * p <= .05
T-I = response time taken to answer the specific item.

APPENDIX J: Table 16. Multivariate Analysis of Variance (MANOVA) of Significant Items

Dependent Variable    df       F       Sig.
Item 1                 1     0.236    0.628
Time 1                 1     3.980    0.049
Item 2                 1     2.918    0.091
Time 2                 1     0.000    0.998
Item 7                 1     4.511    0.036
Time 7                 1     0.100    0.753
Item 12                1     0.344    0.559
Time 12                1     7.073    0.009
Item 14                1     3.891    0.052
Time 14                1     0.031    0.860
Item 15                1     6.065    0.016
Time 15                1     2.177    0.144
Item 17                1     3.387    0.070
Time 17                1     4.345    0.040
Item 18                1     2.891    0.093
Time 18                1     0.163    0.687
Item 19                1    11.091    0.001
Time 19                1     0.222    0.639
Item 22                1     6.340    0.015
Time 22                1     0.814    0.371
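The starred entries in Table 14 flag correlations significant at the .05 (*) and .01 (**) levels, two-tailed. A minimal, stdlib-only sketch of how one such table cell could be produced is below; the critical |r| cut-offs passed to `flag` are illustrative placeholders (the real cut-offs depend on the study's sample size), and the data are invented, not taken from the dissertation.

```python
import math

def pearson_r(x, y):
    """Plain Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def flag(r, crit_05, crit_01):
    """Attach the table's significance markers, given two-tailed critical |r| values."""
    if abs(r) >= crit_01:
        return f"**{r:.2f}"
    if abs(r) >= crit_05:
        return f"*{r:.2f}"
    return f"{r:.2f}"

# Hypothetical response times (seconds) and 0/1 item scores for one item pair.
times = [62, 75, 81, 90, 55, 68]
scores = [1, 0, 0, 0, 1, 1]
r = pearson_r(times, scores)
print(flag(r, crit_05=0.20, crit_01=0.26))
```

In practice the cut-offs would come from the t distribution with n - 2 degrees of freedom (t = r * sqrt((n-2)/(1-r^2))); hard-coding them here just keeps the sketch dependency-free.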