This is to certify that the dissertation entitled

DUAL CODING ITEM FORMATS FOR COMPUTERIZED ADAPTIVE TEST (CAT) ENVIRONMENTS

presented by Christine Bee Lan Chan has been accepted towards fulfillment of the requirements for the Doctoral degree in Educational Psychology.

Major Professor's Signature
July 8, 2005


DUAL CODING ITEM FORMATS FOR COMPUTERIZED ADAPTIVE TEST (CAT) ENVIRONMENTS

By

Christine Bee Lan Chan

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Counseling, Educational Psychology and Special Education

2005

ABSTRACT

DUAL CODING ITEM FORMATS FOR COMPUTERIZED ADAPTIVE TEST (CAT) ENVIRONMENTS

By Christine Bee Lan Chan

Dual Coding Theory (Paivio, 1986, 1990, 1991) hypothesizes that information presented simultaneously in both text and visuals results in more efficient processing of that information. When Dual Coding item formats are used for computerized adaptive tests, they are hypothesized to be more efficient measures of a candidate's capabilities than traditional item formats. This hypothesis is suggested by research on Dual Coding theories of information (Paivio, 1986, 1990, 1991). Efficiency here means the accurate assessment of performance in the shortest time possible.
Test items from a past LSAT exam were used to develop two formats: (i) the DCT format, in which information was presented in paired visuals and text, and (ii) the LSAT format, a replication of the current pencil-and-paper exam. Participants of similar ability level were randomly selected and assigned to either the DCT group or the traditional LSAT group. Performance differences between the groups would thus indicate whether the formats differ in response time and proportion of correct responses. Results indicate that the DCT item format had a significant effect, with a higher median score of 5.75 compared to the traditional LSAT median of 4.75, which had a narrower range of 1.5 to 9.0. The data also showed differences in speededness among examinees: DCT participants had greater mean response times (MRT), with a slightly higher median of 80 seconds compared to 70 seconds in the LSAT group. MRT for the DCT group increased for later items, but with more accurate answers. These results support the Dual Coding hypothesis of the effectiveness of a visual-text presentation of information, as such presentations help preserve cognitive resources when higher-order complex tasks are engaged in immediate-delayed retention tests.

Copyright by
CHRISTINE BEE LAN CHAN
2005

ACKNOWLEDGMENTS

I would like to thank the following people: Brian M. Winn, for help on Director; Dr. Linda Chard, for the DIF and reliability analysis; the Episcopal-Anglican Chaplaincy at MSU; and all who have contributed to this study. I would especially like to thank my advisor, Dr. Mark D. Reckase, for his encouragement and astute guidance, pushing me toward the best of my ability. Without him, this would never have been possible. To my best friend Judith Brown-Clarke, Ph.D., for her support, encouragement and wisdom through difficult times; to my parents, who have sacrificed so much for me; and to God, for by His grace and strength, I can do all things.
TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF SAMPLES
KEY TO ABBREVIATIONS

INTRODUCTION
  STATEMENT OF THE PROBLEM
  MEMORY SYSTEMS: A BRIEF OVERVIEW
    The Psychometric Approach
    The Cognitive Approach
    Visual-Spatial Working Memory
  OBJECTIVES AND GOALS OF THIS STUDY

DUAL CODING THEORY (PAIVIO, 1986, 1990, 1991)
  THE BASIC PREMISE
  DEVELOPMENT OF DUAL CODING THEORY (DCT)
    The Conceptual Peg Hypothesis
    Imagery-Concreteness of Word-Picture Items
    Synchronous Organization
    Symmetry-Asymmetry of Associative Items
  OPPOSING THEORIES OF DUAL CODING THEORY
  CONCLUSION

FACTORS AFFECTING VISUAL SHORT TERM MEMORY
  COGNITIVE ANALYSIS OF VISUAL PROPERTIES
  VISUAL-SPATIAL LAYOUT
  TIME PREDICTION AND TASK COMPLETION PERCEPTIONS
    Experience-Rehearsed Sessions
    Nature of Task & Distractors
    Task Complexity & Duration
  TYPES OF MENTAL OPERATIONS
    Quantitative Reasoning
    Sentence Verification Task
    Maze, Copying and Object Manipulation Tasks
  INDIVIDUAL DIFFERENCES

VISUAL-TEXT ASSESSMENT FORMATS
  EARLY RESEARCH
  PAST RESEARCH
  CURRENT RESEARCH

THE LAW SCHOOL ADMISSIONS TEST (LSAT)
  ITEM TYPES-MEASURES
    Predictive Validity
    Analytical Reasoning (AR) Discrepant Subscores
  TIME SPEEDEDNESS

RESEARCH METHOD AND DESIGN
  PURPOSE OF THE STUDY
  PROCEDURE
    Definition of Terms
    Power Analysis
    Item Selection Process
    Kit of Factor-Reference Cognitive Tests
    Graphical User Interface Development (GUI)
    Participants
  TEST ADMINISTRATION

RESULTS AND DATA ANALYSIS
  ANALYSIS PROCEDURES
    Descriptive Statistics
    Differential Item Functioning
    Reliability of Tests
    Validity of Test Items
    Proportion of Correct Responses
    Average Response Times (RT) - Answers
    Time Correlations
    Multivariate Analysis of Variance (MANOVA)

SUMMARY AND DISCUSSION
  Item Location
  Response Times and Speededness
  Time Correlations

IMPLICATIONS AND FUTURE RESEARCH
  LIMITATIONS OF THE STUDY
  FUTURE RESEARCH

REFERENCES
APPENDICES

** NOTE: IMAGES IN THIS DISSERTATION ARE PRESENTED IN COLOR

LIST OF TABLES

Table 1. Immediate-Delayed Test Reliabilities for all Criterion Tests ..................... 32
Table 2.
Observed Correlations and Correlations After Correction For Attenuation: LSAT Section Scores, June 1991 Forms of the LSAT ..................... 37
Table 3. Descriptive Statistics For DCT and LSAT Formats ..................... 55
Table 4. DIF Indices: Traditional vs. DCT Format ..................... 56
Table 5. Reliability Estimates and Descriptive Statistics For All Items ..................... 58
Table 6. Correlation Coefficients for DCT and LSAT Correct Responses to Kit of Factor-Reference Test Items ..................... 59
Table 7. Mean Response Times to Proportion of Correct-Total Items Answered ..................... 61
Table 8. DCT Table of Theoretical & Empirical Assumptions ..................... 82
Table 9. Means & Standard Deviations on Verbal Test Forms ..................... 84
Table 10. Means & Standard Deviations on Visual Test Forms ..................... 84
Table 11. Summary Correlations Between and Among Predictor and Criterion Variables for Law Schools Participating in 1995-1996 Correlation Studies: Selected First-Year Student Results ..................... 85
Table 12. Incidence of Significant and Rare Differences for Each Pair of LSAT Subscores ..................... 86
Table 13. Incidence of Significant and Rare Differences for All Pairs of LSAT Subscores ..................... 86
Table 14. Correlation Coefficients of All Examinees for MRT (Mean Response Times) ..................... 93
Table 15. Multivariate Analysis of Variance (MANOVA) of Significant Items ..................... 95

LIST OF FIGURES

Figure 1. Stages of Information Processing (Norman, 1993) ..................... 19
Figure 2. Comparison of Two Solution Strategies in Terms of Their Speed-Accuracy Tradeoff Functions ..................... 40
Figure 3. Correct Responses to the Proportion of Items Answered for DCT & LSAT ..................... 60
Figure 4. Box Plots of Mean Response Times (MRT) of DCT and LSAT Groups ..................... 91
Figure 5. LSAT RT to the Proportion of Correct Responses ..................... 91
Figure 6. DCT RT to the Proportion of Correct Responses ..................... 92

LIST OF SAMPLES

Sample 1. Sample Format of McNeal's Verbal to Visual Test Comparisons ..................... 83
Sample 2. Kit of Factor-Reference Cognitive Test - Identical Pics Test ..................... 87
Sample 3. Kit of Factor-Reference Cognitive Test - Finding As Test ..................... 88
Sample 4. Sample Screen Shot of DCT & LSAT Item Formats Q1-Q5 ..................... 89
Sample 5. Sample Screen Shot of DCT & LSAT Item Formats Q11-Q17 ..................... 90
KEY TO ABBREVIATIONS

AI: Artificial Intelligence
AR: Analytical Reasoning
CAT: Computerized Adaptive Test
CBT: Computer Based Test
CDT: Computer Display Terminals
COGS: Council of Graduate Students
DCT: Dual Coding Theory
DIF: Differential Item Functioning
FYA: First Year Average Scores
GRE: Graduate Record Examination
GRE-A: Graduate Record Examination-Analytical Section
GRE-Q: Graduate Record Examination-Quantitative Section
GUI: Graphical User Interface
HCI: Human Computer Interaction
I: Item
ICC: Item Characteristic Curves
ID: Identification Number
IQ: Measure of Intelligence
LR: Logical Reasoning
LSAC: Law School Admissions Council
LSAT: Law School Admissions Test
MANOVA: Multivariate Analysis of Variance
MRT: Mean Response Time
Q: Question
R: Reliability
RC: Reading Comprehension
RT: Response Time
SAI: Signed Area Index
STM: Short-Term Memory
T: Time
TOEFL: Test of English as a Foreign Language
UEE: User-Experience Engineers
UGPA: Undergraduate Grade Point Average
UI: User Interface
VSTM: Visual Short-Term Memory
WAIS-R: Wechsler Adult Intelligence Scale-Revised
Z_SAI: Standardized Scores of Signed Area Index

INTRODUCTION

STATEMENT OF THE PROBLEM

Past Law School Admission Test (LSAT) research reports indicate that discrepant subscores, often very substantial ones, were frequently observed when comparing performance on the logical reasoning and analytical reasoning sections of the test. The analytical reasoning items are targeted to measure problem-solving abilities through the mental manipulation and organization of information. Similar phenomena have been reported for other tests, such as the Graduate Record Examination (GRE) (Bridgeman & Cline, 2000), the Wechsler Adult Intelligence Scale-Revised (WAIS-R) (Matarazzo, Daniel, Prifitera, & Herman, 1985), and other intelligence tests.
"The ubiquity of such discrepancies has led to the suggestion that differences in how people manifest intelligence are the norm rather than the exception" (Kaufman, 1990). Recognizing these individual differences, the objective of this study is to arrive at an understanding of information processing: its organization, filtering and retrieval; and to determine whether Dual Coding Theory (DCT) test formats, defined as the presentation of information in both text and visuals simultaneously, are a more effective and truer measure of problem-solving capabilities.

In its most general assumption, Dual Coding Theory (DCT) views cognition of visual or nonverbal information as an activity involving two specific symbolic representational systems: one responsible for the processing of images or visual objects, and the other for the processing of text (Paivio, 1990). According to Paivio's Dual Coding Theory of information processing (Paivio, 1986), cognitive efficiency in recall, comprehension, cognitive operations such as problem solving, and concept learning increases when information is presented simultaneously in both visual and textual form (Guilford, 1967; Paivio, 1986). However, it is not merely the act of inserting visuals at random that produces this effect, but rather the development and layout of visuals tailored to specific ergonomic guidelines.

With the advent of computer technology, the presentation of information on computer screens has become a challenging task. Displays of information confined to the parameters of a digital monitor "vary in their effect on the problem solvers' information processing activities and problem solving performance" (Woods, 1991, p. 171). It is from this ergonomic premise that the field of Human-Computer Interaction (HCI) is born, a field fundamentally interdisciplinary in nature (Olson & Olson, 2003), drawing its foundations from cognitive psychology, ergonomics, and computer science.
In-depth research in these applied domains is thus necessary to better understand human cognitive, perceptual, and physical processes during interaction with computers. In educational assessment and testing, there is a growing trend toward the use of computers for test construction, delivery and administration, in addition to tasks such as scoring, analyzing and reporting test results. This flexibility has resulted in the creation of new and innovative item types geared toward a more performance-based type of assessment. These items, including formats such as interactive video, audio, and vignettes, to name a few, are able to assess "cognitive skills that can be difficult to fully tap using traditional paper-and-pencil test formats" (Zenisky & Sireci, 2002, p. 338). Along lines similar to Paivio's Dual Coding Theory (DCT) of information, Zenisky and Sireci (2002) reiterate the importance of the format of response presentation as a critical component in the design of computerized adaptive test (CAT) item types. Harmes (1999) extends this argument by evaluating current test question types, particularly multiple-choice questions, which she believes do not provide an accurate assessment of higher-order cognitive skills because they are limited to a single fixed response. Innovative forms of performance-based assessment have since emerged, taking into account individual differences among candidates (Frederiksen & Ward, 1978; Haladyna, 1997). However, their construction, administration and delivery are no easy task. To date, the most practical form of assessment that closely approximates authentic measurement of these skills is the creation of innovative items themselves, defined by Harmes as "...new and better forms of assessment that incorporate features and functions not possible with conventional test administration" (Parshall et al., 1999, p. 1).
With all these emerging performance-based assessment formats being created and researched, Allan Paivio's (1981) DCT brings us back to the foundations of human cognitive processing of information, for an in-depth look at how different units of information representation are evaluated and processed. Thus, from a cognitive-psychological standpoint, before performance can even be evaluated and assessed, it is crucial to understand how specific units of information are processed. Within this seemingly simple theory lies an almost thirty-year study of the intricacies and complexities of the cognitive processing of visual and semantic imagery. To fully grasp its foundations and core elements, it is crucial to first have a broad understanding of memory systems and of how the cognitive processing of information affects human performance in retention, retrieval and transfer from one task to the next.

MEMORY SYSTEMS: A BRIEF OVERVIEW

One of the core areas in cognitive psychology is the study of memory, or memory systems. Past studies in the cognitive neuroscience of the working memory system have yielded two opposing theories: (1) one labeled the psychometric approach, in which working memory is regarded as a single unitary system; and (2) the other labeled the cognitive approach, in which working memory is regarded as comprising two or more subsystems. These subsystems include: (i) the central executive, the main controlling component; (ii) the visuospatial Sketchpad, which manipulates images; and (iii) the phonological loop, responsible for verbal information (Baddeley, 1992).

The Psychometric Approach

The premise of this approach, which has taken root most strongly in North America, "focuses on the extent to which performance on working memory tasks can predict individual differences in the relevant cognitive skills" (Baddeley, 1992, p. 556).
The essence of this approach is to develop tasks that require the combined storage and manipulation of information, and to correlate performance on these tasks with performance on practically and theoretically important cognitive skills. These tasks were devised specifically to measure reading comprehension and reasoning and their impact on working memory, in order to predict individual differences (Carpenter, Just, & Shell, 1990; Daneman & Carpenter, 1980; Kyllonen & Christal, 1990). The advantage of this approach is its focus on the central executive system, which is crucial to problems relating to reading comprehension or reasoning. Some criticize this approach, however, because of its reliance on complex memory tasks that may be arbitrary in construction, as they "do not readily lend themselves to a more detailed analysis of the memory component process" (Baddeley, 1992, p. 557).

The Cognitive Approach

The cognitive approach, which is the premise of Baddeley's theory of processing information via dual modalities, proposes a tripartite system comprising a central executive that controls two subsystems: (i) the articulatory or phonological loop, and (ii) the visuo-spatial Sketchpad. The phonological loop is assumed to be responsible for maintaining speech or verbal information, while the Sketchpad sets up and manipulates visual imagery. The use of this dual-task model to analyze the structure of the working memory system centers on these two sub-systems because researchers believe that more tasks with tractable problems involve them. As such, concurrent storage and processing of information is not the only aspect of working memory; what is crucial is the coordination of these resources (Barnard, 1986; Schneider & Detweiler, 1987). It is from this premise that Paivio's (1986, 1990, 1991) Dual Coding Theory (DCT) is founded.
In order to have a better understanding of Dual Coding Theory (DCT), a basic overview should be given of the visuo-spatial working memory component and of the factors that affect visual short-term memory (VSTM), such as type of task and time constraints.

Visual-Spatial Working Memory

Much of short-term memory (STM) research has focused on the phonological loop rather than on the visuo-spatial Sketchpad. Studies of visual memory date back to the 19th century, beginning with Sir Francis Galton (1885).

Visual Memory Capacity

Similar to the working memory capacity for verbal information, the capacity of VSTM is severely limited. Evidence of this has been documented in past studies in which individuals found it difficult to integrate information gathered from successive fixations on spatial-based coordinates. This suggests that very little information can be retained from previous fixations (Irwin, 1991; Irwin, Brown, & Sun, 1988; Irwin, Yantis, & Jonides, 1983; Rayner & Pollatsek, 1983), and that capacity is very poor for unattended information in scene perception and in social interactions (Levin & Simons, 1996; Rensink, O'Regan, & Clark, 1997; Simons & Levin, 1998). Capacity for visual memory is said to be approximately four items, but is contingent on the type of stimuli being processed (Luck & Vogel, 1997; Pashler, 1988; Phillips, 1974; Simons, 1996). Capacities in visual memory vary for different types of visuals. For letters or simple features, memory capacity is approximately four to five items (Luck & Vogel, 1997; Pashler, 1988). In contrast, memory capacity for spatial locations is far more variable, ranging from eight to thirty-two locations; recall is near perfect for five locations, but drops when more locations are added.
This occurs when only VSTM is used. When visual sensory memory, or iconic memory (the recognition of physical visual features such as color and shape, rather than semantic content), is used in addition to VSTM, capacity increases but with shorter durability (Neisser, 1967; Phillips, 1974; Simons, 1996; Sperling, 1960). The resolution, or detail, of physical attributes is also poorly retained (Intraub, 1997; Nickerson, 1965). The positioning of visuals, known as the visual recency effect, also has an impact on memory capacity and retrieval (Broadbent & Broadbent, 1981; Phillips & Christie, 1977a). This effect is contingent on variables such as the presence of secondary tasks or interpolation, the degree of difficulty of those tasks, the presence of target probes or cues, the lag time since previous visual information processing, and previous exposure to similar visuals. For example, researchers discovered that capacity is impaired after a delay when subjects are given a demanding task such as mathematical calculations (Phillips & Christie, 1977a, 1977b). This is not the case, however, if the demanding task occurs concurrently with the visual information processing. This suggests that visual short-term memory (VSTM) organization is based on: (i) a spatial configuration of the target, (ii) its relationship to the surrounding items in the display, and (iii) two sub-systems of memory (Doost & Turvey, 1971).

Visual-Spatial Organization

The processing and organization of visual information involves three integrating variables: (1) processing at the feature level, (2) processing at the representation or semantic level, and (3) processing at the space or location level. Accordingly, investigations of the unit of VSTM representation have shown that capacity can be enlarged by grouping visuals with similar features, meanings, functions, and other shared attributes into a single object (Chun & Jiang, 1998).
Thus, units of visual information are held both independently and in relation to each other. Numerous studies have found that the organization of VSTM based on spatial configurations occurs hierarchically, at multiple levels (Jiang, Olson, & Chun, 2000; Luck & Vogel, 1997). This concept is similar to relating words to the context of a passage in deriving its meaning, and to the top-down and bottom-up processing of words. The formation of spatial configurations is rapid (Chun & Jiang, 1998) and can be learned within five to ten repetitions. This is evident in graphical plotting, cartography, and the study of geographical maps, where the locations of cities, states and other landmarks are easily remembered and retrieved even over long lag times. These spatial configurations serve as a guide to contextual information in visual search tasks, counting, tracking, and book-marking; thus the individual does not need to rely on visual memory resources alone. In addition, the visuals need not be detailed or concrete; arbitrary visuals suffice for configural organization to occur. This is why the instructions in almost all analytical or logical test sections call for examinees to draw diagrams to help in selecting the correct response.

Visual Trace & Task-Interaction

Early studies of visual attention give evidence of the persistence of images in working memory in terms of size and visual trace (Baddeley, 1992). This persistence of visuals is based on associations between stored visual representations of objects and their semantic meanings (Humphreys, Riddoch, & Quinlan, 1988; Riddoch & Humphreys, 1987a, 1987b; Seymour, 1979), an explanation held by Paivio (1986, 1990, 1991). Other researchers attribute this dual-task capability to tasks that do not vie for similar cognitive resources (Marschark & Cornoldi, 1991; Marschark, Warner, Thomson, & Huffman, 1991).
As such, Paivio's (1991) extensive work on Dual Coding Theory gives an in-depth look at dual-task methodology, which will be further discussed in this study. The components of VSTM and the many variables affecting visual information processing have led other researchers to build on a modality model. Two examples are Sweller's (1976) Cognitive Load Theory, concerned with techniques for reducing working memory load to facilitate the changes in long-term memory associated with schema acquisition, and Anderson's (1981) Triple Coding System, a propositional theory of memory recognition. This study will investigate the effectiveness of Paivio's (1991) Dual Coding Theory (DCT) in the testing and assessment domains, testing the following hypotheses.

OBJECTIVES AND GOALS OF THIS STUDY

The overall hypothesis of this study is that when Dual Coding Theory (DCT) item formats are used for computerized adaptive tests (CAT), they will yield more efficient measures of a candidate's capabilities than traditional item formats, because they take advantage of the results of research on Dual Coding theories of information (Baddeley, 1997; Paivio, 1986; Sweller & Cooper, 1985). By efficient, we mean an accurate assessment of a candidate's performance that can be obtained in the shortest time possible. The objective of increasing the number of correct responses is to increase the accuracy of estimation of a person's capabilities; a decrease in the amount of time taken is fruitless, however, if the correctness of responses decreases as well. As such, the specific objectives of the study are to determine whether the use of Dual Coding Theory (DCT): (i) decreases the response time between the presentation of each item and the response to it (known as the response latency) while preserving correct responses, and (ii) decreases the average length of time for the number of correct responses relative to the proportion of the total number of items answered.
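The two efficiency criteria just stated can be made concrete with a small computational sketch. This is purely illustrative and not taken from the study's data or software: the function names, the per-examinee log structure (a list of response-time and correctness pairs), and the sample numbers are all hypothetical.

```python
# Illustrative sketch (not from the dissertation) of the two efficiency
# criteria above. Each examinee log is assumed to be a list of
# (response_time_seconds, answered_correctly) pairs; names and numbers
# are hypothetical.

def mean_response_latency(log):
    """Mean presentation-to-response time over correctly answered items."""
    times = [t for t, correct in log if correct]
    return sum(times) / len(times) if times else None

def time_per_correct_vs_answered(log, total_items):
    """Average time per correct response, and the proportion of items answered."""
    total_time = sum(t for t, _ in log)
    n_correct = sum(1 for _, correct in log if correct)
    avg = total_time / n_correct if n_correct else None
    return avg, len(log) / total_items

# A hypothetical examinee who answered 4 of 5 items:
log = [(70, True), (85, False), (60, True), (90, True)]
print(mean_response_latency(log))
print(time_per_correct_vs_answered(log, total_items=5))
```

Under these criteria, a format is "more efficient" when the first quantity falls without the correct-response count falling, and the second quantity falls relative to the proportion of items answered.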
In the following chapters, a more detailed account of Dual Coding Theory (DCT), its challenges and its current directions will be given. Variables that may affect authentic cognitive measures, such as types of mental operations, time perception, and task distractors, to name a few, will also be investigated. Research, though limited, on past investigations of testing formats that have utilized paired visuals and text in their test design and item formats will be reviewed. The analytical reasoning (AR) items of the Law School Admissions Test (LSAT) have been selected as the experimental assessment; accordingly, an investigation into the test's internal structure, validity, reliability and time speededness will be conducted. The research and methods section includes a description of the test administration and experimental procedures, with selected screenshots, to provide a clearer understanding of the processes involved in graphical user interface (GUI) development. Analysis of the results will be discussed, and future directions proposed.

DUAL CODING THEORY (PAIVIO, 1986, 1990, 1991)

THE BASIC PREMISE

The focus of DCT is the study of imagery and its functions, initiated by studies of individual differences in the vividness of imagery by Galton (1880). The Dual Coding approach hypothesizes that imagery can be objectively measured by procedures and is systematically related to performance in memory and other tasks. These independent imagery variables include: (a) image-invoking cues, such as the use of visuals as a stimulus to generate specific words, (b) procedures used to distract from or enhance the use of imagery, and (c) individual differences in the use of imagery (Paivio, 1991).
According to Paivio (1986), "human cognition is unique, in that it has become specialized for dealing simultaneously with language and with nonverbal objects and events." Moreover, the language system is peculiar in that it deals directly with input and output (in the form of speech and writing), with representational units for verbal entities known as 'logogens' and representational units for mental images known as 'imagens'. In addition, these units serve symbolic functions with respect to nonverbal objects, events, and behaviors. As such, any representational theory must take this dual functionality into account. Paivio's theory postulates that there are two sub-systems in the visuo-spatial Sketchpad, one for processing visual semantic information such as text, and the other for processing images such as objects. Three types of processing occur within these sub-systems: (1) representational, the direct processing of text or visuals; (2) referential, the activation of the verbal system (logogens) by the non-verbal system (imagens) and vice versa; and (3) associative, the activation of representations within the same system. When visual images are presented together with text, they serve two purposes: (i) to complement the text, to arrive at a better and more accurate understanding of what is being conveyed, and (ii) to alleviate the cognitive load of reading text. Past experiments have given evidence of the visuo-spatial working memory being engaged in the 'perceptive analyses' of illustrations together with the relational text (Gyselinck, Cornoldi, Dubois, DeBeni & Ehrlich, 2002). As such, when "illustrations are presented with text, the visuo-spatial working memory would be more involved, both in basic operations matching text and illustrations, and in the formation and storage of the visual traces, before the integration of the two types of information" (Gyselinck, Cornoldi, Dubois, DeBeni & Ehrlich, 2002, p. 682).
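The three types of processing described above can be sketched, purely as an illustrative aid and not as anything from the dissertation, with two tiny lookup tables standing in for the verbal (logogen) and non-verbal (imagen) systems. All unit names and links here are invented.

```python
# Toy model of DCT's two representational systems; entries are invented.
# Representational processing would be the direct mapping from a perceived
# word or picture onto a unit such as "dog" or "dog_image" in the first place.
logogens = {"dog": {"ref": "dog_image", "assoc": ["bark", "pet"]}}
imagens = {"dog_image": {"ref": "dog", "assoc": ["cat_image"]}}

def referential(unit, within, other):
    """Referential processing: cross-system activation via a logogen<->imagen link."""
    entry = within.get(unit)
    return entry["ref"] if entry and entry["ref"] in other else None

def associative(unit, within):
    """Associative processing: spreading activation within the same system."""
    entry = within.get(unit)
    return entry["assoc"] if entry else []

print(referential("dog", logogens, imagens))  # the imagen paired with the word
print(associative("dog", logogens))           # within-system verbal associates
```

The point of the sketch is only the distinction it encodes: referential processing crosses between the two tables, while associative processing stays inside one of them.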
DEVELOPMENT OF DUAL CODING THEORY (DCT)

The Conceptual-Peg Hypothesis

Research that led to the DCT was motivated by verbal associations via rhyming mnemonic techniques in human learning and thought (Noble, 1952). Mnemonic techniques explicitly require dual coding in that non-verbal images are initially generated from words during list learning, then generated from verbal cues during recall, and finally decoded back into words. For example, in the recall of a list of, say, twenty-four items, a mnemonic scheme of words that rhyme with numbers - one-run, two-shoe, three-tree, and so on - is used. The word ‘run’ elicits a mental picture of someone running, the second a pair of shoes, and so on. The technique implicates the following DCT processes: (i) verbal and imaginal referencing, (ii) verbal associations based on rhyming schemes, and (iii) imagery organization and integration. In short, investigations into DCT give evidence that: (i) imagery benefits associative learning through integrative processes of images and text, and (ii) recall of information occurs on a sliding scale from pictures to concrete words to abstract words.

Imagery-Concreteness of Word-Picture Items

DCT “distinguishes between nonverbal imagery and verbal symbolic processes, which involve independent but partially interconnected systems for encoding, storage, organization, and retrieval of stimulus information” (Csapo, 1991, p. 76). Efficient encoding of ‘logogens’ and ‘imagens’ is dependent on the type of words used, concrete or abstract, and the degree of similarity or relation between words and visuals. In short, DCT suggests that a mixture of integrative and independent encoding of both images and text serves for better recall. This is because a simultaneous text and visual presentation of information is encoded both as images and as verbal traces (Csapo, 1991).
Results from past studies indicate that recall depended on the concreteness of the retrieval cue (Paivio, 1971; Yarmey & O’Neill, 1969). For example, it would be easier to form a visual representation of a concrete word such as tiger than of an abstract word such as discipline. Concreteness is not merely limited to single-word nouns, verbs, or adjectives. Begg (1972) discovered that using concrete phrases (e.g. the white horse) versus abstract phrases (e.g. basic truth) increased the capacity for free recall. He attributed this to the integration of imaginal memory traces that are redintegrated to higher-imagery words. Other theories have challenged the concreteness hypothesis; however, to date, evidence supporting “the Dual Coding Theory view of concrete material being better recalled due to additive effects of independent verbal and nonverbal (imagery) codes when all else is equal” (Sadoski, Goetz & Avila, 1995) still holds.

Synchronous Organization

A major hypothesis of DCT is the integration of text and visual information presented simultaneously versus sequentially. This does not mean that all information is simultaneously processed, but that information is ‘available’ for processing simultaneously as needed (Paivio, 1986). In keeping with the associative elements between text and visuals, it is crucial that the pairs are presented as unitized compounds, i.e. having associative semantic meaning (Davidson & Adams, 1970; Epstein, Rock & Zuckerman, 1960; Reese, 1972; Rohwer, Lynch, Suzuki & Levin, 1967). There exist functional criteria for synchronous organization of information. These include the following: (i) memory for spatial relations, (ii) simultaneous availability of grouped component information, (iii) freedom from sequential constraints, and (iv) redintegration effects, when a component or element of a unit of information is used as a retrieval cue for the entire previous occurrence.
These effects, according to Paivio (1991), are “the occurrence of an idea which is simultaneously accompanied by other ideas that are derived from perceptual experiences in which the component elements occurred together” (p. 68).

Symmetrical-Asymmetrical Associative Items

Another important variable in the recall-retrieval of information is the issue of symmetrical properties among paired items. Forward and backward recall of items is dependent on the degree of concreteness of the items themselves. Smythe (1970) concluded that “picture and concrete noun pairs resulted in symmetrical forward and backward recall, whereas abstract noun pairs generally showed higher forward than backward recall” (Paivio, 1991, p. 67) (cf. Yarmey & O’Neill, 1969). His experiment also measured the latency of correct responses and discovered that for concrete noun-paired words or picture pairs, both forward and backward recall was equal. These findings support the DCT concept that recall of pictures and concrete noun pairs is mediated by visuals containing synchronously organized information and can be processed without sequential constraints. Sequential constraints are typical of verbal representations that cannot be easily and readily recoded into images. Theories opposing DCT are discussed in the following section.

OPPOSING THEORIES OF DUAL CODING THEORY (DCT)

The propositional theory of recognition memory holds that visual information is transformed into semantic form for storage in long-term memory (LTM). Although the propositional theory acknowledges the existence of visual processing in visual short-term memory (VSTM) or short-term memory (STM), it disputes the superiority of images over words. Some theories suggest that the superiority of images is due to people “process[ing] and rehears[ing] pictures more fully than words and sentences [which] results in more propositional information [. . .]
when visual representations are provided than when information is given only in verbal form” (Rieber, 1994, p. 114). Further studies, in particular by proponents of Artificial Intelligence (AI), have demonstrated that visuals are remembered by their meaning rather than their physical visual features. This unitary view of pictures and words implies that both text and images are stored in the same way, and that there is no difference in the storage of verbal and visual information. Many researchers in Artificial Intelligence (AI) hold this amodal theory of the abstract representation of knowledge (Driscoll, 1994; Molitor et al., 1989). Another argument regarding the superiority of visuals attributes differences in information processing to age differences. Simpson (1995) believes that age differences play a vital role in choosing specific modalities to use when processing various forms of information. He argues that younger individuals process information more in the visual modality, whereas older individuals favor the text-semantic mode. This could be attributed to the larger vocabulary of older individuals, who have built a broader word base. Other views opposing Paivio’s (1991) Dual Coding Theory have also emerged. The computational theory known as ‘connectionism’ (Potter & Faulconer, 1975; Seymour, 1973; Snodgrass, 1984; Theios & Amrhein, 1989), though gaining prominence in cognitive psychology and cognitive science, is only beginning to be applied to visual representation and imagery problems; its potential for handling a wider range of such phenomena still remains to be demonstrated. Others include implicit and explicit memory effects versus visual superiority (Mel, 1986; Roediger & Weldon, 1987; Weldon & Roediger, 1987) and relational and distinctive processing (Marschark & Hunt, 1989).
CONCLUSION

Though there have been theories that argue against Paivio’s (1986, 1991) theory of information processing, Dual Coding Theory presents a model that is conducive to the assessment and testing arena. These domains require the presentation and/or creation of visual-text representations, as they may be a more accurate and effective measure of human problem-solving capabilities. This is based on further studies conducted as an extension of the DCT by Marschark and Paivio (1977) to determine the superiority of visuals in recall tasks. Results from their studies indicate the following: (a) imagery was reported much more often than verbal strategies, (b) verbal strategies were predominant for abstract items, and (c) frequency of images correlated positively and significantly with free and cued recall of both concrete and abstract items (Paivio, 1991). A complete summary of theoretical and empirical assumptions and phenomenal domains of DCT is attached in the appendix (Appendix A). Though other views have offered alternative hypotheses to those of DCT proponents, their results have not held up across a variety of experimental conditions. Research findings from DCT studies and its proponents have proven to be the most credible under a variety of circumstances thus far. It is critical to note, however, that information processing of paired visuals and text may be affected by specific variables that may impede or enhance efficient processing. These variables are discussed in the following chapter.

FACTORS AFFECTING VISUAL SHORT-TERM MEMORY

COGNITIVE ANALYSIS OF VISUAL PROPERTIES

Diagrammatic properties and formats are important variables to consider in selecting the appropriate visual that matches the specific cognitive process. Diagrams are not a homogeneous class of representations, but have a variety of formats and uses.
As such, diagram features must be considered in relation to their objective goal and what they intend to represent. Classifications of images and graphs are categorized as either structural or functional. Functional categories focus on the intended use and purpose of these diagrams, such as ‘how-to’ manuals, whilst structural classifications focus on the representation or form of the image rather than its content, such as bar charts and pie charts. A set of functional roles has been identified from numerous research studies to serve as a framework for diagrammatic selection, as follows (Cheng, 1996).

1. Spatial Structure and Organization - Diagrams that depict spatial features and arrangements of their components are crucial in maintaining what in HCI is known as ‘white space’. This facilitates accurate discrimination among grouped visuals and text, and prevents the overlapping and misrepresentation of information. An example is the display of visuals on a menu bar as icons with clickable functions. They are separated from other visuals on the screen by ‘white space’ to allow for this discrimination.

2. Capturing Physical Relations - Diagrams are used at times to highlight specific physical relations that are important to the specific task; for example, an illustration of an electrical circuit would demonstrate the inter-connectivity and sequence of components.

3. Physical Assembly - Some diagrams illustrate how something is physically assembled from various components. These are similar to blueprints in engineering and architecture.

4. Identifying Variables, Terms and Components - Diagrams at times are used to define and identify specific components, variables, and features, such as the specific symbols used in electrical circuit diagrams for components like circuit breakers.

5. Displaying Values, States etc. - Diagrams are often used to represent quantitative data in the form of bar graphs, charts, etc. Some depict states or conditions, such as weather conditions.
6. Capturing Laws and Theories - Some diagrams embody theoretical laws and theorems, such as those of geometry and topography, within their structure; an example is the Item Characteristic Curve (ICC) in psychometrics.

7. Flows, Sequences and Processes - Diagrams are used to represent simple and complex flows of processes, both linear and non-linear, such as loops, cycles, and sequence stages.

The mode of displaying visuals is also crucial to their synthesis, processing, and understanding. Visuals displayed on computer screens are affected by different variables than visuals displayed on paper. In addition, even the screen size and resolution of Computer Display Terminals (CDTs) affect how visuals are processed. McCormick et al. (1987) define visualization on computers as “the study of mechanisms in computers and in humans which allow them in concert to perceive, use, and communicate visual information” (Lohse, Biolsi, Walker & Rueter, 1994, p. 36). Lohse et al. (1994) conducted a research study to investigate the organization and visualization of images among individuals based on three specific tasks: naming, rating, and sorting of visuals. The rating scale was scored on a 10-point scale of anchor-point phrases. In this study, visual displays are used as “data structures for expressing knowledge, which help facilitate problem solving and discovery by providing an efficient structure for expressing the data” (Larkin & Simon, 1987; Lohse, Biolsi, Walker & Rueter, 1994, p. 37; Rumelhart & Norman, 1988). Results of the study indicate a classification of approximately eleven visual types of representations, but with apparent inconsistencies. According to Lohse et al. (1994), these apparent inconsistencies are contingent on how well the graphic is represented, the type of task, and the display acreage. Computer Display Terminals (CDTs) are limited in their available acreage for displaying information.
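The Item Characteristic Curve mentioned above is worth making concrete: in item response theory, it plots the probability of a correct response against examinee ability. A minimal sketch of the widely used three-parameter logistic form follows; the parameter values are illustrative assumptions, not values estimated in this study:

```python
import math

def icc_3pl(theta, a=1.0, b=0.0, c=0.2):
    """Three-parameter logistic Item Characteristic Curve.

    theta: examinee ability; a: item discrimination;
    b: item difficulty; c: lower asymptote (guessing parameter).
    Returns the probability of a correct response.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# The curve rises monotonically from c toward 1 as ability increases.
low, middle, high = icc_3pl(-2.0), icc_3pl(0.0), icc_3pl(2.0)
```

With these illustrative parameters, an examinee whose ability matches the item difficulty (theta = b) answers correctly with probability c + (1 - c)/2 = 0.6; it is this lawful shape that makes the ICC a diagram embodying a theory rather than merely displaying data.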
Though computers are able to afford an environment for the dynamic display of information and interactive exchanges, the shortage of display space will affect the visual-spatial layout of information, which will undoubtedly affect how efficiently information is processed. As such, the layout of visuals within the specific parameters of Computer Display Terminals (CDTs) is crucial.

VISUAL-SPATIAL LAYOUT

According to Norman (1993), “solving a problem simply means representing it so as to make the solution transparent” (p. 53). His theory postulates that the degree of difficulty of a task is dependent on the presentation format of the problem. Information presented in technological environments could either change what may be a relatively simple task into a challenging one, or aid the individual engaged in a complex task by providing guides toward the correct and desired solution. Norman (1993) lists three stages of information processing and retrieval from User Interface (UI) designs on computer screens:

Organization of Information → Search of Information → Computation of Information

Figure 7. Stages of Information Processing (Norman, 1993)

This theory of task functionality is founded on a user-centered design philosophy in the creation of digital interfaces for information presentation. The theory follows two specific principles as guidelines for effective UI displays: the naturalness principle, the design of representations whose properties match the properties of everyday things (Norman, 1986), and the perceptual principle, the design of perceptual and spatial representations only if the representation and what it stands for is natural. Norman (1993) emphasizes two key interactive components that must always be at the forefront of any principle or concept, i.e. design to fit the person and the task at hand.
Since then, there has been a tremendous amount of research into task performance in complex interactive systems that employs theories from HCI and cognitive neuroscience. The selection of visuals, item types, and formats must also take into account the time constraints placed on a high-stakes assessment test, what in CAT environments is known as time speededness. Time perception and its effect on performance are discussed in the next section.

TIME PREDICTION AND TASK COMPLETION PERCEPTIONS

Time differs from most other dimensions of the environment in that there is no specific sensory organ for its perception (Repp & Penel, 2002). Past research has discovered that even experts at specific tasks make over-optimistic time predictions, despite having experienced similar tasks taking longer to complete than anticipated. This is known as the ‘planning fallacy’ (Kahneman & Tversky, 1979). The theory states that people tend to focus on the current task at hand during the planning stages rather than reflecting on the time taken to complete past similar tasks. From an HCI point of view, time perception is seen as a combination of man and machine interaction (Decortis, Keyser, Cacciabue, & Volta, 1991). The concept of time is broken down into six specific sections, each with sub-sections of variables influencing perception. They include:

1. Temporal structures of man-machine interaction.
2. Attributes of the structure and its relation to events in terms of sequence, nature, etc.
3. Key functions of controls in the system.
4. Adequate tuning of the operator to the system in order to arrive at a comfort zone of optimal performance.
5. Temporal errors.
6. Varying time perceptions from one operator to the next.

To fully understand the impact of time on project task completion and actual performance, it is important to know how specific lengths of time are allocated to tasks or particular series of tasks.
Various factors impact time prediction of task completion. They include: (1) previous experience of the task, (2) the structural and sequential nature of the task and distractors, and (3) the cognitive complexity and duration of the task (Thomas, Newstead & Handley, 2003).

Experience-Rehearsed Sessions

In their experiment, Thomas et al. (2003) found task experience to be an important determinant of the time prediction process. Participants used their initial experience with a task as an anchor for adjusting to the next similar task. Their study also highlighted the importance of task experience to prediction accuracy, but only contingent on temporal distance or lag period, i.e. the amount of time gone by between the first and second sessions of two similar tasks, and the time between previous and current non-similar tasks. In addition, time perceptions are not constant across a single task duration. Time experience and its impact in HCI have also shown similar effects on perception (Decortis, Keyser, Cacciabue, & Volta, 1991). Specific variables that need to be investigated include:

1. Time required for each action of the GUI, i.e. mouse clicks, mouse movement, and keyboard input.
2. Time in deciding which function is to be selected for specific purposes, taken together with the first variable and categorized as an explicit variable.
3. Transition time from one functional mechanism to another between subject and operator, classified here as an implicit variable.

Nature of Task & Distractors

“Time judgment performance may display a progressive deterioration as greater amounts of attention resources are diverted away from the timed task” (Brown & Boltz, 2002, p. 601). This is dependent on the nature of distractors during the task, their duration, the structure of the task, and mental workload, which requires more memory storage.
Brown and Boltz (2002) discovered that these variables had a definite effect on time judgment, independently as well as interactively. Results from their study give evidence of the following:

1. Mental Workload - Errors in timed judgments were more common in dual-task (timing plus target detection) than in single-task (timing only) conditions. With dual tasks, attentional resources are utilized more than during a single task. Hence DCT serves as a perfect model to preserve these resources.

2. Event Structure - Events that were inconsistent and disorganized produced more errors in judgment.

3. Duration - Based on Vierordt’s Law (1868), shorter intervals resulted in overestimation of time judgments versus underestimation for longer intervals (Bobko et al., 1977; Stevens & Greenbaum, 1966).

Task Complexity & Duration

The type of task and its duration are important determinants of performance, retrieval, and storage because of the limited capacity of short-term memory (STM) and visual short-term memory (VSTM) in particular. Types of mental tasks will be discussed and elaborated in the following section. Time estimation is important from both an artificial intelligence (AI) and a cognitive psychology perspective. In HCI, this allows one to predict temporal errors and improve functionality. In standardized tests, the time limits that have been imposed serve two specific functions. First, time is considered an inherent part of the construct, as it “reflects intellectual power primarily, rather than the rate at which examinees work” (Bridgeman, Cline, Hessinger, 2003). Second, it serves as a standardized measure for all examinees, i.e. the test being administered in the same way. The DCT model offers an efficient solution by affording a decrease in cognitive workload, such that unrealistic predictions of time on task and error-free estimation will be accommodated.
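The over- and underestimation pattern of Vierordt's Law described above can be sketched as a simple central-tendency model, in which judged durations regress toward an intermediate 'indifference' interval. The functional form and parameter values below are illustrative assumptions, not estimates from the cited studies:

```python
def judged_duration(actual, indifference=3.0, slope=0.7):
    """Central-tendency sketch of Vierordt's Law.

    Judged durations regress toward an 'indifference point':
    a slope below 1 makes short intervals overestimated and
    long intervals underestimated. Values here are illustrative.
    """
    return indifference + slope * (actual - indifference)

short_judged = judged_duration(1.0)   # above 1.0 s: overestimated
long_judged = judged_duration(10.0)   # below 10.0 s: underestimated
```

At the indifference point itself (here, 3 seconds) the judged and actual durations coincide, which is the crossover between over- and underestimation that the law describes.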
TYPES OF MENTAL OPERATIONS

Generally, human problem solving can be broken down into three distinct dimensions: (i) the task dimension, one’s interaction with the environment, (ii) the performance-learning-development dimension, differences among individuals performing a task, learning to do a task, and developing a task, and (iii) the individual-difference dimension, the variety of systematic ways each person arrives at the target solution. Thus, efficiency in problem-solving capacity and ability is contingent on an individual’s internal representation of the problem itself. Factors such as: (i) the nature of the problem, (ii) individual perception of the problem, (iii) matching the problem with past knowledge to arrive at a solution, and (iv) exploratory avenues of solutions not yet discovered, all exert a great deal of influence on problem-solving capacity and ability (Eisenstadt & Kareev, 1975). Though many studies have confirmed the multi-component structure of working memory (Baddeley, 1992), the advantages of the structure are contingent on the type of task required during the processing of information. The degree of involvement of the two subsystems is thus contingent upon the information format and layout of presentation, and the nature of the task to be measured.

Quantitative Reasoning

Hitch (1978) and Hayes (1973) suggested that in mental computations, similar to those written on scratch paper, mental notations such as intermediate sums or carries are held in the visuo-spatial sketchpad as imagery. Other investigations, however, attribute mental arithmetic tasks to a phonological coding process that retains operands in the phonological loop (Furst & Hitch, 2000; Heathcote, 1994). When mental computations were presented in a vertical format, participants responded more rapidly than when the problems were presented horizontally.
However, as the task load increased and became harder, the differences in performance between items presented vertically and those presented horizontally were smaller. “These tradeoffs suggest differential involvement of phonological and visual working memory as a function of problem format” (p. 742). It is therefore important to remember that the first step in any mathematical problem is to understand the representation, and then to evaluate the cognitive load of the arithmetic problem when analyzing performance. The degree of cognitive load, in turn, affects attentional and time-prediction resources and the management of cognitive goals (Trbovich & LeFevre, 2003).

Sentence Verification Tasks

Problem representations that require an analysis of sentences or words, such as naming, involve two additional processes: determining the meaning of the picture, and finding a name for it, which involves matching and filtering of semantic or verbal codes. Clarke and Chase (1972) were the first to embark on the sentence verification task, providing students with true-or-false test sentences with respect to a picture. Results demonstrated that subjects went through a series of ‘discrete stages’ in which both pictures and text were encoded into a common abstract representation. This resulted in the Multimodal Theory of picture-word processing. However, when the task of naming was changed to a comparison of a test sentence to either a picture or word input, format effects could not be predicted (Clarke & Chase, 1972). Further research (Glaser & Glaser, 1989; Glenberg & Langston, 1992; Smith & Magee, 1980; Theios & Amrhein, 1989b) discovered that a Multimodal Theory of pictures and words would not predict the effects of format on information processing, because access to the verbal or semantic network would be abstract rather than concrete, as proposed by DCT.
Text and diagrams containing similar information are not equal in terms of the processing required to extract the information. Goolkasian’s study (1996) examined whether format effects had any influence on four specific tasks: probability judgment with colors, probability judgment with shapes, category inclusion, and pragmatic inference. The findings are as follows:

1. Pictures had an overall advantage in terms of how information is extracted versus having access to semantic memory.
2. Pictures aid in the comprehension and retention of text through working memory management.
3. Pictures facilitate more efficient reasoning related to probability judgments with colors and shapes than to category inclusion and inferences.
4. Shapes were closer in visual detail when compared to other attributes.
5. All conditions demonstrated similar format effects but varied in response efficiency and effect of item type.
6. Compared to probability judgment, problem solving with inclusion and inference items was more affected by the test-text statement.
7. There is a performance advantage when pictures occur relative to when words occur, in a variety of conditions.

Though Goolkasian’s experiment showed that the advantage of pictures was based on how information was extracted, versus its access to the semantic memory system as hypothesized by the Multimodal and DCT models, the pictures used in the study did not have an associative or paired semantic relation to the accompanying text. Imagery, according to Paivio, can help mediate performance by serving as a reference or interactive relation to language.

Image and Object Manipulation Tasks

In visual perception, spatial knowledge is crucial in detecting object location, direction, and recall.
There are specifically four types of spatial relations: (i) direction relations describing order in space, (ii) topological relations describing neighborhood and incidence, (iii) distance relations, and (iv) ordinal relations that describe inclusion (Pullar & Egenhofer, 1988). Space-based theories often emphasize the distance between the target and distractor stimuli irrespective of retinal location, as this distance is an important issue in effectively processing visual stimuli. Visuals and the space they occupy, and the distance between the target object and distractors within the parameters of the retinal view, are crucial in the effective processing of visual information. This also permits effective visual manipulation of objects and transfer. This is even more of a pertinent issue when visual representations occur in the computer 2D environment, because of the limits imposed by screen size and resolution. Some of the categories of mental operations described constitute an overall description of basic higher-order cognitive skills of problem solving, analysis, manipulation, and transfer. Specific distinctions of different complex cognitive tasks are many and varied, requiring an interaction of all four cognitive skills. Much of the research on individual performance on complex tasks has been done by investigating specific cognitive skills while controlling for any probable interaction, and by examining the interactions themselves.

INDIVIDUAL DIFFERENCES

Many research studies have investigated individual differences in the cognitive processing of visual and verbal information and the effect on performance. According to Reinert (1976), “these cognitive abilities can be thought of as perceptual modalities, channels through which the individual receives, uses, and retains information.
Each person is ‘programmed’ in certain ways, so that one particular cognitive ability becomes more compatible in confronting and obtaining information, whereas other abilities may be less effective” (Van Dusen, Spach, Brown, & Hansen, 1999, p. 1030). The term ‘visual learner’ has often been used to describe one’s learning style. In recent years, however, researchers have obtained results that do not support a strong relation between individuals who rely primarily on images to perform cognitive tasks and performance on high-imagery tasks. On the contrary, current studies have indicated that the processing of visual-verbal information is not a unitary construct, as proposed by Paivio (1991), but involves an integration of problem-solving cognitive skills at varying levels among different individuals. The reason for this great variation is that “imagery is not general and undifferentiated but composed of different, relatively independent visual and spatial components” (Baddeley, 1992; Farah, Hammond, Levine, & Calvanio, 1988; Kosslyn, 1994; Kozhevnikov, Hegarty, & Mayer, 2002, p. 48; Logie, 1995). Several researchers (Moses, 1980; Suwarsono & Presmeg, 1986a, 1986b) proposed that visualization can be placed on a continuum, called ‘degree of visuality’, while solving mathematical problems. Results from this approach failed to connect the degree of visual ability to levels of spatial ability. Following these incongruences, researchers such as Kosslyn (1995) proposed categories made up of visual ability, either high or low, and spatial ability, either high or low. Imagery involving different types of visuals was identified by Presmeg (1986a, b) in mathematical problem-solving tasks. These include concrete pictorial imagery, pattern imagery, kinesthetic and dynamic imagery, and memory for formulas, with pattern imagery classified as playing the most important role in mathematical problem solving.
This is because pattern imagery disregards concrete details and focuses on pure relations. Finally, Hegarty and Kozhevnikov (1999) discovered that visual-spatial representations can be divided into primarily schematic or primarily pictorial, and found that “the use of schematic representations was significantly correlated with students’ spatial visualization ability” (Kozhevnikov, Hegarty, & Mayer, 2002, p. 51). In high-stakes assessment environments, reading to search for information with the intent of answering specific questions, under a time constraint, while familiarizing oneself with the functions of the interface, is a multi-task criterion for the candidate. Specific variables, as already described, will have an impact on this process. The following chapter will review past and current visual-text assessment formats and their results and conclusions.

VISUAL-TEXT ASSESSMENT FORMATS

EARLY RESEARCH

There exist only a limited number of studies that have concentrated on the use of visualized tests. The earliest research utilizing visuals and text in educational standardized tests was conducted by Brown (1947), entitled A Comparison of Verbal and Projected. Specifically, the objective of his study was to look at the student’s ability to “use principles to explain, to predict, and to arrange conditions to bring about a desired end” (p. 1). His argument lends support to the current study, in that in pencil and paper standardized tests where student ability is correlated to teacher scores:

1. Words used in verbal tests may not invoke similar meanings among examinees.
2. Intended meanings expressed by the test construct may not carry the same intended meaning among examinees.
3. The verbal constructs of words, phrases, and meanings may not allow for synchronous processing to occur, i.e. where information is not simultaneously presented, requiring examinees to piece together the problem, thereby causing a cognitive overload.
4. The quality of verbal information processing is influenced by an individual’s reading rate and overall comprehension speed.

A summary of the results from Brown’s (1947) study is as follows:

1. The scores were stable throughout the period of the performance test for each examinee.
2. There were no significant differences between the two test formats with regard to the number of items.
3. Both formats correlated with the performance test criterion to a degree sufficient to indicate good predictive power and generalizability.
4. The verbal-pictorial format, based on matched and paired items, is a more valid predictor of student performance for IQ levels of 100 and below.
5. The verbal-pictorial test format was significantly less difficult than the verbal test, while still maintaining its validity.

These earliest findings strongly suggest the supremacy of a DCT method of information processing as hypothesized by Paivio (1986, 1990, 1991). Building on Brown’s (1947) study of verbal-pictorial test formats, a section of Lefkowith’s (1955) research focused on the reliability and validity of pictorial tests in actual testing programs. A pool of 60 multiple choice questions, each with five answer choices, was administered to examinees in two different formats: a verbal or text-only format, and a pictorial-text format utilizing visuals as cues. The overall results from the study indicated that:

1. The correlation between examinee scores and the pictorial test method was higher as the pictorial stimuli became more iconic, a characteristic of the VSTM.
2. Pictorial tests were valid and reliable enough for use in actual testing programs that complement teaching and instruction in K-12 curriculums.

All three hypotheses were realized. Since Brown’s (1947) and Lefkowith’s (1955) visual-text studies, there have been sporadic attempts to build on these hypotheses, resulting in a variety of outcomes and results.
PAST RESEARCH

Dwyer and De Melo’s (1984) study, entitled Effects of Mode of Instruction, Testing, Order of Testing, and Cued Recall on Student Achievement, consisted of five types of evaluation formats: (1) a drawing test, the ability to re-create items in their appropriate context, (2) an identification test, used to measure the ability to discriminate one structure from another, (3) a terminology test, measuring specific domain knowledge, (4) a comprehension test, an evaluation of the application of learned information, and (5) a total criterion test, a combination of all the formats above. The content of their test was the functions of the human heart and its internal processes. The results indicate that using visuals to complement verbal instruction assists in recall, and that the higher mean scores among students who took the verbal test format disappeared on the delayed two-week retention tests. Overall, the visual testing format that was predicted to improve performance over the verbal format did not do so. However, the investigators attributed this to the following:

1. It was the participants’ first exposure to visual testing; rehearsal sessions would have altered the results significantly.
2. The visual format items were designed to be congruent to the verbal distractors (the non-correct answers used to re-direct the participant’s focus from the correct answer) of the verbal items. This is not a matter of merely translating a verbal format into a visual one, as visual images and their distractors are processed and filtered differently.
3. Only one type of testing format was used in the visual version of the test, matching the correct image to the multiple choice responses. The verbal category included a variety of formats, such as labeling, naming, and drawing.

A significant finding in Dwyer and De Melo’s (1984) study was the advantage of the verbal test format disappearing after two weeks in delayed retention tests.
Building on this premise, Richards (1987) revised their tests and focused on the aspect of immediate versus delayed test formats, both in a verbal and a visual version, this time on computer displays. In addition, he also investigated time spent on tests as a valid variable measure. A concise table of overall test item reliabilities is illustrated below.

Table 1. Immediate-Delayed Test Reliabilities for all Criterion Tests

Test                          Reliability
Drawing                       0.722
Identification                0.653
Terminology                   0.747
Comprehension                 0.743
Total (Id. + Term. + Comp.)   0.816

Using the Kuder-Richardson test reliability formula, estimates of parallel reliabilities for both the immediate and delayed tests, according to Richards, proved satisfactory. It is important to note, however, that his reliability estimates would not be satisfactory by today’s standards. A good estimate would be between 0.8 and 0.9, while anything at or below the 0.7 index level would be considered poor. After analysis of the results, there were no significant differences between testing modes among all tests, nor in time spent on all three types of categories and on both formats. In contrast to the past preliminary research studies of verbal-visual testing modes and formats, McNeal (1994) utilized Paivio’s (1976, 1981, 1991) DCT as the core theoretical foundation of her research in assessment and testing, using the following verbal test formats: (i) an identification test comprised of multiple choice responses, (ii) a terminology test made up of multiple choice and fill-in-the-blank items, and (iii) a comprehension test also made up of multiple choice responses. The following item types were used for the visual format: (i) an identification test with one visual and four to five text labels at any one time, (ii) a terminology test using at least two visuals to fill in the blanks for the part of the heart associated with the function, and (iii) a comprehension test offering visuals for four multiple choice options.
Finally, scores from each criterion test were combined to form a 62-item composite of the visuals-plus-text format. The visuals used in McNeal’s (1994) study were simple line drawings in black and white combined with text. A sample of a test format is attached in the appendix (Appendix B). Estimates of the reliability coefficients for all items in the various formats were within the range of 0.70 to 0.92. Results indicate the following:

1. Though the “means on the visual form of the criterion measures were not generally deviant from those on the verbal forms,” the “standard deviations were usually higher on the visual form of the criterion measures” (p. 54).
2. The mean achievement scores on the combined visual-text section of all categories were significantly higher. An overall comparison of composite scores across all formats is illustrated in the appendix (Appendix C).

McNeal (1994) attributed the higher achievement results among examinees in the composite visual-text format (T4) to the following DCT (Paivio, 1976, 1986, 1990, 1991) principles of information processing:

1. When concepts are stored in both a verbal and a visual code, they are retained in memory longer and are more easily accessible. Paivio (1976, 1981, 1990, 1991) identifies this as the code-additivity hypothesis: encoding in both visual and verbal forms facilitates memory (Mayer & Gallini, 1990; Park & Hopkins, 1993; Rieber & Kini, 1991; Sadoski, Goetz, & Fritz, 1993a, 1993b).
2. The notion of referential connections, i.e., associations between text and visuals in the DCT approach, allows for great flexibility in human cognition (Sadoski, Paivio, & Goetz, 1991).
3. Another dual coding principle is encoding-specificity, matching the assessment format to the instruction-learning situation.
4. The superior performance on the prose section of the visual-text format was attributed to deeper levels of information processing (Craik & Lockhart, 1972) of both visuals and text.
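The KR-20 internal consistency estimates cited for these studies can be reproduced directly from a matrix of dichotomously scored item responses. The following is a minimal sketch, with a hypothetical 5-examinee, 4-item score matrix (the data are illustrative, not from Richards' or McNeal's studies; this version uses the population variance of total scores, though some treatments use the sample variance):

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 internal consistency for dichotomously (0/1) scored items.

    responses: list of examinee response vectors, one list of 0/1 scores each.
    """
    k = len(responses[0])                       # number of items
    totals = [sum(person) for person in responses]
    var_total = pvariance(totals)               # variance of total scores
    # Sum of p*q over items, where p = proportion correct on the item
    pq_sum = 0.0
    for item in range(k):
        p = sum(person[item] for person in responses) / len(responses)
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / var_total)

# Hypothetical score matrix: 5 examinees x 4 items
data = [[1, 1, 1, 0], [1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
print(round(kr20(data), 2))  # → 0.79
```

An estimate near 0.79 would, by the standard discussed above, sit just at the boundary of acceptability.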
Most of the significant findings were confined to the combined visual-text test format versus the visual-only test format. McNeal (1994) attributed the significant differences in verbal test format scores in the comprehension and composite sections for the three instructional conditions to prior knowledge and familiarity with the test format.

CURRENT RESEARCH

More and more tests and assessments now utilize measures geared toward CAT environments, with in-depth research focusing on new and innovative measures. The listening-comprehension section of the Test of English as a Foreign Language (TOEFL) now includes visual accompaniments to verbal stimuli. Ginther’s (2001) report, entitled Effects of the Presence and Absence of Visuals on Performance on TOEFL CBT Listening-Comprehension Stimuli, sought to explore the following questions:

1. Do subjects perform better on test items when they are presented in a dual-modality format? (Baddeley, 1992)
2. Is there an interaction between visuals and other stimuli?
3. Is the effect on examinee performance a result of English proficiency, or is it related to visuals making the task easier?
4. Is there a clear preference for the dual-modality format or the audio-only format?

Results indicated that the effect of visuals was contingent on the experimental condition, as the only significant finding was in the ‘mini-talks with content visuals’ condition, with no significant change in the other conditions. Contrary to Ginther’s findings, Paivio and Desrochers (1980) examined the effectiveness of visuals in bilingual acquisition and second language learning by investigating results from relevant experimental studies. These experimental studies (Kellogg & Howe, 1971; Wimer & Lambert, 1959) have consistently shown that L2 (second language) responses are learned with fewer errors and in fewer trials if visual referents rather than L1 (first language) words are used as cues or stimuli.
In tests and assessments, the interaction effects between content and visual characteristics extend to what Cronbach (1975) refers to as “a hall of mirrors that extends into infinity.” To date, researchers in the field of assessment and testing are exploring new and innovative ways to develop more valid and reliable measures of an examinee’s ability, taking into account individual differences that exist in information processing. Researchers are finally beginning to understand the importance of cognitive foundations and their impact on tasks, something cognitive psychologists have long studied and investigated. In the next chapter, the LSAT and its internal structure will be assessed.

THE LSAT (THE LAW SCHOOL ADMISSIONS TEST)

ITEM TYPES AND MEASURES

The LSAT is currently the official admissions test for all candidates gaining entry into law school across the United States and Canada. Three types of test items make up the LSAT: Reading Comprehension (RC), Logical Reasoning (LR), and Analytical Reasoning (AR). The purpose of each item type is to measure specific cognitive abilities. Questions have been raised regarding the authenticity of LSAT item types with respect to the specific cognitive abilities they are intended to measure. Wilson and Powers (1994) investigated the internal structure of the LSAT to review the reliability and validity of the three specific item types and the abilities they measure. The specific skills to be assessed operationally by the three item type categories are as follows:

• Reading Comprehension (RC) - This section requires the examinee to read a passage so as to determine relationships among various parts of the passage and draw inferences from it. The cognitive abilities measured here include inferring, filtering, association, and transfer of applicable information.

• Logical Reasoning (LR) - An examinee is required to read and understand the argument or reasoning in a passage.
These questions test reasoning, logic, and the drawing of critical conclusions from given evidence or premises.

• Analytical Reasoning (AR) - A set of conditions or rules is presented, and the examinee is expected to draw conclusions using these rules. This section measures “the ability to understand a structure of relationships and to draw conclusions about the structure” (Wilson & Powers, 1994, p. 1).

The following table illustrates a breakdown of the correlations of the three item types.

Table 2. Observed Correlations and Correlations After Correction for Attenuation: LSAT (Law School Admissions Test) Section Scores, June 1991 and October 1991 Forms of the LSAT*

                  June 1991                       October 1991
Section    LR25   LR24   RC28   AR24       LR25   LR24   RC28   AR24
LR25      (.78)   .97    .91    .71       (.77)   .96    .89    .68
LR24       .76   (.79)   .89    .72        .74   (.77)   .87    .64
RC28       .72    .71   (.80)   .63        .69    .68   (.79)   .59
AR24       .55    .56    .50   (.77)       .52    .49    .46   (.76)

Note: Observed correlations are shown below the diagonal; corrected correlations are shown above the diagonal; diagonal elements are estimated KR-20 reliabilities. *Data from unpublished ETS internal test analyses for forms of the LSAT used in the present study; KR-20 = internal consistency estimates.

The pattern of correlations shown in Table 2 supports the specific skill each item section is intended to measure. Corrected correlations for the Reading Comprehension (RC) and Logical Reasoning (LR) items ranged between .87 and .91. This inter-correlation supports the conclusion that similar abilities are being measured in Reading Comprehension (RC) and Logical Reasoning (LR). In contrast, the Analytical Reasoning (AR) items have corrected correlations ranging from .59 to .63, evidence that Analytical Reasoning (AR) items have been developed to measure abilities that are not measured by Reading Comprehension (RC) and Logical Reasoning (LR).
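The corrected values above the diagonal in Table 2 follow the standard Spearman correction for attenuation, which divides an observed correlation by the geometric mean of the two reliabilities. A short sketch, checked against the June 1991 LR25/LR24 entry of the table (observed r = .76, KR-20 reliabilities of .78 and .79):

```python
from math import sqrt

def disattenuate(r_xy, rel_x, rel_y):
    """Spearman correction for attenuation: the estimated correlation
    between true scores, given an observed correlation and the
    reliability of each measure."""
    return r_xy / sqrt(rel_x * rel_y)

# June 1991 LR25 vs. LR24 from Table 2:
corrected = disattenuate(0.76, 0.78, 0.79)
print(round(corrected, 2))  # → 0.97, matching the entry above the diagonal
```

The same computation reproduces the other above-diagonal entries from the observed correlations and the diagonal reliabilities.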
Predictive Validity

Since the Wilson and Powers (1994) study, other studies have investigated the validity of specific item types of the LSAT as a predictor of success in law school. “The general concept of validity is a broad one, encompassing the accumulation of data to support a particular use of a test. The particular type of evidence obtained from the correlation studies is referred to as predictive validity” (Anthony, Harris, & Pashley, 1999, p. 2). In their report, Anthony et al. (1999) investigated the LSAT as a predictor of first-year law school average scores (FYA) for 1995-1996, known as the criterion variable. The LSAT and first-year average (FYA) data were gathered from 183 law schools throughout the nation. The results from the correlational study demonstrate that the LSAT score is a better predictor of first-year performance in law school than undergraduate grade point average (UGPA), and that a combination of the LSAT and the UGPA serves as an even better predictor than either individual measure (Appendix D).

Analytical Reasoning (AR) Discrepant Subscores

Statistically significant differences have been found to affect about a third of examinees in the AR section of the LSAT, while significant and rare differences involve about a tenth of test takers (Stricker, 1993). The findings suggest that there exist marked differences among examinees who take the LSAT, “reflecting variation in their development of the abilities tapped by the subtests” (Stricker, 1993, p. 11). Other tests, such as the GRE (Bridgeman & Cline, 2000) and the WAIS-R (Matarazzo, Daniel, Prifitera, & Herman, 1985), have also shown these discrepancies. The prevalence of discrepant scores in the AR section of the LSAT was greater for older examinees and for those who had higher total scores. Three types of discrepant subscores were obtained for each pair of subscore comparisons.
These were adapted from those utilized in intelligence tests (Kaufman, 1990; Sattler, 1988). They include: (i) an observed difference, the actual difference between a pair of subscores moving in the same direction; (ii) a significant difference at the .05 level; and (iii) a significant and rare difference, one occurring with a frequency of .05 or less. Overall results from the study are attached in the appendix (Appendix E). The overall significant finding is that substantial subscore differences were frequent among examinees. This reflects a variation in the abilities being measured, a common observation in intelligence tests (Chatman, Reynolds, & Willson, 1984; Kaufman, 1976a, 1976b; Matarazzo, Daniel, Prifitera, & Herman, 1988; Matarazzo & Herman, 1988; McLean, Kaufman, & Reynolds, 1989; Rosenthal & Kamphaus, 1988). It is precisely these subscore differences, especially in the AR section, that prompted the present investigation.

TIME SPEEDEDNESS

Time, as explained in Chapter 2 of this study, is both a variable and a constant that affects performance on tests, particularly within the context of a constraint such as the LSAT. According to Scrams and Schnipke (1999), response accuracy and response speed provide different measures of performance. Test speededness occurs when examinees receive lower scores as a result of a lack of time and not because of a lack of ability. Speededness, with regard to the LSAT, is currently measured by calculating the proportion of test takers who do not reach each item on the test (Schnipke & Scrams, 1999). According to Schnipke and Scrams (1999), the LSAT is partially speeded, as the proportion of test takers who reach each item decreases toward the end of the test. Schnipke and Scrams proposed a two-solution-strategy model to account for the relationship between response times and accuracy of responses.
The model is based on Thissen’s (1983) timed-testing model, which examines the relationship between ability and speed. The examinee is offered two solution strategies to choose from: a heuristic strategy and an algorithmic strategy. Her choice is determined by time limits. If strict time limits are imposed, she may choose the heuristic strategy to minimize her processing time. This strategy involves the management and allotment of time for questions and responses throughout the test (see Figure 2).

Figure 2: Comparison of Two Solution Strategies in Terms of Their Speed-Accuracy Tradeoff Functions. The Two Vertical Lines Represent Possible Processing Times.
[Figure: P(“Correct”) plotted against processing time for the algorithmic and heuristic strategies.]

The vertical line nearest the Y axis indicates her level of performance. If the algorithmic strategy is selected, the examinee concentrates on accuracy at the expense of longer processing times, increasing her asymptotic accuracy. Schnipke and Scrams (1999) also attributed shorter response times within tests that are speeded or have strict time limits to guessing. Results from the study, using data from the logical and analytical sections of the CAT version of the GRE, indicate the following:

1. Items located at the end of the test had faster responses, some under 10 seconds.
2. Rapid guessing behavior was independent of item content but contingent on item location.
3. Faster response times were associated with low accuracy levels.
4. Slower response times were associated with higher accuracy levels.
5. As time slowly increases during tests, a plateau is reached beyond which accuracy does not increase.

The premise of speed-accuracy relationships is that higher scoring examinees who are confronted with higher difficulty test items tend to take more time on them.
This interaction of level of item difficulty crossed with greater response time thus confounds observations of the examinee’s true ability level. Schnipke and Scrams (1999a, 1999b) concluded that other variables, such as strategies, content and context of information, and time management or pacing, are issues that affect performance on test scores.

RESEARCH METHOD & DESIGN

PURPOSE OF THE STUDY

The purpose of this study is to compare the performance of samples of students taking the AR section of the traditional format of the LSAT to those taking the AR DCT format of the LSAT. A comparison of the performance of the two samples will thus determine if there is a format by time interaction, i.e., whether the new dual coding format will serve as a more efficient measure of problem solving. A DCT test format was developed from a past LSAT Analytical Reasoning section. Each item from the traditional section was carefully designed to include visuals and text for question presentation and responses. Care was taken to ensure that the new information did not provide additional hints to the answers. The items for each format were then administered to the examinees. Each participant was randomly assigned to either the DCT or the traditional LSAT condition. Scores from the traditional section were compared to scores from the DCT test format and were analyzed and evaluated.

PROCEDURE

Definition of Terms

The overall hypothesis for this study is that DCT item formats are more efficient measures than the traditional item formats. Listed below are definitions of the specific variables:

Efficiency

For any candidate taking a high stakes assessment test, performance is almost always affected by time constraints. The effect of time on test performance is currently being investigated in many high stakes assessments, such as the GRE.
“Whether speediness is irrelevant or a relevant indicator of academic ability, the extent to which score is dependent on time is of interest to potential score users” (Bridgeman, Cline, & Hessinger, 2003). There cannot be complete assurance that a test was speeded. Efficiency, in this case, is the time related to the number of correct responses in proportion to the total number of items answered. The hypothesis is that the new procedures will take less time between items, and hence yield lower average response times than the previous item response process.

Time

Time here is specifically defined as follows: (1) the average time to complete a single question, and (2) the average time taken to complete the number of correct responses in proportion to the number of items answered. Explicitly, the measures will include the following: for each group, the average time taken to complete the number of items answered plus the number of correct answers, and the response latency, or time taken between the presentation of each item and the response. (Note: Response time is defined as the time from when the question first appears on the screen until an answer is given and confirmed.)

Power Analysis

A power analysis based on data from past research that investigated response time differences among examinees in CAT environments (Bridgeman & Cline, 2000) was conducted to determine an effective sample size for the study. A section of that study looked at the mean response times between two categories of the GRE, the Quantitative section and the Analytical Reasoning section. Preliminary studies indicated that 20% of examinees fail to complete the quantitative section and 35% fail to complete the analytical section. The analysis indicates that a sample size of approximately N = 128 (total) with a power of .8 and an alpha level of .05 at an effect size d = 0.5, or N = 90 (total) with a power of .8 and an alpha level of .05 at an effect size d = 0.62, will be used.
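Sample sizes like those quoted above can be approximated with the standard normal-approximation formula for a two-group comparison, n per group ≈ 2((z₁₋α/₂ + z_power)/d)². A minimal sketch in Python; note that the dissertation's totals of 128 and 90 were presumably produced with exact t-distribution software, so this approximation runs a few examinees lower:

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.8):
    """Normal-approximation sample size per group for a two-sided
    two-sample comparison at standardized effect size d (Cohen's d)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = .05
    z_power = NormalDist().inv_cdf(power)           # ~0.84 for power = .80
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

for d in (0.5, 0.62):
    n = n_per_group(d)
    print(f"d = {d}: {n} per group, {2 * n} total")
# → d = 0.5: 63 per group, 126 total
# → d = 0.62: 41 per group, 82 total
```

The approximation yields totals of 126 and 82, close to the N = 128 and N = 90 cited above.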
In Bridgeman and Cline’s study of time and position effects using GRE Quantitative and Analytical items (note that the GRE Analytical section contains items similar to the LSAT AR and LR items), an average response time of 78 seconds per item was observed. Therefore, a goal of this study will be to reduce the mean response time per item to ≤ 78 seconds over the full set of items, for the following reasons:

1. Bridgeman and Cline’s study was done with data from the GRE-Quantitative and GRE-Analytical sections, which serve as a good comparison to the LSAT AR section.
2. The response time difference was for separate categories, with a 20-second interval between the CAT GRE-Q and GRE-A items. This comparison was chosen because GRE-Q items include visuals in the trigonometry section, whereas GRE-A items have always been represented as text only.
3. By selecting ≤ 78 seconds as the minimum time difference to target, the DCT items would be shown to close the gap between the GRE-Q item response times and the traditional GRE-A response times.
4. GRE-A items consist of testlets, sets of 4 or 5 questions tied to one stimulus, which take more short-term memory (STM) space, versus GRE-Q items, which can exist as stand-alone questions. If the minimum time difference of ≤ 78 seconds is met or reduced, this would be a significant indication that DCT item formats help in increasing memory resources.
5. “LSAT is a univariate test designed to measure reasoning ability” (Henderson, 2004) that parallels the time constraints of actual law school in-class examinations. As such, it is a fairly robust predictor of law school in-class exams in terms of test-taking speed.

The format of this study has thus been selected to address the response time investigations.

Item Selection Process

The construction of innovative item types, i.e.,
DCT format items must take into account the following issues: (i) users are not just targeting information for information’s sake, but are answering specific questions within a limited time framework; (ii) maintaining the validity and reliability of items in a 2D environment; (iii) the differing levels of competence in technology use among candidates; and (iv) the proper use of items for the specific testing domain, i.e., Analytical Reasoning (AR). The following sections list the basic criteria that were used in selecting specific features crucial to the item selection process. A set of past LSAT AR questions from the PrepTest series (LSAC, 2002, 2003, 2004) was selected as the experimental item constructions for presentation on computers. Currently, the LSAT is offered only in pencil and paper format. Twenty-three questions from Section II (Analytical Reasoning) of the June 2003 PrepTest 40 were utilized for the study. Approximately five sets of questions were based on a set of conditions, each measuring the following: directional skills, ordering or ranking abilities, and selection, i.e., inclusion and exclusion. Unlike other research studies that use visuals and text in the questions themselves, only the set of conditions and multiple choice options were presented in visual-text format. Selection of visuals for the conditions included the following:

• Conditions that were directional utilized arrows.
• Conditions that occurred in order or were ranked utilized number sequences and series of periods or dots.
• Conditional events utilized the ‘if’ and ‘or’ words.
• Values that were excluded utilized a diagonal line struck across the value.
• Values that were included utilized a ‘plus’ or ‘addition’ symbol.
• The points of connection for directional visuals utilized red dots to represent connections and non-connections.

Selection criteria were based on the eleven-category visual classification list (Lohse, Biolsi, Walker, & Rueter, 1994).
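The condition-to-visual rules above amount to a small lookup scheme mapping condition types to visual encodings. A hypothetical sketch of how such an encoding table might be represented; all names and the fallback behavior are illustrative, not taken from the study's actual implementation:

```python
# Hypothetical encoding table mirroring the design rules described above.
CONDITION_VISUALS = {
    "directional": "arrow",
    "ordered/ranked": "number sequence with dots",
    "conditional": "'if' / 'or' keywords",
    "excluded": "diagonal strike through the value",
    "included": "plus symbol",
    "connection point": "red dot",
}

def visual_for(condition_type):
    """Look up the visual cue for a condition type (illustrative helper);
    unrecognized condition types fall back to plain text."""
    return CONDITION_VISUALS.get(condition_type, "plain text")

print(visual_for("excluded"))  # → diagonal strike through the value
print(visual_for("unknown"))   # → plain text
```

Keeping the mapping in one table makes the visual encodings easy to apply consistently across all condition sets.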
Kit of Factor-Referenced Cognitive Tests

The kit consists of 72 tests that have been demonstrated to be consistent measures in studies of 23 cognitive factors, such as reasoning, verbal ability, spatial ability, memory, and other cognitive processes. This tool was developed by Ekstrom et al. (1976) with the goal of assessing individual differences in cognitive abilities. Two tests of perceptual speed, the Identical Pictures Test for visuals and Finding A’s for text, were utilized. The Kit of Factor-Referenced Cognitive Tests was included to check the validity of the DCT and LSAT items in both formats, specifically whether the visuals in the DCT format contributed to a change in the validity of the test items. An example of some items from the kit is attached in the appendix (Appendix F).

Graphical User Interface (GUI) Development

Before engaging in prototyping, basic evaluation procedures were adopted that researched three important factors: (i) the performance to be measured, in this case problem solving skills; (ii) how this is to be measured and recorded on computer display terminal (CDT) displays; and (iii) a persona of a typical examinee who would take the LSAT in normal situations, i.e., after completion or near completion of an undergraduate degree. In the development of an intuitive Graphical User Interface (GUI) based on Norman’s (1986) general principle of task functionality, a prototype is a critical tool for design and evaluation. A prototype, as defined by Hackos and Redish (1998), “is an easily changeable draft or simulation of at least part of an interface” (p. 376). The following principles were taken into account: (i) the flow of screens for major tasks, (ii) the screen layout of the basic task screen, (iii) layouts for all screens, (iv) interactive functions for each screen for input and output data, and (v) matching the layout and screen to the task and mental model of problem solving in the AR section (Hackos & Redish, 1998).
Prototyping a GUI occurs at three stages: the pre-prototype phase, the prototype phase, and the post-prototype phase.

Pre-Prototype Phase

In the pre-prototype phase of development, core variables, such as domain complexity and applying a suitable technology type to the task, were investigated and researched. Once an evaluation of the items themselves was completed, paper-based mockups of each screen were drawn up. At this stage of GUI development, most of the prototyping is done on paper because of its versatility and easy editability. Sketches of each screen were done in black and white first to obtain the overall positioning and spatial layout of the items to be displayed. Each screen was then drawn in color and evaluated for connectivity and consistency.

Prototype Phase

The prototype phase of design investigates principles of the GUI that will determine information representation and presentation on the specified computer display terminals (CDTs). Core principles of HCI and the functions necessary for delivery and interaction, regardless of the domain, were incorporated in this phase. The overall layout design is essentially based on the Gestalt theory of visual organization (Wertheimer, 1925), which is concerned with the configuration and visual organization of graphical objects and their impact on human perception (Roth, 1995). Specific Gestalt concepts are briefly listed as follows:

1. Similarity - Visuals are grouped together based on similar shape, size, pattern, or color, representing similar functions or functions that consist of common interaction controls.
2. Proximity - Consistent measures of space between visuals reduce confusion and error, enhancing the comprehensibility of visuals and text.
3. Contrast - Emphasizes important information by capturing the individual’s attention, because objects in the surrounding vicinity compete for visual attention.
4. Figure-Ground - The foundation of layout in design, print or digital.
Screenshot samples of the GUI for both formats are attached in the appendix (Appendix G).

Spatial Relations

The visual spatial relations and locations of the selected objects and text were addressed next, after visual grouping. Three organizational methods for achieving screen design were utilized: (1) using a grid tool, (2) using item grouping strategies, and (3) standardizing the layout (Marcus, 1992). Specific spatial layout concepts such as balance, equilibrium, symmetry, sequence, and unity (Ngo, Teo, & Byrne, 2000) were addressed.

Icon Representation

Based on Norman’s (1999) theory of action-intention, an interface that can be directly manipulated is much easier to utilize than one that cannot. In short, any interface that involves the concept of automaticity (Logan, 1988), requiring no extra cognitive resources to understand what each icon does, constitutes a good design. Icons in particular have to be carefully designed so that at least 90% of users can fully comprehend their representation. Haramundanis (1996) and Horton (2001) defined icons as small pictorial symbols on computer menus, screens, and windows that: (1) serve as cues or reminders, (2) aid recognition, (3) save screen real estate, and (4) assist users whose native language is not English. Similar to the DCT approach, research on HCI issues illustrates that icons accompanied by text, versus text or icons alone, are more effective in affording faster and more accurate search times. As such, only black and white images of the menu icons were incorporated, to prevent visual overload, as the candidate has limited time to become familiar with their functions.

Interactors

Buttons and other clickable interactors function as avenues to links of information for the interface user. Bodner (1994) found that animated buttons yielded 85% correct responses, versus 67% for users who had to utilize static buttons.
Animation does not necessarily mean a constant movement of a visual within the parameters of an interface design. It includes any embedded clicking action when the button is clicked on, any highlighting effects, or any change of visual image on mouse roll-overs. The following specific guidelines for designing interactors have been incorporated.

- Mouse - The mouse is a simple point-and-click device that is not without its setbacks. Errors in mistaken target selection and intent are always issues of concern. Only the point-and-click function of the mouse was selected, to limit multiple interactive actions at any one time.

- Animation - As explained, buttons need a change of image status to indicate functionality and interactivity. A highlighting roll-over feature fulfills this status change.

- Limits on Interactors - Too many different types of interactors will cause confusion regarding their function and purpose. In a CAT, cognitive resources need to be preserved for the actual performance evaluation. Interactors have been limited to functions crucial only to the input of answers and navigation between screens.

Color Representation & Use

The use of color always centers around four specific characteristics: hue, brightness, saturation, and contrast. Hue is the general identification of a specific color. Brightness is the level of luminance within a color. Saturation is the interaction of hue and brightness, often referred to as the color depth. Contrast is the relative perceived brightness of two displayed colors based on the figure-ground theory, which enables the individual to separate and decipher two groupings (Misanchuk, Schwier, & Boling, in press). Color is crucial in displaying detail in visual information. As such, it is imperative to use color with discretion. To adhere to these guidelines, the use of color was limited to red for the alphanumeric letters enclosed in bounding boxes, and grey for the bounding boxes in the DCT format.
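The contrast characteristic can also be quantified. As a hedged illustration only (this uses the W3C relative-luminance formula, which is not part of this study), the black-on-white scheme described here achieves the maximum possible contrast ratio of about 21:1:

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light (W3C formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(r, g, b):
    """Perceived luminance of an sRGB color, weighted per channel."""
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)

def contrast_ratio(fg, bg):
    """W3C contrast ratio (from 1:1 up to 21:1) between two colors."""
    lighter, darker = sorted(
        (relative_luminance(*fg), relative_luminance(*bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

# Black text on a white background: (1.0 + 0.05) / (0.0 + 0.05) = 21:1.
black_on_white = contrast_ratio((0, 0, 0), (255, 255, 255))
```

Any foreground/background pairing in the design can be checked against such a figure-ground threshold before committing to it.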
The yellow roll-over color was utilized as a highlighting device for the menu interactors. All other visuals were in black and white, with the background in white, adhering to the contrast principle of color.

Font Styles and Size

The problem with presenting any text information on a computer screen is the limited display space. Eye movements are limited to the screen, versus the 'real environment' that allows the eye to saccade over a larger area. Tullis (1983) discovered that reading, which involves parallel cognitive processes, one for processing text and the other for the semantic processing of language, is not done in the traditional top-down and left-right manner as in the paper format. Instead, it occurs as a collective search for the required information. Snyder and Maddox (1978) conducted an investigation of text legibility on computer screens and discovered that smaller text sizes produced faster reading rates, as the cognitive resources needed for information processing are not spread over too large an area. The suggested size should be in line with the screen parameters, but from a general standpoint, the largest used for content text is approximately 14 point (Bernard, Mills, Frank, & McKnown, 2001). As such, text density is something that needs to be manipulated and addressed on computer display terminals (CDTs) to accommodate this problem. Lines too short will cause the candidate to skim through them with no information processing occurring, while lines too long will cause an overload. What is important is the spacing between lines of text and paragraphing to accommodate chunking. In addition, the type of font, serif or sans serif, needs to be taken into account. Sans serif, a category of typefaces that do not include serifs, or small lines at the end of characters, is generally harder to read than serif type. Sans serifs are generally good for use in small paragraphs or titles of text.
Serif fonts, on the other hand, aid in reading but not as titles, as the presence of small lines at the end of characters may make headlines appear too busy. In DCT formats, there is now the added complexity of presenting text as labels near or overlapping visual representations, or enclosed in specific parameters, such as a table cell or a bounding box label. Care was taken in the use of specific font styles, such as using solitary capital letters, combined with their locations, as iconic cues and response selections, versus as text or labels. Line and character spacing were taken into account when displaying the text layout, horizontally and vertically, and Garamond, a serif font, was used as the font style.

Designing For Error

Errors in mouse pointing normally occur for reasons such as the close proximity of icons, icon representations with no labeling, or too many similar icons. To avoid such errors, specific guidelines to accommodate human error were incorporated as follows:

- Redundancy - An HCI principle to ensure that users do not go off track in reaching the correct target; providing a variety of avenues toward the same outcome ensures that this is maintained. For example, representing a function via both an icon and a text label.

- Pop-Up - Boxes that inform the candidate they have not filled in an answer or not fully comprehended directions are essential to ensure that they get back on the right track.

- Confirmation - During an examination, anxiety levels of candidates are high because of the stakes involved in performing well. As such, pop-ups have been included that display a candidate's actual answer, asking if the answer they selected was what was intended.

Post-Prototype Phase

The post-prototype phase of GUI development involves a test run of the software as is.
At this stage of the interface development, "practitioners in the field of HCI called User-Experience Engineers (UEE) use a variety of methods to generate applications" (Olson & Olson, 2003). Examples of these methods include checklists, heuristic evaluation (Nielsen, 1993), cognitive walkthrough (Lewis et al., 1990), and claims analysis (Carroll & Rosson, 1992), all termed 'formative evaluation', with the similar goal of detecting any error or difficulty that may cause problems for the user. A complete usability test was conducted for the two formats of the LSAT using the cognitive walkthrough (Lewis et al., 1990) technique. The cognitive walkthrough is a methodology for performing theory-based usability evaluations of user interfaces that focuses on a user's cognitive activities; specifically, the goals and knowledge of a user while performing a specific task. The walkthrough consisted of three stages: the preparation, the evaluation, and the interpretation stage. The preparation stage is where information is collected about the specific tasks that examinees have to complete, what constraints are imposed, the examinee population themselves, and any other pertinent information prior to the evaluation stage. At the evaluation stage, questions regarding the reasons why specific functions have been implemented, and design features and their relevance to the user and tasks, are all recorded. The final stage, the interpretation stage, is the culmination of all recorded information from the evaluation stage, used to assess which information falls into the positive category and which into the negative category. The prototype is then edited based on the negative information collected.

Final Product - Second Prototype

After all edits and corrections had been applied to the software toward the second prototype, a final run-through was conducted in terms of functions, typos in text, arrangements, and layout at the actual screen resolution and screen size of the testing computers.
This was also done for the back end of the software, i.e., how the data were recorded for time taken between items and responses. It was decided that for each examinee, time data would be recorded in milliseconds, for a more accurate and detailed rate. It is important to note that this phase is not foolproof, however, and knowledge that some mistakes could have gone undetected was accepted.

Participants

Approval for the use of human subjects was sought, and the number of participants indicated by the power analysis was recruited for the study from a fairly large pool of graduate and undergraduate students attending Michigan State University. Requests for participants were made via academic listservs, advertisements by fellow teaching and graduate assistants in the classes they instruct, graduate school governing bodies such as the Council of Graduate Students (COGS), other graduate organizations, and word of mouth. Approximately 98 subjects participated; approximately 7.84% were undergraduate students, and graduate students comprised the remaining 92%, of which 19.6% were Masters students and 72.5% were Doctoral students.

TEST ADMINISTRATION

The innovative item types were developed using Director 8.5 and uploaded on two Dell computers with 18-inch screens at The Canterbury of MSU office, the Chaplaincy of the Episcopal and Anglican Student Ministry, in close proximity to campus. The time allocated for both CAT test formats was exactly the same as the time prescribed for the pencil and paper format of the past LSAT exam, 35 minutes. Each participant was required to read and then sign a consent or waiver form and go through a tutorial of the exam for approximately 10-15 minutes, to ensure that they became familiar with the test functionalities and layout. They were given pencils and paper to assist them in their tasks, regardless of which format they were assigned, and had the option of stopping at any time.
After completion of the tests, two tests from the Kit of Factor-Referenced Cognitive Tests (Ekstrom et al., 1976) were administered. The Identical Pictures Test had two parts; parts I and II consisted of 96 items in 28 rows of visuals per page, at 4 pages per part. Each examinee was given approximately 90 seconds to complete each part. They were instructed to go through the items and check off which visual from the row of 5 possibilities was identical to the visual on the left. Once completed, the next Kit test, the Finding A's Test, was administered, also in 2 parts, comprised of 5 columns of words per page, at 4 pages per part. In each column, candidates were asked to locate 5 words that contained the letter A and strike them out. Each examinee was given approximately 120 seconds to complete each part. Data from the Kit of Factor-Referenced Cognitive Tests (Ekstrom et al., 1976) were recorded for each candidate and stored. Results from the study are reported in the next chapter.

RESULTS & DATA ANALYSIS

ANALYSIS PROCEDURES

The following analyses were conducted, with response times (RT) for each item (i.e., response times between the presentation of each item and the next item) and correct response as the dependent variables and format as the independent variable: (i) descriptive statistics of total scores for both DCT and LSAT, (ii) Differential Item Functioning (DIF) to determine if versions of the items functioned differently, (iii) estimates of reliability for the tests, (iv) correlation of DCT and LSAT formats to the Kit of Factor-Referenced Tests to obtain a sense of differences in validity, (v) proportion of correct responses to the total number of items answered, (vi) average response times (AVR) to responses, (vii) correlations between response times (RT), and (viii) a multivariate analysis of variance (MANOVA) to determine if there were significant differences between item scores and response times for the two formats.
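The per-item dependent variables above can be illustrated with a short sketch of the summary computation. The record layout and function name are hypothetical, since the study's actual data-logging code (in Director 8.5) is not shown; RTs are assumed to be logged in milliseconds, as described in the test-administration section:

```python
import statistics

# Hypothetical per-response records: (item_number, answered_correctly, rt_ms).
records = [
    (1, True, 157_000), (1, False, 195_000),
    (2, True, 72_190), (2, False, 72_148),
]

def per_item_summary(rows):
    """Proportion of correct responses and mean RT (in seconds) per item."""
    by_item = {}
    for item, correct, rt_ms in rows:
        by_item.setdefault(item, []).append((correct, rt_ms))
    return {
        item: (
            sum(c for c, _ in vals) / len(vals),           # proportion correct
            statistics.mean(rt / 1000 for _, rt in vals),  # mean RT in seconds
        )
        for item, vals in by_item.items()
    }
```

With the sample records above, `per_item_summary(records)[1]` yields a proportion correct of 0.5 and a mean RT of 176.0 seconds for item 1.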
Descriptive Statistics

The means and standard deviations for the total scores of the DCT and LSAT formats are displayed in the table below. A two-sample test of means was computed to determine the p value. There were no significant differences between the two means. However, there was a sizeable difference in variance between the two formats of 6.972, almost 7.0.

Table 3. Descriptive Statistics for DCT and LSAT Formats

Format   No. Observed   Mean   Std. Deviation   Variance
DCT      47             9.96   4.344            18.868
LSAT     50             8.76   3.520            11.896

Differential Item Functioning (DIF)

DIF is defined as the systematic statistical process for detecting performance differences on items among groups of individuals with similar true cognitive ability, regardless of any other characteristic that is 'irrelevant' to the measure (Zumbo, 1999). In this case the irrelevant measure is the format of the item. DIF was conducted to compare the performance of the focal group (examinees taking the DCT format) to the reference group (examinees taking the LSAT format). A one-parameter model of Raju's signed area index (SAI), which is the area between two item characteristic curves (ICCs), was used. For items to display DIF, the values must be > .5 or < -.5 in the Z_SAI column (values in the SAI column were converted to reflect Z standardized values) (Raju, 1988, 1990).

Table 4. DIF Indices: Traditional vs.
DCT Format

Item   DCT Prop. Correct   LSAT Prop. Correct   SAI     Z_SAI
1      0.74                0.70                 -0.85   -1.18044
2      0.49                0.66                 -1.41   -1.97053
3      0.43                0.28                  0.10    0.13229
4      0.40                0.52                  0.04    0.05625
5      0.34                0.26                 -0.74   -0.87059
6      0.85                0.86                 -0.65   -0.77179
7      0.23                0.08                  0.92    0.96628
8      0.47                0.51                 -0.33   -0.46999
9      0.45                0.41                  0.55    0.76865
10     0.39                0.39                 -0.53   -0.71381
11     0.44                0.44                  0.01    0.01415
12     0.40                0.34                 -0.26   -0.34605
13     0.50                0.47                 -0.68   -0.89404
14     0.48                0.27                  0.68    0.88845
15     0.54                0.28                  1.37    1.78057
16     0.38                0.23                  0.76    0.96481
17     0.35                0.17                 -1.38   -1.04137
18     0.91                0.77                  1.17    0.97605
19     0.68                0.30                  3.13    2.54438
20     0.72                0.63                  0.52    0.51005
21     0.50                0.52                 -1.43   -1.23216
22     0.75                0.51                  3.22    1.61282
23     0.65                0.48                 -0.96   -0.67265

The results in Table 4 show that, with the exception of items 3, 4, 8, 11, and 12, all other items displayed DIF. A negative value indicates that an item was more difficult for the focal group (DCT), while a positive value indicates that an item was more difficult for the reference group (LSAT). Eleven items were more difficult for the DCT group, while 12 items were more difficult for the LSAT group. For the DCT group, the DIF analysis for item 2 showed the greatest degree of difference, followed by item 1. These two questions were located at the beginning of the exam and were the first set of analytical reasoning (AR) test items in the new format. Reasons that could have contributed to this include: (i) familiarity with the new format, (ii) initially processing the set of conditions, questions, and multiple choice options available, and (iii) deciding which strategy to undertake, the algorithmic or heuristic path of the two-solution strategy (Schnipke & Scrams, 1999), as information is being processed. DIF values decreased toward the end of the set of test items, as can be seen in items 3 and 4, which had low DIF; the level for item 5 increased, though not to the same degree as items 1 and 2.
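Under the one-parameter model used for this DIF analysis, the signed area between the two item characteristic curves reduces to the difference between the groups' estimated difficulty parameters. The following is a minimal sketch; the function name and the standard-error-based z-standardization are illustrative assumptions, not the study's actual computation:

```python
import math

def raju_sai_1pl(b_reference, b_focal, se_reference, se_focal):
    """Raju's signed area index under a 1PL (Rasch) model.

    With equal slopes, the signed area between the reference and focal
    ICCs equals the difference in difficulty parameters.  Following the
    sign convention in Table 4, a negative value means the item was
    harder for the focal (DCT) group.
    """
    sai = b_reference - b_focal
    # Standardize by the standard error of the difference (an assumed form).
    z_sai = sai / math.hypot(se_reference, se_focal)
    return sai, z_sai

# Example: the focal group finds the item harder (b_focal > b_reference),
# producing a negative index.
sai, z = raju_sai_1pl(b_reference=0.0, b_focal=0.5, se_reference=0.3, se_focal=0.4)
```

An item would then be flagged for DIF when the standardized value falls outside the > .5 or < -.5 band described above.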
As the test progressed, DIF measures for all other items seemed to reflect an upward and downward, sine-curve-like trend, with lower levels of DIF in comparison to the first set of items. This supports Richards' (1987) hypothesis of immediate-delayed retention tests in visual-text formats, with significance in performance occurring in later stages, the creation of visual traces that do not vie for cognitive resources, and the effects of practice or rehearsal on the first set of questions. For the LSAT group, the DIF measures for items increased steadily, with the first significant level at item 15 and the greatest difference occurring for item 19. Higher DIF levels reflect a trend where items located later seem to have greater DIF measures than earlier item sets. This would indicate one or both of the following: (i) a depletion of STM resources in an all-text environment, and (ii) time running out to complete each item accurately. On average, DIF measures for items in the LSAT group had greater values than those for the DCT group. These results will be discussed later in the chapter in relation to the MANOVA results and response time correlations. Items not congruent with the other analyses will also be discussed and the phenomena explored.

Reliability Tests

The reliability of a measurement procedure is defined as its consistency. Table 5 lists the reliabilities and descriptive statistics for the DCT and LSAT total scores and the two Kit of Factor-Referenced tests.

Table 5.
Reliability Estimates and Descriptive Statistics for All Items

DCT              Reliability   Mean     Std. Deviation   Variance
Total Score      0.86674        9.96     4.344            18.868
Finding A's      0.94467       64.04    14.8367          220.129
Identical Pics   0.30          77.596   14.1877          201.290

LSAT
Total Score      0.88705        8.76     3.420            11.696
Finding A's      0.92886       65.94    17.867           319.241
Identical Pics   0.23          73.714   14.2156          202.083

The data indicate that all tests used for the study, with the exception of the Identical Pictures test, had good reliability estimates. The lack of reliability for the Identical Pictures test prompted measures of variance, mean, and standard deviation to be computed, to investigate whether the difference reflects extremely small differences in variances between the two groups. A 41.22% coefficient of variation was estimated based on the combined means and variances. This was a sizeable variance difference in the Identical Pictures test between the two groups.

Validity Correlations

The Pearson Product-Moment Correlation Coefficient (r) was used to determine correlation coefficient estimates to investigate the relationship, if any, between the two Kit Reference tests and both the DCT and LSAT formats, to check the validity of the item constructs (see table below):

Table 6. Correlation Coefficients for DCT and LSAT Correct Responses to Kit of Factor-Referenced Tests

                 Identical Pics   Finding A's
LSAT Responses   .434**           .022
DCT Responses    .632**           -.040

Chi-Square   1.78071
P-Value      0.182

** Correlation is significant at the 0.01 level (p ≤ .01)
* Correlation is significant at the 0.05 level (p ≤ .05)

Both format responses had a positive correlation to the Identical Pictures test, with a slightly higher correlation estimate for the DCT. A comparison of the two independent correlation coefficients was calculated to determine significance. Results indicate that the correlations were not significantly different. As such, both item forms seem to have the same relationship with these variables.
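The comparison of the two independent correlations (DCT r = .632 with n = 47; LSAT r = .434 with n = 50, from Tables 3 and 6) can be reconstructed with Fisher's r-to-z transformation. This is a sketch of the standard procedure, not the study's actual code, but squaring the resulting z statistic lands close to the chi-square of 1.78 reported in Table 6:

```python
import math

def fisher_z(r):
    """Fisher r-to-z transformation of a correlation coefficient."""
    return 0.5 * math.log((1 + r) / (1 - r))

def compare_independent_rs(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    se_diff = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se_diff

z = compare_independent_rs(0.632, 47, 0.434, 50)  # approximately 1.33
# z**2 is approximately 1.78, well under the 3.84 needed for p < .05 on 1 df,
# consistent with the conclusion that the correlations do not differ.
```

This agreement suggests the reported chi-square is simply the squared Fisher z test of the two Identical Pictures correlations.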
Proportion of Correct Responses

The proportion of correct responses based on the number of items attempted was calculated. This proportion was computed because the LSAT does not penalize an examinee for a wrong answer. Figure 3 illustrates the proportion of correct scores to the number of items answered for both groups. A greater number of correct responses occurred in the 3.75 - 6.5 range for DCT and 3 - 5.75 for LSAT. The highest score in the LSAT was slightly lower at 9.25, compared to DCT at 10. The lowest score was 0.0 for DCT versus 1.5 for LSAT. An analysis of variance (ANOVA) was run to determine the level of significance. A p ≤ .07 level was taken as significant because of the small number of participants in the study.

Figure 3. Correct Responses to the Proportion of Items Answered for DCT & LSAT

Average Response Times (AVR) - Answers

The average response times for both groups were calculated to determine if the ≤ 78-second goal was reached over the full set of items (Bridgeman & Cline, 2000). Note, however, that the format and layout of the GRE-A and GRE-Q differ from those of the LSAT developed for CAT environments, and performance would undoubtedly be affected, for the following reasons:

1. The GRE-Q and GRE-A items were presented one question at a time with no option to review responses.
2. The GRE-Q had a time constraint of 45 min. (2700 sec) for 28 items vs. 23 items with a time constraint of 35 min. (2100 sec) for the LSAT. This works out to approximately 96 sec. at most per item for the GRE and 91 sec. per item for the LSAT.
3. There exist two sections of AR in the LSAT vs. one for the GRE-Q.
4. There is no penalty for wrong responses on the LSAT.
5. The LSAT is a partially speeded test.

The table below illustrates the Mean Response Times (MRT) of examinees in both groups. Approximately 12 items in the two format groups either reached the ≤ 78-second target goal or were below this margin.
Items 16, 23, and 22 had the fastest RTs, respectively, for both DCT and LSAT, with faster RTs recorded for LSAT. The greatest difference between DCT and LSAT was for item 17, with an RT difference of 19.192 and a higher proportion of correct responses to total answered at 0.35 for DCT versus 0.17 for LSAT. This supports Schnipke and Scrams's (1999) hypothesis of higher scores for difficult items having longer RTs among examinees. The following box plots are attached in the appendices: MRT to proportion of correct answers for both DCT and LSAT, and the MRT for both DCT and LSAT (Appendix H).

Table 7. Mean Response Times to Proportion of Correct-Total Items Answered

       DCT               LSAT
Item   RT        Ans     RT        Ans
1      157       0.74    195       0.70
2      72.19     0.49    72.148    0.66
3      70.502    0.43    56.55     0.28
4      98.353    0.40    112.497   0.52
5      112.119   0.34    125.244   0.26
6      185.95    0.85    165.381   0.86
7      137.549   0.23    134.231   0.08
8      74.009    0.47    81.828    0.51
9      138.07    0.45    119.556   0.41
10     120.162   0.39    103.205   0.39
11     249.86    0.44    225.667   0.44
12     126.298   0.40    96.486    0.34
13     52.427    0.50    54.476    0.47
14     76.805    0.48    75.217    0.27
15     79.512    0.54    64.455    0.28
16     40.582    0.38    35.428    0.23
17     75.743    0.35    56.551    0.17
18     99.094    0.91    103.785   0.77
19     81.588    0.68    76.182    0.30
20     60.311    0.72    55.671    0.63
21     71.38     0.50    70.972    0.52
22     49.567    0.75    41.26     0.51
23     46.71     0.65    39.393    0.48

Time Correlations

The Pearson Product-Moment Correlation Coefficients (r) were calculated for the response time taken for each item. The correlation table is attached in the appendix (Appendix I). Time correlations were investigated per set of item conditions: T1-T5, T6-T10, T11-T17, and T18-T23. Based on Scrams and Schnipke's (1999) theory of the importance of item location, the negative correlations between T1-T5 and T18-T23 are strong. Any increase in Set 1 reflects a decrease in the last set.
This probably reflects examinees running out of time, hence the faster RTs and the lack of time management allotted to each item, which lends support to 'the planning fallacy' theory (Kahneman & Tversky, 1979). T1, however, was the exception, as this is the first item at the beginning of the test, which takes examinees a longer time to process information and familiarize themselves with both formats. Only T1 and T12 had a correlation to format, a positive relationship for T1 and a negative one for T12. The increase in T1 reflects examinees trying to accustom themselves for the first time to both CAT formats. T12 is midway through the test, and any effects of the format are negligible at this point. The negative relationship to format reflects a depletion of STM and cognitive overload.

Multivariate Analysis of Variance (MANOVA)

An analysis was conducted on the test items for both DCT and LSAT to investigate the effect of format (DCT-LSAT) on response times (RT) and the proportion of correct responses. Effects of format were significant for the following responses: 2, 7, 14, 15, 17, 18, 19, and 22, and for response times taken for the following items: 1, 12, and 17. Though the alpha level was set at p ≤ .05, significance levels of 0.09 for item 2, 0.07 for item 17, and 0.09 for item 18 were included because of the small number of participants. Response times for items 1, 12, and 17 were significant at the .05 level, and two item-time combinations, I14-T14 and I17-T17, were also significant. The MANOVA table is attached in the appendix (Appendix J).

SUMMARY AND DISCUSSION

The main focus of this exploratory study was to determine whether DCT item formats would be a more efficient measure of problem solving capabilities in the AR section of the LSAT. Efficiency here is defined as a decrease in MRT combined with correct responses. The results and analyses reflect the following key points.
Item Location

The DIF levels among the DCT items and the significance of items from the MANOVA results point to the following items having significant differences: 2, 7, 12, 14, 15, 17, 19, and 22. These items, in combination with the MRT for the group, reflect a slow increase in MRT but with higher accuracy rates (see Table 7) when compared to the LSAT group, which reflects faster MRT rates but less accurate scores. The DIF measures of the later-placed items for the LSAT group would suggest a lack of time available to fully process the information, which resulted in guessing and higher perceived difficulty levels of items. The time correlation table also reflects a negative relationship between early items and later items. The DCT group seems to reflect the algorithmic strategy as proposed by Schnipke and Scrams (1999), with greater MRT rates but higher-accuracy responses. The DCT format allowed for either a more accurate 'guessing' strategy provided by the visuals-text, or faster recall of stored information because of the presence of paired visual-text cues that afford faster forward and backward recall (Paivio, 1986, 1990, 1991). This also gives evidence for studies of VSTM, with spatial locations aiding in the book-marking and referencing of objects in relation to surrounding objects. This relational processing of visual units increases VSTM capacity over periods of lag time (Chun & Jiang, 1998). As the test progressed, the immediate-delayed effect of visuals-text on performance (Richards, 1987) was supported and reflected in response accuracies, where significant increases in correct responses occurred in later items. This immediate-delayed effect may have resulted from examinees becoming familiar with the visual-text format (practice-rehearsal), faster processing rates of visual information (Neisser, 1967; Phillips, 1974; Simons, 1996; Sperling, 1960), and improved forward and backward recall of information (Paivio, 1986, 1990, 1991).
Response Times and Speededness

The target MRT rates were not realized here, as the RTs for the DCT format were higher than the LSAT format RTs. However, given Schnipke and Scrams's (1999) two-solution strategy (see Figure 2), where higher asymptotic levels of accuracy are reached with higher response times in a strictly time-speeded test, the goal of DCT format items having lower RTs and higher proportions of correct responses was unrealistic. The definition of efficiency would thus need to be changed: higher RTs with correct responses equals an overall increase in correct responses within the time constraints. Table 7 thus supports this hypothesis, where AVR for the LSAT group, when compared to the DCT group, was on the whole lower, with lower response accuracy. Differences in time taken for I1, I12, and I17 were significant in the MANOVA analysis. T1 was significant, as it is the time taken to respond to the first item of the AR LSAT test, where examinees had to process the directions, questions, and conditions. I12 is midway through the AR test, and Table 7 indicates that after I12, times for the LSAT group, with the exception of T13 and T18, were lower, along with lower accuracy rates. This would suggest a depletion of STM in a text-only format, and the occurrence of guessing in the last subset of items, 18-23. The DIF analyses for these items were higher than for the DCT group, which lends support to the lower asymptotic levels of accuracy when the heuristic strategy is selected. Both time and item were significant for item 17. The RT difference between the DCT and LSAT groups, interacting with the item, was largest here: approximately 20 seconds more time taken by the DCT group, with 0.18 greater response accuracy. In addition, the DIF analysis for I17, which was the last item of the 3rd set, showed the greatest difference in difficulty between the DCT group and the LSAT group.
Though it was more difficult for the DCT group, respondents had more accurate responses at a higher response time. After item 17, I18 to I23 seemed increasingly more difficult for the LSAT group, giving evidence for the Law of Diminishing Returns: any increase in time added would not result in much change in the accuracy of responses.

Time Correlations

Overall, I18 - I23 were correlated negatively with items before I12, i.e., the first half of the AR test. Higher RTs taken in the first half of the test are reflected in lower RTs in the second half of the test. This again supports Schnipke and Scrams's (1999) two-solution strategy in a time-speeded task. The heuristic strategy, which implies guessing at a lower RT with lower asymptotic accuracies, gives evidence for the algorithmic strategy as the better solution for a time-speeded task. The findings point toward the DCT format as having a significant impact on the performance of examinees, which will be discussed in the next chapter. A review of the implications and weaknesses of the study will be presented and suggestions given for future research.

IMPLICATIONS AND FUTURE RESEARCH

Results from the study indicate that there are significant findings on the impact of utilizing a DCT format of testing in CAT environments. Paivio's (1986, 1990, 1991) theory of the supremacy of visuals and text in information processing and recall is evident in the results. Using a DCT format in a domain that requires the examinee to multi-task within time constraints will preserve the examinees' cognitive resources and provide a more efficient instrument as a measure of problem solving capabilities. This would prevent lower scores attributed to fatigue, as much of the LSAT is text based. The following results were realized in this study:

1. Mean response times for the DCT format items were realized (≤ 78 secs).
Though times were very close or almost equal to RTs in the LSAT section, the results revealed that the advantages of a visual-text format had an effect on later items. This was attributed to examinees becoming familiar with the DCT format, the visual-text format aiding in preserving cognitive resources, and the immediate-delayed test studies with visuals-text conducted by Richards (1987).

2. The range of correct responses for DCT was wider, with a larger proportion of correct answers occurring between 4 - 6, versus the LSAT with the bulk of correct answers occurring between 2 - 5. Familiarity with the visual-text format again comes into play here. This also supports the visual recency effect (Phillips & Christie, 1977a; Broadbent & Broadbent, 1981), where performance of complex and demanding tasks occurring simultaneously was not affected, because of the visual-text format.

3. Location of test items, as purported by Scrams and Schnipke (1999), had an effect on RTs and performance for both formats.

4. Reliability estimates of DCT items were as high as or equal to the LSAT reliability measures, at 0.86 and 0.88 respectively.

5. The correlation coefficient measures indicate that DCT items maintained their validity, with visuals having no significant impact on the item constructs.

6. Higher accuracy scores were reached in the DCT format, but with higher RTs when compared to the LSAT group. Though lower RTs plus correct responses were predicted, this was found to be an unrealistic goal because of Schnipke and Scrams's (1999) study of time speededness and accuracy.

Limitations of the Study

This experimental study was conducted with a relatively small sample size. As such, a more sensitive study could have been conducted if the sample size were substantially increased.
An increased sample size would allow a more accurate investigation of whether the number of items reached or answered, correct responses, and response times reflect stronger evidence for the Dual Coding Theory hypothesis. In addition, as with past research on visual-text assessments, many discovered that the advantages of these formats were only realized (i) after familiarity or rehearsal with the formats or training sessions and (ii) after an extended test period. A recommendation would be to conduct the experiment with better examinee preparation, beyond the 10-15 minute tutorial, or in two experimental time phases. The test format could be developed over two sets of analytical reasoning (AR) sections, for a total of 35 mins, to: (a) detect the immediate-delayed effects of visuals over a greater number of items, (b) prevent any isolated or rare occurrences, (c) investigate the extent of the limits of a text-only format, and (d) preserve cognitive resources with a visual-text format.

The selection and design of visuals for the DCT format needs to be further researched by studying the problem solving diagrams drawn by examinees of past LSAT exams, to determine a closest-to-fit generalization of a problem solving mental model among individuals. This would produce a more efficient and accurate visual-text test construct.

Future Research

The object of this study was to determine if DCT item formats developed for CAT environments were more efficient measures of a candidate's cognitive capabilities of problem solving when compared to the traditional LSAT formats. Many of the objectives of this study have been realized with regard to accuracy of responses, supremacy of visuals, and location of items, with the exception of response times.
While a variety of current investigations focus on the creation of new and innovative items in the field of assessment, the critical issues involved in the information processing and problem solving of higher-order cognitive tasks need to be fully understood and researched prior to the development of any innovative or novel item type. Pertinent issues of GUI design and information architecture (IA) are additional domains that need to be investigated if tests are to be conducted in CAT environments. As such, research in the fields of cognitive psychology and HCI must occur alongside psychometrics to arrive at more authentic and effective measures of theta (θ).

REFERENCES

Anderson, J. R. & Bower, G. H. (1973). Human Associative Memory. Washington, DC: Winston and Sons.

Anthony, L. C., Harris, V. F. & Pashley, P. J. (1999). Predictive Validity Of The LSAT: A National Summary Of The 1995-1996 Correlation Studies (LSAC Research Report No. 97-01). Newtown, PA: LSAC (Law School Admissions Council).

Baddeley, A. (1992). Working Memory. Science, 255, 556-559.

Baddeley, A. D. (1997). Human Memory: Theory And Practice. Boston: Allyn and Bacon.

Barnard, P. (1986). Interacting Cognitive Subsystems: A Psycholinguistic Approach To Short-Term Memory. In A. Ellis (Ed.), Progress In The Psychology Of Language (pp. 197-258). London: Lawrence Erlbaum Associates.

Bernard, M., Mills, M., Frank, T. & McKnown, J. (2001). Which Fonts Do Children Prefer To Read Online? Usability News, Winter 2001. Retrieved June 8, 2001 from http://wsupsy.psy.twsu.edu/surl/usabilitynews/3W/fontJR.htm

Begg, I. (1972). Recall Of Meaningful Phrases. Journal Of Verbal Learning And Verbal Behavior, 19, 431-439.

Bobko, D. J., Schiffman, H. R., Castino, J. R. & Chiapetta, W. (1977). Contextual Effects On Duration Experience. American Journal Of Psychology, 90, 577-586.

Bodner, R. (1994). A Comparison Of Identification Rates Of Static And Animated Buttons. Dept.
Of Computer And Information Science, University of Guelph, Ontario, Canada.

Bridgeman, B. & Cline, F. (2000). Variations In Mean Response Times For Questions On The Computer-Adaptive GRE General Test: Implications For Fair Assessment (GRE Board Professional Report No. 96-20P; ETS RR 00-7). Princeton, NJ: ETS (Educational Testing Service).

Bridgeman, B., Cline, F. & Hessinger, J. (2003). Effect Of Extra Time On GRE Quantitative And Verbal Scores (ETS Rep. No. 03-13, GRE Rep. No. 00-03P). Princeton, NJ: ETS (Educational Testing Service).

Broadbent, D. E. & Broadbent, M. H. P. (1981). Recency Effects In Visual Memory. Quarterly Journal Of Experimental Psychology, 33A, 1-15.

Brown, J. W. (1948). A Comparison Of Verbal And Projected Verbal-Pictorial Tests As Measures Of The Ability To Apply Science Principles (Doctoral dissertation, The University of Chicago, 1948). Dissertation Abstracts International, ADD W1948, 217.

Brown, S. W. & Boltz, M. G. (2002). Attentional Processes In Time Perception: Effects Of Mental Workload And Event Structure. Journal Of Experimental Psychology: Human Perception And Performance, 28(3), 600-615.

Carpenter, P. A., Just, M. A. & Shell, P. (1990). What One Intelligence Test Measures: A Theoretical Account Of The Processing In The Raven Progressive Matrices Test. Psychological Review, 97(3), 404-431.

Carroll, J. M. & Rosson, M. B. (1992). Getting Around The Task-Artifact Cycle: How To Make Claims And Design By Scenario. ACM Transactions On Information Systems, 10(2), 181-212.

Chatman, S. P., Reynolds, C. R. & Wilson, V. L. (1984). Multiple Indexes Of Test Scatter On The Kaufman Assessment Battery For Children. Journal Of Learning Disabilities, 17, 523-531.

Cheng, P. C-H. (1996). Functional Roles For The Cognitive Analysis Of Diagrams In Problem Solving. In G. W. Cottrell (Ed.), Proceedings Of The Eighteenth Annual Conference Of The Cognitive Science Society (pp. 207-212). Hillsdale, NJ: Lawrence Erlbaum.

Chun, M. M.
& Jiang, Y. (1998). Contextual Cueing: Implicit Learning And Memory Of Visual Context Guides Spatial Attention. Cognitive Psychology, 36, 28-71.

Clarke, H. H. & Chase, W. G. (1972). On The Process Of Comparing Sentences Against Pictures. Cognitive Psychology, 3, 472-517.

Craik, F. & Lockhart, R. (1972). Levels Of Processing: A Framework For Memory Research. Journal Of Verbal Learning And Verbal Behavior, 11, 671-684.

Csapo, K. (1991). Picture Superiority In Free Recall: Imagery Of Dual Coding. In A. Paivio (Ed.), Images In Mind: The Evolution Of A Theory (pp. 76-106). New York: Harvester Wheatsheaf.

Cronbach, L. J. (1975). Beyond The Two Disciplines Of Scientific Psychology. American Psychologist, 12, 671-684.

Daneman, M. & Carpenter, P. (1980). Individual Differences In Working Memory And Reading. Journal Of Verbal Learning And Verbal Behavior, 19, 450-466.

Davidson, R. E. & Adams, J. F. (1970). Verbal And Imagery Processes In Children's Paired Associative Learning. Journal Of Experimental Child Psychology, 9, 429-435.

Decortis, F., de Keyser, V., Cacciabue, P. C. & Volta, G. (1991). The Temporal Dimension Of Man-Machine Interaction. In G. R. S. Weir & J. L. Alty (Eds.), Human-Computer Interaction And Complex Systems. Glasgow, UK: Academic Press.

Doost, R. & Turvey, M. T. (1971). Iconic Memory And Central Processing Capacity. Perception And Psychophysics, 9, 269-274.

Driscoll, M. P. (1994). Psychology Of Learning For Instruction. Needham, MA: Allyn and Bacon.

Dwyer, F. M. & De Melo, H. (1984). Effects Of Mode Of Instruction, Testing, Order Of Testing, And Cued Recall On Student Achievement. Journal Of Experimental Education, 52, 86-94.

Eisenstadt, M. & Kareev, Y. (1975). Aspects Of Human Problem Solving: The Use Of Internal Representations. In D. A. Norman & D. E. Rumelhart (Eds.), Explorations In Cognition (pp. 308-346). San Francisco, CA: Freeman.

Ekstrom, R. B., French, J. W., Harman, H. H. & Dermen, D. (1976). Kit Of Factor-Referenced Cognitive Tests.
Princeton, NJ: ETS (Educational Testing Service).

Epstein, W., Rock, I. & Zuckerman, C. B. (1960). Meaning And Familiarity In Associative Learning. Psychological Monographs, 74, 491.

Farah, M. J., Hammond, K. M., Levine, D. N. & Calvanio, R. (1988). Visual And Spatial Mental Imagery: Dissociable Systems Of Representation. Cognitive Psychology, 20, 439-462.

Frederiksen, N. & Ward, W. C. (1978). Measures For The Study Of Creativity In Scientific Problem-Solving. Applied Psychological Measurement, 2(1), 1-24.

Furst, A. J. & Hitch, G. J. (2000). Separate Roles For Executive And Phonological Components Of Working Memory In Mental Arithmetic. Memory And Cognition, 28, 774-782.

Galton, F. (1880a). Statistics On Mental Imagery. Mind, 5, 301-318.

Galton, F. (1880b). Visualised Numerals. Nature, 21, 252-256.

Galton, F. (1880c). Visualised Numerals. Nature, 21, 494-495.

Galton, F. (1880d). Visualised Numerals. Journal Of The Anthropological Institute, 10, 85-102.

Ginther, A. (2002). The Effects Of The Presence And Absence Of Visual Accompaniments On Performance On TOEFL Listening Comprehension Stimuli (ETS RR-66). Princeton, NJ: Educational Testing Service.

Glaser, W. R. & Glaser, M. O. (1989). Context Effects In Stroop-Like Word And Picture Processing. Journal Of Experimental Psychology: General, 118, 13-42.

Glenberg, A. M. & Langston, W. E. (1992). Comprehension Of Illustrated Text: Pictures Help To Build Mental Models. Journal Of Memory And Language, 31, 129-151.

Goolkasian, P. (1996). Picture-Word Differences In A Sentence Verification Task. Memory And Cognition, 24, 584-594.

Goolkasian, P. (1999). Retinal Location And Its Effect On The Spatial Distribution Of Visual Attention. American Journal Of Psychology, 112(2), 187-211.

Guilford, J. P. (1967). The Nature Of Human Intelligence. New York: McGraw-Hill.

Gyselinck, V., Cornoldi, C., Dubois, V. & Ehrlich, M-F. (2002). Visuospatial Memory And Phonological Loop In Learning From Multimedia.
Applied Cognitive Psychology, 16, 665-685.

Hackos, J. T. & Redish, J. C. (1998). User And Task Analysis For Interface Design. New York: Wiley Computer Publishing.

Haladyna, T. M. (1997). Writing Test Items To Evaluate Higher Order Thinking. Boston: Allyn and Bacon.

Haramundanis, K. (1996). Why Icons Cannot Stand Alone. Journal Of Computer Documentation, 22(1), 49-51.

Harmes, J. C. (1999). Computer-Based Testing: Toward The Design And Use Of Innovative Items. November 22, University of South Florida.

Hayes, J. R. (1973). On The Function Of Visual Imagery In Elementary Mathematics. In W. Chase (Ed.), Visual Information Processing (pp. 177-214). New York: Academic Press.

Heathcote, D. (1994). The Role Of Visuo-Spatial Working Memory In The Mental Addition Of Multi-Digit Addends. Current Psychology Of Cognition, 13, 207-245.

Hegarty, M. & Kozhevnikov, M. (1999). Types Of Visual-Spatial Representations And Mathematical Problem-Solving. Journal Of Educational Psychology, 91, 684-689.

Henderson, W. D. (2004). Speed As A Variable On The LSAT And Law School Exams (LSAC Research Report No. 03-03). Newtown, PA: LSAC (Law School Admissions Council).

Hitch, G. J. (1978). The Role Of Short-Term Memory In Mental Arithmetic. Cognitive Psychology, 10, 203-323.

Hornof, A. J. (2001). Visual Search And Mouse-Pointing In Labeled Versus Unlabeled Two-Dimensional Visual Hierarchies. ACM Transactions On Computer-Human Interaction, 8(3), 171-197.

Humphreys, G. W., Riddoch, M. J. & Quinlan, P. T. (1988). Cascade Processes In Picture Identification. Cognitive Neuropsychology, 5, 67-103.

Intraub, H. (1997). The Representation Of Visual Scenes. Trends In Cognitive Sciences, 1(6), 217-221.

Irwin, D. E., Yantis, S. & Jonides, J. (1983). Evidence Against Visual Integration Across Saccadic Eye Movements. Perception And Psychophysics, 34, 49-57.

Irwin, D. E., Brown, J. S. & Sun, J. S. (1988). Visual Masking And Visual Integration Across Saccadic Eye Movements.
Journal Of Experimental Psychology: General, 117, 276-287.

Irwin, D. E. (1991). Information Integration Across Saccadic Eye Movements. Cognitive Psychology, 23, 420-456.

Jiang, Y., Olson, I. R. & Chun, M. M. (2000). Organization Of Visual Short-Term Memory. Journal Of Experimental Psychology: Learning, Memory, And Cognition, 26, 683-702.

Kaufman, A. S. (1976a). A New Approach To The Interpretation Of Test Scatter On The WISC-R. Journal Of Learning Disabilities, 9, 33-41.

Kaufman, A. S. (1976b). Verbal-Performance IQ Discrepancies On The WISC-R. Journal Of Consulting And Clinical Psychology, 44, 739-744.

Kaufman, A. S. (1990). Assessing Adolescent And Adult Intelligence. Boston: Allyn and Bacon.

Kellogg, G. S. & Howe, M. J. A. (1971). Using Words And Pictures In Foreign Language Learning. Alberta Journal Of Educational Research, 17, 87-94.

Kosslyn, S. M. (1994). Image And Brain: The Resolution Of The Imagery Debate. Cambridge, MA: MIT Press.

Kosslyn, S. M. (1995). Mental Imagery. In S. M. Kosslyn & D. Osherson (Eds.), An Invitation To Cognitive Science: Visual Cognition (Vol. 2). Cambridge, MA: MIT Press.

Kozhevnikov, M., Hegarty, M. & Mayer, R. E. (2002). Revising The Visualizer/Verbalizer Dimension: Evidence For Two Types Of Visualizers. Cognition And Instruction, 20, 47-77.

Kyllonen, P. C. & Christal, R. E. (1990). Reasoning Ability Is (Little More Than) Working Memory Capacity?! Intelligence, 14, 389-433.

Larkin, J. J. & Simon, H. A. (1987). Why A Diagram Is (Sometimes) Worth Ten Thousand Words. Cognitive Science, 11, 65-99.

Lefkowith, E. F. (1955). The Effect Of Pictorial Stimuli Similarity In Teaching And Testing (Doctoral dissertation, The Pennsylvania State University, 1955). Dissertation Abstracts International, 18, 473.

Levin, D. T. & Simons, D. J. (1997). Failure To Detect Changes To Attended Objects In Motion Pictures. Psychonomic Bulletin And Review, 4(4), 501-506.

Lewis, C., Polson, P., Wharton, C. & Rieman, J.
(1990). Testing A Walkthrough Methodology For Theory-Based Design Of Walk-Up-And-Use Interfaces. Proceedings Of The ACM CHI '90 (pp. 235-242). Seattle, WA.

Logan, G. (1988). Toward An Instance Theory Of Automatization. Psychological Review, 95(4), 492-527.

Logie, R. H. (1995). Visuo-Spatial Working Memory: Issues In Cognitive Psychology. Hove, UK; Hillsdale, USA: Lawrence Erlbaum Associates.

Lohse, G., Biolsi, K., Walker, N. & Rueter, H. (1994). A Classification Of Visual Representations. Communications Of The ACM, 37(12), 36-49.

Luck, S. J. & Vogel, E. K. (1997). The Capacity Of Visual Working Memory For Features And Conjunctions. Nature, 390, 279-281.

Marschark, M. & Cornoldi, C. (1991). Imagery And Verbal Memory. In C. Cornoldi & M. A. McDaniel (Eds.), Imagery And Cognition (pp. 41-56). New York: Springer.

Marschark, M. & Hunt, R. R. (1989). A Reexamination Of The Role Of Imagery In Learning And Memory. Journal Of Experimental Psychology: Learning, Memory, And Cognition, 15, 710-720.

Marschark, M. & Paivio, A. (1977). Integrative Processing Of Concrete And Abstract Sentences. Journal Of Verbal Learning And Verbal Behavior, 16, 217-231.

Matarazzo, J. D., Daniel, M. H., Prifitera, A. & Herman, D. O. (1988). Inter-Subtest Scatter In The WAIS-R Standardization Sample. Journal Of Clinical Psychology, 44, 940-950.

Matarazzo, J. D. & Herman, D. O. (1985). Clinical Uses Of The WAIS-R: Base Rates Of Differences Between VIQ And PIQ In The WAIS-R Standardization Sample. In B. B. Wolman (Ed.), Handbook Of Intelligence (pp. 899-932). New York: Wiley.

Mayer, R. E. & Gallini, J. K. (1990). When Is An Illustration Worth Ten Thousand Words? Journal Of Educational Psychology, 82(4), 715-726.

McCormick, B. H., DeFanti, T. A. & Brown, M. D. (1987). Visualization In Scientific Computing - A Synopsis. IEEE Computer Graphics And Applications, 7(4), 61-70.

McDaniel, M. A. & Pressley, M. (Eds.). Imagery And Related Mnemonic Processes: Theories, Individual Differences, And Applications. New York: Springer-Verlag.

McLean, J. E., Kaufman, A. S.
& Reynolds, C. R. (1989). Base Rates Of WAIS-R Subtest Scatter As A Guide For Clinical And Neuropsychological Assessment. Journal Of Clinical Psychology, 45, 919-926.

McNeal, J. M. (1994). Effect Of Rehearsal Strategies And Testing Format On Student Achievement Of Different Educational Objectives (Doctoral dissertation, The Pennsylvania State University, 1994). Dissertation Abstracts International, DAI-A 55/12, 3820.

Mel, B. W. (1986). A Connectionist Learning Model For 3-Dimensional Mental Rotation, Zoom, And Pan. Proceedings Of The Eighth Annual Conference Of The Cognitive Science Society (pp. 562-571). Hillsdale, NJ: Lawrence Erlbaum Associates.

Misanchuk, E. R., Schwier, R. A. & Boling, E. (1999). Visual Design For Instructional Multimedia. Proceedings Of The World Conference On Educational Multimedia, Hypermedia And Telecommunications, Seattle, Washington.

Molitor, S. (1989). Developing And Manipulating Knowledge By Writing. In P. Boscolo (Ed.), Writing: Trends In European Research (pp. 160-171). Padova: UPSEL Editore.

Moses, B. E. (1980). The Relationship Between Visual Thinking Tasks And Problem-Solving Performance. Paper presented at the annual meeting of the American Educational Research Association, Boston.

Neisser, U. (1967). Cognitive Psychology. Englewood Cliffs, NJ: Prentice-Hall.

Ngo, D. C. L., Teo, L. S. & Byrne, J. G. (2000). A Mathematical Theory Of Interface Aesthetics. Visual Mathematics: Art And Science Electronic Journal, 2(4). Retrieved November 2003 from http://www.members.tripod.com/vismath4/ngo/

Nickerson, R. S. (1965). Short-Term Memory For Complex Meaningful Visual Configurations: A Demonstration Of Capacity. Canadian Journal Of Psychology, 19, 155-160.

Nielsen, J. (1993). Usability Engineering. Boston, MA: Academic Press.

Norman, D. A. (1986). Cognitive Engineering. In D. A. Norman & S.
Draper (Eds.), User Centered System Design: New Perspectives On Human-Computer Interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.

Norman, D. A. (1993). Things That Make Us Smart. Cambridge, MA: Perseus Publishing.

Norman, D. A. (1999). Affordance, Conventions, And Design. Interactions, 6(3), 38-43.

Olson, G. M. & Olson, J. S. (2003). Human-Computer Interaction: Psychological Aspects Of The Human Use Of Computing. Annual Review Of Psychology, 54, 491-516.

Paivio, A. (1986). Mental Representations (Oxford Psychology Series, Vol. 9). New York and Oxford: Oxford University Press.

Paivio, A. (1990). Mental Representations: A Dual Coding Approach. New York: Oxford University Press.

Paivio, A. (1991). Images In Mind: The Evolution Of A Theory. New York: Harvester Wheatsheaf.

Paivio, A. & Begg, I. (1981). Psychology Of Language. Englewood Cliffs, NJ: Prentice Hall.

Park, O. & Hopkins, R. (1993). Instructional Conditions For Using Dynamic Visual Displays. Instructional Science, 21, 427-449.

Parshall, C., Davey, T. & Pashley, P. (2000). Innovative Item Types For Computerized Testing. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized Adaptive Testing: Theory And Practice (pp. 129-149). The Netherlands: Kluwer Academic Publishers.

Pashler, H. (1988). Familiarity And Visual Change Detection. Perception And Psychophysics, 44(4), 369-378.

Phillips, W. A. (1974). On The Distinction Between Sensory Storage And Short-Term Visual Memory. Perception And Psychophysics, 16(2), 283-290.

Phillips, W. A. & Christie, D. F. M. (1977a). Components Of Visual Memory. Quarterly Journal Of Experimental Psychology, 29, 117-133.

Potter, M. C. & Faulconer, B. A. (1975). Time To Understand Pictures And Words. Nature, 253, 437-438.

Presmeg, N. C. (1986b). Visualization In High School Mathematics. For The Learning Of Mathematics, 6(3), 42-46.

Pullar, D. & Egenhofer, M. (1988). Towards Formal Definitions Of Topological Relations Among Spatial Objects.
Proceedings Of The Third International Symposium On Spatial Data Handling (pp. 225-241). Sydney, Australia.

Raju, N. S. (1988). The Area Between Two Item Characteristic Curves. Psychometrika, 53, 495-502.

Raju, N. S. (1990). Determining The Significance Of Estimated Signed And Unsigned Areas Between Two Item Response Functions. Applied Psychological Measurement, 14, 197-207.

Rayner, K. & Pollatsek, A. (1983). Is Visual Information Integrated Across Saccades? Perception And Psychophysics, 34(1), 39-48.

Reese, H. (1972). Imagery And Multiple-List Paired-Associative Learning In Young Children. Journal Of Experimental Child Psychology, 9, 310-323.

Reiber, L. P. & Kini, R. S. (1991). Theoretical Foundations Of Instructional Applications For Computer-Generated Animated Visuals. Journal Of Computer-Based Instruction, 17, 83-88.

Reinert, H. (1976). One Picture Is Worth A Thousand Words? Not Necessarily! Modern Language Journal, 60, 160-168.

Rensink, R. A., O'Regan, J. K. & Clark, J. J. (1997). To See Or Not To See: The Need For Attention To Perceive Changes In Scenes. Psychological Science, 8(5), 368-373.

Repp, B. H. & Penel, A. (2002). Auditory Dominance In Temporal Processing: New Evidence From Synchronization With Simultaneous Visual And Auditory Sequences. Journal Of Experimental Psychology: Human Perception And Performance, 28(5), 1085-1099.

Richards, D. R. (1987). An Experimental Assessment Of The Relative Effectiveness Of Varied Types Of Computer-Generated Feedback Strategies In Facilitating Achievement Of Different Educational Objectives As Measured By Verbal And Visual Tests (Doctoral dissertation, The Pennsylvania State University, 1987). Dissertation Abstracts International, 48(10), 2528.

Rieber, L. P. (1994). Computers, Graphics, And Learning. Madison, WI: WCB Brown and Benchmark.

Riddoch, M. J. & Humphreys, G. W. (1987a). Visual Object Processing In Optic Aphasia: A Case Of Semantic Access Agnosia. Cognitive Neuropsychology, 4, 131-185.

Riddoch, M. J. & Humphreys, G. W.
(1987b). Picture Naming. In G. W. Humphreys & M. J. Riddoch (Eds.), Visual Object Processing: A Cognitive Neuropsychological Approach (pp. 107-143). London: Erlbaum UK.

Roediger, H. L. & Weldon, M. S. (1987). Reversing The Picture Superiority Effect. In M. A. McDaniel & M. Pressley (Eds.), Imagery And Related Mnemonic Processes: Theories, Individual Differences, And Applications (pp. 151-174). New York: Springer-Verlag.

Rohwer, W. D., Lynch, S., Levin, J. R. & Suzuki, N. (1967). Pictorial And Verbal Factors In The Efficient Learning Of Paired Associates. Journal Of Educational Psychology, 58, 278-284.

Rosenthal, B. L. & Kamphaus, R. W. (1988). Interpretive Tables For Test Scatter On The Stanford-Binet Intelligence Scale: Fourth Edition. Journal Of Psychoeducational Assessment, 6, 359-370.

Roth, S. (1995). Visual Literacy And The Design Of Digital Media. Computer Graphics, 45-47.

Rumelhart, D. E. & Norman, D. A. (1988). Representation In Memory. In R. C. Atkinson, R. J. Herrnstein, G. Lindzey & R. D. Luce (Eds.), Stevens' Handbook Of Experimental Psychology. New York: Wiley.

Sadoski, M., Goetz, E. T. & Avila, E. (1995). Concreteness Effects In Text Recall: Dual Coding And Context Availability? Reading Research Quarterly, 30, 278-288.

Sadoski, M., Goetz, E. T. & Fritz, J. B. (1991). Impact Of Concreteness On Comprehensibility, Interest, And Memory For Text: Implications For Dual Coding Theory And Text Design. Journal Of Educational Psychology, 85(2), 291-304.

Sadoski, M., Paivio, A. & Goetz, E. T. (1991). A Critique Of Schema Theory In Reading And A Dual Coding Alternative. Reading Research Quarterly, 26(4), 463-485.

Sattler, J. M. (1988). Assessment Of Children's Intelligence And Special Abilities (3rd ed.). San Diego, CA: Author.

Schneider, W. & Detweiler, M. (1987). A Connectionist/Control Architecture For Working Memory. In G. H. Bower (Ed.), The Psychology Of Learning And Motivation: Advances In Research And Theory (pp. 53-119). San Diego, CA: Academic Press.

Schnipke, D.
L. & Scrams, D. J. (1999). Modeling Item Response Times With A Two-State Mixture Model: A New Approach To Measuring Speededness (LSAC Research Report No. 96-02). Newtown, PA: LSAC (Law School Admissions Council).

Schnipke, D. L. & Scrams, D. J. (1999). Exploring Issues Of Test Taker Behavior: Insights Gained From Response-Time Analyses (LSAC Research Report No. 98-09). Newtown, PA: LSAC (Law School Admissions Council).

Scrams, D. J. & Schnipke, D. L. (1999). Making Use Of Response Times In Standardized Tests: Are They Measuring The Same Thing? (LSAC Research Report No. 97-04). Newtown, PA: LSAC (Law School Admissions Council).

Seymour, P. H. K. (1979). Human Visual Cognition. London: Macmillan.

Simons, D. J. (1996). In Sight, Out Of Mind: When Object Representations Fail. Psychological Science, 7, 301-305.

Simons, D. J. & Levin, D. T. (1998). Failure To Detect Changes To People In A Real-World Interaction. Psychonomic Bulletin And Review, 5(4), 644-649.

Simpson, T. J. (1994). Message Into Medium: An Extension Of The Dual Coding Hypothesis. IVLA (Imagery And Visual Literacy) Annual Conference, 1994, 255-263.

Smith, M. C. & Magee, L. E. (1980). Tracing The Time Course Of Picture-Word Processing. Journal Of Experimental Psychology: General, 109, 373-392.

Smythe, P. C. (1970). Pair Concreteness And Mediation Instructions In Forward And Backward Paired Associative Recall. Unpublished doctoral dissertation, University of Western Ontario, London, Ontario, Canada.

Snodgrass, J. G. (1984). Concepts And Their Surface Representations. Journal Of Verbal Learning And Verbal Behavior, 23, 3-22.

Snyder, H. L. & Maddox, M. E. (1978). Information Transfer From Computer-Generated Dot-Matrix Displays (Final Rep. HFL-78-3/ARO-78-1; NTIS No. AD A063 505). Blacksburg, VA: VPI (Virginia Polytechnic Institute).

Sperling, G. (1960). Afterimage Without Prior Image. Science, 131, 1613-1614.

Stevens, S. S. & Greenbaum, H. B. (1966).
Regression Effect In Psychophysical Judgment. Perception And Psychophysics, 1, 439-446.

Stricker, L. J. (1993). Discrepant LSAT Subscores (LSAC Research Report No. 93-01). Newtown, PA: LSAC (Law School Admissions Council).

Suwarsono, S. (1982). Visual Imagery In The Mathematical Thinking Of Seventh Grade Students. Unpublished doctoral dissertation, Monash University, Melbourne.

Sweller, J. (1988). Cognitive Load During Problem Solving: Effects On Learning. Cognitive Science, 12, 257-285.

Sweller, J. & Cooper, G. A. (1985). The Use Of Worked Examples As A Substitute For Problem-Solving In Learning Algebra. Cognition And Instruction, 2, 59-89.

Theios, J. & Amrhein, P. C. (1989). Theoretical Analysis Of The Cognitive Processing Of Lexical And Pictorial Stimuli: Reading, Naming, And Visual And Conceptual Comparisons. Psychological Review, 96, 5-24.

Thissen, D. (1983). Timed Testing: An Approach Using Item Response Theory. In D. Weiss (Ed.), New Horizons In Testing: Latent Trait Theory And Computerized Adaptive Testing (pp. 179-203). New York: Academic Press.

Thomas, K. E., Newstead, S. E. & Handley, S. J. (2003). Exploring The Time Prediction Process: The Effects Of Task Experience And Complexity On Prediction Accuracy. Applied Cognitive Psychology, 17, 655-673.

Trbovich, P. L. & LeFevre, J. (2003). Phonological And Visual Working Memory In Mental Addition. Memory And Cognition, 31, 738-745.

Tullis, T. S. (1983). The Formatting Of Alphanumeric Displays. Human Factors, 25, 657-683.

Van Dusen, L. M., Spach, J. D., Brown, B. & Hansen (1999). TRIO: A New Measure Of Visual Processing Ability. Educational And Psychological Measurement, 59(6), 1030-1046.

Vierordt, K. (1868). Der Zeitsinn Nach Versuchen [Empirical Studies Of Time Experience]. Tübingen, Germany: Laupp.

Weldon, M. S. & Roediger, H. L., III (1987). Altering Retrieval Demands Reverses The Picture Effect. Memory And Cognition, 15, 269-280.

Wertheimer, M.
(1920, 1925). Reprinted in Philosophische Zeitschrift für Forschung und Aussprache, 1, 39-60 (1925), and as an offprint: Erlangen: Verlag der Philosophischen Akademie (1925). [Gestalt Theory] (1985), 7(2), 99-120. Opladen: Westdeutscher Verlag.

Wilson, K. M. & Powers, D. E. (1994). Factors In Performance On The Law School Admission Test (LSAC Research-Statistical Report No. 93-04). Newtown, PA: LSAC (Law School Admissions Council).

Woods, D. D. (1985). Coping With Complexity: The Psychology Of Human Behavior In Complex Systems. In L. P. Goodstein, H. B. Andersen & S. E. Olsen (Eds.), Tasks, Errors And Mental Models (pp. 128-148). London: Taylor and Francis.

Woods, D. D. (1991). The Cognitive Engineering Of Problem Representations. In G. R. S. Weir & J. L. Alty (Eds.), Human-Computer Interaction And Complex Systems (pp. 169-187). Glasgow, Scotland: Academic Press.

Yarmey, A. D. & O'Neill, B. J. (1969). S-R And R-S Paired Associative Learning As A Function Of Concreteness, Imagery, Specificity, And Association Value. Journal Of Psychology, 71, 95-109.

Zenisky, A. L. & Sireci, S. G. (2002). Technological Innovations In Large-Scale Assessment. Applied Measurement In Education, 15, 337-362.

Zumbo, B. D. (1999). A Handbook On The Theory And Methods Of Differential Item Functioning (DIF): Logistic Regression Modeling As A Unitary Framework For Binary And Likert-Type (Ordinal) Item Scores. Ottawa, Canada: Directorate of Human Resources Research and Evaluation, National Defense Headquarters.

APPENDICES

APPENDIX A: Table 8. DCT Table of Theoretical & Empirical Assumptions

General Empiricist Assumptions: Cognition is served by two modality-specific systems that are experientially derived and differentially specialized for representing and processing information concerning nonverbal objects, events, and language. Distinctions between symbolic and sensorimotor systems.

Unit Level
Properties: Representational units are modality specific, vary hierarchically in intraunit size, and are organized in synchronous vs. sequential structures.

System Level Properties: Functional independence and partial interconnectedness between and within systems. Interunit processing operations: 1. Activation of representations. 2. Representational, referential & associative. 3. Synchronous vs. sequential. 4. Transformational. 5. Conscious & automatic. Basic functions: evaluative, mnemonic, motivational & emotional.

Empirical Variables: Theoretical assumptions are linked to classes of operational indicators and procedures: stimulus attributes, experimental manipulations, individual differences in cognitive habits and skills, and subjective reports.

Phenomenal Domain: Processing of verbal & nonverbal information in perceptual memory, language, and complex problem-solving tasks; neuropsychology; issues in epistemology and philosophy of science.

APPENDIX B: Sample 1. Sample Format of McNeal's Verbal to Visual Test Comparisons

Plate 1. Terminology Test, Verbal Form
Blood from the right ventricle leaves the heart through the
a. veins
b. aortic artery
c. pulmonary artery
d. superior vena cava

Plate 2. Terminology Test, Visual Form
Select the letter which correctly represents the part or function of the heart described in each question. The vessel(s) through which the blood leaves the heart from the right ventricle:

Figure 3.6 Sample Questions from Terminology Test

APPENDIX C: Table 9. Means & Standard Deviations on Verbal Test Form

Form        Identification   Terminology   Comprehension   Composite
T1   M      14.77            13.34         11.02           39.14
     SD      3.44             3.31          2.77            6.83
T2   M      15.89            14.45         12.05           42.27
     SD      2.81             2.71          3.07            7.53
T3   M      15.23            13.36         11.02           39.61
     SD      2.85             2.77          2.82            6.66
T4   M      16.52            14.82         12.20           43.30
     SD      2.53             2.73          3.11            7.09

Table 10.
Means & Standard Deviations on Visual Test Form

Form        Identification   Terminology   Comprehension   Composite
T1   M      14.45            13.00         10.16           37.61
     SD      2.93             3.65          3.18            8.62
T2   M      15.30            13.20          9.86           38.59
     SD      3.31             3.40          3.17            8.27
T3   M      14.95            12.59         10.25           37.75
     SD      2.85             3.38          3.08            8.00
T4   M      16.59            14.25         10.98           41.82
     SD      3.22             3.18          3.15            7.26

APPENDIX D: Table 11. Summary correlations between and among predictor and criterion variables for law schools participating in 1995-1996 correlation studies: selected first-year results.

VAR           YR     MN      SD     25      50      75      MIN     MAX

Zero-Order Correlations
LSAT/FYA      1995   0.04    0.10   0.35    0.42    0.47    0.02    0.61
              1996   0.04    0.09   0.34    0.40    0.46    0.01    0.62
UGPA/FYA      1995   0.26    0.08   0.20    0.27    0.31    0.05    0.45
              1996   0.25    0.08   0.19    0.25    0.31    0.02    0.42
LSAT/UGPA     1995  -0.05    0.14  -0.13   -0.05    0.06   -0.44    0.31
              1996  -0.06    0.13  -0.15   -0.06    0.04   -0.46    0.24

Multiple Correlations
LSAT and      1995   0.49    0.08   0.44    0.50    0.55    0.18    0.68
UGPA/FYA      1996   0.48    0.08   0.44    0.49    0.53    0.11    0.68

APPENDIX E: Table 12. Incidence of Significant and Rare Differences for Each Pair of LSAT Subscores.

Subscore Pair                            Significant Differences    Rare Differences
                                         (-%)      (+%)             (-%)     (+%)
Analytical Reasoning vs.
  Reading Comprehension                   9.8       9.8              2.5      2.5
Analytical Reasoning vs.
  Logical Reasoning                      10.3      10.0              2.5      2.5
Reading Comprehension vs.
  Logical Reasoning                       5.1       5.1              0.25     0.25

Table 13. Incidence of Significant and Rare Differences for All Pairs of LSAT Subscores.

Frequency   Significant Difference (%)   Rare Difference (%)
0           66.1                         88.1
1           18.1                          8.8
2           15.5                          3.1
3            0.4                          0.0

APPENDIX F: Sample 2. Kit of Factor-Referenced Cognitive Tests - Identical Pictures Test

IDENTICAL PICTURES - P-3

How fast can you match a given object? This is a test of your ability to pick the correct object quickly. At the left of each row is an object. To the right are five test objects, one of which matches the object at the left.
Look at the example below:

[Example row of figures; the practice figures are not recoverable from the scanned image.]

The third test object has been marked by blackening the space below it, because it is the same as the object at the left. Your score on this test will be the number of objects marked correctly minus a fraction of the number marked incorrectly. Work as quickly as you can without sacrificing accuracy. You will have 1 1/2 minutes for each of the two parts of this test. Each part has two pages. Be sure to do both pages if you have time. When you have finished Part 1, STOP. Please do not go on to Part 2 until you are asked to do so. DO NOT TURN THIS PAGE UNTIL ASKED TO DO SO.

Copyright 1962, 1975 by Educational Testing Service. All rights reserved.

Sample 3. The Kit of Factor-Referenced Cognitive Tests - Finding A's Test

FINDING A's TEST - P-1

This is a test of your speed in finding the letter "a" in words. Your task is to put a line through any such word. Listed below are five columns of words. Each column has five words containing the letter "a". The first two columns have already been marked correctly. Now, on the other three columns, practice for speed in putting a line through the words with an "a".

[Five columns of practice words follow in the original; the words containing the letter "a" are struck through.]

Remember, in each column there are five words containing the letter "a". Your score on this test will be the number of words marked correctly.
Work as quickly as you can without sacrificing accuracy. You will have 2 minutes for each of the two parts of this test. Each part has four pages. When you have finished Part 1 (pages 2 to 5), STOP. Please do not go on to Part 2 until you are asked to do so.

DO NOT TURN THIS PAGE UNTIL ASKED TO DO SO.

Copyright 1962, 1975 by Educational Testing Service. All rights reserved.

APPENDIX G

Sample 4. Sample Screen Shot of DCT & LSAT item formats, Q1-Q5

[Screen shot image; the rotated on-screen text captured by the scanner is not recoverable.]

Sample 5. Sample Screen Shot of DCT & LSAT item formats, Q11-Q17

[Screen shot image; the rotated on-screen text captured by the scanner is not recoverable.]
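Both ETS instruments reproduced above use simple formula scoring: Identical Pictures is "rights minus a fraction of wrongs," while Finding A's counts rights only. The booklet excerpt does not state the fraction; a sketch below assumes the conventional guessing correction W/(k-1) for k answer choices (k = 5 here, since each row has five test objects). Function names are mine, not ETS's.

```python
def identical_pictures_score(num_right: int, num_wrong: int, choices: int = 5) -> float:
    """Rights minus a fraction of wrongs. Assumes the standard
    correction for guessing, W/(k-1), for k answer choices."""
    return num_right - num_wrong / (choices - 1)

def finding_as_score(num_right: int) -> int:
    """Finding A's is scored simply as the number of words marked correctly."""
    return num_right

# e.g. 40 rows right and 4 rows wrong on Identical Pictures
print(identical_pictures_score(40, 4))  # 40 - 4/4 = 39.0
```

Under this correction, random marking yields an expected score of zero, which is why speeded tests of this kind discourage guessing.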
APPENDIX H: Figure 4. Mean Response Times (MRT) of DCT & LSAT groups.

[Scatter plot of mean response times, 0-250 seconds.]
LSATRT = Traditional LSAT Mean Response Times
DCTRT = DCT Mean Response Times
X Axis = Total number of questions answered

Figure 5. LSAT RT to the Proportion of Correct Responses

[Scatter plot of LSATTC against LSATRT.]
LSATTC = Total number of correct responses
LSATRT = LSAT Response Times

Figure 6. DCT RT to the Proportion of Correct Responses

[Scatter plot of DCTTC against DCTRT.]
DCTTC = Total number of correct responses
DCTRT = DCT Response Times

APPENDIX I: Table 14.
Correlation Coefficients of All Examinees for Mean Response Times

           T1-I1    T2-I2    T3-I3    T4-I4    T5-I5    T6-I6    T7-I7    T8-I8
T1-I1       1.00    *0.25    -0.15   **0.30   **0.43   **0.29    0.04    -0.17
T2-I2      *0.25     1.00     0.06   **0.35    0.04    *0.21     0.01    -0.03
T3-I3      -0.15     0.06     1.00     0.10    0.17    -0.04   **0.39   **0.33
T4-I4     **0.30   **0.35     0.10     1.00  **0.33     0.02    *0.21    *0.23
T5-I5     **0.43     0.04     0.17   **0.33    1.00     0.03   **0.36    0.18
T6-I6     **0.29    *0.21    -0.04     0.02    0.03     1.00    -0.02    -0.10
T7-I7       0.04     0.01   **0.39    *0.21  **0.36    -0.02     1.00   **0.50
T8-I8      -0.17    -0.03   **0.33    *0.23    0.18    -0.10   **0.50     1.00
T9-I9     *-0.20    -0.02     0.13     0.04    0.23    -0.01    *0.26   **0.36
T10-I10    -0.05    -0.02     0.12     0.15    0.12     0.05    *0.21     0.16
T11-I11   *-0.24    -0.01     0.09    -0.08  **-0.27    0.18    -0.04     0.06
T12-I12  **-0.27    -0.03     0.16    -0.02   -0.13     0.03     0.05     0.09
T13-I13    -0.16    -0.05    -0.15   *-0.23  **-0.28    0.12    -0.17     0.07
T14-I14  **-0.28    -0.12     0.01   *-0.24  **-0.30   *-0.23   *-0.23    0.03
T15-I15   *-0.24   **0.27    -0.10    -0.11  **-0.31    -0.11    -0.20   -0.08
T16-I16    -0.19  **-0.27    -0.09    -0.08   *-0.26   *-0.21   *-0.25   -0.11
T17-I17   *-0.23   *-0.25    -0.14   *-0.21  **-0.31    -0.19  **-0.42   *-0.25
T18-I18    -0.16  **-0.30  **-0.26  **-0.44  **-0.40    -0.06  **-0.41  **-0.36
T19-I19    -0.19  **-0.26    -0.16  **-0.31  **-0.31     0.00   *-0.25   *-0.25
T20-I20   *-0.24  **-0.27   *-0.23   *-0.24  **-0.36    -0.19  **-0.37  **-0.27
T21-I21    -0.19  **-0.29   *-0.25  **-0.29  **-0.35    -0.14  **-0.41  **-0.32
T22-I22    -0.15   *-0.22  **-0.30  **-0.31  **-0.33   *-0.21  **-0.39  **-0.34
T23-I23    -0.10    -0.14  **-0.27   *-0.24  **-0.28   *-0.24  **-0.34  **-0.28
Format     *0.20     0.00    -0.15     0.10    0.12    -0.14    -0.03    -0.17

Table 14 (continued)

           T9-I9   T10-I10  T11-I11  T12-I12  T13-I13  T14-I14  T15-I15  T16-I16
T1-I1     *-0.20    -0.05   *-0.24  **-0.27    -0.16  **-0.28   *-0.24    -0.19
T2-I2      -0.02    -0.02    -0.01    -0.03    -0.05    -0.12   **0.27  **-0.27
T3-I3       0.13     0.12     0.09     0.16    -0.15     0.01    -0.10    -0.09
T4-I4       0.04     0.15    -0.08    -0.02   *-0.23   *-0.24    -0.11    -0.07
T5-I5      *0.23     0.12  **-0.27    -0.13  **-0.28   *-0.30  **-0.31   *-0.26
T6-I6      -0.01     0.05     0.18     0.03     0.12   *-0.23    -0.11   *-0.21
T7-I7      *0.26    *0.21    -0.04     0.05    -0.17   *-0.23    -0.20    -0.25
T8-I8     **0.36     0.16     0.06     0.09     0.07     0.03    -0.08    -0.10
T9-I9       1.00   **0.32     0.19     0.16     0.03    -0.05    -0.08    -0.13
T10-I10   **0.32     1.00     0.11     0.12    -0.16    -0.19    -0.08    -0.14
T11-I11     0.19     0.12     1.00   **0.30    -0.02     0.12     0.08    -0.10
T12-I12     0.16     0.11   **0.30     1.00    *0.21   **0.35    *0.20    *0.21
T13-I13     0.03    -0.16    -0.02    *0.21     1.00   **0.27     0.11     0.11
T14-I14    -0.05    -0.19     0.12   **0.35   **0.27     1.00   **0.45   **0.46
T15-I15    -0.08    -0.09     0.08    *0.20     0.11   **0.45     1.00   **0.56
T16-I16    -0.13    -0.14    -0.10    *0.21     0.11   **0.46   **0.56     1.00
T17-I17    -0.12    -0.18     0.01     0.07     0.19   **0.42   **0.41   **0.50
T18-I18  **-0.38  **-0.29    -0.19    -0.14    *0.21     0.09     0.08    *0.23
T19-I19  **-0.30  **-0.29    -0.14    -0.07   **0.31     0.12    *0.23    *0.26
T20-I20  **-0.39  **-0.36    -0.16    -0.19     0.11     0.05     0.15    *0.21
T21-I21  **-0.31  **-0.30    -0.16    -0.18     0.14     0.05     0.13     0.17
T22-I22  **-0.38  **-0.32    -0.19  **-0.27     0.07     0.04     0.04    *0.23
T23-I23  **-0.31  **-0.31    -0.19  **-0.29     0.05     0.04     0.10     0.12
Format     -0.15    -0.10    -0.11   *-0.26     0.02    -0.03    -0.14    -0.07

Table 14 (continued)

          T17-I17  T18-I18  T19-I19  T20-I20  T21-I21  T22-I22  T23-I23  Format
T1-I1     *-0.23    -0.12    -0.19   *-0.24    -0.19    -0.15    -0.10    *0.20
T2-I2     *-0.25  **-0.30  **-0.26  **-0.27  **-0.29   *-0.22    -0.14     0.00
T3-I3      -0.14  **-0.26    -0.16   *-0.23   *-0.25  **-0.30  **-0.27    -0.15
T4-I4     *-0.21  **-0.44   *-0.23   *-0.24  **-0.29  **-0.31   *-0.24     0.10
T5-I5    **-0.31  **-0.40  **-0.31  **-0.36  **-0.35  **-0.33  **-0.28     0.12
T6-I6      -0.19    -0.06     0.00    -0.19    -0.14   *-0.21   *-0.24    -0.14
T7-I7    **-0.42  **-0.41   *-0.25  **-0.37  **-0.41  **-0.39  **-0.34    -0.03
T8-I8     *-0.25  **-0.36   *-0.25  **-0.28  **-0.32  **-0.40  **-0.28    -0.17
T9-I9      -0.12  **-0.38  **-0.30  **-0.39  **-0.31  **-0.38  **-0.31    -0.15
T10-I10    -0.18  **-0.29  **-0.29  **-0.36  **-0.30  **-0.32  **-0.31    -0.10
T11-I11     0.01    -0.19    -0.14    -0.16    -0.16    -0.19    -0.19    -0.11
T12-I12     0.07    -0.14    -0.07    -0.19    -0.18  **-0.27  **-0.29   *-0.26
T13-I13     0.19    *0.21   **0.31     0.11     0.14     0.07     0.05     0.02
T14-I14   **0.42     0.09     0.12     0.05     0.05     0.04     0.04    -0.03
T15-I15   **0.41     0.08    *0.22     0.15     0.13     0.04     0.10    -0.14
T16-I16   **0.50    *0.23    *0.26    *0.21     0.17    *0.23     0.12    -0.07
T17-I17     1.00   **0.33   **0.33   **0.30   **0.28    *0.27   **0.29    -0.15
T18-I18   **0.33     1.00   **0.52   **0.49   **0.49   **0.51   **0.40     0.06
T19-I19   **0.33   **0.52     1.00   **0.55   **0.49   **0.32     0.20     0.02
T20-I20   **0.30   **0.49   **0.55     1.00   **0.72   **0.64   **0.62     0.02
T21-I21   **0.28   **0.49   **0.49   **0.72     1.00   **0.74   **0.50     0.08
T22-I22    *0.26   **0.51   **0.32   **0.64   **0.74     1.00   **0.67     0.08
T23-I23   **0.29   **0.40     0.20   **0.62   **0.50   **0.67     1.00     0.09
Format     -0.15     0.06     0.02     0.02     0.08     0.08     0.09     1.00

** p <= .01   * p <= .05
T-I = response time taken to answer the specific item.

APPENDIX J: Table 16. Multivariate Analysis of Variance (MANOVA) of Significant Items

Dependent Variable    df       F       Sig.
Item 1                 1     0.236    0.628
Time 1                 1     3.980    0.049
Item 2                 1     2.918    0.091
Time 2                 1     0.000    0.998
Item 7                 1     4.511    0.036
Time 7                 1     0.100    0.753
Item 12                1     0.344    0.559
Time 12                1     7.073    0.009
Item 14                1     3.891    0.052
Time 14                1     0.031    0.860
Item 15                1     6.065    0.016
Time 15                1     2.177    0.144
Item 17                1     3.387    0.070
Time 17                1     4.345    0.040
Item 18                1     2.891    0.093
Time 18                1     0.163    0.687
Item 19                1    11.091    0.001
Time 19                1     0.222    0.639
Item 22                1     6.340    0.015
Time 22                1     0.814    0.371
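The starred entries in Table 14 flag correlations significant at the .05 (*) and .01 (**) levels, two-tailed. A minimal, stdlib-only sketch of how one such table cell could be produced is below; the critical |r| cut-offs passed to `flag` are illustrative placeholders (the real cut-offs depend on the study's sample size), and the data are invented, not taken from the dissertation.

```python
import math

def pearson_r(x, y):
    """Plain Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def flag(r, crit_05, crit_01):
    """Attach the table's significance markers, given two-tailed critical |r| values."""
    if abs(r) >= crit_01:
        return f"**{r:.2f}"
    if abs(r) >= crit_05:
        return f"*{r:.2f}"
    return f"{r:.2f}"

# Hypothetical response times (seconds) and 0/1 item scores for one item pair.
times = [62, 75, 81, 90, 55, 68]
scores = [1, 0, 0, 0, 1, 1]
r = pearson_r(times, scores)
print(flag(r, crit_05=0.20, crit_01=0.26))
```

In practice the cut-offs would come from the t distribution with n - 2 degrees of freedom (t = r * sqrt((n-2)/(1-r^2))); hard-coding them here just keeps the sketch dependency-free.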