HIGH CONTRAST IN LOW-LEVEL VISION
By
Carie Cunningham

A THESIS
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
Communication - Master of Arts
2014

ABSTRACT
HIGH CONTRAST IN LOW-LEVEL VISION
By
Carie Cunningham
Intuitively, many people believe they are aware of all the information available in their
surroundings. However, that may not be correct. This paper identifies key visual features that
make up the information being shared via television broadcasts. Specifically, this project uses a
cognitive science approach to look at the competing hypotheses about the role of motion in
attentional capture. The attention literature suggests that attention will switch from one stimulus
to another when the second stimulus is either new to the environment or “odd” to the
environment. This paper reports on a critical test among three competing hypotheses (new
object, unique event, and behavioral urgency) to better understand how to capture attention in a
realistic television viewing setting. Using a within-subjects design, subjects viewed video and then
were asked if they recognized any of the secondary stimulus manipulations. The new object
hypothesis was supported, while the other hypotheses were not.
Keywords: cognitive, communication, inattentional blindness, motion, attentional capture

Copyright by
CARIE CUNNINGHAM
2014

TABLE OF CONTENTS

LIST OF TABLES...........................................................................................................................v
LIST OF FIGURES........................................................................................................................vi
HIGH CONTRAST IN LOW-LEVEL VISION..............................................................................1
Attention..............................................................................................................................2
Three attention networks..........................................................................................2
Exogenous attention shift.........................................................................................3
Attentional Capture..................................................................................................5
Color and hue...............................................................................................6
Luminance....................................................................................................7
Size...............................................................................................................8
Change in stimulus speed and direction.......................................................8
Motion..............................................................................................8
New object hypothesis. .......................................................9
Delayed-signal hypothesis...................................................9
Unique event hypothesis....................................................10
Behavior urgency hypothesis. ...........................................10
Hypotheses.........................................................................................................................12
New object hypothesis...........................................................................................13
Unique event hypothesis........................................................................................13
Behavioral urgency hypothesis..............................................................................13
Method...............................................................................................................................14
Participants............................................................................................................14
Procedure...............................................................................................................14
Materials................................................................................................................15
Manipulation..............................................................................................15
Outcome measure.......................................................................................16
Results................................................................................................................................16
Discussion..........................................................................................................................19
Conclusion.........................................................................................................................23
APPENDIX.................................................................................................................................24
REFERENCES..............................................................................................................................28

LIST OF TABLES

Table 1: Counterbalanced graphic manipulations........................................................................16
Table 2: Number of subjects who recognized the stimuli..............................................................17

LIST OF FIGURES

Figure 1: Recognition of Secondary Stimuli.................................................................................18
Figure 2: Total Number of Correct Secondary Stimuli.................................................................19
Figure 3: Recognized Screen Location for Stimuli........................................................................19
Figure 4: Television Weather Icons...............................................................................................26
Figure 5: Quadrants of the Television Screen................................................................................26

HIGH CONTRAST IN LOW-LEVEL VISION
Intuitively, many people believe they are aware of all the information available in their
surroundings. However, that may not be correct. For example, drivers who are concentrating on
turning tend not to notice other drivers (Simons, 2000), a phenomenon known as inattentional
blindness (Beanland & Pammer, 2011). Similarly, individuals engaged in group communication
may miss important nonverbal cues or television viewers may miss graphic information because
they are focused on live action on the screen. Because much of communication relies on
visual information, understanding the process by which people select information from the
environment is essential to understanding communication in general.
It has been repeatedly shown in laboratory experiments that people can miss dramatically
large objects in their visual field, whether those objects are a woman carrying an umbrella [see
Neisser (1979) for a description of several different versions] or even a gorilla (Simons &
Chabris, 1999). In the classic Simons and Chabris (1999) study, subjects watched a video in which a
group of six people, half wearing white and half wearing black, were passing a basketball.
Subjects were instructed to count the number of passes made by either the black or white team,
or by both teams. At 45 s into the video, a woman dressed in a gorilla costume walked across the
screen. Of the 196 subjects, 90 (46%) failed to see the gorilla.
Do findings on visual attention from cognitive science laboratory studies transfer to the
real world, particularly that of media? This study tests the applicability of a subset of laboratory
findings on visual attention to a real-world media viewing situation. Specifically, this project will
look at the competing hypotheses about the role of motion in attentional capture. The attention
literature suggests that attention will switch from one stimulus to another when the second
stimulus is either new to the environment or “odd” to the environment. This paper reports on a
critical test among the three competing hypotheses to better understand how to capture
attention in a realistic television viewing setting.
Attention
Attention has been defined as the choice to pursue one task over another (Duncan, 1999).
In the case of visual attention, the task is looking. Attention to a stimulus is a prerequisite for any
type of processing; if a person does not direct their attention to a stimulus, they cannot process it.
There are three attentional networks that assist working memory in selecting external data for
processing and two processes by which attention is captured. In this section, I will discuss the
types of attentional networks and the two attentional capture processes.
Three attention networks
There are three attention networks: executive, alerting, and orienting. Executive attention
controls and manages conflicts in systems. It involves “planning or decision making, error
detection, new or not well-learned responses, conditions judged to be difficult or dangerous,
regulation of thought and feelings, and the overcoming of habitual actions” (Mezzacappa, 2004;
Raz & Buhle, 2006, p. 374). Orienting attention refers to “the ability to select specific
information from among multiple sensory stimuli (sometimes known as scanning or selection)”
(Raz & Buhle, 2006, p. 372). Thus, orienting attention is focused on stimuli external to the
individual and executive attention is focused on stimuli that are internal to the individual. Both
types of attention are largely volitional. For example, an individual might focus selective
attention on her mother, ignoring all the other people in a crowded mall. Alerting attention is
sustained vigilance to the surrounding environment (Posner, 2006), that is, “the ability to increase and
maintain response readiness in preparation for an impending stimulus” (Raz & Buhle, 2006, p.
371). Alerting attention can be found in children as young as 3 months of age (Mezzacappa,
2004). This attention serves an evolutionary purpose by drawing attention to peripheral events
quickly and automatically (Yantis, 1998; & Barton, 2005).
Orienting and alerting differ in how they operate. Orienting attention concerns spatial
precision (Fernandez-Duque & Posner, 1996), allowing an individual to focus on sensory stimuli
at a specific physical location. Alerting attention, however, pertains to “a signal to noise ratio
over the visual field” (Fernandez-Duque & Posner, 1996, p. 477). Alerting attention is not
spatially located, but constantly monitors the environment for sensory stimuli that stand out from
average stimuli. Both orienting and alerting attention are implicated in exogenous (bottom-up)
attention. Exogenous attention shift is a reflexive mechanism in which attention is automatically
drawn toward a stimulus (as contrasted with endogenous or ‘top-down’ shift in which the
individual chooses to shift attention). It is very rapid, peaking at approximately 100–120 ms and
then decaying quickly (Barbot, Landy, & Carrasco, 2012). The process represents a shift of
orienting attention from a primary stimulus to a secondary stimulus via alerting attention. Initially,
orienting attention focuses on some existing stimulus (e.g., the action happening on a television
screen). When a friend enters the room, alerting attention identifies the friend as a new and
important stimulus because the friend stands out from the environment due to increased
magnitude of certain sensory cues (e.g., motion, sound, etc.). Orienting attention then focuses
processing resources on the friend as the new stimulus.
Exogenous attention shift
The exogenous (bottom-up) attention shift process is primarily associated with low-level
vision (Walther et al., 2004), a type of preattentive visual processing in which a person can
identify characteristics of a target (e.g., color, motion), but not an integrated image (Healey &
Enns, 2012). Low-level vision represents the initial exposure of light to the eye’s two types of
photoreceptor cells: cones, which sense color (red, blue, and green), and rods, which sense
luminance and form (Livingstone et al., 1988). Electrical signals transduced by the rods are
processed through the magnocellular path, or M-path, while signals from the cones are processed
through the parvocellular path, or P-path (Livingstone et al., 1988). These diverging paths serve
different purposes. Livingstone and colleagues (1988) explain, “while the magno system is
sensitive primarily to the moving objects and carries information about the overall organization
of the visual world, the parvo system seems to be important for analyzing the scene in much
greater and more leisurely detail” (p. 240). Because the bottom-up approach is a rapid initial
impression, signals are primarily processed via the M-path. Therefore, not all features of the
stimulus are captured in a bottom-up approach in low-level vision (Livingstone et al., 1988).
How does alerting attention identify a stimulus for exogenous attention shift? As
individuals are exposed to an image, they extract low-level vision attributes of the various
stimuli as well as the extent to which those stimuli contrast with the other objects and the
environment (Koch & Ullman, 1985). The information represented by the various stimulus
characteristics is used to form a saliency map (Treisman & Gelade, 1980). A saliency map is
defined as a two-dimensional map in which each area is represented on a contrast, gradient scale
(Treue, 2003). Coordinates existing within a saliency map compete for the highest contrast in
view, eliciting a winner-take-all (WTA) network (Walther et al., 2004). WTA means that objects
highest in contrast with their environment will attract alerting attention and, in turn, trigger the
bottom-up attention shift. The WTA effect is greatest when the desired target is in the highest
contrast with the environment (Koch & Ullman, 1985). There are stimuli that can fall into the
visual field but fail to alert attention, producing what is known as inattentional blindness (Simons,
2000). Itti, Koch, and Niebur (1998) argue that a similar contrast mechanism was part of the
early primate visual system for processing visual stimuli at swift rates, allowing early primates to
attend to critical stimuli to the exclusion of benign stimuli.
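The saliency map and winner-take-all selection described above can be illustrated with a minimal sketch. It is not the model of Itti, Koch, and Niebur (1998) or any other published implementation: contrast is approximated as each location's absolute difference from the mean of its feature map, the feature maps are summed with equal weight, and the 3 x 3 scene values and all function names are hypothetical choices made only for illustration.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]


def contrast_map(feature: Grid) -> Grid:
    """Contrast of each location with the average of the whole feature map
    (a simplifying assumption standing in for local center-surround contrast)."""
    values = [v for row in feature for v in row]
    mean = sum(values) / len(values)
    return [[abs(v - mean) for v in row] for row in feature]


def saliency_map(features: Dict[str, Grid]) -> Grid:
    """Sum the per-feature contrast maps into one two-dimensional saliency map."""
    maps = [contrast_map(f) for f in features.values()]
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) for c in range(cols)] for r in range(rows)]


def winner_take_all(saliency: Grid) -> Tuple[int, int]:
    """Return the (row, column) of the single most salient location."""
    locations = (
        (r, c) for r in range(len(saliency)) for c in range(len(saliency[0]))
    )
    return max(locations, key=lambda rc: saliency[rc[0]][rc[1]])


# Hypothetical 3 x 3 scene: one location differs in luminance and in motion.
features = {
    "luminance": [[0.2, 0.2, 0.2], [0.2, 0.2, 0.9], [0.2, 0.2, 0.2]],
    "color": [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]],
    "motion": [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]],
}
winner = winner_take_all(saliency_map(features))
print(winner)  # (1, 2): the location that contrasts most with its surroundings
```

In this toy scene the uniform color map contributes nothing to saliency, so the location that differs in both luminance and motion wins the competition, which is the sense in which objects highest in contrast with their environment are expected to attract alerting attention.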
Stimulus characteristics that comprise the saliency map include color, luminance, size,
and motion (Itti & Koch, 2001; Itti, Koch, & Niebur, 1998; Treisman & Gelade, 1980), as well
as a variety of less important characteristics including “length, closure, size, curvature, density,
number, hue, luminance, intersections, terminators, 3D depth, flicker, and lighting direction”
(Healey & Enns, 2012, p. 3). Because stimulus characteristics are often processed in parallel,
humans can see several of these basic features simultaneously (Treisman & Gelade, 1980). These
components have received considerable attention in the literature (Giesbrecht, Bischof, &
Kingstone, 2004; Healey & Enns, 2012; Most et al., 2001; Treisman & Gelade, 1980; Wolfe,
1998; Xuan et al., 2007).
Attentional Capture
An attentional shift that is involuntary is known as attentional capture. “Explicit
attentional capture occurs when a salient and unattended stimulus draws attention, leading to
awareness of its presence” (Simons, 2000, p. 147). Capture is triggered when an unattended
stimulus in the environment is able to overcome all other stimuli in the saliency map. However,
simply outcompeting other stimuli in the WTA network is not necessarily adequate to activate
capture. Instead, the unattended stimulus must also overcome attention to the attended stimulus.
As Simons explains, “when attention is engaged, the likelihood of capture is reduced” (Simons,
2000, p. 153). In the case of Simons and Chabris’s (1999) gorilla, the presence of the gorilla was not
adequate to overcome the attention to the ball for a subset of the subjects. Further, subjects
given the more demanding attentional task of counting the number of passes by both teams
were less likely to spot the gorilla (45%) than subjects counting only the number of passes of
one team (64%).
A stimulus must ‘win’ a person’s attention to evoke the WTA effect. It has been shown
that a person is drawn to look at one stimulus over other stimuli based on specific, visual features
(Pessoa, 2005). Pessoa (2005) describes the human visual cortex as specific regions that respond
in simple ways to visual stimuli. The use of patterns is one example of how to evoke a response
using a stimulus (Pessoa, 2005). Commonly recognized characteristics of stimuli include
shading, color, size, and movement. These features (1) are regarded as motivating factors of a
stimulus, (2) are considered the driving force behind focused attention to one stimulus over
another, and (3) are needed to make a target stimulus the “winner” (Wang et al., 2011).
Color and hue. The results of Koivisto and colleagues (2004), as well as those of Most, Simons,
Scholl, Jimenez, Clifford, and Chabris (2001), suggest that “bottom-up properties of the stimulus,
such as color, contribute to the likelihood of detecting an unexpected stimulus under inattention”
(p. 3220). Most and colleagues (2001) were interested in examining how the color of an unexpected
object affected inattentional blindness in a selective looking task. They found that, even though
inattentional blindness was not completely removed, the rate of inattentional blindness to the
unexpected object fell from 50% to 28% (Most et al.,
2001). Furthermore, if the unexpected target has the same color as the current target,
then the unexpected target is less likely to be identified (Koivisto et al., 2004). Attractors of the
non-target stimulus are most effective when their chromatic distance or hue separation is
increased (D’Zmura, 1990; Nagy & Sanchez, 1990). When the target stimulus is yellow, the
response may not be as quick when a new orange stimulus is introduced as when a dark blue
stimulus is introduced, because blue is in higher contrast with the yellow target and environment
than orange. Furthermore, D’Zmura’s (1990) research on color suggests that color attractors are
not just “red-green, yellow-blue and black-white”, but the human brain can see contrasts between
other intermediate hues too (p. 951). Nagy and Sanchez (1990) argue that the discrepancy in high
and low contrast in color suggests that short- and long-wavelength cones appear to be
independent of each other.
The color red may possess additional alerting properties. “Empirical work has begun to
emerge showing that exposure to the color red has motivational, as well as symbolic,
implications for human perceivers” (Meier et al., 2012). Animals inherently recognize the color
red as signifying danger, which in turn, evokes an aversive response (Elliot & Maier, 2007;
Meier et al., 2012). Because many potential threats in nature, such as blood or fire, are red,
mammals may have evolved an aversive response to the color. Further, Elliot and Maier (2007)
argue that the recognition of the color red happens mainly outside of our consciousness; it may
be that reactions to red are instinctive and automatic—a characteristic of alerting attention and
bottom-up attention shift.
Luminance. Along with color, luminance is an important attribute of a stimulus. It
is due to the contrast of luminance between the stimulus and the environment that a person is
able to see the stimulus (Treisman & Gelade, 1980). As mentioned under WTA, the higher the
contrast between the stimulus and the environment, the quicker the response time will be
to the stimulus. Luminance itself allows the viewer to see other parts of the stimulus including
color, size, and movement (Cavanagh & Favreau, 1985; Derrington & Badcock, 1985).
Even color varies by luminance when it is on a dark background versus a light
background (Meier et al., 2012). Giesbrecht, Bischof, and Kingstone (2004) have shown that
when subjects viewed stimuli under dark and light conditions, perceived understanding of the
image was altered. Early visual responses can be greatly affected by dark and light conditions of
viewing that affect low-level vision (Giesbrecht, Bischof, & Kingstone, 2004).
Size. By evolutionary design, an object’s size provides important visual cues about how
much potential threat is present. Because objects that are distant appear smaller than objects that
are near, distant objects are perceived as less of a threat and immediate danger (Ashbridge et al.,
2000). Rapid recognition of relative size suggests that size is initially processed in low-level
vision, as a function of alerting attention (Ashbridge et al., 2000). In visual searches, “taller,
shorter, denser, and sparser pixels can easily be identified” (Healey & Enns, 1999, p. 165).
Additionally, subjects are able to locate a long line among a group of short lines faster than they
can locate medium length lines (Healey & Enns, 2012).
Change in stimulus speed and direction. Although not specifically named by Itti and
Koch (2001) as a key low-level vision feature, Wolfe (1998) argues that a key stimulus
characteristic driving attentional shift is the contrast of motion, defined as a stimulus’ change in
speed and direction. In one experiment, Wolfe (1998) had participants conduct a visual search
and found that objects contrasting in motion were more quickly recognized than those not
contrasting in motion. He concluded that a contrast in motion is one of the strongest
attention-getting features a stimulus can have.
Motion. According to Abrams and Christ (2003), “the onset of motion captures attention
in a bottom-up, stimulus-driven manner” (p. 429). In a series of three experiments, participants
were found to be more likely to identify target letters among distractors when the targets had
changed from static to moving, as compared to continuously moving targets (Abrams & Christ,
2003). The onset of motion can provide a “substantial additional benefit” for capturing attention
because the onset of motion is a cue that the stimulus may be alive (Abrams & Christ, 2005). In
the time since this initial finding, a number of competing hypotheses have emerged to explain the
phenomenon of attention capture due to motion (Franconeri & Simons, 2003).
New object hypothesis. In a 2008 article, Christ and Abrams argued that it was not the
onset of motion that captures attention, but the onset of a new object. The “new object hypothesis”
states that “new objects have a larger impact on the allocation of attention than new motion”
(Christ & Abrams, 2008, p. 1). In previous research, Abrams and Christ (2003) used a visual
search task in which participants were told to identify the location of letters on a display with
targets and distractors. Some of the letters were moving and others were stationary (e.g., Abrams
& Christ, 2003; Abrams & Christ, 2006; Christ & Abrams, 2008). In their studies, a blank area in
the visual field that subsequently received a letter was considered a “new object” (Abrams & Christ,
2003). The researchers found that subjects were faster to identify the location of a target letter if
it occupied a previously empty space than if it was simply moving prior to the change (Abrams &
Christ, 2006).
Delayed-signal hypothesis. Another hypothesis posits a temporal component for the onset
of motion. The delayed-signal hypothesis predicts that feature changes will be more effective
when the change is cued in advance of the display transition (Horstmann, 2002). In one
experiment (Horstmann, 2002), subjects were shown twelve small squares arrayed in a circle (as
in the face of a clock). After 500 ms, a letter appeared in each of the 12 squares and stayed there
for 53 ms before the squares returned to their original color. Subjects were told to identify the location of the
letter “U” in the array. All squares were the same color in the conjunction condition, while in the
surprise condition, the square that would contain the target letter was a different color than the
rest. Subjects were significantly more likely to identify the correct location in the surprise
condition than in the conjunction condition. These findings help to support the delayed-signal
hypothesis.
Unique event hypothesis. The delayed-signal hypothesis was made even more precise by
the introduction of the unique event hypothesis. The unique event hypothesis argues that
attention capture will be stronger if feature change occurs just slightly before, or even just after,
the display transition. In one study using the same paradigm as Horstmann (2002), Muhlenen,
Rempel, and Enns (2005) manipulated the temporal placement of the color change in four
conditions: 1000 ms prior to the transition, 150 ms prior to the transition, simultaneous with the
transition (0 ms), and 150 ms after the transition. They found that response time was significantly
faster in the conditions in which color change occurred 150 ms prior to or after the transition.
They obtained the same results when the unique event signal was motion rather than color
change. Muhlenen et al. (2005) argue that these results uniquely support the unique event
hypothesis’ argument that the visual system is sensitively tuned to change in a number of
dimensions. They argue that the new object hypothesis (onset) is not supported because that
hypothesis does not explain the effectiveness of a brief preview or delay on capture. Furthermore,
they argue that the delayed-signal hypothesis is not supported because capture was slow in the
1000 ms condition (delayed-signal) and because the delayed-signal hypothesis does not account
for capture when color or motion occurs after the transition.
Behavioral urgency hypothesis. The behavioral urgency hypothesis states that attention is
drawn to objects in the visual field that have features that suggest threat and may therefore
require the viewer to respond (Kawahara, Yanase, & Kitazaki, 2012). Kawahara et al. (2012)
tested the behavioral urgency hypothesis using top-down and bottom-up attentional capture
approaches. They were interested in whether top-down or bottom-up processes controlled
attention to task-irrelevant stimuli (movement of objects outside of the central task). In five
optic-flow experiments, subjects sought out a target among peripheral distractors
(Kawahara, Yanase, & Kitazaki, 2012). If top-down processes controlled attention, then peripheral
distractors should not have had an effect. However, the researchers found an effect:
attention shifted from top-down to bottom-up when motion in the periphery
started or stopped, and when the motion was expanding (engaged bottom-up) but not when it
was contracting (did not engage bottom-up). Expansion suggested that the dots were approaching
the subject, and contraction suggested that the dots were moving away from the viewer. However,
attention did not shift to bottom-up when the motion changed speed (slowed down or sped up). The
researchers concluded that qualitative change mattered (the onset or offset of motion),
but quantitative change did not (the speed). These results support the behavioral urgency hypothesis
(Kawahara, Yanase, & Kitazaki, 2012).
In other feature search task studies supporting the behavioral urgency hypothesis, four
dynamic events were found to induce high priority: abruptly appearing objects, sudden motion,
looming, and “concurrent changes in luminance contrast and contrast polarity” (Franconeri &
Simons, 2003; Franconeri, Simons, & Junge, 2004; Jonides & Yantis, 1988). The researchers
found that all the feature changes, except receding and color, captured attention, lending support
to the behavioral urgency hypothesis.
In three experiments using visual search tasks, Franconeri and Simons (2003) found that the onset of
motion (new object hypothesis), jitter motion (unique event hypothesis), and looming motion
(behavioral urgency hypothesis) all were significant in capturing attention.
This raises the question: how do these hypotheses work in a natural viewing
setting when viewers are not searching for the stimuli? In an experiment, Bergen, Grimes, and
Potter (2005) showed that when viewers watched television with two competing visual focuses,
such as a crawl and video on the same screen, they were less likely to retain the information than if
there was just one visual focus. This study indicated that motion can be a viable way of
capturing attention; however, the study did not test what kinds of motion are more likely to
attract attention.
Movement is often seen during severe weather events, as a crawl on the screen or in other
formats, alerting viewers to the oncoming danger (Federal Communications Commission,
2007). During a severe weather watch or warning, the government requires official alerts,
including graphics, to be immediately broadcast for the maximum safety effect (Federal
Communications Commission, 2007). Carter (1996) describes five different types of animation
or moving weather graphics used in weather forecasts: point symbols (like raining clouds), line
symbols (usually showing flow patterns), raster display sequences (radar loops), 3D clouds (used
to show height of the weather system), and areal expansion and contraction (cold or warm air
changes). Point symbols are commonly used during broadcasts. The
motion hypotheses can help to identify the most effective motion features among weather
graphics, which can in turn be used to attract attention.
Hypotheses
From the above hypotheses and studies, along with several others, the new object, unique-event,
and behavioral urgency hypotheses are commonly supported, whereas the delayed-signal hypothesis has
inconclusive support. With this understanding, it is proposed that there should be a test, in a
natural-viewing setting, among the former three hypotheses. This paper compares the three
conflicting accounts: new object hypothesis, unique-event hypothesis, and behavioral urgency
hypothesis. Each of these hypotheses holds that the type of motion it emphasizes is superior, and
that the others are less effective, in capturing visual attention.
New object hypothesis
This hypothesis predicts that the onset of motion encompasses all types of motion.
H1: The onset of motion is sufficient to capture attention, regardless of other
characteristics.
H1a: There will be no difference in attention capture between onset and looming
conditions.
H1b: There will be no difference in attention capture in onset and jittering
conditions.
Unique event hypothesis
This hypothesis predicts that unique motions in objects, like jittering, will be superior to
just a new object in capturing attention. Under this hypothesis, a new object is not as effective
as a new object that contrasts with the environment when trying to gain attention because viewers
seek out the “unusual.”
H2: Jittering objects will be more likely to induce attentional capture than the onset of a
new object.
This hypothesis directly conflicts with H1b.
Behavioral urgency hypothesis
This hypothesis predicts that threatening motion, like looming, will be more effective at
capturing attention than new objects or unique motion.
H3: Looming objects will be more likely to induce attentional capture than the onset of a
new object.
This hypothesis directly conflicts with H1a.
H4: Looming objects will be more likely to induce attentional capture than jittering
objects.
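The logical relations among H1a through H4 can be made explicit with a short sketch. It is an illustration only and not the analysis used in this study: the capture rates are invented numbers, "no difference" is approximated by a fixed tolerance rather than a statistical test, and the function name and tolerance value are hypothetical.

```python
from typing import Dict


def consistent_predictions(
    rates: Dict[str, float], tolerance: float = 0.05
) -> Dict[str, bool]:
    """Report which predictions match a set of capture rates.

    `rates` maps each condition to the proportion of viewers who noticed the
    graphic. The tolerance is an illustrative stand-in for a significance test.
    """
    onset, jitter, loom = rates["onset"], rates["jittering"], rates["looming"]

    def same(a: float, b: float) -> bool:
        return abs(a - b) <= tolerance

    def greater(a: float, b: float) -> bool:
        return a - b > tolerance

    return {
        "H1a": same(onset, loom),      # onset and looming do not differ
        "H1b": same(onset, jitter),    # onset and jittering do not differ
        "H2": greater(jitter, onset),  # jittering beats onset
        "H3": greater(loom, onset),    # looming beats onset
        "H4": greater(loom, jitter),   # looming beats jittering
    }


# Made-up rates in which all three kinds of motion capture attention about equally.
example = consistent_predictions(
    {"onset": 0.40, "jittering": 0.42, "looming": 0.41}
)
print(example)
```

With these invented rates only H1a and H1b hold. Because H1b and H2 (and likewise H1a and H3) are defined over the same pair of conditions with opposite expectations, no set of rates can satisfy both members of either pair, which is what makes the design a critical test.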
Method
Participants
Forty-four undergraduate students (28 males, 16 females) from Michigan State
University participated in the study. All subjects had normal vision or vision corrected to normal.
Subjects were recruited through the Communication Department’s subject pool through
Experimetrix and participated for course credit. All participants were briefed on their rights as
research subjects and signed informed consent approved by the Michigan State Institutional
Review Board in advance. The students varied in class level: 25% Freshmen (n = 11), 16%
Sophomores (n = 7), 36% Juniors (n = 16), and 23% Seniors (n = 10), and were from a variety of
colleges including Arts and Letters (n = 0; 0%), Business (n = 8; 18%), Communication (n =
23; 52%), Education (n = 2; 5%), Social Science (n = 2; 5%), Natural Science (n = 4; 9%),
Undecided (n = 3; 7%), and None of the above (n = 2; 5%).
Procedure
Subjects were briefed on the study, told of their rights as research subjects and read and
signed informed consent forms. Subjects were then randomly assigned to watch one of two
equivalent videos of a meteorological news report (see description below) in a group setting.
There were 3-19 subjects simultaneously viewing in each session. Subjects were told, “This
study is to understand how people learn information from educational science videos. After
watching this brief video, you will be asked to answer a set of questions about the video.” After
viewing, subjects were tested on recognition memory of graphics that were presented briefly
during the video in three ways: onset, looming, and jittering. The subjects also answered
questions on their opinions of the video and some demographic questions. After finishing the
instrument, subjects were thanked, given an opportunity to ask questions about the study, and
instructed to not discuss the study with anyone for two weeks (end of data collection period).
Materials
Manipulation. A segment of an ABC News national meteorological broadcast on
extreme storms was downloaded from the Internet. The video was approximately 3 minutes in
duration. Using a professional video guaranteed that the stimulus was of high quality and
maintained the cover story of the study being about educational science videos. Because the
study was about drawing visual attention in a real-life situation, the cover story was important so
that the subjects were not primed to search for graphics. Two black and white weather graphics
(see Appendix) were inserted in the lower corners of the video screen, as is commonly found
with television program graphics. Black and white graphics were used to eliminate potential
confounds caused by attentional capture due to color. The graphics were presented in one of
three ways consistent with the hypotheses: onset, looming, or jittering (see Table 1). The manner
of presentation was matched to manipulations in experiments described above. Static (onset) graphics
simply appeared, stayed on screen for 2 s, and then disappeared. Looming graphics zoomed in for
2 s and then disappeared from the screen. Jittering graphics oscillated over a
small spatial distance (approximately 5 degrees) for 2 s and then disappeared from the screen.
The stimulus manipulations occurred at 60s, 105s, and 160s into the video. In order to control for
spatial effects, two videos were created with location of graphics presentation counterbalanced
(see Table 1). Temporal order was not changed because none of the hypotheses supported a
cueing effect.

Table 1
Counterbalanced graphic manipulations

                 Screen Location
Stimulus Video   Lower Left/Right Corner   Lower Left/Right Corner   Lower Left/Right Corner
Video 1          (Static/Looming)          (Jittering/Looming)       (Static/Jittering)
Video 2          (Looming/Static)          (Looming/Jittering)       (Jittering/Static)


Outcome measure. After viewing the video, participants were immediately given a
simple questionnaire consisting of nine questions and three demographic measures (see
Appendix). Of specific interest for the hypotheses was question five, which tested recognition
of graphic elements. In addition to the six graphics used in the video, nine distractor graphics
were included to test for false recognition. Additional questions tested their general recall of
information from the video (numbers 1-4), their recall of the location of the graphics (number 6),
and their general rating of the quality of the program (numbers 7-9). Questions 1-4 and 7-9 were
used to maintain the cover story and provide information on how well subjects attended to and
remembered the video. Finally, three demographic questions were asked for sex, year in the
university (freshman-senior), and college that their major is in.
Results
Binomial tests were used to analyze the results of the three manipulations. Only data
from viewers who saw a single icon in each manipulation were analyzed. Significance was assessed at
the p < .05 level. Hypothesis 1a predicted that there would be no difference in attentional capture
between static objects and looming objects, while H3 predicted that looming objects would
capture attention more effectively than static objects. The results showed that static objects (n =
7) captured attention more often than looming objects (n = 5), but the difference between the two
motion types was not statistically significant, p = .77. Therefore, H1a was supported and H3
was rejected (see Table 2).
Hypothesis 2 predicted that jittering objects would capture attention more effectively than
static objects, while H1b predicted there would be no difference. The results showed that jittering
objects (n = 2) captured attention more frequently than static objects (n = 1), but the difference
between the two motion types was not statistically significant, p = 1.0. Therefore, H2 was not
supported, but H1b was supported (see Table 2).
Hypothesis 4 predicted that looming objects would capture attention more effectively
than jittering objects. The results showed that jittering objects (n = 13) captured attention more
frequently than looming objects (n = 2). The difference between the two motion types was
statistically significant, p = .007, and opposite the prediction of the behavioral urgency
hypothesis. Therefore, H4 was rejected and it was concluded that the data support jittering
objects as a more effective way to capture attention than looming objects (see Table 2).
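The binomial tests reported above can be illustrated with a short sketch. It is not the analysis script used for this study. It assumes an exact two-sided test against an even split (null probability of .5), which the text does not state explicitly, and the function name binomial_two_sided_p and the pairings dictionary are hypothetical names chosen for the illustration. The counts are the "One" recognition counts from Table 2.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p value for k successes in n trials.

    Sums the probability of every outcome that is no more likely than
    the observed one under the null success probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))


# Assumed layout: (count for first motion type, count for second) among
# subjects who recognized exactly one graphic in the pairing (Table 2).
pairings = {
    "static vs. looming": (7, 5),
    "jittering vs. static": (2, 1),
    "jittering vs. looming": (13, 2),
}

for label, (a, b) in pairings.items():
    print(f"{label}: n = {a + b}, p = {binomial_two_sided_p(a, a + b):.3f}")
```

Under these assumptions the three p values round to .77, 1.0, and .007, in line with the values reported for H1a/H3, H1b/H2, and H4.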
Table 2
Number of subjects who recognized the stimuli

                       Recognition (n = 44)
Stimuli Type           None    One       Both
(Looming/Static)       30      (5/7)     2
(Jittering/Static)     37      (2/1)     4
(Looming/Jittering)    26      (2/13)    3

Note. One recognition describes the subjects who only saw one object in each manipulation. Each subject had the
potential to see anywhere from none to all of the stimuli.

The within subject design indicates that, including all of the manipulations, nine people
(20.5%) saw none of the stimuli, eight people (18.2%) saw only incorrect stimuli, 20 people
(45.5%) saw partially correct stimuli, and seven people (15.9%) saw only correct stimuli (see
Figure 1). Counting only the correct stimuli, 17 viewers (38.6%) saw no correct stimuli,
14 viewers (31.8%) saw one correct stimulus, eight viewers (18.2%) saw two correct stimuli,
three viewers (6.8%) saw three correct stimuli, two viewers (4.5%) saw four correct stimuli, and
no viewers saw five or more correct stimuli (see Figure 2).
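Each share in this paragraph is a count divided by the 44 participants. The following is a minimal sketch of that arithmetic, using the counts plotted in Figures 1 and 2. The variable and function names are hypothetical and are not part of the study materials.

```python
N_SUBJECTS = 44  # total participants

# Counts as plotted in Figure 1 (response category) and Figure 2
# (number of correct stimuli recognized).
by_category = {
    "none": 9,
    "only incorrect": 8,
    "partially correct": 20,
    "only correct": 7,
}
by_correct = {0: 17, 1: 14, 2: 8, 3: 3, 4: 2, 5: 0, 6: 0}


def percent_shares(counts, total=N_SUBJECTS):
    """Return each count as a percentage of total, to one decimal place."""
    return {key: round(100 * value / total, 1) for key, value in counts.items()}


for name, counts in (("category", by_category), ("correct", by_correct)):
    # Each breakdown should account for every participant exactly once.
    assert sum(counts.values()) == N_SUBJECTS
    print(name, percent_shares(counts))
```

The two breakdowns are consistent with each other: the 17 viewers with no correct stimuli are the nine who selected nothing plus the eight who selected only incorrect graphics.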
The count of the areas in which viewers saw objects/stimuli on the screen showed that
sixteen viewers reported seeing the objects in the upper-right-hand corner (see Figure 3). Despite
exclusive placement of the stimuli in the lower-left-hand and lower-right-hand quadrants, subjects
were no more likely to recall seeing the graphics in the correct quadrants than in the incorrect
quadrants (p < .05).
[Figure 1 (bar chart). Vertical axis: total number of subjects. Bars: no objects selected, 9; only wrong answers, 8; partially correct answers, 20; only correct answers, 7.]

Figure 1: Recognition of Secondary Stimuli. This figure illustrates number of subjects who
identified the stimuli correctly.

[Figure 2 (bar chart). Horizontal axis: number of correct stimuli recognized (0 to 6). Number of subjects: 17 (0 correct), 14 (1), 8 (2), 3 (3), 2 (4), 0 (5), 0 (6).]

Figure 2: Total Number of Correct Secondary Stimuli. This figure illustrates number of correct
stimuli identified by the subjects.

         Left   Center   Right
Upper    10     4        16
Center   6      10       6
Lower    9      7        13

Figure 3: Recognized Screen Location for Stimuli. This figure illustrates
the count of identified screen locations for the stimuli.

Discussion
The results of this study failed to replicate some experimental findings in real world
conditions. While the basic new object hypotheses were supported, there was no support for
more nuanced versions (unique event and behavioral urgency). There are a number of potential
explanations for this; most directly that bench cognitive science findings for visual attention do
not easily translate to real world mass communication experiences. This conclusion signals
caution to media designers who assume that bench findings readily transfer to media production
choices. This is a particularly important caution as media researchers begin to move away from
behavioral science approaches and into cognitive science.
There are other possible interpretations of the findings. First, it is important to note the
fundamental differences between tasks from the original experiments and this experiment. Prior
experiments were concerned with attentional capture in a visual search situation, whereas this
study was simply interested in attentional capture in a media viewing situation. It may be that the
differences in findings are a function of the visual search task that viewers in the present study
did not have. That is, attention capture may be stronger during search than it is during everyday
experience, including media viewing. Further, visual attention may be sensitive to more nuanced
processes under a focused search task. This makes sense in that a search task likely activates
greater alerting attention than is found in day-to-day tasks.
Further, outcome measures are different between the lab search tasks and the recognition
tasks. During search tasks, the subject signals as soon as identification occurs, which means that
every target must be seen. The recognition task requires that the attention of non-searching
subjects be captured by the target and that the subject is able to subsequently recognize the target
stimuli on the post-viewing instrument. This may simply be a more difficult cognitive task.
However, this is not to say that results from bench experiments cannot transfer to real world
experience; Simons and Chabris (1999) demonstrated that some visual attention findings do extend
to real world situations.
In Simons and Chabris’s (1999) study, in which the object detection stimulus video was recorded
in a natural setting, the likelihood of detecting an object depended on the
surrounding environment and the task at hand. As previously stated, the study showed
several people passing a ball and asked the viewer to count the passes (Simons & Chabris, 1999).
It then had a stimulus, a person in a gorilla suit, walk through the ball passing area (Simons &
Chabris, 1999). Their study suggested that, as in laboratory studies, viewers are drawn to stimuli or
ignore stimuli based on the stimuli’s features (Simons & Chabris, 1999). The current study uses
one feature of a stimulus, motion, to pick apart which types of motion are more effective at
gaining attention.
Close consideration of the subject population suggests an interesting potential
generational confound. Unlike subjects in the Simons and Chabris (1999) study, the subjects in the
present study learned to use media during a time when television and video games commonly
display multiple simultaneous information streams. For example, CNN often provides 3-4
different streams of information to the viewer at one time including several on the periphery,
whereas traditional news contained only 1-2 information streams (a single news reader and a
graphic). A typical video game contains progress updates on the edges of the screen in addition
to the main action. It may be that subjects from this generational cohort have learned to ignore
information in the periphery in order to focus attentional resources on the main program. In the
context of this study, it is unlikely that any real threat information would be relayed on
peripheral graphics. A follow-up study with older populations or with subjects from single
information stream cultures would help clear up the extent to which the subjects in the present
study have simply learned to ignore certain streams of information.
During the data analysis phase, a possible confound was detected. One of the weather
graphics that was used may have been too closely associated semantically with the main video
story. That is, the video story included a section on the negative outcomes of an extreme rain
storm and one of the graphics represented a rain storm. Is it possible that subjects chose this
graphic primarily because of its semantic association with the story content? This may have
occurred in the jittering (16 viewers) versus looming (five viewers) condition because the
jittering graphic was a cloud with raindrops. Furthermore, recognition of the
jittering partly-sunny icon was drastically lower, with only six viewers. From these
findings, it is possible that many participants assumed they saw objects that fit most closely with
the topic of the video playing as the primary stimulus. However, a significant portion of the story
was about a snow storm in Vermont and two of the distractor graphic elements on the recall test
were for a snow storm. These graphics were less likely to be selected than the jittering rain storm
graphic, suggesting that jittering, when competing against static motion, can have a greater
attentional capture effect.
A similar phenomenon was found with stimuli screen location. Sixteen viewers, a better
than chance amount, incorrectly indicated that they viewed the stimuli in the upper-right-hand
corner of the screen (see Figure 3). This is concerning. A review of the stimulus video for
possible confounds suggests that nothing in the upper-right-hand corner of the video accounts for
this result; rather, there appears to be a natural bias for icon spatial location in television news.
There seems to be a bias for screen corners and the center. These locations also happen to be
popular places for reporting information on national and local networks. This seems to point to a
general bias and not actual icon recognition. Future studies could explore different
spatial locations to see if there is a similar effect.
Although the new object hypotheses were supported, an overwhelming number of
subjects in all manipulations missed the objects (see Table 2). This raises the question: why did
most subjects not see the stimuli? Perhaps future research should look at increasing the length of
time each graphic remains on screen. Manipulations in color or size may also increase the effects
of the object’s motion. Another manipulation could be a direct auditory reference to the
objects. With the current research, there is still much unknown about how motion affects
attentional capture in a natural-viewing setting.
Conclusion
This paper tested conflicting hypotheses from the motion and visual attention literature,
applying the findings to a real-world media situation. This was accomplished by testing different
types of motion, supported by different hypotheses, in a natural television-viewing setting. The
findings of this study suggest that more research is needed on visual motion that is unrelated
to the viewer’s task.

APPENDIX

Science Video Questionnaire
We have a number of questions we would like you to answer about the video you just saw.
Answer all the questions to the best of your ability.
1. What was the news source (circle)?
a. Weather Channel          e. CBS News
b. NBC News                 f. ABC News
c. CNN                      g. National Weather Service
d. Fox News Network         h. PBS News

2. What was the main topic of the story?

3. Where did the flooding occur?

4. In what state was the snow storm located?

5. Did you see any of the following graphics (circle any you saw)?

Figure 4: Television Weather Icons.

6. What part of the screen did you see the graphics (circle all that apply)?

[Figure 5 (diagram): the television screen divided into nine numbered regions, 1 through 9.]

Figure 5: Quadrants of the Television Screen.

7. How would you rate your interest in the main story?
1-------2-------3-------4-------5-------6-------7-------8-------9-------10
8. How would you rate your prior understanding of the topic of the main story?
1-------2-------3-------4-------5-------6-------7-------8-------9-------10
9. How would you rate your understanding after watching the video?
1-------2-------3-------4-------5-------6-------7-------8-------9-------10
Demographics
Sex      _______ Female    _______ Male

Year     _____ Freshman        College  _____ Arts & Letters
         _____ Sophomore                _____ Business
         _____ Junior                   _____ Communication
         _____ Senior                   _____ Education
                                        _____ Social Science
                                        _____ Natural Science
                                        _____ Undecided
                                        _____ None of the above

Thank you for your assistance with this research project

REFERENCES

Abrams, R. A., & Christ, S. E. (2003). Motion onset captures attention. Psychological Science,
14(5), 427-432. doi: 10.1111/1467-9280.01458
Abrams, R. A., & Christ, S. E. (2005). The onset of receding motion captures attention:
Comment on Franconeri and Simons (2003). Perception & Psychophysics, 67(2), 219-223.
doi:10.3758/BF03206486
Abrams, R. A., & Christ, S. E. (2006). Motion onset captures attention: A rejoinder to Franconeri
and Simons (2005). Perception & Psychophysics, 68(1), 114-117.
doi:10.3758/BF03193661
Ashbridge, E., Perrett, D., Oram, M., & Jellema, T. (2000). Effect of image orientation and size
on object recognition: Responses of single units in the macaque monkey temporal cortex.
Cognitive Neuropsychology, 17, 13-34. doi:10.1080/026432900380463
Barbot, A., Landy, M. S., & Carrasco, M. (2012). Differential effects of exogenous and
endogenous attention on second-order texture contrast sensitivity. Journal of Vision,
12(8). doi:10.1167/12.8.6
Beanland, V., & Pammer, K. (2012). Minds on the blink: The relationship between inattentional
blindness and attentional blink. Attention, Perception, & Psychophysics, 74(2), 322-330.
doi:10.3758/s13414-011-0241-4
Bergen, L., Grimes, T., & Potter, D. (2005). How attention partitions itself during simultaneous
message presentations. Human Communication Research, 31(3), 311-336.
doi:10.1111/j.1468-2958.2005.tb00874.x
Carter, R. J. (1996). Television weather broadcasts: Animated cartography aplenty. Proceedings
of the Seminar on Teaching Animated Cartography, Escuela Universitaria de Ingeniera
Tecnica Topografica, Madrid, Spain, August 30 - September 1,1995.
http://nvkserver.frw.ruu.nl/html/nvk/ica/madrid/carter.html.
Cavanagh, P., & Favreau, O. E. (1985). Color and luminance share a common motion pathway.
Vision Research, 25(11), 1595-1601. doi:10.1016/0042-6989(85)90129-4
Christ, S. E., & Abrams, R. A. (2008). The attentional influence of new objects and new motion.
Journal of Vision, 8(3). doi:10.1167/8.3.27
Derrington, A. M., & Badcock, D. R. (1985). The low level motion system has both chromatic
and luminance inputs. Vision Research, 25(12), 1879-1884.
doi:10.1016/0042-6989(85)90011-2
Duncan, J. (1999). Attention. In R. A. Wilson & F. C. Keil (Eds.), The MIT Encyclopedia of the
Cognitive Sciences (pp. 39-41). Cambridge, MA: MIT Press.
D'Zmura, M. (1991). Color in visual search. Vision Research, 31(6), 951-966.
doi:10.1016/0042-6989(91)90203-H
Elliot, A. J., & Maier, M. A. (2007). Color and psychological functioning. Current Directions in
Psychological Science, 16(5), 250-254. doi:10.1111/j.1467-8721.2007.00514.x
Federal Communications Commission. (2007). Emergency Alert System. Retrieved from
http://hraunfoss.fcc.gov/edocs_public/attachmatch/DOC-278628A5.pdf.
Fernandez-Duque, D., & Posner, M. I. (1997). Relating the mechanisms of orienting and
alerting. Neuropsychologia, 35(4), 477-486. doi:10.1016/S0028-3932(96)00103-0
Franconeri, S. L., & Simons, D. J. (2003). Moving and looming stimuli capture attention.
Perception & Psychophysics, 65, 999-1010. doi:10.3758/BF03194829
Franconeri, S. L., Simons, D. J., & Junge, J. A. (2004). Searching for stimulus-driven shifts of
attention. Psychonomic Bulletin & Review, 11(5), 876-881. doi:10.3758/BF03196715
Giesbrecht, B., Bischof, W. F., & Kingstone, A. (2004). Seeing the light: Adapting luminance
reveals low-level visual processes in the attentional blink. Brain and cognition, 55(2),
307-309. doi:10.1016/j.bandc.2004.02.027
Healey, C. G., & Enns, J. T. (2012). Attention and visual memory in visualization and computer
graphics. IEEE Transactions on Visualization and Computer Graphics, 18(7), 1170-1188.
doi:10.1109/TVCG.2011.127
Healey, C. G., & Enns, J. T. (1999). Large datasets at a glance: Combining textures and colors in
scientific visualization. IEEE Transactions on Visualization and Computer Graphics,
5(2), 145-167. doi:10.1109/2945.773807
Hill, R. A., & Barton, R. A. (2005). Psychology: Red enhances human performance in contests.
Nature, 435(7040), 293-293. doi: 10.1038/435293a
Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene
analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), 1254-1259.
doi:10.1109/34.730558
Itti, L., & Koch, C. (2001). Computational modelling of visual attention. Nature Reviews
Neuroscience, 2(3), 194-203. doi: 10.1038/35058500.
Jonides, J., & Yantis, S. (1988). Uniqueness of abrupt visual onset in capturing attention.
Perception & Psychophysics, 43(4), 346-354. doi:10.3758/BF03208805
Kawahara, J., Yanase, K., & Kitazaki, M. (2012). Attentional capture by the onset and offset of
motion signals outside the spatial focus of attention. Journal of Vision, 12(12).
doi:10.1167/12.12.10
Koch, C., & Ullman, S. (1985). Shifts in selective visual attention: Towards the underlying
neural circuitry. Hum Neurobiol, 4(4), 219-227. doi:10.1007/978-94-009-3833-5_5
Koivisto, M., Hyönä, J., & Revonsuo, A. (2004). The effects of eye movements, spatial attention,
and stimulus features on inattentional blindness. Vision Research, 44(27), 3211-3221.
doi:10.1016/j.visres.2004.07.026
Kuldkepp, N., Kreegipuu, K., Raidvee, A., Näätänen, R., & Allik, J. (2013). Unattended and
attended visual change detection of motion as indexed by event-related potentials and its
behavioral correlates. Frontiers in Human Neuroscience, 7.
doi:10.3389/fnhum.2013.00476
Livingstone, M., & Hubel, D. (1988). Segregation of form, color, movement, and depth:
Anatomy, physiology, and perception. Science, 240(4853), 740-749.
doi:10.1126/science.3283936
Meier, B. P., D’Agostino, P. R., Elliot, A. J., Maier, M. A., & Wilkowski, B. M. (2012). Color in
context: Psychological context moderates the influence of red on approach-and
avoidance-motivated behavior. PloS One, 7(7), e40333.
doi:10.1371/journal.pone.0040333
Mezzacappa, E. (2004). Alerting, orienting, and executive attention: Developmental properties
and sociodemographic correlates in an epidemiological sample of young, urban children.
Child Development, 75(5), 1373-1386. doi:10.1111/j.1467-8624.2004.00746.x
Most, S. B., Simons, D. J., Scholl, B. J., Jimenez, R., Clifford, E., & Chabris, C. F. (2001). How
not to be seen: The contribution of similarity and selective ignoring to sustained
inattentional blindness. Psychological Science, 12, 9-17. doi:10.1111/1467-9280.00303
Nagy, A. L., & Sanchez, R. R. (1990). Critical color differences determined with a visual search
task. JOSA A, 7(7), 1209-1217. doi:10.1364/JOSAA.7.001209
Neisser, U. (1979). The control of information pickup in selective looking. In A. D. Pick (Ed.),
Perception and its development: A tribute to Eleanor J. Gibson (pp. 201-219). Hillsdale,
NJ: Lawrence Erlbaum.
Pessoa, L. (2005). To what extent are emotional visual stimuli processed without attention and
awareness? Current Opinion in Neurobiology, 15(2), 188-196.
doi:10.1016/j.conb.2005.03.002
Posner, M.I. (1995). Attention in Cognitive Neuroscience. In M.S. Gazzaniga (Ed.), Handbook
of Cognitive Neuroscience (pp. 615-624). Cambridge, MA: MIT Press.
Raz, A., & Buhle, J. (2006). Typologies of attentional networks. Nature Reviews Neuroscience,
7(5), 367-379. doi:10.1038/nrn1903
Simons, D. J. (2000). Attentional capture and inattentional blindness. Trends in Cognitive
Sciences, 4(4), 147-155. doi:10.1016/S1364-6613(00)01455-8
Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional blindness
for dynamic events. Perception-London, 28(9), 1059-1074. doi:10.1068/p2952
Todd, J. T., & Van Gelder, P. (1979). Implications of a transient–sustained dichotomy for the
measurement of human performance. Journal of Experimental Psychology: Human
Perception and Performance, 5(4), 625. doi:10.1037/0096-1523.5.4.625
Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive
Psychology, 12(1), 97-136. doi:10.1016/0010-0285(80)90005-5
Treue, S. (2003). Visual attention: The where, what, how and why of saliency. Current Opinion
in Neurobiology, 13(4), 428-432. doi:10.1016/S0959-4388(03)00105-3
von Mühlenen, A., Rempel, M. I., & Enns, J. T. (2005). Unique temporal change is the key to
attentional capture. Psychological Science, 16(12), 979-986.
doi:10.1111/j.1467-9280.2005.01647.x
Walther, D., Rutishauser, U., Koch, C., & Perona, P. (2004). On the usefulness of attention for
object recognition. In Workshop on Attention and Performance in Computational Vision
at ECCV (pp. 96-103).
Wang, Z., Lang, A., & Busemeyer, J. R. (2011). Motivational processing and choice behavior
during television viewing: An integrative dynamic approach. Journal of Communication,
61(1), 71-93. doi:10.1111/j.1460-2466.2010.01527.x
Wolfe, J. M. (1998). Visual Search. In H. Pashler (Ed.), Attention (pp. 13-74). East Sussex, UK:
Psychology Press.
Xuan, B., Zhang, D., He, S., & Chen, X. (2007). Larger stimuli are judged to last longer. Journal
of Vision, 7(10). doi:10.1167/7.10.2
Yantis, S. (1998). Control of visual attention. In H. Pashler (Ed.), Attention (pp. 223-256). East
Sussex, UK: Psychology Press.
Yantis, S., & Jonides, J. (1984). Abrupt visual onsets and selective attention: Evidence from
visual search. Journal of Experimental Psychology: Human Perception & Performance,
10, 601-621. doi:10.1037/0096-1523.10.5.601