This is to certify that the dissertation entitled SPATIAL INFORMATION DISPLAY FRAMEWORK FOR MOBILE AUGMENTED REALITY INTERFACES presented by Kwok Hung Tang has been accepted towards fulfillment of the requirements for the Ph.D. degree in Computer Science and Engineering.

Major Professor's Signature

1st November 2005

SPATIAL INFORMATION DISPLAY FRAMEWORK FOR MOBILE AUGMENTED REALITY INTERFACES

By

Kwok Hung Tang

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Computer Science and Engineering

2005

ABSTRACT

SPATIAL INFORMATION DISPLAY FRAMEWORK FOR MOBILE AUGMENTED REALITY INTERFACES

By

Kwok Hung Tang

Future augmented reality (AR) user interfaces will allow designers the flexibility of placing information all around the body of a mobile user, effectively utilizing the area around the body as a spatial user interface. The design of these future interfaces prompts a significant human factors challenge: How should interface designers map different metaphors, information, and functions of computer usage into a volumetric computing environment to maximize information bandwidth and reduce a user's attentional and cognitive load? Issues of human cognition and psychological effects in AR are largely unexplored, and little is known about how humans organize information objects in an egocentric and exocentric free-space environment. This thesis addresses the research problem by: (1) constructing a spatial information display framework based on neuropsychological research, and (2) extending research in cognitive psychology and behavioral science to AR interface design. Three research questions in cognitive psychology are identified that are closely related to the design of AR interfaces: (1) the use of reference frames during the spatial encoding process, (2) the applicability of perceptual asymmetry properties in AR interface design, and (3) directing visuo-spatial attention in omnidirectional space. Six experiments were conducted to investigate these three research questions. The experimental results were combined with the existing literature to form a set of guidelines for information display in mobile AR environments.

Keywords: Augmented reality, human-computer interaction, perceptual and kinematics asymmetry, spatial reference frame, three-dimensional visuo-spatial attention

ACKNOWLEDGEMENTS

This thesis is a collaborative effort and would not have materialized without the help of my advisors, colleagues, friends and family. I owe more to my parents than to anyone else; it is to them that this thesis is dedicated. I am deeply indebted to my two academic fathers, Dr. Charles Owen and Dr. Frank Biocca.
It was a great pleasure to be a student of Dr. Charles Owen, my principal advisor. His excellent consultancy, professional editing, unrelenting support, and hard-working attitude not only provided a solid basis for my thesis research, but also set a great example for my future career. It has been a prestigious opportunity to be able to work with Dr. Frank Biocca in the M.I.N.D. Labs for the last seven years. His philosophical mind has been of great value to the theoretical background of my thesis research, and his vision has broadened and deepened my view on scientific research. My sincere gratitude goes to Dr. George Stockman and Dr. John Weng for monitoring my research work and for spending their valuable time reading the document and providing valuable advice on this thesis. I wish to express my warm and sincere thanks to Dr. Weimin Mou for contributing a significant amount of research work to this thesis. My sincere thanks also go to Dr. Prabu David for his statistical analysis on Experiment 5. I also owe my fellow colleague, Fan Xiao, a big thank-you for his support in preparing the stimulus materials. I must also thank Betsy McKeon, who did very competent work on data collection in the experiments. And last but not least, I would like to express warm gratitude to Zena Biocca, who created a great working environment in the lab to foster this thesis research.

TABLE OF CONTENTS

LIST OF TABLES ..... viii
LIST OF ABBREVIATIONS ..... xii
1 Introduction ..... 1
1.1 Using Space as a Medium for Thought ..... 2
1.2 How Spatial Representations Leverage Spatial Cognition for Thinking ..... 3
1.3 Spatial Cognition and Augmented Reality Space ..... 5
1.4 Research Motivation and Problem Statement ..... 6
1.5 Contributions of this Thesis ..... 7
2 Theoretical Background ..... 9
2.1 Spatial Framework of Three-dimensional Space ..... 9
2.2 Neuropsychology of Three-dimensional Spaces ..... 10
2.2.1 Personal/Body Space ..... 10
2.2.2 Peripersonal Space ..... 12
2.2.3 Extrapersonal space ..... 14
2.3 Mapping Digital Information to Space in Augmented Reality Systems ..... 17
3 Spatial Framework of Information Display in Mobile Augmented Reality Environments ..... 19
3.1 Spatial Information Framework ..... 19
3.1.1 Personal-body Infospace .....
19 3.1.2 Peripersonal Infospace .............................................................................. 23 3.1.3 Extrapersonal Focal Infospace .................................................................. 25 3.1.4 Extrapersonal Action-Scene Infospaces ................................................... 25 3.1.5 Extrapersonal Ambient Infospaces ........................................................... 25 3.2 Summary ............................................................................................................ 26 4 Behavioral Properties of Three-dimensional Space .................................................... 28 4.1 Behavioral Properties in Personal-body Infospaces .......................................... 28 4.1.1 Proprioception ........................................................................................... 29 4.1.2 Spatial Bias in Personal-body Infospace ................................................... 31 4.1.3 The Hand and Forearms Personal-body Infospaces .................................. 31 4.2 Behavioral Properties in Peripersonal Infospace ............................................... 33 4.2.1 Spatial Biases of Information in Peripersonal Space ................................ 33 4.3 Behavioral Properties in Extrapersonal Focal Infospace ................................... 34 4.3.1 Visual Clutter in Head Stabilized Reference Frame ................................. 35 4.3.2 Perceptual Fading of Visual Stimulus ....................................................... 35 4.3.3 Spatial Bias in Head Stabilized Reference Frame .................................... 36 4.4 Behavioral Properties in Relation to Egocentric Infospaces .............................. 36 4.4.1 Kinematics Asymmetry ............................................................................ 37 4.4.2 Perceptual Asymmetries ........................................................................... 39 4.5 Behavioral Properties in Extrapersonal Action-scene Infospaces ..................... 45 4.5.1 Spatial Consistency of Information Objects with the Environment ......... 46 4.5.2 Remote Interaction for Information Objects in Extrapersonal Action-scene Infospace: Selection and Manipulation .................................................... 47 4.5.3 Unregistered Extrapersonal Action-scene Infospace ................................ 53 4.6 Behavioral Properties in Extrapersonal Ambient Infospace .............................. 53 4.6.1 Spatial Bias in Extrapersonal Ambient Infospace .................................... 53 4.6.2 Linear Perspective and Motion Perception Properties .............................. 54 4.7 Summary ............................................................................................................ 54 Reference Frames in Mobile Augmented Reality Displays ........................................ 55 5.1 Related Works .................................................................................................... 55 5.2 Experiment 1: The Default Reference Frame .................................................... 59 5.2.1 Methodology ............................................................................................. 59 5.3 Experiment 2: Adaptation of Egocentric Frame with Prior Experience ............ 66 5.3.1 Methodology ............................................................................................. 67 5.3.2 Results and Discussion ............................................................................. 
68 5.4 Experiment 3: Adaptation to an Egocentric Frame with Oral Instruction ......... 70 5.4.1 Methodology ............................................................................................. 70 5.5 Discussion .......................................................................................................... 72 Evaluation of Perceptual Asymmetric Effects in Egocentric Infospaces ................... 76 6.1 Experiment 4: Evaluation of Left vs. Right Instruction Presentation ................ 76 6.1.1 Methodology ............................................................................................. 77 6.1.2 Results ....................................................................................................... 79 6.1.3 Discussion ................................................................................................. 80 6.2 Experiment 6: Emotion and Semantic Meaning ................................................ 81 6.2.1 Related Works ........................................................................................... 81 6.2.2 Methodology ............................................................................................. 83 6.2.3 Results and Analysis ................................................................................. 86 6.2.4 Discussion ................................................................................................. 92 6.3 Summary ............................................................................................................ 95 Directing Attention in Mobile AR Interface ............................................................... 96 7.1 Attention Management ....................................................................................... 98 7.1.1 Attention Cueing in Existing Interfaces .................................................... 99 7.1.2 Spatial Cueing in Augmented Reality ..................................................... 100 7.2 The Omnidirectional Attention Funnel ............................................................ 101 7.2.1 Components of the Attention Funnel ...................................................... 102 7.2.2 Affordances in the Attention Funnel that Guide Navigation and Body Rotation ................................................................................................... 106 7.2.3 Methods for Sensing or Marking Targets Objects or Locations ............. 107 7.3 Methodology .................................................................................................... 108 7.3.1 Participants .............................................................................................. 109 7.3.2 Stimulus Materials .................................................................................. 109 7.3.3 Apparatus and Test Environment ............................................................ 110 vi 7.3.4 Measurements ......................................................................................... 1 11 7.3.5 Procedure ................................................................................................ 111 7.4 Results .............................................................................................................. 1 12 7.5 Discussion ........................................................................................................ 1 14 7.6 Application of the Attention Funnel ................................................................ 
114
8 Discussion and Conclusion ..... 117
8.1 Guideline for Information Display in Augmented Reality Environments ..... 118
8.2 Future Works ..... 119
8.3 Conclusion ..... 120
Appendix A. Spatial Information Display Guideline for Mobile Augmented Reality Interfaces ..... 122
A. Spatial Framework of the Three-dimensional Space ..... 123
B. Peripersonal Infospace ..... 124
C. Personal-body Infospace ..... 126
D. Extrapersonal Focal Infospace ..... 129
E. Egocentric Infospaces ..... 132
F. Extrapersonal Action-scene Infospace ..... 135
G. Extrapersonal Ambient Infospace ..... 138
H. Infospace Choice for Common Information Objects ..... 139
9 References ..... 142

LIST OF TABLES

Table 4.1. Summary of cerebral hemispheric specializations ..... 41
Table 4.2. Summary of pros and cons of different remote object manipulation methods ..... 52
Table 5.1. Pointing latency (in seconds) and pointing accuracy (in degrees) as a function of Actual-Imagined (A-I) distance and Learning-Imagined (L-I) distance in Experiment 1 ..... 64
Table 5.2. Analysis of variance results for pointing latency and pointing accuracy in Actual-Imagined (A-I) and Learning-Imagined (L-I) conditions in Experiment 1 ..... 65
Table 5.3. Pointing latency (in seconds) and pointing accuracy (in degrees) as a function of Actual-Imagined (A-I) distance and Learning-Imagined (L-I) distance in Experiment 2 ..... 68
Table 5.4. Analysis of variance results for pointing latency and pointing accuracy in Actual-Imagined (A-I) and Learning-Imagined (L-I) conditions in Experiment 2 ..... 69
Table 5.5. Pointing latency (in seconds) and pointing accuracy (in degrees) as a function of Actual-Imagined (A-I) distance and Learning-Imagined (L-I) distance in Experiment 3 ..... 71
Table 5.6. Analysis of variance results for pointing latency and pointing accuracy in Actual-Imagined (A-I) and Learning-Imagined (L-I) conditions in Experiment 3 ..... 72
Table 6.1. Task completion time and standard deviation in Experiment 4 ..... 80
Table 6.2. Means for the different levels of the 3 experimental factors ..... 87

LIST OF FIGURES

Figure 2.1. A prototype volumetric AR interface with information objects placed in different reference frames ..... 10
Figure 2.2. Personal/body space ..... 11
Figure 2.3. Peripersonal space ..... 13
Figure 2.4. Extrapersonal focal space ..... 14
Figure 2.5. Extrapersonal action space ..... 15
Figure 2.6. Scene space ..... 16
Figure 2.7. Extrapersonal ambient space ..... 17
Figure 3.1. Human skeletal structure (Gray, Bannister, Berry and Williams 1995) ..... 21
Figure 3.2. Selected bone groups for Personal-Body Infospaces: (a) Skull (b) Vertebral column (c) Sternum and costal cartilages (d) Humerus (e) Forearm group (f) Hand (g) Femur (h) Patella (i) Leg (j) Foot (Gray et al. 1995) ..... 22
Figure 3.3. Skeletal structure of the human hand (Gray et al. 1995) ..... 23
Figure 3.4. The vertebral column (Gray et al. 1995) ..... 24
Figure 3.5. Spatial framework for information display in mobile augmented reality environments ..... 27
Figure 4.1. Hand Personal-body Infospaces: a menu attached to the non-dominant hand and an interaction tool (the ring) attached to the dominant hand ..... 32
Figure 4.2. The visual pathway of the left and right visual fields. Retinal signals from the left and right visual fields project exclusively to the contralateral cerebral hemispheres ..... 42
Figure 4.3. Visual pathway of the upper and lower visual fields ..... 44
Figure 5.1. The eight virtual objects used in the experiments ..... 60
Figure 5.2. Layout of objects used in the experiments. During the learning phase, half of the participants faced the cell phone and the other half faced the notebook ..... 60
Figure 5.3. Design of experiments: head-nose icons indicate actual headings; arrows indicate imagined headings. Headings and differences between them are measured counter-clockwise to maintain consistency with previous experiments ..... 61
Figure 6.1. Examples of instruction and the completed task. An example of text instruction is shown in (a) and the completed task in (b); an example of graphic instruction is shown in (c) and the completed task in (d) ..... 78
Figure 6.2. Ten predefined locations around the body. The five locations in the near space are 3' away from the body. The five locations in the far space are 10' from the body. The above, below, left and right locations are deviated 30° from the center location ..... 84
Figure 6.3. The two stimulus materials used in the experiment. The golden sphere used for object representation is shown on the left side. The human head used for agent representation is shown on the right side ..... 84
Figure 6.4. Relevant-Irrelevant by distance and position of objects ..... 89
Figure 6.5. Urgent-Not urgent by distance and position ..... 91
Figure 6.6. Urgent-Not urgent by object and position ..... 91
Figure 7.1. The attention funnel links the head of the viewer directly to an object anywhere around the body ..... 101
Figure 7.2. Three basic patterns are used to construct a funnel: (A) the head-centered plane, which includes a boresight to mark the center of the pattern from the user's viewpoint; (B) funnel planes, added in a fixed pattern (approximately every 12 centimeters) between the user and the object; and (C) the object marker pattern, which includes a red crosshair marking the approximate center of the object ..... 103
Figure 7.3. As the head and body move, the attention funnel dynamically provides continuous feedback. Affordances from the perspective cues automatically guide the user towards the cued location or object. Dynamic head movement cues are provided by the skew (e.g., left, right, up, down) of the attention funnel. The level of alignment (skew) of the funnel provides an immediate intuitive sense of how much the body or head must turn to see the object ..... 104
Figure 7.4. Example of the attention funnel drawing the attention of the user to an object on the shelf, the red box ..... 107
Figure 7.5. Test environment: the user sat in the middle of the test environment for the visual search task. It consisted of an omnidirectional workspace assembled from four tables, each with 12 objects (6 primitive shapes and 6 general office objects), for a total of 48 target search objects ..... 110
Figure 7.6. Search time and consistency by experimental condition. The attention funnel decreased search time by 22% on average (28% when reach time is subtracted) and increased search consistency (decreased variability) by 65% ..... 113
Figure 7.7. Mental workload measured by NASA TLX for each experimental condition ..... 113

LIST OF ABBREVIATIONS

A-I    Actual-Imagined
AR     Augmented Reality
ANOVA  Analysis of Variance
B.C.E. Before the Common Era
CHIMP  Chapel Hill Immersive Modeling Program
GPS    Global Positioning System
HMD    Head-mounted Display
HOMER  Hand-centered Object Manipulation Extending Ray-casting
HUD    Head-up Display
LGN    Lateral Geniculate Nucleus
L-I    Learning-Imagined
M      Mean
ms     Milliseconds
PDA    Personal Digital Assistant
RFID   Radio Frequency Identification
SD     Standard Deviation
SGI    Silicon Graphics, Inc.
VR     Virtual Reality
WIMP   Window, Icon, Menu, Pointer
1 Introduction

Technological developments are allowing for the design of computer user interfaces that extend the traditional interface into the physical space all around the user, breaking the bounds of the small monitor-based display and allowing for mobile interfaces that appear to present a virtually unlimited quantity of information objects around the user. User interface components can float in space around the user or appear to be placed on the surface of the body. This extension of the space utilized for information raises the question of how best to place content around the user. This thesis explores the effective utilization of the space around a user in future user interfaces, addressing issues of effective placement that are sound from a psychological and physiological standpoint.

Alan Kay described the personal computer as the first meta-medium, an electronic medium which can be used to store, manipulate and access numerous media forms such as text, images, audio, video, and three-dimensional models (Kay 1984). The emergence of the World Wide Web in the last decade brought into existence the "global village interconnected by an electronic nervous system" as envisioned by Marshall McLuhan (1967). During this era, the computer has evolved into an information portal to databases in different media forms and a communication portal for different social activities. An unprecedented amount of information and activity can be received continuously through this portal by the user. The user interface is analogous to a gateway for this communication and information portal. It manages, and often limits, the information the user is able to absorb and the commands the user is able to deploy to the computer system. Effective design of this gateway can maximize the bandwidth between the computer and the user.

1.1 Using Space as a Medium for Thought

Every medium, from traditional printed media to modern computer-mediated interactive media, uses spatial arrangement in some way to organize information (Cavell 2002). The prevalent computer user interface for the last 25 years, the traditional WIMP (window, icon, menu, pointer) direct manipulation interface (Shneiderman 1983), is a two-dimensional spatial arrangement of icons and overlapping windows, suggesting layers of information and containers (or folders) that are "opened" to reveal arrays of icons, simulating the arrangement of material as if it were on an office desktop. Motor interaction in WIMP interfaces is spatial, as the system is controlled by a virtual pointer on the display manipulated by the mouse on a spatial surface. The advantage of the WIMP interface is familiarity. It is based on the desk surface and folders metaphor that is obvious to novice users. However, the metaphor is limited in much the same way that restricting an office to just the surface of a small desk would be. Three-dimensional environments are far richer and more expressive than two-dimensional flat surfaces.
With the advent of motion tracking systems and low-cost, high-performance graphics workstations, the novel and highly spatial augmented reality (AR) interfaces visualized in Hollywood movies, video games and science fiction are becoming technologically feasible. These interfaces tightly couple spatial three-dimensional stimuli to the movement of the user's body. The sensors and effectors of the computer system are then mapped to the user's body schema (Biocca 1997). Volumetric AR interfaces make use of a greater range of human sensorimotor capabilities, potentially increasing the communication bandwidth between the user and the computer by cutting the ties to that technological ancestor, the typewriter.

AR interfaces have unique characteristics compared to other media and computer interfaces: users interact with the computer system through body motion in a volumetric space, instead of via a two-dimensional surface. This is very different from traditional computer interfaces and other three-dimensional screen-based interactions such as DataMountain (Robertson, Czerwinski, Larson, Robbins, Thiel and van Dantzich 1998) and fish-tank virtual reality (VR) (Ware, Arthur and Booth 1993). Traditional computer interfaces can be likened to limiting user interaction to the surface area of a small office desktop, and screen-based three-dimensional interfaces are analogous to a window into the office through which users peer at a presentation of an alternative reality; effectively an outsider looking in. AR is a truly immersive spatial electronic medium in which the user's body is immersed in a blended real/virtual environment, where the computer arrays two-dimensional and three-dimensional information around the user. This unique spatial arrangement allows for the display of large volumes of data, and designers are still exploring ways to organize information in this cutting-edge interface.

1.2 How Spatial Representations Leverage Spatial Cognition for Thinking

In the everyday world, humans organize and manipulate objects in space to facilitate thinking. Kirsh asserted that humans are constantly, whether consciously or subconsciously, organizing and reorganizing space in everyday life to enhance performance, and argued that "methods used to manage our space are key to organization of our thought patterns and behavior" (Kirsh 1995). Spatial schema and spatial reasoning are not just about space. They are also implicated in abstract reasoning. There is ample evidence from the fields of psychology and neuroscience that spatial cognition plays an important role in mathematical reasoning, modeling of time, language organization, and memory organization (Gardner 1983; Bryant 1992; Bryant, Tversky and Franklin 1992; Kirsh and Maglio 1992; Eilan, McCarthy and Brewer 1993; Ferguson 1994; Grabowska and Nowicka 1996; Boroditsky and Ramscar 2002).

The use of spatial representation and organization to enhance human cognition has been a successful strategy since the effective mnemonic strategies of the ancient Greeks. Demosthenes, a Greek orator born around 384 B.C.E., used a strategy known as the "Method of Loci" to memorize long speeches by mentally walking through his house, associating each element in the speech with different spots or objects in the house (Yates 1966). How information is spatially represented can facilitate cognition. For example, different spatial arrangements of physical objects can dramatically affect how people solve a problem.
Zhang and Norman reported an experiment showing that a subject's performance when solving the Tower of Hanoi problem was drastically affected by the spatial placement of the problem pieces (Zhang and Norman 1994). Much of the problem representation of the Tower of Hanoi problem can be offloaded to an external spatial representation of the problem pieces; as a result, the load on internal working memory can be reduced and more working memory capacity can be allocated to problem solving. There is historical evidence that the arrival of new ways to visualize information, such as illustrations, graphs, computer graphics and videos, has had a dramatic impact on advances in engineering and science (Ferguson 1994). Virtual environments and visualizations represent information spatially through proximity, color gradation, or spatial arrays to allow users to immediately grasp large amounts of quantitative data and complex mathematical relationships (Card, Mackinlay and Shneiderman 1999; Ware 2000). Spatial arrays can be intuitive even for novice users. For example, Merickel found that VR enhanced a child's ability to solve spatially related problems (Merickel 1992).

1.3 Spatial Cognition and Augmented Reality Space

Wearable and mobile AR systems have great potential to provide continuous support for virtual space and visualized information arrays, as well as integrating, annotating, and interacting with physical space. These systems can potentially be powerful "cognitive artifacts" (Norman 1993) or "intelligence amplifying systems" (Brooks 1996) that enhance human cognitive activities, such as attention, planning, decision making, and procedural and semantic memory.

Information objects in AR environments have unique spatial properties. Because of the nature of gravity, traditional information objects have to be physically attached to the body or to other support structures within the environment. However, tools and information objects in AR environments can remain stationary with respect to the world or to user body parts such as the head and the torso and appear to be totally unsupported and floating in space. The amount of mobile space available to organize information objects is increased by extending the working volume from the surface of the body to a peripersonal volume in the volumetric AR computing environment; a working space that is associated with the physical body and, thereby, the user. In such an environment, users will be able to manipulate and access multiple information objects concurrently, much as users commonly multitask with devices such as cell phones, address books, and other physical information media.

1.4 Research Motivation and Problem Statement

This thesis constructs a spatial framework for information display in AR environments based on experimental behavioral science and neuropsychological studies of how humans interact with visually and physically perceived objects in three-dimensional space. The theoretical framework allows researchers and designers of AR interfaces to systematically investigate spatial cognition issues closely related to AR interface design. It seems obvious that the human cognitive system should process information objects in an augmented environment in exactly the same way real information objects are processed. However, information objects in an augmented environment do not necessarily behave the same as objects in reality.
For example, tools and information objects in an AR environment can remain stationary in space or be attached to different reference frames in the environment or to body parts. Since it is impossible to generate this apparent "anti-gravity" feature in the physical environment, precious little is known about how humans mentally organize information objects attached to an egocentric or allocentric "weightless" environment. How might users manage and organize different information fields around different frames of reference in this new environment? The primary attention and effort of researchers in the AR community has been focused on the technologies and engineering of AR systems. User studies in AR are generally limited to testing proof-of-concept prototypes with simple user evaluation. Currently there is a lack of explicit theories and guidelines in computer-human interaction to support the design of this emerging technology and its varied applications.

1.5 Contributions of this Thesis

The major contribution of this thesis is the construction of a new spatial framework for information display in AR environments. A large volume of existing work in cognitive psychology and neuroscience is examined, and existing theories in human perception and information processing are coalesced and transformed into theories applicable to information placement in an AR environment. Furthermore, six experiments were conducted to discover unique human spatial cognitive properties closely related to the design of AR environments. The experimental results were then combined with existing research in behavioral science and the neuropsychology of three-dimensional space and used for the construction of research-based information placement guidelines for mobile AR environments.

The remainder of this dissertation is organized as follows. Chapter 2 reviews literature in behavioral science and neuropsychology that is closely related to spatial information display in AR environments. Chapter 3 presents a spatial framework of three-dimensional spaces based on the existing neuropsychological evidence reviewed in Chapter 2. Chapter 4 discusses behavioral properties in the spatial framework based on existing literature. Three research questions are raised and investigated in Chapters 5, 6 and 7. Chapter 5 discusses three experiments that investigate the use of reference frames during the spatial encoding process in AR environments. Chapter 6 discusses two experiments to evaluate the applicability of perceptual asymmetry properties in AR interface design. Chapter 7 presents a novel metaphor for directing visuo-spatial attention along with an experimental evaluation of the metaphor. The main contributions of this research are then summarized in Chapter 8, and potential future research is discussed.

2 Theoretical Background

Theory-driven human-computer interaction design is necessary to develop a high-performance AR interface. With motion tracking technologies, AR systems afford many options for information placement relative to the environment, objects in the environment, and the user's body. Figure 2.1 illustrates a prototype AR interface with information attached to different reference frames. If users of AR systems will be accessing, organizing, and deploying large volumes of information in space, then an understanding of how the brain accesses and organizes spatial information is a sound, human factors basis for interface research and guidelines.
The problem statement becomes: given an environment where information can be placed anywhere in space around the user and stabilized relative to the body or the environment, what are effective ways to organize information objects in that space?

2.1 Spatial Framework of Three-dimensional Space

Much of the cognitive capability of the human brain is allocated to the task of tracking the location of people and objects in space, especially in the planning of motor actions. From biological and psychological viewpoints, AR space is not a continuous Cartesian space. Research in spatial cognition indicates that objects in the environment appear to be modelled in the brain using interrelated spatial coordinate frameworks organized around the body, objects, and the larger environment (Pettigrew and Dreher 1987; Previc 1990b; Bryant 1992; Bryant et al. 1992; Pani and Dupree 1994; Cutting and Vishton 1995; Previc 1998).

Figure 2.1. A prototype volumetric AR interface with information objects placed in different reference frames.

2.2 Neuropsychology of Three-dimensional Spaces

According to current neuropsychological theories, the brain models the surrounding three-dimensional space as three overlapping regions: (1) personal/body space, (2) peripersonal space, and (3) extrapersonal space.

2.2.1 Personal/Body Space

The clearest psychological spatial boundary is defined by personal space, or body space; it is the psychological space that defines the boundary between the body (the proximal "me") and the world beyond the body. The personal/body space is the volume extending to a few centimeters from the skin of the body, as illustrated in Figure 2.2. This space not only holds proprioceptive information about the position of limbs and body; it is also where pericutaneous (tactile surface) interaction (such as hand shaking) and buccal (oral) interactions occur.

Figure 2.2. Personal/body space.

Some neuroscience data based on animal studies suggest that neuronal responses to body space extend slightly beyond the skin surface (Graziano and Gross 1995). Philosophers and psychologists (for example, Heidegger 1968; Bateson 1972) have long speculated that the psychological boundary of the body sometimes expands so that objects near the body are integrated into the personal body space. Although the boundary of the body appears to be physical and fixed from the viewpoint of an objective observer, there is evidence from research in neuropsychology that the sense of the boundary of the body is plastic. Personal space, defined as the shape and extent of the body schema, can be expanded to incorporate objects attached to the body (e.g., clothing and tools).

Neuroscience studies by Maravita and Iriki (2004) on neuronal motor responses during tool usage by monkeys suggest that the body schema, defined as the receptive fields of neurons associated with perceived body parts, expands to incorporate tools such as sticks and rakes after extended use. Furthermore, they show that this extension of the receptive fields extends to video representations of the monkey's body shown on a monitor, so that the neurons respond to a displaced virtual hand as if it were the monkey's physical hand. This suggests that tools can be incorporated into the body schema at some level. Another line of research that suggests how media tools can restructure the body schema is work on visual-motor adaptation in space perception. In these studies, a technology is used to alter visual perception through the use of a sensory prosthesis such as a prismatic lens.
Adaptation to the sensory change, subsequent errors, and readaptation after the alteration is removed are observed (Stratton 1897; Held and Schlank 1959; Harris 1963; Kohler 1964; Hay and Pick 1966; Ebenholtz and Mayer 1968; Dolezal 1982). In studies on visual and motor hand adaptation in virtual environments, it was found that AR systems can remap the perceived location of the hands (motor space) relative to visual space (Rolland, Biocca, Barlow and Kancherla 1995; Biocca and Rolland 1998).

2.2.2 Peripersonal Space

Another key subspace motivated by neuroscience research on three-dimensional spaces is the peripersonal space. Peripersonal space is the volume of space immediately in front of the body and reachable by the arms and hands. Peripersonal space is tied mainly to the egocentric trunk- or shoulder-centered coordinate frame (Previc 1998). Located immediately in front of the body, biased towards the central 60° in the lower visual field, and with a radial extension of 0-2 m, peripersonal space overlaps considerably with the ergonomic space known as the reach envelope (Proctor and Van Zandt 1994a; Proctor and Van Zandt 1994b) (Figure 2.3). Peripersonal space is functionally organized for binocular object inspection, motion processing, hand motion, and manipulations such as directly reaching and handling objects. This interpretation is supported by behavioral evidence, in that information and objects in this area are found and manipulated the fastest (Hari and Jousmaki 1996; Murphy and Goodale 1997).

Figure 2.3. Peripersonal space.

2.2.3 Extrapersonal space

Extrapersonal space is the spatial volume beyond the reachable distance of the arms. The extrapersonal space can be subdivided into four subspaces: (1) extrapersonal focal space, (2) extrapersonal action space, (3) extrapersonal scene space, and (4) extrapersonal ambient space.

2.2.3.1 Extrapersonal Focal Space

Extrapersonal focal space is an elliptical region of central foveal vision anchored in the plane of fixation, with a lateral extent of 20°-30° and a radial extent beyond 10-20 cm, as illustrated in Figure 2.4 (Rizzolatti, Gentilucci and Matelli 1985; Rizzolatti and Camarda 1987; Previc 1990a; Previc 1998). This space is associated with the retinotopic coordinate system, and its location is determined by the fixation of the eyes. It serves high-resolution visual processes that are carried out exclusively in the central visual field. Extrapersonal focal space is generally associated with visual search and object recognition, and is biased toward the upper visual area slightly outside of reaching distance.

Figure 2.4. Extrapersonal focal space.

2.2.3.2 Extrapersonal Action Space

Extrapersonal action space encapsulates the body in a 360° surround, with a range starting from 2 meters from the body to approximately 30 meters (Figure 2.5). This region appears to be active in orienting and activating attention, memory, and voluntary motor systems within topographically (as opposed to gravitationally) defined external space (Previc 1998), and is biased towards the upper visual field. It is closely linked to the remembrance of specific places or events, in accordance with the general linkage of episodic scene memory to distal space and navigation. It has been argued that extrapersonal action space incorporates an allocentric coordinate system, but neuropsychological data and lesion study results provide evidence that extrapersonal action space incorporates a gaze-centered or head-centered coordinate system.

Figure 2.5. Extrapersonal action space.
2.2.3.3 Scene Space

There is evidence for a mental model of a larger region of visible objects beyond action space. Scene space is not gaze-centered like action space, and involves an allocentrically oriented model of the larger space around the body (Figure 2.6). This space is assembled from clusters of objects whose position is defined relative to prominent features or objects in a scene (Easton and Sholl 1995; Sholl and Nolin 1997; Shelton and McNamara 2001a; Mou and McNamara 2002). There is evidence of cognitive maps organized and distorted to fit around landmarks, and evidence that priming memory for one object activates memory for objects in the cluster or regions nearby (McNamara 1986; McNamara 1989).

Figure 2.6. Scene space.

2.2.3.4 Extrapersonal Ambient Space

Extrapersonal ambient space is the outermost space of the visual field (Figure 2.7). It appears to be biased towards the lower visual field. Oriented towards a gravitational, earth-centered spatial framework, it plays a role in the maintenance of spatial orientation, balancing, self-motion (Dichgans and Brandt 1978) and postural control (Previc 1990a; Previc and Neel 1995), and allows the user to interpret self-motion in an apparently stable world (Leibowitz and Post 1982).

Figure 2.7. Extrapersonal ambient space.

2.3 Mapping Digital Information to Space in Augmented Reality Systems

A high-performance AR interface design can be constructed by mapping the natural processing properties of different portions of three-dimensional space to information placement in the AR environment. In Chapter 3, a spatial information display framework is constructed based on the literature reviewed in this chapter. Chapter 4 explores the behavioral properties of different portions of three-dimensional space.

3 Spatial Framework of Information Display in Mobile Augmented Reality Environments

A theoretical framework for three-dimensional space based on neuropsychological theories was developed through the examination of existing literature in Chapter 2, segmenting the space around a human in terms of the general use and perception of these spaces. In this chapter, these ideas and other new and existing work will be extended to develop a spatial framework specifically tailored for the presentation of information in mobile AR environments.

3.1 Spatial Information Framework

So, how can the neuropsychological spaces defined in Chapter 2 become information spaces? With motion tracking systems, there are many technological options for how information objects can be placed so as to appear stable relative to different reference frames in the spatial framework (Foxlin 2002). Based on the neuropsychological model reviewed in Chapter 2, this chapter establishes a spatial framework for information spaces, which will be referred to as Infospaces to emphasize that the spaces are designed to present information.

3.1.1 Personal-body Infospace

Information objects attached to the Personal-body Infospace remain stationary with respect to some moving part of the body. In order to attach information objects to a moving part of the body, the position and orientation of that body part need to be tracked. Information objects can be attached to any tracked moving body part, such as hands, arms, legs, or other extremities.
To examine the possible body-stabilized frames, it is of interest to examine the skeletal structure of the human body, exploring the major bone groups that can be used to define useful information frames. Figure 3.1 is an illustration of the skeletal structure of the human body. The skeleton represents the rigid structure of the moving elements of the human body and, as such, provides a set of possible tracking references that can define information frames relative to the human body. Since direct attachment to bones for tracking purposes is generally not practical, the attachment is more likely to be to the epidermis (surface of the skin). But proper placement allows the epidermal attachment to be a good approximation of the underlying bone tracking.

The concept of a personal body infospace is a very general idea. A human adult skeleton has 206 bones. Clearly, many of these are not useful from an information frame point of view (such as bones in the inner ear) or are redundant (such as the dual function of the ulna and radius or the set of bones in the rib cage). Other bones may have very limited utility in mobile AR environments (such as the bones in the feet). Figure 3.2 describes ten bone groups useful for the definition of AR information frames. Some of these groups define frames directly (such as the skull); others define sets of frames (such as the vertebral column).

Figure 3.1. Human skeletal structure (Gray, Bannister, Berry and Williams 1995).

Figure 3.2. Selected bone groups for Personal-Body Infospaces: (a) Skull (b) Vertebral column (c) Sternum and costal cartilages (d) Humerus (e) Forearm group (f) Hand (g) Femur (h) Patella (i) Leg (j) Foot (Gray et al. 1995).

The Hand Personal-body Infospace is a common infospace for manual interaction in AR environments. Tracking of hand movement is required to facilitate the creation of a Hand Personal-body Infospace. The human hand is a complex device with many bones. Figure 3.3 is an illustration of the bones of the human hand. The most basic configuration of tracking would yield 15 frames for a hand: fourteen for the phalanges, the bones of the fingers, and one for the metacarpus. A few technologies exist that can provide this level of tracking for the hand. Simple object manipulation can often be accomplished with only metacarpus tracking.

Figure 3.3. Skeletal structure of the human hand (Gray et al. 1995).

3.1.2 Peripersonal Infospace

The Peripersonal Infospace remains stationary with respect to the upper torso. Tracking of the upper torso is required to create the Peripersonal Infospace. Tracking of the sternum and costal cartilages (front part of the body) would introduce unwanted breathing motion for information objects attached to the Peripersonal Infospace. Therefore, the vertebral column is recommended as the tracking source for the Peripersonal Infospace, generally through some external attachment such as a belt that will transmit the motion from the vertebral column to a tracking device with a minimum of motion error due to epidermal layers.
The vertebral column is composed of 7 cervical vertebrae, 12 dorsal vertebrae, 5 lumbar vertebrae, the sacrum and the coccyx (Figure 3.4). It is situated in the median line of the back of the upper torso. The 7 cervical vertebrae, which form the neck, are not well suited for tracking because of the deformation of the muscles around the neck. The 12 dorsal vertebrae and 5 lumbar vertebrae are more suitable for tracking. The vertebral column clearly offers a variety of tracking points, each with unique characteristics. Tracking of the upper back (dorsal area) will create a frame that follows the body.

Figure 3.4. The vertebral column (Gray et al. 1995).

3.1.3 Extrapersonal Focal Infospace

Information objects attached to the Extrapersonal Focal Infospace remain stationary with respect to eye fixation. Tracking of eye movement is required for displaying rendered virtual elements that appear to be stable with respect to eye fixation. When eye tracking is not available, relevant information objects can be placed in a head-stabilized reference frame to grab the user's attention. The position and orientation of the head are commonly tracked in AR systems, typically through tracking of a head-mounted display that is fixed relative to the head.

3.1.4 Extrapersonal Action-Scene Infospaces

The Extrapersonal Action-Scene Infospaces are an amalgamation of the neuropsychological extrapersonal action space and scene space, which define the spatial volume of the allocentrically oriented spaces. Typically, Extrapersonal Action-scene Infospaces encapsulate task-specific working volumes such as desks, cabinets or building structures. Information objects can be attached to stationary objects in the environment without additional tracking support. Some AR systems present information attached to moving objects in the scene. In such AR systems, the position and orientation of those objects need to be tracked. There can be multiple Extrapersonal Action-scene Infospaces existing concurrently for multiple working volumes in a multitasking scenario.

3.1.5 Extrapersonal Ambient Infospaces

In current mobile AR systems, information is often presented as "world stabilized": information is fixed to real-world locations in the world coordinate frame, and the data view varies as the user changes viewpoint orientation and position. Thus, a user's viewpoint position and orientation are tracked, and the transformation from the world frame to the user's view frame is computed and used to transform virtual augmentations and objects so as to appear registered with the real world. In such a system, the only information frame is the world coordinate frame.
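The stabilization modes described in this chapter all reduce to the same rendering step: composing the tracked pose of a reference frame (world, torso, hand, or head) with a fixed offset of the information object expressed in that frame, and then mapping the result into the user's view. The sketch below is an illustration added to this edition, not an implementation from the dissertation; the tracker interface (a get_pose function returning 4x4 homogeneous poses in world coordinates) and the frame names are assumptions.

```python
import numpy as np

# Hypothetical tracker interface: returns the 4x4 homogeneous pose of a tracked
# frame ("head", "torso", "hand", ...) expressed in world coordinates.
def get_pose(frame_name: str) -> np.ndarray:
    return np.eye(4)  # placeholder; a real system would query its tracking hardware

def object_pose_in_world(infospace: str, offset: np.ndarray) -> np.ndarray:
    """Pose of an information object in world coordinates, given the infospace
    it is stabilized in and its fixed offset within that infospace."""
    if infospace == "world":          # Extrapersonal Action-scene / Ambient: fixed in the world
        return offset
    if infospace == "peripersonal":   # stabilized to the tracked upper torso
        return get_pose("torso") @ offset
    if infospace == "hand":           # Personal-body: stabilized to a tracked hand
        return get_pose("hand") @ offset
    if infospace == "head":           # head-stabilized approximation of the focal infospace
        return get_pose("head") @ offset
    raise ValueError(f"unknown infospace: {infospace}")

def to_view(world_pose: np.ndarray) -> np.ndarray:
    """Transform a world-space pose into the user's view frame for rendering,
    so the object appears registered with the real world."""
    head_in_world = get_pose("head")
    return np.linalg.inv(head_in_world) @ world_pose

# Example: a menu floating 40 cm in front of the torso (Peripersonal Infospace).
menu_offset = np.eye(4)
menu_offset[2, 3] = 0.4
menu_in_view = to_view(object_pose_in_world("peripersonal", menu_offset))
```

Under this formulation, switching an object between Infospaces only changes which tracked pose its offset is composed with; the rendering path is otherwise identical.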
3.2 Summary

Based on the existing literature in neuropsychology, a spatial information framework is constructed for information organization in mobile AR computing environments, as summarized in Figure 3.5. The spatial information framework consists of five information spaces, or Infospaces. Chapter 4 reviews a collection of literature about the behavioral properties of each Infospace.

Figure 3.5. Spatial framework for information display in mobile augmented reality environments.

4 Behavioral Properties of Three-dimensional Space

The automatic neurological activities for the space around the user can be leveraged for information organization. In order to develop a theory-based interface design, it is important to determine how the human brain perceives and reacts to objects at different spatial locations in three-dimensional space. This chapter reviews existing research about the behavioral properties of the Infospaces that were defined in Chapter 3, with a mind toward utilizing the spaces for information presentation.

4.1 Behavioral Properties in Personal-body Infospaces

Surfaces on the human body can be used as information spaces. The use of a wristwatch places an information device directly on the surface of the skin. However, beyond this simple functionality and the occasional jotting of notes on the skin or decorative uses such as tattoos, physical body space is rarely seen as a possible information space for interaction with information in any form. Tool storage and manipulation using belts or pockets is very common. However, visual displays such as clothing, tattoos, makeup, and other body-attached items are usually not used for communication, particularly with the user himself. They are representations used for signaling information such as social status, sexual availability, and other social information to other observers. Technology now allows the augmentation of the environment with computer-generated virtual content. If a body part can be tracked, a computer can register graphics with the body part and display them in various forms that allow the user to perceive them as placed on or near the body part. This use of the body surface in AR systems to hold and display private information for the user has considerable potential for providing information to users in a familiar territory, but actual implementations of this idea remain rare.

There are neuropsychological advantages to placing virtual tools such as icons, buttons, and other digital objects on or very near the body surface. Neuroscience data suggests that neuronal responses to body space may extend slightly beyond the skin surface: the visual space near the animal is represented as if it were a gelatinous medium surrounding the body that deforms whenever the head rotates or the limbs move. Such a map would divide the location of the visual stimulus with respect to the body surface, in somatotopic coordinates (Graziano et al. 1995, p. 1031). Tracking the position and orientation of individual body parts allows digital information to be attached to the moving body (Owen, Biocca, Tang, Xiao, Mou and Lim 2005). This section reviews some important behavioral properties of the Personal-body Infospace that are relevant to information communication.

4.1.1 Proprioception

Proprioception is the unconscious perception of movement and orientation of the body arising from sensing mechanisms within the body itself. The neuropsychological literature suggests that both vision and proprioception contribute to the establishment of spatial representation (Chance, Garnet, Beall and Loomis 1998; Shelton and McNamara 2001b; Yamamoto and Shelton 2005). Literature in the field of neuropsychiatry suggests that an essential contribution from the basal ganglia to the integration of visual and proprioceptive information is required for achieving high accuracy in pointing tasks (Adamovich, Berkinblit, Hening, Sage and Poizner 2001; Keijsers, Admiraal, Cools, Bloem and Gielen 2005). In the design of AR interfaces, additional proprioceptive cues for information objects in the Personal-body Infospace have the potential to increase pointing and manipulation accuracy and naturalness.
If information objects are associated with body parts, proprioception assists in the knowledge of the location of the associated object. In everyday life, human-beings do use the body as a medium for communication, information display, and storage in a limited fashion. The aforementioned watch example is the use of the wrist to display of time and date and for storage of other personal information. Workers attach tools on a waist tool belt for easier and faster access to the tools. The Personal-body Infospace naturally becomes an intuitive information space for metaphorical personalization in AR computing environments. Lehikoinen (2000) proposed the idea of a shirt embedded with an array of pressure sensors that allows the user access to and interactively manipulation of digital information mapped to locations on the torso. Aside flom the motor advantage of easy pointing and manipulation, Personal-body Infospaces allow the user to develop metaphorical associations and proprioceptive memory between information objects or control functions and spatial locations on the user’s body. Some real life examples of using the body for metaphorical association include the parachute control for skydiving and fishing vests for storing tools on different positions relative to the torso. Metaphorical associations between digital information and control functions to body 30 positions developed by individual users have a great potential for a more intuitive interface with higher performance. The habituation and proprioceptive memory established could provide a faster and more accurate access, retrieval, and manipulation of information objects and control functions. 4.1.2 Spatial Bias in Personal-body Infospace Human visual, auditory, and haptic systems for perception and motion are strongly skewed to maximum performance in the ventral (frontal) regions of the body (Corballis and Beale 1983; Corballis 1993). The dorsal regions (back of the body), in general, exhibit decreased sensory resolution, are less accessible by the hands, and are not visible by the user’s eyes. Therefore, the back of the body is clearly not ideal for holding digital information that must be viewed or manually manipulated. The Personal-body Infospace is further biased towards the upper body, where the body parts are reachable by the hands. 4.1.3 The Hand and F orearms Personal-body Infospaces There are cases in current VR and AR practice where information is attached to a limb-stabilized space. In VR interfaces, hand-stabilized information systems often present relevant information such as a crude cursor, virtual representations of the hand, or tools that appear to be attached to or operated by the hand. Information objects attached to a hand-stabilized reference frame should be action orientated. For example, tools selected for the current action, menus, and selection trays can be attached to the non- dominant hand, and the dominant hand can be used for selection and action and manipulation (Figure 4.1). Issues of handiness and kinesthetic asymmetry of the Personal-body Infospace will be discussed in Section 4.4.1. 31 Figure 4. 1. Hand Personal-body Infospaces: a menu attached to the non-dominant hand and an interaction tool (the ring) attached to the dominant hand. 32 4.2 Behavioral Properties in Peripersonal Infospace Information objects placed in the Peripersonal Infospace remain arm-reachable regardless of the user’s position and orientation in the world, providing quick and easy access to objects placed in the flame. 
As there is no real world equivalent of a Peripersonal Infospace with associated, yet detached, objects in the physical world, experience in the real world does not prepare users for data presentation where two- dimensional and three dimensional data objects appear to hover weightlessly in an egocentric reference flame, and the behavioral properties in Peripersonal Infospace are largely unknown. 4. 2. 1 Spatial Biases of Information in Peripersonal Space The primary action in the Peripersonal Infospace is object reaching, grasping and manipulations. Ifdifferent spatial locations in Peripersonal Infospace have different cognitive and behavioral significance, there are design advantages and disadvantages to placing information in different spatial positions in the Peripersonal Infospace. Previous research indicates that the perceptual, cognitive, behavioral and biomechanical properties of space are inherently and, sometimes, fimdamentally asymmetrical (issues of perceptual and kinematics asymmetries in egocentric spaces will be discussed in Section 4.4.1 and 4.4.2). Reaching movements are biased towards the middle 60° of the body (Mountcastle 1976; Servos, Goodale and J akobson 1992), the lower visual field and the lower volume of peripersonal space (Previc 1990a; Sheliga, Craighero, Riggio and Rizzolatti 1997). In an experiment by Biocca, Eastin and Daugherty (2001), it was found that participants were at least 175% and up to 930% faster (average 313%) at locating targets 33 and placing objects at target locations in the central area of the peripersonal space. Target search and object placement was also found to be significantly faster by 67% within the right side than the left side. A quadrant effect emerged favoring the search, reach, and manipulation for objects in the lower-right quadrant in peripersonal space. This perceptual and motor advantage extended into a memory advantage for recall of the location of objects and recognition for the information objects participants manipulated. 4.3 Behavioral Properties in Extrapersonal Focal Infospace A key issue in mobile systems is the allocation of spatial attention. The Extrapersonal Focal Infospace is a floating volume centered at the spatial location the user is currently paying attention to. Experimental user interfaces have attempted to harness this high-bandwidth space with eye pointers or “eye mice”. A head stabilized reference flame can be used to display information related to the extrapersonal focal space when eye tracking is not available. However, information objects attached to the head stabilized reference flame remain stationary with respect to the head. This differs flom a true eye fixation-based focal infospace, but closely approximates the concept of a space stabilized relative to the vision field. The main functions of the extrapersonal focal space are object searching and recognition. Objects in the environment are often recognized in the extrapersonal focal space before being brought into the peripersonal or personal-body space. As eye tracking technology inside AR environments is technologically impractical at this time, this section will focus on important behavioral properties of a head stabilized reference flame and human spatial attention. 34 4. 3. 1 Visual Clutter in Head Stabilized Reference Frame Head-stabilized reference flames are the most common space for presenting non- task-related information in AR and wearable computing environments. No tracking is necessary for the presentation of head-stabilized data. 
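The practical difference between head-stabilized and world-stabilized presentation can be made concrete with a short sketch: head-stabilized content is authored directly in view (head) coordinates and needs no tracking data, whereas world-stabilized content must be brought into the view frame through the inverse of the tracked head pose. The names and the 4x4 matrix convention below are illustrative assumptions.

```python
import numpy as np

def world_stabilized_to_view(point_world, head_pose_world):
    """Transform a world-fixed point into the view frame using the tracked
    head pose (a 4x4 world-from-head matrix); this path requires tracking."""
    view_from_world = np.linalg.inv(head_pose_world)
    p = np.append(point_world, 1.0)          # homogeneous coordinates
    return (view_from_world @ p)[:3]

def head_stabilized_position(point_view):
    """Head-stabilized (HUD-style) content is specified directly in view
    coordinates, so it is used unchanged; no tracking data is needed."""
    return point_view
```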
There is extensive research in displaying information for drivers and pilots thru Head-up Displays (HUDs) and HMDs. A study conducted by Haines, et al. (Haines, Fischer and Price 1980) indicated that pilots who use an HUD have less head and eye movement when compared to pilots that use traditional displays in the cockpit panels. However, several reports indicate that optically overlaid information cannot be processed in parallel (Neisser and Becklen 1975; Becklen and Cervone 1983; McCann, Foyle and Johnston 1994). Others have reported that there is a reaction latency associated with cognitive switching among the environment and the overlaid information (Fisher, Haines and Price 1980; Weintraub, Haines and Randle 1985; Larish and Wickens 1991), and symbology placed within a 5 degree radius of the fovea is annoying to drivers (Sojourner and Antin 1990; Inzuka, Osumi and Shinkai 1991). These research results suggest that only a small amount of information can be placed in the Extrapersonal Focal Infospace, and the central visual field should be reserved to avoid visual clutter to the real environment. 4.3.2 Perceptual Fading of Visual Stimulus It is well known that perception of sustained and constant sensory input attenuate over a period of time ranging flom seconds to minutes. This “perceptual fading effect” has been well documented in perceptual psychology in vision (Ditchbum and Ginsborg 1952; Riggs and Ratliff 1952; Riggs, Ratliff, Comsweet and Comsweet 1953; Krauskopf and Riggs 1959; Heckenmueller 1965), audition (Hood 1950), touch (Hoagland 1933), 35 smell (Eugen 1982) and taste (Abrahams, Krakauer and Dallenbach 1937). One everyday example of this phenomenon for the visual sensory channel is that dirt particles on eyeglasses will perceptually disappear after a few seconds if the person is not intentionally paying attention to it. While information attached to the head stabilized reference flame is always visible to the user, interface designers need to be aware that these information objects could perceptually disappear over a period of time ranging flom seconds to minutes. 4. 3.3 Spatial Bias in Head Stabilized Reference Frame Human visual attention is biased towards the central area of the head stabilized reference flame. However, the central area should be reserved to avoid visual clutter to the real environment and to avoid degradation of navigation. It is suggested that information objects be placed at the peripheral area of a head stabilized reference flame (Mch et al. 1994). 4.4 Behavioral Properties in Relation to Egocentric Infospaces An Egocentric Infospace is registered with and moves with some part of the body. This attachment personalizes the space for a given user and allows the space to follow the user or specific user appendages. User interface elements residing in an egocentric Infospace appear to be attached to the user, either directly or through some invisible attachment. The Personal-body Infospace, Peripersonal Infospace and Extrapersonal Focal Infospace are all egocentric Infospaces. An interesting characteristic of egocentric Infospaces is the existence of spatial biases due to asymmetries of the brain and body that effect perception of and interaction with user interface components in this very personal space around the body. 36 Psychological research has demonstrated that human behavior consistently exhibits egocentric spatial biases. 
There are well understood perceptual asymmetries in psychology and neuroscience for the left/right, upper/lower, and near/far visual fields. Motor actions are also highly asymmetric due to handedness. These asymmetries influence the perception and interaction of information at various spatial locations. Placing an object at different locations could significantly alter the cognitive process. These effects are relatively benign for traditional user interfaces due to fixed placement and layout of physical interface components (i.e. display, keyboards and the mouse). Due to the limited field of view of small display devices and the fact that information objects are usually attached to allocentric reference frames, the spatial locations either do not consistently stay on one side of any of the known zones of asymmetry or varying placement is not an option at all due to limited screen size. Egocentric Infospaces allow placement of information that moves in a manner directly related to body elements. Hence, placement of interface elements can be managed in relation to know spatial asymmetries, allowing spatially significant regions around the body to be exploited in egocentric Infospaces. 4. 4. I Kinematics Asymmetry One of the advantages of immersive AR interfaces is that users can apply intuitive birnanual interaction, the use of both hands in interface tasks. Bimanual interaction in conventional user interfaces is generally limited to the hand-cooperative task of typing. It is well-known that human motor skills are asymmetric due to handedness and cerebral lateralization. This thesis will examine issues of kinematic asymmetry in the context of birnanual action. 37 Unimanual tasks, tasks that can be completed by one hand, are usually biased towards the dominant hand. So it is a simple design guideline that the system should present simple pointing and selection tasks on the side of the user’s dominant hand. The dominant hand is excellent at precise, corrective and rapid movements, while the non- dominant hand usually acts in a supporting role or as a flame of reference for the dominant hand. Guiard’s Kinematic Chain Theory (Guiard 1987; Guiard and F errand 1995) provides a theoretical flamework for the role of the hands in bimanual activities, and how the actions of the two hands work complement each other. The theory classifies bimanual asymmetric actions in the following three classes: 1. Spatial Reference in Manual Motion: motion of the dominant hand is often based on a spatial reference defined by motion of the non-dominant hand. The roles of the non-dominant hand include a physical stabilizing action (e. g. stabilizing the paper when writing), defining steady states (e. g. putting the non-dominant hand in flont when hitting a tennis ball with a racket using the dominant hand), or defining a spatial reference. Bimanual tasks that involve information objects requiring a physical stabilizing action, defined steady states, or a defining spatial reference should be placed on the non-dominant side. 2. Contrast in the Spatial-Temporal Scale of Motion: the dominant hand has a considerably finer spatial and temporal motor resolution. Information objects for bimanual interactions that require macrometric movement should be presented to the side of the non-dominant hand, and tasks that require micrometric movement should be presented to the side of the dominant hand. 38 3. 
Precedence in Action: the non-dominant hand is typically the first participant in a bimanual interaction, with its motion preceding that of the dominant hand. A single interface element that requires bimanual interaction should therefore be presented on the side of the non-dominant hand.

Guiard's Kinematic Chain Theory provides a framework for the design of user interface components based on bimanual computer interaction. Interchanging subtasks between the two hands can degrade task performance, and design choices that encourage such interchanges should be avoided. Presenting information on the correct side of the body results in faster and more natural access to the relevant information objects by the hands. Tasks that are naturally designed for bimanual interaction are best placed on the non-dominant side of the body so as to encourage reach and acquisition by the non-dominant hand, rather than an acquisition by the dominant hand that may force a transfer to the non-dominant hand for stabilization or referencing. Since the non-dominant hand is typically the initiator in a bimanual interaction, forcing reach on the dominant side by placement can delay the onset of bimanual interaction and, again, force a transfer.

4.4.2 Perceptual Asymmetries

Perceptual asymmetries refer to the asymmetric properties of human perception in different visual fields. It is well known in psychology that humans perceive the same stimulus material differently when it is presented in different spatial locations. The location of an object affects cognitive processes in the brain.

4.4.2.1 Bilateral Asymmetry

The concept of contralaterality (the difference in information processing between the two sides of the brain) was documented as early as 2500 B.C.E. by the ancient Egyptians (Hecaen and Albert 1978). The human mind consistently exhibits a left-right bias in visual perception and information processing. Information in the left visual field of both eyes is sensed by the right side of the retinas and then transmitted over the visual pathway leading to the visual cortex of the right hemisphere. Similarly, information in the right visual field is sensed by the left side of the retinas and then transmitted over the visual pathway leading to the visual cortex of the left hemisphere. Figure 4.2 illustrates these visual pathways. Hence, visual information from the left and right visual fields projects exclusively to the contralateral cerebral hemispheres (Bryden 1982).

Research in visual perception is often based on the use of conventional tachistoscopic techniques to test hypotheses about perceptual bilateral asymmetry effects. A subject is asked to fixate on the center of a screen, and stimulus materials are flashed to either the left or the right side of the visual field. Reaction time and/or task accuracy are measured. The reaction time differential measured in this experimental technique is very short; typically, 100 milliseconds is considered a significant effect (Solso 1998).
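As a rough illustration of how such a bilateral asymmetry is scored, the sketch below averages reaction times by visual field and reports the left-right differential. The trial format and values are hypothetical assumptions, not data from any study cited here.

```python
from statistics import mean

# Each trial: (visual_field, reaction_time_in_ms); illustrative values only.
trials = [("left", 512), ("right", 441), ("left", 498),
          ("right", 455), ("left", 530), ("right", 460)]

left_rt = mean(rt for field, rt in trials if field == "left")
right_rt = mean(rt for field, rt in trials if field == "right")

# A positive differential indicates faster responses for right-visual-field
# (left-hemisphere) presentation of this class of stimuli.
print(f"left {left_rt:.0f} ms, right {right_rt:.0f} ms, "
      f"differential {left_rt - right_rt:.0f} ms")
```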
To summarize the experimental results obtained by various researchers using the tachistoscopic technique and lesion studies: the left hemisphere (right visual field) is found to be biased towards letters and words, language (Strauss 1998), functional or symbolic meaning, verbal memory, local patterns (Robertson and Lamb 1991; Yovel, Yovel and Levy 2001), higher spatial frequencies (Sergent 1983; Sergent 1987), categorical spatial relationships (Kosslyn 1987), and time; while the right hemisphere (left visual field) is found to be biased towards geometric patterns, visual appearance, visual memory, global patterns, lower spatial frequencies, coordinate spatial relations, emotion (Dimond and Farrington 1977), face recognition, and sustained attention (Whitehead 1991). These experimental results are summarized in Table 4.1.

Table 4.1. Summary of cerebral hemispheric specializations.

  Left hemisphere (right visual field)     Right hemisphere (left visual field)
  Letters and words                        Geometric patterns
  Functional or symbolic meaning           Visual appearance
  Verbal memory                            Visual memory
  Local patterns                           Global patterns
  High spatial frequencies                 Low spatial frequencies
  Categorical spatial relations            Coordinate spatial relations
  Time                                     Emotion
                                           Face recognition
                                           Sustained attention

Figure 4.2. The visual pathway of the left and right visual fields (the figure labels the optic tract). Retinal signals from the left and right visual fields project exclusively to the contralateral cerebral hemispheres.

4.4.2.2 Perceptual Asymmetry: Upper vs. Lower Visual Field

Besides the well-known left and right hemispheric specialization, there are also perceptual and behavioral asymmetries between the upper and lower visual fields and between the far and near visual fields. The optic nerves direct the retinal signals to the optic chiasm, where signals from the left and right visual fields are divided and fed to the contralateral hemispheres, as shown in Figure 4.2. Figure 4.3 illustrates the further processing of visual information in the brain. After the optic chiasm, retinal signals proceed through the optic tract to the lateral geniculate nucleus (LGN). From there, the visual pathways of the upper and lower visual quadrants split into two routes: the pathway of the upper visual quadrant takes a longer route through the temporal lobe (known as Meyer's loop) before it heads to the occipital lobe. The representation of the upper and lower visual quadrants is anatomically discontinuous in the extrastriate areas (V2 and higher), with visual area V1 physically separating them (Rubin, Nakayama and Shapley 1996).

Figure 4.3. The visual pathway of the upper and lower visual fields (the figure labels the occipital lobe and Meyer's loop).

Previc (Previc 1990a) argued that most reaching and grasping behavior occurs in the lower visual field, whereas visual search and object recognition occur in the upper visual field. Consequently, the lower visual field became specialized for reaching and other visuomotor activities and the upper visual field became specialized for visual search and object recognition (Previc and Blume 1993). Previc also argued that visual attention can be subdivided into two major systems: (1) a peripersonal system that assists in reaching and other visuomotor activities, which is biased towards the lower visual field; and (2) an extrapersonal system that is used in visual search and scanning, which is biased towards the upper visual field.
Objects in the near space are primarily processed by visuomotor 44 systems for reaching and grasping, whereas objects far flom an observer are primarily processed with visual search and object recognition. An upper-field bias in visual search/scanning has been shown in studies that used single-fixation search field presentation and those that allowed flee eye-movement search. According to these results, the upper visual field is better for presenting dynamic information, which requires object recognition and visual search, whereas the lower . visual field is better for presenting static information that has already been recognized or that requires visuomotor activities. Experimental results flom studies conducted by Wade, et al. demonstrated that the lower visual field is specialized for perceiving shape flom shading information whereas the upper visual field is specialized for perceiving shape flom edge-based information (Wada, Saijo and Kato 1998). Rubin, et al. (Rubin et al. 1996) also found that the lower visual field performed much better than the upper visual field in the segmentation of an image into figures and backgrounds. Previc (Previc et al. 1993) investigated visual search performance as a function of a target’s location in space. The ability to find a target shape was best when it was presented in the upper-right visual field and was closest to the fixation point in its depth and eccentricity. This result is consistent with the individual results for spatial subdivision. 4.5 Behavioral Properties in Extrapersonal Action-scene Infospaces In most current AR systems, information is presented in Extrapersonal Action- scene Infospaces, i.e., objects appear to be stationary relative to the surrounding environment. In this scenario, virtual objects provide task-specific augmentation or online information about the real environment (Caudell and Mizell 1992; Feiner, MacIntyre and Seligrnann 1993; Feiner, MacIntyre, Tobias and Webster 1997; Tang, 45 Owen, Biocca and Mou 2003). These systems require proper registration between the real and the virtual environment so the virtual object appears to be stationary and in the correct location corresponding to the real environment. By spatially relating virtual information to physical objects and locations in the real world, AR provides the human cognitive system with strong additional leverage in many tasks. 4. 5. 1 Spatial Consistency of Information Objects with the Environment By “seaming” the information to the real environment, AR technologies are used “as a complement of human cognitive processes” (N eumann and Majoros 1998). There is evidence that the cognitive load for processing virtual information objects can be reduced when information objects are spatially consistent with the environment. The cost for information search and attention switching between a workpiece and detached media (such as a paper manual) can be reduced by spatially placing task related information in the correct spatial location. Tang et al. (Tang et al. 2003) designed and evaluated an AR system with spatially registered three-dimesional instructions that directs the user during an assembly process using spatially registered instructions stabilized to the workpiece. Experimental data demonstrated that subjects using the AR system achieved a lower error rate and perceived a lower mental effort compared with subjects using other traditional instructional media (paper and conventional screen-based presentations, for example). 
The experimental results seem to indicate that the cognitive system processes spatial information and operations (e.g. mental rotation, spatial memory and spatial updating) of virtual objects along with real objects and the environment, and, as a result, spatially registered instruction presentation relieves the mental effort for processing spatial information for the virtual objects. Psotka (Psotka) conducted an experiment to evaluate 46 visual memory of pictures in three conditions: (1) a Monitor Condition with the pictures being displayed in a stationary monitor, (2) a Virtual Reality Condition with the pictures floating in the air around the user in a virtual environment, and (3) an Augmented Virtual Reality Condition with the pictures projected on the physical wall of the experiment room. The results show that subjects in the Augmented Virtual Reality Condition recalled twice as many items as those recalled by subjects utilizing either of the other two conditions. The author interpreted the increased memory effect as a result of spatial consistency of virtual objects to the real-world coordinate system. 4. 5 .2 Remote Interaction for Information Objects in Extrapersonal Action-scene Infospace: Selection and Manipulation Information objects in an Extrapersonal Action-scene Infospace are attached to allocentric reference flames. Their visibility and reachability are dependent upon the user’s location and orientation. Thus, information objects may fall outside the reachable distance of user’s hands as the user navigates in the environment. In some cases, information objects can be reachable by navigating towards the objects so that they are within reachable range, while in other cases, it would be more convenient to interact with the information remotely. There are two types of remote interaction: selection and manipulation. Selection is choosing an object in three-dimensional space. Manipulation involves the selection of an object, modifying its position and orientation, and then releasing it. There have been many studies in VR and traditional human-computer interaction that explore different options in object selection and manipulation. This section explores the studies relevant to, or leading to, ideas on selection and manipulation in AR user interfaces. 47 4.5.2.1 Remote Object Selection Different body parts (such as the finger, hand, and head) can be used as interface elements for selection of remote objects. A measurement of difficulty of input devices for a pointing task can be calculated using Fitts’s Law (Fitts 1954; Fitts and Peterson 1964). Fitts’s Law states that the time required to complete a pointing task is directly proportional to the distance to the target, and inversely proportional to the width of the target. In other words, the closer and/or the larger the target, the shorter the time required for the pointing task. F itts’s Law has been shown to be valid under a wide variety of circumstances, including movement of different body parts (such as fingers, hands, arms, foot, head, and eye-gaze), pointing tasks of different pointing devices (such as mouse, joystick, touch pad, trackball, touch screen), varying physical environments (such as underwater), and diverse user populations (such children, aged, different gender). F itts’s Law can be used to compare and contrast performance of the same pointing task using different input mechanisms. 
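The proportional relationship described above is usually expressed as a movement time that grows with a logarithmic index of difficulty. A minimal sketch using the widely used Shannon formulation of the index is given below; the regression constants a and b must be fitted empirically for each body part or input device, and the values shown are placeholders rather than constants from the studies cited here.

```python
import math

def index_of_difficulty(distance, width):
    """Shannon formulation of Fitts's index of difficulty, in bits."""
    return math.log2(distance / width + 1)

def movement_time(distance, width, a=0.1, b=0.15):
    """Fitts's law: MT = a + b * ID, where a and b are device- or
    limb-specific constants obtained by linear regression
    (placeholder values are used here)."""
    return a + b * index_of_difficulty(distance, width)

# A closer and/or larger target yields a lower ID and a shorter predicted time.
print(movement_time(distance=0.40, width=0.05))   # harder target
print(movement_time(distance=0.10, width=0.05))   # easier target
```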
Experimental data by Langolf (Langolf 1973) shows that performance of a pointing task using the head has the highest Fitts’s Index of Difficulty (i.e. most difficult) followed by the arm, the hand, and then the finger. In other words, pointing using the head is very difficult, both in terms of precision and time of completion, followed by the arm, the hand, and then the finger. This result suggests that body parts used for pointing and selection tasks should be chosen based on the order of finger, hand, arm, and then head. 4.5.2.2 Remote Object Manipulation A few studies in VR interaction techniques have been conducted to explore techniques to manipulate virtual objects outside the reachable distance of the arms. In the 48 Ray-casting technique, a user utilizes a virtual light that extends flom the hand to select and manipulate objects. With the Ray-casting technique, the user aims at the target object using this virtual beam of light. When an appropriate target object has been selected, usually through pressing a button held in the hand, the target object is attached to the virtual light, and spatial location and orientation can be manipulated with simple hand movements. The object is then released, usually due to a button release or secondary button press event and the manipulation is complete. Ray-casting is a highly intuitive technique. However, manipulation using the Ray-casting technique exhibits a “lever-arm problem”: the selected object is attached to the end of a long lever arm, making distance manipulation and arbitrary rotational control of the object impossible. Mine (Mine 1996) developed the CHIMP (Chapel Hill Immersive Modeling Program) interaction technique for distant objects in virtual environments. Similar to the Ray-casting technique, Mine’s remote manipulation technique uses a spotlight attached to the hand for target object selection. Once the object is selected for manipulation, a pop- up menu appears so the user can specify the translation, rotation and other manipulation values through a pop-up keyboard or scroll-bars. While this technique allows the user to manipulate target object accurately by entering precise numeric manipulation values, it is very unintuitive and unnatural. The Arm-extension technique, or the Go—Go technique (Poupyrev, Billinghurst, Weghorst and Ichikawa 1996), incorporates an extendable virtual hand such that a user can grab remote objects in the virtual environment. The Go-Go metaphor is implemented by using a nonlinear function for mapping the movement of the user’s physical hand to the movement of the virtual hand. The user reaches the target object by extending the 49 physical hand toward the object of interest, and the virtual hand extends to a remote location through the mapping of the non-linear fimction. This technique allows full six degree of fleedom manipulation of target objects, and manipulation is very intuitive once the target object is selected. However, object selection is difficult because of the non- linear mapping of the virtual hand, overriding proprioceptive cues and making the control of movement of the virtual hand very difficult. The World in Miniature Technique (Stoakley, Conway and Pausch 1995) provides a miniature replica of the world where the user can manipulate all the objects in the replica within reachable distance. Manipulation of objects in the miniature world maps to full scale manipulation of objects in the actual world. 
This technique allows users to manipulate all objects freely regardless of the user's location and orientation. However, it is hard to accurately perform micro-manipulations in the miniature world when the scaling from the miniature world to the full-sized world is large. Furthermore, there is a mental effort overhead for performing the spatial transformation between the miniature world and the full-scale world.

In the HOMER (Hand-centered Object Manipulation Extending Ray-casting) technique (Bowman and Hodges 1997), a user first selects the target object as in the Ray-casting technique. Once the target object is selected, the orientation of the object is controlled by the orientation of the user's hand, and the position of the object is controlled by the position of the user's hand relative to the user's body. Since there are limits on the range of hand position and orientation changes within the reach envelope, this method, while intuitive, can only effect limited changes in object orientation and position.

In the Voodoo Doll technique (Pierce, Stearns and Pausch 1999), a user first selects the target object as in the Ray-casting technique. Once the target object is selected, a replica of that object appears in front of the user, and manipulation of the replicated object results in a corresponding manipulation of the remote target object. This technique achieves rather satisfactory accuracy and control in orientation manipulation, but the direction and scale of position manipulation are not clear to the user during the manipulation. An improved Voodoo Doll technique (Pierce and Pausch 2002) was created by adding a reference point to both the target object and the replicated object after selection. There are reports indicating that users are occasionally confused as to which of the two objects is the voodoo doll.

There are advantages and disadvantages to every manipulation technique. Characteristics of each manipulation technique that illustrate these advantages and disadvantages are summarized in Table 4.2. There is no standard interaction technique that will work for all applications; AR interface designers will need to analyze the requirements of a specific application before choosing a manipulation technique.

Table 4.2. Summary of pros and cons of different remote object manipulation methods.

  Ray-casting         Pros: extremely intuitive.  Cons: limited freedom in position and orientation manipulation.
  CHIMP               Pros: precise manipulation.  Cons: magnitude input is cumbersome.
  Arm-extension       Pros: intuitive manipulation.  Cons: object selection and position manipulation are hard to control.
  World in miniature  Pros: manipulation of all objects at any time.  Cons: low accuracy in position manipulation; mental effort overhead for spatial transformation.
  HOMER               Pros: relatively intuitive.  Cons: limited freedom in orientation manipulation.
  Voodoo Doll         Pros: relatively intuitive.  Cons: occasional confusion between the control doll and the target object.

4.5.3 Unregistered Extrapersonal Action-scene Infospace

Extrapersonal Action-scene Infospaces do not necessarily need to be registered with the real environment. Unregistered Extrapersonal Action-scene Infospaces are spatially independent of the real environment. The volume of an Extrapersonal Action-scene Infospace is much larger than that of the egocentric reference frames, and is extendable without limit. Unregistered Extrapersonal Action-scene Infospaces are suitable for applications with large volumes of information objects.
It can be used as a working volume for browsing, searching, and management of non-task-specific data. 4.6 Behavioral Properties in Extrapersonal Ambient Infospace Extrapersonal-ambient space is the outermost space of the visual field. In the human cognitive system, this space is primarily used for motion perception, maintaining spatial orientation and postural control. Extrapersonal Ambient Infospace is not commonly used for displaying digital information or user interface design. In a real world scenario, extrapersonal ambient space is often the conveyer of implicit information such as time of the day as indicated by the status of the sun or moon in the sky and relative location in space as indicated by landmarks. Information in this space is Earth- fixed and generally not task or object specific. 4. 6. 1 Spatial Bias in Extrapersonal Ambient Infospace Extrapersonal Ambient Infospace is biased towards the peripheral visual field (Dictgans et al. 1978; Leibowitz et al. 1982; Previc et al. 1995). Information in a person’s peripheral vision is usually processed without conscious attention. The Extrapersonal Ambient Infospace is also biased towards the lower visual field (Foley and 53 McChesney 1976; Telford and Frost 1993; D'Avossa and Kersten 1996) for the perception of vection and optical flow on the ground during forward locomotion. 4. 6. 2 Linear Perspective and Motion Perception Properties The visual cues that are the most significant in the extrapersonal ambient space are those related to motion perception and spatial orientation, such as horizontality cues, linear perspective, and optical flow. Extrapersonal ambient space is particularly sensitive to motion information in all three types of angular motion (yaw, pitch and roll), as well as inward linear motion (Wallach 1987). Extrapersonal ambient space is less sensitive to linear motion moving outward, side to side, and up and down. The evolutionary or developmental explanation for this property is that human beings rarely walk backward, side to side, or up and down. So the human brain was evolved or developed to be specialized in one type of linear motion visual processing. 4.7 Summary The literature reviewed provided a solid basis for mapping spatial cognitive properties of different Infospaces for the design of AR environments. At the same time the reviewed literature show a gap in research about other unexplored cognitive properties in spatial flamework that is useful for the design of AR environments. Chapter 5, 6 and 7 present three sets of experiments that address important research questions in relation to spatial flameworks. The answers to these questions can be use to optimize spatial placement of information in AR environments. 54 5 Reference Frames in Mobile Augmented Reality Displays The first question about behavioral properties in Peripersonal Infospace is: Can the human cognition system manage to process information objects attached to the egocentric reference flame naturally? For example, when users turn left with their eyes closed, will the mind's information mapping assume a surrounding information array in an egocentric frame will move with the body or will the cognitive systems assume the objects will stay still with respect to the world? Is this cognitive behavior fixed or does it adapt in the presence of new information display techniques? Can the human cognition system process information objects attached to an egocentric reference frame? 
It is clear that each flame of reference has its own advantage in some applications. However, it is not clear how to manipulate a user’s preference of flames of reference according to different applications. Three experiments were conducted to investigate the default reference flame for spatial memory, and how to manipulate human spatial cognitive systems to adapt to a different reference flame (Mou, Biocca, Owen, Tang, Xiao and Lim 2004a). 5.1 Related Works This thesis presents the first specific research into human spatial memory and spatial updating of “weightless” information array in AR systems. However, some answers to the above questions may be suggested by human spatial memory and spatial , updating of real objects in the physical world. There is a large body of evidence indicating that human spatial cognition updates locations of objects during locomotion (for example, Levine, Jankovic and Palij 1982; Rieser, Guth and Hill 1986; Rieser 1989; 55 Presson and Montello 1994; Farrell and Robertson 1998; Simons and Wang 1998; Wang and Simons 1999; Sholl and Bartels 2002; Waller, Montello, Richardson and Hegarty 2002; Mou, McNamara, Valiquette and Rump 2004b; Mou, Zhang and McNamara 2004c). For example, participants in one of Waller et al.’s (2002) experiments learned 4- point paths. In the “stay” condition, participants remained at the study position and made pointing judgments flom headings of 0° and 180° (“aligned” vs. “misaligned”). The results in this condition replicated several other studies of spatial memory in showing that performance was better for the imagined heading of 0° than for the imagined heading of 180° (e. g. Levine et al. 1982). In the “rotate— update” condition, participants learned the layout and then were told to turn 180° in place so that the path was behind them. Performance was now better for the heading of 180° (the new egocentric heading) than for the heading of 0° (the original learning heading). This result indicated that, as they turned, participants updated their orientation with respect to the locations in memory. Simons and (see Simons et al. 1998; Wang et al. 1999) investigated the interaction between observer movement and layout rotations on change detection. They showed that detection of changes to a recently viewed layout of objects was disrupted when the layout was rotated to a new view and the observer remained stationary, but there was no disruption when the layout remained stationary and the observer moved to the new viewpoint. In other words, updating was efficient when the observer moved around the layout but not when the layout rotated in flont of the observer (see Wraga, Creem and Proffitt 2000, for analogous results in imagined updating). In a recent study, Mou, McNamara, et al. (2004b) reported that the angular distance between both the imagined heading and the learning heading and the imagined 56 heading and the actual heading had effects on people’s ability to accurately point to objects in the environment. Participants in one of their experiments learned the locations of 10 objects flom a single view (e. g., a vase was located next to the learning position; see Figure 7 of that article), walked to the center of the layout (e. g., next to a shoe), and faced three headings before making pointing judgments flom imagined headings. There were three imagined headings, 0° (e. 
g., “Imagine you are facing the phone), 90° (“Imagine you are facing the banana”), or 225° (“Imagine you are facing the jar”), and there were two angular distances between the imagined heading and the actual heading, 0° (e.g., participants actually faced the phone and were instructed to imagine facing the phone) or 225° (e. g., participants actually faced the book and were instructed to imagine facing the phone). Pointing performance was best when the imagined heading was parallel to the learning view. Pointing performance was also better when the actual and the imagined headings were the same. Mou, McNamara, et al. proposed that people both represent locations of objects in terms of an obj ect-to-object flame of reference selected by the egocentric view (also see Mou et al. 2002; Mou et al. 2004c) and update their location and orientation in terms of that flame of reference during locomotion. In Experiment 1, using the paradigm developed by Mou, McNamara, et al. (2004b), we investigated whether people with no experience in mobile AR systems would use the environment-stabilized or body-stabilized flame of reference as the default. We hypothesized that if participants used the body-stabilized flame of reference, the angular distance between the imagined heading and the actual heading would not affect pointing performance; however, if they used the environment-stabilized flame of reference, the angular distance between the imagined heading and the actual heading 57 would affect pointing performance, just as was observed in the Mou, McNamara, et al. study. We only investigated the user’s flame of reference preference during rotation (and only in the horizontal plane) rather than in translation (in all three body axes); we assumed the information objects around the user’s body should be arrayed independently of the user’s translation. We limited our study to the flame of reference preference during body rotation rather than head rotation because a display stabilized with respect to the head would have a very limited information field. In this study, the second goal was to examine whether the nature of the representation of the objects in the AR system can be altered flom environment centered to body centered. The experience of large objects moving with the body does not occur normally in the real world, except in cases in which objects are directly attached to the body. So, although the default organization of virtual objects appears to be tied to the exocentric world flame, experience of a body-stabilized flame might enable users to adopt the newly experienced frame when updating their memories for objects’ locations in a new layout, even without direct visual guidance. In Experiment 2, we examined whether a couple of minutes of experience in the body- stabilized AR display would allow users to adopt the body-stabilized flame of reference. In Experiment 3, we examine whether only oral instructions to use a body-stabilized flame of reference for updating the location of a set of objects might be sufficient to induce participants to use a body-stabilized flame of reference in accessing a complex layout. Waller et al. (2002) reported that people were able to imagine simple, body- stabilized 4-point paths in flont of them when they physically turned back after being instructed to do so. 
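The two competing predictions can be stated compactly in terms of the two angular distances manipulated in the experiments that follow. The sketch below is only a restatement of those hypotheses with illustrative function names; it does not reproduce any analysis code used in the experiments.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute angle between two headings, in degrees."""
    d = abs(a_deg - b_deg) % 360
    return min(d, 360 - d)

def design_distances(learning, actual, imagined):
    """Return the (learning-imagined, actual-imagined) angular distances."""
    return (angular_distance(learning, imagined),
            angular_distance(actual, imagined))

# Hypotheses: with a body-stabilized frame, pointing cost should depend only
# on the learning-imagined distance; with an environment-stabilized frame it
# should additionally depend on the actual-imagined distance.
for actual in (0, 90):
    for imagined in (0, 90):
        li, ai = design_distances(learning=0, actual=actual, imagined=imagined)
        print(f"actual {actual:>2} deg, imagined {imagined:>2} deg: "
              f"L-I {li} deg, A-I {ai} deg")
```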
5.2 Experiment 1: The Default Reference Frame

In Experiment 1, participants learned the locations of virtual objects displayed on the floor from a single stationary viewing position in a large cylindrical room.

5.2.1 Methodology

5.2.1.1 Participants

Participants were 8 female and 8 male undergraduates at Michigan State University who participated voluntarily as partial fulfillment of course requirements.

5.2.1.2 Materials and Design

Stimulus materials were displayed in stereo with the Sony Glasstron LDI-100B head-mounted display. Head motion was tracked with a Polhemus Fastrak magnetic tracker, and stereo graphics were rendered in real time on the basis of the data from the tracker. Presentation of stimulus materials, audio instructions for participants, experimental procedure sequencing, and data collection were automated so that the experimenter did not need to hand-code the experimental results. The experiment was developed using the ImageTclAR augmented reality development environment (Owen, Tang and Xiao 2003).

The objects used in the experiment are illustrated in Figure 5.1. The configuration of eight virtual objects was displayed by the AR system (see Figure 5.2). Objects were selected with the restrictions that they be visually distinct, fit within an area approximately 0.3 m on each side, and not share any obvious semantic associations. The objects were all virtual analogs of existing physical objects and were presented in exact scale. Each test trial was constructed from the names of two objects in the layout and required participants to point to an object (e.g., "Imagine you are facing the cell phone; please point to the ball"). The first object established the imagined heading (e.g., cell phone) and the second object was the target (e.g., ball). Participants pointed with a tracked hand-held wand.

Figure 5.1. The eight virtual objects used in the experiments.

Figure 5.2. Layout of objects used in the experiments. During the learning phase, half of the participants faced the cell phone and the other half faced the notebook.

Figure 5.3. Design of the experiments (the two axes of the figure are the Learning-Imagined and Actual-Imagined distances): head-and-nose icons indicate actual headings; arrows indicate imagined headings. Headings and differences between them are measured counter-clockwise to maintain consistency with previous experiments.

The design is illustrated in Figure 5.3. The independent variables are (a) the angular difference between the learning heading and the imagined heading at the time of test and (b) the angular difference between the actual body heading and the imagined heading at the time of test. As shown in Figure 5.2, to factorially manipulate these two variables, participants had two actual body headings at the time of test: one was the same as the learning heading (e.g., actually facing the cell phone), and the other was 90° different from the learning heading (e.g., actually facing the book). At each actual heading, participants had two imagined headings: one was the same as the learning view (e.g., "Imagine you are facing the cell phone"), and the other was 90° from the learning view (e.g., "Imagine you are facing the book").
Hence, as illustrated in Figure 5.2, the actual body heading was the same as the learning heading when the distances of the learning—imagined and the actual—imagined were the same (either both were 0° or both 61 were 90°), or it was 90° different flom the learning heading when the distances of the learning—imagined and the actual-irnagined were different (one was 0° and the other was 90°). Both of these variables were manipulated within participants. At each actual body heading, participants had 14 trials (pointing to each of the seven objects, except the imagined facing object at each imagined heading) in a random order. Participants would imagine themselves or the scene rotating 90° when the imagined heading was 90° flom their actual heading (e. g., they believed they were actually facing the cell phone but were required to imagine facing the book). According to a study by Wraga et al. (2000), most people would rotate their body. However, participants were not explicitly instructed to adopt body or scene rotation when the imagined heading was different flom the actual heading because that is beyond the scope of this study and would not change the results. During the learning phase, half of the participants were randomly assigned to face the cell phone and the other half faced the book. This design counterbalanced the pointing direction across all four conditions (as is illustrated in Figure 5.3) and ensured that all conditions were equally difficult in terms of the pointing response. The order of the actual body headings at test time was also counterbalanced across participants: Half of them kept their learning orientation in the first block of pointing and then turned 90° for the second block; the other half performed in the reverse order. The primary dependent variables were pointing latency and pointing accuracy. Pointing directions were calculated in terms of the participants’ facing direction. 5.2.1 .3 Procedure Participants were randomly assigned to each body-heading combination at test time, with the constraint that each group contained an equal number of men and women. 62 Alter providing informed consent, participants were trained in how to point flom the imagined heading, which is either the same as or different flom their actual heading. After participants understood how to conduct the pointing judgment, the experimenter escorted them to the learning room. To remove any potential orientation influence due to environmental structures, which may be represented in spatial memory, participants were blindfolded while being escorted into the learning room and to the learning position. When the participants were standing in the learning position and facing the learning direction, the blindfold was removed. Then the participants were instrumented with the AR hardware system. The experimenter put a binder with a tracker on the participants’ waist, placed the HMD with a tracker on their head, and handed them a pointing wand with a tracker. They were instructed to press a button on the wand when they felt they were pointing to the target object accurately. At this point, the learning phase began. Participants were instructed via earphones to point to all objects twice in a row with visual guidance (e.g., “Please point to the ball”) to get used to the wand. Participants used earphones throughout the experiment to avoid any spatial references resulting flom sound source location. 
After that, they were allowed to study the layout for 30 seconds, and were then asked to keep their eyes closed and point to the objects named by the system. Participants performed five study-test sequences and were able to point to all of the objects accurately (within 15°). The audio cues were prerecorded so as to ensure consistency between subjects.

After participants had learned the layout, they were blindfolded and adopted the first actual body heading. Participants always stood at their learning position but turned their body if the actual body heading was different from the learning view. Test trials were presented and participants were asked to point with the wand as accurately as possible before they pressed the button. The tracker on the wand recorded the pointing direction; pointing latency was recorded from the onset of the target object cue to the button press. After they finished all 14 trials, they adopted the second actual body heading (turned by the experimenter) and repeated the same 14 trials.

5.2.1.4 Results and Discussion

Pointing accuracy and pointing latency as a function of actual-imagined distance and learning-imagined distance are presented in Table 5.1. The means for each participant and for each condition were analyzed through repeated-measures analyses of variance (ANOVA) in terms of actual-imagined distance (0° and 90°) and learning-imagined heading (0° and 90°).

Table 5.1. Pointing latency (in seconds) and pointing accuracy (in degrees) as a function of Actual-Imagined (A-I) distance and Learning-Imagined (L-I) distance in Experiment 1.

                    A-I = 0°                        A-I = 90°
             Latency         Accuracy        Latency         Accuracy
  L-I        Mean    SD      Mean    SD      Mean    SD      Mean    SD
  0°         3.821   2.139   14      8       4.515   2.081   15      9
  90°        4.384   1.683   15      11      5.117   2.097   17      11

Table 5.2. Analysis of variance results for pointing latency and pointing accuracy in Actual-Imagined (A-I) and Learning-Imagined (L-I) conditions in Experiment 1.

                     F(1,15)               Cohen's f
  Source         Latency   Accuracy    Latency   Accuracy
  L-I            4.52*     .32         .55       .15
  Error          1.20      123.60
  A-I            11.11**   .77         .86       .23
  Error          .73       39.11
  L-I x A-I      .01       .19         .00       .11
  Error          1.33      39.75
  * p < .05, ** p < .01

The ANOVA results for pointing accuracy and pointing latency are presented in Table 5.2. In angular error, no main effect was significant; people were highly accurate in all conditions. In pointing latency, both main effects of learning-imagined and actual-imagined distance were significant, whereas the interaction between them was not.

The most important result of Experiment 1 was that pointing latency was shorter when the actual and the imagined headings were the same (0°) than when they were different (90°). This result indicates that people cognitively update the locations of the virtual objects when they rotate their body; in other words, humans use an environment-stabilized reference frame to access information arrays. The evidence for this is the cost in latency incurred by the need to align the egocentric front with the facing object specified in the pointing judgment. If participants had used a body-stabilized reference frame, then for a given imagined heading the pointing latency should have been the same whether they were facing the learning view or were turned 90° from it. Instead, the results showed that turning 90° from the learning view at test time benefited the imagined heading of 90° but had the reverse effect on the imagined heading of 0°.
The second important finding was that pointing latency was shorter when the imagined and learned headings were the same (0°) than when they were different (90°). This result indicates that people represent the location of the virtual object with a reference flame selected by the learning view; that is, spatial memory is orientation dependent. Both of these results were consistent with the research of spatial updating of physical objects (Mou et al. 2004b), suggesting that the spatial cognition system codes and processes spatial locations of virtual objects presented in AR environments using the same coding and processing as for physical objects. 5.3 Experiment 2: Adaptation of Egocentric Frame with Prior Experience The results of Experiment 1 indicate that participants used an environment- stabilized flame of reference to access the location of virtual objects if they had never experienced the possibility that objects can also be attached to the body (egocentric reference flame) in virtual and AR environments. In Experiment 2, we examined whether direct experience of virtual objects attached to an egocentric reference flame in which objects translate and rotate with the moving body (a condition rarely experienced in the physical world) would stop participants flom updating their actual heading with respect to the layout but would, instead, cause them to use a body-stabilized reference flame for 66 other layouts. Evidence of this effect would suggest that users are capable of learning to use and update arrays of menus and objects organized around their moving body and that an egocentric infospace would be accepted and processed cognitively in such a way that it would be an efficient information presentation medium. 5. 3. 1 Methodology 5.3. l . 1 Participants Participants were 8 female and 8 male undergraduates at Michigan State University who participated voluntarily as partial fulfillment of course requirements. 5.3.1.2 Materials, design, and procedure The materials, design, and procedure of Experiment 2 were similar to those of Experiment 1 except a training session was added before participants learned the experimental layout of eight objects. During the training session, five virtual objects were presented. Participants were instructed to look at the locations of all objects. After they saw all of them, they were asked to turn left and look at the locations flom the new viewing direction. The objects were simultaneously rotated in space so as to maintain their position and orientation relative to the participant’s body. The subjects then turned back to adopt the original orientation and took a look at the locations of the objects. The process was repeated with a right turn, again maintaining object position and orientation relative to the subject’s body. Finally, they were instructed to return to the initial orientation. The training session lasted approximately 2 minutes. The experimenter did not comment on or verbally explain the behavior of the virtual objects. Learning about the object behavior was through observation only. 67 Following the training session, the learning session started. The learning session was identical to that of Experiment 1. 5. 3.2 Results and Discussion Pointing accuracy and pointing latency as a function of actual—imagined distance and learning—imagined distance are presented in Table 5.3. 
The means for each participant and each condition were analyzed in repeated-measures ANOVAs in terms of actual-imagined distance (0° and 90°) and learning-imagined heading (0° and 90°). The ANOVA results for pointing accuracy and pointing latency are presented in Table 5.4. In angular error, no effect was significant. People were highly accurate in all conditions. In pointing latency, only the main effect of learning-imagined heading was significant.

                     A-I = 0°                      A-I = 90°
              Latency        Accuracy       Latency        Accuracy
   L-I        Mean    S.D.   Mean   S.D.    Mean    S.D.   Mean   S.D.
   0°         4.077   2.030  13     9       4.510   2.947  19     17
   90°        5.513   3.347  17     7       5.777   3.441  19     12

Table 5.3. Pointing latency (in seconds) and pointing accuracy (in degrees) as a function of Actual-Imagined (A-I) distance and Learning-Imagined (L-I) distance in Experiment 2.

                       F(1,15)               Cohen's f
   Source            Latency   Accuracy     Latency   Accuracy
   L-I               20.21**   .26          1.16      .13
     Error           1.45      77.96
   A-I               2.40      3.48         .40       .48
     Error           .81       73.91
   L-I x A-I         .05       1.07         .05       .27
     Error           2.50      56.05

   ** p < .01

Table 5.4. Analysis of variance results for pointing latency and pointing accuracy in Actual-Imagined (A-I) and Learning-Imagined (L-I) conditions in Experiment 2.

The most important finding of Experiment 2 was that the effect of the angular distance between the imagined heading and the actual heading on pointing latency was not significant. Although failing to reject the null hypothesis is not the same as demonstrating the validity of the null hypothesis, it is safe to conclude that the effect of the actual-imagined heading on pointing latency decreased after people had a brief exposure to a body-stabilized display. The difference in pointing latency between the actual-imagined headings (0° and 90°) decreased from 713 ms in Experiment 1 to 394 ms in Experiment 2. The effect size f on pointing latency likewise decreased from .86 in Experiment 1 to .40 in Experiment 2. It is hard to exclude the possibility that some participants showed the actual-imagined effect and others did not, because this was not an individual-based experiment. In general, however, the results indicate that participants were able to use a body-stabilized, egocentric reference frame to access the locations of an array of virtual objects after only 2 minutes of prior exposure to such a display.

5.4 Experiment 3: Adaptation to an Egocentric Frame with Oral Instruction

In Experiment 3, we examined whether participants who were instructed that the layout was stabilized with respect to their body would stop updating their actual heading with respect to the layout and would, instead, adopt a body-stabilized, egocentric reference frame.

5.4.1 Methodology

5.4.1.1 Participants

Participants were 8 female and 8 male undergraduates at Michigan State University who participated voluntarily as partial fulfillment of course requirements.

5.4.1.2 Materials, design, and procedure

The materials, design, and procedure were similar to Experiment 1 except for the following two modifications:

1. Prior to the physical turn of the participants during the testing phase, they were given a body-stabilized instruction (e.g., "When you physically turn your body, the objects on the floor will move as you turn. Hence, after you turn right, you will still be facing the cell phone").

2. A new motion tracking system, the InterSense IS-900, was used due to an upgrade to the experiment facility.
The new tracking system performed identically to the original system with the exception of a considerably increased range and slightly decreased latency, and thus is not likely a significant factor in these experiments.

5.4.1.3 Results and Discussion

Pointing accuracy and pointing latency as a function of actual-imagined distance and learning-imagined distance are presented in Table 5.5. The means for each participant and each condition were analyzed in repeated-measures ANOVAs in terms of actual-imagined distance (0° and 90°) and learning-imagined heading (0° and 90°). The ANOVA results for pointing accuracy and pointing latency are presented in Table 5.6. In both angular error and pointing latency, only the main effect of learning-imagined heading was significant. The results clearly indicate that after being instructed that the objects were arrayed around the body in a body-stabilized display, people used a body-stabilized frame of reference to access the information array.

                     A-I = 0°                      A-I = 90°
              Latency        Accuracy       Latency        Accuracy
   L-I        Mean    S.D.   Mean   S.D.    Mean    S.D.   Mean   S.D.
   0°         4.077   2.030  13     9       4.510   2.947  19     17
   90°        5.513   3.347  17     7       5.777   3.441  19     12

Table 5.5. Pointing latency (in seconds) and pointing accuracy (in degrees) as a function of Actual-Imagined (A-I) distance and Learning-Imagined (L-I) distance in Experiment 3.

                       F(1,15)               Cohen's f
   Source            Latency   Accuracy     Latency   Accuracy
   L-I               18.68**   9.98**       1.12      .82
     Error           4.72      327.05
   A-I               .09       2.43         .08       .40
     Error           1.51      89.90
   L-I x A-I         .08       1.38         .07       .30
     Error           4.62      133.95

   ** p < .01

Table 5.6. Analysis of variance results for pointing latency and pointing accuracy in Actual-Imagined (A-I) and Learning-Imagined (L-I) conditions in Experiment 3.

5.5 Discussion

Current 3D graphics and tracking technology allow designers to display information arrays around a mobile AR user with respect to a body-stabilized or an environment-stabilized frame of reference. There have been no prior studies conducted to investigate which reference frame mobile users use and what factors may influence the choice of reference frame. This study, through the use of the paradigm developed to investigate human spatial memory and spatial updating in physical environments (Mou et al. 2004b), suggests that users with no prior experience of mobile AR systems tend to use an environment-stabilized reference frame to access information arrays presented in AR environments. In other words, people expect the information arrays of virtual objects in AR environments to behave like arrays of objects in physical environments (i.e., when they rotate their body, objects stay in their locations relative to the physical environment). This study also suggests that users who briefly experience an egocentrically centered display of virtual objects, or who are instructed that the display is egocentrically centered, are capable of quickly adopting a body-stabilized reference frame to code and access the locations of virtual objects in the physical environment.

Why do naive users think the locations of the virtual objects are stabilized with respect to the environment? One apparent explanation is that from birth, human beings perceive the locations of objects in the environment as independent of their own locomotion, and thus the relationship between the body's locomotion and changes of self-to-object relations is represented in their cognitive system.
To efficiently locomote in an environment where objects are not always visible, humans have to develop the ability to update locations of objects in the environment without visual guidance (Rieser et al. 1986; Rieser 1989; Presson et al. 1994; Farrell et al. 1998; Simons et al. 1998; Wang et al. 1999; Sholl et al. 2002; Waller et al. 2002; Mou et al. 2004b). People couple their motions and locomotion with an automatic spatial updating of the representation of object locations. They do so by coupling their locomotion with the perception of change in the spatial relations between the body and objects in the environment during their interaction with the environment (Rieser, Pick and Ashmead 1995; Rieser 1999). People with no prior experience in mobile AR systems simply interpret the relation between their locomotion and the locations of virtual objects with the mental model they use to interpret the physical world. 73 On the other hand, the results of Experiments 2 and 3 demonstrate that this lifetime experience with physical objects can be quickly replaced with a model of virtual object arrays that move with the body. In Experiment 2, participants perceived that the locations of virtual objects stayed stationary with respect to their body rotation for only 2 minutes. Their spatial updating behavior indicated that people in general tend to use body-stabilized reference flame to code and access the locations of virtual objects after experiencing the behavior of these objects in the new AR layout. This implies that people couple their motions and locomotion with a cancellation of the spatial updating of the representation of object locations. They do so during their interaction with the environment by coupling their locomotion with the perception of “unchanged” in the spatial relations between the body and objects in the environment. The quickness with which participants adapted to the egocentric array of object locations is very promising, as far as use of that Peripersonal Infospace to hold digital information in AR environments is concerned. It was speculated that people might have a mental model in favor of a body-stabilized reference flame that can accommodate arrays of virtual objects that move with the body even though they have no visible means of attachment to the body. This consideration was supported by the results of Experiment 3, which showed that even without any direct experience, and with only oral instruction that the objects were fixed relative to the body (body-stabilized flame of reference), people were able to use body-stabilized flames of reference to code and access the locations of virtual objects. The results of both Experiments 2 and 3 also suggest the spatial cognitive system is highly flexible with respect to spatial updating. 74 Can users of AR systems remember and make use of arrays of three-dimensional objects that move around the body even when they are more than 1 m away flom the body? The results of these studies suggest that high quality, mobile AR interfaces may be able to leverage the capacity of human spatial memory and spatial updating mechanisms for efficient access to information items around the body. In this study, we attempted to (a) identify the default frame of reference in coding virtual objects in a high-quality AR mobile system and (b) determine whether experience and oral instruction could alter it. 
Further studies should investigate how people encode the locations of virtual objects on occasions in which both body-stabilized and environment-stabilized frames of reference are necessary. It remains to be seen whether the updating of these virtual objects interferes with the updating process for objects in the physical environment. This notwithstanding, the current study provides answers to the questions raised in the introduction: users with no prior experience in mobile AR systems tend to use environment-stabilized reference frames to encode and access information arrays around their body. Evidently, experience with, or oral instruction about, a body-stabilized display allows users to adopt a body-stabilized frame of reference instead.

6 Evaluation of Perceptual Asymmetric Effects in Egocentric Infospaces

Perceptual asymmetry effects can potentially impact the human cognitive system in various ways, such as in reaction time, perception, induced emotion, and semantic meaning. While perceptual asymmetries are well-known effects in cognitive psychology and can be easily demonstrated in laboratory settings, it is not at all clear that these effects can be directly utilized in user interface scenarios. Empirical study of perceptual asymmetry effects in an application setting is needed before applying them to the design of interfaces. Two experiments were conducted to investigate the practical impact of perceptual asymmetry effects on actual tasks.

6.1 Experiment 4: Evaluation of Left vs. Right Instruction Presentation

An experiment was conducted to evaluate asymmetrical effects of graphical and text instructions placed on the left or right side of a head-stabilized reference frame, with an emphasis on the impact on task completion time. According to research in psychology, the left visual field is superior for word recognition (Melville 1957; Bauma 1973; Axelrod, Haryadi and Leiber 1977; Young and Ellis 1985; Ellis, Young and Anderson 1988) and language processing (Sperry 1961; Gazzaniga and Sperry 1965), while the right visual field is superior for geometric patterns and visual orientation matching (Atkinson and Egeth 1973). Since a visual stimulus presented on one side of the head-stabilized reference frame will predominantly fall on the same side of the visual field, it is predicted that:

H1: Task completion time for graphical instruction presented on the right side of the head-stabilized reference frame will be significantly shorter than on the left side, and

H2: Task completion time for text instruction presented on the left side of the head-stabilized reference frame will be significantly shorter than on the right side.

6.1.1 Methodology

A within-subjects experiment was designed. There were two independent variables: the position of the instruction in the head-stabilized reference frame (left vs. right) and the type of instruction (graphic vs. text). The dependent variable was the time to complete the experimental task. Four experimental conditions were created: (1) graphic instructions presented to the left, (2) graphic instructions presented to the right, (3) text instructions presented to the left, and (4) text instructions presented to the right.

6.1.1.1 Stimulus Materials

Participants were asked to complete a task of arranging Duplo blocks into a spatial pattern and then pressing a button of a specific color according to instructions presented in a head-stabilized reference frame.
Arranging Duplo blocks into spatial patterns was chosen as the experimental task to minimize bias towards a population with expertise in knowledge related to a particular task. In each trial, participants were asked to acquire 5 to 15 Duplo blocks of different colors from an unsorted bin and arrange them into the pattern presented in the instruction. Figure 6.1 shows examples of the instructions presented and the completed task.

Figure 6.1. Examples of an instruction and the completed task. An example text instruction is shown in (a) ("Row 1: Yellow, Yellow, Blue, Red, Blue, Yellow, Yellow; Row 2: Yellow, Blue, Blue, Yellow, Blue; Button: Blue") and the corresponding completed task is shown in (b); an example graphic instruction is shown in (c) and the corresponding completed task is shown in (d).

6.1.1.2 Participants

Participants were 8 undergraduate students at Michigan State University who participated voluntarily as partial fulfillment of course requirements. None of them had previous experience in any AR environment.

6.1.1.3 Experimental Equipment

Visual cues were displayed in stereo with the Sony Glasstron LDI-100B head-mounted display, and audio stimulus materials were presented using a pair of earphones.

6.1.1.4 Procedure

Participants were first introduced to the experimental procedure and equipment, and then entered the pretest environment. A few example instructions were presented to the participants and the experimenter explained the tasks in the experiment. When participants indicated that they understood the experimental procedure and the task, the experiment began, and they experienced each interface treatment condition (graphical instruction on the left, graphical instruction on the right, text instruction on the left, text instruction on the right) in a randomized order. There were 12 trials in each treatment condition. At the beginning of each trial, a tone was played to the user through a pair of earphones and the visual instruction was displayed on the HMD according to the treatment condition. Participants were to arrange the Duplo blocks into the spatial pattern according to the instruction displayed on the HMD and then press the button of the color specified in the instruction.

6.1.1.5 Measurements

Task completion time in milliseconds was measured as the time it took for participants to press the button following the onset of the audio cue tone.

6.1.2 Results

The mean and standard deviation of task completion time for each condition are summarized in Table 6.1. A general linear model repeated-measures analysis was conducted to test the effect of stimulus position on task completion time. There was no statistically significant effect of stimulus position on task completion time for either graphical or text instructions: F(1, 8) = 3.213, p = 0.116 for graphical instructions and F(1, 8) = 0.930, p = 0.372 for text instructions.

           Graphical         Graphical         Text Instruction   Text Instruction
           Instruction on    Instruction on    on Left Side       on Right Side
           Left Side         Right Side
   Mean    20308 ms          22190 ms          22709 ms           21122 ms
   S.D.    8316              9843              10068              7788

Table 6.1. Task completion time (in milliseconds) and standard deviation in Experiment 4.

The descriptive statistics indicate an advantage for graphic instructions placed on the left side and text instructions placed on the right side. However, none of these differences reached statistical significance.

6.1.3 Discussion

Contrary to the hypothesized predictions, graphic instructions presented on the left side and text instructions presented on the right side generally yielded shorter task completion times.
However, the experimental results did not achieve statistical significance. One explanation is that the effects of perceptual asymmetries apply to the visual field of the retinal image only. Even though visual stimuli placed on one side of the head-stabilized display predominantly fall on the same side of the visual field, the retinal image may move to the other side due to eye movement. Hence, simply placing information items on one side does not guarantee that the image will be projected to that side of the visual field exclusively. Furthermore, the reaction time advantage produced by bilateral perceptual asymmetries is measured in milliseconds. This perceptual advantage could be relatively insignificant when compared to other cognitive and psychomotor processes such as sorting and motor action planning and execution. A perceptual advantage measured in milliseconds does not have a significant impact on a task that spans 5 to 10 seconds.

In conclusion, perceptual asymmetry effects on reaction time are not robust and are too subtle to have practical effects on reaction to stimuli in AR and other information displays. These effects can only be used sparingly for information placement in egocentric infospaces.

6.2 Experiment 5: Emotion and Semantic Meaning

Semantic meaning is likely to be mapped to proximity. An experiment was conducted (Biocca, Lamas, Gai, Brady and Tang 2001; Biocca, David, Tang and Lim 2004) to explore how the semantic meaning of virtual objects and agents changes with location around the body. Do positions in space around the body carry meaning? Is the spatial location of an object part of its connotative meaning?

6.2.1 Related Work

Approach and avoidance fields around animals and humans are well documented. The work on proxemics is well known in communication research. The term "proxemics" was coined by Edward Hall in 1963 (Hall 1963; Hall 1966) when he investigated humans' use of personal space in communication and social contexts. His theory suggests that humans maintain different levels of distance to different people, agents, and objects in space. For example, people in the United States maintain a distance of 6 to 18 inches as an intimate distance for embracing, 1.5 to 4 feet as a personal distance for good friends, and 4 to 12 feet as a social distance for everyday conversation. A violation of these comfort distances (e.g., a stranger stepping into the 1.5'-4' personal distance) can influence the perception of the intentions of other people and can trigger different emotional responses and semantic meanings. This research suggests that the spatial location of agents, people, and objects relative to the body will have meaning, especially as the location crosses a threshold into the private space around the body.

Looking at the semantic oppositions within language, there is good support for this spatial semantic asymmetry. Cultures worldwide map meaning to the high-low dimension of space. For example, within most Greco-Roman languages, it is very consistent that "up" or "above" is associated with positive meanings and "down" and "below" with negative ones. There is some linguistic evidence for the semantic asymmetries of spatial location. There is also some evidence in neuroscience for differences in the processing of peripersonal space and extrapersonal space.
Neurophysiological studies of brain-injured patients have shown that lesions in different brain regions can lead to asymmetrical neglect for near or far space, consistent with the distinction between peripersonal and extrapersonal space (Berti, Smania and Allport 2001). Humans also spontaneously respond to affordances in the environment that are correlated with sentient beings such as other humans and animals (Sheehan and Sosna 1991). Mediated embodiments such as pictures, computer characters, moving robots, and other representations of "apparently sentient" others can automatically trigger social presence responses (Reeves and Nass 1996). The philosophical and psychological concept of agency has many subtle dimensions (McCann 1998; Bratman 1999). The concept of agency, defined as the state of being in action or of exerting power, is central to the issue of the volitional or intentional forces that drive the actions of an entity. This property of potentially acting within a space may make the spatial location of agents more salient, and therefore more meaningful, to the user of a virtual environment.

6.2.2 Methodology

A 5 x 2 x 2 within-subjects experiment was designed with three within-subjects factors: (1) location around the body, (2) distance from the body, and (3) type of object. Location around the body had five levels defined by spatial location. Distance from the body had two levels, near and far (see Figure 6.2). Finally, the agency factor had two levels: agent representation (a 3D anthropomorphic head) and object representation (a simple golden sphere) (see Figure 6.3).

6.2.2.1 Participants

Thirteen undergraduate students participated in the study voluntarily for class credit. All participants were right-handed.

6.2.2.2 Stimulus Materials

Two types of stimuli were used to manipulate the perception of agency, a 3D human head or a golden sphere, as illustrated in Figure 6.3. In each experimental trial, either the sphere or the head appeared in one of ten predefined spatial locations. The ten predefined spatial locations, five in near space and five in far space, are shown in Figure 6.2. The left, right, up, and down positions are deviated 30° from the center location. The near locations are 3 feet from the participant's body, and the far locations are 10 feet from the participant's body.

Figure 6.2. Ten predefined locations around the body. The five locations in the near space are 3' away from the body; the five locations in the far space are 10' from the body. The above, below, left, and right locations are deviated 30° from the center location.

Figure 6.3. The two stimulus materials used in the experiment. The golden sphere used for object representation is shown on the left; the human head used for agent representation is shown on the right.

Head motion was tracked with a Polhemus Fastrak magnetic tracker. Stereo graphics were rendered in real time on the basis of the data from the tracker using a Research V8 stereoscopic HMD. An SGI™ Onyx® RealityEngine2 running with two graphics pipes and MultiGen® SmartScene™ software was used to render the stimulus materials in real time.

6.2.2.3 Measurement

A set of semantic meanings was measured using four bipolar measurements from the classic semantic differential instrument (Osgood, Suci and Tannenbaum 1957). These four items were selected based on the results of a pilot test.
Those that were most sensitive to variations in the spatial location of objects were retained. The four bipolar measurements selected were superior-inferior and relevant-irrelevant for the evaluative factor, urgent-not urgent for the potency factor, and aggressive-peaceful for the activity factor. Each measurement item provided the anchor points of a seven-point scale. For example, the superior-inferior measurement asked the participants to answer the following question: "How superior or inferior is the object? Very superior, moderately superior, slightly superior, neutral, slightly inferior, moderately inferior, or very inferior."

6.2.2.4 Procedure

Participants entered the experiment room, were briefed about the equipment, and were assisted in mounting the HMD. At the beginning of each trial, one of the ten predefined locations was randomly selected, and either a head or a sphere appeared in that location. Participants then observed the object for about 10 seconds. A questionnaire containing the four semantic measurement items then appeared in front of the participants' visual field. The subjects read the questionnaire and responded orally to the experimenter. After the four questions were completed, the trial ended and the next trial began. There were twenty trials in total.

6.2.3 Results and Analysis

Table 6.2 summarizes the means of the four semantic differential measurements for the three experimental factors. The data analysis was conducted in four steps. In each step, a different semantic differential measure was entered as the dependent variable and the data were analyzed using a 5 (Position: center, left, right, top, bottom) x 2 (Distance: close, far) x 2 (Agency: agent, ball) within-subjects repeated-measures analysis of variance. Due to the complexity of the design, each of the four measures was treated separately so that higher-order interactions might be more interpretable. Correlation analysis of the measures revealed that the four items were significantly correlated, with pair-wise correlations ranging from .55 to .75. However, a composite score of the four items was not examined because the purpose of this study was to examine different aspects of the semantic space, rather than one overall evaluation.

                                           Distance       Position                            Object
                                Overall    Near   Far     Center  Left   Right  High   Low    Face   Sphere
   Superior (1)/Inferior (7)    4.2        3.7    4.7     4.5     4.5    4.1    4.0    3.8    3.6    4.8
   Relevant (1)/Irrelevant (7)  4.0        3.7    4.5     4.3     4.4    4.0    3.6    4.2    3.4    4.7
   Urgent (1)/Not Urgent (7)    4.6        4.0    5.2     4.7     4.7    4.7    4.3    4.5    4.2    5.0
   Aggressive (1)/Peaceful (7)  4.2        3.8    4.5     4.3     4.3    4.2    3.8    4.2    3.6    4.7

Table 6.2. Means for the different levels of the three experimental factors.

In within-subjects designs it is highly likely that the sphericity assumption is violated, making the nominal degrees of freedom too large. To adjust for this, the Huynh-Feldt correction was applied to the degrees of freedom, which is why some of the degrees of freedom reported in this section have decimal values. For each of the univariate tests, the corresponding multivariate test was also examined. The univariate and multivariate tests were nearly identical in most cases, with two exceptions. In one case the multivariate test was significant but the univariate test was slightly above the .05 level, whereas in the other instance the univariate test was significant but the multivariate test was above the .05 level.
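For concreteness, the ten cue locations described in Section 6.2.2.2 (a center position plus left, right, up, and down positions offset 30° from center, at 3 feet and 10 feet) can be generated directly from those angles and distances. The sketch below is illustrative only; it assumes a head-centered, right-handed coordinate frame with -z pointing straight ahead, and the function name is not from the thesis.

    import numpy as np

    FOOT = 0.3048                     # meters per foot
    OFFSET = np.radians(30.0)         # angular offset of the off-center locations

    def stimulus_positions():
        """Illustrative reconstruction of the ten cue positions (meters),
        in a head-centered frame: -z straight ahead, +x right, +y up."""
        positions = {}
        for band, dist in (("near", 3 * FOOT), ("far", 10 * FOOT)):
            positions[(band, "center")] = np.array([0.0, 0.0, -dist])
            # Left/right: rotate the forward direction about the vertical axis.
            positions[(band, "right")] = np.array([dist * np.sin(OFFSET), 0.0, -dist * np.cos(OFFSET)])
            positions[(band, "left")]  = np.array([-dist * np.sin(OFFSET), 0.0, -dist * np.cos(OFFSET)])
            # Up/down: rotate the forward direction about the horizontal axis.
            positions[(band, "up")]    = np.array([0.0, dist * np.sin(OFFSET), -dist * np.cos(OFFSET)])
            positions[(band, "down")]  = np.array([0.0, -dist * np.sin(OFFSET), -dist * np.cos(OFFSET)])
        return positions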
6.2.3.1 Superior/Inferior

When the ratings of the semantic differential superior/inferior were analyzed, the main effects for Position, F(4, 48) = 2.65, p = .04, Distance, F(1, 12) = 28.27, p < .001, and Agency, F(1, 12) = 17.54, p = .04, were found to be statistically significant. Items that appeared in the near space were rated as more superior (M = 3.7, SD = 1.8) than items that appeared in the far space (M = 4.7, SD = 1.7). Also, the 3D human face was rated as more superior (M = 3.6, SD = 1.7) than the golden sphere (M = 4.8, SD = 1.7). To examine the differences between levels that contributed to the main effect for Position, two within-subjects contrasts were used. The contrast comparing the left vs. right positions trended toward significance, F(1, 12) = 3.70, p = .07, whereas the contrast for top vs. bottom was not significant. Items that appeared in the right field (M = 4.2, SD = 1.9) received higher superiority ratings than items presented in the left field (M = 4.5, SD = 1.8), offering some evidence of a left-right asymmetry.

6.2.3.2 Relevant/Irrelevant

The relevance of the object in space was analyzed next. The main effects for Position, Distance, and Agency were statistically significant: F(3.3, 39.4) = 3.36, p = .03 for Position; F(1, 12) = 17.49, p = .001 for Distance; and F(1, 12) = 19.53, p = .001 for Agency. Among the interactions, only the two-way interaction of Position x Distance was significant, F(3.3, 39.3) = 2.91, p = .042. An interaction contrast revealed that the Position x Distance interaction was accounted for by the differences between the left vs. right positions, F(1, 12) = 8.09, p = .015. Items within the peripersonal space (M = 3.7, SD = 1.6) were rated as more relevant than items outside the peripersonal space (M = 4.5, SD = 1.7). By the same token, the 3D face model (M = 3.4, SD = 1.6) was rated as more relevant than the golden sphere (M = 4.7, SD = 1.5). Because the left-right asymmetry observed in the superiority ratings was also apparent for relevance, the two-way interaction between distance and the left-right positions was analyzed further. Figure 6.4 illustrates this two-way interaction: although items within peripersonal space were rated as more relevant than items outside peripersonal space, the left-right asymmetry was observed only for items within the peripersonal space, not for items outside it.

Figure 6.4. Relevant-Irrelevant rating by distance (near vs. far) and position (left vs. right) of objects.

6.2.3.3 Urgent/Not Urgent

For the urgency ratings, the main effects of Distance and Agency were statistically significant, F(1, 12) = 42.55, p < .001 for Distance and F(1, 12) = 7.86, p = .016 for Agency. In addition, the Position x Distance two-way interaction was statistically significant, F(4, 48) = 2.58, p = .049, and the Position x Agency two-way interaction trended toward significance, F(2.8, 34) = 2.63, p = .069. Nearer items (M = 4.0, SD = 1.7) were perceived as more urgent than items outside peripersonal space (M = 5.2, SD = 1.4). The effect of distance was asymmetrical between the left and right fields, being more pronounced in the right field than in the left field; Figure 6.5 illustrates this asymmetric effect. Also, the 3D face (M = 4.2, SD = 1.7) evoked more urgency than the golden sphere (M = 5.0, SD = 1.6), and the effect of agency was more pronounced when the objects were located away from the center.
At the center, there was no difference in the perceived urgency of the sphere and the 3D face; however, once these objects appeared in the right or left fields, a difference trending toward significance emerged (p = .07), with the face being perceived as more urgent than the sphere. Figure 6.6 illustrates this effect of urgency by position.

Figure 6.5. Urgent-Not Urgent rating by distance (near vs. far) and position (left vs. right).

Figure 6.6. Urgent-Not Urgent rating by object (face vs. ball) and position (center, left, right).

6.2.3.4 Aggressive/Peaceful

When the ratings of aggressive/peaceful were analyzed, the main effects for Distance and Agency were found to be statistically significant, F(1, 12) = 7.14, p = .02 for Distance and F(1, 12) = 9.73, p = .009 for Agency. Objects in near space (M = 3.8, SD = 1.8) were seen as more aggressive than objects in far space (M = 4.5, SD = 1.7). Agency also had a sizable impact on the aggressive/peaceful rating, with the 3D face (M = 3.6, SD = 1.7) rated as more aggressive than the golden sphere (M = 4.7, SD = 1.7).

6.2.4 Discussion

The clearest finding from this study is that location in virtual space appears to have semantic meaning. Participants ascribed differences in meaning, as measured by items from all the dimensions typically captured by the classic semantic differential scale (Osgood et al. 1957), as objects and agents changed position in space. Depending on their location, agents and objects differed in their superiority and relevance (evaluative factor), urgency (potency factor), and level of aggression-peacefulness (activity factor). It is relevant to note that Osgood's semantic differential was conceptualized using a spatial metaphor of semantic spaces. In this study it appears that participants were actually mapping semantic properties and connotations to locations in space.

6.2.4.1 The Effect of Distance in Connotative Semantics

One of the strongest effects observed in this study was the shift in connotative meanings when an object is located in the near space. Objects in the near space appeared to be more relevant, superior, urgent, and aggressive. Some of these findings might be predicted from the literature on proxemics (Hall 1963). But that literature does not explain why objects, as opposed to people, should also show a shift in connotative semantic properties within peripersonal space. One possibility is that the sphere, which floated in space, might have been seen as an agent, or at least as closer in agency to a three-dimensional head. But this is not a plausible explanation, as it would suggest that the two would have been seen as largely equivalent, and the three-dimensional head was seen as significantly different from the sphere on all of our measures. Further research is required to examine this specific effect.

6.2.4.2 On the Semantic Volatility of Agents' Spatial Location

The simplest interpretation of the findings regarding agency is that people are more meaningful than objects. The virtual head was perceived as more superior, relevant, urgent, and aggressive than the virtual sphere. But the finding goes beyond this. The experiment did not use a person, but only a virtual head that did not display much in the manner of true agency or lifelike characteristics.
Consistent with the work on virtual agents, the mere representation of agency, a simple and not very realistic three- dimensional virtual head, evokes more meaningful responses than the object, even when they are roughly equal in size and exactly equal in location. The users’ perception of the ability of agents to act within the space may also evoke greater uncertainty and more volatile shifts in connotations. There may be uncertainty regarding about agents facing the user, but located on the side of the body. When agents moved flom being directly in float of the body, to be on either side of the body, they are seen as more urgent. 6.2.4.3 Left-right asymmetries in the connotative semantics of virtual space Some of the more robust findings in the literature are those that find left-right asymmetries in the visual processing of objects presented briefly within the visual field. This is typically interpreted as evidence for bilateral differences in information 93 processing and function across the brain’s hemispheres. In this study, objects falling on the left side of the body tended to have semantic properties with values that deviated from neutral relative to objects falling on the right side of the body. This finding can be interpreted by properties of the right hemisphere. The right hemisphere is dominant for expressing and perceiving emotion, regardless of valence (Sackheim, Gur and Saucy 1978). Objects falling on the right side of the body tend to be perceived as neutral. 6.2.4.4 Beyond spatial location on the retina to location around the body Most of those studies exploring differences in spatial location control location by placing objects within a specific location within the visual field and for a brief exposure. Objects fall on either the left or right hemiretina. They are not open to examination because they are presented for a very brief duration. The experiment put objects on the left and right of egocentric space. So any object could fall on either retina and processed by either hemisphere when observed directly. Nonetheless, the findings here are similar to visual field studies. One possible explanation for this effect is that the objects fall predominantly within one visual field, especially when on the eccentricities of the body. So they might be “predominantly” processed in one hemisphere or another. But effects argued flom location within the visual field only would be very weak. It is also possible that mere location around the body regardless of where objects fall on the visual field has meaning. Such findings are suggested by work by Graziano et al. (Stein 1984; Graziano et al. 1995; Graziano and Gross 1998; Graziano 1999) with animals demonstrating the coding for egocentric location regardless of the location of the eye of observer. Location relative the body is coded in mutltimodal integration of space (Marks 1978; Stein 1984; 94 Marks and Armstrong 1994). These location may have some of the properties observed in visual field studies, and may not rely just on location on the retina. 6.3 Summary Two experiments were conducted to evaluate the applicability of perceptual asymmetric effects for information display in Egocentric Infospaces. The experiment findings of the two experiments were mixed. Results of Experiment 4 suggest that reaction time based perceptual asymmetric effect is too subtle to have an effect on task temporal performance. Results of Experiment 5 suggest that spatial locations around the body are prescribed with semantic meanings. 
For example, connotations to a three- dimensional face presented in a virtual environment may be slightly different if this face is presented left or right, close or far in the virtual environment. AR interface designers may need to be aware that location may impart connotation and emotion to user’s perception of the information objects, especially agents. The findings also suggest that spatial location may impart connotations to objects that do not necessarily have agency. But the shift in connotations of objects may be volatile and less influenced by spatial location than agents. 95 7 Directing Attention in Mobile AR Interface One basic user interface firnctionality is the ability to direct attention to physical or virtual objects in the environment. Mobile, context-aware interfaces will often be tasked with directing attention to physical or virtual objects that are located anywhere in the environment around the user. Often the target of attention will be beyond the visual field and beyond the field of view of the display devices in use. Mobile AR systems allow users to interact with all of the environment, rather than being focused on a limited screen area. Hence, they allow interaction during visual search, tool acquisition and usage, or navigation. In emergency services or military settings, AR can cue users to dangers, obstacles, or situations in the environment requiring immediate attention. These many applications call for a general purpose interface technique to guide user attention while mobile to physical and virtual objects, labels, locations and other information populating a potentially cluttered physical environment. Mobile AR interfaces present an interface challenge that can be characterized as follows: How can a mobile interface manage and guide visual attention to locations in the environment where critical information or objects are present, even when they are not within the visual field? The challenge is part of a larger need for attention management (Roel 2002) in high information bandwidth mobile interfaces. To illustrate the benefits of management of visual attention in an AR system, consider the following application scenarios: Telecollaborative spatial cueing. An emergency technician wears a camera and an AR HMD while collaborating with a remote doctor during a medical emergency. The 96 remote doctor needs to indicate a piece of equipment that the technician must use next. What is the quickest way to direct her attention to the correct tool among a large and cluttered set of alternatives, especially if she is not currently looking at the tool tray and doesn’t know the technical term for the tool Object Search. A warehouse worker uses a mobile AR system to manage inventory, and is searching for a specific box in an aisle where dozens of virtually identical boxes are stacked. Tracking systems integrated into the warehouse detect that the box is stored on a shelf behind the user using inventory records, a Radio Frequency Identification (RFID) tag, or other markers. What is the most efficient way to signal the target location to the user? Procedural Cueing during Training. A trainee repair technician uses an AR system to learn a sequence of steps where parts and tools are used to repair complex manufacturing equipment. How can the computer best indicate which tool and part to grab next in the procedural sequence, especially when the parts and tools may be distributed 360° throughout a large workspace? Spatial Navigation. 
A tourist with a Personal Digital Assistant (PDA) equipped with Global Positioning System (GPS) is looking for an historic building in a street with many similar buildings. The building is around the corner down the street. How can the PDA efficiently indicate a path to the main entrance? These scenarios share a common demand for a technique that allows for precise target location cueing in near or far open spaces and at any angle relative to the user under conditions where speed and accuracy may be important. Any technique must be 97 able to provide continuous guidance and direct the user around occlusions. The scenarios illustrate various cases where attention must be guided or managed by the interface. 7.1 Attention Management Human cognitive capacity is a finite resource and attention is one of the most limited of mental resources (Shifflin 1979). Attention management (Roel 2002) is a key human-computer interaction issue in the design of interfaces and devices (Horvitz, Kadie, Pack and Hovel 2003; McCrickard and Chewar 2003). Information-rich applications of mobile AR interfaces (e.g., emergency services) begin to push up against a firndamental human factors limitation, the limited attention capacities of humans. For example, the attention demands of relatively simple and low bandwidth mobile interfaces, such as PDAs and cell phones, may contribute to car accidents (Redelrneier and Tibshirani 1997; Strayer and Johnston 2001). Attention is used to focus cognitive capacity on a certain sensory input so that the brain can concentrate on processing the information of interest (van der Heijden 1992; van der Heijden 2003). Attention is primarily directed internally, flom the “top down” according to the current goals, tasks, and larger dispositions of the user. Attention, especially visual attention, can also be cued by the environment. For example, attention can be user driven, i.e., “find the screwdriver,” collaborator driven “use this scalpel now,” or system driven “please use this tool for the next step.” Visual attention is even more limited, since the system may have information about objects anywhere in an omnidirectional working environment around the user. Visual attention is limited to the field of view of human eyes (<200°), and this limitation is further narrowed by the field of View of common HMDs (< 80°). 98 In mobile AR interfaces the attentional demands of the interface on mental workload (Hancock and Meshkati 1988; Johnson and Proctor 2004) must also be considered. Attention is shared across many tasks and tasks in the virtual environment are often not of primary consideration to the user. Individuals may be ambulatory, working with physical tools and objects, and interacting with others. The user may not be at the correct location in the scene, or looking at the correct spatial location or object needed to accomplish a task. 80, attention management in the interface should reduce demands on mental workload. 7.1.1 Attention Cueing in Existing Interfaces Currently, there are few, if any, general mobile interface paradigms to quickly direct spatial attention to objects or locations anywhere in the environment. Users and interface designers have evolved various ways to direct visual attention in interpersonal interaction, architectural settings, and standard interfaces. 7.1.1.1 Spatial cueing in Windows Interfaces WIMP interfaces benefit flom the assumption that user’s visual attention is directed to the screen, which occupies a limited angular range in the visual field. 
Visual cues such as flashing cursors, pointers, radiating circles, jumping centered windows, color contrast, or content cues are used to direct visual attention to spatial locations on the screen surface. Large display areas extend this angular range, but still linrit the visual attention to a clearly defined area. Khan and colleagues (Khan, Matejka, Fitzmaurice and Kurtenbach 2005) proposed a visual spotlight technique for large room interfaces. The integration of audio with visual cues helps draw attention even when vision is not directed to the screen. Of course, these systems work within the confines of a very 99 limited amount of screen real estate; an area most users can scan very quickly. The audio cue often only initiates the attention process, requiring completion using visual scanning. These techniques cannot easily or quickly cue objects in the 3D environment around the user, for example pointing at an object behind the user. 7.1.2 Spatial Cueing in Augmented Reality In mobile AR environments, the volume of information is large and omnidirectional. AR environments have the capacity to display large amount of informational cues to physical objects in the environment. Most current AR systems adopt WIMP cursor techniques or visual highlighting to direct attention to an object (e. g., Feiner et al. 1993; Mann 2000 ). Recently, Chia-Hsun and colleagues (Bonanni, Lee and Selker 2005) proposed projecting light into the environment. Other techniques involve adding virtual quasi-architectural signage or virtual objects such as arrows or lines to the environment (Schmalstieg and Wagner 2005). Spatial cueing techniques used in interpersonal communication (Burgoon, Buller and Woodall 1996), WHVIP interfaces, and architectural environments are not easily transferred to AR systems. Almost all of these techniques assume that the user is looking in the direction of the cued object or that the user has the time or attentional capacity to search for a highlighted object. Multirnodal cues such as audio can be used to one the user to perform a search, but the cue provides limited spatial information and must compete with other sound sources in environment. Spatialized audio (Baluert 1983) does not have the spatial resolution to indicate spatial locations precisely. 100 7.2 The Omnidirectional Attention Funnel Interface design in a mobile AR system presents two basic challenges in managing and augmenting attention of the user: Figure 7.1. The attention funnel links the head of the viewer directly to an object anywhere around the body. (1) Omnidirectional cueing. To quickly and successfully cue visual attention to any physical or virtual object in 360° space as needed. (2) Minimal attention demands. Minimize mental workload and attention demands during search or interference with attention to tasks, objects, or navigation in the physical environment. The Omnidirectional Attention Funnel is an AR display technique for rapidly guiding visual attention to any location in physical or virtual space. The basic components of the attention funnel are illustrated in Figure 7.1. The most visible component is the set of dynamic 3D virtual objects linking the view of the user directly to the virtual or physical object. 101 In spatial cognitive terms, the attention firnnel visually links a head-centered coordinate space directly to an object centered coordinate space, firnneling focal spatial attention of the user to the cued object. 
The attention funnel takes advantage of spatial cueing techniques impossible in the real world and of AR's ability to dynamically overlay 3D virtual information onto the physical environment. Like many AR components, the attention funnel paradigm consists of: (1) a display technique, the attention funnel, combined with (2) methods for tracking and detecting the location of objects to be cued.

7.2.1 Components of the Attention Funnel

The attention funnel has been realized as an interface widget in an augmented reality development environment. The attention funnel interface component (arwattention) is one component in a planned set of user interface widgets being designed for mobile AR applications. These components are being built and tested as extensions of the ImageTclAR augmented reality development environment (Owen et al. 2003). The arwattention widget provides a mechanism for drawing visual attention to locations, objects, or paths in an AR environment.

The basic components of the attention funnel, as illustrated in Figure 7.2, are: (a) a view plane pattern with a virtual boresight in the center, (b) a dynamic set of attention funnel planes, (c) an object plane with a target graphic, and (d) an invisible curved path linking the head or viewpoint of the user to the object. Along this path are placed patterns that are repeated in space and normal to the path. We refer to the repeated patterns on the linking path as an attention funnel.

Figure 7.2. Three basic patterns are used to construct a funnel: (A) the head-centered plane includes a boresight to mark the center of the pattern from the user's viewpoint, (B) funnel planes, added in a fixed pattern (approximately every 12 centimeters) between the user and the object, and (C) the object marker pattern that includes a red cross hair marking the approximate center of the object.

The path is defined using cubic curve segments. Initial experiments have instantiated the path as a Hermite curve (Hearn and Baker 1996). A Hermite curve is a cubic curve segment defined by a start location, an end location, and tangent vectors at each end. The curve follows a path from the starting point in the direction of the start tangent vector and ends at the end point, approaching it in the direction of the end tangent vector. As a cubic curve segment, the curve presents a smoothly changing path from the start point to the end point, with curvature controlled by the magnitude of the tangent vectors. Hermite curves are a standard cubic curve method discussed in any computer graphics textbook. Figure 7.3 clearly illustrates the curvature of the funnel from a bird's eye perspective.

Figure 7.3. As the head and body move, the attention funnel dynamically provides continuous feedback. Affordances from the perspective cues automatically guide the user towards the cued location or object. Dynamic head movement cues are provided by the skew (e.g., left, right, up, down) of the attention funnel. The level of alignment (skew) of the funnel provides an immediate intuitive sense of how much the body or head must turn to see the object.

The starting point of the Hermite curve is located at some specified distance in front of the origin in a frame defined to be the viewpoint of the user (the center of projection for a single viewpoint, or the average of the two viewpoints for stereo viewers). The curve terminates at the target.
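Because the path is a standard cubic Hermite segment, its evaluation can be sketched in a few lines. The following is a generic textbook formulation (cf. Hearn and Baker 1996), not the ImageTclAR implementation; the function and parameter names are illustrative.

    import numpy as np

    def hermite_point(p0, p1, t0, t1, s):
        """Evaluate a cubic Hermite segment at parameter s in [0, 1].
        p0, p1: start and end positions; t0, t1: tangent vectors at the ends,
        whose magnitudes control how strongly the path bends."""
        h00 = 2*s**3 - 3*s**2 + 1          # standard cubic Hermite basis functions
        h10 = s**3 - 2*s**2 + s
        h01 = -2*s**3 + 3*s**2
        h11 = s**3 - s**2
        return (h00 * np.asarray(p0) + h10 * np.asarray(t0)
                + h01 * np.asarray(p1) + h11 * np.asarray(t1))

Sampling s at evenly spaced values yields points along the path at which the funnel planes described below can be placed.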
The tangent vector for the Hermite curve at the starting point is in the -z direction (assuming a right-handed coordinate system), and the tangent vector at the ending point is specified as the difference between the end and start locations (the direction to the target). The curvature at the starting and ending points is specified by the application.

A single cubic curve segment creates a smoothly flowing path from the user's viewpoint to the target in a near-field setting. Larger environments that include occlusions or require complex navigation are handled using a sequential set of cubic curve segments. The join points of the curve segments are specified by a navigation computation that takes into account paths and occlusions. As an example, a larger outdoor navigation system under development uses the Microsoft® MapPoint® commercial map management software to compute waypoints on a navigation path that then serve as the curve join points for the attention funnel path. The key design element is the smooth curvature of the path, which allows for the funneling of attention in the desired target direction.

The orientation of each pattern along the visual path is obtained by spherical linear interpolation of the up direction (Shoemake 1985). Spherical interpolation keeps the rotation angle between each interval constant, i.e., the orientations of the patterns change smoothly. The computational cost of this method is very small, involving the solution of the cubic curve equation (three cubic polynomials), the spherical interpolation solution, and the computation of a rotation matrix for each pattern display location. These computational costs are dwarfed by the rendering costs, even for this low-bandwidth display rendering.

The purpose of an attention funnel is to draw attention when it is not properly directed. When the user is looking in the desired direction, the attention funnel becomes superfluous and can result in visual clutter and distraction. The solution is to fade the funnel as the dot product of the source and target tangent vectors approaches one, indicating that the direction to the target is close to the view direction.

7.2.2 Affordances in the Attention Funnel that Guide Navigation and Body Rotation

The attention funnel uses various overlapping visual cues that guide body rotation, head rotation, and the gaze direction of the user. Building on an attention sink pattern introduced by Hochberg (Hochberg 1986), the attention funnel uses strong perspective cues, as shown in Figure 7.4. Each attention funnel plane has diagonal lines that provide depth cueing towards the center of the pattern. Each succeeding funnel plane is placed so that it fits within the preceding plane when the planes are aligned in a straight line. Increasing degrees of alignment cause the interlocking patterns to draw visual attention towards the center. Three basic patterns are used to construct a funnel: (1) the head-centered plane, which includes a boresight to mark the center of the pattern from the user's viewpoint, (2) the funnel planes, added in a fixed pattern (currently every 12 cm) between the user and the object, and (3) the object marker pattern, which includes a red bounding box marking the approximate center of the object. Patterns 1 and 3 are used to dynamically cue the user as they approach an angle where they are "locked onto" the object (see below).
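Taken together, the per-frame funnel update described above (sample the Hermite path, place a pattern roughly every 12 cm with a smoothly interpolated up direction, and fade the funnel when the view is aligned with the target) might be sketched as follows. This is an illustrative reconstruction only, not the arwattention widget: it reuses hermite_point from the earlier sketch, the 0.5 m offset in front of the viewpoint is an assumption, and place_pattern and set_funnel_opacity are hypothetical callbacks standing in for the real renderer.

    import numpy as np

    PLANE_SPACING = 0.12   # approximately 12 cm between successive funnel planes

    def slerp_up(u0, u1, t):
        """Spherical linear interpolation between two unit 'up' vectors."""
        u0 = np.asarray(u0, float); u1 = np.asarray(u1, float)
        theta = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))
        if theta < 1e-6:
            return u0
        return (np.sin((1 - t) * theta) * u0 + np.sin(t * theta) * u1) / np.sin(theta)

    def update_funnel(eye, forward, up, target, target_up,
                      place_pattern, set_funnel_opacity):
        start = eye + 0.5 * forward             # start a short distance in front of the viewpoint
        t0 = forward                            # start tangent: the view (-z) direction
        t1 = target - start                     # end tangent: the direction to the target
        # Estimate the path length, then place one pattern roughly every PLANE_SPACING.
        samples = [hermite_point(start, target, t0, t1, s) for s in np.linspace(0, 1, 64)]
        length = sum(np.linalg.norm(b - a) for a, b in zip(samples, samples[1:]))
        n = max(2, int(length / PLANE_SPACING))
        for i in range(n):
            s = i / (n - 1)
            place_pattern(hermite_point(start, target, t0, t1, s), slerp_up(up, target_up, s))
        # Fade the funnel out as the view direction aligns with the direction to the target.
        to_target = (target - eye) / np.linalg.norm(target - eye)
        alignment = float(np.dot(forward / np.linalg.norm(forward), to_target))
        set_funnel_opacity(np.clip(1.0 - alignment, 0.0, 1.0))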
As the head and body moves, the attention funnel provides continuous feedback that indicates to the user how to turn the body and/or head towards the cued location or object. Continuous dynamic head movement cues are indicated by the skew (e. g., left or right) of the attention firnnel. The pattern of the fimnel provides an immediate intuitive sense of the location of object relative to the head. For example, if the funnel skews to the right, the user knows to move his head to the right (e. g., more skewing suggests that more body rotation is needed to see it). The funnel provides a continuous dynamic cue that one is getting closer to being “in sync” and locked onto the cued object. When looking 106 directly at the object, the funnel fades so as to minimize visual clutter. A target behind the user is indicated by a funnel that moves forward for visibility, then turns and heads behind the user - a clear visual cue. Figure 7. 4. Example of the attentional funnel drawing attention of the user to an object on the shelf the red box. 7. 2.3 Methods for Sensing or Marking Targets Objects or Locations Attention funnels are applicable to any augmented vision display technology capable of presenting 3D graphics, including head-mounted displays and video see- through devices such as tablet PC’s or handheld computers. The location of target objects or locations in the environment may be known to the system because they are: (1) virtual 107 objects in tracked three-dimensional space, (2) tagged with sensors such as visible markers or RF ID tags, or (3) at predefined spatial locations as in GPS coordinates. Virtual objects in tracked 3D space are the most straightforward case, as the attention funnel can link the user to the location of the target virtual object dynamically. Objects tagged with RF ID tags are not necessarily detectable at a distance or locatable spatially with a high degree of accuracy, but local sensing in a facility may be sufficient to indicate a position sufficient for attention direction. In some cases, the location of the object is detected by sensors and is not known ahead of time. An implementation we are currently exploring involves the detection of visible markers with auxiliary omnidirectional tracking cameras, which can be implemented as an additional tracking system in a video see-through or optical see- through system. (This implementation is distinct flom the traditional video see-through system, where the only camera used represents the viewpoint of the user). The head- mounted omnidirectional camera detects markers in a 360° environment around the user. The relation of the camera to the user’s viewpoint is known. Detected objects can be cued for the user based on task needs or search requests by the user (i.e., “find the tool box”). 7.3 Methodology A within-subj ects experiment was conducted to test the performance of the attention funnel design against other conventional attention direction techniques: visual highlighting and verbal ones. The experiment had one independent variable, the method used for directing attention, with three alternatives: (1) the attention funnel, (2) visual highlight techniques, and (3) a control condition consisting of a simple linguistic cue. 108 ‘1' A Figure 7. 5. Test Environment: The user sat in the middle of test environment for the visual search task. It consisted of an omnidirectional workspace assembled from four tables each with 12 objects (6 primitive shapes and 6 general oflice objects) for a total of 48 target search objects. 
7.3 Methodology

A within-subjects experiment was conducted to test the performance of the attention funnel design against other conventional attention direction techniques: visual highlighting and verbal cueing. The experiment had one independent variable, the method used for directing attention, with three alternatives: (1) the attention funnel, (2) visual highlight techniques, and (3) a control condition consisting of a simple linguistic cue.

Figure 7.5. Test Environment: The user sat in the middle of the test environment for the visual search task. It consisted of an omnidirectional workspace assembled from four tables, each with 12 objects (6 primitive shapes and 6 general office objects), for a total of 48 target search objects.

7.3.3 Apparatus and Test Environment

A 360° omnidirectional workspace was created using four tables, as shown in Figure 7.5. Twelve objects were placed on each table: 6 primitive objects of different colors (e.g., red box, black sphere) on a shelf, and 6 general objects (e.g., stapler, notebook) on the table top. Visual cues were displayed in stereo with a Sony Glasstron LDI-100B head-mounted display, and audio stimulus materials were presented with a pair of headphones. Head motion was tracked by an InterSense IS-900 ultrasonic/inertial hybrid tracking system. Stereo graphics were rendered in real time based on the data from the tracker. A pressure sensor was attached to the thumb of a glove to capture the reaction time when the subject grasped the target object. Presentation of stimulus materials, audio instructions for participants, experimental procedure sequencing, and data collection were automated so that the experimenter did not need to record the experimental results manually. The experiment was developed in the ImageTclAR AR development environment (Owen et al. 2003).

7.3.4 Measurements

Search Time, Error, and Variability. Search time in milliseconds was measured as the time it took for participants to grab a target object from among the 48 objects following the onset of an audio cue tone. The end of the search time was triggered by the pressure sensor on the thumb of the glove when the user touched the target object. An error was logged when a participant selected the wrong object.

Mental Workload. Participants' perceived task workload in each condition was measured using the National Aeronautics and Space Administration Task Load Index (NASA TLX) after each experimental condition (Hart and Staveland 1988).

7.3.5 Procedure

Participants entered a training environment where they were introduced to and trained to use each interface (audio, visual highlight, attention funnel). They then began the experiment. Each subject experienced the interface-treatment conditions (audio, visual highlight, and attention funnel) in a randomized order. For each condition, participants were cued to find and touch one of the 48 objects in the environment as quickly and accurately as possible. Each participant completed 24 trials, balanced such that 12 trials involved searching for a random selection of primitive objects and 12 trials involved randomly selected general everyday objects.

7.4 Results

A general linear model repeated-measures analysis was conducted to test the effect of the attention direction method on the different performance indicators. There was a significant effect of interface type on search time, F(2, 14) = 10.031, p = 0.001, and on search time consistency (i.e., smallest standard deviation), F(2, 14) = 23.066, p < 0.001. The attention funnel interface clearly allowed subjects to find objects in the least amount of time and with the most consistency (M = 4473.75 ms, SD = 1064.48), compared to the visual highlight interface (M = 6553.12 ms, SD = 2421.10) and the audio-only interface (M = 4991.94 ms, SD = 3882.11), which had the largest standard deviation. See Figure 7.6. There was a significant effect of interface type on the participants' perceived mental workload, F(2, 14) = 4.178, p = 0.027. The results indicate that the attention funnel interface had the lowest mental workload (M = 44.64, SD = 16.96), compared to the visual highlight interface (M = 54.57, SD = 18.26) and the audio interface (M = 55.57, SD = 12.43). See Figure 7.7.
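The analyses reported in this section are one-way repeated-measures tests: three interface conditions, each experienced by every participant. For readers who want to reproduce this kind of analysis, the sketch below runs an equivalent test with the statsmodels package; the data and column names are hypothetical, not the values collected in this experiment.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format data: one row per participant x condition,
# holding the mean search time (ms) for that cell.
df = pd.DataFrame({
    "subject":   [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "condition": ["audio", "highlight", "funnel"] * 3,
    "search_ms": [5100, 6600, 4400, 4800, 6400, 4500, 5050, 6700, 4550],
})

# One-way repeated-measures ANOVA: does condition affect search time?
result = AnovaRM(df, depvar="search_ms", subject="subject",
                 within=["condition"]).fit()
print(result)   # reports F, degrees of freedom, and p for the condition factor
```

The same call with the NASA TLX scores or error counts as the dependent variable yields the corresponding workload and error tests.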
There was no significant effect of interface type on error, F(2, 14) = 1.507, p = 0.24 (attention funnel M = 1.14, SD = 0.77; visual highlight M = 1.43, SD = 1.56; audio M = 0.86, SD = 1.03).

Figure 7.6. Search time (ms) and consistency by experimental condition (Audio, Highlight, Funnel). The attention funnel decreased search time by 22% on average (28% when reach time is subtracted) and increased search consistency (decreased variability) by 65%.

Figure 7.7. Mental workload (NASA TLX score) for each experimental condition (Audio, Highlight, Funnel).

7.5 Discussion

When compared to standard cueing techniques such as visual highlighting and audio cueing, we found that the attention funnel decreased search time by 22% overall (approximately 28% for visual search time alone), and by 14% relative to the next fastest technique, as shown in Figure 7.6. While increased speed in the aggregate is valuable in some applications of augmented reality, in medical emergency and other high-risk applications it may be critical that the system exhibit consistent performance. The attention funnel had a very robust effect on search consistency (reduced standard deviation). The interface increased consistency by 65% on average, and by 56% over the next best interface. In summary, the attention funnel led to faster search and retrieval times, greater consistency of performance, and decreased mental workload when compared to verbal cueing and visual highlighting techniques.

7.6 Application of the Attention Funnel

With the success of AR-enabled mobile systems, designers will seek to add potentially rich, even unlimited, layers of location-based information onto physical space. As AR systems are used in demanding mobile applications such as manufacturing assembly, warehouse search, tourism, navigation, training, and distant collaboration, interface techniques appropriate to the AR medium will be needed to manage the mobile user's limited attention, improve user performance, and limit cognitive demands while improving spatial performance.

The attention funnel paradigm involves basic techniques that have potentially general applicability in mobile interfaces: a user's attention has to be directed to objects or locations in order to accomplish tasks. We are currently implementing the technique on other mobile devices, including handheld devices such as PDAs and cell phones. Broadly, the attention funnel techniques may be implemented in applications involving the following generic classes of fundamental tasks:

Physical Object Selection. Situations where a user may be looking for a physical object in a space, for example a tool on a workbench, a box in a warehouse, a door in a space, or the next part to assemble during object assembly. The system can direct the user to the target object.

Virtual Object Selection. AR systems may insert labels or 3D objects into the environment. These may be within or outside the current view of the user. Attention funnels can cue the user to look at the spatially registered label, tool, or cue.

Visual Search in a Cluttered Space. The user may be searching in a highly cluttered natural or artificial environment. An attention funnel can be used to cue the user to the correct location to view, even if they are not looking in the right place.

Navigation in Near Space.
The system might also need to direct the walking path of the individual through near space (e.g., through aisles). A directional funnel path (a slightly different implementation than the attention funnel above) can be used to indicate and cue the user's direction, and to provide dynamic cues as to path accuracy.

Navigation in Far Space. An attention funnel can direct users to distant landmarks. As an example, someone walking towards an office several blocks away must maintain a link to the landmark while navigating through an urban environment, even when the landmark is obscured.

The AR attention funnel paradigm represents an example of cognitive augmentation specifically adapted for users of mobile AR systems navigating and working in information- and object-rich environments.

8 Discussion and Conclusion

This thesis is the first research work to construct a spatial framework for information placement in AR environments based on neuropsychological research. The spatial framework provides a theoretical model that maps the cognitive properties of each infospace for information organization in mobile AR interfaces. The unique cognitive properties of each infospace in the framework are then systematically reviewed: a large volume of literature in psychology, behavioral science, and neuroscience on spatial cognition is organized according to the spatial framework. As no spatial framework previously existed for information placement in three-dimensional space, compiling a set of cognitive properties for each infospace in the framework allows researchers to determine where new research is needed to further investigate issues in AR interface design.

Three research questions were identified concerning unexplored cognitive properties in the spatial framework that are useful for the design of AR environments. The first research question addresses the capability of the human cognitive system to manipulate egocentric and allocentric reference frames to encode spatial information in the environment. Even though there is no physical equivalent of the Peripersonal Infospace in the real world, experimental results show that participants' preference of reference frame for spatial memory and spatial updating can be easily manipulated by oral instruction or brief experience. This ease of manipulating the spatial reference frame shows promise for using peripersonal space for information organization.

The second research question concerns the applicability of perceptual asymmetry properties to information display in egocentric infospaces. There are a large number of perceptual asymmetry properties in the psychology literature concerning different cognitive properties such as reaction time, emotion, and semantic meaning. Two experiments were conducted to evaluate the applicability of these asymmetry properties to AR interface design. Results show that perceptual asymmetry properties based on reaction time (e.g., perceptual response, memory retrieval), typically measured in milliseconds, are too subtle to have practical impact on reaction to stimuli in AR and other information displays, and should only be used sparingly for information placement in egocentric infospaces. On the other hand, the semantic meaning of, and participants' emotional response to, virtual objects and agents change with location around the body.
AR interface designers may need to be aware that location may impart connotation and emotion to a user's perception of information objects, especially agents.

Finally, a novel metaphor for directing visuo-spatial attention, the Attention Funnel, was developed. Traditional paradigms for directing a user's attention (such as blinking, audio signals, audio instructions, and the use of color and highlighting) are inaccurate, mentally demanding, and ambiguous in an omnidirectional environment. The Attention Funnel paradigm, a dynamic three-dimensional perspective cue linking the user's retinotopic space to a virtual or physical object in space, was shown to reduce the user's visual search time and mental workload compared with traditional paradigms.

8.1 Guidelines for Information Display in Augmented Reality Environments

The cognitive properties in the spatial framework form the basis of the information display guidelines for AR environments. Based on the discussion in this thesis, a set of guidelines for information display in mobile AR interfaces was compiled (Appendix A). This set of guidelines is a distillation of the best available information about behavioural patterns in spatial cognitive psychology and the results of the new experimentation presented in this thesis, rather than general rules of thumb derived from user or designer experiences and/or personal opinions. The set of guidelines provides interface designers with a clear framework to follow for spatial information placement in AR environments, helping them to develop AR interfaces that exploit the spatial processing capabilities of the human brain. It also serves as a checklist for AR interface designers, guiding decisions on a wide range of issues in spatial information placement during the design and evaluation process. More importantly, this thesis seeks to promote discussion among researchers and stimulate additional research in spatial AR interface design.

8.2 Future Work

As no clearly defined guidelines previously existed for spatial information placement in three-dimensional space, compiling a set of initial guidelines allows researchers to determine where new research is needed to further investigate issues in AR interfaces. The compiled set of guidelines in Appendix A is by no means a complete set that covers every aspect of spatial information placement in AR environments. The issues raised in this dissertation serve only as a starting point, and it is expected that AR interface researchers will use this set of guidelines to identify new issues not yet covered and expand the set with new experimental results.

There is no physical equivalent of the Peripersonal Infospace in the real world, and precious little is known about its cognitive properties. For example, spatial location in the Peripersonal Infospace may interact with many other cognitive processes such as information search, attention, visual change detection, and memory. More research in cognitive psychology related to the Peripersonal Infospace is needed to explore its cognitive properties and turn them into spatial information display guidelines.

Another underexplored infospace in the spatial framework is the Personal-body Infospace. In the last few years only a few studies in neuroscience have explored the relation between tool usage, personal space, and body schema, and these studies are limited to basic research in neuroscience.
There are many unexplored cognitive properties related to the Personal-body Infospace (such as reaching response, memory, accuracy, emotion, and semantic meaning) that have potential implications for AR interface design.

A prototype mobile AR interface based on the compiled set of guidelines is currently being developed. It is intended for field experiments in a variety of task-specific mobile AR applications and scenarios. It will also be used as a test bed for future studies to explore new issues in mobile AR interfaces.

8.3 Conclusion

The design of AR interfaces prompts a significant human factors challenge of mapping different metaphors, information, and functions of computer usage into the human cognitive system. In this thesis, a set of spatial information placement guidelines was constructed based on a body of literature in neuropsychology, spatial cognition, and the behavioral sciences, together with a series of experiments tightly related to AR interface design. The literature and experimental results provide grounding for theory-driven human-computer interaction design and for the development of high-performance AR interfaces: mobile infospaces potentially tailored to human spatial cognition. The set of guidelines presents interface designers with a clear framework to follow for spatial information placement in AR environments, and serves as a checklist for AR interface designers, guiding decisions on a wide range of issues in spatial information placement during the design and evaluation process. More importantly, this thesis seeks to provoke discussion among researchers and stimulate more research in spatial AR interface design.

Appendix A. Spatial Information Display Guidelines for Mobile Augmented Reality Interfaces

The guidelines presented in this appendix are a distillation of the best available information about behavioural patterns in spatial cognitive psychology along with the results of new experimentation. The guidelines present interface designers with a clear framework to follow for spatial information placement in AR environments, helping them to develop AR interfaces that exploit the spatial processing capabilities of the human brain. They also serve as a checklist for AR interface designers, guiding decisions on a wide range of issues in spatial information placement during the design and evaluation process. More importantly, this thesis seeks to promote discussion among researchers and stimulate additional research in spatial AR interface design.

As no clearly defined guidelines previously existed for spatial information placement in three-dimensional space, compiling a set of initial guidelines for spatial information placement allows researchers to determine where new research is needed to further investigate issues in AR interfaces. The compiled set of guidelines in this thesis addresses only spatial issues in AR interface design. It presents a set of issues related to how the human brain processes information objects in space in a mobile AR computing environment. This set of guidelines does not address general interface and display issues such as style, appearance, messages, and content. It is also expected that additional spatial interface guidelines will be advanced as research into human spatial cognition proceeds, thereby expanding this set of guidelines with new ideas. This is, by no means, the end of research into spatial placement guidelines for augmented reality systems. Rather, it is intended as a strong beginning.
A. Spatial Framework of Three-dimensional Space

A1. Partitioning Three-dimensional Space into Information Spaces

Guideline: AR systems should include in their design the inherent partitioning of space into five infospaces, supporting each as an identifiable information frame.

Comments: The five infospaces are defined as: (a) Personal-body Infospace, (b) Peripersonal Infospace, (c) Extrapersonal Focal Infospace, (d) Extrapersonal Action-scene Infospace, and (e) Extrapersonal Ambient Infospace. This partitioning of space is supported by neuropsychological research that shows these spaces to have unique physical and psychological properties. Information objects should be categorized as members of an appropriate infospace. The definition of each infospace is given as follows: (a) the Personal-body Infospace is the volume immediately adjacent to and including the surfaces of the user's body; (b) the Peripersonal Infospace is the volume defined by the arm-reaching space immediately in front of the body; (c) the Extrapersonal Focal Infospace is an elliptical region with a lateral extent of 20°-30° anchored to the user's eye fixation. This is the predominant visual space and is the target space for head-stabilized reference frames, although the best definition of extrapersonal focal space would be based on eye tracking information; (d) the Extrapersonal Action-scene Infospace is the spatial volume of the allocentrically oriented spaces. It encapsulates the body in a 360° surround, with a range starting from 2 meters from the body to approximately 30 meters; and (e) the Extrapersonal Ambient Infospace is the earth-fixed outermost space of the visual field.

A2. Egocentric Infospaces

Guideline: Interface elements that require access regardless of the user's location should be attached to one of the three egocentric infospaces: the Personal-body Infospace, the Peripersonal Infospace, or the Extrapersonal Focal Infospace, unless the volume required exceeds the capacity of the Peripersonal Infospace.

Comments: Interface elements attached to the three egocentric infospaces are reachable by a mobile user during locomotion, regardless of location. However, egocentric infospaces have limited capacity. Information objects that require a volume exceeding the capacity of the Peripersonal Infospace, which has the largest capacity among the three egocentric infospaces, should be attached to the Extrapersonal Action-scene Infospace.

B. Peripersonal Infospace

B1. Reference Frames and Tracking Requirements of the Peripersonal Infospace

Guideline: Tracking for the Peripersonal Infospace should be attached as closely as possible to the spine, ideally to the upper lumbar vertebrae.

Comments: Information objects in the Peripersonal Infospace remain stationary with respect to the upper torso. Tracking of the upper back (dorsal area) creates a frame for information objects attached to the Peripersonal Infospace that follows the body, but without the unwanted breathing motion exhibited by the chest.
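In implementation terms, guideline B1 amounts to expressing peripersonal interface elements in a torso-stabilized reference frame and re-evaluating their world poses from the back-mounted tracker every frame. A minimal sketch, assuming the tracker reports a 4x4 world-from-torso pose; the offsets and names are illustrative, not from the thesis prototype:

```python
import numpy as np

def element_world_pose(torso_pose: np.ndarray, element_offset: np.ndarray) -> np.ndarray:
    """Place a peripersonal element, defined relative to the torso, in world space.

    torso_pose:      4x4 world-from-torso transform from the upper-back tracker.
    element_offset:  4x4 torso-from-element transform fixed at design time.
    """
    return torso_pose @ element_offset

# Example: a tool tray 0.4 m in front of and 0.2 m below the tracker,
# assuming x-right, y-up, z-backward axes for the torso frame.
tray_offset = np.eye(4)
tray_offset[:3, 3] = [0.0, -0.2, -0.4]

# Each rendering frame: world_pose = element_world_pose(latest_torso_pose, tray_offset)
```

Because the offset is fixed in the torso frame, the element follows the user during locomotion without inheriting head motion, which is the behaviour B1 asks for.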
B2. Physical Volume and Information Capacity of the Peripersonal Infospace

Guideline: The Peripersonal Infospace is the default infospace for information objects that must follow the user during locomotion and for interface elements that require frequent access.

Comments: The Peripersonal Infospace has the highest capacity, and exhibits the least psychophysiological specialization, among the three egocentric infospaces. It is ideal as the default infospace for generic interface elements that must follow the user during locomotion and for interface elements that require frequent access. A general rule is that information objects that must follow the user during locomotion should initially be assigned to the Peripersonal Infospace unless some unique characteristic of the other two egocentric spaces demands their application.

B3. Visibility of the Peripersonal Infospace

Guideline: Interface designs must accommodate the variable visibility of different spatial locations in the Peripersonal Infospace.

Comments: The space immediately in front of the head is the most visible volume in peripersonal space. Information object visibility decreases as objects are moved farther away from this central area. Designers should map the visibility of spatial locations as an element of the design, ensuring that objects that require more attentiveness are placed at locations with higher visibility.

B4. Spatial Bias of the Peripersonal Infospace

Guideline: Placement of interface elements should be spatially sorted so as to accommodate the spatial biases of the Peripersonal Infospace.

Comments: Reaction time of reaching movements is biased towards the lower portion of the Peripersonal Infospace and the middle 60° of the body. Hand motor resolution is also finer in the lower portion of the Peripersonal Infospace. There is evidence that these motor advantages extend into a memory advantage for recalling the location of objects and for recognizing objects that are manipulated. For right-handed users, these properties are also biased towards the right side.

C. Personal-body Infospace

C1. Reference Frames and Tracking Requirements of the Personal-body Infospace

Guideline: AR user interfaces may incorporate multiple Personal-body Infospaces by tracking alternate body parts.

Comments: A Personal-body Infospace is a reference frame stabilized to a body part such as the hand or arm. Information objects in a Personal-body Infospace remain stationary with respect to the body part they are attached to. Different body parts lead to different capabilities. Humans are used to associating information with the arm, but less is known about the association of information objects with other body parts.

C2. Physical Volume and Information Capacity of the Personal-body Infospace

Guideline: The amount of information placed in a Personal-body Infospace should be limited in any design.

Comments: The Personal-body Infospace has a very limited volume, with the smallest capacity of any of the three egocentric infospaces. The capacity of a Personal-body Infospace depends upon the area of the space surrounding the body part. The volume of a Personal-body Infospace typically extends a few centimeters from the epidermis. However, there is neuropsychological evidence that the Personal-body Infospace can be plastically extended following active tool use, so the volume can possibly be extended to the surrounding volume of interface elements attached to the Personal-body Infospace after prolonged active use of those elements.

C3. Visibility of the Personal-body Infospace

Guideline: Interface designs must accommodate the variable visibility of objects in a Personal-body Infospace.

Comments: Information objects in a Personal-body Infospace are not always visible to the user.
Information objects attached to the forearms are the most visible, while information objects attached to the upper torso, lower torso, upper arms, thighs, and legs are less visible due to the limits of head motion. Designers should map the visibility of these objects as an element of the design, ensuring that objects that must remain visible are not placed in regions commonly occluded or beyond the normal field of view. Some user interface elements do not require visibility, such as physical interfaces like buttons, but are ideally associated with a Personal-body Infospace because of the physical reachability of objects attached to the infospace.

C4. Spatial Bias of the Personal-body Infospace

Guideline: Interface design should take spatial bias into consideration during the selection of the appropriate Personal-body Infospace for a given interface element.

Comments: Personal-body Infospaces are strongly biased towards the ventral body (frontal region). The dorsal body (back of the body) is not within the visual field and is less accessible by the user's hands. The Personal-body Infospace is further biased towards the upper body, where body parts are reachable by the hands.

C5. Proprioception

Guideline: Tasks and control functions that require a high degree of spatial resolution should be attached to one of the hand-stabilized Personal-body Infospaces in order to take advantage of proprioceptive feedback for high-accuracy placement.

Comments: Proprioception, the sensation of the movement and orientation of body parts, is required to achieve high-accuracy hand manipulation and alignment. A Personal-body Infospace associated with the hands provides the best proprioceptive feedback to the user.

D. Extrapersonal Focal Infospace

D1. Reference Frames and Tracking Requirements of the Extrapersonal Focal Infospace

Guideline: Tracking of head and/or eye motion is required for use of the Extrapersonal Focal Infospace.

Comments: Information objects in the Extrapersonal Focal Infospace remain stationary with respect to eye fixation, or with respect to the user's head when eye tracking is not available. The head is traditionally tracked in AR systems, typically through tracking of a head-mounted display that is assumed to remain fixed relative to the head. An eye movement tracker is required for information to remain stationary with respect to eye fixation.

D2. Physical Volume and Information Capacity of the Extrapersonal Focal Infospace

Guideline: The amount of information placed in the Extrapersonal Focal Infospace should be limited in any design.

Comments: The physical volume of the Extrapersonal Focal Infospace is the volume immediately in front of the head, and its capacity for information objects is necessarily very limited. Furthermore, the central area should be reserved to avoid visual clutter that may obscure the real environment.

D3. Visibility of the Extrapersonal Focal Infospace

Guideline: Information objects that require immediate attention should be attached to the Extrapersonal Focal Infospace.

Comments: Information objects in the Extrapersonal Focal Infospace are located immediately in front of the head, and are always visible by definition, regardless of the user's location and posture. Consequently, information objects in the Extrapersonal Focal Infospace have a great potential for distraction or interference with vision.
D4. Perceptual Fading of Visual Stimuli in a Head-stabilized Reference Frame

Guideline: Information objects that require sustained attention should not be placed in the Extrapersonal Focal Infospace.

Comments: When attaching information objects to the Extrapersonal Focal Infospace, interface designers should be aware of perceptual fading, which may cause information to perceptually disappear over a period of time, ranging from seconds to minutes.

D5. Visual Clutter and Spatial Bias of the Extrapersonal Focal Infospace

Guideline: Even though user attention is biased towards the central area of the Extrapersonal Focal Infospace, information objects should be placed along the peripheral area of the Extrapersonal Focal Infospace to avoid visual clutter that may obscure the real environment.

Comments: Visual attention is eccentrically distributed from the eye fixation point. However, information objects placed within a 5° radius of the eye fixation will cause annoyance to the user. Therefore, information objects should be placed at the peripheral area of the head-stabilized Extrapersonal Focal Infospace.

D6. Directing the User's Attention in an Omnidirectional Environment

Guideline: The Attention Funnel paradigm, a dynamic three-dimensional perspective cue linking a user's retinotopic space to a virtual or physical object in space, is recommended for directing visuo-spatial attention.

Comments: Traditional paradigms for directing attention (such as blinking indicators, audio signals, audio instructions, and the use of color and highlighting) are inaccurate, mentally demanding, and ambiguous in an omnidirectional environment. The Attention Funnel paradigm has been shown to reduce visual search time and mental workload compared with traditional paradigms.

E. Egocentric Infospaces

E1. Spatial Asymmetry Properties in the Brain

Guideline: Perceptual and kinematic asymmetry properties can be used to optimize the placement of information objects and interface elements in egocentric infospaces.

Comments: Different quadrants in the visual field take different visual pathways to different regions of the brain and have different perceptual properties. Human motor skills are asymmetric due to cerebral lateralization. Presenting information on the correct side of the body can enhance the perceptual and cognitive processes relevant to the information objects, and result in faster, more natural, and more accurate access.

E2. Kinematic Asymmetry Properties: Unimanual Tasks

Guideline: Simple pointing and selection tasks should be presented to the side of the dominant hand.

Comments: Unimanual tasks are usually biased towards the dominant hand. The dominant hand is better at precise, corrective, and rapid movements.

E3. Kinematic Asymmetry Properties: Spatial Reference in Bimanual Tasks

Guideline: Bimanual tasks that involve information objects requiring a physical stabilizing action, defined steady states, or a defining spatial reference should be placed on the non-dominant side.

Comments: Motion of the dominant hand typically finds its spatial reference in the results of motion of the non-dominant hand. The roles of the non-dominant hand include physical stabilizing actions (e.g., stabilizing a container for precise selection and manipulation), defining steady states (e.g., aiming at a moving target), and defining a spatial reference.
E4. Kinematic Asymmetry Properties: Spatial-temporal Scale of the Two Hands

Guideline: Information objects for bimanual interactions that require macrometric movement should be presented to the side of the non-dominant hand, and tasks that require micrometric movement should be presented to the side of the dominant hand.

Comments: The dominant hand has a finer spatial and temporal motor resolution than the non-dominant hand.

E5. Kinematic Asymmetry Properties: Precedence in Action of the Two Hands

Guideline: A single interface element that requires bimanual interaction should be presented on the side of the non-dominant hand.

Comments: The non-dominant hand starts earlier than the dominant hand in bimanual action. Placing interface elements on the non-dominant side encourages reach and acquisition by the non-dominant hand.

E6. Perceptual Asymmetry Properties: Response Time

Guideline: Perceptual asymmetry properties based on reaction time (e.g., perceptual response, memory retrieval) are too subtle to have practical effects on reaction to stimuli in AR and other information displays, and should only be used sparingly for information placement in egocentric infospaces.

Comments: There are a large number of perceptual asymmetry properties in the psychology literature. However, perceptual asymmetry properties based on reaction time are typically measured in milliseconds, and are too subtle to have significant effects on practical tasks.

E7. Perceptual Asymmetry Properties: Emotion

Guideline: Information objects that intentionally trigger a user's emotion should ideally be placed on the left side in egocentric infospaces.

Comments: Information objects presented on the left side have semantic properties that are shown to deviate from neutral more significantly than objects falling on the right side. The left side of the body is more sensitive to emotion-evoking stimuli.

E8. Perceptual Asymmetries: Social Proxemics and Semantic Meaning

Guideline: Information objects with conative meaning should ideally be placed closer to the body in an egocentric infospace.

Comments: Information objects in near space are perceived with more conative meaning (e.g., more relevant, superior, urgent, and aggressive), while information objects in far space are perceived with less. This effect is stronger for agent representations such as representations of humans and animals.

F. Extrapersonal Action-scene Infospace

F1. Reference Frames of the Extrapersonal Action-scene Infospace

Guideline: Information objects can be attached to stationary objects in the environment without additional tracking support. Multiple Extrapersonal Action-scene Infospaces may be incorporated by tracking the motion of each moving object in the environment.

Comments: Information objects in the Extrapersonal Action-scene Infospace remain stationary relative to objects in the scene. The tracking sources of typical AR systems induce a local reference frame. Information objects that remain stationary relative to stable objects in the environment can be pre-calibrated with respect to this local reference frame. Additional tracking is required for each moving object so that information objects remain stationary relative to the moving object.

F2. Physical Volume of the Extrapersonal Action-scene Infospace

Guideline: The Extrapersonal Action-scene Infospace can accommodate information objects that require a large volume.

Comments: The physical volume of the Extrapersonal Action-scene Infospace is unlimited.
F3. Visibility of the Extrapersonal Action-scene Infospace

Guideline: Information objects attached to the Extrapersonal Action-scene Infospace require directed visuo-spatial attention paradigms (such as the Attention Funnel paradigm) should the user's attention to the information object be necessary.

Comments: The visibility of information objects in the Extrapersonal Action-scene Infospace depends on the user's viewpoint orientation and position. Often this viewpoint and orientation will be such that the information object is beyond the field of view. When attention to the information object is required, visuo-spatial attention needs to be directed explicitly.

F4. Remote Object Selection in the Extrapersonal Action-scene Infospace

Guideline: Body parts used for pointing and selection tasks should be chosen in the order of finger, hand, arm, and lastly, the head.

Comments: Information objects may fall outside the reachable distance of the hands as the user navigates in the environment. In such cases, the object must be indicated by a direction indication rather than direct selection. This often entails pointing, in the form of indicating a direction to the information object using the head, arm, hand, or finger. Performance of pointing tasks using the head has the highest Fitts's Index of Difficulty (i.e., is the most difficult), followed by the arm, the hand, and then the finger (a worked example of the index follows section G below).

F5. Remote Object Manipulation in the Extrapersonal Action-scene Infospace

Guideline: AR interface designers will need to analyze the requirements of specific applications before choosing a remote manipulation technique. The pros and cons of various remote manipulation techniques are listed in Table 4.10 in Chapter 4.

Comments: There is no standard remote object manipulation technique that will work for all applications. Table 4.10 summarizes the pros and cons of six remote manipulation techniques: (1) Raycasting, (2) CHIMP, (3) Arm-extension, (4) World in Miniature, (5) HOMER, and (6) Voodoo Doll.

G. Extrapersonal Ambient Infospace

G1. Reference Frames of the Extrapersonal Ambient Infospace

Guideline: Information objects can be attached to stationary objects in the environment without additional tracking support.

Comments: Information objects in the Extrapersonal Ambient Infospace remain stationary relative to objects on the earth. Once sufficient tracking is available to support the egocentric infospaces, support for the Extrapersonal Ambient Infospace is, for all practical purposes, free.

G2. Spatial Bias in the Extrapersonal Ambient Infospace

Guideline: Information objects should be placed nearer the floor in the Extrapersonal Ambient Infospace.

Comments: The Extrapersonal Ambient Infospace is biased towards the peripheral and lower visual fields. Proximity to the floor provides a visual stabilization of the virtual elements relative to the real environment.

G3. Linear Perspective and Motion Perception Properties

Guideline: The Extrapersonal Ambient Infospace is ideal for information objects related to spatial orientation and motion perception, for example landmarks, horizontality cues, and signage on the floor.

Comments: The Extrapersonal Ambient Infospace is particularly susceptible to linear perspective and optical flow cues.
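Guidelines F4 above and H2 below rank pointing effectors using Fitts's Index of Difficulty. For reference, the commonly used Shannon formulation of the index is ID = log2(D/W + 1), where D is the distance to the target and W is the target width. The numbers below are hypothetical and only illustrate how the index grows with distance:

```python
import math

def fitts_index_of_difficulty(distance: float, width: float) -> float:
    """Shannon formulation of Fitts's Index of Difficulty, in bits."""
    return math.log2(distance / width + 1)

# Hypothetical example: a 10 cm wide target at 2 m versus at 0.5 m.
print(fitts_index_of_difficulty(2.0, 0.10))   # ~4.39 bits
print(fitts_index_of_difficulty(0.5, 0.10))   # ~2.58 bits
```

The ranking of effectors in F4 presumably reflects that, for the same index of difficulty, movement time is longest when pointing with the head and shortest when pointing with the finger.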
H. Infospace Choice for Common Information Objects

H1. Alerts and System Messages

Guideline: Alerts, system messages, and other information objects that require immediate attention should be placed in the Extrapersonal Focal Infospace.

Comments: The Extrapersonal Focal Infospace has the highest visibility among all infospaces. For information objects attached to other infospaces, the user's immediate attention can first be captured by an alert in the Extrapersonal Focal Infospace. Devices such as the Attention Funnel can then be used to direct visuo-spatial attention to the location of the information objects.

H2. Unimanual Selection and Manipulation Tools

Guideline: Selection tools and unimanual manipulation tools should be attached to the Personal-body Infospace of the dominant hand or fingers.

Comments: Performance of pointing tasks using the hand or the finger has the lowest Fitts's Index of Difficulty. Furthermore, the dominant hand has a finer spatial and temporal motor resolution than the non-dominant hand.

H3. Tool Selection Tray

Guideline: A tool selection tray should be attached to the Peripersonal Infospace.

Comments: The Peripersonal Infospace has the largest volume among the three egocentric infospaces to accommodate various selection and manipulation tools. It also allows bimanual manipulation tools to be selected by both hands concurrently.

H4. Task-specific Information Objects Related to the Real Environment

Guideline: Task-specific information objects related to the real environment should be placed in the Extrapersonal Action-scene Infospace, and should be spatially registered to the task objects.

Comments: The cost of information search and attention switching can be reduced by placing task-related information in the correct spatial location.

H5. Information Objects that Require Continuous Monitoring

Guideline: Information objects that require continuous monitoring (e.g., system or task-specific statuses or readings) should be attached to the Peripersonal Infospace.

Comments: Even though the Extrapersonal Focal Infospace is the most visible of the infospaces, it is inappropriate for tasks that require sustained attention, due to perceptual fading. The visibility of objects in the Peripersonal Infospace is acceptable for information that requires continuous monitoring.

H6. Non-task-specific System and Personal Information Storage: Small Volume

Guideline: Peripersonal Infospaces and Personal-body Infospaces are ideal for non-task-specific personal information objects that require a small volume.

Comments: The metaphorical associations and proprioceptive memory established in the egocentric infospaces provide for faster and more accurate access and manipulation of information objects. However, the number of information objects in an Extrapersonal Focal Infospace should be limited, as this space is not suitable for large-volume system and personal information storage.

H7. Non-task-specific System and Personal Information Storage: Large Volume

Guideline: The unregistered Extrapersonal Action-scene Infospace is ideal for holding non-task-specific personal information objects that require a large volume.

Comments: The Extrapersonal Action-scene Infospace can accommodate information objects that require a large volume. System and personal information objects that exceed the capacity of the Peripersonal Infospace can be attached to the unregistered Extrapersonal Action-scene Infospace.

9 References

Abrahams, H., Krakauer, D. and Dallenbach, K. (1937). "Gustatory adaptation to salt." American Journal of Psychology 49(3): 462 - 469.

Adamovich, S., Berkinblit, M., Hening, W., Sage, J. and Poizner, H. (2001). "The interaction of visual and proprioceptive inputs in pointing to actual and remembered targets in Parkinson's disease." Neuroscience 104(4): 1027 - 1041.

Atkinson, J. and Egeth, H. (1973). "Right hemisphere superiority in visual orientation matching." Canadian Journal of Psychology 27: 152 - 158.

Axelrod, S., Haryadi, T. and Leiber, L. (1977). "Oral report of words and word approximations presented to the left or right visual field." Brain and Language 1977: 550 - 557.

Blauert, J. (1983). Spatial hearing: the psychophysics of human sound localization. Cambridge, MA, MIT Press.

Bateson, G. (1972). Steps to an ecology of mind. New York, NY, Ballantine Books.

Bouma, H. (1973). "Visual interference in the parafoveal recognition of initial and final letters of words." Vision Research 13: 767 - 782.

Becklen, R. and Cervone, D. (1983). "Selective looking and the notice of unexpected events." Memory & Cognition 11: 601 - 608.

Berti, A., Smania, N. and Allport, A. (2001). "Coding of far and near space in neglect patients." NeuroImage 14: S98 - S102.

Biocca, F. (1997). "The cyborg's dilemma: Progressive embodiment in virtual environments." Journal of Computer-Mediated Communication 3(2).

Biocca, F., David, P., Tang, A. and Lim, L. (2004). Does virtual space come precoded with meaning? Location around the body in virtual space affects the meaning of objects and agents. In Proceedings of the 54th Annual Conference of the International Communication Association. New Orleans, LA. May 27 - 31, 2004.

Biocca, F., Eastin, M. and Daugherty, T. (2001). Manipulating objects in the virtual space around the body: relationship between spatial location, ease of manipulation, spatial recall, and spatial ability. In Proceedings of the 51st Annual Conference of the International Communication Association. Washington, DC. May 24 - 28, 2001.

Biocca, F., Lamas, D., Gai, P., Brady, R. and Tang, A. (2001). Mapping the semantic asymmetries of virtual and augmented reality space. In Proceedings of the Fourth International Conference on Cognitive Technology, CT 2001, 117 - 122. Warwick, UK. August 6 - 9, 2001.

Biocca, F. and Rolland, J. (1998). "Virtual eyes can rearrange your body: adaptation to virtual eye location in see-thru head-mounted displays." Presence: Teleoperators and Virtual Environments 7(3): 262 - 277.

Bonanni, L., Lee, C. and Selker, T. (2005). Attention-based design of augmented reality interfaces. In Proceedings of ACM CHI 2005. Portland, OR. April 2 - 7, 2005.

Boroditsky, L. and Ramscar, M. (2002). "The roles of body and mind in abstract thought." Psychological Science 13(3): 185.

Bowman, D. and Hodges, L. (1997). An evaluation of techniques for grabbing and manipulating remote objects in immersive virtual environments. In Proceedings of the ACM Symposium on Interactive 3D Graphics, 35 - 38.

Bratman, M. (1999). Faces of intention: selected essays on intention and agency. Cambridge, UK, Cambridge University Press.

Brooks, F. (1996). "The computer scientist as toolsmith II." Communications of the ACM 39(3): 61 - 68.

Bryant, D. J. (1992). "A spatial representation system in humans." Psycholoquy 3(16).

Bryant, D. J., Tversky, B. and Franklin, F. (1992). "Internal and external spatial frameworks for representing described scenes." Journal of Memory and Language 31: 74 - 98.

Bryden, M. P. (1982). Laterality: Functional asymmetry in the intact brain. New York, NY, Academic.

Burgoon, J., Buller, D. and Woodall, W. (1996). Nonverbal communication: the unspoken dialogue, McGraw-Hill Companies, Inc.
Card, S., Mackinlay, J. and Shneiderman, B. (1999). Readings in information visualization: Using vision to think. San Francisco, CA, Morgan Kaufmann.

Caudell, T. P. and Mizell, D. W. (1992). Augmented Reality: An Application of Heads-up Display Technology to Manual Manufacturing Processes. In Proceedings of the International Conference on System Sciences, 659 - 669. Kauai, Hawaii. January 1992.

Cavell, R. (2002). McLuhan in space. Toronto, Canada, University of Toronto Press.

Chance, S., Gaunet, F., Beall, A. and Loomis, J. (1998). "Locomotion mode affects the updating of objects encountered during travel: The contribution of vestibular and proprioceptive inputs to path integration." Presence: Teleoperators and Virtual Environments 7: 168 - 178.

Corballis, M. and Beale, I. (1983). The ambivalent mind: The neuropsychology of left and right. Chicago, Nelson-Hall.

Corballis, M. C. (1993). The lopsided ape: evolution of the generative mind. New York, Oxford University Press.

Cutting, J. E. and Vishton, P. M. (1995). Perceiving layout and knowing distances: The integration, relative potency, and contextual use of different information about depth. In (W. Epstein and S. Rogers eds.) Perception of space and motion (pp. 69 - 117). San Diego, CA, Academic Press.

D'Avossa, G. and Kersten, D. (1996). "Evidence in human subjects for independent coding of azimuth and elevation for direction of heading from optic flow." Vision Research 36: 2915 - 2924.

Dichgans, J. and Brandt, T. (1978). Visual-vestibular interaction: effects on self-motion perception and postural control. In (R. Held and H. Leibowitz eds.) Perception: Vol. 8, Handbook of sensory physiology (pp. 755 - 804). New York, NY, Springer-Verlag.

Dimond, S. J. and Farrington, L. (1977). "Emotional response to films shown to the right or left hemisphere measured by heart rate." Acta Psychologica 41: 259.

Ditchburn, R. and Ginsborg, B. (1952). "Vision with a stabilized retinal image." Nature 170: 35 - 37.

Dolezal, H. (1982). "Living in a world transformed: perceptual and performatory adaptation to visual distortion."

Easton, R. and Sholl, M. (1995). "Object-array structure, frames of reference, and retrieval of spatial knowledge." Journal of Experimental Psychology: Learning, Memory, and Cognition 21: 483 - 500.

Ebenholtz, S. and Mayer, D. (1968). "Rate of adaptation under constant and varied optical tilt." Perceptual and Motor Skills 26: 507 - 509.

Eilan, N., McCarthy, R. and Brewer, B. (1993). Spatial representation. Oxford, UK, Oxford University Press.

Ellis, A., Young, A. and Anderson, C. (1988). "Modes of word recognition in the left and right cerebral hemispheres." Brain and Language 35: 254 - 273.

Engen, T. (1982). The perception of odors. New York, NY, Academic Press.

Farrell, M. and Robertson, I. (1998). "Mental rotation and the automatic updating of body-centered spatial relationships." Journal of Experimental Psychology: Learning, Memory, and Cognition 24: 227 - 233.

Feiner, S., MacIntyre, B. and Seligmann, D. (1993). "Knowledge-based Augmented Reality." Communications of the ACM 36(7): 52 - 62.

Feiner, S., MacIntyre, B., Hollerer, T. and Webster, A. (1997). A Touring Machine: Prototyping 3D Mobile Augmented Reality Systems for Exploring the Urban Environment. In Proceedings of the International Symposium on Wearable Computers, 208 - 217. Cambridge, MA. October 13 - 14, 1997.

Ferguson, E. S. (1994). Engineering in the mind's eye. Cambridge, MA, MIT Press.
Fisher, E., Haines, R. and Price, T. (1980). Cognitive issues in head-up displays. Moffett Field, NASA Ames Research Center.

Fitts, P. (1954). "The information capacity of the human motor system in controlling the amplitude of movement." Journal of Experimental Psychology 47(6): 381 - 391.

Fitts, P. and Peterson, J. (1964). "Information capacity of discrete motor responses." Journal of Experimental Psychology 67(2): 103 - 112.

Foley, J. and McChesney, J. (1976). "The selective utilization of information in the optic array." Psychological Research 38: 251 - 265.

Foxlin, E. (2002). Motion tracking requirements and technologies. In (K. Stanney eds.) Handbook of virtual environments (pp. 163 - 210). Hillsdale, NJ, Lawrence Erlbaum & Associates.

Gardner, H. (1983). Frames of mind: The theory of multiple intelligences. New York, NY, Basic Books.

Gazzaniga, M. and Sperry, R. (1965). "Language after section of the cerebral commissures." Brain 88: 237 - 294.

Grabowska, A. and Nowicka, A. (1996). "Visual spatial-frequency model of cerebral asymmetry: A critical survey of behavioral and electrophysiological studies." Psychological Bulletin 120: 434 - 449.

Gray, H., Bannister, L., Berry, M. and Williams, P. (1995). The anatomical basis of medicine and surgery.

Graziano, M. (1999). "Where is my arm? The relative role of vision and proprioception in the neuronal representation of limb position." Proceedings of the National Academy of Sciences, USA 96: 10418 - 10421.

Graziano, M. and Gross, C. (1998). "Spatial maps for the control of movement." Current Opinion in Neurobiology 8: 195 - 201.

Graziano, M. S. and Gross, C. (1995). The representation of extrapersonal space: A possible role for bimodal visual-tactile neurons. In (M. Gazzaniga eds.) The cognitive neurosciences (pp. 1021 - 1034). Cambridge, MA, MIT Press.

Guiard, Y. (1987). "Asymmetric division of labour in human skilled bimanual action: the kinematic chain as a model." Journal of Motor Behavior 19(4): 486 - 517.

Guiard, Y. and Ferrand, T. (1995). Asymmetry in bimanual skills. In (D. Elliot and A. Roy eds.) Manual Asymmetries in Motor Performance (pp. 175 - 195). Boca Raton, FL, CRC Press.

Haines, R., Fischer, E. and Price, T. (1980). Head-up transition behaviour of pilots with and without head-up display in simulated low-visibility approaches. Moffett Field, NASA Ames Research Center.

Hall, E. (1963). "A system for the notation of proxemic behavior." American Anthropologist 65: 1003 - 1026.

Hall, E. (1966). The hidden dimension: man's use of space in public and private. Garden City, NY, Doubleday.

Hancock, P. and Meshkati, N. (1988). Human mental workload. New York, NY, North-Holland.

Hari, R. and Jousmaki, V. (1996). "Preference of personal to extrapersonal space in a visuomotor task." Journal of Cognitive Neuroscience 8(3): 305 - 307.

Harris, C. (1963). "Adaptation to displaced vision: Visual motor or proprioceptive change?" Science 140: 812 - 813.

Hart, S. and Staveland, L. (1988). Development of NASA-TLX (Task Load Index): results of empirical and theoretical research. In (P. Hancock and N. Meshkati eds.) Human Mental Workload (pp. 139 - 183). Amsterdam, The Netherlands, North-Holland.

Hay, J. and Pick, H. (1966). "Visual and proprioceptive adaptation to optical displacement of the visual stimulus." Journal of Experimental Psychology 71: 150 - 158.

Hearn, D. and Baker, M. P. (1996). Computer Graphics, C Version, Prentice Hall.

Hecaen, H. and Albert, M. (1978). Human neuropsychology. New York, NY, Wiley.
Heckenmueller, E. (1965). "Stabilization of the retinal image; a review of method." Psychological Bulletin 63: 157 - 159.

Heidegger, M. (1968). What is a thing? Chicago, IL, H. Regnery Co.

Held, R. and Schlank, M. (1959). "Adaptation to disarranged eye-hand coordination in the distance-dimension." American Journal of Psychology 72: 603 - 605.

Hoagland, H. (1933). "Quantitative aspects of cutaneous sensory adaptation." Journal of General Physiology 16: 911 - 923.

Hochberg, J. (1986). Representation of motion and space in video and cinematic displays. In (K. Boff, L. Kaufman and J. Thomas eds.) Handbook of Perception and Human Performance, Vol. 1 (pp.). New York, NY, Wiley.

Hood, J. (1950). "Studies in auditory fatigue and adaptation." Acta Otolaryngologica, Supplement 92: 1 - 57.

Horvitz, E., Kadie, C., Paek, T. and Hovel, D. (2003). "Models of attention in computing and communication: from principles to applications." Communications of the ACM 46(3): 52 - 59.

Inzuka, Y., Osumi, Y. and Shinkai, K. (1991). Visibility of head-up display for automobiles. In Proceedings of the 35th Annual Meeting of the Human Factors Society.

Johnson, A. and Proctor, R. (2004). Attention: theory and practice. Thousand Oaks, CA, Sage Publications.

Kay, A. (1984). "Computer software." Scientific American 251(3): 40 - 47.

Keijsers, N., Admiraal, M., Cools, A., Bloem, B. and Gielen, C. (2005). "Differential progression of proprioceptive and visual information processing deficits in Parkinson's disease." European Journal of Neuroscience 21(1): 239 - 248.

Khan, A., Matejka, J., Fitzmaurice, G. and Kurtenbach, G. (2005). Spotlight: directing users' attention on large displays. In Proceedings of ACM CHI 2005, 791 - 798. Portland, OR. April 2 - 7, 2005.

Kirsh, D. (1995). "The intelligent use of space." Artificial Intelligence 73(1-2): 31 - 68.

Kirsh, D. and Maglio, P. (1992). Some epistemic benefits of action: Tetris, a case study. In Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society. Hillsdale, NJ.

Kohler, I. (1964). "The formation and transformation of the perceptual world." Psychological Issues 3: 1 - 173.

Kosslyn, S. M. (1987). "Seeing and imagining in the cerebral hemispheres: A computational approach." Psychological Review 94: 148 - 175.

Krauskopf, J. and Riggs, L. (1959). "Interocular transfer in the disappearance of stabilized images." American Journal of Psychology 72: 248 - 252.

Langolf, G. (1973). Human motor performance in precise microscopic work. Ann Arbor, MI, University of Michigan.

Larish, I. and Wickens, C. (1991). "Attention and HUDs: flying in the dark?"

Lehikoinen, J. (2000). Virtual pockets. In Proceedings of the Fourth International Symposium on Wearable Computers, 165 - 170. Atlanta, GA. October 18 - 21, 2000.

Leibowitz, H. and Post, R. (1982). The two modes of processing concept and some implications. In (J. Beck eds.) Organization and representation in perception (pp.). Mahwah, NJ, Erlbaum.

Levine, M., Jankovic, I. and Palij, M. (1982). "Principles of spatial problem solving." Journal of Experimental Psychology: General 111: 157 - 175.

Mann, S. (2000). Telepointer: Hands-Free Completely Self Contained Wearable Visual Augmented Reality without Headwear and without any Infrastructural Reliance. In Proceedings of the Fourth International Symposium on Wearable Computers, 177.

Maravita, A. and Iriki, A. (2004). "Tools for the body (schema)." Trends in Cognitive Sciences 8(2): 79 - 86.

Marks, L. (1978). The unity of the senses. New York, NY, Academic Press.
Marks, L. and Armstrong, L. (1994). Haptic and visual representations of space. In (T. Inui and J. McClelland eds.) Attention and Performance (pp. 262 - 288). Cambridge, MA, The MIT Press.

McCann, H. (1998). The works of agency: on human action, will and freedom. Ithaca, NY, Cornell University Press.

McCann, R., Foyle, D. and Johnston, J. (1994). Attentional limitations with head-up displays. In Proceedings of the International Symposium on Aviation Psychology. Columbus, OH.

McCrickard, D. and Chewar, C. (2003). "Attentive user interfaces: attuning notification design to user goals and attention costs." Communications of the ACM 46(3): 67 - 72.

McLuhan, M. (1967). Gutenberg galaxy. Toronto, Canada, University of Toronto Press.

McNamara, T. (1986). "Mental representations of spatial relations." Cognitive Psychology 18: 87 - 121.

McNamara, T. (1989). "Mental representations of spatial and nonspatial relations." Quarterly Journal of Experimental Psychology 41: 215 - 233.

Melville, J. (1957). "Word-length as a factor in differential recognition." American Journal of Psychology 37: 85 - 106.

Merickel, M. L. (1992). A study of the relationship between virtual reality (perceived realism) and the ability of children to create, manipulate and utilize mental images for spatially related problem solving. In Proceedings of the Annual Convention of the National School Boards Association. Washington, DC.

Mine, M. (1996). Working in a virtual world: interaction techniques used in the Chapel Hill Immersive Modeling Program. Chapel Hill, NC, University of North Carolina, Chapel Hill.

Mou, W., Biocca, F., Owen, C., Tang, A., Xiao, F. and Lim, L. (2004a). "Frames of reference in mobile augmented reality displays." Journal of Experimental Psychology: Applied 10(4): 238 - 244.

Mou, W. and McNamara, T. (2002). "Intrinsic frames of reference in spatial memory." Journal of Experimental Psychology: Learning, Memory, and Cognition 28: 162 - 170.

Mou, W., McNamara, T., Valiquette, C. and Rump, B. (2004b). "Allocentric and egocentric updating of spatial memory." Journal of Experimental Psychology: Learning, Memory, and Cognition 30: 142 - 157.

Mou, W., Zhang, K. and McNamara, T. (2004c). "Frames of reference in spatial memories acquired from language." Journal of Experimental Psychology: Learning, Memory, and Cognition 30: 171 - 180.

Mountcastle, V. (1976). "The world around us: neural command functions for selective attention." Neuroscience Research Program Bulletin 14: 1 - 47.

Murphy, K. and Goodale, M. (1997). "Manual prehension is superior in the lower visual hemifield." Society for Neuroscience Abstracts 23: 178.

Neisser, U. and Becklen, R. (1975). "Attention to visually specified events." Cognitive Psychology 7: 480 - 494.

Neumann, U. and Majoros, A. (1998). Cognitive, performance, and systems issues for augmented reality applications in manufacturing and maintenance. In Proceedings of IEEE VRAIS '98, 4 - 11. Atlanta, GA. March 14 - 18, 1998.

Norman, D. (1993). Things that make us smart: defending human attributes in the age of the machine. Menlo Park, CA, Addison-Wesley Publishing Co.

Osgood, C., Suci, G. and Tannenbaum, P. (1957). The measurement of meaning. Urbana, IL, University of Illinois Press.

Owen, C., Biocca, F., Tang, A., Xiao, F., Mou, W. and Lim, L. (2005). Information frames in mobile augmented reality user interfaces. In Proceedings of Human Computer Interaction International 2005, 11th International Conference on Human-Computer Interaction. Las Vegas, NV.
Owen, C., Tang, A. and Xiao, F. (2003). ImageTclAR: a blended script and compiled code development system for augmented reality. In Proceedings of STARS 2003, The International Workshop on Software Technology for Augmented Reality Systems, 23 - 28. Tokyo, Japan. October 7, 2003.
Pani, J. and Dupree, D. (1994). "Spatial reference frames in the comprehension of rotational motion." Perception 23: 929 - 946.
Pettigrew, J. and Dreher, B. (1987). Parallel processing of binocular disparity in the cat's retinogeniculocortical pathways. In Proceedings of the Royal Society B: Biological Sciences, 297 - 321. 22nd December, 1987.
Pierce, J. and Pausch, R. (2002). Comparing Voodoo Dolls and HOMER: exploring the importance of feedback in virtual environments. In Proceedings of ACM CHI 2002. Minneapolis, MN. April 20 - 25, 2002.
Pierce, J., Stearns, B. and Pausch, R. (1999). Voodoo Dolls: seamless interaction at multiple scales in virtual environments. In Proceedings of ACM Symposium on Interactive 3D Graphics.
Poupyrev, I., Billinghurst, M., Weghorst, S. and Ichikawa, T. (1996). The Go-Go Interaction Technique: non-linear mapping for direct manipulation in VR. In Proceedings of the 9th Annual ACM Symposium on User Interface Software and Technology, 79 - 80. Seattle, WA.
Presson, C. and Montello, D. (1994). "Updating after rotational and translational body movements: coordinate structure of perspective space." Perception 23: 1447 - 1455.
Previc, F. H. (1990a). "Functional specialization in the lower and upper visual fields in humans: Its ecological origins and neurophysiological implications." Behavioral and Brain Sciences 13: 519 - 542.
Previc, F. H. (1990b). "Visual processing in three-dimensional space: perceptions and misperceptions." Behavioral and Brain Sciences 13: 559 - 566.
Previc, F. H. (1998). "The neuropsychology of 3D space." Psychological Bulletin 124: 123 - 164.
Previc, F. H. and Blume, J. (1993). "Visual search asymmetries in three-dimensional space." Vision Research 33: 2697 - 2704.
Previc, F. H. and Neel, R. (1995). "The effects of visual surround eccentricity and size on manual and postural control." Journal of Vestibular Research 5: 399 - 404.
Proctor, R. and Van Zandt, T. (1994a). Human factors in simple and complex systems. Boston, MA, Allyn and Bacon.
Proctor, R. W. and Van Zandt, T. (1994b). Anthropometrics and workspace design. In Human factors in simple and complex systems. Boston, MA, Allyn and Bacon.
Psotka, J. "Memory in VR and Augmented VR." from http://alcx-immersionarmymil/serial.html.
Redelmeier, D. and Tibshirani, R. (1997). "Association between cellular telephone calls and motor vehicle collisions." New England Journal of Medicine 336(7): 453 - 458.
Reeves, B. and Nass, C. (1996). The media equation: how people treat computers, television, and new media like real people and places. Cambridge, UK, Cambridge University Press.
Rieser, J. (1989). "Access to knowledge of spatial structure at novel points of observation." Journal of Experimental Psychology: Learning, Memory, and Cognition 15: 1157 - 1165.
Rieser, J. (1999). Dynamic spatial orientation and the coupling of representation and action. In (R. Golledge eds.) Wayfinding behaviour: cognitive mapping and other spatial processes (pp. 168 - 191). Baltimore, MD, Johns Hopkins University Press.
Rieser, J., Guth, D. and Hill, E. (1986). "Sensitivity to perspective structure while walking without vision." Perception 15: 173 - 188.
Rieser, J., Pick, H. and Ashmead, D. (1995). "Calibration of human locomotion and models of perceptual-motor organization." Journal of Experimental Psychology: Human Perception and Performance 21: 480 - 497.
Riggs, L. and Ratliff, F. (1952). "The effects of counteracting the normal movements of the eye." Journal of the Optical Society of America 42: 872 - 873.
Riggs, L., Ratliff, F., Cornsweet, J. and Cornsweet, T. (1953). "The disappearance of steadily fixated visual test objects." Journal of the Optical Society of America 43: 495 - 501.
Rizzolatti, G. and Camarda, R. (1987). Neural circuits for spatial attention and unilateral neglect. In (M. Jeannerod eds.) Neurophysiological and neuropsychological aspects of spatial neglect (pp. 289 - 314). Amsterdam, The Netherlands, North-Holland.
Rizzolatti, G., Gentilucci, M. and Matelli, M. (1985). Selective spatial attention: One center, one circuit, or many circuits? In (M. Posner and O. Marin eds.) Attention and Performance XI (pp. 251 - 265). Hillsdale, NJ, Erlbaum.
Robertson, G., Czerwinski, M., Larson, K., Robbins, D., Thiel, D. and van Dantzich, M. (1998). Data Mountain: Using spatial memory for document management. In Proceedings of ACM UIST '98 Symposium on User Interface Software & Technology. San Francisco, CA. November, 1998.
Robertson, L. C. and Lamb, M. (1991). "Neuropsychological contributions to theories of part/whole organization." Cognitive Psychology 23: 299 - 330.
Roel, V. (2002). Designing attentive interfaces. In Proceedings of Symposium on Eye Tracking Research and Applications. New Orleans, LA.
Rolland, J., Biocca, F., Barlow, T. and Kancherla, A. (1995). Quantification of adaptation to virtual-eye location in see-thru head-mounted displays. In Proceedings of Virtual Reality Annual International Symposium (VRAIS '95), 56 - 66. Research Triangle Park, NC. 11 - 15 March. IEEE Computer Society.
Rubin, N., Nakayama, K. and Shapley, R. (1996). "Enhanced perception of illusory contours in the lower versus upper visual hemifields." Science 271: 651 - 653.
Sackheim, H., Gur, R. and Saucy, M. (1978). "Emotions are expressed more intensely on the left side of the face." Science 202: 434 - 436.
Schmalstieg, D. and Wagner, D. (2005). A handheld augmented reality museum guide. In Proceedings of IADIS International Conference on Mobile Learning 2005. Qawra, Malta. June 28 - 30, 2005.
Sergent, J. (1983). "Role of the input in visual hemispheric asymmetries." Psychological Bulletin 93: 481 - 512.
Sergent, J. (1987). "Failures to confirm the spatial-frequency hypothesis: Fatal blow or healthy complication?" Canadian Journal of Psychology 41: 412 - 428.
Servos, P., Goodale, M. and Jakobson, L. (1992). "The role of binocular vision in prehension: a kinematic analysis." Vision Research 32: 1513 - 1521.
Sheehan, J. and Sosna, M. (1991). The boundaries of humanity: humans, animals, machines. Berkeley, CA, University of California Press.
Sheliga, B., Craighero, L., Riggio, L. and Rizzolatti, G. (1997). "Effects of spatial attention on directional manual and ocular responses." Experimental Brain Research 114: 339 - 351.
Shelton, A. and McNamara, T. (2001a). "Systems of spatial reference in human memory." Cognitive Psychology 43: 274 - 310.
Shelton, A. and McNamara, T. (2001b). "Visual memory from nonvisual experiences." Psychological Science 12: 343 - 347.
Shiffrin, R. (1979). "Visual processing capacity and attentional control." Journal of Experimental Psychology: Human Perception and Performance 5: 522 - 526.
Shneiderman, B. (1983). "Direct manipulation: A step beyond programming languages." IEEE Computer 16(8): 57 - 69.
Shoemake, K. (1985). Animating rotation with quaternion curves. In Proceedings of 12th Annual Conference on Computer Graphics and Interactive Techniques, 245 - 254.
Sholl, M. and Bartels, G. (2002). "The role of self-to-object updating in orientation-free performance on spatial memory tasks." Journal of Experimental Psychology: Learning, Memory, and Cognition 28: 422 - 436.
Sholl, M. and Nolin, T. (1997). "Orientation specificity in representations of place." Journal of Experimental Psychology: Learning, Memory, and Cognition 23: 1494 - 1507.
Simons, D. and Wang, R. (1998). "Perceiving real-world viewpoint changes." Psychological Science 9: 315 - 320.
Sojourner, R. and Antin, J. (1990). "The effects of a simulated head-up display speedometer on perceptual task performance." Human Factors 32(3): 329 - 339.
Solso, R. (1998). Cognitive psychology. Needham Heights, MA, Allyn & Bacon.
Sperry, R. (1961). "Cerebral organization and behavior." Science 133: 1749 - 1757.
Stein, B. (1984). "Multimodal representation in the superior colliculus and optical tectum." Journal of Neurophysiology 41: 55 - 64.
Stoakley, R., Conway, M. and Pausch, R. (1995). Virtual reality on a WIM: interactive worlds in miniature. In Proceedings of ACM SIGCHI '95.
Stratton, G. (1897). "Upright vision and the retinal image." Psychological Review 4: 182 - 187.
Strauss, E. (1998). "Writing, speech separated in split brain." Science 280(5365): 827 - 828.
Strayer, D. and Johnston, W. (2001). "Driven to distraction: dual-task studies of simulated driving and conversing on a cellular phone." Psychological Science 12(6): 462 - 466.
Tang, A., Owen, C., Biocca, F. and Mou, W. (2003). Comparative Effectiveness of Augmented Reality in Object Assembly. In Proceedings of ACM CHI 2003, 73 - 80. Fort Lauderdale, FL.
Telford, L. and Frost, B. (1993). "Factors affecting the onset and magnitude of linear vection." Perception and Psychophysics 53: 682 - 692.
van der Heijden, A. (1992). Selective attention in vision. New York, NY, Routledge.
van der Heijden, A. (2003). Attention in vision: perception, communication and action. New York, NY, Psychology Press.
Wada, Y., Saijo, M. and Kato, T. (1998). "Visual field anisotropy for perceiving shape from shading and shape from edges." Interdisciplinary Information Sciences 4(2).
Wallach, H. (1987). "Perceiving a stable environment when one moves." Annual Review of Psychology 38: 1 - 27.
Waller, D., Montello, D., Richardson, A. and Hegarty, M. (2002). "Orientation specificity and spatial updating." Journal of Experimental Psychology: Learning, Memory, and Cognition 28: 1051 - 1063.
Wang, R. and Simons, D. (1999). "Active and passive scene recognition across views." Cognition 70: 191 - 210.
Ware, C. (2000). Information visualization. San Francisco, CA, Morgan Kaufmann.
Ware, C., Arthur, K. and Booth, K. (1993). Fish tank virtual reality. In Proceedings of SIGCHI Conference on Human Factors in Computing Systems, 37 - 42. Amsterdam, The Netherlands. April 24 - 29, 1993.
Weintraub, D. R., Haines, R. and Randle, R. (1985). Head-up display (HUD) utility II: Runway to HUD transitions monitoring eye focus and decision times. In Proceedings of Human Factors Society 29th Annual Meeting.
Whitehead, R. (1991). "Right hemisphere superiority during sustained visual attention." Journal of Cognitive Neuroscience 3: 329 - 334.
Wraga, M., Creem, S. and Proffitt, D. (2000). "Updating displays after imagined object and viewer rotations." Journal of Experimental Psychology: Learning, Memory, and Cognition 26: 151 - 168.
Yamamoto, N. and Shelton, A. (2005). "Visual and proprioceptive representations in spatial memory." Memory & Cognition 33(1): 140 - 150.
Yates, F. (1966). The art of memory. Chicago, IL, University of Chicago Press.
Young, A. and Ellis, A. (1985). "Different methods of lexical access for words presented to the left and right visual hemifields." Brain and Language 24: 326 - 358.
Yovel, G., Yovel, I. and Levy, J. (2001). "Hemispheric asymmetries for global and local visual perception: effects of stimulus and task factors." Journal of Experimental Psychology: Human Perception and Performance 27: 1369 - 1385.
Zhang, J. and Norman, D. (1994). "Representations in distributed cognitive tasks." Cognitive Science 18(1): 87 - 122.