COMPARATIVE EFFECTIVENESS OF AUGMENTED REALITY IN OBJECT ASSEMBLY

By

Kwok H. Tang

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

Department of Computer Science and Engineering

2001

Abstract

Comparative Effectiveness of Augmented Reality in Object Assembly

By Kwok H. Tang

With all the speculation about the instructional capabilities of Augmented Reality (AR), there has been very little empirical research studying the actual effectiveness of AR as an instructional medium. The purpose of the research reported in this thesis is to explore the effectiveness of using AR in a computer assisted assembly task. Instructions for the assembly task are displayed in the user's field of view and registered onto the workspace. Instructions are presented to the user as 3D objects superimposed on real objects to explicitly demonstrate the exact assembly step. Three other instructional media are compared and contrasted with the AR system: a printed manual, computer assisted instruction using a monitor-based display, and computer assisted instruction using a head-mounted display. Initial findings show that overlaying 3D instructions on the actual work reduces the error rate of an assembly task, particularly highly correlated and sequential errors. The result suggests that part of the mental computation of the assembly is offloaded to the computer, since the system automatically calculates the position and orientation of the assembly part overlay and provides an appropriate visualization according to the user's view.

Keywords: Augmented reality, computer assisted instruction, human computer interaction, usability study.

Copyright December 11, 2001 by Arthur Tang. All Rights Reserved.

Acknowledgements

This thesis research could not have been made possible without the help of many people. To begin, I would like to thank Dr. Charles Owen for his supportive assistance throughout this research project. He is an excellent advisor and editor. His advice and comments have been very valuable. Next, I would like to thank Dr. Frank Biocca for introducing me to the area of HCI in virtual environments and the fascinating world of academic research in the first place. I would not have attempted graduate school without his advice and assistance. Next, I would like to thank Dr. George Stockman for his suggestions and comments on my research project. I would also like to thank Dr. Weimin Mou for his support on the statistical analysis. A special thanks goes to Dr. Alex Terrazas for valuable discussion and review, and most importantly for sorting out some technical submission problems at the very last moment. Also, thanks to Dr. Duncan Rowland for reviewing the thesis and Dr. Thomas Muth for allowing me to recruit participants for the experiment in his class.
Thanks to Kenny Lee for setting up and trying out the stimulus materials, and for all the other miscellaneous laboring jobs during the experiment. And finally, thanks to all the staff members in the M.I.N.D. Lab and the MET Lab for the assistance and suggestions that seem insignificant, but were very important.

TABLE OF CONTENTS

List of Figures
List of Tables
List of Abbreviations

Chapter 1: Introduction
  1.1. Research Problem
  1.2. Research Contributions
  1.3. Outline of the Thesis

Chapter 2: Augmented Reality Systems
  2.1. Basic Components of an Augmented Reality System
    2.1.1. See-through Head-Mounted Display
    2.1.2. Tracking System
      2.1.2.1. Time-frequency Measurement Tracking
      2.1.2.2. Spatial Scan Tracking
      2.1.2.3. Inertial Sensing
      2.1.2.4. Mechanical Linkage Tracking
      2.1.2.5. Direct-field Sensing
  2.2. Calibrations in Augmented Reality
    2.2.1. Pointer Calibration
    2.2.2. Workspace Calibration
    2.2.3. Display Calibration
  2.3. Error Evaluation for Calibration

Chapter 3: Overview of Manufacturing Assembly
  3.1. The Importance of Manual Assembly in Manufacturing
  3.2. Issues in Manual Assembly
  3.3. Using Augmented Reality for Computer Assisted Instruction

Chapter 4: Methodology
  4.1. Hypotheses
  4.2. Method
  4.3. The Assembly Task
  4.4. Experimental Setups
    4.4.1. Treatment 1: Printed Media
    4.4.2. Treatment 2: Computer Assisted Instruction on LCD Monitor
    4.4.3. Treatment 3: Computer Assisted Instruction on See-through Head-mounted Display
    4.4.4. Treatment 4: Augmented Reality
  4.5. Participants
  4.6. Limiting Unwanted Variables
  4.7. Experimental Procedure
  4.8. Measurements

Chapter 5: Results
  5.1. Descriptive Statistics
  5.2. Effect of Time of Completion on Treatment Conditions
  5.3. Effect of Accuracy on Treatment Conditions
    5.3.1. Effect of Total Errors on Treatment Conditions
    5.3.2. Effect of Dependent Error on Treatment Conditions
    5.3.3. Effect of Independent Error on Treatment Conditions
  5.4. Effect of NASA TLX on Treatment Conditions
  5.5. Effect of Spatial Cognitive Ability on Performance
    5.5.1. Effect of Spatial Cognitive Ability on Total Error
    5.5.2. Effect of Spatial Cognitive Ability on Dependent Error
    5.5.3. Effect of Spatial Cognitive Ability on Independent Error
    5.5.4. Effect of Spatial Cognitive Ability on Time of Completion

Chapter 6: Discussions and Conclusions
  6.1. Effect of Information Overlay on Performance
  6.2. Effect of Attention Switching and Mental Transformation Offloading on Performance
  6.3. Effect of Treatment Conditions on Mental Workload
  6.4. Effect of Spatial Cognitive Ability on Performance
  6.5. Effect of Dependent Error in Augmented Reality
  6.6. Effect of Attention Tunneling in Augmented Reality
  6.7. Conclusion

Appendix: Procedural Steps for the Assembly Task

Bibliography

List of Figures

Figure 2.1. Components of a typical AR system.
Figure 2.2. Video see-through HMD.
Figure 2.3. Optical see-through HMD.
Figure 2.4. Transformations between coordinate systems.
Figure 4.1. The completed assembly task.
Figure 4.2. Treatment condition 1: printed manual.
Figure 4.3. Treatment condition 2: CAI on LCD.
Figure 4.4. Treatment condition 3: CAI on HMD.
Figure 4.5. Treatment condition 4: AR.
Figure 5.1. Bar chart of the average time of completion in each treatment condition.
Figure 5.2. Bar chart of the average number of dependent errors, independent errors, and total errors in each treatment condition.

List of Tables

Table 4.1. NASA TLX Rating Scale Definitions.
Table 4.2. Combinations of pairwise comparisons between the 6 rating scales.
Table 5.1. Descriptive statistics for time of completion in each treatment condition.
Table 5.2. Descriptive statistics for number of errors in each treatment condition.
Table 5.3. Average score on the Spatial Cognition Test in each treatment condition.
Table 5.4. Average score on the NASA TLX Rating in each treatment condition.
Table 5.5. ANOVA post hoc comparisons of time of completion on treatment conditions.
Table 5.6. ANOVA post hoc comparisons of total error on treatment conditions.
Table 5.7. ANOVA post hoc comparisons of dependent error on treatment conditions.
Table 5.8. Bivariate correlation of spatial cognitive ability and performance.
Table 5.9. Bivariate correlation of spatial cognitive ability and dependent error.
Table 5.10. Bivariate correlation of spatial cognitive ability and independent error.
Table 5.11. Bivariate correlation of spatial cognitive ability and time of completion.
Table 6.1. Bivariate correlation analysis of spatial cognitive ability for the combined sample from treatments 1, 2 and 3.

List of Abbreviations

ANOVA      Analysis of Variance
AR         Augmented Reality
CAI        Computer Assisted Instruction
CCD        Charge-coupled Device
CRT        Cathode Ray Tube
DOF        Degrees of freedom
dpi        dots per inch
FOV        Field of view
HDD        Head-down Display
HMD        Head-mounted Display
HUD        Head-up Display
LCD        Liquid Crystal Display
NASA       National Aeronautics and Space Administration
NASA TLX   NASA Task Load Index
PDA        Portable Digital Assistant
SPAAM      Single Point Active Alignment Method
VR         Virtual Reality

Chapter 1
Introduction

The term Augmented Reality (AR) is used to describe systems that combine computer generated environments with real environments. This combination might include the enhancement of an image with virtual annotations, the detection and amplification of soft sounds or those outside the normal range of hearing, or the use of haptics to increase the sensitivity of touch. Unlike Virtual Reality (VR), AR enhances the real environment rather than replacing it. In a typical AR system for augmented vision, a see-through head-mounted display (HMD) is used to composite computer generated graphics with the real environment. The superimposed graphics provide additional information to the user while the user is interacting with the real environment. AR technology has many potential applications, including computer assisted instruction (CAI), industrial training, computer-aided surgery, computer visualization, engineering design, interior design and modeling, and entertainment.

The idea of overlaying a computer generated synthetic environment over a real environment through an HMD dates back to Ivan Sutherland's idea of "the ultimate display" in 1965 [42, 43]. However, little research was done in this area until the last decade, when tremendous advances in real time 3D graphics rendering, display technologies, motion tracking technologies and computer processing power solved many of the technical obstacles to the creation of practical systems. In 1990, researchers at The Boeing Company started a pilot project on using AR for wire bundle assembly [10, 29]. In 1992, a research group at the University of North Carolina, Chapel Hill started a research project to explore using AR in surgical settings. The project overlays 3D ultrasonic echography images on a patient to endow the doctor with "X-ray vision" to see into the patient's body [3]. The Computer Graphics and User Interfaces Laboratory at Columbia University developed a prototype using AR to assist maintenance of a laser printer in 1993 [14]. In 1997 they also developed a system called "The Touring Machine", a mobile AR system that overlays tourist information onto the user's view [15]. In 1998, the Massachusetts Institute of Technology Media Laboratory developed an AR application that enhances the game of billiards by calculating and overlaying strategic shots in a game [21]. In the same year, the Mixed Reality Laboratory in Tokyo, Japan developed an AR air hockey game, where two players hit a virtual puck with real mallets on a real table [32].

1.1 Research Problem

There has been much speculation about what AR can do, but there have been very few empirical research studies exploring the effectiveness of AR. Even though a number of AR prototypes and test-bed applications were developed in the last decade, they were mainly "proof-of-concept" applications or demonstrations.
Currently there is a lack of theories and guidelines in computer-human interaction to support the design of this emerging environment. This thesis is an early attempt to study the effectiveness of using AR to create an interface that "assists the user's memory for procedures (procedural memory) and context specific reference information (semantic memory)" [7]. This is a specific example of "Intelligence Amplification", a term coined by Frederick Brooks to denote using the machine "to couple the mind and the machine together with broad-band channels" to increase human performance on specific tasks [9].

1.2 Research Contributions

The purpose of this thesis is to explore the effectiveness of using AR in a computer assisted assembly task. Information for the task is displayed in the user's view and registered in the workspace. Instructions can be presented to the user as 3D objects superimposed on real objects to demonstrate the exact direction explicitly. This research has produced 3 significant contributions in this area:

1. Support for the assertion that AR improves human performance in assembly tasks.
2. A theoretical basis for improved AR user interfaces.
3. An indication of some of the potential weaknesses of current AR systems.

1.3 Outline of the Thesis

The thesis presents the results of an empirical study of the effectiveness of an AR environment in a specific assembly task. Chapter 1 presents an introduction, and gives the general motivation for the research problem. Chapter 2 gives an overview of technologies used in AR and issues involved in the design of AR systems. Chapter 3 examines issues in manufacturing assembly, problems faced by system developers, and how AR can potentially solve some of these problems. Chapter 4 presents the methodology utilized to examine the research problem, and describes the design of the experiment. Chapter 5 presents the experimental results. Chapter 6 discusses the experimental results, and presents the conclusions drawn from the experiment.

Chapter 2
Augmented Reality Systems

There are many methods for augmenting human perception. This thesis focuses on the augmentation of human vision with a see-through head-mounted display. What AR attempts to do is to superimpose an informative virtual environment over the user's field of view according to the position of the user and the direction the user is looking. This chapter will explore the design issues in building an AR system.

2.1 Basic Components of an Augmented Reality System

A typical AR system consists of four components: an HMD, a tracking system, a computer, and software (Figure 2.1). The tracking system estimates the position and orientation of the user's head. This information is used to compute a viewpoint for graphics that will be displayed in the HMD. Tracking of the user's vision and HMD apparatus allows the system to render graphics that register with the real world as viewed through the semi-transparent display.

Figure 2.1. Components of a typical AR system.

2.1.1 See-through Head-Mounted Display

A see-through HMD is a device that combines virtual computer generated graphics with the real environment. There are two major types of see-through HMD for AR systems: optical see-through and video see-through. A video see-through HMD consists of an opaque HMD and two small video cameras mounted on the outside of the HMD. Real time video streams from the two cameras are combined with computer-generated graphics presented inside the opaque HMD (Figure 2.2).
An optical see-through HMD overlays computer graphics on the visual environment using a partially transmissive half-silvered mirror (Figure 2.3). There is also a technology that uses high intensity light to paint images onto the retina of the user's eyes.

Figure 2.2. Video see-through HMD.

Figure 2.3. Optical see-through HMD.

There are advantages and disadvantages to both types of displays. Video see-through displays position the two cameras as an approximation of the position of the user's eyes. Consequently, the video streams seen in the opaque HMD are displaced by the cameras' position. This eye-offset problem can complicate tasks that require very accurate hand-eye coordination (e.g., surgery) [6, 38]. This is not a problem for optical see-through displays.

Both optical and video see-through displays require rendering of the graphics in response to some method for tracking the head position and orientation in real time. In addition, video see-through displays require digitizing and re-rendering the video signals, and this usually adds at least 1/30 of a second of delay to the video stream. This latency can lead to unnatural hand-eye coordination or simulation sickness [36]. Video see-through displays limit the resolution and field of view for both the real and virtual environment to the resolution and FOV of the cameras and the display. With current camera and display technologies, this limit is far inferior to the resolution of the human eye [40]. Video see-through uses video-mixing equipment to "paint" the virtual graphics onto the real environment [12], while optical see-through uses half-silvered mirrors to optically combine the real and virtual environment. One of the disadvantages of optical see-through techniques is that the real scene cannot be obscured by the virtual scene, so everything in the virtual environment looks semi-transparent.

Display calibration refers to the alignment between the virtual world displayed in the HMD and the physical world. Display calibration for a video see-through HMD can be achieved using traditional camera calibration techniques. These calibration procedures can be performed once and reused. For optical see-through techniques, users are required to perform an online calibration procedure to determine viewing parameters such as the center of projection of the display, and the geometric relation between the head tracker, eyes, and the display. Since these relationships vary among different users and are dependent upon the worn position of the display, users are required to perform the calibration procedure every time before operation. Section 2.2.3 describes display calibration in more detail.
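To make the data flow of Figure 2.1 concrete, the following fragment sketches one frame of an optical see-through system: a head pose from the tracking system (described next) is inverted into a view matrix and combined with a display calibration to draw the overlay. This is a minimal illustrative sketch in Python, not the software used in this thesis; the tracker, display, and scene objects are hypothetical placeholders.

    import numpy as np

    def pose_to_view(position, rotation):
        """Invert a rigid head pose (R, t) into a 4x4 world-to-eye view matrix."""
        view = np.eye(4)
        view[:3, :3] = rotation.T                 # inverse rotation: R^T
        view[:3, 3] = -rotation.T @ position      # inverse translation: -R^T t
        return view

    def render_frame(tracker, display, scene, display_calibration):
        """One conceptual frame of an optical see-through AR system.

        display_calibration is the projection recovered offline for this
        user and display (see Section 2.2.3); tracker, display, and scene
        stand in for real subsystems.
        """
        position, rotation = tracker.read_pose()  # head pose from the tracker
        view = pose_to_view(position, rotation)
        # Only the virtual overlay is drawn; the real world is seen directly
        # through the half-silvered mirror.
        display.draw(scene, display_calibration @ view)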
2.1.2 Tracking System

A tracking system is used in an AR environment to approximate the position of the user's head, and the direction the user is looking. According to Rolland et al. [39], tracking technologies can be classified as (1) time-frequency measurement, (2) spatial scan, (3) inertial sensing, (4) mechanical linkages, and (5) direct-field sensing.

2.1.2.1 Time-frequency Measurement Tracking

Time-frequency measurement tracking systems measure the time and/or phase difference of pulsed signals traveling to at least 3 stationary points to determine the position and orientation of the source. Typical pulsed signals used in time-frequency measurement tracking include ultrasonic, infrared laser-diode, and radio signals. This is, by far, the most precise measurement technique, but it suffers from limitations due to occlusion and a low update rate. Also, ultrasonic signals are sensitive to noise from CRT (Cathode Ray Tube) sweep frequencies and disk drives, and tracker lag increases as the distance between the receivers and emitters increases.

2.1.2.2 Spatial Scan Tracking

Spatial scan tracking systems use optical sensing devices, such as CCD (Charge-coupled Device) cameras, to scan for targets in a working volume and determine the position and orientation. Examples of targets in the working volume include fiduciary marks, bar codes, and infrared light sources. Spatial scan tracking has a very good update rate, and could have, in principle, unlimited scalability. But these systems suffer from occlusion and optical noise.

2.1.2.3 Inertial Sensing

Inertial sensing trackers measure the change of momentum of the target to determine the position and orientation. These systems typically use sensing devices such as mechanical gyroscopes and/or accelerometers. Inertial sensing can operate without a source of reference, but suffers from accumulated error over time.

2.1.2.4 Mechanical Linkage Tracking

Mechanical linkage tracking physically links the target to a reference point. With an encoder attached to the linkages, the system uses the angular rotation measured by the encoder to determine the position and orientation of the target. Mechanical linkage tracking usually has high accuracy and low lag, but usually with a limited working volume and range of motion.

2.1.2.5 Direct-field Sensing

Direct-field sensing trackers use magnetic field sensors to measure a static magnetic field to determine the position and orientation of the sensors. The source of the magnetic field can be generated artificially, or Earth's natural magnetic field can be used. Direct-field sensing trackers are inexpensive, lightweight, compact, and can be used without any pre-calibration. But they usually have a larger latency and smaller working volume, and suffer from magnetic interference from metallic objects such as iron and aluminum.

2.2 Calibrations in Augmented Reality

The tracking system only provides information about the position and orientation of the user's head relative to the source of the tracking system. In order for the computer graphics to merge with the real world in a spatially meaningful way, a series of calibrations is required. "Calibration is the process of instantiating parameter values for 'models', which map the physical environment to internal representations, so that the computer's internal model matches the physical world" [27]. Typically, 3 calibration procedures are necessary to obtain the parameters of these geometric relations: pointer calibration, workspace calibration, and display calibration.

Figure 2.4. Transformations between coordinate systems.
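Figure 2.4 can be read as plain matrix composition: each calibration contributes one homogeneous transform, and chaining them maps a tracked marker pose to a point in the virtual world. The Python sketch below illustrates the chain for a calibrated pointer; the matrix values are hypothetical stand-ins, not results from the thesis.

    import numpy as np

    def make_transform(R, t):
        """Assemble a 4x4 homogeneous transform from rotation R and translation t."""
        T = np.eye(4)
        T[:3, :3] = R
        T[:3, 3] = t
        return T

    # Hypothetical calibration results (see Sections 2.2.1-2.2.3):
    # T1: pointer marker -> pointer tip (pointer calibration)
    # T2: tracker/real world -> virtual world (workspace calibration)
    T1 = make_transform(np.eye(3), np.array([0.0, 0.0, 0.15]))  # tip 15 cm from marker
    T2 = np.eye(4)                                              # identity stand-in

    def tip_in_virtual_world(marker_pose):
        """Locate the pointer tip in virtual-world coordinates.

        marker_pose: 4x4 pose of the pointer's marker reported by the tracker.
        """
        tip = marker_pose @ T1 @ np.array([0.0, 0.0, 0.0, 1.0])  # tip, tracker frame
        return T2 @ tip                                          # tip, virtual world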
2.2.1 Pointer Calibration

Pointer calibration determines a geometric transformation from the marker of a tracking system attached to a pointer to the tip of the pointer (transformation T1 in Figure 2.4). Pointer calibration is necessary because we need to pick points in the workspace to align with the virtual world in workspace calibration. The result of pointer calibration can be stored and reused as long as the marker is rigidly attached to the pointer.

2.2.2 Workspace Calibration

Workspace calibration is the alignment of the real world to the virtual world (transformation T2 in Figure 2.4). With a calibrated pointer, we can pick points in the real world and estimate a rigid body transformation or affine transformation to the equivalent points in the virtual world. This calibration can be stored and reused as long as the source of the tracking system remains stable relative to the workspace.

2.2.3 Display Calibration

Display calibration refers to a method to estimate the transformation that applies to the virtual object displayed on the HMD, so that the virtual object is registered with the real object (transformation T3 in Figure 2.4). Display calibration methods for video see-through HMDs are studied extensively in the computer vision literature on camera calibration, such as [46], [2], and [24].

Azuma described a few methods to calibrate see-through HMDs in [1]. One non-systematic calibration method for an optical see-through HMD is to align a virtual object displayed in the HMD with a real object by moving the user's viewpoint until it "looks correct". This approach requires a "skilled user", and generally does not achieve robust results; registration becomes inaccurate when the user moves away from the calibration point. Azuma also describes a more systematic method using a boresight alignment through a long pipe. Tuceryan and Navab developed an optical see-through calibration method called the Single Point Active Alignment Method (SPAAM) [44]. This method uses a single point at a known location in the workspace to calibrate with crosshairs displayed in the HMD. It is considered to be more user-friendly because using a single point for alignment simplifies user interaction. Also, the user is not required to move the head to a fixed location and is free to move during the alignment.

2.3 Error Evaluation for Calibration

Since human performance and calibration error in AR are highly correlated, it is very important to get quantitative data on calibration error when evaluating human performance in AR. Calibration error evaluation for a video see-through HMD can be done using traditional image-based methods [19]. For an optical see-through HMD, this approach is not applicable since the user's retinal image is not available. McGarrity et al. described an online calibration error evaluation method for see-through HMDs that is capable of producing quantitative metric data [28].
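As a concrete illustration of the workspace calibration of Section 2.2.2, the sketch below estimates the least-squares rigid-body transform between points picked with a calibrated pointer and their known virtual-world counterparts. The SVD-based absolute-orientation solution (commonly attributed to Kabsch and to Horn) is one standard choice; the thesis software may use a different estimator.

    import numpy as np

    def rigid_align(P, Q):
        """Least-squares rigid transform mapping points P onto points Q.

        P, Q: (N, 3) arrays of corresponding points, e.g. pointer-tip picks in
        the tracker frame (P) and the matching virtual-world points (Q).
        Returns (R, t) such that Q ~= (R @ P.T).T + t.
        """
        cP, cQ = P.mean(axis=0), Q.mean(axis=0)
        H = (P - cP).T @ (Q - cQ)                # 3x3 cross-covariance matrix
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        t = cQ - R @ cP
        return R, t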
Chapter 3
Overview of Manufacturing Assembly

One of the most exciting applications of AR is assembly and maintenance. In general, manufacturing processes consist of a series of 4 operations: fabrication, assembly, inspection, and testing. This thesis focuses only on the assembly operation in a manufacturing process.

3.1 The Importance of Manual Assembly in Manufacturing

While many assembly operations are automated, there are still a significant number of assembly operations that cannot be done using automation and require a human assembler. Automated assembly is good for assembly tasks that have a well-defined location for acquiring and inserting parts, and for mass production manufacturing processes. For certain assembly processes, "people are good at assembly in spite of their lack of certain abilities. People use vision or, for occluded objects, special aptitude to get within range of an assembly task. They then use tactile sensing in coordination with movement to achieve the task" [37]. Also, in a market where customers are constantly changing what they want, or for products that are highly customized, the cost of redesigning the automated processes can become substantial. Manual assembly is typically used in manufacturing processes where automation is not cost-effective, products are highly customized, or processes cannot be done by automated machinery (e.g., high quality soldering, parts that are too fragile for machinery). Example products of these kinds of processes include aircraft, mainframe computers, military equipment, rapid prototypes, medical devices, and National Aeronautics and Space Administration (NASA) contract work.

In the early 1990s, a new manufacturing conceptual framework, agile manufacturing, began to be employed widely. Agile manufacturing is a manufacturing operation that has the flexibility to change the manufacturing process quickly and efficiently to match rapid changes in market demands. Agile manufacturing has resulted in mass customization in small quantities of highly specialized products. It usually relies heavily on manual operations for flexibility.

3.2 Issues in Manual Assembly

One of the main problems in manual assembly is that expert assemblers are hard to train, particularly for assembly processes that require problem solving skills. It usually takes months or even years for a novice assembler to develop expert knowledge for assembly processes that have high complexity. In some cases, even the experts need to refer to the instructional manual for procedures with high complexity, or procedures that are rarely performed. In agile manufacturing, assemblers face the challenge of a continuously changing assembly process. It is impractical to retrain assemblers every time the assembly processes change. Assemblers need to be cross-trained on different assembly tasks so they have a deeper understanding of the process as a whole, and this training usually needs to be done on the job.

3.3 Using Augmented Reality for Computer Assisted Instruction

CAI is typically used in complex assembly tasks that involve a huge set of assembly instructions, so the assembler can pull up the appropriate instructions online when needed. However, the limited sensorimotor bandwidth (the amount of information flow between the human user and the computer) of current computer and portable digital assistant (PDA) interfaces makes them inadequate for hands-free operation and continuous data access with high interface-user information transfer rates. The limited sensorimotor bandwidth of modern computer interfaces (i.e., small screens, limited input/output options, etc.) makes it hard for a powerful multimedia computer to utilize its capabilities [5, 7].

In this research project, we present an AR system designed to guide and train assembly workers in assembly tasks of large complexity. This approach is very different from the traditional printed manual or online CAI approaches. In an augmented reality environment, 3D synthesized computer graphics are overlaid in the user's field of view. A study conducted by Haines et al. [17] indicated that pilots who use a Head-up Display (HUD) exhibit less head and eye movement than pilots who use a Head-down Display (HDD) in the cockpit panels.
By reducing head and eye movement and increasing "eye-on-the-workspace" time, user performance is expected to increase. By overlaying equivalent information on the work pieces in a spatially meaningful way, time spent searching for information in the instructional medium (e.g., printed manual, handheld display, machine display panel) is reduced. By "seaming" the information to the real environment, AR technologies could be used "as a complement of human cognitive processes" [31]. Using AR as an instructional medium can reduce the overhead of attention switching between the instructional media and the task.

AR systems can also be used to augment human attention. Synthesized computer graphics are merged with the user's view, so attention can be caught by arrows, tags, highlighting the object with a wire-frame, playing 3D animations, etc. Invisible objects that are blocked from view can also be indicated.

AR technologies can also facilitate on-the-job training. Human beings tend to memorize information better when it is docked to a space in the frame of reference of the real world. Demosthenes, a Greek orator born around 384 BC, used a strategy, known as the Method of Loci, to memorize long speeches by mentally walking through one's house, associating each item of the speech with different spots or different objects in the house. In the field of neuroscience, there have been a number of theories suggesting that there is a strong relation between spatial location and working memory [33, 34]. Kirsh argued that "methods used to manage our space are key to organization of our thought patterns and behavior" [22]. By spatially relating pieces of information to physical objects and locations in the real world, AR provides strong leverage of spatial cognition and memory [8].

Chapter 4
Methodology

This thesis hypothesizes that using AR in CAI expands the human capability to absorb and process information. This chapter expands the hypotheses in detail, and explains the methodology used to investigate them.

4.1 Hypotheses

Based on the discussion in Section 3.3, the following hypotheses were generated:

H1: Overlaying information in the user's view using a see-through HMD improves user performance on an assembly task by reducing head and eye movement.

H2: Overlaying information in the user's field of view using AR in a spatially meaningful way improves user performance on the assembly task by reducing attention switching between the instructional media and the workspace.

H3: By offloading the mental transformation tasks to the computer, subjects using AR instructions will perform better than subjects using traditional instructional media, where pictorial instructions need to be mentally transformed to the subject's point of view.

H4: Mental workload for the assembly task using traditional instructional media is higher than using AR instruction.

H5: Individuals with better spatial ability will perform better in an assembly task based on traditional pictorial-based instruction.

4.2 Method

To test the hypotheses, an experiment was employed to compare the effectiveness of 4 different instructional media for an assembly task: a printed manual (treatment 1), CAI on a Liquid Crystal Display (LCD) monitor (treatment 2), CAI on a see-through HMD (treatment 3), and AR (treatment 4). The experiment uses a between-subjects design among the 4 treatment conditions.
Subjects are required to complete the experimental assembly task according to procedural instructions presented using the specific medium of the treatment condition. An assembly task made of Duplo® bricks is used in the experiment to minimize bias towards a population with expertise in knowledge related to the assembly task.

4.3 The Assembly Task

The assembly task consists of 56 procedural steps. For each procedural step, subjects are required to acquire a part of a specific color and size from an unsorted part-bin and insert the part onto the current subassembly in a specific position and orientation according to the presented instruction. The assembly task is 3 dimensional in nature; in some procedural steps subjects are required to put a part on top of parts that were previously inserted. Some of the procedural steps are correlated, so a mistake made in a previous step could potentially generate additional mistakes in later steps. Figure 4.1 shows the completed assembly. The 56 procedural steps are shown in Appendix A.

Figure 4.1. The completed assembly task.

4.4 Experimental Setups

Instructions for all 4 treatment conditions use pictorial representation, without any language. Appendix A shows the 56 procedural steps of the assembly task. The display resolution for all 4 treatments is set to 640 x 480 pixels, using 16-bit color. The graphics used in all 4 treatments are rendered using the ImageTclAR Toolkit developed by the Media and Entertainment Technologies Laboratory at Michigan State University [35].

In order to facilitate hands-free operation while engaged in the task, subjects in treatment 2 (CAI on LCD), treatment 3 (CAI on HMD), and treatment 4 (AR) used voice commands to control the instructions. The voice command "next" advances the instruction to the next procedural step, while the voice command "previous" returns the instruction to the previous procedural step. A human agent is used to interpret the voice commands and control the instructions accordingly to ensure maximum accuracy on the voice recognition task. An audio signal is played to the user as a confirmation of the voice command.

4.4.1 Treatment 1: Printed Media

The printed media is produced using a color solid ink printer with a resolution of 1200 dots per inch (dpi). The instructions are printed single sided, with one procedural step per page (Figure 4.2). Subjects are free to move the manual anywhere in the workspace, or hold it in their hand during operation.

Figure 4.2. Treatment condition 1: printed manual.

4.4.2 Treatment 2: Computer Assisted Instruction on LCD Monitor

Instructions are displayed on a laptop computer placed on the workspace (Figure 4.3). The size of the LCD monitor is 15 inches (diagonal), and the native resolution of the screen is 1400 x 1050 pixels. The pictorial instructions were displayed in full screen. Before the start of the experiment, subjects are free to adjust the brightness and orientation of the screen.

Figure 4.3. Treatment condition 2: CAI on LCD.

4.4.3 Treatment 3: Computer Assisted Instruction on See-through Head-mounted Display

Instructions are displayed on a see-through HMD. The see-through HMD used in the experiment is the Sony Glasstron LDI-100B (Figure 4.4). It has a native resolution of 832 x 624 pixels and a simulated 30-inch (diagonal) screen at 4 feet ahead.

Figure 4.4. Treatment condition 3: CAI on HMD.

4.4.4 Treatment 4: Augmented Reality

Instructions are displayed in stereo using the Sony Glasstron LDI-100B.
Head motion of the subjects is tracked using the Polhemus Fastrak® 6 DOF magnetic tracker. Stereo graphics are rendered in real time based on the data from the magnetic tracker, using a computer with dual Intel® Pentium® III Xeon™ 800 MHz processors, 512 MB of RDRAM® and a 3Dlabs Wildcat II 5110 graphics accelerator, running under Microsoft® Windows® 2000 Professional. The program is written using the ImageTclAR Toolkit [35]. The Toolkit uses a variation of the SPAAM algorithm for stereo display calibration. The calibration procedure is described in Section 4.7.

Figure 4.5. Treatment condition 4: AR.
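The thesis identifies the calibration only as "a variation of the SPAAM algorithm"; the Python sketch below shows the generic idea behind SPAAM-style calibration: each crosshair alignment pairs a 3D point (expressed in the tracked head frame) with a 2D display position, and a 3x4 projection matrix is recovered by homogeneous least squares (the direct linear transform). The exact formulation used by ImageTclAR may differ; for the stereo display used here, one such matrix would be estimated per eye.

    import numpy as np

    def estimate_projection(points_3d, points_2d):
        """DLT estimate of a 3x4 projection from 2D-3D correspondences.

        points_3d: (N, 3) alignment points in the tracked head frame, N >= 6.
        points_2d: (N, 2) crosshair positions on the display, in pixels.
        Solves A p = 0 in the least-squares sense via SVD.
        """
        rows = []
        for (X, Y, Z), (u, v) in zip(points_3d, points_2d):
            rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
            rows.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
        A = np.asarray(rows, dtype=float)
        _, _, Vt = np.linalg.svd(A)
        P = Vt[-1].reshape(3, 4)        # right null-space vector, reshaped
        return P / np.linalg.norm(P)    # fix the arbitrary overall scale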
Errors made in the pretest by the participants were explained after the participants finished the pretest, and participants were asked if they feel comfortable in performing the assembly, and if they want to repeat the pretest to get more familiar with the environment When participants felt comfortable with the pretest environment, they were allowed to proceed to the main test environment. Participants were asked to perform the task in the main experiment as fast and as accurate as possible, and any question the subjects had were answered at that time. The participants then completed the assembly task. Immediately after the experiment, participants completed the post-test questionnaires, which includes the NASA TLX rating, demographic information, and the spatial ability test. After the participants completed the questionnaires, they were thanked and debriefed. 4.8 Measurements Performance: Performance Of the subject is defined as time of completion and the accuracy Of the assembly task. Accuracy is measured in number Of errors the subject made in the assembly task, where error is defined in a particular assembly step as: (1) a part is inserted at the wrong location, (2) a part is inserted with the wrong orientation, (3) 25 a part with the wrong color is inserted, (4) a part with the wrong size is insert, (5) a part is missing, and (6) an extra part is inserted. Spatial ability: Spatial ability of subjects was measured using the mental rotation test [13]. The test includes two timed tests (3 minutes each) assessing 3D rotation of drawings of 42 pairs of cubes. 3 sides Of each cube are visible, and the subject is to mentally rotate one or both cubes to determine if they are the same. Mental Workload: Subjective measurement Of mental workload on the assembly task Of the subjects is collected using the NASA Task Load Index (NASA TLX) [18]. Subjects rate each Of the 6 categories as shown in Table 6.1 based on their experience on the assembly task, using a 20 point scale. And then they were asked to do a pair wise comparison about which category is more important correspond to the assembly task among the 15 combinations as shown in Table 6.2. A mean weighted workload score can then be calculated by adding up on the rating multiplied by its respective weighting for each category. 26 Mental Demand How much mental and perceptual activity was required (e. g. thinking, deciding, calculation, remembering, looking, searching, etc.)? Was the task easy or demanding, simple or complex, exacting or forgiving? Physical Demand How much physical activity was required (e. g. pushing, pulling, turning, controlling, activating, etc.)? Was the task easy or demanding, slow or brisk, slack or strenuous, restful or laborious? Temporal Demand How much time pressure did you feel due to the rate or pace at which the tasks or task elements occurred? Was the pace slow and leisurely or rapid and frantic? ffort How hard did you have to work (mentally and physically) to accomplish your level of performance? Performance How successful do you think you were in accomplishing the goals of the task set by the experimenter (or yourself)? How satisfied were ou with your Erforrlance in accomplishing these goals? ‘ Frustration Level How insecure, discouraged, irritated, stressed and annoyed versus secure, gratified, content, relaxed and complacent did you feel during the task? Table 6.1. NASA TLX Rating Scale Defination. Mental demand Mental demand Mental demand Mental demand Mental demand vs. 
Chapter 5
Results

A total of 75 subjects participated in the experiment: 18 in treatment condition 2, and 19 in each of treatment conditions 1, 3 and 4. The average age of the participants is 20.63. 21 (28%) of the participants are female, and 54 (72%) are male. An alpha level of .05 (2-tailed) was used for all statistical tests.

5.1 Descriptive Statistics

Table 5.1 and Figure 5.1 illustrate the mean time for completing the assembly task in seconds. They demonstrate that treatment 4 (AR) has the shortest time of completion among the 4 treatment conditions, while treatment 1 (printed manual) has the longest time of completion.

Treatment Condition      N    Mean (seconds)   Median (seconds)   Std. Dev.
1: Printed Manual        19   864              847                289.61
2: CAI on LCD Display    18   686              716                158.29
3: CAI on HMD            19   668              687                211.74
4: AR                    19   651              609                174.31

Table 5.1. Descriptive statistics for time of completion in each treatment condition.

Figure 5.1. Bar chart of the average time of completion in each treatment condition.

Table 5.2 and Figure 5.2 show the average number of errors for the assembly task in number of steps. The total number of steps in the assembly task is 56. Two classes of errors are defined: dependent error and independent error. A dependent error is an error that is related to another error made previously in the assembly sequence. An independent error is an isolated error that does not relate to a previous step. The statistics show that treatment condition 4 (AR) has significantly lower error rates in all categories. They also show that a majority of the errors in treatment 4 are independent errors, whereas treatments 1, 2 and 3 exhibit a majority of dependent errors.

Treatment Condition      Average total        Average dependent    Average independent
                         error (# of steps)   error (# of steps)   error (# of steps)
1: Printed Manual        9.37                 7.21                 2.16
2: CAI on LCD Display    8.44                 6.17                 2.28
3: CAI on HMD            9.50                 7.11                 2.39
4: AR                    1.63                 0.21                 1.42

Table 5.2. Descriptive statistics for number of errors in each treatment condition.

Figure 5.2. Bar chart of the average number of dependent errors, independent errors, and total errors in each treatment condition.

Table 5.3 and Table 5.4 show the mean score on the spatial cognition test and the NASA TLX rating. The statistics show that subjects in treatment 1 have the highest mental workload, while subjects in treatment condition 4 have the lowest mental workload. They also show that subjects in the 4 treatment conditions have about the same mean spatial cognitive ability.

Treatment Condition      Spatial Cognition Test
1: Printed Manual        26.95 / 42
2: CAI on LCD Display    28.22 / 42
3: CAI on HMD            26.00 / 42
4: AR                    28.11 / 42

Table 5.3. Average score on the Spatial Cognition Test in each treatment condition.

Treatment Condition      NASA TLX Rating
1: Printed Manual        13.25 / 20
2: CAI on LCD Display    12.23 / 20
3: CAI on HMD            11.04 / 20
4: AR                    10.00 / 20

Table 5.4. Average score on the NASA TLX Rating in each treatment condition.
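Summaries like Tables 5.1 and 5.2 can be reproduced mechanically from per-subject records. A minimal sketch, assuming a hypothetical results file with one row per subject (the file name and column names are illustrative, not from the thesis):

    import pandas as pd

    df = pd.read_csv("assembly_results.csv")   # hypothetical per-subject data

    # Per-treatment descriptive statistics for time of completion (Table 5.1)
    print(df.groupby("treatment")["seconds"]
            .agg(["count", "mean", "median", "std"]))

    # Mean error counts by class (Table 5.2)
    print(df.groupby("treatment")[["total_err", "dep_err", "indep_err"]].mean())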
5.2 Effect of Time of Completion on Treatment Conditions

A one-way ANOVA (Analysis of Variance) was conducted on the effect of treatment condition on time of completion. ANOVA is used to determine whether the differences between treatment conditions are statistically significant. The effect of treatment condition on time of completion is statistically significant, F(3, 71) = 3.75, p = .015. Post hoc comparisons were further conducted using the Bonferroni method to obtain all possible pairwise comparisons among treatment conditions. The results are shown in Table 5.5.

(I) Setting   (J) Setting   Mean Difference (I-J)   Std. Error   Sig.
1             2             178.03                  70.80        .085
              3             173.37                  69.84        .092
              4             212.95                  69.84        .019
2             1             -178.03                 70.80        .085
              3             -4.66                   70.80        1.000
              4             34.92                   70.80        1.000
3             1             -173.37                 69.84        .092
              2             4.66                    70.80        1.000
              4             39.58                   69.84        1.000
4             1             -212.95                 69.84        .019
              2             -34.92                  70.80        1.000
              3             -39.58                  69.84        1.000

Table 5.5. ANOVA post hoc comparisons of time of completion on treatment conditions.

The analysis shows that there is a statistically significant effect between treatment conditions 1 and 4 (p = .019). The effect between treatment conditions 1 and 2 and between treatment conditions 1 and 3 trends toward significance (p = .085 and .092 respectively). But there is no significant effect between treatment conditions 2 and 3 (p = 1.000), treatment conditions 2 and 4 (p = 1.000), or treatment conditions 3 and 4 (p = 1.000). The results of the ANOVA analyses show that treatment conditions 2, 3 and 4 have an improvement in time of completion compared with treatment condition 1. However, there is no statistically significant effect among treatment conditions 2, 3 and 4.

5.3 Effect of Accuracy on Treatment Conditions

5.3.1 Effect of Total Errors on Treatment Conditions

A one-way ANOVA was conducted on the effect of treatment condition on total error rate. The effect of treatment condition on total error is statistically significant, F(3, 71) = 4.41, p = .007. Post hoc comparisons were further conducted using the Bonferroni method to obtain all possible pairwise comparisons among treatment conditions. The results are shown in Table 5.6.

(I) Setting   (J) Setting   Mean Difference (I-J)   Std. Error   Sig.
1             2             .92                     2.65         1.000
              3             -.68                    2.61         1.000
              4             7.74                    2.61         .025
2             1             -.92                    2.65         1.000
              3             -1.61                   2.65         1.000
              4             6.81                    2.65         .073
3             1             .68                     2.61         1.000
              2             1.61                    2.65         1.000
              4             8.42                    2.61         .012
4             1             -7.74                   2.61         .025
              2             -6.81                   2.65         .073
              3             -8.42                   2.61         .012

Table 5.6. ANOVA post hoc comparisons of total error on treatment conditions.

The analysis shows that there are statistically significant effects between treatment conditions 1 and 4 (p = .025) and between conditions 3 and 4 (p = .012). The effect between treatment conditions 2 and 4 trends toward significance (p = .073). But there is no significant effect between treatment conditions 1 and 2 (p = 1.000), treatment conditions 1 and 3 (p = 1.000), or treatment conditions 2 and 3 (p = 1.000). The results of the ANOVA analyses show that treatment condition 4 has a significant improvement in total error compared with treatment conditions 1, 2 and 3. However, there is no statistically significant effect among treatment conditions 1, 2 and 3.
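The omnibus test and pairwise follow-up reported above can be sketched as follows. The completion times below are illustrative stand-ins, not the study's data, and the Bonferroni step is approximated with independent pairwise t tests (SPSS's Bonferroni post hoc instead uses the pooled ANOVA error term).

    import itertools
    from scipy import stats

    # Illustrative completion times (seconds), keyed by treatment condition.
    times = {
        1: [864, 1100, 700, 850, 920],   # printed manual
        2: [686, 640, 750, 700, 655],    # CAI on LCD
        3: [668, 700, 610, 720, 640],    # CAI on HMD
        4: [651, 600, 620, 700, 580],    # AR
    }

    F, p = stats.f_oneway(*times.values())        # omnibus one-way ANOVA
    df_error = sum(map(len, times.values())) - len(times)
    print(f"F({len(times) - 1}, {df_error}) = {F:.2f}, p = {p:.3f}")

    # Bonferroni post hoc: scale each raw pairwise p by the number of pairs.
    pairs = list(itertools.combinations(times, 2))
    for a, b in pairs:
        _, p_raw = stats.ttest_ind(times[a], times[b])
        print(f"{a} vs {b}: p = {min(1.0, p_raw * len(pairs)):.3f}")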
5.3.2 Effect of Dependent Error on Treatment Conditions

A one-way ANOVA was conducted on the effect of treatment condition on the rate of dependent errors. The effect of treatment condition on dependent error is statistically significant, F(3, 71) = 4.68, p = .005. Post hoc comparisons were further conducted using the Bonferroni method to obtain all possible pairwise comparisons among treatment conditions. The results are shown in Table 5.7.

(I) Setting   (J) Setting   Mean Difference (I-J)   Std. Error   Sig.
1             2             1.04                    2.30         1.000
              3             -.53                    2.27         1.000
              4             7.00                    2.27         .017
2             1             -1.04                   2.30         1.000
              3             -1.57                   2.30         1.000
              4             5.96                    2.30         .070
3             1             .53                     2.27         1.000
              2             1.57                    2.30         1.000
              4             7.53                    2.27         .009
4             1             -7.00                   2.27         .017
              2             -5.96                   2.30         .070
              3             -7.53                   2.27         .009

Table 5.7. ANOVA post hoc comparisons of dependent error on treatment conditions.

The analysis shows that there are statistically significant effects between treatment conditions 1 and 4 (p = .017) and between conditions 3 and 4 (p = .009). The effect between treatment conditions 2 and 4 trends toward significance (p = .070). But there is no significant effect between treatment conditions 1 and 2 (p = 1.000), treatment conditions 1 and 3 (p = 1.000), or treatment conditions 2 and 3 (p = 1.000). The results of the ANOVA analyses show that treatment condition 4 has a significant improvement in dependent error compared with treatment conditions 1, 2 and 3. However, there is no statistically significant effect among treatment conditions 1, 2 and 3.

5.3.3 Effect of Independent Error on Treatment Conditions

A one-way ANOVA was conducted on the effect of treatment condition on the rate of independent errors. The effect of treatment condition on independent error is not statistically significant, F(3, 71) = .967, p = .413.

5.4 Effect of NASA TLX on Treatment Conditions

A one-way ANOVA was conducted on the effect of treatment condition on the NASA TLX rating. The effect of treatment condition on the NASA TLX rating is statistically significant, F(3, 71) = 6.26, p = .001.

5.5 Effect of Spatial Cognitive Ability on Performance

5.5.1 Effect of Spatial Cognitive Ability on Total Error

A bivariate correlation analysis was conducted on the effect of spatial cognitive ability on total error. The results are shown in Table 5.8.

Treatment Condition   Pearson Correlation   Sig.
1                     -.121                 .621
2                     -.350                 .155
3                     -.182                 .457
4                     -.270                 .263

Table 5.8. Bivariate correlation of spatial cognitive ability and performance.

The results show that there is no statistically significant correlation between spatial cognitive ability and total error in any treatment condition.

5.5.2 Effect of Spatial Cognitive Ability on Dependent Error

A bivariate correlation analysis was conducted on the effect of spatial cognitive ability on dependent error. The results are shown in Table 5.9.

Treatment Condition   Pearson Correlation   Sig.
1                     -.051                 .835
2                     -.409                 .092
3                     -.101                 .682
4                     .198                  .417

Table 5.9. Bivariate correlation of spatial cognitive ability and dependent error.

The results show that there is no statistically significant correlation between spatial cognitive ability and dependent error in any treatment condition.

5.5.3 Effect of Spatial Cognitive Ability on Independent Error

A bivariate correlation analysis was conducted on the effect of spatial cognitive ability on independent error. The results are shown in Table 5.10.

Treatment Condition   Pearson Correlation   Sig.
1                     -.369                 .120
2                     -.001                 .998
3                     -.394                 .095
4                     .198                  .417

Table 5.10. Bivariate correlation of spatial cognitive ability and independent error.

The results show that there is no statistically significant correlation between spatial cognitive ability and independent error in any treatment condition.
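Each cell of Tables 5.8 through 5.11 is a Pearson product-moment correlation computed within one treatment condition. A minimal sketch with hypothetical scores (the reported degrees of freedom are N - 2):

    from scipy import stats

    # Hypothetical per-subject scores within one treatment condition.
    spatial = [26, 31, 22, 35, 28, 24, 30, 27]   # mental rotation test, out of 42
    errors  = [10, 7, 12, 5, 9, 11, 6, 8]        # total errors, out of 56 steps

    r, p = stats.pearsonr(spatial, errors)       # bivariate correlation
    print(f"r({len(spatial) - 2}) = {r:.3f}, p = {p:.3f}")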
5.5.4 Effect of Spatial Cognitive Ability on Time of Completion

A bivariate correlation analysis was conducted on the effect of spatial cognitive ability on time of completion. The results are shown in Table 5.11.

Treatment Condition   Pearson Correlation   Sig.
1                     .000                  .263
2                     -.493                 .038
3                     -.182                 .457
4                     .198                  .417

Table 5.11. Bivariate correlation of spatial cognitive ability and time of completion.

The results show that there is no statistically significant correlation between spatial cognitive ability and time of completion in treatment conditions 1, 3 and 4, and that there is a statistically significant effect in treatment condition 2.

Chapter 6
Discussions and Conclusions

This chapter explores the experimental findings in relation to the stated hypotheses. It investigates the implications of the results for the theoretical model, and provides further insight into the influence of AR on human performance and perception.

6.1 Effect of Information Overlay on Performance

Hypothesis 1 states that overlaying information in the user's view using a see-through HMD improves the user's performance on the assembly task by reducing head and eye movement. This hypothesis suggests that the performance of subjects, in terms of time of completion and accuracy, in treatment conditions 3 and 4 is expected to be better than in treatment conditions 1 and 2. Even though there are statistically significant advantages in time of completion and accuracy in condition 4 compared with conditions 1 and 2, there is no significant advantage in time of completion in condition 3 compared with condition 2, and no significant advantage in accuracy in condition 3 compared with conditions 1 and 2. Therefore, this hypothesis is not supported.

In treatment conditions 1, 2 and 3, it is a common practice for subjects to count the number of bumps from the edge of the Duplo® base plate to determine the exact position of the part to be inserted. Some subjects in treatment condition 3 also reported that it was hard to perform counting on the instructions since they could not touch the instructions physically. Some of the responses from subjects in treatment condition 3 stated that the overlaid instructions interfered with the workspace and it was hard to see
6.2 Effect of Attention Switching and Mental Transformation Offloading on Performance

Hypothesis 2 states that overlaying information in the user's view using AR in a spatially meaningful way improves user performance on the assembly task by reducing attention switching between the instructional medium and the workspace. Hypothesis 3 states that, by offloading the mental transformation tasks to the computer, subjects using AR instructions will perform better than subjects using traditional instructional media, where pictorial instructions must be mentally transformed to the subject's point of view. These two hypotheses suggest that the performance of subjects, in terms of time of completion and accuracy, in treatment condition 4 should be better than in treatment conditions 1, 2 and 3. There is a statistically significant advantage in accuracy (in both total error and dependent error) in condition 4 compared with conditions 1, 2 and 3, but no statistically significant advantage in time of completion. Since time of completion and accuracy are naturally traded off against each other (e.g. the faster you go, the more mistakes you make), an advantage in one measure with equal performance in the other can be considered an advantage in overall performance. Therefore, these hypotheses are supported.

There is extensive research in the ergonomics of HUDs for aircraft pilots concerning switching attention between information sources and the real environment. [4, 25, 26, 30] reported evidence that optically overlaid information cannot be processed in parallel, and [16, 23, 45] reported a time cost associated with cognitive switching between the information displayed in a HUD and the surrounding environment. In AR, synthetic computer graphics are registered with the real world and appear to be part of it, which eliminates the cognitive load of switching attention between the displayed information and the working environment. However, the author is aware of no literature on how computer-assisted mental transformation of pictorial diagrams affects user performance; it is a general presumption that computer assistance in the mental transformation task may improve performance. It is not certain how these two factors contribute to user task performance, i.e. which factor contributes more to the improvement, and more research is needed to determine their respective contributions. The sketch that follows makes the offloaded transformation concrete.
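The "mental transformation" offloaded in Hypothesis 3 corresponds to a concrete computation in an AR pipeline: composing the tracked head pose with the part's target pose so that the instruction is rendered from the user's own viewpoint. The numpy sketch below illustrates that composition. The function and variable names, and the example poses, are invented for illustration; they are not taken from the thesis system, which was built on the ImageTclAR environment [35].

    import numpy as np

    def pose(R, t):
        """Build a 4x4 homogeneous transform from rotation R and translation t."""
        T = np.eye(4)
        T[:3, :3] = R
        T[:3, 3] = t
        return T

    # Illustrative poses: the target pose of a brick in workspace
    # coordinates, and the tracked pose of the user's head in the
    # same coordinate frame (values are placeholders).
    brick_in_workspace = pose(np.eye(3), [0.10, 0.05, 0.0])
    head_in_workspace = pose(
        np.array([[1, 0, 0], [0, 0, -1], [0, 1, 0]]),  # tilted to look down
        [0.0, -0.3, 0.45],
    )

    # The transform the subject would otherwise compute mentally:
    # where the brick belongs, expressed in the viewer's own frame.
    brick_in_view = np.linalg.inv(head_in_workspace) @ brick_in_workspace
    print(brick_in_view.round(3))

In the paper-manual and monitor conditions, this invert-and-compose step is roughly what the subject must perform mentally to relate the author's pictorial viewpoint to his or her own.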
6.3 Effect of Treatment Conditions on Mental Workload

Hypothesis 4 states that the mental workload of the assembly task using traditional instructional media is higher than using AR instruction. This hypothesis suggests that the NASA TLX rating in treatment condition 4 should be lower than in treatment conditions 1, 2 and 3. The NASA TLX rating in condition 4 is statistically significantly lower than in conditions 1, 2 and 3, indicating that subjects' mental workload in condition 4 is lower. Therefore, this hypothesis is supported.

6.4 Effect of Spatial Cognitive Ability on Performance

Hypothesis 5 states that individuals with better spatial ability will perform better in an assembly task based on traditional pictorial instructions. This hypothesis suggests that the spatial cognition test score in treatments 1, 2 and 3 should be correlated with subject performance. However, there is no statistically significant correlation between the spatial cognition test score and subject performance in any treatment condition. Therefore, this hypothesis is not supported.

It is possible that a Type II error occurred in this correlation analysis: the analysis might have missed a small effect because of an insufficient sample size (about 19 subjects per treatment condition). A bivariate correlation analysis was therefore repeated with a combined sample from treatments 1, 2 and 3. The results are shown in Table 6.1.

    Correlation of spatial cognitive ability, combined sample from treatment conditions 1, 2 and 3
    Total error           r(56) = -.207, p = .127
    Dependent error       r(56) = -.164, p = .187
    Independent error     r(56) = -.291, p = .030
    Time of completion    r(56) = -.251, p = .062

Table 6.1. Bivariate correlation analysis of spatial cognitive ability for the combined sample from treatment conditions 1, 2 and 3.

The analysis shows that the correlation between independent error and spatial cognitive ability is statistically significant, and the correlation between time of completion and spatial cognitive ability trends toward significance. This result contradicts the assertion above that hypothesis 5 is false. A larger sample is necessary in order to make an accurate assertion about hypothesis 5.
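The significance values in Table 6.1 follow from the standard t transformation of a Pearson correlation with df = n - 2. The short sketch below shows that conversion for the reported r values. It will not reproduce the reported p-values exactly, since those were computed from the unrounded data, but it makes the relationship between r, df and p explicit.

    import math
    from scipy import stats

    def pearson_p(r, df):
        """Two-sided p-value for a Pearson correlation r with df = n - 2."""
        t = r * math.sqrt(df / (1 - r * r))
        return 2 * stats.t.sf(abs(t), df)

    # r values as reported in Table 6.1 (combined sample, df = 56).
    for label, r in [("total error", -.207), ("dependent error", -.164),
                     ("independent error", -.291), ("time of completion", -.251)]:
        print(f"{label}: p = {pearson_p(r, 56):.3f}")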
6.5 Effect of Dependent Error in Augmented Reality

In Section 5.1, it was noted that the number of dependent errors in treatment condition 4 is much lower than in the other three treatment conditions. This may be because determining position and orientation from a pictorial diagram drawn from the author's perspective is an inherently difficult task; human beings tend to approximate position and orientation using fixations and landmarks already in place. By overlaying the instruction at the exact position where the part is to be inserted, AR not only reduces the cognitive workload of locating the position and orientation in the workspace from the instructional medium, but also eliminates some of the dependency among procedural steps.

6.6 Effect of Attention Tunneling in Augmented Reality

It was observed that the rate at which subjects corrected a mistake made in a previous assembly step is much lower in treatment condition 4 than in treatment conditions 1, 2 and 3. This observation is consistent with a phenomenon called attention tunneling (also referred to as attention capture or cognitive capture in some literature): attention is focused on the cued area at the cost of other areas. Dopping-Hepenstal reported that "military pilots fixated more frequently on information presented on a HUD at the cost of scanning the outside scene" [11]. Yeh et al. reported that "cueing aided the target detection task for expected targets but drew attention away from the presence of unexpected targets in the environment" [47]. Attention tunneling can reduce user performance and generate potentially hazardous scenarios; Yeh et al. recommended that designers of such cueing systems evaluate operator reliance on automation more carefully.

6.7 Conclusion

The results of this research project support the claim that AR improves human performance and relieves some of the user's mental workload. The ability to overlay and register information on the workspace in a spatially meaningful way allows AR to serve as an effective instructional medium. However, the limitations of current display and tracking technologies remain the biggest obstacles preventing AR from being practical in real-world use. There is also a psychological implication in the phenomenon of attention tunneling, which can reduce human performance. AR system designers need to leverage the potential power of AR carefully in order to design systems that achieve an overall improvement in performance.

Appendix A

Procedural Steps for the Assembly Task

[Figures illustrating assembly Steps 1 through 56 appear here in the original document.]

Bibliography

1. Azuma, R. (1995). Predictive Tracking for Augmented Reality. Doctoral dissertation, University of North Carolina at Chapel Hill. p. 262.
2. Bajura, M. (1993). Camera Calibration for Video See-Through Head-Mounted Display. Technical Report TR93-048. Chapel Hill: Department of Computer Science, University of North Carolina at Chapel Hill.
3. Bajura, M., H. Fuchs, and R. Ohbuchi (1992). Merging Virtual Reality with the Real World: Seeing Ultrasound Imagery Within the Patient. IEEE Computer Graphics, 26(2): p. 203-210.
4. Becklen, R. and D. Cervone (1983). Selective looking and the noticing of unexpected events. Memory & Cognition, 11: p. 601-608.
5. Biocca, F. (2000). Human-bandwidth and the design of Internet 2 interfaces: Human factors and psychosocial challenges. In Internet2 Sociotechnical Summit, Ann Arbor, MI.
6. Biocca, F. and J. Rolland (1998). Virtual Eyes Can Rearrange Your Body: Adaptation to visual displacement in see-through, head-mounted displays. Presence, 7(3): p. 262-277.
7. Biocca, F., A. Tang, and D. Lamas (2001). Evolution of the mobile infosphere: Iterative design of a high-information-bandwidth, mobile augmented reality interface. In Euroimage 2001: International Conference on Augmented Virtual Environments and 3D Imaging, Mykonos, Greece.
8. Biocca, F., A. Tang, D. Lamas, J. Gregg, R. Brady, and P. Gai (2001). How do users organize virtual tools around their body in immersive virtual and augmented environments? An exploratory study of egocentric spatial mapping of virtual tools in the mobile infosphere. Technical Report. East Lansing: Media Interface and Network Design Labs, Michigan State University.
9. Brooks, F.P. (1996). The Computer Scientist as Toolsmith II. Communications of the ACM, 39(3): p. 61-68.
10. Caudell, T.P. and D.W. Mizell (1992). Augmented Reality: An Application of Heads-Up Display Technology to Manual Manufacturing Processes. In International Conference on System Sciences, Kauai, Hawaii.
11. Dopping-Hepenstal, L.L. (1981). Head-up displays: The integrity of flight information. IEE Proceedings Part F, Communications, Radar and Signal Processing, 128(7): p. 440-442.
12. Edwards, E.K., J. Rolland, and K.P. Keller (1992). Video see-through design for merging of real and virtual environments. In IEEE Virtual Reality Annual International Symposium, Seattle, WA.
13. Ekstrom, R.B., J.W. French, H.H. Harman, and D. Derman (1976). Manual for Kit of Factor-Referenced Cognitive Tests. Princeton, NJ: Educational Testing Service.
14. Feiner, S., B. MacIntyre, and D. Seligmann (1993). Knowledge-based Augmented Reality. Communications of the ACM, 36(7): p. 52-62.
15. Feiner, S., B. MacIntyre, T. Höllerer, and A. Webster (1997). A Touring Machine: Prototyping 3D Mobile Augmented Reality Systems for Exploring the Urban Environment. In International Symposium on Wearable Computers, Cambridge, MA.
16. Fisher, E., R. Haines, and T. Price (1980). Cognitive issues in head-up displays. Technical Report 1711. Moffett Field, CA: NASA Ames Research Center.
17. Haines, R., E. Fischer, and T. Price (1980). Head-up transition behavior of pilots with and without head-up display in simulated low-visibility approaches. Technical Report 1720. Moffett Field, CA: NASA Ames Research Center.
18. Hart, S.G. (1987). Background Description and Application of the NASA Task Load Index (TLX). In Department of Defense Human Engineering Technical Advisory Group Workshop on Workload, Newport, RI.
19. Holloway, R.L. (2001). Registration Error Analysis for Augmented Reality Systems. In W. Barfield and T.P. Caudell (Eds.), Fundamentals of Wearable Computers and Augmented Reality. Mahwah, NJ: Lawrence Erlbaum Associates. p. 183-217.
20. Inzuka, Y., Y. Osumi, and K. Shinkai (1991). Visibility of head-up display for automobiles. In 35th Annual Meeting of the Human Factors Society.
21. Jebara, T., C. Eyster, J. Weaver, T. Starner, and A. Pentland (1997). Stochasticks: Augmenting the Billiards Experience with Probabilistic Vision and Wearable Computers. In International Symposium on Wearable Computers, Cambridge, MA.
22. Kirsh, D. (1995). The intelligent use of space. Artificial Intelligence, 73(1): p. 31-68.
23. Larish, I. and C. Wickens (1991). Attention and HUDs: Flying in the dark? In Society for Information Display International Symposium Digest of Technical Papers.
24. Lenz, R.K. and R.Y. Tsai (1988). Techniques for Calibration of the Scale Factor and Image Center for High Accuracy 3-D Machine Vision Metrology. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(5): p. 713-729.
25. McCann, R.S., D.C. Foyle, and J.C. Johnston (1993). Attentional limitations with head-up displays. In Seventh International Symposium on Aviation Psychology, Columbus, OH.
26. McCann, R.S., J.M. Lynch, D.C. Foyle, and J.C. Johnston (1993). Modeling attentional effects with head-up displays. In Human Factors and Ergonomics Society 37th Annual Meeting.
27. McGarrity, E. and M. Tuceryan (1999). A Method for Calibrating See-through Head-mounted Displays for AR. In 2nd IEEE International Workshop on Augmented Reality (IWAR '99), San Francisco, CA.
28. McGarrity, E., M. Tuceryan, C. Owen, and N. Navab (2001). A new system for online quantitative evaluation of optical see-through augmentation. In IEEE and ACM International Symposium on Augmented Reality, New York, NY.
29. Mizell, D.W. (2001). Boeing's Wire Bundle Assembly Project. In W. Barfield and T.P. Caudell (Eds.), Fundamentals of Wearable Computers and Augmented Reality. Mahwah, NJ: Lawrence Erlbaum Associates. p. 447-467.
30. Neisser, U. and R. Becklen (1975). Selective looking: Attending to visually specified events. Cognitive Psychology, 7: p. 480-494.
31. Neumann, U. and A. Majoros (1998).
Cognitive, Performance, and Systems Issues for Augmented Reality Applications in Manufacturing and Maintenance. In IEEE VRAIS '98, Atlanta, GA.
32. Ohshima, T., K. Satoh, H. Yamamoto, and H. Tamura (1998). AR2 Hockey system: A collaborative mixed reality system. Transactions of the Virtual Reality Society of Japan, 3(2): p. 55-60.
33. O'Keefe, J. and J. Dostrovsky (1971). The hippocampus as a spatial map. Brain Research, 34: p. 171-175.
34. O'Keefe, J. and L. Nadel (1978). The Hippocampus as a Cognitive Map. Oxford: The Clarendon Press.
35. Owen, C. (2001). The ImageTclAR Augmented Reality Development Environment. http://metlab.cse.msu.edu/imagetclar
36. Pausch, R., T. Crea, and M. Conway (1992). A Literature Survey for Virtual Environments: Military Flight Simulator Visual Systems and Simulator Sickness. Presence, 1(3): p. 344-363.
37. Redford, A. and J. Chal (1994). Design for Assembly: Principles and Practice. London, England: McGraw-Hill Book Company.
38. Rolland, J., F. Biocca, F. Barlow, and A. Kancherla (1995). Quantification of Adaptation to Virtual-Eye Location in See-Thru Head-Mounted Displays. In IEEE VRAIS '95, Research Triangle Park, NC.
39. Rolland, J., L. Davis, and Y. Baillot (2001). A Survey of Tracking Technology for Virtual Environments. In W. Barfield and T.P. Caudell (Eds.), Fundamentals of Wearable Computers and Augmented Reality. Mahwah, NJ: Lawrence Erlbaum Associates. p. 67-112.
40. Rolland, J. and H. Fuchs (2001). Optical versus Video See-Through Head-Mounted Displays. In W. Barfield and T.P. Caudell (Eds.), Fundamentals of Wearable Computers and Augmented Reality. Mahwah, NJ: Lawrence Erlbaum Associates. p. 113-156.
41. Sojourner, R. and J. Antin (1990). The effects of a simulated head-up display speedometer on perceptual task performance. Human Factors, 32(3): p. 329-339.
42. Sutherland, I.E. (1965). The ultimate display. In IFIP Congress.
43. Sutherland, I.E. (1968). A Head-mounted Three Dimensional Display. In Proceedings of the AFIPS Conference.
44. Tuceryan, M. and N. Navab (2000). Single point active alignment method (SPAAM) for optical see-through HMD calibration for AR. In IEEE and ACM International Symposium on Augmented Reality, Munich, Germany.
45. Weintraub, D.R., R. Haines, and R. Randle (1985). Head-up display (HUD) utility II: Runway to HUD transitions monitoring eye focus and decision times. In Human Factors Society 29th Annual Meeting.
46. Weng, J., P. Cohen, and M. Herniou (1992). Camera calibration with distortion models and accuracy evaluation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(10): p. 965-980.
47. Yeh, M. and C.D. Wickens (2000). Attention and Trust Biases in the Design of Augmented Reality Displays. Technical Report ARL-00-3/FED-LAB-00-1. Savoy, IL: Aviation Research Lab, University of Illinois at Urbana-Champaign.