EVALUATION OF REGISTRATION ERROR THROUGH OPTICAL MODELING OF THE DISPLAY EYE SYSTEM

By

Jonathan P. Babbage

A THESIS

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

Department of Computer Science and Engineering

2003

ABSTRACT

EVALUATION OF REGISTRATION ERROR THROUGH OPTICAL MODELING OF THE DISPLAY EYE SYSTEM

By Jonathan P. Babbage

This thesis examines the system formed by the combination of the human eye and an optical see-through head mounted display (HMD) for Augmented Reality (AR) applications. It is critical that developers of AR systems know what expectations can be placed on the hardware of the system. What level of registration accuracy can we expect between the virtual content as displayed in an HMD and real content in the world? What errors are introduced? In the past, the error has been measured by comparing points displayed in the HMD with the locations where a user perceives those points. This gives a measurement of the total error in the system, but it includes other variables such as noise from the tracking system and human error in the pointing process. This thesis develops a model for the correspondence of points in an optical see-through HMD with points in the physical world as seen through the display, taking into account the parameters of the physical human-computer system and motion of the eye relative to the display. Typically the graphics that are generated in an augmented reality system are created using a pin-hole camera model. The eye is much more complicated, thus this camera model may not be sufficient. A model of the human eye is presented and used to analyze what happens to light that enters the eye, be it from a real world object or one presented in the HMD. Using this information we are able to model the registration of virtual content and real content in the HMD/eye system.

TABLE OF CONTENTS

ABSTRACT

1 INTRODUCTION
1.1 Real World - Virtual Alignment
1.2 Related Works
1.3 Contributions of this Thesis
1.4 Outline

2 COMPONENTS OF AN AUGMENTED REALITY SYSTEM
2.1 Hardware
2.1.1 Tracking System
2.1.2 System Processor
2.1.3 Head Mounted Display
2.2 Software
2.2.1 Calibration
2.2.2 Graphics Rendering
2.3 User

3 MODEL OF THE DISPLAY-EYE SYSTEM
3.1 Physiology of the Human Eye
3.2 Values Used in Calculation
3.3 Components of Head Mounted Display
3.4 Measurements used for Model
3.5 Considerations
3.6 Components as a System

4 METHODS
4.1 Overview of Methods
4.2 Mathematical Models
4.2.1 Sphere Intersection
4.2.2 Plane Intersection
4.2.3 Direction Computation
4.2.4 Subsequent Surfaces
4.2.5 Pin-Hole Camera Model
4.3 Modeling Behavior of the Entire System
4.3.1 Mapping Real World to Display
4.3.2 Mapping Display to Real World
4.4 Ray Selection
4.4.1 Searching

5 RESULTS
5.1 Graphical Rendering Error
5.1.1 Experiment Layout
5.2 Isolated Eye Movement
5.3 Movement of the Head Mounted Display Relative to the Eye

6 DISCUSSION AND CONCLUSIONS
6.1 Pin Cushion Effect
6.2 Handling Eye Movement
6.3 Relative Movement of the HMD
6.4 Conclusion

BIBLIOGRAPHY

LIST OF TABLES

3.1 Values for schematic eye
3.2 Indexes of Refraction
3.3 Values for Head Mounted Display
5.1 Values of Real World Point and Perceived Rendering
5.2 Values for Rotation of 13°
5.3 Values for Moving the Display
6.1 Values for Radial Distortion Correction

LIST OF FIGURES

2.1 Hardware components of an Optical See-Through AR System
2.2 Formation of a Virtual Image
2.3 Coordinate System Transformations
2.4 Calibration of Coordinate System
3.1 Horizontal Section of the Eye
3.2 Surfaces and Media of the Eye (not to scale)
3.3 Optical Components of the Sony Glasstron
3.4 Possible Movement of the Sony Glasstron
4.1 Refraction
4.2 Incident Angle
4.3 Resultant Angle
4.4 2 Dimensional Pin Hole Camera
4.5 3 Dimensional Camera
4.6 Reflection in a Curved Mirror
4.7 Initial Ray Selection for HMD to Real World Mapping
4.8 Search Vectors
5.1 Layout of Optical Components for Extra-Fovea Points
5.2 Points used to create optimal configuration
5.3 Comparison of Real World Points and Perceived Rendering
5.4 Graph of Error Based on a Rotation of 13°
5.5 Graph for Moving the Display
6.1 Real World Pin Cushion Example
6.2 Simulated Rotation of the Eye
6.3 Simulated Display Movement

CHAPTER 1
INTRODUCTION

Augmented Reality (AR) is the blending of computer-generated virtual content with real content.
While many paradigms are available in AR, this thesis focuses on the use of head mounted displays (HMD) that present a 2D projection screen overlaid on the field of view of a user, thereby enhancing a user's perception of the world. AR is related to Virtual Reality (VR), but instead of replacing the world with an alternative reality, the user is still allowed to perceive reality, only with computer enhancement. In effect, the world is blended with the virtual elements [28]. The approach eliminates the isolation typically experienced in a VR environment. AR also allows the use of real world context as a reference to virtual content. It is this alignment of the real world and virtual components that necessitates information being accurately displayed. This thesis examines the eye, the HMD, and their contributions to system error.

Computers have been advancing at great speeds for many years. A simple example of this growth is seen in Moore's Law. First stated by Gordon Moore in 1965, it predicts that the number of transistors on an integrated circuit will double every few years [30]. This prediction has held for almost 40 years since it was proposed. The same progress has been seen in many different aspects of computing including memory capacity, processor speed, and network speed. During this time human interaction with the computer has not seen advancements at the same speed. In particular, the portal through which we communicate with a computer remains a tiny rectangle relative to the size of real desktops and work environments. The advent of AR is changing this method of interaction. It also introduces the problem of aligning information with the real world. The quality of this alignment has been examined by Erin McGarrity [26]. This research only dealt with the overall error encountered by a user over a period of time. An AR system has many components which contribute to the error measured by previous research. It was unknown how each part of the system added to the measured error. In order to increase the accuracy of the system it is necessary to know how different components are contributing to the error. Therefore, the eye and the display are isolated in order to determine their role in the system.

This thesis focuses on both the human eye and the HMD that presents images to the eye. Both of these are small objects and difficult to measure. In the case of the eye, measurements were obtained from the medical literature. Once the measurements were known, models were created that allowed for the examination of behavior in different situations. These models predicted the behavior of the system, but in order to validate the results photos were taken. The images were compared to the results from the models to see if the models of the system mimicked the real behavior. The simulations support the idea that the current HMD is capable of presenting much more accurate information to the user, and therefore it must be other aspects of the system that are producing the majority of the error. The optical properties of the system can be compensated for in software. The movement of the user's eye does not change perceived locations drastically. Also, the movement of the HMD does not produce a great deal of error, though the tilting of it may. Therefore, if the tilting of the HMD is limited and the software corrections are made, the HMD can be considered accurate.

1.1 Real World - Virtual Alignment

Consider the small task of going to a new movie theater.
On a computer one could easily find a map as well as driving directions that would lead to the theater. The user is then required to take this information from the computer, store it, and apply it to the real world. Computer hardware now allows for new methods of displaying data and entering information. Augmented Reality can be defined as the combination of information from a computer with the real world [4]. This broad definition admits a wide range of activities into the Augmented Reality classification.

Again consider the task of going to a new movie theater. The following are two different solutions to the problem that may be considered AR. In both examples the output from a portable computer is presented to the user through a Head Mounted Display (HMD). The first application could be as simple as displaying text directions to the theater. This removes the requirement for a user to remember information for the duration of the drive. The second solution involves a more complicated combination of real world information, user input, and virtual information. Add to the previous system a method for tracking a user's pose and location. Pose can be defined as one's orientation in space. Now one could present data such as distance until the next turn, estimated time until arrival, graphics that would direct the user to turn onto the appropriate street, and an error message if the user is not headed in the right direction. The idea of using pose information coupled with an HMD to create a more immersive display can be traced back to the ideas of Sutherland [3, 2, 39, 40]. The real world advantages of such an immersive display have been researched by Tang [41].

These two examples demonstrate the idea that AR applications can have varying degrees of immersion. The range of what can be considered AR is broad, with many applications lying at different points along the spectrum. It is for this reason that it is useful to consider AR as being part of a continuum that occupies the space between a real environment and a completely virtual environment [28]. There are interesting ideas that the concept of a continuum introduces. It allows one to discuss the location of an application in the spectrum. A second concept that may be introduced is the idea of sliding a program in either direction along the spectrum. One could consider how the application would change if the user were made more aware of the real world or if it were more obscured.

The system discussed in this thesis works by presenting virtual content in a display such that the content overlays the visual field. A user sees reality with virtual elements added. Display technologies exist that present this overlay as an addition to the light viewed at each pixel or as a replacement [33]. Displays present content to each eye, thereby allowing for perception of the augmented content as 3D, fully registered with reality. Proper registration requires the placement of the graphics content at exactly the correct location in the display so as to be along the ray from the pupil to the corresponding physical location. Hence, a major element of any AR system is a method for ensuring the correct registration between the virtual content and reality. Other AR paradigms, such as augmented imagery, the augmentation of still and video images, allow for the automation of the registration process. The final image as seen by the user is available to the computer for analysis. This is also true for video see-through head mounted displays.
In this class of display the user perceives the world through a camera and an opaque display, much as has traditionally been used for virtual reality systems. Again, the combined image as perceived by the user is available for analysis, allowing for automatic registration mechanisms and direct measurement of registration performance. This process is complicated in an optical see-through display because the actual composition of the real and virtual content takes place on the retina of the user. An absolute capture of this composite image is not possible, so automatic methods for ensuring accurate registration based on image analysis are not possible. Hence, all existing approaches to optical see-through HMD calibration and registration require human interaction. This thesis seeks to determine what performance is actually possible in the eye/display system and how to achieve the best possible performance.

In the second example given above there is a great deal of error that a user could accept. If the system were off by a measurement as drastic as a few feet, the graphics might instruct the user to turn before reaching the correct location. The user in the system, a driver, could recognize this error upon being directed to turn onto a sidewalk or into a street sign. In this case the user would still be able to identify and turn onto the correct street quite easily. This is not the case for all possible applications. If the application were moved along the AR continuum in the direction of being completely virtual, it would not be so easy for the user to correct this error. It is possible the system would block out the user's ability to see the road in the real world. It is also possible the information being presented does not correspond to some real world object, or the user is not able to observe the real object. The user would have no reference to the real world and thus not be able to correct the error. This is the case in research done at the University of North Carolina. They designed a system to present medical information on a patient [23, 13]. In this case an error of a centimeter could be very drastic if it were used in a surgical setting [24].

Error is introduced into the system from two main sources. The first is inaccurate tracking. If the pose information from the tracking system is off by a degree, this could result in an error of a centimeter at arm's length. The other source is inaccurate rendering of the graphics. Graphics rendering systems often use a pin-hole camera model to create images. This model will be discussed later, but it is enough to know that the human eye is considerably more complex than a pin-hole camera. Complexities of the eye include the fact that it is capable of changing shape, has multiple optical surfaces, and is able to move to focus on different objects.

1.2 Related Works

The first step in researching the error in an AR system was moving from a qualitative to a quantitative measurement. This advancement was made by Erin McGarrity [26]. Previous to his work there was only a qualitative measure. A user could say the system seemed to work well or that it had too much error. One needed to be able to make statements such as "the average error of the system is one centimeter." In order to establish this measurement of error in the system, objects in the real world must be compared to virtual ones. This is done by defining a set of points in space. These points have real world locations that are used in the computation.
Also the points are put into the virtual system. Then they can be rendered as virtual objects and displayed to the user. Now the user needs to input the location where they perceive the point. The user is able to see the graphics in the HMD. They then take a stylus and move it to the location where they see the point. The tracking system then converts this selected point into a real world location. This information, along with the original location, is used to compute the error. This process is repeated over a large number of points and produces a quantity that is the average error.

Other related work includes Holloway's research looking into all the different aspects that can contribute to system error [14, 15]. Other research has been done by Min that focuses specifically on stereo viewing [29]. Stereo viewing refers to the process of presenting a different image to each eye in order to mimic real world experiences. There have also been efforts to reduce the error in the eye/display system made by Barsky [5, 6]. This research aimed to reduce error through an updated rendering technique.

1.3 Contributions of this Thesis

At this point an aggregate error is known, but it is not known how much each component contributes to the total. Therefore each component must be examined separately from the rest of the system in order to produce a measurement of the error that portion contributes. This thesis identifies the different types and magnitudes of error possible in the process of rendering and perceiving information. The contributions of this thesis are:

1. A model of the eye/display system and a method to simulate its behavior.
2. A mapping from real-world points to display points.
3. An inverse mapping from display points to real-world rays.
4. Experimental validation of the model and simulation methods.
5. A model for consequences of both eye and display movement during usage.

In order to accomplish this, the optical objects that are involved in both presenting graphics and perceiving them must be measured and modeled. This means that models for the HMD and the human eye are needed. The main result of the thesis is a measure of a quantity of error that is a product of the interaction of the display and the eye. This error is a measurement that assumes perfect tracking and calibration.

1.4 Outline

This thesis is based on the results of computations done to model different situations that could arise in an AR system. Chapter 1 is an introduction to the key ideas in AR as well as a presentation of the motivation and contributions of the research. Chapter 2 discusses the components that make up the particular AR system being modeled in this thesis as well as the current method of graphics rendering. Chapter 3 presents the mathematical models used in the computation of results. This chapter also contains measurement data from the components of the system. Chapter 4 details the steps involved in converting these models into a computational process that produces results. Chapter 5 contains the results from the different situations that were examined. The final chapter discusses and summarizes the results and possible future research.

CHAPTER 2
COMPONENTS OF AN AUGMENTED REALITY SYSTEM

There are a wide variety of augmented reality applications and each requires a unique and specific hardware configuration. The process of combining images from the real world with those generated by a computer is the primary defining factor in an AR system.
This composition can be accomplished in a variety of ways, although most solutions have three components in common. There must be hardware that is able to obtain information about the world and present appropriate data properly registered with that information. Software must exist to connect the hardware that senses the status of the system in the real world and the display hardware that achieves the actual composition. The user is a critical component in an AR system, as it is the user's perception of augmented reality that is important in an AR application. The research and results presented in this thesis only deal with one particular AR system, optical see-through head-mounted displays with real-time presentation of registered computer graphics. However, it is possible to apply these methods to other systems, though the components may differ.

2.1 Hardware

As was stated earlier, all AR systems lie on a continuum of degree of immersion. It is for this reason that the hardware used in such systems varies greatly. It also complicates the view of what actually is needed to constitute an AR system. The simplest view is that every AR system for augmentation of vision is made up of three hardware components. The first of these is the input device, often taking the form of a video camera or some other type of tracking device. The input from this device is handled by the system processor. Finally the output is presented in a display. The following AR systems demonstrate the vast differences in hardware.

One simple version of AR is projective AR. In this paradigm, a video projector is used to display information on some object within the user's field of view [35, 34]. The location of the projector is known, so it is possible to place graphics at a specific location in space. Systems often include a video camera which is capable of tracking the location of objects in space. A simple application could track the location of a piece of paper, and use the projector to place some image on the paper. If the user moves the paper, the image on it moves as well. The rendered graphics can be warped so as to appear mapped onto the surface, much as texture mapping is done in computer graphics applications. Projective AR systems have the advantage that the user is not independently instrumented in any way and does not require special display hardware. However, projective systems limit the space in which they can be used to specially prepared objects and surfaces.

An alternative approach presents information from a video camera to the user. This information may be seen on a computer monitor or displayed in an opaque head-mounted display. The user is effectively viewing the world through the eyes of a camera. This method, known as monitor-based AR when used on computer monitors or video see-through when an HMD is utilized, can use the data in the camera image to determine the pose and location of objects and render graphics to be added to the video image. This combined image is then presented to the user [22, 27, 18]. As this approach can be HMD-based, it can mobilize the display. As one moves around in a projective system, it is possible to see surfaces that would be obscured from the output of a stationary projector or set of projectors. Using an HMD this problem can be solved, and all locations the user can see can be augmented. An additional advantage of monitor-based and video see-through approaches is that content in the original image can be occluded by virtual content.
Projective systems are fundamentally additive in nature, adding the displayed content from the projectors to the illumination already present on the surface.

There are multiple methods for presenting the combined information to the user. Example methods for combining graphics include reflecting information off of a half-silvered mirror, actually combining real video and graphics before presenting them to the user, and writing directly on the retina of the eye with a laser.

Tracking technologies are used to provide an indication of the relative location of the user or camera in the real world. The tracking can be low resolution, such as the Global Positioning System [36]. This technique uses information from satellites to establish a latitude and longitude location on the earth. It has the advantage of being able to track over the entire earth, but can have errors of a few meters and yields no information about pose. GPS is also occluded by buildings or terrain. There are other options that would add pose information but have a smaller range of operation. This could be done with inertial tracking, magnetic tracking, or with a camera and computer vision techniques. All of these options have advantages and disadvantages that must be considered when developing an AR application.

Three main hardware components are present in the AR system considered in this thesis. They are a magnetic tracker, a system processor, and an optical see-through HMD. These are the same components that were used in the previously mentioned work regarding error measurement. It is for this reason that the total error for the system can be compared to the rendering error. These three components are connected in series and information flows from the tracker, through the processor, and on to the HMD. The tracker reports information about the pose and location of the user's head to the computer. The computer then feeds these values into the rendering process to produce graphics output. This output is sent to the HMD that the user is wearing. The tracking allows for these graphics to be presented in reference to some location in the real world.

Figure 2.1: Hardware components of an Optical See-Through AR System

2.1.1 Tracking System

The tracking of a user in the AR system utilized in this thesis is accomplished using an Ascension Flock of Birds six degree-of-freedom magnetic tracking system. This hardware uses a transmitter to produce a magnetic field. Readings are then taken from up to 4 sensors that are placed in this field. The sensors are accurate within a range of 1.2 meters of the transmitter. The sensors have 6 degrees-of-freedom (6DOF), meaning that they can detect both location and pose (orientation). The optimal distance between the transmitter and sensors is 30.5cm. At this range the location accuracy is 1.8mm and the pose accuracy is 0.5°. The resolution for location is 0.5mm and for pose 0.1°. These values are the stated capabilities. In a lab setting there can be a large amount of interference in the magnetic field. Factors such as the presence of metallic objects as well as electronic interference from other computer components can decrease the accuracy of the tracking system [47]. Two sensors are used in this specific AR system. One sensor is mounted to a stylus that is used to select points in space for calibration purposes and other interaction.
The other sensor is mounted to the HMD. The data from this sensor is fed into the graphics rendering system. It is then used to align graphics with the real world. Also critical to the system is fixing the position of the transmitter in the world. If the transmitter were to move, it would affect the values reported for all the sensors, thus skewing the results.

2.1.2 System Processor

The system processor has many different tasks in an AR system. It is the main connection between all the other pieces of hardware in the system. The system processor is responsible for providing the information from the tracker to the AR application. It runs the software that controls all parts of the AR system. The system processor also includes the hardware that produces the graphics for the HMD. The rendering in this system is done using OpenGL. OpenGL is a rendering system that produces a 2-dimensional image from 3-dimensional models of objects. The specifics of the system processor such as manufacturer or clock speed are not included because they do not influence the rendered image or light from the real world. A different system processor given the same tracking information will produce the same graphical output. It is possible that the speed at which it does this may change, but this is not a factor in the following computations. The only requirements are that the system be able to run OpenGL and connect to the tracking system.

2.1.3 Head Mounted Display

The combination of real world and computer generated images is achieved using a method called Optical See-Through. This involves the use of a partially transparent mirror which reflects output from a display into the user's eye. Light from the real world is able to pass through this mirror and enter the eye. These two light sources, the real world and the display, combine in the eye to form an augmented image for the user. The HMD used in this system is the Sony Glasstron. Since light passes through the display, it must be considered when modeling the optics of the system. The Sony Glasstron is designed to simulate a large display while being easily worn on the head. The display is seen as a 762mm by 571.5mm screen at a distance of 1200mm. This is achieved using a concave mirror. A curved mirror has a focal length that is one half the radius of its curvature. If an object is placed within the focal distance of such a mirror, the reflected rays will not converge. If someone were to view these divergent rays, they would be able to see the object. The image they see is called a virtual image, and its location and size are determined by the location of the original object relative to the focal point of the mirror [21]. The rays involved are shown in Figure 2.2. The center of the radius of curvature is labeled C and the focal point F.

Figure 2.2: Formation of a Virtual Image
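The location and magnification of this virtual image follow from the mirror equation, 1/do + 1/di = 1/f with f = R/2. The minimal sketch below shows the kind of calculation involved; the radius is taken loosely from Table 3.3 and the object distance is an assumed value for illustration, not a measured Glasstron dimension. A negative image distance indicates a virtual image behind the mirror.

```python
# Minimal sketch of virtual image formation by a concave mirror.
# The radius and object distance are assumed values for illustration only,
# not measured Sony Glasstron parameters.

def virtual_image(radius_mm: float, object_dist_mm: float):
    """Apply the mirror equation 1/do + 1/di = 1/f, where f = R/2."""
    f = radius_mm / 2.0
    di = 1.0 / (1.0 / f - 1.0 / object_dist_mm)   # negative when the object is inside f
    magnification = -di / object_dist_mm
    return di, magnification

if __name__ == "__main__":
    di, m = virtual_image(radius_mm=57.47, object_dist_mm=28.0)
    # An object just inside the focal length yields a large, upright virtual image
    # roughly a meter behind the mirror, which is the effect the Glasstron relies on.
    print(f"image distance = {di:.0f} mm, magnification = {m:.1f}x")
```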
2.2 Software

Each AR application has three different tasks it must accomplish. It must first take readings from the AR system to establish a calibration. This is a process whereby the locations of objects and system components in the real world are reconciled with the locations of virtual elements to be used as augmentations. Once this is done it must create a pipeline to the HMD using information from the tracker. This allows the application specific code to simply add virtual objects and the software will use the tracking information and present the graphics. The last task is different for each AR application, depending on the needs of the system. If the system is used to direct a user to a location, this task would include deciding when to instruct the user to turn.

Since many AR applications share similar hardware components, calibration requirements, and tracking information, a system was created that allows for rapid generation of shell applications. The shell application provides calibration capabilities and a pipeline to the HMD. This software is called ImageTclAR. It is a package created by Dr. Charles Owen and contributed to by researchers in the Michigan State University Metlab. A basic application was created using this software. This is the software used for this thesis.

2.2.1 Calibration

The tracking hardware provides location and pose information relative to some point in the center of the transmitter and some point in the receiver, but the location of these points is not always known with great accuracy. This data makes little sense in this format since the location of the transmitter may not correspond to the points that need to be tracked [9, 45]. The goal of tracker calibration is to convert this information into something more meaningful. To start applying meaning to the data from the tracker, a main reference frame must be established. All coordinates can then be defined within this frame. This frame of reference is called the world coordinate system.

The ideas of relative location, the world coordinate system, and converting location data can be clarified by the following example. Consider a painting hanging on a wall. Any point in the painting could be described as being some distance from the top and left side of the frame. The same point in the painting has some location on the wall as well. This location could be a distance of five feet from the right wall and six from the floor. The data (5, 6) does not completely describe the location if the frame of reference is not known, since this could be the distance from the ceiling and left wall. Therefore when describing the location of a point, the frame of reference the measurements are relative to is very important. The other idea that is demonstrated by the picture example is converting between coordinate systems. If the top left corner of the wall is defined to be the world coordinate system, the top left corner of the painting is (2, 1). Consider point A in the picture that is half a foot from the top of the frame and half a foot from the left. The location of this point is (2.5, 1.5) in reference to the world coordinate system. If the picture is moved to some new location on the wall, its location will have changed in reference to the wall but not in the painting. A conversion (transformation) can be created that will convert the location of A from the painting coordinates to the world coordinates. The conversion can be done by adding (.5, .5) to the location of the top left corner of the painting. The conversions used in calibrated rendering are more complicated, involving rotations or projections, but follow this same basic concept.

Figure 2.3: Coordinate System Transformations

In addition to the world coordinate system each component has its own coordinate system. The many different coordinate systems used are shown in Figure 2.4. The coordinate systems are the world, stylus receiver, stylus tip, HMD receiver, and both of the user's eyes.

Figure 2.4: Calibration of Coordinate System

Now a method is needed to convert between the coordinate systems. The conversion takes the form of a mathematical construct known as a transformation matrix. Using matrix multiplication, data in reference to one coordinate system may be easily converted to another system.
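As a concrete illustration of the painting example, the sketch below applies a homogeneous 2-D transformation matrix that carries painting coordinates into wall coordinates. This is only an illustration of the idea, assuming numpy; it is not code from the thesis software.

```python
import numpy as np

# Homogeneous 2-D transform for the painting example above: the painting's
# top-left corner sits at (2, 1) on the wall, so painting coordinates are
# converted to wall (world) coordinates by a pure translation.
painting_to_wall = np.array([
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0],
])

point_in_painting = np.array([0.5, 0.5, 1.0])   # point A, half a foot from the top and left
point_on_wall = painting_to_wall @ point_in_painting
print(point_on_wall[:2])   # -> [2.5 1.5], matching the example

# The calibration transforms used in the real system are 4x4 matrices that also
# encode rotation (and, for the display, projection), but they are applied in
# exactly the same way: world_point = T @ local_point.
```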
There are three main conversions that must be established in order for the AR system to function. The calibration of the system takes place in a few steps since some transformations need to be established before the next one can be created. The first step in calibration is establishing a means of selecting points in space. This is done using a stylus with a receiver mounted on it. (Not all AR systems utilize calibrated styli.) The data provided from the tracking system are the location and pose of the receiver, which are different from those of the point of the stylus. At least six readings are taken while keeping the point of the stylus in one location and moving the stylus into different poses. This data is then used to produce a transformation from the location of the receiver to the point of the stylus. This allows for the selection of points in space using the tip of the stylus, which are then used to create a transformation from the tracker coordinate system to one in the real world. The stylus is used to select at least six known points on a rigid object in space. These are used to produce a second transformation that will convert locations from the transmitter coordinate system to the world coordinate system. The last step is the calibration of the HMD. An example method for this calibration has been developed by Tuceryan and Navab and is called SPAAM [25, 42]. This involves the user repeatedly aligning a point displayed in the HMD with a point in the real world. This is used to establish the location and properties of a camera that represents a pinhole camera model of the user's eye in combination with the optical see-through display. This process allows for interaction with the real world using the stylus, the placing of virtual objects in reference to a chosen world coordinate system, and the rendering of this entire virtual scene registered with the real world in the HMD.
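SPAAM has its own specific formulation, described in [25, 42]. Purely as an illustration of the underlying idea, the sketch below uses a generic direct linear transform (DLT) to estimate a 3x4 pin-hole projection matrix from the 3-D points the user aligned and the corresponding 2-D display locations. It assumes numpy and omits normalization and outlier handling, so it is a sketch of the concept rather than a usable calibration routine.

```python
import numpy as np

def estimate_projection(world_pts, display_pts):
    """Estimate a 3x4 pin-hole projection matrix P with display ~ P @ [world, 1].

    world_pts:   (N, 3) array of 3-D alignment points (N >= 6)
    display_pts: (N, 2) array of the 2-D display locations the user aligned them with
    """
    rows = []
    for (X, Y, Z), (u, v) in zip(world_pts, display_pts):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    A = np.asarray(rows, dtype=float)
    # The projection matrix is the null vector of A (smallest singular value).
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)
```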
2.2.2 Graphics Rendering

All graphics that are to be aligned with or represent real world objects must be rendered from 3-D models. These graphics are rendered using OpenGL with calibration information from the SPAAM procedure. Objects are modeled using many polygons, which are made of a series of points, and added into a scene which OpenGL is to render. The information from the tracker mounted to the HMD dictates the location of the camera used for rendering. Thus, as a user moves their head around in space, the camera location is updated. Therefore, it appears to the user that the virtual objects are fixed in the real world since the graphics being drawn are always from the viewpoint of a camera that corresponds to their eyes. OpenGL uses a pin-hole camera model to render its graphics. Since this rendering is the critical source of error that is being examined in this thesis, the pin-hole model is discussed in section 4.2.5.

2.3 User

The final component of the AR system is the user. Human users introduce many variables in the processes of an AR system. Each user is different, having a differing interocular distance and placing the display at a slightly different location on the head. Some users have a great deal of experience with AR systems, some are complete novices. Some wear glasses. Some have eyes that have a different physiology than the norm of the population. Even if the same person were to calibrate and use the system at two different times, the location of the glasses can differ after being removed and remounted. All of these factors result in a different experience for each user, and possibly different experiences for the same user under different calibrations. This thesis assumes an average case and the results are computed from these assumptions. A more detailed computation is possible if measurements for each user are taken, but often this is difficult if not completely impractical. This is acceptable since the model can be changed to simulate some of the configurations of the system. This will provide more information about how changes to the system are reflected in the magnitude of error.

CHAPTER 3
MODEL OF THE DISPLAY-EYE SYSTEM

We are interested in determining exactly what performance is possible in an AR system utilizing an optical see-through head mounted display. In an optimum system, virtual augmentations would be precisely registered with sub-pixel accuracy with real elements. As an example, a wireframe of a box might be drawn over the physical box. In a perfect system, the wireframe would exactly follow the edges as seen from the viewpoint of the user of the system. The user would perceive the virtual lines as being in the exact same location as the real box edges. An open question has been whether this level of accuracy is even possible and, if so, what would be a mapping from points in the real world to equivalent points in the display. Clearly, if the eye does not move and all systems are static, such a mapping will exist with no error. But what error will occur if the eye moves, the display moves, or if an approximate camera model, such as the pin-hole camera model, is used?

It is not possible to measure error directly in an optical see-through AR system. The location where light from a point source strikes the retina of the eye is not available (barring intrusive methods such as retinal photographs or heretofore unknown brain scanning technologies). In order to obtain the exact mapping from world points to display points, a mathematical model of the eye is needed. The eye is not the only component of the AR system that is involved in the perception of visual information by the user. Objects from the real world are viewed through the HMD. Also the overlaid virtual images pass through some components of the HMD. Thus models of the HMD and real-world objects are needed as well.

3.1 Physiology of the Human Eye

The human eye is a complicated system involving four refractive surfaces as well as a curved surface on which images are formed. Light strikes the cornea and travels through the anterior chamber. It then must pass through the opening of the iris and on into the main lens of the eye. Finally the light will strike the retina at the rear of the eye. All of these structures can be seen in Figure 3.1. In order to view different objects the eye is able to change configuration slightly. To focus on objects at different distances, the eye will change the shape of the lens. This is accomplished by contracting the ciliary muscles. The other way the eye can change its properties is by rotating inside the eye socket. There are six extra-ocular muscles that are used to rotate the eye around a fixed point [7].

Figure 3.1: Horizontal Section of the Eye

In order to create a model of the eye it is necessary to know the shape of the different surfaces of the eye as well as the index of refraction of the materials that make up the eye.
The ideal situation would be if all the surfaces had shapes that followed a mathematical formula and the index of refraction were constant within each component. The eye differs from this ideal model in three ways. The surfaces of the eye are not exactly spherical. They tend to flatten out further away from the center; in particular, the cornea does this more dramatically than the other surfaces. Also, the lens of the eye is not directly in line with the center of the eye. It is usually offset and at a slight angle. Lastly, the lens does not have a consistent index of refraction. The index of refraction is greater toward the center of the lens [21, 31]. All of these properties of the eye make modeling difficult.

3.2 Values Used in Calculation

In order to make a model possible some approximations are necessary. The model used in this thesis uses an approximation created by Gullstrand [31]. In this model the lens has been given a constant index of refraction based on an average. Also the surfaces are estimated to be completely spherical. The following distance measurements are all relative to the front of the eye.

Figure 3.2: Surfaces and Media of the Eye (not to scale)

Surface   Relative Location   Radius
1         7.7mm               7.7mm
2         7.3mm               6.8mm
3         13.6mm              10mm
4         1.2mm               6mm
5         13.3mm              11mm

Table 3.1: Values for schematic eye

Cornea    1.376
Aqueous   1.336
Lens      1.41
Vitreous  1.336

Table 3.2: Indexes of Refraction

3.3 Components of Head Mounted Display

The display used in the particular AR system being modeled is a Sony Glasstron. It displays an image that appears to be approximately 1200mm from the user. This is accomplished with the use of a curved half-silvered mirror and a few other simple optical objects. The graphics are presented on a small LCD display in the top of the Glasstron.

Figure 3.3: Optical Components of the Sony Glasstron

The light from the display strikes a mirror positioned at 45°. This reflects the light into the curved mirror, which then reflects it back through the angled mirror and into the eye of the user. The display is simplified to create the model. This is done by changing the location of the LCD in the model. It is moved to be directly in front of the curved half-silvered mirror. Optically this is the same as the real display, but without reflecting light off the 45° mirror to direct it into the primary mirror. The HMD is placed on the user with a pad contacting the forehead. There is a bar connecting this contact point to the other display components. This allows for substantial movement of the optical components, both changing the location and pose.

Figure 3.4: Possible Movement of the Sony Glasstron

3.4 Measurements used for Model

The most complicated component of the Glasstron is the curved mirror located at the front of the glasses. The values currently being used are from direct physical measurements. The HMD is modeled with two spheres, two planes for the flat mirror, as well as a plane for the LCD screen. All the locations given are relative to the rear, the side facing the user, of the HMD. This allows for modeling movement of the HMD by moving this reference point. The standard configuration of the system has this reference point set at (20mm, 0mm, 0mm) in the world coordinate system.

Surface           Relative Location   Radius
Front of Lens     -27.26mm            57.47mm
Rear of Lens      -27.26mm            55.44mm
Front of Mirror   19.24mm             n.a.
Rear of Mirror    17.74mm             n.a.
LCD Screen        0mm                 n.a.

Table 3.3: Values for Head Mounted Display
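For the simulations that follow it is convenient to gather the surface parameters of Tables 3.1-3.3 into one structure that a ray tracer can walk. The sketch below does only that; the field names and the assignment of an index of refraction to the medium behind each eye surface are bookkeeping choices made for this sketch, and the numeric values are copied verbatim from the tables.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Surface:
    name: str
    location_mm: float            # "Relative Location" column, copied from Tables 3.1 / 3.3
    radius_mm: Optional[float]    # None for planar surfaces (flat mirror, LCD screen)
    n_behind: Optional[float]     # index of refraction of the medium behind the surface

# Schematic eye (Tables 3.1 and 3.2); locations are relative to the front of the eye.
EYE = [
    Surface("cornea, front", 7.7,  7.7,  1.376),   # cornea
    Surface("cornea, rear",  7.3,  6.8,  1.336),   # aqueous
    Surface("lens, front",   13.6, 10.0, 1.410),   # lens
    Surface("lens, rear",    1.2,  6.0,  1.336),   # vitreous
    Surface("retina",        13.3, 11.0, None),
]

# Sony Glasstron optics (Table 3.3); locations are relative to the rear of the HMD.
# No indices are recorded here because Table 3.3 does not list any for the HMD glass.
HMD = [
    Surface("curved mirror, front", -27.26, 57.47, None),
    Surface("curved mirror, rear",  -27.26, 55.44, None),
    Surface("flat mirror, front",    19.24, None,  None),
    Surface("flat mirror, rear",     17.74, None,  None),
    Surface("LCD screen",             0.0,  None,  None),
]
```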
3.5 Considerations

This model is based on average information for the population. There are other factors that can influence what happens to rays that enter the eye. A user's physiology will necessarily vary from the standard. Also, many people wear corrective lenses to allow for normal vision. Lastly, the eye has the capability to change shape to focus on objects at different depths. All of these factors are currently not part of the computational model that is being used [8, 46]. They can be included in the model in the future. The goal of this thesis is to determine what is possible assuming a given eye and display combination. The variances from different users will necessarily need to be accommodated in a calibration process.

3.6 Components as a System

All of these optical components are combined to form a system which is responsible for presenting information to the user. In order to simulate different usage situations, modifications can be applied to the system. These modifications can take the form of changing the location of the HMD or rotating the eye. All of this information is then used by the methods presented in Chapter 4. The combination of the two provides a means to answer questions about the entire system. The HMD and eye are no longer viewed as individual objects, but the combination as one image formation system.

CHAPTER 4
METHODS

The primary goal of this thesis is a derivation of a model for the eye and optical see-through head-mounted display as a system. Such a model will allow us to make several critical determinations. How close is the common pin-hole camera model to the actual model required by the system? How much error does the pin-hole camera model introduce? How much error is introduced when the eye moves if a static projection model is utilized? This section describes the mathematical methods that have been used to derive a model for the eye-display system, determine an optimum projection model, and determine the errors that will be introduced in common usage.

4.1 Overview of Methods

In order to answer questions about the system, all the components must be modeled mathematically. These models provide a means of determining the behavior of light as it travels into the eye. The light refracts at each surface in the system as it enters the eye. Next, an initial ray is created that will refract with the surfaces of the system, producing a location on the iris. This initial ray is adjusted using a search technique to find a ray that will pass through the center of the iris. This ray is the first step in creating the mappings that are the final result of the methods in this chapter. A mapping from the real world to the display is created by reversing this ray so it is traveling out of the eye. It is then reflected off the curved mirror of the HMD and finally intersected with the display, yielding a location on the display. To create a map from the display to the real world a different initial ray is needed. It is computed by adjusting the order in which the components are dealt with. Again the resulting ray is reversed, and traced out into the real world.

4.2 Mathematical Models

There are two steps necessary to determine what will happen to a ray of light as it travels into the eye through different optical objects. The first question is: Where will the given ray strike a surface? After this is known, a new direction is produced as the light travels from one medium to another. This new direction must be determined as well. In order to determine the behavior of light entering the eye a technique known as ray tracing is used [11, 16]. In this system, mathematical equations are used to represent objects and rays are constructed. These rays are then traced into the scene to see what they will intersect with. Rays are represented by vectors and surfaces in the system are represented with spheres and planes.

Rays in the system are modeled as vectors using a point and direction representation. A simple representation is:

P = E + tD    (4.1)

where P represents any point on the ray, E is some starting point for the ray, D is the direction, and t is some distance in that direction. Typically, D is a normalized vector and t represents unit steps in the direction of the vector. This equation can also be expanded out to:

x = x0 + t * xd
y = y0 + t * yd
z = z0 + t * zd    (4.2)
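A minimal sketch of this ray representation follows (numpy assumed); the intersection and refraction sketches in the following sections build on it.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Ray:
    """P = E + t*D (equation 4.1): start point E and normalized direction D."""
    E: np.ndarray   # starting point (x0, y0, z0)
    D: np.ndarray   # direction (xd, yd, zd), kept unit length

    def point_at(self, t: float) -> np.ndarray:
        # Component form of equation 4.2: x = x0 + t*xd, and likewise for y and z.
        return self.E + t * self.D

def make_ray(start, toward) -> Ray:
    d = np.asarray(toward, dtype=float) - np.asarray(start, dtype=float)
    return Ray(np.asarray(start, dtype=float), d / np.linalg.norm(d))
```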
4.2.1 Sphere Intersection

The eye model presented in section 3.2, as well as the main mirror of the HMD, uses a spherical shape for all the surfaces. This can be modeled mathematically using a simple equation of a sphere with center at (xc, yc, zc) and radius r.

(x - xc)² + (y - yc)² + (z - zc)² - r² = 0    (4.3)

To find the intersection of a ray with a sphere, simply substitute the values from equation 4.2 into equation 4.3. This yields a quadratic equation that can be solved using the quadratic formula. If the discriminant is less than zero, no solution is possible. This means the two objects do not intersect. If the discriminant is greater than zero, there will be two solutions, one being the location where the ray enters the sphere and one where it leaves. Depending on the surface being modeled the correct solution can be chosen. If the surface is convex the closer point should be chosen, otherwise the further one is used.

4.2.2 Plane Intersection

The other possible object a ray may intersect with is a plane. A plane is modeled using a point on the plane Q as well as a normal N. A point P is on the plane if N · (P - Q) = 0. Substituting equation 4.1 in for P above yields the following.

t = N · (Q - E) / (N · D)    (4.4)

If t ≥ 0 then the ray will intersect the plane at E + tD.
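The two intersection tests can be written directly from equations 4.2-4.4. The sketch below, using the Ray class sketched earlier, returns the intersection point or None when there is no hit; the convex flag selects the nearer or farther root as described above.

```python
import numpy as np

def intersect_sphere(ray, center, radius, convex=True):
    """Solve the quadratic obtained by substituting equation 4.2 into 4.3."""
    oc = ray.E - np.asarray(center, dtype=float)
    b = 2.0 * np.dot(ray.D, oc)
    c = np.dot(oc, oc) - radius * radius
    disc = b * b - 4.0 * c                      # a = 1 because D is unit length
    if disc < 0.0:
        return None                             # the ray misses the sphere
    t_near = (-b - np.sqrt(disc)) / 2.0
    t_far = (-b + np.sqrt(disc)) / 2.0
    t = t_near if convex else t_far             # nearer hit for convex surfaces
    return ray.point_at(t)

def intersect_plane(ray, q, n):
    """Equation 4.4: t = N . (Q - E) / (N . D); None for t < 0 or a parallel ray."""
    n = np.asarray(n, dtype=float)
    denom = np.dot(n, ray.D)
    if abs(denom) < 1e-12:
        return None
    t = np.dot(n, np.asarray(q, dtype=float) - ray.E) / denom
    return ray.point_at(t) if t >= 0.0 else None
```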
4.2.3 Direction Computation

Now that a location of intersection is known, the next step is to calculate a new direction vector based on the angle of incidence and the indices of refraction of the two materials involved. This is done using vector methods. In the following figures the surface is shown to be curved, but the same methods work for both curved and planar surfaces. The only difference is that a planar surface has the same normal at all intersection points, while a sphere's normal is different at each point on its surface. The direction of the ray entering the surface is known. In Figure 4.1 this is labeled as the Incident Ray.

Figure 4.1: Refraction

Also known, from the computation in section 4.2.1 or 4.2.2, is the location where the ray intersects a surface. If the surface is a sphere, a normal can then be created by forming a vector from the center of the sphere to the intersection point. If the surface is planar, the normal has already been given. This is labeled the Normal Ray. In order to determine the direction of the Refracted Ray, one must use Snell's Law [21]. Snell's Law applies to light traveling from one medium into another. It is stated as

n1 * sin(α) = n2 * sin(β)    (4.5)

where α is the incident angle, β the refracted angle, n1 the original index, and n2 the index of the second medium. These angles are measured in reference to the computed normal of the surface. The next step is calculating the incident angle to use in equation 4.5. The dot product can be used to determine the incident angle. The dot product, defined in equation 4.6, has the property that it is equal to the product of the magnitudes of the two vectors and the cosine of the angle between them, shown in equation 4.7 [20].

a · b = ax * bx + ay * by + az * bz    (4.6)

a · b = |a||b| cos(α)    (4.7)

Using the dot product, α from equation 4.5 can be computed. From here the value of β can be determined; β is the angle between the refracted ray and the surface normal. Even though the angle with respect to the normal is determined, the final refracted direction is still not known. This final direction must lie in the same plane as both the normal and the original direction. This is done by first constructing a right triangle using the incident ray and the normal. The dot product can be used again to construct the right triangle. In a right triangle the cosine is the ratio of the adjacent side and the hypotenuse. If the lengths of both the dir and n vectors are set to 1, the dot product is exactly the cosine of the angle. The length of dir from Figure 4.2 is set to 1, and n is scaled to the value of the dot product. These measurements produce a right triangle. The third side of the triangle in Figure 4.2 is anglevec and is computed as dir - n.

Figure 4.2: Incident Angle

A second right triangle is constructed using n and a scaled version of anglevec to create the final resultant vector with the correct value of β. The value of β in Figure 4.3 is known, and thus the length of anglevec is scaled to tan(β). This produces a right triangle with the correct angle measurement. Therefore the final direction is n + anglevec.

Figure 4.3: Resultant Angle

4.2.4 Subsequent Surfaces

The calculations in sections 4.2.1, 4.2.2, and 4.2.3 enable the computation of the result of a ray passing through one refractive surface. There are many surfaces involved in the Display Eye system. Depending on the desired behavior being modeled in the system, the different surfaces may be involved at different times. Therefore the computations have been presented in a way that allows for changes in order. The computation for a single surface takes a ray as input and produces a resultant ray as output. This new ray is the combination of the intersection point and the refracted direction. The output can be used as input for the next surface in the model. The different behaviors of the system and the necessary order of computation will be discussed later.
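The triangle construction above works entirely with angles and the auxiliary vector anglevec. The sketch below instead uses the equivalent vector form of Snell's law, which produces the same refracted direction and is convenient for chaining surfaces as in section 4.2.4. numpy is assumed, and the list of surface callbacks in trace_through is a placeholder interface chosen for this sketch rather than one defined in the thesis.

```python
import numpy as np

def refract(direction, normal, n1, n2):
    """Refract a unit direction at a surface with unit normal, per Snell's law (eq. 4.5).

    Returns the refracted unit direction, or None for total internal reflection.
    """
    d = np.asarray(direction, dtype=float)
    n = np.asarray(normal, dtype=float)
    if np.dot(d, n) > 0.0:          # make the normal face the incoming ray
        n = -n
    eta = n1 / n2
    cos_i = -np.dot(n, d)           # cosine of the incident angle (alpha in eq. 4.5)
    k = 1.0 - eta * eta * (1.0 - cos_i * cos_i)
    if k < 0.0:
        return None                 # total internal reflection; no refracted ray
    return eta * d + (eta * cos_i - np.sqrt(k)) * n

def trace_through(ray, surfaces):
    """Section 4.2.4 as code: feed each surface's output ray into the next surface.

    `surfaces` is a list of (intersect, normal_at, n1, n2) tuples supplied by the caller.
    """
    for intersect, normal_at, n1, n2 in surfaces:
        hit = intersect(ray)
        if hit is None:
            return None
        new_dir = refract(ray.D, normal_at(hit), n1, n2)
        if new_dir is None:
            return None
        ray = type(ray)(hit, new_dir)   # new ray: intersection point + refracted direction
    return ray
```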
The location of A is known to be (as, y) and A’ is (m’, f). These measures are directly related to as follows, 22—3; y f . Since the only unknown is :r’ this is simplified to 12’ = it. The same method works for generalization to 3 dimensions by simply repeating the process for the z direction. By this method a location on the image plane can be determined for any point in space [1]. The eye is significantly more complex than this simple model and has been an open question as to how well the pin hole camera model approximates the relationship between locations in space and locations in the projection window for the HMD. OpenGL must still do many more computations in order to produce a final image. The calculations discussed above assumes the pin hole of the camera is located at the origin and the image plane has the y axis as its normal. This is not the case in practice, since the camera may be anywhere in space, and the direction of the camera can change. The points that make up a scene must be translated according to the 31 Figure 4.5: 3 Dimensional Camera location and pose of the camera. This means means the pin hole of the camera is translated to the origin as well as orienting the normal on the y axis. Once this is done, the translated point may be rendered accordingly. 4.3 Modeling Behavior of the Entire System There are two major questions that need to be answered by the model of the Display Eye system. These two questions are the central theme of this thesis. First, given a location in 3D space, what 2D point in the display does it correspond to (the project operation)? Second, given a point in the 2D display, what 3D points in space does it correspond to (the unproject operation)? These questions obviously enjoy an inverse relationship. In order to answer these questions a more basic question must first be solved. What happens to a ray of light from a point in space as it passes through the components of the system? This question is easily answered by using the original ray and refracting it with the first surface of the system. This will yield a second ray to be refracted with the next surface. This can be repeated for all the components of the system in order, until finally the ray intersects with the retina of the eye. 32 4.3.1 Mapping Real World to Display The input to the real world to display mapping is a point in space. An initial ray is created and traced into the eye. The ray is created by directing a ray at the center of the iris from the initial point. The surfaces involved, in order, are: the front of the curved mirror, the rear, the front of the flat mirror, the rear, the front of the cornea, the rear of the cornea, and finally the iris. The method in Section 4.4 provides a way to adjust this ray so it passes through the center of the iris. This ray is then reversed and reflected off the curved mirror of the HMD. When the ray is intersected with the sphere modeling the mirrored surface, it forms an incident angle. The angle of incident is equal to the angle of reflection. Therefore it is possible to compute the angle of reflection, given the angle of incident. In Figure 4.6 the vector labeled inc is Figure 4.6: Reflection in a Curved Mirror the incident direction and is known. Using methods from Section 4.2.3, the anglevec is computed. Thus the reflected vector must be —1=I= (ref + 2 * anglevec). This forms a new ray when combined with the original point on the curved mirror. This ray is finally intersected with the plane of the LCD to produce a point on its surface. 
4.3 Modeling Behavior of the Entire System

There are two major questions that need to be answered by the model of the display-eye system, and they form the central theme of this thesis. First, given a location in 3D space, what 2D point in the display does it correspond to (the project operation)? Second, given a point in the 2D display, what 3D points in space does it correspond to (the unproject operation)? These questions are inverses of one another. In order to answer them, a more basic question must first be solved: what happens to a ray of light from a point in space as it passes through the components of the system? This question is answered by taking the original ray and refracting it with the first surface of the system. This yields a second ray to be refracted with the next surface. The process is repeated for all the components of the system in order, until finally the ray intersects the retina of the eye.

4.3.1 Mapping Real World to Display

The input to the real-world-to-display mapping is a point in space. An initial ray is created and traced into the eye. The ray is created by directing a ray from the initial point at the center of the iris. The surfaces involved, in order, are: the front of the curved mirror, the rear, the front of the flat mirror, the rear, the front of the cornea, the rear of the cornea, and finally the iris. The method in Section 4.4 provides a way to adjust this ray so it passes through the center of the iris. This ray is then reversed and reflected off the curved mirror of the HMD. When the ray is intersected with the sphere modeling the mirrored surface, it forms an incident angle. The angle of incidence is equal to the angle of reflection, so it is possible to compute the angle of reflection given the angle of incidence. In Figure 4.6 the vector labeled inc is the incident direction and is known. Using the methods from Section 4.2.3, anglevec is computed; the reflected vector is then −1 * (ref + 2 * anglevec). This direction, combined with the original point on the curved mirror, forms a new ray. This ray is finally intersected with the plane of the LCD to produce a point on its surface. This point is the result of mapping a real world point into the HMD.

[Figure 4.6: Reflection in a Curved Mirror]

4.3.2 Mapping Display to Real World

The model must also be able to generate 3D points in space given a 2D location in the HMD. This mapping is not as specific as the first. In mapping onto the LCD surface there was a final surface to intersect with; this is not the case when trying to find a location in space. Thus the result of this mapping is a vector directed away from the HMD, which can be intersected with any plane in space to yield a point. The method for creating this mapping follows the same steps as the previous mapping. The input is a point on the LCD plane. First a starting ray must be created for use as input to the selection process in Section 4.4. This ray is created by first defining a direction that begins at the center of the iris and ends at the input point on the virtual display. The initial ray starts at the input point and has the direction just defined. This ray is used because it will intersect the appropriate surfaces. The selection process needs a method of intersecting a ray with the iris, in the form of a list of interactions between the ray and the optical components. In this case the ray must first be reflected off the curved mirror of the HMD, using the same method as finding the reflected ray in the previous section. The reflected ray is then refracted with the two surfaces of the flat mirror and the first two surfaces of the eye, and is then intersected with the iris. Once the appropriate ray has been selected, it can be reversed and refracted with all the surfaces in reverse order. The final result is the output vector.

[Figure 4.7: Initial Ray Selection for HMD to Real World Mapping]
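Both mappings amount to pushing a ray through an ordered list of surfaces, with the output of one surface used as the input to the next. The sketch below illustrates that chaining; the Ray and Surface types and the helper names are placeholders rather than the thesis code.

    from typing import Callable, List, Optional, Tuple
    import numpy as np

    Ray = Tuple[np.ndarray, np.ndarray]        # (origin point, direction)
    Surface = Callable[[Ray], Optional[Ray]]   # returns the refracted/reflected ray, or None on a miss

    def trace(ray: Ray, surfaces: List[Surface]) -> Optional[Ray]:
        # Feed each surface's output ray into the next surface in order, as
        # described for the real-world-to-display and display-to-real-world
        # mappings above.
        for surface in surfaces:
            result = surface(ray)
            if result is None:                 # the ray missed a surface
                return None
            ray = result
        return ray

    # For the real-world-to-display mapping the list runs over the curved mirror
    # (front, rear), the flat mirror (front, rear), the cornea (front, rear), and
    # the iris; the display-to-real-world mapping uses the reversed ray and the
    # opposite order.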
4.4 Ray Selection

An object in space will theoretically reflect the light that hits it in an infinite number of directions, and any number of rays contribute to forming images in the eye. Therefore a single ray must be selected that is representative of a path into the eye. This is done by determining a ray that will pass through the center of the iris. This process is provided with a method of computing an intersection point on the iris plane given a ray, as well as an initial ray. The method of computing this intersection is such that a closed form solution is not feasible. Therefore no inverse can be computed that would yield a ray from the center of the iris intersecting a given point in space, and another method for finding a ray from a given point through the center of the iris is needed.

4.4.1 Searching

A search method is needed to find an appropriate ray. Typically search algorithms are designed to find an object in one dimension, for example searching for the largest value in a list or checking whether an element is in an ordered list. The search space involved in finding the desired ray is two dimensional, since a ray's direction may be changed horizontally and vertically. In order to find the desired ray a differential technique is used. It is referred to as differential since it uses a local rate of change to compute a new value for the next iteration. First an ε value is chosen as an acceptable amount of error. The process is started with the provided initial ray. It is not terribly important where this ray ends up, only that it strikes the optical surfaces involved. This ray is intersected with the components until it passes through the iris. In Figure 4.8 this point on the plane of the iris is labeled A. Then two new rays are formed, one by tilting the starting ray up slightly and another by tilting it to the side. These new rays are traced to the iris and their intersections are marked x' and y'. They will both result in some change in location from the original intersection point. A linear combination of the two changes is found such that, when added to A, the result is (0, 0), the center of the iris. The same linear combination of the tilts used to produce the new rays is then applied to the original ray. This produces an approximation of a ray that will pass through the center of the iris. The new ray is traced into the eye as before and its intersection is marked A'. The error is calculated as the distance of this intersection from the origin. The process is repeated with the new ray until the error is less than the given ε.

[Figure 4.8: Search Vectors]
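This is, in effect, a two-dimensional Newton-style iteration on the ray direction. The sketch below shows the loop under the assumption that a hit_iris(ray) routine traces a ray through the optical surfaces and returns its 2D intersection with the iris plane; hit_iris, tilt, and the step sizes are illustrative names and values, not the thesis implementation.

    import numpy as np

    def find_center_ray(initial_ray, hit_iris, eps=1e-6, delta=1e-4, max_iter=100):
        # Differential search for a ray whose iris intersection is (0, 0).
        ray = initial_ray
        for _ in range(max_iter):
            a = hit_iris(ray)                         # current intersection, labeled A above
            if np.linalg.norm(a) < eps:               # within the acceptable error
                return ray
            dx = hit_iris(tilt(ray, delta, 0.0)) - a  # change from a small sideways tilt
            dy = hit_iris(tilt(ray, 0.0, delta)) - a  # change from a small upward tilt
            jacobian = np.column_stack([dx, dy]) / delta
            step = np.linalg.solve(jacobian, -a)      # linear combination that should cancel A
            ray = tilt(ray, step[0], step[1])
        return ray

    def tilt(ray, dh, dv):
        # Nudge the ray direction horizontally (dh) and vertically (dv) with a
        # small-angle approximation; assumes the ray is not parallel to the z axis.
        origin, d = ray
        h = np.cross(d, np.array([0.0, 0.0, 1.0]))
        h /= np.linalg.norm(h)
        v = np.cross(h, d)
        v /= np.linalg.norm(v)
        new_d = d + dh * h + dv * v
        return (origin, new_d / np.linalg.norm(new_d))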
CHAPTER 5

RESULTS

The models and methods presented in Chapters 3 and 4 provide the necessary information and techniques such that, given a point in space, a ray can be constructed that will pass through the center of the iris after being refracted by the optical components in front of the iris. This capability by itself says nothing about error, so tests needed to be designed that use it to produce relevant data. These tests model real world situations that a user of an AR system would confront.

5.1 Graphical Rendering Error

The first research question is how much error is due purely to the simplified graphics rendering of the pin-hole model and the introduction of optical components in front of the eye. In order to answer this question an optimum situation is designed. This situation makes three assumptions in order to isolate the error to the display-eye system. The first is that the data from the tracking system is completely free of error; this allows us to specify the exact position of the eye and assume that perfect tracking would provide this information accurately. The second is the assumption of an optimal calibration. The calibration process estimates the location and focal length of a camera that represents the eye. This estimate is needed since highly precise measurements of the location and angle of the display with respect to the eye are not available. The third assumption is that the location and direction of the eye are predefined. The eye is set to be located directly in line with the center of the virtual display and a radius of the curved mirror of the HMD.

5.1.1 Experiment Layout

The first step in laying out the experiments is defining a world coordinate system. This is needed so that the relative measurements of the eye, the HMD, and real world points can be defined in ways that are meaningful with respect to one another. The chosen coordinate system places the origin at the center of the ocular globe, the main component of the eye that rotates inside the eye socket of the skull. This object is modeled by surface 5 from Table 3.2. The second component of the coordinate system is the orientation of the axes. The X-axis is placed pointing directly out of the eye and is normal to the other refractive surfaces of the eye. The optical components of the HMD are placed in such an orientation that the X-axis is normal to the curved mirror and passes through the center of the virtual display.

[Figure 5.1: Layout of Optical Components for Extra-Fovea Points]

The next step is determining the optimal camera calibration that will be used to compare points rendered by the system with their optimal locations. This calibration is created by defining a correct rendering in a small scale situation. In order to make the calculations easier, the image plane of the pin-hole camera is moved in front of the origin, which represents the pin hole. The image plane is represented by a plane orthogonal to the X axis. Rays are formed that connect points in the real world to the origin, and these rays intersect the image plane at some point. The display-eye model is used to generate these first points on the image plane. Then the properties of a pin-hole camera can be determined that would produce the same locations on the image plane. This camera can then be used on other points and the results compared to those from the display-eye model.

[Figure 5.2: Points used to create optimal configuration]

The smallest increment that the display is capable of representing is a pixel, therefore it is chosen as the small range from which the pin-hole camera is defined. The LCD in the display is 12.7mm wide and has a horizontal resolution of 800 pixels, so each pixel is 0.015875mm wide. Computations using the display-eye model show that at a range of 1200mm, the range of the display, a point at (0.6956, 0) in the real world maps to (0.015875, 0), one pixel width, in the image plane. Using the equation for the pin-hole camera, this leads to the following relation.

    0.6956 / 1200 = 0.015875 / f    (5.1)

Therefore the focal length to be used in the optimal camera is 27.386mm. The only other concern is an offset in the Y direction. The flat mirror of the HMD results in a shift of the rays passing through it: the point (0, 0) in the real world maps to (0, 0.01613) in the display. This can be accommodated by shifting the camera down 0.01613mm. These two measurements, along with the normal defined as the X axis, completely describe the pin-hole camera. Points in space can be rendered to the image plane using the following equations, where x is the distance of the point along the X axis.

    z_ip = (z * f) / x    (5.2)

    y_ip = (y * f) / x + 0.0124555    (5.3)
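As a quick check of these numbers, the focal length and the rendering equations can be evaluated directly. The snippet below only re-derives the values quoted above; the variable names are illustrative.

    # Pixel width and optimal focal length (values from the text).
    pixel_width = 12.7 / 800                  # 0.015875 mm per pixel
    f = pixel_width * 1200 / 0.6956           # from 0.6956/1200 = 0.015875/f

    def render(y, z, x=1200.0):
        # Equations 5.2 and 5.3 for a point a distance x along the X axis.
        return (z * f / x, y * f / x + 0.0124555)

    print(round(f, 3))                        # 27.386
    print(render(0.0, 0.6956))                # one pixel width: (0.015875, 0.0124555)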
This thesis compares two methods of determining an appropriate LCD pixel for a point in space. The first is optical analysis using the methods discussed in Chapter 4; this method produces the location on the LCD that would need to be illuminated in order to overlay a real world point. The second method is the pin-hole camera approximation with the parameters determined above. Both methods produce a location on the LCD for a point in space. Measurements of distances on the LCD are not meaningful in themselves: to say there is an error of 0.1mm in the display is not as meaningful as an error of 1cm in the real world. For this reason one needs to find the location in space at which a point in the display will be perceived. This is done using the method from Section 4.3.2: the point rendered using the pin-hole camera is mapped back out to a real world location. The final result is a pair of points, the original and the perceived location of the pin-hole camera rendering. When compared, the difference is the error due to the properties of the display-eye system. The results of applying both models to a range of points in space are shown in Figure 5.3 and Table 5.1. The locations are all at a distance of 1200mm in front of the eye, and the X and Y ranges were chosen by tracing the corners of the LCD out into the real world.

[Figure 5.3: Comparison of Real World Points and Perceived Rendering]

Max Error   2.292
Mean Error  0.396

X_rw    Y_rw    X_rendered    Y_rendered
-280    -210    -281.567      -211.673
-140    -210    -140.329      -210.697
   0    -210       0          -210.372
 140    -210     140.329      -210.697
 280    -210     281.567      -211.673
-280    -105    -281.061      -105.748
-140    -105    -140.078      -105.143
   0    -105       0          -104.942
 140    -105     140.078      -105.143
 280    -105     281.061      -105.748
-280       0    -280.897        -0.320978
-140       0    -139.997        -0.0790623
   0       0       0            -3.75409e-10
 140       0     139.997        -0.0790623
 280       0     280.897        -0.320978
-280     105    -281.068       105.018
-140     105    -140.086       104.905
   0     105       0           104.866
 140     105     140.086       104.905
 280     105     281.068       105.018
-280     210    -281.57        210.664
-140     210    -140.341       210.206
   0     210       0           210.051
 140     210     140.341       210.206
 280     210     281.57        210.664

Table 5.1: Values of Real World Point and Perceived Rendering

5.2 Isolated Eye Movement

The retina of the human eye has a small region in which the ability to sense light is much greater than in the rest of the retina. This region is known as the fovea and covers a range of approximately 5° about the center of the eye [7]. As a result, humans have a small region of clear vision and a loss of detail on the periphery of their vision. The fovea is a small portion of the retina but provides the brain with the most detailed information. Therefore, to sense the environment in detail, the eye is moved so that light from different portions of the world falls on the fovea. There are two ways this can be accomplished, either by moving the head or by moving the eye itself. Since the location and pose of the head are tracked, head movement is input to the AR system and adjustments are made accordingly. The movement of the eye is typically not tracked, and it has been a question of some debate how much effect moving the eye has on the calibration. If no eye tracking is used, the image presented to the user remains the same when the eye is moved. Thus it is possible that the movement of the eye in its socket produces some amount of error.

All of the optical components that are involved when the eye moves are modeled in the system. Therefore the situation examined in Section 5.1 can be generalized to allow the eye to move in its socket. The movements of the eye are rotations about a fixed point, the center of the sphere that models the retina [7]. When this movement takes place, the tracking system does not register any change, so the image presented to the user is the same as before the eye was moved. It is this fact that allows for the introduction of error. The magnitude of the error is determined by the methods laid out above. The only new technique needed is a method of rotating the eye. This is done using a linear transformation: a matrix is defined that accomplishes the desired rotation and is applied to the different components of the eye. For the spheres, this is done by multiplying the transformation with the center. The iris is modeled using a plane, so the transformation must be multiplied with both the point on the plane and the normal.
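A sketch of this transformation step is shown below. It assumes the rotation is about the world origin (the center of the ocular globe) and uses a rotation about the Y axis, as in the 13° case discussed next; the data layout of the eye model is illustrative.

    import numpy as np

    def rot_y(deg):
        # Rotation matrix about the Y axis.
        t = np.radians(deg)
        c, s = np.cos(t), np.sin(t)
        return np.array([[c, 0.0, s],
                         [0.0, 1.0, 0.0],
                         [-s, 0.0, c]])

    def rotate_eye(sphere_centers, iris_point, iris_normal, deg):
        # Apply the rotation to the eye model: sphere centers are multiplied by
        # the matrix, and the iris plane needs both its point and its normal
        # transformed.
        rot = rot_y(deg)
        centers = [rot @ np.asarray(c) for c in sphere_centers]
        return centers, rot @ np.asarray(iris_point), rot @ np.asarray(iris_normal)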
The configuration of the system changes slightly for this situation. The HMD has not moved, so the same pixels of the LCD are lit; the only difference is the location at which those pixels are perceived. Therefore the method from Section 4.3.2 can be used on these display locations, with the rotation of the eye components as the only change. The actual display area of the HMD occupies approximately 26° of the field of view, so a rotation of 13° from the center is the maximum angle that still allows light from the display to strike the fovea. This degree of rotation is also assumed to produce the largest amount of error, and it is therefore used to demonstrate the effect of rotating the eye. The results of rotating 13° about the Y axis are shown in Figure 5.4 and Table 5.2. Rotation about the Z axis (looking up with the eye) shows a similar magnitude of error.

Max Error   2.382
Mean Error  1.353

X_rw    Y_rw    X_rendered    Y_rendered
-280    -210    -280.224      -210.787
-140    -210    -138.835      -209.907
   0    -210       1.2211     -209.897
 140    -210     140.763      -209.933
 280    -210     280.588      -210.824
-280    -105    -279.379      -105.343
-140    -105    -138.189      -104.788
   0    -105       1.67094    -104.746
 140    -105     141.015      -104.805
 280    -105     280.643      -105.37
-280       0    -279.392        -0.311631
-140       0    -138.12         -0.073537
   0       0       1.82124       0.00211766
 140       0     141.245        -0.0780219
 280       0     280.952        -0.320293
-280     105    -279.39        104.637
-140     105    -138.199       104.565
   0     105       1.67057     104.676
 140     105     141.024       104.573
 280     105     280.652       104.646
-280     210    -280.232       209.806
-140     210    -138.849       209.434
   0     210       1.22022     209.585
 140     210     140.776       209.45
 280     210     280.594       209.825

Table 5.2: Values for Rotation of 13°

[Figure 5.4: Graph of Error Based on a Rotation of 13°]

5.3 Movement of the Head Mounted Display Relative to the Eye

When a user begins using the AR system, they must first calibrate it. Part of this process is determining the relative location of the eye to the HMD. Once set in calibration, this relationship no longer changes. It is, however, possible to move the HMD relative to the eye while using the system. This would be unnoticed by the tracking software and thus would not result in a change in the image presented to the user. Similar to rotating the eye, this could also be a source of error. Any error introduced by movement of the display relative to the eye is of great interest for the repeatability of calibrations: how accurately would a display need to be replaced on the head if a calibration is to be reused?

In order to model this situation, the components of the HMD must be moved. This is done by simply adding the offset to each location coordinate of the HMD components. The normal of the half-silvered mirror remains the same. The lit pixels of the HMD change location relative to the eye, so the offset must also be added to the pixel location before it is sent to the process discussed in Section 4.3.2. The result of this computation is a perceived location in space.
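A minimal sketch of this displacement step is given below. The coordinate layout and the sign convention for "closer to the eye" (taken here as negative X, since the X axis points out of the eye toward the display) are assumptions, and the names are illustrative.

    import numpy as np

    def shift_hmd(component_points, pixel_location, offset):
        # Displace the HMD: every component location and the lit pixel location
        # get the same offset; the half-silvered mirror's normal is unchanged.
        offset = np.asarray(offset)
        moved_components = [np.asarray(p) + offset for p in component_points]
        moved_pixel = np.asarray(pixel_location) + offset
        return moved_components, moved_pixel

    # The case reported in Table 5.3: the display moved 3 mm up in Y and 2 mm
    # closer to the eye, e.g. offset = (-2.0, 3.0, 0.0) under the assumed signs.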
The viewing window of the HMD is approximately 30mm wide, so a movement on the scale of 10mm would nearly move the eye out of this frame. Thus a change on a smaller scale is used to demonstrate the effects of moving the display. The values in Figure 5.5 and Table 5.3 are the result of moving the display up 3mm in the Y direction and 2mm closer to the eye. These measurements were chosen to simulate the movement of the display along the nose of the user.

[Figure 5.5: Graph for Moving the Display]

Max Error   11.03
Mean Error   2.092

X_rw    Y_rw    X_rendered    Y_rendered
-280    -210    -281.55       -209.157
-140    -210    -140.321      -210.741
   0    -210       0          -211.276
 140    -210     140.321      -210.741
 280    -210     281.55       -209.157
-280    -105    -281.025      -104.715
-140    -105    -140.059      -106.689
   0    -105       0          -107.355
 140    -105     140.059      -106.689
 280    -105     281.025      -104.715
-280       0    -280.842         0.169166
-140       0    -139.969        -2.17864
   0       0       0            -2.9699
 140       0     139.969        -2.17864
 280       0     280.842         0.169166
-280     105    -280.998       105.918
-140     105    -140.049       103.216
   0     105       0           102.306
 140     105     140.049       103.216
 280     105     280.998       105.918
-280     210    -281.487       212.926
-140     210    -140.297       209.891
   0     210       0           208.87
 140     210     140.297       209.891
 280     210     281.487       212.926

Table 5.3: Values for Moving the Display

CHAPTER 6

DISCUSSION AND CONCLUSIONS

The values presented in Chapter 5 deal with specific usage situations. This chapter discusses what these results mean in an application setting as well as their relation to observed characteristics of the HMD. It also presents some possible methods for reducing error in optical see-through display applications. Three situations are discussed. In order to validate the results of the system modeling, real world measurements were taken: the situations were reproduced in the laboratory using the HMD and a calibrated camera, and the information from this calibration is used to determine what pixel differences in the image are equivalent to in the real world.

6.1 Pin Cushion Effect

The first result is the pin cushion effect that the display has on the images it presents. Points near the edge of the LCD undergo a certain amount of distortion, which increases the further they are from the center. This causes straight lines in the display to appear curved when compared to the real world, which matches what is seen through the HMD. The image in Figure 6.1 was taken by a camera looking through the HMD. The corner points are connected with a line, and the distance from the middle point to this line is measured. The center top of the display is measured to be one pixel below this line, which works out to be 1.68mm at 1200mm. This closely corresponds to the values from the models: the data in Table 5.1 shows a difference of 0.613mm between the height of the top center and corner points. Since the measured difference is so small, a difference of one pixel would change the measurement significantly. The images taken through the HMD are somewhat blurry, so such an error is possible. A higher resolution camera could provide more detail, but it may not be practical to position the HMD in front of it.

[Figure 6.1: Real World Pin Cushion Example]

At the corners of the display, which show the greatest amount of error, the magnitude is 1.7mm at a range of 1200mm. In the experiments conducted by McGarrity [26], measurements were taken in a working space no larger than arm's length, or about 700mm. At this range the error scales to 0.99mm. The results from McGarrity showed minimum errors on the order of 10mm, so only a small improvement could be made by adjusting for this rendering error. It is clear that the majority of the error in the McGarrity study was due to the design of the calibration method, a human-computer interaction problem that needs a great deal of further study.
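The scaling used above is simply linear in range, since a fixed angular offset subtends a proportionally smaller displacement at a shorter distance. A one-line check, using the values quoted above:

    corner_error_mm = 1.7                                # modeled corner error at 1200 mm
    arms_length_mm = 700.0
    print(corner_error_mm * arms_length_mm / 1200.0)     # about 0.99 mm at arm's length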
The error grows the further a point is from the center of the display, and it is modeled here as radial distortion. It is possible to warp an image to compensate for radial distortion: if this warp were applied to the input of the HMD before it is sent, the radial distortion would be canceled out. This could be accomplished with a more complex rendering method [19, 37, 44]. A simple radial warp was designed to cancel the distortion. The distortion value is defined in equation 6.1, and the distorted locations were calculated by multiplying the original values by this distortion value.

    1 + κ * [√|x| + √|y|]    (6.1)

It was found that a value of κ = -0.000834 minimized the average error. For a sample of 2451 points, this distortion decreased the average error from 0.3961mm to 0.3327mm. The results for a limited number of points are shown in Table 6.1. These results are more impressive than the average error suggests, showing how the points at the edge of the display are corrected.

X_rw    Y_rw    X_distorted    Y_distorted
-280    -210    -280.446       -210.826
-140    -210    -139.861       -209.992
   0    -210       0           -209.985
 140    -210     139.861       -209.992
 280    -210     280.446       -210.826
-280    -105    -280.097       -105.382
-140    -105    -139.687       -104.848
   0    -105       0           -104.806
 140    -105     139.687       -104.848
 280    -105     280.097       -105.382
-280       0    -280.274         -0.318334
-140       0    -139.776         -0.0779452
   0       0       0              5.09483e-05
 140       0     139.776         -0.0779452
 280       0     280.274         -0.318334
-280     105    -280.107        104.662
-140     105    -139.697        104.615
   0     105       0            104.732
 140     105     139.697        104.615
 280     105     280.107        104.662
-280     210    -280.452        209.831
-140     210    -139.874        209.508
   0     210       0            209.668
 140     210     139.874        209.508
 280     210     280.452        209.831

Table 6.1: Values for Radial Distortion Correction
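A sketch of applying the warp follows. It assumes that x and y in equation 6.1 are the coordinates of the rendered location on the LCD, in millimetres measured from the display center; that interpretation is an assumption rather than something stated explicitly above, and the names are illustrative.

    import math

    def distortion_factor(x, y, k=-0.000834):
        # Equation 6.1: 1 + k * (sqrt(|x|) + sqrt(|y|)).  Here x and y are
        # assumed to be the rendered LCD coordinates in millimetres, measured
        # from the center of the display.
        return 1.0 + k * (math.sqrt(abs(x)) + math.sqrt(abs(y)))

    def warp(x, y, k=-0.000834):
        # The corrected location is the original scaled by the distortion factor.
        f = distortion_factor(x, y, k)
        return x * f, y * f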
6.2 Handling Eye Movement

The results in the previous section concern points that are a large distance from the center of the display, and they assume the user does not move their eye to examine these points. The more likely case is that the user will rotate their eye in order to bring this area onto the fovea. The most extreme case of this movement was modeled in Section 5.2. The eye was moved to focus on the point (280, 0). Before this movement the rendering would be perceived at (280.897, -0.321); afterward the rendering is perceived at (280.652, -0.320), and the error is reduced. The change in perception after the rotation is less than 0.5mm, a small measurement at a range of 1200mm. For this reason, the orientation of the eye does not contribute significantly to the error of the system. These calculations correspond to the measured changes from simulating rotation of the eye. The image is shown in Figure 6.2. The center mark of this image is at (292, 228), while the center mark in Figure 6.1 is (293, 230). The most significant result is that the center changed 2 pixels in the Y direction. This could be a result of moving the setup to simulate the situation.

[Figure 6.2: Simulated Rotation of the Eye]

6.3 Relative Movement of the HMD

One might expect that small movements of the HMD with respect to the eye would be a large contributor to error. Since the display is so close to the eye, it would seem that a small change would scale to a large change at a greater distance. However, the optics of the HMD were created to mimic a large display some distance away from the user. Therefore, when the display is moved small distances around the eye, the result is closely correlated with moving that large display the same small distance. This phenomenon is only exhibited over a small range. As the changes become larger, the error introduced becomes much greater. For example, a movement of 10mm up in Y increases the mean error to 25.89mm, while a movement of 3mm gives a mean error of 1.63mm. In the case of the movement in Section 5.3, the point (0, 0) was perceived at (0, -2.9699). This change, although in the opposite direction of the movement, is not as drastic as might have been expected: if the movement were scaled by the same factor by which the LCD is apparently scaled, the point would be perceived at (0, 180). Thus the display is shown to closely approximate a large display at a significant distance from the user.

These calculations correspond to the measured changes from moving the display. The image is shown in Figure 6.3. The center mark of this image is at (292, 230), while the center mark in Figure 6.1 is (292, 228). The change between the images is a downward shift of 2 pixels after moving the display. This difference of 2 pixels is equivalent to -3.36mm at 1200mm. The system modeling predicts the difference would have been -2.9699mm.

There is an additional way the HMD may move on a user. Changing the position of the HMD is seen not to have a drastic effect on what the user perceives. This is not the case when the pose of the HMD changes: a small change in the angle of the HMD with respect to the eye results in a drastic change in perception. This follows from the design of the display. Since it is designed to act like a large display at a distance from the user, changing the angle of the HMD is comparable to moving that display the same number of degrees along an arc at the virtual distance.

[Figure 6.3: Simulated Display Movement]

6.4 Conclusion

Prior to the work done in this thesis it was unknown how different usage situations of the AR system contributed to the overall errors that were measured. Every user would place the HMD on their head slightly differently. Some users would tightly secure the display and need to adjust it as they continued using it. When running experiments, users could be asked to interact in some way with an object; they controlled where in the display the object appeared by moving their head, and it was not recorded where they chose to situate the object while interacting with it. It was also noted that a pin cushion effect was visible, but it was never measured. All of these unknowns fueled the question of which aspects of the AR system were contributing to the overall error.

Three usage cases are modeled and discussed in this thesis, and only two of them are significant contributors to error. In the case of the eye rotating in its socket, the error is not significant: less than a millimeter, and it would be even smaller if radial distortion were included. The other two cases were larger contributors to the error of the system. The radial distortion of the display can now be measured and an inverse function found; if the inverse is added to the rendering process, this error can be greatly reduced. The other significant contributor is the relative movement of the HMD. This error is difficult to track in real time. Precautions can be taken to limit the movement of the HMD, but during prolonged use it remains a likely occurrence. An HMD designed to hold a fixed angle would reduce the considerable error introduced by changes in the pose of the display. An occasional recalibration done while using the system could correct for the error caused by movement of the display during usage [10, 12].
If these changes are implemented, the effort of reducing error can then be focused on other components of the system.

BIBLIOGRAPHY

[1] Edward Angel. Interactive Computer Graphics: A Top-Down Approach with OpenGL. Addison-Wesley, 2002.

[2] R. Azuma. Making direct manipulation work in virtual reality. SIGGRAPH Course Notes 30, August 1997.

[3] Ronald Azuma. Tracking requirements for augmented reality. Communications of the ACM, 36(7):50-51, 1993.

[4] Ronald Azuma. A survey of augmented reality. Presence: Teleoperators and Virtual Environments, 6(4):355-385, August 1997.

[5] Brian Barsky, Billy Chen, Alexander Berg, Maxence Moutet, Daniel Garcia, and Stanley Klein. Incorporating camera model, ocular model, and actual patient data for photo-realistic and vision-realistic rendering. Abstract in the Fifth International Conference on Mathematical Methods for Curves and Surfaces, 2000.

[6] Brian Barsky, Daniel Garcia, Stanley Klein, Woojin Yu, Billy Chen, and Sarang Dala. RAYS (render as you see): Vision-realistic rendering using Hartmann-Shack wavefront aberrations. Internal Report, March 2001.

[7] Hugh Davson. Physiology of the Eye. Little, Brown and Company, London, 1963.

[8] Daniel Garcia. CWhatUC: Software Tools for Predicting, Visualizing and Simulating Corneal Visual Acuity. PhD thesis, University of California, Berkeley, Berkeley, California, 2000.

[9] Y. Genc, F. Sauer, F. Wenzel, M. Tuceryan, and N. Navab. Optical see-through HMD calibration: A stereo method validated with a video see-through system. Siemens Corporate Research, 2000.

[10] Michael Gleicher and Andrew Witkin. Through-the-lens camera control. Computer Graphics, 26(2):331-340, 1992.

[11] Robert Goldstein and Roger Nagel. 3-D visual simulation. Simulation, 16(1):25-31, January 1971.

[12] Li-wei He, Michael F. Cohen, and David H. Salesin. The virtual cinematographer: A paradigm for automatic real-time camera control and directing. Computer Graphics, 30(Annual Conference Series):217-224, 1996.

[13] William Hendee and Peter Wells. The Perception of Visual Information. Springer, New York, 1997.

[14] R. Holloway. Registration error analysis for augmented reality, 1997.

[15] Richard Holloway. Registration Error in Augmented Reality Systems. PhD thesis, University of North Carolina, Chapel Hill, North Carolina, 1995.

[16] Douglas Kay. Transparency, refraction, and ray tracing for computer synthesized images. Master's thesis, Cornell University, Ithaca, New York, 1979.

[17] Rudolf Kingslake. Lens Design Fundamentals. Academic Press, New York, 1978.

[18] G. Klinker. Confluence of computer vision and interactive graphics for augmented reality. Presence: Teleoperators and Virtual Environments, 6(4):433-451, 1997.

[19] Craig Kolb, Don Mitchell, and Pat Hanrahan. A realistic camera model for computer graphics. Computer Graphics, 29(Annual Conference Series):317-324, 1995.

[20] Bernard Kolman. Elementary Linear Algebra. Macmillan Publishing Co., Inc., New York, 1977.

[21] Arthur Linksz. Optics Volume I: Physiology of the Eye. Grune and Stratton, New York, 1950.

[22] M. Bajura. Merging Real and Virtual Environments with Video See-Through Head-Mounted Displays. PhD thesis, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 1997.

[23] M. Bajura, H. Fuchs, and R. Ohbuchi. Merging virtual reality with the real world: Seeing ultrasound imagery within the patient. Computer Graphics, 26(2):203-210, 1992.

[24] C. Maurer and J. Fitzpatrick. A review of medical image registration, 1993.

[25] E. McGarrity and M. Tuceryan. A method for calibrating see-through head-mounted displays for augmented reality. IEEE International Workshop on Augmented Reality, October 1999.
[26] Erin McGarrity. Evaluation of calibration for optical see-through augmented reality systems. Master's thesis, Michigan State University, East Lansing, Michigan, 2001.

[27] James E. Melzer and Kirk Moffitt. Head Mounted Displays: Designing for the User. McGraw-Hill, New York, 1997.

[28] P. Milgram and F. Kishino. A taxonomy of mixed reality visual displays. IEICE Transactions on Information Systems, pages 1321-1329, 1994.

[29] P. Min and H. Jense. Interactive stereoscopy optimization for head-mounted displays, 1994.

[30] Gordon Moore. Cramming more components onto integrated circuits. Electronics, 38(8), April 1965.

[31] Kenneth Ogle. Optics: An Introduction for Ophthalmologists. Charles C Thomas, Springfield, Illinois, 1968.

[32] D. C. O'Shea. Elements of Modern Optical Design. John Wiley and Sons, 1985.

[33] C. B. Owen, J. Zhou, K. H. Tang, and F. Xiao. Augmented imagery for digital video applications. Handbook of Video Databases, 2003.

[34] R. Raskar, G. Welch, and W. Chen. Tabletop spatially augmented reality: Bringing physical models to life using projected imagery. Second International Workshop on Augmented Reality, October 1999.

[35] Ramesh Raskar and Kok-Lim Low. Interacting with spatially augmented reality, 2001.

[36] G. Reitmayr and D. Schmalstieg. Location based applications for mobile augmented reality.

[37] Jannick P. Rolland and Terry Hopkins. A method of computational correction for optical distortion in head-mounted displays. Technical Report TR93-045, 1993.

[38] Robert Shannon. The Art and Science of Optical Design. Cambridge University Press, Cambridge, 1997.

[39] Ivan Sutherland. The ultimate display. Proceedings of the IFIP Congress, 2:506-508, May 1965.

[40] Ivan Sutherland. A head-mounted three-dimensional display. AFIPS Conference Proceedings, 33:757-764, 1968.

[41] Kwok Hung Tang. Comparative effectiveness of augmented reality in object assembly. Master's thesis, Michigan State University, East Lansing, Michigan, 2001.

[42] M. Tuceryan and N. Navab. Single point active alignment method (SPAAM) of optical see-through HMD calibration for AR. Proceedings of the IEEE and ACM International Symposium on Augmented Reality, pages 148-158, 2000.

[43] A. Walther. The Ray and Wave Theory of Lenses. Cambridge University Press, New York, 1995.

[44] Benjamin A. Watson and Larry F. Hodges. Using texture maps to correct for optical distortion in head-mounted displays. In Proceedings of the IEEE Virtual Reality Annual International Symposium, number 95-04, 1995.

[45] Ross T. Whitaker, Chris Crampton, David E. Breen, Mihran Tuceryan, and Eric Rose. Object calibration for augmented reality. Computer Graphics Forum, 14(3):15-28, 1995.

[46] Woojin Yu. Simulation of vision through actual human optical system. Master's thesis, University of California, Berkeley, Berkeley, California, 2001.

[47] G. Zachmann. Distortion correction of magnetic fields for position tracking. In Proceedings of Computer Graphics International, Belgium, June 1997. IEEE Computer Society Press.