INDOOR ROBOT NAVIGATION USING A SYMBOLIC LANDMARK MAP

By

Stephen John Walsh

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Computer Science Department

1992

ABSTRACT

INDOOR ROBOT NAVIGATION USING A SYMBOLIC LANDMARK MAP

By Stephen John Walsh

This dissertation addresses the problem of a mobile robot autonomously estimating its location within an a priori map using sensed information. A 2D indoor robot world is sparsely represented as a collection of vertical landmarks encoded as an attributed edge graph. This map is matched to data obtained from a single image, with depth-to-feature information estimated by ultrasonic sensors.

The goal of this work is the rapid recovery of an estimated position within a previously mapped domain from an initial state of complete uncertainty. The structure provided by indoor environments permits geometric assumptions that enable rapid vertical landmark detection and classification. Pairs of vertical edges are extracted from monocular images and candidate landmarks are identified. Methods of fusing ultrasonic range estimates with vertical edge features extracted from a single image are developed. Experiments have demonstrated that the proposed methods classified vertical landmarks with 84 percent accuracy over a large library of domain data. Further experiments lead to the definition of a footprint for accurate ultrasonic sensing. Detailed composite error models for the ultrasonic and imaging systems are developed and used to examine the performance of the proposed sensing geometries. Analysis and experiments lead to the definition of a general navigation envelope that enhances sensing accuracy while only mildly restricting travel and locomotion.

An independent suite of test data combined with the developed composite error models is used to conduct a large simulation study of the performance of this methodology in real and synthetic domains. Several heuristic methods are proposed for recovering from sensing errors. Simulation results show that a deterministic error recovery approach outperforms the heuristic methods in measures of speed and accuracy. In one large domain, position recovery is obtained in less than 12 seconds in the absence of error and in less than 15 seconds with realistic modeled error; results show 98 percent correct pose decisions. Similar time and accuracy performances are observed in other domains. The time performance is limited not by the solution approach, but by the mechanical limitations of the robot base.
Travel speeds up to 30 feet per second could be supported.

To my Lord and Savior Jesus Christ, by Whose grace I do everything, however great or small

ACKNOWLEDGMENTS

I wish to thank my thesis advisor, Professor George Stockman, for his guidance during the execution of this research. His ideas were always helpful and often precipitated welcome progress. I particularly appreciate his ability to coach more performance out of me than I thought possible. His investment of many weekend and evening hours toward the progress of this work was greatly appreciated. Dr. Stockman serves as a positive model for me in many ways. I also acknowledge the selfless efforts of my thesis committee members, Professors Richard Dubes, Lal Tummala, John Forsyth and Chuck MacCluer, who made valuable suggestions toward improving this research. Additional thanks to Dr. Anil Jain for equipping and maintaining the Pattern Recognition and Image Processing (PRIP) laboratory with advanced research tools, including the hardware necessary for constructing Rome, the mobile robot used to investigate this work. I also want to thank Dr. Abdol Esfahanian for his assistance in pursuing this research. Thanks also go to Dr. Carl Page for inviting me to return and pursue my doctoral work, and to Dr. Harry Hedges of the National Science Foundation and Dr. Mo Jamshidi of the University of New Mexico for their encouragement. I also wish to thank Dr. Stockman's secretary, Lora Mae Higbee, for her patience and maternal guidance.

I am also grateful for the friendship and technical dialogue shared with many former and current PRIPies. I'm especially thankful to Dr. Greg Lee for his patient advice on so many topics, including the details of defense and dissertation preparation, as well as for being a true friend. I also thank Jon Courtney, Tim Newman, Hans Dulimarta, Qian Huang, Jinlong Chen, Dr. Deb Trytten, Sushil Bhattacharjee, Sharathcha Pankanti, Marie-Pierre Dubuisson, Philippe Ohanian, and Sally Howden. Their support, both personal and professional, has enabled and motivated this research to bear fruit. The friendship and technical assistance of Jon Engelsma and Christian Trefftz have also been instrumental to the completion of this dissertation. It is difficult to imagine this work being enjoyable in the absence of these fine individuals. We shared many cheers and hardships together; thank you my friends. Thanks also to my colleagues Reid Baldwin, Edgar Kalns, and Barb Czerny for their friendship and support over the years. My sincere appreciation also goes to the PRIP manager, John Lees, for establishing and maintaining a stable research environment. Roxanne Peacock and Brian Wright were instrumental in helping provide the tiny, but essential, hardware bits that held Rome together.

My professional family at the U.S. Air Force Academy has also been pivotal to the progress of this work. Thanks to Colonel Dan Litwhiler for supporting my fellowship here at Michigan State, and thanks also to Lt. Colonel Steve Gordon for both his encouragement and his guidance. I also thank Lt. Colonels Bill Kiele, Bill Skeith, and Major Jerry Johnson for all their help, support, and understanding along the way. Thank you also to Major Laurie Wells and Captains Bob Lawconnell, Don Duckro and Jay Jacobson, who covered a few of my classes so that I could conclude portions of this work. And thanks to Captains Jason Moore, Chris Warack and Joe Hewitt, who helped me become familiar with our local computing environment.
Finally, I must thank the pillars of my life. Thanks to my wife, Lori, and our children, John, Niki, and Eric, who have supported and loved me through many challenging times. I believe they have worked harder and suffered more than I did during times of stress and strain. Thanks also to my parents, Dr. Bill and Dorothy Walsh, and my siblings Bill and Martha, who together have shared my academic peaks and valleys. I couldn't have begun, nor completed, this work without my family's unfailing love and support.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
1 Introduction to the Problem
1.1 Problem Statement
1.2 Robotic Navigation
1.3 Organization of this Dissertation
2 Literature Background and Motivation
2.1 Sensor Overview
2.2 Navigation Survey
2.2.1 Indoor Applications
2.2.2 Outdoor Applications
2.3 Pose Estimation
2.4 Sparta and Rome
2.5 Summary
3 The Where-am-I Problem
3.1 Navigating with a Map
3.2 Representation
3.2.1 Geometric Paradigm
3.2.2 Computational Geometry
3.2.3 Occupancy Grid
3.3 A New Approach - StickRep
3.4 Summary
4 Sensing Landmarks
4.1 Visual Sensing of Vertical Ribbons
4.1.1 Feature Extraction
4.1.2 Landmark Classification
4.2 Theory of Ultrasonic Sensing
4.3 Ultrasound Studies
4.4 Summary
5 Sensing Geometries
5.1 Motivation
5.2 Two Side Geometry
5.3 Short Side Geometry
5.4 Error Analysis
5.4.1 Error Analysis of Geometric Approximation
5.4.2 Composite Range Error Model
5.4.3 Linear First Order Error Model
5.5 Monte Carlo Experiments
5.6 Summary
6 Inexact Graph Matching
6.1 Representation
6.2 Ambiguity
6.3 Simulations
6.4 Error
6.4.1 No Error
6.4.2 Empirical Based Error
6.5 Metric Relaxation Matching
6.6 Heuristic Error Recovery Experiments
6.6.1 First Frame Error
6.6.2 False Door Error
6.6.3 Mis-sized Door Error
6.6.4 Full Recovery Suite
6.7 Deterministic Error Recovery
6.8 Examining Additional Domains
6.9 Summary
7 Summary, Conclusions and Recommendations for Future Research
7.1 Summary and Conclusions
7.2 Contributions
7.3 Recommendations for Future Research
A Ultrasound Histograms
B Partial Derivatives from Sensing Geometries
BIBLIOGRAPHY

LIST OF TABLES

2.1 Logic for label combination
4.1 Landmark classification boundaries showing ribbon width in inches
4.2 Experiment 1: Ultrasonic percent range error as a function of surface type and angle
4.3 Experiment 2: Ultrasonic percent range error as a function of surface type and angle
5.1 Two Side Geometry acoustic angles
5.2 Composite ranging error summary
5.3 Worst case landmark error using linear first order estimate
5.4 Accuracy resulting from reducing Δθ to 3.75° using Short Side Geometry
5.5 Short Side θ tolerance for error within two inches
5.6 Probability of Short Side Geometry measuring within 2 inches
6.1 Domain ambiguity: Number of edge pairs with adjacency angle of 90°
6.2 Domain ambiguity: Number of edge pairs with adjacency angle of 180°
6.3 Domain ambiguity: Number of edge pairs with adjacency angle of 270°
6.4 Domain ambiguity: Three Edges: (door)(wall)(door), adjacency angles of 180°
6.5 Domain ambiguity: Three Edges: (wall)(door)(wall), adjacency angles of 180°
6.6 Domain ambiguity: Four Edges: (door)(wall)(door)(wall), adjacency angles of 180°
6.7 Domain ambiguity: Four Edges: (wall)(door)(wall)(door), adjacency angles of 90°
6.8 Domain ambiguity: Door type census
6.9 Domain ambiguity: Wall type census
6.10 Empirical Based Error Model
6.11 Performance without heuristics
6.12 Performance with First Frame Heuristic
6.13 Performance with False Door Heuristic
6.14 Performance with Mis-sized Heuristic
6.15 Performance with Full Heuristic Suite
6.16 Performance of deterministic recovery in Engineering Building
6.17 Overall performance summary
6.18 Performance of deterministic recovery in Wells Hall
6.19 Performance of deterministic recovery in regular polygon
6.20 Performance of deterministic recovery along synthetic storefront
6.21 Performance of deterministic recovery in a modified Wells Hall
6.22 Comparison of domain performance

LIST OF FIGURES

1.1 Sample indoor navigation domain: How should this be represented?
1.2 Door feature
1.3 Ultrasonic sensing of distance to hallway walls
2.1 Plot of representative cell patterns. Heavily shaded cells denote occupied cells, cross-hatched cells have conflict label assignments
2.2 Corrected plot of representative cell patterns from the previous figure
2.3 Shortest path in 2D polygonal space
2.4 Viewing geometry for P4P problem
2.5 Algorithm steps to recover global pose and focal length
2.6 Sparta
2.7 Rome
3.1 StickRep example
4.1 Vertical Edge Ribbon Extraction Algorithm
4.2 Image processing suite example.
(a) Original image (b) Vertical edge image (c) Thresholded edge image (d) Smeared vertical projection with darkened feature footprint
5.1 Field of View geometry
5.2 Rome's sensing geometry
5.3 Linear first order error model
5.4 Short Side Geometry, θ = 90°
5.5 Short Side Geometry, θ = 70°
5.6 Short Side Geometry, θ = 50°
5.7 Two Side Geometry, θ = 90°
5.8 Two Side Geometry, θ = 70°
5.9 Two Side Geometry, θ = 50°
6.1 Starting location ambiguity of MSU Engineering Building
6.2 Simulation Algorithm
6.3 Unique pose without error
6.4 Metric Relaxation Heuristic
6.5 Multi-Sequence with error
A.1 Concrete surface, 45°
A.2 Concrete surface, 30°
A.3 Concrete surface, 15°
A.4 Concrete surface, 0°
A.5 Wood surface, 45°
A.6 Wood surface, 30°
A.7 Wood surface, 15°
A.8 Wood surface, 0°
A.9 Cloth surface, 30°
A.10 Cloth surface, 15°
A.11 Cloth surface, 0°
B.1-B.23 Partial derivatives from the sensing geometries, plotted for φ = 0°, 20°, and 40°

CHAPTER 1

Introduction to the Problem

Robotics is a rapidly expanding field with a rich research landscape. Multisensor mobile robots are navigating through academic and industrial buildings, and ambitious outdoor efforts are presently under study. A central requirement for the development of capable and flexible robotic systems is the ability to build robust spatial descriptions of their environment from sensory information, and to employ this information in planning and solving application tasks. These capabilities enable the robot to interact lucidly with its domain, both by interpreting its sensory data to draw appropriate conclusions for near term decisions, and by diagnosing and maintaining an adequate domain model for strategic planning and decisions.
Systems with little or no sensing capabilities are limited to fixed sequence operations in highly controlled work areas, and cannot perform activities with any significant degree of autonomy or adaptability. To successfully achieve these capabilities in robotic systems, several research issues must be addressed. These include fusion of information from multiple and dissimilar sensors, problems in sensor interpretation, mapping and representing the navigator's domain, handling sensor uncertainty and errors, landmark identification, position estimation, and registering a priori maps with sensed information. In this dissertation we touch upon all of these issues to varying degrees in our approach to solving a particularly challenging robot navigation task: pose recovery. Pose recovery is the estimation of the navigator's position after initial, and complete, disorientation within its operating domain. A long term goal of the research presented in this dissertation is the development of minimal, yet robust, mapping and navigation tools for mobile robots operating in and exploring structured domains. In the following section, we will present an overview of the problem studied in this dissertation. A brief discussion of robot navigation and an outline of the organization of this dissertation close the chapter.

1.1 Problem Statement

To widen the application range and deployment of both industrial and research robots, a key requirement is the development of systems with greater autonomy, able to sense, plan and operate in structured environments with minimal prior knowledge. To achieve this level of independence, the robot must not only be able to accurately sense its domain, but must be able to accurately sense its pose in its domain. Pose estimation is a base problem for all navigation tasks. Many different paradigms exist for different applications, but all use some sensory information to overcome dead reckoning error. To be robust in any environment, a robot navigator must know where it is relative to major objects in its domain, to some degree of accuracy. This task is known as the Where-am-I Problem.

The Where-am-I Problem: The autonomous recovery of an estimated position within a previously known, or mapped, operating domain, from an initial state of complete translational and rotational uncertainty.

Solving the Where-am-I Problem is more difficult than maintaining an estimated pose during navigation from an initial known position. However, a successful Where-am-I solution will also adequately address pose maintenance. The ability to autonomously and accurately sense and represent an operating environment is pivotal for successfully addressing this problem. In this dissertation we present our solution to the Where-am-I Problem for a class of indoor domains.

1.2 Robotic Navigation

A robotic navigator must have some mechanism for representing its surrounding environment. If not, it cannot truly navigate, but merely avoid obstacles, and its application will be specialized and limited. Domain representations, or maps, do not have to be provided as an a priori database, but may be constructed from sensory information obtained by the robot. What domain features are important for navigation, and how can they be encoded into a representation that enables self-location in an operating domain such as the building in Figure 1.1?
Building a mobile robot application solely upon an a priori representation is considered fragile even within structured, indoor domains, because it fails to accommodate the inherent environmental dynamics associated with people. At a minimum, applications employing a static domain model must still sense to avoid unpredictable obstacles such as pedestrians or temporary objects. But static domain approaches fail to update these dynamic events into their representational schemes, and often fail in forgiving, but dynamic, domains such as a business office [56, 156]. We will discuss static and dynamic domain modeling in detail in the next chapter.

How much domain knowledge is necessary for navigation? This question is valid whether employing a static or dynamic representation, or whether the application constructs its own map or employs an a priori map. How can a priori domain features such as the door in Figure 1.2 be extracted and represented? What sensors are best suited for domain feature extraction? Can inexpensive sensors such as the ultrasonic ranging system shown in Figure 1.3 reliably extract feature information?

Figure 1.1. Sample indoor navigation domain: How should this be represented?

We were motivated by a sense of economy and desired to explore a minimalist approach. We did not wish our solution to revolve around a highly accurate metric map, because we did not feel this was necessary, and we desired real time performance using modest hardware and sensors. Human navigation operates at a higher level, and is more sensitive to feature landmarks than fine metric data [92]. Landmark based navigation will be more closely examined in Section 2.3.

Robust sensing and dynamic representations are certainly desirable system attributes. Many researchers would even argue that these attributes are indeed requirements. Feature extraction, landmark classification, and error handling are often employed in support of these goals. But these desirable sensing and matching capabilities may sometimes oppose the rapid, accurate pose recovery that is our research goal. These are some of the research issues that frame the work we present in this dissertation. Many of the issues that arise are not dealt with exhaustively, and much room for further research and exploration remains.

Figure 1.2. Door feature

Figure 1.3. Ultrasonic sensing of distance to hallway walls

1.3 Organization of this Dissertation

The remainder of this dissertation is organized as follows. Chapter 2 examines the attributes and liabilities of commonly employed robotic sensors. A thorough discussion of the performance and approach of several research efforts from the mobile robotics literature, including both indoor and outdoor applications, will be presented. We will also survey competing methodologies for pose estimation, and examine the local efforts that preceded our work. In Chapter 3, we discuss the Where-am-I Problem and the various methodologies that address it. An examination of several approaches to domain representation is presented, as well as StickRep, our chosen representation.
Strengths and weaknesses are compared with respect to our desired solution capabilities. An examination of landmark sensing is presented in Chapter 4. We will discuss the development of our vertical edge ribbon detector, including the performance of our feature extraction methods and the landmark classification performed. The theory of ultrasonic sensing and an exhaustive ultrasonic characteristics experiment are presented. The need for an accurate ultrasonic sensing footprint is motivated from these results. In Chapter 5, we present two sensing geometries, Two Side and Short Side, that perform as accurate sensing frameworks. We will see that these geometries offer real time capability that extends beyond our present application to more general situations. Extensive error analysis, Monte Carlo experiments and a large simulation study are presented to define several composite error models for our data acquisition. These results suggest that our Short Side Geometry is best suited for our solution approach. Domain ambiguity and the representational challenges that it presents are discussed in Chapter 6. Large scale simulations are presented that illustrate the challenge of the Where-am-I Problem even without the presence of sensing error. Further heuristic error recovery experiments are compared using the developed error models, and our metric relaxation technique is discussed. Many of these heuristic approaches offer adequate pose recovery, but our deterministic error recovery experiment demonstrates the strongest solution performance in the chosen domain. Chapter 7 concludes the dissertation by summarizing the contributions of our work and offering suggestions for future exploration.

CHAPTER 2

Literature Background and Motivation

To open up the scope of applications for both industrial and research mobile robots, greater autonomy is needed. Unstructured environments present a difficult challenge, and most application designers have responded by providing detailed a priori information to the mobile robot. Digital contour maps for outdoor applications [39], encoded infrared beacons [110], and bar-coded floor stripes for indoor robots [61] are examples of providing reference information to enable the robot to determine its location in the world. To gain greater independence, the robot must acquire an understanding of what's around it, by capturing and developing an adequate environmental model. Robots with restricted sensing abilities are limited to highly structured, sequential operations. A variety of complementary sensors and a mechanism for extracting salient information have been widely successful in enabling autonomous operations in structured environments.

Determining the presence and location of objects is a primary task in robotic navigation. Most of this sensory information will have to be compiled from multiple sensors, and a world model then constructed. This model can then serve as a basis for path planning, obstacle avoidance, landmark identification, position estimation and other essential operations. Constructing such a model starts with the complex task of determining range information from sensing the world. Stereo vision is the most popular passive sensor with mobile robots, and ultrasonic and infrared sensors are the most broadly used active sensors.
We will briefly examine the merits of the sensors most common in robotic applications, and then survey some of the latest research and industrial applications whose results could most readily contribute to our local efforts. We will primarily focus on vision-based systems, although most applications combine stereo vision with active sensors.

2.1 Sensor Overview

Ultrasound technology has been with us for some time, but few previous applications built detailed maps. Ultrasonic sensors have now become widely used in mobile robot applications. Their main attraction is their simplicity, low cost, and the fact that distance measurements can be determined directly. A typical sensor has a useful measuring range of 1 to 35 feet, with a beam width of 15 degrees, so an array, or belt, of sensors is often employed [52]. Early robotic applications include simple distance measurements [63], localizing object surfaces [24], and determining the world position of a robot [29]. In a later effort, ultrasonic sensors were used to determine a robot's position by comparing obtained distance measurements against an accurate map. This approach didn't account for the range errors that are typical in ultrasonic data [111, 112]. A subsequent method also used an onboard map, but attempted to handle noisy range data [45]. A more sophisticated effort pursued ultrasonic mapping and navigation by using a narrow beam to build a line-based description of the robot's world [36, 37]. The ultrasound readings were interpreted by fitting line segments to the detected points, and then comparing these to an a priori map. Noisy data once again plagued the approach. A recent effort explicitly represents this inherent uncertainty [38]. It dynamically maintains a description of free space limits, and matches observations to a world model. A major effort overcame these earlier troubles by building its own map through a probabilistic fusion of successive readings [50]. No a priori information was used. A controlling microprocessor selected and fired the sensors, timed the returns and provided the corresponding range value. Other similar systems are now maturing [91].

Stereo vision systems have traditionally been used to extract depth information from two images [114]. The major difficulty of employing this technique in real world environments is the intrinsic computational expense of extracting 3-dimensional (3D) information from stereo-image pairs, which limits the number of points that may be tracked [2, 16, 17, 43, 113, 133]. Real time constraints have previously precluded dense 3D descriptions, but later approaches have obtained promising results [4, 23, 120]. Traditional stereo systems have relied on high-contrast edges or points that could be easily tracked through several images [100, 115]. Most practical real world vision navigation systems only build sparse depth maps as a result of this constraint [107, 114, 145]. They select points to be matched and tracked using an interest operator, and handle 30 to 50 points, generating a 3D map in 30-50 seconds on a VAX 11/780. It's predicted that 25 percent of all robots sold this year will incorporate some type of vision system [10].

Infrared (IR) sensors are another active sensor that can directly determine proximity information. They usually complement other more capable sensors, but one approach proposes encoded IR beacons to estimate position [110]. Signaling specific pose information is not limited to the IR medium.
Radio transponders costing just $1 have now become available that may broadcast a specific domain location to a navigator. Such a manufactured landmark can be used to provide pose information in difficult, ambiguous domains. More sophisticated transponders possess the memory to record the messages of passing navigators in an environment where multiple robots are operating. This emerging technology represents a low cost solution to our problem in domains where their use is practical, and could be particularly attractive to industry.

Laser range finders are an expensive answer to obtaining direct range information. The maximum detectable phase shift limits typical range to 64 feet. A field of view of 30 vertical degrees by 80 horizontal degrees has been achieved [39]. This sensor class is maturing rapidly, with many new sensors entering the market. Models with finer resolution for indoor work are now available [10].

2.2 Navigation Survey

We will now examine in some depth the more interesting efforts in applied dynamic path planning over the last couple of years. These summaries are divided into indoor and outdoor environments. The structure provided by an indoor robot world greatly simplifies the navigational challenges. Path planning and obstacle avoidance often become a 2-dimensional (2D) problem. The additional tasks imposed by an outdoor, 3D robot world have been addressed by more complex navigational paradigms, but have limited applicability to our present efforts at Michigan State.

2.2.1 Indoor Applications

Elfes's Dolphin system is a major effort, perhaps the first to construct dense ultrasonic maps without any a priori maps or feature data [50, 52]. He felt that the matching errors typical in stereo vision applications were too serious a challenge to build a base robotic navigation system upon. He was concerned that ultrasound and laser range finders were coarse and insensitive to surface characteristics. The ultrasonic sensors that were available for mapping unstructured environments are quite noisy, so he built a probabilistic occupancy system to facilitate the combination of multiple sensor readings on the same location. He designed a grid system that assigned a probability of occupied or empty to each square. The grids could be as fine as 0.1 foot or as coarse as 1 foot square, but they were uniform throughout the robot world. To derive a probability for each grid space given a particular ultrasonic reading, Elfes defined:

P      the grid cell of interest
R      the range measurement returned by the ultrasonic sensor
R_min  the smallest range value accepted from the ultrasonic sensor
ε      the mean ultrasonic error
ω      the beam aperture
S      the position of the ultrasonic sensor
δ      the distance from S to P
θ      the angle between the acoustic axis and the segment SP

The area illuminated by the ultrasonic beam is modeled as two regions with different probability formulas for each:

Empty region: points where δ < R − ε and θ ≤ ω/2
Occupied region: points where |R − δ| < ε and θ ≤ ω/2

One probability is computed for angle and one for range. Whether the point of interest is in the occupied or empty region, the angle probability is

P_a = 1 − (2θ/ω)²

In the occupied region the range probability is

P_r = 1 − ((R − δ)/ε)²

In the empty region the range probability is

P_r = 1 − ((δ − R_min)/(R − ε − R_min))²

The occupied or empty probability for each cell is calculated by multiplying its angular probability by its range probability.
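A minimal sketch of this two-region model, using the variable names just defined and treating each probability as zero outside its region, is given below; the numeric values in the example are hypothetical, and the code illustrates the published formulas rather than Elfes's implementation.

```python
import math

def sonar_cell_probability(delta, theta, R, R_min, eps, omega):
    """Contribution of one sonar reading to a cell at distance delta and
    off-axis angle theta (radians), following the two-region beam model.
    Returns (p_empty, p_occupied); both are zero outside their regions."""
    if abs(theta) > omega / 2.0:
        return 0.0, 0.0
    p_angle = 1.0 - (2.0 * theta / omega) ** 2          # same factor for both regions
    p_empty = p_occupied = 0.0
    if R_min <= delta < R - eps:                        # empty region
        p_range = 1.0 - ((delta - R_min) / (R - eps - R_min)) ** 2
        p_empty = p_angle * p_range
    elif abs(R - delta) < eps:                          # occupied region
        p_range = 1.0 - ((R - delta) / eps) ** 2
        p_occupied = p_angle * p_range
    return p_empty, p_occupied

# Hypothetical 10-foot return: probe cells along the acoustic axis, one per foot.
for d in range(1, 12):
    print(d, sonar_cell_probability(d, 0.0, R=10.0, R_min=0.9, eps=0.5, omega=math.radians(15)))
```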
Successive ultrasonic readings were additively combined with previously collected data to capture an evolving picture of the world. He found that these ultrasonic maps were produced an order of magnitude faster than those from stereo vision. He had great success in actual tests using this system, both inside and outside. Michigan State's Sparta system modified Elfes' approach to better handle rapidly changing situations, and an occasional bad ultrasonic return [134].

Matthies' effort built upon the occupancy grid system to incorporate stereo vision [107]. Matthies liked the occupancy grid system because it uniformly treated sensory data and handled the uncertainty of the robot's position. However, ultrasound couldn't see through clutter like stereo vision could. Ultrasound gives better detection of broad object surfaces and indicates large empty areas, but gives less definition to surface boundaries and fine surface structure. So he integrated a stereo vision system onto the system pioneered by Elfes. He extracted near-vertical edges using a Canny edge detector [28], and independently matched five scanlines centered on the robot's horizon, looking for features that extend across all five. He used these edges to constrain the range of the search. The result of this process is a set of edge points on the horizon line of known depth. He joined these points with line segments to create a depth profile that approximates the scene surface structure. Matthies points out that combining these two sensor systems leads to one of three situations: complementation, correction, or conflict. Ultrasound made strong statements about emptiness of regions, but weaker statements about occupied areas. Stereo statements can be weak or strong, depending on the distribution of, or distance to, features. When sensors conflict strongly about a region's status, he marks the grid unknown, for later determination. Maybe the most salient feature of the occupancy grid system is that it provided a means to combine raw data of totally different natures. The sensor data have qualitatively different information encoded in the range information they provide, and this prevents a simple analytic, or geometric, integration approach towards building a coherent world model.

Matthies next tackled error modeling in stereo vision-based navigation [108]. His previous system, with stereo vision-based navigation that tracked landmarks, relied on scalar models of triangulation measurement error. Here, he developed a 3D Gaussian error model, using ellipsoids to describe the 3D geometry resulting from contours of constant error probability. Scalar-weighted error models were motivated by the fact that uncertainty grows with distance [154]. This corresponds to a spherical model, but it doesn't capture the skewness and orientation of the uncertainty. With the ellipsoidal model, for nearby points, the contours will be almost spherical, but as the points get farther away, the contours grow more eccentric. He experimented with their robot using each model. He found that with the ellipsoidal model, the robot estimated its final position to within two percent of distance traveled, and one degree of orientation. Using the spherical model, accuracy was within eight percent of distance traveled, and seven degrees of orientation. This work addresses an area of significant need; position uncertainty is a base problem for all navigation tasks.
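To make the contrast between a scalar (spherical) error model and the ellipsoidal one concrete, the sketch below propagates a fixed pixel-level error through standard parallel-axis stereo triangulation; the camera parameters are hypothetical and the propagation is first-order, in the spirit of, rather than identical to, the cited formulation.

```python
import numpy as np

def stereo_covariance(Z, f=700.0, B=0.12, sigma_px=0.5):
    """First-order 3D covariance of a triangulated point at depth Z (m) for a
    parallel stereo pair: Z = f*B/d, X = Z*x/f.  f is the focal length in
    pixels, B the baseline in meters, sigma_px the image measurement noise."""
    sigma_z = (Z ** 2 / (f * B)) * sigma_px    # depth error grows quadratically with Z
    sigma_xy = (Z / f) * sigma_px              # lateral error grows linearly with Z
    return np.diag([sigma_xy ** 2, sigma_xy ** 2, sigma_z ** 2])

for Z in (1.0, 3.0, 10.0):
    axes = np.sqrt(np.diag(stereo_covariance(Z)))
    print(f"Z = {Z:4.1f} m   1-sigma axes (x, y, z) = {axes.round(4)}   z/x = {axes[2]/axes[0]:.0f}")
```

The eccentricity of the one-sigma ellipsoid grows in proportion to depth, which is exactly the skewness a single scalar radius cannot capture.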
Beckerman and Oblow modified the occupancy grid method to address the systematic nature of ultrasound errors [9]. Their thesis is that systematic errors are dominant compared to random errors when using wide-angle ultrasonic sensors. They observed that the occupancy grid, or stochastic, approach was local in character, reducing the label for a pixel based on information pertaining only to that single pixel. Beckerman and Oblow's method is nonlocal in character, using information about neighboring pixels and maintaining self-consistency between occupied and empty space. They label grid cells with one of four labels:

E = Empty
O = Occupied
C = Conflict
U = Unknown

Conflicts arise from different labels from two or more ultrasonic returns. Table 2.1 shows how labels are propagated through overlapping ultrasound scans. Beckerman and Oblow implemented this cell-by-cell multiplication using a numeric label representation, quickly evaluating the 16 binary products to determine the new cell label L_C from the intersecting labels L_A and L_B:

if L_A = L_B then L_C = L_A, otherwise L_C = (L_A × L_B) mod 6

This logic facilitates rapid map construction and error attenuation. Figures 2.1 and 2.2, taken from [9], illustrate typical conflict resolution. Systematic errors are washed out through a conflict resolving relabeling scheme. Most of the new labels are obtained by examining the characteristic patterns of conflict that result during scanning and processing. Their results indicated that despite collecting 80 percent less data than a stochastic method, the resolution of their maps was superior. The direct multiplication scheme updated cell labels much faster than the probabilistic implementation of an occupancy grid approach [50, 107, 114, 134, 151].

Table 2.1. Logic for label combination

          L_B
L_A  |  C  U  O  E
-----+------------
  C  |  C  C  C  C
  U  |  C  U  O  E
  O  |  C  O  O  C
  E  |  C  E  C  E

Figure 2.1. Plot of representative cell patterns. Heavily shaded cells denote occupied cells, cross-hatched cells have conflict label assignments

Figure 2.2. Corrected plot of representative cell patterns from the previous figure
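The combination rule can be realized as a direct table lookup or, equivalently, as the product rule above under a numeric encoding of the labels. The sketch below uses one encoding (C = 0, U = 1, E = 2, O = 3) that reproduces Table 2.1; the encoding is an assumption chosen for illustration, not necessarily the representation used in [9].

```python
# Assumed numeric encoding: (a * b) mod 6 reproduces Table 2.1 when the labels
# differ, while equal labels simply map to themselves.
C, U, E, O = 0, 1, 2, 3                      # Conflict, Unknown, Empty, Occupied
NAME = {C: "C", U: "U", E: "E", O: "O"}

def combine(a, b):
    """Combine two cell labels from overlapping ultrasound scans."""
    return a if a == b else (a * b) % 6

# Print the combination table to verify it against Table 2.1.
labels = (C, U, O, E)
print("     " + "  ".join(NAME[b] for b in labels))
for a in labels:
    print(NAME[a] + "    " + "  ".join(NAME[combine(a, b)] for b in labels))
```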
The work of Clark and his colleagues in developing the Harvard Head is also of great interest [32]. He implemented a motion control system to provide modal control of attention of a binocular vision head with seven degrees of freedom. He noted that visual tasks require the movement of the eyes to closely examine areas of interest for the particular task, paying little attention to the rest of the scene, which is viewed peripherally. A given visual task might require the corners of an object to be detected, while another task may require that the object color be determined. Each case needs different features attended to. He defines attentive visual control in two parts: decide where in the visual scene to attend, and secondly, decide which motions are required to redirect the visual sensors toward that location. The control method is based on a two-level modal control technique proposed by Brockett [19]. The outer level controls the focus of attention, determining what features are going to be used to determine where to look next. The inner level directly controls camera pose through the driving motors. This is based on the human oculomotor control system. Their feature detection and localization is done in a special image processing system made by Datacube [123]. Their first experiment tracked blobs of a specific intensity range. Objects 0.5 to 2 meters from the head were fixated on, to within two pixels, and followed. The vision system could process five frames per second, but with communication delays, only three frames per second were realized. Their next experiment located geometric shapes by using the first three moments and intensity value as features. The scene was segmented with a connected components routine and then a saliency map was built. His present vision system is not yet optimized, and it takes one to ten seconds to find and fixate on the most salient object. Krotkov is now developing a similar system [89, 90].

Hung and his colleagues used multiple marks for determining 3D robot location in a complex industrial setting [75]. They placed the marks so that at least one could be seen by the camera from any location. They integrated a pattern recognition technique with the 3D geometrical transformations so that all viewing angles could be handled. Their experimental results showed less than 3 percent average location error. This concept of placing artificial landmarks is very simple and flexible in many applications.

Frohn built VISOCAR, a mobile industrial robot, by breaking the navigational task into a hierarchy of goals obtained by functionally independent modules [61]. He constructed a hierarchy of capabilities, not a hierarchy of image processing steps. He used optical tracking of barcoded lane boundaries, and used the landmarks on the lane boundaries to update his position estimate against his onboard landmark map. His lane tracker is a fast adaptive correlation algorithm based on a proposal by Barnea [8]. He has an independent ultrasonic obstacle detector. The complete system is controlled by a M68000 microprocessor. VISOCAR appears remarkably similar to Litton Industries' Integrator.

Davis and his colleagues at Maryland are investigating a class of vision-based problems on a Connection Machine [40]. Their RAMBO project is a testbed for exploring efficient image processing and analysis, visual tracking, and visual planning. Pose estimation is one of their current interests. The independent context in which they're tackling these problems warrants interest.

Sharir and his colleagues have extensively pursued geometric path planning algorithms [135, 137]. Figure 2.3, taken from [137], illustrates the polygonal geometry.

Figure 2.3. Shortest path in 2D polygonal space

They assume that complete information is known about a static environment characterized by bounding polyhedral surfaces and objects. In a 2D application they have developed an O(n² log n) algorithm that is guaranteed to find the shortest path. In 2D space this shortest path between start and goal must be a polygonal line whose vertices are the start, the goal, and the corners of the polygonal obstacles. The 2D problem is solved using a visibility graph to determine which vertices are visible from the others and what the connecting edge length is. An optimal shortest path routine such as Dijkstra's algorithm is then invoked to calculate the path. This is one of many roadmap approaches to path planning; others are called retraction or Voronoi diagrams, freeway nets, and silhouette [96]. There is great research activity in computational geometry, but it offers little advantage in a dynamic, uncertain environment.
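A compact sketch of the visibility-graph construction followed by a Dijkstra search is given below. It assumes convex polygonal obstacles supplied as counter-clockwise vertex lists, ignores degenerate collinear cases, and is meant as an illustration of the roadmap idea rather than the O(n² log n) algorithm of [137].

```python
import heapq
from itertools import combinations

def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def segments_cross(p, q, a, b):
    """True if segments pq and ab properly cross (shared endpoints ignored)."""
    if {p, q} & {a, b}:
        return False
    return cross(a, b, p) * cross(a, b, q) < 0 and cross(p, q, a) * cross(p, q, b) < 0

def strictly_inside(pt, poly):
    """Point strictly inside a convex, counter-clockwise polygon."""
    n = len(poly)
    return all(cross(poly[i], poly[(i + 1) % n], pt) > 0 for i in range(n))

def shortest_path_length(start, goal, obstacles):
    edges = [(poly[i], poly[(i + 1) % len(poly)])
             for poly in obstacles for i in range(len(poly))]
    nodes = [start, goal] + [v for poly in obstacles for v in poly]

    def visible(u, v):
        mid = ((u[0] + v[0]) / 2.0, (u[1] + v[1]) / 2.0)
        if any(strictly_inside(mid, poly) for poly in obstacles):
            return False                     # reject chords through an obstacle interior
        return not any(segments_cross(u, v, a, b) for a, b in edges)

    # Build the visibility graph.
    graph = {n: [] for n in nodes}
    for u, v in combinations(nodes, 2):
        if visible(u, v):
            d = ((u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2) ** 0.5
            graph[u].append((d, v))
            graph[v].append((d, u))

    # Dijkstra over the visibility graph.
    dist, heap = {start: 0.0}, [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for w, v in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return float("inf")

# One square obstacle between start and goal.
square = [(1.0, -1.0), (3.0, -1.0), (3.0, 1.0), (1.0, 1.0)]
print(shortest_path_length((0.0, 0.0), (4.0, 0.0), [square]))   # about 4.83
```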
Lumelsky addresses the pose problem and develops a path planner with incomplete information [104, 105]. He felt that most applications are inherently dynamic, and that any strategy assuming complete or static environment information would fail. He explored whether a richer sensor, like stereo vision, could be naturally included in a path planning model to further enhance performance. He concludes that major modifications of tactile algorithms must be made to take full advantage of these additional sensing capabilities, without losing algorithm convergence. Efficient organization of feedback interaction between the subsystems gathering sensor data and path planning proved equally important.

Grandjean explores fusing surface data from laser range finders with photometric data from stereo vision to assist in building 3D geometric scene models [64]. Kalman filtering is used to fuse numeric features into higher level models, and to derive self-calibration transforms. Explicit representation of data accuracy is propagated to every level of his perception process. The two sensor types operate in harmony, with the laser range finders flagging the presence of a surface while stereo edges determine the surface's location and extent. Final 3D scene modeling is done with a set of planar faces. His experimental work indicates strong promise for this modeling to assist higher-level robotic processes such as localization and scene interpretation.

2.2.2 Outdoor Applications

We now move our scope to outdoor systems. DARPA and NASA are fueling a great deal of work in outdoor systems. We won't attempt to cover subjects in any depth, but outdoor systems have delivered some significant results. There are many lessons in how these efforts approached their navigational challenges and what limitations they reluctantly accepted.

The prototype legged vehicle that may someday explore the Martian surface, called Ambler, completely avoids vision-based approaches [71]. Hebert and his colleagues opted for active range sensors because of fewer necessary computations and insensitivity to illumination conditions. The interplanetary communication lag here mandates fully autonomous operation, and these researchers felt that vision was too expensive computationally.

Intractability is a recurring theme, even in relatively massive, distributed, autonomous land systems such as the ALV [48, 97, 157]. There is general difficulty in getting real time sensor-based control in unstructured environments [39, 115]. Advanced modeling techniques such as temporal stability offer just the advantage that an outdoor environment demands, but they aggravate the processing bottleneck [13]. Very complex systems even predict the appearance and disappearance of landmarks using dynamic model matching [117]. A knowledge-based landmark recognition system uses an a priori map, perceptual knowledge and spatial reasoning. Model-based vision tries to reduce the computational complexity. Despite all the domain knowledge, ALV video processing is still extremely challenging [39]. If a structured road is to be followed, a fast pipeline can be laid between sensing and acting, creating a model of the perceived environment and then generating a control plan [20, 69, 70, 122, 153]. Independent parallel processes have been used to produce coherent behavior and faster response. Road modeling by spline functions and edge detection by linear recursive filters have also been used successfully, achieving processing rates of 14 seconds per frame on a VAX 11/780 [25]. Other approaches match road geometry [103], or intensity [41]. These methods lend themselves well to parallel processing. Kluge makes a strong case for using strong explicit models to achieve reliable recognition [85].
He feels that there still isn't a reliable road tracking vision system despite vigorous research. Weak models with weak, or nonexistent, higher level processes make them brittle to lost features or illumination changes. His thesis is that assumptions made in road modeling must be available for program access and modification. The SCARF system is a color-based, domain-modeling bicycle path follower. It proved quite reliant on known road shape, and was sensitive to illumination changes. Kluge's colleagues had trouble in constructing an explicit model since there were few inherent features. Their latest system, FERMI, uses five trackers to build explicit models with strong constraints. It uses higher level reasoning about the road. There will be two techniques of tracker fusion in their developing effort. Maryland's system is arguably the best, and is based on using Hough transforms to find and group edges. But even their methodology fails with strong, straight shadow edges from trees and buildings, and it is sensitive to illumination changes [157]. The VITS system has followed roads at speeds up to 20 kph and detected and avoided obstacles. General capability was sacrificed for speed [152]. Dickmanns' Mercedes van ran as fast as 100 kph on the autobahn [44]. His extremely simple perception model uses a monochrome camera and a simple edge detector. The system attempts to discard distracting edges, but the trackers can get fooled by shadows, puddles, road imperfections and varying illumination.

One of the lessons being learned by navigating outdoors is that an unstructured environment is extremely demanding on domain models and processing capabilities, even with complex, multisensory approaches. Current efforts are rich in knowledge-based domain reasoning, parallel processing and direct ranging. All these systems incorporate a massive degree of a priori information. Successful indoor systems have been realized without using any of these tools, but extending these early indoor efforts may require adopting the discoveries from outdoor research.

2.3 Pose Estimation

Determining the position and orientation, or pose, of a robot camera is a problem that has been examined by many researchers. It's a pivotal problem since it's difficult to navigate well with pose uncertainty. If dead-reckoning travel were error-free, pose estimation wouldn't be an interesting issue, but unfortunately, this is far from the case. Pose estimation has become the cornerstone task in most mobile robotic applications.

Fischler and Bolles published a watershed paper that initially examined the Perspective-3-Point problem (P3P) [59]. In this problem the task is to determine the point in 3D space from which an image of known landmarks was obtained. This method seeks the correspondence between n image points and n control points, where n = 3 is the smallest number of landmarks. They characterized the P3P problem and provided a closed-form solution, but showed that it could yield as many as four solutions. The issue of the maximum number of possible solutions for the P4P and P5P problems remains open, but the authors proved that the P6P problem provides a unique solution. Their most important result was that the P4P problem yields a single solution if the control points are coplanar. We shall examine some very interesting work on the P4P problem due to Abidi later [1].
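Stated concretely, if s1, s2, s3 are the unknown distances from the center of perspective to the three control points, θij the angles between the corresponding viewing rays (measured from the image), and dij the known distances between the control points, then P3P reduces to the law-of-cosines system below; this is a standard statement of the problem rather than a quotation from [59].

```latex
d_{12}^2 = s_1^2 + s_2^2 - 2\, s_1 s_2 \cos\theta_{12} \\
d_{13}^2 = s_1^2 + s_3^2 - 2\, s_1 s_3 \cos\theta_{13} \\
d_{23}^2 = s_2^2 + s_3^2 - 2\, s_2 s_3 \cos\theta_{23}
```

Because the system is quadratic in each unknown distance, up to four physically meaningful solutions can result, which is precisely the ambiguity noted above.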
Krotkov follows and extends Sugihara's work [143], pursuing location estimation from a single image frame [88]. A motive for this approach is to avoid the difficult reconstruction problem by using just one image. Here he matches rays to vertical edges taken from the image against an a priori landmark model. This approach uses the excellent angular resolution of CCD cameras, yet avoids 3D reconstruction issues and feature correspondence. Image processing beyond edge detection is avoided to allow more time to perform the landmark interpretation-tree search. Algorithm complexity is O(n⁴) in time and O(n) in space. Noisy rays can be handled by giving up some accuracy for the sake of robustness. Spurious landmarks are handled by introducing a null model landmark. There is great potential for color here. Establishing color correspondence before searching the interpretation tree would decrease the time complexity by O(n). If two color landmarks were extracted, complexity would go down to O(n²).

Sarachik approaches the self-localization problem from a different angle [133]. She bases her work on three assumptions: rectangular rooms, inherent features at the wall/ceiling junction, and most importantly, flat, uniform-height ceilings. These assumptions are reasonable in many indoor environments. Her uncalibrated stereo cameras are mounted with the line of sight about 45 degrees above the horizon. She is trying to take note of only those features that never change - the position of the walls. Her approach starts with the robot spinning in place taking multiple images, and using a single, composite image it determines where it is facing a wall head-on. It then uses a 1-dimensional strip from each of the cameras to match edges, and notes the vertical offset between the wall and ceiling. It is looking for the edge of greatest depth. It then images the same ceiling-edge pair from many vantage points in the room and determines the upward tilt of each camera. This angle provides enough data to calculate the dimensions of the room, scaled by the unknown, but constant, ceiling height. Performance of her robot yielded only 75 percent reliable self-calibration, and thus it had trouble consistently determining room size. The failures in accurately determining the distance to the walls were the result of incorrect matches. Improved robustness in matching, and an ability to prune bad data points, are being pursued to enhance results. The combination of redundant sensors and probabilistic integration could yield large improvements. Sarachik's method of navigation seems quite appropriate in environments that are busy and dynamic. Room identification by size or position of the door would be interesting to pursue.

Haralick developed a closed form solution to the P4P problem assuming both coplanar rectangular marks and a known focal length [68]. The planar geometry of the 2D perspective projection resulting from a rectangular mark determines the camera viewing parameters in the 3D world. His approach's only ambiguity is whether the camera is looking up and seeing the rectangle from below on one side, or whether the camera is peering down and seeing the target rectangle from above on the other side. Kite developed a similar closed-form geometric approach that also presumed a coplanar target rectangle and a known focal length [83]. He combined the two solutions using heuristics based on human vision and determined the unique 3D pose of the camera.
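As a concrete, modern stand-in for this family of coplanar-target methods (and not Haralick's, Kite's, or Abidi's specific algorithms), the sketch below recovers camera pose from the four corners of a rectangular mark of known size using OpenCV's general PnP solver; the mark dimensions, focal length, and corner measurements are hypothetical.

```python
import numpy as np
import cv2

# Coplanar rectangular mark of known size (0.60 m x 0.40 m), corners given in the
# mark's own frame with Z = 0.
object_pts = np.array([[0.0, 0.0, 0.0],
                       [0.6, 0.0, 0.0],
                       [0.6, 0.4, 0.0],
                       [0.0, 0.4, 0.0]], dtype=np.float64)

# Hypothetical measured corner locations in the image, in pixels.
image_pts = np.array([[312.0, 248.0],
                      [401.0, 245.0],
                      [398.0, 312.0],
                      [315.0, 309.0]], dtype=np.float64)

f = 600.0                                   # assumed (known) focal length in pixels
K = np.array([[f, 0.0, 320.0],
              [0.0, f, 240.0],
              [0.0, 0.0, 1.0]])

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, None)
R, _ = cv2.Rodrigues(rvec)                  # rotation from mark frame to camera frame
print("camera position expressed in the mark frame:", (-R.T @ tvec).ravel())
```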
Two other researchers with dissimilar applications represent different approaches to the same pose estimation problem. Ray describes a model-based vision method for tasks where approximate object position estimates are known [127]. Geometric models for objects in a tight workspace and object position estimates are used to predict where linear image features are expected to appear in each view. An object's pose is then estimated as the position which optimizes a figure-of-merit function that describes the correspondence between the observed features and the features predicted from the estimated position. Han determines pose using tree annealing [67]. He formulates pose determination as a nonlinear optimization problem. The criterion function turns out to possess local minima, and he deals with these by using a technique called tree annealing. His simulations indicated that average computing times of around 150 seconds are to be expected, so this method seems to offer little for real time applications.

Abidi introduced a new pose estimation technique based on the volume of the tetrahedra formed by three corners of the rectangular target and the lens center. Figures 2.4 and 2.5, taken from [1], illustrate the situation with the target on the right, and summarize the algorithm stages.

Figure 2.4. Viewing geometry for P4P problem

Figure 2.5. Algorithm steps to recover global pose and focal length (given the target dimensions and the target's image coordinates, the stages recover the focal length, the interior and exterior orientation, and the final pose)

There are actually 12 ways to calculate the focal length, so all are found, and the median is used by the remaining algorithm stages. The importance of this effort is that it doesn't presume the focal length of the camera. It uses the redundancy in the inherent geometry to minimize error in pose recovery. The target dimensions are known and the only other inputs are the target coordinates on the image plane. It calculates seven outputs: the 3D global position, the 3D orientation, and the effective focal length. The geometry also yields 6 redundant measures of distances to the target, and again the medians are used. A singularity exists when the target is exactly parallel to the image plane. It has also been shown to be sensitive to the exact coordinates of the image points. This sensitivity limited capability as the distance from the target increased, so Abidi developed an enhanced version that performed shape restoration using a conjugate gradient technique. The ability to extract the effective focal length is important. It reduces the error from assuming a fixed focal length, and can determine the effective focal length when using an autofocus camera.

2.4 Sparta and Rome

The Sparta system is the fruition of years of work by the Michigan State University Departments of Electrical Engineering and Computer Science [151]. Sparta is based on the occupancy grid system pioneered by Elfes, but incorporates a rotating sensor head holding eight ultrasonic sensors. Figure 2.6, taken from [151], illustrates the construction. The Transitions Research Corporation (TRC) Labmate base is capable of speeds up to around 1 meter per second, and supports ultrasonic and infrared sensors with an additional sensor board.
The sensor board mates to a PC through a serial port, and the sensors and locomotion are commanded by high level function calls based on the work of Crowley [36, 37, 38]. Several computational enhancements have been included to decrease the time necessary to scan and map its immediate environment. Its path planning system is based on the work of Lumelsky [104, 105], and it has demonstrated intraroom mapping and navigation capabilities. It presently takes about 25 seconds to scan and update its occupancy map. Work is maturing towards implementing a wall-following algorithm that will enable Sparta to navigate at higher speeds down a known hallway [14, 15, 86]. IR sensors will be used to skip down the side of a hallway at a standoff of 1.5 meters. Ultrasound will be used only to avoid a frontal collision. There are no plans to incorporate vision. Present efforts are aimed at achieving the best possible speed of travel and using only prudent sensing to accomplish the task. No pose estimation is performed outside dead-reckoning.

Figure 2.6. Sparta (turntable, ultrasonic sensor tower, frame, host computer, infrared sensor, bumper guard)

The Sparta system represents an inexpensive platform for conducting both navigation and sensory research. Its design became the initial blueprint for the construction of Rome, the hardware platform used in the examination of this dissertation. Rome, illustrated in Figure 2.7, was designed to have an untethered, on-board Zenith 386SX laptop computer, and a static array of 24 Polaroid Instrument Grade ultrasonic sensors around the perimeter of its sensor head. Decreasing the time necessary for range acquisition, and the capability for monocular frame grabbing, were the primary hardware motivations. A Panasonic GP-KR202 color CCD camera mated to a 6mm lens and a Data Translation DT2803 black and white frame grabber were chosen. Our goal with Rome was to provide a mobile sensor platform that would enable us to pursue landmark based navigation and pose recovery, while simultaneously increasing the speed of travel and decision making. As Rome matured as a hardware platform, its role as a sensory testbed attracted other experimenters and approaches.

2.5 Summary

We have examined a broad variety of robotic navigators in this chapter. Each system has its merits and liabilities, and each aids in framing the challenges we face in our approach to landmark based navigation and pose recovery. Many efforts provide insight into solving specific navigational tasks, and extend traditional methodologies toward more general and robust problem solving capabilities. A few central trends have become apparent in robotic sensing. Computer vision is unique in offering surface detail and spatial resolution capabilities, but at a computational cost that restricts real time applications. Ultrasonic sensors enable direct, real time ranging, but lack the necessary angular resolution to independently solve high level problems without large a priori databases. Multisensory approaches are becoming very common in robot navigation, but methodologies to fuze dissimilar sensory information are just now maturing. Pose estimation is the base task with respect to general robot navigation. Progress in enabling the navigator to self-locate will rapidly open up its range of applications. Most progress to date, observed with both indoor and outdoor systems, has been in domains providing at least minimal structure to the navigator.
In the next chapter we will examine and define the task of independent, autonomous localization. This discussion will expose some of the representational challenges whose solution both limits its domain and enables its utility.

Figure 2.7. Rome

CHAPTER 3

The Where-am-I Problem

A core question in robot navigation is the robot's ability to locate its pose relative to a frame of reference. The immediate scope of the present work is indoor navigation, making the reference frame a particular building coordinate frame. This chapter presents the Where-am-I Problem, a key challenge for robot navigation applications. A robust solution to this problem provides the recovery of a lost pose, and enables aggressive approaches to navigational tasks. We discuss the salient elements of recovery from disorientation, focusing on connectivity and relational information. We then present competing approaches to representation and propose a representation for the robot's world. Instead of establishing a dense, computationally demanding geometric model of the robot's environment, we offer a sparse representational scheme that requires strong symbolic matching, ambiguity resolution and error rejection capabilities. Our representation is built upon the novel idea of encoding extractable environmental features as attributed edges in a sparse symbolic graph.

3.1 Navigating with a Map

A pivotal question in navigation is the robot's ability to locate its pose relative to a reference frame. The degree of accuracy that defines an accurate pose is relative to the application, and is correlated to the objects being manipulated. For industrial manufacture, this accuracy may be a small fraction of an inch, but for navigation we need less metric accuracy. The degree of pose accuracy might vary from task to task within the same application. Six-inch pose accuracy may be sufficient for hallway travel, but is too coarse for entering a room through a threshold. If the robot has some knowledge of a past landmark, it could use dead-reckoning to approximate its current pose. Our Labmate is wheeled and has shaft encoders. In a perfect world with no slip or slide between wheel and ground, readings from these encoders would give an extremely accurate estimate of pose [147]. Unfortunately, wheels slip and slide with magnitudes that are a function of wheel velocity and acceleration, particular ground surface composition and shape, and wheel load. All these aspects can be modeled, but the surface the robot travels upon must then be restricted, and we can rarely control the surface dirt and temperature well enough to avoid errors of the same magnitude [21]. We don't see much hope or portability in trying to overcome the uncontrollable factors in a particular environment in order to construct a metric map to navigate with. So what must be done is to construct a navigator that can be periodically updated with an accurate pose taken from some passing landmark. This is similar to the way humans travel through both familiar and unknown surroundings. Humans rarely possess an accurate metric map of an environment unless it is extremely familiar. Kuipers' work supports a landmark-based cognitive model in humans [92, 93]. His thesis is that during human navigation, decision points and landmarks encountered constitute the cognitive map. Humans often construct a map based on topological and geometric properties, not on metric information.
When a hiker gets lost in the Rocky Mountains without a map, he doesn't construct a metric map to navigate back to civilization, but most likely searches for recognizable landmarks that provide a coarse pose. Even with a pose provided only by the sun's position, or the Big Dipper, a hiker can intelligently navigate towards an anticipated river, road, or powerline. The hiker then can follow this feature using recalled topologies to a safe haven. It bears emphasizing that the hiker is only completely lost if he has no metric, topological or geometric information. Soon a handheld Global Positioning System receiver will provide our lost hiker his 3D pose to within 25 meters [130], but this is too coarse for our application, and offers no advantage indoors.

Continuing the hiker analogy, the Where-am-I Problem occurs when the hiker is placed in a location with no dead-reckoning information. The hiker is free to look around and extract feature data, and recall geometric and topological relations, but the hiker is surrounded by what Brooks calls an uncertainty cylinder [21]. This is a volume centered about an estimated position with radii corresponding to the positional uncertainties along various axes. Brooks' approach is to build and maintain a rubbery, relational map. His approach closely models the human manner of navigation. A liability is that the farther the robot navigates, the larger the uncertainty cylinder. After three right turns, it is even possible to be uncertain that it has traveled at all. Without periodic pose fixes, a robot in any building would be paralyzed with uncertainty. Elfes' approach is to build an explicit 2D metric map of the robot world. Using a probabilistic construct, this method yields a very useful map if the robot knows where it is in the map. Sparta has already shown some weakness in dead-reckoning despite a world limited to a single room. Litton Industry's Integrator and Frohn's VISOCAR use optical floor markings to explicitly update the robot's pose as often as once a second, depending on its speed. This also represents an unacceptable handicap: too much structure for any expected utility in our application.

Ultrasound can't solve the Where-am-I Problem alone; it is too coarse in angular resolution and is corrupted by reflections and specularities. Crowley even compared sensing with ultrasound to trying to navigate through a dark house of mirrors with only a flashlight [36]. A camera can accurately extract the landmarks needed to periodically fix the robot pose. In this dissertation, we combine the metric and topological approaches using a hybrid map. This map is an attributed graph holding the topological properties of the building in its connectivity, and coarse metric information in its attributed edges. McDermott and Davis pursued a massive artificial reasoning project using a similar approach [109]. Each feature that can be extracted by ultrasonic, monocular vision, or IR sensors could be an edge. A discrimination tree will classify features by their properties to facilitate rapid correspondence. Fuzzy feature data, most often from ultrasonic sensors, will be passed up to an executive routine that will try to match to edge attributes within a limited uncertainty region. When a high confidence feature match occurs, most likely from the camera, the accumulated dead-reckoning error determining the radius of the uncertainty region can be briefly reset.
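A minimal sketch of this match-and-reset cycle follows. The structures and names (FeatureEdge, try_pose_update, the confidence threshold) are purely illustrative assumptions, not the executive routine actually built for this work:

```python
import math

# Hypothetical attributed edge: a feature (doorway, corner, ...) with a coarse
# metric position taken from the a priori building map.
class FeatureEdge:
    def __init__(self, kind, x, y):
        self.kind = kind          # e.g. "doorway", "corner"
        self.x, self.y = x, y

def try_pose_update(sensed_kind, confidence, est_x, est_y, uncertainty_radius,
                    map_edges, min_confidence=0.9):
    """Match a sensed feature against map edges inside the current
    uncertainty region; on a high-confidence match, reset the pose
    estimate and the accumulated dead-reckoning uncertainty."""
    for edge in map_edges:
        if edge.kind != sensed_kind:
            continue
        if math.hypot(edge.x - est_x, edge.y - est_y) <= uncertainty_radius:
            if confidence >= min_confidence:
                # High-confidence match (most likely from the camera):
                # snap the pose to the matched edge and shrink the region.
                return edge.x, edge.y, 0.0
    # No acceptable match: keep the estimate; uncertainty keeps growing.
    return est_x, est_y, uncertainty_radius
```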
This periodic flushing of accumulated uncertainty is pivotal to having any useful capabilities. The central interest in the Where-am-I Problem is recovering from complete disorientation. Complete disorientation, or as complete as this generation of robots will deal with, occurs when the robot is completely uncertain of its position within its map. This is probably the most important problem to conquer in order to build a more flexible local capability. More challenging tasks can be tackled if our robot can gracefully recover from complete navigational failure. Test pilots will try just about anything if they have a parachute on their back. So in our application, the challenge occurs as we start up the robot at some unspecified location within a chosen domain: the robot has complete freedom to use its a priori map, but it could initially be anywhere on this map. Unless we know the angle the acoustic axis forms with the reflecting surface, ultrasound is not capable of a high confidence match with an edge attribute, so the robot must register a match by other means. Humans might try to locate themselves by reading room numbers, but more often would seek a topological or geometric feature such as a stairwell or hallway corner. So unless we spread enough artificial visual landmarks to always have one within the camera's field of view, the robot will wander around looking for a visual landmark. If the robot were within a room at startup, it would first have to locate and navigate to the doorway and then enter the hallway. Since a door is an edge on the graphical map, and the robot is looking for a particular edge to fix its pose, it must traverse the nodes to seek another edge unless it registers with its initial edge. A logical follow-on effort might be to recognize a room by its dimensions with the method of Sarachik [133]. The robot must be able to fix its pose within a reasonable amount of time. A performance time goal of 1 minute seems reasonable, but this presumes that the Labmate base is capable of a reasonable forward speed. Dulimarta's exploratory work with the Labmate base reveals that linear travel in excess of one meter per second will be difficult to achieve with the current hardware [47].

3.2 Representation

The choice of representation for a building map pivots around issues including the ability to hold information usable to the robotic navigator, flexibility in a dynamic environment, and appropriateness as a path planning database. The key decision strategy is to make the choice from the robot's point of view, respecting its limited sensing capability and coarse navigation skills. Adopting this perspective is what cartographers would term a robot level of use [27]. Cartographers are often challenged to build different representations of the same region for different levels of intended use. Another challenge is to separate the decision from our human skills, but not from our cognitive experience. A human spy might desire to have a complete scale map of a building, supplemented by color photographs of particular features. But this wealth of data is only useful to humans because of their massive data compression ability and fast indexing skills. Humans need little metric information as long as they have a sense of scale. Human experience records how wide doorways are, and what elevators look like. There are three major types of representational schemes that have been employed by various researchers.
They are dissimilar in approach, each with its attributes and liabilities, and are presented in the next three sections.

3.2.1 Geometric Paradigm

Traditional approaches to robot sensor interpretation have mainly relied on the recovery and manipulation of geometric world models. Low-level sensing processes extract geometric features such as surface patches and line segments from sensor data to constrain the sensor interpretation process. The resulting deterministic geometric world models are then used as the basis for robotic activities such as obstacle avoidance, path planning, or planning assembly and grasping tasks. These approaches characterize what Elfes has labeled the geometric paradigm in robotics and computer vision, and have several shortcomings [51]. Elfes concluded that these approaches lead to brittle and sparse world models. They require early decisions in the interpretation of sensor data for the application of specific model primitives, and do not provide adequate means for handling the uncertainty and error intrinsic in the sensory data. These approaches rely heavily on the adequacy and accuracy of the prior world models and the heuristic assumptions made. This also introduces strong domain-specific dependencies. Better environment descriptions are primarily derived from the application of finer-tuned prior models and further constraints to the available sensor data, not from further active sensing. A consequence of these liabilities is the gap between the two informational layers: the layer that corresponds to the imprecise and limited data actually delivered by the sensors and the layer of abstract geometric and symbolic world models manipulated by the perception processes. If such a gap exists, the mobile robot's utility will be restricted to highly structured domains.

3.2.2 Computational Geometry

Reducing the 2D world to a collection of polygonal lines and regions has been heavily studied by the computational geometry community. Its modeling restrictions of straight lines and convex polygons aren't a severe restriction to the robot world. The main attribute of this methodology is the existence of a tractable shortest path algorithm. Sharir's algorithm [137] has a complexity of O(n² log n), which is not a handicap here, but its assumption that the world is completely known and static is a serious liability. Once a shortest path is computed, the robot begins travel, but if a previously unknown obstacle blocks the path, the robot must recompute its path, again at an O(n² log n) cost. This approach also places the path through the vertex corners of the confining polygonal regions, an area along the path very likely to contain obstacles. Pursuing a retraction approach to constructing a Voronoi diagram would move the path out through the centers of the tessellated regions, reducing the probability of stumbling into corners due to poor gross navigation. This approach also suffers from the assumptions of a known and static environment. Incorporating environmental updates is also cumbersome with this method because a new vertex may actually change the coordinates of its neighbors [135]. Floyd's algorithm could be used instead of Dijkstra's since it adds robustness to the path planner by finding a shortest path from any point to any other point. At an additional O(n) cost, this would allow the navigator to immediately know the shortest way remaining to the goal after dodging an unexpected obstacle.
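As an illustration of that trade-off, a minimal Floyd–Warshall sketch for all-pairs shortest paths over a small vertex graph; the encoding and names are ours, not taken from the cited work:

```python
def floyd_warshall(dist):
    """All-pairs shortest paths.

    dist is an n x n matrix of edge costs between map vertices, with
    float('inf') where no direct edge exists and 0 on the diagonal.
    Returns the matrix of shortest-path costs between every pair, so a
    navigator that dodges an obstacle can immediately look up the best
    remaining route to its goal from any vertex.
    """
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):              # allow paths routed through vertex k
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```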
How to represent features that are extractable by the robot sensors within this methodology is unclear. To a robot, and indeed a human as well, a visual landmark such as a colorful symbol or room number could be more useful than geometric region boundaries. In our opinion, the computational geometry approach to mapping is inflexible, and too dependent on a static environment to offer an advantage in our application.

3.2.3 Occupancy Grid

Elfes pioneered the occupancy grid representation. This grid is a 2D array of square cells representing a square of floor space, each cell being assigned a probability of occupancy that can vary with time. This occupancy data is derived directly from the ultrasonic returns, and the robot literally builds its world map as it travels, with no a priori information. This approach can be tailored to the coarseness of the robot's sensing capabilities, and nicely supports a dynamic world. The robot navigator can fall victim to false or reflected returns, and can even become trapped by phantom obstacles, particularly in close quarters. The significant liability of this method is one of scale. The robot needs a grid cell size of about a foot square to support local navigation, but representing a large building this way is troubling. Multiple scales are needed in the representation. Moreover, path planning to goals outside the robot's ultrasonic horizon will be cumbersome with this representation, since there exist so many similar, but distinct, cell trails to the goal. Local use of this representation has shown the computational overhead to be significant, and has motivated the pursuit of faster, IR-based strategies for hallway travel [151]. In our opinion, the occupancy grid representation is too fine in detail for use outside a locality such as a single room, and would seriously degrade the robot's navigation speed. The grid cell research community is very active and new approaches using this methodology appear frequently.

3.3 A New Approach - StickRep

The structure of a large building is often dominated by long hallways. This structure lends itself well to defining freespace as a 1½D map. The robot is most interested in freespace, not obstacles. Travel on a single floor could be represented by abstractly cutting and stretching out a hallway into a line with rooms hanging off the hallway line. Branch hallways are nodes on the line. Features along the hallway are captured as edges between the physical nodes, with attributes describing information of interest. The cartographic research community has built databases on very similar concepts [27]. Highway segments are encoded as links and towns as nodes, each with attributes, and networks are built upward from these primitives. Political units, urban areas, and water bodies are modeled with a set of links forming a closed polygon. These basic components may have connecting relationships defining the overall topology. The building representation which we developed could be termed a feature highway. Wall and door features are represented as a sequence of attributed edges with nodes providing the topological connectivity. The robot navigator travels along a feature highway either confirming anticipated feature edges, or updating new information in the graph. We are motivated by economy: to investigate how well a navigator can perform with minimal information. The building map forms a database: the executive routine registers matches to the inexact feature data acquired by the sensors.
The types of features included in the database are limited by the ability of the sensors to reliably extract these features. Many richer representational schemes have been developed, but most encode features with more complexity than we can presently extract [3, 5, 6, 7, 9, 14, 15, 30, 35, 51, 57, 58, 60, 81, 84, 119, 160, 161]. Our modest experiment warrants a sparser representational scheme. Since the angular resolution of the ultrasonic sensors is poor, only coarse angular adjacency information is encoded into the database. The StickRep representation scheme sparsely represents the 3D world, capturing only topological connectivity, coarse metric data, and feature types. Such a scheme places a greater burden on subgraph matching, ambiguity resolution, and error recovery than do other more mature efforts [9, 12, 54, 87, 124].

We examined several candidate structures, and chose an attributed edge graph to represent the a priori building database. Planar physical domain features such as wall segments and doors are represented as attributed edges. Door jams and hall corners are the junctions of planar physical domain features, and are represented as attributed nodes. StickRep is a departure from the majority of navigation representation schemes. Many previous efforts have been built on 2D representational structures [9, 15, 49, 51, 66, 141, 158]. StickRep is a 1½-dimensional structure that provides a minimal representation for our robotic navigator. StickRep encodes sequences of feature information. Each edge contains some or all of the following attributes:

type ∈ {corner, doorway, elevator, hallway, pattern, stairwell}
metric ∈ ℝ⁺
prototype ∈ {doorjam, wallsegment, pattern}
pattern ∈ {a small set of artificial visual marker symbols}

Nodes hold adjacency information of the incident feature edges and contain the following attributes:

adjacency angle ∈ {0, ..., 360}
global coordinates ∈ {XW, YW, ZW, α, β, γ}

We illustrate this representation with an example in Figure 3.1. To more clearly portray our representation, we have omitted the prototype edge attribute, and have actually drawn the edge adjacency angles to represent the angles present in the physical domain. In all the domains we examined, sequences of features were represented as rings of adjacent edges. Since features on opposing sides of a hallway are not physically adjacent, opposing features may be represented in separate rings, depending on the hallway geometry of the domain. In Figure 3.1, the three graph components illustrated are actually fragments from three separate feature rings. Each ring is joined to another ring at only one artificial node, so a complex domain is represented as an attributed graph with weakly connected components. This representation proved fast and flexible when matching physical feature adjacencies, but representing opposing features is more difficult, and was not pursued. This representation was used to match sensed feature data to a priori domain information. After a navigator determines its current pose, it must maintain this pose by periodic updates. If a high confidence match is made between sensed data and the next anticipated edge on the graph, within its current uncertainty circle, the current pose estimate can be updated to this edge. If not, the robot will continue searching for the anticipated feature edge until it reaches a tolerance limit that initiates complete pose recovery. Chapter 6 examines the use of this representational scheme to support extensive simulated navigation.
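A minimal sketch of how such an attributed edge graph might be encoded. The field names mirror the attributes listed above, but the encoding itself is only illustrative and is not the dissertation's actual data structure:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class StickNode:
    """Junction of planar features (door jam, hall corner, or an
    artificial node joining two feature rings)."""
    adjacency_angles: List[float]           # degrees between incident edges
    global_coords: tuple                    # (Xw, Yw, Zw, alpha, beta, gamma)
    edges: List["StickEdge"] = field(default_factory=list)

@dataclass
class StickEdge:
    """Attributed edge: one planar feature along a hallway ring."""
    kind: str        # corner, doorway, elevator, hallway, pattern, stairwell
    metric: float    # coarse width/length, a positive real
    prototype: str   # doorjam, wallsegment, pattern
    a: StickNode = None
    b: StickNode = None

def anticipated_edges(node: StickNode, kind: str):
    """Edges of a given type incident to the robot's current node --
    the candidates to confirm as it travels the feature highway."""
    return [e for e in node.edges if e.kind == kind]
```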
3.4 Summary

In this chapter we have presented the Where-am-I Problem. A solution to this problem provides the ability to recover pose, and enables aggressive approaches to navigational tasks because the executive routine is confident of recovery from disorientation. We discussed the salient elements of disorientation recovery, focusing on feature connectivity and relational information. We emphasize symbolic feature data, not metric properties or relationships. We then discussed competing approaches to representation as well as our proposal for representing the robot's domain. Instead of establishing a dense, computationally demanding geometric model of the robot's environment, we offer a minimal representational scheme that requires symbolic matching, ambiguity resolution and error rejection capabilities. Our representation is built upon encoding extractable environmental features as attributed edges in a symbolic graph that sparsely models the 3D world. Our representational scheme hinges on strong, inherent assumptions about the structure of an indoor navigation domain. In the next chapter, we will present our methodology for extracting symbolic feature information from our structured domain.

Figure 3.1. StickRep example

CHAPTER 4

Sensing Landmarks

In this chapter we will examine the capabilities and liabilities of monocular vision and ultrasonic ranging as compared to the more traditional stereo vision. We will note that stereo vision cannot provide the ranging we desire rapidly enough for our real time application. The use of monocular vision and ultrasound are not without their own liabilities, and we will explore ultrasonic ranging error experimentally. The key problem is the ability to accurately replace the depth information provided by stereo ranging with a fuzed monocular vision and ultrasonic methodology. The difficulties with stereo vision systems led us to investigate other promising ranging systems [9, 15, 51, 87, 107]. We sought to combine the most capable attributes of vision and active ranging approaches into a fuzed acquisition methodology using a single camera and a suite of ultrasonic sensors. In this chapter we will present our landmark sensing strategies and closely examine the capabilities of ultrasonic sensing. In the following chapter we will formally present the geometric algorithms that use these sensing strategies in a structured domain.

4.1 Visual Sensing of Vertical Ribbons

Fine feature detail in the operating domain cannot be obtained through ultrasonic range sensing. Ultrasonic range data is too coarse in an uncontrolled environment and erroneous data is common. Ultrasonic sensing will be examined in detail in Sections 4.2 and 4.3. We will now discuss the extraction of fine feature landmarks from the domain using a monocular camera. The requirement for real time processing restricts many common approaches. Many researchers have pursued stereo vision methodologies, but they struggle to overcome the computational complexity of stereo correspondence [33, 34, 87, 99, 116, 118, 121, 126, 129, 146, 149]. In the application of stereo ranging methodologies to real world navigation, the intrinsic computational expense of extracting three dimensional range data from stereo pairs limits the number of features that can be tracked. Real time constraints have traditionally prevented the construction of dense 3D world models [22, 26, 42, 65, 139, 142, 162]. Constant increases in computing power are now beginning to yield promising results [107, 120].
Practical stereo vision navigation systems have constructed sparse depth maps by matching high contrast points selected by an interest operator or geometric features such as edges [108, 114, 145]. Additional problems for robotic applications come from operating under available illumination and in domains with unpredictable surface reflectance characteristics. The depth information recovered by a stereo system is vital in robotic applications, but depth data can also be provided by active ranging. We chose to pursue the extraction of symbolic features in our domain with single images fuzed with ultrasonic range data. Despite the avoidance of stereo correspondence, many powerful algorithms, such as Canny edge detection, also challenge real time implementation [28, 78]. Our entire image processing suite was developed under this heavy real time requirement. Many of the visual features in an indoor domain possess strong verticality. Doors, windows, and supports present striking vertical landmarks yielding strong, long ribbons in images. We designed our approach to extract these vertical ribbon edges and classify the observed features using a template scaled to the feature range.

4.1.1 Feature Extraction

Before we examine visual edge sensing, we offer the following definition:

Vertical Edge Ribbon: A physical domain feature that may produce two parallel vertical intensity edges in an image. Both edges are capable of being simultaneously seen within the field of view of the sensing camera. The lines containing the vertical edge ribbon are normal to the floor of the domain.

A door is a common vertical edge ribbon found in indoor domains. Doors are logical landmarks for human navigation and are consistent with the cognitive map approach [92, 93]. A door often provides the strongest intensity edge in a domain [77, 136]. Here doors are symbolic: they represent a common static feature, extractable under varied illumination throughout the structured domain. Posts are less common in some domains, but could also be easily extracted. We experimented with several standard edge detectors on training images, examining the filtered images for sensitivity to filter size, noise, and camera misalignment [95, 128]. We chose a vertically oriented 5x5 Prewitt filter as the candidate that best extracted strong, long intensity edges from the training data [125]. Kriegman pursued a larger mask to smear images in a broader application [87]. Our filter not only rejected horizontal features, but it also averaged vertical features. This averaging has the following desired effects:

• Horizontal edge ribbons vanish.
• Slanted edge ribbons are blurred.
• The effect of small edge features and marks is attenuated.
• The effect of image noise on vertical edges is attenuated.

Ribbon length is a strong indicator of a large vertical feature. If we require vertical persistence, we filter out finer vertical features and marks that clutter our domain. To rapidly examine this attribute, we performed vertical integral projection. Let I(r, c) be our filtered image. The vertical integral projection of I(r, c) can be defined as:

V(c) = Σ_r I(r, c)

This column summing technique is similar to Kanade's two dimensional work on the recognition of human faces [80]. The results of the vertical integral projection required careful thresholding to properly segment the strongest vertical ribbons present in the scene. Additionally, we needed to recover the footprint of the vertical projections onto the floor of our modeled two dimensional domain.
With this footprint we could register fuzed ultrasonic range data to classify the landmark using the algorithms to be developed in Sections 5.2 and 5.3. After thresholding to a binary image, pixel column bins were clustered based on three nonempty bins within a five bin window. Pincushion lens distortion caused edges one pixel wide to bow across seven pixels at a viewing range of six feet. To accommodate this, footprints were determined by calculating the center of mass of the bins. Using the sensing geometries to be developed in the next chapter, we found that the subpixel accuracy obtained from this bin clustering technique reduced measurement variance 39 percent in a controlled experiment.

Several adaptive thresholding techniques were pursued [31, 73, 79, 132]. Chow's technique of fitting a bimodal Gaussian model to the image histogram had been implemented locally, but it could not be adequately tuned to the crisp intensity gradients common to man-made environments and the varied illumination present in the training images [74]. We reluctantly turned to absolute thresholding based on the library of 157 training images taken over a four month time period in varied illumination. We found that the door jams that constitute the vertical edge ribbons typically subtend 2.5% of the image scene as viewed from Rome. We set the absolute threshold level at 5% to accept an equal degree of desired edge and accidental marks.

After we had completed the initial design that successfully identified features in our filtered image, we sought to minimally sample rows from the image to both reduce the computational load on our system and to identify long edge ribbons. Returning to our training suite of images, we empirically determined that five rows extracted from the image were sufficient to capture vertical edge information from a door within our field of view. Three of these sampled rows were near the vertical image center, with the others toward the top and bottom. If the range to the door is six feet or less, sampling these particular rows will always be sufficient to capture the full vertical range of the feature. But our navigator may certainly view doors obliquely and at greater distances. To accommodate this behavior, a range-based scale factor controls which rows are sampled from the image as a function of the ultrasonic range data obtained coaxial with the optical axis. The image processing algorithm is summarized in Figure 4.1. It implemented dynamic sampling of five rows as a function of range, Prewitt filtering, absolute thresholding, vertical integral projection, and clustering of the vertical ribbon to a footprint.
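The pipeline of Figure 4.1 can be sketched as follows. This is only a simplified illustration of the stated design (5x5 vertical Prewitt mask, top-5% threshold, five sampled rows, clumping with center of mass); the helper names and details of the clumping rule are our assumptions, not the original implementation:

```python
import numpy as np

# Vertically oriented 5x5 Prewitt-style mask: responds to vertical
# intensity edges and averages along columns.
PREWITT_V = np.array([[-1, -1, 0, 1, 1]] * 5, dtype=float)

def ribbon_footprints(image, rows, keep_fraction=0.05):
    """Extract candidate vertical-ribbon footprints from a gray image.

    image: 2D array of intensities; rows: the five row indices chosen
    from the ultrasonic range (dynamic sampling).  Returns (column,
    weight) pairs: sub-pixel footprint locations with confidence weights.
    """
    h, w = image.shape
    response = np.zeros(w)
    for r in rows:                                  # filter only the sampled rows
        band = image[max(r - 2, 0):r + 3, :]        # 5-row band around row r
        for c in range(2, w - 2):
            patch = band[:PREWITT_V.shape[0], c - 2:c + 3]
            response[c] += abs(np.sum(patch * PREWITT_V[:patch.shape[0]]))
    # Absolute threshold: keep the strongest 5% of column responses, then
    # treat the survivors as a binary vertical integral projection.
    cutoff = np.quantile(response, 1.0 - keep_fraction)
    binary = response >= cutoff
    # Clump columns (at least three nonempty bins in a five-bin window) and
    # take the center of mass of each clump as a sub-pixel footprint.
    footprints = []
    c = 0
    while c < w:
        if binary[c]:
            window = np.arange(c, min(c + 5, w))
            hits = window[binary[window]]
            if len(hits) >= 3:
                weight = float(response[hits].sum())
                center = float(np.average(hits, weights=response[hits]))
                footprints.append((center, weight))
                c = window[-1] + 1
                continue
        c += 1
    return footprints
```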
This result is strongly correlated to the maximum spread of the center three sampled rows. It should be noted that roll axis camera misalignment causes all vertical and horizontal feature edges to be smeared by the averaging fil- ter, and, provided that the three center sampled rows are relatively close in vertical proximity, their projected footprints will be successfully clumped. (0) Dynamic sampling of five image rows based on range to ribbon (1) Vertical 5x5 Prewitt filtering (2) Thresholding to pass top 5% (3) Vertical image projection (4) Clumped vertical ribbon footprint to subpixel accuracy (5) Output landmark candidates each with width and location Figure 4.1. Vertical Edge Ribbon Extraction Algorithm 50 ((0 Figure 4.2. Image processing suite example. (a) Original image (b) Vertical edge image (c) Thresholded edge image (d) Smeared vertical projection with darkened feature footprint 51 4.1.2 Landmark Classification The output of the first processing stage, described previously in Section 4.1.1, pro- vided candidate footprint pairs, with confidence weights, to the landmark classifica- tion stage. Despite the thresholding and filtering, several candidate pairs are often passed forward for potential classification. In this stage, fuzed ultrasonic range data is used to determine the estimated size of landmark candidate pairs. If a large depth edge is noted across the ultrasonic horizon in an area corresponding to the candidate feature, then a line fitting routine interpolates an estimated depth to the feature by using only the flank depths. This occurred frequently in both the training and test data sets because several doors were partially or completely open. Outward opening doors left partially open were often ultrasonically sensed as a sharp convex depth edge. Inward opening doors similarly were sensed as a sharp concave depth edge. The structure of the hallway domain made the line fitting quite accurate in recovering the depth to the closed door position. Open doors were more difficult to accurately classify than closed doors, but this was mainly due to image clutter inside the open door misleading the first image processing stage, and not due to ultrasonic ranging shortcomings. We will discuss these results in Section 6.4.2. Since door landmarks are only present in discrete metric sizes, their a priori prob- abilities can be determined (see Tables in Chapter 6). A Bayes decision rule for classifying candidate pairs were desired, so experiments were conducted to estimate the class conditional probabilities [62]. Using the algorithms to be presented in Sec- tions 5.2 and 5.3, data was collected and conditional probabilities were estimated at an abscissa bin granularity of 0.05 inch. This estimation was necessary since observed data came from a continuous metric spectrum. Posterior probabilities could then be determined. Table 4.1 illustrates, in inches, the landmark classification boundaries implemented using a Bayes decision rule. 52 Landmark Low Boundary High Boundary 32 29.1 33.5 36 33.5 38.8 40 38.8 42.7 30/30 56.4 64.0 36/36 67.9 76.3 Table 4.1. Landmark classification boundaries showing ribbon width in inches The landmark candidate pair with the highest confidence weight assigned by the initial processing stage will be classified using Table 4.1. If a candidate pair falls outside these boundaries it is rejected, and the subsequent candidate pairs, above a preset confidence threshold, are considered in turn. No tuning or training on the test image suite was done. 
The test suite classification performance of the two processing stages presented in the last two sections was encouraging. After we develop the necessary sensing geometry in the next chapter, we will more thoroughly examine these classification results in Section 6.4.2. In the next section we turn our discussion to a more thorough examination of ultrasonic sensing.

4.2 Theory of Ultrasonic Sensing

One important application of an ultrasonic sensor system is active ranging. Ultrasonic sensors use high frequency sound to determine the time of flight between the sensor and an object. The acoustic beam propagates from an emitter through its medium of travel as a mechanical wave and is reflected back to a receiver. The wavelength λ of the ultrasonic wave is:

λ = c / f

Here c is the speed of sound in the medium and f is the frequency of the emitter. The speed of sound is not constant, even within a given medium, as temperature and humidity are factors to consider. Typical frequencies for robotic applications range from 20 kHz to 2 GHz. Our Polaroid Instrument Grade transducers operate in the 60 kHz range, with λ = 5.7 mm. A voltage pulse excites the transducer which causes an acoustic pulse to be emitted. The echo reflected from an object may be detected by the same sensor that emitted it. A typical time of flight system produces a range value when the echo amplitude first exceeds a set threshold value. A range measurement, d_o, is determined from the round-trip wave travel time of flight t_f by:

d_o = (c · t_f) / 2

Absorption of the ultrasonic wave by the propagation medium influences the attenuation of the wave. The degree of this absorption depends on the medium and the radiating frequency. Gaseous media attenuate ultrasonic waves significantly more than solids or liquids, and this attenuation depends greatly on the particular frequency used [51]. Currently most robotic applications employ ultrasonic systems radiating into the atmosphere. The pressure amplitude of the acoustic wave decays exponentially with distance. If p_0 is the pressure amplitude of the planar acoustic wave, and α is the absorption coefficient, then p is the pressure amplitude of the wave at a distance d:

p = p_0 e^(−αd)

The absorption coefficient α clearly determines the rate of attenuation [72]. The extinction distance d_e is defined to express this attenuation of the ultrasonic wave in the atmosphere, d_e = 1/α. At ambient temperature and humidity, with f = 60 kHz, d_e = 17 meters [51].
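As a quick check of these relations, take c ≈ 343 m/s (an assumed value for dry air near room temperature) for the 60 kHz Polaroid transducer: λ = c/f = 343/60000 ≈ 5.7 mm, consistent with the wavelength quoted above. Likewise, an illustrative echo time of flight of t_f = 10 ms corresponds to a range of d_o = c · t_f / 2 ≈ 1.7 m.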
Employing ultrasonic waves in the atmosphere is significantly less efficient than radiating through a solid or liquid medium. This is not just due to the attenuation factor, but also because the impedance mismatch between the solid ultrasonic transducer and the air is much higher [82]. This translates to a high transmission loss compared to the energy transfer across a lower impedance boundary such as a liquid or a solid. But this same impedance mismatch on transmission becomes an advantage upon wave reflection. This phenomenon causes strong reflections, rather than energy-draining absorption, by an object in air. The reflective behavior of a surface in air depends on the object texture [18]. If the period of the texture is much greater than the ultrasonic beam period, surface reflection is essentially specular. For robotic applications, most object surfaces can be considered quite smooth, and the ultrasonic beam reflection largely specular, with a small diffuse component.

Sophisticated signal processing of the ultrasonic wave allows a parametric evaluation of the returning arc formed when the sensor is scanned across an object [18]. This approach permits the differentiation of planar surfaces, corners, and edges in a specular domain. In nonatmospheric media, higher ultrasonic frequencies may be employed that permit ultrasonic imaging achieving resolution comparable with what can be achieved with optics in air [72]. Short of these application types, ultrasonic sensing systems provide a simple, low cost alternative for range data acquisition. Ultrasound operates independently of the environment lighting conditions and the optical characteristics of the object surfaces. Current commercial systems for atmospheric transmission still offer poor angular resolution and suffer from problems associated with multiple reflections. In the next section we will experimentally examine some of these physical issues in an attempt to develop an operating envelope supporting accurate and robust navigation.

4.3 Ultrasound Studies

Ultrasonic sensors are widely used in mobile robot applications. A strong systemic attribute is their simplicity, low cost, and the fact that distance measurements can be determined directly. A typical sensor has a useful measuring range of 1 to 35 feet, with a beam width of 15 degrees, so an array, or belt, of sensors is often employed [9, 15, 14, 24, 29, 38, 37, 45, 50, 51, 52, 53, 54, 63, 91, 111, 112, 131, 134, 140]. The more mature of these efforts use probabilistic models, redundant sampling, or sampling histories to handle the range errors that are common to ultrasonic sensors. Physically based ultrasonic sensor models are just now maturing [101]. One effort elegantly and extensively characterizes a single sensor in a clean laboratory environment [18]. It notes, but doesn't explore, the signal error produced by sound absorbing surfaces, specular surfaces, and surfaces sensed from oblique angles. It became apparent that surrounding Rome there existed an area of reliable sensing, or a sensing footprint, that varied with the type and orientation of the surrounding surfaces, and the sensor type. This conclusion motivated an extensive examination of ultrasonic sensor performance against various surface types at varied orientations. We will now examine this phenomenon and develop an ultrasonic sensing footprint to support accurate and robust navigation.

During preliminary navigation experiments, it was observed that Rome's ultrasonic sensing system ranged surfaces very accurately when the acoustic axis and the surface normal were approximately collinear. It was also observed that as the angle between the surface normal and the acoustic axis increased, so did the ranging error. Individuals wearing thick clothing often were not sensed at all, while at other times the same individuals were accurately ranged. An experiment was designed to examine the effect of surface characteristics on ultrasonic ranging. Three surface types were presented to the ultrasonic sensor at various orientations to the acoustic axis. The hallway was initially chosen as the testing laboratory, not because we thought it was a clean environment, but to include multiply reflected signals causing ranging error. Multiple reflection error was not observed in this experiment, and the results strongly followed a pilot study conducted in an acoustically isolated environment.
Concrete was chosen as a common domain feature surface, while polished wood presented a much smoother ranging surface. A padded fabric wall partition was examined to characterize suspected energy absorption. One thousand ultrasonic samples were taken from three surface types at four orientations each, for a total of 12,000 samples. Let φ be the angle between the acoustic axis and the surface normal, with φ = 0 corresponding to the sensor directly facing the surface. Table 4.2 summarizes the percent range error observed. We repeated the entire experiment to increase the confidence in our conclusions, obtaining the results summarized in Table 4.3. For a given surface and orientation, only two to nine discrete range values would occur. The histograms of the complete experiment are in Appendix A, and illustrate the deterministic nature of the ranging error. A total of over 40 thousand samples were taken to confirm these results.

The polished wood and concrete surfaces ranged quite accurately. Cloth surfaces also ranged accurately, but only if φ = 0 (the surface normal and the acoustic axis align). The standard deviation of the ranging error is exceedingly small with useful error percentages. These results are encouragingly similar to our initial experiment. We desired to model the ultrasonic error Rome would typically encounter while navigating throughout the operating domain. Cloth clearly is a challenging surface to accurately range. If the acoustic axis and surface normal are not nearly collinear, then grossly inaccurate range values can be expected. The other surfaces can be properly ranged if the ultrasonic sensing is only performed while within this accurate ranging footprint. Accurate range acquisition can be obtained from concrete and wood if Rome's sensing can be controlled to maintain φ ≤ 15°. In the next chapter we will propose two competing sensing geometries that support accurate ultrasonic sensing. Multimodal Gaussian error models were developed from the ultrasonic sensing analysis presented in this section. These error models will support extensive simulation experiments that we will subsequently present.

4.4 Summary

In this chapter we have presented a fast technique for the sensing of vertical edge ribbons using a single camera. We designed our approach to extract these vertical ribbon edges and classify the observed features using a template scaled to the feature range. Range data was obtained from ultrasonic sensing. Our final design implemented dynamic sampling of five rows as a function of range, Prewitt filtering, absolute thresholding, vertical integral projection, and clustering of the vertical ribbon to a footprint. The extraction and classification suite demonstrated excellent robustness to the pincushioning produced by the 6mm lens we used, and the roll axis camera misalignment often observed during Rome's travel. We have also extensively examined ultrasonic sensor behavior, and defined an accurate ranging footprint to provide depth information to the monocular vision system. In the next chapter we will propose two fuzed sensing geometries that build upon the sensing experiments presented here. Unlike many competing methodologies, these sensing strategies support real time feature extraction and landmark classification.
φ       Cloth x̄   Cloth σ   Wood x̄   Wood σ   Concrete x̄   Concrete σ
0°      0.50      0.03      0.03     0.03     0.71         0.03
15°     85        80        0.95     0.03     2.2          0.03
30°     340       93        16       0.07     5            0.04
45°     ++        ++        17       0.08     14           0.09
>45°    ++        ++        ++       ++       ++           ++
n = 1000

Table 4.2. Experiment 1: Ultrasonic percent range error as a function of surface type and angle

φ       Cloth x̄   Cloth σ   Wood x̄   Wood σ   Concrete x̄   Concrete σ
0°      0.53      0.03      0.04     0.03     0.6          0.03
15°     100       95        1.2      0.04     1.5          0.03
30°     390       84        8.9      0.06     4.3          0.02
45°     ++        ++        22       0.9      15           0.08
>45°    ++        ++        ++       ++       ++           ++
n = 1000

Table 4.3. Experiment 2: Ultrasonic percent range error as a function of surface type and angle

CHAPTER 5

Sensing Geometries

In the preceding chapter we discussed a visual sensing approach capable of extracting vertical landmarks in real time and examined ultrasonic sensing towards the definition of an accurate ranging footprint. What is now needed is a methodology that fuzes these two sensing capabilities, enabling depth information to be attributed to an extracted domain feature. Constructing a sensing geometry that casts vision and ultrasonic sensing in roles where they can operate accurately, while simultaneously supporting unrestricted real time navigation, is a challenging task. In this chapter we will present two competing sensing geometries that strive towards this goal. These sensing geometries provide a framework for navigation and sensing, ensuring that sensing only occurs where the information is not only accurate, but useful to the task at hand. These sensing geometries strongly influence the overall control plan of the navigating robot. The shortest path to a goal might not be the best path if the feature sensing system cannot accurately extract landmarks along this route. A sensing geometry that does not restrict navigation control is a goal, but we feel that sensing capability is more important than unrestricted navigation and point-to-point speed. Furthermore, we desired a more general landmark extraction solution. Ultrasonic error models, derived from the experimental results of Section 4.3, will be presented in Section 5.4. These error models will be used in simulating the performance of the competing sensing geometries. The performance of the best sensing geometry, applied to several operating domains, will be examined in detail in Chapter 6. We begin the present chapter with a generalization of our sensing situation.

5.1 Motivation

Our present sensing problem, extracting and classifying a vertical landmark from its domain, is a common problem that is widely pursued. The sensing geometries to be presented were designed to enable ultrasonic sensors to accurately provide depth to a feature and assign absolute metric scale to the extracted landmark in real time. The domain constraints required by the sensing geometries are:

• C1) The landmark has parallel linear edges delineating the face to be measured.
• C2) An intensity contrast exists between the landmark and its domain.
• C3) The orientation of the optical axis relative to the ultrasonic array is known.
• C4) The landmark to be measured presents a planar face to the image plane.
• C5) The angle between the landmark surface normal and the optical axis is less than 45°.
• C6) The domain subset within the field of view is approximately planar.
• C7) The focal length and angular field of view of the camera lens are known.

These constraints represent the domain assumptions we were required to make to accurately assign depth to a landmark. Our goal was to provide as near an unrestricted navigation and sensing environment as could be engineered, but some constraints became necessary to reduce the complexity of the problem.
It is important to note that there is no requirement for ultrasonic or camera calibration. Only the mounting position of the camera relative to the ultrasonic sensing array and the camera's field of view are required. Sensing accuracy would improve with a decrease in the angle of Constraint C5, and is tightly coupled to the experimental results of Section 4.3. Continued advances in low cost ultrasonic sensors would not remove Constraint C5, but would improve sensing accuracy. If the angle of Constraint C5 is known, then Constraint C6 can be removed. Constraint C4 implies that the sensing geometries actually measure the projection of the landmark onto the image plane. Nonplanar landmark faces would have their projection onto the image plane measured. We will examine the limits of each geometry's accuracy envelope in detail in Section 5.5. The sensing geometries to be presented are certainly not alone in their ability to measure objects in their constrained domain. Industrial applications have demonstrated more accuracy, but gain this advantage from a predictable, and much more structured, domain. Less structured domain applications have yet to obtain real time performance [87]. The sensing geometries presented in the following two sections offer a robust, low cost, real time solution to problem instantiations occupying the middle ground between a well structured, specialized industrial application and a large scale 3D explorer.

5.2 Two Side Geometry

Ultrasonic sensors can provide accurate ranging within a sensing footprint. The pivotal task of this sensing framework is to fuze the depth information obtained from the ultrasonic sensors to extracted feature edge ribbons. In a monocular image from an uncalibrated camera, the only ground truth references available are the edges of the field of view. Constraint C3 in the preceding section was established specifically for this purpose.

Figure 5.1. Field of view geometry

If a landmark is within the field of view, the sensing geometry must determine the position of the landmark relative to the edges of the field of view, and assess an absolute metric scale. If we know the angle of the optical axis relative to the acoustic axes of the ultrasonic array, then we can determine the range to the edges of the field of view. Figure 5.1 illustrates the geometry within the field of view (FOV). Constraint C3 provides the necessary reference to assign the correct ultrasonic range value to estimate RL and RR. Since we know the value of FOV for the application lens, the Law of Cosines provides WL, the length of the domain context visible within the field of view:

WL = √(RL² + RR² − 2·RL·RR·cos FOV)

The candidate landmark footprint, as presented in Section 4.1.1, will partition the domain context into the three annotated segments seen in Figure 5.1. Using similar triangles and the Law of Sines we can calculate the lengths of the segments dL and dR:

dL = RL·sin φL / sin(π − φL − α)
dR = RR·sin φR / sin(π − φR + α)

Similar triangles with bases on the image plane and the domain context provide a means of determining φL and φR. We define LJ and RJ as the left and right footprints of the vertical edge ribbon, respectively, h as the horizontal pixel width of our CCD camera, in millimeters per pixel, and mid as half the number of horizontal pixels across the image plane. With a camera lens of focal length f, φL and φR can be calculated:
φL = FOV/2 − arctan[2h(mid − LJ)/f]
φR = FOV/2 − arctan[2h(RJ − mid)/f]

Combining these equations we present a composite formula for the observed landmark using the ranges RL and RR estimated by our ultrasonic sensors. RL and RR represent our two range sides of the sensing geometry, and the dependence on these two sides led us to call this our Two Side Geometry.

Two Side Geometry:

ribbonwidth = √(RL² + RR² − 2·RL·RR·cos FOV)
  − RL·sin(FOV/2 − arctan[2h(mid − LJ)/f]) / sin(π − FOV/2 + arctan[2h(mid − LJ)/f] − arcsin[RR·sin FOV / √(RL² + RR² − 2·RL·RR·cos FOV)])
  − RR·sin(FOV/2 − arctan[2h(RJ − mid)/f]) / sin(π − FOV/2 + arctan[2h(RJ − mid)/f] + arcsin[RR·sin FOV / √(RL² + RR² − 2·RL·RR·cos FOV)])

This is an equation in four unknowns: RL and RR, the two estimated ranges along the edges of the field of view; and LJ and RJ, the two candidate landmark footprints classified by the image processing suite. This equation has been checked against ground truth with a large suite of images, and its error analysis will be presented in Section 5.4. In Section 5.5, we will examine the use of this geometry and the accurate estimation of its four variables. We will see that the ability to obtain accurate variables will define a navigation control envelope. In the next section we will attempt to improve upon this sensing geometry, and present our Short Side Geometry. We explored this geometric approach because it performed the same ranging function as our Two Side Geometry, but only required the use of one ultrasonic range reading. This motivation for economy led to an approach that eventually provided superior accuracy.

Look Angle with Surface    RL     RR
90°                        30°    30°
70°                        45°    10°
50°                        60°    10°

Table 5.1. Two Side Geometry acoustic angles

5.3 Short Side Geometry

In this section we will attempt to improve upon the Two Side Geometry presented in the preceding section. Our motivation was to more accurately obtain RL and RR shown in Figure 5.1. We have seen in Section 4.3 that ultrasonic accuracy improves as the acoustic axis approaches the surface normal. With a circular array of ultrasonic sensors, one or two of the 24 sensors will be nearer the domain surface to be ranged than the other remaining sensors. The Two Side Geometry does not use the most accurate ultrasonic readings. In fact, if one of the side readings is very accurate, the other range side is certain to be very inaccurate at most look angles. Table 5.1 illustrates acoustic axis angles with the ranged surface normal at arbitrary look angles. Noting the oblique acoustic angles that our Two Side Geometry utilizes, together with the experimental results in Section 4.3, we should not expect impressive metric accuracy. We postpone the examination of these results until Section 5.5. The motivation for using the most accurate ultrasonic sensor reading led us to the development of the Short Side Geometry.

We previously noted that when the acoustic axis aligned with the surface normal, ultrasonic ranging was most accurate. When a frame is acquired, the shortest ultrasonic range obtained from sensors within the field of view sector corresponds to the sensor axis nearest the surface normal. Constraint C3 ensures that we know the orientation of the camera relative to the ranging sensors, so we can geometrically derive RL and RR from the shortest returned ultrasonic range. Let us annotate the shortest range value, or the short side, as sdis, and the angle between the optical axis and the surface as θ.
Then the Short Side Geometry derives the two sides RL and RR from θ and sdis, and determines the landmark's metric attribute using the formula presented previously.

Short Side Geometry

    R_L = \frac{sdis}{\sin\left(\theta + \frac{FOV}{2}\right)}, \qquad
    R_R = \frac{sdis}{\sin\left(\theta - \frac{FOV}{2}\right)}, \qquad
    \frac{FOV}{2} < \theta < \pi - \frac{FOV}{2}

This sensing geometry amends the Two Side Geometry and ensures that only the most accurate ultrasonic range value is used. The Short Side Geometry reduces expected ultrasonic error at the price of introducing error in θ. Since the ultrasonic array is laid out along Rome's perimeter in sectors of 15°, our uncertainty in which sensor is indeed nearest to the surface is as great as 7.5°. The sine function is sensitive to error over portions of its domain, but overall we feel the Short Side Geometry is a strong improvement, as we will note in the next section.

5.4 Error Analysis

Overcoming error is a strong challenge for any robotic navigator. Some of this error is due to systemic causes such as ultrasonic range error, and some error is due to a methodological weakness towards a subset of the domain. StickRep was designed to handle erroneous data and ambiguous domain features in an efficient manner, but what is the nature of the composite error it must handle? We examined ultrasonic sensor error in Section 4.3, but what methodological weaknesses exist in our sensing geometries? Let us examine our sensing geometry in greater detail.

Figure 5.2 revisits our geometrical framework. In this figure, RL and RR retain their previous definition, but we have now annotated the sensing and domain dynamics in finer granularity. Angle θ represents the angle between the optical axis, opax, and the surface. Rome's ultrasonic sensor locations are represented by tick marks around the perimeter, with r being the radius of Rome's sensing head. The ideal rays from Rome's center through the ultrasonic sensors are annotated as L and R. If the camera lens field of view (FOV) exactly matched the sector defined by an integer multiple of ultrasonic sensors around the perimeter, then L = RL and R = RR. In our application the sensors are placed every 15° around the perimeter, and our chosen lens possessed a field of view of 55.817°, so RL and RR approximate the desired rays L and R along the edges of our field of view. This approximation is examined in the next section.

In the upper portion, or left sensing side, of Figure 5.2, the dashed sector delimits the 23.6° sensing cone of a typical ultrasonic sensor. When an ultrasonic wavefront reflects back and is captured by the sensor, its range value is directly related to the sensor's distance from the nearest object within its sensing cone. Unless the ideal ray L or R is normal to the surface, the ultrasonic wavefront will first strike the surface adjacent to the point where the ideal ray meets the surface. This introduces another source of sensing error.

Figure 5.2. Rome's sensing geometry (not to scale)

In Figure 5.2, d_L and d_R represent the range value actually returned by a particular ultrasonic sensor. In the left sensing side, the surface normal intersecting the sensor location lies within the sensor's cone; in the right sensing side, the surface normal intersecting the sensor lies outside the sensor's cone, and d_R represents the range along the left sensing cone limit of the reflected wavefront. A third geometric case is when the surface normal intersecting the sensor location again lies outside the sensing cone, but in this case, it lies outside the WL-L-R sensing triangle.
Note that in all three cases, the returned range value d_L or d_R is not the desired range along the ideal ray L or R. This error, exaggerated in the figure, is a portion of the error we examined empirically in Section 4.3, and will be compounded by the gross error observed at oblique acoustic angles. We shall develop a worst case model of this behavior, and combine it with our geometric approximation error in Section 5.4.2.

5.4.1 Error Analysis of Geometric Approximation

There are three separate cases of geometric approximation error in our sensing framework. These cases were introduced in the previous section and were derived from Figure 5.2. This approximation error corresponds to the difference between the ideal rays L and R, and the rays RL and RR defined by the actual edges along the field of view. Case 1 occurs when the surface normal to the sensor location lies within the sensor's cone, that is, 48° ≤ θ < 72°. Case 2 occurs when the surface normal to the sensor location lies outside the sensing cone, but within the WL-L-R sensing triangle; 72° ≤ θ ≤ 90°. Case 3, θ < 48°, is when the surface normal to the sensor location again lies outside the sensing cone, but also lies outside the WL-L-R sensing triangle. These angular case definitions are symmetric about 90°, and were developed using similar triangles and the Law of Sines. We now present the actual geometric definition of RL and RR that we have approximated in our Short Side and Two Side Geometries with the returned range values d_L and d_R plus the head radius, r.

[Case 1, 48° ≤ θ < 72°]

    R_L = \frac{d_L \sin(150° - \theta)}{\sin\left(\theta + \frac{FOV}{2}\right)} + \frac{r}{\sin\left(\theta + \frac{FOV}{2}\right)}

    R_R = \frac{d_R \sin(\theta - 30°)}{\sin\left(180° - \theta + \frac{FOV}{2}\right)} + \frac{r}{\sin\left(180° - \theta + \frac{FOV}{2}\right)}

[Case 2, 72° ≤ θ ≤ 90°]

    R_L = \frac{d_L \sin(18° + \theta)}{\sin\left(\theta + \frac{FOV}{2}\right)} + \frac{r \sin(150° - \theta)}{\sin\left(\theta + \frac{FOV}{2}\right)}

    R_R = \frac{d_R \sin(198° - \theta)}{\sin\left(180° - \theta + \frac{FOV}{2}\right)} + \frac{r \sin(\theta - 30°)}{\sin\left(180° - \theta + \frac{FOV}{2}\right)}

[Case 3, θ < 48°]

    R_L = \frac{d_L \sin(138° - \theta)\sin(150° - \theta)}{\sin\left(\theta + \frac{FOV}{2}\right)\sin\left(\theta + \frac{FOV}{2} - 72°\right)} + \frac{r \sin(150° - \theta)\sin(\theta + 30°)}{\sin\left(\theta + \frac{FOV}{2}\right)\sin\left(\theta - 60° + \frac{FOV}{2}\right)}

with R_R given by the corresponding expression under the symmetry about 90° (θ → 180° − θ).

If the look angle, θ, were known by Rome, then we could enhance our metric accuracy by calculating range values using these formulas. In the next section we will evaluate our approximation to these formulas in the worst case, and combine it with the ultrasonic error summaries we presented in Section 4.3.

θ      L→RL    R→RR    dL→L    dR→R
90°    2.0%    2.0%    4.3%    5.1%
70°    4.1%    0.57%   15%     2.1%
50°    9.1%    0.72%   21%     1.5%

Table 5.2. Composite ranging error summary

5.4.2 Composite Range Error Model

Rome experiences ultrasonic ranging errors of two distinct types. The first type is the ranging error we observed in Section 4.3, with particular weaknesses on soft surfaces and at oblique acoustic axis angles. The second error type is due to the approximations our Short Side Geometry must make to the exact ranges presented in the previous section. In Table 5.2 we have evaluated this approximation error in the worst case, and summarized the empirical results we obtained in Section 4.3 at three arbitrary look angles. Table 5.2 summarizes the composite ultrasonic error models we used to drive the extraction simulations to be discussed in Section 5.5. These error models were developed from empirical range experiments in a real domain together with a worst case analysis of our approximation to the theoretically correct range.
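For comparison with the exact ranges above, a minimal sketch of the Short Side approximation of Section 5.3 follows. It assumes the reconstructed form R = sdis / sin(angle between the ray and the surface) and takes θ as the look angle measured from the surface (90° = perpendicular); the exact published form could not be recovered verbatim. The derived R_L and R_R would then feed the ribbon width formula of Section 5.2.

```python
import math

def short_side_ranges(sdis, theta, fov):
    """Sketch: derive the two FOV-edge ranges from the shortest ultrasonic return.

    sdis  : shortest range returned by a sensor inside the FOV sector
    theta : look angle between the optical axis and the surface, radians
    fov   : camera field of view, radians
    """
    # At theta = 90 degrees both edge rays make the same angle with the
    # surface, so RL = RR, the symmetric perpendicular-stare case.
    RL = sdis / math.sin(theta + fov / 2.0)
    RR = sdis / math.sin(theta - fov / 2.0)
    return RL, RR
```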
In the next section we present a linear first order error approximation of range extraction and landmark metric inference using our sensing geometries.

5.4.3 Linear First Order Error Model

We desired to get a quantitative approximation for the percent error we might observe using the Short Side and Two Side Geometries, prior to implementing these approaches. Starting with the geometric formulas presented in Sections 5.2 and 5.3, we derived a partial derivative evaluated at a particular look angle and multiplied it by a worst case measurement error factor as shown in Figure 5.3.

[Two Side Geometry]

    \Delta metric \approx \left( \frac{\partial}{\partial R_L}, \frac{\partial}{\partial R_R}, \frac{\partial}{\partial LJ}, \frac{\partial}{\partial RJ} \right)\Bigg|_{stare} \cdot \left( \Delta R_L, \Delta R_R, \Delta LJ, \Delta RJ \right)

[Short Side Geometry]

    \Delta metric \approx \left( \frac{\partial}{\partial sdis}, \frac{\partial}{\partial \theta}, \frac{\partial}{\partial LJ}, \frac{\partial}{\partial RJ} \right)\Bigg|_{stare} \cdot \left( \Delta sdis, \Delta\theta, \Delta LJ, \Delta RJ \right)

Figure 5.3. Linear first order error model

The complete partial derivatives were obtained, but due to their length, we present them graphically in Appendix B. We evaluated these approximation error models at three arbitrary look angles, or stares. Worst case error in θ was 7.5°, as discussed earlier. Empirical experiments supported a worst case error for ΔLJ and ΔRJ of five pixels. Table 5.3 summarizes this linear first order error approximation for both sensing geometries at three stares. Table entries are the worst case landmark metric error, in inches, using the linear first order estimate.

Geometry      90°     70°     50°
Two Side      2.4     3.9     4.7
Short Side    0.40    5.5     12

Table 5.3. Worst case landmark error using linear first order estimate

The sensitivity of the Short Side Geometry to error in θ can be seen particularly at the more oblique stares. The Two Side Geometry appears more robust at oblique stares, but its accuracy is still moderate at a 90° stare. We doubted the ability to perform inexact matching with this level of expected metric accuracy. The Short Side Geometry appeared to offer the absolute accuracy that could be pivotal to a landmark classification scheme. A control envelope was developed to take advantage of the accuracy obtained at 90° stares by the Short Side Geometry, while simultaneously discouraging oblique stares by a cost function.

Prior to selecting the Short Side Geometry for implementation, we again used the linear approximation of Figure 5.3 to explore classification and robustness questions. If we installed twice as many ultrasonic sensors, reducing Δθ to 3.75°, what benefit in worst case performance would be observed when employing the Short Side approach? Table 5.4 summarizes the answer. The advantage gained from the extra hardware appears to be insufficient to justify the effort.

θ           90°     70°     50°
Δmetric     0.28    3.2     6.6

Table 5.4. Accuracy resulting from reducing Δθ to 3.75° using Short Side Geometry

Another question concerned how much Δθ error the Short Side Geometry could handle and still deliver accuracy at an arbitrary benchmark, say two inches. Table 5.5 summarizes the results at arbitrary stares. It appears very tolerant at blunt look angles, but sensitive at oblique angles.

θ       Δθ
90°     ≤ 57.0°
70°     ≤ 1.8°
50°     ≤ 0.5°

Table 5.5. Short Side θ tolerance for error within two inches

These results motivated us to explore large scale simulations of landmark classification employing the Short Side Geometry. With a supporting navigation control scheme, accurate metric inference could be obtained. In the next section we will present our simulation results of landmark measuring using the Short Side and the Two Side Geometries.
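The kind of trial those simulations perform can be sketched generically: perturb each input of a sensing geometry with its modeled error, re-measure, and collect the percent range error. The helper below is a sketch only; the sampler magnitudes shown in the usage comment are illustrative stand-ins for the composite error models of Section 5.4.2, not the dissertation's exact distributions.

```python
import random

def percent_error_samples(measure, nominal, samplers, true_value, n=10000):
    """Monte Carlo sketch for one look angle.

    measure    : function of a dict of inputs returning a landmark width
    nominal    : dict of error-free input values for the chosen stare
    samplers   : dict mapping each input name to a zero-argument noise sampler
    true_value : ground truth landmark width
    """
    samples = []
    for _ in range(n):
        noisy = {k: v + samplers[k]() for k, v in nominal.items()}
        width = measure(noisy)
        samples.append(100.0 * (width - true_value) / true_value)
    return samples

# Hypothetical usage for the Short Side Geometry at a 90 degree stare:
# theta error uniform within the 7.5 degree sensor-spacing ambiguity and
# footprint jitter within the five pixel worst case of Section 5.4.3.
# samplers = {"theta": lambda: random.uniform(-0.131, 0.131),
#             "sdis":  lambda: random.gauss(0.0, 0.5),
#             "LJ":    lambda: random.uniform(-5, 5),
#             "RJ":    lambda: random.uniform(-5, 5)}
```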
We will see that these simulation results are consistent with the error models we have explored in the previous sections.

5.5 Monte Carlo Experiments

We were very interested in the behavior of the two sensing geometries in the presence of the error models we developed in Section 5.4.2. The Short Side Geometry appeared to promise the most capability, but we were concerned about its sensitivity to θ error. Both Schneider and Wang had concluded that rotational, or angular, uncertainty was the most sensitive parameter in their applications [134, 155], and we have already seen evidence to support this conclusion in our problem. Uncertainty is often a function of measurement errors that have additive effects, but their work found that angular uncertainty had multiplicative effects.

We conducted over 100 thousand Monte Carlo trials to examine the percent of metric error in measuring landmarks of known sizes using either of our proposed sensing geometries. Ten thousand trials at a given optical axis orientation were obtained using each sensing geometry and the error models developed in the last chapter. The following six figures illustrate with histograms the performance of both geometries at selected arbitrary look angles.

As expected, the Short Side Geometry delivered accuracy superior to the Two Side Geometry. The effect of the uniformly distributed Δθ can be seen in the wide shape of the Short Side results. The Two Side results have the characteristic Gaussian shape, and the offset bias varied with the camera orientation. Useful accuracy is only seen with the orthogonal look angle of the Short Side Geometry. In our domain, landmark classes are often separated by two inches, or less, in their metric attribute. Table 5.6 summarizes the probability that the Short Side Geometry measured the landmark within two inches, or 5.55%, at three arbitrary look angles.

θ       Prob(± 5.55%)
90°     0.92
70°     0.41
50°     0.16

Table 5.6. Probability of Short Side Geometry measuring within 2 inches

From these results we concluded that the Short Side Geometry offered better performance in a general domain than the Two Side Geometry, and that a control envelope should be defined to ensure that sensing occurs when the data extracted from the sensors is most likely to be accurate. Obtaining range data from sharp look angles promised the highest accuracy, so an optical axis orthogonal to the axis of motion would be a good design for very fine metric work.

Figure 5.4. Short Side Geometry, θ = 90° (percent range error, n = 10000)
Figure 5.5. Short Side Geometry, θ = 70° (percent range error, n = 10000)
Figure 5.6. Short Side Geometry, θ = 50° (percent range error, n = 10000)
Figure 5.7. Two Side Geometry, θ = 90° (percent range error, n = 10000)
Figure 5.8. Two Side Geometry, θ = 70° (percent range error, n = 10000)
Figure 5.9. Two Side Geometry, θ = 50° (percent range error, n = 10000)

5.6 Summary

In this chapter we have presented two sensing geometries that support landmark classification in a structured domain. These methodologies fuse the sensing capabilities of a monocular camera and an array of ultrasonic transducers. This geometric approach enables depth information to be attributed to an extracted domain feature.
These geometries successfully cast vision and ultrasonic sensing in roles where each can operate accurately, while simultaneously supporting real time navigation. While these sensing frameworks each led to an overall sensing footprint and a navigation control envelope, locomotion and path planning issues have not been overly constrained. We developed detailed composite error models that were injected into landmark classification simulations that demonstrated the potential of both these sensing geometries. The Short Side Geometry displayed the greater accuracy potential and will be extensively employed in the studies of the next chapter.

Both of these original sensing geometries offer a more general landmark classification solution to other applications. It is noteworthy that neither sensing geometry requires ultrasonic or camera calibration, and both offer real time performance at very affordable hardware costs.

CHAPTER 6

Inexact Graph Matching

The developments presented in the previous chapters provide the representational model and the sensing strategy to perform autonomous pose recovery in a structured domain. In the following sections we will discuss the symbolic feature ambiguity inherent in operating domains, and the representational challenges that the representation must handle. Inexact feature attributes, particularly sensed metric data, present difficult matching challenges. Heuristic and deterministic algorithms for error recovery will be presented along with simulation results obtained using realistic error models. These simulations explore different approaches toward developing a robust, working solution to the Where-am-I Problem. Performance trade-offs between various heuristics will be studied. The results show that our approach can provide accurate pose recovery in a short period of time.

6.1 Representation

In Sections 3.2 and 3.3 we examined representations used in mobile robotics and presented the StickRep structure. We saw that StickRep was a 1½ dimensional structure that provided a minimal representation for our robotic navigator. Each edge contains some or all of the following attributes:

type ∈ {corner, doorway, elevator, hallway, pattern, stairwell}
metric ∈ ℝ⁺
prototype ∈ {doorjam, hallsegment, pattern}
pattern ∈ {fi, 0, Q, C7}

Nodes hold adjacency information of the incident feature edges and contain the following attributes:

adjacency angle ∈ {0, ..., 360}
global coordinates ∈ {X_W, Y_W, Z_W, α, β, γ}

StickRep is implemented as a doubly linked list of records. These records encode sequences of feature data. If we view the database and the sensed feature data as a one dimensional sequence of attributed edges, then feature matching becomes a string matching problem with each attributed edge corresponding to a terminal in our edge type alphabet. The sensed feature map under construction becomes a graph of attributed terminal symbols [46]. This design greatly simplifies the mechanics of the matching, and a string matching scheme was modified to handle ambiguous metric information [94]. The pattern field is a dynamic entry which attempts to capture feature information that is supplemental to the feature type. For example, a poster on a door is often sensed as a pattern within a larger feature that is strongly conjectured to be of edge type door. This feature information is attributed to a door edge for subsequent disambiguation and confirmation. Feature information is dynamic, so expanding the context may provide greater confidence in a potential match.
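To make the string matching view concrete, the sketch below shows a stripped-down StickRep edge record and a brute-force matcher that slides a sensed edge sequence along the building database. The record fields, the relaxation parameter, and the omission of node adjacency angles are simplifications for illustration, not the dissertation's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class StickEdge:
    etype: str                     # e.g. "doorway", "hallway", "corner"
    metric: float                  # length attribute, in inches
    pattern: Optional[str] = None  # supplemental feature, e.g. "poster"

def matches(db_edge: StickEdge, sensed: StickEdge, relax: float = 0.0) -> bool:
    """Edge-level match: types must agree; metric attributes must agree to
    within the current relaxation fraction (0.0 means an exact match)."""
    if db_edge.etype != sensed.etype:
        return False
    return abs(db_edge.metric - sensed.metric) <= relax * db_edge.metric

def find_poses(database: List[StickEdge], sensed_seq: List[StickEdge],
               relax: float = 0.0) -> List[int]:
    """Treat matching as string matching: slide the sensed subgraph along the
    building database and return every starting index that is consistent."""
    hits = []
    for start in range(len(database) - len(sensed_seq) + 1):
        if all(matches(database[start + i], s, relax)
               for i, s in enumerate(sensed_seq)):
            hits.append(start)
    return hits
```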
In the next section we will discuss how StickRep, together with our matching scheme, handled symbolic feature ambiguity.

6.2 Ambiguity

Despite the structure provided by our assumed domains, great feature ambiguity still exists even without the confusion injected by sensor error. Complete knowledge of the edge types, and the adjacency angles between the edges, is insufficient for unique identification. Ambiguity can be strong without larger context. The third floor of the Engineering Building is just one of the domains we examined, but it provides an illustrative example of ambiguity, and is portrayed in Tables 6.1 - 6.9. Each table contains the number of occurrences of a particular edge sequence on the building's third floor. For example, Table 6.2 shows that there exist 28 instances of adjacent, coplanar 36 inch door pairs. Table 6.5 shows that there are 152 instances of a 36 inch door with coplanar wall surfaces on both adjacent sides. Strong ambiguity exists, particularly involving edges representing the most common door width of 36 inches.

Tables 6.8 and 6.9 illustrate the variety of door and wall edges within their respective edge types. Metric information is often necessary to disambiguate candidate subgraphs during matching. For example, Table 6.9 indicates that only 62 of the 431 wall edges occurring on the Engineering Building's third floor are 160 inches or greater in length.

A = 90°     door30  door32  door36  door40  wall
door30      0       0       0       0       0
door32      0       0       0       0       0
door36      0       0       0       0       0
door40      0       0       0       0       0
wall        0       0       0       0       115

Table 6.1. Domain ambiguity: Number of edge pairs with adjacency angle of 90°

A = 180°    door30  door32  door36  door40  wall
door30      11      0       0       0       11
door32      0       0       0       0       4
door36      0       0       28      0       183
door40      0       0       0       0       11
wall        10      4       181     11      0

Table 6.2. Domain ambiguity: Number of edge pairs with adjacency angle of 180°

A = 270°    door30  door32  door36  door40  wall
door30      0       0       0       0       0
door32      0       0       0       0       0
door36      0       0       0       0       1
door40      0       0       0       0       0
wall        0       0       2       0       98

Table 6.3. Domain ambiguity: Number of edge pairs with adjacency angle of 270°

A = 180°    door30  door32  door36  door40
door30      4       0       4       2
door32      0       0       1       0
door36      5       2       118     3
door40      0       0       4       4

Table 6.4. Domain ambiguity: Three Edges: (door)(wall)(door), adjacency angles of 180°

door30      0
door32      4
door36      152
door40      11

Table 6.5. Domain ambiguity: Three Edges: (wall)(door)(wall), adjacency angles of 180°

A = 180°    door30  door32  door36  door40
door30      0       0       1       2
door32      0       0       0       0
door36      0       2       114     3
door40      0       0       2       4

Table 6.6. Domain ambiguity: Four Edges: (door)(wall)(door)(wall), adjacency angles of 180°

A = 180°    door30  door32  door36  door40
door30      0       0       0       0
door32      0       0       1       0
door36      5       1       109     2
door40      0       0       4       4

Table 6.7. Domain ambiguity: Four Edges: (wall)(door)(wall)(door), adjacency angles of 90°
Figure 6.1 illustrates, for every possible starting location, the number of adjacent edges the navigator would have to accurately sense and classify, before the assembled subgraph of edges would uniquely 88 blank old window no window new window 2 new windows poster door30 22 0 0 0 0 0 door32 4 0 0 0 0 0 door36 212 54 10 3 104 29 door40 11 11 0 0 0 0 Table 6.8. Domain ambiguity: Door type census )0 212 216 232 248 264 280 296 2112 2128 2160 wall 431 415 370 328 223 202 160 121 102 88 62 Table 6.9. Domain ambiguity: Wall type census match a starting location in the master building graph. Maximum ambiguity occurs at several starting locations that correspond to 36 inch doors embedded in a highly regular metric feature sequence. These starting locations represent the areas that would require the most context before the robot navigator could determine its pose. This disambiguation becomes much more complex in the presence of sensing error and noise. We will examine this issue in detail in subsequent sections. Ambiguity could be reduced by designing the edge type definitions much finer in granularity. Subdividing the door and wall edge types into multiple edge types would yield matching subgraphs with considerably less context. This corresponds to encorporating many of the attributes directly into the edge definitions and classifi- cation schemes. Although this methodology appears attractive, its implementation may quickly reveal an inability to accurately sense the features necessary for robust classification. The necessity of real time processing further restricts the ability to extract fine feature detail [102, 106, 138, 148]. Edges needed for Uniqueness 89 30 25 l 20 15 10 F r I M I 0 1 0000 20000 30000 40000 Initial Location Figure 6.1. Starting location ambiguity of MSU Engineering Building 90 6.3 Simulations Over 80,000 simulated navigation runs were conducted using the StickRep model database presented, under various simulated conditions. For each simulation, Rome was initially placed at a random location and orientation in the hallway freespace. The following assumptions were made throughout the simulation experiments: A1) Rome has a valid StickRep database of the operating domain. A2) Rome’s ultrasonic and monocular sensors behave as modeled in Chapters 4 and 5. A3) Rome has the capability to autonomously align its forward motion vector to within a few degrees of the ultrasonically sensed hallway axis. This assumption is based on the exploratory work demonstrated by Dulimarta [47]. A4) Rome has the capability to autonomously center, within a few inches, its position in the hallway freespace and maintain this centering during forward motion without significant oscillation. Again, this assumption is based on the empirical work of Dulimarta using only ultrasonic sensing. Rome may commence travel either left to right or right to left relative to the orientation of the camera. Each simulation experiment consisted of a suite of 10 independent trials, with each trial consisting of 1000 different random starts. The experiment was implemented with a sliding window along the chosen surface of the hallway freespace. Frames were acquired each 0.2 seconds, retaining only frames containing a vertical edge ribbon. The matching process took a worst case time of only 40 milliseconds per frame acquired, and the image processing described in Section 4.1 consumed another 100 milliseconds. Conservatively then, Rome only requires 0.2 seconds between frames to solve this problem. 
All these times were taken from simulations performed on a Sparc 1 workstation. Upgrading Rome's computer from the 386SX laptop to a Sparc 1, sans monitor, is currently underway. To ensure that feature overlap exists with this field of view, a subsequent frame acquired 0.2 seconds later enforces a maximum velocity of 30 feet per second. This is well over the maximum Labmate base speed of 2.6 feet per second, so pursuing this approach to the Where-am-I Problem is limited by the mechanical capability of Rome's Labmate base, and not by the implementation of our approach. Any hallway navigator cruising at speeds in the neighborhood of 30 feet per second would very likely be viewed as a safety hazard, particularly since this is well beyond the reaction horizon using our current ultrasonic sensors (and close to world record sprinting time). Figure 6.2 describes the simulation we conducted. In the next section we will discuss how step (2) in Figure 6.2 was implemented using the error models developed in Section 4.3.

6.4 Error

Error is inherent in any robotics application. The Where-am-I Problem is challenging not just due to ambiguity in a dynamic operating domain, but also due to unavoidable sensing errors. If we had domain ambiguity, but no error, then shaft-encoded wheels could support accurate dead reckoning and there would be no pose uncertainty once our navigator had disambiguated its correct pose. In the presence of sensing error, navigation and control become more difficult because we can never have complete confidence in our sensed data. We assume that our navigator does not know its initial pose, so any sensed data that is consistent with some location in its building database has to be accepted as long as it remains consistent with any previously sensed data.

In the next section we will explore our Where-am-I solution in the operating domain assuming no error. In Section 6.4.2 we will develop an empirically based fused error model that will be supplemental to our model of Section 4.3 in simulating the injected error of step (2) in Figure 6.2.

(0) Receive random starting location
(1) Index actual StickRep edge from database
    (1.1) Establish ground truth location
(2) Inject modeled error onto actual StickRep edge
    (2.1) Ultrasonic error
    (2.2) False positive error
    (2.3) Missized true positive error
    (2.4) False negative error
(3) Determine sensed StickRep edge
(4) Concatenate into sensed subgraph
(5) Index sensed subgraph from building database
    (5.1) Assemble set of possible locations from matches with database
(6) Examine possible location set
    (6.1) If more than one possible location
        (6.1.1) Increment metric travel indicator
            (6.1.1.1) Inject mechanical wheel error
        (6.1.2) Goto (1) for next edge
    (6.2) Else, if one possible location
        (6.2.1) Propose symbolic pose
        (6.2.2) Compare proposed pose with ground truth
    (6.3) Else, if zero possible locations
        (6.3.1) Raise sensing error flag

Figure 6.2. Simulation Algorithm

6.4.1 No Error

Examining our Where-am-I Solution in a perfect sensing domain illustrates the ambiguity in the feature space, and presents a best-case situation. This simulation experiment, like the other five we will examine later, consisted of a suite of 10 independent trials, with each trial composed of 1000 simulated navigation runs from random starting locations and orientations. Figure 6.3 is a representative histogram illustrating the number of edges acquired before context disambiguation resolved a single possible match to the building database.
In all cases, the proposed pose was the correct actual pose, as expected without any sensing error. Frames were acquired and assembled into StickRep edges. When complete domain features were assembled into a candidate subgraph, it was indexed into the building database. The average number of edges acquired before convergence to a unique pose was 5.28. The average StickRep edge in the operating domain had a metric average of 67.64 inches, so the average distance Rome travelled before acquiring its pose was 29.76 feet. We saw in the previous section that computationally we could acquire this average pose fix in just over one second, but Rome's Labmate base is the limiting constraint. Using the present hardware, this Where-am-I Solution would require an average of 11.45 seconds. Let us again emphasize that this simulation result was without sensing error.

This experiment was designed to evaluate the viability of the StickRep model in representing a large building. It also examined the feature ambiguity of the operating domain. These experimental results provide a comparative baseline for the evaluation of the subsequent simulation experiments. In the next section we will discuss the development of our fused sensor error models.

Figure 6.3. Unique Pose without Error (histogram of the number of edges acquired, n = 1000)

6.4.2 Empirical Based Error

In Sections 4.3 and 5.4 we discussed the modeling of individual sensor error. What was needed to construct a realistic simulation was a fused sensor error model; a model that explains how the Short Side feature extraction algorithm developed in Section 5.3 fails with respect to both vertical ribbon detection and ultrasonic ranging. The Short Side Algorithm was trained on five different suites of coordinated image/ultrasonic frames totalling 157 fused sensor readings. It was then tested on a fused image/ultrasonic suite of 94 fused sensor readings. These data were acquired using the actual Rome platform under various illumination conditions during a 9 month time period. Recalling our individual ultrasonic sensor error model from Section 4.3, we define our control envelope within the domain as a position within 1 foot of the hallway freespace axis, with the optical axis within 75° of the surface feature normal. A significant amount of this evaluation data was taken from outside this control envelope, so we feel that these results are conservative. Table 6.10 presents the summary of the Short Side Algorithm's performance on this large library of data. These results constitute the injected error model for the remainder of the simulation experiments, as shown in step (2) in Figure 6.2. For classification purposes, recall from Section 4.1 that the vertical ribbon detector was looking for doors, and classifying them into one of six legal sizes.

Correct Classifications        83.87 %
  True Positives               83.67 %
  True Negatives               84.10 %
Misclassifications             16.13 %
  False Positives              15.91 %
  Missized True Positives      10.20 %
  False Negatives               6.12 %

Table 6.10. Empirical Based Error Model
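To illustrate how an empirical model like Table 6.10 can drive step (2) of the simulation in Figure 6.2, a sketch follows. It reuses the StickEdge record from the earlier representation sketch; the rates are rounded from Table 6.10, the four door widths come from the census tables of Section 6.2, and the conditional structure of the injector is an assumption made for this sketch, not the dissertation's implementation.

```python
import random

# Rates rounded from Table 6.10; treating them as mutually exclusive draws
# is a simplifying assumption.
P_FALSE_NEGATIVE = 0.061   # a real edge is missed entirely
P_MISSIZED       = 0.102   # a door is detected but given the wrong width
P_FALSE_POSITIVE = 0.159   # a spurious door is reported
DOOR_WIDTHS = [30, 32, 36, 40]   # inches (Section 6.2 census)

def inject_error(true_edge):
    """Sketch of step (2) in Figure 6.2: corrupt one true StickRep edge.
    true_edge is a StickEdge (see the Section 6.1 sketch); returns the list
    of edges the simulated sensor reports."""
    r = random.random()
    if r < P_FALSE_NEGATIVE:
        return []                                          # false negative
    if r < P_FALSE_NEGATIVE + P_MISSIZED and true_edge.etype == "doorway":
        wrong = [w for w in DOOR_WIDTHS if w != true_edge.metric]
        return [StickEdge("doorway", random.choice(wrong))]   # mis-sized door
    if r < P_FALSE_NEGATIVE + P_MISSIZED + P_FALSE_POSITIVE:
        phantom = StickEdge("doorway", random.choice(DOOR_WIDTHS))
        return [true_edge, phantom]                        # false positive
    return [true_edge]                                     # sensed correctly
```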
6.5 Metric Relaxation Matching Fuzzy set theory and fuzzy control theory have active, maturing research communi- ties. Much work has been done since Zadeh’s pioneering work in 1965 [159]. The task of interest here is developing the capability to accurately infer a match between a deterministically derived metric attribute and a sensed metric attribute that may or may not contain sensing error. Many applications in fuzzy control handle vary- ing inputs by representing the inputs by their fuzzy membership values [76, 144]. Other image processing applications develop sufficient and c-sufficient statistics [11], or ranking procedures [150] to perform fuzzy pattern classification. An implementa- tional weakness of these approaches revolves around the class membership definition. In our application, although a door type edge must be a member of one of six metric classes, a wall type edge has no such restriction on the metric attribute. How do we define our wall type edge class membership? Lee suggests that relaxation is a much simpler approach in this situation, and may actually yield superior results [98]. We implemented a metric relaxation approach. If no sensing error were present, a metric relaxation approach would not be nec- essary. However, significant error is not always present, particularly when Rome operates within the defined control envelope. The presence of significant error can only be noted when sensing a known feature, or when the sensed feature fails to match 97 (0) Subgraph fails to match any region of building database (1) If relaxation = 0%, set relaxation = 10% (2) If relaxation = 10%, set relaxation = 20% (3) If relaxation = 20%, FAIL Figure 6.4. Metric Relaxation Heuristic a feature edge in the building database. When this matching failure occurs, Rome’s executive knows that a sensing error has occurred, but it does not know what edge, or edges, in the assembled subgraph contain significant error. To ensure that the assembled subgraph is without error, the entire sensed subgraph must be discarded. We were concerned that such a decision strategy would threaten our real time con- trol requirements. We will revisit this issue in Section 6.7. We initially developed a stepped metric relaxation heuristic that is invoked when the executive routine notes that significant error has produced a failure to match condition. Figure 6.4 illustrates the heuristic that operates on the feature metric attribute in the building database. Note that after step (2) the relaxed granularity is coarser than the metric accuracy delivered within the specified operating envelope. If our assembled subgraph fails to match despite an arbitrary 20 percent relaxation in metric attributes, the executive routine registers a sensing failure. During the simulation experiments, this heuristic was used in conjunction with other heuristics and when such a failure was noted the attempted pose fix was labeled a failure. We will discuss these experiments in detail in the following section. 98 43.80% of frame sequences contain error 0.00% of erroneous frame sequences recovered pose 41.70% of frame sequences did not converge 57.30% Correct Poses 1.00% Incorrect Poses Table 6.11. Performance without heuristics 6.6 Heuristic Error Recovery Experiments In this section we will discuss in detail the extensive simulations that were conducted using the StickRep Solution to the Where-am-I Problem. We conducted these heuris- tic error recovery experiments on only one domain - the third floor of the Engineering Building. 
We felt that this domain offered the most general and varied real environment. The deterministic method to be developed in the next section will be examined in multiple domains. These experiments were conducted according to the assumptions in Section 6.3. Rome can only react to sensing errors if it knows that an error has occurred. Thus, the executive routine invokes a heuristic only if the subgraph matcher has failed to produce any pose candidates. These heuristic invocations are a modification to step (6.3.1) in our general simulation algorithm presented in Figure 6.2. We emphasize that no heuristic is invoked until a matching failure occurs. Using the error model from Table 6.10 for step (2), we initially conducted an experiment without heuristics to examine the performance of the algorithm under simulated error. Table 6.11 summarizes the results of the median trial of 1000 attempts from the suite of 10 such trials.

The results presented here represent the probability of Rome correctly determining its symbolic pose in the presence of sensing and navigation error. The first line of Table 6.11 illustrates the percent of the simulated frame sequences that contained error before a pose was concluded, or matching was abandoned due to error. The second line shows the percent of frame sequences that did contain error, zero here, that subsequently converged to a match and a pose was concluded. The third line indicates the percentage of frame sequences that did not converge to a unique match due to error, and were abandoned. In the fourth line the percent of correct poses concluded is noted, and the last line shows that Rome incorrectly determined its pose a small percent of the time because sensing error in its assembled subgraph accidentally matched a region of the building database. We feel that this behavior, corresponding to a situation of false pose confidence, could represent a dangerous situation. Unless this false pose is subsequently corrected by expanded context, undesired behavior certainly could result. The percent of incorrect poses clearly is a figure to watch closely.

6.6.1 First Frame Error

Here we examine StickRep's performance enhanced by the decision to restart the data acquisition process if an error is noted from a matching failure, but only if this error occurs in the first acquired frame. This approach foreshadows the discussion in the next section. The results in Table 6.12 summarize our solution's performance. Statistical variation among the 10 trials in this experiment was minor, and was correlated to the injected error rate. The median error rate trial is given. Only a mild overall improvement in performance was observed.

43.30% of frame sequences contain error
15.70% of erroneous frame sequences recovered pose
34.80% of frame sequences did not converge
63.50% Correct Poses
 1.70% Incorrect Poses

Table 6.12. Performance with First Frame Heuristic

6.6.2 False Door Error

In this experiment we examined a recovery heuristic that was developed to overcome the most common error in our empirical error model. The reason that false doors are observed is due to spurious edge features in the operating domain and an aggressive vertical ribbon detector. Here our heuristic removed the door type edges in our assembled subgraph that corresponded to the door type whose a priori probability was the lowest. Referring to Table 6.8, in this domain, the 32 inch door is the least likely to be sensed. This deletion occurs sequentially, starting arbitrarily with the most recently sensed door and working backwards in time. Between each door edge removal, a matching occurs, attempting to recover candidate poses. Again, statistical variation was minor among the 10 trials, and Table 6.13 summarizes performance. A significant performance improvement was observed, but at a cost of almost tripling the dangerous incorrect poses.

43.67% of frame sequences contain error
40.30% of erroneous frame sequences recovered pose
23.40% of frame sequences did not converge
74.00% Correct Poses
 2.60% Incorrect Poses

Table 6.13. Performance with False Door Heuristic

6.6.3 Mis-sized Door Error

Here we isolate our examination of metric feature relaxation as discussed in the previous section. The desired performance enhancement is to recover from the mis-sizing, or misclassification, of door type edges. An actual door was sensed, and correctly identified as a door, but was misclassified as a door type edge of incorrect width. Table 6.14 presents the performance summary. This heuristic was unimpressive when not used in conjunction with other heuristics, but only mildly increased the number of incorrect poses.

44.30% of frame sequences contain error
28.37% of erroneous frame sequences recovered pose
31.90% of frame sequences did not converge
66.90% Correct Poses
 1.20% Incorrect Poses

Table 6.14. Performance with Mis-sized Heuristic

6.6.4 Full Recovery Suite

In Table 6.15 we summarize our solution's performance when all three heuristics are used in combination. An 84.3 percent error recovery rate was observed, but a significant percentage of false poses was observed. A 20 percent increase in matching time was observed, but this 8 milliseconds per frame is not significant in our application. The correct pose performance noted in Table 6.15 is impressive, but we felt that the incorrect pose percentage represented a significant risk. Reducing the risk of incorrect pose estimation without degrading correct pose performance would yield a solution approach with strong potential. In the next section we will discuss just such an approach.

42.20% of frame sequences contain error
62.80% of erroneous frame sequences recovered pose
 8.20% of frame sequences did not converge
84.30% Correct Poses
 7.50% Incorrect Poses

Table 6.15. Performance with Full Heuristic Suite
Here our heuristic removed the door type edges in our assembled subgraph that corresponded to the door type whose a priori probability was the lowest. Referring to Table 6.8, in this domain, the 32 inch door is the least likely to be sensed. This deletion occurs sequentially arbitrarily starting with the most recently sensed door, backwards in time. Between each door edge removal, a matching occurs, attempting to recover candidate poses. Again, statistical variation was minor among the 10 trials, and Table 6.13 summarizes performance. A significant performance improvement was observed, but at a cost of almost tripling the dangerous incorrect poses. 6.6.3 Mis-sized Door Error Here we isolate our examination of metric feature relaxation as discussed in the previ- ous section. The desired performance enhancement is to recover from the mis-sizing, 101 44.30% of frame sequences contain error 28.37% of erroneous frame sequences recovered pose 31.90% of frame sequences did not converge 66.90% Correct Poses 1.20% Incorrect Poses Table 6.14. Performance with Mis-sized Heuristic 42.20% of frame sequences contain error 62.80% of erroneous frame sequences recovered pose 8.20% of frame sequences did not converge 84.30% Correct Poses 7.50% Incorrect Poses Table 6.15. Performance with Full Heuristic Suite or misclassification of door type edges. An actual door was sensed, and correctly identified as a door, but was misclassified as a door type edge of incorrect width. Table 6.14 presents the performance summary. This heuristic was unimpressive when not used in conjunction with other heuristics, but only mildly increased the number of incorrect poses. 6.6.4 Full Recovery Suite In Table 6.15 we summarize our solution’s performance when all three heuristics are used in combination. An 84.3 percent error recovery rate was observed, but a signifi- cant percentage of false poses was observed. A 20 percent increase in matching time was observed, but this 8 milliseconds per frame is not significant in our application. The correct pose performance noted in Table 6.15 is impressive, but we felt that the incorrect pose percentage represented a significant risk. Reducing the risk of 102 incorrect pose estimation without degrading correct pose performance would yield a solution approach with strong potential. In the next section we will discuss just such an approach. 6.7 Deterministic Error Recovery Table 6.15 shows the best heuristic performance towards error recovery. While the applied heuristics were powerful in reducing the percentage of frame sequences that did not converge, incorrect poses occured frequently. Table 6.16 summarizes results from an experiment where no heuristics were used. Here we discard the entire sensed subgraph if an error is noted by a failure to match the building database, and sim- ply construct another sensed sequence. The observed multi-sequence performance exceeded the accuracy of all previous experiments. The matching was fast enough that despite a large increase in the number of matching requests precipitated by this approach, the absense of heuristics allowed this experiment to actually run faster than the experiments using heuristics. Figure 6.5 is a histogram illustrating the number of frames sensed before a final pose was determined using this approach. Table 6.17 summarizes the comparison of our solution’s average performance with, and with- out modeled error. 
The average number of edges required before convergence to a unique pose was observed to be 6.84, or 38.55 feet travelled before acquiring its pose. Using the present hardware, this Where-am-I Solution would require an average of only 14.83 seconds to acquire its pose. That pose would be accurate 97.80 percent of the time. This performance compares very favorably with the results we observed without sensing error in Section 6.4.1. 150 41 100 50 103 l . I I‘lllllllllIlu-II_I——I_I_ - - I I I I 0 I 10 20 30 40 Number of Edges (n=1000) Figure 6.5. Multi-Sequence with Error 50 104 6.8 Examining Additional Domains Our examination of error recovery approaches in the previous sections led us to con- clude that our deterministic recovery approach was superior to the heuristic methods. The deterministic error recovery performance was summarized in Tables 6.16 and 6.17. We now will examine the performance of our solution applied to additional domains. We felt that although the Engineering Building domain is very large, many other domains, while smaller in scale, possess even greater feature ambiguity that will fur- ther challenge our approach. We conducted the examination of these other domains with only our deterministic error recovery algorithm and used the composite error model we developed for our previous domain. The performance summaries presented represent the median trial of 1000 attempts from a suite of 10 such trials. We will discuss one additional real, one manufactured, and two synthetic domains: 0 MSU Engineering Building, Third Floor: 3 mile hallway with both lab and office areas and varied features. Domain previously discussed. 0 MSU Wells Hall, Seventh Floor: Office building roughly 15 percent the scale of the Engineering Building with strong feature regularity. 0 Synthetic regular polygon with 46 sides and only one unique feature provides a very ambiguous domain. 0 Synthetic storefront 115 mile long as viewed from the street provides a structured outdoor domain. 0 MSU Wells Hall, Seventh Floor: Modified by introducing a single artificial vertical edge ribbon at the feature edge location farthest from existing reference landmark. 105 The next application domain was the seventh floor of Wells Hall as shown in Figure 1.1. This domain is roughly 15 percent the scale of the Engineering Building domain, but contains regular feature sequences that demand a much larger context to enable an accurate match. The major feature sequence ring contains a single feature landmark to provide reference context for all other ring features. Table 6.18 summarizes the observed accuracy. The ambiguous domain demanded long frame sequences for reference context and these sequences were thus more likely to contain an sensing error. We next constructed two synthetic domains to explore particular performance questions. The regular polygon was arbitrarily chosen as a difficult domain with a single reference feature placed to avoid complete ambiguity. The storefront sequence is comparable to the feature sequence viewed during hallway navigation. We assume that street navigation is safe and possible, and designed our simulation to recover pose in a domain scaled to a city block. Real storefronts were used to guide the construction of this domain. Tables 6.19 and 6.20 summarize the accuracies observed in these two domains. We returned to the Wells Hall domain to explore the potential of disambiguating the domain with the addition of a reference landmark. 
We introduced a single arti- ficial vertical ribbon edge at the feature edge location most distant from the existing reference landmark. We hoped that this feature manufacturing would decrease the scope of the context searching observed in the actual domain. Table 6.21 illustrates the observed mild improvement in accuracy performance. but the time performance is a more important summary figure. In Table 6.22 we summarize the time performance observed in all five domains. The regular polygon was clearly a difficult navigation domain even without sensing error, and our approach appears unsuitable for applications of this type. Wells Hall, despite its smaller size, was observed to be a more difficult. domain for pose recovery 106 than the Engineering Building. The artificial landmark placed in Wells Hall improved pose recovery performance significantly from the existing domain, particularly when sensing error is considered. Although this manufactured domain provides only a single example, it demonstrates the potential for improving domain performance with a small degree of domain tailoring. The synthetic storefront provides an example for the application of our solution approach in structured outdoor domains, provided that locomotion and safety issues can be overcome. An outdoor environment still offers insufficient structure for navigation of this nature to be currently practical. If special situations, such as hazardous materials handling, were to remove human dynamics from this domain, then useful pose recovery would become possible. Advances in locomotion hardware could improve these results significantly before the applications approached a real time processing constraint. 107 41.50% of frame sequences contain error 94.70% of erroneous frame sequences recovered pose 0.00% of frame sequences did not converge 97.80% Correct Poses 2.20% Incorrect Poses Table 6.16. Performance of deterministic recovery in Engineering Building No Error Modeled Error Edges 5.28 6.84 Distance (ft) 29.76 38.55 Time (sec) 11.45 14.83 Accuracy (%) 100.00 97.80 Table 6.17. Overall performance summary 69.70% of frame sequences contain error 90.82% of erroneous frame sequences recovered pose 0.00% of frame sequences did not converge 93.60% Correct Poses 6.40% Incorrect Poses Table 6.18. Performance of deterministic recovery in Wells Hall 96.00% of frame sequences contain error 97.92% of erroneous frame sequences recovered pose 0.00% of frame sequences did not converge 98.00% Correct Poses 2.00% Incorrect Poses Table 6.19. Performance of deterministic recovery in regular polygon 108 47.40% of frame sequences contain error 93.15% of erroneous frame sequences recovered pose 0.00% of frame sequences did not converge 96.10% Correct Poses 3.90% Incorrect Poses Table 6.20. Performance of deterministic recovery along synthetic storefront 62.40% of frame sequences contain error 91.35% of erroneous frame sequences recovered pose 0.00% of frame sequences did not converge 94.60% Correct Poses 5.40% Incorrect Poses Table 6.21. Performance of deterministic recovery in a modified Wells Hall Domain Error Type Edges Distance (ft) Time (sec) Accuracy (%) Engr None 5.28 29.76 11.45 100.0 Bldg Modeled 6.84 38.55 14.83 97.8 Wells None 13.72 69.97 26.91 100.0 Hall Modeled 37.05 188.96 72.68 93.6 Modified None 12.61 64.31 24.74 100.0 Wells Modeled 14.76 75.28 28.95 94.6 Regular None 46.00 383.33 147.44 100.0 Polygon Modeled 66.62 555.19 213.54 98.0 Store- None 14.28 73.76 28.54 100.0 front Modeled 18.04 92.19 37.88 96.1 Table 6.22. 
6.9 Summary

This chapter has presented the simulation results of our StickRep Solution to the Where-am-I Problem. We have seen that in a domain with great symbolic feature ambiguity, StickRep provides a strong representational framework to determine the pose of a robot navigator. We have examined several heuristic schemes to recover from sensing error, but all these heuristics fall short of the performance of our deterministic approach. Metric relaxation allowed ambiguous feature attributes to be matched using the StickRep representation. Even under the influence of realistic noise models, our solution converged to a pose only 30 percent slower than under ideal, error-free sensing conditions. Furthermore, even with significant error, our solution converges to an accurate pose in a short period of time. We feel that this solution approach is robust, and provides strong performance even on low cost hardware navigators like Rome. No competitive solution to this problem has appeared in the literature.

Capable but low cost sensors and navigation platforms have opened up the scope of applications for both industrial and research mobile robots. Our solution methodology enables these systems to become more autonomous and to pursue more challenging applications when provided the structure of an indoor environment. StickRep enables less ambitious tasks to be pursued in a structured outdoor environment. Our Where-am-I Solution does not rely on fast or specialized hardware, so it can grow in performance as advances in hardware continue. With such a strong solution in hand, more ambitious navigational tasks can be attempted since the robot possesses the capability to autonomously recover its pose.

CHAPTER 7

Summary, Conclusions and Recommendations for Future Research

7.1 Summary and Conclusions

The research addressed in this dissertation dealt with robotic navigation and pose recovery using a symbolic landmark map. Sensory information was obtained from an ultrasonic sensing system and a monocular camera to form a vertical landmark model. This model served as a basis for pose estimation, a key challenge for mobile robot applications. Our solution to the Where-am-I Problem enables map-based path planning, obstacle avoidance, landmark identification and other essential navigation operations.

Constructing such a model starts with the complex task of determining range information from the robot's sensing suite. We presented alternative approaches to representing the robot's sensory world, but instead of pursuing a dense, computationally demanding geometric model, we offered a sparse representational scheme that required strong symbolic matching, ambiguity resolution and error rejection capabilities. Our StickRep representation was built upon the idea of encoding extractable environmental features as attributed edges in a sparse symbolic graph. Our StickRep representation hinges on strong assumptions about the structure of an indoor domain. Visual landmarks in this domain must be sensed, and a real time image processing requirement restricts many approaches. We chose to pursue the extraction of symbolic features in our domain with a single intensity image fused with ultrasonic range data. Many visual features in an indoor environment possess strong verticality. We developed a processing suite that extracted vertical edge ribbons, and classified symbolic landmark candidates using the fused sensory data.
These experiments also led to the definition of an ultrasonic sensing accuracy footprint. Constructing a sensing framework that casts vision and ultrasonic sensing in roles where each can operate accurately, while simultaneously supporting unrestricted real time navigation, was our goal. We presented two sensing frameworks, our Two Side and Short Side geometries, that strove towards this goal. These geometries were developed and trained with a fused image/ultrasonic suite of 157 real sensor readings, and over 100 thousand Monte Carlo experiments. Detailed composite error models were determined from lengthy experiments employing these sensing geometries. Both geometries have a more general application, but extensive error analysis and simulations revealed that the Short Side Geometry was best suited for our task. To accurately establish its pose, our navigator must acquire its sensory information at instances where particular sensor error is near its minimum. We defined an accurate navigation envelope that enhances sensing accuracy while only mildly restricting travel and locomotion.

Using the StickRep representation and a test suite of 94 real fused sensor readings, extensive simulations were conducted to examine the performance of our approach in domains containing great feature ambiguity. Metric attributes were often required to successfully disambiguate among, and classify, candidate landmarks. We presented a stepped metric relaxation scheme to enable feature matching to proceed despite significant metric sensing error. From over 80,000 simulated navigation runs we determined that from a random initial and unknown pose, our navigator could recover its pose in less than 12 seconds in the absence of error and in less than 15 seconds with realistic modeled error in a large domain. Other domains also illustrated strong results, and the introduction of artificial landmarks offers to further improve performance in difficult domains. This time performance was limited not by our solution approach, but by the mechanical limitations of the robot base. Our solution would support implementation in a navigation system traveling at speeds up to 30 feet per second using a Sparc 1 processor. A battery of heuristic based pose recovery experiments was presented, but the overall performance of our deterministic error recovery approach exceeded the former results in all measures of speed and accuracy.

7.2 Contributions

The primary contribution of this research is a solution to the Where-am-I Problem in the examined domains. Our solution was examined using 94 real, fused test image/ultrasonic readings with over 80,000 simulated navigation trials. An accurate pose was recovered in 97.8 percent of the attempts in a large, varied domain. Accurate and rapid solution performance was also obtained in experiments with additional domains. We observed accurate and rapid pose recovery performance with a navigator constructed with low cost hardware. A Where-am-I Solution enables navigation systems to become more independent and to pursue more difficult applications in a structured, indoor environment.

Our Short Side and Two Side Geometries measure vertical edge ribbon features in real time with an uncalibrated camera. These sensing frameworks represent real time solutions to the more general problem of object measuring.
Provided the objects have at least two parallel linear edges, and are viewed with a contrasting background, their primary dimension may be determined without expensive, specialized hardware or a customized measurement jig.

We have presented StickRep, a 1½ dimensional structure that supports rapid symbolic matching with minimum a priori information. StickRep sparsely represents the 3D world. This representation supports fusing ultrasonic range data to images of symbolic features from the mapped domain. StickRep successfully handled inexact feature attributes in ambiguous feature domains throughout our extensive simulations. This representation reduced feature matching to an O(n) string matching problem, and supports dynamic feature attributes. StickRep avoids massive domain databases and the limiting assumption of a static environment.

We presented a real time vertical edge ribbon detector using available illumination. This detector was robust to rotational misalignment of the camera, and correctly extracted and classified almost 84 percent of the landmark candidates from a suite of 94 real test images. This detector only required inexpensive sensing hardware and computational support with processing speeds comparable to a Sparc 1.

Another contribution is the definition of an accurate ultrasonic sensing footprint. Forty thousand ultrasonic samples were obtained from various ranging surfaces and orientations. Definite limitations were observed that motivate acquisition strategies to ensure accurate ranging. Poor angular resolution has plagued ultrasonic sensing applications in the literature, but no sensing footprint has been established. Such a footprint led to the design of an overall navigation envelope that protects methodologies from frequent spurious ranging data.

A small contribution was made towards the physical construction of two robotic navigators to serve as testbeds for future experiments. They can serve as testbeds for theoretical work and simulations, and stimulate further studies. Rome and Sparta represent mature hardware platforms well prepared for future sensory and navigation experiments. Future designers will be relieved of a majority of the hardware concerns and instabilities inherent in robotic applications. Rome now possesses a stable power backplane, ready for evaluation and experimentation of systems such as transputers, or a Sparc 1 system box. Rome forms the initial backbone for more mature work, pursuing more ambitious tasks.

7.3 Recommendations for Future Research

We presented a large simulation study of our Where-am-I Solution in Chapter 6. We examined two real and three synthetic domains. While these domains are large, and contain varied and ambiguous features, expanding the simulation domains to include other buildings could be illuminating. It would be desirable if generalizations could be made linking navigation capabilities with building types, but the existence of sensing error will precipitate observations that are probabilistic in character. No performance guarantees can be made. We might expect pose recovery to be more rapid in some smaller buildings, since they would require fewer matching attempts than in our domains. But this presumes that feature ambiguity would not be great. Successfully extending our solution into limited outdoor domains would also be very useful.
We presented a real time vertical edge ribbon detector that uses available illumination. This detector was robust to rotational misalignment of the camera, and it correctly extracted and classified almost 84 percent of the landmark candidates from a suite of 94 real test images. The detector required only inexpensive sensing hardware and computational support with processing speeds comparable to a Sparc 1.

Another contribution is the definition of an accurate ultrasonic sensing footprint. Forty thousand ultrasonic samples were obtained from various ranging surfaces and orientations. Definite limitations were observed that motivate acquisition strategies to ensure accurate ranging. Poor angular resolution has plagued ultrasonic sensing applications in the literature, but no sensing footprint had been established. Such a footprint led to the design of an overall navigation envelope that protects methodologies from frequent spurious ranging data.

A small contribution was made toward the physical construction of two robotic navigators to serve as testbeds for future experiments. They can serve as testbeds for theoretical work and simulations, and they can stimulate further studies. Rome and Sparta represent mature hardware platforms well prepared for future sensory and navigation experiments. Future designers will be relieved of a majority of the hardware concerns and instabilities inherent in robotic applications. Rome now possesses a stable power backplane, ready for evaluation and experimentation of systems such as transputers or a Sparc 1 system box. Rome forms the initial backbone for more mature work, pursuing more ambitious tasks.

7.3 Recommendations for Future Research

We presented a large simulation study of our Where-am-I solution in Chapter 6. We examined two real and three synthetic domains. While these domains are large and contain varied, ambiguous features, expanding the simulation domains to include other buildings could be illuminating. It would be desirable if generalizations could be made linking navigation capabilities with building types, but the existence of sensing error will precipitate observations that are probabilistic in character. No performance guarantees can be made. We might expect pose recovery to be more rapid in some smaller buildings, since they would require fewer matching attempts than our domains. But this presumes that feature ambiguity would not be great.

Successfully extending our solution into limited outdoor domains would also be very useful. As we discussed in Section 2.3.2, many efforts are pursuing structured outdoor applications, and adapting StickRep's sequential feature representation to support more general image viewpoints would greatly empower autonomous navigation. Such an effort would be more general than past outdoor navigators at Michigan State [55] and at many other research centers. Unlike most published approaches, our effort is limited, not enabled, by the hardware on which it is implemented. Performance certainly may be degraded by an outdoor generalization, but strong performance potential exists.

An enhancement to an indoor navigator would be using two cameras with opposing optical axes to search both hallway walls for features. Our matching routine could be tasked to match two separate landmark candidates and then search for a domain location where these symbolic sequences occur on opposing walls. StickRep must be carefully modified to encode the domain relationships to include an opposing feature edge. Feature matching would resemble two dimensional string matching. Although this expansion of domain context may greatly decrease pose recovery times, it might instead become computationally intense due to the more complex edge adjacencies that may exist at a feature node. This approach could prove useful and certainly warrants exploring.

We presented our vertical edge ribbon detector in Section 4.1. While this detector was fast and tolerant to roll axis misalignment of the camera, it produced false positives on blank walls. A better detection procedure that eliminates these false positives represents a good opportunity for increased performance. Several attempts were made to implement more adaptive thresholding, but the crisp, uniform intensity histograms typical of a man made environment caused bimodal Gaussian fitting to be problematic and poor in performance. Other adaptive thresholding techniques should be explored; one candidate is sketched below. A successful adaptive thresholding technique that operated under natural and varied illumination would enhance the robustness of our detector.
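One candidate worth exploring is Otsu-style histogram thresholding, which selects the gray level that maximizes between-class variance and so avoids fitting a bimodal Gaussian mixture explicitly. The sketch below is a generic formulation offered only as a possible direction, not the detector of Section 4.1.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the gray level that maximizes between-class variance.

    gray : 2-D uint8 image array.  A generic adaptive-threshold candidate,
    offered only as an alternative to explicit bimodal Gaussian fitting.
    """
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()                       # gray-level probabilities
    omega = np.cumsum(p)                        # class-0 probability up to each level
    mu = np.cumsum(p * np.arange(256))          # cumulative mean
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)               # guard against empty classes
    denom[denom == 0] = np.nan
    sigma_b2 = (mu_total * omega - mu) ** 2 / denom
    return int(np.nanargmax(sigma_b2))
```

Whether such a criterion copes with the nearly unimodal histograms of blank hallway walls would have to be established experimentally, for example by rejecting frames whose between-class variance peak is weak.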
APPENDICES

APPENDIX A

Ultrasound Histograms

The following histograms are summarized in Tables 4.2 and 4.3 in Section 4.3. One thousand ultrasonic samples were obtained from each of three surface types at three or four orientations each. The indicated angle is the angle between the acoustic axis and the surface normal. The choice of angles was arbitrary, and the following histograms are representative of the results observed.

[Figures A.1-A.11: histograms of percent range error (n = 1000) by surface and orientation. A.1 concrete, 45°; A.2 concrete, 30°; A.3 concrete, 15°; A.4 concrete, 0°; A.5 wood, 45°; A.6 wood, 30°; A.7 wood, 15°; A.8 wood, 0°; A.9 cloth, 30°; A.10 cloth, 15°; A.11 cloth, 0°.]

APPENDIX B

Partial Derivatives from Sensing Geometries

In Section 5.5.3 we presented the first order linear error model for our landmark measuring geometries. The closed form representation of these partial derivatives surpassed four pages in length each! We present the following figures to summarize the behavior of the partial derivatives for each of the two sensing geometries. Note that as the arbitrary look angle increases, the nonlinear gain portion of the curve approaches the linear region of the curve that we are operating on.

[Figures B.1-B.12: behavior of the partial derivatives for the Two Side Geometry at look angles φ = 0°, 20°, and 40°.]

[Figures B.13-B.24: behavior of the partial derivatives for the Short Side Geometry at look angles φ = 0°, 20°, and 40°.]
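For reference, the first-order linear error model summarized by these figures follows from linearizing each measurement equation about its operating point. Written with generic symbols (the specific closed forms of Section 5.5.3 involve the measured junction positions and angles and, as noted, run to several pages each), and assuming the error sources are independent, the propagated uncertainty takes the familiar form:

```latex
% First-order (linearized) propagation of sensor error through a
% measurement D = f(x_1, \dots, x_k); the x_i are generic placeholders
% for the measured quantities in the sensing geometry.
\[
  \Delta D \;\approx\; \sum_{i=1}^{k}
      \frac{\partial f}{\partial x_i}\, \Delta x_i ,
  \qquad
  \sigma_D^2 \;\approx\; \sum_{i=1}^{k}
      \left( \frac{\partial f}{\partial x_i} \right)^{\!2} \sigma_{x_i}^2 .
\]
```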
BIBLIOGRAPHY

[1] M. Abidi and T. Chandra. Pose estimation for camera calibration and landmark tracking. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[2] N. Alvertos and D. Brzakovic. Camera geometries for image matching in 3-D machine vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, September 1989.
[3] R. Arkin and R. Murphy. Autonomous navigation in a manufacturing environment. IEEE Transactions on Robotics and Automation, August 1990.
[4] J. E. Arnold. Experiences with the subsumption architecture. Proceedings IEEE International Conference on Artificial Intelligence Applications, 1989.
[5] M. Asada. Building 3-D world model for a mobile robot from sensory data. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[6] M. Asada. Map building for a mobile robot from sensory data. IEEE Transactions on Systems, Man, and Cybernetics, November 1990.
[7] M. Asada et al. Representing global world of a mobile robot with relational local maps. IEEE Transactions on Systems, Man, and Cybernetics, November 1990.
[8] D. Barnea and H. Silverman. A class of algorithms for fast digital image registration. IEEE Transactions on Computers, February 1972.
[9] M. Beckerman and E. Oblow. Treatment of systematic errors in the processing of wide-angle sonar sensor data for robotics navigation. IEEE Transactions on Robotics and Automation, April 1990.
[10] B. Bhanu. CAD-based robot vision. IEEE Computer, August 1987.
[11] J. Bialasiewicz. Sufficient and epsilon-sufficient statistics in pattern recognition and their relation to fuzzy techniques. IEEE Transactions on Systems, Man, and Cybernetics, September 1989.
[12] F. Blais et al. Optical range image acquisition for the navigation of a mobile robot. Proceedings IEEE International Conference on Robotics and Automation, 1991.
[13] R. Bolles and A. Bobick. Exploiting temporal coherence in scene analysis for autonomous navigation. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[14] J. Borenstein and Y. Koren. Histogramic in-motion mapping for mobile robot obstacle avoidance. IEEE Transactions on Robotics and Automation, August 1991.
[15] J. Borenstein and Y. Koren. The vector field histogram - fast obstacle avoidance for mobile robots. IEEE Transactions on Robotics and Automation, June 1991.
[16] K. Boyer and A. Kak. Robotic manipulation experiments using structural stereopsis for 3D vision. IEEE Expert, Fall 1986.
[17] K. Boyer and A. Kak. Structural stereopsis for 3D vision. IEEE Transactions on Pattern Analysis and Machine Intelligence, March 1988.
[18] O. Bozma and R. Kuc. Building a sonar map in a specular environment using a single mobile sensor. IEEE Transactions on Pattern Analysis and Machine Intelligence, December 1991.
[19] R. Brockett. On the computer control of movement. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[20] R. Brooks. A robust layered control system for a mobile robot. IEEE Transactions on Robotics and Automation, March 1986.
[21] R. Brooks. Visual map making for a mobile robot. In Readings in Computer Vision. Morgan Kaufmann, 1987.
[22] R. Brooks. New approaches to robotics. Science, September 1991.
[23] R. Brooks and A. Flynn. Self calibration of motion and stereo vision for mobile robot navigation. Technical report, Massachusetts Institute of Technology Artificial Intelligence Laboratory, 1987.
[24] M. Brown. Locating object surfaces with an ultrasonic range sensor. Proceedings IEEE International Conference on Robotics and Automation, 1985.
[25] D. Brzakovic and L. Hong. Road edge detection for mobile robot navigation. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[26] A. Califano et al. Data and model driven foveation. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[27] H. W. Calkins and D. F. Marble. The transition to automated production cartography: Design of the master cartographic database. The American Cartographer, February 1987.
[28] J. Canny. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, November 1986.
[29] A. Chattergy. Some heuristics for navigation of a robot. Technical report, Robotics Research Lab, University of Hawaii, 1984.
[30] S. Chen and W. Tsai. Determination of robot locations by common object shapes. IEEE Transactions on Robotics and Automation, February 1991.
[31] C. K. Chow and T. Kaneko. Automatic boundary detection of the left ventricle from cineangiograms. Computers and Biomedical Research, May 1972.
[32] J. Clark and N. Ferrier. Control of visual attention in mobile robots. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[33] D. Coombs and C. Brown. Intelligent gaze control in binocular vision. Proceedings IEEE International Symposium on Intelligent Control, 1990.
[34] D. Coombs et al. Gaze control and segmentation. Proceedings AAAI Qualitative Vision Workshop, July 1990.
[35] I. Cox. Blanche - an experiment in guidance and navigation of an autonomous robot vehicle. IEEE Transactions on Robotics and Automation, April 1991.
[36] J. Crowley. Position estimation for an intelligent mobile robot. Technical report, Carnegie-Mellon University, 1983.
[37] J. Crowley. Dynamic world modeling for an intelligent mobile robot using a rotating ultrasonic ranging device. Proceedings IEEE International Conference on Robotics and Automation, 1985.
[38] J. Crowley. World modeling and position estimation for a mobile robot using ultrasonic ranging. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[39] M. Daily et al. Autonomous cross-country navigation with the ALV. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[40] L. Davis et al. RAMBO - vision and planning on the connection machine. DARPA Image Understanding Workshop, 1989.
[41] L. Davis and T. Kushner. Vision-based navigation: A status report. DARPA Image Understanding Workshop, 1987.
[42] D. DeMenthon and L. Davis. New exact and approximate solutions of the three-point perspective problem. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[43] M. Dhome and M. Richetin. Determination of the attitude of 3-D objects from a single perspective view. IEEE Transactions on Pattern Analysis and Machine Intelligence, December 1989.
[44] E. Dickmanns and A. Zapp. A curvature-based scheme for improving road vehicle guidance by computer vision. Proceedings SPIE Conference on Mobile Robots, 1986.
[45] M. Drumheller. Mobile robot localization using sonar. Technical report, Artificial Intelligence Lab, Massachusetts Institute of Technology, 1985.
[46] G. Dudek and M. Jenkin. Robotic exploration as graph construction. IEEE Transactions on Robotics and Automation, December 1991.
[47] H. Dulimarta. Personal communication. July, 1992.
[48] R. Dunley. Obstacle avoidance perception processing for the autonomous land vehicle. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[49] C. Durieu et al. Localization of a mobile robot with beacons taking erroneous data into account. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[50] A. Elfes. Sonar-based real-world mapping and navigation. IEEE Transactions on Robotics and Automation, June 1987.
[51] A. Elfes. Occupancy Grids: A Probabilistic Framework for Robot Perception and Navigation. PhD thesis, Carnegie-Mellon University, 1989.
[52] A. Elfes. Using occupancy grids for robot perception and navigation. IEEE Computer, June 1989.
[53] A. Elfes. Occupancy grids: A stochastic spatial representation for active robot perception. Proceedings of the 6th Conference on Uncertainty in AI, July 1990.
[54] A. Elfes. Dynamic control of robot perception using stochastic spatial models. Proceedings International Workshop on Information Processing in Mobile Robots, March 1991.
[55] B. Fehr and J. Gerrish. Vision-guided off-road vehicle. ASAE International Meeting, December 1989.
[56] J. Fei and C. Isik. Adapting to environmental variations in a rule-based mobile robot controller. Proceedings IEEE International Conference on Systems, Man, and Cybernetics, 1989.
[57] C. Fennema and A. Hanson. Experiments in autonomous navigation. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[58] C. Fennema and A. Hanson. Model-directed mobile robot navigation. IEEE Transactions on Systems, Man, and Cybernetics, November 1990.
[59] M. Fischler and R. Bolles. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, June 1981.
[60] K. Fok and M. Kabuka. An automatic navigation system for vision guided vehicles using a double heuristic and finite state machine. IEEE Transactions on Robotics and Automation, February 1991.
[61] H. Frohn and W. Seelen. Visocar: An autonomous industrial transport vehicle guided by visual navigation. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[62] A. Fukunaga. Statistical Pattern Recognition. Academic Press, 1990.
[63] G. Giralt et al. An integrated navigation and motion control system for autonomous multisensory mobile robots. Proceedings International Symposium on Robotics Research, 1984.
[64] P. Grandjean. 3-D modeling of indoor scenes by fusion of noisy range and stereo data. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[65] W. Grimson. Sensing strategies for disambiguating among multiple objects in known poses. IEEE Transactions on Robotics and Automation, December 1986.
[66] N. Griswold and J. Eem. Control for mobile robots in the presence of moving objects. IEEE Transactions on Robotics and Automation, April 1990.
[67] Y. Han et al. Pose determination using tree annealing. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[68] R. Haralick. Determining camera parameters from the perspective projection of a rectangle. Pattern Recognition, March 1989.
[69] S. Harmon. USMC ground surveillance robot: A testbed for autonomous vehicle research. Proceedings University of Alabama Robotics Conference, 1984.
[70] M. Hebert. Building and navigating maps of road scenes using an active sensor. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[71] M. Hebert et al. Terrain mapping for a roving planetary explorer. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[72] R. Hickling and S. Marin. The use of ultrasonics for gauging and proximity sensing in air. Journal of the Acoustical Society of America, April 1985.
[73] B. Horn. Robot Vision. MIT Press, 1986.
[74] Q. Huang. Personal communication. May, 1992.
[75] K. Hung et al. Robot location determination in a complex environment by multiple marks. Pattern Recognition, June 1988.
[76] T. Huntsberger et al. Representation of uncertainty in computer vision using fuzzy sets. IEEE Transactions on Computers, February 1986.
[77] T. Hwang and J. Clark. On local detection of moving edges. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[78] T. Hwang and J. Clark. A spatio-temporal generalization of Canny's edge detector. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[79] A. Jain and T. Newman. Range-intensity histogram for segmenting ladar images. Pattern Recognition, January 1992.
[80] T. Kanade. Picture processing by computer complex and recognition of human faces. Technical report, Kyoto University, Department of Information Science, 1973.
[81] T. Kanade and S. Shafer. Image understanding research at Carnegie-Mellon. DARPA Image Understanding Workshop, 1989.
[82] P. J. Kielczynski et al. Ring piezoelectric transducers radiating ultrasonic energy into the air. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, January 1990.
[83] D. Kite and M. Magee. Determining the 3D position and orientation of a robot camera using 2D monocular vision. Pattern Recognition, August 1990.
[84] P. R. Klarer. Autonomous land navigation in a structured environment. IEEE Aerospace and Electronic Systems Magazine, March 1990.
[85] K. Kluge and C. Thorpe. Explicit models for robot road following. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[86] Y. Koren and J. Borenstein. Potential field methods and their inherent limitations for mobile robot navigation. Proceedings IEEE International Conference on Robotics and Automation, 1991.
[87] D. Kriegman and E. Triendl. Stereo vision and navigation in buildings for mobile robots. IEEE Transactions on Robotics and Automation, December 1989.
[88] E. Krotkov. Mobile robot localization using a single image. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[89] E. Krotkov et al. An agile stereo camera system for flexible image acquisition. IEEE Transactions on Robotics and Automation, February 1988.
[90] E. Krotkov and R. Kories. Adaptive control of cooperating sensors: Focus and stereo ranging with an agile camera system. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[91] R. Kuc and B. Barshan. Navigating vehicles through an unstructured environment with sonar. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[92] B. Kuipers. Modeling spatial knowledge. Cognitive Science, February 1978.
[93] B. Kuipers and T. Levitt. Navigation and mapping in large-scale space. Artificial Intelligence, Summer 1988.
[94] S. Kuo and G. Cross. A two-step string-matching procedure. Pattern Recognition, July 1989.
[95] M. Landy. HIPS. Human Information Processing Laboratory, NYU, New York, New York.
[96] J. Latombe. Robot Motion Planning. Kluwer Academic, 1991.
[97] J. Le Moigne. Domain-dependent reasoning for visual navigation of roadways. IEEE Transactions on Robotics and Automation, August 1988.
[98] C. C. Lee. Fuzzy logic in control systems: Fuzzy logic controller, part II. IEEE Transactions on Systems, Man, and Cybernetics, March 1990.
[99] S. Lee and Y. Kay. An accurate estimation of 3-D position and orientation of a moving object for robot stereo vision: Kalman filter approach. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[100] J. Leonard and H. Durrant-Whyte. Mobile robot localization by tracking geometric beacons. IEEE Transactions on Robotics and Automation, June 1991.
[101] J. J. Leonard and H. F. Durrant-Whyte. Directed Sonar Sensing for Mobile Robot Navigation. Kluwer Academic, 1992.
[102] M. Leung and T. Huang. Detecting wheels of vehicle in stereo images. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[103] S. Liou and R. Jain. Road following using vanishing points. Computer Vision, Graphics, and Image Processing, March 1987.
[104] V. Lumelsky et al. Dynamic path planning in sensor-based terrain acquisition. IEEE Transactions on Robotics and Automation, August 1990.
[105] V. Lumelsky and T. Skewis. A paradigm for incorporating vision in the robot navigation function. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[106] R. Luo and M. Lin. Robot multi-sensor fusion and integration: Optimum estimation of fused sensor data. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[107] L. Matthies and A. Elfes. Integration of sonar and stereo range data using a grid-based representation. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[108] L. Matthies and S. Schafer. Error modeling in stereo navigation. IEEE Transactions on Robotics and Automation, June 1987.
[109] D. McDermott and E. Davis. Planning routes through uncertain territory. Artificial Intelligence, 1984.
[110] C. McGillem and T. Rappaport. Infrared location system for navigation of autonomous vehicles. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[111] D. Miller. Two dimensional mobile robot positioning using onboard sonar. Proceedings Pecora IX Remote Sensing Symposium, 1984.
[112] D. Miller. A spatial representation system for mobile robots. Proceedings IEEE International Conference on Robotics and Automation, 1985.
[113] A. Mitiche and G. Habelrih. Interpretation of straight line correspondences using angular relations. Pattern Recognition, March 1989.
[114] H. Moravec. The Stanford cart and the CMU rover. Proceedings IEEE, July 1983.
[115] H. Moravec and A. Elfes. High resolution maps from wide angle sonar. Proceedings IEEE International Conference on Robotics and Automation, 1985.
[116] N. Nandhakumar and J. Aggarwal. Thermal and visual information fusion for outdoor scene perception. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[117] H. Nasr and B. Bhanu. Landmark recognition for autonomous mobile robots. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[118] N. Nasrabadi et al. Stereo vision correspondence using a multi-channel graph matching technique. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[119] H. Nguyen and P. Heckman. Real-time pattern recognition for guidance of an autonomous undersea submersible. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[120] H. Nishihara and T. Poggio. Stereo vision for robotics. Proceedings International Symposium on Robotics Research, 1984.
[121] T. Olson and D. Coombs. Real-time vergence control for binocular robots. Technical report, Department of Computer Science, University of Rochester, 1990.
[122] D. Payton. An architecture for reflexive control. Proceedings IEEE International Conference on Robotics and Automation, 1986.
[123] T. Poggio et al. The MIT vision machine. DARPA Image Understanding Workshop, 1988.
[124] A. Preciado and D. Meizel. Fusion of multi-sensor data: A geometric approach. Proceedings IEEE International Conference on Robotics and Automation, 1991.
[125] J. Prewitt. Object enhancement and extraction. In B. Lipkin and A. Rosenfeld, editors, Picture Processing and Psychopictorics. Academic Press, 1970.
[126] P. Puget and T. Skordas. An optimal solution for mobile camera calibration. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[127] L. Ray. Estimation of modeled object pose from monocular images. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[128] L. G. Roberts. Machine perception of three-dimensional solids. In J. Tippett et al., editors, Optical and Electro-Optical Information Processing. MIT Press, 1965.
[129] M. Rokey. Remote mission specialist: A study in real-time, adaptive planning. IEEE Transactions on Robotics and Automation, August 1990.
[130] J. Roos. GPS comes onboard. Armed Forces Journal International, July 1990.
[131] Y. Roth-Tabak and R. Jain. Building an environment model using depth information. Technical report, University of Michigan, 1988.
[132] P. K. Sahoo and S. Soltani. A survey of thresholding techniques. Computer Vision, Graphics, and Image Processing, May 1988.
[133] K. Sarachik. Characterising an indoor environment with a mobile robot and uncalibrated stereo. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[134] J. Schneider and H. Dulimarta. Robot navigation via ultrasound sensors. Technical report, Computer Science Department, Michigan State University, 1989.
[135] J. Schwartz and M. Sharir. A survey of motion planning and related geometric algorithms. Artificial Intelligence, 1988.
[136] T. Shakunaga. 3-D corridor scene modeling from a single view under natural lighting conditions. IEEE Transactions on Pattern Analysis and Machine Intelligence, February 1992.
[137] M. Sharir and A. Schorr. On shortest paths in polyhedral spaces. SIAM Journal of Computing, February 1986.
[138] S. Shaw et al. Fusion of radar and optical sensors for space robotic vision. Proceedings IEEE International Conference on Robotics and Automation, 1989.
[139] Z. Shiller and Y. Gwo. Dynamic motion planning of autonomous vehicles. IEEE Transactions on Robotics and Automation, April 1991.
[140] K. Skifstad. High-Speed Range Estimation Based on Intensity Gradient Analysis. Springer-Verlag, 1991.
[141] B. Steer and T. Atherton. Design for navigation. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[142] F. Stein and G. Medioni. Efficient two dimensional object recognition. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[143] K. Sugihara. Some location problems for robot navigation using a single camera. Computer Vision, Graphics, and Image Processing, April 1988.
[144] H. Tahani and J. Keller. Information fusion in computer vision using the fuzzy integral. IEEE Transactions on Systems, Man, and Cybernetics, May 1990.
[145] C. Thorpe. The CMU Rover and the FIDO vision and navigation system. Proceedings Symposium on Autonomous Underwater Robots, 1983.
[146] A. Tirumalai and B. Schunck. Dynamic stereo with self-calibration. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[147] Transitions Research Corporation, Danbury, CT. TRC Labmate User Manual, 1989.
[148] M. Trivedi et al. Developing robotic systems with multiple sensors. IEEE Transactions on Systems, Man, and Cybernetics, November 1990.
[149] R. Tsai and R. Lenz. Real time versatile robotics hand/eye calibration using 3D machine vision. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[150] T. Y. Tseng and C. M. Klein. New algorithm for the ranking procedure in fuzzy decisionmaking. IEEE Transactions on Systems, Man, and Cybernetics, September 1989.
[151] R. Tummala and C. Lin. Mobile robot navigation using sensory feedback. Proceedings IEEE International Conference on Systems Engineering, 1991.
[152] M. Turk et al. VITS - a vision system for autonomous land vehicle navigation. IEEE Transactions on Pattern Analysis and Machine Intelligence, May 1988.
[153] R. Wallace et al. First results in robot road following. Proceedings International Joint Conference on Artificial Intelligence, 1985.
[154] C. Wang. Location estimation and uncertainty analysis for mobile robots. Proceedings IEEE International Conference on Robotics and Automation, 1988.
[155] C. M. Wang. Error analysis of spatial representation and estimation of mobile robots. Technical report, General Motors Research Laboratories, 1986.
[156] Y. Watanabe and S. Yuta. Position estimation of mobile robots with internal and external sensors using uncertainty evolution technique. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[157] A. Waxman et al. A visual navigation system for autonomous land vehicles. IEEE Transactions on Robotics and Automation, April 1987.
[158] S. Xie. View planning for mobile robots. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[159] L. A. Zadeh. Fuzzy sets. Information and Control, August 1965.
[160] W. R. Zhang et al. POOL2: A generic system for cognitive map development and decision analysis. IEEE Transactions on Systems, Man, and Cybernetics, January 1989.
[161] Z. Zhang and O. Faugeras. Building a 3D world model with a mobile robot: 3D line segment representation and integration. Proceedings IEEE International Conference on Robotics and Automation, 1990.
[162] J. Zheng and S. Tsuji. Panoramic representation of scenes for route understanding. Proceedings IEEE International Conference on Robotics and Automation, 1990.