This is to certify that the dissertation entitled "Reconstruction of Line Drawing Graphs from Fused Range and Intensity Imagery," presented by Greg Chungmou Lee, has been accepted towards fulfillment of the requirements for the PhD degree in Computer Science, Michigan State University.

RECONSTRUCTION OF LINE DRAWING GRAPHS FROM FUSED RANGE AND INTENSITY IMAGERY

By

Greg Chungmou Lee

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Computer Science Department

1992

ABSTRACT

RECONSTRUCTION OF LINE DRAWING GRAPHS FROM FUSED RANGE AND INTENSITY IMAGERY

By Greg Chungmou Lee

This thesis addresses the problem of extracting a labeled line drawing graph from registered range and intensity imagery. The wing representation [26] is used as an intermediate step toward this goal.

To attain the wing representation of an imaged scene, a low level feature detection algorithm for extracting wing primitives is proposed. The procedure is largely based on a hypothesize-and-test strategy. In a small neighborhood of the fused image, the existence of various wing primitives is hypothesized. Verification of a wing hypothesis is posed as a non-linear optimization problem in which the 2D intensity contour constraint, whenever available, is used to aid the recovery of surface parameters. Fitting functions that relate various quadric surface forms to their limb contours, and intersecting quadric surfaces to the projection of their crease edges, are derived to define the constraints of fused data fitting. The limb contours of quadric surfaces are also instrumental in deducing good initial parameter estimates for the non-linear optimization scheme. Results from Monte Carlo studies on the accuracy of parameter estimates and on surface shape discrimination by fitting fused, surface-only and contour-only data confirm the importance of the limb contour constraints. Further study shows that only very few data points are needed to achieve quality results. Experimental results of wing detection on real images indicate that a recognition system probably can be built upon the outlined primitive detection subsystem.

Given the wing representation of the scene, the labeled line drawing graph (LLDG) representation is constructed. Three origami and polygonal scene reconstruction procedures are given to analyze scenes under various restrictive assumptions. With very idealistic assumptions about the wing samples, the complete and unique LLDG is tenable if resampling of the range image is allowed. By studying the geometric properties of line drawing graphs, a set of necessary but not sufficient rules is derived to reconstruct the LLDG without having to resample the range image; however, uniqueness of the reconstructed LLDG is not guaranteed. In dealing with imperfect wing sampling, a heuristic-based algorithm is devised to handle the possibility of missing and spurious wings and the inherent measurement and computation errors. Methods for reconstructing quadric surface scenes are proposed but not implemented.
All of the implemented algorithms have been tested on both synthetic and real images. Parallel renditions are natural but not yet attempted. It is shown that reconstruction of labeled line drawing graphs from raw fused images via wing features is plausible.

Copyright © by Greg Chungmou Lee, 1992

To my parents and my wife, Hsin-Chen

ACKNOWLEDGMENTS

I wish to thank my thesis advisor, Professor George Stockman, for his guidance toward the successful completion of this thesis. His direction, encouragement, and patience helped keep me focused and motivated. I sincerely appreciate his spending countless hours at night and on weekends to analyze the problems I encountered in research. He has influenced me in a positive way both academically and personally. Acknowledgment is also due to my thesis committee members, Professors Anil Jain, Mihran Tuceryan, Wen-Jing Hsu and Connie Page, whose invaluable suggestions helped make a better thesis. Special thanks to Dr. Jain for equipping and maintaining the Pattern Recognition and Image Processing (PRIP) laboratory with first class equipment for conducting research. I value the critical comments of Professor Richard Dubes during my PRIP presentations and his advice on conducting Monte Carlo experiments. I also wish to thank Professors Richard Reid and Ruth Barton for giving me the opportunity to participate on the ACM Programming team. The experience was both interesting and worthwhile.

The alliance of the gang-of-four, Deborah Trytten, Narayan Raja, Sateesha Nadabar, and I, came about when we took the comprehensive examination together. We all started on thesis writing at about the same time and defended our theses within one week of each other. We shared the joy of successful career searches and cheered up the spirit when one of us was down. Their emotional support was instrumental to the completion of this thesis. To them, I say thank you, my friends.

I am grateful for the friendship and the thought-provoking discussions with many former and current PRIPies. I especially wish to thank Drs. Chaur-Chin Chen and Sei-Wang Chen for introducing me to the wonderful research area of Computer Vision, and Dr. Patrick Flynn for paving the way for future research using range images by providing many useful software tools for obtaining and processing range images. I value the advice of Dr. Joseph Miller and Dr. Farshid Farrokhnia on research and on thesis preparation.

I also wish to thank all other PRIPies, especially Philippe Ballard, Sushil Bhattacharjee, Jinlong Chen, Yao Chen, John Courtney, Chittra Dorai, Marie-Pierre
Dubuisson, Hans Dulimarta, Sally Howden, Qian Huang, Jian-Chang Mao, Arun Nanda, Tim Newman, Philippe Ohanian, and Sharathchandra Pankanti, for making my tenure an enjoyable one. I value their kinship and support in allowing the gang-of-four to seize control of the PRIP lab during the thesis writing stage. I am especially grateful to Steve Walsh for proofreading this thesis at a time when he was also trying to finish up his own research. My sincere appreciation also goes to the PRIP manager, John Lees, and former manager, Dave Marks, for maintaining a stable research environment throughout the years.

This research has been supported in part by the National Science Foundation under grants CDA-8806599 and IRI-8903628, by the Graduate School Fellowship, by the College of Engineering Research Fellowship, and by the Equal Opportunity Fellowship of Michigan State University. Special thanks to Professor Anthony Wojcik and the Department of Computer Science for supporting me with a Graduate Assistantship throughout my graduate studies and for the travel funds that enabled me to attend conferences held around the country.

Finally, I must pay tribute to the people who have made a big difference in my life. My most sincere gratitude goes to Mr. and Mrs. Hugh Fosters for their guidance through my beginning years in the United States. They are most instrumental to my current success. I acknowledge the love of my parents for providing me with the best education, of my family for their encouragement along the way, and especially of my wife, Hsin-Chen, for her understanding and support throughout my education.

TABLE OF CONTENTS

LIST OF TABLES ..... xii
LIST OF FIGURES ..... xiii
1 Introduction ..... 1
1.1 Problem Definition ..... 3
1.1.1 Wing Representation Theory ..... 4
1.2 Representational Schemes for 3D Object Recognition ..... 7
1.3 Organization of the Dissertation ..... 11
2 Formalization of Object Domain and Wing Models ..... 14
2.1 Introduction ..... 14
2.2 Object Domain Definitions ..... 15
2.2.1 δ-polygons ..... 16
2.2.2 δ-quadric surfaces ..... 16
2.2.3 Sample of Objects in the Object Domain ..... 19
2.3 Accidental and Non-Accidental Viewpoints ..... 21
2.4 The Line Drawing Graph (LDG) ..... 23
2.5 What are Wings? ..... 24
2.5.1 Wing Primitives ..... 25
2.5.2 Derivation of Wing Contour Equations ..... 27
2.5.3 Deriving P2D from P3D ..... 28
2.6 Registered Fused Images ..... 34
2.7 Summary ..... 35
3 Wing Detection ..... 37
3.1 Introduction ..... 37
3.2 Survey of Related Background ..... 38
3.2.1 Edge Detection in Intensity Imagery ..... 39
3.2.2 Segmentation of Range Imagery ..... 40
3.2.3 3D Surface Reconstruction: Superquadric and Quadric Fitting ..... 41
3.3 Wing Detection Algorithm
3.3.1 Tesselation of the Input Image
3.3.2 Detecting Presence of Wings
3.3.3 Fitting Wing Primitives
3.4 Simultaneous Boundary and Surface Fitting
3.4.1 Fitting Equations
3.4.2 Distance Functions
3.4.3 χ² Merit Functions
3.4.4 Initial Parameters
3.4.5 An Example
3.5 Experiments with Synthetic Data
3.5.1 Fitting True Models to Contour, Surface, and Fused Data
3.5.2 Surface Classification Experiments
3.5.3 Simulation Results Summary
3.6 Fitting with Different Number of Surface and Contour Points
3.7 Experiments with Real Fused Imagery
3.7.1 JIG1 Example
3.7.2 CYL2 Example
3.7.3 Open Cup vs. Lid Cup
3.7.4 Summary on 10 Real Images
3.7.5 More Wing Detection Examples
3.8 Summary
4 Perfect Reconstruction of LLDG - Origami/Polyhedral World
4.1 Introduction
4.2 Survey of Related Background
4.2.1 Model Based Interpretation of Line Drawings
4.2.2 Model-Free Interpretation of Line Drawings - Polyhedral and Origami World
4.2.3 Model-Free Interpretation of Line Drawings - Curved Surface World
4.3 Reconstruction of LDG Via Resampling
4.3.1 Assumptions
4.3.2 Algorithmic Approach
4.3.3 Resampling Reconstruction Algorithm: POLY-1
4.4 Reconstruction Via Geometric Constraints
4.4.1 New Terminology
4.4.2 Geometric Constraints Imposed by Projection of a δ-polygon ..... 113
4.4.3 Geometric Constraint Based Decision Rules ..... 117
4.4.4 Complete Reconstruction Algorithm: POLY-2 ..... 123
4.4.5 Example - Paddle Wheel ..... 124
4.4.6 Merging of Surfaces ..... 125
4.5 Experimental Results ..... 131
4.6 Worst Case Time Complexities of POLY-1 and POLY-2 ..... 132
4.7 Summary ..... 140
5 Partial Reconstruction of LLDG - Origami/Polyhedral World ..... 142
5.1 Introduction ..... 142
5.2 Survey of Related Background ..... 143
5.3 Heuristic LLDG Reconstruction Algorithm ..... 146
5.3.1 Preprocessing of Wing Samples ..... 147
5.3.2 Clustering of Wings ..... 148
5.3.3 Notion of Maybe-Segments ..... 149
5.3.4 Heuristic Rules ..... 149
5.3.5 Merging of Planar Faces ..... 155
5.3.6 Heuristic Reconstruction Algorithm: POLY-3 ..... 158
5.4 Examples ..... 159
5.5 Worst Case Time Complexity of POLY-3 ..... 165
5.6 Summary ..... 169
6 Reconstruction of the LLDG - Quadric Surface Scenes ..... 172
6.1 Introduction ..... 172
6.2 Perfect Reconstruction Algorithm ..... 173
6.2.1 Wing Sensing Assumptions ..... 173
6.2.2 Wing Clustering ..... 174
6.2.3 Extension of Wings ..... 174
6.2.4 Reconstruction of Individual Wing Group ..... 175
6.2.5 Quadric LLDG Reconstruction Algorithm: QUAD-1 ..... 177
6.3 Heuristic Reconstruction Algorithm ..... 180
6.3.1 Wing Clustering ..... 180
6.3.2 Wing Extension ..... 181
6.3.3 A Heuristic Approach ..... 182
6.3.4 Heuristic Quadric LLDG Reconstruction Algorithm: QUAD-2 ..... 189
6.3.5 Examples ..... 189
6.4 Summary ..... 190
7 Summary, Conclusions and Recommendations for Future Research ..... 194
7.1 Summary and Conclusions ..... 194
7.2 Contributions ..... 198
7.3 Recommendations for Future Research ..... 200
A Creation of Fused Imagery ..... 203
B χ² Merit Function for Fused Fitting ..... 214
C Parameters of the Wing Detection and the LLDG Reconstruction Algorithms ..... 216
BIBLIOGRAPHY ..... 221

LIST OF TABLES

2.1 List of all 60 half-wing primitives ..... 27
2.2 List of 38 full wing primitives ..... 28
3.1 Range (in inches) of parameters ..... 64
3.2 Initial estimate of the model parameters ..... 65
3.3 Fitting Spherical Model on Spherical Data ..... 66
3.4 Fitting Cylindrical Model on Cylindrical Data ..... 68
3.5 Fitting Conical Model on Conical Data ..... 70
3.6 Fitting Planar Model on Planar Data ..... 71
3.7 Fitting all models to Spherical surface patch ..... 73
3.8 Fitting all models to Cylindrical surface patch ..... 75
3.9 Fitting all models to Conical surface patch ..... 76
3.10 Fitting all models to Planar surface patch ..... 78
3.11 Percentage of bad fused fits versus number of surface and contour points ..... 79
3.12 Surface-only fitting with various numbers of surface points ..... 80
3.13 Summary of wing detection of 10 real images ..... 88
4.1 Rest of the processing steps of the Paddle Wheel example ..... 125
5.1 Summary of the LLDG of the 7 real images ..... 166
C.1 Parameters of the wing detection algorithm ..... 217
C.2 Surface Model parameters ..... 219
C.3 Parameters of the POLY-3 algorithm ..... 220
LIST OF FIGURES

1.1 High level description and an example of wing detection and LLDG reconstruction process
1.2 Unique wing representation of a bowl, a half grapefruit and a clam, which all have the same line drawing [27]
1.3 Aspect graph of a tetrahedron (from [71])
2.1 Examples of valid and invalid δ-polygons
2.2 Examples of valid and invalid δ-quadrics
2.3 (a) Examples of objects in the object domain. (b) Examples of objects not in the object domain
2.4 (a) A non-accidental view of a solid box, which is considered accidental under [83, 91]. (b) An accidental view of a topless box. (This would not be an accidental view if the side containing a, b, c were removed from the box!)
2.5 Representation of a coke can. (a) Line labeling using Malik's [83] junction catalogue. (b) Chen's wing [27] representation. (c) LLDG representation
2.6 Setup of White Scanner and the two coordinate frames
3.1 Flowchart of proposed wing detection algorithm
3.2 Tesselation of input image
3.3 Wing detection within each subimage w_i
3.4 Data points that are close to the fitted curve may have large residual errors
3.5 Computing the Euclidean distance from a point in 3D to the surface of a cone
3.6 Fitting wing models to a window with cylindrical surface patch
3.7 Detected wings of the JIG image
3.8 Detected wings of the CYL image
3.9 Wing representation can distinguish an opened cup from a cup with a lid ..... 86
3.10 Wing detection examples of quadric surface scenes ..... 90
4.1 Wing representation and reconstruction of the Big-Block (from [26]) ..... 94
4.2 Huffman-Clowes junction catalogue ..... 98
4.3 Problems with Huffman-Clowes labeling scheme ..... 99
4.4 Origami junction types and number of possible junction labels for each type (from [66]) ..... 100
4.5 Junction catalogue for piecewise smooth surfaces [83] ..... 104
4.6 A piece of nose cone is not in Malik's [83] object domain ..... 104
4.7 Deciding on LDG line segments (Figure originated from [26]) ..... 108
4.8 Graphical interpretation of newly defined functions ..... 113
4.9 Graphical illustration of Theorem 4.3 ..... 116
4.10 Multiple region labels are possible if wings are collinear ..... 118
4.11 Graphical illustration of the redefined and the new functions ..... 120
4.12 Graphical illustration of Rule 4.3 ..... 122
4.13 Graphical illustration of Rule 4.4 ..... 123
4.14 Reconstruction of the background wing group of the Paddle Wheel ..... 128
4.15 Reconstructed silhouette (background) of the Paddle Wheel ..... 129
4.16 Applying Algorithm 2 on front face of the Big Block ..... 130
4.17 Reconstructed LDG of the Big Block ..... 131
4.18 Complete reconstruction of the Big Block ..... 133
4.19 More examples of syntactic scenes that have been reconstructed ..... 135
4.20 Reconstruction of the LLDG of a view of the gb3 block ..... 136
5.1 Falk's heuristic rule for completing imperfect line drawings [34] ..... 145
5.2 Spurious line could be added with Falk's [34] rules ..... 146
5.3 Some examples of Maybe-segments and non-Maybe-segments ..... 150
5.4 Graphical illustration of Heuristic Rules 2-6 ..... 152
5.5 Problems encountered when merging faces into final LDG and their solutions ..... 157
5.6 Reconstruction of the JIG via algorithm POLY-2 ..... 161
5.7 Examples of reconstructed LLDG via algorithm POLY-2 ..... 164
6.1 Examples of reconstruction of LLDG from wing samples of quadric scenes ..... 179
6.2 Determining set of all possible junctions in the LDG ..... 184
6.3 Graphical illustration of the heuristic rules for reconstructing LDG of curved object scenes ..... 186
6.4 Reconstruction of LDG of a curved surface scene ..... 187
6.5 Reconstruction of LLDGs from imperfect wing samples using heuristic algorithm QUAD-2 ..... 191
6.6 More LLDGs that are reconstructed by the heuristic algorithm QUAD-2 ..... 192
A.1 Setup of White Scanner and the two coordinate frames ..... 205
A.2 A complete example: (a) original range image, (b) original intensity image, (c) calibration matrix, (d) fused range image, (e) fused intensity image ..... 207
A.3 Relating World coordinate frame to Camera-centered coordinate frame. To align the axes: rotate {C} about the Xc-axis by 45 degrees, then rotate about the new Z-axis by -90 degrees, followed by a 180 degree rotation about the new Y-axis ..... 208
A.4 Fused range (left) and intensity (right) images ..... 213

CHAPTER 1

Introduction

Human vision is so effortless that one often forgets that it is a very difficult task. Most people assume that the eye furnishes the brain with a description of the world in view. This is not so. The eye is just an input sensor; the visual cortex of the human brain is our primary organ of vision. The exact workings of the human visual system are still unclear. Nevertheless, vision researchers continue to build computer vision systems to mimic human vision.

Computer vision has gone a long way since the days of Roberts [108]. The ultimate goal of a computer vision system is the recognition of the objects in the scene and the interpretation of their relationships.
Thus it is not surprising that most research activities in the computer vision field concentrate on (1) building a better scene interpreter; (2) bettering the techniques of various object recognition modules; or (3) proposing altogether new object recognition and representation paradigms. Central to any object recognition system is the issue of object representation and feature extraction, which would be the function of the visual cortex in the human visual system. This thesis proposes new methods for extraction of wing features from raw input imagery and for construction of the labeled line drawing of the scene. An example of the goal of this work is depicted in Figure 1.1.

Figure 1.1. High level description and an example of wing detection and LLDG reconstruction process. The top image shows the intensity portion of the fused imagery. The middle image shows the ideal set of wings that are extracted from the raw image. The bottom image shows a LLDG reconstructed from an ideal set of wing samples as discussed in Chapter 6.

In the next section, we present an overview of the problem studied in this dissertation and a quick survey of different object representation schemes in the literature. An outline of the organization of this thesis concludes the chapter.

1.1 Problem Definition

Prospects for machine understanding of industrial or household scenes have been greatly enhanced in the last ten years by the development of better sensors, by successful modeling of certain lower level functions of the various channels involved in vision, and by progress in model-based object recognition. The concept recognition implies that there must be some model in memory which is somehow evoked by the sensory data at hand. Models can range in generality from the very specific object definitions used in CAD/CAM [41], through parameterized forms such as the superquadrics [98], to the qualitative models formed by geons and their relations [13]. (Note that a more limited view is being taken here by not considering recognition by classical pattern recognition [33] or recognition by function [125, 113].) The character of the model is not the only crucial issue; the strategy or algorithm for matching the memory model to the sensory data is just as critical [40, 63, 80]. Central to any object recognition system is the issue of object representation and feature extraction, two of the most researched topics in the computer vision and object recognition community.

An approach somewhat opposite to model-based recognition is that of line drawing analysis. In this approach, an attempt is made to interpret the 3D structure of surfaces, edges, and vertices by interpretation of the line drawing of object contours in a 2D image [61, 66, 83, 92, 115, 123]. Usually no model database is assumed, only general information about the geometry and topology of objects.
As a result, there is a potential for more generality of object definition and for bottom-up processing.

The major weakness in most line drawing analysis schemes derives from the assumption that the line drawings are perfect, when, in fact, years of work show that perfect line drawings cannot be obtained from real data. Moreover, even if they were obtainable, ambiguous interpretations would still arise unless some real 3D information is added (see [84] for one way to do this).

The work presented in this thesis addresses two important problems in object recognition and scene interpretation:

1. A proposal is made for the detection of object primitives in fused intensity-range data via simultaneously fitting intensity contours and the adjacent range (surface) data. The quadric form and its projection is proposed for modeling object limbs, and the intersection of quadric forms is proposed for modeling crease edges.

2. Construction of a labeled line drawing graph from the set of locally detected object primitives rather than from a perfect line drawing graph. The line drawing, once derived, should provide many features to index into the models of a higher-level system. By organizing data-directed processes to create the line drawing representation, effects of the segmentation problem are greatly reduced. Hence, this work provides a partial solution to the "segmentation problem", which has proven to be combinatorially difficult for most model-based schemes.

In the next subsection, we briefly introduce the wing representation theory and offer a quick view of our methodology for extracting wing features.

1.1.1 Wing Representation Theory

Wing representation theory was introduced by Chen [26, 27] for 3D object recognition. Instead of modeling objects by parts [53], objects are modeled by a set of views composed of both simple wings and composite wings and their spatial structures. A simple wing is defined as a triplet that includes a pair of surface patches separated by a fragment of image contour. Through non-accidental relations such as cotermination, symmetry, parallelism, connectivity, collinearity and curvilinearity, object wings are grouped into composite wings. Using wing representation, objects with similar projected line drawings can be distinguished easily. For example, in Figure 1.2 a bowl, a half grapefruit and a clam all have the same line drawing, but can be uniquely represented using wings.

Figure 1.2. Unique wing representation of a bowl, a half grapefruit and a clam, which all have the same line drawing [27].
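To make the notion concrete, a simple wing can be pictured as a small record pairing a contour fragment with its two adjacent surface patches. The sketch below is only illustrative; the field names and shape vocabularies are assumptions made for exposition and are not the notation developed in later chapters.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Illustrative vocabularies; the actual wing catalogue is enumerated in Chapter 2
# (Tables 2.1 and 2.2 list 60 half-wing and 38 full-wing primitives).
CONTOUR_SHAPES = ("linear", "circular", "elliptic")
SURFACE_SHAPES = ("planar", "spherical", "cylindrical", "conical")

@dataclass
class SimpleWing:
    """A simple wing: a fragment of image contour together with the pair of
    surface patches that meet along it (one side may be background)."""
    contour_shape: str                  # one of CONTOUR_SHAPES
    contour_params: Tuple[float, ...]   # 2D curve coefficients in the image plane
    surface_a: Optional[str]            # surface shape on one side, or None
    surface_b: Optional[str]            # surface shape on the other side, or None
```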
In [28], a computational framework for construction of the multiple-view wing representations of general 3D rigid objects was proposed. With it came the preliminary conclusion that wing representations have the properties of uniqueness, locality, and stability. However, no satisfactory wing extraction method was proposed. A more in-depth treatment of wing definition will be given in Chapter 2.

Wing representation offers a good basis for an object recognition system. It was conjectured that wing features are the highest level scene features that can be extracted from a raw input scene without using any specific object models or any prior scene segmentation [26]. But as with other theoretical work, an effective feature extractor remains to be discovered.

In this dissertation, we strive to

1. push forward toward automatic wing detection; and

2. complete (partial) reconstruction of a labeled line drawing graph from the set of detected wings.

Wings are to be extracted from a single registered range and intensity image. In a local area of interest in the fused image, the existence of various wing primitives is hypothesized. Verification of a wing hypothesis is posed as a non-linear optimization problem in which the 2D contour constraint, whenever available, is used to aid the recovery of surface parameters. The 2D contour is also useful in generating the initial parameter estimates required by the non-linear optimization technique. Verification of a wing primitive gives qualitative as well as quantitative information about the adjacent surface patches. No specific object models are used. Wings are extracted from the input image without any pre-processing or knowledge of the explicit object surfaces present in the scene.

Rather than the usual two-step process in scene interpretation via line drawings (construction of a perfect line drawing from an input image followed by analysis of the derived line drawing), we reason that a line drawing and its interpretation should be produced concurrently. Specifically, given the set of sampled wings from a raw image, we reconstruct the labeled line drawing graph (LLDG) in one step. The issue involved is whether the wing representation is adequate for line drawing reconstruction. We will explore the geometric constraints underlying the line drawing graph and derive reconstruction rules based on those constraints. Deterministic algorithms for complete line drawing reconstruction of origami, polygonal and quadric surface scenes will be given. Heuristic algorithms for generating partial line drawings will also be provided.
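The verification step just described amounts to a joint least-squares problem over 3D surface residuals and 2D contour residuals. The following is a minimal sketch of that idea for a single hypothesized cylindrical wing under orthographic projection; the parameterization, weighting, and function names are assumptions made for this sketch and do not reproduce the χ² formulation developed in Chapter 3.

```python
import numpy as np
from scipy.optimize import least_squares

def fit_cylinder_fused(surface_pts, contour_pts, x0, w_contour=1.0):
    """Jointly fit a right circular cylinder to sparse 3D surface points and to
    2D limb-contour points, assuming orthographic projection along the z-axis.

    Parameter vector p = (ax, ay, cx, cy, r): axis direction (unit vector with
    positive z component built from ax, ay), a point (cx, cy, 0) on the axis,
    and the radius r.  This parameterization is illustrative only.
    """
    surface_pts = np.asarray(surface_pts, dtype=float)   # shape (N, 3)
    contour_pts = np.asarray(contour_pts, dtype=float)   # shape (M, 2)

    def residuals(p):
        ax, ay, cx, cy, r = p
        az = np.sqrt(max(1.0 - ax * ax - ay * ay, 1e-9))
        a = np.array([ax, ay, az])                       # unit axis direction
        c = np.array([cx, cy, 0.0])                      # point on the axis
        # Surface residual: distance of each 3D point from the cylinder wall.
        d = surface_pts - c
        radial = d - np.outer(d @ a, a)
        r_surf = np.linalg.norm(radial, axis=1) - r
        # Contour residual: under orthographic projection along z, the limb of
        # the cylinder projects onto lines offset by r from the projected axis.
        axis2d = np.array([ax, ay])
        axis2d = axis2d / (np.linalg.norm(axis2d) + 1e-12)
        n2d = np.array([-axis2d[1], axis2d[0]])          # in-plane normal
        r_cont = np.abs((contour_pts - c[:2]) @ n2d) - r
        return np.concatenate([r_surf, w_contour * r_cont])

    return least_squares(residuals, x0)
```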
To the best of our knowledge, Trytten [119] is the only other work that tries to simultaneously derive and label the line drawing graph. However, this work differs from Trytten's in several aspects, including our use of fused imagery, different classes of objects, and Trytten's usage of a junction catalogue.

1.2 Representational Schemes for 3D Object Recognition

This thesis is about scene representation. In this section, we briefly survey some of the more popular object representation schemes in use by the computer vision community for the purpose of object recognition. [11, 29] provide in-depth surveys of various 3D object representation and recognition techniques.

Generalized Cylinder

Generalized cylinders or generalized cones were introduced by Binford [14] to model 3D objects. A generalized cylinder is defined by an axis, a cross-section function, a sweeping rule and the eccentricity of the generalized cylinder. The axis is an arbitrary 3D curve and corresponds to the spine of the object being described. The cross-section is of any arbitrary planar shape that may change in shape and size as it is swept along the axis. The sweeping rule governs how the size of the cross-section changes, and the eccentricity is the angle between the axis and the cross-section plane. For example, a right circular cylinder can be represented as a circular cross-section, swept along a straight axis with no change in shape or size of the cross-section and with an eccentricity of 90 degrees.

The generalized cylinder is a volumetric shape descriptor that is invariant to rotation, scale and illumination changes. For this reason, it has received considerable attention in the past. While Marr proposed an object recognition system based on the generalized cylinder [86], Brooks developed ACRONYM [21], a model-based object recognition system that explicitly models objects as generalized cones. Others have concentrated on extracting generalized cylinders from raw input images ([2, 94] from dense range images and [105] from intensity images).

Geons

Biederman [13] proposed a set of 36 volumetric primitives called Geons, or Geometric Ions, for use in his theory of Recognition-by-Components (RBC). Geons are defined by a set of four qualitative features: edge (straight or curved), symmetry (rotational/reflective, reflective, asymmetric), size variation (constant, expanding, expanding/contracting) and axis (straight, curved); they form a sub-class of the generalized cylinders.

An object is represented by a set of geons, denoting its components, and the structural relations among pairs of geons. Given two connected geons G1 and G2, the relations include (1) Relative size: G1 could be greater than, smaller than or approximately equal to G2; (2) Verticality: G1 could be above, below or side-connected to G2; (3) Centering: the point of attachment could be centered or off-centered on a geon's surface; (4) Relative size of surfaces at join: the join could be on a long or short surface of G1 and G2.

Biederman's simple calculation showed that approximately 75 thousand and over 154 million qualitatively different objects can be constructed using only 2 and 3 geons, respectively, and their inter-geon relationships. He argued that liberal estimation shows that human language vocabulary consists of only 30,000 readily discriminable objects, implying that the set of 36 geons along with their inter-geon relations has sufficient representational power.

An edge based procedure for segmenting an image into geon components was also proposed. First, a line drawing of the objects was obtained, from which edge properties such as curvature, collinearity, symmetry, parallelism and co-termination were detected. The objects were segmented at concave regions and the identity of each component was derived from the edge properties. The extraction of the line drawing is the critical and the most difficult step. Obtaining geon-like parts directly from range images has recently been addressed by Raja [103, 104].

Superquadrics

Instead of modeling object parts by qualitative shapes such as geons or generalized cylinders, Pentland [97] suggested the use of superquadrics. The basic parts are thought of as "lumps of clay" which may be deformed and reshaped by stretching, bending, twisting or tapering. These basic "lumps of clay" can then be combined using Boolean operations to form new, more complex prototypes which again can be deformed, reshaped and combined.

Superquadrics are defined by the equation

\[
X(\eta,\omega) \;=\; \begin{pmatrix} a_1\, C_\eta^{\varepsilon_1} C_\omega^{\varepsilon_2} \\ a_2\, C_\eta^{\varepsilon_1} S_\omega^{\varepsilon_2} \\ a_3\, S_\eta^{\varepsilon_1} \end{pmatrix},
\qquad -\frac{\pi}{2} \le \eta \le \frac{\pi}{2}, \quad -\pi \le \omega \le \pi,
\]

where C_η = cos(η) and S_ω = sin(ω). The parameters have intuitive geometric meanings. Parameters a1, a2 and a3 affect the size of the superquadric along the x, y and z axes, respectively. ε1 and ε2 govern the squareness of the superquadric along the z-axis and in the xy-plane. Parameters η and ω are the latitude and longitude angles of the vector X in spherical coordinates.

Recovering superquadrics in general position requires estimating 11 parameters from the surface patch. These are location (3), orientation (3), size (3) and shape (ε1 and ε2) parameters.
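As a concrete illustration of this parameterization, the surface can be evaluated directly from (η, ω). The signed-power convention used below is a common implementation detail assumed for this sketch; it is not part of the formulation as quoted above.

```python
import numpy as np

def _spow(base, expo):
    # Signed power: keeps the surface defined when the cosine/sine is negative.
    return np.sign(base) * np.abs(base) ** expo

def superquadric_point(eta, omega, a1, a2, a3, eps1, eps2):
    """Evaluate X(eta, omega) for a superquadric in its canonical frame."""
    ce, se = np.cos(eta), np.sin(eta)
    cw, sw = np.cos(omega), np.sin(omega)
    return np.array([
        a1 * _spow(ce, eps1) * _spow(cw, eps2),   # x component
        a2 * _spow(ce, eps1) * _spow(sw, eps2),   # y component
        a3 * _spow(se, eps1),                     # z component
    ])

# eps1 = eps2 = 1 yields an ellipsoid; values near 0.1 yield a box-like shape.
p = superquadric_point(0.3, 1.2, a1=2.0, a2=1.0, a3=0.5, eps1=1.0, eps2=1.0)
```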
Further- more, those volumetric parts model object parts as a 3D solids; hence, no origami objects can be satisfactorily represented. In addition, the images are not 3D but 2.5D. Blindly modeling the backside of an object that is not visible in the image can cause false alarms in model-based recognition systems. Aspect Graph Representation In contrast to parts-based representation, aspect graphs are a multiple-view repre- sentation of 3D objects. Koenderink and van Doorn [71] introduced the idea of using the aspect graph of topologically distinct views of an object to represent its shape. Each node, defined as an aspect, in the aspect graph represents a “stable view” of the object as seen from some maximal connected region of viewpoint space. Object views from this connected region of viewpoint space appear qualitatively similar. An arc joining two aspects represents a possible transition between two stable views by crossing from one region to another. To complete the definition of an aspect graph, all possible stable views and transitions between such stable views are represented y some aspects and arcs in the aspect graph. Figure 1.3 shows an example aspect raph for a tetrahedron. Figr This concept of scheme for object 1 matic generation of orthographic and pe theory coupled with object recognition 8} 1.3 Organi. The remainder of thi define the object don line drawing graph (I Will derive the set of btsed on those definit escrrbe the process c 11 \0 T\ V\w /a;?/_W n// Figure 1.3. Aspect graph of a tetrahedron (from [71]). 4—0— This concept of aspect graph, intuitiVely, seems to be a powerful representation scheme for object recognition. Many research efforts have been devoted to auto- matic generation of aspect graphs of object models in various domains under both orthographic and perspective projections (for a listing see [19]). Wing representation theory coupled with aspect graph model representation might be a good basis of an object recognition system. 1.3 Organization of the Dissertation he remainder of this dissertation is organized as follows. In Chapter 2, we formally efine the object domain of this research. The formal definitions of object wing, the ine drawing graph (LDG), and the labeled line drawing graph (LLDG) are given. We ill derive the set of all possible wing models and their mathematical representation ased on those definitions. We will also explain the setup of our imaging system and escribe the process of range and intensity images fusion. Detection oil from objects pr0‘ propose simultan linear equations l on synthetic dati surface data alon needed to achieve set of real images limb or crease edgc consistent with th We conclude that wings from fused i The collection c tion algorithms dep algorithms, POLY- ing graph of polygo is tha t the range i1 geometric constrain Junction catalogues tion algorithms. Th several of which has in Chapter 5, we idealistic assumptior occur in the input \ inaccuracies are tole construction of the l experimentally show fig“. .—-.-——-— (Eu—rhvu-g 7__—,._.-.__._fi , ——__‘._— v7? 7. 7 7 a 7. _ . I y, 12 Detection of wings is treated in Chapter 3. It is well-known that edge information from objects provides strong clues about the local shape of surface region [70]. We propose [simultaneous boundary and surface fitting with few sample points. The non- linear equations for fused fitting are derived. Results from Monte Carlo experiments on synthetic data show that fused fitting is superior to fitting either boundary or surface data alone. 
Moreover, only a relatively small number of data points are needed to achieve quality parameter estimates. Results of wing detection on a large set of real images are analyzed. We observe that fitting fused data near an object limb or crease edge is advantageous and more accurate than fitting surface data alone, consistent with the finding from the Monte Carlo experiments on synthetic images. We conclude that our wing detector, though not perfect, can reliably extract object wings from fused imagery without any special pre-processing. The collection of wing samples form the input to the line drawing graph reconstruc- tion algorithms depicted in Chapters 4 - 6. In Chapter 4, we present two deterministic algorithms, POLY-1 and POLY-2, that can reconstruct the perfect labeled line draw- ing graph of polygonal scenes under some idealistic assumptions. In POLY—1, the key is that the range image must be available for re—sampling whereas in POLY-2, the geometric constraints of LDG are deduced and utilized to guide the reconstruction. Junction catalogues and fallible heuristic rules are not used in these two reconstruc- tion algorithms. The algorithms are tested on over 15 scenes, both synthetic and real; several of which have appeared as problematic scenes in the literature. In Chapter 5, we present a heuristic algorithm, POLY-3, that operates without the idealistic assumptions as in Chapter 4. Instead, missing and spurious wing samples occur in the input wing set. Errors introduced by computation and measurement inaccuracies are tolerated. A set of heuristic rules are derived to perform the re— construction of the LLDG given that the input wing samples are incomplete. We experimentally show that algorithm is capable of reconstructing the complete LLDG if the input Willi defective. Results In Chapter 6, curved objects. Vt tions, POLY-l an There are two ma main: wing cluste introduced that b] solving the wing e LLDGs from syntl the dissertation by for future research. *— ———-——v——-———-.———— mM-HHY—V'VWW"_' ~fi ~—~~ - .— - 7‘ « —r«~——-—‘-fi , **- — ' H i ' 13 if the input wing sample is complete and a partial LLDG if the input wing set is defective. Results from both synthetic and real images are provided. In Chapter 6, the aforementioned algorithms are extended to handle scenes with curved objects. We argue and show by example that under the same idealistic assump— tions, POLY-1 and POLY-2 can be directly applied with only minor modifications. There are two major difficulties in direct extension of POLY-3 for the expanded do- main: wing clustering and wing extension. A new heuristic algorithm, QUAD-2 is introduced that bypasses the step of wing clustering. Future research directions for solving the wing extension problem are also suggested. Examples of reconstructed LLDGs from synthetically generated wing samples are shown. Chapter 7 concludes the dissertation by summarizing the contributions of this research and pointing ways for future research. cHAP Formal. and W 21 Introc There has been ml area of surface fittil and line drawing If model (3D surface in the following thesis are defined. I: The mathematical r Formal definitions ( stated. We will p0 droning graph and analysis. In additior Images for use as tes CHAPTER 2 Formalization of Object Domain and Wing Models 2.1 Introduction There has been much work in the analysis of line drawings and a lot of work in the area of surface fitting. 
In this chapter we will lay the foundation for the wing detection and line drawing reconstruction techniques, which are based largely on fitting a wing model (3D surface model and 2D contour model) to fused range and intensity data. In the following sections, terms that are essential to the understanding of this thesis are defined. In particular, the object domain and wing models are characterized. The mathematical representation of the wing models and their derivations are given. Formal definitions of a line drawing graph and labeled line drawing graph are also stated. We will point out the differences between our definition of a labeled line drawing graph and the line labeling commonly found in the work of line drawing analysis. In addition, the process of acquiring registered fused range and intensity images for use as test data is also described. 14 2.2 Obis In this research, agiven vieWpoir Appendix A. W and intensity irn.‘ sensed, regardles: in terms of the sc would produce “1 The very first research is that i polyhedron or pol restricted. Their definition exclude: l55l complex defin calm be sensed. related by Lakatos] be bOIh mathemat. Before defining . Catalogue nor the 0' objects considered ;' 6‘fualll‘fc Surfaces, . iii] are allowed as which have satisfact f' V " f 15 2.2 Object Domain Definitions In this research, the input is a registered range and intensity image of the scene from a given vieWpoint. The sensing equipment and the set-up is described in detail in Appendix A. With our set-up, which has a single camera for capturing both range and intensity images, only those surfaces in the scene visible to the camera are being sensed, regardless of the shape of the surfaces. Therefore, we define the object domain in terms of the surfaces that bound the object and the legal vieWpoints as those that would produce “usable” registered range and intensity image. The very first problem encountered and an important lesson learned from this research is that it is not easy to get the appropriate general definition for even a polyhedron or polygon! The definition used by Huffman and Clowes [30, 61] is too restricted. Their block world includes objects with only trihedral vertices. That definition excludes many common polyhedral and all origami objects. Hoffmann’s [55] complex definitions support CAD operations well but involve structures which cannot be sensed. After reading the fascinating history of “Eulerian polyhedra” related by Lakatos[74] we are less ashamed of our struggle for definitions that would be both mathematically and practically effective. Before defining our objects, it is important to note that we do not need a junction catalogue nor the object domain restrictions that such a catalogue would imply. The objects considered in this thesis are those whose faces are either 6-polygons and/ or 5-quadric surfaces. The objects do not have to be solid. Objects from origami world [66] are allowed as long as the faces are all 6-polygons. First, we define polygons which have satisfactory size properties if properly viewed by the sensor. 22.1 6'p( Definition 2'1 bounded by strait that there “is“ ‘ nonzero area 0f b of other poly/90" i The radius: 1" Imposes a minimu P to all other edg short for accurate polygon is closed. condition is actual case in Theorem 4. Note that more main difference bet any closed polygon exists a point on e edges is greater the so are bowties. N 0 segments. Some era 2.2.2 6 -quad: Extension of the n . 
0 Poh’gonal case the de ' ‘ ahng With a curt e that t e vertices art 16 2.2.1 6 -polygons Definition 2.1 A 6-polygon is a closed region of finite extent in a 3D plane bounded by straight edge segments; moreover, (a) each edge contains a point P such that there exists an open neighborhood N, of radius r > 6 > 0 about P which includes nonzero area of both the interior and exterior of the polygon, and includes no points of other polygon edges; (6) every vertex has an even number of incident edges. The radius, r, of the open neighborhood N, in (a), of minimum length 6, implicitly imposes a minimum length on each edge of the polygon and a minimum distance from P to all other edges. Basically, it dismisses all objects that may be too thin or too short for accurate range sensor registration and wing detection. To ensure that the polygon is closed, condition (b) requires an even number of incident edges. This condition is actually implied by the other restrictions. We will show that this is the case in Theorem 4.2. It is explicitly stated here for emphasis. Note that more than 2 coincident edges are allowed. Intuitively speaking, the main difference between a 6-polygon and the “usual” polygon is that a 6-polygon is any closed polygon such that (1) the length of each edge is at least 6, and (2) there exists a point on every edge such that the distance between that point to all other edges is greater than 6. With the above definition, not only are holes allowed, but so are bowties. Non-6-polygons include those polygons with a cut or dangling line segments. Some examples of 6—polygons and non-6—polygons are shown in Figure 2.1. 2.2.2 6 -quadric surfaces Extension of the notion of 6-pongon to 6-quadric is not straightforward. In the polygonal case, the definition of an edge is intuitively clear. This is not so when dealing with a curved surface patch. Before defining an edge, we must first define what the vertices are. The vertices of a quadric surface patch are those points on D t LL I. :4 _\ _/ (a) /_ ¥ {— 1% _l (b) Figure 2.1. Examples of valid and invalid 6-polygons. (a) non-6—polygons: a—b and a have points having neighborhood with zero exterior/interior polygonal area; a has no point P that would satisfy the open neighborhood property. (b) 6—polygons. Open neighborhood about a point P is represented by O. the bounding cr where the curve segments separa restriction on th that the minimt curve is greater 1 Definition 2.2 such that there 6 quadric surface pl lines, L = {11,121 such that an open area of both the i #i, and (3) on. incident with an e. In this thesis, ' surfaces. Specifica snifaces will be allc projection of those linear, circular or t bonndaries will alw regardless of the vie Using this defini t, tan 6/2. Otherwi: PIOjected bounding bite ' v ‘ ”sea a right ci her than (i. Th1 —7——————m_w._wfi ‘ 18 the bounding curve where the curvature or tangent is discontinuous and those points where the curvature changes sign (points of inflection). The edges are the boundary segments separated by the vertices. Since a closed curve is now possible, an additional restriction on the coverage of the open neighborhood N, is also needed to guarantee that the minimum of the maximum distance between any two points on the closed curve is greater than 6. 
Definition 2.2 A 6-quadric is a closed 30 quadric surface patch of finite extent such that there exists a viewing direction in 30 such that the 2D projection of the quadric surface patch along it is a planar image region R bounded by a set of (curved) lines, L = {ll,lg,...,}; moreover, for each I,- E L, (a) there exists a point P on l.-, such that an open neighborhood N, of radius r > 6 > 0 about P includes (1) nonzero area of both the interior and exterior of the planar region R, (2) no points of lj if j 7S i, and (3) only a proper subset of points from. l5; and (6) every junction of R is incident with an even number of l; ’s. In this thesis, we will restrict 6-quadric surfaces to a special subclass of quadric surfaces. Specifically, only planar, spherical, circular cylindrical and circular conical surfaces will be allowed and each edge must be of type linear, circular or elliptic. The projection of those edges onto the 2D image plane results in (curved) lines of type linear, circular or elliptic. Furthermore, given those four surface types, their limb boundaries will always be projected to straight or circular lines in the image plane regardless of the viewpoint. Using this definition, a spherical patch would be a 6—quadric if its radius is greater than 6/2. Otherwise, any neighborhood N, of radius r > 6 on any point on the projected bounding curVe would include the entire boundary (see Figure 2.2 (b)). Likewise, a right circular cylinder must have radius greater than 6 / 2 and height greater than 6. Thus, the two sides of a penny are 6-quadrics if r > 6 / 2, but not There are 2 ty Figure 2.2. E“ quadrics. Open n the cylindrical su cylindrical surface valid view becaus area (is, no ViSlb Figure 2.2 (bl- For simplicity, 1 ably with d—pOll'g‘” 2.2.3 Sample We are now readv pes of l. Origami or pol ?. Curved objects 0 side of a very thick penny r a tap view tilted view front view (a) (b) Figure 2.2. Examples of valid and invalid 6—quadrics. (a) 6-quadrics. (b) Non-6- quadrics. Open neighborhood about a point P is represented by O. the cylindrical surface because its height is too short. Note the top view of the cylindrical surface projects to a circle which has radius r > 6 /2 but this is not a valid view because the projection of the cylindrical surface yields a region with 0 area (i.e., no visible region). Possible views of this cylindrical surface are depicted in Figure 2.2 (b). For simplicity, the terms polygons and quadric surfaces will be used interchange— ably with 6—polygons and 6-quadric surfaces. 2.2.3 Sample of Objects in the Object Domain We are now ready to define the type of objects that belong to the object domain. There are 2 types of objects in the object domain: 1. Origami or polyhedral objects where every face is a 6-polygon. 2. Curved objects whose bounding faces are 6-quadrics. h c Figure 2.3. (a) El not in the 051.9Ct ‘ The first class ( objects may be bor struction of scenes the curved surface how a complete lint Polygons. Example: Class are shown in l Note that the p4 surface is too short t may project to the s is shorter then 6. W an accidental View is 20 non» 6 ~polygons Z7 8 -polygons non- 5 - quadrics (a) Cb) Figure 2.3. (a) Examples of objects in the object domain. (b) Examples of objects not in the object domain. The first class of objects contains only planar surfaces, whereas the second class of objects may be bounded by curved surfaces. 
Wing detection and line drawing recon— struction of scenes involving objects in the first class are easier than those involving the curved surface objects. In fact, in Chapter 4 we will give an algorithm showing how a complete line drawing may be derived from a set of ideal Wings sampled from polygons. Examples of objects in each class as well as objects that belong to neither class are shown in Figure 2.3. Note that the penny is not in the object domain. The height of the cylindrical surface is too short to qualify it as a 6-quadric. The test tube, although a valid object, may project to the same line drawing where the height of the test tube in the image is shorter then 5. We shall call this view an accidental view. The formal definition of an accidental View is given in the next section. 23 ACC Wing detection task. It operate predetermined edge and adjace the same surfaCt varying sizes. E the plane project lost. Detectionc yields very short those situations a next two definitic accidental views 0 Definition 2.3 [f (quadric) scene is 1. there exists 5 quadric) fron 2. every non~jun labeled with 0 Note that from image Plane has len Definition 2.4 An Sensing PT0perty is This definition of ental view is definer ri—mfi“ 21 2.3 Accidental and N on—Accidental VieWpoints Wing detection, as described in the next chapter, is a low-level computer vision task. It operates on fused range and intensity data of the scene as seen from some pre—deterrnined vieWpoint. Success of wing detection depends on how much of the edge and adjacent surface are visible in the image plane. With different viewpoints, the same surface patch/edge may project onto the image plane as regions/lines of varying sizes. For example, a planar surface seen from a vieWpoint co-planar with the plane projects onto the image plane as a straight line; the surface information is lost. Detection of wings is impossible in this and other cases where the projection yields very short line segments and/or small regions in the image plane. We define those situations as consequences of sensing from accidental viewing directions. The next two definitions define the sensing property of vieWpoints and accidental/non- accidental views of the scene. Definition 2.3 [Sensing Property] A non-accidental view of a polygonal (quadric) scene is a view with all of the following properties : I. there exists some 6 > 0 such that the visible projection of every 6-polygon (6- quadric) from 3D is an e-polygon (e-qaadric) embedded in the image plane; 2. every non-junction point of any line segment in the image plane can be correctly labeled with one and only one wing label. Note that from Definitions 2.1 and 2.2 it follows that each line segment in the image plane has length at least 6. Definition 2.4 An accidental view is a view where one of the conditions of the Sensing Property is violated. This definition of accidental View differs from that of [83, 91], in which an acci- dental view is defined as the orthographic projection of 3D edges under an unstable E D Figure 2.4. (a) . tal under [33» 91 accidental view it viewpoint. A vie line drawing chan sphere. In this th face can be (parti avisihle edge resu. wing representatio droning graph will As an example. accidental under [8‘ accidental view of 2 lo the line segment F \ A f \ a f I v __ ? g ix E + B e > > b A v A v D j C d j C X \ (a) (b) Figure 2.4. (a) A non-accidental View of a solid box, which is considered acciden- tal under [83, 91] (b) An accidental view of a topless box. 
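Condition 1 of the Sensing Property, together with Definitions 2.1 and 2.2, implies that every visible line segment in a non-accidental view has length at least ε. A minimal, necessary-but-not-sufficient screen for accidental views can therefore be sketched as follows; this is an illustration only, and the function name is an assumption of this sketch.

import math

# Flag a view as possibly accidental if any visible projected segment is
# shorter than eps (a necessary consequence of condition 1, not a full test).
# Segments are ((x1, y1), (x2, y2)) image-plane endpoints.
def possibly_accidental(segments, eps):
    for (x1, y1), (x2, y2) in segments:
        if math.hypot(x2 - x1, y2 - y1) < eps:
            return True
    return False

print(possibly_accidental([((0, 0), (10, 0)), ((10, 0), (10, 0.2))], eps=1.0))  # True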
(This would not be an accidental view if the side containing a, b, c were removed from the box!) viewpoint. A viewpoint is defined to be unstable if the structure of the projected line drawing changes as the viewpoint is moved within some open set of the Gaussian sphere. In this thesis, a non-accidental view is one where each visible edge and sur- face can be (partially) sensed. Without this condition, an edge detector might miss a visible edge resulting in no detected wings for that edge. As a result, the corrupted wing representation of the scene will be incomplete and the reconstruction of the line drawing graph will be much more difficult. As an example, Figure 2.4 (a) shows a view of a solid box that would be considered accidental under [83, 91] but is perfectly legal by Definition 2.3. Figure 2.4 (b) is an accidental view of a topless box. It is accidental because the 3D edge corresponding to the line segment marked “?” and its adjoining surface(s) cannot be sensed. 2.4 The n this retard] it is not import. are either MOI" 3D polygons/q" Definition 2-5 forms a planarg In this regard including [30, 61 naming Conventii LDG has "3950“ depending on the In general, lint LDG to be an ima boundary (>) or C in the LDC, too. Definition 2.6 «4 alabeled line dI‘E The complexity . '3 Purposely left uns t . othe application HE interpretation (( + circular, parabolic” Pose information use ‘ ‘ dint rs research of ———————“w --* 23 2.4 The Line Drawing Graph (LDG) In this research, objects are assumed to be composed of 6-polygonal/5-quadric faces; it is not important whether they are regarded as 3D solids or just 2D surfaces. Faces are either 6-polygons or 6—quadrics. A view of a polygonal / quadric scene is a set of 3D polygons/quadrics with some viewing direction. Definition 2.5 The projection of edges of objects in a scene with hidden line removal forms a planar graph called the line drawing graph (LDG). In this regard, the LDG is the starting point of much line labeling analysis research including [30, 61, 66, 83, 115, 123]. Following previous authors, we will adopt the naming convention that faces in the 3D world have edges, and vertices, while the LDG has regions, line segments, and junctions. Note that line segments in the LDG depending on the object domain are not necessarily straight. In general, line drawing analysis attempts to interpret every line segment in the LDG to be an image of a convex crease (+), concave crease (-), self-occluding surface boundary (>) or occluding surface boundary (>). Here, we strive to label the regions in the LDG, too. Definition 2.6 A LDG where each line segment and each region is labeled, is called a labeled line drawing graph (LLDG). The complexity of line and region labels are not specified in the above definition. It is purposely left unspecified so the complexity of the labels can be adjusted according to the application need. For our purposes, the line label not only must contain the 3D interpretation ({+, —, >>, > }) of the segment it must also capture the shape ({linear, circular, parabolic}) of the segment. The region labels will have the shape and the pose information of their 3D surface counterparts. The exact line and region labels used in this research is the topic of next section. 2.5 Wh The notion 0f “ dron representa Primitives for 3‘ to supPOrt anal similar to that C Definition 2-7 that can appear labels for the tint appear in the im from the sample ‘ Definition 2.8 . 
called the wing C that project onto 2 thermore, a wing Since each winj will often refer to Note that this defir and line labels. If 1 Would consists of p 111 most line labelir line segment in the other works on line ln Chen ’8 work [i Concave ) saddle}, a: 24 2.5 What are Wings? The notion of wing was first conceived by Baumgart [9] as a data structure for polyhe- dron representation for computer vision. Chen [27] later defined object wings as 2%D primitives for 3D object recognition and showed that wing representation is suitable to support analysis of scenes of general objects. Here, the notion of wings will be similar to that of Chen. Definition 2.7 Let S = {81,82,...,Sm} denote the set of different labels of surfaces that can appear in the scene and let C = {c1,c2, ...,c,,} denote the set of difi'erent labels for the lines (ED projection of the 3D edge), both straight and curved, that can appear in the image plane. Then a Wing primitive, is defined as a triplet drawn from the sample space formed by the set cross product S x C x S. Definition 2.8 A Wing in an image is defined as a fragment of image contour, called the wing contour along with the surface patches, called the wing surfaces, that project onto the regions adjoining the wing contour fragment in the image. Fur- thermore, a wing can be uniquely labeled as one of the wing primitives. Since each wing in the image is uniquely labeled by one of the wing primitives, we will often refer to this unique wing primitive when we speak of wings in the image. Note that this definition of wing places no restriction on the complexity of the surface and line labels. If S = {NIL} then the image is represented by a set of wings which would consists of purely line labels. If the line labels are restricted to labels found in most line labeling research (e.g.,Malik [83]) and if each wing contour is a Whole line segment in the image, then the wing representation is equivalent to that used in other works on line labeling (Figure 2.5 (a)). In Chen’s work [27], there are 5 surface types, {null (background), planar, convex, concave, saddle}, and 4 contour types, (silhouette, jump, convex crease, concave Figure2-5- Ree catalogue (bl C‘ crease}. Theore definition. Howe physically realize image using this the wing contour 3 different wing LLDG wing repr capture explicit w and surface post? i 2.5.1 Wing ludefining the ob, restricted to {plan labeling schemes (a as surface labels. T S: { PLN(a,b,d). 25 null null null null (a) (b) (C) Figure 2.5. Representation of a coke can. (a) Line labeling using Malik’s [83} junction catalogue (b) Chen’s wing [27] representation. (c) LLDG representation. crease}. Theoretically there are 5 x 4 x 5 = 100 different wing primitives in his definition. However, as Chen pointed out, only 34 of those 100 wing primitives are physically realizable: that is there are only 34 different types that can appear in the image using this definition of surface and contour labels. For example, if we make the wing contour to take full length, then a coke can would have 5 wings but only 3 different wing primitives are needed to label them all (see Figure 2.5 (b)). The LLDG wing representation (Figure 2.5 (c)) differs from Chen in that we want to capture explicit wing contour / surface shape information as well as the contour label and surface pose information. 2.5.1 Wing Primitives In defining the object domain, the set of bounding surfaces of all objects have been restricted to (planar, spherical, cylindrical, conical}. 
Instead of using other surface labeling schemes (as in [27]), we use only those 4 surface types and their pose equations as surface labels. Thus,

S = { PLN(a, b, d), SPH(x0, y0, z0, r0), CYL(x0, y0, r0, α, β), CON(x0, y0, z0, d0, α, β), NIL },

where NIL indicates the background surface or don't care. The equation of each of the 4 surfaces in general position will be given in a later section.

The wing contour label is to carry information about both the shape of the curve as well as the type of the curve. Instead of defining the wing contour to have the shape of a general quadric polynomial, it is represented by 3 types of explicit 2D curves, namely linear, circular and parabolic. Since the limb boundaries of cylinders or cones and spheres project to straight lines and circles, respectively, linear and circular curves are used as wing contour models. Note that some non-limb edges may also project to a straight or circular curve. Projections of all other curved edges are locally modeled by parabolic curves in 2D. Besides labeling each wing contour with explicit shape information, the 3D interpretation of the line segment is also recorded. Following the symbols used in the line drawing analysis literature [83], each wing contour is also labeled as a jump (<), convex crease (+), concave crease (-) or limb (<<). Thus 12 different labels are possible for wing contours.

Half-wings

As stated earlier, part of the goal of this research is to derive a reliable wing detector. Instead of defining wings as three-tuples (Si, Ck, Sj), where Si, Sj ∈ S and Ck ∈ C, they can be expressed as the union of two-tuples, (Si, Ck) ∪ (Sj, Ck), with a common wing contour. We shall call each of those two-tuples a half-wing. The problem of wing detection in an image is sometimes reduced to the problem of detecting two individual half-wings followed by a merge operation. This is true if the edge contour that separates the two surfaces is either jump or limb (depth discontinuity). With the previously defined wing surface and contour labels, a complete list of half-wing primitives is generated and given in Table 2.1. Note that for each half-wing, the wing contour acts as a separator between the two wing surfaces. The wing contour lies on exactly one of the two wing surfaces. Thus, a half-wing could have a planar surface while the wing contour is a projection of a limb from a cylinder or sphere ((PLN+LN<<) or (PLN+CC<<)).

Table 2.1. List of all 60 half-wing primitives

Planar:      (PLN+LN<<) (PLN+CC<<) (PLN+PB<<)  (PLN+LN<) (PLN+CC<) (PLN+PB<)  (PLN+LN+) (PLN+CC+) (PLN+PB+)  (PLN+LN-) (PLN+CC-) (PLN+PB-)
Spherical:   (SPH+LN<<) (SPH+CC<<) (SPH+PB<<)  (SPH+LN<) (SPH+CC<) (SPH+PB<)  (SPH+LN+) (SPH+CC+) (SPH+PB+)  (SPH+LN-) (SPH+CC-) (SPH+PB-)
Cylindrical: (CYL+LN<<) (CYL+CC<<) (CYL+PB<<)  (CYL+LN<) (CYL+CC<) (CYL+PB<)  (CYL+LN+) (CYL+CC+) (CYL+PB+)  (CYL+LN-) (CYL+CC-) (CYL+PB-)
Conical:     (CON+LN<<) (CON+CC<<) (CON+PB<<)  (CON+LN<) (CON+CC<) (CON+PB<)  (CON+LN+) (CON+CC+) (CON+PB+)  (CON+LN-) (CON+CC-) (CON+PB-)
Background:  (NIL+LN<<) (NIL+CC<<) (NIL+PB<<)  (NIL+LN<) (NIL+CC<) (NIL+PB<)  (NIL+LN+) (NIL+CC+) (NIL+PB+)  (NIL+LN-) (NIL+CC-) (NIL+PB-)

For completeness, Table 2.2 lists all those wings that can be detected as a whole rather than as two half-wings. Basically, these are wings that involve crease edges.

Table 2.2. List of 38 full wing primitives

(PLN+LN+ +PLN) (PLN+LN- +PLN)
(PLN+LN+ +CYL) (PLN+LN- +CYL)  (PLN+LN+ +SPH) (PLN+LN- +SPH)  (PLN+LN+ +CON) (PLN+LN- +CON)
(PLN+CC+ +CYL) (PLN+CC- +CYL)  (PLN+CC+ +SPH) (PLN+CC- +SPH)  (PLN+CC+ +CON) (PLN+CC- +CON)
(PLN+PB+ +CYL) (PLN+PB- +CYL)  (PLN+PB+ +SPH) (PLN+PB- +SPH)  (PLN+PB+ +CON) (PLN+PB- +CON)
(CYL+LN+ +SPH) (CYL+LN- +SPH)  (CYL+LN+ +CON) (CYL+LN- +CON)
(CYL+CC+ +SPH) (CYL+CC- +SPH)  (CYL+CC+ +CON) (CYL+CC- +CON)
(CYL+PB+ +SPH) (CYL+PB- +SPH)  (CYL+PB+ +CON) (CYL+PB- +CON)
(SPH+LN+ +CON) (SPH+LN- +CON)  (SPH+CC+ +CON) (SPH+CC- +CON)  (SPH+PB+ +CON) (SPH+PB- +CON)
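As a cross-check on the catalogue sizes quoted in Tables 2.1 and 2.2, the primitives can be enumerated mechanically. The short Python sketch below is purely illustrative; the label strings mirror the tables, but the code itself is not from the thesis.

from itertools import product, combinations

surfaces = ["PLN", "SPH", "CYL", "CON", "NIL"]     # NIL = background
curves   = ["LN", "CC", "PB"]                      # linear, circular, parabolic
labels   = ["<<", "<", "+", "-"]                   # limb, jump, convex, concave

half_wings = [f"({s}+{c}{l})" for s, c, l in product(surfaces, curves, labels)]
print(len(half_wings))                             # 60, as in Table 2.1

# Full wings involve crease edges (+ or -) between two visible surfaces.
# Two planes can only meet in a straight crease, hence no (PLN+CC/PB+PLN).
full_wings = []
for (s1, s2), c, l in product(combinations(["PLN", "CYL", "SPH", "CON"], 2),
                              curves, ["+", "-"]):
    full_wings.append(f"({s1}+{c}{l}+{s2})")
full_wings += [f"(PLN+LN{l}+PLN)" for l in ["+", "-"]]
print(len(full_wings))                             # 38, as in Table 2.2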
2.5.2 Derivation of Wing Contour Equations

Wing detection, as described in the next chapter, is to proceed by fitting each of the wing models to the sampled surface/contour points. In the case of jump wings (i.e., the wing contour corresponds to a jump edge), sample points along the edge projection do not facilitate the wing recovery process. However, those 2D points are vital for the recovery of wings involving limb boundaries and crease edges. To aid the recovery of wings by fitting the wing contour model to the sample points from the edge projection, we must first establish the connection between the equations of the wing contour and the wing surface(s).

In the following sections, the equations of planar, spherical, cylindrical and conical surfaces in general position are given. These equations will be denoted as P3d. From these equations, the projections of the limb boundaries of the sphere, cylinder and cone, which are circular, linear and linear, respectively, onto the image plane are derived and are denoted as P2d. Finally, we show the equations of the wing contours that are projections of crease edges. But first, we show how to derive the equation of the orthographic projection of the limb edge from the equation of the general quadric surface. In so doing, we will force the two equations to have the same set of parameters.

2.5.3 Deriving P2d from P3d

The general equation of a quadric surface takes the form

P3d(x, y, z) = ax² + by² + cz² + dxy + exz + fyz + gx + hy + iz + j = 0.   (2.1)

To find the equation of the projection of the limbs, the points belonging to the limb boundary must be identified. Note that a point on the quadric surface is on the limb if and only if the surface normal at that point is orthogonal to the projection direction. The surface normal at a given point (x, y, z) is simply

N_P3d(x, y, z) = ( ∂P3d/∂x, ∂P3d/∂y, ∂P3d/∂z ).

Assuming that the viewing direction is along the z-axis and that the viewpoint is at (0, 0, z_v), the projection direction for a given point (x, y, z) is then

P_proj(x, y, z) = (x, y, z − z_v).

Thus, setting P_proj · N_P3d = 0 and using P3d = 0 to remove the quadratic terms, the limbs must lie on the surface defined by the equation

x(e z_v + g) + y(f z_v + h) + z(2c z_v + i) + (i z_v + 2j) = 0,   (2.2)

which is planar. Finally, by solving Equation 2.2 for z in terms of x and y, substituting back into Equation 2.1 and taking the limit as z_v → ∞, we arrive at P2d, which is the equation of the limb orthographically projected onto the image plane. The P2d equation derived from a general P3d surface equation is long and is omitted here. It should be noted that only simple algebraic manipulations are needed in deriving the final form.

Fitting the general P3d and P2d equations to fused data, which have many complex terms, is computationally expensive [73]. Furthermore, the error model for computing the distance between the fitted model and the data points can only be approximated in practice [24, 73]. Our approach is to fit explicit categorical surface models to fused data.
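Although explicit categorical models are fitted in practice, the general elimination outlined above is mechanical enough to reproduce with a computer algebra system. The following sketch is purely illustrative and assumes the Python sympy package; it is not part of the thesis software. It rebuilds Equation 2.2 from the gradient of Equation 2.1, solves for z, and takes the limit z_v → ∞ to obtain the (lengthy) general P2d.

import sympy as sp

x, y, z, zv = sp.symbols('x y z z_v')
a, b, c, d, e, f, g, h, i, j = sp.symbols('a b c d e f g h i j')

P3d = a*x**2 + b*y**2 + c*z**2 + d*x*y + e*x*z + f*y*z + g*x + h*y + i*z + j  # Eq. 2.1
N = sp.Matrix([sp.diff(P3d, v) for v in (x, y, z)])          # surface normal
Pproj = sp.Matrix([x, y, z - zv])                            # projection direction

# On the surface (P3d = 0), Pproj . N reduces to the planar Equation 2.2:
# x*(e*z_v + g) + y*(f*z_v + h) + z*(2*c*z_v + i) + i*z_v + 2*j
limb_plane = sp.expand(2*P3d - N.dot(Pproj))
print(sp.collect(limb_plane, (x, y, z)))

z_limb = sp.solve(sp.Eq(limb_plane, 0), z)[0]                # z on the limb plane
z_inf = sp.limit(z_limb, zv, sp.oo)                          # limit as z_v -> oo
P2d = sp.expand(P3d.subs(z, z_inf))                          # general limb projection
print(sp.collect(P2d, [x, y]))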
With explicit surface models, we are able to derive and use the exact distance between data points and the model in evaluating the goodness of fit. Moreover, there are fewer parameters to fit.

Below, we give the P3d and the derived P2d limb projection equations of the spherical, cylindrical and conical models. This is followed by equations of the projections of crease edges formed by two planar, a planar and a spherical, and a planar and a cylindrical surface.

Spherical Surfaces

A sphere in general position has the equation

P3d^sph(x, y, z; x0, y0, z0, r0) = (x − x0)² + (y − y0)² + (z − z0)² − r0² = 0.   (2.3)

There are four parameters: three translational (x0, y0, z0) and the radius r0. It is obvious that the orthographic projection of the contour is a circle. By applying the procedure outlined above, we arrive at the equation of this circle:

P2d^sph(x, y; x0, y0, r0) = (x − x0)² + (y − y0)² − r0² = 0.   (2.4)

Note that only three of the four P3d^sph parameters are present in P2d^sph. This means that fitting P2d^sph to contour points leaves one free parameter (i.e., z0).

Cylindrical Surfaces

The circular cylindrical model has 5 parameters: two translational (x0, y0), two rotational (α, β) and the radius (r0). We derived P3d for the cylindrical model as follows. Take a cylinder with radius r0 and the axis coincident with the z-axis. Translate the cylinder along the x-axis by x0 and along the y-axis by y0. Using (x0, y0, 0) as the origin of the new coordinate frame, rotate about the x-axis by angle α and then rotate about the new y-axis by angle β. The implicit quadric equation of the cylinder in the new position can then be expressed in the original coordinate frame as:

P3d^cyl(x, y, z; x0, y0, r0, α, β) = ((y − y0) cos α + z sin α)² + ((x − x0) cos β + (y − y0) sin α sin β − z cos α sin β)² − r0².   (2.5)

The orthographic projections of the limbs are two straight lines. The P2d derived using the method outlined above is a quadric equation representing the two parallel lines. By solving for one of the variables, we have the equation of each of the lines:

P2d^cyl(x, y; x0, y0, r0, α, β) = (x − x0) cot β sin α + (y − y0) ± √( r0² (6 − 2 cos 2α − cos 2(α − β) − 2 cos 2β − cos 2(α + β)) ) / (2√2 sin β),   (2.6)

where the expression under the radical simplifies to 8 r0² (sin²α + cos²α sin²β).

Conical Surfaces

The circular cone model has 6 parameters: three translational (x0, y0, z0), two rotational angles (α, β) and the height of the cone at unit radius (d0). To find the equation of the cone in general position, we first assume that the circular cone is in the upright position (i.e., has its vertex at the origin and the axis coincident with the positive z-axis). By translating the cone along the x-, y- and z-axis by x0, y0, z0, respectively, then rotating about the new x- and y-axis by α, β as in the cylindrical case, we arrive at the equation of the cone in general position:

P3d^con(x, y, z; x0, y0, z0, d0, α, β) = U² + V² − W²/d0²,   (2.7)

where

U = (x − x0) cos β + (y − y0) sin α sin β − (z − z0) cos α sin β,
V = (y − y0) cos α + (z − z0) sin α,
W = (x − x0) sin β − (y − y0) sin α cos β + (z − z0) cos α cos β.

After expanding and rewriting P3d^con as

P3d^con(x, y, z; x0, y0, z0, d0, α, β) = ax² + by² + cz² + dxy + exz + fyz + gx + hy + iz + j,   (2.8)

derivation of the equation of the projection of the limb edges is straightforward:

P2d^con(x, y; x0, y0, z0, d0, α, β) = kx² + ly² + mxy + nx + oy + p,   (2.9)

where

k = a − e²/(4c),  l = b − f²/(4c),  m = d − ef/(2c),  n = g − ei/(2c),  o = h − fi/(2c),  p = j − i²/(4c).
By comparing Equation 2.9 with the product of two linear factors, we can solve for the equation of each limb projection, namely

P2d^con1:  y = (√(m² − 4kl) − m)/(2l) · x + (√(o² − 4lp) − o)/(2l),   (2.10)

P2d^con2:  y = (−√(m² − 4kl) − m)/(2l) · x + (−√(o² − 4lp) − o)/(2l).   (2.11)

Two Planar Surfaces

Our equation of planar surfaces has three parameters (a, b, d) and takes the form

P3d^pln(x, y, z; a, b, d) = ax + by + z + d = 0.   (2.12)

To find the equation of the projection of the crease edge between two intersecting planar surfaces, simply take two such equations and set them equal to each other:

P3d^pln1(x, y, z; a1, b1, d1) = a1 x + b1 y + z + d1 = 0,
P3d^pln2(x, y, z; a2, b2, d2) = a2 x + b2 y + z + d2 = 0,

P2d^pln+pln(x, y; a1, b1, d1, a2, b2, d2) = (a1 − a2) x + (b1 − b2) y + (d1 − d2) = 0.   (2.13)

Note the resulting P2d^pln+pln equation is defined in terms of all six parameters that appear in P3d^pln1 and P3d^pln2. By fitting data points to P2d^pln+pln, we are in effect finding the parameters of the two intersecting planes.

Planar and Cylindrical Surfaces

Recall the expressions for P3d^pln and P3d^cyl from Equations 2.12 and 2.5. By solving for the variable z in P3d^pln and substituting the expression into P3d^cyl to eliminate the z term, we derive the equation of the projection of the crease edge created by the intersecting planar and cylindrical surfaces. Note that this equation has all the parameters that are present in P3d^pln and P3d^cyl:

P2d^pln+cyl(x, y; a, b, d, x0, y0, r0, α, β) = ((y − y0) cos α − (ax + by + d) sin α)² + ((x − x0) cos β + (y − y0) sin α sin β + (ax + by + d) cos α sin β)² − r0².   (2.14)

Planar and Spherical Surfaces

Using the same technique as for deriving P2d^pln+cyl, the equation of the projection of the crease edge created by two intersecting planar and spherical surfaces is

P2d^pln+sph(x, y; a, b, d, x0, y0, z0, r0) = (x − x0)² + (y − y0)² + ((ax + by + d) + z0)² − r0².   (2.15)

Other Surface Intersections

The 2D equations of the projections of intersections of other surface types can be derived similarly. However, they may involve very long expressions and are thus omitted here.

2.6 Registered Fused Images

A database of more than 40 registered (fused) range and intensity images has been created for experiments. The fused data is dense except for regions of shadowing and for small holes due to quantization. Each pixel in the fused image contains both the 3D positional information and the intensity value at that position. We refrain from creating a new terminology such as "fusel" to denote the rich information at each image grid point. Instead, the term "pixel" will be used throughout this thesis and should be understood to contain both range and intensity data.

The fused images have been acquired via the White Scanner from Technical Arts [117]. A brief discussion of the fusion process is given here. Refer to Appendix A for complete details of the fusion procedure. The White Scanner is a triangulating sensor which projects a sheet of laser light onto objects which are viewed by a camera from another angle (see Figure 2.6). Normally, the intensity image from the camera is only used as intermediate data toward the computation of 3D coordinates in the workspace reference frame. We registered the intensity image pixels from the camera with range values obtained by parallel projecting computed 3D surface points along the optical axis of the camera onto a synthetic fused image plane.
The fused image plane is a synthetic image plane that is coplanar with the CCD array of the camera and contains both a range value and an intensity value at each quantized point. A perspective transformation was obtained by calibration to map 3D world surface points onto the intensity array produced by 35 xc (parallel to Y w) Figure 2.6. Setup of White Scanner and the two coordinate frames the camera. In this manner, each visible surface point'could be assigned an intensity in the fused image plane. Care was taken to assure registration accuracy to within one pixel in the final fused images. Because the White Scanner is a triangulating sensor, there are voids in the data due to shadowing from both the camera and laser sheet. Shadowing from the white light sources has been minimized by using two or three light sources when taking the intensity images. One final note: because the laser illumination is not the same as the white light, it may happen that intensity edges and range jumps at limb boundaries are slightly out of registration. 2.7 Summary In this chapter, we have defined the wing primitive. The structure of a wing and its mathematical representation were discussed. Essentially, a wing is a 2%D fragment of image contour along with knowledge about the surfaces that project to the regions 36 adjoining the image contour. A wing captures not only the shape of the object in the local region but also gives clues to the pose of the surfaces that form the wing. Given our definition of contour and surface labels, a set of 98 wing primitives were conceived. The wing detection algorithms of the next chapter detect wings that can be labeled by those wing primitives. The line drawing graph is the assumed starting point and input of many past line drawing analysis research projects [30, 61, 66, 83, 115, 123]. However, the labeled line drawing graph defined here is much richer in information than the usual labeled line drawing. A LLDG contains not only the usual line interpretations (labels) but also contains information about the surfaces that project onto the regions in the graph. With this additional surface information, a higher level vision process may be able to determine the pose of the objects in the scene. Finally, we briefly discussed the set-up in which the test images are generated. We outlined how one can fuse the range and intensity images, originally in different coordinate systems, to produce a single registered image. The details of the fusion process can be found in Appendix A. CHAPTER 3 Wing Detection 3.1 Introduction As with any representation theory for object recognition, successful extraction of scene features prescribed by the theory is as important as the theory itself. For theory of wing representation, the features to be extracted are wings. A list of wing primitives that can appear in the image are defined in Chapter 2. This chapter presents an approach to wing detection. Instead of detecting the surface and contour components of the wing indepen- dently, we pr0pose to extract, both wing contour and wing surfaces simultaneously. Hypotheses of presence of various wing primitives are first generated. Verification of each hypothesis is cast into a non-linear optimization problem in which convergence to a. solution approves the hypothesis and also recovers the parameters of the wing. The verification procedure involves a novel idea of fused fitting of surface and image contour data simultaneously to recover the wing parameters. 
The fused fitting process is formulated and carefully studied in this chapter. Monte Carlo experiments are per- formed to test the quality of the estimated wing surface parameters from fitting with fused, surface—only and contour-only data and to test the shape discrimination capa- bility of fused and surface-only fitting techniques. Experiments are also conducted 37 38 to study the minimum number of data points required to obtain quality parameter estimation. Finally, we report on results and inferences drawn from detection of wings on real fused images. 3.2 Survey of Related Background The wing, as defined in Chapter 2, consists of a fragment of edge image contour and adjacent surface information. Using commonly known vision techniques, wing detec- tion can be decomposed into two steps: edge detection followed by surface fitting for recovery of the surface properties and the pose information. Chen [26] experimented with the above simple strategy using only range images. Image segmentation was per- formed using techniques of either Hoffman 85 Jain [54], Laurendeau & Paussart [78], or Marr & Hildreth [85], afterwhich, a symbolic uniform procedure repeatedly grouped pixels into the most likely neighboring region. Another approach proposed by Chen was to perform feature extraction by means of the directional derivatives. Directional derivatives are first order derivatives of a point in a given direction. However, our implementation shows that a better wing detector is still needed. We should note that the wing as defined by Chen is a less specific than our wing definition (see Fig- ure 2.5). A major difference is that pose information of the surfaces is embedded in our wing while Chen’s has only qualitative surface labels. Surface fitting has traditionally been done by fine tuning the parameters of an un- derlying assumed surface model and edge constraints are usually not used. However, edge information from objects provides strong clues about the local shape of surfaces [70]. For example, Lowe used only 2D contours for recognition of objects in gen- eral position [80]; furthermore, superquadric surface fitting with edge constraints has been successfully demonstrated by Bajcsy and Solina [5]. More recently, Lowe [81], Taubin [116], and Bolle et al. [17] have also demonstrated the importance of contour 39 information by utilizing it in estimating initial parameters prior to fitting and / or us- ing it to aid the recovery of parameters during fitting. Our wing detector explicitly uses the contour information to help deduce the initial parameter estimates and guide the non—linear optimization process. The following subsections survey some of the traditional edge detecting, surface segmenting and surface fitting techniques. 3.2.1 Edge Detection in Intensity Imagery The work on edge detection in intensity imagery can be traced back to the 2 x 2 cross operator of Roberts [108]. An approximate gradient is computed by convolv— ing the edge operator with the image in two orthogonal directions and (non-)linearly combining the convolved values. The direction of the maximum gradient can also be approximated from the two convolved values. Since Roberts, many other differ- ent edge masks have been proposed, including the Sobel [33] and the Prewitt [102] operators. 
Instead of convolving the edge mask in only two directions and approx— imating the maximum gradient direction from it, better results can be achieved by convolving the edge operator in many different directions and selecting the maximum value obtained as the gradient magnitude at the given location, and the correspond— ing direction as the gradient direction. The Prewitt [102] and Kirch [69] are two such operators. However, the drawback of such compass gradient operators is that they require considerably more time for processing on a sequential computer. Another type of window based operators, proposed by Hueckel [59, 60], Hum- mel [62], and Frei & Chen [45], fits an explicit model of the ideal edge/line to the image. Rather than fitting the model directly, the model is approximated by an orthogonal set of basis vectors (masks) so that existence and the orientation of the edge/ line is computed by simple convolutions. 40 A well known and popular edge detector for intensity imagery is the Canny edge detector [22]. Canny defines detection and localization criteria for a class of edges and augments it with a criterion to ensure that the detector has only one response to a single edge. The major distinctions between his detector and the zero-crossing based detector[85] are that (1) the detector has only one response to a single edge, and (2) it uses directionally sensitive elongated (rather than circular) operators. Many post-Canny edge detectors have been published in the literature, but the Canny edge detector is still the norm against which all other intensity edge operators are compared. A brief summary of the recent research trends in edge detection is given in [20]. 3.2.2 Segmentation of Range Imagery In general, the intensity recorded at each pixel of an intensity image depends not only on the surface position and orientation but also on the surface material, light source position and the viewing position. Many researchers have concentrated on the recovery of explicit shape information from intensity image(s) (e.g., shape-from- shading [56, 57], shape-from-occluding contours [122], shape-from-texture [126], and shape-from-stereopsis [8]) which are collectively called “shape-from-x”. A survey of those techniques can be found in [1]. With rapid advances of technology, range sensors have become more affordable and accurate [10] and a new class of edge detectors for range images has evolved. The edge detector for intensity imagery can only be applied to range images to detect jump (depth discontinuity) edges. Crease edges cannot always be directly detected using those operators because of smooth transition in depth. As a remedy, Mitiche and Aggarwal [87] offered a statistical test for detection of crease edges in range images. Different partitions at a neighborhood of interest were tested for evidence of a crease edge. A planar surface is fit to each partition and the difference between the slopes 41 of the two planar fits serves as the test statistic. Rather than using those intensity image operators, some pure range image segmentation methods have been proposed and are based on aggregation of local surface properties such as curvatures or surface normals [12, 35, 39, 54, 128]. Since curvature is a second order differential property, it is especially sensitive to noise. In [43], Flynn compared several different curvature estimation techniques but found no one performed satisfactorily under the influence of noise. 
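To make the partition-and-fit idea of the crease test concrete, a schematic rendering is sketched below. It is our own illustration, not the published algorithm: the half-window partition, the numpy-based plane fit and the threshold are placeholder choices.

import numpy as np

def fit_plane(points):
    """Least-squares fit of z = ax + by + c; returns the unit normal."""
    P = np.asarray(points, dtype=float)
    A = np.column_stack([P[:, 0], P[:, 1], np.ones(len(P))])
    (a, b, c), *_ = np.linalg.lstsq(A, P[:, 2], rcond=None)
    n = np.array([a, b, -1.0])
    return n / np.linalg.norm(n)

def crease_evidence(half1, half2):
    """Angle (degrees) between the normals fitted to the two half-windows."""
    n1, n2 = fit_plane(half1), fit_plane(half2)
    return np.degrees(np.arccos(np.clip(abs(n1 @ n2), -1.0, 1.0)))

# Two planar range patches meeting in a crease: the evidence is a large angle.
left  = [(x, y, 0.0) for x in range(5) for y in range(5)]
right = [(x, y, 0.5 * x) for x in range(5, 10) for y in range(5)]
print(crease_evidence(left, right) > 20.0)    # True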
Among the many range segmentation techniques in the literature, there are two major classes: region-based (6.9., [12, 54, 107]) and edge-based (6.9., [35, 96, 99]). Region—based algorithms segment the image by finding homogeneous regions corre- sponding to surfaces of the objects while edge—based methods try to extract lines from images that correspond to boundaries of the surfaces. While a region-based algorithm may be poor in delimiting the region boundaries, the edge~based approach has the inherent drawback that the extracted line segments may be fragmented so that a heuristic line linking procedure is often required to ensure closed curves. Recently, Nadabar [90], Yokoya & Levine [128], Davignon [31], and Ferrie et a1. [39] have given hybrid approaches to range image segmentation and have shown that by integrating the region-based and edge-based approaches and using their complementary informa- tion about the scene, a segmentation is achieved that has a better correspondence between the shape of the regions and the surface of the objects. 3.2.3 3D Surface Reconstruction: Superquadric and Quadric Fitting Fitting of implicit quadric and superquadric forms to range data is one of the ways to reconstruct surface shape. Here we will concentrate on the fitting aspect of the reconstruction since our wing detector also involves quadric surface fitting. ————7—~——————ww—~ 42 The problem of 3D surface reconstruction has received enormous attention in vison research. A survey of recent 3D surface reconstruction methods can be found in Bolle [18]. For example, Rimey & Cohen [107] divide the range image into windows, classify each window as either planar, cylindrical or spherical and group like windows into surface regions. Grouping windows of the same surface types is cast as a weighted maximum likelihood clustering problem. The mixed windows, if any, are segmented using a maximum likelihood hierarchical segmentation algorithm. Quadric surface representation, popularized by Faugeras [37], is to represent object surfaces by simple geometric quadric shapes such as a plane, sphere, cylinder, and cone. The implicit equation of a general quadric surface is given in Equation 2.1. By fitting this implicit equation to surface points to recover the coefficients, the underlying surface shape and its parameters can be estimated. When fitting superquadrics in general position, 11 parameters need to be esti- mated. These are location (3), orientation (3), size (3) and shape (61 and £2) param- eters. Given the surface equations (see Section 1.2), the reconstruction proceeds as a nonlinear optimization procedure in which the error between the sampled surface points and the model is minimized. The most popular nonlinear optimization tech— nique seems to be the Levenberg-Marquardt method [101]. As with most nonlinear optimization strategies, it requires an initial estimate of the model parameters and converges to the global optimum only if the initial estimate is already close to the global optimum. Much recent research on quadric and superquadric fitting only dif- fers in the merit function for evaluating the goodness of fit and the estimation of the initial parameters. The well known research includes that of Pentland [98], Gross & Boult [48], Ferrie et al. [38], Solina & Bajcsy [112], and Gupta & Bajcsy [50], Recently, Kriegman & Ponce [73] proposed a method for shape reconstruction based on the rim contour points only. 
They use Elimination Theory to derive the implicit equation of the rim contour of surfaces of revolution and solve for the pa- 43 rameters by fitting a large number of contour points to the rim model using the Levenburg—Marquardt optimization technique. The implicit equation is a compli- cated high degree polynomial. All of the fitting routines mentioned above assume that the image is preosegmented and all use a large number of surface points sampled over a large surface area for fitting (in [73], it is contour points over most of the rim contours). The large surface (contour) patch requirement is needed for better estimation of the initial parameters. As we will show later in this chapter, reconstruction of surface shape can be done with few sample points over a small surface area if the boundary information can be used during the fitting process. Moreover, images do not have to be pre-segmented. As already pointed out at beginning of this section, image contour information can be an important one in extracting surface shape. In the remainder of this chapter, we will explore the possibility of simultaneous fitting 3D surface data points together with 2D image contour points. 3.3 Wing Detection Algorithm A flowchart of the proposed approach is depicted in Figure 3.1. Instead of detecting all wings in the input image I at once, we propose to divide the input image I into overlapping windows and to detect wings in each of those windows in parallel. Since Wings are local features, locating all wings at once from the original image is recast into locating wings from smaller subimages that overlap the original image. The set of detected wings may contain many overlapping and co—curvilinear wings, in which case, they may be merged to form longer wings and to achieve better accuracy for the Wing parameters. The functionality of each module of the algorithm is examined more closely in the following subsections and a list of the major parameters of the algorithm can be found in Appendix C. 44 Detect wings in WI Detect wings in w2 Detect wings in wi Detect wings in “’MxM Figure 3.1. Flowchart of proposed wing detection algorithm raw range Divide I into M X M set of all ——> overlapping windows image. I of size n x n detected wings 3.3.1 Tesselation of the Input Image The first step in the wing detection process is to divide the input image I Nx N into M x Moverlapping windows, 1121,2122, ...,waM, of size nxn (see Figure 3.2) such that 1- ini = INxNa and 2. each pixel Pm, except for those that are near the image border, is covered by exactly m windows. The constants, {11, m, N, M } are certainly all interrelated. The number of Win- dows, M x M, is directly related to the window size, n and the degree of overlapping, m. The size of the window is crucial to the success of wing detection. If the window is too small, then the covered surface patch would also be small, in which case, the number of data points available for the parametric fitting may be too few to be ac- curate. Worse yet, with a small surface patch, it may be impossible to derive a good set of starting values for the non-linear fit. On the other hand, large window size de- feats the purpose of dividing the original image into smaller size to simplify the wing 45 window size EB . -. .; we); .1- Figure 3.2. Tesselation of input image detection process. In general, there is no one window size that will work well for all images. The best set of {n, m, N, M } to use is really problem, feature size and pixel size dependent. 
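For illustration, one simple way to realize the overlapping tesselation described above is sketched below; the stride rule and the numbers are placeholder choices of this sketch, not prescriptions from the thesis.

def window_origins(N, n, stride):
    """Upper-left corners of overlapping n x n windows covering an N x N image."""
    xs = list(range(0, N - n + 1, stride))
    if xs[-1] != N - n:
        xs.append(N - n)                 # keep the image border covered
    return [(r, c) for r in xs for c in xs]

# e.g. a 240 x 240 image, 30 x 30 windows, 50% overlap (stride 15)
wins = window_origins(240, 30, 15)
print(len(wins))                          # 225 windows (15 x 15)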
With larger features that expand across a large area of the image, a large window would not hurt the detection of features. However, small features may be missed by using large windows. Given the drawback of a fixed window size, a scale space approach in wing detection using coarse to fine window sizes is an important future research direction. The reason for overlapping windows instead of disjoint segmentation is to guard against the fact that wings may fall on the border of a window, in which case, the Wing may not be detected. By having overlapping windows, we try to minimize the the chance of a wing falling on borders of all windows that (partially) cover it. 3.3.2 Detecting Presence of Wings A hierarchical diagram of how wings may be detected from each subimage is depicted in Figure 3.3. ——f——————————m— 46 In the first stage, we screen out those windows that are not interesting or are too busy and only keep those windows that are moderately busy for wing detection. Since a wing contains a fragment of an edge contour, if a window covers only the image of a single surface patch, there can be no wing in that window and no wing detection is necessary in that window. We call those windows bland windows. On the other hand, windows that are too busy will also be discarded. A window is said to be busy if more than one wing is present. A typical busy window would be one in the neighborhood of a vertex. It has been shown in the literature that edge detection near a corner is a difficult task. Many have attempted to find corners or dominant points on planar curves [4, 36, 44, 106, 110, 118], which would require the image to be segmented in the first place. By detecting wings away from busy areas of the image and later reconstructing the junctions from the detected wings, the true junction type and location might be better estimated. Thus, only those windows that contain exactly two surface patches, counting the background as one, will be further processed. Those are the interesting windows. A simple procedure is used to detect the presence of wing(s) and to identify those interesting windows. First the intensity map of the subimage is convolved with the Sobol operator. The 2xw,,~ze pixels with the highest gradients from the above convolution are selected as potential edgelets. From the definition of wings in Chapter 2, there are three classes of wing contours: linear, circular and parabolic. That is, the 3D edge when projected onto the image plane can be locally approximated by either a straight line, a circle or a parabola. Therefore, the decision criterion for an interesting window is based on the goodness of fit of those wing contour models to the potential edgelets. Note that those 2D curve models have 2, 3, and 3 parameters. By selecting enough edgelet points, we can solve for the parameters of the curve models using the linear least squares method. If the window is bland or contains many 3D edges, then all these fits should be poor and the window will be rejected. Otherwise, 47 Wi 1 Compute gradient image, Gi from intensity data Gi [Wi Boundary Detection DISCARD Fit wing models to boundary and adjacent DISCARD surface(s) using non-linear least—squares ] Set of detected wings. W1! - - - 1 WII Figure 3.3. Wing detection within each subimage w,- 48 in an effort to remove outliers and to achieve more accurate wing location, those potential edgelets that deviate from the fit model by more than the mean deviation of all potential edgelets are thrown out as outliers. 
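The linear least-squares fit of the circular contour model mentioned above can be sketched as follows. This is a standard algebraic circle fit offered only as an illustration, not the thesis implementation; the circle x² + y² + Dx + Ey + F = 0 is linear in (D, E, F), so no iteration is needed.

import numpy as np

def fit_circle(pts):
    pts = np.asarray(pts, dtype=float)
    A = np.column_stack([pts[:, 0], pts[:, 1], np.ones(len(pts))])
    b = -(pts[:, 0] ** 2 + pts[:, 1] ** 2)
    (D, E, F), *_ = np.linalg.lstsq(A, b, rcond=None)
    cx, cy = -D / 2.0, -E / 2.0
    r = np.sqrt(cx ** 2 + cy ** 2 - F)
    return cx, cy, r

print(fit_circle([(1, 0), (0, 1), (-1, 0), (0, -1)]))   # ~ (0, 0, 1)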
The remaining edgelets are fit to the same model again to obtain the final wing contour. The equation of the fitted curve is useful in the estimation of the initial parameter values for the iterative non-linear fitting to be described in Section 3.4. Furthermore, the policy for sampling range data for wing model fitting can also be based on the shape of the contour extracted and the location of the contour.

Note that the selection of the edgelets is based on the Sobel gradient of the intensity image; the range data is not used. We are able to successfully extract most edgelets using this simple procedure. Obtaining edgelets without the use of range data also paves the way for our future research on wing detection from intensity images alone.

3.3.3 Fitting Wing Primitives

Once interesting windows have been identified, the next step is to detect plausible wings from each of these windows. This is achieved by fitting each of the wing models to a sub-sample of points from each of the two regions and the edge contour in the window. As pointed out in Chapter 2, if a wing model assumes a jump edge then the 2 surfaces are fit individually without the benefit of contour data. Otherwise, either one or both surfaces may be fit with the aid of the contour points.

We follow the popular technique of Levenberg-Marquardt for fitting non-linear models [101]. A non-linear model depends nonlinearly on the set of unknown parameters that define it. A χ² merit function is defined and the best-fit parameters are determined by minimization of the merit function. If no wing model fits the sampled points well, then no wing is detected in that window. Otherwise, all detected wings from a window are reported back for higher level processing. In the next section, we will develop the details of the fitting equation, the χ² merit function and the initial estimate of the wing model parameters.

3.4 Simultaneous Boundary and Surface Fitting

Least-squares fitting has been one of the most popular fitting techniques in the field of computer vision [16, 48, 68, 73, 81, 100, 112, 116]. The conceptual idea is as follows. Given a set of n data points X = {x1, x2, ..., xn}, we want to fit the data by an implicit equation, P(xi; Ω) = 0, having parameters Ω. The problem is to find a set of parameter values Ω̂ such that P(Ω̂) models those data points well. Ω̂ is said to be the least-squares estimate (LSE) of the parameters if it minimizes the sum of squared errors; thus,

Σ_{i=1}^{n} (P(xi; Ω̂))²  ≤  Σ_{i=1}^{n} (P(xi; Ω))²   for all possible parameter values Ω.

It is well known that the residual error measure may be a poor indicator of the error between the data points and the fitted model [76]. For example, when fitting a polynomial to a set of points in 2D, points that have the same Euclidean distance to the fitted polynomial may have grossly different residual errors (see Figure 3.4). A better approach is to use the exact normal distance function when computing fitting errors. Instead of minimizing the squared residual errors, the sum of the squared normal distances between the data points and the fitted curve is minimized. Hence the quantity to be minimized is

Σ_{i=1}^{n} dist(xi, P(xi; Ω))².

Figure 3.4. Data points that are close to the fitted curve may have large residual errors.

Unfortunately, the exact formulation of the distance between the data points and the fitted curve often does not exist (e.g., determining the distance between a 3D point and a general quadric surface).
Some surface fitting algorithms in the computer vision community have either continued to use the residual errors [42] or used an approximation to the true distance function [16, 68, 73, 112, 116]. The study by Gross and Boult [48] showed that ad hoc goodness-of-fit measures have many problems and biases in parameter fitting. It was shown that fitting superquadrics with an error measure based on the true Euclidean distance fares far better than if other ad hoc measures are used. In what follows, we will define our fitting functions and derive the true distance measure to be used as the indicator for the goodness of the fit. We are able to derive the true distance measure because explicit quadric surface models are used. This would not be possible if a general quadric surface equation were used instead.

For convenience, denote the candidate window for wing detection as M. Instead of using all the points in the window, we subsample m surface points from each region and n contour points along the wing contour as data points for wing fitting. Let us denote those points as X_s1 = {s11, s12, s13, ..., s1m}, X_s2 = {s21, s22, s23, ..., s2m}, and X_c = {c1, c2, c3, ..., cn}, respectively. The effect of different sizes of m and n has been studied and is reported in Section 3.6.

3.4.1 Fitting Equations

There are 3 different fitting functions used, depending on the type of the wing contour of the assumed wing model. If the wing contour of the assumed wing model is of type convex or concave crease, then the fitting function is

F1(xi; Ω) = w1 (xi ∈ X_s1) P3d^s1(xi; Ω) + w2 (xi ∈ X_c) P2d(xi; Ω) + w1 (xi ∈ X_s2) P3d^s2(xi; Ω),   (3.1)

for all points in X = X_s1 ∪ X_c ∪ X_s2. The expressions (xi ∈ X_s1), (xi ∈ X_c), and (xi ∈ X_s2) are binary terms. For any xi, only one of the terms is 1 (true) and the other two are 0 (false). The weights w1 and w2 are used to scale the relative significance of the image contour residual compared to the range residual. Note that there is also an implicit weighting caused by the numbers of surface points m and contour points n.

If the wing contour is of type "jump", then contour information is not useful in the recovery of the surface parameters. In this case, the wing is broken into two half-wings and each half-wing is fit to only the set of sampled surface points. Thus, across jump edges the fitting function is the same as the surface model equation, which is

F2(xi; Ω1) = P3d^s1(xi; Ω1), ∀ xi ∈ X_s1   and   F2(xi; Ω2) = P3d^s2(xi; Ω2), ∀ xi ∈ X_s2.   (3.2)

If the wing contour of the assumed wing model is of type limb, then the contour data can only aid the recovery of the self-occluding surface. The fitting equation for this surface and the contour is

F3(xi; Ω) = w1 (xi ∈ X_s1) P3d^s1(xi; Ω) + w2 (xi ∈ X_c) P2d(xi; Ω),   (3.3)

for all xi ∈ (X = X_s1 ∪ X_c). Since a limb edge is a jump edge, the other surface, which is being occluded by the self-occluding surface, is detected without the aid of the contour information. Its fitting equation would be the same as Equation 3.2.

3.4.2 Distance Functions

If we substitute P3d^s1 and P3d^s2 in the fitting equations with one of {P3d^pln, P3d^sph, P3d^cyl, P3d^con}, substitute P2d with the corresponding 2D contour equation derived in Chapter 2, and do a least-squares fit over F1, F2 or F3, then we would be computing the LSE of the parameters by minimizing the squared residual errors. As Figure 3.4 indicates, a better error function is the true distance function.
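A compact way to realize a fused fit in the spirit of Equations 3.1 and 3.3 is sketched below for the spherical model: the residual vector stacks weighted 3D surface distances (to P3d^sph) and 2D limb-contour distances (to P2d^sph). The thesis uses the Levenberg-Marquardt method; this sketch substitutes scipy's general nonlinear least-squares solver, and all names and the synthetic data are assumptions made only for illustration.

import numpy as np
from scipy.optimize import least_squares

def fused_residuals(params, Xs, Xc, w1=1.0, w2=1.0):
    """Signed radial distances (their squares equal the squared true distances)."""
    x0, y0, z0, r0 = params
    d_surf = np.linalg.norm(Xs - [x0, y0, z0], axis=1) - r0   # 3D surface points
    d_cont = np.linalg.norm(Xc - [x0, y0], axis=1) - r0       # 2D limb-contour points
    return np.concatenate([w1 * d_surf, w2 * d_cont])

# Synthetic data: a small patch of a sphere centered at (1, 2, 3) with radius 2,
# plus a few points on its projected limb circle in the image plane.
rng = np.random.default_rng(0)
theta, phi = rng.uniform(0, 0.5, 20), rng.uniform(0, 0.5, 20)
Xs = np.c_[1 + 2*np.sin(phi)*np.cos(theta),
           2 + 2*np.sin(phi)*np.sin(theta),
           3 + 2*np.cos(phi)]
ang = rng.uniform(0, 2*np.pi, 10)
Xc = np.c_[1 + 2*np.cos(ang), 2 + 2*np.sin(ang)]

fit = least_squares(fused_residuals, x0=[0.0, 0.0, 0.0, 1.0], args=(Xs, Xc))
print(np.round(fit.x, 3))      # should recover approximately [1. 2. 3. 2.]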
Since we have modeled different types of quadric surfaces by their respective explicit quadric equations, we are able to derive the true distance from a given 3D point, 23d, to the model surface. The distance function from a given 2D contour point, 233,; to the model 2D contour may also be derived. We shall denote the distance function as Dng" and Dng" where qqq denotes one of the surface/ contour types. Planar. The way P13" is defined, the distance from $34 to the fitted plane is equal to the residual at that point. Thus the distance function of Pg" is simply [Pit]n [. Computation of the distance from 239.; to a line in the same image plane is also straight forward. If the line has slope m and y-intercept b, then the distance is simply (ms: + y)/v1+ m2. Thus, 53 szinl(x yazi alabladl): [alm+bly+z+dl ], DPPJMW 9,2; 02,52,612): [agz+bgy+z+d2|, Ppln+pln ngdn+pln(xiyizi alablidl’a2’b2,d2) : 1+i62-01]2 [I (bi-'52) Sphere. Computation of the distance from 9:34 to a fitted sphere can be computed as the distance from 233.1 to the center of the sphere minus the radius of the sphere. In 2D, the distance from 2:9,; to the limb contour projection is the distance from 2:9,; to center of the projected circle minus the radius of the circle. Thus, :ngha W, ,O,yo,ZO,.~o)= |\/(a:-—a:o) +(y- yo) +(z-Zo) ~7‘ol stph ($7 yi ((30, yo, 20,7'0)='- ]\/( (Cl: -’ $0) 2+ (y" yo)2 "’ To]. Cylindrical. The distance from 2:3,; to a cylindrical surface is also straight forward. The distance is the absolute difference between the radius of the cylinder and the dis- tance from 2:34 to the axis of the cylinder which is a straight line in 3D. Computation of the distance from 2,4 to the projected limb contour is similar to that of DPfjn‘fp‘". Note that the slope of the projected limb contour inP 2°: is cot(,6) sin(a ). DP;3I(1"7 ya 2; 550: yo: T0,“) fl) = (90 -— y) cos(a) sin(a) cos(fl) 2 + (:1: - :60) cos(a) cos(fl) sin(fl) — ( 10 were rejected. 0n the other hand, if the surface patch is misclassified when fitting fused data, the planar model tends to have the best fit followed by the conical model but never the spherical model. 76 Table 3.9. Fitting all models to Conical surface patch Number of trials: 1000 Number of data trial: 30 Observation: Among the wrong models, the conical model fit the cylindrical surface patch the best if only surface data were used. With fused data, the the planar model does better among the wrong models. Experiment 7: Fitting all models to Conical patch The result of this experiment is consistent with the previous experiments in that fitting with fused data yielded fewer misclassifications. From Experiment 3, we learned that fitting a small conical surface patch is a difficult task even if a conical model Were given. Here we found out that a wrong model is less likely to fit a small conical surface patch if fused data were used. From Table 3.9, the conical surface patch is misclassified 15.1% (a = 0.01) and 56.7% (a := 0.05) of the time when only surface points are fitted. When both surface and limb contour points are used, the misclassification rate drops to 2.5% and 13.7%. We also note that without contour information, among the misclassified cases, the spherical model tends to fit the surface patch well, while the Spherical model was never a good fit to the conical surface patch when fused data 77 were used. This phenomenon is also consistent with the earlier observation that the straightness of the limb contour prohibits the spherical model from being a good fit to the conical surface patch. 
Observation: Among the wrong models, the spherical model fit the conical surface patch the best if only surface data were used. With fused data, the cylindrical model does better among the wrong models. Experiment 8: Fitting all models to Planar patch As in Experiment 4, data points were sampled from a window of size 1.00 X 2.00 £122. The planar model was fitted to the data set the same way as before. However, when fitting conical, cylindrical and spherical models, only data collected from one of the planar surfaces was used. Any small planar patch can be reasonably fitted if the radius of the quadric model is large. For this reason, the initial r0 was set to 7 inches for cylindrical and spherical models. The other parameters were chosen carefully so that the surface would conform close to the planar patch. As in the previous two experiments, any cylindrical or spherical fit with radius greater than 10 inches was rejected as a bad fit. The results of this experiment are given in Table 3.10. Clearly, fitting with fused data results in much better classification power. The fused fit only misclassified 1 surface patch out of 2000 chances where the surface-only fit produces a much worse result. As with previous observations, without the edge contour data, the conical model will fit the surface patch rather well. However, with the added contour information, only one of the wrong models fitted the planar patch better than the planar model. Table 3.10. Fitting all models to Planar surface patch 78 True surface form: PLAN AR Number of trials: 1000 Number of data points sampled per trial: 30 Surface-Only Fused con cyl sph pln con cyl sph pln bad fits 178 937 998 O 1000 996 1000 0 a x7 3.54 3.63 6.91 0.42 - 8.22 - 0.41 || iterations 15.33 31.43 49 3.47 - 28.25 - 3.30 0.01 Best X2 161 9 0 830 0 0 0 1000 % misclassified 17.0% 0.0% fits 277 974 998 1000 996 1000 4. 5. . . - 8. - zteratzons .7 . 4 .5 . - . — est 51 0 m. Observation: Among the wrong models, the conical model fit the planar surface patch the best only when surface data were used. With fused data, none of the wrong models fit the planar patch well. 3.5.3 Simulation Results Summary The simulation results of the first 4 experiments show that the use of fused data in surface parameter recovery is superior to using either the contour-only or the surface- only data. In general, the fused fit yields more accurate parameter estimates and better convergence rate (especially under higher noise level) without a significant increase in number of iterations before convergence. In fact, the average number of iterations for convergence went down using the fused data in some cases. Monte Carlo experiments directed toward recognition capability indicate that the proposed fused fitting method has a much lower false alarm rate than the method of fitting only surface data. On the rare occasions where the fit to an incorrect model 79 Table 3.11. Percentage of bad fused fits versus number of surface and contour points. n \ m 5 10 15 20 25 5 32.7% 8.0% 7.6% 5.9% 5.2% 10 31.8% 7.8% 7.7% 6.2% 5.6% 15 31.3% 7.9% 7.1% 6.5% 5.9% 25 30.6% 7.8% 7.6% 6.4% 5.7% m = number of surface points = number of contour points does converge, the misclassification rate (wrong surface label) is also much lower. This misclassification rate appears to be small enough to give good input to our planned higher level scene reconstruction module. 
The observations made in each of the experiments strongly supports our hypothesis that edge contour information contributes greatly to the correct qualitative classification of surface shape. 3.6 Fitting with Different Number of Surface and Contour Points In the literature, fitting is often performed with a large number of surface or contour points. We argue that more is not necessarily better. In this section, we study the effect of fused and surface-only data fitting with various numbers of data points over a small cylindrical surface patch. A Monte-Carlo experiment of 1000 trials with different numbers of surface and contour points was conducted. The number of surface points sampled was 5, 10, 15, 25, or 50 and the number of contour points was 5, 10, 15 or 20. The numbers of bad fits over the 1000 trials are reported in Tables 3.11. Contrary to popular belief and common practice in fitting, the results suggest that the performance of the fitter is stabilized with very few surface and contour points. Improving the performance by 80 Table 3.12. Surface-only fitting with various numbers of surface points. m Bad Fit % m Bad Fit % 10 38.5% 40 5.0% 15 47.4% 50 4.9% 20 6.1% 55 4.5% 25 5.1% 60 4.5% 30 5.1% 65 4.2% 35 5.6% 75 4.3% increasing the number of data points is possible but the gain is minimal compared to the increase of the data size. Similar experiments on surface-only fitting was also conducted and reported in Table 3.12. The total number of points sampled ranges from 10 to 75. As with the fused fitting, the performance of the fit stabilized at a very low number of data points (20). Although the data suggests that more data points would improve the fit, the improvement is small. One must weigh the advantage of higher convergence ratio to the cost of slower convergence rate and more expensive sensing when more points are used to determine the number of points to use for fitting. Similar results should be expected with surfaces other than cylindrical surfaces. 3.7 Experiments with Real Fused Imagery 3.7.1 JIGl Example All surfaces in this image are planar (see Figure A.4). 104 out of total 255 windows were interesting windows, (i.e., candidate windows for wing detection). Of the 104 interesting windows, 80 were found to contain wings. All hypotheses in the other 24 windows were rejected due to poor surface/fused fits. Of the 80 windows, 75 wings were correctly identified. However, 5 false wings were also reported. Those 5 false wings all belong to the error category of E2. E2 occurs if the window covers an 81 homogeneous region of the surface but the wing detector managed to find a wing in it. For example, in Figure 3.7, window A produces a spurious wing. Window B shows that the'correct wing primitive hypothesis, (PLN+LN+PLN), is verified as the wing present in the window and the other false hypotheses are rejected by fused fitting. Window C shows that a jump wing is correctly detected. To evaluate the fitted parameters, we manually clustered the 75 good wings into the 4 surfaces (this step would be done automatically by the LDG reconstruction process) and visually inspected the surface parameters of wings in the same cluster. The means and the standard deviations of each parameter are also shown in Figure 3.7 (The three numbers represent (A, B, D) in the planar equation A3: + By + z + D = 0). From the standard deviations, we see that there is little variance in parameters among all wings in the same cluster. 
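The per-cluster check described here, and continued below for the surface normals, can be pictured with a small sketch. The parameter values used are made up for illustration; each wing is assumed to carry its fitted (A, B, D) for the plane Ax + By + z + D = 0.

```python
import numpy as np

def cluster_stats(wing_params):
    """Mean and std. dev. of the fitted (A, B, D) over one cluster of coplanar wings."""
    arr = np.asarray(wing_params, dtype=float)
    return arr.mean(axis=0), arr.std(axis=0)

def plane_normal(a, b):
    """Unit normal of the plane A*x + B*y + z + D = 0, i.e. direction (A, B, 1)."""
    n = np.array([a, b, 1.0])
    return n / np.linalg.norm(n)

# Made-up parameter samples for two clusters of wings on perpendicular faces.
cluster1 = [(1.00, -1.00, -26.0), (1.02, -0.99, -25.9), (0.99, -1.01, -26.1)]
cluster2 = [(0.00,  1.00, -25.0), (0.01,  0.99, -25.1), (-0.01, 1.01, -24.9)]

m1, s1 = cluster_stats(cluster1)
m2, s2 = cluster_stats(cluster2)
print("cluster 1 mean:", m1, "std:", s1)
print("cluster 2 mean:", m2, "std:", s2)
# For perpendicular faces the dot product of the unit normals should be near zero.
print("normal dot product:", np.dot(plane_normal(m1[0], m1[1]),
                                    plane_normal(m2[0], m2[1])))
```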
Note that the two surfaces at the far ends have the same normals and the dot product of the normals of the two inner surfaces, which are perpendicular to each other, equals 0. 3.7 .2 CYL2 Example There are two types of surfaces in this image: cylindrical and planar (see Figure A.4). One of the limbs is also registered in the fused imagery. Results of the wing detection indicates that there are only 45 interesting windows: 30 winged windows and 15 busy windows. Of the 30 winged windows, 23 contain good wings and 7 contain erroneous wings. The errors are of type E1 (4) and type E3 (3). An E1 error is one where the simple wing test failed to recognize that two or more 3D edges exist in the window and to classify it as a busy window (6.9., window F and the lower left corner of Figure 3.8). An E3 error is one where the surface is cylindrical but the cylindrical surface hypothesis was rejected and the planar/spherical hypotheses was accepted. All 3 E3 errors had occurred in windows in which only surface data were available for fitting (i.e., along the boundary near the lower limb in Figure 3.8). 82 HAL-100.357) [-1.42,-100,-2532) [035,031,013] [009,002,000] '\ \ (000,400,402?) Q; / l [”31'0‘00'035‘) (001,-100,-25.26) \\ [038,001,002] A:(90 30) wing! - PLN, PLN hypotheses chi-sq fitted parameters (CYL+PB+ 1.44) (CYL: x0=92.51 y0= 7.47 r0= 43.43 alpha= 31 beta=136) *(PLN+PB+ 1.33) (PLN: a = 1.47 b =-0.99 d =-38.01) (SPH+PB+ 7.79) (SPH: xO=13.87 yO=-4.93 zO= 32.86 r0= 9.92) ( +PB+CYL 159.30) (CYL: x0=61.36 y0=25.63 r0= 27.97 alpha= 28 beta=148) *( +PB+PLN 1.12) (PLN: a = 1.33 b =-1.04 d =-36.99) ( +PB+SPH 27.06) (SPH: xO= 8.18 yo=-o.99 zO= 28.59 r0= 1.88) B:(9O 165) wing! - PLN+LN+PLN (CYL+LN+PLN 1000) (CYL: x0= 0.00 y0= 0.00 r0= 0.00 alpha= 0 beta: 0) (PLN: a = 0.00 b = 0.00 d = 0.00) (PLN+LN+CYL 1000) (PLN: a = 0.00 b = 0.00 d = 0.00) (CYL2 10: 0.00 y0= 0.00 to: 0.00 alpha= 0 beta= O) *(PLN+LN+PLN 1.46) (PLN: a = 0.01 b = -1.00 d = -26.78) (PLN: a = -1.46 b = -1.00 d = -25.79) C:(150 45) wing! - PLN (CYL+ + 1000) (CYL: xO=-43.49 y0= 23.04 r0=61.36 alpha= 50 beta=136) (CYL+LN+ 597) (CYL: x0 37.28 yO=-16.43 r0= 0.00 alpha=153 beta= 46) (SPH+LN+ 1000) (SPH: x0 3.55 y0= -o.87 zo=24.81 r0= 3.54) *(PLN+LN+ 1.06) (PLN: a = 1.39 b = -0.97 d =-37.43) Figure 3.7. Detected wings in the jig image. Window A in the image corresponds to the output marked (90 30) where the jump wing hypothesis (PLN+LN), (LN-+1) LN) is wrongly accepted. It should be a bland window. Window B in the image corresponds to the output marked (90 165) where the crease wing hypothesis (PLN+LN+PLN) is correctly accepted and the others rejected. Window C in the image corresponds to the output marked (150 45) where the jump Wing is detected. * denotes accepted hypothesis. 83 [435,333,129,51,401 (.058,1.30,005,0.94,000; (3521,2847, 1.22,47,43) (533,326,031,3A4,4211 A: (60 105) wing! - CYL, CYL+LN *( + +CYL 1.92) (CYL: xO=-37.67 y0=28.21 r0= 1.65 alpha= 46 beta= 45) *( +LN+CYL 3.42) (CYL: xO=-42.92 y0=34.84 r0= 1.74 alpha= 52 beta= 45) 3: (60 120) wing! - CYL+LN < + +CYL 10.44) (CYL: xO=-69.36 y0= 68.85 r0= 1.78 alpha= 68 beta: 43) *( +LN+CYL 1.78) (CYL: xO=-41.18 y0= 31.93 r0= 1.87 alpha= 50 beta= 45) C: (105 75) wing! - CYL, SPH *(CYL+ + 1.31) (CYL: xO=-32.56 y0= 23.17 r0= 1.83 alpha= 41 beta: 44) (CYL+LN+ 21.17) (CYL: xO=-20.19 yO=-78.03 r0= 0.87 alpha=109 beta=163) *(SPH+LN+ 1.34) (spa: x0= 3.24 y0= -1 03 zo= 27.76 r0= 1.88) D: (120 90) wing! 
- SPH, CYL *(CYL+PB+ 5.39) (CYL: x0=-26.02 y0= 32.35 r0= 1.29 alpha= 50 beta= 34) *(SPH+PB+ 4.28) (SPH: 10: 3.29 y0= -o.94 zO= 27.37 r0= 1.56) '3: (60 195) wing! - PLN *(PLN+CC+ 1.21) (PLN: a = 1.38 b F: (90 165) wing! - (PLN, SPH) *(PLN+PB+ 0.42) (PLN: a = 1.61 b *( +PB+SPH 0.57) (SPH: xO= 0.75 yO ( +PB+CYL 1000) (CYL: x0= 0.00 yO -1 .05 d =-25.56) -1.13 d = -25.54) 0.64 zO 26.21 r0= 1.88) 0.00 r0 0.00 alpha= o beta= 0) Figure 3.8. Detected wings in the CYL image. Wings A and B : limb location in 2D aids the recovery of the cylinder parameters. Wings C and D : cylinder parameters recovered without the limb information in 2D. Wing E: correct wing segment type is recovered. Wing F: small cylindrical surface patch is best fitted by a spherical model. * denotes accepted hypothesis. 84 Again, we clustered the correct wings into the two surface regions. We further clustered the wings associated with the cylindrical surface into two groups: one where parameters are estimated from fused data and one in which parameters are estimated from surfaceoonly data. The mean parameter values and their standard deviations are reported in Figure 3.8. The 5 numbers associated with the cylindrical surface are the five cylinder parameters ($0,yo,ro,a, ,6). Six and nine wings were deteCted with the limb (fused fits) and without the limb (surface-only fits), respectively. Of the 5 parameters, we only have the ground truth for 70, which should be close to 2.0. The fused fit performed better in having closer radius than the surfaceonly fits. We have no way of accurately measuring the ground truth of other pose parameters. By trying to fit the mean parameters from the surface-only fit as the initial estimates for fused fitting and vise versa, we found that the mean surfacevonly fit parameters are not good starting parameters for fused fitting (did not converge). However, the mean parameters from fused fitting are good starting values for the surface-only fitting. This result empirically suggests that the parameters recovered from fused fitting are more believable than those from surface-only fitting. In any case we are encouraged by the fact that the parameters clustered tightly. By carefully studying the results, some general inferences can be made. We use Figure 3.8 to illustrate those inferences. We note that those inferences are true in general among all the images that were tested. 0 2D limb contour information is useful for recovery of surface parameters. As stated above, parameters recovered from fused fitting should be closer to the ground truth. In Window A, the cylindrical surface parameters are recovered with ((+LN+CYL)) or without ((+ +C'YL)) the limb contour. However note that the estimated radius is closer to the truth with fused fitting. In Window B, the surface- only fit ((+ +CYL)) did not converge where as the fused fit did. Upon examining the initial parameter estimates, we found that the initial values of surface-onlyoe fits 85 are rather far from both sets of mean parameters described above. Note that the initial values for surface fits are derived from a local surface patch whereas the initial values for fused fits are derived with the aid of limb contour position in 2D. This leads us to believe that limb contour position in 2D is important information to use for recovering surface shapes. 0 The program is able to correctly identify a cylindrical region near limb boundary from one that is not. 
From sample output of window B, we see that the hypothesis that the cylindrical surface is near limb ((+LN+CYL)) is correctly accepted. In windows C and D, the hypothesis that the surface is near the limb is properly rejected. Only the hypotheses that the surface is not near limbs are accepted (e.g.,(CYL+ + ) and (CYL+PB+ )). o Curved contours can be correctly inferred. This is exemplified by windows D and E. where the correct local contour shape, parabolic, is correctly identified. 3.7.3 Open Cup vs. Lid Cup In Figure 3.9 we demonstrate the power of wing representation and our wing detector. In Figure 3.9 (a), wing detection is performed on an image of a cup with no lid. The detected wings around the rim on top are correctly detected as jump wings. On the other hand, the wings around the rim on top of a cup with lid, as in Figure 3.9 (b), are correctly identified as crease wings. From the line drawing perspective, those two objects have the identical projection in the 2D. Because of the many-to—l mapping in projecting 3D points into 2D image plane, line drawings in 2D often have ambiguous interpretations. This effect is also exhibited in Figure 1.2. 86 (a) (b) (a): (90 90) wing! - CYL < +PB+CYL 16.14)(CYL: x0=-25.09 y0= 39.21 r0= 1.31 alpha= 54 beta= 31) (b): (90 90) wing! - CYL, PLN *(CYL+PB+PLN 17.74)(CYL: x0=-18.75 y0= 39.91 r0= 1.19 alpha= 55 beta= 25) (PLN: a = 1.61 b = -1.80 d =-33.9o) Figure 3.9. Wing representation can distinguish an opened cup from a cup with a lid. (a) Opened cup, the line corresponds to the rim is a jump edge. (b) Cup with lid, the line corresponding to the rim is a crease edge. 87 3.7.4 Summary on 10 Real Images The summary of wing detection on the 10 real images depicted in Appendix A is given in Table 3.7.4. The window size for wing detection is fixed at 30 pixels x 30 pixels. By carefully studying the table and the program output, we make the following observations. 1. Planar surfaces are reliably detected, even when the visible surface patches are small. Furthermore, there are no E4 errors. However, there tends to be more E1 and E2 errors. We believe a more intelligent busy window detector can remove most of the E1 errors. 2. For fits involving spherical surface patches, the wings can also be reliably de— tected. We also noticed that the estimated radii tend to be smaller than the ground truth. There are also no E5 errors in the two scenes with true spherical surface patches. The curved limb contour works strongly against the cylindri- cal hypotheses which must have straight limb contours. This agrees with our experiments done on synthetically generated data in Section 3.5.2. 3. Cylindrical fits were very sensitive to initial parameter values. We experienced better initial parameter estimations when the window overlapped the limb pro- jection. The limb in 2D makes apparent the direction of the maximum curva- ture. Furthermore, it restricts the 2D projection of the axis to be parallel to it. Without the limb, the axis can only be estimated from the surface points, which is difficult if the patch is small. There were 13 E3 errors in the 5 images involving cylindrical surface patches. All but one of those 13 errors occurred in windows that did not overlap a limb. Thus, as with the spherical case, the straightness of the limb boundary acts as a deterrent against spherical hypotheses. 88 Table 3.13. 
Summary of wing detection of 10 real images Number of Interesting W,- Number of W,- containing wing region no correct too wing wing wing(s) E1 E2 E3 E4 E5 E6 small detected detected inferred block2 54 27 5 J 22 I 27 21 T4LOFOFOLOII bulbl 11 5 0f 6 l 5 4l0[0[0[0]0|1 cone 28 18 2 L 8 [18 0 L0j0[0]0j0]17 cylinder2 45 30 619 [30 23 [4|0I3]0]0[0 jigl 104 80 3121180 75 10L510JOLOIO agpartl 22 6 1[15 I 6 4 l2[010]0f0[0 block1+ 49 26 columnl 3] 20 [ 26 18 16]O]2101010 cone+ 34 20 cylinder2 0 1 14 I 20 10 1 3J 0] 3 [0 [0 | 2 hump+ 48 20 agpart 2 f 26 E 20 16 T210111010j1 cone+ 37 24 J cylinder+ 10 3 24 13 4 2 4 0 O 1 sphere l i l W 0 E1: window should be busy but wing is detected. 1 0 E2: 0 E3: 0 E4: 0 E5: 0 E6: fits anyway. W.- is bland, but wing is detected - (spurious wing) W,- contains cylindrical surface patch but planar or sphere are a better fit. W,- contains planar surface patch but cylinder or sphere are a better fit. W,- contains spherical surface patch but plane or cylinder are a better fit. W; contains irregular surface patch but plane, cylinder and/ or sphere 89 4. Finally we note that in the absence of the conical surface hypothesis, when fitting a conical surface patch the fitter will reject all the hypotheses when the surface patch overlaps part of the limb boundary. However, when the surface patch does not include the limb boundary, the spherical hypothesis will often be accepted. This again is in agreement with our finding with synthetic data. 3.7 .5 More Wing Detection Examples More examples of wing detection results are shown in Figure 3.10 (3)-(f). The figures show the detected wing samples overlayed on top of the original intensity images. Although some of the wings along an edge contour look ragged, this is partly due to the fact that short local edge contour may be well approximated by more than one curve type, resulting in possibly more than one wing of different wing contour shape being detected in the same window. Also, missing range data near edges causes holes in the displayed data. The current wing detector does not differentiate among those wings and all detected wings are returned for higher level processing. In the future, a smarter wing sensor could be designed to return only the “correct” wing for each window. Note that there are some spurious wings and that some edge contours have no wings. On planar surfaces, spurious wings often have very similar wing surface at- tributes on either side and can be removed easily (see Section 5.3). For a spurious wing in a quadric surface, we observed that the wing surfaces often have different qualitative labels, making detection of it as a spurious wing difficult. This can be partly attributed to the fact that an interesting window is not divided evenly into two halves. If one of the regions becomes too small, an erroneous surface label may emerge from the non-linear least squares fit. In all the figures, edge contours with no wing can all be attributed to the fact that the fixed scale window always overlapped a fragment of those edge contours along with some other edge contours in the image, 90 (a) cup (b) hump (c) block1+coll (d) C011+C012 (e) block0+ball (f) cup+jig Figure 3.10. Wing detection examples of quadric surface scenes —7—W"fl 91 which resulted in a busy window. In the future, a scaled-space wing detector may be tried to alleviate both of these problems. The conclusions drawn from the previous examples also apply to these examples. 
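One way to picture the easy removal of spurious wings on planar surfaces mentioned above (Section 5.3 gives the actual procedure) is to compare the fitted plane parameters on the two sides of the wing; the tolerance values below are assumptions chosen only for illustration.

```python
import numpy as np

def plane_angle(p1, p2):
    """Angle (radians) between the normals of two planes A*x + B*y + z + D = 0."""
    n1 = np.array([p1[0], p1[1], 1.0]); n1 /= np.linalg.norm(n1)
    n2 = np.array([p2[0], p2[1], 1.0]); n2 /= np.linalg.norm(n2)
    return np.arccos(np.clip(np.dot(n1, n2), -1.0, 1.0))

def is_spurious_planar_wing(left_plane, right_plane,
                            angle_tol=np.radians(3.0), offset_tol=0.5):
    """True if the wing surfaces on either side are effectively the same plane."""
    same_orientation = plane_angle(left_plane, right_plane) < angle_tol
    same_offset = abs(left_plane[2] - right_plane[2]) < offset_tol
    return same_orientation and same_offset

print(is_spurious_planar_wing((1.40, -1.00, -26.0), (1.41, -1.01, -25.9)))  # True
print(is_spurious_planar_wing((1.40, -1.00, -26.0), (0.00,  1.00, -25.0)))  # False
```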
The limb boundaries of the cylinder and the spherical ball are all correctly detected and labeled. Further, the surface parameters recovered with the limb contours are more consistent than those derived without it. More examples of wing detection on scenes consisting of only planar surfaces can be found in Figure 5.7. 3.8 Summary In this chapter, a procedure for the detection of object wings in fused intensity-range data was outlined. Wings, which are local features, are detected from overlapping subimages. A simple busy/ bland window detector is used to weed out the uninter- esting subimages so that only those interesting subimages are further processed for wing detection. Hypotheses about the type of wings that are present are verified by fitting the hypothesized wing models against the data points in the subimage. The novel idea of simultaneously fitting intensity contours and the adjacent range (surface) data is the central theme of the wing detector. We derived the contour equations of the limb projection of spheres, cylinders and cones from their respective 3D surface shape formulation and the projection of crease edges formed by intersecting quadric surfaces so that the 2D contour and 3D surface equation(s) are related by a common set of parameters. We showed by means of Monte Carlo experiments that (1) fused fitting results in fewer bad fits and more accurate parameter estimates than if surface or contour data were used alone; moreover, (2) fused fitting is better able to discriminate the different surface types. The advantage of fused fitting was more evident when higher Gaussian noise was introduced in the experiments. 92 We also studied the effect of different numbers of surface and contour points used for fused fitting. Contrary to popular belief, only a very small total number of data points are needed for the fit to achieve high success rate. This result seems to suggest that past fitting techniques that use hundreds and thousands of points are merely wasting valuable computing resources for a very small gain in confidence. Wing detection over a large set of real fused images demonstrates the effectiveness of our wing detector. In general, correct wings are detected over “interesting” regions of the image and spurious wings, although present, are rare. The usefulness of the edge contours, especially limbs, in aiding surface fitting was apparent. It was shown that the limb contour is essential in generating good initial parameter estimates for wings near a limb boundary. We also observed that for curved surfaces, although the qualitative surface shape is often correctly inferred, the quantitative parameter estimates still need to be improved. The problem lies in generating good initial parameter estimates. Near a limb boundary, a more accurate guess can be induced with the help of its 2D position. Away from limb boundary, the initial estimates must be “guessed” over the small subimage area. Improving the initial estimate is a topic for future research. We also observed that if the region from which the surface points are sampled for fitting were too small, then the recovered qualitative and quantitative shape measures will often be incorrect. Finally, we note some of the improvements that can be made to the wing detection algorithm. As is, wing detection is performed over a fixed scale; the window size is fixed for all images. Short image segments may never be recovered given the fixed window size and the image partition policy. 
To be robust, a scale-space approach can be used to prevent small features from being overlooked. Although this may increase the computational complexity, the algorithm is suitable for parallel implementation. Since wing detection within a subimage window can proceed independently of the others, all wings can be detected concurrently on a parallel computer.

CHAPTER 4

Perfect Reconstruction of LLDG - Origami/Polyhedral World

4.1 Introduction

Line drawing analysis has been actively researched in the past. The goal of research in this field is to interpret the scene by interpreting the lines of the line drawing of objects derived from the scene. A common trait of the early papers is that they all assume that a perfect line drawing is available as input. They often left the difficult task of low level processing, taking a raw image and converting it to its line drawing representation, to future research. In addition, a junction catalogue was usually derived in advance to facilitate the line interpretation process; this in turn placed unrealistic limitations on the domain of objects. The major drawback in those works is that by insisting on having the perfect line drawing in hand before interpreting the scene, one is faced with the difficult if not impossible mission of deriving the perfect line drawing from the raw data.

In this chapter, we will present two methods for complete construction of the labeled line drawing graph (LLDG) from a set of sampled wings under ideal assumptions. The added power of sparse range data is explored. With the benefit of the range data, detected wings can be sorted so that data from individual planes can be interpreted separately. The basic idea is to first recover the line drawings of individual visible object faces; then the entire view is reconstructed by integrating the recovered individual faces. Two algorithms have been developed to do the reconstruction. In the first algorithm, global or local constraints are not necessary, provided that we can take sparse samples from the original range image. The second algorithm relies only on the geometrical properties of the LDG and does not need to reference the original range image. In so doing, not only will a perfect line drawing be constructed, but the lines and the junctions in the LDG will also be interpreted and the pose of each polygonal surface in the scene will be recovered (the labeled line drawing graph). An example showing the wing representation and reconstruction of an object is depicted in Figure 4.1.

Figure 4.1. Wing representation and reconstruction of the Big-Block (from [26]): (a) wing representation, (b) labeled line drawing.

Both algorithms to be presented in this chapter can be classified as deterministic algorithms because no heuristic procedures are involved in the reconstruction process. This is only possible due to the strong assumptions made about the completeness of
The main differences between this research and that of the past are that: (1) a perfect line drawing is derived instead of assumed; (2) a junction catalogue is not needed in interpreting the line drawing; and (3) surface pose information is also recovered. In the next section, we will survey some of the previous research done on line drawing analysis. This is followed by the presentation of the two LLDG reconstruction algorithms. Arguments for correctness of each of those algorithms will be given. Section 4.5 provides reconstruction examples from both syntactically generated scenes and real scenes. A section summarizing the results and contribution of this research concludes the chapter. 4.2 Survey of Related Background Analysis of line drawings have received much attention since the early days of com— puter vision. Interpretation of line drawings of polyhedral objects has been studied quite extensively [30, 61, 66, 82, 108, 115, 123] and is well understood. Only recently, have works on line drawing interpretation been extended to a limited class of curved objects [83, 92, 93]. One common drawback of all those algorithms is that they all assumed that a perfect line drawing can be acquired from the input image. HOWever, to date, no edge detection algorithms can yield perfect line drawings so that those algorithms can be applied. In the following subsections, past work in the area of line drawing analysis will be reviewed. Most works have concentrated on the polyhedral world. Only recently has interpretation of curved line drawings received any attention. One significant result coming out of those studies is that a curved line in the image may be labeled as an occluding edge in one part and a crease edge in the other [83]. ————'_w """ 96 4.2.1 Model Based Interpretation of Line Drawings One of the first significant works on interpreting two dimensional line drawings of three dimensional objects is due to Roberts [108]. The main component of his 3D object recognition system is the database of models of objects to be recognized. Given an image of an object from the model database, the perfect line drawing of the object is extracted from the image. The database is searched for a model whose perspec- tive projection, after some translational, orientational transformation and scaling, coincides with the extracted line drawing. Although this system required the strong assumptions that (1) a perfect line drawing can be extracted, and (2) objects are isolated in the images, it did pave the way for future research on line drawing inter— pretation and object recognition via a model database. Falk [34] also used fixed models of the objects that would appear in the scene to help in identifying the visible objects in a photograph of a scene and determining their orientation and positions in 3D space. There are two main differences between the systems of Falk and Roberts. For one, Falk allowed for imperfect input (line drawings generated from the photograph can be degenerated views of the objects or some edge may be missing all together) as Opposed to the perfect line drawing required by Roberts. Secondly, Falk’s models specified precise shapes and sizes as opposed to Roberts’ models, which are generic in the sense that a cube can represent any right parallelepiped. A hypothesize—and—test strategy was used by Falk in identifying and locating objects in the scene. 
4.2.2 Model-Free Interpretation of Line Drawings - Polyhedral and Origami World

In contrast to Roberts, Guzman [51] did not attempt to recognize isolated objects in a scene. Instead, he developed a program called "SEE" which took a perfect line drawing of a scene consisting of jumbles of solid polyhedra with occlusion and tried to partition it into a set of individual objects. The partitioning was based on a set of ad hoc rules where configurations of lines and regions at junctions dictate how the regions are to be decomposed/grouped. Although this method worked well for many complicated line drawings, it was nevertheless a heuristic algorithm. No justification of why the rules are correct was given. Examples of line drawings where the algorithm may fail were given by Mackworth [82].

The first systematic scene interpretation based on line drawing labeling was largely credited to Huffman and Clowes [30, 61]. They independently went about line drawing interpretation of trihedral polyhedral scenes. A trihedral polyhedron is an object in which exactly three planar surfaces meet at each vertex. Each line in the line drawing can be labeled by exactly one of four symbols: {+, −, →, ←}. A "+" indicates the projection of a convex edge where both surfaces are visible; a "−" indicates the projection of a concave edge where both surfaces are also visible; and an arrow indicates the projection of an occluding edge where only one of the two surfaces is visible and the visible surface lies to the right as one moves in the direction of the arrow. Junctions which correspond to visible vertices are labeled as a combination of the line labels of all lines meeting at each junction. Assuming a general viewpoint, an algorithm can be formulated to systematically, yet exhaustively, examine all physically realizable junctions. Given the trihedral assumption, the three faces of any vertex define three intersecting planes, which divide the space into 8 octants. By considering all ways of filling up these 8 octants with object material and viewing the vertex from unfilled octants, a complete junction catalogue can be obtained. A complete listing reveals that only 18 of 208 possible junction labels are physically realizable (see Figure 4.2).

Figure 4.2. Huffman-Clowes junction catalogue.

Using the junction catalogue, valid interpretations of a line drawing of the trihedral world are rendered through exhaustive search. This labeling scheme is not without problems. More than one interpretation of a single line drawing is possible (Figure 4.3 (a)). Moreover, nonsensical scenes are coherently labeled (Figure 4.3 (b)); line drawings may be given geometrically impossible line labels (Figure 4.3 (c)); and scenes that cannot arise from polyhedra are easily labeled (Figure 4.3 (d)).

Waltz [123] extended Huffman and Clowes' research in two important ways. One, he expanded the Huffman-Clowes set of line labels to include shadows, cracks, and separable concave edges. As a result, the number of line labels increased from 4 to 11 and the junction catalogue was greatly expanded due to the addition of 4- and 5-line junctions. Approximately 2593 junction labels were physically possible. It is clear that exhaustive search for a consistent line labeling as with Huffman-Clowes is computationally prohibitive. The second contribution from Waltz is that he replaced the exhaustive search for consistent line labelings by a constraint-propagation algorithm to weed out impossible junction labels at each junction.
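Waltz's filtering can be pictured as arc consistency over per-junction label sets: a candidate junction labeling survives only if, for every line it shares with a neighboring junction, some candidate labeling at that neighbor assigns the same label to the shared line. The sketch below is a generic illustration of this idea, not the catalogue or code of this thesis; the tiny two-junction example is invented.

```python
def waltz_filter(candidates, shared_lines):
    """candidates: {junction: list of labelings}; a labeling is a dict line -> label.
    shared_lines: {(j1, j2): name of the line shared by junctions j1 and j2}."""
    changed = True
    while changed:
        changed = False
        for (j1, j2), line in shared_lines.items():
            for a, b in ((j1, j2), (j2, j1)):
                keep = [lab for lab in candidates[a]
                        if any(lab[line] == other[line] for other in candidates[b])]
                if len(keep) != len(candidates[a]):
                    candidates[a] = keep
                    changed = True
    return candidates

# Invented example: line 'e12' joins junctions J1 and J2.
candidates = {
    "J1": [{"e12": "+", "e13": ">"}, {"e12": "-", "e13": "+"}],
    "J2": [{"e12": "+", "e24": "-"}],
}
print(waltz_filter(candidates, {("J1", "J2"): "e12"}))
# Only the J1 labeling with '+' on e12 survives.
```

Repeated pruning of this kind removes labelings that cannot agree with any neighbor; a tree search, as noted next, is still needed to extract globally consistent labelings.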
A direct tree search followed to enumerate all possible labelings. However, the problems associated with Huffman-Clowes labeling still persist.

Figure 4.3. Problems with the Huffman-Clowes labeling scheme. (a) More than one interpretation may exist for a single line drawing. (b) Syntactically nonsensical scenes are coherently labeled. (c) Line labels that are geometrically impossible. (d) Non-polyhedra can also be labeled (b, c, d are from [7]).

To deal with the problem of legal labelings that are not realizable as polyhedra, Mackworth [82] proposed the use of gradient space to remove those labelings. Simply stated, the gradient (p, q) measures the instantaneous change in the depth of a surface at point (x, y). The gradient space is then a representation for vector orientations and is used to represent the surface normals of the polyhedral faces. For a planar surface, all points on the plane map to the same (p, q) point in the gradient space. Using the same set of line labels as Huffman and Clowes, Mackworth was able to interpret line drawings by reasoning about surface orientations based on the properties of the gradient space. Not only were the labelings which yield geometrically impossible objects removed, but the object domain was also broadened to include all polyhedra (not just trihedral polyhedra).

Figure 4.4. Origami junction types and the number of possible junction labels for each type (from [66]).

Mackworth's approach was continued by Kanade [66]. Kanade broadened the object domain of Huffman and Clowes to include origami objects. The assumptions were that surfaces in the origami world are planar, that no more than three surfaces of different orientations meet at a vertex, and that no more than three edges of different orientations are involved at a vertex. Orthographic projection was assumed. Under this set of assumptions, a new junction catalogue was derived much like that of Huffman and Clowes. Line labels were still restricted to the set {+, −, →, ←}. However, 9 junction types are now possible (Figure 4.4) and the total number of entries in the junction catalogue jumps from 18 to 153. Line labeling proceeds by using a filtering procedure for labeling junctions similar to that of Waltz and by checking the consistency of surface orientation. Kanade later introduced two heuristics to filter out unnatural interpretations [67].

Sugihara [115] took the algebraic approach to line drawing interpretation. First, a set of candidate spatial interpretations was generated for the given line drawing via constraint propagation much like that of Waltz [123]. For each interpretation, spatial information such as the relative depth between a pair of vertices or between a vertex and a surface was derived from the line labels. The task of determining the correctness of each candidate interpretation was reduced to the problem of testing whether a system of linear constraints has a solution or not. Assuming that there are m regions (r_1, r_2, ..., r_m) and n junctions (j_1, j_2, ..., j_n) in the line drawing, the system of linear constraints was derived as follows:

(1) Each region r_i is associated with a 3D planar face f_i with linear equation a_i x + b_i y + z + c_i = 0. Each 3D vertex v_α, which corresponds to a junction j_α and which lies on f_i, gives rise to a linear equation a_i x_α + b_i y_α + z_α + c_i = 0. Note that the unknowns are a_i, b_i, c_i and z_α.
A system of linear equations is obtained by collecting all such equations, one for each junction in each region.

(2) A system of inequalities can be gathered from the relative depths of all pairs of vertices corresponding to junctions on the line drawing. If a vertex v_i is in front of v_j, then z_i > z_j.

(3) Finally, a system of inequalities is formed using the relative depth information between a vertex v_α and a surface f_j that correspond to a junction j_α and a region r_j on the line drawing. If v_α is in front of f_j, then a_j x_α + b_j y_α + z_α + c_j > 0; else, if v_α is behind f_j, then a_j x_α + b_j y_α + z_α + c_j < 0.

Appending the above three systems results in one big linear system of equations and inequalities with n + 3m unknowns (z_1, z_2, ..., z_n, a_1, b_1, c_1, ..., a_m, b_m, c_m). Sugihara showed that the existence of a solution to this system of linear equations and inequalities is a necessary and sufficient condition for a labeled line drawing to represent a valid planar surface scene.

4.2.3 Model-Free Interpretation of Line Drawings - Curved Surface World

Turner [121] attempted to extend the trihedral vertex polyhedral junction catalogue to smooth-faced opaque solids. The surfaces were composed solely of planar, parabolic, elliptic, or hyperbolic points and were bounded by surface-normal discontinuous edges. A general viewpoint was assumed and shadow boundaries were allowed. Several other assumptions about the viewpoint, background, illumination, and corners were also made. However, unlike the junction catalogue for trihedral vertices, Turner's catalogue was not shown to be complete for its domain [91].

Chakravarty [25] dealt with planar-faced and curved-surface solids having vertices formed by at most three surfaces and edges formed by two surfaces. Seven generalized junction types were defined. Limitations on permissible junction types as one moves from one end of a line segment to the other provided the ability to verify whether a given sequence of junctions forms a realizable configuration. This labeling scheme dealt with regions and lines, rather than with the geometric characteristics of a junction.

Malik [83] systematically derived a junction catalogue (Figure 4.5) for scenes comprising opaque regular solids bounded by piecewise smooth C³ surface patches with no markings or texture on them. Orthographic projection and a general viewpoint were assumed. The input line drawing to the interpretation module was assumed to be perfect, and lines due to shadows or specularities were assumed not to be present. An algorithm which utilizes this catalogue to determine all legal labelings of the line drawing was developed. A new line label << was introduced to accommodate limb boundaries, a locus of points on the surface where the line of sight is tangent to the surface (see Figure 4.6). In order to prune highly counter-intuitive interpretations involving a number of hidden faces, a local minimum complexity rule was also used to restrict the number of hidden faces at a vertex to at most one. Although a large number of curved objects were allowed in the object domain, simple objects such as the piece of nose cone shown in Figure 4.6 were excluded.

A mathematical framework for line drawing interpretation was recently presented by Nalwa [91]. The assumptions were that the imaging geometry is accurately captured by orthographic or perspective projection, the viewpoint is general, the surfaces are piecewise C³, and limbs are the only viewpoint-dependent edges.
Using this framework, constraints on the scene from instances of straight lines and conic sections in line drawings were derived [92]. In [93], constraints associated with bilateral symmetry were investigated. It was shown that the orthographic projection of a surface of revolution exhibits bilateral symmetry about the projection of the axis of revolution, irrespective of the viewing direction. However, no attempt was made to suggest schemes for the detection of bilateral symmetry in line drawings.

Figure 4.5. Junction catalogue for piecewise smooth surfaces [83].

Figure 4.6. A piece of nose cone is not in Malik's [83] object domain.

4.3 Reconstruction of LDG Via Resampling

An LLDG reconstruction algorithm similar to the one presented in this section was first conceived by Chen [26]. However, no implementation was attempted and there were more restrictive assumptions. Here, we present a slightly different algorithm that does not require the objects to be completely inside the field of view, and we show the results of our implementation.

4.3.1 Assumptions

The reconstruction algorithm depends upon the following assumptions.

Object Domain / Scene Assumptions:
A1) Objects are δ-polygons or polyhedra whose faces are δ-polygons.
A2) Scenes can be composed of multiple objects which may occlude one another.

Wing Detector Assumptions:
A3) The view is non-accidental and the "wing sensor" produces at least one wing sample for each edge that would be visible in the LDG. (Recall the definition of a non-accidental view from Section 2.3.)
A4) There are no spurious wings.

Computation Accuracy Assumption:
A5) Measurements and computations are infinitely accurate.

Clearly, assumptions A3, A4 and A5 are unrealistic given existing low level image processing techniques. However, our intention is to present a deterministic algorithm such that a complete line drawing graph can be reconstructed under these ideal assumptions. In the next chapter, we shall give a heuristic algorithm for reconstructing LDGs that relaxes these assumptions. An object view and its wing representation which satisfy these assumptions are given in Figure 4.1.

4.3.2 Algorithmic Approach

Our approach is to individually reconstruct the LDG of each face of the objects in the view and then merge them to form the final LLDG. For each plane, the line segments from all wing samples involving that plane are extended until they intersect the image frame, forming the set of all possible LDG line segments. When necessary, the algorithm resamples the range data to decide whether one of the possible segments is really in the LDG. The example below originates from Chen [26].
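As a preview of Step 1 below, the plane-grouping idea can be sketched as follows. This is a hypothetical illustration only: each wing actually carries two surface labels (one per adjacent surface), but a single fitted plane per wing is used here for brevity, and the exact-match tolerance reflects assumption A5.

```python
def same_plane(p, q, tol=1e-6):
    """Two fitted planes (A, B, D) of A*x + B*y + z + D = 0 agree within tol."""
    return all(abs(a - b) < tol for a, b in zip(p, q))

def cluster_by_plane(wings):
    """wings: list of (wing_id, (A, B, D)). Returns lists of coplanar wing ids."""
    groups = []                      # each entry: (representative plane, [wing ids])
    for wing_id, plane in wings:
        for rep, members in groups:
            if same_plane(rep, plane):
                members.append(wing_id)
                break
        else:                        # no existing group matched: start a new one
            groups.append((plane, [wing_id]))
    return [members for _, members in groups]

# Made-up wing samples: the first two lie on the same plane.
wings = [("w4",  (1.0, -1.0, -26.0)),
         ("w13", (1.0, -1.0, -26.0)),
         ("w5",  (0.0,  1.0, -25.0))]
print(cluster_by_plane(wings))       # [['w4', 'w13'], ['w5']]
```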
Note that it is possible for a planar group to be composed of wings from two or more faces. For example, wings from faces f13 and f14 or f32 and f33 in Figure 4.1 (b) belong to the same group because the two faces are coplanar. The following steps show how to reconstruct LDG segments from the wings of a single plane group. At the very last step, all labeled segments will be merged to complete the interpretation of the LDG. Step 2 : Determine all possible LDG segments of a planar group For each wing segment in a plane group, construct a straight line within the picture frame (see Figure 4.7 (a)). This will result in a set of junctions and line segments which represent the possible images of the 3D object features (vertices and edges). From Assumption A5, collinear wing segments will lie exactly on the same constructed line. There are no extra lines generated (Assumption A4) and no missing lines (As- 107 sumption A3). The object edges visible in the LDG must form a subgraph embedded in the planar graph of all these possible line segments: it is the goal of the next steps of the reconstruction algorithm to extract the correct subgraph (or subgraphs in case of ambiguity). At any time in the reconstruction, a line segment can be classified into one of the following three categories: a must-segment, an undecided-segment or an unwanted- segment. A must—segment is a line segment that has been determined to be part of an object face boundary. An undecided-segment is one that is not a must-segment, but has not been discarded. An unwanted-segment is one that has been determined to not be part of a face boundary. We now select only those line segments which form the visible boundaries of the given 3D object faces in the given plane. Step 3 : Determining initial must-segments Because of Assumption A4, a line segment overlapping a wing segment must indicate some object edge and hence should become a must-segment. The result for the right face of the block in Figure 4.1 is shown in Figure 4.7 (b). There are also several undecided line segments, denoted by broken line segments in Figure 4.7 (b). For example, LDG segment H has not yet been recovered because no wing segment is associated with it. Such segments are examined in the next step to decide whether or not they should be part of the LDG. Step 4 : Access the range data for all undecided-segments Any line segment in the LDG must separate exactly two distinct regions. If 3D points projecting to both sides of a candidate line segment are located on the same 3D plane, then the undecided line'segment can be deleted from consideration. An undecided segment is unwanted unless it has exactly one side with points on the 3D plane of the current wing group. Assumptions A1 and A2 ensure that there is sufficient surface 108 1i §oi ............. + i g ...... t .l' '''''''' "41-an .. r" :+ """ ik iiiiii ..... 0* +§ f '{41’ 4, ,,,,,, ...... d'i" q (a) Step 2: all line segments extended. (b) Step 3: line segments overlapping sensed wings are must segments X X X i X ’h/ a + X + v X X X X X X X X X X (0) Step 4: range sampling removes 19 unwa‘gted ((1) Step 5: delete redundant intersections. segments(x) and includes must segment k1 Figure 4.7. Deciding on LDG line segments (Figure originated from [26]). 109 in the 3D view to sample and make this decision. By resampling the range data on either side of the undecided—segment, every undecided line segment can be classified as a must- or unwanted-segment. Thus unique LDG is obtained for each planar group of faces. 
Returning to the running example, the line segment between [C and l in Figure 4.7 (b) can be successfully recovered, while all other undecided—segments, including has—fig—h’g—j and 775, are identified as unwanted-segments. The result of this step is depicted in Figure 4.7 (c). By using this resampling rule, each of the undecided segments is either kept as a must—segment or discarded as an unwanted-segment. Obviously if the range image is not available for resampling, this step would not be possible. In the next section, we give another way of reconstructing faces from each planar group using rules derived from geometric constraints instead of resampling range data. Step 5 : Delete redundant junctions. Returning to Figure 4.7 (c), we see that junctions e, h and k do not correspond to vertices of a 3D polygon. From object and view point assumptions all regions in the image plane are e—polygons. Thus each junction of the newly reconstructed region must be shared by an even number of line segments not all of which are collinear. Therefore, after removing unwanted line segments, junctions which become redundant are removed. After applying steps 2 to 5 to each plane group, the following conditions must hold for each group: every junction has even degree and line segments form a set of circuits. The distinct object faces from a given plane group are represented by distinct sets of circuits. Step 6 : Merge planar groups to form the LDG. Up to this point the algorithm has recovered all edges and vertices of polygonal faces. The remaining work is to reconcile the line segment labels for the entire LDG. T- 110 junctions are created when plane groups are merged and line and sub-line segment labels are assigned according to the recovered T junctions and the wings associated with the given line segments. Figure 4.1 (b) shows the result of such a process. More interesting is the analysis of the “table & chair” in Figure 4.19 : when planar face ABCD is reconstructed, points G and H are not known — they are formed by the merger with other faces. Note that the junction H is not in the Huffman-Clowes catalogue. Junctions (vertices), such as C and F, are formed by merging 2D (3D) points of intersection computed from the original wing data : assumptions A1 and A2 give us tolerances to use for this task. (Also note that the geometric character of m can be determined by the wing sensor to be one of { + (chair-top collinear with table edge), < (chair-top closer than table top), > (chair-top occluded by table-top)}, but in no case can the connectivity of the materials being sensed be determined without fallible heuristics being applied on top of the LDG construction.) 4.3.3 Resampling Reconstruction Algorithm: POLY-1 The above steps are summarized below in an algorithm skeleton. The input includes the range image and an extracted wing representation of a polygonal scene. The out- put includes a labeled line drawing, the equations of visible faces and the coordinates of visible vertices. Examples of scenes that can be handled are given in Section 4.5. 
(1) Cluster wing segments by plane equation (2) For each plane group (2.1) generate a line for each Wing (2.2) compute the intersections among all lines (2.3) preserve line segments which overlap a wing (2.4) remove or preserve line segments by testing the range data (2.5) delete redundant intersections (3) Merge LDGs of individual polygonal faces _ __‘_A 111 (4) Compute 3D vertices by intersecting line equations (5) Retain all line segment labels from wings (6) Output LDG with plane equations, 3D vertices and edges 4.4 Reconstruction Via Geometric Constraints While the algorithm developed in Section 4.3 is able to reconstruct a unique LLDG from the wing samples, it requires the original range image to be available for re- sampling. However, the range image may not be available in some situations. Fur- thermore, future research may enable wing detection without range data. Thus, an algorithm for the recovery of the LLDGs using only the set of sampled wings is de- sirable and is deveIOped in this section. The algorithm is similar to the previous one except for the way individual planar groups of wings are reconstructed into regions. Rather than resampling the range data to complete the line drawing of each of the coplanar faces, we show that the LDG can be reconstructed using the assumptions and the inherent geometric constraints of the projection of polygonal scenes. Thus only the region reconstruction step will be described in this section. Given a polygonal scene consisting of only objects in the object domain and a non-accidental View of the scene, assume there are m visible faces which lie on n, n _<_ m, different planes, the task at hand is the reconstruction of the line drawing of the faces on each plane. Let G denotes the perfect LDG of the faces on one polygonal plane as viewed through the given non-accidental viewpoint. G is not the LDG of the scene; rather, the LDG of a set of coplanar 6-polygons in the scene. In the following subsections, we will exploit the geometrical constraints on G and formulate reconstruction rules base on those constraints. 112 4.4.1 New Terminology To aid the understanding of the geometric constraints and the new algorithm, we define the following functions such that each returns some specific information about a particular line segment, junction or region label. The first four functions are similar to those of Baumgart [9] but are renamed here to better suit our data structure. Denote the regions (rs—polygons) to which the faces («S-polygons) project as {R1,R2,...}, and the line segments in the LDG as {seg,~,-,segjk,...} where seggj denotes the segment from junction v,- to junction Uj. Furthermore, the regions {R1,R2, ...} all have the same planar label, say P9,, which is the equation of the plane from which the line drawing of object faces are being reconstructed. Note that due to occlusion, there may be more junctions in the LDG than there are vertices in 3D. SegCCWv,(seg,-j): returns segmc where 369;], is incident to junction v,- and is the first segment encountered from segg in the counter-clockwise direction. (same as PCW(edge) in [9]) SegCin(seg,,-): returns saga where seggz is incident to junction v,- and is the first seg- ment encountered from seggj in the clockwise direction. (same as NCCW(edge) in [9]) RL—CCWu,(seg,-j): returns region label P5 of the region encountered in the counter- clockwise direction from seggj with viewing origin at 22,-. 
(same as PFACE(edge) in [9]) RL~CW,,,(seg,j): returns region label P5 of the region encountered in the clockwise direction from seggj with viewing origin at 22,-. (same as NFACE(edge) in [9]) In addition, we shall define two segments seggj and segik to be adjacent segments if they share a common junction v,- and (1) SegCCin(seg,-j) = segqc or (2) seggc _____—_‘A 113 SegCCW a(ad) SegCWa(ad) SegCCW (ac) a RL-CCW d (de) ......‘Q-......... ...-....v ‘CCI..... .....-C...’..-.... .....I--.' ......U.. RL-CW d (dc) = RL-CCW e (de) Figure 4.8. Graphical interpretation of newly defined functions. = SegC’Wv,(seg.-,~) and define the region immediately bounded by seggj and 369;): as RL-CCWU,(seg,~J-) (or equivalently RL-CWu,(seg,-k)) if (1) is true or RL-CWv,(seg,',-) (or equivalently RL-C'CW,,,(seg;k)) if (2) is true. The above function definitions are depicted graphically in Figure 4.8. Note that RL-CCWvJsegij) = RL-Cij(segi,-) and RL-CCij(seg,-j) = RL—CWv,(seg;,-). 4.4.2 Geometric Constraints Imposed by Projection of a 6—polygon Theorem 4.1 If a circle centered at v is divided into 72 pieces by n straight seg- ments, all extending from v to the boundary of the circle (like a pie chart), then the Q-coloring problem has a solution if and only if n is even. Pf: Assume a1, a2, a3, ..., and, on are the pieces of the pie in the clockwise direction starting at al. In a 2-coloring problem, adjacent sectors must have alternating colors. Therefore, one must color the pieces in the following way. If (11 is colored green, a2 114 must be colored white, a3 must be colored green, and so on. When all the pieces are colored, a, will be colored green for all odd i’s and a,- will be colored white for all even j’s. Since a1 and an are adjacent, they must also have different colors. If n is even, then a1 and an will have different color and the 2-coloring problem is solved. However, if n is odd, both a1 and an will have the same color, namely green. Since a1 and an are adjacent to each other, the condition of 2-coloring is violated. Therefore, the 2-coloring problem has a solution if and only if n is even. 0 Theorem 4.2 Every vertex of a 6—polygon must be incident with an even number of edges. Pf: Let v be a vertex of the 5—polygon, P5. Consider the plane that is coplanar with P5, call the polygonal plane and a circular neighborhood N, of radius r < 6, centered at v on this polygonal plane, the edges of P5 incident with v divides N, into sectors where each sector is wholly inside or outside of P5 on the polygonal plane (recall that the definition of 6-polygon guarantees that each edge of P5 have length greater than 5 on the polygonal plane). Suppose the vertex v is incident with odd, say 271—1, number of edges. Those 2n-1 edges divides N, into 212-1 sectors. Since there is an odd number of sectors, from Theorem 1, at least 2 adjacent sectors will have the same label. But this means that the edge between those two sectors will have the same label on both sides, which contradicts the definition of a 6—polygonal edge. Therefore, a 5'polygon may not have a vertex of odd degree. D Corollary 4.2.1 Every junction of the line drawing of an non-accidental view of a 6-polygon must be incident with an even number of line segments. Pf: Recalling the definition of a non-accidental vieWpoint (Section 2.3), the projection of a 6-polygon onto the image plane from a non-accidental viewpoint is an e-polygon. 
___—___A 115 Therefore, by Theorem 42, every junction (vertex) of the line drawing (e—polygon) must be incident with an even number of line segments (edges). U Theorem 4.3 Let S denote a set of coplanar {co-quadric) faces of objects in the scene, G denote the perfect LDG of those faces in S as viewed through some non- accidental viewpoint, and P5 denote the region label of all the regions project onto which those faces ofS (P5 is simply the equation of the coplanar {co—quadric) surface). If seggj and segik are two adjacent segments in G then either RL-CWv,(seg,-,-) = P5 = RL-CCWv,(seg,-k) or RL-CCin(seggj) = P5 = RL-CWuJSBgik) Pf: From the definition and the properties of the LDG, two regions adjacent to a line segment in the LDG correspond to projections from two faces in distinct surfaces; thus must have different region labels. Therefore, it must be that RL-C'CWW(seg;,-) 74 RL-C’Wv,(seg,-j) and RL-CC’WW(seg,-k) 76 RL-CWU,(seg,-k). Without loss of generality, label the regions of G which are not images of faces in S as PI, where Px ¢ P5. First let’s assume that the region immediately bounded by 589,] and seguc corresponds to RL-CCW,,,(seg,-j) and RL-CWU,(seg,-k) (see Figure 4.9 (a)). Thus RL-C'CWW(seg,-,~) = RL-CWv,(seg,-k). QM: Assume RL-CCWU,(seg,-j) = P5. Then RL-CC'Wv,(seg,~,-) = P5 = RL-Cin(seg,-k). Case 2: Assume RL-CWU‘.(seg,-j) = P5. If it were the case that RL-CWW(seg.-k) = P5, then so must RL-CCWU,(seg,-j) = P5 because they bound and label the same region in G. But this implies that RL-CCWv,(seg,-j) = RL—C'Wui(seg,-j) = P5, which contradicts the fact that adjacent regions of a line segment cannot have the same region label. Thus, it must be that RL-CCWW(seg;k) E RL-Cm.(3€9ij) = P6- 116 (a) (b) Figure 4.9. Graphical illustration of Theorem 4.3. The same argument can be constructed if we assume the region immediately bounded by 3e95,- and seggc corresponds to RL-CW5,(seg;,-) and RL—C’C’WU,(seg;k) instead (see Figure 4.9 (b)). D Corollary 4.3.1 Let S denote a set of coplanar {co-quadric) faces of objects in the scene, G denote the perfect LDG of those faces in S as viewed through some non- accidental viewpoint, and P5 denote the region label of all the regions projected onto by those faces in S {P5 is simply the equation of the surface). If segij and seguc are two adjacent segments in G then either RL-C'C'Wv,(seg;j) = RL-CCWv,(seg,-k) nor RL-C'Wv, (seggj) 2: RL-CWU, (segik). ———'_'Mw 7 117 Pf: Since seggj and seguc are adjacent line segments, either RL-C'CWUI.(seg.-j) RL—CW,,_.(seg,-k) or RL—CW,,,(seg,-,-) = RL-CCWU,(seg,-k) must hold. If RL—C'CVVvJseggj) = RL—C'C'Wm(seg;k) then by transitivity, RL-C'CW.,,(seg,-j) — RL-CW.,.(seg.-j), which violates the fact that adjacent regions cannot have the same region label. Likewise, if RL-Cm,(seg,-j) = RL-CWUI.(seg,-k) then RL-CCW,,,(seg,k) = RL-CWv,(seg,-k), which again is impossible. El. 4.4.3 Geometric Constraint Based Decision Rules Before characterizing the decision rules, we must first define more terminology and redefine the functions given in the previous section. Since the functions RL-C’ Wand RL-CC Ware defined as functions that act on the perfect LDG, they need to be redefined to be used here. Let W denote the set of coplanar wings on the plane P5 that is being reconstructed and let G denote the state of reconstruction of P5 at a given time instance. Recall that each wing sample has information about the two surfaces that project to the two regions adjacent to the wing edge projection (called the wing segment) in the LDG. 
One of the regions adjacent to the wing segment must be labeled P_δ, because wing samples in W are all coplanar with plane P_δ, and the other region must not be labeled P_δ, say P_x. When the wing segments are extended and intersected to form undecided-segments, the region labels on both sides of the wing segments are carried over to their extended segments. If wing segments happen to be collinear, then the extended segment(s) in between the two wing segments may have multiple region labels on each side of the segment(s). Thus we shall denote the region label on each side of a segment in the LDG as a set of labels. When an undecided-segment becomes a must-segment, the region label on one side will be exactly P_δ and the other may not contain P_δ as a possible region label. An example is depicted in Figure 4.10.

Figure 4.10. Multiple region labels are possible if wings are collinear.

We will now redefine the functions RL-CCW and RL-CW to suit the possibility of multiple region labels and define three more new functions. These are the functions used in the implementation of the reconstruction algorithm.

RL-CCW_vi(seg_ij): returns the set of region labels associated with seg_ij on the side encountered in the counter-clockwise direction from seg_ij with viewing origin at v_i.

RL-CW_vi(seg_ij): returns the set of region labels associated with seg_ij on the side encountered in the clockwise direction from seg_ij with viewing origin at v_i.

WingDir_vi(seg_ij): returns True if at least one of the detected wings is collinear with seg_ij and appears on the v_j side of seg_ij using v_i as viewing origin; else returns False.

deg_must(v_i): returns the number of must-segments currently known to be incident with junction v_i. Also called the must-degree of v_i.

deg_und(v_i): returns the number of undecided-segments currently known to be incident with junction v_i. Also called the undecided-degree of v_i.

In addition to the lists of must-, undecided- and unwanted-segments, three sets of junctions, V_E1, V_O1, and V_und, are also maintained for the new decision procedure. Members of the same set share the common characteristics defined below.

V_E1 = { v_i | deg_must(v_i) is even and deg_und(v_i) = 1 }
V_O1 = { v_i | deg_must(v_i) is odd and deg_und(v_i) = 1 }
V_und = { v_i | deg_und(v_i) ≥ 1 }

Note that V_E1 ⊆ V_und, V_O1 ⊆ V_und, and V_E1 ∩ V_O1 = ∅. We make a note that the set memberships of V_E1, V_O1 and V_und and the values returned by the functions SegCCW, SegCW, RL-CCW, RL-CW, deg_must and deg_und may vary over time as the state of reconstruction G progresses. Therefore, the functions are not true functions in the mathematical sense but are access functions in the abstract data type sense. Those functions and sets are graphically illustrated in Figure 4.11.

As a direct consequence of Corollary 4.2.1, if a junction in G has exactly one undecided-segment, then that undecided-segment can be classified as a must-segment or an unwanted-segment based solely on the must-degree of that junction. This gives rise to our first two rules.

Rule 4.1 An undecided-segment seg_ij can be discarded if deg_must(v_i) is even and deg_und(v_i) = 1; or if deg_must(v_j) is even and deg_und(v_j) = 1.

Rule 4.2 An undecided-segment seg_ij is a must-segment if deg_must(v_i) is odd and deg_und(v_i) = 1; or if deg_must(v_j) is odd and deg_und(v_j) = 1.
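The book-keeping behind Rules 4.1 and 4.2 is simple enough to sketch. The fragment below is only an illustration of the idea and not the thesis implementation (which was written in Common LISP); the names Segment, deg and apply_parity_rules are hypothetical, and segments are assumed to carry their current state and the label sets on their two sides.

from dataclasses import dataclass, field

MUST, UNDECIDED, UNWANTED = "must", "undecided", "unwanted"

@dataclass
class Segment:
    i: int                                    # index of junction v_i
    j: int                                    # index of junction v_j
    state: str = UNDECIDED
    rl_ccw: set = field(default_factory=set)  # region labels on the CCW side, seen from v_i
    rl_cw: set = field(default_factory=set)   # region labels on the CW side, seen from v_i

def deg(segments, v, state):
    # deg_must(v) or deg_und(v), depending on the requested state
    return sum(1 for s in segments if s.state == state and v in (s.i, s.j))

def apply_parity_rules(segments):
    # Repeatedly apply Rules 4.1 and 4.2 until neither changes any segment.
    changed = True
    while changed:
        changed = False
        for s in [t for t in segments if t.state == UNDECIDED]:
            for v in (s.i, s.j):
                if deg(segments, v, UNDECIDED) == 1:      # s is the only undecided segment at v
                    s.state = UNWANTED if deg(segments, v, MUST) % 2 == 0 else MUST
                    changed = True
                    break
    return segments

In this sketch the sets V_E1 and V_O1 of the text correspond to the junctions for which the inner test succeeds with an even or an odd must-degree, respectively.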
From our experiments, most of the undecided-segments can be resolved using those two rules. However, there are situations where these two rules do not apply. We derived Rules 4.3a and 4.3b to discard more undecided-segments that cannot possibly be must-segments in the presence of other must-segments. These two rules are directly implied by Theorem 4.3 and Corollary 4.3.1.

Figure 4.11. Graphical illustration of redefined and new functions. Notation: broken lines are undecided-segments, solid lines are must-segments, thick solid lines are wing segments.

Rule 4.3a Given two adjacent line segments Mseg_ij and Useg_il, where Mseg_ij is a must-segment and Useg_il = SegCCW_vi(Mseg_ij) is an undecided-segment, assume that the wing group being processed has the surface representation P_δ. Then Useg_il can be discarded as an unwanted-segment if

RL-CCW_vi(Mseg_ij) = {P_δ} and P_δ ∉ RL-CW_vi(Useg_il), or
RL-CCW_vi(Mseg_ij) ≠ {P_δ} and RL-CW_vi(Useg_il) = {P_δ}.

Rule 4.3b Given two adjacent line segments Mseg_ij and Useg_il, where Mseg_ij is a must-segment and Useg_il = SegCW_vi(Mseg_ij) is an undecided-segment, assume that the wing group being processed has the surface representation P_δ. Then Useg_il can be discarded as an unwanted-segment if

RL-CW_vi(Mseg_ij) = {P_δ} and P_δ ∉ RL-CCW_vi(Useg_il), or
RL-CW_vi(Mseg_ij) ≠ {P_δ} and RL-CCW_vi(Useg_il) = {P_δ}.

The situations depicted by Rules 4.3a and 4.3b are graphically illustrated in Figure 4.12. Basically, Rule 4.3 discards an undecided-segment if (1) it adjoins and neighbors a must-segment and (2) only the must-segment or the undecided-segment, but not both, labels the commonly shared region as P_δ. Figures 4.12(a)-(b) depict two situations in which the undecided-segment (broken segment) is discarded, and Figures 4.12(c)-(d) show instances in which the rule would not apply.

Finally, we add a rule to remove spurious extensions of a wing. Assumption A3 states that at least one wing is detected on each visible edge of the scene. Thus, all extensions of a wing beyond its intersection with a must-segment should be removed. This is graphically demonstrated in Figure 4.13. In Figure 4.13(a), undecided-segments ij and jk are extensions of wing w1, which forms the line l. Since must-segment m2 intersects line l, if any of the segments on the ij side of line l were to be a must-segment, then there would have to be a wing on that side of line l. However, since there were no wings detected on that side, and there are no missing wings (Assumption A3), all segments on the ij side of line l must be unwanted-segments.

Figure 4.12. Graphical illustration of Rule 4.3. Must-segments and undecided-segments are represented by solid and broken lines, respectively. (a) The undecided-segment is discarded via Rule 4.3a. (b) The undecided-segment is discarded via Rule 4.3b. (c)-(d) Neither Rule 4.3a nor Rule 4.3b applies.
Figure 4.13. Graphical illustration of Rule 4.4. Must-segments and undecided-segments are represented by solid and broken lines, respectively. (a) No wing on the ij side of line l. (b) If ij were a must-segment, it would violate Assumption A3. (c) Segments on the ij side of line l in (b) are unwanted-segments.

Rule 4.4 Given a must-segment Mseg_ij and an undecided-segment Useg_ik, where Useg_ik = SegCW_vi(Mseg_ij) or SegCCW_vi(Mseg_ij). If Mseg_ij and Useg_ik are not collinear (co-curvilinear) and WingDir_vi(Useg_ik) = False, then all (undecided) segments collinear (co-curvilinear) to Useg_ik and on the side of Useg_ik with respect to v_i, including Useg_ik, can be deleted.

4.4.4 Complete Reconstruction Algorithm: POLY-2

We are now ready to give the geometric constraint based LDG reconstruction algorithm. All steps but step (2.4) of this algorithm are the same as in the previous algorithm. Note that we apply Rules 4.1 and 4.2 whenever possible because they are easier to apply than the other rules.

(1) Cluster wing segments by plane equation
(2) For each plane group
    (2.1) generate a line for each wing
    (2.2) compute the intersections among all lines
    (2.3) preserve line segments which overlap a wing
    (2.4) remove or preserve line segments by using Rules 4.1, 4.2, 4.3, and 4.4
        (2.4.1) if V_E1 ≠ null, apply Rule 4.1 then goto (2.4.1)
        (2.4.2) if V_O1 ≠ null, apply Rule 4.2 then goto (2.4.1)
        (2.4.3) apply Rules 4.3 and 4.4 to a junction v_i of V_und
        (2.4.4) if V_und = null or no more changes are possible then goto (2.5) else goto (2.4.1)
    (2.5) delete redundant intersections
(3) Merge LDGs of individual polygonal faces
(4) Compute 3D vertices by intersecting line equations
(5) Retain all line segment labels from wings
(6) Output LDG with plane equations, 3D vertices and edges

4.4.5 Example - Paddle Wheel

Figure 4.14 depicts how one group of wings can be reconstructed using the above four rules. This group of wings was extracted from the "background" group (denoted as P_δ) of the paddle wheel example in Figure 4.19 (a). After step (2.3), the state of reconstruction looks like Figure 4.14 (a). The three junction lists, V_E1, V_O1 and V_und, are formed at this instance. By repeated application of Rules 4.1 and 4.2, 20 of the 40 undecided-segments were discarded as unwanted-segments and 3 became must-segments. Note that in Figure 4.14 (b), junctions b5 and b8 ∈ V_und due to accidental intersections, but they are eliminated soon after. The intermediate LDGs resulting from each application of these two rules are shown in Figure 4.14 (b)-(f). At this point, both V_E1 and V_O1 are null lists, but there are still some undecided-segments in the LDG. Thus a more complex rule must be used to further the reconstruction process. The undecided-segment nl and the must-segment nm satisfy the criteria set forth in Rule 4.3 (i.e., RL-CW_n(Mseg_nm) ≠ {P_δ} and RL-CCW_n(Useg_nl) = {P_δ}). Therefore, nl can be discarded. Upon removal of nl (see Figure 4.14(g)), Rule 4.1 can be applied once more to arrive at Figure 4.14(h). This is followed by one application of Rule 4.4 to remove undecided-segments pq and qt. As a result, more undecided-segments can be discarded or become must-segments via application of Rules 4.1 and 4.2. Table 4.1 shows the rest of the actions taken to arrive at the final reconstructed "background" LDG in Figure 4.15.

Table 4.1. Rest of the processing steps of the Paddle Wheel example. (The table lists, for each remaining step, the rule applied and the segments that it removes or retains.)
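Step (2.4) of POLY-2 can be pictured as the fixpoint loop below. This is a hypothetical sketch rather than the thesis' Common LISP code: it reuses the Segment and deg helpers from the earlier fragment, and the predicates rule_4_3_applies and rule_4_4_applies stand in for the region-label and WingDir tests of Rules 4.3a/4.3b and 4.4.

def resolve_undecided(segments, junctions, rule_4_3_applies, rule_4_4_applies):
    # Junction sets of the text; recomputed on demand because segment states change.
    def V_E1():
        return [v for v in junctions
                if deg(segments, v, MUST) % 2 == 0 and deg(segments, v, UNDECIDED) == 1]
    def V_O1():
        return [v for v in junctions
                if deg(segments, v, MUST) % 2 == 1 and deg(segments, v, UNDECIDED) == 1]
    def V_und():
        return [v for v in junctions if deg(segments, v, UNDECIDED) >= 1]

    while True:
        # Steps (2.4.1)-(2.4.2): Rules 4.1 and 4.2 are cheap, so they are applied first.
        if V_E1() or V_O1():
            apply_parity_rules(segments)
            continue
        # Step (2.4.3): try Rules 4.3 and 4.4 at junctions that still have undecided segments.
        changed = False
        for v in V_und():
            for s in [t for t in segments if t.state == UNDECIDED and v in (t.i, t.j)]:
                if rule_4_3_applies(s, v) or rule_4_4_applies(s, v):
                    s.state = UNWANTED
                    changed = True
        # Step (2.4.4): stop at a fixpoint; step (2.5) then deletes redundant intersections.
        if not changed or not V_und():
            return segments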
Figure 4.14. Reconstruction of the background wing group of the Paddle Wheel. (a) State of reconstruction after step (2.3). (b) Result of applying Rule 4.1 to (a). (c) Result of applying Rule 4.1 to (b). (d) Result of applying Rule 4.1 to (c). (e) Result of applying Rule 4.2 to (d). (f) Result of applying Rule 4.1 to (e). (g) Result of applying Rule 4.3 to segment nl in (f). (h) Result of applying Rule 4.1 to (g). (i) Result of applying Rule 4.4 to segments pq and qt in (h). (j) Result of applying Rule 4.2 to (i).

Figure 4.15. Reconstructed silhouette (background) of the Paddle Wheel.

4.4.6 Merging of Surfaces

The surface merging step is trivial for the first algorithm since every visible surface in the scene is guaranteed to be completely recovered. However, in the second algorithm, where surfaces are reconstructed using the four rules mentioned above, it is possible to have some undecided-segments still unresolved at the end of the planar surface reconstruction step. That is, those four rules only give necessary but not sufficient conditions for complete and unique reconstruction of surfaces. An example is illustrated in Figure 4.16. Figure 4.16 (a) shows the wings detected for the front face of the block as in Figure 4.1. The reconstructed LDG using the above procedure results in eight undecided-segments that cannot be resolved using the four rules (see Figure 4.16 (b)). There are four possible valid interpretations, as shown in Figures 4.16(c)-(f). Without information about other adjacent surfaces, this ambiguity cannot be resolved.
Although the case depicted in Figure 4.16 is highly coincidental, nevertheless it shows that the preceding algorithm cannot guaranteed an unique LLDG even with a wing sample from every edge visible in the LDG. Even the use of a junction catalogue cannot resolve this ambiguity. However, we must note that since all four possible in- terpretations are physically realizable, perhaps it is not that the rules are insufficient, but rather that the wing representation of the scene is inherently ambiguous. Unless the wings are made to be as long as the actual edge segments, ambiguous cases can always be constructed. Unfortunately, this requirement would mean derivation of a perfect line drawing graph from the raw input image. 130 (a) i (b) (c) (d) (e) (D Figure 4.16. Algorithm 2 on front face of Big block. (a) Wings detected. (b) 4 undecided segments still unresolved from surface reconstruction. (c)-(f) Possible valid interpretations (only (e) remains after merging surfaces). 131 must segments in P8 P V 8 X X X X P ‘\ /'P I must segments in P4 &P6 (3) (b) Figure 4.17. Reconstructed LDG of the Big Block. (a) Incomplete front face. (b) Ambiguity resolved when surfaces are merged. ' With this algorithm, the fate of any left over undecided-segments on a surface is determined when the surface is merged with its neighboring surfaces, if any exist. Basically, each of the remaining undecided—segments will become a must-segment only if it is completely overlapped by a must-segment on some adjacent surface. In other words, since a segment can only be adjacent to exactly two surfaces, if the segment is deemed to be a must-segment on one surface, then it must also be a must—segment on the other surface. Using this fact, when all the surfaces of the 3D Big-Block are merged, the algorithm is able to resolve the ambiguous cases in Figure 4.16. The result is shown in Figure 4.17. 4.5 Experimental Results These two line drawing graph reconstruction algorithms have been implemented in common LISP and tested on more than 15 test cases. The average run time for scenes considered in this section was less than 5 seconds per scene on a Sun690MP. Wing 132 detection time is not included in this estimate. Both syntactically generated wing sets and wings detected from real images were used to test the two algorithms. For the syntactically generated cases, the completely labeled line drawing graph is first created by hand. The wings are then extracted from each line in the line drawing. Note that each line segment in the graph can produce more than one wing. The results of the experiments on the syntactic data are depicted in Figures 4.18 and 4.19. Figure 4.18 shows the input wing set and all the reconstructed faces as well as the final reconstructed LLDG of the Big Block example. Note that although face P1 cannot be uniquely reconstructed using the geometric constraints based algorithm, the merging of all the faces discards the unsupported cases and results in unique interpretation of the object in the scene. More examples showing only the input wings and the reconstructed LLDG are given in Figure 4.19. The geometric constraint based algorithm was also applied to a set of wings de- tected from a real image using the wing detector outlined in Chapter 3. Since the reconstruction algorithm demands complete wing samples (at least one wing on every edge contour), we have doctored the output of the wing detector by inserting one missing wing and merged collinear compatible wings into one long wing. 
The augmented wing set was then fed into the algorithm for reconstruction. Results of the reconstruction are shown in Figure 4.20.

Figure 4.18. Reconstruction of the Big Block. (a) Detected wings (solid lines). (b) Individual reconstructed faces. (c) Reconstructed LLDG.

Figure 4.19. More examples of syntactic scenes that have been reconstructed. (a) Wing samples. (b) Reconstructed LDG.

Figure 4.20. Reconstruction of the LLDG of a view of the gb3 block.

4.6 Worst Case Time Complexities of POLY-1 and POLY-2

Both the POLY-1 and POLY-2 algorithms consist of four sequential modules: clustering of wings, extending wings and computing the intersections, resolving the undecided-segments, and merging of the planar faces. Analysis of the complexity of the algorithms is decomposed into analysis of those individual components. Of those four modules, the two algorithms only differ in how the undecided-segments are resolved. In the following analysis, we assume that the number of wing samples is n.

Wing Clustering

The wing clustering step is simplified by the ideal assumption about the measurement and computational accuracy. Co-planar wings all have the same wing-surface label on one of their two wing-surfaces; therefore, only one pass through the wing set is necessary to form the co-planar wing clusters. Suppose wings w_1, ..., w_{i-1} have already been placed into m proper clusters, where m ≤ 2(i-1) (recall that each wing must belong to exactly two planar clusters). To place wing w_i into its two proper planar clusters, at most m comparisons with the already existing planar clusters are needed. Thus the complexity of the clustering step is $O\left(\sum_{i=1}^{n} 2(i-1)\right) = O(n^2)$.

Wing Extensions and Intersections

Assume that there are m different wing clusters, C_1, C_2, ..., C_m with n_1, n_2, ..., n_m wings, respectively. Note that n_1 + n_2 + ... + n_m = 2n and the minimum number of clusters is 2. For each wing cluster C_i, the wings are extended and intersected with the other wing extensions in the cluster, which is accomplished in $O\left(\binom{n_i}{2}\right) = O(n_i^2)$ time. Summing over all wing clusters, the total time is $\sum_{i=1}^{m} O\left(\binom{n_i}{2}\right) = \sum_{i=1}^{m} O(n_i^2)$. Since $\binom{n_i}{2} \geq 0$ for all i, it is easy to see that the above sum is bounded above by $O((n_1 + n_2 + \cdots + n_m)^2) = O((2n)^2) = O(n^2)$. This worst case scenario only occurs when all the wing samples are co-planar, giving rise to two planar groups: the plane of the object faces and the background.

Resolving Undecided-Segments

Initially, the number of undecided-segments in each wing cluster is at most n_i^2. Using POLY-1, each of those segments is visited only once to either change it into a must-segment or an unwanted-segment. Note that once a segment is "decided" it can never be changed back to the undecided state.
Therefore the time complexity over all wing clusters is $\sum_{i=1}^{m} O(n_i^2)$, which again is bounded above by $O((n_1 + n_2 + \cdots + n_m)^2) = O((2n)^2) = O(n^2)$.

With POLY-2, the algorithm iterates over the set of remaining undecided-segments until either no more undecided-segments remain or no more updating is possible. Thus, during each pass through the remaining undecided-segments, at least one undecided-segment is permanently updated. The worst case scenario occurs when only one undecided-segment changes state (to either must or unwanted) during each iteration (in practice, several undecided-segments would have been updated during each pass). Thus for wing cluster C_i, which has at most n_i^2 undecided-segments, the updating will stop after at most n_i^2 iterations, and the number of remaining undecided-segments is decremented by at least one after each iteration. Therefore the worst case time complexity is characterized by $O\left(\frac{(1 + n_i^2)\, n_i^2}{2}\right) = O(n_i^4)$. Over all wing clusters, the total time complexity is $\sum_{i=1}^{m} O(n_i^4)$, which is bounded above by $O((n_1 + n_2 + \cdots + n_m)^4) = O((2n)^4) = O(n^4)$.

Merging of Planar Faces

In merging planar faces, must-segments from two non-co-planar faces are intersected in 3D to discover the final line segments in the LDG and to compute the 3D vertices. Recall that the number of line segments in each planar group of faces is bounded by the square of the number of wings in the planar group. Theoretically, for group C_i, at most n_i^2 line segments can appear in the reconstructed face group C_i. Over all planar groups, the upper bound on the total number of line segments would be (n_1^2 + n_2^2 + ... + n_m^2)/2 (each segment must appear in two planar groups). However, the ideal wing sample assumptions A3 and A4 restrict the number of line segments in the final LDG to be at most the number of wing samples, n. Thus, there can only be at most 2n line segments among all the reconstructed faces. Pairwise intersecting those segments yields a worst case time complexity of $O\left(\binom{2n}{2}\right) = O(n^2)$.

Overall Worst Case Time Complexities

For the POLY-1 algorithm, the worst case time complexity is O(n^2) + O(n^2) + O(n^2) + O(n^2) = O(n^2). For the POLY-2 algorithm, the worst case time complexity is O(n^2) + O(n^2) + O(n^4) + O(n^2) = O(n^4).

4.7 Summary

We have given two algorithms for reconstructing line drawings from "wing samples" for a large class of polygonal scenes and views. The mathematical characterization of legal scenes and views was given in Chapter 2, which we believe to be more general and more practical than those in previous works. A junction catalog was not assumed, nor were accidental views ruled out by topological events. Moreover, fallible heuristics based on principles of human perceptual organization have not been used. Both theoretical and practical considerations indicate that wing features are quite appropriate for low-level representation of polygonal scenes.

In POLY-1, we resampled the range values during reconstruction. In step (2.4), a limited search of the range image was performed in order to get information that was not directly coded in the bottom-up wing representation. This step allows the processing to continue to completion using only local processing. Via POLY-2, we showed that the line drawing can actually be constructed without such further access to the underlying range image. The second algorithm searches for valid interpretations of each surface using general geometric constraints. Since the rules employed were shown to be necessary, no valid interpretations will ever be thrown away.
The 141 surface merging procedure has successfully merged and resolved any ambiguous sur- face interpretations in our test cases. Both. algorithms presented are suitable for parallel implementation. Once the wing groups are decided, reconstruction of each surface can proceed independent of one another. Parallel implementation of the second algorithm can also cure the problem of ambiguous surface interpretations by communicating with adjacent plane groups for information about the undecided segments. All of the results presented in this chapter hinge on strong assumptions about the input wing set and accuracy of computation. In the next chapter, we will present a heuristic algorithm that can better deal with imperfect set of wing samples. CHAPTER 5 Partial Reconstruction of LLDG - Origami / Polyhedral World 5.1 Introduction The algorithms depicted in the previous chapter rely strongly on the assumptions that the wing samples are complete and that the computations are infinitely accurate. Those assumptions are unrealistic in practice and were asserted only to facilitate a better understanding of the reconstruction algorithms. The algorithm to be given in this chapter is heuristic based. Heuristic rules are used to replace the deterministic rules and to account for those unrealistic assumptions. Only the two object domain assumptions from Chapter 4 survive the reality check in this chapter: A1) Objects are 8-polygons or polyhedra whose faces are 6-polygons. A2) Scenes can be composed of multiple objects which may occlude one another. We do not make any assumptions about the completeness of wing samples; an imperfect set of wing samples with missing and/ or spurious wings is allowed as input. Furthermore, the non-accidental view assumption is also omitted. From the definition 142 143 of accidental view in Chapter 2, wings may not be sensed on some of the visible edges in an accidental view, which results in missing wings. In the next section, we briefly review past and present work on line drawing reconstruction. This is followed by the presentation of our approach. A good heuristic algorithm should result in a complete LLDG if the input wing samples are complete. We show by example that this is indeed the case. Examples of partially reconstructed LLDGs from imperfect sets of wings are also given. 5.2 Survey of Related Background Construction of perfect line drawings from a raw input image or imperfect line drawing has only been sparingly reported in the literature. Although an abundance of work has been done in image segmentation, existing edge detection techniques are still far from being perfect. Furthermore, line labeling algorithms surveyed in the previous chapter all require that all junctions in the image be identified. Perhaps the earliest documented work on obtaining line drawings of polyhedral scenes is that of Shirai [111]. The input is an intensity image from which the line drawing is simultaneously generated and interpreted. Shirai assumes that the scenes consist of only polyhedral objects and are always set up with high contrast between white objects and their black background. Thus the contour lines in the image are easy to find. The strategy for finding the lines is to first extract the contour lines in the image and then to heuristically search along the contour for other object boundary lines and crease edge lines. The set of heuristics used not only finds lines, but also facilitates the finding of the relationships between objects. 
When a complete set of boundary lines is found, all the objects can then be identified. However, this top-down approach for line finding may fail to generate the appropriate line drawing e — Y. _,, 144 if some contour/ boundary lines were not detected or if spurious lines were found due to noise in the image, especially during the early stages of the process. In [49], Gu and Huang described a heuristic-search algorithm for extracting a connected line drawing from a perspective View of a polyhedral scene. The main feature of the algorithm is that instead of detecting lines (edges) in the image before forming the junctions (corners), the junctions are located first and the search for lines concentrated on local areas centered at each junction. An image of not necessarily connected edge points is obtained by thinning the thresholded Sobel gradient image. A corner finder routine [36] is then applied to this sparse edge map to generate a list of plausible corners. Final edges in the image are found by heuristically searching for evidence that an edge exists between two corners for every possible pair of corners. A drawback of this approach is that corner finding is expensive and difficult. Further- more, if a visible corner is not detected, all line segments incident with that corner will be lost. Most recently, Trytten has proposed to create a labeled line drawing graph from a single intensity image by integrating multiple perceptual modules [119]. The approach is to take existing low level perceptual modules and integrate them through the use of a blackboard system as a common database, where each module can cooperate and compete with others to search for a good interpretation of the input image. The modules proposed for integration include: 1. regularization based surface segmentation module [15], 2. boundary detection via perceptual grouping [120], 3. line labeling analysis [83], and 4. symmetry detection module [47]. 145 (a) (b) (c) Figure 5.1. Falk’s heuristic rule for completing imperfect line drawings [34]. This approach might be robust since features that are hard to detect and consequently missed by one module may be diagnosed as being imperfect and possibly fixed by another module. However, the complexity of integrating various modules may be overwhelming and must be overcome. Instead of deriving an entire line drawing from scratch, Falk [34] assumes that an incomplete line drawing is obtained, fills in the missing line segments based on heuristic rules, and finally uses object models to aid the interpretation of the line drawing. Three ad hoc rules are used to complete the line drawings: 1. Two dangling collinear lines are replaced by a single line. (Figure 5.1 (a)) 2. If two dangling lines on an incomplete face can be extended to form a corner, then the lines are extended to complete the face. (Figure 5.1 (b)) 3. If there exists a pair of L-type junctions with parallel sides as in Figure 5.1 (c), then add a line between these two junctions to split the face into two. 146 Figure 5.2. Spurious line could be added with Falk’s [34] rules. Obviously those rules may result in spurious line segments in many real life situations. For example, Rule 3 would unnecessarily introduce a new segment between junctions A and B in Figure 5.2. Based on the amended line drawing, Falk’s algorithm hy— pothesized about objects in the scene and their orientations, then predicted the line drawing of the hypothesized scene, and finally tried to verify it. 
5.3 Heuristic LLDG Reconstruction Algorithm The flow of the heuristic based algorithm is very similar to the two deterministic algorithms presented in the previous chapter. A quick overview of the algorithm is as follows. First, some preprocessing is needed to screen out spurious wings and to cut down the number of wings to use for reconstruction. The remaining wings are clustered according to their wing—surface labels. This step is no longer trivial because we have removed the measurement and computation accuracy assumption. As be- fore, each cluster of wings is processed independently. Wings within each cluster are extended and intersected as before. Each line segment is labeled using one of the six heuristic rules. The notion of maybe-segment is introduced to facilitate the recon- m - ---' .. - ..._....-__-. 147 struction process. After (partial) reconstruction of all the planar faces, they must be merged to form the final LDG, which involves more heuristic decision making. The complete algorithm as well as the major steps that are different from the algorithms of Chapter 4 are given and discussed in detail in the following subsections. 5.3.1 .Preprocessing of Wing Samples Using the detection algorithm depicted in Chapter 3, the sampled wings are often short and overlapping. Multiple wings extracted from the same physical edge often have the same wing label; however, due to measurement and computation inaccura- cies, those wings almost never have the same set of parameter values. Since a line segment in the LDG can be recovered if at least one wing is present, it is advanta- geous to combine those overlapping compatible short wings into one long wing before performing LDG reconstruction. Reconstructing the LDC using the new set of longer wings reduces the complexity of the algorithm. Basically, two wings are combined into one if they overlap and the wing attributes are compatible. Two wings are compatible if their wing-contour labels and the two wing-surface labels are equivalent; furthermore, the wing~contours in 2D must be near-collinear within some threshold. Note that an edge may still be represented by more than one non-overlapping wing after the merge. Another task undertaken by the preprocessing module is to screen out spurious wings. Spurious wings are often the result of surface marking. While detection of surface marks may be of some importance to other vision tasks, this research strives to reconstruct the shape of the objects in the scene but not the features on the surface of those objects. Thus those wings shall be dismissed before the actual reconstruction takes place. A common trait among those spurious wings is that the surface labels (equation) on both sides of the Wing are nearly coplanar. Experiments have shown that usually only a few spurious wings were detected by the wing detector described 148 in Chapter 3. In those rare instances, spurious wings often can be screened out by the above procedure. 5.3.2 Clustering of Wings As in the POLY-1 and POLY-2 algorithms, the first step in the reconstruction phase is to cluster wings into groups according to their wing-surface labels. Each wing belongs to exactly two planar groups (counting background or don’t care groups as one planar group). Previously, infinite measurement accuracies were assumed and the planes were denoted symbolically which assured that coplanar wings have common wing—surface labels. Given this assumption, the clustering process was a trivial task. 
In reality, the wing-surface labels are the parameters of the planar surfaces as detected by the wing detector, which may not be exactly the same even for coplanar wings. Our clustering strategy is motivated by the pose clustering technique used by Stockman [114]. First, the plane equations from all wings are collected into a pool, say P. Plane equations in P are then pairwise clustered. A pair of planes are equivalent, and therefore clustered, if they are coplanar in the sense that the angle between their surface normals and the difference in their distances to the origin are within some predetermined thresholds. Clusters of planar groups are formed such that members of each plane cluster P_i = {p_i1, p_i2, ...} are either symmetrically equivalent (p_ij ≡ p_ik) or transitively equivalent (p_ij ≡ p_ik ≡ ... ≡ p_il). Finally, wings with wing-surface labels in the same cluster are collected as a wing cluster. A symbolic label is assigned to each plane cluster and the wing-surface labels are replaced with those symbolic labels.

Having formed the planar wing clusters, the next step is to recover the planar faces and their boundaries within each cluster. Wings are extended and the intersections of the extended wings are found as before. The junctions are taken to be the union of those intersections and the actual wing segment endpoints. The initial must-segments are those segments overlapped by a wing segment. The rest are undecided-segments.

5.3.3 Notion of Maybe-Segments

Before introducing the heuristic rules, we must add a new segment type. Previously, an undecided-segment became a must-segment if some criteria were satisfied. Here, an undecided-segment must first become a maybe-segment before becoming a must-segment. The strategy for finding all the must-segments is to extend the existing must-segments whenever possible. The maybe-segments are the segments that are subjected to the must-segment tests. So the sole purpose of this extra stage is to screen for the most likely candidates for the tests. A maybe-segment must be an extension of an existing must-segment; furthermore, its adjacent region labels must be consistent with those of its neighboring must-segments if they exist. Let P_δ denote the plane equation of the planar group that is being reconstructed. An undecided-segment seg_ij becomes a maybe-segment if

1. seg_jk is a must-segment and is collinear with seg_ij; and
2. SegCCW_vj(seg_ij) (SegCW_vj(seg_ij)) is a must-segment and the region labels for the region shared by seg_ij and SegCCW_vj(seg_ij) (SegCW_vj(seg_ij)) contain P_δ.

Circumstances in which an undecided-segment is and is not turned into a maybe-segment are depicted in Figure 5.3.

5.3.4 Heuristic Rules

Reconstruction of the faces in each planar group of wings proceeds independently. For convenience, we denote the planar group under reconstruction as P_δ (i.e., P_δ is the common wing-surface label of all wings in the group).

Figure 5.3. Some examples of maybe-segments and non-maybe-segments. Notation: dashed segments are undecided-segments; thin solid segments are maybe-segments; thick solid segments are must-segments.

Since our experiments have shown that occurrences of spurious wings are rare and, when they do occur, the preprocessing step would remove them, we will proceed as if no spurious wings survive to this level.
Thus no line segments are the result of extensions of spurious wings. However, there may still be missing line segments due to missing wings. From this observation, Rule 4.1 of the POLY-2 algorithm can also be applied here. Specifically, if a junction is incident with an even number of must-segments and if only one undecided-segment is incident with that junction, then that undecided-segment shall be removed as an unwanted-segment.

Rule 5.1 An undecided-segment or maybe-segment seg_ij = (v_i, v_j) is discarded as an unwanted-segment if deg_must(v_i) is even and deg_und(v_i) = 1; or if deg_must(v_j) is even and deg_und(v_j) = 1.

Note that Rule 4.2, used by algorithm POLY-2 to make an undecided-segment a must-segment, cannot be used here because line segments could be missing due to missing wings. The first condition under which a maybe-segment can become a must-segment is based on the Gestalt Law of Good Continuation [72, 124] and is graphically illustrated in Figure 5.4.

Rule 5.2 Given line segments seg_ij, seg_jk, and seg_kl that are collinear with compatible attributes, where seg_ij and seg_kl are must-segments and seg_jk is a maybe-segment, seg_jk becomes a must-segment if junctions j and k are coincident with no other segments.

Basically, Rule 5.2 fills the void between two collinear must-segments to form one continuous line that would otherwise be two dangling must-segments. While Rule 5.2 fills in the gap on a broken but likely continuous line, the next two rules reconstruct the junctions of each face with a high degree of certainty. The situations in which those rules can be applied are graphically illustrated in Figure 5.4.

Figure 5.4. Graphical illustration of Heuristic Rules 2-6.

Rule 5.3 Given noncollinear line segments seg_ij, seg_jk, and seg_kl, where seg_ij and seg_kl are must-segments while seg_jk is a maybe-segment that is collinear with seg_kl, and where seg_ij and seg_jk are neighboring segments around junction j. If

RL-CCW_j(seg_ij) = {P_δ} and P_δ ∈ RL-CW_j(seg_jk), or
RL-CW_j(seg_ij) = {P_δ} and P_δ ∈ RL-CCW_j(seg_jk),

then seg_jk becomes a must-segment and the extensions (connected undecided-segments from junction j) of these two segments into the non-shared region(s) are removed as unwanted-segments.

Rule 5.4 Given two pairs of collinear compatible segments {seg_hi, seg_ij} and {seg_jk, seg_kl}, where seg_hi and seg_kl are must-segments while seg_ij and seg_jk are maybe-segments, and where seg_ij and seg_jk are neighboring segments around junction j. If

P_δ ∈ RL-CCW_j(seg_ij) and P_δ ∈ RL-CW_j(seg_jk), or
P_δ ∈ RL-CW_j(seg_ij) and P_δ ∈ RL-CCW_j(seg_jk),

then both seg_ij and seg_jk become must-segments and the extensions (connected undecided-segments from junction j) of these two segments into the non-shared region(s) are removed as unwanted-segments.

In the case of Rule 5.3, the fact that a must-segment is coincident with the maybe-segment at a junction and that they both label the region they share as P_δ gives a strong suggestion that the maybe-segment is indeed a must-segment. Rule 5.4 differs only in that two maybe-segments meet at the junction instead of one must- and one maybe-segment. However, Rule 5.4 is not weaker than Rule 5.3, because the maybe-segments are all immediate extensions of must-segments and the evidence from the wing-surface labels supports the hypothesis that those maybe-segments should both be must-segments.
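The core of Rules 5.3 and 5.4 is a test on the region labels around the shared junction. The sketch below illustrates that test only; it is not the thesis code, the helpers rl_ccw and rl_cw (returning the label sets on the two sides of a segment as seen from a junction) are assumed, and the removal of the extensions into the non-shared region is omitted.

def rule_5_3_condition(j, must_seg, maybe_seg, rl_ccw, rl_cw, P_delta):
    # Rule 5.3: the must-segment labels the shared region exactly {P_delta} and the
    # maybe-segment admits P_delta on its matching side.
    return ((rl_ccw(must_seg, j) == {P_delta} and P_delta in rl_cw(maybe_seg, j)) or
            (rl_cw(must_seg, j) == {P_delta} and P_delta in rl_ccw(maybe_seg, j)))

def rule_5_4_condition(j, maybe_a, maybe_b, rl_ccw, rl_cw, P_delta):
    # Rule 5.4: both maybe-segments (each an immediate extension of a must-segment)
    # need only admit P_delta on the sides facing their shared region.
    return ((P_delta in rl_ccw(maybe_a, j) and P_delta in rl_cw(maybe_b, j)) or
            (P_delta in rl_cw(maybe_a, j) and P_delta in rl_ccw(maybe_b, j)))

def promote(j, seg_pair, condition, rl_ccw, rl_cw, P_delta):
    # If the condition holds, the maybe-segment(s) in seg_pair become must-segments.
    if condition(j, *seg_pair, rl_ccw, rl_cw, P_delta):
        for s in seg_pair:
            if s.state == "maybe":
                s.state = "must"
        return True
    return False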
While the first four rules can be applied with a high degree of confidence, they are not sufficient for reconstructing all of the image scenes that were tested. The next two rules are weaker versions of Rules 5.3 and 5.4. They are weaker because the maybe-segments will still become must-segments even when other undecided-segments are present in the region that is labeled as P_δ by both segments. Examples of their usage are given in Figure 5.4. Even though applying these two rules may introduce untrue must-segments, the risk is minimized by invoking them only when no more segments can be decided by any of the other rules. Moreover, a must-segment on a particular planar group of faces does not automatically carry over to the final LDG. The same must-segment must appear in one other planar group of faces or it may be downgraded to a maybe-segment in the final LDG. Merging of planar faces is discussed in the next subsection.

Rule 5.5 Given noncollinear line segments seg_ij, seg_jk, and seg_kl, where seg_ij and seg_kl are must-segments while seg_jk is a maybe-segment that is collinear with seg_kl. Looking at a small neighborhood N_r of radius r < δ around junction j, line segments seg_ij and seg_jk locally divide N_r into two regions. If

RL-CCW_j(seg_ij) = {P_δ} and P_δ ∈ RL-CW_j(seg_jk), or
RL-CW_j(seg_ij) = {P_δ} and P_δ ∈ RL-CCW_j(seg_jk),

and there are only undecided-segments in the region in which the above conditions were met, then seg_jk becomes a must-segment and the extensions (connected undecided-segments from junction j) of these two segments into the other region are removed as unwanted-segments.
More heuristic steps must be taken to ensure that each segment in the final LLDG has a unique interpretation in the sense that it is either a must-segment or maybe-segment. Note that a line segment is said to appear on a face only if it is labeled either as a must-segment or as a maybe—segment. The first step in merging all the faces is to pull together all the must- and maybe- segments and determine all the intersections of all those segments. Three very com- mon problems encountered as a result of this operation are depicted in Figure 5.5 (a). All of these problems can be attributed to computation and measurement inaccura- cies. First, many short residual segments may result around the intersection of two segments. On the other hand, two lines that were supposed to intersect may fail to do so. Thirdly, when three or more supposedly co—junction line segments meet, instead of meeting at one unique junction, three or more different close—by junctions will occur. To alleviate these problems, after determining all the junctions in the LDG, nearby junctions are clustered into one unique junction. All the line segments that terminate in the same cluster of junctions have their endpoints replaced by that unique junction. In so doing, the overshoot of line segments are removed; the gaps between two supposedly intersecting line segments are filled; and the multi-junctions are merged into one (see Figure 5.5 (b)). Once all the junctions have been determined, we must resolve any ambiguity in labeling each line segment (i.e., must- or maybe-segment). If the line fragment appears on two faces and it is labeled as a must-segment on at least one of the two faces, then the segment is labeled as a must-segment. Otherwise, the line fragment is labeled as a maybe-segment. This is the case where the line fragment is labeled as maybe—segment on both faces. When the line fragment only appears on one face and it is labeled as a must— segment, then the decision process is more complicated. Recall that during recon- 157 wing .""‘—..’§ no wing \/ F1 l input wing set F0 F2 F3 -' yr...“ VZN - a o ,r o . c' I o o a a u \..-..-.-o .~ .' reconstructed 3 faces/lines i... i A B (a) ' 'C d» after faces/lines are merged (b) B clustering of junctions C A final reconstructed (C) B partial LDG c l’ Figure 5.5. Problems encountered when merging faces into final LDG their solutions "-F‘_W.'*"v 1‘2- ~" ' I 158 struction of a face, the original set of must-segments are the actual wing segments which are just fragments of the real line segment in the LDG. Let’s denote the end— points of those wing segments as wing endpoints. If the reconstructed face contains a dangling wing segment, it would indicate that some line segment(s) of the face is (are) missing because that wing is not extended to meet with other line segments. Therefore a wing endpoint in the reconstructed face indicates that the wing segment should be extended further at that end. Thus, a line fragment in the merged LDG which appears on only one face would be labeled as a must-segment if (1) it is 1a- beled as a must—segment on that single face and (2) one of its endpoints is a wing endpoint. Otherwise, that line fragment is labeled as a maybe—segment. However, if the segment is labeled as a maybe-segment on the only face in which it appears, then it is discarded as an unwanted-segment. Examples of different cases mentioned above are depicted in Figure 5.5 (c). Upon resolution of the line labels, the LLDG is finally reconstructed. 
Since some line segments may still be missing due to missing wings, the reconstructed LLDG shall be called a partial LLDG. 5.3.6 Heuristic Reconstruction Algorithm: POLY-3 The entire reconstruction procedure is summarized below. The input is a set of sampled wings which may contain spurious and/or missing wings. The output is a (partial) LLDG whose completeness is directly related to the completeness of the sampled wings. Examples of scenes that can be handled are given in the next section. A list of the'major parameters used in this algorithm is given in Appendix C. (0) Preprocess wing samples to merge compatible wings and to cull out extraneous wings. (1) Cluster merged wing samples by plane equation (2) For each plane group (2.1) generate a line for each wing (2.2) compute the intersections among all lines (2.3) preserve line segments which overlap a. wing (2.4) remove or preserve line segments by using Rules 1, 2, 3, 4, 5 and 6. (2.4.1) if V5, 75 null, apply Rule 1 then goto (2.4.1) (2.4.2) apply Rule 2. If invoked, goto (2.4.1) (2.4.3) apply Rule 3. If invoked, goto (2.4.1) (2.4.4) apply Rule 4. If invoked, goto (2.4.1) (2.4.5) apply Rule 5. Ifinvoked, goto (2.4.1) (2.4.6) apply Rule 6. If invoked, goto (2.4.1) (2.4.7) no more changes are possible, goto (2.5) (3) Merge LDGs of individual polygonal faces (4) Compute 3D vertices using intersecting line / plane equations (5) Retain all line segment labels from wings (6) Output LDG with plane equations, 3D vertices and edges 5.4 Examples A complete example depicting the major phases of the reconstruction process is given in Figure 5.6. The set of wing samples as detected by the wing sensor is imperfect. It consists of some spurious wings and missing wings. Many close-by parallel wings are also present in the wing set. After the initial wing merging step, all the spurious wings are removed and the compatible wings are merged; only 13 wings remain for the reconstruction. The initial and final states of each wing cluster upon reconstruction are shown in Figure 5.6 (e)-(l). Note that some false must-segments are inserted in (e) and (f). However, each is downgraded to a maybe-segment due to non-sufficient evidence from the other faces. The final reconstructed LLDG is depicted in (b). More LLDG reconstruction examples are shown in Figure 5.7. The images on the left column show the sampled wings overlaid on the original intensity image while the 160 Color Chart —— jump, must— segment -— concave crease, must— segment maYb é‘ “film‘- --- convex crease, must— segment “WNW Lmdemded- 5&3th (a) original image (b) reconstructed LLDG l\/| \/ (d) merged wings (e) wing cluster 0 (f) reconstructed cluster 0 Figure 5.6. cont. (see page 161 for caption) 161 \ l I I\'\ )wing cluster 1 (h) reconstructed cluster 1 l/i/ /! (i) wing cluster 2 (j) reconstructed cluster 2 (k) wing cluster 3 (1) reconstructed cluster 3 Figure 5.6. Reconstruction of the JIG via algorithm POLY-2 162 right column shows the reconstructed LLDG. Since we have assumed that only planar surfaces can appear in the scene, the wing detector was tailored to detect only planar wings. The “roofl” example is the only one in which there was no missing Wing and the complete LLDG was reconstructed. The others all have defective wing samples. We must also admit that some of the wings in the “jig1+box1” did not cluster properly and were manually placed in the “correct” cluster before proceeding. 
The missing wings in the "gb3" and "block1" examples are direct effects of the quantized wing detecting window. In both cases, whenever the window overlapped the edge contour with no wing, it also overlapped some other junction, resulting in a busy window. As previously discussed in Chapter 3, a scale-space approach to wing detection may cure this problem. Furthermore, in the "gb3" example, although a face is completely missing (having no range values due to occlusion of the laser light), an LLDG was nevertheless reconstructed from the available wing samples. Note that line labeling of the reconstructed LDG using the Huffman-Clowes junction catalogue would not be possible due to the missing face. The missing wings in the "planar2" and "blocks1" examples were due to slight mis-registration of the range and intensity data: the intensity data and range data gave conflicting information about the presence of a wing, which resulted in no wing being detected.

Table 5.1 gives the performance summary of the POLY-3 algorithm in reconstructing the LLDGs of the real images in Figures 5.6 and 5.7. Observe that in the "block1" image, an arc in the ideal LLDG with no wing sample was correctly inferred. However, one must-segment was also incorrectly inferred, showing the fallible heuristic nature of the algorithm. Upon closer examination, we found that there was some evidence for the existence of that extra must-segment and no evidence to discourage its formation. Furthermore, its existence was tied to the fact that an arc that was supposed to adjoin it was missing in the reconstructed LDG due to missing wing samples on that arc. For all other images, all arcs with at least one wing sample were correctly reconstructed and no false arcs were introduced in the reconstructed LLDGs. Associated with the missing arcs are the incomplete regions in the reconstructed LLDGs. However, most of the incomplete regions have only one missing arc. Despite the incompleteness of the LLDGs, the features embedded in the partially reconstructed LLDGs seem to be rich enough for any reasonable object recognition module to infer the objects from the model database. Finally, we note that the algorithm has been applied to the synthetically generated test cases of Chapter 4, where the wing samples were complete. The algorithm is able to reconstruct the complete LLDG, as would the previous algorithms, in those test cases.

Figure 5.7. Examples of reconstructed LLDGs via algorithm POLY-3 (names of the scenes: roof1, gb3, block1, jig1+box1, planar2, blocks1).

5.5 Worst Case Time Complexity of POLY-3

As with the POLY-2 algorithm, POLY-3 also consists of the same four major sequential modules: clustering of wing samples, extending and intersecting wings, resolving the undecided-segments, and merging of the planar faces. Again, assume that the number of wing samples after the preprocessing step is n (note that the preprocessing module requires O(w^2) time to complete, where w is the original number of wing samples).

Wing Clustering

Recall that the wing clustering step was performed by first pairwise clustering the wing samples followed by distribution of wings into clusters such that wings within each cluster are either symmetrically equivalent or transitively equivalent.
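The symmetric/transitive grouping recalled above is essentially a connected-components computation over the pairwise "coplanar within thresholds" relation. A minimal union-find sketch is given below; it is illustrative only, and the plane representation as (nx, ny, nz, d) with consistently oriented unit normals and the particular threshold values are assumptions.

import math

def coplanar(p, q, angle_tol=0.1, dist_tol=0.5):
    # p, q are (nx, ny, nz, d); compare the angle between the unit normals and the
    # difference of the distances to the origin against the thresholds.
    dot = max(-1.0, min(1.0, p[0]*q[0] + p[1]*q[1] + p[2]*q[2]))
    return math.acos(dot) <= angle_tol and abs(p[3] - q[3]) <= dist_tol

def cluster_planes(planes, angle_tol=0.1, dist_tol=0.5):
    parent = list(range(len(planes)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    def union(i, j):
        parent[find(i)] = find(j)
    for i in range(len(planes)):                  # pairwise clustering step
        for j in range(i + 1, len(planes)):
            if coplanar(planes[i], planes[j], angle_tol, dist_tol):
                union(i, j)                       # transitive equivalence via union-find
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(planes))]

Each returned integer plays the role of the symbolic plane label that replaces the numeric wing-surface parameters before face reconstruction begins.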
Since there are n wing samples, the pairwise clustering step can be completed in $O\left(\binom{n}{2}\right) = O(n^2)$ time. However, in practice, most of the line segments do not intersect and thus do not produce any junctions. The worst case time complexity of O(n^6) should therefore be treated as a theoretical upper bound. Thus, the total time required for merging the faces is bounded by O(n^2) + O(n^6) = O(n^6).

Overall Worst Case Time Complexity

Given the above complexity analysis of the individual modules, the algorithm POLY-3 has a worst case time complexity of O(n^3) + O(n^2) + O(n^4) + O(n^6) = O(n^6).

5.6 Summary

In this chapter, we presented a third algorithm, POLY-3, for reconstructing the LLDG of origami/polyhedral scenes. POLY-3 differs from its predecessors in many ways. For one, it is constructed to work with wing samples detected from real images. As such, it allows an imperfect set of wing samples as input. Missing and spurious wings, disallowed in POLY-1 and POLY-2, are allowed with POLY-3. Furthermore, the unrealistic assumptions of infinite computation and measurement accuracy are dropped. POLY-3 is an algorithm designed to work in the real, yet imperfect, world. In place of those idealistic assumptions is a set of six heuristic rules that reason about junctions and regions to inject must-segments and to reject unwanted-segments. A junction catalogue is not used, and the original range, intensity or fused images are also not needed.

The input set of possibly imperfect wing samples, as detected by the wing detector algorithm of Chapter 3, is preprocessed to merge overlapping compatible wings and to remove any spurious wings. The all-important step of clustering wings into planar groups was shown to be realizable given the accuracy of our wing detector. Heuristic rules are used to reconstruct the LDG of each planar group of wings. Some faces may not be completely reconstructed due to missing wings. Merging of the different faces from different planar groups requires that a must-segment in the final LDG be substantiated by the two plane groups that it is in. This conservative approach guards against introducing spurious segments in the final LDG. By trying to be sure that no untrue must-segment exists in the reconstructed LLDG, a higher level object recognition module can proceed without wasting time second-guessing the validity of the line features.

The algorithm has been implemented and tested on the same test cases as POLY-2 to ensure that its performance is comparable to that of POLY-2 when a complete set of wings is present. POLY-3 has also been tested on a number of imperfect sets of wings as acquired from real imagery through our wing detector. The results show that reconstruction of partial LLDGs is tenable given the accuracy of our wing detector and the set of heuristic rules.

Finally, we note that POLY-3 is only as good as the wing detector will allow it to be. If the accuracy of the detected wings falls outside of the fixed tolerance range for error, then the clustering step may fail (wings put in the wrong group), causing many segments in the true LDG to be missed. Future research will aim at improving the robustness of the algorithm to tolerate a wider range of wing detection inaccuracies. Also, we note that, similar to POLY-1 and POLY-2, POLY-3 is suitable for parallel implementation. Once the wings are clustered, reconstruction can proceed independently.
As with POLY-2, a parallel version of the algorithm that allows for different planar groups to communicate during reconstruction may facilitate the undecided-segment decision process. CHAPTER 6 Reconstruction of the LLDG - Quadric Surface Scenes 6.1 Introduction As with line drawing analysis for polygonal scenes, some attempts have been made to understand the line drawings of scenes of limited classes of curved objects [25, 83, 92, 91, 93], but no one has been successful in building a line extractor to demonstrate that the two step approach to scene interpretation (line extraction followed by line labeling/interpretation) is viable. To our knowledge Trytten [119] is the only other researcher who is working toward a labeled curved line drawing graph by extracting and labeling line segments simultaneously. In this chapter, we will assume that objects in the scene are either 6—quadrics or objects with 5-quadric faces, and that the scene can be composed of multiple objects which may occlude one another. In the following section, we will argue that under ideal wing sensing and infinite computation accuracy assumptions, the POLY-1 and POLY-2 algorithms of Chapter 4 can be modified to reconstruct the LLDG of scenes in the quadric domain. In Section 6.3, we drop the idealistic assumptions about the sensor and computational accuracy and point out two major obstacles in our 172 173 approach to reconstruction. A new algorithm will be proposed to work around one of the difficulties. The solution to the second problem is left for future research. Examples of reconstruction by both algorithms are provided. 6.2 Perfect Reconstruction Algorithm Initially, reconstruction of the LLDG from wings seems to be made more difficult by the expansion of the object domain to include objects with 6-quadric surfaces. However, under ideal assumptions, reconstruction in this domain is no more difficult than reconstruction in the polygonal domain. The POLY-1 and POLY—2 algorithms depicted in Chapter 4 can be readily adapted to work in this expanded domain. In fact, the only modifications are the wing clustering and the wing extension steps. In the following subsections, we will reiterate the idealistic assumptions, point out the necessary modifications to the previous algorithms and present some examples of reconstructed scenes. 6.2.1 Wing Sensing Assumptions Recalling from Section 4.3, the ideal wing sensing and computation accuracy assump- tions are: Wing Detector Assumptions: A3) The view is non-accidental and the “wing sensor” produces at least one “correct” wing sample that projects onto a line segment, curved or straight, in the LDG for each segment that would be visible in the LDG. (Recall the definition of a non-accidental view from Section 2.3.) A4) There are no spurious wings. 174 Computation Accuracy Assumption: A5) Measurements and computations are infinitely accurate. Note that the no missing wing assumption (Assumption A3) has been modified to account for the difficulty in defining edges of curved objects in 3D. Together with the assumption on computational accuracy, the two assumptions guarantee that every visible line segment, curved or straight, of the LDG is represented by at least one physically “correct” wing sample (meaning that the wing surface label contains the correct quadric shape and parameters for the adjacent regions) and that co—quadric wings have a common wing surface label. Two wings are co-quadric if they lie on the same quadric surface in 3D. 
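Under these assumptions the co-quadric test reduces to comparing wing surface labels. The sketch below is illustrative only; the label structure is hypothetical, and the exact-equality test reflects the idealized accuracy assumption A5.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class WingSurfaceLabel:
    shape: str                      # e.g. "PLN", "SPH", "CYL", "CON"
    params: Tuple[float, ...]       # quantitative shape/pose parameters

def co_quadric(label_a: WingSurfaceLabel, label_b: WingSurfaceLabel) -> bool:
    # With infinitely accurate measurements (A5), two wings lie on the same
    # quadric exactly when their surface labels agree.
    return label_a == label_b

cyl = WingSurfaceLabel("CYL", (1.0, 2.0, 0.5, 0.0, 0.0))
print(co_quadric(cyl, WingSurfaceLabel("CYL", (1.0, 2.0, 0.5, 0.0, 0.0))))  # True
print(co_quadric(cyl, WingSurfaceLabel("SPH", (0.0, 0.0, 0.0, 0.5))))       # False
```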
6.2.2 Wing Clustering The first step in the reconstruction process is to cluster the wings into “common” surface groups. Previously, all surfaces were planar and the wings were grouped according to the planar equations of the Wing surfaces. With the addition of quadric surface types, this clustering step is broken into a two-step process. Recall that the wing surface label contains knowledge of both the qualitative surface shape and the quantitative surface shape parameters. Using the qualitative shape information, the wings are first clustered into planar, circular, cylindrical and conical groups. Each group is then further clustered by using the shape parameters. The second grouping is only possible if one is confident about the accuracy of the recovered parameters but this is guaranteed by the computation accuracy assumption. 6.2.3 Extension of Wings Upon grouping of wings into clusters, reconstruction of individual wing clusters pro- ceeds independently. As in POLY—1 and POLY-2, the first step toward this recon- 175 struction is to form a graph embedding of the true LDG. This was accomplished by extending the straight wing contours and forming the intersections as junctions in the polygonal scenes. Extending curved wing contours is much the same. For the quadric surface domain, we know that line segments in the LDG must be of type linear, circular or elliptic (see Section 2.4). Given our computation accuracy assump— tion, coupled with the wing contour information accompanying each wing sample, the wing contour can be extended without error. Extending the curved wing contour in practice, however, is a much more difficult task. In Section 6.3, a possible remedy to this problem will be offered. 6.2.4 Reconstruction of Individual Wing Group The extended wing contours intersect to form the initial set of undecided-segments. As before, those segments that overlap wing contours have to be must—segments (As- sumptions A3 and A4). In the POLY-1 algorithm, range data on both sides of an undecided—segment are re—sampled to justify its presence as a must-segment or to reject it as an unwanted-segment. If the range data is available for re—sampling, then the same procedure can be applied to determine all the must—segments in the graph. Rather than resampling, the POLY-2 algorithm rebuilds the LDG based on geo- metric constraints derived from the physical structure of the LDG. Notice that the constraints used in deriving the reconstruction rules all still hold under the new object domain. To use Rules 4.1 and 4.2, the junctions in the LDG of individual quadric surfaces must all have even degree. This condition is satisfied by the definition of 6-quadrics. The constraints leading up to the development of Rule 4.3 are stated in Theorem 4.3 and Corollary 4.3.1, which already are defined in terms of 6‘quadrics. Finally, the no missing wing assumption (Assumption A3) gives us Rule 4.4. The four rules are now restated with proper choice of words to reflect the possibility of curved segment types in the LDG. 176 Rule 6.1 An undecided-segment segij can be discarded if degmu,t(v;) is even and degund(v.-) = I; or if degmu,t(vj) is even and degmd(vj) = 1 Rule 6.2 An undecided-segment seggj is a must-segment if degmu,¢(v,-) is odd and degund(v,~) = I; or if degmu,t(vj) is odd and degund(vj) = 1. Rule 6.3a Given two adjacent line segments Msegij and Useg“ where Mseg,,- is a must-segment and Useg“ = SegGCWui(Mseg,-j) is an undecided-segment. 
Assume that the wing group being processed has the surface representation P5. Then Useg“ can be discarded as an unwanted-segment if RL-CC'Wui(Mseg,-_,-) = {P5} and P5 ¢ RL-CWUXUsegu) or RL-GGWui(Mseg,-j) 7e {P5} and RL-CWUI.(US€gu) = {P5}. Rule 6.3b Given two adjacent line segments Msegij and Useg“ where Msegij is a must-segment and Usegi, = SegCWv‘.(Mseg,-j) is an undecided-segment. Assume that the wing group being processed has the surface representation P5. Then Usegi, can be discarded as an unwanted-segment if RL-Gin(Mseg,-j) = {P5} and P5 ¢ RL-CCWU‘.(Useg,-z) or RL-CW5,(Mseg,-j) 74 {P5} and RL-C’GWW(Useg,-1) = {P5}. Rule 6.4 Given a must-segment Msegij and an undecided-segment Useguc where Useggk = SegCWu‘,(Mseg,-,-) or SegGCWW(Mseg.-j). If Msegij and Usegik are not co-curvilinear and WingDirvi(Useg,-k) : False, then all (undecided) segments c0- curvilinear to and on the side of (1589:"): with respect to 1),- including Usegik can be deleted. The resolution of the undecided—segments is solely based on reasoning about each junction and region labels, and makes no use of the physical properties (such as being 177 planar or otherwise) of the projecting surface. Thus, it is not surprising to see that those rules also apply in the reconstruction of the LDG in the quadric domain. 6.2.5 Quadric LLDG Reconstruction Algorithm: QUAD-1 The reconstruction algorithm given below is essentially the same as the algorithm for reconstruction in the polygonal domain except for the different choice of words to account for curved segments in the LDC. Examples of reconstructed LLDGS from complete wing samples are depicted in Figure 6.1. In hand—tracing those examples, we observed that it is less likely for curved surfaces to be co—quadric (as compared to planar surfaces to be coplanar); consequently, few wings would be clustered into co—quadric group, making the reconstruction of faces in each wing cluster a simpler task. All of the examples in Chapter 4 where only planar faces are allowed are all hand-traced and reconstructed by this algorithm. (1) Cluster wing segments by quadric equation (2) For each quadric group (2.1) generate a curve for each wing (2.2) compute the intersections among all curves (2.3) preserve line segments which overlap a wing (2.4) remove or preserve line segments by using Rules 6.1, 6.2, 6.3 and 6.4. (2.4.1) if VE1 # null, apply Rule 6.1 then goto (2.4.1) (2.4.2) if V01 7’: null, apply Rule 6.2 then goto (2.4.1) (2.4.2) apply Rules 6.3 and 6.4 to a junction v,- of VW (2.4.3) if Vund = null or no more changes are possible then goto (2.5) else goto (2.4.1) (2.5) delete redundant intersections (3) Merge LDGs of individual quadric faces (4) Compute 3D vertices by intersecting curve equations ( 5) Retain all line segment labels from wings (6) Output LLDG with junctions, lines, and regions all labeled. 178 wing samples reconstructed LLDG Figure 4.14. cont. (see page 179 for caption) 179 wing samples reconstructed LLDG Figure 6.1. Examples of reconstruction of LLDG from wing samples of quadric scenes. 180 6.3 Heuristic Reconstruction Algorithm 6.3.1 Wing Clustering The first step of all LDG reconstruction algorithms that have been presented is the clustering of wing samples. This clustering step is non-trivial in practice. There are many pitfalls which make wing clustering difficult. Recall that the wing surface label contains the qualitative surface type and the parameter estimates of the assumed surface model. 
If the wing detector wrongly classifies the surface patch then there is no hope of correctly clustering that wing into the “correct” wing cluster. Even if the “correct” surface model were assumed, if the surface parameters cannot be recovered accurately then wings may be placed in the wrong cluster. Worse yet, each wing could form cluster of its own if the error in estimation is large. Fortunately, we have been able to escape such a predicament so far. In the POLY- 1, POLY-2 and QUAD-1 algorithms, the measurement and computation accuracy assumptions ensure that all coplanar and co—quadric wings have the same wing surface labels; thus they can be clustered easily without mistake. In POLY-3, we relied on the fact that the wing detector is able to estimate the planar parameters accurately to correctly cluster the wings into proper groups. With the addition of quadric surfaces, the problem is not so easy to solve. In Chapter 3, we have concluded that the wing detector is able to classify qualitative surface shape more accurately than to estimate the quantitative shape parameters. Furthermore, parameter estimates from two far apart regions of a quadric surface could deviate by an unacceptable margin. Given the difficulty in wing clustering, one has to ask: “Can this step be bypassed in LDG reconstruction?” Perhaps, reconstruction of the LDG can be achieved without having to Cluster wings at all. In this section, we will present a heuristic algorithm which does not cluster wings; rather, the entire LLDG is to be reconstructed from the entire set of sampled wings at once. 181 6.3.2 Wing Extension Extension of wing contours across the entire imaging (bounded) plane is also an es- sential part of all of the LDG reconstruction algorithms. The accurately extended segments together with their intersections form a graph of segments and junctions that is a superset of all segments and junctions of the LDG. Under ideal assumptions, extending wing contours is simple. Extending straight wings detected from real im- ages of the polygonal domain has also been shown to be realizable given the accuracy of our wing detector. However, the same cannot be said about extension of curved wing contours. As noted in Chapter 3, the wing contour descriptor, which is derived from fitting a 2D curve model to selected edge points within each subimage, is only a local curve approximation; it may not reflect the true shape of the global curve. Blindly extending each wing contour may result in many supposedly co-curvilinear curves that are potentially damaging to the heuristic reconstruction of the LDG. One possible solution to this dilemma is to use the technique of the Hough Trans- form (HT) [58], popularized by Duda and Hart [32], to accumulate evidence about existence of different curves and their locations in the image. The main advantages of the Hough Transform for curve detection are that little has to be known about the location of the curve as long as the shape of the curve can be described para- metrically; furthermore, it is relatively unaffected by gaps in curves and by noise [7]. In [6], Ballard presented a generalized Hough Transform technique for detection of arbitrarily shaped curves that cannot be easily parameterized. The biggest drawback with the HT techniques are that they generally are very time and memory consum- ing and are therefore expensive for practical computer vision tasks. 
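For reference, the standard rho-theta Hough accumulator for straight lines can be sketched in a few lines; grouping co-curvilinear wing contours would apply the same voting idea to a curve's own parameterization. This is a generic illustration, not the detector used in this thesis, and the resolutions and image size are placeholders.

```python
import numpy as np

def hough_lines(points, rho_res=1.0, theta_res_deg=1.0, img_diag=340.0):
    """Accumulate votes for lines rho = x*cos(theta) + y*sin(theta)."""
    thetas = np.deg2rad(np.arange(0.0, 180.0, theta_res_deg))
    rhos = np.arange(-img_diag, img_diag, rho_res)
    acc = np.zeros((len(rhos), len(thetas)), dtype=np.int32)
    for (x, y) in points:
        for t_idx, th in enumerate(thetas):
            rho = x * np.cos(th) + y * np.sin(th)
            r_idx = int((rho + img_diag) / rho_res)
            if 0 <= r_idx < len(rhos):
                acc[r_idx, t_idx] += 1
    return acc, rhos, thetas

# Edge points roughly on the line y = x (theta = 135 degrees, rho = 0).
pts = [(i, i + np.random.normal(0, 0.3)) for i in range(50)]
acc, rhos, thetas = hough_lines(pts)
r_idx, t_idx = np.unravel_index(np.argmax(acc), acc.shape)
print(round(rhos[r_idx]), round(np.degrees(thetas[t_idx])))  # approx 0 and 135
```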
For this reason, much research on the HT has aimed at reducing the storage (parameter) space and/ or speeding up the process (e.g., [3, 64, 79]). Others have concentrated on detection of 182 certain curve shape [23, 88] or surface shape [89, 109]. A survey of different variations of Hough transform can be found in [65]. Hough transform techniques have also been used in the object recognition aspect of computer vision. Object recognition and pose identification via transformation clustering is a Hough based technique [75, 114]. However, recent investigation by Grimson and Huttenlocher on the sensitivity of using the Hough Transform to perform transformation clustering suggests that under moderate levels of noise in sensory data, occlusion and image clutter, the HT may generate many false solutions [46]. Part of our future research plan is the study of Hough based techniques for group- ing co-curvilinear wing contours and to recover the underlying curve from those wing contours. 6.3.3 A Heuristic Approach The purpose of clustering wings in all of the previous algorithms was to weed out unnecessary wings and to keep only the co—quadric Wings for quadric surface recon- struction. However, if a wing is wrongly clustered, it, in effect, has introduced a spurious wing in one cluster and its rightful cluster will be missing a Wing. The effect is potentially damaging if several wings are misclustered. Rather than risking the damaging effect of wrongly clustered wings, we shall propose an algorithm that uses the entire set of wing samples to reconstruct the LDG at once. The surface equations of the wing surfaces used for wing clustering are only used in this algorithm to ver- ify the hypothesis that two line segments should be merged. Figure 6.4 provides an accompanying example for the explanation of the algorithm below. Step 1: Extension of Wings Although we have side-stepped the issue of wing clustering, our strategy still re- volves around creating a graph G in which the true LLDG is embedded, and to WY—1~—,—--‘. . _“r'- . - 183 remove unwanted-segments and verify must-segments using heuristic rules. As dis- cussed above, extension of wings is also a difficult task made worse by inexact shape representation of the wing contour. Although our wing detector can reliably dif- ferentiate surface shapes, the wing contour shape can sometimes be misrepresented especially under the presence of noise. The accuracy of wing segment shape represen- tation depends strongly on the window size used in detecting that wing. For example, with a small window, due to shortness of the edge image in the window, many wings may be detected as being straight. Future research will aim at better estimation of the wing segment shape. Step 2: Determination of Junctions Assuming that wings can be reliably extended, some housekeeping jobs are required before starting the reconstruction 100p. First the wing labels are propagated to the entire extensions. Then the intersections of those wing extensions must be resolved. Under an ideal situation, only one junction will arise when many line segments meet at supposedly one junction. In practice, supposedly co-terminating segments may intersect at more than one point or may fail to intersect at all (see Figure 6.4 (b)). To find missing junctions, if the minimum Euclidean distance d between two non- intersecting line segments is less than some threshold 7, then the two segments are adjusted to meet along the shortest path between them. An example is provided in Figure 6.2 (a). 
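A sketch of the gap-closing test just described is given below, assuming straight segments for simplicity (curved contours would be handled analogously on sampled points); the threshold, called tau here, and the bridging policy are illustrative.

```python
import numpy as np

def point_seg_dist(p, a, b):
    """Distance from point p to segment ab, and the closest point on ab."""
    ab, ap = b - a, p - a
    t = np.clip(np.dot(ap, ab) / max(np.dot(ab, ab), 1e-12), 0.0, 1.0)
    q = a + t * ab
    return np.linalg.norm(p - q), q

def closest_bridge(seg1, seg2):
    """Shortest connection between two non-intersecting 2D segments."""
    (a, b), (c, d) = seg1, seg2
    cands = [(point_seg_dist(a, c, d)[0], a, point_seg_dist(a, c, d)[1]),
             (point_seg_dist(b, c, d)[0], b, point_seg_dist(b, c, d)[1]),
             (point_seg_dist(c, a, b)[0], c, point_seg_dist(c, a, b)[1]),
             (point_seg_dist(d, a, b)[0], d, point_seg_dist(d, a, b)[1])]
    return min(cands, key=lambda t: t[0])

def merge_if_close(seg1, seg2, tau=3.0):
    """If the gap is below tau, return the junction bridging the two segments."""
    dist, p, q = closest_bridge(seg1, seg2)
    if dist < tau:
        return 0.5 * (p + q)      # both segments are adjusted to meet here
    return None

s1 = (np.array([0.0, 0.0]), np.array([10.0, 0.0]))
s2 = (np.array([11.0, 1.0]), np.array([20.0, 1.0]))
print(merge_if_close(s1, s2))     # -> approx [10.5, 0.5]
```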
Note that already intersecting segments are exempt from this procedure. Although intersecting wing extensions and merging of nearby segments may create many split junctions, the junctions are clustered so that nearby junctions are pulled together into one unique representative junction (see Figure 6.2 (b). Finally, the endpoints of all wing segments are also denoted as junctions. 184 / wings fully extended .C>. - . / (a) (b) Figure 6.2. Determining set of all possible junctions in the LDG. (a) Nearby non- intersecting segments are merged at o. (b) Nearby scattered junctions are clustered into one. 185 Step 3: Initial Classification of All Segments Initially all the segments in G are denoted as undecided—segments. Update the seg- ment labels in G by labeling all wing segments as must-segments. An undecided- segment is a maybe-segment if it is coincident and co-curvilinear with exactly one must-segment at one of its two endpoints. The updated G is shown in Figure 6.4 (b) where undecided-segments are represented by broken lines, maybe-segments by solid lines and must-segments by thick solid lines. Step 4.1: Removal of Unwanted-Segments We are now ready to invoke the heuristic rules to remove any undecided-segments as unwanted-segments. Recall that each region of the LDG can only correspond to a single face of the object in 3D. Therefore, theoretically, each region should only have. one label. In reality, computation inaccuracy prevents the segments surrounding that region to have the same exact label (remember that the wing labels are propagated to all its extensions in Step 2). Also recall that the wing surface label consists of both the surface type and the parameters of that surface. Our wing detection experiments suggest that although the recovered surface shape parameters may vary throughout the surface patch, the qualitative surface type can be accurately predicted. For con- venience, denote that surface type as the coarse label and the actual parameters as the fine label of the wing surface label. The following rule checks for inconsistency of the coarse labels of two adjacent segments on the region bounded by the two segments. If they differ and one is an undecided-segment while the other is a maybe- or must-segment, then the undecided- segment is discarded. No action is taken if both segments are undecided-segments because we have no knowledge of which segment is likely to be unwanted. The rule is graphically illustrated in Figure 6.3 (a). 186 '0 ................ ". {pln(a,b,c)} ', ............... ‘.| .' ................ 'u "i i‘ .3 i- .3 nil nil P1“ i l pln E l c 1 ' c l i i P n y I y i pln cyl i pln cyl i pln cyl contradictory ,. " (a) (b) Figure 6.3. Graphical illustration of the heuristic rules for reconstructing LDG of curved object scenes. (3.) Rule 1, removal of undecided-segments. (b) Rule 2, rectifi- cation of must-segments. Rule 6.5 An undecided-segment seg,,- can be discarded as an unwanted-segment if seggk is an adjacent, maybe- or must-segment and the coarse label of the bounded region of the two segments are different. Removal of undecided—segments may leave another undecided-segment dangling by itself (i.e., not attached to any other segments). Therefore, any dangling undecided- segments are also removed as unwanted-segments. In addition, phantom junctions, those having no segments attached or incident to two co—curvilinear segments, are removed from G, and the list of maybe-segments is updated accordingly. 
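A minimal sketch of this removal pass (Rule 6.5 plus dangling-segment cleanup) follows; the segment and region representation is hypothetical, with each segment carrying the coarse labels of the regions it bounds.

```python
def rule_6_5(undecided, neighbor, shared_side_u, shared_side_n):
    """Rule 6.5: discard an undecided-segment whose coarse label for the shared
    region conflicts with that of an adjacent maybe- or must-segment."""
    if neighbor["status"] not in ("maybe", "must"):
        return False
    if undecided["coarse"][shared_side_u] != neighbor["coarse"][shared_side_n]:
        undecided["status"] = "unwanted"
        return True
    return False

def remove_dangling(segments, junction_of):
    """Repeatedly remove undecided-segments no longer attached to any segment."""
    changed = True
    while changed:
        changed = False
        live = [s for s in segments if s["status"] != "unwanted"]
        for s in live:
            if s["status"] != "undecided":
                continue
            attached = any(t is not s and junction_of(s) & junction_of(t)
                           for t in live)
            if not attached:
                s["status"] = "unwanted"
                changed = True

# Toy example: an undecided segment labels its right-hand region "cyl" while an
# adjacent must-segment labels the same region "pln", so the undecided one goes.
u = {"status": "undecided", "coarse": {"left": "pln", "right": "cyl"}}
m = {"status": "must",      "coarse": {"left": "pln", "right": "pln"}}
rule_6_5(u, m, "right", "right")
print(u["status"])   # -> unwanted
```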
The above process is applied repeatedly until no more undecided-segments can be removed. Many of the undecided-segments in the running examples are removed by the above procedure and maybe-segments are extended as shown in Figure 6.4 (c). Step 4.2: Verification of Must-Segments While coarse labels are used to check for inconsistency of region labeling by two adjacent segments, they can also be used to check for consistency. O ... ... (C) Figure 6.4. Reconstruction of LDG of a curved surface scene. undecided-segments, solid lines are maybe-segments and thick solid lines are must- segments). (a) Wing samples and their extensions. (b) After junction detection and clustering. (c) Result of repeat application of Rule 6.5. ((1) Result of one application of Rule 6.6. (d) Broken lines are 188 If the region labels are consistent, and both segments are maybe-segments or one is a must-segment while the other is a maybe-segment, then it is hypothesized that both segments are must-segments. The hypothesis is verified or rejected by comparing the finer label (parameter estimates of the projecting surface) of the two segments. If the estimated parameters are reasonably close, then the hypothesis is accepted; otherwise, rejected. This hypothesis verification step is similar in principle to the wing clustering step that we have cited as being too unreliable and consequently abandoned. However, there is a major difference in these two approaches. Wing clustering was performed over all sampled wings; wings that are far apart, though on the same surface, need to be clustered into the same group. Here, only two sets of parameters are compared, and they are propagated from two wing segments that fall on adjacent line segments of the LDG, which in general means the surface patches from which those wings are derived are close in 3D. Again, from the wing detection experiments, we have concluded that nearby wings have more consistent parameter estimates than far apart wings of the same surface, except in the planar case where there is not much difference. Therefore, hypothesis verification by comparing the parameter estimates is reasonable. Rule 6.6 Two adjacent segments segij, segnc should both be must-segments if {1) both are must— or maybe-segments or one of each; (2) the coarse labeling of their bounded region is consistent; and {3) the fine labeling of their bounded region is also consistent. The result of applying this rule to the the running examples is depicted in Fig- ure 6.4 (d). Again, any phantom junction resulting from this operation is removed. Step 5: The Final LLDG By repeated application of the above two steps, we arrive at the final LLDG as in Figure 6.4 (e). It only remains to label the regions with a unique label. Note that 189 any closed region in the LDG must have a consistent coarse label on all surrounding segments (Rule 6.6). If the original range image is available for resampling, one could refit the surface model (coarse label of the region) to points sampled from within this region on the range image. If the range image is not available, 'then depending on the application, all the estimates or perhaps the mean of all estimates could be passed onto the next higher level of recognition module. 6.3.4 Heuristic Quadric LLDG Reconstruction Algorithm: QUAD-2 The complete heuristic algorithm is outlined below. 
(1) Extend each wing segment (no clustering) (2) Form junctions by computing the intersections among all curves and propagate the wing labels of each wing segment to its extensions; denote this graph as G (3) Initial classification of must-segments, maybe-segments, and undecided segments (4) Remove or preserve maybe-segments by heuristic Rules 6.5 and 6.6 (4.1) Repeatedly apply heuristic Rule 6.5 to all remaining undecided-segments in G until no more undecided-segments can be thrOwn out as unwanted« segments (4.2) Apply heuristic Rule 6.6 to all remaining maybe-segments in G to rectify any must-segment (4.3) If G is updated, loop back to step (4.1) (5) Resolve region labels and output (partial) LLDG. 6.3.5 Examples Two examples involving missing wings are given in Figure 6.5. In Figure 6.5 (a), we removed 5 wings from Figure 6.4 (a) that we thought would be less likely to be sensed from real fused image. Using algorithm QUAD-2, the reconstructed LLDG revealed 190 the two cylinders at the end. However, one curve segment that does not belong in the true LLDG was inferred. In Figure 6.5 (b), two more wings were removed which results in just one big cylinder and two non-existing limb contour segments in the reconstructed LLDG. In the defense of the algorithm, we must point out that the most revealing features about the object in the scene, the curvature-L junctions and the three-tangent junctions, were detected; furthermore, the reconstructed LLDGs mimic how the human visual system would have reconstructed the scenes without given prior knowledge about the objects in the scenes. In Figure 6.6, more interesting facts about the algorithms are revealed. In Fig— ure 6.6 (a), the complete LLDG are reconstructed despite that fact that there are three missing wings. In Figure 6.6 (b), a missing wing caused 4 maybe—segments not to be resolved, which indicates that the reconstructed LLDG is not complete. In Fig- ure 6.6 (c), a complete set of wing samples are supplied for reconstruction; however, 4 maybe- and 3 undecided-segments are not resolved by the reconstruction algorithm. Those examples show that the reconstruction rules used in algorithm QUAD-2 are not sufficient. Better decision rules await to be discovered to prevent situations like that of Figure 6.6 (c) from occurring. 6.4 Summary Extending reconstruction techniques to the domain of quadric-surface objects proves to be quite challenging. In this chapter, we have shown that under the same set of idealistic assumptions about the input wing set and the computation accuracy as in Chapter 4, the LLDG reconstruction algorithms of POLY-1 and POLY—2 can be directly extended to reconstruct the LLDG in the expanded domain. The extension is possible because the algorithms made no assumptions about the linearity of the line segment and the planarity of the surfaces. Reconstruction was performed based purely! wing samples reconstructed LLDG (a) ....... —_. . 5 """ "‘ o z ‘ ..... P1 .......... 3' ... .- ..... Figure 6.5. Reconstruction of LLDGs from imperfect wing samples using heuristic algorithm QUAD—2. 192 wing samples reconstructed LLDG (a) ii} a v o «1*, HQ 4 C94 (b) iii.‘ if i x ..... fin. ..... k . _‘_ (C) g _ ,./ .. """"" Pl ..... u .- p 2.“ ya . (:2 ”22:22:22 ..... c 2 , Figure 6.6. More LLDGs that are reconstructed by the heuristic algorithm QUAD—2. 193 on the geometrical constraints of the underlying line drawing graph. A modified algorithm, QUAD-1, was outlined. 
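Referring back to Rule 6.6 of Section 6.3, the must-segment verification step can be sketched as follows; the segment records, label fields, and the closeness test on parameter estimates are hypothetical stand-ins, not the (unimplemented) QUAD-2 code.

```python
import numpy as np

def fine_labels_close(params_a, params_b, rel_tol=0.15):
    """Hypothetical closeness test on quantitative surface parameters."""
    a, b = np.asarray(params_a, float), np.asarray(params_b, float)
    return np.all(np.abs(a - b) <= rel_tol * (1.0 + np.abs(b)))

def rule_6_6(seg_a, seg_b, side_a, side_b):
    """Rule 6.6: promote two adjacent maybe/must segments to must-segments when
    both the coarse and the fine labels of their shared region agree."""
    ok = {"maybe", "must"}
    if seg_a["status"] not in ok or seg_b["status"] not in ok:
        return False
    if seg_a["coarse"][side_a] != seg_b["coarse"][side_b]:
        return False
    if not fine_labels_close(seg_a["fine"][side_a], seg_b["fine"][side_b]):
        return False
    seg_a["status"] = seg_b["status"] = "must"
    return True

a = {"status": "maybe", "coarse": {"r": "cyl"}, "fine": {"r": (1.0, 2.0, 0.5)}}
b = {"status": "must",  "coarse": {"r": "cyl"}, "fine": {"r": (1.02, 1.95, 0.5)}}
print(rule_6_6(a, b, "r", "r"), a["status"])   # -> True must
```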
A couple of challenging problems arise when the idealistic assumptions are dropped. First, the wing clustering step, which was successful for POLY-3, will be hard to duplicate given our knowledge about the quality of the quadric surface wings detectable by our wing finder. We argued that wing clustering requires more accuracy on the estimated surface shape parameters than currently available. Instead of risking placing wings in the wrong cluster, a new algorithm, QUAD-2, is proposed to bypass the whole clustering step and to reconstruct the LLDG at once from the set of input wings. The second problem when dealing with realistic situations, has to do with the extensibility of wing contours to form a graph embedding of the LDG. Given that wing contours only model the local image contour shape, extending it to fit the global contour is asking a lot. No solution to this problem was offered but the Hough Transform technique was suggested to be a plausible approach. The proposed QUAD-2 algorithm has only two heuristic decision making rules. Both are based on the fact that each region in the LDG corresponds to a physical surface in 3D so that the line segments surrounding a region should all label the common region the same way. The qualitative region labels of each segment are used to discard any conflicting unwanted segments and the surface parameter estimates are checked to verify the hypothesis of a must—segment. A drawback of this algorithm is that by not clustering sampled wings into co-quadric groups and reconstructing each group independently, the transparent parallel implementation of previous algorithms is not present in QUAD-2. Neither QUAD-1 nor QUAD-2 has been implemented. Hand—traced results were provided to exemplify the algorithms. Future research is needed to solve the difficult task of extending curved wing contours so that QUAD-2 can be aptly applied. CHAPTER 7 Summary, Conclusions and Recommendations for Future Research 7 .1 Summary and Conclusions The research addressed in this dissertation dealt with extracting a labeled line drawing graph from registered range and intensity image via locally detected wing primitives. The problems of wing detection and labeled line drawing reconstruction are treated as separate issues: Wing primitives are extracted from raw fused images and reconstruc— tion of the LLDG proceeds from the sampled wings. By making it a separate issue, limitations of wing detection will not hinder the growth of line drawing reconstruction research. After introducing the research problem in Chapter 2, we set out to define the special terminology used in this thesis and to derive the mathematical formulation for various quadric surfaces, the image contours and image of the crease edge created by intersecting quadrics. The structure of the wing and its mathematical representation were discussed; in the process, a set of 98 wing primitives were conceived for the 194 195 quadric surface domain. It is those primitives that are to be extracted from the fused images. A labeled line drawing graph is a representation for the view of the scene with as much geometric and topological fidelity as is possible. We defined accidental views in terms of the surface sensing property, which dictates that each visible edge and surface must be partially sensed. Mathematical representation of various surface and contour shapes were derived and fit to facilitate the wing detection process. The first problem that we studied was on the detection of wings from fused images. 
Many issues and conclusions were drawn from this study. A major contribution of this thesis is the study of the effectiveness of simultaneously fitting 3D surface data and 2D image contour data to recover surface shape and pose parameters. A-large number of Monte Carlo experiments were performed and the results strongly support our hypothesis that fused fitting can better discriminate between different quadric shapes and provide more accurate estimations of the pose parameters than if surface data or contour data were used alone. The results are even more compelling when Gaussian noise of higher variance was introduced in the data. Experimentation on real images further supports this conclusion. An experiment was also conducted to study the minimum number of data points required to obtain reasonable fits. The performance of the fused and surface-only fitters stabilized when as few as 20 sample points were used for fitting. This result refutes the common belief that large number, hundreds or even thousands, of sample points are required to obtain good fits. Private communication [24] with our colleagues confirms that it is not the number of data points, rather, it is the coverage of the sample points that controls the success of the fit. Armed with this new found fused fitting technique, we proceeded to investigate the main problem of wing detection. It is well known that edge information from objects provides strong clues about the local shape of surface regions [70] and shape- from-boundary methods have already been implemented [1‘27]. For example, the limb 196 boundary of a circular cylinder should be detected as much by its straight boundary contour as by the curvature of its surface points. The elliptical contour at the lip of a cup or can is perhaps more revealing of 3D shape than is a local set of range samples as recent work has shown [41, 54]. We proposed to incorporate the limb boundary and crease edge position information to constrain and to aid the recovery of the surface shape parameters by simultaneous fitting the 2D contour information with the sampled 3D surface data points. Furthermore, the limb contour was shown to be of particular importance for it is instrumental in generating good initial parameter estimates for non-linear fitting. The Wing detection algorithm was tested on over 20 real fused images. Carefully studying the results enabled us to draw the following conclusions: 0 The wings can be reliably detected in polygonal scenes. Presence of spurious wings are rare and the variation of surface parameters across a plane is low. 0 The conclusion drawn from Monte Carlo experiments about the surface shape discriminatory power of fused fitting is also evident in real images. Wings near the limb boundary or crease edge contour are detected with more precision and with fewer errors in qualitative surface classification. 0 Wing detection in images of quadric surface scenes reveals that the qualitative shape of the surface is often correctly recovered but the quantitative shape parameters may have large variance over the same quadric surface. After closely examining the quality of the wing detector, we turned our attention to reconstruction of labeled line drawing graphs in Chapters 4 to 6. In Chapter 4, we studied the plausibility of reconstructing a unique line drawing of origami and polyg- onal scenes from a complete set of wing samples under some favorable conditions. 
The input wing sets were assumed to be complete, meaning that at least one wing was detected for each visible edge in the LDG but no spurious wings. Furthermore, 197 infinite computation and measurement accuracy were assumed so that the validity of the reconstruction algorithm could be accurately assessed. Under these idealistic assumptions, two deterministic algorithms, POLY-1 and POLY-2, were pr0posed for complete reconstruction of the LLDG. By allowing re—sampling of range image, com- plete reconstruction of the labeled line drawing graph is guaranteed. On the other hand, by studying the geometric constraints of the line drawing graph, rules can be deduced to facilitate the reconstruction process as in POLY-2. Since those rules are derived from the geometric properties of the LDG, they were shown to be geomet- rically “correct”, meaning that the reconstruction rules are necessary conditions for any LDG. However, as pointed out by one of the examples, those rules are not suffi- cient to guarantee the recovery of a complete LLDG. The problems lies not with the reconstruction algorithm; rather, the information retained by wings about the scene may be inadequate for complete reconstruction even in an idealistic setting. In Chapter 5, the idealistic assumptions were dropped and a new algorithm, POLY—3, was proposed to reconstruct the LLDG from the wing samples as detected by our wing sensor. The objects were still assumed to consist of planar surfaces only. Some of the problems addressed and overcome were: 1. possible inaccuracy and imprecision of wing parameters; 2. spurious wing samples that do not correspond to any physical line segment in the LDG; and 3. missing wings (i.e., no wing sample for some of the line segments of the LDG). The problem of possible inaccuracy in wing parameters was partially solved by the inherent capability of our wing sensor to produce accurate wings. However, there will always be some discrepancy in the parameters. This was overcome by introducing displacement tolerances in places where the quantitative parameter values are com- pared. Spurious wings could potentially cause difficult problems if they are present 198 during the reconstruction. Since spurious wings do not correspond to physical edges in 3D, by checking to see if the two planar surfaces represented by the wing are actu— ally coplanar, they can be discerned from other wings and discarded from the wing samples. Experiments have shown that spurious wings are rare and are often removed by the preprocessor. The missing wing problem is the reason why heuristic rules are used for reconstruction. By means of heuristic rules, the algorithm has no guarantee of reconstructing a complete LLDG. We conclude that the algorithm is sound by showing that (1) it is capable of generating an LLDG comparable to that of POLY-2 when a complete set of wing samples is given as input; and (2) a partial LLDG can be obtained even when the input wing set is incomplete. With this, we conclude that (partial) reconstruction of labeled line drawing graphs from fused range and intensity imagery is tenable for scenes involving origami and polygonal objects. Having drawn such strong conclusions about planar surface scenes, we must admit that our work on reconstructing LLDG of quadric surfaced scenes is only preliminary. We argued that under idealistic assumptions, the POLY-1 and POLY—2 algorithms can be directly extended to operate in the expanded domain. Hand traced examples supported our argument. 
We expect that the difficulties involved in reconstructing the LLDG for the polygonal domain will only be magnified for the quadric scenes. An algorithm was offered to overcome some of these difficulties, but it has not been implemented nor fully tested. Hand traced examples are available to encourage us that we are moving in the right direction. 7.2 Contributions The primary contribution of this research is the successful demonstration of recon- structing a labeled line drawing graph (LLDG) from raw fused imagery. m I—v-I-r w—~. 199 The wing detection algorithm and the reconstruction algorithms were tested on a large number of test cases, both synthetic and real. A contribution is made to strengthening “wing representation theory” by the successful implementation and testing of a wing detection algorithm. In contrast to some other representational theories, we showed that the primary feature of the wing representation theory, which is the wing, can be extracted from raw fused images. We have proposed to recover surface shape and pose parameters by simul- taneously fitting 3D surface data as well as relevant 2D contour points. We have shown by both Monte Carlo experiments and by tests on real data that fused fitting is superior to either fitting surface data or contour data alone. We also found that only a small number of data points are really necessary to obtain quality fitting results. Valuable computation time is saved by sampling the data points for fitting rather than fitting all the available data points. Also within the study of wing detection, we discovered that the limb contour position information can help provide a good starting set of parameter values for the non-linear fitting routine. A good starting vector is essential for the fitter to converge and to converge to the global optimum. The study on labeled line drawing reconstruction of polygonal scenes under ideal- istic assumptions provided insights into the geometric constraints of the line drawing graph. It was shown that the reconstruction rules derived from those ge- ometric constraints are necessary conditions but not sufficient conditions for unique reconstruction of the LDG. The example, where unique reconstruction could not be made, points out that wing representation is not mathematically adequate for recon- structing a perfect LDG. Progress toward automatic generation of the LLDG for'polygonal scenes was made by our heuristic reconstruction algorithm. The algorithm was shown by examples to be comparable to the deterministic algorithms 200 when the input wing set was complete. When the input wing set was not complete, a partial LLDG was generated. Difficulties in generalizing the reconstruction algorithm to include quadric scenes were identified and possible solutions were offered. Much more needs to be done to achieve the goal of automatic generation of the LLDG for quadric scenes. One final small contribution of this thesis is the successful fusion of registered range and intensity images to subpixel accuracy from source images in two differ- ent world coordinate frames. The fused image provides complementary information about the surfaces in view. Object recognition related processing may benefit from the use of fused data as demonstrated by our research on wing detection and fused fitting. 7.3 Recommendations for Future Research We have presented algorithms with examples that show line drawing generation and interpretation is plausible in the planar surface'domain. 
Extending the implemen— tation to handle quadric or even more general surface scenes will be an important future research topic. Some of the difficulties involved are already discussed in Chap- ter 6. The question “Is wing representation adequate for line drawing recovery and interpretation of quadric surfaced scenes?” needs to be addressed before reaching for a quadric surface scene reconstruction module. The use of fused fitting enabled a more accurate recovery of wing features within a subimage of the scene; but a poor decision on the “busyness” of a subimage can result in having no wing detected on an edge contour (missing wing) or having spurious wings. Our experiments have suggested that a fixed-scale window for wing detection is inadequate across scenes with multi-size features. Variable size windows are needed to achieve a more robust wing sensor. Toward this end, a scaled-space approach to wing 201 detection is an important theme to be explored in the future. Scale space techniques for feature extraction have been successfully implemented in the past [35, 96, 106]. A similar feat should be duplicated to achieve a variable scale wing sensor. Having demonstrated the feasibility of wing detection from fused imagery, a natu- ral improvement and extension is to forego the range image and to attempt to extract wing features using shading data alone. Intensity can be directly related to surface patch shape Via the same model parameters and hence new fitting equations, such as those of Section 3.4, can be derived. Future experiments may show that rough object shape can still be distinguished by applying the same fitting method to observations derived from a single intensity image. An immediate future goal is the parallel implementation of the wing detection and the LLDG reconstruction algorithms. Parallel implementations of our algorithms are natural. Detection of individual wings is independent of one another; once the image is tessellated into subimages, wings may be detected in parallel. With parallel wing detection, wings detected in one window with confidence may be used to aid the recovery of the wings in neighboring windows. The LLDG reconstruction procedure can also be parallelized by reconstructing each planar group of wings simultaneously. By allowing inter-group communications during reconstruction, ambiguous interpre- tations may be resolved during the reconstruction phase (recall the front face of the Big-Block example in Figure 4.18). The rules governing conflict resolution must be carefully constructed to ensure only a physically realizable interpretation survives. Finally, with the advent of our wing detector, object recognition based on wing representation theory should be studied further than previously [26] done. Aspect graph representation offers a structural multi—view representation of the objects, while wing representation explicitly defines the characteristics of the Visible surfaces of the object in a given view. By building an aspect-wing graph object model database, object recognition can be performed based on the set of detected wings. Furthermore, 202 successful partial line drawing reconstruction from wings should be most useful in object recognition tasks where the junctions in the LLDG may be used as composite Wings to reduce the prohibitive large space requirement imposed by the aspect—wing graph. APPENDICES APPENDIX A Creation of Fused Imagery In this Appendix, the process of obtaining registered range and intensity imagery will be detailed. 
We will describe the the imaging system and its set—up follow by the complete description of the data acquisition procedure including all the necessary frame transformations. For convenience in its use, the data is presented as if it were formed from the scene using parallel projection along the optical axis of the camera. However, a perspective transformation was needed in order to map 3D world surface points onto the intensity array produced by the camera, so that intensities could be assigned to 3D surface points. The resulting data is definitely not 3D, but rather 2 1/2 D, and is very much viewpoint dependent. Because the Technical Arts Scanner is a triangulating sensor, there are voids in the data due to shadowing from both the camera and laser sheet. Shadowing from the white light sources has been minimized by using two or three light sources when taking the intensity images. Care has been taken to assure registration accuracy to within one pixel in the final fused images. The fusion process is briefly outlined first; complete details follow. 203 204 Data Creation Process 1. Obtain range image (Rm) using Technical Arts White Scanner. N . Obtain intensity ima e Ic usin Innovision’s color ima in system through g g g g White Scanner’s camera. 03 . Compute the calibration matrix for mapping a surface point in Ru, to its corre- sponding point in IC. 4. Compute the transformation matrix for transforming a point in world coordinate frame to camera coordinate frame. 0‘ . Parallel project each surface point in B“, into the fused image plane (zc = 0 plane in camera-centered frame), from which depth and intensity values can easily be computed. C3 . Set the binary flag image to indicate the validity of the range/ intensity values at each element in the image. System Setup and Capturing Range Images Figure A.1 depicts the set up of the White Scanner for capturing range images in our facility. The White Scanner is a triangulation based range scanner [117]. The stage for holding the objects in the scene can be moved in two directions: along the X,”— and Yw-axes. A sheet of laser light is projected onto the scene and the laser stripe that hits the surface of the objects in the scene is picked up by the camera. It is those surface points (along the stripe) whose z-values (zw ) are computed using the world coordinate system. Note that only those surface points that are illuminated by the laser light and are visible to the camera can yield fused data values. The stage was moved in the direction alon the Xw-axis to ensure that the zw’s are com uted for 8 P 205 xc (parallel to Y w) ., 0_)) / Figure A.1. Setup of White Scanner and the two coordinate frames all visible surface points. The total distance moved along the Xw-axis and the strip seen by the camera were both quantized into 240 steps so that the resulting range image has the dimension of 240 pixels by 240 pixels. Since the final fused image contains both range and intensity data we use either term pixel or range] rather than coin another term such as fusel 1 However, the actual :13", and yw coordinates were recorded rather than the row and column numbers. An example range image is given in Figure A.2 (a). 0, which is the angle between the sheet of laser light and the optical axis of the camera, is fixed to be 45 degree. Ham, which is the normal distance from the camera to the sheet of laser light, is knowu from calibrating the White Scanner beforehand. 
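The triangulation principle behind such a scanner can be illustrated by intersecting the camera ray through a lit pixel with the laser plane. This is a generic sketch with an assumed pinhole camera and laser-sheet pose, not the White Scanner's actual calibration model.

```python
import numpy as np

def triangulate(pixel_ray_dir, cam_center, plane_point, plane_normal):
    """Intersect the camera ray through a lit pixel with the laser plane."""
    d = np.asarray(pixel_ray_dir, float)
    d = d / np.linalg.norm(d)
    denom = np.dot(plane_normal, d)
    if abs(denom) < 1e-9:
        return None                      # ray parallel to the laser sheet
    t = np.dot(plane_normal, plane_point - cam_center) / denom
    return cam_center + t * d            # 3D surface point on the laser stripe

# Assumed geometry: camera at the origin looking down +Z, laser sheet tilted
# 45 degrees relative to the optical axis and passing through (0, 0, 500).
cam = np.zeros(3)
n = np.array([0.0, np.sin(np.radians(45)), np.cos(np.radians(45))])
p0 = np.array([0.0, 0.0, 500.0])
print(triangulate([0.05, -0.02, 1.0], cam, p0, n))
```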
These two parameters enable the scanner to compute the 2.0 coordinate for each of the visible & illuminated surface points. Furthermore, knowledge of those two parameters is essential in parallel projecting the surface point into the final fused image plane as will be seen later. 206 Capturing Intensity Images The intensity image Ic was captured through the White Scanner’s camera using digi- color, an in house program, which drives the Datacube board for capturing images. The laser light was turned off and 1 to 3 white light sources were used to minimize the shadowed regions. The registered intensity image, originally of size 512 pixels x 485 pixels, was reduced to 240 x 240 using the “stretch” function (which is basi- cally block averaging) available in the HIPS [77] package. This intensity image is camera-centered as opposed to world-centered for the range image. Note that due to occlusion of the laser stripe, some surface points that are visible in the intensity image may not have corresponding data in the range image. An example intensity image is shown in Figure A.2 (b). Computing the Calibration matrix In order to obtain a fused (range and intensity) image, a calibration (transformation) matrix for mapping rangels in the range image to corresponding surface points in the the intensity image must be found. This is done by picking n, n _>_ 6, pairs of corresponding points in the range and intensity images (say R1,R2,...,R,, and 11,12, ..., In) and solving for the calibration matrix C [52]. Mathematically, this process is written as F 331%. ya. ZR. 1 ' f $1. 311. 1 $32 sz 232 1 $I2 XII; 1 . ' 04133 = . - $12.. 3112.. 23,, 1 i - $1.. 311.. 1 . Figure A.2 (a), (b) and (c) show actual images and matching data points used in computing a calibration matrix C. Note that using this calibration matrix, subpixel accuracy was achieved for these calibration points. 207 (a) Input Data I Fitted Data I Residuals Xw Yw Zw I Xi Yi I Xi2 Yi2 I X Y I I l -3.0 -1.3 -0.0 I 62.0 231.0 I 61.5 231.5 I 0.5 -0.5 2.9 -1.3 -0.0 I 133.0 215.0 I 133.1 214.3 I '0.1 0.7 -3.0 0.6 0.0 I 61.0 184.0 I 61.5 183.8 I '0.5 0.2 2.9 0.7 -0.0 I 133.0 172.0 I 133.0 172.6 I 0.0 -0.6 -2.9 4.2 -3.5 I 23.0 98.0 I 23.0 97.4 I -0.0 0.6 2.9 4.2 -3.5 I 95.0 99.0 I 95.0 99.6 I 0.0 -0.6 '3.0 7.7 -0.0 I 61.0 11.0 I 61.0 11.5 I -0.0 ‘0.5 2.9 7.7 -0.0 I 133.0 26.0 I 132.9 25.5 I 0.1 0.5 CALIBRATION MATRIX 14.92047691 -0.06266269 9.73421764 100.84538269 3.06331229 -22.63347054 '2.31218886 192.72633362 0.02696475 -0.00035766 -0.02190359 1.00000000 (c) (d) (6) Figure A.2. A complete example: (a) original range image, (b) original intensity image, (c) calibration matrix, (d) fused range image, (6) fused intensity image. Zw 208 Rotate about X by 45 Rotate about 2’ by -90 Rotate about Y" by 180 A (C") (C) Y ' (C I ) xu z c x w x Y“ xc z I 2 II Yw Figure A.3. Relating World coordinate frame to Camera-centered coordinate frame. To align the axes: rotate {C} about Xc-axis by 45 degree then rotate about the new Z~axis by -90 degree follow by 180 rotation about the new Y—axis. Forming Final Fused Images The final step in forming the registered orthographically projected fused image is to parallel project the (aw , yw , 2w ) into the fused image plane. The fused image plane is a synthetic image plane that is coplanar with the CCD array of the camera. 
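Before turning to the projection itself, the calibration solve described above can be sketched as a linear least-squares problem in the DLT style. This is an illustration under assumed conventions (a 3x4 matrix with its last entry fixed to 1), not the exact code used to produce Figure A.2.

```python
import numpy as np

def solve_calibration(world_pts, image_pts):
    """Estimate the 3x4 calibration matrix C (last entry fixed to 1) such that
    [u, v, 1] ~ C @ [x, y, z, 1], from n >= 6 correspondences."""
    A, b = [], []
    for (x, y, z), (u, v) in zip(world_pts, image_pts):
        # u*(c31*x + c32*y + c33*z + 1) = c11*x + c12*y + c13*z + c14, and
        # the analogous equation for v; unknowns are the first 11 entries of C.
        A.append([x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z]); b.append(u)
        A.append([0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z]); b.append(v)
    c, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(c, 1.0).reshape(3, 4)

def project(C, pt):
    """Map a 3D range point into the intensity image with the estimated C."""
    u, v, w = C @ np.append(np.asarray(pt, float), 1.0)
    return u / w, v / w

# Synthetic check: generate correspondences from a known matrix and recover it.
C_true = np.array([[14.9, -0.06, 9.7, 100.8],
                   [3.06, -22.6, -2.3, 192.7],
                   [0.027, -0.0004, -0.022, 1.0]])
world = [(-3, -1.3, 0), (2.9, -1.3, 0), (-3, 0.6, 0), (2.9, 0.7, 0),
         (-2.9, 4.2, -3.5), (2.9, 4.2, -3.5), (-3, 7.7, 0), (2.9, 7.7, 0)]
image = [project(C_true, p) for p in world]
C_est = solve_calibration(world, image)
print(np.round(C_est - C_true, 6))   # ~ zeros
```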
Since the exact relationship between the world coordinate frame and the camera-centered coordinate frame is known (translate the origin of {W} to {C}, then rotate the axes so that they align; see Figure A.3), the projection is carried out via the following transformation:

\[
\begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix}
= R_{x_c,45^{\circ}}\; R_{z_c',-90^{\circ}}\; R_{y_c'',180^{\circ}}
\begin{bmatrix} x_w \\ y_w \\ z_w - H_{dist} \\ 1 \end{bmatrix}
\]

Each element in the fused image plane has 3D coordinates (x_c, y_c, z_c), where z_c is the actual depth from the image plane to the object surface point. The corresponding intensity value i_c at (x_c, y_c, z_c) is taken from the original intensity image I_c, at the pixel corresponding to the surface point (x_w, y_w, z_w). This is done by projecting (x_w, y_w, z_w) into I_c using the calibration matrix derived above.

A binary flag image is used to indicate the validity of the range/intensity values at each element in the image. A pixel element is deemed to have invalid registration if at least one of the range or intensity values is not available. A background pixel will have neither range nor intensity value. A pixel has intensity but no range value when it is the projection of a surface point that is visible to the camera but not to the laser. On the other hand, when computing the range values (z_w), the objects are passed through the sheet of laser light by moving the stage along the X_w-axis direction, while the intensity image is captured with the stage centered at x_w = 0. Thus, there may be surface points that have range values but no intensities because those points were not visible to the camera at that fixed stage position. The final fused image has 5 components (flag, x_c, y_c, z_c, i_c).

Finally, because we have forward transformed the data, there are often small holes or lines in the fused data. These are due to quantization and not shadowing. Most such holes may be easily removed using a median filter (or by doing back transformation and interpolation). We thought that it might be better for scientific research to view the data as spatial samples which may not be dense in the array, and thus we have not filled the holes at this time.

Fused Images

A list of 10 sample fused images is given below. These are the 10 images used to compile the wing detection summary in Table 3.7.4.

Figure A.4. Fused range (left) and intensity (right) images: (a) agpart2, (b) block2, (c) bulb1, (d) cone1, (e) cylinder2, (f) jig1, (g) block1+column1, (h) cone+cylinder, (i) cone+cylinder+sphere, (j) hump+agpart.

APPENDIX B

Chi-Square Merit Function for Fused Fitting

Suppose we are fitting a set of n data points X_3d = {x_1^s, x_2^s, ..., x_n^s} to a model P_3d(x; Theta) = 0 with m adjustable parameters Theta. Assume that each data point x_i^s in X_3d has a measurement error that is independently and identically distributed as a normal distribution with mean 0 and standard deviation sigma_3d:

\[ P_{3d}(x_i^s;\, \Theta) \sim N(0,\, \sigma_{3d}^2), \qquad \forall\, x_i^s \in X_{3d}. \]

Then the chi-square merit function for fitting P_3d is defined as

\[ \chi^2_{n-m} = \sum_{i=1}^{n} \left( \frac{P_{3d}(x_i^s;\, \Theta)}{\sigma_{3d}} \right)^{2}. \]

Given another set of n points X_c = {x_1^c, x_2^c, ..., x_n^c} to be fit to another model P_2d(x; Theta) = 0 with the same m adjustable parameters Theta. Again, assume that each data point x_i^c in X_c has a measurement error that is independently and identically distributed as a normal distribution with mean 0 but a different standard deviation sigma_2d:

\[ P_{2d}(x_i^c;\, \Theta) \sim N(0,\, \sigma_{2d}^2), \qquad \forall\, x_i^c \in X_c. \]
Then the chi-square merit function for fitting P_2d is defined as

\[ \chi^2_{n-m} = \sum_{i=1}^{n} \left( \frac{P_{2d}(x_i^c;\, \Theta)}{\sigma_{2d}} \right)^{2}. \]

Now let X = X_3d ∪ X_c = {x_1^s, x_2^s, ..., x_n^s, x_1^c, x_2^c, ..., x_n^c}, to which we want to fit a "fused" model

\[ P_{fused}(x;\, \Theta) = a_1\, P_{3d}(x;\, \Theta) + a_2\, P_{2d}(x;\, \Theta) = 0 \]

with the same m adjustable parameters Theta as in P_3d and P_2d, where a_1 and a_2 are two weighting constants. Given that the errors of fitting x_i in X to P_3d and P_2d are all i.i.d. normally distributed, the errors of the weighted P_3d and P_2d, and consequently of P_fused, are also normally distributed. Specifically,

\[ a_1 P_{3d}(x_i;\, \Theta) \sim N(0,\, (a_1 \sigma_{3d})^2), \qquad a_2 P_{2d}(x_i;\, \Theta) \sim N(0,\, (a_2 \sigma_{2d})^2), \]
\[ P_{fused}(x_i;\, \Theta) \sim N(0,\, (a_1 \sigma_{3d})^2 + (a_2 \sigma_{2d})^2), \qquad \forall\, x_i \in X. \]

Therefore the chi-square merit function for fitting the weighted fused model is

\[ \chi^2_{2n-m} = \sum_{i=1}^{2n} \left( \frac{P_{fused}(x_i;\, \Theta)}{\sigma} \right)^{2}, \qquad \text{where } \sigma^2 = (a_1 \sigma_{3d})^2 + (a_2 \sigma_{2d})^2. \]

APPENDIX C

Parameters of the Wing Detection and the LLDG Reconstruction Algorithms

Parameters of the Wing Detection Algorithm

There are two sets of parameters present in the wing detection algorithm: image/window size parameters and the wing detection parameters (see Table C.1). The first set of parameters controls the number and the size of the individual image patches for wing detection. These parameters include the image size (N x N), the window size (n x n), the degree of overlap between adjacent windows (m rows and m columns), and the number of windows (M x M). Though other image and window shapes are possible, we have selected square images and windows in this study for simplicity.

Two window sizes were used in the studies reported in this research: 30 x 30 and 20 x 20 pixels. The larger window size was used in compiling the statistics reported in Table 3.7.4. The wing detection results shown in Figures 3.10 and 5.7 were generated with the smaller window. A large window size often yields fewer wing samples but more accurate parameter estimates, while a smaller window results in fewer missing wings but less accurate parameter estimates. The other window-size-related parameters were also varied as indicated in Table C.1.

Table C.1. Parameters of the wing detection algorithm

Tessellation of the input image:
  image size N x N                                     240 x 240 pixels
  window size n x n                                    30 x 30 or 20 x 20 pixels
  degree of window overlaps m x m                      15 x 15 or 10 x 10 pixels
  number of windows M x M                              225 and 529

Detection of interesting windows:
  initial number of potential edgelets                 2 x n
  minimum number of edgelets MIN-2D-PTS                n
  goodness-of-fit thresholds                           linear 0.1, circular 3.0, parabolic 3.0
  minimum number of pixels per region                  2 x n

Wing model fitting parameters:
  total number of data points                          30
  number of 2D/3D points n_2D / n_3D                   15/15
  number of 3D/2D/3D points n_3D / n_2D / n_3D         10/10/10
  number of 3D points n_3D                             30
  weights, fitting 2D/3D points w_2D / w_3D            1.2/0.8
  weights, fitting 3D/2D/3D points w_3D / w_2D / w_3D  0.6/0.8/0.6
  maximum number of iterations                         50
  chi-square thresholds:
    wing model involving only a PLN surface            1.5
    wing model involving only a SPH surface            2.9
    wing model involving only a CYL surface            7.5
    wing model involving two PLN surfaces              5.0
    wing model involving a PLN and a CYL surface       20.0

The first step toward wing detection is the detection of "interesting" windows.
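For concreteness, the tessellation implied by the window parameters above can be enumerated with a few lines; window placement on a regular grid is assumed here, and the counts reproduce the 225 and 529 windows listed in Table C.1.

```python
def tessellate(image_size=240, win=30, step=15):
    """Enumerate the top-left corners of overlapping square windows."""
    return [(r, c)
            for r in range(0, image_size - win + 1, step)
            for c in range(0, image_size - win + 1, step)]

print(len(tessellate(240, 30, 15)))   # 15 x 15 = 225 windows
print(len(tessellate(240, 20, 10)))   # 23 x 23 = 529 windows
```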
APPENDIX C

Parameters of the Wing Detection and the LLDG Reconstruction Algorithms

Parameters of the Wing Detection Algorithm

Two sets of parameters appear in the wing detection algorithm: the image/window size parameters and the wing detection parameters (see Table C.1). The first set controls the number and the size of the individual image patches used for wing detection. These parameters include the image size (N × N), the window size (n × n), the degree of overlap between adjacent windows (m rows and m columns), and the number of windows (M × M). Though other image and window shapes are possible, square images and windows were used in this study for simplicity; wing detection would proceed in the same way regardless of the shape of the image patch. Two window sizes were used in the studies reported in this research: 30 × 30 and 20 × 20 pixels. The larger window size was used in compiling the statistics reported in Table 3.7.4, while the wing detection results shown in Figures 3.10 and 5.7 were generated with the smaller window. A large window often yields fewer wing samples but more accurate parameter estimates, whereas a small window results in fewer missing wings but less accurate parameter estimates. The other window-size-related parameters were also varied as indicated in Table C.1.

Table C.1. Parameters of the wing detection algorithm

Tessellation of the input image
    image size N × N                              240 × 240 pixels
    window size n × n                             30 × 30 or 20 × 20 pixels
    degree of window overlap m × m                15 × 15 or 10 × 10 pixels
    number of windows M × M                       225 or 529

Detection of interesting windows
    initial number of potential edgelets          2 × n
    minimum number of edgelets MIN-2D-PTS         n
    goodness-of-fit thresholds                    linear 0.1, circular 3.0, parabolic 3.0
    minimum number of pixels per region           2 × n

Wing model fitting parameters
    total number of data points                   30
    number of 2D/3D points n_2D/n_3D              15/15
    number of 3D/2D/3D points n_3D/n_2D/n_3D      10/10/10
    number of 3D points n_3D                      30
    weights for fitting 2D/3D points w_2D/w_3D    1.2/0.8
    weights for fitting 3D/2D/3D points w_3D/w_2D/w_3D   0.6/0.8/0.6
    maximum number of iterations                  50
    χ² thresholds
        wing model involving only a PLN surface           1.5
        wing model involving only a SPH surface           2.9
        wing model involving only a CYL surface           7.5
        wing model involving two PLN surfaces             5.0
        wing model involving a PLN and a CYL surface      20.0

The first step toward wing detection is the detection of "interesting" windows. The interesting-window detection module uses 6 parameters: the initial number of Sobel edge points used for edge contour fitting, the minimum number of Sobel edge points required on a valid edge contour, the three experimentally chosen goodness-of-fit thresholds for the three types of wing contours, and the minimum number of valid pixels (surface points) in each of the two regions available for wing model fitting. The last parameter guarantees that each region is large enough for sampling.

Detection of a wing within each interesting window proceeds by fitting various wing models to the contour and/or surface points sampled from the window. The total number of sampled points (30) and the distribution of those points between surface and contour samples are among the parameters of this module. If the wing model hypothesis calls for fitting only one surface patch, then all 30 data points are sampled from that surface patch. If a contour and one surface patch are to be fitted, then 15 points are sampled from each. And if a contour and two surface patches are to be fitted, then 10 data points are sampled from the contour and from each surface patch. The weights placed on the contour and surface data points during fused fitting were 1.2 and 0.8, respectively (these were not varied in the study), and the maximum number of iterations allowed for the non-linear least-squares fit is capped at 50. The experimentally chosen χ² goodness-of-fit thresholds round out the set of parameters in wing detection.

Parameters of the Surface Models

The parameters of the wing surface models used in this research are listed in Table C.2. The ranges of these parameters for the Monte Carlo experiments of Section 3.5 were previously given in Table 3.1. Note that the equations of the 2D projections of the limbs and crease edges have the same set of parameters that created those limbs/edges.

Table C.2. Surface model parameters

    Planar:       a_1, b_1, d_1
    Spherical:    x_0, y_0, z_0, r_0
    Cylindrical:  x_0, y_0, r_0, θ, φ
    Conical:      x_0, y_0, z_0, α_0, θ, φ
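To make the entries of Table C.2 concrete, the sketch below writes the planar, spherical, and cylindrical models as residual functions that could feed the χ² merit of Appendix B. The particular algebraic forms, in particular the planar parameterization and the cylindrical-axis convention, are assumptions of the example rather than the exact forms used in this thesis.

```python
import numpy as np

def plane_residual(p, a1, b1, d1):
    """Residual of a plane written as z = a1*x + b1*y + d1
    (this parameterization of a_1, b_1, d_1 is an assumption)."""
    x, y, z = p
    return a1 * x + b1 * y + d1 - z

def sphere_residual(p, x0, y0, z0, r0):
    """Algebraic residual of a sphere with center (x0, y0, z0), radius r0."""
    x, y, z = p
    return (x - x0) ** 2 + (y - y0) ** 2 + (z - z0) ** 2 - r0 ** 2

def cylinder_residual(p, x0, y0, r0, theta, phi):
    """Residual of a cylinder of radius r0 whose axis passes through
    (x0, y0, 0) with direction given by spherical angles (theta, phi);
    this axis parameterization is an assumption for illustration."""
    axis = np.array([np.sin(phi) * np.cos(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(phi)])
    d = np.asarray(p, dtype=float) - np.array([x0, y0, 0.0])
    radial = d - np.dot(d, axis) * axis      # component perpendicular to the axis
    return np.linalg.norm(radial) - r0
```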
Parameters of the POLY-3 Algorithm

The POLY-1 and POLY-2 algorithms require no parameters because of their assumptions about measurement and computation accuracy. The POLY-3 algorithm, however, needs 5 parameters. In preprocessing of the wing samples, short compatible wings are merged into long wings. Compatibility of two overlapping wings is based on the collinearity of the wings and the coplanarity of the adjacent surfaces. Using the polar coordinate system, the 2D position of a wing can be represented by the parameters ρ and θ; two wings are therefore collinear if their respective ρ and θ values agree to within thresholds. Coplanarity of surfaces is determined by requiring the surface normals and the minimum distances from the origin to the planes to fall within predetermined thresholds. These parameters are used again during the wing clustering step, in which the planar surface equations of all the wings are clustered. The fifth parameter is used to cluster close-by scattered junctions in the final LDG into one representative junction after the reconstructed individual planar faces are merged. The threshold values for each of these parameters are listed in Table C.3; a sketch of how they might be applied follows the table.

Table C.3. Parameters of the POLY-3 algorithm

    collinear wing/line segments      |ρ_1 − ρ_2| < 0.3 and |θ_1 − θ_2| < 10°
    coplanar surfaces                 angle between the two surface normals < 10° and
                                      difference in distance to the origin < 1.5
    clustering of nearby junctions    dist(jct_1, jct_2) < 0.3
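The following minimal sketch shows how the thresholds of Table C.3 might be applied when deciding whether two overlapping wings can be merged. The wing and surface records and the function names are assumptions made for this illustration, not the POLY-3 implementation itself.

```python
import numpy as np

# Thresholds from Table C.3 (angles handled in degrees for this sketch)
RHO_TOL, THETA_TOL_DEG = 0.3, 10.0       # collinearity
NORMAL_TOL_DEG, DIST_TOL = 10.0, 1.5     # coplanarity
JUNCTION_TOL = 0.3                       # junction clustering

def collinear(w1, w2):
    """Two wings are collinear if their (rho, theta) line parameters agree."""
    return (abs(w1["rho"] - w2["rho"]) < RHO_TOL and
            abs(w1["theta"] - w2["theta"]) < THETA_TOL_DEG)

def coplanar(s1, s2):
    """Two planar surfaces are coplanar if their unit normals are nearly
    parallel and their distances to the origin are nearly equal."""
    cos_angle = np.clip(np.dot(s1["normal"], s2["normal"]), -1.0, 1.0)
    return (np.degrees(np.arccos(cos_angle)) < NORMAL_TOL_DEG and
            abs(s1["dist"] - s2["dist"]) < DIST_TOL)

def mergeable(w1, w2):
    """Overlapping wings are merged when they are collinear and their
    adjacent surfaces are pairwise coplanar."""
    return (collinear(w1, w2) and
            all(coplanar(a, b) for a, b in zip(w1["surfaces"], w2["surfaces"])))
```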