Michigan State University

This is to certify that the dissertation entitled

A KNOWLEDGE BASED APPROACH TO LANDSAT IMAGE INTERPRETATION

presented by

Jezching Ton

has been accepted towards fulfillment of the requirements for the DOCTORAL degree in COMPUTER SCIENCE

Major professor

MSU is an Affirmative Action/Equal Opportunity Institution

A KNOWLEDGE BASED APPROACH TO LANDSAT IMAGE INTERPRETATION

By

Jezching Ton

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Computer Science

1989

ABSTRACT

A KNOWLEDGE BASED APPROACH TO LANDSAT IMAGE INTERPRETATION

By

Jezching Ton

Automatic interpretation of Landsat images is valuable for various applications.
Despite the fact that it has attracted attention for decades from researchers in image processing, artificial intelligence, and pattern recognition, limited progress has been achieved and numerous problems have yet to be solved before the entire image interpretation process can be automated.

Image registration of Landsat Thematic Mapper (TM) scenes with translational and rotational differences is studied. These images could also differ in their acquisition dates. We concentrate on two major steps of image registration: control point selection and control point matching. In control point selection, we first define the properties that a good control point should satisfy, and then suggest several methods for extracting them from the input image. In control point matching, we improve a relaxation algorithm previously proposed in the literature by reducing its time complexity from O(n⁴) to O(n³), where n is the number of control points. Experimental results on Landsat 4 and 5 images show that the proposed method produces results comparable to those obtained by an experienced photointerpreter.

We propose a multiresolution technique to detect roads in Landsat images. Such a technique permits us to extract information about roads at several levels of detail. Major advantages of this method are: filling road gaps, deleting short road-like segments, and enhancing long and low contrast roads. Several Landsat TM images were used in the experiments to evaluate the road detection algorithm developed here.

Landsat images are quite complex and require a large amount of auxiliary knowledge in their interpretation. We attack this problem by first constructing a hierarchical structure for the major land cover types in the study area. Then we develop an algorithm to automate the process of spectral rule generation to identify different land cover types. Kernel image information is constructed to provide an initial interpretation of the image.
Finally, a spatial clustering segmentation technique, kernel image information, and the spectral and spatial rules are combined to provide a more detailed interpretation of the image.

Despite the progress that has been achieved in this Thesis, it is unlikely that fully automatic Landsat image interpretation can be realized in the near future. Continuing research in knowledge based image segmentation is needed to make Landsat image interpretation more successful.

To my parents and my wife for their love and support

ACKNOWLEDGEMENTS

It is my great pleasure to thank God for giving me wisdom, love, and perseverance through my Ph.D. program. I wish to thank my adviser, Dr. Anil K. Jain, for his guidance and patience in thesis preparation. I especially appreciate his endless constructive critiques and intellectually challenging questions that initiated many meaningful research directions and significantly improved my abilities as a researcher. Special thanks also go to my Ph.D. committee members, Dr. George C. Stockman, Dr. Lionel M. Ni, Dr. Jon Sticklen, and Dr. Jon Bartholic, for their friendly and helpful guidance in these years. I am grateful to Patrick Flynn, Deborah Trytten, and Sally Howden for their help in editing many drafts of my thesis, and to all faculty and graduate students in the Pattern Recognition and Image Processing Laboratory for their help during my stay in the Department.

I sincerely appreciate the technical and financial support from the Center for Remote Sensing, Michigan State University. The financial support has lasted for more than four years. The technical support, from Mr. Bill Enslin, Dr. Dave Lusch, and Dr. Dennis Hudson, was endless.

Further thanks also go to my wife, Suh-neu Su Ton, for her encouragement and patience in these years. She never complained about my poor and seemingly endless student life even after most of our friends had graduated and left.
Table of Contents

Chapter 1 Introduction ........................................ 1
1.1. Background -- Land Cover Types, Landsat Images, and Study Areas ........ 2
1.2. Statement of the Problem and the Proposed Strategy ........ 7
1.3. Knowledge Based Image Interpretation Systems ........ 11
1.4. Organization of the Thesis ........ 14
Chapter 2 Registering Landsat Images by Point Matching ........ 15
2.1. Introduction ........ 15
2.2. Control Point Selection ........ 18
2.2.1. Desired Structure Properties ........ 18
2.2.2. Candidates for Control Points ........ 19
2.3. Control Point Matching ........ 26
2.3.1. Review of Point Matching Algorithms ........ 26
2.3.2. Point Pattern Matching by Relaxation ........ 28
2.3.3. Improving the Time Complexity of the Matching Algorithm Using Mergesort ........ 31
2.3.4. Two Way Matching ........ 32
2.3.5. Point Matching Algorithm ........ 34
2.3.6. Robustness Study of the Point Matching Algorithm ........ 36
2.3.7. Mapping Function ........ 40
2.4. Experimental Results ........ 41
2.5. Conclusions ........ 46
Chapter 3 A Hierarchical Approach for Identifying Roads in Landsat Images ........ 47
3.1. Introduction ........ 47
3.2. Road Detection Using a Road Following Algorithm ........ 50
3.2.1. A One Pixel Away Road Operator ........ 51
3.2.2. A Road Following Algorithm ........ 53
3.3. A Global View of Road Enhancement ........ 55
3.3.1. Limitations of Road Enhancement by Relaxation ........ 55
3.3.2. Using Global Information in Road Detection ........ 57
3.4. Road Thickening Using an Iterative Road Operator ........ 59
3.5. Hierarchical Road Detection ........ 69
3.6. Experimental Results ........ 73
3.7. Conclusions ........ 85
Chapter 4 Knowledge Based Segmentation of Landsat Images ........ 86
4.1. Introduction ........ 87
4.2. Review of Knowledge Based Image Segmentation Techniques ........ 89
4.3. The Proposed Knowledge Based Segmentation ........ 93
4.4. Kernel Image Information ........ 98
4.4.1. Water Regions ........ 99
4.4.2. Vegetation Regions ........ 103
4.4.3. Roads ........ 103
4.5. Spectral Expert System for Landsat Image Classification ........ 107
4.5.1. Landsat Image Classification ........ 107
4.5.1.1. Supervised Classification ........ 107
4.5.1.2. Band Ratios ........ 108
4.5.1.3. Signature Extension ........ 108
4.5.2. Overview of the Proposed Spectral Expert System ........ 109
4.5.2.1. Study Areas, Landsat Images, and Major Land Cover Types ........ 110
4.5.2.2. Processing Steps ........ 110
4.5.3. Spectral Expert System for Classifying Hierarchical Land Cover Types ........ 112
4.5.3.1. Decomposing Land Cover Types into Hierarchical Structure ........ 112
4.5.3.2. Constructing Spectral Expert System ........ 114
4.5.3.3. Several Examples of Classification Hypothesis ........ 116
4.5.3.4. Experimental Results ........ 118
4.5.4. Automatic Rule Generation from Training Data ........ 119
4.6. Knowledge Based Segmentation of Landsat Images ........ 124
4.7. Discussion and Conclusions ........ 138
Chapter 5 Summary, Future Research, and Conclusions ........ 140
5.1. Summary of Image Registration ........ 140
5.2. Summary of Road Detection ........ 141
5.3. Summary of Knowledge Based Segmentation ........ 142
5.4. Contributions of the Thesis ........ 142
5.5. Future Research Topics ........ 144
5.5.1. Parameter Tuned Block in Image Processing Applications ........ 144
5.5.2. Region Based Relaxation Techniques ........ 146
5.5.3. Spatial Knowledge Acquisition Techniques ........ 146
5.6. Conclusions ........ 147
Appendix Program Generated Land Cover Spectral Rules ........ 148
List of References ........ 152

List of Tables

Table 1-1 Major land cover types in the study areas in Michigan ........ 3
Table 2-1 Spectral data of Grayling Township ........ 20
Table 2-2 Comparison of various point matching algorithms ........ 27
Table 2-3 Centroid coordinates and sizes (in pixels) of water and pad regions from 1986 and 1982 Landsat images ........ 30
Table 2-4 Results of robustness study ........ 39
Table 2-5 Evaluation of point matching results ........ 41
Table 2-6 Comparison of control point and manual registration methods ........ 43
Table 3-1 A comparison of road detection accuracy with and without hierarchy ........ 76
Table 4-1 Spectral data of Grand Traverse County ........ 111
Table 4-2 Spectral data of Grayling area ........ 111
Table 4-3 Automatically generated rules for forest category ........ 122
Table 4-4 Automatically generated rules for shrub category ........ 123
Table 4-5 Two cluster solution of Forgy algorithm on Fredrick Township vegetation regions ........ 131
Table 4-6 Three cluster solution of Forgy algorithm on Fredrick Township vegetation regions ........ 131
Table 4-7 Classification accuracy by land cover types ........ 132

List of Figures

Figure 1-1 Study areas and Landsat images from four scenes ........ 6
Figure 1-2 The proposed strategy for Landsat image interpretation ........ 10
Figure 2-1 Landsat 4 band 2 image of Blair Township, Oct. 1982 ........ 21
Figure 2-2 Landsat 5 band 2 image of Blair Township, June 1986 ........ 21
Figure 2-3 Extracted pad regions (white) and water regions (yellow) from Figure 2-1 ........ 22
Figure 2-4 Extracted pad regions (white) and water regions (yellow) from Figure 2-2 ........ 22
Figure 2-5 The two filters for oil/gas pad detection ........ 24
Figure 2-6 Success ratio vs. matching index ........ 38
Figure 2-7 Landsat 4 band 2 image of Fredrick Township, Oct. 1982 ........ 44
Figure 2-8 Landsat 5 band 2 image of Fredrick Township, June 1986 ........ 44
Figure 2-9 Extracted pad regions (white) from Figure 2-7 ........ 45
Figure 2-10 Extracted pad regions (white) from Figure 2-8 ........ 45
Figure 3-1 The 14 masks of the one pixel away road operator ........ 52
Figure 3-2 Road gap filling in a synthetic image ........ 56
Figure 3-3 Road thickening for a horizontal road ........ 63
Figure 3-4 The 14 masks for postprocessing in road thickening ........ 64
Figure 3-5 Road thickening for multiple horizontal roads ........ 65
Figure 3-6 Road thickening for 45° oriented roads ........ 66
Figure 3-7 Road thickening for 22° oriented roads ........ 67
Figure 3-8 An application of the road thickening algorithm to a Landsat image ........ 68
Figure 3-9 Hierarchical road detection flowchart ........ 72
Figure 3-10(a) Landsat TM band 3 image (256x256) in Lake County ........ 77
Figure 3-10(b) The reduced image (128x128) from Figure 3-10(a) ........ 77
Figure 3-10(c) The reduced image (64x64) from Figure 3-10(b) ........ 78
Figure 3-10(d) The detected road image (256x256) from Figure 3-10(a) ........ 78
Figure 3-10(e) The detected road image (128x128) from Figure 3-10(b) ........ 79
Figure 3-10(f) The detected road image (64x64) from Figure 3-10(c) ........ 79
Figure 3-10(g) The final level 1 road image (256x256) of Lake County ........ 80
Figure 3-11(a) Landsat TM band 3 image (256x256) of Grand Traverse County ........ 80
Figure 3-11(b) The level 1 road image (256x256) of Grand Traverse County ........ 81
Figure 3-11(c) The level 2 road image (128x128) in Grand Traverse County ........ 81
Figure 3-11(d) The level 3 road image (64x64) in Grand Traverse County ........ 82
Figure 3-11(e) The final road image (256x256) in Grand Traverse County ........ 82
Figure 3-12(a) The Landsat TM band 3 image (256x256) in Crawford County ........ 83
Figure 3-12(b) The level 1 road image (256x256) of Crawford County ........ 83
Figure 3-12(c) The level 2 road image (128x128) in Crawford County ........ 84
Figure 3-12(d) The final road image (256x256) in Crawford County ........ 84
Figure 4-1 A diagram of a popular knowledge based segmentation approach ........ 92
Figure 4-2 The proposed knowledge based segmentation procedure ........ 96
Figure 4-3 Detailed segmentation procedure of Figure 4-2 ........ 97
Figure 4-4 Landsat TM image of Alpena City, Michigan ........ 101
Figure 4-5 Water regions for Figure 4-4 ........ 101
Figure 4-6 Landsat TM image of Grand Traverse County ........ 102
Figure 4-7 Water regions for Figure 4-6 ........ 102
Figure 4-8 Landsat TM image of Fredrick Township, Crawford County ........ 104
Figure 4-9 The vegetation index image for Figure 4-4 ........ 105
Figure 4-10 The vegetation index image for Figure 4-8 ........ 105
Figure 4-11 The road image for Figure 4-4 ........ 106
Figure 4-12 The road image for Figure 4-8 ........ 106
Figure 4-13 Hierarchical structure of land cover types ........ 113
Figure 4-14 The segmented regions by spatial clustering ........ 124
Figure 4-15 The kernel information for Figure 4-4 ........ 134
Figure 4-16 The detected seeds for urban and bare land from Figure 4-15 ........ 134
Figure 4-17 The detected urban and bare land regions from Figure 4-16 ........ 135
Figure 4-18 The vegetation regions for Figure 4-4 ........ 135
Figure 4-19 The detected seeds for Figure 4-8 ........ 136
Figure 4-20 The extended regions from Figure 4-19 ........ 136
Figure 4-21 The final segmented regions for Figure 4-8 ........ 137
Figure 4-22 The ground truth image of Figure 4-8 ........ 137
Figure 5-1 Parameter tuned block in image processing ........ 145

Chapter 1
Introduction

Just as an accurate road map is essential for us to drive a car in an unfamiliar city, so too an up-to-date land cover/use map is indispensable for many land resource related activities, such as forest monitoring, urban planning, and soil studies. Failure to provide accurate land cover information may lead land resource managers to wrong conclusions or, even worse, endanger public safety. For example, a forest pest specialist might utilize land cover information to predict infestations of gypsy moths (an insect that destroys trees), and obtain a wrong conclusion because of inaccurate data. If a precautionary spraying of pesticides on a potentially endangered forest area is prescribed, the results may include both monetary loss (spraying fees) and tree loss (trees that were not sprayed but were infected).
Obtaining and updating land cover information is time consuming and costly. For example, the Michigan Department of Natural Resources began a program in 1980 to computerize the generation of land use maps of Michigan with an annual budget of one million dollars. Tens of thousands of airphotos covering various counties in Michigan were taken and then carefully interpreted by experienced photointerpreters. As of March 1989, 50 of the 83 Michigan counties have a computerized land use map [DNR89]. Furthermore, the majority of these maps are based on 1978 airphotos. It is about time to update these ten-year-old maps, even though maps of some counties have not yet been generated.

One promising approach to speeding up the generation and updating of land cover maps is the automatic interpretation of Landsat images. At least 10,000 airphotos are needed to interpret the entire State of Michigan. Unlike airphotos, only twenty Landsat scenes can cover the same area, and all these images are stored in computer-accessible format. Landsat images can also be obtained more frequently. However, Landsat images have to be interpreted to generate useful land cover maps. This interpretation process, which is traditionally performed by experienced photointerpreters, is very difficult to automate. The major goal of this thesis is to utilize image processing and artificial intelligence techniques in automating the process of Landsat image interpretation.

We have to clarify that airphotos typically have a higher resolution (1-3 m) than Landsat Thematic Mapper (TM) images (30 m), which means Landsat images cannot fully replace airphotos in detailed image interpretation. However, Landsat images contain 7 bands of data, whereas airphotos contain 3 bands of data. Automatic Landsat image interpretation techniques may be useful in several application areas such as forest monitoring and major road network updating.
Extension of the developed techniques to SPOT images (10 m resolution) may be useful in urban planning.

1.1. Background -- Land Cover Types, Landsat Images, and Study Areas

The term land cover relates to the type of features present on the surface of the earth. Urban buildings, lakes, forests, and agricultural areas are examples of land cover types. Table 1-1 shows the 10 major land cover types in the study areas, based on the land cover statistics published for the northern Lower Peninsula [Jak80] and the eastern Upper Peninsula of Michigan [Smi82]. Land cover information may change over time. For example, small forest areas may be cut for their timber or for oil/gas exploration, or agricultural land may be developed for urban usage. As mentioned before, maintaining up-to-date land cover information is important for various applications. Landsat 4 TM images, acquired every 16 days for any ground area on earth, are a good source of data for updating land cover information.

Table 1-1: Major land cover types in the study areas in Michigan.
1. Agriculture    6. Grassland
2. Deciduous      7. Shrub
3. Red Pine       8. Urban
4. Jack Pine      9. Water
5. Clearcut      10. Bare land

The Landsat 4 satellite, launched on July 16, 1982, follows a repetitive, circular, sun-synchronous, near-polar orbit at a nominal altitude of 705 km. The satellite crosses the equator at approximately 9:45 a.m. on each pass. The Landsat 4 16-day ground coverage cycle, collecting data covering the entire Earth (poles excepted), is accomplished in 233 orbits. Each Landsat scene covers a ground area of 185 km by 170 km with a spatial resolution of 30x30 meters (except the thermal band, which has a resolution of 120x120 meters). The Landsat 5 satellite has the identical band specification and resolution as Landsat 4. It was launched in 1984 to compensate for the power problem in Landsat 4. Currently, both Landsat 4 and 5 are functioning.
While Landsat 5 is collecting TM and Multispectral Scanner (MSS) data, Landsat 4 is collecting MSS data. More detailed information about Landsat 4 and 5 is available in [Lan84] and [Use85].

Images from the TM sensor aboard Landsat 4 and 5 are composed of seven spectral bands. Each bandwidth was carefully selected based on the experience of land cover classification accuracy acquired from previous Landsat images (Landsats 1-3) and the goal of maximum spectral separation of major land cover types on the earth. The band designations, spectral ranges, and principal applications are as follows:

• Band 1 (0.45-0.52 micrometers (µm)): Designed for water penetration, making it useful for coastal zone mapping. Also useful for differentiation of soil from vegetation, and deciduous from coniferous areas.
• Band 2 (0.52-0.60 µm): Designed to measure the visible green reflectance peak of vegetation for the assessment of vigor.
• Band 3 (0.63-0.69 µm): A chlorophyll absorption band important for vegetation discrimination.
• Band 4 (0.76-0.90 µm): Useful for determining biomass content and for delineation of water bodies.
• Band 5 (1.55-1.75 µm): An indication of vegetation moisture content and soil moisture. Also useful for differentiation of snow from clouds.
• Band 6 (10.40-12.50 µm): A thermal infrared band useful in vegetation stress analysis, soil moisture discrimination, and thermal mapping.
• Band 7 (2.08-2.35 µm): Useful for discriminating rock types and for hydrothermal mapping.

The study areas for this research include six sites in the northern Lower Peninsula and two sites in the eastern Upper Peninsula of Michigan (see Fig. 1-1). The available Landsat images are listed below.
These study areas were selected for three reasons: 1) Landsat 4 TM images of these areas are available, 2) the staff of the Center for Remote Sensing at Michigan State University has considerable experience in interpreting these images, and 3) some of the ground truth information is available for these regions, making the verification task easier. Landsat 4 and 5 images which are available to us from the study areas include:

1) 1986 Summer, Lower Peninsula, whole scene (185 km x 170 km, Landsat 5).
2) 1984 Summer, Lower Peninsula, Leelanau County, quarter scene (Path 22, Row 29, top-left quarter, Landsat 4).
3) 1982 Fall, Lower Peninsula, whole scene (Path 22, Row 29, Landsat 4).
4) 1984 Summer, Upper Peninsula, Chippewa County, partial scene (about 40 km x 70 km, Landsat 4).

[Figure 1-1 is a map of Michigan showing WRS scene boundaries (Path 22, Row 28, July 11, 1984, Y5013215561; Path 21, Row 29, June 8, 1986, Y5082915455; Path 22, Row 29, July 11, 1984, Y5013215564) and the study sites: 1 Antrim, June 1986; 2 Fredrick, June 1986; 3 Grayling, June 1986; 4 Lake, June 1986; 5 East Leelanau, July 1984; 6 West Leelanau, July 1984; 7 East Chippewa, July 1984; 8 West Chippewa, July 1984.]

Figure 1-1: Study areas and Landsat images from four scenes.

1.2. Statement of the Problem and the Proposed Strategy

The problem studied in this thesis is stated below.

Given a Landsat image, develop an automatic procedure to segment and interpret the image. The ultimate goal is to generate a corresponding land cover map.

Automatic interpretation of Landsat images is a difficult task. It includes an integration of image processing techniques, knowledge from domain experts, and ancillary information such as previous maps of the study area. Although this topic has been studied by many researchers [Goo87, Wha87, Tai87, Wan88], we still need to solve many problems before we can fully automate the whole process of Landsat image interpretation.
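To make the kind of spectral knowledge involved concrete, the sketch below shows what per-pixel rules over the TM bands described in Section 1.1 might look like. It is a toy illustration only: the function names and thresholds are invented here for illustration, and the actual rules used by the system are generated automatically from training data, as described in Chapter 4.

```python
# Illustrative sketch of per-pixel spectral rules over the 7 TM bands.
# Thresholds are hypothetical; the real rules are derived from training data.

def ndvi(band3, band4):
    """Normalized difference vegetation index from TM band 3 (red,
    chlorophyll absorption) and band 4 (near infrared, biomass)."""
    if band3 + band4 == 0:
        return 0.0
    return (band4 - band3) / (band4 + band3)

def classify_pixel(bands):
    """bands: dict mapping TM band number (1-7) to a reflectance value
    scaled to [0, 1]. Returns a coarse land cover hypothesis."""
    # Band 4 (0.76-0.90 um) delineates water bodies: water absorbs
    # strongly in the near infrared, so a very low value suggests water.
    if bands[4] < 0.05:
        return "water"
    # A high NDVI suggests vigorous vegetation.
    if ndvi(bands[3], bands[4]) > 0.4:
        return "vegetation"
    return "other"

# A vegetation-like spectrum: low red reflectance, high near infrared.
pixel = {1: 0.08, 2: 0.10, 3: 0.06, 4: 0.45, 5: 0.20, 6: 0.30, 7: 0.12}
print(classify_pixel(pixel))  # -> vegetation
```

The point of the sketch is only that spectral knowledge reduces to band-wise predicates; organizing such predicates into a hierarchy of land cover types is the subject of Chapter 4.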
In the following, we first discuss three important issues in automatic Landsat image interpretation, and then present a flowchart to describe the proposed approach.

1. Image Registration Techniques

Landsat images have to be registered with maps or other images of the same ground area before they can be fully utilized. In the traditional approach to image registration, a photointerpreter must select and determine the control point pairs from the two images to be registered. Then a mapping function is generated based on the control point pairs, and the resulting mapping function is used to register one image with the other. The whole registration process requires substantial human involvement. If this registration process can be automated, then a significant amount of human labor can be saved.

2. Image Segmentation Techniques

A human expert interprets an image by first identifying some regions he/she is very confident about. These identified regions are helpful in subsequent detailed image interpretation. The same strategy should be used in automatic image interpretation. Automatic image segmentation techniques should be developed to reliably extract salient objects or regions from Landsat images. These identified regions can result in an initial interpretation of the image, which can be updated by using detailed domain knowledge.

3. Integration of Domain Knowledge in the Interpretation

Domain knowledge in Landsat images includes spectral, spatial, and temporal information. Spectral knowledge captures the fact that a Landsat image consists of 7 bands, where each band has its design purpose to identify certain land cover types (see Section 1.1). Many land cover objects have certain spatial properties such as shape and size. For example, the size of an oil/gas pad area ranges from 8 to 40 pixels. Clearcut regions usually have a rectangular shape, and are larger in size. Spatial relationships among different objects are also important in the interpretation.
For example, a small grassland surrounded by an urban area is very likely to be a park. Parks should be interpreted as part of the urban area. A human expert uses a significant amount of knowledge in the image interpretation procedure. This knowledge has to be transformed into a computer accessible format when building an automatic Landsat image interpretation system.

We have attempted to solve these three problems in this thesis. The flowchart of the proposed Landsat image interpretation system is shown in Figure 1-2. In our experiments, the digital land cover/use map of Fredrick Township in Crawford County is first registered with the 1982 Landsat image. The 1986 and 1982 images of Fredrick Township are then registered through our image registration technique; details are discussed in Chapter 2. Finally, the map is used for evaluation of the interpreted 1986 Fredrick Township image.

Automatic segmentation techniques that reliably extract salient objects or kernel information (such as roads, water regions, etc.) from Landsat images are studied. We have developed automatic techniques to extract water regions and oil/gas pads in Chapter 2. Landsat road detection techniques are presented in Chapter 3. Interpretation of Landsat images involves a large amount of spectral knowledge. This spectral knowledge can either be acquired from expert photointerpreters or be derived automatically from the land cover training data. Our spectral expert system and our knowledge based approach to Landsat image segmentation are presented in Chapter 4.

We summarize two major attributes of our system:

1. Domain knowledge is used at the very beginning of the image processing.
2. Kernel information is well integrated into the whole image interpretation procedure.

Most of the existing knowledge based image interpretation techniques have been applied to aerial photographs.
The only "successful" knowledge based image interpreta- tion system for Landsat images is from the Canada Center for remote Sensing [G0087, G0185]. The brief review of their system is given in Section 1.3. Most of the available knowledge based image interpretation systems [Mck85, Goo87] use large computers and complicated control structure (Figure 4-2). Our system has been developed on an IBM AT personal computer with the exception of the prototype spectral expert system which was developed on a Sun workstation. We believe that an extension of this Thesis can produce a realistic low-cost Landsat image interpretation system. 10 Map / Landsat / Spectral Land Cover | Image Knowledge Training 1 1 Data : ............................................................................................. .l . IMAGE REGISTRATION egistered Image KNOWLEDGE ROAD DETECTION ACQUISITION OIL/GAS PAD DETECTION WATER DETECTION pectral kernel Expert System information KNOWLEDGE BASED SEGMENT ATION Figure 1-2: The proposed strategy for Landsat image interpretation. 11 1.3. Knowledge Based Image Interpretation Systems In this section a brief review of three most representative knowledge based image interpretation systems is given. Literature reviews of image registration, road detection, and Landsat image segmentation techniques are given in Chapters 2, 3, and 4, respec- tively. McKeown et a1. [McK85] have successfully designed a knowledge based expert system which uses map and domain specific knowledge to interpret airport scenes. The system is composed of three main components: an image/map database, a set of image processing tools, and a rule based system. The image/map database contains detailed information about the airport, such as the location and size of major buildings, and run- ways. The image processing tools include a region growing segmentation procedure, a road follower, procedures in linear feature extraction, and an interactive human segmen- tation system. 
The major goal of the system is to monitor the whole airport by interpreting airphotos of it. Since very detailed ground information about the airport is stored in the image/map database, the system can identify transient objects such as airplanes, and can successfully interpret airphotos of poor quality. The rules manipulate three types of primitives: regions, fragments, and function areas. The idea behind the system is to perform an initial segmentation, hypothesize interpretations for regions, and verify or weaken hypotheses based on interpretations of nearby regions. The system has successfully interpreted difficult airport scenes, even when the initial segmentations were poor. Generally speaking, the system is a successful knowledge based system. However, it has two properties which make it difficult to apply to Landsat image interpretation:

1. Very detailed airport ground information, such as the location and size of terminal buildings and parking lots, and the length and width of runways, is stored in the image/map database. This detailed ground truth information is used to interpret each segmented region of an airphoto correctly even if the airphoto is poorly taken. For an important airport, it may be worthwhile to design such a database, but for Landsat image interpretation it is impractical to design a complex database to store very detailed ground information.

2. The system is complicated and inflexible. It may take a tremendous effort to design such a system for one specific airport, and if we want the system to interpret images of other airports then the image/map database requires significant changes.

Nagao and Matsuyama [Nag80] have developed a knowledge based system which performs a structural analysis of complex aerial photographs using a technique called segmentation-by-recognition.
The flow of the system is as follows: first segment the airphoto image to get regions, then generate property tables for the regions, and finally use knowledge based rules to identify each region. The system uses knowledge about the intrinsic properties of objects such as size, shape, location, color, and texture. Whenever possible, the spatial characteristics of regions are used to identify objects rather than their spectral properties; therefore, the system can give stable results despite changes in photographic conditions. One of the major characteristics of this system is its organization in the form of a "blackboard". The blackboard contains three data structures for storing a variety of information about regions and objects: the global parameter table, the property table, and the label picture. The blackboard enables efficient information sharing between subsystems (such as the forest subsystem and the house subsystem). Although this system is a good example of how a knowledge based system with a well designed control structure can successfully analyze complex aerial photographs, there exist several problems in extending this method to Landsat image interpretation:

1. Only a few object types and simple rules are considered. For example, the rule to detect forest and grassland is as follows: if a region is (high contrast texture area) and (large vegetation area) and (large homogeneous region) and (non-water region), then the area belongs to the forest or grassland category. This rule will work only when a small number of easily distinguished land cover types are considered.

2. The segmentation method [Nag80, pp. 87-94] is not appropriate for 7-band Landsat images. Edge preserving smoothing followed by recursive splitting may divide the image into too many small regions; a 4-band 256x256 image was divided into thousands of regions [Nag80, pp. 90]. If 7-band images are used for the same area, many more regions would be obtained.

3.
Extension of this system is difficult. Since the blackboard stores all the information for all the regions, it requires much storage. If both the number of land cover types and the number of objects increase, updating the blackboard information may become time and memory intensive.

Goldberg et al. [Gol85, Goo87] have designed a hierarchical expert system for updating forest maps in British Columbia, Canada. The problem addressed is the automatic estimation of forest depletion by logging. The gigantic system (the software has more than 1,000,000 lines of Fortran code) is composed of four elements: the remotely sensed image, the geocoded database, the contextual information, and the image processing tools. The top level of the hierarchy is an expert with broad knowledge about updating forest maps using Landsat data. Below this expert are a number of specialized experts: an expert in cloud and shadow determination, experts in change detection, and an expert in maps and associated geocoded databases. The hierarchical system is organized in a pyramidal structure. At the nth level resides the chief officer, who sets broad goals for the (n-1)th level of command. The (n-1)th level managers in turn translate these goals into subgoals which are transmitted to the next level in the hierarchy. Eventually, the processing must be performed by the lowest level in the command structure; for image analysis, this last level corresponds to the low level image processing algorithms. The advantage of a hierarchical structure is the clarity and organization of how work can be performed. The disadvantage is the significant overhead paid to get simple tasks performed, especially in communications between different levels.

It may be pointed out that most of the knowledge based image interpretation systems are described with reference to aerial imagery.

1.4. Organization of the Thesis

The remainder of the dissertation is organized as follows.
Chapter 2 introduces a point matching technique for registering Landsat images. Chapter 3 proposes a hierarchical approach for road detection. Chapter 4 discusses a knowledge based approach for Landsat image interpretation. Chapter 5 provides the summary, discussions, and conclusions.

Chapter 2

Registering Landsat Images by Point Matching

Image registration of Landsat Thematic Mapper (TM) scenes with translational and rotational differences is studied in this chapter. We concentrate on two major steps of image registration: control point selection and control point matching. In control point selection, we first define the properties that a good control point should satisfy; we then suggest several methods for extracting such points from the input image. In control point matching, we improve a relaxation algorithm previously proposed in the literature by reducing its time complexity from O(n⁴) to O(n³), where n is the number of control points. The new matching algorithm uses a two way matching concept which exploits the inherent symmetry of the point matching problem. The robustness of the proposed algorithm is demonstrated through simulation experiments by evaluating a matching index. Experimental results on Landsat images show that the proposed method produces results comparable to those obtained by an experienced photointerpreter.

2.1. Introduction

Image registration is the process of geometrically aligning two or more images of the same scene. This process is required whenever two images of the same scene are analyzed, as in change detection. Traditional approaches to image registration require substantial human involvement.
The process of image registration is typically composed of four steps: (i) control points are selected from the two images by a photointerpreter; (ii) control points are matched between the two images; (iii) matched control point pairs are used to estimate a mapping function between the two images; (iv) this mapping function is used to spatially register the two images. Among these four steps, steps (iii) and (iv) can be automated, and a number of commercial software packages (such as the ERDAS image processing system [Erd86]) are available. However, in most current image registration methods, the first two steps still require manual intervention [Wel85].

Landsat 4 or 5 Thematic Mapper (TM) images can be acquired with high frequency; on average, an up-to-date Landsat image can be obtained for most areas on the Earth every 16 days. Since any newly acquired Landsat image needs to be registered with other images or maps before it can be useful, there is a great need to automate this registration process. The major difficulties in designing an automatic image registration method are control point selection and control point matching, the main topics of this chapter.

We use both a Landsat 4 TM scene taken in 1982 (path 22, row 29) and a Landsat 5 TM scene taken in 1986 (path 21, row 29) in our experiments. These scenes were received from the Earth Observation Satellite Company (EOSAT) in P-Tape format (CCT-PT). Radiometric and geometric errors have been corrected for each of these images (115 by 110 square miles). Image registration is still necessary when subscenes are used. It is also useful to develop registration methods for images with different spatial resolutions, e.g., Landsat TM image vs. MSS image, or image-to-map registration.

Several authors have addressed the problem of control point selection. Stockman et al. [Sto82] used line intersections, Goshtasby et al.
[Gos86] used centers of gravity of closed-boundary regions as control points, Davis and Kenue [Dav78] used window centers as control points, and Medioni and Nevatia [Med84] used line segments as matching primitives. In Section 2.2 we define the properties that an ideal control point candidate should satisfy, suggest several potential candidates, and discuss methods for extracting them from the images to be registered.

Numerous papers have been published on control point matching. In general, two types of information can be used in control point matching: feature properties associated with each point, and relative distances between points. Price [Pri86] employed a relaxation approach with feature-based symbolic descriptions for point matching, in which each of the possible assignments has a rating based on how well the model and image elements correspond. Ranade and Rosenfeld [Ran80] proposed an iterative point matching algorithm based on the relative distance information between points. Their method can handle only translational differences. Wang et al. [Wan83] extended Ranade's method to use feature information associated with each point. This algorithm allows translations and rotations of the images, but its time complexity is O(n⁴), where n is the number of control points. In Section 2.3.3, we use a merge sort to speed up this point matching algorithm, so that its time complexity is reduced from O(n⁴) to O(n³). Ibison and Zapalowski [Ibi86] argued that traditional relaxation labeling schemes [Ros76, Kah80, Ran80] destroy the "inherent symmetry" of the problem and suggested a "symmetric fully constrained algorithm". Their method requires that the number of correct point pairs be known a priori, an unrealistic assumption in practical remote sensing applications. In Section 2.3.4 we propose a two way matching concept which can exploit the "inherent symmetry" property and determine the correct point pairs without this prior information.
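The two way matching idea can be illustrated with a minimal sketch (pure Python; the helper name and the toy match matrix are ours, not the thesis code): a pair (A_i, B_j) is retained only when B_j is the strongest match for A_i along row i and, symmetrically, A_i is the strongest match for B_j along column j, so neither direction of the matching is privileged.

```python
def two_way_matches(P):
    """Return pairs (i, j) where column j maximizes row i of the match
    matrix P AND row i maximizes column j -- a mutual-best criterion
    illustrating the two way matching concept (a sketch, not the exact
    thesis algorithm)."""
    pairs = []
    for i, row in enumerate(P):
        j = max(range(len(row)), key=lambda c: row[c])      # best column for row i
        i_best = max(range(len(P)), key=lambda r: P[r][j])  # best row for column j
        if i_best == i and P[i][j] > 0:
            pairs.append((i, j))
    return pairs

# Toy match matrix: rows are points in image A, columns are points in image B.
P = [[0.9, 0.1, 0.0],
     [0.2, 0.7, 0.1],
     [0.4, 0.6, 0.0]]
print(two_way_matches(P))  # -> [(0, 0), (1, 1)]
```

Point A_2 (last row) prefers B_1, but B_1 prefers A_1, so the asymmetric pair is rejected; only mutually consistent pairs survive.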
The performance of the proposed point matching algorithm has been extensively studied by simulation, with results appearing in Section 2.3.6. A matching index is introduced to measure the performance of the algorithm. If roughly half of the points in each image are true control points, then the algorithm has an 80% to 90% chance of correctly identifying the true point pairs. The limitation of the proposed algorithm is that computation is still time consuming for images with a large number of control points; if both images have 70 points each, the algorithm takes over an hour (on an 80386-based IBM/PC) to determine the true point pairs. Experimental results with Landsat images show that our proposed method demonstrates performance comparable to that of an experienced photointerpreter in image registration, while requiring much less human involvement.

Scaling differences between the images to be registered will not be considered in our study, because for any Landsat image or map the scale is known. We assume that the two images can globally fit mapping equations with rotational and translational differences, because the highest mountain in our study area (mid-Michigan) is below 500 meters and the Landsat TM images are taken at a height of 705 kilometers above the earth.

The remainder of this chapter is organized as follows. Section 2.2 discusses control point selection, Section 2.3 addresses control point matching, Section 2.4 describes experiments, and Section 2.5 gives our conclusions.

2.2. Control Point Selection

A good image registration system should be capable of selecting various types of objects or regions (structures) from images which are likely to contain control point candidates. For example, control points can be road intersections or centroids of lakes [Gos86]. In Section 2.2.1, we define the properties that a structure should satisfy for reliable extraction of control points.
Then, in Section 2.2.2, we suggest several such structures and discuss extraction methods.

2.2.1. Desired Structure Properties

An ideal structure for reliable extraction of control point candidates should satisfy the following criteria:

1. Stability: The structure should not change much in shape and size from image to image. It should also appear in both images.

2. Extractability: We should be able to extract the structure consistently using standard image processing techniques. The extraction algorithm should be possible to automate and be applicable to general Landsat TM images.

3. Frequency: The structure should appear frequently in Landsat TM images. It is not worthwhile developing techniques to extract structures which occur in only a few images, because such techniques would be used only rarely.

Region based control points have two advantages in control point selection: (i) for a given image there may not be enough "point" candidates to completely constrain a registration transformation; (ii) use of the centroid of a region can provide control point coordinates with subpixel accuracy; the boundary of a region may change slightly from image to image, but its centroid is unlikely to change substantially in position.

2.2.2. Candidates for Control Points

We propose two region-based control point candidates based on the analysis of Landsat TM images of the State of Michigan. Note that different geographic regions may require different types of candidates.

1. Water regions

In Landsat TM images, water is the unique land cover type whose band reflectance intensity values always decrease as the band number increases (ignoring the thermal band, band 6) [Lus89]. This band decreasing property holds for all the test sites in all Landsat TM images of Michigan which we have examined. Table 2-1 shows the band reflectances for various land cover types in Grayling Township, Crawford County.
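The band decreasing property translates directly into a per-pixel test; a minimal sketch (pure Python, assuming each pixel is given as one reflectance value per band in the order 1, 2, 3, 4, 5, 7, with the thermal band already dropped):

```python
def is_water(bands):
    """True if reflectance strictly decreases with band number
    (thermal band excluded) -- the band decreasing property that,
    in the study area, only water exhibits."""
    return all(b2 < b1 for b1, b2 in zip(bands, bands[1:]))

# Reflectance values taken from Table 2-1 (Grayling Township):
water = [81, 37, 33, 18, 13, 6]
shrub = [80, 32, 29, 113, 96, 31]
print(is_water(water), is_water(shrub))  # -> True False
```

Applying this test to every pixel and keeping connected components of water pixels is essentially the simple segmentation used to extract water regions.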
Entries in Table 2-1 show that water is the unique land cover type with the band decreasing property. Water areas (such as lakes) usually do not change in size and shape with time. They are also frequently appearing objects in Michigan, which has thousands of lakes. Since only water pixels have the band decreasing property, they are easy to identify. Figures 2-1 and 2-2 show two Landsat TM images (band 2, size 256x256), taken in October 1982 and June 1986, respectively, of the same ground area (Blair Township, Grand Traverse County, Michigan). Water regions are colored yellow in Figures 2-3 and 2-4. They were identified by applying a simple segmentation algorithm using the band decreasing property of water regions. Candidate control points are the centroids of the segmented water regions. The white regions in these figures are oil/gas pads and will be discussed later.

Table 2-1: Spectral data of Grayling Township.

Land-Cover Type         Band 1  Band 2  Band 3  Band 4  Band 5  Band 7
Urban Residential         141     66      86      92     137      78
Shrub                      80     32      29     113      96      31
Northern Hardwood          75     31      24     166      97      26
Central Hardwood Oak       75     30      23     162      99      27
Aspen Birch                75     30      23     139      81      22
Lowland Hardwood           76     30      25     106      78      24
Conifer Red Pine           77     28      26      72      53      18
Conifer Jack Pine          77     28      26      78      69      23
Lowland Conifer            76     31      25      79      59      19
Water                      81     37      33      18      13       6
Wetland Grass              76     30      26     117      86      25

Figure 2-1: The Landsat 4 band 2 image of Blair Township, Oct. 1982.

Figure 2-2: The Landsat 5 band 2 image of Blair Township, June 1986.

Figure 2-3: Extracted pad regions (white) and water regions (yellow) from Figure 2-1.

Figure 2-4: Extracted pad regions (white) and water regions (yellow) from Figure 2-2.

2. Pad Regions

In the last decade, many oil and gas pads (drilling areas) were constructed in mid-Michigan because of the increase in gas prices. Since these pads are easily visible in Landsat images, they can be used as structures for extracting control points. Oil/gas pads are very bright objects in the visible bands of Landsat TM images.
Their size usually ranges from 8 to 40 pixels in Landsat TM images. In previous work [Ens87] we have developed techniques to extract active pad regions. The proposed pad detection algorithm is given below.

Pad Detection Algorithm

Step 1. Apply a 5x5 filter (Figure 2-5(a)) to the input Landsat TM band 2 image. The generated output image is called A1.

Step 2. Apply a filter (Figure 2-5(b)) to image A1. The output image is called A2.

Step 2a. Select the seed threshold by using a training pad file (if such a file is available).

Step 3. Select potential seeds by applying the seed threshold to image A2. The generated image is called A3.

Step 4. Use the blob coloring algorithm [Bal82] to form seed regions in image A3.

Step 5. Extend seed regions to form pad regions.

In Step 1, we use a high-pass filter to enhance those pixels located at the center of pad regions. The filter size (5x5) is based on the average size of existing pad regions. The filter is designed based on the observation that pad regions usually have substantially higher intensity values than non-pad regions. Theoretically, pad regions should have high intensity values in all three visible bands (band 1 to band 3). Band 2 is selected in our algorithm because we found experimentally that it provides the best contrast between pad regions and non-pad regions.

    (a)  -1 -1 -1 -1 -1        (b)  0 1 0
         -1 -1 -1 -1 -1             1 1 1
         -1 -1 24 -1 -1             0 1 0
         -1 -1 -1 -1 -1
         -1 -1 -1 -1 -1

Figure 2-5: The two filters for oil/gas pad detection.

In Step 2, we apply another filter to enhance pad center pixels. This step can smooth out noisy pixels that also give a high response in Step 1. The true pad center pixels should be significantly enhanced after Step 2. In Step 2a, we use training data from pads in Fredrick Township, which consists of twenty-five active pad regions. For each training pad region, its seed value is defined as the maximum of the training pad pixel values in image A2. The seed threshold is defined as the minimum seed value among the twenty-five pad regions.
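Steps 1 through 3 can be sketched as two plain convolutions followed by a threshold (pure Python, no optimization; the filter coefficients follow Figure 2-5, while the helper names, the synthetic test image, and the example threshold are ours):

```python
F5 = [[-1] * 5 for _ in range(5)]
F5[2][2] = 24                          # Figure 2-5(a): 5x5 high-pass filter
F3 = [[0, 1, 0], [1, 1, 1], [0, 1, 0]] # Figure 2-5(b): 3x3 cross filter

def convolve(img, f):
    """Valid-mode 2-D filtering of img with square filter f
    (border pixels are left at zero)."""
    k = len(f) // 2
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(k, h - k):
        for x in range(k, w - k):
            out[y][x] = sum(f[a][b] * img[y + a - k][x + b - k]
                            for a in range(len(f)) for b in range(len(f)))
    return out

def seed_mask(band2, threshold):
    """Steps 1-3: enhance pad centers with the two filters, then keep
    pixels at or above the training-derived seed threshold."""
    a2 = convolve(convolve(band2, F5), F3)
    return [[v >= threshold for v in row] for row in a2]

# Tiny synthetic example: a bright 3x3 "pad" on a dark background.
img = [[10] * 9 for _ in range(9)]
for y in range(3, 6):
    for x in range(3, 6):
        img[y][x] = 100
mask = seed_mask(img, 5000)
print(mask[4][4], mask[2][2])  # -> True False
```

The pad center survives the threshold while off-center pixels, which receive weak or negative filter responses, do not; in the thesis the threshold comes from the training pads rather than being fixed by hand.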
Potential seed regions are selected in Steps 3 and 4. In Step 5, the potential seed regions are overlaid on the original Landsat band 2 image to form pad regions. Each seed region is extended to a pad region based on region smoothness, edge information, and the prior knowledge that the maximum size of a pad region is 7x7 pixels. More details are given in [Ens87]. For the Landsat images shown in Figures 2-1 and 2-2, the white regions in Figures 2-3 and 2-4 show pad regions detected by the pad detection algorithm. Centroids of detected pad regions are used as control points.

Roads are common, visible objects in Landsat TM images. These nearly linear structures can be divided into many unit segments, and the centroid of each segment can be used to transform the line matching problem into a point matching problem. A large number of techniques for extracting roads from Landsat images and aerial photographs have been reported [Can87, Edw84, Fis81b, Gro82, Har83, Hue87, Wan87, Zhu86]. A hierarchical approach to road detection will be discussed in Chapter 3.

2.3. Control Point Matching

This section addresses the following problem: if two images A and B have m and n control points, respectively, and they have k points in common, how do we determine the k corresponding pairs of points in the two images? Note that the value of k is unknown. Without loss of generality, the remaining sections of this chapter assume that m ≤ n when the time complexity of a point matching algorithm is calculated. Since for a given Landsat image the scale information is always available, we only consider pairs of images covering roughly the same ground area but with possible translational and rotational differences.

2.3.1. Review of Point Matching Algorithms

A large number of papers have been written on point pattern matching between two images. Baird [Bai84] has developed an algorithm with expected time complexity of O(n²), where n represents the number of points in the two images.
However, his method requires that the two images have the same number of points (m = n). If we use Baird's method to solve our matching problem, then the complexity will be O((m−k)!(n−k)!) [Bai84, pp. 95], which is exponential. Ranade and Rosenfeld [Ran80] proposed a method based on the idea that a true point pair will receive stronger "support" than incorrect point pairs. The support for a given point pair is calculated from the number of point pairs whose distances to the given point pair in the two images are consistent. A relaxation scheme is then used to update the support of the point pairs. This method has time complexity O(n⁴).

Stockman et al. [Sto82] proposed an O(n⁴) image registration method using a clustering approach. First, structures such as line segments and line intersections are extracted from the images. Then, by matching structures of the same type, transformation parameters are derived from each possible pair of structures. The transformation parameters form a cluster centered at the true value of those parameters. The method is very efficient when multiple types of segments are available [Sto82]. Goshtasby et al. [Gos85] merged the convex hull concept with the clustering approach to speed up the point matching algorithm, but it still has a worst case time complexity of O(n⁵ log n).

Price and Reddy [Pri79] introduced a registration system based on matching segments in the two images. Segments are matched based on properties such as size, location, and shape. Although the time complexity of this method is O(n³), the problem of extracting appropriate segments from two images makes the method difficult to implement.

We summarize the capability and complexity of the point matching algorithms proposed in the literature in Table 2-2.

Table 2-2: Comparison of various point matching algorithms.
Legend: Y: yes, has this feature; N: does not have this feature; R: yes, but over a small range.
                          Invariance                 Information used        Worst case
Registration method    Translation Rotation Scale  Feature   Relative      time
                                                   property  distance      complexity
Baird [Bai84]               Y         Y       Y       N         Y          exponential
Ranade [Ran80]              Y         N       N       N         Y          O(n⁴)
Stockman [Sto82]            Y         Y       Y       Y         R          O(n⁴)
Goshtasby [Gos85]           Y         Y       Y       N         Y          O(n⁵ log n)
Price [Pri79]               Y         R       Y       Y         N          O(n³)
Proposed method             Y         Y       N       Y         Y          O(n³)

2.3.2. Point Pattern Matching by Relaxation

Let A = {A_1, A_2, ..., A_m} be a set of control points in image A, and B = {B_1, B_2, ..., B_n} be a set of control points in image B. The basic concept of our matching algorithm is based on the following assumption: if (A_i, B_j) is a true pair, then for any other point A_r in image A, there may be a corresponding point B_s in image B such that the distance between A_i and A_r is close to the distance between B_j and B_s. This is also true if (A_r, B_s) is a correct match. If (A_i, B_j) is a true pair, then we expect the remaining m−1 point pairs (A_r, B_s) to provide support to (A_i, B_j). Note that here m may be larger than n; in other words, in calculating the support for a given pair there may exist two points in image B assigned to the same point in A. This situation is allowed, because the true pairs should have the most support after several iterations (ignoring symmetrical arrangements of control points).

We generate a matrix P = [p_ij], of size m by n, where p_ij represents the probability of a match between points A_i and B_j. In initializing this matrix, all the information associated with the control points is used. We use segment type (water versus pad) and segment size (area in pixels) to initialize P. Control points of different types in the two images can never be matched, so their initial probability of match is set to zero. The initial value of p_ij is given by

    p_ij = min[ size(A_i)/size(B_j), size(B_j)/size(A_i) ]  if A_i and B_j are of the same type,
    p_ij = 0  otherwise.                                                                    (2.1)
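The size-ratio initialization above can be sketched directly (pure Python; representing each control point as a hypothetical (type, size) tuple is our convention for the example):

```python
def init_match_matrix(A, B):
    """Initial match probabilities p_ij: the size-ratio similarity
    min(size_a/size_b, size_b/size_a) for points of the same type,
    and zero for points of different types."""
    P = []
    for (type_a, size_a) in A:
        row = []
        for (type_b, size_b) in B:
            row.append(min(size_a / size_b, size_b / size_a)
                       if type_a == type_b else 0.0)
        P.append(row)
    return P

# Hypothetical control points: (segment type, area in pixels).
A = [("water", 40), ("pad", 12)]
B = [("water", 50), ("pad", 12), ("pad", 30)]
print(init_match_matrix(A, B))
# -> [[0.8, 0.0, 0.0], [0.0, 1.0, 0.4]]
```

The zeros for cross-type entries are what produce the large blocks of zeros in the initial match matrix, which in turn speed up the subsequent relaxation.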
Tables 2-3(a) to 2-3(d) show the centroids and sizes of the pad and water regions in the segmented Landsat images shown in Figures 2-3 and 2-4. Table 2-3(e) shows the initial match matrix P⁰. Notice the large blocks of zeros in this matrix, which correspond to the impossibility of matching points of different types. This fact speeds up the subsequent point matching algorithm.

The proposed matching algorithm uses both the types of control points and their relative distances. The support provided by the pair (A_r, B_s) towards the pair (A_i, B_j) at iteration l is computed from the consistency of the inter-point distances d_ir and d_js, i.e., from the ratios d_ir/d_js and d_js/d_ir (Equation 2.2). [...] (1) p_ij ≥ 0.8, or (2) p_ij > 0.5 and p_ij is at least five times larger than any other entry in row i and column j of P.

In Step 9, the average matching probability of any given point pair after the first iteration should be 1/(mn). We set a threshold of 1/(10mn) in Step 9 so that any point pair with matching probability below one tenth of the average is eliminated from further consideration. The purpose is to discard highly unlikely point pairs as soon as possible. The most time-consuming part of this algorithm is Step 6, which requires c·m·n·(m+n) iterations, where c is a constant. So the time complexity of our point matching algorithm is O(n³) when m ≤ n. Note that G, the number of iterations, is a constant; through a simulation study (see Section 2.3.6), we found that G = 5 is adequate for our applications.

2.3.6. Robustness Study of the Point Matching Algorithm

Given two images A and B with m and n points, respectively, having k points in common, under what conditions will the Point Matching Algorithm converge? How will the algorithm perform for various combinations of values of m, n, and k? In order to answer these questions, we performed the following Monte Carlo experiment:

1. Select numbers m, n, k, the noise level, and the number of Monte Carlo trials.

2. Randomly generate m points in a 256x256 image A.

3.
Randomly choose translation (5 to 30 pixels) and rotation (15 to 80 degrees) parameters; transform the points in image A to image B.

4. Randomly delete (m−k) points from image B, and then randomly insert (n−k) points in image B, so that there are k true point pairs. Add the specified noise to the (x, y) coordinates of all points in image B.

5. Run the Point Matching Algorithm on the points in images A and B and use the matched point pairs to compute translational and rotational parameters. If the computed parameters are close to the true parameters (from step 3) then call this trial a "success"; otherwise call it a "failure". A trial is a success if at least two thirds of the k true point pairs are detected, the translation error is within 2 pixels, and the rotation error is within 5 degrees.

The success ratio is the ratio of successful trials to the total number of trials. The simulation program was implemented on an 80386-based IBM/PC. The seed of the random number generator is extracted from the system clock. We ran 40 trials for each combination of m, n, k, and the noise level. A noise level of p means that each point in image B is shifted by a random number of pixels generated by a uniform distribution ranging from 0 to p in both the horizontal and vertical directions. Experimental results (Table 2-4(a)) show that noise has little effect on the success ratio. A possible reason is that the support function (Equation 2.2) is calculated from the relative inter-point distances, and the noise level is much smaller than these distances (the average inter-point distance is about 45 for a 256x256 image with 50 points). We do not expect to encounter errors larger than 3 pixels in control point selection in practical image registration.

We define a matching index as follows:

    MI = [(k−1)/(m−1)] · [(k−1)/(n−1)]                                                      (2.5)

We conjecture that the success ratio is correlated with MI.
A possible reason is that the support for a true pair (A_i, B_j) is determined by the product of the maximum row-support (k−1)/(m−1) and the maximum column-support (k−1)/(n−1). The values of MI are bounded by 0 ≤ MI ≤ 1; the higher the value of MI, the better the quality of the match, and MI = 1 implies a perfect match (m = n = k). Table 2-4(b) shows that increasing the matching index from 0.2 to 0.3 generally improves the success ratio to 80% or greater. This property holds for a wide range of values of m and n. Roughly speaking, this result indicates that if half of the points in each image are true control points then our algorithm has an 80% to 90% chance of correctly determining the true point pairs. Figure 2-6 shows the relationship between the success ratio and the matching index, obtained by fixing m = n = 30 and changing k from 8 to 18. Generally speaking, the success ratio is very close to 100% when the matching index is over 0.35, and is likely to be below 50% when the matching index is below 0.18.

Table 2-4(b) also gives the execution time for the proposed matching algorithm. These results show that the proposed algorithm runs faster than the O(n⁴) matching algorithm of Ranade and Rosenfeld [Ran80], and its behavior appears to be O(n³) in time complexity. For example, the ratio of CPU time for m = n = 20 to CPU time for m = n = 10 is 1.45/0.2 = 7.25 (see Table 2-4(b)), which is close to the theoretical O(n³) ratio, (20/10)³ = 8.

[Figure 2-6: Success ratio vs. matching index.]

Table 2-4: Results of robustness study.

2-4(a): Relationship between noise and success ratio for various values of m, n, k, and noise level, p. Forty Monte Carlo trials were performed for each combination of parameters. The success ratio reported here is the average over 40 trials.
m    n    k    p    Success Ratio
35   30   25   1    100%
35   30   25   2    100%
35   30   25   3    100%
20   15   12   1    100%
20   15   12   2    100%
20   15   12   3    100%
15   12   6    0     20%
15   12   6    1     15%
15   12   6    2     15%
15   12   6    3     10%

2-4(b): Relationship between matching index and success ratio, p = 0. Twenty Monte Carlo trials were performed for each combination of parameters.

                                               CPU time (minutes)
m    n    k    Matching   Success    Proposed O(n^3)   O(n^4)
               Index      Ratio      Algorithm         Algorithm [22]
10   10   5    0.20       60%        0.2               0.35
10   10   6    0.31       95%
15   15   7    0.18       70%        0.6               1.6
15   15   8    0.26       90%
20   20   10   0.22       85%        1.45              5.0
20   20   11   0.28       90%
25   25   12   0.21       80%        2.75              12.6
25   25   13   0.25       90%
30   30   15   0.23       85%        5.0               27.0
30   30   16   0.27       95%
35   35   17   0.22       80%        7.9               62.0
35   35   18   0.25       90%
40   40   20   0.24       80%        12.5              84.0
40   40   21   0.26       90%
50   50   25   0.24       80%        22.3              151.0
50   50   26   0.26       85%

2.3.7. Mapping Function

Once control point pairs have been obtained from the point matching algorithm, a mapping function can be generated to transform an image from one coordinate system to another; this is called registering the images. In our study, only rotation and translation transformations (between the two images) are considered, so the mapping function can be described as follows. Let (X,Y) represent a point in one image and let (X',Y') represent its corresponding point in the second image. Then the mapping function is given by

    X' = X cosθ - Y sinθ + Δx    (2.6)
    Y' = X sinθ + Y cosθ + Δy    (2.7)

where θ denotes the rotation parameter and Δx and Δy specify the two translation parameters.

The least squares fitting method [Gos85, Sto82] is used to estimate the translation and rotation parameters (θ, Δx, Δy). The prerequisite for applying the least squares method to image registration is that the scene should be flat, in order to avoid large fitting errors caused by local geometric distortion. Our study area, mid-Michigan, meets this requirement. For scenes consisting of hilly terrain, other fitting methods may have to be employed for registration.
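The mapping of Equations (2.6)-(2.7) is a rotation followed by a translation; a minimal sketch (function name ours, angle in radians):

```python
import math

def map_point(x, y, theta, dx, dy):
    """Apply the mapping of Equations (2.6)-(2.7):
    X' = X cos(theta) - Y sin(theta) + dx
    Y' = X sin(theta) + Y cos(theta) + dy
    """
    xp = x * math.cos(theta) - y * math.sin(theta) + dx
    yp = x * math.sin(theta) + y * math.cos(theta) + dy
    return xp, yp

# A pure translation leaves the rotation terms inactive:
print(map_point(10.0, 20.0, 0.0, -4.0, 1.0))   # (6.0, 21.0)
```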
Root Mean Square (RMS) error is commonly used to evaluate the performance of least squares fitting results. In Landsat image registration, this RMS error should be less than 1.5 pixels [Erd86]. Some commercially available software programs allow the user to assign a tolerance for the RMS error. In those situations, the least squares method is applied iteratively (eliminating the one control point pair with the largest fitting error after each iteration) until either the RMS error is below the specified threshold or too few control points are left. Iterative application of the least squares method has some well known problems [Fis81a].

2.4. Experimental Results

For the given Landsat images (Figures 2-1 and 2-2), we first extract pad regions and water regions to identify control points (see Figures 2-3 and 2-4). The band decreasing property is used to extract water regions. In pad detection, the thresholds obtained from the experiments in Fredrick Township are used for the other images. We next calculate the centroid and size of each detected region and generate the initial match matrix P(0). Then we run the Point Matching Algorithm (Section 2.3.5) to obtain the final match matrix. Inspection of Figures 2-3 and 2-4 shows that there are 12 true point pairs between the two images (Table 2-5), and our algorithm correctly identifies all of them. This result is not surprising because: (i) the size and type features associated with the control points make the matching easier, (ii) the matching index is 0.47, which implies that the matching algorithm should be successful (see Section 2.3.6), and (iii) there is very little location error associated with the control points (within 1 pixel).

Table 2-5: Evaluation of point matching results.
               1982 Image   1986 Image   Visual match   Algorithm match
Pad regions        9            12            7                7
Water regions      7             6            5                5
Total regions     16            18           12               12

The iterative least squares method [Erd86] is used to generate a mapping function based on the twelve matched point pairs; the user-specified threshold on the RMS error is 1.0 pixel. The final RMS error is 0.503 pixel, and the mapping functions between the two Landsat images (Figures 2-1 and 2-2) are given by

    X' = 0.9959X + 0.0178Y - 4.1018    (2.8)
    Y' = -0.0178X + 0.9959Y + 0.8663    (2.9)

Table 2-6 summarizes the image registration experiments on three townships using two different registration methods. In the control point method, the selection and matching of control points is done automatically. In the manual method, a photointerpreter selects and matches the control points manually. All three township images are extracted from the same two Landsat 4 and 5 TM scenes (taken in Oct. 1982 and June 1986, respectively). Both water and pad regions are extracted as control points in the Blair Township image, while only pad regions are used in the experiments on the Union and Fredrick Townships. We evaluated the registration accuracy by first selecting 6 control point pairs (by an experienced photointerpreter) from the original Blair Township TM images (Figures 2-1 and 2-2) and then generating the mapping functions using the least squares method. Table 2-6 shows that the estimated values of the translational and rotational parameters given by the manual and the control point methods are very close, and the corresponding RMS error is small. In order to test the flexibility of the point matching algorithm, we decreased the thresholds used in the pad detection algorithm to obtain more pad regions in the Blair Township images. This resulted in 54 and 31 pad regions in the 1982 and 1986 images, respectively. Twenty eight point pairs are generated by the point matching algorithm.
The mapping parameters obtained from these 28 point pairs (the second row under Blair Township in Table 2-6) are consistent with the other two sets (the first and third rows under Blair Township in Table 2-6).

The experiments on Union Township show that 13 and 17 pad regions are detected in the 1982 and 1986 images, respectively. Eight point pairs are generated by the point matching algorithm. The results show that the values of the mapping parameters generated by our algorithm are consistent with the parameters estimated by the manual method.

Figures 2-7 and 2-8 show two Landsat images covering Fredrick Township in Crawford County, Michigan. The corresponding pad regions are shown in Figures 2-9 and 2-10, respectively. These two images have 13 and 16 control points, respectively, with 9 true point pairs. Our Point Matching Algorithm correctly identified all 9 point pairs, with a matching index of 0.36. The values of the mapping parameters estimated by our point matching method are consistent with the parameters estimated by an experienced photointerpreter (see Table 2-6).

Table 2-6: Comparison of control point and manual registration methods.

Image       Registration    Control Points     Matched   ΔX        ΔY        θ          RMS Error
Location    Method          1982     1986      Pairs     (pixels)  (pixels)  (degrees)  (pixels)
Blair       Control Point   16       18        12        -4.10     0.87      -1.02      0.503
Township    Control Point   31       54        28        -4.21     1.06      -1.08      0.572
            Manual          6        6         6         -3.59     1.31      -1.16      0.403
Union       Control Point   13       17        8          3.02     23.30     -0.94      0.607
Township    Manual          6        6         6          3.13     22.54     -1.29      0.516
Fredrick    Control Point   13       16        9         -5.21     3.07      0.75       0.541
Township    Manual          5        5         5         -4.94     3.30      0.812      0.412

Figure 2-7: The Landsat 4 band 2 image of Fredrick Township, Oct. 1982.
Figure 2-8: The Landsat 5 band 2 image of Fredrick Township, June 1986.
Figure 2-9: Extracted pad regions (white) from Figure 2-7.
Figure 2-10: Extracted pad regions (white) from Figure 2-8.
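The registration step above combines least-squares estimation of (θ, Δx, Δy) with iterative removal of the worst-fitting pair until the RMS error drops below the user threshold. The sketch below uses a standard closed-form centroid/angle estimate for the rigid model of Equations (2.6)-(2.7); all names are ours, and the dissertation's own implementation follows [Erd86] and may differ in detail:

```python
import math

def fit_rigid(pairs):
    """Least-squares (theta, dx, dy) for X' = X cos t - Y sin t + dx,
    Y' = X sin t + Y cos t + dy, given pairs [(x, y, xp, yp), ...]."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    mxp = sum(p[2] for p in pairs) / n
    myp = sum(p[3] for p in pairs) / n
    s = c = 0.0
    for x, y, xp, yp in pairs:
        qx, qy, qxp, qyp = x - mx, y - my, xp - mxp, yp - myp
        c += qx * qxp + qy * qyp      # cosine component
        s += qx * qyp - qy * qxp      # sine component
    theta = math.atan2(s, c)
    dx = mxp - (mx * math.cos(theta) - my * math.sin(theta))
    dy = myp - (mx * math.sin(theta) + my * math.cos(theta))
    return theta, dx, dy

def residuals(pairs, theta, dx, dy):
    out = []
    for x, y, xp, yp in pairs:
        ex = x * math.cos(theta) - y * math.sin(theta) + dx - xp
        ey = x * math.sin(theta) + y * math.cos(theta) + dy - yp
        out.append(math.hypot(ex, ey))
    return out

def iterative_fit(pairs, rms_tol=1.0, min_pairs=3):
    """Eliminate the worst-fitting pair until RMS <= rms_tol
    or too few pairs remain."""
    pairs = list(pairs)
    while True:
        theta, dx, dy = fit_rigid(pairs)
        r = residuals(pairs, theta, dx, dy)
        rms = math.sqrt(sum(e * e for e in r) / len(r))
        if rms <= rms_tol or len(pairs) <= min_pairs:
            return theta, dx, dy, rms
        pairs.pop(r.index(max(r)))    # drop the largest residual
```

With exact pairs plus one gross mismatch, the loop discards the mismatch on the first iteration and then recovers the true parameters.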
2.5. Conclusions

As more and more digital images of the Earth are acquired from satellites, there is an increasing need to automate the image registration process. Control point selection and matching are two key steps in this process. We have discussed how to extract control point candidates from Landsat TM images and proposed an O(n^3) point matching algorithm, as well as a matching index to predict the performance of the algorithm. Some successful applications of the proposed method have been given, and the entire registration process has been automated. Results from our experiments show that the proposed point matching method can perform almost as well as a trained photointerpreter.

One limitation of the proposed method is that at least half of the control points in each image should be true control points. In our experiments each scene must also contain at least 10 water or pad regions. We have used only small subscenes (256x256) in the experiments. A possible extension is to use larger subscenes or different scene contents in the experiments. A future research topic is to develop an image registration system which can automatically extract control points and lines and combine the point and line information to register Landsat images.

Chapter 3

A Hierarchical Approach for Identifying Roads in Landsat Images

Using ancillary information is very important in detecting roads in Landsat images. Many existing road detection techniques obtain this ancillary information from existing road maps. Potential problems with such an approach include missing data, out-of-date maps, or misregistration between the map and the image. In this chapter, we propose a multiresolution technique to detect roads in Landsat images. Such a technique permits us to extract information about roads at several levels of detail.
For example, when the same size road operator is applied to a lower resolution image, a comparatively larger neighborhood is utilized to extract information about the roads. This procedure can be repeated at different image resolutions. Final road detection is performed by combining the results from the different image resolutions. The major advantages of this method are: filling road gaps, deleting short road-like segments, and enhancing long, low contrast roads. Several Landsat Thematic Mapper (TM) images were used in the experiments to evaluate the road detection algorithm developed here.

3.1. Introduction

Identifying roads in Landsat images is useful in many applications, such as automated mapping and image interpretation. Roads in Landsat TM images are usually one or two pixels wide (30-60 meters). Many automatic techniques which perform well in detecting wide roads (5-10 pixels) in aerial photographs [Zhu86, Hue87] are not appropriate for Landsat images. We classify roads in Landsat TM images into three types:

1. Major roads: Expressways and large highways; 2 or more lanes, paved, usually long and straight, and about 15-45 m wide.
2. Local roads: Permanently maintained roads; generally 2 lanes, paved or unpaved, 8-20 m wide, may have smooth curves, also long.
3. Minor roads: Access roads; maintenance may be intermittent, 2 lanes or less, usually unpaved or short, generally less than 10 m wide.

Although most road detection techniques attempt to find major roads, all three types of roads are useful in image interpretation. For example, minor roads can be used in oil/gas pad detection [Ton87]. Local and minor roads are also useful in the recognition of urban areas (Chapter 4). It is more difficult to detect local and minor roads than major roads in Landsat TM images. Further, verification of detected local and minor roads is also harder and more time consuming.

Typical approaches to road detection consist of four steps [Gui88, Nev80, Wan87, Ton87, Wan88].
(i) A local line detector is applied to the entire image to calculate a magnitude and direction value for each pixel, indicating the likelihood of that pixel being on a road. (ii) The most likely road pixels (called road seeds) are selected from the image based on the magnitude of the line detector response. (iii) A line following technique is used to extend each road seed to form a road segment. (iv) Knowledge based rules are used to refine these road segments by filling small gaps between long road segments and deleting short, isolated road segments.

A major problem in road detection is that the poor quality of the line detector response (step (i)) often causes difficulties in later processing. Lines and curves in real world imagery are not perfect. The local detectors often return responses which are due primarily to noise: a strong response may occur when no line segment is present, or a weak response may occur when a line segment is present. There is no one-to-one relationship between a detector response and the existence of a line segment.

It is important to take a "global perspective" in identifying roads. The global perspective helps us determine which roads are important, whether a road gap should be filled, and whether a small line segment is due only to noise or is a piece of a long road. Our approach to the global perspective is based on examining an image at several levels of resolution. Whether a line is long, short, strong, or weak is determined on a competitive and comparative basis by observing the image at several levels of resolution. Without this global perspective, the line detector response, which is based only on local information, cannot be improved effectively in real applications.

In the published literature, prior map knowledge has been used to guide the road detection procedure [Fis81]. Potential problems include missing data or misregistration between the map and the image.
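Steps (i) and (ii) can be sketched generically as follows. The 3x3 masks here are illustrative stand-ins, not the one pixel away operator introduced later in this chapter, and the function names and percentile handling are ours:

```python
import numpy as np

# Illustrative 3x3 line masks for the horizontal, vertical, and the two
# diagonal directions (stand-ins; the actual operator is in Section 3.2.1).
MASKS = {
    0:   np.array([[-1, -1, -1], [ 2,  2,  2], [-1, -1, -1]]),
    90:  np.array([[-1,  2, -1], [-1,  2, -1], [-1,  2, -1]]),
    45:  np.array([[-1, -1,  2], [-1,  2, -1], [ 2, -1, -1]]),
    135: np.array([[ 2, -1, -1], [-1,  2, -1], [-1, -1,  2]]),
}

def line_response(img):
    """Step (i): per-pixel magnitude/direction from the best-responding mask."""
    h, w = img.shape
    mag = np.zeros((h, w))
    direc = np.zeros((h, w), dtype=int)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            window = img[i-1:i+2, j-1:j+2]
            responses = {d: float((m * window).sum()) for d, m in MASKS.items()}
            best = max(responses, key=responses.get)
            mag[i, j] = responses[best]
            direc[i, j] = best
    return mag, direc

def road_seeds(mag, top_fraction=0.001):
    """Step (ii): select the most likely road pixels (top 0.1% magnitude)."""
    threshold = np.quantile(mag, 1.0 - top_fraction)
    return mag >= threshold
```

On a synthetic image containing a single horizontal line, the line pixels receive the strongest magnitude, direction 0, and are selected as seeds.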
Generic road knowledge (length, strength, curvature, orientation, etc.) has been organized into a knowledge based system [Wan88] to guide the road detection procedure. However, without specific information about the given image, the rules representing generic knowledge cannot function effectively. For example, one such rule may be stated as follows: if two long roads with the same orientation are separated by a small gap, then this gap should be filled. At implementation time, symbolic features like "long roads" and "small gaps" have to be quantified, and these image dependent values cannot be obtained from generic knowledge. A road gap of one hundred pixels is small for a road that is a thousand pixels long, but a five pixel gap may be too large for a road that is only thirty pixels long.

Pixel based relaxation labeling techniques have also been used to improve the line detector response [Pra80, Zuc77]. These techniques can obtain local consistency for each pixel in the image but do not necessarily obtain a global perspective by looking at the whole image. In Section 3.3 we show that the relaxation algorithm in [Zuc77] can only fill a road gap a few pixels long. For the same reason, this algorithm cannot remove noisy segments more than a few pixels long.

Multiresolution processing has received a lot of attention in the computer vision literature. Usually, the image size is fixed and the operator size is varied. The well-known Laplacian-of-Gaussian filter is a good example of this [Mar80]. Different sized local operators (filters) are applied to the same image and the filter responses are combined to obtain the final response. This concept has been widely applied in edge detection [M6188, Ros71] as well as in curve smoothing [Low88]. However, unlike edge detectors, the size of a local line detector is limited in real applications.
To our knowledge, there are no convolution mask line detectors with sizes greater than 11x11 in the published literature. This size limitation is caused by the fact that only linear areas along the "true roads" should be emphasized in a road detector. As the size of a road operator increases, the number of masks needed to enhance the possible "true road pixels" increases very quickly, which makes the design of a large-size road operator impractical. Although the Hough transform [Bal82] can be used as a global line detector, it is not appropriate for detecting curved roads, which are common in Landsat images. Using the image at multiple scales is another way to improve the line detector output. When the line detector size is unchanged and the image size is reduced, a more global line detector response can be obtained.

The remainder of this chapter is organized as follows. Section 3.2 briefly reviews a road detection algorithm which we have developed. Section 3.3 addresses the limitations of pixel based relaxation techniques and proposes a method of using an image hierarchy in road detection. Section 3.4 introduces an iterative road operator for road thickening. Section 3.5 discusses a hierarchical technique to combine road detection results from different image resolutions. Section 3.6 discusses the experimental results and Section 3.7 gives our conclusions.

3.2. Road Detection Using a Road Following Algorithm

We briefly review a road detection method [Ton87] which will be used in the later sections. The method first enhances an image with a "one pixel away" road operator, then uses a road following algorithm to detect road pixels at three levels. The one pixel away operator will be used in road thickening (Section 3.4) and the parallel road following algorithm will be used in road detection (Section 3.5).
3.2.1. A One Pixel Away Road Operator

The "one pixel away" property is based on our observation that in many Landsat images, the intensity contrast between road pixels and neighboring non-road pixels is higher when they are one pixel apart. This may be caused by the fact that in Landsat TM images each pixel covers a ground area of 30x30 meters, and the observed intensities of many road-neighboring pixels are due to a combination of the intensities of the ground cover of road and non-road regions. This one pixel away property has been used in the Duda Road Operator (DRO) [Fis81], but the DRO contains five image dependent parameters which make it difficult to implement. The proposed road operator is a modification of the DRO and of the road operators proposed by Wang and Howarth [Wan87], and is shown in Figure 3-1. The operator has a total of 14 masks associated with eight possible road directions (0°, 22°, 45°, 67°, 90°, 112°, 135°, and 157°). During the road enhancement procedure, all 14 masks associated with the operator are applied to each pixel. The road direction and road magnitude for a given pixel are derived from the mask which produces the maximum response. For a more detailed discussion of this road operator, see [Ton87].

Figure 3-1: The 14 masks of the one pixel away road operator.

3.2.2. A Road Following Algorithm

Given an enhanced image where each pixel is associated with a road magnitude and a road direction, we introduce a road following algorithm to detect the roads in the image.
First, a road magnitude histogram for all pixels in the image is constructed, and a road likelihood score is assigned to each pixel to describe the likelihood that it is a road pixel. Then a line following by graph searching technique [Bal82, Wan87] is used to extend "seeds" to form roads. In our applications, we define a pixel as a seed pixel if its road magnitude is within the top 0.1% of the road magnitude histogram.

We observe that the road following algorithm is intrinsically parallel; all the road seeds can be extended simultaneously if a multiprocessor machine is available. In each scan of the image, all the seeds can extend to their qualified neighbors as road pixels. Since our program is implemented on a single processor computer, all the seeds have to be processed sequentially. The road following algorithm is listed below.

Road Following Algorithm

Step 1. Assign a road magnitude score to all pixels of the image. Scores are 1, 2, 3, or 0 depending on whether the pixel's road operator magnitude value is within the top 5%, 5.1-10%, 10.1-20%, or 20.1-100% of the magnitude histogram, respectively.

Step 2. Assign all seed pixels (top 0.1% magnitude) to be Type 1 road pixels.

Step 3. for i := 1 to 3 do
    BEGIN
        for k := 1 to i do
            REPEAT
                Scan the entire image, from top to bottom and left to right.
                For any Type k pixel, if any of its 8-neighbors satisfies the
                following three criteria:
                    1. its edge direction is consistent (within 45 degrees),
                    2. it has not been classified yet,
                    3. its magnitude score is i,
                then assign that neighbor to be a road pixel of Type i.
            UNTIL no more road pixels can be added.
    END;

Note that in Step 3 all the Type 1 major road pixels are likely to be detected first, then all the Type 2 local road pixels, and finally all the Type 3 minor road pixels. The number of iterations needed to detect roads in an image depends on the shape of the roads present and the distribution of seeds on the roads (notice that multiple seeds can be detected for a given road).
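A sequential sketch of the Road Following Algorithm above (all names are ours; the direction-consistency test is simplified to a 45° tolerance between stored direction angles):

```python
import numpy as np

def magnitude_scores(mag):
    """Step 1: score 1/2/3/0 for top 5%, 5.1-10%, 10.1-20%, rest."""
    t5, t10, t20 = (np.quantile(mag, q) for q in (0.95, 0.90, 0.80))
    scores = np.zeros(mag.shape, dtype=int)
    scores[mag >= t20] = 3
    scores[mag >= t10] = 2
    scores[mag >= t5] = 1
    return scores

def consistent(d1, d2, tol=45):
    """Direction consistency within tol degrees (directions mod 180)."""
    diff = abs(d1 - d2) % 180
    return min(diff, 180 - diff) <= tol

def follow_roads(mag, direc, seed_fraction=0.001):
    """Steps 2-3: grow road Types 1..3 from the seed pixels."""
    h, w = mag.shape
    road = np.zeros((h, w), dtype=int)          # 0 = unlabeled
    scores = magnitude_scores(mag)
    road[mag >= np.quantile(mag, 1 - seed_fraction)] = 1   # seeds -> Type 1
    for i in (1, 2, 3):
        for k in range(1, i + 1):
            changed = True
            while changed:                       # REPEAT ... UNTIL
                changed = False
                for r in range(h):
                    for c in range(w):
                        if road[r, c] != k:
                            continue
                        for dr in (-1, 0, 1):
                            for dc in (-1, 0, 1):
                                rr, cc = r + dr, c + dc
                                if (dr or dc) and 0 <= rr < h and 0 <= cc < w \
                                   and road[rr, cc] == 0 \
                                   and scores[rr, cc] == i \
                                   and consistent(direc[r, c], direc[rr, cc]):
                                    road[rr, cc] = i
                                    changed = True
    return road
```

On a synthetic magnitude image with a single bright line, the line pixels end up labeled Type 1, and no other pixel can receive a Type 1 label.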
In our experiments with 256x256 images, we performed at most 15 iterations to obtain a road labeling.

3.3. Using Hierarchy in Road Enhancement

Using relaxation techniques in low level road enhancement has the advantage of propagating local constraints. Section 3.3.1 reviews existing relaxation techniques and their limitations. Section 3.3.2 introduces our method of road enhancement using an image hierarchy, and discusses the problems with traditional image reduction techniques for road detection purposes.

3.3.1. Limitations of Road Enhancement by Relaxation

Zucker et al. [Zuc77] used relaxation labeling for line enhancement. However, this method uses only local information in road enhancement. Using relaxation techniques in road detection may satisfy pixel label consistency from a local view, but does not necessarily achieve consistency in the global pixel labeling. For example, the length of road gap that can be filled is severely limited by the size of the line detector, regardless of the length of the road. In the following, we show that this method cannot fill a horizontal road gap more than four pixels long when 5x5 local neighborhood information is used in the relaxation procedure.

Figure 3-2 shows an ideal line image consisting of two long horizontal lines (lines A and B, pixel intensity 100) separated by a gap of five pixels. All remaining image pixels have intensity 0. The dotted window shows the local neighborhood of pixel 2. Using the relaxation procedure in [Zuc77], this local neighborhood will not support the hypothesis that pixel 2 is a road pixel. In the relaxation algorithm [Zuc77], every image pixel is assigned one of nine labels λ_i, with λ1, ..., λ8 representing the presence of a line in one of eight directions (0°, 22°, 45°, 67°, 90°, 112°, 135°, and 157°), and λ9 indicating that no line is present. For a given pixel i, the probability of each label λ at iteration (k+1) is calculated by
    p_i^(k+1)(λ) = p_i^(k)(λ) [1 + q_i^(k)(λ)] / Σ_{ν∈Λ} p_i^(k)(ν) [1 + q_i^(k)(ν)]    (3.1)

where

    q_i^(k)(λ) = Σ_j d_ij [ Σ_ν r_ij(λ,ν) p_j^(k)(ν) ]    (3.2)

Pixel 2 in Figure 3-2 has a stronger probability of the no-line label λ9 than of the horizontal-direction label λ1. From Equations (3.1) and (3.2), we obtain

    p^(k+1)(λ9) / p^(k+1)(λ1) = { p^(k)(λ9) [1 + q^(k)(λ9)] } / { p^(k)(λ1) [1 + q^(k)(λ1)] }    (3.3)

For pixel 2 in Figure 3-2, we obtain

    q^(k)(λ9) > q^(k)(λ1)    (3.4a)

since within a 5x5 neighborhood of pixel 2, the support for label λ9 is always larger than the support for λ1. Equation (3.4b) is the initial condition:

    p^(0)(λ9) > p^(0)(λ1)    (3.4b)

Combining Equations (3.4a) and (3.4b) with Equation (3.3), we obtain

    p^(k+1)(λ9) / p^(k+1)(λ1) > 1    (3.5)

Similar results can be shown for pixels 3 and 4 in Figure 3-2. So this road gap will never be filled by the relaxation algorithm in [Zuc77]. Using a similar argument, it can be shown that a horizontal noise segment which is five pixels long cannot be removed by using the relaxation algorithm with a 5x5 operator.

Prager [Pra80] also used a relaxation technique to detect region boundaries. He claimed that any pixels at the ends of roads would be unchanged during the relaxation procedure. Like the method in [Zuc77], Prager's method can make the line labeling more consistent, but it is not able to fill road gaps or remove noise segments. Other line enhancement techniques using iteration [Van77] also fail to detect roads effectively because only local information is used in the procedure.

3.3.2. Using Image Hierarchy in Road Detection

The major road information in a given image should be preserved through a scale change of the image. For example, when an image is proportionally reduced to half of its original size, the relative relationship between "long roads" and "small gaps" within the image should be maintained.
If we further reduce the image size, the original major roads should still exist, small road gaps should be filled, and short, bright noisy segments should disappear. Since major road information is invariant to scale change, this information can be obtained through repeated reduction of the image. The major road information can then be used to guide the road detection procedure.

Image hierarchies have been successfully applied in detecting region-type objects [Ros77], and the pyramid structure has been used in edge detection [Tan75, Han78]. However, image hierarchies have not been applied in road detection. The major problem is that conventional image reduction methods do not work for the line detection task. Reduction of an image from 256x256 to 128x128 involves reducing every 2x2 block of pixels to a single pixel, by the pick up method, the weighted average method, or the most distinct method. In the pick up method, the top left pixel of every 2x2 block is selected to form the reduced image. The problem with this method is that one pixel wide horizontal or vertical roads in the original image may totally disappear in the reduced image. In the weighted average method, the reduced image is formed by averaging the intensity values in every 2x2 block. The problem with this method is that, after only two or three levels of reduction, all the roads in the original image will be averaged out. In the most distinct method, the brightest pixel in every 2x2 block is selected to form the reduced image. This may result in a single color image after several levels of reduction, because the bright noisy pixels will eventually dominate.

The above problems with typical image reduction techniques can be solved if we can somehow "guarantee" that every road in the original image is two pixels wide. Then the reduced image obtained by applying the pick up method will still contain all the road pixels.
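The three 2x2 reduction methods can be sketched as follows (function names are ours; the weighted average shown uses equal weights):

```python
import numpy as np

def pick_up(img):
    """Keep the top-left pixel of every 2x2 block."""
    return img[::2, ::2]

def _blocks(img):
    """View the image as (h/2, 2, w/2, 2) blocks (even dimensions assumed)."""
    h, w = img.shape
    return img[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)

def weighted_average(img):
    """Average each 2x2 block."""
    return _blocks(img).mean(axis=(1, 3))

def most_distinct(img):
    """Keep the brightest pixel of every 2x2 block."""
    return _blocks(img).max(axis=(1, 3))
```

A one-pixel-wide vertical road in an odd column illustrates the failure mode: it vanishes under `pick_up`, is halved in intensity by `weighted_average`, and survives `most_distinct`.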
In the next section we introduce a technique to thicken every road in the original image to two pixels wide, without using any prior information about where the roads are. By recursively applying the road thickening technique and image reduction, we can use the image hierarchy to obtain global information to guide the road detection procedure.

3.4. Road Thickening Using an Iterative Road Operator

The one pixel away road operator (Figure 3-1) was originally designed to enhance road pixels. However, we found that this operator can also be applied iteratively to thicken roads, as needed for image reduction. The proposed road thickening algorithm is given below. The input to the algorithm is a Landsat image with roads one or two pixels wide. In the output image, all the roads are enhanced and thickened to about two pixels wide.

Road Thickening Algorithm

Step 1. Assign the input image to A1.

Step 2. For i := 1 to 5 do
    Apply the One Pixel Away Road Operator to image A_i to generate the output image A_{i+1}.

Step 3. Apply the postprocessing to image A6 to generate the final output image.

The purpose of the postprocessing (Step 3), as explained later, is to clean the thickened image so that the roads are about two pixels wide. The first two steps of the algorithm, which we call the Iterative Road Operator, iteratively apply the one pixel away road operator to enhance and thicken the roads in the input image. The proposed one pixel away road operator considers eight possible road directions. These directions can be grouped into the following three categories.

A. Roads with 0° or 90° orientations

Figure 3-3(a) shows a synthetic image of a one-pixel-wide horizontal road. After the first iteration of the road operator, the neighboring pixels directly above or below the horizontal road take values which are one third of the magnitude of the road pixels (see Figure 3-3(b)). All these neighboring pixels have direction code 1 (0° orientation).
Recall that the magnitude and direction of a given pixel are derived from the mask which produces the maximum response. Figure 3-3(c) shows the result after five iterations of the horizontal road detector. After k iterations, the magnitude of the neighboring road pixels (directly above or below the road pixels) is given by

    M^(k) = G Σ_{i=1}^{k} (1/3)^i

where M^(k) represents the magnitude of a neighboring pixel after k iterations, and G represents the magnitude of the road pixels. As the iteration number k increases, the magnitude of the road-neighboring pixels approaches one half of the magnitude of the road pixels. Our experiments show that it is sufficient to apply the road detector five times for the purpose of road thickening.

Note that in Figure 3-3 the road-end pixels are gradually blurred as the iteration proceeds. This is because road-end pixels distribute their magnitudes to neighboring non-road pixels. This blurring shortens every road by two to three pixels. However, the road-end blurring makes it easier to fill road gaps and to smooth out short noisy segments.

Our goal is to thicken roads so that they are two pixels wide. However, the iterative road operator might thicken one-pixel-wide horizontal roads to three pixels wide, with uneven intensity values (see Figure 3-3(c)). A postprocessing procedure, described below, has to be performed to improve the output of the road thickening.

Postprocessing

The goal of the postprocessing is to redistribute road magnitude from one side of the road to the other side, so that the detected roads are only two pixels wide. The procedure is as follows. For each pixel, evaluate all the 3x3 masks in Figure 3-4 centered at that pixel, where c2 is the gray value of the pixel under consideration.

    If 0.4c1 ≤ l1 ≤ 0.6c1 and 0.4c1 ≤ r1 ≤ 0.6c1 and
       0.4c2 ≤ l2 ≤ 0.6c2 and 0.4c2 ≤ r2 ≤ 0.6c2 and
       0.4c3 ≤ l3 ≤ 0.6c3 and 0.4c3 ≤ r3 ≤ 0.6c3
    then set l1 = c1, l2 = c2, l3 = c3, and r1 = r2 = r3 = 0.
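The convergence of the neighbor magnitude to one half of the road magnitude follows from the geometric series Σ_{i≥1}(1/3)^i = 1/2; a quick numerical check (names are ours):

```python
def neighbor_magnitude(G, k):
    """Magnitude of a pixel directly above/below the road after k
    iterations: M(k) = G * sum_{i=1..k} (1/3)^i."""
    return G * sum((1.0 / 3.0) ** i for i in range(1, k + 1))

G = 255.0
for k in (1, 2, 5):
    print(k, round(neighbor_magnitude(G, k), 1))
# k = 1 gives G/3 = 85.0, matching the first-iteration value in
# Figure 3-3(b); by k = 5 the magnitude is within 0.6 of G/2 = 127.5,
# matching the value 127 visible in Figure 3-3(c).
```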
Theoretically, the ratio of the thickened road-neighboring pixel values to the road pixel values is close to 0.5. We set a tolerance of ±0.1 on this ratio to make the postprocessing more adaptive for real images. Figure 3-3(d) shows that, after postprocessing, the road is almost two pixels wide. The few non-zero pixels in the left part of Figure 3-3(d) are caused by the road-end blurring of the iterative road operator (compare with Figure 3-3(a)).

Figure 3-5(a) contains two roads with different magnitudes and widths. The relative magnitude between the roads does not change as a result of the iterative road operator. After applying the iterative road operator and postprocessing, the one pixel wide road becomes two pixels wide and the two pixel wide road remains unchanged (see Figure 3-5(c)).

B. Roads with 45° or 135° orientations

The behavior of the iterative road operator on diagonal roads is shown in Figure 3-6. After five iterations, the magnitude of the road-neighboring pixels stabilizes at about 60% of the magnitude of the road pixels. After postprocessing, the one pixel wide roads become two pixels wide.

C. Roads with 22°, 67°, 112°, or 157° orientations

The behavior of the iterative road operator on this category of roads is not as good as on the other types of roads. Figure 3-7(b) shows the thickened road after applying the iterative road operator to a one-pixel wide road with 22° orientation (Figure 3-7(a)). Figure 3-7(c) shows the result of postprocessing.

Figure 3-8 shows an example of applying the iterative road operator to a small Landsat image. Figure 3-8(a) is a Landsat band 3 image (18x18) of Lake County, Michigan. There is a vertical road in the image (white pixels). Figure 3-8(b) shows the result of the first iteration of the road operator. Figure 3-8(c) shows the final thickened image, where the road is about two pixels wide. Figure 3-8(d) shows the reduced image (9x9) of Figure 3-8(c).
Note that the original five-pixel road gap in Figure 3-8(a) is reduced to two pixels in Figure 3-8(d), while the brightness of the road pixels has been maintained throughout the procedure.

[Figure 3-3: Road thickening for a horizontal road. (a) Original image with a one-pixel-wide horizontal road. (b) After the 1st iteration, pixels on both sides of the road are thickened. (c) Image after the 5th iteration of the road operator. (d) After postprocessing, the road is thickened to two pixels wide.]
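Gap shrinkage under image reduction can be illustrated with a simple 2x2 block reduction. The max-of-block rule used here is an assumption for illustration only; it preserves bright thin features, but it is not necessarily the pick-up method of Section 3.3.2:

```python
def reduce_max(img):
    """Halve an image; each output pixel is the max of a 2x2 input block.

    Illustrative stand-in for the 'pick up' reduction (Section 3.3.2);
    the real method may differ.
    """
    h, w = len(img) // 2, len(img[0]) // 2
    return [[max(img[2 * r][2 * c], img[2 * r][2 * c + 1],
                 img[2 * r + 1][2 * c], img[2 * r + 1][2 * c + 1])
             for c in range(w)]
            for r in range(h)]

# A one-pixel-wide bright road with a 5-pixel gap: after one reduction
# the road brightness survives and the gap shrinks to 2 pixels.
row = [255, 255, 255, 0, 0, 0, 0, 0, 255, 255, 255, 255]
img = [[0] * 12, row, [0] * 12, [0] * 12]
small = reduce_max(img)
```

Here the gap shrinks from five pixels to two while the 255-valued road pixels are kept, which is the qualitative behavior described for Figures 3-8(a) and 3-8(d).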
[Figure 3-4: The 14 masks for postprocessing in road thickening. One or two 3x3 magnitude masks are defined for each direction (0/180°, 22°, 45°, 67°, 90°, 112°, 135°, 157°); each mask consists of the entries l1–l3, c1–c3, r1–r3.]
[Figure 3-5: Road thickening for multiple horizontal roads. (a) Original image with a one-pixel-wide road (magnitude 255) and a two-pixel-wide road (magnitude 150). (b) Image after the 5th iteration of the road operator. (c) Final image.]

[Figure 3-6: Road thickening for 45° oriented roads. (a) Original image data. (b) Image after the 5th iteration. (c) Final image.]

[Figure 3-7: Road thickening for 22° oriented roads. (a) Original image. (b) Image after the 5th iteration. (c) Final image.]
[Figure 3-8: An application of the road thickening algorithm to a small Landsat image. (a) Original band 3 image (18x18). (b) Image after the first iteration (18x18). (c) Final thickened image (18x18). (d) Reduced image (9x9) of (c).]

3.5. Hierarchical Road Detection

Figure 3-9 shows the flowchart of the proposed hierarchical method for road detection. The image hierarchy is represented through multiple image resolution levels. An input image (at the lower left corner of the flowchart) is at resolution level 1. When the image size is reduced by the pick-up method (Section 3.3.2), the major road information is maintained. The number of resolution levels needed, k, is image and goal dependent. The road detection algorithm can be applied to each level of the image hierarchy to obtain a road image. The level i road image can be used to improve the road detection results at level i-1. This process is repeated until the final road image at level 1 is obtained. In the following, we first present an algorithm to merge road detection results at two neighboring resolution levels, and then provide a complete hierarchical road detection algorithm which utilizes three image resolution levels.

The procedure for merging the level i road image and the updated level i+1 road image to form the updated level i road image is discussed below.

Merge Algorithm

Given a level i road image and an updated level i+1 road image, they are combined by the following two steps:

Step 1. Enlarge the level i+1 road image to the same size as the level i image by expanding each pixel to a 2x2 block of pixels.

Step 2. Obtain the updated level i road image by using the following rules. Let l^i(p) represent the road type of the p-th pixel in the level i road image. Note that l^i(p) ∈ {0, 1, 2, 3}; if l^i(p) = 0 then pixel p is a non-road pixel.
Rule 1. If both the level i and the level i+1 road images label the road pixel p as the same type, then the label of pixel p is unchanged in the updated level i road image:
If l^i(p) = l^{i+1}(p) = 1 then l^i(p) := 1; if l^i(p) = l^{i+1}(p) = 2 then l^i(p) := 2; if l^i(p) = l^{i+1}(p) = 3 then l^i(p) := 3.

Rule 2. If pixel p in the level i+1 image ranks higher than in the level i image, then the higher-ranked label is used in the updated level i road image:
If l^i(p) = 2 and l^{i+1}(p) = 1 then l^i(p) := 1; the remaining rank combinations are handled analogously.

[Figure 3-9: Hierarchical road detection flowchart. The level 1 input image is reduced to level 2 and level 3 images; the road detection algorithm and road thickening are applied at each level, and the resulting road images are merged downward to produce the final level 1 road image.]

3.6. Experimental Results

We have tested our hierarchical road detection algorithm on more than 10 Landsat images. The results look good by visual inspection. However, only three of these images have been extensively evaluated by an experienced photointerpreter with the help of aerial photographs of the same ground areas. These three images are discussed in more detail below. Since our images are of size 256x256, only three levels of resolution were used in the experiments. Only band 3 of the Thematic Mapper was used, because visual inspection of the TM images indicated that roads are most prominent in this band. In the road detection literature, multiband Landsat images have been used to detect major roads [Wan88]. However, no evidence or experiments have been provided to indicate that the use of multiple bands can outperform single-band methods in road classification accuracy.

Figure 3-10(a) shows a 256x256 Landsat TM band 3 image of an area in Lake County, Michigan. Figure 3-10(b) shows the level 2 resolution image (128x128) obtained by applying the iterative road operator to Figure 3-10(a) followed by image reduction.
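The two-step merge algorithm of Section 3.5 can be sketched in code. The handling of rank combinations beyond those listed is our assumption (the higher-ranked label always wins, with type 1 strongest and 0, non-road, weakest):

```python
def expand2x2(img):
    """Step 1: enlarge a road-label image by replacing each pixel
    with a 2x2 block of identical labels."""
    out = []
    for row in img:
        wide = [p for p in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out

# Assumed ranking: type 1 (strong) > 2 (medium) > 3 (weak) > 0 (non-road).
RANK = {1: 3, 2: 2, 3: 1, 0: 0}

def merge(level_i, level_i1):
    """Step 2: combine a level-i road image with the (expanded) updated
    level-(i+1) road image, keeping the higher-ranked label per pixel."""
    up = expand2x2(level_i1)
    return [[p if RANK[p] >= RANK[q] else q
             for p, q in zip(ri, ru)]
            for ri, ru in zip(level_i, up)]
```

For example, a level-(i+1) pixel of type 1 promotes the whole 2x2 block of weaker level-i labels beneath it, which matches Rule 2's case l^i(p) = 2, l^{i+1}(p) = 1.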
A visual comparison of Figures 3-10(a) and 3-10(b) shows that most roads still exist in the reduced image and that the relative strength of the roads is maintained. Figure 3-10(c) shows the level 3 resolution image (64x64) obtained by applying the iterative road operator and image reduction to the level 2 image (Figure 3-10(b)). Figure 3-10(d) shows the detected roads in the level 1 image (256x256) using the one-pixel-away road operator and the parallel road following algorithm discussed in Section 2. The strength of the road pixels is color coded: red means the pixel is in the top 5% of road strength (type 1 road pixel), green means the pixel is in the top 5% to 10% of road strength (type 2 road pixel), and blue means the pixel is in the top 10% to 20% of road strength (type 3 road pixel). Figure 3-10(e) shows the level 2 roads after applying the road operator and road following to Figure 3-10(b). Note that many medium or weak road pixels in the level 1 road image (Figure 3-10(d)) are labeled as strong road pixels in the level 2 road image (Figure 3-10(e)). This is because long roads are enhanced in the reduced image, which meets our expectation that the level 2 image provides a better global perspective than the level 1 image. Figure 3-10(f) shows the level 3 road detection output corresponding to the image in Figure 3-10(c). A comparison among Figures 3-10(d), 3-10(e), and 3-10(f) shows that the roads on the left side of the image are detected more clearly as the image size is reduced. Figure 3-10(g) shows the final level 1 road image, which combines the road detection results from the three resolution levels (Figures 3-10(d), 3-10(e), and 3-10(f)). Comparing Figures 3-10(g) and 3-10(d) shows that the proposed hierarchical road detection technique enhances long roads better and identifies some weak roads which were not detected by the non-hierarchical road detection technique.
Figure 3-11(a) shows a 256x256 Landsat TM band 3 image of an area in Grand Traverse County, Michigan. Figure 3-11(b) shows the detected roads at the given 256x256 resolution (level 1). We have problems detecting a near-horizontal and a near-vertical road in the center of the image. The reason is that these roads are weak (with magnitude in the top 15% to 20% of road strength) based on the road strength histogram of the whole image. Figures 3-11(c) and 3-11(d) show the detected roads in the level 2 resolution image (128x128) and the level 3 resolution image (64x64), respectively. Note that in the level 2 road image these near-horizontal and near-vertical roads are partially detected. In this example, the level 3 road image does not provide any additional information over the level 2 road image, although the major roads in the image are still maintained. Figure 3-11(e) shows the final road detection output of the hierarchical road detection algorithm, which clearly performs better than the original level 1 road detection output. Many noise segments detected in Figure 3-11(b), which correspond to oil/gas pads, are weakened in Figure 3-11(e). The near-horizontal and near-vertical long roads at the center of the image are also detected more clearly in Figure 3-11(e) than in Figure 3-11(b).

Figure 3-12(a) shows a 256x256 Landsat TM band 3 image of an area in Crawford County, Michigan. Figures 3-12(b) and 3-12(c) show the detected roads in the level 1 and level 2 resolution images, respectively. Figure 3-12(d) shows the final road detection output obtained by combining Figures 3-12(b) and 3-12(c). Again, the combined road detection result (Figure 3-12(d)) is better than the road detection at a single level of resolution (Figure 3-12(b) or 3-12(c)). We use only two resolution levels for detecting roads in this image because the roads are fairly clear, so a third resolution level does not improve the road detection accuracy.
Table 3-1 compares the road detection accuracy of the proposed hierarchical technique with that of the non-hierarchical technique. In Table 3-1(a) the ground truth (including major, local, and minor roads) was obtained by a photointerpreter after examining airphotos and maps of the study area. The identification accuracy is the percentage of pixels on major, local, and minor roads which are detected. The commission error is the percentage of road pixels (including major, local, and minor) detected by the program but not present in the ground truth road image. The results in Table 3-1(a) show that the use of hierarchy does improve the road detection accuracy. The Grand Traverse image gives poor road identification accuracy with or without the hierarchy. After a careful investigation, we found that many roads in the Grand Traverse image could not be detected by our algorithm because of the resolution limitation of the Landsat TM image. In Table 3-1(b), the true roads were obtained by an experienced photointerpreter examining the same Landsat TM images (no aerial photographs or maps were used) that were used by our road detection algorithm. The identification accuracy with hierarchy in the Grand Traverse image is now 79.1%, which is higher than the corresponding entry in Table 3-1(a). Based on these experiments, we make the following three observations.

1. We believe that the hierarchical algorithm can achieve significant improvements over nonhierarchical algorithms, particularly in complex scenes.

2. In Table 3-1(b), the photointerpreter used the same Landsat TM images to generate the "true" road images that were used by our algorithm. Since the algorithm's accuracy is only approximately 80% relative to the photointerpreter's performance, there is still a significant gap between our hierarchical road detection algorithm and human expertise.
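The two evaluation measures can be made concrete; this sketch reflects our reading of the definitions above (commission error taken as a fraction of the detected road pixels), not the dissertation's evaluation code:

```python
def road_metrics(truth, detected):
    """Identification accuracy and commission error for binary road masks.

    truth, detected: 2-D lists of 0/1.  Identification accuracy is the
    percentage of true road pixels that are detected; commission error is
    the percentage of detected road pixels absent from the ground truth.
    """
    tp = fp = n_truth = n_det = 0
    for trow, drow in zip(truth, detected):
        for t, d in zip(trow, drow):
            n_truth += t
            n_det += d
            if d and t:
                tp += 1          # correctly detected road pixel
            elif d and not t:
                fp += 1          # committed (spurious) road pixel
    ident = 100.0 * tp / n_truth if n_truth else 0.0
    commission = 100.0 * fp / n_det if n_det else 0.0
    return ident, commission
```

On a toy pair of masks with three true road pixels, two of them detected plus one spurious detection, this yields roughly 66.7% identification accuracy and 33.3% commission error.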
In the reported literature, Wang and Newkirk [Wan88] developed a road detection system for Landsat TM images that obtained an 88% identification accuracy. However, only highway networks (major roads) were considered in their road detection system.

3. Comparing the experimental results for the Crawford County image in Tables 3-1(a) and 3-1(b), it is ironic that the performance of our road detection algorithm is rated higher when evaluated against ground truth (over 90% accuracy) than when evaluated by a photointerpreter analyzing the same Landsat image (79%). This demonstrates that the limited resolution (30 m) of the Landsat TM images has influenced the performance of the human expert. When aerial photographs with a resolution of better than 3 m are used, more accurate road maps can be obtained.

Table 3-1: A comparison of road detection accuracy with and without hierarchy.

3-1(a): Compared with ground truth (airphotos, maps)

                      Without Hierarchy               With Hierarchy
  Study Area          Identification  Commission     Identification  Commission
                      Accuracy (%)    Error (%)      Accuracy (%)    Error (%)
  Lake County             64.2           17.4            82.9           18.6
  Crawford County         90.0           12.5            90.9           12.5
  Grand Traverse          51.4           17.8            63.8           19.4

3-1(b): Compared with a photointerpreter analyzing the same Landsat TM images

                      Without Hierarchy               With Hierarchy
  Study Area          Identification  Commission     Identification  Commission
                      Accuracy (%)    Error (%)      Accuracy (%)    Error (%)
  Lake County             71.9           12.5            79.2           12.1
  Crawford County         78.1            9.7            79.4           10.5
  Grand Traverse          60.8           11.3            79.1           10.4

Figure 3-10(a): A Landsat TM band 3 image (256x256) in Lake County.
Figure 3-10(b): The reduced image (128x128) from Figure 3-10(a).
Figure 3-10(c): The reduced image (64x64) from Figure 3-10(b).
Figure 3-10(d): The detected road image (256x256) from Figure 3-10(a).
Figure 3-10(e): The detected road image (128x128) from Figure 3-10(b).
Figure 3-10(f): The detected road image (64x64) from Figure 3-10(c).
Figure 3-10(g): The final level 1 road image (256x256) of Lake County.
Figure 3-11(a): Landsat TM band 3 image (256x256) in Grand Traverse County.
Figure 3-11(b): The level 1 road image (256x256) in Grand Traverse County.
Figure 3-11(c): The level 2 road image (128x128) in Grand Traverse County.
Figure 3-11(d): The level 3 road image (64x64) in Grand Traverse County.
Figure 3-11(e): The final road image (256x256) in Grand Traverse County.
Figure 3-12(a): Landsat TM band 3 image (256x256) in Crawford County.
Figure 3-12(b): The level 1 road image (256x256) in Crawford County.
Figure 3-12(c): The level 2 road image (128x128) in Crawford County.
Figure 3-12(d): The final road image (256x256) in Crawford County.

3.7. Conclusions

This chapter has presented a hierarchical approach to road detection. Existing road detection techniques use only a local neighborhood. We combine the image hierarchy and an iterative road operator in such a way that when the image size is reduced, the major road information is maintained. Road detection can be performed at different resolutions of the given image, and these results are combined to obtain the final road detection output. Experimental results on several Landsat TM images show that the proposed hierarchical road detection algorithm performs better than traditional non-hierarchical road detection techniques. We have used only small Landsat TM images (256x256) in the experiments. Future research includes applying the technique to larger images and combining results from multiband images.

Chapter 4

Knowledge Based Segmentation of Landsat Images

In this chapter a knowledge based approach for Landsat image segmentation is proposed. Landsat images are quite complex and require a large amount of auxiliary knowledge in their interpretation. These two factors make it difficult to use common image segmentation techniques, such as top down and bottom up methods, for Landsat image interpretation.
We attack this problem by first constructing a hierarchical structure for the major land cover types in the study area. Then we use a pattern recognition technique to automate the process of spectral rule generation to identify different land cover types. Kernel image information is constructed to provide an initial interpretation of the image. Finally, a spatial clustering segmentation technique, the kernel image information, and the spectral and spatial rules are combined to provide a more detailed interpretation of the image. Only limited experimental results are provided for the proposed method on Landsat images, because computer-accessible ground truth information for evaluation purposes is very difficult to obtain. Topics for future research include developing more spatial rules and using prior map information in the image interpretation procedure.

4.1. Introduction

In the context of Landsat imagery, a segmentation technique which does not utilize any prior information about an image is not likely to be successful. Amadasun and King [Ama88] showed an example where the same image can be segmented into three types of regions (water, burnt areas, and areas with plant cover) or five types of regions (silt water, clear water, tree cover, farmlands, and burnt areas). In order to determine which segmentation is meaningful, we require prior information about the image. According to Levine [Lev85], there are two types of segmentation techniques: partial (general purpose) segmentation and complete (knowledge guided) segmentation. Partial segmentation techniques segment an image into homogeneous regions without any prior information about the image. The segmented regions do not necessarily correspond to objects in the image. Many popular segmentation techniques (such as split and merge and region growing) fall into this category.
In contrast, complete segmentation techniques achieve a final result by employing semantic knowledge in the segmentation process. This type of segmentation is preferable because it is knowledge guided and goal driven, and the segmented regions are constrained to correspond to objects in the image. Most of the region based interpretation techniques [Nag80, Oht85, McK85] initially use general purpose segmentation as a low level (bottom up) process and then use domain knowledge (top down) to refine the bottom up segmentation.

The knowledge used in segmentation can be divided into two types: spectral and spatial. In Landsat images, the spectral properties associated with different land cover types have been extensively studied and reported in the remote sensing literature [Ens83, Lus89, Lil79]. This spectral knowledge plays an important role in satellite image segmentation. For example, if a region in a Landsat image has very high reflectance in the near infrared band and low reflectance in the red band, then the region is vegetation. Wharton [Wha87] developed a spectral expert system for urban land cover discrimination. However, this system requires experienced photointerpreters to spend a considerable amount of time generating the spectral rules. Furthermore, many of the spectral rules have to be significantly updated for a geographically different ground area (say, from Washington, D.C. to Michigan). It would be desirable if these spectral rules could be automatically generated. In Section 4.5 we explore techniques to automate the process of spectral rule generation from training data.

Spatial knowledge deals with the spatial relationships among various objects in the image, and has been widely used in the interpretation of aerial photographs. McKeown [McK85] used spatial knowledge to recognize objects in images of airports.
Examples of spatial knowledge in this system are that airplanes should be adjacent to or on runways and that terminal buildings should connect to roads. Ohta [Oht85] used spatial knowledge in interpreting outdoor scenes; for example, the sky should be above the trees and windows should be part of buildings. The extraction of spatial rules for land cover types in Landsat images is much more difficult than for man-made objects in aerial photographs. With the currently available techniques in artificial intelligence and computer vision, it is unrealistic to automate the process of spatial rule generation for Landsat image interpretation. The major reason is that, although a computer program may outperform a human expert in manipulating digital numbers to generate spectral rules, a human expert is still more skilled than a computer program at recognizing spatial relationships among different objects.

The rest of the chapter is organized as follows. Section 4.2 reviews knowledge based segmentation techniques in the area of remote sensing. Section 4.3 presents the proposed knowledge based segmentation technique. Section 4.4 deals with the acquisition of kernel image information. Section 4.5 discusses the construction of the hierarchical land cover structure and the knowledge acquisition of spectral rules. Section 4.6 gives the proposed knowledge based segmentation algorithm and experimental results. Section 4.7 provides the conclusions.

4.2. Review of Knowledge Based Image Segmentation Techniques

General purpose image segmentation techniques are indispensable in a knowledge based segmentation procedure, in spite of the importance of prior domain knowledge. Frequently used general purpose segmentation techniques include split and merge [Hor76], region growing [Zuc76], recursive splitting [Ohl78], and spatial clustering [Bur81]. Good reviews of these approaches are available in [Har85] and [Tai86].
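Spatial clustering treats each pixel as a point in multiband space and partitions these points into clusters. A minimal k-means sketch in pure Python (illustrative only; not the algorithm of [Bur81] or any other cited work, and seeded deterministically from the sorted data for simplicity):

```python
def kmeans(points, k, iters=10):
    """Partition multiband pixel vectors (tuples of band values) into k
    clusters.  Centers are seeded from the sorted data for determinism."""
    pts = sorted(points)
    centers = [pts[(i * len(pts)) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # assignment step: nearest center by squared Euclidean distance
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
        # update step: each center becomes the per-band mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

# Six 2-band pixel vectors forming two obvious spectral clusters
# (toy values, e.g. dark water pixels vs. bright soil pixels).
pixels = [(10, 12), (11, 13), (12, 11), (90, 95), (92, 93), (91, 96)]
labels = kmeans(pixels, k=2)
```

The same code works unchanged for 7-band TM pixel vectors; choosing k then plays the role of the single clustering parameter discussed below.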
These segmentation techniques can be broadly divided into two categories: histogram oriented and cluster oriented. Histogram oriented segmentation generates a histogram for each individual band of the multispectral data, segments each band individually, and then overlays the per-band segmentation results, producing more fragmented regions. Cluster oriented segmentation uses all bands of data to partition the image pixels into clusters. Both Ohlander [Ohl78] and Nagao [Nag80] used histogram oriented segmentation in their work. Nagao and Matsuyama [Nag80] used a histogram oriented split and merge technique for aerial photograph segmentation. Their technique divided a 4-band 256x256 image into thousands of regions [Nag80, p.90]. If 7-band Landsat TM images of the same size were used, many more regions would typically be obtained.

Cluster oriented techniques may be more appropriate than histogram oriented techniques for segmenting multiband satellite images. This idea has been suggested before [Sch76], but has not attracted the attention it deserves. When single-band images are considered, histogram oriented segmentation is the same as cluster oriented segmentation. When multiband images are considered, cluster oriented segmentation can be viewed as merging the multiband histograms into one histogram and then doing the segmentation. Clustering is a mature research field, and we can view it as transforming a multiple-parameter estimation problem (determining thresholds for the histogram of each band) into a single-parameter estimation problem (the number of clusters) [Jai88b].

A combination of top down and bottom up strategies is an often used knowledge based segmentation approach [Nag80, Oht85, Hwa86, Mat87]. The major concept of this approach is presented in Figure 4-1. For a given input image, first an initial general purpose segmentation is performed to divide it into many homogeneous regions. Region
Region 90 properties or features such as mean intensity values, size and shape are collected. The object description module then provides an initial interpretation for each region based on its associated features. For example, in a typical outdoor scene, a tree is spectrally green and a window of a house is. bright with a rectangular shape. The spatial relationship gui- dance module provides spatial knowledge among different objects. This spatial knowledge or contextual relationship helps in further processing of partially interpreted images, such as in hypothesis integration [Hwa86] or plan generation [Oht85]. The con- trol rule module coordinates the application of top down rules (from the spatial relation- ship guidance module) and bottom up rules (from the object description module), and also handle problems of conflict resolution and error correction [Mat87]. We find that this approach is not appropriate for interpreting Landsat images despite the fact that it has been successfully applied to outdoor scenes [Mat80, Oht85] and to aerial photographs [Sha86]. The major problems of this approach are listed below. 1. Loose relationships between image processing techniques and knowledge rules. The initial segmentation technique is applied to the entire image without using any domain knowledge. If the output of the initial segmentation technique is of poor quality, recovery from these errors is difficult even though complex knowledge rules can be applied. 2. The difiiculty in constructing the control module. The control module is the heart of the approach. It has to control the timing and priority of rules fi'om different modules, and handle difficult tasks such as focus attention and error correction [Naz84]. Testing the performance of the system is also difficult because a slightly different rule sequence may produce different interpretation results. 
Without a substantial understanding of the whole image background and the characteristics of the knowledge rules, it is very time consuming to design a realistic control module.

3. Lack of spatial guidance rules for Landsat images. Knowledge acquisition is a long standing problem in the design of expert systems [Kah85, McK87]. For a simple outdoor scene, it may be possible to build sufficient spatial guidance rules to describe the relationships among, for example, sky, trees, cars, roads, houses, etc. However, for a typical Landsat image of our study area, the available spatial guidance rules among different land cover types are insufficient and incomplete. This makes the top down interpretation of Landsat images impractical.

In the next section, we propose a knowledge based segmentation technique which is more appropriate for Landsat images.

[Figure 4-1: A diagram of a popular knowledge based segmentation approach: an initial segmentation feeds a bottom-up object description module, a spatial guidance module provides top-down knowledge, and a control module coordinates the two to produce the scene description.]

4.3. The Proposed Knowledge Based Segmentation

In order to develop a knowledge based segmentation technique appropriate for Landsat image processing, we propose the system shown in Figure 4-2. The whole segmentation process is divided into two parts: goal oriented segmentation followed by image oriented segmentation. In goal oriented segmentation, spectral and spatial knowledge about a given land cover type is integrated with the image segmentation techniques to extract targeted regions. In our applications, these regions include water areas, roads, vegetation areas, and nonvegetation areas.
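Goal oriented extraction of a kernel region can be sketched as seeded region growing: pixels that strongly satisfy a spectral predicate become seeds, and growth proceeds to neighbours that weakly satisfy it. The predicates, thresholds, and structure below are illustrative assumptions, not the dissertation's algorithms:

```python
from collections import deque

def grow_regions(img, seed_test, grow_test):
    """Seeded region growing on a 2-D grid of pixel values.

    seed_test(v): strict spectral test that reliably marks a pixel (a seed).
    grow_test(v): looser test allowing 4-connected neighbours to join.
    Returns a 0/1 mask of the grown regions.
    """
    h, w = len(img), len(img[0])
    mask = [[0] * w for _ in range(h)]
    q = deque((r, c) for r in range(h) for c in range(w)
              if seed_test(img[r][c]))
    for r, c in q:
        mask[r][c] = 1
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not mask[nr][nc] \
                    and grow_test(img[nr][nc]):
                mask[nr][nc] = 1
                q.append((nr, nc))
    return mask

# Toy example: dark "water seeds" (value < 20) grown into
# moderately dark neighbouring pixels (value < 40).
img = [[10, 25, 80],
       [15, 30, 90],
       [70, 85, 95]]
water = grow_regions(img, lambda v: v < 20, lambda v: v < 40)
```

The same skeleton applies to "road seeds" by swapping in a brightness-based seed test, which is how spectral knowledge is embedded in the detection step.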
These identified regions, which we call kernel information, are then combined with the domain knowledge to further identify land cover objects such as urban areas and bare land. In image oriented segmentation, a hierarchical segmentation is integrated with all the previously identified regions and the available spectral and spatial knowledge to further interpret the vegetation areas. In order to support such a segmentation procedure, the land cover types in the study area are organized in a hierarchical structure.

The image processing techniques and the domain knowledge are highly integrated in our method, as shown in Figure 4-3. The dotted block in Figure 4-3 represents the segmentation procedure of the goal oriented segmentation (see Figure 4-2) if clustering is not used; it represents the image oriented segmentation (see Figure 4-2) if clustering is used. The segmentation procedure consists of four steps: clustering, seed detection and region growing, region interpretation, and region adjustment. The spectral and spatial knowledge is used in three different ways in the segmentation procedure.

1. Embedded in the seed detection and region growing techniques for region detection. In detecting water regions and roads, the spectral knowledge is embedded in the seed detection and region growing procedure. For example, roads are spectrally bright objects in Landsat images, and this knowledge is used in detecting "road seeds". In detecting urban and bare land regions, both spectral and spatial knowledge is used in the seed detection and region growing procedure.

2. Represented as spectral rules for region interpretation. Spectral rules are used in interpreting the segmented regions. For example, if a region is formed in the hierarchical segmentation over vegetation areas, its spectral band values can be input to the spectral rules to determine whether or not it belongs to forest.
Spectral rules are also helpful in determining the number of clusters in the hierarchical segmentation procedure.

3. Spatial rules are used in region adjustment.

For a partially interpreted image, spatial rules are used for more complete region interpretation. A small vegetation area surrounded by urban areas is relabeled as an urban area. A tiny nonvegetation region (less than 5 pixels) surrounded by forest areas is merged with the forest areas.

Our goal is to develop a knowledge based segmentation technique that can segment and interpret any township-size subscene (256x256 pixels, about 5 by 5 miles) in a large Landsat scene (about 115 by 110 miles) of Michigan. The study areas and major land cover types in the area are shown in Figure 1-1 and Table 1-1. There are three major properties of the proposed segmentation method:

1. Use of kernel image information in the interpretation procedure.

The kernel information of an image is the information that can be reliably extracted from the image by using spectral knowledge and image processing techniques. The kernel information of an image includes identified roads, water regions, vegetation regions, nonvegetation regions, etc. We have developed algorithms to extract this kernel information to provide an initial interpretation of the image, and use this as a basis for further image segmentation and interpretation.

2. Automation of the spectral knowledge acquisition.

Knowledge acquisition for Landsat image interpretation is usually performed by experienced photointerpreters. This process takes a considerable amount of time, and reliable rules are sometimes difficult to obtain. We have automated the process of spectral rule generation by applying pattern recognition techniques to spectral land cover training data. The experimental results show that the automatically generated rules compare favorably with those generated by a photointerpreter in classification accuracy.
Further, more rules can be generated by the automatic rule generation procedure than those provided by a photointerpreter. The program generated rules are selected by observing 10 features of the spectral training data set in a clustering space. Some useful rules may be difficult for human experts to detect.

3. Use of hierarchy in satellite image segmentation.

A Landsat image consists of many regions that belong to different land cover types. These land cover types form a hierarchical structure. Interpreting Landsat images through this hierarchy can provide at least a partial understanding of some regions. For example, we may identify a region as a forest, but we may not be sure whether it is a coniferous forest or a deciduous forest.

[Figure 4-2 appears here: a flow diagram in which the input image first passes through goal oriented segmentation (kernel region extraction and nonvegetation object detection, guided by spectral rules) and then through image oriented segmentation (hierarchical segmentation guided by spatial rules), looping while more detailed segmentation is needed and finally producing the interpreted image.]

Figure 4-2: The proposed knowledge based segmentation procedure.

[Figure 4-3 appears here: the detailed segmentation procedure, chaining clustering, seed detection and region growing, region interpretation (using spectral rules), and region adjustment (using spatial rules and, optionally, prior map information) to produce the interpreted image.]

* : not used in current procedure
** : goal oriented segmentation does not use clustering
1 : only used in image oriented segmentation
2 : only used in goal oriented segmentation

Figure 4-3: Detailed segmentation procedure of Figure 4-2.

4.4. Kernel Image Information

A typical procedure that a human expert follows in interpreting an image is as follows. First identify some regions in the image that he/she is very confident about (based on prior knowledge from interpretation experience). Then, step by step, interpret the remaining regions in the image. The same strategy can be used in automatic image interpretation. Information that can be reliably extracted from an image itself will be called kernel image information. Kernel information can form an initial interpretation of the image which will be useful for more detailed interpretation. Note that the kernel image information concept is similar to the island driving strategy [Bar81] in natural language problem solving. One major difference is that in our method the kernel information is used to confirm the true land cover types of neighboring regions, instead of hypothesizing and then verifying them.

Landsat images consist of multiple bands of data. Spectral characteristics of various land cover types have been well studied and reported in the remote sensing literature. Some of this spectral knowledge can be applied to any given Landsat image to interpret some regions confidently. These interpreted regions that can be classified as kernel image information should satisfy the following three criteria:

1. The knowledge needed to recognize these regions is well understood and can be applied to different Landsat images.

2. The corresponding region extraction techniques should be easily implemented and reliable.

3. The identified regions appear frequently in Landsat images, and are helpful in interpreting the whole image.

Our kernel information consists of the water regions, vegetation regions, and roads, as will be described below.

4.4.1.
Water Regions

In Landsat TM images, water is the unique land cover type such that the band reflectance values always decrease as the band number increases, ignoring the thermal band (Band 6). Namely,

B1 > B2 > B3 > B4 > B5 > B7,

where Bi is the reflectance in the i-th band. This band decreasing property holds for all the test sites in different Landsat TM images of Michigan that we have examined. We use this band decreasing property to extract water regions from Landsat sub-images in our study area. Figure 4-4 shows a Landsat image of the city of Alpena, Alpena County, Michigan. The blue colored regions in Figure 4-5 show the water regions detected using the band decreasing property. Note that some boundary pixels of water regions (a pixel covers 30x30 meters of ground area) may consist of both water and non-water areas, and may not obey the band decreasing property. In the following, we improve the boundaries of detected water regions by combining the band decreasing property with the clustering technique developed above.

In Landsat spectral data, water pixels are well separated from other land cover types. We have clustered many Landsat images (all with 10 clusters) and found that water pixels do form a unique cluster in all the images studied. The number of clusters was selected through experiments. If we combine this unique cluster property with the kernel water information, then water regions can be reliably extracted from Landsat images. The Forgy clustering algorithm in the ERDAS software package [Erd86] is used in the process. The complete procedure to detect water regions in a Landsat image is listed below.

Water Detection Algorithm

Step 1. Initialize an array classcount with classcount[1] = ... = classcount[10] = 0.
Step 2. Cluster the Landsat image into 10 classes based on the spectral data. Call the output image A.
Step 3. Apply the water band decreasing property to the Landsat image to label the kernel water pixels. Call the output image B.
Step 4. For each kernel water pixel in image B, find its corresponding class number (say, k) in image A, and set classcount[k] := classcount[k] + 1.
Step 5. For i := 1 to 10, if classcount[i] >= 5 then assign class i to the water class.
Step 6. Mark all the pixels in image A that belong to the water class.

In Step 5, we assume that a water class should have at least 5 pixels within a 256x256 image; otherwise we claim that no water regions exist in the image. The green colored areas in Figure 4-5 show the water pixels detected by the Water Detection Algorithm but missed using only the band decreasing property. Figure 4-6 shows another Landsat image from the Grand Traverse area, Michigan. Figure 4-7 shows the water regions detected by the above algorithm. These detected water regions were verified to be accurate through a photointerpretation of aerial photographs of the same study area. However, no classification accuracy is provided because of the lack of a computer readable ground truth file. The scattered water pixels are usually very helpful in detecting forested wetland areas.

Figure 4-4: A Landsat TM image of Alpena, Michigan.

Figure 4-5: Water regions for Figure 4-4.

Figure 4-6: Landsat TM image in Grand Traverse County.

Figure 4-7: Water regions for Figure 4-6.

4.4.2. Vegetation Regions

Green vegetation usually has a high reflectance in the near infrared band (Band 4 in Landsat TM images) and a low reflectance in the red light band (Band 3 in TM images). This characteristic has often been used in the remote sensing literature to separate vegetation regions from nonvegetation regions [Per84]. The definition of the vegetation index is not unique [Jen86]. We define a vegetation index by the infrared/red band reflectance ratio. In our experiments, different land cover types had different ranges of variation in their vegetation index. This was helpful in obtaining an initial interpretation of the image.
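The Water Detection Algorithm of Section 4.4.1 can be sketched in a few lines; a minimal NumPy version, with a small hand-rolled Forgy-style k-means standing in for the ERDAS clustering step (the helper names and the toy k-means are illustrative assumptions, not the actual implementation used in this work):

```python
import numpy as np

def forgy_kmeans(X, k=10, iters=20, seed=0):
    """Toy Forgy-style k-means: k data points drawn at random as initial centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = ((X[:, None, :] - centers[None]) ** 2).sum(axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def detect_water(image, k=10, min_pixels=5):
    """image: H x W x 6 float array, bands ordered 1, 2, 3, 4, 5, 7 (no thermal band)."""
    h, w, nb = image.shape
    # Step 2: cluster all pixels on their spectral data -> image A.
    labels = forgy_kmeans(image.reshape(-1, nb), k).reshape(h, w)
    # Step 3: band decreasing property B1 > B2 > ... > B7 -> kernel water mask (image B).
    kernel = np.all(image[:, :, :-1] > image[:, :, 1:], axis=2)
    # Steps 4-5: clusters holding at least min_pixels kernel water pixels become water.
    counts = np.bincount(labels[kernel], minlength=k)
    # Step 6: mark every pixel belonging to a water cluster.
    return np.isin(labels, np.flatnonzero(counts >= min_pixels))
```

Because Step 3 keeps only high-confidence kernel pixels, the cluster vote in Steps 4-5 is what recovers the mixed boundary pixels that fail the strict band decreasing test.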
For example, the vegetation index is usually high for forest areas, with values for deciduous forests being higher than for coniferous forests. Urban areas and bare land have a low vegetation index, and it is very low for water regions. In summary, the vegetation index can be used as follows. Let

VI(i,j) = B4(i,j) / B3(i,j)

be the vegetation index at pixel (i,j). For a given pixel (i,j),

If VI(i,j) > 2.5, then the pixel is very likely to be vegetation.
If VI(i,j) <= 1.5, the pixel is very likely to be nonvegetation.
If 1.5 < VI(i,j) <= 2.5, it is unclear whether the pixel is vegetation or not.

Figure 4-8 shows a Landsat image of Fredrick Township, Crawford County. For the Landsat images shown in Figures 4-4 and 4-8, the corresponding vegetation index images are shown in Figures 4-9 and 4-10, respectively. The red colored areas in Figures 4-9 and 4-10 are vegetation, the blue areas are nonvegetation, and the green colored areas are uncertain regions.

4.4.3. Roads

Roads are common objects in Landsat TM images. A hierarchical road detection technique was proposed in Chapter 3. In a previous work [Ton87], we have shown that the road information can be useful in detecting oil/gas pads. In Section 4.6 we will show that the road information can be useful in distinguishing an urban area from barren land. Figures 4-11 and 4-12 show the detected roads in the Landsat images of Figures 4-4 and 4-8, respectively. Red, green, and blue pixels represent road strengths of strong, medium, and weak, respectively.

Figure 4-8: Landsat TM image of Fredrick Township, Crawford County.

Figure 4-9: The vegetation index image for Figure 4-4.

Figure 4-10: The vegetation index image for Figure 4-8.

Figure 4-11: The road image for Figure 4-4.

Figure 4-12: The road image for Figure 4-8.

4.5. Spectral Expert System for Landsat Image Classification

Spectral knowledge plays a very important role in multi-band Landsat image interpretation.
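Before describing the expert system, note that the simplest piece of such spectral knowledge, the vegetation-index test of Section 4.4.2, reduces to a few lines of code; a minimal sketch using the thresholds 1.5 and 2.5 given there (function names are illustrative):

```python
import numpy as np

def vegetation_index(b4, b3, eps=1e-6):
    """VI = near-infrared band / red band; eps guards zero-reflectance pixels."""
    return b4 / (b3 + eps)

def vegetation_label(vi):
    """0 = nonvegetation (VI <= 1.5), 1 = uncertain (1.5 < VI <= 2.5),
    2 = vegetation (VI > 2.5)."""
    return np.digitize(vi, [1.5, 2.5], right=True)
```

For the Deciduous row of Table 4-1 this gives VI = 165.4/24.4, about 6.8 (vegetation), while the Water row gives 11.9/17.3, about 0.7 (nonvegetation).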
Section 4.5.1 reviews the available techniques for using spectral data in Landsat image classification and points out their problems. Section 4.5.2 briefly describes our method of generating a spectral expert system for Landsat image classification. Section 4.5.3 provides the hierarchical structure for major land cover types in the study area and introduces a human generated spectral expert system. Section 4.5.4 provides the algorithm to automate the process of spectral rule generation, and compares the spectral rules generated by a human expert and by our computer program.

4.5.1. Landsat Image Classification

The most frequently used Landsat image classification techniques are discussed below. Unsupervised classification techniques are not included because they do not use any knowledge in the process.

4.5.1.1. Supervised Classification

A traditional and widely used technique for recognizing land cover types in Landsat images is supervised classification [Tai86]. In this method, training spectral data for the land cover types of interest are first collected from the Landsat image, and then a maximum likelihood classification technique [Dud73, Dev82] is applied to the whole image to assign each pixel to its appropriate land cover type. This method has limited applications because land cover training data must be collected for each image to be classified. In other words, the training data of land cover types collected from one image cannot usually be applied to other images. This limitation is caused by the following two facts: 1) major land cover types that appear in one image may not appear in other images, and 2) spectral variations due to atmospheric or illumination conditions may be different for different images.

4.5.1.2. Band Ratio Features

A good way of alleviating the image dependency limitation typically encountered in supervised classification is to use band ratios as well as individual band data in land cover recognition.
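Such ratio features are cheap to derive from the raw bands; a minimal sketch (the particular set of ratios shown is illustrative, chosen to include the vegetation index, and is not necessarily the exact set used in any cited study):

```python
def band_ratio_features(bands):
    """bands: mean reflectances keyed by TM band number (1, 2, 3, 4, 5, 7).
    Returns the six raw band values plus four illustrative band ratios.
    Ratios largely cancel overall illumination differences, which is why
    they transfer between images better than raw band values do."""
    ratios = [
        bands[4] / bands[3],  # near-infrared / red: the vegetation index
        bands[5] / bands[4],  # mid-infrared / near-infrared (illustrative choice)
        bands[4] / bands[2],
        bands[4] / bands[1],
    ]
    return [bands[b] for b in (1, 2, 3, 4, 5, 7)] + ratios
```

For the Red Pine row of Table 4-1, the first ratio is 81.9/26.2, about 3.1, which is exactly the kind of "band 4/band 3 > 3" evidence the expert system of Section 4.5.3 relies on.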
Several studies have shown the usefulness of band ratio features in correcting for detector striping and compensating for atmospheric scattering. Wharton [Wha86] used band ratio features in the design of a spectral expert system which recognizes land cover types in Landsat Thematic Mapper Simulator (TMS) data of the Washington, D.C. area. In this method each land cover type is described in terms of three types of features: band to band, cover type to background, and cover type to cover type. Unfortunately, his method is designed for small image areas and very restricted land cover types, so it is not clear whether it is a general solution to the image dependency limitation.

4.5.1.3. Signature Extension

A signature is a representative or prototype feature vector associated with a land use category. Signature extension allows us to extend the use of spectral information in Landsat image interpretation. This method organizes the spectral information (signatures) obtained from a training data set and then applies this information to recognize a spatially or temporally removed data set. If this can be successfully accomplished, then one set of ground information can be used in the processing of several other images. Henderson [Hen76] uses this technique in crop land recognition. His method first applies clustering to the Landsat images of two areas, and then finds a one to one cluster matching relationship between these areas. Finally, he applies linear regression to the paired clusters to find the best linear equations for transforming the band intensity values from one area to the other. However, signature extension by linear regression has been criticized by researchers because it "naively oversimplifies the complexity of the spectral representation problem" [Swa78]. The major argument is that since many factors (sun angle, atmosphere, etc.)
will influence the spectral data of each Landsat image, a linear relationship may not exist between the spectral information in two images.

There are two major problems in using spectral data for satellite image classification: i) image dependency, and ii) system extension capability. Both band ratio and signature extension techniques target the first problem by extending the spectral land cover information from one image to other images. However, so far only limited progress has been achieved. We attack the first problem by developing a spectral expert system that can be used in large Landsat scenes, and the second problem by developing an algorithm to automate the process of spectral rule generation. Our algorithm can be applied to different geographic areas.

4.5.2. Overview of the Proposed Spectral Expert System

The input to our system is a set of spectral data of major land cover types in a test site (about a township in size, 9.6 km x 9.6 km). The test site is randomly selected from a large Landsat scene. The data for one test site is shown in Table 4-1. The desired output is the appropriate land cover labeling for each test site. To overcome the image dependency limitation, we utilize the following strategies:

(i) Use band ratios as well as individual band values.
(ii) Classification is based on regions rather than individual pixels.
(iii) Training data is selected from large areas so that it is representative of the land use categories of interest.
(iv) Use a hierarchical structure in land cover classification.

To make the system easily applicable to a variety of geographical areas, we have automated the process of rule generation from training data (Section 4.5.4). When the whole system is applied to a totally different environment, major land cover types and land cover training data have to be collected manually by experienced photointerpreters, but the spectral rules can be automatically generated from the training data sets.

4.5.2.1.
Study Areas, Landsat Images and Major Land Cover Types

Our study areas are in the northern lower peninsula and eastern upper peninsula of Michigan, and cover about 200x200 square miles. The major cover types in the study areas are listed in Table 1-1. The available Landsat scenes are shown in Figure 1-1.

4.5.2.2. Processing Steps

Step 1. Sample spectral data of land cover types from various areas and Landsat scenes.

Training data of various land cover types were collected from eight test sites and two Landsat scenes. Each test site contains 8 to 12 land cover types. All the Landsat images used here were taken in the same season (summer) but in different years (1984 and 1986). Table 4-2 shows the collected training spectral data of various cover types from one training site.

Step 2. Build a hierarchical spectral recognition system to classify a region based on its spectral data.

We first construct a hierarchical structure for major land cover types, then generate spectral rules to classify various land cover types by combining spectral knowledge from the remote sensing literature with spectral training data of various land cover types.

Step 3. Evaluate the performance of the developed system.

Test data is selected from Landsat images not used in Step 1. Locations of the two test sites are marked with an asterisk in Figure 1-1, while Table 4-1 shows the spectral test data from Grand Traverse County. The test data is used for evaluating the performance of the hierarchical spectral recognition system built in Step 2.

Table 4-1: Spectral data of Grand Traverse County.
Land Cover Type                  Band 1   Band 2   Band 3   Band 4   Band 5   Band 7
Urban Residential                 122.3     53.5     86.7     95.4    132.4     82.8
Shrub                              80.4     35.3     31.4    115.7     93.2     34.4
Deciduous                          79.5     31.2     24.4    165.4     95.9     25.7
Red Pine                           79.5     28.3     26.2     81.9     53.3     17.2
Jack Pine                          76.9     30.9     29.8     73.1     76.2     28.1
Agriculture (new)                  81.6     32.6     25.2    171.5     99.7     26.9
Water                              74.4     22.4     17.3     11.9      6.7      2.8
Grassland                          94.1     40.5     45.7    100.2    135.3     54.6
Clearcut (old)                     93.4     39.7     47.4     84.2    134.3     56.2
Bare land                         119.4     55.2     77.6     90.5    173.2    110.4

Table 4-2: Spectral data of Grayling area.

Land Cover Type                  Band 1   Band 2   Band 3   Band 4   Band 5   Band 7
Urban Residential                 141.3     66.5     86.5     92.2    137.1     78.4
Shrub                              80.5     32.2     29.3    113.2     96.4     31.6
Deciduous (Northern Hardwood)      75.1     31.4     24.5    166.1     97.0     26.2
Deciduous (Aspen Birch)            75.3     30.4     22.8    139.2     81.1     22.0
Red Pine                           77.2     27.9     26.3     72.1     52.8     17.9
Jack Pine                          76.9     28.2     26.1     77.8     69.2     23.3
Wetland Conifer                    76.2     30.6     25.2     79.7     59.5     19.1
Water                              81.2     36.6     33.2     18.4     13.1      5.6
Wetland Grass                      76.3     30.2     25.9    117.4     86.3     24.8

4.5.3. Spectral Expert System for Classifying Hierarchical Land Cover Types

Hierarchical classification is a powerful approach to solving classificatory problems [Cha86]. It attacks a classificatory problem by dividing it into a hierarchical tree structure, with each node associated with a subset of knowledge or specialty. The problem solving for this task is performed through a top down process. Hierarchical classification has been successfully applied in the medical domain [Sti85, Cha83]. User friendly development tools for hierarchical classification systems are available [Byl85]. Since land cover types and their associated knowledge fit quite well into a hierarchical structure, we use the hierarchical classification approach in constructing the spectral expert system by a human expert. Section 4.5.4 then provides algorithms to automate the process of spectral rule generation.

4.5.3.1.
Decomposing Land Cover Types into a Hierarchical Structure

Land cover types in Michigan have previously been decomposed into a hierarchical structure [Ens83]. But this structure was designed for the interpretation of aerial photographs, which have much higher resolution than Landsat TM images. In order to overcome the difficulty that certain land cover types are spectrally indistinguishable in Landsat images, we have to modify the hierarchical decomposition in [Ens83] based on the following two principles:

1. Only major land cover types in northern Michigan are considered.

2. Most land cover types which cannot be distinguished in Landsat images are deleted or merged to form a single category.

The final hierarchical structure, consisting of 4 levels and 17 nodes, is shown in Figure 4-13. There are three main reasons to build a spectral land cover recognition system based on hierarchical classification:

1. Natural land cover types have an implicit hierarchical structure.

Land cover types can be naturally classified as vegetation and nonvegetation based on whether green vegetation exists on the surface of a region or not.

[Figure 4-13 appears here: the land cover hierarchy. LAND COVER TYPE splits into VEGETATION and NONVEGETATION; VEGETATION splits into NONFOREST (AGRICULTURE, GRASS/RANGELAND, SHRUB) and FOREST (CONIFER with RED PINE and JACK PINE, DECIDUOUS, CLEARCUT); NONVEGETATION splits into URBAN, BARE LAND, and WATER.]

Figure 4-13: Hierarchical structure of land cover types.

For example, urban areas and water regions contain no vegetation, while forest and grassland areas are covered by green vegetation. Furthermore, each land cover type can be naturally decomposed into more detailed categories. For example, forest can be decomposed into coniferous and deciduous classes. Coniferous forests consist of trees which have needle like leaves, while deciduous forests consist of trees which have broad leaves.

2. A hierarchical classification system can partially classify a given region.

Landsat images have limitations in spectrally classifying certain land cover types.
For example, northern hardwood and aspen birch are spectrally indistinguishable in Landsat spectral data, but both of them belong to the deciduous forest category. Using a hierarchical classification system can at least identify that an aspen birch region belongs to the forest category, and further that it belongs to a deciduous forest, even though the system cannot correctly identify it as aspen birch.

3. Computational advantages of hierarchical classification.

Hierarchical classification techniques [Cha86, Sti85] allow an efficient examination of the stored knowledge of the system based on need. The computational advantages of hierarchical classification come from two types of pruning: hierarchical pruning and knowledge group pruning. In hierarchical pruning, if a land cover type is ruled out (not established), then none of its sub-land cover types need be examined. In knowledge group pruning, the knowledge rules for a given land cover type are listed in priority order, so that only the minimum subset of knowledge rules is examined to confirm or deny whether given spectral data belong to the land cover type.

4.5.3.2. Constructing the Spectral Expert System

Once the hierarchical land cover structure is set up, we have to generate knowledge rules for each land cover type. We used human expertise to combine land cover training data (including spectral data and its true land cover type) with existing land cover knowledge (from the remote sensing literature) to design the knowledge rules for each land cover type. A trial and error method is used to test and modify the knowledge rules until the system can correctly classify the training land cover spectral data. Three types of information are considered in constructing the knowledge rules for each land cover type in the hierarchical structure:

1. Domain spectral knowledge.

Spectral knowledge can be used to construct the hierarchical structure of land cover categories.
For example, land cover types belonging to the vegetation category have medium spectral reflectance in the red light band (band 3 in Landsat TM data) but very high reflectance in the near-infrared band (band 4). The vegetation index, defined as the ratio of band 4 and band 3 reflectances, can effectively separate all the land cover types into two categories: vegetation and nonvegetation. Water has the unique band decreasing property among all known land cover categories. This domain knowledge was obtained from the remote sensing literature [Lus89, Lil79, Swa78, Ens83, Wha86].

2. Spectral classification rules based on training spectral data for land cover types.

The remote sensing literature can provide general spectral properties of various land cover types, but not specific classification rules. Since we are using Landsat images to classify various land cover types, general spectral knowledge has to be transformed into more specific quantitative classification rules. For example, we know that the vegetation index will be high for the vegetation category. But how high is "high"? Based on the collected training data we have concluded that a vegetation index greater than 2.5 is very high, and one lower than 1.5 is very low. Only with such specific quantitative rules can we confidently classify land cover types into appropriate categories.

3. Non-spectral knowledge.

Spectral knowledge alone is not enough to classify all the land cover types. Therefore, some information extracted from experienced photointerpreters is also used as knowledge rules. For example, agriculture or clearcut regions are usually rectangular shaped, urban areas are close to major roads, etc.

4.5.3.3. Several Examples of Classification Hypotheses

In the following, we show several examples of how the human generated spectral expert system detects a land cover type.

Example 1. Red Pine Identification

A test datum is taken from a region in Grand Traverse County.
The true land cover type of the region is Red Pine. Its average band reflectance values are given below.

Band 1   Band 2   Band 3   Band 4   Band 5   Band 7
 78.5     28.3     26.2     85.6     63.3     17.2

The vegetation index for this region is 3.27. The system arrives at its conclusion based on the following reasoning.

(i) The region belongs to the vegetation category, because the band ratio of the near infrared band (band 4) to the red light band (band 3) is high (> 3), which implies that there is a large amount of green vegetation in the region.

(ii) The region does not belong to the agriculture category because of the low reflectance in both the water absorption band (band 5) and the blue band (band 1), which implies no evidence of agriculture in the region.

(iii) The region belongs to the forest category because of the high value of the vegetation index and the low reflectance of the blue band.

(iv) The region belongs to coniferous forest based on the evidence that the reflectance in the near infrared band is much higher than in the red light band (band 4/band 3 > 3) but not much higher than in the water absorption band (band 4/band 5 < 1.5), which is the spectral property of the needle-like leaves of trees belonging to a coniferous forest.

(v) Finally, the system confirms that the region is red pine and not jack pine because the band 4 reflectance is much greater than the band 7 reflectance (band 4/band 7 > 3.5), which is a property of red pine land cover.

Example 2. Urban Identification

A test datum is taken from a region in Montmorency County. The true land cover type of the region is Urban Residential. Its average band reflectance values are given below.

Band 1   Band 2   Band 3   Band 4   Band 5   Band 7
126.1     58.5     74.6     86.3    120.9     66.7

The system arrives at its conclusion based on the following reasoning.

(i) The vegetation index is low and the visible blue band reflectance is very high, which shows that the region is not composed of green vegetation.
Therefore, the region is classified into the nonvegetation category.

(ii) The region is confirmed to be in the urban category because the reflectance of band 5 is larger than that of band 4. In addition, the visible bands have high reflectance.

(iii) At this stage, the spectral data of the region is insufficient to decide whether the region belongs to the urban residential category or the bare land category. So, the system needs nonspectral information, such as whether major roads exist in the neighborhood and whether the texture of the region is rough (urban residential can be a mixture of houses, trees, parking lots, etc.), to confirm that the region belongs to the urban residential category.

Example 3. Shrub Identification

A test datum is taken from a region in Leelanau County. The true land cover type of the region is shrub. Its average band reflectance values are given below.

Band 1   Band 2   Band 3   Band 4   Band 5   Band 7
 84.3     33.9     32.4    102.1     87.7     28.3

The vegetation index of this region is 3.19. The system arrives at its conclusion based on the following reasoning.

(i) The region is vegetation because the vegetation index is high (band 4/band 3 > 3).

(ii) The region is rejected as belonging to the agriculture category because this category usually has a much higher vegetation index and much brighter visible band values.

(iii) The region is rejected as being in the forest category because the region has a higher blue band reflectance than forest areas (forest trees are taller than shrubs, so their visible blue band should have a lower value).

(iv) Finally, the region is classified as shrub and not grassland based on the evidence that its vegetation index is high (grassland usually has a lower vegetation index) and its water absorption band is not high (grassland contains less moisture than shrub, so its water absorption band should have higher reflectance).

4.5.3.4.
Experimental Results

Landsat spectral data of ten land cover types from two test sites, Grand Traverse County and Montmorency County, were used for the performance evaluation of the designed Landsat spectral recognition system. Note that no training data were collected from these two sites. Experimental results show that 15 out of 20 land cover types were correctly identified, while another two were partially correctly classified. This classification accuracy is satisfactory considering that the training and test spectral data were collected from different satellite images and over a 200x200 square mile area. Most of the classification errors came from confusion between cropland, grassland, and urban areas, which is mainly caused by the spectral limitations of Landsat 4 TM images.

4.5.4. Automatic Rule Generation from Training Data

This section introduces a technique to automate the rule generation procedure for classifying different land cover types. Inputs include a hierarchical land cover structure (Figure 4-13) and training data of various land cover types collected from different Landsat images. The output is a set of spectral rules to recognize these land cover types. The number of training data sets used in this study was not large (about 60). Each training data set consists of 10 features along with the true land cover type. Six of the ten features correspond to the six bands of the Landsat TM images (B1, B2, B3, B4, B5, B7), and the other four features are band ratios (B4/B3, B5/B3, B4/B2, and B4/B1). The data was collected from eight different geographical areas taken from two Landsat scenes, with each area (about a township in size) containing eight to twelve different land cover types. The major reason that we did not choose a larger amount of training data is that selecting representative training data and identifying its true land cover type is a time consuming job, even for an experienced photointerpreter.
Our strategy is to automate the rule generation procedure by using a clustering technique. The training data belonging to a single land cover type are likely to form a cluster in the spectral feature space. These clusters are used to construct spectral rules to distinguish the land cover types. Jain and Hoffman [Jai88a] have proposed a similar automatic rule generation technique for three dimensional (3-D) object recognition. Their method used the minimum spanning tree to form clusters of feature vectors that belong to the same object. Recognition rules for the objects can then be generated based on the feature values associated with the clusters. Their method showed satisfactory classification accuracy on a set of 10 3-D objects. In the following, we first present the complete algorithm for rule generation, then give an illustrative example which generates spectral rules to distinguish forest from nonforest categories.

Spectral Rule Generation Algorithm

Begin
Step 1. Select the target cover type, the non-target cover types, and the training data set X.
Step 2. For each candidate feature(s), do Steps 3 to 6.
Step 3. Use the (k,l)-NNR (Nearest Neighbor Rule) to find all the feature vectors in X such that the majority of their nearest neighbors belong to the target cover type. Call this the set S.
Step 4. If the omission error (feature vectors that belong to the target cover type but are not in S) and the commission error (feature vectors in S that do not belong to the target cover type) are above the threshold 1 - t1, go to End.
Step 5. Calculate the mean and the standard deviation of each selected feature over S.
Step 6. Classify the training set X using the following candidate rule: if, for each selected p-th feature, mean(Sp) - sd(Sp) <= xp <= mean(Sp) + sd(Sp), then classify x as the target cover type. If the classification accuracy exceeds t2, then the rule is accepted.
End
where mean(Sp) and sd(Sp) represent the mean and standard deviation of the p-th feature selected in Step 5, and xp represents the p-th feature value of data x, x in X.

The first step in the spectral rule generation algorithm consists of preprocessing, which determines the target land cover type, the other land cover types in the same hierarchy (non-target land cover types), and all the training data in the same hierarchy. Step 2 selects the candidate band(s) used in rule generation. Theoretically, each rule can involve as many as 10 features (6 bands and 4 band ratios). However, in our applications we find that three features are sufficient for generating spectral rules. Too many features in a rule may not improve the classification accuracy, and the computation time increases significantly as the number of bands increases [Jai82].

Step 3 determines whether the feature vectors belonging to the target land cover type are neighbors of each other. The (k,l)-NNR is defined as follows: for each training vector, if at least l out of its k nearest neighbors belong to the target land cover type, then assign it to the target land cover type (set S). A more detailed discussion of the (k,l)-NNR is available in [Dev82]. We set k = 5 and l = 4 in our algorithm. The Euclidean metric is used to compute nearest neighbors. In the ideal case, all the data in set S should belong to the target cover type, and any data not in set S should not belong to the target cover type. If a high classification accuracy (> t1) is obtained, then a candidate rule for the target cover type can be generated in Step 5. We calculate the mean and the standard deviation of each candidate feature over the training data in the set S. In Step 6, we use a one standard deviation tolerance (above or below the mean of each candidate feature) generated in Step 5 as a rule to separate the target cover type from the non-target cover types.
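Steps 3 through 6 can be sketched as follows, with k = 5 and l = 4 as in the text. This is a hedged illustration using numpy; the function names and the synthetic data layout are ours, not from the dissertation:

```python
import numpy as np

def knn_core(X, labels, target, k=5, l=4):
    """Step 3: keep training vectors for which at least l of the k
    nearest neighbors (Euclidean metric) have the target label."""
    X = np.asarray(X, dtype=float)
    S = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude the point itself
        nn = np.argsort(d)[:k]
        if sum(labels[j] == target for j in nn) >= l:
            S.append(i)
    return S

def candidate_rule(X, S):
    """Step 5: per-feature mean and standard deviation over the set S."""
    XS = np.asarray(X, dtype=float)[S]
    return XS.mean(axis=0), XS.std(axis=0)

def rule_accuracy(X, labels, target, mean, sd):
    """Step 6: accept x as target iff every feature lies in mean +/- sd;
    return the fraction of training vectors classified correctly."""
    X = np.asarray(X, dtype=float)
    inside = np.all(np.abs(X - mean) <= sd, axis=1)
    return np.mean([(labels[i] == target) == inside[i] for i in range(len(X))])
```

On a toy set with a tight target cluster and a distant non-target cluster, the one-standard-deviation rule separates the two classes; in the dissertation the loop over candidate features and the t1/t2 threshold checks wrap around these three functions.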
If the classification accuracy over the training data set X is higher than a threshold t2, then the rule is accepted.

The two thresholds t1 and t2 are adjustable in the algorithm. The threshold t1 represents the percentage of training data of the target land cover type that can form a cluster based on the nearest neighbor criterion. A value of t1 = 100% means that all the target training data must form a cluster. We initialize t1 to 100%, and if no rules can be generated, then t1 is successively decreased by 10%. The algorithm is stopped if t1 <= 40%. The threshold t2 is used as a measure of the classification accuracy of a generated rule. For a given rule, t2 = 1 means the rule can separate the target cover type from the non-target cover types with 100% accuracy based on all the available training data.

Table 4-3 shows the rules generated for the forest category. These rules were obtained from 26 training vectors belonging to the forest cover type and 23 training vectors belonging to the nonforest cover type. In this example, five rules were generated. Forest rules are used to distinguish forest from other nonforest vegetation, such as agriculture and rangeland. Rule 1 identifies 24 out of the 26 true forest areas and mislabels 2 out of the 23 nonforest areas. Rule 2 identifies 22 out of the 26 true forest areas and mislabels none of the nonforest areas.

In order to compare the program generated forest rules with the human generated forest rules, the forest rules constructed by a photointerpreter are also given below.

If B4/B3 >= 3 and B1 < 80, then it is forest with very strong confidence.
If 2 <= B4/B3 < 3 and B1 < 80, then it is likely to be forest.

These rules, generated by a photointerpreter, identify 25 out of the 26 true forest areas and mislabel 3 out of the 23 nonforest areas, which is about the same classification accuracy as given by the program generated rules. Note that the program can generate multiple rules with similar classification accuracy.
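The photointerpreter's two forest rules quoted above translate directly into a predicate. This is a sketch; the returned confidence strings are our labels, not wording from the dissertation:

```python
def forest_rule(b1, b3, b4):
    """Hand-crafted forest rule from the text: a high vegetation index
    (B4/B3) together with low blue-band (B1) reflectance suggests forest."""
    vi = b4 / b3
    if vi >= 3 and b1 < 80:
        return "forest (very strong confidence)"
    if 2 <= vi < 3 and b1 < 80:
        return "likely forest"
    return "not forest"
```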
Knowledge rules to distinguish shrub from grass are not easy to obtain. It is known [Ens83] that shrub areas should have higher visible band reflectance and higher near-infrared band reflectance. However, the human generated shrub rules correctly identify only 5 out of the 9 shrub areas, while mislabeling none of the 8 non-shrub areas. Table 4-4 shows the program generated shrub rules. The first rule correctly identifies 6 out of the 9 shrub areas without false alarms. The second rule correctly identifies 7 out of the 9 shrub areas and mislabels 2 out of the 8 nonshrub areas. Some of the program generated rules might be difficult for a human expert to interpret, although these rules can provide satisfactory classification accuracy.

Table 4-3: Automatically generated rules for the forest category.

                                     Classification Accuracy
 Rule   Feature    Mean     S.d.    Target cover type    Non-target cover type
 No.                                Correct   Total      False    Total
  1     B1         77.00     2.30     24       26          2       23
  2     B2         28.35     1.75     22       26          0       23
  3     B3         24.50     1.60     24       26          2       23
  4     B5         53.40    15.50     18       26          0       23
  5     B7         17.10     6.10     20       26          0       23

Table 4-4: Automatically generated rules for the shrub category.

                                     Classification Accuracy
 Rule   Feature    Mean     S.d.    Target cover type    Non-target cover type
 No.                                Correct   Total      False    Total
  1     B4        125.40    23.20      6        9          0        8
  2     B7         28.75     5.45      7        9          2        8
  3     B4/B3     110.30     6.26      6        9          0        8
  4     B4/B2     153.46    12.44      5        9          0        8

The Appendix shows the program generated rules for various land cover types. Each land cover type may involve several rules. The success ratio of a rule is defined as the fraction of correctly identified training vectors for the associated cover type. For example, rule 1 in Table 4-3 identifies 24 out of the 26 forest areas, so its associated success ratio is 24/26 = 0.923 (see the first forest rule in the Appendix). The false alarm ratio of a rule is defined as the fraction of training vectors which are incorrectly assigned to the associated cover type. The positive confidence of a rule is calculated as success ratio x (1 - false alarm ratio).
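The success ratio, false alarm ratio, and positive confidence defined here can be computed as in the following small sketch; the function name is ours:

```python
def rule_quality(correct, target_total, false_alarms, nontarget_total):
    """Success ratio, false alarm ratio, and positive confidence of a
    rule, following the definitions in the text."""
    success = correct / target_total
    false_ratio = false_alarms / nontarget_total
    return success, false_ratio, success * (1 - false_ratio)

# Rule 1 of Table 4-3: 24/26 forest areas found, 2/23 nonforest mislabeled
s, f, pc = rule_quality(24, 26, 2, 23)
```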
The value of positive confidence ranges from 0 to 1. The higher the value, the better the quality of the rule. A positive confidence of 1 means that the rule correctly identifies all the training data belonging to the associated cover type and rejects all the training data which do not belong to that cover type. As we have seen, several rules are generated for each land cover type. In calculating the positive confidence of a test datum which is assigned to a land cover type, all the rules of that land cover type are examined and the maximum positive confidence value is chosen.

4.6. Knowledge Based Segmentation of Landsat Images

Landsat images usually contain many regions with ill defined boundaries, and the spectral and spatial knowledge used by experienced photointerpreters is difficult to code into rules. These factors make the traditional top-down and bottom-up techniques [Oht85] difficult to apply. For example, we used the spatial clustering method (a bottom-up method) to segment the Alpena image (Figure 4-4) into homogeneous regions (see Fig. 4-14). However, many areas of the image (white areas in Figure 4-14) are too complex to form regions. Many segmented regions (pink colored regions, whose ground truth is urban area) are too spectrally mixed to be interpreted using spectral rules alone.

Figure 4-14: The segmented regions by spatial clustering.

Our proposed segmentation technique was presented in Figure 4-2. At first, a Landsat image is divided into two types of regions: vegetation and nonvegetation. The nonvegetation areas are further interpreted by object oriented segmentation. The vegetation areas are interpreted through a hierarchical segmentation procedure. Each type of region is further decomposed into smaller regions for detailed interpretation. This procedure can be recursively applied until regions cannot be further subdivided.
All the available knowledge, such as kernel information, spectral rules, and human generated spatial rules, is used in the segmentation procedure. The advantages of using hierarchical segmentation are listed below.

1. Natural land cover types have an implicit hierarchical structure. This property was discussed in Section 4.5.3.

2. Easy extraction of contextual information. Contextual information is very useful in image interpretation. For example, a given small region with the spectral signature of grassland can be identified as a small island if surrounded by water, can be a part of an urban area if surrounded by nonvegetation areas and roads, and can be part of a forest if surrounded by forest. This contextual information is gradually constructed in the hierarchical segmentation procedure, while in a traditional segmentation technique much contextual information may be missing when the background areas are mixed regions.

3. More consistent regions can be formed in the segmentation procedure. Choosing the number of clusters is always a problem in a segmentation procedure. Since we use a hierarchical approach in the segmentation process, we use only two or three clusters for a given segmentation. This strategy can usually form larger consistent regions, which makes the segmentation and interpretation easier. Each segmented region is interpreted based on its spectral data, and more detailed segmentation and interpretation can be performed for each segmented region.

In the following, we list the proposed segmentation algorithm, and then present examples of its operation.

Knowledge Based Segmentation Algorithm

Step 1. Use the vegetation index to classify all image pixels into three types: vegetation, nonvegetation, and uncertain. The uncertain pixels will be used in both the vegetation and the nonvegetation interpretation.
Step 2. Decompose nonvegetation areas.
    Step 2.1. Overlay the obtained kernel information with the nonvegetation pixels.
    Step 2.2. Use contextual information and spectral rules to identify urban and bare land.
Step 3. Decompose vegetation areas.
    Step 3.1. Overlay the obtained kernel information with the vegetation pixels.
    Step 3.2. Use spatial clustering to cluster the pixels in vegetation areas.
    Step 3.3. Use seed detection and region growing techniques to obtain consistent regions.
    Step 3.4. Use spectral rules and kernel information to identify these regions.
    Step 3.5. Steps 3.2 to 3.4 can be recursively applied to identify regions with more detailed land cover types.

The spectral knowledge used in the segmentation procedure is discussed in Section 4.5. Some of the spatial knowledge is merged into the seed detection and region growing modules of the segmentation algorithm. The following spatial rules are used in our segmentation procedure.

1. If a nonvegetation region contains roads and trees, then it is likely to be an urban area; otherwise it may be bare land.
2. If a small vegetation region is surrounded by urban area, then the region is labeled as urban area.
3. An oil/gas pad usually has a rectangular shape whose size is between 8 and 40 pixels.
4. A small region (<= 5 pixels) is merged with the largest neighboring region.

For the Landsat image of Figure 4-4, the result of Step 1 of the segmentation algorithm is shown in Figure 4-9. The blue color in Figure 4-9 shows the nonvegetation areas while the green color shows the potential vegetation areas. We overlay these nonvegetation areas with the kernel information of water regions and roads, which results in Figure 4-15. The colors blue, green, red, and yellow represent the water regions, roads, potential vegetation regions, and nonvegetation/nonwater regions, respectively. Nonvegetation areas can be urban, water, or bare land. Water regions can be reliably detected using the method discussed in Section 4.4.
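Step 1 of the segmentation algorithm can be sketched with numpy as below. The upper threshold follows the B4/B3 > 3 vegetation test used elsewhere in the chapter; the lower threshold for nonvegetation is our assumption, since the text only states that uncertain pixels take part in both interpretations:

```python
import numpy as np

def classify_vegetation(band3, band4, veg_t=3.0, nonveg_t=2.0):
    """Label each pixel by its vegetation index B4/B3 as vegetation,
    nonvegetation, or uncertain (threshold values are illustrative)."""
    vi = band4.astype(float) / np.maximum(band3.astype(float), 1.0)
    labels = np.full(vi.shape, "uncertain", dtype=object)
    labels[vi >= veg_t] = "vegetation"
    labels[vi < nonveg_t] = "nonvegetation"
    return labels
```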
Both urban and bare land may have high visible band reflectances, which are difficult to distinguish using only spectral data. However, an experienced photointerpreter uses the following knowledge rule to separate urban areas from bare land: an urban area is mixed with trees, roads, and nonvegetation objects, while bare land consists of only nonvegetation objects. Based on the extracted kernel information shown in Figure 4-15, we integrate the above knowledge rule into the seed detection and region growing techniques to identify urban and bare land regions.

The seeds of urban regions are detected by the following procedure. For any pixel in the image, check its 13 x 13 neighborhood. If

1. over 30% of the pixels are labeled as roads (green color in Figure 4-15), and
2. over 20% of the pixels are potential vegetation pixels (red color in Figure 4-15), and
3. the total nonvegetation pixels are over 95% (all colors except white in Figure 4-15),

then assign the pixel as a seed of an urban area.

The minimum percentages of pixels used in the seed detection procedure, such as 30% road pixels and 20% potential vegetation pixels, were selected empirically. Increasing these percentages will produce fewer seeds for urban regions. We found that adjusting these percentages within a tolerance (plus or minus 10%) only slightly affects the detected urban results.

The potential urban pixels are detected by the following procedure. For any pixel in the image, check its 7 x 7 neighborhood. If

1. over 30% of the pixels are roads or potential vegetation, and
2. the total nonvegetation pixels are over 60% (all colors except white in Figure 4-15), and
3. the pixel is not a water pixel (blue color),

then the pixel is assigned as a potential urban pixel.

Note that a pixel is called the seed of an urban area if its neighborhood (13x13 pixels, about 390x390 square meters) presents sufficient evidence of roads, potential vegetation areas, and nonvegetation areas.
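The urban seed test can be sketched as a window scan over boolean masks. This is an illustration: the mask names are ours, and `labeled` stands for "all colors except white":

```python
import numpy as np

def urban_seed_mask(road, veg, labeled, win=13):
    """Mark pixels whose win x win neighborhood has >30% road pixels,
    >20% potential-vegetation pixels, and >95% labeled (non-white)
    pixels, following the seed test in the text."""
    h = win // 2
    rows, cols = road.shape
    seeds = np.zeros(road.shape, dtype=bool)
    n = win * win
    for r in range(h, rows - h):
        for c in range(h, cols - h):
            w = (slice(r - h, r + h + 1), slice(c - h, c + h + 1))
            if (road[w].sum() > 0.30 * n and
                    veg[w].sum() > 0.20 * n and
                    labeled[w].sum() > 0.95 * n):
                seeds[r, c] = True
    return seeds
```

The bare land seed test and both potential-pixel tests differ only in the window size and the percentage thresholds.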
A comparatively smaller neighborhood (7x7 pixels) is used for detecting potential urban pixels.

The seeds of bare land regions are detected by the following procedure. For any pixel in the image, check its 13 x 13 neighborhood. If

1. over 90% of the pixels are labeled as nonvegetation (yellow color in Figure 4-15), and
2. less than 5% of the pixels are potential vegetation pixels (red color in Figure 4-15), and
3. less than 1% of the pixels are road pixels,

then assign the pixel as the seed of a bare land region.

The potential bare land pixels are detected by the following procedure. For any pixel in the image, check its 7 x 7 neighborhood. If

1. less than 10% of the pixels are roads or potential vegetation, and
2. the nonvegetation pixels are over 60% (all colors except white in Figure 4-15), and
3. the pixel is not a water pixel (blue color),

then the pixel is assigned as a potential bare land pixel.

The region growing algorithm for detecting urban areas is given below. The input files contain the urban seed pixels and the potential urban pixels, and the outputs are the urban regions. The same region growing algorithm is used to detect bare land; the only change is that the input files are the bare land seed file and the potential bare land file.

Region Growing Algorithm

Step 1. Assign all the urban seed pixels as urban pixels.
Step 2. Set update := 1.
Step 3. While update = 1 do
        begin
            update := 0;
            Scan the entire image from top to bottom and left to right; for any urban pixel, if any of its 8-neighbors is a potential urban pixel, then
            begin
                assign the neighbor as an urban pixel;
                set update := 1;
            end;
        end;

Figure 4-16 shows the results of using the above rules. Red pixels are urban seeds and green pixels are bare land seeds. The blue and brown pixels are potential urban pixels, and the yellow pixels are potential bare land pixels. Figure 4-17 shows the urban regions (red color) and bare land regions (yellow color) detected by applying the region growing algorithm to Figure 4-16.
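The region growing loop translates naturally into code; the following is a direct, unoptimized rendering of the pseudocode above as a sketch:

```python
import numpy as np

def region_grow(seeds, potential):
    """Repeatedly absorb any potential pixel that 8-neighbors an
    already-accepted pixel, until a full scan makes no change."""
    grown = seeds.copy()
    rows, cols = grown.shape
    changed = True
    while changed:
        changed = False
        for r in range(rows):                 # top-to-bottom scan
            for c in range(cols):             # left-to-right scan
                if grown[r, c]:
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            rr, cc = r + dr, c + dc
                            if (0 <= rr < rows and 0 <= cc < cols and
                                    potential[rr, cc] and not grown[rr, cc]):
                                grown[rr, cc] = True
                                changed = True
    return grown
```

Running it with the bare land seed mask and potential bare land mask instead yields the bare land regions, exactly as the text notes.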
Figure 4-18 shows the results of applying the clustering technique to the vegetation areas. The majority of the vegetation areas (colored blue) are typical vegetation pixels which cannot be further interpreted. Red regions are vegetation areas. The green pixels are spectrally between vegetation and nonvegetation pixels; many of them are road pixels in the ground truth. Note that the three vegetation regions (blue color) in the upper right part of the image can be relabeled as urban areas because they are surrounded by large urban areas.

Figure 4-8 shows a 256x256 image of Fredrick Township, Crawford County. We first produce the kernel information for the image: water regions, road networks, and the vegetation index image. The detected roads are shown in Figure 4-12. Our water detection algorithm detected fewer than ten water pixels in this image, which indicates that no meaningful water regions exist in this image. Figure 4-10 shows the corresponding vegetation index image. Only 2% of the pixels belong to nonvegetation (colored blue). Most of these nonvegetation pixels are scattered into small regions, so no urban or bare land regions are detected in this image. After the analysis of the kernel information, we conclude that more than 90% of the pixels in this image belong to vegetation.

We segmented the vegetation regions of the Fredrick Township image into 2 and 3 clusters. The Forgy clustering algorithm in the ERDAS software package [Erd86] was applied to the six bands (excluding the thermal band) of the Landsat image. The percentage of pixels in each cluster and the cluster centers for the two cluster solution are given below.

Table 4-5: Two cluster solution of the Forgy algorithm on the Fredrick Township vegetation regions.

 Cluster No.   Percentage   Band 1   Band 2   Band 3   Band 4   Band 5   Band 7
     1          80.33%       78.93    32.64    28.56   129.38    95.97    30.17
     2          19.67%       91.67    39.05    45.09    95.83   127.62    53.43

Using these cluster centers in the spectral rules, cluster 1 is successfully classified as vegetation, forest, and deciduous. Similarly, cluster 2 is successfully classified as vegetation and nonforest. The results of the three cluster segmentation are given below.

Table 4-6: Three cluster solution of the Forgy algorithm on the Fredrick Township vegetation regions.

 Cluster No.   Percentage   Band 1   Band 2   Band 3   Band 4   Band 5   Band 7
     1          60.31%       78.85    32.85    28.43   134.05    98.23    30.55
     2          16.58%       75.97    28.98    25.12    94.98    64.91    20.26
     3          23.11%       88.64    37.70    41.06   101.69   120.52    47.13

Again, using these cluster centers in the spectral rules, cluster 1 is successfully classified as vegetation, forest, and deciduous. Similarly, cluster 2 is vegetation, forest, and conifer, and cluster 3 is vegetation and nonforest. The three cluster solution is used for the later processing of this image, because the three cluster centers are well separated and are well interpreted by the spectral rules.

Based on the three cluster result, we define a 5x5 window as a seed if all the pixels within the window belong to the same class. A blob coloring technique [Bal82] is then used to merge all homogeneous seeds into regions and label them. The results are shown in Figure 4-19. The blue regions (18 in total) belong to cluster 1, the green regions (19) belong to cluster 2, and the red regions (38) belong to cluster 3. The results of region growing are shown in Figure 4-20.

The spectral data of all these segmented regions are collected and interpreted using the spectral rules in the Appendix. These regions are classified into three types: deciduous, coniferous, and nonforest. Previously identified nonvegetation pixels are labeled as nonforest. The scattered regions (with size <= 5 pixels) are labeled by the majority of their neighborhood.
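The 5x5 homogeneous-window seed test can be sketched as follows; the blob coloring step [Bal82] would then connected-component label these seed pixels into regions (the function name is ours):

```python
import numpy as np

def seed_windows(cluster_map, win=5):
    """Mark every win x win window whose pixels all share one cluster
    label; the union of such windows gives the seed pixels that blob
    coloring merges into labeled regions."""
    rows, cols = cluster_map.shape
    seeds = np.zeros(cluster_map.shape, dtype=bool)
    for r in range(rows - win + 1):
        for c in range(cols - win + 1):
            block = cluster_map[r:r + win, c:c + win]
            if np.all(block == block[0, 0]):
                seeds[r:r + win, c:c + win] = True
    return seeds
```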
The final interpreted results are shown in Figure 4-21. The blue colored regions are deciduous, the green colored regions are coniferous, and the red colored regions are nonforest. Figure 4-22 shows the ground truth information of the same area. This ground truth information was provided by the Michigan Department of Natural Resources and was generated from 1978 aerial photographs. Comparing Figures 4-21 and 4-22, we obtain the classification accuracy of our knowledge based segmentation method.

Table 4-7: Classification accuracy by land cover type.

 Land Cover Type    Classification Accuracy (%)    Commission Error (%)
 Deciduous                    85.7                        4.3
 Coniferous                   86.3                       15.2
 Nonforest                    87.5                       26.4

The classification accuracy of a land cover type is defined as the number of correctly classified pixels divided by the total number of pixels in the land cover type. The commission error of a land cover type is defined as the number of commission pixels (labeled as the land cover type by the algorithm but not by the ground truth data) divided by the total number of pixels in the land cover type. We believe that these classification accuracies should be higher. The reason is that the ground truth is based on the 1978 aerial photographs, whereas the segmentation was based on a Landsat image taken in 1986. In that eight year time span, some deciduous areas have changed to nonforest categories, which have not been updated in the ground truth file.

Figure 4-16: The detected seeds for urban and bare land from Figure 4-15.
Figure 4-18: The vegetation regions of Figure 4-4.
Figure 4-20: The extended regions from Figure 4-19.
Figure 4-22: The ground truth image of Figure 4-8.

4.7. Discussion and Conclusions

In this chapter we have proposed a knowledge based approach for Landsat image segmentation. The proposed method is designed in such a way that a Landsat image can be segmented and interpreted without any prior image dependent information. The general spectral land cover knowledge is constructed from the training land cover data, and the road information of an image is obtained through a road detection program. One advantage of the method is its flexibility in applying it to geographically different areas.

In our method, the major land cover types in the study area are first organized in a hierarchical structure. The spectral rules for land cover classification are automatically generated. The kernel image information, such as water, roads, and the vegetation index, is then generated to provide an initial interpretation of the image. Finally, a hierarchical segmentation technique is proposed to segment and interpret the image by combining kernel information, spectral rules, and contextual image information. The experimental results show that the proposed method can be successful in segmenting complicated Landsat images. Through this study, we believe that the proposed knowledge based, hierarchical method is a promising approach for Landsat image segmentation. Future research topics to make the proposed method more reliable and powerful are listed below.

1. More spatial rules should be generated. An experienced photointerpreter uses spatial knowledge in image interpretation. Unlike spectral knowledge, spatial knowledge is not automatically generated. A good future research problem is to collect the most frequently used spatial rules and transform them into a computer accessible format.

2. Prior map information should be used in image interpretation. For many Landsat images, prior land cover/use maps are already available. This prior information should be useful in the image interpretation procedure. A complete knowledge based segmentation technique may consist of two stages. The first stage uses the proposed method to segment a Landsat image by spectral knowledge rules.
The second stage then collects more area dependent spatial rules and the prior map information to perform a more complete segmentation.

Chapter 5
Summary, Future Research, and Conclusions

This thesis develops techniques for automating the process of Landsat TM image interpretation. Three major topics are studied: image registration, road detection, and knowledge based segmentation. A summary of the work done on each topic is provided in Sections 5.1 to 5.3. Specific contributions made in this thesis are listed in Section 5.4. Future research topics are discussed in Section 5.5, and the conclusions are given in Section 5.6.

5.1. Summary of Image Registration

Image registration is a frequently used procedure in Landsat image processing. Traditional approaches to image registration require substantial human involvement. In Chapter 2, we proposed a method for attacking the two major steps that hinder automatic image registration: control point selection and control point matching. We used the centroids of water and oil/gas pad regions as control points and developed techniques to extract them automatically. We have improved a point matching algorithm proposed in the literature by reducing its time complexity from O(n^4) to O(n^3). The robustness of the proposed matching algorithm is demonstrated through simulation experiments. Experimental results on three Landsat images show that the performance of the proposed method is comparable to that of an experienced photointerpreter. One limitation of the proposed point matching technique is that the computation time increases drastically as the number of control points becomes large (over an hour for 65 control points on an 80386-based IBM PC). We used only small Landsat images (256x256) in the image registration experiments. Larger images should be used to test the robustness of the proposed method.

5.2. Summary of Road Detection

Roads are frequently appearing objects in Landsat images.
Identifying roads in these images is useful in many applications, such as automated registration and image interpretation. A major problem in most road detection techniques is that the poor quality of the line detector response often causes difficulties in later processing. In the published literature, relaxation techniques have been tried to solve this problem [Zuc77, Pra80]. However, we have shown that pixel based relaxation techniques can obtain only limited improvement in road detection because they fail to fill road gaps which are more than a few pixels long. We have proposed a hierarchical approach for road detection. The key concept is to combine the image hierarchy and an iterative operator in such a way that when the image size is reduced, the global road information is maintained. Experimental results on several Landsat images show that using the image hierarchy results in better road identification than not using it.

5.3. Summary of Knowledge Based Segmentation

Commonly used knowledge based segmentation techniques, such as top-down and bottom-up methods, are difficult to apply in Landsat image interpretation. We attack this problem by first constructing a hierarchical structure for the major land cover types in the study area. Then we use a pattern recognition technique to automate the process of spectral rule generation to identify different land cover types. Kernel image information is constructed to provide an initial understanding of the image. Finally, a spatial clustering technique, the kernel image information, and the spectral rules are combined to provide a more detailed interpretation of the image. Due to the lack of ground truth information, only one Landsat image has been evaluated in our experiments. The results show that our method can identify deciduous forest, coniferous forest, and nonforest areas with an accuracy of over 85%.
The proposed segmentation technique has also been used to interpret a Landsat image of the city of Alpena, Michigan. The results appear reasonable. However, no classification accuracy can be provided because we do not have the necessary ground truth information.

5.4. Contributions of the Dissertation

One unique contribution in Chapter 2 is the improvement of the time complexity of a popular point matching technique from O(n^4) to O(n^3). The proposed technique can compete with any other published point matching technique in time complexity, and its robustness has been demonstrated through simulation. The proposed point matching technique has been used in an automatic image registration method. Experimental results indicate that the performance of the proposed method is comparable to that obtained by manual image registration.

Road detection techniques have been studied for more than fifteen years by numerous researchers. However, a major problem in the process, namely, how to obtain a global view, has not been adequately addressed. In Chapter 3, a hierarchical approach is proposed to obtain the global road information from the image itself. The key concept is to combine the image hierarchy and an iterative operator in such a way that, when the image size is reduced, the global road information is maintained. The advantage of this method is that it can identify roads in an image more efficiently. Experimental results using several satellite images show that the proposed hierarchical approach performs better in road detection than the same approach without the hierarchy.

There are three contributions in the proposed knowledge based segmentation technique, which are briefly described below.

1. Using Kernel Image Information in the Interpretation Procedure. The kernel information of an image is the information that can be reliably extracted from the image by using spectral knowledge and image processing techniques.
The kernel information of an image includes roads, water regions, vegetation regions, and nonvegetation regions. We have developed a technique to extract this kernel information to provide the initial interpretation of the image and to use it as a basis for further image segmentation and interpretation.

2. Automating the Spectral Knowledge Acquisition. Knowledge acquisition for satellite image interpretation is usually performed by experienced photointerpreters. This process takes a considerable amount of time, and reliable rules are sometimes difficult to obtain. We have automated the process of spectral rule generation by applying pattern recognition techniques to the training land cover spectral data. The experimental results show that the automatically generated rules can compete in classification accuracy with those generated by a photointerpreter. Furthermore, more rules can be generated by the automatic rule generation procedure.

3. Use of Hierarchy in Landsat Image Segmentation. A Landsat image consists of many regions that belong to different land cover types. These land cover types form a hierarchical structure. Interpreting Landsat images hierarchically can at least provide us with a partial understanding of some regions. For example, we may identify a region as a forest but not be sure whether it is a coniferous or a deciduous forest.

Despite the contributions achieved in this dissertation, we have to humbly admit that fully automating the process of Landsat image interpretation is still an open problem. Future research directions are suggested in Section 5.5 to make computerized Landsat image interpretation more feasible.

5.5. Future Research Topics

Despite the contributions we have made in this thesis, there is still much work to be done to make the automatic interpretation of Landsat images more successful.
This section describes potential research topics which we believe may eventually play a crucial role in Landsat image interpretation.

5.5.1. Parameter Tuned Block in Image Processing Applications

A long-standing problem with many image processing techniques is their inflexibility: the same algorithm might work for one set of images but not for another. This inflexibility significantly limits the applicability of image processing techniques. In the following, we introduce a parameter tuned block which might make many image processing techniques more flexible, adaptive, and user friendly.

The concept of a parameter tuned block can be explained through a daily life example. If a person is listening to a radio, the radio can be viewed as a block and the two switches (volume and channel) can be viewed as parameters. The person need not know the internal structure of the radio, only the basic functions of the two switches. Through appropriate adjustment of the two switches, the person can enjoy the music he or she likes. The same concept can be used in image processing. Figure 5-1 shows the basic flowchart. The function block can be any set of image processing techniques, such as line detection, edge detection, or segmentation. The major concept is that the behavior of the function block is controlled by the key parameters. When the values of the key parameters change, the behavior of the function block changes correspondingly. A good example of this concept is the Canny edge operator. There are four parameters in the Canny operator: two parameters determine the range of edge strengths to be considered, and the other two determine the size of the edge operator. Through the adjustment of these four parameters, the output of the Canny operator will be different even if the input image is unchanged. Learning techniques can be developed to adjust the parameters based on the training input and output images.
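The adjust-and-evaluate loop described above can be sketched as a brute-force search over candidate parameter settings. This is only an illustration of the idea: the names tuned_block, threshold, and evaluate are invented here, a toy thresholding operation stands in for the function block, and a learning technique would replace the exhaustive search.

```python
import numpy as np

def tuned_block(function_block, evaluate, candidates, image):
    """Parameter tuned block: run the function block with each candidate
    parameter setting, score the intermediate output, and keep the
    best-scoring parameters (a brute-force 'adjust' step)."""
    best_params, best_score = None, -np.inf
    for params in candidates:
        output = function_block(image, **params)
        score = evaluate(output)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy function block: threshold an image.  The evaluation prefers
# outputs whose foreground fraction is close to a target density.
def threshold(img, t):
    return img > t

def evaluate(binary, target=0.3):
    return -abs(binary.mean() - target)

img = np.linspace(0, 1, 100).reshape(10, 10)
grid = [{"t": t} for t in np.linspace(0.1, 0.9, 9)]
params, score = tuned_block(threshold, evaluate, grid, img)
print(params)   # the threshold whose output density is nearest the target
```

The same skeleton applies unchanged when the function block is an edge detector and the candidates are combinations of its threshold and operator-size parameters.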
One potential advantage of the parameter tuned block is that many image processing techniques would become user friendly tools. Photointerpreters can use these tools to speed up the process of satellite image interpretation.

Figure 5-1: Parameter tuned block in image processing. (The input image enters the function block, whose key parameters are adjusted, based on an evaluation of the intermediate output image, until the output is acceptable.)

5.5.2. Region Based Relaxation Techniques

Region labeling is a typical problem in knowledge based segmentation, which is described below. Suppose we know the label of each region of the image only with a certain probability; we want to interpret each region based on its local neighborhood information so as to obtain a consistent interpretation of the whole image. Rule based methods combined with the blackboard concept are the most popular way to solve this problem [Nag80, Oht85]. The potential problems of using a rule based approach are that it consumes a lot of computation time and memory space, and the results might change if the rules are applied in different orders.

There is a clear similarity between the region labeling problem and the pixel labeling problem [Ros76]. Both problems try to interpret each entity based on its local neighborhood consistency in the hope of obtaining global consistency. Pixel based relaxation techniques are parallel and iterative, and have been successfully applied in many applications. A potential research topic is to apply the relaxation technique to region based interpretation. Using production rules in an array format [Gol83] might be the first step. However, some artificial intelligence techniques will have to be developed and integrated with the relaxation concept to realize a region based relaxation technique.

5.5.3. Spatial Knowledge Acquisition Techniques

Spatial knowledge plays a very important role in Landsat image interpretation. Without this knowledge, the interpretation would be error prone and impractical.
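As a concrete illustration of the pixel labeling relaxation scheme discussed above [Ros76], a minimal sketch is given below; the labels, compatibility coefficients, and region adjacencies are invented for illustration, and lifting exactly this kind of update from pixels to regions is the proposed research topic.

```python
import numpy as np

def relax(p, neighbors, compat, iters=10):
    """Relaxation labeling (after Rosenfeld, Hummel, and Zucker):
    p[i, l] is the probability that unit i has label l.  Each iteration
    reinforces labels compatible with the current labels of neighboring
    units, then renormalizes the probabilities."""
    for _ in range(iters):
        q = np.zeros_like(p)
        for i, nbrs in neighbors.items():
            for j in nbrs:
                q[i] += p[j] @ compat.T      # support for each label of i
            q[i] /= len(nbrs)
        p = p * q
        p = p / p.sum(axis=1, keepdims=True)
    return p

# Two labels (say, water and forest); like labels support each other.
compat = np.array([[1.0, 0.2],               # compat[l, m]: label l next to m
                   [0.2, 1.0]])
p0 = np.array([[0.6, 0.4],                   # unit 0: leans water
               [0.5, 0.5],                   # unit 1: ambiguous
               [0.9, 0.1]])                  # unit 2: confidently water
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
final = relax(p0, neighbors, compat)
print(final.round(2))    # the ambiguous unit is pulled toward water
```

The update is local, parallel, and order independent, which is precisely what the rule based blackboard approach lacks.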
A human expert might use hundreds of spatial rules in Landsat image interpretation. Unlike spectral knowledge acquisition, there is no easy way to automate the process of spatial rule generation. How to represent these rules in a computer accessible format and integrate them effectively with other image processing techniques is a difficult and time consuming task.

5.6. Conclusions

Automatic Landsat image interpretation is a difficult task. It requires the integration of domain knowledge with techniques from the fields of image processing, artificial intelligence, and pattern recognition. In this dissertation, three important issues in automatic satellite image interpretation have been studied: image registration, road detection, and knowledge based segmentation. Several contributions have been made to each topic; they are summarized in Section 5.4.

Despite the progress that has been achieved in this thesis, it is unlikely that fully automatic Landsat image interpretation can be realized in the near future. The major problem is that, despite more than twenty-five years of effort by numerous researchers, the available techniques in computer vision and artificial intelligence are still too primitive to replace experienced human experts. A more promising approach is cooperation between human experts and computer programs in image interpretation, in which the effort of the human experts is minimized and the use of the computer is maximized. In order to achieve this goal, several future research topics have been suggested in Section 5.5. If these problems can be solved and their solutions integrated with the techniques proposed in this dissertation, then it is very possible that an automatic Landsat image interpretation product can be produced. Such a product would save a significant amount of time and cost in Landsat image interpretation, compared to the currently used approach that depends heavily on experienced photointerpreters.
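The Appendix that follows lists program-generated spectral rules of the interval form mean ± s.d. A minimal sketch of how such a rule can be derived and scored from training samples is given below; the training distributions and sample sizes are invented, and the confidence computation (success ratio minus false alarm ratio) matches most, though not all, of the tabulated entries.

```python
import numpy as np

def make_rule(pos, neg, feature):
    """Build one interval rule mean - sd <= x <= mean + sd from training
    samples of the target cover type (pos) and of all other cover types
    (neg), and score it as in the Appendix tables."""
    mean, sd = pos.mean(), pos.std()
    lo, hi = mean - sd, mean + sd
    success = np.mean((pos >= lo) & (pos <= hi))       # Success Ratio
    false_alarm = np.mean((neg >= lo) & (neg <= hi))   # False Alarm Ratio
    return {"feature": feature, "mean": mean, "sd": sd,
            "success": success, "false_alarm": false_alarm,
            # assumed scoring: confidence = success - false alarm
            "confidence": success - false_alarm}

rng = np.random.default_rng(0)
water_b4 = rng.normal(15.5, 4.0, 50)    # hypothetical band-4 water samples
other_b4 = rng.normal(90.0, 25.0, 200)  # hypothetical non-water samples
rule = make_rule(water_b4, other_b4, "B4")
print(rule["feature"], round(rule["confidence"], 2))
```

Running this over every feature and cover type, and keeping only rules whose confidence exceeds a threshold, reproduces the flavor of the automatically generated rule tables.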
Appendix

Program-Generated Land Cover Spectral Rules

The definition of a spectral rule for a given land cover type is given below. (Urban and bare land are separated by spatial rules.)

If mean(xk) - sd(xk) <= xk <= mean(xk) + sd(xk), then label the region as the cover type with the designated positive confidence, where xk is the k-th feature value of the test region.

Vegetation Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B3        41.00  18.10  1.00           0.091              0.909
2     B5/B6    145.34  34.66  0.980          0.00               0.98
3     B4/B2    149.28  30.72  0.980          0.091              0.891
4     B4/B1    109.50  14.57  0.959          0.091              0.872

Agriculture Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B1        91.45   7.65  1.00           0.038              0.962
2     B3        44.15  13.25  1.00           0.038              0.962
3     B5       104.20  24.50  0.833          0.038              0.801
      B4/B3    135.73  23.82

Forest Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B1        77.00   2.30  0.923          0.087              0.843
2     B2        28.35   1.75  0.846          0.0                0.846
3     B3        24.5    1.60  0.923          0.087              0.843
4     B5        53.4   15.50  0.692          0.0                0.692
5     B5        17.10   6.10  0.769          0.0                0.769

Deciduous Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B5        87.55   9.35  0.778          0.0                0.778
2     B4/B3    169.96  10.04  1.00           0.00               1.00
3     B2/B1    165.08  14.92  0.889          0.00               0.889

Coniferous Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B4        73.95  13.15  1.00           0.00               1.00
2     B4/B3    138.53  11.76  0.938          0.10               0.844
3     B4/B2    132.49   7.28  1.00           0.00               1.00
4     B4/B1     98.80   3.50  1.00           0.00               1.00

Red Pine Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B5        47.50   9.60  0.857          0.0                0.857
2     B5        15.05   4.05  1.00           0.00               1.00
3     B5/B6    144.35   4.56  0.857          0.00               0.857

Jack Pine Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B6        22.35   1.70  1.00           0.00               1.00
2     B5/B6    135.01   1.62  0.66           0.00               0.66

Nonforest Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B1        92.51  10.80  0.869          0.038              0.838
2     B3        44.30  14.80  0.869          0.038              0.838
3     B4        95.30  26.50  0.826          0.038              0.794
      B5       117.75  39.85
4     B5       115.75  41.85  0.913          0.076              0.843
      B9       134.97  22.36

Rangeland Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B1        91.60  10.50  0.824          0.219              0.644
2     B2        38.55   7.65  0.882          0.25               0.661
      B6        51.45  12.65
3     B3        44.30  14.80  0.824          0.219              0.644

Grass Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B4        78.80  10.00  0.875          0.11               0.779
2     B4/B3    111.65   5.34  0.625          0.00               0.625
3     B4/B2    116.76   4.15  0.625          0.00               0.625
4     B4/B1     94.82   3.57  0.625          0.00               0.625

Shrub Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B4       125.40  23.20  0.67           0.00               0.67
2     B6        28.75   5.45  0.889          0.25               0.67
3     B4/B1    110.30   6.26  0.67           0.00               0.67
4     B4/B2    153.46  12.44  0.56           0.00               0.56

Nonvegetation Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B4/B3     93.73  10.18  0.91           0.00               0.91

Water Rules

Rule  Feature  Mean    S.d.   Success Ratio  False Alarm Ratio  Positive Confidence
1     B4        15.55   4.75  1.00           0.00               1.00
2     B5        10.30   5.30  1.00           0.00               1.00

List of References

[Ama88] Amadasun, M. and King, R. A., "Low-level Segmentation of Multispectral Images via Agglomerative Clustering of Uniform Neighborhoods," Pattern Recognition, Vol. 21, No. 3, pp. 261-268, 1988.

[Bai84] Baird, H. S., Model-Based Image Matching Using Location. The MIT Press, Cambridge, Massachusetts, 1984.

[Bal82] Ballard, D. H. and Brown, C. M., Computer Vision. Prentice-Hall Inc., Englewood Cliffs, N.J., 1982.

[Bar81] Barr, A. and Feigenbaum, E. A., The Handbook of Artificial Intelligence. William Kaufmann, Los Altos, California, Vol. 1, 1981.

[Byl85] [Can87] [Cha83] [Cha86] [Cha84]

[Bur81] Burt, P. J., Hong, T. H., and Rosenfeld, A., "Segmentation and Estimation of Image Region Properties Through Cooperative Hierarchical Computation," IEEE Trans.
Systems Man Cybernet., Vol. SMC-11, pp. 75-82, 1981.

Bylander, T. and Mittal, S., "CSRL: A Language for Classificatory Problem Solving and Uncertainty Handling," AI Magazine, Vol. 7, pp. 66-77, 1985.

Canning, J., Kim, J., and Rosenfeld, A., "Symbolic Pixel Labeling for Curvilinear Feature Detection," Proc. of DARPA Image Understanding Workshop, pp. 242-256, Feb. 1987.

Chandrasekaran, B. and Mittal, S., Conceptual Representation of Medical Knowledge for Diagnosis by Computer: MDX and Related Systems. Advances in Computers (ed. Yovits, M.), Vol. 22, Academic Press, pp. 217-293, 1983.

Chandrasekaran, B., "Generic Tasks in Knowledge-Based Reasoning: High-Level Building Blocks for Expert System Design," IEEE Expert, Vol. 1, pp. 23-30, Fall 1986.

Chang, S. H., "Trajectory Detection and Experimental Designs," Computer Vision, Graphics, and Image Processing, Vol. 27, pp. 346-368, 1984.

[Che84] Cheng, J. K. and Huang, T. S., "Image Registration by Matching Relational Structures," Pattern Recognition, Vol. 17, No. 1, pp. 149-159, 1984.

[Che88] Chen, K. and Lu, C. Y., "A Machine Learning Approach to the Automatic Synthesis of Mechanistic Knowledge for Engineering Decision Making," Proc. 4th IEEE Conf. on Artificial Intelligence Applications, pp. 306-311, 1988.

[Chi80] Chittineni, C. B., "Signature Extension in Remote Sensing," Pattern Recognition, Vol. 12, pp. 243-249, 1980.

[Dav79] Davis, L. S., "Shape Matching Using Relaxation Techniques," IEEE Trans. Pattern Anal. Machine Intell., Vol. PAMI-1, No. 1, pp. 69-72, Jan. 1979.

[Dav78] Davis, W. A. and Kenue, S. K., "Automatic Selection of Control Points for the Registration of Digital Images," Proc. Fourth Int. Joint Conf. Pattern Recognition, pp. 936-938, 1978.

[Dev82] Devijver, P. A. and Kittler, J., Pattern Recognition: A Statistical Approach. Prentice-Hall International, London, 1982.

[DNR89] [Dud73] [Ens83] [Ens87] [Erd86] [Eri84]
Department of Natural Resources, State of Michigan, The Michigan Resource Inventory Program - Product Catalog, March 1989.

Duda, R. O. and Hart, P. E., Pattern Classification and Scene Analysis. John Wiley & Sons, Inc., New York, 1973.

Enslin, W. R., Hudson, W. D., and Lusch, D. P., Photo Interpretation Key to Michigan Land Cover/Use, Center for Remote Sensing, Michigan State University, 1983.

Enslin, W. R., Ton, J., and Jain, A. K., "Land Cover Change Detection using a GIS-guided, Feature-based Classification of Landsat Thematic Mapper Data," ASPRS-ACSM Annual Convention, Baltimore, Vol. 6, pp. 108-120, 1987.

Erdas User's Guide, Erdas Corporation, Atlanta, Georgia, 1983.

Erickson, W. K. and Likens, W. C., "An Application of Expert Systems Technology to Remotely Sensed Image Analysis," Proceedings 9th Pecora Symposium, pp. 258-276, 1984.

[Fis81a] [Fis81b] [Gal85] [Gol83] [Gol85]

[Fis81a] Fischler, M. A. and Bolles, R. C., "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography," Communications of the Association for Computing Machinery, pp. 726-740, 1981.

[Fis81b] Fischler, M. A., Tenenbaum, J. M., and Wolf, H. C., "Detection of Roads and Linear Structures in Low-Resolution Aerial Imagery Using a Multisource Knowledge Integration Technique," Computer Graphics and Image Processing, Vol. 15, pp. 201-223, 1981.

[Gal85] Gallant, S. I., "Automatic Generation of Expert Systems from Examples," IEEE Conference on Computer Vision and Pattern Recognition, June 1988.

[Gol83] Goldberg, M., Karam, G., and Alvo, M., "A Production Rule-Based Expert System for Interpreting Multi-Temporal Landsat Imagery," Proc. IEEE, pp. 77-82, 1983.

[Gol85] Goldberg, M., Goodenough, D. G., Alvo, M., and Karam, G. M., "A Hierarchical Expert System for Updating Forest Maps with Landsat Data," Proceedings of the IEEE, Vol. 73, No. 6, pp. 1054-1063, June 1985.

[Goo87] [Gos85] [Gos86] [Gos83] [Gro82]
Goodenough, D. G., Goldberg, M., and Plunkett, G., "An Expert System for Remote Sensing," IEEE Trans. on Geoscience and Remote Sensing, Vol. GE-25, No. 3, pp. 349-359, May 1987.

Goshtasby, A. and Stockman, G., "Point Pattern Matching Using Convex Hull Edges," IEEE Trans. Syst., Man, Cybern., Vol. SMC-15, No. 5, pp. 631-637, Sep. 1985.

Goshtasby, A., Stockman, G., and Page, C. V., "A Region-Based Approach to Digital Image Registration with Subpixel Accuracy," IEEE Trans. on Geoscience and Remote Sensing, Vol. GE-24, No. 3, pp. 390-399, May 1986.

Goshtasby, A., "A Symbolically-Assisted Approach to Digital Image Registration with Application in Computer Vision," Ph.D. Thesis, Dept. of Computer Science, Michigan State University, 1983.

Groch, W. D., "Extraction of Line Shaped Objects from Aerial Images Using a Special Operator to Analyze the Profiles of Functions," Computer Graphics and Image Processing, Vol. 18, pp. 347-358, 1982.

[Gul88] Guindon, B., "Multi-Temporal Scene Analysis: A Tool to Aid in the Identification of Cartographically Significant Edge Features on Satellite Imagery," Canadian Journal of Remote Sensing, Vol. 14, No. 1, pp. 38-45, 1988.

[Han78] Hanson, A. R. and Riseman, E. M. (eds.), Computer Vision Systems. Academic Press, New York, 1978.

[Har83] Haralick, R. M., "Ridges and Valleys in Digital Images," Computer Vision, Graphics, and Image Processing, Vol. 22, pp. 28-38, 1983.

[Har85] Haralick, R. M. and Shapiro, L. G., "Image Segmentation Techniques," Computer Vision, Graphics, and Image Processing, Vol. 29, pp. 100-132, 1985.

[Hen76] Henderson, R. G., "Signature Extension Using the MASC Algorithm," IEEE Trans. on Geoscience Electronics, Vol. 14, No. 1, pp. 34-37, Jan. 1976.

[Hor76] Horowitz, L. and Pavlidis, T., "Picture Segmentation by a Tree Traversal Algorithm," J. Assoc. Comput. Mach., Vol. 23, pp. 368-388, 1976.

[Hue87] Huertas, A., Cole, W., and Nevatia, R., "Detecting Runways in Aerial Images," Proc. DARPA Image Understanding Workshop, pp.
272-297, Feb. 1987.

[Hwa86] Hwang, S. S., Davis, L. S., and Matsuyama, T., "Hypothesis Integration in Image Understanding Systems," Computer Vision, Graphics, and Image Processing, Vol. 36, pp. 321-371, 1986.

[Ibi86] Ibison, M. C. and Zapalowski, L., "On the Use of Relaxation Labeling in the Correspondence Problem," Pattern Recognition Letters, Vol. 4, pp. 103-109, 1986.

[Jai82] Jain, A. K. and Chandrasekaran, B., "Dimensionality and Sample Size Considerations in Pattern Recognition Practice," in Handbook of Statistics, Vol. 2, pp. 835-855, Krishnaiah, P. R. and Kanal, L. N. (eds.), North-Holland Publ. Co., 1982.

[Jai88a] Jain, A. K. and Hoffman, R., "Evidence-Based Recognition of 3D Objects," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 10, No. 6, pp. 783-802, 1988.

[Jai88b] Jain, A. K. and Dubes, R. C., Algorithms for Clustering Data. Prentice-Hall, Englewood Cliffs, 1988.

[Jak80] Jakes, P. J., "Timber Resource of Michigan's Northern Lower Peninsula," Resource Bulletin NC-62, U.S.D.A. Forest Service, North Central Forest Experiment Station, St. Paul, Minnesota, 1980.

[Jen86] Jensen, J. R., Introductory Digital Image Processing: A Remote Sensing Perspective. Prentice-Hall, Englewood Cliffs, N.J., 1986.

[Joh85] Johnson, D. S., "The NP-Completeness Column: An Ongoing Guide," Journal of Algorithms, Vol. 6, pp. 434-451, 1985.

[Kah80] Kahl, D. J., Rosenfeld, A., and Danker, A., "Some Experiments in Point Pattern Matching," IEEE Trans. Syst., Man, Cybern., Vol. SMC-10, No. 2, pp. 105-116, Feb. 1980.

[Kah85] Kahn, G., Nowlan, S., and McDermott, J., "Strategies for Knowledge Acquisition," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. PAMI-7, No. 5, pp. 511-522, 1985.

[Kam88] Kamada, Toraichi, K., Mori, R., Yamamoto, K., and Yamada, H., "A Parallel Architecture for Relaxation Operations," Pattern Recognition, Vol. 21, No. 2, pp. 175-181, 1988.

[Kas75] [Lan81] [Lan84] [Lev85a] [Lev85b]
Kasvand, T., "Some Observation Linguistics for Scene Analysis," Proceedings of the IEEE Conference on Computer Graphics, Pattern Recognition, and Data Structure, Los Angeles, pp. 118-124, May 1975.

Landgrebe, D. A., "Analysis Technology for Land Remote Sensing," Proc. of the IEEE, Vol. 69, No. 5, pp. 628-642, 1981.

Landsat 4 Data User Handbook, National Oceanic and Atmospheric Administration, 1984.

Levine, M. D., Vision in Man and Machine. McGraw-Hill, New York, 1985.

Levine, M. D. and Nazif, A. M., "Rule-Based Image Segmentation: A Dynamic Control Strategy Approach," Computer Graphics and Image Processing, Vol. 32, pp. 243-248, 1985.

[Lil79] Lillesand, T. M. and Kiefer, R. W., Remote Sensing and Image Interpretation, John Wiley & Sons, Inc., New York, 1979.

[Low88] Lowe, D. G., "Organization of Smooth Image Curves at Multiple Scales," Second IEEE International Conference on Computer Vision, pp. 558-567, Dec. 1988.

[Luk80] Luks, E. M., "Isomorphism of Bounded Valence can be Tested in Polynomial Time," Journal of Computer and System Sciences, Vol. 25, pp. 42-65, 1980.

[Lus89] Lusch, D. P., "Fundamental Considerations for Teaching the Spectral Reflectance Characteristics of Vegetation, Soil and Water," in Nellis, M. D., Lougeay, R., and Lulla, K. (eds.), Current Trends in Remote Sensing Education, Hong Kong, Geocarto International Centre, pp. 5-27, 1989.

[Mac88] Macgregor, J. N., "The Effects of Order on Learning Classifications by Example: Heuristics for Finding the Optimal Order," Artificial Intelligence, Vol. 34, pp. 361-370, 1988.

[Mar80] Marr, D. and Hildreth, E. C., "Theory of Edge Detection," Proc. Roy. Soc. London Ser. B 207, pp. 187-217, 1980.

[Mas88] Mason, D. C., Corr, D. G., Cross, A., Hogg, D. C., Lawrence, D. H., Petrou, M., and Tailor, A. M., "The Use of Digital Map Data in the Segmentation and Classification of Remotely-Sensed Images," Int. J. Geographical Information Systems, Vol. 2, No. 3, pp. 195-215, 1988.

[Mat87] [McK85] [McK87]
Matsuyama, T., "Knowledge-Based Aerial Image Understanding Systems and Expert Systems for Image Processing," IEEE Trans. on Geoscience and Remote Sensing, Vol. GE-25, No. 3, pp. 305-316, 1987.

McKeown, D. M., Harvey, W. A., and McDermott, J., "Rule-Based Interpretation of Aerial Imagery," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 7, No. 5, pp. 570-585, 1985.

McKeown, D. M. and Harvey, W. A., "Automatic Knowledge Acquisition for Aerial Image Interpretation," Defense Advanced Research Projects Agency (DARPA) Image Understanding Workshop, pp. 205-221, Feb. 1987.

[McL88] McLean, G. F. and Jernigan, M. E., "Hierarchical Edge Detection," Comput. Vision Graphics Image Process., Vol. 44, pp. 350-366, 1988.

[Med84] Medioni, G. and Nevatia, R., "Matching Images Using Linear Features," IEEE Trans. Pattern Anal. Machine Intell., Vol. PAMI-6, No. 6, pp. 675-685, 1984.

[Mic86a] Michalski, R. S., "Understanding the Nature of Learning: Issues and Research Directions," in Machine Learning: An Artificial Intelligence Approach, Vol. 2, pp. 3-25, Michalski, R. S., Carbonell, J. G., and Mitchell, T. G. (eds.), Morgan Kaufman, Los Altos, CA, 1986.

[Mic86b] Michalski, R. S., "Learning to Predict Sequences," in Machine Learning: An Artificial Intelligence Approach, Vol. 2, pp. 63-106, Michalski, R. S., Carbonell, J. G., and Mitchell, T. G. (eds.), Morgan Kaufman, Los Altos, CA, 1986.

[Moo83] Mooneyhan, D. W., "The Potential of Expert Systems for Remote Sensing," Proceedings 17th International Symposium on Remote Sensing of Environment, Ann Arbor, Michigan, pp. 51-64, 1983.

[Nag80] Nagao, M. and Matsuyama, T., A Structural Analysis of Complex Aerial Photographs. Plenum Press, New York, 1980.

[Nag84] Nagao, M., "Control Strategies in Pattern Analysis," Pattern Recognition, Vol. 17, No. 1, pp. 45-56, 1984.

[Nev80] [Nib88] [Ohl78]

[Naz84] Nazif, A. M. and Levine, M. D., "Low Level Image Segmentation: An Expert System," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol.
6, No. 5, pp. 555-577, 1984.

Nevatia, R. and Babu, K. R., "Linear Feature Extraction and Description," Computer Graphics and Image Processing, Vol. 13, pp. 257-269, 1980.

Niblack, W. and Petkovic, D., "Experiments and Evaluations of Rule Based Methods in Image Analysis," Proc. IEEE Computer Vision and Pattern Recognition, Ann Arbor, pp. 123-128, 1988.

Ohlander, R., Price, K., and Reddy, D. R., "Picture Segmentation Using a Recursive Region Splitting Method," Comput. Graphics Image Process., Vol. 8, pp. 313-333, 1978.

[Oht85] Ohta, Y., Knowledge-based Interpretation of Outdoor Natural Color Scenes. Pitman Advanced Publishing Program, Boston, 1985.

[Per84] Perry, C. R. and Lautenschlager, L. F., "Functional Equivalence of Spectral Vegetation Indices," J. Remote Sensing of Environment, Vol. 14, pp. 169-182, 1984.

[Pra80] Prager, J. M., "Extracting and Labeling Boundary Segments in Natural Scenes," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 2, pp. 16-27, January 1980.

[Pri86] Price, K. E., "Hierarchical Matching Using Relaxation," Computer Vision, Graphics, and Image Processing, Vol. 34, pp. 66-75, 1986.

[Pri79] Price, K. and Reddy, R., "Matching Segments of Images," IEEE Trans. Pattern Anal. Machine Intell., Vol. PAMI-1, No. 1, pp. 110-116, Jan. 1979.

[Ran80] Ranade, S. and Rosenfeld, A., "Point Pattern Matching by Relaxation," Pattern Recognition, Vol. 12, pp. 269-275, 1980.

[Ris87] Riseman, E. M. and Hanson, A. R., "A Methodology for the Development of General Knowledge-Based Vision Systems," in Hanson, A. R. and Arbib, M. A. (eds.), Vision, Brain, and Cooperative Computation, MIT Press, Cambridge, Massachusetts, pp. 285-328, 1987.

[Ros71] Rosenfeld, A. and Thurston, M., "Edge and Curve Detection for Visual Scene Analysis," IEEE Trans. Comput., Vol. C-20, pp. 562-569, 1971.

[Ros77] [Rum86] [Sch76]

[Ros76] Rosenfeld, A., Hummel, R. A., and Zucker, S. W., "Scene Labeling by Relaxation Operations," IEEE Trans. Syst., Man, Cybern., Vol.
SMC-6, pp. 420-433, June 1976.

Rosenfeld, A. and VanderBrug, G. J., "Two-Stage Template Matching," IEEE Trans. Comput., Vol. C-26, pp. 384-393, 1977.

Rumelhart, D. E., Hinton, G. E., and Williams, R. J., "Learning Representations by Back-Propagating Errors," Nature, Vol. 323, pp. 533-536, 1986.

Schachter, B. J., Davis, L. S., and Rosenfeld, A., "Scene Segmentation by Cluster Detection in Color Spaces," SIGART Newsletter, No. 58, pp. 16-17, 1976.

[Sed83] Sedgewick, R., Algorithms, Addison-Wesley, Menlo Park, California, 1983.

[Sha85] Shapiro, L. G., "The Role of AI in Computer Vision," The Second IEEE Conference on AI Applications, Dec. 13-15, pp. 76-81, 1985.

[Smi82] Smith, W. D., "Timber Resource of Michigan's Eastern Upper Peninsula," Resource Bulletin NC-64, U.S.D.A. Forest Service, North Central Forest Experiment Station, St. Paul, Minnesota, 1982.

[Sti85] Sticklen, J., Chandrasekaran, B., Smith, J. W., and Svirbely, J. R., "MDX-MYCIN: The MDX Paradigm Applied to the MYCIN Domain," Computers and Mathematics with Applications, Vol. 11, pp. 527-539, 1985.

[Sto82] Stockman, G., Kopstein, S., and Benett, S., "Matching Images to Models for Registration and Object Detection via Clustering," IEEE Trans. Pattern Anal. Machine Intell., Vol. PAMI-4, No. 3, pp. 229-241, May 1982.

[Swa78] Swain, P. H. and Davis, S. M., Remote Sensing: The Quantitative Approach, McGraw-Hill, New York, 1978.

[Swa85] Swain, P. H., "Advanced Interpretation Techniques for Earth Data Information Systems," Proceedings of the IEEE, Vol. 73, No. 6, pp. 1031-1039, 1985.

[Tai86] Tailor, A., "Knowledge-Based Interpretation of Remotely Sensed Images," Image and Vision Computing, Vol. 4, No. 2, pp. 67-83, 1986.

[Tai87] Tailor, A., "A System for Knowledge-Based Segmentation of Remotely-Sensed Images," Proceedings of IGARSS '87 Symposium, Ann Arbor, pp. 111-116, 1987.

[Ten77]

[Tan75] Tanimoto, S.
and Pavlidis, T., "A Hierarchical Data Structure for Picture Processing," Computer Graphics and Image Processing, Vol. 4, pp. 104-119, 1975.

Tenenbaum, J. M. and Barrow, H. G., "Experiments in Interpretation-Guided Segmentation," Artificial Intelligence, Vol. 8, pp. 241-274, 1977.

[Ton87] Ton, J., Jain, A. K., Enslin, W. R., and Hudson, W. D., "Automatic Road Detection in Landsat 4 TM Images," Proc. of 21st International Symposium on Remote Sensing of Environment, pp. 925-937, Ann Arbor, Michigan, Oct. 1987.

[Use85] User Guide for Landsat Thematic Mapper Computer Compatible Tapes, Earth Observation Satellite Company, 1985.

[Van77] Vanderbrug, G. J., "Experiments in Iterative Enhancement of Linear Features," Computer Graphics and Image Processing, Vol. 6, pp. 25-42, 1977.

[Wan83] Wang, C., Sun, H., Yada, S., and Rosenfeld, A., "Some Experiments in Relaxation Image Matching Using Corner Features," Pattern Recognition, Vol. 16, pp. 167-182, 1983.

[Wan87] Wang, J. F. and Howarth, P. J., "Automatic Road Network Extraction from Landsat TM Imagery," ASPRS-ACSM Annual Convention, Baltimore, Vol. 1, pp. 86-95, 1987.

[Wan88] Wang, F. and Newkirk, R., "A Knowledge-based System for Highway Network Extraction," IEEE Trans. on Geoscience and Remote Sensing, Vol. 26, No. 5, pp. 525-531, 1988.

[Wel85] Welch, R., Jordan, T. R., and Ehlers, M., "Comparative Evaluations of the Geodetic Accuracy and Cartographic Potential of Landsat-4 and Landsat-5 Thematic Mapper Image Data," Photogrammetric Engineering and Remote Sensing, Vol. 51, No. 9, pp. 1249-1262, Sep. 1985.

[Wha86] Wharton, S. W., "A Spectral Target Recognition Expert for Urban Land Cover Discrimination in High Resolution Remotely Sensed Data," Ph.D. Thesis, University of Maryland, 1986.

[Wha87] Wharton, S. W., "A Spectral-Knowledge-Based Approach for Urban Land-Cover Discrimination," IEEE Trans. on Geoscience and Remote Sensing, Vol. GE-25, No. 3, pp. 272-282, 1987.

[Zhu86]

[Win84] Winston, P.
H., Artificial Intelligence. Addison-Wesley, Menlo Park, California, 1984.

Zhu, M. L. and Yeh, P. S., "Automatic Road Network Detection on Aerial Photographs," Proc. IEEE Computer Vision and Pattern Recognition Conference, Miami Beach, pp. 34-40, June 1986.

[Zuc76] Zucker, S. W., "Region Growing: Childhood and Adolescence," Computer Graphics and Image Processing, Vol. 5, No. 3, pp. 382-399, September 1976.

[Zuc77] Zucker, S. W., Hummel, R. A., and Rosenfeld, A., "An Application of Relaxation Labeling to Line and Curve Enhancement," IEEE Trans. on Computers, Vol. C-26, No. 4, pp. 394-403, April 1977.

[Zuc78] Zucker, S. W., Krishnamurthy, E. V., and Hart, R., "Relaxation Processes for Scene Labeling: Convergence, Speed, and Stability," IEEE Trans. Syst., Man, Cybern., Vol. SMC-8, pp. 41-48, Jan. 1978.