University Microfilms International, 300 N. Zeeb Road, Ann Arbor, MI 48106. Order no. 8308952.

Hill-Rowley, Richard. AN EVALUATION OF DIGITAL LANDSAT CLASSIFICATION PROCEDURES FOR LAND USE INVENTORY IN MICHIGAN. Ph.D., Michigan State University, 1982.

AN EVALUATION OF DIGITAL LANDSAT CLASSIFICATION PROCEDURES FOR LAND USE INVENTORY IN MICHIGAN

By

Richard Hill-Rowley

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Geography

1982

ABSTRACT

AN EVALUATION OF DIGITAL LANDSAT CLASSIFICATION PROCEDURES FOR LAND USE INVENTORY IN MICHIGAN

By

Richard Hill-Rowley

The major research objective of this study was to evaluate the informational value of LANDSAT digital data in the context of providing land use data to users in Michigan.
Information on user needs was developed through an extensive survey focusing primarily on land use planners. Land use categories were identified and subsequently modified to be compatible with processing of LANDSAT data.

Three test sites, each characterized by a distinct land use type (agricultural, urban, and forest), were chosen to evaluate LANDSAT performance. For each of these sites, LANDSAT data for August 16, 1979 were classified by means of three commonly available algorithms (maximum likelihood, minimum distance-to-means, and grouping of cluster classes). Accuracy evaluation of the resulting classifications included examining the effects of generalization for geographic information system formats.

Conclusions derived from the study can be summarized in three major areas. Analysis of the land use categories identified by questionnaire responses, in conjunction with limitations imposed by the characteristics of the LANDSAT data, showed that the minimum land use requirements of the majority of planners surveyed could not be fulfilled by the classification procedures. The categories that remained, nevertheless, represent the dominant land use classes in each of the areas selected. Accurate updating and/or remapping of these categories would be of considerable value.

Classification accuracies for the urban and forest test sites approached the 75% accuracy standard used in the study. Accuracies for the agricultural site were significantly lower. Variations were present among algorithms: the unsupervised grouping of cluster classes was substantially superior in the agricultural site, maximum likelihood was the most accurate in the urban site, and minimum distance-to-means was preferable in the forest site. Despite this distribution, no significant difference existed at the 95% confidence level between the overall performance of the algorithms.
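Of the three algorithms compared, minimum distance-to-means has the simplest decision rule: each pixel is assigned to the training class whose mean spectral vector is nearest in Euclidean distance. The sketch below is purely illustrative (the study itself used 1982-era mainframe software; the function name and the toy two-band values are hypothetical, not taken from the dissertation):

```python
import math

def min_distance_classify(pixels, class_means):
    """Assign each pixel (a sequence of band values) to the index of
    the nearest class mean, by Euclidean distance in spectral space."""
    labels = []
    for pix in pixels:
        dists = [math.dist(pix, mean) for mean in class_means]
        labels.append(dists.index(min(dists)))
    return labels

# Toy 2-band example: hypothetical "water" and "vegetation" mean vectors
means = [(10.0, 5.0),    # class 0: low reflectance in both bands
         (40.0, 60.0)]   # class 1: high near-infrared response
pixels = [(12.0, 6.0), (38.0, 55.0)]
print(min_distance_classify(pixels, means))  # -> [0, 1]
```

The rule ignores the spread of each training class around its mean, which is why it is cheaper but often less accurate than maximum likelihood.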
Accuracy analysis of the generalized LANDSAT classes, when compared with geocoded land use types derived from airphoto interpretation, resulted in a substantially lower accuracy than the same comparison between actual aerial photography and the pixel-based LANDSAT classification. The extent of this accuracy loss appears to be directly related to the diversity and spatial complexity of the test site being classified.

To Lilian, Frank and Elizabeth

ACKNOWLEDGEMENTS

This work could not have been completed without the emotional, logistical and financial support of my wife, Christine. There is no way for me to adequately express my appreciation for her efforts and sacrifices. Many other people contributed time and effort to help me complete the study, and to them I wish to express my sincere thanks.

John Baleja, now with ESL, Inc., provided invaluable support with the computer aspects of the study. His understanding and patient work on my behalf went far beyond his formal responsibilities and allowed me to overcome what, at times, appeared to be never-ending problems.

Professor Dieter Brunnschweiler, my major advisor, provided encouragement and counsel throughout the study. His friendship over many years has been a source of both personal and professional support for which I am most grateful. The other members of my committee, Professors Schultink, Groop and Chubb, each reviewed the manuscript in a careful, thorough fashion and made valuable suggestions, as did Professor Mezga, the Dean's representative.

The work was completed at the Center for Remote Sensing, where Dave Lusch, Bill Enslin and Andy Zusmanis contributed important technical assistance. Elizabeth Bartels, secretary at the Center, helped me bring the manuscript to fruition and generously provided those essential services so necessary to completing a project like this. Logistical support was provided by NASA Grant # NGL 23-00U-083.
Mike Scieszka from the Division of Land Resource Programs, Michigan Department of Natural Resources, was very supportive of the study at its inception and arranged assistance from the PA 204 program in developing and distributing the Michigan Data Needs Questionnaire. The late Ron Webster from Land Resource Programs also contributed his talents during the preparation of the questionnaire.

Finally, thanks are due to all those people who have been associated with the Remote Sensing Project and the Center for Remote Sensing over the last eight years. From them I have learned much about remote sensing, and many other things.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Chapter
I. INTRODUCTION
   Geographic Framework
   Applications Context
   Objectives
II. THE MICHIGAN DATA NEEDS QUESTIONNAIRE
   Introduction
   Information Components
   Information Characteristics
   Questionnaire Design
   Response to the Questionnaire
   Analysis of Questionnaire Responses
      Land Use Category Identification
      Technical Characteristics of Information Requirements
   Conclusion
III. DIGITAL CLASSIFICATION OF LANDSAT DATA
   Introduction
   Data Collection
   Data Interpretation
   Methods of Analysis
      Data Reformatting and Pre-Processing
      Definition of Training Statistics
      Computer Classification of Data
      Information Display and Tabulations
      Evaluation of Results
IV. EXECUTION OF THE LANDSAT TEST
   Introduction
   Category Reduction
   Study Area Selection and Classification Materials
      Study Area Selection
      LANDSAT Data Acquisition
      Ground Verification
   Training Set Selection
      Training Set Selection Procedure
      Training Set Delineation
   Classification
   Summary and Conclusions
V. EVALUATION OF CLASSIFICATION RESULTS
   Introduction
   Procedures
   Classification Performance
   Generalized Classification Results
VI. CONCLUSIONS
   Introduction
   Category Definition
   Testing LANDSAT Classifications
   Generalization to Geographic Information System Formats
   Current Applicability and Future Developments

APPENDICES
A1. THE MICHIGAN RESOURCE INVENTORY ACT (PA 204)
A2. THE MICHIGAN DATA NEEDS QUESTIONNAIRE
A3. AGENCIES RESPONDING TO THE MICHIGAN DATA NEEDS QUESTIONNAIRE
A4. MICHIGAN REFERENCE MAP
A5. CURRENT USE INVENTORY: SUMMARY OF RESPONSES BY USER GROUP
A6. TABULATION OF RESPONSES TO INVENTORY CHARACTERISTICS
B. TRAINING SET SELECTION PROCEDURE
C1. REVIEW OF ACCURACY TESTING PROCEDURES
C2. INDIVIDUAL CATEGORY ACCURACY ANALYSIS
D. COMPARING CLASSIFICATION PERFORMANCE
E. CLASSIFICATION ERROR MATRICES FOR MAP AND GEOCODED ACCURACY PROCEDURES

BIBLIOGRAPHY

LIST OF TABLES

1. Current Use Inventory Categories Included in the Michigan Data Needs Questionnaire
2. Resource Information Categories Included in the Michigan Data Needs Questionnaire
3. Information Characteristics Options Included in the Michigan Data Needs Questionnaire
4. Rate of Response to the Michigan Data Needs Questionnaire
5. Categories Selected by Questionnaire Respondents: The Total Response Group
6. Categories Selected by Questionnaire Respondents: The Planner Sub-Group
7. Categories Selected by the Planner Sub-Group: Region 1 (Southern Lower Peninsula)
8. Categories Selected by the Planner Sub-Group: Region 2 (Northern Lower Peninsula)
9. Categories Selected When Questionnaire Respondents are Ordered by Mean Rank Score: The Total Response Group
10. Categories Selected When Questionnaire Respondents are Ordered by Mean Rank Score: The Planner Sub-Group
11. Cumulative Percentages for Selected Categories: The Planner Sub-Group
12. Classification Categories
13. Classification Categories
14. Category Rationalization Process
15. Training Categories: Agricultural Test Site
16. Training Categories: Urban Test Site
17. Training Categories: Forest Test Site
18. Final Training Areas: Agricultural Test Site
19. Final Training Areas: Urban Test Site
20. Final Training Areas: Forest Test Site
21. Accuracy Comparison of Test Sites
22. Classification Error Matrix for the Agricultural Test Site: Maximum Likelihood Classification (Ground Verification Source: Aerial Photography)
23. Classification Error Matrix for the Agricultural Test Site: Minimum Distance-to-Means Classification (Ground Verification Source: Aerial Photography)
24. Classification Error Matrix for the Agricultural Test Site: Grouping of Cluster Classes (Ground Verification Source: Aerial Photography)
25. Classification Error Matrix for the Urban Test Site: Maximum Likelihood Classification (Ground Verification Source: Aerial Photography)
26. Classification Error Matrix for the Urban Test Site: Minimum Distance-to-Means Classification (Ground Verification Source: Aerial Photography)
27. Classification Error Matrix for the Urban Test Site: Grouping of Cluster Classes (Ground Verification Source: Aerial Photography)
28. Classification Error Matrix for the Forest Test Site: Maximum Likelihood Classification (Ground Verification Source: Aerial Photography)
29. Classification Error Matrix for the Forest Test Site: Minimum Distance-to-Means Classification (Ground Verification Source: Aerial Photography)
30. Classification Error Matrix for the Forest Test Site: Grouping of Cluster Classes (Ground Verification Source: Aerial Photography)
31. Comparative Test Site Accuracies
32. Accuracy Loss in Percentage Points Between Photo Accuracy and Geocoded Accuracy
33. Geocoded Ground Verification Comparisons
C2.1. Individual Category Accuracies (%): Agricultural Test Site (Ground Verification Source: Aerial Photography)
C2.2. Individual Category Accuracies (%): Urban Test Site (Ground Verification Source: Aerial Photography)
C2.3. Individual Category Accuracies (%): Forest Test Site (Ground Verification Source: Aerial Photography)
E.1. Classification Error Matrix for the Agricultural Test Site: Maximum Likelihood Classification (Ground Verification Source: Delineated Land-Use Map)
E.2. Classification Error Matrix for the Agricultural Test Site: Minimum Distance-to-Means Classification (Ground Verification Source: Delineated Land-Use Map)
E.3. Classification Error Matrix for the Agricultural Test Site: Grouping of Cluster Classes (Ground Verification Source: Delineated Land-Use Map)
E.4. Classification Error Matrix for the Urban Test Site: Maximum Likelihood Classification (Ground Verification Source: Delineated Land-Use Map)
E.5. Classification Error Matrix for the Urban Test Site: Minimum Distance-to-Means Classification (Ground Verification Source: Delineated Land-Use Map)
E.6. Classification Error Matrix for the Urban Test Site: Grouping of Cluster Classes (Ground Verification Source: Delineated Land-Use Map)
E.7. Classification Error Matrix for the Forest Test Site: Maximum Likelihood Classification (Ground Verification Source: Delineated Land-Use Map)
E.8. Classification Error Matrix for the Forest Test Site: Minimum Distance-to-Means Classification (Ground Verification Source: Delineated Land-Use Map)
E.9. Classification Error Matrix for the Forest Test Site: Grouping of Cluster Classes (Ground Verification Source: Delineated Land-Use Map)
E.10. Classification Error Matrix for the Agricultural Test Site: Aggregated Maximum Likelihood Classification (Ground Verification Source: Geocoded Land-Use Map)
E.11. Classification Error Matrix for the Agricultural Test Site: Aggregated Minimum Distance-to-Means Classification (Ground Verification Source: Geocoded Land-Use Map)
E.12. Classification Error Matrix for the Agricultural Test Site: Aggregated Grouping of Cluster Classes (Ground Verification Source: Geocoded Land-Use Map)
E.13. Classification Error Matrix for the Urban Test Site: Aggregated Maximum Likelihood Classification (Ground Verification Source: Geocoded Land-Use Map)
E.14. Classification Error Matrix for the Urban Test Site: Aggregated Minimum Distance-to-Means Classification (Ground Verification Source: Geocoded Land-Use Map)
E.15. Classification Error Matrix for the Urban Test Site: Aggregated Grouping of Cluster Classes (Ground Verification Source: Geocoded Land-Use Map)
E.16. Classification Error Matrix for the Forest Test Site: Aggregated Maximum Likelihood Classification (Ground Verification Source: Geocoded Land-Use Map)
E.17. Classification Error Matrix for the Forest Test Site: Aggregated Minimum Distance-to-Means Classification (Ground Verification Source: Geocoded Land-Use Map)
E.18. Classification Error Matrix for the Forest Test Site: Aggregated Grouping of Cluster Classes (Ground Verification Source: Geocoded Land-Use Map)

LIST OF FIGURES

1. Distribution of Questionnaire Responses (for County Names see Appendix A4)
2. Location of LANDSAT Bands Within the Electromagnetic Spectrum (after Lusch, 1982)
3. Elements of a Pattern Recognition System
4. A. Generalized Spectral Curves for Representative Land-Cover Types (Hoffer, 1978; Landgrebe, 1973); B. Representative Land-Cover Types Plotted in 2-Dimensional Space (LANDSAT Bands 5 and 7)
5. Minimum Distance-to-Means Classification Strategy (after Lillesand and Kiefer, 1979)
6. Parallelepiped Classification Strategy (after Lillesand and Kiefer, 1979)
7. Equal Probability Contours Defined by a Maximum Likelihood Classifier (after Lillesand and Kiefer, 1979)
8. Decision Points Based on Probability and Mean Values (after Jayroe et al., 1976)
9. Topographic Map Display of the Agricultural Test Site
10. Black-and-White Mosaic Display of the Agricultural Test Site
11. False Color Composite of LANDSAT Data for the Agricultural Test Site Displayed on an RGB Monitor (Color Gun Assignments: MSS Band 4 - Blue, MSS Band 5 - Green, MSS Band 7 - Red)
12. Topographic Map Display of the Urban Test Site
13. Black-and-White Mosaic of the Urban Test Site
14. False Color Composite of LANDSAT Data for the Urban Test Site Displayed on an RGB Monitor (Band Assignments - see Figure 11)
15. Topographic Map Display of the Forest Test Site
16. Black-and-White Mosaic of the Forest Test Site
17. False Color Composite of LANDSAT Data for the Forest Test Site Displayed on an RGB Monitor (Band Assignments - see Figure 11)
18. Location of the Test Sites Within the LANDSAT Scene (Path 23, Row 30) and the Centers of the Scenes That Cover Michigan
19. Identification of a Training Area Representing Water on Aerial Photography
20. The Water Training Area Delineated on a LANDSAT Display (2X Magnification), with Accompanying Histograms for 4 LANDSAT Bands
21. An Alarmed LANDSAT Display for the Water Training Area
22. Dot-Matrix Printer Display with Alphanumeric Symbols Representing Land Use Classes: Maximum Likelihood Classification for a Portion of the Urban Test Site
23. Simple Flow Chart of Study Procedures
24. Rectified False Color Composite of LANDSAT Data for the Agricultural Test Site Displayed on an RGB Monitor (Band Assignments - see Figure 11)
25. Land Use Classes for the Agricultural Test Site Derived from Maximum Likelihood Classification Displayed on an RGB Monitor
26. Land Use Classes for the Agricultural Test Site Derived from Minimum Distance-to-Means Classification Displayed on an RGB Monitor
27. Land Use Classes for the Agricultural Test Site Derived from a Grouping of Cluster Classes Displayed on an RGB Monitor
28. Rectified False Color Composite of LANDSAT Data for the Urban Test Site Displayed on an RGB Monitor (Band Assignments - see Figure 11)
29. Land Use Classes for the Urban Test Site Derived from Maximum Likelihood Classification Displayed on an RGB Monitor
30. Land Use Classes for the Urban Test Site Derived from Minimum Distance-to-Means Classification Displayed on an RGB Monitor
31. Land Use Classes for the Urban Test Site Derived from a Grouping of Cluster Classes Displayed on an RGB Monitor
32. Rectified False Color Composite of LANDSAT Data for the Forest Test Site Displayed on an RGB Monitor (Band Assignments - see Figure 11)
33. Land Use Classes for the Forest Test Site Derived from Maximum Likelihood Classification Displayed on an RGB Monitor
34. Land Use Classes for the Forest Test Site Derived from Minimum Distance-to-Means Classification Displayed on an RGB Monitor
35. Land Use Classes for the Forest Test Site Derived from a Grouping of Cluster Classes Displayed on an RGB Monitor
36. Geocoded Ground Verification Data for the Agricultural Test Site Displayed on an RGB Monitor
37. Land Use Classes for the Agricultural Test Site Derived from Aggregated Maximum Likelihood Classification Displayed on an RGB Monitor
38. Land Use Classes for the Agricultural Test Site Derived from Aggregated Minimum Distance-to-Means Classification Displayed on an RGB Monitor
39. Land Use Classes for the Agricultural Test Site Derived from an Aggregated Grouping of Cluster Classes Displayed on an RGB Monitor
40. Geocoded Ground Verification Data for the Urban Test Site Displayed on an RGB Monitor
41. Land Use Classes for the Urban Test Site Derived from Aggregated Maximum Likelihood Classification Displayed on an RGB Monitor
42. Land Use Classes for the Urban Test Site Derived from Aggregated Minimum Distance-to-Means Classification Displayed on an RGB Monitor
43. Land Use Classes for the Urban Test Site Derived from an Aggregated Grouping of Cluster Classes Displayed on an RGB Monitor
44. Geocoded Ground Verification Data for the Forest Test Site Displayed on an RGB Monitor
45. Land Use Classes for the Forest Test Site Derived from Aggregated Maximum Likelihood Classification Displayed on an RGB Monitor
46. Land Use Classes for the Forest Test Site Derived from Aggregated Minimum Distance-to-Means Classification Displayed on an RGB Monitor
47. Land Use Classes for the Forest Test Site Derived from an Aggregated Grouping of Cluster Classes Displayed on an RGB Monitor
C1.1. A Stratified Systematic Unaligned Sample Grid

CHAPTER I

INTRODUCTION

Geographic Framework

"Interest in planning for the classification and use of the land and other natural resources is not new to our science; in fact it is one of the most persistent interests in American geography."

This quotation from a presidential address to the Association of American Geographers in 1936 (Colby, 1936) introduces a chapter on "Applied Geography" in James's history of geographical ideas (James, 1972). It is a statement that would not be seriously challenged today, more than forty years later; and because of that, it reinforces the idea of land use as a persistent theme in geographical inquiry.

There is, of course, an ebb and flow of themes in the development of a discipline. During the 1960s, geography rapidly embraced mathematical and statistical techniques, and the scientific method dictated by a logical positivist approach (Taaffe, 1974). Emphasis was placed on theoretical concerns while the applied aspects of the discipline, of which land use was a part, were in relative decline. Applied geography is now re-emerging with renewed strength, and proponents have even begun to suggest the possibility of an applied geography paradigm for the discipline (Frazier, 1978). Two major forces are pushing this move into the applied realm. The positivist approach, while contributing much to research practice, has been challenged by other value-oriented philosophical approaches (Buttimer, 1974), and a product of this discussion has been the conclusion that the discipline has not addressed contemporary problems in a relevant way. Frazier (1978) identifies Berry (1973) and King (1976) as previous advocates of the positivist approach who now see theoretical approaches inhibiting the development of a geography that is socially relevant and that can deal with applied questions.

The second reason has a more practical origin, but complements the intellectual argument. A shift towards professionalism acknowledges the need for professional geographers to compete in the non-academic job market (Harrison and Larsen, 1977; Pryde, 1978). This, in turn, requires orientation to areas of expertise such as analysis of land use and environmental relationships (Morrill, 1975) and the use of theories and methodologies in solving "real-world" problems (Stutz, 1980). Applied geography and land use studies are thus re-emerging as a dominant theme in the field.

The work reviewed by James documents a number of the concepts and objectives identified during the 1920s and 1930s which are still appropriate when viewed in the context of present land use patterns and problems. This can be illustrated by considering a major resource inventory program carried out in Michigan at that time under the auspices of the Michigan Department of Conservation by P.S. Lovejoy, Carl Sauer, and others, known as the Michigan Land Economic Survey (MLES), and its present-day parallel, the Michigan Land Resource Inventory Program (PA 204).

There are three themes that can be identified in the Michigan Land Economic Survey which are still valid despite a changed context. The survey was a response to a severe and widespread land use problem in northern Michigan (Barnes, 1929).
Lumbering and fire had removed the forest cover and much of the land had proved unsuitable for agriculture. Large areas had reverted to state ownership because of tax delinquency, and the maintenance of roads and schools was becoming a major burden on the state treasury. In an attempt to resolve the situation, Sauer proposed a mapping program so that the conflicting interests involved - those of the loggers, the farmers, the hunting and fishing groups and the early recreational entrepreneurs - would understand the differences in the cutover lands and plan future development more effectively. The major tool in this scheme was "... a type of map that attempts to cover the field of pure geography, the so-called economic map" (Sauer, 1919, p. 47). It was in fact a "geographic map" that was designed to "portray the use to which the entire land surface is put." A crude classification system was proposed which is similar to the first level of differentiation in a classification scheme widely used today (Anderson et al., 1976).¹

Current pressures on the land resource have a more complex etiology, and the nature of these pressures varies throughout the state. Multiple demands on forest resources in the Upper Peninsula and northern Lower Peninsula are substantial. The forest products industry is expanding, increasing the demand for wood. Demand for wood as a domestic heating source, and possibly as fuel for electricity generation, also reinforces the need to carefully manage the use of this resource. Other concerns, such as the monitoring of extractive-industry activities in oil and gas, and sand and gravel, in addition to accommodating increased recreational demands, contribute to the complexity of the situation. In the southern Lower Peninsula the symbiotic dangers of expanding urban-oriented uses and the loss of prime agricultural land have created land use pressures that are common throughout the country. Although a number of these problems have been addressed by state legislative action, an integrating state land use map is not yet available to put options into perspective. The Michigan Resource Inventory Act (PA 204), which is now in its implementation stage (1982), has a "current use inventory" as one of its primary provisions. The land use map is still a critical information source when conflicting forces have to be reconciled to allow a wise utilization of limited land resources.

¹The MLES classification includes: 1) barrens, 2) woodlands, 3) permanent pastures and meadows, 4) cultivated lands, and 5) town sites; compared with the Level I classification in Table 1.

A second theme clearly apparent in the original Michigan Land Economic Survey is experimentation with different methodologies for land inventory. James (1972) makes this clear in a description of the survey, noting that the work was done before the fractional code mapping of V.C. Finch. This concern with developing new techniques is confirmed by Leighly's thoughtful account of participation in the work as a graduate student (Leighly, 1979).

Land use mapping today is also characterized by experimentation in methods, partly in direct response to the rapid development of remote-sensing technology over the last two decades. Among the many advances that took place during the early part of this period, two had a major impact on the development of operational land use mapping (Peplies and Keuper, 1975). These advances were the common use of color-infrared films in aerial cameras and the use of high-altitude platforms, such as the RB57 aircraft by NASA, to acquire high-resolution, small-scale imagery. In the early 1970s, a major demonstration of the applicability of small-scale color-infrared aerial photography to land use mapping was undertaken by the U.S. Geological Survey (USGS) on the Central Atlantic Regional Ecological Test Site (CARETS). Land use information was gathered for the Norfolk-Portsmouth SMSA in southeastern Virginia (Alexander, 1975). The results were very successful and prompted the development of the national land use and land cover mapping program of the USGS (Anderson, 1977). A nationwide program for high-altitude aerial photography was also proposed (Ackerman and Alexander, 1975) which, after an initial delay, finally gained approval (Antill and Gockowski, 1979). Photography from this program is now available for some parts of Michigan and other states (1982), but it has not proceeded as rapidly as was first planned because of funding constraints.

Another major development in remote sensing technology has been the LANDSAT satellites. Visual analysis of early LANDSAT color composites was used in a comparative study as part of the CARETS project, but the major emphasis, especially in recent years, has been on digital processing of LANDSAT data available on computer-compatible tapes. A substantial number of studies investigating the potential of LANDSAT digital processing for gathering land use information have been conducted, with quite varied results (Castruccio, 1978). What has been absent from these studies is a comparison of how various classification algorithms perform, and a systematic approach to defining which categories must be mapped prior to the actual processing. Experimentation will continue with testing data from new satellite systems, such as LANDSAT 4, which will be of higher spatial and spectral resolution.

The third theme is the integration of land use information with other information such as soils, land ownership, and population
In the early 1970s, a major demonstration of the applicability of small-scale color-infrared aerial photography to land use mapping was undertaken by the U.S. Geological Survey (USGS) on the Central Atlantic Regional Ecological Test Site (CARETS). Land use information was gathered for the Norfolk-Portsmouth SMSA in southeastern Virginia (Alexander, 1975). The results were very successful and prompted the development of the national land use and land cover mapping program of the USGS (Anderson, 1977). A nationwide program for high-altitude aerial photography was also proposed (Ackerman and Alexander, 1975) which, after an initial delay, finally gained approval (Antill and Gockowski, 1979). Photography from this program is now available for some parts of Michigan and other states (1982), but it has not proceeded as rapidly as was first planned because of funding constraints.

Another major development in the progress of remote sensing technology has been the development of the LANDSAT satellites. Visual analysis of early LANDSAT color composites was used in a comparative study as part of the CARETS project, but the major emphasis, especially in recent years, has been in digital processing of LANDSAT data available on computer-compatible tapes. A substantial number of studies investigating the potential of LANDSAT digital processing for gathering land use information have been conducted, with quite varied results (Castruccio, 1978). What has been absent from these studies is a comparison of how various classification algorithms perform and a systematic approach to defining which categories must be mapped, prior to the actual processing. Experimentation will continue with testing data from new satellite systems, such as LANDSAT 4, which will be of higher spatial and spectral resolution.

The third theme is the integration of land use information with other information such as soils, land ownership, and population characteristics.
Sauer (1921) used these information sets to establish agricultural productivity classes, DeVries (1928) to associate economic conditions with land cover type, and Schoenmann (1931) to establish rural zoning. These ideas are conceptually similar to those behind the establishment of a geographic information system for Michigan under PA 204. Computer technology facilitates an expansion of the areas that can reasonably be considered by a geographic information system and allows an analyst to employ more variables in a statistical framework, but the objectives of this analysis tool are similar to those of the original MLES study.

Applications Context

Specific questions which have to be addressed in order to facilitate the implementation of PA 204 are:

i) A clear indication of the categories of land use information that are required by planners and resource managers under the provisions of the Act. A number of general areas of concern were included in the text of PA 204; however, it was necessary to operationalize these concerns in terms of specific alternatives and ascertain from the user group the importance of these alternatives. It was decided that the best way to obtain this information was through a survey questionnaire allowing the users to respond directly to a range of alternatives.

ii) The land-use inventory associated with PA 204 is to be completed using the 1:24,000 color infrared photography acquired by the Michigan Department of Natural Resources in 1977/78. An update schedule is specified in the Act, but it seems unlikely at this time that statewide aerial photography will be available to complete this task in 3 to 5 years. LANDSAT provides repetitive coverage and has been used in land use mapping applications with varied levels of success.
For meaningful advanced planning it is important, therefore, to evaluate how accurately LANDSAT can provide information on the land use categories included in the PA 204 inventory and, more specifically, which of the classification methodologies available give the best results.2 Comparative studies of the various classification procedures used in digital LANDSAT processing have been limited and until very recently have not addressed general land use and land cover inventory (Nelson et al., 1981). Perhaps more importantly, the comparative studies have taken advantage of advanced, sophisticated processing systems. In recent years a number of state government agencies have had the opportunity to work with NASA personnel at their regional facilities, using sophisticated technology and available technical assistance. This technology is being transferred to users and agencies. During the next decade, however, federal support for land-based remote sensing is likely to be substantially reduced. Agency users will thus have to deal with their own, or with university, LANDSAT digital processing capacity which is, in most cases, more rudimentary.

2A comprehensive evaluation of LANDSAT accuracy would involve two aspects. It is first necessary to test the technical feasibility of attaining acceptable land use information. If the results of this work are positive, it is then necessary to consider the cost of completing such an analysis. This is an operational stage which enables an analyst to compare LANDSAT performance with results from alternative data sources and interpretation methodologies. In an operational sense, accuracy is only relevant when associated with cost information pertaining to its achievement. The work reported in this study deals only with testing the technical feasibility of using LANDSAT to provide land use information.

This study is an experiment in using the capabilities of
a microcomputer-based analysis system to replicate procedures available in more advanced systems. These systems are a new development in the digital processing of LANDSAT data which, because of their reduced cost, can potentially balance the reduced availability of sophisticated NASA facilities.

Colwell (1979) has noted that rapid changes in remote sensing methods since 1960 have led to heavy resistance by potential users to the acceptance of new technology. The demand for timely and accurate resource information by users is nevertheless great, and comparative studies which can show the capabilities of these technologies in meeting user-defined needs will be critical in breaking down or confirming this resistance.

iii) An important component of the provisions included in PA 204 is the creation of a geographic information system to manage the land use data and allow it to be used in conjunction with other resource information for land management tasks. Information is stored in the information system on the basis of grid cells and in this case, as with many others, the cell size has been designated as 10 acres. A land use inventory derived from a high-resolution data source such as 1:24,000 aerial photography must be generalized considerably to fit the constructs of an information system of this type. Mapping takes place with a minimum type-size of 10 acres, simplifying the land use pattern by omitting those types which cover less than 10 acres. A second level of generalization occurs when the data are digitized or geocoded into grid cells. LANDSAT is a generalized data source to begin with, having an approximate resolution element of 1.1 acres. When these data are classified and similarly generalized to 10 acres, it seems reasonable to hypothesize that the generalized land use from aerial photography and LANDSAT may be comparable, even if the accuracy of the LANDSAT classification, when compared to the aerial photography, is not particularly high.
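The ten-acre generalization described above can be illustrated with a short sketch. This is a minimal, hypothetical example, not the study's actual software: per-pixel class codes are aggregated to coarser grid cells by majority rule, with a 3 x 3 pixel block standing in for the roughly nine 1.1-acre LANDSAT resolution elements that fall within a 10-acre cell.

```python
from collections import Counter

import numpy as np

def majority_aggregate(labels: np.ndarray, block: int) -> np.ndarray:
    """Generalize a per-pixel class map to coarser cells by majority rule.

    labels: 2-D array of integer class codes (one code per pixel).
    block:  pixels per cell side (3 gives 3x3 = 9 pixels per cell,
            roughly 10 acres for ~1.1-acre LANDSAT pixels).
    """
    rows, cols = labels.shape
    out = np.empty((rows // block, cols // block), dtype=labels.dtype)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            cell = labels[i * block:(i + 1) * block, j * block:(j + 1) * block]
            # The most common class code within the cell wins.
            out[i, j] = Counter(cell.ravel().tolist()).most_common(1)[0][0]
    return out

# Tiny invented example: a 6x6 pixel class map aggregated to 2x2 cells.
pixels = np.array([
    [1, 1, 1, 2, 2, 2],
    [1, 1, 4, 2, 2, 2],
    [1, 1, 1, 2, 4, 2],
    [4, 4, 4, 3, 3, 3],
    [4, 4, 1, 3, 3, 3],
    [4, 4, 4, 3, 1, 3],
])
print(majority_aggregate(pixels, 3))  # each cell keeps its dominant class
```

The sketch also shows why the hypothesis is plausible: isolated misclassified pixels (the stray 4s and 1s above) disappear when the dominant class in each cell is retained.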
Objectives

The major research objective of this study is an evaluation of the informational value of LANDSAT digital data in the context of providing land use information to users in Michigan. In developing this objective, the three themes identified in previous work create a framework for the study. They are elaborated in three sub-objectives, the specific elements of which are given a context by the requirements of PA 204:

i) Determination of the land use and land cover categories required by planners in Michigan. This is accomplished through analysis of responses to the Michigan Data Needs Questionnaire, which was designed and implemented as part of the study.

ii) An evaluation of the effectiveness of commonly available digital classification procedures in obtaining the categories defined above. For representative test sites in Michigan, land use categories are derived using maximum likelihood, minimum distance-to-means and clustering classification procedures. The accuracy of these classifications is tested through comparison with 1:24,000 scale color infrared aerial photography.

iii) Testing the effect on accuracy of converting digitally generated LANDSAT land use classifications to a geographic information system format. This involves aggregating the LANDSAT classification data and cross-tabulating them with geocoded ground verification data sets prepared from a delineated land use map.

Several areas of investigation were pursued to derive conclusions relative to these sub-objectives, and the dissertation is organized accordingly. Initially, analysis of the questionnaire responses provides information necessary to establish a land use classification scheme. The theoretical alternatives presented by digital remote sensing are then discussed, and this leads into a consideration of the alternatives which can be realistically tested in the operational environment available for the study.
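Of the three procedures named in sub-objective ii, minimum distance-to-means is the simplest to sketch: each pixel's spectral vector is assigned to the training class whose mean vector is nearest in Euclidean distance. The sketch below is illustrative only; the class names, band values, and class means are invented, not taken from the study's training data.

```python
import numpy as np

def min_distance_classify(pixels: np.ndarray, class_means: dict) -> list:
    """Assign each pixel's spectral vector to the class with the nearest mean.

    pixels:      (n, bands) array of spectral vectors (e.g. 4 MSS bands).
    class_means: class name -> (bands,) mean vector derived from training fields.
    """
    names = list(class_means)
    means = np.stack([class_means[n] for n in names])   # shape (k, bands)
    # Euclidean distance from every pixel to every class mean.
    dists = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return [names[i] for i in dists.argmin(axis=1)]

# Hypothetical 4-band mean vectors for three training classes.
means = {
    "urban":  np.array([60.0, 55.0, 50.0, 45.0]),
    "forest": np.array([25.0, 20.0, 70.0, 80.0]),
    "water":  np.array([30.0, 15.0, 10.0,  5.0]),
}
obs = np.array([[58.0, 52.0, 49.0, 47.0],
                [28.0, 14.0, 12.0,  6.0]])
print(min_distance_classify(obs, means))  # nearest means: urban, then water
```

Maximum likelihood differs from this sketch in that it also weights each class by the covariance of its training samples rather than using raw Euclidean distance.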
It is next required to mesh the procedures and the data selected with the categories of the classification scheme, in order to conduct an actual LANDSAT classification test for carefully defined study areas. Specific conclusions come from a comparative accuracy evaluation of the results of this test, and they are ultimately broadened to address the major research objective of the study: the overall informational value of LANDSAT digital data for land use mapping.

CHAPTER II
THE MICHIGAN DATA NEEDS QUESTIONNAIRE

Introduction

In order to adequately test the capabilities of LANDSAT digital interpretation in providing land use information that would be useful in a planning context, it was first necessary to determine which categories were considered important by the planning community. This task was accomplished through the use of a questionnaire designed to elicit responses from professional planners, resource managers, and other individuals or interested groups concerning their land use data needs and the technical specifications required for these data with respect to collection methodology and modes of presentation.

This type of information, along with other inventory-oriented information, was also required by the Division of Land Resource Programs within the Michigan Department of Natural Resources. A project design study preparatory to implementation of the Michigan Resource Inventory Act (PA 204, 1979) had been initiated, and a requirement of this study was to define the specific categories and methods to be used in a statewide inventory of current use and land resources.
(A copy of PA 204 is included as Appendix A1.) A questionnaire was designed to serve the purposes of the research objectives reported on in this dissertation and the requirements of the project design study.1

1Preparation and distribution of the questionnaire was supported by the Division of Land Resource Programs, Michigan Department of Natural Resources, and the Department of Resource Development, Michigan State University. Logistical support was provided by the Center for Remote Sensing, Michigan State University.

Information Components

The first task in the preparation of the questionnaire was to compile the potential data elements to be considered in the research task of defining land use data needs. These, then, had to be reconciled with the general categories of information specified in PA 204. In fact, expanding and specifying the Current Use Inventory section of PA 204 adequately covered the research need to define land use information requirements, so the task focused on operationalizing the general inventory specifications of the act. Four resource information categories were indicated:

i) Current Use Inventory. The categories listed in the act are broad, general groups rather than specific land use categories. In fact, users may require land use categories in several different forms, broken down into specific groups or levels of detail. This requires the use of a comprehensive land use classification system so that clearly defined, mutually exclusive categories can be identified and understood by producers and users alike.

A classification system which meets these criteria was initially proposed by Anderson (1971) for land use mapping from remotely sensed data sources, and was later adopted by the U.S. Geological Survey (USGS) for their land use mapping program. The system was subsequently refined and explained in further detail (Anderson et al., 1972; Anderson et al., 1976).
As its central concept it has categories defined at several levels of detail, with Level I being the most general and Level IV the most specific. Category recommendations were only made at Level I and Level II, and these categories were seen to have national applicability. Levels III and IV were discretionary levels to be defined by local or regional groups for their own purposes, incorporating the range of detailed cover types appropriate to that particular area.

The Michigan Land Cover/Use Classification System (1975) was designed by a group of concerned professionals under the sponsorship of the Department of Natural Resources. It incorporates Level I- and II-categories from the USGS system. Level III- and Level IV-categories are also defined in this system, with particular reference to the Michigan environment. The system has been used extensively for land use inventory in the state (Remote Sensing Project, 1976; Reed and Enslin, 1977).

Levels I, II, and III from the Michigan System have been used in the data needs questionnaire as the group of categories from which respondents could choose when defining their requirements for land use information. Level IV-information has been left out, except for six categories which further specify "Utilities" (#146) and have, in the past, proved to be important for regional planners (Remote Sensing Project, 1976). As a general rule, Level IV-categories are so specific and numerous that they are not included on land use maps or in their related information systems. The land use and land cover categories in the questionnaire are listed in Table 1. A number of categories from Level III were omitted or combined with others because they were not often used, or could easily be combined with other categories. Clearly, it was anticipated that any individual respondent would not identify a need for all of the categories, and the questionnaire instructions illustrated a possible response for categories in the
Clearly, it was anticipated that any individual respondent would not identify a need for all of the categories and the questionnaire instructions illustrated a possible response for categories in the Level I Level II Level III Level I Level II 1 Urban 11 111 113 114 115 119 H u lt l- fim lly re s id e n tia l S in g le -fa m ily re s id e n tia l S tr ip re s id e n tia l Nobile In m parks Other (please sp e cify) 2 A g ric u ltu re 21 1? Cow c r c U I . Services, In s titu tio n a l 12! 122 123 124 126 Primary/Central Business O is tr ic t Shopping centers S tr ip development Secondary Business O is tric t Other (please specify) 13 131 132 133 134 135 136 139 Primary metal production Petrochemicals Primary wood production Stone, c la y , glass fte ta l fa b ric a tio n Nonaetal fa b ric a tio n Other (please specify 141 142 143 144 145 146 A ir tran sportation K a il tran sportation Hater tran sportation Road tran sportation Communications U t il it ie s Residential In d u s tria l 14 Transportation, Commmlcatlon and U t il it ie s Cropland, Rotation and Pemanent Pasture 221 Tree f r u it s 222 B ru sh fru lts and vineyards 223 H o rtic u ltu re and nurseries 23 Confined Feeding Operations 231 Livestock 232 P oultry 29 Other A g ric u ltu ra l Land 291 Farmsteads 292 Greenhouses 299 Other (please specify) 3 Rangeland 31 Herbaceous Rangeland 311 Upland herbaceous rangeland 312 Lowland Shrub rangeland 4 Forest land 41 Broadleaf Forest 411 Upland hardwoods 412 Aspen/birch association 413 Lowland hardwoods 42 Coniferous Forest 421 422 43 Nixed C onifer* Broadleaved Forest E le c tric a l production and transmission 1462 Gas storage and transmission 1463 Petroleum storage and transmission 1464 s o lid waste disposal and tra n s fe r 1465 Sewage treatment and transmission 1466 Hater treatment and transmission 51 19 Open Space Upland conife rs Lowland conife rs 431 432 433 434 Upland hardwoods and pine Aspen/birch, c o n ife r Lowland hardwood w ith cedar, spruce Upland 
c o n ife r w ith maple/elm. aspen/birch 435 Lowland c o n ife r w ith maple/elm, aspen b irc h Streams and Waterways 52 Lakes 53 16 Hived Urban Land Use E xtractive 211 C u ltivated cropland 212 Hay, ro ta tio n and permanent pasture 22 Orchards. B ru s h fru lts , Vineyards and H o rtic u ltu re 1461 17 Level III Reservoirs 54 Great Lakes 171 Open p i t 172 Shaft 173 H ells 191 192 193 194 199 Outdoor c u ltu ra l ( 200s , e tc .) Outdoor assembly (d riv e -in s , e tc .) Outdoor recreation Cemeteries Other (please specify) 6 Wetlands 7 Barren Forested Wetlands 611 Wooded wetland 612 Shrub/scrub wetland 62 Nonforested Wetlands 621 Aquatic bed wetland 622 Emergent wetland 623 F lats 61 72 Beaches and Rlverbanks 73 Sand other than Beaches 731 Sand dunes 74 Bare Exposed Rock Table 1. Current Use Inventory Categories Included in the Michigan Data Needs Questionnaire 15 "urban" grouping. The general extent of urban land including all sub­ categories, may be an item of importance, although within the boun­ daries of this area the more specific categories of Residential and Industrial from Level II and Mobile Home Parks and Strip (commercial) Development from Level III might be indicated. This is a selection of five categories that are considered to be needed by the agency, out of a possible forty-five that make up the complete urban group. ii) Special Lands. In the language of PA 204 some functionally defined lands are included under the rubric of current use. They refer specifically to lands reserved or designated under other land resource-related Michigan legislation. Land enrolled in the Farmland and Open Space Preservation Act and the Commercial Forest Reserves Act is mentioned specifically. These groupings differ from the current use categories in that they are not necessarily homogeneous and often consist of several land cover types. 
In a practical sense this means that the methodologies used to inventory these types of lands differ from those used with current use, and they are often difficult to fit into a strict classification scheme. Other categories of information important in a comprehensive resource inventory are similar to areas subject to legislative mandates, such as flood plains and recreation areas, that together make up a group loosely defined as special lands (Table 2). The list is not comprehensive, and the respondent can add other categories that are considered important.

Table 2. Resource Information Categories Included in the Michigan Data Needs Questionnaire

Special Lands
  1. Ownership
  2. Recreation
  3. Historic and Archeological Sites
  4. Wildlife Preserves
  5. Floodplains
  6. Farmland and Open Space
  7. Commercial Forest Reserves
  8. Shorelands
  9. Wilderness and Natural Areas
  10. Natural Rivers
  11. Other Reserved Lands

Land Resource Data
  Geology:
    1. Bedrock Geology
    2. Glacial Geology
    3. Depth to Bedrock
  Soils:
    4. Texture
    5. Drainage Characteristics
    6. Permeability
    7. Slope
  Hydrology:
    8. Ground Water Availability
    9. Floodplains
  Climate:
    10. Normal Values
    11. Variability

Composite Indicators
  Limitations:
    1. Residential Development
    2. Commercial Development
    3. Industrial Development
    4. Solid Waste Disposal
    5. Septic Tanks
    6. Mining Activity
    7. Roads and Parking Lots
    8. Erosion Hazards and Subsidence
  Suitability:
    9. Agriculture
    10. Forestry
    11. Recreation
    12. Mineral Extraction
    13. Wildlife Habitat

iii) Land Resource Data. This group of categories refers to the major natural resources of an area which are important in most resource management decisions (Table 2). Again, the list is not comprehensive, but it does attempt to make more specific the general categories indicated in the land resources language of PA 204. Four groups are indicated: geology, soils, hydrology, and climate. Unlike
most of the categories in the previous two groups, which are used primarily in map formats, these categories are often used as component parts of a natural resources information system. Before they get to this stage, however, they may be compiled in map formats, and the characteristics which determine their applicability to various situations are similar to the other groups.

iv) Composite Indicators. Assessing the capacity of an area for some future land use, or evaluating whether its current use is best given a set of criteria, often involves the consideration of several resource attributes simultaneously. Data management systems with large geographic data bases allow resource managers to overlay and combine several variables in order to derive limitations for certain types of development and suitability indicators for others (McHarg, 1971; Hopkins, 1977). A number of suitability indicators were suggested in the language of PA 204, and others have been added to this list (Table 2). The next logical step would be to enquire as to the data elements considered important in making up the composite indicator, but such a task is beyond the scope of the basic questionnaire.

Information Characteristics

Category selection is only the first part of the data needs identification process. Equally important are a number of decisions on what can broadly be defined as the format of those data. The questionnaire identifies nine format components which a user should consider in choosing a particular information category. Several options are indicated for each of the nine components, and it is only through an evaluation of these options that a producer can decide how best to collect and present the categories of information selected. The nine components are: coverage, resolution, accuracy, scale, level of aggregation for tabular data, application areas, importance, frequency of use and up-dating cycle (Table 3).
Each of these components is applicable to the four groups of categories, but they are of most critical importance to the current use categories, as these data will be gathered from primary sources while most of the other data are assembled from other documentary sources. With documentary sources, the format allows the provider to select, combine and possibly modify or reject data that might be used incorrectly to support decision making. Using this information to direct primary data acquisition allows critical decisions on data source and methodology to be made in the clearest possible manner.

Questionnaire Design

The questionnaire was designed with a matrix format similar to that employed in a survey of data needs administered to government officials in the state of Oregon (Brooks, 1980). Respondents were asked to enter and/or select their category choices from the list on the left side of matrix boxes. If other choices were more appropriate, alternative selections could be written in or explained in a comments section. Once the categories were selected, the respondent then indicated one of the options for each of the nine information characteristic components (Table 3).

Table 3. Information Characteristics Options Included in the Michigan Data Needs Questionnaire

A - COVERAGE: What geographic coverage is required?
  1. Individual township
  2. County
  3. Multi-county area
  4. Scattered coverage with emphasis on lands with particular resources
  5. Specific regions, e.g., watersheds, recreation areas, etc. (please specify)
  6. Other (please specify)

B - RESOLUTION: What is the most appropriate type size (minimum sized unit to be mapped) for the inventory?
  1. Less than 1 acre
  2. 1 acre
  3. 2-5 acres
  4. 10 acres
  5. 40 acres (1/4 1/4 section)
  6. 160 acres (1/4 section)
  7. 640 acres (one square mile or 1 section)
  8. Other (please specify)

C - ACCURACY: What level of accuracy is required? Accuracy levels and costs are directly related. Higher accuracy levels can only be achieved with progressively higher costs. Please indicate the minimum accuracy level acceptable for your program needs.
  1. 95%
  2. 90%
  3. 85%
  4. 80%
  5. Other (please specify)

D - SCALE: If map products are required, indicate the most appropriate scale.
  1. 1:10,000 (1" = 833 feet)
  2. 1:15,840 (1" = 1,320 feet or 4" = 1 mile)
  3. 1:20,000 (1" = 1,666 feet)
  4. 1:24,000 (1" = 2,000 feet or same as USGS 7 1/2' Quads)
  5. 1:31,680 (1" = 2,640 feet or 2" = 1 mile)
  6. 1:50,000 (1" = 4,167 feet)
  7. 1:62,500 (1" = 5,208 feet or same as USGS 15' Quads)
  8. 1:63,360 (1" = 1 mile)
  9. 1:100,000
  10. 1:250,000
  11. Other (please specify)

E - NON-MAP FORMATS: If tabular data are required, please indicate appropriate level of aggregation.
  1. Section
  2. Township
  3. County
  4. Ownership Unit
  5. Management Unit
  6. Watershed
  7. River Basin
  8. Other (please specify)

F - APPLICATION:
  1. Farmland and open space preservation programs
  2. Forest management
  3. Wildlife habitat management and protection
  4. General planning and development planning (zoning, subdivision regulations, etc.)
  5. Tax equalization
  6. Management of urban development
  7. Siting decisions for industrial development
  8. Siting decisions for commercial development
  9. Siting decisions for residential development
  10. Siting decisions for recreational development
  11. Highway corridor decisions
  12. Implementation of the National Flood Insurance Act
  13. Environmental impact analysis
  14. Other (please specify)

G - IMPORTANCE: How important is the resource category to the activities of your unit?
  not very important 1 2 3 4 5 critical

H - FREQUENCY OF USE: How frequently would this data variable be used?
  1. Several times daily
  2. Daily
  3. Several times weekly
  4. Weekly
  5. Several times monthly
  6. Monthly
  7. Quarterly
  8. Several times annually
  9. Annually
  10. Bi-annually
  11. At least every 3-4 years
  12. At least once every 10 years
  13. Other (please specify)

I - UP-DATING CYCLE: How long are the data reliable before updating is needed?
  1. Less than 1 year
  2. Each year
  3. Every 2 years
  4. Every 5 years
  5. Every 10 years
  6. Other (please specify)
Again, new options in these components could be added, or completely new components could be used, if this was deemed necessary. Drafts of the questionnaire were subject to review within the Division of Land Resource Programs, the Center for Remote Sensing and the Department of Geography at Michigan State University. A final draft was agreed upon and type-set for reproduction and testing.2 Testing identified minor problems with layout and text type sizes, which were corrected, and the questionnaire was then prepared for mailing.3 A copy of the final questionnaire is included as Appendix A2.

2Test respondents were: Richard Harlow, Assistant Planner, Meridian Township Planning Department, Okemos, Michigan, and John Coleman, Chief Planner, Tri-County Regional Planning Commission, Lansing, Michigan.

3A cover letter, co-signed by the Division Chief of Land Resource Programs and the Coordinator of the Center for Remote Sensing, accompanied the questionnaire and encouraged survey respondents to complete the document and return it at their earliest convenience. Contingency plans were made for a follow-up mailing; however, it was decided that the initial response was large enough to make this unnecessary. Respondents were encouraged to telephone with any questions, and a considerable number of telephone enquiries were answered. This contact, and other informal contacts between personnel from the Division of Land Resource Programs and individual respondents, was all the follow-up that was required to obtain an acceptable response.

Response to the Questionnaire

Three groups were selected as potential respondents to the questionnaire:

i) Professional planners in the state with responsibility for land use issues. All levels of organization were included: regional, county, township, and city. Names and addresses were obtained from the Department of Natural Resources mailing list and a list of planning personnel published by the Michigan Office of Intergovernmental Affairs (1980);

ii) Chairpersons of County Planning Commissions without a planning staff. With the designation of the above two groups, someone within each Michigan county had the opportunity to respond to the questionnaire and thus have input into the category selection process. These respondents are the major target group for the survey and constitute the Planner Sub-Group which will be referred to in subsequent sections;

iii) A mixed group, representing various types of professional and citizen groups in agriculture, forestry, minerals, soils, and environmental protection.

Table 4 contains the full tabulation of responses. Overall response was 46% of the total questionnaires mailed (86 returns of 186 mailed). A substantially better response rate was achieved from the major target groups of regional and county planners. Responses were obtained from 66% of the county planners and 79% of the regional planners contacted. The distribution of these responses within Michigan is presented in Figure 1. The regional agency responses, because of their larger percentage return, have the most comprehensive areal coverage, but the county agency responses (planners and commissioners combined) are also well distributed throughout the state. This is important because the results of tabulation from this group of responses can be considered to be without substantial regional bias. A list of responding agencies is included as Appendix A3.
Table 4. Rate of Response to the Michigan Data Needs Questionnaire

                       Total Mailed   Responses   Percent Return
Regions                     14            11            79
County Planners             53            35            66
County Commissioners        30             6            20
Cities and Towns            37            18            49
Townships                   20             6            30
Others                      32            10            31
TOTAL                      186            86            46

Figure 1. Distribution of Questionnaire Responses: Responses from County Planning Agencies (for County Names see Appendix A4)

Analysis of Questionnaire Responses

Information pertinent to this dissertation came from the responses to the Current Use Inventory section of the questionnaire. Analysis of the returns constitutes the remainder of the chapter; its objective is twofold: i) to identify the optimum set of land cover and land use categories that satisfy the largest number of respondents, and ii) to determine which technical options the respondents see as appropriate for effective use of these categories.

Land Use Category Identification

Responses to the current use inventory section of the questionnaire were tabulated in three ways to bring out different dimensions in the data.

Simple Tabulation of Responses

The frequency of response to all of the categories was tabulated and segmented by user groups. A complete listing of these tabulations is included as Appendix A5. In total, the tabulations indicate that a wide range of categories is considered to be important. If the data are grouped into classes based on the percentage of respondents indicating a particular category, this range becomes quite clear. Industrial is the most popular category, and yet it was selected by only 65% of the respondents. At the 50% threshold, just five categories (including Industrial) were all that could meet the requirements of a majority of respondents (Table 5). These categories are from Levels II and III and are all urban types. If the threshold level is reduced to
Categories Selected by Questionnaire Respondents: The Total Response Group

Over 60% of Respondents
  Level II: Industrial

Over 50% of Respondents (5 categories)
  Level II: *Commercial, Services & Institutional; Industrial; *Open Space
  Level III: *Multi-family Residential; *Single-family Residential

Over 45% of Respondents (13 categories)
  Level I: *Agriculture; *Water; *Wetlands
  Level II: *Residential; Commercial, Services & Institutional; Industrial; *Transportation, Communications & Utilities; Open Space; *Streams
  Level III: Multi-family Residential; Single-family Residential; *Mobile Home Parks; *Strip Development

Over 40% of Respondents (17 categories)
  Level I: Agriculture; *Forest; Water; Wetlands
  Level II: Residential; Commercial, Services & Institutional; Industrial; Transportation, Communications & Utilities; Open Space; Streams; *Lakes
  Level III: Multi-family Residential; Single-family Residential; Mobile Home Parks; *Shopping Centers; Strip Development; *Road Transportation

*Categories added at each percentage increment indicated.

include categories that are selected by over 45% of all respondents, the number of categories more than doubles. More categories are added in Level II and Level III and all are from the urban group, with the exception of Streams, which is a category that can be found clearly delineated on the topographic base maps of most areas under study. Level I-categories of Agriculture, Water, and Wetland are included at the 45% threshold, throwing into even clearer perspective the respondents' overall need for detailed urban information and only very general information from the other major land use categories. Dropping the threshold level to 40% has little additional effect: Forest is added in Level I and Lakes in Level II, plus two other Level III-urban categories. The major users of the inventory information to be generated under PA 204 will be county and regional planners.
Combining the responses of these two groups with those of the county commissioners (representing counties without planning staffs) thus creates a sub-group of responses which is more representative of actual user needs than the full set of returns. Sixteen categories are now included in the 50% or above threshold. Level II-urban categories still dominate, but Orchards, a Level II-agricultural category, and the four major non-urban Level I-categories are also included in the group (Table 6). This more balanced view of land cover categories is strengthened at both the 45% and 40% threshold levels as three additional Level II-agricultural categories and then two Level II-wetland categories are added to the lists. This last group of categories at the 40% threshold appears to be a mix of categories that could serve as a base from which to make final decisions on categories to be

Table 6. Categories Selected by Questionnaire Respondents: The Planner Sub-Group

Over 60% of Respondents
  Level II: Industrial

Over 50% of Respondents (16 categories)
  Level I: *Agriculture; *Forest; *Water; *Wetlands
  Level II: *Residential; *Commercial, Services & Institutional; Industrial; *Transportation, Communications & Utilities; *Extractive; *Open Space; *Orchards; *Streams; *Lakes
  Level III: *Multi-family Residential; *Single-family Residential; *Mobile Home Parks

Over 45% of Respondents (18 categories)
  Level I: Agriculture; Forest; Water; Wetlands
  Level II: Residential; Commercial, Services & Institutional; Industrial; Transportation, Communications & Utilities; Extractive; Open Space; Orchards; *Confined Feeding; *Other Agriculture; Streams; Lakes
  Level III: Multi-family Residential; Single-family Residential; Mobile Home Parks

*Categories added at each percentage increment indicated.

Table 6. (Cont'd.)
Over 40% of Respondents (24 categories)
  Level I: Agriculture; *Range; Forest; Water; Wetlands; *Barren
  Level II: Residential; Commercial, Services & Institutional; Industrial; Transportation, Communications & Utilities; Extractive; Open Space; *Cropland & Pasture; Orchards; Confined Feeding; Other Agriculture; *Forested Wetlands; *Non-forested Wetlands
  Level III: Multi-family Residential; Single-family Residential; Mobile Home Parks; *Central Business District; *Rail Transportation; *Solid Waste Disposal

*Categories added at each percentage increment indicated.

included in an inventory. It has a good range between levels and seems to meet a substantial proportion of user needs. The data for the Planner Sub-Group were also segmented by region. Responses were categorized as Southern Lower Peninsula, Northern Lower Peninsula, and Upper Peninsula.4 There were not enough responses from the Upper Peninsula to be included in tabular form and they were omitted from further analysis. Differences between the categories which emerge from the northern and southern Lower Peninsula groups are clear, although in terms of total numbers of categories selected in each threshold group, the patterns are very similar. The southern Lower Peninsula group is distinguished by a mix of urban and agricultural categories at Level II in the 50% or over threshold group and an amplification of these urban selections in Level III-categories (Table 7). This trend is repeated in the 45% and 40% threshold groups, with the addition of general Level I-categories and Level II-wetland categories. In the northern Lower Peninsula group the emphasis is almost totally on Level II-categories and these lean heavily toward agriculture and wetland categories at the 50% or over threshold (Table 8). Non-residential urban categories are present, but it is not until the 45% threshold that Residential at Level II and three Level III-residential categories are indicated.
Despite their difference in emphasis, both sets of tabulations suggest the dominance of Level I-

4 The Upper Peninsula is self-contained. The Southern Lower Peninsula consists of counties south of the Muskegon-Bay City line and Regions 1 through 6. The Northern Lower Peninsula consists of the remaining counties in the Lower Peninsula and Regions 7 through 10 plus 14 (see Figure 1).

Table 7. Categories Selected by the Planner Sub-Group: Region 1 (Southern Lower Peninsula)

Over 60% of Respondents (8 categories)
  Level II: Residential; Commercial, Services & Institutional; Industrial; Transportation, Communications & Utilities; Open Space
  Level III: Multi-family Residential; Single-family Residential; Mobile Home Parks

Over 50% of Respondents (18 categories)
  Level I: *Agriculture; *Water; *Wetlands
  Level II: Residential; Commercial, Services & Institutional; Industrial; Transportation, Communications & Utilities; Open Space; *Cropland & Pasture; *Orchards; *Confined Feeding
  Level III: Multi-family Residential; Single-family Residential; Mobile Home Parks; *Central Business District; *Air Transportation; *Rail Transportation; *Solid Waste

Over 45% of Respondents (21 categories)
  Level I: Agriculture; *Forest; Water; Wetlands
  Level II: Residential; Commercial, Services & Institutional; Industrial; Transportation, Communications & Utilities; Open Space; Cropland & Pasture; Orchards; Confined Feeding; *Other Agriculture
  Level III: Multi-family Residential; Single-family Residential; Mobile Home Parks; Central Business District; Air Transportation; Rail Transportation; *Road Transportation; Solid Waste Disposal

*Categories added at each percentage increment indicated.

Table 7. (Cont'd.)

Over 40% of Respondents
  Level I: *Urban; Agriculture; *Range; Forest; Water; Wetlands
  Level II: Residential; Commercial, Services & Institutional; Industrial; Transportation, Communications & Utilities; Open Space; Cropland & Pasture; Orchards; Confined Feeding; Other Agriculture; *Streams; *Forested Wetlands; *Non-forest Wetlands

*Categories added at each percentage increment indicated.
  Level III: Multi-family Residential; Single-family Residential; Mobile Home Parks; Central Business District; Air Transportation; Rail Transportation; Road Transportation; Solid Waste Disposal; *Sewage Treatment (27 categories)

Table 8. Categories Selected by the Planner Sub-Group: Region 2 (Northern Lower Peninsula)

Over 60% of Respondents: Industrial; Streams; Lakes (3 categories)

Over 50% of Respondents: *Forest; *Wetlands; *Barren; *Commercial, Services & Institutional; Industrial; *Transportation, Communications & Utilities; *Extractive; *Cropland & Pasture; *Orchards; *Confined Feeding; *Other Agriculture; Streams; Lakes; *Great Lakes; *Non-forested Wetlands; *Mobile Home Parks (17 categories)

Over 45% of Respondents: Forest; Wetlands; Barren; *Residential; Commercial, Services & Institutional; Industrial; Transportation, Communications & Utilities; Extractive; *Open Space; Cropland & Pasture; Orchards; Confined Feeding; Other Agriculture; Streams; Lakes; Great Lakes; *Reservoirs; Forested Wetlands; Non-forest Wetlands; *Multi-family Residential; *Single-family Residential; Mobile Home Parks; *Cultivated Cropland (23 categories)

Over 40% of Respondents: *Agriculture; *Range; Forest; Water; Wetlands; Barren; Residential; Commercial, Services & Institutional; Industrial; Transportation, Communications & Utilities; Extractive; Open Space; Cropland & Pasture; Multi-family Residential; Single-family Residential; Mobile Home Parks; Cultivated Cropland; Orchards; Confined Feeding; Other Agriculture; Streams; Lakes; Great Lakes; Reservoirs; Forested Wetlands; Non-forested Wetlands

*Categories added at each percentage increment indicated.
(25 categories)

and particularly Level II-category selections by the Planner Sub-Group, and they also show that Level III-information is mainly required for detailed residential land use discrimination.

Mean Rank Score Tabulations

Simple tabulation of responses to categories identified in the questionnaire gave equal weight to each respondent and no consideration to the mix of categories selected by respondents. The mix of categories, however, may be important if it is assumed that certain category selections have a higher priority than others. This is especially so if the respondent has selected a large number of categories. Similarly, different mixes of categories may be weighted when segmenting the responses to identify the individual categories demanded by a majority of respondents. A new indicator of the mix of categories, one that gives less weight to respondents who selected large numbers of relatively unpopular ones, allows further clarification of the category selections that are most important to the largest number of questionnaire respondents. In order to manipulate these data in this way, the categories selected by the Total Response Group and the Planner Sub-Group were ranked according to their frequency of selection (i.e., Industrial, the most popular category, was ranked 1 and so on). Each questionnaire was then coded using these ranks. A mean rank score was determined by totalling the rank numbers of the categories listed in the questionnaire and dividing this total by the number of categories selected. A low mean rank score meant that the respondent selected a mix of popular categories. The higher the mean rank score, the more the mix reflected less popular categories. The questionnaires were then ordered according to mean rank score, from low to high.
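The ranking and scoring procedure just described can be sketched in a few lines of code. The questionnaire data below are invented for illustration, not taken from the study; only the method (rank categories by frequency of selection, then average the ranks of each respondent's selections and order the questionnaires) follows the text.

```python
from collections import Counter

# Hypothetical questionnaire returns: each respondent's selected categories.
responses = [
    ["Industrial", "Residential", "Water"],
    ["Industrial", "Commercial", "Residential", "Open Space"],
    ["Water", "Wetlands", "Streams", "Range"],
]

# Rank categories by frequency of selection (most popular category = rank 1).
freq = Counter(cat for resp in responses for cat in resp)
rank = {cat: i + 1 for i, (cat, _) in enumerate(freq.most_common())}

def mean_rank_score(selection):
    """Total of the ranks of the selected categories divided by the
    number of categories selected."""
    return sum(rank[cat] for cat in selection) / len(selection)

# Order questionnaires from low scores (popular mixes) to high scores
# (mixes reflecting less popular categories).
ordered = sorted(responses, key=mean_rank_score)
```

Segmenting the ordered list (the first 50%, then 60%, and so on) then amounts to slicing `ordered` before tabulating category percentages within each slice.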
Deriving the individual categories from the mean rank score list involved segmentation, but now the segmentation was done first on the basis of the ordered questionnaires, i.e., the first 50% of the list, then 60% of the list, and so on. Within each of these segmented groups, the categories most often chosen are identified by their percentages within the segment. What emerged from manipulating the data in this fashion is that relatively few and mainly Level II- and III-categories meet the major needs of most respondents (Tables 9 and 10). In the Total Response Group, urban categories dominate overwhelmingly in the Level II- and Level III-groups, although in each segment the major Level I-categories are requested by over 50% of their respective segments. In the Planner Sub-Group, Level II-agricultural categories are indicated along with the urban categories, particularly within the 50% and 40% threshold levels of each segment. The strength of the conclusion that relatively few categories satisfy the major needs of most respondents is made clear in the 80% mean rank score segment of the Planner Sub-Group. In this group, a total of 16 categories split between the three levels satisfy a majority of the respondents when 80% of the questionnaire responses are considered. If the 40% threshold level is added to this group, the categories included again form the basis of a classification scheme that was used in the subsequent analysis of remote sensing methodologies for deriving land cover and land use information.

Cumulative Percentages for Selected Categories

When completing the questionnaire, respondents could select categories defined at three levels of generalization and each choice

Table 9.
Categories Selected When Questionnaire Respondents Are Ordered by Mean Rank Score: The Total Response Group

1. 50% OF MEAN RANK SCORE LIST
  Over 70% of Respondents: Industrial
  Over 60% of Respondents: Residential; Commercial, Services & Institutional
  Over 50% of Respondents: Agriculture; Forest; Water; Wetlands
  Over 40% of Respondents: Urban; Open Space; Multi-family Residential; Single-family Residential; Strip Development

2. 60% OF MEAN RANK SCORE LIST
  Over 70% of Respondents: Industrial
  Over 60% of Respondents: Residential; Commercial, Services & Institutional
  Over 50% of Respondents: Agriculture; Forest; Water; Wetlands
  Over 40% of Respondents: Urban; Open Space; Transportation, Communications & Utilities; Multi-family Residential; Mobile Home Parks; Single-family Residential; Strip Development

Table 9. (Cont'd.)

3. 70% OF MEAN RANK SCORE LIST
  Over 70% of Respondents: Industrial
  Over 60% of Respondents: Residential; Commercial, Services & Institutional
  Over 50% of Respondents: Agriculture; Forest; Water; Wetlands
  Over 40% of Respondents: Urban; Open Space; Transportation, Communications & Utilities; Multi-family Residential; Single-family Residential; Strip Development; Road Transportation

4. 80% OF MEAN RANK SCORE LIST
  Over 70% of Respondents: Industrial
  Over 60% of Respondents: Residential; Commercial, Services & Institutional
  Over 50% of Respondents: Agriculture; Water; Wetlands
  Over 40% of Respondents: Urban; Forest; Open Space; Multi-family Residential; Single-family Residential; Shopping Center; Strip Development; Road Transportation

Table 10. Categories Selected When Questionnaire Respondents Are Ordered by Mean Rank Score: The Planner Sub-Group

1. 50% OF MEAN RANK SCORE LIST
  Over 70% of Respondents: Forest; Water
  Over 60% of Respondents: Agriculture; Wetlands
  Over 50% of Respondents: Barren
  Over 40% of Respondents: Urban; Range
  (Level II and Level III) Residential; Commercial, Services & Institutional; Industrial; Open Space; Cropland & Pasture; Mixed Urban; Extractive; Orchards

2. 60% OF MEAN RANK SCORE LIST
  Over 70% of Respondents: Forest; Residential; Commercial, Services & Institutional; Industrial
  Over 60% of Respondents: Agriculture; Transportation, Communications & Utilities; Water; Open Space; Wetlands
  Over 50% of Respondents: Barren
  Over 40% of Respondents: Urban; Range; Mixed Urban; Extractive; Cropland & Pasture; Orchards; Confined Feeding; (Other Agriculture, 39%); (Forest Wetlands, 39%); Multi-family Residential; Single-family Residential; (Mobile Home Parks, 39%); (Solid Waste, 39%)

3. 70% OF MEAN RANK SCORE LIST
  Over 70% of Respondents: Commercial, Services & Institutional; Industrial
  Over 60% of Respondents: Agriculture; Forest; Water; Wetlands
  Over 50% of Respondents: Range
  Over 40% of Respondents: Urban; Barren; Residential; Transportation, Communications & Utilities; Open Space; Cropland & Pasture; Orchards; Confined Feeding; Extractive; Non-forested Wetlands; (Other Agriculture, 39%); (Lakes, 39%); (Forested Wetlands, 39%); Multi-family Residential; Single-family Residential; Mobile Home Parks; Solid Waste

Table 10. (Cont'd.)
4. 80% OF MEAN RANK SCORE LIST
  Over 70% of Respondents: Commercial, Services & Institutional; Industrial
  Over 60% of Respondents: Forest; Water; Wetlands
  Over 50% of Respondents: Agriculture
  Over 40% of Respondents: Range; Barren; Residential; Transportation, Communications & Utilities; Open Space; Extractive; Cropland & Pasture; Orchards; Confined Feeding; Multi-family Residential; Single-family Residential; Mobile Home Parks; Other Agriculture; Streams; Lakes; Forested Wetlands; Rail Transportation; Solid Waste

was counted the same when tabulated, despite the fact that a Level III-category was a component part of a Level II-category, and this Level II-category was, in turn, a component part of a general Level I-category. Selections at Level III then imply certain selections at Level II, and Level II selections imply selections at Level I. The questionnaire respondent was not asked to make these selections explicit, but when the questionnaires were re-coded to reflect this situation, Level I- and some Level II-categories were substantially more predominant, as might be anticipated (Table 11), thus reinforcing their importance.5 Responses to Level III-categories are included in Table 11 for comparative purposes.

Preliminary Category Selections

In order to finalize a decision as to which categories are most appropriate to use in a statewide inventory, a composite list consisting of categories selected by 50% or more of the respondents in each of the tabulation methods was prepared (Table 12). This list includes the variation exhibited in the questionnaire responses and concentrates on the most popular categories. While it does not yet stand as an operational classification system, it is the basis for such a system. The list also confirms the premise that to meet the needs of a wide variety of users, categories from various levels of detail would

5 Examining the cumulative percentages for selected categories was undertaken as a confirmatory device after the other tabulations had been completed. Since analysis of the other tabulations had emphasized the importance of responses from the Planner Sub-Group, and this group was of primary interest, cumulative data were assembled only for the Planner Sub-Group.

Table 11. Cumulative Percentages for Selected Categories: The Planner Sub-Group

Level I: Urban 96%; Agriculture 98%; Range 60%; Forest 90%; Water 90%; Wetlands 94%; Barren 67%

Level II
  Over 80% of Respondents: Residential; Commercial, Services & Institutional; Industrial
  Over 70% of Respondents: Transportation, Communications & Utilities; Open Space; Cropland & Pasture
  Over 60% of Respondents: Extractive; Orchards; Confined Feeding
  Over 50% of Respondents: Other Agriculture; Streams; Lakes; Forested Wetland
  Over 40% of Respondents: Broadleaf Forest; Coniferous Forest; Mixed Forest; Non-forest Wetlands

Level III
  Over 50% of Respondents: Multi-family Residential; Single-family Residential
  Over 40% of Respondents: Mobile Home Parks; Rail Transportation; Solid Waste Disposal
  Over 30% of Respondents: Strip Residential; Primary Central Business District; Shopping Centers; Strip Development; Secondary Business District; Road Transportation; Sewage Treatment; Recreation

Table 12.
Classification Categories Derived from Analysis of the Michigan Data Needs Questionnaire

Level I (7 categories): Urban; Agriculture; Range; Forest; Water; Wetland; Barren

Level II (15 categories, plus 3): Residential; Commercial, Services & Institutional; Industrial; Transportation, Communications & Utilities; Extractive; Open Space; Crops & Pasture; Orchards; Confined Feeding; Other Agriculture; Streams; Lakes; Great Lakes; Forest Wetland; Non-forested Wetland; (Broadleaf Forest; Coniferous Forest; Mixed Forest)

Level III (7 categories, plus 3): Multi-family Residential; Single-family Residential; Mobile Home Parks; Central Business District; Air Transportation; Rail Transportation; Solid Waste; (Shopping Center; Strip Development; Road Transportation)

need to be included in the system, even at the expense of a certain redundancy.

Technical Characteristics of Information Requirements

When selecting categories from the questionnaire, the respondent was also asked to respond to options for technical specifications, to identify application areas, and to evaluate how data are actually used. Responses to these questions for the categories identified in the final part of the previous section are considered here. The categories were grouped into Levels I, II, and III, but even with this segmentation there is a large amount of data to be considered. Tabulations were made for the Total Response Group and the Planner Sub-Group and were expressed as percentages of these responses. Despite the volume of data, there is a basic similarity in the responses to the technical options. This becomes clear by examining each of the characteristics in turn. Generalizations are made in the following paragraphs after a re-statement of the questions posed in the survey. For information on individual categories within each category Level, it will be necessary to refer to the complete tabulation of responses which is included as Appendix A6.
Again, the focus will be on the Planner Sub-Group, but while the percentages for the Total Response Group were often different, the dominant options and distribution of options selected usually followed the same trend as that of the Planner Group.

i) Coverage: What geographic coverage is required?

The dominant response to this question is for coverage at the individual township level. County and multi-county coverage was indicated next and this pattern is only broken in Level III-categories where solid waste disposal information seems to be of interest at the county level. For other categories from Level III, particularly in the Total Response Group, the "other" option showed as second or third choice; this reflects choices by city respondents for coverage areas smaller than townships, such as city blocks and census tracts.

ii) Resolution: What is the most appropriate type size (minimum size unit to be mapped) for the inventory?

Resolution levels indicated are concentrated in the 1 to 10 acre groups. Within this range it is clear that requirements get stricter with each level of detail. At Level I, 10 acres dominates, 2-5 acres is marginally the most important group at Level II, and 1 acre or less is required for the largest number of Level III-categories. Individual categories provide exceptions to these generalizations. Water and Wetland Level I-categories have 2-5 acre type size as dominant and at Level II the Forest and Wetland categories have 10 acres dominant. It is important to note that for almost all categories combining the percentages for 1 acre and greater than 1 acre creates the largest group. Information with a resolution this small is, however, rarely practical for most applications.

iii) Accuracy: What level of accuracy is required?

Accuracy requirements are similar for most of the categories. The dominant selection is for a 90% accuracy level, with a 95% level as the next indicated by approximately half as many respondents.
In the case of Level I-categories about 15% of the respondents are prepared to accept accuracy levels below 90%. At Level II, a slightly smaller percentage are prepared to accept less than 90% and the dominance of 90% accuracy is strong with two exceptions: for Streams and Lakes a 95% accuracy level is dominant. The split between 90% and 95% accuracy is almost even for Level III-categories. Overall, however, a 90% accuracy is most acceptable to the majority of users for almost all categories.

iv) Scale: If map products are required, indicate the most appropriate scale.

For categories at Level I and Level II, 1:24,000 maps are selected most often, particularly by the planner group where the margin between this scale and the next is greatest. The other responses are split between 1:10,000, 1:15,000 and "other," without any clear second dominant emerging. Selection of "other," however, in almost every case represented a choice of a scale larger than 1:10,000. If these three second-preference selections are totaled, they are often equivalent to the 1:24,000 or larger-scale maps. The Level III-categories exhibit an even stronger tendency toward larger-scale map products, with 1:24,000, 1:10,000 and "other" selections almost even, and, again, the "other" category representing scales larger than 1:10,000.

v) Non-Map Formats: If tabular data are required, please indicate the appropriate level of aggregation.

The majority of responses for Level I- and II-categories is concentrated in the jurisdictional options of section, township, and county. Township aggregation is dominant, followed by section and county. For Level III-categories the pattern is somewhat different. Section aggregation is dominant for the three residential categories and for most in the Total Response Group; section and township aggregation is about even for the remainder of the Level III-categories.
In all of the Level III-categories, however, the "other" group has responses which are for intra-city boundaries, such as census tracts.

vi) Applications: General planning dominates as the major response throughout all categories of information. The "other" option is indicated by 15-20% throughout, but these choices represent very specific applications and follow no clear pattern. The other responses are predictable ones which relate categories to applications controlling the land use or land cover identified in those categories; for instance, "Commercial Development Siting" is indicated as an application for the Level II-Commercial category.

vii) Importance: How important is the resource category to the activities of your unit?

The "Importance" responses were clustered into the three options: important, very important, and critical. "Very important" was the dominant option and only Level I-Agriculture was indicated as critical. The pattern of responses does not allow a clear identification of certain categories that are more important than others, although, by only tabulating responses for the most popular categories, this is to be expected.

viii) Frequency: How frequently would this data variable be used?

Responses to this question were very mixed, covering the spectrum from daily to annually. "Several times weekly" and "several times monthly" are the most frequently chosen options, but it is difficult to identify general preferences.

ix) Updating: How long are the data reliable before updating is needed?

A five year update cycle is the dominant response for most Level I- and Level II-categories although significant numbers prefer two years and one year. When combined it is clear that five years is the maximum update time frame considered acceptable. For Level III-categories the pattern is similar, with an even stronger desire for two year updates in most cases.
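The percentage tabulations summarized in items i) through ix) amount to a simple counting exercise. The answers below are hypothetical, not the survey's actual returns; the sketch only shows how a dominant option and the percentages behind statements like "a five year update cycle is the dominant response" would be derived.

```python
from collections import Counter

# Hypothetical answers to one technical-option question ("Updating"),
# one answer per respondent.
updating = ["5 years", "5 years", "2 years", "1 year", "5 years", "2 years"]

def option_percentages(answers):
    """Percentage of respondents selecting each option, largest first."""
    counts = Counter(answers)
    n = len(answers)
    return {opt: round(100 * c / n) for opt, c in counts.most_common()}

pcts = option_percentages(updating)   # e.g. {"5 years": 50, "2 years": 33, "1 year": 17}
dominant = next(iter(pcts))           # the option chosen most often
```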
Conclusion

The response rate to the questionnaire was high amongst the planner target group, and the responses exhibited a wide variation: some respondents required only a few land use elements, while others perceived that close to one hundred elements were necessary. Analysis of these responses, particularly from the Planner Sub-Group, however, allowed compilation of a manageable category list that satisfied the major needs of most of the respondents. Categories were taken from Levels I, II, and III, indicating the need for a range of detail. Level I-categories can be obtained through grouping of Level II and Level III selections, so that twenty-eight mappable land use categories constitute the basis from which a classification scheme has to be developed. The categories selected can be derived from conventional aerial photography without difficulty. The mode of data capture and interpretation methodologies used with LANDSAT digital data, however, make the classification task presented in this study very different from aerial photography-based inventories. It is necessary, therefore, to review LANDSAT data collection methods and analysis procedures in the next chapter, before proceeding with further category evaluation. This information will have a direct impact on which categories can be obtained and how accurately they can be mapped.

CHAPTER III

DIGITAL CLASSIFICATION OF LANDSAT DATA

Introduction

Data from the multispectral scanner aboard the LANDSAT 1 satellite, which was launched in July, 1972, revolutionized the development of applied remote sensing in three important ways: i) through establishing a completely new way of collecting data, ii) by introducing the multispectral scanner, which required a more substantial emphasis on spectral information when interpreting LANDSAT data, and iii) by requiring methods of analysis taken from pattern recognition theory to facilitate classification of the LANDSAT digital data.
The focus of this chapter will be on the modes of analysis, which, before the introduction of digital remote sensing, had little application to land use mapping. Prior to that, however, it is necessary to briefly cover the changes in collection methods and interpretation emphasis.

Data Collection

The three LANDSAT satellites were placed in near polar circular orbit in 1972, 1975, and 1978, at an altitude of approximately 920 km (570 miles). This type of orbit allows for fairly constant sun-illumination conditions at any given location on the earth's surface. An equatorial crossing time of approximately 9:40 a.m. provides a trade-off between a reasonable sun angle and a minimum of cloud build-up over major land areas. The satellites circle the earth every 103 minutes, completing 14 orbits a day and sequentially imaging the surface every 18 days (Taranik, 1978; U.S. Geological Survey, 1979). These orbital characteristics allow for the acquisition of synoptic, world-wide areal coverage by the satellites' imaging systems. A glimpse at this potential had been provided by images from earlier spacecraft such as Apollo, but no systematic global coverage was available before LANDSAT. In addition, the coverage is repetitive in nature, theoretically facilitating a multi-temporal approach in the selection of appropriate seasonal coverage within a given year and creating the possibility of multi-year change analysis.

Data Interpretation

A significant element in the evolution of remote sensing techniques has been the recording and subsequent interpretation of spectral reflectance from the earth's surface. The multispectral scanner (MSS) aboard the LANDSAT series of satellites1 forced a much more explicit consideration of this spectral information for two reasons:

i) The MSS system uses an oscillating mirror to scan the earth's surface and direct incoming radiation to a bank of six sets of four detectors.
The four detectors are sensitive to different spectral wavelengths: Band 4 to 0.5-0.6 um (green reflectance), Band 5 to 0.6-0.7 um (red reflectance), Band 6 to 0.7-0.8 um (infrared reflectance), and Band 7 to 0.8-1.1 um (infrared reflectance) (Figure 2). This discrete recording of spectral reflectance, as opposed to the instantaneous record created by exposure to film, directs attention toward spectral properties of the scene and allows the analyst to deal with single spectral bands of information.

1 The LANDSAT satellites have also carried a Return Beam Vidicon (RBV) imaging system; imagery from this system was not used in this study and will not be described here.

Figure 2. Location of LANDSAT Bands Within the Electromagnetic Spectrum (after Lusch, 1982)

ii) The detectors on the MSS system record information from ground resolution cells of approximately 79 meters on a side, although sampling of this information during signal processing creates a nominal ground spacing of 56 meters. Nominal pixel dimensions are, therefore, 79 m x 56 m, an area of 1.1 acres. This resolution is very coarse compared to the resolution of even small-scale aerial photography.
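The nominal pixel area quoted above follows directly from the stated dimensions; the short check below merely restates that arithmetic, using 4,046.86 m² per acre.

```python
# Nominal LANDSAT MSS pixel: 79 m along one side by the 56 m nominal
# ground spacing produced during signal processing, as described above.
pixel_area_m2 = 79 * 56          # 4,424 square meters
ACRE_M2 = 4046.86                # square meters in one acre
pixel_area_acres = pixel_area_m2 / ACRE_M2
# About 1.09 acres, i.e., the nominal 1.1-acre figure given in the text.
```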
Characteristics such as pattern, shape, and spatial association, which are major factors in aerial photo-interpretation, cannot be effectively used in this context, so the importance of the spectral information is further enhanced.²

Methods of Analysis

Current procedures for digital processing and analysis of LANDSAT multi-spectral data involve five major areas of activity (Hoffer, 1979a):

i) Data reformatting and pre-processing;
ii) Definition of training statistics;
iii) Computer classification of data;
iv) Information display and tabulation;
v) Evaluation of results.

Each of these topics will be dealt with in the following discussion only insofar as they are pertinent to the current study.

²The LANDSAT Data Users Handbook published by the U.S. Geological Survey (1979) provides a comprehensive review of information relating to the orbital characteristics, sensor systems, and data formats employed by all of the LANDSAT satellites.

Data Reformatting and Pre-Processing

In most applications studies this first step in the analytical procedure is one that is beyond the control of the individual analyst. In certain circumstances, when data are being processed for a specific user at a NASA facility or a major research institution such as the Environmental Research Institute of Michigan (ERIM) or the Laboratory for Applications of Remote Sensing (LARS), custom pre-processing is possible; but for the general user, data are available from the EROS Data Center (EDC) after standard reformatting and pre-processing. The system that converts raw LANDSAT digital data to computer-compatible tapes is, however, a complex one. It involves pre-processing for radiometric corrections and a range of geometric corrections. Distortions caused by a) the optics of the system, b) the scan mechanism, c) detector array geometry, d) spacecraft altitude and attitude variations, and e) earth rotation are all removed or minimized.
The output tape is then resampled to a standard map projection (Slater, 1980), and this process results in an output pixel size of 57 meters by 57 meters. The data used in this study were resampled using a cubic convolution algorithm and output in the Hotine Oblique Mercator projection. After 1979 the new EDC Digital Image Processing System (EDIPS) added further enhancement capability (Harris, 1979). With the standard EDIPS format, in addition to the corrections previously mentioned, contrast stretching and haze removal functions are applied to the data. Digital data used in this study were those of the standard EDIPS format.

Definition of Training Statistics

The definition of training statistics is a critical step in the classification of multispectral data. Training involves designating particular combinations of pixel values which represent the reflectance in the four LANDSAT bands for a land cover of interest. Once these are designated, the computer can proceed to classify all the pixels in a study area on the basis of statistics generated from these pixel values. It is clear, therefore, that unless the training statistics effectively represent the land cover types being sought by the analyst, the classification is bound to be inaccurate. The first condition for selecting a training class, then, is that it must have informational value to the user. Digital classification of LANDSAT data involves the use of pattern recognition procedures to define spectral classes; therefore, a second condition for effective discrimination is that a particular class must be as spectrally distinct from all other classes as possible.

Defining training areas that are both separable and have informational value has led to several different approaches to their acquisition. The approaches, however, can be split into two major groups.
Supervised methods of defining training statistics focus on the use of ground truth data to define spectral classes on the basis of informational importance. Unsupervised methods involve a clustering process which groups data into spectrally homogeneous classes that are subsequently given informational value on the basis of ancillary data. A number of specific procedures have been commonly used.

The Supervised Training Field Technique

The analyst selects various cover types of interest and designates to the computer the X,Y coordinates of these areas. It is important that the training fields be as homogeneous as possible and be representative of the entire area to be classified. Mixed spectral classes should be avoided. The size of training fields selected using this method is disputed in the literature, ranging from as small as 16 acres (Ellis, 1978) through 25 acres (Markham and Townshend, 1981) to the most commonly recommended, 40 acres (Swain and Davis, 1978).

This technique is straightforward and has been used most frequently; however, it does have two common problems:

i) the analyst may fail altogether to define important spectral classes, decreasing the effectiveness of the classifier;
ii) informational classes may be selected that are either not spectrally differentiated or do not adequately characterize the spectral variation within that informational class.

The Unsupervised Clustering Technique

Here, the analyst designates an area to be classified and a specified set of analysis parameters relating to the number of classes required. The results can be used, after labelling of the spectral classes with ancillary data, as a spectral classification, or serve as input into a parametric classifier. Often more than one iteration is required, with re-clustering to eliminate multimodal classes and to group spectral classes that have no informational value.
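The clustering step just described can be sketched in outline. The dissertation does not name a particular algorithm, so the k-means-style routine below is a generic stand-in; the function name and data layout are assumptions for illustration:

```python
def cluster(pixels, n_classes, n_iter=10):
    """Illustrative k-means-style clustering of pixel vectors (one value per
    MSS band) into spectrally homogeneous classes. Generic stand-in only; the
    text does not specify the clustering algorithm actually used."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # naive initialization: seed centers with the first n_classes pixels
    centers = [list(pixels[i]) for i in range(n_classes)]
    labels = [0] * len(pixels)
    for _ in range(n_iter):
        # assign each pixel to its nearest cluster center
        labels = [min(range(n_classes), key=lambda k: dist2(p, centers[k]))
                  for p in pixels]
        # move each center to the mean of its assigned pixels
        for k in range(n_classes):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = [sum(band) / len(members) for band in zip(*members)]
    return labels, centers
```

In practice the resulting spectral classes would then be labelled from ancillary data, and multimodal or uninformative classes merged or re-clustered, as the text notes.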
This clustering procedure is very expensive in terms of computer time, which constitutes a major limitation of the technique.

The Multi-Cluster Blocks Technique

This technique is a hybrid of the supervised and clustering methods (Fleming and Hoffer, 1977). With this approach the analyst locates several relatively small blocks of data, each of which contains several cover types and spectral classes. Each block is individually clustered into approximately 14 to 16 spectral classes, and these classes are identified from ancillary data. The spectral classes for all cluster areas are then combined through a series of iterations (usually three) to form a single set of training statistics. Fleming and Hoffer (1977) identified three other variants of this hybrid technique:

i) Mono-cluster blocks, where several heterogeneous blocks are combined and the entire group clustered as a single unit;
ii) Mono-cluster fields, where supervised training fields are combined into a single block which is then clustered;
iii) Multi-cluster fields, where supervised training fields are clustered individually and the statistics combined into a single set of training data.

The Multi-Cluster Block approach is now the standard procedure employed by most of the workers at the Laboratory for Applications of Remote Sensing, Purdue University. The technique overcomes limitations of both the supervised and clustering methods. Clustering the data is possible, thus accounting for the full range of spectral variability in the data. By restricting the size of blocks, computer costs are reduced, and proper sampling methods allow representation of the whole area with a very small proportion of the entire data set. Small cluster blocks also facilitate the matching of cluster classes with ground truth information. The small compact areas chosen allow use of aerial photography for exact delineation and land cover classification of the blocks.
Identification of useful informational classes is thus facilitated.

Fleming and Hoffer (1977) tested the methods indicated above for forestry applications and found that the Multi-Cluster Block approach minimized analysts' time inputs, gave the best classification performance, and required the smallest amounts of computer time. Supervised training was highest in analysts' time and computer costs and lowest in accuracy. As a counterbalance to these results, a land cover classification study for an area in Minnesota (Nelson et al., 1981), in which several training methods were compared, obtained its best results using supervised training. Multi-cluster blocks, however, were not an option that could be tested in that study.

The Procedure 1 Technique

This technique was developed during the Large Area Crop Inventory Experiment (LACIE) Program (Nelson and Hoffer, 1980b) and uses a series of randomly located pixels of known cover type to seed a clustering algorithm, which then defines the training statistics without further inputs by the analyst. The procedure has not been used extensively outside the LACIE project; however, it does provide results comparable to the Multi-Cluster Block approach (Nelson and Hoffer, 1979). The limitation is that fairly good ground truth information is required for the procedure to be effective.

Computer Classification of Data

Pattern Recognition

Interpretation of remotely sensed data prior to LANDSAT 1 had been image-oriented, as was indicated earlier in this chapter. Some work with digital data from multi-spectral scanners was under way at a few research centers such as LARS, ERIM, and the Jet Propulsion Laboratory (JPL), but image interpretation methodologies predominated in the main stream of remote sensing research until the early 1970's (Estes et al., 1977). LANDSAT data are readily processed into images which, in fact, still constitute the majority of purchases from the EROS Data Center (Watkins, 1981).
These data are also available in digital form, and this has created a new era in remote sensing analysis which involves computer classification of data (Landgrebe, 1981). In this type of approach the goal of the analyst is to obtain some form of "classification" of a series of observations that is relevant to a particular application, rather than an "interpretation" of an actual image product. The procedures used in this type of digital processing and analysis are derived from pattern recognition concepts, and a brief review of this information is necessary to create a perspective for understanding how these methods are derived. A pattern recognition system can be generalized into three major processes:

Environment → Measurement → Observations → Feature Extraction → Features → Decision Functions → Classes

Figure 3. Elements of a Pattern Recognition System (after Jayroe et al., 1976)

The first process is that of measurement. Input data, which can be measured and from which classes can be recognized, have to be supplied to the system. In the LANDSAT context, the brightness value from each of the four multi-spectral channels, for each ground resolution element or pixel, constitutes this measured quantity. It represents the spectral reflectance of that point on the earth's surface. These four measurements can be thought of as defining a point in four-dimensional Euclidean space which is known as "measurement space."

The second process in pattern recognition involves the extraction of characteristic features or attributes from the measured input data. Feature extraction has two major functions: it is used to i) separate useful information from noise within the data, i.e., all of the input measurements may not be equally important, and ii) reduce the dimensionality of the data in order to simplify calculations that will subsequently be required by the classifier (Swain, 1972).
In the LANDSAT case all four measurements (the four spectral bands) are used in most analyses. A crude form of feature extraction, classifying with only Band 5 and Band 7, is sometimes used; but for the purposes of this dissertation, feature extraction as a process is not implemented and the analysis is conducted using all four bands of LANDSAT data.

The third process in pattern recognition involves the determination of optimum decision procedures which can be used in the identification and classification process. After the observed data from the classes to be recognized have been expressed in the form of a measurement vector, a decision rule has to be implemented to decide which observations belong to which class. Establishing this "measurement vector" is done in remote sensing work through the identification of training sets. The decision functions can be generated in a number of ways, and in its most abstract sense this study involves testing the effectiveness of certain of these decision functions when practically applied to LANDSAT data for small test sites in Michigan. The focus of the research is to evaluate variations in results from several standard algorithms rather than to develop new algorithms.

Organizing Framework

Concepts which relate to the decision functions that are used to define and characterize pattern classes can be separated into three types:

i) A membership concept, by which an unknown pattern class is matched to a known pattern that has been stored in a library of reference patterns. This type of concept was explored in remote sensing and is associated with the idea of a spectral signature and signature banks (Landgrebe, 1976). The term implies a unique, well-defined spectral pattern by means of which a particular earth surface feature can be positively and reliably identified. It has become clear, however, that unique and unchanging spectral signatures do not exist in nature (Hoffer, 1979b).
Normal geographic and temporal effects, along with a host of other factors, cause variations of spectral response in surface features at any period in time. Spectral response patterns associated with similar features do have characteristics in common, but they have to be defined from within the particular environment for which they are to be used. The membership concept, then, is inappropriate to remote sensing, as it can only work with perfect pattern samples.

ii) A common property concept. Here, a pattern class is characterized by common properties shared by all its members. The basic assumption is that patterns belonging to the same class possess certain common properties or attributes that reflect similarities among these patterns. It is this assumption which provides the basis for supervised training in remote sensing. The main problem is to determine the common properties from a finite set of samples known to belong to the pattern class. This is the training problem in remote sensing which, as mentioned earlier, can be handled in a number of ways.

iii) A clustering concept. When the patterns of a class are vectors whose components are real numbers, as they are with LANDSAT data, a pattern class can be characterized by clustering its properties in measurement space. Recognition schemes can then be formulated and derived in a statistical framework depending on the complexity of the clusters and the extent to which they overlap.

The statistical approach, which can be applied to both the common property and clustering concepts, may be divided into two categories: parametric and non-parametric. Parametric schemes utilize probability formulations to minimize the risk of an observed pixel being assigned to an improper class and require the analyst to make certain assumptions about the distribution of the data prior to classification.
Non-parametric schemes look for linear relationships and thus require less initial knowledge and fewer assumptions, and are, therefore, easier to implement.

The general concepts involved in pattern recognition provide structure for the LANDSAT classification applications that follow. Comprehensive treatments of pattern recognition, including detailed examinations of the statistical theory behind the many techniques that can be used, are available in several standard references (Fukunaga, 1972; Haralick, 1976; Tou and Gonzalez, 1977).

Types of Decision Functions

The discussion of pattern recognition concepts illustrates that the critical element in classifying the measurements that constitute a set of input data is selecting a decision function which will divide the measurement space into decision regions, with each region corresponding to a specific class. The question now arises as to which of these decision functions are best applicable to LANDSAT data and can form the conceptual basis for the classification procedures that have been used in the current test.

As has been shown, the LANDSAT MSS measures earth surface reflectance in four designated wavelength bands. If the spectral response values for various cover types, in the wavelengths which are measured by LANDSAT, are plotted in graph form, the resultant spectral response curves have clearly different shapes (Figure 4.A). The sensor records one average value within each spectral band to represent this variability for each spectral resolution element. Plotting these discrete values creates the measurement space described in the previous section. In Figure 4.B, a two-dimensional measurement space is created by plotting the LANDSAT Band 7 response against the Band 5 response, and this further illustrates the difference between the representative cover types.
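The idea of measurement space can be made concrete with a small sketch. The band values and the helper function below are hypothetical, for illustration only:

```python
import math

# Each pixel's four MSS band values locate it as a point in 4-dimensional
# "measurement space"; similarity between pixels is then Euclidean distance.
def spectral_distance(p, q):
    """Euclidean distance between two pixels in measurement space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

water = (10, 8, 4, 2)      # hypothetical values: low reflectance in every band
forest = (12, 10, 55, 70)  # hypothetical values: low visible, high infrared
```

Cover types with different spectral response curves, such as the water and forest examples here, plot far apart in this space, which is what makes them separable by the decision functions discussed next.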
All of the LANDSAT bands are normally used in an actual analysis procedure and implemented mathematically to maximize the possible discrimination between cover types.

Figure 4. A. Generalized Spectral Curves for Representative Land Cover Types (Hoffer, 1978; Landgrebe, 1973). B. Representative Land-Cover Types Plotted in 2-Dimensional Space (LANDSAT Bands 5 and 7)

Extending this example, there are three major ways of partitioning this measurement space so that unknown pixels can be associated with identified pattern classes. The examples presented will continue to use a two-dimensional representation to facilitate graphic presentation, but the concepts are valid for n-dimensional space. The data plotted in Figures 5, 6, and 7 can be considered as training set data: points within the input data set of known ground cover. Unknown points, for example numbers 1 and 2 in Figures 5 and 6, are points that need to be classified into ground cover information classes.

i) Pattern classification based on distance functions is one of the most straightforward and commonly used approaches in the classification of digital LANDSAT data. It is a non-parametric method which adopts the comparability concept in establishing a classification procedure. The mean value for each spectral class of a training set is calculated, and the mean values for each class, known as mean vectors, are taken as standards which represent that particular class. Values for unknown pixels can then be measured against the standard and a "distance" computed between the unknown and the standard. Pixels are then classified according to the minimum distance between the measured value and one of the representative standards.
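The minimum distance-to-means rule just described can be sketched as follows. The class names and mean vectors are hypothetical, and the function is an illustration rather than the ERDAS implementation used in the study:

```python
import math

def minimum_distance_classify(pixel, class_means):
    """Assign a pixel to the class whose training-set mean vector is nearest
    in measurement space. class_means: class name -> mean vector (one value
    per MSS band), as computed from analyst-designated training fields."""
    def dist(mean):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(pixel, mean)))
    return min(class_means, key=lambda name: dist(class_means[name]))
```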
In Figure 5, pixel 1 would be classified as Barren because this is the closest mean value. This decision illustrates one of the limitations of the minimum distance-to-means approach: while closest to Barren, pixel 1 appears to be part of the more variable Urban group of pixels and as a consequence is misclassified.

Figure 5. Minimum Distance-to-Means Classification Strategy (after Lillesand and Kiefer, 1979)

Figure 6. Parallelepiped Classification Strategy (after Lillesand and Kiefer, 1979)

ii) An alternative to a single number representing a pattern class is to consider a range of data values in a training set. This allows the analyst to take into account the intra-category variance within the training set. The range may be defined as the highest and lowest pixel values associated with each band, or some other statistic representing range within the data. An unknown pixel, like pixel 1, can now be classified in terms of a decision region defined by a rectangle based on the range of values (Figure 6). These decision regions, when conceived in multiple dimensions (i.e., the 4-dimensional space of LANDSAT values), are known as parallelepipeds. The classification of pixel 1 in this parallelepiped framework illustrates the greater sensitivity to intra-category variation exhibited by using range values: "Barren" has a small, clearly defined region, while the Urban region is much more variable, and pixel 1 is now appropriately classified as Urban. Overlapping decision regions, however, cause problems with parallelepiped classifiers.
In the illustrative example, Agriculture and Forest have overlapping regions, so that pixel 2 falls within the Forest region when classification as Agriculture would perhaps be more appropriate. The overlap occurs because the spectral values for agriculture and forest vary similarly in both of the bands shown in the graph: they are positively correlated, because high values in Band 7 are associated with relatively high values in Band 5. The Urban and Barren classes show no distinct pattern in terms of band association. Water pixels are negatively correlated; that is, high values in Band 7 are associated with relatively low values in Band 5.

A certain amount of the confusion in parallelepiped classification which is caused by overlapping regions can be eliminated by stepping the borders of each decision region, in effect creating regions of several adjoining rectangles (Shlien and Goodenough, 1974). Classifiers that are strictly based on parallelepiped methods are few, the best known being the General Electric Image 100 System. The use of range limits as components of multispectral classification systems is more common. Interactive systems that have an "alarm" capability use the ranges of the four bands to perform a basic parallelepiped classification for one category based on one training set. This classification is then indicated on the graphics plane of the RGB monitor to show how that training set performs as a representative informational class. Range limits are also used in hybrid classifications in association with minimum distance algorithms. Here a preliminary thresholding step screens out pixels which do not fall into any of the defined parallelepipeds and then allows the minimum distance measures to operate on the remaining pixels. This can create a set of pixels which will not be classified because they are beyond the range limits defined by the training statistics.
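The hybrid strategy just described, a parallelepiped screen followed by a minimum-distance assignment, can be sketched as below. All class names, ranges, and means are hypothetical:

```python
import math

def hybrid_classify(pixel, class_ranges, class_means):
    """Parallelepiped screen plus minimum-distance assignment.
    class_ranges: class name -> (low, high) per-band limits from training data;
    pixels outside every class's box stay unclassified (None), and the rest
    are assigned to the nearest class mean."""
    inside_any = any(
        all(lo <= v <= hi for v, lo, hi in zip(pixel, low, high))
        for low, high in class_ranges.values())
    if not inside_any:
        return None
    def dist(mean):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(pixel, mean)))
    return min(class_means, key=lambda name: dist(class_means[name]))
```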
A further thresholding technique is often applied to the minimum distance measurements. The user is allowed to set a maximum distance, based on a multiple of the standard deviation associated with the mean value of a training statistic, beyond which a pixel cannot be classified, even if that mean value is in fact the closest available.

iii) A parametric classification technique which forms the basis of many analysis systems is the maximum likelihood classifier. With this procedure the training set data for individual classes are used to estimate not only the mean and standard deviation values which are used in the minimum distance formulations, but also the class covariance matrix (Swain, 1972). This covariance matrix indicates the variance present in the data for each spectral band and also the degree of correlation between each band. In order for these statistics to be taken as valid and acceptable in an analysis as estimators of the characteristics that all pixels within a particular class will have, the training set must have a Gaussian (i.e., normal) distribution. This is a critical assumption, and it has been shown that carefully chosen training set data do, in fact, meet this assumption and have histograms with normal, bell-shaped curves (Landgrebe, 1980).

The mean and covariance are used to statistically describe the distribution of a particular class in terms of a probability density function, which can be graphically portrayed on the scatter diagram as a series of contours (Figure 7). The classifier operates by making a set of calculations as to the probability of each pixel falling into each class, and then assigns it to the most likely class.³ The decision point between two classes occurs where the probabilities are equal (Figure 8). With the different probability functions of the two example classes, it is clear that the decision point (X0) is not midway between the means, as would be expected in a minimum distance formulation.
At point X1 the probability P1 is greater than P2, and consequently X1 would be assigned to Class 1. It is this potential difference between decision points based on means and those based on probability functions which creates the potential differences in the classification of particular sets of classes that are to be investigated in this study.

³A detailed explanation of the mathematical basis for Maximum Likelihood Classification can be found in Swain (1972).

Figure 7. Equal Probability Contours Defined by a Maximum Likelihood Classifier (after Lillesand and Kiefer, 1979)

Figure 8. Decision Points Based on Probability and Mean Values (after Jayroe et al., 1976)

Estimation of the probability density function is the step which characterizes the pattern of informational classes and is consequently the core of the classification methodology. Since this function is described by statistics generated from the training sets, the critical nature of training set development is reinforced. It is for this reason also that maximum likelihood classifiers are particularly sensitive to small variations in training statistics (Fleming and Hoffer, 1977).

Throughout this discussion another underlying assumption has been that each class has an equal probability of occurrence. For most LANDSAT classifications, this is not the case. To compensate for this problem some analysis systems include a Bayesian rule which allows an analyst to establish 'a priori' probabilities for specific categories, based on knowledge of particular study areas. This, combined with the establishment of thresholds set on the basis of unacceptably low probabilities, improves classification accuracy, but also creates an "unclassified" category.
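A sketch of the maximum likelihood decision rule, including the optional prior probabilities and low-probability threshold discussed above. For brevity this version assumes uncorrelated bands (a diagonal covariance, i.e., per-band variances only), whereas the classifiers described in the text estimate and use the full covariance matrix; all names and statistics are hypothetical:

```python
import math

def maximum_likelihood_classify(pixel, class_stats, priors=None, min_log_prob=None):
    """Gaussian maximum likelihood decision rule (diagonal-covariance sketch).
    class_stats: class name -> (means, variances), one value per band, from
    training data. priors optionally applies Bayesian 'a priori' weighting;
    min_log_prob optionally leaves low-probability pixels unclassified (None)."""
    best_name, best_score = None, -math.inf
    for name, (means, variances) in class_stats.items():
        # log of the (diagonal) multivariate normal density at the pixel
        score = sum(-0.5 * (math.log(2 * math.pi * var) + (v - m) ** 2 / var)
                    for v, m, var in zip(pixel, means, variances))
        if priors is not None:
            score += math.log(priors[name])
        if score > best_score:
            best_name, best_score = name, score
    if min_log_prob is not None and best_score < min_log_prob:
        return None
    return best_name
```

Because the variance enters the density, a pixel can lie nearer one class's mean yet be assigned to a more variable class, which is exactly the shift of the decision point away from the midpoint between means illustrated in Figure 8.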
Other modifications, such as the use of context information to create texture measures that can be included in classification algorithms (Swain, 1976) and the use of digital terrain information to establish classification probabilities for forest types (Strahler, 1980), are broadening the range of classification options. The use of logit analysis, which transforms non-Gaussian distributions, allows for the inclusion of information such as the distribution of soil types in digital classifications (Maynard and Strahler, 1981), but these changes are still largely experimental and are not included in the standard analysis systems available to general users.

Information Display and Tabulation

Output classification results are usually presented in map formats or area tabulations. This capability is often the weakest link in a small and relatively unsophisticated analysis system.

Maps can be produced by very sophisticated automated systems such as ink jet plotters and digital film writers, which can create geo-corrected color images with map characteristics. Uncorrected color images can also be produced, and while these are attractive they are difficult to use in an applications context when it is necessary to tie classifications to specific ground locations. These products are very expensive to produce. The equipment necessary to take the image from an RGB monitor⁴ and produce a high-quality hard-copy output required, until recently, investments only possible in sophisticated processing facilities. Less expensive hard-copy systems for use with small systems are now available; however, photographing the screen of the RGB monitor is often the only practical alternative for obtaining hard-copy color output.

A more common and locally available map output is the standard line printer map. For this output, the analyst selects various symbols to represent the classes selected, and these are printed on a standard line printer or dot matrix printer.
Prior to the EDIPS processing system at the EDC, LANDSAT data were reformatted such that when the 79 m x 57 m ground resolution pixels were output to a line printer set at eight characters per inch and 10 lines per inch, a map of approximately 1:24,000 scale was the result. This is very convenient for comparison with standard topographic maps of the same scale. EDIPS processing implemented more sophisticated geometric correction, which involves resampling of the LANDSAT data and an output pixel size of 57 m x 57 m. Standard line printer maps are thus elongated in the vertical dimension. In order to use a line printer to produce an orthogonal map in these circumstances, it is necessary to have a printer which can represent a square pixel. Standard line printers and remote terminals do not have this capability.

⁴RGB is the technical abbreviation for Red, Green, Blue, which designate the color guns used in the display monitors on which LANDSAT data are portrayed.

The Decwriter IV terminal, a more advanced version of the standard interactive terminal, allows the user to specify, within a certain range of alternatives, the number of characters and lines per inch. The only alternative available here which will represent a square pixel is 12 characters per inch and 12 lines per inch. This produces a character map at a scale of approximately 1:27,000, which is difficult to use with standard topographic maps. Another option is to re-sample the output product, essentially adding or subtracting data to create a map of the appropriate scale. This may require software development which can be beyond the analyst's resources; it also introduces a certain amount of error into the map product. A dot matrix printer of the type that is usually part of a micro-computer analysis system also has the capacity to represent a square pixel through assignment of the pins that form each character. This will also require some software modification in most instances.
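The 1:27,000 figure quoted above can be checked with a little arithmetic (a sketch; the pixel size and printer settings are as given in the text):

```python
# At 12 characters and 12 lines per inch the Decwriter prints a square
# character cell 1/12 inch on a side, and each cell represents one
# 57 m x 57 m EDIPS pixel on the ground.
METERS_PER_INCH = 0.0254
PIXEL_M = 57.0

cell_size_m = METERS_PER_INCH / 12           # printed size of one pixel
scale_denominator = PIXEL_M / cell_size_m    # ~26,900, i.e. roughly 1:27,000
```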
It is clear, then, that the production of orthogonal line-printer maps from recently processed LANDSAT data presents some logistical problems.

Tabular outputs of classification results are used for acreage determinations. Even the most rudimentary analysis systems will tabulate the number of pixels assigned to each class, and from these, acreage calculations can be made. More sophisticated systems have a digitizer capability which allows an analyst to specify the boundaries of a specific area within the classified scene, for example a county or a watershed, and then generate area statistics from within this area. Again, rudimentary systems often do not have this capability.

Evaluation of Classification Results

A number of methods have been employed to evaluate the accuracy of LANDSAT classification results. They include visual comparison with existing map sources, establishment of known test areas within the data set that are checked for accuracy after classification, and acreage comparisons between the satellite inventory and ones made through conventional methods. Systematic sampling of areas within the classified data for accuracy assessment has generally been absent from LANDSAT-generated inventories, but is beginning to be regarded as critical to the operational use and acceptance of these products (Mead and Szajgin, 1980). The accuracy assessment of the LANDSAT classifications will be covered in more detail in the chapter dealing with the results of the current study.

CHAPTER IV

EXECUTION OF THE LANDSAT TEST

Introduction

Execution of the classification test was completed using the software systems of the Earth Resources Data Analysis System (ERDAS). This is a completely self-contained image-processing computer system, based on the Z-80 microprocessor.
It contains LANDSAT processing software which features false color display, interactive field selection, a variety of image enhancement procedures, and three major LANDSAT classification algorithms (ERDAS, 1981). The objectives of the study require splitting the classification test into a series of tasks: i) Category reduction. The final categories established in Chapter II represent the land use categories that fulfill the general requirements of the majority of planners in Michigan. This list was compiled without reference to modes of acquiring these land use data; modifications now have to be made to accommodate limitations that processing of digital LANDSAT data imposes. ii) Study area definition and LANDSAT data acquisition. Representative study areas have to be selected and available LANDSAT data must be identified for a classification performance test. iii) Training set procedures and definition. Using the capabilities of the ERDAS system, a training procedure has to be developed which approximates the best approach indicated in the review of training set selection methodologies. iv) Classification of data. For classification of the test sites, a common set of training statistics is subjected to the minimum distance-to-means and the maximum likelihood algorithms in the ERDAS software. An unsupervised clustering algorithm is also used to classify the test sites. v) Evaluation of classification results. This is the central theme of the study and will be dealt with in a separate chapter.

Category Reduction

The system of categories identified through analysis of the questionnaire returns is presented in Table 13. Categories were selected from each level of generalization. All seven Level I-categories are included. Fifteen Level II-categories were selected, to which have been added the three forest Level II-categories, because of their importance in terms of a classification applicable to all of Michigan.
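The two supervised algorithms named in task iv can be sketched as follows (a minimal per-pixel illustration of the general methods, not the ERDAS implementation; the 4-band training statistics below are hypothetical):

```python
import numpy as np

def minimum_distance(pixel, class_means):
    """Assign the class whose mean spectral vector is nearest (Euclidean)."""
    dists = {c: np.linalg.norm(pixel - m) for c, m in class_means.items()}
    return min(dists, key=dists.get)

def maximum_likelihood(pixel, class_means, class_covs):
    """Assign the class with the highest Gaussian log-likelihood."""
    best, best_ll = None, -np.inf
    for c, mean in class_means.items():
        cov = class_covs[c]
        d = pixel - mean
        ll = -0.5 * (np.log(np.linalg.det(cov)) + d @ np.linalg.inv(cov) @ d)
        if ll > best_ll:
            best, best_ll = c, ll
    return best

# Hypothetical 4-band MSS training statistics:
means = {"water": np.array([20., 15., 10., 5.]),
         "forest": np.array([30., 25., 60., 70.])}
covs = {c: np.eye(4) * 9.0 for c in means}
pixel = np.array([22., 16., 12., 8.])
print(minimum_distance(pixel, means))            # "water"
print(maximum_likelihood(pixel, means, covs))    # "water"
```

Minimum distance uses only the class means; maximum likelihood also uses the covariance of each training set, which is why the two can disagree when classes differ in spectral variability.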
The seven Level III-categories selected are all specific elements from within Level II-Urban categories. Again, three additional categories were added to the list from the Level III group to complete a distinct block of urban classes identified by questionnaire responses. Three processes are necessary to convert this list of classification categories into an internally consistent classification scheme that can be used in an interpretation process. They involve: i) a rationalization of overlaps between categories which exist because the questionnaire asked for selections from each level; ii) selection of categories that are interpretable from LANDSAT data, i.e., categories with distinct spectral characteristics and appropriate resolution requirements; and iii) segmentation of land use categories into land cover types for which training sites can be developed.

Table 13. Classification Categories Derived from Analysis of the Michigan Data Needs Questionnaire

Level I (7): Urban; Agriculture; Range; Forest; Water; Wetland; Barren

Level II (15, plus 3 forest categories): Residential; Commercial, Services & Institutional; Industrial; Transportation, Communications & Utilities; Extractive; Open Space; Crops & Pasture; Orchards; Confined Feeding; Other Agriculture; Streams; Lakes; Great Lakes; Forested Wetland; Non-forested Wetland; plus Broadleaf Forest; Coniferous Forest; Mixed Forest

Level III (7, plus 3): Multi-family Residential; Single-family Residential; Mobile Home Parks; Central Business District; Air Transportation; Rail Transportation; Solid Waste; plus Shopping Center; Strip Development; Road Transportation

The list can be condensed initially by considering the duplication in Level I-categories. Level I-Urban, for instance, can be obtained by combining the Level II-Urban oriented categories of Residential, Commercial, Industrial, Transportation, Extractive, and Open Space. Mixed Urban is the only category missing from this comprehensive Level II list, and it will be incorporated into the other categories where appropriate.
Some discrimination of the land use pattern will be lost and some error introduced by diluting the purity of other classes with the mixed uses, but this is not critical. Level I-Agriculture can be obtained in a similar way. The major agriculturally oriented Level II-categories are indicated and when combined will comprehensively represent Level I-Agriculture. Level I-Range and Level I-Barren are not represented by Level II-categories and have to be directly interpreted.1 Level I-Forest, Water, and Wetland are also all comprehensively represented by Level II-categories. In the overlap rationalization process, then, the original category list can be cut by five Level I-categories for interpretation purposes. From this modified list of interpretable categories a further rationalization is necessary to make the list compatible with a LANDSAT classification test. As has been indicated earlier, two critical factors determine the type of data that are available for the earth's surface from the LANDSAT multi-spectral scanner, which, in combination, establish the type of information classes that can be derived from such data. The data available are spectral records, and the major current interpretation methodologies to be used in this test employ procedures that use only this spectral information.

1The Range category is not particularly appropriate for land use classification in Michigan although it is included in the Michigan Land Cover/Use Classification System. For this reason it was incorporated into the Michigan Data Needs Questionnaire and was selected as a category for interpretation. The category represents unimproved land with an herbaceous cover and consequently causes numerous interpretation difficulties that will be elaborated in later sections of this chapter.
The ground resolution of these data is approximately 1.1 acres, so that landscape elements smaller than this are averaged into the spectral responses of surface features immediately surrounding them. Functional classes dependent on spatial association and/or juxtaposition of elements smaller than 80 meters, therefore, are not interpretable, and the category list must be modified with this criterion in mind. When the criterion is applied to the category list, Level III-categories that define the most detailed level of information are particularly vulnerable. Distinguishing between multi-family and single-family residential involves a degree of detail that LANDSAT is unable to provide. Mobile home parks have a distinctive pattern on aerial photography, but individual units are not discriminable with LANDSAT resolution. Level III-Residential categories will, therefore, be omitted from the test list. The three Level III-categories associated with Commercial, Services and Institutions pose similar problems. They are mainly composed of larger buildings and parking lots, in various proportions, but are primarily distinguishable by their location, rather than their spectral properties. These three categories, therefore, are also collapsed into the Level II-Commercial umbrella. The Level III-categories associated with Transportation and Utilities are also difficult to include in the test category list. Linear features stand out well on small scale images of LANDSAT data, but form stepped patterns when displayed at larger scales. Resolution and shape help distinguish road from rail on aerial photography, but this is not possible with LANDSAT data. Rail and Road Transportation are thus dropped from the list. Ground facilities for air transportation are defined solely in terms of shape and pattern.
Large airport facilities are sometimes distinguishable on LANDSAT imagery because of their paved surfaces and this may be transferable to digital processing methodology, but since this is a feature which covers only a very small part of any area to be interpreted it is also omitted from the test category list. Solid waste is a pattern-oriented functional land category type that is difficult to accurately identify even on large-scale aerial photography and is thus not included on the category list. Collapsing Level III-categories into their respective Level II-categories now leaves twenty categories, principally at Level II, which are in turn collapsible into seven Level I-categories after classification. These twenty categories can also be rationalized. Despite their variability, streams, lakes, and Great Lakes can be grouped into the Level I-class of Water and be interpreted by a map user at the more detailed level. Extractive industry as a class is only interpretable in LANDSAT terms as open pit mining, and this is very similar to the component elements within Barren. The two classes are thus combined and their interpretation on the classified output made contingent on USGS topographic map overlay correlation. Confined Feeding and Other Agriculture are also difficult to consider as directly LANDSAT-interpretable. They are both mixtures of buildings, which cannot be included because of resolution requirements, and cover types that would probably be classified, in isolated fashion, as either crops, pasture, or range. These categories are, therefore, dropped. Open Space is another difficult category. Its component parts can vary considerably and the primary characteristic used in identification is its location within an urban area. Interpretation of this category on classified output can be made through USGS topographic map overlay correlation and it is also eliminated from the category list.
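The resolution criterion applied throughout this section rests on the roughly 1.1-acre pixel cited earlier; the figure follows directly from the nominal MSS pixel dimensions (the only added constant below is the standard square-meters-per-acre conversion):

```python
# Nominal MSS ground-resolution cell: 79 m x 57 m (figures from this chapter).
SQ_M_PER_ACRE = 4046.86

pixel_sq_m = 79 * 57
pixel_acres = pixel_sq_m / SQ_M_PER_ACRE
print(round(pixel_acres, 2))   # 1.11, i.e. approximately 1.1 acres
```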
The final list of classification categories then comprises 13 categories (Table 14).

Table 14. Category Rationalization Process (final categories are capitalized)

Urban: RESIDENTIAL (collapsing Multi-family Residential; Single-family Residential; Mobile Home Parks); COMMERCIAL, SERVICES & INSTITUTIONAL (collapsing Central Business District; Shopping Center; Strip Development); INDUSTRIAL; Transportation, Communications & Utilities (collapsing Rail Transportation; Road Transportation; Air Transportation; Solid Waste)

Agriculture: CROPS & PASTURE (collapsing Confined Feeding; Other Agriculture); ORCHARDS

Range: RANGE

Forest: BROADLEAF FOREST; CONIFEROUS FOREST; MIXED FOREST

Water: WATER (collapsing Streams; Lakes; Great Lakes)

Wetland: FORESTED WETLAND; NON-FORESTED WETLAND

Barren: BARREN/EXTRACTIVE

Study Area Selection and Classification Materials

Study Area Selection

The data-needs questionnaire was distributed throughout Michigan and the results represent a statewide response. It is important, when testing the capability of LANDSAT classification algorithms, therefore, to choose test areas which represent a range of environmental situations found in the state. In general terms, this meant selection of areas which represented land use types in agricultural areas, urban areas, and forested areas. Logistically, these areas had to be located within one LANDSAT scene to minimize data acquisition costs and computer time required to window and reformat test area locations. The frame which is centered approximately on Grand Rapids, Michigan has enough variability to accommodate these requirements. The test areas were small, their size being dictated primarily by the maximum size limit of the ERDAS microcomputer system. This is a screen size limit of 256 x 240 pixels (105.6 mi²). Three test sites were chosen for the study. Their characteristics can be reviewed by reference to the following descriptions and graphic materials which depict the areas on a topographic map, a black-and-white air-photo mosaic, and a false-color LANDSAT display.
i) Agricultural - This is a primarily agricultural area north of the small city of Allegan. An area of broadleaf forest to the south and west of the city and a mixture of residential areas, small woodlots and cropland to the south and east make up the remainder of this site. The Kalamazoo River runs through the lower part of the test area, broadening into the eastern part of Lake Allegan. Two other medium-sized lakes are included in the test area (Figures 9, 10 and 11). ii) Urban - This test site is centered on the city of Grand Rapids and includes the Central Business District (CBD) and industrial development along the Grand River immediately south of the CBD. The suburban areas to the north and west, where residential development is displacing orchard lands, are also included (Figures 12, 13 and 14). iii) Forest - This is an area of primarily broadleaved forest with little agriculture or residential development, located in the Manistee National Forest. The site is in Newaygo County and includes parts of the Hardy Dam and Croton Dam ponds, a large wooded swamp, and riparian wetlands along the Little Muskegon River (Figures 15, 16 and 17).

Figure 9. Topographic Map Display of the Agricultural Test Site
Figure 10. Black-and-White Mosaic Display of the Agricultural Test Site
Figure 11. False Color Composite of LANDSAT Data for the Agricultural Test Site Displayed on an RGB Monitor (Color Gun Assignments: MSS Band 4 - Blue, MSS Band 5 - Green, MSS Band 7 - Red)
Figure 12. Topographic Map Display of the Urban Test Site
Figure 13. Black-and-White Mosaic of the Urban Test Site
Figure 14. False Color Composite of LANDSAT Data for the Urban Test Site Displayed on an RGB Monitor (Band Assignments - see Figure 11)
Figure 15. Topographic Map Display of the Forest Test Site
Figure 16. Black-and-White Mosaic of the Forest Test Site
Figure 17. False Color Composite of LANDSAT Data for the Forest Test Site Displayed on an RGB Monitor (Band Assignments - see Figure 11)
Figure 18. Location of the Test Sites Within the LANDSAT Scene (Path 23, Row 30) and the Centers of the Scenes That Cover Michigan (1. Agricultural Test Site; 2. Urban Test Site; 3. Forest Test Site)

LANDSAT Data Acquisition

The study areas which exhibit the necessary range of environmental conditions for an adequate classification test were located in the LANDSAT scene centered on Grand Rapids, Michigan (nominal center path 23, row 30 of the Worldwide Reference System, Figure 18). When reviewing the available data, a number of image quality factors have to be considered, such as radiometric quality, presence of clouds, and the processing system used. This usually eliminates a number of alternatives, particularly in Michigan, where cloud cover problems are severe. Specific factors, such as time of year, cost constraints, and purpose are involved in the choice among alternatives that remain. In this instance, image selection was determined by the need to obtain recent coverage of the scene between May and September. A previous study (Karteris, 1980) indicated that images for August 16, 1979, and September 12, 1979, were the only ones of suitable quality available. Computer-compatible tapes of these scenes were available at the Michigan State University Center for Remote Sensing (CRS), and a decision was made to use the August scene.2 It was clear at the outset that data from August are not optimal for general land use and land cover discrimination. Active plant growth in crop and tree species is occurring and this results in similar spectral records for different major ground cover types. Previous work in the mid-western United States has indicated that early June is the optimal time of year for land cover discrimination (Wescott and Breault, 1978; Hill-Rowley, 1980). During this season crop/forest, forest/residential, and wetland/forest discrimination is optimized.
2While the study was in progress an attempt was made to obtain data from the EROS Data Center (EDC) for June 17, 1980 that had recently become available. The data were of excellent quality and hard copy images were obtained. Unfortunately, the digital record of this information retained by the EDC was damaged and could not be made available without re-processing at Goddard Space Flight Center. This facility has had considerable backlogs (see LANDSAT Data Users Notes, No. 20, September, 1981) and a good copy of the tape could not be made available within a reasonable time frame. The data for September are available at the CRS; however, software problems precluded the use of these data with the ERDAS micro-computer system in this study.

Use of the August data in this study points up the fact that while the LANDSAT satellites acquire data at frequent intervals, theoretically allowing the analyst to select seasonal coverage appropriate to specific applications, it is often difficult and sometimes impossible to meet these requirements. The results of analyses using less than optimal data may then give a better indication of operational performance potential than those in cases where all requirements were fulfilled.

Ground Verification

An integral part of testing the effectiveness of LANDSAT classification procedures is an evaluation of their accuracy. In this study, the reference source against which the classification products were tested for accuracy was 1:24,000 color infrared aerial photography obtained by the Michigan Department of Natural Resources (DNR). This photography was acquired during 1977 and 1978. Some changes can be anticipated between this period and the LANDSAT date. The five to eight week seasonal difference is also important. Visual comparison of the aerial photography and the LANDSAT image, however, indicated that these changes were minimal.
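The cell-by-cell accuracy comparison that such a reference source supports can be sketched as follows (a minimal illustration; the class labels and the two co-registered grids below are hypothetical, not the study's data):

```python
from collections import defaultdict

def confusion_matrix(reference, classified):
    """Tally (reference, classified) pairs over co-registered grid cells."""
    matrix = defaultdict(int)
    for ref_row, cls_row in zip(reference, classified):
        for ref, cls in zip(ref_row, cls_row):
            matrix[(ref, cls)] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of cells where the classification agrees with the reference."""
    total = sum(matrix.values())
    correct = sum(n for (ref, cls), n in matrix.items() if ref == cls)
    return correct / total

# Hypothetical 3 x 3 geocoded grids (W = water, F = forest, C = crops):
reference  = [["W", "F", "F"], ["C", "C", "F"], ["C", "C", "W"]]
classified = [["W", "F", "F"], ["C", "F", "F"], ["C", "C", "W"]]
m = confusion_matrix(reference, classified)
print(round(overall_accuracy(m), 3))   # 0.889: 8 of 9 cells agree
```

The off-diagonal entries of the matrix show which categories are being confused, which is the information category rationalization in later sections depends on.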
A significant part of the accuracy assessment was to evaluate LANDSAT classification performance by comparing the results with a land use map produced by conventional methods, and also with digital data derived from this map that had been entered into a grid-based information system. In order to reduce the bias involved in such a test it was necessary to complete the delineation of land use and land cover before classifying the LANDSAT data. This interpretation was performed using the following procedure: i) Review the aerial photography with particular reference to the appropriateness of the land use and land cover categories in the final classification and modify as required. ii) Establish a minimum type size equivalent to a 3 x 3 LANDSAT pixel matrix (171m x 171m ground dimensions). This makes the delineated map compatible with a 3 x 3 pixel grid size used to geocode the LANDSAT classifications into a geographic information system format. This size was selected because it is the closest pixel aggregation to the standard minimum type size of 10 acres that is being used by the DNR in its statewide inventory. iii) Delineate land use and land cover categories for the test sites, employing the final classification as confirmed or modified in Step i. iv) Transfer delineations to a mylar overlay which is pin registered to a mylar copy of the topographic map at 1:24,000 scale.3 v) Enter the delineated land use and land cover into the geographic information system resident on the ERDAS micro-computer system using a geocoding grid specifically prepared to represent a 3 x 3 pixel size.

Training Set Selection

Training Set Selection Procedure

The ERDAS system is interactive. The data can be displayed on a color monitor with 31 levels for each color gun and supervised training can take place interactively. A cursor is used to outline cover types to be used in training and it is possible to magnify the image and re-display it on the screen. Combining these capabilities allows the user to precisely locate training fields (Figures 19 and 20). The software also allows the analyst to "alarm" a training field. Alarm takes the maximum and minimum values from the training field in each of the four bands and performs a single category classification by the parallelepiped method which is displayed on the graphics plane of the screen (Figure 21). The effectiveness of individual training fields can thus be evaluated with the study area data in a real-time environment. While interactive display makes training field selection a fairly precise operation, the software does not offer as much evaluative statistical information as some other, batch-oriented systems.4 However, histograms are displayed on the screen (Figure 20) and comprehensive statistics for the training samples (as opposed to each pixel within them) are available. The ERDAS software does include a clustering algorithm with a maximum limit of 27 classes and this provides a limited capability to expand the evaluative process. It is not possible, however, to completely simulate the multi-cluster block method for a number of reasons: i) small areas from within the test area data sets (the actual blocks to be clustered) cannot be easily defined; ii) while clustered output can be edited to combine several spectral classes into a land use class, it is not possible to generate statistical information on this class; iii) there is no facility to take the means and covariances generated from clusters and enter them directly into the ERDAS processing algorithms. It was possible to cluster the complete data set for each study area and use this information to confirm the spectral homogeneity of a supervised training field. The combined features of color-interactive selection, statistical output and precise pixel location within the clustered information made this technique very effective.

3Photographically prepared 1:24,000 scale mylar copies of the topographic maps were prepared for the three test sites. The original topographic maps for Allegan and Croton Dam were at 1:62,500 scale.

4The University of Washington Image Processing System (UWIPS) is resident at Michigan State University. This system is derived from algorithms originally developed at Oregon State University (Muller, 1979). Modules of this program provide hard-copy histograms with associated maximum, minimum, mean and standard deviation values for training sets. Also available is a table which lists the brightness values for each pixel along with a measure of distance between this value and the mean for its class, and the total distance from the means, squared and totaled, for the four bands. These distance values indicate any odd brightness values in a particular band and the combined distance totals suggest non-conforming pixels within an otherwise good set. A combined distance total over 10 is likely to indicate a non-conforming pixel (Muller, 1979), which should be removed. The combined distance totals and standard deviations are useful evaluative tools in assessing whether the training set will perform well in a classification. UWIPS was used in the early stages of this project for training set evaluation and for the minimum distance classification algorithm it contained. In both of these areas compatibility with the ERDAS system was a severe problem. Later versions of the ERDAS software contained a minimum distance classification algorithm and additional training set information; consequently, use of UWIPS was discontinued.

Figure 19. Identification of a Training Area Representing Water on Aerial Photography
Figure 20. The Water Training Area Delineated on a LANDSAT Display (2X Magnification), with Accompanying Histograms for 4 LANDSAT Bands
Figure 21. An Alarmed LANDSAT Display for the Water Training Area
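The two training-field checks described in this section, the parallelepiped "alarm" and cluster confirmation, can be sketched together as follows (a toy illustration with hypothetical 4-band values and centroids, not the ERDAS code):

```python
from collections import Counter

def alarm_box(training_pixels):
    """Per-band (min, max) limits of a training field, as the alarm function uses."""
    return [(min(band), max(band)) for band in zip(*training_pixels)]

def in_box(pixel, box):
    """Parallelepiped test: inside the min/max limits in every band."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(pixel, box))

def nearest_cluster(pixel, centroids):
    """Index of the closest scene-cluster centroid (squared Euclidean)."""
    def sqdist(p, c):
        return sum((a - b) ** 2 for a, b in zip(p, c))
    return min(range(len(centroids)), key=lambda i: sqdist(pixel, centroids[i]))

def cluster_purity(training_pixels, centroids):
    """Fraction of the field's pixels falling in its dominant spectral cluster."""
    labels = [nearest_cluster(p, centroids) for p in training_pixels]
    return Counter(labels).most_common(1)[0][1] / len(training_pixels)

# Hypothetical water training field, MSS bands 4, 5, 6, 7:
water = [(21, 14, 9, 4), (23, 16, 11, 6), (22, 15, 10, 5), (24, 17, 12, 7)]
box = alarm_box(water)
print(in_box((22, 15, 10, 5), box))    # True: this pixel would light up
print(in_box((40, 35, 60, 70), box))   # False

# Hypothetical centroids from clustering the full study-area data set:
centroids = [(22, 15, 10, 5), (30, 25, 60, 70), (45, 40, 50, 45)]
print(cluster_purity(water, centroids))   # 1.0: a spectrally homogeneous field
```

A field that alarms cleanly but scatters across several scene clusters would be a candidate for editing or rejection under the cluster-confirmed approach.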
The procedure developed, then, was another type of hybrid: a cluster-confirmed supervised training field approach. A detailed review of the entire training set selection procedure is included in Appendix B.

Training Set Delineation

Initial Review

At this stage in the process, critical decisions had to be made that influence the effectiveness of the classification performance. The first step was to review the classification scheme and to determine whether all of the categories were present in each of the test sites. This was quite straightforward. Mixed Forest was not present in the Agricultural test site; Coniferous and Mixed Forest categories were not present in the Urban test site; and Commercial, Industrial, Transportation, and Orchards were absent from the Forest test site. With inappropriate categories eliminated from the classification scheme of each test site it was then possible to implement the training set methodology described in the last section. The objective was to establish the spectral components of the classification categories that remain for each test site, and decide whether further refinement was necessary because of spectral overlap and consequent confusion among categories. A review of LANDSAT data for each test site displayed on the ERDAS color monitor indicated, at the outset, that a number of other categories had to be eliminated as separate entities. In the Agricultural site, the only area classified as Transportation by aerial photo-interpretation was the Allegan Airport. This is a small group of buildings with two narrow paved strips and a large perimeter area of grass cover. The spatial resolution capabilities of LANDSAT relative to the size of this complex make it very difficult to identify such a functional feature; therefore, Transportation was omitted from the Agricultural test site category list and recoded as Range in the geocoded ground-verification data base.
Three categories could not be separated on initial review in the Urban test site. Both Forested Wetland and Non-Forested Wetland represent very small areas in the photo-interpreted inventory and were not clearly discriminable. They were both omitted from the category list and recoded as Broadleaf Forest in the geocoded ground-verification base. The Range category was a more difficult problem. It occupies a larger area than Wetlands on the photo-interpreted map although it is not a major category. The LANDSAT display shows that it is difficult to separate Range from Pasture and some Range appears to be confused with Broadleaf Forest. Range was omitted from the category list and recoded into Crops and Pasture in the geocoded ground-verification data base. In the Forest test site only Non-Forested Wetlands posed a problem on initial review. These were small sites located within the Forested Wetland areas. The category was eliminated and recoded as Forested Wetland in the geocoded ground-verification data base. Land use and land cover training areas associated with the remaining classification categories for each test site are listed in Tables 15, 16, and 17.

Table 15. Training Categories: Agricultural Test Site
Training Area Descriptions: Classification Category (Level I Grouping)
- Older Central; Newer Peripheral: RESIDENTIAL (Urban)
- Central Business District: COMMERCIAL, SERVICES & INSTITUTIONAL (Urban)
- Industrial Complex on River; Industrial Plant in Agricultural Area: INDUSTRIAL (Urban)
- Crops, high vigor (cropped); Crops, low vigor (harvested); Bare Field; Pasture: CROPS & PASTURE (Agriculture)
- Orchards: ORCHARDS (Agriculture)
- Herbaceous Pasture/Scrub: RANGE (Range)
- Broadleaf Forest (oak dominant): BROADLEAF FOREST (Forest)
- Coniferous Forest (red pine); Coniferous Forest (jack pine): CONIFEROUS FOREST (Forest)
- Deep Lake Water; Shallow River Water: WATER (Water)
- Riverine Forested Wetland: FORESTED WETLAND (Wetland)
- Riverine Marsh: NON-FORESTED WETLAND (Wetland)
- Sand Mining Sites: BARREN/EXTRACTIVE (Barren)

Table 16. Training Categories: Urban Test Site
Training Area Descriptions: Classification Category (Level I Grouping)
- Subdivision Residential with Scattered Trees and Large Backyards; Heavily Wooded Residential; Mobile Homes: RESIDENTIAL (Urban)
- Central Business District; Shopping Center: COMMERCIAL, SERVICES & INSTITUTIONAL (Urban)
- Large Manufacturing Complex: INDUSTRIAL (Urban)
- Interstate Road System; Rail Yards: TRANSPORTATION, COMMUNICATIONS & UTILITIES (Urban)
- Cropped Field; Pasture; Recreational Open Space: CROPS & PASTURE (Agriculture)
- Orchards: ORCHARDS (Agriculture)
- Broadleaf Forest (oak dominant): BROADLEAF FOREST (Forest)
- Lake Water; River Water: WATER (Water)
- Sand Mining Site: BARREN/EXTRACTIVE (Barren)

Table 17. Training Categories: Forest Test Site
Training Area Descriptions: Classification Category (Level I Grouping)
- Nucleus of Small Settlement: RESIDENTIAL (Urban)
- Crops; Permanent Pasture: CROPS & PASTURE (Agriculture)
- Herbaceous Cover (rough pasture and scattered trees): RANGE (Range)
- Broadleaf Forest (oak); Broadleaf Forest (aspen): BROADLEAF FOREST (Forest)
- Coniferous Forest (white pine); Coniferous Forest (jack pine); Coniferous Forest (red pine): CONIFEROUS FOREST (Forest)
- Deep Reservoir Water: WATER (Water)
- Forested Wetland (lowland hardwood); Forested Wetland (swamp conifer): FORESTED WETLAND (Wetland)
- Sand Mining Activity: BARREN/EXTRACTIVE (Barren)

Generalization

In the Agricultural test site of Allegan County twelve classification categories remained (Table 15). It was clear that confusion between a number of categories existed; however, careful editing of sites allows for spectral separability to be maintained and information validity retained through reference to the aerial photography. Nevertheless, the "alarmed" images indicated that both Residential categories were easily confused with Broadleaf Forest, Forested Wetland was confused with Broadleaf Forest, and Industrial facilities with sand mining sites and some bare fields. In the selection of training sites, a conflict always arises between the necessity to obtain a "pure" training site which correctly represents a spectral class, and the necessity of choosing a site large enough that the statistics derived to represent that class are valid. At this stage, small groups of pixels which alarmed well and could be cluster-confirmed were chosen, even if they were smaller than optimal in terms of size. The cluster-confirming process and evaluation of training area statistics required further rationalization procedures. i) The Industrial training sites, which were only marginally acceptable in the first-cut evaluation, were small in size and very variable in nature. One of these sites exhibited characteristics similar to those for the Central Business District which represented the Commercial category. Industrial, as a separate category, was consequently dropped and combined with Commercial. Industrial will still fall under the Level I-Urban category. ii) Orchards could not be discriminated and appeared to be grouped with Broadleaf Forest. This introduces a higher level of error into the classification because Orchards is a separate Level II-category that belongs to Level I-Agriculture. Combining Orchards with Broadleaf Forest shifts the category into another Level I-grouping (Forest), making this generalization less clearly defined. iii) Both of the wetland categories had to be eliminated.
Accurate discrimination of these categories is difficult in August, even with high-resolution data sources. Forested Wetland was indistinguishable from Broadleaf Forest and was consequently grouped with this category. The one distinct area of Non-Forested Wetland was statistically very similar to cropped fields and this category was also eliminated. iv) Barren/Extractive was a marginal category at the outset, the sand mining sites being confused with the Central Business District and a few bare fields. These signatures and the classification category were also eliminated; the area involved was very small. v) Residential, Coniferous Forest, and Water categories had two training areas each. In all cases, the areas were similar enough in terms of their statistical characteristics to be combined into composite training areas. The final list of training areas and classification categories used in the Agricultural test site is contained in Table 18. For the Urban test site centered on Grand Rapids, Michigan, nine classification categories remained (Table 16). The confusion present between training areas from major classes demanded a rationalization and elimination of categories that accounted for significant parts of the test site. i) Residential areas occupy a substantial amount of the Urban test site and were originally generalized into three types: subdivisions, heavily wooded, and mobile homes. The heavily wooded residential areas, however, were confused with Broadleaf Forest and the Mobile Home Parks with the Central Business District. The only viable Residential training area was associated with an older subdivision and this was retained. ii) Commercial and Industrial training areas were substantially confused with each other and had to be combined into one composite group. The Central Business District and Manufacturing Plant training areas were retained, but these labels are not strictly descriptive.
Both training areas represent part of the Commercial/Industrial category and differ from each other only in terms of building density.

Table 18. Final Training Areas: Agricultural Test Site

Training Area Descriptions     No. of Pixels   Classification Categories     Level I Groupings
Residential                    25              RESIDENTIAL                   Urban
Central Business District      17              COMMERCIAL, SERVICES &
                                               INSTITUTIONAL/INDUSTRIAL      Urban
Crops (cropped)                26              CROPS & PASTURE               Agriculture
Crops (harvested)              16
Bare Field                     36
Pasture                        21
Herbaceous Pasture & Scrub     20              RANGE                         Range
Broadleaf Forest               35              BROADLEAF FOREST              Forest
Coniferous Forest              15              CONIFEROUS FOREST             Forest
Water                          24              WATER                         Water

iii) Transportation is a problem category. The two major constituents are the Interstate road system and several large rail yards. Neither of these functional categories has any spectral homogeneity. The road system has reflective characteristics of Residential areas for most of its length, with the large intersections being closer to Commercial. So, despite a clear pattern on the display, this section of Transportation was eliminated. The rail yards were confused with Industrial plants and also had to be eliminated. Splitting of the classification between these two types creates a problem with recoding the ground verification information. The data were re-interpreted to designate rail yards as Industrial, and the remaining road system was recoded as Residential.

iv) Recreational open space, primarily small parks and golf courses, had very similar reflectance patterns to those of pasture. Since this category is most conveniently grouped with Crops and Pasture in the overall study classification system, and the golf course provided an excellent training area, Recreational Open Space is used in the final training area set to represent pasture.

v) Orchards are confused with Broadleaf Forest and are eliminated. The ground verification data are recoded accordingly.
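The recoding of ground verification categories described above is, in effect, a table-driven remap of a class raster. A minimal sketch of the idea; the integer class codes and the small array are hypothetical, since the study's actual coding scheme is not given:

```python
import numpy as np

# Hypothetical integer codes for the geocoded ground-verification raster;
# the actual coding scheme used in the study is not given.
ORCHARDS, BROADLEAF_FOREST = 21, 41
RAIL_YARD, INDUSTRIAL = 14, 13
ROAD, RESIDENTIAL = 15, 11

# Recodes described in the text: Orchards -> Broadleaf Forest,
# rail yards -> Industrial, remaining road system -> Residential.
RECODE = {ORCHARDS: BROADLEAF_FOREST,
          RAIL_YARD: INDUSTRIAL,
          ROAD: RESIDENTIAL}

def recode(raster, table):
    """Apply a category recode table to a class raster; categories
    not listed in the table are left unchanged."""
    out = raster.copy()
    for old, new in table.items():
        out[raster == old] = new
    return out

ground_truth = np.array([[21, 41],
                         [15, 14]])
print(recode(ground_truth, RECODE))
# [[41 41]
#  [11 13]]
```

Because unlisted codes pass through untouched, the same routine serves each test site's slightly different recode list.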
vi) Sand mining sites cannot be separated from the Central Business District signature and are eliminated. The ground verification data are recoded as Commercial.

The final list of training areas and classification categories used in the Urban area test site is contained in Table 19. The Forest test site, located in the Croton and Hardy Dam area of Newaygo County, Michigan, required the least amount of category rationalization (Table 17).

Table 19. Final Training Areas: Urban Test Site

Training Area Descriptions     No. of Pixels   Classification Categories     Level I Groupings
Subdivision Residential        36              RESIDENTIAL                   Urban
Central Business District      65              COMMERCIAL, SERVICES &
Large Manufacturing Complex    37              INSTITUTIONAL/INDUSTRIAL      Urban
Cropped Fields                 15              CROPS & PASTURE               Agriculture
Recreational Open Space        28
Broadleaf Forest               50              BROADLEAF FOREST              Forest
Lake Water                     63              WATER                         Water
River Water                    14

The dominant general cover is forest, and so the classification problem is to separate forest from non-forest and differentiate classes within the forest cover. The non-forest classes account for a very small proportion of the test site and, while they are split into the three categories of Residential, Crops and Pasture, and Range in four total training areas, there is considerable confusion amongst them. Despite these problems, all four training areas were retained in the classification. Broadleaf Forest areas are generally dominated by oak or aspen species. There was minor confusion between the oak-dominant training area and the lowland-hardwood Forested Wetland training area, but this was not severe and had to be accepted. Within the three Coniferous Forest training areas there is a certain amount of intra-training area confusion, but this does not affect other classification categories. Two training area categories had to be eliminated from the original list.
Swamp conifers within the Forested Wetland category cannot be separately differentiated from other conifer species; therefore, Forested Wetland is represented only by the lowland-hardwood training area. This suggests an under-estimation of this category vis-a-vis the ground verification data. Sand mining activity, as with the other test sites, could not be isolated from other bright targets and was consequently eliminated. Barren was recoded as Residential in the geocoded ground verification data. The final training area list for the Forest test site is included in Table 20.

Table 20. Final Training Areas: Forest Test Site

Training Area Descriptions            No. of Pixels   Classification Categories   Level I Groupings
Nucleus of Small Settlement           8               RESIDENTIAL                 Urban
Crops                                 9               CROPS & PASTURE             Agriculture
Permanent Pasture                     16
Herbaceous Cover                      20              RANGE                       Range
Broadleaf Forest (oak)                25              BROADLEAF FOREST            Forest
Broadleaf Forest (aspen)              20
Coniferous Forest (white pine)        24              CONIFEROUS FOREST           Forest
Coniferous Forest (jack pine)         24
Coniferous Forest (red pine)          12
Water                                 12              WATER                       Water
Forested Wetland (lowland hardwood)   12              FORESTED WETLAND            Wetland

Classification

The three test areas were classified utilizing the programs resident on the ERDAS system. Supervised classifications with standard maximum likelihood and minimum distance-to-means algorithms were performed using the common training areas. No thresholds were established, and all pixels were classified into one of the training area categories. Unsupervised clustering was also performed, deriving 27 classes for each test site which were grouped into the classification categories on the basis of their spatial distribution and correlation with ground verification data. Three classified data sets were thus produced for each of the test sites.
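The two supervised decision rules can be sketched in a few lines. This is not the ERDAS implementation, only a minimal NumPy illustration; the two-band class statistics below are invented, with one tight-variance and one wide-variance signature, to show how the two rules can disagree on the same pixel:

```python
import numpy as np

def min_distance(pixel, stats):
    """Minimum distance-to-means: assign the class whose mean vector
    lies nearest to the pixel in band space."""
    return min(stats, key=lambda c: np.linalg.norm(pixel - stats[c][0]))

def max_likelihood(pixel, stats):
    """Gaussian maximum likelihood: assign the class with the highest
    log-density given its training mean and covariance."""
    def loglike(c):
        mean, cov = stats[c]
        d = pixel - mean
        return -0.5 * (np.log(np.linalg.det(cov)) + d @ np.linalg.inv(cov) @ d)
    return max(stats, key=loglike)

# Invented two-band signatures: class -> (mean vector, covariance matrix).
stats = {"broadleaf": (np.array([30.0, 30.0]), 4.0 * np.eye(2)),    # tight
         "cropland":  (np.array([60.0, 60.0]), 400.0 * np.eye(2))}  # wide

pixel = np.array([40.0, 40.0])
print(min_distance(pixel, stats))    # broadleaf (nearer mean)
print(max_likelihood(pixel, stats))  # cropland (wide signature wins)
```

With no thresholds, as in the study, every pixel receives some label; the disagreement shown here is the same effect the tight Broadleaf Forest signature produces in the Agricultural site results.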
For the Agricultural and Forest test sites a full screen of data was processed (256 x 240 pixels), and for the Urban test site a smaller area (256 x 208 pixels) was processed because of data transfer problems mentioned earlier.

A post-classification procedure was required before the classification results could be analyzed. The original LANDSAT data and the classification products were not rectified to a standard map-based coordinate system. In order to create orthogonal data for display and printer map output which could be compared to ground verification classifications prepared by non-digital methods, a rectification procedure had to be implemented. This was achieved with two additional ERDAS programs.

Using the pixel coordinates and map coordinates of ground control points, a first-order transformation matrix is computed which converts map coordinates to pixel coordinates. The user then specifies an output cell size and whether bilinear interpolation or nearest neighbor resampling is required. The classified data sets were rectified using U.T.M. coordinates and a cell size of 57 x 57 meters, which corresponds to the LANDSAT pixel size. Nearest neighbor resampling was used, as this method does not involve actual manipulation of data values.

The rectification process also served to create aggregated data sets for all of the classification results. By setting the cell size to 171 x 171 meters, the classification results can be resampled to create a data set generalized to an output dimension of 3 x 3 pixels, which approximates the 10-acre cell often used in grid-based information systems.

Pixel classification results were displayed as dot-matrix printer maps for accuracy evaluation (Figure 22). Software limitations meant that these maps were at a nominal scale of 1:26,500.
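The rectification step described above can be sketched as a least-squares first-order (affine) fit from ground control points, followed by nearest-neighbor lookup at each output grid node. The control points and coordinates below are invented for illustration; note how setting the output cell to three pixel widths (171 m) yields the 3 x 3 aggregation directly:

```python
import numpy as np

def fit_first_order(map_xy, pix_rc):
    """Least-squares first-order transformation taking map (x, y)
    coordinates to image (row, col) pixel coordinates, from GCPs."""
    A = np.column_stack([map_xy, np.ones(len(map_xy))])   # rows [x, y, 1]
    coeffs, *_ = np.linalg.lstsq(A, pix_rc, rcond=None)   # 3 x 2 matrix
    return coeffs

def rectify_nearest(image, coeffs, origin, cell, shape):
    """Resample a classified image onto a north-up output grid using
    nearest-neighbor lookup, so class values are never altered."""
    out = np.zeros(shape, dtype=image.dtype)
    x0, y0 = origin
    for i in range(shape[0]):
        for j in range(shape[1]):
            x, y = x0 + j * cell, y0 - i * cell   # northing decreases down rows
            r, c = np.array([x, y, 1.0]) @ coeffs
            r, c = int(round(r)), int(round(c))
            if 0 <= r < image.shape[0] and 0 <= c < image.shape[1]:
                out[i, j] = image[r, c]
    return out

# Invented GCPs: a 57 m pixel grid with its upper-left corner at an
# arbitrary U.T.M.-like position.
x0, y0 = 500000.0, 4800000.0
corners = [(0, 0), (0, 9), (9, 0), (9, 9)]
gcps_map = np.array([[x0 + 57 * j, y0 - 57 * i] for i, j in corners])
gcps_pix = np.array(corners, dtype=float)
coeffs = fit_first_order(gcps_map, gcps_pix)

image = np.arange(81).reshape(9, 9)
full = rectify_nearest(image, coeffs, (x0, y0), cell=57, shape=(9, 9))
coarse = rectify_nearest(image, coeffs, (x0, y0), cell=171, shape=(3, 3))
print(np.array_equal(full, image), np.array_equal(coarse, image[::3, ::3]))
```

Because the output here aligns exactly with the input grid, the 57 m rectification reproduces the image, while the 171 m run simply keeps every third pixel, which is the aggregation behavior described in the text.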
The geocoded data sets were not evaluated from hard-copy maps, as the classification results and the ground verification material were both in digital form and could be compared directly without the need for sampling. A windowing of the classification results to match the arbitrary, but also U.T.M.-based, origin of each ground-verification data set was necessary. This procedure reduced marginally the size of the geocoded data sets used in the analysis.

Summary and Conclusions

A useful way of summarizing the operational steps taken in the preparation of the classifications is to refer to a simple flow chart of the procedures that were implemented (Figure 23). The Input sequence is straightforward. It has to be iterative, as precise definition of the test site on the first attempt is very difficult to achieve, despite careful measurement from the hard copy images.5 Once

5 An alternative is to use a main frame, batch-oriented processing system to generate a grey map of a test site with larger dimensions than the 256 x 240 pixel size demanded by ERDAS. At the outset of this study the UWIPS system was to be used, and as a consequence grey maps of each site were generated and precise pixel coordinates for the 256 x 240 area were taken from these maps. Subsequent to the completion
[Figure 22 consists of a dot-matrix printer map that does not reproduce legibly. Legend: R - Residential; C - Commercial; A - Crops and Pasture; H - Broadleaf Forest; W - Water.]

Figure 22. Dot-Matrix Printer Display with Alphanumeric Symbols Representing Land Use Classes: Maximum Likelihood Classification for a Portion of the Urban Test Site.
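A printer display of this kind amounts to emitting one legend symbol per classified pixel. A minimal sketch using the Figure 22 legend; the integer class codes and the sample array are hypothetical:

```python
import numpy as np

# Symbols from the Figure 22 legend; the integer codes are hypothetical.
SYMBOLS = {1: "R",   # Residential
           2: "C",   # Commercial
           3: "A",   # Crops and Pasture
           4: "H",   # Broadleaf Forest
           5: "W"}   # Water

def printer_map(classes):
    """Render a classified array as a character map, one symbol per
    pixel, in the manner of a dot-matrix printer display."""
    return "\n".join("".join(SYMBOLS[v] for v in row) for row in classes)

demo = np.array([[4, 4, 3],
                 [1, 2, 3],
                 [5, 5, 3]])
print(printer_map(demo))
# HHA
# RCA
# WWA
```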
Figure 23. Simple Flow Chart of Study Procedures

INPUT: Acquire LANDSAT Tape; Define Test Areas from Hard-Copy Images and Window Tape (Main Frame Computer); Transfer Data to ERDAS Microcomputer; Generate Grey Map and Histogram Display Data; Confirm Test Area Locations.

PROCESSING: Display Test Site Data and Establish Classification Categories Present; Cluster Test Site Data; Select Training Areas (Interactive Selection); Cluster Confirmation (ACCEPT / REJECT); Final Training Area Selection; Begin Processing. Classification: 1. Maximum Likelihood; 2. Minimum Distance-to-Means; 3. Grouping of Cluster Classes; Rectification of Results to U.T.M. Coordinates; Assemble Ground-Verification Data.

EVALUATION: LANDSAT Classification of Test Site, by Maximum Likelihood Classification, Minimum Distance Classification, and Grouping of Cluster Classes (each as Screen Image and Printer Map); Sampling Strategy; Pixel Accuracy Using Aerial Photography and Maps (Manually-prepared Classification Error Matrix); Information System Accuracy, aggregated data compared with geocoded photo-interpretation data (Crosstab-prepared Classification Error Matrix); Comparison of Classification Error Matrices; Typology of Categories and Procedures.

the test site has been defined to the user's satisfaction, processing of the data can begin.

The Processing Sequence is critical because analytical judgments have to be made at several points which directly influence classification performance. In each of the three test sites there were classification categories which could not be separated spectrally. Categories had to be eliminated as individual entities and grouped with spectrally similar classes. It is clear then, even before classification, that LANDSAT data for August cannot provide even the minimal data requirements of land-use planners and managers as reported in the Michigan Data Needs Survey. This is a significant conclusion.
Given such a reduction, and necessary amalgamation of categories, the next step was to evaluate how well the remaining categories could be obtained using the classification algorithms available.

Evaluation of classification results can be a multi-faceted process, but in this study it is approached from two perspectives. The first perspective concerns the performance of the classification when evaluated on a pixel-by-pixel basis and compared to post-classification aggregation. This sequence is shown in the "Evaluation" segment of the flow diagram. A second perspective concerns the comparative performance of the classification algorithms in different environmental situations.

5 (cont'd.) ...of work for this study, a tape drive was added to the ERDAS system at the MSU Center for Remote Sensing. Data can now be read directly to the ERDAS floppy disks, and areas larger than 256 x 240 pixels can be transferred, displayed with reduction, and used in analysis.

CHAPTER V

EVALUATION OF CLASSIFICATION RESULTS

Introduction

Several inter-related objectives are addressed in the evaluation of classification results. In general terms, an assessment is made of: i) how well LANDSAT can provide land use information; ii) whether the classification algorithm makes a significant difference to the results obtained; iii) whether classification performance varies in test sites that are representative of different environmental situations; and iv) how generalization of LANDSAT classification results to the 10-acre resolution used in many statewide geographic information systems affects the accuracy of those results. This information is critical if LANDSAT data are to be employed operationally in these types of systems.

Procedures

In order to estimate the accuracy of classification maps derived from LANDSAT data, a set of sample pixels has to be compared with ground reference material for the same area.
Color infrared aerial photography at a scale of 1:24,000 was used for this purpose.1 To test the effects of generalization, generalized ground reference material also had to be prepared. The aerial photography was interpreted to produce a delineated land use map using a minimum type size equivalent to 3 x 3 pixels (171m x 171m), and this map was geocoded to provide the information system component of ground reference material.

1 Made available for this study by the Division of Land Resource Programs, Michigan Department of Natural Resources.

Two critical factors have to be considered in the selection of a sampling strategy for accuracy assessment: the procedure to select the actual sample locations, and the formula to calculate a minimum number of observations needed to estimate the accuracy of the map. Both of these issues have been the subject of varying interpretations in recent papers, and a review of this literature is included as Appendix C1. The formula used to determine the number of sample points follows that found in USGS work where manual selection techniques have been employed (Fitzpatrick-Lins, 1981). Sample size, n, was estimated as:

n = Z²pq/E²    (1)

where p = expected accuracy (%); q = 100 - p; E = allowable error (%); and Z = 2, generalized from the standard normal deviate of 1.96 for the 95% two-sided confidence level. (Source: Snedecor and Cochran, 1967, p. 517)

A review of previous LANDSAT studies (Castruccio, 1978) suggests that an expected accuracy level of 75% is appropriate for general land use studies. Establishing an allowable error of 5% and a 95% confidence level for the acceptance of results, the actual sample size is calculated as:

n = (2² x 75 x 25)/5² = 300    (2)

Sample pixels were plotted within the classified maps, utilizing a stratified, systematic, unaligned technique (Berry and Baker, 1968). Details concerning preparation of the grid used to implement this technique are also included in Appendix C1.
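The two sampling steps above can be sketched as follows. The sample-size arithmetic reproduces the n = 300 of formula (2); the grid routine follows the usual textbook description of the Berry and Baker technique (per-column x-offsets and per-row y-offsets), with a 15 x 20 cell grid chosen here only so that one point per cell yields 300 points. The study's actual grid construction is in its Appendix C1:

```python
import math
import random

def sample_size(p=75.0, e=5.0, z=2.0):
    """Formula (1): n = Z^2 * p * q / E^2, with p, q, E in percent."""
    q = 100.0 - p
    return math.ceil(z ** 2 * p * q / e ** 2)

def unaligned_sample(rows, cols, cell, rng):
    """Stratified, systematic, unaligned sample: one point per cell,
    where every cell in a column shares a random x-offset and every
    cell in a row shares a random y-offset."""
    xs = [rng.uniform(0, cell) for _ in range(cols)]   # per-column x-offsets
    ys = [rng.uniform(0, cell) for _ in range(rows)]   # per-row y-offsets
    return [(j * cell + xs[j], i * cell + ys[i])
            for i in range(rows) for j in range(cols)]

print(sample_size())   # 300, as in formula (2)
points = unaligned_sample(rows=15, cols=20, cell=171.0, rng=random.Random(0))
print(len(points))     # 300
```

Sharing offsets along rows and columns keeps the sample systematic (evenly spread, one point per cell) while the random offsets avoid alignment with any periodic pattern in the landscape.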
Verification of the sample points was accomplished by means of optical transfer devices. For determining the accuracy of the classification results directly from the aerial photography, the transparencies were mounted in a zoom transferscope2 and projected directly onto the classified map. The super-imposition of map and aerial photograph allowed the visual verification of the sample points. A similar procedure was employed for checking the classified map with the delineated ground verification map, interpreted from the same aerial photography, except that the projection device employed was a reflecting projector/enlarger.3

2 The instrument used was the Bausch and Lomb Zoom Transferscope available at the MSU Center for Remote Sensing.

3 The instrument used was the Kargl Reflecting Projector available at the MSU Center for Remote Sensing.

Accuracy checking of the generalized classification results did not require sampling, as complete data for each of the three test areas were geocoded into the information system for comparative purposes.

The sampling procedure provides sufficient sample points for a reliable estimate of map accuracy as a whole, and the accuracy level can be simply stated as the ratio of correctly interpreted points (r) to the total number of points sampled (n). It is necessary, however, to establish a confidence limit for this value (Fitzpatrick-Lins, 1981). Upper and lower limits can be calculated using a 95% two-tailed test, but it is the lower accuracy limit that is the most critical in map evaluation. The lower limit of the true accuracy of a map is thus obtained using the 95% one-tailed lower confidence limit from the formula:

pL = p - (1.645 √(pq/n) + 50/n)    (3)

where pL = the lower accuracy limit expressed as a percent; p = r/n (expressed in %); q = 100 - p; n = sample size; and r = number of correctly interpreted points. (Source: Snedecor and Cochran, 1967, p.
211)

Classification Performance

LANDSAT classification accuracies for the Urban and Forest test sites chosen for this study approach the 75% estimated accuracy level indicated as appropriate in the testing procedure. The results for the Agricultural site were significantly poorer (Table 21). The relative strength of the Urban classification is unusual, as classifications in these environments have often been the most difficult in which to obtain acceptable results (Jensen, 1981). Results for the Agricultural scene, however, were lower than anticipated.

The classification algorithm which gives the best representation of the ground scene is different for each test site. For the Urban and Forest sites the variation between algorithms is small, with maximum likelihood the most successful for the Urban site and minimum distance-to-means slightly superior in the Forest site. Grouping of cluster classes is clearly the best alternative for the Agricultural site.

Table 21. Accuracy Comparison of Test Sites
(% Correct, with 95% Lower Confidence Limit in parentheses)

                    Maximum Likelihood   Minimum Distance-to-Means   Grouping of
                    Classification       Classification              Cluster Classes
Agricultural Site   54.4 (49.8)          44.7 (40.1)                 63.6 (59.1)
Urban Site          75.4 (71.7)          71.5 (67.9)                 69.5 (65.2)
Forest Site         68.3 (64.2)          71.8 (67.7)                 70.8 (66.1)

Wilcoxon Signed Rank Test, for all 3 test sites:
Maximum Likelihood vs. Minimum Distance-to-Means: no significant difference
Maximum Likelihood vs. Grouping of Cluster Classes: no significant difference
Minimum Distance-to-Means vs. Grouping of Cluster Classes: no significant difference

In general terms, the cluster grouping is a relatively poor classifier, as it combines 27 clusters into the information categories selected for the particular test site, with only limited application of ground truth.
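Formula (3) can be checked directly against Table 21. A quick sketch using the Agricultural site maximum likelihood totals, 184 of 338 sample points correct:

```python
import math

def lower_limit(r, n, z=1.645):
    """Formula (3): pL = p - (z * sqrt(p*q/n) + 50/n),
    with p and q expressed in percent."""
    p = 100.0 * r / n
    q = 100.0 - p
    return p - (z * math.sqrt(p * q / n) + 50.0 / n)

# Agricultural site, maximum likelihood classification (Table 22):
# 184 of 338 sample points correct.
p = 100.0 * 184 / 338
pL = lower_limit(184, 338)
print(round(p, 1), round(pL, 1))   # 54.4 49.8, matching Table 21
```

The 50/n term is the usual continuity correction, 1/(2n), expressed in percent.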
Clustering in the Agricultural site, however, avoids, to some extent, the confusion between Residential and Cropland and Pasture, which is a major source of error in both the maximum likelihood and minimum distance-to-means classifications.

Despite this apparent difference between algorithms for the Agricultural site, an evaluation of the difference in accuracies (using Wilcoxon's signed rank test for paired samples4) indicated no significant difference between any of the algorithms at the 95% confidence level. Similarly, testing of the three algorithms for the Urban and Forest test sites indicated no significant difference between these classifications at the 95% confidence level.

A detailed picture of how the classifications performed in the three test sites can be obtained by reviewing the classification error matrices for each site,5 and the graphic displays of the maps produced.

4 Comparative statistical analyses of classification performance have not been widely reported in the literature. A review of current information on this topic and the Wilcoxon's Signed Rank Test used in this study is included in Appendix D.

5 Classification error matrices are sometimes referred to as "confusion tables," and they present the results of a comparison between sample points within a classification and ground verification information. Values along the diagonal represent correct interpretations; other values indicate misclassifications. The classification error matrices in this study show the frequencies obtained in checking classified pixels against ground verification data. This is a manual process and subject to error. For example, the number of points sampled for each site should be identical. The totals are slightly different, indicating missed pixels in the verification process.

The discussion pertains only to accuracy for each of the
classification maps as opposed to individual categories within them, and is referred to as overall accuracy performance.6

i) Agricultural Test Site. Classification accuracies were much lower in the Agricultural test site than in either of the other two sites. The key to this poor classification accuracy is the substantial amount of confusion present between Residential, and Crops and Pasture categories; between Forest, and Crops and Pasture; and between Forest and Residential (Tables 22, 23, and 24). The minimum distance-to-means classification (Table 23) incorporates all three of these confusion characteristics, but the overriding problem in this classification is the Forest/Crops and Pasture mixture. Compared with the other algorithms, the Broadleaf Forest category is considerably overestimated, and it dominates the classification map (Figure 26).7 This can be seen by comparing the column total (157) for Broadleaf Forest in Table 23, which represents the LANDSAT classification, with the considerably lower row total (103) for Broadleaf Forest established by the ground-verification information. The corollary to this dominance is the underestimation of Crops and Pasture.

5 (cont'd.) Similarly, the column total representing the ground verification information should be the same for the three algorithms within the same test site. These are also slightly different because i) of different pixel selection when a sample point fell within four possible choices, and ii) variations in interpretation by the interpreter from one classification to another. Errors of this nature are inherent in the replication of manual procedures and the comparison of a classified output to a high resolution ground verification source. Each matrix is valid in its own right, and percentages derived for each map can be legitimately compared.

6 Information from the classification error matrices can also be used to calculate individual-category accuracies under certain circumstances.
6 (cont'd.) Deciding which categories have sufficient samples to make this determination has been subject to varying interpretations. An analysis of individual category accuracies for categories with twenty or more sample points was completed for the test sites in this study; however, given the potential problems associated with this procedure, the results are not discussed in the main body of the text. This information is presented in Appendix C2.

7 A Rectified False Color Composite of LANDSAT data for each test site is included prior to each set of land use maps for reference purposes. This composite is not referred to in the text.

Table 22. Classification Error Matrix for the Agricultural Test Site: Maximum Likelihood Classification (Ground Verification Source: Aerial Photography)

[The full matrix does not reproduce legibly. Rows (ground verification) and columns (LANDSAT classification) are Residential; Commercial/Industrial; Crops and Pasture; Range; Broadleaf Forest; Coniferous Forest; Water. Correct: 184 of 338 sample points, 54.4% correct.]

Table 23. Classification Error Matrix for the Agricultural Test Site: Minimum Distance-to-Means Classification (Ground Verification Source: Aerial Photography)

[The full matrix does not reproduce legibly; the categories are as in Table 22. The Broadleaf Forest column total is 157 and its row total 103. Correct: 149 of 333 sample points, 44.7% correct.]

Table 24.
Classification Error Matrix for the Agricultural Test Site: Grouping of Cluster Classes (Ground Verification Source: Aerial Photography)

[The full matrix does not reproduce legibly; the categories are as in Table 22. Correct: 213 of 335 sample points, 63.6% correct.]

Figure 24. Rectified False Color Composite of LANDSAT Data for the Agricultural Test Site Displayed on an RGB Monitor (Band Assignments - see Figure 11)

Figure 25. Land Use Classes for the Agricultural Test Site Derived from Maximum Likelihood Classification Displayed on an RGB Monitor

Figure 26. Land Use Classes for the Agricultural Test Site Derived from Minimum Distance-to-Means Classification Displayed on an RGB Monitor

Figure 27. Land Use Classes for the Agricultural Test Site Derived from a Grouping of Cluster Classes Displayed on an RGB Monitor

In the maximum likelihood classification of the area (Table 22), the reverse situation is true for the Broadleaf Forest category. This category is underestimated, while the Crops and Pasture category is correct in total area, although a certain amount of confusion with Broadleaf Forest is still evident. The reason for this problem appears to lie in the manner in which the different algorithms handle low-density Broadleaf Forest. This has been identified as a problem in a previous LANDSAT classification for an area in Michigan (Roller and Visser, 1981). The training set selected to represent the Broadleaf Forest category was taken from a large, dense stand of oak trees. Several other training areas were initially selected with the objective of representing various densities of growth. This could not be successfully accomplished, and the training set which performed best was the high-density oak stand. Using this training set, the minimum distance-to-means algorithm applies the derived mean value in the classification, but this creates a situation where active cropland is grouped with Broadleaf Forest. When the probabilities of pixels falling into these two classes are calculated in the maximum likelihood algorithm, the tighter standard deviation of the Broadleaf Forest signature vis-a-vis the Cropland signature allows only the dense stands to be classified as Broadleaf Forest. The lower-density stands fall into the Cropland category (Figure 25). Consequently the probability-based maximum likelihood algorithm has a substantial impact on the way pixels are distributed into categories, but cannot eliminate the basic spectral confusion problem.

Unsupervised clustering does not rely on the statistics derived from training sites, and grouping the 27 cluster-classes into the
Using this training set, the minimum distance-to-means algorithm applies the derived mean value in the classification, but this creates a situation where active cropland is grouped with Broadleaf Forest. When the probabilities of pixels falling into these two classes are calculated in the maximum likelihood algorithm, the tighter standard deviation of the Broadleaf Forest signature vis-a-vis the Cropland signature allows only the dense stands to be classified as Broadleaf Forest. The lower-density stands fall into the Cropland category (Figure 25). Consequently, the probability-based maximum likelihood algorithm has a substantial impact on the way pixels are distributed into categories, but cannot eliminate the basic spectral confusion problem.

Unsupervised clustering does not rely on the statistics derived from training sites, and grouping the 27 cluster classes into the seven information classes used in the classification improves the accuracy obtained. A better balance between Broadleaf Forest and Crops and Pasture is achieved (Table 24), contributing to the improved overall performance of this classification. Confusion between the classes still exists, however, indicating that the spectral mixture is too great to avoid misclassifications.

Another major contributor to overall error in both the minimum distance-to-means and maximum likelihood classifications is a substantial amount of land classified as Residential. The main area of confusion is with pixels in the Crops and Pasture grouping and, in particular, the Pasture signature of this group. The mixture of reflectances that make up residential areas combines bright, paved surfaces, backyards that to a large extent mimic pasture, tree-lined streets which approximate low-density broadleaf forest, and rooftops of generally low reflectance. Averaging of these disparate values creates the confusion with pasture which neither algorithm can sort out.
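The contrast between the two decision rules can be sketched with a short modern example (Python). The one-band means and standard deviations below are purely illustrative, not the study's actual training statistics: a minimum distance-to-means rule assigns a pixel to the nearest class mean, while a maximum likelihood rule weighs that distance by the signature's variance, so a tight Broadleaf Forest signature surrenders moderate pixels to a looser Cropland signature.

```python
import math

def min_distance(value, classes):
    # Assign to the class whose mean is nearest the pixel value.
    return min(classes, key=lambda c: abs(value - classes[c][0]))

def max_likelihood(value, classes):
    # Assign to the class with the highest Gaussian density
    # N(value; mean, std); variance now matters, not just distance.
    def density(mean, std):
        return math.exp(-((value - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))
    return max(classes, key=lambda c: density(*classes[c]))

# Hypothetical single-band signatures: (mean, standard deviation).
signatures = {
    "Broadleaf Forest": (30.0, 2.0),   # tight signature from a dense stand
    "Crops and Pasture": (40.0, 8.0),  # looser signature
}

pixel = 34.0  # a lower-density forest pixel, nearer the forest mean
print(min_distance(pixel, signatures))    # Broadleaf Forest
print(max_likelihood(pixel, signatures))  # Crops and Pasture
```

Although the pixel value lies nearer the forest mean, the two rules disagree in exactly the manner described above: minimum distance keeps the pixel in Broadleaf Forest, while maximum likelihood, penalizing the tight forest variance, moves it to Crops and Pasture.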
This type of confusion is commonly reported in the literature (Castruccio, 1978), and poses a difficult problem for many users because it mixes categories from two general classes. Residential forms part of Level I Urban, and Crops and Pasture is the major component of Level I Agriculture. An upward combination of the two confused categories to a Level I category thus cannot be achieved without inherent error.

Despite the similarity of error information indicated by the classification error matrices, the actual distribution of Residential land defined by each classification (Figures 25, 26, and 27) differs substantially. The distribution is interrelated with the overclassification of Broadleaf Forest in the minimum distance-to-means classification; however, with this algorithm the Residential area is in clear, discrete blocks, which contrasts substantially with the scattered distribution of Residential land provided by the maximum likelihood classification. The grouping of cluster classes records substantially less Residential area than either of the other classifications (Table 24), and the elimination of many Residential pixels does, in turn, reduce category confusion. A mixture of categories still persists, however. As with the Crops and Pasture/Broadleaf Forest confusion problem, the grouping of cluster classes does moderate the Residential/Crops and Pasture mix, so that this algorithm provides not only the best classification statistically, but also the superior classification output (Figure 27). It clearly represents the complex of landscape elements in the agricultural test site better than either of the other two alternatives.

ii) Urban Test Site. In comparison to the Agricultural site, the Urban site has higher levels of classification accuracy and much closer correspondence between accuracy levels for the three methods tested (Tables 25, 26, and 27).
The classification error matrices for the maximum likelihood and minimum distance-to-means classifications have very similar patterns, and the classification maps are also without major differences (Figures 29 and 30). The maximum likelihood classification map is, again, more fractured, displaying smaller segments of Crops and Pasture than minimum distance-to-means, particularly in the NW quadrant of the map. In this respect, the blocks of Residential land in the minimum distance-to-means classification, despite a little overestimation, are a truer representation of the ground scene than the maximum likelihood version.

Table 25. Classification Error Matrix for the Urban Test Site: Maximum Likelihood Classification (Ground Verification Source: Aerial Photography). [Cell values not legible in this copy. Ground verification categories: Residential, Commercial/Industrial, Crops and Pasture, Broadleaf Forest, Water. Correct Total: 252 of 334 sample points; 75.4% Correct.]

Table 26. Classification Error Matrix for the Urban Test Site: Minimum Distance-to-Means Classification (Ground Verification Source: Aerial Photography). [Cell values not legible in this copy. Correct Total: 243 of 340 sample points; 71.5% Correct.]

Table 27. Classification Error Matrix for the Urban Test Site: Grouping of Cluster Classes (Ground Verification Source: Aerial Photography). [Cell values not legible in this copy. Correct Total: 235 of 338 sample points; 69.5% Correct.]

Figure 28. Rectified False Color Composite of LANDSAT Data for the Urban Test Site Displayed on an RGB Monitor (Band Assignments - see Figure 11)

Figure 29. Land Use Classes for the Urban Test Site Derived from Maximum Likelihood Classification Displayed on an RGB Monitor

Figure 30. Land Use Classes for the Urban Test Site Derived from Minimum Distance-to-Means Classification Displayed on an RGB Monitor

Figure 31. Land Use Classes for the Urban Test Site Derived from a Grouping of Cluster Classes Displayed on an RGB Monitor

The smaller, scattered pixel groups of Crops and Pasture are often within Residential areas and would be interpreted as such by a photo-interpreter. Broadleaf Forest is underestimated in both classifications, Crops and Pasture are overestimated, and the two classes show a certain amount of confusion with each other and with Residential. This indicates that the confusion patterns evident in the Agricultural site are still present in the Urban site, although to a lesser extent. The grouping of cluster classes also shows a similar pattern, particularly in the classification map (Figure 31), where it appears to be the middle course between maximum likelihood and minimum distance-to-means. The Crops and Pasture/Residential balance appears to be the best of the three, although this is not indicated as such by the classification error matrix.

iii) Forest Test Site. Overall classification accuracy in the Forest test site is relatively high and also very consistent among the three algorithms used (Tables 28, 29, and 30). Despite these similarities, there are clear differences among the classification results. The minimum distance-to-means and grouping of cluster classes classifications underestimate Broadleaf Forest and overestimate Coniferous Forest and Forested Wetland, whereas maximum likelihood inverts this pattern. In the non-forest classes, Cropland and Pasture is underestimated in the minimum distance-to-means classification, close to correct for the maximum likelihood classification, and overestimated in the grouping of cluster classes.
The Range category also varies between classifications. Compensating errors, then, are the reason for similar overall classification accuracies.

Table 28. Classification Error Matrix for the Forest Test Site: Maximum Likelihood Classification (Ground Verification Source: Aerial Photography). [Cell values not legible in this copy. Ground verification categories: Residential, Crops and Pasture, Range, Broadleaf Forest, Coniferous Forest, Water, Forested Wetland. Correct Total: 246 of 360 sample points; 68.3% Correct.]

Table 29. Classification Error Matrix for the Forest Test Site: Minimum Distance-to-Means Classification (Ground Verification Source: Aerial Photography). [Cell values not legible in this copy. Correct Total: 257 of 358 sample points; 71.8% Correct.]

Table 30. Classification Error Matrix for the Forest Test Site: Grouping of Cluster Classes (Ground Verification Source: Aerial Photography). [Cell values not legible in this copy. Correct Total: 248 of 350 sample points; 70.8% Correct.]

The differences between classifications can be seen in the resulting classification maps (Figures 33, 34, and 35). As was seen in the previous test sites, the minimum distance-to-means classification presents the most distinct pattern of categories, and for the Forest test site it is the most accurate. Without a serious overestimate of Forested Wetland at the expense of Broadleaf Forest, the overall accuracy would be much higher.
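The percent-correct figures quoted with these matrices follow a single rule: sample points on the matrix diagonal (ground class equal to LANDSAT class) divided by the total sample. A minimal sketch (Python; the 3 x 3 matrix below is invented for illustration, while the final check uses the summary figures quoted for Table 28):

```python
def overall_accuracy(matrix):
    # matrix[i][j]: sample points of ground class i assigned to LANDSAT class j.
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct, total, round(100.0 * correct / total, 1)

# Illustrative 3-class error matrix (not the study's data).
m = [[50, 5, 3],
     [8, 40, 6],
     [2, 4, 30]]
print(overall_accuracy(m))  # (120, 148, 81.1)

# The summary rows of Tables 28-30 check out the same way,
# e.g. Table 28's 246 correct of 360 points:
print(round(100.0 * 246 / 360, 1))  # 68.3
```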
In the grouping of cluster classes, an overestimate of Crops and Pasture is evident, so much so that a visual analysis may make it look inferior to the maximum likelihood classification.

The Coniferous Forest class suffers in the maximum likelihood classification. A portion of Coniferous Forest is classified as Range; this is another variant of the stand density problem encountered with Broadleaf Forest in the Agricultural site. The areas classified as Range are not concentrated in one type of coniferous stand. Three training sites, representing jack, red, and white pines, were used in the classification. Misclassification occurs where the trees are small, as in young plantations. Some mature red pine stands are also involved, but the problem occurs predominantly in areas where smaller trees allow more ground reflectance to modify the pixel values. In the minimum distance-to-means classification, the statistics from the training set allow the classifier to "capture" these pixels as Coniferous Forest; in maximum likelihood they are given a higher probability of being Range and are classified as such. The maximum likelihood classifier, then, appears to allow only conifer stands that closely match the training set characteristics to be classified correctly. Minimum distance-to-means classification designates more area as conifers, but in so doing, allows a fringe of pixels along the water's edge to also be classified as conifers. These water-edge pixels are, in fact, distinct on the maximum likelihood classification, too. A similar fringe, now grouped with Crops and Pasture, indicates obvious misclassification.

Figure 32. Rectified False Color Composite of LANDSAT Data for the Forest Test Site Displayed on an RGB Monitor (Band Assignments - see Figure 11)

Figure 33. Land Use Classes for the Forest Test Site Derived from Maximum Likelihood Classification Displayed on an RGB Monitor

Figure 34. Land Use Classes for the Forest Test Site Derived from Minimum Distance-to-Means Classification Displayed on an RGB Monitor

Figure 35. Land Use Classes for the Forest Test Site Derived from a Grouping of Cluster Classes Classification Displayed on an RGB Monitor

The conclusion that can be derived from the overall classification accuracies is that single-date, mid-summer LANDSAT data performs only adequately in providing general land use information for the Urban and Forest test sites in Michigan. Results for the Agricultural site are even less accurate and fail to approach a level which could be considered operationally viable. The accuracy levels discussed could even be considerably lower, as indicated by the lower bound of the 95% confidence interval. Differences between algorithms are evident, but compensating errors within the categories of each classification result in similar overall accuracies, particularly in the Urban and Forest sites. The accuracies vary more in the Agricultural site, but these variations, along with those for the Urban and Forest sites, were not significant at the 95% confidence level.

Generalized Classification Results

Evaluation of the classification results on a pixel-by-pixel basis with reference to high-resolution ground verification material has indicated that using single-date LANDSAT coverage for the late summer season provides land use information of only limited utility. If the land use information is to be used as input to a computer-based geographic information system, however, accuracy checking with reference to a more generalized ground verification source may be more appropriate.

In Michigan, land use information for the state-wide geographic information system is being generated from 1:24,000 aerial photography, using a 10-acre minimum type size. The resulting maps are digitized for storage in 10-acre cells. A high-resolution data source is thus being generalized, through interpretation, to a relatively small number of land use categories. Checking the accuracy of a LANDSAT interpretation against a delineated land use map and also, in aggregated form, against a geocoded land use data base of the same area may be more realistic in an operational sense. This kind of evaluation will also test the hypothesis that when only broad land use categories are required, a lower-resolution data source, such as LANDSAT, can provide that information as effectively as a higher-resolution source.

In order to test the propositions suggested above, two further sets of accuracy tests were completed for the LANDSAT classification results. Land use maps for the three test sites were prepared using 1:24,000 color infrared aerial photography as a data source and a minimum type size based on the dimensions of 3 x 3 pixels (171m x 171m). A series of sample points was tested for accuracy. The situation is reversed from the original accuracy test: now the accuracy-testing source, the delineated map, is a more generalized source than the classified land use map, which records a classification for each 57m x 57m pixel.

This is clearly an interim product with which to test the accuracy of the LANDSAT classifications. Nevertheless, a user is likely to find an accuracy comparison made against a standard delineated map of utility, because this is the alternative a LANDSAT classification seeks to replace or update, despite obvious differences in data resolution between the two maps.

The other, more important, comparison to be made is between the standard delineated map that has been converted to an information system format and the LANDSAT classifications that are generalized to a similar format. Here, the comparison is between two data bases with similar characteristics that have been abstracted from different map products.
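The two generalization routes compared here can be sketched as follows (Python; the toy category map is invented for illustration). Nearest-neighbor resampling keeps the single pixel nearest each 171m output cell, here taken as the centre pixel of each 3 x 3 block, while geocoding a delineated map amounts to recording the dominant category per cell. The small example shows that the two rules can disagree for the same window:

```python
from collections import Counter

def nearest_neighbor(grid, block=3):
    # Keep the centre pixel of each block x block window (57m -> 171m).
    c = block // 2
    return [[grid[r + c][col + c]
             for col in range(0, len(grid[0]), block)]
            for r in range(0, len(grid), block)]

def dominant_category(grid, block=3):
    # Assign each output cell the most frequent category in its window.
    out = []
    for r in range(0, len(grid), block):
        row = []
        for col in range(0, len(grid[0]), block):
            window = [grid[r + i][col + j] for i in range(block) for j in range(block)]
            row.append(Counter(window).most_common(1)[0][0])
        out.append(row)
    return out

# 3 x 6 toy map, two output cells: F = forest, C = crops.
g = [list("FFFCCC"),
     list("FCFCFC"),
     list("FFFCCC")]
print(nearest_neighbor(g))   # [['C', 'F']]  (centre pixels only)
print(dominant_category(g))  # [['F', 'C']]  (majority per window)
```

The centre-pixel rule can return a minority category for a mixed window, which is one reason a resampled LANDSAT product retains its fractured appearance while a geocoded delineation looks smooth.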
The delineated map was manually geocoded using a grid cell with dimensions equivalent to 3 x 3 pixels, whereas the LANDSAT map was resampled using a nearest-neighbor procedure and the same output dimensions (171m x 171m).

i) Delineated Map Accuracy. The results of the generalization procedures show that overall classification accuracies are reduced when compared with those obtained through comparison with aerial photography (Table 31). There is one exception to the overall pattern. In the Agricultural test site, a small improvement in overall accuracy is seen when comparing the pixel classification map to the delineated land use map.8 This improvement is consistent over the three algorithms, despite the differences in the category assignments between them that have been discussed earlier. The anomaly is difficult to explain, especially as the accuracy comparison with the geocoded data follows the general pattern of a significant drop in accuracy level. The delineated maps do generalize small areas below the minimum type size into the dominant category, so that Broadleaf Forest and Range pixels that may have been associated with sample Crops and Pasture

8. The accuracy-checking procedure of matching the categories of sample point locations was repeated, and the small increase in accuracy against the delineated map was confirmed.

Table 31.
Comparative Test Site Accuracies

Classification                 Accuracy Measure     Agricultural Site   Urban Site   Forest Site
Maximum Likelihood             Photo Accuracy       54.4                75.4         68.3
                               Map Accuracy         58.0                66.7         65.3
                               Geocoded Accuracy    41.0                58.1         52.0
Minimum Distance-to-Means      Photo Accuracy       44.7                71.5         71.8
                               Map Accuracy         45.1                64.0         69.9
                               Geocoded Accuracy    38.8                59.7         53.3
Grouping of Cluster Classes    Photo Accuracy       63.6                69.5         70.8
                               Map Accuracy         66.9                62.6         64.3
                               Geocoded Accuracy    52.4                55.7         52.3

Photo Accuracy: sample points checked against aerial photography. Map Accuracy: sample points checked against the delineated map. Geocoded Accuracy: cross-tabulation of geocoded data sets. Classification error matrices for these ground verification sources are included in Appendix E.

pixels are now delineated as Crops and Pasture, and consequently the accuracy is improved. A similar effect probably takes place with the absorption of Range and Crops and Pasture pixels into Broadleaf Forest delineations.

In the Urban and Forest test sites there is a consistent reduction in accuracy levels with generalization. Differences between photo accuracy and map accuracy are smaller for the Forest test site than for the Urban site, but the pattern reverses when the geocoded information is considered (Table 31). For the Forest site, the small reduction between photo and map accuracy is probably due to the relatively smaller amount of intermixture between categories at the sub-minimum type size level, i.e., fewer isolated pixel-size areas of Conifer within Broadleaf Forest areas that become generalized as Broadleaf Forest. In Urban areas, intermixture, such as isolated pixel-size areas of Crops and Pasture or Broadleaf Forest being classified as Residential, is more likely.

ii) Geocoded Accuracy.
The differences between the accuracy of the pixel-based classification results derived from aerial photography and those derived after the conversion of the ground-verification and classification inventories to an information system format are substantial. Accuracy loss is uneven when examined in terms of either classification algorithm or test site (Table 32).9

9. Wilcoxon's Signed Rank Test for the geocoded accuracy results from the three test sites indicates no significant difference between the algorithms. Generalization, therefore, does not alter this characteristic of the three algorithms, despite uneven accuracy loss.

Table 32. Accuracy Loss in Percentage Points Between Photo Accuracy and Geocoded Accuracy

Classification                 Agriculture   Urban   Forest   Average
Maximum Likelihood             13.4          17.3    16.3     15.7
Minimum Distance-to-Means      5.9           11.8    18.5     12.1
Grouping of Cluster Classes    11.2          13.8    18.5     14.5
Average                        10.2          14.3    17.8

The minimum distance-to-means algorithm, on average, sustains the smallest accuracy loss, declining by 12.1 points, followed by the grouping of cluster classes at 14.5 points and the maximum likelihood classification at 15.7 points. The blocked nature of the minimum distance-to-means classification results mentioned earlier may be an influencing factor in this low figure. The effect of this characteristic, however, seemed to be most important in the Forest test site, and this is not the case in geocoded generalization, where the minimum distance-to-means classification fares marginally worse than maximum likelihood in terms of accuracy loss. In sum, the type of classification algorithm does not seem to have a consistent effect on accuracy loss vis-a-vis geocoded generalization.

Among the three test sites, the average loss is smallest for the Agricultural site at 10.2 points, larger at 14.3 points for the Urban site, and largest for the Forest site at 17.8 points.
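The loss figures of Table 32 are simple differences of the Table 31 entries (photo accuracy minus geocoded accuracy, in percentage points); a short check in Python reproduces the tabled values and the row and column averages:

```python
# (photo accuracy, geocoded accuracy) per site, from Table 31.
table31 = {
    "Maximum Likelihood":          {"Agricultural": (54.4, 41.0), "Urban": (75.4, 58.1), "Forest": (68.3, 52.0)},
    "Minimum Distance-to-Means":   {"Agricultural": (44.7, 38.8), "Urban": (71.5, 59.7), "Forest": (71.8, 53.3)},
    "Grouping of Cluster Classes": {"Agricultural": (63.6, 52.4), "Urban": (69.5, 55.7), "Forest": (70.8, 52.3)},
}

# Accuracy loss in percentage points, as in Table 32.
loss = {alg: {site: round(photo - geo, 1) for site, (photo, geo) in sites.items()}
        for alg, sites in table31.items()}

for alg, sites in loss.items():
    avg = round(sum(sites.values()) / len(sites), 1)
    print(alg, sites, "average:", avg)  # 15.7, 12.1, 14.5 per algorithm
```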
These variations suggest that the classifications for the three environments do have different susceptibilities to accuracy loss through generalization. The smallest accuracy loss is associated with the lowest overall accuracy levels in the Agricultural site, but this is not a consistent trend. Accuracy levels for the Forest and Urban sites are similar, but the Forest site has a larger accuracy loss. Another influencing factor is the number of categories classified. The Urban site has fewer categories than the Forest site and a better distribution of airphoto-derived accuracies amongst them, i.e., four out of the five categories have accuracies above 60% with all algorithms. The validity of this factor is clouded by the Agricultural site, which has the same number of categories as the Forest site and a smaller accuracy loss through generalization than either the Forest or the Urban site.

Examination of the map products created through the generalization process suggests a reason for the differences in accuracy loss that exist amongst the three test sites (Figures 36 through 47). The geocoded ground-verification inventories for all the sites show that one or two categories dominate each: Crops and Pasture and Broadleaf Forest in the Agricultural site, Residential in the Urban site, and Broadleaf Forest in the Forest site. There is, however, considerable variation in the distribution of area among the other categories. This is shown more clearly by examining the number of cells per category in each of the test sites (Table 33). The Agricultural site is the most overloaded, and this is reflected in the highest standard deviation of the three. Urban is difficult to interpret because it has only five categories, as opposed to seven in the other two sites. The distribution of values is more even than in the Agricultural site, despite the dominance of Residential; this is reflected by a lower standard deviation. The Forest site has the widest spread of values and the lowest standard deviation. As might be anticipated, the relationship between the dispersion of the data and the amount of accuracy loss is a direct one. The Forest site has the most dispersed data values and the largest accuracy loss, while Agriculture has the least dispersed values and smallest accuracy loss. Accuracy loss, then, does seem to be related to the distribution of the land uses in the test site. If the land uses for a particular test site are concentrated in one or two categories, there seems to be less accuracy loss when a LANDSAT classification for that area is generalized.

Table 33. Geocoded Ground Verification Comparisons

Category                 Agricultural Site   Urban Site   Forest Site
                         # Cells             # Cells      # Cells
RESIDENTIAL              212                 1502         64
COMMERCIAL/INDUSTRIAL    75                  542          --
CROPS & PASTURE          1678                517          255
RANGE                    144                 --           173
BROADLEAF FOREST         1471                627          1405
CONIFEROUS FOREST        54                  --           373
WATER                    161                 112          464
FORESTED WETLAND         --                  --           321
Total Cells              3795                3300         3055
Cells/Category           542                 660          436
Standard Deviation       710                 645          447
Categories/Unit Area     2.07                2.09         2.1

Figure 36. Geocoded Ground Verification Data for the Agricultural Test Site Displayed on an RGB Monitor

Figure 37. Land Use Classes for the Agricultural Test Site Derived from Aggregated Maximum Likelihood Classification Displayed on an RGB Monitor

Figure 38. Land Use Classes for the Agricultural Test Site Derived from Aggregated Minimum Distance-to-Means Classification Displayed on an RGB Monitor

Figure 39. Land Use Classes for the Agricultural Test Site Derived from an Aggregated Grouping of Cluster Classes Classification Displayed on an RGB Monitor

Figure 40.
Geocoded Ground Verification Data for the Urban Test Site Displayed on an RGB Monitor

Figure 41. Land Use Classes for the Urban Test Site Derived from Aggregated Maximum Likelihood Classification Displayed on an RGB Monitor

Figure 42. Land Use Classes for the Urban Test Site Derived from Aggregated Minimum Distance-to-Means Classification Displayed on an RGB Monitor

Figure 43. Land Use Classes for the Urban Test Site Derived from an Aggregated Grouping of Cluster Classes Displayed on an RGB Monitor

Figure 44. Geocoded Ground Verification Data for the Forest Test Site Displayed on an RGB Monitor

Figure 45. Land Use Classes for the Forest Test Site Derived from Aggregated Maximum Likelihood Classification Displayed on an RGB Monitor

Figure 46. Land Use Classes for the Forest Test Site Derived from Aggregated Minimum Distance-to-Means Classification Displayed on an RGB Monitor

Figure 47. Land Use Classes for the Forest Test Site Derived from an Aggregated Grouping of Cluster Classes Displayed on an RGB Monitor

Another characteristic of the test site distribution was also found to be important. The number of categories per unit area has recently been proposed as one of several measures of the spatial complexity of a classification derived from LANDSAT data (Merchant, 1981). Taking an area of 3 x 3 geocoded cells, a value for categories/unit area was derived for each of the three test areas. The values for the sites were close, but they trend in the same direction as accuracy loss, indicating that the Forest site is the most complex area and the Agricultural site the least complex. These figures, in themselves, probably reflect the arbitrary nature of the test site selection rather than an inherent characteristic of Agricultural as opposed to Forest test sites.
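The dispersion and complexity measures of Table 33 are straightforward to compute. The sketch below (Python) reproduces the mean cells-per-category values and the sample standard deviations for the Agricultural and Forest sites from the tabled cell counts, and adds a toy version of the categories-per-unit-area measure over 3 x 3 windows. The small category map is invented for illustration; the study's own geocoded cell maps are not reproducible here.

```python
import math

def dispersion(counts):
    # Mean cells per category and sample standard deviation (n - 1 denominator).
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return round(mean), round(math.sqrt(var))

def categories_per_unit_area(grid, block=3):
    # Mean number of distinct categories per block x block window of cells.
    counts = []
    for r in range(0, len(grid) - block + 1, block):
        for c in range(0, len(grid[0]) - block + 1, block):
            window = {grid[r + i][c + j] for i in range(block) for j in range(block)}
            counts.append(len(window))
    return sum(counts) / len(counts)

# Cell counts per category from Table 33.
agricultural = [212, 75, 1678, 144, 1471, 54, 161]  # 3795 cells, 7 categories
forest = [64, 255, 173, 1405, 373, 464, 321]        # 3055 cells, 7 categories
print(dispersion(agricultural))  # (542, 710)
print(dispersion(forest))        # (436, 447)

# Toy 3 x 6 map: the left window holds 2 categories, the right holds 3.
g = [list("FFFCCW"),
     list("FFRCCW"),
     list("FFFCWR")]
print(categories_per_unit_area(g))  # 2.5
```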
They do, however, suggest that generalized LANDSAT classifications for diverse and complex sites suffer considerable loss of accuracy vis-a-vis their pixel-based original classifications. Less complex sites also show accuracy losses, but these are less severe.

The map products also illustrate the inherent problem which causes low accuracy levels when comparing a LANDSAT-derived classification with one prepared using photo-interpretation methods. The geocoded ground verification inventories are clearly the product of a generalized interpretation procedure. Patterns are bold and relatively simple, suggesting that the interpretation process has filtered out unnecessary data so that the categories detailed in the classification scheme are presented with an overt spatial ordering. Despite resampling to a 3 x 3 pixel level of aggregation, the LANDSAT classifications still retain their fractured appearance, which conforms to the exclusively spectral nature of the original data and the absence of mechanisms in the classification process to impose a spatial ordering similar to aerial photo-interpretation. With an accuracy check by pixel against a high-resolution ground-verification source, spatial ordering of this source has not yet taken place, and consequently the possibility of coincidence (i.e., accuracy) is at its greatest. Interpretation of land use classes with a minimum type size substantially larger than the data resolution, by definition, reduces the variation in this ground truth and thus, in most cases, reduces the possibility of correctly matching a single pixel. Geocoding generalizes both the ground-verification and the classification products. For the ground-verification data, the spatial order present in the interpreted data is retained, while resampling the LANDSAT products merely degrades the resolution of the classification without imposing any new order. As a consequence, accuracy levels are significantly reduced.

CHAPTER VI

CONCLUSIONS

Introduction

Much operational land use information is still gathered using traditional aerial photo-interpretation methods (Michigan DNR, 1981). Experiments have been conducted, however, to determine whether digital processing of LANDSAT data can provide similar or at least compatible information (Swain, 1976; Todd, 1977; Gaydos and Newland, 1978). Four major shortcomings are apparent in these investigations:

i) most of the studies, while purporting to serve the needs of users, nevertheless fail to define the requirements of large heterogeneous user groups in a systematic way;

ii) determination of which LANDSAT classification algorithms are most appropriate for land use mapping in different environmental situations is absent;

iii) accuracy evaluations of land use inventories were rare until recently; and

iv) scant attention is given to a consideration of the information once it has been converted to a geographic information system format. This latter aspect is assuming increased importance as statewide and local geographic information systems become the mechanisms through which much land use analysis is conducted.

This study developed information on user needs through an extensive survey focusing primarily on land use planners throughout Michigan. A series of land use categories was identified and subsequently modified for use as a classification scheme for processing LANDSAT data. The test included three study areas with different environmental characteristics: an Agricultural, an Urban, and a Forest test site. For each of these sites the LANDSAT data were classified by means of three algorithms. Accuracy evaluation of the resultant land use categories was conducted, which included examining the effects of generalization for geographic information system formats.
The conclusions derived from the study are considered in three phases: i) category definition, ii) testing of LANDSAT classification capabilities, and iii) the effects of generalization to geographic information system formats.

Category Definition

This process involves both reduction and rationalization of the almost 100 alternatives indicated by a statewide group of professional planners. It is an operational step required before actual data classification, but it has major implications for the objectives of the study. Three discrete stages are involved in the rationalization process.

From the analysis of the questionnaire responses it was possible to identify a series of categories that were favored by the majority of the respondents. These included seven Level I categories, eighteen Level II categories, and ten Level III categories. With inventory procedures based on standard aerial photography, the indicated Level II categories with Level III subcategories would be adopted for use in a classification system. Level I categories would be derived through a combination of more detailed information. The inherent limitations of LANDSAT data preclude using a category list of this nature as a classification scheme and, consequently, a rationalization of the categories was required. This second stage produced a minimum list of thirteen Level I and Level II categories from the questionnaire responses, confirming intuitive knowledge that LANDSAT cannot, a priori, provide all of the land use information required by a majority of planners. In terms of statewide applicability, however, these categories represent the dominant land uses that have to be inventoried. The categories omitted are predominantly intra-urban classes which represent a very small proportion of total area. The third stage of category definition comes during the establishment of training sites and is consequently part of the classification process.
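The upward combination of Level II categories into Level I classes described above is a simple recoding. A sketch (Python) with a hypothetical mapping in the spirit of the scheme discussed; the class names and groupings below are illustrative assumptions, not the study's final category list:

```python
# Hypothetical Level II -> Level I mapping (illustrative only).
LEVEL_I = {
    "Residential": "Urban",
    "Commercial/Industrial": "Urban",
    "Crops and Pasture": "Agriculture",
    "Range": "Rangeland",
    "Broadleaf Forest": "Forest",
    "Coniferous Forest": "Forest",
    "Water": "Water",
}

def to_level_i(classified_map):
    # Recode a Level II classification upward to Level I. A pixel confused
    # between Residential and Crops and Pasture at Level II carries that
    # error into Urban versus Agriculture at Level I.
    return [[LEVEL_I[c] for c in row] for row in classified_map]

m = [["Residential", "Crops and Pasture"],
     ["Broadleaf Forest", "Coniferous Forest"]]
print(to_level_i(m))  # [['Urban', 'Agriculture'], ['Forest', 'Forest']]
```

Because the recoding is a pure relabeling, any Level II confusion between categories that map to different Level I classes survives the aggregation, which is why the upward combination cannot be achieved without inherent error.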
Of the thirteen minimum categories, seven were finally used in the classification of the Agricultural and Forest sites and only five in the Urban site.

Category definition, then, pursued in a systematic way, generates two substantive conclusions. Most importantly, it shows that a relatively small number of land use categories will meet the needs of a majority of land use planners. In addition, these categories can be rationalized to an even smaller group by removing categories with high spatial resolution requirements. The final category list, therefore, consists of thirteen classes which are required by land use planners and are potentially interpretable from LANDSAT data. Detailed requirements of specific projects may prevent the application of LANDSAT data for a complete inventory, but the capability to update information on categories which constitute most major land uses in Michigan seems to be available.

The second conclusion is specific to the classification tests undertaken in the study. In developing training sites, it was not possible to define all the classification categories present in the test areas. Spectral confusion among categories was the primary cause of this problem, a confusion that is inherent in data from mid-August. Data from early June, or possibly multi-date data, would improve discrimination (Lichtenegger and Seidel, 1981); however, there are several land use categories which seem likely to pose problems regardless of the season of data acquisition. Distinguishing between Commercial and Industrial uses will always cause problems, although the two categories can reasonably be combined. Highly reflective targets, such as sand mining sites which belong in the Barren/Extractive category, also create problems in a multi-category inventory, although isolating these areas in a single-category mode with clustering does seem possible.
It may be possible to classify Orchards from data of a different season, but it seems likely that this category will also pose major problems for LANDSAT classification. In a strict sense, then, before an actual classification of data, it was shown that LANDSAT data cannot provide the minimum land use requirements of the majority of planners surveyed. This is a primary conclusion, linking the category definition and analytical elements of the study. The categories that remain for the test sites, nevertheless, represent the dominant land uses in each of the areas selected, and accurate updating and/or re-mapping of the final categories would be of considerable value. Testing the effectiveness of deriving these categories thus becomes the major objective of the classification test.

Testing LANDSAT Classifications

The classification results demonstrate the difficulties involved in any attempt to incorporate LANDSAT classification into an operational land use mapping program. Accuracies for the test sites were relatively low for the Urban and Forest areas and unacceptable in the Agricultural area. Variations were present among the different algorithms, with the unsupervised grouping of cluster classes substantially superior in the Agricultural area, maximum likelihood the most accurate in the Urban area, and minimum distance-to-means preferred in the Forest site. Despite this distribution, no significant difference existed between the overall performance of the algorithms at the 95% confidence level; this parallels similar results with LANDSAT classification of crops (Scholz et al., 1979) and forest (Fleming and Hoffer, 1977). The necessity of using single-date data from mid-August, as mentioned previously, had a substantially detrimental effect on classification performance.
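The minimum distance-to-means rule referred to above can be sketched in a few lines. This is a generic illustration, not the study's actual implementation, and the four-band MSS mean vectors for the two categories are hypothetical values chosen only to make the example run:

```python
import numpy as np

def minimum_distance_classify(pixels, class_means):
    """Assign each pixel (a vector of band values) to the class whose
    training-set mean vector is nearest in Euclidean distance."""
    names = list(class_means)
    means = np.stack([class_means[n] for n in names])  # shape (classes, bands)
    # Distance from every pixel to every class mean, shape (pixels, classes).
    d = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return [names[i] for i in d.argmin(axis=1)]

# Hypothetical 4-band MSS training means for two of the study's categories.
means = {"Cropland": np.array([30.0, 28.0, 60.0, 55.0]),
         "Water":    np.array([12.0, 10.0,  5.0,  3.0])}
pixels = np.array([[29.0, 27.0, 58.0, 54.0],
                   [13.0, 11.0,  6.0,  4.0]])
print(minimum_distance_classify(pixels, means))  # ['Cropland', 'Water']
```

Maximum likelihood differs only in replacing the Euclidean distance with a probability computed from each class's covariance matrix, which is why the two algorithms often perform comparably when class variances are similar.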
A lack of differentiability among the Cropland/Residential, Cropland/Broadleaf Forest, and Residential/Broadleaf Forest categories was present in the test data, degrading the results. Accuracy levels were nevertheless considerably higher than those reported for a test area in Minnesota where seasonally poor data, but substantially more sophisticated processing, were used (Nelson et al., 1981). Some of this difference may be due to the selection of three thematically distinct test sites, rather than a larger area in which a greater variety of land uses would be represented. The evidence does show, however, that micro-computer based analysis systems are capable of producing results comparable to larger systems, and perhaps inadvertently points toward the advantages of scene segmentation prior to classification, which further enhances the advantages of micro-computer based systems.

Generalization to Geographic Information System Formats

Conventional aerial photography-based land use inventories are converted to geographic information system formats through geocoding or digitization, most commonly to a 10-acre grid cell. Inevitably a generalization occurs. LANDSAT pixels, once classified, also have to be generalized to become compatible with this format. Comparison of geocoded classifications derived from aerial photo-interpretation with generalized LANDSAT classifications indicates a substantial accuracy loss vis-a-vis the same comparison between the actual aerial photography and the pixel-based LANDSAT classification. The extent of the accuracy loss seems to have a direct relationship with the diversity and spatial complexity of the scene being classified. This finding compounds the problem of poor classification accuracy and demonstrates that accuracy information has to be carefully considered, particularly in terms of the resolution of the ground-verification source.
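The generalization step described above can be sketched as a majority vote within each grid cell. This is a minimal illustration under the assumption that a 10-acre cell covers roughly a 3x3 block of classified MSS pixels (an MSS pixel is about 1.1 acres); the actual geocoding procedure used in the study is not specified at this level of detail:

```python
from collections import Counter

def generalize_to_cells(pixel_map, cell=3):
    """Collapse a classified pixel map to coarser grid cells by majority
    vote. pixel_map is a list of rows of class labels whose dimensions
    are divisible by `cell`."""
    rows, cols = len(pixel_map), len(pixel_map[0])
    out = []
    for r in range(0, rows, cell):
        out_row = []
        for c in range(0, cols, cell):
            # Gather the cell x cell block and keep its most common class.
            block = [pixel_map[i][j]
                     for i in range(r, r + cell)
                     for j in range(c, c + cell)]
            out_row.append(Counter(block).most_common(1)[0][0])
        out.append(out_row)
    return out

# A 3x3 block with 5 Forest and 4 Cropland pixels collapses to one
# Forest cell, discarding the minority class entirely.
pixel_map = [["Forest", "Forest",   "Cropland"],
             ["Forest", "Cropland", "Forest"],
             ["Cropland", "Forest", "Cropland"]]
print(generalize_to_cells(pixel_map))  # [['Forest']]
```

The example makes the mechanism of accuracy loss concrete: minority classes inside a cell vanish, and the more spatially mixed the scene, the more information the vote discards.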
The issue of differing accuracies due to generalization is important and requires further investigation. Future work that prepares ground truth information at LANDSAT resolution, and then adopts various methodologies for generalization in order to understand the mechanics of the accuracy deterioration which occurs with generalization to geographic information formats, will contribute to more effective operational application of LANDSAT data.

Current Applicability and Future Developments

The conclusions derived from this study show that digital classification of LANDSAT data can make only limited contributions to the inventory requirements of PA 204. A significant number of categories could not be obtained because of spatial resolution characteristics inherent to LANDSAT data. This situation was exacerbated through the use of seasonally poor data. The accuracy of those categories that were obtained met only minimal standards. A similar analysis with improved data would still supply only a small group of categories, but accuracy levels should be substantially improved. In these circumstances, critical updating of a pre-existing inventory would be possible. The Minnesota Land Information System is currently (1982) in such need of revised land use information that LANDSAT classification is being used in a selective updating procedure (Nelson et al., 1982). For a similar procedure to be adopted for PA 204 it is important to distinguish between the two modes of presentation envisaged for land use information within the Act: the delineated map and the geographic information system. The delineated map has to be of sufficient quality for use in site-specific applications, and this need is reflected in the very high accuracy levels indicated for categories in the information characteristics portion of the Michigan Data Needs Questionnaire.
Classified LANDSAT data, with information recorded for each acre of ground surface, have a finer resolution than delineated categories interpreted from aerial photography with a 10-acre minimum type size. Unless geometric registration is very precise, however, users have found these LANDSAT data, portrayed at scales of 1:24,000, to be unsatisfactory. The delineation process allows the interpreter to group land cover into coherent blocks of land use. LANDSAT spectral classifications represent land cover types and, as a consequence, single pixels will "appear" inaccurate while, in fact, representing true ground conditions. Post-classification algorithms to eliminate these types of pixels have been developed (Wilson, 1979), but the display of classified pixel information at relatively large scale still remains unsatisfactory from a user's point of view. Selective updating of delineated maps through digital LANDSAT classification is, therefore, inappropriate. Land use information in the geographic information system is viewed in a different fashion. Preliminary analysis of the questionnaire responses relating to limitations and suitability indicates that a significantly lower accuracy level is required for this type of information. The land use component in the system can therefore "accommodate" lower accuracies and less spatial precision. Analyses using a geographic information system take land use data in conjunction with other data sets and yield results that are then checked for applicability to particular land parcels through reference to precise base maps. Adjustments can be made when required. In these circumstances, classified LANDSAT data can be significant in an updating framework. The possibilities for using LANDSAT data in land inventory work are also on the verge of substantial new developments with the advent of data from LANDSAT-4, which was launched successfully on July 16, 1982 (Smith, 1982).
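The post-classification cleanup mentioned above, eliminating isolated pixels that "appear" inaccurate, is commonly done with a modal (majority) filter. The sketch below is a generic 3x3 version of that idea, not a reconstruction of the specific algorithm developed by Wilson (1979):

```python
from collections import Counter

def majority_filter(grid):
    """Replace each interior pixel's class by the modal class of its 3x3
    neighborhood, absorbing isolated single-pixel classes. Border pixels
    are left unchanged in this simple sketch."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            neigh = [grid[i][j]
                     for i in (r - 1, r, r + 1)
                     for j in (c - 1, c, c + 1)]
            out[r][c] = Counter(neigh).most_common(1)[0][0]
    return out

# A single "Water" pixel surrounded by "Urban" is absorbed by the filter.
grid = [["Urban", "Urban", "Urban"],
        ["Urban", "Water", "Urban"],
        ["Urban", "Urban", "Urban"]]
print(majority_filter(grid)[1][1])  # Urban
```

The trade-off the text describes follows directly: the filter cleans the map for display, but an isolated pixel may be a genuinely distinct ground feature, so smoothing can erase true conditions as easily as noise.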
Data from the Thematic Mapper (TM) system aboard this satellite will have better spatial resolution than the Multispectral Scanner (MSS) used on previous LANDSAT satellites, and will also contain a wider range of spectral information (Salomonson et al., 1981). The applicability of currently available classification algorithms to the new data has still to be fully tested. Preliminary classification with simulated TM data, in a forest environment, has indicated some improvement in accuracy compared with MSS data because of different spectral band selection, but no improvement could be detected as a consequence of the higher spatial resolution (Teillet et al., 1981). For many land uses, however, improved spatial resolution may be the key to using satellite data for mapping (Welch, 1982). New methods of analysis suggested by procedures adopted in this study, to take advantage of improved spatial resolution, may include an integration of visual interpretation with computer-assisted classification, particularly in updating procedures. Visual interpretation of MSS data displayed on the RGB monitor during training selection allowed the identification of some land use classes that could not be retained in the final classification system. This capability was strongly tied to the availability of aerial photography for the area being used in the training set definition. Higher-resolution data will improve the potential for this type of interpretation and could be used effectively, without supporting aerial photography, for updating land use by analysts familiar with the area being displayed. Future work in developing procedures through which LANDSAT data can fulfill the requirements of land use planners should explore this type of integration of methodologies to achieve productive results.
APPENDICES

APPENDIX A1

THE MICHIGAN RESOURCE INVENTORY ACT (PA 204)

STATE OF MICHIGAN
80TH LEGISLATURE
REGULAR SESSION OF 1979

Introduced by Senators Monsma, Kammer, Hertel, Arthurhultz and Allen

ENROLLED SENATE BILL No. 443

AN ACT to provide for a land resource and current use inventory in the state; to provide for technical assistance on the use of the inventory to municipalities, counties, and governmental planning and resource management entities; to create an inventory advisory committee in the department of natural resources to advise the legislature, governor, and the department; to prescribe the powers and duties of the inventory advisory committee; to provide for funding to municipalities, counties, and regional planning commissions for their participation in the inventory process; to prescribe certain duties of the department of natural resources; and to make an appropriation.

The People of the State of Michigan enact:

Sec. 1. This act shall be known and may be cited as the "Michigan resource inventory act".

Sec. 2. As used in this act:
(a) "Classification system" means a mechanism to identify the current use of land and any structures on the land.
(b) "Data management system" means a mechanism which relies on a computer to manipulate, store, and retrieve information collected and updated during a resource inventory.
(c) "Department" means the department of natural resources.
(d) "Inventory" means the land resource and current use inventory.
(e) "Municipality" means a city, village, or township.
(f) "Regional planning commission" means a regional planning commission designated by the governor pursuant to executive directive to carry out planning in a multicounty region of the state.
(g) "Technical assistance" means the aid which the department shall provide to municipalities, counties, and other interested groups and individuals, on the use of the land resource and current use inventory and related information for planning and resource management decisions.

Sec. 3. (1) The department shall make or cause to be made a project design study. The study shall determine the appropriate operational criteria, computer software and hardware, staffing, available information resources, data updating methodology, most economical inventory resources, location of data management operations, linkages with other data management systems in the state, data geographic base configuration, data delivery system, and other information necessary to complete the inventory and development of a data management system.
(2) The department shall make or cause to be made a land resource and current use inventory, as provided in sections 6 and 7, of all land, public or private, in this state. The land resource and current use inventory, if appropriate, shall rely on any other information and surveys.
(3) The department shall create a technical assistance program for the purpose of providing services to municipalities and counties as provided in section 5.
(4) The department shall prepare recommendations for consideration by the inventory advisory committee regarding means to address problems or issues indicated by the inventory.

Sec. 4. (1) There is created an inventory advisory committee in the department. The governor shall appoint members of the inventory advisory committee with the advice and consent of the senate. The committee shall be composed of representatives from municipalities, counties, regional planning commissions, soil conservation districts, the major state agencies affected by the inventory, federal agencies, major land resource based industries, environmental groups, and the general public from various areas of the state.
The committee shall consist of 20 members. Not less than 4 of the members shall be from the Upper Peninsula, not less than 4 of the members shall be from the Lower Peninsula north of townline 16, and not less than 4 of the members shall be from the Lower Peninsula south of townline 16. The term of each member shall be 4 years, except that of those first appointed, 5 shall be appointed for 1 year, 5 for 2 years, 5 for 3 years, and 5 for 4 years. A vacancy shall be filled for the balance of the unexpired term in the same manner as the original appointment.
(2) A minimum of 5 subcommittees of the inventory advisory committee shall be created. The subcommittees shall be composed of members of the inventory advisory committee and other individuals. The subcommittees shall study the inventory requirements in the areas of agriculture, forestry, minerals, unique and special environments, and developed land. Each subcommittee shall include members of the inventory advisory committee whose particular interests are affected by those portions of the inventory dealing with 1 of those areas.
(3) The inventory advisory committee shall be responsible for reviewing and assisting the department's efforts in completing the inventory and providing for its use. The committee's review and assistance shall include the development of:
(a) The definitions of the features to be inventoried in the land resource inventory as provided in section 6.
(b) The technical assistance program to help municipalities and counties effectively use the inventory as provided in section 5.
(c) Final recommendations to be sent to the legislature and the governor regarding the manner of addressing the problems or issues indicated by the inventory as provided in section 3(4).
(4) The inventory advisory committee shall approve or disapprove the criteria for participation in the current use portion of the inventory process as provided in section 8.
(5) The inventory advisory committee shall meet not less than 3 times each year.
(6) The legislature annually shall establish a schedule for reimbursement of expenses of the inventory advisory committee members.

Sec. 5. (1) The department shall create a technical assistance program designed to help municipalities and counties effectively use the inventory. The technical assistance program, when feasible, shall utilize the technical assistance programs of regional planning commissions. The technical assistance shall include:
(a) The publication and distribution of the inventory as applicable to each municipality and county in the state.
(b) The preparation and distribution of land resource management manuals to assist municipalities and counties, planning and resource management entities, and other federal, state, and local agencies in updating their planning and resource management programs to incorporate the inventory. Land resource management manuals may also be prepared to assist municipalities and counties in solving problems which confront their planning resource management programs.
(c) The conducting of workshops, in conjunction with local government associations, regarding the inventory.
(d) The provision of a team of experts on the inventory to assist in problem solving by municipalities and counties.
(e) The provision of an inventory information center and library function which municipalities and counties may utilize in their own programs.

Sec. 6. (1) The land resource portion of the inventory shall be completed in a format that may be readily integrated into the data management system, and shall provide a base of information to analyze the existing and future productivity of the state's natural resources and provide information to assist in the analysis of the timing, location, and intensity of future development in the state.
The format should also include information that will be readily usable and available to assist local governmental units in their land use planning. The inventory may include any of the following:
(a) Geological features, including groundwater features such as depth to groundwater, groundwater recharge zones, and potable aquifers.
(b) Land area with characteristics which pose problems to development such as an area subject to reasonably predictable hazardous natural phenomenon which may include flooding, high risk erosion, or subsidence.
(c) Land area with characteristics which make it suited for agricultural use.
(d) Land area with characteristics which make it suited for silvicultural use.
(e) Metallic and nonmetallic mineral deposits.
(f) Hydrological features, including lakes, rivers and creeks, impoundments, drainage basins, and wetlands.
(g) Land area of wildlife habitat, including each significant breeding area or area used by migratory wildlife.
(h) Topographic contours.
(2) If the department designates an area as wetland, the state may negotiate and contract for an option to purchase or exchange the wetland in order to protect the wetland. The option to purchase or exchange the wetland shall be valid for 5 years. After an option to purchase is negotiated a person may apply for and receive consideration for an exemption from property taxes levied pursuant to Act No. 206 of the Public Acts of 1893, as amended, being sections 211.1 to 211.157 of the Michigan Compiled Laws, for the duration of the option to purchase.

Sec. 7. The current use portion of the inventory shall be completed using a consistent classification system that can be readily integrated into the data management system, and shall provide the base to analyze the existing use and cover in the state.
The current use inventory may include any of the following:
(a) Substantially undeveloped land devoted to the production of plants and animals useful to humanity, including forages and sod crops; grain and feed crops; dairy and dairy products; livestock, including the breeding and grazing of those animals; fruits of all kinds; vegetables; and other similar uses and activities.
(b) Land in the production of fiber and other woodland products, or supporting trees for the protection of water resources, soils, recreation, or wildlife habitat.
(c) Land which is being mined, drilled, or excavated for metallic and nonmetallic mineral, rock, stone, gravel, clay, soil, or other earth, petroleum, or natural gas resources.
(d) A site, structure, district, or archaeological landmark which is officially included in the national register of historic places or designated as an historic site pursuant to state or federal law.
(e) Urban and developed land, including residential, commercial, industrial, transportation, communication, utilities, and open space uses, including recreational land.
(f) Land owned on behalf of the public, including land managed by federal, state, or local government or school districts.
(g) Land enrolled in Act No. 116 of the Public Acts of 1974, as amended, being sections 554.701 to 554.719 of the Michigan Compiled Laws.
(h) Land enrolled in Act No. 94 of the Public Acts of 1925, as amended, being sections 320.301 to 320.314 of the Michigan Compiled Laws.
(i) Land designated for tax abatements, restricted use, or specific use under a public act of the state of Michigan.

Sec. 8. (1) The current use portion of the inventory may be conducted by municipalities, counties, or regional planning commissions as provided in subsection (4). A municipality, county, or regional planning commission conducting a portion of the current use inventory shall conduct that portion on a scale, level of detail, format, and classification system prepared by the department.
(2) Within 9 months after the effective date of this act, the department shall prepare criteria for municipality, county, and regional planning commission participation in the current use inventory process. The criteria shall specify the scale, level of detail, format, and classification system to be used in the current use portion of the inventory, and shall contain forms and information on the financial reimbursement provisions provided in section 9.
(3) The inventory advisory committee shall approve or disapprove the criteria for participation in the current use portion of the inventory as prepared by the department under subsection (2). If the committee disapproves the criteria, the department shall again prepare the criteria, taking into consideration any comments made by the committee.
(4) The criteria prepared under subsection (2) shall be circulated by the department to local government associations and to a municipality, county, or regional planning commission, upon request. Within 1 year after the effective date of this act, a municipality with an established planning commission may submit to the department and to the county board of commissioners of the county in which the municipality is primarily located a notice of intent to perform or cause to be performed the work necessary to complete the current use portion of the inventory. Within 15 months after the effective date of this act, a county with an established planning commission may submit to the department a notice of intent to perform or cause to be performed the work necessary to complete the current use portion of the inventory for each area for which a municipality is not performing the work necessary to complete the current use portion of the inventory.
Within 18 months after the effective date of this act, a regional planning commission may submit a notice of intent to the department to perform the work necessary to complete the current use inventory for each area not covered by a municipality or county notice of intent. For each area not covered by a notice of intent under this subsection, the department shall make or cause to be made the current use portion of the inventory.
(5) A municipality, county, or regional planning commission engaged in the preparation of the current use portion of the inventory may make use of assistance, data, and information made available to them by public or private organizations.

Sec. 9. The state shall reimburse each municipality, county, or regional planning commission engaged in the preparation of the current use portion of the inventory for 75% of the expenditures certified by the department. Certification shall be based upon conformance to the format, scale, and classification system provisions of the contract between the municipality, county, or regional planning commission and the department. If the amount appropriated during any fiscal year is not sufficient to provide the 75% reimbursement, the director of the department of management and budget shall prorate an amount among the eligible municipalities, counties, and regional planning commissions.

Sec. 10. (1) The project design study required under section 3(1) shall be completed within 6 months after the effective date of this act.
(2) The first land resource and current use inventory of this state shall be completed within 3 years after the effective date of this act.

Sec. 11. (1) The land resource portion of the inventory shall be reviewed and updated when necessary, but not less than once every 10 years.
(2) The current use portion of the inventory shall be reviewed and updated when necessary, but not less than once each 5 years.

Sec. 12.
(1) This act shall not be construed to permit the state, the department, or a person to exercise control over private property or to curtail development of private property.
(2) This act shall not:
(a) Constitute a state land use plan.
(b) Be used by any state agency to control the existing and future productivity of the state's natural resources or the timing, location, or intensity of future development in the state.

Sec. 13. (1) There is hereby appropriated from the general fund for the fiscal year ending September 30, 1980 the sum of $250,000.00 for the department of natural resources for the purpose of carrying out the project design study described in section 3(1) of this act.
(2) To further carry out the requirements of this act it is anticipated that a minimum of approximately $1,000,000.00 for each year for the next 3 years will be required.

Sec. 14. This act shall expire 5 years from the effective date of this act.

APPENDIX A2

MICHIGAN DATA NEEDS QUESTIONNAIRE

1. Transmittal Letter, p. 178.
2. Instruction Sheet, p. 179.
3. Questionnaire Document, back pocket.

STATE OF MICHIGAN
NATURAL RESOURCES COMMISSION
WILLIAM G. MILLIKEN, Governor
DEPARTMENT OF NATURAL RESOURCES
STEVENS T. MASON BUILDING, BOX 30028, LANSING, MI 48909
HOWARD A. TANNER, Director

Dear Colleague:

The Michigan Resource Inventory Act (1979 PA 204) program staff is seeking your thoughts and suggestions through response to the enclosed questionnaire. As you are probably aware, the PA 204 program is designed to collect, organize and provide to local decision makers land cover/use and land resource inventory information. In order to insure that the information collected is useful to the largest possible cross-section of decision makers, we are asking you to tell us what you need and how you would like to see it presented.
Your response is vital to the successful implementation of this new program. The questionnaire is designed to elicit your response as to the format, scale, level of detail, frequency and other important components of inventory information. At first glance it may seem long and complicated. In fact, it can be completed quite quickly by following the step by step instructions and checking only those options that are appropriate for your individual situation. Responses to the questionnaire will be tabulated and presented to the Inventory Advisory Committee early in 1981 so that they can finalize their recommendations to the Department of Natural Resources. We would be pleased if you or someone within your agency could complete the questionnaire and return it by December 31, 1980. A return address with pre-paid postage is included as part of the questionnaire and it need only be re-folded and mailed. Results of the survey will be distributed to participants as soon as is feasible.

Thank you for your time and cooperation.

Sincerely,

Karl R. Hosford, Chief
Division of Land Resource Programs

Jon Bartholic, Coordinator
Center for Remote Sensing
Michigan State University

STEP-BY-STEP INSTRUCTIONS TO SPEED UP YOUR RESPONSE TO THE QUESTIONNAIRE

STEP A. Unfold the questionnaire to the full 25" x 19" format.
STEP B. Begin reading the introductory information in the top left-hand corner of the questionnaire.
STEP C. Stop and review each of the three questionnaire sections in turn:
1. RESOURCE INFORMATION CATEGORIES: descriptions which detail the categories are found adjacent to or below the matrix boxes.
2. INFORMATION CHARACTERISTICS: technical characteristics listed along the top of the questionnaire.
3. MATRIX BOXES: where your actual response is made.
STEP D. Complete the Current Use Inventory Matrix. Select the mix of Level I, II and III categories required by your agency and enter them by name and number in the box. Next, enter the technical characteristics most appropriate for each of your selections. If you have to specify a response that is not listed, enter the number in the box and then, in the comments section, indicate the category, the technical characteristic to be clarified, and the specific option you require.
STEP E. Continue reading the introductory information, paying particular attention to the example. If the format is unclear at this point, consider calling the Center for Remote Sensing for further explanation (517-353-7195).
STEP F. Complete the three other matrix boxes: Special Lands (4), Land Resource Data (5), and Composite Indicators (6).
STEP G. For those categories which are important to you, enter the technical characteristics that are required.
STEP H. Complete the Name, Position, Agency block.
STEP I. Refold the questionnaire, staple and mail.

Thank you for your time and effort in filling out this questionnaire.

APPENDIX A3

AGENCIES RESPONDING TO THE MICHIGAN DATA NEEDS QUESTIONNAIRE

Regional Planning Agencies
Region 2 Jackson, Lenawee, Hillsdale
Region 3 South Central Michigan
Region 4 Southwestern Michigan
Region 5 Genesee, Lapeer, Shiawassee
Region 6 Tri-County
Region 7 East Central Michigan
Region 8 West Michigan
Region 9 NEMCOG
Region 10 Northwestern Michigan
Region 11 Eastern Upper Peninsula
Region 12 Central Upper Peninsula

County Planning Agencies
Livingston Clinton Antrim Macomb Eaton Benzie Oakland Ingham Emmet St.
Clair Bay Grand Traverse Wayne Midland Manistee Jackson Saginaw Wesford Branch Huron Marquette Calhoun Mecosta Ontonagon Kalamazoo Newaygo Baraga Berrien Allegan Washtenaw Shiawassee Mason Cass Lapeer Oscoda County Commissioners Townships Osceola Redford, Wayne County Presque Isle Bloomfield, Oakland County Alpena Meridian, Ingham County Crawford Waterford, Oakland County Montmorency Delta, Eaton County Menominee Addison, Oakland County 180 181 Appendix A3 (cont*d.)« pities Highland Park Big Rapids Farmington Hills Kentwood Port Huron Ann Arbor Springfield Flint Royal Oak Livonia Southfield Saginaw Wyoming Westland Marquette Grand Rapids Detroit Alpena Others Wayne County Cooperative Extension Service Allegan County Cooperative Extension Service Alpena Drain Commission Huron County Equalization Office DNR District Forester Van Buren Soil Conservation District Menominee Soil Conservation District Resource Information Associates American Aggregate Corporation Michigan Community Development APPENDIX A4 MICHIGAN REFERENCE MAP 102 APPENDIX A5 CURRENT USE INVENTORY: Regions (11) F Z County Planners (35) F X 1 3 27 14 40 3 50 20 38 2 33 5 28 5 50 32 37 11 5 45 20 57 3 50 27 52 4 67 6 33 4 40 42 49 111 7 64 19 54 2 33 28 54 4 67 15 83 2 20 49 57 113 7 64 20 57 2 33 29 56 4 67 14 78 2 20 49 57 114 5 45 9 26 1 17 15 29 0 0 8 44 2 20 25 29 115 7 64 19 54 2 33 28 54 1 17 9 50 3 30 41 48 119 2 18 8 23 1 17 11 21 0 0 6 33 0 0 17 20 12 5 45 24 69 1 17 30 58 4 67 6 33 4 40 44 51 121 5 45 15 43 1 17 21 40 2 33 11 61 0 0 34 39 122 5 45 11 31 1 17 17 33 3 50 13 72 1 10 35 41 123 4 36 15 43 1 17 20 38 3 50 15 83 2 20 40 46 124 5 45 10 29 1 17 16 31 2 33 10 56 0 0 28 33 126 2 18 5 14 0 0 7 13 0 0 4 22 0 0 12 14 11 61 4 40 56 65 Category MLCUS # County Commissioners (6) F X SUMMARY OF RESPONSES BY USER GROUP SUBTOTAL (52) F X Townships (6) F z Cities (18) F X Others (10) F X TOTAL (86) F Z 8 73 28 80 1 17 37 71 4 67 1 9 4 11 0 0 5 10 0 0 5 28 0 0 10 12 132 2 18 4 11 0 0 6 11 0 0 4 22 0 
[Original pages 183-187: tabulation of responses by MLCUS category number, cross-tabulated against eight respondent groups: Regions (11), County Planners (35), County Commissioners (6), Subtotal (52), Townships (6), Cities (18), Others (10), and Total (86). Each cell gives a frequency (F) and a percentage (%). The individual table values are too garbled in this reproduction to be recovered.]

APPENDIX A6

TABULATION OF RESPONSES TO INVENTORY CHARACTERISTICS

[Original pages 188-207: responses tabulated for the Level I categories (1. Urban, 2. Agriculture, 3. Range, 4. Forest, 5. Water, 6. Wetlands, 7. Barren), the Level II categories (Residential, Commercial, Industrial, Transportation, Extractive, Open Space, Cropland, Orchards, Confined Feeding, Other Agriculture, Streams, Lakes, Forest Wetlands, Non-Forest Wetlands, Broadleaf Forest, Coniferous Forest, Mixed Forest), and the Level III categories (Multi-Family Residential, Single-Family Residential, Mobile Home Parks, Rail Transportation, Road Transportation, Solid Waste Management, Shopping Centers, Strip Development, Central Business District), under the headings A. Coverage, B. Resolution, C. Accuracy, D. Scale, E. Non-Map Formats, F. Application, G. Importance, H. Frequency, and I. Update. T = number of responses from the Total Response Group; P = number of responses from the Planner Sub-Group. The individual table values are too garbled in this reproduction to be recovered.]
APPENDIX B

TRAINING SET SELECTION PROCEDURES

The following is a brief review of the steps involved in the training set selection procedure.

1. Assemble Materials.

i) Ground verification information. Color infrared photography at a scale of 1:24,000, available from the Michigan Department of Natural Resources (DNR), was used as ground verification information for this study. The photography was flown as part of the state-wide coverage obtained by the DNR in 1977-78. It was used both to obtain training field information and as a ground verification source against which to test the accuracy of the classified products.

ii) Read the LANDSAT data from the computer compatible tape (CCT) supplied by the EROS Data Center to disk on the CDC mainframe computer.
Transfer these data via telephone link to the ERDAS microcomputer (a 1200 baud line is available).*

iii) Once the study data are available for manipulation on the ERDAS system, generate study area histograms and establish the color gun assignments for use in color display of these data.

2. Initial Training Set Development.

i) Review ground verification materials and select a series of training fields that represent the land use and land cover classes required in the analysis. The areas selected should be spectrally as homogeneous as possible, and this may require selection of several sites which represent various aspects of one information class.

ii) Display the study area on the ERDAS color monitor at nominal scale and in a general way associate the image with the ground verification information. Problems may occur because of time differences between the ground verification and LANDSAT acquisition dates: a cropped field on the ground verification material may be a harvested field on the LANDSAT data, and as a consequence have very different spectral characteristics. This is an important first step, as going directly to a magnified display can cause orientation problems and inhibit accurate definition of the training fields. When the analyst feels familiar with the relationships between the ground verification information and the displayed LANDSAT data, the actual definition of areas can take place. For this process it is useful to re-display the image at a magnification of 2 or 3, which means taking several segments of the study area one at a time.

*This task is accomplished by an ERDAS program named LISTEN. Early versions of this program would not function correctly, and the study area data were transferred to diskette using facilities at the Georgia Technological University. A subsequent version of LISTEN is now operable, but a tape drive has been added to the ERDAS system at MSU, making data transfer to diskette a self-contained operation.
Training sites are outlined in the image display, and a histogram of the site is displayed on the graphics plane of the monitor. The capability exists to locate 100 separate vertices. Some statistical information and the screen coordinates of the training set are displayed on the communications monitor. This information gives the analyst a first cut on whether to accept or reject the site. The histograms should be unimodal, with maximum and minimum values fairly close to the mean. If the training set seems to be valid, the next step is to "alarm" the image display, which produces a single-category parallelepiped classification. If ground truth materials are fairly extensive, this procedure can give a good visual indication of the validity of the training set. If the training set is to be retained, it now has to be given a name and the screen coordinate locations need to be recorded. All potential training fields from the ground verification information have to be tested in this way and, if accepted, given a name and their coordinate locations recorded.

iii) As was indicated earlier, previous research has suggested that this type of interactive selection of training sites does not always identify the full spectral variability present within the scene. Clustering, by definition, accomplishes this and is used here primarily as a confirmatory device. A maximum of 27 clusters can be defined using the ERDAS software. In addition, it is possible to specify three parameters which determine how the algorithm performs: the minimum distance between cluster means, the maximum radius of individual clusters, and the number of points to be processed before clusters are merged (values are expressed in units of spectral space). Clustering of all the test data sets used a minimum distance between clusters of 5, a maximum radius of 20, and 100 points until merger. The results of the clustering are read into an information system file and displayed on the color monitor.
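The clustering step just described can be sketched in code. The actual ERDAS algorithm is not documented here, so the control flow below (nearest-cluster assignment with a periodic merge pass) and all class and function names are assumptions; only the three parameters and the 27-cluster limit come from the text.

```python
import math

MAX_CLUSTERS = 27     # ERDAS limit on the number of clusters
MIN_DISTANCE = 5      # minimum distance between cluster means
MAX_RADIUS = 20       # maximum radius of an individual cluster
MERGE_INTERVAL = 100  # points processed between merge passes

class Cluster:
    def __init__(self, pixel):
        self.n = 1
        self.total = list(pixel)

    def mean(self):
        return [t / self.n for t in self.total]

    def add(self, pixel):
        self.n += 1
        self.total = [t + p for t, p in zip(self.total, pixel)]

    def absorb(self, other):
        self.n += other.n
        self.total = [t + o for t, o in zip(self.total, other.total)]

def merge_pass(clusters):
    """Merge clusters whose means lie closer than MIN_DISTANCE."""
    kept = []
    for c in clusters:
        for k in kept:
            if math.dist(k.mean(), c.mean()) < MIN_DISTANCE:
                k.absorb(c)
                break
        else:
            kept.append(c)
    return kept

def run_clustering(pixels):
    """Sequential one-pass clustering of a list of multiband pixels."""
    clusters = []
    for i, px in enumerate(pixels, start=1):
        nearest = min(clusters, key=lambda c: math.dist(c.mean(), px),
                      default=None)
        if nearest is not None and math.dist(nearest.mean(), px) <= MAX_RADIUS:
            nearest.add(px)
        elif len(clusters) < MAX_CLUSTERS:
            clusters.append(Cluster(px))
        else:
            nearest.add(px)  # cluster limit reached: assign to nearest anyway
        if i % MERGE_INTERVAL == 0:
            clusters = merge_pass(clusters)
    return merge_pass(clusters)
```

For example, four-band pixels drawn from two well-separated spectral groups end up in two clusters.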
With 27 classes, two files have to be created: the first displaying clusters 1-15 with clusters 16-27 aggregated into the 16th color level, and a second displaying clusters 16-27 with clusters 1-15 aggregated into one color level. Using an option in the information system software it is possible to locate, through their X,Y locations, the training sites selected during the supervised procedure, and to check the spectral homogeneity of the training classes. This type of checking may cause the analyst to flag a site as possibly containing mixed spectral classes, and therefore a candidate for rejection, or to change the X,Y locations of the set to better reflect a spectrally homogeneous training area. If the X,Y locations are re-defined, the set will eventually have to be re-defined in the LANDSAT training mode of ERDAS. In addition to cluster-confirming the supervised training sites, this evaluation may indicate that a spectral class within the data has not been accounted for. Should this be the case, the class can be defined in X,Y coordinates and compared with ground verification data to ascertain its informational value. A new training set defined in this way will also have to be redefined in the LANDSAT training mode.

The results of steps (ii) and (iii) are a series of training sets with informational value and apparent spectral separability. Software to generate separability indices such as the Swain-Fu index (Swain and Davis, 1980) is not available on the ERDAS system.

3. Evaluate Training Set Statistics.

Depending on the availability of ground verification information, the analyst should attempt to select several training sites to represent a particular spectral class. Statistical information should be generated for each of these sites and evaluated as to its acceptability.
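The per-site statistical screening described in step 3 might look like the following sketch. The data layout and the threshold k are illustrative assumptions, loosely mirroring the earlier criterion that a training site's histogram extremes should fall close to the mean.

```python
import statistics

def band_statistics(bands):
    """bands: one list of brightness values per spectral band.
    Returns mean, standard deviation, and extremes for each band."""
    return [{"mean": statistics.mean(b), "sd": statistics.pstdev(b),
             "min": min(b), "max": max(b)} for b in bands]

def acceptable(bands, k=2.5):
    """Accept the site only if every band's extremes lie within k standard
    deviations of that band's mean (k is an assumed threshold, not a value
    taken from the study)."""
    return all(abs(s["max"] - s["mean"]) <= k * s["sd"] and
               abs(s["mean"] - s["min"]) <= k * s["sd"]
               for s in band_statistics(bands))
```

A tightly clustered band of values passes this screen, while a band containing a stray high value does not.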
Attention up to this point has been directed at obtaining informational value and separability; however, the size of training sites has also been shown to be an important consideration in training site acceptability. Similar ground verification training sites can be checked for comparability and accepted or rejected, and smaller acceptable sites combined into one group of pixels representing a particular spectral class, which will subsequently be used in the classification process. This evaluation phase will most likely take a number of iterations of selection and rejection to rationalize the final training set selections, grouping together cluster-selected and supervised-selected training areas.

APPENDIX C1

REVIEW OF ACCURACY TESTING PROCEDURES

Testing the accuracy of thematic maps has begun to receive attention in the literature over the last few years. A number of investigators have approached the problem from different directions, and the results are somewhat conflicting (Hord and Brooner, 1976; Van Genderen and Lock, 1977; Hay, 1979; Ginevan, 1979). There is agreement, however, on three issues that must be addressed in establishing a procedure to test the accuracy of a thematic map. Sample selection methods, estimation of sample size, and the determination of a confidence interval for the accuracy statistics are all critical elements. Each of these is briefly reviewed, and the specific methodology selected to test the study maps is indicated.

1. Sampling Procedure.

A stratified, systematic, unaligned sampling technique originally proposed by Berry and Baker (1968) is suggested by most authors as the best method of obtaining a bias-free sample for accuracy evaluation. This method was specifically recommended by Berry for use in the land use mapping program of the U.S. Geological Survey (Fitzpatrick-Lins, 1981). The procedure involves creating a square grid that can be overlaid on the area to be checked.
The dimensions of the grid are deter­ mined by the number of sample points to be selected. For example, an area of 30,000 pixels from which 300 samples have to be drawn would require cells containing 100 pixels, with dimensions of 10 x 10 pix­ els. Once the grid has been constructed, an origin point within the upper left square is selected by using a pair of random numbers to 213 214 establish the coordinates. The x coordinate of this upper left square is then used with newly selected random y coordinates to locate the remainder of points within the top row of grid squares. Similarly the y coordinate of the upper left square is used with random x coordi­ nates to locate the remainder of points within the first column of grid squares. The random x coordinate established for the subsequent grid squares of column 1 are then used with the random y coordinates from the points established in row 1 to fill in points for all the other rows of the matrix (Figure C1.1). X AXIS 1 1 1 I — A 1 1 1 -4 \ -• 2 S < — r— • • • • • • • • • • • • • • • • • • » • • • • • • • • • • • • • • • • • • • • • • • • • Figure Cl.l. • • • > i I — i i • • • • A Stratified Systematic Unaligned Sample Grid Stratified random sampling does not, in fact, always lead to an adequate number of test samples to determine the accuracy of each land use category because it is an area dependent methodology rather than a category dependent one. Consequently, categories with small areas and/or categories concentrated in small parts of the map tend to be under-represented in the sample. Van Genderen and Lock (1977) avoid this problem by continuous sampling within each category until a prescribed number is reached. Fitzpatrick-Lins (1981) describes the 215 USGS GIRAS software program which deals with the problem in terms of strata. 
A first stratum consists of a set number of samples (i.e., the 300 from the earlier example) and then a second stratum of additional samples is drawn for specific categories that have been under-represented (i.e., additional samples will be selected for categories with less than 20 points). This, however, is an automated procedure and is facilitated by a digitized data base. It is acknowledged that this methodology creates an unmanageably large number of sample points for manual accuracy evaluation. Operationally, implementation of a category dependent sampling scheme was judged to be prohibitively time-consuming. The actual effectiveness of a standard stratified systematic unaligned sample is, however, intricately related to the sample size chosen and can be adequate for certain sets of circumstances.

Another factor of concern with category-weighted samples has been the development of appropriate confidence intervals. Until recently the literature has been unclear on how category weighting affects the establishment of confidence limits. Very recent studies (Rosenfield et al., 1982; Card, 1982) appear to have clarified this confusion; however, the data necessary to evaluate accuracy in the present study were being collected before this new information was available.

2. Sample Size. The critical factors determining the sample size necessary to represent a particular accuracy level are:

i) whether the investigator is looking for a single percentage figure to represent the accuracy of the map as a whole, or whether it is important to be able to indicate accuracy levels for specific categories within a classification scheme, and

ii) the allowable error and confidence interval an investigator is prepared to accept with reference to that accuracy figure.

Hord and Brooner (1976) approach the determination of accuracy levels for the whole map, arguing that a single accuracy estimate will be of most value to the majority of map users.
Using the binomial distribution they develop a table which details the upper and lower accuracy limits at the 95% confidence level, for accuracies from 80% to 100%, with sample sizes of 50 through 400. They show clearly that the validity of an accuracy statement is dependent on the number of samples taken, but do not present criteria for selecting how many samples are appropriate in a particular situation.

Van Genderen and Lock (1977) suggest that in an operational accuracy evaluation each category should be checked separately. They also employ the binomial distribution to create a table which specifies an anticipated interpretation accuracy, but tie this to the probability of drawing an error-free sample at that particular level in relation to sample size. The table shows that no-error sample results can quite easily be obtained with small samples when the true error rate is quite high. By dividing the table on the basis of the 95% confidence level it is possible to determine the minimum sample size required to check for an anticipated map accuracy level. For example, for 80% accuracy at least 15 samples per category are required; for 85%, 20 samples; and for 90%, 30 samples.

Ginevan (1979) suggests three criteria are important in the establishment of an accuracy sampling scheme. They are:

i) there should be a low probability of accepting a map of low accuracy,

ii) there should be a high probability of accepting a map of high accuracy, and

iii) the first two objectives should be achieved with a minimum number of ground truth samples.

Given these criteria, Ginevan suggests that the statistical frameworks presented by Hord and Brooner (1976) and Van Genderen and Lock (1977) meet the first and third criteria, but ignore the second. Both procedures he reviews have a fairly high probability of rejecting a map of high accuracy, which could result in additional re-checking of maps that were, in fact, acceptably accurate.
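The logic behind the Van Genderen and Lock table can be illustrated with a short calculation. Under a binomial model the chance that n checked samples all turn out correct, when the map's true per-sample accuracy is a, is simply a to the power n; the smallest n that pushes this chance below 5% is a lower bound on the category sample size. This is a sketch only: the published minima (15, 20 and 30) are one sample more conservative than the bare threshold computed here.

```python
def prob_error_free(accuracy, n):
    # Probability that n independent checks of a map whose true
    # per-sample accuracy is `accuracy` (a fraction) all come back correct.
    return accuracy ** n

def min_sample_size(accuracy, alpha=0.05):
    # Smallest n for which an error-free sample would be rare (< alpha)
    # if the true accuracy were only at the stated level.
    n = 1
    while prob_error_free(accuracy, n) >= alpha:
        n += 1
    return n

# With 15 samples and a true accuracy of 80%, an error-free draw
# has probability 0.8 ** 15, about 0.035 -- below the 5% level.
```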
Ginevan presents a procedure for meeting all three criteria developed from a branch of statistics known as acceptance sampling, which was developed to evaluate whether large lots of manufactured goods are of acceptable quality. In this situation there is both "consumer's" and "producer's" risk. The consumer's risk is analogous to the probability of accepting an inaccurate map and the producer's risk is analogous to rejecting an accurate map. Both factors have to be taken into consideration when estimating the accuracy level of a particular map. By means of the binomial probability density function a complicated series of tables is prepared which allow the user to identify both of these risks, within certain limitations, and to derive from them the number of samples necessary to check a map and the number of errors acceptable before the map must be rejected as inaccurate. For example, the lowest level of accuracy considered acceptable in Ginevan's tables is 85% so that, while a 75% level was determined accurate for the purposes of this study, 85% has to be used in this instance. Consumer's risks of 5% and 1% are available in the table and, opting for the lenient side, a 5% level would be considered for the test. The user then decides on the producer's risk of rejecting a map that is either 90%, 95% or 99% accurate. A 5% risk of rejecting a 90% accurate map, for instance, means a sample of 379 points with the possibility of up to 45 errors (Ginevan, 1979, Table 2, p. 1373), while a 5% risk of rejecting a 95% accurate map under the same circumstances means 76 points and up to 6 errors (Ginevan, 1979, Table 2, p. 1373). As with the other systems discussed, the larger the anticipated error that a user builds into the initial assumptions, the larger the number of sample points required to test for a required level of accuracy.
With this scheme, the number of points to be sampled in order to avoid rejecting a map estimated to be 90% accurate is substantially higher than for a map estimated to be 95% accurate.

Rosenfield et al. (1982) and Fitzpatrick-Lins (1981), in describing the procedure used by the USGS National Land Use and Land Cover Mapping Program, provide a critique of the methodologies suggested by Hord and Brooner and Van Genderen and Lock. No reference is made to Ginevan's scheme. Although not explicitly stated as such, their methodology separates sample size selection and accuracy evaluation into two distinct procedures rather than integrating them into one combined step through use of a look-up table. Using the same statistical foundation as the other workers, based on binomial probability theory, their analysis suggests that the best method of sample size selection is through use of the equation:

n = Z²pq/E²    (1)

where:
p = expected accuracy (%)
q = 100 - p
E = allowable error (%)
Z = 2 (generalized from the standard normal deviate of 1.96 for the 95% two-sided confidence level)

(Source: Snedecor and Cochran, 1967, p. 517)

This formulation gives the analyst considerable flexibility in establishing appropriate criteria for the thematic map in question. For example, most schemes suggest that a minimum acceptable accuracy level for a land use map is the 85% indicated by Anderson (1976), and their tables are constructed accordingly. Lower accuracy levels may be more appropriate in some instances. Allowable errors of 5% and 10% are standard, as is the 95% confidence level for accepting results but, again, circumstances may require substitution of different values, particularly as a means to "manage" the number of sample points that have to be verified.

In the current test an initial review of the classification output suggests that a realistic estimate of the expected accuracy level is 75%.
Establishing an allowable error of 5% and a 95% confidence level for the acceptance of the results, the sample size would be:

n = 4(75 x 25)/5² = 300    (2)

A smaller sample size can be obtained by shifting the allowable error marginally upward to 6% (equation 3) or adjusting the confidence level for the acceptance of results slightly wider to 90% (equation 4).

n = 4(75 x 25)/6² = 208    (3)

n = 2.706(75 x 25)/5² = 202    (4)

(In equation 4, Z = 1.645, generalized from the standard normal deviate of 1.6448 for the 90% two-sided confidence level. Source: Snedecor and Cochran, 1967, p. 549)

A sample size of 300 points was selected for use in the three test sites considered in this study, as indicated by the above review.

The problem of an acceptable number of samples to evaluate the accuracy of individual categories still remains. Van Genderen and Lock (1977) specify a minimum of 20 points, and Hay (1979) suggests 50 points, although the recent USGS research recommends much higher numbers of samples (Rosenfield et al., 1979). For an 85% accuracy level and a 10% allowable error at the 95% confidence level, this work suggests that 45 samples are required for each category, 34 of which must be correct (Fitzpatrick-Lins, 1981). With lower estimated accuracies or a 5% allowable error the sample number rises precipitously. For example, an 80% accuracy level with a 5% error at the 95% confidence level requires 193 sample points (Rosenfield, 1979). Numbers of this magnitude quickly become impractical. For the purposes of this study, recognizing the inherent limitations and following Van Genderen and Lock (1977) and Fitzpatrick-Lins (1981) when working with a manual sample, individual category accuracies are considered for categories with 20 or more points. This evaluation is not, however, included in the main body of the text because of uncertainty associated with the minimum number of samples required for individual category accuracy.
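Equations (1) through (4) are easy to reproduce directly. A minimal sketch follows; the truncation of the result to a whole number is an assumption made to match the worked figures in the text.

```python
import math

def sample_size(p, e, z=2.0):
    # n = Z^2 * p * q / E^2, with p the expected accuracy (%),
    # q = 100 - p, and E the allowable error (%).
    # Truncation to an integer follows the text's worked examples.
    q = 100.0 - p
    return math.floor(z * z * p * q / (e * e))

print(sample_size(75, 5))           # equation (2): 300
print(sample_size(75, 6))           # equation (3): 208
print(sample_size(75, 5, z=1.645))  # equation (4): 202
```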
Some conclusions concerning individual category accuracy are presented in Appendix C2 and should be treated with appropriate caution.

Despite the work that has been done on establishing a methodology for accuracy evaluation, a number of outstanding issues still remain. None of the schemes considered above make any reference to variations that are necessary to accommodate:

i) the level of detail in the categories mapped, i.e., Level I, Level II, or Level III categories,

ii) the size of the study area under consideration, i.e., whether it is a small county-size area or a large regional-size area,

iii) the scale of the base map on which the interpreted data are displayed, and

iv) the spatial resolution (or minimum type size) of the original data.

Clearly, this topic requires more research to establish a greater level of understanding as to how these factors influence the methodologies developed to evaluate map accuracy.

3. Confidence Level. A sampling methodology used to estimate accuracy, by definition, implies the possibility of some error in the final percentages. The range within which this error will fall is determined through the establishment of confidence limits, and this is dealt with in the main body of the text.

APPENDIX C2

INDIVIDUAL CATEGORY ACCURACY ANALYSIS

Implicit in the discussion of the overall accuracy performance of the LANDSAT classification has been the fact that some individual categories are classified more accurately than others. The classification error matrices allow a comparison of row totals, which represent the ground verification information, with column totals, representing the LANDSAT classification, and an estimate of over- or under-estimation is possible. More important is the distribution of frequencies within the rows and columns, which indicates how many samples for a single classification were correctly classified and to which other categories misclassified pixels were assigned.
It is also important to look at percentage accuracy for individual categories. This can be achieved in two ways. One method takes the number of correctly classified pixels as a percentage of the total number of samples in that category, as derived from the LANDSAT classification, i.e., the column total in the classification error matrix. From this percentage it is possible to evaluate the commission error which, in effect, tells an analyst how well the classified category represents the real world: whether the mapped category contains most of the area in that category on the ground. This estimate of category accuracy is the one that is most usually considered in accuracy analysis of land use maps (Fitzpatrick-Lins, 1980). A second and complementary way of calculating accuracy is to take the number of correctly classified pixels as a percentage of the total number of samples in that category, as indicated by the ground-verification information, i.e., the row total in the classification error matrix. From this percentage it is possible to evaluate the omission error, which indicates how well the real world is represented by the map: that is, to what extent the mapped category omits ground-verified information.

Consideration of individual-category accuracy through the means of "percent correct" in both commission and omission allows for a standardization of the slightly variable frequencies in the classification error matrices for each test site. In this way, classification comparisons are possible for categories that have sufficient samples. Deciding which categories have sufficient samples for a determination of individual accuracy, however, is subject to varying interpretations in the literature (see Appendix C1). Rosenfield (1982) notes correctly that the sampling strategy proposed by Berry and Baker (1968) is area-weighted and, as such, under-represents those categories with small areal extent.
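The two percentages can be computed directly from a classification error matrix. A minimal sketch, assuming a square matrix with ground verification in rows and the LANDSAT classification in columns (the small example matrix is invented for illustration):

```python
def category_accuracies(matrix):
    # Commission accuracy: correct pixels as a % of the column total
    # (how well the mapped category represents the ground).
    # Omission accuracy: correct pixels as a % of the row total
    # (how much of the ground category the map captures).
    k = len(matrix)
    col_totals = [sum(matrix[r][c] for r in range(k)) for c in range(k)]
    row_totals = [sum(row) for row in matrix]
    commission = [100.0 * matrix[i][i] / col_totals[i] for i in range(k)]
    omission = [100.0 * matrix[i][i] / row_totals[i] for i in range(k)]
    return commission, omission

# Two-category illustration: 80 of the 90 pixels mapped as the first
# category are correct (commission), while 80 of the 100 ground-truth
# pixels of that category were captured (omission).
comm, omis = category_accuracies([[80, 20],
                                  [10, 90]])
```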
The GIRAS computer software system currently used by USGS for accuracy evaluation uses a category-weighted procedure but, prior to its introduction, USGS used the Berry strategy and considered individual category accuracy for categories with twenty or more sample points. Individual-category accuracies for those categories with more than twenty samples were calculated for the study area test sites and the results are presented in the following section.

i) Agricultural Site. Much of the inter-category variability in the Agricultural site was already referred to in demonstrating how the overall accuracies differed between algorithms. The accuracy figures for individual categories (Table C2.1) show that the grouping of cluster classes, which is the superior overall classification, does not

Table C2.1. Individual Category Accuracies (%): Agricultural Test Site
(Ground Verification Source: Aerial Photography)

                           Maximum      Minimum Distance-   Grouping of
Categories                 Likelihood   to-Means            Cluster Classes

RESIDENTIAL
  Commission Accuracy          29           25                  50
  Omission Accuracy            85           75                  45
CROPS & PASTURE
  Commission Accuracy          63           73                  66
  Omission Accuracy            63           18                  67
RANGE
  Commission Accuracy          13*          15                  30*
  Omission Accuracy            11*          24*                 15*
BROADLEAF FOREST
  Commission Accuracy          72           51                  64
  Omission Accuracy            38           78                  70
WATER
  Commission Accuracy         100*         100*                 94*
  Omission Accuracy            88*          85*                 94*

*Less than 20 samples.

have the best commission accuracy performance for the important categories of Crops and Pasture, and Broadleaf Forest. The omission accuracy, however, is consistently high for these categories; so it is the combination of good accuracy in both omission and commission that indicates a truly accurate performance, rather than strictly a high commission accuracy, which can mask underestimation of a particular category.
Overestimation, which was the case for Residential in both the maximum likelihood and minimum distance-to-means cases, is shown by a poor commission accuracy and a high omission value. Most of the Residential category on the ground was accounted for, giving high omission accuracy, but substantial areas of non-Residential land use were also included, making commission accuracy very poor.

ii) Urban Site. The relative success of the maximum likelihood classification in the Urban test site is clearly evident in the accuracy percentages for the individual categories (Table C2.2). With one minor exception, the maximum likelihood algorithm has the best performance in both commission and omission accuracy. As indicated in the previous general discussion, however, the differences between the performance of the three algorithms are not substantial. Unlike the Agricultural site, the grouping of cluster classes does not facilitate a separation of the Residential, Crops and Pasture, and Broadleaf Forest mixture. The low commission accuracy for Crops and Pasture, combined with the low omission accuracy for Broadleaf Forest, indicates the source of much of the classification error within this test site.

Table C2.2. Individual Category Accuracies (%): Urban Test Site
(Ground Verification Source: Aerial Photography)

                           Maximum      Minimum Distance-   Grouping of
Categories                 Likelihood   to-Means            Cluster Classes

RESIDENTIAL
  Commission Accuracy          80           74                  77
  Omission Accuracy            85           83                  77
COMMERCIAL/INDUSTRIAL
  Commission Accuracy          93           80                  89
  Omission Accuracy            86           72                  75
CROPS & PASTURE
  Commission Accuracy          51           45                  42
  Omission Accuracy            71           57                  53
BROADLEAF FOREST
  Commission Accuracy          85           83                  62
  Omission Accuracy            49           58                  55
WATER
  Commission Accuracy         100*         100*                100*
  Omission Accuracy           100*         100*                100*

*Less than 20 samples.

iii) Forest Site. The minimum distance-to-means classification is only marginally superior overall in the Forest site, but the performance of the major individual categories suggests minimum distance-to-means to be the best algorithm (Table C2.3). Coniferous Forest is identified well, with high commission and omission accuracies. Broadleaf Forest is also classified best by the minimum distance-to-means algorithm, particularly in terms of commission accuracy. Maximum likelihood classification only provides adequate results in the Broadleaf Forest and Water categories. Relatively poor omission accuracy in both Forested Wetlands and Coniferous Forest, and a similar performance in commission and omission for Crops and Pasture, indicate errors that make this classification of very dubious utility, despite its overall performance close to the other algorithms.

Table C2.3. Individual Category Accuracies (%): Forest Test Site
(Ground Verification Source: Aerial Photography)

                           Maximum      Minimum Distance-   Grouping of
Categories                 Likelihood   to-Means            Cluster Classes

CROPS & PASTURE
  Commission Accuracy          43*          63*                 51
  Omission Accuracy            44*          42*                 70
RANGE
  Commission Accuracy          50           47*                 46*
  Omission Accuracy            63           63*                 48*
BROADLEAF FOREST
  Commission Accuracy          78           87                  85
  Omission Accuracy            82           72                  66
CONIFEROUS FOREST
  Commission Accuracy          62           82                  70
  Omission Accuracy            59           82                  85
WATER
  Commission Accuracy          93          100                 100
  Omission Accuracy            86           82                  88
FORESTED WETLAND
  Commission Accuracy          41*          40                  43
  Omission Accuracy            28*          74                  70

*Less than 20 samples.
APPENDIX D

COMPARING CLASSIFICATION PERFORMANCE

An essential objective of the study was to evaluate the differences in mapping accuracy between the maximum likelihood and minimum distance classification algorithms. A number of approaches to this problem have been suggested in the literature, although comparative analysis of classification performance is not common. Rosenfield and Melley (1980) present four techniques derived from standard statistical tests that can be used to compare land-use classifications:

1. T-test for comparison of pairs (Snedecor and Cochran, 1967);
2. Wilcoxon Signed Rank Test for paired samples (Sokal and Rohlf, 1969);
3. Sign test for paired samples (Snedecor and Cochran, 1967);
4. Two-way analysis of variance without replication (Sokal and Rohlf, 1969).

Each of these tests has different assumptions, and some are more appropriate than others in particular circumstances. The two-way analysis of variance test is the most powerful and flexible of the four and, along with the T-test, is a parametric test which assumes normality in the data set. If the data set is not normal, as in the two-way analysis of variance example used by Rosenfield and Melley (1980), it has to be made so before the test can proceed. In this instance, an arcsine transformation is used. The Wilcoxon Signed Rank and Sign tests are non-parametric and can therefore be used with non-normal data, although non-normality is not a prerequisite for their use. The important conclusion of these tests is that while there is some difference in relative efficiency between tests, all four of the methods considered will lead to equivalent results.

The most recent work on comparing land use classifications has been in the use of discrete multivariate analysis techniques (Congalton, 1981; Congalton et al., 1981).
These techniques are considered to be appropriate because classification data are discrete, rather than continuous, i.e., a pixel either falls into a particular category or it does not. The statistical techniques in current use that were described by Rosenfield and Melley (1980) assume continuous data. Using algorithms documented in Bishop et al. (1975), two procedures have been adopted for use in accuracy assessment. Initially, the classification error matrix is normalized. Normalization is an iterative procedure by which rows and columns of the matrix are successively balanced until each row and column adds up to a given marginal value, usually one. This accomplishes two objectives:

i) it eliminates incompatibility caused by differences in sample sizes between matrices, and

ii) it represents both omission and commission errors present in the matrix by allowing each cell value to be influenced by all other cell values in its row and column.

Overall accuracy, calculated by summing the diagonal of the normalized matrix and dividing by the number of categories, is thus considered to be more representative of true classification performance than other methods (Congalton et al., 1981). It is usually lower than the standard value obtained by dividing correct classifications by the total sampled.

The second procedure used is the development of a statistic that can be used as an actual measure of agreement between two matrices. This measure is based on the difference between the actual agreement of the classification (i.e., the agreement between the classification and the ground truth indicated along the major diagonal of the classification error matrix) and the chance agreement, which is indicated by the row and column totals. The test is based on a maximum likelihood estimate of the multinomial distribution.
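The normalization step described above is an iterative proportional balancing of the matrix. A minimal sketch for a matrix with no zero cells (zero cells require special handling and are not treated here; the fixed iteration count is an assumption, since Bishop et al. iterate to convergence):

```python
def normalize_matrix(matrix, iterations=200):
    # Successively rescale rows and then columns until every row and
    # column of the classification error matrix sums to one.
    m = [[float(v) for v in row] for row in matrix]
    k = len(m)
    for _ in range(iterations):
        for r in range(k):                        # balance rows
            s = sum(m[r])
            m[r] = [v / s for v in m[r]]
        for c in range(k):                        # balance columns
            s = sum(m[r][c] for r in range(k))
            for r in range(k):
                m[r][c] /= s
    return m

def normalized_accuracy(matrix):
    # Sum of the diagonal of the normalized matrix divided by the
    # number of categories (after Congalton et al., 1981).
    m = normalize_matrix(matrix)
    return sum(m[i][i] for i in range(len(m))) / len(m)
```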
The resulting K statistic for each matrix, with its confidence limits, represents the accuracy of a classification and can be compared to the K statistic for another matrix. If the confidence intervals for two matrices overlap, they are not significantly different from each other. The statistical procedures were tested with multiple sets of data and apparently yield good estimates of classification accuracy and comparison between matrices. The procedures, however, depend on the implementation of fairly complex statistical procedures and, while Fortran programs are included in the work discussing the methods (Congalton, 1981), there is no comparative reference made to other more well known procedures.

In this study the Wilcoxon Signed Rank Test was selected to test the agreement between the maximum likelihood classification, minimum distance-to-means classification, and grouping of cluster classes algorithms. The test is simple to execute and evaluations of its use in the literature indicate that it provides results equivalent to those of the statistically more complex tests.

The procedure for carrying out the Wilcoxon test on the classification data followed that set out in Sokal and Rohlf (1969). The categories are listed with the number correct in each category for each classification constituting a set of paired data. Differences between each of the pairs are computed. These differences are then ranked from smallest to largest without regard for sign. Ranks from pairs with a positive difference and those with a negative difference are then summed separately. The test value is the smaller of these two sums. In order to determine the significance of the computed T value, it is necessary to refer to critical values of the Wilcoxon rank sum (Sokal and Rohlf, 1969). The computed value for T must be equal to or less than the tabular value for the given number of pairs for the hypothesis to be rejected.
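The ranking procedure just set out can be written down directly. A minimal sketch (tied absolute differences receive the average of their ranks, and zero differences are dropped, following the usual Sokal and Rohlf convention; the example data are invented):

```python
def wilcoxon_T(x, y):
    # Paired differences; zero differences are dropped.
    d = [a - b for a, b in zip(x, y) if a != b]
    # Rank the absolute differences smallest to largest, assigning
    # the average rank to ties.
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j + 2) / 2.0  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    # Sum the ranks of positive and negative differences separately;
    # the test value T is the smaller of the two sums.
    t_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    t_minus = sum(r for r, di in zip(ranks, d) if di < 0)
    return min(t_plus, t_minus)

# Hypothetical "number correct per category" for two classifications.
T = wilcoxon_T([10, 12, 9, 14, 8], [8, 11, 12, 13, 6])
```

The computed T would then be compared against the tabulated critical value for the number of non-zero pairs, as described in the text.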
If the computed value is higher than the table value the hypothesis is accepted.

APPENDIX E

CLASSIFICATION ERROR MATRICES FOR MAP AND GEOCODED ACCURACY PROCEDURES

[The classification error matrices in this appendix (ground verification categories in rows, LANDSAT classification in columns) did not reproduce legibly in this copy. The table captions and overall results are retained below.]

Table E1. Classification Error Matrix for the Agricultural Test Site: Maximum Likelihood Classification (Ground Verification Source: Delineated Land-Use Map). Correct total: 177 of 305 samples; 58.0% correct.

Table E2. Classification Error Matrix for the Agricultural Test Site: Minimum Distance-to-Means Classification (Ground Verification Source: Delineated Land-Use Map). Correct total: 139 of 308 samples; 45.1% correct.

Table E3. Classification Error Matrix for the Agricultural Test Site: Grouping of Cluster Classes (Ground Verification Source: Delineated Land-Use Map). Correct total: 210 of 314 samples; 66.9% correct.

Table E4. Classification Error Matrix for the Urban Test Site: Maximum Likelihood Classification (Ground Verification Source: Delineated Land-Use Map). Correct total: 200 of 300 samples; 66.7% correct.

Table E5. Classification Error Matrix for the Urban Test Site: Minimum Distance-to-Means Classification (Ground Verification Source: Delineated Land-Use Map). Correct total: 199 of 311 samples; 64.0% correct.

Table E6. Classification Error Matrix for the Urban Test Site: Grouping of Cluster Classes (Ground Verification Source: Delineated Land-Use Map). [Matrix and totals illegible in this copy.]