This is to certify that the dissertation entitled

A Framework for Texture Analysis Based on Spatial Filtering

presented by

James Michael Coggins

has been accepted towards fulfillment of the requirements for Ph.D. degree in Computer Science

MSU is an Affirmative Action/Equal Opportunity Institution

A FRAMEWORK FOR TEXTURE ANALYSIS BASED ON SPATIAL FILTERING

By

James Michael Coggins

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Computer Science

1982

ABSTRACT

A texture analysis method motivated by a theory of human visual information processing and based on spatial filtering is defined and evaluated for classification and segmentation of textured images. The problem of texture analysis is viewed as an attempt to duplicate the texture analysis performance of human vision. This performance is considered to be a consequence of certain information reductions (filtering) performed in early stages of vision which are modelled by a sequence of filters defined in the spatial frequency domain. The implementation of the model results in a sequence of spatial domain filtered images which contain limited spectral information from the original image. Features which are interpreted as measurements of average local energy are defined and evaluated in texture classification experiments.
The energy features are found to outperform power spectral features; this is attributed to the use of phase information in the filtered images. This contrasts with previous studies in which phase was assumed to be unimportant for texture analysis. The channel filtering features are found to be insensitive to global, constant gray level changes, making some preprocessing operations unnecessary. Procedures are demonstrated for determining that two images portray the same texture at different magnifications or orientations.

A method for computing a texture feature vector for a neighborhood about each pixel in the image is defined and demonstrated in texture segmentation experiments. An image is segmented by identifying clusters in the feature space and labelling each pixel by the cluster in which its feature vector lies. A cluster validity statistic is used to determine an appropriate number of clusters. The results indicate that the channel filtering feature space is an appropriate representation of image texture for segmentation.

TABLE OF CONTENTS

List of Tables
List of Figures
Acknowledgements

1 Introduction
  1.1 Computational Vision
  1.2 Texture
  1.3 Guidance from Studies of Human Vision
  1.4 Spatial Filtering
  1.5 Organization of This Dissertation

2 Toward a Definition of "Texture"
  2.1 Introduction
  2.2 Studies of Human Texture Perception
  2.3 Properties of Texture and of Texture Perception
  2.4 Comments on the Catalog of Texture Definitions
  2.5 Summary

3 Models and Methods in Texture Analysis
  3.1 Introduction
  3.2 Local Analysis Methods
  3.3 Global Analysis Methods
  3.4 Intermediate Analysis Methods
  3.5 Image Modelling Methods
  3.6 Summary

4 A Texture Analysis Method Based on a Theory of Human Vision
  4.1 Introduction
  4.2 A Critical Re-evaluation of Texture Analysis Methodology
  4.3 Spatial Frequency Channels
  4.4 Constraints on Possible Texture Features
  4.5 Properties of Channel-Filtered Images
  4.6 Definition of Texture Features
  4.7 How Will the Texture Features Be Used?
  4.8 Summary

5 Evaluation of Channel Filtering Features for Texture Classification
  5.1 Introduction
  5.2 Evaluations of the Features on Natural Images
    5.2.1 Experiment 1: Natural Images, 64x64 Subimages
    5.2.2 Experiment 2: Natural Images, 32x32 Subimages
    5.2.3 Experiment 3: Natural Images, 16x16 Subimages
    5.2.4 Summary
  5.3 The Effect of Histogram Equalization
    5.3.1 Experiment 4: Histogram Equalized Images, 64x64 Subimages
    5.3.2 Experiment 5: Histogram Equalized Images, 32x32 Subimages
    5.3.3 Summary
  5.4 Experiment 6: Uniform Gray Level Changes
  5.5 Experiment 7: Magnification Changes
  5.6 Experiment 8: Orientation Changes
  5.7 Experiment 9: Phase Spectrum Changes
  5.8 Second-Order Statistics and Channel Filtering
    5.8.1 Experiment 10: Application of the Co-occurrence Method
    5.8.2 Experiment 11: Application of Channel Filtering
    5.8.3 Summary
  5.9 Computational Simplifications
    5.9.1 Experiment 12: Using Fewer Channels
    5.9.2 Experiment 13: Ideal Bandpass Channels
  5.10 Summary

6 Evaluation of Channel Filtering Features for Texture Segmentation
  6.1 Segmentation
  6.2 Computing Texture Features for Segmentation
  6.3 Segmentation Using Feature Images
  6.4 Evaluating the Segmentations
  6.5 Segmentation Experiment 1: Dot Textures
  6.6 Segmentation Experiment 2: Gaussian White Noise
  6.7 Segmentation Experiment 3: Natural Image Composite
  6.8 Segmentation Experiment 4: SCRE Image
  6.9 Summary

7 Summary and Conclusions
  7.1 Summary
  7.2 Conclusions
    7.2.1 Classification
    7.2.2 Segmentation
    7.2.3 General
  7.3 Advantages of the Channel Filtering Method
  7.4 Disadvantages of the Channel Filtering Method
  7.5 Suggestions for Further Research

Appendix A: A Catalog of Texture Definitions
Appendix B: Definition of Channel Filters
List of References

List of Tables

Table 1: Classification of Eight Natural Image Classes Using 64x64 Subimages
Table 2: Classification of Eight Natural Image Classes Using 32x32 Subimages
Table 3: Classification of Four Natural Image Classes Using 16x16 Subimages
Table 4: Classification of Eight Histogram-Equalized Natural Image Classes Using 64x64 Subimages
Table 5: Classification of TILE, ROCK, SAND, PAPE and TILQ, ROCQ, SANQ, PAPQ Using 64x64 Subimages
Table 6: Classification of CORK, GRAS, WOOD, SCRE and CORQ, GRAQ, WOOQ, SCRQ Using 64x64 Subimages
Table 7: Classification of Eight Histogram-Equalized Image Classes Using 32x32 Subimages
Table 8: Classification of TILE, ROCK, SAND, PAPE and TILQ, ROCQ, SANQ, PAPQ Using 32x32 Subimages
Table 9: Classification of CORK, GRAS, WOOD, SCRE and CORQ, GRAQ, WOOQ, SCRQ Using 32x32 Subimages
Table 10: Classification Results Showing the Effect of Average Gray Level Changes
Table 11: Size Change Experiment Using 6 Spatial Frequency Channels
Table 12: Size Change Experiment Using 5 Spatial Frequency Channels
Table 13: Orientation Experiment Using No Channel Shifting
Table 14: Orientation Experiment Using Circular Channel Shifting
Table 15: Classification Results on Four Phase-Modified Gaussian White Noise Classes
Table 16: Classification Results Illustrating the Effect of Phase Spectrum Changes on Gaussian White Noise Images
Table 17: Phase Modification Experiment with Indiscriminable Classes Merged
Table 18: Black-Black Displacement Vectors in Four-Dot Micropatterns
Table 19: Computation of Expected Number of Black-Black Co-occurrences in 256x256 Four-Dot Images
Table 20: Classification of Four-dot Textures Using Co-occurrence Matrices
Table 21: Classification of Four-dot Textures by Channel Filtering
Table 22: Classification of Four-dot Textures Using Six Spatial Frequency Channels
Table 23: Classification of Four-dot Textures Using Four Orientation Channels
Table 24: Classification Results on Natural Images Using Eight Channels
Table 25: Classification Results on Natural Images Using Ideal Bandpass Channels
Table 26: Evaluation of Clustering on Dots Image Using 8x8 Averaging Windows
Table 27: Evaluation of Clustering on Dots Image Using 16x16 Averaging Windows
Table 28: Evaluation of Clustering on Gaussian White Noise Image Using 8x8 Averaging Windows
Table 29: Evaluation of Clustering on Gaussian White Noise Image Using 16x16 Averaging Windows
Table 30: Evaluation of Clustering on Composite Natural Image Using 8x8 Averaging Windows
Table 31: Evaluation of Clustering on Composite Natural Image Using 16x16 Averaging Windows
Table 32: Evaluation of Clustering on SCRE Image Using 8x8 Averaging Windows
Table 33: Evaluation of Clustering on SCRE Image Using 16x16 Averaging Windows

LIST OF FIGURES

Figure 1: Sinusoidal Gratings
Figure 2: Diagram of the Spatial Filtering Procedure
Figure 3: A Spatial Frequency Channel Filter
Figure 4: An Orientation Channel Filter
Figure 5: An Example of Channel Filtering
Figure 6: Indiscriminable Textures which Could Be Discriminated Based on the Edge Structure of the Micropatterns
Figure 7: Texture Feature Definitions
Figure 8: Natural Image Classes
Figure 9: Mean Values of Feature 1 in Each Channel for the Eight Natural Image Classes
Figure 10: Histogram-Equalized Natural Images
Figure 11: Gaussian White Noise Images
Figure 12: Bar Images for Orientation Experiment
Figure 13: Phase Spectrum Modification Procedure
Figure 14: Effect of Phase Modification on GWN1
Figure 15: Four-Dot Micropatterns and Corresponding Image Classes
Figure 16: Diagram of the Segmentation Procedure
Figure 17: Dots Image for Segmentation Experiment 1
Figure 18: Segmented Images for Dots Using 8x8 Averaging Windows
Figure 19: Segmented Images for Dots Using 16x16 Averaging Windows
Figure 20: Gaussian White Noise Image for Segmentation Experiment 2
Figure 21: Segmented Images for Gaussian White Noise Using 8x8 Averaging Windows
Figure 22: Segmented Images for Gaussian White Noise Using 16x16 Averaging Windows
Figure 23: Natural Image Composite for Segmentation Experiment 3
Figure 24: Segmented Images for Natural Image Composite Using 8x8 Averaging Windows
Figure 25: Segmented Images for Natural Image Composite Using 16x16 Averaging Windows
Figure 26: SCRE Image for Segmentation Experiment 4
Figure 27: Segmented Images for SCRE Using 8x8 Averaging Windows
Figure 28: Segmented Images for SCRE Using 16x16 Averaging Windows

ACKNOWLEDGEMENTS

I would like to thank Dr. Anil K. Jain for his guidance, patience, and persistence in his role as advisor, critic, editor, and taskmaster. His careful editing has contributed greatly to the quality of this document and his critical scrutiny of my work has kept me out of a lot of trouble.

I am also grateful to Major Arthur Ginsburg, whose filtering model of human vision forms the foundation on which my work is built. In addition to teaching me about the spatial filtering model, psychophysics, and scientific writing, Maj. Ginsburg also served on my doctoral committee.

I would also like to thank the other members of the committee, Dr. Richard Dubes, Dr. Carl Page, and Dr. James Zacks for their contributions throughout my research.

Several persons have provided software tools without which some aspects of my work would have been nearly impossible. These include Larry Johnson, George Cross, Ken Dimoff, Steve Smith, Neal Wyse, and Gautam Biswas.

I would also like to thank Dr. Harry Hedges for his support, and Ardeshir Goshtasby, John Piippo, Steve Smith, and Steve Fullenkamp for their various contributions.

Chapter 1

Introduction

1.1 Computational Vision

Computer applications involving visual input include problems in aerial image interpretation [Darling and Joseph,
1968; Kettig and Landgrebe, 1976; Landgrebe, 1981], biomedical image analysis [Bacus, 1976; Pressman, 1976; Hall et al, 1977; Landeweerd and Gelsema, 1977; Mui et al, 1977; Jain et al, 1980; Trussel, 1981], industrial processes [Perkins, 1978; Tennenbaum et al, 1978; Agin, 1980], and robotics and scene analysis [Duda and Hart, 1973; Marr and Nishihara, 1978; Barrow and Tennenbaum, 1981; Stevens, 1980; Aggarwal et al, 1981]. Hardware for acquiring, storing, and displaying images is available, and current computers are capable of performing some image analysis tasks in real-time. The development of algorithms for processing images has proven to be a slow and difficult task [Gurari and Wechsler, 1982].

Image analysis methods have been developed in the fields of pattern recognition [Duda and Hart, 1973; Fu, 1974, 1977; Pavlidis, 1977], image processing [Pratt, 1978; Rosenfeld and Kak, 1981] and artificial intelligence [Winston, 1977; Nilsson, 1980]. The inquiries in these three fields involving approaches to automatic image analysis, interpretation, and understanding are sometimes referred to collectively as "computational vision" [Brady, 1982; Barrow and Tennenbaum, 1981].

In order to cope with the variety and complexity of activities associated with vision, several subproblems within computational vision have been identified including image enhancement, edge detection, segmentation, image registration, and texture analysis. These subproblems have been attacked separately and have tended to evolve into subfields themselves. This has the advantage that many different techniques with different characteristics are available as tools for use in applications. The separation of the subproblems has the disadvantage that similarities in the solutions to different problems may be overlooked and that diverse solutions to the subproblems of vision complicate the construction of unified computational vision systems.
1.2 Texture

Human observers are capable of some image segmentation and discrimination tasks under conditions (such as brief exposure to a test image) which prevent detailed scrutiny of the image. This ability is referred to as "effortless" or "pre-attentive" visual discrimination. When an image does not portray any particular object or form, only certain aspects of the overall pattern of gray level changes in the image are effortlessly perceived. "Texture" sometimes refers to the pattern of gray level variations produced by some image generation procedure. However, different procedures may yield images which are not effortlessly discriminable to human observers. In this thesis, two images which do not portray particular objects or forms will be considered to have the same "texture" if they are not effortlessly discriminable to human observers.

Texture is recognized as being fundamental to the perception of regions and surfaces in images [Brady, 1982; Stevens, 1980]. Textural information can potentially be used by automatic vision systems in region identification, image segmentation and classification tasks. Texture analysis is a major component in discussions of general computer vision systems [Sklansky, 1978; Wechsler, 1980; Barrow and Tennenbaum, 1981; Brady, 1982].

The potential importance of texture for automatic vision systems has inspired many attempts to develop algorithms for texture analysis. Unfortunately, the characterization of texture in terms of human performance does not suggest a simple measurement on images which can duplicate human texture perception. This lack of precise guidance has led to a proliferation of ad hoc texture analysis methods based on diverse mathematical, statistical and heuristic measurements on images. Most texture analysis methods duplicate human performance reasonably well for some image classes, but they fail to duplicate human performance in more general problems.
1.3 Guidance from Studies of Human Vision

Since a texture analysis algorithm is supposed to duplicate the performance of human vision, it seems reasonable to look to vision science for guidance in developing such algorithms. Computational methods need not emulate the human visual system, but knowledge of the limitations and strategies present in human vision could guide the development of useful algorithms.

Some texture analysis algorithms have been guided by results of psychophysical experiments designed to find an upper bound on the complexity of texture perception. The nature of texture is described by specifying a statistical property which indiscriminable textured images appear to have in common. This type of description is intended to constrain the nature of texture by a criterion which is independent of human performance, but the generality or minimality of such a criterion is difficult to establish. Such results have guided many texture analysis studies, but bounds on the complexity of indistinguishable stimuli do not necessarily suggest a particular algorithm which will reproduce human performance.

While algorithms for computational vision need not emulate human vision at the neural level, knowledge of the overall strategies used by the visual system for analyzing images could provide more definite guidance for developing algorithms. We will use one such theory in this dissertation to motivate the development of a new approach to texture analysis. This theory characterizes early stages of the human visual system as being composed of quasi-independent mechanisms, called channels, which decompose an image into certain bands of spatial frequency and orientation [Ginsburg, 1971]. This decomposition is modelled by a spatial filtering operation using filters defined in the spatial frequency domain [Ginsburg, 1978, 1980b].
The result of the decomposition is a sequence of filtered images in the spatial domain which contain limited spectral information from the original image.

The channel filtering theory is not simply a generalization of human visual behavior; it is an attempt to describe the information content and possible processing strategies in early human vision. The theory implies that many aspects of visual perception are consequences of the information reduction (filtering) performed by the perceptual system [Ginsburg, 1978, 1980b]. Thus, in order to characterize texture perception, it may be most effective to determine first how the human visual system filters the information in an image. The particular decomposition of an image performed in early stages of vision can explain why certain stimuli are discriminable and others are confused.

1.4 Spatial Filtering

The channel filtering theory asserts that the early processing of an image by the human visual system is effectively modelled by a spatial filtering operation. Spatial filtering is a technique which has been useful in several areas of computational vision. Two equivalent implementations of spatial filtering exist; one involves convolution in the spatial domain, and the other involves multiplication in the spatial frequency domain. The use of spatial filtering as a computational tool for image analysis is attractive because of its well-known mathematical basis, because of the availability of efficient algorithms for performing the Fourier transform, and because of the existence of intuitively satisfying interpretations for the Fourier transform and for spatial filtering.

Some uses for spatial filtering in computational vision are two-dimensional extensions of well-known one-dimensional operations [Papoulis, 1962]. Low-pass filtering has been used to remove noise from an image [Rosenfeld and Kak, 1981].
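As a minimal sketch of the two equivalent implementations of spatial filtering, the following Python/NumPy fragment applies a low-pass filter both by multiplication in the spatial frequency domain and by direct convolution in the spatial domain, and the two results can then be compared. The Gaussian transfer function, its cutoff value, the random test image, and all function names are illustrative assumptions; they are not the channel filters used in this study. Because the discrete Fourier transform is used, the matching spatial-domain operation is circular convolution.

```python
import numpy as np

def filter_frequency_domain(image, transfer):
    """Multiply the image spectrum by a transfer function and invert."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))

def filter_spatial_domain(image, kernel):
    """Circular convolution of image with a kernel of the same shape."""
    out = np.zeros_like(image, dtype=float)
    rows, cols = image.shape
    for dy in range(rows):
        for dx in range(cols):
            # np.roll moves pixel (y - dy, x - dx) onto (y, x), with wraparound
            out += kernel[dy, dx] * np.roll(image, shift=(dy, dx), axis=(0, 1))
    return out

def gaussian_lowpass(shape, cutoff):
    """Hypothetical Gaussian low-pass transfer function (cutoff in cycles/pixel)."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return np.exp(-(fx**2 + fy**2) / (2.0 * cutoff**2))

rng = np.random.default_rng(0)
image = rng.random((16, 16))                      # stand-in for a textured image
transfer = gaussian_lowpass(image.shape, cutoff=0.1)
kernel = np.real(np.fft.ifft2(transfer))          # spatial impulse response

via_frequency = filter_frequency_domain(image, transfer)
via_spatial = filter_spatial_domain(image, kernel)
print(np.allclose(via_frequency, via_spatial))
```

The two outputs agree to within floating-point rounding. The direct loop is included only to make the equivalence explicit; for images of realistic size the frequency-domain route is far cheaper, which is one reason efficient Fourier transform algorithms make spatial filtering attractive.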
High-pass filtering has been applied to edge detection, and in fact, a certain filtering approach has been shown to be an "optimal" edge detection method [Shanmugam et al, 1979]. Band-pass filtering has been applied to form and edge detection [Ginsburg, 1976, 1978, 1979a, 1980a, 1980b; Crowley and Parker, 1978; Marr et al, 1979; Marr and Hildreth, 1980]. Spatial filtering was also used in an implementation of a theory of stereo vision which involved operations similar to those required in image registration [Marr and Poggio, 1979]. Several approaches to texture analysis, such as those based on edge detection or template matching, can be implemented using spatial filtering operations [Laws, 1980].

In this study, a computational vision system will be defined by specifying its filtering properties in the spatial frequency domain. The filter characteristics to be used are adopted from studies of human vision, but this study will not attempt to determine in detail the correspondence between human vision and the computational vision system. Methods for texture classification and segmentation based on the computational vision system will then be defined and evaluated.

1.5 Organization of This Dissertation

This dissertation will begin by reviewing the results of previous research in texture analysis. Chapter 2 reviews attempts to define or at least to delimit the nature of "texture". This review includes results from vision science and principles derived from human intuition and experience with texture perception. In Chapter 3, texture analysis algorithms proposed in the literature will be reviewed. Chapter 4 will present a new texture analysis method based on spatial filtering. Chapters 5 and 6 present the results of experiments which evaluate the proposed method for classifying and segmenting textured images. The conclusions of this study will be summarized and further research will be suggested in Chapter 7.
Chapter 2

Toward a Definition of "Texture"

2.1 Introduction

In order to use textural information, the computer vision system requires an operational definition of "texture". In the absence of a sufficiently precise general definition, the operational definition of texture is implicitly supplied in the features computed from an image. This feature space then constitutes a model of "texture".

How can an operational definition for texture be constructed? A reasonable first step is to observe the texture analysis performance of the human visual system [Haralick et al, 1973; Barrow and Tennenbaum, 1981]. Texture is a common and important aspect of human visual perception. Everyday experiences with texture analysis and recognition tasks provide an extensive background of intuition regarding the nature of "texture". One common manifestation of this intuition is in the adjectives we use to describe textures. For computational texture analysis, however, the nature of texture must be precisely quantified. Texture analysis has sometimes been approached as the problem of quantifying our intuition and our vocabulary about texture.

Some early attempts to construct automatic texture analysis procedures referenced certain results from vision science which appeared to limit the complexity of human texture perception. Textured areas which did not differ in certain simple statistical measurements were found to be indiscriminable to human observers. These psychophysical results provided a useful upper bound on the complexity of texture in human vision but little guidance concerning what approaches should be effective for computational purposes. Section 2.2 reviews these influential findings and recent extensions and modifications to them.

Several intuitive properties of texture and of texture perception were identified, and image analysis techniques were proposed which seemed to measure aspects of an image which were relevant to those properties.
These results provide some valid insights into the nature of texture, though they actually tell more about how textures are perceived than about what texture is or how to compute texture. A summary of these results is presented in Section 2.3.

The intuitive insights into the nature of texture have not produced a characterization from which simple computational procedures can be developed. Researchers attempting to present new texture analysis techniques have found it awkward and difficult to describe the properties of images which their techniques are supposed to measure. Section 2.4 will discuss the Catalog of Texture Definitions in Appendix A.

2.2 Studies of Human Texture Perception

Experiments by Julesz [Julesz, 1962, 1965, 1975; Julesz et al, 1973] have influenced the development of texture analysis methodology. In these experiments, human observers are presented an image composed of subimages which are generated by different rules. The subject's task is to find the different subimages. The composite image is displayed to the observer for very brief periods (100 ms) in order to prevent the observer from scrutinizing the image to find the different areas. Since the test images typically contained many repetitions of some micropattern and since the segmentation was to be performed without scrutinizing the image, the subimages were considered to have different "textures" and the experiments were interpreted as tests of pre-attentive human texture analysis ability. The tests would demonstrate the complexity of the processing performed in the visual system before detailed analysis involving cognition or memory could begin.

Notice that in such experiments, including [Julesz, 1962, 1965, 1975; Julesz et al, 1973; Pratt et al, 1978; Richards and Polit, 1974], "texture" is treated as an intrinsic property of an image determined by the image generation procedure. Pre-attentive human vision is then tested, and can succeed or fail in the texture discrimination task.
The results of these tests were examined to find some properties of the generated images which could predict whether the human visual system could preattentively discriminate different subimages.

One generalization of the results of these preattentive discrimination experiments became known as the "Julesz Conjecture". This conjecture asserted that image areas which do not differ in their second-order gray level joint probability distributions cannot be preattentively discriminated by human observers. The conjecture was seen to imply that computational methods no more complex than the computation of second-order gray level distributions should be sufficient to rival human performance in texture discrimination tasks. Texture analysis researchers used this conjecture as a justification for limiting the complexity of their techniques to fairly simple computations.

Unfortunately, the Julesz Conjecture was wrong. Later studies [Caelli and Julesz, 1978a, 1978b; Julesz et al, 1978; Julesz, 1981; Gagalowicz, 1981] have produced many examples of images which have identical second- and even higher-order gray level joint probability distributions but which are visually discriminable.

Revisions to the original Julesz Conjecture have been attempted. One recent attempt [Julesz, 1981] involves the assumption of special geometrical structures called "textons" which, allegedly, are detected in early stages of vision. The textons involve local geometrical structures and include "corner", "closure", and "connectivity". Detection of these special features is assumed to enable discrimination of textures which have identical second-order distributions.

Another recent paper [Gagalowicz, 1981] modifies the original conjecture by suggesting that the actual co-occurrence matrices (an estimate of the second-order gray level distribution; see Section 3.2) derived from the given image should be used to characterize textural properties rather than the underlying probability distribution.
The argument is that two discriminable texture fields can be produced from the same probability distribution if the generating process is not ergodic [Papoulis, 1965]. In such a case, the random differences between the images due to the non-ergodicity of the generating procedure can be a textural discriminating factor.

In spite of the problems which have been discovered with texture characterization based on second-order distributions, methods based on the Julesz Conjecture are still used in texture analysis applications. One study [Pratt et al, 1978] attempts to characterize the cases when the conjecture fails and when the conjecture works. They conclude that the Julesz Conjecture is a reasonable approximation to human performance in many texture discrimination problems.

2.3 Properties of Texture and of Texture Perception

Introspective observation of human texture perception has provided some insight into the nature of texture. These observations do not constitute a definition of texture; instead, they serve to guide and constrain the development of computational texture analysis methods. The use of constraints in the absence of precise definitions is a basic approach in artificial intelligence (AI) research. As computational vision problems (such as texture analysis) have been found to resist attempts to define fundamental terms, the AI approach of finding and exploiting constraints has been adopted [Zucker, 1981]. The intuitive guidance for texture analysis methods can be summarized in four principles (cf. [Beck, 1980]).

1. Texture is a property of areas; the texture of a point is undefined. Thus, operations on a random sampling of pixels would not be an appropriate texture analysis procedure since texture does not exist in context-free intensities. Texture is a global property of a region, but the region under consideration can be a fairly small subimage of a larger scene.
Analysis of textures over small areas occurs when one attempts to find a boundary between two textured areas [Haralick, 1975; Thompson, 1977].

2. Texture involves the spatial distribution of gray levels throughout a region. This spatial property could be characterized by statistical features computed over regions or by maintaining position (or relative position) information throughout processing. A one-dimensional gray level histogram is not, by itself, an appropriate texture analysis tool because it captures no spatial information. Two-dimensional histograms, or co-occurrence matrices (Section 3.2), are more reasonable texture analysis tools because a spatial parameter is incorporated in the histogram computation. The minimum size of an image which adequately characterizes a texture depends on several factors including the apparent size of "objects" in the image, the resolution of the image acquisition system, and the size of the "operators" used in the texture analysis procedure. The perception of texture depends on the assumption of a frame of reference [Haralick, 1979] governing the sizes of gray level changes which are to be considered significant and the spatial scale which is to be operated upon. Since texture information can be found at many different scales, recent papers have advocated multi-level or hierarchical descriptions of regions for texture analysis [McCormick and Jayaramamurthy, 1975; Crowley and Parker, 1978; Ehrich and Foith, 1978; Tomita, 1981; Zucker and Kant, 1981]. Alternatively, the frame(s) of reference can be adopted from human performance [Ginsburg and Coggins, 1981].

3. Texture is perceived in image regions which contain many equally significant gray level changes. This implies that the gray levels in a "textured" region have low redundancy and high information content [Resnikoff, 1981].
A textured region can be created by inserting large numbers of "objects" such as edges, regions of constant gray level, or micropatterns [Hall et al, 1977]. Alternatively, texture can be created by removing all "enumerable objects" [Richards and Polit, 1974]. In either case, a "texture" is perceived when significant individual forms are not present.

4. Textural properties of regions are invariant through moderate changes in overall brightness, orientation, and size (as in magnification/shrinkage). While these changes are observable, an original texture and a modified version of the same texture are recognizable as being samples of the same texture. (Some interesting points about textural invariance are made in [Modestino et al, 1981; Ginsburg and Coggins, 1981].)

Another potential source of guidance is in the perceived qualities of texture. Textural qualities are typically expressed by adjectives such as coarse, streaked, sharp, irregular, fine, cellular, rippled, directional, etc. One texture study [Tamura et al, 1978] investigated rank correlations between human evaluations of natural textures and statistical features designed specifically to quantify textural adjectives. Six perceptual dimensions of texture were specified, as follows: coarseness (coarse vs. fine), contrast (high contrast vs. low contrast), directionality (directional vs. non-directional), line-likeness (line-like vs. blob-like), regularity (regular vs. irregular) and roughness (rough vs. smooth). In psychophysical experiments, subjects evaluated 16 natural textures from [Brodatz, 1966] along the six dimensions. Correlations among these six features indicated that the features were not independent. In fact, the features cluster into two groups [Ginsburg and Coggins, 1981].
One group (coarseness, contrast, and roughness) appears to be affected by the apparent size of the texture while the other group (directionality, line-likeness, and regularity) appears to be affected by directional dependencies in the image. The clustering of the subjective features suggests that more general or fundamental descriptions of textured images may exist. Statistical features designed to duplicate the human evaluations of image textures along the six intuitive dimensions yielded poor results.

2.4 Comments on the Catalog of Texture Definitions

Appendix A contains a collection of attempts by texture analysis researchers to define "texture". It should be noted that many papers dealing with methods for quantifying visual texture do not even attempt to define the concept of "texture". The selections in the appendix are typical of the definitions which do appear.

Usually, attempts to define "texture" are either constructed specifically for a particular texture analysis approach [Appendix A, items 1, 3, 4], or they are so general that they are of little practical value [item 2]. Some "definitions" simply characterize "texture" by certain aspects of the human perception of textures [items 4-6].

Texture can be characterized as a global property [items 5, 7], a local property [item 2], a random phenomenon [item 3], a non-random phenomenon [item 9], a property of a region which remains when all "objects" are removed [item 6], a property of a region when many "objects" are present [item 5], a property determined by structural arrangements of objects [items 1, 4, 5, 9], or a property determined by statistical distributions [items 3, 7]. Unfortunately, these attempts to define texture have not suggested simple computational texture measurements [item 10].

2.5 Summary

Texture is a commonly perceived quality of regions and surfaces, but no precise, general definition of texture exists.
Texture analysis research has been guided by some results concerning the complexity of texture analysis in human vision and by intuitive properties of texture. These results serve to constrain the nature of texture, but not to define texture or to suggest what computational approaches might be appropriate for texture analysis. An attempt to directly quantify the textural qualities perceived by human observers yielded poor results. Due to the lack of more precise guidance, awkward and even contradictory "operational definitions" of texture have appeared in the literature.

Chapter 3
Models and Methods in Texture Analysis

3.1 Introduction

The process of analysis or measurement of texture is inseparable from the process of creating an operational definition of texture. The features computed from an image and the decision procedures applied to the features constitute a working definition of "texture". Existing reviews of texture analysis methods [Hawkins, 1969; Rosenfeld and Troy, 1974; Sklansky, 1978; Haralick, 1979; Wechsler, 1980] concentrate on the mathematical or computational techniques on which the methods are based. This review will organize the methods according to the nature of the operational definition of "texture" implied by the methods.

3.2 Local Analysis Methods

The unifying aspect of texture analysis methods reviewed in this section is a dependence on small groups of neighboring pixels. This local information may then be averaged or accumulated over a region for use in characterizing the texture of the region.

This approach is typical of many "statistical" texture analysis procedures. Among these are the gray level run length method [Galloway, 1975], the gray level difference method [Weszka et al, 1976] and the gray level co-occurrence method [Haralick et al, 1973]. All three methods involve counting the occurrences of a simple local property over the entire region.
In each case, the local property is easy to identify and each instance of the property involves only a few pixels. Since the local property depends on the gray levels of the pixels, accumulations are stored in a matrix whose size depends on the number of gray levels available in the quantization procedure. Statistics are then computed from the matrix for use as texture features.

In the gray level run length method, the local property is the number of linearly adjacent pixels with a specific gray level. The matrix Rd(i,j) gives the number of runs of length j of pixels with gray level i in direction d. Four such matrices can be computed, for d = 0, 45, 90, and 135 degrees.

In the gray level difference method, the local property is the absolute difference between the gray levels of pixels at a specified displacement d = (dx,dy) from each other. The matrix Gd(i) gives the number of times that a gray level difference of i occurs between pixels at displacement d from each other.

In the gray level co-occurrence method, the local property consists of the gray levels of two pixels with a given displacement d = (dx,dy). The matrix Hd(i,j) gives the number of occurrences of a pixel with gray level i at displacement d from a pixel with gray level j.

The displacement vectors for the difference and co-occurrence methods could be any vectors which can occur within the image. In practice, however, fairly small vectors (|d| < 10) are used. Since the co-occurrence matrix is sensitive to gray level changes over a certain distance, the use of small displacement vectors involves an assumption that texture exists in the gray level distributions in local areas of an image. In the co-occurrence method especially, the matrices for several displacement vectors can be added together to remove certain orientation dependencies. For example, Haralick uses a definition which involves the sum of the co-occurrence matrices for vectors (dx,dy) and (-dx,-dy) [Haralick, 1979].
This particular definition results in a symmetric co-occurrence matrix.

In comparative evaluation studies [Weszka et al, 1976; Conners and Harlow, 1980a] involving these texture analysis methods, the co-occurrence method was found to be superior, with the gray level difference method a close second. These comparative studies have influenced the frequent use of the co-occurrence method in applications.

Some of the problems associated with local analysis methods are as follows:

(1) The choice of the displacement vectors is critical.

(2) The visual significance of the features computed from the matrices is sometimes difficult if not impossible to understand. This difficulty has led to attempts to capture textural adjectives directly in statistical features [Tamura et al, 1978; Conners, 1979].

(3) The local analysis methods are sensitive to "noise" due to their dependence on actual gray level values. In addition, the gray scale resolution determines the size of the accumulation matrices independent of the image size, making comparison of textural properties of images digitized under different circumstances difficult.

(4) Local analysis methods are insensitive to global aspects of the image such as brightness gradients and directional tendencies.

Another type of local analysis method attempts to characterize the pattern of gray levels encountered in one-dimensional scans of the image. This scan can be analyzed as a one-dimensional function using time series techniques [McCormick and Jayaramamurthy, 1974; Deguchi and Morishita, 1978] or heuristic analysis of extrema in the function [Ehrich and Foith, 1978; Mitchell et al, 1977; Mitchell and Carlton, 1978]. In the time series analysis method, autoregression coefficients are the texture features. This method has been used for synthesis of certain types of streaked textures, but its applicability to less directional or less regular textures is questionable [Haralick, 1979].
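As a rough illustration of the counting procedures described in this section, the following Python sketch accumulates a gray level co-occurrence matrix for one displacement, forms the symmetric sum over (dx,dy) and (-dx,-dy), and derives a gray level difference histogram from the same counts. The function names, the use of NumPy, and the array convention image[y, x] are assumptions made for this example only; they are not taken from the studies cited above.

```python
import numpy as np


def cooccurrence(image, dx, dy, levels):
    """H[i, j] = number of pixel pairs with gray level j at (x, y)
    and gray level i at (x + dx, y + dy). Assumes image[y, x] holds
    integers in the range 0 .. levels - 1 (an illustrative convention)."""
    img = np.asarray(image)
    rows, cols = img.shape
    # Keep only positions whose displaced partner lies inside the image.
    y0 = max(0, -dy)
    y1 = max(y0, min(rows, rows - dy))
    x0 = max(0, -dx)
    x1 = max(x0, min(cols, cols - dx))
    base = img[y0:y1, x0:x1]
    shifted = img[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
    H = np.zeros((levels, levels), dtype=np.int64)
    # np.add.at accumulates correctly when index pairs repeat.
    np.add.at(H, (shifted.ravel(), base.ravel()), 1)
    return H


def symmetric_cooccurrence(image, dx, dy, levels):
    """Sum of the matrices for (dx, dy) and (-dx, -dy); always symmetric."""
    return (cooccurrence(image, dx, dy, levels)
            + cooccurrence(image, -dx, -dy, levels))


def difference_histogram(image, dx, dy, levels):
    """G[k] = number of pixel pairs at displacement (dx, dy) whose
    absolute gray level difference is k."""
    H = cooccurrence(image, dx, dy, levels)
    i, j = np.indices(H.shape)
    counts = np.bincount(np.abs(i - j).ravel(), weights=H.ravel(),
                         minlength=levels)
    return counts.astype(np.int64)


if __name__ == "__main__":
    sample = np.array([[0, 0, 1],
                       [1, 2, 2]])
    print(symmetric_cooccurrence(sample, 1, 0, levels=3))
    print(difference_histogram(sample, 1, 0, levels=3))
```

In this sketch the matrix size depends only on the number of gray levels, not on the image size, which is the property noted in problem (3) above.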
Heuristic analysis of the one-dimensional scan involves identifying gray level extrema and associated properties such as the height and width of the peaks and the distance to the next higher peak. The texture features computed from such methods include the average height (contrast) of peaks and the density of peaks of a particular height.

The density of edges in a local region can also be used as a texture feature [Rosenfeld and Thurston, 1971; Rosenfeld et al, 1972; Rosenfeld and Troy, 1974]. A simple local edge operator such as the gradient can be used to identify edge pixels. The gradient computation can be adjusted to be a function of distance to obtain a hierarchical characterization of edge densities in the image.

Edge detection can be expressed as a spatial filtering operation by defining a set of templates in the spatial domain and convolving them with the image. One such template is the Sobel operator [Pratt, 1978]. Laws [1980] uses several templates (including Sobel operators and gradients) to detect edges at different orientations and with different contrasts.

3.3 Global Analysis Methods

Global tendencies in an image such as the average sizes of objects or areas or directional preferences are difficult to capture using local analysis methods. Since such global information seems to be related to texture (in particular, to coarseness and directionality), methods which describe global properties of the image could be useful in texture analysis.

Global size and directional tendency information can be derived from the autocorrelation function [Hayes et al, 1974]. Let I(x,y) be the image function which gives the gray level at position (x,y) for 0 <= x,y <= N-1 and is 0 otherwise. The values of I(x,y) are integers with 0 <= I(x,y) <= G-1. The (normalized) autocorrelation function R(dx,dy) is the product of the image function with a shifted [by d = (dx,dy)] copy of itself.
That is,

$$R(dx,dy) = \frac{1}{R_0} \left[ \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} I(x,y)\, I(x+dx,\, y+dy) \right]$$

where $R_0$ is a normalizing factor. The maximum value of R(dx,dy) occurs at R(0,0). If the image contains large areas of constant gray level, the autocorrelation function will decrease slowly with distance from (0,0); if the image contains mostly small areas of constant gray level (the texture is "busy") then the autocorrelation function will drop off sharply. If the image is periodic, the autocorrelation function will rise and fall with the same period as the pattern in the image.

The value of R(dx,dy) is related to the co-occurrence matrix Hd(g1,g2), where d = (dx,dy), as follows:

$$R(dx,dy) = \frac{1}{R_0} \left[ \sum_{g_1=0}^{G-1} \sum_{g_2=0}^{G-1} g_1\, g_2\, H_d(g_1,g_2) \right]$$

Thus, the autocorrelation function can be derived from the entire ensemble of $N^2$ co-occurrence matrices of the image, but the converse is not true. For a given displacement, the co-occurrence matrix provides a more detailed characterization of the spatial distribution of gray levels than the autocorrelation at the same displacement.

The Fourier transform has been used more frequently than the autocorrelation function as a texture analysis tool [Lendaris and Stanley, 1970; Bajcsy, 1973; Bajcsy and Lieberman, 1976; Weszka et al, 1976; Conners and Harlow, 1980a; D'Astous and Jernigan, 1981; Eklundh, 1979]. The two-dimensional Discrete Fourier Transform (DFT) of an NxN image I(x,y) is defined as follows:

$$F(u,v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} I(x,y) \exp\left[ -j 2\pi (ux+vy)/N \right]$$

The Fast Fourier Transform algorithm can be used to compute the two-dimensional DFT [Johnson and Jain, 1981]. The Fourier spectrum can also be expressed in magnitude-phase form as

$$F(u,v) = M(u,v) \exp[\, jP(u,v) \,]$$

An intuitively appealing interpretation of the Fourier transform is based on the representation of an image as a weighted sum of sinusoidal gratings (Figure 1) [Rosenfeld and Kak, 1981; Duda and Hart, 1973].
The parameters u and v determine the frequencies of the horizontal and vertical sine waves in the gratings. The amplitude of the sine waves (the contrast of the grating) is given by M(u,v), and their phase is given by P(u,v).

We note that the autocorrelation function and the power spectrum ($M(u,v)^2$) are a Fourier transform pair. Thus, by the argument given earlier, the ensemble of $N^2$ possible co-occurrence matrices also determines the magnitude spectrum [Julesz and Caelli, 1979].

As spatial frequency (distance from (0,0)) increases, the wavelength of a cycle decreases. Thus, high spatial frequency is associated with small areas and low spatial frequency is associated with large areas in the image. This association means that a coarse texture, one composed of large areas of constant gray level, will have strong low spatial frequency components and a fine texture will have strong high spatial frequency components.

Because of the associations between size and spatial frequency, several texture features have been defined on the power spectrum. The phase spectrum has been largely ignored because of associations between phase and position. Since textures can be identified regardless of the position of particular components in the visual field or the position of a (sufficiently large) window in a uniformly textured plane, phase information has been assumed to be unimportant for texture analysis but useful for pattern recognition [Richards and Polit, 1974 (Appendix A, item 6); Bajcsy and Lieberman, 1976].

Figure 1: Sinusoidal Gratings. (a) 3 cycles per image horizontal. (b) 8 cycles per image horizontal. (c) Combination of five spatial frequencies with arbitrarily selected amplitudes. This is an example of a "one-dimensional texture". (d) 8 cycles per image horizontal and 8 cycles per image vertical.
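The Fourier-pair relationship between the autocorrelation function and the power spectrum can be checked numerically. The sketch below is an illustration only: it assumes NumPy, omits the normalizing factor R0, and pads the image with zeros so that the circular correlation implied by the DFT agrees with the definition in which I(x,y) is 0 outside the image. The function names are hypothetical.

```python
import numpy as np


def autocorrelation_direct(image, dx, dy):
    """Unnormalized R(dx, dy): sum over (x, y) of I(x, y) * I(x+dx, y+dy),
    with I taken to be 0 outside the image. Assumes image[y, x] indexing."""
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape
    if abs(dy) >= rows or abs(dx) >= cols:
        return 0.0  # the shifted copy no longer overlaps the image
    y0, y1 = max(0, -dy), min(rows, rows - dy)
    x0, x1 = max(0, -dx), min(cols, cols - dx)
    overlap = img[y0:y1, x0:x1] * img[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
    return float(overlap.sum())


def magnitude_phase(image, shape=None):
    """Return M(u, v) and P(u, v) with F = M * exp(jP)."""
    F = np.fft.fft2(np.asarray(image, dtype=float), s=shape)
    return np.abs(F), np.angle(F)


def autocorrelation_via_spectrum(image):
    """Inverse DFT of the power spectrum M**2 of the zero-padded image.
    Entry [dy % P, dx % Q] of the result holds the unnormalized R(dx, dy),
    where (P, Q) is the padded shape."""
    rows, cols = np.asarray(image).shape
    padded_shape = (2 * rows - 1, 2 * cols - 1)  # avoids wrap-around
    M, _ = magnitude_phase(image, shape=padded_shape)
    return np.fft.ifft2(M ** 2).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 8, size=(5, 6))
    R = autocorrelation_via_spectrum(img)
    dx, dy = -3, 1
    print(R[dy % R.shape[0], dx % R.shape[1]],
          autocorrelation_direct(img, dx, dy))
```

Note that the power spectrum discards P(u,v) entirely, so the autocorrelation recovered this way carries no phase information.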
Some recent studies have attempted to reexamine the potential of phase information for texture analysis, but the results have not been encouraging [Eklundh, 1979; Julesz and Caelli, 1979; Zucker and Cavanaugh, 1980; D'Astous and Jernigan, 1981]. It will be argued later that the apparent failure of phase information in texture analysis is due to its improper use.

Two types of features are computed from the power spectrum: spatial frequency energy and orientation energy [Weszka et al, 1976; Haralick, 1979]. Spatial frequency energy features have the form (using polar coordinates)

$$\int_{0}^{2\pi} \int_{r_k}^{r_k+\Delta r} |F(r\cos\theta,\, r\sin\theta)|^2 \, dr \, d\theta$$

These features give the total energy in a limited band of spatial frequencies. Orientation energy features have the form (again in polar coordinates)

$$\int_{0}^{\infty} \int_{\theta_k}^{\theta_k+\Delta\theta} |F(r\cos\theta,\, r\sin\theta)|^2 \, d\theta \, dr$$

These features give the total energy in limited orientation bands, which indicate directional tendencies in the image.

These global analysis methods have several disadvantages.

(1) The shapes and widths of the spatial frequency and orientation bands are free parameters which are arbitrarily specified.

(2) Differences in illumination, contrast, or gray level resolution cause significant changes in the power spectrum. This problem can be alleviated by preprocessing techniques such as histogram equalization, but this can cause a loss of gray level resolution and image fidelity [Haralick, 1979] and can dramatically change the appearance of the image.

(3) Computation of the Fourier transform or autocorrelation function over irregularly shaped regions (as are commonly encountered) involves some difficulty [Wechsler, 1980].

(4) Every entry in F(u,v) is determined by all of the image function. It is not possible to extract spatially-limited information from F(u,v) by any method short of an inverse Fourier transform.
This implies that the spatial frequency domain features cannot be used for texture segmentation without recomputation of the Fourier transform for each subimage to be analyzed [Bajcsy, 1973; Bajcsy and Lieberman, 1976].

3.4 Intermediate Analysis Methods

Several texture analysis methods are based on the relationships among particular types of objects ("primitives") in an image. These methods typically involve definition of primitives, extraction of primitives from a textured image, and characterization of the texture by formal language or heuristic methods.

In the formal languages approach, textures are considered to be language classes for which separate grammars can be inferred. Texture analysis is seen to consist of extracting the primitives and then parsing the image according to the texture grammars. Unfortunately, both of these operations are nontrivial, even in simple test cases.

Alternatively, the primitives can be analyzed by heuristic methods, often involving statistical measurements of the structure of the primitives. For example, "generalized co-occurrence matrices" [Davis et al, 1979; Haralick, 1979] can capture some aspects of the spatial distribution of primitives. A generalized co-occurrence matrix gives the number of times different primitives occur in the image at a particular displacement. The same features used for the gray level co-occurrence method can be computed from generalized co-occurrence matrices.

An orderly review of intermediate analysis methods is obtained by examining them in order of increasing complexity of the primitives. One method based on formal language techniques uses 9x9 pixel windows as the primitive [Lu and Fu, 1978a, 1978b]. The texture is characterized by a stochastic tree grammar which specifies the assignment of gray levels to the windows. Error-correcting parsing methods are used to eliminate noise, and functions to compute the "distance" between languages are used to measure differences between texture classes.
Another simple primitive is an area of constant or nearly constant gray level [Tsuji and Tomita, 1973; Tomita and Yachida, 1973; Tomita et al, 1973]. Features such as size, gray level, curvature and directionality are computed to characterize the regions. Textures are identified by multiple modes in the histograms of feature values. Labelling each region by the mode in which it appears provides a first approximation to the textured regions. Generalized co-occurrences or split-and-merge techniques can then be used to refine the segmentation. One obvious disadvantage of this approach is that its complexity increases rapidly with the fineness of the texture and with the gray scale resolution due to the large number of primitive areas which must be evaluated.

An edge can also be used as a primitive. One recent texture analysis method attempts to characterize repetitive patterns of edge pixels in an image [Vilnrotter et al, 1981] using an "edge repetition array" constructed from horizontal and vertical scans of the image. The array is then analyzed to find repetitive edge structures. In another approach, texture is characterized by computing generalized co-occurrences of edge pixels [Rosenfeld, 1979]. A spatial frequency domain form extraction method [Crowley and Parker, 1978] which identifies edges whose contrast exceeds a threshold is also suggested for texture analysis.

A more complex type of primitive involves geometric shapes such as lines, curves, angles, open polygons, and closed polygons. These primitives are difficult to use, but in some cases complex geometrical texture patterns can be synthesized from simple grammars and fairly complex primitives [Carlucci, 1972; Siromoney et al, 1972].

A theoretical scheme for an intermediate-level structural analysis of texture is presented in [Zucker, 1976]. In this theory, an observed texture is considered a transformation of an "ideal texture" which is typically a polygonal tessellation of the plane.
The ideal texture governs the placement of primitives, and transformation rules determine the relationships between primitives (such as superposition, adjacency, etc.). This theory is an idealized characterization of "structural" texture analysis, but there seem to have been no practical spinoffs of the theory.

Other intermediate-level analysis methods have been suggested based on combinations of statistical and structural methods [Tomita, 1981; Conners and Harlow, 1980b].

3.5 Image Modelling Methods

This class of methods for characterizing texture assumes that the textured image is a realization of some stochastic process which is governed by a few parameters. Texture analysis is viewed as a parameter estimation problem: given an image I(x,y), estimate the parameters of the assumed random process so that the probability of obtaining I(x,y) is maximized. The estimates of the parameters serve as texture features for classification problems. The estimates can also be used to synthesize other images with similar (in the sense of the model) texture.

One type of image synthesis model is based on stochastic tessellations of the plane [Schachter et al, 1978a; Ahuja and Rosenfeld, 1981]. These random mosaic models produce polygonal regions in the image plane. Conceptually, the models produce a textured image either by generating random lines through the image or by generating random points and growing regions around them.

In [Modestino et al, 1981], gray scale random textured images are generated which are reminiscent of the random mosaic images. This model allows control of gray level correlations between adjacent regions as well as control of the polygonal generation process. A log-likelihood texture analysis and segmentation procedure based on the synthesis method is presented and demonstrated on compound images composed of subimages generated by the model. The segmentation results are good.
Further experiments reveal some difficulties in generalizing the method to discriminate natural textures.

A different, very flexible image generation model is developed in [Pratt et al, 1978; Gagalowicz, 1978; Pratt et al, 1981]. In this model, an image is considered to be the output of a homogeneous spatial operator responding to noise input. The noise input supplies the randomness in the texture and the spatial operator supplies local structure. Characterization of a texture then requires specification of a noise distribution and a spatial operator.

Several statistical models for image generation and image modelling are reviewed in [Kashyap, 1980; Garber and Sawchuk, 1981; Chellappa and Kashyap, 1981a, 1981b]. One class of well-known texture models is based on the Markov random field [Rosenblatt and Slepian, 1962; Besag, 1974; Hassner and Sklansky, 1978; Cross and Jain, 1981; Chellappa and Kashyap, 1981a, 1981b; Garber and Sawchuk, 1981; Schmitt and Massaloux, 1981]. One-dimensional Markov random field models have been used to synthesize images for psychophysical experiments [Julesz, 1962, 1965] and for theoretical comparisons of texture analysis methods [Conners and Harlow, 1980a]. In one recent study [Cross and Jain, 1981], parameters of a Markov process were fitted to natural textures from [Brodatz, 1966]. The parameters were then used in the model to generate random images.

The results are typical of the fundamental problem with model-based approaches to characterizing textures: while some successful model-based parameterizations can be found, and the model can be used to randomly generate some images which appear similar to a prototype, in general, natural images are not bound to conform to the restrictions of a particular model. This fundamental problem severely limits the potential for model-based approaches in general texture analysis problems.

3.6 Summary

A critical review of texture analysis methods has been presented.
The large number and the variety of texture analysis methods is caused in part by the lack of a general definition of texture and by the difficulties encountered when existing methods have been applied to general texture analysis problems. Many methods have been developed on an ad hoc basis guided by intuition.

The few comparative evaluations which exist conclude that the co-occurrence method is the best statistical approach to texture. The co-occurrence matrix is widely used in applications and as a basis for more sophisticated techniques including generalized co-occurrences, visually interpretable texture features, and combined statistical-structural approaches to texture.

Features from the Fourier transform of the image have been suggested, but their performance has been poor. The comparative studies conclude that the statistical features provide better results than power spectral features, and attempts to develop texture features from the phase spectrum have been unsuccessful.

The performance of existing texture analysis methods depends on the data. Acceptable performance can be obtained for some specific problems, but the approaches lack generality. Many approaches have disadvantages such as insensitivity to some aspects of texture or susceptibility to irrelevant gray level variations ("noise").

Chapter 4
A Texture Analysis Method Based on a Theory of Human Vision

4.1 Introduction

In this chapter, the channel filtering theory of early human vision will be used to motivate a new approach to texture analysis. First, however, we pause to explain why and how this theory of human vision will be used to guide the development of a texture analysis method. Then, the channel filtering theory and its potential significance for computational vision will be briefly presented. A feature space for texture analysis will then be developed and its plausibility will be examined.
4.2 A Critical Re-evaluation of Texture Analysis Methodology

Computational vision methods are not evaluated for use in applications based on whether the methods accurately emulate the mechanisms (or sometimes even the performance) of human vision. Methods are sometimes evaluated by correlations with human performance, but correlation does not imply equivalence or causality. High correlations with human performance might be obtained using processes very different from those present in human vision. But if correlation with human performance is desirable, it seems reasonable to base the computational techniques on whatever is known about human vision. Insights into human vision can serve to motivate computational texture analysis procedures.

How should information concerning human vision be used to guide computational vision research? Ideally, we would know the overall strategies used in human vision to analyze an image. We could then implement a vision system which might well rival human performance on many visual tasks. But the computer vision system would not necessarily emulate the human visual system at the neural level. The operation of low-level components of a complex system may give no useful information about the overall strategies used in the system. (This is discussed at length by Hofstadter [1979].) Once the overall strategies in vision are identified, their implementation can be tailored to whatever devices or constraints are relevant.

In some approaches to computational vision, a technique which appears to reproduce or explain human performance is developed, then neural mechanisms are postulated which correspond to some aspects of the computational methods [Julesz et al, 1973; Julesz, 1981; Marr et al, 1979; Marr and Hildreth, 1980; Marr, 1980; Hildreth, 1980].
The use of vision science to construct post hoc justifications for existing methods is not effective since the value of a proposed technique for computational vision is determined by correlations between observed and desired performance. Correspondences between a proposed computational method and hypothesized mechanisms of human vision are ultimately irrelevant.

In this study, the opposite approach is used. Information from vision science will be used to guide the development of a new method for texture analysis, not to bolster or to validate an existing method. A theory which attempts to describe the possible processing strategies in early human vision will be used to guide the development of the computational procedure.

4.3 Spatial Frequency Channels

In the late 1960's, researchers found that the threshold visibility of sinusoidal gratings (Figure 1) depends on the spatial frequency of the gratings [Campbell and Robson, 1968; Pantle and Sekuler, 1968; Blakemore and Campbell, 1969; Campbell and Maffei, 1970]. The potential power of Fourier analysis as a tool for studying human vision was noted immediately [Campbell and Robson, 1968]. Attempts were made to determine how well spatial frequency domain analysis of visual stimuli correlated with actual human performance. One key issue involved whether the visual system combines spatial frequency information linearly, and thus whether visual analysis of complex objects could be expressed easily in terms of the spatial frequencies in the complex stimuli. Another important issue was whether the analysis of a stimulus by the visual system involves several independent mechanisms (called channels) which analyze different aspects of the stimulus. It was hypothesized that the channels might have a convenient spatial frequency domain representation.
This hypothesis sparked additional activity in which neurological and psychophysical experiments were interpreted as providing evidence for or against various single-channel or multi-channel models of early stages of human visual processing. For reviews of this activity, see [Graham, 1981; Ginsburg, 1978].

These results were unified and extended in [Ginsburg, 1978, 1980b]. An implementation of the multi-channel theory was used to illustrate the action of spatial frequency domain filters on various images containing complex forms, visual illusions, multistable images, and certain visual textures. In addition, the contrast sensitivity of abnormal visual systems was found to have unique properties in the spatial frequency domain. The channels in the visual system are modelled by filters defined in the spatial frequency domain, but a critical feature of this particular channel analysis is the assumption that phase information is an essential part of the internal representation of visual stimuli. Thus, the decomposition of an image is modelled by a sequence of spatial domain filtered images [Ginsburg, 1971]. These studies did not develop algorithms for automatic image analysis, but other papers [e.g. Ginsburg, 1973, 1979a] suggested that filtering in spatial frequency channels might be useful for machine pattern recognition, including texture analysis.

The assumption of a channel decomposition of an image suggests two computational implementations [Hall, 1972; Nathan, 1970; Ginsburg, 1979b]. In one, a series of point spread functions, or templates, is convolved with the image to obtain a series of distorted (filtered) versions of the original image. Alternatively, the convolution can be performed in the spatial frequency domain by applying an inverse Fourier transform to the product of the filter transfer function (the Fourier transform of the point spread function) and the Fourier transform of the image. This process is illustrated in Figure 2.
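As an illustration of the two implementations, the following minimal Python sketch filters a small image both ways and compares the results. It assumes NumPy is available; the function names, the 8x8 random test image, and the three-tap point spread function are illustrative choices and are not taken from this study. Note that a product of discrete Fourier transforms corresponds to circular (periodic) convolution, so the direct version wraps around the image borders.

```python
import numpy as np


def filter_in_frequency_domain(image, transfer_function):
    """Inverse DFT of the product of the transfer function and the image DFT."""
    spectrum = np.fft.fft2(image)
    return np.fft.ifft2(spectrum * transfer_function)


def circular_convolve(image, psf):
    """Direct circular convolution of an image with a point spread function."""
    n_rows, n_cols = image.shape
    out = np.zeros((n_rows, n_cols))
    for dy in range(n_rows):
        for dx in range(n_cols):
            if psf[dy, dx] != 0.0:
                # np.roll shifts the image by (dy, dx) with wrap-around.
                out += psf[dy, dx] * np.roll(image, shift=(dy, dx), axis=(0, 1))
    return out


# Hypothetical test data: a random image and a small smoothing template.
rng = np.random.default_rng(0)
image = rng.random((8, 8))
psf = np.zeros((8, 8))
psf[0, 0], psf[0, 1], psf[1, 0] = 0.5, 0.25, 0.25

# The transfer function is the Fourier transform of the point spread function.
transfer_function = np.fft.fft2(psf)

via_dft = filter_in_frequency_domain(image, transfer_function).real
via_convolution = circular_convolve(image, psf)
max_difference = float(np.max(np.abs(via_dft - via_convolution)))
print("largest difference between the two methods:", max_difference)
```

The two outputs agree to within floating-point rounding, so the choice between the methods can be made on grounds of convenience and speed.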
The resulting sequence of filtered images is exactly the same as those produced by spatial convolution. The choice of the method is largely a matter of convenience and computational efficiency. For this study, the spatial frequency domain method will be used due to the availability of fast algorithms for the discrete Fourier transform [Johnson and Jain, 1981] and due to the intuitive value of defining the filters in the spatial frequency domain [Ginsburg and Coggins, 1981; Graham, 1981].

[Figure 2 flow: IMAGE -> DFT -> SPECTRUM (real, imaginary); SPECTRUM and FILTER TRANSFER FUNCTION (real, imaginary) -> COMPLEX MULTIPLY -> FILTERED SPECTRUM (real, imaginary) -> inverse DFT -> FILTERED IMAGE]

Figure 2: Diagram of the Spatial Filtering Procedure. This procedure is repeated once for each channel.

The next issue is to decide the shapes, sizes, locations, and number of the channels. The filtering properties of the human visual system are not precisely known, but psychophysical and neurological data can be used to guide the selection of the filter parameters. This study will use filters whose parameters are within the constraints specified in [Ginsburg, 1978]. The transfer functions for spatial frequency channels are defined by a Gaussian function (on a log scale); their center frequencies are one octave apart and their width is between 1 and 2 octaves. The number of spatial frequency channels used depends on the size of the image (see Appendix B for details).

Figure 3 shows two representations of a spatial frequency channel filter in the spatial frequency domain (u-v plane). The height of the surface above the u-v plane (Figure 3a) and the intensity of the gray levels (Figure 3b) represent the filter amplitudes, |F(u,v)|, which are between 0 and 1. Four orientation channels are implemented with center orientations directed horizontally, vertically, and along the two diagonals (see Appendix B for details). Figure 4 shows two representations of an orientation filter in the u-v plane.

[Figure 3: panels (a) and (b)]

Figure 4: An Orientation Channel Filter. (a) A transect plot portraying the amplitude of the real part of the filter transfer function in the spatial frequency domain. The zero-frequency component is in the center of the region. (b) An image representation of the same filter transfer function.

Other channel filtering models for human vision have been developed [Sachs et al, 1971; Richards and Polit, 1974; Mostafavi and Sakrison, 1976; Wilson and Bergen, 1979]. We note also that some arguments against a Fourier model of vision have appeared [Julesz and Caelli, 1979; Ochs, 1979; Zucker and Cavanaugh, 1980].

The sequence of filtered images obtained from a 128x128 sample of a ceiling tile image [Brodatz, 1966] is shown in Figure 5. Each channel responds to gray level changes over different sized regions or at different orientations. Energy from large objects is displayed in low spatial frequency channels; energy from small objects is displayed in high spatial frequency channels. Orientation channels respond to gray level changes with a directional preference. Since the phase information from the original image is retained in the filtered images, the spatial distribution of the spectral energy in each channel is apparent in the positions of gray level variations in the filtered images. The use of phase information captures in the filtered images the gray level spatial distribution information which is essential to texture.

The specification of the channels completes the definition of the initial information processing stage of a computational vision system. The output of this stage is a series of channel-filtered images, each of which contains limited spectral information from the original image.
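The spatial frequency channels described above can be sketched in Python as follows. This is only a sketch under stated assumptions: the exact filter definitions are in Appendix B, which is not reproduced here, so the Gaussian width parameter `sigma_octaves` and all function names are hypothetical. The center frequencies (1 to 64 cycles per image, one octave apart) follow the caption of Figure 5 for a 128x128 image, and the zero-frequency amplitude is set to 1 so that the mean gray level is passed unchanged, as stated in Section 4.5. Orientation channels are not covered.

```python
import numpy as np


def radial_frequency_grid(n):
    """Radial spatial frequency, in cycles per image, for an n x n DFT grid."""
    freqs = np.fft.fftfreq(n) * n
    u, v = np.meshgrid(freqs, freqs, indexing="xy")
    return np.hypot(u, v)


def spatial_frequency_channel(n, center, sigma_octaves=0.5):
    """Gaussian amplitude on a log2 (octave) frequency axis.

    sigma_octaves is an assumed width parameter, not a value from the study.
    """
    radius = radial_frequency_grid(n)
    amplitude = np.zeros((n, n))
    nonzero = radius > 0
    octaves_from_center = np.log2(radius[nonzero] / center)
    amplitude[nonzero] = np.exp(-0.5 * (octaves_from_center / sigma_octaves) ** 2)
    amplitude[0, 0] = 1.0  # pass the zero-frequency (mean gray level) term
    return amplitude


def filter_image(image, amplitude):
    """Apply a real, symmetric transfer function; phase is left untouched."""
    return np.fft.ifft2(np.fft.fft2(image) * amplitude).real


# Center frequencies one octave apart, as listed in the Figure 5 caption.
centers = [1, 2, 4, 8, 16, 32, 64]
channels = {c: spatial_frequency_channel(128, c) for c in centers}

rng = np.random.default_rng(0)
image = rng.random((128, 128))  # stand-in for a texture sample
filtered = filter_image(image, channels[8])
print(filtered.shape, float(filtered.mean()), float(image.mean()))
```

Because the transfer function is real and multiplies the complex spectrum, the phase of every retained frequency component is preserved in the filtered image.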
The next task is to exploit the channel decomposition, reducing the series of filtered images down to a set of texture features which can be used to duplicate some texture analysis capabilities of human vision.

Figure 5: An Example of Channel Filtering. (a) Original 128x128 TILE image. (b)-(l) filtered images with center frequencies (in cycles per image) and orientations (in degrees) as follows: (b) 0 degrees (c) 45 degrees (d) 90 degrees (e) 135 degrees (f) 1 c/i (g) 2 c/i (h) 4 c/i (i) 8 c/i (j) 16 c/i (k) 32 c/i (l) 64 c/i

4.4 Constraints on Possible Texture Features

Pragmatic requirements constrain the nature of the features we are willing to compute from the filtered images. The features must be very simple since we now have a whole series of images to analyze rather than a single image on which more extensive analysis can be performed. We would like to find some evidence from vision science to suggest that simple features which duplicate human performance could exist.

Several observations suggest that texture is a consequence of crude information reduction in vision which simple computational methods should be sufficient to capture. First, textures can be analyzed and discriminated by the human visual system quickly and effortlessly, but only very simple, crude mechanisms have been found in the early stages of mammalian visual systems [Hubel, 1963; Ginsburg, 1978; Graham, 1981]. Second, texture is perceived most clearly in an image area which has many intensity changes and thus small areas of constant intensity [Crowley and Parker, 1978; Resnikoff, 1981]. Individual edges and their placement are not critical for identifying textures. In fact, very different generation procedures can yield images which are not preattentively discriminable.
"Texture" appears, then, to be the result of a crude information reduction which occurs in an image area with no distinctive individual features.

This crude information reduction is evident in several indiscriminable textures presented by Julesz (Figure 6). In these images, a simple check of the edge structure of the micropatterns would immediately yield discrimination of the different regions, yet these are among the most difficult patterns to visually discriminate, even given the nature of the difference between the micropatterns. Simple pixel-level operations could be constructed to discriminate the image fields in Figure 6, but the criteria for the discrimination would no longer be "texture". The ability to discriminate micropatterns does not necessarily imply that textures composed of the micropatterns are easy to discriminate [Beck, 1980; Ginsburg, 1978].

Figure 6: Indiscriminable Textures which Could Be Discriminated Based on the Edge Structure of the Micropatterns. (Each image contains two micropatterns which are reflections of each other. From Julesz, 1973.)

Further evidence for the simplicity of human texture perception comes from psychophysical experiments involving "one-dimensional textures" (Figure 1c). In these images, the gray levels in each column are constant and the gray levels along each row are determined by a weighted sum of sinusoidal functions. Psychophysical experiments show that arbitrary one-dimensional textures can be matched with images containing only a few (4) selected spatial frequencies [Richards and Polit, 1974]. Another study shows that human similarity judgements between one-dimensional textures are predicted better by the outputs of four spatial frequency channels than by the actual spatial frequencies present in the textures [Harvey and Gervais, 1981].

An observation made in passing by Julesz et al [1973: Figure 6] provides further evidence that texture is a consequence of simple operations on images. Two patterns generated by a geometrical method involving repetitions of four-dot micropatterns were presented in an image. The generation method insures that the two patterns have identical "second-order statistics", and it was observed that the patterns are not effortlessly discriminable to human observers. Julesz then presented a "blurred" version of the image in which the patterns were found to be discriminable. In addition, it was noted that the blurred patterns have different second-order statistics.
The significance of this observation is that "blurring" is an alternative characterization of certain spatial filtering operations. Blurring can enable discrimination using simpler measurements than were required on the original image. Thus, there is evidence that simple operations on channel-filtered images could be effective for texture analysis.

4.5 Properties of Channel-Filtered Images

In addition to the evidence for the simplicity of "texture" presented in the last section, we can obtain guidance in constructing texture features from certain properties of the channel-filtered images which are consequences of the filter definitions and of the spatial filtering operation. Four properties which have proven useful from preliminary studies are as follows:

1. The mean gray level of the channel-filtered images can be fixed in advance. In the implementation used in this study, the mean gray level of the filtered images is made equal to the mean gray level of the original image by setting the value of the zero-frequency component of the filters to 1, thereby passing the zero-frequency component (average gray level) of the original image unchanged. This property makes possible meaningful displays and comparisons of the channel-filtered images. Keeping the average gray level of the filtered images the same simplifies the construction of features which are invariant over global, constant gray level changes.

2. The gray level frequency histograms of the filtered images tend to be symmetrical about the mean gray level due to the symmetrical response of the filters to objects in the original image. Asymmetry in the gray level histograms is caused by the interaction of the responses of spatially close objects.

3. The magnitude of the deviation of the gray level of a pixel in a filtered image from the mean gray level is directly related to the spectral energy contained in a neighborhood of that pixel in the original image. Since phase information is retained in the filtered images, the spatial distribution of the spectral energy passed by the filter is reflected in the gray level distribution in the filtered image. Thus, the spectral energy in different spectral bands arising from small spatial areas can be measured from the sequence of filtered images without recomputing a Fourier transform for each local area of interest.

4. The differences between the channel-filtered images lie in their sensitivities to gray level variations in the original image over regions of different sizes and orientations. The channel decomposition, then, allows separate measurements of local energy over spatial domain neighborhoods of different sizes and orientations.

4.6 Definition of Texture Features

Property 3 above implies that the gray levels in each filtered image can be interpreted as representing the spectral energy arising from local areas of the original image. This interpretation motivates the selection of features for use in texture analysis experiments in this thesis. The features will measure the average local spectral energy in an image by computing the spread of the gray level frequency histogram of the filtered images.

A number of features can be defined to measure the spread of the histograms. We have used eight such features in classification experiments (Figure 7). The first feature is the average absolute deviation from the mean gray level. Features 2-4 are functions of the second through fourth moments of the gray level histogram. Feature 2 is actually the standard deviation of the gray levels. We note that the third moment is not strictly a measure of spread, but it is included for completeness. Functions of the moments are used rather than the actual values of the moments because of computational difficulties encountered in preliminary experiments caused by the large magnitudes of the third and fourth moments.
Features 5-8 are heuristic spread measurements which assume that the histogram is symmetric. Property 2 from the previous section states that this assumption is generally reasonable, but can fail in specific cases. The threshold values of .25, .50, .75, and 1.0 used in the feature definitions are arbitrarily selected. Similar features are investigated by Laws [1980], though the filters in that study are defined as small spatial domain templates.

The sequence of filters satisfies the requirement that texture be measured at several different scales. Each gray level in the filtered images is determined by a neighborhood in the original image, so the filtered images are consistent with the spatial property of texture. Statistics of the gray level histograms of filtered images are easy to compute, so they satisfy the pragmatic requirement that channel filtering features be simple. Moreover, the features have reasonable visual interpretations in terms of spectral energy, size, and orientation. In addition, these features can be computed for any subimage without repeating the channel filtering operation; this property will be important for texture segmentation (Chapter 6).

F1_k = \frac{1}{N^2} \sum_{g=0}^{G} |g - \bar{g}| \, f_k(g)

F2_k = \left[ \frac{1}{N^2} \sum_{g=0}^{G} (g - \bar{g})^2 \, f_k(g) \right]^{1/2}

F3_k = \left[ \frac{1}{N^2} \sum_{g=0}^{G} (g - \bar{g})^3 \, f_k(g) \right]^{1/3}

F4_k = \left[ \frac{1}{N^2} \sum_{g=0}^{G} (g - \bar{g})^4 \, f_k(g) \right]^{1/4}

Let S_k(d) = \sum_{g=\bar{g}-d}^{\bar{g}+d} f_k(g)

F5_k = d such that S_k(d) = .25 N^2
F6_k = d such that S_k(d) = .50 N^2
F7_k = d such that S_k(d) = .75 N^2
F8_k = d such that S_k(d) = N^2

Figure 7: Texture Feature Definitions. f_k(g) denotes the frequency of gray level g (0 \le g \le G) in the filtered image for channel k. \bar{g} denotes the mean gray level. Image size is NxN pixels.

4.7 How Will the Texture Features Be Used?

An image will be represented by a feature vector consisting of the values of one of the texture features computed over each filtered image.
The feature vector will map the image into a point in the feature space; this point will represent the image's texture. The distance between points in the feature space will be used as a measure of the textural difference between images.

In most approaches to texture analysis, a number of features are defined and then they are applied in various combinations to classify texture samples [e.g. Weszka et al, 1976; Faugeras and Pratt, 1980; Haralick and Shanmugam, 1973]. This approach is more likely to produce, at some point, "good" classification results than using each feature in isolation. But combining various features whose properties are not well understood leads to problems in interpreting results. In this study, the features will be used separately, expecting that the results for all of them (except perhaps Feature 3) will be similar. Selection of the "best" feature from these will have to be based on computational considerations and compatibility with solutions to other image analysis problems such as form extraction, image registration, or edge detection.

4.8 Summary

Existing texture analysis methods use little guidance from properties of human visual perception. A theory concerning the early information processing in human vision was used to motivate the development of a new feature space for texture analysis. The theory asserts that an image is decomposed by the human visual system into channels which are modelled by spatial filtering operations. Since phase is retained in the channel-filtered images, the spectral energy caused by gray level variations in local spatial areas can be measured from the channel-filtered images. Features which measure this local energy were defined for use in texture analysis studies.
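The feature definitions of Figure 7 can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used in this study. Several details are assumptions made here because the definitions leave them open: the window for features 5-8 is centered on the rounded mean gray level, the value returned is the smallest integer d whose window reaches the required fraction of pixels (exact equality rarely occurs with discrete counts), and a signed root is used for the third moment. All function names and the ten-pixel example are hypothetical.

```python
import math


def gray_level_histogram(pixels, max_gray):
    """f(g): number of pixels at each integer gray level 0..max_gray."""
    hist = [0] * (max_gray + 1)
    for g in pixels:
        hist[g] += 1
    return hist


def mean_gray_level(hist):
    return sum(g * f for g, f in enumerate(hist)) / sum(hist)


def absolute_deviation(hist):
    """Feature 1: average absolute deviation from the mean gray level."""
    mean = mean_gray_level(hist)
    return sum(abs(g - mean) * f for g, f in enumerate(hist)) / sum(hist)


def moment_root(hist, order):
    """Features 2-4: the order-th root of the order-th central moment.

    The sign handling for odd orders is an assumption of this sketch.
    """
    mean = mean_gray_level(hist)
    moment = sum((g - mean) ** order * f for g, f in enumerate(hist)) / sum(hist)
    return math.copysign(abs(moment) ** (1.0 / order), moment)


def spread(hist, fraction):
    """Features 5-8: smallest d whose window around the mean holds the
    requested fraction of the pixels (assumed discrete interpretation)."""
    center = round(mean_gray_level(hist))
    target = fraction * sum(hist)
    for d in range(len(hist)):
        lo, hi = max(0, center - d), min(len(hist) - 1, center + d)
        if sum(hist[lo:hi + 1]) >= target:
            return d
    return len(hist) - 1


# Hypothetical filtered-image gray levels (a symmetric histogram, mean 6).
pixels = [4, 5, 5, 6, 6, 6, 6, 7, 7, 8]
hist = gray_level_histogram(pixels, max_gray=15)

features = {
    "F1": absolute_deviation(hist),
    "F2": moment_root(hist, 2),
    "F3": moment_root(hist, 3),
    "F4": moment_root(hist, 4),
    "F5": spread(hist, 0.25),
    "F6": spread(hist, 0.50),
    "F7": spread(hist, 0.75),
    "F8": spread(hist, 1.00),
}
print(features)
```

Computing one such feature on each channel-filtered image of a sample gives the feature vector for that sample.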
Chapter 5

Evaluation of Channel Filtering Features for Texture Classification

5.1 Introduction

This chapter will present the results of several experiments designed to evaluate the channel filtering features, to compare their performance against some other texture features proposed in the literature, and to illustrate properties of the feature space which may be useful in applications.

The experiments in this chapter are texture classification problems using a supervised learning paradigm. Sets of test images are labelled according to an external criterion which, for these experiments, is a visual evaluation of the image texture. The sets of images are then analyzed by the channel filtering method, resulting in a feature vector for each image. These feature vectors are then input to a classification algorithm, and the estimated error rate is used to evaluate the class separations in the feature space.

Most of the experiments in this chapter use 25 samples of each image class. The samples are extracted as subimages from a large (256x256 pixels) image which represents a single texture class. Nonoverlapping subimages are extracted as far as possible, then the sampling origin is shifted by half the subimage size to complete the set if necessary. The number of samples is fairly small, but since the large images are homogeneous, with no complications such as shadows, receding surfaces, or magnification, this number of samples from throughout the image should adequately characterize the image texture. However, in interpreting the estimated error rates, we must keep in mind that the estimates are based on a small number of samples.

In all of the experiments in this chapter, different texture features are used in separate classification problems. The feature vector for each image consists of the values of one of the texture features F1...F8 defined in Figure 7 computed over all of the channel-filtered images.
The dimensionality of that vector depends on the size of the images being analyzed: for 64x64 images, eleven channels (four orientation, seven spatial frequency) are used; for 32x32 images, ten channels (four orientation, six spatial frequency) are used.

For the experiments in this chapter, a Nearest Neighbor Classifier will be used. The decision rule is to classify an unlabelled point into the class of its nearest labelled neighbor in the feature space, where the distances are determined by the Euclidean distance metric. This decision rule results in piecewise linear decision boundaries. The error rate of the classifier will be estimated using the Leave-One-Out rule, which is often used when sample sizes are small. In this method, each point in turn is treated as a test sample with all other points as training samples. The results will be presented as a confusion matrix for each feature. The (i,j)th entry of the matrix gives the number of times a sample from class i was classified as class j.

The performance of the channel filtering features will be compared to the performance of another feature, the total power spectral energy (PSE) in the sequence of filtered spectra. This feature has been used with ideal band-pass channels in previous studies (Section 3.3). This feature uses no phase information, by definition, but it uses the same power spectral information as the channel filtering features. The effect of using phase information in the channel filtering features will be determined by comparing the error rates of PSE and channel filtering features.

The experiments will first establish the performance of the proposed texture features on classification of natural images whose sizes are 64x64, 32x32, and 16x16 pixels. Then the effect of histogram equalization on the natural images will be evaluated. The sensitivity of the features to changes in the average gray level, magnification, and orientation of the textured images will be evaluated.
Computational simplifications of the channel filtering procedure using fewer channels and using channels with an ideal band-pass characteristic will also be evaluated. The features will also be applied to a particular set of images to compare their performance to that of the co-occurrence method.

5.2 Evaluations of the Features on Natural Images

In this sequence of experiments, eight images from [Brodatz, 1966] will be used to provide an initial test of the proposed texture analysis method. The data for these experiments consist of subimages of eight 256x256 images illustrated in Figure 8. The images (and their abbreviations) are as follows: ceiling tile (TILE), beach pebbles (ROCK), beach sand (SAND), handmade paper (PAPE), pressed cork (CORK), grass lawn (GRAS), wood grain (WOOD), and straw screening (SCRE). These images were selected to provide a variety of textural properties and a range of difficulty in discrimination. The images were digitized using a Spatial Data Systems Eyecom image processing system. No preprocessing was applied to the digitized images for the experiments in this section.

Figure 8: Natural Image Classes. (a) TILE (b) ROCK (c) SAND (d) PAPE (e) CORK (f) GRAS (g) WOOD (h) SCRE

5.2.1 Experiment 1: Natural Images, 64x64 subimages

In the first experiment, 25 samples of size 64x64 were extracted from the 256x256 images. Sixteen of the samples were nonoverlapping, and nine more were extracted by shifting the sampling origin by (32,32). The 256x256 images are visually discriminable. The feature space will now be tested on smaller images which could, in principle, provide a more difficult discrimination problem since a smaller texture sample is available for analysis.

Each of the 200 samples was filtered and the eight features in Figure 7 were computed on all of the filtered images. The classification results for all of the features are given in Table 1.
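The sampling scheme for Experiment 1 can be checked with a short Python sketch. The function name and the raster ordering of the samples are assumptions made for illustration; in particular, the text does not say which nonoverlapping subimages are kept when more than 25 are available, so taking the first ones in raster order is purely a placeholder choice.

```python
def subimage_origins(image_size, sub_size, n_samples):
    """Top-left corners of square subimages within a square image.

    Nonoverlapping subimages are listed first; if they do not supply
    enough samples, a second grid shifted by half the subimage size
    is appended. The selection order is an assumption of this sketch.
    """
    last_start = image_size - sub_size
    grid = range(0, last_start + 1, sub_size)
    origins = [(row, col) for row in grid for col in grid]
    if len(origins) < n_samples:
        half = sub_size // 2
        shifted = range(half, last_start + 1, sub_size)
        origins += [(row, col) for row in shifted for col in shifted]
    return origins[:n_samples]


origins_64 = subimage_origins(image_size=256, sub_size=64, n_samples=25)
nonoverlapping = [o for o in origins_64 if o[0] % 64 == 0 and o[1] % 64 == 0]
shifted = [o for o in origins_64 if o[0] % 64 == 32 and o[1] % 64 == 32]
print(len(nonoverlapping), "nonoverlapping +", len(shifted), "shifted samples")
```

For a 256x256 image and 64x64 subimages this gives the sixteen nonoverlapping samples and the nine samples shifted by (32,32) described above.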
The PSE feature provides 69% correct classification while all of the channel filtering features (except feature 3) provide better than 90% correct classification. Feature 3 is the only channel filtering feature which does not measure the spread of the gray level frequency histogram of the filtered images and therefore cannot be easily interpreted as a local energy measure. Feature 3 yields 79.5% correct classification in this experiment.

Table 1: Classification of Eight Natural Image Classes Using 64x64 Subimages.

PSE (69% accuracy)
       TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
TILE     19     2     3     1     0     0     0     0
ROCK      3    18     1     0     1     0     1     1
SAND      4     0    14     2     0     1     4     0
PAPE      0     0     1    22     1     1     0     0
CORK      0     0     0     0    15    10     0     0
GRAS      0     0     0     1     9    13     2     0
WOOD      2     0     5     0     0     4    14     0
SCRE      0     0     0     0     0     2     0    23

F1 (98% accuracy)
       TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
TILE     22     2     1     0     0     0     0     0
ROCK      1    24     0     0     0     0     0     0
SAND      0     0    25     0     0     0     0     0
PAPE      0     0     0    25     0     0     0     0
CORK      0     0     0     0    25     0     0     0
GRAS      0     0     0     0     0    25     0     0
WOOD      0     0     0     0     0     0    25     0
SCRE      0     0     0     0     0     0     0    25

F2 (97.5% accuracy)
       TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
TILE     22     2     1     0     0     0     0     0
ROCK      0    25     0     0     0     0     0     0
SAND      0     0    23     0     0     2     0     0
PAPE      0     0     0    25     0     0     0     0
CORK      0     0     0     0    25     0     0     0
GRAS      0     0     0     0     0    25     0     0
WOOD      0     0     0     0     0     0    25     0
SCRE      0     0     0     0     0     0     0    25

F3 (79.5% accuracy)
       TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
TILE     25     0     0     0     0     0     0     0
ROCK      2    16     0     0     1     5     1     0
SAND      0     0    16     2     1     0     3     3
PAPE      0     0     0    25     0     0     0     0
CORK      0     0     1     0    17     4     1     2
GRAS      0     0     0     2    11    12     0     0
WOOD      0     0     1     0     0     0    23     1
SCRE      0     0     0     0     0     0     0    25

F4 (98% accuracy)
       TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
TILE     23     1     1     0     0     0     0     0
ROCK      0    25     0     0     0     0     0     0
SAND      0     0    23     0     0     2     0     0
PAPE      0     0     0    25     0     0     0     0
CORK      0     0     0     0    25     0     0     0
GRAS      0     0     0     0     0    25     0     0
WOOD      0     0     0     0     0     0    25     0
SCRE      0     0     0     0     0     0     0    25

Table 1 (cont'd)

F5 (92% accuracy)
       TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
TILE     21     1     1     0     0     2     0     0
ROCK      5    20     0     0     0     0     0     0
SAND      1     0    22     0     0     2     0     0
PAPE      0     0     0    25     0     0     0     0
CORK      0     0     0     0    24     1     0     0
GRAS      0     0     1     0     0    24     0     0
WOOD      0     0     0     1     0     1    23     0
SCRE      0     0     0     0     0     0     0    25

F6 (96.5% accuracy)
       TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
TILE     22     1     1     0     0     1     0     0
ROCK      3    22     0     0     0     0     0     0
SAND      0     0    24     0     0     1     0     0
PAPE      0     0     0    25     0     0     0     0
CORK      0     0     0     0    25     0     0     0
GRAS      0     0     0     0     0    25     0     0
WOOD      0     0     0     0     0     0    25     0
SCRE      0     0     0     0     0     0     0    25

F7 (98.5% accuracy)
       TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
TILE     23     1     1     0     0     0     0     0
ROCK      1    24     0     0     0     0     0     0
SAND      0     0    25     0     0     0     0     0
PAPE      0     0     0    25     0     0     0     0
CORK      0     0     0     0    25     0     0     0
GRAS      0     0     0     0     0    25     0     0
WOOD      0     0     0     0     0     0    25     0
SCRE      0     0     0     0     0     0     0    25

F8 (91% accuracy)
       TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
TILE     22     0     1     0     0     2     0     0
ROCK      0    25     0     0     0     0     0     0
SAND      0     0    19     0     2     4     0     0
PAPE      0     0     0    25     0     0     0     0
CORK      0     0     2     2    21     0     0     0
GRAS      0     0     5     0     0    20     0     0
WOOD      0     0     0     0     0     0    25     0
SCRE      0     0     0     0     0     0     0    25

Figure 9: Mean Values of Feature 1 in Each Channel for the Eight Natural Image Classes. (Axes: mean feature value versus channel number.)

Some insight into the structure of the patterns in the feature space can be obtained by plotting the average, over the 25 subimages, of the feature value in each channel. Figure 9 shows the plot for feature 1 for the eight natural image classes. The plot of the mean feature values in each channel has unique properties for each image class.
For example, ROCK is the only class with high feature values in low frequency channels due to the presence of large objects in the ROCK image. SCRE has a unique pattern in orientation channels: the horizontal channel (number 1) has a high value and the other orientation channels have low values. This reflects the horizontal directional tendency and the regularity of SCRE. In addition, SCRE has low feature values in spatial frequency channels except for the channel centered at 32 cycles per image which captures the periodicity of the bars in the screen. For other texture classes, the description of the mean feature value plot might not be as easy or convenient. These examples show how the values of the channel filtering features can be interpreted as visible image properties.

In the remaining experiments, the results of only the PSE feature and feature 1 will be reported. The channel filtering features (except feature 3) were found in preliminary experiments to have similar performance. Feature 1 is selected for reporting because its behavior was typical of the channel filtering features, because it is a simple feature to compute, and because the feature has its own intuitive interpretation.

5.2.2 Experiment 2: Natural Images, 32x32 subimages

The second experiment involves 32x32 samples of the same eight 256x256 images. The same procedures as in the first experiment were used except that six spatial frequency channels (rather than seven) were computed. The decreased size of the subimages means that no information beyond the Nyquist frequency of 16 cycles per image width is available, so the highest-frequency channel from the previous experiment (center frequency 64 cycles per image width) is not usable.

The classification results for the PSE feature and feature 1 are given in Table 2. The PSE feature provides 69% correct classification, which is almost the same as for the 64x64 subimages, but the errors occur between different classes. In spite of the reduction in the size of the subimages from those used in the previous experiment, the results show that there is only slight degradation in the performance of the channel filtering features; feature 1 provides 91% correct classification. About half of the errors occur between the GRAS and SAND classes. These results imply that the 32x32 subimages contain enough textural information to enable good classification results on the eight natural image classes. This result increases our confidence in the adequacy of the 64x64 subimage size for characterizing the textural properties of the natural images.

Table 2: Classification of Eight Natural Image Classes Using 32x32 Subimages.

PSE (68% accuracy)
       TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
TILE     19     0     5     1     0     0     0     0
ROCK      4    11     1     0     0     1     6     2
SAND      2     0    19     3     0     0     1     0
PAPE      1     0     0    23     0     0     0     0
CORK      0     0     0     0    21     2     0     2
GRAS      0     0     0     0     4    12     8     1
WOOD      0     0     4     0     1    11     9     0
SCRE      0     0     0     0     0     4     0    21

F1 (91% accuracy)
       TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
TILE     20     1     2     0     0     2     0     0
ROCK      0    22     0     0     0     0     3     0
SAND      1     0    22     0     0     2     0     0
PAPE      0     0     0    25     0     0     0     0
CORK      0     0     0     0    25     0     0     0
GRAS      0     0     6     0     0    19     0     0
WOOD      0     1     0     0     0     0    24     0
SCRE      0     0     0     0     0     0     0    25

5.2.3 Experiment 3: Natural Images, 16x16 subimages

In this experiment, 50 samples of size 16x16 are extracted from four of the natural images (TILE, ROCK, SAND, PAPE). For subimages of this size, only five spatial frequency channels are useful. Because of this limitation and the small size of the subimages, the number of samples per class was doubled for this experiment.
The classification results for the PSE feature and feature 1 are given in Table 3. In this experiment, the PSE feature achieved 83% correct classification with errors distributed mostly among TILE, ROCK, and SAND. The improved performance of the PSE feature using these small subimages may be due to the larger variance in the average gray levels of small subimages. The increased variance can result in the (accidental) formation of clusters from the same image class which enhance the classification accuracy using the nearest-neighbor classifier. The sensitivity of the PSE feature to changes in the average gray level of an image will be demonstrated in Experiment 6.

Channel filtering feature 1 yields 68% correct classification with more than half of the errors occurring between TILE and SAND. In fact, the TILE and SAND classes are practically indistinguishable. The results indicate that 16x16 subimages do not capture enough of these texture patterns to enable accurate nearest-neighbor classification using the channel filtering features.

Table 3: Classification of Four Natural Image Classes Using 16x16 Subimages.

  PSE    TILE  ROCK  SAND  PAPE
  TILE     39     2     9     0
  ROCK      7    41     2     0
  SAND      3     1    41     5
  PAPE      1     0     4    45
  Accuracy: 83%

  F1     TILE  ROCK  SAND  PAPE
  TILE     20     7    22     1
  ROCK      8    38     4     0
  SAND     16     1    32     1
  PAPE      1     0     3    46
  Accuracy: 68%

5.2.4 Summary

This sequence of experiments has served as a test of the channel filtering features on a variety of natural textures which were not preprocessed. Images of different sizes were extracted from large images which are visually interpreted as having different textures. The results show that a nearest-neighbor classifier using channel filtering features is able to identify the image classes with 90% or greater accuracy when the subimages are as small as 32x32. An attempt to extend this to 16x16 subimages failed when classes TILE and SAND proved indistinguishable to channel filtering features and error rates between other pairs of classes were also higher.
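The accuracy figures quoted with each confusion matrix can be checked directly from the counts: the fraction correct is the sum of the diagonal divided by the total number of samples. The following sketch applies this to the counts of Table 3. The function names `accuracy` and `pair_error_share` are illustrative and do not come from the original study.

```python
# Confusion matrices from Table 3 (rows: true class, columns: assigned class),
# in the class order TILE, ROCK, SAND, PAPE.
TABLE3_PSE = [
    [39, 2, 9, 0],
    [7, 41, 2, 0],
    [3, 1, 41, 5],
    [1, 0, 4, 45],
]
TABLE3_F1 = [
    [20, 7, 22, 1],
    [8, 38, 4, 0],
    [16, 1, 32, 1],
    [1, 0, 3, 46],
]


def accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def pair_error_share(confusion, a, b):
    """Share of all misclassifications that fall between classes a and b."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (confusion[a][b] + confusion[b][a]) / (total - correct)


if __name__ == "__main__":
    print(f"PSE accuracy: {accuracy(TABLE3_PSE):.0%}")
    print(f"F1 accuracy:  {accuracy(TABLE3_F1):.0%}")
    # TILE is index 0 and SAND is index 2.
    print(f"F1 errors between TILE and SAND: {pair_error_share(TABLE3_F1, 0, 2):.0%}")
```

The same two helpers apply unchanged to the eight-class matrices of the other experiments.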
5.3 The Effect of Histogram Equalization

One common preprocessing technique applied in existing texture analysis studies is histogram equalization [Haralick et al, 1973]. The technique is used in texture studies to standardize the average gray level and contrast of the images, though the method has also been used for contrast enhancement. For some images, histogram equalization can dramatically change the image's appearance while for other images, histogram equalization has little or no effect. The procedure takes an image with G gray levels and an arbitrary gray level frequency histogram and produces an image with G' gray levels (G'

Figure 15: Four-Dot Micropatterns and Corresponding Image Classes. (The squares in the micropattern diagrams correspond to black pixels. The labels correspond to the usage in Table 18.)
(a) Micropattern 1A (b) Micropattern 1B (c) Part of image JU1A (d) Part of image JU1B (e) Micropattern 2A (f) Micropattern 2B (g) Part of image JU2A (h) Part of image JU2B
Figure 15 (cont'd)

image classes are illustrated in Figure 15. The image classes are visually discriminable.

5.8.1 Experiment 10: Application of the Co-occurrence Method

The definition of the co-occurrence matrix [Haralick, 1979] involves the selection of a displacement vector. Since there exist N^2 possible displacement vectors for an NxN image, some guidance in selecting the displacement vector is required. The choice of the displacement vector is critical. The displacement vectors to be used in this experiment will be chosen by a (suboptimal) strategy which takes advantage of known differences between the micropatterns.

The black pixels in the micropatterns in Figure 15 are numbered to demonstrate the similarities between micropatterns of each pair. The placement of only one dot (labelled 0 and 0') differs in the micropatterns of each pair. This implies that there exists a black-black co-occurrence which involves a different displacement vector in the two micropatterns. This displacement vector will be used to discriminate the image classes. (Note: due to symmetry in the co-occurrence matrix definition, vectors (dx,dy) and (-dx,-dy) are considered to be identical.)

Table 18: Black-Black Displacement Vectors in Four-Dot Micropatterns. (arrows indicate vectors which discriminate the micropatterns)

  Micropattern 1A                       Micropattern 1B
  Points   Displacement                 Points   Displacement
  0-1      (-2, 4)                      0'-1     (-2, 0)
  0-2      ( 2, 0)                      0'-2     ( 2,-4)
  0-3      ( 1, 2)      <-------->      0'-3     ( 1,-2)
  1-2      ( 4,-4)                      1-2      ( 4,-4)
  1-3      ( 3,-2)                      1-3      ( 3,-2)
  2-3      (-1,-2)                      2-3      (-1,-2)

  Micropattern 2A                       Micropattern 2B
  Points   Displacement                 Points   Displacement
  0-1      (-1, 3)                      0'-1     (-1,-1)
  0-2      ( 1, 1)                      0'-2     ( 1,-3)
  0-3      ( 2, 2)      <-------->      0'-3     ( 2,-2)
  1-2      ( 2,-2)                      1-2      ( 2,-2)
  1-3      ( 3,-1)                      1-3      ( 3,-1)
  2-3      ( 1, 1)                      2-3      ( 1, 1)

Table 18 shows the displacement vectors between all pairs of black pixels in each micropattern. This information can be used to select an appropriate displacement vector to distinguish the micropatterns.
For example, the (1,1) displacement vector can be used to discriminate the micropattern pairs. Micropatterns 2A and 2B contain black-black co-occurrences at displacement (1,1) but micropatterns 1A and 1B do not. Similarly, the (1,2) vector can discriminate the micropatterns of the first pair, and the (2,2) vector can discriminate the micropatterns of the second pair. These three displacement vectors, selected specifically to capture the differences between micropatterns, will be used to compute co-occurrence matrices for the textures.

For these displacement vectors, there are no interactions between different copies of the micropatterns for black-black co-occurrences. Thus, we can compute the expected number of black-black co-occurrences in the image from rotations of individual micropatterns as shown in Table 19. The number of micropatterns occurring in the 256x256 image is known (784), so we can estimate the number of micropatterns occurring at each of the four allowed orientations (196). By counting the number of black-black co-occurrences in each micropattern at each orientation, we can then compute for each displacement vector the expected number of black-black co-occurrences in the entire image.

Table 19: Computation of Expected Number of Black-Black Co-occurrences in 256x256 Four-dot Images.

                  Number of black-black co-occurrences    Expected no. black-black
                  at each orientation                     co-occurrences in image

  DISPLACEMENT (1,1)-(-1,-1)
  Rotation angle     0    90   180   270   total
  Micropattern
  1A                 0     0     0     0     0            0*196 = 0
  1B                 0     0     0     0     0            0*196 = 0
  2A                 2     0     2     0     4            4*196 = 784
  2B                 2     0     2     0     4            4*196 = 784

  DISPLACEMENT (1,2)-(-1,-2)
  Rotation angle     0    90   180   270   total
  Micropattern
  1A                 1     0     1     0     2            2*196 = 392
  1B                 0     0     0     0     0            0*196 = 0
  2A                 0     0     0     0     0            0*196 = 0
  2B                 0     0     0     0     0            0*196 = 0

  DISPLACEMENT (2,2)-(-2,-2)
  Rotation angle     0    90   180   270   total
  Micropattern
  1A                 0     0     0     0     0            0*196 = 0
  1B                 0     0     0     0     0            0*196 = 0
  2A                 1     1     1     1     4            4*196 = 784
  2B                 0     2     0     2     4            4*196 = 784
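The counting procedure just described can be sketched in a few lines. The sketch takes the dot positions of each micropattern relative to dot 0 (or 0'), as read from the 0-k rows of Table 18, rotates the micropattern through the four allowed orientations, counts dot pairs whose displacement matches the chosen vector up to sign, and multiplies by the 196 copies expected at each orientation. It relies on the stated assumption that different copies of the micropatterns do not interact. The names `MICROPATTERNS`, `rotate90`, `counts_by_rotation`, and `expected_in_image` are illustrative only.

```python
from itertools import combinations

# Dot coordinates relative to dot 0 (or 0'), taken from the 0-k rows of Table 18.
MICROPATTERNS = {
    "1A": [(0, 0), (-2, 4), (2, 0), (1, 2)],
    "1B": [(0, 0), (-2, 0), (2, -4), (1, -2)],
    "2A": [(0, 0), (-1, 3), (1, 1), (2, 2)],
    "2B": [(0, 0), (-1, -1), (1, -3), (2, -2)],
}

# 784 micropatterns per image, spread evenly over four orientations.
COPIES_PER_ORIENTATION = 784 // 4


def rotate90(point):
    """Rotate a point by 90 degrees about the origin."""
    x, y = point
    return (-y, x)


def count_pairs(points, displacement):
    """Count dot pairs separated by +/- displacement (symmetric definition)."""
    dx, dy = displacement
    hits = 0
    for (x0, y0), (x1, y1) in combinations(points, 2):
        diff = (x1 - x0, y1 - y0)
        if diff == (dx, dy) or diff == (-dx, -dy):
            hits += 1
    return hits


def counts_by_rotation(points, displacement):
    """Black-black co-occurrence counts at rotations of 0, 90, 180, 270 degrees."""
    counts = []
    current = list(points)
    for _ in range(4):
        counts.append(count_pairs(current, displacement))
        current = [rotate90(p) for p in current]
    return counts


def expected_in_image(name, displacement):
    """Expected black-black co-occurrences in the whole image."""
    per_rotation = counts_by_rotation(MICROPATTERNS[name], displacement)
    return sum(per_rotation) * COPIES_PER_ORIENTATION


if __name__ == "__main__":
    for displacement in [(1, 1), (1, 2), (2, 2)]:
        for name in MICROPATTERNS:
            print(displacement, name,
                  counts_by_rotation(MICROPATTERNS[name], displacement),
                  expected_in_image(name, displacement))
```

Because the count is summed over all four rotations, two micropatterns whose distinguishing vectors differ only by a 90 degree rotation receive the same expected total.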
The result of this computation shows that the expected number of black-black co-occurrences at displacement (1,2) is different for the first pair of images (Figure 15a). Thus, the number of black-black co-occurrences at displacement (1,2) should be able to discriminate image classes 1A and 1B. The other vectors do not provide discriminating information between 1A and 1B. In the second image pair (Figure 15b), the expected numbers of black-black co-occurrences are equal for all three displacement vectors. Even the (2,2) displacement vector, which could discriminate the micropatterns, cannot discriminate images 2A and 2B. The vectors which differ in the second pair of micropatterns have the same length, but their orientations are 90 degrees apart. Since the micropatterns are rotated by random multiples of 90 degrees, the difference between the micropatterns is confused in the images.

These results use only black-black co-occurrences to help select reasonable displacement vectors for use in the co-occurrence method. The three vectors will now be used to compute co-occurrence matrices for use in classification problems.

The performance of the co-occurrence matrix method will now be demonstrated on 25 64x64 subimages from each class. Since the images are binary, the co-occurrence matrices are of size 2x2, and since the definition of the co-occurrence matrix to be implemented [Haralick, 1979] imposes symmetry on the matrix, the co-occurrence matrix involves three different values: the numbers of black-black, black-white, and white-white co-occurrences in the image. These numbers are used as features.

The nearest-neighbor classification results are given in Table 20. The (1,1) displacement vector discriminates the two pairs but does not discriminate between textures in the same pair. The (1,2) vector can identify texture 1A, but the other classes are confused.
The (2,2) vector discriminates between the pairs, but the discrimination within each pair is poor, even though the (2,2) vector was originally intended to discriminate the second image pair. However, since the black-black co-occurrences are known not to contribute to the discrimination of these classes, the discrimination which does occur can be attributed to white-black and white-white co-occurrences. The results using all three co-occurrence matrices combined are also shown. This result uses nine features: the three values in each of the three co-occurrence matrices. The images of the first pair are identified, but the second pair is not well separated.

Table 20: Classification of Four-dot Textures Using Co-occurrence Matrices.

  (1,1)   JU1A  JU1B  JU2A  JU2B
  JU1A       2    23     0     0
  JU1B       2    23     0     0
  JU2A       0     0     1    24
  JU2B       0     0     9    16

  (1,2)   JU1A  JU1B  JU2A  JU2B
  JU1A      25     0     0     0
  JU1B       0     1     4    20
  JU2A       0     2     4    19
  JU2B       0     2     0    23

  (2,2)   JU1A  JU1B  JU2A  JU2B
  JU1A       1    24     0     0
  JU1B       7    18     0     0
  JU2A       0     0    18     7
  JU2B       0     0     9    16

  ALL     JU1A  JU1B  JU2A  JU2B
  JU1A      25     0     0     0
  JU1B       0    25     0     0
  JU2A       0     0    13    12
  JU2B       0     0     7    18

Table 21: Classification of Four-dot Textures by Channel Filtering.

  PSE     JU1A  JU1B  JU2A  JU2B
  JU1A      11    14     0     0
  JU1B      17     8     0     0
  JU2A       0     0    17     8
  JU2B       0     0     9    16

  F1      JU1A  JU1B  JU2A  JU2B
  JU1A      25     0     0     0
  JU1B       0    25     0     0
  JU2A       0     0    25     0
  JU2B       0     0     0    25

5.8.2 Experiment 11: Application of Channel Filtering

The channel filtering approach was applied to the same samples of the four four-dot textures. The classification results are shown in Table 21. The PSE feature discriminates the different pairs, but cannot discriminate the textures within each pair. Feature 1 achieves 100% correct classification on all four classes.

In this experiment, it is of some interest to determine whether the discrimination of the texture classes is due to the action of orientation channels, spatial frequency channels, or both. Since these images are composed of dots, it might seem that no orientation or size selectivity exists between the image classes.
Table 22 gives the classification results using only the six spatial frequency channels, and Table 23 gives the results using only the four orientation channels. The results show that both orientation and spatial frequency information contribute to the discrimination of the classes. These contributions can be understood by considering the interpretation of spatial filtering as a moving average operation. Each gray level in a filtered image is the result of an averaging operation over an area of the original image defined by the filter point spread function. The result of this averaging can be expressed for these particular images as a consequence of the local dot density. Variations in the dot density in local areas of different sizes and orientations provide the discriminating information between the image classes. Since the number of dots in a local area determines the average gray level of the area (which is a component of the power spectrum), this interpretation is consistent with the characterization of the channel filtering features as "average local energy" measures.

Table 22: Classification of Four-dot Textures Using Six Spatial Frequency Channels.

  PSE     JU1A  JU1B  JU2A  JU2B
  JU1A      12    13     0     0
  JU1B      13    12     0     0
  JU2A       0     0    21     4
  JU2B       0     0     7    18

  F1      JU1A  JU1B  JU2A  JU2B
  JU1A      24     1     0     0
  JU1B       0    25     0     0
  JU2A       0     0    24     1
  JU2B       0     0     1    24

Table 23: Classification of Four-dot Textures Using Four Orientation Channels.

  PSE     JU1A  JU1B  JU2A  JU2B
  JU1A      13    12     0     0
  JU1B      13    12     0     0
  JU2A       0     0    18     7
  JU2B       0     0     9    16

  F1      JU1A  JU1B  JU2A  JU2B
  JU1A      25     0     0     0
  JU1B       3    22     0     0
  JU2A       0     0    25     0
  JU2B       0     0     0    25

5.8.3 Summary

This series of experiments compared the performance of the co-occurrence method to the channel filtering method for discriminating a particular set of textured images. A procedure for selecting displacement vectors for co-occurrence matrices was applied. The displacements were chosen to detect known differences in the micropattern structures.
For one image pair, the combined co-occurrence results were satisfactory; for the other, the co-occurrence matrices performed poorly. The method for selecting the displacement vectors was not optimal or exhaustive, but the results demonstrate that the selection of displacement vectors is critical and that the performance of a particular displacement vector cannot be reliably predicted from its performance on micropatterns. In these experiments, only certain black-black co-occurrences could be investigated in detail. Some results indicated that the white-white and black-white co-occurrences might also contribute to class discriminations. The problem of selecting suitable displacement vectors is a major disadvantage to the use of the co-occurrence method.

Channel filtering feature F1 yielded 100% correct classification. Both spatial frequency and orientation channels were found to contribute to this result. The performance of the PSE feature indicates that the different image pairs have significantly different power spectra, but the power spectra of the images within each pair are not discriminable. These observations imply that the performance of feature F1 is not due to power spectral differences; it is the use of phase information which enables discrimination of all of the classes.

5.9 Computational Simplifications

In this section, two computational simplifications of the channel filtering approach will be tested to determine if such simplifications cause any significant degradation in classification performance.

5.9.1 Experiment 12: Using Fewer Channels

One possible simplification is to use fewer channels. Since texture is perceived in regions containing small areas of constant gray level, the low-frequency channels, which respond to gray level variations over large areas, might be eliminated without degrading performance. In addition, the highest spatial frequency channel does not capture any spectral information which is not available in other channels.
In this experiment, the two lowest spatial frequency channels and the highest spatial frequency channel will be eliminated. The 64x64 images will be filtered by four orientation channels and four spatial frequency channels (center frequencies 4, 8, 16, and 32 cycles per image).

The results of using these eight channels on the eight natural image classes are given in Table 24. The 65.5% correct performance of the PSE feature is slightly worse than its performance using all 11 channels. Feature 1 provides 98% correct classification, the same as with all of the channels present. Using only eight channels does not appear to degrade the performance of the channel filtering features. The minimum number of channels required for texture discrimination depends on the particular set of textures.

Table 24: Classification Results on Natural Images Using Eight Channels.

  PSE    TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
  TILE     17     2     4     1     0     0     1     0
  ROCK      3    15     1     0     2     0     0     4
  SAND      4     0    12     2     0     0     7     0
  PAPE      0     0     2    22     1     0     0     0
  CORK      0     0     0     0    16     9     0     0
  GRAS      0     1     1     0     8    12     2     1
  WOOD      2     0     5     1     0     4    13     0
  SCRE      0     0     0     0     0     1     0    24
  Accuracy: 65.5%

  F1     TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
  TILE     22     1     2     0     0     0     0     0
  ROCK      0    25     0     0     0     0     0     0
  SAND      1     0    24     0     0     0     0     0
  PAPE      0     0     0    25     0     0     0     0
  CORK      0     0     0     0    25     0     0     0
  GRAS      0     0     0     0     0    25     0     0
  WOOD      0     0     0     0     0     0    25     0
  SCRE      0     0     0     0     0     0     0    25
  Accuracy: 98%

5.9.2 Experiment 13: Ideal Band-Pass Channels

The filtering operation using Gaussian filters requires that the filter amplitudes and the image spectrum be multiplied. If the filter amplitudes were all either zero or one, the multiplications could be avoided. In this experiment, channels with an ideal band-pass characteristic will be applied to the natural image classes. The spatial frequency channels to be used are the same as those defined for power spectral energy features in [Weszka et al, 1976].
The channels pass all spatial frequencies in nonoverlapping bands whose lower and upper bounds, in cycles per image, are as follows: [2,4], [4,8], [8,16] and [16,32]. The orientation channels to be used pass all spectral information within 45 degrees of horizontal, vertical, and both diagonal orientations.

The use of ideal band-pass filters causes additional ripples in the filter responses due to the abrupt filter cutoffs in the spatial frequency domain. The response is still symmetrical about the mean gray level, however. In [Marr and Hildreth, 1979] these ripples were found to interfere with edge and form detection in filtered images. Table 25 shows that the ideal band-pass filters do not seem to degrade texture analysis performance: feature 1 provides 98% correct classification results. The PSE feature achieves 77% correct results, which is better than the 69% performance recorded using Gaussian channels. The PSE feature gave similar performance in another study using ideal band-pass filters [Weszka et al, 1976].

Table 25: Classification Results on Natural Images Using Ideal Bandpass Channels.

  PSE    TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
  TILE     21     1     3     0     0     0     0     0
  ROCK      4    20     0     0     0     0     0     1
  SAND      4     0    19     0     0     0     2     0
  PAPE      0     0     2    22     0     1     0     0
  CORK      0     0     0     0    19     6     0     0
  GRAS      0     0     2     0     9    14     0     0
  WOOD      1     0     6     0     1     2    15     0
  SCRE      0     0     0     0     0     1     0    24
  Accuracy: 77%

  F1     TILE  ROCK  SAND  PAPE  CORK  GRAS  WOOD  SCRE
  TILE     23     1     1     0     0     0     0     0
  ROCK      1    24     0     0     0     0     0     0
  SAND      1     0    24     0     0     0     0     0
  PAPE      0     0     0    25     0     0     0     0
  CORK      0     0     0     0    25     0     0     0
  GRAS      0     0     0     0     0    25     0     0
  WOOD      0     0     0     0     0     0    25     0
  SCRE      0     0     0     0     0     0     0    25
  Accuracy: 98%

5.10 Summary

This chapter has presented the results of experiments designed to demonstrate properties of the channel filtering feature space. The performance of the method on natural and artificial image classes, and on images which differ in average gray level, magnification, orientation, and phase spectra was examined.
Some computational simplifications of the channel filtering method were also presented and found not to seriously degrade the performance of the features. Throughout the experiments, the power spectral energy in each channel was used as a feature to demonstrate the effect of using phase information in the channel filtering features. The channel filtering features provide superior performance. A comparison of the channel filtering method with the co-occurrence method on one set of images demonstrated that, even when the displacement vectors for the co-occurrence matrix are chosen specifically to capture differences between the micropatterns in a textured image, the channel filtering method outperforms the co-occurrence matrix method.

Chapter 6

Evaluation of Channel Filtering Features for Texture Segmentation

6.1 Segmentation

The previous chapter evaluated the channel filtering features in texture classification problems. In more realistic situations, the images being analyzed contain an unknown number of textured regions which must be identified, thus segmenting the image. This texture segmentation problem is different from a general image segmentation problem in which the objective is to identify objects in a scene.

Segmenting an image into textured regions is more difficult than classifying textured images in three ways. First, the number of classes is specified in advance in classification problems whereas in segmentation problems the number of classes is unknown. This requires a segmentation algorithm to include some means for determining the actual (or at least an appropriate) number of classes from the data. Second, the objects being classified in classification problems are subimages. In segmentation problems, the objects to be classified are individual pixels. The number of classifications required in segmentation is very large. For example, segmenting a 128x128 image requires each of the 16,384 pixels to be classified.
This large number of classifications will require some simplifications in the methods to be used. Third, in classification experiments the subimages being classified are known to contain a single texture, but in segmentation experiments, the neighborhoods of some pixels will involve more than one texture.

The segmentation procedure in this chapter will require a determination of the texture in a neighborhood about each pixel. A channel filtering feature will be used for this purpose. A feature vector will be computed for each pixel in the image. The number of texture classes present will be determined from the feature vectors, and all of the pixels in the image will be classified. Figure 16 shows a block diagram of the segmentation procedure.

Figure 16: Diagram of the Segmentation Procedure. [Block diagram; labels shown: IMAGE; channel filtering; FILTERED IMAGES; Ik'(x,y) = 2*|Ik(x,y) - G|; Ik'(.,.) IMAGES; averaging procedure; FEATURE IMAGES; extraction of 64 representative pixels; PATTERN MATRIX; CLUSTER CENTERS; minimum-distance classifier; SEGMENTED IMAGE.]

6.2 Computing Texture Features for Segmentation

Ideally, a "resolution-preserving textural transform" [Haralick, 1975] would be applied to the image which would replace the gray levels by texture feature values computed over some neighborhood about each pixel. Such an ideal is often not practically attainable, but a reasonable solution exists using the channel filtering approach. By applying two operations to the filtered images, the gray level at each pixel of the filtered images can be replaced by a new gray level which is related to the value of channel filtering feature 1. In the first step, we replace each gray level as follows:

$$ I_k'(x,y) = 2\,\lvert \bar{G} - I_k(x,y) \rvert $$

where $I_k(\cdot,\cdot)$ is the kth filtered image and $\bar{G}$ is the mean gray level of the filtered images. This step produces an image, $I_k'(\cdot,\cdot)$, in which the gray levels are related to the absolute deviation from the mean gray level.
Simply taking the absolute deviation would result in an image with about half the original number of gray levels. Since gray levels are integer values, the absolute deviation is scaled to increase the precision of the averaging step which occurs next. In the experiments, 872 was added to the pixel values to enhance the visibility of the Ik'(.,.) image.

The second step involves computing a moving average over the Ik'(.,.) images. In the definition of channel filtering feature 1, the absolute deviation from the mean is averaged over the entire subimage. For segmentation experiments, the averaging is performed over small neighborhoods about each pixel. Therefore, we wish to replace each pixel in Ik'(.,.) with the average gray level in a neighborhood about the pixel. This can be accomplished either by a convolution in the spatial domain or by a filtering operation in the spatial frequency domain. The window which defines the neighborhood must be large enough to capture an adequate texture sample, but not so large that transitions between different textures are blurred over a large area. The result of the averaging is a "feature image" in which the gray level at each pixel is a measure of the texture present at the corresponding location of the original image. The experiments in this chapter will use 8x8 and 16x16 square windows in computing the feature images. In these experiments, the spatial domain averaging method is used.

The feature images defined here correspond to the "texture energy planes" of Laws [1980]. In that study, the filters were defined as spatial domain templates which were convolved with the original image. The filtered images had an average gray level of zero, resulting in a slightly different computational procedure for producing the feature images.
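The two steps described in this section, a scaled absolute deviation from the mean followed by a moving average over a square window, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the mean is taken over the single filtered image passed in, pixels beyond the image border are handled by replicating the nearest edge pixel, and the function name `feature_image` is hypothetical. The source does not specify the border treatment.

```python
def feature_image(filtered, window=8, scale=2.0):
    """Texture feature image from one filtered image (list of rows).

    Step 1: replace each gray level by scale * |mean - gray level|.
    Step 2: replace each pixel by the average over a window x window
            neighborhood (spatial-domain moving average).
    Assumptions: the mean is computed over this image only, and border
    pixels are replicated so every neighborhood has window**2 samples.
    """
    rows, cols = len(filtered), len(filtered[0])
    mean = sum(sum(row) for row in filtered) / (rows * cols)

    # Step 1: scaled absolute deviation from the mean gray level.
    deviation = [[scale * abs(mean - value) for value in row] for row in filtered]

    # Offsets covering exactly `window` pixels along each axis.
    before = window // 2
    after = window - 1 - before

    def clamp(index, size):
        return min(max(index, 0), size - 1)

    # Step 2: moving average over the neighborhood of each pixel.
    output = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in range(-before, after + 1):
                for dc in range(-before, after + 1):
                    total += deviation[clamp(r + dr, rows)][clamp(c + dc, cols)]
            output[r][c] = total / (window * window)
    return output


if __name__ == "__main__":
    # A checkerboard of -1 and +1 has mean 0, so every scaled deviation is 2.
    checker = [[1.0 if (r + c) % 2 else -1.0 for c in range(16)] for r in range(16)]
    features = feature_image(checker, window=8)
    print(features[0][0], features[8][8])
```

Applying the function to each of the channel outputs yields one feature image per channel, so each pixel acquires a feature vector with as many components as there are channels. The direct double loop is chosen for clarity; a running-sum or frequency-domain implementation would give the same averages at lower cost.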
The present study differs from Laws [1980] in the use of spatial frequency domain filters rather than spatial domain templates, in the decomposition of the image by isolating bands of spatial frequency and orientation rather than by detection of "edges", "spots", and "rings", and in the use of results from vision science to guide the development of the computational methods.

Note that the computation of the feature images does not require each neighborhood to be filtered separately. This contrasts with some earlier approaches to texture segmentation in which the entire computational procedure is repeated for each neighborhood considered [Haralick, 1975; Bajcsy, 1973; Bajcsy and Lieberman, 1976].

6.3 Segmentation Using Feature Images

The feature images provide an evaluation of the texture in small neighborhoods about each pixel in the image. Corresponding to each pixel is a feature vector in which the number of features equals the number of channels used. The texture segmentation problem has now been transformed into the feature space. The next problem is to assess the structure of the patterns in the feature space. This assessment will be made by applying a clustering algorithm [Anderberg, 1973; Everitt, 1974]. Previous uses of clustering in segmentation problems are discussed in [Mitchell and Carlton, 1978; Schachter et al, 1978b; Rosenfeld, 1981; Coleman, 1979; Davis and Mitiche, 1982].

The clustering algorithm to be used, called CLUSTER [Dubes and Jain, 1976], is a partitional clustering procedure which attempts to minimize a squared-error criterion. CLUSTER was selected because the clusters it produces are not limited to hierarchical relationships and because this algorithm does not require user-specified parameters. CLUSTER provides partitionings of the data with the number of clusters going from 2 to a user-specified bound (8 in this study). Each partitioning of the data corresponds to a segmentation of the given image.
It will be necessary to evaluate these clusterings to determine which ones are appropriate representations of the data, and several statistics are provided by CLUSTER for this purpose.

Clustering algorithms are computationally very demanding. Most of them are designed to cluster only a few hundred points. To segment a 128x128 image using clustering alone would require the algorithm to process 16,384 points. To reduce computational requirements, only 64 pixels spaced 16 rows and 16 columns apart will be clustered. This simplification assumes that the textured regions to be segmented will not be irregularly shaped or very small in area. Additionally, only eight channels, the four orientation channels and spatial frequency channels 3-6 (center frequencies 4, 8, 16, and 32 cycles per image), will be used. These eight channels were found in section 5.9.1 to provide good classification results. This provides a sample of 64 points in 8 dimensions for clustering.

The cluster centers will be used to define a minimum-distance classifier to classify the remaining pixels. The classification results will be displayed as a segmented image in which gray levels denote the cluster labels assigned to each pixel.

Note that the segmentation is completely determined by the structure of the points in the feature space according to the clustering algorithm. This differs from interpretation-guided segmentation [Barrow and Tennenbaum, 1981] where semantic information is used to guide the segmentation process.

6.4 Evaluating the Segmentations

The number of segments obtained in an image depends on the clustering algorithm. A problem facing users of any clustering algorithm is the question of cluster validity [Dubes and Jain, 1979]. Since any clustering algorithm will produce clusters regardless of the distribution of points in the feature space, the user must determine whether the partitioning obtained is a consequence of structure in the points or an artifact of the clustering algorithm.
A related problem is to determine the actual number of clusters present in the data.

CLUSTER provides several statistics which can be used to qualitatively assess the validity of a clustering. One statistic, which measures the spread of a cluster in the feature space, is the average within-cluster distance, which is defined for cluster k as follows:

$$ \mathrm{CLAVGD}(k) = \left[ \frac{1}{N(k)} \sum_{i=1}^{N(k)} \sum_{j=1}^{D} \bigl( X(i,j) - C(k,j) \bigr)^2 \right]^{1/2} $$

where N(k) is the number of points in cluster k, C(k,.) is the cluster center for cluster k, D is the dimensionality of the feature space, and X(i,j) is the value of the jth feature for the ith pattern. A measure of the "validity" of cluster k which takes into account the compactness and the isolation of the cluster is defined as

$$ S(k) = \frac{\min_{l \ne k} \left[ \sum_{j=1}^{D} \bigl( C(k,j) - C(l,j) \bigr)^2 \right]^{1/2}}{\mathrm{CLAVGD}(k)} $$

Large values of S(k) indicate compact, well-isolated clusters. This statistic will be used to determine the acceptability of a clustering. An acceptable clustering is one in which the value of S(k) for all clusters exceeds a threshold. We have empirically determined a threshold of 1.70 for S(k) by tuning the threshold in preliminary experiments to yield approximately the number of clusters perceived by preattentive human vision. This particular value is only an approximation: we found a threshold of 2.0 to reject too many reasonable clustering solutions and a threshold of 1.5 to accept too many clusterings. The threshold value requires that for each cluster the minimum distance to another cluster center be at least 1.70 times the average within-cluster distance.

The clusterings which are accepted by this criterion will be ranked by the value of the average of S(k) over all clusters weighted by the number of points in the clusters. Clusterings with higher average S(k) values will be preferred.

6.5 Segmentation Experiment 1: Dot Textures

The data for this experiment consists of a 128x128 binary image illustrated in Figure 17.
The left half of the image is a regular dot pattern in which the dots are separated by 3 pixels from their horizontal and vertical neighbors. The right half of the image is a random dot pattern in which the probability that each pixel is black is independent of the other pixels and is approximately equal to 1/9. (The exact probability is the number of dots in the regular texture divided by half the number of pixels in the image, which results in identical average gray values for both textures).

Figure 17: Dots Image for Segmentation Experiment 1.

This image was segmented using 8x8 and 16x16 windows in the averaging step for computing the feature images. The segmented images produced using 8x8 windows are shown in Figure 18, and the segmented images produced using 16x16 windows are shown in Figure 19. The figures show only the segmentations for 2, 3, 4, and 5 clusters because the gray levels used for labelling each segment become too difficult to see when more classes are present. Note that the actual gray levels in the segmented images have no significance other than to distinguish the regions.

The two-cluster segmentations for both window sizes accurately distinguish the regions of different texture: over 98% of the pixels are correctly labelled. The 8x8 segmentation contains a few small areas in the random texture which are classified with the regular texture. These misclassifications do not appear in the 16x16 segmentation because the probability that a region of random dots will resemble a regular texture is lower for larger regions.

The segmentation results corresponding to more than two clusters break up the random texture into irregularly shaped regions. The regular texture is not subdivided. None of the segmented regions lie across the boundary between the textures.

Figure 18: Segmented Images for Dots Using 8x8 Averaging Windows.
(a) 2-cluster solution (b) 3-cluster solution (c) 4-cluster solution (d) 5-cluster solution

Figure 19: Segmented Images for Dots Using 16x16 Averaging Windows. (a) 2-cluster solution (b) 3-cluster solution (c) 4-cluster solution (d) 5-cluster solution

We now need to determine which of the segmentations are reasonable using the validity test defined earlier. The number of points in each cluster, the value of S(k) for each cluster, and the weighted average values of S(k) are shown in Table 26 for the 8x8-averaged feature images and in Table 27 for the 16x16-averaged feature images. In both cases, only the two-cluster solution is accepted. The tables also show that the S(k) value for cluster 1, which corresponds to the regular texture, is always very high. Since the cluster of points from the regular texture is always compact and well-isolated, the regular texture is never subdivided in the segmented images.

Table 26: Evaluation of Clustering on Dots Image Using 8x8 Averaging Windows. (* indicates accepted clustering solutions)

              k     N(k)   S(k)
2 CLUSTERS    1     32     6.93
              2     32     2.31
              AVG          4.62 *
3 CLUSTERS    1     32     6.17
              2     19     1.41
              3     13     1.24
              AVG          3.76
4 CLUSTERS    1     32     6.08
              2     20     1.48
              3      8     1.15
              4      4     2.20
              AVG          3.78
5 CLUSTERS    1     32     6.08
              2     13     1.05
              3      8     1.14
              4      4     2.16
              5      7     0.975
              AVG          3.64
6 CLUSTERS    1     32     5.75
              2      7     1.53
              3      3     1.19
              4     11     1.24
              5      6     1.07
              6      5     1.47
              AVG          3.53
7 CLUSTERS    1     32     5.75
              2      3     2.19
              3      3     1.19
              4      9     1.34
              5      9     1.13
              6      5     1.31
              7      3     1.53
              AVG          3.55
8 CLUSTERS    1     32     5.75
              2      3     1.98
              3      3     1.19
              4      9     1.41
              5      3     1.66
              6      5     1.31
              7      3     1.53
              8      6     1.29
              AVG          3.59

Table 27: Evaluation of Clustering on Dots Image Using 16x16 Averaging Windows.
(* indicates accepted clustering solutions)

              k     N(k)   S(k)
2 CLUSTERS    1     32     9.39
              2     32     3.79
              AVG          6.59 *
3 CLUSTERS    1     32     8.86
              2     22     1.33
              3     10     1.62
              AVG          5.14
4 CLUSTERS    1     32     8.27
              2     11     1.22
              3      9     1.39
              4     12     1.28
              AVG          4.78
5 CLUSTERS    1     32     8.05
              2      8     1.39
              3      8     1.53
              4      6     1.56
              5     10     1.26
              AVG          4.73
6 CLUSTERS    1     32     8.06
              2      7     1.45
              3      7     1.63
              4      6     1.58
              5      7     1.14
              6      5     2.97
              AVG          4.87
7 CLUSTERS    1     32     8.06
              2      7     1.39
              3      8     1.58
              4      5     1.84
              5      2     1.77
              6      7     1.71
              7      3     1.35
              AVG          4.83
8 CLUSTERS    1     32     8.06
              2      7     1.59
              3      8     1.58
              4      5     1.93
              5      2     1.87
              6      6     1.97
              7      2     1.91
              8      2     1.91
              AVG          4.91

6.6 Segmentation Experiment 2: Gaussian White Noise

This experiment will apply the segmentation procedure to a 128x128 image which is visually perceived to contain a single homogeneous texture. The gray level at each pixel is generated independently using a Gaussian distribution with mean 128 and standard deviation 30. The image is illustrated in Figure 20.

The same filtering and segmentation procedure as in the previous experiment was applied. The segmented images from 8x8-averaged feature images are shown in Figure 21 and the results from 16x16-averaged feature images are shown in Figure 22. In both cases, the images are segmented into irregular patches. The patches for the 8x8 case are generally smaller than those for the 16x16 case.

The statistics for evaluating the clusterings are given in Table 28 for 8x8 averaging and in Table 29 for 16x16 averaging. None of the clusterings pass the threshold test. In fact, none of the individual clusters appear to be valid. Since all multiple-class solutions are rejected, we conclude that the image contains a single texture.

6.7 Segmentation Experiment 3: Natural Image Composite

Figure 23 shows the 128x128 image to be used for this experiment. The image is composed of four natural texture classes: WOOD, PAPE, SCRE and SAND. The results of segmentation based on 8x8 averaging for the feature images are shown in Figure 24, and the results for 16x16 averaging are shown in Figure 25.
Figure 20: Gaussian White Noise Image for Segmentation Experiment 2.

Figure 21: Segmented Images for Gaussian White Noise Using 8x8 Averaging Windows. (a) 2-cluster solution (b) 3-cluster solution (c) 4-cluster solution (d) 5-cluster solution

Figure 22: Segmented Images for Gaussian White Noise Using 16x16 Averaging Windows. (a) 2-cluster solution (b) 3-cluster solution (c) 4-cluster solution (d) 5-cluster solution

Figure 23: Natural Image Composite for Segmentation Experiment 3.

Figure 24: Segmented Images for Natural Image Composite Using 8x8 Averaging Windows. (a) 2-cluster solution (b) 3-cluster solution (c) 4-cluster solution (d) 5-cluster solution

Figure 25: Segmented Images for Natural Image Composite Using 16x16 Averaging Windows. (a) 2-cluster solution (b) 3-cluster solution (c) 4-cluster solution (d) 5-cluster solution

Table 28: Evaluation of Clustering on Gaussian White Noise Image Using 8x8 Averaging Windows.

              k     N(k)   S(k)
2 CLUSTERS    1     55     1.26
              2      9     1.12
              AVG          1.24
3 CLUSTERS    1     38     0.831
              2      8     1.22
              3     18     0.939
              AVG          0.91
4 CLUSTERS    1     14     1.21
              2     16     1.11
              3     18     1.05
              4     16     1.07
              AVG          1.11
5 CLUSTERS    1     13     1.18
              2     10     1.04
              3     17     1.15
              4     10     1.01
              5     14     1.13
              AVG          1.11
6 CLUSTERS    1     12     1.08
              2      4     1.64
              3      7     1.29
              4      8     1.04
              5     23     1.06
              6     10     0.946
              AVG          1.10
7 CLUSTERS    1     10     1.13
              2      5     1.66
              3      7     1.07
              4      7     1.16
              5     15     0.923
              6      7     1.02
              7     13     1.03
              AVG          1.09
8 CLUSTERS    1      9     1.29
              2      4     1.44
              3      6     1.53
              4      7     1.01
              5     13     1.18
              6      9     1.15
              7      6     1.06
              8     10     1.09
              AVG          1.20

Different results are obtained by using different averaging windows. The two-cluster solution using 8x8 averaging groups the WOOD and SCRE areas in one cluster and the PAPE and SAND areas in the other. This clustering appears to be determined by the irregularity of PAPE and SAND and the directional tendencies of WOOD and SCRE. Using 16x16 averaging, the SCRE texture is alone in one cluster in the two-class segmentation. The regularity and strong orientation dependence appear to determine this clustering.
The segmentations with three and four clusters are similar for the two window sizes. The 8x8 window tends to break up the image into smaller areas and to misclassify more pixels than the 16x16 window. Edges between textures are more sharply identified using the 8x8 window, but the 8x8 window is more likely to incorrectly subdivide a homogeneous texture area. One significant error made using 8x8 averaging is the misclassification of a low-contrast area of the SCRE panel as WOOD. The same area appears as a separate segment in the 3 and 4 class segmentations using 16x16 averaging.

Table 29: Evaluation of Clustering on Gaussian White Noise Image Using 16x16 Averaging Windows.

              k     N(k)   S(k)
2 CLUSTERS    1     39     1.12
              2     25     0.957
              AVG          1.06
3 CLUSTERS    1     24     0.951
              2     19     1.04
              3     21     0.879
              AVG          0.954
4 CLUSTERS    1     21     1.06
              2     15     1.00
              3     17     0.972
              4     11     1.03
              AVG          1.02
5 CLUSTERS    1     13     1.20
              2     12     1.10
              3     15     1.08
              4     13     1.24
              5     11     0.973
              AVG          1.12
6 CLUSTERS    1     11     1.25
              2     14     1.11
              3     11     1.24
              4     11     1.34
              5     10     1.24
              6      7     1.22
              AVG          1.23
7 CLUSTERS    1     11     1.25
              2     11     1.34
              3     11     1.24
              4     11     1.34
              5     10     1.24
              6      3     1.28
              7      7     1.20
              AVG          1.27
8 CLUSTERS    1     10     1.30
              2      8     1.10
              3      9     1.37
              4     11     1.36
              5      9     1.21
              6      3     1.28
              7      6     1.23
              8      8     1.13
              AVG          1.25

Table 30: Evaluation of Clustering on Composite Natural Image Using 8x8 Averaging Windows. (* indicates accepted clustering solutions)

              k     N(k)   S(k)
2 CLUSTERS    1     35     1.34
              2     29     1.58
              AVG          1.45
3 CLUSTERS    1     24     2.11
              2     28     1.75
              3     12     2.22
              AVG          1.97 *
4 CLUSTERS    1     21     1.61
              2     16     1.39
              3     12     2.19
              4     15     1.46
              AVG          1.63
5 CLUSTERS    1     21     1.46
              2     15     1.35
              3     12     2.17
              4      7     1.06
              5      9     1.03
              AVG          1.46
6 CLUSTERS    1     14     1.21
              2      7     1.47
              3     12     2.13
              4     14     1.11
              5     10     1.28
              6      7     1.33
              AVG          1.41
7 CLUSTERS    1     14     1.21
              2      6     1.52
              3     12     2.08
              4     10     1.27
              5      4     1.32
              6      7     1.33
              7     11     1.32
              AVG          1.45
8 CLUSTERS    1     14     1.21
              2      6     1.52
              3     12     2.09
              4      5     1.37
              5      4     1.32
              6      7     1.33
              7     11     1.43
              8      5     1.76
              AVG          1.52

Table 31: Evaluation of Clustering on Composite Natural Image Using 16x16 Averaging Windows. (* indicates accepted clustering solutions)

              k     N(k)   S(k)
2 CLUSTERS    1     48     1.44
              2     16     2.26
              AVG          1.65
3 CLUSTERS    1     31     2.18
              2     16     2.51
              3     17     2.95
              AVG          2.47 *
4 CLUSTERS    1     13     1.92
              2     16     2.40
              3     16     2.59
              4     19     1.89
              AVG          2.20 *
5 CLUSTERS    1     13     1.92
              2     12     2.22
              3     15     2.05
              4     19     1.89
              5      5     2.76
              AVG          2.06 *
6 CLUSTERS    1      5     2.53
              2     12     2.22
              3     15     2.05
              4     15     1.61
              5     12     1.56
              6      5     2.76
              AVG          1.98
7 CLUSTERS    1      5     2.53
              2     12     2.22
              3     15     2.05
              4      6     1.17
              5     12     1.55
              6      9     1.09
              7      5     2.76
              AVG          1.86
8 CLUSTERS    1      5     2.53
              2     12     2.22
              3     15     2.05
              4      6     1.23
              5     12     1.53
              6      6     1.43
              7      3     1.40
              8      5     2.76
              AVG          1.91

The statistics for evaluating the clusterings are shown in Table 30 for 8x8 averaging and in Table 31 for 16x16 averaging. The three-cluster solution is the only one which passes the threshold test in the 8x8 case. With 16x16 averaging, the solutions for 3, 4, and 5 clusters are found to be acceptable. Using the weighted average of S(k) to rank the solutions, we find the three-cluster solution to be the best, followed in order by the 4- and 5-cluster solutions. In the three-cluster solution, the SAND and PAPE regions are merged in a single cluster. The irregularity and the contrast of these areas are similar, so the preference for this solution is plausible. In the three-class segmentation using 16x16 averaging windows, over 95% of the pixels are correctly labelled (assuming the SAND and PAPE panels to be the same class). The segmentation using 8x8 windows classifies about 86% of the pixels correctly.
The four-cluster solution, which is the second choice for 16x16 averaging, reasonably captures the four image types used: the segmentation based on an 8x8 averaging window correctly labels 81% of the pixels, while using a 16x16 window, 91% of the pixels are correctly labelled.

6.8 Segmentation Experiment 4: SCRE

Figure 26 shows the 128x128 SCRE natural image which will be segmented using the same procedures as the previous experiments. The image contains a single texture class, but unlike the Gaussian white noise image, the SCRE image contains some internal structure. The segmented images using 8x8 averaging windows are presented in Figure 27 and the results using 16x16 averaging windows are presented in Figure 28.

The cluster evaluation statistics are shown in Table 32 for 8x8 averaging and in Table 33 for 16x16 averaging. None of the clustering solutions passes the threshold test, but the values of S(k) are larger than those obtained with Gaussian white noise. There appear to be emerging clusters, but the presence of some non-isolated clusters in each solution causes all of the clusterings to be rejected.

Figure 26: SCRE Image for Segmentation Experiment 4.

Figure 27: Segmented Images for SCRE Using 8x8 Averaging Windows. (a) 2-cluster solution (b) 3-cluster solution (c) 4-cluster solution (d) 5-cluster solution

Figure 28: Segmented Images for SCRE Using 16x16 Averaging Windows. (a) 2-cluster solution (b) 3-cluster solution (c) 4-cluster solution (d) 5-cluster solution

Table 32: Evaluation of Clustering on SCRE Image Using 8x8 Averaging Windows.

              k     N(k)   S(k)
2 CLUSTERS    1     44     1.45
              2     20     1.60
              AVG          1.50
3 CLUSTERS    1     27     1.24
              2     15     2.02
              3     22     1.57
              AVG          1.54
4 CLUSTERS    1     18     1.46
              2     14     1.88
              3     19     1.89
              4     13     1.33
              AVG          1.65
5 CLUSTERS    1     19     1.37
              2     14     1.85
              3     17     1.70
              4      5     2.70
              5      9     1.33
              AVG          1.73
6 CLUSTERS    1      9     1.74
              2     14     1.76
              3     17     1.70
              4      5     2.54
              5     13     1.68
              6      6     1.71
              AVG          1.78
7 CLUSTERS    1      9     1.74
              2     12     1.82
              3      7     1.43
              4      5     2.54
              5     13     1.71
              6      6     1.71
              7     12     1.14
              AVG          1.66
8 CLUSTERS    1      9     1.74
              2      6     1.37
              3      5     2.03
              4      5     2.46
              5     11     1.54
              6      6     1.71
              7     10     1.95
              8     12     1.51
              AVG          1.74

Table 33: Evaluation of Clustering on SCRE Image Using 16x16 Averaging Windows.

              k     N(k)          S(k)
2 CLUSTERS    1     57            1.65
              2      7            1.81
              AVG                 1.66
3 CLUSTERS    1     32            1.34
              2      6            1.99
              3     26            1.34
              AVG                 1.40
4 CLUSTERS    1     18            1.49
              2      4            3.99
              3     25            1.36
              4     17            1.59
              AVG                 1.62
5 CLUSTERS    1     18            1.53
              2      4            3.50
              3     [illegible]   1.83
              4      9            1.60
              5      6            1.64
              AVG                 1.73
6 CLUSTERS    1     14            1.86
              2      4            3.63
              3      7            1.77
              4     17            1.50
              5      6            1.28
              6     16            1.73
              AVG                 1.79
7 CLUSTERS    1     13            1.73
              2      4            3.10
              3      6            1.99
              4     12            1.21
              5      4            2.41
              6     16            1.72
              7      9            1.25
              AVG                 1.72
8 CLUSTERS    1     13            1.73
              2      4            3.10
              3      4            2.50
              4     11            1.29
              5      4            2.41
              6     16            1.72
              7      6            1.44
              8      6            1.70
              AVG                 1.80

6.9 Summary

This chapter presented an algorithm for computing a textural transform in which the texture of a neighborhood about each pixel is represented by gray level values in a series of feature images. The feature images are computed from filtered images using a two-step procedure in which the second step can be implemented as another filtering operation.

The resulting textural transform was evaluated by applying a texture segmentation procedure in four experiments. The segmentation procedure imposed no restrictions on the nature of the segmented regions such as a minimum size or connectivity requirement. A clustering algorithm was applied to some representative pixels, and the remaining pixels were assigned to the clusters by a minimum distance classifier. The segmented images indicate that the feature space is a reasonable representation of the textures in an image. This conclusion was further confirmed by a statistical evaluation of the clusterings. The evaluation identified reasonable clusterings by requiring all clusters to be compact and isolated as measured by a test statistic. Acceptable clusterings were then ranked by an average isolation statistic.
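The validity test and ranking summarized above can be illustrated with a short sketch. It assumes that each cluster center C(k,.) is the mean of the points assigned to the cluster (the text does not say how CLUSTER computes its centers), and the function names `cluster_validity` and `evaluate_clustering` are hypothetical, introduced only for this illustration.

```python
import numpy as np

def cluster_validity(X, labels):
    """Return N(k), CLAVGD(k) and S(k) for each cluster, in sorted label order."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ks = np.unique(labels)
    # Assumption: the cluster center C(k,.) is the mean of the cluster's points.
    centers = np.array([X[labels == k].mean(axis=0) for k in ks])
    n = np.array([np.sum(labels == k) for k in ks])
    # CLAVGD(k): square root of the mean squared distance to the center.
    clavgd = np.array([
        np.sqrt(np.mean(np.sum((X[labels == k] - c) ** 2, axis=1)))
        for k, c in zip(ks, centers)
    ])
    # Distance from each center to the nearest other center.
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    s = d.min(axis=1) / clavgd
    return n, clavgd, s

def evaluate_clustering(n, s, threshold=1.70):
    """Accept if every S(k) exceeds the threshold; also return the weighted average."""
    n = np.asarray(n, dtype=float)
    s = np.asarray(s, dtype=float)
    accepted = bool(np.all(s > threshold))
    weighted_avg = float(np.sum(n * s) / np.sum(n))
    return accepted, weighted_avg
```

As a usage example, calling `evaluate_clustering([31, 16, 17], [2.18, 2.51, 2.95])` with the N(k) and S(k) values reported in Table 31 for the three-cluster solution accepts the clustering and gives a weighted average of about 2.47, in agreement with the table.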
The preferred clusterings contained reasonable numbers of clusters, and the corresponding segmentations generally agree with the image generation method and visual segmentations of the images.

The experiments used 8x8 and 16x16 windows for computing the feature images. Both gave satisfactory segmentation results, though these window sizes are smaller than the subimages used in the classification experiments. This suggests that a minimum distance classification algorithm might be more appropriate than a nearest neighbor algorithm for the channel filtering feature space.

Chapter 7

Summary and Conclusions

7.1 Summary

Texture is characterized by preattentive human visual performance of segmentation and classification tasks. This form of perception has many potential applications in computer vision systems, but the characterization of "texture" in terms of human performance does not lead to simple, effective algorithms for texture analysis. No alternative definition of texture independent of human performance has provided sufficiently precise guidance to enable development of general texture analysis algorithms. Since texture analysis algorithms attempt to model preattentive human vision, insights from vision science and from intuition have been used to guide the development of algorithms. The lack of precise guidance from a definition of texture and the lack of generality in existing algorithms have resulted in a profusion of diverse approaches to texture analysis.

A recent theory concerning the early information processing strategies in human vision was used to motivate a new approach to texture analysis. A feature space which measures average local energy was defined from filtered images and used in texture classification problems. The features were applied to artificial images, natural images, and preprocessed natural images.
The performance of the feature space in various classification tasks was investigated and compared with the power spectral and co-occurrence methods.

A method for computing a texture feature vector over a small neighborhood about each pixel in an image was developed for texture segmentation. A clustering algorithm was applied to the feature vectors, and a segmented image was produced by labelling each pixel according to the cluster in which the pixel's feature vector lies. Since each clustering solution corresponds to a segmented image, a means for selecting acceptable clusterings is required. A cluster validity statistic was used to determine which, if any, of a set of clusterings provided an acceptable representation of the data. A related statistic was used to rank the accepted solutions. Four texture segmentation experiments were performed using both composite and homogeneous images.

7.2 Conclusions

7.2.1 Classification

The channel filtering features were used to classify subimages of eight natural textures. The results using 64x64 and 32x32 subimages were good. The features performed poorly when presented 16x16 subimages. These results indicate that the features are suitable for classification of natural images based on 32x32 or larger subimages.

The effect of histogram equalization on the evaluation of natural textures by the channel filtering features was investigated in a series of experiments. We found that almost all of the histogram equalized images were perceived as having different textures from the original classes. Histogram equalization was also observed to confuse image classes which were originally separable. The results suggest that histogram equalization should be used carefully, if at all, with channel filtering features.

The channel filtering features were found to be insensitive to global, constant gray level changes. This implies that histogram equalization to remove differences in average brightness is unnecessary.
A procedure for determining whether two images portray the same texture at different magnifications was developed. The procedure was tested for 2X magnifications only. Certain aspects of the procedure might be simplified by using a different classification algorithm, but the results obtained were good.

A procedure for detecting orientation differences was developed which gave excellent performance for 90 degree orientation changes but was less reliable for 45 degree orientation changes due to fundamental differences in 45 degree rotated patterns caused by the rectangular quantization grid and the square image shape.

The effect of phase modification was investigated. The experiment demonstrated that images with identical power spectra could be discriminated by channel filtering features based on differences in the phase spectra.

An experiment comparing the co-occurrence method with the channel filtering method demonstrated that selection of displacement vectors for co-occurrence matrices is a serious problem. Even displacement vectors selected specifically to discriminate micropatterns can fail to discriminate images composed of the micropatterns. On the other hand, the channel filtering features were applied in the same way as in other experiments and produced excellent classification accuracy.

Experiments to test computational simplifications of the channel filtering procedure showed that the method is "robust" in that the classification accuracy is not degraded by using ideal band-pass channels or by eliminating certain channels.

7.2.2 Segmentation

Four experiments were performed to test the utility of the channel filtering approach for texture segmentation. In the first experiment, an image composed of a regular dot pattern and a random dot pattern was segmented. The statistical evaluation of the clustering solutions indicated that two clusters existed in the feature space. The labellings of the pixels corresponded closely to the actual image areas.
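A test image like the one in that first experiment can be approximated in a few lines. The sketch below is only an illustration under stated assumptions: "separated by 3 pixels" is read as one dot every third row and column, a pixel value of 1 stands for a dot, and the function name `make_dots_image` and the seed are hypothetical. The random half is drawn pixel by pixel with the dot density of the regular half, so the two halves have the same average gray value in expectation rather than exactly.

```python
import numpy as np

def make_dots_image(size=128, period=3, seed=0):
    """Left half: regular dot lattice. Right half: independent random dots."""
    rng = np.random.default_rng(seed)
    half = size // 2
    img = np.zeros((size, size), dtype=np.uint8)
    # Assumption: one dot every `period` pixels horizontally and vertically.
    img[::period, 0:half:period] = 1
    # Dot probability for the random half = dot density of the regular half.
    p = img[:, :half].mean()
    img[:, half:] = (rng.random((size, size - half)) < p).astype(np.uint8)
    return img, p
```

With the default arguments the regular half contains 43 x 22 dots, which gives a density of roughly 0.115, close to the value of 1/9 quoted for the experiment.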
Segmentation of a Gaussian white noise image yielded irregularly shaped, non-contiguous segments throughout the image. The cluster validity statistic found no acceptable clustering solution, so the image was correctly identified as portraying a single texture.

A composite image composed of four natural textures was segmented. The statistical evaluation indicated that the best solution involved three clusters. This result was considered reasonable due to the visual similarity of two of the actual texture classes. The second best solution involved four clusters and the segmentation generally corresponded to the actual textures in the image.

Another experiment on a homogeneous natural image resulted in a correct identification of the image as a homogeneous texture. Emerging clusters were found, but they were not well-defined enough to pass the statistical validity test.

7.2.3 General

The channel filtering feature space has been evaluated on a variety of texture classification and segmentation problems. The results indicate that the feature space is a good representation of texture for these problems. Equivalently, the feature space has been validated as a model for preattentive human vision on a variety of stimuli. The investigation of this feature space has involved tests in which visually discriminable image classes were expected to be discriminated by the features and in which visually indiscriminable image classes generated in different ways were expected to be confused by the features.

The classification and segmentation procedures used to evaluate the feature space did not involve sophisticated heuristics: the results are consequences of the structure of the feature space. Some results, in particular the magnification experiment and the results of the segmentation experiments, suggest that a minimum distance classifier may be more appropriate than a nearest neighbor classifier for the channel filtering feature space.
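The difference between the two decision rules can be made concrete with a small sketch. It is an illustration only: the function names are hypothetical, Euclidean distance is assumed, and the two-dimensional sample data are invented to show a case where the rules disagree; they are not taken from the experiments.

```python
import numpy as np

def minimum_distance_classify(train_X, train_y, x):
    """Assign x to the class whose mean feature vector is closest."""
    classes = np.unique(train_y)
    means = np.array([train_X[train_y == c].mean(axis=0) for c in classes])
    return classes[np.argmin(np.linalg.norm(means - x, axis=1))]

def nearest_neighbor_classify(train_X, train_y, x):
    """Assign x to the class of the single closest training pattern."""
    return train_y[np.argmin(np.linalg.norm(train_X - x, axis=1))]

# Invented data: class 0 has one outlying pattern at (5, 5).
train_X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [5, 5],
                    [6, 6], [7, 6], [6, 7], [7, 7]], dtype=float)
train_y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1])
query = np.array([4.6, 4.6])

by_mean = minimum_distance_classify(train_X, train_y, query)      # class 1
by_neighbor = nearest_neighbor_classify(train_X, train_y, query)  # class 0
```

The nearest neighbor rule follows the single outlying pattern, while the minimum distance rule depends only on the class means, which makes it less sensitive to isolated training patterns.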
Further investigation of the feature space may suggest that a different clustering procedure may be more appropriate for texture segmentation.

The performance of the channel filtering features demonstrates that a global filtering model of vision is capable of reasonably approximating human preattentive vision. This contrasts with the results of previous studies which discounted the value of phase information for texture analysis and which suggested that the Fourier transform is not an appropriate tool for texture analysis. This study shows that phase information provides critical information on the distribution of spectral energy through the image plane. The use of phase information enables spatially local analysis of different spectral energy bands. This information was measured by the channel filtering features and provided good results.

7.3 Advantages of the Channel Filtering Method

1. The channel filtering features have a satisfying intuitive basis. Unlike some ad hoc procedures, the significance of the channel filtering features is explainable in terms of image properties such as contrast, edge density, local energy, and directionality. The explanations are a consequence of the intuitive interpretations of the channels and of the spatial filtering operation.

2. The channel filtering method provides classification results superior to those obtained by the power spectral method. The performance is attributed to the utilization of phase information in the channel filtering procedure.

3. The channel filtering method has no critical parameters similar to the choice of displacement vectors for the co-occurrence method.

4. A method for computing features for texture segmentation exists which does not require recomputation of the entire filtering procedure over every subimage of interest. This means that only a simple feature computation needs to be repeated to evaluate the textures in different regions.
In fact, the texture features for every pixel can be computed by a procedure easily amenable to parallel execution.

5. Simple procedures for classification and for segmentation produce good results. This indicates that the results are due to the structure of the feature space and not to a clever classification or segmentation procedure.

6. There are many opportunities for parallelism in computing the filtered images and the texture features. This could enable fast hardware implementations of the channel filtering procedure to be constructed with a high degree of modularity and at fairly low cost.

7.4 Disadvantages of the Channel Filtering Method

1. The computational and storage requirements for sequential, digital implementation of the channel filtering approach are too demanding for many applications. Other implementation methods using parallel hardware or optical/digital methods may make fast implementations possible. Among the current difficulties are the number of Fourier transforms, inverse transforms, and image multiplications required in the channel filtering procedure. These operations are very demanding when implemented sequentially.

2. A simple method for generating images which have specified characteristics in the texture feature space is not known. Such a procedure would enable further validation of the feature space by enabling generation of images which are "close" to a given image. This capability would enable a quantitative determination of how well the feature space models human preattentive perception. The major obstacles to development of such a method are the overlapping of the channels in the spatial frequency domain and the dependence of the feature values on phase spectrum information.

7.5 Suggestions for Further Research

1. Segmentation results have suggested that the minimum distance classification algorithm is more appropriate than the nearest-neighbor algorithm for the channel filtering feature space.
Further research is needed to confirm this hypothesis and to investigate the applicability of other decision rules.

2. Different clustering algorithms for use in the segmentation procedure should be investigated. CLUSTER is known to perform poorly in certain situations, and statistics similar to those used to evaluate the clusterings in this study can be defined for other algorithms.

3. Further investigation of the performance of the segmentation procedure is needed. The averaging windows used to compute the feature images could be implemented as another filtering operation, perhaps using a sequence of filters as in the original channel filtering procedure. The validity of the feature space for images which involve distinct forms should also be investigated.

4. The use of channel filtering for receding surfaces should be investigated. Since the apparent size of the texture on a receding surface shrinks progressively with distance, the spectral information from the surface should move steadily toward high-frequency channels. This would cause the feature vectors for pixels depicting the surface to be strung out through the feature space. A clustering procedure which can detect long, stringy clusters, such as the single-link algorithm, might be useful for analyzing surface structure.

5. Since the feature space developed in chapter 4 is based on a theory of human visual information processing, it may be useful to determine how faithfully the feature space reproduces human texture vision. Psychophysical experiments could help to guide the selection of classification and clustering algorithms which would duplicate human performance.

6. Spatial frequency domain filtering has been used to provide at least partial solutions for several computational vision problems, now including texture analysis. Revision of procedures for solving other computational vision problems using channel filtering as a framework could enable development of unified computational vision systems.
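Suggestion 4 in the list above mentions the single-link algorithm as a way to detect long, stringy clusters. The following is a minimal sketch of that idea, not a procedure from this study: it uses the fact that cutting a single-link hierarchy at a distance threshold yields the connected components of the graph that links every pair of points closer than the threshold. The function name `single_link_clusters`, the `cut` parameter, and the sample points are assumptions made for illustration.

```python
import numpy as np

def single_link_clusters(X, cut):
    """Label points by single-link clusters obtained at distance threshold `cut`."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    for i in range(n):
        for j in range(i + 1, n):
            if d[i, j] <= cut:
                parent[find(i)] = find(j)  # merge the two components
    roots = [find(i) for i in range(n)]
    _, labels = np.unique(roots, return_inverse=True)
    return labels

# Invented data: a chain of 10 points along a line plus a small separate blob.
chain = [[float(i), 0.0] for i in range(10)]
blob = [[0.0, 5.0], [0.5, 5.0], [0.0, 5.5]]
labels = single_link_clusters(np.array(chain + blob), cut=1.5)
```

In this example the whole chain receives one label even though its end points are far apart, which is the behavior that would be wanted for feature vectors strung out through the feature space.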
APPENDICES

Appendix A: A Catalog of Texture Definitions

1. "We may regard texture as what constitutes a macroscopic region. Its structure is simply attributed to the repetitive patterns in which elements or primitives are arranged according to a "placement rule"." [Tamura et al, 1978]

2. "We suggest the following operational definition of 'texture'. A region in an image has a constant texture if a set of local statistics or other local properties of the picture function are constant, slowly varying, or approximately periodic." [Sklansky, 1978]

3. "A texture will be considered to be a random field X(n,m) where n and m are integers." (They later specialize to Markov random fields.) [Conners and Harlow, 1980a]

4. "The image texture we consider is nonfigurative and cellular.... An image texture is described by the number and types of its (tonal) primitives and the spatial organization or layout of its (tonal) primitives... a fundamental characteristic of texture: it cannot be analyzed without a frame of reference of tonal primitive being stated or implied. For any smooth gray-tone surface, there exists a scale such that when the surface is examined, it has no texture. Then as resolution increases, it takes on a fine texture and then a coarse texture." [Haralick, 1979]

5. "Image texture refers to the visual sensation that one receives about the structure arrangement of an image region. The textural properties of a scene are often qualitatively described as coarse, grainy, striated, or rough. Computer simulations have been done in which overlapping particles have been placed at random locations. When only a few particles are present, the visual impression is of discrete, countable objects. However, when the number of objects is increased, the visual impression is of texture rather than countable objects." [Hall et al, 1977]

6. "Texture is defined for our purposes as an attribute of a field having no components that appear enumerable.
The phase relations between the components are thus not apparent. Nor should the field contain an obvious gradient. The intent of this definition is to direct the attention of the observer to the global properties of the display - i.e. its overall "coarseness", "bumpiness", or "fineness". Physically, nonenumerable (aperiodic) patterns are generated by stochastic as opposed to deterministic processes. Perceptually, however, the set of all patterns without obvious enumerable components will include many deterministic (and even periodic) textures. Because our criterion for enumerability was subjective rather than objective, many of our patterns actually contained repetitive elements which were not immediately obvious but could be identified when the observer specifically looked for these components. To further minimize very obvious enumerable periodic components of the patterns, all displays contained only components whose spatial frequencies were proportional to prime numbers. We would like to stress that the above constraints imposed on our textures are all designed to minimize the importance of phase information - a variable we consider more important for pattern recognition than for texture perception." [Richards and Polit, 1974]

7. "Images composed of numerous binary pixels which are only weakly correlated with their neighbors form one extreme in a continuum of scene types whose opposite extreme is exemplified by simple line drawings against a uniform background. In the latter case the image pixels are highly redundant and the image consequently carries little information in the sense of Shannon, whereas in the former case, redundancy is low and the information content is high: normally so high that the vision system must selectively disregard information in order to process the scene.
The limiting extremes of this continuum of images are the uniform ("constant”) image all of whose pixels are identical, and the random image, whose pixel arrangement is completely determined by a probability distribution. The random image appears to be homogeneous (and therefore completely redundant) in a global sense: that is, different subregions large enough to contain many pixels convey an equivalent subjective impression. This type of homogeneity, which is different from a locally highly redundant image, is usually called "texture", and is characteristic of the structure of the probability distribution of the random image.” [Resnikoff, 1981] 8. ”Texture is an apparently paradoxical notion. 0n the one hand, it is commonly used in the early processing of visual information, especially for practical classification purposes. 0n the other hand, no one has succeeded in producing a commonly accepted definition of texture. The resolution of this paradox, we feel, will depend on a richer, more developed model for early visual information processing, a central aspect of which will be representational systems at many different levels of abstraction. These levels will most 155 probably include actual intensities at the bottom and will progress through edge and orientation descriptors to surface, and perhaps volumetric, descriptors. Given these multi-level structures, it seems clear that they should be included in the definition of, and in the computation of, texture descriptors." [Zucker and Kant, 1981] 9. ”The notion of texture appears to depend upon three ingredients: (1) some local 'order' is repeated over a region which is large in comparison to the order's size, (2) the order consists in the nonrandom arrangement of elementary parts, and (3) the parts are roughly uniform entities having approximately the same dimensions everywhere within the textured region." [Hawkins, 1969] 10. 
"Although these descriptions of texture seem perceptually reasonable, they do not immediately lead to simple quantitative textural measures in the sense that lthe description of edge discontinuity leads to the quantitative definition of an edge in terms of its location, slope angle, and height.” [Pratt, I978] APPENDIX 8: Definition of Channel Filters The transfer function of a 2Nx2N spatial frequency channel filter Fk(u,v), where -N+I (. u,v <= N, is defined as follows: Imag [ Fk(u,v) ] = 0 for all u,v,k, Real [ Fk(0,0) ] = l for all k, and for (u,v) $ (0,0) , 2 [10(D(U.v)) - 10941) 1 Real [ Fk(u,v) ] = exp [ -.5 * ------------- 5' --------- ] 6' 2 2 1/2 k-l where D(u,v) - [u +v ] , 5 - .275. and f‘k- 2 . For a 2Nx2N image, we will use spatial frequency channels for k=I,...(logz N + I). This definition yields a series of filters one octave apart whose widths are slightly more than one octave on each side of the center frequency. The value of O" was selected to produce filters whose widths are within the constraints for human visual filters given in [Ginsburg, I978]. 156 157 The transfer function for a 2Nx2N orientation channel filter Gk(u,v) where -N+I <- u,v <- N is defined as follows: Imag [ Gk(u,v) ] I 0 for all u,v,k, Real [ Gk(0,0) ] 8 1 for all k, and for (u,v);E(0,0), 2 Real [ Gk(u,v) ] - exp [ -,5 A ----32---- ] (71 where ak- Min { l/J‘K- arctan(v/u)| , | (Ink-18°) - arctan(v,u)| }, 6 - 17.8533, 1 <- k <- A, and the values of [AK are given in the table below for each value of k. The value of (5 is chosen to produce the same overlap between channels as in the spatial frequency channels. Note that the orientation channels defined here are slightly wider then the 30 degree wide channels proposed for human vision in [Ginsburg, 1978]. Values ofIAk(in degrees) for each k k M l 0 (horizontal) 2 A5 3 90 A 135 List of References List of References Agin, G. J., 1980, ”Computer Vision Systems for Industrial Inspection and Assembly," Computer, 13: 11-20. Aggarwal, J. 
K., L. S. Davis and W. N. Martin, 1981, "Correspondence Processes in Dynamic Scene Analysis," Proceedings of the IEEE, 69: 562-572.
Ahuja, N. and A. Rosenfeld, 1981, "Mosaic Models for Textures," IEEE Transactions on Pattern Analysis and Machine Intelligence, 3: 1-11.
Anderberg, M. R., 1973, Cluster Analysis for Applications, New York: Academic Press.
Bacus, J. W., 1976, "A Whitening Transformation for Two-Color Blood Cell Images," Pattern Recognition, 8: 53-60.
Bajcsy, R., 1973, "Computer Identification of Visual Surfaces," Computer Graphics and Image Processing, 2: 118-130.
Bajcsy, R. and L. Lieberman, 1976, "Texture Gradient as a Depth Cue," Computer Graphics and Image Processing, 5: 52-67.
Barrow, H. G. and J. M. Tennenbaum, 1981, "Computational Vision," Proceedings of the IEEE, 69: 572-595.
Beck, J., 1980, "Texture Segmentation," University of Maryland, Computer Science Technical Report TR-971.
Besag, J., 1974, "Spatial Interaction and the Statistical Analysis of Lattice Systems," Journal of the Royal Statistical Society, B36: 192-236.
Blakemore, C. and F. W. Campbell, 1969, "On the Existence of Neurons in the Human Visual System Selectively Sensitive to the Orientation and Size of Retinal Images," Journal of Physiology, 203: 237-260.
Brady, M., 1982, "Computational Approaches to Image Understanding," Computing Surveys, 14: 3-71.
Brodatz, P., 1966, Textures: A Photographic Album for Artists and Designers, New York: Dover.
Caelli, T. M. and B. Julesz, 1978a, "On Perceptual Analyzers Underlying Visual Texture Discrimination - Part 1," Biological Cybernetics, 28: 167-175.
Caelli, T. M. and B. Julesz, 1978b, "On Perceptual Analyzers Underlying Visual Texture Discrimination - Part 2," Biological Cybernetics, 29: 201-214.
Campbell, F. W. and L. Maffei, 1970, "Electrophysiological Evidence for the Existence of Orientation and Size Detectors in the Human Visual System," Journal of Physiology, 207: 635-652.
Campbell, F. W. and J. G.
Robson, 1968, "Application of Fourier Analysis to the Visibility of Gratings," Journal of Physiology, 197: 551-566.
Carlucci, L., 1972, "A Formal System for Texture Languages," Pattern Recognition, 4: 53-72.
Chellappa, R. and R. L. Kashyap, 1981a, "On the Correlation Structure of Random Field Models of Images and Textures," Proceedings of the IEEE Conference on Pattern Recognition and Image Processing, Dallas. 574-576.
Chellappa, R. and R. L. Kashyap, 1981b, "Synthetic Generation and Estimation in Random Field Models of Images," Proceedings of the IEEE Conference on Pattern Recognition and Image Processing, Dallas. 577-582.
Coleman, G. B. and H. C. Andrews, 1979, "Image Segmentation by Clustering," Proceedings of the IEEE, 67: 773-785.
Conners, R. W., 1979, "Towards a Set of Statistical Features Which Measure Visually Perceivable Qualities of Textures," Proceedings of the IEEE Conference on Pattern Recognition and Image Processing, Chicago. 382-390.
Conners, R. W. and C. A. Harlow, 1980a, "A Theoretical Comparison of Texture Algorithms," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2: 204-222.
Conners, R. W. and C. A. Harlow, 1980b, "Toward a Structural Texture Analyzer Based on Statistical Methods," Computer Graphics and Image Processing, 12: 224-256.
Cross, G. R. and A. K. Jain, 1981, "Markov Random Field Texture Models," Proceedings of the IEEE Conference on Pattern Recognition and Image Processing, Dallas. 597-602.
Crowley, J. and A. Parker, 1978, "The Analysis, Synthesis and Evaluation of Local Measures for Discrimination and Segmentation of Textured Regions," Proceedings of the IEEE Conference on Pattern Recognition and Image Processing, Chicago. 372-378.
Darling, E. M. and R. D. Joseph, 1968, "Pattern Recognition from Satellite Altitudes," IEEE Transactions on Systems, Man and Cybernetics, 4: 38-47.
D'Astous, F. and M.
Jernigan, 1981, "Phase Information in Texture Feature Extraction," Proceedings of the International Conference on Cybernetics and Society, Atlanta. 182-186.
Davis, L., S. Johns and J. Aggarwal, 1979, "Texture Analysis Using Generalized Co-occurrence Matrices," IEEE Transactions on Pattern Analysis and Machine Intelligence, 1: 251-259.
Davis, L. and A. Mitiche, 1982, "MITES: A Model-Driven Iterative Texture Segmentation Procedure," Computer Graphics and Image Processing, 19: 95-110.
Deguchi, K. and I. Morishita, 1978, "Texture Characterization and Texture-Based Image Partitioning Using Two-dimensional Linear Estimation Techniques," IEEE Transactions on Information Theory, 27: 733-745.
Dubes, R. C. and A. K. Jain, 1976, "Clustering Techniques: The User's Dilemma," Pattern Recognition, 8: 247-260.
Dubes, R. C. and A. K. Jain, 1979, "Validity Studies in Clustering Methodologies," Pattern Recognition, 11: 235-254.
Duda, R. O. and P. E. Hart, 1973, Pattern Classification and Scene Analysis, New York: Wiley.
Ehrich, R. and J. P. Foith, 1976, "Representation of Random Waveforms by Relational Trees," IEEE Transactions on Computers, 25, no. 7: 725-736.
Ehrich, R. and J. P. Foith, 1978, "A View of Texture Topology and Texture Description," Computer Graphics and Image Processing, 8: 174-202.
Eklundh, J., 1979, "On the Use of Fourier Phase Features for Texture Discrimination," Computer Graphics and Image Processing, 9: 199-201.
Everitt, B., 1974, Cluster Analysis, London: Heinemann Educational Books.
Faugeras, O. D. and W. K. Pratt, 1980, "Decorrelation Methods of Texture Feature Extraction," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2: 323-332.
Fu, K. S., 1974, Syntactic Methods in Pattern Recognition, New York: Academic Press.
Fu, K. S., 1977, Syntactic Pattern Recognition Applications, New York: Springer-Verlag.
Gagalowicz, A., 1978, "Analysis of Texture Using a Stochastic Model," Proceedings of the Fourth International Conference on Pattern Recognition, Tokyo. 541-544.
Gagalowicz, A., 1981, "A New Method for Texture Fields Synthesis: Some Applications to the Study of Human Vision," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2: 520-533.
Galloway, M., 1974, "Texture Analysis Using Gray Level Run Lengths," Computer Graphics and Image Processing, 4(2): 172-179.
Garber, D. and A. Sawchuk, 1981, "Computational Models for Texture Analysis and Synthesis," Proceedings of the DARPA Image Understanding Workshop, Washington, D. C. 69-88.
Ginsburg, A. P., 1973, "Pattern Recognition Techniques Suggested from Psychological Correlates of a Model of the Human Visual System," Proceedings of the IEEE National Aerospace Electronics Conference, 1972, Dayton, OH: 309-316.
Ginsburg, A. P., 1976, "The Perception of Visual Form: A Two-Dimensional Filter Analysis" in Information Processing in the Visual System, Proceedings of the IV Symposium on Sensory System Physiology, I. P. Pavlov Institute of Physiology, Leningrad, U. S. S. R., V. D. Glezer, editor. 46-51.
Ginsburg, A. P., 1978, Visual Information Processing Based on Spatial Filters Constrained by Biological Data, Dissertation for Ph. D. University of Cambridge, England, 1977. Also published as Air Force Aerospace Medical Research Laboratory Technical Report AMRL-TR-78-129, December, 1978.
Ginsburg, A. P., 1979a, "Visual Perception Based on Spatial Filtering Constrained by Biological Data," Proceedings of the International Conference on Cybernetics and Society, Denver. 453-457.
Ginsburg, A. P., 1979b, "Spatial Filtering and Mechanisms of Perception," Proceedings of the Tenth Annual Pittsburgh Conference on Modelling and Simulation, Pittsburgh. 185-192.
Ginsburg, A.
P., 1980a, "Visual Perception Based on Biological Filtering of Spatial Information," Proceedings of the International Conference on Cybernetics and Society, Cambridge, MA. 424-428.
Ginsburg, A. P., 1980b, "Specifying Relevant Information for Image Evaluation and Display Design: An Explanation of How We See Certain Objects," Proceedings of the Society for Industrial Design, 21: 219-227.
Ginsburg, A. P. and J. M. Coggins, 1981, "Texture Analysis Based on Filtering Properties of the Human Visual System," Proceedings of the International Conference on Cybernetics and Society, Atlanta. 112-117.
Graham, N., 1981, "The Visual System Does a Crude Fourier Analysis of Patterns," in Mathematical Psychology and Psychophysiology, Stephen Grossberg editor. SIAM and AMS Proceedings Series. 1-16.
Gurari, E. M. and H. Wechsler, 1982, "On the Difficulties Involved in Segmentation of Pictures," IEEE Transactions on Pattern Analysis and Machine Intelligence, 4: 304-306.
Hall, E. L., 1972, "A Comparison of Computations for Spatial Frequency Filtering," Proceedings of the IEEE, 60: 887-891.
Hall, E. L., B. K. Rouge and R. P. Kruger, 1977, "Automated Chest X-Ray Analysis," SPIE Applications of Optics in Medicine and Biology, 89: 109-118.
Haralick, R. M., 1975, "A Resolution-Preserving Textural Transform for Images," Proceedings of the IEEE Conference on Computer Graphics, Pattern Recognition, and Data Structures, Los Angeles. 51-54.
Haralick, R. M., 1979, "Statistical and Structural Approaches to Texture," Proceedings of the IEEE, 67(5): 786-804.
Haralick, R. M. and K. Shanmugam, 1973, "Computer Classification of Reservoir Sandstones," IEEE Transactions on Geoscience Electronics, 11: 171-177.
Haralick, R. M., K. Shanmugam and I. Dinstein, 1973, "Textural Features for Image Classification," IEEE Transactions on Systems, Man and Cybernetics, 3: 610-621.
Harvey, L. O. and M. J.
Gervais, 1981, "Internal Representation of Visual Texture and the Basis for the Judgement Similarity," Journal of Experimental Psychology: Human Perception and Performance, 7: 741-753.
Hassner, M. and J. Sklansky, 1978, "Markov Random Fields as Models of Digitized Image Texture," Proceedings of the IEEE Conference on Pattern Recognition and Image Processing, Chicago. 346-351.
Hawkins, J. K., 1969, "Textural Properties for Pattern Recognition," in Picture Processing and Psychopictorics, B. Lipkin and A. Rosenfeld, editors, New York: Academic Press.
Hayes, K. C., A. N. Shah and A. Rosenfeld, 1974, "Texture Coarseness: Further Experiments," IEEE Transactions on Systems, Man and Cybernetics, 4: 467-472.
Hildreth, E. C., 1980, "Implementation of a Theory of Edge Detection," MIT Artificial Intelligence Laboratory AI-TR-579.
Hofstadter, D. H., 1979, Godel, Escher, Bach: An Eternal Golden Braid, New York: Basic.
Hubel, D. H., 1963, "The Visual Cortex of the Brain," Scientific American, 209(November 1963): 54-62.
Jain, A. K., S. P. Smith and E. Backer, 1980, "Segmentation of Muscle Cell Pictures: A Preliminary Study," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2: 232-242.
Johnson, L. R. and A. K. Jain, 1981, "An Efficient 2-Dimensional FFT Algorithm," IEEE Transactions on Pattern Analysis and Machine Intelligence, 3: 698-701.
Julesz, B., 1962, "Visual Pattern Discrimination," IRE Transactions on Information Theory, 8: 840-892.
Julesz, B., 1965, "Texture and Visual Perception," Scientific American, 212(February 1965): 38-54.
Julesz, B., 1975, "Experiments in the Visual Perception of Texture," Scientific American, 232(April 1975): 34-43.
Julesz, B., 1981, "Textons, the Elements of Texture Perception, and Their Interactions," Nature, 290(12 March 1981): 91-97.
Julesz, B. and T. Caelli, 1979, "On the Limits of Fourier Decompositions in Visual Texture Perception," Perception, 8: 69-73.
Julesz, B., E. N. Gilbert, L. A. Shepp and H. L.
Frisch, 1973, "Inability of Humans to Discriminate Between Visual Textures That Agree in Second-Order Statistics - Revisited," Perception, 2: 391-405.
Julesz, B., E. N. Gilbert and J. D. Victor, 1978, "Visual Discrimination of Textures with Identical Third-Order Statistics," Biological Cybernetics, 31: 137-140.
Kashyap, R. L., 1980, "Univariate and Multivariate Random Field Models for Images," Computer Graphics and Image Processing, 12: 257-270.
Kettig, R. and D. Landgrebe, 1976, "Classification of Multispectral Image Data by Extraction and Classification of Homogeneous Objects," IEEE Transactions on Geoscience Electronics, 14, no. 1: 19-26.
Landeweerd, G. H. and E. S. Gelsema, 1977, "The Use of Nuclear Texture Parameters in the Automatic Analysis of Leucocytes," Pattern Recognition, 9: 57-61.
Landgrebe, D. A., 1981, "Analysis Technology for Land Remote Sensing," Proceedings of the IEEE, 69: 628-642.
Laws, K. I., 1980, "Textured Image Segmentation," University of Southern California Image Processing Institute TR-940.
Lendaris, G. and G. Stanley, 1970, "Diffraction Pattern Sampling for Automatic Pattern Recognition," Proceedings of the IEEE, 58: 198-216.
Lu, S. Y. and K. S. Fu, 1978a, "A Syntactic Approach to Texture Analysis," Computer Graphics and Image Processing, 7: 303-330.
Lu, S. Y. and K. S. Fu, 1978b, "Stochastic Tree Grammar Inference for Texture Synthesis and Discrimination," Proceedings of the IEEE Conference on Pattern Recognition and Image Processing, Chicago. 340-345.
Marr, D., 1980, "Visual Information Processing: The Structure and Creation of Visual Representations," Philosophical Transactions of the Royal Society of London, B290: 199-217.
Marr, D. and E. Hildreth, 1980, "Theory of Edge Detection," Proceedings of the Royal Society of London, B207: 187-217.
Marr, D. and M. K. Nishihara, 1978, "Visual Information Processing: AI and the Sensorium of Sight," Technology Review, 81: 28-49.
Marr, D. and T.
Poggio, 1979, "A Computational Theory of Stereo Vision," Philosophical Transactions of the Royal Society of London, B204: 301-328.
Marr, D., S. Ullman and T. Poggio, 1979, "Bandpass Channels, Zero-Crossings, and Early Visual Information Processing," Journal of the Optical Society of America, 69: 914-916.
McCormick, B. H. and S. N. Jayaramamurthy, 1974, "Time Series Model for Texture Synthesis," International Journal of Computer and Information Science, 3(4): 329-343.
McCormick, B. H. and S. N. Jayaramamurthy, 1975, "A Decision Theory Method for the Analysis of Texture," International Journal of Computer and Information Science, 4(1): 1-38.
Mitchell, O. R. and S. C. Carlton, 1978, "Image Segmentation Using a Local Extrema Texture Measure," Pattern Recognition, 10: 205-210.
Mitchell, O. R., C. Myers and W. Boyne, 1977, "A Max-Min Measure for Image Texture Analysis," IEEE Transactions on Computers, 25: 408-414.
Modestino, J. W., R. W. Fries and A. L. Vickers, 1981, "Texture Discrimination Based Upon an Assumed Stochastic Texture Model," IEEE Transactions on Pattern Analysis and Machine Intelligence, 3: 557-580.
Mostafavi, H. and D. Sakrison, 1976, "Structure and Properties of a Single Channel in the Human Visual System," Vision Research, 16: 957-968.
Mui, J. K., K. S. Fu and J. W. Bacus, 1977, "Automatic Classification of Blood Cell Neutrophils," Journal of Histochemistry and Cytochemistry, 25: 633-640.
Nathan, R., 1970, "Spatial Frequency Filtering," in Picture Processing and Psychopictorics, B. Lipkin and A. Rosenfeld, editors. 151-163.
Nilsson, N. J., 1980, Principles of Artificial Intelligence, Palo Alto: Tioga.
Ochs, A. L., 1979, "Is Fourier Analysis Performed by the Visual System or by the Visual Investigator?" Journal of the Optical Society of America, 69: 95-98.
Oppenheim, A. V. and J. S. Lim, 1981, "The Importance of Phase in Signals," Proceedings of the IEEE, 69: 529-541.
Pantle, A. and R.
Sekuler, 1968, "Size-Detecting Mechanisms in Human Vision," Science, 162: 1146-1148.
Papoulis, A., 1962, The Fourier Integral and Its Applications, New York: McGraw-Hill.
Papoulis, A., 1965, Probability, Random Variables and Stochastic Processes, New York: McGraw-Hill.
Pavlidis, T., 1977, Structural Pattern Recognition, New York: Springer-Verlag.
Perkins, W. A., 1978, "A Model-Based Vision System for Industrial Parts," IEEE Transactions on Computers, 27: 126-143.
Pratt, W. K., 1978, Digital Image Processing, New York: Wiley.
Pratt, W. K., O. D. Faugeras and A. Gagalowicz, 1978, "Visual Discrimination of Stochastic Texture Fields," IEEE Transactions on Systems, Man and Cybernetics, 8: 796-804.
Pratt, W. K., O. D. Faugeras and A. Gagalowicz, 1981, "Applications of Stochastic Texture Field Models to Image Processing," Proceedings of the IEEE, 69: 542-551.
Pressman, N. J., 1976, "Markovian Analysis of Cervical Cell Images," Journal of Histochemistry and Cytochemistry, 24(1): 138-144.
Resnikoff, H. L., 1981, "Selective Omission of Information and Machine Intelligence," Presented at the Machine Intelligence and Perception Symposium at the Annual Meeting of the American Association for the Advancement of Science, Toronto.
Richards, W. and A. Polit, 1974, "Texture Matching," Kybernetik, 16: 155-162.
Rosenblatt, M. and D. Slepian, 1962, "Nth Order Markov Chains with Any Set of N Variables Independent," Journal of the Society for Industrial and Applied Mathematics, 10: 537-549.
Rosenfeld, A., 1979, "Some Recent Developments in Texture Analysis," Proceedings of the IEEE Conference on Pattern Recognition and Image Processing, Chicago. 618-622.
Rosenfeld, A., 1981, "Image Pattern Recognition," Proceedings of the IEEE, 69: 596-605.
Rosenfeld, A. and A. C. Kak, 1981, Digital Picture Processing, second edition, New York: Academic Press.
Rosenfeld, A. and M. Thurston, 1971, "Edge and Curve Detection for Visual Scene Analysis," IEEE Transactions on Computers, 20: 562-569.
Rosenfeld, A., M. Thurston and Y. Lee, 1972, "Edge and Curve Detection: Further Experiments," IEEE Transactions on Computers, 21(7): 677-715.
Rosenfeld, A. and E. Troy, 1974, "Visual Texture Analysis," Computer Graphics and Image Processing, 4: 172-179.
Sachs, M. B., J. Nachmias and J. G. Robson, 1971, "Spatial Frequency Channels in Human Vision," Journal of the Optical Society of America, 61: 1176-1186.
Schachter, B., A. Rosenfeld and L. S. Davis, 1978a, "Random Mosaic Models for Textures," IEEE Transactions on Systems, Man and Cybernetics, 8: 694-702.
Schachter, B., A. Rosenfeld and L. S. Davis, 1978b, "Some Experiments in Image Segmentation by Clustering of Local Feature Values," Pattern Recognition, 11: 19-28.
Schmitt, F. and D. Massaloux, 1981, "Texture Synthesis Using a Markov Model," Proceedings of the IEEE Conference on Pattern Recognition and Image Processing, Dallas. 593-596.
Shanmugam, K. S., F. M. Dickey and J. A. Green, 1979, "An Optimal Frequency Domain Filter for Edge Detection in Digital Pictures," IEEE Transactions on Pattern Analysis and Machine Intelligence, 1: 37-49.
Siromoney, G., R. Siromoney and K. Krithivasan, 1972, "Abstract Families of Matrices and Picture Languages," Computer Graphics and Image Processing, 1: 284-307.
Sklansky, J., 1978, "Image Segmentation and Feature Extraction," IEEE Transactions on Systems, Man and Cybernetics, 8: 237-247.
Stevens, K., 1980, "Surface Perception from Local Analysis of Texture and Contour," MIT Artificial Intelligence Lab TR-512.
Tamura, H., S. Mori and Y. Yamawaki, 1978, "Textural Features Corresponding to Visual Perception," IEEE Transactions on Systems, Man and Cybernetics, 8(6): 460-473.
Tennenbaum, J. M., et al, 1978, "Prospects for Industrial Vision," SRI International Technical Report TR-175.
Thompson, W., 1977, "Textural Boundary Analysis," IEEE Transactions on Computers, 26: 272-276.
Tomita, F., 1981, "Hierarchical Description of Textures," Proceedings of the International Joint Conference on Artificial Intelligence, Vancouver. 728-733.
Tomita, F., M. Yachida and S. Tsuji, 1973, "Detection of Homogeneous Regions by Structural Analysis," Proceedings of the International Joint Conference on Artificial Intelligence, Palo Alto. 564-571.
Tomita, F. and M. Yachida, 1973, "A Structural Analyzer for a Class of Textures," Computer Graphics and Image Processing, 2: 216-231.
Trussel, H. J., 1981, "Processing of X-Ray Images," Proceedings of the IEEE, 69: 615-627.
Tsuji, S. and F. Tomita, 1973, "A Structural Analyzer for a Class of Textures," Computer Graphics and Image Processing, 2: 216-231.
Vilnrotter, F., R. Nevatia and K. Price, 1981, "Structural Analysis of Natural Textures," Proceedings of the DARPA Image Understanding Workshop, Washington, D. C. 61-68.
Wechsler, H., 1980, "Texture Analysis - A Survey," Signal Processing, 2: 271-282.
Weszka, J., C. Dyer and A. Rosenfeld, 1976, "A Comparative Study of Texture Measures for Terrain Classification," IEEE Transactions on Systems, Man and Cybernetics, 6(4): 269-285.
Wilson, H. R. and J. R. Bergen, 1979, "A Four-Mechanism Model for Threshold Spatial Vision," Vision Research, 19: 19-32.
Winston, P. H., 1977, Artificial Intelligence, Reading, MA: Addison-Wesley.
Zucker, S. W., 1976, "Toward a Model of Texture," Computer Graphics and Image Processing, 5(2): 190-202.
Zucker, S. W., 1981, "Computer Vision and Human Perception: An Essay on the Discovery of Constraints," Proceedings of the IEEE Conference on Pattern Recognition and Image Processing, Dallas. 1102-1116.
Zucker, S. W. and P. Cavanaugh, 1980, "Constructive Texture Perception: Orientation Anisotropies in Discrimination," McGill University Dept. of Electrical Engineering. TR 80-8.
Zucker, S. W. and K.
Kant, 1981, "Multiple-Level Representations for Texture Discrimination," Proceedings of the IEEE Conference on Pattern Recognition and Image Processing, Dallas. 609-614.
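
Supplementary sketch of the Appendix B channel filters. The following Python fragment is a minimal illustration of how the spatial frequency and orientation channel transfer functions defined in Appendix B might be evaluated; it is not the implementation used in this dissertation. The function names (`frequency_channel`, `orientation_channel`, `channel_grid`) are hypothetical. Two points are assumptions rather than statements from the appendix: "log" is read as the natural logarithm, and arctan(v/u) is taken to be 90 degrees when u = 0.

```python
import math

SIGMA_FREQ = 0.275        # sigma for the spatial frequency channels (Appendix B)
SIGMA_ORIENT = 17.8533    # sigma, in degrees, for the orientation channels (Appendix B)
ORIENT_CENTERS = {1: 0.0, 2: 45.0, 3: 90.0, 4: 135.0}  # mu_k in degrees, from the table


def frequency_channel(k, u, v):
    """Real part of F_k(u, v); the imaginary part is zero everywhere."""
    if u == 0 and v == 0:
        return 1.0
    d = math.hypot(u, v)          # D(u, v) = sqrt(u^2 + v^2)
    mu = 2.0 ** (k - 1)           # center frequency mu_k = 2^(k-1)
    # Assumption: natural logarithm (the appendix only writes "log").
    z = (math.log(d) - math.log(mu)) / SIGMA_FREQ
    return math.exp(-0.5 * z * z)


def orientation_channel(k, u, v):
    """Real part of G_k(u, v); the imaginary part is zero everywhere."""
    if u == 0 and v == 0:
        return 1.0
    mu = ORIENT_CENTERS[k]
    # Assumption: arctan(v/u) is taken as 90 degrees on the line u = 0.
    theta = 90.0 if u == 0 else math.degrees(math.atan(v / u))
    alpha = min(abs(mu - theta), abs((mu - 180.0) - theta))
    return math.exp(-0.5 * (alpha / SIGMA_ORIENT) ** 2)


def channel_grid(channel, k, n):
    """Tabulate a channel on the 2N x 2N grid -N+1 <= u, v <= N (rows indexed by v)."""
    coords = range(-n + 1, n + 1)
    return [[channel(k, u, v) for u in coords] for v in coords]
```

Under the natural-logarithm reading, the response of a frequency channel half an octave from its center agrees closely with the response of an orientation channel 22.5 degrees from its center, which is consistent with the appendix's remark that both families of channels have the same overlap. To apply such a grid to a discrete Fourier transform, its rows and columns would still have to be reordered to match the frequency layout of the transform routine in use.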