This is to certify that the dissertation entitled Markov Random Fields in Image Analysis, presented by Chaur-Chin Chen, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Computer Science.

Major professor

Michigan State University
MARKOV RANDOM FIELDS IN IMAGE ANALYSIS

By

Chaur-Chin Chen

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Computer Science

1988

ABSTRACT

MARKOV RANDOM FIELDS IN IMAGE ANALYSIS

By Chaur-Chin Chen

Markov random fields (MRFs) or Gibbs random fields (GRFs) have been frequently used for image analysis. This thesis investigates fundamental problems of sampling and statistical inference of MRFs in image analysis. Applications in texture synthesis and texture classification are examined.

We define a class of GRFs based on single cliques and pairs of cliques, and prove a theorem which gives the exact formulas of the MRFs corresponding to our GRFs. Some commonly used MRFs are shown to be special cases of our GRFs. Their global and local distributions are given to further highlight the MRF-GRF equivalence theorem.

Criteria for validating a sampling algorithm are proposed. A sampling algorithm for sampling a GRF, which does not require the form of the local conditional distribution of an MRF, is proposed and shown to be computationally simpler than the Gibbs sampler. The phenomenon of "phase transition" occurring in an Ising model also occurs in other GRFs. We interpret this phenomenon as a result of a GRF having "multiple peaks". We show by experiments that this phenomenon may not appear if a sampling algorithm does not run for enough iterations. A methodology is proposed to detect unreliability, nonuniformity, and indistinguishability towards identifying unique "useful" GRFs.

New estimators of parameters, adapted from logit models, are investigated for binary MRFs. A nonparametric statistical test is proposed to compare the estimators when their distributions are not known.
A visual inspection of a given texture and a texture reproduced from an MRF model with estimated parameters may be misleading. We propose a ranking statistic and a χ² statistic to measure the goodness-of-fit when fitting MRFs to textures. The application of sampling, estimation, and goodness-of-fit testing to natural textures is examined.

The parameters of discrete MRFs are investigated as features for texture classification. Experiments for recognizing synthetic and natural textures are implemented. The results show that our approach performs better than the spatial gray level dependency matrices approach in recognizing some natural textures.

To my parents and my wife for their love and support

ACKNOWLEDGEMENTS

I wish to thank my major professor, Richard C. Dubes, for his help and guidance, both in thesis preparation and in the encouragement of my research efforts. Without him, my program would never have been completed. Special thanks also go to Professors Anil K. Jain and George C. Stockman, who provided valuable knowledge in image processing and computer vision. I also thank Professor James Stapleton, who served on my Ph.D. committee and taught the courses in probability and statistics on which this thesis was built.

I sincerely appreciate the discussions I had with Professor Raoul LePage on statistical computing problems in image processing. I also thank Dr. Mihran Tuceryan, who frequently provided interesting texture segmentation problems. I am also grateful to all the graduate students and visiting professors in the Pattern Recognition and Image Processing Laboratory for their help during my stay in the department.

Acknowledgement should also be made of the financial support I received during my long stay at MSU from NSF grants ECS-8300204 and ECS-8603541. The spiritual and financial support from my late grandparents is also appreciated.
Table of Contents

Chapter 1  Introduction
  1.1. MRF and GRF Models
  1.2. Texture Studies
    1.2.1. Texture Synthesis
    1.2.2. Texture Classification
  1.3. Image Segmentation
  1.4. Image Restoration
  1.5. Organization of the Thesis

Chapter 2  Gibbs Random Fields and Markov Random Fields
  2.1. Definitions
  2.2. A Class of Gibbs Random Fields
    2.2.1. Derin-Elliott Model
    2.2.2. Auto-binomial Model
    2.2.3. Auto-Poisson Model
    2.2.4. Auto-normal Model
  2.3. A Relationship Between a GRF and an MRF
  2.4. Summary

Chapter 3  Sampling Gibbs and Markov Random Fields
  3.1. Sampling Discrete GRFs
  3.2. Methods for Validating Sampling Algorithms
  3.3. The Comparison of Algorithms C and G
    3.3.1. Comparison of Algorithms for the Derin-Elliott Model
    3.3.2. Comparison of Algorithms for the Auto-binomial Model
  3.4. Towards Identifying Useful Discrete GRFs
    3.4.1. A Procedure for Detecting Unreliable GRFs
    3.4.2. A Procedure for Detecting Nonuniform GRFs
    3.4.3. Methodology for Identifying Useful GRFs
    3.4.4. Effects of Iterations on the Sampling Algorithm
    3.4.5. Phase Transition
  3.5. Sampling Auto-normal GRFs
  3.6. Discussion and Conclusion

Chapter 4  Statistical Inference on Gibbs and Markov Random Fields
  4.1. Parameter Estimation for Discrete GRFs
    4.1.1. Pseudo Maximum Likelihood Estimation (PMLE)
    4.1.2. Least Square Error Method (LSQR)
    4.1.3. Logit Model Fit Method (Logit)
    4.1.4. Minimum Logit χ² Method (Min-χ²)
    4.1.5. Discussion and Extension
  4.2. Parameter Estimation for the Auto-normal Model
    4.2.1. Pseudo Maximum Likelihood Estimation (PMLE)
    4.2.2. Minimum Sum of Square Error Method (MSSE)
    4.2.3. Maximum Likelihood Estimation (MLE)
    4.2.4. Discussion
  4.3. Goodness-of-Fit Test
    4.3.1. A Nonparametric Ranking Test
    4.3.2. A χ² Test
  4.4. The Comparison of Estimation Methods
    4.4.1. Comparison of Estimation Methods for the Derin-Elliott Model
    4.4.2. Comparison of Estimation Methods for the Auto-binomial Model
    4.4.3. Comparison of Estimation Methods for the Auto-normal Model
  4.5. Fitting MRF Models to Natural Textures
  4.6. Discussion and Conclusion

Chapter 5  Texture Classification
  5.1. A Texture Classification System
  5.2. Feature Definition and Extraction
    5.2.1. Features Derived from SGLDM
    5.2.2. Features Extracted by Fitting a Discrete MRF to a Texture
  5.3. Proposed Recognition Methodology
  5.4. Experiments
  5.5. Discussion and Conclusion

Chapter 6  Conclusion, Discussion, and Future Research
  6.1. Summary of Results
  6.2. Discussion and Future Research

Appendix A  Proof of The Proposition in Chapter 2
Appendix B  Fisher's Linear Classifier
Appendix C  Equal Probability Quantization
List of References

List of Tables

Table 3-1  Frequency of Colors for Textures in Experiment DE1
Table 3-2  Mean, Std. of Pixel Changes and Energy for Experiment DE1
Table 3-3  Estimated Parameters for Textures in Experiment DE1
Table 3-4  Frequency of Colors for Textures in Experiment DE2
Table 3-5  Mean, Std. of Pixel Changes and Energy for Experiment DE2
Table 3-6  Estimated Parameters for Textures in Experiment DE2
Table 3-7  Frequency of Colors for Textures in Experiment AB1
Table 3-8  Mean, Std. of Pixel Changes and Energy for Experiment AB1
Table 3-9  Estimated Parameters for Textures in Experiment AB1
Table 3-10  Frequency of Colors for Textures in Experiment AB2
Table 3-11  Mean, Std. of Pixel Changes and Energy for Experiment AB2
Table 3-12  Estimated Parameters for Textures in Experiment AB2
Table 3-13  No. of Uniform Images (out of 20) in Experiments PH1, PH2, PH3
Table 3-14  Estimated Parameters for Images in Experiment PH3
Table 3-15  Frequency of Colors for Images in Experiment PH3 (500 Iterations)
Table 4-1  True Parameter Vectors for the Derin-Elliott Model
Table 4-2  Comparison for the Derin-Elliott Model
Table 4-3  Comparison for the Auto-binomial Model
Table 4-4  True Parameter Vectors θ for the Auto-normal Model
Table 4-5  Comparison for the Auto-normal Model
Table 4-6  Results of Fitting Derin-Elliott Model to Natural Textures
Table 4-7  Results of Fitting GMRF Model to Natural Textures
Table 5-1  Leave-one-out Errors on Experiments 5-1, 5-2, 5-3

List of Figures

Figure 1-1  Examples of structural textures
Figure 1-2  Examples of statistical textures
Figure 2-1  The sites and orders of geometric neighbors of site t
Figure 2-2  The clique types of the 2nd-order neighborhood
Figure 3-1  Images generated by Algorithm C for Experiment DE1
Figure 3-2  Images generated by Algorithm G for Experiment DE1
Figure 3-3  No. of pixel changes vs. iteration for Experiment DE1
Figure 3-4  Energy vs. iteration for Experiment DE1
Figure 3-5  Images generated by Algorithm C for Experiment DE2
Figure 3-6  Images generated by Algorithm G for Experiment DE2
Figure 3-7  No. of pixel changes vs. iteration for Experiment DE2
Figure 3-8  Energy vs. iteration for Experiment DE2
Figure 3-9  Images generated by Algorithm C for Experiment AB1
Figure 3-10  Images generated by Algorithm G for Experiment AB1
Figure 3-11  No. of pixel changes vs. iteration for Experiment AB1
Figure 3-12  Energy vs. iteration for Experiment AB1
Figure 3-13  Images generated by Algorithm C for Experiment AB2
Figure 3-14  Images generated by Algorithm G for Experiment AB2
Figure 3-15  No. of pixel changes vs. iteration for Experiment AB2
Figure 3-16  Energy vs. iteration for Experiment AB2
Figure 3-17  PCP of feature vectors in Experiment PH1
Figure 3-18  Images generated by Algorithm C for Experiment PH1
Figure 3-19  Mean feature vs. parameter value in Experiment PH1
Figure 3-20  PCP of feature vectors in Experiment PH2
Figure 3-21  Images generated by Algorithm C in Experiment PH2
Figure 3-22  PCP of feature vectors with 500 iterations in Experiment PH3
Figure 3-23  PCP of feature vectors with 50 iterations in Experiment PH3
Figure 3-24  Images generated with 50 iterations in Experiment PH3
Figure 3-25  Images generated with 500 iterations in Experiment PH3
Figure 4-1  Codings for a 2nd-order neighborhood system
Figure 4-2  Synthetic textures in Experiment SY1
Figure 4-3  Synthetic textures in Experiment SY2
Figure 4-4  Natural textures -- (a) grass (b) bark (c) leather (d) wood
Figure 4-5  Binary natural textures by EPQ on textures in Figure 4-4
Figure 4-6  Reproduced binary textures by estimated parameters
Figure 5-1  A pattern classification system
Figure 5-2  Synthetic textures generated from 3rd-order GMRF models
Figure 5-3  Textures of four colors after EPQ on textures in Figure 5-2
Figure 5-4  PCP of feature vectors from SGLDM (A) in Experiment 5-1
Figure 5-5  PCP of feature vectors from SGLDM (B) in Experiment 5-1
Figure 5-6  PCP of MRF feature vectors (C) in Experiment 5-1
Figure 5-7  Synthetic textures generated from 2nd-order GMRF models
Figure 5-8  Textures of four colors after EPQ on textures in Figure 5-7
Figure 5-9  PCP of feature vectors from SGLDM (A) in Experiment 5-2
Figure 5-10  PCP of feature vectors from SGLDM (B) in Experiment 5-2
Figure 5-11  PCP of MRF feature vectors (C) in Experiment 5-2
Figure 5-13  PCP of feature vectors from SGLDM (A) in Experiment 5-3
Figure 5-14  PCP of feature vectors from SGLDM (B) in Experiment 5-3
Figure 5-15  PCP of MRF feature vectors (C) in Experiment 5-3

Chapter 1

Introduction

A radiologist stands in his office, viewing a woman's breast X ray.
He thinks he sees a small nodule in the breast, but the X ray is low in contrast and not sharp, and the small nodule is ill-defined. He must decide if surgery is called for on the basis of this X-ray picture [And77].

Certain types of pulmonary disease, attributable to interstitial fibrosis, have been found to be quite common among miners. An important medical problem is the evaluation of roentgenograms in order to classify normal and abnormal pulmonary patterns [Lev85].

The captain of a cruiser detects, from radar signals, that an aircraft 8 miles away is flying toward his cruiser. He must identify this aircraft as an airbus or a fighter within seconds.

These separate events, connected by a common thread, fall in the study of image analysis [Gon87]. Image analysis is concerned with the manipulation and analysis of pictures by computer. Its major subareas include image formation, enhancement, restoration, reconstruction, segmentation, coding and compression, texture analysis, shape analysis, representation, matching, description, and recognition [Bal82, Gon87, Lev85, Pra78, Ros82]. Image analysis is both data-dependent and goal-oriented; in other words, there is no universal way to do image analysis. The recipe is to develop a software package as a set of tools for analyzing one class of images such that the software can easily be modified for analyzing another class of images.

We are interested in two-dimensional (2D) digital images. A 2D digital image can be represented by an M×N matrix, or lattice, whose elements (called pixels) have integral values from 0 up to G−1, where G is the number of brightness levels (gray levels) [Ros82]. In the digital images studied in this thesis, the gray level of a pixel is highly dependent on its geometric neighboring pixels but is nearly independent of remote pixels. Markov random fields (MRFs) [Bes74] have been found to be rich models for various areas in image analysis due to this dependency among pixels in spatial neighborhoods.
The classical problems of texture synthesis [Che85a, Cro83, Has80], texture classification [Che85b, Kho87], image segmentation [Coh87, The83], image restoration [Bes86, Che82, Gem84], and image compression [Del79] have been attacked with MRFs. Although MRFs have been utilized in various applications, their use is still in its infancy. The problems of sampling and statistical inference of discrete MRFs, due to the mathematical intractability of apparently simple likelihood functions, are very difficult and have not been completely solved [Pic87].

The objective of the research in this thesis is to study fundamental problems of MRF models as tools for image analysis. We define a class of Gibbs random fields (GRFs) [Gem84, Kin80] which are shown to be MRFs by a simple proof of the MRF-GRF equivalence theorem. We investigate the problems of sampling and statistical inference for this class of MRFs. Applications to texture synthesis and texture classification are of particular interest. We also emphasize computational problems and the interface between an MRF and a digital image.

1.1. MRF and GRF Models

By a model of an image, we mean a mathematical process which constructs or describes the image. In texture synthesis, we desire to establish an algorithm that generates a texture based on a few model parameters. One goal is to represent a digital texture by a few model parameters to achieve data compression [Gon87, Ros82]. The establishment of an image model does not follow a unique rule, but it should at least be based on some property of an image. Markov random field models assume that the intensity at each pixel in an image depends on its "neighboring pixels" but is independent of other pixels. An image is viewed as a coloring of a lattice. An MRF is a probability space [Moo74] whose probability measure, defined on the sample space consisting of all possible colorings of a lattice, satisfies positivity, the Markov property, and homogeneity [Bes74].
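For concreteness, the Markov property, and the Gibbs form it is equivalent to (the equivalence theorem treated in Chapter 2), can be written as follows; the notation N_t for the neighbors of site t and V_C for clique potentials is standard usage supplied here, not quoted from the thesis:

```latex
% Markov property: the color at site t, given all other sites,
% depends only on the colors of its neighbors N_t:
P\bigl(X_t = x_t \mid X_s = x_s,\ s \neq t\bigr)
  = P\bigl(X_t = x_t \mid X_s = x_s,\ s \in N_t\bigr)

% Equivalent Gibbs form: Z is a normalizing constant and the
% energy U(x) sums clique potentials V_C over all cliques C:
P(X = x) = \frac{1}{Z}\,\exp\{-U(x)\}, \qquad
U(x) = \sum_{C} V_C(x)
```

Positivity requires P(X = x) > 0 for every coloring x, which is what guarantees the conditional probabilities above are well defined.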
Markov random fields have been extensively used for modeling images because of the natural coincidence between the local dependency of the model and the local dependency of neighboring pixels. The building of an MRF model should follow two principles: the MRF should naturally match the properties of a class of images, and it should be mathematically tractable. If an MRF generates images whose pixels have integral values, it is discrete; otherwise it is continuous. For example, an auto-binomial MRF [Bes74] is a discrete MRF, and an auto-normal MRF [Bes74] is a continuous MRF. It is believed that discrete MRFs are more natural for digital textures but less mathematically tractable than continuous MRFs.

1.2. Texture Studies

Texture is an important characteristic for the analysis of many types of images. It can be seen in all images, from multi-spectral scanner images to microscopic images of cell cultures or tissue samples. Despite its importance and ubiquity in image data, a precise definition of texture does not exist [Har79]. Since textures often appear in our surroundings, we seem to know them when we see them even if a formal definition of texture is not given. The literature [Ahu81, Har79, Wes80] suggests two approaches to treating textures, implying two categories: structural textures and statistical textures. Structural textures, consisting of textons arranged according to certain placement rules, frequently appear in studies of human vision [Jul81]. Statistical textures can be viewed as images in which pixels are obtained by some stochastic process. Two structural textures are shown in Figure 1-1 and two statistical textures are shown in Figure 1-2. This thesis only considers statistical textures.
[Figure 1-1: Examples of structural textures (a), (b) -- image not reproduced here]
[Figure 1-2: Examples of statistical textures -- image not reproduced here]

A wide variety of methods are used for texture analysis [Van85]. A basic problem is to capture textural features [Har79, Har86]. Textural features can be descriptive or generative. Descriptive textural features characterize properties of textures corresponding to visual perception, like coarseness, contrast, directionality, line-likeness, regularity, and roughness [Tam78]. Haralick [Har79], Weszka et al. [Wes76], and Conners and Harlow [Con80] survey and compare descriptive textural features. Generative textural features can be used not only to describe a texture but also to produce a texture. We are particularly interested in texture synthesis and texture classification [Lev85, Van85] based on MRF models. Background for these two applications is discussed in this section.

1.2.1. Texture Synthesis

There are many applications in image processing in which texture synthesis is useful. For example, if a region of a picture is missing or highly corrupted by noise, an artificial texture can be generated to replace the missing data. In image coding applications, if a texture can be generated with a few parameters, we can store those parameters instead of the whole texture image to save computer storage. Texture synthesis means texture generation by computer. Many algorithms have been suggested for generating a variety of textures based on mosaic models [Ahu81, Mod80, Sch78], fractal models [Pen84], syntactic models [Lu79], time series models [Del79, Kas81, Kas84, McCo74, Tou76], and Markov random field models [Cro80, Cha85, Che81, Der85, Gou88, Has80]. Each model achieves success with some textures but fails with others.
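Most MRF-based synthesis schemes share one core operation: visit each pixel and resample it from its local conditional distribution given the current values of its neighbors. The following is a minimal sketch of that idea for a binary Ising-type model; the function name, toroidal boundary handling, and parameter values are illustrative assumptions of this sketch, not the thesis's Algorithm C or G:

```python
import numpy as np

def synthesize_ising(M=64, N=64, beta=0.7, n_sweeps=50, rng=None):
    """Generate a binary texture from an Ising-type MRF by Gibbs-style
    sampling: each site is resampled from its conditional distribution,
    which depends only on its 4 nearest neighbors (1st-order system)."""
    rng = np.random.default_rng(rng)
    x = rng.integers(0, 2, size=(M, N))          # random initial coloring
    for _ in range(n_sweeps):
        for i in range(M):
            for j in range(N):
                # number of neighbors colored 1 (toroidal boundary)
                s = (x[(i - 1) % M, j] + x[(i + 1) % M, j]
                     + x[i, (j - 1) % N] + x[i, (j + 1) % N])
                # conditional odds under pairwise agreement potentials:
                # P(x_ij = 1) is proportional to exp(beta * s),
                # P(x_ij = 0) is proportional to exp(beta * (4 - s))
                p1 = np.exp(beta * s)
                p0 = np.exp(beta * (4 - s))
                x[i, j] = rng.random() < p1 / (p0 + p1)
    return x
```

With beta near 0 the output is salt-and-pepper noise; large positive beta produces growing patches, which is the "phase transition" behavior examined in Chapter 3.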
Haralick [Har86] provides a good summary of statistical texture analysis methods, including texture synthesis. Model-based texture synthesis involves three problems. First, develop an "efficient" algorithm based on the given model to generate textures. Second, develop a robust estimator of the model parameters based on a single texture. Third, develop a statistic to measure the correspondence between the given texture and the model with estimated parameters.

Markov random fields were first proposed as texture models by Hassner and Sklansky [Has80]. They investigated Besag's auto-logistic MRFs [Bes74] for binary texture synthesis. Cross [Cro80] extended Hassner and Sklansky's work to model natural textures with eight colors, from Brodatz's book [Bro66], using the auto-binomial model [Bes74]. Chellappa [Che81] used the auto-normal model (also referred to as a Gaussian Markov random field, or GMRF) and the simultaneous autoregressive (SAR) model [Bes74] to model natural textures, extending earlier work [McCo74, Tou76, Del79] based on time series models [Box76]. Recently, Derin and Elliott [Der85, Der87] extended the Ising model [Kin80, Pic87] to model textures with more than two colors. Frankot [Fra87] used the log-GMRF and log-SAR models to synthesize aperture radar images. Goutsias [Gou88] built mutually compatible MRFs to model binary natural textures. In summary, Markov random fields are particularly useful for modeling homogeneous textures in which neighboring pixels have a high dependency but remote pixels are nearly independent.

1.2.2. Texture Classification

The problem of texture classification can be stated as follows. Given n_i texture samples (training textures) from texture class i, 1 ≤ i ≤ K, extract textural features and establish a decision rule, based on the selected features, which can efficiently classify an unknown texture into one of the K categories.
There is no universal way to select textural features for classification and recognition, but a simple and computationally efficient method is desired. Textural features can be derived from gray level co-occurrence matrices [Har73, Sie88], gray level run lengths [Wes76], gray level differences [Wes76], Fourier power spectra [Wes76], spatial filtering [Cog85, Fau80, Law79], and MRF model fitting [Che85b, Kan82, Kas82, Kho87, deSo82]. Van Gool et al. [Van85] give a good survey of texture classification methods. Tamura et al. [Tam78] proposed six textural features, namely coarseness, contrast, directionality, line-likeness, regularity, and roughness, which correspond to visual perception. The existing techniques attempt to capture these textural features.

A classifier is a decision rule designed to efficiently assign an unknown texture to one of the known categories based on the extracted textural features. The 1-NN, quadratic, and Fisher's linear classifiers are commonly used in pattern recognition [Dud73, Dev82, Jai87]. The design of a classifier should consider the relationship between the number of features and the number of training samples [Dud73, Jai82]. Recognition rate is a commonly used index for evaluating a classifier in pattern recognition. In texture classification, recognition rate is used not only to evaluate a classifier but also to evaluate the textural features.

Textural features from continuous Markov random field models have been used for classification. Kashyap et al. [Kas82] fitted a SAR model to a texture to capture features. Chellappa and Chatterjee [Che85b] fitted a GMRF model to select textural features. Khotanzad and Kashyap [Kho87] fitted a SAR model to a given texture twice to capture features. Feature selection here is nothing but parameter estimation for MRF models.
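The pipeline just described can be illustrated end to end. In this hypothetical example, each training texture has already been reduced to a feature vector (say, two estimated MRF parameters), and a 1-NN rule assigns a query texture to the class of its nearest training vector; the class names and feature values are invented for illustration:

```python
import numpy as np

def nn_classify(train_feats, train_labels, query_feat):
    """1-NN rule: assign the query to the class of the closest
    training feature vector (Euclidean distance)."""
    d = np.linalg.norm(np.asarray(train_feats) - query_feat, axis=1)
    return train_labels[int(np.argmin(d))]

# Hypothetical feature vectors (e.g. two estimated MRF parameters
# per texture) for two texture classes.
train = np.array([[0.1, 0.9], [0.2, 0.8],    # class "grass"
                  [0.9, 0.1], [0.8, 0.2]])   # class "bark"
labels = ["grass", "grass", "bark", "bark"]
print(nn_classify(train, labels, np.array([0.15, 0.85])))  # -> grass
```

The 1-NN rule needs no distributional assumptions, which is why it pairs naturally with features whose sampling distributions are unknown, as is the case for the discrete-MRF parameter estimates studied in Chapter 4.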
To achieve robust estimation of the parameters of continuous MRFs, the image size should be large enough to remove boundary effects, and the number of gray levels should also be large enough to approximate a continuous distribution. Discrete MRF models have not been used to select textural features for classification. We will investigate the use of discrete MRF models in texture classification.

1.3. Image Segmentation

The purpose of segmentation is to partition an image into "meaningful" components. The definition of a "meaningful component" is a function of the problem being considered [Gon87]. For example, in a Landsat image, the objective of segmentation might be to automatically identify components corresponding to different types of crops. In boundary extraction, the objective of segmentation is to classify each pixel as either a pixel on the boundary between two regions or a non-boundary pixel. The problem of segmentation usually requires knowledge about the image and involves the "thresholding problem" [Tsa85]. Rosenfeld [Ros85] gives a good survey of image segmentation articles. Segmentation algorithms based on Markov random fields have been investigated and developed [Bes86, Cha85, Cho87, Coh87, Der84, Der86, Der87, Fan88, Gem84, The83, The86]. The segmentation problem can be stated as: given an observed image y, find a coloring x of the pixels in y which optimizes specified criteria. There are two basic approaches.

(1) Iterative pixel labeling approach -- Individually label pixels one by one until some criteria are achieved. The most commonly used criterion is maximum a posteriori (MAP) probability. The goal is to find a coloring x which maximizes P(x|y) = P(y|x)P(x)/P(y) over all possible x's. Geman and Geman [Gem84], and Chou et al. [Cho87], further assume that P(x|y) defines an MRF or a GRF, and they apply the procedure of simulated annealing to iteratively label pixels to maximize P(x|y).
Besag [Bes86] further assumes that pixels in an observed image are corrupted by independent Gaussian noise, and develops an algorithm called iterated conditional modes (ICM) to label pixels. Therrien et al. [The83, The86] further assume that the observed image y can be modeled by a 2D autoregressive time series model [Box76] and design a segmentation algorithm similar to Besag's ICM to segment a terrain image. Marroquin et al. [Mar87a] develop an algorithm called the maximizer of posterior marginals (MPM), which iteratively maximizes P(x_i|y) for each i until "convergence". (2) Region classification approach -- An image is partitioned into several regions such that each region is assumed to be a realization of a single MRF model with known parameters or a combination of two MRF models [Coh87, Fan88]. The type of region is determined by hypothesis testing using estimated parameters in the region. Adjacent regions from the same class can then be merged. How to partition an image into regions represented by MRFs is the most important problem. Cohen and Cooper [Coh87] and Fan and Cohen [Fan88] gave partial solutions. Derin et al. [Der84, Der86, Der87] developed a segmentation algorithm, called dynamic programming on a strip, to label pixels. In each updating step, their algorithm labels all pixels in a "strip" (a rectangular region) instead of labeling a single pixel. The final goal of image segmentation is to automatically recognize the regions in an image. Image segmentation algorithms based on MRFs assign a pixel or a region in an image to a category. In the region classification approach, an image should be partitioned into disjoint rectangular regions of moderate size to guarantee the robustness of estimated MRF model parameters. In the pixel labeling approach, the algorithms are iterative and the optimal criteria may never be achieved due to the computational burden imposed by a simulated annealing procedure.
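Besag's ICM, as described above, revisits each pixel in turn and replaces its label by the one minimizing a local posterior cost. The following is a minimal sketch, assuming i.i.d. Gaussian class-conditional noise and an Ising-style smoothness prior with free (non-periodic) boundaries; the class means, sigma, and beta below are illustrative choices, not values from the thesis.

```python
def icm(y, means, sigma=1.0, beta=1.5, sweeps=5):
    """Sketch of iterated conditional modes for pixel labeling."""
    M, N = len(y), len(y[0])
    K = len(means)
    # initialize with the maximum-likelihood label at each pixel
    x = [[min(range(K), key=lambda k: (y[i][j] - means[k]) ** 2)
          for j in range(N)] for i in range(M)]
    for _ in range(sweeps):
        for i in range(M):
            for j in range(N):
                best, best_cost = x[i][j], float("inf")
                for k in range(K):
                    # -log P(y|k) up to constants, plus a prior term that
                    # charges beta for each disagreeing 4-neighbor
                    cost = (y[i][j] - means[k]) ** 2 / (2 * sigma ** 2)
                    for di, dj in ((0, 1), (0, -1), (1, 0), (-1, 0)):
                        a, b = i + di, j + dj
                        if 0 <= a < M and 0 <= b < N and x[a][b] != k:
                            cost += beta
                    if cost < best_cost:
                        best, best_cost = k, cost
                x[i][j] = best
    return x
```

Each sweep can only decrease the posterior cost, so ICM converges quickly but only to a local optimum, unlike simulated annealing.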
It should be mentioned that the true image is usually not a "realization" from the MRF model used as the prior information. The MRF simply introduces context into the prior information. In the current algorithms [Bes86, Coh87], the model parameters corresponding to the true pixels or regions are assumed to be known. In applications, the parameters are seldom known but can be estimated from the given image. The potential research challenge is to develop knowledge-based and computationally attractive algorithms for automatic image segmentation, which is beyond the scope of this thesis. Efficient sampling algorithms for MRFs and robust estimators for MRFs are needed in image segmentation and will be investigated in this thesis. 1.4. Image Restoration Image restoration is a process which attempts to reconstruct or recover an image that has been degraded, by using some knowledge of the degradation phenomenon. It has many fields of application, including space and biomedical imagery [And77, Ros82]. Knowledge of the degradation phenomenon is crucial to restoration. Markov random field models were recently proposed as ways of inserting prior information during image restoration [Bes86, Car85, Che82, Gem84, Mar87a,b, Wol85]. The idea is to impose a Markov random field model on a true image or a degraded image. Given a degraded image, parameters are estimated, and then an algorithm is developed to process the degraded image to restore the true image. Chellappa and Kashyap [Che82] assumed that an observed degraded image y and its true image x have the relationship y = Hx + n, where H is a known linear operator corresponding to degradation phenomena and n is a vector of Gaussian noise. They assume y (or x) is modeled by a GMRF with unknown model parameters. They estimate the parameters of the GMRF based on y and develop an algorithm based on minimum mean square error [And77, Ch6] to restore the image.
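For the degradation model y = Hx + n just described, the linear minimum mean square error estimate has the classical closed form x̂ = C_x Hᵀ (H C_x Hᵀ + C_n)⁻¹ y when x and n are zero-mean with covariances C_x and C_n. The sketch below illustrates that formula on a tiny example; the identity blur and the scalar covariances are illustrative choices, not those induced by a fitted GMRF.

```python
import numpy as np

def mmse_restore(y, H, Cx, Cn):
    """Linear MMSE (Wiener) estimate for y = Hx + n with zero-mean x, n."""
    S = H @ Cx @ H.T + Cn                  # covariance of the observation y
    return Cx @ H.T @ np.linalg.solve(S, y)

rng = np.random.default_rng(0)
n_pix = 8
H = np.eye(n_pix)                          # identity "blur": pure noise case
Cx = np.eye(n_pix) * 4.0                   # illustrative signal covariance
Cn = np.eye(n_pix) * 1.0                   # illustrative noise covariance
x = rng.normal(0.0, 2.0, n_pix)
y = H @ x + rng.normal(0.0, 1.0, n_pix)
x_hat = mmse_restore(y, H, Cx, Cn)         # shrinks y toward the prior mean
```

With identity H and these covariances the estimator reduces to shrinking y by the factor 4/(4+1) = 0.8, the scalar Wiener gain.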
Note that an actual model for x or y is not important, but the local dependency of the GMRF is important. Image restoration based on discrete MRFs [Bes86, Car85, Gem84, Mar87a,b, Wol85] is also called "pixel labeling by stochastic relaxation" since the algorithms used to reconstruct images iteratively label pixels by relaxation until some criteria are optimized. Geman and Geman [Gem84] introduced a "line process" associated with a degraded image to help restore the true image. They apply a simulated annealing procedure to yield a maximum a posteriori estimate of the image given the degraded observations. The procedure introduced by Geman and Geman requires sampling Gibbs distributions, which is computationally demanding. It should be mentioned that some researchers [Der84, Der86, Der87] used "segmentation" to mean "restoration". Efficient sampling algorithms for MRFs, robust estimators for MRFs, and justification of valid parameters are required in image restoration and will be investigated in this thesis. 1.5. Organization of the Thesis Chapter 2 characterizes the background of Gibbs random fields and Markov random fields, defines a specific class of GRFs, gives a theorem which proves GRF-MRF equivalence, and shows the Gibbs forms and the corresponding Markov forms of several currently used models. Chapter 3 characterizes two sampling, or synthesis, algorithms for discrete GRFs, proposes methodology to validate algorithms, demonstrates a "phase transition" phenomenon, and proposes methodology to capture "useful" discrete GRFs. A sampling algorithm for the GMRF, based on simulating a multivariate normal distribution, is also reviewed and discussed. Chapter 4 deals with parameter estimation and goodness-of-fit problems. Two logit model-based estimators for binary GRFs and MRFs are proposed.
A nonparametric statistical method that compares the estimators is proposed and used to compare estimators for the Derin-Elliott, auto-binomial, and auto-normal texture synthesis models. Quantitative statistics to measure the correspondence between a given image and the model with estimated parameters are also proposed. Finally, MRF models for texture synthesis are examined. Chapter 5 discusses the problem of texture classification and recognition. We propose using parameters of discrete GRFs as textural features and use Fisher's linear classifier to classify and recognize visual textures. The method is applied to synthetic and natural textures. Results are compared with those based on co-occurrence matrices. Chapter 6 summarizes and discusses the results and lists future work. The principal contributions of the thesis are listed below. (1) A simple proof of the GRF-MRF relationship theorem in Section 2.3. (2) A new sampling algorithm in Section 3.1 and methods for validating sampling algorithms in Section 3.2. (3) The discovery of the phenomenon of "phase transition" and introduction of the notions of reliability, uniformity, and indistinguishability; a methodology to detect unreliability and nonuniformity towards identifying useful GRFs in Section 3.4. (4) New estimators for binary MRFs in Sections 4.1.2 and 4.1.3; a procedure to compare various estimators when their distributions are unknown in Section 4.4. (5) Introduction of quantitative statistics to measure the correspondence between a given texture and an MRF model in Section 4.3. (6) The methodology of using discrete MRF model parameters as textural features for classification in Chapter 5. Chapter 2 Gibbs Random Fields and Markov Random Fields This chapter presents theoretical results which will be used throughout the thesis. Two major problems in modeling images by a GRF or an MRF are sampling and parameter estimation.
Chapter 3 shows that sampling a GRF is more intuitive and computationally attractive than sampling an equivalent MRF. Chapter 4 shows that estimating parameters of an MRF is more practical than estimating parameters of an equivalent GRF. There has been an increasing interest in modeling images by Markov random fields (MRFs) since Besag's paper on spatial interaction and the statistical analysis of lattice systems [Bes74]. An image is viewed as a coloring of a lattice. Its local characteristics can be modeled by an MRF while its global properties can be characterized by a Gibbs random field (GRF). The consistency and equivalence between an MRF and a GRF under some conditions were characterized by the Hammersley-Clifford theorem [Bes74]. We begin with the definitions of a Gibbs random field and a Markov random field, then define a class of GRFs which will be studied in this thesis. We show that some commonly used GRFs are special cases of this class. A theorem which characterizes the relation between a Gibbs random field and a Markov random field is stated and proved. The MRFs corresponding to some GRFs are then given. 2.1. Definitions An M×N lattice is denoted by L = {(i,j) | 1 ≤ i ≤ M, 1 ≤ j ≤ N}. Let X(i,j) be a random variable (r.v.) representing the "color" at site, or pixel, (i,j) on the lattice L. For convenience, we simplify the labeling of X(i,j) to X_t with t = j + N(i-1). Thus (i,j) ∈ L iff t ∈ S = {1, 2, ..., MN}. Let A be the range set common to all r.v.'s X_t, and let Ω = {(x_1, x_2, ..., x_MN) | x_t ∈ A for all t ∈ S} denote the set of all colorings of S (or L). Note that A is specified according to the application. For instance, A = {0, 1, ..., 255} in discrete gray valued images. We also permit A to be the entire real line for continuous images. The random vector X = (X_1, ..., X_MN) denotes a coloring.
A random field is the triplet <Ω, Ψ, Φ(·)>, where Ω defined above is a sample space, Ψ = {W | W ⊆ Ω} is the collection of all Borel subsets of Ω, and Φ(·) is a probability measure with domain Ψ. Definition 2.1(a): A discrete Gibbs random field is a random field for which A is a countable set and whose probability mass function has the form P(X=x) = e^{-U(x)}/Z, for all x ∈ Ω, (2.1) where U(x) is called the energy function, and the normalizing constant Z = Σ_{y∈Ω} e^{-U(y)} is called the partition function. Definition 2.1(b): A continuous Gibbs random field is a random field for which A is the real line and whose probability density is f_X(x) = K e^{-U(x)}, for all x ∈ Ω, (2.2) where U(x) is called the energy function, and the normalizing constant K is called the partition function. The main difference between a discrete GRF and a continuous GRF is the mathematical structure of the sample space Ω = A^{MN}, where A is defined above. The remainder of this development will assume a discrete GRF, but the extension to a continuous GRF is straightforward. Definition 2.2: The site s ∈ S is not a probabilistic neighbor of site t if P(X_t=u | X_s=v) = P(X_t=u) for all (u,v). Let T_t = {r | site r is not a probabilistic neighbor of site t}. Define N_t = S - T_t - {t}. The sites in N_t are said to be probabilistic neighbors of site t. Definition 2.3: A clique is a set of sites in S that consists of either a single site or has the property that each site in the set is a probabilistic neighbor of every other site in the set. Definition 2.4: For any site t ∈ S with probabilistic neighbors N_t, define Q_t to be the set of all cliques which contain site t. Then a neighborhood system Ξ can be defined as the ordered class {Q_1, Q_2, ..., Q_MN}. Note that the sets in this class are not pairwise disjoint. Define Q = ∪_{t=1}^{MN} Q_t, the collection of all cliques from the neighborhood system Ξ.
Definition 2.5: A Gibbs random field with respect to a neighborhood system Ξ is a Gibbs random field whose energy function U can be expressed as the sum of functions defined on cliques in Q. In other words, U(x) = Σ_{c∈Q} V_c(x), where V_c(·) is called a potential function. (2.3) Definition 2.6: A discrete Markov random field is a random field whose probability mass function satisfies the following conditions: (a) Positivity: P(X=x) > 0 for all x ∈ Ω. (b) Markov property: For all t ∈ S, P(X_t=x_t | X_1=x_1, X_2=x_2, ..., X_{t-1}=x_{t-1}, X_{t+1}=x_{t+1}, ..., X_MN=x_MN) = P(X_t=x_t | all X_r=x_r, r ∈ N_t), (2.4) where N_t is the set of neighbors of site t. For notational convenience, denote P(X_t=x_t | all X_r=x_r, r ∈ N_t) by P(x_t | R_t), where R_t is an ordered set of colors for the neighbors of pixel t. (c) Homogeneity: P(x_t | R_t) does not depend on the particular site t. Theorem 1: With respect to a specified neighborhood system, there exists a unique Gibbs random field for every Markov random field and there exists a unique Markov random field for every Gibbs random field. Proof: see [Bes74]. This theorem implies that we can define a random field based on the global probability measure (2.1), or the local conditional probability measure P(x_t | R_t) under some neighborhood conditions (2.4), and thus the GRF and MRF models will be equivalent under a neighborhood system. Theorem 1 is not generally true unless the GRF is based on cliques under some neighborhood system. Geometric neighbors of site t of different orders have been discussed by Besag [Bes74] and Cross [Cro80]. Figure 2-1(b) labels first order geometric neighbors of site t as "1", second order geometric neighbors of site t as "2", and repeats up to order 5. An absolute labeling {1, 2, ..., MN} has been defined for the sites. Figure 2-1(a) establishes a relative numbering system for the geometric neighbors of site t up to order 5. The clique types associated with the 2nd-order geometric neighborhood are given in Figure 2-2.
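Theorem 1 can be made concrete by brute force on a tiny field: build P(X=x) from clique potentials via (2.1), then verify that the conditional distribution at a site is unchanged when a non-neighbor changes color. The potentials and the 4-site path graph below are illustrative choices, not the models studied in this thesis.

```python
import itertools
import math

# Tiny Gibbs field on the 4-site path graph 0-1-2-3 with binary colors.
# U(x) = sum_t F(x_t) + sum_edges H(x_s, x_t); these potentials are
# illustrative, chosen only to make the check nontrivial.
edges = [(0, 1), (1, 2), (2, 3)]
def F(a):            # single-site potential
    return 0.5 * a
def H(a, b):         # pair potential: reward equal colors
    return -1.0 if a == b else 1.0

def U(x):
    return sum(F(a) for a in x) + sum(H(x[s], x[t]) for s, t in edges)

states = list(itertools.product([0, 1], repeat=4))
Z = sum(math.exp(-U(x)) for x in states)            # partition function (2.1)
P = {x: math.exp(-U(x)) / Z for x in states}

def cond(t, x):
    """P(X_t = x_t | all other sites), computed by brute force from P."""
    return P[x] / sum(P[x[:t] + (k,) + x[t + 1:]] for k in (0, 1))

# Markov check: the conditional at site 1 is unchanged when the
# non-neighbor site 3 flips (the neighbors of site 1 are sites 0 and 2).
assert abs(cond(1, (0, 1, 0, 0)) - cond(1, (0, 1, 0, 1))) < 1e-12
```

All terms of U not involving site 1 factor out of both numerator and denominator, which is exactly why the conditional depends only on the neighbors.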
Note that a site is a cell in S, while a pixel value is a color (or a gray level). A boundary pixel has fewer neighbors than an interior pixel. In this thesis, unless otherwise specified, we will assume an image has periodic boundaries. For example, the four nearest neighbors of site 1 (the pixel at the top left corner) have absolute indices N, 2, (M-1)N+1, and N+1. [...] k = 0, 1, ..., N-1; i, j = 0, 1, ..., M-1. For the 1st-order neighborhood, the positive definiteness of the matrix B implies that 0 < λ_{u,v} = 1 - 2θ_1 cos(2πv/N) - 2θ_2 cos(2πu/M), (2.11) or |θ_1| + |θ_2| < 1/2. (2.12) The collection of all parameters such that B is positive definite is called the valid domain. The valid domain for model parameters is hard to state explicitly for neighborhoods of order two or more. We can still check the validity of parameters by examining the positivity of all eigenvalues computed from (2.9). 2.3. A Relationship Between a GRF and an MRF The local conditional probability densities for the MRFs corresponding to the discrete GRFs with global distributions defined in (2.5) can be derived from the following theorem. For a discrete model, sampling a GRF is more efficient than sampling an equivalent MRF (Chapter 3), but parameter estimation with current techniques can only be done by knowing the form of an MRF. This theorem shows how to establish an MRF from a given GRF. Theorem 2: A discrete GRF characterized by the energy function U defined in Eq. (2.5) has the following local conditional probability measure: P(x_t | R_t) = e^{-F(x_t) - Σ_{r=1}^{C} [H(x_t, x_{t:+r}) + H(x_t, x_{t:-r})]} / Σ_{s∈A} e^{-F(s) - Σ_{r=1}^{C} [H(s, x_{t:+r}) + H(s, x_{t:-r})]}. (2.13) Proof: With respect to a coloring x, define two subsets of Ω below: Ω_1^t = {y ∈ Ω | y_i = x_i if i ∈ N_t}, Ω_2^t(k) = {y ∈ Ω_1^t | y_t = k}. With respect to site t, we also define two functions: U_1^t(y) = F(y_t) + Σ_{r=1}^{C} [H(y_t, y_{t:+r}) + H(y_t, y_{t:-r})], U_2^t(y) = U(y) - U_1^t(y). Note that U_1^t(y) = U_1^t(x) if y ∈ Ω_2^t(x_t). Denote a coloring z by z_j = x_j, ∀ j ≠ t. Then P(x_t | R_t)
= Σ_{y∈Ω_2^t(x_t)} e^{-U(y)} / Σ_{y∈Ω_1^t} e^{-U(y)} = [Σ_{y∈Ω_2^t(x_t)} e^{-U_1^t(y)} e^{-U_2^t(y)}] / [Σ_{k∈A} Σ_{y∈Ω_2^t(k)} e^{-U_1^t(y)} e^{-U_2^t(y)}] = e^{-U_1^t(x)} [Σ_{y∈Ω_2^t(x_t)} e^{-U_2^t(y)}] / ([Σ_{z_t∈A} e^{-U_1^t(z)}] [Σ_{y∈Ω_2^t(x_t)} e^{-U_2^t(y)}]) = e^{-U_1^t(x)} / [Σ_{z_t∈A} e^{-U_1^t(z)}], which completes the proof. The relationship between a global form (2.5) and a local form (2.13) of an auto-normal model can be found in [Bes74]. Note that Besag's auto-models assume the function H in (2.5) has the form H(x_t, x_{t:+r}) = θ_r x_t x_{t:+r}, so the Derin-Elliott model is not an auto-model, but all other GRFs defined above are auto-models. The local conditional probability measures of the discrete models defined in Sections 2.2.1 to 2.2.3 can be derived from (2.13). Derin-Elliott model: A = {0, 1, ..., G-1}. P(x_t | R_t) = e^{-α x_t - Σ_{r=1}^{C} θ_r [I(x_t, x_{t:+r}) + I(x_t, x_{t:-r})]} / Σ_{s∈A} e^{-α s - Σ_{r=1}^{C} θ_r [I(s, x_{t:+r}) + I(s, x_{t:-r})]}. (2.14) Auto-binomial model: A = {0, 1, ..., G-1}. P(x_t | R_t) = C(G-1, x_t) p^{x_t} (1-p)^{G-1-x_t}, (2.15) where p = e^T / (1 + e^T), and T = α + Σ_{r=1}^{C} θ_r (x_{t:+r} + x_{t:-r}). Auto-Poisson model: A = {0, 1, ..., ∞}. P(x_t | R_t) = e^{-λ} λ^{x_t} / x_t!, (2.16) where λ = e^{-α - Σ_{r=1}^{C} θ_r (x_{t:+r} + x_{t:-r})}. Auto-normal model [Bes74]: A = R (the real line). P(x_t | R_t) = (2πσ²)^{-1/2} e^{-[x_t - μ_t - Σ_{r=1}^{C} θ_r ((x_{t:+r} - μ_{t:+r}) + (x_{t:-r} - μ_{t:-r}))]² / 2σ²}. (2.17) 2.4. Summary A discrete Gibbs random field is characterized by an energy function. Its probability mass function involves a complicated partition function so that sampling and statistical inference are difficult or infeasible. Introducing the notion of cliques establishes a subclass of GRFs whose energy functions can be expressed as the sum of potential functions defined on cliques. An MRF defined based on a local conditional probability is useful in statistical inference [Bes74]. The relationship between a GRF and an MRF is characterized by an equivalence theorem (Theorem 1). Although it is well known, the exact relationship between a GRF and an MRF has hardly been investigated.
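As a concrete reading of the local conditionals just derived, the auto-binomial form (2.15) is an ordinary binomial pmf whose success probability is a logistic function of the neighbor sums. The sketch below assumes the sign convention T = α + Σ θ_r(x_{t:+r} + x_{t:-r}) stated above; the argument names are hypothetical.

```python
from math import comb, exp

def autobinomial_conditional(xt, neighbor_pairs, alpha, theta, G):
    """P(x_t | R_t) for the auto-binomial model, eq. (2.15):
    x_t ~ Binomial(G-1, p), p = e^T / (1 + e^T),
    T = alpha + sum_r theta_r * (x_{t:+r} + x_{t:-r}).
    neighbor_pairs holds one (x_{t:+r}, x_{t:-r}) pair per direction r."""
    T = alpha + sum(th * (a + b) for th, (a, b) in zip(theta, neighbor_pairs))
    p = exp(T) / (1.0 + exp(T))
    return comb(G - 1, xt) * p ** xt * (1 - p) ** (G - 1 - xt)

# The conditional is a proper distribution over the G colors:
probs = [autobinomial_conditional(g, [(1, 2), (0, 3)], 0.0, [0.3, -0.2], 4)
         for g in range(4)]
assert abs(sum(probs) - 1.0) < 1e-9
```

Because p is pushed toward 0 or 1 whenever the parameter sum is far from zero, the model favors one extremal color, which is exactly the imbalance demonstrated in the auto-binomial experiments of Chapter 3.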
This chapter defines a specific class of GRFs whose energy functions can be expressed as the sum of simple potential functions defined on cliques consisting of either a single pixel or a pair of pixels (see Eq. (2.5)) and shows that some frequently used GRFs are special cases of this class by specifying potential functions. Theorem 2 shows how to derive the corresponding Markov random fields. The auto-normal GRF (GMRF) is also reviewed, and the valid domain of parameters is emphasized. In the following chapters, we shall investigate the fundamental problems of sampling and statistical inference for these GRFs or MRFs. Chapter 3 Sampling Gibbs and Markov Random Fields Texture synthesis based on a GRF or an MRF means sampling a random field. Two sampling algorithms based on stochastic relaxation for discrete GRFs were proposed [Ham64, Gem84] but not validated. We propose some criteria for validating sampling algorithms. The two algorithms are compared for sampling the Derin-Elliott model and the auto-binomial model. We propose methodology to detect unreliable and nonuniform GRFs as a first step towards identifying useful GRFs. The phenomenon of "phase transition" in the Ising model [Kin80, Pic87] is shown to exist in the second-order isotropic Derin-Elliott model. A critical value for "phase transition" is conjectured that helps identify useful GRFs. A sampling algorithm based on simulating a multivariate normal distribution by using a 2-D Fourier transform for auto-normal GRFs is also reviewed and discussed. 3.1. Sampling Discrete GRFs The objective of a sampling algorithm is to generate an image x that occurs with frequency dictated by P(X=x) in (2.1). To sample a GRF is to simulate a multi-dimensional distribution. Traditional techniques for simulating a discrete multi-dimensional distribution, such as rejection-acceptance and conditional probability decomposition [Ken80], are impractical due to the complicated partition function involved in a GRF.
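Before sampling an auto-normal GRF, the valid domain of parameters emphasized above can be verified numerically: every eigenvalue λ_{u,v} = 1 - 2θ_1 cos(2πv/N) - 2θ_2 cos(2πu/M) from (2.11) must be positive. The sketch below assumes that 1st-order eigenvalue formula; the function name is hypothetical.

```python
import math

def gmrf_params_valid(t1, t2, M, N):
    """Check positivity of all eigenvalues in eq. (2.11) for a 1st-order
    auto-normal (GMRF) model on an M x N torus."""
    return all(
        1 - 2 * t1 * math.cos(2 * math.pi * v / N)
          - 2 * t2 * math.cos(2 * math.pi * u / M) > 0
        for u in range(M) for v in range(N)
    )

# For even M, N this agrees with the closed form |t1| + |t2| < 1/2 (2.12).
```

The same positivity scan extends to higher-order neighborhoods, where (2.12) has no simple closed form, by adding one cosine term per direction.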
A sampling algorithm need not generate a realization with minimum energy, nor should an algorithm generate the same realization every time. Two sampling algorithms will be studied. Algorithm C is based on [Ham64] and Algorithm G is based on [Gem84], also known as the Gibbs sampler. The algorithms are listed below. It is well known that both algorithms converge to a realization having the distribution (2.1), but the number of iterations needed and the rate of convergence are unknown. Establishing the convergence rate is a difficult problem and a completely satisfactory solution to this problem does not exist. In practical applications, the validity of a realization is judged by algorithmic criteria and empirical evidence such as visual appearance. Another algorithm, proposed by Metropolis et al. [Met53] and used by Cross and Jain [Cro83], generates a realization with the number of pixels in each gray level the same as in the initial image. Metropolis et al. showed that the algorithm simulates a GRF whose domain is a subset of Ω, and which may no longer be an MRF since it violates the positivity condition of Definition 2.6. The Metropolis algorithm does prevent convergence to a single-color image. However, it avoids this difficulty in an artificial manner and may lead to misleading interpretations of parameter values. Algorithm C /* Sampling an M×N discrete GRF with G colors from A = {0, 1, ..., G-1} */ /* The inputs are parameters of the GRF defined in Section 2.2 */ /* The output coloring x is called a realization of this GRF */ (1) For s = 1 to MN, randomly assign g ∈ A to pixel s to define an initial coloring x. /* a random initial state */ (2) For s = 1 to MN Do (a) Let y_t = x_t for all t ≠ s. Choose g ∈ A at random and let y_s = g. /* a potential next state */ (b) Let p = min{1, P(y)/P(x)}, where P is defined in (2.1). (c) x ← y with probability p. (3) Repeat Step (2) for KC iterations.
/* stopping criteria */ Algorithm G (Gibbs Sampler) /* Sampling an M×N discrete GRF with G colors from A = {0, 1, ..., G-1} */ /* The inputs are parameters of the GRF defined in Section 2.2 */ /* The output coloring x is called a realization of this GRF */ (1) For s = 1 to MN, randomly assign g ∈ A to pixel s to define an initial coloring x. /* a random initial state */ (2) For s = 1 to MN Do (a) p_g ← P[x_s = g | R_s], for g = 0, 1, ..., G-1; where R_s is the set of neighboring colors of pixel s. (b) x_s ← g with probability p_g. (3) Repeat Step (2) for KG iterations. /* stopping criteria */ The only difference between Algorithms C and G is in Step (2). For G > 2, Algorithm G takes more time to compute Step (2) since it involves the computation of G exponential functions (see (2.13)). In addition, the exponents may be so large as to make this computation inaccurate. Algorithm C successfully avoids these drawbacks. It computes the ratio of two probability densities, and the specific form of a GRF (2.5) reduces this to computing only one exponential function. If the number of iterations KC or KG is fixed, increasing G does not increase the time complexity of Algorithm C, but it does increase the time complexity of Algorithm G. The selection of an optimal number of iterations corresponding to GRF parameters is an open problem. Empirical evidence for selecting KC and KG will be given later. The factors affecting sampling algorithms are: initial state, parameters of the GRF, number of colors, image size, and number of iterations KC or KG. Although it has been proved [Ham64, Gem84] that Algorithms C and G will converge to a realization having the distribution (2.1) from any initial state, the initial state will affect the convergence rate. An initial state with randomly colored pixels is commonly used. In this chapter, we are primarily interested in the number of iterations. 3.2.
Methods for Validating Sampling Algorithms Procedures for sampling a Gibbs distribution have appeared frequently in statistical mechanics [Met53, Kin80], in texture synthesis [Cro80, Der85, Der87, Has80], and in image restoration [Cho87, Gem84]. A sampling algorithm is said to be valid if it simulates a Gibbs distribution defined in (2.1). The validation of sampling algorithms has not been studied formally except by Cross [Cro80], who used the ratio of pixel exchanges in an iteration as a stopping criterion. Cross used the Metropolis exchange algorithm [Met53] to generate textures. The number of pixels in each gray level is fixed by the initial state, so his domain of a Gibbs distribution is different from the one used in this thesis. The GRF-MRF equivalence theorem subject to his domain may not be true. Algorithms C and G are designed to sample the probability mass function (2.1). Theoretically, the validation of the algorithms can be checked by Pearson's χ² statistic [Moo74]. Suppose n >> |Ω| independent realizations could be generated, with n_x realizations of state x, where Σ_{x∈Ω} n_x = n. The χ² statistic is defined as R = Σ_{x∈Ω} (n_x - nP(x))² / (nP(x)). If R < χ²_α(|Ω|-1), the α-quantile of the χ² distribution with |Ω|-1 degrees of freedom, we have 100α% confidence that the algorithm is correct. The complicated partition function and huge number of states (there are G^{MN} states for M×N G-color images) prohibit validating sampling algorithms by this method. Instead, we propose four indices, based on empirical evidence, to validate the algorithms. We assume an algorithm requires at least K iterations. We say the algorithm is valid if the index |σ/μ| is small, where μ and σ² are the mean and variance of the following quantities: (1) Number of pixel changes over iterations {K, K+1, ..., K+h-1}. (2) Energy over iterations {K, K+1, ..., K+h-1}.
(3) Number of pixel changes at iteration K over m independent generations realized from m randomly chosen initial states. (4) Energy at iteration K over m independent generations. For computational reasons, only quantities collected from (1) and (2) are used in our studies. 3.3. The Comparison of Algorithms C and G We compare the sampling Algorithms C and G based on the following criteria. (1) Indices for validating algorithms -- Let a and b be the mean and standard deviation of quantities collected from (1) in Section 3.2, and let c and d be the mean and standard deviation of quantities from (2) in Section 3.2. For K ≥ 50, h = 50, δ = 10%, if |b/a| < δ or |d/c| < δ, we say the algorithm is valid. The selections of K, h, and δ are based on empirical studies. (2) Visual inspection of intermediate images -- A good sampling algorithm tends to generate images which are "similar" after a certain number of iterations. (3) CPU time for each iteration on a VAX8600. (4) Distance between the true model parameters and estimated parameters (Chapter 4). The comparison is based on two discrete second-order GRF models: the Derin-Elliott model and the auto-binomial model. Algorithms C and G are compared experimentally in the problem of sampling these two GRF models for 64×64 4-color images. The number of colors, 4, is arbitrarily chosen, and is more general than binary images. The methodology can be extended for comparing the algorithms for other models with different sizes of images having more or fewer colors. How many colors are most appropriate is another issue. 3.3.1. Comparison of Algorithms for the Derin-Elliott Model The model in Section 2.2.1 tends to generate textures with an almost equal number of pixels in each color if the parameters are chosen appropriately (Appendix A). The parameters used for the following experiments are chosen from [Der87]. Experiment DE1 Choose parameters α_s = 0 for s ∈ {0, 1, 2, 3}, θ_1 = θ_2 = θ_3 = 1, θ_4 = -1, and KC = KG = 300.
We expect to obtain streak-like textures having 135° directionality. The initial image and the realizations at iterations 50, 100, 200, and 300 by Algorithm C are shown in Figure 3-1 (a)-(e). The corresponding textures by Algorithm G are shown in Figure 3-2 (a)-(e). Figures 3-1 and 3-2 (b)-(e) demonstrate the expected 135° directionality and are all visually similar. The frequencies of colors for the images in Figures 3-1 and 3-2 are given in Table 3-1, which indicates that the number of pixels in all images is uniformly distributed over colors. A visual inspection suggests that both algorithms generate visually indistinguishable textures after 50 iterations and that the two algorithms generate the same type of textures. Plots of the number of pixel changes and energy vs. iteration are shown in Figures 3-3 and 3-4, respectively. Algorithm C has fewer pixel changes but higher energy at each iteration than Algorithm G, which suggests that Algorithm G tends to generate a minimum energy state, while Algorithm C tends to generate a "stable state". Table 3-2 shows that both algorithms satisfy Criterion (1) with K = 50. Algorithm C needs 0.8 seconds per iteration, and Algorithm G needs 1.6 seconds per iteration. The parameters estimated by the pseudo maximum likelihood method (Section 4.1.1) from the textures in Figures 3-1 and 3-2 are given in Table 3-3 and are numerically close to the true parameters. We conclude that 50 iterations are enough for both algorithms, that Algorithm G tends to generate a realization with smaller energy than Algorithm C if the number of iterations is fixed, and that Algorithm C is faster than Algorithm G.
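For a single-site update, both samplers compared in these experiments reduce to local energy computations only, since all other terms of U cancel: Algorithm C needs one ratio (one exponential), while Algorithm G needs all G conditional weights. A runnable sketch under an illustrative isotropic pair potential (not the exact Derin-Elliott parameters used above):

```python
import math
import random

# Sketch of Algorithms C and G on an M x N, G-color field with periodic
# boundaries. The potential below rewards equal 4-neighbors; it is an
# illustrative stand-in for the Derin-Elliott energy.
def local_energy(x, i, j, g, M, N, theta=1.0):
    e = 0.0
    for di, dj in ((0, 1), (0, -1), (1, 0), (-1, 0)):
        e += -theta if x[(i + di) % M][(j + dj) % N] == g else theta
    return e

def algorithm_C(M, N, G, iters, rng):
    """Metropolis-style sweep: propose a random color, accept with
    probability min{1, P(y)/P(x)} = min{1, exp(-dU)}."""
    x = [[rng.randrange(G) for _ in range(N)] for _ in range(M)]
    for _ in range(iters):
        for i in range(M):
            for j in range(N):
                g = rng.randrange(G)
                dU = (local_energy(x, i, j, g, M, N)
                      - local_energy(x, i, j, x[i][j], M, N))
                if dU <= 0 or rng.random() < math.exp(-dU):
                    x[i][j] = g
    return x

def algorithm_G(M, N, G, iters, rng):
    """Gibbs-sampler sweep: draw each pixel from its full local
    conditional, which needs all G exponentials (eq. (2.13))."""
    x = [[rng.randrange(G) for _ in range(N)] for _ in range(M)]
    for _ in range(iters):
        for i in range(M):
            for j in range(N):
                w = [math.exp(-local_energy(x, i, j, g, M, N))
                     for g in range(G)]
                x[i][j] = rng.choices(range(G), weights=w)[0]
    return x
```

The G-fold work per site in `algorithm_G` is the source of the per-iteration timing gap reported above.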
Figure 3-1 Images generated by Algorithm C with θ = (1, 1, 1, -1) for Experiment DE1 -- (a) Initial image, (b) 50 iterations, (c) 100 iterations, (d) 200 iterations, (e) 300 iterations

Figure 3-2 Images generated by Algorithm G with θ = (1, 1, 1, -1) for Experiment DE1 -- (a) Initial image, (b) 50 iterations, (c) 100 iterations, (d) 200 iterations, (e) 300 iterations

Figure 3-3 No. of Pixel Changes vs. Iteration for Experiment DE1 (Algorithms C and G)

Figure 3-4 Energy vs. Iteration for Experiment DE1 (Algorithms C and G)

Table 3-1 Frequency of Colors for Textures in Experiment DE1

  Figure     0     1     2     3  |  Figure     0     1     2     3
  Initial 1008  1011  1002  1075  |  Initial 1008  1011  1002  1075
  3-1(b)   950   953  1002  1191  |  3-2(b)  1018   977   977  1124
  3-1(c)   924  1056   920  1196  |  3-2(c)  1001  1112   934  1049
  3-1(d)   811  1116   916  1253  |  3-2(d)   863  1202  1049   982
  3-1(e)   859  1027   968  1242  |  3-2(e)   969  1031  1080  1016

Table 3-2 Mean, Std. of Pixel Changes and Energy in Experiment DE1

  Algorithm    a    b  |b/a|      c    d  |d/c|
  C          190   13  7.1%  -6788   77  1.1%
  G          329   18  5.4%  -7205   50  0.7%

Table 3-3 Estimated Parameters for Textures in Experiment DE1

  Figure   θ1     θ2     θ3     θ4    |  Figure   θ1     θ2     θ3     θ4
  True    1.000  1.000  1.000 -1.000  |  True    1.000  1.000  1.000 -1.000
  3-1(b)  1.019  1.015  0.993 -1.010  |  3-2(b)  0.837  0.819  1.044 -0.712
  3-1(c)  1.006  0.951  0.957 -1.029  |  3-2(c)  0.962  1.098  1.070 -1.049
  3-1(d)  1.068  1.070  0.980 -1.188  |  3-2(d)  1.246  1.161  0.912 -1.252
  3-1(e)  1.029  0.820  1.042 -0.959  |  3-2(e)  0.957  1.041  1.058 -1.045

Experiment DE2 Choose parameters α_s = 0 for s ∈ {0, 1, 2, 3}, θ_1 = θ_2 = 2, θ_3 = θ_4 = -1.0, and KC = KG = 300. We expect to obtain textures which have simultaneous horizontal and vertical directionality.
The textures and tables corresponding to those of Experiment DE1 are given in Figures 3-5 through 3-8 and Tables 3-4, 3-5, and 3-6, respectively. From a visual inspection of the textures in Figures 3-5 and 3-6, the numerical closeness between estimated parameters and true parameters, and the small variances of energies and pixel exchanges between 50 and 99 iterations, we draw the same conclusions as in Experiment DE1. Although the theorems behind the sampling algorithms [Ham64, Gem84] say that the number of iterations should be as large as possible, more iterations require more computations. Our experiments suggested that 50 iterations should be enough to generate "stable images" from the Derin-Elliott model.

Figure 3-5 Images generated by Algorithm C with θ = (2, 2, -1, -1) for Experiment DE2 -- (a) Initial image, (b) 50 iterations, (c) 100 iterations, (d) 200 iterations, (e) 300 iterations

Figure 3-6 Images generated by Algorithm G with θ = (2, 2, -1, -1) for Experiment DE2 -- (a) Initial image, (b) 50 iterations, (c) 100 iterations, (d) 200 iterations, (e) 300 iterations

Figure 3-7 No. of Pixel Changes vs. Iteration for Experiment DE2 (Algorithms C and G)

Figure 3-8 Energy vs. Iteration for Experiment DE2 (Algorithms C and G)

Table 3-4 Frequency of Colors for Textures in Experiment DE2

  Figure     0     1     2     3  |  Figure     0     1     2     3
  Initial 1008  1011  1002  1075  |  Initial 1008  1011  1002  1075
  3-5(b)  1030  1054   916  1096  |  3-6(b)   995   995  1147   959
  3-5(c)  1020  1204   834  1038  |  3-6(c)  1024   912  1053  1107
  3-5(d)  1094  1148   830  1024  |  3-6(d)  1051  1042  1054   949
  3-5(e)   972  1213   958   953  |  3-6(e)  1081   987  1068   960

Table 3-5 Mean, Std.
of Pixel Changes and Energy in Experiment DE2

Algorithm     a     b   |b/a|      c     d   |d/c|
C           139    12   8.9%   -6304    62   1.0%
G           249    21   8.5%   -6680    54   0.8%

Table 3-6 Estimated Parameters for Textures in Experiment DE2

Figure      θ1     θ2     θ3     θ4      Figure      θ1     θ2     θ3     θ4
True     2.000  2.000 -1.000 -1.000      True     2.000  2.000 -1.000 -1.000
3-5(b)   2.027  2.048 -0.991 -1.082      3-6(b)   2.121  2.094 -1.069 -1.071
3-5(c)   2.065  2.058 -1.152 -0.929      3-6(c)   2.051  2.024 -1.035 -1.000
3-5(d)   1.901  1.877 -0.937 -0.920      3-6(d)   2.049  2.061 -1.058 -0.989
3-5(e)   1.968  1.943 -0.968 -0.987      3-6(e)   2.016  2.034 -0.981 -1.038

3.3.2. Comparison of Algorithms for the Auto-binomial Model

The global definition in Section 2.2.2 and the local conditional density given in (2.15) predict that the auto-binomial model tends to generate images having an unbalanced number of pixels in each color, with high frequency for one of the extremal colors and low frequency for the other, unless the model parameters are small and their sum is near zero. The parameters for the following experiments are chosen to demonstrate these phenomena in addition to comparing the two algorithms.

Experiment AB1

Choose parameters α = 0, θ1 = θ2 = θ3 = θ4 = 0.3, and KC = KG = 300. We expect to obtain images which have little interaction in any direction. The initial image and the realizations at iterations 50, 100, 200, and 300 by Algorithm C are shown in Figure 3-9 (a)-(e). The corresponding textures by Algorithm G are shown in Figure 3-10 (a)-(e). The frequencies of colors for images in Figures 3-9 and 3-10 are given in Table 3-7, which indicates that color 0 has high frequency but color 3 has low frequency. This coincides with our prediction, since all the parameters are positive and their sum, 1.2, is away from zero. A visual inspection of the textures in Figures 3-9 and 3-10 suggests that the intermediate textures are similar from iteration 50 to 300. In other words, both algorithms generate visually indistinguishable textures.
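The local conditional density (2.15) makes the auto-binomial model straightforward to sample with a Gibbs-type sweep. The sketch below assumes the common auto-binomial form in which X_t given its neighbors is Binomial(G-1, p) with logit(p) = α + Σ_k θ_k (x_{t:-k} + x_{t:+k}); the exact form of the thesis' Eq. (2.15) may differ, and the toroidal boundary handling is our choice, not stated in the text.

```python
import numpy as np

def gibbs_sweep_autobinomial(img, theta, alpha=0.0, G=4, rng=None):
    """One raster-scan sweep of a Gibbs sampler for an auto-binomial model:
    X_t | neighbors ~ Binomial(G-1, p), logit(p) = alpha + sum_k theta_k * (two
    neighbors in direction k).  The four direction pairs are horizontal,
    vertical, and the two diagonals of the 2nd-order system (toroidal edges)."""
    if rng is None:
        rng = np.random.default_rng()
    M, N = img.shape
    offs = [((0, -1), (0, 1)),    # horizontal pair
            ((-1, 0), (1, 0)),    # vertical pair
            ((-1, -1), (1, 1)),   # one diagonal pair
            ((-1, 1), (1, -1))]   # the other diagonal pair
    for i in range(M):
        for j in range(N):
            s = alpha
            for k, (d1, d2) in enumerate(offs):
                s += theta[k] * (img[(i + d1[0]) % M, (j + d1[1]) % N]
                                 + img[(i + d2[0]) % M, (j + d2[1]) % N])
            p = 1.0 / (1.0 + np.exp(-s))      # logistic link
            img[i, j] = rng.binomial(G - 1, p)
    return img
```

With positive θ summing well away from zero, repeated sweeps drive the color histogram toward one extremal color, which is the imbalance the experiments below demonstrate.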
Plots of the number of pixel changes and energy vs. iteration are shown in Figures 3-11 and 3-12. Both algorithms seem to have about the same energy at each iteration, but Algorithm C has half as many pixel changes in each iteration; why this happens is unknown. Table 3-8 shows that both algorithms satisfy Criterion (1) with K = 50. Algorithm C needs 0.8 seconds per iteration, and Algorithm G needs 0.9 seconds per iteration. The parameters estimated by Besag's coding scheme [Bes74, Chapter 4] from the textures in Figures 3-9 and 3-10 are given in Table 3-9, and are numerically close to the true parameters. This evidence suggests that both algorithms are valid when the number of iterations is 50; Algorithm C is faster than Algorithm G and is favored.

Figure 3-9 Images generated by Algorithm C with θ = (0.3, 0.3, 0.3, 0.3) for Experiment AB1 -- (a) Initial image, (b) 50 iterations, (c) 100 iterations, (d) 200 iterations, (e) 300 iterations

Figure 3-10 Images generated by Algorithm G with θ = (0.3, 0.3, 0.3, 0.3) for Experiment AB1 -- (a) Initial image, (b) 50 iterations, (c) 100 iterations, (d) 200 iterations, (e) 300 iterations

[Plot: number of pixel changes vs. iteration for Algorithms C and G, iterations 10-210]
Figure 3-11 No. of Pixel Changes vs. Iteration for Experiment AB1

[Plot: energy vs. iteration for Algorithms C and G, iterations 10-210]
Figure 3-12 Energy vs. Iteration for Experiment AB1

Table 3-7 Frequency of Colors for Textures in Experiment AB1

Figure      0     1     2     3      Figure       0     1     2     3
Initial  1008  1011  1002  1075      Initial   1008  1011  1002  1075
3-9(b)   2139  1449   451    57      3-10(b)   2166  1447   430    53
3-9(c)   2178  1470   407    41      3-10(c)   2095  1516   431    54
3-9(d)   2183  1447   414    52      3-10(d)   2095  1516   431    54
3-9(e)   2145  1476   417    58      3-10(e)   2122  1476   445    53

Table 3-8 Mean, Std.
of Pixel Changes and Energy in Experiment AB1

Algorithm     a     b   |b/a|     c     d   |d/c|
C          1183    29   2.4%   -549    31   5.5%
G          2262    28   1.2%   -546    28   5.2%

Table 3-9 Estimated Parameters for Textures in Experiment AB1

Figure      θ1     θ2     θ3     θ4      Figure       θ1     θ2     θ3     θ4
True     0.300  0.300  0.300  0.300      True      0.300  0.300  0.300  0.300
3-9(b)   0.314  0.287  0.313  0.276      3-10(b)   0.332  0.263  0.302  0.308
3-9(c)   0.326  0.308  0.347  0.282      3-10(c)   0.290  0.275  0.275  0.316
3-9(d)   0.299  0.322  0.342  0.299      3-10(d)   0.290  0.275  0.275  0.316
3-9(e)   0.290  0.308  0.325  0.289      3-10(e)   0.281  0.337  0.279  0.273

Experiment AB2

Choose parameters α = 0, θ1 = θ2 = 0.2, θ3 = θ4 = -0.2. Since the parameters are small and sum to 0, we expect the number of pixels in each generated image to be uniform over colors. The textures and tables for Experiment AB2, corresponding to those of Experiment AB1, are given in Figures 3-13 through 3-16 and Tables 3-10, 3-11, and 3-12, respectively. From a visual inspection of the textures in Figures 3-13 and 3-14, the numerical closeness between estimated and true parameters, and the small variances of energies and pixel exchanges between iterations 50 and 99, we draw the same conclusions as in Experiment AB1. The results in Figures 3-9, 3-10, 3-13, and 3-14 show that after 50 iterations, both algorithms generate visually indistinguishable textures. The ratios of standard deviation to mean for pixel changes and energy between iterations 50 and 99, characterized in Tables 3-8 and 3-11, satisfy Criterion (1) defined in this section. The estimated parameters in Tables 3-9 and 3-12 are numerically close to the true parameters at iteration 50. We suggest that 50 iterations are enough for both algorithms; Algorithm C runs faster than Algorithm G and is favored.
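Criterion (1) compares the mean and standard deviation of the pixel-change and energy traces over iterations 50 through 99. A minimal sketch of that check follows; the 10% acceptance tolerance is our illustrative choice, not a value from the thesis.

```python
import numpy as np

def stability_ratios(pixel_changes, energies, start=50, stop=100):
    """Over iterations [start, stop), compute the mean (a, c) and standard
    deviation (b, d) of the per-iteration pixel-change counts and energies,
    returning the ratios |b/a| and |d/c| reported in Tables 3-2, 3-5, 3-8, 3-11.
    Small ratios indicate the sampler has reached a stable regime."""
    pc = np.asarray(pixel_changes[start:stop], dtype=float)
    en = np.asarray(energies[start:stop], dtype=float)
    a, b = pc.mean(), pc.std(ddof=1)
    c, d = en.mean(), en.std(ddof=1)
    return abs(b / a), abs(d / c)

def satisfies_criterion(pixel_changes, energies, tol=0.10):
    """Accept the run when both ratios fall below the tolerance."""
    r1, r2 = stability_ratios(pixel_changes, energies)
    return r1 < tol and r2 < tol
```

For example, a run whose pixel-change count has settled near a constant after iteration 50 yields ratios well under a few percent, matching the magnitudes in the tables above.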
Figure 3-13 Images generated by Algorithm C with θ = (0.2, 0.2, -0.2, -0.2) for Experiment AB2 -- (a) Initial image, (b) 50 iterations, (c) 100 iterations, (d) 200 iterations, (e) 300 iterations

Figure 3-14 Images generated by Algorithm G with θ = (0.2, 0.2, -0.2, -0.2) for Experiment AB2 -- (a) Initial image, (b) 50 iterations, (c) 100 iterations, (d) 200 iterations, (e) 300 iterations

[Plot: number of pixel changes vs. iteration for Algorithms C and G, iterations 10-210]
Figure 3-15 No. of Pixel Changes vs. Iteration for Experiment AB2

[Plot: energy vs. iteration for Algorithms C and G, iterations 10-210]
Figure 3-16 Energy vs. Iteration for Experiment AB2

Table 3-10 Frequency of Colors for Textures in Experiment AB2

Figure      0     1     2     3      Figure       0     1     2     3
Initial  1008  1011  1002  1075      Initial   1008  1011  1002  1075
3-13(b)   910  1102  1143   941      3-14(b)    949  1098  1149   900
3-13(c)   851  1190  1180   875      3-14(c)    903  1161  1103   929
3-13(d)   908  1157  1138   893      3-14(d)    909  1169  1072   946
3-13(e)   929  1108  1116   943      3-14(e)    958  1130  1035   973

Table 3-11 Mean, Std. of Pixel Changes and Energy in Experiment AB2

Algorithm     a     b   |b/a|      c     d   |d/c|
C          1437    42   2.9%   -4404    48   1.1%
G          2479    31   1.3%   -4434    44   1.0%

Table 3-12 Estimated Parameters for Textures in Experiment AB2

Figure       θ1     θ2      θ3      θ4      Figure        θ1     θ2      θ3      θ4
True      0.200  0.200  -0.200  -0.200      True       0.200  0.200  -0.200  -0.200
3-13(b)   0.174  0.218  -0.178  -0.224      3-14(b)    0.224  0.182  -0.182  -0.214
3-13(c)   0.230  0.155  -0.184  -0.207      3-14(c)    0.161  0.230  -0.195  -0.196
3-13(d)   0.188  0.210  -0.207  -0.187      3-14(d)    0.163  0.228  -0.201  -0.191
3-13(e)   0.178  0.220  -0.191  -0.208      3-14(e)    0.200  0.208  -0.189  -0.217

3.4. Towards Identifying Useful Discrete GRFs

A GRF is completely specified by a parameter vector. Although all parameter vectors can be used for specifying discrete GRFs, some are unreliable and some are nonuniform.
An unreliable GRF is a GRF which tends to produce images whose characteristics depend on the initial image, even when a random initial image (an image with each pixel randomly assigned a color) is chosen. A GRF is said to be reliable if it is not unreliable. A nonuniform GRF is a GRF which tends to produce images with unbalanced numbers of pixels over colors. A GRF is said to be uniform if it is not nonuniform. An example of a nonuniform GRF is a first-order isotropic binary Derin-Elliott GRF (Ising model [Kin80]) with parameters θ1 = θ2 = θ3 = θ4 > Tc, where the critical value Tc ≈ 0.44. It is well known that an Ising model whose parameter is beyond the critical value tends to generate a single-color realization [Kin80, Pic87]. This phenomenon is called "phase transition" in statistical physics. A GRF model is "useful" in applications if it is reliable and uniform.

We also define two GRFs to be indistinguishable if they tend to generate images which have the same characteristics. Two useful GRFs with very different parameters may be indistinguishable. In texture synthesis, the goal is to represent a texture by a few parameters, so distinguishability is not a serious problem. If one wants to construct a set of useful GRFs such that each GRF is reliable, uniform, and indistinguishable from the other GRFs in the same set, one can represent all the useful but indistinguishable GRFs in that set by the GRF whose parameter vector has the least norm.

This section establishes procedures for detecting unreliable and nonuniform GRFs so as to identify the portion of the parameter space leading to useful GRF models. Little of the theory is known, so we pursue an experimental approach. The major factors affecting this study are the type of GRF, the order of the neighborhood system, the sampling algorithm, the image size, and the number of colors in an image.
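Both properties can be checked quantitatively. The sketch below computes the directional same-color features used in the reliability procedure of Section 3.4.1 and the uniformity index γ of Section 3.4.2; the toroidal boundary for the pair counts is our assumption, since the thesis does not state its boundary handling.

```python
import numpy as np

def directional_features(img):
    """Fraction of same-color adjacent pixel pairs in the directions
    0°, 45°, 90°, 135° (toroidal boundary).  Each count is divided by the
    number of pixels, so every feature lies in [0, 1]."""
    feats = []
    for di, dj in [(0, 1), (-1, 1), (1, 0), (1, 1)]:
        shifted = np.roll(np.roll(img, di, axis=0), dj, axis=1)
        feats.append(np.mean(img == shifted))
    return np.array(feats)

def uniformity_index(img, G=4):
    """gamma = min_i r_i, the smallest per-color pixel count."""
    return np.bincount(img.ravel(), minlength=G).min()

def is_uniform_image(img, G=4, t=0.4):
    """Uniform criterion: gamma >= t * (M*N) / G, with threshold t < 1."""
    M, N = img.shape
    return uniformity_index(img, G) >= t * M * N / G
```

Running the feature extractor on n images generated from n random initial states, and then clustering the resulting vectors, gives the reliability test; counting how many of the n images pass `is_uniform_image` gives the uniformity test.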
We will characterize the general procedures used to attack these problems, but limit our experiments to the 2nd-order Derin-Elliott model with four parameters {θ1, θ2, θ3, θ4} for 4-color 64×64 images.

3.4.1. A Procedure for Detecting Unreliable GRFs

A GRF is specified by a parameter vector θ = (θ1, θ2, θ3, θ4). To determine if a GRF is reliable is to check whether images generated from this GRF with different random initial images have the same characteristics. The following procedure quantitatively determines whether a GRF is reliable.

(1) Select a sampling algorithm for the given GRF.
(2) Generate n images from n different random initial states.
(3) Compute a feature vector vi for image i, 1 ≤ i ≤ n.
(4) Find the number of "clusters" in the set of feature vectors {vi}.
(5) If more than one cluster exists, the given GRF is unreliable.

Algorithm C with 50 iterations (see Section 3.3) will serve as the sampling algorithm to generate 20 images (n = 20). As features, we count the number of pairs of adjacent pixels which have the same color in the directions 0°, 45°, 90°, 135°. Each feature is normalized to lie between 0 and 1 by dividing the count by the total number of possible pairs in each direction (which equals the total number of pixels). These features are related to the energy function and hence capture some properties of the model. Other possible features are discussed in Chapter 5. Step (4) involves the choice of a clustering algorithm and a criterion for determining the number of clusters. We project the feature vectors to a two-dimensional space by the principal component projection [And84, Mor76] and visually examine the number of clusters (Section 3.4.3). The sampling algorithm, the selected feature vectors, and the clustering algorithm in this study are by no means unique. Our goal is to characterize a general procedure and to propose a possible approach to the problem.

3.4.2.
A Procedure for Detecting Nonuniform GRFs

Let ri be the number of pixels with color i in an image, where 0 ≤ i ≤ 3, and let γ = min{ri : 0 ≤ i ≤ 3}. Define a uniform image to be an image which satisfies the "uniform criterion": γ ≥ γ0, where γ0 = 64²·t/4 and t < 1 is a threshold. The following procedure characterizes a quantitative measure to determine if a GRF is uniform.

(1) Select a sampling algorithm for the given GRF.
(2) Generate n images from n different random initial states.
(3) Compute the index γ for images 1 to n.
(4) Count the number k of images which satisfy the uniform criterion.
(5) If k < nα, the given GRF is nonuniform.

Algorithm C with 50 iterations and n = 20 will be used for our study. The selection of the thresholds t and α will affect the solution, but no theoretical guide exists, so we heuristically set t = 0.4, α = 0.8.

3.4.3. Methodology for Identifying Useful GRFs

We will use the procedures proposed in Sections 3.4.1 and 3.4.2 to detect unreliable or nonuniform GRFs as a step in identifying the portion of the parameter space leading to useful models. We strive for a deeper understanding of the relationship between images and models. In addition to studying the reliability and uniformity of the GRFs, we would also like to identify a critical value for phase transition in the 2nd-order Derin-Elliott model. The critical value for phase transition in the 1st-order isotropic binary Derin-Elliott model is around 0.44 [Kin80, Pic87], but the critical value is unknown for higher order GRFs or multi-color GRFs. The critical value for phase transition is important: beyond it, the model tends to generate nonuniform images.

We limit our study to a portion of the parameter space by specifying two sets of GRFs in the following experiments. The first experiment contains 9 isotropic GRFs, each parameterized with four identical parameters ranging from -1.0 to 1.0.
The second experiment involves 14 systematically chosen GRFs, which are either direction-oriented or blob-like. The goal is also to determine whether there exist indistinguishable GRFs which are represented by "well-separated" parameter vectors. Monte Carlo simulations are used for the study. The number of images used for the study is taken to be 20. Algorithm C with 50 iterations is used for sampling GRFs. The effects of the number of iterations in the sampling algorithm will be discussed in Section 3.4.4.

To determine if two useful GRFs are indistinguishable, we apply Steps (1)-(4) of the procedure in Section 3.4.1 to these two GRFs to obtain feature vectors. If Step (5) indicates that the feature vectors form a single cluster, the two GRFs are indistinguishable. An alternative way to determine whether the feature vectors from two GRFs form a single cluster is by means of a pattern recognition technique [Dud73, Dev82]. The feature vectors computed from realizations of the two GRFs form a training pattern matrix with known categories. We apply Fisher's linear classifier [Fis36, Mor76, Appendix B] to classify the patterns and the leave-one-out method to examine the error rate of the classifier on this pattern matrix. If the error rate is not near zero, then the patterns corresponding to the different GRFs are not "separated" and the two GRFs are indistinguishable. Indistinguishable GRFs can form a class and be represented by a single GRF, for example, the one whose parameter vector has the least norm.

Experiment PH1

This experiment studies the reliability and uniformity of nine Derin-Elliott GRFs along a line in the parameter space whose parameter vectors θ are: (a) 1.0·1, (b) 0.5·1, (c) 0.3·1, (d) 0.1·1, (e) 0.0·1, (f) -0.1·1, (g) -0.3·1, (h) -0.5·1, (i) -1.0·1, where 1 is a 4-tuple vector of all 1's. We apply the procedure in Section 3.4.1 to compute 20 feature vectors from the 20 images generated for each GRF (by Algorithm C with 50 iterations).
We collect the 180 feature vectors corresponding to the 20 samples of each of the 9 GRFs in a set T, and project the vectors in T to a two-dimensional space by using the principal component projection. Visually inspecting the scatter plot in Figure 3-17 (the largest two eigenvalues are 0.37417 and 0.00041) shows that categories c through f each form single clusters, and categories a and b together form one cluster. We conclude that all the GRFs are reliable with 50 iterations.

We then apply the procedure proposed in Section 3.4.2 to detect uniformity. Table 3-13 characterizes the number of images satisfying the uniform criterion for each set of parameters. The GRFs with parameter vectors θ = r·1 for r ≤ 0.3 are strongly uniform, but those with r > 0.3 are weakly uniform at 50 iterations. Figure 3-17 suggests that the GRFs with parameter vectors 1.0·1 and 0.5·1 may be indistinguishable. The leave-one-out error of Fisher's classifier on the feature vectors corresponding to the two GRFs with parameter vectors 1.0·1 and 0.5·1 is 0.225, which indicates that GRFs (a) and (b) are indistinguishable. Figure 3-18 shows images generated by these nine GRFs. Figures 3-18(a) and 3-18(b) are "visually indistinguishable", which supports our conclusion.

Finally, we compute the mean vector of the 20 feature vectors for each GRF, and plot the first two components versus parameter values in Figure 3-19. Figure 3-19 exhibits an abrupt change between parameter values 0.1 and 0.3. This abrupt change demonstrates an interesting phenomenon which will be further discussed in Section 3.4.5.

Table 3-13 No.
of Uniform Images (out of 20) in Experiments PH1, PH2, PH3

Experiment    (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) (l) (m) (n)
PH1            19  18  20  20  20  20  20  20  20
PH2            20  20  20  19  20  20  20  19  20  20  20  20  20  20
PH3(A)(50)     20  20  19  20  19
PH3(B)(500)    20  14   4   6   6

[Scatter plot of projected feature vectors, categories a through i]
Figure 3-17 PCP of feature vectors in Experiment PH1

Figure 3-18 Images generated by Algorithm C at 50 iterations with θ = r·1, where r = (a) 1.0, (b) 0.5, (c) 0.3, (d) 0.1, (e) 0.0, (f) -0.1, (g) -0.3, (h) -0.5, (i) -1.0

[Plot: mean feature components (0° and 90° directions) vs. parameter value]
Figure 3-19 Mean feature vs. parameter value in Experiment PH1

Experiment PH2

This experiment examines the reliability and uniformity of fourteen Derin-Elliott GRFs with parameter vectors θ = (a) (1, 1, 1, -1), (b) (1, 1, -1, 1), (c) (1, 1, -1, -1), (d) (1, -1, 1, 1), (e) (1, -1, 1, -1), (f) (1, -1, -1, 1), (g) (1, -1, -1, -1), (h) (-1, 1, 1, 1), (i) (-1, 1, 1, -1), (j) (-1, 1, -1, 1), (k) (-1, 1, -1, -1), (l) (-1, -1, 1, 1), (m) (-1, -1, 1, -1), (n) (-1, -1, -1, 1).

We apply the procedure in Section 3.4.1 to compute 280 feature vectors, 20 from each GRF. We then project these vectors to a two-dimensional space by using the principal component projection (the largest two eigenvalues are 0.17547 and 0.10225). Figure 3-20 shows that the categories corresponding to the individual GRFs form single clusters. Samples of the same category are hard to distinguish, so only one "point" per category is plotted. We conclude that all GRFs are reliable at 50 iterations. We then apply the procedure proposed in Section 3.4.2 to detect uniformity. Table 3-13 shows that all fourteen GRFs are uniform. Figure 3-20 suggests that (d) and (h), (e) and (i), (f) and (j), (g) and (k) may be pairwise indistinguishable.
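Pairwise indistinguishability is assessed with Fisher's linear classifier and its leave-one-out error, as described in Section 3.4.3. A minimal sketch follows, assuming two n×4 feature matrices; the small ridge term and the projected-mean decision rule are our implementation choices, not specified in the thesis.

```python
import numpy as np

def fisher_direction(X1, X2):
    """Fisher's linear discriminant direction for two classes:
    w = Sw^{-1} (m1 - m2), with Sw the pooled within-class scatter matrix."""
    m1, m2 = X1.mean(0), X2.mean(0)
    Sw = (np.cov(X1.T) * (len(X1) - 1) + np.cov(X2.T) * (len(X2) - 1))
    Sw += 1e-8 * np.eye(Sw.shape[0])   # ridge guards against singular scatter
    return np.linalg.solve(Sw, m1 - m2)

def loo_error(X1, X2):
    """Leave-one-out error of the Fisher classifier: each held-out sample is
    assigned to the class whose projected mean is closer on the Fisher axis."""
    X = np.vstack([X1, X2])
    y = np.array([0] * len(X1) + [1] * len(X2))
    errors = 0
    for i in range(len(X)):
        mask = np.ones(len(X), dtype=bool)
        mask[i] = False
        Xt, yt = X[mask], y[mask]
        w = fisher_direction(Xt[yt == 0], Xt[yt == 1])
        p = X[i] @ w
        d0 = abs(p - Xt[yt == 0].mean(0) @ w)
        d1 = abs(p - Xt[yt == 1].mean(0) @ w)
        errors += int((d1 < d0) != (y[i] == 1))
    return errors / len(X)
```

A leave-one-out error near zero means the two feature clouds are separated (the GRFs are distinguishable); an error well above zero means they overlap, as with the 0.225 error reported for GRFs (a) and (b) in Experiment PH1.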
The leave-one-out errors of Fisher's classifier on the feature vectors corresponding to the pairs of GRFs (d) and (h), (e) and (i), (f) and (j), (g) and (k) are all zero, which implies that no two of the fourteen GRFs are indistinguishable. Figure 3-21 provides images generated from the GRFs in this experiment. This experiment shows that two "useful" GRFs with "well-separated" parameter vectors are distinguishable.

[Scatter plot of projected feature vectors, categories a through n]
Figure 3-20 PCP of feature vectors in Experiment PH2

Figure 3-21 Images generated by Algorithm C at 50 iterations with the parameter vectors θ given in Experiment PH2

3.4.4. Effects of Iterations on the Sampling Algorithm

The number of iterations in the sampling algorithm needed to simulate a GRF depends on the parameters. A sampling algorithm with an inadequate number of iterations may not generate images which follow the specified Gibbs distribution, while a sampling algorithm with superfluous iterations is inefficient. In the previous studies, we used 50 iterations. This section studies the effects of the number of iterations. We compare the results for 50 iterations and 500 iterations based on (1) reliability and uniformity, (2) visual inspection of the images generated with 50 and 500 iterations, and (3) the estimated parameters versus the true parameters. The study is restricted to the isotropic Derin-Elliott GRFs with parameters near 0.3, motivated by Experiment PH1; however, the procedure can be applied to other GRFs.

Experiment PH3

We restrict the experiment to GRFs with parameter vectors θ = (a) 0.2·1, (b) 0.25·1, (c) 0.3·1, (d) 0.35·1, (e) 0.4·1. Table 3-13 shows that the GRFs with θ = r·1 for r ≥ 0.3, which are detected as uniform at 50 iterations (PH3(A)), are no longer uniform when the number of iterations increases to 500 (PH3(B)). However, the GRF with the parameter vector θ = 0.2·1 is uniform at both 50 and 500 iterations. The GRF with the parameter vector θ = 0.25·1 is "weakly" uniform.
The two-dimensional principal component projections of the feature vectors corresponding to these GRFs at 50 and 500 iterations are shown in Figures 3-22 and 3-23, respectively (the largest two eigenvalues with 50 iterations are 0.10613 and 0.00004, while those with 500 iterations are 0.15915 and 0.00003). That the feature vectors of the same category form a single cluster implies that all these GRFs are reliable at both 50 and 500 iterations. The leave-one-out error of Fisher's classifier on the feature vectors corresponding to GRFs (d) and (e) is 0.20 for 50 iterations and 0.25 for 500 iterations, which implies that the GRFs with parameter vectors 0.35·1 and 0.4·1 are indistinguishable at 50 and 500 iterations. Figures 3-24 and 3-25 provide examples of the images generated by Algorithm C with 50 iterations and 500 iterations, respectively. The images corresponding to the same GRF at 50 and 500 iterations are visually different in Figures 3-24 and 3-25 (c), (d), (e), while the images in Figures 3-24(a) and 3-25(a) are not visually different.

[Scatter plot of projected feature vectors, categories a through e]
Figure 3-22 PCP of feature vectors with 50 iterations in Experiment PH3

[Scatter plot of projected feature vectors, categories a through e]
Figure 3-23 PCP of feature vectors with 500 iterations in Experiment PH3

Figure 3-24 Images generated in Experiment PH3 (A) at 50 iterations with θ = r·1, where r = (a) 0.2, (b) 0.25, (c) 0.3, (d) 0.35, (e) 0.4

Figure 3-25 Images generated in Experiment PH3 (B) at 500 iterations with θ = r·1, where r = (a) 0.2, (b) 0.25, (c) 0.3, (d) 0.35, (e) 0.4

The parameters estimated from the images in Figures 3-24 and 3-25 are given in Table 3-14. The estimates based on images at 50 and 500 iterations are numerically close to the true parameters, which suggests that the sampling algorithm with 50 iterations can be used. The problem of nonuniformity occurring at 500 iterations will be further discussed in Section 3.4.5.
Table 3-14 Estimated Parameters for Images in Experiment PH3

True θ   Figure       θ1     θ2     θ3     θ4      Figure       θ1     θ2     θ3     θ4
0.2·1    3-24(a)   0.201  0.206  0.199  0.203      3-25(a)   0.225  0.184  0.209  0.155
0.25·1   3-24(b)   0.204  0.271  0.264  0.255      3-25(b)   0.271  0.225  0.282  0.247
0.3·1    3-24(c)   0.370  0.293  0.234  0.291      3-25(c)   0.289  0.233  0.315  0.352
0.35·1   3-24(d)   0.327  0.304  0.346  0.385      3-25(d)   0.288  0.297  0.373  0.401
0.4·1    3-24(e)   0.418  0.354  0.452  0.328      3-25(e)   0.351  0.397  0.359  0.455

3.4.5. Phase Transition

The term "phase transition" stems from the study of the Ising model. In a one-parameter Ising model, if two different Gibbs distributions can be specified from the same set of parameters by assuming different boundary conditions, this phenomenon is called phase transition [Kin80]. It has been shown [Kin80, p.18] that in an Ising model without an exterior magnetic field (equivalent to a 1st-order binary isotropic Derin-Elliott model), if the parameter is larger than the critical value Tc (where Tc = 0.44 if Eq. (2.5) is used to formulate the model), the phenomenon of phase transition exists. That is, the boundaries affect the entire image when the parameter is above the critical value. Kindermann and Snell [Kin80] demonstrate that a similar phenomenon exists even if periodic boundaries are assumed. They show by simulation [Kin80, p.59] that realizations generated from a parameter near the critical value tend to have the same type of spins.

The "phase transition" used in this thesis refers to the phenomenon that a realization generated from a GRF has a nonuniform distribution of pixel colors; usually a few colors dominate the others. This phenomenon can be exhibited by the procedure for detecting nonuniformity in Section 3.4.2. We suspect that this phenomenon is caused when a Gibbs distribution (2.1) has "multiple peaks".
On the other hand, the local conditional distribution (2.13) of a Gibbs distribution implies that pixels separated by a long distance are nearly independent, and thus long-range correlations are near zero [Pic87] if the GRF is useful. However, a long-range correlation is no longer zero if phase transition occurs. Therefore, a long-range correlation may be another index for detecting phase transition.

In Experiment PH3, the GRF with the parameter vector θ = 0.2·1 is uniform, but the GRF with θ = 0.3·1 is nonuniform by 500 iterations. The frequencies of colors for the first eight images generated from the GRF with θ = 0.3·1 are characterized in Table 3-15. A nonuniform distribution of pixel colors indicates that the phenomenon of phase transition exists for this model. We conjecture that the critical value of phase transition in a second-order isotropic Derin-Elliott model with four colors is between 0.2 and 0.3, and that the phenomenon of phase transition also exists for other Derin-Elliott models whose parameters cannot be easily characterized.

Table 3-15 Frequency of Colors for the First Eight Images with 500 Iterations in Experiment PH3 (the GRF with θ = 0.3·1)

Image      0     1     2     3
1        312    72   563  3149
2        141    82   909  2964
3       1899  1405   522   270
4       2153   317  1071   555
5        973  2540    75   508
6        304   242   467  3083
7       1774   747   647   928
8       1291  1026  1531   248

3.5. Sampling Auto-normal GRFs

Sampling an auto-normal GRF (GMRF) is nothing but sampling a multivariate normal distribution (2.8) [And84, Ken80], so iterative procedures are not required. If B is a positive definite matrix, so is B⁻¹. Thus there exists a nonsingular matrix L such that B⁻¹ = LL*, where L* is the conjugate transpose of L. If the order of B is small, a traditional procedure to sample (2.8) based on a matrix decomposition method [Ken80, Sto83] is given in (3.1):

    x ~ MVN(μ, σ²B⁻¹)                          (2.8)

    Generate η ~ MVN(0, σ²I),                  (3.1)(a)
    x ← μ + Lη.                                (3.1)(b)

In image analysis, the B matrix is of order MN×MN (e.g.
M, N ≥ 64), so the traditional computation of L is infeasible due to the memory limitations of contemporary computers. A sampling algorithm proposed by Kashyap [Kas81] will be reviewed and discussed. The matrix B in (2.8) is block-circulant, with eigenvalues and eigenvectors illustrated in (2.9) and (2.10). By direct inspection, we can write [Mor73, Gon87]

    B = WDW*,                                  (3.2)

where D is a diagonal matrix with the eigenvalues on the diagonal, and W is as defined in (2.10). Further simplification shows that B⁻¹ = LL*, where

    L = W D^(-1/2) W*.                         (3.3)

To introduce the sampling procedure, we first define the two-dimensional Fourier transform (2-D FT) and inverse Fourier transform (2-D INFT) in (3.4) and (3.5), respectively:

    F(u,v) = (1/MN) Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} f(x,y) z1^{-ux} z2^{-vy},  for u = 0,1,...,M-1; v = 0,1,...,N-1,   (3.4)

    f(x,y) = Σ_{u=0}^{M-1} Σ_{v=0}^{N-1} F(u,v) z1^{ux} z2^{vy},  for x = 0,1,...,M-1; y = 0,1,...,N-1,   (3.5)

where z1, z2 are the complex numbers defined in (2.9). Direct inspection shows that Lη = W D^(-1/2) W* η can be implemented by applying the 2-D FT and 2-D INFT. We represent the column vector η as an M×N array η, and assign the first row of matrix B to an M×N array A by A(i,j) = B(1, (i-1)×N + j). A sampling algorithm based on [Che81, Kas81] is given below.

Algorithm N  /* Sampling an auto-normal GRF of image size M×N */
(a) Generate an M×N array η with each element i.i.d. from N(0, σ²).
(b) Apply the 2-D FT to η; save the result in η.
(c) Apply the 2-D INFT to A; save the result in A.
(d) x(u,v) ← η(u,v) / √A(u,v), for u = 0,1,...,M-1; v = 0,1,...,N-1.
(e) Apply the 2-D INFT to x; save the result in x; x + μ is a realization.

Algorithm N uses two M×N complex arrays, η and A, instead of an MN×MN real array L as in the traditional algorithm [Ken80] for simulating a multivariate normal distribution. If M and N are powers of 2, we can further use the fast Fourier transform (FFT) to speed up the computations.
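Algorithm N can be written compactly with numpy's 2-D FFTs. The sketch below is a re-derivation of steps (a)-(e), not the thesis' code; it assumes a symmetric neighbor kernel (so the 2-D DFT of the first row of B is real) and reads off the eigenvalues of the block-circulant B as that DFT.

```python
import numpy as np

def sample_gmrf(kernel, sigma=1.0, mu=0.0, rng=None):
    """Sample x ~ MVN(mu, sigma^2 B^{-1}) for a block-circulant precision
    matrix B, using 2-D FFTs instead of a Cholesky factor.  `kernel` is the
    M x N array holding the first row of B (A(i,j) = B(1, (i-1)N + j) in the
    thesis notation).  The eigenvalues of B are the 2-D DFT of this kernel;
    all must be positive for B to be positive definite."""
    if rng is None:
        rng = np.random.default_rng()
    M, N = kernel.shape
    lam = np.fft.fft2(kernel).real        # eigenvalues of B (cf. Eq. (2.9))
    if lam.min() <= 0:
        raise ValueError("B is not positive definite for these parameters")
    eta = rng.normal(0.0, sigma, size=(M, N))   # step (a)
    eta_hat = np.fft.fft2(eta)                  # step (b)
    x = np.fft.ifft2(eta_hat / np.sqrt(lam)).real   # steps (d)-(e)
    return mu + x
```

Because x = F⁻¹ D^(-1/2) F η and F F* = MN·I for the unnormalized DFT, the covariance of x works out to σ² F⁻¹ D⁻¹ F = σ² B⁻¹, matching (2.8). The positivity check doubles as the parameter-validity test discussed at the end of this section.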
This algorithm generates an image whose pixels take continuous values, while most computers deal with digital images, so some transformation must be made between a continuous image and a digital image. The user of an auto-normal MRF model in digital texture synthesis should carefully consider the valid domain of parameters, to guarantee a positive definite correlation matrix B⁻¹, and the digital display of continuous images. In this thesis, we will use the truncation method to transform a continuous image for display, and check the positive definiteness of the correlation matrix B⁻¹ by examining all eigenvalues of B via Eq. (2.9).

3.6. Discussion and Conclusion

We have proposed and reviewed Algorithm C and Algorithm G for sampling discrete GRFs, and proposed methods for validating and comparing the algorithms. Experiments suggested that 50 iterations starting from a random initial state are in general enough, except for those isotropic GRFs with parameters beyond the critical value. Our experiments show that Algorithm C performs as well as Algorithm G but is faster for the Derin-Elliott model and the auto-binomial model.

The phenomenon of phase transition in a 2nd-order isotropic Derin-Elliott model was exhibited. The critical value is conjectured to be between 0.2 and 0.3. We suspect that the phase transition is caused by a Gibbs distribution having multiple peaks. Other Derin-Elliott models may also exhibit phase transition, but their parameters cannot be easily characterized. The methodology based on procedures to detect unreliable, nonuniform, and indistinguishable GRFs, towards identifying unique useful GRF models, is first proposed in this thesis. We conclude that parameter vectors representing unreliable or nonuniform GRFs should not be used to establish models for texture synthesis.
An algorithm [Che81] based on simulating a multivariate normal distribution by using the Fourier transform and the inverse Fourier transform was reviewed. The digital display of a continuous image should be carefully considered. The selection of a valid auto-normal model (a GMRF model) is difficult even for second-order GMRFs, and is more difficult for higher order GMRFs.

The contributions of this chapter are summarized as follows: (1) the issue of a simple sampling algorithm; (2) methods for validating sampling algorithms; (3) introduction of the notions of reliability, uniformity, and indistinguishability; (4) the discovery of phase transition for non-binary GRF image models; (5) procedures for detecting unreliable and nonuniform GRFs towards identifying useful GRFs.

Chapter 4

Statistical Inference on Gibbs and Markov Random Fields

The problem of parameter estimation can be defined simply as follows: given a realization of a GRF, estimate the parameters of this GRF. The type of texture model determines the approach to parameter estimation, so we discuss discrete GRFs and auto-normal models separately. In texture synthesis, we want to generate a texture visually similar to a given texture from a GRF model. Thus we must be able to estimate the model parameters of a given texture in an efficient manner. Since we have only one realization available and a complicated partition function is involved in the probability mass function of a discrete GRF, traditional statistical estimation schemes, such as the maximum likelihood method and the method of moments, are inaccessible. Two approaches to parameter estimation for a discrete GRF have been studied in the literature. Besag [Bes74] proposed an ingenious estimation procedure, a conditional maximum likelihood estimation based on a coding scheme, which can be used for estimating the parameters of any discrete GRF.
Derin and Elliott [Der87] claimed that Besag's estimation is unreliable for their model and proposed an ad hoc scheme. The main thrust of this chapter is to propose a methodology for comparing different estimators and to provide a quantitative measure of the correspondence between a given texture and a Gibbs random field texture model.

Section 4.1 reviews Besag's coding scheme and Derin and Elliott's approach, and proposes a logit method [Ash72] and a minimum logit χ² method [Ber55] for parameter estimation in discrete GRFs. Section 4.2 reviews pseudo maximum likelihood, minimum mean square error, and maximum likelihood estimators for parameters of auto-normal MRFs. In Section 4.3, a nonparametric ranking test [Bar63, Dig83] is proposed to validate parameters estimated for discrete GRFs, and a χ² test is proposed to measure the goodness-of-fit of estimated parameters for the auto-normal model. Since the distributions of estimators for discrete GRFs are unknown, we propose a nonparametric statistical method based on the Wilcoxon rank-sum test [Leh75] to compare different estimators. The comparison is extended to three estimation methods for the auto-normal model in Section 4.4. We restrict our discussion to the second-order Derin-Elliott model [Der87], auto-binomial model [Bes74, Cro80], and auto-normal (GMRF) model [Bes74, Che81]. The extension to higher order models or other types of GRFs is also discussed. Section 4.5 examines the problem of fitting these GRF models to natural textures via parameter estimation and a goodness-of-fit test. Section 4.6 discusses the results.

4.1. Parameter Estimation for Discrete GRFs

This section defines four methods for estimating the parameters of a discrete GRF from one realization of the GRF. The pseudo maximum likelihood method (PMLE) is based on Besag's coding scheme [Bes74]. The least square error method (LSQR) was proposed by Derin and Elliott [Der87] for their model.
We propose two other estimation schemes, the standard logit method (Logit) [Ash72, Cox80] and the minimum logit χ² method (Min-χ²) [Ber55], based on statistical procedures for binary variables. The discussion is restricted to the binary second-order Derin-Elliott model and auto-binomial model, with the parameter α = 0. The extension to other models and to multigray-valued images is also discussed.

4.1.1. Pseudo Maximum Likelihood Method (PMLE)

Let R_t be the ordered set of colors of the eight nearest neighbors (Figure 2.1) of site t, corresponding to a second-order GRF:

R_t = {x_{t:-1}, x_{t:+1}, x_{t:-2}, x_{t:+2}, x_{t:-3}, x_{t:+3}, x_{t:-4}, x_{t:+4}}.

The energy function in Equation (2.5) can be expressed in terms of a 4-tuple vector w(x_t, R_t) which depends on the colors of a pixel and its neighboring pixels. For each coding j, the log pseudo-likelihood L(θ) = Σ_{t∈S_j} ln P(x_t | R_t) is maximized; setting its gradient to zero gives, for k = 1, 2, 3, 4,

0 = ∂L(θ)/∂θ_k |_{θ=θ̂} = Σ_{t∈S_j} ( Σ_{g=0}^{G-1} [w_k(g,R_t) − w_k(x_t,R_t)] e^{[w(x_t,R_t) − w(g,R_t)]^T θ} ) / ( Σ_{g=0}^{G-1} e^{[w(x_t,R_t) − w(g,R_t)]^T θ} ),   (4.9)

where G is the number of colors. The Newton-Raphson method is used to solve the nonlinear equations (4.9). Since two local conditional probabilities may share some neighbors, estimates from the four codings are not independent, which implies that the estimator is biased.

4.1.2. Least Square Error Method (LSQR)

This method was first proposed by Derin and Elliott [Der85]. They reduced the problem to solving the overdetermined linear system

Cθ = q,   (4.10)

where θ is the parameter vector defined previously and C and q are derived from the image data, as shown below. Equation (4.6) leads to the equality

e^{−w(x_t,R_t)^T θ} / P(x_t,R_t) = Z_t / P(R_t).   (4.11)

The right-hand side of (4.11) is independent of the value x_t (it depends only on the colors of the neighbors of site t), so the left-hand side is also independent of the value x_t. Thus we have

e^{−w(x_t,R_t)^T θ} / P(x_t,R_t) = e^{−w(x_s,R_s)^T θ} / P(x_s,R_s),   if R_t = R_s.   (4.12)

A further simplification of (4.12) gives

[w(x_t,R_t) − w(x_s,R_s)]^T θ = ln[P(x_s,R_s)/P(x_t,R_t)],   if R_t = R_s.   (4.13)

The sites s and t in (4.12) and (4.13) can be arbitrarily chosen as long as their neighbors have the same colors.
If the colors of the neighbors of sites s and t are denoted by the set D, and u, v are the colors at sites s and t, Eq. (4.13) can be written as

[w(u,D) − w(v,D)]^T θ = ln[P(v,D)/P(u,D)].   (4.14)

The linear system (4.10) consists of all equations of the form (4.14). The ratio P(v,D)/P(u,D) can be estimated by counting the number of 3x3 blocks of type (v,D) and dividing by the number of 3x3 blocks of type (u,D). In a second-order binary GRF, there are 2^8 = 256 equations of the form (4.14). Note that w(u,D) = w(v,D) implies P(u,D) = P(v,D), so we ignore equations of that type. To avoid a zero estimate of P(v,D) or P(u,D), we modify (4.14) to

[w(u,D) − w(v,D)]^T θ = ln[(P(v,D)+ε)/(P(u,D)+ε)],   (4.15)

where ε = 1/(2x256) = 1/512, a small number.

Derin and Elliott claimed to achieve accurate estimation by using the most frequently occurring block types (u,D) and (v,D) in (4.14). However, there is no consistent way to establish the most frequently occurring block types or the number of blocks to ignore. We follow Berkson's suggestion [Ber55] and estimate parameters by applying the linear least square error method to solve (4.15). The selection of the perturbation factor ε is important, but there is no theoretical guide. The literature [Ber55, Ash72, Cox80] suggests choosing ε to be small relative to the magnitudes of P(u,D) and P(v,D).

4.1.3. Logit Model Fit Method (Logit)

This method transforms the parameter estimation problem into a logit model fitting problem [Ash72, Cox80] and applies the technique used in Section 4.1.2. Consider a binary texture with pixel values denoted by 0 or 1. Let A be the collection of all possible ordered sets R_t of colors for the eight neighbors of a site t; |A| = 2^8 = 256. We define an equivalence relation "≈" on A by E ≈ F, where E, F ∈ A, if w(0,E) = w(0,F), where the vector w is as defined in (4.2).
Then "≈" partitions A into 81 disjoint classes, because each component of the 4-tuple vector w can take on 3 values, from {−2, 0, 2} for the Derin-Elliott model or from {0, 1, 2} for the auto-binomial model. If the estimate of P(u,D) in (4.15) is modified to the number of counts in an equivalence class over all 3x3 blocks, this method is the same as fitting a logit model to the image data. The problem reduces to solving the linear system (4.10) with each equation formulated as

[w(1,D) − w(0,D)]^T θ = ln[(P(0,D)+ε)/(1−P(0,D)+ε)],   (4.16)

where ε = 1/(2x81) = 1/162, a small number. Again, the linear least square error method is applied to solve these equations.

Equations (4.15) and (4.16) differ in number. In (4.15), an equation is formed for each ordered set D, while in (4.16), an equation is formed for each class of equivalent ordered sets D. In LSQR, we may have 256 possible equations, while in Logit, we have at most 81 equations. The selection of ε is important. Anscombe [Ans56] suggests ε = 1/(2m), where m is the number of 3x3 blocks corresponding to ordered sets D. In a logit model fit, all the 3x3 blocks of observations should be independent, which is not true in this application, so the estimator is also biased.

4.1.4. Minimum Logit χ² Method (Min-χ²)

Following the notation of Section 4.1.3, there are 81 types of 3x3 blocks in a binary image. For each type, the central site has either 0 or 1. Let

r_i = number of blocks of type i with central value 1,
s_i = number of blocks of type i with central value 0,
n_i = r_i + s_i,   for i = 1, 2, ..., 81.

Define

p_i = r_i/n_i,   q_i = s_i/n_i,   l_i = ln(p_i/q_i),   L_i = ln(P_i/Q_i),   for i = 1, 2, ..., 81,

where P_i = P(1 | R_i), Q_i = P(0 | R_i), and R_i is the set of neighborhood colors of type i. Note that the parameter vector θ is involved in P_i and Q_i, so L_i also depends on θ. Simple manipulations show that L_i is a linear function of θ.
The proposed method is to find the parameters θ which minimize

W(θ) = Σ_{i=1}^{81} [ (r_i − n_i P_i)² / (n_i P_i) + (s_i − n_i Q_i)² / (n_i Q_i) ].   (4.17)

Letting p_i ≈ P_i and q_i ≈ Q_i in the weights, we can reduce (4.17) to (4.18):

W(θ) = Σ_{i=1}^{81} n_i p_i q_i (l_i − L_i)².   (4.18)

Taking the partial derivatives of (4.18) with respect to θ, we obtain four linear equations in the four parameters, which are easily solved for the estimator. Note that the estimator is biased due to dependence among the 3x3 blocks of observations.

4.1.5. Discussion and Extension

The Newton-Raphson method [Gol83, Sto83] for solving the set of nonlinear equations in (4.9) requires an initial guess for the solution and convergence conditions. It may run into a local maximum or a local minimum without reaching the proper solution. We suggest repeating an estimation program with several different initial guesses, and checking several function values of L(θ) around the solution vector. The extension of PMLE (4.9) to estimation for images with more than two gray levels is obvious. The extension of LSQR (4.15) and Logit (4.16) to images with more than two colors is intuitively possible, but requires a large image size to achieve accurate estimation, since the number of block types increases rapidly with the number of colors. The Min-χ² method (4.18) can only be applied to binary images. Our experience indicates that the LSQR and Logit methods require the least CPU time, PMLE requires the greatest CPU time, and Min-χ² is in between. The robustness of the four estimators is compared in Section 4.4. Although this discussion is restricted to the second-order Derin-Elliott and auto-binomial models, the same estimation techniques can be extended to other models.

4.2. Parameter Estimation for the Auto-normal Model

We review three methods for estimating the parameters of the auto-normal model defined in Section 2.2.4: pseudo maximum likelihood, minimum sum of square error, and maximum likelihood.
We restrict the discussion to the second-order auto-normal model. The extension to higher order auto-normal models is obvious.

4.2.1. Pseudo Maximum Likelihood Estimation (PMLE)

This technique is based on Besag's coding scheme [Bes74], which was characterized in Section 4.1.1. We assume μ = 0 in (2.6) or (2.8). The PMLE maximizes (4.20) for each coding j, where 1 ≤ j ≤ 4, and estimates parameters by averaging over the four codings. Letting S_j be the set of pixels in the jth coding, parameters are estimated to maximize, for each j, the product

Π_{t∈S_j} P(x_t | R_t).   (4.20)

Five parameters are involved in this model: θ1, θ2, θ3, θ4, and σ. As in the Derin-Elliott model, parameters θ1, θ2, θ3, and θ4 affect the horizontal, vertical, 135° diagonal, and 45° diagonal neighbors, respectively. Parameter σ controls the spread of the gray value of a pixel. Maximizing (4.20) for a given σ_j is equivalent to searching for a parameter vector θ_j = (θ_j1, θ_j2, θ_j3, θ_j4) which minimizes (4.21) for the jth coding. The variance σ_j² is estimated by (4.22):

Σ_{t∈S_j} [ x_t − Σ_{r=1}^{4} θ_jr (x_{t:+r} + x_{t:−r}) ]²,   (4.21)

σ̂_j² = (1/|S_j|) Σ_{t∈S_j} [ x_t − Σ_{r=1}^{4} θ̂_jr (x_{t:+r} + x_{t:−r}) ]²,   for j = 1, 2, 3, 4.   (4.22)

The estimator is then defined by averaging over the codings:

θ̂ = (1/4) Σ_{j=1}^{4} θ̂_j,   σ̂² = (1/4) Σ_{j=1}^{4} σ̂_j².

4.2.2. Minimum Sum of Square Error Method (MSSE)

This method is characterized in terms of an alternative definition of the auto-normal model [Bes74, Che85c, Kas81, Woo72], which can be viewed as a natural extension of a causal autoregressive time series model [Box76], but is noncausal [Kas81]. The definition is given below:

x_t = Σ_{r=1}^{4} θ_r (x_{t:+r} + x_{t:−r}) + η_t,   for each t ∈ S,   where η ~ MVN(0, σ²B).   (4.23)

The MSSE method searches for the parameters θ which minimize (4.24), with the estimated variance given by (4.25):

Σ_{t=1}^{MN} [ x_t − Σ_{r=1}^{4} θ_r (x_{t:+r} + x_{t:−r}) ]²,   (4.24)

σ̂² = x^T B̂ x / MN.   (4.25)

4.2.3.
Maximum Likelihood Estimation (MLE)

The MLE of the parameters θ and σ maximizes P(x) based on the single observation x, or equivalently maximizes the log-likelihood in (4.26) [Bes74, 75, 77]. It maximizes the global probability density, while the MSSE estimator maximizes the product of local conditional densities.

ln P(x) = ln(|B|)/2 − (MN/2) ln(2π) − (MN/2) ln(σ²) − (x^T B x)/(2σ²).   (4.26)

The parameters maximizing (4.26) must satisfy, for k = 1, 2, 3, 4,

0 = ∂ln P(x)/∂θ_k = −(1/2) Σ_{u=0}^{M−1} Σ_{v=0}^{N−1} a_k / (1 − Σ_{r=1}^{4} θ_r a_r) + MN b_k / (2 x^T B x),   (4.27)

where

a_1 = 2cos(2πv/N),  a_2 = 2cos(2πu/M),  a_3 = 2cos(2πu/M + 2πv/N),  a_4 = 2cos(2πu/M − 2πv/N),

b_1 = Σ_{t=1}^{MN} x_t(x_{t:+1} + x_{t:−1}),   b_2 = Σ_{t=1}^{MN} x_t(x_{t:+2} + x_{t:−2}),
b_3 = Σ_{t=1}^{MN} x_t(x_{t:+3} + x_{t:−3}),   b_4 = Σ_{t=1}^{MN} x_t(x_{t:+4} + x_{t:−4}).

The variance is then estimated by

σ̂² = x^T B̂ x / MN.   (4.28)

The set of nonlinear equations in (4.27) can be solved by the Newton-Raphson method.

4.2.4. Discussion

Besag [Bes75, 77] and Kashyap and Chellappa [Kas83] have shown that the estimators based on both MSSE and MLE for the first-order isotropic auto-normal model are asymptotically consistent as the image size goes to infinity. In an asymptotic sense, the estimator based on MLE is more efficient than that based on MSSE, and the estimator based on MSSE is more efficient than that based on PMLE, for the first-order isotropic auto-normal model [Bes75, 77, Kas83]. The behaviors of the estimators for the nonisotropic and higher order auto-normal models are unknown. Among the three estimators, the one based on MSSE is more computationally attractive than those based on MLE and PMLE. In Section 4.4, we propose a nonparametric statistical method to compare the various estimators.

4.3. Goodness-of-Fit Test

Given an MRF model with parameters θ and a single texture image x, the goodness-of-fit problem is to test the following null hypothesis.

H_0 : The given image x is an observation from the MRF with parameters θ.
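Before turning to tests of H_0, note that the computational appeal of MSSE (Section 4.2.4) comes from (4.24) being an ordinary least squares problem. The following is a minimal sketch, not the thesis code: it assumes toroidal boundaries (circular shifts supply the neighbor sums), and the function name is hypothetical.

```python
import numpy as np

def msse_fit(x):
    """MSSE fit (Eq. 4.24) of a second-order auto-normal model:
    regress each pixel on its four symmetrized neighbour sums
    x_{t:+r} + x_{t:-r}.  Toroidal boundaries are assumed, so the
    neighbour sums are formed with circular shifts."""
    x = np.asarray(x, float)
    h    = np.roll(x,  1, 1) + np.roll(x, -1, 1)                    # horizontal
    vert = np.roll(x,  1, 0) + np.roll(x, -1, 0)                    # vertical
    d135 = np.roll(x, (1, 1), (0, 1)) + np.roll(x, (-1, -1), (0, 1))
    d45  = np.roll(x, (1, -1), (0, 1)) + np.roll(x, (-1, 1), (0, 1))
    A = np.column_stack([h.ravel(), vert.ravel(), d135.ravel(), d45.ravel()])
    theta, *_ = np.linalg.lstsq(A, x.ravel(), rcond=None)
    # Eq. (4.25): sigma^2 = x^T B x / MN, and x^T B x = x . (x - A theta)
    sigma2 = x.ravel() @ (x.ravel() - A @ theta) / x.size
    return theta, sigma2
```

The regression view makes the CPU-time ranking reported in Section 4.4.3 plausible: MSSE is a single linear solve, whereas PMLE and MLE are iterative.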
The previous sections discussed procedures for estimating parameters in GRFs or MRFs. The success of the estimation can be assessed by visually comparing a given texture with a texture generated from the estimated GRF. The drawback is that the human visual system can be fooled [Gag81, Lev85]. A texture visually different from the given texture may accidentally be realized even if the model fits the given texture well. Also, the resolution and the device used for displaying textures may bias a visual comparison of textures. One of the contributions of this thesis is to provide a quantitative measure of the correspondence between a given texture and a GRF texture model.

Only Cross [Cro80] has proposed such a measure. Based on Besag's suggestion [Bes74], Cross applied a χ² test to measure the correspondence between a given texture and an auto-binomial model. The image size must be extremely large, especially for non-binary images, to obtain high power, and the computation becomes unstable as the number of colors increases. We propose a nonparametric ranking test for measuring the correspondence between a given texture and a discrete GRF model. The implementation of the test can be highly parallel, and it can be applied to all GRFs. For the auto-normal model, we exploit properties of Gaussian distributions to develop a χ² test for measuring goodness-of-fit.

4.3.1. A Nonparametric Ranking Test

Our test is based on a test of H_0 first proposed by Barnard [Bar63] that requires Monte Carlo simulations. This approach has been used for measuring goodness-of-fit in studies of spatial point processes [Cox80, Rip81, Dig83]. The idea is that if an observed image is a realization of a GRF with specified parameters, then it must be difficult to distinguish the given image from images generated with these parameters. In brief, the problem is reformulated as follows.
We start with the given image x and n other images y_i, 1 ≤ i ≤ n, independently generated from a GRF with known parameters. We wish to decide whether x is similar to the images in the set {y_i} in some sense. Suppose we have a vector of k features or measurements v_0 = (v_01, v_02, ..., v_0k) which characterizes the given image x, and let v_i be the corresponding feature vector characterizing the image y_i, for 1 ≤ i ≤ n. Define V̄_j as the average feature vector not involving the jth image:

V̄_j = (1/n) Σ_{i≠j} v_i,   for j = 0, 1, ..., n.

Define u_j as a measure of dissimilarity between the jth image and the remaining images:

u_j = || v_j − V̄_j ||,   for j = 0, 1, ..., n.

Rank {u_0, u_1, ..., u_n} in ascending order such that u_(0) ≤ u_(1) ≤ ... ≤ u_(n). A test of H_0 can be stated as follows:

Reject H_0 at the 100(1 − t/(n+1))% significance level if u_0 > u_(t).

Values of n = 99 and t = 95 are commonly used [Hop68, Mar79, Dig83]. The n independent images y_i, 1 ≤ i ≤ n, could be generated by n parallel processors to save time.

The selection of a feature vector characterizing an image is by no means simple. For the second-order Derin-Elliott model, we propose a feature vector v = (v_1, v_2, v_3, v_4) in which v_i is the number of pairs of pixels having the same color in the ith direction, where i = 1, 2, 3, 4 corresponds to the 0°, 45°, 90°, 135° directions, respectively. Other possible feature vectors characterizing images, such as features derived from co-occurrence matrices [Har73], will be discussed in Chapter 5. The following experiment demonstrates how the proposed test works.

Experiment SY1

Consider a second-order Derin-Elliott GRF with parameters θ1 = 1.0, θ2 = 1.0, θ3 = 1.0, θ4 = −1.0. Generate a 128x128 4-color synthetic texture by Algorithm C with 50 iterations. The parameters estimated by PMLE are θ̂1 = 0.997, θ̂2 = 1.030, θ̂3 = 0.972, θ̂4 = −1.027. The rank computed by the proposed method is 92 over 100 Monte Carlo simulations, so we do not reject the null hypothesis H_0 at the 5% level.
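The ranking procedure above can be sketched in a few lines. This is an illustrative implementation, not the thesis code: feature extraction is assumed to have been done already, and the function name is hypothetical.

```python
import numpy as np

def ranking_test(v0, vs, t=95):
    """Barnard-style Monte Carlo ranking test of H0.

    v0 : feature vector of the observed image.
    vs : (n, k) array of feature vectors of n images simulated from
         the hypothesized GRF.
    Returns (1-based rank of u0 among the n+1 dissimilarities, reject
    flag); H0 is rejected at the 100*(1 - t/(n+1))% level when
    u0 > u_(t)."""
    V = np.vstack([np.asarray(v0, float), np.asarray(vs, float)])
    n = len(V) - 1
    total = V.sum(axis=0)
    # u_j = || v_j - mean of the other n feature vectors ||
    u = np.linalg.norm(V - (total - V) / n, axis=1)
    rank = int(np.sum(u <= u[0]))
    return rank, bool(u[0] > np.sort(u)[t])
```

With n = 99 and t = 95, an observed image whose features are typical of the simulated set gets a middling rank (as in Experiment SY1, rank 92 of 100, not rejected), while a grossly atypical one is ranked last and rejected.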
The given texture and a visually similar texture reproduced from the estimated parameters are shown in Figures 4-2(a) and 4-2(b), which coincides with the ranking test.

4.3.2. A χ² Test

Although the nonparametric ranking test can also be applied to the auto-normal model, it is computationally messy. We will exploit the properties of Gaussian distributions to develop a χ² test to measure the correspondence between a given texture and an auto-normal GRF. From (2.8) and (3.1),

x ~ MVN(μ, σ²B⁻¹),   where B⁻¹ = LL^T.   (4.29)

A simple derivation shows that

η = σ⁻¹ L⁻¹ (x − μ) ~ MVN(0, I).   (4.30)

We propose a χ² test based on the following lemma [Moo74].

Lemma 1. Suppose Y_1, Y_2, ..., Y_n have the standard normal distribution and are independent. Then Σ_{i=1}^{n} Y_i² has the χ²-distribution with n degrees of freedom. If n is large, the Central Limit Theorem (CLT) implies that

( Σ_{i=1}^{n} Y_i² − n ) / √(2n) ~ N(0,1).   (4.31)

The η_1, η_2, ..., η_MN are regarded as MN independent observations from the standard normal distribution. By Lemma 1,

q = ( Σ_{t=1}^{MN} η_t² − MN ) / √(2MN)

has the asymptotic standard normal distribution. We reject the null hypothesis that the image fits the model at the α significance level if q ≤ z_{α/2} or q ≥ z_{1−α/2}. For example, if α = 0.05, we have z_{0.975} = 1.96 and z_{0.025} = −1.96. In applications, η is obtained by substituting the estimated parameters in (4.30).

Experiment SY2

Generate a 64x64 texture from the second-order auto-normal model with parameters θ1 = 0.34, θ2 = 0.34, θ3 = −0.22, θ4 = −0.22, σ = 50, and fix μ at 130. The parameters estimated by MSSE are θ̂1 = 0.3508, θ̂2 = 0.3480, θ̂3 = −0.2157, θ̂4 = −0.2202, σ̂ = 49.47. The associated χ²-statistic is 0.0007, which indicates acceptance of the null hypothesis at the 5% level. The observed texture and a visually similar texture reproduced from the estimated parameters are given in Figures 4-3(a) and 4-3(b), which coincides with the χ² test.
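Since B⁻¹ = LL^T gives Σ_t η_t² = (x − μ)^T B (x − μ)/σ², the q statistic can be computed directly from the quadratic form without any factorization. A minimal sketch (the function name is hypothetical, and B is assumed to be formed densely, which is practical only for small images):

```python
import numpy as np

def chi2_q(x, B, mu, sigma):
    """Chi-square goodness-of-fit statistic for the auto-normal model.
    With eta = (1/sigma) L^{-1} (x - mu) and B^{-1} = L L^T, we have
    sum_t eta_t^2 = (x - mu)^T B (x - mu) / sigma^2, so
    q = (sum_t eta_t^2 - MN) / sqrt(2 MN)."""
    d = np.asarray(x, float).ravel() - mu
    s = d @ B @ d / sigma ** 2          # = sum of eta_t^2
    MN = d.size
    return (s - MN) / np.sqrt(2 * MN)

# reject H0 at the 5% level when q <= -1.96 or q >= 1.96
```

For an image actually drawn from the hypothesized model, s concentrates near MN, so q stays small, as in Experiment SY2 (q = 0.0007).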
Figure 4-2 Synthetic textures in Experiment SY1

Figure 4-3 Synthetic textures in Experiment SY2

4.4. The Comparison of Estimation Methods

The behaviors of the estimators defined in Sections 4.1 and 4.2 are unknown, except for some asymptotic properties requiring infinite image size for isotropic auto-normal MRFs under regularity conditions [Bes75]. This section proposes a nonparametric method for comparing estimators. The goal is to identify a robust and computationally efficient estimator of the parameters of a GRF model under some constraints. The comparison of estimators is stated as a nonparametric test of hypothesis. The factors involved in the comparison are listed, and experiments comparing estimators for the Derin-Elliott model, auto-binomial model, and auto-normal model are reported.

Suppose x_1, x_2, ..., x_n are n realizations of a GRF with corresponding vectors of true parameters θ_1, θ_2, ..., θ_n. Let A_i and B_i be the parameters estimated from x_i by two methods, say method A and method B, respectively, for 1 ≤ i ≤ n. Define a distance measure δ between true and estimated parameters, and let δ(θ_i, A_i) = a_i and δ(θ_i, B_i) = b_i, for 1 ≤ i ≤ n. The smaller the distance between the true and the estimated parameters, the better. We want to test whether methods A and B are equally good, method A is better, or method B is better. That is, we test the null hypothesis H_0: method A and method B are equally good, against the alternative hypothesis H_1: method A is better than method B, or method B is better than method A. A Wilcoxon rank-sum statistic [Leh75] is computed by the following procedure.

(1) Rank, in ascending order, the distances {a_1, a_2, ..., a_n, b_1, b_2, ..., b_n} from 1 to 2n, and denote the corresponding ranks by {r_1, r_2, ..., r_n, s_1, s_2, ..., s_n}. If there are ties, assign the average rank to each tied place.

(2) Compute W_AB = Σ_{i=1}^{n} s_i.
(3) Compute

η_AB = [ W_AB − n(2n+1)/2 + 0.5 ] / √( n²(2n+1)/12 ).

The distribution of W_AB under H_0 can be computed directly. Moreover, η_AB asymptotically has the standard normal distribution. The comparison is then based on the following test:

Reject H_0 at the α level if η_AB ≤ z_{α/2} or η_AB ≥ z_{1−α/2},

where z_α is the α-quantile of the standard normal distribution. Method A can also be compared with method B by a one-sided test. We report the value η_AB instead of just reporting rejection or acceptance. At the 5% level, if η_AB ≥ 1.645, then method A is better than method B; if η_AB ≤ −1.645, then method B is better than method A. Two methods are equally good if |η_AB| ≤ 1.96. If two methods are shown to be equally good by our test, the one requiring less computation time is the better.

An experimental study comparing some of the estimators in Sections 4.1 and 4.2 was conducted. The factors in the study are listed below.

(1) Model (Derin-Elliott, auto-binomial, or auto-normal),
(2) The range of model parameters,
(3) Image size,
(4) Number of realizations n,
(5) Discrepancy measure δ between true parameters and estimated parameters.

Our comparisons are restricted to the Derin-Elliott, auto-binomial, and auto-normal GRFs, but the technique is not limited to these models. Although the range of each parameter for a discrete GRF is the entire real line, many visually different textures can be modeled with parameters within a small range (Chapter 3). We restrict each parameter to be in [−4,4] for the Derin-Elliott model and the auto-binomial model. Some useless models may be involved in the specified range; since our goal is to compare different estimators, we temporarily ignore the problem of "usefulness" discussed in Chapter 3. On the other hand, we use Algorithm C with 50 iterations to generate images. As suggested in Chapter 3, some useless GRFs can be viewed as useful GRFs. The valid domain of parameters for the auto-normal model was discussed in Chapter 2.
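The three steps above amount to a standard two-sample rank computation. A minimal sketch (hypothetical function name; ties averaged as in step (1)):

```python
import numpy as np

def wilcoxon_eta(a, b):
    """Normalized Wilcoxon rank-sum statistic eta_AB of step (3).
    a[i], b[i] are the distances delta between the true and estimated
    parameters for methods A and B on the i-th realization; a large
    positive value favours method A, a large negative one method B."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n = len(a)
    pooled = np.concatenate([a, b])
    order = pooled.argsort(kind="stable")
    ranks = np.empty(2 * n)
    ranks[order] = np.arange(1, 2 * n + 1)
    for val in np.unique(pooled):          # step (1): average tied ranks
        tied = pooled == val
        if tied.sum() > 1:
            ranks[tied] = ranks[tied].mean()
    W_AB = ranks[n:].sum()                 # step (2): rank sum of method B
    # step (3): centre by n(2n+1)/2 with a 0.5 continuity correction
    return (W_AB - n * (2 * n + 1) / 2 + 0.5) / np.sqrt(n * n * (2 * n + 1) / 12)
```

For example, if every distance produced by method B exceeds every distance produced by method A, the statistic is strongly positive, indicating that method A is better.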
The image sizes were selected from computational considerations. The image size for discrete GRFs is either 128x128 or 256x256, while 64x64 images were used for the auto-normal GRF. The number of realizations n is related to the accuracy of the test: the larger n, the more accurate the test. To balance computational simplicity against the accuracy of the test, we use n = 25 for discrete GRFs and n = 16 for the auto-normal model. The discrepancy measure between the true and the estimated parameters can be any metric; we use the Euclidean distance, which is intuitive and reasonable. Another possible discrepancy measure between the true model and the estimated one is the goodness-of-fit statistic of Section 4.3.1, but the tremendous amount of computation it requires forced us to abandon it.

4.4.1. Comparison of Estimation Methods for the Derin-Elliott Model

We compare four estimation methods, PMLE, LSQR, Logit, and Min-χ², denoted A, B, C, D, for the 2nd-order model, using 25 binary textures of sizes 128x128 and 256x256 generated by Algorithm C with 50 iterations. Since two GRFs with close parameter vectors tend to be indistinguishable, we would like to select GRFs whose parameters are separated by a certain distance. The parameters θ = (θ1, θ2, θ3, θ4), defined in Section 2.2.1, were obtained by simulating a hardcore process with periodic boundaries [Dig83] on [−4,4]^4, with a minimum allowable distance of 3 between two points. The 25 true parameter vectors are listed in Table 4-1.
Table 4-1 True Parameter Vectors for the Derin-Elliott Model

MRF     θ1      θ2      θ3      θ4
  1    3.74   -3.47    0.17    3.28
  2   -1.19    3.46    1.24   -3.83
  3    0.10   -2.38    3.52   -2.37
  4    0.13    2.64    3.59   -3.61
 *5    0.76   -2.41    0.40   -1.61
  6    3.49   -1.98   -2.29    0.42
  7   -1.74    0.88    0.91    0.02
  8   -3.52    0.16    1.44    2.37
  9    1.52   -3.84   -3.11    3.22
 10    1.07    0.16    2.44    0.14
 11    1.62   -1.40    0.09    2.04
*12   -3.14    1.36    0.34   -2.25
 13   -2.58    3.47    1.73    0.94
 14    2.52    3.25   -2.67   -1.07
 15   -2.73   -2.02    3.79    3.96
 16    0.82    3.04    2.91    0.61
 17    2.20    3.34    0.96   -2.59
*18    0.67    0.94    0.05   -3.96
*19   -1.50   -2.88   -2.77    0.42
 20    0.70    0.42    3.83   -3.16
 21    1.89    0.39    0.44    0.93
 22    2.50    0.34   -2.90    2.01
 23    1.41    0.98    0.99   -3.94
*24   -3.12    1.02   -3.98    1.01
 25    2.56    0.72    2.70   -3.73

The CPU times for estimating the parameters of a 128x128 binary texture on a VAX8600 by PMLE, LSQR, Logit, and Min-χ² are 40, 1.8, 2.0, and 2.2 seconds, respectively. The Wilcoxon rank-sum statistics corresponding to the 6 pairs of methods are given in Table 4-2. We conclude that (1) PMLE (method A) is significantly better than the other methods for 128x128 binary textures; (2) PMLE and Min-χ² (method D) are equally good but significantly better than the other methods for 256x256 textures; (3) Min-χ² takes less time for estimating parameters than PMLE, thus Min-χ² is the best among the four estimation methods for 256x256 textures.

Our conclusion contradicts the statement made by Derin and Elliott [Der87], who stated that PMLE is unreliable and proposed LSQR. LSQR estimates the parameters by solving an overdetermined linear system (4.15) whose coefficients are only integers from {−4, −2, 0, 2, 4} (Section 4.1.2). Two different equations of the form (4.15) may have the same coefficients on the left side but numerically different constant terms on the right side. Min-χ² avoids these drawbacks by using equivalence classes of block types and by using the properties of logistic models [Ash72, Cox80].
Table 4-2 Comparison for the Derin-Elliott Model

Image Size   η_AB   η_AC   η_AD   η_BC   η_BD   η_CD
128x128      5.08   4.93   3.57  -0.52  -2.76  -2.52
256x256      4.23   4.08   1.59  -0.58  -3.51  -3.24

4.4.2. Comparison of Estimation Methods for the Auto-binomial Model

The constraints for the comparison are the same as in Section 4.4.1, except that the auto-binomial MRFs with the true parameter vectors θ corresponding to MRFs 5, 12, 18, 19, and 24 in Table 4-1 generate one-color images and are eliminated. The remaining 20 MRF models are used for comparing estimators in the auto-binomial MRF model. The CPU times for estimating the parameters of a 128x128 binary texture on a VAX8600 by PMLE, LSQR, Logit, and Min-χ² are 25, 1.8, 2.0, and 2.2 seconds, respectively. The Wilcoxon rank-sum statistics corresponding to the 6 pairs of methods are given in Table 4-3. We conclude that (1) PMLE (method A) and Min-χ² (method D) are equally good but significantly better than Logit (method C) and LSQR (method B) for both 128x128 and 256x256 textures; (2) Min-χ² takes less time for estimating parameters than PMLE, thus Min-χ² is the best among the four estimators.

We have established two new estimators (Logit and Min-χ²), which are compared with the existing estimators for parameters of 2nd-order auto-logistic models [Bes74]. The comparison showed that the competing estimator based on Min-χ² is better than the existing estimators based on PMLE and LSQR.

Table 4-3 Comparison for the Auto-binomial Model

Image Size   η_AB   η_AC   η_AD   η_BC   η_BD   η_CD
128x128      5.10   4.94   0.77  -1.10  -4.80  -4.56
256x256      5.42   5.42   0.45  -0.10  -5.40  -5.40

4.4.3.
Comparison of Estimation Methods for the Auto-normal Model

We compare three estimation methods, PMLE, MSSE, and MLE, denoted A, B, C, based on sixteen 64x64 textures generated by Algorithm N with parameters μ = 0, σ = 1, and θ = (θ1, θ2, θ3, θ4), where each θ_r is randomly selected from [−0.5, 0.5] and the validity of the parameter vector θ is checked by Equation (2.9), until sixteen valid parameter vectors are obtained. The 16 true parameter vectors are listed in Table 4-4.

Table 4-4 True Parameter Vectors θ for the Auto-normal Model

MRF     θ1      θ2      θ3      θ4
  1   -0.04   -0.02   -0.15   -0.04
  2   -0.02   -0.20   -0.22   -0.05
  3    0.40   -0.31    0.17    0.06
  4    0.09   -0.30    0.07   -0.12
  5   -0.21   -0.10    0.05    0.05
  6   -0.09   -0.10   -0.38   -0.02
  7   -0.03   -0.37    0.09   -0.01
  8    0.13   -0.02    0.14    0.16
  9    0.20   -0.17   -0.12    0.13
 10   -0.11    0.14    0.35    0.04
 11   -0.33    0.09   -0.07    0.09
 12   -0.14   -0.18    0.01   -0.28
 13   -0.30    0.04   -0.09    0.14
 14   -0.18    0.23    0.14   -0.01
 15    0.31    0.40   -0.10   -0.23
 16    0.25   -0.06    0.06   -0.19

The CPU times for estimating the parameters of a 64x64 texture on a VAX8600 by PMLE, MSSE, and MLE are 3, 1, and 5 seconds, respectively. The Wilcoxon rank-sum statistics corresponding to the 3 pairs of methods are given in Table 4-5. We conclude that (1) PMLE (method A), MSSE (method B), and MLE (method C) are equally good; (2) MSSE is preferred over PMLE and MLE since MSSE requires less computation time.

We have proposed a methodology for comparing various estimators. Our conclusions are consistent with those reported by Besag [Bes75] and Chellappa [Che85c] for 1st-order isotropic GMRFs. The methodology can be extended to compare estimators for parameters of higher order GMRFs.

Table 4-5 Comparison for the Auto-normal Model

Image Size   η_AB   η_AC   η_BC
64x64        0.17  -0.09  -0.17

4.5. Fitting MRF Models to Natural Textures

Markov random field models have been widely used to model natural textures selected from Brodatz's book [Bro66].
Cross and Jain [Cro83] have shown good fits of second-order auto-binomial models to textures such as pressed cork (plate number D4) and grass lawn (D9). Chellappa et al. [Che85a] have fitted 4th-order GMRF models to textures such as wood grain (D68), calf leather (D24), and pigskin (D94). Cross and Jain, and Chellappa et al., also demonstrated that textures reproduced from the estimated model parameters are visually similar to the original textures. One of the goals of modeling textures is image compression; in other words, we would like to store a few model parameters instead of the whole image. Unfortunately, Cross and Jain [Cro83] and Chellappa et al. [Che85a] did not report the estimated parameters.

We have established a methodology for modeling textures. In this section, we exercise the methodology by fitting second-order Derin-Elliott models and auto-normal models to natural textures. Our goal is not only to find an appropriate MRF model to fit natural textures, but also to examine the goodness-of-fit test proposed in Section 4.3 in fitting MRF models to natural textures. We consider four textures: (a) D09, grass lawn; (b) D12, tree bark; (c) D24, calf leather; and (d) D68, wood grain, selected from Brodatz's book [Bro66] and digitized via a CCD camera, with 256 possible gray levels and image size 256x256. Figure 4-4 shows these textures.

Experiment NS1

We examine the fitting of second-order binary Derin-Elliott models to natural textures. The four textures D09, D12, D24, D68 are reduced to binary textures B09, B12, B24, B68 by an equal probability quantization technique (EPQ) [Har73, Appendix C]. Figure 4-5 shows the corresponding binary textures obtained from Figure 4-4. We estimate the parameters of B09, B12, B24, B68 by the Min-χ² method. The estimated parameters and the rank given by the nonparametric ranking test for each texture are reported in Table 4-6.
The textures reproduced from the estimated parameters by Algorithm C with 50 iterations are shown in Figure 4-6, and are visually different from the given textures in Figure 4-5. The rank of each fitting is 50 out of 50, which means the model fits none of the four binary textures; this coincides with visual inspection.

Table 4-6 Results of Fitting Derin-Elliott Model to Natural Textures

 Texture    θ1      θ2      θ3      θ4     Rank
 B09       0.338   0.648   0.034   0.104   50 (50)
 B12       0.284   0.778  -0.014   0.172   50 (50)
 B24       0.304   0.718   0.090   0.057   50 (50)
 B68       0.148   0.732   0.067   0.092   50 (50)

Figure 4-4 Natural textures -- (a) grass lawn (D9), (b) tree bark (D12), (c) calf leather (D24), (d) wood grain (D68)

Figure 4-6 Reproduced binary textures by estimated parameters

Experiment NS2

We fit second-order GMRF models to the four textures with 256 possible gray levels in Figure 4-4. For each texture, we compute the mean μ and the standard deviation ν of the gray levels over all pixels. We then subtract the mean μ from the original texture to obtain the image data, to which the second-order GMRF model is then fitted. The parameters estimated by MSSE, together with μ and ν, are shown in Table 4-7. Since the estimated parameters form a negative definite covariance matrix for each fit, we cannot reproduce the textures based on the estimated parameters.

Table 4-7 Results of Fitting GMRF Model to Natural Textures

 Texture    θ1       θ2       θ3       θ4        σ        μ        ν
 D09       0.5579   0.5109  -0.2927  -0.2759   6.347    99.88   37.832
 D12       0.5445   0.5056  -0.2976  -0.2532   6.894   148.24   66.204
 D24       0.5828   0.5150  -0.2992  -0.3001   7.396   125.61   48.535
 D68       0.5214   0.4810  -0.2538   0.2454   4.869   135.11   35.654

We conclude that the proposed goodness-of-fit test coincides with visual inspection and that the second-order MRF models do not fit these textures. We should pursue higher order Derin-Elliott or GMRF models, or other models, to fit these natural textures.

4.6.
Discussion and Conclusion

We have reviewed PMLE and LSQR for estimating parameters of Derin-Elliott and auto-binomial GRFs, and proposed Logit and Min-χ² estimators for the parameters of binary GRFs. We also proposed a nonparametric statistical method for comparing various estimators when their distributions are unknown; it is particularly useful for comparing a new estimator with existing ones. The limited experiments showed that Min-χ² is the best of the four estimators for binary images. Three estimators, PMLE, MSSE, and MLE, for the second-order auto-normal model were also compared. The experiment shows that the three estimators are equally robust but MSSE needs the least computation time, so it is preferred.

A model-independent quantitative measure of correspondence between a given texture and a GRF was proposed to help compare a given texture and a reproduced texture for each of the Derin-Elliott, auto-binomial, and auto-normal GRFs. We formalized the problem of examining the fit between an MRF and a texture as a nonparametric ranking test [Bar63, Dig83]. For discrete GRFs, this test involves sampling a large number of images; parallel computers [Qul87] may reduce the computational burden. We also proposed a computationally efficient goodness-of-fit statistic for auto-normal models, based on properties of Gaussian distributions. The proposed tests were implemented in examples and were shown to coincide with visual inspection.

Although parameter estimators were discussed only for MRF or GRF models in texture synthesis, they can also be applied to the problems of MRF model-based segmentation and restoration [Bes86, Che85c, Der86, Der87, Gem84, The83, The86]. The estimated model parameters can also be used as textural features for classification and recognition, as discussed in Chapter 5.
The contributions of this chapter are to propose (1) the Min-χ² estimator for binary MRFs; (2) a methodology which can find a robust estimator among existing ones, even when the distributions of the estimators are unknown; (3) quantitative statistics to measure the correspondence between a given texture and an MRF model.

Chapter 5 Texture Classification

Texture classification is widely used in medical imagery, remote sensing, the petroleum industry, and the carpet industry. An example of a medical problem is the evaluation of roentgenograms in order to classify normal and abnormal interstitial pulmonary patterns [Tul76]. An early example in remote sensing was the identification of crop types by using radar imagery [Ber70]. In the petroleum industry, the identification of rock types (sandstones) present in an underground reservoir of crude oil was investigated via aerial photography [Har73]. Texture classification can also be applied to carpet wear assessment and quality control [Sie88].

The texture classification problem can be stated as follows. Given n_i training texture patterns from texture class i, 1 ≤ i ≤ K, select texture features and establish a decision rule, based on the training texture patterns, which can efficiently classify an unknown texture. This chapter characterizes a texture classification system, reviews and discusses existing techniques for texture classification, and proposes a new method, based on fitting an auto-binomial model to digitized textures, to extract textural features for classification. The proposed method will be compared to traditional methods based on co-occurrence matrices, using synthetic and natural textures.

5.1. A Texture Classification System

A texture classification system is shown in Figure 5-1. We assume that an input texture is digitized into an MxN matrix whose elements have possible integer values from {0, 1, ..., G-1}, where G is the number of colors or gray levels.
The entire image is assumed to consist of a single texture pattern.

Figure 5-1 A Texture Classification System (input textures → Digitization → Feature Extraction → Classifier Design → Decision, with a Classifier Evaluation feedback loop)

The major steps in classification are feature extraction and classifier design. The goal of feature extraction is to reduce dimensionality from an MxN texture to a small feature vector which preserves as much information as possible. We would like all feature vectors captured from one texture class to form a cluster, and the clusters corresponding to different texture classes to be separated in some sense. A classifier partitions the feature space into K disjoint regions and assigns a feature vector computed from an unknown texture to one of the K regions according to a discriminant function. The classifier itself is evaluated by a classification error rate [Dev82, Dud73, Jai87]. In the texture classification problem, the features extracted from a digital texture must be selected so as to optimize classifier design. So classifier evaluation is used not only to evaluate a classifier but also to validate features.

5.2. Feature Definition and Extraction

Textural features are selected to preserve some characteristics of a texture pattern. The extraction of textural features is the most important step in texture classification. There are two types of textural features, descriptive and generative. Descriptive textural features capture the characteristics of a texture corresponding to visual perception, like coarseness, directionality, contrast, line-likeness, regularity, and roughness [Tam78], but cannot be used to generate a texture. Generative textural features can be used not only to describe a texture but also to produce a texture. A good review was given by Van Gool [Van85]. It is inappropriate to say that one set of textural features is universally better than another.
The principle of feature extraction in texture classification is to extract a minimal set of features displaying maximal discriminatory information.

Descriptive textural features can be based on spatial gray level dependence matrices (SGLDM) [Har73], the Fourier power spectrum (FPS), gray level run length matrices (GLRLM), and gray level difference matrices (GLDM) [Wes76]. Conners and Harlow [Con80] compared the features derived from SGLDM, GLRLM, GLDM, and FPS. The textures they used to compare the feature extraction methods were obtained by independently simulating a 4-state stationary Markov chain [Tay85] for each row of an image; in other words, the textures consist of four colors. They showed in a limited theoretical study that the SGLDM method is better than the other methods. Weszka et al. [Wes76] experimentally compared the features derived from SGLDM, GLRLM, GLDM, and FPS for terrain images and concluded that the SGLDM method dominates the others. Thousands of features can be derived from SGLDM, GLRLM, GLDM, or FPS, while the number of training texture patterns is limited, so one faces the difficult problem of selecting the "best features".

Generative features were introduced by several authors [Fau80, Che85b, Kas82, Kho87]. A texture can be reproduced from a generative feature vector. Faugeras and Pratt introduced a decorrelation method to extract features, such as the mean, standard deviation, skewness, and kurtosis computed from the residuals obtained by applying a whitening operator to the image [Fau80]. They suggested that the whitening operator be replaced by an edge operator, such as the Sobel or Laplacian operator [Bal82], to speed up computations. However, the features extracted using an edge operator will in general not be generative. Kaneko and Yodogawa [Kan82], and Chellappa and Chatterjee [Che85b], extracted textural features based on Gaussian Markov random field (GMRF) model parameters.
Khotanzad and Kashyap [Kho87] extracted textural features based on simultaneous autoregressive (SAR) model parameters. To extract effective features with these methods, the number of gray levels should be large, since these are continuous MRFs. The GMRF or SAR model parameters used to derive the features are estimated from a texture. Due to the restriction on the valid domain of the model parameters, the estimated parameters may be out of range and become invalid (Section 2.2.4). In the rest of this section, we review the features derived from SGLDM and propose a new feature extraction method, fitting discrete MRFs to textures, to obtain a vector of generative features.

5.2.1. Features Derived From SGLDM

A spatial gray level dependence matrix is computed from a digital texture and consists of elements p(i,j,d,α) obtained by counting the number of times each pair (i,j) of gray levels occurs at separation d pixels in the direction specified by an angle α. Usually, α ∈ {0°, 45°, 90°, 135°} and d = 1. A spatial gray level dependence matrix is a GxG symmetric matrix. Many textural features can be derived from a spatial gray level dependence matrix, but there is no simple way to select or extract effective ones; the number of potential features is enormous. The definitions of these features can be found in [Har73]. Four commonly used features are reviewed here and will be used for comparison in this chapter. Angular second moment (ASM) is a measure of the homogeneity of the image, contrast (CON) is a measure of the local variations present in the image, entropy (ENT) is a measure of the complexity of the image, and correlation (COR) is a measure of gray-tone linear dependencies in the image. Computational formulas for these four features are given below. Assuming the direction α and separation d are given, we write p(i,j) instead of p(i,j,d,α).
Let R = Σ_{i=0}^{G-1} Σ_{j=0}^{G-1} p(i,j), and define f(i,j) = p(i,j)/R for 0 ≤ i, j ≤ G-1 as the frequency with which the colors (i,j) occur in a pair of pixels d units apart at angle α. We further define the marginal densities along the horizontal and vertical directions as p_x(i) = Σ_{j=0}^{G-1} f(i,j) and p_y(j) = Σ_{i=0}^{G-1} f(i,j).

    ASM = Σ_{i=0}^{G-1} Σ_{j=0}^{G-1} f(i,j)²                            (5.1)

    CON = Σ_{i=0}^{G-1} Σ_{j=0}^{G-1} |i-j|² f(i,j)                      (5.2)

    ENT = -Σ_{i=0}^{G-1} Σ_{j=0}^{G-1} f(i,j) ln[f(i,j)]                 (5.3)

    COR = [Σ_{i=0}^{G-1} Σ_{j=0}^{G-1} i·j·f(i,j) - μ_x μ_y] / (σ_x σ_y)  (5.4)

where μ_x, μ_y, σ_x, σ_y are the means and standard deviations of p_x and p_y.

5.2.2. Features Extracted by Fitting a Discrete MRF to a Texture

The parameters of an MRF capture the essential characteristics of a texture generated from the MRF [Che81, Cro83, Has80]. Continuous MRF models are appropriate only for textures consisting of a large number of colors, while discrete MRF models are useful for textures with a small number of colors. We propose fitting a discrete MRF model to a 4-color texture and using the estimated model parameters of the MRF as textural features for classification.

5.3. Proposed Recognition Methodology

We propose using a 2-bit digitizer to digitize an input texture as a 4-color texture and fitting a second-order auto-binomial model, with parameter vector θ = (θ1, θ2, θ3, θ4) as defined in Section 2.2.2 or Equation (4.4), to this texture. The estimated parameter vector θ is taken as a feature vector. There are several advantages to this approach. First, the features characterize many textural properties such as directionality, line-likeness, and blob-likeness [Cro83]. Second, a 2-bit digitizer is cheaper than the digitizers required by other generative textural feature extraction techniques, which need more bits to guarantee the effectiveness of the textural features. Third, these features are generative and thus are information-preserving [Che85c].
In other words, if the model fits the texture well, a similar texture can be reproduced by the sampling algorithm given in Chapter 3. The parameters used as features are estimated by a coding scheme [Bes74] and the Newton-Raphson method [Gol83] as discussed in Chapter 4.

For "good features", it is expected that all feature vectors corresponding to a single texture class will form a cluster, and feature vectors corresponding to two different texture classes will form two separate clusters. We propose two methods for evaluating potential textural features. (1) Project the feature vectors to a two-dimensional space by the principal component projection (PCP) [And84, Mor76], and visually judge the clustering in the projected space. Since a projection from a higher dimensional space to 2D space may distort the structure of the original data, we also provide a quantitative evaluation. (2) Compute the classification error rate of Fisher's linear classifier [Fis36, Jai87, Mor76] by the leave-one-out method [Dud73, Dev82]. The error rate is commonly used to evaluate a classifier; here we fix the classifier and use the error rate to evaluate the textural features.

5.4. Experiments

The objective of the experiments is to compare the features selected by the proposed methodology with those derived from SGLDM. The comparison is mainly based on the leave-one-out error rate of Fisher's linear classifier [Appendix B], while the 2D projections of the feature vectors allow a visual judgement. We are interested in 4-level textures, whose levels are denoted {0, 1, 2, 3}. Under this constraint, the existing techniques for extracting generative textural features [Che85b, Fau80, Kas82, Kho87] are inappropriate (since their accuracy requires a large number of gray levels). The textural features derived from SGLDM are thought to be better than those based on other descriptive features [Con80, Sie88, Wes76], so we compare our proposed textural features with the features derived from SGLDM [Har73].
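The four SGLDM features of Equations (5.1)-(5.4) can be computed as in the following sketch. The offset table and the symmetrization step are our assumptions about the standard Haralick-style construction, and the function name is illustrative.

```python
import numpy as np

def sgldm_features(img, G, d=1, angle="0"):
    """ASM, CON, ENT, COR from a spatial gray level dependence matrix,
    following Equations (5.1)-(5.4). `angle` selects the pixel offset;
    the counts are symmetrized so p is a GxG symmetric matrix.
    """
    offsets = {"0": (0, d), "45": (-d, d), "90": (-d, 0), "135": (-d, -d)}
    di, dj = offsets[angle]
    p = np.zeros((G, G))
    M, N = img.shape
    for i in range(M):                      # count co-occurring gray-level pairs
        for j in range(N):
            ii, jj = i + di, j + dj
            if 0 <= ii < M and 0 <= jj < N:
                p[img[i, j], img[ii, jj]] += 1
    p = p + p.T                             # symmetric co-occurrence counts
    f = p / p.sum()                         # f(i,j) = p(i,j)/R
    i_idx, j_idx = np.indices((G, G))
    px, py = f.sum(axis=1), f.sum(axis=0)   # marginal densities
    mx, my = (np.arange(G) * px).sum(), (np.arange(G) * py).sum()
    sx = np.sqrt(((np.arange(G) - mx) ** 2 * px).sum())
    sy = np.sqrt(((np.arange(G) - my) ** 2 * py).sum())
    asm = (f ** 2).sum()                                     # (5.1)
    con = (np.abs(i_idx - j_idx) ** 2 * f).sum()             # (5.2)
    ent = -(f[f > 0] * np.log(f[f > 0])).sum()               # (5.3)
    cor = ((i_idx * j_idx * f).sum() - mx * my) / (sx * sy)  # (5.4)
    return asm, con, ent, cor
```

Averaging each feature over the four directions yields the four Method A features; stacking all sixteen gives the vectors from which Method B extracts principal components.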
The comparison is based on three experiments. Experiments 5-1 and 5-2 involve synthetic textures generated from 3rd-order and 2nd-order GMRF models respectively; Experiment 5-3 processes natural textures. Experiments 5-1 and 5-3 involve four classes of textures, while Experiment 5-2 contains eight classes. All experiments have sixteen training patterns in each class, and the image size is 64x64.

The sixteen features corresponding to angular second moment, contrast, entropy, and correlation derived from SGLDM with distance 1 in the 0°, 45°, 90°, 135° directions [Har73] are first obtained. The sample size is 16 for each class, and it is well known in the pattern recognition literature that the sample size must be at least four or five times as large as the number of features [Dud73, Jai82]. We use the averages of ASM, CON, ENT, and COR derived from SGLDM in the four directions as four textural features (Method A). We also extract, from the 16-tuple feature vectors, the first four principal components [And84] as four textural features for classification and recognition (Method B). The features obtained from fitting a second-order auto-binomial MRF to textures (Method C, see Section 5.3) will be compared with the sets of features obtained from Methods A and B. Note that the MRF parameters as features (Method C) are both descriptive and generative, the features extracted by Method A are descriptive but not generative, and the features extracted by Method B are neither descriptive nor generative.

Experiment 5-1

Four classes of 64x64 textures are generated from GMRF models by Algorithm N (Chapter 3) with parameters μ = 130, σ = 5, and (θ1, θ2, θ3, θ4, θ5, θ6) being (a) (0, 0, 0.28, -0.14, 0, 0), (b) (0.34, 0.34, -0.22, -0.22, 0, 0), (c) (0.10, 0.16, -0.03, -0.03, -0.14, 0.12), and (d) (0.17, 0.16, 0.08, 0.08, 0, 0) respectively. Note that the textures consist of 256 possible gray levels.
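In each experiment the feature sets are scored by the leave-one-out error of the Fisher-type linear classifier of Appendix B. A minimal sketch (the function name is ours, and a pseudo-inverse is used as a guard against a singular scatter matrix):

```python
import numpy as np

def fisher_leave_one_out_errors(X, y):
    """Leave-one-out error count of the classifier in Appendix B: each
    held-out vector is assigned to the class whose estimated mean is
    nearest in Mahalanobis distance under the pooled within-class
    scatter matrix S, re-estimated on the remaining samples.
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    classes = list(np.unique(y))
    errors = 0
    for i in range(len(X)):
        keep = np.ones(len(X), dtype=bool)
        keep[i] = False                      # hold out sample i
        Xt, yt = X[keep], y[keep]
        means = {c: Xt[yt == c].mean(axis=0) for c in classes}
        S = np.zeros((X.shape[1], X.shape[1]))
        for c in classes:                    # pooled within-class scatter
            D = Xt[yt == c] - means[c]
            S += D.T @ D
        S /= len(Xt)
        Sinv = np.linalg.pinv(S)
        g = {c: float((X[i] - means[c]) @ Sinv @ (X[i] - means[c]))
             for c in classes}               # Mahalanobis distance g_j(v)
        if min(g, key=g.get) != y[i]:
            errors += 1
    return errors
```

The ratios reported in Table 5-1 (e.g. 4/64) are exactly this error count over the number of training patterns.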
These textures are reduced to 4-color textures by equal probability quantization (EPQ) (Appendix C) and the textural features are then extracted from the 4-color textures. Figure 5-2 shows one texture of each type with 256 possible gray levels. The corresponding four gray-level textures after EPQ are shown in Figure 5-3; the reduced textures in Figure 5-3 resemble the corresponding 256 gray-level textures in Figure 5-2. The projections onto the first two principal components of the feature vectors from Methods A, B, and C are shown in Figures 5-4, 5-5, and 5-6 respectively. Figure 5-4 shows that the textures of classes (a) and (b) overlap in the 2D space, and the corresponding error rate in Table 5-1 is 4/64. Figure 5-5 shows that the four classes of textures are well-separated by Method B, and the leave-one-out error in Table 5-1 is zero. Figure 5-6 shows that the textures of classes (c) and (d) overlap in the 2D projection; however, the zero error rate in Table 5-1 suggests that the classes are separated in the four-dimensional space. This experiment shows that the MRF parameters as features (Method C) and the features based on the first four principal components of the features derived from SGLDM (Method B) performed better than those obtained by averaging the features derived from SGLDM (Method A).

Figure 5-2 Synthetic textures generated from 3rd-order GMRF models
Figure 5-3 Textures of four colors after EPQ on textures in Figure 5-2
Figure 5-4 PCP of feature vectors from SGLDM (A) in Experiment 5-1
Figure 5-5 PCP of feature vectors from SGLDM (B) in Experiment 5-1
Figure 5-6 PCP of MRF feature vectors (C) in Experiment 5-1

Table 5-1.
Leave-one-out Errors on Experiments 5-1, 5-2, 5-3

 Experiment    Method A    Method B    Method C
 5-1            4/64        0/64        0/64
 5-2           21/128       0/128       0/128
 5-3           19/64        7/64        1/64

Experiment 5-2

Eight classes of 64x64 textures are generated from GMRF models by Algorithm N with parameters μ = 160, σ = 60, and the θ parameters chosen to be the first eight parameter vectors of the experiment in Section 4.4.3, where θ = (θ1, θ2, θ3, θ4) = (a) (-0.04, -0.02, -0.15, -0.04), (b) (-0.02, -0.20, -0.22, -0.05), (c) (0.40, -0.31, 0.17, 0.06), (d) (0.09, -0.30, 0.07, -0.12), (e) (-0.21, -0.10, 0.05, 0.05), (f) (-0.09, -0.10, -0.38, -0.02), (g) (-0.03, -0.37, 0.09, -0.01), (h) (0.13, -0.02, 0.14, 0.16). We repeat the procedure of Experiment 5-1. Example textures with 256 possible gray levels, and their corresponding 4-color textures after EPQ, are shown in Figures 5-7 and 5-8 respectively. The 2D projections of the feature vectors obtained from Methods A, B, and C are shown in Figures 5-9, 5-10, and 5-11 respectively. Figures 5-10 and 5-11 show that the eight texture classes are well-separated by the feature vectors obtained from Method B or Method C. Figure 5-9 demonstrates that the texture classes are not separated by the features obtained from Method A. The error rates given in Table 5-1 coincide with the visual inspection. We conclude that the features obtained from Methods B and C perform equally well and both do better than the features obtained from Method A.

Figure 5-7 Synthetic textures generated from 2nd-order GMRF models
Figure 5-8 Textures of four colors after EPQ on textures in Figure 5-7
Figure 5-9 PCP of feature vectors from SGLDM (A) in Experiment 5-2
Figure 5-10 PCP of feature vectors from SGLDM (B) in Experiment 5-2
Figure 5-11 PCP of MRF feature vectors (C) in Experiment 5-2

Experiment 5-3

This experiment processes natural textures. Four textures, D09 -- grass lawn, D12 -- tree bark, D24 -- calf leather, and D68 -- wood grain, selected from Brodatz's book, were digitized via a CCD camera with 256 possible gray levels. These textures are frequently used in studies of texture synthesis [Che85a, Cro83, Tou76] and texture classification [Che85b, Cog85, Kan82, Kas82, Kho87, Tou80, Zuc80]. Figure 4-4 shows these four textures with 256 possible gray levels; each has size 256x256. Sixteen 64x64 nonoverlapping textures from each 256x256 texture are used as training textures for classification, so there are 64 textures in 4 classes, 16 in each class. Again, the training textures are reduced to 4-color textures by EPQ and the textural features are then extracted from the 4-color textures. The 2D projections of the feature vectors obtained from Methods A, B, and C are shown in Figures 5-13, 5-14, and 5-15 respectively. A visual inspection of Figures 5-13 through 5-15 indicates that none of the three sets of features from Methods A, B, and C can completely separate the texture classes. However, it seems that the texture features obtained by the MRF model fit separate the texture classes better than those derived from SGLDM. The error rates for these three feature selection and extraction methods, given in Table 5-1, support the visual inspection.

Figure 5-13 PCP of feature vectors from SGLDM (A) in Experiment 5-3
Figure 5-14 PCP of feature vectors from SGLDM (B) in Experiment 5-3
Figure 5-15 PCP of MRF feature vectors in Experiment 5-3

5.5.
Discussion and Conclusion

We have demonstrated a textural feature extraction method based on fitting a second-order auto-binomial model to a 4-color texture, using the four estimated parameters as a feature vector for texture classification. Our methodology, using a 2-bit digitizer instead of an 8-bit or 16-bit digitizer, is more economical than the existing generative feature extraction techniques [Che85b, Fau80, Kan82, Kas82, Kho87]. Our feature extraction method can be done in one step: the four features are obtained simply by fitting the model to a texture. Defining these features is equivalent to fitting a second-order auto-binomial model to the texture, but the goodness-of-fit is not an issue here. The SGLDM approach [Har73, Sie88, Wes76], in contrast, faces the horrendous problem of feature selection or extraction after a set of features has been derived from the spatial gray level dependence matrices. The features derived from SGLDM are not generative, and could also be non-descriptive. The features obtained from the MRF model fit are both generative and descriptive. Moreover, the MRF model-fit approach is simple and was shown to perform better than the traditional SGLDM approach through the experiments on synthetic and natural textures. We conclude that the proposed technique may be used for extracting textural features with a 2-bit digitizer in designing a texture classification system.

Chapter 6 Conclusion, Discussion, and Future Research

The applications of Markov random fields in texture synthesis, texture classification, image compression, restoration, and segmentation involve the problems of sampling MRFs, estimating MRF model parameters, and testing the goodness-of-fit between an MRF and a given texture. This thesis is concerned with these fundamental problems of characterization, sampling, estimation, goodness-of-fit testing, and texture classification.
6.1 Summary of Results

In Chapter 2, we define a class of Gibbs random fields characterized by the cliques consisting of a single pixel and pairs of pixels in a second-order neighborhood. The global distributions of the Derin-Elliott, auto-binomial, auto-Poisson, and auto-normal (GMRF) models are given. The corresponding local conditional densities of these models are also derived based on Theorem 2.

In Chapter 3, a new sampling algorithm (Algorithm C) based on [Ham64] for discrete GRFs is proposed, along with criteria for validating sampling algorithms. The new algorithm is compared with the Gibbs sampler (Algorithm G) [Gem84] for the Derin-Elliott and auto-binomial models via experiments; the comparison indicated that the new algorithm is computationally more attractive than the Gibbs sampler. A methodology for detecting unreliable, nonuniform, and indistinguishable GRFs, towards identifying unique useful GRFs, is characterized. The phenomenon of "phase transition" was discovered and discussed. The critical value of the phase transition for a second-order isotropic Derin-Elliott texture model with four colors is conjectured to be between 0.2 and 0.3. This phenomenon may not appear if the number of iterations in the sampling algorithm is not large enough. A phase transition occurs if a Gibbs distribution has "multiple peaks"; it can be detected by our proposed methodology for detecting a nonuniform GRF. We suggest not using an unreliable or a nonuniform GRF for texture synthesis. For indistinguishable useful GRFs of the same parametric form, the one whose parameter vector has the least norm is suggested to represent the other GRFs which are indistinguishable from it. A sampling algorithm for GMRFs is also reviewed and discussed. The exact valid domain for the parameters of the first-order GMRF model is given. However, the exact valid parameter space for a higher order GMRF is unknown but "narrow".
Parameter estimation for discrete GRF and GMRF models is examined in Chapter 4 and requires a huge computational effort. We review various estimators and propose a methodology for comparing them, which is particularly useful when the distributions of the estimators are unknown. We also propose new estimators for binary textures based on a logit model fit for binary variables. The methodology is tested by comparing estimators for the Derin-Elliott and auto-binomial discrete GRFs. The limited experiments show that the proposed Min-χ² estimator is computationally more efficient than Besag's pseudo maximum likelihood estimator and is more robust than Derin and Elliott's ad hoc estimator (LSQR, given in Section 4.1.2) for the Derin-Elliott and auto-binomial models. We also compared estimators for the GMRF model; the result is consistent with Besag's report [Bes77] for the first-order isotropic GMRF models.

A model-independent statistic for measuring the correspondence between an MRF and a given texture is proposed. A computationally efficient χ² statistic is also proposed for measuring the goodness-of-fit when a GMRF model is used for texture synthesis. The parameter estimation and goodness-of-fit test, together with the sampling algorithms, are applied to fit MRF models to four natural textures. The results agree with visual inspection (although none of the models fits these textures).

Chapter 5 proposes discrete MRF model parameters as textural features for classification. The second-order auto-binomial model with four parameters is used to capture textural features, which are compared with those derived from spatial gray level dependence matrices. Experiments show that the textural features based on MRF model parameters (Method C) perform better than the features extracted by SGLDM and averaging (Method A in Section 5.4) and those extracted by SGLDM and projection (Method B in Section 5.4) for natural textures.
However, the features extracted by Methods C and B perform equally well for synthetic textures generated from GMRF models, and both outperform the features extracted by Method A.

6.2. Discussion and Future Research

This thesis investigated the fundamental problems of sampling and statistical inference for MRF image models, and demonstrated the applications to texture synthesis and texture classification. Some areas of future research that extend the work in this thesis are listed below.

(1) A test of homogeneity is desired. The Markov random field model assumes homogeneity in the image; some verification of this should be made before fitting an MRF model to textures.

(2) We tried to fit second-order Derin-Elliott and GMRF models to four natural textures, but the fit was poor. Fitting higher order MRF models or other MRF models (e.g. the auto-Poisson models) to homogeneous natural textures should be performed. The selection of the type of model and the order of neighborhood is by experience and intelligence.

(3) Design an interactive program to reconstruct the original image x based on the degraded image y = Hx + η, where H is a linear operator and η consists of independent noise. In Chellappa and Kashyap's approach [Che82], either x or y is assumed to be a realization of a GMRF model, and the parameters estimated from a subimage of y are substituted into a restoration algorithm derived from the minimum mean square error (MMSE) estimate to recover the original image x. The drawback is that the estimated parameters for the GMRF model may be invalid, which leads to a meaningless restoration algorithm. Assuming x is a realization of a GMRF model with known parameters, an algorithm based on the maximum a posteriori estimate could be established to restore the original image.

(4) Establish knowledge-based algorithms, based on the maximum a posteriori estimate, to automatically segment an image.
Besag [Bes86] proposed the iterated conditional modes (ICM) algorithm to restore or segment "dirty pictures". The ICM algorithm combines intensity and contextual information to process a "dirty picture" and recover the true image. The drawback is that the prior distribution of the true image is assumed to be known, and each pixel of the observed image is assumed to have a Gaussian distribution and to be independent of the other pixels. To achieve automatic segmentation, the known parameters assumed in Besag's ICM algorithm should be estimated or "learned" from the observed image. A potential research direction is to construct an algorithm which estimates these parameters from the observed image and simultaneously maximizes the a posteriori distribution.

Appendices

Appendix A Proof of the Proposition in Chapter 2

Proposition: If a Derin-Elliott GRF (Section 2.2.1) has a probability measure P defined on the sample space Ω with parameters α_0 = α_1 = ... = α_{G-1}, then
(1) Let τ be a permutation on the set of colors {0, 1, ..., G-1}. For any X ∈ Ω, and Y obtained by Y_t = τ(X_t), t ∈ S, we have P(Y) = P(X).
(2) P(X_t = g) = 1/G for 0 ≤ g ≤ G-1 and any t ∈ S.

Proof: A probability measure is characterized by the energy function, which is in turn characterized by the indicator function I defined in Section 2.2.1. For any permutation τ on A = {0, 1, ..., G-1}, I(τ(a), τ(b)) = I(a, b) for any a, b ∈ A, which implies U(Y) = U(X) and hence P(Y) = P(X); this proves (1).

For (2), write P(X_t = g) = Σ_{X_t = g} e^{-U(X)} / Σ_{h=0}^{G-1} Σ_{X_t = h} e^{-U(X)}, where each sum Σ_{X_t = h} runs over the G^{MN-1} configurations with X_j ∈ A for j ≠ t. Consider Σ_{X_t = a} and Σ_{X_t = b}: for each X contributing to Σ_{X_t = a}, there is a corresponding Y contributing to Σ_{Y_t = b} with Y_j = τ(X_j), where τ is the permutation defined by τ(a) = b, τ(b) = a, and τ(r) = r for r ∈ A - {a, b}. By (1), U(Y) = U(X) and e^{-U(Y)} = e^{-U(X)}, which implies that Σ_{X_t = a} e^{-U(X)} is the same constant for every a ∈ A. Therefore, P(X_t = g) = 1/G.
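The proposition can be checked numerically on a tiny lattice by brute-force enumeration. This sketch is ours, not from the thesis; in particular, the pairwise potential (-β for equal neighboring colors, +β otherwise) is an assumed concrete stand-in for the Derin-Elliott indicator potential of Section 2.2.1, with all single-color parameters equal (here zero).

```python
import itertools
import math

def check_color_uniformity(M=2, N=2, G=3, beta=0.7):
    """Return the marginal probabilities of the G colors at site (0,0)
    of an MxN lattice under a Gibbs distribution whose pairwise potential
    depends on a pair of colors only through whether they are equal
    (assumed stand-in for the Derin-Elliott I(a,b)), with equal
    single-color parameters. By the proposition each marginal is 1/G.
    """
    sites = [(i, j) for i in range(M) for j in range(N)]

    def energy(x):
        u = 0.0
        for (i, j) in sites:                # horizontal and vertical pair cliques
            for (ii, jj) in ((i, j + 1), (i + 1, j)):
                if ii < M and jj < N:
                    u += -beta if x[i * N + j] == x[ii * N + jj] else beta
        return u

    # enumerate all G^(MN) configurations and their Gibbs weights
    weights = {x: math.exp(-energy(x))
               for x in itertools.product(range(G), repeat=M * N)}
    Z = sum(weights.values())
    # marginal probability of color g at site (0,0)
    return [sum(w for x, w in weights.items() if x[0] == g) / Z
            for g in range(G)]
```

The color-swapping argument in the proof corresponds exactly to the fact that `energy` is invariant under any permutation of the color labels, so each marginal comes out as 1/G regardless of β.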
Appendix B
Fisher's Linear Classifier

Consider n_i training vectors {v_1^(i), v_2^(i), ..., v_{n_i}^(i)} from class C_i, for i = 1, 2, ..., K. Estimators of the mean vector μ_i and covariance matrix Σ_i of the i-th pattern class are defined below:

  μ̂_i = (1/n_i) Σ_{j=1}^{n_i} v_j^(i),   Σ̂_i = (1/n_i) Σ_{j=1}^{n_i} (v_j^(i) - μ̂_i)(v_j^(i) - μ̂_i)^T,   for i = 1, 2, ..., K.

The pooled mean vector and the within-class scatter matrix are estimated by

  μ̂ = (1/n) Σ_{i=1}^{K} n_i μ̂_i,   S = (1/n) Σ_{i=1}^{K} n_i Σ̂_i,   where n = Σ_{i=1}^{K} n_i.

The Mahalanobis distance between a pattern v and the estimated mean vector of class C_j is denoted by g_j(v), where

  g_j(v) = (v - μ̂_j)^T S^{-1} (v - μ̂_j),   for j = 1, 2, ..., K.

Fisher's linear classifier is defined as: assign v to class C_i if g_i(v) = min_j g_j(v).

Appendix C
Equal Probability Quantization

Let x(i,j) be the gray value at site (i,j) of an M×N image (or lattice) x, and suppose that the number of possible gray values is much larger than an integer G. The equal probability quantization method for reassigning the sites of the image x colors from 0 to G-1 is described below. Let F_x be the empirical distribution of gray values in the image x. Define

  Q(k) = inf { g : F_x(g) ≥ (k+1)/G },   0 ≤ k ≤ G-1,   and Q(-1) = -1.

A possible assignment is x(i,j) ← k if Q(k-1) < x(i,j) ≤ Q(k), 0 ≤ k ≤ G-1.

List of References

[Ahu81] Ahuja, N. and Rosenfeld, A. (1981), Mosaic Models for Textures, IEEE Trans. on PAMI-3, pp.1-10.

[And84] Anderson, T.W. (1984), An Introduction to Multivariate Statistical Analysis, New York: Wiley.

[And77] Andrews, H.C. and Hunt, B.R. (1977), Digital Image Restoration, New Jersey: Prentice-Hall.

[Ans56] Anscombe, F.J. (1956), On Estimating Binomial Response Relations, Biometrika, 43, pp.461-464.

[Ash72] Ashton, W.D. (1972), The Logit Transformation, New York: Hafner Publ. Co.

[Bal82] Ballard, D.H. and Brown, C.M. (1982), Computer Vision, New York: Prentice-Hall.
[Bar63] Barnard, G.A. (1963), Contribution to the Discussion of Professor Bartlett's Paper, J.R.Statist.Soc., Ser.B, 25, p.294.

[Ber55] Berkson, J. (1955), Maximum Likelihood and Minimum Chi-square Estimates of the Logistic Function, JASA, 50, pp.130-162.

[Ber70] Berger, D.H. (1970), Texture as a Discriminant of Crops on Radar Imagery, IEEE Trans. on Geoscience Electronics, GE-8, pp.344-348.

[Bes74] Besag, J. (1974), Spatial Interaction and the Statistical Analysis of Lattice Systems, J.R.Statist.Soc.B, 36, pp.192-236.

[Bes75] Besag, J. (1975), Statistical Analysis of Non-lattice Data, The Statistician, 24, pp.179-195.

[Bes77] Besag, J. (1977), Efficiency of Pseudolikelihood Estimation for Simple Gaussian Fields, Biometrika, 64, pp.616-618.

[Bes86] Besag, J. (1986), On the Statistical Analysis of Dirty Pictures, J.R.Statist.Soc.B, 48, pp.259-302.

[Boh86] Bohachevsky, I.O., Johnson, M.E. and Stein, M.L. (1986), Generalized Simulated Annealing for Function Optimization, Technometrics, 28, pp.209-217.

[Box76] Box, G.E.P. and Jenkins, G.M. (1976), Time Series Analysis, Holden-Day Co.

[Bro66] Brodatz, P. (1966), Textures: A Photographic Album for Artists and Designers, New York: Dover.

[Car85] Carnevali, P., Coletti, L. and Patarnello, S. (1985), Image Processing by Simulated Annealing, IBM J. Res. Develop., vol.29, no.6, pp.569-579.

[Cha85] Chatterjee, S. and Chellappa, R. (1985), Maximum Likelihood Texture Segmentation Using Gaussian Markov Random Field Models, Proc. IEEE Conf. CVPR, pp.215-217.

[Che81] Chellappa, R. (1981), Stochastic Models in Image Analysis and Processing, Ph.D. Thesis, Purdue University.

[Che82] Chellappa, R. and Kashyap, R.L. (1982), Digital Image Restoration Using Spatial Interaction Models, IEEE Trans. on ASSP-30, pp.461-472.

[Che83] Chellappa, R. and Kashyap, R.L. (1983), Estimation and Choice of Neighbors in Spatial-Interaction Models of Images, IEEE Trans. on IT-29, pp.60-72.
[Che85a] Chellappa, R., Chatterjee, S. and Bagdazian, R. (1985), Texture Synthesis and Compression Using Gaussian-Markov Random Field Models, IEEE Trans. on SMC-33, pp.298-303.

[Che85b] Chellappa, R. and Chatterjee, S. (1985), Classification of Textures Using Gaussian Markov Random Fields, IEEE Trans. on ASSP-33.

[Che85c] Chellappa, R. (1985), Two-Dimensional Discrete Gaussian Markov Random Field Models for Image Processing, in Progress in Pattern Recognition 2, edited by L.N. Kanal and A. Rosenfeld, North-Holland Publ. Co., pp.79-112.

[Cho87] Chou, P.B., Brown, C.M. and Raman, R. (1987), A Confidence-Based Approach to the Labeling Problem, IEEE Workshop on Image Understanding, Miami, pp.51-56.

[Cog85] Coggins, J.M. and Jain, A.K. (1985), Spatial Filtering Approach to Texture Analysis, Pattern Recognition Letters, 3, pp.195-203.

[Coh87] Cohen, F.S. and Cooper, D.B. (1987), Simple Parallel Hierarchical and Relaxation Algorithms for Segmenting Noncausal Markovian Random Fields, IEEE Trans. on PAMI-9, pp.195-219.

[Con80] Conners, R.W. and Harlow, C.A. (1980), A Theoretical Comparison of Texture Algorithms, IEEE Trans. on PAMI-2, pp.204-222.

[Cox77] Cox, D.R. (1977), The Analysis of Binary Data, New York: Wiley.

[Cox80] Cox, D.R. and Isham, V. (1980), Point Processes, London: Chapman & Hall.

[Cro80] Cross, G.R. (1980), Markov Random Field Texture Models, Ph.D. Thesis, Michigan State University.

[Cro83] Cross, G.R. and Jain, A.K. (1983), Markov Random Field Texture Models, IEEE Trans. on PAMI-5, pp.25-39.

[Del79] Delp, E.J. and Kashyap, R.L. (1979), Image Data Compression Using Autoregressive Time Series Models, Pattern Recognition, 11, pp.313-323.

[Der84] Derin, H., Elliott, H., Cristi, R. and Geman, D. (1984), Bayes Smoothing Algorithms for Segmentation of Binary Images Modeled by Markov Random Fields, IEEE Trans. on PAMI-6, pp.707-720.

[Der85] Derin, H., Elliott, H. and Kuang, J. (1985), A New Approach to Parameter Estimation for Gibbs Random Fields, Proc. IEEE Conf.
ASSP, Tampa, FL, pp.24.8.1-24.8.4.

[Der86] Derin, H. and Cole, W.S. (1986), Segmentation of Textured Images Using Gibbs Random Fields, CVGIP, 35, pp.72-98.

[Der87] Derin, H. and Elliott, H. (1987), Modeling and Segmentation of Noisy and Textured Images Using Gibbs Random Fields, IEEE Trans. on PAMI-9, pp.39-55.

[Dev82] Devijver, P.A. and Kittler, J. (1982), Pattern Recognition: A Statistical Approach, Englewood Cliffs, NJ: Prentice-Hall International.

[deSo82] de Souza, P. (1982), Texture Recognition via Autoregression, Pattern Recognition, 15, pp.471-475.

[Dig83] Diggle, P.J. (1983), Statistical Analysis of Spatial Point Patterns, New York: Academic Press.

[Dud73] Duda, R.O. and Hart, P.E. (1973), Pattern Classification and Scene Analysis, New York: Wiley.

[Fan88] Fan, Z. and Cohen, F.S. (1988), Textured Image Segmentation as a Multiple Hypothesis Test, IEEE Trans. on Circuits & Systems, pp.691-702.

[Fau80] Faugeras, O.D. and Pratt, W.K. (1980), Decorrelation Methods of Texture Feature Extraction, IEEE Trans. on PAMI-2, pp.323-332.

[Fis36] Fisher, R.A. (1936), The Use of Multiple Measurements in Taxonomic Problems, Ann. Eugenics, vol.7, part II, pp.179-188.

[Fra87] Frankot, R.T. and Chellappa, R. (1987), Lognormal Random-Field Models and Their Applications to Radar Image Synthesis, IEEE Trans. on Geoscience and Remote Sensing, GE-25, pp.195-206.

[Gag81] Gagalowicz, A. (1981), A New Method for Texture Field Synthesis: Some Applications to the Study of Human Vision, IEEE Trans. on PAMI-3, pp.520-533.

[Gem84] Geman, S. and Geman, D. (1984), Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images, IEEE Trans. on PAMI-6, pp.721-741.

[Gol83] Golub, G.H. and Van Loan, C.F. (1983), Matrix Computations, The Johns Hopkins University Press.

[Gon87] Gonzalez, R.C. and Wintz, P. (1987), Digital Image Processing, New York: Addison-Wesley.
[Gou88] Goutsias, J. (1988), A Comparative Study of Two Useful Discrete-Valued Random Fields for the Statistical Modeling of Images, Proc. IEEE Conf. CVPR, pp.310-315.

[Ham64] Hammersley, J.M. and Handscomb, D.C. (1964), Monte Carlo Methods, New York: Wiley, pp.112-126.

[Har73] Haralick, R.M., Shanmugam, K. and Dinstein, I. (1973), Textural Features for Image Classification, IEEE Trans. on SMC-3, pp.610-621.

[Har79] Haralick, R.M. (1979), Statistical and Structural Approaches to Texture, Proc. IEEE, 67, pp.786-804.

[Har86] Haralick, R.M. (1986), Texture Analysis, in Handbook of Pattern Recognition and Image Processing, edited by T.Y. Young and K.S. Fu, Ch. 11, pp.247-279.

[Has80] Hassner, M. and Sklansky, J. (1980), The Use of Markov Random Fields as Models of Texture, CGIP, 12, pp.357-370.

[He88] He, D.C., Wang, L. and Guibert, J. (1988), Texture Discrimination Based on an Optimal Utilization of Texture Features, Pattern Recognition, 21, pp.141-146.

[Hop68] Hope, A.C.A. (1968), A Simplified Monte Carlo Significance Test Procedure, J.R.Statist.Soc., Ser.B, 30, pp.582-588.

[Jai82] Jain, A.K. and Chandrasekaran, B. (1982), Dimensionality and Sample Size Considerations in Pattern Recognition Practice, Handbook of Statistics, 2, edited by P.R. Krishnaiah and L.N. Kanal, North-Holland Publ. Co., pp.835-855.

[Jai87] Jain, A.K., Dubes, R.C. and Chen, C.C. (1987), Bootstrap Techniques for Error Estimation, IEEE Trans. on PAMI-9, pp.628-633.

[Jul81] Julesz, B. (1981), Textons, the Elements of Texture Perception, and Their Interactions, Nature, 290, pp.91-97.

[Kan82] Kaneko, H. and Yodogawa, E. (1982), A Markov Random Field Application to Texture Classification, Proc. IEEE Conf. PRIP, pp.221-225.

[Kas80] Kashyap, R.L. (1980), Univariate and Multivariate Random Field Models for Images, CGIP, 12, pp.257-270.

[Kas81] Kashyap, R.L. (1981), Random Field Models on Finite Lattices for Finite Images, presented at the Conference on Information Sciences, Johns Hopkins University, May.
[Kas82] Kashyap, R.L., Chellappa, R. and Khotanzad, A. (1982), Texture Classification Using Features Derived from Random Field Models, Pattern Recognition Letters, 1, pp.43-50.

[Kas84] Kashyap, R.L. and Lapsa, P.M. (1984), Synthesis and Estimation of Random Fields Using Long-Correlation Models, IEEE Trans. on PAMI-6, pp.800-809.

[Ken80] Kennedy, W.J. and Gentle, J.E. (1980), Statistical Computing, New York: Marcel Dekker.

[Kin80] Kindermann, R. and Snell, J.L. (1980), Markov Random Fields and Their Applications, American Mathematical Society, volume 1.

[Kho87] Khotanzad, A. and Kashyap, R. (1987), Feature Selection for Texture Recognition Based on Image Synthesis, IEEE Trans. on SMC-17, pp.1087-1095.

[Law79] Laws, K.I. (1979), Textured Image Segmentation, Ph.D. Thesis, University of Southern California.

[Leh75] Lehmann, E.L. (1975), Nonparametrics: Statistical Methods Based on Ranks, New York: McGraw-Hill.

[Lev85] Levine, M.D. (1985), Vision in Man and Machine, New York: McGraw-Hill.

[Lu79] Lu, S.Y. and Fu, K.S. (1979), Stochastic Tree Grammar Inference for Texture Synthesis and Discrimination, CGIP, 9, pp.234-245.

[Mar79] Marriott, F.H.C. (1979), Barnard's Monte Carlo Tests: How Many Simulations?, Appl. Stat., 28, pp.75-77.

[Mar87a] Marroquin, J., Mitter, S. and Poggio, T. (1987), Probabilistic Solution of Ill-Posed Problems in Computational Vision, JASA, 82, pp.76-89.

[Mar87b] Marroquin, J. (1987), Deterministic Bayesian Estimation of Markovian Random Fields with Applications to Computational Vision, Proc. IEEE ICCV, 1, pp.597-601.

[McCo74] McCormick, B.H. and Jayaramamurthy, S.N. (1974), Time Series Model for Texture Synthesis, Int. J. Comp. & Infor. Sci., 3, pp.329-343.

[Met53] Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H. and Teller, E. (1953), Equation of State Calculations by Fast Computing Machines, J. Chem. Phys., 21, pp.1087-1092.

[Mod80] Modestino, J.W., Fries, R.W.
and Vickers, A.L. (1980), Stochastic Image Models Generated by Random Tessellations of the Plane, CGIP, 12, pp.74-98.

[Moo74] Mood, A.M., Graybill, F.A. and Boes, D.C. (1974), Introduction to the Theory of Statistics, New York: McGraw-Hill.

[Mor73] Moran, P.A.P. (1973), A Gaussian Markovian Process on a Square Lattice, J. Appl. Prob., 10, pp.605-612.

[Mor76] Morrison, D.F. (1976), Multivariate Statistical Methods, New York: McGraw-Hill.

[Pav86] Pavlidis, T. (1986), A Critical Survey of Image Analysis Methods, Proc. of the 8th ICPR, Paris, France, pp.502-511.

[Pen84] Pentland, A.P. (1984), Fractal-Based Description of Natural Scenes, IEEE Trans. on PAMI-6, pp.661-674.

[Pic87] Pickard, D.K. (1987), Inference for Discrete Markov Fields: The Simplest Nontrivial Case, J. Appl. Prob., 24, pp.90-96.

[Pra78] Pratt, W.K. (1978), Digital Image Processing, New York: Wiley-Interscience.

[Qui87] Quinn, M.J. (1987), Designing Efficient Algorithms for Parallel Computers, McGraw-Hill.

[Rip81] Ripley, B.D. (1981), Spatial Statistics, New York: Wiley.

[Rip87] Ripley, B.D. (1987), Stochastic Simulation, New York: Wiley.

[Ros82] Rosenfeld, A. and Kak, A.C. (1982), Digital Picture Processing, New York: Academic Press.

[Ros85] Rosenfeld, A. (1985), Survey: Picture Processing 1984, CVGIP, 30, pp.211-214.

[Sch78] Schachter, B.J., Rosenfeld, A. and Davis, L.S. (1978), Random Mosaic Models for Textures.

[Sie88] Siew, L.H., Hodgson, R.M. and Wood, E.J. (1988), Texture Measures for Carpet Wear Assessment, IEEE Trans. on PAMI-10, pp.92-105.

[Sto83] Stoer, J. and Bulirsch, R. (1983), Introduction to Numerical Analysis, New York: Springer-Verlag.

[Tam78] Tamura, H., Mori, S. and Yamawaki, T. (1978), Textural Features Corresponding to Visual Perception, IEEE Trans. on SMC-8, pp.460-470.

[Tay84] Taylor, H.M. and Karlin, S. (1984), An Introduction to Stochastic Modeling, New York: Academic Press Inc., pp.121-129.
[The83] Therrien, C.W. (1983), An Estimation-Theoretic Approach to Terrain Image Segmentation, CVGIP, 22, pp.313-326.

[The86] Therrien, C.W., Quatieri, T.F. and Dudgeon, D.B. (1986), Statistical Model-Based Algorithms for Image Analysis, Proc. IEEE, 74, pp.532-551.

[Tsa85] Tsai, W.H. (1985), Moment-Preserving Thresholding: A New Approach, CVGIP, 29, pp.377-393.

[Tou76] Tou, J.T., Kao, D.B. and Chang, Y.S. (1976), Pictorial Texture Analysis and Synthesis, Proc. IEEE Conf. PRIP, pp.590-590p.

[Tou80] Tou, J.T. (1980), Pictorial Feature Extraction and Recognition via Image Modeling, CGIP, 12, pp.376-406.

[Tul76] Tully, R.J., Conners, R.W., Harlow, C.A., Larsen, G.N., Dwyer, S.J. and Lodwick, G.S. (1976), Interactive Analysis of Pulmonary Infiltration, Proc. of the 3rd IJCPR, Coronado, California, pp.238-242.

[Van85] Van Gool, L., Dewaele, P. and Oosterlinck, A. (1985), Survey: Texture Analysis Anno 1983, CVGIP, 29, pp.336-357.

[Wec80] Wechsler, H. (1980), Texture Analysis - A Survey, Signal Processing, 2, North-Holland Publ. Co., pp.271-282.

[Wes76] Weszka, J., Dyer, C.R. and Rosenfeld, A. (1976), A Comparative Study of Texture Measures for Terrain Classification, IEEE Trans. on SMC-6, pp.269-285.

[Wol85] Wolberg, G. and Pavlidis, T. (1985), Restoration of Binary Images Using Stochastic Relaxation with Annealing, Pattern Recognition Letters, 3, pp.357-388.

[Woo72] Woods, J.W. (1972), Two-Dimensional Discrete Markovian Fields, IEEE Trans. on IT-18, pp.232-240.

[Zuc80] Zucker, S.W. and Terzopoulos, D. (1980), Finding Structure in Co-occurrence Matrices for Texture Analysis, CGIP, 12, pp.286-308.
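As a closing illustration, the equal-probability quantization rule of Appendix C translates directly into code. The sketch below is my own illustration (the function and variable names are invented; the thesis specifies only the rule): it computes the empirical distribution F_x, the thresholds Q(k) = inf{ g : F_x(g) ≥ (k+1)/G }, and the assignment x(i,j) ← k if Q(k-1) < x(i,j) ≤ Q(k).

```python
def equal_probability_quantize(image, G):
    """Reassign gray values to colors 0..G-1 so that each color is
    (approximately) equally likely, following Appendix C."""
    pixels = [v for row in image for v in row]
    n = len(pixels)
    values = sorted(set(pixels))
    # empirical distribution F_x(g) = fraction of pixels <= g
    F = {g: sum(1 for v in pixels if v <= g) / n for g in values}
    # thresholds Q(k) = smallest gray value g with F_x(g) >= (k+1)/G
    Q = [min(g for g in values if F[g] >= (k + 1) / G) for k in range(G)]

    def color(v):
        # x(i,j) <- k if Q(k-1) < v <= Q(k); Q(-1) = -1 is implicit
        for k in range(G):
            if v <= Q[k]:
                return k
        return G - 1  # defensive fallback; unreachable since Q[G-1] = max value

    return [[color(v) for v in row] for row in image]

img = [[10, 20, 30, 40],
       [50, 60, 70, 80]]
print(equal_probability_quantize(img, 4))
# -> [[0, 0, 1, 1], [2, 2, 3, 3]]: each of the 4 colors covers two of the 8 pixels
```

When the gray values have ties or a coarse histogram, the colors are only approximately equiprobable, which is why the assignment above is described in the appendix as "a possible assignment".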