4.

.

a

in?
4%? ﬁg: .
..i .. N

 

This is to certify that the
dissertation entitled

Modeling Genetic Algorithm Dynamics for OneMax and
Deceptive Functions

presented by
Bulent Buyukbozkirli

has been accepted towards fulﬁllment
of the requirements for the

Doctoral degree in Mathematics

$122229)“,

’Maior F'rofessor’s Signature
M £41 :2 00%
ﬂ /

Date

 

MSU is an Afﬁnnative Action/Equal Opportunity Institution

 

   

M‘Chlgan State
University

PLACE IN RETURN BOX to remove this checkout from your record.
To AVOID FINES return on or before date due.
MAY BE RECALLED with earlier due date if requested.

 

DATE DUE

DATE DUE

DATE DUE

 

 

 

 

 

 

 

 

 

 

 

 

 

 

6/01 c:/CIRC/DateDue.p65-p. 15

 

 

MODELING GENETIC ALGORITHM DYNAMICS FOR ONEMAX
AND DECEPTIVE FUNCTIONS

BY

Bulent Buyukbozkirli

A DISSERTATION

Submitted to
Michigan State University
In partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY
Department of Mathematics

2004

 

ln this
(Onel
mean
The
crossc
devek

indivic

CiOSSi
QGDer
(lowe‘
mode
ODGra
mode
One”
”0884

IIXed ‘

ABSTRACT

MODELING GENETIC ALGORITHM DYNAMICS FOR ONEMAX
AND DECEPTIVE FUNCTIONS

By

Bulent Buyukbozkirli

In this dissertation, we develop a model predicting dynamics of the counting-ones
(OneMax) and a form of deceptive function problems. The model describes the
mean allele and, in the case of deceptive function, mean deceptive block values.
The genetic algorithm (GA) that is being modeled consists of two-point
crossover, fitness proportional selection and mutation operators. The model is
developed to estimate the average GA dynamics, but it can also be used for an

individual run of the GA.

First, we develop the model for the OneMax problem with very high
crossover rates. Then, we modify the model by using statistics of very early
generations from GA runs, to describe the complete dynamics for different
(lower) crossover rates of the OneMax problem. In the development of the
model, we introduce a new quantity that measures the effect of the crossover
operation and is independent of generation, for practical purposes. Then, the
model is generalized to cover other cases of the OneMax, such as weighted
OneMax, as well as a form of deceptive function problem, for high enough
crossover rates. The model is also modified to include Boltzmann selection with

fixed or scaled selection pressures.

 

 

even
Botza
Since
dece:
a Sui
aﬂgon

gene:

 

shnpl

nontr

dﬁte
“Wet
" st
eq'u;

SQ!“-

The model can be applied to OneMax and deceptive function problems
even when the crossover, mutation and the selection pressures (in the case of
Boltzmann scaling) are changed at predetermined generations during a GA run.
Since our model estimates the mean value (and, mean deceptive block value for
deceptive function) at each locus at any generation, it can be used to determine
a suitable migration time as well as the migration rate for parallel genetic
algorithms (in the island model case) when migrations are allowed at any

generation for our benchmark problems.

Although the problems for which our model proved successful were rather
simple or idealized, they were often sufficiently involved to capture interesting
nontrivial features of the GA dynamics. The author hopes to extend the approach
to model solution of more representative real-world problems with various

degrees of OneMax similarity and various amounts of deception.

At the end of the dissertation, an attempt to develop a stochastic
differential equation model to predict the evolution of the fitness distribution for
the OneMax problem is also presented. Although our attempt was not successful,
it shows a strong connection between fitness evolution and certain diffusion
equations and points toward the possibility of the existence of a diffusion-type

equation that could describe the OneMax and maybe other type of GA dynamics.

 

 

DLCS
gene:
the 5

very ll

algori

 

thecﬂ
hketc
NMXDI
resee

for re

My v
mOm
bGIOI
Work

the g

ACKNOWLEDGEMENTS

The front page of this dissertation has the name of only one person; however it

came into existence by the collective being of many others.

Dr. Goodman has introduced me to the interesting, open questions in the field of
genetic algorithms. Through long and fruitful discussions with him, the details of
this study have emerged. He was also very patient to read the manuscript in its
very raw form and suggest many improvements. My involvement in the genetic
algorithm theory has started by the suggestions and guidance of Dr. MacCluer in
the direction of applying it to solve a practical power electronics problem. I would
like to also express my thanks to the other members of my thesis committee: Dr.
McCleer and Dr. Peng for their suggestions about the practical application of my
research in the field of electrical power inverters, and Dr. Newhouse and Dr. Hall

for reading my thesis and for their insightful questions.

My wife, Figen, has been the strongest source of inspiration and love at every
moment of my studies. Our son, Eren Baray, who was born just a couple months
before the completion of this dissertation, brought the energy and will to bring this
work into its final form at its hardest stages. I am also grateful for the prayers and

the good wishes of my parents, elders of my family and all the friends.

 

 

LIST ._

LIST

 

CHAl
Gen
1.1
1.2

d

1.3
1.4

##u—A
—L

1.5

CHA
A Ml
2.1

2.2

TABLE OF CONTENTS

LIST OF TABLES .............................................................................................. VII
LIST OF FIGURES ........................................................................................... VIII
CHAPTER 1
Genetic Algorithms: Theory and Models .......................................................... 1
1.1 Thesis Outline ............................................................................................... 1
1.2 What is a Genetic Algorithm? ....................................................................... 3
1 .2.1 Counting-Ones (OneMax) .............................................................. 10
1.2.2 Cumulants as a Tool to Observe GA Evolution ............................. 12
1.2.3 Counting-Ones and Migration ........................................................ 17
1.3 GA with a Real-Life Problem ...................................................................... 25
1.4 Theoretical Models of GA ........................................................................... 29
1.4.1 Schema Theory ............................................................................. 30
1.4.2 Random Walk Model ..................................................................... 34
1.4.3 Markov Chain Model ...................................................................... 35
1.4.4 Statistical Mechanics Model .......................................................... 37
1.5 The Need for a New Model ......................................................................... 40
CHAPTER 2
A Model of GA Dynamics for the OneMax Problem ....................................... 43
2.1 Problem Description and a Visual Representation of the
OneMax Dynamics ..................................................................................... 43
2.2 The Model for OneMax with Fitness Proportional Selection ....................... 53
2.2.1 Selection and Crossover with “High Enough”
Crossover Rate .............................................................................. 54
2.2.2 Mutation ......................................................................................... 57
2.2.3 Fitness Variance for “High Enough” Crossover Rates ................... 58
2.2.4 Lower Crossover Rates ................................................................. 60
2.2.5 The Algorithm of the Model ............................................................ 64
2.3 Weighted OneMax Fitness Function ........................................................... 66
2.4 OneMax with Boltzmann Scaling ................................................................ 69
CHAPTER 3
A Model of GA Dynamics for the Deceptive Function Problem .................... 74
3.1 Deceptive Function ..................................................................................... 74
3.2 A Model for the Deceptive Function with Fitness-Proportional
Selection ..................................................................................................... 76
3.2.1 Selection and Crossover with “High Enough”
Crossover Rate .............................................................................. 78
3.2.2 Mutation ......................................................................................... 84

 

3.3
3.4

CHAI
Coml

CHA
Cont
5.1
5.2

APF

BIB

3.2.3 Fitness Variance ............................................................................ 85

 

3.2.4 The Algorithm of the Model ............................................................ 88
3.3 Weighted Deceptive Fitness Function ........................................................ 90
3.4 Deceptive Function with Boltzmann Scaling ............................................... 92
CHAPTER 4
Comparison of the Model with GA Experiments ............................................ 95
4.1 OneMax Problem ........................................................................................ 96
4.1.1 Fitness Proportional Selection ....................................................... 96
4.1.2 Boltzmann Scaling ....................................................................... 103
4.1.3 Weighted Fitness Function .......................................................... 113
4.2 Deceptive Function Problem ..................................................................... 127
4.2.1 Fitness Proportional Selection ..................................................... 127
4.2.2 Boltzmann Selection .................................................................... 131
CHAPTER 5
Conclusions and Future Work ....................................................................... 136
5.1 Conclusions .............................................................................................. 136
5.2 Future Work .............................................................................................. 139
APPENDIX.................. _____________ _ _ -_ _ _ - - 142
A Comparison of the OneMax Problem with a Diffusion Model .................. 142
BIBLIOGRAPHY ............................................................................................... 149

vi

 

am

(I)

fable

 

 

LIST OF TABLES

Table 1.1 The three cases with the average generation numbers
when the mean fitness of 95 is achieved .................................. 22

Table 2.1 Notations .......................................................................... 46

vii

LIST OF FIGURES

Figure 1.1 Genetic Algorithm creating new populations with

selection, mutation and crossover operations .................................. 4
Figure 1.2 An example of an initial population together with the

fitnesses and relative fitnesses of their chromosomes ..................... 5
Figure 1.3 A possible outcome of fitness proportional selection ....................... 6
Figure 1.4 An example of two-point crossover .................................................. 7
Figure 1.5 An example of mutation ................................................................... 8

Figure 1.6 A GA with multiple populations, arrows showing the
direction of migration ....................................................................... 9

Figure 1.7 Evolution of mean fitness over 100 generations in a
GA, for a counting-ones problem. Population size =
50, chromosome length = 100, ,6 = 0.2 ........................................ 11

Figure 1.8 Evolution of mean fitness for a counting-ones problem
over 200 generations, after the average is taken over
1000 GA runs. Population size = 50, chromosome
length = 100, ,6 = 0.2. .................................................................. 12

Figure 1.9 The time evolution of cumulants, K1 and [(2, for two

cases of GA. First with selection, mutation and
crossover, the second with selection and mutation
only. The graphs are obtained by averaging 100 GA
runs ................................................................................................ 15

Figure 1.10 The time evolution of cumulants, K3 and K4, for two

cases of GA. First with selection, mutation and
crossover, the second with selection and mutation
only. The graphs are obtained by averaging 100 GA
runs ................................................................................................ 16

Figure 1.11 Evolution of fitness distributions for CASE1, CASE2
and CASE3 .................................................................................... 19

Figure 1.12 Evolution of mean and maximum fitness for CASE1,
CASE2 and CASE3 ....................................................................... 20

Figure 1.13 Evolution of mean fitness and fitness variance for
CASE1, CASE2 and CASE3 ......................................................... 21

viii

 

 

Figur
Figur
Figur

Figur

Figw.
Figu
Figu
Figu

Figu

Figu

Figu

Figu

Figu:

Figure 1.14

Figure 1.15
Figure 1.16

Figure 1.17

Figure 1 .18
Figure 1 .19
Figure 1.20

Figure 2.1

Figure 2.2

Figure 2.3

Figure 2.4

Figure 2.5

Figure 2.6

Figure 2.7
Figure 3.1

A), curves for three cases. pm = 0.001 and ,6 = 0.3 .................. 24

Passage from DC to AC voltage by switch-mode
inverters ......................................................................................... 26

A simplified circuit of pulse-width-modulated inverter. ................... 27

The part of the chromosome representing timing of a
transistor pair during one period of the desired output

voltage ........................................................................................... 29
Template H .................................................................................... 31
The Random Walk ......................................................................... 35
The statistical mechanics model applied to cumulants .................. 38

Selection-mutation-crossover versus crossover-
selection-mutation .......................................................................... 43

The chromosomes in the population, mean allele and
other notations ............................................................................... 45

The mean allele values, ai’s at each locus i for times

t =0, 10, 20, 30, 50 and 100 of a GA run with pc=0.25
and pm=0.001 ................................................................................. 47

Definition of Ah (t) ......................................................................... 49

The experimental average values of Ah(t) as a

function of time for h =0, 0.1, 0.2, 0.9 , and the
bar graph interpretation. The population size is 50 and
the chromosome length is 100 genes. The GA
parameters are pc = 0.25, pm = 0.001. The average is
taken over 100 experiments .......................................................... 52

The mean value of the correction weights as a
function of mean allele levels, h', for crossover rates
pc = 0.25, 0.75 and 3. The statistical average is found
over 100 runs of the GA. Population size is 50 and
chromosome length is 100 genes .................................................. 63

Fitness with weights ....................................................................... 66

The definition of the fitness of a 3-bit deceptive
funcﬁon .......................................................................................... 75

ix

Figure

 

Figure

Figur-

Figure

Figuri

FIgUfI

I I‘Mlgur

FiQUr.

 

 

Figure 3.2 Variables a and yin a 3-bit deceptive function
problem .......................................................................................... 77

Figure 3.3 Definitions of p?(t) and p37 (t) for a 3-bit deceptive
funcﬁon .......................................................................................... 79

Figure 3.4 An example of the weighted fitness function for a 3-bit
deceptive problem ......................................................................... 91

Figure 4.1 Ah as a function of time for four different cases where
the crossover rate is 4 and 25 percent, respectively.
There is no mutation. Black lines are the experimental
averages over 100 GA runs and thick gray lines are
the results obtained by model simulations. Population
size is 50 and the chromosome length is 100 genes ..................... 98

Figure 4.2 A), as a function of time for four different cases where
the crossover rate is 75 and 300 percent,
respectively. There is no mutation. Black lines are the
experimental averages over 100 GA runs and thick
gray lines are the results obtained by model
simulations. Population size is 50 and the
chromosome length is 100 genes .................................................. 99

Figure 4.3 Ah as a function of time for two different cases where
the mutation rate is 0.1 and 2 percent, respectively.
The crossover rate is 50% in both cases. Black lines
are the experimental averages over 100 GA runs and
thick gray lines are the results obtained by model
simulations ................................................................................... 1 00

Figure 4.4 The mean fitness for pa = 25%, 75% and 300%. In
each figure four different mUtation rates, pm = 0%,
0.1%, 1% and 2%, are shown. Black lines are the
experimental averages obtained by averaging over
100 GA runs and thick gray lines are the results
obtained by model simulations .................................................... 102

Figure 4.5 The fitness variance for four different rates of
mutation, pm = 0%, 0.1%, 1% and 2%, with crossover
rate at 300%. Black lines are the experimental
averages obtained by averaging over 100 GA runs
and thick gray lines are the results obtained by model
simulations ................................................................................... 1 03

Figure 4.6

Figure 4.7

Figure 4.8

Figure 4.9

Figure 4.10

Figure 4.11

Figure 4.12

Figure 4.13

Figure 4.14

Figure 4.15

Figure 4.16

Ah curves for fixed ,6 Boltzmann selection.
pm = 0, pc = 60. The first graph is for )6 = 0.1, the
second, for ,6 = 0.6 .................................................................... 105

Ah curves for fixed ,6 = 0.1 Boltzmann selection.
pm = 0.01, 0.1, 0.2 , p, = 60 ...................................................... 106

The mean fitness graphs for fixed-,6 Boltzmann

selection. The graphs show mean fitness for different
mutation rates with ,6 = 0.1 or ,6 = 0.6 ...................................... 107

The mean fitness graphs for scaled-,6 Boltzmann

selection. The graphs show mean fitness for different
mutation rates with )8 = 0.1 or ,6 = 0.6 ...................................... 108

The mean fitness graphs for fixed )6 Boltzmann
selection for three different selection

pressures ,8 = 0.1, 0.3, 0.6 with pm = 0 ..................................... 109
The fitness variance for fixed ,6 Boltzmann selection
for three different selection
pressures ,8 = 0.1, 0.3, 0.6 with pm = 0 ..................................... 110

A), curves for scaled-ﬂ Boltzmann selection.
pm = 0.001, pc = 60. The first graph for ,6 = 0.1, the

second ,6 = 0.6 ........................................................................... 111
The fitness variance for scaled ,8 Boltzmann
selection for two different selection
pressures ,8 = 0.1 and )6 = 0.6 showing each
selection pressure for different mutation rates ............................. 112

Profile A and Profile B for the weighted fitness
function; ....................................................................................... 115

Ah curves for Profile-A with fitness proportional
selection, pm = 0 and pm =0.2 ................................................ 116

Ah curves for Profile-A for scaled-ﬂ Boltzmann
selection. with ,6 = 0.6 and pm = 0, pc = 60. .............................. 117

xi

Figure 4.17

Figure 4.18

Figure 4.19

Figure 4.20

Figure 4.21

Figure 4.22

Figure 4.23

Figure 4.24

Figure 4.25

Figure 4.26

Figure 4.27

Mean fitness curves for Profile-A for fitness
proportional selection with different pm ’s .................................... 118

Mean fitness curves for Profile-A with Boltzmann
selection, scaled ,6 : (a) comparing different ,6

cases; (b) for ,6 = 0.6 with different pm’s ................................... 119

Fitness variance curves for Profile-A: (a) for fitness
proportional selection with different pm’s; (b)

comparing different ,6 cases ...................................................... 120

Fitness variance curves for Profile-A for ,8 = 0.6 with
different pm ’5. Boltzmann selection is with scaled ,6 ................. 121

Ah curves for Profile-B with scaled-,6 Boltzmann

selection. (a) ,3 = 0.1 , no mutation; (b) ,8 = 0.6
with no mutation ........................................................................... 122

Ah curves for Profile-B with scaled-,6 Boltzmann

selection. ,6 =0.6 with 1% mutation. In all
cases pc = 60 ............................................................................. 123

Mean fitness curves for Profile-B, showing fitness
proportional selection and Boltzmann selection with

,6=0.1 and ,B=O.6 .................................................................. 124
Mean fitness curves for Profile-B. Boltzmann selection
with ,6 = 0.6, and different mutation rates .................................. 125

Fitness variance curves for Profile-B, showing fitness
proportional selection and Boltzmann selection with
,6=0.1 and ﬂ=0.6 .................................................................. 126

A: and A; curves for fitness proportional selection.

pc = “high enough”. pm = 0. Black lines are from GA
run, gray lines are from model simulation .................................... 128

Aff and A; curves for fitness proportional selection

with mutation rate pm = 0.001, and p6: “high
enough” ....................................................................................... 129

xii

Figure 4.28 Mean fitness and fitness variance curves for fitness
proportional selection. p6: “high enough”. Pm = 0 ,

0.001 ,and 0.2. Black lines are from GA run, gray
lines are from model simulation ................................................... 130

Figure 4.29 Af and A; curves for selection with Boltzmann
scaling. ,B=O.I p6: “high enough”. Pm = 0. Black

lines are from GA run, gray lines are from model
simulation .................................................................................... 132

Figure 4.30 Af and A}: curves for selection with Boltzmann
scaling. [3:06 pc= “high enough”, pm = 0. Black

lines are from GA run, gray lines are from model
simulation .................................................................................... 133

Figure 4.31 Comparison of mean fitness and fitness variance
curves for selection with Boltzmann scaling for

,6 =0.1, 0.3 and 0.6. pc = “high enough”, pm = 0 ....................... 134

Figure 4.32 Comparison of mean fitness and fitness variance
curves for selection with Boltzmann scaling for

,6=0.1, p0: “high enough”, for different mutation
rates ............................................................................................. 135

Figure 5.1 A complicated fitness function a composition of
several OneMax and deceptive parts .......................................... 140

xiii

Chapter 1

GENETIC ALGORITHMS: THEORY AND MODELS

1.1 Thesis Outline

Genetic algorithms (GA) are stochastic, heuristic search algorithms that have
been applied to a wide range of problems. In Section 1.2, the simple genetic
algorithm is described together with some of the ways its dynamics are observed.
The counting-ones (OneMax) benchmark problem is also introduced in this
section serving as a simple demonstrative example for GA operators. We present
a real-world example in Section 1.3, in order to demonstrate a nontrivial problem
in which GA can be applied to find an optimum configuration. Section 1.4
contains a review of some of the ways GAs are currently analyzed. Chapter 1
ends with a critique of the current theories in Section 1.5, and explains why we
need a new model, which is the motivation in the development of this

dissertation.

Chapter 2 starts with a detailed description of the OneMax problem. A new
and practical model for the simple Genetic Algorithm dynamics of OneMax
problem is developed in Section 2.2. The GA that is being modeled consists of

two-point crossover, fitness proportional selection and mutation operators. For

low crossover rates, the model uses statistics of the early generations of GA runs
to describe the dynamics of the problem for all time, using a variety of crossover
and mutation rates. In the development of the model, we introduce a new
quantity that measures the effect of the crossover operation and is independent
of generation, for practical purposes. Then, the model is generalized, in Section
2.3 and 2.4, to cover the weighted OneMax and the Boltzmann selection with

fixed or scaled selection pressures.

Chapter 3 follows a similar pattern as in Chapter 2, but for a type of
deceptive function in stead of OneMax. The model in this chapter is developed
only for high enough crossover rates. It estimates the simultaneous evolution of
the mean allele and the mean all-1 deceptive blocks, as well as the mean fitness
and fitness variance of the population. The model for lower crossover rates has
still been under investigation by the author at the time this dissertation was
written. In Section 3.1 the deceptive function is described and the modeling
problem is formulated. The model for fitness proportional selection is developed
in Section 3.2, which is followed by the extensions of the model to weighted

deceptive function and Boltzmann selection in Sections 3.3 and 3.4, respectively.

In Chapter 4, the model that is developed in Chapter 2 and Chapter 3 is
tested by comparing its computer simulations with the results with averaged GA
runs. Section 4.1 presents the results for OneMax and Section 4.2 for the

deceptive function.

An attempt to develop a stochastic differential equation model to predict
evolution of fitness distribution for the OneMax problem is presented in Appendix.
Although our attempt was not successful, it demonstrates a strong connection
between fitness evolution and certain diffusion equations and points toward the
possibility of the existence of a diffusion-type equation that could describe the

OneMax and maybe other type of GA dynamics.

1.2 What Is a Genetic Algorithm?

Genetic Algorithms (GAs) are stochastic search algorithms based on principles of
natural selection and genetics. They were first developed by J. Holland in the
1960's, and presented, together with a large body of accompanying theory, in his
1975 book. There are three main operations in a genetic algorithm, namely
selection, mutation and crossover (sometimes called recombination). Each
operation has many different types which may be applied, depending on the
specific nature of the optimization problem. At the beginning, an initial population
that consists of a set of proposed solutions is chosen. The selection of the initial
population is usually random, unless it is specifically required to search a more
restricted region. Then, selection, mutation and crossover operations are applied
to this initial population in a predetermined order to create the first generation (or
second population), Figure 1.1. The second generation is obtained from the first
one by the application of the same operators. This procedure continues until a
“good enough” solution emerges within the population or until the whole

population converges to the point that additional search is deemed unproductive.

 

 

 

 

 

 

 

     

Selection Selection Selection
Mutation Mutation Mutation
Crossover crossover Crossover
———> ———>
Initial Population pt 2.“
(randomly generated set of Generation Generation
input values for the fitness

function)

Figure 1.1 Genetic Algorithm creating new populations with selection, mutation
and crossover operations

Let us consider a very simple example in order to illustrate the GA operations
and deﬁne some GA-speciﬁc terms. In this example, we want to ﬁnd the value of

the input variables that yields the maximum value of the objective function f
(called the ‘ﬁtness function’ in the GA terminology)
f(al,az,a3aa4aas) = 01+ “2 + a3 + a4 + “5v
(1 .1)
where the input variables, a1,a2,a3,a4 and a5 , take discrete values of only 0 or

1. A problem of this form, of arbitrary dimension, is called a ‘counting-ones’ (or
sometimes ‘OneMax’) problem, since the value of the function is equal to the

number of 1’s in the input values. It is obvious that the solution to the problem is
al =1, a2 =1, a3 =1, a4 =1, 05 = 1. It is not so obvious, but important, that even

some real-world problems — those that can be solved by decomposition into

independent components — exhibit many aspects of the behavior of a counting
ones problem, after suitable recoding of the inputs. Because of their simplicity,
counting ones problems are often used in analyses of the dynamics of a genetic
algorithm. A few examples of this kind of application can be found in [Goldberg,

1989], [Furutani 2002], [Priigel-Bennett, 2002] and in many other publications.

Now, let us see how a GA would solve this problem of find the optimum of
the function f . First, select values for (al,a2,a3,a4,a5) randomly, for example

(1 ,0,1 ,0,0), etc.. To make the initial population, we generate a fixed-size of such
number strings each of which is called a ‘chromosome’. In Figure 1.2, we see an
example of such a population with 4 chromosomes, showing their fitness values
and the ratios of their fitness to the total fitness of the population (so-called
"relative fitness values”). In this example, the ‘chromosome length’ is 5 — i.e., 5

variables — and the ‘population size’ is 4.

 

 

# Chromosome Fitness ‘70 of Total
1 101—00 2 25
2 1000-1 2 25

3 0-0-1—1—1 3 37.5

4 00-0-10 1 12.5
Total 8 100,0

 

Figure 1.2 An example of an initial population together with the fitnesses and
relative fitnesses of their chromosomes

Given a population, the selection operation creates a new population with the
same number of chromosomes by selecting some of its chromosomes. One of
the most common types of selection is ‘fitness proportional’ selection, also called
‘roulette wheel’ selection. In this type of selection, the probability that each
chromosome would be selected for the next generation is equal to the proportion
of its fitness to the total fitness of the population (i.e., its relative fitness). In our
example, Figure 1.2, the first two chromosomes have probability 0.25 of being
selected for the next generation, while the third one has probability 0.375 and the
last one, 0.125. The selection operation is performed with replacement. That is,
the fittest chromosomes might be selected more than once, while some
chromosomes might be lost during the selection procedure. A possible outcome
of the selection is shown in Figure 1.3, in which two copies of the third

chromosome are selected, while the fourth one is not present after selection.

0-0-1-1-1
———-)
After selection, 1‘0‘0'0-1
we may have 1-0-1-0-0
' rd
two copies of 3 0-0—1-1-1
chromosome,

while 4th one is lost.

 

Figure 1.3 A possible outcome of fitness proportional selection

The crossover operation is the most complicated operation of the GA. One of the
most common types of crossover is two-point crossover. In two-point crossover,
the chromosomes are ﬁrst paired randomly, using a uniform probability density
distribution. Then, within each pair, the parts of the chromosomes lying between
two randomly (with a uniform probability density distribution) selected locations
are swapped. Not all pairs of chromosomes have to undergo the crossover

operation, though. An important parameter of crossover operation, called the
crossover probability, pc, gives the probability at which a given pair of
chromosomes are crossed. In Figure 1.4, ﬁrst and third chromosomes pair off

and their last two genes are swapped. Also, the second and fourth chromosomes

pair off and their third and forth genes are swapped.

   

 

    
 

  

    

 
  

0-0—1-1-1
* 1-0-0—0-1
f
' 1-0-1-0-0
‘ 7 0-0-1-1-1
1-0-0 0-1 0—0 l-l 1
0-0-1 1-1 1-0 1-0 0
Swap position 3 to 5: Swap position 2 to 4 :
1_0_0_1_1 0_0_1_0_1 New population

after crossover

0-0-1-0-1 1-0-1-1-0

 

Figure 1.4 An example of two-point crossover

In bitwise mutation, each gene of each chromosome changes its value from 1 to

0 or 0 to 1 with mutation probability, Pm .For example, a mutation at the 4"1 gene

of 0-0-1-0-0 gives us 0-0-1-1-0, Figure 1.5.

0—0- 1 -0-0

Mutation at
4th gene

I

0—0-1-1-0

Figure 1.5 An example of mutation

Thus, after selection, crossover and mutation are applied, we get a new

population, which is, on average, expected to be ‘better‘ than the previous one.

Sometimes the relative fitness values of the chromosomes are replaced
by some other scaled values of the fitness for the purposes of the selection
operation. For example, if one wants to give disproportionately more chance of
being selected for chromosomes with higher fitnesses, one can multiply their
fitness values by some weights and use those values instead of their original

fitness values. One of the common ways of fitness scaling is Boltzmann scaling,

in which the fitness, fc, of chromosome c is replaced by f: = em , where ,6 is

a constant, called the Boltzmann constant. The advantages of using a Boltzmann

scaling is explained in Section 2.4.

Another common application of GA involves using multiple populations in
parallel. This case is called parallel GAs. Each population (deme) evolves
separately. However, at certain times some of the chromosomes, usually the
best ones, of some populations are allowed to ‘migrate’ into other populations
(typically being copied, without removal from the 'donor‘ population). Figure 1.6
shows a possible configuration for such an application. The main purpose of

such an application is to reduce the risk of premature convergence.

   
 
  

 

Population 1 Population 2

Population 3

 

Population 5

Population 4

Figure 1.6 A GA with multiple populations, arrows showing the direction of
migration

There are several advantages of GAs over other optimization and search
procedures for some classes of search problems. First, GAs can climb many
peaks in parallel. There is then a smaller probability that they will miss a peak

that leads to a global optimum. Second, GAs use the value of a function rather

than performing operations on an explicit representation of the function (for
example, formal differentiation). It is thus possible to find a good estimation of an
optimum in a situation where the form of function itself is not known, but values of
the function in particular situations are known, such as stock prices in the
financial markets. GAs also produce effective solutions to problems with too
many variables, in which other search procedures have hard time due to time
limitations (computational complexity grows exponentially, for example, with
problem order n). Since the information is exploited directly from a fitness
function, no other information is necessary about the problem under
consideration. In addition, GAs are better than a random search for almost any
problem domains of significant interest, since they use directed randomness,
which reduces the computation time by skipping areas that are not fruitful to
search, directing the search effort towards areas where it is more likely to find the

optimum.
1.2.1 Counting-Ones (OneMax)

In this section we study the counting-ones problem with a longer chromosome
length. In this example, the chromosome length is 100. Thus the fitness function

is

f(a1,a2,...,a100) = a1+a2 +...+aloo .
(1.2)

We apply the GA with population size P = 50, mutation rate Pm = 0.01,

crossover rate Pc =1, and Boltzmann scaling with Boltzmann constant ,6 = 0.2.

10

 

15 v T 1 r v v v v

10- ~

Initial generation

 

 

 

 

10*-

 

 

 

 

 

10- ~

Gen = 20

l

 

 

O 1 0 2O 30 40 50 60 70

Fitness Distribution, p( F)
01

 

 

 

 

 

 

 

 

10 - —
Gen = 40

5 h

on 1‘0 26 3‘0 46 Si) ab 7%) ab

10 ~ -

Gen = 100
5 b —4
00 1i) 2i) 30 40 50 ab 76 at) 9i) 100

Figure 1.7 Evolution of mean fitness over 100 generations in a GA, for a
counting-ones problem. Population size = 50, chromosome length =
100, ,6 = 0.2

11

In Figure 1.7, we see the evolution of fitness distribution for a run of the GA. If we
apply the GA 1000 times to this problem, starting with different randomly chosen
initial populations, and take the average of the fitness distributions, then the
picture looks as in Figure 1.8. This figure shows a typical evolution of the

average fitness distribution of a GA.

 

0.25-
10] and2lIJ
0.2 - '
H150
0.15-
10 I/ I i\
0.1 - r
.0“\ . a 1;, :f:
.. \ ,2
0.05- ’/ \X .. .[i f/ if]; \\\(\
/ \ , ,i- ”‘4 f/ \i\
/ . ,.
l— "’ l \.\3:\‘ J"! , .f:‘J , _ .‘l./ . \:\ .
030“" 40 50 60 70 00 90 100

Figure 1.8 Evolution of mean fitness for a counting-ones problem over 200
generations, after the average is taken over 1000 GA runs.
Population size = 50, chromosome length = 100, ,6 = 0.2.

1.2.2 Cumulants as a Tool to Observe GA Evolution

The characteristic function of a fitness distribution ,0(F) is defined by

(13(0)) = zp(F)ein . Then the nth moment , ,an, of this distribution is defined

12

- - - _ ﬂ" (1.60)" - - th
by the coeffrcrents of the series (I>(a))—Z——'-——. Similarly, the n

":0 n.
cumulant, K", of this distribution is defined as the coefficients obtained using the

logarithm of the characteristic function

Kn Um)"

10g(<I>(w))= 2—

n=1

It follows from this definition that the first cumulant is just the mean fitness of the
population, the second cumulant is the fitness variance, the third one is the
skewness and the fourth one is the kurtosis. Cumulants are also related to the

moments as

K] = ZptnF = A
F

19 = Zp<F><F — x02 = #2 —#f
F

K3 = ZP(F)(F — m3 = #3 -3#2#1+2#13
F

K4 = ZNFXF -K1)4 = #4 ‘4.”3ﬂ1 -3#22 +12%? -
F
(1.3)

One of the objectives of GA study is to understand and estimate how cumulants

change over time. In Figure 1.10 and Figure 1.10, we have time evolution graphs

[(3/2

of K1, K2, K3 and K4. The skewness and kurtosis are normalized byK and

K; in order to have a better representation of these quantities allowing

comparison for populations with different variances. In these graphs, two cases

13

of a GA are studied. In the first one, selection, mutation and crossover are all
applied. In the second case, only selection and mutation are applied in order to
observe the effect of the crossover operation. The parameters of the GA are the

same as in Section 1.2.1, --i.e., population size P=50, mutation rate
pm =0.01, crossover rate pc =1, and Boltzmann scaling with Boltzmann
constant ,6 = 0.2. Figures show that crossover operation, on average, increases

the fitness variance little bit while giving a more uniform look to the skewness and
kurtosis, and gives a better GA performance in terms of its mean fitness values,

in the case of our specific problem.

14

111.1%.3 V! Fl .1..- l J1
_.

I

 

 

 

................
-—-.--
:9.
o
,-
-

Max. Fitness

 

 

 

 

 

 

 

 

 

i
K1 (Mean Fitness) ~
S+M+C ~
--~-------- S+M w
t
160 1&0 260 250
Generation #
25 . .
Z] .4
15 { K2 (Variance) ~
10 .
s 3 l
“a 5'0 160 :50 260 250

Generation #
Figure 1.9 The time evolution of cumulants, K1 and K2, for two cases of GA.

First with selection, mutation and crossover, the second with

selection and mutation only. The graphs are obtained by averaging
100 GA runs

15

 

 

 

 

 

 

 
  

     

 

 

 

0.1 Y Y I I
o - .
A . 1 . ‘ f _
.01 a ”9" .‘\"\\,‘i: it"p'h‘: 'l":"‘\'l‘.f":'\:‘, u./,’ “.5 M." a":
I: _J
5 = 3/2
-0.5 H. 5 K3 / K2 -
415 '55 (normalized skewness) -
417 5,5 .
-0.8 3 _
I
f
'09 1 l l 1
0 50 100 150 200 250
S+M+C Generation #
“W” W" S+M
1 I U I l
l
0.5 J" 2 _
:5 K4 / K2
3': (normalized kurtosis)
o -i i .
" "ii." E 1 "6"”. 3,3, 1‘3"): 1"}. ”11:51 Si )2“ r‘.” ( ‘ I 2" :l
.; w‘t"
1.
'l
415 4 -
U 50 100 150 200 250
S+M+C Generation #

Figure 1.10 The time evolution of cumulants, K3 and K4, for two cases of GA.

First with selection, mutation and crossover, the second with
selection and mutation only. The graphs are obtained by averaging
100 GA runs

16

.. 74 END?! Mitriﬁr pan»-
1 , 1, 4.

/

 

1.2.3 Counting-Ones and Mlgratlon

Although in practice, the migration between two populations which evolve
simultaneously and independently from each other, usually occur at later
generations when each population created “good” members possibly different
than the other population’s “good” members, it is still useful to look at the
following two specially-structured experiments (CASE2 and CASES), in which an
artificial migration is applied at the very beginning. In the following comparison,
the case in which the population evolves normally with no migration is called
CASE1. In the second case (CASE2), for each run of the GA, we choose an
arbitrary chromosome (migrant) with fixed fitness 65 and place it in the otherwise
randomly chosen initial population, so that the migrant is different for each run
but always with fitness 65. The third case (CASE3 ) is similar to CASE2 except
that the chromosome with fitness 65 is not chosen randomly but is always the

following special chromosome

 

65 of them

The population size is taken as 50 and the chromosome length is 100. GA

includes two-point crossover, Boltzmann selection with scaled ,6 = 0.2 (see

Section 2.4 for a detailed description of this scaled-fitness selection) and no

mutation.

17

In Figure 1.11, we see the fitness density distributions for the first 9
generations of these three cases of GA experiments, with average taken over
100 GA runs. The curves move to the right as time goes. The spike in the fitness
distribution of the initial population, i.e. peak at the right of the leftmost curve, for
CASE2 and CASE3 is due to the migrant with fitness 65. In these figures, we see
how much faster the curves move when there is a migration in comparison to

CASE1.

The mean fitness and maximum fitness evolutions of these three cases
over 200 generations are shown in Figure 1.12. It shows that, for this specific
example, each case converges to the same value but in different rates. In Figure
1.13, we compare the three cases in relation to the time when they reach mean
fitness of 95. Table 1.1 shows these times when the mean fitness of each case
reaches 95. As we see, both cases of this specially-structured "migration”
perform better than the case with no migration, i.e. it takes less time for them to
evolve their populations to a mean fitness of 95, on the average. The fact that
CASE3 performs much better than CASE2 shows us that it is more beneficial to
have long series of adjacent 1’s (i.e. “correct” alleles) rather than one with same
fitness but scattered 1’s. This is because in CASE2 the part of the migrant
starting from the first allele with value 1 and ending at the last allele with value 1
is longer than the same length in CASE3. Thus, the crossover is less likely to
separate these 1’s in CASE3 into different chromosomes, in comparison to

CAS E2.

18

 

0.12 1 f l l

Without migration

’0

5:2 / 2
/ W\ .

  
     
   

Frequency, p( F)

  
     

 

0'12 r ' T I I

With migration ..
of a random

individual _
(CASE2)

0.12 y r r r r

 

Frequency, p( F)
O
8

 

 

 

 

With migration

of a well organized
individual

0-08 - A . (CASE3)

Frequency, p( F)

 

 

 

 

, 10]
Fitness

Figure 1.11 Evolution of fitness distributions for CASE1, CASE2 and CASE3

19

1ﬁmr>h1aﬁ Mr; H a 4m
9 .A .
I

 

100 r l I I

’_:__’Jfl,,'——-—J——hs--u!o-oan-.Iasnu-n-J-..-oonq
4 --- :.‘ --

I’-
’ IS A -__
J
,4 _
95 x n
.- I ,
( .
I' I

 

 

l

[(1 Mean Fitness

Fitness

     

__ without migration (CASE 1)

75 5 with migration of a random

70 E individual (CASE 2)

__ with migration of a well organized

0...... .1

55 individual (CASE 3)

 

 

l

D 20 40 80 80 1 00 1 20 1 40 1 60 180 200

Generation #

Figure 1.12 Evolution of mean and maximum fitness for CASE1, CASE2 and
CASE3

20

Mean Fitness
100 , ,

 

95

90

85

80

75

70

65

60

 

 

 

 

 

 

 

 

 

without migration (CASE 1)

with migration of a random ..

individual (CASE 2)
with migration of a well
organized individual (CASE 3) a

 

 

 

 

 

 

 

 

 

80

 

100

1 20
Generation #

Figure 1.13 Evolution of mean fitness and fitness variance for CASE1, CASE2

and CASE3

21

Table 1.1 The three cases with the average generation numbers when the

mean fitness of 95 is achieved

Mean Gen #
CASE 1 76.72

CASE 2 74.73
CASE 3 61.96

 

 

Now, let us look at a more realistic case of migration. In this counting-ones
experiment, we have two populations, P1 and P2, of size 100, running
simultaneously. We apply two-point crossover with 100% crossover rate,

mutation with 0.1% rate, and Boltzmann selection with scaled ,B=O.3 (see

section 2.4 for a definition of Boltzmann selection). The case when no mutation is
applied is called CASE-A. In CASE-B, 10 members with the highest fitness of P2
is migrated into P1 at generation 20. In CASE-C the migration procedure of

CASE-B applied at generations both 20 and 30.

In order to observe the inner structure of the population P1, we count the
number of loci in the population at which the mean allele values are less than or

equal to a given number, h. In Section 2.1, we define these measures more
formally and call them Ah. In particular, A0 counts the number of loci at which all
chromosomes in the population has 0 values. Figure 1.14 shows Ah curves for

h= 0, 0.1, ..., 0.5 as a function of time for each case. The migration times 20 and

30, in particular, are chosen by finding the time at which A0 is maximum, in other

22

words by finding average times when the number of loci whose values

completely converged to zero over all population members reaches to a local
maximum. As seen in these graphs, application of each migration pulls down Ah

curves towards 0. This means that, on the average, the number of loci with value

1 will increase faster with migration.

This example shows that the migration helps to make the population move
or converge faster. In more complicated fitness function problems, this kind of
migration will also prevent population converging to a false maximum, i.e. avoid
premature convergence or convergence to a non-global local maximum, when

the migration parameters are chosen correctly.

23

=0.001 , = 0.3
60 Pm [3

50>

40»

(c \ No migration

30 i
<= 0.4
\ <= 0 3
\\\\.\ -
20 . ﬂ/;;;:3§ <= o_2

10,, ;'// =0

  

 

 

 

 

 

 

 

 

llill//
MA/ 1 um I 1 L J l
0 5 1015 20 25 30 40 60 80 100
CASE-A t
60 .
l
50 1'
l
'2
40 -l With migration at t = 20
<= .. . \
40 ‘72»
CASE-B ‘
60 r-
i
50 A.
l
l.
4° *‘l . . .
r With migration at t = 20 and 30
(F 30- \\
N~\
2o » frg
/ ,/ / \
l/’ /‘ m
1° W / \
ill/l ’/ ij/ ’klf;;k
o ”I / t n 1 - -——i
0 20 40 60 80 100
CASE-C t

Figure 1.14 Ah curves for three cases. Pm = 0.001 and ,6 = 0.3

24

ﬂirtatin... ILL-n 9.. /
. _ i. .

r
z
.

1.3 GA with a Real-Life Problem

In this section, we will see an application of a GA to a real-world problem. We will
not give the solution of this problem by a GA in this dissertation since our main
subject is the modeling of the GA rather than its applications. However, it is
illustrative to include this example here to see how complicated the GA problems
could be. This problem, suggested by P.J. McCleer of McCleer Power Inc.,
introduced me to genetic algorithms for the first time. Considering how to solve
this problem and trying to determine most appropriate GA parameters and
structure drove me to seeking how to understand GA dynamics in a more general

framework.

The problem involves the automotive 42-Volt DC electrical systems of
hybrid cars. The system uses pulse-width-modulated inverters to convert the bus
voltage to a 3-phase AC voltage with desired amplitude and frequency. In Figure

1.16, we see a simplified circuit of this system.

DC voltage can be converted to a sinusoidal AC voltage, either single
phase or three phase, with desired amplitude and frequency, by means of
electrical devices called “switch-mode inverters.” The basic idea of these
inverters is sketched in Figure 1.15. The DC voltage is converted to an AC
voltage that alternates between values 0 and a certain fixed voltage. Only the
fundamental component of this voltage is to be used as the output. Thus, the
high frequency harmonics are filtered and the desired sinusoidal voltage is

obtained.

25

Inverter

switchings filter
DC
Voltage => | II H” II “H H H” H I" =5 W

Figure 1.15 Passage from DC to AC voltage by switch-mode inverters

By turning the switches, which are the six transistors Sig, 823, 83... 81b, 82., and
83b in Figure 1.16 on and off, the desired output power specifications are
achieved while providing an output voltage with a specific amplitude and
frequency but without unnecessary harmonics. One restriction in the functioning
of the switches is that, in a vertical pair, such as Sin and Sm, the transistors can
not be both on or off at the same time. Otherwise, this would cause a short
circuit. Thus, it is enough to determine the status of one transistor in each pair.

The whole art of switching scheme design comes into play at this point.

The practice in automotive inverters is to control the switches by a central
processing unit (CPU), which can be programmed to turn the switches on and off
in any desired pattern. The switching is accomplished with high-current field-
effect transistors (FET). The advantage of this method is the ability to control the
fundamental component while eliminating some of the harmonics. However, one

should be careful with the frequency of switching since, in practice, if a switch

26

turns off in an inverter leg, the turn-on of the other switch is delayed by a

blanking time, which introduces low-order harmonics in the output.

On the other hand, ripples in the output current result in ripples in the
current through the capacitors. This ripple current causes the capacitors to warm
up overtime, which is very undesirable since the temperature under the engine
hood under normal operating conditions can approach 120°C. Such high

temperatures can damage the capacitors.

Because of the high cost of these capacitors, any reduction in their

number will reduce the cost of the inverter. Reducing the number of capacitors
can be obtained by reducing the current ic. Hence, the main focus of our design
is to minimize ic while still meeting the desired output voltage and/or current

specifications. As a result, the question is “what is the best switching algorithm

which will produce minimum current ic through the capacitor C?”

 

Figure 1.16 A simplified circuit of pulse-width-modulated inverter.

27

Let’s say we want to determine the timing for 818, Sea and $3., (timing of
S"), 82:, and 83, are determined from them). A good solution to this problem can
be achieved by a GA in the following way. Split the chromosome into 3 equal
parts (one part for each vertical pair of transistors). For each part, the alleles
(values) at successive loci (fields) of the chromosome are interpreted as the
relative lengths of successive on and off intervals, of the corresponding
transistor. We use discrete values for the time lengths. Each allele, let’s say, can
take on a 4-bit value, so the values 0,1,2,...,15 are the possible field values
(alleles) at each locus. We assume a maximum of 101 ON and OFFs for each
transistor during a period. So each part consists of 101 loci and the total
chromosome length is, then, 3x101 = 303 loci. We take the fitness function as the
mean squared value of the current through the capacitor. The function is
restricted to the domain in which we have the correct amplitude and frequency of
the sinusoidal output (in a different application, the deviation from these
constraints can also be put as a part of fitness function with negative weights.)
Thus, we have a well-defined GA problem, whose application could produce

switching algorithms that are more efficient than traditional methods.

Recently, GA is, in fact, successfully applied to power inverters to
determine the switching angle, i.e. time delay between switching times of the
transistors, eliminating high order harmonics of the output voltage, (Ozpineci, et

al [2004].)

28

One period

 

 

 

 

/ A \
String OFF ON OFF ON OFF OFF ON OFF
(CthfDOSOITB): L 1 l n I l 6 n 3 1 3 I
4 l 10 5 7 o o o
\J///
Fields (Genes) 101“ Field (Gene)

Figure 1.17 The part of the chromosome representing timing of a transistor pair
during one period of the desired output voltage

1.4 Theoretical Models of GA

Some of the applications of GA may not converge to a desired value or
sometimes would take months or years to achieve a good solution just because
the design parameters are not chosen correctly. In practice, the fitness function is
often very complicated and the crossover operation is a very complex mixing
operator, which makes it quite difficult to analyze GAs theoretically. Several
theories have been developed in order to understand why and how GAs work,
and in order to choose the design parameters wisely. Theoretical models of
Genetic Algorithms (GAs) fall into three main categories. The Markov chain
model, as developed by Nix and Vose [1991], completely describes the
probabilistic behavior of the GA. However, this model is too costly to implement
computationally for problems with realistic population size and chromosome
length. The statistical mechanics approach, developed by Prt‘lgeI-Bennett,
Shapiro [1994] and Flattray [1996], gives fairly good results in modeling the
OneMax problem with Boltzmann scaling, for a crossover rate of 100%, however

it is not developed for lower crossover rates or to handle other benchmark

29

problems of GA such as deceptive functions. The approach of modeling GAs by
considering building blocks (Goldberg [1989] and Goldberg, Deb, Thierens
[1993]), on the other hand, gives us a good idea about the appropriate population
size or the convergence time of the OneMax and help us determine the failure
boundaries in the “control maps”. But the question of finding the most appropriate
crossover or mutation rate is answered, so far, only by experimental results. We
still lack a model that describes the behavior of the OneMax problem for different
crossover and mutation rates together and allows us to choose the best

parameters.

1.4.1 Schema Theory

In this theory, the search space is divided into subspaces called “schemata." The
aim is to characterize the schemata using macroscopic quantities, such as the
number of individuals within a given schema H at generation t, denoted by
m(H,t), average fitness of individuals in the schema and in the population, size
of the search space, size of the schema, etc. Schema theorems model
thematically how and why m(H,t) varies from one generations to the next.
Since GAs are non-deterrninistic, one can only predict the expected value,
E[m(H,t)], of m(H,t). As an example, consider a search space consisting of
5-bit binary chromosomes. A special subset of this search space can be
described by the template, H = (1 0 * * 1), which means that the first, second

and the last loci are fixed as 1, 0 and 1, respectively, while the third and fourth

3O

loci can take either of the binary values. Hence the schema H consists of 4

chromosomes, Figure 1 .18.

H=(10**1)—>{10111, 10101, 10011.10001}

order : 0(H) = 3 deﬁning length : 5(H) = 5 - 1 = 4

Figure 1.18 Template H

The order, 0(H), of a schema H is defined to be the number of fixed digits
within the schema. The defining length, 6(H ), of H is defined to be the distance

between the first and the last fixed string positions in the schema. For example,

for H=(10 * * 1) above, we have 0(H)=3 and 6(H) = 5-1 =4.

The first schema theorem was developed by J. H. Holland in the 1960's,
taught in his classes and used by his students in their theses, and made widely
available in his 1975 book. Different versions of this theorem were later
developed and published by Goldberg [1989], Whitley [1994], and Stephens and

Waelbroeck [1997]. In one form, the schema theorem states that

E[m(H,t+ 012 M -p(H,t)(1— pm)0<">[1— p, $430]

(1.4)

where

31

M :population size,
N :string length,
p(H,t) :selection probability of individuals in H at generation t,

Pm : mutation probability,

pc : crossover probability.

The term 0' in Equation (1.4) is taken as 1—m(H ,t)/ H by Holland, as 1 by

Goldberg, and as 1- p(H,t) by Whitley. In Equation (1 .4), the term M -p(H,t)
gives the expected number of population members which are instances of the

schema H after the selection operation. For example, if fitness-proportional

selection is applied, then we have

M -p(H,t)=m(H,t)f(H’t)

m

 

(1.5)

where f (H ,t) is the average fitness of strings representing H within the

population and ﬁt) is the average fitness of the whole population. After

selection, we can calculate the probability that each individual of H within the

population will survive the changes made by mutation and crossover. The terms
6 H
(1— Pm )0”) and 1— pc -N—(—1)0‘ in Equation (1 .4) represent these survival

probabilities respectively. Equation (1.4) gives only a lower bound for

32

E [m(H ,t +1)] since it does not consider the newly created instances of schema

H after mutation and crossover. On the other hand, we gain some insight into
why GAs work by examining this equation. For example, consider Goldberg's

version of the schema theorem with fitness-proportional selection:

f (H .t)

f (t)

selectlon mutation crossover

E[m(H,t +1)] 2 m(H,t)-

 

(I—p.)”‘”’[1—p.@—)]

N—l

\ J

(1.6)

Equation (1.6) tells us that, if the average fitness of the schema, f (H ,t), is

f (H)

greater than the average fitness of the whole population — that is, if T > 1 ,

then, provided that 0(H ) and 6(H ) are small enough, the expected number of
instances of H in the population increases exponentially over generations. By

small enough values of 0(H) and 6(H), we mean that the values of 0(H),

5(H), pm and pc satisfy

 

 

f(H’t) of! 6(H)
7??) (l-pm)()[1-ch_1]>1

(1.7)
By means of this observation, Goldberg, 1989, has defined the notion of Building
Blocks (BB), which are low-order, low—defining-length schemata of above
average fitness — in other words, these schemata satisfying the condition above.

Hence, the well-known “Building Block Hypothesis” of Goldberg says that

33

"A GA works by combining 885 to form higher-order 835 until it converges to an

optimum or near-optimum solution."

It is worthwhile to emphasize that Equation (1.4) works for all possible
schemata independently and in parallel. Later, in 1997, Stephens and
Waelbroeck, developed another schema theorem, which gives an exact equality
for E[m(H ,t +1)] by the formula

E[m(H,t+1)/M]=(1—p.)p(H.t)+Nag—1Nip(L(H.i),t)p(R(H,i),t)

_ i=1

(1.8)

where L(H,i) is the schema that is obtained by replacing the elements of H
to the right of position iwith *‘s, and R(H,i) is the schema that is obtained by

replacing the elements of H to the left of position i with *'s.

Schema theory is applied to modeling of single populations and to
determining good population sizes by G. Harik, D.E. Goldberg, , E. Cantu-Paz
and BL. Miller [1999] and later to modeling of parallel GAs by E. Cantu-Paz

[2001]
1.4.2 Random Walk Model

The random walk model is used with schema theory to determine the

appropriate population size N (G. Harik, D.E. Goldberg, , E. Cantu-Paz and

BL. Miller [1999]). Let x0 be the initial number of BBs in the population and the

variable x represent the number of 886 at any time. First, for a given

34

configuration of the population of size N , the probability, p, of producing one
additional building block is calculated. Then one can visualize the dynamics of

the number of 83s as a one-dimensional random walk as shown in Figure 1.19

 

x: # of BB’s
q p
(\A
l l J
I I 1
=0 x0=initial #of BB’s x:

x0=N/2k

Figure 1.19 The Random Walk

Then the probability, P, that this random walk converges to x = N is calculated.
Specifying an expected value of P will determine the correct size of the

population that would result in this expected value being met or exceeded.

1.4.3 Markov Chain Model

In this model, genetic operators are described by transition matrices acting on a
vector describing the precise state of the population, (Nix and Vose [1991], V039

and Liepins [1991], Suzuki [1995].)
In a search space where there are n possible points (i.e., chromosomes),
let p = (poip1,..., p,,_1,)represent a population where pk is the proportion of

the population occupied by item k. Let q = (q0,q1,...,qn_1,) be the probabilities

that each item is generated in the next population. Consider the next population

as P (the population size) independent samples of the search space, using q as

35

a probability distribution. Representing the chromosome length by L, we have

n=2L. Note that, the size of the state space, i.e. the number of different
. . . . P+2L—1 , ,
populations of Size P, IS given by P , which can be approximated by

LP
7 when P very small compared to 2" . This number easily reaches to billions

even for very small values of P and L.

Think of the action of a GA as a map G : A —> A, where

A={xe$)i" :xk 20,2xk =1}
(1.9)

Then the probability that population q follows population p is given by a

multinomial distribution

 

Plﬁ (G(p)j )qu
.j=0 (P41)!
(1.10)

where G(p)is the expected next population, G( p) j is the 1‘“ entry of the

probability vector G( p), i.e. the probability that the 1‘“ chromosome will be

selected for the next generation.

lterating G will produce a sequence of points. G describes the limiting
behavior as the population size grow large. Then, we have the following theorem

[Vose, 1999].

36

Theorem: Given an initial population p, let q be the actual population observed
after tgenerations. Then, for any 8 > O and 0 < 6 <1, there exists a number K

such that if the population size is bigger than K, then the probability that

l

 

G’(p) — q" < 8 is greater than 1— 6.

1.4.4 Statistlcal Mechanics Model

The population is described by a small set of macroscopic parameters under the
assumption that microscopic details are not of critical importance, by A. Prt'igel-
Bennett, J.L. Shapiro [1994,1997], and M. Rattray [1996]. They represent the
fitness distribution by its cumulants and determine the effect of selection,
mutation and crossover on each of the cumulants. Cumulants are more natural to
use in this kind of representation instead of moments of the fitness distribution
since they are self-averaging, i.e. their average represent a typical member of
possible distributions, while moments are not self-averaging, (see Prl'igel-
Bennett, [2002] for a discussion of this comparison.) Using these results, the

shape of the next population fitness distribution is determined, (see Figure 1.20.)

37

$5.». in...) (.E.I. I!

A

1 Estimate the shape of the population using
Represent the ﬁtness distribution these cumulants.
by its cumulants, K' s
(ﬁrst four of them)
xsm Ksmc
lam—578‘ 1 x,(i)S-——-—»t 2 xtpris,e3 x.- (t)=K.(t+1)
selection mutation crossover

Needs statistical mechanics methods

Figure 1.20 The statistical mechanics model applied to cumulants

The objective is to determine average allele per site, average correlation per site,

k

etc. terms like <a.’ > < aJa- > - . The one variables are assumed to be
i I I j¢k

j,
free to fluctuate subject to the constraint that the macroscopic quantities chosen
are satisfied. Then, the distribution of the allele values is assumed to be the one
which maximizes the entropy. As an example of their calculations, consider the

mean allele value at the ith locus

or. =<z=j>= 11:27! .

Let N (a3) be the number of different ways that would give a,- as mean allele

value at the f“ locus. Then, the entropy for this locus would be

38

"*4: ‘Ganiu who... .02 It /.
I

S(a,-)=log(N(a,-)). Introduce Lagrange multipliers, x and y, to enforce

constraints on mean fitness and correlation

yZZa,’ = yPZa, = yPKl : for mean fitness
i=1 i i
x2 . x2
7222mm? =7P22ai2 : for allele correlation
J' k i i

Then, the probability distribution for a {a} configuration is

L L 2
P({ai}) : Hp(ai) = Hes(ai)+yPCY,-+(ani) /2

i=1 i=1
The maximal value of p(a,~) with respect to a,- gives the maximum entropy
distribution for 01,. So, from derivatives with respect to a,, one gets a
relationship between a,- and the Lagrange variables x and y. Using this
expression of a,- in terms of x and y one can write down the previously chosen
macroscopic variables, such as mean fitness, K1, and fitness variance, K2 . When

the defining equations of K1 and K2 are written for the counting-ones problem,

and the average over all possible crossover operations and mutation operations

is taken, one sees that average mean fitness or the fitness variance does not
depend on the mean allele values. Thus, the expected values of K1 and K2 can
be calculated after mutation or crossover, and, we can find the values of x and

y using them. Finally, using these values of Lagrange multipliers, one finds the

39

values of mean allele, 62,-. Then, ai’s are used to estimate K3 and K4 values.

Finally, the shape of the fitness distribution for the next generation is estimated

using the first four cumulants.

Recently, the maximum entropy technique is applied by Whitley [2004],

also to determine the shape of the fitness distribution from schema frequencies.

1.5 The Need for a New Model

Studying the OneMax problem is important not for solution of that problem, per
se, but because many real-world problems solved via genetic algorithms consist
of a set of separable sub-problems for which the optimum is to optimize each
individually, which is reminiscent of OneMax. When we look at all the models

mentioned in this chapter, we have the following picture:

The Markov Chain Model uses huge matrices to model the dynamics of a
GA and is not applicable in practice, although it gives a good idea about some
general features of GA evolution when applied to simple cases with too small
population size and chromosome lengths. Its application to infinite populations
give an exact description of this case, but, as pointed out by Prl'igel-Bennett
[2002], only very large population size gives results close to infinite population
case. This model, yet, is not applicable to problems with typical population sizes.

It also requires the full knowledge of the fitness function.

40

Schema Theory assumes that the building blocks are already (partially)
known. Assembly of building blocks is useful to observe to understand GA
performance for many problems. This theory is hard to generalize to more
realistic cases since it has many simplifying assumptions, and loses its

applicability as populations lose their initial random character.

The Statistical Mechanics Method requires explicit knowledge of the
fitness function. This method makes very good predictions about the evolution of
the fitness distribution. However, it is applicable in practice only to problems with
very simple fitness functions. Effects of crossover are particularly hard to

estimate using this method for most classes of real-world problems.

So far, none of these models have been successfully applied to the
analysis of parallel genetic algorithm behavior that involves migration before the
population converges. Estimation of optimal times for performing such
migrations, for real-world problems such as that of McCleer Power, Section 1.3,
was among the initial motivators for the models developed in this thesis research.
The examples in Section 1.2.3, illustrate that a model that predicts the mean

allele evolutions can be applied to migration analysis.

Although the models mentioned above are applicable to some cases of
OneMax problem, there is no complete model predicting dynamics of deceptive
functions (a class of functions that misleads the GA in finding the optimum. See

Section 3.1 for a type of deceptive function).

41

As a result, we need a new model that can be applied to cover crossover
rates lower than 100%, and also be used for migration analysis. Moreover, this
model should also be simple enough, both theoretically and computationally, to

potentially apply to a real-life problem.

42

Chapter 2

A MODEL OF GA DYNAMICS FOR THE ONEMAX PROBLEM

2.1 Problem Description and a Visual Representation of the OneMax
Dynamics

In this section, we study the OneMax problem. We consider the simple genetic
algorithm in which two-point crossover, fitness-proportional selection and
mutation are applied in the order given. Note that, "canonical” GAs typically apply
selection before or after the crossover and mutation are applied. However,
considering selection in between crossover and mutation does not make much of
a difference in estimating the long term behavior of GAs since the essential
difference between them is only at the very beginning or very end of the GA run,
as illustrated in Figure 2.1. Also note that the order of mutation and crossover
can be changed without making any difference in GAs dynamics since each gene
remains in the population during these two operations and is equally likely to be

changed by mutation before or after crossover.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

AA A
chchs ) CSMXCSM)C
VV V

Figure 2.1 Selection-mutation-crossover versus crossover-selection-mutation

43

In this chapter, we develop a model of genetic algorithm behavior on the OneMax
problem with a population consisting of P chromosomes of length L (see also
Buyukbozkirli and Goodman [2004].) Let S(t) be the set of all chromosomes at
time t, chrom an element of this set, and chrom(i) the allele at the f" locus of this

chromosome, in other words, for chromk =(af,a§,...,af) the 1‘“ allele is

chromk (i) = a!‘ .The fitness of a chromosome, chrom, will be denoted as
f (chrom). So, for the OneMax problem,
f(chromk) = Zchrom,c (i) = of + aé‘ + + at,

l

(2.1)
where the values of of ’s are 1 or 0.

The population-level variables that we are interested in are the mean
fitness ml), the variance of the fitness mt), and the set of means of the alleles

at each locus i {ai(t)};=1 ,,,,, L , at time t. They are given by the formulae

PL

P
1:10) = %Zf(chr0mk ) = %ZZa,-k,
k-l

_ k=l i=1

P
K2(t)= i[Zf(chromk )2]- )'(1(t)2
P k=1 '
1 . 1 P k
all-(t) = —Zchromk (i) =—Zai (t).
Pk=l Pk=l

(2.2)

44

In the case of the “weighted fitness function” for a OneMax problem, each allele i

of the chromosome is weighted by a constant weight, w;’. In this case, the fitness

of a chromosome is given by the weighted sum

f(chr0mk) = Zwi ~chr0mk (i) = wlaft + wzaé‘ + + wLaf.
t'
(2.3)

Figure 2.2 shows a sample population at time ttogether with deﬁnitions of some

of its parameters.

mean allele at the ith locus : a,-

   

chrom, = 1 O 1 0 ﬁtness
chrom: = O 0 1 1 fitness
ehrom,= O O 0 . . . 1 1 fitness
w, w: W3 . . . WL_1WL
weights of alleles "‘03“ fitness 3 K1
variance of ﬁtness : K2 = 0'2

Figure 2.2 The chromosomes in the population, mean allele and other

notations
In Table 2.1, we see a list of the notations used in this chapter in the

development of the model.

45

Table 2.1 Notations
P :population size
L :chromosome length
a," (t) :the 1"” allele of the k‘h chromosome at generation:
a, (r) : the mean allele at the i'“ locus at generationr
f (chromk) : the fitness of the k‘” chromosome at generation:

[(10) :the mean fitness

K2 (1) : the variance of the fitness

pm : mutation probability of each allele, in decimal form
p, : crossover rate, in decimal form

S(t) : the set of all chromosomes at time t

p,(t) : the probability that a chromosome that is selected randomly at time t
with the probability scheme of the selection, has 1 at its ith locus

When we study a particular run of a GA it is useful to look at its mean allele
values for each locus and observe the way they change at each generation. In
Figure 2.3, we see the mean allele values at times 0, 10, 20, 30, 50 and 100. The
GA parameters of this OneMax problem are: chromosome length L=100,

population size P=50, crossover rate pa = 0.25 and mutation rate pm = 0.001.

46

H'w} ld\r.li.'i ’IIAI. .
I... : r. .3 K .a . M
.

 

1.
0.8*

0.6B;::£;:30. cg... o

 

time=0

 

 

 

 

 

 

0.4 .a
0.2L
0» l
-0.2 ‘ ‘
0 50 1(1)
locus, i
tlme=30
1- oo o o o a...
mas” a.“ . .
0.8. - ° " . ' .
~
0.6» ‘ .
- .‘s . '
0.4: D 0 ' . J(1
o
0.2» o o , .
o ‘ on .
0?. . 000' o e: on
-0.2 ‘
0 50 100
locus, I

 

 

 

 

 

 

 

 

pc=0.25 , pm=0.001
time =10
1 * e .' O . 4
o o o
0 Bi .0 :0 . I . A
' .. ..
3. g . o .0 u
0.6* 0. ’00 . ." ‘1... .
. . . .
0.4i .2. ..0 .'e ‘a
e. so 0
0‘2 1.. .. . . . 0 0.;
0. o oi
-O.2 ‘
0 50 1(1)
locus, i
time = 50
1 can." 0 -:.-HH
0 o 0
0a"- ,
0.6» "
. o
o 4 . ° ° ' . -
. . . . (I
0.2» . '
. . .. C .
0. o o 0:0 so. so 0 d'
-O.2 ‘ *
0 5O 1m
locus, i

 

 

 

 

 

 

 

 

time=20
1L0...‘oe so so 0.... e .
o
cage. 1:. "" h ‘
06’ ..~. . ..o. .0. o:
0 o .0 0
0.4 00 '
g 0' o
. e
02?. . -.. -0. '
0’ 0 o o
-0.2 ‘
0 50 100
locus I
tlme=100
1L“..- o—qo-
o
0.8. . ° -«
'. o' . o ..
0.6 1
. o .. o...
0.4- . ..
0.23 "
One 00.... one CI
-0.2
0 50 100
locus I

Figure 2.3 The mean allele values, ai’s at each locus i for times t =0, 10, 20,

30, 50 and 100 of a GA run with pc=0.25 and pm=0.001.

Define Ah(t) to be the number of ai(t)’s whose values are less than or equal

to h,

Ah(t) =#{ a,(t)| a,(t) s h, i = 1,...

47

.L}

(2.4)

By this definition, for example, A0(t) gives the number of loci where all of the

chromosomes have value 0, while A0_6(t) gives the number of loci where at

most 60% of the chromosomes have value 1. For instance, in Figure 2.3, we
have A0(3O)=9; i.e., at the 30th generation, at 9 of the loci, all the

chromosomes have value 0. So, for this example of the GA, at generation 30, the

allele value 1 is completely lost at 9 positions of the chromosomes. A1(t), by

definition, is always equal to L.

In Figure 2.4, we see how 1404 (t) is defined and the time evolution of

Ah(t) for h=0, 0.1, 0.2, 0.9. The time evolution graphs of Aha) in Figure 2.4

are obtained by taking the average of 100 GA runs.

48

 

 

pc=0.25 , pm=0.001

 

 

time =80
1» a. .. .0 no...” 0 so- 0.. a... 0-..... m
o . . o
o
0.8~ . ' ,' —
o o . . o o
a
0.6~ o _
d o
o
O_4>¢ oooooooooooooooooooooooooooooooooooooo g.............. ooooooo Q nnnnnnnnnnnnnnn ‘oollo-h
.- A0.4(t) :-' # 9f dots b '
0.2» , level 0.4 -
0 o o o 0 ° 0 ..° 0 o o o o o
_02 1 1 l L l l
20 40 60 80 100

 

allele nunber, i

One-Max with : pc=0.25 , pm=0.001

 

 

 

 

 

 

 

 

 

 

Figure 2.4 Deﬁnition of Ah (t)

49

The values of the variables (Km), 19(1), {m(1)}i=1 L) and Ah(t) change from one

9"!

GA run to another even if we have the same initial population. In terms of
experimental results, we run a GA, with fixed parameters of selection, mutation
and crossover, many times. For each run of the GA, we measure these quantities
at each generation and take the average over all of the runs. The goal of our
model is to estimate average values of these variables, hence the average

behavior of the GA. In order to simplify the notation, we will use the same
symbols (Km), 19(1), {ai(t)};=1,,,,L) and Ah(t) for values of a specific run of a GA,
or for an experimental average of these values, or for the estimated theoretical
average in our model. Which one is denoted will be clear from the context. We
will use the superscripts c, cs or csm in order to distinguish these variables after
crossover, selection or mutation is applied, respectively. So, cit-“(0 represents the
mean of alleles at the f" locus at the 1‘" generation after crossover and selection
have been applied, and 0493’"(t) equals a;(t+ 1). Note that the crossover operation
does not change the mean allele values in OneMax problem. Thus, aq(t) equals

ail-"(t)-

The study of the time evolution of Ah(t)’s for several values of h gives a
very practical insight into the behavior of the GA. Figure 2.5 shows the graphs of
Ah (t) for h = 0, 0.1, 0.9, where the crossover rate is 25%, the mutation rate is

0.1%, and fitness-proportional selection is used. The population size is taken as
50 chromosomes and the chromosome length is 100 genes. Each time slice of

such a graph can be seen as a “bar graph” of the mean allele distribution at the

50

given instant. In other words, the vertical distance between two curves gives the
number of gene locations at which the mean allele is between the corresponding
values, averaged across runs. For example, at t=100, at about 18 gene locations,
none of the chromosomes (i.e. h=0) have value 1; at about 5 locations from 1 to
5 chromosomes (i.e., more than 0% and up to 10% of the population, i.e.
0<h50.1) have a 1 and the rest have a 0; and at about 50 locations, from 45 to 50
chromosomes (i.e., 91% to 100% of the population, i.e., 0.9<hS1) have a 1 and
the rest have a 0, etc. The closer the curves are to each other, the smaller the
variation in the population. We observe that although the population converges to
a more-or-less stable configuration after 100 generations, there is still some
variation within the population, due to the existence of mutation, which has the
potential of creating new chromosomes. Thus, for example, at the right-hand
side of the graph, about 18% of the loci are "fixed" at 0, about 50% are “fixed" at
1, and about 32% of the loci have a mixture of 0's and 1's. The number of these

“mixed" loci typically increases as the mutation rate increases.

51

’0-
0-.-.-.
-.-
.

 

 

 

 

 

 

 

 

 

 

 

 

50

18

90
so Time-slice at t = 25 ,_ . _
as a bar graph ' ' '
70 - .........
so ‘‘‘‘‘‘
g Time-slice at t =100 _
<‘ 5° N as a bar graph °
40 RM "=09
- ﬂ KM“;- [
3° 7", ' "- “'3 0.1
20 ~‘w** M h=0
10
00 50 100 150 200
t
Figure 2.5 The experimental average values of Ah (t) as a function of time for

h =0, 0.1, 0.2, 0.9 , and the bar graph interpretation. The

population size is 50 and the chromosome length is 100 genes. The

GA parameters are p0 = 0.25, pm = 0.001. The average is taken

over 100 experiments

When h is quantized with a gap of 0.1 between two consecutive values as above,

we get 10 regions formed between the curves, including the region above the top

curve. We will use the index n to count these regions, h' = 1, 2, 10, given by

Rh’ = {050' A(h’—1)/10(‘)5 y SAir/10(1)}

where A1(t) is defined as the constant function L.

(2.5)

2.2 The Model for OneMax with Fitness Proportional Selection

The model is developed first for the case with a “high enough” crossover rate.
“High enough“ here means sufficiently high that the alleles at any locus are
distributed essentially randomly among the chromosomes. Then it is modified to
include cases with lower crossover rates. The first case involves three main
steps. First, mean alleles after crossover and selection are estimated assuming
that the crossover rate is “high enough”. Then, the effect of mutation on the mean
allele is determined. The last step involves the estimation of fitness variance

given the mean allele values.

The second case, in which the crossover rate takes more realistic values,
is modeled by observing some statistical properties of the GA at early
generations. This is an aspect in which this method differs strongly from earlier
theoretical models, but which, it is hoped, will allow simple models to be
developed that are applicable to a variety of interesting problems, adapting to the

behavior of the crossover and fitness functions on a problem-specific basis.

Note that at any GA stage, the mean fitness, K1, is always the sum of the

mean alleles across loci at that moment, i.e.

K1(t)=Za.-(t)

(2.6)

This gives us our first simulation formula:

53

M 2. 1: At any stage of the GA for OneMax the mean fitness is

x1(t)=Za,-(t)

2.2.1 Selection and Crossover with “High Enough” Crossover Rate

First, consider the case in which the crossover rate is so high that the alleles at
any locus are distributed essentially randomly among the chromosomes. We will
call this crossover rate a “high enough” crossover rate. In fitness-proportional
selection, each chromosome has a selection probability proportional to its relative
fitness within the population. If we denote as q, the probability of selecting the jth

chromosome, then

f;

2,; f k

qj=

(2.7)
where fk is the fitness of the k‘h chromosome.

Let p; be the probability that a chromosome that is selected randomly with
the above probability scheme after the application of crossover, has 1 at its 1‘"
locus. As with the other symbols, we will use the notation pi(t) for values of a
specific run of a GA at time t, or of an experimental average of these values at
time t, or of the estimated theoretical average in our model, depending on the

context. It is easy to estimate pi(t) theoretically in terms of K,(t) and a;(t) when the

crossover rate is “high enough”. Let S(t) be the population after crossover is

54

applied with high enough crossover rate to the population of generation t— 1.

Define the subsets S60) and Sf(t) of S (t) as

S (i, (t) = {chrome S (t) I chrom(i) = O}

and

Sf(t)={chrome S(t)| chr0n1i)=1},

 

 

(2.8)
Then, we have
2 f(chrom) = P(l — a, (i))(irl (t) — a,- (n)
chrome 53(1)
and
Zﬂchrom) = Pa,(t)(1+ K1(t)— ai(r))
chrome Sf“) ’
(2.9)

where the bar over the summation means the average over all possible
configurations of gene distributions, in which we assume that the genes are
distributed randomly satisfying the given mean allele values, since the crossover

rate is “high enough”. Thus, the estimated average value of p,(t) is

Pa, (t)(1 + Kl (t) - a, (t))
Pa,(t)(1+ x10) — a, (t)) + P(1- a, (t))(l(l (t) — a,- (t)) ’

 

1),-(t):

(2.10)

55

which simplifies to

 

(1" ai(t))ai(t) .

,(t =a, t +
p ) () K10)

(2.11)

In the process of fitness proportional selection, we apply selection of
chromosomes P times with replacement. Each time, the probability that the
selected chromosome has 1 as its 1‘" allele, is p)(t). So, the expected number of

1’s at the f“ locus, after the selection is over, can be obtained by using a binomial
distribution. Let B(n,P, pi) denote the probability of having n successes after P

trials, when the success probability is p, for each trial. Then, the expected

theoretical value of alt-“(0 is

P
af‘tt)=;1,-Zn-B(n.P.p.-(t>).

n=l

(2.12)

when the crossover rate is “high enough”.

In summary, we have the following formula that is used in the code

simulating the model:

_M_g_g: (a) After crossover with high enough rate is applied, the
probability that a randomly selected chromosome that is drawn with
the probability scheme of fitness proportional selection, has a 1 at
(1 - a,- (t))a,. (t)

K10) .

 

its 1'" locus is pim = 04(1) +

56

(b) The mean allele values after crossover and selection for high
enough crossover rates are given by the formula

P
a,“ (t) = %2n . B(n,P, p, (t)) .

n=l

2.2.2 Mutation

In this section, we want to estimate a)“’"(t) given the values of af"(t). Each gene
of a chromosome has the probability pm of changing its value from 1 to 0 or from
0 to 1 by mutation. When we consider the possible changes at the f“ locus only,
the expected number, N, of total allele changes due to mutation can be found by

using a binomial distribution as

P
N = Zn-B(n,P,pm).
n=l

(2.13)

Since the percentage of 1’s at the it“ locus is 0408(1), a)“(t)N of these changes are
going to be from 1 to 0, and (1-aF’(t))N of the changes are from 0 to 1, on the
average. This means that the number of 1’s at the ith locus, which is Fatwa), will
become Pa,“(t)- mcs(t)N+(1-a)°s(t))N after the mutation. Simplifying this quantity

and dividing by P gives the mean allele for the next generation as

 

_ .68
dig-(H1):aI,-"‘""(t)=(x,-"s(t)+1 2a, (”N

(2.14)

57

.,

1334*».“13 run-MN... Mu

 

As a result we have the following recipe to use in the model simulation:

M 2.3: The mean allele after mutation is given by

P
N , where N = Zn-B(n,P,pm),

n=l

ailt+1)=afsm(t)=af‘(t)+———l-2:f (1)

2.2.3 Fitness Variance for “High Enough” Crossover Rates

The fitness variance by definition is

P
K2 (t) = i-{Z f (chromk )2] - K1 (t)2 .
k=l

(2.15)

If we write the fitness of chromk as the sum of its gene values a!‘ and change the

order of summation after expanding the square sign above, we obtain

1 L P
19(1) =K1(t)+—Z Zafaf —K1(t)2 .
Pit-xj k=l
(2.16)

P
The term Zafaf in Equation (2.16), counts the number of chromosomes in
k=1

which loci i and j both contain 1’s. In the case of “high enough” crossover rates,

this count is estimated by using af’ and 01st as follows. The probability,

p(i, j,n), that locations i and j have n common 1’s is given by the formula

4- P CS], where 11 could take any value
a.
J

170313") =(

n Pa? - n

58

between max(O,Pa,-“ + Pale-3 — P) and min(Pa,-“,Paj-‘) and the product of

P with a’s is rounded to the nearest integer in order to calculate the
combinations. Thus, the estimation of the fitness variance in the case of “high

enough” crossover rates is found by using

CS CS 1 L - - CS
x2 (t)=l(1 (t)+-FZXn-p(l,],n)—Kl (t)2_
i¢j n
(2.17)

The estimation of fitness variance after mutation is done by Prt’igel-Bennett and

Shapiro, [1997]. Their formula gives us

L
2 1 2
K?“ = (1 - 2pm) If?” +[1_F]pm(1— Pm)ZWi

i=1

(2.18)

where w. is the weight of the ith locus. In other words, the fitness of a

chromosome (a1,a2,...,aL) is calculated by the weighted summation 2W1“: . In

our special case, the values of w,-’s are all 1. So, we use the formula
2 1 2 L
Kgsm(t)=(1-2pm) K§3(t)+[1-;)(pm — me1
i=1

CS 1
=(1-2pm)2K2 +L[1—F](pm—p31)=l(2(t+l) ,
(2.19)

59

to estimate the fitness variance after mutation.

As a result we have the following two statements to use in the model

simulations:

M 2. 4: The fitness variance after crossover and selection is found

1 L . .
by using K250) = K1“ (t)+FZXH'P(1.J,n)-Kfs(t)2, where

iatj n
Pa.“ P — Pa,“ P
p(i,j,n)= ' x + cs . The same formula
n Pa? —n Pa,-

can be used to estimate K2 (0) using [(1 (0) since the initial values

of the allele are chosen randomly which makes it equivalent to a
highly mixed population.

M 2. 5: The fitness variance after mutation is found by using

3 2 1
K5 "’ =(1-2pm) K?” +L(1-;)(pm #73.)-
2.2.4 Lower Crossover Rates

Equation (2.11) gives the probability p. when the crossover rate is very high. In
such a case, as in Section 2.2.1, we are able to treat the 1’s at a fixed locus of
different chromosomes as identical to each other in terms of their roles in
selection because of the high mixing rate of the crossover operator, which makes
chromosomes look similar to each other, on the average. However, for lower and
more realistic crossover rates, there will be some correlation between alleles

within a chromosome and Equation (2.11) will no longer hold. Let’s keep the

60

usage of notation pi(t) for the probability of selecting a 1 at the ith locus in the

case of the “high enough” crossover rate and denote the corresponding

probability in the case of a lower crossover rate by p, (t). To remedy this

situation and estimate 5,- (t) correctly, we consider artificial weights, c,-(t), for

each locus in order to reflect the average change in the role of 1’s played in the
selection process due to correlation between alleles. The correction weights, c,-(t),

are defined implicitly by

(1 — a, (1))a, (0c, (1)
K1 (1) .

 

EU) = 6310‘) +

(2.20)

The reason why we defined the correction weights as in the equation above is

because if we write Equation (2.11) for a fitness function of the form

f (chrom) = 2w,- -chrom(i) , with weights w), we would get an equation exactly

I
like Equation (2.20) with c; replaced by w). Our correction weights play a similar

role at each locus as Ms would, except that 01’s change over time.

The next step will be to estimate the c,’s statistically by means of some
data gathered from experiments. In order to do this, the GA is run with fixed rates
of pm and pc up to a pre-selected generation, say to. Let us call this generation

Go. The crossover operation with the current rate, pc, is applied to Go many

times. Each time, W t ) values are calculated from the experimental data for

each locus i, and the corresponding 0,- values are found using Equation (2.20).

61

‘1 uﬂuﬂﬂk.....z 3.13.5 .. . ﬁt

 

This process is repeated for many runs of the GA to obtain statistical measures.
It is observed that the value of c) strongly depends on the values of at, as
expected. Because of this dependence, it makes more sense to group the or

according to their corresponding a) values before finding the statistics of the data

gathered from the experiments. So, define Ciao) as the set

{ c,- values of the k‘h experiment of GA such that h s a,(t0) < h + 0.1}. for h = 0,

0.1, ...0.9 . The mean of the correction weights is obtained by finding

,u(h',t0):mekan(mean(C,:'(to)», h' = 1, 2, ...,10, where the relationship

between the index h and h' is given by h = (h' -1)/10, to be consistent with
definition (2.4). In order to measure how much the correction weights vary from

one experiment to another, we also calculate the standard deviation

0(h',t0) = s£d(mean(C,:’ 00)».

The experimental results show that, when pc is not too low (below about
4%), p and 0 remain more or less at the same value regardless of the time, to.
Moreover, 11 shows a linear-like behavior while a shows a quadratic-like behavior

as a function of h'. This behavior of the crossover operator allows us to use the

linear approximation of p(h',5) to predict 5,1! ) for the following generations.

Figure 2.6 shows the graphs of p for two different rates of crossover with to at
generations 5, 15 and 30. We have observed that the inclusion of u in our model
is good enough for describing the effects of c; distributions and the information

coming from a does not play a significant role in the counting-ones problem.

62

However, for other problems, such as OneMax with Boltzmann scaling, 0 might
be needed in the model. The deviations from the linear behavior, in Figure 2.6, at
h'=1 or 10 are due to effects of statistical averaging in which there were not

enough data points available for these border values.

 

 

 

 

 

 

 

 

 

 

3 2.51 2 5
pc = 0.25 2t pc = 0,75 2» p6 = 3
2 1.51 ' 1.5L
1 1M
1
1‘ . 0.5» 0.5,
- Iiiearfitforb=5
0* 0* 0’ Estoppedatto=3o
. * stoppedatto=5
e. :' '0.5‘ '0.5 "u stopped atto=15
-1 ' . J _1 1 a _1 1 l
0 5 10 0 5 h. 10 0 5 10

Figure 2.6 The mean value of the correction weights as a function of mean
allele levels, h', for crossover rates pc = 0.25, 0.75 and 3. The
statistical average is found over 100 runs of the GA. Population
size is 50 and chromosome length is 100 genes

M 2. 6: The mean allele values after crossover and selection for a
crossover rate pc are given by the formula

 

P — . . .
Cit-“(0 = i2]; - B(n,P,p'iU», Where ﬁt“) ___ 0’10) + (1 al(t))al(t)cl(t)
Pn=l K1 (I)

and the values of c, (t) are determined as described above for the

corresponding crossover rate.

63

411...... EH13 ..wﬂﬁ... .
1 M.

 

2.2.5 The Algorithm of the Model

Diagram 1, page 65, shows the flowchart of the algorithm that is used in the
simulation of the model that has been developed in the previous sections. We

apply this algorithm N times. The estimations of mean fitness and fitness

variance are found by taking the averages over these N runs. a,(t) values of

each run are used to find Ah (t) values for that particular run, h = 0, 0.1, 0.2,

0.9, Taking the average of these measures for each t, in turn, gives us the actual

Ah (t) evolutions, that describe average GA behavior.

The simulation of the model for high enough crossover rates starts with
selecting a set of ai(0) values chosen by considering a binomial distribution for

each locus in which we have P selections with a 50% chance of selecting a 1
each time. _M_g_.g (page 56) and M2_.3 (page 58) are applied to estimate the
mean alleles after the crossover, selection and mutation operations. This process
is iterated for each generation to obtain a dynamic simulation of the mean allele.
At any moment, the mean fitness is estimated by MA (Section 2.2), and the
fitness variance in the case of “high enough” crossover rates is estimated using
M_2_.g or M (Section 2.2.3), depending on whether we are considering the

variance right after the selection process or after the mutation, respectively.

In the case of lower crossover rates, M 2.2 is replaced by M 2.6, in which
the crvalues are pro-determined by the linear approximation of the data gathered

at the 5th eneration of a set of GA runs as described in Section 2.2.4, Fi ure 2.6.
9 9

64

 

 

Diagram 1 The Flowchart of the Simulation Algorithm

 

Select a, (O)’s randomly, i = 1, ,2 , , L using binomial distribution
for each i, P selections with 0.5 success probability

 

V

 

Apply M 2.1, (page 54), to find K1(O)

Apply M 2.4 ,(page 60), to find K2 (0)

 

 

 

 

Fort: 1 to Maximumieneration number

 

 

 

 

Fori=1toL:

 

 

 

 

 

Apply M 2.2(a), (page 56), to find the value of p,- (t)

 

Apply M 2.2(b), (page 56), to find a sample value for mean allele,
af‘ (t), using a binomial distribution, P selection with p, (t) as
the probability of success

 

i=i+1

 

 

 

J

 

 

Apply M 2.1, (Sec. 2.2), to find it,“ (t)

Apply M 2.4,(Sec. 2.2.3), to find K? (t)

Apply M 2.3, (See. 2.2.2), to find mean allele,
afsm (t) = a,- (t +1), after mutation

Apply M 2.1, (Sec. 2.2), to find mean fitness of the next generation
K?“ (t) = K1 (t + 1)

Apply M 2.5, (See. 2.2.3), to find the fitness variance of the next

generation K§sm (t) = K2 (t -l- 1)

 

 

 

2.3 Weighted OneMax Fitness Function

Now, we will consider the case in which each allele contributes to the ﬁtness with

different weights. In other words the ﬁtness of a chromosome

chromk =(a1k,a§,...,af) is

f(chromk) = Zwi -a{‘ .
I (2.21)

 

Fitness function for weighted counting l's problem:

IIIIOIIOO Fame”;
“1 “a “a “4 “5 “0 “1 “a “1 “1*“21'w31'w41’w01'w1

 

 

 

Figure 2.7 Fitness with weights

In Figure 2.7, we see an example of the weighted ﬁtness of a chromosome of

length 9.

When we study the GA with the weighted ﬁtness function of Equation (2.21) at

time t, for the 1‘“ locus, Equation (2.9) would change to

66

 

42flchr0m) = P(1- 0!. (1)th (t) — ma,- (0)

chrom e 53 (t)

and

 

2 f (chrom) = Pat,-(t)(1(1 (t) + (1 — at. (t))Wi )

chrome S{(t)

(2.22)

Then, Equation (2.11) would be replaced by

(1 - a, (t))a,. (ow, .

K10)

 

[7,-(1): at“) '1'
(2.23)

This would change our simulation model formula as

M 2.2’: (a) After crossover with high enough rate is applied, the
probability that a randomly selected chromosome that is drawn with
the probability scheme of fitness proportional selection, has a 1 at
(1‘ at “Dar (”Wt

K10)

 

is f" locus is p,- (t) = a, (t) +

(b) The mean allele values after crossover and selection for high
enough crossover rates are given by the formula

P
af‘(t>=%Zn-B(n.P.p.-(t>)-

n=1

The existence of weights changes Equation (2.16) as

67

L P
K2(t)=Kl(t)+-l—Z Zwmfwja’; «10)?

tab j k=1
(2.24)
Then, Equation (2.17) is replaced by
1 L 2
K55 (1) = K1“ (t) + 7522" - wiwjp(i, j,n) - K1“ (t) ,
i¢j n
(2.25)
where p(i, j,n) remains the same as
Pa.“ P - Pa,“ P
P“, j,n) = l X + cs ’
n Pa? —n Pa}
(2.26)

as in Section 2.2.3.

The fitness variance after mutation is already given by Equation (2.18) for

the weighted case.

The modification for the lower crossover rates is straightfonrvard. The

defining equation for the correction weights, Equation (2.20) is now

(1-a,.(z))a,.(t)w,.c,.(t)
K1 (1) .

 

13.0): ai(t)+

(2.27)

68

2.4 OneMax with Boltzmann Scaling

One of the most common ways to scale the fitness values so that the
chromosomes with higher fitness have more chance to survive the selection

operation compared to the usual fitness-proportional selection is Boltzmann

scaling. In this scaling, the fitness value f (chromi) of the ith chromosome is
replaced by

f(chromi) = eﬁkhmm‘),
(2.28)

where ,6 , which controls the pressure of the selection, might be a function of
some parameters of the population or it might be a constant. Then, the
probability, q,- that the ith chromosome would be selected by the Boltzmann-

scaled fitness proportional selection is given by the quotient

eﬁ‘ (chromi)

i = p '
Zeﬁkhmmj)
i=1

 

q

(2.29)

The advantage of using a Boltzmann scaling is that the value of q,- is shift

invariant. In other words, if the fitness values of all chromosomes present in the

population at any time t are shifted by a value c, then q,- would remain

unchanged for chrom). One can see this property easily from

69

.I'E 037.5,
m-r .. ...se..r.. .7 m .9..A

 

eﬂ(f(chromi )+c)

 

qi : p
Zeﬂ(f(chrom,)+c)
i=1

eﬁceﬂﬂchrom)

 

 

eﬂc ieﬂkhmmi)
j=l
eﬁ‘(chrom,)
= p = qi
Zeﬂlchmmi)
j=l

In the case of scaled ,6, the selection pressure, ,8, is sealed at each generation

inversely proportional to the standard deviation, 0‘ = ,iKz of the fitness. So,

.33 =___V21“(P)ﬁ is used instead of ,8 and the fitness values, f (chrom), are

0'

replaced by

Then q,- is

,(21n(P) )ﬂf( chrom)

f(chrom): e a

(2.30)

 

 

i

J

i

_mnmﬂ(chrom.)
f (chromi) = e a t
,. ,_[______21n(r>)
(chrom) Zea ___—131mm,.)

=1

j= —l

70

When ,8 is scaled like this, then the value of f(chromﬂis invariant under

multiplication of fitness values by a constant. This is because when all the fitness

values in a population are multiplied by c, then the new variance of the

2

population would be c [(2, where K2 is the variance of the original population.

Then,

i—gﬂﬂcﬂchrom) V 21“(1’),5cf(chrom)
f(chrom)’ = e “C K2 = e CJK—Z

 

= f(chrom).
Hence, using a Boltzmann selection with sealed ,6 , the selection operation

would be invariant under constant multiples or shifts of fitness values.

We will study two types of Boltzmann selection for the high enough

crossover rates only. In the first one, ,6 is kept constant throughout the whole
GA run. We call this case the ‘Boltzmann selection with fixed ,6 ’. In the second
one, ,6 changes inversely proportional to the standard deviation of the fitness

distribution of the current population. This case will be called ‘Boltzmann

selection with sealed ,6 .’

Using the same notation as in Section 2.2.1, we define the sets

Sf, (t) = { chrom e S (t) I chrom(i) = O}

and

71

31.4.4422»... ...l . «A

 

 

 

Sf(t)={chrome S(t)| chr0m(i)=1},

where S(t) is the set of all chromosomes at time t.

Let us consider an instance of a GA run and study this instance for the ith
locus. We define probability p), similar to that in Section 2.2.1, as the probability
that a chromosome that is selected randomly with the Boltzmann probability

scheme after the application of crossover has 1 at its 1th locus. Then, we have

Zeﬂﬂchrom)

chrome S i (t)

Zeﬁuhrom) + Zeﬁkhrom) °

chrome S f (t) chrome S3 (t)

 

1910‘) =
(2.31)

We want to find the expected value of the probability, pi(t), when the crossover

rate is ‘high enough’. If the ith locus values of all chromosomes in this population

are replaced by 1’s then we would have a fitness distribution whose mean fitness

is Kl + (1 — (1,), where [(1 is the mean fitness of the original population. At this

point, we assume that the variance of this new population is almost the same as
the variance of the original population. Further, assuming that all the fitness

distributions under consideration are normal distributions, we propose a model

distribution for the set Sf (r) :

M 2. 7: Fitness values of Sf (t) are assumed to have a normal

distribution with mean [(1 + (1— a,) and variance K2.

72

Similarly, for the fitness values of S60) we have:

M 2.8: Fitness values of 53(1‘) are assumed to have a normal

distribution with mean K1 — ai and variance K2.

73

Chapter 3

A MODEL OF GA DYNAMICS FOR THE DECEPTIVE FUNCTION

PROBLEM

GAs, with the help of the selection operator, have a tendency to treat the
direction in which an increase in fitness is observed when a gene value is
changed as the direction to go to seek the optimum value (essentially, hill
climbing). However, not all such changes, in general, lead toward a gene value
that is part of the optimum chromosome. In particular, there are some test
functions that are especially designed to mislead GAs that take advantage of this
weakness of the algorithm. An important class of such functions is called
“deceptive functions”. In this chapter, we will describe a form of deceptive

function and develop a model for the GA with this function.

3.1 Deceptive Function

First, we define an N-bit deceptive function. For this function, the chromosome is
made up of some number of N-bit blocks and the fitness of the chromosome is
found by adding up the fitness contribution of each block. The fitness contribution

of an N-bit block is given by

74

A ifai=1foralli

f(a ,a ,...,a )= . .
1 2 N Bm it some a, sare 0, wherem=# of zeros

(3.1)

The chromosome consists of many such N-bit blocks and its fitness is the sum of

each of their contributions given by Equation (3.1). The constants A and B are

such that A>BN. In other words, a deceptive block with the highest fitness

contribution would have all 1’s as its alleles.

Figure 3.1 shows an example of a 3-bit deceptive fitness function. In this

example the fitness of (111 101 100) is A+1B+ZB = A+SB. We see that the

fitness of the chromosome would increase if the last deceptive block, 100, would

change to 000. However, it is clear that the optimum chromosome is the one with

all 1’s. This is the reason why these functions are called deceptive: From the GA-

selection operator's point of view 111 101 000 is better than 111 101 100, while

the later is bitwise closer to the optimum.

 

Fitness function for n-bit Deceptive problem:

34)“ case: me"

    

| I 1 I o 1 I o o I “(C I" deceptive b106k3)
“7" i/ H -- o 9:01 0's)
A 9 a A 4' 39

 

 

 

Figure 3.1 The definition of the fitness of a 3-bit deceptive function

75

3.2 A Model for the Deceptive Function with Fitness-Proportional
Selection

When a GA is applied to find the optimum value of the deceptive function defined
in the previous section, it tries to increase the number of both 0’s and 1’s at the
same time. So, the problem involves a counting-0’s and a counting-all-l-
deceptive blocks (i.e., the N-bit blocks with all their allele values equal to 1)
problem, competing with each other. Both the counting-0’s and counting-all-1-
deceptive blocks work similarly to the OneMax problem modeled in Chapter 2. In
this section, we apply this idea and develop a model for the deceptive function,

using the results of the OneMax model.

We use the index letter j to index the deceptive blocks, and the index letter
i to index the locus in the chromosome. Similarly to the mean allele values

defined in Section 2.1, Figure 2.2, we define

a _ # of 0's at the jth locus of all chromosomes
J. _
P

 

and

y _ # of all -1 deceptive blocks at the 1"" N - bit deceptive loci in the population
’ _ P

 

(3.2)

where P is the population size. Figure 3.2 shows the definitions of a and y for a

3-bit deceptive function.

76

   
  
   

 

chrom1 *** *** **=l¢
chrom2 *** *** ***
chrom? *** *** ***

 

1t 0's

(1.: __ 11 ill—blocks
4 Population Size yi "

Population Size
Figure 3.2 Variables a and yin a 3-bit deceptive function problem

Similarly to Equation (2.4), Figure 2.4, deﬁne

A:(t)=#{aj(t)| aj(t).<_h,j=l,...,L}

and

A; (t) =#{ 7,-(t)| 7,-(t) S h, i = Lug-1%},

(3.3)

where L is an integer multiple of N.

In the following subsections, we develop a model to estimate the average

expected values of a j (t) and 7,-(t) from their previous values.

77

3.2.1 Selection and Crossover with “High Enough” Crossover Rate

Similarly to Section 2.2.1, we first consider the case in which the crossover rate
is so high that the alleles at any locus are distributed essentially randomly among
the chromosomes. We will call this crossover rate a “high enough” crossover

rate.

We have the following simple equation, following directly from the

definition of the mean fitness

M 3. 1 At any stage of the model the mean ﬁtness is given by

L/3 L
i j=l

Define p? (t) as the probability at generation tthat the f‘ locus of a chromosome
that is selected randomly with the fitness proportional selection scheme after the

application of crossover, has 1 as its gene value. p! (t) is defined as the

probability at generation tthat the f“ N-bit deceptive block of a chromosome that
is selected randomly with the fitness proportional selection scheme after the

application of crossover, has all 1’s as its alleles. In Figure 3.3, we see a pictorial

definition of p? (t) and pf (t) for a 3-bit deceptive function.

78

00‘39 b\ocY~

 

 

. \ W16
{9 I?“ 66cc?“
= chrom: . .® .
Select one with the ﬁtness /
proportional selection p,’ (i) : Probability that this block is 111
probability scheme
Population after pfﬂ) : Probability that this allele is 1

generation t- 1

Figure 3.3 Definitions of p? (t) and pl?’ (t) fora 3-bit deceptive function

Let S(t) be the population after crossover is applied with a high enough

crossover rate to the population of generation t— 1. Define the subsets 753 (t),

(sf (t), “55' (t) and “311(1) of S(t) as
7510) = chrome S(t) i'h deceptive block Of chrom is 11...1 ,
1
7530) = S(t)-(Sf (0.

05110) = {chrome S(t) j’h locus 01 chrom iS 1 }.

“55' (z) = S(t)-“511' (t).

(3.4)

Then, by definition of pj'(t) and p," (t) we have

79

2f (chr om) 2 f (chrom)

 

 

 

pg“) ___ 4 chromeasgﬂ) 4 : chromeaSJU)
2 f (chrom) + 2 f (chrom) K1“

chromeaSJ (t) chromeaSljU)

and
2f (Ch? om) 2 f (chrom)
p170) = _ chronie78{(t) J = chromersfﬂ)
2 f (chrom) + 2 f (chrom) Kf
chromersli (t) chrome’S3(t)

(3.5)

In order to simplify notation, from now on we assume that the deceptive function
is 3-bit. All the arguments below can easily be generalized to a deceptive

function with any number of bits.

Then, for the set 781" (t) we have

 

iZﬂchromFP[A+(14(1)-Arf(t)-B(aj+t(t)+a).2(t)+aj+a(t)»1.

chrome 751‘ (t)

(3.6)

where the values of land 1 are related to each other by j = N (i - 1), (N = 3 in

our case).

For the set “Slj (t), we have

 

Zﬂchmm) = P[3 + (Ki (t) - Aria) - 309(0)].
chrome "SJ (t)

(3.7)

80

where the values of land j are related to each other by i = int(j/ N ) + 1, (“int” is

the function giving the integer value of its argument and N = 3 in our case).

In both Equations (3.6) and (3.7), the bar over the summation means the
average over all possible configurations of gene distributions, In which we
assume that the genes are distributed randomly satisfying the given mean allele
values, since the crossover rate is “high enough”. The superscript “d’ in both
equations means the values of the corresponding variables after crossover is

applied. Note that, in the case of the deceptive function, the crossover operation

changes the value of mean fitness, K1(t), as well as 7,-(t), while 051- (1) remains

unchanged. In the OneMax problem, the mean fitness was not changed by

CI‘OSSOVGI’.

Equations (3.6) and (3.7) give us the probabilities p? (t) and p,” (t) as

 

y ___ yf(A+(xf —Ay,."—B(czj+1+arj+2 +698»

 

' ,where '=3i—1

p. Ki 1 ( )
(3.8)

and

23+ K‘—A F-Ba.
Pf: J( (I y, 1)),wherei=int(j/3)+1.
Ki

(3.9)

In order to use equations (3.8) and (3.9), we need to estimate the values of Kf
and yf - i.e., the mean fitness and deceptive block percentages at each

81

deceptive locus after crossover is applied. For high enough crossover rates, they

are easy to estimate. yf is determined by

c _ G(Pa—aﬁl)’ P(1-(Zj+2), P(1-aj+3))

yi_ P i

 

(3.10)

where G(nl,n2,n3) is the function that gives the expected number of 111’s in

the population when the genes are completely shuffled with nk being the number

of 1’s at the corresponding locus of all the chromosomes. Then, Kf is given by

L/3 r.
Kf = Any + B a}.
i j=l

(3.11)
In the process of fitness-proportional selection, we apply selection of
chromosomes P times with replacement. Each time, the probability that the

selected chromosome has 1 as its 1‘“ allele, is p? and the probability that the f“

deceptive block is 111 is p}’ . So, we use a binomial distribution to find the

expected number of 1’s at the f“ locus and expected number of all-1-deceptive

blocks at the f“ deceptive place. We use the notation B(n,P, r) to denote the

probability of having n successes after P trials, when the success probability is r

for each trial. Then, the expected theoretical value of a§‘(t) is

82

1 P
61?“) zign-B(n,P,p3-z(t)),

(3.12)

when the crossover rate is “high enough”. Similarly, the expected theoretical

value of yf‘(t) is

7,-“(0 =%ZP:"'B("’P”"'7(’)) '

n=l

(3.13)

Then, in the model simulation, we have

M 3.2: The values of a? (t) after crossover and selection for high

enough crossover rates are given by Equation (3.12), where p? (t)

is given by (3. 9) .

Having found the values of 15:10), ale-:20) and 1.130) in the simulation, we
need to modify Equation (3.13), in order to find yf‘(t), where j =3(i-1), as

follows. Since the values of 615.110), ajiza) and a§i3(t) are already

determined, this means that we have only

13=nnn(P(1—a;:,(i)), p(i-a3t120», P(1—a§-i3(t))).
(3.14)

83

chromosomes left in the population which have the possibility of having 111-

deceptive blocks at their 1”“ deceptive places. Moreover, the probability p! (I)

would be modified as

pi’
pi7(t)_____ _
" Pk
(3.15)

where k is the index of a for which we get the minimum value in

{P(1— 075110)), P(1— a§12(t)). P(l — ale-:30»). Thus, the expected

theoretical value of y,“ (t) is now given by

1 P
r."‘(t)=;Zn-B(n P 1),-70>)
"—1 (3.16)

As a result, we have

3
3.3 After 014,10), 0"- +20) and a1+3( ) are determined by M

3. 2, the values of yf (t) are given by Equation (3.16), where P

and p 7(t) are determined by Equations (3. 14) and (3. 15),

respectively.

3.2.2 Mutation
The modeling of mutation is done exactly as in Section 2.2.2.

34

Thus we have

3.4 The mean allele after mutation is given by

csm CS 1 _ 2a§s (t)
aj(t+1)=aj (t) =aj (t)+ N

 

, where

B(n,P,pm)

u _M'b

After the values of aj(t+ 1) are determined as above, the values of yi(t+ 1)

are estimated from them by using the function G, which Is defined in Section

3.2.1,

 

G(P(1—aj+1)9 P(1_aj+2)’ P(1_aj+3))

(3.17)
3.2.3 Fitness Variance
Fitness variance by definition is
1 P L 2 2
[(2 (t) = — Zf(chromk) — Kl(t)
P k=l
(3.18)

In order to write the fitness of a chromosome more easily in the calculations,

define variables gi and a, as

_ 1if the i’” deceptive block is 111
' 0 othen~ise

85

a _ lif the I“ alleleis0
I 0 othenlvise

Then the fitness of a chromosome is

L/3
_ k k k k
f (chr omit) — 2(148 i 1‘ B(a3(i-1)+l 1’ a3(i—l)+2 1' a3(i—1)+3))'

l

(3.19)
Then,
P ,(ZEﬂChmmk) = F 223018; + mam-0+1 + “3(1—1)+2 + “3(j-l)+3))*
= 1,1

k k k 1.
(A81 "' B(a3(i—1)+1 1' a3(i—l)+2 1' “3(i-1)+3))
(3.20)

Separating the terms with i=j from the rest of the summation yields an

estimation of the above expression, given by

86

1 P L/3
— 2 f (chromk )2 2'— E[Azyi + (33?;-
P k=1 i=1
2
+ (23) a3(i—l)+l 'a3(i—1)+2 '(1‘ a3(i-l)+3)
2
+ (23) a3(i-l)+1 ' (1 ‘ a3(i—1)+2) ' a3(i—1)+3
2
+ (23) (1" a3(i—l)+1)'a3(i—l)+2 'a3(i—l)+3
2
'1" B a3(,-_1)+1 ° (1 "" a3(i—l)+2) ' (1 _ a3(i-l)+3)
2
+ B (1‘ a3(i-1)+1) ' ash-1H2 '(1‘ a3(i-l)+3)

2
+ B (1 — a3(i—1)+1) ° (1 " “3(1-1)+2)' ash—n+3]
L/3
+ E[(B(a3(i—l)+l 1' “3(1—1)+2 + a3(i-1)+3)+ Ai’i)*

tat j
(3 (ax j—1)+l + a3( j—1)+2 + a3(j—1)+3)+ A7 j )1
(3.21)

In the estimation above, 5,- is the estimated percentage of deceptive blocks

which are 000 at the f" deceptive place.
As a result, we have

M 3.5 The estimation of fitness variance for high enough crossover

 

rates is given by Equation (3.18), where the first term is estimated
using Equation (3.21 ).

87

3.2.4 The Algorithm of the Model

Diagram 2, page 89, shows the flowchart of the algorithm that is used in the
simulation of the model developed in the previous sections. We apply this

algorithm several times (~10 times). The estimations of mean fitness and fitness

variance are found by taking their averages over these runs. Then, a j (t) and

79-0) values of each run are used to find A? 0) and A; 0) values, h = 0, 0.1,

0.2, 0.9, whose averages, in turn, give us the average Ah 0) evolutions.

88

Diagram 2 The Flowchart of the Simulation Algorithm

 

Select a]. (O) ’s randomly, j = 1, ,2 , , L using binomial
distribution for each i, P selections with 0.5 success probability

Estimate 71(0) by Equation (3.10), i=1, , U3

 

 

 

 

Apply M 3.1, (page 78), to find K1 (0)

Apply M 3.5,(page 87), to find K2 (0)

 

 

 

 

For t = 1 to maximum generation number do the following |

 

 

 

 

 

For each n =1 to U3:

Estimate yg0) by Equation (3.10), n=1, , U3
Apply M 3.1.. (page 78), to find K‘f (t)

 

 

 

 

For eachi: 1 to US:

Apply M 3.2, (page 83), to find the values of a? (t) ,
forj={3(i—1)+l, 3(1-1)+2, 3(i—1)+3}
Apply M 3.3, (page 84), to find to find the value of new),

 

 

Apply M 3.1. (Page 78), to find K1“ (1‘)
Apply M 3.5.(page 87), to find 19“ (t)

Apply M 3.4, (page 85), to find mean allele, afsm (t) = or, 0+1),
after mutation

Apply M 3.3, (page 84), to find to find the value of yfsm 0),

Apply M 3.1, (page 78), to find mean fitness of the next generation
14"" (t) = x, (t +1) . Apply M 3.5.(page 87), to find the

 

fitness variance of the next generation K5” 0) = K2 0+ 1)

 

 

 

 

89

 

 

3.3 Weighted Deceptive Fitness Function

We define the weighted fitness function for an N-bit deceptive problem by

assigning weights both for each locus and for each N-bit deceptive block. Let’s

use the notation W]. for the weight of f“ locus, and v,- for the weight of the i“

deceptive N-bit block. Define variables g,- and a, as

= {1 if the i‘” deceptive block is all 1 's

81 .

0 othervwse
a __ lit the 1'“ alleleiso
I 0 otherwise

Then, the fitness of a chromosome for a 3-bit case, for example, would be given

by

f (chromk ) =
“3 k k k k -
E[Awigi 1' B(v3(i—l)+la3(i—l)+l 1' V3(i—1)+2“3(i-1)+2 1' V3(i-1)+3“3(i-1)+3)]

i

(3.22)

Figure 3.4 shows an example fitness calculations for the chromosome

111101100, with a 3-bit weighted deceptive function.

90

 

Fitness function for 3—bit weighted deceptive problem:

I l I §l o I $1 0 0 Fitness:
Allele weights ill| w, w,§w4 w5 w,’ £111., w, w, Av, + B(Wstl'wsdrwq)
Deceptive block V. V2 V,

 

weights

 

 

Figure 3.4 An example of the weighted ﬁtness function for a 3-bit deceptive
problem

For the mean ﬁtness, in this case, we have

M 3.1’ At any stage of the model the mean ﬁtness is given by

L/3 L
K1 =A2viy, +Bijaj.
r j=1

The Equations (3.6) and (3.7) would be changed as

 

2 f (chrom)

chrom e ’5: (t)

= P[Av,- + rcf (t) — Aviyf (t)— B(wj+1aj+1(t) + wj+2aj+2 (t) + wj+3aj+3 (0)]

where j = 30 — 1) ,
(3.23)

and

91

 

h 2 f(chrom) = P[Bwj + (Kf (t) — Aviyf (t) — Bwjaj (0)] ,
c rome a ({(t)

(3.24)

where i=int(j/N)+1.

These modifications give us the probabilities p?(t) and pf (t) as

p7 =
l
7:? [Avi +ch(t)—Aviyic(t)- B(Wj+laj+l(t)+Wj+2aj+2(t)+Wj+3aj+3(t))]
Kf

 

(3.25)

where j = 3(i — 1) , and

pa _ alewj + (Kf(t)- Aviyf(t)-Bwjaj(t))J
j — ,.

 

(3.26)

where i=int(j/3)+1.

3.4 Deceptive Function with Boltzmann Scaling

In Section 2.4, we have defined the Boltzmann scaling for two cases, one with

fixed ,6 and the other with ,6 adjusted inversely proportional to the standard

deviation of the fitness distribution. The development of the model of the

deceptive function with Boltzmann scaling is similar to Section 2.4. Using the

92

definitions of 123-'0) and p,-7(t) and the sets 7S3(t), 75(0), “SJ (t) and

“51} (t), in Section 3.2.1, the equation corresponding to (3.5) would be

Zeﬂﬂchrom)
chromEaSJU)

“0) -
pf _ Zeﬂwhmm) + Zeﬁ‘whrom)

chromeang) chromeaslj (t)

 

and

Zeﬂkhmm)
chromE’SHt)

7’ _
pi (t) _ Zeﬂkhmm) + Zeﬂkhmm) °

chromersfﬂ) chrom€753(t)

 

(3.27)

In the model, we need to estimate expected values of Zemdm’m),
chromeaSlj (t)

Zemmm'"), Zeﬁfwhm’") and Zeﬁ (Ch’om).When the crossover

chromeaSJ (t) dimmers: (t) chrome 756 (t)

rate is high enough, we have for the y counting

 

ﬁz f(chrom) = P[A + (Kf (t) — A7,? (t) - B(aja) + a 1+1 (t) + a 1+2 (t)))]

chrome 75(0)

 

 

(t) + all-+10) + aj+2(t))]

_ (a,
h =P KC —A F B 1
2f (c ram) [ 1(t) 7, (0+ 1-7f(t)

chrom e ’53 (t)

(3.28)

and for the a counting

93

 

2 f (chrom) =

chrom e “S((t)

P[xf (t) — Bar]- (t) + A(Common(1— a 1+1 (0,1 - a 1,2 (t))-— yf (0)]

 

Z f(chrom) = P[B + (xfo) -- Ayfo) — Ba 1(0)] .

chrome “SJ (t)

(3.29)

Then, for the model simulation we have the assumptions

M 3.6 The fitness distributions of the sets 756 (t) and "S11' (t) are

normal with the mean values given by Equation (3.28).

M 3.7 The fitness distributions of the sets “SJ (1) and “Slj (t) are

normal with the mean values given by Equation (3.29).

94

Chapter 4

COMPARISON OF THE MODEL WITH GA EXPERIMENTS

In this chapter, we explore the model for OneMax and deceptive functions
developed in Chapter 2 and 3, with various sets of GA parameters, and compare
the model predictions with the data obtained from GA runs. This way, we both
examine the GA behavior under different circumstances and also test the validity
of the model assumptions. The exploration of the OneMax function includes both
“high enough” and lower crossover rates, while for the deceptive function, we
have only the results for “high enough” crossover rates since the model for lower
rates is a subject for future research. In all the graphs of this chapter comparing
the model with the actual GA runs, the black lines represent the experimental GA
results and the thick gray lines represent the results obtained from our model.
For notation purposes only, in some of the graphs, we denote the fitness

proportional selection without the Boltzmann scaling as )3 = _1, although it is clear

 

that there is no such ,6 variable in this case.

95

4.1 OneMax Problem

4.1.1 Fitness Proportional Selection

In this section we apply the model developed in Sections 2.2.1 through 2.2.5.
The population size is 50 and the chromosome length is 100 genes. The
experimental results are obtained by averaging 100 runs of the GA. The “high
enough” crossover rate in our case is p6 = 300%, which means that 100%
crossover is applied 3 times in a row before the selection. This rate of crossover
is verified experimentally as “high enough" by observing that there is no
significant change in the graphs of Ah(t), K1(t) and K2(t) if a higher value of pc is
used. It can also be verified from Figure 2.6, since the correction weights, when
p0 = 300%, are all very close to 1. In Figure 4.2 and Figure 4.2, the time variation
of A, is shown for crossover rates of pa = 4%, 25%, 75% and 300%. We
observed in our experiments with the model that when the crossover rate is too
low, such as p0 = 4%, the statistics of correction weights taken only from the 5th
generation are not enough and we needed adjustment by using the statistics at
the 15th generation. Figure 4.2(a) shows the graph with this adjustment. For p6 =
25% and 75%, the statistics from only the 5th generation are used. The simulation
for pa = 300% is obtained by taking all cis as 1. In all four cases, no mutation is
applied — i.e., pm = 0. The graphs look quite similar to each other, except that
there is some variation in the value to which the lines converge as time
approaches 200 generations. We see that the limit value decreases from around
40 to 30 as pc increases from 4% to 300%. This slight decrease observed in the

experimental graphs is well captured by the model simulations. Such a change in

96

the converged values has a significant effect in the final fitness values of the

population as a whole.

In Figure 4.3, the time evolution of An is shown for two different mutation rates, in
both of which pc is kept constant at 50%. In the first case the mutation rate is very
low at pm = 0.1%, while in the second case it is pm = 2%. In both cases, the

model predicts the mean allele behavior very well.

97

153:1.)3 .I I.) ”.1.- /I

 

 

 

 

 

 

 

 

 

0 so 160 150 200
(b) t

Figure 4.1 Ah as a function of time for four different cases where the crossover
rate is 4 and 25 percent, respectively. There is no mutation. Black
lines are the experimental averages over 100 GA runs and thick
gray lines are the results obtained by model simulations. Population
size is 50 and the chromosome length is 100 genes

98

 

 

 

 

 

0 so 160 150 200

 

 

 

 

0 5b 160 150 200
(b) t
Figure 4.2 A, as a function of time for four different cases where the crossover
rate is 75 and 300 percent, respectively. There is no mutation.
Black lines are the experimental averages over 100 GA runs and
thick gray lines are the results obtained by model simulations.
Population size is 50 and the chromosome length is 100 genes

99

 

Ci

...,-Lu. {P .N. .41 J“

 

 

 

 

 

 

 

(a)

 

 

 

 

 

100 150 200
(b) t
Figure 4.3 Ah as a function of time for two different cases where the mutation
rate is 0.1 and 2 percent, respectively. The crossover rate is 50% in

both cases. Black lines are the experimental averages over 100 GA

runs and thick gray lines are the results obtained by model
simulations

100

The experimental results and the model estimations of mean ﬁtness for several
values of pm with crossover rates of p6 = 25%, 75% and 300% are shown in
Figure 4.4. The impact on the mean ﬁtness of changing pc from 300% to 25% is
more visible when pm is low, around 0 or 0.1%. The effect of higher mutation
rates dominates the dynamics of mean ﬁtness evolution, and decreases the
amount by which different crossover rates affect the mean ﬁtness. The estimation
of the ﬁtness variance when pa = 300% is shown in Figure 4.5 for various
mutation rates, together with experimental averages. In all these graphs, we see
that these dynamics of mean ﬁtness and ﬁtness variance are well captured by

the simulation of the model.

101

Figure 4.4

mean ﬁtness

 

 

 

(a)

 

mean fitness

 

(b)

 

mean ﬁtness

 

 

(c) 0 so 160 13to 260 2350 360
The mean ﬁtness for p6 = 25%, 75% and 300%. In each ﬁgure four
different mutation rates, pm = 0%, 0.1%, 1% and 2%, are shown.
Black lines are the experimental averages obtained by averaging
over 100 GA runs and thick gray lines are the results obtained by
model simulations

102

 

 

p -300%

25-
m ' p =0.02
: 20- «..., .
_g i" p =0.01
I... ‘ _3
g 15~-— ........ -— - ........
g “a!“
m 10 ------- 4., ---------------------------------
C ~ ..
E ‘ p =0.001

5 ____________ _- _ ........ E" ____________

.. ﬂ pm=0
00 5b 160 t 130 I 200 250

Figure 4.5 The ﬁtness variance for four different rates of mutation, pm = 0%,
0.1%, 1% and 2%, with crossover rate at 300%. Black lines are the
experimental averages obtained by averaging over 100 GA runs
and thick gray lines are the results obtained by model simulations

4.1.2 Boltzmann Scaling

In this section, we examine the model for ﬂ =0.1, 0.3 and 0.6 for both ﬁxed and
scaled ,6. The “high enough” crossover rate for Boltzmann selection is observed
to be pc 2 3000%, in other words, full crossover (i.e. 100%) is applied at least

30 times in the crossover operation.

In Figure 4.6 and Figure 4.7, we have the Ah curves. Figure 4.6 shows

the case when no mutation is applied. The ,6 values are ﬁxed at 0.1 or 0.6.

103

Figure 4.7 has the graphs for 0.1%, 1% and 2% mutation rates with ,6=0.1

selection pressure. The corresponding mean ﬁtness graphs are shown in Figure
4.9 and Figure 4.9. This ﬁgure also contains the mean ﬁtness curves for the

corresponding GA parameters when ,8 is scaled. The comparison of mean
ﬁtness graphs for ﬁxed-,6 Boltzmann selection for three different selection
pressures, ,6 = 0.1, 0.3, 0.6 , without any mutation are seen in Figure 4.10. The

corresponding ﬁtness variance graphs are seen in Figure 4.11.

When the selection pressure, ,6 is scaled, the Ah curves look as in

Figure 4.12. Lastly, the comparison of ﬁtness variance curves for different

mutation rates of ,6 = 0.1 and ,6 = 0.6 are seen in Figure 4.13.

104

 

 

 

 

 

Figure 4.6 Ah curves for ﬁxed ,6 Boltzmann selection. pm = 0, pc = 60. The
ﬁrst graph is for ,6 = 0.1, the second, for ,B = 0.6

105

100

pc=60 , pm=0.001 ,p=0.1

 

 

    

pc=60 . pm=0.01 .p=0.1

60

4O

20

 

 

o 10 20 30 40 so so
t
100
80- \
eo- \ \ pc=60,pm=0.02,ﬁ=0.1
<21: \-.
40»
20- “ \
0 A“ w—— _J
0 1o 20 so 40 so so

Figure 4.7 Ah curves for ﬁxed ,6 = 0.1 Boltzmann selection.

pm = 0.01, 0.1, 0.2 , p, = 60

106

1%! T 1 f

 

1m)

9.? pm=0.001

90y -

as.) / pm=0.01 l

80” .. 'l
l%-002

75—

70-

65*-

 

 

 

 

 

 

 

 

w% g=30ﬁ=01
55 ..
m L 1 L t 1 1 -1
0 10 20 30 40 50 60 70
t
gh=omn I
pm = 0.02 1‘
lh=0 (
£- 9
l
j
0 10 20 310 410 510 610 70

t

Figure 4.8 The mean ﬁtness graphs for ﬁxed- ,6 Boltzmann selection. The

graphs show mean ﬁtness for different mutation rates with ﬂ = 0.1
or ,6 = 0.6

107

 

1,110.52. .1. at». ..4
.. - /

 

 

 

 

 

 

 

T
l

 

 

 

 

 

85 .1 ___J 1 _____ L L 1 __._,..__
0 10 20 30 4O 50 60 70

Figure 4.9 The mean ﬁtness graphs for scaled— ,6 Boltzmann selection. The
graphs show mean ﬁtness for different mutation rates with ﬂ = 0.1

or ,8 =0.6

108

 

1 1.....unuyiehnﬁwvﬁmpw2 ”r
_-
.

 

1% l r I ﬁr

 

 

 

 

l = 0.3
100 ~ l
%» - ’ 3:011
[3 = 0.6 4
:2— ~
1
l
e‘o go

 

Figure 4.10 The mean ﬁtness graphs for ﬁxed ,6 Boltzmann selection for three

different selection pressures ,6 = 0.1, 0.3, 0.6 with pm = O

109

 

25 1 I I l I

 

 

 

 

 

Figure 4.11 The ﬁtness variance for ﬁxed ,6 Boltzmann selection for three

different selection pressures ,6 = 0.1, 0.3, 0.6 with Pm = O

110

1'? 333..-; .. mill s d
. .

,

 

 

Figure 4.12

ﬁ'iT f‘ ‘f‘ m

 

 

so
70
l pc=60, pm=0.001 ,p=0.1
so
<‘ 50 , \ \
40 x x \
w \ \
20‘} ‘ \‘\\
10‘ ‘\ ‘ 7_
0 ’M. F» J
o 10 20 so 40 so so
i
(3 = 0.6 , pm = 0.001
1007
w-
80,
7o
60
50
4o
30.
20
10‘ F
0 A ‘ , l
0 5 10 15 20 25 30

Ah curves for scaled- ,8 Boltzmann selection. Pm = 0.001, pc = 60.

The ﬁrst graph for ,6 = 0.1, the second ,6 = 0.6

111

1,.‘Hnw\‘ no Uri... h .A

,

,

 

25 T T j’

 

 

 

 

 

 

 

 

\
20» \e - 2
\ .
‘ ‘ __ pm=0.02
Isl \ '
‘ 4
‘\
\‘ ‘
N \ ‘ \
2
\
10* pm=0.01 l
pc=60 , p=0.1 _
5% p m - 0.001
O__ l J 1 A _1_
0 10 20 30 40 so so
i
(3 = 0.6
124-2;: 1 1 T r l 1
10» L
' it
8’ ‘
i.
(3:. ”it
N r
a: 6 a . "52-, ﬂ
4 1: - . w 2.222 2: 2
E ,2 15;“ 0.02
2 . a“. . . pm=0.01 2
.90... “'_‘::-‘_ J. __ - .: ______ .2__:--=_
= 0 =
0 pm \jm 000 - 2 . , . . ..
0 10 20 30 40 50 60 70
t

Figure 4.13 The ﬁtness variance for scaled ,6 Boltzmann selection for two
different selection pressures ,6 = 0.1 and ,6 = 0.6 showing each

selection pressure for different mutation rates

112

4.1.3 Weighted Fitness Function

In this section, we study the GA with two different weight proﬁles. We will call
them Proﬁle-A and Proﬁle-B. The graphs of the weight proﬁles as a function of

the gene locus are shown in Figure 4.14 and their equations are given by

—1§(i—1)+1.9 if ISiSSO
49

Proﬁle-A wi =4

 

%(i—51)+O.1 if 51sis100

L

1 if Isisso
Proﬁle-B w, = i—l ,
3—— If SISiSIOO

(4.1)

In Proﬁle-A, the weight increases from 0.1 to 1.9 linearly as we go from the
middle of the chromosome towards the ends. Hence the gene values in the
middle of the chromosome contribute less to the ﬁtness compared to the ones
near the ends of the chromosome. The maximum possible ﬁtness value of this

proﬁle is 100, which is obtained when all the alleles are 1.

Proﬁle-B is constant 1 for the ﬁrst half of the chromosome. The weights of
this proﬁle decrease from 1 to -1 linearly in the second half of the chromosome.
Note that Proﬁle-B takes negative values in the last quarter of the chromosome.
This means that having a 1 in this part of the chromosome causes the ﬁtness to
decrease. So, the GA would try to have 0’s instead of 1’s in the last quarter of the

chromosome. The maximum possible ﬁtness value of this proﬁle is 63, which is

113

obtained when all the alleles from locus 1 to 75 are 1 and from locus 77 to 100

are 0. The value of locus 76 does not matter in this case since its weight is 0.

First we study Proﬁle-A. The A}, curves for ﬁtness proportional selection and

Boltzmann selection with ,8 = 0.6 , without mutation are shown in Figure 4.15

and Figure 4.16. In Figure 4.15 (b) we also see the graphs for ﬁtness proportional
selection when the mutation rate is 2%. Figure 4.17 and Figure 4.18 have the
mean ﬁtness curves for several cases. In Figure 4.17, we see the graphs for
different mutation rates of the ﬁtness proportional selection. In Figure 4.18(a), the
ﬁtness proportional selection is compared with ,6 = 0.1 and ,6 = 0.6 of

Boltzmann selection with scaled ,6 . And, in Figure 4.18(b) ,6 =0.6 case of

Boltzmann selection is shown for mutation rates of 0%, 0.1% and 1%. The
corresponding ﬁtness variance curves of all the cases of are shown in Figure

4.19 and Figure 4.20.

When we consider Proﬁle-B, Ah curves look as in Figure 4.22 and Figure
4.22, where we see the curves for ,6 = 0.1 and ,6 = 0.6 with no mutation and
,8 = 0.6 with 1% mutation. Comparison of mean ﬁtness curves for ﬁtness
proportional selection and Boltzmann selection with ,8 = 0.1 and ,B = 0.6 is in

Figure 4.23. In Figure 4.24, the mean ﬁtnesses with Boltzmann selection

( ,6 = 0.6) for different mutation rates are shown for time > 9, to be able to show

the details of the graphs. Lastly, the ﬁtness variance curves comparing ﬁtness

114

proportional selection and Boltzmann selection with ,6 = 0.1 and ,6 = 0.6 are in

 

 

 

 

 

 

Figure 4.25.
2
six
1.5~ K
2" t»
.9 1.
.C
.9
3 0s
Ol-
-0_5 1 __ l__--2__.-. I 1.
0 20 40 so so 100
locus,i
Proﬁle-A
1.5 222

 

 

 

 

.°
cn

locus weights wi
s
01 O

 

 

 

20 40 60 80 100
locus, i

 

-1.5
0

Proﬁle-B
Figure 4.14 Proﬁle A and Proﬁle B for the weighted ﬁtness function;

115

 

 

r...1 ...r-Buh..,. .» . 5». id.
2 .

[3 = -1 , Proﬁle-A

 

 

 

 

250

(b)

Figure 4.15 Ah curves for Proﬁle-A with ﬁtness proportional selection, pm = 0

and pm =0.2

116

 

l ...IN.. M”
1-4...» ..a...
H1

 

B = 0.6 , p“ = 0 , Proﬁle-A

1m 7“

 

 

Figure 4.16 Ah curves for Proﬁle-A for scaled- ,6 Boltzmann selection.

withﬂ = 0.6 and pm =O,pc =60.

117

 

 

 

 

 

2.7
J- i
__l
45 l P _ l. l
0 50 100 150 200 250

Figure 4.17 Mean ﬁtness curves for Proﬁle-A for ﬁtness proportional selection

with different pm’s

118

 

 

pm = 0 , Proﬁle-A

 

 

 

 

5‘ ﬂ
[3 = -1 l
2
45 50 1(1) 1‘30 200
(a) t

102

 

100

932

96‘

92~

90*

 

88-

86 "

 

 

 

(b)

Figure 4.18 Mean ﬁtness curves for Proﬁle-A with Boltzmann selection, scaled
,B: (a) comparing different ,6 cases; (b) for ,6 = 0.6 with different

0

Pms

119

 

1.1.49.5...2 ulhltl. , A..

 

 

 

 

B = -1 , Proﬁle-A

 

 

 

 

 

35 'r——_~7 If”
I
30[ 2 H
A = 0.02
25l l 2 p25 .. w ”
. I _ £13,. h | V»
20- - '2 _ , .2 . ‘
N p = 0 01
2 . m '
15— , 4
102 pm = 0 d
5) 2
(a) t
pm = 0 , Profile-A
35 . T l

 

 

 

 

(b)
Figure 4.19 Fitness variance curves for Proﬁle-A: (a) for ﬁtness proportional

selection with different pm’s; (b) comparing different ,6 cases

120

- ;. li'e’IIl‘A“

 

 

B = 0.6 , p" = 0 , Proﬁle-A

T

35 l

 

30— _

25 2

20

l
l

 

10~

 

 

 

 

20 ' 30 40 50 so 70

Figure 4.20 Fitness variance curves for Proﬁle-A for )6 = 0.6 with different

pm ’s. Boltzmann selection is with scaled ,6

121

_L-A:

1'1.

 

 

B = 0.1 , pm = 0 , Proﬁle-B

 

 

 

(a)

 

 

 

 

(b)

Figure 4.21 Ah curves for Proﬁle-B with scaled- ,6 Boltzmann selection.

(a) ,6 = 0.1 , no mutation; (b) ,6 = 0.6 with no mutation

122

 

% ilh’l-u“ I» t. Ir724w

 

B = 0.6 , pm = 0.01 , Proﬁle-B

1CD

 

 

Figure 4.22 Ah curves for Proﬁle-B with scaled-,8 Boltzmann selection.

,6 = 0.6 with 1% mutation. In all cases pc = 60

123

 

 

 

pm = 0 , Proﬁle-B

 

65 l

 

60 +— __ ' .2

55

T
l

50
45 if“
40 ~ l3 2 _1 —
35 » 2
30

252 *

 

 

 

Figure 4.23 Mean ﬁtness curves for Proﬁle-B, showing ﬁtness proportional
selection and Boltzmann selection with ,8 = 0.1 and ,6 = 0.6

124

 

 

 

 

 

Figure 4.24 Mean ﬁtness curves for Proﬁle-B. Boltzmann selection with

,6 = 0.6, and different mutation rates

125

 

H.13031l1l. A

 

pm = 0 , Proﬁle-B

 

 

 

 

 

Figure 4.25 Fitness variance curves for Proﬁle-B, showing ﬁtness proportional
selection and Boltzmann selection with ﬂ = 0.1 and ,6 = 0.6

126

4.2 Deceptive Function Problem

In this Section, we apply the GA to a 3-bit deceptive function. We study only the
case where the crossover rate is high enough. The chromosome length is
chosen as L = 99, and the population size is P = 50. So, we have 33 deceptive
blocks, each of which is 3 bits long. In all the cases studied in this section, the
values of A and B are chosen as A=4 and 8:1. Note that the maximum possible
ﬁtness for a chromosome, when the ﬁtness is not weighted, is

A x 33 = 4 x 33 = 132, which is achieved when all the alleles are 1.
4.2.1 Fitness Proportional Selection

We apply the model developed in Sections 3.2.1 through 3.2.4. In Figure 4.26,

we have the curves for both the evolution of alleles with zero value, A2, and the

evolution of the deceptive blocks, A7 , h =0,0.1,...,0.9, when the ﬁtness
proportional selection is applied. The mutation rate is zero and the crossover rate
pc= “high enough”. In Figure 4.27, A: and A; curves are shown when the

mutation rate is 0.1%. The graphs of mean ﬁtness and ﬁtness variance are

shown in Figure 4.28 comparing cases with different mutation rates.

127

 

 

020406080100120140160100200
t

 

 

l _ L

o l 1 L l_, 1 1 -' i
0 20 40 60 80 100 120 140 160 1K) 200

t

 

Figure 4.26 A: and A}; curves for ﬁtness proportional selection. pc= “high

enough”, pm = 0. Black lines are from GA run, gray lines are from

model simulation

128

 

s=-1,qn=0.001

 

 

 

Figure 4.27 Af,’ and A; curves for ﬁtness proportional selection with mutation

rate pm = 0.001, and pc = “high enough”

129

 

 

 

 

 

 

 

 

 

 

250
B = -1 F
as ~ j x F ‘- - a
i ,2 m“ "h '6 ”1 '
m \
p =0.02
25~
N
M
20~ ‘ 1
152 1 j
- 155.7% 3,.”
10- ‘ I J
a. pm = 0.001
5»—
pm=° ‘ ‘
0 2 J I l A
0 so 100 150 200 250

Figure 4.28 Mean ﬁtness and ﬁtness variance curves for ﬁtness proportional
selection. pc = “high enough”, pm = 0 , 0.001 ,and 0.2. Black lines

are from GA run, gray lines are from model simulation

130

 

4.2.2 Boltzmann Selection

Application of scaled- ,6 Boltzmann selection with 0.1 and 0.6 selection pressures

to our deceptive function yield A}? and A; curves in Figure 4.29 and Figure

4.30, in which cases the mutation rate is taken as zero. Comparison of mean
ﬁtness and ﬁtness variance for selection pressures of 0.1, 0.3 and 0.6 are in

Figure 4.31. Lastly, in Figure 4.32, we have the mean ﬁtness and ﬁtness variance

curves for ,6 =0.1 Boltzmann selection with four different mutation rates, pm =0,

0.001, 0.01 and 0.02.

131

 

l

'12.:

v

we“... N..;.5,\,.:d.

 

120

 

 

 

120

 

Figure 4.29 A: and A; curves for selection with Boltzmann scaling. ,6 =0.1

pc = “high enough”, pm = 0. Black lines are from GA run, gray

lines are from model simulation

132

 

 

«'2 :3. :: :e'zla’ :3

 

 

 

 

20 25 30

 

Figure 4.30 A}: and A; curves for selection with Boltzmann scaling. ,B = 0.6

pc = “high enough”, pm = 0. Black lines are from GA run, gray

lines are from model simulation

133

 

almmvﬁwb. «t .92 2‘22 ’3
2
z
r
.. . .
v.

120

 

 

 

110

100-

70

 

 

 

120

 

35
30 ._
25 -
20 -
15 —

10~

 

 

 

120

 

Figure 4.31 Comparison of mean ﬁtness and ﬁtness variance curves for

selection with Boltzmann scaling for )3 :01, 0.3 and 0.6. Pa =

“high enough”, pm = 0.

134

 

115 ____@,,_2 —~v—— r . r 1

110

1(5

1GP

8

 

 

 

 

 

 

 

 

B = 0.1
40 A I f I I I III
. . _.-_ :.- -.- .___.'_I..I-_- I {F'- ,' V: "l 22,222.21, ‘ VT
35- 22..-... ﬁ —00 I
"v I pm .. I
30 2 -.-.-_I II
I
is I
...1‘
225‘
N “:32 ‘ — d
15 r III‘
2‘“ .
10 r- ‘=. I I
a ‘se,
5 .2 ‘IIIIII I
- h ‘ {7:23.
pm 0 922?“? '3
M JTS~ H. l......_.._
0 1 1 1 AJMW‘J‘F—
0 20 4O 60 80 1a) 120
t

Figure 4.32 Comparison of mean ﬁtness and ﬁtness variance curves for

selection with Boltzmann scaling for ,6 :01. pa = “high enough”,

for different mutation rates

135

Chapter 5

CONCLUSIONS AND FUTURE WORK

5.1 Conclusions

In this thesis, we have developed a new and very practical model for the GA
dynamics of the OneMax problem and for a form of deceptive function problem
that is described in Section 3.1. The model can be used to ﬁnd the mean allele
and, in the case of deceptive function, mean all-1 deceptive blocks values. Then,
it predicts mean ﬁtness and ﬁtness variance of the population. Although the
problems for which our model proved successful were rather simple or idealized,
they were often sufﬁciently involved to capture interesting nontrivial features of

the GA dynamics.

An attempt to develop a stochastic differential equation model to predict
evolution of ﬁtness distribution for the OneMax problem is presented in Appendix
A. Although our attempt was not successful, it shows a strong connection
between ﬁtness evolution and certain diffusion equations and points toward the
possibility of the existence of a diffusion-type equation that could describe the

OneMax dynamics.

136

 

2r >Na..1.l 1‘14...”

 

 

The model can be applied to these benchmark problems even when the
crossover, mutation and the selection rates (in the case of Boltzmann scaling)
are changed at predetermined generations during a GA run. Because of this
capability of the model, it is unique, to the best of the author‘s knowledge, among

the current models of the GA.

The method of building blocks for modeling parallel genetic algorithms is
applied by Cantﬂ-Paz [2001] in the case where the migration occurs only when
all the populations are converged. Since our model estimates the mean value at
each locus at any generation, it can be used to determine a suitable migration
time as well as the migration rate for parallel genetic algorithms (in the island
model case) when migrations are allowed at any generation for our benchmark

problems.

The OneMax model, Diagram 1 (page 65), for modeling the case of typical
crossover rates, uses some statistics of early generations of the GA in order to
predict the rest of the evolution. The simulation results in Section 4.1 show that
the model describes the GA dynamics for the OneMax problem very well for
different crossover and mutation rates with ﬁtness proportional selection. The
correction weights introduced by Equation (2.20) are a new way of analyzing the
crossover operator, and they work very well for two-point crossover, in our GA

problem.

Note that our OneMax model for “high enough” crossover rates is covering

a different case than the statistical mechanics model of Priigel-Bennett and

137

Shapiro [1994]. The maximum entropy assumption of Priigel-Bennett and
Shapiro essentially models a situation in which the crossover operator is
assumed to be effective enough to allow a relocation of the alleles which is
probabilistically most likely to occur, under the constraints of the given mean
ﬁtness and ﬁtness variance, when the alleles move freely. On the other hand, our
model of “high enough” crossover rates does not assume any constraint in
relation to how much the alleles can be mixed. The lower crossover rates are

modeled relative to this extreme case using correction measures.

Extension of the OneMax model to the weighted ﬁtness function is
described in Section 2.3. The graphs of the model simulations, Section 4.1.3,
show that the model is tested for weight Proﬁles A and B and the results are in
excellent agreement with the GA results. The author believes that the model is

effective for any weight proﬁle.

Application of the OneMax model, modiﬁed as described in Section 2.4, to
the Boltzmann selection gives very good estimations of this case for both the
ﬁxed and the scaled selection pressures, in the case of “high enough” crossover

rates.

The model for the deceptive function, Diagram 2 (page 89), is quite powerful in
predicting the dynamics for both ﬁtness proportional and Boltzmann selections.
This model is unique, to the best of the author’s knowledge, in predicting the

dynamics of a deceptive function to such an extent. It estimates the simultaneous

138

evolution of both the mean allele values and the mean all-1 deceptive blocks, as

well as the mean ﬁtness and ﬁtness variance of the population.

5.2 Future Work

One immediate improvement needed in our OneMax model is to modify it to
include the lower crossover rates with Boltzmann selection. It is observed by the
author that the correction weights, as deﬁned in Section 2.2.4, are not by

themselves sufﬁcient to predict the Boltzmann selection behavior since they-

graphs, Figure 2.6, of this case change signiﬁcantly as a function of time, unlike
the ﬁtness proportional selection case in which the statistics from generation 5,

for example, would be sufﬁcient to predict future behaviors.

Our model of the deceptive function also needs to be developed further to
include lower crossover rates. In this case, one would need two sets of correction
weights, one for modifying Equation (3.8) and the other for Equation (3.9). Then,

the correction weights are going to be a function of two variables, a and 7.

The future work to improve the model would also include the estimation of
the ﬁtness variance for lower crossover rates, and an investigation of the

predictive power of the model in the presence of external noise.

The next step in developing our model towards more complex ﬁtness
functions is to consider a ﬁtness function which is made up several OneMax and

deceptive parts as shown in Figure 5.1. Here, the chromosome has many

139

 

 

 

 

subdivisions, possibly ovenapping, and the ﬁtness of the chromosome is found
by adding up the contribution of each part, which is found by applying either a
(weighted) OneMax or a (weighted) deceptive function. It is speculated that a
model for such a GA problem might be established by “pasting together” models

developed in this thesis.

 

¢ chromosome 2

I l l loul II J

Lounting I's S-bit deceptive 41—bit deceptive “W““Q ‘3 34’“ de‘iV‘

 

 

 

 

 

 

 

 

Figure 5.1 A complicated ﬁtness function a composition of several OneMax
and deceptive parts

Once a model is developed for a GA problem as described in Figure 5.1,
the next step might be to consider a real-life problem and try to represent it with a

ﬁtness function that looks like Figure 5.1.

A different direction to follow in developing a model for complex real-life
problems is as follows. Note that, when a ”snapshot” of a GA is examined at
some instant, from the point of view of the crossover + selection operators, in
relation to determining the next mean allele (or mean all-1 deceptive block)
values, any GA problem looks as if it is a weighted OneMax problem at that
instant. In other words, any GA problem, in theory, can be seen as a series of
weighted OneMax problems, the weights of which change dynamically from
generation to generation. So, if one can estimate such a weighted OneMax

decomposition of a real-life problem, it should be sufﬁcient to predict the

140

dynamics the real-life problem by applying the OneMax model developed in this

dissertation with ﬁtness weights changing as a function of time.

141

APPENDIX

A COMPARISON OF THE ONEMAX PROBLEM WITH A

DIFFUSION MODEL

In this section we will study the diffusion equation

Lg ’) ..— .%[A, (2) f(x,t)]+ ggp, (x) f(x,t)] (M)

where
2 2
A1(x)= x(l—%lnx)|:a+%—(l——l—lnx]—g—] (A2)

and

2
A2 (x) = 02x2[1— imac) (A.3)
_ K
with initial condition
lim f(x, t) = 6(x — x0) (A.4)
t—->0

and explore its relationship to the evolution process of the one-max problem. The

constants are chosen such that 0', K > O and 0 < a < 1 .

142

Equation (A.1) appear in the study of population growth processes in
random environments [Capocelli,Ricciardi,1975], [Ricciardi,1977, 1985]. If we

consider the equation
ﬂ = ax(l — ilnx) (A.5)
K

as modeling a growth process with fertility rate a, then 2" becomes the
asymptotic population size. Changing a into a+ A(t), where A(t) is a white

2
a
noise with intensity —2—, the resulting stochastic equation becomes

ézax(l_-1_lnx)+X(l—ilnXJA(t) - (A6)
dt K K

Equation (A.1) is the Fokker-Plank equation, in other words the forward equation,

for the transition probability density function, f (x, t), of the stochastic process
described by Equation (A.6). Here, A1(x) and A2 (x) give the inﬁnitesimal drift
and variance of this diffusion process, respectively. From Equations (A2) and

(A3) we see that the diffusion interval is (0,eK ).

We can also write the so-called Kolmogorov or the backward equation for

(A.6) as

143

af( 2 x)_ 62f
T2: A(x og—l 2.2.2., 2;. (Ar)

   

 

In this equation, the initial condition, x0, of the forward equation is considered as

another variable of the function f . The solution of (A.7). or equivalently of (A.1),

can be obtained by applying the transformation

 

" d
(r/(t,x)=I :1 -at,
u[l——lnu]
K
3?=i//(t,x),
3'0 =V(t,xo),

to Equation (A.7). Deﬁning 7 as

x0) , (A.8)

 

f(xtl~o)= [L5 ’0] flat

Equation (A?) is transformed into the well known Wiener process described by

the equation

       

67(“3 '“ )= 627
at 52,2

 

 

with the solution in the form of a Gaussian function

144

 

~~

2
76,435): (4m)‘“2exp[-— 15%] . (A.9)

By (A.8) we obtain

r

2 .
[K 1n(__—II((_ 13:0] - as]
-exp<— - x , > . (A.10)

K
f(x’t) 2 2(K —1nx)x[nrt 47f

 

 

 

 

where x takes values between 0 and ex.

Figure (A.1) shows the graph of this solution for different time values. The
constants in f are taken as a =1.5, 0'2 = 0.4, and K = 4.6052. For any ﬁxed
value of t, say t = t1, the function f (x,t1) gives the probability density function

for the value of population after t1 time units from the beginning. In particular

Tﬂxetlﬂ" = eIf(x,tl)dx =1-

145

K=4.6052, a=1.5, 0 2=0.4

 

 

 

0.9 2 i a 2 2 .
t=10
0.sL -
0.7— -
i=9
0.s~ 2
Oﬁr i=8
"' 4» 2 2
{/0 t 7’
t=s
t=s ,
0.2 {:4 -<
_ ’
initial distribution (=1 l- 2 '
'0’! i ‘
2 J J l__ ._L2_.22___ l ___.._ l __2
30 40 50 so 70 so 90

 

 

Figure A.1 Evolution of transition probability distribution, f (x,t)

 

On the other hand, in Figure (A.2) we see the evolution of ﬁtness

distribution for the OneMax problem with a population size of 50, (mutation rate

0.001 and Boltzmann selection with ,6 =0.3, in which two-point crossover is

applied). The ﬁgure shows the ﬁtness graphs for every second generation from 0,

180. For each generation the area under the curve remains equal to 1 as in the

case of Figure (A.1). When we compare Figure (A.1) with Figure (A.2) we see a

remarkable similarity between two ﬁgures.

146

In Figure (A.3), the initial distribution is chosen somewhat similar to the
artiﬁcial migration case CASE2 of in Section 1.2.3. The solution of the differential
equation for this initial condition is shown together with the solution with regular
initial condition. The comparison shows a similar relationship as seen in Section

1.2.3 between CASE1 and CASE2.

chrom slze=100, pop size=50, margari=200, maxruns=500, beta=0.3, pm =0.001

 

 

          
    

   

 

    
 
            

 

 

 

 

 

0.9 I I I I I I
0.8 # gen 180 2
0.7 2
,5
§ 0.6 2
i5 2
.3‘ 0.5
a
C
8 0.4 2
E 2...,
0.3 2 9.080
“- am 60 .,
- 9’" 50 .1] I
g 02 2 gen 20 Ben 309°" 40 ‘r‘v/‘l‘l‘rﬁfr'hlalsI‘l/l
n v « 'i'o'v'v's’té'i’i'igoﬁtf'o’rlli.llWl l
'0 O 99'6’0’0:0,’0,'¢l’i)09’f’i3(l WIN:
0.1 b i99¢6$ttttto,iit,ltj( My,
9” ’I’Q’QO’W 0”.th i‘h‘.‘\ \‘ll:1\‘\
I; 36,424,923;3.393.242.3222:-Ml.10:-t\»
O *- 5 ’ ' A 2 o. .._l. at- -7 «x.» -. '-.~\ IIII‘. ~. . K
I l 1 l i I
30 40 so 60 70 so 90 100

Figure A.2 Evolution of ﬁtness distribution for 180 generations of OneMax
problem

147

 

 

0.12_"——w "T 1” *TTF "y i I l T

0.1»-

0.08

0.06

0.04

o(F), Fitness Density Distribution

0.02

 

 

 

9;

 

Figure A.3 Evolution of two transition probability distributions, f(x,t). For the

dark curves, initial condition is given to resemble a migration case

Although our attempts to determine suitable A1 and A2 and ﬁnd an

evolution equation in the form of Equation (A.1) for the ﬁtness distribution failed,
the similarity between two evolution graphs suggests that with a suitable
modiﬁcation to Equation (A.1), one might be able to obtain a differential equation
which models the evolution of the ﬁtness distribution for the one-max problem of

genetic algorithms.

148

BIBLIOGRAPHY

Buyukbozkirli, B., Goodman, E.D., A statistical model of GA dynamics for
OneMax and deceptive functions, Proceedings of the Genetic and
Evolutionary Computation Conference (GECCO), Seattle (2004)

Cantu-Paz, E.: Efﬁcient and Accurate Parallel Genetic Algorithms. Kluwer
Academic Publishers, (2001)

Capocelli, R.M., Ricciardi L.M., A note on growth processes in random
environments. Biol. Cybernetics 1 8 (1975), 105-109.

Furutani, H., Schema Analysis of OneMax Problem: Evolution equation for ﬁrst
order schemata. Foundations of Genetic Algorithms (FOGA) 7, (2002)

Goldberg D. E., Genetic Algorithms in Search. Optimization & Machine Learning,
Addison Wesley (1989)

Goldberg, D.E., Deb, K., Thierens, D.: Toward a Better Understanding of Mixing
in Genetic Algorithms. Journal of the Society of Instrument and Control
Engineers, (1993), 32(1), 10-16

Goldberg, DE: The Design of Innovation. Kluwer Academic Publishers, Boston,
Dordrecht, London, (2002)

Harik, G., Cantu-Paz, E., Goldberg, D., & Miller, B. L. The gambler’s ruin
problem, genetic algorithms, and the sizing of populations. Evolutionary
Computation, 7(3), 231 -253, (1 999)

Holland J.H., Hierarchical descriptions of universal spaces and adaptive systems,
Technical Report ORA Projects 01252 and 08226, University of Michigan,
Department of Computer and Communication Sciences (1968)

Holland J.H., Adaptation in natural and artiﬁcial systems. Ann Arbor: University of
Michigan Press (1975)

Nix, A.E., Vose, M.D.: Modeling Genetic Algorithms with Markov Chains. Ann.
Math. Art. Intell., (1991), 5:79-88

149

 

Ozpineci, B., Tolbert, L.M., Chiasson, J. N., Harmonic optimization of multilevel
converters using genetic algorithms. 35‘" Annual IEEE Power Electronics
Specialists Conference, Aachen, Germany (2004)

Poll, R., Exact schema theory for genetic programming and variable length
genetic algorithms with one point crossover. Genetic Programming and
Evolvable Machines, 2(2): 1 23—1 63, (2001 )

PriigeI-Bennett, A., Shapiro, J.L.: An Analysis of Genetic Algorithms Using
Statistical Mechanics. Phys. Rev. Lett., (1994), 72(9): 1 305-1 309

PriigeI-Bennett, A., Shapiro, J.L.: The Dynamics of a Genetic Algorithm for
Simple Random Ising Systems. Physica D, (1997), 104:75-114

Priigel-Bennett, A., Modeling ﬁnite populations. Foundations of Genetic
Algorithms (FOGA) 7, (2002)

Ricciardi L.M., Diffusion Processes and Related Topics in Biology. Lecture Notes
in Biomathematics, 14, Springer-Venag (1977)

Ricciardi L.M., Stochastic Population Theory: Diffusion Processes. (1985)

Rattray, L.M.: Modeling the Dynamics of Genetic Algorithms Using Statistical
Mechanics. PhD thesis, University of Manchester, Manchester, UK,
(1996)

Stephens C.R., Waelbroeck H., Schemata evolution and building blocks.
Evolutionary Computation, 7(2): 1 09-1 24, (1999)

Suzuki, J., A Markov chain analysis on simple genetic algorithms. IEEE
Transactions on Systems, Man and Cybernetics, Vol. 25 No.4, April
(1995)

Vose, M.D., Liepins, G.E.: Punctuated Equilibria in Genetic Search. Complex
Systems, (1991), 5:31-44

Vose, M. D., The Simple Genetic Algorithm, Foundations and Theory. The MIT
Press, (1999)

Whitley D., A Genetic Algorithm Tutorial, Statistics and Computing, vol. 4, pages
65-85 (1994)

150

Wright, A. H., Rowe, J.E., Stephens C. R., Poli, R., Bistability in a Gene Pool GA
with Mutation. Foundations of Genetic Algorithms 7, FOGA, (2002)

Wright, A., Poli, R., Stephens, C.,Langdon, W.B., and Pulavarty, S. An
Estimation of Distribution Algorithm Based on Maximum Entropy.

Proceedings of the Genetic and Evolutionary Computation Conference
(GECCO), Seattle (2004)

151

   

I1111111111111111