. .8 t:
.a'

c; m

x,
Pr 9’:

 

‘ .
ﬁwaﬁ
.. In:
131...;
.35....
e
.hé.
\«Ww
:
'15:! S
{$39.5 :2,

. I:
v.... s...

4...? I:

“fies ‘
.: 7.2km?

 

 

W I"—vo\.,

3000

LIBRARY
Michigan State
University

This is to certify that the
thesis entitled

EVOLUTIONARY DYNAMICS OF 30 DIGITAL CONSTRUCTS

presented by

JASON MICHAEL STREDWICK

has been accepted towards fulﬁllment
of the requirements for the

Master of degree in Computer Science and
Science Engineering

 

Majyﬁrofessor’s Signature
l?-€-Q5

 

Date

MSU is an Afﬁrmative Action/Equal Opportunity Institution

 

 

PLACE IN RETURN BOX to remove this checkout from your record.
TO AVOID FINES return on or before date due.
MAY BE RECALLED with earlier due date if requested.

 

DATE DUE

DATE DUE

DATE DUE

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

2/05 p:lClRC/DateDue.indd-p.1

EVOLUTIONARY DYNAMICS OF 3D DIGITAL CONSTRUCTS
By

Jason Michael Stredwick

A THESIS

Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of

MASTER OF SCIENCE
Department of Computer Science and Engineering

2005

ABSTRACT

EVOLUTIONARY DYNAMICS OF 3D DIGITAL CONSTRUCTS
By
Jason Michael Stredwick

Convergence and stagnation are two of the biggest obstacles to the development
of open-ended, evolving systems. These issues have received much attention in both
traditional biological research and in the ﬁeld of Evolutionary Computation. Here I study
their effects on evolution in 3D virtual worlds, with a focus on the evolving population’s
ability to continuously produce new, nontrivial survival strategies. I pay particular
attention to measures of convergence, stagnation, and ﬁtness, which provide an avenue
for comparison among distinct evolving systems. Toward this goal, I designed and
utilized an evolutionary system called WhirlingDervish that evolves 3D morphology and
behavior to examine these issues.

Convergence and stagnation are clearly related qualities. I deﬁne convergence as
the reduction in genetic diversity among individuals in a population, and stagnation as the
population’s lack of progress towards the evolutionary goal over a period of time. The
process of measuring these qualities in a population draws upon methodologies
commonly used in biology and evolutionary computation, including measures of genetic
diversity and both genetic and phenotypic variation. I test how informative these
methodologies are toward approximating the level of continued evolution in
WhirlingDervish and make use of their results to compare potential ﬁtness functions.
Finally, I look for correlations between these measurements and the structural and

behavioral qualities of individuals in the ﬁnal population.

ACKNOWLEDGEMENTS

I would like to thank Dr. Charles Ofn'a for taking me on as his student,
introducing me to the fascinating area of computational evolution, and suggesting the
idea for the WhirlingDervish system. I also appreciate all his time and guidance
throughout my learning experience. Most of all, I really appreciate his patience.

I would also like to thank the other members of my committee, Bill Punch and
Eric Tomg, for their time and effort, and their patience with me as I attempted to ﬁnally
set a date for my defense.

Thanks to Matthew Hall for all his support and insight.

Many thanks to all of the people in both the Ofn'a and Lenski labs for all of their
help and support. To Dehua Hang for all of the problem discussions, and helping me
make the most out of my time (I had a lot of fun). Wei Huang gave me a lot of support
during my time as a Masters student along with useful feedback on this thesis and many
other problems I tackled. I would like to thank Kaben Nanlohy for introducing me to QT
and all of the discussions about interesting and complex data structures and coding
practices. I also appreciated the feedback on the WhirlingDervish system ﬁ'om Art
Covert, along with his feedback on my thesis. Finally, I would like to thank Brian Baer
for all of his efforts to keep our systems running smoothly and his extensive system
administration knowledge.

The CSE department and the National Science Foundation (Grant DEB-9981397
to Lenski, Adami, and Riley) for each provided me with three semesters worth of

funding, which made it possible to complete my Masters degree.

iii

Finally, I would like to thank my wife, Kristina Stredwick, for all of her support
and encouragement. I would also like to thank her for all of the time she spent discussing
the many nuances of evolutionary theory with me and helping me bridge some'of the

gaps between biological and computational evolution.

iv

TABLE OF CONTENTS

 

 

LIST OF TABLES - - -- _ - VII
LIST OF FIGURES -- - -- - .. .VIII
1 INTRODUCTION - _ - - - 1

 

2 WHIRLINGDERVISH, A SYSTEM FOR EVOLVING 3D MORPHOLOGY

 

 

 

 

 

 

 

 

 

 

 

AND BEHAVIOR . -- - - 5
2.1 BACKGROUND - 6
2.2 THE VIRTUAL WORLD 11

2.2.1 The Environment _ - 13
2.2.2 Point-mass Ph sics -_ -- 15
2.2.3 Spring Physics 18
2.3 EVOLUTIONARY STRUCTURES AND PROCESSES 22
2.3.1 Fitness Scheme - 23
2.3.2 Genetic Language -- - 24
2.3.3 Replication and Mutation - - 27
2.4 RUNTIME ABILITIES - - - -- 30

3 ANALYZING CONVERGENCE AND STAGNATION THROUGH DIVERSITY

 

 

 

 

 

 

 

 

 

 

 

 

 

AND VARIATION MEASURES - - - 32
3.1 DEFINITIONS AND METHODOLOGY 34
3.1.1 Deﬁnitions - - 34
3.1.2 Population Diversity ---------- -- -- - 36
3.1.3 Variation ..... - - -- - 37
3.2 DIVERSITY AND VARIATION MEASURES 41
3.3 EXPERIMENTAL DESIGN - 43
3.4 RESULTS AND DISCUSSION - ..... - 45
3. 4. 1 Convergence ..... 45
3.4.1.1 Genetic Diversity -- ........... -- 45
3.4.1.2 Genetic Variation _- -- 51
3.4.1.3 MRCA Variation - 53

3. 4. 2 Stagnation 5 7
3. 4. 2. 1 Fitness Improvement and Morphological Convergence -- 57

 

3.4.2.2 Population Diversity Based on Various Phenotypic Traits ................ 62

64

 

3.4.2.3 Average Inferiority and Fitness Distributions
3.5 CONCLUSIONS

69

 

4 UNDERSTANDING THE FITNESS FUNCTION

72

 

4.1 EXPERIMENTAL DESIGN --
4.2 RESULTS AND DISCUSSION -

75
76

 

4.3 CONCLUSIONS

82

 

5 STRUCTURE AND BEHAVIOR CASE STUDIES

--84

 

5.1 CASE STUDY SETUP-

..... 86

 

 

5.2 RESULTS AND DISCUSSION -
5.2.1 Description of the Most-F it Individual in Each Run

88
89

 

5.2.2 Variation Within the Final Population

95

 

5.2.3 Alternative Method For Measuring Genetic Diversity
5.3 CONCLUSIONS

101

 

106

 

6 CONCLUSIONS AND FUTURE WORK

108

 

A INSTRUCTION SETS

112

 

 

B THE CONSTRUCTION OBJECT -

BIBLIOGRAPHY

117

119

 

vi

LIST OF TABLES

Table 2.1. Mass and ﬁictional coefﬁcients. ...................................................................... 15
Table 2.2. Spring Constants .............................................................................................. 19
Table 2.3. Mutation rates .................................................................................................. 29

Table 4.1. A chart of the ﬁtness components used in the ﬁtness ﬁmction for each set of
runs. The left hand column corresponds to the ﬁtness factor labels given in Figure
4.1, where the Base ﬁtness function uses all of the factors. ..................................... 76

Table 5.1. Generation of the most recent common ancestor ............................................. 99
Table 5.2. Richness measures for all six runs. Column two is the original genetic
richness from Chapter 3. Column three is the genetic richness using condensed

genomes, and column four is the phenotypic richness in terms of ﬁtness. ............. 103

Table 5.3. The dissimilarity and Similarity measures are given for each run along with
their 95% conﬁdence interval. ................................................................................ 104

Table 5.4. The dissimilarity and similarity measures are given for each run including their
95% conﬁdence interval. ........................................................................................ 104

vii

LIST OF FIGURES

MES in this thesis/dissertation are resented in color.”

Figure 2.1. An illustration of spring ﬂuctuation. The line between the two red points is
the spring at rest length. The dashed lines represent the range of possible lengths
through which the spring can move. The ﬂuctuation utilizes a sine wave for the
transition rate over time, where the maximum possible length change is twice the
maximal multiplier times the delta multiplier. .......................................................... 21

Figure 3.1. Genetic diversity using the following measures (A) richness, (B) entropy, (C)
McIntosh’s diversity index, (D) Simpsons’s diversity index. Each graph contains 50
grey lines, where one line represents the diversity over time for a single run. The
thick black line is the average diversity across all of the runs, and the thick dashed
line represents the expected value given by a null model where no selection is
present. ...................................................................................................................... 46

Figure 3.2. (A) The proportional representation of neutral, beneﬁcial, and detrimental
mutations for each generation. (B) The end of a phylogenetic tree including the most
recent common ancestor of the ﬁnal population. This particular tree has a later than
average most recent common ancestor and was chosen to Show that even a tree this

young has many groupings containing long branch lengths. .................................... 49
Figure 3.3. (A) Average dissimilarity over time (B) Average similarity over time ......... 51
Figure 3.4. Average dissimilarity and average ﬁtness over time. ..................................... 53

Figure 3.5. (A) A boxplot depicting the generation of the most recent common ancestor
for each run. (B) A boxplot depicting aggregate Dissimilarity/Similarity between
the most recent common ancestor and all of the individuals in the ﬁnal population
for each run. .............................................................................................................. 55

Figure 3.6. Noise difference between data deﬁned by (A) lineage and (B) most-ﬁt
individuals per generation. ........................................................................................ 58

Figure 3.7. Phenotypic trait changes along a lineage for the traits (A) ﬁtness, (B) xz-
distance moved, (C) height, (D) number of point, and (E) number of springs. ........ 59

Figure 3.8. Average phenotypic trait change across each population over time for the
traits (A) ﬁtness, (B) xz-distance moved, (C) height, (D) number of point-masses,
and (E) number of springs ......................................................................................... 61

Figure 3.9. Richness measures of (A) ﬁtness, (B) xz-distance moved, and (C) height. 63

viii

Figure 3.10. The four population diversity measures applied to ﬁtness. (A) richness, (B)
entropy, (C) McIntosh, and (D) Simpsons. ............................................................... 64

Figure 3.11. (A) Average inferiority per generation, (B) Average ﬁtness per generation 65
Figure 3.12. Average Inferiorities across populations ...................................................... 66

Figure 3.13. Average ﬁtness distributions at generational time points (A) 50, (B) 100, (C)
150, (D) 250, (E) 350, (F) 450, and (G) 499 ............................................................. 68

Figure 4.1. The default “Base” ﬁtness function ................................................................ 75

Figure 4.2. Average distance moved vs. time averaged over the 50 populations in each
run set. The pluses are the 95% conﬁdence interval for the respective run that Shares
its color. (A) The average distance moved over the entire two-minute evaluation
period, and (B) The average distance moved in the last minute of evaluation. ........ 77

Figure 4.3. Diversity of (A) genotypes and (B) the distance moved phenotypic trait, for
each set of runs. The pluses are the 95% conﬁdence intervals for their respective
run associated by color .............................................................................................. 79

Figure 4.4. The average dissimilarity across all populations for each Set. The pluses
represent the 95% conﬁdence interval for their respective runs according to color. 80

Figure 4.5. Phenotypic variation (based on distance moved) using MRCA measures. (A)
The generation of the MRCA for all ﬁﬁy runs for each set, and (B) The average
dissimilarity between the MRCA and each individual in the ﬁnal population for all
ﬁfty runs within the set. ............................................................................................ 82

Figure 5.1. Fitness vs. time for three ﬁtness metrics across six independent runs. The
ﬁtness metrics used are maximum ﬁtness at each time point, the ﬁtness of the
individual along the lineage at each time point, and the average ﬁtness at each time
point. The graphs are paired up in the following order: stagnated (runs one and
two), continues to evolve (runs three and four), and achieves maximal ﬁnal ﬁtness
observed (runs ﬁve and Six). ..................................................................................... 88

Figure 5.2. (A) A side view perpendicular to the direction of motion of the most-ﬁt
individual from run one. The direction of movement is to the left. The individual is
raising and thrusting its front forward, and will soon rotate down, pulling the back
end forward. While the front is rising, the back end is moving towards the ground.
(B) A view parallel to the direction of motion. (C) A top down view of the
individual. There are slight Size discrepancies because each image had to be taken a
different point in time and during different points in the rhythm. ............................ 90

ix

Figure 5.3. A side view perpendicular to the direction of movement of the most-ﬁt
individual from run two. The direction of movement is to the leﬁ. The slightly
curved shape of the base is visible at the bottom. The curve allows the individual to
rock back, thrusting the leading component forward, then rock forward, dragging the
back end forward through contraction. ..................................................................... 91

Figure 5.4. A side View perpendicular to the direction of movement of the most-ﬁt
individual from run three. The direction of movement is to the leﬁ. You can see the
interesting core component that consists of many pyramids pointing in different
directions. It is the interactions of these pyramids that allow the rocking of the
central component for lifting and thrusting. ............................................................. 92

Figure 5.5. A side view perpendicular to the direction of movement for the most-ﬁt
individual from run four. The direction of movement is to the left. This individual
has less infrastructure in the back end for lifting the front component. Thus, the
highest vertical point-mass near the front is key to lifting the other leading point-
masses. ...................................................................................................................... 93

Figure 5.6. A Side view perpendicular to the direction of movement for the most-ﬁt
individual from run ﬁve. The direction of movement is to the right. ...................... 94

Figure 5.7. A side view perpendicular to the direction of movement for the most-ﬁt
individual from run Six. The direction of movement is to the right ......................... 95

Figure 5.8. A partial phylogenetic tree from run one Showing two clusters. The numbers
along the bottom represent the generation. The ﬁrst vertical column is the
individual’s id and the second column is the ﬁtness of each individual. .................. 96

Chapter 1 Introduction

In this thesis I utilize an evolutionary system called WhirlingDervish to examine
the use of diversity and variation measures to characterize convergence and stagnation in
systems that evolve 3D morphology and behavior. I then apply the measures to each set
of ﬁfty runs for six variants of WhirlingDervish’s default ﬁtness function in order to
study the effects of the different phenotypic traits that comprise the default ﬁtness
function. The characterization for each set of runs provides me with a means to compare
the variants and differentiate the evolutionary impact of the various ﬁtness factors used
within the default ﬁtness ﬁinction. Then I will examine how closely the diversity and
variation measures approximate the evolved behaviors for six case studies. My goal for
these analyses is to provide a means of comparing results within studies in the same
system as well as providing a common set of measurements for comparisons between
related systems.

Before analyzing the diversity and variation measures, it is important to
understand the system that I use for a testbed. In Chapter 2, I provide a detailed overview
of the WhirlingDervish system, and compare it to other related programs. In
WhirlingDervish, 3D structures are evolved that are capable of moving about their virtual
world. Individuals in this system do not have a brain (or other centralized control
structure) to regulate their movements. The system employs simpliﬁed physics and
phenotype models while maintaining a 3D virtual world that is intuitive to the user. The
simpliﬁcations allow for large population sizes capable of running in real-time on a
standard desktop machine. Despite the simpliﬁcations in the system, it still has the

potential to produce highly complex results. The dynamic genetic language provides an

inﬁnite number of potential solutions, although the types of solutions may be categorized
into a ﬁnite number of groups.

The following are some highlights that deﬁne WhirlingDervish as an evolutionary
system. It uses a Simple, assembly-like language for its genetic representation, which is
executed by an external structure in order to generate the phenotype (an individual’s body
and, consequently, its behavior). When replication occurs, the offspring ﬁtness is
computed using an explicit ﬁmction. Fitness parameters are measured after testing the
individual in its virtual physical world. The goal for the system is to evolve individuals
that are capable of consistent movement, while avoiding premature convergence to
undesirable solutions such as “one-shot” constructs or other evolutionary dead-ends.
One-shot constructs have an initial burst of movement and then never move again.
Typical examples are individuals that move by simply falling over or by throwing
themselves a certain distance and collapsing into a heap when they land.

In Chapter 3, I use the WhirlingDervish system to generate ﬁfty replicate runs
using the default setup. I measure the diversity and inter-individual variation within each
population and use the results to characterize convergence and stagnation. By examining
the supporting information such as the relative proportion of neutral, beneﬁcial, and
detrimental mutations, I explore the evolutionary dynamics and judge the usefulness of
these measures for assessing convergence and stagnation.

Since the goal of this study is to assess the usefulness of various diversity and
inter-individual variation measures, I sample a broad range of measures. The majority of
the measures are drawn from genetic programming while some come from biology and

genetic algorithms. Richness, Shannon Entropy, Simpson’s Diversity Index, and

McIntosh’s Diversity Index are the four measures of diversity that I use and apply at both
the genotype and phenotype levels. These measures rely upon the classiﬁcation of
individuals into groups and the proportion of the total population represented by each
group. Conversely, variation measures utilize statistics between individuals rather than
groups. I use four categories of variation: genetic variation, phenotypic variation,
average inferiority, and measures relating to the most recent common ancestor (MRCA).
Genetic variation quantiﬁes the difference between two genomes across a population
over time. Phenotypic variation examines the change in phenotypic traits, including
ﬁtness, across a population over time. Average inferiority is similar to phenotypic
variation for ﬁtness, but measures the average difference from the ﬁtness of the most-ﬁt
individual rather than a straight average of ﬁtness. The measures that relate to the
MRCA include the generation that the MRCA was generated and the average genetic
difference between the MRCA and each individual in the ﬁnal population.

In Chapter 4, I utilize diversity and variation measures to discover the impact of
each component in WhirlingDervish’s default ﬁtness function, which I call the Base
ﬁtness ﬁrnction. The experiment conducted here examines the impact of these factors by
knocking out each ﬁtness component, one at a time, and generating ﬁﬁy replicate runs. 1
also generate ﬁfty replicate runs for a ﬁtness function that only utilizes the single factor
of distance moved, as this function is used by other related evolutionary systems. By
comparing the convergence and stagnation for each set of runs and their ability to move, I
can approximate the impact of each component to the Base ﬁtness function and isolate its
contribution to the evolutionary process.

In Chapter 5, I seek to understand how informative the diversity and variation

measures are for approximating structural and behavioral variation. It is exceptionally
difﬁcult to automatically perform qualitative measures on morphology and behavior, but
on a limited scale this can be done using visual analyses. To gain a better insight into
how the diversity and variation measures at the genotypic or phenotypic level relate to
behavioral variation, I perform six case studies by choosing six runs from the ﬁfty used
in Chapter 3. For each case study, I ﬁrst describe the most-ﬁt individual. Next, I visually
examine samples from the ﬁnal population, the MRCA for the ﬁnal population, and a
select number of individuals along the lineage prior to the MRCA, and determine what, if
any, variance exists. This information approximates the variation in the ﬁnal population
as well as the variation that has occurred over time. Finally, in this chapter I examine an
alternative genetic variation measure that examines the variation between two genomes
after all neutral genomic positions have been removed. This measure was not been
included in Chapter 3 due to time constraints, but the information found here provides an

idea of the impact of neutral mutations on measuring genetic diversity and variation.

Chapter 2 WhirlingDervish, A System for Evolving 3D
Morphology and Behavior

1 designed and implemented the WhirlingDervish software to be a ﬂexible
research platform for evolutionary studies, which currently targets the evolution of 3D
morphology and behavior. The purpose of generating a new system rather than using a
preexisting one is the ability to have large population sizes that run in real-time on a
standard desktop machine. WhirlingDervish accomplishes this feat by using
computationally Simple physics and reducing the complexity of the phenotype model to a
wire-frame body composed of point-masses and springs. Additionally, individuals lack a
behavioral control structure such as a neural network, which is often a central facet of
related systems, but creates unnecessary complexity for simpler studies.

The inspiration for the wire-frame, mobile body comes from The Soda Play
software (Soda Play, www.sodt_tplav.com). Soda Play is an online application that allows
users to create a two-dimensional, wire-frame body that moves around in a simple virtual
environment. The website contains many pre-made individuals that exhibit a wide range
of morphologies and behaviors such as walking and rolling. WhirlingDervish expands
these structures into three-dimensions, and successfully generates mobile individuals
through an evolutionary process.

The WhirlingDervish system uses a simple evolutionary algorithm to evolve
individuals capable of motility. Each generation begins by simulating all of the
individuals in a virtual environment and measuring characteristics of each individual (as
described in Section 2.3.1). These measurements are used to compute the ﬁtness value.

Next, the individuals are selected for replication using ﬁtness proportional selection. As

each individual is selected, replication occurs with a chance of mutation but no crossover.
Once an entirely new population is generated, the process starts over again until the last
generation is complete.

While the evolutionary algorithm is fairly standard, it employs a complex
evaluation method: the simulation of a virtual world, consisting of a single inﬁnite plane
(the ground) and the individuals being evaluated. The simulation models the behavior of
individuals by managing the point-masses and Springs within an individual and the
interactions between point-masses and the environment. The simulation of each
individual lasts for two minutes at which point, ﬁtness is measured based on data
recorded during the Simulation.

The rest of this chapter details the WhirlingDervish system and presents its
relationship to other systems that evolve 3D morphology and behavior. It begins in
Section 2.1 with a description of the evolutionary systems that relate to WhirlingDervish.
Then, Sections 2.2 and 2.3 describe the details of the two major subsystems: the virtual
world and the evolution structures and processes. Finally, a brief overview of the

system’s runtime abilities is given in Section 2.4.
2.1 Background

There are many 3D morphology and behavior evolutionary systems currently in
use. The evolutionary goals of these systems are to evolve non-trivial behavior and
complex morphology. Many of these systems are used to study the evolution of neural
networks used as control structures for an individual, bridging the gap between the virtual
world and the real world, or to study genetic representations. While each system was

created by different groups of researchers, they share many aspects in common.

One commonality is the phenotype structure that consists of rigid components
connected by joints. These are commonly represented as three-dimensional geometric
primitives such as Spheres, cylinders, or blocks, though the joints are not always visible.
Both component types are associated with a number of manipulation ﬁinctions that cause
them to act in a Speciﬁc manner. A neural network, often called a brain, controls the
manipulation of each component, which gives rise to behavior by adding a dynamic
quality to the components. In most systems, the structure and the controller are co-
evolved.

The inspiration for these evolutionary systems arose from a system called “Virtual
Creatures”, created in 1994 by Karl Sims (1994a). This system consists of blocks
connected through invisible motorized joints, all of which are controlled by a neural
network. Sims had a two-fold focus: to explore the possibility of evolving 3D virtual
creatures capable of different types of movement and to study methods of evolution
under direct, one-on-one competition. In Sims’s initial work, he was successful at using
a standard evolutionary algorithm with a static ﬁtness function to evolve creatures with
ﬂuid, realistic-looking movements. However, subsequent work (Sims, 1994b) suggested
that relative ﬁtness through competition rather than treating the evolution of behavior as
an optimization problem would create more complex creatures. While he did evolve
interesting creatures through competition, he never provided a direct comparison between
his ﬁtness schemes, most likely due to the differences in evolutionary goals.

The system created by Tim Taylor and Colm Massey (2001) was a recreation of
Sims’s system. Its purpose was to test the use of off-the-shelf physics engines, more

speciﬁcally MathEngine, as an alternative to individually crafted physics engines.

During the creation of their system, they encountered and documented many of the
common problems faced by designers of these types of systems, mostly dealing with
premature convergence. One such problem is a type of individual they call a “one-push”
creature, which is a creature that throws itself a certain distance and then stops moving.
They found that “many cases where these 'one-push' creatures were selected in the initial
generations, the population reached an evolutionary impasse (a local optimum in the
ﬁtness landscape), and had no easy mutational routes to higher ﬁtness” (Taylor and
Massey, 2001). They suggest that increasing the time a creature is simulated will reduce
the number of “one-push” creatures, but did not indicate if they had tested this
suggestion. They also state that Sims was able to achieve continuous swimming over
time by giving a stronger relative weight to velocity toward the end of the simulation.
However, when they implemented this type of ﬁtness function in their system, they still
evolved “one-push” creatures fairly frequently.

Another problem they ran into was “folded” creatures. These creatures have a
similar evolutionary effect as the “one-push” creatures where the system would often
converge on them if they arose early in the evolutionary process. Folded creatures are
described as “creatures evolved that would initially adopt a compact, folded
conﬁguration, then as the evaluation period proceeded they would 'unfold' in a particular
direction. This unfolding had the effect of shifting the creature's center of mass, thereby
increasing its ﬁtness” (Taylor and Massey, 2001). They attempted to prevent this by only
“using the distance moved by the body part that had moved least over the duration of the
evaluation” (Taylor and Massey, 2001). However, this wasn't enough to completely

eliminate the problem.

Based on their evolutionary problems and their attempts to ﬁx them, Taylor and
Massey acknowledged an inherent problem with existing ﬁtness schemes. The problem
was a trade-off between the inclusion of special cases to prevent undesirable behavior
and the ability to use high-level speciﬁcation of required behavior. They also noticed
“even using straightforward ﬁtness functions will sometimes produce desired results”
(Taylor and Massey, 2001). In terms of preventing convergence, Taylor and Massey
suggest that “one way to help prevent the runs getting stuck in local ﬁtness optima would
be to maintain the genetic diversity of the population by introducing some kind of
incomplete mixing (e.g. by using an island model GA). I have performed some initial
experiments with incomplete mixing, but not enough at this stage to say how effective
this strategy is at solving the problem” (Taylor and Massey, 2001). I.

Multiple systems and ideas have arisen from the lab of Jordan Pollack. Some of
their goals are truly novel, including evolving individuals capable of being built as
functional structures in the real world (Lipson and Pollack, 2000; Homby et. al. 20013;
Pollack et. al., 2003). Ideas centering on language and language structures have been
long standing directions of research within the lab. One of the general language
constructs is the concept of a Genetic Library Builder (GLiB) (Angeline and Pollack,
1994). GLiB is a language paradigm that allows for the dynamic creation of new more
complex instructions by associating each new instruction with an ordered set of other
instructions that are currently in the library. The library begins as a set of atomic
instructions and then mutation operators allows the creation, deletion, and breaking up of
instructions within the library. As a genome is copied, it has access to the current State of

the instruction library.

The study of language in the evolutionary systems designed in Pollack’s lab
extended into their systems evolving 3D morphology and behavior (Homby and Pollack,
2002; Pollack et. al., 2003). They began to use a type of language called Lindenmayer
systems (L-systems) as their genetic representation. L-Systems use a recursive symbol-
and rule—based representation. They were originally created to generate graphics of the
branching patterns in plants. By deﬁning a hierarchical set of rules, each assigned a
unique symbol; you can build up complexity by composing more complex rules based on
other, simpler rules. The use of an L-systems genetic representation in their evolutionary
systems allowed for the evolution of symmetrical constructs. They conducted
comparisons between the L-system representation and a non-symmetric representation
and found that the use of L-systems allowed the creation of more complex and successful
constructs.

Another major system that evolves 3D morphology and behavior is called
Framsticks (Komosinski, 2003), which was created by Maciej Komosinski and Szymon
Ulatowski (1999). The system uses constructs composed of motorized joints and sticks
that move using a neural network controller; both the body and the controller are
coevolved. Like many of the other systems described here, much of the work on
Framsticks deals with the evolution of neural network control structures. The portion of
their work relating to the evolution of 3D morphologies and behavior centers around the
study of genetic representations, all of which are instruction-based languages that they
refer to as ‘encodings’. The purpose of this research is to ﬁnd an encoding that will

maximize the complexity of their evolved constructs.

10

 

Another system called Artiﬁcial Ontogeny (Bongard and Pfeifer, 2003) evolves
3D morphology using spheres as its building blocks, which are controlled by a neural
network. The evolutionary goal of this system was to evolve constructs capable of
pushing a block, much larger than itself, as far as it could. The most important difference
between this system and the other systems mentioned above is that it does not use a
genetic language to build the phenotype. Instead it uses an alternative genetic
representation and mapping algorithm called a gene regulatory network (GRN).

There are other major systems that are similar to those listed above but focus on
topics other than strictly morphology or behavior. One example is a system by Bongard
and Lipson (2004) that focuses on the co-evolution of simulator and construct for the
purpose of automatic design through evolution of robots capable of compensating for bad
components or coping with small differences between the real world and the virtual
world in which it evolved. There are still other systems that use more rigid phenotypic
representations, with speciﬁc evolutionary goals. An example is the massive quantity of
research being conducted on evolving behavior in static two and four legged constructs,
where these constructs are generated prior to evolution, such as the system by Gene
Ruebsamen (2002). For more information on 3D morphology and behavior evolutionary
systems, some of which were mentioned here, see the paper by Taylor and Massey (2001)

and the webpage by Tim Taylor (Taylor, www.tim-tavlor.com/research/embodvhtml).
2.2 The Virtual World

In WhirlingDervish, the virtual world is the subsystem that takes a population of
individuals and simulates their actions in a Virtual physical environment. The

cornerstone of this subsystem is the body of an individual. The body is a three-

11

dimensional wire-frame structure, composed of point-masses connected by springs that
can form complex structures with intricate behavior. This behavior is derived through the
interaction of point—masses with the environment and with other point-masses through the
springs that connect them. The point-mass is the only physical unit that is capable of
environmental interaction; Springs are treated as nothing more than forces between point-
masses. Point-masses are treated as inﬁnitely small, and therefore cannot collide with
each other.

Springs connect pairs of point-masses and apply a force that pushes or pulls them
to a speciﬁc distance apart (known as the “rest length” of the Spring). This force is the
only manifestation of the spring. It has no mass or volume and exerts force only along an
unbendable straight line between its two point-masses. The rest length of a spring can
oscillate over time, adding dynamics to the body, which allows it to move. These
oscillations are the primary driving force for the movement, and are effectively a
continuous input of energy into the system.

The Simulator is the hub of this subsystem. It manages all interactions within the
world, particularly the modeling of body dynamics and body/environment interactions,
by updating the internal state of a body through the application of physics functions.
These processes occur in time steps, meaning that multiple interactions can occur during
the same time step. However, the inability of individuals to collide and the environment
consisting solely of the ground plane, nullify this issue. Regardless, the entire set of
point-masses and Springs are updated simultaneously across all individuals for the given
time step. The individuals continue to be updated until the maximum simulation time has

been reached. At the termination point, all of the relevant characteristics measured

12

throughout the simulation are saved for use by the evolutionary algorithm in determining
the ﬁtness of each individual.

The physics functions are the rules of the system. Point-masses and springs each
have a subset of the physics functions associated with them. The functions operate on
discrete time steps, such that the state of each point in the next time step can be calculated
purely based on the current state of the system. The functions used in the system
calculate the forces on each point-mass based on gravity, the forces exerted by springs,
ground friction, and air friction (drag). The simulator will then update the positions of
the point-masses based on the current forces acting on them through projectile motion.
Finally, the system will also update the rest length of each spring, which may alter the
spring forces in the next time step. The idea behind altering the spring’s rest length
comes from the Soda Play software (Soda Play, www.sod2_rplav.com) and a description of

this procedure is fully presented in Section 2.2.3.
2.2.1 The Environment

In order to give meaning to the individuals’ bodies, they must have an
environment in which to exist. In WhirlingDervish, the environment is a three
dimensional virtual world. For the purposes of this project, there was no need for
complex topography or objects. Thus the only physical object within the world, besides
the bodies of the wire-frame individuals, is the ground. The ground is a ﬂat non-
deforrnable surface that is essentially the inﬁnite xz-plane. This surface provides
environmental feedback to an individual’s body through its interactions with the point-

masses of that body.

13

The purpose of the ground is to provide a surface on which the individuals can
move. To maintain a stable surface, the ground must prevent all point-masses from
penetrating through to the other side. If a mass should penetrate the ground during a
given time step, I use a Speciﬁc manipulation to rectify the situation. The manipulation is
based on the nature of the movement equation (covered in detail at the end of Section
2.2.2), where each mass has two positions: “current” and “previous”. Both the current
and previous position's y-components are inverted when the current position penetrates
the ground. This inversion represents the point having bounced upon hitting the surface
such that it is above ground again. The previous position is also inverted to maintain
velocity.

To represent collisions that are not perfectly elastic with the ground, a loss of
energy was added to the design. Since the previous position is used to determine
velocity, the easiest way to implement energy loss is to modify the previous position
relative to the current position. This modiﬁcation is in the form of a scaling factor (0.4
by default) that is applied to the distance between the two positions. Equation 2.1

describes how the previous position ( pp) is moved closer to the current position ( pc).

p1) = pc + 0.4[pp -pc) (2.1)

Also, during collision between a point-mass and the ground, friction must be considered.
To allow point—masses to be of different materials I allow each point-mass to have its
own coefﬁcient of ﬁiction relative to the ground, which I will discuss in the next section.
In the future, a more sophisticated friction system could be incorporated into the

simulation in order to provide additional accuracy.

14

2.2.2 Point-mass Physics

The point-mass is one of the basic components of the system. An individual’s
body is composed of a set of point-masses connected by springs. By deﬁnition, a point-
mass has a mass (m) but no volume or surface area, and can only move when force is
applied to it. The most consistently applied force is gravity. Gravitational acceleration
(g) is a vector standardized to <0.0, -9.8, 0.0> m/S2 and does not change during the
simulation. However, the simulator is designed to allow g to be any vector and is
modiﬁable in a conﬁguration ﬁle. The force (F) of gravity is given as

F = mg (2.2)

The other forces that affect a point-mass are friction related. To allow a point-

mass to generate ﬁiction without a surface area, coefﬁcients of friction: static (,us),
kinetic (pk), and drag (pd) are required. These coefﬁcients were tuned by hand to

produce realistic results. Since all point-masses are uniform in the current state of the
system, they are all assigned the same mass (m) and ﬁiction coefﬁcients. These values
are changeable from a conﬁguration ﬁle, but not during evolution. The standard values

are Shown in Table 2.1.

Table 2.1. Mass and frictional coefﬁcients.

,uS 0.75
pk 0.5
”d 0.01

One frictional force is air resistance, which affects all points in motion by
providing a force in the direction opposite to the motion and proportional to the velocity.

The following equation is used to determine the force of the air resistance, where v is

15

 

velocity, vn is the direction unit vector of the velocity, and ,u d is the coefﬁcient of
friction, drag:
F =—1 0,. Hv|l2v (2 3)
. d n .

The other frictional forces comprise ground ﬁiction. These forces affect all point-
masses that have contact with the ground during a time step. To ensure this ﬁictional
force is applied during the same time step that the ground is penetrated, an initial
calculation of a point-mass’s next position is taken and its y-component is tested for
ground penetration. If penetration will occur, the frictional force is computed and applied
prior to the actual movement of the point-mass. More complicated approaches to
determining when and how frictional forces are applied are available but were
unnecessary for the initial design.

There are two types of ground ﬁiction, which apply in different situations. The

ﬁrst type is static ﬁiction, which uses the coefﬁcient ,us. This force is only applied to a

point-mass when it has no velocity at the beginning of the time step, comes in contact

with the ground, and the following inequality is true given the current force (Fe) on the

point-mass.

[IFcHS/Ismllgll (2.4)

This force provides a counteracting force that prevents a stationary object from moving
along a surface by providing a force equal and opposite to the forces exerted parallel to
the plane. Thus the point-mass can only move in the xz-direction when there is force in

that direction whose magnitude causes the above inequality to become false. Given the

16

direction of the current force whose y-component is set to zero (F d)’ the equation for the
static ﬁictional force is:
F = —1.0I|Fc||Fd (2.5)

The other type of ground ﬁiction is kinetic ﬁiction, which uses the coefﬁcient pk.
This force only applies when a point-mass collides with the ground and static ﬁiction
does not apply. Like static friction, it is a counteracting force that applies in the opposite
direction of the motion of the point-mass parallel to the ground, which is represented by

the velocity. Given the direction of the velocity whose y-component is set to zero (vd),

the equation for kinetic frictional force is:
F = —1 .01. dmngllv d (2.6)
However, if the magnitude of the ﬁictional force is greater than the magnitude of current

force on the point-mass, the following equation is used
F = —1 .OHFCHvd (2.7)

Once all of the forces have been applied to each point-mass, the simulator
determines the new position for each point-mass by using projectile motion. There are a
few versions of the projectile motion equation available, but here I used the Verlet
algorithm (Verlet, 1967). It relies on the current and previous positions, and the size of
the time step. This removes the need to store velocity explicitly, since it is not needed in
the equation. However, the velocity and acceleration can be calculated at any time with

the data available. The following is the Verlet algorithm:

(2.8)

F (At)2
2

p=2p1-p0 +—m—;p0 =p1.p1 =19

17

By using only the two positions, the Verlet algorithm reduces error accumulation
that can occur as velocity and position data deviate over time. When deviation occurs, it
means that one cannot calculate the other using the standard formulas, and the results will
continue to deviate by increasing amounts over time. Comparing a time step algorithm
using position and velocity with the continuous time version of projectile motion can

show the extent of the error.
2.2.3 Spring Physics
Individual movement is driven by two aspects of a Spring that control the force

exerted on the point-masses that it connects ( p0 and p1). The ﬁrst aspect is the simple

spring force that attempts to keep the two point-masses a speciﬁc distance apart, known

as the rest length (l r) of the spring. The other is an algorithm for something similar to a

muscle that essentially imparts a constant inﬂow of energy into the system that keeps the
springs expanding and contracting.

The simple spring force provides an attractive force between the two point-masses
when the spring length between them is greater than the rest length of the spring and
exerts a repulsive force when the distance is less than the rest length of the spring. The
magnitude of the force applied to each point-mass is given by the following equation
where k is the Spring constant and x is the difference between the current Spring length
and the rest length:

Fla-pie, <2.»
Fmag = kx (2.10)

18

A force is then applied to each point-mass in the appropriate direction where p01 and

are unit direction vectors from p0 to p1 and from p1 to p0. The two forces F and

F1 are computed from the following equations:
F0 = Fmaxp10 (2.11)
F1 = Fmaxpm (2.12)

The only special case occurs when the spring length is zero. Because the spring
does not physically exist and point-masses cannot collide with each other, it is possible
for the two point-masses to pass through each other. If a time step begins where the
point-masses connected by a spring are at the same position, this can make it impossible
to calculate the direction to apply the force. To solve this, I use the difference between
the velocities to determine the direction and calculate the magnitude in the normal
manner. If the magnitudes of the velocities are both zero, then zero force is applied by
the spring. If another force should separate the two point-masses at a later time, the
spring will once again be able to provide a force contribution.

Each spring has two main properties that affect the force it exerts. These
properties are the rest length and the spring constant. Both of these values are calculated
when a spring is ﬁrst created between two point-masses. The distance between the two
point-masses determines the rest length, and the spring constant is determined through
the equations shown in Table 2.2:

Table 2.2. Spring Constants

k: 9.8 3 >0
Q.0+0.001er) '

 

k=0.0 €.=0

l9

The design decision to relate the rest length and the starting length of the spring was
made to prevent the springs from exerting huge amounts of force during the ﬁrst time
step of the simulation. In a previous iteration of the system, the rest length had a default
value that could evolve. However the wire-frame individual would evolve long starting
lengths relative to their rest lengths, resulting in huge differences between the starting and
rest lengths. Those types of individuals dominated the evolution of the system and
necessitated the change to the current state.

By using only the point-mass and spring forces described thus far, the system will
come to a rest if given enough time. In order to prevent this and to inject some dynamic
qualities into the system, I added a ﬂuctuation component to the springs. This component
essentially adds energy to the system and provides a constant cyclic dynamic to a spring
by changing its rest length over time. The change centers on the original rest length

(6 O), by following the harmonic motion of a sine wave.

There are two parameters required to deﬁne the range of possible rest lengths:

maximal multiplier (mm) and delta multiplier (dm ). The maximal multiplier has a ﬁnal
range of [-0.45, 0.45] and when applied to the original rest length gives the range of rest
lengths as E06 - mm) 2 06 + mm )]. However, the delta multlplrer IS an evolvable

parameter with a range of [—1, 1] that is used to scale the maximal multiplier. By scaling

the maximal multiplier, the wire-frame individual is able to adjust the ﬂuctuation range

for any of its Springs. Figure 2.1 visualizes spring ﬂuctuation.

20

 

 

 

Rest Length

 

 

 

Figure 2.1. An illustration of spring ﬂuctuation. The line between the two red points is
the spring at rest length. The dashed lines represent the range of possible lengths through
which the spring can move. The ﬂuctuation utilizes a sine wave for the transition rate
over time, where the maximum possible length change is twice the maximal multiplier
times the delta multiplier.

The ﬂuctuation range is represented as the multiplier (‘1’). The absolute value of
the delta multiplier is necessary to ensure the positive value of L1’. The value of LP is

determined with the following equation:
‘11: d lm (2.13)
m m
Once the range of possible rest lengths are deﬁned, a method of traversal is

required to give the cyclic change of the rest length. A sine wave was chosen for

simplicity and the output range of [— 1, 1]. Three parameters are required to generate the

wave: rate (r), where along the wave a spring starts (s), and time (t). The rate and

21

starting position each have a range of [—1, 1] and are evolvable, while time can be any

value but is scaled to a range of [0, 21]. The phase (CD) is simply a multiplier that when

rll

 

 

applied to ‘1’ determines the correct scaling factor for the original rest length. The
equation for the phase follows:

(I) = sin(rt + 7215') (2.14)
The change in rest length is given in the following equation:

€r=€0(1+(D‘I’) (2.15)
2.3 Evolutionary Structures and Processes

The evolutionary structures can be broken down into two components: the
population and the genome. The population is a collection of all of the individuals in a
given generation. While the virtual world and all of the evolutionary processes affect
each individual, they perform their functions at the population level. The genome is an
array of instructions used to create the phenotype. Some instruction examples are
creating a new point-mass, changing the properties of a point-mass, connecting two
point-masses together with a Spring, or performing other sorts of calculations or ﬂow
control. Some instructions have parameters, where a parameter is either a vector of three
real numbers or a reference to a vector within the construction object (see Appendix B for
further information on the construction object). The other important aspects of the
evolutionary algorithm are the ﬁtness scheme (Section 2.3.1), the genetic language

(Section 2.3.2), and replication and mutation (Section 2.3.3).

22

2.3.1 Fitness Scheme

The ﬁtness scheme is both the method of determining ﬁtness (ie, the ﬁtness
ﬁmction) and its interpretation. In biology, ﬁtness is measured as the impact that a
genotype has on future generations, often in terms of replication rate. Evolutionary
computation considers ﬁtness explicitly as the quality of a solution, whether absolute or
relative to others in the population. There are many methods available to compute ﬁtness
such as an explicit ﬁtness function, the outcome of direct competition, or replication rate.
Determining the appropriate ﬁtness scheme for a given problem is important because the
scheme can bias the solution, affect evolutionary speed, and change the likelihood of
premature convergence. The interpretation of ﬁtness is the method used to determine the
number of offspring each individual passes on to the next generation.

My goal in designing the ﬁtness function was to evolve individuals that
continuously move over time. However, it is difﬁcult to design a static ﬁtness function
that can encompass the abstract nature of the goal in concrete terms. For example, I
could use a function to measure ﬁtness as the difference between the center of mass at the
beginning and end of the simulation. Unfortunately, this function will encourage the
evolution of individuals that move either by ﬂinging themselves and collapsing ﬂat, or
else by simply growing tall and falling over. Thus the ﬁtness ﬁmction must not allow the
population to easily get stuck at these less desirable solutions. This trade-off between
restricting the ﬁtness function versus using high-level descriptions to deﬁne the function
had already been found by Taylor and Massey (2001).

The ﬁtness function I use in the WhirlingDervish system returns a single real

number value. This value is an accumulation of the raw values from the following

23

phenotypic trait measurements: xz-distance moved by the center of mass, the height of
the center of mass, the overall height of the body, and the total number of point-masses in
the structure. All four of these measurements have restrictions. The xz-distance only
contributes to the ﬁtness if the ﬁnal height is at least one third of the starting height. For
the two height measurements, neither can contribute more than nine meters, and the
number of point-masses is capped at ten. While these restrictions affect ﬁtness, none of
them impact structure or behavior directly. Also, while there are restrictions there are
also scaling factors. Both the center of mass height and the xz-distance moved are scaled
by a factor of three, while the number of point masses are scaled by a factor of 0.0254.
While there are additive elements to the ﬁtness function, there are also a few
detrimental elements. The primary detrimental element is the cap on the number of
springs. If the number of spring used in the body of an individual exceeds ﬁfty-four, the
ﬁtness is reduced by one for each spring over that cap. The number ﬁfty-four is derived
as one less than the minimum number of springs to fully connect eleven point-masses.
The other caps put restrictions on the structure as a whole, such as a maximal bounding
box, in order to prevent outlandish structures and behaviors. Before these restrictions, it

was common to evolve individuals that were many thousands of meters tall.
2.3.2 Genetic Language

Genetic languages play an important role in the evolution of morphology because
they provide a method of describing how to build a body, which is more expressive than
a direct encoding. A direct encoding explicitly lists each body component and all of its
parameters. There are many advantages and disadvantages to genetic languages, mostly

revolving around what is possible through a single mutation. One advantage is that a

24

language can be designed to have high-level concepts represented as a single atomic unit,
thus a mutation can produce large, meaningful changes. On the other hand, this can also
be a disadvantage because the particular choice of instructions invariably also limits the
types of changes possible.

The power of a genetic language is that the designer has the ability to implement
abstract concepts in a language that would otherwise be difﬁcult to evolve through
natural selection. A Simple example is that of a loop. It is typically easier to implement a
looping instruction than it is to evolve the equivalent out of mathematical, jump, and
conditional instructions. In the latter situation, many speciﬁc, ordered mutations would
be required to implement a basic loop, including the ability to terminate this loop at the
appropriate time. By directly implementing these concepts in a language, it essentially
narrows the search Space, which is often beneﬁcial to specialized questions.

Another advantage of a genetic language is that implementing language constructs
can increase or decrease the impact of a mutation on certain aspects of the morphology.
This is possible due to instruction complexity in which an instruction pleiotropically
affects more than one trait or otherwise accomplishes multiple atomic actions. The
complexity of a mutation to one of these instructions can be characterized as a single
atomic instruction being replaced by a series of other atomic instructions. Thus a single
mutation to or from a complex instruction can have a huge effect because that mutation
would simultaneously impact the entire series of instructions that the single instruction
represents. While this can be a disadvantage due to the more dramatic (and hence more
often detrimental) nature of mutations, it also enables complex concepts to be represented

in individuals—something that would otherwise occur quite rarely. Another

25

disadvantage is that adding these more complex instructions, I either bloat the set of
possible instructions (making a mutation to any one of them much less likely) or else
increase the number of mutations required to make small changes. However, a properly
designed language can minimize (but never eliminate) these tradeoffs. For example, in a
morphological system it may take many mutations to change a speciﬁc parameter for a
single component of the phenotype already in existence, but it would take only one
mutation to add an entirely new component with default parameter values.

The underlying genetic language is a critical topic for most of the systems that
evolve both morphology and behavior. A common thread among them is the focus on
symmetrical, or recursive, construction of body parts (Sims, 1994a; Komosinski and
Rotaru-Varga, 2001b; Homby and Pollack, 2002). Some of the research done by Homby
and Pollack (2002) as well as Komosinski and Rotaru-Varga (2001b) has Shown
improvement in both Speed and reliability when evolving movement using a recursive
language over a direct encoding in certain situations. Homby and Pollack (2001b) also
observed an order of magnitude increase in body parts and a higher degree of regularity.
Komosinski and Rotaru-Varga also noticed more ordered structures including modularity
and segmentation. These experiments are important in building an experimental
framework for these types of systems.

However, the problems associated with direct encodings are poorly understood.
Why do they seem to perform so poorly compared to recursive languages? Is it because a
direct encoding is too complex, requiring a massive number of mutations to ﬁne tune
component parameters? Will the same results occur between symmetric versus non-

symmetric languages both of which are not direct encodings? These questions remain

26

exciting topics in the ﬁeld, but for the most part researchers have moved on to more
dynamic languages. I developed a language for WhirlingDervish that was inspired by a
preexisting genetic language from the Avida (Oﬁia, 2004) system.

The Avida system strives to answer both biological and general evolutionary
questions through the use of self-replicating computer programs (as opposed to 3D
evolving constructs). The most common type of evolutionary goal for these individuals
is to perform math and logic functions. Any of these ﬁmctions can be solved because
Avida employs the use of virtual CPUS with a Turing complete pseudo-assembly
language. In addition, the Avida language provides the necessary instructions that allow
the individuals to self-replicate. The WhirlingDervish language uses some of the
mathematical functionality of the Avida language including the use of stacks. However,
the concept of self-replication is not used and the respective Avida instructions are
replaced by instructions that build and alter a three dimensional virtual body. The other
primary difference is that the Avida genome is circular while in WhirlingDervish they are
a linear array of instructions. On the other hand, both languages are executed in order to

produce their phenotypes. For more details on the language used here, see Appendix A.
2.3.3 Replication and Mutation

Once all of the individuals have been simulated and their ﬁtness calculated, it is
time to replicate the individuals and generate a new population for the next generation.
The process I use is called ﬁtness prOportional selection. First each individual’s ﬁtness is
divided by the total ﬁtness for the whole population. This essentially creates a weight for
each individual that represents it percentage chance of replication during a single round

of selection. Then, for each empty slot in the new population, except one, I generate a

27

random number that corresponds to a speciﬁc individual. The chosen individual is
replicated then put backing into its population, sampling with replacement. The extra slot
in the new population is reserved for the most-ﬁt individual in the current population who
gets a single exact copy of itself generated and placed into the new population, making
this a system with an elitism of one.

When a wire-frame individual has been selected for replication, the process of
genome duplication begins. As each instruction is copied from the parent genome to the
offspring's genome there is a chance for mutation (P). However, crossover was not used
in this system because the added complexity was not required to accomplish the goal.
The chance of mutation is calculated at the beginning of the copy process and remains
constant throughout. The reason the chance of mutation is calculated each time is that the
value is composed of two components: one that is ﬁxed and applies to every individual

throughout the evolution process (Pn ), and one that is based on the length ( 2g) of the
parent genome being copied (Pg). The chance of mutation is given in the following

equation:

_ g
P-P + (2.16)

The addition of one to the length of the genome exists due to the fact that some genomes
may be empty, as is the case for the initial individuals. This equation has many potential

uses, but I only use Pg for the mutation rate with a value of 1.0, by setting a value of 0.0
for Pn' Both Pn and Pg can be set in a conﬁguration ﬁle and will remain unchanged

throughout the run.

28

If a mutation occurs while copying an instruction, one of four possible types of
mutation is used to determine what will happen to the copied instruction. The four types
of mutation are point mutation, insertion, deletion, and instruction duplication. Each
mutation type has a probability associated with it that is conﬁgurable and must sum to
1.0. The probabilities used in this thesis are shown in Table 2.3.

Table 2.3. Mutation rates

insertion 0.05
duplication 0.05
deletion 0. 1
point 0.8

The effect of insertions, duplications, and deletions are simple. When an insertion
occurs, an extra random instruction is added to the offspring's genome immediately after
the most recently copied instruction. To allow insertions at the beginning of the genome,
there is always a chance of insertion before the ﬁrst instruction is copied. Duplication is
essentially an insert except the parent instruction is copied twice. Deletion simply does
not add a copy of the parent instruction into the offspring's genome.

There are two types of point mutation: random instruction and parameter
modiﬁcation, with a 50% chance that it will be one or the other. The random instruction
disregards the parent instruction and creates a new random instruction with a random
parameter. The parameter modiﬁcation works differently if the current parameter is a
name rather than an explicit vector. If the parameter is a name, a new random parameter
is generated with a 65% chance of becoming a random explicit vector. If the current
parameter is an explicit vector there is still a 65% chance of getting a new random
explicit vector. Otherwise, a random component of the vector (x) is chosen and a

modiﬁer is added to it as Shown in the following equation

29

x = x +1 1random _number(—l,1) (2.17)

The value of x is capped between 1 and -1. Also, the chance that a parameter will
change into a random explicit vector is conﬁgurable, and the choice of 0.65 was chosen

to give a slight bias to explicit vectors.
2.4 Runtime Abilities

An important attribute of WhirlingDervish is its ability to evolve large
populations in real-time. The success stems primarily from the Simplicity of the wire-
frame individuals and the environment. The simplicity gives a large reduction in the time
spent modeling individuals through simulation, where the simulation consumes the
majority of a run’s overall computational time. However, other researchers have also
achieved large population sizes. One such researcher is Ventrella (1994; 1998) who
simpliﬁed his constructs to two dimensions in order to achieve population sizes of up to
200 individuals in real time on an SGI Indig02 machine in 1996. His constructs were
motorized stick ﬁgures that had to evolve the ability to swim within a ﬁxed size circular
pond in order to survive. Another is Homby and Pollack (2001b) who were able to have
a large population of complex individuals made up of many components that consisted of
bars and joints controlled by a neural network. The bars and joints contribute structurally
and have corresponding motorized capabilities, which is controlled by a neural network.
The experiments were run with 100 individuals, each with an upper limit of 350 bars, for
up to 500 generations. In another similar experiment, they used a population size of 200
individuals for 250 generations (Homby and Pollack, 2002). Bongard and Pfeifer (2003)

used the Artiﬁcial Ontogeny system and were able to have a population of size 200.

30

In WhirlingDervish, the average time taken to evolve a population of 500 wire-
frame individuals for 500 generations is approximately 25.8 hours on a single 2.8 Mhz
Athlon processor, or about 1.548 times real-time. The body Size for a wire-frame
individual can vary greatly since the body size begins at zero and has no explicit upper
limit. However, the Size is somewhat constrained by only giving a wire-frame individual
an increase in ﬁtness for up to 10 point-masses and penalizing ﬁtness when the number
of springs exceed 54 (one less than the number of springs in a complete graph with 11
points). The average runtime was taken using the default genetic language, a two minute
simulation time, and a time step of 0.01 seconds over ﬁfty replicate runs. This speed is at
the high end relative to the other systems for evolving morphology and mobility.
However, it is difﬁcult to directly compare speeds due to the disparate body sizes of
individuals evolved in WhirlingDervish, the difference in the types of individuals

evolved, and the information reported about their conﬁguration.

31

Chapter 3 Analyzing Convergence and Stagnation
Through Diversity and Variation Measures

Current research of diversity measures in evolutionary computation suggests
possible, beneﬁcial methods for characterizing convergence and stagnation in 3D
morphology and behavior evolutionary systems. This research commonly examines the
change in genetic and phenotypic diversity within a population over time. Practical
motivations often involve controlling diversity through the use of operators that utilize
active feedback of the current diversity. Conceptual motivations strive for a better
understanding of the underlying dynamics of evolutionary algorithms, their structures,
methods, and operators. My motivation is to ﬁnd a way to quantify the ability of the
WhirlingDervish system to continually evolve or improve solutions. Studies of other
evolutionary aspects can then use this quantity as a reference point for comparisons.
Therefore, this chapter will examine how informative the various diversity measures are
towards quantifying convergence and stagnation.

To make use of diversity measures both a method for classifying individuals,
grouping similar individuals, and deciding on the unit of classiﬁcation are required. The
best choice of classiﬁcation unit is phenotypic behavior, as it is the overarching goal of
the system. However, with the behavioral complexity of individuals in WhirlingDervish,
where the ﬁtness scheme only approximates the desired outcome, I am unaware of
computational models capable of classifying them. Currently, the majority of the
classiﬁcation occurs by visual analysis, which also makes it difﬁcult to judge the success
of a system in terms of desired solutions. Therefore, I rely on quantitative aspects of the

individuals such as phenotypic traits: ﬁtness and the xz-distance an individual travels.

32

Some work already exists for classifying phenotypes in 3D morphology and
behavior evolutionary systems (Komosinski et. al., 2001a). This work discarded
classiﬁcation based on genotype information. Instead, a similarity measure is used based
on discrete structural characteristics of the phenotype such as the degree at each vertex.
While their method may do well at classifying their structure, it does not address the
classiﬁcation of behavior. Also, individuals in the WhirlingDervish system may express
different structures over time even though the number of connections at a vertex remains
constant. To this end, their measure would not work well at classifying structural or
phenotypic behavior of individuals in WhirlingDervish, but may form the foundation for
future computational models.

Section 3.1 presents an overview of diversity measures taken from other research
studies in evolutionary computation and population biology. At that point, I will also
examine the differences between their current usage and any modiﬁcations I made in
order for them to work with WhirlingDervish. While highlighting these differences, I
will also discuss how the interpretation of those measures needs to be adjusted from the
problems they were tested on originally, and their potential application to open-ended
evolutionary systems. Section 3.2 will then provide a detailed description of all of the
diversity measures used in this thesis. Next, the experimental design, in Section 3.3,
describes the conﬁguration and important parameters for the WhirlingDervish system to
generate ﬁfty replicate runs used as data for this chapter. Then, Section 3.4 provides an
analysis of the application of the diversity measures to those runs, and the conclusions

follow in Section 3.5.

33

3.1 Deﬁnitions and Methodology

3.1.] Deﬁnitions

Diversity is a measure that describes a population in terms of classes, or groups,
taking into account both how many classes there are and how evenly those classes are
distributed within the population. The term class represents the classiﬁcation of
individuals into groups based on measures of Similarity. The classiﬁcation of individuals
divides the entire population such that every individual falls into a single class. Thus
each class represents a proportion of the population. Evenness is a measure of how close
to equal the proportion for each class is to the others. Evenness is maximized when all of
the classes contribute the same number of individuals to the population. In biology, it is
often difﬁcult to determine the proportions for classes, or species, so when the evenness
information is unavailable, richness is used. Richness is a raw measure of diversity that
only represents the number of different classes.

In evolutionary computation and population biology, convergence is the loss of
genetic diversity. Convergence for genetic algorithms occurs when the genomes are
similar enough that crossover can only produce a limited range of offspring all of which
have equal or lesser ﬁtness (Fogel, 1994). Studies conducted in genetic programming
(McPhee and Hopper, 1999; Daida et. al., 2004) have shown that using only crossover
and randomly generated initial populations will cause rapid convergence through the loss
of genetic material, which can never be replaced or recovered. Primarily, convergence
occurs in optimization systems where the selective pressures and environment are held

constant. Other aspects of an evolutionary system that can contribute to convergence are

34

exponential ﬁtness growth and low mutation rates. However, convergence is often a
desired quality for many of the problems addressed through evolutionary computation.

While convergence is often used as an indicator that a ﬁnal solution has been
reached, many research studies have been conducted on how to prevent premature
convergence. Premature convergence is the convergence on a suboptimal solution for a
problem that has multiple solutions where some are suboptimal. Once premature
convergence has occurred, the population has usually lost much of its genetic diversity
and is unable to ﬁnd an optimal solution. In an attempt to prevent premature
convergence, the study of diversity has become an important topic, particularly when
focusing on controlling diversity (Burke et. al., 2004).

Stagnation occurs when a population has not progressed toward its evolutionary
goals (approximated here by ﬁtness) for a period of time. It is independent of the genetic
variation and does not preclude beneﬁcial mutations within reach of the population’s
search capability. However, the presence of such mutations does not guarantee that a
population will ﬁnd them in any ﬁnite amount of time (Bossert, 1967). Two interrelated
issues with stagnation are determining why a population has stagnated and the trade-off
between continuing to search for improvements and stopping the search. In systems
where the ﬁtness landscape is unknown or complex, it may not be possible to determine
the cause of stagnation, but the state is still quantiﬁable. However, the existence of
stagnation does not imply that a better solution would not be found later in the

evolutionary process.

35

3.1.2 Population Diversity

Given all of the different diversity measures available, I settled on four primary
methods for calculating the diversity of a sampling unit, or population. They are
richness, entropy, Simpsons’s diversity index, and Molntosh’s diversity index taken from
Magurran (Magurran, 2004). The Simpsons’s and McIntosh’s diversity indices are often
deﬁned in terms of ecological diversity, incorporating both evenness and quantity of
unique species. While these measures deal with species, it is easy to convert them to
different units such as genotype and phenotype diversity within a population. All that is
required is to deﬁne a sampling unit such as a population and a method for classifying
individuals into distinct groups within that population.

The use of class richness is common in evolutionary computation as noted in
(Burke et. al., 2004). It is also called variety when in reference to unique genotypes
(Koza, 1992). Class richness is a raw measure of diversity, and only takes into account
the quantity of classes, not the distribution. Lacking that key bit of information about
evenness limits the information content of this measure. However, there are situations
where the richness measure will be a good approximation of the other measures, which
occurs when a high percentage of unique classes exist in a population. When the
percentage of unique classes is low, it is impossible to determine their distribution within
the population based solely on the count.

While most uses of richness are deﬁned in terms of genotypes, it is also used in
genetic programming for phenotypic diversity (Rosca, 1995b). In this Situation,
phenotypic classes are grouped by unique ﬁtness values. Rosca also suggested a coarser

method for classifying ﬁtness by binning ﬁtness values based on a range of values rather

36

than an exact match. This alternative method was tested in WhirlingDervish because of
the specialized “Jitter” function. Jitter modiﬁes each component of the phenotype by a
very small delta causing differing ﬁtness values for multiple measures of the same
individual. However, the use of the alternative method was dismissed when it did not
show additional information over the non-binned version.

While the use of entropy as a diversity measure was found in terms of ecology
(Magurran, 2004), it has also been used in evolutionary computation and artiﬁcial life
(Ray, 1994; Adami, 1998). In these systems, the method for calculating diversity is the
same as the ecological method, but is calculated in terms of either genotypes or
phenotypes (Rosca, 1995 a). Essentially, entropy is a measure that takes into account both
the quantity of classes and their distributions. Maximal entropy occurs for a population
with a given number of classes when all of those classes ﬁll the same proportion of the
population (i.e., there is maximal evenness). Ray’s (1994) use of a genetic entropy
measure is a good example of what can be learned by using an entropy measure. In his
system, Ray was able to show that sharp drops in entropy corresponded to the

introduction of new innovations, which swept the population.
3.1.3 Variation

There are many different measures in genetic programming that are labeled
“diversity”. However, many of these measures use statistics at the level of individual
comparisons. To help distinguish between these and the diversity measures mentioned in
Section 3.1.2, I am going to call them variation measures. There are four categories of
variation used in this thesis: phenotypic, genetic, inferiority, and information relating to

the most recent common ancestor (MRCA). These categories use measurements that

37

move away from looking at classes and their abundance within a population and focus on
comparisons between individuals.

The ﬁrst variation category utilizes measurements of phenotypic traits. Since
there is no good way of quantifying behavior, I will focus on ﬁtness and morphological
traits. The morphological traits I use include height, xz-distance moved, number of
point-masses, and number of springs. These traits will be visualized in two ways: as an
overall average of that trait across an entire population over time and how those traits
change along a lineage. The changes in the ﬁve phenotypic traits should help distinguish
convergence and stagnation factors.

The second category is genetic variation, which is based on edit distance. This
category is broken into two measures intended to quantify the relationship between two
genomes. The ﬁrst measure quantiﬁes how dissimilar are two genomes while the second
measure quantiﬁes the amount of Similarity as the proportion of shared genetic material.
Both measures are based on the length of the two genomes being compared and the edit
distance between them. Also, the relationship between the two measures is that the
similarity equals one minus the dissimilarity when both genome lengths are equal. The
speciﬁc formulas are given in Section 3.2.

The use of edit distance itself is commonly used in evolutionary algorithms to
determine genetic diversity and variation. In genetic algorithms, Hamming distance is
used because each genome is often a ﬁxed length linear array of bits. Genetic
programming had to develop specialized edit distance formulas due to the use of a tree
style genome representation. According to (Burke et. al., 2004), the two most common

edit distance measurements in genetic programming are 1) the number of changes

38

required to transform one tree into another and 2) weighted edit distance (Ekart and
Nemeth, 2002). The second method ﬁrst ﬁlls in each tree with empty nodes to make it
full, then computes the edit distance to be a weighted sum of the Boolean difference at
each node, where the root has the most weight.

My method for calculating edit distance is Levenstein distance. The basic
algorithm calculates distance as the sum of the number of mutations, insertions, and
deletions required to go from one string to another. In this situation, the genome is the
string and the instructions are the characters. To simplify edit distance calculations, I
consider sites from two genomes identical only if both instructions and arguments are a
perfect match.

The inferiority measure comes from artiﬁcial life. Adarni (1998) created this
measure as an analog to energy in a physical system, where the population has a pressure
to reduce its average inferiority to‘ zero, called the ground state. This idea represents an
overall trend for the population to converge. Originally, inferiority was deﬁned in terms
of an individual’s replication rate, but it was trivial to convert it to using ﬁtness. Here,
inferiority for an individual is the difference between the best ﬁtness in the population
and the ﬁtness of that individual. Thus, the average inferiority is the average difference
in ﬁtness for the population.

The ﬁnal variation category contains three measures that utilize the most recent
common ancestor (MRCA). The ﬁrst measure is simply the generation of the MRCA for
the ﬁnal population of a given run. In some sense, this value is an estimate of the
phylogenetic diversity, but there is a potential confounding effect caused by outlier

individuals within the tree. These individuals diverge early relative to the rest of the

39

individuals in the population. This can cause a skew in the MRCA generation when the
majority of the individuals are closely related. Even with this confounding effect, this
measure can still provide useful information. Intuitively, the closer in generational time
the MRCA is to the last generation, the more likely the genetic variation will be low since
there would be little time to diverge. Likewise, the further back in time the value, the
more likely the ﬁnal population has a diverse set of genotypes. However, if the time
difference between the ﬁnal generation and the generation of the MRCA grows too large,
it can indicate low selection pressure, or even no selection.

The other MRCA measures are averages that use the similarity measures from the
genetic variation category. Application of those similarity measures occurs between the
MRCA and each individual in the ﬁnal population. The average dissimilarity measure
approximates the level of divergence between the MRCA and the ﬁnal population as the
average proportional difference. This measure is an estimator for the change in variation
over time, which in turn can be used to approximate the likelihood for convergence. The
average similarity measure approximates the quantity of genetic material from the
MRCA conserved in the ﬁnal population. This measure can potentially indicate if a new
phenotype has arisen during a run. It also has the potential to indicate that the population
is incrementally improving on some phenotypic aspect represented by the conserved
genetic material.

The idea of using the average Similarity measures with the MRCA was inspired
by the work of McPhee and Hopper (1999), which was later rechecked Daida et. a1.
(2004). Their work examined how much genetic material from the initial population

contributes to the ﬁnal population in a genetic programming system with crossover but

40

no mutation. This experiment Showed that all else being equal, the situation forces severe

loss of diversity in both genetic material and genetic structure. The reason for this was

that crossover could not introduce new genetic material, and when something was lost, it

could never be regained.

3.2 Diversity and Variation Measures

The following is a short description of each diversity and variation measure

accompanied by a formula when appropriate.

Pgulation Diversity Measures:

Class Richness- It is simply the count of the number of unique classes in a
population.
Shannon Diversity (Entropy)- Measure of disorder in the population, where N is

the population size and Nj is the number of individuals in class j.

Nj Nj
—2.—I—V_logNN <3“
J

Simpsons’s Diversity Index - A measure that accentuates uneven class
proportions in a population by weighting each as the square of its abundance,

where N is the population size and Nj is the number of individuals in class j.

1—§—1V7£((—:—Jg)1—) (3-2)

41

McIntosh’s Diversity Index - A measure that treats the proportion for each class
as a position in n-dimensional space, where n is the number of classes. The
position for a given class is only along its own axis at a distance from the origin
equivalent to its proportion in the population. The measure of diversity is then

calculated as the average distance. N is the population Size and Nj is the
number of individuals in class j.

N— lZNZ.
J- J (3.3)

N-«W

Variation Measures:

Average phenotypic trait over time - The sum of the trait value for a given
population divided by the population size.

Similarity- The proportion of genetic material in the smaller genome that also
appears in the larger one, where L is genome length and D is the Levenstein

distance.

Lmax
L . (3.4)
mm
Dissimilarity- The ratio of the differences between genomes to the length of the

longer genome, where L is genome length and D is the Levenstein distance.

D
f— (3.5)
max

42

 

o Inferiority- The difference between the most ﬁtness individual and the ith
individual in a given population.
Emax ‘ 81' (3-6)
0 MRCA generation- The generation of the MRCA for the ﬁnal population.

0 MRCA edit distance- The average edit distance between the MRCA and each

individual from the ﬁnal population.
3.3 Experimental Design

To accomplish the goals of this chapter, ﬁfty replicate runs were generated using
the WhirlingDervish system. At the completion of those runs, all of the data were
processed and the necessary information was extracted. Section 3.4 discusses the results
from the application of the diversity and variation measures to the extracted data.
However, to better understand the results, this section provides an overview of the
conﬁguration and important aspects of the WhirlingDervish system.

The conﬁguration uses a ﬁxed population size of 500 individuals. With the initial
population consisting entirely of empty genomes, the run continues for 499 generations.
At the end of each generation an entirely new population is created using ﬁtness
proportional selection with an elitism of one. Once an individual is selected to generate
an offspring, its genome is copied instruction by instruction into the genome of the
offspring. This process occurs external to the individual; the individuals do not
technically self-replicate. As each instruction is copied into the offspring’s genome, there
is a chance mutations will occur based on a rate computed prior to the copy process. That

rate is genome length dependent and set such that there is an average of one mutation per

43

 

offspring. The four types of possible mutations are point, insertion, duplication, and
deletion. No crossover is currently used in WhirlingDervish.

The evolutionary goal of the WhirlingDervish system is to produce mobile
individuals that move consistently over time. Approximating this goal, a simply additive
ﬁtness function deﬁnes selective pressures using the following measurable phenotypic
traits: (1) the xz-distance between the original center of mass and the ﬁnal center of mass,
(2) the number of point-masses, (3) the vertical distance from an individuals lowest point
to its highest point, and (4) the height from the lowest point to the center of mass. Both
the xz-distance moved and the height to the center of mass are measured in meters after a
two—minute evaluation period and scaled by a factor of 3, and the number of points is
scaled by the value 0.0254. Also, a few of the primary constraints consist of a cap on the
ﬁtness beneﬁt from the number of point-masses at 10 and a cap on the two height traits at
15 meters, but these constraints do not restrict any trait from exceeding their caps.

The ﬁtness values generated during a run must be recomputed prior to analysis.
The primary reason that I compute the ﬁtness again is to remove the stochastic
perturbations that are normally applied to each point-mass in a newly generated
individual before its evaluation period. These perturbations normally cause multiple
ﬁtness measurements to vary widely for the same individual. Thus, without these
perturbations, the true ﬁtness value of an individual is computed and could be used for
comparisons across runs. The other reason to recompute ﬁtness is to achieve a consistent
ﬁtness measure that will correlate better with visual analysis Since the Speciﬁc

perturbations used during the experiment are difﬁcult to duplicate and are not applied.

44

3.4 Results and Discussion

This experiment examines the occurrence and likelihood of convergence and
stagnation in the WhirlingDervish system under the conditions described in Section 3.3.
This section will ﬁrst examine convergence by exploring the results of different diversity
and variation measures. The same will then occur for stagnation, which will draw on
some of the measures and conclusions for found for convergence. However, to be clear,
the results and conclusions drawn from this experiment only apply to the limitations
given in the design. It is entirely possible that given a long enough period of
evolutionary time, the results may differ. At a future time, the procedures used to
ascertain results for this experiment will be conducted on longer runs and multiple data

sets for veriﬁcation.
3.4.1 Convergence
This section utilizes genetic diversity and variation measures in order to examine

convergence. In the process, I explore the selection pressures within the system and their

impact on the assertions drawn from the various measures.
3.4.1.1 Genetic Diversity

To recognize convergence in a system, the ﬁrst step is to examine its genetic
diversity. Figure 3.1 shows the genetic diversity as measured by the four diversity

measures described above in Section 3.2.

45

Genetic Richness

 

a.
, ~ 1 1~ “'1.- Wm“:

 

 

 

 

300
E
u 200
100 ----- Null Model
-— Average
- —~—— Individual Run

 

 

 

 

O0 100 200 300 400 500
Generations
A
McIntosh Genetic Diversity Measure

 

 

 

 

 

 

 

 

1 rm f
0.8
3,116
g
.2
o 0.4
0.2 ----- Null Model
— Average
- Individual Run
0 I I I 1_
O 100 200 300 400 500
Generations
C

Entropy Genetic Diversity Measure

 

0.2 ----- Null Model .
— Average
Individual Run

0 I I
O 100 200 300 400 500

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Generations
B
Simpsons Genetic Diversity Measure
1 1' - - . -
0.8
grits
g
.2
o 0.4
02 "'"' NUII Model
—— Average
__._.--__ Individual Run
0 I I I I
0 100 200 300 400 500
Generations
D

Figure 3.1. Genetic diversity using the following measures (A) richness, (B) entropy, (C)
McIntosh’s diversity index, (D) Simpsons’s diversity index. Each graph contains 50 grey
lines, where one line represents the diversity over time for a single run. The thick black
line is the average diversity across all of the runs, and the thick dashed line represents the
expected value given by a null model where no selection is present.

The graphs in Figure 3.1 clearly show a high level of genetic diversity after an initial loss

and recovery near the beginning. The McIntosh and Simpsons’s diversity indices Show a

maximal diversity over the majority of the run, while the richness and entropy measures

 

Show a high and Slightly increasing diversity. Therefore, these graphs strongly indicate a
lack of genetic convergence.

In an attempt to understand why the genetic diversity is so high, a null model was
created to estimate he number of unique genotypes that would occur in the absence of
selection. These values are represented as a thick, black dashed-line in Figure 3.1. While
it is interesting that the genetic diversity is high for most of the run, it appears to be close
to the null model estimate. This suggests a weak selective pressure, most likely brought
about by the potential accumulation of a high number of neutral mutations. Figure 3.2A
shows the percentage of neutral, detrimental, and beneﬁcial mutations at each generation.
The weak selective pressure also means that it is hard to weed out individuals and sweeps
will occur infrequently, if at all. It may also lead to a high distribution of groups within
the population that have approximately equivalent ﬁtness, and are fairly unrelated within

a phylogenetic tree. Figure 3.2B is an example of phylogenetic tree.

47

Proportion
:0 .0 o .0
(a) Ill 01 0')

.0
N

0.1

Proportion of Mutation Type per Generation

 

 

Beneficial
Neutral
Detrimental I,

 

o 100 M 20 300 400

Generation
A

48

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

l l l

350 400 450 500
B

Figure 3.2. (A) The proportional representation of neutral, beneﬁcial, and detrimental
mutations for each generation. (B) The end of a phylogenetic tree including the most
recent common ancestor of the ﬁnal population. This particular tree has a later than
average most recent common ancestor and was chosen to show that even a tree this
young has many groupings containing long branch lengths.

In view of the lack of convergence and the weak selection, it is important to

understand some of the mechanisms underlying the evolutionary process. First, all of the

49

genomes in the ﬁrst generation are empty, which means that the second generation is
always comprised of genomes with at most one instruction. There is only an 8% chance
that a non-neutral insertion will occur, and all of the individuals with non-zero ﬁtness
will have the same ﬁtness value at this point in time. Once the population has a mixture
of zero and non-zero ﬁt individuals, the individuals with zero ﬁtness have no chance of
reproducing, thus all of the non-zero ﬁt individuals will be the only progenitors.
However, because ﬁtness proportional selection is in use, all of the individuals with non-
zero ﬁtness will have the same chance at reproduction.

This process will continue for the ﬁrst few generations until the evolution of a
stable individual with height, which should require at least four point-masses to not fall
over. The ﬁrst individuals that retain height will begin to sweep the population, which is
noticeable in Figure 3.1. However, the sweep is weakened as successful offspring are
now in competition with only small relative differences in ﬁtness between them, which
will reduce selective pressure again. It is understandable that the transition from
receiving ﬁtness beneﬁt from only the number of point-masses, which is small, to
individuals with height would have the largest ﬁtness differential and hence the most
likely to trigger a sweep. The ﬁtness gains from the other components of ﬁtness are
gradual and provide a more constant, progressive improvement, which dramatically
reduces the likelihood of a full sweep. The term sweep here refers to a single genotype
rapidly ﬁlling the entire population with its immediate descendents because of its high

ﬁtness relative to the other individuals.

50

3.4.1.2 Genetic Variation

Genetic variation has the potential to strengthen the usefulness of diversity
measures by quantifying the difference between classes within a population. This section
examines how the use of genetic variation contributes towards characterizing
convergence. I begin by examining the average similarity and dissimilarity between all

pairs of genomes for each population over time (Figure 3.3).

 

 

 

 

 

 

 

 

Average Population Dissimilarity Average Population Similarity
1 .
Average
Individual Run
0.8 1
.3' .
E :g‘ 0.6
.g z:
a -- o I
a "’ ‘
0.2 0.2
Average
Individual Run
0 A e + 1 1 o 1 1 1 1 1
0 100 200 300 400 500 0 100 200 300 400 500
Generation Generation
A B

Figure 3.3. (A) Average dissimilarity over time (B) Average similarity over time

In Figure 3.3, there is rapid variation in average similarity and dissimilarity,
particularly during the early generations. While some runs exhibit large deviations from
the average across populations, most do stay near that average. It is also interesting that
the initial rapid change in the average accompanies the genotype explosion mentioned
above for genetic diversity. As a sweep begins wiping out radically different genotypes
and replacing it with offspring, the dissimilarity drops, and the average shared genetic

material increases, as shown in the ﬁrst 50 generations. Interestingly, the averages

51

continue their trend for another 50-100 generations after the Spike representing the loss of
genetic diversity.

I suspect the continued trend in the averages is due to slight domination, however
random, of a few phylogenetically related groups before the groups have grown in size
enough to ensure their representation in the next generation. One of the most useful
pieces of information is continual increase of dissimilarity, and decrease in Similarity, for
the rest of the run after the initial, respective thrusts. These continuous trends support the
high genetic diversity found above. However, with the knowledge that the neutral
mutations increasingly dominate over the other mutation types, these trends are most
likely caused by the accumulation of neutral material in the genomes. A way of testing
the assertion that genomes are accumulating neutral mutation, and potentially improving
the signiﬁcance of both measures, would be the removal of all neutral instructions before
the measures are applied. However, that test is beyond the scope of this experiment due
to the computational expense.

The other issue with Figure 3.3 is that the average across runs of the average
dissimilarity, and similarity, do not show the oscillations apparent in some of the
individual runs. In order to better understand what these oscillations represent, I examine
a single run with oscillations in Figure 3.4, which shows average ﬁtness and average
dissimilarity over time. What is interesting about this ﬁgure is that each time ﬁtness
increases, dissimilarity brieﬂy drops, presumably due to a hitchhiking effect as the new

best solution takes over the population.

52

Dissimilarity and Fitness

0.9
0.8
0.7
0.6
0.5

0.4

Normalized Values

0.3

0.2

 

 

Average Fitness
— Dissimilarity

0 100 200 300 400 500
Generation

0.1

 

 

 

 

 

Figure 3.4. Average dissimilarity and average ﬁtness over time.

3.4.1.3 MRCA Variation

This section examines the use MRCA variation towards characterizing
convergence. The following graphs (Figure 3.5) show results of the variation measures

utilizing the MRCA.

53

Generation

500

450

400

350

300

250

200

150

100

50

Generation of MRCA for each Run

 

I

 

 

 

 

 

l I I
l L I

I
l

 

 

 

MRCA

54

Difference/Similarity of Most RCA and Final Population

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

1 r T I d
I
l
0.9 — g a
0.8 — I J
l
l
0.7 F l _
+ +
0.6 — + _
8 +
3
g 0 5 e + -
l
0.4 — ' —
+ 1
l
l
0.3 — ‘ I
0.2 — J
l
l
0.1 - E T
0 l l
Dissimilar Similar

B

Figure 3.5. (A) A boxplot depicting the generation of the most recent common ancestor
for each run. (B) A boxplot depicting aggregate Dissimilarity/Similarity between the
most recent common ancestor and all of the individuals in the ﬁnal population for each
run.

It is apparent from Figure 3.5A that the average MRCA occurs at approximately

th . . . . . .
the 350 generation, Wthh IS 70% of the total run time. Usrng this value as a rough
indicator of diversity, suggests a moderately diverse ﬁnal population. Figure 3.28
showed a phylogenetic tree whose MRCA occurs at the later end of the spectrum

suggesting that if a tree that late in time has such a diverse grouping of leaves, then it is

reasonable to expect the ones who are earlier would also be diverse. I veriﬁed this

55

 

assertion by visually examining the phylogenetic tree for each run. The other interesting
item about the MRCA generation data is that the earlier half of the generation times falls
within 40-70% of the total run time. These earlier generation times may indicate a low
selective pressure because of a lack of domination from any given group of individuals,
which agrees with the results shown above in Section 3.4.1.1.

When examining the dissimilarity of the MRCA individual and the individuals
from the ﬁnal population, the average dissimilarity is slightly higher than 80%. The
increase of genetic dissimilarity can arise due to a combination of genetic drift and the
insertion of new genetic material. The data do not distinguish between the two methods
that increase dissimilarity, so without extraneous information, the average dissimilarity
indicates a low likelihood of genetic convergence. The few outliers with low
dissimilarity Show that at least some of the runs are likely to have converged. However,
with the low selection pressure established earlier, it is unclear how much the converged
runs dominate the results the dissimilarity results of Figure 3.3A.

The similarity measure shown in Figure 3.58 shows that on average about 20% of
the genetic material is conserved from the MRCA individual. I conclude that some
kernel of genetic material is being conserved and the continued increase in ﬁtness occurs
by both tweaking and building off of this kernel. It is also not possible to tell the relative
importance of the conserved material versus the recently generated material in regards to
the phenotype. Thus, the difference in proportion of conserved material between runs
could relate to the variable amount of genetic material to generate their different

phenotypes.

56

 

3.4.2 Stagnation

This section utilizes phenotypic variation measures and ﬁtness distributions in

order to examine stagnation.
3.4.2.1 Fitness Improvement and Morphological Convergence

One simple way of monitoring an evolving population is to examine the change
over time of various phenotypic traits. This can be visualized in two different ways. The
ﬁrst is to average the trait across an entire population. The second involves ﬁnding a
representative individual at each generation. Two common methods for choosing a
representative are selection of the most-ﬁt individual from the population at each
generation, or selection of an individual from the lineage that led to an individual
(typically the most-ﬁt) in the ﬁnal population. Even though using the most-ﬁt individual
is more common in evolutionary computation, I chose to select individuals from a lineage
because these data are less noisy. Figure 3.6 visualizes the noise difference between the
most-ﬁt individuals and lineage individuals in terms of height. This noise is also evident
in all of the other phenotypic traits, though less extreme in ﬁtness and distance moved

data.

57

 

Fitness for each Individual in Lineage

Fitness of the Highest F it Individual

0 ,
Average
7' Individual Run

   
   

100

Fitness

Fitness

  

501

'1. ' l. _ 6

”in"!!!
.‘ ‘ ‘ I)": | ‘y_ -._
‘ ..n.;g.yl,,._!, _.

   
 

 

  
  

 

 

 

 

 

f w‘o“ r, A . - —
O ‘2... ,I‘ .. .. T. I. , ‘ I I
0 100 200 300 400 500
Generations Generations
A B

Figure 3.6. Noise difference between data deﬁned by (A) lineage and (B) most-ﬁt
individuals per generation.

Two sets of visualizations follow that present all ﬁve phenotypic traits over time
along a lineage (Figure 3.7) and averaged across the population (Figure 3.8).

Fitness for each Individual in Lineage Distance Moved for each Individual in Lineage
I 5 .

 

Average JV". 'lt

Individual Run "

L . .3" - ,. .1, 11". I;

1°°’ "1 7i... (171:; 1 31‘?

Fitness
Distance

 

 

 

 

 
 

 

 

  

 
 

--.+ z.

   

 

0 100 200 300 400 500 0 100 200 300 400 500
Generations Generations
A

58

Height for each Individual in Lineage

 

Average
* Individual Run

Height
CI
1
i
l
l
l

 

 

     

o 4 A i_l I 4
0 100 200 300 400 500

 

 

 

 

 

 

 

 

 

Generations
Points for each Individual in Lineage Springs for each Individual in Lineage
70 Average
Individual Run .
60 ’ l ; l ‘
g S , ,.
c ’ , . . ,_ , ..
‘6 '5. 50
‘L m
‘6 z r.- 40 i
, o
a 3 ._
0.)
-° .0 301
E 6 E
Z 1 3 .
4 E z 20
2 10
0 1 1 1 1 1 0 4 1 1 1 Q
0 100 200 300 400 500 0 100 200 300 400 500
Generations Generations
D E

Figure 3.7. Phenotypic trait changes along a lineage for the traits (A) ﬁtness, (B) xz-
distance moved, (C) height, (D) number of point, and (E) number of springs.

59

Average Fitness per Generation

   
   

100

. I111 1 11.111111» .-

. —Average
' i’lndividual Run 111 I .

     

Fitness

 

I'vilr‘i’ttt
.

 

Generations

Average Height per Generation

(0 & UIO!

Heidit (Meters)

N

 

 

 

 

300

200
Generations

Average Distance Moved per Generation
0

  
       

 

— Average
'* Individual Run
‘5‘
«g 25-
23.
o 20-
2
,E 15
o 10 _
. ‘ 123-7“
5 "“‘ ii
0
0 100 200 300 400 500
Generations

60

Average Points per Generation Average Springs per Generation

 

 

 

    

 

 

 

 

 

 

4 70
Average .- a .L, Average ; ;.
12’ , 7 Individual Run . . , 60 Individual Run " ’
g 10 .r“ ‘ U Viﬁl. h f 3:: Y K a 5° ‘1‘:;”-‘::; :N‘A‘v H‘- N
a .r , J‘ I: ‘ -
.5 . . . a, '5.
CL 8L 5 * ‘ m 40»
‘2' ,. ~a
g 6 y’ E 30-
. E
5 4 -‘ :7 5 20»
2 10
o A A A J J o A l I A i
O 100 200 300 400 500 0 100 200 300 400 500
Generations Generations

E

Figure 3.8. Average phenotypic trait change across each population over time for the
traits (A) ﬁtness, (B) xz-distance moved, (C) height, (D) number of point-masses, and (E)
number of springs.

From the graphs in Figures 3.7 and 3.8, height appears to be the only trait where
the average has converged, while the averages of the other traits are still gradually
increasing. This supports the argument that, by 500 generations, a given run is less likely
to stagnate in WhirlingDervish because it is clearly exploring alternate or improved
phenotypic avenues, which corresponds to the continued improvement in ﬁtness. While
stagnation does not appear to occur, it is interesting that the averages begin by rapidly
increasing but soon settle out to a slower rate of change. This may indicate stagnation for
all runs may occur in the future or more runs are stagnating over time.

During the ﬁrst 150 generations, most many traits seem to experience wild
oscillations (Figure 13.8), which persist to a lesser degree throughout the entire run. As
these ﬂuctuations become less frequent and the rate of change slows, some strata clearly
form. The high rate of change in the early generations indicates that a run should persist

for at least 200 generations before a stable solution is likely to be found. It also

61

corresponds to the initial steep increase in average ﬁtness. I suspect the reason for this
correlation is that xz-distance moved does not begin to play a key role in ﬁtness until
later in a run.

Furthermore, morphological changes other than xz-distance moved, may have
little to no selective pressure to increase once distance moved dominates ﬁtness, unless
that change also improves the movement behavior. It is difﬁcult to say how easy it is to
change the morphology through mutation when it has little remaining selection pressure.
Some evidence of this difﬁculty is apparent in the formation of strata for height, number
of points, and number of springs. Clearly changes do occur and the strata are not
completely ﬁxed. Unfortunately, to understand this phenomena will require a correlation

between morphological change and change in ﬁtness and distance moved.
3.4.2.2 Population Diversity Based on Various Phenotypic Traits

Now that I have shown the lack of stagnation during the time period observed
(due to the continual increase in the average ﬁtness and the average lineage ﬁtness), I will
qualify this assertion by applying diversity measures to ﬁtness, xz-distance moved, and
height. Figure 3.9 shows the richness measure for the three phenotypic traits. It is
apparent that richness for three traits gives the same information. Therefore, ﬁtness is
chosen to represent the other phenotypic traits for the contribution of phenotypic diversity
towards characterizing stagnation in order to be optimally comparable with other
evolutionary systems and diversity studies. Figure 3.10 shows the application of the four
diversity measures to ﬁtness, and all four give the same information. From the ﬁgure, it
is apparent that after an initial jtnnp in diversity, it begins to decrease for the rest of the

run. This information supports the assertion that stagnation has not yet occurred.

62

However, the trend of decreasing phenotypic diversity suggests that stagnation may be

likely to occur for longer runs.

    
  

    

 

 

Phenotypic (Fitness) Richness Phenotypic (Distance Moved) Richness
500
400 400
300 300
‘5‘ §
8 8
200 200
100 100,
Individual Run
0 A A A A A o A A A A A
0 100 200 300 400 500 O 100 200 300 400 500
Generations Generations

A B
Phenotypic (Height) Richness

400

300 .,,

Count

200

Averag
individual Run

0 100 200 300 400 500
Generations

C
Figure 3.9. Richness measures of (A) ﬁtness, (B) xz-distance moved, and (C) height.

   

 

63

Phenotypic (Fitness) Richness Phenotypic (Fitness) Entropy

4oo .I

300

Count

200

       

“MA... it"

a}... 1 .
Averag ‘
Individual Run

 

 

 

 

' Individual Run
0 A A o A A J A A
O 100 200 300 400 500 0 100 200 300 400 500
Generations Generations
Ph

enotypic (Fitness) 3 impsons

       

- “I; AAA,

_ , M Wliiia-I V

 

 

 

 

 

 

0.8
3‘ 3- 0.6
8 , 5
.2 z .1 , .2
o 0.4 , ,v , a, . o 0.4
A. A ., h' 0 ill" ,: .175
it I. (VI 3,. |
0.2 0.2
Average —— Average
Individual Run ____w Individual Run
0 A L A A A o A A A A A
O 100 200 300 400 500 0 100 200 300 400 500
Generations Generations
C

D

Figure 3.10. The four population diversity measures applied to ﬁtness. (A) richness, (B)
entropy, (C) McIntosh, and (D) Simpsons.

3.4.2.3 Average Inferiority and Fitness Distributions

By using inferiority and the distribution of ﬁtness values at discrete time points, I
will now examine the variation of phenotypic traits. The inferiority of an individual is
the difference between its ﬁtness and the best ﬁtness in its population.

Average

inferiority is a rough estimate of phenotypic variation and is a strong indicator of when a

64

population is going through a transition. Thus, as the quantity of improvements
decreases or the selection pressure is too light, average inferiority will decrease.

Figure 3.11A shows average inferiority over time for each run and Figure 3.11B
shows average ﬁtness over time. While some of the increases in average inferiority are
mufﬂed in the noise of the others, a number of sharp increases, many of which are spikes,
stand out. Using the outlier, it is clear that a sharp increase accompanies the sudden jump
in average ﬁtness. As this average ﬁtness continues to rise more slowly, the average
inferiority begins to decrease. Since I know that selection pressure is low and decreases
over time, successful individuals should produce more offspring with neutral mutations,

thus imparting similar ﬁtness and lowering inferiority.

 

 

 

 

 

 

 

 

 

 

 

 

   

 

Average Inferiority Average Fitness per Generation
300' 350.
— Average Average
250. mm Individual Run 300* Individual Run (Eli il'im'
W” 17“ I‘I'I II m, ('4 , '
250- =‘
3g 200. A, J , AM, Ital}
'5 ” Aim
{:3 I’ g 200- ,4, .
; 150 A, k g .
a t “-
g 100» l
<
50~
O ‘lOO 200 300 400 500 0 ‘IOO 200 300 400 500
Generation Generations
A B

Figure 3.11. (A) Average inferiority per generation, (B) Average ﬁtness per generation

The decrease in average inferiority of the outlier suggests that the run begins to
lose phenotypic diversity. Examining the average of the average inferiorities (Figure
3.12) may shed some light on what is happening in the system. For instance, why does

the average of the average inferiorities increase and then level off? To answer this

65

question, seven marks are placed along the average of the average inferiorities at different

points in time with three on the increase, three on the level portion, and the last one.

Average Inferiority Across Populations
0.7 .

9.0.09.0
Nubmm

Average of Average Inferiorities
.0

 

 

0 100 200 300 400 500
Generation

Figure 3.12. Average Inferiorities across populations

Then, for each of the seven time points, I found the normalized ﬁtness distribution
averaged across all ﬁfty runs (Figure 3.13). The idea for using ﬁtness distributions is
partially inspired by the work of Popovici and De Jong (2003) who used ﬁtness
distributions as a method of characterizing evolutionary algorithms. I use the ﬁtness
distributions to help determine the driving force for the change in the average of average

inferiorities.

66

Avg. Fit Distribution across all runs (Gen 50) Avg. Fit Distribution across all runs (Gen 100)
3000 3000

 

 

2500 2500

2000 2000

1500 1500

1000 1000

500 500
00 0.2 0.4 0.6 0.8 'l 00 0.2 0.4 0.6 0.8 1

A B
Avg. Fit Distribution across all runs (Gen 150) Avg. Fit. Distribution across all runs (Gen 250)

 

 

3500 - 3500
3000 3000
2500 2500
2000 2000
1500 1500
1000 1000
500 I l I I I 500
00 0.2 0.4 0.6 0.8 1 00 0.2 0.4 0.6 0.8 1
C D

67

Avg. Fit Distribulion across all runs (Gen 350) Avg. Fit Distribution across all runs (Gen 450)

 

 

   

 

 

3500- 3500F
3000 3000
2500 2500
2000 2000
1500 1500
1000 1000
500 500
0 0
0 (l2 014 (16 (18 1 0 (12 014 (16 (18 1
IE F
Avg. Fit Distribution across all runs (Gen 499)
3000-
2500
2000
1500
1000
500
0
0 (L2 ()4

 

Figure 3.13. Average ﬁtness distributions at generational time points (A) 50, (B) 100,
(C) 150, (D) 250, (E) 350, (F) 450, and (G) 499.

The average ﬁtness distributions are constructed from the ﬁtness values of an
entire population across all runs for a given point in time. First, each population scales its
ﬁtness values by dividing by the maximum ﬁtness within that population. Once scaled,
all of the ﬁtness values are lumped into a larger set of ﬁtness values that range from zero

to one. This essentially increases the sample size represented by each histogram to the

68

number of runs times the population size (in this case 50 * 500 = 25,000). One of the
biasing factors is the fact that there will always be a minimum number of values at 1.0
equal to the number of runs. However, the relative fraction of data points this represents
in terms of the total number of data points is small enough to ignore. Also, the
distribution only scales the ﬁtness values, it does not shift them. Thus, any values that
fall into bins at or near zero must have those low ﬁtness values.

The change in the average inferiority across populations can be explained by the
trend in the ﬁtness distribution over time. Figure 3.13 shows the ﬁtness distributions in
WhirlingDervish tend to be bimodal around 0.0 and a positive transient mode. In the
early stages of the run, the second mode begins to move toward the other end of the
ﬁtness spectrum. Once the mode has established itself near 1.0, its movement is slowed
and lesser ﬁt individuals are gradually pulled toward it. Even though the right mode is
pulling more and more individuals toward it, the left mode still retains a fair chunk of
individuals with low ﬁtness, which I assume correlates with the mutational trends shown
in Figure 3.2A. The changing ﬁtness distributions trends explain the changes in the
average inferiority across populations. The increase in the average of the average
inferiorities corresponds to the second mode shifting to the right, where the curve levels
off when both modes are near their ﬁnal positions. The ﬂuctuations in the average are

the ebb and ﬂow of a small portion of the individuals between these modes.
3.5 Conclusions

From the results discussed in Section 3.4, many conclusions can be drawn about
the WhirlingDervish system. In particular, this chapter focuses on determining whether

or not stagnation and convergence occur. The data clearly demonstrated that no

69

convergence occurs, at least in the time frame observed, as all of the genetic diversity
measures showed high and increasing diversity. After looking at genetic variation, I
determined that the low selective pressure was a primary factor in the high diversity. In
terms of morphological convergence, it appears that on average individuals continue to
change slightly over time. However, the measures were too noisy to make strong
assertions about convergence on a ﬁxed morphology, or lack thereof.

The data supporting the likelihood of a given run to stagnate are not as strong as
the support for convergence, but still indicates that stagnation is unlikely. It is clear that,
on average, the ﬁtness is continuing to increase, and the other phenotypic traits continue
to change over time. However, when the system is viewed in terms of variation, it
appears to be losing phenotypic variation, indicating movement toward future stagnation.
However, my results from testing inferiority and the change in ﬁtness distributions over
time, throw doubt on how likely it is for a stagnation event to occur in the future. The
other stagnation related results show that the system is in a state of ﬂux in the early
generations, and that under the experimental setup given above, a run should always go
for at least 200 generations in order to ﬁnd a stable result.

Other conclusions drawn from this chapter relate to the use of diversity and
variation methods and their usefulness. In terms of population diversity, little if any extra
information was found using more than one measure for both genetic and phenotypic
diversity. The measure that both represented the most information and provided the most
contrasting results was Shannon diversity (entropy). Therefore, I will use this measure as
the primary measure of population diversity for WhirlingDervish data. However, without

other sources of data to compare, this result may be speciﬁc to the experiment conducted

70

here, and may vary with other experimental designs. Thus, the other measures should be
examined before discarding in future work.

The use of ﬁtness distributions did provide useful information about the system.
However, due to the scaling of the ﬁtness values in order to produce an average
distribution, it essentially strips out the quantitative portion of the data. Thus it can never
show an inﬁnitely increasing distance between the two modes, only that the distributions
are near maximized. Using the average inferiority appears to compensate for this deﬁcit
due to the fact that one mode does not move, thus increasing average inferiority means
the mode is either moving to higher ﬁtness values or pulling more individuals from the
other mode. However, the average inferiority cannot characterize a ﬁtness distribution.
Therefore, it seems that these measures are more informative together than they are
individually, at least in terms of stagnation.

In conclusion, this chapter has shown that convergence did not occur in
WhirlingDerivsh, and as time continues, the selective pressures continue to decrease. It
has also shown that at least under the design of this experiment, stagnation is unlikely.
Lastly, these measures have shown themselves to be potential tools for characterizing

convergence and stagnation for the WhirlingDervish system.

71

Chapter 4 Understanding the Fitness Function

Drawing upon my previous work with diversity and variation measures (Chapter
3), I analyze the default ﬁtness function in WhirlingDervish. I systematically focus on
each ﬁtness component to understand its role in the evolutionary dynamics of the
population. My overarching goal is to ﬁnd a generalized computational method of
comparing systems that evolve 3D structures and behaviors.

Few studies have been conducted on ﬁtness schemes in systems that evolve 3D
morphology and behavior. The primary study is by Karl Sims (1994b). He published
two separate works, one on explicit ﬁtness functions to evolve nontrivial individuals
capable of walking, swimming, and jumping, and one on the use of competition to evolve
behaviors centered on gaining possession of a cube placed in the environment. In the
competition experiment, Sims examined different types of tournament competition and
the use of species, or islands, on the evolved creature behavior. Through competition he
expected to see an increase in behavioral complexity from the creatures evolved versus
the ones evolved under an explicit ﬁtness function, but did not explicitly test this
hypothesis. The evolved behavior may have been sufﬁciently different that a good
comparison was not possible. Regardless, Sims did demonstrate that both ﬁtness
methods could successfully evolve interesting behavior.

Since Sims’ work, most 3D morphology and behavior systems use an explicit
ﬁtness function based primarily on the distance an individual moves. Other phenotypic
traits are also used but tend to vary from system to system. The premise for most ﬁtness
functions is often derived from Sims’s (1994a) work, which used velocity as the ﬁtness

function for both swimming and walking. In at least one of Homby and Pollack's

72

(2001b) systems, the ﬁtness function took the difference between the xz-distance moved
and the total distance all of the physical components were dragged along the ground,
which was intended to encourage a rolling or stepping behavior. The ﬁtness function
used in the Framsticks system (Komosinski, 2000) is based on a few phenotypic
characteristics such as age, velocity (distance over age), and distance traveled by the
center of mass. In the recreation of Sims's work by Taylor and Massey (2001), they used
multiple ﬁtness ﬁmctions based mostly on the distance an individual moved. Due to the
wide usage of a static ﬁtness function based on distance moved, I designed a similar
ﬁtness function for the WhirlingDervish system.

The default ﬁtness function in the WhirlingDervish system is an explicit ﬁtness
function, which I will call the “Base” ﬁtness function. The goal of the Base ﬁtness
function is to evolve a body structure capable of non-trivial, mobile behavior. Due to the
abstract nature of this goal and a current inability to automatically categorize mobile
behavior and structure, I designed a function based on a set of phenotypic characteristics
that each reﬂect an aspect of the evolutionary goal. Some phenotypic characteristics are
also used as constraints and ﬁtness penalties to avoid speciﬁc trivial solutions and reduce
the computational time required to test the population each generation.

The reasoning behind the Base ﬁtness function is that different phenotypic aspects
are important at different generational times. After initial testing, the sole use of the xz-
distance distance traveled turned out to be insufﬁcient due to evolutionary domination of
individuals that started out as compressed springs that threw themselves and then stopped
moving. With the idea that consistent movement over time requires the maintenance of

three-dimensional structure, I added four phenotypic traits to the ﬁtness function. These

73

traits are the number of point-masses, the structure’s overall height, the height of its
center of mass, and the requirement that the ﬁnal overall height must be at least one third
of the original overall height. The ﬁrst three of these traits are limited in their ﬁtness
contribution because structure is not the primary focus. The ﬁtness caps are a maximum
of ten point-masses and a maximum of nine meters for both the overall and center of
mass height.

Some additional constraints are designed to prevent wire-frame individuals from
using extreme physical properties to their advantage, such as bounds on spring constants
and the actual mass of a point-mass. Other constraints are meant to keep run times
consistent and manageable, and the body structure within reasonable bounds. The
genetic language handles some of these constraints by not allowing individuals to evolve
certain parameters away from the default value. The other constraints are treated as
penalties in the ﬁtness function.

The three ﬁtness penalties used in the Base ﬁtness function are a bounding
volume, a maximum y-value, and a cap on the number of springs in a phenotype. The
penalty for the number of springs uses a cap set at ﬁfty-four (one less than a complete
graph for eleven point-masses). For each spring above the cap, a penalty of one is
deducted from the ﬁnal ﬁtness value. However, this only discourages wire-frame
individuals with more springs, it does not prevent such a situation given an otherwise
sufﬁciently good solution. For the bounding volume of a given body, a maximum value
was set of 26.0m for the length of the diagonal of the box, which equates to a bounding
box with dimensions of 15m x 15m x 15m. The maximum y-value is set to 30m and is

used to prevent individuals from throwing themselves so far into the air that they don’t

74

land during the evaluation period.
Figure 4.1 shows a pseudocode snippit that calculates the Base ﬁtness function
including both the positive and negative factors.
ﬁtness = 0.0;
if (ﬁnal_height > 0.3 * original_height)
ﬁtness += 3.0 * xz-distance
ﬁtness += 0.0254 * number_of_points
ﬁtness += ﬁnal_height
ﬁtness += 3.0 * center_of_mass_height
if (number_of_springs > 54)
ﬁtness += 54 — number_of_springs
if (bounding_box_diagonal > 26.0m ||
max_y_coordinate_value > 30.0m ||

ﬁtness < 0.0)
ﬁtness = 0.0

Figure 4.1. The default “Base” ﬁtness function

4.1 Experimental Design

This experiment examines the impact of the different factors that comprise ﬁtness,
and how well diversity and variation measures can provide that differentiation. To
determine the impact of these factors, I take the Base ﬁtness function (Figure 4.1) and
break it up into six variants, which I will call run Sets I-VI. The ﬁrst variant, Set I, was
created so that I could examine a ﬁtness function that uses only the distance moved
ﬁtness component, the spring penalty, and other constraints. The purpose of this variant
is to test the raw evolutionary power of this metric, and is of particular interest due to the
fact that a similar measure is used in most related evolutionary systems.

The other variants, II-VI, examine the impact of each factor independently by
knocking out that factor from the Base ﬁtness ﬁrnction. Table 4.1 displays these

knockouts.

75

Table 4.1. A chart of the ﬁtness components used in the ﬁtness function for each set of
runs. The left hand column corresponds to the ﬁtness factor labels given in Figure 4.1,
where the Base ﬁtness function uses all of the factors.

 

 

 

 

 

 

 

 

 

 

 

 

 

Component l l
(a) Base 111 IV V VI
(b) Base I 11'111 IV v VI
(0) Base .11 IV V VI
, (d) Base 11 111% V VI.
, (e) Base 11 III IV .VI
(0 (Base 1 ,11,111,1v vi

 

 

 

 

 

 

Each set consists of ﬁfty replicate runs that use identical conﬁgurations other than
the ﬁtness function. The conﬁgurations use a population size of 500 individuals that are
simulated for two minutes to compute ﬁtness. With the initial population consisting
entirely of empty genomes, the run continues for another 499 generations. An entirely
new population is created each generation using ﬁtness proportional selection with an
elitism of one. Once an individual is selected to generate an offspring, its genome is
copied instruction by instruction into the genome of the offspring. This process occurs
external to the individual; the individuals do not technically self-replicate. As each
instruction is copied into the offspring’s genome, there is a chance mutation will occur
based on a rate computed prior to the copy process. That rate is genome length
dependent and set such that there is an average of one mutation per offspring. The four
types of possible mutations are point, insertion, duplication, and deletion, but no

crossover is used.
4.2. Results and Discussion

When comparing the results of each run set, ﬁtness is not an appropriate choice

because each function has slightly different range. Instead, I select a common phenotypic

76

trait that is important to the desired output from the system. In this situation, I choose to
use the average distance moved across all populations for a single set of runs. I then use
the average to compare all set of runs as shown in Figure 4.2.

Compare Average Distance Move (meters)

0

a)
O

 

Compare Average Distance Moved (meters)

 

 

Base Set
— Setl
— Set II

Base Set
—Setl
' —Setll

   
    
  

(Al
O
a
0

——Set III
. —-—Set IV
—SetV

Set VI

A
O
5
O

. A.

 

 

 

 

N
O
N
O

.-
0

Avg. Dist. Moved Across Populations (meters)
‘6’
8

 

Avg. Dist Moved Across Populations (meters)
‘0)
O

 

 

O
O

O

-. Q

8 . '~
;..',‘ .

§

200 300 400 500 300 400 500
Generation Generation

0
D
.4»
O’
0

Figure 4.2. Average distance moved vs. time averaged over the 50 populations in each
run set. The pluses are the 95% conﬁdence interval for the respective run that shares its
color. (A) The average distance moved over the entire two-minute evaluation period, and
(B) The average distance moved in the last minute of evaluation.

It is evident from Figure 4.2A that three variants are signiﬁcantly different from
the Base ﬁtness function, with Set VI, no spring restriction, only establishing itself from
the Base ﬁtness function near the end of the evolutionary process. Runs I and II, the
variant with just distance moved and the variant with no height restriction for gaining
ﬁtness for moving, show the ability to evolve individuals that can move much farther on
average. The possible reasons are that requiring height restricts some of the intermediate
forms that would allow an individual to move ﬁrrther or faster. Admittedly, height is an
additional selective pressure and trying to evolve more than one factor at a time can slow

progress. Evolving under movement only, it is possible that the population has access to

77

evolutionary routes to better movers early in the evolutionary process.

While these alternative ﬁtness functions seem to show improvement, they
encourage individuals to move further faster with no concern for maintaining structure.
The purpose of requiring structure at the beginning and end is to encourage consistent
movement for long periods of time. Figure 4.2B shows that while both these sets of runs
are able to evolve consistent movers, it is apparent when comparing them in terms of
moving in the second half of the evaluation, the resulting differences with the Base
ﬁtness function disappear. The early movers in Sets I and II are more successful on the
whole because they ﬂing themselves at the beginning of the simulation due to their initial
structural conﬁguration, but then lose much of their structural integrity. Historically,
their early individuals were likely to simply throw themselves and stop moving.
Generations later, their progeny were then likely evolve the ability to roll or repeat the
jump, but the historical jumping component is likely to be maintained evolutionarily.

The only set of runs that showed signiﬁcant difference under both methods of
comparing distance moved was Set III. This set uses a ﬁtness function that is only
missing the reward for point-masses. Under such a ﬁtness function, the individuals have
no selective pressure until they are able to move or have height, which can cause a long
lag-phase before the evolution of movement can begin.

Next, I compare the different sets of runs using diversity and variation methods.
There appears to be little difference between the runs in terms of genetic and phenotypic
diversity (Figure 4.3). The only exception is Set 111, which has a signiﬁcant lag from the
others in terms of phenotypic diversity (Figure 4.3B). It is interesting that there is little

diversity change over time.

78

Genetic Diversity — Entropy Phenotypic (Dist Moved) Diversity — Entropy
1

0.8

..........

   
   
      
  
  

 

    

 

          
  

 

 

  

 

 
   
 

 

  

 

 

 

   

 

 

8 c
.0 _o
E E
:I 3
a. a.
o o
O. L
g g 0.6 .90........¢06o§09'¢¢
2 Base Set 2 ' .
a Setl 3‘ o 4 ‘ ”
o “4 Set II o '
E —Setlll E —Setlll
g 032 —Set IV a 02 Set IV
E — Set VI ; Set VI
0.9 0 - ‘ ‘ ' ‘ ‘
0 100 200 300 400 500 0 100 200 300 400 500
Generations Generations
A B

Figure 4.3. Diversity of (A) genotypes and (B) the distance moved phenotypic trait, for
each set of runs. The pluses are the 95% conﬁdence intervals for their respective run
associated by color.

Figure 4.4 shows a qualitatively similar result for the dissimilarity metric. While
the other sets of runs are increasing in average dissimilarity, individuals in Set HI are
continually becoming more similar. This may be showing the randomness at the
beginning of a run and the accumulation of useful instructions. Only toward the end of

the run, is the amount of similarity is approaching that of the other sets of runs.

79

Comparing Phenotypic Variation (Similarity)
1 .

 

 

 

 

 

 

 

 

 

Base Run
g M" Runl
g 0.3. -—-Runll
3 —Runlll
g °-7i Run IV
< 06 —RUI'IV
f Run VI
g 0.5r
,E
g 0.4
o .
o 0.3 A.
o ..
ii 0.2
>
<

 

 

0 100 200 300 400 500
Generation

Figure 4.4. The average dissimilarity across all populations for each Set. The pluses
represent the 95% conﬁdence interval for their respective runs according to color.

The other genetic variation method I used to compare the ﬁtness function variants
is the average MRCA generation and the average dissimilarity between the MRCA and
the ﬁnal population (Figure 4.5). Both graphs show no signiﬁcant difference between the
different sets of runs. Once again, Run III varies slightly from the others with much more
variable values. In conjunction with the results from above, I suspect that all of the
information about Run 111 is pointing to such a small level of selective pressure that it is

almost a random process, at least for the majority of the evolutionary process.

80

Generation of MRCA for each Run

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

.IIIIIIIIIAIAIIIIIIII

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

YIIIIIIIIIII

 

 

 

 

500 e

350e

300
250
200 r
150 —

100,

50-

Run

81

Difference of MRCA and Final Population

T l I I I I I

1L —~ A

 

I
—r— l
l

 

0.9 -

 

 

 

 

 

 

 

 

 

 

0.8 >- #

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

+

 

 

 

 

 

 

 

___._.___..___.._.—.—A

Proportion
_o
O!
I
|

|_——__—________

 

 

 

0.5 r- + ~
+

0.4 - + -,
+

0.3 A + + A
+

0.2 A l I l 1 l 1 a

1 2 3 4 5 6 7
Column Number
B

Figure 4.5. Phenotypic variation (based on distance moved) using MRCA measures. (A)
The generation of the MRCA for all ﬁfty runs for each set, and (B) The average
dissimilarity between the MRCA and each individual in the ﬁnal population for all ﬁfty
runs within the set.

4.3. Conclusions

In this chapter I examined the various factors that comprise the default ﬁtness
function. I clearly showed that knocking out any single factor ﬁom the Base ﬁtness
function had little differentiable impact on the average distance moved. The exception
was the removal of the ﬁtness component associated with the number of point-masses,
which had a signiﬁcant negative impact on the ability to evolve long distance movers.
On the other hand, removing the height condition on the ability to gain ﬁtness for moving

had a signiﬁcant positive impact on the ability to move. Using only the ﬁtness

82

component for distance moved also had the same impact as removing the height
restriction for gaining ﬁtness for moving.

While it is interesting that the simplest ﬁtness function, only distance moved, was
one of the best for evolving movement, those improvements were lost when comparing
the impact of the various factors at evolving the ability to also move during the second
half of the evaluation period. This suggests that some of the movement behaviors are not
a desired solution type, typically non-consistent movers. However, it is impossible to tell
what speciﬁc behavior styles evolved that gave these results. Thus with the exception of
the removal of the reward for point-masses from the Base ﬁtness function, there appears
to be no signiﬁcant impact for any given factor or even over using only distance moved,
at least under the state of WhirlingDervish used for this experiment.

The other purpose of this chapter was to discover if diversity and variation
measures, by characterizing convergence and stagnation, would provide a method to
compare the evolutionary results. With the exception of run 111, all of the runs had
basically given the same measurement results. There are three possible situations at this
juncture. The ﬁrst possibility is that these measures are not useful for this task. The
second is that the ﬁtness factors have little, if no, differentiable impact on the
evolutionary results, given the state evolutionary system used for this experiment. The
data provided here can not differentiate which of the two possibilities are more likely.
The third possibility is the low selective pressure, observed in Chapter 3, may be

obscuring useful information.

83

Chapter 5 Structure and Behavior Case Studies

I choose six runs out of the ﬁfty that used the Base ﬁtness function as case studies
to perform a more detailed examination of evolved structure and behavior. Using these
runs, I focus on three areas in this chapter. First, I provide a characterization of structure
and behavior for select individuals in each run. Using these characterizations, I then
examine both the inter-population variation and the historic variation of structure and
behavior. Finally, I examine their genetic diversity and how it relates to the structures
and behaviors that evolved. These topics are important because they reveal more details
about the evolution of structure and behavior, which is the true goal of WhirlingDervish.

The primary data for this chapter are the manual characterizations of various
individuals. For each characterization I interpret an individual’s structure and behavior
by viewing it. There are two major problems with using this method: the many thousands
of individuals in a given run and the difﬁculty in providing consistent analysis of
structure and behavior. When dealing with the sheer quantity of individuals, I attempt to
use available information to help me reduce that set to a manageable size. The primary
forms of information at my disposal are phylogenetic trees, including the full details for
the lineages of each individual in the ﬁnal population, and historical ﬁtness data. The
phylogenetic trees allow me to limit myself to samples from each group of closely related
individuals within the ﬁnal population. Also, the lineage and ﬁtness data allow a similar
sampling of ancestral individuals for comparison. I then take these individuals and
compare them on both structural and behavioral levels. While this technique is clearly

subjective, I take great pains to make it as consistent and formalized as possible. While

84

more automated analysis methods would be preferred, they often have severe ﬂaws and
require signiﬁcant tailoring to the problem (and such ﬁne-tuning would make for a good
thesis on its own).

The problem with consistently analyzing individuals is that the evolved structures
are dense, it is difﬁcult to characterize their components, and it is nearly impossible to
determine how those components contribute to the perceived behavior. Any given point-
mass can have many springs, which, in turn, leads to a large quantity of crisscrossing
lines when an individual is viewed from an outside perspective. The crisscrossing of
lines can also make it difﬁcult to observe point-masses in the interior of the body.
Besides the problems of viewing key pieces of a structure, another problem is the
potential non-linear movement of an individual, which can have a confounding effect by
transitioning from one type of behavior to another part way through its lifetime. This can
occur in different situations such as periodicity differences in spring ﬂuctuations where
some point-masses will fall out of synch with one another causing an overall change to
the behavior.

To address the analysis problem, I consistently apply a speciﬁc method of
analysis to an individual. I ﬁrst identify key point-masses or groups of point-masses.
Next, I divide up the overall behavior into a series of smaller more speciﬁc behaviors that
are executed in a deterministic order. Once each sub-behavior has been identiﬁed, I
associate it with the key point-masses identiﬁed earlier to determine a rough idea of how
that behavior works. The overall generic behavior is characterized in terms of the

periodic execution of the sub-behaviors and utilization of speciﬁc point-masses. Finally,

85

 

to avoid issues such as transitioning between different overall behaviors, I reserve
characterization until the ﬁrst set of consistently repeating behaviors is apparent.

The rest of this chapter involves the examination of behavior for individuals in the
ﬁnal population and certain historic individuals. Using lineage and ﬁtness data, I
perform an analysis of structural and behavioral variation. I also examine genetic
diversity at informative positions in the genomes, which involves the knockout of non-
functional instructions from the genome. These knockouts help me gauge the use of
some genetic diversity measures used in Chapter 3 for measuring variation of structure

and behavior over time, but especially in the ﬁnal population.
5.1 Case Study Setup

I chose six runs from the ﬁfty used in Chapter 3 (under the Base ﬁtness function).
These six runs were selected in three pairs: two runs that appear to have stagnated (row
one), two runs with continually increasing ﬁtness over time (row two), and the two runs
that achieved the highest ﬁtness observed in the ﬁnal generation (row three). I
designated the ﬁrst pair of runs as stagnated due the overall lack of change in lineage and
maximum ﬁtness, and the small, noisy increase in average ﬁtness. The purpose to this
selection method using ﬁtness (Figure 5.1), rather than random sampling, is to develop
and intuition as to if there is a correlation between the lack of variation in structure and
behavior and the stagnation of a run. For example, it would be understandable if the

stagnated runs have low structure and behavior variation in the ﬁnal population.

86

 

  

Run One Fitness

   

 
 
   

 

 

 

 

 

 

 

 

 

 

 

 
 
  

 

 

80 -
70 -
60 -
5° .
2
g 40 -
LT.
30 -
2° _ — Maximum Fitness
Lineage Fitness
10 _ Average Fitness
95% Conﬁdence
o . . , , A
o 100 zoo 300 400 50°
Generation
Run Three Fitness
Maximum Fitness
70 r Lineage Fitness
Average Fitness
60 . ....... 95% conﬁdence
2'
i2

 

 

 

500

400

300
Generation

 

Run Two Fitness

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

55 -
50 '
45 -
40 -
35 -
g 30 -
E 25 ’
20 -
15 ~ Maximum Fitness
10 _ Lineage Fitness
5 _ Average Fitness
95% Conﬁdence
o ‘ ‘ ' ' I
0 10° 20° 300 400 500
Generation
Run Four Fitness
160 . — Maximum Fitness
Lineage Fitness
140 Average Fitness
------- - 95% Conﬁdence
120
3 100 JD It
I so r - -‘
60 -
40 >
20 -
o ' ' ’ ‘ I
100 200 300 400 500
Generation

87

Run Five Fitness Run Six Fitness

 

 

 

 

 

      

 

 

 

 

 

 

 

 

 

 

500 h — Maximum Fitness [
450 200 am? Lineage Fitness .
40° Average Fitness
. 95% Conﬁdence
350 ’ 150 .
m 300r , , ,. a
m ,_ v.‘ .r*'.,->
.3 250 (Writ —' .3
zoo A , ﬂirty. ;
150 - 1y "5'4 _ _ . .
Maximum Fitness 50 , .- l '
100 - ~~~— Lineage Fitness ,
5° _ ' Average Fitness rs “-
- ~ ' - 95% Conﬁdence
0 ' ‘ ‘ ‘ ‘ 0 ‘ ‘ ‘ ‘ ‘
0 100 200 300 400 500 0 100 200 300 400 500
Generation Generation

Figure 5.1. Fitness vs. time for three ﬁtness metrics across six independent runs. The
ﬁtness metrics used are maximum ﬁtness at each time point, the ﬁtness of the individual
along the lineage at each time point, and the average ﬁtness at each time point. The
graphs are paired up in the following order: stagnated (runs one and two), continues to
evolve (runs three and four), and achieves maximal frnal ﬁtness observed (runs ﬁve and
six).

In order to make this study consistent across identical individuals, I recompute the
ﬁtness of each genome to remove the stochastic perturbations that are normally applied to
each point-mass before and individual’s evaluation. This means that exactly identical
phenotypes will always produce exactly identical measurements — something that the
small perturbations would normally cause problems with.

5.2 Results and Discussion

Characterization of evolved individuals is the ﬁrst step in examining the evolution
of structure and behavior. I begin by characterizing the most-ﬁt individual from each
run. These characterizations form a reference point in which to describe other individuals
within their respective populations. To understand these characterizations, it is important
to keep in mind the importance of rhythm for locomotion. Rhythm is the interaction of

two or more components that move in a periodic fashion. Thus, without reactive control

88

over its body, an individual is forced to rely on very regular cyclic behaviors in order to

repeatedly execute their locomotive behavior.
5.2.1 Description of the Most-Fit Individual in Each Run

The most—ﬁt individual in run one (Figure 5.2) forms a complete graph with three
point-masses comprising the top and six on the bottom. The structure appears to be
approximately rectangular and is narrow in the direction of movement. The three top
point-masses cross the narrow width of the body near one end of the rectangle, and the
infrequent, slight lifting of the bottom point-masses at the other end cause the body to
rotate. The rotation leads to a non-linear direction of movement that is similar to an
overall elliptical movement. The movement is driven by the lead point-mass on the top
which points in the forward direction of movement. The other two point-masses are off
center from the direction of movement and move in large circular motions. The large
circular motion means that the point-mass will provide more lift to the point-masses on
the bottom nearest to it and farther from the others. As a point-mass on the top picks up
those on the bottom to varying extents, the bottom ones are moved to new locations. Due
to the leading point-mass, this picking up and moving of the bottom ones is what gives

the individual its ability to move in a forward direction.

89

 

 

 

 

 

 

 

Figure 5.2. (A) A side view perpendicular to the direction of motion of the most-ﬁt
individual from run one. The direction of movement is to the left. The individual is
raising and thrusting its front forward, and will soon rotate down, pulling the back end
forward. While the front is rising, the back end is moving towards the ground. (B) A
view parallel to the direction of motion. (C) A top down view of the individual. There
are slight size discrepancies because each image had to be taken a different point in time
and during different points in the rhythm.

The most-ﬁt individual ﬁom case study two (Figure 5.3) forms a lopsided
pyramid consisting of a complete graph with nine point-masses. The behavior relies on
three components: the lead points, the backbone, and the base. The two point-masses in

the lead, point in the direction of motion. They are then pulled up and thrust forward.

90

Pulling up of the lead point-masses is accomplished by the expansion of the backbone
while pushing the base down onto the ground and leaning back. Once the leading point-
masses have been thrust forward, the backbone rocks forward as if rolling on the base.
Two point-masses from the base are then picked up into the air and two remain on the
ground, which, in conjunction with the leading two, pull the body forward through
contraction. The cycle is then repeated, giving approximately linear forward movement.
It also appears that given enough time, it will rotate approximately 135 degrees to its
right, and once again proceed in a linear direction. However, the evaluation period
during evolution is only two minutes, thus the rotation would not be selected against.
Figure 5.3 shows this individual in the process of lifting the leading point-masses before

thrusting them forward.

 

 

 

Figure 5.3. A side view perpendicular to the direction of movement of the most-ﬁt
individual from run two. The direction of movement is to the left. The slightly curved
shape of the base is visible at the bottom. The curve allows the individual to rock back,
thrusting the leading component forward, then rock forward, dragging the back end
forward through contraction.

A similar behavior is observed in the most-ﬁt individual from run three (Figure
5.4). This individual appears to also be a connected graph, but consisting of

approximately twelve point-masses. The primary difference between the two individuals

91

is run two utilized a backbone type structure where this individual utilizes a pentagram
type structure. The pentagram structure rolls back and forth along the direction of
motion, lifting and thrusting the leading point-masses out ahead of the body. The other
point-masses are connected to the pentagram opposite the leading ones but off the
ground, which contributes to the ability of the pentagram to rock. With the few point-
masses in contact with the ground, the distance that the leading point-masses are thrust is
farther than the contracting distance of the point-masses, giving the overall forward

motion of this individual.

 

 

 

 

 

Figure 5.4. A side view perpendicular to the direction of movement of the most-ﬁt
individual from run three. The direction of movement is to the left. You can see the
interesting core component that consists of many pyramids pointing in different
directions. It is the interactions of these pyramids that allow the rocking of the central
component for lifting and thrusting.

For run four, the most-ﬁt individual (Figure 5.5) is able to lift its entire body into
the air, essentially, throwing itself forward. This individual has the same general
conﬁguration as the individuals from run two and three, from the top down it appears to
be a triangle or diamond pointing in the direction of motion. What makes this individual
unique is that it has two important sets of point-masses that alternate positions in the air.
The ﬁrst set forms a type of backbone, which rotates down to lay almost completely on
the ground. As it rotates downward, a point-mass in the front pops up into the air. At

some instance during movement, this forward point-mass is the only one in the air, and

92

provides the leverage used by the rest of the point-masses on the ground to pull
themselves up and forward during contraction. The movement exhibited by this
individual is fast, but unstable. Given enough time, it collapses onto the xz-plane unable
to recover. I suspect this occurs if the body gets out of synch enough that it cannot raise
the point toward the front. However, the time frame for collapse is well beyond the two-
minute evaluation period, preventing selection against this instability. Figure 5.5 shows

both ends of the body contracting, raising up off the ground.

 

 

Figure 5.5. A side view perpendicular to the direction of movement for the most-ﬁt
individual from run four. The direction of movement is to the left. This individual has
less infrastructure in the back end for lifting the front component. Thus, the highest
vertical point-mass near the front is key to lifting the other leading point-masses.

In run ﬁve, the most-ﬁt individual (Figure 5.6) utilizes a fully connected graph
where one point-mass resides far out in ﬁ'ont of the rest. The bulk of the point-masses are
close together in a large clump. The behavior follows an alternation between ﬁrst lifting
the point in front and thrusting it forward with most of the back point-masses on the
ground. Then the forward point mass lands and the majority of the back point-masses are
lifted into the air. During the second period, considerably less friction is present allowing
the contraction of the back point-masses to provide further movement towards the front.

During this contraction phase, the back point-masses appear to almost dance by

93

constantly switching who is touching the ground, which gives each point more time in the

air than touching the ground.

 

 

 

 

Figure 5.6. A side view perpendicular to the direction of movement for the most-ﬁt
individual from run ﬁve. The direction of movement is to the right.

The most-ﬁt individual in run six (Figure 5.7) is similar to the one in run ﬁve.
The primary difference is the addition of one component. This component adds a point-
mass connected to the bulk at the back, but sticks out behind. As the front point-masses,
there are two, are thrust forward, this point in the back is moved ahead of the bulk. When
the bulk is ready to contract with the front ones, this extra point acts a leg, allowing the
bulk to be thrust up into the air. This gives an overall affect of lifting the entire back end
up into the air, and sometime the whole body. This motion is repeatable, but can get
slightly out of synch causing a slow down in movement speed and a rotation 90 degrees
to the right. Figure 5.7 shows the individual right before it would rock back, lifting the
ﬁ'ont point-masses into the air. It also shows the opposite where the leading point-masses

have been thrust forward and the body is rocking back to a rest position.

94

 

 

 

 

 

Figure 5.7. A side view perpendicular to the direction of movement for the most—ﬁt
individual from run six. The direction of movement is to the right.

One aspect shared by all of the most-ﬁt individuals from each run is a body
formation where almost every point-mass is connected to every other point-mass. There
are occasional point-masses hanging connected through a single spring to the rest of the
body. Also, the use of alternating point-masses in the base components being lifted into
the air at different times appears fairly common. Another aspect is that most of the
individuals seemed to make use of internal structures such as the pentagram ﬁ'om run
three, and levers in run six. All of the structures and behaviors viewed so far have been
complex and have shown the importance synchronous cycles play in the dynamics of

these individuals.
5.2.2 Variation Within the Final Population

Knowing the structure and behavior of the most-ﬁt individuals within a run is
useful, but how much variation exists within a population? To answer this question, I
sampled individuals from the ﬁnal populations guided by the phylogenetic tree for each
run. The purpose of using a phylogenetic tree is that it provides a rough clustering of
individuals who are closely related. Individuals near each other in the phylogeny are

much more likely to have similar, if not identical, phenotypes. The designation of a

95

 

cluster is the set of individuals whose common ancestor lived within the previous twenty-
ﬁve generations. To reduce the time expense required to analyze every individual, I
sampled individuals from each cluster consisting of the highest ﬁt individual, an
individual with about half the highest ﬁtness, and sometimes a few others with different

ﬁtness values. Figure 5.8 illustrates a partial phylogenetic tree containing two clusters.

 

 

i6
1

249429
249372
249265
249171
249437
249079
249336
249302
249365
249310
249249
249187
249114

13458500
75058900
75058900
75058900
75058900
75058900
39.945000
39.94 5000
75058900
75058900
75058900
75.058900
75058900
75058900

(JON—e
miss
\J
(O

O')O')O)(OO')O')O')O)\IO')O)OUNO)—~—~
I\) 0'38

@3258

00

Ln'tn'mmmtnmoaoomtnoo

00

 

460 480

Figure 5.8. A partial phylogenetic tree from run one showing two clusters. The numbers

along the bottom represent the generation. The ﬁrst vertical column is the individual’s id
and the second column is the ﬁtness of each individual.

In the end, I sampled between ﬁfty and seventy—six individuals from each of the

six ﬁnal populations. The one exception was run two where I only selected eleven

96

individuals. The reason for this exception was the sheer quantity of individuals that had
exactly the same ﬁtness. Each of these individuals seems to have divergent genomes due
primarily to neutral mutation, but have identical phenotypes. There is no guarantee that
individuals with the same ﬁtness will have the same phenotype. However, ﬁtness is a
high precision real value, which makes it very unlikely for multiple phenotypes with the
same ﬁtness to be different, especially between individuals in the same cluster. Thus, I
took multiple samples of individuals with the same ﬁtness value from different clusters,
and included other individuals that had reasonable ﬁtness values.

The examination of all of the sampled individuals from the ﬁnal populations
showed that in ﬁve of the six runs there was only one type of structure and behavior
present in the ﬁnal population. The only exception was run six, which showed two
distinct types of structure and behavior. Interestingly, the most-ﬁt individual in run six
was not the most common phenotype. The most common phenotype was actually more
similar to the most-ﬁt individuals from runs two and three, which use the lift of the
leading point-masses and then rock forward type of behavior in conjunction with a raised
back end.

The variance in ﬁtness is caused most often by changes in the quality and
efﬁciency of executing a behavior. These changes occurred in many different ways such
as shifting one point-mass relative to the others or shifting a point-masses position
causing a change in rhythm. Changes also occurred through the slight alteration of
structure such as the addition of a new point-mass or the removal of a spring. These

alterations often led to changes in rhythm, but also changed how the structure was

97

maintained over time. The alterations also affected behavior, sometimes adding or
removing complexity from the phenotype.

With the ﬁnal populations having minimal variation in general structure and
behavior, I wanted to know how long those general structures and behaviors have
persisted. To determine this, I compared the most recent common ancestor (MRCA) to
the individuals in the ﬁnal population. With the exception of run six (where a new, more
ﬁt phenotype is arising), the MRCA had the same general phenotype as the most-ﬁt
individual. For run six, the MRCA had the same phenotype as the most common
phenotype in the ﬁnal population, but not the same as the most-ﬁt phenotype in the
population. This implies that the most-ﬁt phenotype in run six evolved after the most
common phenotype swept the population. After comparing the two phenotypes in run
six, it is clear that the most-ﬁt phenotype evolved by building off and modifying its
ancestor. All of the other runs simply contained a sweep, and judging by the ﬁtness
increase in the majority of the runs (Figure 5.1), they have only improved on that
phenotype.

The MRCAs for the runs occur at different times (Table 5.1) and do not provide
enough information about the historical context of these sweeps. Thus, I utilize the
lineage ﬁtness for each run to select ancestors for comparison. These comparisons will
shed light on evolutionary progression of the structure and behavior found in the ﬁnal
population. I start by picking the most-ﬁt individual from the ﬁnal population and trace
its ancestry back to the ﬁrst generation. Then, starting at the earliest ancestor in the ﬁrst
generation, I proceed forward along the lineage, ﬁnding those individuals who meet

certain criteria. The criteria uses the ﬁtness of the ﬁnal individual to calculate a

98

percentage of that ﬁtness starting at 5% and increasing by 5% up to 100% or the MRCA,
which ever is ﬁrst. Those whose ﬁtness stagnates early will sample all twenty
individuals long before the generation of the MRCA, while the others will only sample

about ten individuals.

Table 5.1. Generation of the most recent common ancestor.

Run MRCA
Generation
One 208
Two 356
Three 352
Four 367
Five 278
Six 247

In run one, the phenotype found in the ﬁnal population began at least as early as
generation 67, about 15% of the ﬁnal ﬁtness. However, while the basics are there, the
behavior is sluggish in comparison to the ﬁnal version. At 10% of ﬁnal ﬁtness,
generation 53, the starting conﬁguration is the same, but simply falls over into a different
conﬁguration, which does not move. The ﬁrst time sample at 5%, was at generation 25
and appeared to have all of the point-masses and some of the structure in place, but lack
stability and collapsed down onto the ground with one point-mass held in the air. Run
two ﬁnds its ﬁnal form sometime after generation 46, 25% of the ﬁnal ﬁtness. At this
point, the ﬁnal structure is in place, but it does not move. Then, starting at the earliest
sample at generation 17, each progressive sample up to generation 46 builds off an initial
standing pyramid. As time progresses the structure approaches the ﬁnal structure but
continues to collapse into secondary conﬁgurations until sometime after generation 39,
20% of the ﬁnal ﬁtness, the last point-mass is added in the correct position to keep the

ﬁnal structure intact.

99

 

For the runs that appear not to stagnate, runs three and four have a much longer
history. To begin, the MRCA for both has the same general structure and behavior as the
ﬁnal phenotype, but slightly less stable structure. Run three begins at 5% of the ﬁnal
ﬁtness, generation 108, as a simple pyramid with a chain of unused point-masses hanging
from it. Then up through generation 168, a basic stacked pyramid structure had evolved
that falls over to form the basic ﬁnal structure. From generation 168 to 230, the ﬁnal
structure is in place, but cannot move because it can barely get its front point-masses off
the ground. At generation 322, about 50% of the ﬁnal ﬁtness, the individual has started
moving slightly through expansion and contraction, and has started picking up its leading
point-masses.

Run four has fewer noticeable intervals of change, but with a long history. The
initial pyramid form at generation 20 continues to increase in complexity until it is close
to the ﬁnal structure before generation 175, 45% of ﬁnal ﬁtness. At this point, it has
begun to move by picking up its leading points and rocking back and forth. This
behavior is unstable and moves into a secondary behavior for a short while before getting
back into it primary behavior. Interestingly, the secondary behavior is similar to the ﬁnal
behavior. Finally, after 50% of the ﬁnal ﬁtness has been reached at generation 218, it has
evolved the general structure and behavior of the ﬁnal population.

Due to the large ﬁnal ﬁtness of runs ﬁve and six, they have very short intervals of
change. Run ﬁve has achieved 5% of the ﬁnal ﬁtness by generation 78, and has the basic
phenotype without the leading point-mass and can move through expanding and
contraction. By generation 99 and then 107, 10% and 15% of the ﬁnal ﬁtness, the

leading point-mass has been added to the structure making it like the ﬁnal structure. It

100

also moves using the ﬁnal behavior, but cannot really jump. The rest of the evolution
seems to focus on improving three aspects: getting into its leap conﬁguration, leaping,
and maintaining linear movement. In run six, the lineage has achieved the ﬁnal structure
by generation 83, 5% of the ﬁnal ﬁtness, but can not move. However, by 10% of the
ﬁnal ﬁtness, generation 94, it has attained the ﬁnal phenotype.

The evolution of the initial structures is interesting and may indicate the path of
future evolution. Any new behavior will most likely occur by adding complexity and
altering the structure. It appears unlikely that runs will have high variation of structure
and behavior, but transitions between distinct behaviors are more likely. These
comparisons have also shown that the runs tended to lock in a structure and behavior

early, and the evolution that follows, selects for improvement on that phenotype.
5.2.3 Alternative Method For Measuring Genetic Diversity

With the extra knowledge of structure and behavior variation in both the ﬁnal
generation and along the lineage, these case studies afford me the opportunity for further
examination of genetic similarity. In Chapter 3, genetic diversity measures, such as
genetic richness (the count of unique genomes) indicated near maximal diversity.
Additionally, those measures showed little genetic similarity between individuals in a
given population or between the MRCA and the ﬁnal population. However, the data
examined in this chapter suggest that while there is variation in structure and behavior,
the variation is not that dramatic. One potential reason for the discrepancy between
previous measurements and phenotype diversity is the weak selection and the large
percentage of neutral mutation accumulation over time (on average, 70% of mutations in

the ﬁnal generation are neutral). These two factors can add a lot of extra instructions that

101

translate into increased genetic difference, but have no affect on the resulting phenotype
or ﬁtness.

I use the following method to address the discrepancy between the high genetic
diversity, low genetic similarity within a pOpulation, and similarity in structure and
behavior examined in this chapter. I knock out as many neutral instructions from a
genome as possible, which can be considered condensing the genome. These condensed
genomes are then used to determine the genetic richness of the ﬁnal population, the
genetic similarity within the ﬁnal population, and the similarity between MRCA and the
ﬁnal population. I then compare these values to their counterparts in Chapter 3.

I chose a simple method for condensing genomes. The genome is traverserd
linearly, each instruction is individually removed and the resulting phenotype is
measured. If the ﬁtness value of the new phenotype is the same as the original ﬁtness,
the current instruction is marked as neutral. After all of the instructions have been tested,
all of the neutral instructions are removed from the genome.

This genome condensing method produces only an approximation of the minimal
genome for a given individual. Any difference from the true minimal genome is the
result of epistatic effects, which are not considered in this study. However, epistatic
effects were found in on 0 to 15% of the individuals for any given run.

For genetic richness, the count of unique genomes, the value was much lower for
the condensed populations than what was found in Chapter 3. Due to the approximation
in the condensing method, the richness may be lower than the true value of the minimal
genome. The reason is that the approximation would remove extra instructions,

increasing the chance of identical genomes. Table 5.2 shows the comparison between

102

genetic richness from Chapter 3 and the new method. Also, phenotypic richness, in terms

of ﬁtness, is also given as a point of comparison.

Table 5.2. Richness measures for all six runs. Column two is the original genetic
richness from Chapter 3. Column three is the genetic richness using condensed genomes,
and column four is the phenotypic richness in terms of ﬁtness.

Run Original Condensed Phenotype (Fitness)
One 451 242 171
Two 457 205 155
Three 442 301 237
Four 431 276 190
Five 431 319 248
Six 464 274 226

The Table 5.2 indicates that richness measures of Chapter 3- are heavily
dominated by the accumulated neutral instruction. It is also interesting to note that the
genetic richness for condensed genomes is still much higher than phenotypic richness.
This means that different genomes have different phenotypes with the same ﬁtness. It
would be an interesting follow up to determine how much of this difference is due to
genomes that map to phenotypes with low ﬁtness.

My next comparison is based on the similarity between each pair of genomes
within the ﬁnal population. Table 5.3 shows the results from Chapter 3 versus the
condensed genomes for both similarity and dissimilarity measures. The similarity
measure is a ratio of the difference between the maximum genome length and the edit
distance over the minimum genome length. The dissimilarity measure is the ratio of the
edit distance over the maximum genome length. For more details see Sections 3.1 and

3.2 from Chapter 3.

103

 

Table 5.3. The dissimilarity and similarity measures are given for each run along with
their 95% conﬁdence interval.

Run Original Condensed Original Condensed
Dissimilarity Dissimilarity Similarity Similarity
One 0.779 t 0.017 0.368 i 0.016 0.238 3: 0.017 0.691 i 0.017
Two 0.806 i 0.015 0.257 t 0.011 0.210 i 0.015 0.775 i 0.010
Three 0.562 : 0.019 0.454 i 0.016 0.458 i 0.019 0.594 i 0.016
Four 0.798 i 0.015 0.619 i 0.016 0.219 i 0.015 0.436 at 0.016
Five 0.823 i 0.017 0.677 t 0.017 0.194 i 0.017 0.369 i 0.017
Six 0.761 i 0.015 0.454 :t 0.017 0.258 i 0.016 0.604 t 0.016

Next, the similarity and dissimilarity are measured between the MRCA and the
ﬁnal population, and compared to what was found in Chapter 3. Both of these measures

are shown in Table 5.4.

Table 5.4. The dissimilarity and similarity measures are given for each run including
their 95% conﬁdence interval.

Run Original Condensed Original Condensed
Dissimilarity Dissimilarity Similarity Similarity
One 0.899 i 0.002 0.611 t 0.006 0.166 t 0.003 0.430 i 0.006
Two 0.794 i 0.004 0.182 t 0.009 0.223 i 0.005 0.837 i 0.009
Three 0.853 i 0.003 0.654 at 0.007 0.193 i 0.004 0.374 i 0.008
Four 0.759 i 0.004 0.539 i 0.010 0.273 t 0.004 0.503 i 0.010
Five 0.863 t 0.002 0.836 i 0.005 0.190 1- 0.003 0.182 t 0.006
Six 0.790 i 0.002 0.507 i 0.010 0.337 1 0.004 0.521 t 0.008

Clearly, major differences are possible between measurements using condensed
verses non-condensed genomes. Given the strong, consistent measurements of high
genetic diversity, which seemed unreasonable to be maximized, the use of condensed
genomes for these measurements may prove more reliable and informative. This makes

sense because you are strictly using the parts of the genome that are important for

selective purposes. The neutral instructions can be considered random because while

104

they may hitchhike, it has been shown that they allow mutations to accumulate, which
drives apart the common neutral set of instructions.

By looking at the information provided using the condensed genomes, it is
apparent that there is more similarity between genomes in the ﬁnal population for the
runs that have little improvement in ﬁtness. In run six, it is only in the last generations
that it breaks free of its stagnating ﬁtness. It is also interesting that run three has a higher
similarity when its ﬁtness has been on the rise for the majority of the run. However, it is
unclear to what extent, if any, there is a correlation between ﬁtness and similarity, which
may provide interesting information in the ﬁrture. For now, it is interesting enough to
note the variation in similarity between the six runs, and the variation among the
observed phenotypes.

Similarity between the MRCA and the ﬁnal population is also quite different
when using condensed genomes. Run two appears to have almost inverted its quantity of
similarity, while run ﬁve is nearly unchanged. These values seem to correlate with two
factors observed among the samples individuals. The ﬁrst factor is the number of
individuals in the ﬁnal population who exhibit the exact same structure and behavior as
the MRCA. The second factor is the evolutionary difference of the MRCA to the
common and most-ﬁt individuals in the ﬁnal population, and the generational difference
between the MRCA and approximate ﬁrst generation to exhibit the ﬁnal structure and
behavior along the lineage. Most likely, the issue of the loss of structure and behavior
variation is a combination of many factors, and that information provided here show

some relationships but nothing stands out as a primary indicator.

105

 

5.3 Conclusions

The six most-ﬁt individuals all demonstrated different structure and behavior, but
appear to share many common themes. The most signiﬁcant similarity is the tendency to
have every point-mass connected to nearly all of the others. There are also a large
number of spike components to each body. From comparing various individuals, these
two commonalities seem to contribute the most to improving ﬁtness, but most likely the
contributions are gained by adding stability to the structure. For behavior, the primary
method of movement centers around picking up points off the ground, moving them
forward, and then dragging the rest of the body forward through contraction.

Each of the six case studies demonstrated that a structure and behavior become
ﬁxed in the population early in the evolutionary process. Then, selection pressure drives
the population toward improving that structure and behavior. Improvements are often
accomplished through the alteration of the structure to cause changes in an individual’s
rhythm. Also, changes in structure and behavior did occur early in each run and toward
the end of run six. Those changes to structure and behavior primarily arose through an
increase in structural complexity.

After viewing the lack of structure and behavior variation in the ﬁnal population
and along the lineage, it is apparent that the genetic diversity and phenotypic variation
measures from Chapter 3 are not good indicators of structure and behavior variation.
Instead, they most likely indicate the variation in improvements to the structure and
behavior that is ﬁxed in the population. The usefulness of these indicators for phenotypic
variation seems to improve when you switch from using the whole genome to a

condensed genome. However, genome condensation may not be an appropriate

106

alternative for every type of evolutionary system or language, but these results do suggest
possible confounding effects in the simple measures. In the end, it is clear that there are
many additional factors that affect the variation of structure and behavior. Those factors
addressed here show that using similarity information in conjunction with ﬁtness can
provide some indication of phenotypic convergence and stagnation. Structural and
behavioral stagnation can be identiﬁed by comparing a population to its MRCA, while

convergence can be identiﬁed by comparing condensed genomes.

107

 

Chapter 6 Conclusions and Future Work

I have examined the WhirlingDervish system in light of other 3D morphology and
behavior systems and their research. WhirlingDervish presents an alternative variation
for this type of evolutionary system in order to decrease run time and increase population
size allowing for experiments to be performed that can answer fundamental questions
about evolutionary dynamics. However, due to the difﬁculty of analyzing the
evolutionary results, especially in terms of behavior, it is hard to compare these systems.
In an attempt to gain common ground between these systems, I examined two aspects
common to all of them: convergence and stagnation. It is my hope that these two aspects
along with the work by Komosinski et. al. (2001a) in morphological classiﬁcation will
lead to further improvements in the behavior and system analysis.

In my opinion, one of the most beneﬁcial additions to studying the evolution of
morphology and behavior would be a way to classify changing morphology in terms of
behavior, preferably with an automated process rather than one that involves visual
analysis. Having the ability to classify individuals based on behavior can provide many
advantages such as a quick way to identify good solutions or signiﬁcantly reduce the set
of solutions from which to sample good solutions. It can also be useful in measuring
individual and system complexity. Also, measuring diversity and variation in the
previous chapters gave a rough idea of convergence and stagnation in WhirlingDervish,
perhaps behavior classiﬁcation would have given different or more precise information

about the system by characterizing diversity in terms of the system goal.

108

Current research on behavior classiﬁcation in 3D evolving systems is limited at
best. Therefore, I recommend the work of Walter F. Bischof and Terry Caelli as a
potential starting point. Their work centered on the use of a basic temporal pattern
classiﬁcation system that utilizes movement data from key structural points on a moving
object. It then utilizes the temporal data for classiﬁcation. In one situation, they used
their system to determine if a person was constructing one of three different designs out
of straight and bent tubes by monitoring the movement and speed of the hands (Bischof
and Caelli, 2001a). Another situation determined if a person was lifting either a heavy or
light box and if they were using their legs or their back for lifting (Bischof and Caelli,
2001b)

Looking once again at convergence and stagnation, an alternate population and
ﬁtness assignment model may prevent, or at least stave off, convergence and stagnation
in WhirlingDervish, which should ultimately improve both the quality and rate of ﬁnding
good solutions. One such approach is Hierarchical Fair Competition (HFC; Hu, 2004),
which addresses ﬁnding global optima and how to continue searching if an algorithm has
converged upon a local optima by maintaining diversity. HF C uses a population model
based on a hierarchy of isolated populations that conform to a speciﬁc ﬁtness range.
Using a promotion/demotion scheme provides a way to move individuals between levels
and potentially increase population diversity. It attempts to address continuous evolution
by creating new low ﬁtness solutions that can rise through the ranks and are not
immediately wiped out because their only competition are the others at the same

population level (Hu, 2004). While this thesis has shown that convergence and

109

stagnation did not occur using the speciﬁed experimental design, this approach may
prove helpful in experiments with longer time frames.

Another interesting avenue of research is self-replication and developmental
models. The WhirlingDervish system would make an excellent platform for this type of
research since its phenotypes are heavy based on physics models. Both of these avenues
would work together to allow offspring to be born in real time. Normally, the offspring
are created instantaneously in a vacuum. This research could provide insights into many
different areas of evolution including developmental stages, the effects of historical
contingency, and potentially the introduction of competition as a form selective pressure
rather than a static ﬁtness function.

The study of genetic language for constructing morphology is a primary research
topic for many of the evolutionary systems for 3D morphology and behavior. In the
course of creating WhirlingDervish and writing this thesis, I conducted a series of
experiments using the genetic language in WhirlingDervish. These experiments examine
the use of common programming structures such as loops, conditions, and jumping to
labels and how they affect WhirlingDervish results. However, due to time constraints, I
was unable to add them to the thesis. I hope to publish these experiments at a ﬁrture time.

Finally, the use of diversity measures in WhirlingDervish toward the study of the
Cambrian Explosion theory may be an exciting and useful application of these types of
evolutionary systems to evolutionary biology. The theory hypothesizes that all of the
major body plans in existence evolved within a short span of time. As time has gone on,
no new general body plans have been produced. It would be interesting to see if this

effect occurs in WhirlingDervish and if so, why. Basically, I would be looking for

110

 

kernels of behavior and dynamic structure that evolved early in a run and are maintained
over time. I would then need to show that while the behavior and dynamics have
changed, their abilities are still based primarily on the earlier form. Of course, this study

is also limited by the inability to classify behavior.

11]

 

Appendix A Instruction Sets

The WhirlingDervish instruction set consists a default and extended version. The
extended instruction set utilizes all of the instructions from the default set plus some
additional ones. By using the instruction_set.cfg conﬁguration ﬁle, a user can turn on or

off any given instruction for a run.

Instruction Name Instruction Description 7
add Add the parameter to the top of the current stack
add-line Using the parameter as an offset from the location of the

lcurrent point- mass, a spring is created between the current
{point-mass and the point-mass closest to the new position. i
This could even be the current point-mass itself. After the
creation of the spring, the current point- -mass becomes the
one that was closest to the new ’position.

a i 7, i Ah —_4

div Divide each component of the top of the current stack by
each respective component of the parameter. Any divide
by zero is set to zero.

 

merge A point-mass is found that rs closest to the location of the
current point-mass plus the offset vector represented by
the parameter. Nothing happens if an attempt is made to,
merge a point-mass with itself. Otherwise, a new point-
mass is created whose mass is the average of the two
point-masses at a location halfway between them. Then,
for each spring connected to each of the old point-masses,
a duplicate spring is connected to the new point-mass. ,‘
Then the two original point-masses are removed from the

individual.
mod Same as divide, except modulus instead of divide
move-to ,Finds a point- mass closest to the location of the current

point- mass plus the offset represented by the parameter
and sets that point- mass to be the new current point—mass. 1

+4

mult Multiply each component of the parameter to each
respective component of the top of the current stack

nop Does nothing

pop Poo the value off the current stack. If the stack becomes

112

Instruction Name

push
push-1

’push-Z

'push-3 I

remove-line

remove-point

rotate

set-point-parameters

 

set-spring—constants
set-spring-ﬂuctuations
shift—data-Ieft

shift-data-right
split

Instruction Description

empty, a vector of all zeros is pushed onto the stack.

Push the parameter onto the current stack

 

Push a vector consisting of the ﬁrst component of the
parameter onto the current stack

Push a vector consisting of the second component of the
parameter onto the current stack

Push a vector consisting of the third component of the
parameter onto the current stack

Using the parameter as an offset from the location of the
current point- mass, the constructor will move to the point-

 

mass closest to the new location and remove the spring,

between them.

masses, the constructor doesn't move.

Removes the current point-mass and moves to a point-
mass that rs closest to the current location plus the offset

 

represented by the parameter

If no spring exists between the point-‘

 

!

Changes the orientation of the individual builder: x->yaw, A

y->pitch, and z->roll

Sets the point-mass default data to be the parameter, but.

currently does nothing

Sets the spring constants to be the value of the parameter,

but currently does nothing

Sets the spring ﬂuctuation data to be the value of the

parameter, where each component is constrained between

-1and1

Shift the vector components on the top of the current stack

to the left. x->z, y->x, z->y

Same as shift- data-left but shifts the data to the right

The parameter is a vector, which rs ﬁrst normalized, then
scaled by the dilation state variable of the constructor
object. The modiﬁed vector is then used as an offset from
the location of the current point-mass to create a new
point -mass at that location. This new point-mass is
connected to the current point-mass by a single spring. No
spring is created if there was no originating point-mass,
which means the constructor object’ 8 location would have
been used as the starting location. The new point-mass

then becomes the current point-mass.

113

Instruction Name
‘ split-dup

’ sub

switch-stack

 

Instruction Name

else

' end

‘if

'jump

executed.

both if and loop instructions.

ifx—- —— ythen true

‘ instructions.

 

Instruction D_escription

 

 

Same as the split instruction but creates additional springs.

For each spring connected to the originating point-mass, a:
duplicate spring is created with the new point-mass. In
addition, the spring between the current point-mass and
the new point-mass is still created.

Subtract the parameter from the top of the current stack

Switches the current stack to one of the four stacks where
the parameter can only be the name of one of those stacks

‘ Table A.1. Default Instruction Set

 

Instruction Description

‘If this instruction does not fall between an if and end

instructions, it is ignored. If the if instruction fails, the

‘instructions between the else and end are executed,

otherwise the instructions between the if and else are

 

A11 instruction denoting the end of a control block, for

An if instruction must have a terminating end instruction,'
or it is ignored. The instructions within the if block are:
only executed if the if is successful. The if statement?
parameter is used to determine the success of the

instruction using the following logic

‘ifx>=ythentrue

ifx <= y then true

Finds the appropriate label, push the current constructor,
object’s state onto a special stack and move to the.
appropriate location in the genome to continue executing‘

114

Instruction Name
label

loop

partial

I POPS-Ci j

V popéline-d
push-d

set-dilation

I split-dup-oriented

of springs like the split-dup instruction.

Instruction Description :

 

A label is similar to a function in that the instructions
between it and the next one can only be executed by a
jump to that label. A label denotes a speciﬁc location in
the genome and its parameter is an id used by the jump
instruction to move the thread of execution to that
location. If more than one id exists, the last label is the»
one that is used. If a label is the ﬁrst instruction, a jump to‘
that label is automatic. Otherwise, you can never execute‘
a label's instructions unless a jump is used. Once the last
instruction in a label is executed the construction object
state stack is popped and its state restored. Once the last
state has been popped off the stack, the execution is

concluded.

The loop instruction executes the block of instructions,
between itself and the end instruction, at most ten times.
The loop is ignored if there is no terminating end
instruction. The condition logic is the same as the if ;
instruction, and the increment value utilizes the 2
component in the follow fashion. ‘
-z -> ifx >= y then true
+z -> if x <= y then true
0 ->ifx=ythent1'ue

The parameter is used to set the partial state in the
construction object, which IS used by split-partial.

 

_._4

Restores the construction object' s orientation and position

1 from the top of the orientation stack. If the stack 13 empty,

this instruction does nothing. The current point-mass is
set to the point- -mass nearest the restored location of the?
construction object.

Same as pop-d, except a spring is created between the
previous point-mass and the new point- mass

Pushes the construction object's orientation and position
onto the orientation stack.

The dilation vector is set to be the parameter. This 18 used

in split instructions to scale the offset vectors. 1

A -- #4,

Same as the split- orientation instruction with the creation

115

Instruction Name

split-oriented

‘ sit-partial

split-partial-oriented

top-d ’

LTable A.2. Instruction Set Extensions

OIICC.

 

Instruction Description

 

This is the same as a split, except the forward vector of the
orientation of the construction object is used instead of the
parameter as the offset. The parameter is then used as the‘
equivalent of the rotate command using the split-oriented
parameter as the rotate parameter, which occurs prior to‘

the creation of the new point-mass.

 

This? is the same as a split-dup with an exception. Instead.

‘of making duplicate of all of the springs, the parameter is

used to determine a percentage of springs to create.
Consider the springs as an array, x is the percentage of '
springs to duplicate from the start of the array, y is the
percentage around the middle of the array, and z is the‘
percentage from the end of the array. The percentages can
overlap, but a spring can never be replicated more than

 

,Same as a split-partial, but also utilizes the orientation
aspect of split-oriented.

53m as pop-d, except Reﬁt; is n_ot— popped "1101113;
stack.

116

 

Appendix B The Construction Object

The construction object is the external mechanism for replicating individuals. It
consists of two parts: a list of functions that correspond directly to the instruction set and
a set of vectors comprised of three real numbers that hold state information. The object
begins work when given a genome by linearly passing from the ﬁrst instruction to the
last. When an instruction is executed, the corresponding internal ﬁmction is called. The
functions do all of the work in generating the body by using a combination of internal
state information and the instruction parameter. Once the last instruction has been
executed, the body is complete and the genome and body are released back to the system.

There is a speciﬁc set of state variables available for use by the instructions that
manipulate the construction object during execution. The use of state variables allows a
reduction in instruction complexity by moving necessary information from the
instruction’s parameters into the construction object. Moving information into the object
requires that new instructions be added to the instruction set speciﬁcally to manipulate
the new states. Thus, a trade-off occurs between complex instructions and the need to
execute multiple instructions to accomplish the same task. However, manipulation of the
state variables is not necessary as each variable begins in a default state, and once the
state is set, future instructions will reuse that same state.

The state variables available for use by the instruction set are four stacks of
vectors, spring constants, spring ﬂuctuation data, point—mass properties, gravity, dilation,
partial data, and the construction object location and orientation. The stacks are available
for push and popping vectors for later use and for the results of mathematical

instructions. The spring constants, spring ﬂuctuation data, and point-mass properties are

117

 

each a vector describing the properties of the respective physical component when it is
added to the body. The gravity vector is constant and normalized, and can only be used
by an instruction for the purposes of direction. The dilation and partial data are scaling
effects upon a new point-mass. The location and orientation of the construction object
can be changed through instructions, but are mostly used as direction vectors.

Finally, the construction of the body is accomplished by executing the genome,
which is a simple program or script that drives the construction object. Each body begins
empty and the ﬁrst point-mass created always begins at the origin. As each new point-
mass is added to the body, the construction object keeps track of which point-mass it is
working on currently. The current point-mass is a key factor in body construction
because of the relative nature of the build order to that current point-mass. The relative
nature is based on the process where each new point-mass that is created uses the

instruction parameter as an offset from the current point-mass.

118

BIBLIOGRAPHY

Adami C. Introduction to Artificial Life. New York: Springer-Verlag, 1998.

Angeline PJ, Pollack J B. Coevolving High-Level Representations. In Langton CG (ed).
Artiﬁcial Life II], pp 52-72. Menlo Park, CA: Addison-Wesley, June 1994.

Bischof WF, Caelli T. On the learning of complex movement sequences. In Arcelli C,
Cordella LP, Sanniti di Baja G (eds). Visual Form, pp 463-472. New York: Springer,
2001a.

Bischof WF, Caelli T. Learning complex action patterns with CRGst. In Singh S,
Murshed N, Kropatsch W (eds). Advances in Pattern Recognition (IA CPR 2001), pp 280-
289. 2001b.

Bongard J C, Lipson H. Once More Unto the Breach: Co-evolving a robot and its
simulator. Proceedings of the Ninth International Conference on the Simulation and
Synthesis of Living Systems(ALIFE9), pp 57-62. 2004.

Bongard J C, Pfeifer R. Evolving Complete Agents Using Artiﬁcial Ontogeny. In Hara F,
Pfeifer R (eds), Morpho-functional Machines: The New Species (Designing Embodied
Intelligence), pp 237-258. Springer-Verlag, 2003.

Bongard, J C, Pfeifer R. Repeated Structure and Dissociation of Genotypic and
Phenotypic Complexity in Artiﬁcial Ontogeny, In Spector L (eds), Proceedings of The
Genetic and Evolutionary Computation Conference, GECCO—ZOOI, pp 829-836. San
Francisco, CA: Morgan Kauﬁnann, 2001.

Bossert W. Mathematical Optimization: Are There Abstract Limits on Natural Selection?
In Moorehead PS, Kaplan MM (eds). Mathematical Challenges to the Neo-Darwinian
Interpretation of Evolution, Philadelphia, PA: Wistar Institute Press, 35-46, 1967.

Burke EK, Gustafson S, Kendall G. Diversity in Genetic Programming: An Analysis of
Measures and Correlation with Fitness. IEEE Transactions on Evolutionary
Computation, 8(1):47-62, 2004.

Daida JM, Ward DJ, Hilss AM, Long SL, Hodges MR, Kriesel JT. Visualizing the Loss
of Diversity in Genetic Programming. Proceedings of the 2004 IEEE Congress of
Evolutionary Computation, pp 1225-1232. Portland, OR, Piscataway: IEEE Press, 2004.

Ekart A, Nemeth SZ. Maintaining the Diversity of Genetic Programs. In Foster J A,
Lutton E, Miller J F, Ryan C, Tettamanzi A (eds). Proceedings of the 5th European

Conference on Genetic Programming, pp 162-171. London, UK: Springer-Verlag, April
2002.

119

 

Fogel DB. An Introduction to Simulated Evolutionary Optimization. IEEE Transactions
on Neural Networks, 5(1):3-14, 1994.

Homby GS, Pollack J B. Creating High-Level Components with a Generative
Representation for Body-Brain Evolution. Artificial Life, 8(3):223-246. 2002.

Homby GS, Lipson H, Pollack J B. Evolution of generative design systems for modular
physical robots. IEEE International Conference on Robotics and Automation, pp 4146-
41 5 1 . 2001 a.

Homby GS, Pollack JG. Body-Brain Co-evolution Using L-systems as a Generative
Encoding. Genetic and Evolutionary Computation Conference, pp 868-875. 2001b.

Hu J. Sustainable evolutionary algorithms and scalable evolutionary synthesis of
dynamic systems. Doctoral dissertation, Department of Computer Science and
Engineering, Michigan State University, East Lansing, MI, 2004.

Komosinski M. The Framsticks system: versatile simulator of 3D agents and their
evolution. Kybernetes: The International Journal of Systems & Cybernetics, 32(1/2):156-
173, MCB University Press, 2003.

Komosinski M, Koczyk G, Kubiak M. On Estimating Similarity of Artiﬁcial and Real
Organisms. Theory in Biosciences, 120:271-286, 2001a.

Komosinski M, Rotaru-Varga A, Comparison of different genotype encodings for
simulated 3D agents. Artificial Life Journal, 7(4):395-418, MIT Press, 2001b.

Komosinski M. The World of Framsticks: Simulation, Evolution, Interaction. In Heudin
J C (ed). Virtual Worlds. Lecture Notes in Artificial Intelligence, pp 214-224. 2000.

Komosinski M, Ulatowski Sz. Framsticks: towards a simulation of a nature-like world,
creatures and evolution. Proceedings of 5th European Conference on Artificial Life

(ECAL99), pp 261-265. Lausanne, Switzerland: Springer-Verlag, September 1999.

Koza JR. Genetic Programming: 0n the Programming of Computers by Means of
Natural Selection. Cambridge, MA: MIT Press, 1992.

Lipson H, Pollack J B. Evolving Physical Creatures. In Bedau MA, McCaskill J S,
Packard NH, Rasmussen S (eds). Proceedings of Artificial Life VII (ALIFE7), pp 282-
287. Portland, OR, 2000.

Magurran AE. Measuring Biological Diversity. Oxford, UK: Blackwell Science, 2004.

120

 

McPhee NF, Hopper NJ. Analysis of genetic diversity through population history. In
Banzhaf W, Daida J, Eiben AE, Garzon MH, Honavar V, J akiela M, Smith RE (eds).
Proceedings of the Genetic and Evolutionary Computation Conference (GE C C099), pp
1112-1120. July 1999.

Ofria C, Wilke CO. Avida: A Software Platform for Research in Computational
Evolutionary Biology. Journal of Artificial Life, 10:191-229, 2004.

Pollack J B, Homby GS, Lipson H, Funes P. Computer Creativity in the Automatic
Design of Robots. LEONARDO. 36(2):115-121. 2003.

Popovici E, De J ong K. Understanding EA Dynamics via Population Fitness
Distributions. In Cantu-Pal E, et. al. (eds). Proceedings of the Genetic and Evolutionary
Computation Conference, pp 1604-1615. 2003.

Ray TS. Evolution, complexity, entropy, and artiﬁcial reality. Physica D, 75:239-263,
February 1994.

Rosca JP. Entropy-based adaptive representations. In Rosca JP (ed). Proceedings of the
ML95 Workshop: From Theory to Real- World Applications. pp. 23-32. 19953.

Rosca JP. An Analysis of Hierarchical Genetic Programming. Technical Report 566,
University of Rochester, NY, 1995b.

Ruebsamen GD. Evolving Intelligent Embodied Agents Within a Physically Accurate
Environment. Doctoral dissertation, Department of Engineering and Computer Science,

California State University, Long Beach, CA, December 2002.

Sims K. Evolving Virtual Creatures. Proceedings of SIGGRAPH 94, pp 15-22. July
1994a.

Sims K. Evolving 3D Morphology and Behavior by Competition. In Brooks R, Maes P
(eds). ALife IV: Proceedings of the 4th Conference on Artificial Life. pp. 28-39. MA:
MIT Press, 1994b.

Soda Play, www.sodaplay.com and www.sodaplay.com/constructor/how/works.htm.

Taylor T, Massey C. Recent Developments in the Evolution of Morphologies and
Controllers for Physically Simulated Creatures. Artificial Life, 7(1):77-87, 2001.

Taylor T. www.tim-taylor.com/research/embody.ht1nl.

Ventrella J. Sexual Swimmers: Emergent Morphology and Locomotion without a Fitness
Function. From Animals to Animats, pg 484. MIT Press, 1998.

121

 

Ventrella J. Explorations in the Emergence of Morpholgoy and Locomotion Behavior in
Animated Characters. In Brooks R, Maes P (eds). Proceedings of the Fourth Workshop
on Artificial Life, Boston, MA: MIT Press, 1994.

Verlet L. Computer "Experiments" on Classical Fluids. 1. Thermodynamical Properties of
Lennard-Jones Molecules. Physical Review, 159298-103, 1967.

122