THE EVOLUTIONARY POTENTIAL OF POPULATIONS ON
COMPLEX FITNESS LANDSCAPES
By
David Michael Bryson

A DISSERTATION
Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY
Computer Science
Ecology, Evolutionary Biology and Behavior
2012

ABSTRACT
THE EVOLUTIONARY POTENTIAL OF POPULATIONS ON
COMPLEX FITNESS LANDSCAPES
By
David Michael Bryson
Evolution is a highly contingent process, where the quality of the solutions produced is
a↵ected by many factors. I explore and describe the contributions of three such aspects that
inﬂuence overall evolutionary potential: the prior history of a population, the type and frequency of mutations that the organisms are subject to, and the composition of the underlying
genetic hardware. I have systematically tested changes to a digital evolution system, Avida,
measuring evolutionary potential in seven di↵erent computational environments ranging in
complexity of the underlying ﬁtness landscapes. I have examined trends and general principles that these measurements demonstrate and used my results to optimize the evolutionary
potential of the system, broadly enhancing performance. The results of this work show that
history and mutation rate play signiﬁcant roles in evolutionary potential, but the ﬁnal ﬁtness
levels of populations are remarkably stable to substantial changes in the genetic hardware
and a broad range of mutation types.

Very special thanks to
Emmalena J. Gregory-Bryson for love and support
Elena S. Bryson for helpful distraction
In memory of
Unnamed Bryson
8/6/2012

iii

ACKNOWLEDGMENT

I would like to extend my appreciation to the members of my guidance committee,
Richard Lenski, Erik Goodman, Eric Torng, Robert Pennock, and especially my advisor and
committee chair, Charles Ofria. Their collaboration, support, and feedback has been invaluable. Thanks to Zachary Blount, Arthur Covert III, Heather Goldsby, Matthew Rupp, Aaron
Wagner, and other members of the Michigan State University Digital Evolution Laboratory
for helpful discussion and critique. Special thanks to Je↵rey Barrick for the inspiration,
initial implementation, and collaboration on the early studies of slip mutations.
This material is based in part upon work supported by the National Science Foundation
under Cooperative Agreement No. DBI-0939454 and Grant CCF-0643952 Any opinions,
ﬁndings, and conclusions or recommendations expressed in this material are those of the
author(s) and do not necessarily reﬂect the views of the National Science Foundation.

iv

TABLE OF CONTENTS

List of Tables . . . . . . . . . . . . . . .
List of Figures
Chapter 1

. . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . viii
. . . . . . . . . . . . . . . . . .

xi

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1

Chapter 2 Methods . . . . . . . . . . . . . . . . . . . . . . . . .
2.1 The Heads Instruction Set Architecture . . . . . . . . . . .
2.2 Environments . . . . . . . . . . . . . . . . . . . . . . . . . .
2.2.1 Logic-9 . . . . . . . . . . . . . . . . . . . . . . . . . .
2.2.2 Logic-77 . . . . . . . . . . . . . . . . . . . . . . . . .
2.2.3 Match-12 . . . . . . . . . . . . . . . . . . . . . . . .
2.2.4 Fibonacci-32 . . . . . . . . . . . . . . . . . . . . . . .
2.2.5 Sort-10 . . . . . . . . . . . . . . . . . . . . . . . . . .
2.2.6 Limited Resource . . . . . . . . . . . . . . . . . . . .
2.2.7 Navigation . . . . . . . . . . . . . . . . . . . . . . . .
2.3 Analysis Tools and Methods . . . . . . . . . . . . . . . . . .
2.3.1 Mutational Neighborhood . . . . . . . . . . . . . . .
2.3.2 Short-Term Evolutionary Potential (STEP) Sampling
2.3.3 Data Analysis and Statistical Testing . . . . . . . . .

.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.

5
7
10
11
11
11
12
12
13
13
13
14
14
15

Chapter 3 How Does History A↵ect Evolutionary Potential?
3.1 Assessing the Contribution of History . . . . . . . . . . . . . .
3.1.1 Methods . . . . . . . . . . . . . . . . . . . . . . . . . .
3.1.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.1.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . .
3.2 Measuring Change in Evolutionary Potential . . . . . . . . . .
3.2.1 Methods . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2.2 Mutational Neighborhood Lineage Analysis . . . . . .
3.2.3 STEP Sampling . . . . . . . . . . . . . . . . . . . . . .
3.2.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . .
3.3 Characteristics of Potential Stepping-Stone Mutations . . . . .

.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.

16
19
19
20
22
23
23
24
32
35
36

v

Chapter 4 How Does the Genetic Hardware A↵ect Evolutionary Potential?
4.1 Optimizing Evolutionary Potential with Instruction Set Architectures . . . .
4.1.1 Tested Architecture Modiﬁcations . . . . . . . . . . . . . . . . . . . .
4.1.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.3.1 Fully-Associative Argumentation . . . . . . . . . . . . . . .
4.1.3.2 Number of Registers . . . . . . . . . . . . . . . . . . . . . .
4.1.3.3 Explicit Labels . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.3.4 Split Input/Output Operations . . . . . . . . . . . . . . . .
4.1.3.5 Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.3.6 Flow Control . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2 Task Success Acquisition Rate of Selected Instruction Sets . . . . . . . . . .
4.2.1 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2.2 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . .
4.3 Assessing Robustness to Poor Design Decisions . . . . . . . . . . . . . . . . .
4.3.1 Tested Instruction Set Modiﬁcations . . . . . . . . . . . . . . . . . .
4.3.2 General Performance Evaluation . . . . . . . . . . . . . . . . . . . . .
4.3.3 Impact on Evolutionary Potential in a New Environment . . . . . . .
4.3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.4 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Chapter 5 How Do Mutation Types A↵ect Evolutionary Potential?
5.1 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5.2 Exploring the Balance of Mutation Types . . . . . . . . . . . . . . . .
5.2.1 Optimal Mutation Rates for Adaptive Evolution . . . . . . . . .
5.2.2 Balance of Insertion/Deletions and Substitution Mutations . . .
5.2.3 Per-Site Mutation Rates vs. Per-Genome Mutation Rates . . . .
5.3 Investigating the E↵ect of Mutation Timing . . . . . . . . . . . . . . .
5.3.1 Copy Mutations versus Divide Mutations . . . . . . . . . . . . .
5.3.2 Cosmic-Ray Mutations . . . . . . . . . . . . . . . . . . . . . . .
5.3.3 Parent Mutations . . . . . . . . . . . . . . . . . . . . . . . . . .
5.4 Exploring Aspects of Insertion/Deletion Mutations of Large E↵ect . . .
5.4.1 Mutation Rates . . . . . . . . . . . . . . . . . . . . . . . . . . .
5.4.2 Insertion Sequence Content . . . . . . . . . . . . . . . . . . . .
5.4.3 Gene Translocation . . . . . . . . . . . . . . . . . . . . . . . . .
5.4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5.5 Side-e↵ects of Mutations . . . . . . . . . . . . . . . . . . . . . . . . . .
5.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5.7 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5.7.1 Comparing Sexual and Asexual Reproduction . . . . . . . . . .
5.7.2 Execution Errors . . . . . . . . . . . . . . . . . . . . . . . . . .
5.7.3 Mutation Changes During Evolution . . . . . . . . . . . . . . .
vi

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

38
40
40
44
44
46
48
51
54
54
57
61
66
66
67
70
71
72
77
81
82
83
85
85
86
89
92
96
96
99
101
103
104
107
112
113
114
116
116
116
117
117

Chapter 6

Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . 118

Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Appendix A Complete History Mutational Neighborhood Results . . . . . . 123
Appendix B Complete History STEP Sampling Results . . . . . . . . . . . . 149
References . . . . . . . . . . . . . . .

vii

. . . . . . . . . . . . . . . . . . . 176

LIST OF TABLES

Table 2.1

The seven environments used to test evolutionary potential.

. . . .

10

Table 3.1

General groups observed during mutational neighborhood analysis .

25

Table 3.2

General groups observed during STEP sampling . . . . . . . . . . .

33

Table 3.3

Mutation Corridor Entry and Exit Fitness E↵ects . . . . . . . . . .

36

Table 4.1

Flow series instruction sets tested

. . . . . . . . . . . . . . . . . .

43

Table 4.2

Heads and Fully-Associative Architectures Fitness . . . . . . .

47

Table 4.3

Heads and Fully-Associative Architectures Task Success . . . .

47

Table 4.4

R Series Architectures Fitness . . . . . . . . . . . . . . . . . . . . .

49

Table 4.5

R Series Architectures Task Success . . . . . . . . . . . . . . . . . .

50

Table 4.6

Label Series Architectures Fitness . . . . . . . . . . . . . . . . . .

52

Table 4.7

Label Series Architectures Task Success . . . . . . . . . . . . . . .

53

Table 4.8

Split-IO Architecture Fitness . . . . . . . . . . . . . . . . . . . . .

55

Table 4.9

Split-IO Architecture Task Success . . . . . . . . . . . . . . . . . .

55

Table 4.10

Search Series Architectures Fitness . . . . . . . . . . . . . . . . . .

56

Table 4.11

Search Series Architectures Task Success . . . . . . . . . . . . . .

56

Table 4.12

Flow Series Architectures Fitness . . . . . . . . . . . . . . . . . . .

59

Table 4.13

Flow Series Architectures Task Success . . . . . . . . . . . . . . . .

60

viii

Table 4.14

Heads and Flow-IfX-MovHead Architectures Fitness . . . . . .

65

Table 4.15

Heads and Flow-IfX-MovHead Architectures Task Success . . .

65

Table 4.16

Time to 50% Task Success . . . . . . . . . . . . . . . . . . . . . . .

67

Table 4.17

Extended Architecture Experiments Fitness . . . . . . . . . . . . . .

69

Table 4.18

Extended Architecture Experiments Task Success . . . . . . . . . .

70

Table 4.19

Bloat-X, Poison-X, and Die Architectures Fitness . . . . . . . .

75

Table 4.20

Bloat-X, Poison-X, and Die Architectures Task Success . . . . .

76

Table 5.1

Mutation Rate Fitness . . . . . . . . . . . . . . . . . . . . . . . . .

87

Table 5.2

Mutation Rate Task Success . . . . . . . . . . . . . . . . . . . . . .

88

Table 5.3

Insertion/Deletion and Substitution Proportion Fitness . . . . . . .

90

Table 5.4

Insertion/Deletion and Substitution Proportion Task Success . . . .

91

Table 5.5

Mutation Rate Application Fitness . . . . . . . . . . . . . . . . . . .

94

Table 5.6

Mutation Rate Application Task Success . . . . . . . . . . . . . . .

95

Table 5.7

Mutation Timing Fitness . . . . . . . . . . . . . . . . . . . . . . . .

98

Table 5.8

Mutation Timing Task Success . . . . . . . . . . . . . . . . . . . . .

98

Table 5.9

Cosmic-Ray Mutation Fitness . . . . . . . . . . . . . . . . . . . . . 100

Table 5.10

Cosmic-Ray Mutation Task Success . . . . . . . . . . . . . . . . . . 100

Table 5.11

Parent Mutation Fitness . . . . . . . . . . . . . . . . . . . . . . . . 102

Table 5.12

Parent Mutation Task Success . . . . . . . . . . . . . . . . . . . . . 102

Table 5.13

Insertion versions supported by the Slip mutation in Avida . . . . . 104

Table 5.14

Slip Duplication Mutation Rate Fitness . . . . . . . . . . . . . . . . 105

Table 5.15

Slip Duplication Mutation Rate Task Success . . . . . . . . . . . . . 106

ix

Table 5.16

Slip Mutation Insertion Versions Fitness . . . . . . . . . . . . . . . . 109

Table 5.17

Slip Mutation Insertion Versions Task Success . . . . . . . . . . . . 110

Table 5.18

Gene Translocation Fitness . . . . . . . . . . . . . . . . . . . . . . . 111

Table 5.19

Gene Translocation Task Success . . . . . . . . . . . . . . . . . . . . 111

Table 5.20

Sterilize Unstable — Fitness . . . . . . . . . . . . . . . . . . . . . . 115

Table 5.21

Sterilize Unstable Genotypes — Task Success . . . . . . . . . . . . . 115

x

LIST OF FIGURES

Figure 2.1

The architecture of the Avida virtual CPU . . . . . . . . . . . . . .

8

Figure 3.1

Distribution of EQU Evolution in Replay Experiments . . . . . . . .

20

Figure 3.2

Median time to EQU evolution . . . . . . . . . . . . . . . . . . . . .

21

Figure 3.3

Mutational Neighborhood Group I Example — Population 108 . . .

26

Figure 3.4

Mutational Neighborhood Group II Example — Population 147 . . .

27

Figure 3.5

Mutational Neighborhood Group III Example — Population 109 . .

27

Figure 3.6

Mutational Neighborhood Group IV Example — Population 136 . .

28

Figure 3.7

Mutational Neighborhood Group V Example — Population 116 . . .

29

Figure 3.8

Mutational Neighborhood Group VI Example — Population 125 . .

30

Figure 3.9

Mutational Neighborhood Group VII Example — Population 114 . .

31

Figure 3.10

Mutational Neighborhood of the PLoD from Population 145 . . . . .

31

Figure 3.11

STEP Sampling Group A Example

. . . . . . . . . . . . . . . . . .

34

Figure 3.12

STEP Sampling Group B Example . . . . . . . . . . . . . . . . . . .

34

Figure 3.13

STEP Sampling Group C Example . . . . . . . . . . . . . . . . . . .

35

Figure 4.1

Selected Flow Series Architectures Task Success Distribution . . .

58

Figure 4.2

Heads and Bloat-30 Fitness Trajectory . . . . . . . . . . . . . . .

77

Figure 4.3

STEP Sampling of Population 6156 . . . . . . . . . . . . . . . . . .

79

xi

Figure 4.4

STEP Sampling of Population 6166 . . . . . . . . . . . . . . . . . .

80

Figure 4.5

STEP Sampling of Population 2050 . . . . . . . . . . . . . . . . . .

80

Figure A.1

Mutational neighborhood of the PLoD from population 101 . . . . . 124

Figure A.2

Mutational neighborhood of the PLoD from population 102 . . . . . 124

Figure A.3

Mutational neighborhood of the PLoD from population 103 . . . . . 125

Figure A.4

Mutational neighborhood of the PLoD from population 104 . . . . . 125

Figure A.5

Mutational neighborhood of the PLoD from population 105 . . . . . 126

Figure A.6

Mutational neighborhood of the PLoD from population 106 . . . . . 126

Figure A.7

Mutational neighborhood of the PLoD from population 107 . . . . . 127

Figure A.8

Mutational neighborhood of the PLoD from population 108 . . . . . 127

Figure A.9

Mutational neighborhood of the PLoD from population 109 . . . . . 128

Figure A.10 Mutational neighborhood of the PLoD from population 110 . . . . . 128
Figure A.11 Mutational neighborhood of the PLoD from population 111 . . . . . 129
Figure A.12 Mutational neighborhood of the PLoD from population 112 . . . . . 129
Figure A.13 Mutational neighborhood of the PLoD from population 113 . . . . . 130
Figure A.14 Mutational neighborhood of the PLoD from population 114 . . . . . 130
Figure A.15 Mutational neighborhood of the PLoD from population 115 . . . . . 131
Figure A.16 Mutational neighborhood of the PLoD from population 116 . . . . . 131
Figure A.17 Mutational neighborhood of the PLoD from population 117 . . . . . 132
Figure A.18 Mutational neighborhood of the PLoD from population 118 . . . . . 132
Figure A.19 Mutational neighborhood of the PLoD from population 119 . . . . . 133
Figure A.20 Mutational neighborhood of the PLoD from population 120 . . . . . 133

xii

Figure A.21 Mutational neighborhood of the PLoD from population 121 . . . . . 134
Figure A.22 Mutational neighborhood of the PLoD from population 122 . . . . . 134
Figure A.23 Mutational neighborhood of the PLoD from population 123 . . . . . 135
Figure A.24 Mutational neighborhood of the PLoD from population 124 . . . . . 135
Figure A.25 Mutational neighborhood of the PLoD from population 125 . . . . . 136
Figure A.26 Mutational neighborhood of the PLoD from population 126 . . . . . 136
Figure A.27 Mutational neighborhood of the PLoD from population 127 . . . . . 137
Figure A.28 Mutational neighborhood of the PLoD from population 128 . . . . . 137
Figure A.29 Mutational neighborhood of the PLoD from population 129 . . . . . 138
Figure A.30 Mutational neighborhood of the PLoD from population 130 . . . . . 138
Figure A.31 Mutational neighborhood of the PLoD from population 132 . . . . . 139
Figure A.32 Mutational neighborhood of the PLoD from population 132 . . . . . 139
Figure A.33 Mutational neighborhood of the PLoD from population 133 . . . . . 140
Figure A.34 Mutational neighborhood of the PLoD from population 134 . . . . . 140
Figure A.35 Mutational neighborhood of the PLoD from population 135 . . . . . 141
Figure A.36 Mutational neighborhood of the PLoD from population 136 . . . . . 141
Figure A.37 Mutational neighborhood of the PLoD from population 137 . . . . . 142
Figure A.38 Mutational neighborhood of the PLoD from population 138 . . . . . 142
Figure A.39 Mutational neighborhood of the PLoD from population 139 . . . . . 143
Figure A.40 Mutational neighborhood of the PLoD from population 140 . . . . . 143
Figure A.41 Mutational neighborhood of the PLoD from population 141 . . . . . 144
Figure A.42 Mutational neighborhood of the PLoD from population 142 . . . . . 144

xiii

Figure A.43 Mutational neighborhood of the PLoD from population 143 . . . . . 145
Figure A.44 Mutational neighborhood of the PLoD from population 144 . . . . . 145
Figure A.45 Mutational neighborhood of the PLoD from population 145 . . . . . 146
Figure A.46 Mutational neighborhood of the PLoD from population 146 . . . . . 146
Figure A.47 Mutational neighborhood of the PLoD from population 147 . . . . . 147
Figure A.48 Mutational neighborhood of the PLoD from population 148 . . . . . 147
Figure A.49 Mutational neighborhood of the PLoD from population 149 . . . . . 148
Figure A.50 Mutational neighborhood of the PLoD from population 150 . . . . . 148
Figure B.1

STEP sampling along the PLoD from population 101. . . . . . . . . 149

Figure B.2

STEP sampling along the PLoD from population 102. . . . . . . . . 150

Figure B.3

STEP sampling along the PLoD from population 103. . . . . . . . . 150

Figure B.4

STEP sampling along the PLoD from population 104. . . . . . . . . 151

Figure B.5

STEP sampling along the PLoD from population 105. . . . . . . . . 151

Figure B.6

STEP sampling along the PLoD from population 106. . . . . . . . . 152

Figure B.7

STEP sampling along the PLoD from population 107. . . . . . . . . 152

Figure B.8

STEP sampling along the PLoD from population 108. . . . . . . . . 153

Figure B.9

STEP sampling along the PLoD from population 109. . . . . . . . . 153

Figure B.10 STEP sampling along the PLoD from population 110. . . . . . . . . 154
Figure B.11 STEP sampling along the PLoD from population 111. . . . . . . . . 154
Figure B.12 STEP sampling along the PLoD from population 112. . . . . . . . . 155
Figure B.13 STEP sampling along the PLoD from population 113. . . . . . . . . 155
Figure B.14 STEP sampling along the PLoD from population 114. . . . . . . . . 156

xiv

Figure B.15 STEP sampling along the PLoD from population 115. . . . . . . . . 156
Figure B.16 STEP sampling along the PLoD from population 116. . . . . . . . . 157
Figure B.17 STEP sampling along the PLoD from population 117. . . . . . . . . 157
Figure B.18 STEP sampling along the PLoD from population 118. . . . . . . . . 158
Figure B.19 STEP sampling along the PLoD from population 119. . . . . . . . . 158
Figure B.20 STEP sampling along the PLoD from population 120. . . . . . . . . 159
Figure B.21 STEP sampling along the PLoD from population 121. . . . . . . . . 159
Figure B.22 STEP sampling along the PLoD from population 122. . . . . . . . . 160
Figure B.23 STEP sampling along the PLoD from population 123. . . . . . . . . 160
Figure B.24 STEP sampling along the PLoD from population 124. . . . . . . . . 161
Figure B.25 STEP sampling along the PLoD from population 125. . . . . . . . . 161
Figure B.26 STEP sampling along the PLoD from population 126. . . . . . . . . 162
Figure B.27 STEP sampling along the PLoD from population 127. . . . . . . . . 162
Figure B.28 STEP sampling along the PLoD from population 128. . . . . . . . . 163
Figure B.29 STEP sampling along the PLoD from population 129. . . . . . . . . 163
Figure B.30 STEP sampling along the PLoD from population 130. . . . . . . . . 164
Figure B.31 STEP sampling along the PLoD from population 131. . . . . . . . . 164
Figure B.32 STEP sampling along the PLoD from population 132. . . . . . . . . 165
Figure B.33 STEP sampling along the PLoD from population 133. . . . . . . . . 165
Figure B.34 STEP sampling along the PLoD from population 134. . . . . . . . . 166
Figure B.35 STEP sampling along the PLoD from population 135. . . . . . . . . 166
Figure B.36 STEP sampling along the PLoD from population 136. . . . . . . . . 167

xv

Figure B.37 STEP sampling along the PLoD from population 137. . . . . . . . . 167
Figure B.38 STEP sampling along the PLoD from population 138. . . . . . . . . 168
Figure B.39 STEP sampling along the PLoD from population 139. . . . . . . . . 168
Figure B.40 STEP sampling along the PLoD from population 140. . . . . . . . . 169
Figure B.41 STEP sampling along the PLoD from population 141. . . . . . . . . 169
Figure B.42 STEP sampling along the PLoD from population 142. . . . . . . . . 170
Figure B.43 STEP sampling along the PLoD from population 143. . . . . . . . . 170
Figure B.44 STEP sampling along the PLoD from population 144. . . . . . . . . 171
Figure B.45 STEP sampling along the PLoD from population 145. . . . . . . . . 171
Figure B.46 STEP sampling along the PLoD from population 146. . . . . . . . . 172
Figure B.47 STEP sampling along the PLoD from population 147. . . . . . . . . 172
Figure B.48 STEP sampling along the PLoD from population 148. . . . . . . . . 173
Figure B.49 STEP sampling along the PLoD from population 149. . . . . . . . . 173
Figure B.50 STEP sampling along the PLoD from population 150. . . . . . . . . 174

xvi

Chapter 1
Introduction
Evolution by natural selection is a powerful process, having produced the incredible diversity
and complexity observed in the natural world. Biological evolution appears to demonstrate
seemingly limitless evolutionary potential, capable of continuously adapting to the environment. Still, while evolution may be broadly capable on such a grand scale, individual microcosms appear subject to the vagaries of history (Blount et al., 2008). Beyond basic research
questions, an understanding of how speciﬁc details of evolving systems a↵ect evolutionary
potential could have important implications for a wide range of challenges in evolutionary
biology, virology, bacteriology, immunology, oncology, and others. For example, given an
understanding of the factors that constrain evolutionary potential, we may be able to limit
pathogen evolution and prevent host shifts, or to develop antibiotics that simultaneously
reduce or prevent the evolution of resistance.
The ﬁeld of evolutionary computation seeks to harness the power of evolution to solve difﬁcult engineering problems. These techniques have shown great promise and have advanced
our capability to approach complex challenges Koza (2003). However, remaining limitations
1

to evolution in the computational realm include a capability for open-ended evolution of
complexity, as observed in nature. Evolutionary computation systems eventually stagnate,
failing to achieve further improvements, despite their theoretical availability. Understanding
factors that determine evolutionary potential is necessary if we are to overcome this limitation. A great deal of research has focused on increasing potential, such as preventing
diversity loss or improving ﬁtness functions and mutational operators.
There are many aspects of an evolving system that may limit or enhance its evolutionary
potential, including the prior history of a population, the type and frequency of mutations
that the organisms are subject to, and the composition of the underlying genetic hardware.
While there are other aspects that could be examined, these are the three that I selected to
explore systematically.
The prior history of an evolving population encompasses all of its historical events and
conditions, including environmental pressures, competition dynamics, and random chance.
These factors leave a lasting signature on the heritable substrate in an evolving system, which
determines where the organisms reside in a complex ﬁtness landscape. Di↵erent regions of a
ﬁtness landscape may have dramatically di↵erent properties, including epistatic interactions
between sites, robustness to mutations, and adaptive potential. Taken altogether, the prior
history of a population may play a major role in its evolutionary potential.
Mutations are a fundamental source of variation in evolving systems, deﬁning the types
and frequency of changes that can be made to the heritable substrate, and hence the possible trajectories that can be taken across the ﬁtness landscape of a given environment. The
resulting rules governing traversal can make some regions of a landscape more or less accessible, directly impacting evolutionary potential. Mutations also inﬂuence the degree to which
2

epistasis can be overcome, the organization and architecture of the functional characters in
the heritable substrate, and the robustness of evolved instances – all important aspects that
can limit or enhance evolutionary potential.
Here, the genetic hardware deﬁnes the characters of the heritable substrate and the
semantics of their interactions. The genetic hardware is the underlying foundation upon
which all aspects of evolution by natural selection ultimately act. In natural organisms,
the genetic hardware encompasses DNA, RNA, cellular machinery, and the biochemistry
that governs function. The genetic hardware deﬁnes the potential functional capabilities
of individuals in the system, how the characters encoded in the heritable substrate can be
changed, and the underlying architecture of the genetic hardware deﬁnes what those changes
will actually do. The genetic hardware establishes the rules by which a genotype is translated
into the actual phenotype upon which selection acts. The interactions of components of the
genetic hardware, at the genotype level and in the translation of it, heavily inﬂuence the
necessary epistasis of evolved instances, deﬁning the strength of connections required between
components and how robust or brittle they are to changes. Due to the foundational nature
of these properties, I expect the choice of genetic hardware to play a critical role in shaping
the ﬁtness landscape and ultimately the evolutionary potential of an evolving system as a
whole.
Digital evolution, which encompasses the essentials of the evolutionary process in a controlled, ﬂexible, and transparent computational environment, is a powerful tool for studying
evolving systems (Ofria et al., 2009). I have used digital evolution to better understand how
the above aspects of evolving systems a↵ect evolutionary potential at scopes and scales that
would have been intractable, if not impossible, with natural systems. I have measured and
3

described the consequences of history, investigating how the internal state of a population
impacts the future evolutionary potential of that population. In my exploration of mutations and genetic hardware I have taken a top-down approach, altering the characteristics of
the system and identifying how to enhance the evolutionary potential of the system in the
context of each.
The insight gained from these experiments will have signiﬁcant impact on the design
of future digital evolution systems. I have identiﬁed general design principles, parameter
selection guidelines, and novel techniques that support enhanced systematic evolutionary
potential in the form of greater population mean ﬁtness and long-term task success across
multiple environments. Furthermore, the knowledge gained provides a broad understanding
of how these core aspects of all evolving systems shape evolutionary potential. This understanding will allow researchers to make more accurate predictions about the evolutionary
potential of a system.

4

Chapter 2
Methods
I have performed all experiments using executables based on the Avida 2.x source code.1
Avida is a computational platform for performing experiments on evolving populations of
digital organisms. An experimental world in Avida consists of a two-dimensional grid of cells
that may each contain a single organism. Except where noted, experiments use worlds with
60-by-60 cells, supporting a maximum population size of 3,600 organisms. A digital organism
in this virtual world contains a central processing unit (CPU) that executes a genomic
program of assembly-like instructions based on a conﬁgured instruction set architecture. The
genomic program is responsible for directing organism behavior, including interactions with
the environment and replication. In all experiments described here, o↵spring organisms are
placed into the local neighborhood, the nine cells immediately connected to, and including,
the cell of the parent organism.
Organisms that replicate in Avida are subject to mutation. These mutations can take
many forms, including substitution of one instruction for another (point mutations), instruc1 The Avida Git source code repository may be retrieved from http://avida.devosoft.org
5

tion insertion, and instruction deletion, among others. Except where otherwise speciﬁed,
organisms were subject to a substitution mutation rate of 2.5 ⇥ 10 3 per site in the genome,
along with a 5 ⇥ 10 4 probability each for a single insertion or deletion per site in the
genome. The standard substitutions, insertions, and deletions occur during the division of
the o↵spring, after the parent organism has completed the copy process.
I seeded all cells in initial populations with identical organisms, all sharing a hand-written
ancestral genotype capable only of self replication. This ancestral genotype contains three
sections in its genomic program: (1) setup instructions that prepare for replication, (2) a
series of non-functional nop-C instructions, that take time and can modify other instruction
behavior, but do not otherwise alter organism behavior when executed, and (3) a simple
loop-based replication algorithm. The overall length of the ancestral genotype was either 50
or 100 instructions, varying only in the number of nop-C instructions between the header and
the replication loop (32 or 82, respectively). The replication algorithm copies the genomic
program one instruction at a time from the source to a subsequent region of memory. The
ancestral genome ends with a speciﬁc marker template that the organism can identify when
it has been copied. The default replication algorithm uses this signal to trigger o↵spring
division, extracting the region of memory containing the copied program as the genome of
the o↵spring. Once replication is complete, the state of the virtual CPU is completely reset,
including all memory, registers, stacks, bu↵ers, and head locations.

6

2.1

The Heads Instruction Set Architecture

The Heads instruction set architecture in Avida is the default virtual CPU conﬁguration.2
The virtual hardware that implements this instruction set is designed to operate on a genomic
program within a circular memory space, such that operations that reach the end of the
memory automatically wrap around to the beginning (as shown on the left side of Figure 2.1.
By default, it has three 32-bit registers, two stacks that can each hold ten values, four heads
that point to positions in the genome, input and output bu↵ers, and the ability to execute
26 standard instructions. The default instructions include three no-operation instructions
(nops): (nop-A, nop-B, and nop-C), which can serve to modify the default behavior of other
instructions. Most instructions observe the value of one subsequent nop instruction and alter
their behaviors accordingly. For instance, an instruction may change the source registers used
as operands when followed by a nop instruction. In addition to instruction modiﬁcation, nop
instructions can serve as patterns, which act as labels for genome locations. Label matching
uses cyclic complementary matching, where nop-A matches to nop-B, nop-B matches to
nop-C, and nop-C matches to nop-A.
The Heads instruction set has ﬁve ﬂow-control instructions: h-search, jmp-head,
mov-head, get-head, and set-flow. Each of these instructions can a↵ect the position
of one of the four architectural heads: the instruction pointer (ip), read head, write head,
and flow head. The h-search instruction searches the genome, starting from the ﬁrst
executed instruction in the genome, for a label (a sequence of one or more nop instructions)
that matches the cyclic complementary label that follows the instruction, placing the flow
2 All versions of Avida 2.x through version 2.13 share a functionally equivalent default
virtual CPU conﬁguration. Some speciﬁc details in the conﬁguration ﬁles have changed
across versions.
7

CPU

Genome
nand
nop-A
nop-B
nop-D
h-search
h-copy
if-label
nop-C
h-divide
h-copy
mov-head
nop-B

AX:FF0265DC
IP
OP1?
OP2?
OP3?

BX:00000100

FLOW

EX:00000000

READ

FX:A1476003

CX:1864CDFE
DX:D01B2594

FLOW
FLOW
WRITE

Heads

Input
Output

Stacks

Figure 2.1: The architecture of the Avida virtual CPU
Registers (upper right), stacks (lower right), genomic program (left), heads (middle), and
environmental bu↵ers (lower right). The solid lines depict the default Heads architectural
features. The dashed lines show some of the modiﬁcations tested in Chapter 4.
head after the matching sequence; if the sought-after label is not found, it places the flow
head on the instruction immediately subsequent to itself. This is one of only two instructions in the default Heads instruction set that is a↵ected by more than one nop instruction,
the other being if-copied described below. The mov-head instruction moves the ip to the
current location of the flow head. The jmp-head instruction shifts the position of the ip by
the amount speciﬁed in a register. The get-head instruction places the current location
of the ip into a register. Finally, the set-flow instruction moves the flow head to the
absolute genome location speciﬁed by the value in a register.
8

The set also contains three conditional instructions that will skip a subsequent instruction
if the test condition is false. The two basic conditional instructions, if-n-equ and if-less,
perform a comparison between two registers. The if-copied instruction interacts with
the read head, evaluating to true if the last sequence of instructions copied matches the
complement of the label that follows the instruction. This instruction is primarily for use in
conjunction with the replication instructions described below to identify the portion of the
genome most recently copied.
Seven arithmetic and logic operations are supported in the default Heads instruction
set: add, sub, inc, dec, nand, shift-l, and shift-r. All of these instructions operate on
values stored within registers and accept a single nop modiﬁer, which changes the source
and destination registers depending on the operation.
Five instructions facilitate data movement and environmental interaction. The push,
pop, and swap-stk instructions all operate on the two stacks within the architecture. Only
one stack is accessible at a time, with the swap-stk instruction toggling the currently active
stack, while push and pop copy numbers from registers to the top of the stack and vice-versa.
Each of these instructions can be nop-modiﬁed to specify which register should be used. The
swap instruction exchanges the values of two registers. The IO instruction interacts with the
environment of the digital organism, outputting the current value in a register and replacing
it with a value from the environmentally controlled input bu↵er. Values output via this
instruction are evaluated by the environment, potentially triggering a reward or other action
if they match one of the tasks in the environment as explained below.
Lastly, there are three instructions that facilitate self-replication. The h-alloc instruction allocates additional memory within which the digital organism can copy its o↵spring.
9

Environment

Rewarded Functions

Logic-9

Calculate nine logic operations with one or two inputs.

Logic-77

Calculate 77 logic operations with one, two, or three inputs.

Match-12

Generate up to 12 speciﬁc numbers.

Fibonacci-32

Output up to 32 numbers of the Fibonnaci sequence, in order.

Sort-10

Input 10 random numbers and output in correctly sorted order.

Limited-9

Logic-9 environment with a limited resource associated with each task.

Navigation

Successfully traverse a labeled pathway.

Table 2.1: The seven environments used to test evolutionary potential.

Copying is performed by repeated execution of the h-copy instruction, which duplicates the
current instruction found at the read head to the position marked by the write head and
advances both heads. Once copying has been completed, the organism must execute the
h-divide instruction to ﬁnalize the replication process, extracting the memory between the
read head and the write head as the genome of the o↵spring.

2.2

Environments

Avida supports a wide range of computational environments. For the work presented here,
I have used seven distinct environments (Table 2.1), each of which focuses on a di↵erent
aspect of the virtual architecture and presents unique evolutionary pressures. Activities (or
tasks) whose performance provide a metabolic reward deﬁne the environment. These rewards
increase the computation speed of the digital organism’s virtual CPU, making it possible to
obtain a competitive advantage relative to other organisms in the population.
10

2.2.1

Logic-9

The Logic-9 environment consists of metabolic rewards for all possible 1- and 2-input binary
logic operations; there are nine unique operations after removing symmetries and excluding
trivial operations. The environment rewards the performance of these tasks multiplicatively,
thus virtual CPU speed will increase exponentially as the organism performs additional tasks.
There are ﬁve reward levels associated with groups of logic operations, ranked by di culty.
The easiest group (NOT and NAND) will double computational speed, while the highest
level (EQU) increases execution speed by thirty-two times. The environment rewards each
task only once during an organism’s lifetime. This environmental setup is the default for
Avida and many previous experiments have used it (Lenski et al., 1999, 2003; Misevic et al.,
2006).

2.2.2

Logic-77

The Logic-77 environment increases the size and complexity of the Logic-9 environment by
adding a reward for all unique 3-input binary logic operations. In contrast to the Logic-9
environment, all operations provide an equal beneﬁt, doubling the execution speed of the
virtual CPU for the ﬁrst time the organism performs computation. Yedid et al. (2009) used
this environment.

2.2.3

Match-12

The Match-12 environment tests organisms’ ability to build arbitrary numbers, a task that
we have previously observed to constitute an obstacle to evolution (unpublished data). The
environment grants rewards in an additive manner for outputting each of twelve possible
11

numbers, unrelated to the random inputs. We selected numbers spaced approximately exponentially throughout the 32-bit number space, but the numbers contain no explicit patterns
to them. The environment rewards the output of each number only once during an organism’s lifetime. Output evaluation allows near matches, but the reward decays via a half-life
function based upon the number of bits that are incorrect with a minimum threshold of 22
bits correct to prevent most numbers from triggering many ‘lucky guesses’.

2.2.4

Fibonacci-32

The Fibonacci-32 environment rewards organisms multiplicatively for each number in the
Fibonacci sequence until the 32nd iteration of the sequence. After this target, the environment penalizes the organism at half this rate for additional numbers output, whereby
outputting 64 additional numbers will e↵ectively negate all beneﬁt of the ﬁrst 32. The purpose of this setup is to examine the capability of an instruction set to support ﬁnite recursion
and conditional looping.

2.2.5

Sort-10

The Sort-10 environment supplies a list of 10 random inputs, and rewards organisms for
outputting those values in descending order. Similar to the Match-12 environment, the
reward value decays via a half-life function for each incorrectly sorted value, based on the
number of moves required to shift it to the correct order. Given the limited number of
available registers, this task requires the use of the stacks and relatively complex ﬂow control.
12

2.2.6

Limited Resource

The Limited-9 environment, based on the Logic-9 environment, o↵ers metabolic rewards each
of the nine rewarded logic operations. However, unlike the Logic-9 environment, the Limited9 environment associates a separate, consumable resource with each task, the amount of
which determines the exact reward value. Each resource ﬂows into the environment at a
rate of 100 units per update, and out at 1% of the remaining concentration. If no organisms
are using the resource it will level out to 10, 000 units. This environment was ﬁrst used in
Cooper and Ofria (2003) and was recently used in Walker and Ofria (2012).

2.2.7

Navigation

The Navigation environment rewards organisms for successfully navigating a circuitous path
marked by cues (“sign posts”) including “turn left”, “turn right”, and “repeat last turn”,
as described in Grabowski et al. (2010). This task requires the use of basic memory, looping, and decision making. Additionally, the environment tests robustness of instruction set
architectures to the addition of several experiment-speciﬁc instructions, in this instance for
sensing and moving in the virtual maze.

2.3

Analysis Tools and Methods

In order to conduct the analysis of the experiments performed, as described below, I required
several tools, many of which I developed for this purpose. Here I describe the general
operation of these tools, reserving the details of their application for the experiment methods
sections below.
13

2.3.1

Mutational Neighborhood

Mutational neighborhood analysis is the systematic exploration of potential mutant genotypes surrounding a reference genotype. This method examines all possible single and double
mutations for all classes of mutations present in a given experimental setup. Each mutant
genotype is evaluated for ﬁtness, task proﬁle (phenotype), and an information-theoretic
measure of complexity. I also collected aggregate statistics regarding ﬁtness of the mutant
genotype relative to the reference genotype, shifts in phenotype, and changes in complexity. These data provide a wealth of information regarding the local ﬁtness landscape of the
reference genotype.
The mutational neighborhood tool currently implements this method for all possible
combinations of single and double mutations from three types of mutation: single-instruction
substitution (point mutation), single-instruction insertion, and single-instruction deletion.
Additional information is collected for genotypes demonstrating the acquisition of a selected
target task, including counts of mutation pathways that confer the target task, along with
mean relative ﬁtness of all such pathways discovered. When two-step pathways that confer
the selected target task are identiﬁed, the class of each mutation taken individually in that
pathway is also tracked, be it beneﬁcial, neutral, deleterious, or lethal.

2.3.2

Short-Term Evolutionary Potential (STEP) Sampling

Short-term evolutionary potential (STEP) sampling explores the mid-range ﬁtness landscape
of a reference genotype by evolving repeated short runs from the same starting point and
analyzing aggregate statistics of the outcome of each. This procedure involves injecting
the reference genotype as a single organism in an otherwise empty experimental world. I
14

then allow the world to evolve for a short period, 10, 000 updates (approximately 1, 000
generations) for the work presented here, after which I collect metrics of interest, such as
phenotype and ﬁtness. I repeat this procedure with the same reference genotype multiple
times for statistical assessment of the genotype’s evolutionary potential (10 for all studies
presented here).

2.3.3

Data Analysis and Statistical Testing

Data analyses and statistical tests were performed using MATLAB R2012a, Python 2.7.3,
Matplotlib 1.10, NumPy 1.6.2, SciPy 0.10.1, MySQLdb 1.2.3, MySQL 5.5.23, and the Apto
C++ Library (pre-release 1.0).

15

Chapter 3
How Does History A↵ect
Evolutionary Potential?
The evolutionary process shapes the genomes of organisms through a combination of both
chance and determinism (Mayr, 1988). Random mutations introduce novel genetic variation
that can a↵ect phenotypic traits while natural selection acts non-randomly on such heritable
phenotypic variation (Conner and Hartl, 2004). Despite the strength of natural selection,
genetic drift may randomly cull adaptive mutants before selection can ﬁx them (Conner and
Hartl, 2004).
Individual sites within a genome and indeed whole genes in organisms can form highly
interactive networks to express a phenotype. As a result of these epistatic interconnections,
the ﬁtness of a mutation can depend upon the genetic background in which it occurs. Fixation of a mutation yields a new background, altering the ﬁtness and likelihood of future
mutations (Weinreich and Chao, 2005; Weinreich et al., 2006). The order in which mutations occur can therefore take genotypes along di↵erent mutational pathways through the
16

adaptive landscape. Such alternate paths may lead to di↵erent ﬁtness peaks (Wright, 1988).
Steven J. Gould has argued that, in the context of such properties, chance and historical events make evolution prone to historical contingency and therefore fundamentally
unpredictable (Gould, 1989). Contingency is the property of history whereby outcomes follow from antecedent states, both unpredictably and causally dependent upon prior states
(Beatty, 2006). To explain this, Gould posed a thought experiment:

I call this experiment “replaying life’s tape.” You press the rewind button and,
making sure you thoroughly erase everything that actually happened, go back
to any time and place in the past.... Then let the tape run again and see if the
repetition looks at all like the original (Gould, 1989).
He expected that any replay of the tape would lead evolution down a pathway radically
di↵erent from the road actually taken (Gould, 1989).
Others suggest that more striking is the importance of determinism and thus predictability. The pervasiveness of parallel and convergent evolution support this assertion, as seen in
the multiple independent development of eyes, echolocation, and social systems, among other
innovations (Conway Morris, 2003; Dawkins, 1996b,a). Common challenges will inevitably
arise from ecological interactions, for which the scope of viable solutions may be limited.
Faced with such sparse options, natural selection is robust enough to discover these solutions as a lineage adapts, regardless of its history (Conway Morris, 2003; Dawkins, 1996b,a).
Although important di↵erences would result from any single replay of the tape, multiple replays would also demonstrate notable similarity (Van Valen, 1991). As Simon Conway Morris
puts it “the evolutionary routes are many, but the destinations are limited” (Conway Morris,
2003).
In the ﬁeld of evolutionary computation, it is just as important to understand the role
17

of history in evolution as it is in biology. Genetic algorithms (Holland, 1975), genetic programming (Koza and Poli, 2005), evolutionary programming (Fogel, 1995), and evolutionary
strategies (Rechenberg, 1971) all seek to e ciently solve computational and engineering
problems through the use of evolutionary principles. Contingent, historical events may constrain the search space of a given evolutionary algorithm run, limiting the e↵ectiveness of
the search. Given the size and complexity of target problems and the high cost of evaluating
potential solutions, it is crucial to maximize search e ciency (Eby et al., 1999).

As evolutionary outcomes are deﬁned by both random events and non-random processes,
the challenge is to deﬁne the degree to which contingency shapes those outcomes (Foote,
1998). A small but growing number of studies have sought to provide such insight, looking
into the relative roles of chance, history, and adaptation in evolution, as well as contingency
in macroevolutionary trends (Blount et al., 2008; Travisano et al., 1995; Taylor and Hallam,
1998; Losos et al., 1998; Yedid and Bell, 2002). In light of this work, I have explored the role of
history in determining evolutionary potential and to gain an understanding of how historical
events shape the trajectory of evolution. The studies presented below focus on the evolution
of the EQU task from the Logic-9 environment, a trait that has been shown in previous
work to be complex and contingent upon building-block tasks (Lenski et al., 2003). First, I
performed a Gould-inspired replay experiment, investigating the contribution of history to
the evolution of EQU. Following that, I used two means of measuring evolutionary potential
along the principal line of descent of several case study populations and assessed the longterm impact of historical mutations. Lastly, I examine the characteristics of stepping-stone
mutations that bring genotypes within regions of the ﬁtness landscape close to the evolution
of the EQU function.
18

3.1

Assessing the Contribution of History

Stephen J. Gould’s thought experiment of “replaying life’s tape”, though impossible on the
grand scale he envisioned, has the power to demonstrate the relative contributions of history,
chance, and determinism. Using digital evolution, we can take historical snapshots, from
which we can restart evolution. With this capability, it becomes possible to truly replay
the tape of digital life, examining the repeatability of outcomes. Moreover, this replay
experiment can itself be replicated with multiple independent histories, allowing conclusions
to be drawn about the speciﬁc contribution of history to the evolutionary potential of the
subsequently repeated replays.

3.1.1

Methods

I evolved a total of 100 independent populations for 100,000 updates using the standard
conﬁguration described in Chapter 2. The environment used for these 100 runs was a variant
of the Logic-9 environment — one in which performance of the complex EQU task did not
provide a metabolic reward. I saved the complete genotypic history as well as the distribution
and age of all organisms at the end of the 100,000 updates for all 100 populations.
I used the saved population information from each run to found 10 new replicate populations each, for a total of 1,000 replicates grouped as 100 shared-history clusters. I placed
these populations into the standard Logic-9 environment (including reward for performing
EQU) and allowed them to evolve for 900,000 additional updates. Due to the shared history
from the initial 100 runs, each of the 1,000 replicates experienced a total of 106 updates of
evolution from the ancestral population, 10% of which was shared with 9 other replicates. I
examined the task proﬁle of the most abundant (dominant) genotype from the end of each
19

Figure 3.1: Distribution of EQU Evolution in Replay Experiments
Distribution of EQU predicted by a null model (red line) and experimentally observed (blue
bars). Each bar represents the number of shared-history clusters where EQU evolved the
speciﬁed number of times. The null model is the expected binomial distribution that should
have been seen if shared history played no role, and thus there was no correlation between
runs in a shared-history cluster. For interpretation of the references to color in this and all
other ﬁgures, the reader is referred to the electronic version of this dissertation.
of the 1,000 replicate populations for EQU performance.

3.1.2

Results

A total of 597 populations evolved to calculate the EQU function at the end of the 1,000
replays performed. The resulting populations performing EQU were organized into sharedhistory clusters. Among these shared-history clusters, thirty demonstrated 10/10 replays
20

Figure 3.2: Median time to EQU evolution
Each point represents the value observed for shared-history clusters where EQU evolved
the speciﬁed number of times. The y-axis shows total updates of evolution, including the
100,000 updates from shared history. Dotted lines show the 95% non-parametric bootstrap
conﬁdence intervals (10,000 iterations).

performing the function, seven of them 9/10, eight of them 8/10, etc. The full distribution
of results can be found in Figure 3.1. For comparison, if history were not a factor in the
evolution of EQU, the distribution of populations would be expected to follow a binomial
distribution with p = 0.597, also shown in Figure 3.1. The observed results signiﬁcantly
diverge from the binomial expectation (p < 5.084 ⇥ 10 20 , Fisher’s Exact).
Overall, the median time to evolve the EQU function in successful populations was
135,291 total updates, including the 100,000 updates of history shared among replay groups,
with 95% non-parametric bootstrap conﬁdence interval of 123,340 to 152,944. When the
21

times to evolve EQU were grouped according to shared history and fraction of successful
replays, a notable trend was observed (Figure 3.2). Historical populations that resulted in
10/10 replays evolving the EQU function had a median time to EQU of only 104,336 updates
(102,769, 105,750.5), signiﬁcantly lower the global median (p < 3.174 ⇥ 10 23 , Wilcoxon
rank sum test). At the opposite end of the spectrum, historical populations that yielded
only one successful replay demonstrated median time to EQU of 509,512 updates (398,881,
742,679), signiﬁcantly higher than both the 10/10 populations (p < 1.025 ⇥ 10 6 ) and the
overall median (p < 9.980 ⇥ 10 5 ).

3.1.3

Discussion

History clearly plays a substantial role in determining evolutionary potential of the EQU
function. In comparison to the null expectation that history has no impact on future evolution, the observed results are strikingly divergent. The variation in time to evolve EQU
between groups of shared history presents hints at potential explanations for the strong e↵ect
of history. As shared-history clusters demonstrated less aggregate success, the median time
to reach the observed instances of EQU evolution also increased. The history of these less
successful populations may have taken the genotypes to regions of the ﬁtness landscape farther away from the evolution of EQU than their more successful counterparts. Constraints,
such as epistasis and genetic substrate, may be slowing these populations down and limiting
overall evolutionary potential with respect EQU.
22

3.2

Measuring Change in Evolutionary Potential

History substantially contributes to the future evolutionary potential of populations with
respect to the evolution of the EQU function. Implicit in that conclusion is that there are
some aspects of the historical genotypes that enhance or limit the ability of future generations
to evolve a particular phenotypic trait, in this case EQU. Evolution in some of those histories
traversed into regions of the ﬁtness landscape that made it highly unlikely for the function
to evolve in the next 900,000 updates. In other histories, the landscape presented great
potential for evolving EQU, with the rest ﬁlling out the spectrum of possibilities. These
observations raise questions about how evolutionary potential changes during evolution. Are
there particular types of mutations that decrease the potential for EQU evolution, and if so,
are those mutations typically beneﬁcial when they come into the genotype? As genotypes
approach regions of the ﬁtness landscape close to the EQU function, what do the mutation
pathways to the task look like? Does EQU typically require steps through ﬁtness valleys,
contributing to its rarity? In the experiments described here, I attempt to answer these
questions by directly exploring the ﬁtness landscape surrounding genotypes throughout the
history of selected runs. I have applied two techniques to exploring the landscape, exhaustive
mutational neighborhood mapping and STEP sampling.

3.2.1

Methods

I evolved ﬁfty reference populations for 106 updates in the Logic-9 environment and collected
historical genotype data. At the end of each experiment I extracted the principal line of
descent, deﬁned as the complete lineage from the ancestral genotype to the most abundant
genotype at the end of the run. I used a test environment to evaluate the task proﬁle of each
23

genotype along each of these principal lines of decent. I truncated all lineages at the ﬁrst
genotype capable performing EQU if one was found. I then used these truncated-EQU and
non-EQU lineages to perform the two analysis methods described in the following sections.
In order to aid the analysis of the experiments, the study populations evaluated in this
section utilize a conﬁguration that di↵ers in a number of ways from the description in the
methods chapter. The most important among these di↵erences is the mutations that were
experienced by the organisms: rather than the standard substitution, insertion, and deletion
mutations, the organisms were subject to a uniform mutation that chose equally among 53
possibilities, 26 instruction substitutions, 26 instruction insertions, or a deletion. Additionally, a standard length-based adjustment factor that is typically applied to metabolic rate
was disabled, increasing selection pressure for the evolution of more rapid production of
o↵spring. The changes allowed for direct interpretation of mutation probabilities calculated
during mutational neighborhood analysis.

3.2.2

Mutational Neighborhood Lineage Analysis

I performed mutational neighborhood analysis for all genotypes along each principal line of
descent (PLoD) extracted from the ﬁfty populations. All possible one- and two-step mutation
pathways conferring the non-lethal1 performance of the EQU task were identiﬁed. I further
classiﬁed two-step mutation pathways according to the ﬁtness e↵ect of each mutation in the
path, providing an indication of the character of the pathway if the mutations were taken
individually, rather than combined. Although the individual results of the 50 lineages all
1 It is possible for a genotype to be capable of computing the EQU function, while being
incapable of producing an o↵spring. Mutations that generate such genotypes were classiﬁed
as ‘lethal’ and excluded from identiﬁed pathways.
24

Group

# Pops

Found EQU?

When near EQU in genotype space?

I
II
III
IV

11
6
6
4

Yes
Yes
Some
No

V

3

Yes

VI
VII
VIII

2
4
14

No
No
No

Just prior to EQU appearing
Sporadically throughout lineage
Most of lineage
Sporadically throughout lineage
Early followed by long period away,
returning just before EQU appeared
Early, but never returned
Long contiguous region
Rarely or never

Table 3.1: General groups observed during mutational neighborhood analysis
Each row shows the group number, number of populations that ﬁt into the group, and the
deﬁning characteristic of the group.
had unique features (see Appendix A), I observed eight general types of traversals through
the ﬁtness landscape (Table 3.1).
Group I, representing 11 of the populations examined, demonstrated a reasonably short,
mostly contiguous section of the lineage where the genotype was within two mutations of
the evolution of EQU before the task actually appeared in the line of descent. The two-step
pathways in these regions varied in the quality of their individual steps, showing di↵erent
distributions of beneﬁcial, neutral, deleterious, and lethal steps. Almost universally the
pathways contain some deleterious single steps. The pathways in this group frequently
showed neutral pathways as well, and occasionally some extremely rare beneﬁcial steps.
Figure 3.3 shows a representative example of results from this group, where the EQU task
appeared in the genotype immediately following the block of two-step pathways. As can be
observed in the ﬁgure, there were also single-step pathways that non-lethally conferred the
EQU task.
Group II, representing 6 populations, demonstrated similar patterns leading up to the
25

Figure 3.3: Mutational Neighborhood Group I Example — Population 108
Mutational neighborhood of the PLoD from population 108, representative of Group I. The
top graph shows the absolute probability of a single mutation conferring performance of EQU.
The middle graph shows the absolute probability of two mutations occurring simultaneously
that, combined, confer performance of EQU. Note that the top and middle graphs are on
distinct scales. The bottom graph shows the quality of individual mutations within two-step
pathways when taken alone, grouped into four general classes: beneﬁcial (green), neutral
(blue), deleterious (red), lethal (black).
evolution of EQU. However, unlike the ﬁrst group, these populations also included genotypes
at numerous points earlier in the line of descent that were within two mutations of the task,
but never found it at those points. These earlier two-step pathways generally had low
probability of occurring, with each mutation typically being deleterious or lethal if taken
alone. Figure 3.4 shows an example population from this group.
The six lines of descent from Group III were even more extreme: they contained genotypes
that were within two mutations of EQU for most of their histories, but either mutation singly
would have been deleterious or lethal, although on rare occasion some neutral paths were
available. Only one of the six populations eventually evolved EQU, population 109 (Figure
26

Figure 3.4: Mutational Neighborhood Group II Example — Population 147
Fitness classes: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure 3.5: Mutational Neighborhood Group III Example — Population 109
Fitness classes: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

27

3.5), although the character of the pathways prior to the evolution of the task were similar in
probability and mutation class fractions to the other ﬁve. Interestingly, population 109 was
within a single mutation of EQU early in the lineage, although this mutation itself would
have been highly deleterious, despite the fact that it would have conferred EQU.
Group IV, containing four lineages, demonstrated scattered regions in the histories where
the genotypes were within reach of the EQU task. The pathways identiﬁed were all either
deleterious or lethal when taken a single mutation at a time. None of the four populations
evolved EQU during the original experiment. Figure 3.6 shows a representative line of
descent.
The three lineages categorized as Group V showed small pockets of genotypes early on
that were within two mutations of EQU, followed by many hundreds of genotypes represent-

Figure 3.6: Mutational Neighborhood Group IV Example — Population 136
Fitness classes: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

28

Figure 3.7: Mutational Neighborhood Group V Example — Population 116
Fitness classes: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

ing hundreds of thousands of updates of evolutionary time prior to the genotypes returning
to a region of the landscape near the task toward the end of the experiment. The mutation
pathways were almost universally deleterious. Although these populations were within two
mutations of EQU at the end, often for hundreds of steps, none of them evolved it during
the initial experiment. Figure 3.7 shows a representative lineage from this group.
Group VI represents two lines of descent where a segment of the genotypes early on
in the experiment were within two steps of EQU. These lineages, however, wandered away
from this region and were never within two mutations again. The pathways in population
113 lineage were all deleterious in nature, the other, population 125, demonstrating a small
fraction of neutral pathways for some of the genotypes (Figure 3.8).
Four lines of descent ﬁt into Group VII, demonstrating long contiguous regions of geno29

Figure 3.8: Mutational Neighborhood Group VI Example — Population 125
Fitness classes: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

types that are within two mutations of EQU. These four lineages can be subdivided into
two groups, based on the character of the pathways within the regions. Two of the lineages
had only deleterious and lethal paths identiﬁed, neither of which evolved EQU in the original experiment, despite the long stretch of time represented in the lineage. The other two
lines of descent had pathways primarily consisting of deleterious mutations, accompanied by
small fractions of the other mutation classes. Both also presented with single-step pathways
for most of the genotypes as well, and both eventually evolved EQU at the end of the long
stretch of lineage. Figure 3.9 shows a representative from the latter subgroup.
Finally Group VIII consisted of 14 lines of descent that were rarely or never within
two mutations of EQU. In fact, only two of the lineages demonstrated any such genotypes.
One lineage contained a single genotype with two-step pathways to EQU, all of which were
30

Figure 3.9: Mutational Neighborhood Group VII Example — Population 114
Fitness classes: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure 3.10: Mutational Neighborhood of the PLoD from Population 145
Fitness classes: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

31

deleterious. The other lineage also contained only a single genotype with two-step pathways,
but in this case the pathways were all lethal.
One population from the ﬁrst group, population 145, presents an interesting case study
(Figure 3.10). It initially appeared as if it would be in Group I, with buildup of EQU
likelihood, culminating the evolution of the function at around genotype 80 or so in the
lineage. Extended mutational neighborhood analysis following the appearance of the EQU
function showed that after about half a dozen genotypes, the EQU function was traded o↵
for a group of three other functions, AndNot, NOR, and XOR. The genotypes in the lineage
stayed within two mutations of EQU for the remainder of the experiment, but it was never
re-acquired, despite occasional neutral paths.

3.2.3

STEP Sampling

I performed STEP sampling utilizing the same 50 principal lines of decent as used for the
mutational neighborhood analysis, with 10 replicate STEP sampling runs based on each
genotype along all 50 principal lines. For example, the 2,708 genotypes on the principal line
of descent of population 113 were sampled with a total of 27,080 sample runs. I extracted
the most abundant genotype from the population at the end of each STEP sample run and
evaluated it in a test environment for the ability to perform the EQU function. I classiﬁed
the results from these analyses into three general groups (Table 3.2).
Group A, which contained 16 lines of descent, demonstrated peaks of successful sample
runs early in the history, followed by long periods of no activity prior to returning to higher
probabilities just before the evolution of EQU in the main run being tested. Some of the rises
toward the end were only a few steps before reaching the point where 10 out of 10 sample runs
32

Group

# Pops

Found EQU?

A
B
C

16
13
21

Yes
Some
No

Characteristic
Minimal activity in the middle of lineage
Steady, low probability throughout
Early chances, none observed later

Table 3.2: General groups observed during STEP sampling
Each row shows the group letter, number of populations that ﬁt into the group, and the
deﬁning characteristic of the group.

were evolving EQU. Others showed many more steps on the lineage, with gradually increasing
counts of successful sample runs, before EQU was actually observed in the original lineage.
Figure 3.11 shows a representative lineage.

Thirteen lineages ﬁt into Group B, showing a moderate propensity for EQU evolution
during the sample runs from genotypes throughout the history of the run. Some of the genotypes in the middle of these lineages had fairly successful sample runs, with 6 or more of the
10 runs reaching EQU. Despite these successful genotypes, only ﬁve of the thirteen eventually evolved EQU in the original lineage. Figure 3.12 shows a line of descent demonstrating
the pattern.

Finally, Group C consisted of 21 lineages demonstrate a few successful sample runs early
in their histories, followed by essentially zero probability in the middle and late periods.
Many of the lineages had only one or two successful sample runs, with one lineage containing
none. Figure 3.13 shows one of the most successful lines of descent from this group.

The complete results for all 50 replicates are presented in Appendix B.
33

Figure 3.11: STEP Sampling Group A Example
STEP sampling along the principal line of descent from population 147, representative of
Group A.

Figure 3.12: STEP Sampling Group B Example
STEP sampling along the principal line of descent from population 109, representative of
Group B.
34

Figure 3.13: STEP Sampling Group C Example
STEP sampling along the principal line of descent from population 125, representative of
Group C.

3.2.4

Discussion

In general, we found a logical consistency between our analysis of the local mutational
neighborhood and STEP sampling analyses. The STEP sampling Group A, demonstrating
increasingly successful samples leading up to EQU, overlaps completely with the mutational
neighborhood groups with increasing pathways before EQU appeared. The other sampling
and mutational neighborhood groups match up similarly. Unlike the sampling runs, the
mutational neighborhood was able to identify highly unlikely pathways that were none-theless possible. The 10 replicates used in the genotype sampling runs are likely not enough to
identify such improbable events.
One mutational neighborhood group did stand out — the long window of EQU pathways
along the lineage. The populations it contains came from the full spectrum of lineage samples.
35

Fitness E↵ect
Mutation Corridor

Beneﬁcial

Neutral

Deleterious

Lethal

p
4.104 ⇥ 10 37
8.049 ⇥ 10 48

2 Steps to EQU

Enter

71

125

25

0

2 Steps to EQU

Leave

43

107

63

0

1 Step to EQU

Enter

7

22

11

0

1 Step to EQU

Leave

18

14

8

0

3642

52714

2676

0

All Mutations on PLoD

1.530 ⇥ 10 8
6.398 ⇥ 10 10

Table 3.3: Mutation Corridor Entry and Exit Fitness E↵ects
Fitness e↵ect of mutations identiﬁed as entries and exits to corridors in the ﬁtness landscape
near the EQU function. The last line shows the ﬁtness e↵ect of all mutations found on all
50 principal lines of descent examined. p-values calculated using Fisher’s Exact.
These lines of descent likely demonstrate varying degrees of epistatic interactions between
parts of the genome. There are pathways to EQU present for a great deal of the history
sampled, yet it is still rare. Case study population 145 was initially classiﬁed into the
ﬁrst mutational neighborhood group, due to the early evolution of EQU. Including its full
lineage, however, places it into the group of populations with long windows of genotypes
with pathways to EQU in their history. The epistatic tradeo↵ is fairly explicit in the lineage
of population 145, staying within 2 mutations of EQU for much of evolutionary time, but
never able to break up the intertwined functions that killed EQU initially.

3.3

Characteristics of Potential Stepping-Stone Mutations

The mutational neighborhood analysis performed for Section 3.2.2 identiﬁed all mutations
on the principal lines of decent that brought genotypes within one or two mutations of the
36

EQU function. All such mutations have the potential to serve as stepping stones to the
evolution of EQU. I collected all mutations that served as entries to corridors within one and
two mutational steps of the EQU function from all 50 lines of descent. I also identiﬁed all
mutations that moved the line of descent away from regions of the ﬁtness landscape where
EQU was accessible. I assessed the ﬁtness e↵ect of each mutation, the results of which are
presented in Table 3.3.
All four mutation categories, entering and exiting corridors within one and two mutation
steps from the EQU function, demonstrated substantially over represented beneﬁcial and
deleterious mutations when compared with all mutations on the lines of descent (Fisher’s
Exact, see Table 3.3). For example, beneﬁcial mutations were 5.2 times and deleterious
mutations 2.4 times more common in entries to 2-step corridors relative the base distribution
of mutations on the line of descent. Such beneﬁcial mutations cannot be due to the evolution
of EQU at the entry of these corridors. These results imply that mutations actively changing
the quality of the genotype are more important for exploiting evolutionary potential than
neutral mutations.

37

Chapter 4
How Does the Genetic Hardware
A↵ect Evolutionary Potential?
The genetic hardware of an evolving system forms the basis of heritability, encompassing
an encoding of the complete description of entities (a genome) and the mechanics of how
that description may be translated into the entities themselves (the virtual hardware). The
properties of the genetic hardware a↵ect the robustness, ﬂexibility, and potential complexity
of those entities within the system. These factors must inherently have broad impacts on
evolutionary potential, but the nature of that impact is not well understood. In natural
evolution, the genetic hardware is the genome (such as DNA and RNA), the proteins that
must be present for it to be translated, and the biochemistry that governs that translation.
The mechanics of constructing and copying these structures play critical roles in determining properties such as the robustness of genetic material and the potential complexity of
biological processes.
In evolutionary computation, the combination of how individual solutions are encoded
38

and the rules that are used to evaluate the ﬁtness of those encodings deﬁne the genetic
hardware. The genetic hardware may be as simple as bit strings with pre-deﬁned meanings associated with each bit, as is commonly used in some genetic algorithms, or it may
be complete programs with instruction interactions and data-ﬂow semantics, as in genetic
programming or digital evolution. All aspects of these systems are designed by engineers
to provide speciﬁc capabilities deemed su cient for solving the particular problem under
consideration. These design choices may have signiﬁcant impact on the ability of a system
to evolve the desired outcome.
In designing evolutionary computation systems, it is critical to understand how sensitive the evolutionary process is to changes in the mechanics of the genetic hardware. Will
seemingly small changes yield large di↵erences in performance? Are certain aspects of a
genetic hardware more important than others? How sensitive is genetic hardware to the
environment? Is it possible for a genetic hardware to perform well generally, or is an e↵ectively designed one critically tied to aspects of the environment? What characteristics of
the genetic hardware a↵ect the potential complexity of evolved entities? What aspects of
genetic hardware promote robustness, and how do they interact with complexity? Are there
general design principles that can be gleaned to help steer the design of future evolutionary
computation systems?
Understanding how genetic hardware a↵ects evolutionary potential will provide insight
into the answers to these and other questions. Although the no free lunch theorem states
that no “perfect” encoding exists, I expect that, due to the many arbitrary choices in its
design, the evolutionary potential of the current genetic hardware in Avida is far from
optimal for evolving populations in the types of environments interested to researchers and
39

engineers (Wolpert and Macready, 1995, 1997).

4.1

Optimizing Evolutionary Potential with Instruction Set Architectures

The design of the instruction set within a digital evolution system forms the basis of the
genetic hardware upon which organisms evolve. It plays an important role in the robustness
and ﬂexibility of evolved solutions (Ofria et al., 2002). As the scope of problems explored
within digital evolution systems such as Avida, continues to expand, it is desirable to expand the evolutionary potential a↵orded by the system throughout a broad range of computational environments. Starting from the current default, the Heads instruction set, I
have systematically investigated a series of modiﬁcations to the instruction set architecture
of Avida virtual CPUs, incorporating the most promising modiﬁcations into the system as
I go. I have evaluated the evolutionary potential of these modiﬁed instruction sets in seven
computational environments, described in Section 2.2, representing a wide range of desired
capabilities by comparing the ﬁnal results of experiments performed in each environment
with each instruction set modiﬁcation.

4.1.1

Tested Architecture Modiﬁcations

In the default Heads instruction set, most instructions can have one aspect of their function
modiﬁed by a single nop instruction that follows in the genome. I aimed to improve the
ﬂexibility by which data could be accessed and modiﬁed in the virtual CPUs by implementing
the Fully-Associative (FA) instruction set. I extended the nop modiﬁcation system
40

used by instructions so that most instructions could be modiﬁed by more than one nop.
The default behavior of all instructions remains the same when followed by zero or one nop
instruction. However, for arithmetic, logic, and conditional instructions that operate on
more than one operand or have both source and destination registers, the FA instruction
set uses additional nops, if present, to further specify these parameters. For example, an
add instruction, by default will perform regB = regB + regC. If it is followed by one
nop-A, this will alter the register where the result is placed so that it instead performs
regA = regA + regB. When followed by nop-A nop-C, the add instruction would have
the same behavior in the default instruction set (ignoring the nop-C), but in the FullyAssociative set it would instead perform regA = regC + regA. In this manner, very
speciﬁc operations may be created, while retaining robust default behavior.

The R series of instruction set architectures build upon the Fully-Associative architecture to increase the working register set beyond the three default registers, exposing one
or more additional architectural registers, up to a total of 16 in R16. The original design
choice was made to minimize the number of registers in order to simplify the complexity of
using them, but a larger number of registers has not previously been systematically tested.
For each additional register, I added a corresponding nop instruction to the instruction set
(nop-D, nop-E, nop-F, nop-G, etc.). None of the default registers used by the instruction set
were altered, meaning that these additional registers can be accessed only when the new nop
instructions are used to modify an instruction. Since nop modiﬁcation is also used for head
selection, the additional nop-D in the R4 architecture provides direct access to the flow
head. In the R5 through R16 architectures, extra unassigned heads that may be used as
genome place-markers are available for each additional nop instruction.
41

The Label series of instruction set architectures extends the R6 architecture (which
proved to be the most e↵ective, as described in the results below), explicitly separating
genome labels from nop sequences used to modify instruction operands. The goal of this
change was to prevent labeled genome positions, especially those used for self-replication,
from otherwise conﬂicting with instruction argumentation as facilitated by the FullyAssociative architecture. Instructions that operate on genome labels, search-seq-comp-s
and if-copied-seq-comp, were extended with variants (search-lbl-comp-s and if-copiedlbl-comp) that recognize sequences of nop instructions only if they begin with the special label instruction. When executed directly, the label instruction performs no operation. The Label-Direct series architectures alter the pattern matching algorithm from
the default of cyclic-complementary to direct sequence matching. For example, Label-SeqDirect replaces search-seq-comp-s and if-copied-seq-comp with search-seq-directs and if-copied-seq-direct, respectively, while retaining search-lbl-comp-s and ifcopied-lbl-comp. The Label-Both architectures include both pattern matching algorithm instruction variants. In order to increase the power of labeled execution ﬂow, all
Label series instruction sets omit the set-flow instruction that performs absolute addressing.
The Split-IO instruction set architecture alters the Label-Seq-Direct architecture,
splitting the IO instruction into two separate input and output instructions. Both of the
new instructions use the same default register location as the IO instruction and can each
be modiﬁed by one nop.
The Search series of instruction set architectures extend the Split-IO architecture
with enhanced searching and jumping capabilities. The Search-Directional set adds
42

Instruction Set
Flow-If0
Flow-IfX
Flow-MvH
Flow-If0-MvH
Flow-IfX-MvH
Flow-If0-IfX-MvH

If0
•

•
•

IfX
•
•
•
•

MovHead

•
•
•

Table 4.1: Flow series instruction sets tested
Instruction set by row, with marks in each column indicating that the set contains the
relevant instruction group.
two pairs of directional search- instructions that scan the genome forward or backward relative to the instruction pointer for a label or sequence match. The Search-Goto set, adds
a single goto instruction that reads the nop sequence that follows the instruction, if present,
and will unconditionally jump to the ﬁrst genome location following the matching label that
begins with a label instruction. If no matching label is found, execution ignores the goto instruction. The Search-GotoIf group adds two conditional goto variants, goto-if-n-equ
and goto-if-less, that execute the jump only if the conditional test evaluates to true.
The Flow series of instruction set architectures builds upon the ﬂow control features
of the Search-Directional architecture, testing multiple combinations of additional ﬂow
control instructions (Table 4.1). The If0 group adds four single argument if instructions,
if-not-0, if-equ-0, if-gtr-0, and if-less-0, that conditionally execute the following
instruction based on the comparison of the argument with 0. The IfX group adds two
if variants, if-gtr-x and if-equ-x, that conditionally execute the following instruction
based on the result of comparing regB with a nop modiﬁed number. The default value used
by if-gtr-x and if-equ-x is a 1. For each nop in the label following a given if-gtr-x
or if-equ-x instruction, the bit is left shifted 1, 2, or 3 times for each nop-B, nop-C, or
43

nop-D, respectively. Whenever a nop-A is found in the label sequence, the sign-bit of the
value is toggled. Finally, the MovHead group adds two conditional mov-head variants,
mov-head-if-n-equ and mov-head-if-less, that operate similarly to the conditional goto
instructions.

4.1.2

Methods

I tested each instruction set architecture with 200 replicate populations in each of the seven
computational environments, Logic-9, Logic-77, Match-12, Fibonacci-32, Sort-10, Limited9, and Navigation. The populations evolved on 60x60 worlds for 100,000 updates. Each
population was seeded with a single ancestral organism capable only of self-replication. Small
variations in the initial genotype used in each architecture were often necessary, due to the
functional di↵erences among the instruction sets, but I made a strong e↵ort to limit these
variations speciﬁcally to neutral labeling instructions used in self-replication.

4.1.3

Results

I have focused on two measures to evaluate how well populations solved the computational
challenges of the environment when evolved with each instruction set architecture: mean
ﬁtness and task success. Both measure ability of the evolved organisms to perform tasks
within the environment.
Mean ﬁtness averages the ﬁtness values of each living organism in the population at the
moment the experiment ﬁnished. It takes into account both the computational capability
of the organism and the e ciency of self-replication, also called the “gestation time”. We
examined the distributions of these ﬁtness values for all instruction set variants in each
44

environment. For each modiﬁed instruction set, we compared the 200 population ﬁtness
values with those of a reference instruction set architecture using a Wilcoxon rank-sum test.
We determined signiﬁcance using a = 0.05 with sequential Bonferroni correction. Conﬁdence
intervals, as shown in tables below, represent 2.5% and 97.5% quantiles that we generated
using non-parametric bootstrap with 10,000 iterations.
Task success is a direct examination of the computational capabilities of the organisms
within the ﬁnal population, for the speciﬁc environment of the experiment. We measure
the task success of a population as the sum of the qualities by which the average organism
performs each task. To calculate a task success tp of population p, we determine each
organism’s quality at each task and then sum over these values, ﬁnally dividing by the total
number of organisms in the population. More formally,
Np T
X X qi,j
tp =
Np
i=1 j=1

(4.1)

where Np is the number of organisms in population p, T is the number of tasks in the
environment, and qi,j is the quality q at which organism i is performing task j. Task quality
(q) is a value between 0 and 1, where 1 means the organism has found a perfect solution for
a task. Environments that support near-matches use task quality to adjust the metabolic
reward accordingly. The maximum task success for a given environment is equal to the
total number of tasks rewarded in that environment; for example the maximum task success
of the Logic-9 environment is 9. Normalized task success, as presented in the following
results, divides the observed task success by the maximum in each environment. Similar to
population mean ﬁtness, we compared the distribution of task success of each instruction set
to the control architecture using a Wilcoxon rank-sum test, sequential Bonferroni correction,
45

and non-parametric bootstrap conﬁdence intervals.

4.1.3.1

Fully-Associative Argumentation

In conducting my analysis, the Fully-Associative (FA) instruction set, which addresses
the ﬂexibility of register data ﬂow, shows signiﬁcant improvement in six of the seven environments (Tables 4.2 and 4.3). The logic-based environments, Logic-9, Logic-77, and Limited-9,
all show substantially improved ﬁtness and task success. The Logic-77 environment in particular, beneﬁts from the FA instruction set with nearly 2.9 times increase in median task success and dramatically increased average ﬁtness. The fully-associative capability, facilitating
speciﬁc instruction formats, appears crucial within the highly diverse Logic-77 environment.
Indeed, on average 9.2% of the nop-modiﬁed instructions in the dominant genotype at the
end of the FA experiments with the Logic-77 environment used more than one nop modiﬁer.
The Fibonacci-32 environment also sees a notable 44% improvement in task success, with a
corresponding increase in ﬁtness. Mean usage of multiple nop modiﬁers was 16.4% in the
ﬁnal dominant genotype of the Fibonacci runs. The Sort-10 and Match-12 environments
show statistically signiﬁcant gains for both metrics, but none of these improvements are
substantial in nature. The Navigation environment shows a slight, non-signiﬁcant decline in
ﬁtness (p < 0.054, Wilcoxon rank-sum test) and task success (p < 0.178, Wilcoxon rank-sum
test) when tested with the FA instruction set.

46

Logic-9
Heads
FA

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

19.07

12.43

0.173

3.730

-0.54

4.430

1.071

(17.71, 19.76)

(11.51, 14.22)

(0.146, 0.224)

(3.300, 4.050)

(-0.63, -0.45)

(4.283, 4.595)

(1.035, 1.383)

22.99

39.35

0.215

4.806

-0.38

4.840

1.038

(22.70, 23.08)

(35.05, 41.83)

(0.919, 0.251)

(4.474, 5.212)

(-0.45, -0.33)

(4.671, 5.082)

(1.022, 1.069)

Table 4.2: Heads and Fully-Associative Architectures Fitness
Fitness results of the Heads and Fully-Associative (FA) instruction set architectures, where multiple nop arguments
can modify the behavior of an instruction. Each entry shows the median log2 population mean ﬁtness in the respective
environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05, Wilcoxon rank-sum
test) deviations after sequential Bonferroni correction.

Logic-9
Heads
FA

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

0.829

0.176

0.145

0.206

1.31⇥10 4

0.909

Navigation
3.97⇥10 3

(0.752, 0.839)

(0.161, 0.198)

(0.145, 0.146)

(0.178, 0.238)

(1.08, 1.47)

(0.894, 0.913)

(3.96, 4.35)

0.936

0.505

0.148

0.297

1.55⇥10 4

0.927

3.96⇥10 3

(0.930, 0.943)

(0.453, 0.546)

(0.147, 0.149)

(0.278, 0.332)

(1.44, 1.67)

(0.924, 0.929)

(3.95, 3.97)

Table 4.3: Heads and Fully-Associative Architectures Task Success
Task success results of the Heads and Fully-Associative (FA) instruction set architectures. Each entry shows the median
normalized task success in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries denote
signiﬁcant (p < 0.05, Wilcoxon rank-sum test) deviations.

47

4.1.3.2

Number of Registers

The R series instruction sets generally show little variation in performance (Tables 4.4 and
4.5). In the Logic-77 environment there is a slight positive trend as the number of registers
increases, but none are signiﬁcant after Bonferroni correction, and the magnitudes of the
changes are not particularly notable. The only substantial di↵erences observed among all
tested conﬁgurations are a drop in task success and a drop in ﬁtness with R16 in the Logic9 environment, indicating a potential drag on the system due to the dramatic increase in
instruction set size with the addition of 13 more nops. The Sort-10 environment demonstrates
signiﬁcant loss of performance in all treatments, relative to the FA architecture, though
none of the variation observed is substantial in nature (⌧ 1% di↵erence in task success).
The Navigation environment does show what initially appears to be a substantial uptick
in performance under R16, but with task success still well below 1%, it is not enough to
allow the populations to complete this task. It does, however, indicate that we may wish to
explore higher register counts again in conﬁgurations where populations have more success
with this task.

48

Logic-9
FA
R4
R5
R6
R7
R8
R12
R16

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

22.63

38.85

0.223

4.828

0.09

4.934

1.027

(22.39, 23.01)

(34.76, 43.67)

(0.191, 0.273)

(4.255, 5.332)

(-0.07, 0.21)

(4.661, 5.282)

(1.016, 1.042)

22.85

38.70

0.243

4.666

-0.39

5.253

1.080

(22.67, 23.01)

(33.90, 43.02)

(0.204, 0.290)

(4.233, 5.142)

(-0.48, -0.32)

(4.925, 5.514)

(1.054, 1.792)

22.73

38.42

0.231

5.067

-0.45

5.158

1.083

(22.50, 22.86)

(34.43, 42.54)

(0.206, 0.281)

(4.540, 5.623)

(-0.57, -0.39)

(4.300, 5.400)

(1.056, 1.340)

22.78

43.01

0.229

4.908

-0.43

5.117

1.111

(22.29, 22.97)

(40.01, 45.90)

(0.206, 0.274)

(4.293, 5.719)

(-0.54, -0.34)

(4.925, 5.374)

(1.080, 2.730)

22.75

43.41

0.204

4.598

-0.40

5.135

1.562

(22.58, 22.97)

(38.97, 45.67)

(0.177, 0.225)

(4.174, 5.078)

(-0.49, -0.29)

(4.978, 5.407)

(1.096, 3.234)

22.75

43.04

-0.027

4.831

-0.47

5.292

1.156

(22.55, 22.95)

(39.25, 47.78)

(-0.07, 0.19)

(4.392, 5.308)

(-0.57, -0.33)

(5.058, 5.736)

(1.099, 2.815)

22.62

44.26

-0.11

4.678

-0.54

5.180

1.377

(22.45, 22.76)

(40.82, 48.18)

(-0.12, -0.08)

(4.082, 5.244)

(-0.56, -0.49)

(4.901, 5.621)

(1.114, 3.012)

21.68

42.26

-0.11

4.028

-0.55

5.734

3.78

(19.76, 22.22)

(40.02, 46.26)

(-0.13, -0.10)

(3.620, 4.474)

(-0.59, -0.50)

(5.390, 6.466)

(1.157, 3.326)

Table 4.4: R Series Architectures Fitness
Fitness results of the R series instruction set architectures, which vary the number of registers available in the virtual CPUs.
Each entry shows the median log2 population mean ﬁtness in the respective environment, with 95% conﬁdence intervals
in parentheses. Bold entries indicate signiﬁcant (p < 0.05, Wilcoxon rank-sum test) deviations after sequential Bonferroni
correction.

49

Logic-9
FA
R4
R5
R6
R7
R8
R12
R16

Logic-77

Match-12

Fibonacci-32

Sort-10

0.932

0.495

0.147

0.288

2.53⇥10 4

Limited-9
0.926

Navigation
3.96⇥10 3

(0.921, 0.938)

(0.452, 0.565)

(0.146, 0.148)

(0.263, 0.307)

(2.21, 2.74)

(0.923, 0.929)

(3.95, 3.96)

0.937

0.506

0.146

0.276

1.52⇥10 4

0.923

3.97⇥10 3

(0.929, 0.941)

(0.441, 0.554)

(0.145, 0.148)

(0.256, 0.289)

(1.42, 1.67)

(0.920, 0.929)

(3.96, 5.05)

0.936

0.493

0.145

0.300

1.38⇥10 4

0.927

3.97⇥10 3

(0.929, 0.940)

(0.450, 0.544)

(0.144, 0.147)

(0.284, 0.327)

(1.09, 1.62)

(0.923, 0.929)

(3.96, 4.20)

0.932

0.563

0.145

0.294

1.49⇥10 4

0.930

3.98⇥10 3

(0.927, 0.940)

(0.521, 0.592)

(0.144, 0.147)

(0.268, 0.326)

(1.14, 1.160)

(0.926, 0.932)

(3.97, 6.68)

0.940

0.554

0.144

0.281

1.51⇥10 4

0.928

4.47⇥10 3

(0.930, 0.943)

(0.502, 0.592)

(0.142, 0.146)

(0.247, 0.305)

(1.26, 1.63)

(0.923, 0.932)

(3.98, 7.65)

0.938

0.555

0.079

0.299

1.33⇥10 4

0.927

3.99⇥10 3

(0.931, 0.942)

(0.504, 0.613)

(0.078, 0.143)

(0.275, 0.323)

(1.06, 1.57)

(0.924, 0.929)

(3.97, 6.19)

0.939

0.575

0.078

0.298

1.06⇥10 4

0.930

4.31⇥10 3

(0.933, 0.943)

(0.525, 0.613)

(0.077, 0.078)

(0.268, 0.318)

(1.03, 1.11)

(0.928, 0.933)

(3.98, 7.33)

0.910

0.550

0.077

0.269

1.05⇥10 4

0.928

7.31⇥10 3

(0.854, 0.931)

(0.524, 0.589)

(0.077, 0.078)

(0.237, 0.302)

(1.01, 1.08)

(0.925, 0.932)

(3.99, 7.81)

Table 4.5: R Series Architectures Task Success
Task success results of the R series instruction set architectures. Each entry shows the median normalized task success in
the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries denote signiﬁcant (p < 0.05, Wilcoxon
rank-sum test) deviations.

50

4.1.3.3

Explicit Labels

The Label series instruction sets show mixed results (Tables 4.6 and 4.7). The Logic-9,
Limited-9, Sort-10, and Navigation environments show virtually no substantial di↵erences
in task success, regardless of the set used. The Limited-9, Sort-10, and Navigation environments shows slight positive ﬁtness trends as more labeling options are included in the
instruction set. The Logic-77 environment shows signiﬁcantly detrimental results for both
ﬁtness and task success when only the label-based instructions are included. When any form
of sequence matching instructions are included in the Logic-77 environment, both metrics
return to the reference levels. The Match-12 environment shows no signiﬁcant di↵erence
for either metric among all but one instruction set. Label-Seq-Both, the most complete
instruction set in this group, shows a notably signiﬁcant drop of both metrics in the Match
environment. The Fibonacci-32 environment shows positive gains in all Label series instruction sets. The positive gains observed in the Fibonacci-32 environment were both signiﬁcant
and substantial, with 24.6% and 27.2% improvement in ﬁtness and task success, respectively, when using the Label-Seq-Both instruction set. In the Navigation environment
using the Label-Seq-Direct instruction set, 8 outlier populations notably demonstrated
task success greater than 0.10, with two at 0.141, indicating that substantial progress was
made in those particular runs. No previous runs in this environment have exhibited such
success in the short time period of 100,000 updates used (Grabowski et al. (2010, 2011), plus
unpublished data).

51

Logic-9
R6
Label
Label-Direct
Label-Both
Label-Seq
Label-Seq-Direct
Label-Direct-Seq
Label-Seq-Both

Logic-77

Match-12

Fibonacci-32 Sort-10

Limited-9

Navigation

22.59

38.42

0.216

4.572

-0.22

5.205

1.537

(22.10, 22.85)

(34.96, 44.12)

(0.201, 0.257)

(3.904, 5.134)

(-0.32, -0.09)

(5.016, 5.518)

(1.097, 3.261)

22.67

28.70

0.248

6.213

-0.42

5.430

1.926

(22.42, 22.94)

(25.66, 33.37)

(0.218, 0.316)

(5.669, 6.648)

(-0.52, -0.33)

(5.195, 5.630)

(1.108, 3.313)

22.50

27.20

0.215

5.435

-0.46

5.784

1.087

(21.84, 22.74)

(24.38, 31.32)

(0.189, 0.252)

(4.545, 6.174)

(-0.54, -0.40)

(5.429, 6.325)

(1.064, 2.742)

22.32

32.26

0.203

6.403

-0.47

5.553

3.084

(19.68, 22.61)

(28.96, 38.07)

(0.163, 0.230)

(5.715, 6.606)

(-0.56, -0.38)

(5.098, 5.885)

(1.148, 3.396)

22.43

40.26

0.207

6.357

-0.29

5.438

2.203

(21.94, 22.66)

(35.12, 44.36)

(0.134, 0.257)

(5.797, 6.733)

(-0.36, -0.15)

(5.079, 5.755)

(1.089, 3.161)

22.46

44.13

0.198

6.175

-0.33

5.651

2.335

(22.23, 22.66)

(39.73, 48.52)

(0.126, 0.217)

(5.212, 6.531)

(-0.44, -0.21)

(5.441, 5.967)

(1.077, 3.335)

22.53

41.74

0.210

5.933

-0.40

5.528

2.319

(22.25, 22.69)

(38.58, 44.36)

(0.183, 0.300)

(5.135, 6.495)

(-0.47, -0.23)

(5.528, 5.968)

(1.093, 3.235)

22.44

39.56

-0.094

6.116

-0.33

5.990

3.173

(21.75, 22.68)

(36.64, 42.72)

(-1.42, 0.088)

(5.336, 6.430)

(-0.45, -0.17)

(5.621, 6.374)

(2.955, 3.295)

Table 4.6: Label Series Architectures Fitness
Fitness results of the Label series instruction set architectures. Each entry shows the median log2 population mean ﬁtness in
the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05, Wilcoxon
rank-sum test) deviations after sequential Bonferroni correction.

52

Logic-9
R6
Label
Label-Direct
Label-Both
Label-Seq
Label-Seq-Direct
Label-Direct-Seq
Label-Seq-Both

Logic-77

Match-12

Fibonacci-32 Sort-10

0.926

0.505

0.144

0.278

1.86⇥10 4

Limited-9
0.930

Navigation
4.33⇥10 3

(0.908, 0.937)

(0.461, 0.574)

(0.142, 0.145)

(0.241, 0.303)

(1.62, 2.10)

(0.926, 0.932)

(3.97, 7.76)

0.941

0.389

0.146

0.380

1.45⇥10 4

0.932

4.58⇥10 3

(0.934, 0.945)

(0.352, 0.450)

(0.144, 0.147)

(0.342, 0.396)

(1.14, 1.57)

(0.928, 0.934)

(3.98, 7.68)

0.937

0.366

0.145

0.335

1.34⇥10 4

0.934

3.98⇥10 3

(0.916, 0.943)

(0.329, 0.416)

(0.144, 0.146)

(0.295, 0.383)

(1.10, 1.57)

(0.931, 0.936)

(3.97, 6.16)

0.922

0.422

0.145

0.388

1.37⇥10 4

0.932

7.05⇥10 3

(0.857, 0.939)

(0.385, 0.495)

(0.140, 0.146)

(0.367, 0.403)

(1.09, 1.51)

(0.929, 0.935)

(4.00, 7.96)

0.932

0.509

0.143

0.397

1.68⇥10 4

0.928

5.31⇥10 3

(0.919, 0.938)

(0.460, 0.573)

(0.139, 0.144)

(0.365, 0.405)

(1.57, 2.02)

(0.925, 0.929)

(3.98, 7.57)

0.929

0.559

0.144

0.370

1.61⇥10 4

0.931

5.06⇥10 3

(0.918, 0.939)

(0.522, 0.612)

(0.141, 0.146)

(0.309, 0.398)

(1.49, 1.81)

(0.928, 0.934)

(3.98, 7.70)

0.932

0.542

0.143

0.359

1.53⇥10 4

0.930

5.24⇥10 3

(0.919, 0.941)

(0.500, 0.562)

(0.141, 0.145)

(0.300, 0.399)

(1.36, 1.77)

(0.928, 0.933)

(3.98, 7.63)

0.926

0.517

0.079

0.374

1.64⇥10 4

0.928

7.74⇥10 3

(0.914, 0.934)

(0.482, 0.545)

(0.078, 0.125)

(0.300, 0.398)

(1.51, 1.94)

(0.923, 0.930)

(7.20, 7.91)

Table 4.7: Label Series Architectures Task Success
Task success results of the Label series instruction set architectures. Each entry shows the median normalized task success in
the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries denote signiﬁcant (p < 0.05, Wilcoxon
rank-sum test) deviations.

53

4.1.3.4

Split Input/Output Operations

The Split-IO instruction set shows improvements that are both signiﬁcant and often substantial in the Logic-9 and Logic-77 environments, the Match-12 environment, and the
Fibonacci-32 environment (Tables 4.8 and 4.9). Indeed the Logic-77 and Fibonacci-32 environments show 21% and 17% improvements in median task success, respectively. The Sort-10
environment, on the other hand, completely collapses, showing e↵ectively 0 task success and
correspondingly low ﬁtness. The Limited-9 environment shows mixed results, with a small
gain in task success but a drop in ﬁtness. The Navigation environment shows marginal drops
in both metrics, though neither signiﬁcant and, similar to previous instruction sets tested,
still well below 1% of the success possible.

4.1.3.5

Search

The three Search series instruction sets showed little measurable di↵erence in performance
for the Logic-9, Match-12, Fibonacci-32, Sort-10, Limited-9, and Navigation environments
(Tables 4.10 and 4.11). The Logic-77 environment showed small, signiﬁcant drops in ﬁtness
for all sets, with a corresponding drop in task success.
In the Search-Goto instruction set, I initially tested a variant of the jmphead instruction, which changed the default head it operated on to be the flow head. A notable
and often signiﬁcant drop in ﬁtness was observed in all seven environments with these two
instruction sets, leading to the architectures explored here.

54

Logic-9
Label-Seq-Direct
Split-IO

Logic-77

Match-12

Fibonacci-32 Sort-10

Limited-9

Navigation

22.56

43.57

0.207

6.106

-0.32

5.812

2.641

(22.32, 22.72)

(39.73, 46.60)

(0.182, 0.239)

(5.326, 6.549)

(-0.39, -0.24)

(5.390, 6.169)

(1.198, 3.341)

23.07

53.94

0.337

8.096

-1.03

5.343

1.091

(22.87, 23.22)

(50.34, 56.72)

(0.314, 0.360)

(7.983, 8.207)

(-1.03, -1.02)

(5.221, 5.520)

(1.062, 2.920)

Table 4.8: Split-IO Architecture Fitness
Fitness results of the Label-Seq-Direct and Split-IO instruction set architectures. Each entry shows the median log2
population mean ﬁtness in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate
signiﬁcant (p < 0.05, Wilcoxon rank-sum test) deviations after sequential Bonferroni correction.

Logic-9
Label-Seq-Direct
Split-IO

Logic-77

Match-12

Fibonacci-32

Sort-10

0.930

0.559

0.145

0.384

1.62⇥10 4

Limited-9
0.926

Navigation
6.63⇥10 3

(0.920, 0.935)

(0.521, 0.593)

(0.142, 0.146)

(0.318, 0.397)

(1.52, 1.74)

(0.922, 0.928)

(3.99, 7.94)

0.940

0.678

0.148

0.449

0.0

0.931

3.99⇥10 3

(0.936, 0.942)

(0.651, 0.707)

(0.148, 0.149)

(0.447, 0.461)

(0.0, 0.0)

(0.927, 0.933)

(3.97, 7.41)

Table 4.9: Split-IO Architecture Task Success
Task success results of the Label-Seq-Direct and Split-IO instruction set architectures. Each entry shows the median
normalized task success in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries denote
signiﬁcant (p < 0.05, Wilcoxon rank-sum test) deviations.

55

Logic-9
Split-IO
SearchDirectional
Search-Goto
Search-GotoIf

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

23.19

54.58

0.307

8.139

-1.03

5.477

2.909

(23.05, 23.25)

(52.46, 58.54)

(0.261, 0.335)

(8.027, 8.318)

(-1.04, -1.02)

(5.014, 5.912)

(1.121, 3.382)

23.02

48.75

0.313

8.188

-1.02

5.393

3.150

(22.87, 23.17)

(46.33, 52.21)

(0.265, 0.335)

(8.042, 8.273)

(-1.03, -0.98)

(5.177, 5.745)

(1.708, 3.431)

23.13

50.42

0.311

7.946

-1.04

5.598

2.584

(22.90, 23.21)

(48.27, 53.34)

(0.232, 0.337)

(7.853, 8.080)

(-1.05, -1.02)

(5.272, 5.850)

(1.084, 3.219)

22.92

48.49

0.283

7.937

-1.04

5.840

2.283

(22.61, 23.06)

(44.61, 52.01)

(0.223, 0.336)

(7.844, 8.070)

(-1.05, -1.01)

(5.624, 6.059)

(1.322, 3.028)

Table 4.10: Search Series Architectures Fitness
Fitness results of the Search series instruction set architectures. Each entry shows the median log2 population mean ﬁtness in
the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05, Wilcoxon
rank-sum test) deviations after sequential Bonferroni correction.
Logic-9
Split-IO
SearchDirectional
Search-Goto
Search-GotoIf

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

0.937

0.694

0.149

0.449

0.0

0.927

Navigation
7.28⇥10 3

(0.934, 0.942)

(0.658, 0.719)

(0.148, 0.149)

(0.446, 0.463)

(0.0, 0.0)

(0.926, 0.930)

(3.99, 8.03)

0.937

0.626

0.149

0.448

0.0

0.929

7.67⇥10 3

(0.932, 0.941)

(0.584, 0.652)

(0.148, 0.150)

(0.445, 0.455)

(0.0, 0.0)

(0.927, 0.932)

(4.60, 8.01)

0.940

0.644

0.148

0.447

0.0

0.930

6.43⇥10 3

(0.935, 0.943)

(0.608, 0.676)

(0.147, 0.149)

(0.444, 0.452)

(0.0, 0.0)

(0.928, 0.933)

(3.99, 7.61)

0.933

0.601

0.149

0.447

0.0

0.929

5.78⇥10 3

(0.928, 0.939)

(0.569, 0.650)

(0.148, 0.150)

(0.445, 0.448)

(0.0, 0.0)

(0.925, 0.933)

(4.15, 7.57)

Table 4.11: Search Series Architectures Task Success
Task success results of the Search series instruction set architectures. Each entry shows the median normalized task success in
the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries denote signiﬁcant (p < 0.05, Wilcoxon
rank-sum test) deviations.
56

4.1.3.6

Flow Control

The Flow series instruction sets tested three groups of ﬂow control instructions separately and in several combinations (Tables 4.12 and 4.13). Throughout all instruction sets
tested, the Fibonacci-32 environment showed no signiﬁcant variation from the SearchDirectional instruction set performance. The Match-12 environment had some signiﬁcant
drops in ﬁtness, but these were not substantial and also not coupled with a drop in task
success. The Logic-9 environment showed signiﬁcant, though insubstantial, loss of ﬁtness
with all Flow series instruction sets. Three instruction sets, Flow-If0, Flow-IfX, and
Flow-If0-IfX-MovHead, had corresponding small signiﬁcant decreases in task success.
Individually, the If0 instruction group made virtually no di↵erence in performance among
any of the seven environments. When tested in combination with the other instruction
groups, there is no clear indication of interaction, positive or negative.
The IfX instruction group both individually and in combination with other groups shows
positive gains in the Navigation environment, both ﬁtness and task success. This outcome is
likely due to the nature of the signposts in this environment (Grabowski et al., 2010), such
that comparing against certain “magic” numbers for decision making is likely beneﬁcial. The
remaining six environments show no substantial variation attributable to these instructions.
The third instruction group, MovHead, shows the greatest variation in performance
among those tested. In the Logic-77 environment, all instruction sets containing the MovHead group show substantial decreases in median ﬁtness, 14.3% on average. The two combination sets containing MovHead, Flow-IfX-MovHead and Flow-If0-IfX-MovHead,
also show corresponding decreases in task success in the Logic-77 environment. The Sort-10,
Limited-9, and Navigation environments, on the other hand, show substantial improvements
57

Figure 4.1: Selected Flow Series Architectures Task Success Distribution
Normalized task success distributions of selected Flow series instruction sets in the
Sort-10 environment.

in task success, and often ﬁtness, for all three instruction sets containing the MovHead
group. The Navigation environment, notably, approaches median task success around 1%
when the IfX and MovHead instruction groups are combined, indicating the importance
of e↵ective ﬂow control for that environment. The Sort-10 environment improvements are
di cult to observe from median values. Indeed the greatest driver of the improvements are
infrequent outliers approaching 0.7% task success, the highest ever observed in the Sort-10
environment (see Figure 4.1).

58

Logic-9
SearchDirectional
Flow-MvH
Flow-If0
Flow-IfX
Flow-If0-IfX
Flow-IfX-MvH
Flow-If0-IfXMvH

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

23.14

48.59

0.313

8.061

-1.02

5.571

3.022

(23.01, 23.23)

(46.46, 51.41)

(0.243, 0.346)

(7.990, 8.176)

(-1.04, -1.01)

(5.334, 5.867)

(2.151, 3.387)

22.78

42.25

0.277

8.124

-0.96

5.474

3.875

(22.38, 23.05)

(39.76, 46.35)

(0.216, 0.347)

(7.990, 8.270)

(-1.01, -0.69)

(5.181, 5.771)

(3.729, 4.052)

22.80

47.99

0.296

7.995

-1.04

5.553

3.229

(22.62, 23.07)

(45.02, 51.50)

(0.258, 0.323)

(7.857, 8.099)

(-1.05, -1.02)

(5.326, 5.855)

(2.919, 3.549)

22.64

46.40

0.199

8.037

-1.06

5.804

4.028

(22.33, 22.86)

(44.51, 48.76)

(0.168, 0.264)

(7.964, 8.198)

(-1.07, -1.03)

(5.460, 6.243)

(3.861, 4.312)

22.82

46.00

0.270

8.011

-1.08

5.595

4.331

(22.65, 23.01)

(42.64, 49.03)

(0.201, 0.308)

(7.952, 8.078)

(-1.09, -1.07)

(5.292, 6.071)

(4.113, 4.615)

22.95

41.29

0.311

8.063

-1.00

5.751

4.475

(22.74, 23.11)

(38.48, 44.64)

(0.244, 0.346)

(7.980, 8.193)

(-1.03, -0.92)

(5.522, 6.061)

(4.244, 4.983)

21.94

41.76

0.213

7.995

-1.01

6.077

5.036

(21.55, 22.37)

(39.12, 43.84)

(0.189, 0.283)

(7.849, 8.066)

(-1.05, -0.91)

(6.723, 6.625)

(4.576, 6.457)

Table 4.12: Flow Series Architectures Fitness
Fitness results of the Flow series instruction set architectures. Each entry shows the median log2 population mean ﬁtness in
the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05, Wilcoxon
rank-sum test) deviations after sequential Bonferroni correction.

59

Logic-9
SearchDirectional
Flow-MvH
Flow-If0
Flow-IfX
Flow-If0-IfX
Flow-IfX-MvH
Flow-If0-IfX-MvH

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

0.943

0.623

0.148

0.448

0.0

0.928

Navigation
7.55⇥10 3

(0.939, 0.946)

(0.584, 0.648)

(0.147, 0.149)

(0.444, 0.452)

(0.0, 0.0)

(0.926, 0.932)

(5.51, 8.04)

0.938

0.563

0.150

0.465

0.0

0.945

8.63⇥10 3

(0.930, 0.946)

(0.532, 0.593)

(0.149, 0.150)

(0.452, 0.476)

(0.0, 0.0)

(0.941, 0.948)

(8.40, 8.93)

0.932

0.611

0.148

0.447

0.0

0.932

7.78⇥10 3

(0.924, 0.937)

(0.581, 0.647)

(0.147, 0.149)

(0.446, 0.451)

(0.0, 0.0)

(0.930, 0.935)

(6.52, 8.12)

0.929

0.607

0.147

0.447

0.0

0.931

8.88⇥10 3

(0.916, 0.935)

(0.577, 0.622)

(0.146, 0.148)

(0.445, 0.453)

(0.0, 0.0)

(0.929, 0.933)

(8.43, 9.67)

0.937

0.594

0.148

0.451

0.0

0.931

9.47⇥10 3

(0.932, 0.941)

(0.550, 0.626)

(0.147, 0.149)

(0.446, 0.459)

(0.0, 0.0)

(0.928, 0.935)

(8.75, 10.12)

0.945

0.549

0.150

0.459

0.0

0.941

9.65⇥10 3

(0.940, 0.951)

(0.508, 0.576)

(0.149, 0.151)

(0.451, 0.467)

(0.0, 0.0)

(0.937, 0.947)

(9.10, 10.64)

0.926

0.544

0.150

0.451

0.0

0.940

11.25⇥10 3

(0.901, 0.935)

(0.516, 0.577)

(0.148, 0.150)

(0.447, 0.459)

(0.0, 0.0)

(0.935, 0.945)

(10.23, 13.40)

Table 4.13: Flow Series Architectures Task Success
Task success results of the Flow series instruction set architectures. Each entry shows the median normalized task success in
the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries denote signiﬁcant (p < 0.05, Wilcoxon
rank-sum test) deviations.

60

4.1.4

Discussion

I have investigated the evolutionary potential of six groups of modiﬁed instruction set architectures of a digital evolution system, each within seven di↵erent computational environments. Among the groups investigated there were three classes of outcomes, broad multienvironment improvement, limited environment improvement, and no discernible trend. Notably absent from the observed classes was a broad detrimental change. Although some
instruction set architectures did demonstrate decreased performance, there was only one
example of highly substantial degradation, the Split-IO instruction set in the Sort-10 environment. I explore potential explanations for this particular case below. In general,
evolution has proven to be surprisingly robust to the explored genetic hardware
changes, regardless of environment.
Two groups of instruction set modiﬁcations yielded broadly beneﬁcial changes in both
ﬁtness and task success. The Fully-Associative (FA) architectures instruction data ﬂow
enhancements led to highly signiﬁcant gains in ﬁve of the seven environments. The remaining
two environments, Sort-10 and Navigation, show some slight improvement and no discernible
di↵erence, respectively. The second group that demonstrated broadly positive results was
the Split-IO instruction set. The separation of the input and output operation allows
ﬁner-grained data ﬂow between the CPU and the environment. This control a↵orded by the
Split-IO architecture was beneﬁcial to the same ﬁve environments as the FA architecture.
The Navigation environment showed no particular change in ﬁtness performance, and a
small, but insubstantial change in task success. The only major detriment to the splitting of
input and output operations was observed in the Sort-10 environment. As a whole, these two
groups indicate that it is beneﬁcial to maintain as much ﬂexibility as possible with regard to
61

instruction interactions. This ﬂexibility allows evolution to ﬁnely tune interactions, yielding
greater evolutionary potential.
The R series, Label series, and Search series architectures all demonstrated no discernible trend in performance, despite representing 17 of the 25 tested architectures. There
were some particular environment/instruction set combinations that had signiﬁcant variations, yet these were rarely substantial in nature. It is particularly surprising that the R
series instruction sets showed such minimal deviation, given that going from the FA architecture to the R16 architecture represents a greater than ﬁve-fold increase in working
set and a 50% increase in instruction set size. Similarly, the Label-Seq-Both instruction
set represents a 20.6% increase in instruction set size, with no substantially negative e↵ect.
Taken together these groups hint at how robust the evolutionary process may be to genetic
language dilution, maintaining the ability to adapt successfully to the environment despite
searching a much larger genotype space. I explore this robustness in greater detail below in
Section 4.3.
The Flow series of instruction set architectures represents a third class of outcomes,
yielding improved results in a subset of environments and degradation of performance in
one environment. The Sort-10, Limited-9, and Navigation environments all show substantial
gains in both ﬁtness and task success metrics when using instruction sets containing the
IfX and MovHead instruction groups. The Logic-77 environment, on the other hand,
shows a notable drop in performance. It is possible that this environment does not require
a great deal of ﬂow control, thus is being negatively a↵ected by the disruptive nature of the
additional ﬂow control instructions. In environments where ﬂow control decisions are critical
for success, such as the Sort-10 and Navigation environment, the beneﬁts of more ﬂexible
62

ﬂow control outweigh their disruptive e↵ects.
The Sort-10 environment stands out as the only example where a single, small change,
splitting the input and output instructions, made a large destructive di↵erence in performance. Median task success collapsed to be statistically indistinguishable from 0, and remained there despite further beneﬁcial instruction set modiﬁcations. These results are likely
an artifact of the environment itself, rather than a general trend. I set up the Sort-10 environment to control for random inputs and to, on average, provide no beneﬁt unless active
sorting was performed by an organism. However, the inputs for sorting are indeed a random
sample of 10 integers. It is possible, due to chance, for a partial ordering of numbers to yield
a positive metabolic reward even if the sequence of inputs is simply echoed back to the environment. When using instruction sets featuring the paired-input-and-output instruction,
simply mutating this instruction into the section of the genome responsible for replication
may be enough to confer the echo capability, presenting an opportunity for lucky organisms
to occasionally reap rewards. When the operations are split into two separate instructions, it
then requires two coordinated mutations to confer the echo capability and doubles the execution cost for performing the task. The combination of these factors most likely contributes
to the observed drop in median performance.
Instruction data ﬂow, working set size, and ﬂow control are the three main features
addressed by the six groups of instruction set modiﬁcations presented here. All of these
features play an important role in implementing a successful sorting algorithm. Despite
the modiﬁcations in the instruction set architectures I tested, no signiﬁcantly beneﬁcial
change was observed in either ﬁtness or task success within the Sort-10 environment. Most
likely, the highly constrained memory size of these architectures limits the potential within
63

this environment. In fact, a hand-written organism that performs the task successfully
with the Heads architecture requires nearly every single stack location in both available
stacks. Another factor limiting potential may simply be the time allotted for evolution.
The additional ﬂow control instructions tested in the Flow series architectures show some
signs of improved success in this environment, with numerous outlier populations. Given
additional time to evolve, these and other populations would likely be able to reﬁne the
emerging solutions.
When features from all six instruction set groups are combined to form the FlowIfX-MovHead architecture, signiﬁcant and substantial improvements relative to the base
Heads architecture are observed in six of the seven environments (Tables 4.14 and 4.15).
Despite these improvements, there still remains a great deal of unexploited opportunity in
ﬁve of the environments. Speciﬁc architectural changes to address these environments may
yield greater results, such as the addition of an instruction capable of building arbitrary
numbers for the Match-12 environment. However, such focused modiﬁcations could mask
the need for more sweeping changes. Even with signiﬁcant gains under two instruction sets,
the Logic-77 environment still shows room for substantial improvement, as median task
success shows populations utilizing less than 55% of the opportunities present. Even more
so, the Sort-10 and Navigation environments exploit less than 1% of the available potential.
It is clear from this present study that we have just started to identify the most e↵ective
genetic hardware for adaptive evolution in digital organisms and there remains room for
signiﬁcant future improvement. I expect that further studies of instruction set architecture
enhancements for evolvable systems will unlock this potential, facilitating advancements in
the application of digital evolution and artiﬁcial life.

64

Logic-9
Heads
Flow-IfX-MvH

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

19.44

13.50

0.194

3.453

-0.47

4.328

1.656

(17.74, 19.79)

(11.67, 15.30)

(0.168, 0.248)

(3.216, 3.858)

(-0.61, -0.33)

(4.157, 4.445)

(1.108, 3.606)

22.95

41.292

0.311

8.063

-1.00

5.751

4.475

(22.74, 23.11)

(38.50, 44.56)

(0.245, 0.347)

(7.980, 8.189)

(-1.03, -0.92)

(5.517, 6.049)

(4.244, 4.953)

Table 4.14: Heads and Flow-IfX-MovHead Architectures Fitness
Fitness results of the Heads and Flow-IfX-MovHead instruction set architectures. Each entry shows the median log2
population mean ﬁtness in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate
signiﬁcant (p < 0.05, Wilcoxon rank-sum test) deviations.

Logic-9
Heads
Flow-IfX-MvH

Logic-77

Match-12

Fibonacci-32

Sort-10

0.834

0.185

0.146

0.202

1.42⇥10 4

Limited-9
0.908

Navigation
4.72⇥10 3

(0.752, 0.844)

(0.162, 0.211)

(0.145, 0.147)

(0.177, 0.228)

(1.08, 1.66)

(0.897, 0.914)

(3.99, 8.23)

0.945

0.549

0.150

0.459

0.0

0.941

9.65⇥10 3

(0.940, 0.951)

(0.507, 0.577)

(0.149, 0.151)

(0.451, 0.467)

(0.0, 0.0)

(0.937, 0.947)

(9.12, 10.62)

Table 4.15: Heads and Flow-IfX-MovHead Architectures Task Success
Task success results of the Heads and Flow-IfX-MovHead instruction set architectures. Each entry shows the median
normalized task success in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries denote
signiﬁcant (p < 0.05, Wilcoxon rank-sum test) deviations.

65

4.2

Task Success Acquisition Rate of Selected Instruction Sets

The instruction set modiﬁcations explored in 4.1 demonstrated signiﬁcant improvement in
both ﬁtness and task success over a broad range of architectures. The Heads architecture,
however, was already rather successful in two of the environments, Logic-9 and Limited-9,
demonstrating 83.4% and 90.8% median task success after 100,000 updates, respectively.
The changes in architecture tested above may have more substantial impact on evolutionary
potential than can be observed from a single time point. Alterations to the architecture can
a↵ect both the rate of task success acquisition, in addition to the ﬁnal achieved performance.
In the discussion above (see Section 4.1.4), I also noted that environments may demonstrate
greater overall task success given more evolutionary time. History plays a signiﬁcant role
in determining evolutionary potential (see Chapter 3). Architecture modiﬁcations that can
limit the impact of history will continue to improve, while others stagnate. Here, I explore
both of these aspects, examining the rate of task success acquisition and the longer term
potential of selected instruction sets.

4.2.1

Methods

I tested the six instruction sets, the ﬁve reference architectures selected in Section 4.1, FA,
R6, Label-Seq-Direct, Search-Directional, and Flow-IfX-MovHead, as well as
the Heads control architecture, in four environments that demonstrated the most substantial
variation in performance, Logic-9, Logic-77, Fibonacci-32, and Limited-9. The architectures
were tested using 50 replicate populations allowed to evolve for an extended time period of
66

Logic-9
Heads
FA
R6
Label-Seq-Direct
Search-Directional
Flow-IfX-MvH

6700
4300
4650
4700
5400
4700

Logic-77
(50)
(50)
(50)
(50)
(50)
(50)

201700
48900
43900
38850
32400
24500

Fibonacci-32
(16)
(30)
(39)
(38)
(42)
(40)

Limited-9

144500

6850
5400
5600
5400
5250
4600

41500
34400
16500
93950

(3)
(0)
(7)
(12)
(18)
(26)

(50)
(50)
(50)
(50)
(50)
(50)

Table 4.16: Time to 50% Task Success
Median time (in updates) to 50% task success of selected instruction set architectures. The
number of replicates, out of 50 possible by 500,000 updates, that reached 50% task success
is shown in parentheses.
500,000 updates. I collected data necessary for determining current population task success
every 100 updates.

4.2.2

Results and Discussion

I used task success (Eq. 4.1), measuring the direct computational capabilities of the organisms in the population, to assess the rate at which each tested instruction set architecture
evolves. For each replicate, I calculated the ﬁnal population task success. If the ﬁnal task
success was greater than or equal to 50% of the theoretical maximum, I then determined
the earliest update at which the population crossed the 50% threshold (with a granularity
of 100 update intervals).
The results (Table 4.16) correspond well with the observations of Section 4.1.3. Each
of the reference instruction sets, building upon the previous architecture, generally demonstrates reduced time to 50% task success and increased overall number of populations meeting
that threshold. All replicate populations with all six tested instruction sets met the 50%
threshold in the Logic-9 and Limited-9 environments. The FA instruction set demonstrates
67

the greatest improvement in those two environments, reducing time to 50% task success
by 35.8% in Logic-9 and 21.2% in Limited-9. Median time to 50% task success remains
largely consistent in both of those environments with the remaining four instruction sets.
The Search-Directional instruction set, however, does slow evolution somewhat in the
Logic-9 environment, while the Flow-IfX-MovHead instruction set improves Limited-9
performance by another 14.8%.
The Logic-77 environment demonstrates continuous improvement in both median time
to 50% task success and number of populations meeting that threshold by the end of the
experiment. The Flow-IfX-MovHead instruction set shows dramatic improvement compared to the Heads architecture, with 87.9% reduction in median time to 50% task success
and a 2.5 fold increase in number of populations exceeding threshold.
The Fibonacci-32 environment shows a trend similar to the Logic-77, with continuous improvement. The Flow-IfX-MovHead instruction set does exhibit increased median time
to 50% task success (93950 updates), relative to the Search-Directional architecture
(16500 updates). However, the number of successful replicates is substantially increased as
well, 26 with the Flow-IfX-MovHead architecture compared to 16 with the SearchDirectional instruction set.
Final population performances (Tables 4.17 and 4.18) compare favorably to the results
presented in Section 4.1.3. Given additional time, the longer term evolutionary potential
of the R6, Label-Seq-Direct, Search-Directional, and Flow-IfX-MovHead instruction set architectures are essentially equivalent in the Logic-9, Logic-77, and Limited-9
environments. These results further support the observation in Section 4.1.4 that evolution
is surprisingly robust to changes in architecture.
68

Logic-9

Search-Directional
Flow-IfX-MvH

32.26

5.562

4.184

(14.76, 23.88)

(11.43, 66.43)

(2.952, 21.98)

(3.098, 15.31)

44.42

6.595

4.665

(15.79, 72.85)

(4.166, 8.45)

(3.077, 10.99)

57.68

6.263

4.841

(14.70, 23.85)

Label-Seq-Direct

23.54

23.48

R6

Limited-9

(14.79, 23.89)

FA

Fibonacci-32

23.54

Heads

Logic-77

(21.93, 73.09)

(3.035, 21.93)

(3.079, 10.04)

23.32

57.38

7.555

5.089

(14.08, 23.82)

(19.14, 72.96)

(3.801, 22.12)

(2.898, 11.11)

23.55

55.77

8.518

5.028

(17.84, 23.81)

(19.38, 72.68)

(7.577, 22.22)

(3.111, 22.13)

23.32

57.94

8.706

5.736

(13.96, 23.81)

(18.28, 73.55)

(7.701, 22.64)

(2.243, 18.36)

Table 4.17: Extended Architecture Experiments Fitness
Final ﬁtness results of selected instruction set architectures run for extended experiments.
Each entry shows the median log2 population mean ﬁtness in the respective environment,
with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05,
Wilcoxon rank-sum test) deviations.

Final population performance in the Limited-9 environment does demonstrate a minor,
though not signiﬁcant trend of increasing ﬁtness and task success with increasing instruction set complexity. The Fibonacci-32 environment, however, continues to show notably
improved evolutionary potential, indicating that the architectural changes remain beneﬁcial
to some environments, despite making no signiﬁcant di↵erence in the other three tested environments. Although, as previously mentioned, the no free lunch theorem states that no
“perfect” encoding exists, these results support my hypothesis that the current genetic hardware in Avida is far from optimal and that substantial, generic improvements in performance
appear to be possible.
69

Logic-9

(0.820, 0.952)

0.558

0.350

0.928

(0.215, 0.906)

(0.224, 0.464)

(0.883, 0.966)

0.734

0.325

0.934

(0.744, 0.970)

Flow-IfX-MvH

0.920

(0.167, 0.969)

0.946

Search-Directional

0.301

(0.158, 0.800)

(0.744, 0.975)

Label-Seq-Direct

0.418

0.949

R6

Limited-9

(0.743, 0.966)

FA

Fibonacci-32

0.936

Heads

Logic-77

(0.292, 0.921)

(0.171, 0.944)

(0.905, 0.967)

0.943

0.714

0.435

0.933

(0.731, 0.969)

(0.266, 0.925)

(0.191, 0.950)

(0.906, 0.966)

0.951

0.722

0.472

0.932

(0.804, 0.977)

(0.263, 0.915)

(0.400, 0.963)

(0.879, 0.977)

0.958

0.753

0.505

0.954

(0.746, 0.986)

(0.238, 0.947)

(0.443, 0.984)

(0.750, 0.985)

Table 4.18: Extended Architecture Experiments Task Success
Final normalized task success results of selected instruction set architectures run for extended
experiments. Each entry shows the median log2 population mean ﬁtness in the respective
environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant
(p < 0.05, Wilcoxon rank-sum test) deviations.

4.3

Assessing Robustness to Poor Design Decisions

Note, the material in the section previously appeared as Bryson and Ofria (2012).
Changes to the instruction set can have a profound e↵ect on the evolutionary potential of
the system with respect to the environment, as shown above in Section 4.1.3. It is di cult,
if not impossible, to asses the impact of instruction set changes a priori. A seemingly
beneﬁcial change may, in fact, have unintended negative interactions with other aspects of
the system. Here I describe results of investigations into three types of poor instruction
set design decisions: functionally neutral instructions that bloat the instruction set, actively
deleterious instructions that poison the organism, and a lethal instruction. I evaluate the
evolutionary potential of each instruction set using the same approach as Section 4.1.
Given a ﬁxed environment, one particular instruction set may show greater evolutionary
70

potential in comparison to another. However, it is possible that aspects of an instruction set
may demonstrate better adaptability to changing circumstances. As evolution progresses,
genomes lock in features and genetic organization that are beneﬁcial. The structure of these
genomes impact their potential when the environment changes. I investigate this connection
by evaluating short term evolutionary potential of genomes following environmental change.
I examine how this potential changes relative to the progress of evolution in the origin
environment.

4.3.1

Tested Instruction Set Modiﬁcations

The Bloat-X instruction sets test what happens if you add useless, although not directly
disruptive, instructions to the instruction set. They extend the 26 instructions in the Heads
architecture with the addition of one or more copies of the no-operation instruction, nop-X,
which is functionally neutral, in that it does not alter the state of the virtual CPU. Additionally, unlike the default no-operation instructions, it does not alter the behavior of other
instructions. We tested four Bloat-X instruction sets, varying the mutational frequency of
the nop-X instruction. Bloat-1 adds nop-X with a frequency of one, increasing the overall
instruction set size to 27 and yielding an e↵ective mutational frequency of 0.037 for each
instruction. Bloat-3, Bloat-10, and Bloat-30 increase the frequency of nop-X to 3, 10,
and 30, respectively. In Bloat-30, the e↵ective instruction set size is 56, with an e↵ective
mutational frequency of the nop-X instruction at 0.536, and 0.018 for each of the remaining
26 standard instructions.
With the Poison instruction sets we are testing what happens when we make a poor
decision by adding an instruction that disrupts the functionality of the organisms upon
71

execution. These instruction sets extend the Heads architecture with the addition of a
poison instruction that, when executed, reduces the metabolic rate of the organism by a
conﬁgurable severity. Reduced metabolic rate translates to fewer relative CPU cycles, and
therefore diminished competitive ability. We tested three poison severities, 0.003, 0.01, 0.03,
which reduce metabolic rate by 0.3%, 1%, and 3%, respectively, each time the organism
executes the instruction. We hypothesized that lower penalties might be more detrimental
to long-term evolution, because they may slip in and accumulate over time.
Lastly, the Die instruction set sought to determine what happens when we make a
catastrophic error in including an instruction in the set. This instruction set adds a single
die instruction to the Heads architecture. The presence of a die instruction in a genome
is not itself lethal. If the organism executes the instruction during its’ lifetime, however,
the organism will be immediately removed from the population. In this manner, the die
instruction is similar to a poison instruction with a severity of 1.0.

4.3.2

General Performance Evaluation

As described in Section 4.1.3, I have focused on two measures of evolved populations to
evaluate the general performance of each instruction set: mean ﬁtness and task success.
Both measure the ability of the evolved organisms to perform tasks within the environment.
Upon analysis, all three Poison instruction sets and the Die instruction set demonstrated no signiﬁcant variation in either population mean ﬁtness (Table 4.19) or task success
(Table 4.20) across all seven environments. Indeed, the distributions of observed results were
largely similar, regardless of the severity of the penalty associated with a given instruction.
Likewise, the Bloat-1 and Bloat-3 instructions sets showed comparable performance to
72

the Heads control. This result would indicate that a single poor choice of an instruction,
no matter how bad, is not likely to signiﬁcantly limit evolutionary outcomes.
The Bloat-10 and Bloat-30 instruction sets, representing high levels of instruction
mutational dilution, demonstrate some negative e↵ects in ﬁnal population performance. In
the Logic-9 environment, both Bloat-10 and Bloat-30 showed signiﬁcantly decreased
ﬁtness and task success. Runs with these instruction sets evolved one fewer task on average in
the Logic-9 environment when compared to the control. The Logic-77 environment similarly
demonstrated decreased ﬁtness and task success, but unlike in the Logic-9 environment,
the Bloat-30 instruction set was notably worse than Bloat-10. This pattern of declining
performance was also observed in the Fibonacci-32 environment, with task success indicating
that the populations are outputting fewer one to three fewer numbers in the sequence as
instruction set dilution increases.
The Limited-9 environment demonstrated a split between ﬁtness and task success results
for Bloat-10 and Bloat-30. The ﬁtness results for both instruction sets were greater
than the Heads control, though not signiﬁcantly after Bonferroni correction. The Bloat10 instruction set demonstrated task success that was somewhat, though not signiﬁcantly,
reduced compared to the control. Task success with the Bloat-30 instruction set, however,
was signiﬁcantly reduced, with populations typically evolving one fewer task, as compared
to the Heads instruction set.
In the Navigation environment, the Bloat-10 instruction set demonstrated comparable
performance to the control for both ﬁtness and task success. The Bloat-30 instruction, on
the other hand, showed signiﬁcantly improved median ﬁtness. Task success was also notably
increased, nearly double all other instructions sets, though not statistically signiﬁcant from
73

the control after Bonferroni correction. Despite the increase, the populations are still quite
far from exploiting the opportunities in this environment, taking advantage of less than 1%
of the potential resources.
The Match-12 environment showed no variation in task success with any of the tested
instruction sets. The Bloat-10 and Bloat-30 instruction sets both demonstrated signiﬁcantly lower ﬁtness, with Bloat-30 the most severely depressed. Given the lack of variation
in task success, these ﬁtness result likely reﬂect the impact of neutral instruction set bloat
on the evolution of replication e ciency in these digital organisms.
Lastly, the Sort-10 environment was signiﬁcantly reduced in both ﬁtness and task success
under both Bloat-10 and Bloat-30. The di↵erences observed, however, were relatively
insubstantial. None of the tested instruction sets, including the Heads control, were able to
take advantage of the opportunities in the Sort-10 environment; all sets demonstrated ⌧ 1%
of the potential task success. The current limitations of the virtual CPU appear to make
this task incredibly di cult to evolve.

74

Logic-9
Heads
Bloat-1
Bloat-3
Bloat-10
Bloat-30
Poison-0.003
Poison-0.01
Poison-0.03
Die

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

19.71

15.22

0.198

3.645

-0.482

4.220

1.447

(19.33, 19.81)

(12.81, 16.34)

(0.159, 0.243)

(3.299, 4.082)

(-0.598, -0.324)

(4.095, 4.294)

(1.103, 3.197)

19.33

14.42

0.211

3.241

-0.498

4.301

1.123

(18.12, 19.72)

(12.36, 15.89)

(0.180, 0.242)

(3.010, 3.815)

(-0.608, -0.372)

(4.141, 4.418)

(1.051, 2.453)

19.67

12.31

0.176

3.400

-0.538

4.247

1.123

(18.15, 19.75)

(10.71, 14.37)

(0.142, 0.207)

(3.206, 3.762)

(-0.611, -0.441)

(4.138, 4.375)

(1.031, 3.202)

14.80

11.77

0.106

2.621

-0.696

4.526

1.068

(14.73, 17.32)

(10.66, 14.32)

(0.082, 0.127)

(2.540, 3.116)

(-0.717, -0.675)

(4.312, 4.779)

(1.022, 2.640)

14.38

7.74

-0.053

1.768

-0.770

4.480

2.872

(13.70, 14.61)

( 7.69, 8.67)

(-0.170, 0.083)

(1.722, 1.802)

(-0.785, -0.741)

(4.317, 4.684)

(1.313, 3.165)

19.57

14.11

0.206

4.240

-0.342

4.152

1.157

(18.72, 19.73)

(12.19, 15.96)

(0.177, 0.252)

(3.496, 4.623)

(-0.483, -0.225)

(4.071, 4.226)

(1.056, 1.635)

19.41

12.63

0.159

3.309

-0.591

4.300

1.342

(18.20, 19.76)

(11.64, 14.76)

(0.137, 0.214)

(3.181, 3.825)

(-0.664, -0.442)

(4.127, 4.492)

(1.052, 3.386)

18.66

12.19

0.157

3.417

-0.464

4.290

1.356

(17.79, 19.62)

(11.42, 14.40)

(0.130, 0.192)

(3.219, 3.883)

(-0.616, -0.313)

(4.207, 4.538)

(1.068, 3.275)

19.60

14.28

0.181

3.476

-0.581

4.422

1.324

(17.84, 19.78)

(11.93, 16.32)

(0.156, 0.235)

(3.211, 3.909)

(-0.634, -0.438)

(4.288, 4.563)

(1.060, 3.150)

Table 4.19: Bloat-X, Poison-X, and Die Architectures Fitness
Fitness results for all 8 test instruction sets and the Heads control architecture. Each entry shows the median log2 population
mean ﬁtness in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant
(p < 0.05, Wilcoxon rank-sum test) deviations after sequential Bonferroni correction.

75

Heads
Bloat-1
Bloat-3
Bloat-10
Bloat-30
Poison-0.003
Poison-0.01
Poison-0.03
Die

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

0.842

0.207

0.145

0.205

1.42⇥10 4

0.906

Navigation
4.35⇥10 3

(0.835, 0.847)

(0.179, 0.227)

(0.144, 0.146)

(0.177, 0.237)

(1.14, 1.71)

(0.896, 0.912)

(3.98, 7.54)

0.835

0.198

0.146

0.177

1.46⇥10 4

0.906

4.05⇥10 3

(0.753, 0.843)

(0.173, 0.218)

(0.145, 0.147)

(0.174, 0.207)

(1.10, 1.67)

(0.896, 0.911)

(3.96, 5.84)

0.839

0.171

0.146

0.203

1.37⇥10 4

0.899

3.99⇥10 3

(0.825, 0.846)

(0.150, 0.197)

(0.145, 0.147)

(0.176, 0.236)

(1.14, 1.54)

(0.829, 0.911)

(3.97, 7.20)

0.747

0.166

0.146

0.174

0.832

4.00⇥10 3

(0.744, 0.750)

(0.150, 0.203)

(0.144, 0.146)

(0.149, 0.178)

1.01 ⇥ 10 4
(0.99, 1.04)

(0.823, 0.894)

(3.97, 6.86)

0.736

0.114

0.146

0.120

0.777

7.57⇥10 3

(0.648, 0.744)

(0.113, 0.126)

(0.125, 0.147)

(0.119, 0.120)

9.7 ⇥ 10 5
(9.6, 9.9)

(0.732, 0.808)

(4.08, 7.82)

0.841

0.194

(0.828, 0.846)

76

Logic-9

0.147

0.239

1.67⇥10 4

0.910

3.99⇥10 3

(0.172, 0.217)

(0.146, 0.148)

(0.205, 0.285)

(1.45, 1.97)

(0.903, 0.915)

(3.97, 4.79)

0.839

0.174

0.146

0.179

1.22⇥10 4

0.898

4.33⇥10 3

(0.821, 0.845)

(0.162, 0.204)

(0.145, 0.146)

(0.176, 0.208)

(1.07, 1.54)

(0.844, 0.909)

(3.97, 7.77)

0.838

0.170

0.146

0.202

1.52⇥10 4

0.911

4.33⇥10 3

(0.773, 0.844)

(0.159, 0.202)

(0.145, 0.147)

(0.177, 0.237)

(1.11, 1.82)

(0.897, 0.916)

(3.97, 7.97)

0.827

0.198

0.146

0.195

1.27⇥10 4

0.906

4.29⇥10 3

(0.754, 0.841)

(0.163, 0.224)

(0.145, 0.147)

(0.176, 0.210)

(1.04, 1.52)

(0.840, 0.913)

(3.97, 6.58)

Table 4.20: Bloat-X, Poison-X, and Die Architectures Task Success
Task success results for all 8 test instruction sets and the Heads control architecture. Each entry shows the median normalized
task success in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant
(p < 0.05, Wilcoxon rank-sum test) deviations after sequential Bonferroni correction.

76

Figure 4.2: Heads and Bloat-30 Fitness Trajectory
Median log2 ﬁtness trajectory of the Heads (blue line) and Bloat-30 (green line) architectures. Lines calculated from all 200 replicates of each instruction set in the Fibonacci-32
environment. Shaded regions show 95% bootstrap conﬁdence intervals, 10,000 iterations.

4.3.3

Impact on Evolutionary Potential in a New Environment

The Bloat-X instruction sets, especially Bloat-10 and Bloat-30, showed reduced performance when examining a ﬁxed end point, as shown above. The ﬁtness trajectories of
these instruction sets demonstrated a corresponding drag on evolution throughout the entire history of the runs, relative to the Heads control instruction set (see Figure 4.2 for an
example). Despite this apparent drag, ﬁtness was still rising at the end of the experiment,
albeit more slowly.
The neutral bloat represented by the Bloat-X instruction sets, although detrimental
to the rate of evolution, may have a beneﬁcial e↵ect on the genetic architecture of the
evolved genomes. The nop instructions tend to decouple strings of other instructions, such
that genetic functions must be more loosely coupled. This property may a↵ord greater
77

evolutionary potential when the genomes experience environmental change. We have tested
this hypothesis by performing STEP sampling of genotypes in a new environment, never
before encountered in the history of the genotype.
We extracted the principal line of descent from 12 selected runs utilizing the Heads
and Bloat-30 instruction sets that we initially evolved in the Logic-9 and Fibonacci-32
environments. These two environments both demonstrated variation in performance between
the Heads and Bloat-30 instruction sets, and present computationally unique challenges
to the organisms (logic computation in Logic-9 and loop coordination in Fibonacci-32). The
genotypes along each of the lines of descent were STEP sampled, still using their native
instruction set, but in the opposite environment. For example, we placed genotypes evolved
in the Logic-9 environment into the Fibonacci-32 environment and evolved them for 10,000
updates. We sampled each genotype ten times and examined the ﬁtness and task success of
the resulting populations.
The STEP sampling results of the Heads control instruction set show that the short term
evolutionary potential of the genotypes declined in the Logic-9 environment as evolution
progressed in the Fibonacci-32 environment (see Figure 4.3). All sampled lines of descent
with the Heads architecture demonstrate similar patterns of evolutionary potential in the
Fibonacci-32 to Logic-9 shift. The Bloat-30 instruction set runs, on the other hand, show
a relatively ﬂat trend of evolutionary potential. Additionally, the Bloat-30 instruction set
demonstrated an increased number of high potential outliers throughout all of the sampled
lines of descent originally evolved in the Fibonacci-32 environment (see Figure 4.5).
Both the Heads control and the Bloat-30 demonstrated a consistent pattern of gradual
decline in short term evolutionary potential when sampling genotypes originally evolved in
78

Figure 4.3: STEP Sampling of Population 6156
STEP sampling results of population 6156 originally evolved in the Fibonacci-32 environment with the Heads instruction set. Blue line: ﬁtness of the reference genotype in its
native Fibonacci-32 environment. Green line: initial ﬁtness of the reference genotype in the
alternative Logic-9 environment. Orange line: median STEP results, smoothed using a FFT,
shown with 95% quantiles. Gray circles: individual STEP results. The ﬁrst point (step 0)
shows the performance of the ancestral genotype.

the Logic-9 source environment within the Fibonacci-32 sample environment (see Figure 4.4).
Similar to the Fibonacci-32 to Logic-9 environment transition, the Bloat-30 instruction set
showed notably more outlier samples of high potential. However, the overall spread and
trend of samples of the Bloat-30 genotypes were comparable to the Heads instruction set.
In order to assess the generality of the observed patterns, we STEP sampled the ﬁnal dominant genotype from all 200 runs of each of the original Heads and Bloat-30 experiments
from the Logic-9 and Fibonacci-32 environments in the appropriate alternate environment.
As observed in the line of descent sampling, the Bloat-30 instruction set genotypes from the
Fibonacci-32 environment demonstrated signiﬁcantly greater potential (p < 0.017; Wilcoxon
rank-sum test) when sampled in the Logic-9 environment (median log2 ﬁtness 10.40) in comparison to genotypes evolved with the Heads instruction set (median log2 ﬁtness 9.653).
79

Figure 4.4: STEP Sampling of Population 6166
STEP sampling results of population 6166 originally evolved in the Logic-9 environment
with the Heads instruction set. Blue line: ﬁtness in its native Logic-9 environment. Green
line: initial ﬁtness in the alternative Fibonacci-32 environment. Orange line: median STEP
results (smoothed) with 95% quantiles. Gray circles: individual STEP results.

Figure 4.5: STEP Sampling of Population 2050
Fitness STEP sampling results of population 2050 originally evolved in the Logic-9 environment with the Heads instruction set. Blue line: ﬁtness in its native Logic-9 environment.
Green line: initial ﬁtness in the alternative Fibonacci-32 environment. Orange line: median
STEP results (smoothed) with 95% quantiles. Gray circles: individual STEP results.

80

The transition from the Logic-9 environment to the Fibonacci-32 environment showed the
opposite results, with the Heads instruction set resulting in signiﬁcantly greater (p < 0.026)
evolutionary potential (median log2 ﬁtness 1.836; Wilcoxon rank-sum test) in comparison
to the Bloat-30 instruction set (median log2 ﬁtness 1.768).

4.3.4

Discussion

In examination of general performance, evolution demonstrated surprising robustness to increasingly poor design decisions. The addition of individual instructions that were incredibly
deleterious or lethal made no signiﬁcant di↵erence in the evolutionary potential of the system
across a wide range of static test environments. Similarly, low levels of neutral instruction set
bloat contributed negligibly to the observed performance. These results indicate that digital
evolution can reasonably overcome individual or small sets of detrimental design decisions,
regardless of the severity of the error. Populations exhibit substantial declines only if many
poor decisions compound one another.
High levels of instruction set bloat, diluting the frequency of functional mutations, resulted in an overall signiﬁcant drag on evolution. Despite this dilution decreasing the rate of
evolution, populations were still gaining ﬁtness and task success, indicating that evolution
could potentially overcome the detrimental e↵ects of such poor designs given additional time.
Although Bloat-30 performed poorly in the initial experiments, STEP sampling showed
that, under certain circumstances, the increased proportion of neutral mutations associated
with instruction set bloat can actually improve evolutionary potential when changing the
environment. The genetic architecture of the genotypes from the Fibonacci-32 environment
with the Bloat-30 instruction set, broken up by neutral instructions, showed to be more
81

adaptable to the logic ﬂow necessary for success in the Logic-9 environment. Conversely, the
genotypes with the Bloat-30 instruction set evolved in the Logic-9 environment performed
worse in the Fibonacci-32 environment, indicating that the looping structures necessary in
the Fibonacci-32 environment likely beneﬁt from more closely connected instructions.

4.4

Future Work

There are many potential avenues for further exploration of genetic hardware beyond the
scope of this dissertation. Among those avenues is performing head-to-head competition experiments between instruction sets to directly assess the evolutionary potential of particular
architectures. Additionally, there are an almost inﬁnite number of additional architecture
modiﬁcations that are worth investigating, including multiple memory spaces, changing registers to stacks, implementing instructions to facilitate the generation of arbitrary numbers,
splitting the h-copy instruction into read and write instructions, adding threading capabilities, implementing an interrupt-driven execution to facilitate environmental interaction, and
developing gene-regulatory-network inspired ﬂow control.

82

Chapter 5
How Do Mutation Types A↵ect
Evolutionary Potential?
A source of variation is a required component for adaptive evolution. Mutation rates clearly
a↵ect the rate of adaptive evolution, but are some types of mutations better than others?
Mutations can take on many forms and may occur at multiple levels of activity in a system.
Genotypic mutations take place in the heritable substrate, producing lasting variation. In
contrast, phenotypic mutations occur during expression of the genotype, producing transient
variation. Such transient variation, however, may trigger other types of genotypic variation.
In addition to type, mutations can also vary in their rate, e↵ect size, target, and distribution. Mutations can occur during an organism’s lifetime or only during replication. The
probability of mutation occurring may be ﬁxed, or it may be dependent on other factors,
such as genome size, organism activity, location, or simply time. The e↵ect size of a mutation may impact only a single position in a genome (be it a nucleotide, an amino acid, or an
instruction) or a mutation may change many elements simultaneously, such as by duplicat83

ing subroutines, genes, chromosomes, or even whole genomes. Mutations may target speciﬁc
elements or may be general rates across all such components. The distribution of characters
or activities that may be substituted as the result of a mutation can be uniform, skewed, or
dependent upon the original element being mutated.
Clearly, there are many aspects of mutations that can be varied, and it is unclear a priori
what type of role each will play in terms of evolutionary potential. While the genotypephenotype relationship may set the characteristics of each point in a ﬁtness landscape, these
mutational rules determine its connectivity. Certain genotypic mutations may allow evolution to traverse rugged regions of the landscape that would otherwise have been nearly
impassable. Others, however, may force the evolution of greater genotypic robustness, at
the cost of adaptive ability.
The ﬁeld of evolutionary computation, seeking to optimize search performance, has explored many mutation types (which are called “search operators” in the ﬁeld), typically
varying many of the characteristics described above. Given the focus on solving engineering
problems, evaluation of new search operators typically focuses on short-term problem solving capability. In biological and digital evolution, additional properties, such as robustness,
modularity, adaptive complexity, and future evolvability are also important. To date there
has been no systematic exploration of mutation types and their e↵ect on evolutionary potential in an open-ended evolution system. I have explored four main aspects of mutations,
the optimal balance of mutation types (Section 5.2), mutation timing (Section 5.3), versions
of large insertion/deletion mutations (Section 5.4), and the side-e↵ects of mutations (Section 5.5). Each treatment was tested in the seven computational environments described in
Section 2.2, evaluating how these mutations alter evolutionary potential.
84

5.1

Methods

I have evaluated the evolutionary potential of all mutation types described below using sets of
50 replicate populations run for 500,000 updates. During these runs I collected two metrics
for comparison, mean population ﬁtness and normalized task success, every 100 updates.
Normalized task success, described in detail in Section 4.1.3, is a standardized measure of
the environmental task performance of a genotype.
The overall mutation rate of 3 ⇥ 10 3 per site in the genome identiﬁed as the most
broadly beneﬁcial in Section 5.2.1 was using in all subsequent experiments in this chapter.
Additionally, the balanced proportion (1:1) of insertion/deletion mutations and instruction
substitution mutations found to be the most broadly beneﬁcial in Section 5.2.2 was used in
all subsequent experiments in this chapter.

5.2

Exploring the Balance of Mutation Types

The standard mutation types utilized in Avida are single instruction substitution, and single
instruction insertion or deletion. These rates and proportions of these mutations have been
selected based on previous work in limited environments. I have systematically explored the
balance of mutation rates, proportions, and probability applications in all seven computational environments, assessing the evolutionary potential of the system given a ﬁxed time
period.
85

5.2.1

Optimal Mutation Rates for Adaptive Evolution

Mutation rate plays a critical role in determining the speed of adaptive evolution. High
rates allow rapid exploration of a ﬁtness landscape, but if the rates are too high adaptive
evolution will slow down as populations struggle to maintain any genetic information.
I tested ﬁve mutation loads, 0.01, 0.1, 0.3, 1.0, and 3.0, such that given the genome
length of the default ancestral genome the 1.0 rate would on average yield one mutation
per generation. The results, presented in Tables 5.1 and 5.2, match well with expectations,
with most environments demonstrating the greatest evolutionary potential at intermediate
mutation rates. The Logic-9, Match-12, Sort-10, and Navigation environments all show
the highest ﬁtness and generally task success with 0.3. The Logic-77 and Fibonacci-32
environments appear to peak with the 1.0 rate. The Limited-9 environment demonstrates
an unexpected tradeo↵ between ﬁtness and task success. Median ﬁtness continued to increase
with increased mutation rate, even at 3.0. However, task success appears to to peak around
0.1, with steady decline in performance as mutation rate increases.
As indicated in the methods, above, I selected the 0.3 mutation rate to perform all of the
other experiments in this Chapter.

86

Rate
0.01
0.1
0.3
1.0
3.0

Logic-9

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

9.77

4.88

-0.086

1.342

-0.551

1.861

0.574

(2.00, 17.41)

(0.71, 9.66)

(-0.423, 0.55)

(0.541, 4.44)

(-1.211, -0.274)

(-0.099, 5.996)

(-0.148, 4.208)

19.89

12.54

0.336

3.462

-0.377

3.745

3.956

(12.71, 24.10)

(6.36, 63.81)

(-0.154, 0.59)

(1.414, 18.82)

(-0.934, 0.906)

(2.642, 7.253)

(1.101, 18.475)

23.63

28.20

0.366

5.736

0.126

4.354

5.358

(14.75, 23.94)

(10.18, 72.29)

(0.137, 0.76)

(2.572, 21.67)

(-0.736, 1.130)

(2.894, 15.336)

(1.033, 22.181)

23.04

45.75

-0.003

7.659

-0.427

4.364

1.227

(14.46, 23.48)

(11.81, 68.62)

(-0.461, 0.28)

(2.755, 21.65)

(-0.906, 1.940)

(2.830, 12.115)

(1.048, 21.070)

17.24

48.63

-0.800

6.007

-0.799

7.882

0.791

(12.02, 22.06)

(8.03, 70.37)

(-0.917, -0.76)

(3.335, 20.60)

(-0.922, -0.762)

(2.049, 15.370)

(0.638, 3.575)

Table 5.1: Mutation Rate Fitness
Fitness results of tested mutation rates. Each entry shows the median log2 population mean ﬁtness in the respective environment, with 95% conﬁdence intervals in parentheses.

87

Rate
0.01
0.1
0.3
1.0
3.0

Logic-9

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

0.555

0.078

0.083

0.094

0.99⇥10 4

0.630

2.805⇥10 3

(0.256, 0.815)

(0.021, 0.145)

(0.074, 0.166)

(0.062, 0.322)

(0.00, 1.48)

(0.269, 0.885)

(2.045, 9.361)

0.871

0.177

0.160

0.184

1.42⇥10 4

0.970

8.713⇥10 3

(0.639, 0.988)

(0.093, 0.846)

(0.079, 0.163)

(0.093, 0.990)

(0.92, 5.20)

(0.709, 0.982)

(3.671, 27.646)

0.947

0.370

0.151

0.307

2.62⇥10 4

0.931

10.293⇥10 3

(0.740, 0.970)

(0.143, 0.903)

(0.144, 0.221)

(0.148, 0.912)

(0.93, 6.05)

(0.815, 0.958)

(3.938, 42.982)

0.876

0.512

0.122

0.423

1.46⇥10 4

0.845

3.880⇥10 3

(0.702, 0.926)

(0.156, 0.834)

(0.063, 0.138)

(0.152, 0.910)

(0.00, 13.49)

(0.714, 0.925)

(3.355, 40.114)

0.687

0.508

0.000

0.350

0.00⇥10 4

0.729

3.491⇥10 3

(0.539, 0.892)

(0.121, 0.735)

(0.000, 0.000)

(0.207, 0.817)

(0.00, 0.00)

(0.451, 0.891)

(3.298, 8.881)

Table 5.2: Mutation Rate Task Success
Normalized task success results of tested mutation rates. Each entry shows the median normalized task success in the respective
environment, with 95% conﬁdence intervals in parentheses.

88

5.2.2

Balance of Insertion/Deletions and Substitution Mutations

In the default conﬁguration, insertion/deletion (indel) mutations and substitutions occur
with proportions 2:5, values that were originally chosen in an arbitrary manner. I have
varied these proportions with seven treatments, increasing or decreasing insertion/deletion
mutations relative to substitution mutations. Speciﬁcally, I tested the proportions 0:1 (substitutions only), 1:5, 2:5, 1:1, 5:2, 5:1, 1:0 (indels only). The 0:1 treatment group, featuring
only substitution mutations, did not require organisms to maintain ﬁxed length, but length
changes required unstable genotypes, a side-e↵ect of other mutations to be discussed in detail
in Section 5.5.
Analysis of the results (see Tables 5.3 and 5.4) showed that ﬁve of the tested proportions, 1:5, 1:1, 5:2, 5:1, 1:0 (indels only), were, for all practical purposes, equivalent in
performance. The 0:1 (substitutions only) treatment was substantially worse in the Logic-9,
Logic-77, Fibonacci-32, and Navigation environments, likely due to the ﬁxed length aspect of
the treatment. The 2:5 treatment, representing the default setting for Avida, demonstrated a
notable tradeo↵ between the Logic-77 and Fibonacci-32 environments, where median ﬁtness
and task success were the highest of all treatments, and the Match-12, Sort-10, and Navigation environments, where ﬁtness and task success were substantially depressed relative to
all other treatments.
As indicated in the methods, above, I selected the 1:1 proportion of indel:substitution
mutations for all other experiments in this Chapter.

89

Proportion
0:1
1:5
2:5
1:1
5:2
5:1
1:0

Logic-9

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

16.61

11.78

0.365

2.278

-0.033

4.860

1.431

(9.76, 24.04)

(4.26, 60.85)

(-0.193, 0.64)

(1.428, 12.67)

(-0.752, 1.145)

(1.709, 12.136)

(0.418, 14.248)

20.65

29.27

0.416

3.742

-0.084

4.410

3.637

(14.22, 24.05)

(9.92, 69.67)

(-0.129, 0.73)

(2.024, 12.83)

(-0.711, 1.192)

(2.954, 12.101)

(1.220, 19.992)

22.94

43.95

-0.004

7.588

-0.487

3.929

1.137

(14.62, 23.46)

(17.20, 71.80)

(-0.455, 0.18)

(3.105, 21.60)

(-1.053, 1.190)

(2.401, 12.320)

(0.950, 51.268)

23.56

36.04

0.365

5.559

-0.021

5.000

4.973

(14.84, 23.95)

(12.56, 68.87)

(0.127, 0.86)

(2.235, 18.26)

(-0.734, 1.649)

(3.216, 16.851)

(1.156, 25.002)

23.44

28.53

0.349

5.185

0.213

4.114

4.123

(14.47, 23.96)

(9.98, 66.62)

(0.140, 0.83)

(2.700, 21.77)

(-0.675, 1.135)

(3.162, 9.824)

(1.061, 36.873)

23.11

30.54

0.331

5.279

0.288

3.996

4.983

(14.69, 23.87)

(11.73, 72.37)

(0.080, 0.71)

(2.244, 21.49)

(-0.574, 1.263)

(2.772, 21.295)

(0.994, 22.772)

22.98

25.50

0.316

6.666

0.274

3.937

5.786

(14.37, 23.76)

(9.17, 62.71)

(0.059, 0.53)

(2.322, 18.31)

(-0.704, 2.777)

(2.478, 22.769)

(0.751, 30.193)

Table 5.3: Insertion/Deletion and Substitution Proportion Fitness
Fitness results of tested insertion/deletion:substitution mutation proportions. Each entry shows the median log2 population
mean ﬁtness in the respective environment, with 95% conﬁdence intervals in parentheses.

90

Proportion
0:1
1:5
2:5
1:1
5:2
5:1
1:0

Logic-9

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

0.772

0.167

0.158

0.125

1.85⇥10 4

0.938

4.088⇥10 3

(0.550, 0.993)

(0.064, 0.743)

(0.074, 0.162)

(0.093, 0.649)

(0.86, 6.03)

(0.627, 0.988)

(2.048, 26.287)

0.874

0.388

0.156

0.209

1.84⇥10 4

0.954

8.088⇥10 3

(0.721, 0.988)

(0.140, 0.869)

(0.078, 0.186)

(0.121, 0.632)

(0.86, 6.53)

(0.638, 0.978)

(4.027, 40.032)

0.869

0.518

0.115

0.425

0.93⇥10 4

0.817

3.838⇥10 3

(0.705, 0.918)

(0.222, 0.805)

(0.063, 0.134)

(0.168, 0.915)

(0.00, 6.96)

(0.732, 0.918)

(3.746, 94.071)

0.949

0.450

0.153

0.298

2.24⇥10 4

0.944

10.271⇥10 3

(0.754, 0.978)

(0.172, 0.852)

(0.146, 0.231)

(0.139, 0.944)

(0.90, 10.33)

(0.810, 0.967)

(3.988, 49.302)

0.944

0.379

0.151

0.310

2.74⇥10 4

0.926

8.741⇥10 3

(0.735, 0.969)

(0.139, 0.837)

(0.143, 0.228)

(0.151, 0.971)

(0.94, 6.39)

(0.822, 0.955)

(3.938, 70.421)

0.936

0.387

0.152

0.289

2.83⇥10 4

0.924

10.345⇥10 3

(0.735, 0.961)

(0.162, 0.939)

(0.121, 0.218)

(0.138, 0.963)

(1.02, 7.06)

(0.821, 0.958)

(3.910, 42.835)

0.931

0.346

0.150

0.353

2.90⇥10 4

0.917

11.870⇥10 3

(0.733, 0.963)

(0.131, 0.801)

(0.119, 0.176)

(0.148, 0.970)

(0.67, 30.43)

(0.730, 0.952)

(3.889, 44.913)

Table 5.4: Insertion/Deletion and Substitution Proportion Task Success
Normalized task success results of tested insertion/deletion:substitution mutation proportions. Each entry shows the median
normalized task success in the respective environment, with 95% conﬁdence intervals in parentheses.

91

5.2.3

Per-Site Mutation Rates vs. Per-Genome Mutation Rates

Given a constant genome length, mutation rate denotes a consistent probability distribution
that a certain number of mutations would occur. However, when genome length is allowed
to vary, the translation of mutation rate to the probability of a certain number of mutations
becomes open to multiple potential implementations. Mutations may be determined and
applied as a ﬁxed genomic probability, regardless of length variation, which would cause the
per-site mutation rate to decrease as genome length increases. Alternatively, the mutation
rate can be applied as a per-site/instruction probability of mutation, in which case the
genomic level mutation rate will increase as genome size increases.
Similar to mutation timing, described above, the default mutations utilized in Avida use
a combination of these mutation rate methods, again originally chosen in a semi-arbitrary
manner. By default, instruction substitution mutations are applied as a per-instruction
copied rate. Insertion and deletion mutations are a ﬁxed probability of a single instruction
insertion or deletion per genome.
I have tested and compared variation in evolutionary potential when utilizing multiple combinations of absolute and per-site relative mutation probabilities. The experiments
consisted of six treatment groups and one control group. All groups applied all mutations
during o↵spring division. The overall mutation rate was initially ﬁxed, such that per-site
and absolute rates started out identical given the length of the seed genotype.
The ﬁrst four groups varied the rates of substitution mutations and insertion/deletion
mutations as two parameters. The control group utilized all per-site mutation rates. The
ﬁrst treatment group closely replicated Avida defaults, with a per-site substitution rate and
absolute probability of single instruction insertion or deletion. The second treatment group
92

tested the inverse of the Avida defaults, applying insertion/deletion mutations as per-site
rates and substitution mutations as an absolute probability of a single mutation. The third
treatment group tested both mutation classes as absolute probabilities of single mutation.
In the three treatment groups described so far, all absolute mutation rates were limited to
a single occurrence. The per-site mutation rates, in contrast, allowed for multiple mutations
to occur in a given generation. I repeated the three previously described mutation treatments,
testing Avida defaults, inverted defaults, and all absolute rates, using mutations that were
applied with a Poisson process.
Analysis of all seven mutation rate application methods (see Tables 5.5 and 5.6) showed
that evolution is generally quite robust to such variation. No signiﬁcant di↵erence in evolutionary potential was observed in the Logic-9, Match-12, Fibonacci-32, Sort-10, or Navigation
environments among any of the tested mutation rate application methods.
The Logic-77 did demonstrate signiﬁcantly reduced ﬁtness when per-genome rates were
used, both single mutation and poisson rate. Task success was also signiﬁcantly depressed
with single mutation per-genome rates. Due to the complexity of the environment, with
seventy-seven tasks, it is likely that longer genomes are important for exploiting them. With
per-genome rates, the e↵ective per-site mutation rate decreases as the genome length increases. As observed in Section 5.2.1, as the per-site mutation rate decreases the Logic-77
environment demonstrates reduced evolutionary potential, explaining the observations here.
In contrast to the Logic-77 environment, the Limited-9 environment demonstrated significantly increased median ﬁtness and task success with both per-genome calculation methods.
This also matches well with the task success results of Section 5.2.1, where the Limited-9
environment shows increased potential with lower per-site mutation rates.

93

Logic-9

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

Per-Site
Both

23.09

29.17

0.368

5.099

0.006

4.272

2.307

(14.66, 23.94)

(9.89, 63.39)

(0.195, 0.53)

(2.593, 21.29)

(-0.745, 1.208)

(2.735, 8.799)

(1.123, 21.519)

Per-Site
Substitution

23.37

30.48

0.381

5.693

0.072

4.592

3.522

(14.87, 23.96)

(10.49, 69.02)

(-0.073, 0.84)

(2.417, 17.68)

(-0.813, 1.228)

(3.414, 19.856)

(1.104, 21.806)

Per-Genome
Substitution

23.51

31.06

0.366

5.591

-0.026

4.179

4.568

(14.21, 23.96)

(10.58, 64.46)

(0.080, 0.67)

(1.978, 21.82)

(-0.727, 1.835)

(3.101, 22.910)

(1.135, 43.903)

Per-Genome
Both

19.91

17.90

0.362

3.802

0.226

6.465

3.531

(9.94, 24.25)

(6.83, 56.20)

(0.037, 0.54)

(1.470, 13.12)

(-0.666, 1.281)

(3.722, 23.996)

(0.975, 18.637)

Poisson
Indel

23.47

29.50

0.373

5.207

-0.048

4.425

4.380

(14.88, 24.02)

(9.95, 64.11)

(-0.108, 0.82)

(2.509, 13.21)

(-0.702, 0.939)

(3.274, 18.366)

(0.990, 25.160)

Poisson
Substitution

23.55

35.40

0.391

5.475

0.124

4.478

2.601

(14.83, 24.02)

(13.43, 66.69)

(0.200, 0.83)

(1.962, 13.12)

(-0.695, 1.154)

(3.416, 13.833)

(0.906, 17.380)

Poisson
Both

22.10

18.81

0.367

3.759

0.168

4.963

3.635

(11.83, 24.23)

(4.44, 68.94)

(0.132, 0.81)

(1.481, 13.06)

(-0.752, 1.716)

(3.493, 23.831)

(0.986, 12.653)

Table 5.5: Mutation Rate Application Fitness
Fitness results of tested mutation rate applications methods. Each entry shows the median log2 population mean ﬁtness in the
respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05) deviations.

94

Logic-9

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

Per-Site
Both

0.948

0.390

0.153

0.289

2.25⇥10 4

0.945

5.646⇥10 3

(0.744, 0.971)

(0.138, 0.785)

(0.145, 0.158)

(0.149, 0.908)

(0.90, 6.22)

(0.809, 0.960)

(3.958, 39.625)

Per-Site
Substitution

0.950

0.414

0.153

0.311

2.35⇥10 4

0.942

8.041⇥10 3

(0.746, 0.978)

(0.146, 0.885)

(0.124, 0.227)

(0.139, 0.975)

(0.00, 6.84)

(0.826, 0.966)

(3.694, 40.570)

Per-Genome
Substitution

0.953

0.420

0.154

0.304

2.09⇥10 4

0.949

8.995⇥10 3

(0.709, 0.975)

(0.146, 0.822)

(0.144, 0.230)

(0.122, 0.931)

(0.54, 11.44)

(0.781, 0.978)

(3.828, 76.636)

Per-Genome
Both

0.874

0.247

0.153

0.207

1.92⇥10 4

0.951

8.137⇥10 3

(0.556, 0.994)

(0.102, 0.701)

(0.147, 0.162)

(0.112, 0.622)

(0.92, 6.52)

(0.692, 0.995)

(3.914, 35.491)

Poisson
Indel

0.949

0.396

0.153

0.278

2.06⇥10 4

0.946

9.680⇥10 3

(0.755, 0.977)

(0.139, 0.779)

(0.117, 0.230)

(0.149, 0.628)

(0.62, 4.78)

(0.911, 0.966)

(3.528, 47.875)

Poisson
Subsitution

0.958

0.450

0.153

0.289

2.25⇥10 4

0.950

6.152⇥10 3

(0.754, 0.979)

(0.183, 0.859)

(0.143, 0.228)

(0.121, 0.614)

(0.59, 5.54)

(0.896, 0.974)

(3.644, 34.698)

Poisson
Both

0.916

0.260

0.153

0.208

2.17⇥10 4

0.954

8.176⇥10 3

(0.606, 0.992)

(0.073, 0.927)

(0.149, 0.230)

(0.107, 0.630)

(0.88, 8.59)

(0.724, 0.989)

(3.918, 24.294)

Table 5.6: Mutation Rate Application Task Success
Normalized task success results of tested mutation rate application methods. Each entry shows the median normalized task
success in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05)
deviations.

95

5.3

Investigating the E↵ect of Mutation Timing

Genotypic mutations in natural systems may arise during the process of copying genetic material. In digital evolution systems, mutations can be modeled to produce similar behavior,
producing mutations during the actual execution of the copy operation. However, unlike
natural systems, it is also possible to limit the timing of all mutations such that they occur
only during division of o↵spring, to toggle whether mutations are also applied to the parent
genome, to trigger mutations randomly at any time (cosmic-ray mutations), among other
possibilities, providing the experimenter with greater control of the exact characteristics of
the mutation.

5.3.1

Copy Mutations versus Divide Mutations

There are many potential side e↵ects associated with the timing of mutation application.
Mutations that occur during the copy process can a↵ect the execution of the current organism, whereas mutations on divide cannot. Intra-lifetime copy mutations could possibly
allow the evolution of error correction, provided that the genetic language supports it. Mutations applied on divide can be exactly calibrated to genome length or applied with absolute
probability, whereas copy mutations may only be produced reliably as per-site rates.
In Avida, the default mutations use a combination of copy process mutations and mutations on divide, chosen for experiment run-time e ciency. The consequences of this combination, and the e↵ect on evolutionary potential of choosing other combinations have not
been explored. I have tested this with four treatments varying the timing of mutations. The
control treatment applied all mutations on-divide, as described in Chapter 2 and used in
all other studies. One treatment used mutations similar to the Avida defaults, with copy
96

process substitutions and on-divide insertion and deletion mutations.1 The inverted treatment applied insertion and deletion mutations during the copy process, while instruction
substitutions occurred on-divide. The ﬁnal treatment group applied all mutations during
the copy process, with no mutations on divide.
The results, presented in Tables 5.7 and 5.8, showed that the Avida default treatment,
with on-copy substitution mutations and on-divide insertion and deletion mutations, and the
on-divide treatment both demonstrated the greatest evolutionary potential. The inverted
and all on-copy treatments, featuring copy process insertion and deletion mutations, demonstrated signiﬁcantly reduced task success and ﬁtness in many environments, indicating that
such mutations are likely disruptive and present a drag on evolutionary potential.

1 Avida default mutation rates apply a probability of instruction substitution per instruction copied during the organism’s lifetime and an absolute probability each of singleinstruction insertion or deletion on division of the o↵spring. In order to maintain consistent
genomic mutation rates, insertion and deletion mutations were applied using per-site rates
compensated for the ancestral genotype length.
97

Logic-9
On-Divide
Default
Inverted
On-Copy

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

23.64

32.98

0.377

5.507

0.224

4.278

4.055

(14.77, 24.00)

(11.53, 68.36)

(0.179, 0.85)

(2.629, 22.04)

(-0.693, 2.159)

(2.638, 10.425)

(1.075, 62.130)

23.44

35.19

0.367

5.447

0.020

4.187

4.371

(14.87, 23.96)

(9.62, 70.67)

(0.120, 0.80)

(1.937, 21.94)

(-0.713, 1.225)

(2.984, 7.240)

(1.049, 63.064)

22.74

41.42

0.363

7.079

-0.266

4.495

1.096

(14.00, 23.79)

(10.77, 72.54)

(0.015, 0.64)

(2.358, 21.58)

(-1.073, 0.884)

(2.912, 20.787)

(0.822, 21.592)

22.83

39.42

0.330

7.740

-0.060

3.922

2.184

(14.20, 23.69)

(7.81, 71.19)

(-0.022, 0.71)

(2.188, 21.72)

(-1.185, 1.053)

(3.152, 19.155)

(0.592, 107.041)

Table 5.7: Mutation Timing Fitness
Fitness results of tested mutation timings. Each entry shows the median log2 population mean ﬁtness in the respective
environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05) deviations.
Logic-9
On-Divide
Default
Inverted
On-Copy

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

0.943

0.403

0.153

0.299

2.59⇥10 4

0.941

8.661⇥10 3

(0.744, 0.971)

(0.161, 0.799)

(0.140, 0.225)

(0.149, 0.946)

(0.94, 21.57)

(0.801, 0.975)

(3.964, 117.995)

0.951

0.465

0.153

0.289

2.44⇥10 4

0.944

9.150⇥10 3

(0.743, 0.975)

(0.134, 0.877)

(0.147, 0.222)

(0.121, 0.954)

(0.58, 6.08)

(0.831, 0.962)

(3.964, 100.548)

0.934

0.548

0.149

0.398

1.96⇥10 4

0.917

3.999⇥10 3

(0.730, 0.966)

(0.149, 0.926)

(0.141, 0.216)

(0.148, 0.958)

(0.89, 7.00)

(0.767, 0.953)

(3.913, 41.426)

0.933

0.528

0.148

0.446

2.33⇥10 4

0.919

5.007⇥10 3

(0.736, 0.966)

(0.118, 0.887)

(0.141, 0.216)

(0.148, 0.962)

(0.87, 7.19)

(0.796, 0.967)

(3.673, 200.922)

Table 5.8: Mutation Timing Task Success
Normalized task success results of tested mutation timings. Each entry shows the median normalized task success in the
respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05) deviations.
98

5.3.2

Cosmic-Ray Mutations

Copy mutations represent variation introduced during the replication process, internally
created and typically limited in scope to future o↵spring.2 Externally induced mutations,
sometimes referred to as “cosmic-ray” mutations, however, are not limited in scope. Such
mutations introduce broad variation into the population, potentially a↵ecting both current
and future generations.
I have tested the e↵ects of cosmic-ray mutations on evolutionary potential relative to the
control mutation settings. Canonical Avida cosmic-ray mutations perform single instruction
substitution at a given ﬁxed rate. In order to fairly compare with the control, which features
both substitution and insertion/deletion mutations, I implemented and tested cosmic-ray
insertion/deletion mutations alongside instruction substitution cosmic-ray mutations. Since
cosmic-ray mutations are applied as a function of time, rather than genome length, I set the
mutation rate to be approximately 0.3 genomic by factoring in the initial gestation time of
the ancestral organism.
In all seven environments, cosmic-ray mutations demonstrated decreased median task
success (Table 5.10). Among these, the Fibonacci-32 and Limited-9 environments were
signiﬁcant with 40.4% and 54.6% drops, respectively. Fitness results were similarly depressed
six of the seven environments, with the Fibonacci-32, Limited-9, and Navigation signiﬁcantly
so (Table 5.9). The Logic-77 environment was the only environment to feature increased
median ﬁtness, though the results were statistically identical to the control. Overall these
results indicate that cosmic-ray mutations, which may disrupt an organism’s genome during
its lifetime, are detrimental to the evolutionary potential of the system.
2 It is technically possible for intra-lifetime copy mutations to overwrite portions of the
parental genome.
99

Logic-9
Control
Cosmic-Ray

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

23.23

36.05

0.375

5.673

-0.011

3.993

4.221

(14.59, 23.96)

(11.56, 65.00)

(0.075, 0.72)

(2.665, 21.99)

(-0.734, 1.342)

(2.681, 9.328)

(1.141, 34.603)

22.86

37.95

0.361

4.651

-0.189

3.226

1.202

(12.68, 23.84)

(8.44, 65.99)

(0.220, 0.75)

(2.279, 21.24)

(-1.027, 1.345)

(1.426, 6.248)

(1.003, 19.167)

Table 5.9: Cosmic-Ray Mutation Fitness
Fitness results of cosmic-ray mutations. Each entry shows the median log2 population mean ﬁtness in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05) deviations.

Logic-9
Control
Cosmic-Ray

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

0.945

0.450

0.153

0.305

2.19⇥10 4

0.942

8.961⇥10 3

(0.736, 0.970)

(0.161, 0.823)

(0.116, 0.224)

(0.151, 0.961)

(0.92, 6.72)

(0.792, 0.968)

(3.964, 67.347)

0.939

0.426

0.153

0.182

1.95⇥10 4

0.934

4.069⇥10 3

(0.646, 0.970)

(0.126, 0.721)

(0.147, 0.224)

(0.127, 0.870)

(0.00, 8.06)

(0.560, 0.954)

(4.021, 35.548)

Table 5.10: Cosmic-Ray Mutation Task Success
Normalized task success results of cosmic-ray mutations. Each entry shows the median normalized task success in the respective
environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05) deviations.

100

5.3.3

Parent Mutations

During the process of division in biological organisms the resulting organisms may represent
a distinct parent and o↵spring pair. Alternatively, the resulting organisms may be more correctly treated as two daughter cells, with genetic material and cellular machinery somewhat
equally distributed among the o↵spring. During the formation of daughter cells, mutations
accumulated during the replication process may end up in either of the o↵spring. As a result
of this, even though genomic mutation rate may remain constant, the generational mutation
rate could be substantially larger.
I have explored the evolutionary potential consequences of this di↵erence. I compared
two experimental groups featuring parental mutations applied on divide relative to the standard control conﬁguration. The ﬁrst experiment group featured parent-organism mutations
identical in class and probability to those of o↵spring. The net e↵ect of this change is that
generational mutation rate was approximately double that of the control group. In the second experiment group, I tested parent and o↵spring mutation rates that were half that of
the ﬁrst group. This change yielded a generational mutation rate comparable to the control
group, but featured a corresponding decrease in individual organism mutation rate.
As a mutation timing, parent mutations made no signiﬁcant di↵erence in evolutionary
potential (see Tables 5.11 and 5.12). The Logic-77, Fibonacci-32, and Sort-10 environments
were statistically equivalent under both treatments. The Logic-9, Match-12, Limited-9 and
Navigation environments demonstrated signiﬁcant reductions in task success and often ﬁtness in the standard parent mutations treatment. However, when compensated for the
generational mutation rate in the parent half-rate treatment, these di↵erences disappeared.

101

Logic-9
Control
Parent
Parent
Half-rate

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

22.36

28.30

0.375

5.192

0.250

4.343

4.256

(14.47, 23.92)

(8.91, 70.21)

(0.029, 0.72)

(2.025, 21.72)

(-0.713, 1.731)

(3.124, 10.958)

(1.104, 21.705)

22.63

31.33

0.204

5.502

-0.081

4.019

1.267

(13.53, 23.79)

(10.41, 66.55)

(0.029, 0.40)

(2.634, 21.36)

(-0.747, 3.574)

(2.808, 8.415)

(1.083, 59.015)

22.29

31.55

0.369

5.043

-0.045

4.009

3.602

(13.03, 23.93)

(7.63, 61.80)

(-0.004, 0.68)

(2.375, 20.64)

(-0.768, 1.212)

(2.521, 7.815)

(1.077, 22.752)

Table 5.11: Parent Mutation Fitness
Fitness results of parent mutations and parent mutations at half-rate. Each entry shows the median log2 population mean
ﬁtness in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05)
deviations.

Logic-9
Control
Parent
Parent
Half-Rate

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

0.894

0.362

0.153

0.293

2.68⇥10 4

0.943

8.681⇥10 3

(0.743, 0.969)

(0.127, 0.818)

(0.120, 0.223)

(0.121, 0.962)

(0.00, 9.63)

(0.850, 0.962)

(3.978, 40.146)

0.890

0.390

0.140

0.277

2.00⇥10 4

0.895

3.980⇥10 3

(0.685, 0.956)

(0.146, 0.787)

(0.123, 0.149)

(0.147, 0.886)

(0.00, 41.22)

(0.833, 0.938)

(3.938, 109.987)

0.898

0.423

0.152

0.292

2.11⇥10 4

0.939

8.337⇥10 3

(0.646, 0.976)

(0.110, 0.781)

(0.142, 0.229)

(0.139, 0.974)

(0.80, 6.48)

(0.746, 0.958)

(3.993, 44.897)

Table 5.12: Parent Mutation Task Success
Normalized task success results of parent mutations and parent mutations at half-rate. Each entry shows the median normalized
task success in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant
(p < 0.05) deviations.

102

5.4

Exploring Aspects of Insertion/Deletion Mutations
of Large E↵ect

Mutations commonly used in digital evolution systems typically a↵ect only one instruction
at a time. These mutations can accumulate over generations, producing complex features
(Lenski et al., 2003). Natural systems, however, are not limited to single-site mutations.
Mutations of large-e↵ect can occur, producing duplicate genes, chromosomes, or even whole
genomes, and play an important role in biological evolution (Lynch, 2000; Zhang, 2003).
The consequences of multi-site mutations can vary widely. Given the potential size and
scope of the genome changes, they may result in signiﬁcant jumps through genotype space.
The creation of duplicate genetic material, including potentially complete genes, may allow
for the evolution of novel or improved functionality that would otherwise have been unlikely.
One copy can be altered or used as a building block for a new trait while the organism retains
the original functionality using the other copy.
I systematically tested versions of an insertion/deletion mutation of large e↵ect in Avida,
called a “slip” mutation. Slip mutations produce genetic variation that simulates the genome
replication mechanism shifting while reading the source genome, causing wholesale duplication or deletion of contiguous genetic segments. Slip mutations, as implemented in Avida,
support four versions of the duplication (insertion) behavior3 , as summarized in Table 5.13.
Deletion behavior is identical for all versions. The size of resulting insertions and deletions is limited to the length of the genome, selected as occurring at a random upstream or
3 Slip mutations in Avida were originally conceived and implemented by Je↵ Barrick. I
have extended both the mutations themselves and the analysis of their e↵ect on evolutionary
potential.
103

Name

Description of Inserted Code

duplicate

An identical copy of the previous region of code

scramble

A randomly reordered copy of the previous region of code

random

A completely random sequence of instructions

nop-X

A series of nop-X instruction (that are as neutral as possible)
Table 5.13: Insertion versions supported by the Slip mutation in Avida

downstream location with uniform probability.
Results from a preliminary study, conducted in the Logic-9 environment, indicated that
slip mutation versions that inserted duplicate or scrambled genomic program segments signiﬁcantly improved evolutionary potential. The increased potential was borne out both as
a more rapid acquisition of ﬁtness relative to all other treatments, and as a greater median
ﬁnal ﬁtness. Building o↵ of this preliminary work, I have systematically explored how slip
mutations impact evolutionary potential across all seven environments.

5.4.1

Mutation Rates

Mutation rate and the balance of substitution and insertion/deletion mutations both play
critical roles in determining evolutionary potential, as demonstrated previously in this chapter. The rate at which slip mutations occur likely needs to be similarly calibrated for optimal
evolutionary potential with respect to the standard mutations. I have examined the evolutionary potential of slip mutations, featuring duplication insertion sequences, at four distinct
rates proportional to the substitution and insertion/deletion rates, 0.3, 0.1, 0.03, and 0.01,
while holding the overall mutation rate of the system constant at 0.3 genomic.

104

Logic-9
Control
0.3
0.1
0.03
0.01

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

23.37

28.67

0.361

4.691

0.028

4.395

6.275

(14.10, 23.94)

(9.12, 63.77)

(0.181, 0.87)

(1.643, 8.56)

(-0.650, 1.238)

(3.098, 13.742)

(1.089, 21.199)

23.98

44.84

0.288

7.018

-0.223

6.310

1.828

(-0.23, 24.33)

(-0.23, 71.46)

(-0.008, 0.85)

(3.506, 8.08)

(-0.239, -0.210)

(-0.227, 14.118)

(0.817, 17.660)

23.87

42.73

0.218

7.373

-0.231

7.584

1.828

(-0.23, 24.22)

(-0.23, 72.00)

(-0.041, 0.74)

(3.256, 21.52)

(-0.284, 0.037)

(-0.239, 23.278)

(0.807, 30.084)

23.91

37.87

0.295

5.518

-0.239

7.798

4.151

(15.07, 24.19)

(21.17, 67.94)

(-0.237, 0.71)

(3.250, 12.89)

(-0.297, 0.615)

(-0.152, 23.476)

(0.814, 28.909)

23.90

38.69

0.317

6.780

-0.272

6.564

3.750

(8.68, 24.10)

(21.01, 70.97)

(-0.232, 0.74)

(3.124, 20.53)

(-0.696, 0.433)

(2.298, 23.479)

(1.456, 23.998)

Table 5.14: Slip Duplication Mutation Rate Fitness
Fitness results for tested slip duplication mutation rates. Each entry shows the median log2 population mean ﬁtness in the
respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05) deviations.

105

Logic-9
Control
0.3
0.1
0.03
0.01

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

0.950

0.377

0.153

0.239

2.29⇥10 4

0.942

13.39⇥10 3

(0.712, 0.976)

(0.130, 0.794)

(0.142, 0.227)

(0.112, 0.463)

(0.000, 0.001)

(0.844, 0.961)

(0.004, 0.035)

0.956

0.506

0.154

0.347

0.0

0.951

4.05⇥10 3

(0.000, 0.978)

(0.000, 0.788)

(0.078, 0.212)

(0.183, 0.472)

(0.000, 0.000)

(0.000, 0.973)

(0.002, 0.034)

0.957

0.477

0.154

0.358

0.0

0.940

4.07⇥10 3

(0.000, 0.977)

(0.000, 0.877)

(0.100, 0.229)

(0.171, 0.927)

(0.000, 0.000)

(0.000, 0.969)

(0.002, 0.055)

0.964

0.423

0.155

0.282

0.0

0.944

8.25⇥10 3

(0.582, 0.977)

(0.269, 0.790)

(0.000, 0.222)

(0.171, 0.629)

(0.000, 0.000)

(0.069, 0.964)

(0.002, 0.052)

0.963

0.431

0.155

0.319

0.000

0.944

8.16⇥10 3

(0.427, 0.980)

(0.277, 0.891)

(0.000, 0.228)

(0.159, 0.943)

(0.000, 0.000)

(0.598, 0.963)

(0.004, 0.041)

Table 5.15: Slip Duplication Mutation Rate Task Success
Normalized task success results for tested slip duplication mutation rates. Each entry shows the median log2 population mean
ﬁtness in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05)
deviations.

106

Fitness results, presented in Table 5.14, show that slip mutations at all tested rates significantly increase evolutionary potential in the Logic-9, Logic-77, Fibonacci-32, and Limited-9
environments. The Match-12 and Navigation environments demonstrated the opposite, with
reduced median ﬁtness under all rates, though generally statistically identical to the control.
The Sort-10 environment demonstrates substantial loss of ﬁtness with all tested rates.
Task success results, shown in Table 5.15, demonstrate shifts that correlate with ﬁtness.
However, unlike the ﬁtness results, most of the observed di↵erences are not signiﬁcant.
Only Logic-9, with the 0.03 and 0.01 rates, and Fibonacci-32, with the 0.3 and 0.1 rates,
demonstrate notable improvements.
These results indicate that slip mutations may present a tradeo↵ in determining evolutionary potential. Some environments yield improved results, while others experience the
opposite. However, the inﬂection point in evolutionary potential is largely insensitive to the
range of rates that I have tested here. Given that, I arbitrarily selected the 0.1 rate for use
in all of the remaining experiments in this section.

5.4.2

Insertion Sequence Content

Slip mutations have demonstrated improved evolutionary potential in a subset of the environments explored here, the Logic-9, Logic-77, Fibonacci-32, and Limited-9 environments.
There are multiple possible explanations for why slip mutations help those environments.
Gene (or subroutine) duplication of complete functionality may be essential for creating and
exploiting building blocks of functionality. Alternatively, it may simply be large changes in
genome length that is beneﬁcial, inserting chunks of unused genetic code. Existing somewhere between these two extremes of duplicated code and random (or blank) genetic material,
107

scrambled genetic sequences may allow for favorable distributions of instructions while disentangling epistasis present in previously evolved functions. In order to test each of these
explanations, I implemented four versions of the slip mutation that vary the content of the
inserted sequence (see Table 5.13). Each of these versions was tested in all seven environments.
The Slip-Random and Slip-Nop-X versions, testing two forms of “blank” tape, were
both statistically equivalent to the control in all but the Sort-10 and Navigation environments for both ﬁtness and task success (see Tables 5.16 and 5.17). The Sort-10 environment
demonstrated signiﬁcantly reduced evolutionary potential with both insertion sequence versions, while the Navigation environment was a↵ected negatively only by the Slip-Nop-X
version.
The Slip-Duplicate and Slip-Scramble versions both demonstrated equivalent evolutionary potential, substantially improving ﬁtness results in the Logic-9, Logic-77, and
Limited-9 environments. The Fibonacci-32 environment showed increased median ﬁtness,
though these were not signiﬁcant after Bonferonni correction. The Match-12, Sort-10, and
Navigation environments all had reduced performance with these two slip mutation versions.
Overall, these results match well with previous observations, indicating that environments
helped by slip mutations beneﬁt equally from duplicated and scrambled insertion sequences.
The distribution of instructions in inserted sequences contributes substantially to their utility
for exploiting evolutionary potential. However, the utility of slip mutations remains sensitive
to the environment, regardless of insertion content.

108

Logic-9

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

21.58

30.66

0.372

5.132

0.109

4.298

4.002

(12.86, 24.00)

(8.57, 60.50)

(0.140, 0.67)

(1.932, 22.22)

(-0.740, 1.324)

(2.837, 14.264)

(1.111, 25.834)

SlipDuplicate

23.95

43.77

0.241

5.969

-0.226

8.480

1.821

(-0.24, 24.26)

(-0.23, 71.54)

(-0.001, 0.69)

(3.898, 14.79)

(-0.267, -0.051)

(2.253, 22.833)

(0.776, 25.595)

SlipScramble

24.03

44.50

0.276

5.925

-0.229

6.539

1.809

(-0.23, 24.23)

(-0.23, 65.88)

(0.044, 0.72)

(3.706, 8.33)

(-0.284, 0.576)

(2.088, 17.902)

(0.821, 5.426)

SlipRandom

23.14

29.01

0.301

5.849

-0.352

3.546

3.655

(-0.35, 23.99)

(7.06, 66.41)

(-0.376, 0.66)

(1.696, 8.31)

(-0.522, -0.316)

(0.497, 11.188)

(1.325, 29.682)

22.38

31.63

0.309

4.769

-0.362

3.792

0.981

(-0.38, 24.02)

(-0.39, 68.59)

(-0.047, 0.72)

(1.367, 22.10)

(-0.466, -0.107)

(-0.387, 14.290)

(0.555, 11.855)

Control

Slip-Nop-X

Table 5.16: Slip Mutation Insertion Versions Fitness
Fitness results of all four slip mutation versions. Each entry shows the median log2 population mean ﬁtness in the respective
environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05) deviations.

109

Logic-9

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

0.941

8.482⇥10 3

0.910

0.391

0.153

0.291

2.42⇥10 4

(0.643, 0.973)

(0.123, 0.792)

(0.145, 0.223)

(0.121, 0.963)

(0.90, 7.08)

(0.792, 0.969)

(3.964, 51.545)

SlipDuplicate

0.963

0.503

0.153

0.294

0.00⇥10 4

0.943

4.063⇥10 3

(0.000, 0.977)

(0.000, 0.872)

(0.079, 0.228)

(0.199, 0.625)

(0.00, 0.98)

(0.599, 0.970)

(2.038, 46.883)

SlipScramble

0.965

0.492

0.154

0.291

0.00⇥10 4

0.950

4.041⇥10 3

(0.000, 0.979)

(0.000, 0.805)

(0.119, 0.225)

(0.180, 0.452)

(0.00, 3.73)

(0.572, 0.966)

(2.040, 10.204)

SlipRandom

0.949

0.391

0.154

0.316

0.00⇥10 4

0.946

8.214⇥10 3

(0.000, 0.976)

(0.100, 0.815)

(0.000, 0.223)

(0.093, 0.453)

(0.00, 1.05)

(0.483, 0.968)

(3.563, 56.061)

0.869

0.389

0.153

0.264

0.00⇥10 4

0.946

2.815⇥10 3

(0.000, 0.972)

(0.000, 0.823)

(0.125, 0.229)

(0.081, 0.970)

(0.00, 1.89)

(0.000, 0.964)

(2.030, 23.641)

Control

Slip-Nop-X

Table 5.17: Slip Mutation Insertion Versions Task Success
Normalized task success results of all four slip mutation versions. Each entry shows the median normalized task success in the
respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05) deviations.

110

Logic-9

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

23.03

30.73

0.378

4.817

0.018

4.142

1.316

(14.80, 24.02)

(10.93, 62.77)

(0.032, 0.50)

(2.350, 8.31)

(-0.726, 1.325)

(3.134, 10.092)

(1.087, 14.637)

TransDuplicate

23.89

42.62

0.250

7.068

-0.230

6.961

4.186

(9.38, 24.15)

(-0.23, 64.89)

(0.047, 0.61)

(4.052, 13.23)

(-0.702, 0.740)

(4.132, 14.846)

(0.808, 33.287)

TransScramble

23.91

43.95

0.272

6.064

-0.230

6.391

1.816

(9.65, 24.10)

(23.49, 70.03)

(-0.093, 0.62)

(3.222, 13.04)

(-0.239, 0.398)

(2.464, 9.906)

(0.817, 16.947)

Control

Table 5.18: Gene Translocation Fitness
Fitness results of both gene translocations. Each entry shows the median log2 population mean ﬁtness in the respective
environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05) deviations.

Logic-9

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

0.943

0.403

0.152

0.237

2.25⇥10 4

0.942

4.037⇥10 3

(0.749, 0.975)

(0.151, 0.809)

(0.117, 0.159)

(0.138, 0.458)

(0.91, 6.94)

(0.803, 0.965)

(3.967, 26.673)

TransDuplicate

0.965

0.491

0.154

0.343

0.00⇥10 4

0.949

8.398⇥10 3

(0.472, 0.977)

(0.000, 0.760)

(0.129, 0.228)

(0.210, 0.644)

(0.00, 3.50)

(0.922, 0.964)

(2.042, 64.116)

TransScramble

0.963

0.505

0.154

0.294

0.00⇥10 4

0.949

4.048⇥10 3

(0.486, 0.973)

(0.301, 0.811)

(0.075, 0.219)

(0.171, 0.616)

(0.00, 2.28)

(0.600, 0.969)

(2.046, 32.459)

Control

Table 5.19: Gene Translocation Task Success
Normalized task success results of both gene translocation versions. Each entry shows the median normalized task success
in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05)
deviations.

111

5.4.3

Gene Translocation

The development of the slip mutation, as tested above, was inspired by the idea that the
mechanisms responsible for replication could “slip” and copy or skip regions of the genome.
The controls detailed above allow for several variations in the distribution of instructions
that may be inserted into a region, including copied and scrambled from the region immediately surrounding the current copy location. The source of these fragments, however, was
restricted to be immediately adjacent to the insertion location. Gene translocation relaxes
that restriction, allowing the genome fragment to be selected from any location within the
current genome. I tested the impact of this relaxation with two new versions of the slip
mutation. These versions support the insertion of copied and scrambled sequences selected
at random from any location in the genome.
Gene translocation, in both duplication and scrambled forms, demonstrated signiﬁcantly
improved ﬁtness in the Logic-9, Logic-77, Fibonacci-32, and Limited-9 environments (see Tables 5.18 and 5.19). These results match well with conventional slip mutations, tested above.
In contrast to slip mutations, gene translocation mutations demonstrated correspondingly
substantial improvements in task success in all but one treatment, Trans-Duplication
in the Logic-77 environment. Additionally, the Navigation environment demonstrated substantially improved ﬁtness with Trans-Duplication mutation. The Match-12 and Sort-10
environments, on the other hand, demonstrated signiﬁcantly reduced evolutionary potential,
similar to previous slip mutation results.
Overall, these results indicate that limiting the source of genetic insertions to neighboring
regions of the genome, as featured in slip mutations, is unnecessary. Indeed, relaxing that
restriction with gene translocation mutations unlocks additional evolutionary potential in
112

favorable environments and may help mitigate the detrimental e↵ects in others.

5.4.4

Discussion

Slip mutations that insert duplicated or scrambled genetic code substantially increase evolutionary potential, especially in the gene translocation form. Five of the environments, the
Logic-9, Logic-77, Fibonacci-32, Limited-9, and Navigation environments all beneﬁted from
these mutations. The Match-12 and Sort-10 environments did show reduced ﬁtness and task
success with these mutations. However, as observed throughout Chapter 4, both of these
environments are particularly di cult. The instruction set, especially the Heads architecture used in this chapter, simply does not have computational power necessary to e↵ectively
tackle the Sort-10 environment. Similarly, the lack of support for building arbitrary numbers
likely limits the potential in the Match-12 environment. Given instruction set improvements
supporting these two environments, it is possible that slip mutations may positively support
the evolutionary potential of populations in them.

113

5.5

Side-e↵ects of Mutations

Unstable genotypes produce o↵spring whose genomes di↵er from the parent, even in the context of no new explicit mutations occurring. These “implicit mutations” result from errors in
the replication algorithm of the parent organism. Avida o↵ers the ability to prevent unstable
genotypes from propagating in order to allow for full control of the exact mutation characteristics of an experiment. The consequences of this control mechanism on evolutionary
potential, however, have not previously been explored. Preventing the reproduction of these
genotypes causes what may otherwise be neutral or beneﬁcial mutations to be converted into
lethal mutations. I have explored the ramiﬁcations of sterilizing unstable genotypes within
my evolutionary potential framework.
In general, sterilizing unstable genotypes made no substantial impact on evolutionary
potential in all but one environment, the Logic-77 environment (see Tables 5.20 and 5.21).
Although the Match-12, Limited-9, and Navigation environments each featured a signiﬁcant variation in either ﬁtness or task success, none were accompanied with a corresponding
change in the other metric. The Logic-77 environment, however, demonstrated highly signiﬁcant and substantial declines in both median log2 ﬁtness (47.8%) and median normalized
task success (44.3%), indicating that in this environment unstable genotypes play an important role in exploring the ﬁtness landscape to exploit evolutionary potential.

114

Logic-9
Control
Sterilize
Unstable

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

23.19

37.11

0.352

4.955

0.162

4.317

4.125

(13.06, 23.91)

(9.90, 66.25)

(-0.053, 0.64)

(1.894, 13.24)

(-0.733, 1.164)

(3.415, 19.058)

(1.126, 25.009)

23.00

19.38

0.399

5.344

-0.020

4.192

2.720

(14.06, 23.95)

(8.86, 56.23)

(0.218, 0.78)

(1.858, 22.01)

(-0.693, 1.288)

(3.318, 9.398)

(1.035, 40.186)

Table 5.20: Sterilize Unstable — Fitness
Fitness results when sterilizing unstable genotypes. Each entry shows the median log2 population mean ﬁtness in the respective
environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05) deviations.

Logic-9
Control
Sterilize
Unstable

Logic-77

Match-12

Fibonacci-32

Sort-10

Limited-9

Navigation

0.950

0.470

0.153

0.274

2.54⇥10 4

0.942

8.500⇥10 3

(0.648, 0.978)

(0.139, 0.847)

(0.077, 0.182)

(0.122, 0.638)

(0.56, 5.55)

(0.802, 0.966)

(3.965, 43.938)

0.960

0.262

0.153

0.291

2.13⇥10 4

0.954

6.210⇥10 3

(0.694, 0.976)

(0.126, 0.717)

(0.148, 0.223)

(0.122, 0.992)

(0.95, 6.96)

(0.786, 0.970)

(3.813, 79.650)

Table 5.21: Sterilize Unstable Genotypes — Task Success
Normalized task success results when sterilizing unstable genotypes. Each entry shows the median normalized task success
in the respective environment, with 95% conﬁdence intervals in parentheses. Bold entries indicate signiﬁcant (p < 0.05)
deviations.

115

5.6

Conclusion

Exploration of four aspects of mutations (types of mutations, timing of mutation application,
mutations of large e↵ect, and the side-e↵ects of mutations) highlighted that digital evolution
is surprisingly robust to variation in mutations. Although some minor variation in
speciﬁc environment and mutation combinations was observed, the overall mutation rate
demonstrated the greatest impact on evolutionary potential.

5.7

Future Work

There are many potential avenues of further exploration that would be worthwhile investigating, but are beyond the scope of this dissertation (indeed, some could be the topic of an
entire dissertation unto themselves). Here I describe several possibilities that I considered.

5.7.1

Comparing Sexual and Asexual Reproduction

Genetic recombination, such as which occurs during sexual reproduction, has been shown
to have a broad range of e↵ects on genetic architecture (Misevic et al., 2006). Evolutionary
computation makes frequent use of it as a primary search operator, especially in genetic algorithms (Holland, 1975). The e↵ects on evolutionary potential of sexual reproduction could
be systematically explored in digital organisms under multiple recombination probabilities,
module numbers, genome alignment techniques, and mate-selection models.
116

5.7.2

Execution Errors

Up to this point, all mutation types discussed create lasting changes to the genome of o↵spring, parents, or both. Mutations in natural systems are not limited to this level of activity.
Indeed, it is possible for errors to occur at almost any level. Execution errors are thought to
be a strong force in evolving more robust code, and especially redundancies (that otherwise
wouldn’t have a strong selective pressure). In digital organisms, however, the virtual CPU
activity is generally error free, producing deterministic results. Introducing execution errors
and/or failures, however, may have consequences on the structure and robustness of the
evolved genetic program.

5.7.3

Mutation Changes During Evolution

All of the work proposed above examines mutations as discrete, ﬁxed sets. In natural
evolution, however, the mechanisms of replication and expression are themselves evolving.
Through this evolution, new types of mutations may become possible, rates of mutations
can change, the distribution of substitutions may vary. Indeed almost all characteristics of
mutations can change over time.
It is possible, in principle, for implicit mutations to arise out of the replication process
in a digital evolution system, altering the mutations of the system. These speciﬁc types of
mutations are di cult to intentionally create and control. However, it is possible to explicitly
change mutation rates during a run. Utilizing such changes, the consequences of changing
mutations during evolution could be systematically explored.

117

Chapter 6
Conclusion and Future Work
The research that I have conducted here contributes generally to our understanding of how
history, mutations, and genetic hardware a↵ect evolutionary potential in evolving systems.
The results provide insights applicable for the prediction of future evolutionary directions.
Knowledge of the roles these aspects play in determining evolution potential will have important implications in both evolutionary computation and evolutionary biology, allowing
for the alteration of potential as desired. My work has shown that evolution is quite robust
in the face of all of these factors. Most changes are innocuous, yielding little impact on the
overall evolutionary potential of populations. Some changes, however, can be critical. In
the history case, for example, most mutations will make no di↵erence to the evolutionary
outcomes at all, but the right mutation can create a strong shift in potential. Still, multiple
instances demonstrated the possibility for signiﬁcantly improving performance.
The work presented will facilitate improvements in our ability to harness digital evolution.
Beyond speciﬁc improvements, the work demonstrates generalizable observations, methods,
and system design guidelines for maximizing evolutionary potential. Importantly, the work
118

also demonstrates the robustness of the evolutionary process to perturbations of fundamental
aspects. In many ways, this helps us understand how evolution functions so well on Earth,
where many of the rules are quite arbitrary.
The work presented in this dissertaion covers three major aspects of evolution that a↵ect
evolutionary potential. I did not explore spatial structure, a↵ecting gene ﬂow and growth
rates. The interplay between spatial structure and mutations, including sexual reproduction,
is known to be important, but many details of how and why remain unclear. As continuation
of this work, I plan to pursue the exploration of spatial structure and its e↵ect on evolutionary
potential.
In the discussion above I indicated that mutations and genetic hardware both have potential implications for genomic architecture, and that genetic architecture may be responsible
for some of the observed contingencies of history. For example, I think high levels of epistasis will reduce evolutionary potential because the landscape becomes too rugged to easily
traverse. Examining epistasis might give us many insights into the evolutionary potential
of a population. I believe that important insights would be found investigating exactly how
the above changes should alter genetic architecture. These insights may inform and improve
future genetic hardware changes, as well as mutation design and selection.

119

GLOSSARY

120

Glossary
head Hardware element that points to an instruction located in the genomic program.. 6,
7, 41, 54
mutational neighborhood All potential mutant genotypes with up to a speciﬁed number
of mutations surrounding a reference genotype. 14, 23, 24, 32, 35, 36
nop no-operation instruction. 7–9, 40–43, 46, 48, 77
PLoD principal line of descent. 24
principal line of descent The complete genotypic lineage of the most abundant genotype
at the end of a given experiment, starting from the ancestral genotype.. 18, 23, 24, 32,
36, 78
STEP short-term evolutionary potential. 14, 23, 32, 35, 78, 79, 81

121

APPENDICES

122

Appendix A
Complete History Mutational
Neighborhood Results
The mutational neighborhood results for all lineages analyzed in Section 3.2.2 are presented
here. In each ﬁgure, the top graph shows the absolute probability of a single mutation conferring performance of the EQU function. The middle graph of each ﬁgure shows the absolute
probability of two mutations occurring simultaneously that, combined, confer performance
of EQU. Lastly, the bottom graph shows the quality of individual mutations within two-step
pathways when taken alone, grouped into four general classes: beneﬁcial (green), neutral
(blue), deleterious (red), lethal (black).

123

Figure A.1: Mutational neighborhood of the PLoD from population 101
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.2: Mutational neighborhood of the PLoD from population 102
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
124

Figure A.3: Mutational neighborhood of the PLoD from population 103
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.4: Mutational neighborhood of the PLoD from population 104
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
125

Figure A.5: Mutational neighborhood of the PLoD from population 105
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.6: Mutational neighborhood of the PLoD from population 106
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
126

Figure A.7: Mutational neighborhood of the PLoD from population 107
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.8: Mutational neighborhood of the PLoD from population 108
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
127

Figure A.9: Mutational neighborhood of the PLoD from population 109
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.10: Mutational neighborhood of the PLoD from population 110
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
128

Figure A.11: Mutational neighborhood of the PLoD from population 111
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.12: Mutational neighborhood of the PLoD from population 112
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
129

Figure A.13: Mutational neighborhood of the PLoD from population 113
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.14: Mutational neighborhood of the PLoD from population 114
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
130

Figure A.15: Mutational neighborhood of the PLoD from population 115
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.16: Mutational neighborhood of the PLoD from population 116
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
131

Figure A.17: Mutational neighborhood of the PLoD from population 117
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.18: Mutational neighborhood of the PLoD from population 118
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
132

Figure A.19: Mutational neighborhood of the PLoD from population 119
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.20: Mutational neighborhood of the PLoD from population 120
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
133

Figure A.21: Mutational neighborhood of the PLoD from population 121
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.22: Mutational neighborhood of the PLoD from population 122
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
134

Figure A.23: Mutational neighborhood of the PLoD from population 123
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.24: Mutational neighborhood of the PLoD from population 124
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
135

Figure A.25: Mutational neighborhood of the PLoD from population 125
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.26: Mutational neighborhood of the PLoD from population 126
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
136

Figure A.27: Mutational neighborhood of the PLoD from population 127
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.28: Mutational neighborhood of the PLoD from population 128
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
137

Figure A.29: Mutational neighborhood of the PLoD from population 129
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.30: Mutational neighborhood of the PLoD from population 130
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
138

Figure A.31: Mutational neighborhood of the PLoD from population 132
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.32: Mutational neighborhood of the PLoD from population 132
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
139

Figure A.33: Mutational neighborhood of the PLoD from population 133
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.34: Mutational neighborhood of the PLoD from population 134
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
140

Figure A.35: Mutational neighborhood of the PLoD from population 135
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.36: Mutational neighborhood of the PLoD from population 136
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
141

Figure A.37: Mutational neighborhood of the PLoD from population 137
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.38: Mutational neighborhood of the PLoD from population 138
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
142

Figure A.39: Mutational neighborhood of the PLoD from population 139
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.40: Mutational neighborhood of the PLoD from population 140
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
143

Figure A.41: Mutational neighborhood of the PLoD from population 141
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.42: Mutational neighborhood of the PLoD from population 142
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
144

Figure A.43: Mutational neighborhood of the PLoD from population 143
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.44: Mutational neighborhood of the PLoD from population 144
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
145

Figure A.45: Mutational neighborhood of the PLoD from population 145
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.46: Mutational neighborhood of the PLoD from population 146
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
146

Figure A.47: Mutational neighborhood of the PLoD from population 147
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.48: Mutational neighborhood of the PLoD from population 148
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
147

Figure A.49: Mutational neighborhood of the PLoD from population 149
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).

Figure A.50: Mutational neighborhood of the PLoD from population 150
Fitness classes shown: beneﬁcial (green), neutral (blue), deleterious (red), lethal (black).
148

Appendix B
Complete History STEP Sampling
Results
The STEP sampling results for all lineages analyzed in Section 3.2.3 are presented here.

Figure B.1: STEP sampling along the PLoD from population 101.

149

Figure B.2: STEP sampling along the PLoD from population 102.

Figure B.3: STEP sampling along the PLoD from population 103.

150

Figure B.4: STEP sampling along the PLoD from population 104.

Figure B.5: STEP sampling along the PLoD from population 105.

151

Figure B.6: STEP sampling along the PLoD from population 106.

Figure B.7: STEP sampling along the PLoD from population 107.

152

Figure B.8: STEP sampling along the PLoD from population 108.

Figure B.9: STEP sampling along the PLoD from population 109.

153

Figure B.10: STEP sampling along the PLoD from population 110.

Figure B.11: STEP sampling along the PLoD from population 111.

154

Figure B.12: STEP sampling along the PLoD from population 112.

Figure B.13: STEP sampling along the PLoD from population 113.

155

Figure B.14: STEP sampling along the PLoD from population 114.

Figure B.15: STEP sampling along the PLoD from population 115.

156

Figure B.16: STEP sampling along the PLoD from population 116.

Figure B.17: STEP sampling along the PLoD from population 117.

157

Figure B.18: STEP sampling along the PLoD from population 118.

Figure B.19: STEP sampling along the PLoD from population 119.

158

Figure B.20: STEP sampling along the PLoD from population 120.

Figure B.21: STEP sampling along the PLoD from population 121.

159

Figure B.22: STEP sampling along the PLoD from population 122.

Figure B.23: STEP sampling along the PLoD from population 123.

160

Figure B.24: STEP sampling along the PLoD from population 124.

Figure B.25: STEP sampling along the PLoD from population 125.

161

Figure B.26: STEP sampling along the PLoD from population 126.

Figure B.27: STEP sampling along the PLoD from population 127.

162

Figure B.28: STEP sampling along the PLoD from population 128.

Figure B.29: STEP sampling along the PLoD from population 129.

163

Figure B.30: STEP sampling along the PLoD from population 130.

Figure B.31: STEP sampling along the PLoD from population 131.

164

Figure B.32: STEP sampling along the PLoD from population 132.

Figure B.33: STEP sampling along the PLoD from population 133.

165

Figure B.34: STEP sampling along the PLoD from population 134.

Figure B.35: STEP sampling along the PLoD from population 135.

166

Figure B.36: STEP sampling along the PLoD from population 136.

Figure B.37: STEP sampling along the PLoD from population 137.

167

Figure B.38: STEP sampling along the PLoD from population 138.

Figure B.39: STEP sampling along the PLoD from population 139.

168

Figure B.40: STEP sampling along the PLoD from population 140.

Figure B.41: STEP sampling along the PLoD from population 141.

169

Figure B.42: STEP sampling along the PLoD from population 142.

Figure B.43: STEP sampling along the PLoD from population 143.

170

Figure B.44: STEP sampling along the PLoD from population 144.

Figure B.45: STEP sampling along the PLoD from population 145.

171

Figure B.46: STEP sampling along the PLoD from population 146.

Figure B.47: STEP sampling along the PLoD from population 147.

172

Figure B.48: STEP sampling along the PLoD from population 148.

Figure B.49: STEP sampling along the PLoD from population 149.

173

Figure B.50: STEP sampling along the PLoD from population 150.

174

References

175

References
Beatty, J. (2006). Replaying Life’s tape. Journal of Philosophy, 103:336–362.
Blount, Z. D., Borland, C. Z., and Lenski, R. E. (2008). Historical contingency and the
evolution of a key innovation in an experimental population of Escherichia coli. Proceedings
of the National Academy of Sciences, 105:7899–7906.
Bryson, D. M. and Ofria, C. (2012). Digital Evolution Exhibits Surprising Robustness to
Poor Design Decisions. In Adami, C., Bryson, D. M., Ofria, C., and Pennock, R. T.,
editors, Artiﬁcial Life 13: Proceedings of the Thirteenth International Conference on the
Simulation and Synthesis of Living Systems, pages 19–26, East Lansing, MI.
Conner, J. K. and Hartl, D. L. (2004). A Primer of Ecological Genetics. Sinauer Associates,
Sunderland, MA, 1st edition.
Conway Morris, S. (2003). Life’s Solution: Inevitable Humans in a Lonely Universe. Cambridge University Press, Cambridge, UK.
Cooper, T. F. and Ofria, C. (2003). Evolution of stable ecosystems in populations of digital
organisms. In Standish, R. K., Bedau, M. A., and Abbass, H. A., editors, Artiﬁcial Life
VIII: Proceedings of the Eighth International Conference on Artiﬁcial life, pages 227–232,
Cambridge, MA. International Society of Artiﬁcial Life, MIT Press.
Dawkins, R. (1996a). Climbing Mount Improbable. W. W. Norton & Company, New York.
Dawkins, R. (1996b). The Blind Watchmaker. W. W. Norton & Company, New York, third
edition.
Eby, D., Averill, R. C., Punch, W. F., and Goodman, E. D. (1999). The Optimization of
Flywheels using an Injection Island Genetic Algorithm. In Bentley, P., editor, Evolutionary
Design by Computers, pages 167–190. Morgan Kaufmann, San Francisco, California, USA.
Fogel, D. B. (1995). Evolutionary Computation: Toward a New Philosophy of Machine
Intelligence. IEEE Press, Piscataway, NJ.
Foote, M. (1998). Contingency and Convergence. Science, 280(5372):2068–2069.
Gould, S. J. (1989). Wonderful Life: The Burgess Shale and the Nature of History. W. W.
Norton & Company, New York.
176

Grabowski, L. M., Bryson, D. M., Dyer, F. C., Pennock, R. T., and Ofria, C. (2011).
Clever Creatures: Case Studies of Evolved Digital Organisms. In Advances in Artiﬁcial
Life, ECAL 2011: Proceedings of the Eleventh European Conference on the Synthesis and
Simulation of Living Systems, pages 276–283. MIT Press.
Grabowski, L. M., Bryson, D. M., Pennock, R. T., Dyer, F., and Ofria, C. (2010). Early
evolution of memory usage in digital organism. In Proceedings of the 12th International
Conference on the Synthesis and Simulation of Living Systems, pages 224–231, Odense,
Denmark.
Holland, J. H. (1975). Adaptation in Natural and Artiﬁcial Systems. University of Michigan
Press, Ann Arbor, MI.
Koza, J. R. (2003). Genetic Programming IV: Routine Human-Competitive Machine Intelligence. Kluwer Academic Publishers, Norwell, MA.
Koza, J. R. and Poli, R. (2005). Genetic Programming. In Burke, E. and Kendal, G., editors,
Search Methodologies. Springer.
Lenski, R., Ofria, C., Pennock, R. T., and Adami, C. (2003). The Evolutionary Origin of
Complex Features. Nature, 423:139–144.
Lenski, R. E., Ofria, C., Collier, T. C., and Adami, C. (1999). Genome complexity, robustness
and genetic interactions in digital organisms. Nature, 400(6745):661–664.
Losos, J. B., Jackman, T. R., Larson, A., de Queiroz, K., and Rodriguez-Schettino, L. (1998).
Contingency and Determinism in Replicated Adaptive Radiations of Island Lizards. Science, 279:2115–2118.
Lynch, M. (2000). The Evolutionary Fate and Consequences of Duplicate Genes. Science,
290(5494):1151–1155.
Mayr, E. (1988). Toward a New Philosophy of Biology: Observations of an Evolutionist.
Harvard University Press, Cambridge, MA, USA.
Misevic, D., Ofria, C., and Lenski, R. E. (2006). Sexual reproduction shapes the genetic
architecture of digital organisms. Proceedings of the Royal Society of London: Biological
Sciences, 273:457–464.
Ofria, C., Adami, C., and Collier, T. (2002). Design of Evolvable Computer Languages.
IEEE Transactions on Evolutionary Computation, 6(4):420–424.
Ofria, C., Bryson, D. M., and Wilke, C. O. (2009). Avida: A Software Platform for Research
in Computational Evolutionary Biology. In Adamatzky, A. and Komosinski, M., editors,
Artiﬁcial Life Models in Software, pages 3–36. Springer-Verlag, London, UK.
Rechenberg, I. (1971). Evolutionsstrategie: Optimierung technischer Systeme und Prinzipien
der biologischen Evolution. PhD thesis, Berlin Technical University.
177

Taylor, T. and Hallam, J. (1998). Replaying Life’s Tape: An Investigation into the Role of
Contingency in Evolution. In Adami, C., Belew, R., Kitano, H., and Taylor, C., editors,
Artiﬁcial Life VI: Proceedings of the Sixth International Conference on Artiﬁcial Life, MA.
International Society of Artiﬁcial Life, MIT Press.
Travisano, M., Mongold, J. A., Bennett, A. F., and Lenski, R. E. (1995). Experimental Tests
of the Roles of Adaptation Chance and History in Evolution. Science, 267:87–90.
Van Valen, L. (1991). How far does contingency rule? Evolutionary Theory, 10:47–52.
Walker, B. and Ofria, C. (2012). Evolutionary Potential is Maximized at Intermediate Diversity Levels. In Adami, C., Bryson, D. M., Ofria, C., and Pennock, R. T., editors, Artiﬁcial
Life 13: Proceedings of the Thirteenth International Conference on the Simulation and
Synthesis of Living Systems, pages 116–120.
Weinreich, D. M. and Chao, L. (2005). Rapid Evolutionary Escape by Large Populations
from Local Fitness Peaks is Likely in Nature. Evolution, 59(5):1175–1182.
Weinreich, D. M., Delaney, N. F., DePristo, M. A., and Hartl, D. L. (2006). Darwinian
Evolution Can Follow Only Very Few Mutational Paths to Fitter Proteins. Science,
312(5770):111–114.
Wolpert, D. and Macready, W. (1997). No free lunch theorems for optimization. Evolutionary
Computation, IEEE Transactions on, 1(1):67–82.
Wolpert, D. H. and Macready, W. G. (1995). No Free Lunch Theorems for Search. Technical
report.
Wright, S. (1988). Surfaces of Selective Value Revisited. AMERICAN NATURALIST,
131(1):115–123.
Yedid, G. and Bell, G. (2002). Macroevolution simulated with autonomously replicating
computer programs. Nature, 420(6917):810–812.
Yedid, G., Ofria, C. A., and Lenski, R. E. (2009). Selective Press Extinctions, but Not
Random Pulse Extinctions, Cause Delayed Ecological Recovery in Communities of Digital
Organisms. The American Naturalist, 173(4):E139–E154.
Zhang, J. (2003). Evolution by gene duplication: an update. Trends in Ecology and Evolution, 18(6):292–298.

178