AN ANALYSIS OF FITNESS IN LONG
TERM ASEXUAL
EVOLUTION EXPERIMENTS
By Michael J Wiser
A DISSERTATION
Submitted to
Michigan State University
In partial fulfillment of the requirements
for the degree of
Zoology
 Doctor of Philosophy
Ecology, Evolutionar
y Biology and Behavior
 Dual Major
2015
ABSTRACT
AN ANALYSIS OF FITNESS IN LONG
TERM ASEXUAL
EVOLUTION EXPERIMENTS
By Michael J Wiser
Evolution is the central unifying concept of modern biology. Yet it can be
hard to study in natural system, as it unfold
s across generations. Experimental
evolution allows us to ask questions about the process of evolution itself: How
repeatable is the evolutionary process? How predictable is it? How general are
the results? To address these questions, my collaborators a
nd I carried out
experiments both within the Long
Term Evolution Experiment (LTEE) in the
bacteria
Escherichia coli
, and the digital evolution software platform Avida.
In Chapter 1, I focused on methods. Previous research in the LTEE has
relied on one pa
rticular way of measuring fitness
, which we know becomes less
precise as fitness differentials increase
. I therefore decided to test whether two
alternate ways of measuring fitness would improve precision, using one focal
population. I found that all thr
ee methods yielded similar results in both fitness
and coefficient of variation
, and thus we should retain the traditional method
. In Chapter 2
, I turned to measuring fitness in each of the populations.
Previous work had considered fitness to change as a
hyperbola. A hyperbolic
function is bounded, and predicts that fitness will asymptotically approach a
defined upper bound; however, we knew that fitness in these populations
routinely exceeded the asymptotic limit calculated from a hyperbola fit to the
earlier data. I instead used to a power law, a mathematical function that does not
have an upper bound. I found that this function substantially better describes
fitness in this system, both among the whole set of populations, and in most of
the individua
l populations. I also found that the power law models fit on just early
subsets of the data accurately predict fitness far into the future. This implies that
populations, even after 50,000 generations of evolution in consistent
environment, are so far fr
om the tops of fitness peaks that we cannot detect
evidence of those peaks.
In Chapter 3, I examined to how variance in fitness changes over long
time scales. The among
population variance over time provides us information
about the adaptive landscape on
which the populations have been evolving. I
found that among
population variance remains significant. Further, competitions
between evolved pairs of p
opulations reveal additional
details about fitness
trajectories than can be
seen
from competitions agai
nst the ancestor. These
results demonstrate that our populations have been evolving on a complex
adaptive landscape.
In Chapter 4, I examined whether the patterns found in Chapter 2 apply to
a very different evolutionary system, Avida. This system incorp
orates many
similar evolutionary pressures as the LTEE, but without the details of cellular
biology that underlie nearly all organic life. I find that in both the most complex
and simplest environments in Avida, fitness also follows the same power law
dyn
amics as seen in the LTEE. This implies that power law dynamics may be a
general feature of evolving systems, and not dependent on the specific details of
the system being studied.
Copyright by
MICHAEL J WISER
2015
ACKNOWLEDGEMENTS
In my time at Michi
gan State University, I™ve been fortunate to receive the
help and support of a wide range of individuals. I cannot take the space to
acknowledge everyone, but there are some who need to be specifically
mentioned.
My advisor, Richard Lenski, has provided
a superb environment in which
to be a graduate student. He has allowed me freedom to pursue the projects
which interested me, while simultaneously offering ideas of projects related to the
main work in the laboratory whenever my other projects ran into me
thodological
difficulties. His laboratory funding has allowed me to pursue projects merely
because I found them interesting, and not because he already had a grant
specifically tied to that work. He has challenged me to become a better writer, to
improve
my statistical skills, to place my work in the context of the field (forcing
me to at least occasionally see the forest among the trees), and to work
independent of direct oversight.
My committee members, past and present, have each brought their own
additions to my work. Charles Ofria, who was all but officially a second advisor,
taught me how to explain my work to a broader audience. Thomas Schmidt kept
me grounded in the molecular details of my system. Andrew McAdam taught me
to look beyond the mech
anics of a statistical test to what the fundamental point of
the comparison was. Ian Dworkin forced me to be able to defend my
mathematical choices. Arend Hintze provided a useful sanity check when I was
v trying to write up what felt like disparate parts
of my work. Without each of them,
my dissertation would have suffered.
Support staff are essential to any well
functioning enterprise. A few of
them I need to call out for particular recognition. Neerja Hajela, the lab manager
who has kept me supplied
with glassware for the thousands of competitions
assays contained in these pages, and who therefore freed up substantial time for
me to have something resembling a life while a graduate student. Brian Baer,
the computer manager who took care of many techn
ological issues along the
way. Connie James, the administrative assistant in the BEACON center, without
whom the center would grind to a halt. Darcie Zubeck, the financial officer for the
BEACON Center, who has been almost unbelievably kind in helping me
with
reimbursement protocols.
Within both the Lenski and Ofria labs, I™ve overlapped with dozens of
individuals. Here I list just the most notable of the interactions with them. Noah
Ribeck, a postdoc with whom I collaborated on most of my fitness traje
ctory work
in the LTEE. David Bryson, a developer and (former) graduate student with
whom I collaborated on all of the Avida work contained in these pages. Jeffrey
Barrick, a postdoc who started on the same day in the lab as I did, and who
acted as a pse
udo
advisor for several years. Jeffrey Morris, a postdoc who
started later, but with whom I collaborated on multiple projects not in this
dissertation. Caroline Turner, Rohan Maddamsetti, and Alita Burmeister, all
fellow Lenski lab students without whose
help in our laboratory writing group I
may still not have finished writing this dissertation. Emily Dolson and Anya
vi Vostinar, graduate students in the Ofria lab whose help was instrumental in
repeating some of the Avida analyses. Bess Walker, a former st
udent in the
Ofria lab, with whom I discussed most of my work over many years; Justin
Meyer, a former student in the Lenski lab, for the same. Jeffrey Barrick, Jeffrey
Morris, Rohan Maddamsetti, Caroline Turner, Emily Dolson, Brian Connelly, Luis
Zaman, D
avid Knoester, Rosangela Canino
Koning, Daniel Mitchell, and Neem
Serra, all individuals with whom I collaborated with in my time at Michigan State
University on projects not in this dissertation.
Beyond the university, there is also the support of family
. I thank my
brother, Matthew Wiser, for taking a surprisingly long time before we achieved
the point of mutual incomprehensibility in the specifics of our research. And I
thank my father, James Wiser, who never objected to the fact that his science

mind
ed son had no interest in becoming a physician.
vii
TABLE OF CONTENTS
LIST OF TABLES
...................................................................................................x
LIST OF FIGURES
..........................................................
......................................xi
KEY TO ABBREVIATIONS
.................................................................................xiv
CHAPTER
1: A COMPARISON OF METHODS TO MEASURE FITNESS IN
ESCHERICHIA COLI
....................................
.........................................................1
Abstract....
..................................................................................................
.1 Introduction..
....................................................................
..........................
.2 Materials and Methods..
............................................................................
..5 Experimental conditions..
................................................................
.5 Bacterial strains..
........
.....................................................................
.5 Fitness measurements...
..................................................................
6 Statistical methods...
...................................................................
.....
9 Bootstrapping..
..............................................................................
.10
Results and Discussion...
..........................................................................
10 Conclusion
s..
.................................
............................................................
18 Acknowledgements...
................................................................................
18 APPENDIX
.................................................................................
...............
19 REFERENCES.
.........................................................................................
23 CHAPTER 2: LONG
TERM DYNAMICS IN ASEXUAL POPULATIONS
............27
Abstract..........................................................
.............
..............................27
Main Text....................................................................
..............................27
APPENDIX...
..............................................................
.........................
.....40
REFERENCES.....
.......................................................
..............................64
CHATPER 3:
PERSISTENT AMONG
POPULATION VARIANCE IN FITNESS IN
A LONG
TERM EVOLUTION EXPERIMENT WITH
ESCHERICHIA COLI
.........68
Abstract......
.................................................................
..............................68
Introduction.................................................................
..............................69
Meanings of changes in variance..............
.......
..............................72
Study S
ystem..............................................................
..............................73
Previous work.....................................................
............................74
Fitness
assays..................................................
..............................76
Statistical methods...........................................
..............................76
Results and Discussion..............................................
...............................77
Komologrov
Smirnov tests...............................
..............................84
Using po
pulation pairs to examine finer
scale differences.............8
8 Summary.................................................
...................
.............................112
Conclusions................................................................
.............................114
viii
Future Work...............................................................
....................
.........115
Acknowledgements....................................................
.............................116
REFERENCES...
.................................................................................
....117
CHATPER 4
: LONG
TERM DYNAMICS IN ASEXU
AL DIGITAL
POPULATIONS
....................................................................
.............................120
Abstract..............................................................................
.....................120
Introduction........
........................................................
.............................120
Study System.............................................................
.............................121
Experimental conditions..................................
.............................122
Statistical methods..........................................
.............................123
Results and Discussion...........................................................
................124
Logic
77 Environment....
.................................
.............................124
No Task Environment......................................
.............................131
Logic
9 Environment.......................................
.............................139
Conc
lusions................................................................
.............................146
Future Work............................................................................................14
6 Acknowledgements...........................
.........................
.............................146
APPENDIX
.................................................................
.............................148
REFERENCES..
..............................................................................
.......
.152
ix LIST OF TABLES
Table 1.1:
ANOVA on the coefficient of variation across time and comparing
the three methods used to estimate fitness...
.................................
13 Table 1.2:
ANOVA on the coefficient of variation ac
ross time and c
ompari
ng
the Traditional and DCC methods....
..............................................
13 Table 1.3:
Selected evolution experiments...
..................................................
17 Table S1.1:
ANOVAs of fitness for three methods, by generatio
n...
...........
......
.21 Table S2.1:
Differences in Bayesian Information Criteria (BIC) scores between
hyperbolic and power
law model trajectories fit to the measured
fitness values.
...................................................................
............
..60
Table S2.2:
Differences in BIC scores between the hyperbolic and power
law
trajectories fit to the measured fitness values for 12 individual
E.
coli
populations...
.....................................................................
......
.61
Table S2.3:
Analysis of variation to test for heterogenetic ln
g values among the
six populations that maintained the low ancestral mutation rate
throughout the LTEE..
...................................................................
.62
Table S2.4:
Parameter estimates for the power
law model fit to each individual
population's measured fitness values...
.........................................
.63
x LIST OF FIGURES
Figure 1.1:
Fitness trajectories over time––––––––––––––––
12 Figure 1.2:
Coe
fficient of variation over time––––––––––––––..
14 Figure 1.3:
Histogra
m of bootstrap analysis––––––––––––––...
16 Figure S1.1:
Temporal tre
nds in coefficient of variation––––––––––..
20 Figure 2.1:
Fitness changes in nine
E. coli
populations between 4
0,000 and
50,000 generations––––––––––––––––––––.
29 Figure 2.2:
Comparison of
hyperbolic and power
law models––––––
....31
Figure 2.3:
Theoretical mode
l generating
power
law dynamics––––––..
34 Figure 2.4:
Effect of hypermutability on observed and pred
icted fitness
trajectories–––––––––––––––––––––––...
37 Figure S2.1:
Comparison of the fit of the hyperbolic (red) and power
law (blue)
models to the fitness trajectories for t
he 12 individual
Escherichia
coli
populations–––––––––––––––––––––...
53 Figure S2.2:
Comparison of hyperbolic and power
law models in terms of
squared deviat
ions between their fit trajectori
es and measured
grand
mean fitness values over time––––––––––––..
54 Figure S2.3:
Comparison of hyperbolic and power
law models
in their ability to
predict future fitness values fro
m temporally truncated datasets–
55 Figure S2.4:
Parameterization of diminishing
returns epistasis based on the fit of
the dynamic model to the fitness trajectories accords well with
independent
data on the form and streng
th of epistasis from the
LTEE––––––––––––––––––––––––––.
56 Figure S2.5:
Predicted number of beneficial fixation events in relation to the
fitness trajectory, based on the theoretical model with clonal
interference and
diminishing
returns epistasis––––––––..
57 Figure S2.6:
Numerical simulations of fitness trajectories show good agreement
with the theory over a wide range of th
e beneficial mutation rate
––––––––––––––––––––––––––––...
58 Figure S2.7:
Hypothetical
growth kinetics of evolved (blue) and ancestral (black)
competitors that would pro
duce a relative fitness of ~4.7––––
59 xi Figure 3.1:
Among
population standard deviation o
f fitness over the first 10,000
generations acr
oss all populations in the LTEE–
–––––––75 Figure 3.2:
Among
population standard deviation in fitness calculated across
all populations in the LTEE––––––––––––––––...
79 Figure 3.3:
Among
population standard deviation in fitness calculated across
LTEE populations tha
t did not become
hypermutators–––––
81 Figure 3.4:
Among
population standard deviation in fitness, calculated across
LTEE populations tha
t did not become hypermutators–––––
82 Figure 3.5:
Cumulative frequency of p values among ANOVAs used to
calculate among
populat
ion variance in fitness––––––––.
85 Figure 3.6:
Cumulative frequency of p values among ANOVAs used to
calculate among
population variance in fitness––––––––.
86 Figure 3.7:
Cumulative frequency of p values among ANOVAs used to
calculate among
popul
ation variance in fitness––––––––.
87 Figure 3.8:
Ara
1 v Ara+1––––––––––––––––––––––..
92 Figure 3.9:
Ara
1 v Ara+1––––––––––––––––––––––..
93 Figure 3.10:
Ara
1 v Ara+1––––––––––––––––––––––..
96 Figure 3.11:
Ara
4 v Ara+4––––––––––––––––––––––..
97 Figur
e 3.12:
Ara
4 v Ara+4––––––––––––––––––––––..
98 Figure 3.13:
Ara
4 v Ara+4––––––––––––––––––––––..
99 Figure 3.14:
Ara
5 v Ara+5––––––––––––––––––––––101
Figure 3.15:
Ara
5 v Ara+5––––––––––––––––––––––
102 Figure 3.16:
Ara
5 v Ara+5––––––––––––––––––––
––103 Figure 3.17:
Ara
2 v Ara+2––––––––––––––––––––––
105 Figure 3.18:
Ara
2 v Ara+2––––––––––––––––––––––
106 Figure 3.19:
Ara
2 v Ara+2––––––––––––––––––––––
107 Figure 3.20:
Ara
3 v Ara+3––––––––––––––––––––––
109 Figure 3.21:
Ara
3 v Ara+3–––––––––––
–––––––––––110 xii
Figure 3.22:
Ara
3 v Ara+3––––––––––––––––––––––
111 Figure 4.1:
Fitness over t
ime in the Logic
77 environment––––––––
125 Figure 4.2:
Late fitness v
final fit
ness in the Logic
77 environ
ment––...
–..
127 Figure 4.3:
Fitness over t
ime in the
Logic
77 environment––––––––
129 Figure 4.4:
Comparison of model f
its in the Logic
77 environment––––..
130
Figure 4.5:
Fitness over time in the No Task e
nvironment––––––––
133 Figure 4.6:
Late fitness v
final fit
ness in the No Task environment––
.––.134
Figure 4.7:
Fitness over
time in the No Task environment––––––––
136 Figure 4.8:
Comparison of model f
its in the No Task environment––––..1
38 Figure 4.9:
Fitness over time in th
e Logic
9 environment––––––––..
140
Figure 4.10:
Late fitness v
final fit
ness
in the Logic
9 environment–––
.–..
142
Figure 4.11:
Fitness over
time in the Logic
9 environment––––––––..
143
Figure 4.12:
Comparison of model
fits in the Logic
9 environment–––––
144
Figure S4.1:
Diagnostic plots for Logic
77 environment, l
ate fitness as
linear
models–––––––––––––––––––––––––
149 Figure S4.2:
Diagnostic plots for No Task environment, l
ate fitness as linear
models––––––––––––––––––––––––...
150
Figure S4.3:
Diagnostic plots for Logic
9 environment, l
ate fitness as linear
models––––
–––––––––––––––––––––
151 xiii
KEY TO ABBREVIATIONS
ASR: Altered Starting Ratio
BIC: Bayesian Information Criteria
DCC: Different Common Competitor
E. coli: Escherichia coli
LTEE: Long
Term Evolution Experiment
xiv
CHAPTER 1: A COMPARISON OF METHODS TO MEASURE FI
TNESS IN
ESCHERICHIA COLI
Authors: Michael J. Wiser and Richard E. Lenski
Abstract
: In order to characterize the dynamics of adaptation, it is important to be able to
quantify how a population™s mean fitness changes over time. Such measurements are
esp
ecially important in experimental studies of evolution using microbes. The Long
Term Evolution Experiment (LTEE) with
Escherichia coli
provides one such system in
which mean fitness has been measured by competing derived and ancestral
populations. The tr
aditional method used to measure fitness in the LTEE and many
similar experiments, though, is subject to a potential limitation. As the relative fitness of
the two competitors diverges, the measurement error increases because the less
fit
population becom
es increasingly small and cannot be enumerated as precisely. Here,
we present and employ two alternatives to the traditional method. One is based on
reducing the fitness differential between the competitors by using a common reference
competitor from an
intermediate generation that has intermediate fitness; the other
alternative increases the initial population size of the less
fit, ancestral competitor. We
performed a total of 480 competitions to compare the statistical properties of estimates
obtained
using these alternative methods with those obtained using the traditional
method for samples taken over 50,000 generations from one of the LTEE populations.
1 On balance, neither alternative method yielded measurements that were more precise
than the tradit
ional method.
Introduction
: The concept of fitness is central to evolutionary biology. Genotypes with higher
fitness will tend to produce more offspring
and thereby increase in frequency over time
compared to their less
fit competitors. Fitness, howeve
r, is often difficult to measure,
especially for long
lived organisms. Unlike traits such as color, fitness cannot be
observed at a single point in time, but instead it must be measured and integrated
across the lifespan of the individuals. Thus, researc
hers typically measure fitness
components
Œ such as the number of seeds produced or offspring fledged
Œ and use
them as proxies for fitness.
These limitations can be overcome in experimental evolution studies using
microorganisms. Microbes typically have
rapid generations and require little space,
making them attractive for laboratory
based studies. Replicate populations founded
from a common ancestor allow researchers to examine the repeatability of evolutionary
changes. Environments can be controlled,
reducing uninformative variation between
samples or populations and allowing precise manipulations of conditions of interest.
Also, one can often freeze microbial populations at multiple points along an evolutionary
trajectory and revive them later, allo
wing direct comparisons between ancestral and
derived populations
(1, 2). Owing to these advantages, evolution experiments with
microbes are becoming increasingly common
(3Œ5). Thus, it is important to be able to
2 accurately quantify fitness in these experiments, in order to understand the evolutionary
dynamics at work.
One commonly employed method of quantifying microbial f
itness is to calculate
the maximum growth rate (V
max
) of a culture growing on its own
(6Œ10), usually by
measuring the optical density of the culture over time. These measurements have the
advantages of being simple and fast; a spectrophotometer can measure many samples
in a multi
well plate in quick succession, and systems can be programmed
to take
measurements over the full growth cycle of a culture. However, maximum growth rate is
typically only one component of fitness even in the simplest systems
(11), and hence it
provides, at best, only a proxy for fitness.
A second type of fitness measurement comes from studies where microbes are
adapting to stressful compounds, such as antibiotics. In
these situations, researchers
typically quantify the Minimum Inhibitory Concentration (MIC) of the compound, and
those organisms with higher MICs are considered to be more fit in environments that
contain that compound, as it takes more of the substance t
o inhibit their growth
(12, 13). A thir
d approach for quantifying fitness in microbial systems
Šand the approach
that most closely corresponds to the meaning of fitness in evolutionary theory
Šuses a
competition assay. The basic approach is to compete one strain or population against
another and
directly measure their relative contributions to future generations. This
approach typically produces a measure of relative, rather than absolute, fitness.
Relative fitness is more important than absolute fitness when considering the
evolutionary fate o
f a particular genotype, provided that absolute fitness is high enough
to prevent extinction of the entire population
(14, 15). Competitive fitness assays, by
3 measuring the net growth of two different populations, incorporate and integrate
differences across the full culture cycle, which may i
nclude such fitness components as
lag times, exponential growth rates, and stationary phase dynamics in batch culture
(11, 16). Despite their relevance to evolutionary theory, competitive fitness assays
sometimes have practical limitations. In particular, and the focus of our paper, these
measurements are
more precise when the two competitors have similar fitness than
when one is substantially more fit than the other. When one competitor is markedly less
fit, its abundance will decrease over the course of the competition assay, potentially
reaching values
low enough that measurement error has a large impact. Thus, as the
duration of an evolution experiment increases, and the fitness of the evolved organisms
increases relative to the ancestral competitor, the measurement error also tends to
increase, as we
will show in this study.
We used a population from the Long
Term Evolution Experiment (LTEE) with
Escherichia coli
to investigate whether changes in the methods of performing
competition assays
Œ changes meant to reduce the discrepancy in the final abunda
nces
of the competitors
Œ would yield more precise fitness measurements. The LTEE has
been described in detail elsewhere
(1, 17Œ19), and a brief summary is provided in the
Materials and Methods section below. Previous work in this system has established
that changes in V
max
explained much, but not all, of
the improvement in relative fitness
in this system, at least in the early generations
(11).
4 Materials and Methods
: Experimental conditions:
The LTEE is an ongoing experiment that began in 1988, and which has now
surpassed 50,000 bacterial generations. The experiment uses a Davi
s Minimal salts
medium with 25
g/mL glucose (DM25), which supports densities of ~3
5 x 10
7 bacteria
per mL. Each population is maintained in 10 mL of DM25 in a 50
mL glass Erlenmeyer
flask incubated at 37C and shaken at 120 rpm. Every day, each populati
on is diluted
1:100 into fresh media. This dilution sets the number of generations, as the regrowth up
to the carrying capacity allows log
2
Bacterial strains:
The LTEE has 12 populations of
E. coli
(1). Six populations were f
ounded by the
strain REL606
(20) and six by the strain REL607. REL606 is unable to grow on the
sugar arabinose (Ara
Œ); REL607
is an Ara
+ mutant derived from REL606. The DM25
medium does not contain arabinose, and the arabinose
utilization marker is selectively
neutral in the LTEE environment
[20]. In this study, we use both ancestral strains as
well as samples taken from one population, called Ara
1, at generations 500, 1000,
1500, 2000, 5000, 10,000, 15
,000, 20,000, 25,000, 30,000, 35,000, 40,000, 45,000, and
50,000. We also use another strain REL11351, which is an Ara
+ mutant of a clone
isolated from the 5000
generation sample of population Ara
1. 5 Fitness measurements:
We quantify fitness in this
system as the ratio of the realized growth rates of two
populations while they compete for resources in the same flask and under the same
environmental conditions used in the LTEE. This calculation is identical to the ratio of
the number of doublings ach
ieved by the two competitors. In all cases, we compete
samples from the Ara
1 population (including the ancestor REL606) against an Ara
+ competitor (either REL607 or REL11351). We distinguish the two competitors on the
basis of their arabinose
utilizatio
n phenotypes; Ara
Œ and Ara
+ cells produce red and
white colonies, respectively, on Tetrazolium Arabinose (TA) agar plates
(1, 21). We employ three different methods for measuring fitness in
this study. For all
three methods, we begin by removing aliquots of the competitors from the vials in which
they are stored at
Œ80C into separate flasks containing Luria
Bertani (LB) broth. The
cultures grow overnight at 37C and reach stationary phase.
We then dilute each culture
100fold into 0.86% (w/v) saline solution and transfer 100
L into a flask containing 9.9
mL of DM25. These cultures grow for 24 h under the same conditions as the LTEE, so
that all competitors are acclimated to this environment. We then jointly inoculate 100
L in total of the Ara
1 population sample and the Ar
a+ competitor into 9.9 mL of DM25.
We immediately take an initial 100
L sample of this mixture, dilute it in saline solution,
and spread the cells onto a TA plate. The competition mixture is then incubated in the
same conditions as the LTEE for 24 h, at
which point we take a final 100
L sample,
dilute it, and spread the cells onto a TA plate. We count each competitor on the TA
plates, and multiply the numbers by the appropriate dilution factor to determine their
initial and final population sizes. We
calculate fitness as
6 where
w is fitness,
A and
B are the population sizes of the two competitors, subscripts
i and f indicate the initial and final time points in the assay; here, ln refers to the natural
logarithm in order to
reflect population growth, although the ratio used to express fitness
is insensitive to the choice of base used.
For the Traditional method, we measure the relative fitness of the evolved
population samples against the Ara
+ ancestor, REL607. We inoculat
e the competition
flasks with 50
L (an equal volumetric ratio) of each competitor. This method has been
used extensively in evaluating fitness in the LTEE
(1, 2). The Altered Starting Ratio (ASR) method also uses the ancestral Ara
+ strain as
the common competitor. However, we inoculate the competition fl
asks with 20
L of the
evolved population and 80
L of the ancestral population, leading to an initial 1:4
volumetric ratio. This difference in the starting ratio increases the population size of the
ancestor at the end of the competition assay, which reduces the prob
lem of small
numbers when the ancestor is much less fit than the evolved population. The initial
ratio is not so extreme, however, that it is difficult to enumerate the evolved population
at the start of the competition assay. We attempted to keep total
plate counts around a
few hundred colonies, with at least 20 of the minority competitor, to reliably estimate
population densities
(22), and we chose this initial ratio with that objective in mind. It
seemed particularly important to increase the final count of the ancestral population in
7 the context of our fitness measurements;
smaller numbers are subject to increased
sampling error, and the realized growth rate of the ancestor is the denominator when
calculating the relative fitness of the evolved population, which can magnify the
measurement error. More extreme ratios have bee
n used in some experiments testing
invasion when rare
(23), but these ratios would result in minority populations of fewer
than 20 colonies per plate; therefore, they were not tested in this study. It is also
important to note that we test different ratios of culture volume, not specifically of
different
numbers of starting cells per se; differences in carrying capacity between the
ancestral and evolved bacteria
(11, 17) and stochastic sampling effects will prevent the
initial ratio of cell numbers from prec
isely matching these volumetric ratios.
Using the Different Common Competitor (DCC) method, we compete the evolved
population samples against the marked clone from generation 5,000, rather than
against the marked ancestor. We chose a 5,000
generation clo
ne because its fitness
was near the geometric mean of the expected fitness values spanning generations 0 to
50,000, and thus it might reduce the overall disparity in population counts across the full
time series being considered. We inoculate the competit
ions with equal volumes (50
L each) of the Ara
1 population sample and reference competitor. We considered that
this method might increase the precision of our fitness measurements because the
ratios used in the fitness calculation tend to be more precis
e as they approach 1.
We selected 15 time points from the focal population Ara
1 to evaluate these
three methods: generations 0, 500, 1,000, 1,500, 2,000, 5,000, 10,000, 15,000, 20,000,
25,000, 30,000, 35,000, 40,000, 45,000, and 50,000. We ran competiti
ons as complete
blocks; each block included one competition for each time point using each method,
8 plus an additional competition (see below) used as a scaling factor to compare the
methods. We performed a total of 10 replicate blocks, and so there were a
total of 450
competition assays to measure fitness (3 methods x 15 time points x 10 blocks) plus an
additional 30 assays to generate the scaling factors.
A scaling factor was necessary for comparing the DCC method with the
Traditional and ASR methods, be
cause the DCC method measured fitness relative to a
different competitor than the ancestor used for the other two methods. To calculate this
scaling factor, we performed an additional competition between the Ara
Œ ancestor
(REL606) and the Ara
+ reference c
ompetitor (either REL607 or REL11351) for each
method in every block. We then divided the fitness values from all of the competition
assays for a given method and block by the fitness value that served as the scaling
factor. We did not otherwise include t
he scaling
factor competitions in our data analysis.
We applied the same procedure to all three methods to ensure consistency, although
adjusting for the scaling factor was not otherwise required for the Traditional and ASR
methods.
The data and analysis
scripts are available at the Dryad Digital Depository (doi:
http://dx.doi.org/10.5061/dryad.4875k
). The data obtained using the Traditional method
previously appeared in
(19); the data for the ASR
and DCC methods, as well as all of
the analyses in this article, are new .
Statistical methods:
We performed statistical analyses in R version 2.14.1. We fit the fitness
trajectories using nonlinear least
squares regression, as implemented with the nl
s()
9 function. We performed ANOVAs using the aov() function. For the single
generation
ANOVAs, Method was a fixed factor and Block was a random factor. For the combined
ANOVA, Generation was included as a fixed factor.
Bootstrapping:
We employed a boot
strap procedure to compare the differences between the
coefficients of variation in our three methods to a null distribution. We sampled the total
dataset with replacement, to produce 3 datasets of equal size, each containing 10
measurements at each gener
ation. We then fit a linear regression of the coefficient of
variation against time (i.e., generation) to each of the 3 datasets. We then summed the
squares of the differences between each pairwise combination of the 3 linear
regressions over all 15 time
points when fitness was measured. We repeated this entire
procedure 1,000,000 times, and we compared the observed sum of the squared
differences to this distribution.
Results and Discussion
: There are two fundamental ways in which these different metho
ds could produce
meaningfully different results. One way is that different methods could produce
significantly different fitness estimates. In that case, we would need additional
information or another criterion to determine which method was superior. T
he other
way is that different methods could have different levels of precision; that is, one
method may have significantly less variation in measured values across replicate
10 assays than another. In this case, the method with the greater precision would c
learly
be preferred.
Figure 1.1 shows the results of our fitness assays for all three methods, with
trajectories fit to the data obtained using each method. These trajectories are in the
form of an Offset Power Law:
w = (
bT + 1)
a, where w is fitness, T
is time in generations, and
a and
b are model parameters, as
derived in
[2]. All three methods produce virtually identical fitness trajectories. S1
Table shows the results of ANOVAs performed at each generation to test for variation
among the three methods in the mean fitness values they p
roduce; the effect of Method
was not significant in any of the 15 tests, even without accounting for multiple tests.
From these results, we conclude that the three methods do not produce meaningfully
different estimates of mean fitness.
Next, we calculat
ed the coefficient of variation (i.e., the standard deviation
divided by the mean) for each method at each time point to determine whether they
differed in their precision. We then constructed a linear model of the coefficient of
variation as a response t
o time (i.e., generation) and method. Figure 1.2 shows the
data and linear model fit to the coefficients of variation for all three methods. Table 1.1
presents the ANOVA table for this model. There is a highly significant tendency for the
coefficient of
variation to increase in later generations, as the evolving bacteria become
progressively more fit, as discussed in the Introduction. However, the effect of Method
was not significant as a predictor of the coefficient of variation, although a p
value of
11 Figure 1.1:
Fitness trajectories over time
. Fitness trajectories for each method, shown separately, have the form w =
(bT +1)
a, where w is fitness, T is time in generations, and
a and
b are model parameters. Black circles and curve show
the Tradition
al method; blue squares and curve show the ASR method; red triangles and curve show the DCC method.
12 0.0762 is suggestive. On inspection of the data (Figure 1.2), it is clear that any
difference between the methods is driven by the ASR method having a hig
her
coefficient of variation
Œ and thus lower precision
Œ in early generations. Indeed, when
we removed the ASR method from the analysis and performed an ANOVA on the
remaining data, there was no suggestion of any difference between the Traditional and
DCC methods (Table 1.2, p = 0.8802).
df SS MS F p Time
1 0.03672
0.03672
69.664
<0.0001
Method
2 0.00289
0.00145
2.743
0.0762
Residuals
41 0.21610
0.00053
Table 1.1:
ANOVA on the coefficient of variation across time and comparing the
three methods u
sed to estimate fitness.
df SS MS F p Time
1 0.03068
0.03068
70.035
<0.0001
Method
1 0.00001
0.00001
0.023
0.8802
Residuals
27 0.01183
0.00044
Table 1.2:
ANOVA on the coefficient of variation across time and comparing the
Traditional and DCC me
thods.
We can also express the differences between these methods as follows. The regression
line for the coefficient of variation based on the ASR method is always higher than at
least one of the other two methods (Figure 1.2), and therefore it is neve
r the best
13 Figure 1.2:
Coefficient of variation over time
. Lines are linear regressions on the relevant data. Black circles and line
show the Traditional method; blue squares and line show the ASR method; red triangles and line show the DCC method.
Figure S1.1 shows the confidence bands associated with each regression line.
14 method, at least for the system and generations analyzed here. By contrast, the
Traditional and DCC methods yield coefficients of variation, as inferred from the
regression li
nes, that are very similar and always within the 95% confidence interval of
one another (Figure S1.1). Which of these two methods gave a lower point estimate of
the coefficient of variation varied over time, but the difference was not significant (Table
1.2).
An alternative way to assess whether the differences in the coefficient of
variation between the methods are statistically significant involves bootstrapping the
data, as detailed in the Methods section. Figure 1.3 shows that the observed differences
in the coefficient of variation among the three methods are no greater than would be
expected by chance if there were no differences among the methods.
Over the range of fitness changes that we observed in the LTEE (i.e., from 1 to
~1.8), neither alterna
tive method for assaying fitness (ASR or DCC) outperformed the
Traditional method. Given its extensive prior use in this study system [1,2,17], we
therefore prefer to use the Traditional method for fitness competitions that span this
range. It is importa
nt to note, however, that the ASR or the DCC method might turn out
to have higher precision in systems that exhibit larger fitness changes than the system
studied here, as suggested by the regression lines in Figure 1.2. The LTEE has, to our
knowledge, ru
n for many more generations than any other evolution experiment, but the
extent of fitness improvements has been less than that seen in some other shorter
duration experiments. The relatively limited fitness gains that have occurred during the
LTEE reflect
the fact that the experimental environment is quite benign; also, the
ancestor of the LTEE had been studied by microbiologists for many decade
(24) and
15 Figure 1.3:
Histogram of bootstrap
analysis
. Histogram showing the distribution for the bootstrapped sums of squared
differences in the coefficient of variation for 3 arbitrary groupings of the combined data. The dark arrow indicates the
difference for the actual grouping of the 3 methods
employed. The light arrow shows the most extreme 5% of the sums of
the squared differences.
16 was thus probably already well
adapted to general laboratory conditions. Other
experiments conducted for fewer generations, but performed under more stressful
conditions or founded by less
fit ancestors, might reach fitness differences where these
or other alternative methods would be helpful. Table 3 summarizes the duration and
range of fitness improvements reported in a number of other evolution experiments
that
used a variety of microorganisms including bacteria, fungi, and viruses (see also Table
2.3 in
(25)). We have included values for both relative fitness, W
f / Wi, and the
difference between final and initial fitness values, W
f Œ Wi, when the latter was reported
in the p
aper cited. The value of W
f Œ Wi necessarily depends on the time frame of the
experiment, whereas W
f / Wi is a dimensionless number and thus readily compared
across experiments.
Reference
Organism
Generations
Wf / Wi Wf  Wi This study
E. coli
50,000
1.88 3.5 / day
(26) E. coli
at 32C
2,000
1.10
E. coli
at 42C
2,00
0 1.19
(27)* E. coli
1,100
1.98
0.23 / h
(28)** Saccharomyces cerevisiae
300 1.80
(29) Aspergillus nidulans
800 1.48
(30) phage
6 with bottleneck = 10
100 1.26
phage
6 with bottleneck = 1,000
40 2.03
(31) phage G4
180 1.18
3.8 / h
phage ID2
600 2.55
13.5 / h
* Mean calculated from four replicate populations
** Value estimated from figure
Wf is the fitness at the end of the evolution experiment.
Wi is the fitn
ess at the start of the evolution experiment.
Table 1.3:
Selected evolution experiments
17 Conclusion
s: We performed 480 assays to compare three different methods for estimating the
relative fitness of bacterial competitors. The three methods generated res
ults that were
not meaningfully or significantly different in terms of either their mean values or
dispersion. The only suggestion of a meaningful difference was that the ASR method
appeared worse than the other two methods in the early generations, when
the fitness
gains of the evolved bacteria were still fairly small. Therefore, we see no compelling
reason to adopt one of the alternatives to the Traditional method when analyzing
systems that have achieved fitness gains less than or similar to those meas
ured in the
LTEE over its first 50,000 generations. When expected relative fitness values are much
greater than 1.8, or when fitness differences are compounded for more generations,
researchers may need to consider using one of these or other alternative
methods.
Acknowledgments
: We thank Caroline Turner, Amy Lark, Rohan Maddamsetti, and
Christopher
Strelioff
for helpful discussions during manuscript preparation, two reviewers for
constructive suggestions, and Neerja Hajela for technical assistance.
18 APPENDIX
19 Figure S1.1:
Temporal trends in coefficient of variation
. Temporal trends in the
coefficient of variation across replicate assays for the three different methods used to
measure fitness. Black circles show the Traditional method; blue squ
ares show the
ASR method; red triangles show the DCC method. The solid colored lines show the
linear regressions based on the corresponding data. The dashed colored curves show
the 95% confidence bands for the regressions for the three methods: A) Tradit
ional, B)
ASR, and C) DCC. The points and regression lines are the same across all three
panels, but the confidence bands are shown separately for clarity.
20 Table S1.1:
ANOVAs of fitness for three methods, by generation
. Analys
es of
variance of measured fitness values for the three methods, analyzed separately for the
various generations examined.
21 Table S1.1 (cont™d)
This chapter was originally published as:
Wiser MJ, Lenski RE (2015) A Comparison of
Methods to Measure Fitness in
Escherichia coli
. PLoS ONE 10(5): e0126210. doi: 10.1371/journal.pone.0126210
Copyright: © 2015 Wiser, Lenski. This is an open access article distributed under the
terms of the Creative Commons Attribution License, which pe
rmits unrestricted use,
distribution, and reproduction in any medium, provided the original author and source
are credited.
22 REFERENCES
23 REFERENCES
1.
R. E. Lenski, M. R. Rose, S. C. Simpson, S. C. T
adler, Long
term experimental
evolution in Escherichia coli. I. Adaptation and divergence during 2,000
generations.
Am. Nat.
138, 1315
Œ1341 (1991).
2.
M. J. Wiser, N. Ribeck, R. E. Lenski, Long
term dynamics of adaptation in asexual
populations.
Science
. 342, 1364
Œ1367 (2013).
3.
S. F. Elena, R. E. Lenski, Evolution experiments with microorganisms: the
dynamics and genetic bases of adaptation.
Nat Rev Genet
. 4, 457
Œ469 (2003).
4.
T. J. Kawecki
et al.
, Experimental evolution.
Trends Ecol. Evol.
27, 547
Œ560 (2012).
5.
J. E. Barrick, R. E. Lenski, Genome dynamics during experimental evolution.
Nat
Rev Genet
. 14, 827
Œ839 (2013).
6.
W. Paulander, S. Maisnier
Patin, D. I. Andersson, Multiple mechanisms to
ameliorate the fitness burden of mupirocin resistance
in Salmonella typhimurium.
Mol. Microbiol.
64, 1038
Œ1048 (2007).
7.
A. I. Nilsson
et al.
, Reducing the fitness cost of antibiotic resistance by amplification
of initiator tRNA genes.
Proc. Natl. Acad. Sci.
103, 6976
Œ6981 (2006).
8.
L. Sandegren, A. Lindq
vist, G. Kahlmeter, D. I. Andersson, Nitrofurantoin
resistance mechanism and fitness cost in Escherichia coli.
J. Antimicrob.
Chemother.
62, 495
Œ503 (2008).
9.
D. I. Andersson, D. Hughes, Antibiotic resistance and its cost: is it possible to
reverse resis
tance?
Nat Rev Micro
. 8, 260
Œ271 (2010).
10.
K. Walkiewicz
et al.
, Small changes in enzyme function can lead to surprisingly
large fitness effects during adaptive evolution of antibiotic resistance.
Proc. Natl.
Acad. Sci.
109
, 21408
Œ21413 (2012).
11.
F.
Vasi, M. Travisano, R. E. Lenski, Long
term experimental evolution in
Escherichia coli. II. Changes in life
history traits during adaptation to a seasonal
environment.
Am. Nat.
144, 432
Œ456 (1994).
12.
D. M. Weinreich, N. F. Delaney, M. A. DePristo, D. L.
Hartl, Darwinian evolution
can follow only very few mutational paths to fitter proteins.
Science
. 312, 111
Œ114
(2006).
24 13.
A. Ripoll
et al.

lactamase
inhibitors in CTX

lactamases: Predicti
ng the in vivo scenario?
Antimicrob.
Agents Chemother.
55, 4530
Œ4536 (2011).
14.
G. Bell, Evolutionary rescue and the limits of adaptation.
Philos. Trans. R. Soc.
Lond. B Biol. Sci.
368
(2012), doi:10.1098/rstb.2012.0080.
15.
A. F. Bennett, R. E. Lenski,
Evolutionary adaptation to temperature II. Thermal
niches of experimental lines of Escherichia coli.
Evolution
. 47, 1Œ12 (1993).
16.
S. C. Sleight, R. E. Lenski, Evolutionary adaptation to freeze
thaw
growth cycles in
Escherichia coli.
Physiol. Biochem.
Zool.
80, 370
Œ385 (2007).
17.
R. E. Lenski, M. Travisano, Dynamics of adaptation and diversification: a 10,000
generation experiment with bacterial populations.
Proc. Natl. Acad. Sci.
91, 6808
Œ6814 (1994).
18.
J. E. Barrick
et al.
, Genome evolution and a
daptation in a long
term experiment
with Escherichia coli.
Nature
. 461, 1243
Œ1247 (2009).
19.
S. Wielgoss
et al.
, Mutation rate dynamics in a bacterial population reflect tension
between adaptation and genetic load.
Proc. Natl. Acad. Sci.
110
, 222
Œ227 (20
13).
20.
F. W. Studier, P. Daegelen, R. E. Lenski, S. Maslov, J. F. Kim, Understanding the
differences between genome sequences of Escherichia coli B strains REL606 and
BL21(DE3) and comparison of the E. coli B and K
12 genomes.
J. Mol. Biol.
394
, 653Œ680
(2009).
21.
R. E. Lenski, Experimental studies of pleiotropy and epistasis in Escherichia coli. I.
Variation in competitive fitness among mutants resistant to virus T4.
Evolution
. 42, 425Œ432 (1988).
22.
R. S. Breed, W. D. Dotterrer, The number of colon
ies allowable on satisfactory agar
plates.
J. Bacteriol.
1, 321
Œ331 (1916).
23.
R. F. Inglis, S. West, A. Buckling, An experimental study of strong reciprocity in
bacteria.
Biol. Lett.
10 (2014), doi:10.1098/rsbl.2013.1069.
24.
P. Daegelen, F. W. Studier
, R. E. Lenski, S. Cure, J. F. Kim, Tracing ancestors and
relatives of Escherichia coli B, and the derivation of B strains REL606 and
BL21(DE3).
J. Mol. Biol.
394, 634
Œ643 (2009).
25.
R. Kassen,
Experimental evolution and the nature of biodiversity
(Rober
ts and
Company Publishers, Inc, Greenwood Village, Colorado, 2014).
26.
A. F. Bennett, R. E. Lenski, J. E. Mittler, Evolutionary adaptation to temperature. I.
Fitness responses of Escherichia coli to changes in its thermal environment.
Evolution
. 46, 16
Œ30 (1992).
25 27.
T. Conrad
et al.
, Whole
genome resequencing of Escherichia coli K
12 MG1655
undergoing short
term laboratory evolution in lactate minimal media reveals flexible
selection of adaptive mutations.
Genome Biol.
10, R118 (2009).
28.
M. R. Goddar
d, H. C. J. Godfray, A. Burt, Sex increases the efficacy of natural
selection in experimental yeast populations.
Nature
. 434
, 636
Œ640 (2005).
29.
S. E. Schoustra, T. Bataillon, D. R. Gifford, R. Kassen, The properties of adaptive
walks in evolving populat
ions of fungus.
PLoS Biol
. 7, e1000250 (2009).
30.
C. L. Burch, L. Chao, Evolution by small steps and rugged landscapes in the RNA
virus
6.
Genetics
. 151
, 921
Œ927 (1999).
31.
D. R. Rokyta, Z. Abdo, H. A. Wichman, The genetics of adaptation for eight
microvirid bacteriophages.
J. Mol. Evol.
69, 229
Œ239 (2009).
26 CHAPTER 2: LONG
TERM DYNAMICS OF ADAPTATION IN ASEXUAL
POPULATIONS
Authors: Michael J. Wiser, Noah Ribeck, and Richard E. Lenski
Abstract
: Experimental studies of evolution have increased greatly in number in
recent years, stimulated by the g
rowing power of genomic tools. However,
organismal fitness remains the ultimate metric for interpreting these experiments,
and the dynamics of fitness remain poorly understood over long time scales.
Here, we examine fitness trajectories for 12
Escherichia
coli
populations during
50,000 generations. Mean fitness appears to increase without bound, consistent
with a power law. We also derive this power
law relation theoretically by
incorporating clonal interference and diminishing
returns epistasis into a
dyna
mical model of changes in mean fitness over time.
Main Text
: The dynamics of evolving populations are often discussed in terms of
movement on an adaptive landscape, where peaks and valleys are states of high
and low fitness, respectively. There is consid
erable interest in the structure of
these landscapes
(1Œ7). Recent decades have seen tremendous growth in
experiments using microbes to address fundamental questions about evolution
(8), but most have been short in duration. The Long
Term Evolution Experiment
27 (LTEE) with
Escherichia coli
provides the opportunity to characterize the
dynamics of adaptive evolution over long pe
riods under constant conditions
(1, 9, 10). Twelve populations were founded from a common ancestor in 1988 and
have been evolving
for >50,000 generations, with samples frozen every 500
generations. The frozen bacteria remain viable, and we use this ﬁfossil recordﬂ to
assess whether fitness continues to increase and to characterize mean fitness
trajectories (see Appendix: Material and
Methods).
We first performed 108 competitions, in the same conditions as the LTEE,
between samples from nine populations at 40,000 and 50,000 generations
against marked 40,000
generation clones (see Appendix: Material and Methods).
Three populations wer
e excluded for technical reasons (see Appendix: Material
and Methods). Fitness was quantified as the dimensionless ratio of the
competitors™ realized growth rates. Most populations experienced significant
improvement (Figure 2.1A), and the grand mean fitn
ess increased by 3.0%
(Figure 2.1B).
To examine the shape of the fitness trajectory, we competed samples
from all 12 populations and up to 41 time points against the ancestor (see
Appendix: Material and Methods). We compared the fit of two alternative mo
dels
with the fitness trajectories. The hyperbolic model describes a decelerating
trajectory with an asymptote. The power law also decelerates (provided the
exponent is <1), but fitness has no upper limit.
Hyperbolic model
28 Figu
re 2.1:
Fitness changes in nine
E. coli
populations between 40,000 and 50,000 generations
. (A) Filled symbols:
six populations whose improvement was significant (
P < 0.05); open symbols: three populations without significant
improvement. (
B) Grand
mean fit
ness at 40,000 and 50,000 generations relative to 40,000
generation competitor and the
ratio of means showing overall gain. Error bars are 95% confidence limits based on replicate assays (A) or populations
(B).
29 Power law
Mean fitnes
s is
, time in generations is
t, and each model has two parameters,
a and
b. Both models are constrained such that the ancestral fitness is 1, hence
the offset of +1 in the power law. The hyperbolic model was fit to the first 10,000
gen
erations of the LTEE
(9), but others suggested an alternative nonasymptotic
trajectory
(11). The grand mean fitness values and the trajectory for each model
are shown in Figure 2.2A and the individual populations in Figure S2.1. Both
models fit the data very well; the correlation coefficients for the grand means and
model
trajectories are 0.969 and 0.986 for the hyperbolic and power
law models,
respectively. When Bayesian information criterion scores (see Appendix: Material
and Methods) are used, the power law outperforms the hyperbolic model with a
posterior odds ratio of
~30 million (Table S2.1). The superior performance of the
power law also holds when populations are excluded because of incomplete time
series or evolved hypermutability (Table S2.1). The power law provides a better
fit to the grand
mean fitness than the
hyperbolic model in early, middle, and late
generations (Figure S2.2). The power law is supported (odds ratios >10) in six
individual populations, whereas none supports the hyperbolic model to that
degree (Table S2.2). The power law also predicts fitness
gains more accurately
than the hyperbolic model. When fit to data for the first 20,000 generations only,
the hyperbolic model badly underestimates later measurements, whereas the
power
law trajectory predicts them accurately (Figure 2.2B and Figure S2.3).
30 Figure 2.2:
Comparison of hyperbolic and power
law models
. (A) Hyperbolic
(red) and power
law (blue) models fit to the set of mean fitness values (black
symbols) from all 12 populations. (
B) Fit of hyperbolic (solid red) and power
law
(solid blue) model
s to data from first 20,000 generations only (filled symbols),
with model predictions (dashed lines) and later data (open symbols). Error bars
are 95% confidence limits based on the replicate populations.
31 The power law describes the fitness trajectories
well, but it is not
explanatory. We have derived a dynamical model of asexual populations with
clonal interference and diminishing
returns epistasis, which generates mean
fitness trajectories that agree well with the experimental data. Clonal interference
refers to competition among organisms with different beneficial mutations, which
impedes their spread in asexual populations
(12Œ15). Diminishing
returns
epistasis occurs when the marginal improvement from a beneficial mutation
declines with increasing fitness
(5, 6). We outline key points of the model below
(see Appendix: Material and Methods).
We used a coarse
grained approach that describes the magnitudes and
time scales of fixation events
(12). Beneficial mutations of advantage
s are
eŒs
advantage. This distribution is for mathematical convenience; the theory of clonal
interference is robust to the form of the di
stribution
(12). We assume that
deleterious mutations do not appreciably affect the dynamics; deleterious
mutations occur at a higher rate than beneficial mutations, but the resulting load
is very small relative
to the fitness increase measured over the course of the
LTEE
(16). We assume the distribution of available benefits decli
nes after a mutation
with advantage
fixes, such that
: where
g > 0 is the diminishing
returns parameter,
is beneficial effect of the
n
n
n fixations. Then, the mean fitness of an
asexual population adaptin
g to a constant environment is approximated by (see
32 Appendix: Material and Methods): where
and are the beneficial effect
and fixation time, respectively, for the first fixed mutation.
Comparing this formula with the power law,
g = 1/2
a. The value of
g estimated for the six populations that retained the low ancestral mutation rate
throughout 50,000 generations is 6.0 (95% confidence interval 5.3 to 6.9). In the
LTEE, the beneficial effect of the first fixation,
, is typically ~0.1
(1, 9, 10). It
follows that the distribution of beneficial effects immediately after the first fixation
is shifted such that the
mean advantage is
of its initial value
(see Appendix: Material and Methods). This estimate of
g also accords well with
epistasis observed for early mutations in one of the populations (Figure S2.4). In
principle,
g might vary among po
pulations if some fixed mutations lead to regions
of the fitness landscape with different epistatic tendencies
(17). However, an
analysis of variance shows no significant heterogeneity in
g among the six
populations that maintained the ancestral mutation
rate (
p = 0.3478) (Table
S2.3). The
g values tend to be lower for several populations that evolved
hypermutability (Table S2.4). However, these fits are confounded by the change
in mutation rate; we show below that it is not necessary to invoke a differenc
e in
diminishing
returns epistasis between the hypermutable populations and those
that retained the low ancestral mutation rate.
Diminishing
returns epistasis generates the power
law dynamics through
the relation between
a and g. Clonal interference affe
cts the dynamics through
the parameter
b, which depends on
and , which in turn are functions of
33 Figure 2.3:
Theoretical model generating power
law dynamics
. (A) Parameter pairs for
0 that match best fit of
power law to fitness trajectories for populations that retained ancestral mutation rate for 50,000 generations. (
B) Expected
times and beneficial effects of successive fixations for different pairs that match the best fit. T
0 values corresponding
g = 6.0, and
N = 3.3 × 10
7. 34 the population size
N
0 (see Appendix: Material and Methods). For the LTEE,
N = 3.3 × 10
7, which
takes into account the daily dilutions and regrowth
(1)
0 are
unknown. Pairs of values that all match the best fit to the populations that
retained the low mutation rate are shown in Figure 2.3A. The expected values for
beneficial effects
and fixation times across a range of pairs are shown in Figure
3.3B. The dynamics are similar among pairs with high beneficial mutation rates
8), giving
and generations for the first fixation, which
agree we
ll with observations from the LTEE
(1, 9, 10). At lower values of
adaptation becomes limited by the supply of beneficial mutations, and fixation
times are inconsistent with the LTEE. This model also predicts that the rate of
adaptation decelerates more sharply than the rate of genomic evolution (Figure
S2.5), which is
qualitatively consistent with observations
(10) (see Appendix:
Material and Methods). The model assumes that individual beneficial mutations
sweep sequentially, although ﬁcohortsﬂ of beneficial mutations may co
occur,
(14, 15, 18) (see Appendix: Material and Methods). However,
the inferred role of diminishing returns in generating populatio
n mean
fitness
dynamics is unaffected by this complication, because the power
law exponent is

occurring beneficial mutations have no appreciable affect on long
term fitness
traje
ctories over the range of parameters considered here (Figure S2.6).
Six populations evolved hypermutator phenotypes that increased their
point
mutation rates by ~100
fold (see Appendix: Material and Methods). Three
35 of them became hypermutable early in t
he LTEE (between ~2500 and ~8500
generations) and had measurable fitness trajectories through at least 30,000
generations (Table S2.2). Our model predicts these populations should adapt
faster than those that retained the ancestral mutation rate. We pooled
the data
from these early hypermutators and confirmed that their composite fitness
trajectory was substantially higher than that of the populations with the low
mutation rate (Figure 2.4). If the hypermutators™ beneficial mutation rate also
increased by ~
100fold, the difference in trajectories is best fit by an ancestral
rate
6 (95% confidence interval 2.5 × 10
7 to 6.1 × 10
5), although
higher values cannot be ruled out (see Appendix: Material and Methods). Note
that this fit was obtained by using the same initial distribution of fitness effects,
0, and epistasi
s parameter,
g, for the hypermutators and the populations that
retained the ancestral mutation rate.
Both our empirical and theoretical analyses imply that adaptation can
continue for a long time for asexual organisms, even in a constant environment.
The
50,000 generations studied here occurred in one scientist™s laboratory in ~21
years. Now imagine that the experiment continues for 50,000 generations of
scientists, each overseeing 50,000 bacterial generations, for 2.5 billion
generations total. At that ti
me, the predicted fitness relative to the ancestor is
~4.7 based on the power
law parameters estimated from all 12 populations
(Table S2.4). The ancestor™s doubling time in the glucose
limited minimal
36 Figure 2.4:
Effect of hypermutability on observed an
d predicted fitness
trajectories
. Black circles: mean fitness of six populations that retained low
ancestral mutation rate. Green triangles: mean fitness of three populations that
evolved hypermutability early in the LTEE, including one with measurable val
ues
through 30,000 generations only. The hypermutators have higher mean fitness at
28 of 31 time points from 5000 to 50,000 generations. Black curve: Predicted
60 = 85,
g = 6.0, and
N = 3.3 ×
107. Green curv
fold starting at 4667
generations and all other
37 medium of the LTEE was ~55 min, and its growth commenced after a lag phase
of ~90 min
(19). If the bacteria eliminate the lag, a fitness of 4.7 implies a
doubling time of ~23 min (Figure S2.7). Although that is fast for a minimal
medium where cells must synthesize most constituents, it is slower than the 10
min that some sp
ecies can achieve in nutrient
rich media
(20). At some distant
time, biophysical constraints may come
into play, but the power
law fit to the
LTEE does not predict implausible growth rates even far into the future. Also,
some equilibrium might eventually be reached between the fitness
increasing
effects of beneficial mutations and fitness
reducing effects
of deleterious
mutations
(21), although
it is impossible to predict when for realistic scenarios
with heterogeneous selection coefficients, compensatory mutations, reversions,
and changing mutation rates.
Fitness may continue to increase because even very small advantages
become important over
very long time scales in large populations. Consider a
mutation with an advantage
s = 10
6. The probability that this mutation escapes
drift loss is ~4
s for asexual binary fission
(12), so it would typically have to occur
2.5 × 10
5 times before finally taking hold. Given a mutation rate of 10
10 per base
pair per generation
(22) and effective population size of ~3.3 × 10
7, it would
require ~10
8 generations for that mutation to escape drift and millions more to fix.
Also, pleiotropy and epistasis might allow a
sustained supply of advantageous
mutations, because many net
beneficial mutations have maladaptive side effects
that create opportunities for compensatory mutations to ameliorate those effects.
38 The LTEE uses a simple, constant environment to minimize co
mplications
and thus illuminates the fundamental dynamics of adaptation by natural selection
in asexual populations. The medium has one limiting resource and supports low
population densities (for bacteria) to minimize the potential for cross
feeding on,
or inhibition by, secreted by
products. Frequency
dependent interactions are
weak in most populations, although stronger in some others
(23). Also, such
interactions should favor organisms that are more fit than their immediate
predec
essor, but they are not expected to amplify gains relative to a distant
ancestor, as fitness was measured here. In fact, such interactions may cause
fitness to fall relative to a distant ancestor
(24). In any case, small
effect
beneficial mutations s
hould allow fitness to increase far into the future.
At present, the evidence that fitness can increase for tens of thousands of
generations in a constant environment is limited to the LTEE, but these findings
have broader implications for understanding
evolutionary dynamics and the
structure of fitness landscapes. It might be worthwhile to examine fitness
trajectories from other evolution experiments in light of our results, although data
from short
term experiments may not suffice to discriminate betwee
n asymptotic
and nonasymptotic trajectories. We hope other teams will perform long
experiments similar to the LTEE and that theoreticians will refine our models as
appropriate.
39 APPENDIX
40 Materials and Methods
: Evolution experiment
: The long
 term evoluti
on experiment (LTEE) began in 1988, and it has continued
(with occasional interruptions) since then
(1). Six populations were founded from
each of two variants of the same ancestral strain of
Escherichia coli
B (25). One
ancestral variant, REL607, is able to grow on arabinose (Ara
+) while the other,
REL606, cannot (Ara
Œ). The 12 populations are called Ara
1 to Ara
6 and Ara+1
to Ara+6. They are maintained by daily serial transfer in 10 mL of Davis mini
mal
are held in 50
mL Erlenmeyer flasks and incubated with shaking at 120 rpm and
37 °C. These conditions support a stationary
phase cell density of ~5 × 10
7per
mL for the ancestral
strain
(1); the evolved populations tend to produce
somewhat fewer and larger cells
(19). The 1:100 dilution and re
growth allow
log
2100
7, which takes into account both the population bottleneck and re
growth
(1). Every
75 days (500 generations), after the populations have been transferred to fresh
medium, glycerol is added to th
e remaining culture, the material is split between
two vials and stored frozen at
Œ80°C. The bacteria remain viable and can be
revived for later study; the freezer samples thus provide a living fossil record.
Populations with truncated fitness data
: We
obtained complete fitness trajectories for nine of the populations. However,
the trajectories for three populations were truncated, even though the populations
41 themselves continued to evolve for the full 50,000 generations. Populations
Ara+6 and Ara
2 no l
onger produced reliable colonies on the agar plates used to
enumerate competitors in the fitness assays after 4000 and 30,000 generations,
respectively. Ara
3 evolved the ability to use the citrate in the DM25 medium,
which led to a greatly increased cell
density
(26) and other complications for
assessing fitness, and therefore its fitness was only measured through 32,000
generations. The same
three populations were also excluded from the assays
comparing fitness levels at 40,000 and 50,000 generations.
General procedures for fitness assays
: Fitness is measured by mixing two bacterial strains or populations and
assessing their relative growt
h rates during head
tohead competition. In this
study, all competitions were performed in the same DM25 medium and other
culture conditions as used in the LTEE. The competitors were distinguished on
the basis of an arabinose
utilization marker, which is s
electively neutral under
these conditions
(1); Ara
Œ and Ara
+ cells form red and white colonies,
respectively, on tetrazolium
arabinose (TA) agar plates. To begin, samples of the
population of interest and the reciprocally marked reference competitor were
taken fro
m the freezer, transferred into 10 mL of Luria Broth (LB), and grown
overnight at 37 °C. These cultures were then diluted 100
fold in saline solution,
and 100
These cultures were incubated for 24 h under the same conditions as the LTEE,
such that each competitor was comparably acclimated to those conditions. To
42
L from each acclimation culture were inoculated
into 9.9 mL of DM25 and mixed together. An initial 100

immediately after mixing, then diluted and spread on a TA plate to enumerate the
initial density of each competitor. The competition
culture was then incubated
under the same conditions as the LTEE. For assays used to obtain the fitness
trajectories, a final 100

agar. For assays comparing fitness levels between 40,000 and 50,000
generations, the competitions were propagated through daily 100
fold dilutions
until, after three days, a sample was taken to enumerate the final density of each
competitor. In each case, relative fitness was calculated as the ratio of the
realized Malthu
sian parameters of the two competitors over the course of the
competition
(1). For the one
day assays, fitness was calculated simply as
, where
A and
B are the respective densities of the evolved population and
reference competitor, and s
ubscripts
i and
f indicate initial and final densities,
respectively. For the three
day assays, the final densities were both multiplied by
10,000 to account for the two additional cycles of dilution and re
growth. In either
case, this metric encompasses a
ny and all differences between the competitors
in their lag, growth, and stationary phases over the same serial
transfer cycle as
used in the LTEE itself
(1, 19).
43 Specific procedures for comparing fitness levels between 40,000 and 50,000
generations
: In general, the statistical error associated with competition assays
becomes larger as the differ
ence in fitness increases, because the losing
competitor becomes increasingly rare and its abundance less certain in the final
sample. Given the small fitness changes expected in later generations, we
decided to compete the 40,000
 and 50,000
generation po
pulations against a
late
generation competitor rather than the ancestor in order to reduce the fitness
differential and thereby improve statistical power. To that end, we used a clone,
REL10948, sampled from population Ara
5 at 40,000 generations, and we
isolated an Ara+ mutant of that clone, REL11638, by plating millions of cells on a
minimal medium supplemented with arabinose. Competition assays confirmed
that the marker was selectively neutral on this background under the conditions
of the LTEE. Samples
from the nine populations (excluding the three with
truncated fitness trajectories) at generations 40,000 and 50,000 were competed
against the reciprocally marked reference clone for three days. Each pairwise
competition was replicated six
fold in a comple
teblock design.
Specific procedures for obtaining fitness trajectories through 50,000 generations
: To ensure uniformity of procedures, all of the data used to characterize
the long
term fitness trajectories were based on one
day competitions against
the
reciprocally marked ancestral clone, either REL606 (Ara
Œ) or REL607 (Ara
+).
All of the generational samples from a given population were simultaneously
44 placed in competitions, and each complete time
series was replicated twice at
different times (with a f
ew missing values caused by procedural errors). The
fitness trajectory for each individual population was fit using the replicate values
at each time point. The trajectory for the grand
mean fitness of the ensemble of
populations was fit using the average
of the replicate values for each population
at each time point.
Statistical analyses
: All of the statistical analyses of experimental data were performed using
the R software suite (version 2.14.1). The hyperbolic and power law models were
fit to the fit
ness trajectories using the nls function in R. Both models have two
parameters, and they are not nested, so they cannot be compared using
likelihood
ratio or F tests. Instead, we compare them using Bayesian Information
Criterion (BIC) scores
(27). To calculate the 95% confidence interval for the
diminishing
returns parameter g for populations that retained the low mutation
rate throu
ghout, we first calculated the confidence interval for a using the
estimates from the six corresponding populations. The endpoints of that interval
were then transformed to g values based on the relationship g = 1 / (2a). As a
consequence, the interval for
g is asymmetric around the point estimate. The
datasets and analysis scripts have been deposited in the Dryad database.
45 Derivation of theory
: The derivation of the theory that generates the power
law dynamics was
checked by obtaining numerical soluti
ons using Wolfram Mathematica (version
8.0). The script has been deposited in the Dryad database. To examine the
possible effects of co
occurring beneficial mutations, we used LabVIEW 2010
(version 10.0.1) to simulate the dynamics of mean fitness and fixed
beneficial
mutations; these dynamics were then compared to predictions from our theory,
which does not consider multiple co
occurring mutations. In particular, adaptation
was simulated using a Wright
Fisher model with discrete generations. Distinct
genoty
pes were tracked along with their corresponding frequencies, fitnesses,
population size in the following generation by drawing from a binomial distribution
with 2
x trials and (1/2)(
/ ) success probability in each trial, where
x and
are the genotype™s population size and fitness, respectively, and
is the mean
fitness of the entire population
. Each generation, a number of beneficial
mutations (drawn from Poisson distribution with mean Nµ, where N is the total
population size and µ is the beneficial mutation rate) were assigned randomly to
genotypes (with probability weighted by
x), with the mu
tant designated as a new
genotype with new fitness
, where s is drawn from an exponential
distribution
, and new
value
The full derivation of dynamical model
of long
term fitness trajectory that
incorporates clonal interference and diminishing
returns epistasis can be found in
www.sciencemag.org/content/342/6164/1364/suppl/DC1, the Supplementary
46 Materials for this paper. We have omitted this for the purposes
of this
dissertation, as the model was derived by Noah Ribeck.
8 that match the best fit to the populations that
retained the ancestral mutation rate, the model
predicts ~13 fixation
events of
beneficial mutations over the cou
rse of 20,000 generations. However, 45
50
mutations were discovered by sequencing the genomes of clones from two LTEE
populations that were not hypermutators at that time
(10, 28). Some of the
discrepancy may reflect neutral mutations that hitchhiked along with beneficial
ones. However, this explanation is insufficient given the paucity of synonymous
substitutions
(22), the prevalence of parallel changes across replicate
populations
(10), and the results of competitions between isogenic strains
(10). Another factor that could contribute to this discrepancy is the sequential,
oneatatime fixations of beneficial mutations assumed by our model of clonal
interference. That is, a single fixation event may sometimes involve multiple
beneficial mutations. At high Nµ , ﬁcohortsﬂ of multiple beneficial mutations can
cooccur in the same li
neage before one of them fixes and, in some cases, they
may alleviate the effect of clonal interference on the rate of adaptation
(18, 29). Some theoretical work has examined the effect of co
occurring beneficial
mutations on the rate of adaptation
(18, 29Œ32), but its direct application here is
prevented by the pervasive epistasis in our model. At intermediate Nµ , selective
sweeps are sometimes caused by a single large
effect beneficia
l ﬁdriverﬂ
mutation accompanied by a weakly beneficial passenger that hardly affects the
47 dynamics of adaptation
(29, 30). Indeed, such weakly beneficial passenger
mutations have been observed in the LTEE populations
(33, 34). To test whether our theor
etical model is accurate, despite ignoring cohorts
of beneficial mutations, we ran individual
based simulations of asexual
populations for a range of µ values, each with the corresponding value of
such that the simulation matches t
he best fit to the fitness trajectories for the
populations that retained the low mutation rate throughout the LTEE (Figure
2.3A). These simulations show that fitness trajectories are consistent across a
wide range of (µ,
) values,
and they closely match the theoretical fitness
trajectory that assumes one
atatime fixations (Figure S2.6). Thus, our
theoretical model with its simplifying assumptions does well with respect to the
fitness trajectory. With respect to genomic evolution,
the individual
based
simulations show a number of fixed beneficial mutations that is slightly higher
than the theoretical values for µ > 10
7, with the discrepancy increasing with
higher µ (Figure S2.6). Taken together, these observations are consistent with
the intermediate Nµ regime, where weakly beneficial passengers occasionally fix
along with highly beneficial drivers but do not apprecia
bly affect the rate of
adaptation. The pervasive diminishing
returns epistasis inherent to our model
likely reduces the effect of weakly beneficial passenger mutations relative to
previous theory that does not include this epistasis.
From our analysis of
the effect of hypermutators on fitness trajectories
(Figure 2.4), we estimated the ancestral rate of beneficial mutations to be 1.7 ×
106. At this rate, however, the simulations predict only ~14 beneficial mutations
48 to fix by 20,000 generations. Therefore, weakly beneficial passenger mutations
Šat least those that occur at typical point
mutation rates
Šcannot account for the
discrepancy bet
ween the observed number of mutations in the LTEE and that
predicted by both theory and simulations. Instead, we suspect that certain types
of insertion and deletion mutations that occur at much higher rates than point
mutations
(33Œ35)Šin particular those that are neutral or nearly neutral
Šmight
help to explain why the rate of genomic evolution exceeds the number of
beneficial fixation events to the extent that it does. In that respect
, it is noteworthy
that the two weakly beneficial mutations that fixed early in the most intensively
studied LTEE population, Ara
1, were non
point mutations of types known to
occur at unusually high rates
(33Œ35). More generally, 16 of the 45 mutations in a
20,000
generation clone from that population were non
point mutations
(10), which potentially reduces by about half the discrepancy between the observed
number of mutations and the
number predicted by theory and simulations.
For the number of observed fixed mutations to be in close agreement with
the simulations would require a higher ancestral beneficial mutation rate than we
have estimated here (Figure S2.6). In fact, we cannot r
ule out this possibility. For
simulations with µ = 10
4, the fitness trajectory is slightly higher than the theory
predicts, indicating entry into the high Nµ regime, where co
occurring beneficial
mutations alleviate the inhibitory effect of clonal interfe
rence on the rate of
adaptation. If the hypermutators have entered this regime, then our theory would
underestimate the fitness trajectory, and our derived estimate of the beneficial
49 mutation rate would be too low. We therefore interpret our estimate of µ
= 1.7 ×
106 for the ancestral beneficial mutation rate to be a lower bound on the actual
value.
Reflecting these complications and uncertainties, our dynamical model
cannot predict the overall rate of genomic evolution. However, we can use the
model™s predicted ra
te of fixation events as a proxy for the overall rate. Figure
S2.5 shows the predicted fixation trajectory and the corresponding mean fitness
trajectory that fits the LTEE data for the populations that maintained the low
ancestral mutation rate. Both the r
ate of fitness improvement and the rate of
fixation events decline over time; however, the deceleration in the rate of fixation
events is much less pronounced, giving the appearance of relative constancy. It
has also been shown elsewhere, using another the
oretical framework, that
evolution on fitness landscapes with antagonistic (e.g., diminishing
returns)
epistasis can produce nearly linear fixation trajectories
(4). In any case, the
difference in the relative curva
ture of the trajectories for mean fitness and
genomic evolution observed in the LTEE
(10) is consistent with our model.
Parameterization of diminishing
returns epistasis fits well with other data
from the LTEE. Khan et al.
(6) constructed the 32 possible combinations of t
he
first five mutations that fixed in the Ara
1 population. The fitness of each
construct was then measured against the ancestor, providing estimates of the
marginal effect of each mutation on backgrounds of varying fitness. Three of the
mutations
Šthose af
fecting the
topA
, spoT
, and
glmUS
genes
Šexhibited
significant diminishing
returns epistasis, i.e., they had smaller beneficial effects in
50 higher fitness backgrounds. Of the other two, one was nearly neutral and showed
no significant trend, and one exhibite
d positive epistasis. Here, we compare the
data for the three mutations with diminishing
returns epistasis to the best
fit
parameter g obtained from our theoretical model of long
term adaptation. From
our model, we expect the effects of beneficial mutation
s to scale as:
, or
, or equivalently:
. Figure S2.4 shows fits of this equation to these independently measured data,
which appear consistent with the general form of diminishing
returns
epistasis
assumed in our theoretical model.
Khan et al.
(6) concluded there is a tendency toward diminishing
returns
epistasis among beneficial mutations. However, the magnitude of that epistasis
seems to vary even among the mutations that clearly show diminishing return
s,
as evidenced by the best
fit g values of 3.1, 2.9, and 7.2 for the mutations
affecting
topA
, spoT
, and
glmUS
, respectively. In comparison, the dashed curve
in Figure S2.4 corresponds to g = 6.0, which derives from 50,000 generations of
fitness measureme
nts for all six populations that maintained the low ancestral
mutation rate throughout 50,000 generations. This value is our best estimate of
the mean strength of diminishing
returns epistasis for the LTEE as a whole.
As a technical aside, we note that ou
r model is meant for use in the long
time limit. However, by evaluating
here, it is also reasonably accurate for
51 values of
near 1. For the values of s/s
0 shown in Figure S2.4, this
approximation is accurate
to within ~10% at w = 1.3.
52 Figure S2.1:
Comparison of the fit of the hyperbolic (red) and power
law
(blue) models to the fitness trajectories for the 12 individual
Escherichia
coli
populations
. (A) Ara
1. (B) Ara
2. (C) Ara
3. (D) Ara
4. (E) Ara
5. (F) A
ra6.
(G) Ara+1. (H) Ara+2. (I) Ara+3. (J) Ara+4. (K) Ara+5. (L) Ara+6. Three
trajectories are truncated because of difficulties in measuring fitness that arose in
those populations, as explained in the Materials and Methods.
53 Figure S2.2:
Comparison of h
yperbolic and power
law models in terms of
squared deviations between their fit trajectories and measured grand
mean
fitness values over time
. (A) Difference in squared deviations between the two
models; positive values indicate the power law provides a b
etter fit. The dashed
line shows the average difference in squared deviations over 50,000 generations.
(B) Cumulative squared deviations between the hyperbolic (red circles) and
power
law (blue triangles) models and the measured values over time.
54 Figur
e S2.3:
Comparison of hyperbolic and power
law models in their ability
to predict future fitness values from temporally truncated datasets
. (A) Fit
of the hyperbolic model to all 12 populations using data from several subsets of
generations (light red) or
from all 50,000 generations (dark red). The subsets,
from bottom to top, include data through 5000, 10,000, 20,000, 30,000, and
40,000 generations. The underestimation of the later values becomes
progressively more severe as the data are truncated at ea
rlier time points. (
B) Fit
of the power
law model to all 12 populations using data from several subsets of
generations (light blue) or from all 50,000 generations (dark blue). The subsets
include data through 5000, 10,000, 20,000, 30,000, and 40,000 gener
ations, and
all are very close to the trajectory fit to the complete 50,000 generations. Error
bars are 95% confidence limits based on replicate populations.
55 Figure S2.4:
Parameterization of diminishing
returns epistasis based on the
fit of the dynamic
model to the fitness trajectories accords well with
independent data on the form and strength of epistasis from the LTEE
. Each set of points shows the beneficial effect of adding an individual mutation to
different progenitor backgrounds of varying fitnes
s, as measured by Khan et al.
(6) using the first several mutations that fixed in the Ara
1 population. The solid
colored curves are fits to the parameterization of diminishing
returns epistasis
used in our theoretical model, giving g values of 3.1 for the addition of a
beneficial mutation in
topA
, 2.9 for
spoT
, and 7.2 for
glmUS
. The black dashed
curve corresponds to g = 6.0, the value that provides the best fit of the power
law
model to the fitness trajectories of the populations that retained the low ancestral
mutation
rate throughout the 50,000 generations.
56 Figure S2.5:
Predicted number of beneficial fixation events in relation to the
fitness trajectory, based on the theoretical model with clonal interference
and diminishing
returns epistasis
. The expected fitness t
rajectory and
number of fixation events are shown for the follow set of parameters:
, , , and
. 57 Figure S2.6:
Numerical simulations of fitness trajectories show g
ood agreement with the theory over a wide range of the beneficial mutation rate
. Simulations of the theory of clonal interference with diminishing
returns
epistasis are shown, with different colors representing different pairs of t
he parameters
and
(each curve is labeled by its
) that equivalently give the
best fit to the set of the populations that maintained the ancestral mutation rate
for 50,000 generations (Fig
ure 2.3A). The dashed line represents the theoretical
fitness trajectory for this family of parameters. Deviations from the theory at early
times are small for simulations with
; deviations at lower
are cause
d by
the small
approximation. Deviations from the theory at later times are
negligible for
; deviations at higher
result from co
occurring beneficial
mutations. All simulations were run w
ith
and. Curves are
based on the mean fitness from multiple runs, with 3 runs for
, 50 each
for
and
, 200 for
, 500 for
, and 2000 each
for
and. The inset panel shows the mean number of beneficial
mutations fixed in each set of simulations at 20,000 generations, compared to the
number predicted by the theory
, which does not account for co
occurring
beneficial mutations.
58 Figure S2.7:
Hypothetical growth kinetics of evolved (blue) and ancestral
(black) competitors that would produce a relative fitness of ~4.7
. The LTEE
ancestral strain grows with a doublin
g time of 55 minutes, following a lag phase
of 90 minutes. The hypothetical evolved population has a doubling time of ~23
minutes without any lag.
59 Table S2.1:
Differences in Bayesian Information Criteria (BIC) scores
between hy
perbolic and power
law model trajectories fit to the measure
d fitness values.
Contrasts are based on: (a) the full dataset including all 12
populations and all time points available for each population; (b) the dataset
excluding 3 populations with incompl
ete fitness trajectories; and (c) the dataset
excluding 6 populations that evolved hypermutability. A BIC difference >10 is
considered to provide very strong support for one model over another
(27), which
can also be expressed as a posterior odds ratio.
60 Table S2.2:
Differences in BIC scores between the hyperbolic and power
law trajectories fit to the mea
sured fitness values for 12 individual
E. coli
populations
. The column labeled ﬁComplete?ﬂ indicates whether the
population™s fitness trajectory extended the full 50,000 generations (ﬁYesﬂ) or was
terminated at an earlier generation (ﬁNoﬂ followed by the
last generation with
fitness data in Figure S2.1). The column labeled ﬁHypermutator?ﬂ indicates
whether the population evolved hypermutability (ﬁYesﬂ followed by the
approximate generation when the hypermutable genotype become the majority
(10, 28, 36) ) or retained the low ancestral mutation rate throughout (ﬁNoﬂ).
An
odds ratio >1 or <1 indicates support for the power law or the hyperbolic model,
respectively.
61 Table S2.3:
Analysis of varia
tion to test for heterogenetic l
nvalues among
the six populations that mainta
ined the low ancestral mutation rate
throughout the LTEE
. Each fitness trajectory was replicated twice, giving two
estimates of the power
law exponent
and, using the relationship
, two
estimates of the diminis
hing
returns parameter
. 62 Table S2.4:
Parameter estimates for the power
law model fit to each
individual population™s measured fitness values
. Parameter estimates for the
exponent a and scaling factor b
are also shown for the model fit to the set of all
12 populations and to the subset of six populations with complete trajectories that
maintained the low ancestral mutationr ate throughout the 50,000 generations.
The column labeled ﬁComplete?ﬂ indicates
whether a population™s fitness
trajectory extended for the full 50,000 generations ﬁYesﬂ) or was terminated,
shown as the last generation with fitness data. The column labeled
ﬁHypermutator?ﬂ indicates whether the population evolved hypermutability,
shown
by the approximate generation when the hypermutable genotype became
the majority, or retained the ancestral mutation rate throughout (ﬁNoﬂ). The
column labeled
shows the estimate of the diminishing
returns epistasis
parameter in t
he dynamical model, calculated using
. However, for
populations that evolved hypermutability, the change in mutation rate
complicates the fit of the power law and the estimation of those parameters. As
shown in Figure 2.4, the chan
ge in mutation rate can explain the difference in
trajectories between the hypermutator populations and those that retained the
ancestral mutation rate, without any change in the diminishing
returns parameter
. For that reason, the
estimates of
are inaccurate for the populations that
evolved hypermutability before their trajectories were terminated (
values
shown in brackets).
From
M. J. Wiser, N. Ribeck, R. E. Lenski, Long
term dynamic
s of adaptation in
asexual populations.
Science
. 342, 1364
Œ1367 (2013).
Reprinted with permission
from AAAS.
63 REFERENCES
64 REFERENCES
1.
R. E. Lenski, M. R. Rose, S. C. Simpson, S. C. Tadler, Long
term e
xperimental
evolution in Escherichia coli. I. Adaptation and divergence during 2,000
generations.
Am. Nat.
138, 1315
Œ1341 (1991).
2.
C. L. Burch, L. Chao, Evolution by small steps and rugged landscapes in the RNA
virus
6.
Genetics
. 151
, 921
Œ927 (1999).
3.
D. M. Weinreich, N. F. Delaney, M. A. DePristo, D. L. Hartl, Darwinian evolution
can follow only very few mutational paths to f
itter proteins.
Science
. 312, 111
Œ114
(2006).
4.
fitness landscapes.
Proc. Natl. Acad. Sci.
106
, 18638
Œ18643 (2009).
5.
H.H. Chou, H.
C. Chiu, N. F. Delaney, D. Segrè, C
. J. Marx, Diminishing Returns
Epistasis Among Beneficial Mutations Decelerates Adaptation.
Science
. 332, 1190
Œ1192 (2011).
6.
A. I. Khan, D. M. Dinh, D. Schneider, R. E. Lenski, T. F. Cooper, Negative Epistasis
Between Beneficial Mutations in an Evolving
Bacterial Population.
Science
. 332, 1193
Œ1196 (2011).
7.
I. G. Szendro, M. Schenk, J. Franke, J. Krug, J. A. G. M. de Visser, Quantitative
analyses of empirical fitness landscapes.
J. Stat. Mech. Theory Exp.
2013
, P01005
(2013).
8.
T. J. Kawecki
et al.
, Experimental evolution.
Trends Ecol. Evol.
27, 547
Œ560 (2012).
9.
R. E. Lenski, M. Travisano, Dynamics of adaptation and diversification: a 10,000
generation experiment with bacterial populations.
Proc. Natl. Acad. Sci.
91, 6808
Œ6814 (1994).
10.
J. E. B
arrick
et al.
, Genome evolution and adaptation in a long
term experiment
with Escherichia coli.
Nature
. 461, 1243
Œ1247 (2009).
11.
P. Sibani, M. Brandt, P. Alstrøm, Evolution and Extinction Dynamics in Rugged
Fitness Landscapes.
Int. J. Mod. Phys. B
. 12, 361
Œ391 (1998).
12.
P. Gerrish, R. Lenski, The fate of competing beneficial mutations in an asexual
population.
Genetica
. 102103
, 127
Œ144 (1998).
65 13.
M. Hegreness, N. Shoresh, D. Hartl, R. Kishony, An Equivalence Principle for the
Incorporation of Favor
able Mutations in Asexual Populations.
Science
. 311
, 1615
Œ1617 (2006).
14.
S.C. Park, J. Krug, Clonal interference in large populations.
Proc. Natl. Acad. Sci.
104, 18135
Œ18140 (2007).
15.
G. I. Lang
et al.
, Pervasive genetic hitchhiking and clonal inte
rference in forty
evolving yeast populations.
Nature
. 500
, 571
Œ574 (2013).
16.
S. Wielgoss
et al.
, Mutation rate dynamics in a bacterial population reflect tension
between adaptation and genetic load.
Proc. Natl. Acad. Sci.
110
, 222
Œ227 (2013).
17.
R. J.
Woods
et al.
, Second
Order Selection for Evolvability in a Large Escherichia
coli Population.
Science
. 331
, 1433
Œ1436 (2011).
18.
M. M. Desai, D. S. Fisher, A. W. Murray, The Speed of Evolution and Maintenance
of Variation in Asexual Populations.
Curr. B
iol.
17, 385
Œ394 (2007).
19.
F. Vasi, M. Travisano, R. E. Lenski, Long
term experimental evolution in
Escherichia coli. II. Changes in life
history traits during adaptation to a seasonal
environment.
Am. Nat.
144, 432
Œ456 (1994).
20.
R. G. Eagon, Pseudom
onas natriegens, a marine bacterium with a generation time
of less than 10 minutes.
J. Bacteriol.
83, 736
Œ737 (1962).
21.
S. Goyal
et al.
, Dynamic Mutation
ŒSelection Balance as an Evolutionary Attractor.
Genetics
. 191
, 1309
Œ1319 (2012).
22.
S. Wielgoss
et al.
, Mutation Rate Inferred From Synonymous Substitutions in a
LongTerm Evolution Experiment With Escherichia coli.
G3 Genes Genomes
Genet.
1, 183
Œ186 (2011).
23.
S. F. Elena, R. E. Lenski, Long
Term Experimental Evolution in Escherichia coli.
VII. Mec
hanisms Maintaining Genetic Variability Within Populations.
Evolution
. 51, 1058
Œ1067 (1997).
24.
C. E. Paquin, J. Adams, Relative fitness can decrease in evolving asexual
populations of S. cerevisiae.
Nature
. 306
, 368
Œ371 (1983).
25.
P. Daegelen, F. W. S
tudier, R. E. Lenski, S. Cure, J. F. Kim, Tracing ancestors and
relatives of Escherichia coli B, and the derivation of B strains REL606 and
BL21(DE3).
J. Mol. Biol.
394, 634
Œ643 (2009).
26.
Z. D. Blount, C. Z. Borland, R. E. Lenski, Historical contingency
and the evolution
of a key innovation in an experimental population of Escherichia coli.
Proc. Natl.
Acad. Sci.
105
, 7899
Œ7906 (2008).
66 27.
A. E. Raftery, Bayesian model selection in social research.
Sociol. Methodol.
25, 111Œ164 (1995).
28.
Z. D. Blount
, J. E. Barrick, C. J. Davidson, R. E. Lenski, Genomic analysis of a key
innovation in an experimental Escherichia coli population.
Nature
. 489
, 513
Œ518
(2012).
29.
B. H. Good, I. M. Rouzine, D. J. Balick, O. Hallatschek, M. M. Desai, Distribution of
fixe
d beneficial mutations and the rate of adaptation in asexual populations.
Proc.
Natl. Acad. Sci.
109
, 4950
Œ4955 (2012).
30.
S. Schi
Adaptive Asexual Evolution.
Genetics
. 189, 1361
Œ1375 (2011).
31.
Y. Kim, H. A. Orr, Adaptation in Sexuals vs. Asexuals: Clonal Interference and the
Fisher
Muller Model.
Genetics
. 171
, 1377
Œ1386 (2005).
32.
P. D. Sniegowski, P. J. Gerrish, Beneficial mutations and the dynamics of
adaptation in asexual populations.
Philos. Trans. R. Soc. Lond. B Biol. Sci.
365
, 1255
Œ1263 (2010).
33.
V. S. Cooper, D. Schneider, M. Blot, R. E. Lenski, Mech
anisms Causing Rapid and
Parallel Losses of Ribose Catabolism in Evolving Populations of Escherichia coli
B.
J. Bacteriol.
183
, 2834
Œ2841 (2001).
34.
M. T. Stanek, T. F. Cooper, R. E. Lenski, Identification and dynamics of a beneficial
mutation in a long
term evolution experiment with Escherichia coli.
BMC Evol. Biol.
9, 1Œ13 (2009).
35.
E. R. Moxon, P. B. Rainey, M. A. Nowak, R. E. Lenski, Adaptive evolution of highly
mutable loci in pathogenic bacteria.
Curr. Biol.
4, 24
Œ33 (1994).
36.
P. D. Sniegowsk
i, P. J. Gerrish, R. E. Lenski, Evolution of high mutation rates in
experimental populations of E. coli.
Nature
. 387
, 703
Œ705 (1997).
67 CHAPTER 3:
PERSISTENT AMONG
POPULATION VARIANCE IN FITNESS
IN A LONG
TERM EVOLUTION EXPERIMENT WITH
ESCHERICHIA C
OLI
Authors: Michael J. Wiser and Richard E. Lenski
Abstract
: Adaptive landscapes for real populations are difficult to characterize both
qualitatively and quantitatively, in part because individual natural populations
often occupy only one small region
of any given landscape. However, variation
in fitness across independent experimental populations can provide insight about
the adaptive landscapes on which they evolve. Previous research has
addressed how mean fitness changes in populations over time,
but there has
been much less work on how variation among populations changes over time.
Here, we investigate populations from a long
term evolution experiment (LTEE)
in
Escherichia
coli that evolved for 50,000 generations. We look collectively at
the popu
lations to measure the trajectory of the among
population variance in
fitness over that time. We further measure the relative fitness of pairs of evolving
populations, and compare these measurements to predictions based on each
individual population™s fit
ness relative to the ancestor. We find persistent among
population variance in fitness, providing evidence that the populations have not
converged
Œ and probably are not converging
Œ to the same fitness level in the
adaptive landscape. Our data indicate
a rich and complex adaptive landscape
even in a simple and nearly constant physical environment.
68 Introduction
: In order to understand how evolution will unfold over long time periods, it
is critical to understand how variance within the population change
s over time. If
variance within a population remains substantial, there will be sufficient
differences among individuals for natural selection to operate. However, if
variance diminishes to negligible levels, the rate of adaptation will slow
dramatically
, and perhaps even come to a stop.
As a thought experiment, consider a hypothetical population with a finite
number of possible beneficial mutations. Without some sort of change to the
environment causing additional mutations to be beneficial, the popu
lation will
inevitably reach a point of having incorporated the full set of beneficial mutations.
At this point, adaptation would essentially stop, except for second
order effects
such as the population moving to regions of the landscape that limit the
deleterious impact of new mutations
(1). In such a case, the process of evolution
is limited by the availability of b
eneficial mutations; once those mutations are
incorporated, the stock of potentially adaptive mutations has been exhausted,
and the population stagnates.
In real populations, there are at least two ways in which the supply of
beneficial mutations can be r
efreshed. One is epistasis: that is, some mutations
may not be beneficial at the moment, but would be beneficial on a different
genetic background
(2). Every step along an adaptive trajectory thus brings an
individual to a new area of the a
daptive landscape. While the number of
69 mutations that are beneficial at any one point is finite, that does not necessarily
mean that there are a finite total number of potentially adaptive mutations; in fact,
because genome length is variable, we cannot
a priori
list all possible genotypes.
A second possibility is that the environment could change. For example, the
availability of a new resource may favor mutations allowing use of that resource,
while absence of the resource prevents those mutations from
being beneficial
(3, 4). Changes in the environment can also involve biotic interactions; changes in
predator, prey, competitor, or mutualist populations can all alter what m
utations
will be favored within a given population
(5, 6). Looking beyond any single population, the variance among populations
provides insight into the topology of the underlying fitness landscape. While
each population in nature typically experiences a different environment, and
hence evolv
es on a different fitness landscape, theory and experiments allow us
to consider the case of initially identical populations evolving under identical
conditions. In the theoretical case described above, in which each individual
population has exhausted the
within
population variance, there are still two
possible outcomes for the among
population variance. First, if there is only a
single accessible fitness peak (a smooth landscape), then the among
population
variance should eventually drop to zero because
all of the populations approach
the same equilibrium mean fitness. Second, if multiple peaks exist and are
accessible (a rugged landscape), then different populations may reach different
fitness peaks. Because populations can become stuck at the sub
opti
mal peaks
in a rugged landscape, evolutionary outcomes will be more variable in rugged
70 adaptive landscapes than in smooth ones
(7), and the among
population
variance in fitness may remain positive indefinitely.
From previous work
(8), we already know that 50,000 generations has not
been enough tim
e for populations to reach fitness peaks in the LTEE. Instead,
the grand mean fitness is well described by a power law, of the form
where
w is fitness,
T is time in generations, and
a and
b are model parameters.
Because the power
law does not have an asymptotic limit
Œ fitness keeps
increasing indefinitely in this model
Œ it implies that the evolving populations are
so far from the top of whatever peaks might exist that it is not useful to think of
them reaching these peaks over 5
0,000 generations, or even over much longer
timescales. Therefore, we do not expect the among
population variance in
fitness to decline to zero in this time frame. However, we might be able to use
estimates of the among
population variance in fitness, an
d especially its
trajectory over time, to infer more information about how the set of population
fitness trajectories map onto the adaptive landscape. If the among
population
variance remained zero (i.e., its initial state given that the populations all s
tarted
from the same ancestor) throughout the evolution experiment, then this would
imply that either i) the populations are all following the same adaptive path, or ii)
the populations are on different paths that nonetheless map onto regions of the
adapti
ve landscape with parallel slopes. Conversely, if the among
population
variance continues to exceed zero indefinitely, then this implies that either i) the
different populations are following different paths, or ii) the timing of the
71 appearance of equival
ent beneficial mutations varies enough to sustain the
among
population variance in fitness.
Meanings of changes in variance
: To better understand evolutionary dynamics in this system, we examine
how the variance in fitness changes over evolutionary tim
e. By definition, the
populations in the LTEE have no variance in fitness at generation 0, as all
populations have the same fitness. Previous work showed that among
population variance in fitness increased over the first 10,000 generations in this
experi
ment. Lenski and Travisano (1994) suggested that this among
population
variance in fitness may have leveled off, but did not explicitly test whether an
asymptotic model provided a better fit to the data than did an unbounded model
(9). There are three hypothetical possibilities of how variance in fitness changes
after these first 10,000 generations: continued increase, constancy, or decrease
after the initial inc
rease.
One possibility is that among
population variance in fitness continues to
increase across the 50,000 generations of data. This possibility is most
consistent with populations continuing to explore different areas of the fitness
landscape, or else
exploring the same peak but at very different rates by climbing
faces with different slopes. Because our previous work showed no evidence for
populations reaching fitness peaks
(8), we hypothesize that this is the most likely
scenario.
72 Another possibility is that among
population variance in fitness could
increase for
some length of time before reaching a plateau. This possibility is
most consistent with a scenario in which different populations explore different
peaks in the adaptive landscape, eventually reaching peaks of different heights.
Because each population
would reach its own fitness maximum, variance in
fitness would stop increasing once all of the populations reached their fitness
peaks. Alternately, different populations could reach regions of the fitness
landscape where they experience the same slopes a
s each other, leading to a
consistent variance in fitness. Because we saw no evidence of populations
reaching fitness peaks in previous work
(8), and because truly parallel slopes are
mathematically unlikely, we do not expect this result to occur.
A third possibility is that after an initial increase, among
population
variance in fitness could decrease. This possibility is most consistent with
different populations exploring different routes to the same fitness peak
Œ or
different peaks of the same fitness
Œ and then converging together at the
peak(s). Again, because
our previous work showed no indication of populations
reaching fitness peaks, we do not expect this scenario to occur.
Study
System
: The Long
Term Evolution Experiment (LTEE) is an ongoing evolution
experiment, using populations of the bacteria
E. coli
. This experiment has been
described in detail in previous chapters, but a brief summary follows. The
experiment consists of twelve populations of
E. coli
, each descended from a
73 common ancestor. Six of the populations are Ara
+, capable of growing on the
sugar arabinose as their sole carbon source; the other six populations are Ara
, unable to grow on arabinose as a sole carbon source. Each population exists
within a separate 50 mL Erlenmeyer flask, with the cells growing in a growth
medium of Davis Minim
al salts supplemented with 25 mg/L glucose (DM 25).
Each day, a member of the research team transfers 0.1 mL of the previous day™s
culture into 9.9 mL of fresh DM 25, repeating separately for each of the twelve
populations, and places the new cultures in
a shaking incubator at 37
oC and 120
rpm. All populations grow rapidly enough to exhaust the available glucose prior
to the next transfer. Every 75 days, corresponding to every 500 generations,
frozen samples are made from each of the LTEE populations, a
nd stored at
Œ80 oC. Previous work:
Lenski and Travisano (1994) previously showed that in the LTEE, among
population variance in fitness increased during the first 10,000 generations of the
experiment
(9). By 10,000 generations, they calculate an among population
standard deviation in fitness of between 0.04 and 0.05. Interestingly, though they
fit a curve to the among
population standard deviation in fitness a
s a form of a
hyperbola, they did not test whether the hyperbola is a better fit than a linear
regression. This data, Figure 7 within the original paper, is reproduced here as
Figure 3.1
. 74 Figure 3.1:
Among
population standard deviation in fitness ov
er the first 10,000 generations across all
populations in the LTEE
. Data fro
m Lenski and Travisano (1994).
75 This figure has been replicated from the summary data available from
Lenski™s website
(10). Bec
ause only summary data is available, not all of the
analyses we will be performing on our other data are applicable to this data set.
However, it is clear that there is appreciable among population variance in fitness
during the first 10,000 generations o
f this experiment.
Fitness assays:
We performed fitness assays much as discussed previously (Chapters 1
and 2). Each of these assays has one, but only one, of two differences from the
Traditional method outlined in Chapter 1. One, in almost all cases,
we performed
competitions over the course of three days (roughly 20 generations) rather than
one day (roughly 6.67 generations). These additional generations allow greater
precision in fitness, but require that the two competing populations have fitness
differences no more than approximately 10%. In five individual measurements of
592 three day competitions, the plate from day 3 was uncountable due to error; in
these cases, we used the counts and dilution factor from day 2 instead, making
these two day (r
oughly 13.33 generation) competitions. In three additional
individual measurements, the plate from day 0 was uncountable; we excluded
these three measurements.
Statistical m
ethods:
To calculate the among
population variance in fitness, we followed the
procedure outlined in Sokal and Rohlf (1995)
(11). We treated each time point
76 separat
ely. Within each time point, we performed an ANOVA, with Population as
a random effect. From these ANOVAs, we subtracted the mean square error
term from the mean square population term. We divided this difference by the
number of replicate blocks. This
produces an estimate of the among
population
variance. To obtain an estimate of the among
population standard deviation, we
first preserved the sign of the variance estimate, and then calculated the square
root of the absolute value of the variance estim
ate. Because the among
population standard deviation in fitness is simply the square root of the among
population variance in fitness, broad
scale patterns (i.e. increases, decreases, or
consistency) will be consistent across the two calculations.
All st
atistical analyses were performed in R version 3.0.2
(12). Th
e local
smoothing function was fit with the loess() command.
Results and Discussion
: In Figure 3.1, we have replicated a figure from a previous study that
examined among
population standard deviation in fitness over the first 10,000
generations of the ex
periment. Before we look at additional data, it is worth
taking a moment to interpret these findings.
One immediate point to notice in Figure 3.1 is that the estimate of the
among
population standard deviation varies from one measured time point to
anoth
er. Part of this difference likely reflects real changes in the degree to which
different populations have achieved different levels of fitness over time. Part of
the difference, however, is due to measurement error. Indeed, this measurement
77 error can c
learly be seen in the estimates for generations 8,000 and 9,000, when
the estimated among
population standard deviation in fitness is negative. This
negative result is directly due to measurement error
Œ when the ANOVA mean
square error term is larger tha
n the mean square population term, the estimate
will be a negative number. The magnitude of these negative numbers, though,
can give us an indication of the size of this measurement error. The fact that the
majority of the positive estimates of among
pop
ulation standard deviation in
fitness (14 out of 18) are greater than the largest of the negative estimates is a
strong point in favor of the among population standard deviation being
appreciably greater than 0. Therefore, these populations are achieving
different
fitness values.
As an initial look at how among
population variance has changed in the
first 50,000 generations of the LTEE, we calculated the among population
standard deviation in fitness from the data set
(13) that formed the basis of
Chapter 2. That data is presented i
n Figure 3.2.
In Figure 3.2, we can see that among
population standard deviation in
fitness has remained mostly positive across the first 50,000 generations of
evolution in the LTEE. However, the relative number of negative estimates has
also increased
later on in the experiment; half of the estimates after 30,000
generations are negative. There are several possible causes of this.
78 Figure 3.2
: Among
population standard deviation in fitness calculated across all populations in the LTEE
. Data
from
Wiser et al (2013).
79 LTEE became hypermutators during the time period studied here
(8, 14, 15), with
an additional one becoming a hypermutator after it was already excluded
(16). These hypermutator populations will therefore have more diverse populations,
and consequently greater measurement error in population fitness.
Because our previous results have already shown that increas
es in the
mutation rate of a population lead to increases in fitness
(8), we would expect
that including populations that have become hypermutators at different times
would increase the among
population variance in fitness. We therefore choose
to restrict our analysis to just the six populations that maintained the ance
stral
mutation rate in order to make this a more conservative test. Figure 3.3 shows
the among
population standard deviation in fitness in the previous data, using
only the six populations that maintained the ancestral mutation rate through all
50,000 gen
erations. In these populations, we find that among
population
variance in fitness is positive in 28 of the 40 generations after generation 0, a
significant result (binomial test, one
tailed p = 0.008295).
To address lack of precision cause by low degre
es of replication, we
generated new data using a smaller number of time points, but a greater degree
of replication at each time point
Œ five replicate fitness measurements from each
population at each time point, rather than two. For this data set, data
from each
time point was collected separately, and thus there cannot be a Block effect in
the relevant ANOVA, as all replicates of a given population from a given
generation were conducted simultaneously. In Figure 3.4 we present this data
from just the s
ix populations that did not become hypermutators:
80 Figure 3.3:
Among
population standard deviation in fitness calculated across LTEE populations that did not
become hypermutators
. Data from Wiser et al (2013).
81
Figure 3.4:
Among
population standar
d deviation in fitness, calculated across LTEE populations that did not
become hypermutators
. Data are new to this study.
82 As we can see from Figure 3.4, increasing the replication level on measurements
decreases the relative frequency of negative estimat
es for the among
population
standard deviation in fitness. We find that eight of the eleven time points
produce positive among
population standard deviation estimates.
This is not
statistically significant (binomial test, one
tailed p=0.1133), although t
his test is
very conservative and suffers from a low statistical power. Although it would be
tempting to interpret the variance as declining in the latest generations, we
should be cautious about not over
interpreting the data. Much of the apparent
decli
ne is driven by the negative estimate at 50,000 generations. Overall, there
is substantial agreement between our data sets on the among
population
standard deviation in fitness early in the experiment, with decreasing precision in
these measurements as po
pulations deviate further from the ancestor.
Komologrov
Smirnov tests
: Although many of the individual time points considered are not,
themselves, statistically significant, we can still look for statistical significance in
the data set as a whole. Each
individual among
population variance in fitness
has an associated significance value, because the among
population variance is
calculated from an ANOVA table. Under a null distribution, we would expect the
cumulative relative frequency of p
values to be
equal to that p
value; in other
words, 30% of the p
values would be 0.3 or less, 65% of the p
values would be
0.65 or less, etc. The Kolmogorov
Smirnov test allows us to compare the
distribution of p
values from our series of ANOVAs to a null distribution
and
83 determine whether we have an excess of small p
values; that is, whether our
results as a whole are more significant than expected by chance. We have
chosen to use a Kolmogorov
Smirnov test, rather than calculating a False
Discovery Rate, as we are int
erested in whether there is overall evidence of a
significant among
population variance in fitness within this data, and are not
particularly interested in determining how many of the significant values are likely
to only appear significant due to chance.
Figure 3.5 shows the cumulative relative frequency of p
values for our
combined data set, considering only the six populations that maintained the
ancestral mutation rate. Many more of our p
values are at the small end of the
distribution
Œ particularly
under 0.2
Œ than would be expected by chance. This is
a highly significant result (Kolmogorov
Smirnov test, 2
tailed, D=0.3024,
p=0.0001239. From this, we can see that although individual time points often do
not show statistically significant among
population variance in fitness, the data
set as a whole does.
The same basic pattern holds for both of the two data sets considered
separately. Figure 3.6 shows the cumulative frequency of p
values for just the
Wiser et al (2013) data set, again considering
only the populations that
maintained the ancestral mutation rate. These data are highly significant
(Kolmogorov
Smirnov test, 2
tailed, D=0.2689, p=0.004812). We see the same
pattern in Figure 3.7 for the data set of higher replication but fewer time poi
nts.
These data are highly significant (Kolmogorov
Smirnov test, 2
tailed, D=0.514,
p=0.003187).
84 Figure 3.5:
Cumulative frequency of p values among ANOVAs used to
calculate among
population variance in fitness
. Points represent empirical
data; the
solid line at y = x shows the null expectation. Data is combined from
Wiser et al (2013) and new data for this study; populations that became
hypermutators were excluded.
85
Figure 3.6:
Cumulative frequency of p values among ANOVAs used to
calculate amo
ngpopulation variance in fitness
. Points represent empirical
data; the solid line at y=x shows the null expectation. Data from Wiser et al
(2013).
86 Figure 3.7:
Cumulative frequency of p values among ANOVAs used to
calculate among
population variance
in fitness
. Points represent empirical
data; the solid line at y=x shows the null expectation. Data are new for this
study.
87 Given that our data is highly significant in both individual data sets
considered separately, as well as in our combined data co
nsidered as a whole,
we conclude that there is significant evidence of among
population variance in
fitness within our data. This further strengthens our conclusion that our
populations are not converging at the top of a single peak in the adaptive
lands
cape.
Using population pairs to examine finer scale differences:
The preceding analyses show that there is a substantial among
population
variance in fitness across the first 50,000 generations of the LTEE. This variance
increases rapidly from 0 at the
start of the experiment to significant levels within
the first few thousand generations, and remains positive thereafter. These
analyses lack sufficient statistical power to state with confidence whether this
variance continues to increase or remains at a
constant level. However,
examining the among
population variance across a range of populations is not
the only way to look at differences that have evolved across different populations.
From our previous work, we have already established fitness traject
ories
for individual populations within the LTEE
(8). We also kno
w that our fitness
assays are most precise when our two competitors have similar fitnesses
(17). This poses a potential problem for accurately determining differences in fitness of
two evolved populations late in the experiment: each one is being compared to a
common ancestor, and thus each has a
n increasing measurement error.
Further, comparing two different populations to a common competitor to
88 determine which one is competitively superior assumes complete transitivity in
fitness; if A > B, and B > C, it assumes A > C, which may or may not be t
he case.
One obvious way to overcome these limitations is to compete different
evolved populations against each other. Competing populations directly against
each other, instead of competing each against a common competitor, avoid the
issue of error pro
pagation from non
transitivity. We would further expect
populations that have been evolving in the lab for the same number of
generations to have fitness values closer to each other than they would to their
common ancestor, inherently reducing the impact
of measurement error.
Additionally, two populations of similar fitness can be competed against each
over a larger number of generations, which increases the precision of our fitness
measurements.
We therefore chose to compete pairs of evolved populations
from each of
the time points involved in the individual population fitness trajectories. In order
to make our pairings as independent as possible, we assigned each population
to only a single pairing. By making the population pairings independent, we
reduce the capacity for one population to have excessive influence on our
findings
Œ similar patterns would be caused by similar evolutionary patterns,
rather than the effect of a single population on multiple different pairings.
For each of our pairings,
we need both an Ara
+ and an Ara
 population.
Because our populations are labeled as Ara
1, Ara
2,–Ara
6, Ara+1,
Ara+2–Ara+6, the simplest approach is to compete each Ara
+ population
against the equivalently numbered Ara
 population. This is not strictly
necessary
89 Œ there is nothing in particular linking Ara+1 to Ara
1 more than to Ara
5. Three
of our populations become difficult to work with in later time points: Ara
2 and
Ara+6 stop growing reliably on the TA medium and Ara
3 has a substantial
populati
on increase due to its ability to metabolize citrate in the presence of
oxygen in later generations
(3). We competed Ara
2 against Ara+2 for t
he first
30,000 generations, the time period in which Ara
2 grows reliably on TA plates.
Similarly, we only consider competitions of Ara
3 against Ara+3 for the first
32,000 generations of the experiment, as this is before the Cit
+ population
expansion. W
e conducted competitions over the course of 3 days (roughly 20
generations), with two replicate measurements at each generation. We
structured the replicates such that one measurement for each generation
collectively formed a block, and we repeated this b
lock a second time.
For the pairing of Ara+1 v Ara
1, we chose to expand the number of
generations under consideration. In the other pairings we looked at as many of
the 40 distinct time points as possible that were used to establish individual
populatio
n fitness trajectories (see Chapter 2). For this pairing, we looked at each
500 generation interval across the first 50,000 generations. Because 101 unique
time points were too many to include in a single block, we had to split the time
points into two se
parate collections. We chose to do so in the form of one set of
every generation evenly divisible by 1,000, and a second set of those
generations ending in 500. This interweaving, as opposed to splitting
populations between early and late generations, re
duces the likelihood of a
90 systematic temporal difference between blocks having a significant effect on the
pattern of fitness change.
Figure 3.8 shows the fitness of population Ara
1 relative to population
Ara+1 for the first 50,000 generations of the L
TEE. All of the graphs in this
section follow the same basic format. The light colored, open symbols represent
each individual measurement of fitness. The dark, filled symbols represent the
average at each individual generation. The line is a local smo
othing function,
finding the average trend through nearby points, without imposing a specific
mathematical relationship across the data set as a whole. From this figure, we
can see that Ara
1 quickly gained a lead over Ara+1 in fitness, rising to about a
5% advantage by 25,000 generations and maintaining that lead through 50,000
generations.
It is not surprising that Ara+1 is lagging in fitness
Œ previous findings
(Chapter 2) showed that it has a substantially lower fitness than all other
populations
Œ but a comparison directly against another population allows us to
glean additional information. For one, the precision in these measurements is
substantially greater than what we were able to obtain by competing the evolved
populations against their ances
tor. Figure 3.9 places the data from Figure 3.8 in
context with our expectations. The light
colored, solid line is the ratio of the
fitness of Ara
1 to Ara+1 from the curves fit in Chapter 2; the dashed lines to
either side show the 95% confidence interv
al of this expectation, obtained
through bootstrapping the data.
91 Figure 3.8:
Ara
1 v Ara+1
. Open symbols show each measured value of relative fitness from head
tohead competitions.
The solid symbols are the mean at each time point. The dashed gray
line at 1.00 is the level at which the competitors
have equal fitness. The solid purple curve is a local smoothing function showing the general trend of the data.
92 Figure 3.9:
Ara
1 v Ara+1
. The solid light blue line shows the mean fitness of Ara
1 re
lative to Ara+1, based on 10,000
bootstrap re
samplings of the data from Wiser et al (2013). Dashed lines show the corresponding, non
parametric 95%
confidence. Open symbols show each relative fitness measured in a head
tohead competition. The solid sy
mbols are
the mean at each time point. The dashed gray line at 1.00 is the level at which the competitors have equal fitness.
93 As we can see, the actual data differ from the expectations in a number of
ways. The expectations suggest Ara+1 would have a
sizeable advantage at the
earliest time points, while the empirical data show much less, or possibly no
advantage for Ara+1 even at the earliest time points. This is influenced by
constraints on model fits. All of the populations start out with the same
fitness at
generation 0, and all are fit to the same mathematical function, though with
different parameter values. In order for a population to have a steeper increase
in fitness later in the experiment, it must by necessity have a shallower increase
in fitness early in the experiment. Therefore, the fact that Ara+1 shows a slow
rate of increase in fitness in later time points requires it to have a relatively rapid
increase in fitness early, which causes the prediction that it would have a higher
fitnes
s than anything it competed against at these early time points. The fact that
our expectations are calculated from a ratio of two smooth power law curves
means that we can only predict zero, one, or two changes in which population is
gaining fitness more r
apidly in any particular pairing. Direct measurements of
pairs of evolved populations could have many more changes in which population
is gaining fitness more rapidly. The expectations also suggest that Ara
1 would
end up with a larger advantage over Ar
a+1 than it achieved, with a mean
advantage in the 40,000+ generation range that is more consistent with the
highest individual measurements than the means at each generation. Further,
the expectations show a widening uncertainty as time progresses, while
the
visible spread in measurements of fitness differential between the two
populations remain roughly constant.
94 It is also noteworthy that the fitness differential between these populations
is as low as it is. Even Ara+1, the population that has seen
the least gain in
fitness by 50,000 generations, is roughly 40% more fit than its generation 0
ancestor. That the fitness differential between the populations is only ~5%
means that there has been a striking degree of parallelism in fitness changes
across
replicate populations. Figure 3.10 shows this in stark contrast. The two
individual population fitness trajectories each increase markedly from the
ancestor, while staying relatively close to each other. As a consequence, both
the expectation for, and
the measured values of, their fitness relative to each
other remains much closer to 1. If anything, the measured population pair
relative fitness is closer to 1 than the expectation is, suggesting that these
populations have more similar fitness trajector
ies than each would appear from
the individual trajectories against the ancestor.
The pairing of populations Ara
4 and Ara+4, shown in Figure 3.11
, displays a somewhat different pattern than Ara
1 and Ara+1. In this pairing,
Ara+4 has a notable early lea
d of roughly 5% by 5,000 generation, but it is only
temporary. Ara
4 catches up by generation 10,000, and then takes a lead of its
own of roughly 2
3% for the next several tens of thousands of generations. From
Figure 3.12, we can see that this pairing b
ehaves largely as expected, with the
majority of measured relative fitness points falling within the confidence interval
of the expectations. This is notable, because as we can see from Figure 3.13,
populations Ara
4 and Ara+4 have individual population f
itness trajectories that
are much more similar to each other than Ara
1 and Ara+1 do. Yet despite these
95 Figure 3.10:
Ara
1 v Ara+1
. The solid magenta curve shows the mean Power Law fitness trajectory for population Ara
1.
The solid green curve shows
the mean Power Law fitness trajectory for population Ara+1. The solid light blue curve
shows the mean fitness of Ara
1 relative to Ara+1. Each of these lines is the mean across 10,000 bootstrap re
samplings
of the data from Wiser et al (2013). Dashed li
nes show corresponding non
parametric 95% confidence intervals around
the solid curves. Open symbols show each relative fitness measured in a head
tohead competition. The solid symbols
are the mean at each time point.
96 Figure 3.11:
Ara
4 v Ara+4
. Open symbols show each measured value of relative fitness from head
tohead
competitions. The solid symbols are the mean at each time point. The dashed gray line at 1.00 is the level at which the
competitors have equal fitness. The solid purple curve is
a local smoothing function showing the general trend of the
data.
97 Figure 3.12:
Ara
4 v Ara+4
. The solid light blue line shows the mean fitness of Ara
4 relative to Ara+4, based on 10,000
bootstrap re
samplings of the data from Wiser et al (2013). Da
shed lines show the corresponding, non
parametric 95%
confidence. Open symbols show each relative fitness measured in a head
tohead competition. The solid symbols are
the mean at each time point. The dashed gray line at 1.00 is the level at which the c
ompetitors have equal fitness.
98 Figure 3.13:
Ara
4 v Ara+4
. The solid magenta curve shows the mean Power Law fitness trajectory for population Ara
4.
The solid green curve shows the mean Power Law fitness trajectory for population Ara+4. The solid l
ight blue curve
shows the mean fitness of Ara
4 relative to Ara+4. Each of these lines is the mean across 10,000 bootstrap re
samplings
of the data from Wiser et al (2013). Dashed lines show corresponding non
parametric 95% confidence intervals around
the solid curves. Open symbols show each relative fitness measured in a head
tohead competition. The solid symbols
are the mean at each time point.
99 small differences between populations, compared to both of their substantial
differences from the ancesto
r, we observe essentially the expected pattern in the
empirical data.
Figure 3.14 shows the pairing of populations Ara
5 and Ara+5. Much
more so than the previous pairings, this one shows marked change over time.
For the first 10
 to 15,000 generations,
Ara+5 has a substantial and widening
advantage over Ara
5, reaching roughly a 5% advantage around generation
15,000. At this point, however, Ara
5 begins to rise in relative fitness, reaching
roughly equal fitness to Ara+5 by approximately generation 35
,000, and
subsequently surpassing Ara
5, reaching a roughly 3% advantage over Ara+5 by
50,000 generations. From Figure 3.15, we can see that this broad
strokes
pattern is very similar to what we expect
Œ an initial lead for population Ara+5,
followed by p
opulation Ara
5 catching up
Œ but the measured fitness difference
between the two populations is typically tilted more in favor of population Ara
5 than expected. As we can see in Figure 3.16, in the latest generations the two
populations are expected to
have such similar fitnesses that the confidence
intervals overlap, though with population Ara+5 having the higher mean estimate.
However, the direct competition data show population Ara
5 having a slightly
higher mean estimate for fitness.
The pattern fo
r the pairing of Ara
5 and Ara+5 is particularly striking,
because it demonstrates how different populations can reach very different local
regions of the adaptive landscape. In Chapter 2, we saw that most of the
population in the LTEE have fitness trajec
tories that follow power laws, including
100 Figure 3.14:
Ara
5 v Ara+5
. Open symbols show each measured value of relative fitness from head
tohead competitions.
The solid symbols are the mean at each time point. The dashed gray line at 1.00 is the le
vel at which the competitors
have equal fitness. The solid purple curve is a local smoothing function showing the general trend of the data.
101 Figure 3.15:
Ara
5 v Ara+5
. The solid light blue line shows the mean fitness of Ara
5 relative to Ara+5, bas
ed on 10,000
bootstrap re
samplings of the data from Wiser et al (2013). Dashed lines show the corresponding, non
parametric 95%
confidence. Open symbols show each relative fitness measured in a head
tohead competition. The solid symbols are
the mean a
t each time point. The dashed gray line at 1.00 is the level at which the competitors have equal fitness.
102 Figure 3.16:
Ara
5 v Ara+5
. The solid magenta curve shows the mean Power Law fitness trajectory for population Ara
5.
The solid green curve sh
ows the mean Power Law fitness trajectory for population Ara+5. The solid light blue curve
shows the mean fitness of Ara
5 relative to Ara+5. Each of these lines is the mean across 10,000 bootstrap re
samplings
of the data from Wiser et al (2013). Dashe
d lines show corresponding non
parametric 95% confidence intervals around
the solid curves. Open symbols show each relative fitness measured in a head
tohead competition. The solid symbols
are the mean at each time point.
103 both Ara
5 and Ara+5. Differ
ent parameter values within the power law can lead
to different populations improving at different rates at various points in their
evolution, but each individual population would be expected to follow a relatively
simple trajectory in fitness over time.
However, measuring the populations
directly against each other can show cases like th
is pairing, where one of the two
gets an early lead, but that lead is subsequently lost as the initially
trailing
population catches up and later surpasses the one with th
e faster start.
Figure 3.17 shows the pairing of Ara
2 and Ara+2. This pairing only
extends through the first 30,000 generations, before population Ara
2 no longer
grows reliably on TA plates. In this pair, Ara
2 takes a rapid early lead, climbing
to ap
proximately a 7
8% fitness advantage by 5,000 generations. This trend
then reverses, with the two populations reaching approximately equal fitness by
generations 15,000. Subsequently, population Ara
2 regains a lead, reaching a
roughly 5% fitness advanta
ge over Ara+2 by generation 30,000. Interestingly,
this is not even close to the pattern we expected, as shown in Figure 3.18. Our
expectation is for an initial advantage in population Ara+2, gradually shrinking or
even disappearing by 50,000 generations
. This pairing shows an unusually wide
confidence interval in its expectation. This is likely influenced by how much of
their individual fitness trajectories overlap; as we can see in Figure 3.19, the
confidence intervals of Ara
2 and Ara+2™s individual
fitness trajectories overlap by
15,000 generations, and the mean values lie within each other™s confidence
intervals by 25,000 generations.
104
Figure 3.17:
Ara
2 v Ara+2
. Open symbols show each measured value of relative fitness from head
tohead compet
itions.
The solid symbols are the mean at each time point. The dashed gray line at 1.00 is the level at which the competitors
have equal fitness. The solid purple curve is a local smoothing function showing the general trend of the data.
105
Figure 3.
18:
Ara
2 v Ara+2
. The solid light blue line shows the mean fitness of Ara
2 relative to Ara+2, based on 10,000
bootstrap re
samplings of the data from Wiser et al (2013). Dashed lines show the corresponding, non
parametric 95%
confidence. Open symbols s
how each relative fitness measured in a head
tohead competition. The solid symbols are
the mean at each time point. The dashed gray line at 1.00 is the level at which the competitors have equal fitness.
106 Figure 3.19:
Ara
2 v Ara+2
. The solid magent
a curve shows the mean Power Law fitness trajectory for population Ara
2.
The solid green curve shows the mean Power Law fitness trajectory for population Ara+2. The solid light blue curve
shows the mean fitness of Ara
2 relative to Ara+2. Each of these
lines is the mean across 10,000 bootstrap re
samplings
of the data from Wiser et al (2013). Dashed lines show corresponding non
parametric 95% confidence intervals around
the solid curves. Open symbols show each relative fitness measured in a head
tohead competition. The solid symbols
are the mean at each time point.
107 Figure 3.20 shows the pairing of populations Ara
3 and Ara+3. Between
generations 32,000 and 34,000, population Ara
3 went through a massive
population expansion, as it developed the abi
lity to metabolize citrate in the
presence of oxygen. Citrate is present in our growth medium DM 25 at a high
enough concentration that this population has a roughly 7
fold larger population
size than other populations in the LTEE
(16). T
herefore, we have restricted our
analysis to just those time points before the citrate
utilizing population expansion.
In Figure 3.20, we can see that popu
lation Ara+3 has a steadily widening fitness
advantage over population Ara
3 for the first 25,000 generations, at which point it
maintains a roughly 10% fitness advantage over population Ara
3 through 32,000
generations.
Figure 3.21 shows how well these m
easurements match our predictions.
From the trajectories of populations Ara
3 and Ara+3 competed against the
ancestor, we would expect population Ara+3 to have a substantial and continual
fitness advantage over population Ara
3. Our data reflect this, wi
th many of the
individual measurements falling within the 95% confidence interval of our
expectation. The deviations from expectation are small relative to the two
population™s individual trajectories, as is shown in Figure 3.22.
Looking across the set
of these population pairs, we see that fitness
differences between a chosen pair of populations do not always accumulate
monotonically
Œ a population may increase its fitness relative to another for a
while, only to lose that advantage, and then gain it b
ack again. Nor do the
changes in relative fitness always perfectly track those we would expect based
108
Figure 3.20:
Ara
3 v Ara+3
. Open symbols show each measured value of relative fitness from head
tohead competitions.
The solid symbols are the me
an at each time point. The dashed gray line at 1.00 is the level at which the competitors
have equal fitness. The solid purple curve is a local smoothing function showing the general trend of the data.
109 Figure 3.21:
Ara
3 v Ara+3
. The solid light blue
line shows the mean fitness of Ara
3 relative to Ara+3, based on 10,000
bootstrap re
samplings of the data from Wiser et al (2013). Dashed lines show the corresponding, non
parametric 95%
confidence. Open symbols show each relative fitness measured in a
head
tohead competition. The solid symbols are
the mean at each time point. The dashed gray line at 1.00 is the level at which the competitors have equal fitness.
110 Figure 3.22:
Ara
3 v Ara+3
: The solid magenta curve shows the mean Power Law fitness
trajectory for population Ara
3.
The solid green curve shows the mean Power Law fitness trajectory for population Ara+3. The solid light blue curve
shows the mean fitness of Ara
3 relative to Ara+3. Each of these lines is the mean across 10,000 bootstr
ap re
samplings
of the data from Wiser et al (2013). Dashed lines show corresponding non
parametric 95% confidence intervals around
the solid curves. Open symbols show each relative fitness measured in a head
tohead competition. The solid symbols
are t
he mean at each time point.
111 on how each population diverged from a common competitor. Instead, the population
pairs show more dynamic changes in relative fitness than our simple expectations
predict. This dynamism suggests that it is unlikely that each
of our populations is
climbing a parallel slope in the adaptive landscape, whether on the same peak, or on
different peaks. Combined with our data from chapter 2 that the populations in the
LTEE are so far from reaching fitness peaks that we
cannot detect
evidence of an
eventual asymptote to fitness, we are left with two possibilities of how the populations
are traversing the fitness landscape. One, they may be climbing different peaks, with
trajectories that are not parallel in fitness. In this scenario
, they will not necessarily
converge to similar fitness values at any point in the future, as the different peaks may
be of different heights. Two, they may be climbing the same peak, but along very
different paths. In this scenario, if the populations e
ventually get close enough to the
top of this peak, we would expect population fitness to converge, and among
population
variance in fitness to decline. We do not yet see evidence that this second scenario is
occurring, but recognize that we may be so far
from the top of whatever peak(s) the
populations are climbing that we cannot rule out this possibility either.
Summary:
We examined several sources of data to look for patterns in among
population
variance in fitness within a long
term evolution experim
ent and, from those patterns,
learn about the adaptive landscape for populations in this experiment. We first
analyzed data from Wiser et al (2013), calculating among
population variance estimate
at each time point considered. We also gathered new data f
rom a smaller number of
112 time points, but with greater replication. These data show an among
population
standard deviation in fitness consistent with what had been observed through 10,000
generations in Lenski and Travisano (1994). Our data lack sufficien
t statistical power to
state whether the among
population standard deviation in fitness continues to rise or
reaches a plateau after this point.
In order to estimate the among
population variance in fitness, we perform an
ANOVA at each time point measur
ed. Though most of these ANOVAs are not
individually significant, there is a significant overrepresentation of small p values within
the set, demonstrating that there is still significant among
population variance in fitness.
We next competed pairs of ev
olved populations against each other, and compared
those results to the expectations derived from each population™s individual fitness
trajectory compared to the generation 0 ancestor. We find that the population pair
estimates largely follow the predicte
d trends, but that these greater precision
measurements are often subtly different from the expectations, allowing us to observe
finer
scale patterns of relative fitness change between populations.
From previous work, we already knew that populations with
in this experiment had
not reached a fitness peak by 50,000 generations. Whether peaks even exist in real
adaptive landscapes is itself a contested point; real populations face selection on far
more than two dimensions at once, and our intuition about geo
metry in 3
dimensional
space may not apply to much higher
dimensional spaces
(18, 19). Nevertheless, it is
possible to use information about the var
iance among populations within an evolution
experiment to infer the likely topology of the adaptive landscape experienced by the
populations.
113 Conclusions
: Among
population variance in fitness remains at appreciable levels even after
50,000 generations of
evolution. This is further evidence that populations are not
converging at the top of a single fitness peak in their adaptive landscape. Although we
lack sufficient statistical power to define a function of among
population variance over
evolutionary ti
me, we can state firmly that it is not being driven down to insignificant
levels over the course of 50,000 generations. However, even in the absence of clear
patterns of how variance is changing, looking at the cumulative distribution of
significance valu
es in the ANOVAs used to calculate among
population variance
demonstrates that there is significant signal of persistent variance despite the noise.
Broad
scale patterns of fitness differences in populations are consistent with
expectations. Fitness is l
argely transitive in this system: if A > B, and B > C, then A > C
in the majority of cases. Populations that show generally larger gains in fitness
compared to their ancestor over long periods of time also show higher relative fitness
when competed direct
ly against evolved populations with smaller gains relative to the
ancestor. This is in spite of known cases of frequency
dependent fitness within
individual populations
(20Œ22), which could easily disrupt transitivity of fitness.
Divergence in fitness between different populations is most often quite low
compared to their divergence in fitness from the ancestor. This allows for
measurements comparing evolved population pairs to extend over additional
generations, and consequently reach higher levels of precision. It also demonstrates a
substantial degree
of parallelism in fitness
Œ which is, essentially, an integrated
114 measurement of competitive ability within a given environment
Œ despite populations
being isolated for an extended period of time and having many differences in specific
mutations.
Competi
tions between evolved populations reveal greater detail than can be
observed from populations competing against their ancestor. The empirically
measured
relative fitness of a pair of evolved populations is most often near what the expected
value is based
on their fitness trajectories relative to the ancestor. However, this
measured relative fitness is still often outside the confidence interval of expectations.
Further, these theoretical expectations are constrained to have relatively simple
dynamics. A
ctual population pair measurements often show more complex dynamics,
such as having more inflection points, or abrupt changes in slope than the expectations.
Our power law models explain large
scale changes in fitness both across and within
populations, b
ut individual populations have time frames in which their actual fitness
either accelerates or decelerates relative to the power law, and competitions between
evolved populations can reveal these deviations from expectation.
Future Work
: The material in
this chapter is almost exclusively empirical and statistical. A
collaborator, Noah Ribeck, is working on simulating evolving populations using the
population genetics framework published in Wiser et al (2013). We plan to compare our
empirical measurement
s of variance over time to models in which the parameter that
describes diminishing
returns epistasis parameter is either constant or changes over
time as a population encounters different regions of the genetic space that underlies the
115 fitness landscape.
These models will provide a description of how we should expect the
among
population variance in fitness to change over time, to which we can then
compare our empirical measurements.
Acknowledgements
: We thank Caroline Turner, Alita Burmeister, Noah Rib
eck, Rohan Maddamsetti, Anya
Vostinar, and Brian Goldman for discussion and feedback in the drafting of this chapter.
We also thank Neerja Hajela for technical assistance. This work was supported, in part,
by an NSF grant (DEB
1451740) and by the BEACON
Center for the Study of Evolution
in Action (NSF Cooperative Agreement DBI
0939454).
116 REFERENCES
117 REFERENCES
1.
C. O. Wilke, J. L. Wang, C. Ofria, R. E. Lenski, C. Adami, Evolution of digital
organi
sms at high mutation rates leads to survival of the flattest.
Nature
. 412
, 331
Œ333 (2001).
2.
P. C. Phillips, Epistasis
 the essential role of gene interactions in the structure and
evolution of genetic systems.
Nat Rev Genet
. 9, 855
Œ867 (2008).
3.
Z.
D. Blount, C. Z. Borland, R. E. Lenski, Historical contingency and the evolution
of a key innovation in an experimental population of Escherichia coli.
Proc. Natl.
Acad. Sci.
105
, 7899
Œ7906 (2008).
4.
S. Kinoshita, S. Kageyama, K. Iba, Y. Yamada, H. Okada
, Utilization of a Cyclic
Aminocaproic Acid by Achromobacter guttatus KI
72.
Agric. Biol. Chem.
39, 1219
Œ1223 (1975).
5.
B. Koskella, J. N. Thompson, Gail M. Preston, A. Buckling, Local Biotic
Environment Shapes the Spatial
Scale of Bacteriophage Adaptation to Bacteria.
Am. Nat.
177, 440
Œ451 (2011).
6.
M. C. Urban, P. L. Zarnetske, D. K. Skelly, Moving forward: dispersal and species
interactions determine biotic responses to climate change.
Ann. N. Y. Acad. Sci.
1297
, 44
Œ60 (2013).
7.
A. H. Melnyk, R. Kassen, Adaptive Landscapes in Evolving Populations of
Psuedomonas fluorescens.
Evolution
. 65, 3048
Œ3059 (2011).
8.
M. J. Wiser, N. Ribeck, R. E. Lenski, Long
term dynamics of adaptation in asexual
populations.
Science
. 342, 1364
Œ1367 (2013).
9.
R. E. Lenski, M. Travisano, Dynamics of adaptation and diversification: a 10,000
generation experiment with bacterial populations.
Proc. Natl. Acad. Sci.
91, 6808
Œ6814 (1994).
10.
Relative Fitness Data Through Generation 10,000, (ava
ilable at
http://myxo.css.msu.edu/ecoli/relfit.html).
11.
R. Sokal, F. J. Rohlf,
Biometry
(W. H. Freeman and Company, New York, ed. 3rd,
1995).
12.
R Core Team,
R: A language and environment for statistical computing
(R
Foundation for Statistical Compu
ting, Vienna, Austria, 2013; http://www.R
project.org/).
118 13.
M. J. Wiser, N. Ribeck, R. E. Lenski, Data from: Long
term dymanics of adaptation
in asexual populations. (2013), (available at
http://dx.doi.org/10.5061/dryad.0hc2m).
14.
P. D. Sniegowski, P.
J. Gerrish, R. E. Lenski, Evolution of high mutation rates in
experimental populations of E. coli.
Nature
. 387
, 703
Œ705 (1997).
15.
J. E. Barrick
et al.
, Genome evolution and adaptation in a long
term experiment
with Escherichia coli.
Nature
. 461, 1243
Œ1247 (2009).
16.
Z. D. Blount, J. E. Barrick, C. J. Davidson, R. E. Lenski, Genomic analysis of a key
innovation in an experimental Escherichia coli population.
Nature
. 489
, 513
Œ518
(2012).
17.
M. J. Wiser, R. E. Lenski, A Comparison of Methods to Measure
Fitness in
Escherichia coli.
PLoS ONE
. 10, e0126210 (2015).
18.
S. Gavrilets, Evolution and speciation on holey adaptive landscapes.
Trends Ecol.
Evol.
12, 307
Œ312 (1997).
19.
D. M. McCandlish, Visualizing Fitness Landscapes.
Evolution
. 65, 1544
Œ1558
(20
11).
20.
R. Maddamsetti, R. E. Lenski, J. E. Barrick, Adaptation, Clonal Interference, and
Frequency
Dependent Interactions in a Long
Term Evolution Experiment with
Escherichia coli.
Genetics
. 200, 619
Œ631 (2015).
21.
D. E. Rozen, R. E. Lenski, Long
Term
Experimental Evolution in Escherichia coli.
VIII. Dynamics of a Balanced Polymorphism.
Am. Nat.
155, 24
Œ35 (2000).
22.
N. Ribeck, R. E. Lenski, Modeling and quantifying frequency
dependent fitness in
microbial populations with cross
feeding interactions.
Evolution
. 69, 1313
Œ1320
(2015).
119 CHAPTER 4:
LONG
TERM DYNAMICS OF ADAPTATION IN ASEXUAL
DIGITAL POPULATIONS.
Authors: Michael J. Wiser, David M. Bryson, Charles Ofria, and Richard E.
Lenski
Abstract
: Previous work has shown that experimental evoluti
on populations of
bacteria exhibit power law dynamics, implying that improvements will continue
indefinitely. Computational systems offer us the chance to study evolving
populations for more generations than are ever feasible in microbial experimental
evo
lution studies. Here we evolve populations of digital organisms in Avida for
either 200,000 or 1,000,000 generations, across three different environments.
We find that in both the most complex and the simplest of these environments,
fitness obeys power l
aw dynamics. In the intermediate case, fitness is better
described by a hyperbolic model, but fitness still increases over long time scales.
Our work suggests that power law fitness dynamics may be a general feature of
evolving systems.
Introduction
: Wiser et al (2013) previously showed that in populations of
Escherichia
coli
evolved for 50,000 generation, fitness over time exhibited power law
dynamics
(1). The long
term evolution experiment (LTEE) from which those data
120 are derived is the biological experimental evolution study that has the run for the
largest number
of generations
(2). It would therefore app
ear that we cannot
investigate whether similar patterns arise in other systems. However, explicitly
biological experiments are not the only ones that can provide insight into
evolutionary dynamics.
Artificial systems allow us to study whether certain p
roperties are shared
across evolving systems, independent of the details of cellular machinery. This
has been happening for decades. John Maynard Smith (1992) stated ﬁSo far, we
have been able to study only one evolving system and cannot wait for interst
ellar
flight to provide us with a second. If we want to discover generalizations about
evolving systems, we will have to look at artificial ones.ﬂ
(3). We therefore turn to
a computational system of evolving populations, and ask whether this system
exhibits similar patterns of fitness over evolutionary time as the LTEE.
Study S
ystem
: We conduc
ted computational evolution experiments with the digital
evolution software platform Avida. This platform has been detailed extensively
elsewhere
(4), but a brief summary follows. Organisms are self
replicating
asexual computer programs, composed of sequences of instructions. Users
define a mutation rate, contro
lling the per
site probability that a new organism will
be different from its parent. As with biological organisms, these mutations may
be beneficial, neutral, deleterious, or lethal. Organisms within Avida compete for
space in their virtual world with o
ther organisms. Additionally, in most
121 environments, organisms compete with each other for resources by completing
tasks that are rewarded with additional CPU cycles. These extra CPU cycles
allow organisms to copy themselves faster than less fit competito
rs.
Evolution by natural selection is substrate neutral; any system of
organisms that exhibits variation, inheritance, selection, and time across
generations will undergo evolution by natural selection
(5). This applies to
artificial life, as well as natural organisms. Populations in Avida gain variation
through mutation, organisms inherit parental variations, and the environment
imposes a selective pres
sure. Because experiments in Avida extend across
many generations, Avida thus meets all of the criteria for evolution by natural
selection. Avida is not merely a simulation of evolution, but an instance of it.
Fitness in Avida is calculated as Execution
Rate
1 divided by Generation
Length
2, the amount of time it takes an organism to copy itself. Organisms thus
can increase their fitness either by executing instructions more rapidly
(increasing their Execution Rate) or by requiring fewer executed instruct
ions to
replicate (reducing their Generation Length). Fitness therefore measures the
number of offspring produced in a given amount of time.
Experimental conditions:
We performed experiments in three different environmental reward
regimes: No Task, Logi
c9, and Logic
77. In the No Task environment, there
were no tasks that organisms could perform to gain additional CPU cycles. In
1 Listed as Merit in the Avida data files
2 Listed as Gestation Time in the Avida data files
122 this environment, competition between organisms was solely for space. In the
Logic
9 environment, nine different one
 or tw
oinput Boolean logic tasks are
rewarded. More complex tasks are given greater rewards; the simplest tasks
reward the organism by doubling the Execution Rate, while the most complex
task rewards the organism by multiplying Execution Rate by 32. In the Log
ic77 environment, 77 different one
, two
, or three
input Boolean logic tasks are
rewarded. Each task performed doubles the organism™s Execution Rate,
regardless of complexity of the task.
In each of the three environments, we conducted experiments for
a defined number of generations. We ran populations for 200,000
1 generations
in the Logic
77 and Logic
9 environments, and 1,000,000
1 generations in the
No Task environment.
Statistical m
ethods:
We per
formed all statistical analyses in R version 3.0.2
(6). We calc
ulated
relative fitness by dividing population fitness by the fitness of the ancestor. When
necessary (in the Logic
77 and Logic
9 environments), we transformed fitness as
log 2 fitness; means across replicates were calculated after transformation. We
fit linear models with the lm() command, and we fit non
linear models with the
nls() command. We calculated posterior odds ratios from difference in BIC value,
according to Raftery (1995)
(7). 123 Results and Discussion
: Logic
77 Environment
: Of the environments that we tested, the Logic
77 environment is the most
complex. This complexity makes the Logic
77 environment the most like
biol
ogical environments, where complexity is a common. Even in extremely
simple laboratory environments, different organisms in a population can
specialize on different resources
(8), and there are many internal cellular
processes that can be optimized.
We first tested whether evolution in this environment reaches an optimum.
Previous resea
rch in Avida has generally run for durations of
150,000 Updates
(9Œ12) Œ an interna
l measure of time within Avida
Œ which corresponds to less
than 15,000 generations. Because the rate of adaptation slows with time, other
researchers have concluded that populations are approaching a fitness peak
(10, 13). By looking at the same data over a range of time scales, we can examine
whether the appearance of an early plateau actually signals a halt in the adaptive
process. Figure 4.1 show the mean fitness across 20 replicate runs in the Logic
77 environment.
Different panels in the figure show the same data examined
over different numbers of generations. The dashed vertical lines show the end
points of previous panels. As we can see, the curve has the same basic shape
in each of the panels. What appears to
be a plateau in one panel is revealed to
be part of the upward trajectory in a later panel. In fact, the appearance of a
plateau is an artifact of sampling. Were the run to extend over considerably
124 Figure 4.1:
Fitness over time in the Logic
77 enviro
nment
. The solid (black)
curve is the mean log 2 relative fitness across 20 replicates. Different panels
show different numbers of generations. Dashed (green) vertical lines show ends
of previous panels.
125 more generations, we would expect the new apparen
t plateau to be of a higher
value, and reached later in the run. This is evidence that the populations have
not reached an evolutionary optimum, but are still adapting to their environment.
We can also look at changes over time within individual replicat
es. Figure
4.2 shows a scatterplot of fitness for the Logic
77 data, with each individual point
being one replicate run. From these data, we notice two striking facts. First,
points predominantly fall above the y = x line, which shows that they are reac
hing
higher fitness at the end of the evolutionary run than they are 2/3 of the way
through a run. Indeed, log 2 relative fitness is higher at 200,000 generations
than at 133,333 generations (one
tailed t test, t = 2.7414, df = 19, p = 0.00649).
We also
find a significant, positive slope to a linear regression in this late time
period for log 2 relative fitness over time (slope estimate = 4.081 *
, t =
11.95, p < 2 *
). Note that we are not arguing that thi
s late slope is linear,
but merely that a significant, positive linear slope indicates that fitness is
increasing in some fashion (see Figure S4.1). Second, different replicates reach
very different levels of fitness. Because each task performed in the L
ogic
77 environment doubles the organism™s fitness, the log 2 relative fitness provides an
approximation of how many tasks the organism performs. Many of the replicates
still have fitness values indicating more than a dozen additional tasks could be
perfo
rmed by the average members of their populations. This means that at
least 19 of the 20 replicates could reach a higher fitness, indicating clearly that
they have not reached a global optimum.
126 Figure 4.2:
Late fitness v
final fitness in the Logic
77 env
ironment
. Each
point is one replicate. The dashed line is at y = x; points on this line have the
same fitness at the end of the run as at 2/3 of the run.
127 The analysis above examines just specific points in the fitness trajectories;
we gain additional i
nformation by looking at the entire trajectories (Figure 4.3). In
this figure, each of the individual replicates are shown as gray points, with the
mean across runs as the black curve. The mean fitness trajectory appears as a
smooth curve, while many ind
ividual trajectories appear to be made of step
like
combinations of rapid increases and long
term stability. From these trajectories,
we can see not only that different populations reach different final fitness values
Œ as we saw in Figure 4.2
Œ but that
even populations which achieve similar final
fitness do not necessarily do so in similar time frames. For example, of the three
populations that achieved the highest final fitness, one of them had gotten to
roughly this final fitness prior to 50,000 gener
ations, while the other two did not
until after 150,000 generations.
We next examine the functional shape of how fitness changes over
evolutionary time. We have previously shown that in a long
term evolution study
in bacteria, fitness over time is bette
r fit by a power law than a hyperbola,
indicating that fitness is expected to increase indefinitely
(1). In Figure 4.4, we
compare the best fit power law to the best fit hyperbola in the Logic
77 environment in Avida. When considering all of the data, the power law model
substantially outperforms the hyperbolic model (
difference in BIC = 10341.01,
posterior odds ratio <
<< ). Like with the LTEE, we also fit models to
the first 40% of the generations, and project what those models predict for the
rest of the data. Here,
the power law model somewhat overestimates future
fitness, while the hyperbolic model somewhat underestimates future fitness
128 Figure 4.3:
Fitness over time in the Logic
77 environment
. The gray points show each of the 20 replicates. The black
curve sho
ws the mean log 2 relative fitness over time.
129 Figure 4.4:
Comparison of model fits in the Logic
77 environment
. (A) Hyperbolic (red) and power
law (blue) models fit to the set of mean log 2 fitness
values (black symbols) from all 20 replicates. (
B) Fit
of hyperbolic (solid red) and
power
law (solid blue) models to data from first 80,000 generations only (solid
black), with model predictions (dashed red and blue curves) and later data
(dashed black curve).
130 The weight of the data in the Logic
77 environme
nt strongly indicates that
fitness has not reached a final plateau. At least 19 of the 20 replicates have not
reached a global optimum, as they have lower fitness than the most fit
population. Fitness values at 200,000 generations are significantly highe
r than at
133,333 generations. The time frame from 133,333 generations to 200,000
generations displays a significant, positive slope in fitness over time. Log 2
relative fitness exhibits power law dynamics in this environment. The power law
finding is p
articularly striking, as relative fitness also exhibits power law dynamics
in the LTEE. The fact that we get similar dynamics
Œ albeit, in a log 2
transformation of fitness
Œ in a completely different system lends support to the
idea that power law fitnes
s dynamics may not be an idiosyncrasy of the LTEE.
Instead, these dynamics may be a more general feature of evolving systems.
No Task Environment
: The No Task environment is at other extreme of complexity from the
Logic
77 environment. Here, the only w
ay for organisms to improve their fitness
is to lower their Generation Length, and thus replicate faster. For a self
replicating organism this is one of the simplest environments conceivable. We
also extended the evolutionary runs much further, out to 1,
000,000 generations.
If any of our environments would lead to evolutionary stagnation, we would
expect it to be this one: a simpler environment, fewer ways to improve, and small
selection coefficients for those mutations which are beneficial than in the l
ogic
environments.
131 Despite this extreme simplicity, evolution does not reach a maximum and
stop in this environment. From Figure 4.5, we see that apparent plateaus in
relative fitness are
Œ just as in the Logic
77 environment
Œ artifacts of sampling. If
we allow the runs more evolutionary time, what previously appeared to be a
plateau in relative fitness now becomes a steep portion of the curve. Note that in
this case, we are analyzing relative fitness, not a log 2 transformation of it,
because fitness g
ains are much smaller when there are not tasks to be evolved.
We likewise get similar results when we look at changes over time within
replicates. Figure 4.6 shows a scatterplot of fitness in the No Task environment.
Again, the points fall predominantly
above the y = x line of equal fitness at the
end of the evolutionary run as 2/3 of the way through the run. Fitness is higher at
1,000,000 generations than at 666,667 generations (one
tailed t test, t = 2.2666,
df = 19, p = 0.0176). A linear regression o
f fitness over time between generations
666,667 and 1,000,000 yields a significant, positive slope to fitness (slope =
23.81, t = 5.267, p = 1.39 *
), indicating a rise in fitness over the final third of
this experiment. In this ca
se, we certainly don™t interpret this positive linear slope
as indicating a linear increase in fitness
Œ the diagnostic plots for the model
reveal that the data do not meet the assumptions for a linear model (see Figure
S4.2)
Œ but merely note that a signi
ficant, positive slope for a linear regression of
fitness indicates an increase in fitness in this time frame.
In this case, though, some of the points are actually below the y = x line,
indicating replicates where the final population fitness is lower th
an the
population fitness two thirds of the way through the evolutionary run. What can
132 Figure 4.5:
Fitness over time in the No Task environment
. The solid (black)
curve is the mean log 2 Relative fitness across 20 replicates. Different panels
show di
fferent numbers of generations. Dashed (green) vertical lines show ends
of previous panels.
133 Figure 4.6:
Late fitness v final fitness in the No Task environment
. Each
point is one replicate. The dashed line is at y = x; points on this line have the
sam
e fitness at the end of the run as at 2/3 of the run.
134 account for this unexpected result? One possibility is the appearance of
additional, detrimental mutations. The population is not homogenous; at any
given time, there will be at least some genetic var
iation because some organisms
will be one or two mutational steps away from their parent. When a beneficial
variant begins to spread through a population, the spreading clade will initially be
close to clonal. However, as the population size of this bene
ficial clade
increases, so does the probability that it will contain individuals both with the
beneficial mutation and other, not beneficial, mutations. If the rate of detrimental
or lethal mutations is high enough, and the rate of new beneficial mutation
s is low
enough, population fitness may rise as the beneficial mutation spreads, but then
decline until reaching a mutation
selection equilibrium later. Because the fitness
gains of individual beneficial mutations are so small in this environment, and the
declines we observe in fitness are so small, this seems like a likely explanation.
These changes in population fitness are shown in detail in Figure 4.7.
Populations exhibit short periods of rapid rise in fitness, followed by long periods
of little chan
ge in fitness. However, during these periods of relative stability,
fitness still fluctuates. Unlike with physical organisms, these are not explained by
measurement error
Œ for any given population, at any given time point, we can
measure the exact fitne
ss within Avida. Instead, they show actual small changes
in population fitness, due to changes in population composition, either from
existing genotypes changing in frequency, or new genotypes arising through
mutation. These small scale fluctuations exis
t in all populations with non
zero
135 Figure 4.7:
Fitness over time in the No Task environment
. The gray points show each of the 20 replicates. The black
curve shows the mean relative fitness over time.
136 mutation rates, but are more visible in this case b
ecause there are few mutations
of large enough effect to obscure the dynamics.
As before, we then fit two different models to fitness over time in this
system. From Figure 4.8, we observe that the two models do a similar job in
predicting future change
s in fitness, with the power law model overestimating
future fitness to a similar extent that the hyperbolic model underestimates it.
However, we can again see that even over these long time frames in a simple
environment, fitness is better fit by a power
law than by a hyperbola (difference in
BIC = 13712.11, posterior odds ratio <
<< ).
It is particularly striking that fitness continues to increase over long time
scales in this experiment. For one, the
time scales here are five
fold greater than
in the logic environments, which themselves are four
fold greater than the
number of generations we examined in the LTEE in Chapter 2 and Chapter 3.
Combined with the simplicity of the environment, it would be e
asy to assume that
all the populations in this environment would reach a global optimum in fitness,
and stop improving. Yet this is not the case. Instead, populations stuck in
regions of relatively low fitness for extended periods of time eventually find
their
way to regions of higher fitness. Populations in regions of relatively high fitness
themselves experience fluctuations in fitness, sometimes including a rise to an
even higher fitness region. That our measurements in even this very simple
environm
ent support an unbounded fitness function lends credence the
possibility that such unbounded increases are a general feature of evolving
populations.
137 Figure 4.8:
Comparison of model fits in the No Task environment
. (A) Hyperbolic (red) and power
law (bl
ue) models fit to the set of mean fitness values
(black symbols) from all 20 replicates. (
B) Fit of hyperbolic (solid red) and power
law (solid blue) models to data from first 400,000 generations only (solid black),
with model predictions (dashed red and b
lue curves) and later data (dashed
black curve).
138 Logic
9 Environment
: In the Logic
9 environment in Avida there are a small number of rewarded
behaviors that organisms can evolve, putting it between the extremes of the
other environments. Yet the reward
s for these behaviors are very large; all else
being equal, the lowest
reward behaviors double the organism™s fitness, while the
most
rewarded behavior multiplies it by a factor of 32. We therefore expect the
fitness gains from evolving tasks to mask smal
l changes from improved
replication efficiency. Despite this, the Logic
9 environment is the most
extensively used in previous work in Avida
(9, 14, 15), whic
h is why we chose to
examine fitness dynamics in this environment.
Figure 4.9 shows log 2 relative fitness over time in the Logic
9 environments. Unlike in the Logic
77 environment (Figure 4.1) or the No Task
environment (Figure 4.5), here we see that th
e appearance of a plateau in fitness
does not disappear simply by looking at longer time frames. While there is a
slight upward trajectory from 20,000 to 200,000 generations, the increase is
small enough that it isn™t immediately obvious. The value of th
is plateau is also
telling. Holding everything else constant, an organism in the Logic
9 environment gets a
 fold improvement in fitness by performing all nine logic
tasks. The plateau in fitness is very close to
, as the y
axis
is the log 2 of
relative fitness. Therefore, what improvements remain in fitness will be
predominantly those from decreasing Generation Length, and each mutation that
139 Figure 4.9:
Fitness over time in the Logic
9 environment
. The solid (black)
curve i
s the mean log 2 Relative fitness across 20 replicates. Different panels
show different numbers of generations. Dashed (green) vertical lines show ends
of previous panels.
140 does so will have a smaller individual effect than any of the mutations which
prov
ided solutions to new tasks.
From Figure 4.10, we can see that 16 of the 20 replicates have fitness
values consistent with most individuals in the population performing all nine logic
tasks by 200,000 generations. The improvement in fitness in this envir
onment
from generation 133,333 to 200,000 is only marginal (one
tailed t test, t = 1.3402,
df = 19, p = 0.0980). A linear regression of log 2 relative fitness over time from
133,333 generations to 200,000 generations yields a significant, positive slope
(slope = 2.59 *
, t = 4.65, p = 3.33 *
). As in the No Task environment,
though, diagnostic plots for this model reveal that a linear model is not a good fit
for these data (see Figure S4.3).
The fact that
most fitness gains in this environment are driven by task
acquisition is further underscored by the individual replicate fitness trajectories
shown in Figure 4.11. Most populations achieve a log 2 relative fitness of slightly
more than 25, and then stop v
isibly improving in fitness from that point onward.
This saturation is further corroborated by comparing the two model fits in Figure
4.12. In this environment, log 2 relative fitness is better explained by a hyperbola
than a power law (difference in BIC
= 14010.79, posterior odds ratio <
<< ). In Figure 4.12B, we can see that the hyperbolic model does a strikingly
better job of predicting future fitness than the power law model does.
What accounts for th
is major difference between the Logic
9 environment
in Avida and the other ones we have examined? One possibility is that is the
nature of the rewards for task completion. In the Logic
77 environment, each
141 Figure 4.10:
Late fitness v final fitness in
the Logic
9 environment
. Each
point is one replicate. The dashed line is at y = x; points on this line have the
same fitness at the end of the run as at 2/3 of the run.
142 Figure 4.11:
Fitness over time in the Logic
9 environment
. The gray points show ea
ch of the 20 replicates. The black
curve shows the mean log 2 relative fitness over time.
143 Figure 4.12:
Comparison of model fits in the Logic
9 environment
. (A) Hyperbolic (red) and power
law (blue) models fit to the set of mean log 2 fitness
values (bl
ack symbols) from all 20 replicates. (
B) Fit of hyperbolic (solid red) and
power
law (solid blue) models to data from first 80,000 generations only (solid
black), with model predictions (dashed red and blue curves) and later data
(dashed black curve).
144 new
task doubles fitness of an organism. In the No Task environment, fitness
gains are substantially smaller. In the LTEE, the largest known beneficial
mutations were on the order of a 13% fitness boost
(16). In the Logic
9 environment, though, individual mutations can increase fitness up to multiplying it
by 32. These large effect mutations will, by ne
cessity, drive the pattern of fitness
change over time, and minimize the impact of small mutational steps such as
those that drive the pattern in the No Task environment. In fact, given that the
No Task environment exhibits power law dynamics in fitness o
ver time, we would
expect the Logic
9 environment to do the same starting from the point where all
nine logic tasks are being performed. A related explanation lies in the fact that in
both of the logic environments in Avida, the ancestor is drastically un
fit compared
to its eventual descendants. In the LTEE, fitness gains were on the order of 60
80% over 50,000 generations
(1). In the Logic
77 environment, fitness gains are
on the order of ~
by 200,000 generations; in the Logic
9 environment, they are
on the order of ~
by 200,000 generations, and in the No Task environment
they™re on the order of ~ 320% by 1,000,000 generations. With a small number
of large effect mutations, and a large number of drastically
smaller effect
mutations available, population fitness will ten
d to rise rapidly when the large
effect mutations are spreading, and move only small amount otherwise. This will
cause the trajectory to look more like a hyperbola. In future work, we will
address these explanations by 1) starting runs with evolved ances
tors, and 2)
changing the task rewards so that individual mutations do not have as large of an
impact as in the Logic
9 environment.
145 Conclusions
: In the Logic
77 and No Task environments of Avida, fitness obeys power
law dynamics, much as it does in the
LTEE. In the Logic
9 environment of Avida,
fitness is better explained by a hyperbolic model. Even over hundreds of
thousands of generations, fitness continues to increase in this system across
these environments. This suggests that unbounded increases i
n fitness over
evolutionary time scales may be general to evolving systems as a whole, and not
due to the specifics of the LTEE.
Future Work
: We will extract the numerically
dominant organism from the end of each of
ten runs in each of the three environm
ents tested. We will use these organisms
as the ancestor for additional (replicated) evolutionary runs, both within the
environment in which they had evolved, and within simpler environments. We
will test if these new bouts of evolution, starting from a
more adapted ancestor,
exhibit unbounded fitness increases over time.
Acknowledgements
: We thank Noah Ribeck, Alita Burmeister, Rohan Maddamsetti, Anya Vostinar,
and Emily Dolson for discussion and feedback in the drafting of this chapter. We
also thank
Neerja Hajela for technical assistance. This work was supported, in
146 part, by the BEACON Center for the Study of Evolution in Action (NSF
Cooperative Agreement DBI
0939454).
147 APPENDIX
148 Figure S4.1:
Diagnostic plots for Logic
77 environment, late fitness a
s linear
model
. 149 Figure S4.2:
Diagnostic plots for No Task environment, late fitness as linear
model
. 150 Figure S4.3:
Diagnostic plots for Logic
9 environment, late fitness as linear
model
. 151 REFERENCES
152 REFERENCES
1.
M. J. Wiser, N. Ribeck, R. E. Lenski, Long
term dynamics of adaptation in asexual
populations.
Science
. 342, 1364
Œ1367 (2013).
2.
M. J. Wiser, R. E. Lenski, A Comparison of Methods to Measure Fitness in
Escherichia coli.
PLoS ONE
. 10, e012
6210 (2015).
3.
J. Maynard Smith, Byte
sized evolution.
Nature
. 355
, 772
Œ773 (1992).
4.
C. Ofria, D. M. Bryson, C. O. Wilke, in
Artificial Life Models in Software
(Springer
London, 2009), pp. 3
Œ35. 5.
D. C. Dennett, Darwin™s dangerous idea.
The Sciences
. 35, 34
Œ40 (1995).
6.
R Core Team,
R: A language and environment for statistical computing
(R
Foundation for Statistical Computing, Vienna, Austria, 2013; http://www.R
project.org/).
7.
A. E. Raftery, Bayesian model selection in social research.
Socio
l. Methodol.
25, 111Œ164 (1995).
8.
C. B. Turner, Z. D. Blount, D. H. Mitchell, R. E. Lenski, Evolution and coexistence
in response to a key innovation in a long
term evolution experiment with
Escherichia coli.
bioRxiv
(2015), doi:10.1101/020958.
9.
R. E
. Lenski, C. Ofria, R. T. Pennock, C. Adami, The evolutionary origin of complex
features.
Nature
. 423, 139
Œ144 (2003).
10.
H. Zhang, M. Travisano, in
Artificial Life, 2007. ALIFE ™07. IEEE Symposium on
(2007), pp. 39
Œ46.
11.
C. Adami, C. Ofria, T. C. Col
lier, Evolution of biological complexity.
Proc. Natl.
Acad. Sci.
97, 4463
Œ4468 (2000).
12.
J. Clune
et al.
, Natural Selection Fails to Optimize Mutation Rates for Long
Term
Adaptation on Rugged Fitness Landscapes.
PLoS Comput Biol
. 4, e1000187
(2008).
13. R. K. Standish, Open
Ended Artifical Evolution.
Int. J. Comput. Intell. Appl.
03, 167Œ175 (2003).
14.
D. Misevic, C. Ofria, R. E. Lenski, Sexual reproduction reshapes the genetic
architecture of digital organisms.
Proc. R. Soc. Lond. B Biol. Sci.
273
, 457Œ464
(2006).
153 15.
S. S. Chow, C. O. Wilke, C. Ofria, R. E. Lenski, C. Adami, Adaptive Radiation from
Resource Competition in Digital Organisms.
Science
. 305, 84
Œ86 (2004).
16.
E. Crozat, N. Philippe, R. E. Lenski, J. Geiselmann, D. Schneider, Long
term
experimental evolution in Escherichia coli. XII. DNA topology as a key target of
selection.
Genetics
. 169
(2005), doi:10.1534/genetics.104.035717.
154