This is to certify that the

dissertation entitled

DYNAMICS OF MUTATION AND SELECTION
IN ASEXUAL POPULATIONS

presented by

Philip J. Gerrish

has been accepted towards fulfillment
of the requirements for

Ph.D. degree in Zoology

Major professor

Date

MSU is an Affirmative Action/Equal Opportunity Institution

DYNAMICS OF MUTATION AND SELECTION
IN ASEXUAL POPULATIONS

BY

Philip J. Gerrish

A DISSERTATION
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY

Department of Zoology

1998

ABSTRACT

DYNAMICS OF MUTATION AND SELECTION
IN ASEXUAL POPULATIONS

BY

Philip J. Gerrish

The dynamics of asexual populations are characterized by strong
associations between alleles at different loci due to complete linkage.
One consequence of complete linkage is the phenomenon known as
hitchhiking, whereby a specified mutant allele is driven to high
frequency or even fixation by a beneficial mutation to which it is
linked. If the specified mutant allele confers a change in the mutation
rate of the organism, then the fixation of that allele affects the mean
mutation rate of the population. Observations of high mutation rates in
evolving E. coli populations are consistent with the hypothesis that
mutator alleles were fixed in these populations by hitchhiking.
Theoretical exploration of the hitchhiking mechanism reveals its general
significance for asexual evolution. I conclude that (i) mutation rates
found in asexual populations are more likely to be determined by
sporadic hitchhiking than by evolutionary "fine-tuning" as previous
theoretical models would suggest, and (ii) there exists a most probable
increase in mutation rate due to hitchhiking that is both positive and
finite, and its value is typically significant.

Another consequence of complete linkage is the phenomenon called
“clonal interference”, whereby the progression of one beneficial
mutation to fixation is hindered or even prevented by competition with
alternative beneficial mutations. From theoretical exploration of
clonal interference, I derive several fundamental population-genetic
parameters, as well as the probability that a beneficial mutation will
transiently achieve polymorphic frequency. After treating the case in
which an unlimited number of beneficial mutations are available, I treat
the case in which that number is limited. From these developments, I
solve the inverse problem of estimating the following parameters from
fitness data of evolving E. coli populations: (i) the beneficial mutation
rate, (ii) the distribution of mutational effects, and (iii) the number
of available beneficial mutations for the case in which this number is
limited. Salient conclusions from both theoretical treatments are (i)
adaptive evolution of asexual populations is characteristically
punctuated with short bursts of rapid change followed by long periods of
stasis regardless of population size or mutation rate, (ii) in identical
environments, the trajectory of adaptive evolution of large, parallel
populations is highly repeatable due to clonal interference, (iii) the
rate of fitness improvement is an increasing function of both mutation
rate and population size, but the function is decelerating so that the
rate of adaptive evolution is constrained by a “speed limit”, and (iv)
with significant probability, clonal interference may transiently
maintain fitness variants at polymorphic frequencies.

ACKNOWLEDGMENTS

First and foremost, I thank my committee members: I thank my advisor,
Dr. Richard Lenski, for admirable instruction, guidance and support as
well as his vital role in the conception of ideas presented here; I
thank Dr. Paul Sniegowski for his crystal clear explanations of
population genetics as well as his vital role in the conception of ideas
presented here; I thank Dr. V. Mandrekar for superior instruction in
stochastic processes; and I thank Dr. Judy Mongold and Dr. Don Hall for
helpful comments and conversations. I especially thank Danny Rozen for
important conceptual contributions, and Drs. Alex White and Alejandra Sorto
for help with mathematical and statistical analyses. I am grateful to
Dr. Francois Taddei for sharing a manuscript prior to publication and to
Dr. Cliff Zeyl for permission to cite unpublished data. I thank Dr.
Brendan Bohannan, Lynette Ekunwe and Phyllis Frank for technical
assistance, Drs. Tom Cebula, Fred Adler, Frank Stewart, Warren Ewens,
Andrew Leigh Brown, Santiago Elena, Mike Travisano, John Gerrish, Larry
Segerlind, Arjan DeVisser, and Paco Moore for helpful discussions and
comments. Thanks to Drs. H. Maki and J.E. LeClerc for plasmids. Thanks
to my present supervisor, Dr. Marcia Kalish, for patience and support.
Thanks to Lucy Gerrish for moral support and good humor, Irene Gerrish

for playing dinosaur, and Dr. Theophilus Okosun for plenty of beer.

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF MATHEMATICAL SYMBOLS
    Chapter 2
    Chapter 3
    Chapter 4

INTRODUCTION
    Fundamental issues and roadmap to the dissertation
        Chapter 1
        Chapter 2
        Chapter 3
        Chapter 4

Chapter 1: EVOLUTION OF HIGH MUTATION RATES IN EXPERIMENTAL
           POPULATIONS OF ESCHERICHIA COLI
    Abstract
    Introduction and Findings
    Methods
        Experimental System
        Mutation rate measurements
        Time of Origin and Persistence of
        Complementation Tests
        Analysis of Fluctuation Test Data
    Discussion

Chapter 2: THE FATE OF COMPETING BENEFICIAL MUTATIONS
           IN AN ASEXUAL POPULATION
    Abstract
    Introduction
    Clonal interference and fixation
        Clonal interference among beneficial mutations
            Some definitions
            The expected number of interfering mutations
        Fixation probability of a beneficial mutation
        Expected rate of substitution
        Expected selection coefficient of successful mutations
        Effect of clonal interference on rate of fitness increase
        Estimation of parameters: an empirical example
    Transiently common mutations
        Clonal interference - a general model
        Probability of transiently polymorphic beneficial mutations
        Probability of transiently common mutations: the leapfrog
    Discussion
        Summary of results
        Assumptions of the models
        Model validation by simulation
        Inclusion of the double mutant
        Implications for the evolution of reproductive strategies
        Implications for the general nature of adaptive evolution
        Evidence for transiently common beneficial mutations in
            microbial populations
        A suggestion for further research

Chapter 3: THE ORDER OF FIXATION OF BENEFICIAL MUTATIONS IN
           ASEXUAL POPULATIONS
    Abstract
    Introduction
    Probabilities of fixation orderings
        Order of fixation of two beneficial mutations
        Probability that a superior mutation is the first of many
        Probability that fixation ordering is from largest
            to smallest s
        General solution for any ordering of fixations
    Fitness trajectories I: parallel populations share a common
        set of available beneficial mutations
        Expectation and variance
        Reducing the number of parameters
    Fitness trajectories II: parallel populations share a common
        distribution of beneficial mutational effects
        The non-ordered region
        The ordered region
    Discussion
        Assumptions of the models
        Estimating parameters
        Ordered substitutions in nature, with an application to HIV

Chapter 4: HITCHHIKING OF DELETERIOUS MUTATIONS
           WITH SPECIAL ATTENTION TO MUTATOR ALLELES
    Abstract
    Introduction
    Theoretical developments
        The hitchhiking rate
        An effective mutation-selection balance
        Probability of fixation of double-mutations, Pr{fix}
        Converting per-capita rate to populational probability
    An application
        Hitchhiking rate of mutator mutations
        Estimating rate of mutation to mutator for
            evolving E. coli populations
    Discussion
        Assumptions
        Comparison with simulation
        Effect of population size on hitchhiking rate
        Effect of selective disadvantage on hitchhiking rate
        Effect of mutator strength on hitchhiking rate

APPENDIX 1.1: FLUCTUATION TEST ANALYSIS PROGRAM
    Theory
    Obtaining and using the program
APPENDIX 2.1: PROBABILITY OF SURVIVING DRIFT
APPENDIX 2.2: n-GENOTYPE LOGISTIC SYSTEM WITH MUTATION
    General solution
    Application of boundary conditions due to mutation
    Notation for the 3-genotype case
APPENDIX 2.3: EXPECTED NUMBER OF CANDIDATE REPLICATIONS
APPENDIX 2.4: FUNCTIONS EMPLOYING THE RECTANGULAR DISTRIBUTION
APPENDIX 3.1: ADAPTIVE SUBSTITUTIONS IN ASEXUAL POPULATIONS
              ARE ACCURATELY MODELED AS INSTANTANEOUS
              REPLACEMENTS
APPENDIX 3.2: ANALYTICAL INTEGRATION OF EQUATIONS
APPENDIX 4.1: EFFECTIVE MUTATION-SELECTION BALANCE
APPENDIX 4.2: EFFECTIVE POPULATION NUMBER UNDER
              A SERIAL TRANSFER REGIME

LIST OF REFERENCES

LIST OF FIGURES

Figure 1.1. Rates of mutation in 12 Ara- and Ara+
experimental populations (A-1 to A-6 and A+1 to A+6) at
10,000 generations and in their common ancestors REL606
(Ara-) and REL607 (Ara+).

Figure 1.2. Time of appearance and evolutionary persistence
of mutator phenotypes in experimental populations
Ara-2, Ara-4, and Ara-3.

Figure 1.3. Mutation rates to nalidixic acid resistance in
10,000-generation isolates from populations Ara-2,
Ara-4 and Ara-3 and ancestor REL606 transformed with
plasmids bearing wild-type alleles of seven known
general mutator loci.

Figure 2.1. The probability that a given beneficial
mutation with selection coefficient, s, achieves
fixation.

Figure 2.2. The probability of fixation of an arbitrarily
chosen beneficial mutation is a decreasing function of
both beneficial mutation rate, μ, and population size,
N.

Figure 2.3. The substitution rate of a population is an
increasing function of its beneficial mutation rate.

Figure 2.4. The expected selection coefficient of
substitutions, <s(α,μ,N)>, is an increasing function of
population size, N.

Figure 2.5. The rate of fitness improvement is an
increasing function of both population size, N, and
beneficial mutation rate, μ.


Figure 2.6. The probability that an arbitrarily chosen
beneficial mutation transiently achieves polymorphic
frequency is plotted against log population size for
various beneficial mutation rates.

Figure 2.7. The leapfrog phenomenon illustrated
phylogenetically.

Figure 2.8. The leapfrog phenomenon illustrated
dynamically.

Figure 2.9. A simulation of competition among numerous
beneficial mutations.

Figure 2.10. The probability of fixation of an arbitrarily
chosen beneficial mutation is plotted against the
beneficial mutation rate, μ, for various population
sizes.

Figure 3.1. Probability that a superior mutation (s_S = 0.1)
is fixed before an inferior mutation (s_I = 0.08) as a
function of both population number, N (for which μ_S =
μ_I = 3 x 104), and overall beneficial mutation rate, μ
(for which N = 3.3 x 10^7).

Figure 3.2. Probability of fixation ordering from largest
to smallest selective advantage as a function of both
population number, N (for which μ = 6 x 104), and
overall beneficial mutation rate, μ (for which N = 3.3
x 10^7).

Figure 3.3. Evolutionary trajectories as a function of
population size.

Figure 3.3 (cont.)

Figure 3.3 (cont.)

Figure 3.4. Least-squares fit of expected fitness
trajectory to fitness data from evolving E. coli
population, the Ara-1 line from Lenski & Travisano
(1994).

Figure 3.5. Estimated probability (solid line) and maximum
probability (dotted line) that the inferior AZT-
resistance mutation, K70R, is fixed before the superior
mutation, T215Y/F, as a function of effective
population number.

Figure 4.1. The dynamics of mutation-selection balance.

Figure 4.2. Panel A shows the per-capita hitchhiking rate
of a specified deleterious mutation as a function of
population number, N. Panel B shows the corresponding
probability that a population fixes the deleterious
mutation by hitchhiking within a time interval of
10,000 generations.

Figure 4.3. Panel A shows the per-capita hitchhiking rate
of a specified deleterious mutation as a function of
its selective disadvantage, s_D. Panel B shows the
corresponding probability that a population fixes the
deleterious mutation by hitchhiking within a time
interval of 10,000 generations.

Figure 4.4. Panel A shows the per-capita hitchhiking rate
of a mutator allele as a function of its strength, m.
Panel B shows the corresponding probability that a
population fixes the mutator allele by hitchhiking
within a time interval of 10,000 generations.

Figure 4.5. The per-capita recruitment rate of double-
mutations (beneficial mutations on mutator background)
as a function of mutator strength, m.

Figure 4.6. Probability of fixation of a double-mutation
(beneficial mutation on mutator background) as a
function of mutator strength, m.

Figure 4.7. Frequency of a deleterious mutation during a
substitutional event on both wildtype and beneficial
backgrounds.


LIST OF MATHEMATICAL SYMBOLS

Chapter 2:

x(t)        number of wildtype individuals at time t

y(t)        number of individuals carrying the beneficial
            mutation in question at time t

z(t)        number of individuals carrying an alternative
            beneficial mutation at time t

μ           beneficial mutation rate

N           total population number

s           selection coefficient

            time to fixation

α           exponential parameter for distribution of
            beneficial mutational effects

π(s)        probability that a beneficial mutation of
            selective advantage s survives drift
            (referred to as the “survival function”)

            expected number of “interfering mutations” as
            defined in the text

Pr{fix}     probability of fixation

            rate of adaptive substitution

<.>         expected value

f           frequency

            number of “candidate replications” as defined
            in the text

            expected number of superior mutations that
            would prevent a given beneficial mutation
            from attaining frequency f

            time at which the appearance and survival of
            a superior mutation of selective advantage s_z
            ensures that the beneficial mutation in
            question attains a frequency of exactly f

            expected number of superior mutations
            preventing the fixation of a beneficial
            mutation that has achieved a frequency of at
            least f

Chapter 3:

K           the linear constant in the survival function,
            such that π(s) ≈ Ks

The case of only two available beneficial mutations:

I           denotes inferior beneficial mutation

S           denotes superior beneficial mutation

Pr{I;S}     probability that the inferior mutation is
            fixed before the superior mutation

Pr{S;I}     probability that the superior mutation is
            fixed before the inferior mutation

            rate of appearance and survival of the
            inferior mutation

            rate of appearance and survival of the
            superior mutation

The case of many available beneficial mutations:

            number of available beneficial mutations

            probability that the largest-effect mutation
            is fixed before all others

            rate of appearance of the beneficial mutation
            with the jth largest selective advantage

ζ(j)        permuting function mapping place in the
            fixation ordering onto rank by selective
            advantage

φ(ζ)        probability of the fixation ordering
            specified by the permuting function ζ

            time between the (j-1)th fixation and the
            appearance of the jth beneficial mutation to
            be fixed

            expected time until the ith fixation, given
            fixation ordering ζ

            population fitness after the ith fixation,
            given fixation ordering ζ

w(t|ζ)      population fitness at time t, given fixation
            ordering ζ

E(w|t)      expected population fitness at time t

var(w|t)    variance of population fitness among
            replicate populations at time t

I_j         the set of all available beneficial mutations
            both inferior to and subsequent to the jth
            beneficial mutation fixed

S_j         the set of all available beneficial mutations
            both superior to and subsequent to the jth
            beneficial mutation fixed

#I_j        cardinality of set I_j

#S_j        cardinality of set S_j

Chapter 4:

            mutation-selection balance

            equilibrium mutation-selection balance

            “effective mutation-selection balance” as
            defined in the text

            overall beneficial mutation rate on the
            wildtype background

            overall beneficial mutation rate on the
            specified deleterious background

            rate of mutation from wildtype to the
            specified deleterious mutation

s_D         selective disadvantage of the specified
            deleterious mutation

Pr{fix}     probability of fixation

h           per-capita hitchhiking rate of the specified
            deleterious mutation

x(t)        number of wildtype individuals at time t

y(t)        number of individuals carrying only the
            specified deleterious mutation at time t

z(t)        number of individuals carrying both
            beneficial and deleterious mutations at time t

m           factor by which a specified mutator allele
            elevates the general mutation rate, e.g.
            μ'_B = mμ_B

            overall deleterious mutation rate of the
            wildtype

INTRODUCTION

Fundamental issues and roadmap to the dissertation

Adaptive evolution occurs by the action of selection on
genetic variants in a population. In sexual populations,
different fitness variants may recombine, thereby allowing
selection to act at different loci with a certain degree of
independence. In asexual populations, however, fitness
variants compete and the fittest variant eventually
displaces all others. Thus the fate of a particular allele
at a specified locus is determined as much by selection at
other loci to which it is linked as by its own selective
value. One consequence of asexuality (or, more generally,
genetic linkage) is a phenomenon known as hitchhiking
(Maynard Smith & Haigh, 1974; Berg, 1995), whereby a
specified mutant allele is driven to high frequency or even
fixation by a successful beneficial mutation to which it is
linked. A second consequence of asexuality is the

phenomenon called “clonal interference”, whereby the
progression of one beneficial mutation to fixation is
hindered or even prevented by competition with alternative
beneficial mutations.

This dissertation explores the effects of both
hitchhiking and clonal interference on the adaptive dynamics
of asexual populations. Chapter 1 is an empirical study of
mutation rate evolution in bacterial populations and invokes
hitchhiking as an explanation for observed trends. Chapters
2 and 3 are theoretical explorations of clonal interference
under different assumptions about the availability of
beneficial mutations. Chapter 4 ties the theory developed
in Chapters 2 and 3 to the empirical observations of
Chapter 1.

Chapter 1

Mutations may arise in the genes responsible for
accurate DNA synthesis, proofreading or repair. An
individual is said to carry a mutator allele if any of these
genes is impaired or disabled by such a mutation; a mutator
allele therefore confers an elevated mutation rate. An

individual is said to carry an antimutator allele if the
function of any of these genes is enhanced by a mutation;
thus, an antimutator allele confers a diminished mutation
rate. Because mutator and antimutator alleles are
recurrently produced in a population, their fate and hence
the mutation rate of the population may be determined by
selection. Chapter 1 examines mutation rate evolution in
laboratory populations of E. coli (Sniegowski et al., 1997).
This study began as a test of the hypothesis that
mutation rates in a constant environment should decrease
with time. This hypothesis is grounded in the logic that,
as a population adapts to a constant environment, there
should be (i) weaker selection for mutator alleles as the
available beneficial mutations are expended and (ii)
stronger selection for antimutator alleles as the remaining
available mutations become almost exclusively neutral or
deleterious. This line of reasoning is in accordance with
conventional theory (Kimura, 1960; Kimura, 1967; Leigh,
1970; Leigh, 1973; Gillespie, 1981; Ishii et al., 1989) and
predicts that evolution should reduce mutation rates in a

constant environment.

To test this hypothesis, Sniegowski et al. (1997)
performed fluctuation assays (Luria & Delbruck, 1943) to
estimate the mutation rates of twelve E. coli populations
that had been evolving independently in a constant
environment for 10,000 generations (Lenski & Travisano,
1994). Estimation of mutation rates from fluctuation assays
is notoriously difficult because of its mathematical
complexity (Lea & Coulson, 1949; Stewart et al., 1990;
Sarkar, 1991). Building on recent developments (Ma et al.,
1992; Stewart, 1994; Jaegger & Sarkar, 1995), I derive the
analytical Newton series for the maximum likelihood
estimator of mutation rates from fluctuation assays, as well
as the analytical variance, and have incorporated these into
a computer program. Instructions for obtaining and using
this program, as well as the novel mathematical derivations,
are given in Appendix 1.1.
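
As a rough illustration of what such an estimate involves, the following
minimal sketch computes a maximum likelihood estimate of the expected
number of mutations per culture, m, from the mutant counts of a
fluctuation assay. It is not the program of Appendix 1.1: it evaluates
the Lea-Coulson probabilities with the recursion of Ma et al. (1992) and
then maximizes the likelihood by a plain golden-section search rather
than by the Newton series derived in this work. The function names, the
search bounds and the assumption of a single-peaked likelihood are all
illustrative choices.

```python
import math

def ld_pmf(m, n_max):
    """Probabilities p_0..p_n_max of observing n mutants in a culture,
    given m expected mutations, via the Ma-Sandri-Sarkar recursion:
    p_0 = exp(-m),  p_n = (m / n) * sum_{k<n} p_k / (n - k + 1)."""
    p = [math.exp(-m)]
    for n in range(1, n_max + 1):
        total = sum(p[k] / (n - k + 1) for k in range(n))
        p.append(m / n * total)
    return p

def log_likelihood(m, counts):
    """Log-likelihood of m given the mutant counts of parallel cultures."""
    p = ld_pmf(m, max(counts))
    return sum(math.log(p[c]) for c in counts)

def mle_m(counts, lo=1e-3, hi=50.0, tol=1e-8):
    """Golden-section search for the m that maximizes the likelihood.
    Assumes (hypothetically) one peak between the bounds lo and hi."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - g * (b - a)
    d = a + g * (b - a)
    while b - a > tol:
        if log_likelihood(c, counts) > log_likelihood(d, counts):
            b = d          # maximum lies in [a, d]
        else:
            a = c          # maximum lies in [c, b]
        c = b - g * (b - a)
        d = a + g * (b - a)
    return (a + b) / 2

# Made-up mutant counts from ten hypothetical parallel cultures.
counts = [0, 1, 0, 3, 2, 0, 12, 1, 0, 5]
print("estimated m:", round(mle_m(counts), 3))
```

Dividing the estimate of m by the number of cells per culture would give
a per-cell mutation rate; that step is omitted here because it depends
on details of the assay.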

When mutation rates were estimated for each of the twelve
E. coli populations, none of the twelve was found to have a
systematically lower mutation rate than the ancestor.
Surprisingly, three of the twelve populations had mutation
rates that were about two orders of magnitude higher than

the ancestral rate. This result was cause for rejection of

the original hypothesis and for critical review of
conventional mutation rate evolution theory. In re-
examining this theory, Paul Sniegowski (personal
communication) proposed that a major flaw was the implicit
assumption of infinite population size. A consequence of
this flaw is the unsubstantiated prediction that adaptive
evolution should "fine-tune" mutation rates. To explain
their observations, Sniegowski et al. (1997) hypothesized
instead that either (i) mutation rates in asexual
populations may be determined by chance hitchhiking events,
whereby mutator alleles are driven to fixation by beneficial
mutations that they produce, or (ii) mutator alleles may
provide a direct fitness benefit if reduced fidelity allows
for an increased replication rate. Chapters 1 and 4 explore
and support the plausibility of (i). While preliminary
results suggest rejection of (ii) for the E. coli
populations, the validity of this rejection rests on future empirical
studies.

Chapter 2

Chapter 2 is a theoretical exploration of the asexual
adaptive dynamics that result from competitive interactions
among beneficial mutations (Gerrish & Lenski, 1998). In
asexual populations, clones that carry alternative
beneficial mutations compete with one another and, thereby,
interfere with the expected progression of a given mutation
to fixation (Fisher, 1930; Muller, 1932; Muller, 1964; Crow
& Kimura, 1965; Haigh, 1978). This phenomenon has been
called “clonal interference”. Intuitively, it seemed this
phenomenon would significantly affect fundamental
population-genetic parameters. Richard Lenski (personal
communication) speculated that this phenomenon might also
have a curious effect on phylogenetic and fitness dynamics
and on the relationship between these. He reasoned that a
beneficial mutation may rise to polymorphic frequency, or
even become the majority genotype, before being
competitively displaced by a superior, alternative mutation.
Such a phenomenon, which he termed the “leapfrog”, would

present strange phylogenetic dynamics in which ancestor ab

is succeeded by mutant Ab, and Ab is then succeeded not by

AB but by aB (the two successive mutants being more closely
related to the ancestor than they are to each other). From
the fitness standpoint, the leapfrog phenomenon would give
the appearance of two fixations when, in fact, only one
mutation is actually fixed. The question remained, however,
under what conditions the leapfrog phenomenon was likely to
occur and whether such conditions were biologically
reasonable.
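
The leapfrog scenario can be made concrete with a deliberately simple
sketch. The code below is not the model of Chapter 2: it is a textbook
deterministic haploid selection recursion with no drift and no new
mutation, in which the inferior mutant Ab and the superior mutant aB are
both present at low, arbitrarily chosen starting frequencies (aB much
rarer, standing in for its later appearance). The selection coefficients,
starting frequencies and function name are hypothetical. Under these
assumptions Ab first becomes the majority genotype and is then displaced
by aB, and the double mutant AB never arises.

```python
def leapfrog_trajectory(s_Ab=0.10, s_aB=0.15,
                        start_Ab=1e-6, start_aB=1e-12,
                        generations=600):
    """Genotype frequencies over time for ancestor ab and the two
    competing single mutants Ab and aB (deterministic selection only)."""
    freqs = {"ab": 1.0 - start_Ab - start_aB, "Ab": start_Ab, "aB": start_aB}
    fitness = {"ab": 1.0, "Ab": 1.0 + s_Ab, "aB": 1.0 + s_aB}
    history = []
    for _ in range(generations):
        history.append(dict(freqs))
        # Each genotype grows in proportion to its relative fitness.
        mean_w = sum(freqs[g] * fitness[g] for g in freqs)
        freqs = {g: freqs[g] * fitness[g] / mean_w for g in freqs}
    return history

history = leapfrog_trajectory()
peak_gen = max(range(len(history)), key=lambda t: history[t]["Ab"])
print("Ab peaks at generation", peak_gen,
      "with frequency", round(history[peak_gen]["Ab"], 3))
print("final aB frequency:", round(history[-1]["aB"], 5))
```

In a fitness record sampled over this period, the rise of Ab and the
later rise of aB would look like two successive substitutions, although
only aB is ever fixed.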

From theoretical exploration of clonal interference, I
analytically derive several fundamental population-genetic
parameters such as fixation probability, substitution rate
and rate of fitness increase. Also, I derive the
probability that a beneficial mutation transiently achieves
polymorphic frequency or majority status (the “leapfrog”).
Employing fitness data from an evolving E. coli population,
I solve the inverse problem to estimate (i) the beneficial
mutation rate in this population and (ii) the distribution

of mutational effects.

Chapter 3

An assumption made throughout Chapter 2 is that the
population in question is evolving in a slowly changing
environment, such that opportunities for further improvement
are never depleted. In other words, the number of available
beneficial mutations is infinite. In a constant
environment, however, the set of available beneficial
mutations is eventually depleted. This set must therefore
be considered finite. Chapter 3 explores the adaptive
dynamics of asexual populations in a constant environment,
i.e., when the number of available beneficial mutations is
finite.

Given a finite set of available beneficial mutations,
the course of adaptive evolution is determined by both the
timing and ordering of fixations. In large, asexual
populations, the ordering of fixations is affected by
competition among beneficial mutations. Taking into account
such clonal interference, I derive the probability of any
specified fixation ordering. From this, I derive dynamic
equations for fitness expectation and variance among

replicate populations, given (i) a particular set of

beneficial mutations, or given (ii) only a common
distribution of mutational effects. Using fitness data from
an evolving E. coli population, I solve the inverse problem
to determine (i) the beneficial mutation rate in this
population, (ii) the distribution of beneficial mutational

effects, and (iii) the number of available beneficial

mutations.

Chapter 4

Chapter 4 employs the theory developed in Chapter 2 to
model the hitchhiking mechanism that was proposed as an
explanation for the observations of Chapter 1 (Sniegowski et
al., 1997). In order for any allele to hitchhike to
fixation, it must first be linked to a beneficial mutation.
Beneficial mutations appear in linkage with a specified
deleterious mutation at a certain rate. As long as this
rate is greater than zero, the subpopulation carrying the
specified deleterious mutation will eventually produce a
beneficial mutation. If this beneficial mutation is
subsequently fixed in the population, then the specified

deleterious mutation to which it is linked will hitchhike to

fixation. The likelihood that the beneficial mutation will
be fixed is reduced, however, by (i) the selective
disadvantage of the deleterious mutation to which it is
linked, (ii) genetic drift, and (iii) clonal interference
from both wildtype and mutant subpopulations. Taking all of
this into account, I derive the probability that a
particular deleterious allele will hitchhike to fixation
within a given time period. The probability that a mutator
allele hitchhikes to fixation is then presented as a special
case. Results corroborate the observations of Chapter 1 and
suggest that hitchhiking is of general importance to the

evolution of mutation rates in asexual populations.


Chapter 1

EVOLUTION OF HIGH MUTATION RATES
IN EXPERIMENTAL POPULATIONS OF ESCHERICHIA COLI1

Abstract
Most mutations are likely to be deleterious, and hence the
spontaneous mutation rate is generally held at a very low value (Drake,
1991). Nonetheless, evolutionary theory predicts that high mutation
rates can evolve under certain circumstances (Leigh, 1970; Ishii et al.,
1989; Taddei et al., 1997). Empirical observations have heretofore been
limited to short-term studies of the fates of mutator strains
deliberately introduced into laboratory populations of E. coli (Cox &
Gibson, 1974; Chao & Cox, 1983; Trobner & Piechocki, 1984) and to the
effects of intense selective events on mutator frequencies in E. coli
(Mao et al., 1997). Here we report a new phenomenon: the rise of
spontaneously originated mutators in populations of E. coli undergoing
long-term adaptation to a novel environment. Our results corroborate
computer simulations of mutator evolution in adapting clonal populations
(Taddei et al., 1997) and may shed light on recent observations that
associate high mutation rates with emerging pathogens (LeClerc et al.,
1996) and with certain cancers (Modrich et al., 1995).

 

1 This chapter was written originally as a paper (Sniegowski et al.,
1997). The "we" in this chapter refers to P.D. Sniegowski, P.J.
Gerrish, and R.E. Lenski.


Introduction and Findings

Lenski and collaborators (Lenski et al., 1991)
established twelve replicate experimental populations of a
clonal strain of E. coli in order to study evolution
directly in the laboratory. Because these populations were
founded from a single ancestral clone, mutation provided
their only source of genetic variation. The glucose-limited
environment in which these populations were propagated was
essentially novel at the outset of the experiment and
provided considerable scope for adaptive evolution.
Substantial evolutionary increases in fitness and changes in
certain other phenotypic features in these populations have
been documented elsewhere (Lenski et al., 1991; Lenski &
Travisano, 1994; Vasi et al., 1994; Travisano & Lenski,
1996; Elena et al., 1996).

We measured mutation rates in the common ancestral
strain and in the twelve experimental populations after they
had been evolving for 10,000 generations. A majority of the
populations retained the ancestral mutation rate; however,
three populations (designated Ara-2, Ara-4 and Ara+3)

displayed mutation rates that were between one and two


orders of magnitude higher than those in the ancestor
(Figure 1.1). Figure 1.2 illustrates when mutator
phenotypes arose during the evolution of populations Ara-2,
Ara-4 and Ara+3, as determined by screening for the mutator
phenotype in isolates stored periodically throughout the
experiment. Once a lineage displayed a mutator phenotype,
all subsequent isolates from that lineage also tested as
mutators through 10,000 generations. This observation
indicates that mutators rose to and remained at high
frequencies in these populations.

To ascertain a genetic basis for the observed mutator
phenotypes, we transformed 10,000-generation clonal isolates
from populations Ara-2, Ara-4 and Ara+3 and an isolate from
the ancestral strain REL606 with multicopy plasmids bearing
wild-type alleles of seven known general mutator loci. The
ancestral mutation rate was fully restored in the Ara-2,
Ara-4 and Ara+3 strains only by the presence of wild-type
alleles of genes in the methyl-directed mismatch repair
pathway (Modrich, 1991) (Figure 1.3). The mutation rate in
the ancestral strain, REL606, was not significantly affected
by the plasmids (with a single exception; see Figure 1.3).

In the Ara+3 strain the mutS+ allele alone restored the


ancestral mutation rate. In the Ara-2 and Ara-4 strains the
ancestral rate was completely restored by uvrD+, and the
data also indicated a partial effect of mutL+. In fact,
recent studies have suggested a mechanistic interaction
between the mutL and uvrD gene products, such that a defect
in one may be complemented by increased production of the
other in some cases (P. Modrich, personal communication).

Methods

Experimental System

The common ancestral strain was REL606, an Ara- clone
of E. coli B obtained by B. R. Levin from S. Lederberg
(Lederberg, 1966). A spontaneous Ara+ revertant was selected
from REL606 and designated REL607. Six clones each from
REL606 and REL607 were used to found the twelve experimental
populations, which were propagated at 37°C by daily 100-fold
dilutions into 10 ml of fresh Davis minimal medium (Carlton
& Brown, 1981) supplemented with glucose at 25 ug/ml.
Population sizes fluctuated between approximately 5 × 10^6
cells after dilution and 5 × 10^8 cells at stationary phase.


Isolates from each population were stored at -80°C at
100-generation intervals during the first 2500 generations of
the experiment and at 500-generation intervals thereafter.
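As a quick check on the propagation regime described above, the following Python sketch computes how many binary-fission generations a daily 100-fold dilution and regrowth implies, and roughly how many daily transfers 10,000 generations would take. The variable names are illustrative and are not part of the original protocol.

```python
import math

DILUTION_FACTOR = 100          # daily 100-fold dilution (from the text)
CELLS_AFTER_DILUTION = 5e6     # approximate bottleneck size (from the text)
CELLS_AT_STATIONARY = 5e8      # approximate final size (from the text)

# Regrowth by binary fission restores the population after each dilution,
# so the number of doublings per day is log2 of the dilution factor.
generations_per_day = math.log2(DILUTION_FACTOR)

# Daily transfers needed to accumulate 10,000 generations at that pace.
days_for_10000_generations = 10_000 / generations_per_day

print(f"generations per day: {generations_per_day:.2f}")
print(f"days for 10,000 generations: {days_for_10000_generations:.0f}")
```

The ratio of the two population sizes quoted in the text equals the dilution factor, which is why the doubling count depends only on the dilution.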

Mutation rate measurements

Fluctuation tests (Luria & Delbruck, 1943) to estimate
a given rate of mutation were conducted simultaneously on
all strains to provide controlled comparison. Prior to
testing, strains were revived and regrown for three days in
the original experimental medium to reestablish the
physiological conditions that had prevailed in the evolving
populations. In general, 24 or more cultures of a given
strain were grown from inocula of approximately 1,000 cells
for each fluctuation test. For a given mutation rate
measurement, all strains to be compared were grown in
aliquots from the same batch of Davis minimal medium.
Cultures were grown to stationary phase before selective
plating. (An additional 24 hours in stationary phase had no
discernable effect upon the mutation rate in pilot studies.)
Final population sizes were estimated by growing and

randomly harvesting three extra cultures for each strain and


measuring cell densities using a Coulter particle counter;
the means were used in mutation rate calculations. Ara+
mutants were enumerated on Davis minimal agar medium
supplemented with arabinose at 10 ug/ml; Nalr mutants were
enumerated on LB agar medium (Miller, 1992) containing 20
ug/ml of nalidixic acid; T5r mutants were enumerated by

plating cultures with excess phage T5 in soft agar.

Time of Origin and Persistence of Mutator Phenotypes

Mutator screens were conducted at approximately 500
generation intervals over the 10,000 generations of the
experiment. Overnight cultures in 5 ml of Davis minimal
medium (supplemented with glucose at 1,000 ug/ml) were
assayed for Nalr mutant numbers at stationary phase.
Expected distributions of mutants, based on mutator and
ancestral mutation rates, were used to generate criteria by
which isolates could be scored as mutator or nonmutator with

>95% confidence.


Complementation Tests

Strains were transformed with plasmids carrying
wildtype alleles for mutH (plasmid pGW1899), mutL (pGW1842),
mutS (pGW1811), uvrD (pGT26), mutT (pSK25), dnaQ (pMM5) and
dnaE (pMK9) according to a standard protocol (Sambrook et
al., 1989). The plasmid labels correspond to the following
sources: pGW from Pang et al. (1985), pGT from
Taucher-Scholz & Hoffman-Berling (1983), pSK from Bhatnagar &
Bessman (1988), pMM from Horiuchi et al. (1981), and pMK
from Schaaper & Cornacchio (1992). Fluctuation tests were
conducted as described above, except that all strains were
propagated in LB medium (containing 60 ug/ml of ampicillin
where the strain was plasmid-bearing) and five parallel

cultures were used per fluctuation test.

Analysis of Fluctuation Test Data

I wrote a computer program employing a recent Luria-Delbruck
distribution-generating algorithm (Ma et al., 1992)
to calculate maximum likelihood mutation rates from
fluctuation test data. Instructions for obtaining and using


this program are given in Appendix 1.1. Approximate 95%
confidence intervals for the mutation rates illustrated in
Figure 1.1 were calculated using the formulae of Stewart
(1994). Approximate 95% confidence limits for the mutation
rates illustrated in Figure 1.3 are based upon the
theoretical variance of the maximum likelihood estimate of

ln m, assuming normality, where m is the expected number of

mutations per culture (see Appendix 1.1).
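The kind of calculation described in this subsection can be sketched as follows. This is not the program referred to in Appendix 1.1. It is a minimal illustration that assumes the recursion commonly attributed to Ma et al. (1992) for the Luria-Delbruck distribution, a golden-section search for the maximum (which presumes a single peak in the likelihood), hypothetical mutant counts, and the common approximation that the mutation rate is m divided by the final population size.

```python
import math

def ld_pmf(m, r_max):
    """Luria-Delbruck probabilities p_0..p_r_max for m expected mutations per culture.

    Recursion: p_0 = exp(-m), p_r = (m / r) * sum_{i<r} p_i / (r - i + 1).
    """
    p = [math.exp(-m)]
    for r in range(1, r_max + 1):
        total = sum(p[i] / (r - i + 1) for i in range(r))
        p.append(m / r * total)
    return p

def log_likelihood(m, counts):
    """Log-likelihood of the observed mutant counts given m."""
    p = ld_pmf(m, max(counts))
    return sum(math.log(p[r]) for r in counts)

def mle_m(counts, lo=1e-3, hi=50.0, tol=1e-6):
    """Golden-section search for the m that maximizes the log-likelihood."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if log_likelihood(c, counts) > log_likelihood(d, counts):
            b = d          # maximum lies in [a, d]
        else:
            a = c          # maximum lies in [c, b]
    return (a + b) / 2

# Hypothetical mutant counts from 10 parallel cultures (not data from this study).
counts = [0, 1, 0, 3, 2, 0, 5, 1, 0, 12]
m_hat = mle_m(counts)
final_cells = 5e8                    # assumed final population size per culture
mutation_rate = m_hat / final_cells  # approximate rate per cell per generation
print(f"m = {m_hat:.3f}, rate = {mutation_rate:.2e}")
```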

Discussion

Because most mutations are deleterious, mutator alleles
are likely to have negative average effects on fitness. In
an evolving clonal population, however, a deleterious
mutator can rise to high frequency (hitchhike) in
association with an adaptive mutation provided that the
selective cost of the mutator does not outweigh the
selective benefit of the adaptive mutation. Hitchhiking of
mutators with adaptive mutations was demonstrated previously
in chemostat populations of E. coli (Chao & Cox, 1983), but
these results left open the question of whether such events

would be likely to occur in natural populations: When a


mutator was introduced above a relatively high threshold
number relative to the wild type, the mutator population
always acquired an adaptive mutation before the wildtype
population did and the mutator rose to high frequency.
However, when the mutator was introduced in lower and more
realistic numbers, the wild type population always acquired
a beneficial mutation first and displaced the mutator.

In the evolution experiment we have described here, as
in natural populations, mutator alleles must have arisen by
mutation. This raises the question of how mutators reached
high frequencies in three of our twelve populations.
Computer simulations (Taddei et al., 1997) and stochastic
analytical modeling (Gerrish, Sniegowski and Mandrekar,
unpublished; see Chapter 4) suggest that rare mutators may
occasionally hitchhike to high frequencies in finite asexual
populations as a consequence of chance associations with
adaptive mutations. We view such chance hitchhiking events
as the likely explanation for our results. Sufficient time
may not have existed to observe them in earlier studies of
competition between mutator and wild-type strains (Cox &
Gibson, 1974; Chao & Cox, 1983; Trobner & Piechocki, 1984),

which were carried out for only a few tens or hundreds of


generations, as opposed to the 10,000 generations in our
study. Although we cannot formally rule out the possibility
that certain mutations in mismatch repair might increase
fitness directly, it is much more likely that the evolved
mutators we observed are deleterious or at best effectively
neutral. Indeed, known mutator alleles of mutS, mutL and
uvrD do not increase cell fitness under our experimental
conditions (C. Zeyl, unpublished data).

A prediction of the simulations conducted by Taddei et
al. (1997) is that mutators will hitchhike in some, but not
all, finite asexual populations undergoing adaptation. Our
experimental observations are consistent with this
prediction. A further prediction is that fitness will be
higher on average in mutator populations than in those that
retain the wild type mutation rate. However, this fitness
effect is very small and subtle: approximately half of the
mutator populations in the Taddei et al. simulations did not
show higher fitness, and the observed increases in fitness
over contemporaneous wild type populations were slight
(approximately 1%) in the remainder (see Taddei et al.,
1997, their Figure 3B). We tested for a relationship between

mutability and fitness in our experimental populations in


several ways, none of which gave statistical significance.
For example, we tested for a correlation between mutability
ranks (as calculated from data obtained in screens for
mutator phenotypes: see Methods) and relative fitness ranks
averaged across the duration of the experiment in all twelve
populations. The results were inconclusive (r = 0.28, n =
12, one-tailed P = 0.19). It is clear from our experiments
and the simulations of Taddei et al. that increased mutation
rate and not increased fitness is the more striking
consequence of the hitchhiking process.
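The reported significance level can be checked with a short calculation. The sketch below assumes that the correlation was assessed with the usual t approximation on n - 2 degrees of freedom; the text does not state which procedure was used, so this is only a plausibility check on the reported values r = 0.28, n = 12 and one-tailed P = 0.19.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=20_000):
    """One-tailed P(T > t) for t >= 0, via trapezoidal integration from 0 to t."""
    h = t / steps
    area = 0.5 * (t_pdf(0.0, df) + t_pdf(t, df))
    area += sum(t_pdf(i * h, df) for i in range(1, steps))
    return 0.5 - area * h

r, n = 0.28, 12                                 # values reported in the text
t_stat = r * math.sqrt((n - 2) / (1 - r * r))   # t approximation for a correlation
p_one_tailed = upper_tail(t_stat, n - 2)
print(f"t = {t_stat:.3f}, one-tailed P = {p_one_tailed:.3f}")
```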

Our finding that asexual populations can evolve high
mutation rates in a relatively benign environment may shed
light on recent observations associating mismatch repair
mutators with certain cancers (Modrich, 1995) and with
pathogenicity in E. coli and Salmonella (LeClerc et al.,
1996). As in our experimental populations, high mutation
rates may evolve as a stochastic byproduct of adaptation in
clonal tumour lineages and in populations of asexual
pathogens (Nowell, 1974; Moxon et al., 1994). The potential
for faster evolution once high mutation rates have evolved

may have important health implications.


Figure 1.1. Rates of mutation in 12 Ara- and Ara+
experimental populations (A-1 to A-6 and A+1 to A+6) at
10,000 generations and in their common ancestors REL606
(Ara-) and REL607 (Ara+). Error bars give approximate 95%
confidence intervals. (A) Reversion to Ara+ in isolates from
the seven Ara- populations. (B) Mutation to nalidixic acid
resistance. (C) Mutation to bacteriophage T5 resistance.


Figure 1.2. Time of appearance and evolutionary persistence
of mutator phenotypes in experimental populations Ara-2,
Ara-4, and Ara+3.


Figure 1.3. Mutation rates to nalidixic acid resistance in
10,000-generation isolates from populations Ara-2, Ara-4 and
Ara+3 and ancestor REL606 transformed with plasmids bearing
wild-type alleles of seven known general mutator loci.
Controls shown are plasmid-free. Error bars give approximate
95% confidence limits. Only an upper confidence limit is
shown for the case in which no mutants were obtained in any
culture. For visual comparison, the dashed horizontal line
and dotted horizontal lines in each panel illustrate the

mutation rate and approximate 95% confidence limits measured
in the ancestral control.


Chapter 2

THE FATE OF COMPETING BENEFICIAL MUTATIONS
IN AN ASEXUAL POPULATION1

Abstract

In sexual populations, beneficial mutations that occur in
different lineages may be recombined into a single lineage. In asexual
populations, however, clones that carry such alternative beneficial
mutations compete with one another and, thereby, interfere with the
expected progression of a given mutation to fixation. From theoretical
exploration of such "clonal interference", we have derived (1) a
fixation probability for beneficial mutations, (2) an expected
substitution rate, (3) an expected coefficient of selection for realized
substitutions, (4) an expected rate of fitness increase, (5) the
probability that a beneficial mutation transiently achieves polymorphic
frequency (≥ 1%), and (6) the probability that a beneficial mutation
transiently achieves majority status. Based on (2) and (3), we were
able to estimate the beneficial mutation rate and the distribution of
mutational effects from changes in mean fitness in an evolving E. coli
population.

 

1 This chapter was written originally as a paper (Gerrish & Lenski, 1998).
The "we" in this chapter refers to P.J. Gerrish and R.E. Lenski.


Introduction

Asexual populations adapt to their environment by the
occurrence and subsequent rise in frequency of beneficial
mutations. Without recombination, a population must
incorporate beneficial mutations in a sequential manner
(Fisher, 1930; Muller, 1932, 1964; Crow & Kimura, 1965).

The time required for fixation of a beneficial mutation may
be considerable if the population is large; however, the
mutation remains at low frequency for much of this time
(Lenski et al., 1991). While the mutation is at low
frequency, another beneficial mutation may arise on the
ancestral background. If two such beneficial mutations
occur in a sexual population, then the two novel genotypes
can recombine to form a fitter double-mutant (assuming no
negative gene interactions). In an asexual population,
however, these two novel genotypes compete with one another.
Such competition between beneficial mutations slows the
spread of, and may even eliminate, the first mutation. Such
"clonal interference" between beneficial mutations has many
important consequences for the dynamics of evolution in

asexual populations.

The idea that progression of a beneficial mutation to
fixation may be impeded by competing beneficial mutations
was articulated by Muller (1932, 1964) in the context of
discussions on the evolutionary advantage of sex. Almost in
passing, a brief theoretical treatment was later given by
Haigh (1978), in which he proposed a discrete-time model of
competing beneficial mutations. Employing a different
approach, we give a full theoretical treatment of the
phenomenon of competing beneficial mutations and its
consequences.

The body of this paper is presented in two main parts.
In the first part, a probability of fixation is derived
which incorporates the effect of competition between
beneficial mutations, and some consequences of this
derivation are then explored. The dynamics of fixation are
such that a relatively simple derivation suffices. In the
second part, the probability is derived that a beneficial
mutation achieves a frequency greater than or equal to some
specified frequency, f. From this, the probability that a
beneficial mutation becomes transiently polymorphic
(0.01<f<1) or transiently common (0.5<f<1) is derived. The

derivations in the second part require treatment of the


dynamics of a three-genotype system; hence, the derivations

are more complex than those of the first part.

Clonal interference and fixation

Clonal interference among beneficial mutations

Some definitions. We refer to the common progenitor of one
or more mutants as the ancestor; the ancestral genotype,
which is haploid, is denoted by ab, and the number of
carriers of the ab genotype present at time t is denoted by
x(t). A mutant that carries a beneficial mutation has
genotype Ab and has y(t) carriers at time t. Another mutant
that carries an alternative mutation, also beneficial, has
genotype aB and has z(t) carriers at time t. When
discussing a beneficial mutation that is followed by the
appearance of one or more alternative mutations, the first
beneficial mutation shall often be described in retrospect
as the original mutation. If the original mutation is

followed by a superior mutation, then there is a significant

probability that the original mutation will be eliminated.


This phenomenon, whereby the fate of an original beneficial
mutation is altered by the appearance of a superior

alternative mutation, shall be called clonal interference.

The expected number of interfering mutations. We derive
here the expected number of alternative mutations that are
superior to a beneficial mutation and hence interfere with
the progression of that mutation to fixation. Assuming that
the number of such interfering mutations is Poisson
distributed, we determine the probability that no
interfering mutation occurs by calculating the zero-class.
This probability is an important factor in determining the
likelihood of fixation of a beneficial mutation.

Suppose a population is homogeneous until the
appearance of a beneficial mutation, at which time the
population consists of two fitness variants: the ancestral
genotype and the beneficial mutant. Let the beneficial
mutant appear in the ancestral population at time t = 0.

The beneficial mutant, being competitively superior to the
ancestor, slowly displaces the latter until finally reaching
fixation at some time, t_f. Therefore, the time interval

during which the beneficial mutation is present but not yet


fixed is (0, t_f). The expected number of interfering
mutations is simply the expected number of mutations
occurring in the interval (0, t_f) that are competitively
superior to the original beneficial mutation.

If total population size, N, is constant, then the
dynamics of the two genotypes are logistic (see, for
example, Crow & Kimura, 1970). Let s denote the difference
in Malthusian parameters between ancestor ab and beneficial
mutant Ab. We have chosen the letter s because, under
logistic growth, a difference in Malthusian parameters is
equivalent to a selection coefficient when the unit of time
is generations. Let μ denote the beneficial mutation rate
per capita, per generation. Define t_f as the time to
virtual fixation of the mutant subpopulation, i.e., y(t_f) =
N-1. The expected number of further beneficial mutations
between the time of appearance of the original mutation and
its fixation is

    \mu \int_0^{t_f} x(t)\,dt = \frac{\mu}{s} N \ln N .    (1)
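Equation (1) can be checked numerically. The sketch below assumes the standard logistic solution for a single mutant spreading in a population of constant size N, integrates the ancestral count x(t) up to the time at which y = N - 1, and compares the result with (μ/s) N ln N. The parameter values are hypothetical and chosen only for illustration.

```python
import math

def ancestor_count(t, s, N):
    """x(t): ancestral carriers while one mutant (y(0) = 1) spreads logistically."""
    y = N / (1.0 + (N - 1.0) * math.exp(-s * t))
    return N - y

def further_mutations_numeric(mu, s, N, steps=100_000):
    """mu * integral of x(t) from 0 to t_f (trapezoidal rule), where y(t_f) = N - 1."""
    t_f = 2.0 * math.log(N - 1.0) / s      # solves y(t_f) = N - 1
    h = t_f / steps
    area = 0.5 * (ancestor_count(0.0, s, N) + ancestor_count(t_f, s, N))
    area += sum(ancestor_count(i * h, s, N) for i in range(1, steps))
    return mu * area * h

def further_mutations_closed_form(mu, s, N):
    """Right-hand side of equation (1)."""
    return (mu / s) * N * math.log(N)

mu, s, N = 2e-9, 0.1, 3.3e7    # hypothetical values for illustration
numeric = further_mutations_numeric(mu, s, N)
closed = further_mutations_closed_form(mu, s, N)
print(f"numeric: {numeric:.4f}   closed form: {closed:.4f}")
```

Under these assumptions the exact integral is (μ/s) N ln(N - 1), which is indistinguishable from the stated result for large N.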

Of these beneficial mutations produced, we now
calculate the fraction that interfere with the growth of the
original mutation. We assume the effects of beneficial
mutations to be exponentially distributed (cf. Kimura,
1970; see Discussion). Hence, the probability density for s
is α e^{-αs}, where α characterizes the distribution of
mutational effects and may be determined from empirical data
(see Estimation of parameters).

In the first few generations of growth, a beneficial
mutation may be lost due to random sampling events, or
drift. We employ the general notation, π(s), to denote the
probability that a beneficial mutation is not lost by drift
in these first few generations. (While dependence on s
alone is indicated by our notation, this probability may in
some cases be a function of other parameters as well.) All
further derivations employ the general notation, π(s),
whereas all computations implement the approximation,
π(s) ≈ 4s, which is derived in Appendix 2.1 for the special
case of bacterial populations. We emphasize that our
analytical results are general for asexual species;
implementing these results, however, depends on first

finding an appropriate function, π(s), for the particular
species under study. A viral species, for example, might be
assumed to have a Poisson distribution of offspring, in
which case π(s) ≈ 2s when population size is constant
(Haldane, 1927). Expressions for probabilities of surviving
drift in fluctuating populations are given in Otto and
Whitlock (1997).

In light of the two preceding paragraphs, the
probability that an arbitrarily chosen beneficial mutation
(i) has a selection coefficient greater than s and (ii)
survives drift is

    \int_s^{\infty} \pi(u)\, \alpha e^{-\alpha u}\, du .

All further derivations assume that π(u) is linearly related
to u, in which case this integral is equivalent to
e^{-αs} π(s + 1/α).
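The equivalence follows from one integration by parts. As a worked sketch, assume π(u) = c u for some constant c (c = 4 for the bacterial approximation and c = 2 for the Poisson case mentioned above; the symbol c is introduced here only for illustration):

```latex
\begin{align*}
\int_s^{\infty} \pi(u)\,\alpha e^{-\alpha u}\,du
  &= c \int_s^{\infty} u\,\alpha e^{-\alpha u}\,du \\
  &= c \left[ -u\, e^{-\alpha u} \right]_s^{\infty}
     + c \int_s^{\infty} e^{-\alpha u}\,du \\
  &= c\, s\, e^{-\alpha s} + \frac{c}{\alpha}\, e^{-\alpha s} \\
  &= e^{-\alpha s}\, c \left( s + \frac{1}{\alpha} \right)
   = e^{-\alpha s}\, \pi\left( s + \frac{1}{\alpha} \right).
\end{align*}
```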

Because loss by drift occurs in the first few
generations, whereas loss by clonal interference is more
probable in later generations, we can make the simplifying
assumption that these two processes are independent.
Therefore, the expected number of mutations occurring in the

interval (0, t_f) that are superior to a given beneficial
mutation with selective advantage, s, and that survive the
effects of drift is

    \lambda(s, \alpha, \mu, N) = \frac{\mu}{s} N \ln(N)\, e^{-\alpha s}\, \pi\left(s + \frac{1}{\alpha}\right) .    (2)

This is the expected number of interfering mutations.
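To give a feel for the magnitudes involved, the following sketch evaluates the expected number of interfering mutations for a few selective advantages. It assumes the linear 4s approximation for the probability of surviving drift, and the values chosen for the distribution parameter, the beneficial mutation rate and the population size are hypothetical.

```python
import math

def pi_survive(s):
    """Probability of surviving drift; the linear 4s approximation used for bacteria."""
    return 4.0 * s

def interfering_mutations(s, alpha, mu, N):
    """Equation (2): expected number of superior mutations that survive drift."""
    return (mu / s) * N * math.log(N) * math.exp(-alpha * s) * pi_survive(s + 1.0 / alpha)

# Hypothetical parameter values chosen only for illustration.
alpha, mu, N = 35.0, 2e-9, 3.3e7
for s in (0.01, 0.05, 0.10, 0.20):
    print(f"s = {s:.2f}  lambda = {interfering_mutations(s, alpha, mu, N):.3f}")
```

Mutations of small effect face many superior competitors, whereas mutations of large effect face few, which is what drives the results that follow.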

Fixation probability of a beneficial mutation

If a beneficial mutation survives the first few
generations and is not lost by drift, its fixation is still
far from ensured. In fact, fixation of a beneficial
mutation may be very unlikely as a consequence of the
presence of, and competition with, alternative mutations.

A beneficial mutation is fixed only if (i) it survives
drift, and (ii) no superior mutation appears and survives
drift in the time interval required for fixation. Given
selective advantage s, the probability that a beneficial
mutation will not be lost by either drift or clonal
interference is simply the product,

    \Pr\{\mathrm{fix} \mid s, \alpha, \mu, N\} = \pi(s)\, e^{-\lambda(s, \alpha, \mu, N)} .    (3)

This probability is plotted against s in Figure 2.1.
Finally, the probability density for the condition that a
given beneficial mutation confers a selective advantage s is
α e^{-αs}. The probability, therefore, that an arbitrarily
chosen beneficial mutation will become fixed in a population
is

    \Pr\{\mathrm{fix} \mid \alpha, \mu, N\} = \alpha \int_0^{\infty} \pi(s)\, e^{-\lambda(s, \alpha, \mu, N) - \alpha s}\, ds .    (4)

In Figure 2.2, the fixation probability given by equation
(4) is plotted for different combinations of μ and N. Note
the dramatic decrease in fixation probability with
increasing beneficial mutation rate and with increasing N.
The above calculations will become more informative and
useful when the fixation probability is converted to an
expected substitution rate of beneficial mutations in a

population.
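The fixation probability of equation (4) has no simple closed form, but it is easy to evaluate numerically. The sketch below is one way to do so, assuming the linear 4s approximation for survival of drift, a midpoint-rule integration truncated at 30 times the mean mutational effect, and a hypothetical value for the distribution parameter.

```python
import math

def pi_survive(s):
    """Probability of surviving drift; the linear 4s approximation."""
    return 4.0 * s

def lam(s, alpha, mu, N):
    """Equation (2): expected number of interfering mutations."""
    return (mu / s) * N * math.log(N) * math.exp(-alpha * s) * pi_survive(s + 1.0 / alpha)

def pr_fix_given_s(s, alpha, mu, N):
    """Equation (3): survive drift and meet no superior competitor."""
    return pi_survive(s) * math.exp(-lam(s, alpha, mu, N))

def pr_fix(alpha, mu, N, steps=20_000):
    """Equation (4), integrated by the midpoint rule on (0, 30/alpha).

    The exponential tail beyond 30/alpha is negligible, and the midpoint
    rule avoids evaluating lam() at s = 0.
    """
    h = 30.0 / alpha / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += pr_fix_given_s(s, alpha, mu, N) * alpha * math.exp(-alpha * s)
    return total * h

ALPHA = 35.0   # hypothetical value, for illustration only
for mu in (1e-10, 1e-8):
    for N in (1e5, 1e7, 1e9):
        print(f"mu = {mu:.0e}, N = {N:.0e}: Pr(fix) = {pr_fix(ALPHA, mu, N):.4f}")
```

When interference is negligible, the integral reduces to 4 divided by the distribution parameter, which provides a convenient check on the numerics.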

The expected rate of substitution

We now make a simple, intuitive, but erroneous
calculation in order to demonstrate that the results of this
section may counter a seemingly reasonable train of thought.
The total number of beneficial mutations produced per
generation by a population is equal to the beneficial
mutation rate times the number of individuals in the

population, μN. Suppose that a fraction 4s survive drift.
If one made the assumption that a certain fixed fraction,
θ, of these beneficial mutations go to fixation, then the
rate of substitution would be 4sθμN. Put differently, the

rate of substitution might be presumed to be a linear
function of either mutation rate or population size. In
this section, however, we show that when clonal interference
is taken into account and the population is large, mutation
rate and population size have surprisingly small effects on
substitution rate.

With the fixation probability given by equation (4),
the expected substitution rate of beneficial mutations is

given by

    \langle c(\alpha, \mu, N) \rangle = \mu N \Pr\{\mathrm{fix} \mid \alpha, \mu, N\} ,    (5)

where <·> denotes the expected value. As shown in Figure
2.3, a very large change in beneficial mutation rate
(several orders of magnitude) has little effect on the
substitution rate of the population, especially when the
population is large. This constraint may be thought of as a
"law of diminishing returns," where the investment is the
number of beneficial mutations produced by a population and

the returns are adaptive substitutions.
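The diminishing returns can be illustrated numerically. The sketch below compares the expected substitution rate at two beneficial mutation rates that differ by a factor of 100, under the same assumptions as before: the linear 4s approximation for survival of drift, midpoint-rule integration, and hypothetical parameter values.

```python
import math

def lam(s, alpha, mu, N):
    """Equation (2), with the linear approximation pi(s) = 4s."""
    return (mu / s) * N * math.log(N) * math.exp(-alpha * s) * 4.0 * (s + 1.0 / alpha)

def pr_fix(alpha, mu, N, steps=20_000):
    """Equation (4) by the midpoint rule on (0, 30/alpha)."""
    h = 30.0 / alpha / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += 4.0 * s * math.exp(-lam(s, alpha, mu, N) - alpha * s)
    return alpha * total * h

def substitution_rate(alpha, mu, N):
    """Equation (5): expected substitutions per generation."""
    return mu * N * pr_fix(alpha, mu, N)

ALPHA, N = 35.0, 1e9            # hypothetical values for illustration
low = substitution_rate(ALPHA, 1e-10, N)
high = substitution_rate(ALPHA, 1e-8, N)
print(f"rate at mu = 1e-10: {low:.3e}")
print(f"rate at mu = 1e-8:  {high:.3e}")
print(f"fold change: {high / low:.1f} (the naive linear calculation predicts 100)")
```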

The expected selection coefficient of successful mutations

Figure 2.1 showed that there is some critical value of
s below which the probability of fixation of a beneficial
mutation is essentially zero. A beneficial mutation whose
selective advantage is small is not likely to become fixed
because it must compete with many superior mutations. On
the other hand, a beneficial mutation whose advantage is
large is less likely to be produced. Hence, there must be
some intermediate selection coefficient that balances the
fixation advantage of large s with the more frequent
occurrence of small s. This balance corresponds to the
expected selection coefficient of successful mutations.


Let p(s) = K π(s) e^{-λ(s,α,μ,N) - αs}, where K is a
normalizing constant such that the integral of p(s) over
(0, ∞) equals 1. Then p(s) is
the probability density that a beneficial mutation of
selective advantage s will be (i) produced and (ii) fixed.
Therefore, the expected value for the selection coefficient
of successful mutations is

    \langle s(\alpha, \mu, N) \rangle = \int_0^{\infty} s\, p(s)\, ds .    (6)

Figure 2.4 reveals that this expectation is essentially
constant for μN < 0.01 and increases approximately linearly
with the log of population size when μN > 0.1.
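The expectation in equation (6) can be evaluated as a ratio of two integrals, so the normalizing constant never needs to be computed explicitly. The following sketch does this under the same assumptions as above (linear 4s approximation for survival of drift, midpoint-rule integration, hypothetical parameter values).

```python
import math

def lam(s, alpha, mu, N):
    """Equation (2), with the linear approximation pi(s) = 4s."""
    return (mu / s) * N * math.log(N) * math.exp(-alpha * s) * 4.0 * (s + 1.0 / alpha)

def expected_s(alpha, mu, N, steps=20_000):
    """Equation (6): mean selection coefficient of mutations that are fixed."""
    h = 30.0 / alpha / steps
    numerator = denominator = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        weight = 4.0 * s * math.exp(-lam(s, alpha, mu, N) - alpha * s)  # p(s) / K
        numerator += s * weight
        denominator += weight
    return numerator / denominator

ALPHA, MU = 35.0, 1e-9           # hypothetical values for illustration
for N in (1e5, 1e7, 1e9, 1e10):
    print(f"mu*N = {MU * N:.0e}: expected s = {expected_s(ALPHA, MU, N):.4f}")
```

In the limit of negligible interference the density is proportional to s e^{-αs}, so the expectation tends to 2/α; this is the plateau seen at small μN.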

Effect of clonal interference on rate of fitness increase

At this point, sufficient information has been provided
to determine how clonal interference between beneficial
mutations affects the rate of adaptive evolution. Having
derived (i) the rate at which substitutions occur and (ii)

the expected selective advantage conferred by substitutions,


we now calculate the expected rate of fitness increase

simply as the product of (i) and (ii):

    \frac{dw}{dt} = \langle c(\alpha, \mu, N) \rangle \, \langle s(\alpha, \mu, N) \rangle
                  = \alpha \mu N \int_0^{\infty} s\, \pi(s)\, e^{-\lambda(s, \alpha, \mu, N) - \alpha s}\, ds .    (7)

Equation (7) is plotted against log10 N in Figure 2.5 for
different mutation rates. It appears that dw/dt approaches a
maximum value for increasing N. The same is true for μ.
Indeed, that a maximum value exists can be shown
mathematically. The implication is that there exists a sort

of "speed limit" for asexual evolution imposed by clonal

interference.
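A numerical evaluation of equation (7) shows how slowly the rate of fitness increase grows with population size once interference is strong. The sketch below again assumes the linear 4s approximation for survival of drift, midpoint-rule integration, and hypothetical parameter values; because 4s exceeds 1 for large s, the approximation should not be pushed to extreme population sizes.

```python
import math

def lam(s, alpha, mu, N):
    """Equation (2), with the linear approximation pi(s) = 4s."""
    return (mu / s) * N * math.log(N) * math.exp(-alpha * s) * 4.0 * (s + 1.0 / alpha)

def fitness_rate(alpha, mu, N, steps=20_000):
    """Equation (7): alpha * mu * N * integral of s * pi(s) * exp(-lam - alpha*s)."""
    h = 30.0 / alpha / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += s * 4.0 * s * math.exp(-lam(s, alpha, mu, N) - alpha * s)
    return alpha * mu * N * total * h

ALPHA, MU = 35.0, 1e-9           # hypothetical values for illustration
for N in (1e5, 1e7, 1e9):
    print(f"N = {N:.0e}: dw/dt = {fitness_rate(ALPHA, MU, N):.3e}")
```

Without interference the rate would equal 8μN/α² and would rise in direct proportion to N.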

Estimation of parameters: an empirical example

The previous developments show some characteristic
consequences of clonal interference; yet, these developments
remain at the level of sweeping generalities until we find
the region of parameter space in which biological reality
lies. We demonstrate here that the parameters α and μ may,

in fact, be estimated empirically.

Equations (5) and (6) govern the expected rate of
substitution and the expected selection coefficient of
substitutions, respectively, both being functions of $\alpha$, $\mu$,
and $N$. If $N$ is known, then the resulting two equations
contain two unknowns and are linearly independent:

$$\langle s(\alpha,\mu)\rangle - s_{obs} = 0, \qquad \langle\theta(\alpha,\mu)\rangle - \theta_{obs} = 0. \qquad (8)$$

The parameters $\alpha$ and $\mu$ may, therefore, be determined from
this pair of equations given observed values for the
substitution rate, $\theta_{obs}$, and the selection coefficient of
substitutions, $s_{obs}$.

It is possible to obtain such values by tracking the
fitness trajectory of an evolving population (Lenski et al.,
1991; Lenski & Travisano, 1994). The average time between
periodic selection events gives the reciprocal of the
substitution rate estimate; the average fitness increase
caused by periodic selection events gives an estimate for
the selection coefficient of substitutions.

As an example, we estimate $\alpha$ and $\mu$ using the fitness
trajectory observed for an evolving Escherichia coli
population (Lenski et al., 1991; Lenski & Travisano, 1994).
This example serves two purposes: (i) it demonstrates the
estimation procedure, and (ii) it puts us in the "biological
ball-park" of parameter space. Lenski and colleagues
serially propagated several E. coli populations for ten
thousand generations of binary fission in a constant
environment. (A particularly nice feature of working with
bacteria is that samples of the evolving populations may be
frozen and later "resurrected" for comparison with samples
from earlier or later times. In this way, one may track the
evolution of populations over time by competing the evolved
populations against the ancestor to estimate their relative
fitness.) That calculation of generation number implies a
discrete-time formulation of population growth, whereas the
mathematics in this paper employ a continuous-time
formulation. In the following estimation of parameters, we
adjust the number of generations by a factor of ln 2
(= 0.693) to reflect this difference. During the first
2000 generations of binary fission (≈1400 natural
generations), they intensively assayed fitness for one
population (Lenski & Travisano, 1994). The observed fitness
trajectory was characteristically punctuated with sudden

fitness increases followed by long periods of stasis. This
general pattern is in accordance with the results of
previous sections: that due to clonal interference, the
substitution of a beneficial mutation is a rare, isolated
event, and that the fitness increases due to substitutions
are large. Based on three sudden fitness increases during
~1400 natural generations (Lenski & Travisano, 1994), the
average substitution rate is estimated as $\theta_{obs} = 0.002$
substitutions per generation; the average fitness increase
resulting from a substitution is $s_{obs} = 0.1$. The effective
population size with respect to the substitution of
beneficial mutations, and given the serial transfer regime,
was determined to be $3.3 \times 10^7$ (Lenski et al., 1991).

We have estimated parameters $\alpha$ and $\mu$ from these data by
finding the point of intersection between the solution
curves of equations (8). The solution for this system of
equations is $\alpha = 35$ and $\mu = 2.0 \times 10^{-9}$ beneficial mutations
per replication. Given that the genomic mutation rate of E.
coli is approximately $3 \times 10^{-3}$ mutations per replication
(Drake, 1991), one can infer that the proportion of
mutations that are beneficial is roughly one in a million.
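
As a rough check on the arithmetic in this example, the following Python sketch reproduces the generation conversion, the observed substitution rate, and the inferred fraction of mutations that are beneficial. The variable names are illustrative; the numerical inputs are those quoted in the text.

```python
import math

binary_generations = 2000
# Convert generations of binary fission to natural (continuous-time) generations.
natural_generations = binary_generations * math.log(2)   # about 1386, quoted as ~1400

fitness_jumps = 3
theta_obs = fitness_jumps / 1400.0        # about 0.002 substitutions per generation

mu_beneficial = 2.0e-9    # estimated beneficial mutation rate per replication
mu_genomic = 3.0e-3       # genomic mutation rate per replication (Drake, 1991)
fraction_beneficial = mu_beneficial / mu_genomic   # roughly one in a million

print(round(natural_generations), round(theta_obs, 4), fraction_beneficial)
```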

We emphasize that these estimates depend on (i) the
assumption of an exponential distribution of beneficial
mutational effects, and (ii) the assumption that $\alpha$ and $\mu$
remain constant even as mean fitness increases. The
empirical fitness trajectories referred to in this section
show a decreasing rate of increase, suggesting that

assumption (ii) is false if the environment is constant.

(See Assumptions of the models.)

Transiently common mutations

Clonal interference - a general model

Suppose that, while one beneficial mutation grows in
number, a second beneficial mutation appears that is
superior to the first. The population is now composed of
three genotypes of interest: the ancestor and two competing
beneficial mutations. If the first beneficial mutation is
not close to fixation, then its growth is unaffected by the
growth of the second, superior mutation until the latter has

become sufficiently abundant to affect the mean fitness of

the population noticeably. When the superior mutation
attains sufficient number, the growth of the original
mutation is retarded until, at some point, it reaches a
maximum frequency and then begins to decline. We are
interested in the probability that the frequency at which
this maximum occurs is greater than or equal to some
frequency, f. To determine the probability that any
particular beneficial mutation achieves a frequency of at
least $f$, we begin by computing the time, $t_z$, at which a
superior mutation with selective advantage $s_z$ must have
appeared to ensure that the original mutation achieves a
maximum frequency of exactly $f$. Then we calculate the
probability that no such superior mutation occurs in the
interval $(0, t_z)$; this is the probability that the original
mutation achieves a maximum frequency of at least $f$. (Note
that $t_z$ is itself a function of the selective advantage, $s_z$,
of a given superior mutation.) To facilitate presentation
of this development, we introduce the term candidate
replication to refer to any replication event which, if it
were to produce a superior mutation, would prevent the
original mutation from attaining frequency $f$.

Consider a three-genotype system with ancestor $x$,
original beneficial mutant $y$, and alternative superior
mutant $z$; the deterministic solution for the dynamics of
such a system is derived in Appendix 2.2. The time, $t_{max}$, at
which beneficial mutation $y$ reaches maximum number is a
function of the time of occurrence, $t_z$, of an alternative
mutation, $z$, which is superior to $y$, i.e., $t_{max} = t_{max}(t_z)$.
The time $t_z$ is that which satisfies $y(t_{max}(t_z)) = fN$,
where $t_{max}(t_z)$ is such that $dy/dt = 0$. If the superior
mutation, $z$, were to occur before time $t_z$, then the original
mutation, $y$, would not achieve frequency $f$. We can,
therefore, calculate the probability that no superior
mutation occurs in the interval $(0, t_z)$ by determining the
expected number of such mutations in this interval and
assuming that they are Poisson distributed.
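
The dependence of the peak frequency of $y$ on the arrival time of $z$ can be illustrated numerically. The Python sketch below does not reproduce the solution of Appendix 2.2; it assumes simple continuous-time selection in which the abundance of each mutant relative to the ancestor grows exponentially at its selective advantage, and in which each mutant enters at relative abundance $1/N$. The function names and the parameter values in the usage lines are hypothetical.

```python
import math

def peak_frequency(s_y, s_z, t_z, n, t_end=600.0, dt=0.1):
    """Largest frequency reached by y when the superior mutant z arises at time t_z."""
    best = 0.0
    for i in range(int(t_end / dt) + 1):
        t = i * dt
        r_y = math.exp(s_y * t) / n                                  # y relative to ancestor
        r_z = math.exp(s_z * (t - t_z)) / n if t >= t_z else 0.0     # z relative to ancestor
        best = max(best, r_y / (1.0 + r_y + r_z))
    return best

def arrival_time_for_peak(f, s_y, s_z, n, t_hi=300.0):
    """Bisect for the arrival time t_z at which y peaks at frequency f.

    Relies on the peak frequency of y increasing with t_z.
    """
    lo, hi = 0.0, t_hi
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if peak_frequency(s_y, s_z, mid, n) < f:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Hypothetical values: s_y = 0.1, s_z = 0.15, N = 10^6, target peak f = 0.5.
    t_z = arrival_time_for_peak(0.5, 0.1, 0.15, 1e6)
    print(t_z, peak_frequency(0.1, 0.15, t_z, 1e6))
```

Any superior mutation arising earlier than the returned time holds $y$ below the target frequency, which is the sense in which the replications before that time are candidate replications.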

The first step in determining the expected number of
superior mutations interfering with the original mutation is
to calculate how many candidate replications take place,

i.e., the number of ancestral replications in the interval
$(0, t_z)$, or $R = \int_0^{t_z} x(t)\,dt$. But $t_z$ is a function of the
selective advantage, $s_z$, of the superior mutation. $R$ is
closely approximated by evaluating $t_z(s_z)$ at the expected
value for $s_z$, conditional on it being greater than $s_y$, i.e.,
$t_z = t_z(\langle s_z \mid s_z > s_y \rangle)$, where
$\langle s_z \mid s_z > s_y \rangle = s_y + 1/\alpha$ is the expected
selection coefficient of a superior mutation.

The expected number of beneficial mutations in the
interval $(0, t_z)$ is $\mu R$, where $\mu$ is the per-replication rate
at which beneficial mutations are produced. Of these $\mu R$
beneficial mutations, only the fraction $e^{-\alpha s_y}$ will be
competitively superior to $y$, the original beneficial
mutation. And of these $\mu R\,e^{-\alpha s_y}$ superior mutations, only
another fraction $\pi(s_y + 1/\alpha)$ will survive drift. Therefore,
the expected number of beneficial mutations that occur in
the interval $(0, t_z)$, that are superior to $y$, and that
survive drift is $\omega = \mu R\,e^{-\alpha s_y}\,\pi(s_y + 1/\alpha)$. Because this
expectation is a function of $s_y$ but not $s_z$, we simplify our
notation at this point by letting $s = s_y$. The analytical
solution for $R$, the number of candidate replications, is
derived in Appendix 2.3. The resulting expected number of
superior mutations that would prevent a given beneficial
mutation from attaining frequency $f$ is

$$\omega(s,\alpha,\mu,N,f) = \frac{\mu N \ln(N/\chi)}{s}\,e^{-\alpha s}\,\pi\!\left(s + \frac{1}{\alpha}\right), \qquad (9)$$

where $\chi \approx 1 + \frac{f^{-1} - 1}{\alpha s + 1}\,(\alpha s N)^{\alpha s/(\alpha s + 1)}$. Thus, the
probability that a given beneficial mutation achieves a
maximum frequency of at least $f$ is

$$\pi(s)\,e^{-\omega(s,\alpha,\mu,N,f)}, \qquad (10)$$

where the effect of drift is incorporated by $\pi(s)$.
It is important to point out that equation (9)
incorporates an approximation that is essentially an
equality for $f < 0.95$, but which introduces significant
error for $f > 0.99$. (See Appendix 2.3 for details.) A
technical difficulty with equation (10) is that there is no
guarantee that $\omega(s,\alpha,\mu,N,f)$ is non-negative, whereas a
fundamental assumption of the Poisson process is that the
Poisson parameter be non-negative. To remedy this problem,
we impose the condition
$\omega(s,\alpha,\mu,N,f) = \max\{\omega(s,\alpha,\mu,N,f), 0\}$. Otherwise, a negative
Poisson parameter may arise if superior mutation $z$ must
appear before original beneficial mutation $y$ to ensure that
the latter attains maximum frequency $f$, i.e., $t_z$ is
negative. In this case, the probability that a given
beneficial mutation achieves a maximum frequency of at least
$f$ is equal to one, because an assumption of our analysis is
that the superior mutation $z$ does not appear before original
mutation $y$. We have shown that this assumption does not
introduce much error (see Assumptions of the models).

Probability of transiently polymorphic beneficial mutations

In this section, our objective is to determine with
what probability one might expect a beneficial mutation to
rise temporarily to polymorphic frequency. We define
polymorphic frequency as any frequency greater than or equal
to 0.01. In the Clonal interference and fixation section,
we were only concerned with whether or not a beneficial
mutation became fixed in a population, i.e., whether or not
$f \ge (N-1)/N$. Now, we examine the probability that the
frequency, $f$, of a beneficial mutation exceeds 0.01 yet
never reaches $(N-1)/N$. This is the probability that a
mutation will be transiently polymorphic.

Given that a beneficial mutation survives drift, the
probability that it will achieve polymorphic frequency is
$e^{-\omega(s,\alpha,\mu,N,0.01)}$. Given that the same mutation achieves
polymorphic frequency, the probability that it does not
reach fixation is computed as the probability that at least
one superior mutation appears in the interval $(t_z, t_f)$. The
expected number of superior mutations appearing in this
interval is

$$\psi(s,\alpha,\mu,N,f) = \frac{\mu N \ln \chi}{s}\,e^{-\alpha s}\,\pi\!\left(s + \frac{1}{\alpha}\right), \qquad (11)$$

where $\chi$ is as defined in equation (9). Therefore, given
that a mutation with selective advantage $s$ has achieved
polymorphic frequency, the probability that it does not
reach fixation is $1 - e^{-\psi(s,\alpha,\mu,N,0.01)}$. The probability that a

mutation will be transiently polymorphic is the product of
(i) the probability that the mutation survives drift, (ii)
the probability that the mutation achieves polymorphic
frequency given (i), and (iii) the probability that the
mutation does not reach fixation given (ii). Therefore, the
probability that any arbitrarily chosen beneficial mutation

transiently achieves polymorphic frequency is

$$\Pr\{\mathrm{poly} \mid \alpha,\mu,N\} = \alpha \int_0^\infty \pi(s)\,e^{-\omega(s,\alpha,\mu,N,0.01) - \alpha s}\left(1 - e^{-\psi(s,\alpha,\mu,N,0.01)}\right) ds \qquad (12)$$

This equation is plotted in Figure 2.6. Given a certain
population size, there is an intermediate value of the
beneficial mutation rate at which the probability is

greatest that an arbitrarily chosen beneficial mutation will

transiently achieve polymorphic frequency. Likewise, given
a certain beneficial mutation rate, there is an intermediate
population size that maximizes the probability that an
arbitrary beneficial mutation will be transiently
polymorphic. This result seems reasonable, because an
increased recruitment rate of beneficial mutations, $\mu N$,
increases the probability that a superior mutation occurs
before a given beneficial mutation can reach polymorphic
frequency (i.e., increases clonal interference). By
lowering $\mu N$, on the other hand, one reduces the probability
that a superior mutation occurs later, hence increasing the
probability that a beneficial mutation, which has already
achieved polymorphic frequency, will go to fixation (i.e.,
is not transient).
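
The probability in equation (12) can be evaluated by numerical quadrature. The Python sketch below is one way to do so, under stated assumptions: the probability of surviving drift is taken as $\pi(s) \approx 4s$ (capped at 1); $\omega$, $\psi$, and $\chi$ take the forms read from equations (9) and (11), with $\chi = 1 + \frac{f^{-1}-1}{\alpha s+1}(\alpha s N)^{\alpha s/(\alpha s+1)}$; and $\omega$ is clamped at zero. The function names and the integration grid are illustrative, and the output is not a substitute for Figure 2.6.

```python
import math

def pi_fix(s):
    # ASSUMPTION: probability of surviving drift, ~4s, capped at 1.
    return min(1.0, 4.0 * s)

def chi(s, alpha, n, f):
    # ASSUMED form of chi from equation (9).
    a = alpha * s
    return 1.0 + (1.0 / f - 1.0) / (a + 1.0) * (a * n) ** (a / (a + 1.0))

def common(s, alpha, mu, n):
    # Factor shared by omega and psi: (mu * N / s) * exp(-alpha*s) * pi(s + 1/alpha).
    return mu * n / s * math.exp(-alpha * s) * pi_fix(s + 1.0 / alpha)

def omega(s, alpha, mu, n, f):
    # Equation (9), clamped at zero as required for a Poisson parameter.
    return max(0.0, common(s, alpha, mu, n) * math.log(n / chi(s, alpha, n, f)))

def psi(s, alpha, mu, n, f):
    # Equation (11): superior mutations arising after frequency f is assured.
    return common(s, alpha, mu, n) * math.log(chi(s, alpha, n, f))

def prob_transient(alpha, mu, n, f, steps=20000):
    # Equation (12) for a general threshold f, by the trapezoid rule.
    lo, hi = 1e-9, 60.0 / alpha
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        s = lo + i * h
        g = (pi_fix(s) * math.exp(-omega(s, alpha, mu, n, f) - alpha * s)
             * (1.0 - math.exp(-psi(s, alpha, mu, n, f))))
        total += g * (0.5 if i in (0, steps) else 1.0)
    return alpha * total * h

if __name__ == "__main__":
    print(prob_transient(35.0, 2.0e-9, 3.3e7, 0.01))
```

Calling the same function with a larger threshold, for example `f=0.5`, gives the probability that a mutation becomes transiently common rather than merely polymorphic.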

Given the parameters estimated previously for an
evolving E. coli population ($\alpha = 35$, $\mu = 2.0 \times 10^{-9}$,
$N = 3.3 \times 10^7$), the probability that an arbitrarily chosen
beneficial mutation becomes transiently polymorphic is
approximately 0.034. With $\mu N \approx 0.07$, a beneficial mutation
would have occurred every 15 generations or so. Of these,
about 1 in 30 would become transiently polymorphic. Hence,
one would expect about three transient polymorphisms
($f > 0.01$) in 1400 natural generations. This number is
roughly comparable to the number of periodic selection
events that were observed. This correspondence suggests
that each beneficial mutation that went to fixation
displaced not only its "parent" genotype but also a
"sibling" genotype that had achieved some success.

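The expected number of transient polymorphisms quoted above follows from simple arithmetic, sketched here in Python. The variable names are illustrative; the inputs are the estimates given in the text.

```python
mu = 2.0e-9           # beneficial mutations per replication (estimate from the text)
n = 3.3e7             # effective population size
p_transient = 0.034   # probability of transient polymorphism reported in the text
generations = 1400    # natural generations observed

supply = mu * n                       # beneficial mutations per generation, ~0.07
waiting_time = 1.0 / supply           # ~15 generations between beneficial mutations
expected_transients = generations * supply * p_transient   # about three

print(supply, waiting_time, expected_transients)
```
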
Surprisingly, these estimates do not rely heavily on
the assumption that beneficial mutations are exponentially
distributed. Calculations based on an alternative
rectangular distribution show that the probability that a
beneficial mutation transiently achieves polymorphic
frequency is approximately 0.05. Thus, by assuming a
rectangular distribution, one might expect about four
transient polymorphisms in 1400 natural generations. The
fact that assuming such very different distributions results
in less than a two-fold difference in estimates suggests
that these results are fairly robust. The section
Assumptions of the models gives a more complete discussion

of this test of robustness.

Probability of transiently common mutations: the leapfrog

In a slight variation of the previous section, we will
now examine the probability that a beneficial mutation
achieves a frequency of 0.5 but is not fixed. We devote a
separate section to this special case because of the strange
dynamics it would present to an observer of a population in
which it occurred. In this case, a mutant Ab reaches
majority status before being supplanted by a superior mutant
aB, where both mutants are derived directly from the same
ancestor ab. At the genetic level, this appears as a
"leapfrog" episode in which (i) Ab replaces ab as the most
common genotype and thereafter aB replaces Ab as the most
common genotype, even though (ii) aB is more closely related
to ab than to Ab (Figure 2.7). If one were to sample this
population at times $t_1$, $t_2$, and $t_3$, as indicated in Figure
2.8, then one would observe that the sample from $t_3$ is more
closely related at the genetic level to the sample taken at
$t_1$ than to that taken at $t_2$.

Following the derivation of equation (12), the

probability that an arbitrarily chosen beneficial mutation

transiently achieves a frequency of 0.5 or more is

$$\Pr\{\mathrm{leapfrog} \mid \alpha,\mu,N\} = \alpha \int_0^\infty \pi(s)\,e^{-\omega(s,\alpha,\mu,N,0.5) - \alpha s}\left(1 - e^{-\psi(s,\alpha,\mu,N,0.5)}\right) ds \qquad (13)$$

Using the parameters previously estimated from an
evolving E. coli population, and following the same logic as
described at the end of the previous subsection, a
beneficial mutation would occur every 15 generations or so.
About one in every 55 of these mutations would be subject to
the "leapfrog" effect, which should thus occur every 800
generations or so. Therefore, it is quite possible that
some of the three periodic selection events observed during
the 1400 natural generations experiment were complicated by
this effect. Whether empirical data would resolve the
leapfrog as one or two periodic selection events would
depend, in part, on how close in time the relevant genotypes
became numerically dominant. If a leapfrog was resolved as
two distinct periodic selection events, then the descendants
after 1400 natural generations should differ from the
founding ancestral genotype by fewer than the three

beneficial mutations that would be expected under the

presumption that each periodic selection event was caused by
the sequential substitution of an additional mutation
(Figure 2.8). Figure 2.9a shows a numerical simulation of
the E. coli populations using the empirically estimated
parameters. The resulting trajectory for mean fitness,
shown in Figure 2.9b, illustrates that a single leapfrog
episode may indeed give the appearance of two periodic
selection events. But rather than implying the successive
fixation of two beneficial mutations, only a single

substitution has actually occurred.

Discussion

Summary of results

Competition between clones that carry different
beneficial mutations may be very important for the
evolutionary dynamics of asexual populations. The
prevalence of such "clonal interference" among beneficial
mutations increases dramatically with population size and
with mutation rate. The following points summarize some of

the most salient consequences of clonal interference. (1)

The fixation probability of a given beneficial mutation is a
decreasing function of both population size and mutation
rate. (2) Substitutions appear as discrete, rare events, no
matter how frequently beneficial mutations arise. If a
beneficial mutation is to overcome clonal interference and
become fixed, then it must confer a substantial selective
advantage. The advantage that is required for a reasonable
probability of fixation is an increasing function of
population size and mutation rate. (3) The rate of fitness
increase is an increasing function of both population size
and mutation rate, but it is only weakly dependent on these
parameters when their product is not small. (4) Using
observable trajectories for the mean fitness of evolving
asexual populations, it is possible to estimate both the
beneficial mutation rate and the distribution of beneficial
mutational effects. We obtained such estimates for an
evolving laboratory population of Escherichia coli. (5)
Beneficial mutations that become transiently abundant, but
which do not go to fixation, may be quite common in asexual
populations. (6) Some of these transient polymorphisms may
give rise to a "leapfrog" effect, in which the majority

genotype at some point in time is less closely related to

the immediately preceding majority genotype than to an
earlier genotype. Parameter estimates obtained for the
evolving laboratory population of E. coli are consistent
with this effect being an important feature of asexual

dynamical systems.

Assumptions of the models

The models presented here assume that the general form
of the distribution of beneficial mutational effects is that
of an exponential distribution. Kimura (1979) employs the
more general gamma distribution to describe the distribution
of deleterious mutational effects. Elena et al. (1997) have
shown that a compound gamma-rectangular distribution fits
well to experimental data from transposon-induced mutations
in E. coli. Intuitively, the exponential distribution seems
a good choice for beneficial mutational effects, because it
is reasonable to suppose that there are many more beneficial
mutations of small effect than of large effect. Fisher
(1930) reasoned that most mutations of large effect are
deleterious as a geometrical consequence of the high

dimensionality of fitness landscapes. He argued that the

ratio of deleterious to beneficial mutations increases with
mutational effect (i.e., phenotypic difference between
mutant and non-mutant), because a large radius in phenotypic
space is very likely to circumscribe potential improvements,
whereas a small radius stands a better chance of being
tangent to an improvement. That this effect increases with
the dimensionality of the fitness landscape is an easily
demonstrable fact of geometry. A convincing argument for
the use of the exponential distribution in particular comes
from extreme value theory (see Gillespie, 1991, p. 262).
Suppose that $M$ fitness alleles are present in a population
such that $W_{[1]} > W_{[2]} > W_{[3]} > \ldots > W_{[M]}$ (where $W$ denotes fitness). If
the population is in dynamic equilibrium, then the fittest
of these $M$ alleles greatly outnumbers the other $M-1$ alleles,
which are held at some low frequency by mutation-selection
balance. A fitness mutation results in a genotype whose
fitness is drawn at random from some unknown parent
distribution. Now, imagine that a novel fitness mutation
appears that is beneficial (i.e., fitter than the current
fittest genotype). If we denote the fitness of this
mutation by $W_{[0]}$, then $W_{[0]} > W_{[1]}$ and the selection coefficient
of this novel mutation is $s = W_{[0]}/W_{[1]} - 1$. Gillespie (1991)
shows that this $s$ is exponentially distributed in the limit
as $M \to \infty$, regardless of the shape of the unknown parent
distribution. (The reason for this result has to do with
the fact that $W_{[0]}$ and $W_{[1]}$ are the two largest fitnesses;
they are extreme values of the parent distribution.) In
other words, in the limit of infinite fitness alleles, the
distribution of $s$ is necessarily exponential.

To evaluate the sensitivity of our analysis to the
assumption that selection coefficients are exponentially
distributed, we replaced the exponential density with a
rectangular density, i.e., a density equal to $1/s_{max}$ for
$s \le s_{max}$ and to $0$ for $s > s_{max}$. See
Appendix 2.4 for results of these derivations. Following
the logic employed in Estimation of parameters, we have
estimated the parameters $s_{max}$ and $\mu$ to be 0.12 and $5 \times 10^{-10}$,
respectively, for an evolving laboratory population of E.
coli. Note that the beneficial mutation rate estimate, $\mu$,
is of the same order of magnitude as was obtained assuming
exponentially distributed selection coefficients. Figure
2.10 further shows that replacing the exponential with a

rectangular density changes the resulting fixation
probabilities only slightly. The slight discrepancy when $\mu N$
is very small, such that clonal interference is unimportant,
arises because the average selection coefficient (and hence
$4s$) is slightly higher for the rectangular than for the
exponential distribution. The probabilities of transient
polymorphisms (either f>0.01 or f>0.5) are consistently
higher when a rectangular density is assumed, although the
discrepancies are small. In view of these results, our
analyses appear to be reasonably robust with respect to the
form of the distribution of selection coefficients.

A second assumption of our analyses is that neither the
beneficial mutation rate nor the distribution of selection
coefficients changes over time. But in a constant
environment, a population becomes better adapted with time,
leaving progressively less room for further improvement. It
is likely that a well adapted population has (i) a lower
overall rate of beneficial mutation, (ii) a smaller average
effect of beneficial mutations, or both. Consequently,
$\mu = \mu(w)$ may be a decreasing function of fitness, while $\alpha = \alpha(w)$
may increase with fitness. These parameters are therefore
constant only when $w$ is constant. This condition may be met

in an environment that changes just fast enough to counter
adaptation of a population.

A third assumption made in these models is that the
progress of a given beneficial mutation is unaffected by the
presence of inferior beneficial mutations. By definition,
inferior beneficial mutations cannot themselves
competitively exclude a given beneficial mutation. However,
the selective advantage of a given beneficial mutation will
be lower relative to these inferior beneficial mutations
than relative to the ancestral genotype, and so inferior
beneficial mutations may prolong the time that is required
for fixation of a given beneficial mutation. As a
consequence, there may be a longer interval in which a
superior beneficial mutation could appear that would prevent
fixation of the original beneficial mutation. To address
this possible complication, the probabilistic models were
made fully dynamic by considering all beneficial mutations
since the most recent substitution. When the dust settled,
the results were essentially unchanged from those that we
have presented. For example, the fixation probability of an

arbitrarily chosen beneficial mutation was changed by a
factor of $\alpha/(\alpha+1)$, which is inconsequential since $\alpha$ is
generally large.

An assumption made in estimating the parameters $\alpha$ and $\mu$
from observed fitness trajectories is that the sudden jumps
observed in these trajectories are, in fact, fixation
events. Based on the results of The leapfrog, however, this
assumption is questionable; if a leapfrog event were to
occur, then it would give the appearance of two such
fixation events (Figure 2.9). Thus, of the three observed
jumps in fitness during 1400 natural generations (Lenski &
Travisano, 1994), for example, perhaps only two were actual
fixations and one was the result of a leapfrog episode. If
this were the case, then our estimates of $\alpha$ and $\mu$ would be
incorrect. To evaluate the degree to which these estimates
may be in error, we changed the assumption that observed
fitness jumps represent fixations and assumed instead that
these jumps represent beneficial mutations that achieved a
frequency of f > 0.5. To that end, we employed the
derivations of Clonal interference - a general model. This
change of assumption did not appreciably affect the

estimates ($\alpha = 29$, $\mu = 1.6 \times 10^{-9}$), indicating that our
initial assumption, at least in this case, did not introduce
much error.

Model validation by simulation

A general result of the section, Clonal interference
and fixation, is that trajectories of population mean
fitness are characteristically punctuated, with sudden jumps
followed by long periods of stasis, regardless of the
mutation recruitment rate, $\mu N$. To assess this general
prediction qualitatively, we simulated the occurrence of,
and competition among, many different beneficial mutations
whose selection coefficients were drawn at random from an
exponential distribution. Figure 2.9 demonstrates that,
despite fierce competition among numerous beneficial
mutations ($\mu N = 0.1$), the population mean fitness is not
appreciably affected until a fitness variant achieves high
frequency. (These results also lend support to the
assumption that mutations inferior to the currently dominant
variant play a negligible role in clonal interference.)

To test the models quantitatively, we ran repeated

simulations. The probabilistic predictions for (i) the

probability of fixation, (ii) the expected fitness increase
conferred by a substitution, (iii) the expected substitution
rate, and (iv) the probability of transiently achieving
polymorphic frequency all agreed well with a large number
of fully stochastic simulations. We emphasize that these
simulations allow for the more realistic situation in which
a mutant may acquire further beneficial mutations at any

time after its appearance.
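
A simulation of this general kind can be sketched in a few lines of Python. The sketch below is not the simulation used for Figure 2.9 or for the quantitative tests described above. It assumes deterministic selection among lineages, Poisson arrival of new mutants at rate $\mu N$ per generation, survival of drift with probability of about $4s$, and additive effects on log-fitness, so that a mutant lineage may acquire further beneficial mutations. The names `poisson` and `simulate` are hypothetical, and the default parameter values are taken from the text for illustration.

```python
import math
import random

def poisson(rng, mean):
    # Knuth's multiplication method; adequate for the small means used here.
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate(alpha=35.0, mu_n=0.1, n=3.3e7, generations=2000, seed=1):
    """Return the trajectory of mean log-fitness relative to the ancestor."""
    rng = random.Random(seed)
    lineages = [[1.0, 0.0]]     # each entry: [frequency, log-fitness]
    mean_fitness = []
    for _generation in range(generations):
        # Mutation: each new mutant draws s from an exponential with mean 1/alpha
        # and is kept only if it survives drift (ASSUMED probability ~4s).
        for _mutant in range(poisson(rng, mu_n)):
            s = rng.expovariate(alpha)
            if rng.random() < min(1.0, 4.0 * s):
                weights = [f for f, _ in lineages]
                parent = rng.choices(lineages, weights=weights)[0]
                lineages.append([1.0 / n, parent[1] + s])
        # Selection: exponential growth at each lineage's fitness, then renormalise.
        grown = [f * math.exp(w) for f, w in lineages]
        total = sum(grown)
        lineages = [[g / total, w] for g, (_, w) in zip(grown, lineages)]
        mean_fitness.append(sum(f * w for f, w in lineages))
    return mean_fitness

if __name__ == "__main__":
    trajectory = simulate()
    print(trajectory[::200])
```

Because drift is represented only by the survival step, this sketch is suited to inspecting the qualitative shape of the mean fitness trajectory rather than to reproducing the fully stochastic results.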

Inclusion of the double mutant

To this point, we have emphasized competition between
three genotypes, the progenitor (ab) and two mutants that
carry different beneficial mutations (Ab and aB). However,
a fourth genotype should eventually appear that has both
beneficial mutations (AB). If the effects of the two
beneficial mutations on fitness are additive, then the
double mutant will eventually take over the population. A
full treatment of the dynamics involving this fourth
genotype is beyond the scope of this paper. For now, we
address only one particular issue. If a leapfrog event is

to be manifest, then genotypes Ab and aB must each achieve
majority status before AB does; otherwise, the dynamics will
appear as a sequential substitution (Figure 2.7). The
probability of occurrence of the leapfrog must, therefore,
incorporate the probability that sequential substitution

does not occur. We have conservatively estimated this

n
. . n(s)u ‘ .
probability as exp - e f'th) dt ; this factor was

0

 

incorporated into the integrand of equation (13) and found
to have a negligible effect (probabilities were reduced by
no more than five percent for a wide range of parameters).
Therefore, we neglected this factor in our earlier

developments in order to keep things as simple as possible.

Implications for the evolution of reproductive strategies

Muller (1964) briefly alludes to the concept of clonal
interference while making a case for the evolutionary
advantage of sex. Muller argued that adaptive evolution of
asexual populations is inefficient, because the fraction of

beneficial mutations that are lost due to competition with

alternative beneficial mutations may be substantial in a
large population. Recombination would remedy such
inefficiency, which suggested an evolutionary advantage for
sex. This argument was restated and explored analytically
by Crow & Kimura (1965), to which Maynard Smith (1968)
responded by pointing out that Muller's original argument
relied on the erroneous assumption that mutations were
unique events, such that each could occur only once. In a
counter-example, Maynard Smith demonstrated that models of
sexual and asexual systems yielded the same rate of adaptive
evolution when mutations were treated as recurrent events.
For a nice summary of this controversy and further
developments on this topic, see Felsenstein (1974, 1988).
Much recent work has focused on how fixation probabilities
are affected by variance in fitness at background loci and
the degree of linkage to these loci (Barton, 1993, 1995;
Keightley, 1991; Pamilo et al., 1987; Peck et al., 1997).
Barton (1994) derived the conditional probability of
fixation of a beneficial mutation given that a single
substitution occurs or that substitutions occur at a given
rate. He explored the dynamics of this probability under

varying degrees of recombination. We believe that the

models presented here may contribute to understanding the
evolution of sex by giving an explicit expression for the
unconditional probability of fixation of a beneficial
mutation, in the limit as recombination rate goes to zero.
Another part of a population's reproductive strategy,
namely its mutation rate, may also be affected by the clonal
interference phenomenon. Much work has been done to
determine whether and how natural selection may adjust
mutation rates. A high mutation rate may confer an
evolutionary advantage, for example, if it increases the
rate of substitution of beneficial alleles. This advantage,
however, must overcome the disadvantage of a parallel
increase in deleterious mutations. Leigh (1970)
demonstrated theoretically that elevated mutation rates can
evolve in asexual populations that experience oscillating
selection on some locus. Since then, much work has
supported the notion that evolutionary elevation of mutation
rates is at least possible, and perhaps likely, in changing
environments (Gillespie, 1981; Ishii et al., 1989). In
light of the developments presented in this paper, however,
it seems that the strength of selection to elevate mutation

rates (above some minimal value set by the physiological

cost of fidelity) may be smaller than the established theory
would indicate, especially when populations are large. As
we have shown, an increase in mutation rate hardly changes
the rate of adaptation of large populations because of
clonal interference (Figure 2.5). To gain an appreciable
increase in the rate of adaptation for a large population
would, therefore, require a disproportionate increase in
mutation rate. Such a large increase in mutation rate,
however, would undoubtedly have a detrimental effect due to
the greatly increased production of deleterious alleles.
Consequently, it seems reasonable to suggest that selection
for elevated mutation rates should be weak in large

populations.

Implications for the general nature of adaptive evolution

Three especially interesting consequences of the
results obtained here concern the general nature of adaptive
evolution in asexual populations. The first is that one
should expect the trajectory for mean fitness of any asexual
population to be punctuated with short bursts of rapid,

significant increase followed by long periods of stasis,

regardless of the size of the population or its mutation

rate. This result contradicts the intuitive, but erroneous,
view that discrete bouts of periodic selection (in which
individual mutations sweep to fixation) should overlap, thus
giving the appearance of continuity, when the mutation
recruitment rate, $\mu N$, is sufficiently high. A second
intriguing implication is that there exists a "speed limit"
on the rate of adaptive evolution in asexual populations.

As shown in Figure 2.5, the rate of improvement in a
population's mean fitness decelerates with increasing μ and
N. This result reflects intensified clonal interference as
well as the longer time required for selection to proceed to
fixation in large populations. A third important
consequence is closely related to the second: the rate of
adaptive evolution is clearly not always limited by mutation
rate. In fact, because of clonal interference, the rate of
adaptive evolution is only weakly dependent on mutation rate
and population size unless μN is small (μN < 0.1 for α = 35).

Evidence for transiently common beneficial mutations in

microbial populations

One of the intriguing consequences of asexuality is
that beneficial mutations may become quite common
temporarily but eventually go extinct as superior mutations
arise (Figures 2.8 & 2.9). In principle, it should be
possible to find evidence for this effect in natural
populations of asexual organisms. A complication arises,
however, in that a beneficial mutation may also become
transiently common, but then disappear, if the environment
changes so that the mutation is no longer favored.

For example, Holmes et al. (1992) followed the
molecular evolution of a population of the human
immunodeficiency virus (HIV) within a single infected
patient, and their data show several instances of
transiently common mutations. In particular, they monitored
changes in the RNA sequence encoding the third hypervariable
loop of gp120 (V3) throughout the asymptomatic phase of
infection (7 years) of a single hemophiliac patient. All 12
viral sequences that were obtained immediately after

infection were identical, and these were denoted as sequence


A. In year three, a set of related sequences, denoted C1
through C5, were numerically dominant (11 of 15), but in
year seven they and their descendants comprised only a small
fraction of the population (2 of 13). By contrast, sequence
E1 was present only as a small minority (1 of 15) after
three years, but after seven years it and its descendants
were numerically dominant (10 of 13). Within the E1 clade,
a subset of derived sequences, denoted E2 through E5, became
numerically dominant after five years (19 of 23). However,
neither they nor their descendants were represented by even
a single sequence in years six (0 of 15) and seven (0 of
13). Thus, it is evident that certain mutations became
transiently common, only to decline subsequently in
frequency. Moreover, the data show the "leapfrog" effect in
which the majority type at one point in time is not
descended from the majority type that immediately preceded
it. Holmes et al. (1992, p. 4838) recognized the importance
of these dynamics when they said that changes in the viral
population, "instead of being the sequential replacement of
one antigenically distinct variant by another, may involve a
complex interaction between the different, and competing,

evolutionary lineages present in the plasma."
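
The counts quoted above can be tabulated to make the transient rise and fall explicit. The following is a small illustrative sketch, not part of the original analysis: the dictionary layout and the names `counts` and `frequency` are assumptions introduced here, and only the counts quoted in the text from Holmes et al. (1992) are used.

```python
# Counts quoted in the text, stored as {clade: {year: (sequences in clade, sequences sampled)}}.
# The layout and names are illustrative assumptions.
counts = {
    "C1-C5": {3: (11, 15), 7: (2, 13)},
    "E1 clade": {3: (1, 15), 7: (10, 13)},
    "E2-E5": {5: (19, 23), 6: (0, 15), 7: (0, 13)},
}


def frequency(clade, year):
    """Fraction of sampled sequences belonging to the clade in the given year."""
    in_clade, sampled = counts[clade][year]
    return in_clade / sampled


if __name__ == "__main__":
    for clade, by_year in counts.items():
        for year in sorted(by_year):
            print(f"{clade:9s} year {year}: {frequency(clade, year):.2f}")
```

Read this way, C1-C5 and E2-E5 each hold a majority in one sample and a small minority, or nothing, in a later one.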


The one caveat to our interpretation of these data,
however, is that the host immune system responds to the
viruses, and so the HIV population is evolving in a changing
environment. Thus, for example, sequences E2 through E5 may
not have been driven extinct solely by an intrinsically
superior mutation, but instead they may have become
selectively disadvantaged after they were targeted by the
immune system. This scenario is supported by the fact that
the V3 loop is a principal target of the immune system. But
even with the added complication of the changing immune
environment, asexuality can have important dynamical
consequences for HIV and other pathogens. In particular,
the "leapfrog" effect necessarily increases the genetic
distance between successive majority types (Figures 2.7 &
2.8), and so it may actually facilitate a pathogen's evasion
of the host immune response.

An unambiguous demonstration of the "leapfrog" effect
will require data from an asexual organism living in a
constant environment. To that end, we are now using
molecular methods to determine the phylogeny of clones

sampled over time from an experimental population of E.

coli, as it evolved for thousands of generations in a


defined laboratory environment (Lenski et al., 1991; Lenski
& Travisano, 1994; Elena et al., 1996). If the "leapfrog"
phenomenon is important, then we expect to see a clade
become numerically dominant, only to be driven extinct by
the emergence of another, even more successful clade that is
derived from its ancestral base (rather than from the

formerly dominant clade).

A suggestion for further research

Clonal interference is not the only dynamic that
inhibits the progression of beneficial mutations to fixation
in an asexual population. A similar inhibition may be
caused by Muller's ratchet (Muller, 1964; Haigh, 1978), in
which deleterious mutations will tend to accumulate in small
asexual populations. As shown by Manning & Thompson (1984)
and by Peck (1994), the fate of a beneficial mutation is
determined as much by the selective disadvantage of any
deleterious mutations with which it is linked as by its own
selective advantage. In asexual organisms, the entire
genome in which a beneficial mutation occurs will remain

linked to that mutation and will hitchhike to fixation if


that is the fate of the mutation. Therefore, a beneficial
mutation that spreads to fixation presents a severe
population bottleneck in which only a single genome is
sampled, thus exacerbating the effect of Muller's ratchet.
Consequently, a beneficial mutation may only be considered
advantageous if its benefit more than compensates for the
drastic reduction in effective population size caused by its
fixation and the associated acceleration of Muller's
ratchet.

Haigh (1978) modeled the effect of deleterious
mutations on population fitness; Manning & Thompson (1984)
and Peck (1994) modeled the effect of deleterious mutations
on the fate of beneficial mutations; and the models
presented here provide a quantitative account of how
beneficial mutations affect one another. A logical next
step would be to integrate these models into a single
theoretical treatment of mutation in asexual populations.
Such a synthesis would be a valuable contribution toward a
general understanding of evolutionary dynamics in asexual

systems.

[Figure 2.1: plot of the probability of fixation against selection coefficient, s.]

Figure 2.1. The probability that a given beneficial
mutation with selection coefficient, s, achieves fixation.
Equation (3) with α = 35, μ = 2.0 x 10", and N = 3.3 x 10^7.

[Figure 2.2: two panels, each plotted against Log10 (beneficial mutation rate, μ) from -10 to -4, with one curve per population size, N.]

Figure 2.2. The probability of fixation of an arbitrarily
chosen beneficial mutation is a decreasing function of both
beneficial mutation rate, μ, and population size, N
(Equation (4)). The exponential parameter, α, appears only
to shift the curves down vertically without changing their
shape.

[Figure 2.3: two panels, each plotted against Log10 (Beneficial Mutation Rate, μ) from -10 to -4, with curves for N = 10^5 and N = 10^9.]

Figure 2.3. The substitution rate of a population is an
increasing function of its beneficial mutation rate
(Equation (5)). When the population size is large, however,
a large change in beneficial mutation rate hardly affects
the substitution rate.

[Figure 2.4: vertical axis: Expected Selection Coefficient, <s(α,μ,N)>.]

Figure 2.4. The expected selection coefficient of
substitutions, <s(α,μ,N)>, is an increasing function of
population size, N; Equation (6) with α = 35, and μ = 2.0 x
104. The solid line indicates the expected value; the
dashed lines indicate numerically determined 95% predictive
confidence limits.

[Figure 2.5: vertical axis: Log10 dw/dt.]

Figure 2.5. The rate of fitness improvement, dw/dt, is an
increasing function of both population size, N, and
beneficial mutation rate, μ. Equation (7) with α = 35.
This rate of increase decelerates substantially, however,
due to increased clonal interference when μN > 0.1 (i.e.,
when more than one beneficial mutation is produced on
average every ten generations).

[Figure 2.6: probability of transiently achieving polymorphic frequency (vertical axis, 0 to 0.03) against Log10 N (horizontal axis, 4 to 10), with one curve per beneficial mutation rate, μ.]

Figure 2.6. The probability that an arbitrarily chosen
beneficial mutation transiently achieves polymorphic
frequency is plotted against log population size for various
beneficial mutation rates. Equation (12) with α = 35. For
a given beneficial mutation rate, there is an intermediate
population size at which the probability of achieving
polymorphic frequency is a maximum.

[Figure 2.7 diagram, with time running from left to right. Sequential substitution: ab → Ab → AB. Leapfrog: ab gives rise to both Ab and aB, and then aB → AB.]

Figure 2.7. The leapfrog phenomenon illustrated
phylogenetically. The phylogeny of majority genotypes is
compared with that of sequential substitution.

[Figure 2.8: abundance of genotypes ab, Ab, and aB (vertical axis, log scale from 1.E+00 to 1.E+08) against time (generations, 0 to 1000), with sampling times t1, t2, and t3 marked.]

Figure 2.8. The leapfrog phenomenon illustrated
dynamically. Genotype ab is displaced by mutant Ab which is
later displaced by alternative mutant aB. Equations (25)
and (26) with s_1 = 0.09, s_2 = 0.13. Note that genotypes
sampled at time t3 are more closely related to those sampled
at t1 than to those sampled at t2.

[Figure 2.9: panel (a) frequency (log scale, 1.E-08 to 1.E+00) and panel (b) mean relative fitness (1.05 to 1.2), both against time (generations, 0 to 450).]

Figure 2.9. (a) A simulation of competition among numerous
beneficial mutations. The heavy lines represent genotypes
that achieve majority status. Note that a leapfrog event
has occurred in this particular simulation. Selection
coefficients were drawn at random from an exponential
distribution. Parameters used are α = 35, μ = 2.0 x 104, N
= 3.3 x 10^7. (b) The mean fitness trajectory of the
population simulated in panel (a). Note that the leapfrog
phenomenon gives the appearance of two distinct periodic
selection events.

[Figure 2.10: Log10 Pr{fix} plotted over a horizontal range of -10 to -4.]

Figure 2.10. The probability of fixation of an arbitrarily
chosen beneficial mutation is plotted against the beneficial
mutation rate, μ, for various population sizes. The solid
lines indicate probabilities assuming an exponential
distribution of beneficial mutational effects (Eq. (4) with
α = 35, as estimated from the E. coli populations). The
dashed lines indicate probabilities which assume a
rectangular distribution of beneficial mutational effects
(Eq. (33) with s_max = 0.12, as estimated from the E. coli
populations). The discrepancies resulting from the
different distributional assumptions are small.

Chapter 3

THE ORDER OF FIXATION OF BENEFICIAL MUTATIONS
IN ASEXUAL POPULATIONS1

Abstract
In a constant environment, an organism's opportunities for further

improvement are slowly depleted. To model adaptive evolution in such an
environment, one should consider the number of available beneficial
mutations to be finite. Given a finite set of beneficial mutations, the
course of adaptive evolution is determined by both the timing and the
ordering of fixations. In large, asexual populations, the ordering of
fixations is affected by competition among beneficial mutations. Taking
into account such competition, we derive the probability of any
specified fixation ordering. The trajectories of fitness expectation
and variance among replicate populations are then derived, given (i) a
particular set of beneficial mutations, or given (ii) only a common

distribution of mutational effects. Using fitness data from evolving E.
coli populations, we solve the inverse problem to determine (i) the

number of available beneficial mutations, (ii) the distribution of

beneficial mutational effects, and (iii) the beneficial mutation rate.

 

1

This chapter is written in the format of a paper. The “we” in this
chapter refers to P.J. Gerrish, R.E. Lenski, and V. Mandrekar.


Introduction

In a continually changing environment, an organism's
potential adaptations are never depleted. To model the
adaptation of an organism in a changing environment,
therefore, it is reasonable to assume the number of
available beneficial mutations to be infinite (Gerrish &
Lenski, 1998). When an organism's environment is constant,
however, then the number of possible genetic improvements is
slowly depleted as the organism adapts. To model the
adaptation of an organism in a constant environment,
therefore, the number of available beneficial mutations must
be considered finite.

Given that a finite set of beneficial mutations is
available to an organism, all available beneficial mutations
are eventually fixed in the population. Whenever one of
these beneficial mutations is fixed, the population fitness
increases. Because each beneficial mutation is unique, each
confers a different increase in population fitness. Thus,

the adaptive dynamics of a population are determined as much


by the order in which the available beneficial mutations are
fixed as by the timing of the fixations.

If there were only two different mutations that would
increase the fitness of a particular asexual organism in
some environment, and if one of these beneficial mutations
occurred in one individual, while the other occurred in
another individual, then the progeny of the two individuals
would compete. The lineage carrying the superior mutation
would competitively exclude both the wildtype and the
inferior beneficial mutation (Muller, 1932; Muller, 1964;
Pamilo et al., 1987). Eventually, the inferior mutation
would appear on the background of the superior mutation and
the double mutant would be fixed.

Let there be two beneficial mutations, A and B, and let
a and b denote the corresponding wildtype alleles. Let A be
the fitter of the two alleles, i.e., $w_A > w_B$. If genotypes
Ab and aB are both present in the population, then aB will
be excluded, and the order of fixation of the genotypes will
be ab → Ab → AB. In small populations, the time interval
between occurrence of beneficial mutations is typically long

enough to ensure that both alternative beneficial mutations

are not simultaneously present in the population. Thus, the


two possible fixation orderings of the beneficial mutations
(ab → aB → AB, and ab → Ab → AB) are approximately
equiprobable in small populations, their probabilities
depending primarily on the relative rates of appearance and
survival of A and B. In large populations, however, the
time interval between occurrence of beneficial mutations is
short, resulting in a high probability that alternative
beneficial mutations are simultaneously present in the
population. The simultaneous presence of the two
alternative beneficial mutations promotes the ordering, ab →
Ab → AB. In general, large populations (and/or high
mutation rates) make probable the simultaneous presence of
alternative beneficial mutations, which in turn promotes the
ordering of fixations from largest to smallest effect
mutations.

In what follows, we derive the probabilities of
different fixation orderings. Because each ordering of
fixations corresponds to a particular fitness trajectory,
each trajectory is therefore assigned a probability, from
which the aggregate expected trajectory may be computed. We
first treat the case in which a set of beneficial mutations

is given, such that the set of selection coefficients is


known. Then we demonstrate how the number of unknown
parameters may be reduced either by extracting the most
probable set of selection coefficients from a parent
distribution or by taking the limit as selection
coefficients converge. Finally, we treat the fully
probabilistic case in which, instead of assuming that
different populations fix the same set of beneficial
mutations, we assume only a distribution for beneficial
mutational effects. In the discussion section, we employ
these developments to estimate parameters of the model for
evolving populations of Escherichia coli by regressing their
fitness trajectories; parameters to be estimated are: (i)
the number of available beneficial mutations, (ii) the
distribution of beneficial mutational effects, and (iii) the

beneficial mutation rate.

Probabilities of fixation orderings

Order of fixation of two beneficial mutations

Let subscripts S and I refer respectively to the

superior and inferior of two available beneficial mutations.


Suppose the inferior mutation appears and survives drift
before the superior mutation does so. Then the inferior
mutation will be fixed before the superior mutation only if
the superior mutation does not appear and survive drift
before the inferior mutation is fixed. Let $t_I$ denote the
time of appearance of an inferior mutation which
subsequently survives drift. Then the total time before
fixation of the inferior mutation is $t_I + \frac{\ln N}{s_I}$ (see Appendix
3.1), where $s_I$ is the selective advantage of the inferior
mutation over the wildtype, and $N$ is population number
(assumed constant). Thus, the expected number of times the
superior mutation occurs and survives drift before fixation
of the inferior mutation is $r_S \left(t_I + \frac{\ln N}{s_I}\right)$, where $r_S = \mu_S \pi(s_S) N$
is the rate at which the superior mutation appears and
subsequently survives drift; $\mu_S$ is the rate of mutation to
the superior mutation, $s_S$ is the selective advantage of the
superior mutation over the wildtype, and $\pi(u)$ is a linear
function of $u$ giving the probability that a beneficial
mutation with selective advantage $u$ survives the effects of
drift (Haldane, 1927; Otto & Whitlock, 1997; Gerrish &
Lenski, 1998). Given $t_I$, the probability that inferior
mutation I is fixed before superior mutation S appears and
survives drift is the zero-class of the Poisson
distribution, $\exp\left\{-r_S \left(t_I + \frac{\ln N}{s_I}\right)\right\}$. Let Pr{I;S} denote the
probability that I is fixed before S. (This is
unconventional notation and should not be confused with a
conventional notation in which S would denote a condition.)
If the probability density function (p.d.f.) of $t_I$ is
denoted by $g(t_I)$, then the probability that I is fixed
before S is given by:

$$\Pr\{I;S\} = \int_0^\infty \exp\left\{-r_S \left(t_I + \frac{\ln N}{s_I}\right)\right\} g(t_I)\, dt_I . \qquad (14)$$

The rate at which the inferior mutation appears and
subsequently survives drift is $r_I = \mu_I \pi(s_I) N$. This rate is
time independent. Therefore, the waiting time p.d.f. for
such an event is the homogeneous exponential:
$g(t_I) = r_I e^{-r_I t_I}$. Upon integration, equation (14) becomes

$$\Pr\{I;S\} = \left[1 + \frac{\mu_S \pi(s_S)}{\mu_I \pi(s_I)}\right]^{-1} \exp\left\{-\mu_S \pi(s_S) \frac{N \ln N}{s_I}\right\} . \qquad (15)$$

For later convenience, let $\mu$ denote overall beneficial
mutation rate, such that $\mu = \mu_I + \mu_S$. In some of the
subsequent figures, we assume individual mutation rates to
be equal, implying in this case $\mu_I = \mu_S = \mu/2$.
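
Equations (14) and (15) can be checked against each other numerically. The sketch below is illustrative only: the function names, the parameter values, and the choice K = 4 for a linear drift-survival function π(s) = Ks are assumptions made here, not values taken from the text. It compares the closed form (15) with a midpoint-rule integration of (14).

```python
import math

K = 4.0  # assumed slope of the drift-survival function pi(s) = K * s


def rates(mu_I, mu_S, s_I, s_S, N):
    """Rates at which each mutation appears and survives drift: r = mu * pi(s) * N."""
    return mu_I * K * s_I * N, mu_S * K * s_S * N


def pr_inferior_first_closed(mu_I, mu_S, s_I, s_S, N):
    """Closed form of equation (15)."""
    r_I, r_S = rates(mu_I, mu_S, s_I, s_S, N)
    return math.exp(-r_S * math.log(N) / s_I) / (1.0 + r_S / r_I)


def pr_inferior_first_numeric(mu_I, mu_S, s_I, s_S, N, steps=200_000):
    """Midpoint-rule integration of equation (14) with g(t) = r_I * exp(-r_I * t)."""
    r_I, r_S = rates(mu_I, mu_S, s_I, s_S, N)
    fix_time = math.log(N) / s_I   # time for the inferior mutation to fix
    t_max = 40.0 / r_I             # exp(-40) makes the truncated tail negligible
    h = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.exp(-r_S * (t + fix_time)) * r_I * math.exp(-r_I * t)
    return total * h


if __name__ == "__main__":
    args = (1e-9, 1e-9, 0.05, 0.10, 1e6)  # hypothetical parameter values
    print(pr_inferior_first_closed(*args), pr_inferior_first_numeric(*args))
```

The two routes should agree to within the quadrature error, which gives a quick check on any transcription of (15).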

We now derive the probability that S is fixed before I.
The superior mutation will be fixed first only if its
appearance and subsequent survival precede the fixation of
the inferior mutation. Let $t_S$ denote the time of occurrence
of a superior mutation which subsequently survives drift.
The inferior mutation cannot be fixed in the interval (0,
$t_S$) if the superior mutation is to precede. The time
required for fixation of the inferior mutation is $\frac{\ln N}{s_I}$.
Therefore, the superior mutation will be fixed first only if
an inferior mutation does not appear and survive drift
before time $\left(t_S - \frac{\ln N}{s_I}\right)^+$, where the notation $(u)^+$ is defined
here as $\max(u, 0)$. The expected number of inferior mutations
appearing in this interval is $r_I \left(t_S - \frac{\ln N}{s_I}\right)^+$. Therefore, the
unconditional probability that superior mutation S is fixed
before inferior I is given by

$$\Pr\{S;I\} = \int_0^\infty \exp\left\{-r_I \left(t_S - \frac{\ln N}{s_I}\right)^+\right\} g(t_S)\, dt_S , \qquad (16)$$

where $g(t_S) = r_S e^{-r_S t_S}$, and $r_S = \mu_S \pi(s_S) N$.

As expected, the probability that the superior mutation
is fixed first is simply the complement of the probability
that the inferior is fixed first: Pr{S;I} = 1 − Pr{I;S}.
Note that $\lim_{N \to \infty} \Pr\{S;I\} = 1$, confirming our claim that large
population number promotes the ordering of fixation from
largest to smallest effect mutations (Figure 3.1).
Moreover, $\lim_{N \to 1} \Pr\{S;I\} = \frac{\mu_S \pi(s_S)}{\mu_I \pi(s_I) + \mu_S \pi(s_S)}$, confirming our

 

claim that the ordering of fixations in small populations is
determined primarily by the relative rates of appearance and
survival of beneficial mutations.
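
The complement relation and the small-population limit can be verified numerically. The following is a minimal sketch under stated assumptions: the function names and the rate values are hypothetical, chosen only for illustration. It integrates equation (16) by the midpoint rule and compares the result with one minus the closed form for Pr{I;S}.

```python
import math


def pr_superior_first_numeric(r_I, r_S, fix_time_I, steps=200_000):
    """Midpoint-rule integration of equation (16).

    r_I, r_S   : rates at which each mutation appears and survives drift
    fix_time_I : ln(N) / s_I, the time the inferior mutation needs to fix
    """
    t_max = 40.0 / r_S  # exp(-40) makes the truncated tail negligible
    h = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        exposure = max(t - fix_time_I, 0.0)  # the (u)^+ operation
        total += math.exp(-r_I * exposure) * r_S * math.exp(-r_S * t)
    return total * h


def pr_superior_first_complement(r_I, r_S, fix_time_I):
    """1 - Pr{I;S}, with Pr{I;S} written in terms of the rates r_I and r_S."""
    return 1.0 - r_I / (r_I + r_S) * math.exp(-r_S * fix_time_I)


if __name__ == "__main__":
    r_I, r_S = 2e-4, 4e-4             # hypothetical rates
    fix_time = math.log(1e6) / 0.05   # hypothetical N and s_I
    print(pr_superior_first_numeric(r_I, r_S, fix_time),
          pr_superior_first_complement(r_I, r_S, fix_time))
```

Setting `fix_time_I` to zero (ln N = 0) reduces the complement to r_S / (r_I + r_S), the ratio of appearance-and-survival rates.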

One limit of special interest to the empiricist is the
limit in which the two selection coefficients converge.

This limit is of interest because it is often the case that


little is known about the selection coefficients other than

the relation $s_S > s_I$. And it is conveniently the case that
selection coefficients cancel out in this limit, rendering
the probability independent of either selection coefficient.
The limit as selection coefficients converge yields a
maximum value for Pr{I;S}. More precisely, define $\varepsilon$ as the
positive constant which satisfies $s_S = s_I + \varepsilon$. Then, given
only that $s_S > s_I$, the maximum probability, Pr{I;S}, is
found by taking the limit of (15) as $\varepsilon \to 0$.

A second limit of interest is that in which the
mutation rate ratio, $\mu_I/\mu_S$, gets large. This limit is of
interest because it again yields a maximum probability,
Pr{I;S}, this time eliminating dependence on $\mu_I$. In the
cases where selection coefficients as well as the mutation-
rate ratio are unknown (see HIV example in Discussion), it
may be convenient to examine the double limit,

$$\sup\{\Pr\{I;S\} \mid \mu_S, N\} = \lim_{\substack{\varepsilon \to 0 \\ \mu_I/\mu_S \to \infty}} \Pr\{I;S\} = e^{-\mu_S K N \ln N} , \qquad (17)$$

where K is the linear constant in the drift survival
function, and we assume the intercept of this function to
equal zero, i.e., $\pi(s) = Ks$. (This assumption is reasonable
when N is large, because $\pi(0) = 1/N \approx 0$; see Crow & Kimura,
1970.) Note that (17) does not depend on either selection
coefficient and gives a maximum value for all cases in which
$s_S > s_I$ and $\mu_I$ is unknown. A useful inequality may be
derived by noting that the probability given by (17) is
greater than 0.05 when

$$\mu_S K N \ln N < 3 . \qquad (18)$$

The utility of this inequality is demonstrated in the

discussion section.
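
The bound (17) and the rule of thumb (18) are simple to evaluate. The sketch below is illustrative only: the function names, the default K = 4, and the parameter values are assumptions introduced here.

```python
import math


def sup_pr_inferior_first(mu_S, N, K=4.0):
    """Upper bound of equation (17): exp(-mu_S * K * N * ln N). K = 4 is an assumed default."""
    return math.exp(-mu_S * K * N * math.log(N))


def satisfies_inequality_18(mu_S, N, K=4.0):
    """Inequality (18): mu_S * K * N * ln N < 3."""
    return mu_S * K * N * math.log(N) < 3.0


if __name__ == "__main__":
    for N in (1e6, 1e9):  # hypothetical population sizes
        print(N, sup_pr_inferior_first(1e-9, N), satisfies_inequality_18(1e-9, N))
```

The threshold 3 is a rounding of ln 20 ≈ 2.996, the value at which exp(-x) equals 0.05.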

Probability that a superior mutation is the first of many

In the case of many available beneficial mutations, the
probability that the superior mutation will be fixed first
is an extension of the previous developments. Let the set of
$M$ available beneficial mutations be ordered from largest to
smallest effect such that $s_1 > s_2 > s_3 > \ldots > s_M$. We
assume that these mutational effects are additive, so that
the rank-order of selection coefficients does not depend on
which mutations may have been fixed previously in the
population. The probability that the superior mutation
(mutation 1) is the first to be fixed is the probability
that none of the $M-1$ inferior mutations are fixed first:

$$\Pr\{1;2,3,\ldots,M\} = \int_0^\infty \exp\left\{-\sum_{i=2}^{M} r_i \left(t_1 - \frac{\ln N}{s_i}\right)^+\right\} g(t_1)\, dt_1 , \qquad (19)$$

where $g(t_1) = r_1 e^{-r_1 t_1}$, and $r_i = \mu_i \pi(s_i) N$.

Probability that fixation ordering is from largest

to smallest s

Fixations appear in order from largest to smallest s
when the largest of the $M$ available beneficial mutations is
fixed first, and the largest of the $M-1$ remaining mutations
is fixed next, etc. Given the ordered set of available

beneficial mutations as described above and extending this

logic, the probability that the mutations will be fixed in

order from largest to smallest selective advantage is,
intuitively, Pr{1;2,...,M} × Pr{2;3,...,M} × ... × Pr{M−1;M}, or:

$$\prod_{i=1}^{M-1} \int_0^\infty \exp\left\{-\sum_{j=i+1}^{M} r_j \left(t_i - \frac{\ln N}{s_j}\right)^+\right\} g(t_i)\, dt_i , \qquad (20)$$

where $g(t_i) = r_i e^{-r_i t_i}$, and $r_i = \mu_i \pi(s_i) N$. Above a certain
population number or beneficial mutation rate, this
probability of rank-ordered fixations is equal to one
(Figure 3.2).
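
The product in equation (20) can be evaluated by applying an integral of the form of equation (19) to each successively smaller set of remaining mutations. The sketch below is one way to do this numerically; the function names, the midpoint quadrature, the value K = 4 for π(s) = Ks, and the parameter values are all assumptions introduced for illustration.

```python
import math

K = 4.0  # assumed slope of the drift-survival function pi(s) = K * s


def pr_first_of_remaining(r, fix, steps=50_000):
    """Integral of the form of equation (19): the first listed mutation is fixed before the rest.

    r   : appearance-and-survival rates, best remaining mutation first
    fix : fixation times ln(N) / s, in the same order
    """
    t_max = 40.0 / r[0]  # exp(-40) makes the truncated tail negligible
    h = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        hazard = sum(r_j * max(t - f_j, 0.0) for r_j, f_j in zip(r[1:], fix[1:]))
        total += math.exp(-hazard) * r[0] * math.exp(-r[0] * t)
    return total * h


def pr_rank_ordered(mu, s, N):
    """Product in equation (20); s must be sorted from largest to smallest."""
    r = [m * K * s_i * N for m, s_i in zip(mu, s)]
    fix = [math.log(N) / s_i for s_i in s]
    prob = 1.0
    for i in range(len(s) - 1):
        prob *= pr_first_of_remaining(r[i:], fix[i:])
    return prob


if __name__ == "__main__":
    mu = [1e-9, 1e-9, 1e-9]   # hypothetical mutation rates
    s = [0.10, 0.07, 0.05]    # hypothetical selection coefficients
    for N in (1e5, 1e8):
        print(N, pr_rank_ordered(mu, s, N))
```

With these illustrative values the probability of rank-ordered fixation is well below one half at the smaller population size and very close to one at the larger.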

General solution for any ordering of fixations

The previous developments may be generalized to derive
an expression for the probability of any specified ordering
of fixations. Let $\zeta\{\cdot\}$ denote a permuting function. A
permuting function specifies an ordering. For example,
$\zeta\{3\} = 1$ would imply that the third mutation to be fixed is
the mutation of largest selective advantage. Put
differently, the function $\zeta\{\cdot\}$ maps place in the fixation
ordering onto rank by selective advantage. (Note, there are
$M!$ such functions, corresponding to all possible fixation
orderings.) Define $I_j$ to be the set of all beneficial

mutations both inferior to and subsequent to the jth
mutation fixed, given the fixation ordering, $\zeta$. Define $S_j$
to be the set of all beneficial mutations both superior to
and subsequent to the jth mutation fixed. Then, the
probability of any specified fixation ordering, $\zeta$, is given
by:

$$\phi(\zeta) = \prod_{i=1}^{M} \int_0^\infty \exp\left\{-\sum_{j \in I_i} r_j \left(t_i - \frac{\ln N}{s_j}\right)^+ - \sum_{j \in S_i} r_j \left(t_i + \frac{\ln N}{s_{\zeta\{i\}}}\right)\right\} g(t_i)\, dt_i , \qquad (21)$$

where $g(t_i) = r_{\zeta\{i\}} e^{-r_{\zeta\{i\}} t_i}$. Note that

$$\lim_{N \to \infty} \phi(\zeta) = \begin{cases} 1, & S_i = \emptyset \text{ for all } i \\ 0, & S_i \neq \emptyset \text{ for some } i, \end{cases}$$

again confirming our claim that
large populations promote rank-ordered fixations. Moreover,

$$\lim_{N \to 1} \phi(\zeta) = \prod_{i=1}^{M} \mu_{\zeta\{i\}} \pi(s_{\zeta\{i\}}) \left[\sum_{j=i}^{M} \mu_{\zeta\{j\}} \pi(s_{\zeta\{j\}})\right]^{-1} ,$$

again confirming our
claim that fixation ordering in small populations is
determined primarily by the relative rates of appearance and
survival of beneficial mutations. Indeed, if all such
rates, $\mu_i \pi(s_i)$, are equal, then this probability reduces to

$$\lim_{N \to 1} \phi(\zeta) = \prod_{j=0}^{M-1} (M - j)^{-1} = \frac{1}{M!} ,$$

which agrees with intuition.

The analytical integration of (21) is given in Appendix 3.2.
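
The small-population limit is easy to compute directly: each fixation is a draw from the mutations not yet fixed, weighted by their rates of appearance and survival. The sketch below is a minimal illustration; the function name, the mutation labels, and the rate values are hypothetical.

```python
from itertools import permutations


def small_population_limit(order, rate):
    """Limit of the ordering probability when fixation times are negligible (ln N -> 0).

    order : tuple of mutation labels in the order they are fixed
    rate  : dict mapping each label to its rate mu * pi(s)
    """
    prob = 1.0
    for i, label in enumerate(order):
        remaining = sum(rate[m] for m in order[i:])  # mutations not yet fixed
        prob *= rate[label] / remaining
    return prob


if __name__ == "__main__":
    equal = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
    print(small_population_limit(("A", "B", "C", "D"), equal))  # equals 1/4!
    uneven = {"A": 3.0, "B": 2.0, "C": 1.0}
    print(sum(small_population_limit(o, uneven) for o in permutations(uneven)))
```

With equal rates every ordering has probability 1/M!, and for any set of rates the probabilities of all M! orderings sum to one.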

Fitness trajectories I: parallel populations share a common

set of available beneficial mutations

How repeatable is evolution? Albeit on a miniature
scale, evolutionary biologists are beginning to address this
question experimentally by studying parallel microbial
populations evolving in identical environments (e.g., Lenski
& Travisano, 1994). We address this question theoretically
for the case of adaptive evolution of replicate asexual
populations. In the previous section, we derived the
probability of any specified ordering of fixations. In this
section, we derive the fitness trajectories that correspond
to the various fixation orderings. From this, we compute
trajectories of fitness expectation and variance among
populations.

In identical environments, it is reasonable to assume

that genetically identical clones will have the same set of


potential adaptive improvements. Hence, in this section, we
assume that parallel populations share a common set of

available beneficial mutations.

Expectation and variance

Given some fixation ordering, $\zeta$, the time until
fixation of the ith beneficial mutation is $T_i = \sum_{j=1}^{i} \left(t_j + \frac{\ln N}{s_{\zeta\{j\}}}\right)$.
The time $t_j$ is a waiting time between the (j−1)th fixation and
the appearance and survival of the jth mutation; this time
plus the time required for fixation of the jth mutation,
$\frac{\ln N}{s_{\zeta\{j\}}}$, equals the total time between the (j−1)th and jth
fixations. The $t_j$ depend on fixation ordering, $\zeta$. If, for
example, the (j+1)th mutation to be fixed is superior to the
jth mutation, given fixation ordering $\zeta$, then the expected
waiting time until occurrence and survival of the jth
mutation is shortened because of the condition that it must
be fixed before the appearance and survival of the superior
(j+1)th mutation. Mathematically, given that the jth mutation
appears and subsequently survives drift at time $t_j$, the
probability that the (j+1)th mutation appears after time
$t_j + \frac{\ln N}{s_{\zeta\{j\}}}$ is $\exp\left\{-r_{\zeta\{j+1\}} \left(t_j + \frac{\ln N}{s_{\zeta\{j\}}}\right)\right\}$. The condition that a
subsequent fixation, say the (j+2)th fixation, fixes an
inferior mutation will shorten the expected waiting time
slightly, because the waiting time for the jth mutation must
not be so long as to allow the prior fixation of the
inferior (j+2)th mutation. Mathematically, the (j+2)th mutation
appears and survives drift after time $\left(t_j - \frac{\ln N}{s_{\zeta\{j+2\}}}\right)^+$ with
probability $\exp\left\{-r_{\zeta\{j+2\}} \left(t_j - \frac{\ln N}{s_{\zeta\{j+2\}}}\right)^+\right\}$. To generalize, let $S_j$
denote the set of all beneficial mutations both superior to
and subsequent to the jth mutation fixed, and let $I_j$ denote
the set of all beneficial mutations both inferior to and
subsequent to the jth mutation fixed. Then, given fixation
ordering $\zeta$, time $t_j$ has expectation

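$$E(t_j \mid \zeta) = \int_0^\infty t\, g(t) \exp\left\{-\sum_{k \in S_j} r_k \left(t + \frac{\ln N}{s_{\zeta\{j\}}}\right) - \sum_{k \in I_j} r_k \left(t - \frac{\ln N}{s_k}\right)^+\right\} dt , \qquad (22)$$

where $g(t) = r_{\zeta\{j\}} e^{-r_{\zeta\{j\}} t}$. Thus, the accumulated expected time
until fixation of the ith mutation is

$$E(T_i \mid \zeta) = \sum_{j=1}^{i} \left[E(t_j \mid \zeta) + \frac{\ln N}{s_{\zeta\{j\}}}\right] .$$

Population fitness after the ith
fixation is defined as $w_i = 1 + \sum_{j=1}^{i} s_{\zeta\{j\}}$. Therefore, the
function describing the trajectory of population fitness
over time is

$$w(t \mid \zeta) = w_i , \qquad (23)$$

where $i = \{k : t \in [E(T_k \mid \zeta), E(T_{k+1} \mid \zeta))\}$.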

Given the set of beneficial mutations available to an
organism, one can determine the expected fitness of the
population at time t by weighting each of the possible
fitnesses at that time by its probability and summing over
all possible trajectories. The set of possible trajectories
has a one-to-one correspondence with the set of fixation
orderings. To compute expected fitness at time t, one must
multiply the population fitness (given a particular fixation
ordering) by the probability of that fixation ordering and

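sum over all possible fixation orderings:

$$E(w \mid t) = \sum_{\zeta} w(t \mid \zeta)\, \phi(\zeta) , \qquad (24)$$

where $\phi(\zeta)$ and $w(t \mid \zeta)$ are given by (21) and (23),
respectively. Likewise, fitness variance is computed as

$$\mathrm{Var}(w \mid t) = \left[\sum_{\zeta} w(t \mid \zeta)^2\, \phi(\zeta)\right] - E(w \mid t)^2 . \qquad (25)$$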

We emphasize that this equation gives the trajectory of
fitness variance among replicate populations, and is
therefore an indicator of the repeatability of the fitness
trajectory (Lenski et al., 1991; Johnson et al., 1995).
This quantity is quite distinct from the within-population,
among-individual fitness variance, which is the basis of
Fisher's Fundamental Theorem (Fisher, 1930).
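
The weighting over orderings in equations (23) through (25) can be sketched in a few lines. In the illustration below, the ordering probabilities, expected fixation times, and selection coefficients are hypothetical numbers chosen for the example, not output of equations (21) and (22); the function names are likewise assumptions.

```python
from bisect import bisect_right


def fitness_at(t, fix_times, s_in_order):
    """Step function of equation (23): 1 plus the effects of all fixations completed by time t."""
    n_fixed = bisect_right(fix_times, t)  # number of expected fixation times <= t
    return 1.0 + sum(s_in_order[:n_fixed])


def mean_and_variance(t, orderings):
    """Equations (24) and (25): mean and among-population variance of fitness at time t.

    orderings : list of (probability, expected fixation times, selection coefficients
                in fixation order), one entry per fixation ordering
    """
    mean = sum(p * fitness_at(t, T, s) for p, T, s in orderings)
    second_moment = sum(p * fitness_at(t, T, s) ** 2 for p, T, s in orderings)
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    # Two mutations, two possible orderings; all values are hypothetical.
    orderings = [
        (0.7, [300.0, 700.0], [0.10, 0.05]),  # larger-effect mutation fixed first
        (0.3, [500.0, 800.0], [0.05, 0.10]),  # smaller-effect mutation fixed first
    ]
    for t in (0.0, 400.0, 1000.0):
        print(t, mean_and_variance(t, orderings))
```

In this toy case the variance is zero before any fixation and again once both mutations are fixed in every ordering; it is positive only while the orderings disagree about which mutations have been fixed.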

Figure 3.3 displays the expected fitness trajectories
for populations of various sizes. Plotted in the panel
below each expected fitness trajectory is the corresponding
trajectory for the standard deviation of fitness. This
analysis shows the dramatic reduction in fitness variance as
population size increases, which results from the increased
predictability of the order of fixation of the available

beneficial mutations.


Reducing the number of parameters

Selection coefficients of beneficial mutations are
rarely known. Consequently, the utility of the previous
developments may be enhanced by eliminating the need to
specify a particular set of selection coefficients. This
may be achieved in one of two ways: (i) the most probable
set of selection coefficients may be extracted from some
parent distribution, thereby requiring only the parameters
of the distribution, or (ii) the limit may be examined in
which all selection coefficients converge, which
conveniently eliminates dependence on selection coefficients
altogether.

Here, we assume a set of selection coefficients based
on some parent distribution. To proceed analytically, we
have chosen the set of expectations of the $M$ order-statistics
(Feller, 1968). Given an exponential parent
distribution, as we have assumed for selection coefficients,
the ith order-statistic has density

$$g(s_{(i)}) = M \binom{M-1}{i-1} \alpha \left(1 - e^{-\alpha s_{(i)}}\right)^{M-i} e^{-\alpha i s_{(i)}} . \qquad (26)$$

From this density, we have derived the expectation:

$$E(s_{(i)}) = \frac{M}{\alpha} \binom{M-1}{i-1} \sum_{k=0}^{M-i} \binom{M-i}{k} \frac{(-1)^k}{(i+k)^2} . \qquad (27)$$

This gives a set of $M$ selection coefficients. Since the set
is based solely on the exponential parameter, $\alpha$, the number
of parameters of our model is reduced. Thus, given only the
population number, $N$, the number of available beneficial
mutations, $M$, the beneficial mutation rate, $\mu$, and an
exponential parameter, $\alpha$, it is now possible to compute the
expected fitness trajectory as described in the previous
section. Only $N$, however, is normally known. In the
discussion section, we solve the inverse problem of
estimating $M$, $\mu$, and $\alpha$, given fitness data of evolving
populations.
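As a hedged illustration of how such a set of M selection coefficients might be generated, the sketch below uses the standard result that the expected ith largest order-statistic of an exponential parent with parameter α is (1/α) times the sum of 1/j for j = i, ..., M. A second function integrates the order-statistic density term by term after a binomial expansion, as a cross-check. The function names are ours and are hypothetical.

```python
from math import comb


def expected_order_statistics(M, alpha):
    """Expected order-statistics (largest first) of M draws from an
    exponential parent distribution with parameter alpha."""
    # E[S_[i]] = (1/alpha) * sum_{j=i}^{M} 1/j, ranked from largest to smallest.
    return [sum(1.0 / j for j in range(i, M + 1)) / alpha
            for i in range(1, M + 1)]


def expected_order_statistic_by_integration(i, M, alpha):
    """Same expectation, from integrating s against the order-statistic
    density after expanding (1 - exp(-alpha*s))**(M - i) binomially."""
    total = sum((-1) ** k * comb(M - i, k) / (i + k) ** 2
                for k in range(M - i + 1))
    return M * comb(M - 1, i - 1) * total / alpha
```

For example, `expected_order_statistics(8, 20.0)` returns eight coefficients in decreasing order, and each agrees with the integrated form to within rounding error.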

A disadvantage of the above approach is that it
requires an assumption about the shape of the distribution
of mutational effects. Now, we eliminate the need for such
an assumption by examining the limit as selection
coefficients converge. This approach may serve as an alternate
means of reducing the number of parameters of the model.

The probability of any specified fixation ordering in the
limit as selection coefficients converge is derived in
Appendix 3.2. Conveniently, this limit is independent of
any selection coefficient but depends simply on the ordering
of fixations of beneficial mutations. This limit gives a
maximum probability for all fixation orderings in which
beneficial mutations are not fixed in order from largest to
smallest effect; it gives a minimum probability for the case
in which fixations do appear in this order. Also derived in
Appendix 3.2 is the analytical integration of (22), which
gives E(B|(). From this derivation, the limit as selection
coefficients converge is determined. This limit is
dependent on the average selection coefficient of beneficial
mutations. Therefore, given this average value, the
trajectory of fitness expectation and variance among
replicate populations may be computed as previously
described.

Fitness trajectories II: parallel populations share a common
distribution of beneficial mutational effects

In previous sections, we assumed that different
populations evolving in identical environments would fix the
same set of beneficial mutations. In this section, we
assume instead that every set of beneficial mutations fixed
by such parallel populations is extracted from a common
parent distribution. Such an assumption might be
appropriate, for example, to describe the within-host
evolution of a parasite. The parasite may invade different
organs of the host, giving rise to several different
subpopulations. Each subpopulation may have its own set of
potential adaptations, yet the entire set of potential
adaptations across all subpopulations within the host would
form a common parent distribution of mutational effects.

Given such a distribution, we determine the probability
densities for both population fitness and time of fixation.
In terms of the evolving parasite example, the expected
fitness trajectory derived from these p.d.f.s may be
interpreted as the average fitness trajectory of the entire
parasite population within the host. The variance may be
interpreted as the fitness variance among samples taken
throughout the entire parasite population. The
probabilistic derivation of the fitness trajectories
consists of two parts. Firstly, we determine the p.d.f. for
population fitness after each fixation. Secondly, we
determine the p.d.f. for the time until fixation. In this
way, a complete picture of the adaptive dynamics may be
drawn. For brevity, we analyze only two regions of
parameter space. In the first region, the non-ordered
region, all fixation orderings are assumed equiprobable. In
the second region, the ordered region, beneficial mutations
are fixed in order from largest to smallest effect with
probability approaching one.

The non-ordered region

When populations are small and/or mutation rates are
low, all orderings of fixations are nearly equiprobable
under the condition that all μ_i π(s_i) are nearly equal.
Because π(s_i) is assumed to be a linear function, this
condition may be restated to require simply that all μ_i s_i be
nearly equal. This condition is met when (i) all selective
coefficients for beneficial mutations are nearly equal and
the rates of mutation to all beneficial mutations are nearly
equal, or (ii) large-effect mutations have lower mutation
rates (i.e., are rarer) than small-effect mutations. When
populations are small (or mutation rates are low) and the
above condition is met, we refer to that region as the
non-ordered region of parameter space.

Here follows a probabilistic derivation for fitness
trajectories of evolving populations whose parameters belong
to the non-ordered region. The selective advantage of
beneficial mutations, S, is assumed to be exponentially
distributed with parameter α. (Note that this distribution
requires the second scenario, number (ii) above, to satisfy
the condition that all orderings of fixations are nearly
equiprobable.) In the non-ordered case, population fitness
is given by W_i = 1 + \sum_{j=1}^{i} S_j, where the S_j are independent,
identically distributed (i.i.d.) random variables. Thus,
W_i - 1 is a gamma-distributed random variable with p.d.f.

\phi_{W_i - 1}(w) = \frac{\alpha^{i} w^{i-1}}{(i-1)!} e^{-\alpha w} . (28)

The expectation and variance are E(W_i) = 1 + i/\alpha and
Var(W_i) = i/\alpha^2, respectively.
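The gamma form of the fitness increment, and its two moments, can be evaluated directly. The following is a minimal sketch, assuming i.i.d. exponential effects with parameter alpha; the function names are illustrative only.

```python
from math import exp, factorial


def fitness_increment_pdf(x, i, alpha):
    """Gamma density of W_i - 1, the sum of i i.i.d. exponential(alpha)
    selective advantages (non-ordered region)."""
    if x < 0:
        return 0.0
    return alpha ** i * x ** (i - 1) * exp(-alpha * x) / factorial(i - 1)


def fitness_mean_and_variance(i, alpha):
    """Expectation and variance of population fitness W_i after i fixations."""
    return 1.0 + i / alpha, i / alpha ** 2
```

A simple Riemann sum of `fitness_increment_pdf` over a fine grid should return a total mass close to one and a mean close to i/alpha, which provides a quick sanity check on the parameterization.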

The p.d.f. for time until fixation has two components.
After the fixation of one mutation, a certain time, T_M, will
elapse before the appearance of the next beneficial
mutation. After this mutation has appeared, a certain
amount of time, T_F, is required for its fixation. Thus the
time until fixation of the ith mutation is given by
T_i = \sum_{j=1}^{i} (T_{M_j} + T_{F_j}). The T_{M_j} are i.i.d. exponential random
variables; thus, the sum, \sum_{j=1}^{i} T_{M_j}, is a gamma-distributed
random variable with density

g_{T_M}(t) = \frac{\bar{\eta}^{i} t^{i-1}}{(i-1)!} e^{-\bar{\eta} t} , where \bar{\eta} = \frac{1}{M} \sum_{j=1}^{M} \mu_j \pi(s_j) N

(recall the condition that all μ_j π(s_j) be approximately equal). The T_{F_j}
are given by T_{F_j} = \ln N / S_j; therefore, they are i.i.d. with
density

f_{T_F}(t) = \frac{\alpha \ln N}{t^2} \exp\left\{ -\frac{\alpha \ln N}{t} \right\} .

Because the random variables are independent, the p.d.f. for their sum,
\sum_{j=1}^{i} T_{F_j}, may be computed numerically (there is no analytical
solution) as the i-fold convolution of f_{T_F}(t). Thus the
total time until fixation of the ith beneficial mutation has
p.d.f.,

h_{T_i}(t) = g_{T_M}(t) * f_{T_F}(t)^{*i} , (29)

where g_{T_M}(t) is the gamma density with parameter i, *
denotes convolution, and the exponent *i denotes i-fold
convolution. Numerical solution may be facilitated by
noting that (29) is equivalent to
h_{T_i}(t) = \mathcal{F}^{-1}\left( \mathcal{F}(g_{T_M}(t|i)) \, [\mathcal{F}(f_{T_F}(t))]^{i} \right) , where \mathcal{F} denotes
Fourier transform.
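The Fourier-transform route to the convolution can be sketched numerically as follows. This is an illustration under stated assumptions rather than the procedure actually used: the densities are discretized on a uniform grid, a small radix-2 FFT is written out so that the example is self-contained, and the argument `rate` stands for the assumed common rate at which successful beneficial mutations appear. All function and parameter names are hypothetical.

```python
import cmath
from math import exp, factorial, log


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def time_to_fixation_pdf(i, rate, alpha, N, t_max, n_points=256):
    """Density of T_i on a uniform grid over [0, t_max], by FFT convolution."""
    dt = t_max / (n_points - 1)
    t = [k * dt for k in range(n_points)]
    # Gamma density with shape i for the summed waiting times (mass per cell).
    g = [rate ** i * x ** (i - 1) * exp(-rate * x) / factorial(i - 1) * dt
         for x in t]
    # Density of one fixation time T_F = ln(N) / S, with S exponential(alpha).
    c = alpha * log(N)
    f = [0.0] + [c / x ** 2 * exp(-c / x) * dt for x in t[1:]]
    # Zero-pad to a power of two so circular convolution equals linear convolution.
    size = 1
    while size < (i + 1) * n_points:
        size *= 2
    G = fft(g + [0.0] * (size - n_points))
    F = fft(f + [0.0] * (size - n_points))
    product = [a * b ** i for a, b in zip(G, F)]
    h = fft(product, inverse=True)
    # Normalize the inverse transform and convert cell masses back to a density.
    return t, [value.real / size / dt for value in h[:n_points]]
```

Because the fixation-time density has a heavy right tail, the grid must extend well beyond the times of interest; the values returned on [0, t_max] are unaffected by the truncation, but the total mass on the grid will be less than one.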

The ordered region

When populations are large or mutation rates are high,
mutations are guaranteed to be fixed in order from largest
to smallest selective advantage. In this case, matters are
complicated by the fact that all independence among random
variables is lost. Suppose that M beneficial mutations are
available to an organism. If mutations must be fixed in
order from largest to smallest selective advantage, then the
first mutation to appear will confer a selective advantage
whose p.d.f. is given by the largest order-statistic,

p_{S_{[1]}}(s) = M [F(s)]^{M-1} f(s) , (30)

where F(s) and f(s) are, respectively, the cumulative
distribution function (c.d.f.) and p.d.f. for the selective
advantage of beneficial mutations. If we assume, as we have
in the previous derivations, that selective advantage S of
beneficial mutations is an exponentially distributed random
variable, then (30) becomes

p_{S_{[1]}}(s) = M \left(1 - e^{-\alpha s}\right)^{M-1} \alpha e^{-\alpha s} . (31)

This gives the p.d.f. for W_1, since W_1 = 1 + S_{[1]}. Derivation
of the p.d.f. for W_i = 1 + \sum_{j=1}^{i} S_{[j]} does not yield the gamma
distribution (as in the non-ordered region) because the S_{[j]}
are not independent. In this case, the p.d.f. is found by
implementing a convenient feature of the exponential
distribution, namely, that S_{[i]} and the differences S_{[j]} - S_{[j+1]},
j = 1, 2, ..., i-1, are independent random variables. Thus,
for computing the density of population fitness upon
fixation of the ith beneficial mutation, the following
theorem obtains.

Theorem. For i = 1, 2, ..., M, let X_{[i]} be the order-statistics
drawn from exponential parent density, \alpha e^{-\alpha x}, such
that X_{[1]} > X_{[2]} > ... > X_{[M]}. Define Z_i = \sum_{j=1}^{i} X_{[j]}. Then the
probability density function of Z_i is

\phi_i(z) =

M \left(1 - e^{-\alpha z}\right)^{M-1} \alpha e^{-\alpha z} , i = 1

hFl d’ -m, z’ ‘ Rhi j . '
(17.) [m w“) v),

 

 

- i "1 '“ZL " (‘1)k i k ”-kd
w ere Y ( 1) [00.] (l e ) + : (i-k—l)![ 00.) z .

This theorem is readily proved by induction. Thus, the
density for fitness after the ith fixation is \phi_{W_i - 1}(w), as
given by the theorem.
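The independence property invoked above can be made explicit with a short worked decomposition. It is offered as a sketch and not as the proof of the theorem; the auxiliary variables E_k (i.i.d. exponential with unit mean) and G_i are our notation and do not appear in the original text.

```latex
\begin{aligned}
X_{[j]} - X_{[j+1]} &= \frac{E_j}{\alpha j}, \qquad j = 1, \dots, M, \quad X_{[M+1]} \equiv 0, \\
X_{[j]} &= \sum_{k=j}^{M} \frac{E_k}{\alpha k}, \\
Z_i = \sum_{j=1}^{i} X_{[j]} &= \frac{1}{\alpha} \sum_{k=1}^{i} E_k + i \sum_{k=i+1}^{M} \frac{E_k}{\alpha k}
  = G_i + i \, X_{[i+1]}, \\
E(Z_i) &= \frac{i}{\alpha} + \frac{i}{\alpha} \sum_{k=i+1}^{M} \frac{1}{k}.
\end{aligned}
```

Here G_i is gamma-distributed with shape i and parameter α, and is independent of X_{[i+1]}, so the density of Z_i is the convolution of a gamma density with a rescaled order-statistic density.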

As in the non-ordered case, the time to fixation of the
ith beneficial mutation is given by the sum,
T_i = \sum_{j=1}^{i} (T_{M_j} + T_{F_j}). And as before, the waiting time is
decomposed into two parts, (i) \sum_{j=1}^{i} T_{M_j} and (ii) \sum_{j=1}^{i} T_{F_j}. Part
(i) is the same as in the non-ordered case. In part (ii),
the same transformation of variables, T_{F_j} = \ln N / S_{[j]}, applies;
however, the transformed variables in the ordered case are
not independent. Again, the lack of independence is
remedied by the convenient feature of the exponential
distribution described above. The p.d.f. for T_F is

fTF, (y!) =
y y y1-2 yi-l (33)
f ‘f . . . .f j. y(thxv...,yg)(macbcd...cbg ,

t-l [-2 2 l

TyiTlTy2 '3'y1-2'iy1-1

where, y =


M! - -a In N M“ ' -2 -a 1n N
——. 0( ln N ' 1 - ex —— - ex —-—
(M-i) ! ( ) [ p{ Y’. D I! (y! y] +1) p{ y] '1’}. +1 }'

and )3” = 0. .As in the non-ordered case, the p.d.f. for the

sum of the two parts, T_i = \sum_{j=1}^{i} (T_{M_j} + T_{F_j}), is given by their
convolution,

h_{T_i}(t) = g_{T_M}(t) * f_{T_F}(t) = \mathcal{F}^{-1}\left( \mathcal{F}\{g_{T_M}(t)\} \, \mathcal{F}\{f_{T_F}(t)\} \right) . (34)

In this section, I have presented work in progress. It
is hoped that some features of the fitness p.d.f. derived
for the ordered region will enable one to distinguish it
from the fitness p.d.f. derived for the non-ordered region.
If such distinctive features are discovered between sums of
order-statistics (ordered region) vs. sums of i.i.d. random
variables (non-ordered region), it may be possible to
determine whether the evolution of a meta-population is
selection limited (corresponding to the ordered region) or
mutation limited (corresponding to the non-ordered region),
simply based upon the fitness distribution.

Discussion

Previous work by Johnson et al. (1996) explored the
case in which a single beneficial mutation is available to a
population. They computed the variance in mean fitness
among replicate populations and discovered the existence of
two distinct domains: (i) When the product of beneficial
mutation rate and population number, μN, is above a critical
value, adaptive evolution is repeatable; they called this
the "coincident-event" domain. (ii) When μN is small,
adaptive evolution is not repeatable; they called this the
"isolated-event" domain. Qualitatively, we have reached the
same conclusion by our discovery of the distinct ordered and
non-ordered regions of parameter space, the main difference
being that we have treated the case of two or more available
beneficial mutations.

Assumptions of the models

The expected fitness trajectory given by equation (24)
was derived by assuming that fitness jumps suddenly at each
expected time of fixation of a beneficial mutation. (These
sudden jumps in fitness explain the jagged trajectories in
Figure 3.3.) While this assumption is justified in
Appendix 3.1 and in Lenski et al. (1991) for the expected
fitness trajectory, it is not justified for the trajectory
of fitness variance. The fitness variance given by equation
(25) is the variance among replicate populations in which
fixations always occur at their expected time. Equation
(25) does not account for the variance in timing of
occurrence of beneficial mutations. While this timing
variance may be large, the corresponding fitness variance is
typically much smaller than the fitness variance due to the
different possible fixation orderings.

The expected fitness trajectory given by equation (24)
also assumes that selection coefficients are additive. That
is, the analysis does not allow any epistatic interactions,
whereby the selection coefficients of certain mutations
might change in magnitude or even sign. This absence of
epistatic effects in conjunction with the finite number of
available beneficial mutations allows the roughly monotonic
deceleration of the fitness trajectory to its final plateau
(Figure 3.3, panels A, C, E). If there were epistasis, then
the trajectory could suddenly re-accelerate after, for
example, the fixation of a beneficial mutation of small
effect which resulted in certain other mutations becoming
much more beneficial than they were against the former
genetic background. The absence of epistasis ensures the
eventual convergence of all populations to the same fitness
plateau, such that the variance in fitness among populations
eventually goes to zero (Figure 3.3, panels B, D, F). With
strong epistasis, in which the sign of certain mutational
effects varies according to genetic background, different
populations can become stuck indefinitely on fitness peaks
of different heights (Wright, 1982). That is, the initial
fixation order of beneficial mutations will vary among
finite populations, which may then open up (or close off)
alternative adaptive routes. Therefore, sustained fitness
variation among replicate populations, even while they
individually approach fitness plateaus, implies a rugged
adaptive landscape with strong epistasis (Lenski et al.,
1991; Lenski & Travisano, 1994). In future work, we intend
to examine formally the dynamical consequences of epistatic
interactions among beneficial (and conditionally beneficial)
mutations in finite populations.

The previous section, entitled Fitness trajectories II,
relies heavily on the assumption that the parent
distribution of beneficial mutational effects is the
exponential. The notion that this distribution should be
some monotonic decreasing function was proposed by Fisher
(1930). His claim was based on certain assumptions about
both the structure and the dimensionality of fitness
landscapes. More recently, Gillespie (1991) elegantly
defended the use of the exponential distribution for
beneficial mutational effects, basing his argument solely on
the statistical theory of extreme values. Despite the sound
logic of these arguments, however, very few biological
data are available to test this assumption.

Estimating parameters

For ten thousand generations of bacterial evolution,
Lenski et al. (e.g. Lenski & Travisano, 1994) have recorded
fitnesses of twelve E. coli populations relative to a common
ancestor. We have performed a least-squares regression of
their average fitness data to a simplified variation of
equation (24) (Figure 3.4). A simplification was necessary
because the number of terms in the sum of equation (24) is
M!; hence, exploration of parameter space for moderately
large M is computationally prohibitive. Instead of summing
over all possible fixation orderings, our simplified
equation sums only two terms: (i) the fitness trajectory in
which fixations are rank-ordered times the probability of
rank-ordering, and (ii) the expected fitness trajectory in
which all orderings are equiprobable times the complement of
the probability of rank-ordering. The expected fitness
trajectory in which all orderings are equiprobable is given
by:

E[w(t | no order)] = w_i , where i = \{k : t \in [t_k, t_{k+1})\} , (35)

where w_i = 1 + i/\alpha is the fitness after the ith fixation, and

t_i = \sum_{j=1}^{i} \left( \frac{j}{\mu \pi(1/\alpha) N} + \alpha \ln N \right) = \frac{i(i+1)}{2 \mu \pi(1/\alpha) N} + i \alpha \ln N

is the timing
of the ith fixation; μ is the overall beneficial mutation
rate at time zero, and α is the exponential parameter. This
simplification is systematically biased toward extreme
unpredictability, because it assumes that fixation orderings
are either perfectly rank-ordered or completely
unpredictable. This assumption neglects to favor
"well-ordered" fixations (e.g. 124356) over "poorly-ordered"
fixations (e.g. 516324), but instead treats both as
equiprobable. In spite of this bias, however, tests of our
simplified approach, for the computationally feasible case
of small M, yielded a close approximation to the exact solution
of equation (24) when there is a moderate to strong tendency
toward rank-ordering of fixations. Previous estimates of μ
and α for the E. coli populations (Gerrish & Lenski, 1998)
gave us a priori reason to believe that there was a strong
tendency toward rank-ordering of fixations, justifying the
use of our simplified approach for these populations.
Least-squares regression of the fitness data from the Ara-1
population (Lenski & Travisano, 1994) gave: exponential
parameter, α = 20, beneficial mutation rate, μ = 5 × 10^{-9},
and number of available mutations, M = 8. Two of these
parameters correspond reasonably well with previous
estimates, α = 35 and μ = 2 × 10^{-9}, from a model in which an
infinite number of available beneficial mutations is assumed
(Gerrish & Lenski, 1998). That the infinite-mutations model
estimates a larger value for α reflects the fact that the
beneficial mutation rate does not change when there is no
limit to the number of available beneficial mutations. In
the present model, the beneficial mutation rate effectively
decreases over time because the finite number of available
beneficial mutations are slowly expended. Thus, to achieve
a similar fitness increase, the average selective advantage
of each mutation must be greater for the present model in
which there is a finite number of available beneficial
mutations. This would explain why the present model
estimates a lower α (average selective advantage = 1/α).
The parameter M has not previously been estimated.
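The two-term simplification described above can be sketched as follows. This is an illustration under assumptions, not the fitting code used for Figure 3.4: the fixation-probability function `pi`, the rank-ordered trajectory `ranked_fitness`, and the probability `p_rank` are supplied by the caller and are hypothetical arguments.

```python
from math import log


def fixation_times(M, mu, alpha, N, pi):
    """Expected fixation times t_1..t_M when all orderings are equiprobable.

    pi is a caller-supplied function giving the fixation probability of a
    mutation with selection coefficient s (an assumed input).
    """
    per_wait = 1.0 / (mu * pi(1.0 / alpha) * N)
    return [i * (i + 1) / 2.0 * per_wait + i * alpha * log(N)
            for i in range(1, M + 1)]


def unordered_fitness(t, times, alpha):
    """Step trajectory: fitness is 1 + i/alpha once the ith fixation time has passed."""
    i = sum(1 for t_i in times if t_i <= t)
    return 1.0 + i / alpha


def simplified_expected_fitness(t, ranked_fitness, p_rank, times, alpha):
    """Rank-ordered trajectory weighted by p_rank, plus the equiprobable-orderings
    trajectory weighted by the complement 1 - p_rank."""
    return (p_rank * ranked_fitness(t)
            + (1.0 - p_rank) * unordered_fitness(t, times, alpha))
```

A least-squares fit would then minimize the squared difference between `simplified_expected_fitness` and the observed mean fitness over a grid of candidate values of M, μ, and α.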

In future work, I will develop a modification of the
estimation procedure described here that will take into
account all “well-ordered” fixations. I envision a routine
that computes the probabilities of all fixation orderings in
which zero or only two fixations do not appear in rank-
order. In addition, a bootstrap routine could then sample
other permutations at random. Such a routine will be used
not only to estimate parameters with greater accuracy, but
also to compute the corresponding trajectory of fitness
variance. Comparison of this theoretical variance with
observed fitness variance among E. coli populations could be
useful in addressing the question of whether these
populations are following the same evolutionary pathway
(Lenski et al., 1991).

Ordered substitutions in nature, with an application to HIV

Some recent studies with bacteria and viruses have
shown that certain pairs of mutations are fixed in evolving
populations in a more or less predictable order (Hall, 1988;
Mittler & Lenski, 1992; Cunningham et al., 1997). Three
different hypotheses can, in principle, explain this
predictability. First, mutation A may be much more
advantageous than B. In that case, A deterministically
out-competes B if they co-occur, as is likely in a
sufficiently large population. Also, A has a
correspondingly higher probability of escaping initial loss
due to drift than does B. Second, mutation A may occur at a
higher rate than B so that, in a small population, A is more
likely to arise and be fixed first. Third, the mutations
may interact epistatically, such that B is advantageous only
in the presence of A. In that case, B will be fixed in the
population only after A has been fixed. The theory that we
have developed here provides a mathematical framework for
evaluating the first hypothesis. Moreover, it can be
readily extended to also incorporate the second hypothesis,
at least for the case of two mutations. In the following
paragraphs, we apply this framework to the case of
resistance mutations in HIV (human immunodeficiency virus).
The effective population number of virions in an HIV-
infected person has been a matter of recent debate
(Mascolini, 1997; Leigh Brown & Richman, 1997). (The outcome
of this debate is of central importance to the understanding
of in vivo evolution of HIV.) On one side of the debate,
Coffin (1995) argues that, with respect to population
dynamics, in vivo populations of HIV are effectively
infinite and therefore behave deterministically. On the
other side, Leigh Brown (1997) contends that in vivo HIV has
the stochastic behavior of a small, certainly finite
population. Based on molecular data and an algorithm
developed by Kuhner et al. (1995), Leigh Brown (1997)
calculated that the effective population number of virions
in an HIV-infected patient was on the order of one thousand.
This number is five to eight orders of magnitude smaller
than the actual number of virions present in an HIV-infected
person (Ho et al., 1989; Piatak et al., 1993). To support
this calculation, he pointed to the fact that mutations
conferring resistance to anti-retroviral drugs do not always
appear in order from largest to smallest effect. He
reasoned that, if the effective population number were very
large as modelers had previously assumed, one would expect
resistance mutations to always appear in order from largest
to smallest effect. Yet, despite the sound logic of this
argument, his calculation was met with some criticism
(Mascolini, 1997). Here, we evaluate Leigh Brown's claim by
mathematically formalizing his argument concerning the
ordering of resistance mutations.

Based on known parameters for two resistance mutations
(Boucher et al., 1992), the ordering of their fixation in
vivo, and an estimate of the in vivo point mutation rate
(Mansky & Temin, 1995), we estimate the probability of the
observed ordering as a function of effective population
number. (Effective population number is simply the
parameter N in our previous developments.) Then, employing
the double limit given by equation (17), we compute the
maximum possible probability of the observed ordering.

In 4 out of 18 patients receiving AZT treatment for HIV
infection, Boucher et al. (1992) observed that an inferior
resistance mutation, K70R, appeared to be fixed before a
superior resistance mutation, T215Y/F.¹ (In several cases,
the appearance of K70R was transient; we did not count these
transient appearances as fixations.) The 50 percent
inhibitory concentrations (IC50) for wildtype, mutant K70R,
and mutant T215Y/F are 0.006, 0.01, and 0.15 μM AZT,
respectively. Employing a saturation function, we compute
selection coefficients of mutants by

s = \frac{IC_{50}(M) \, [C + IC_{50}(W)]}{IC_{50}(W) \, [C + IC_{50}(M)]} - 1 ,

where C denotes AZT concentration in vivo, and W and M denote
wildtype and mutant, respectively.
Using equation (15), the solid line in Figure 3.5 plots the
probability of the observed fixation ordering against
effective population number. It suggests that such an
ordering would never occur if effective population number
were greater than ~3000.
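The conversion from IC50 values to selection coefficients can be illustrated with a short calculation. The in vivo AZT concentration is not stated here, so the value 0.01 μM used below is purely an assumption, chosen because it yields selective advantages close to those quoted in the caption of Figure 3.5 (0.3 and 1.5). The function and constant names are hypothetical.

```python
def selection_coefficient(ic50_mutant, ic50_wildtype, concentration):
    """Selective advantage of a resistant mutant over wildtype under a
    saturation function; all arguments must be in the same units."""
    mutant = ic50_mutant * (concentration + ic50_wildtype)
    wildtype = ic50_wildtype * (concentration + ic50_mutant)
    return mutant / wildtype - 1.0


# Assumed illustrative drug concentration in uM; not given in the text.
ASSUMED_AZT_CONCENTRATION = 0.01

s_k70r = selection_coefficient(0.01, 0.006, ASSUMED_AZT_CONCENTRATION)
s_t215 = selection_coefficient(0.15, 0.006, ASSUMED_AZT_CONCENTRATION)
```

Under this assumed concentration, the inferior mutation K70R has a selective advantage of roughly one third and the superior mutation T215Y/F has an advantage of 1.5. Both values shift considerably as the assumed concentration changes, which is the sensitivity discussed in the following paragraphs.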

 

¹ A. Leigh Brown (personal communication) has pointed out that a caveat to
our interpretation of these data is that both mutations may be present in
the population prior to drug treatment. But such pre-existence (of any
consequence) is unlikely if the effective population number is small.
So, in this sense, our argument appears circular. Nevertheless, the
observed fixation ordering could be explained (without invoking unusual
epistatic interactions or extraordinarily large differences in
mutation-selection balance) only if the T215Y/F mutation does not pre-exist in
the population.

One source of error in the above calculations is due to
uncertainty in the conversion of the IC50 to selection
coefficients. The accuracy of the resulting selection
coefficient relies on (i) the accuracy of the IC50, (ii) the
dubious assumption that these values determined in vitro are
the same in vivo, and (iii) both the accuracy and constancy
of the in vivo concentration of AZT, C. This latter
concentration is highly variable throughout the patient's
body and over time, thus introducing considerable
uncertainty in estimates of the selection coefficients. To
circumvent this difficulty, we appeal to equation (17) for
the maximum probability, Pr{I,S}, given only that s_S > s_I.

A second source of uncertainty in our calculations is
introduced by the assumption of equal mutation rates.
Indeed, that K70R appears first may be explained, at least
in part, by invoking a higher mutation rate for K70R than
for T215Y/F. This explanation may be ruled out, however, by
calculating the maximum probability given by the limit as
μ_K70R gets large; this limit is employed in equation (17).

The dotted line in Figure 3.5 employs equation (17) to
plot the largest possible probability of observing K70R
before T215Y/F against effective population number. By
largest possible probability, we mean the maximum
probability given simply that s_T215Y/F > s_K70R and that μ_K70R is
unknown. This figure demonstrates that these two simple
conditions together with the observation of K70R before
T215Y/F imply that the effective population number is less
than about 5000. This observation is confirmed by
numerically solving the inequality given by equation (18)
for N.

Based on the Poisson distribution and the estimated
probability (solid line of Figure 3.5), the observations of
Boucher et al. statistically support the hypothesis that the
effective population number is less than ~300. Based on
our maximum probability (dotted line), their observations
support an effective population number of anything less than
~4000. This number corresponds to an average production of
one mutation per nucleotide per ~10 generations ([4000
replications × (3)10^{-5} mutations per nucleotide per
replication]^{-1}; see Mansky & Temin, 1995). Thus, our
observations generally support the notion advocated by Leigh
Brown that the effective population number of virions in an
HIV-infected person is small enough that it must be
considered finite.
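The arithmetic behind the figure of roughly ten generations can be reproduced directly; the function name below is ours and is used only for illustration.

```python
def generations_per_mutation(effective_population, mutation_rate):
    """Average number of generations between occurrences of a given point
    mutation, for a population of the given effective size."""
    return 1.0 / (effective_population * mutation_rate)


# 4000 replications per generation, 3e-5 mutations per nucleotide per replication.
interval = generations_per_mutation(4000, 3e-5)
```

The result is a little over eight generations, consistent with the order-of-magnitude statement of about ten generations.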

[Figure 3.1: two panels. Vertical axis (both panels): probability of superior before inferior, Pr{S,I}. Horizontal axes: Log[population number, N] (upper panel) and Log[beneficial mutation rate, μ] (lower panel).]

Figure 3.1. Probability that a superior mutation (s_S = 0.1)
is fixed before an inferior mutation (s_I = 0.08) as a
function of both population number, N (for which μ_S = μ_I = 3
× 10^{-9}), and overall beneficial mutation rate, μ (for which
N = 3.3 × 10^7).

[Figure 3.2: two panels. Horizontal axes: Log[population number, N] (upper panel) and Log[beneficial mutation rate, μ] (lower panel).]

Figure 3.2. Probability of fixation ordering from largest
to smallest selective advantage as a function of both
population number, N (for which μ = 6 × 10^{-9}), and overall
beneficial mutation rate, μ (for which N = 3.3 × 10^7). The
number of available beneficial mutations is M = 7, and the
corresponding set of selection coefficients is {.08, .05,
.04, .03, .02, .01, .005}. The individual beneficial
mutation rates are μ_i = μ/M.

[Figure 3.3: three pairs of panels, (A) N = 1E6, (B) N = 1E7, (C) N = 1E8. In each pair, the upper panel plots expected fitness and the lower panel plots standard deviation, both against generations (0 to 10000).]

Figure 3.3. Evolutionary trajectories as a function of
population size. The top and bottom panels in each pair
show population fitness and its standard deviation. (A) N =
10^6, (B) N = 10^7, (C) N = 10^8. The number of available
beneficial mutations is M = 5, and the corresponding set of
selection coefficients is {.12, .08, .07, .05, .02}. The
overall beneficial mutation rate is μ = 6 × 10^{-9}, and
individual mutation rates are μ_i = μ/M.

 

 

 

 

 

[Figure 3.4: fitness plotted against time (generations).]

Figure 3.4. Least-squares fit of expected fitness
trajectory to fitness data from evolving E. coli population,
the Ara-1 line from Lenski & Travisano (1994). Effective
population number is N = 3.3 × 10^7. The generation number
in the x-axis corresponds to a discrete-time model of
population growth by binary fission, whereas the
mathematical developments in this paper employ a
continuous-time formulation. For purposes of parameter estimation, we
have therefore multiplied the generation number by a factor
of ln 2 to adjust for this difference. Estimated parameters
are M = 8, α = 20, and μ = 5 × 10^{-9}.

 

 

 

 

[Figure 3.5: probability plotted against Log[effective population number, Ne]; solid and dotted lines.]

Figure 3.5. Estimated probability (solid line) and maximum
probability (dotted line) that the inferior AZT-resistance
mutation, K70R, is fixed before the superior mutation,
T215Y/F, as a function of effective population number. We
used a conservative value for the mutation rate,
μ_T215Y/F = 10^{-5} (see Mansky & Temin, 1995). To compute
estimated probabilities, we employed equation (15) and
assumed (i) that the two mutations conferred in vivo
selective advantages over the wildtype of s_K70R = 0.3 and
s_T215Y/F = 1.5 (converted from IC50 values reported by Boucher
et al., 1992), and (ii) that μ_K70R = μ_T215Y/F. To compute
maximum probabilities, we employed equation (17) which
relies only on the condition that s_T215Y/F > s_K70R and an estimate
for μ_T215Y/F.

Chapter 4

HITCHHIKING OF DELETERIOUS MUTATIONS
WITH SPECIAL ATTENTION TO MUTATOR ALLELES¹

Abstract

In asexual populations, the fate of a particular allele is
determined as much by the selective values of alleles at other loci as
by its own selective value. A deleterious allele may achieve high
frequency or even fixation in a population as a result of linkage with a
beneficial mutation. For a specified deleterious mutation, we derive
the rate at which such “hitchhiking” takes place, taking into account
competitive interactions among beneficial mutations. As a special case,
we explore the hitchhiking of mutator alleles. We find that there
exists a most probable mutator strength that is positive and typically
of significant effect, suggesting a sort of “quantum” behavior in which
a population’s mutation rate either corresponds to the wildtype rate or
is elevated significantly above that rate. This characteristic has been
observed in both natural and laboratory populations. Our results
indicate that hitchhiking is generally an important mechanism in the
evolution of mutation rates in asexual populations.

 

¹ This chapter is written in the format of a paper. The “we” in this
chapter refers to P.J. Gerrish, P.D. Sniegowski and R.E. Lenski.

Introduction

A neutral or deleterious mutation may rise to high
frequency or even fixation in a population as a result of
being linked to a beneficial mutation. This process is
known as hitchhiking. Maynard Smith & Haigh (1974) and
Kaplan et al. (1989) have explored the consequences of
hitchhiking on heterozygosity for both selectively neutral
alleles and selectively maintained polymorphisms. Among
other things, they found that hitchhiking reduced average
heterozygosity by an amount determined in part by the degree
of linkage between the beneficial mutation and the locus in
question. In asexual organisms, there is complete linkage
among all loci. For the asexual case, Berg (1995) derives a
stochastic model of hitchhiking in which beneficial
mutations are implicitly assumed to be very rare events such
that they do not interfere with each other's progression to
fixation; he explores implications of this model for neutral
and nearly neutral variation. In large populations,
however, the fate of a given allele is affected not only by
selection at linked loci but also by competition with other
fitness variants in the population, each with its own
complex array of linked fitness alleles. Taking into
account such competition, we explore the dynamics of
hitchhiking, focusing our attention on the fate of a single
specified deleterious allele. We derive the per-capita rate
at which the specified deleterious allele is fixed by
hitchhiking and the consequent probability that it is fixed
within a given time interval.

In obligately asexual organisms, linkage simply means
that alleles are present in the same genome. There are
three ways in which a deleterious mutation and a beneficial
mutation may appear on the same genome: the deleterious
mutation may occur first, the beneficial mutation may occur
first, or the two may occur simultaneously (in the same
generation). If the beneficial mutation occurs first, then
the deleterious mutation will not be fixed because it will
compete with fitter organisms that carry the beneficial but
not the deleterious mutation. The deleterious mutation may
appear early in the growth of the beneficial mutation, in
which case it will initially rise to high frequency with the
beneficial mutation; however, it will subsequently decrease
in frequency and eventually stabilize at some frequency
determined by the balance between mutation and selection.

In light of the above, a deleterious mutation can be
fixed by selection only if it is in the background upon
which a beneficial mutation appears. That is, the
deleterious mutation must occur before, or simultaneously
with, the beneficial mutation. Thus, the rate at which a
specified deleterious mutation hitchhikes to fixation is
reduced to the rate at which successful beneficial mutations
are produced in the specified deleterious subpopulation.

Theoretical developments

The hitchhiking rate

For convenience, we refer to any beneficial mutation
that is produced on a specified deleterious background (in
that order) as a double-mutation. The frequency of a
deleterious mutation in a population is increased by
recurrent mutation and decreased by selection. The
resulting mutation-selection balance determines the
frequency of the deleterious mutation, which we denote by
$\phi$. Let $\mu_B'$ denote the beneficial mutation rate in this
deleterious subpopulation. Then, the per capita recruitment
rate of double-mutations is $\phi\mu_B'$. We allow the beneficial
mutation rate in the deleterious subpopulation to differ
from the corresponding rate in the wildtype subpopulation
(hence the prime), because one class of deleterious
mutations that is of special interest causes a general
increase in mutation rate.

Let Pr{fix} denote the probability that any given
beneficial mutation produced by the deleterious
subpopulation achieves fixation in the population. Then,
the per-capita rate of fixation of such double-mutations is

$$h = \phi\,\mu_B'\,\Pr\{\mathrm{fix}\} . \qquad (36)$$

This is, in other words, the per-capita hitchhiking rate of
the deleterious mutation in question. To assemble the
pieces of equation (36), we first derive an effective
mutation-selection balance, $\phi_e$, and compare it with the
commonly used equilibrium mutation-selection balance, $\hat{\phi}$.
Then we derive the fixation probability, Pr{fix}, by
considering that a beneficial mutation may be lost as a
result of either drift or competitive exclusion by a
superior beneficial mutation. All derivations in this
section are general for deleterious mutations.

An effective mutation-selection balance

The specified deleterious mutation appears at per-capita
rate $\mu_D$, and selection acts to remove this mutation
at per-capita rate $s_D\phi$, where $s_D$ is the coefficient of
selection against the specified deleterious mutation.
Therefore, the dynamic equation for mutation-selection
balance is $d\phi/dt = (\mu_D - s_D\phi)(1-\phi)$, where $(1-\phi)$ imposes
logistic frequency dependence and ensures that $\phi \le 1$.
Because mutation-selection balance is typically a low
frequency (such that $1-\phi \approx 1$), this equation is well
approximated by $d\phi/dt = \mu_D - s_D\phi$. The equilibrium solution to
this dynamic equation is $\hat{\phi} = \mu_D/s_D$, which is the conventional
expression for mutation-selection balance.
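As a rough numerical illustration (not part of the original derivation), the following Python sketch integrates the logistic form of the dynamic equation and compares it with the closed-form solution of the low-frequency approximation and with the equilibrium $\mu_D/s_D$. The function names, the Euler step size, and the parameter values are assumptions chosen only for illustration.

```python
import math

def equilibrium_frequency(mu_d, s_d):
    """Conventional mutation-selection balance, mu_D / s_D."""
    return mu_d / s_d

def frequency_linear(t, mu_d, s_d, phi0=0.0):
    """Closed-form solution of d(phi)/dt = mu_D - s_D*phi, starting from phi0."""
    phi_hat = mu_d / s_d
    return phi_hat + (phi0 - phi_hat) * math.exp(-s_d * t)

def frequency_logistic(t, mu_d, s_d, phi0=0.0, dt=0.01):
    """Euler integration of d(phi)/dt = (mu_D - s_D*phi) * (1 - phi)."""
    phi = phi0
    for _ in range(int(round(t / dt))):
        phi += dt * (mu_d - s_d * phi) * (1.0 - phi)
    return phi

# Illustrative values only (hypothetical, not estimates from the text).
mu_d, s_d = 1e-6, 0.03
for t in (10, 100, 1000):
    print(t, frequency_linear(t, mu_d, s_d), frequency_logistic(t, mu_d, s_d))
print("equilibrium:", equilibrium_frequency(mu_d, s_d))
```

Under these assumed values the two trajectories are nearly indistinguishable, which is the sense in which the factor $(1-\phi)$ can be dropped when the balance frequency is low.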

Evolving populations, however, are not at equilibrium.
Hitchhiking of deleterious alleles only occurs in adaptively
evolving populations. In such populations, genetic
variation is periodically purged by adaptive substitutions.
In Appendix 4.1, we derive an effective mutation-selection
balance, $\phi_e$, which employs the above dynamic equation and
takes into account such periodic homogenizing of a
population. In subsequent figures, we plot curves employing
both equilibrium ($\phi = \hat{\phi}$) and effective ($\phi = \phi_e$)
mutation-selection balance.

Figure 4.1 illustrates the importance of accounting for
the mutation-selection dynamics. It uses mathematical
developments and equations that are presented in Appendix
4.1, but the basic points can be readily understood
graphically. Panel A shows, for two different values of $s_D$,
the dynamics of a deleterious mutation as it approaches its
equilibrium without any interruption. Notice that the more
deleterious mutation not only has a lower equilibrium
frequency, but also that it approaches that equilibrium much
faster. However, the on-going substitution of beneficial
mutations will perturb these dynamics away from their
approach to the equilibrium, periodically re-setting the
deleterious mutation frequency to near zero. Panels B and C
show this effect for two different values of the
substitution rate, 0, the inverse of which approximates the

median time between fixation of successive beneficial
mutations. As can be seen, the discrepancy between the
equilibrium (dashed line) and effective (solid line)
mutation-selection balance becomes greater as the
substitution rate increases. The discrepancy also becomes
greater at lower values of $s_D$ because, as shown in panel A,
the equilibrium frequency is higher and the approach to
equilibrium slower, leaving correspondingly more time for a
given rate of adaptive substitutions to depress the
frequency of the deleterious mutation substantially below
its equilibrium. Thus, the discrepancy between the
equilibrium and effective mutation-selection balance is a
decreasing function of $s_D$ and an increasing function of the
substitution rate.
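The qualitative trend described above can be sketched with a deliberately simplified model. The code below is not the Appendix 4.1 derivation: it assumes that the deleterious frequency is reset exactly to zero at fixed intervals equal to the inverse of the substitution rate, and that it follows the low-frequency approximation between resets. The function name and parameter values are hypothetical.

```python
import math

def mean_fraction_of_equilibrium(s_d, subst_rate):
    """Time-averaged phi(t) / (mu_D / s_D) over one cycle of length 1/subst_rate.

    Assumes phi(t) = (mu_D/s_D) * (1 - exp(-s_D * t)) within a cycle, with phi
    reset to zero at the start of each cycle (a simplifying assumption).
    """
    x = s_d / subst_rate
    return 1.0 - (-math.expm1(-x)) / x

for s_d in (0.001, 0.01, 0.1):
    for subst_rate in (1e-4, 1e-3):
        frac = mean_fraction_of_equilibrium(s_d, subst_rate)
        print(f"s_D={s_d:<6} substitution rate={subst_rate:<6} mean/equilibrium={frac:.3f}")
```

In this toy version the ratio approaches 1 (no discrepancy) when selection is strong relative to the substitution rate, and falls well below 1 when selection is weak or substitutions are frequent, in line with the trend stated in the text.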

Probability of fixation of double-mutations, Pr{fix}

If a beneficial mutation on a deleterious background is
to achieve fixation, it must (i) confer an advantage that
outweighs the disadvantage of the deleterious background,
(ii) survive the effects of drift in the first few
generations of growth, and (iii) outcompete any alternative
beneficial mutations that may arise on the wildtype or
deleterious backgrounds. Condition (iii) may be rephrased
to state that a beneficial mutation will be fixed only if it
does not encounter an interfering mutation, defined as
follows. If a superior mutation appears and itself survives
drift, then it is called an interfering mutation since it
interferes with, indeed prevents, the fixation of the
original beneficial mutation. Because loss by drift occurs
in the first few generations of growth whereas loss by
clonal interference (competition among alternative
beneficial mutations) occurs in later generations of growth,
we can make the assumption that these two processes are
independent. Thus, we compute the probability of fixation
of a beneficial mutation on a deleterious background as the
product of the probabilities that (i) this double-mutation
survives drift and (ii) no interfering mutation is
encountered.

Suppose a population consists of two homogeneous
subpopulations, wildtype and deleterious, until the
appearance of a beneficial mutation on the deleterious
background. Before the beneficial mutation appears, the
size of the deleterious subpopulation is determined by the
dynamic balance between mutation and selection. Let $x(t)$
denote the number of wildtype individuals at time t, let
$y(t)$ denote the number of individuals carrying only the
deleterious mutation, and let $z(t)$ denote the number of
individuals carrying both beneficial and deleterious
mutations. Let $s_B$ and $s_D$ denote, respectively, the
selection coefficients for the beneficial mutation and
against the deleterious mutation, such that, relative to the
wildtype, the fitness of an individual carrying both
mutations is $1 + s_B - s_D$ (i.e., both $s_B$ and $s_D$ assume
positive values). Define $t_f$ as the time to virtual fixation
of the double mutation. Let time start at the appearance of a
beneficial mutation on the deleterious background, i.e.,
$z(0) = 1$. Then the total number of alternative beneficial
mutations is the number produced on either the wildtype or
deleterious background during the time interval $(0, t_f)$. On
the wildtype background, this number is

$$\mu_B \int_0^{t_f} x(t)\,dt = \frac{\mu_B}{s_B - s_D}\,(1-\phi)\,N \ln N , \qquad (37)$$

where $\mu_B$ denotes the rate of beneficial mutation in the
wildtype subpopulation, and the dynamics of $x(t)$ are assumed
to be logistic, which carries the assumption of constant
population size (see Crow & Kimura, 1970). Likewise, the
number of alternative beneficial mutations produced on the
deleterious background between the times of appearance and
fixation of the original beneficial mutation is

$$\mu_B' \int_0^{t_f} y(t)\,dt = \frac{\mu_B'}{s_B - s_D}\,\phi\,N \ln N . \qquad (38)$$

The total expected number of alternative beneficial
mutations is the sum of equations (37) and (38).

We now determine what fraction of these alternative
mutations are interfering mutations. To proceed requires an
assumption about how selection coefficients of beneficial
mutations are distributed. For reasons discussed in
Gillespie (1991, p. 262), we have chosen the exponential
distribution. Let $\alpha$ denote the parameter of this
distribution. Then, given some function, $\pi(s)$, describing
the probability of surviving drift (see Appendix 4.1), a
beneficial mutation chosen at random from the wildtype
subpopulation (i) is superior to the double-mutant, and (ii)
survives drift, with probability
$\int_{s_B - s_D}^{\infty} \pi(s)\,\alpha e^{-\alpha s}\,ds$. When $\pi(s)$
is a linear function of s (which we assume in the following
developments), this integral reduces to
$\pi(s_B - s_D + 1/\alpha)\,e^{-\alpha(s_B - s_D)}$. The second factor in this product is
the probability that an arbitrarily chosen beneficial
mutation is superior to the double mutant in question, given
the exponential distribution of mutational effects. The
first factor is the expected probability that an arbitrarily
chosen superior mutation survives drift. The probability
that a beneficial mutation chosen at random from the
deleterious subpopulation is (i) superior to the original
mutation and (ii) survives drift is
$\int_{s_B}^{\infty} \pi(s - s_D)\,\alpha e^{-\alpha s}\,ds$, or
$\pi(s_B - s_D + 1/\alpha)\,e^{-\alpha s_B}$ when $\pi(s)$ is linear. Note the change in
the lower integration limit from $s_B - s_D$ for the wildtype
subpopulation to $s_B$ for the deleterious subpopulation.
This is because a mutation occurring on the deleterious
background must have a selection coefficient greater than $s_B$
if it is to be superior to the double-mutation, whereas a
mutation occurring on the wildtype background needs a
selection coefficient only greater than $s_B - s_D$.
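The two closed forms above can be checked numerically. The following Python sketch compares each integral, evaluated by a simple midpoint rule, with its closed-form reduction. It assumes $\pi(s) = ks$ with a hypothetical slope k, because the actual form of $\pi(s)$ is given in Appendix 4.1 and is not reproduced here; the values of $\alpha$, $s_B$ and $s_D$ are likewise illustrative.

```python
import math

def integrate(f, lower, upper, n=100_000):
    """Midpoint rule on [lower, upper]."""
    width = (upper - lower) / n
    return width * sum(f(lower + (i + 0.5) * width) for i in range(n))

# Illustrative values; k is a hypothetical slope for the linear pi(s).
alpha, s_b, s_d, k = 35.0, 0.10, 0.03, 2.0

def pi(s):
    return k * s

upper = 2.0  # truncation point; exp(-alpha * 2) is negligible

# Wildtype background: lower limit s_B - s_D.
wild_numeric = integrate(lambda s: pi(s) * alpha * math.exp(-alpha * s), s_b - s_d, upper)
wild_closed = pi(s_b - s_d + 1.0 / alpha) * math.exp(-alpha * (s_b - s_d))

# Deleterious background: lower limit s_B, and drift acts on the net advantage s - s_D.
del_numeric = integrate(lambda s: pi(s - s_d) * alpha * math.exp(-alpha * s), s_b, upper)
del_closed = pi(s_b - s_d + 1.0 / alpha) * math.exp(-alpha * s_b)

print(wild_numeric, wild_closed)
print(del_numeric, del_closed)
```

The two backgrounds share the same first factor and differ only in the exponent, which reflects the different lower limits of integration.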

Interfering mutations are those which (i) occur in the
interval $(0, t_f)$, (ii) survive the effects of drift, and
(iii) are superior to the double-mutation. Therefore, the
expected number of interfering mutations produced by the
wildtype subpopulation is

$$\lambda = \frac{\mu_B}{s_B - s_D}\,(1-\phi)\,N \ln(N)\;\pi(s_B - s_D + 1/\alpha)\,e^{-\alpha(s_B - s_D)} . \qquad (39)$$

Likewise, the expected number of interfering mutations
produced by the deleterious subpopulation is

$$\lambda' = \frac{\mu_B'}{s_B - s_D}\,\phi\,N \ln(N)\;\pi(s_B - s_D + 1/\alpha)\,e^{-\alpha s_B} . \qquad (40)$$

The total expected number of interfering mutations in the
population, therefore, is simply the sum $\lambda + \lambda'$. The number of
interfering mutations is Poisson distributed, so the
conditional probability of fixation of a beneficial mutation
with net selective advantage, $s_B - s_D > 0$, is equal to the
probability that (i) it survives drift and (ii) no
interfering mutation appears:

$$\Pr\{\mathrm{fix} \mid s_B\} = \pi(s_B - s_D)\,e^{-(\lambda + \lambda')} . \qquad (41)$$

Given our assumption that $s_B$ is exponentially distributed
with parameter $\alpha$, the expected probability that a beneficial
mutation on the deleterious background achieves fixation is

$$\Pr\{\mathrm{fix}\} = \alpha \int_{s_D}^{\infty} \pi(s_B - s_D)\,e^{-(\lambda + \lambda') - \alpha s_B}\,ds_B . \qquad (42)$$

The lower limit of integration reflects the obvious
condition that a beneficial mutation is defined as one that
has a net selective advantage, i.e., $s_B - s_D > 0$.
Substituting equation (42) into (36) gives the final
expression for the per capita hitchhiking rate of a
deleterious mutation:

$$h = \phi\,\mu_B'\,\alpha \int_{s_D}^{\infty} \pi(s_B - s_D)\,e^{-(\lambda + \lambda') - \alpha s_B}\,ds_B . \qquad (43)$$
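To make equations (36) and (39) through (43) concrete, the following Python sketch evaluates the fixation probability by numerical quadrature and multiplies it by the recruitment rate. It is only a sketch under stated assumptions: $\pi(s)$ is taken to be $ks$ with a placeholder slope k (the actual function is in Appendix 4.1), the frequency of the deleterious mutation is set to the equilibrium value rather than the effective value, and the remaining inputs are the E. coli values quoted later in this chapter. All function and variable names are hypothetical.

```python
import math

def pr_fix(alpha, mu_b, mu_b_prime, s_d, phi, n_pop, k=2.0, n_steps=20_000):
    """Expected fixation probability of a double-mutation, equation (42).

    Assumes pi(s) = k * s; the slope k is a placeholder, not a value from the text.
    """
    def pi(s):
        return k * s

    upper_net = 40.0 / alpha            # beyond this, exp(-alpha * net) is negligible
    width = upper_net / n_steps
    n_log_n = n_pop * math.log(n_pop)
    total = 0.0
    for i in range(n_steps):
        net = (i + 0.5) * width         # net advantage s_B - s_D > 0
        s_b = s_d + net
        common = n_log_n * pi(net + 1.0 / alpha) / net
        lam = mu_b * (1.0 - phi) * common * math.exp(-alpha * net)       # equation (39)
        lam_prime = mu_b_prime * phi * common * math.exp(-alpha * s_b)   # equation (40)
        total += pi(net) * math.exp(-(lam + lam_prime) - alpha * s_b)
    return alpha * total * width

def hitchhiking_rate(phi, mu_b_prime, p_fix):
    """Per-capita hitchhiking rate, equation (36)."""
    return phi * mu_b_prime * p_fix

# Illustrative inputs: values quoted in this chapter for E. coli, with phi set
# to the equilibrium mu_D / s_D (an assumption; the text prefers phi_e).
alpha, mu_b, m, mu_delta, n_pop = 35.0, 2.0e-9, 100, 0.0003, 3.3e7
mu_d = 2.0e-6
s_d = -math.expm1(-mu_delta * (m - 1))
phi = mu_d / s_d
p = pr_fix(alpha, mu_b, m * mu_b, s_d, phi, n_pop)
print("Pr{fix} =", p, " h =", hitchhiking_rate(phi, m * mu_b, p))
```

When the population is small enough that $\lambda + \lambda'$ is negligible, the integral has the closed form $k e^{-\alpha s_D}/\alpha$ under the same linear assumption, which provides a convenient check on the quadrature.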

Converting per-capita rate to populational probability

To make sense of a per-capita hitchhiking rate, it is
helpful to convert it into the corresponding probability
that a hitchhiking event occurs in a population within a
given time interval. A hitchhiking event is said to have
occurred in a population if the specified deleterious
mutation (i) has produced a successful beneficial mutation,
and (ii) has achieved fixation as a result of linkage with
this beneficial mutation. Given per-capita hitchhiking
rate, h, the populational probability that the specified
deleterious mutation produces a successful beneficial
mutation in the interval (0,T) is $1 - e^{-hNT}$. After the
successful beneficial mutation is produced, a certain time,
$T_f$, elapses until the resulting double mutation is fixed.
Only then has the hitchhiking event taken place. Thus, the
populational probability that the specified deleterious
mutation hitchhikes to fixation in the interval (0,T) is

$$\Pr\{\mathrm{hitchhike} \mid T\} = \begin{cases} 1 - e^{-hN(T - T_f)} , & T \ge T_f \\ 0 , & T < T_f \end{cases} . \qquad (44)$$

This equation may be understood as the complement of the
probability that zero hitchhiking events take place by time
T. In other words, it is the probability that one or more
hitchhiking events take place in the interval (0,T), thus
accounting for the redundancy of multiple hitchhiking
events. (We assume that once the mutation is fixed, it
remains fixed.)
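The conversion from a per-capita rate to a populational probability is short enough to state directly in code. The Python sketch below implements equation (44) as written; the function name and the example inputs are hypothetical and serve only to show the shape of the result.

```python
import math

def pr_hitchhike(h, n_pop, t, t_fix):
    """Probability of at least one hitchhiking event by time t, equation (44).

    h      per-capita hitchhiking rate
    n_pop  population size N
    t      length of the time interval T
    t_fix  time T_f needed for the double mutation to reach fixation
    """
    if t < t_fix:
        return 0.0
    return -math.expm1(-h * n_pop * (t - t_fix))

# Hypothetical inputs, for illustration only.
h, n_pop, t_fix = 1.0e-12, 3.3e7, 300.0
for t in (100, 1_000, 10_000, 100_000):
    print(t, pr_hitchhike(h, n_pop, t, t_fix))
```

The probability is zero until the fixation time has elapsed and then rises toward 1 at a rate set by the product hN.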

The time required for fixation of a double mutation
depends on the selective advantage of the successful
beneficial mutation. While this value is different for each
successful beneficial mutation, its expected value may be
determined following the logic of Gerrish & Lenski (1998).

They reason that "a beneficial mutation whose selective
advantage is small is not likely to become fixed because it
must compete with many superior mutations. On the other
hand, a beneficial mutation whose advantage is large is less
likely to be produced. Hence, there must be some
intermediate selection coefficient that balances the
fixation advantage of large s with the more frequent
occurrence of small s. This balance corresponds to the
expected selection coefficient of successful mutations."

Let $p(s_B) = K\,\pi(s_B - s_D)\,e^{-(\lambda + \lambda') - \alpha s_B}$, where K is a constant
such that $\int_{s_D}^{\infty} p(s_B)\,ds_B = 1$. Given that a beneficial mutation
(i) is in linkage with the specified deleterious mutation
and (ii) achieves fixation in the population, then its
selective advantage, $s_B$, has probability density $p(s_B)$.
Thus, the expected value for the selection coefficient of
successful double mutations is

$$E(s_B - s_D) = \int_{s_D}^{\infty} s_B\,p(s_B)\,ds_B - s_D . \qquad (45)$$

This may also be interpreted as the expected increase in
population mean fitness during a hitchhiking event. Hence,
to a first approximation, the time required for the fixation
of a successful double mutation is $T_f \approx \ln N / E(s_B - s_D)$, thereby
completing equation (44).
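As a worked special case (an illustration added here, not a result stated in the source), equation (45) can be evaluated in closed form when clonal interference is negligible, that is, when $\lambda + \lambda' \approx 0$, and when $\pi(s)$ is proportional to s. Writing $u = s_B - s_D$ for the net advantage:

```latex
% Low-interference limit: \lambda + \lambda' \approx 0 and \pi(s) = k s (assumed).
\begin{align*}
p(s_B) &\propto (s_B - s_D)\, e^{-\alpha s_B}
  \;\;\Longrightarrow\;\;
  p(u) = \alpha^{2}\, u\, e^{-\alpha u}, \qquad u > 0, \\
E(s_B - s_D) &= \int_{0}^{\infty} u \cdot \alpha^{2} u\, e^{-\alpha u}\, du
  = \alpha^{2} \cdot \frac{2}{\alpha^{3}} = \frac{2}{\alpha}, \\
T_f &\approx \frac{\ln N}{E(s_B - s_D)} = \frac{\alpha \ln N}{2}.
\end{align*}
```

In this limit the expected net advantage of a successful double mutation is twice the mean of the underlying exponential distribution. Because $\lambda + \lambda'$ decreases as the net advantage increases, interference shifts weight toward larger advantages, so $2/\alpha$ is best read as a lower bound on $E(s_B - s_D)$ under these assumptions.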

Some of the subsequent figures plot both per-capita
hitchhiking rates and populational probabilities of
hitchhiking. The benefit of plotting rates is to avoid
dependence on the time interval, and the benefit of plotting
per-capita rates is to avoid dependence on population size.
Elimination of these two dependencies facilitates
interpretation of the observed trends in terms of the
processes involved, yet it somewhat obfuscates the
consequences of these processes at the population level.
Thus, in order to elucidate these more intuitively
accessible consequences, subsequent figures also plot
populational probabilities.

An application

Hitchhiking rate of mutator mutations

A class of deleterious mutations that is of particular
interest, especially in light of Chapter 1, is the class
known as mutator mutations. These are mutations that
disrupt DNA synthesis and repair, and thereby elevate the
mutation rate of the organism. Both experimental and
theoretical evidence (Sniegowski et al., 1997; Taddei et
al., 1997) suggests that the evolution of mutation rates in
asexual populations may be determined more by chance
hitchhiking of mutators than by the evolutionary "fine-
tuning" of mutation rates proposed in previous work (Leigh,
1970; Ishii et al., 1989).

Before proceeding, a clarification of terminology is
essential. Two subpopulations are of interest, wildtype and
mutator; what we previously called the deleterious
subpopulation, we now call the mutator subpopulation.
Whereas previous reference to deleterious mutation implied a
specific mutation, the following developments make reference
to an overall deleterious mutation rate. A genotype's
overall deleterious mutation rate is the total rate at which
all deleterious mutations are produced. Define $\mu_\delta$ to be the
overall deleterious mutation rate in the wildtype
subpopulation; this is to be contrasted with $\mu_D$, which is now
defined as the rate of mutation to the mutator genotype.

A mutator mutation is deleterious most of the time.
The overwhelming majority of mutations are either neutral or
deleterious. We assume that a mutator allele elevates the
mutation rate of all mutations by the same factor such that
$\mu_B' = m\mu_B$ and $\mu_\delta' = m\mu_\delta$, where m is called the mutator
strength. Because the wildtype deleterious mutation rate,
$\mu_\delta$, is typically several orders of magnitude greater than
the wildtype beneficial mutation rate, $\mu_B$, a general
increase in the mutation rate increases the total number of
deleterious mutations produced much more than the total
number of beneficial mutations produced. Therefore, an
increased mutation rate confers reduced fitness. We now
calculate how much fitness is expected to decline for a
given increase in the mutation rate based on the increased
production of deleterious mutations.

The number of mutations occurring per generation is
Poisson distributed, so that the fraction of the
subpopulation in which zero deleterious mutations occur in a
single generation is $e^{-\mu_\delta}$. This is equivalent to the
absolute fitness of the wildtype subpopulation, because each
deleterious mutation is ultimately a genetic death in a
large asexual population. That is, in a large asexual
population, each deleterious mutation is eliminated sooner
or later by selection (except for the occasional deleterious
mutation that hitchhikes to fixation), irrespective of the
form of the fitness function.^1 Likewise, the absolute
fitness of the mutator subpopulation is $e^{-m\mu_\delta}$, where $m\mu_\delta$ is
the overall deleterious mutation rate of the mutator.
Therefore, the selection coefficient against the mutator is
$s_D = 1 - e^{-m\mu_\delta}/e^{-\mu_\delta} \approx \mu_\delta(m-1)$. This definition of $s_D$ may now be
inserted into equation (43). The resulting expression gives
the hitchhiking rate of a mutator.
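A short calculation shows how close the linear approximation is to the exact selection coefficient. The Python sketch below evaluates both for several mutator strengths, using the wildtype deleterious mutation rate of 0.0003 quoted later in this chapter; the function name is hypothetical.

```python
import math

def mutator_selection_coefficient(mu_delta, m):
    """Exact s_D = 1 - exp(-m * mu_delta) / exp(-mu_delta) = 1 - exp(-mu_delta * (m - 1))."""
    return -math.expm1(-mu_delta * (m - 1))

mu_delta = 0.0003  # overall deleterious mutation rate of the wildtype, as quoted in the text
for m in (10, 100, 1000):
    exact = mutator_selection_coefficient(mu_delta, m)
    approx = mu_delta * (m - 1)
    print(f"m={m:<5} exact s_D={exact:.5f}  approx mu_delta*(m-1)={approx:.5f}")
```

The approximation is tight for weak and moderate mutators and begins to overstate the cost only when the product of the deleterious rate and the mutator strength is no longer small.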

^1 This is most easily understood by first computing the average fitness
of an individual in which all deleterious mutations confer the same
selective disadvantage. If deleterious mutation rate is $\mu_1$ and the
selection coefficient against this class of mutations is $s_1$, then expected
fitness of an individual in the population is

$$E(w) = e^{-\mu_1/s_1} + (1-s_1)\,\frac{\mu_1}{s_1}\,e^{-\mu_1/s_1} + (1-s_1)^2\,\frac{\mu_1^2}{2 s_1^2}\,e^{-\mu_1/s_1} + \cdots = e^{-\mu_1} .$$

If there are M s-classes of deleterious mutations, then the expected fitness
of an individual in the population is $e^{-\mu_\delta}$, where $\mu_\delta = \sum_{i=1}^{M} \mu_i$.

Estimating rate of mutation to mutator for evolving E. coli populations

As described in Sniegowski et al. (1997) and in Chapter
1, three of twelve evolving E. coli populations fixed a
mutator mutation within 10,000 generations. The times of
these three fixations are known; they are approximately $t_1$ =
2000, $t_2$ = 3000, and $t_3$ = 8000 generations. The censored
likelihood function for the mutator hitchhiking rate is
$(e^{-h(10,000)})^9\, h^3 \exp\{-h \sum_{i=1}^{3} t_i\}$. Note that mutator hitchhiking
rate here refers to the rate at which the population fixes a
mutator by hitchhiking (i.e., it is not a per-capita rate).
From this, the maximum likelihood estimate of the mutator
hitchhiking rate for the E. coli populations is
$h_{obs} = 3\,(90,000 + \sum_{i=1}^{3} t_i)^{-1} = 2.9 \times 10^{-5}$ hitchhiking events per
generation. Variance was determined from the likelihood
function to be $\mathrm{Var}(h) = h_{obs}^2/3 = 2.8 \times 10^{-10}$. Assuming a normal
density truncated at zero and normalized, the approximate
95% confidence interval for the hitchhiking rate is
($3.8 \times 10^{-6}$, $6.2 \times 10^{-5}$).
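The estimate and its interval can be recomputed from the three fixation times. The Python sketch below follows the steps described in the text: a censored exponential likelihood, a variance of the squared estimate divided by the number of events, and a normal density truncated at zero. The function name and the way the truncated quantiles are computed are assumptions about how such a calculation might be done, not a record of the original computation.

```python
from statistics import NormalDist

n_pops = 12                       # replicate populations
horizon = 10_000                  # generations observed
fix_times = [2000, 3000, 8000]    # approximate fixation times of the three mutators

n_events = len(fix_times)
# Nine populations are censored at the horizon; three contribute their fixation times.
exposure = (n_pops - n_events) * horizon + sum(fix_times)
h_obs = n_events / exposure       # maximum likelihood estimate
var_h = h_obs ** 2 / n_events     # variance from the curvature of the log-likelihood
sd_h = var_h ** 0.5

std_normal = NormalDist()
mass_below_zero = std_normal.cdf(-h_obs / sd_h)

def truncated_quantile(p):
    """Quantile of a normal(h_obs, sd_h) density truncated at zero and renormalized."""
    q = mass_below_zero + p * (1.0 - mass_below_zero)
    return h_obs + sd_h * std_normal.inv_cdf(q)

lower, upper = truncated_quantile(0.025), truncated_quantile(0.975)
print("h_obs =", h_obs, " Var(h) =", var_h)
print("approximate 95% interval:", (lower, upper))
```

Under these assumptions the recomputed values agree with the estimate, variance, and interval reported in the text to within rounding.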

All parameters of our model have previously been
estimated for E. coli populations, except the rate of
mutation to mutator, $\mu_D$. These parameters are $\alpha$ = 35 and
$\mu_B = 2.0 \times 10^{-9}$ (Gerrish & Lenski, 1998), $\mu_\delta$ = 0.0003 (Kibota &
Lynch, 1996), m = 100 (Sniegowski et al., 1997), and $N = N_e
= 3.3 \times 10^{7}$ (Lenski et al., 1991). The rate of mutation to
mutator may now be determined, given the estimate of the
hitchhiking rate. This estimation procedure is facilitated
by noting that $\lambda + \lambda'$ from (42) is well approximated by
$\frac{\mu_B N \ln N}{s_B - s_D}\,\pi(s_B - s_D + 1/\alpha)\,e^{-\alpha(s_B - s_D)}$, which is independent of $\mu_D$.
(This approximation is due to the fact that mutation-selection
balance, $\phi$, is typically quite low.) Thus, $\mu_D$ may
be estimated explicitly by inserting $\phi = \phi_e$ (given by (84)
in Appendix 4.1) into (36) and rearranging:

 

h S 1- -:D/O ‘1
u e 0” D 1 - e , where 0‘is the rate of
D N u; Pr{fix} 3D; C

beneficial substitutions (equation (88) in Appendix 4.1),
and Pr{fix} is given by (42). The rate of mutation to the
mutator type is then estimated to be $\mu_D = 2.0 \times 10^{-6}$ mutator
mutations produced per replication. Based solely on the
variance in the hitchhiking rate, the actual rate of
mutation to mutator may be assumed to be in the interval
($2.6 \times 10^{-7}$, $4.3 \times 10^{-6}$) with 95% confidence. This estimate
is quite reasonable, in view of the following
considerations. There are four loci (mutS, mutL, mutH, and
uvrD) that together encode the methyl-directed mismatch
repair pathway. Mutations causing defects in any of these
genes will yield a mutator with strength $m \approx 100$, and all
three of the mutators found in the experiments by Sniegowski
et al. (1997) mapped to this pathway. Assuming ~1000 base-pairs
for each locus as well as a mutation rate for E. coli
of ~$5 \times 10^{-10}$ per base-pair per replication (Drake, 1991),
and estimating that perhaps 25% of all mutations cause a
major loss of protein function, one would predict an overall
mutation rate for this repair pathway of about
$4 \times 1000 \times 5 \times 10^{-10} \times 0.25 = 5 \times 10^{-7}$. This calculation lies within the
confidence limits of our estimate for $\mu_D$ based on the
observed dynamics of the hitchhiking process.
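The back-of-envelope target-size calculation is easy to reproduce. The Python sketch below multiplies the four factors given in the text and checks the product against the confidence limits reported for the rate of mutation to mutator; the variable names are hypothetical.

```python
n_loci = 4                          # loci of the methyl-directed mismatch repair pathway
bp_per_locus = 1000                 # assumed length of each locus, in base-pairs
mu_per_bp = 5e-10                   # mutation rate per base-pair per replication
fraction_loss_of_function = 0.25    # assumed fraction of mutations causing major loss of function

predicted_mu_d = n_loci * bp_per_locus * mu_per_bp * fraction_loss_of_function

# Confidence limits for mu_D as reported in the text.
ci_low, ci_high = 2.6e-7, 4.3e-6
print("predicted rate of mutation to mutator:", predicted_mu_d)
print("within reported confidence limits:", ci_low <= predicted_mu_d <= ci_high)
```

The prediction falls near the lower end of the interval, so it is consistent with the estimate but sensitive to the assumed fraction of loss-of-function mutations.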

Discussion

Assumptions

An assumption of the above developments is that the
number of beneficial mutations is unlimited, such that their
rate ($\mu_B$) and distribution of effects ($\alpha$) are constant.
This simplification renders our developments best suited to
populations in slowly changing environments in which the
potential for adaptive improvement is renewed at a rate
roughly equal to the rate of adaptation. In what follows,
we compare our results to the numerical simulations of
Taddei et al. (1997) and to the experimental results of
Sniegowski et al. (1997). The assumption of unlimited
mutations, however, is not appropriate for either of these
comparisons. To evaluate sensitivity to this assumption, we
are presently working on the analytical derivation for the
case when the number of possible beneficial mutations is
finite.

We have also assumed that a beneficial mutation appears
either on the wildtype background or on the background
carrying the specified deleterious mutation. In reality, a
population is composed of several deleterious mutations;
thus, a beneficial mutation may appear on the background of
any number of deleterious mutations, thereby effectively
reducing the fitness of what we have called the "wildtype"
background. Nevertheless, our assumption may be justified
by considering that (i) mutation-selection balance is quite
low for strongly deleterious mutations, meaning that the
beneficial mutation recruitment rate in that subpopulation
is very small, and (ii) mutation-selection balance may be
high for weakly deleterious mutations, but these will hardly
affect the outcome of a linked beneficial mutation.

As previously mentioned, the assumption that selection
coefficients of beneficial mutations are exponentially
distributed is nicely defended both by Fisher (1930) and by
Gillespie (1991). Their arguments, of course, do not
absolve this assumption of criticism, but they offer logical
support to the choice of the exponential when the shape of
the distribution is unknown. Lastly, we have assumed that
there are no gene interactions between the deleterious
mutation in question and any beneficial mutation. The
validity of this assumption is a purely biological question
and appears to be highly contingent on the particulars of
the mutations in question.

Comparison with simulation

Taddei et al. (1997) simulated populations that grew
exponentially from $10^{8}$ to $10^{10}$ individuals, of which $10^{8}$ were
then sampled, with the process repeated to mimic a serial
transfer regime. For comparison with their simulations, we
employ the fixation effective population size, which we
calculate as $N_e \approx N_0 T$ (Appendix 4.2), where $N_0$ is the
number transferred ($10^{8}$), and T is the number of generations
between transfers ($\log_2 100$). We have determined that the
set of beneficial mutations they used approximates an
exponential distribution with parameter $\alpha$ = 98 (maximum
likelihood estimate). Also, for comparison, we employ the
equilibrium approximation for mutation-selection balance,
$\phi = \hat{\phi}$; we do not employ our effective
mutation-selection balance because their simulations do not
appear to take into account the periodic purging of genetic
variation caused by substitutions.^2 Inserting their
simulation parameters ($\mu_\delta = 10^{-4}$, $\mu_B = 10^{-8}$, and $\mu_D = 5 \times 10^{-7}$)
into our analytical model, we computed the probability that
such a population would fix a mutator by hitchhiking within
a time span of 20,000 generations. Out of 100 simulations,
Taddei et al. found that 19 fixed a mutator in 20,000
generations when m = 10, and 7 fixed a mutator when m = 100.
Employing their assumption of mutation-selection equilibrium
($\phi = \hat{\phi}$), our model gives probabilities of 0.223 and 0.073
for m = 10 and m = 100, respectively, which shows good
agreement with their findings. However, when we correctly
account for the variation-purging effect of substitutions by
employing the effective mutation-selection balance ($\phi = \phi_e$),
probabilities are 0.040 and 0.057 for m = 10 and m = 100,
respectively. We restate that the fundamental differences
between our analytical model and the simulations are: (i)
the analytical model assumes an unlimited number of
available beneficial mutations whereas the simulations
assume that number to be finite, and (ii) the simulations
assume mutation-selection equilibrium whereas our analytical
model takes into account the dynamics of mutation-selection
balance.

^2 This is evidenced by their Figure 3 in which mutator frequencies never
drop much below mutation-selection balance. Given parameters used by
Taddei et al., their Figure 3 should show occasional downward spikes of
between one and two orders of magnitude if the variation-purging effect of
substitutions in the wildtype subpopulation were allowed by their
simulations.
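The comparison with the simulations can be laid out numerically. The Python sketch below first computes the effective population size, assuming it is the product of the number transferred and the number of generations between transfers, and then expresses each observed count as a deviation from the model's expected count in binomial standard deviations. The binomial comparison is an addition made here for illustration and is not an analysis reported in the text; the function and variable names are hypothetical.

```python
import math

n_transferred = 1e8
n_final = 1e10
generations_per_transfer = math.log2(n_final / n_transferred)   # log2(100)
n_e = n_transferred * generations_per_transfer                   # assumed product form
print("generations per transfer:", generations_per_transfer, " N_e:", n_e)

def z_score(observed, n_runs, p):
    """Deviation of an observed count from n_runs * p, in binomial standard deviations."""
    expected = n_runs * p
    sd = math.sqrt(n_runs * p * (1.0 - p))
    return (observed - expected) / sd

n_runs = 100
observed = {10: 19, 100: 7}                    # simulations that fixed a mutator
model_equilibrium = {10: 0.223, 100: 0.073}    # model with equilibrium balance
model_effective = {10: 0.040, 100: 0.057}      # model with effective balance

for m in (10, 100):
    print(m,
          "equilibrium z =", round(z_score(observed[m], n_runs, model_equilibrium[m]), 2),
          "effective z =", round(z_score(observed[m], n_runs, model_effective[m]), 2))
```

On this rough measure the equilibrium version of the model sits close to the simulated counts for both mutator strengths, while the effective version departs strongly from the simulations at m = 10, which matches the interpretation given in the text.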

162

Effect of population size on hitchhiking rate

Panel A of Figure 4.2 plots per-capita hitchhiking
rate, h, against population number, N, for a mutator; it
shows that h remains constant below about $N = 10^{6}$ and
decreases steadily above that value. This figure elucidates
the effect of increased clonal interference on per-capita
hitchhiking rate. In small populations, clonal interference
is highly improbable. Thus, the principal determinant of h
in small populations is the per-capita probability of
occurrence of a beneficial mutation on the specified
deleterious background. Because it is per-capita, this
probability is independent of N, explaining why h is
essentially independent of N below a certain population
size. Above this population size, the decline in h reflects
the fact that clonal interference is probable and
intensifies with increasing N, thus reducing the probability
that any given beneficial mutation (including those
occurring on the specified deleterious background) will
achieve fixation.

Panel B of Figure 4.2 plots the corresponding
probability that a population fixes the specified
deleterious mutation by hitchhiking within a time interval
of 10,000 generations against population size, N.
Hitchhiking probability increases with population number but
this increase decelerates at large N because of intensifying
clonal interference.

Effect of selective disadvantage on hitchhiking rate

Panel A of Figure 4.3 shows the per capita hitchhiking
rate as a function of the strength of selection against the
specified deleterious mutation, $s_D$. Interestingly, this
rate is essentially independent of $s_D$ when $s_D$ is small.
This is most easily understood by considering that the
initial rise in the deleterious mutation's frequency after a
substitution is determined much more by recurrent mutation
than by selection against the mutation. It is not until the
frequency of the mutation approaches its equilibrium value
that selection begins to significantly affect its
trajectory. Because very slightly deleterious mutations
have relatively high equilibrium frequencies, there is a
good chance that a substitution in the population will occur
before selection becomes important to the trajectories of
such mutations. In other words, in the time interval
between substitutions, slightly deleterious mutations may
behave as neutral mutations, in which case their probability
of hitchhiking is essentially that of a neutral mutation.
(In the language of population genetics, substitutions reduce
the effective population size, $N_e$, and any mutation whose
selective disadvantage is less than $1/N_e$, in a haploid
population, is effectively neutral.)

Panel B of Figure 4.3 plots the corresponding
probability that a population fixes the specified
deleterious mutation by hitchhiking within a time interval
of 10,000 generations as a function of $s_D$. A trend similar
to that observed in panel A is observed.

Effect of mutator strength on hitchhiking rate

Panel A of Figure 4.4 shows how the strength of a
mutator affects its own per-capita hitchhiking rate. Recall
that a mutator allele elevates the general mutation rate,
which includes the beneficial mutation rate, by a factor, m
(i.e., $\mu_B' = m\mu_B$). Thus, equation (36) becomes
$h = m\mu_B\phi\,\Pr\{\mathrm{fix}\}$, where $\mu_B$ denotes wildtype beneficial
mutation rate. This equation is the product of (i) the per-capita
recruitment rate, $m\mu_B\phi$, of double-mutations, and
(ii) the probability of fixation, Pr{fix}, of a double-mutation.
Factor (i) is plotted in Figure 4.5, and factor
(ii) is plotted in Figure 4.6. Thus, Figure 4.4A, which
plots hitchhiking rate against mutator strength, may be
understood simply as the product of Figures 4.5 and 4.6.
Our aim in the next several paragraphs is therefore to
explain the trends in Figures 4.5 and 4.6 in order to
understand Figure 4.4.

We begin by explaining the dashed line in Figure 4.5.
This shows that the invalid equilibrium assumption ($\phi = \hat{\phi}$)
renders the double-mutation recruitment rate essentially
independent of mutator strength over the region of
biological interest. This may be understood as follows.
The rate at which a mutator subpopulation produces
beneficial mutations is directly proportional to the number
of individuals carrying the mutator allele and hence to the
mutation-selection balance of that allele. As derived in
the subsection entitled Hitchhiking rate of mutators, the
coefficient of selection against a mutator is $1 - e^{U_D(1-m)}$.

Thus, if deleterious mutations were always maintained at
their equilibrium frequency, $\hat{\phi} = \mu_D / s_D$, then the population
would produce beneficial mutations on the mutator background
at a per-capita rate of $m\mu_B\mu_D\left[1 - e^{U_D(1-m)}\right]^{-1}$. In other
words, a weaker mutator allows for higher equilibrium
mutation-selection balance but has a lower per capita
mutation rate. These two factors influencing the double-mutation
recruitment rate, $m\mu_B\hat{\phi}$, directly cancel each
other out, rendering this rate essentially independent of
mutator strength. Note that this rate is also independent
of population size, N.
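The near-cancellation under the equilibrium assumption can be checked numerically. The following is a minimal sketch, not the computation used for Figure 4.5: the function name is hypothetical, and the parameter values are assumptions chosen for illustration (in particular the value of $\mu_D$, whose exponent is not legible in the figure captions).

```python
import math

def equilibrium_recruitment(m, mu_B, mu_D, U_D):
    """Per-capita double-mutation recruitment rate under the equilibrium
    assumption: m * mu_B * mu_D / s_D, where s_D = 1 - exp(U_D * (1 - m))
    is the coefficient of selection against a mutator of strength m."""
    s_D = 1.0 - math.exp(U_D * (1.0 - m))
    return m * mu_B * mu_D / s_D

# Assumed, illustrative parameter values (not taken from the source).
MU_B, MU_D, U_D = 2e-9, 2e-6, 3e-4

rates = {m: equilibrium_recruitment(m, MU_B, MU_D, U_D) for m in (10, 100, 1000)}
for m, rate in rates.items():
    print(f"m = {m:5d}   recruitment rate = {rate:.3e}")
```

Over two orders of magnitude in mutator strength the rate changes only slightly, because the factor $m$ and the equilibrium frequency $\mu_D/s_D$ move in opposite directions.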

When the dynamics of mutation-selection balance are
properly accounted for by employing $\phi = \phi_e$, then the
double-mutation recruitment rate is not independent of mutator
strength. In fact, the solid line in Figure 4.5 shows this
rate to be a monotonically increasing function of mutator
strength. This is explained as follows. After an adaptive
substitution takes place, the frequency of mutators in the
population is very low. It takes some time for a mutator
allele to approach its equilibrium frequency. And the
higher that equilibrium frequency, the longer it takes to
approach (Figure 4.1A). This can be seen in the dynamic
solution to the mutation-selection balance equation,
$\phi(t) = \frac{\mu_D}{s_D}\left(1 - e^{-s_D t}\right)$. Weak mutators have small $s_D$ and therefore
require long times to approach their high equilibrium
frequency. If adaptive substitutions occur in the
population at a given rate, then mutator frequency drops
with a certain periodicity determined by this rate. Since
stronger mutators recuperate their lower equilibrium
frequency more rapidly, their effective mutation-selection
balance is less affected by the periodic purging of genetic
variation caused by adaptive substitutions. Thus, their
effective recruitment rate of double-mutations is higher.
Hence the trend of monotonic increasing recruitment rate
with increasing mutator strength, as shown by the solid line
in Figure 4.5.
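The argument above can be illustrated with a short calculation. This is a sketch under stated assumptions: the substitution rate $\sigma$ is held fixed at an illustrative value, the parameter values are assumed rather than taken from the figure, and all function names are hypothetical. The effective balance is the time average of $\phi(t)$ over one interval of length $1/\sigma$.

```python
import math

def selection_against_mutator(m, U_D):
    """s_D = 1 - exp(U_D * (1 - m)) for a mutator of strength m."""
    return 1.0 - math.exp(U_D * (1.0 - m))

def effective_balance(mu_D, s_D, sigma):
    """Time average of phi(t) = (mu_D/s_D)(1 - exp(-s_D t)) over (0, 1/sigma)."""
    x = s_D / sigma
    return (mu_D / s_D) * (1.0 + math.expm1(-x) / x)

def effective_recruitment(m, mu_B, mu_D, U_D, sigma):
    """Recruitment rate m * mu_B * phi_e of double-mutations."""
    s_D = selection_against_mutator(m, U_D)
    return m * mu_B * effective_balance(mu_D, s_D, sigma)

# Assumed, illustrative parameter values; sigma is treated as a constant here.
MU_B, MU_D, U_D, SIGMA = 2e-9, 2e-6, 3e-4, 9e-4

strengths = [2, 10, 100, 1000, 10000]
eff = [effective_recruitment(m, MU_B, MU_D, U_D, SIGMA) for m in strengths]
for m, rate in zip(strengths, eff):
    print(f"m = {m:6d}   effective recruitment rate = {rate:.3e}")
```

With these values the effective recruitment rate rises with mutator strength, in contrast to the nearly flat rate obtained under the equilibrium assumption.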

Figure 4.6 shows that probability of fixation, Pr{fix},
decreases monotonically with mutator strength. The stronger
the mutator, the more deleterious it is. Therefore, the net
fitness of a beneficial mutation that is linked to a mutator
is lower for stronger mutators. Hence, the probability of
fixation of such double-mutations decreases monotonically
with increasing mutator strength.

Figure 4.4 shows that, when the dynamics of mutation-selection
balance are properly accounted for, there exists
an intermediate mutator strength that maximizes the
hitchhiking rate. (Panel B of Figure 4.4 reveals that the
same trend is observed at the population level.) This
observation reflects a balance between the increased
double-mutation recruitment rate of stronger mutators (Figure 4.5),
which increases hitchhiking rate, and the decreased
probability of fixation of stronger mutators (Figure 4.6),
which decreases hitchhiking rate.

Observations of bacterial populations in nature and in
laboratories seem to suggest that mutation rates either
correspond to a wildtype rate or they are elevated from this
rate by one, two or occasionally three orders of magnitude
(Sniegowski et al., 1997; LeClerc et al., 1996; Mao et al.,
1997). Of course, this observation is complicated by the
fact that strong mutators are more likely to be noticed and
their mutation rate is more likely to be statistically
distinguishable from the wildtype. Yet, it seems that the
strength of the evidence outweighs this complication.
Fragility of the genetic mechanisms involved in DNA
synthesis, proofreading and repair may offer one explanation
for this observation (Cox, 1976; Miller, 1996). Our work

suggests another explanation which is based solely on the


adaptive dynamics of asexual populations. If a mutator is
to be observed, it must achieve high frequency in the
population. Hitchhiking is a plausible mechanism by which a
mutator may achieve high frequency (Sniegowski et al.,
1997). Our work shows that a mutator's hitchhiking rate is
maximized when its strength is at some intermediate value.
Therefore, given that a mutator is observed, it is most

likely to be a mutator of intermediate strength.


 

[Figure 4.1, panel A: frequency versus generations.]

Figure 4.1. The dynamics of mutation-selection balance.
The two solid lines in panel A show the frequency
trajectories of two selectively distinct deleterious
mutations, starting with a frequency of zero, as after an
adaptive substitution, and asymptotically approaching their
equilibrium frequency. (These lines plot the dynamic
equation, $\phi(t) = \frac{\mu_D}{s_D}\left(1 - e^{-s_D t}\right)$.) Dashed lines show the
corresponding equilibrium frequencies ($\hat{\phi} = \mu_D / s_D$). Parameters
used are $\mu_D = 3 \times 10^{?}$ [exponent illegible], $s_D$ = 0.03 and 0.003.

 

[Figure 4.1, panels B and C: frequency versus generations (0 to 2500).]

Figure 4.1 (continued). Panels B and C plot the frequency
trajectories of the same deleterious mutations, but here
adaptive substitutions occur at rates $\sigma$ = 0.0009 and $\sigma$ =
0.003, respectively. These substitution rates are
determined from equation (88) using parameters $N = 3 \times 10^{7}$,
$\alpha = 35$, $\mu_B = 4 \times 10^{-10}$ and $2 \times 10^{-9}$. The solid horizontal
lines represent effective mutation-selection balance, as
calculated by equation (87).

 

 

 

 

 

 

 

 

 

 

 

 

 

[Figure 4.2, panels A and B: per-capita hitchhiking rate (A) and probability of hitchhiking (B) versus population number, N (1E+04 to 1E+10).]

Figure 4.2. Panel A shows the per-capita hitchhiking rate
of a specified deleterious mutation as a function of
population number, N. Panel B shows the corresponding
probability that a population fixes the deleterious mutation
by hitchhiking within a time interval of 10,000 generations.
The dashed lines show the result of incorrectly assuming
mutation-selection equilibrium ($\phi = \hat{\phi}$). Parameters used are
$\alpha = 35$, $\mu_B = 2 \times 10^{?}$, $\mu_D = 2 \times 10^{?}$ [exponents illegible], $s_D = 0.03$.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

[Figure 4.3, panels A and B: per-capita hitchhiking rate (A) and probability of hitchhiking (B) versus selective disadvantage, $s_D$ (0.00001 to 1).]

Figure 4.3. Panel A shows the per-capita hitchhiking rate
of a specified deleterious mutation as a function of its
selective disadvantage, $s_D$. Panel B shows the corresponding
probability that a population fixes the deleterious mutation
by hitchhiking within a time interval of 10,000 generations.
The dashed lines show the result of incorrectly assuming
mutation-selection equilibrium ($\phi = \hat{\phi}$). Parameters used are
$\alpha = 35$, $\mu_B = 2.0 \times 10^{-9}$, $\mu_D = 2 \times 10^{?}$ [exponent illegible], and $N = 3 \times 10^{7}$.

 

 

 

 

 

 

 

 

 

 

[Figure 4.4, panels A and B: per-capita hitchhiking rate (A) and probability of hitchhiking (B) versus mutator strength, m (1 to 1000).]

Figure 4.4. Panel A shows the per-capita hitchhiking rate
of a mutator allele as a function of its strength, m. Panel
B shows the corresponding probability that a population
fixes the mutator allele by hitchhiking within a time
interval of 10,000 generations. The dashed line shows the
result of incorrectly assuming mutation-selection
equilibrium ($\phi = \hat{\phi}$). Parameters used are $\alpha = 35$,
$\mu_B = 2.0 \times 10^{?}$, $\mu_D = 2 \times 10^{?}$ [exponents illegible], $U_D = 0.0003$, and $N = 3 \times 10^{7}$.

 

 

 

 

[Figure 4.5: per-capita recruitment rate of double-mutations versus mutator strength, m (1 to 10000).]

Figure 4.5. The per-capita recruitment rate of double-mutations
(beneficial mutations on mutator background) as a
function of mutator strength, m. The dashed line shows the
result of incorrectly assuming mutation-selection
equilibrium ($\phi = \hat{\phi}$). Parameters used are $N = 3 \times 10^{7}$,
$\alpha = 35$, $\mu_B = 2.0 \times 10^{?}$, $\mu_D = 2 \times 10^{?}$ [exponents illegible], and $U_D = 0.0003$.

 

 

 

 

 

 

[Figure 4.6: probability of fixation versus mutator strength, m (1 to 10000).]

Figure 4.6. Probability of fixation of a double-mutation
(beneficial mutation on mutator background) as a function of
mutator strength, m. Parameters used are $N = 3 \times 10^{7}$,
$\alpha = 35$, $\mu_B = 2.0 \times 10^{?}$, $\mu_D = 2 \times 10^{?}$ [exponents illegible], and $U_D = 0.0003$.

[Figure 4.7: frequency versus generations; curves for the deleterious mutation on the wildtype background and on the beneficial background (exact and approximate solutions).]

Figure 4.7. Frequency of a deleterious mutation during a
substitutional event on both wildtype and beneficial
backgrounds. Beneficial mutation appears at time t=0 and
achieves frequency 0.5 at time t=173 (vertical dotted line).
Exact solution is given by equation (85); approximate
solution is given by equation (86). Parameters used are
$N = 3 \times 10^{7}$, $\mu_D = 5 \times 10^{?}$ [exponent illegible], $s = 0.1$, $s_D = 0.005$.

APPENDICES

APPENDIX 1.1

MUTATION RATE ESTIMATION PROGRAM

Theory

The computer program FT.EXE was written to estimate
mutation rates from fluctuation test data (Luria & Delbruck,
1943). Briefly, fluctuation tests report the numbers of
mutants that have accumulated by spontaneous mutation and
subsequent replication during exponential growth of a
population (Chapter 1). Computation of the expected
distribution is notoriously difficult (Stewart et al.,
1990). The probability generating function for this
distribution was derived by Lea & Coulson (1949). From this
generating function, an algorithm for generating the
corresponding distribution was derived independently by
Gurland (1958) and Ma et al. (1992). (See also Gurland,
1963; Sarkar et al., 1992; Jaegger & Sarkar, 1995.) The
program FT.EXE employs this algorithm for computing the
distribution of numbers of mutants in a fluctuation test as

follows.

Let $\mu$ denote mutation rate of the wildtype to the
selected mutation, let $N$ denote the final population number,
and let $i$ denote the number of mutants in the final
population. Then the probability, $p_i$, that $i$ mutants are
present in the final population is given by

$$p_0 = e^{-m} , \qquad p_i = \frac{m}{i}\sum_{j=0}^{i-1}\frac{p_j}{i-j+1} , \qquad (46)$$

where $m = \mu N$ is the expected number of mutations (not to be
confused with mutants). To estimate $m$, the program FT.EXE
employs the Newton series (derived below) for the recurrence
relation given by (46).
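The recurrence for the mutant distribution is straightforward to evaluate directly. The sketch below is an illustration in Python, not the FT.EXE source (which is in BASIC and is not reproduced here); the function name and the example value of $m$ are assumptions.

```python
import math

def mutant_distribution(m, i_max):
    """Return [p_0, ..., p_i_max], the probabilities of i mutants, using
    p_0 = exp(-m) and p_i = (m / i) * sum_{j<i} p_j / (i - j + 1)."""
    p = [math.exp(-m)]
    for i in range(1, i_max + 1):
        total = sum(p[j] / (i - j + 1) for j in range(i))
        p.append(m / i * total)
    return p

m_example = 1.5                      # assumed expected number of mutations
p = mutant_distribution(m_example, i_max=200)
print("p_0..p_3 =", [round(v, 5) for v in p[:4]])
print("mass on 0..200 mutants =", round(sum(p), 4))
```

The distribution has a long right tail, so the probabilities for 0 to 200 mutants sum to slightly less than one.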

The program also implements the empirical formulas for
95% confidence limits derived by Stewart (1994). For small
sample sizes, however, Stewart's formulas have no solution.
To make estimates from small sample sizes, I derive (below)
the analytical variance based on the maximum likelihood
surface. As an alternative to Stewart's confidence limits,
the program also gives 95% confidence limits based on
computation of this analytical variance. Given this

variance, one can base statistical inference on either the


assumption of normality or the implementation of Chebyshev's
inequality (Feller, 1968).

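Let $S$ denote the sample set, let $m = \mu N$ denote expected
number of mutations, and let $u = \ln m$. Given $u$, $p_j(u)$
denotes the probability of $j$ mutants in the final
population. Then, the log-likelihood function is

$$L(u) = \sum_{j \in S} \ln p_j(u) . \qquad (47)$$

From this, the estimate of $u$ is computed by solution of

$$L'(u) = 0 , \qquad (48)$$

where the prime indicates derivative. The variance of $u$ is
given by

$$\mathrm{Var}(u) = -\left[L''(u)\right]^{-1} \qquad (49)$$

(see Hogg & Craig, 1995). To compute the variance, we have

$$L''(u) = \sum_{j \in S}\left[\frac{p_j''(u)}{p_j(u)} - \left(\frac{p_j'(u)}{p_j(u)}\right)^{2}\right] . \qquad (50)$$

The derivatives necessary for computing (48) and (50) are
given by

$$\mathbf{P}_i(u) = \frac{e^{u}}{i}\sum_{j=0}^{i-1}\frac{\mathbf{A}\,\mathbf{P}_j(u)}{i-j+1} , \qquad (51)$$

where

$$\mathbf{A} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 2 & 1 \end{pmatrix} , \quad
\mathbf{P}_k(u) = \begin{pmatrix} p_k(u) \\ p_k'(u) \\ p_k''(u) \end{pmatrix} , \quad \text{and} \quad
\mathbf{P}_0(u) = \begin{pmatrix} e^{-e^{u}} \\ -e^{u}e^{-e^{u}} \\ (e^{u}-1)\,e^{u}e^{-e^{u}} \end{pmatrix} .$$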

To estimate $u$, the program solves (48) numerically by
Newton's method. This is achieved by iterating on $r$:

$$u_{r+1} = u_r - \frac{L'(u_r)}{L''(u_r)} . \qquad (52)$$

Once an estimate for $u$ is obtained, this estimate is used to
compute the variance by (49). From here, the program
assumes that $u$ is normally distributed to compute confidence
limits. (Stewart (1994) suggests that $u$ is more nearly
normal than $m$.) However, the program may be easily modified
to implement Chebyshev's inequality in order to avoid
assumptions about how $u$ is distributed. Lastly, the
estimate of the number of mutations, $m$, as well as
confidence limits on this estimate are obtained by
back-transformation, $m = e^{u}$. From here, the mutation rate
estimate and confidence limits are obtained from $\mu = m/N$.
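The estimation procedure described above can be sketched in a few lines. This is a minimal illustration, not the FT.EXE implementation: the function names, the step clamp in the Newton loop, the starting value, and the data set are all assumptions made for the example. The derivative recursion follows the structure of (51), with the three accumulators corresponding to the rows (1, 0, 0), (1, 1, 0), and (1, 2, 1) of the matrix A.

```python
import math

def p_with_derivatives(u, j_max):
    """Return [(p_j, p_j', p_j'')] for j = 0..j_max, where primes denote
    derivatives with respect to u = ln m."""
    m = math.exp(u)
    p0 = math.exp(-m)
    rows = [(p0, -m * p0, (m - 1.0) * m * p0)]          # P_0(u)
    for i in range(1, j_max + 1):
        a = b = c = 0.0
        for j in range(i):
            w = 1.0 / (i - j + 1)
            pj, d1, d2 = rows[j]
            a += w * pj                    # row (1, 0, 0) of A
            b += w * (pj + d1)             # row (1, 1, 0) of A
            c += w * (pj + 2.0 * d1 + d2)  # row (1, 2, 1) of A
        rows.append((m / i * a, m / i * b, m / i * c))
    return rows

def score_and_curvature(u, counts):
    """L'(u) and L''(u) for the observed mutant counts."""
    rows = p_with_derivatives(u, max(counts))
    score = sum(rows[j][1] / rows[j][0] for j in counts)
    curvature = sum(rows[j][2] / rows[j][0] - (rows[j][1] / rows[j][0]) ** 2
                    for j in counts)
    return score, curvature

def estimate_u(counts, u_start=0.0, tol=1e-12, max_iter=200):
    """Newton iteration on u; the clamp on the step is an added safeguard."""
    u = u_start
    for _ in range(max_iter):
        score, curvature = score_and_curvature(u, counts)
        step = max(-1.0, min(1.0, score / curvature))
        u -= step
        if abs(step) < tol:
            break
    variance = -1.0 / score_and_curvature(u, counts)[1]
    return u, variance

counts = [0, 1, 0, 3, 2, 0, 11, 1, 0, 5, 2, 1]   # made-up mutant counts
u_hat, var_u = estimate_u(counts)
half_width = 1.96 * math.sqrt(var_u)             # normal approximation on u
m_hat = math.exp(u_hat)
m_low, m_high = math.exp(u_hat - half_width), math.exp(u_hat + half_width)
print(f"m = {m_hat:.3f}  (95% limits {m_low:.3f} to {m_high:.3f})")
```

Dividing the estimate of $m$ and its limits by the final population number gives the corresponding mutation rate estimate and limits.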

Obtaining and using the program

The compiled program, FT.EXE, as well as the source
code, FT.BAS (or FT.TXT for ASCII format), may be obtained
at the following ftp site:
ftp://ftps.cdc.gov/pub/MuRates
The username and password needed to log on to this site are
“ncidftp” and “12emerG” (case sensitive), respectively.
Also at that site are: (i) FTDOC.TXT, documentation on how

to use the program, (ii) FORMAT.TXT, an outline of the input

format, and (iii) SAMPLE.DAT, an example data file.

APPENDIX 2.1

PROBABILITY OF SURVIVING DRIFT

In the first few generations of growth, a beneficial
mutation may be lost by random sampling events, or drift.
Haldane (1927) derived the probability of surviving drift
for a single beneficial mutation. His derivation made use
of a result from the theory of branching processes, which
states that probability of extinction (i.e., not surviving
drift) is obtained by solving the equation $f(\theta) = \theta$, where
$f(\theta)$ is the probability generating function for number of
offspring (see Ewens, 1969, p. 79). A simple assumption for
multicellular, sexual organisms is that this function
generates a Poisson distribution, in which case the
probability of survival of a beneficial mutation
approximates $2s$. Our analyses, however, are based on the
fundamental assumption of no recombination. We may further
restrict our analysis to a particular kind of asexual
organism, namely asexual bacteria. Bacteria reproduce by

binary fission, and so we derive the generating function as


follows. Our assumption of a constant population size (see
below) implies a sampling event every generation. Thus, a
bacterium that divides before sampling will leave zero, one,
or two offspring after sampling. In the case of bacteria,
therefore, the probability generating function for number of

offspring is

$$f(\theta) = (1 - c/2)^{2} + c\,(1 - c/2)\,\theta + (c/2)^{2}\,\theta^{2} , \qquad (53)$$

where $c$ is the expected number of offspring after division
and sampling. Thus, the probabilities of passing zero, one,
and two offspring to the next generation are, respectively,
$(1 - c/2)^{2}$, $c(1 - c/2)$, and $(c/2)^{2}$. The selective advantage of
the mutant is $s = \ln c$ by definition, or approximately
$s \approx c - 1$ when $s$ is small. Let $\pi(s)$ denote the probability
that a beneficial mutant survives drift. Then, by
substituting $1 + s$ for $c$ in (53) and solving the equation
$f(1 - \pi(s)) = 1 - \pi(s)$, we obtain $\pi(s) = \frac{4s}{(1+s)^{2}}$, which is
approximately $4s$ for small $s$. All derivations in this
dissertation employ the general notation, $\pi(s)$, whereas all
computations implement the approximation, $\pi(s) = 4s$.
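The closed form for the survival probability can be checked against the branching-process fixed point numerically. The sketch below is illustrative only; the function names are hypothetical, and it uses the small-$s$ substitution $c = 1 + s$ described above. Iterating the generating function from zero converges to the extinction probability, the smallest non-negative root of $f(\theta) = \theta$.

```python
def offspring_pgf(theta, c):
    """Generating function for 0, 1, or 2 surviving offspring after binary
    fission and sampling, with c the expected number of offspring."""
    half = c / 2.0
    return (1.0 - half) ** 2 + c * (1.0 - half) * theta + half ** 2 * theta ** 2

def survival_probability(s, iterations=20000):
    """1 minus the extinction probability, found by fixed-point iteration."""
    c = 1.0 + s
    theta = 0.0
    for _ in range(iterations):
        theta = offspring_pgf(theta, c)
    return 1.0 - theta

for s in (0.01, 0.05, 0.1):
    numeric = survival_probability(s)
    closed_form = 4.0 * s / (1.0 + s) ** 2
    print(f"s = {s}: numeric = {numeric:.6f}, 4s/(1+s)^2 = {closed_form:.6f}, 4s = {4 * s:.6f}")
```

For these values the numerical fixed point agrees with $4s/(1+s)^2$, and the simpler approximation $4s$ overestimates survival by a relative amount of roughly $2s$.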

APPENDIX 2.2

n-GENOTYPE LOGISTIC SYSTEM WITH MUTATION

General solution

Logistic dynamics of an n-genotype system are modeled by
assuming that (i) total population size is constant, i.e.,
$\sum_{i=1}^{n} x_i = N$, where $x_i$ is number of individuals of genotype $i$,
and (ii) the differences in Malthusian parameters are
constant:

$$m_i - m_1 = s_i , \quad i = 2,3,\ldots,n , \qquad (54)$$

where $m_i = \frac{1}{x_i}\frac{dx_i}{dt}$ and
$m_1 = \frac{1}{x_1}\frac{dx_1}{dt} = -\left(N - \sum_{j=2}^{n} x_j\right)^{-1}\sum_{j=2}^{n}\frac{dx_j}{dt}$.
Equation (54) may, therefore, be rewritten as:

$$\frac{1}{x_i}\frac{dx_i}{dt} + \left(N - \sum_{j=2}^{n} x_j\right)^{-1}\sum_{j=2}^{n}\frac{dx_j}{dt} = s_i , \quad i = 2,3,\ldots,n . \qquad (55)$$

This system of n-1 equations can be rearranged as follows:

$$\frac{dx_i}{dt} = x_i s_i - \frac{x_i}{N}\sum_{j=2}^{n} s_j x_j , \qquad (56)$$

where i = 2,3,...,n. While this system of equations is
non-linear, its symmetry makes an analytical solution possible.
The key to its solution is the transformation,
$X_i = \ln x_i - s_i t$. The system of equations now becomes:

$$\frac{dX_i}{dt} = -\frac{1}{N}\sum_{j=2}^{n} s_j\, e^{X_j + s_j t} , \qquad (57)$$

where i = 2,3,...,n. Thus, the time derivatives of all
transformed variables are equal:

$$\frac{dX_i}{dt} - \frac{dX_j}{dt} = 0 , \qquad (58)$$

where i,j = 2,3,...,n. Integration of (58) yields
$X_i - X_j = k_{ij}$, and $k_{ij}$ is a constant of integration that is
determined from initial conditions:

$$k_{ij} = X_i(0) - X_j(0) = \ln x_i(0) - \ln x_j(0) , \qquad (59)$$

where i,j = 2,3,...,n. Thus, the system of equations is
uncoupled by substituting $X_j$ from (57) with $X_i - k_{ij}$, which
yields:

$$\frac{dX_i}{dt} = -\frac{1}{N}\sum_{j=2}^{n} s_j\, e^{X_i - k_{ij} + s_j t} , \qquad (60)$$

where i = 2,3,...,n. From solution and subsequent
back-transformation of equation (60), the analytical solution of
an n-genotype logistic system is obtained:

$$x_i(t) = x_i(0)\, e^{s_i t}\left[1 + \frac{1}{N}\sum_{j=2}^{n} x_j(0)\left(e^{s_j t} - 1\right)\right]^{-1} , \qquad (61)$$

where i = 2,3,...,n, and

$$x_1(t) = N - \sum_{i=2}^{n} x_i(t) . \qquad (62)$$

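Application of boundary conditions due to mutation

If genotype $i$ appears by mutation at time $t_i$, then
boundary conditions are $x_i(t_i) = 1$. From these, the
initial conditions are determined; they are

$$\mathbf{x}_0 = \mathbf{R}^{-1}\mathbf{N} , \qquad (63)$$

where $\mathbf{x}_0$ is a vector whose elements are $x_i(0)$, i =
2,3,...,n, $\mathbf{R}$ is an (n-1) x (n-1) matrix whose elements are
$r_{ij} = N e^{s_i t_i}\delta_{ij} - e^{s_j t_i} + 1$, i,j = 2,3,...,n
(with $\delta_{ij}$ equal to one when $i = j$ and zero otherwise), and $\mathbf{N}$ is a vector
whose n-1 elements are the constant N.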

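Notation for the 3-genotype case

The developments in this appendix use a more general
notation than is used in the rest of the paper, where $x_1$ is
simply denoted by $x$, $x_2$ is denoted by $y$, and $x_3$ is denoted
by $z$. This 3-genotype case has the particular solution,

$$y(t) = y(0)\,e^{s_y t}\left[1 + \frac{y(0)}{N}\left(e^{s_y t} - 1\right) + \frac{z(0)}{N}\left(e^{s_z t} - 1\right)\right]^{-1} ,$$
$$z(t) = z(0)\,e^{s_z t}\left[1 + \frac{y(0)}{N}\left(e^{s_y t} - 1\right) + \frac{z(0)}{N}\left(e^{s_z t} - 1\right)\right]^{-1} . \qquad (64)$$

The initial conditions are determined from the boundary
conditions, $y(0) = 1$ and $z(t_z) = 1$; they are

$$y(0) = 1 , \qquad z(0) = \frac{e^{s_y t_z} + N - 1}{(N-1)\,e^{s_z t_z} + 1} . \qquad (65)$$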
 

APPENDIX 2.3

EXPECTED NUMBER OF CANDIDATE REPLICATIONS.

Here we derive the expected number of replications that
may generate superior mutations that prevent a given
beneficial mutation from attaining some frequency, f. We
have called these candidate replications, denoted by R, in
the subsection, Clonal interference - a general model. The
crucial step in the derivation of $R$ is finding an expression
for the time, $t_z$, at which a superior mutation must appear
if the original mutation is to attain a maximum frequency of
exactly $f$.

The time, $t_{max}$, at which $y$ reaches a maximum number is
determined from $\frac{dy}{dt} = 0$; it is

$$t_{max}(t_z) = \frac{1}{s_z}\ln\left[\frac{s_y N^{2}}{(s_z - s_y)\left(e^{(s_y - s_z)t_z} + N e^{-s_z t_z}\right)}\right] . \qquad (66)$$

When $e^{(s_y - s_z)t_z} < N e^{-s_z t_z}$ (i.e. when $e^{s_y t_z} < N$), equation (66)
is well approximated by omitting the term $e^{(s_y - s_z)t_z}$ from the
denominator, resulting in the approximation

$$t_{max}(t_z) \approx \frac{1}{s_z}\ln\left[\frac{N s_y}{s_z - s_y}\right] + t_z . \qquad (67)$$

The constraints under which this approximation works well
are discussed later.
We now calculate the time $t_z$ at which superior mutation
$z$ must appear if $y$ is to achieve a maximum of exactly $fN$.
The solution to $y(t_{max}(t_z)) = fN$ is

$$t_z = \frac{1}{s_y}\ln\left[\frac{N\left(1 + \frac{s_y}{s_z - s_y}\right)}{\left(\frac{1}{f} - 1\right)\left(\frac{N s_y}{s_z - s_y}\right)^{s_y/s_z}}\right] . \qquad (68)$$

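Next, we use the fact that the expected number of candidate
replications, $R$, given $s_y$ and $s_z > s_y$, is well approximated
by evaluating $R$ at $s_z = \langle s_z \mid s_z > s_y \rangle = s_y + \frac{1}{\alpha}$, to derive the
expected number of candidate replications:

$$R \approx \int_0^{\tilde{t}_z} x(t)\,dt = \frac{N}{s_y}\ln N - \frac{N}{s_y}\ln\left[1 + \frac{1}{1 + \alpha s_y}\left(\frac{1}{f} - 1\right)\left(\alpha s_y N\right)^{\frac{\alpha s_y}{1 + \alpha s_y}}\right] , \qquad (69)$$

where $\tilde{t}_z$ is simply $t_z$ evaluated at $s_z = s_y + \frac{1}{\alpha}$.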

The approximation made in equation (67) is, for our
purposes, essentially an equality when $e^{s_y t_z} < N$. If we
combine this condition with equation (68), then the
approximation works well only when the frequency $f$ meets the
following condition:

$$f < \left[\frac{1}{N}\left(\frac{N s_y}{s_z - s_y}\right)^{1 - \frac{s_y}{s_z}} + 1\right]^{-1} . \qquad (70)$$

If we let $s_z = \langle s_z \mid s_z > s_y \rangle = s_y + \frac{1}{\alpha}$, and if we simplify the
notation so that $s = s_y$, then the above condition becomes

$$f < \left[\frac{1}{N}\left(\alpha s N\right)^{\frac{1}{1 + \alpha s}} + 1\right]^{-1} . \qquad (71)$$

This upper bound on $f$ reaches a minimum value when its derivative with respect to $s$ is zero,
so that an overall bound below which the approximation works
well is obtained by solving for the value of $s$ that
satisfies $\ln(\alpha s N) = 1 + \frac{1}{\alpha s}$ and using that value in
equation (71). In general, the approximation is valid when
$f < 0.95$ provided that $N$ is greater than $10^{4}$. For the
purposes of this paper, the approximation is essentially an
equality because we are concerned only with the cases
$f = 0.01$ and $f = 0.5$, for which the approximation works
extremely well. We compute fixation probabilities, i.e.,
the boundary case $f > \frac{N-1}{N}$, using the simpler derivations in
Clonal interference and fixation.
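The overall bound quoted above can be reproduced numerically. The sketch below assumes the bound takes the form $f < \left[\frac{1}{N}(\alpha s N)^{1/(1+\alpha s)} + 1\right]^{-1}$, which depends on $s$ only through the product $a = \alpha s$; the function names and the bisection bracket are assumptions made for this illustration.

```python
import math

def f_bound(a, N):
    """Upper bound on f as a function of a = alpha * s and N."""
    return 1.0 / ((a * N) ** (1.0 / (1.0 + a)) / N + 1.0)

def stationarity(a, N):
    """Zero where ln(alpha s N) = 1 + 1/(alpha s); increasing in a."""
    return math.log(a * N) - 1.0 - 1.0 / a

def minimizing_a(N, lo=1e-6, hi=10.0, iterations=200):
    """Bisection for the root of the stationarity condition."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if stationarity(mid, N) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def overall_bound(N):
    """Smallest value of the upper bound over all s, for a given N."""
    return f_bound(minimizing_a(N), N)

for N in (1e4, 1e7):
    a_star = minimizing_a(N)
    print(f"N = {N:.0e}: alpha*s = {a_star:.4f}, overall bound on f = {overall_bound(N):.4f}")
```

For $N = 10^4$ the overall bound comes out just under 0.95, and it moves closer to one as $N$ increases.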

APPENDIX 2 . 4

FUNCTIONS EMPLOYING THE RECTANGULAR DISTRIBUTION

We present here the results only of the derivations in which
a rectangular distribution replaces the exponential
distribution of beneficial mutational effects. The
probability of fixation of an arbitrarily chosen beneficial

mutation is:

 

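$$\Pr\{\mathrm{fix} \mid s_{max}, \mu, N\} = \frac{1}{s_{max}}\int_0^{s_{max}} \pi(s)\, e^{-\lambda_R(s,\, s_{max},\, \mu,\, N)}\, ds , \qquad (72)$$

where $\lambda_R(s, s_{max}, \mu, N) = \frac{\mu}{s}\, N \ln N\; \pi\!\left(\frac{s_{max}^{2} - s^{2}}{2 s_{max}}\right)$ (assuming that
$\pi(u)$ is approximately linear). The expected rate of
substitution of beneficial mutations is:

$$\langle \sigma_R(s_{max}, \mu, N) \rangle = \mu N \Pr\{\mathrm{fix} \mid s_{max}, \mu, N\} . \qquad (73)$$

The expected selection coefficient of successful mutations
is:

$$\langle s_R(s_{max}, \mu, N) \rangle = \frac{\int_0^{s_{max}} s\,\pi(s)\, e^{-\lambda_R(s,\, s_{max},\, \mu,\, N)}\, ds}{\int_0^{s_{max}} \pi(s)\, e^{-\lambda_R(s,\, s_{max},\, \mu,\, N)}\, ds} , \qquad (74)$$

where $\lambda_R(s, s_{max}, \mu, N)$ is as defined above for equation (72).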

The expected number of superior mutations in the interval

(o,£) is:
Z

2 2
u N Sam‘s
w (SIS (uprf) : — Nln[—) n —_ . (75)
R max S

The expected number of superior mutations in the interval

(ﬁg) is:

u S ‘S

: ._ 76
YR(SI Smulu,N,f) S Nln(xR) II ( )


2:
h - 1 + — 1-1 23 N - 2 Th
w ere )(R - 5 +5 (sum 5) (TC ) s -s s . e

 

 

probability that an arbitrarily chosen beneficial mutation

transiently achieves polymorphic frequency (f>0.01) is:

Pr{polylsmx, 11: N} =

l ’” -W(Ls ,mmoxn) -ﬂ(Ls ,mmoxn) (77)
—rn(s)eR ...... (l-e“ m )ds.

3
max0

Finally, the probability that an arbitrarily chosen
beneficial mutation transiently achieves majority status is

obtained by replacing 0.01 in equation (77) with 0.5.

APPENDIX 3.1

ADAPTIVE SUBSTITUTIONS IN ASEXUAL POPULATIONS
ARE ACCURATELY MODELED AS INSTANTANEOUS REPLACEMENTS

Conveniently, the continuous process of adaptive
substitution in asexual populations is well approximated by
a discrete process of instantaneous replacement. Lenski et
al. (1991) suggested the use of such an approximation. They
reasoned that in a large population the frequency of a
substituting variant would remain very low for a
considerable time and then rise sharply, thus approximating
a step function.

Define time of adaptive substitution, $t^{*}$, as the time
at which the frequency of a beneficial mutation achieves
0.5. Then, the process of adaptive substitution may be
approximated by a simplified scenario in which the
substituting variant remains at a frequency of zero until
time $t^{*}$ and assumes a frequency of one thereafter. That we
have defined $t^{*}$ appropriately is evidenced by the symmetry
of logistic growth. More generally, let $t_i^{*}$ denote the time
of the $i$th adaptive substitution in an evolving population.
Then populational processes are well approximated by
assuming the population to be numerically dominated by a
single "wildtype" variant throughout the interval ($t_{i-1}^{*}$,
$t_i^{*}$), for every $i$.

Suppose a beneficial mutation appears in a population
at time t=0 and subsequently spreads to fixation. Let $p$
denote the frequency of the beneficial allele. Then, if the
unit of time is generations, the dynamic equation for the
growth of the beneficial mutation is $dp/dt = sp(1-p)$, where
$s$ is the selection coefficient, and $p(0) = 1/N$. Define time
until fixation, $t_f$, as that which satisfies $p(t_f) = (N-1)/N$.
The time of substitution, $t^{*}$, as defined above, is that
which satisfies $p(t^{*}) = 0.5$; it is $t^{*} = \ln(N)/s$. Thus,
according to the continuous substitution process, the total
number of wildtype replications after time t=0 is given by

$$N\int_0^{t_f}\left[1 - p(t)\right]dt = \frac{N\ln(N)}{s} = N t^{*} .$$

The discrete approximation to the above substitution
process is given by the equation,

$$p(t) = \begin{cases} 0 , & t \le t^{*} \\ 1 , & t > t^{*} \end{cases}$$

where $t^{*}$ is as defined above. According to this approximation,
the number of wildtype replications after time t=0 is simply
$N t^{*}$, which is exactly equal to the number obtained for the
continuous process.

We conclude that (i) the time of substitution, $t^{*}$, is
appropriately defined and (ii) the continuous process of
substitution is closely approximated by a discrete process
of instantaneous replacement at time $t^{*}$.
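The equality between the continuous and discrete counts of wildtype replications can be verified by direct quadrature. The following sketch is illustrative; the function name, the trapezoid rule, and the parameter values are assumptions. It integrates $1 - p(t)$ for logistic growth from $p(0) = 1/N$ up to the time at which $p = (N-1)/N$.

```python
import math

def wildtype_replications(N, s, steps=200000):
    """N * integral of [1 - p(t)] from 0 to t_f by the trapezoid rule, where
    p(t) = 1 / (1 + (N - 1) exp(-s t)) and p(t_f) = (N - 1) / N."""
    t_f = 2.0 * math.log(N - 1.0) / s

    def wildtype_fraction(t):
        e = (N - 1.0) * math.exp(-s * t)
        return e / (1.0 + e)

    h = t_f / steps
    total = 0.5 * (wildtype_fraction(0.0) + wildtype_fraction(t_f))
    for k in range(1, steps):
        total += wildtype_fraction(k * h)
    return N * h * total

N, s = 3e7, 0.1                      # assumed example values
continuous = wildtype_replications(N, s)
discrete = N * math.log(N) / s       # N * t_star with t_star = ln(N) / s
print(f"continuous: {continuous:.6e}   discrete: {discrete:.6e}")
```

For a population of this size the two counts differ by a negligible relative amount, since the exact integral is $N\ln(N-1)/s$.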

APPENDIX 3.2

ANALYTICAL INTEGRATION OF EQUATIONS (21) AND (22).

To analytically integrate equation (21), a more precise
notation is essential. Given fixation ordering, $\zeta$, the $i$th
mutation fixed will be followed by a number of fixations of
mutations that are superior to the $i$th mutation. We have
denoted the set of such superior mutations as $S_i$. Let the
subscript $S$ indicate membership in set $S_i$, such that $s_S(k)$
denotes the $k$th member of this set. Now, order this set of
selection coefficients such that {$s_S(1) > s_S(2) > \ldots >
s_S(\#S_i)$}, where $\#S_i$ denotes the cardinality of set $S_i$. The
$i$th mutation fixed will also be followed by a number of
fixations of mutations that are inferior to the $i$th
mutation. We have denoted the set of such inferior
mutations as $I_i$. Let the subscript $I$ denote membership in
set $I_i$, such that $s_I(k)$ denotes the $k$th member of this set.
Now, order this set of selection coefficients such that
{$s_I(1) > s_I(2) > \ldots > s_I(\#I_i)$}, where $\#I_i$ denotes the
cardinality of set $I_i$. With this new notation, the integral
in equation (21) may be rewritten as

 

lnN/s (1) ”‘1
I lnN
f exp{-(ti+ S N)£r Si(k)}g(t )dt

0 1 k'1

lnN/51(2) 1n
f exp -(ti+

+

 

lnN
N)£rs (k) ' (t1 - ———)-81)r( (1)}g(t )dt
SI

lnN/31(1) 1 k-l

lnN/s (3)
I In
I exp{- ”:1 +

 

+

2 lnN
N)£r 5(k) ‘ z(ti-m)rr(k)]g(ti)dt

 

 

lnN/3H2) Si 101 k-l (78)
lnN/s (in) lnN "[1 lnN
+ exp-(t.+ )Er (k) - 2 (ti———k—)rI(k) g(ti)
lnN/51111114) 151 k'1 k'1 SI( )
°° lnN ”1 lnN
+ f exp-(t.+ N12r<k1 - Eu: - ——)r (k) g<t.)dt.
lnN/sIHlI‘) 13; k-1 k-1 51”“ I l 1
Thus, when analytically integrated, (21) becomes
#1 1' #s .
lnN lnN
¢ =ﬁ —ex -—er (k) + t—r k)
(z) ..(2; 12 p is k_l k=lsl(k) I(
(79)

x ex -R—lnN - ex -R———lnN
p s10) p s,(j+1) ’
#s

where R: r + XrS(k) + tr (k), 31(0) =00 and sI(#I‘,+1)= 0.
k=1 k=l

We now derive the limit as selection coefficients

converge. Let the ordered set of selection coefficients be


evenly spaced such that sf==s_l-+e. Then, the limit as

selection coefficients converge of the probability of a

given fixation ordering is

lim€*0¢(() =

1
n[mexp{-#SiuiKNlnN} l - exp{-(#Si+1)uiKNlnN}] (80)

 

1
#5.. +#I‘, +1 exp{ - (2#Si +1) uiKNlnN}] ,

where $K$ denotes the linear constant in the drift survival
function, and the intercept of that function is assumed to
equal zero, i.e., $\pi(s) = Ks$. The mutation rate to each
beneficial mutation is assumed to be equal such that
$\mu_i = \mu/M \;\; \forall\, i$, where $\mu$ is the overall beneficial mutation
rate. When population number is very large, such that $\mu N > 1$,
this limit is closely approximated by

NC) “

f

1
T1-—————exp{-#Siiﬁmnnﬁn , not.rank-ordered

 

, rank-ordered

 

 

M—i+1) exp{ -11: KNlnN}


Equation (22) may be integrated analytically by
decomposition of the integral into intervals as before. The
appropriate decomposition is achieved by simply multiplying

each integrand of (78) by ti. Then, analytical integration

 

 

yields
#1 r' #s .
- lnN 1 lnN
E(T.|§) = 24exp -—Xr (k) + t—r (k)
t I: R 8,- k=l S k=1 51(k) I
I (82)
-1131 -RJL
: (I) lnN 1 3 (j +1) lnN 1
x e I -———e—+—- - e I -———T———+—- ,
31(3) R sI(_7+l) R
#31 .
where R=r +Xr(k) + ﬁrm), 5(0) =00 and s(#1',+1) =0.
l k=l S k=1 1 I I a
In the limit as selection coefficients converge, this
expected time is reduced to
limbthiIC) =
exp{-#SiuKNlnN} 1 lnN 1
#si+l §-( +ﬁ)eXp{-(#Si+l)uKNlnN} (83)

exp{ - (#S‘, +#Ii ) uKNlnN} lnN
#%+#E+1

 

1
+73) exp{ - (#Si +#Ii +1) uKNlnN}

where s is the average selection coefficient (to which all

sj converge) .


APPENDIX 4.1

EFFECTIVE MUTATION-SELECTION BALANCE

Let $\phi(t)$ denote the frequency of a given deleterious
mutation, where t=0 at the time of the most recent adaptive
substitution (see below for precise definition). Given that
the time until the next adaptive substitution is $t_s$, the
average mutation selection balance during the time interval
between the most recent and the next substitution is
$\frac{1}{t_s}\int_0^{t_s}\phi(t)\,dt$. Let $\sigma$ denote the expected rate of adaptive
substitutions. Then, to a first approximation, we define
effective mutation selection balance as

$$\phi_e = \sigma\int_0^{1/\sigma}\phi(t)\,dt . \qquad (84)$$

Suppose a beneficial mutation appears on the wildtype
background at time t = 0. As the number of individuals
carrying this beneficial mutation grows, they produce at
rate $\mu_D$ a deleterious mutation, which has selective
disadvantage $s_D$. Assuming constant population size, $N$, the
frequency, $\phi$, of this deleterious mutation on the
beneficial background is given by the dynamic equation,

$$\frac{d\phi}{dt} = \left(\left[(N-1)\,e^{-st} + 1\right]^{-1}\mu_D - s_D\phi\right)(1 - \phi) , \qquad (85)$$

with initial condition $\phi(0) = 0$, reflecting the fact that
the beneficial mutation occurs on the wildtype background.
It is commonly the case that $\phi \ll 1$, for which the above
equation is well approximated by

$$\frac{d\phi}{dt} = \begin{cases} \mu_D - s_D\phi , & t \ge t_{1/2} \\ 0 , & t < t_{1/2} \end{cases} \qquad (86)$$

where $t_{1/2} = \frac{\ln N}{s}$ is the time necessary for the beneficial
mutation to attain a frequency of 0.5, and again this
equation has initial condition $\phi(0) = 0$. For convenience,
we now shift our time axis such that the time at which the
beneficial mutation achieves frequency 0.5 is zero, i.e.,
$t_{1/2} = 0$. We define this time as the time of substitution.

That equation (85) is well approximated by (86) is good
evidence that the time of substitution, as we have defined
it, marks a "resetting" of genetic variability in the
population. Figure 4.7 plots the frequency of a deleterious
mutation in a population during a substitutional event. The
vertical dotted line marks the time of substitution as
defined here (the t=0 axis in the shifted coordinate
system). Note the close agreement between exact and
approximate solutions.
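
As a rough numerical check of the agreement described above, the following Python sketch integrates equation (85) and its approximation (86) on the unshifted time axis with a fixed-step Runge-Kutta scheme. The function names and all parameter values are illustrative assumptions and are not taken from the text or from Figure 4.7.

```python
import math

def dphi_exact(t, phi, N, s, mu_d, s_d):
    """Right-hand side of equation (85)."""
    beneficial_freq = 1.0 / ((N - 1.0) * math.exp(-s * t) + 1.0)
    return (beneficial_freq * mu_d - s_d * phi) * (1.0 - phi)

def dphi_approx(t, phi, N, s, mu_d, s_d):
    """Right-hand side of equation (86): no mutational input before t_half."""
    t_half = math.log(N) / s
    return mu_d - s_d * phi if t >= t_half else 0.0

def integrate(rhs, t_end, dt, *params):
    """Fourth-order Runge-Kutta integration from phi(0) = 0 up to t_end."""
    t, phi = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(t, phi, *params)
        k2 = rhs(t + dt / 2, phi + dt * k1 / 2, *params)
        k3 = rhs(t + dt / 2, phi + dt * k2 / 2, *params)
        k4 = rhs(t + dt, phi + dt * k3, *params)
        phi += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += dt
    return phi

if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    N, s, mu_d, s_d = 1e6, 0.1, 1e-4, 0.02
    for t_end in (200.0, 300.0, 500.0):
        exact = integrate(dphi_exact, t_end, 0.05, N, s, mu_d, s_d)
        approx = integrate(dphi_approx, t_end, 0.05, N, s, mu_d, s_d)
        print(t_end, exact, approx)
```

Under these assumed values the two curves differ most shortly after the time of substitution, where the logistic rise of the beneficial mutation is replaced by a step, and converge as both approach $\mu_D/s_D$.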

In our shifted coordinate system, equation (86) becomes
$\frac{d\phi}{dt} = \mu_D - s_D\phi$. When the initial condition, $\phi(0) = 0$, is
applied, the solution is $\phi = \frac{\mu_D}{s_D}\left(1-e^{-s_D t}\right)$. This solution
is plotted in Figure 4.1A for two different values of $s_D$;
note that as $t \to \infty$, $\phi \to \mu_D/s_D = \hat{\phi}$. Given that the time between
two substitutions is $t_s$, the effective mutation-selection
balance during that time is

$$\sigma\int_0^{1/\sigma}\phi(t)\,dt = \frac{\mu_D}{s_D}\left[1-\frac{\sigma}{s_D}\left(1-e^{-s_D/\sigma}\right)\right], \tag{87}$$

where $\sigma$ is the rate of adaptive substitutions as determined
by Gerrish & Lenski (1997):

$$\sigma = \alpha\mu_B N\int_0^{\infty}\pi(s)\exp\left\{-\frac{\mu_B}{s}N\ln(N)\,e^{-\alpha s}\,\pi\!\left(s+\tfrac{1}{\alpha}\right)-\alpha s\right\}ds . \tag{88}$$

This expression makes the assumption that selection
coefficients for beneficial mutations are exponentially
distributed with parameter $\alpha$. The function $\pi(s)$ describes
the probability that a beneficial mutation of selective
advantage, $s$, is not lost by drift. Assuming constant total
population size and Poisson-distributed offspring, this
function is approximately $\pi(s) \approx 2s$ (Haldane, 1927). For
bacteria, which reproduce by binary fission, $\pi(s) \approx 4s$
(Gerrish & Lenski, 1998).
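
Equations (87) and (88) can be evaluated together numerically. The sketch below is a minimal illustration, assuming $\pi(s) = 4s$ and a simple midpoint rule truncated at $s = 20/\alpha$; the function names, the truncation point, and the parameter values in the usage lines are assumptions made for this example only.

```python
import math

def pi_fix(s):
    """Probability of escaping drift, pi(s) ~ 4s (binary fission)."""
    return 4.0 * s

def substitution_rate(mu_b, n, alpha, steps=20000):
    """Equation (88) by the midpoint rule on (0, 20/alpha]."""
    s_max = 20.0 / alpha  # assumed cutoff: exp(-alpha*s) is negligible beyond it
    h = s_max / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        interference = (mu_b * n * math.log(n) / s) \
            * math.exp(-alpha * s) * pi_fix(s + 1.0 / alpha)
        total += pi_fix(s) * math.exp(-interference - alpha * s)
    return alpha * mu_b * n * total * h

def effective_balance(mu_d, s_d, sigma):
    """Closed form on the right-hand side of equation (87)."""
    a = s_d / sigma
    # 1 - (1/a)(1 - exp(-a)), written with expm1 for numerical accuracy
    return (mu_d / s_d) * (1.0 + math.expm1(-a) / a)

if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    sigma = substitution_rate(mu_b=2e-9, n=3.3e7, alpha=35.0)
    print("sigma =", sigma)
    print("phi_e =", effective_balance(mu_d=1e-4, s_d=0.02, sigma=sigma))
```

When substitutions are rare ($\sigma \ll s_D$) the bracketed factor in (87) approaches one and the effective balance approaches $\mu_D/s_D$; when substitutions are frequent the factor falls toward zero.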

APPENDIX 4.2

EFFECTIVE POPULATION NUMBER UNDER A SERIAL TRANSFER REGIME

I derive the effective population number with respect
to non-neutral mutations. First, I find the probability of
fixation¹ of a beneficial mutation in a population of
constant number. Then, I find this probability for a
population that is subject to periodic bottlenecks and grows
exponentially between these bottlenecks, as in a serial
transfer regime. Equating these two probabilities gives an
expression for effective population number. Conveniently,
the same equations apply for deleterious mutations as well,
implying that this effective population number is general
for non-neutral mutations.

¹ By probability of fixation, we mean the probability that the mutant gene in
question is not lost by drift. In an asexual system, the mutant gene may
be lost as a result of competition with alternative beneficial mutant genes
(Haigh, 1978; Gerrish & Lenski, 1997); such competition, however, becomes
important only when frequencies become relatively high, at which point
stochastic effects are negligible. Because effective population size is a
stochastic equivalent, calculations here do not take such competition
between beneficial mutations into account.

Employing a diffusion approximation, the Kolmogorov
backward equation is solved to find the ultimate probability
of fixation $u(p)$ given starting frequency $p$. (The
Kolmogorov backward equation is derived in Chapter 8 of Crow
& Kimura (1970).) Let $u(p,t)$ denote the probability that a
mutant gene becomes fixed by the $t$-th generation, given that
its starting frequency is $p$. Then, the diffusion
approximation for this probability is given by

$$\frac{\partial u(p,t)}{\partial t} = \mu\frac{\partial u(p,t)}{\partial p} + \frac{\sigma^2}{2}\frac{\partial^2 u(p,t)}{\partial p^2} . \tag{89}$$

We are interested in the ultimate probability of fixation
(probability that the mutant gene ever achieves fixation),
given by $u(p) = \lim_{t\to\infty}u(p,t)$, for which $\partial u/\partial t = 0$ and which
therefore satisfies

$$\mu\frac{du(p)}{dp} + \frac{\sigma^2}{2}\frac{d^2u(p)}{dp^2} = 0 , \tag{90}$$

with boundary conditions, $u(0) = 0$ and $u(1) = 1$. The
solution which satisfies these boundary conditions is

$$u(p) = \frac{\int_0^p G(x)\,dx}{\int_0^1 G(x)\,dx} , \tag{91}$$

where $G(x) = \exp\left\{-\int\frac{2\mu}{\sigma^2}\,dx\right\}$. Let $x_t$ denote frequency of the
mutant gene at time $t$, and let $\delta x_t$ denote the change in $x_t$
between generations $t$ and $t+1$, i.e. $x_{t+1} = x_t + \delta x_t$. Then $\mu$
and $\sigma^2$ may be understood as expectations $E(\delta x_t)$ and
$E[(\delta x_t)^2]$, respectively. Given that the system is asexual
and that the mutant gene in question has selective advantage
$s$, the change in mean frequency is simply $\mu = N_e x s/N_e = sx$.

To calculate $\sigma^2$, consideration must be given to both the
mode of reproduction and the predominant mode of
selection. At this point, I restrict the derivation to the
case of reproduction by binary fission; conveniently,
however, the result approximates that for other modes of
reproduction, implying generality. If selection occurs
mainly by differential growth, then the change in variance
is $\sigma^2 = 2N_e x(1+s)\,\tfrac{1}{2}\left(1-\tfrac{1}{2}\right)/N_e^2 = \frac{x(1+s)}{2N_e}$, the binomial variance
in which $2N_e x(1+s)$ offspring are sampled with probability
$\tfrac{1}{2}$. If selection occurs mainly by differential death (or
differential sampling), then the change in variance is
$\sigma^2 = 2N_e x\left(\frac{1+s}{2}\right)\left(\frac{1-s}{2}\right)/N_e^2 = \frac{x(1-s^2)}{2N_e}$, the binomial variance in
which $2N_e x$ offspring are sampled with probability $(1+s)/2$.

Thus, $\frac{2\mu}{\sigma^2} = 4N_e f(s)$, where $f(s) = \frac{s}{1+s}$ for differential growth
or $f(s) = \frac{s}{1-s^2}$ for differential death, and (91) becomes

$$u(p) = \frac{1-\exp\{-4N_e f(s)\,p\}}{1-\exp\{-4N_e f(s)\}} . \tag{92}$$

 

Given ploidy number, $\phi$, the starting frequency of an
individual mutant gene is $p = 1/(\phi N)$, where $N$ is the actual
number of individuals present in the population at the time
of mutation. When this starting frequency is inserted into
(92), the probability of fixation of an individual mutant
gene is closely approximated by

$$u \approx 4\frac{N_e}{\phi N}s , \tag{93}$$

regardless of the choice of $f(s)$. (The sexual case,
assuming Poisson-distributed offspring, yields the same
approximation; see Crow & Kimura, 1970.) Rearrangement of
this equation gives an expression for effective population
number, $N_e = \frac{u\phi N}{4s}$.
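
The chain from (91) to (93) can be checked numerically. The following Python sketch is an illustration under assumed parameter values: it integrates (91) by Simpson's rule with $G(x) = \exp\{-4N_e f(s)x\}$, compares the result with the closed form (92), and then compares (92) at $p = 1/(\phi N)$ with the approximation (93). All function names are hypothetical.

```python
import math

def f_growth(s):
    """f(s) for selection by differential growth."""
    return s / (1.0 + s)

def f_death(s):
    """f(s) for selection by differential death."""
    return s / (1.0 - s * s)

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def u_quadrature(p, n_e, f):
    """Equation (91) with G(x) = exp(-4 N_e f x), by numerical integration."""
    G = lambda x: math.exp(-4.0 * n_e * f * x)
    return simpson(G, 0.0, p) / simpson(G, 0.0, 1.0)

def u_closed(p, n_e, f):
    """Equation (92)."""
    c = 4.0 * n_e * f
    return math.expm1(-c * p) / math.expm1(-c)

def u_approx(s, n_e, n, ploidy=1):
    """Equation (93)."""
    return 4.0 * n_e * s / (ploidy * n)

def effective_number(u, s, n, ploidy=1):
    """Rearrangement of (93): N_e = u * ploidy * N / (4 s)."""
    return u * ploidy * n / (4.0 * s)

if __name__ == "__main__":
    s = 0.01  # hypothetical selective advantage
    print(u_quadrature(0.01, 1000, f_growth(s)), u_closed(0.01, 1000, f_growth(s)))
    for f in (f_growth(s), f_death(s)):
        print(u_closed(1.0 / 1e6, 1e6, f), u_approx(s, 1e6, 1e6))
```

For small $s$ both choices of $f(s)$ are close to $s$, which is why the approximation (93) does not depend on the mode of selection.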

If the actual population number, $N_t$, fluctuates over
time, as in a serial transfer regime, then the effective
population number will also fluctuate. To leave the
effective population number as a function of time, however,
would defeat the purpose of having an effective population
number, which is to simplify the math. Thus, we employ the
geometric mean effective population number,
$\ln(\bar{N}_e) = \frac{1}{\tau}\int_0^{\tau}\ln\!\left(\frac{u\phi N_t}{4s}\right)dt$, where $\tau$ is the period of time
between serial transfers.

If the population grows exponentially between
transfers, then $N_t = N_0 e^{rt}$, where $r$ is the exponential
growth parameter. For continuous growth of bacteria, $r = 1$;
for discrete growth of bacteria, $r = \ln 2$.

It remains to derive the probability of fixation, $u$.
Let $v$ denote the probability that the beneficial mutation in
question does not survive the effects of random sampling,
such that $u = 1 - v$. The number of offspring of the
beneficial mutant that make it through the first population
bottleneck (due to the first transfer dilution) is a Poisson
random variable. The Poisson parameter of this random
variable is the product of (i) the number of offspring of
the beneficial mutant just before dilution, and (ii) the
dilution factor, $D$. Given that the beneficial mutant
appears at time $t$, factor (i) is $e^{r(1+s)(\tau-t)}$. Thus, the
Poisson parameter for the number of offspring of the
beneficial mutant that are sampled at the first dilution is
$e^{r(1+s)(\tau-t)}D$. For convenience, I break this parameter into
two factors, $A = e^{r(1+s)\tau}D$ and $\gamma_t = e^{-r(1+s)t}$.

The total probability of loss of the beneficial
mutation is then given by the following logic. Either zero
offspring are transferred at the first dilution, or one
offspring is transferred and lost in subsequent dilutions, or
two offspring are transferred and both lineages are lost in
subsequent dilutions, and so on. The probability that
zero offspring are transferred at the first dilution is
$e^{-A\gamma_t}$. The probability that one offspring is transferred at
the first dilution is $A\gamma_t e^{-A\gamma_t}$. Let $x$ denote the probability
that one lineage is lost in subsequent dilutions. Then the
probability that one offspring is transferred at the first
dilution and its lineage is lost in subsequent dilutions is
$A\gamma_t e^{-A\gamma_t}x$. Likewise, the probability that two offspring are
transferred at the first dilution and both lineages are lost
in subsequent dilutions is $\tfrac{1}{2}(A\gamma_t)^2 e^{-A\gamma_t}x^2$. The same logic
applies for any number of offspring transferred at the first
dilution. Thus the total probability of loss is given by
the sum,

$$v = \sum_{i=0}^{\infty}\frac{(A\gamma_t)^i}{i!}e^{-A\gamma_t}x^i = e^{A\gamma_t(x-1)} . \tag{94}$$
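
The closed form in (94) is the Poisson probability generating function evaluated at $x$. A minimal Python sketch, with hypothetical function names and arbitrary test values, confirms that the term-by-term sum described above converges to it:

```python
import math

def loss_prob_series(a_gamma, x, terms=200):
    """Sum over i of Poisson(i; A*gamma_t) * x**i, built term by term."""
    total = 0.0
    term = math.exp(-a_gamma)  # i = 0: no offspring transferred
    for i in range(terms):
        total += term
        term *= a_gamma * x / (i + 1)  # ratio of term i+1 to term i
    return total

def loss_prob_closed(a_gamma, x):
    """Right-hand side of equation (94)."""
    return math.exp(a_gamma * (x - 1.0))

if __name__ == "__main__":
    # Arbitrary illustrative values for A*gamma_t and x.
    print(loss_prob_series(3.0, 0.7), loss_prob_closed(3.0, 0.7))
```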

The remaining unknown is $x$, the probability that a single
lineage is lost in subsequent dilutions.

If one beneficial mutant is transferred at the first
dilution, then the number of its offspring present just
before the next dilution is $e^{r(1+s)\tau}$, such that the Poisson
parameter for the number of its offspring that make it
through that dilution is $e^{r(1+s)\tau}D$, or simply $A$. Each of
those offspring that make it through that dilution will have
the same Poisson distribution of offspring after the next
dilution, etc. This is called a branching process (see p.
58 in Bailey, 1964). A result of branching process theory
is that the probability of extinction is given by the
smallest positive root of the equation $g(x) = x$, where $g(x)$
is the probability generating function for the number of
offspring produced by an individual in one "generation". If
we define one "generation" as starting just after one
dilution and ending just after the next, then the
probability generating function for the number of offspring
left by one individual after one "generation" is
$g(x) = \sum_{i=0}^{\infty}\frac{A^i}{i!}e^{-A}x^i = e^{A(x-1)}$. Thus, the lineage started by

one beneficial mutant transferred at the first dilution is
lost in a subsequent dilution with probability $x$, the
smallest positive root of the equation, $e^{A(x-1)} = x$. When
this probability is determined, then the total probability
of survival may be calculated:
$u = 1 - v = 1 - e^{A\gamma_t(x-1)} = 1 - x^{\gamma_t}$. From here, the effective
population number may be calculated:

$$\bar{N}_e = \exp\left\{\frac{1}{\tau}\int_0^{\tau}\ln\left[\frac{\phi N_0}{4s}e^{rt}\left(1-x^{\gamma_t}\right)\right]dt\right\} . \tag{95}$$
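
Equation (95) has no closed form because $x$ is defined only implicitly, but it is straightforward to evaluate numerically. The sketch below is one possible approach, not a method described in the text: it finds $x$ by fixed-point iteration of the generating function starting from zero (which converges to the smallest positive root) and evaluates the integral by the midpoint rule. The function names and the example regime (100-fold dilution, doubling time as the time unit, $N_0 = 5 \times 10^6$, $s = 0.01$) are assumptions chosen for illustration.

```python
import math

def extinction_prob(a, tol=1e-14, max_iter=100000):
    """Smallest positive root of exp(a*(x-1)) = x, iterating g from x = 0.
    Intended for a > 1; for a <= 1 the root is 1 and convergence is slow."""
    x = 0.0
    for _ in range(max_iter):
        x_new = math.exp(a * (x - 1.0))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def survival_prob(t, x, r, s):
    """u = 1 - x**gamma_t for a mutant arising at time t within a cycle."""
    gamma_t = math.exp(-r * (1.0 + s) * t)
    return -math.expm1(gamma_t * math.log(x))

def mean_effective_number(n0, r, s, tau, dilution, ploidy=1, steps=2000):
    """Equation (95): geometric mean of u * ploidy * N_t / (4 s) over one cycle."""
    a = math.exp(r * (1.0 + s) * tau) * dilution
    x = extinction_prob(a)
    h = tau / steps
    log_sum = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        u = survival_prob(t, x, r, s)
        log_sum += math.log(ploidy * n0 * math.exp(r * t) * u / (4.0 * s))
    return math.exp(log_sum * h / tau)

if __name__ == "__main__":
    r = math.log(2.0)            # time measured in generations (assumed)
    tau = math.log(100.0) / r    # cycle length that offsets a 100-fold dilution
    print(mean_effective_number(n0=5e6, r=r, s=0.01, tau=tau, dilution=0.01))
```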

When $|A\gamma_t(x-1)|$ is small, then the following
approximations are satisfactory for intermediate values of
$s$. The probability of survival may be approximated by
$u = 1 - e^{A\gamma_t(x-1)} \approx A\gamma_t(1-x)$ (from a first order expansion),
where $x$ is approximately $x \approx 1 - 2rs\tau$ (from a second order
expansion of the equation $e^{A(x-1)} = x$). Insertion of these
two approximations into (95) yields the following simplified
expression:

$$\bar{N}_e \approx \tfrac{1}{2}\phi N_0 r\tau . \tag{96}$$

If $\tau$ is given in generations (so that $r = \ln 2$ and
$r/2 \approx 1/3$), this expression is further simplified:

$$\bar{N}_e \approx \tfrac{1}{3}\phi N_0\tau . \tag{97}$$
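
The steps leading from (95) to (96) can be traced as follows. This worked sketch assumes, as is implicit in a serial transfer regime but not stated explicitly above, that each dilution exactly offsets wildtype growth over one cycle, $De^{r\tau} = 1$, and it writes $y = 1 - x$.

```latex
\begin{align*}
A &= e^{r(1+s)\tau}D = e^{rs\tau} \approx 1 + rs\tau
  && \text{(assumption: } De^{r\tau} = 1\text{)} \\
1 - Ay + \tfrac{1}{2}A^2y^2 &\approx 1 - y
  \;\Longrightarrow\; y \approx \frac{2(A-1)}{A^2} \approx 2rs\tau
  && \text{(second-order expansion of } e^{-Ay} = 1-y\text{)} \\
u &\approx A\gamma_t(1-x) \approx 2rs\tau\,A\,e^{-r(1+s)t} \\
\frac{u\,\phi N_0 e^{rt}}{4s} &\approx \tfrac{1}{2}\phi N_0 r\tau\,A\,e^{-rst} \\
\ln\bar{N}_e &\approx \ln\!\left(\tfrac{1}{2}\phi N_0 r\tau\right)
  + \frac{1}{\tau}\int_0^{\tau}\left(rs\tau - rst\right)dt
  = \ln\!\left(\tfrac{1}{2}\phi N_0 r\tau\right) + \tfrac{1}{2}rs\tau
\end{align*}
```

The residual factor $e^{rs\tau/2}$ is close to one when $rs\tau$ is small, which recovers (96).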

LIST OF REFERENCES

Bailey, N.T.J., 1964. The Elements of Stochastic Processes.
John Wiley & Sons, Inc., New York.

Barton, N.H., 1993. The probability of fixation of a
favoured allele in a subdivided population. Genet.
Res. 62: 149-157.

Barton, N.H., 1994. The reduction in fixation probability
caused by substitutions at linked loci. Genet. Res.
64: 199-208.

Barton, N.H., 1995. Linkage and the limits to natural
selection. Genetics 140: 821-841.

Berg, O.G., 1995. Periodic selection and hitchhiking in a
bacterial population. J. Theor. Biol. 173: 307-320.

Bhatnagar, S.K. & M.J. Bessman, 1988. Studies on the
mutator gene, mutT, of Escherichia coli. Molecular
cloning of the gene, purification of the gene product,
and identification of a novel nucleoside
triphosphatase. J. Biol. Chem. 263: 8953-8957.

Boucher, C.A.B., E. O'Sullivan, J.W. Mulder, C.
Ramautarsing, P. Kellam, G. Darby, J.M.A. Lange, J.
Goudsmit & B.A. Larder, 1992. Ordered appearance of
zidovudine resistance mutations during treatment of 18
human immunodeficiency virus-positive subjects.
Journal of Infectious Diseases 165: 105-110.

Carlton, B.C. & B.J. Brown, 1981. Manual of Methods for
General Bacteriology, ed. P. Gerhardt. American
Society for Microbiology, Washington, D.C., p. 222-242.

Chao, L. & E.C. Cox, 1983. Competition between high and low
mutating strains of Escherichia coli. Evolution 37:
125-134.

Coffin, J.M., 1995. HIV population dynamics in vivo:
implications for genetic variation, pathogenesis, and
therapy. Science 267: 483-489.

Cox, E.C., 1976. Bacterial mutator genes and the control of
spontaneous mutation. Annu. Rev. Genet. 10: 135-156.

Cox, E.C. & T.C. Gibson, 1974. Selection for high mutation
rates in chemostats. Genetics 77: 169-84.

Crow, J.F. & M. Kimura, 1965. Evolution in sexual and
asexual populations. Am. Nat. 99: 439-450.

Crow, J.F. & M. Kimura, 1970. An Introduction to Population
Genetics Theory. New York: Harper & Row.

Cunningham, C.W., K. Jeng, J. Husti, M. Badgett, I.J.
Molineux, D.M. Hillis & J.J. Bull, 1997. Parallel
molecular evolution of deletions and nonsense mutations
in bacteriophage T7. Molecular Biology & Evolution 14:
113-116.

Drake, J.W., 1991. A constant rate of spontaneous mutation
in DNA-based microbes. Proc. Natl. Acad. Sci. USA 88:
7160-7164.

Drake, J.W., 1991. Spontaneous mutation. Annu. Rev. Genet.
25: 125-46.

Elena, S.F., L. Ekunwe, N. Hajela, S.A. Oden & R.E. Lenski,
1998. Distribution of fitness effects caused by random
insertion mutations in Escherichia coli. Genetica, in
press.

Elena, S.F., V.S. Cooper & R.E. Lenski, 1996. Punctuated
evolution caused by selection of rare beneficial
mutations. Science 272: 1802-1804.

Ewens, W.J., 1979. Mathematical Population Genetics. New
York: Springer-Verlag.

Ewens, W.J., 1969. Population Genetics. London: Methuen
Press.

Feller, W., 1968. An Introduction to Probability Theory and
Its Application. New York: John Wiley & Sons.

Felsenstein, J., 1988. Sex and the evolution of
recombination, pp. 74-86 in The Evolution of Sex,
edited by R.E. Michod and B.R. Levin. Sunderland,
Mass.: Sinauer Associates.

Felsenstein, J., 1974. The evolutionary advantage of
recombination. Genetics 78: 737-756.

Fisher, R.A., 1930. The Genetical Theory of Natural
Selection. Oxford: Oxford Univ. Press.

Gerrish, P.J. & R.E. Lenski, 1998. The fate of competing
beneficial mutations in an asexual population.
Genetica (in press).

Gillespie, J.H., 1991. The Causes of Molecular Evolution.
Oxford: Oxford Univ. Press.

Gillespie, J.H., 1981. Mutation rate modification in a
random environment. Evolution 35: 468-476.

Gurland, J., 1958. Biometrics 14: 229—249.

Gurland, J., 1963. A method of estimation for some
generalized Poisson distributions. International
Symposium on Classical and Contagious Discrete
Distributions, McGill, Montreal.

Haigh, J., 1978. The accumulation of deleterious genes in a
population -- Muller's ratchet. Theor. Pop. Biol. 14:
251-267.

Haldane, J.B.S., 1927. The mathematical theory of natural
and artificial selection. Proc. Camb. Phil. Soc. 23:
838-844.

Hall, B.G., 1988. Adaptive evolution that requires multiple
spontaneous mutations. I. Mutations involving an
insertion sequence. Genetics 120: 887-897.

Ho, D.D., T. Moudgil & M. Alam, 1989. Quantitation of human
immunodeficiency virus type 1 in the blood of infected
persons. New England Journal of Medicine 321: 1621-
1625.

Hogg, R.V. & A.T. Craig, 1995. Introduction to mathematical
statistics. New Jersey: Prentice Hall.

Holmes, E.C., L.Q. Zhang, P. Simmonds, C.A. Ludlam & A.J.L.
Brown, 1992. Convergent and divergent sequence
evolution in the surface envelope glycoprotein of human
immunodeficiency virus type 1 within a single infected
patient. Proc. Natl. Acad. Sci. USA 89: 4835-4839.

Horiuchi, T., H. Maki, M. Maruyama & M. Sekiguchi, 1981.
Identification of the dnaQ gene product and location of
the structural gene for RNAse H of Escherichia coli by
cloning of the genes. Proc. Natl. Acad. Sci. USA 78:
3770-3774.

Ishii, K., H. Matsuda, Y. Iwasa & A. Sasaki, 1989.
Evolutionary stable mutation rate in a periodically
changing environment. Genetics 121: 163-174.

Jaeger, G. & S. Sarkar, 1995. On the distribution of
bacterial mutants: the effects of differential fitness
of mutants and non-mutants. Genetica 96: 217-223.

Johnson, P., R.E. Lenski & F. Hoppensteadt, 1995.
Theoretical analysis of divergence in mean fitness
between initially identical populations. Proceedings
of the Royal Society, London B 259: 125-130.

Kaplan, N.L., R.R. Hudson & C.H. Langley, 1989. The
“hitchhiking” effect revisited. Genetics 123: 887-889.

Keightley, P.D., 1991. Genetic variance and fixation
probabilities at quantitative trait loci in mutation-
selection balance. Genet. Res. 58: 139-144.

Kimura, M., 1960. Optimum mutation rate and degree of
dominance as determined by the principle of minimum
genetic load. J. Genet. 57: 21-34.

Kimura, M., 1967. On the evolutionary adjustment of
spontaneous mutation rates. Genet. Res. 9: 23-34.

Kimura, M., 1979. Model of effectively neutral mutations in
which selective constraint is incorporated. Proc.
Natl. Acad. Sci. USA 76: 3440-3444.

Kuhner, M.K., J. Yamato & J. Felsenstein, 1995. Estimating
effective population size and mutation rate from
sequence data using Metropolis-Hastings sampling.
Genetics 140: 1421-1430.

Lea, D.E. & C.A. Coulson, 1949. The distribution of the
numbers of mutants in bacterial populations. J.
Genetics 49: 264-285.

LeClerc, J.E., B. Li, W.L. Payne, & T. Cebula, 1996. High
mutation frequencies among Escherichia coli and
Salmonella pathogens. Science 274: 1208-1211.

Lederberg, S., 1966. Genetics of host-controlled
restriction and modification of deoxyribonucleic acid
in Escherichia coli. J. Bacteriol. 91: 1029—1036.

Leigh, E.G., 1970. Natural selection and mutability. Am.
Nat. 104: 301-305.

Leigh, E.G., 1973. The evolution of mutation rates.
Genetics 73 (suppl.): s1-s18.

Leigh Brown, A.J. & D.D. Richman, 1997. HIV-1: gambling on
the evolution of drug resistance? Nature Medicine 3:
268-271.

Leigh Brown, A.J., 1997. Analysis of HIV-1 env gene
sequences reveals evidence for a low effective number
in the viral population. Proc. Natl. Acad. Sci. USA
94: 1862-1865.

Lenski, R.E. & M. Travisano, 1994. Dynamics of adaptation
and diversification: a 10,000-generation experiment
with bacterial populations. Proc. Natl. Acad. Sci. USA
91: 6808-6814.

Lenski, R.E., M.R. Rose, S.C. Simpson & S.C. Tadler, 1991.
Long-term experimental evolution in Escherichia coli.
I. Adaptation and divergence during 2000 generations.
Am. Nat. 138: 1315-1341.

Luria, S.E. & M. Delbrück, 1943. Mutations of bacteria
from virus sensitivity to virus resistance. Genetics
28: 491-511.

Ma, W.T., G.H. Sandri, & S. Sarkar, 1992. Analysis of the
Luria-Delbruck distribution using discrete convolution
powers. J. Appl. Prob. 29: 255-267.

Manning, J.T. & D.J. Thompson, 1984. Muller's ratchet
accumulation of favourable mutations. Acta Biotheor.
33: 219-225.

Mansky, L.M. & H.M. Temin, 1995. Lower in vivo mutation
rate of human immunodeficiency virus type 1 than that
predicted from the fidelity of purified reverse
transcriptase. J. Virol. 69: 5087-5094.

Mao, E.F., L. Lane, J. Lee & J.H. Miller, 1997.
Proliferation of mutators in a cell population. J.
Bacteriol. 179: 417-422.

Mascolini, M., 1997. Of clades and quasispecies: making
sense of the HIV population census. Journal of the
International Association of Physicians in AIDS Care 3:
13-25.

Maynard Smith, J. & J. Haigh, 1974. The hitch-hiking effect
of a favourable gene. Genet. Res. 23: 23-35.

Maynard Smith, J., 1968. Evolution in sexual and asexual
populations. Am. Nat. 102: 469-473.

Miller, J.H., 1992. A Short Course in Bacterial Genetics.
Cold Spring Harbor Lab. Press, Plainview, NY.

Miller, J.H., 1996. Spontaneous mutators in bacteria:
insights into pathways of mutagenesis and repair.
Annu. Rev. Microbiol. 50: 625-643.

Mittler J.E. & R.E. Lenski, 1992. Experimental evidence for
an alternative to directed mutation in the bgl operon.
Nature 356: 446-448.

Modrich, P., 1995. Mismatch repair, genetic stability and
tumour avoidance. Phil. Trans. Roy. Soc. Lond. Ser. B.
347: 89-95.

Modrich, P., 1991. Mechanisms and biological effects of
mismatch repair. Ann. Rev. Genet. 25: 229-253.

Moxon, E.R., P.B. Rainey, M.A. Nowak & R.E. Lenski, 1994.
Adaptive evolution of highly mutable loci in pathogenic
bacteria. Current Biology 4: 24-33.

Muller, H.J., 1932. Some genetic aspects of sex. Am. Nat.
66: 118-138.

Muller, H.J., 1964. The relation of recombination to
mutational advance. Mutat. Res. 1: 2-9.

Nowell, P.C., 1974. The clonal evolution of tumor cell
populations. Science 194: 23-28.

Otto, S.P. & M.C. Whitlock, 1997. The probability of
fixation in populations of changing size. Genetics
146: 723-733.

Pamilo, P., M. Nei & W. Li, 1987. Accumulation of mutations
in sexual and asexual populations. Genet. Res. 49:
135—146.

Pang, P.P., A.S. Lundberg & G.C. Walker, 1985.
Identification and characterization of the mutL and
mutS gene products of Salmonella typhimurium LT2. J.
Bacteriol. 163: 1007-1015.

Peck, J.R., 1994. A ruby in the rubbish: beneficial
mutations, deleterious mutations and the evolution of
sex. Genetics 137: 597-606.

Peck, J.R., G. Barreau & S.C. Heath, 1997. Imperfect genes,
Fisherian mutation and the evolution of sex. Genetics
145: 1171-1199.

Piatak, M., M.S. Saag, L.C. Yang, S.J. Clark, J.C. Kappes,
K.C. Luk, B.H. Hahn, G.M. Shaw & J.D. Lifson, 1993.
High levels of HIV-1 in plasma during all stages of
infection determined by competitive PCR. Science 259:
1749-1755.

Sambrook, J., E.F. Fritsch & T. Maniatis, 1989. Molecular
Cloning: A Laboratory Manual, 2nd Ed. Cold Spring
Harbor Laboratory Press, Plainview, NY.

Sarkar, S., 1991. Haldane's solution of the Luria-Delbruck
distribution. Genetics 127: 257-261.

Sarkar, S., W.T. Ma & G.v.H. Sandri, 1992. On fluctuation
analysis: a new, simple and efficient method for
computing the expected number of mutants. Genetica 85:
173-179.

Schaaper, R.M. & R. Cornacchio, 1992. An Escherichia coli
dnaE mutation with suppressor activity toward mutator
mutD5. J. Bacteriol. 174: 1974-1982.

Sniegowski, P.D., P.J. Gerrish & R.E. Lenski, 1997.
Evolution of high mutation rates in experimental
populations of E. coli. Nature 387: 703-705.

Stewart, F.M., D.M. Gordon & B.R. Levin, 1990. Fluctuation
analysis: the probability distribution of the number of
mutants under different conditions. Genetics 124: 175-
185.

Stewart, F.M., 1994. Fluctuation tests: how reliable are
the estimates of mutation rates? Genetics 137: 1139-
1146.

Taddei, F., M. Radman, J. Maynard Smith, B. Toupance, P.H.
Gouyon & B. Godelle, 1997. Role of mutator alleles in
adaptive evolution. Nature 387: 700-702.

Taucher-Scholz, G. & H. Hoffman-Berling, 1983.
Identification of the gene for DNA helicase II of
Escherichia coli. Eur. J. Biochem. 137: 573-580.

Travisano, M. & R.E. Lenski, 1996. Long-term experimental
evolution in Escherichia coli. IV. Targets of selection
and the specificity of adaptation. Genetics 143: 15—
26.

Tröbner, W. & R. Piechocki, 1984. Competition between
isogenic mutS and mut+ populations of Escherichia coli
K12 in continuously growing cultures. Mol. Gen. Genet.
198: 175-176.

Vasi, F., M. Travisano & R.E. Lenski, 1994. Long-term
experimental evolution in Escherichia coli. II. Changes
in life-history traits during adaptation to a seasonal
environment. Am. Nat. 144: 432—456.

Wright, S., 1982. Character change, speciation, and the
higher taxa. Evolution 36: 427-443.