THE EVOLUTION OF COMPLEXITY AND ROBUSTNESS IN SMALL POPULATIONS By Thomas LaBar A DISSERTATION Michigan State University in partial fulfillment of the requirements Submitted to for the degree of Microbiology and Molecular Genetics – Doctor of Philosophy Ecology, Evolutionary Biology, and Behavior – Dual Major 2018 ABSTRACT THE EVOLUTION OF COMPLEXITY AND ROBUSTNESS IN SMALL POPULATIONS By Thomas LaBar A central goal of evolutionary biology is to understand a population’s evolutionary trajectory from fundamental population-level characteristics. The mathematical framework of population genetics provides the tools to make these predictions. And while population genetics provides a well-studied framework to understand how adaptation and neutral evolution quantitatively alter population fitness, less attention has been paid to using population genetics to predict qualitative evolutionary outcomes. For instance, do different populations evolve alternative genetic mechanisms to encode similar phenotypic traits, and if so, which processes lead to these differences? This dissertation investigates the role of population size in altering the qualitative outcome of evolution. It is difficult to experimentally investigate qualitative evolutionary outcomes, especially in small populations, due to the time required for novel evolutionary features to appear. To get around this constraint, I use digital experimental evolution. While digital evolution experiments lack aspects of biological realism, in some regards they are the only methodology that can approach the complexity of biological systems while maintaining the ease of analysis present in mathematical models. Digital evolution experiments can never prove that certain evolutionary trajectories occur in biological populations, but they can suggest hypotheses to test in more realistic model systems. First, I explore the role of population size in determining the evolution of both genomic and phenotypic complexity. Previous hypotheses have argued that small population size may lead to increases in complexity and I test aspects of those hypotheses here. Second, I introduce the novel concept of “drift robustness” and argue that drift robustness is a strong factor in the evolution of small populations. Finally, I end with a project on the role of genome size in enhancing the extinction risk of small populations. I conclude with a broader discussion of the consequences of this research, some limitations of the results, and some ideas for future research. Copyright by THOMAS LABAR 2018 ACKNOWLEDGEMENTS As with any large body of work, this dissertation was not solely due to the efforts of one person. I have many people to thank for assisting me during my Ph.D. First, I thank my advisor, Chris Adami, for his support over my five years as his student. During my Ph.D., Chris was always available to provide feedback and advice on my work. He gave me the independence to develop my own projects, while also supporting those same projects as they progressed. Chris and I co-wrote many manuscripts during my Ph.D., only a fraction of which are reproduced here. I am not convinced that I would have been this productive without Chris as my mentor. I also must thank the past and current members of the Adami lab, for creating a productive and multidisciplinary intellectual atmosphere. Second, I thank Rich Lenski and the Lenski lab for providing me with a second academic home at Michigan State. The Lenski lab took me in and taught me how to perform microbial evolution experiments, an invaluable novel tool for myself that unfortunately is not apparent from the work present here. I am grateful to the members of the Lenski lab for their invaluable discussions concerning the field of microbial experimental evolution and evolutionary biology. I must especially thank Kyle Card for his willingness to incorporate myself into his Ph.D. project and mentor me in the ways of microbial evolution studies. I thank my committee members, Yann Dufour, Charles Ofria, and Chris Waters, for their guid- ance and comments during what was perhaps an atypical dissertation project for the Microbiology and Molecular Genetics program. I thank all of my collaborators, including Arend Hintze, Michael Miyagi, Aditi Gupta, Nitash C.G., Ali Tehrani-Saleh, Kyle Card, Jasper Gomez, and Minako Izutsu, for the opportunity to work with them on a variety of projects during graduate school. I thank the BioMolecular Sciences Gateway Ph.D. program, the Microbiology and Molecular Genetics Department, the Ecology, Evolutionary Biology, and Behavior program, and the BEACON Center for the Study of Evolution in Action for providing me with the opportunity to receive my iv PhD and establishing departments and programs with an interdisciplinary focus such that a student with a mathematics background would fit in. A special thanks goes out to all the support staff that bring these programs into existence. I thank Michigan State’s College of Natural Sciences for a recruiting fellowship and Michigan State’s Graduate School for a University Distinguished Fellowship. I thank the BEACON Center for the Study of Evolution in Action for a Top-Up Fellowship, for travel support, and for project funding. I thank the Michigan State Microbiology and Molecular Genetics Department and the relevant sponsors for the Russell B. DuVall award and the Rudolph Hugh Fellowship. I also thank the Society for Molecular Biology and Evolution for travel support. No PhD is done in isolation. I owe a debt of gratitude to my friends for helping me during my time as a PhD student, especially the following: Anna Huff, Cara Weisman, Kyle Card, Meredith Zettlemoyer, Katie Minnix, Josh Franklin, and Tanush Jagdish. I thank all of you for your friendship. Finally, I must thank my parents for providing me with the upbringing to make all of this possible. They always encouraged me to follow my own interests, both intellectual and otherwise, and allowed me the opportunity to choose my own path in life. For that, I am eternally grateful. v TABLE OF CONTENTS . . . . . . . . . . . . . . CHAPTER 1 LIST OF TABLES . LIST OF FIGURES . . . INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1 Population Genetics and Evolutionary Dynamics of Small Asexual Populations . 1.2 Molecular Mechanisms of Evolution in Small Populations . . . . . . . . . . . . . . 1.3 Overview of Digital Experimental Evolution . . . . . . . . . . . . . . . . . . . . . 1.4 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 2 6 9 . 12 ix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . CHAPTER 2 DIFFERENT EVOLUTIONARY PATHS TO COMPLEXITY FOR SMALL Introduction . 2.7 Acknowledgments . 2.5 Discussion . 2.6 Methods . . 2.1 Abstract . 2.2 Author Summary . 2.3 . . . 2.4 Results . AND LARGE POPULATIONS OF DIGITAL ORGANISMS . . . . . . . . 13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.4.1 Genome Size Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 2.4.2 Evolution of Phenotypic Complexity . . . . . . . . . . . . . . . . . . . . . 21 2.4.3 Non-Functional Insertions . . . . . . . . . . . . . . . . . . . . . . . . . . 24 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 2.4.4 Deletion Bias . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 . . 2.6.1 Avida . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 . . 2.6.2 Experimental Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 2.6.3 Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 . . CHAPTER 3 EVOLUTION OF DRIFT ROBUSTNESS IN SMALL POPULATIONS . . 39 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 3.3.1 A mathematical model of drift robustness . . . . . . . . . . . . . . . . . . 42 3.3.2 Drift robustness in digital organisms . . . . . . . . . . . . . . . . . . . . . 45 Small populations evolve an altered DFE . . . . . . . . . . . . . . . . . . . 46 3.3.3 3.3.4 Small population genotypes are drift-robust . . . . . . . . . . . . . . . . . 47 3.3.5 Drift robustness is not due to fitness differences . . . . . . . . . . . . . . . 49 3.3.6 Epistatic mutations lead to drift robustness . . . . . . . . . . . . . . . . . . 50 3.3.7 Deleterious mutations drive the evolution of drift robustness . . . . . . . . 53 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 . . . . . . . . . . . . . . . . . . . 57 3.5.1 Mathematical model of drift robustness 3.1 Abstract 3.2 3.3 Results . 3.4 Discussion . . 3.5 Methods . . Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi . . . . . . 3.6 Acknowledgements . 3.5.2 Avida . 3.5.3 Experimental Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5.4 Data Analysis . . 3.5.5 Data Availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 . 62 . 63 . 65 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 CHAPTER 4 GENOME SIZE AND THE EXTINCTION OF SMALL POPULATIONS . 66 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 4.3.1 Large genome size increases the extinction risk of small populations . . . . 69 4.3.2 Extinction and large genome size is associated with increases in the lethal 4.1 Abstract 4.2 4.3 Results . . Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4 Discussion . . 4.5 Methods . . mutational load . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.3.3 Lethal mutation rates and stochastic viability drive population extinction . . 73 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 . . 4.5.1 Avida . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 4.5.2 Experimental Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 4.5.3 Data Analysis . . . . . . . . . . . . . . . . . 4.5.3.1 Analysis of the relationship between genome expansions and 4.6 Acknowledgments . . . CHAPTER 5 CONCLUSION . . 5.1 Further Conclusions . changes in the lethal mutation rate . . . . . . . . . . . . . . . . . 82 4.5.3.2 Analysis of stochastic viability . . . . . . . . . . . . . . . . . . . 82 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 . . 5.1.1 Absence of mutational hazards in altering small population evolution . . . . 85 5.1.2 The differences between mutational robustness and drift robustness . . . . . 86 5.1.3 Drift robustness and bacterial endosymbionts . . . . . . . . . . . . . . . . 87 5.1.4 Drift robustness and the role of epistasis in small-population adaptation . . 88 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 5.2 Limitations . . . . . . . 5.3 Future Work . . 5.2.1 Avida as an intermediate between mathematical modeling and biological . . . . . . . experiments . 5.2.2 Populations in Avida are much smaller than relevant biological populations 5.2.3 Results apply to asexual haploid species evolving in a constant environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 90 91 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 5.3.1 What is the genetic mechanism behind drift robustness? . . . . . . . . . . . 92 5.3.2 Drift robustness and sexual reproduction . . . . . . . . . . . . . . . . . . . 93 5.3.3 Drift robustness and its interaction with other robustness mechanisms . . . 94 Small population size and the evolution of functional complexity . . . . . . 95 5.3.4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 . . . . . . . . BIBLIOGRAPHY . vii LIST OF TABLES Table 2.1: Merit rewards for the evolution of phenotypic traits. . . . . . . . . . . . . . . 35 Table 4.1: Notable Avida parameters changed from default value. . . . . . . . . . . . . 80 viii LIST OF FIGURES Figure 2.1: Final genome size as a function of population size. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. . . . . . 18 Figure 2.2: Evolution of complexity for small population sizes. Statistics shown in the main text for population sizes ranging from 10 to 100 individuals. Data for populations with 10 and 100 individuals are the same as in Figure 2.1. A) Evolution of genome size. B) Proportion of fixed insertions that were slightly-deleterious. C) Proportion of fixed insertions that were under positive selection. D) Number of evolved novel phenotypic traits. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. . . . . . 19 Figure 2.3: The number of insertions fixed as a function of population size. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. . . . 20 Figure 2.4: Proportion of slightly-deleterious insertions as a function of population size. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 . . Figure 2.5: Proportion of insertions under positive selection as a function of popula- tion size. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21 . . ix Figure 2.6: Final number of evolved phenotypic traits as a function of population size. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 . . . Figure 2.7: The evolved fitness relative to the ancestor genotype. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. . . . . . 22 Figure 2.8: Correlation between the final genome size and the final number of evolved traits. Black circles represent the combined data from populations with 10, 100, 1000, and 10000 individuals. Only replicates that survived all 2.5×105 generations were included. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 Figure 2.9: The effect of a fixed mutation rate on the evolution of phenotypic com- plexity. The variable genomic mutation rate treatment represents the data from when the genomic point mutation rate is 10−1× L, were L is the genome size. The fixed genomic mutation rate treatment represents the data from when the genomic point mutation rate was fixed at 1.5 × 10−1, independent of the genome size. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. . . . . . . . . . . . . . . . . . . . . . . . . 24 Figure 2.10: The evolution of complexity in the non-functional insertion treatment. All subplots are a function of the population size. A) The final genome size. B) The final number of evolved phenotypic traits. Populations with 20 individuals are shown instead of those with ten individuals due to the high extinction rates of populations with ten individuals. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. . . . . . 25 x Figure 2.11: The proportion of fixed insertions that were under positive selection in the non-functional insertion treatment compared to the original treatment for populations with 104 individuals. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. . . . . . . . . . . . . . 26 Figure 2.12: The evolution of complexity in the deletion bias treatment. All subplots are a function of the population size. A) The final genome size. B) The final number of evolved phenotypic traits. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. . . . . . . . . . . . . . . . . . . . . . 27 Figure 3.1: Conceptual diagram of drift robustness. a) A single-peak fitness landscape. In this landscape, the large population (red circles) can climb to the top of the fitness peak, while the small population (black circles) can only maintain fitness on an intermediate part of the peak. b) A multi-peak fitness landscape. The large population evolves to the same fitness peak as in panel a. The small population evolves to the steeper, drift-robust peak. While this peak is still lower than the drift-fragile peak, the small population attains greater fitness than it would have on the drift-fragile peak in the single-peak fitness landscape . 41 Figure 3.2: The fitness landscapes for the Markov model to test for the evolution of drift robustness. Each circle represents one genotype and is labeled with its fitness. Each arrow represents the transition between one genotype to another (including the identical genotype) and is labeled with the transition probability. a) The fitness landscape for the minimal model. s represents the selection coefficient of the drift-fragile peak and  represents the small fitness difference between the drift-fragile peak and the drift-robust peak. ui j and πi j represent the mutation rate between genotypes and probability of fixation from one genotype to another, respectively. b) The fitness landscape for the extended model. Variables as in panel a. Transition probabilities omitted for clarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 Figure 3.3: Critical population size for shift between robust and fragile peaks. a) Results for various n values with κ = 0.01. b) Results for various κ values with n = 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 . . . . . xi Figure 3.4: Differences in mutational effects between small-population genotypes and large-population genotypes. Black and red represent small-population and large-population genotypes, respectively. a) The combined distribution of fitness effects (DFE) across all 100 small-population genotypes and 100 large- population genotypes. b) Same data as in panel a, but grouped into different classes of mutations, where a small-effect deleterious mutation is defined as having an effect less than 5%. Here, and throughout, deleterious mutation refers to viable deleterious mutations, while lethal mutation refers to non- viable deleterious mutations. c) The likelihood of a small-effect deleterious mutation for small-population and large-population genotypes. Red lines are medians, edges of the box are first and third quartile, whiskers are at most 1.5 times the interquartile range, and the plus signs are outliers. d) The mean relative fitness of every possible point mutation (1250 mutations) for each genotype. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 Figure 3.5: Small-population genotypes are drift-robust due to a decreased likelihood of small-effect deleterious mutations. Black and red data points represent small-population genotypes and large-population genotypes, respectively. a) Relative fitness of the most-abundant genotype from every population during the drift robustness test. Each circle represents the relative fitness of one genotype from one replicate. b) Relationship between relative fitness in the drift robustness test (panel a) and the likelihood of small-effect deleterious mutations (Fig. 3.4c) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 Figure 3.6: Fraction of small-effect deleterious mutations for genotypes from small populations and large populations with equal fitness. Each area separated by a dashed line shows genotypes with equal fitness (w). S and L represent small-population and large-population genotypes, respectively. Differences for each fitness value are significant. (Mann-Whitney U-test; Bonferroni- corrected p < 3 × 10−3). Box plots as described for Fig. 3.4c. . . . . . . . . . . 50 Figure 3.7: Evidence of small-population adaptation to drift-robust fitness peaks. a) Distribution of maintained beneficial mutational effects for (effects for mutations whose fitness gain was at-least partially maintained during sub- sequent evolution) small-population genotypes (black) and large-population genotypes (red). b) Spearman correlation coefficients between fitness and the likelihood of a deleterious mutation for each maintained beneficial mutation from each population. c) The likelihood of (viable) deleterious and lethal mutations, shown in magenta and green, respectively, in a representative small population’s lineage. The strong decrease in the likelihood of a (viable) deleterious mutation early in the population’s history is evidence of epistatic mutations resulting in drift robustness. d) Number of populations that fixed a maintained beneficial mutation that decreased the likelihood of a (viable) deleterious mutation by at least a specified amount. Colors as in panel a. . . . . 51 xii Figure 3.8: The evolution of drift robustness in small populations with or without deleterious mutations in the initial adaptation experiments. Black and blue data points represent small-population genotypes adapted with delete- rious mutations and small-population genotypes adapted without deleterious mutations, respectively. a) Relative fitness to the ancestral genotype after 105 generations of adaptation. Box plots as described for Fig. 3.4c. b) Likelihood of small-effect deleterious mutations. c) Relative fitness of the most-abundant genotype from every population during the drift robustness test. Each circle represents the relative fitness of one genotype from one replicate. d) Rela- tionship between relative fitness in the drift robustness test (panel c) and the likelihood of small-effect deleterious mutations (panel b). . . . . . . . . . . . . 54 Figure 4.1: Possibility of genome expansions increase extinction in low mutation rate populations. A) Number of extinct populations as a function of population size. Solid (dashed) lines represent variable (fixed) genome size populations. Black (white) circles represent low (high) mutation rate populations. Error bars are bootstrapped 95% confidence intervals (104 samples). B) Time to extinction for population size and mutation rate combinations. Lines and colors same as in panel A. Error bars represent two times the standard error of the mean. Data only shown for those treatments that resulted in at least ten extinct populations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 Figure 4.2: Extinct populations evolved larger genomes. A) Final genome size for the low mutation rate populations as a function of population size. Populations that survived are shown with gray boxplots; populations that went extinct are shown with white boxplots. Red bars are the median value, boxes are the first and third quartile, upper/lower whiskers extend up to the 1.5 times the interquartile range, and circles are outliers. ** indicates p < 10−4, * indicates p < 10−2, and N.S. indicates p > 0.05 for the Mann-Whitney U-test. Population sizes where fewer than ten populations went extinct (or survived) not shown. B) Final genome size for the high mutation rate populations as a function of population size. Description same as in panel A. Population sizes where fewer than ten populations went extinct (or survived) not shown. . . . . . 70 Figure 4.3: Lethal mutation rate correlates with genome size and population extinc- tion. A) The lethal mutation rate as a function of genome size for the final genotypes from each evolved low mutation rate population. White circles are extinct populations; gray circles are surviving populations. B) The lethal mutation rate for extinct and surviving populations across population sizes. Boxplots as previously described. Colors and significance symbols same as Figure 4.2. Population sizes where fewer than ten populations went extinct (or survived) not shown. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 xiii Figure 4.4: Insertions and deletions directly change the lethal mutation rate. A) Change in the lethal mutation rate as a function of a mutation’s effect on genome size. Boxplots as previously described. Each data point represents a genotype from a reduced evolutionary lineage of a population with 20 individuals. B) Relationship between a mutation’s change in genome size and the change in the lethal mutation rate. Data same as in panel A. Dashed lines represent no change. Data points comparing genotypes with equal genome size were excluded. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 Figure 4.5: Evolution of stochastic viability contributes to extinction risk. A) Number of population extinctions (out of 100 replicates) as a function of population size for the deleterious-reversion (squares) and no-reversion (circles) treat- ments. Dashed and solid lines represent populations from fixed genome size and variable genome size treatments, respectively. Error bars are 95% confidence intervals generated using bootstrap sampling (104 samples). No- reversion treatment data same as in Fig. 4.1A. B) Number of population extinctions (out of 100 replicates) as a function of population size for the lethal-reversion (triangles) and no-reversion (circles) treatments. Other sym- bols same as in panel A. C) Percent of viability trials (out of 1000) for which a given genotype was not viable. Values between 0 and 1 indicate stochastic via- bility. “Original” refers to the 100 genotypes from the N = 8 populations that evolved with lethal mutations. “Revert-lethal” refers the 100 genotypes from the N = 5 lethal-reversion populations. Boxplots as previously described. D) Genome sizes for always-viable and stochastically-viable genotypes from both the N = 8 no-reversion populations and the N = 5 lethal reversion populations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 xiv CHAPTER 1 INTRODUCTION This dissertation deals with questions of the following nature: how does small population size and strong genetic drift (or weakened natural selection) alter the long-term evolutionary outcome of a population? In other words, how does population size alter what evolves, not just how a population evolves. The role of population size on the evolutionary dynamics of a population has been a topic of interest to evolutionary biologists dating back to Sewall Wright and his Shifting Balance theory [223]. However, the consequences of population size for the molecular and genetic mechanisms of evolution has been less well-studied, although the rise of comparative genomics has started to fill this gap (e.g., [121]). More precisely, the research presented here focuses on the role of genetic drift and weakened I selection in altering the evolution of genetic and genomic architecture in small populations. define genetic and genomic architecture as the loci (or genes) encoding a given phenotypic trait and any relevant characteristics concerning these loci. Much of my work has focused on how population size can alter the evolution of genome architecture complexity. While there are perhaps an overabundance of definitions of “complexity” across many fields, I use an intuitive definition: the complexity of some feature is simply the number of parts that compose that feature. For example, Chapter 2 focuses on the role of population size in driving the evolution of genetic and phenotypic complexity. In this chapter, I defined genetic complexity as genome size (i.e., number of loci in the genome) and phenotypic complexity as the number of phenotypic traits. Another measure of genetic complexity, say for a given phenotypic trait instead of the whole genome, could be the number of loci encoding said trait. One reason why studies on how population size alters the molecular mechanisms of evolution are less abundant than studies on the population genetics of small populations is due to their difficulty. Small populations have a slower rate of adaptation than large populations due to decreases in the generation and effect size of beneficial mutations [37, 39, 215] and thus any experiments to 1 measure evolutionary change in small populations will take longer than experiments with large populations. This evolutionary slowness is further compounded if one wants to study changes in any complexity measure, as changes in complexity often require multiple mutations and thus occur over long experimental timescales [109]. To avoid these constraints, I use the digital evolution systems Avida [154] to experimentally study how population size alters the evolution of genetic architecture and complexity. Due to the previously-published or submitted nature of many of the chapters, specific in- troductory text will be present on a per-chapter basis. In this Introduction, I will review the population-genetic basis of how population size can alter evolutionary dynamics and may influ- ence the evolution of biological complexity. I will also provide an overview of digital evolution studies and their role in experimentally-testing theories concerning the evolution of complexity and robustness in small populations. I will end with an outline of the rest of the dissertation. 1.1 Population Genetics and Evolutionary Dynamics of Small Asexual Populations Here, I will review the relevant literature on the population genetics and evolutionary dynamics of small populations. For my purposes here, I will focus on evolution in small asexual haploid populations. While most biological populations are sexual and experience some recombination, most of the work in my subfield of experimental evolution uses asexual populations due to their ease of analysis, including the results I present here. While it is likely that some of the results from this research also apply to populations with recombination, these scenarios are not tested here (see Chapter 5 for discussions on the interplay between genetic drift and recombination and their possible relationship to the results presented here). There are two main factors that constrain adaptation in small asexual populations: 1) limited beneficial genetic variation, and 2) limited maintenance of weakly-beneficial mutations. Limited genetic variation in small populations results simply from the limited number of individuals. If a population has N individuals and each individual offspring receives a novel beneficial mutation with probability Ub, then the population receives N × Ub beneficial mutations per generation. 2 It follows that populations with a smaller N receive fewer beneficial mutations per generation. Furthermore, small populations are expected to experience beneficial mutations of smaller effect size, as large-effect beneficial mutations are likely rarer than small-effect beneficial mutations [111]. This stochasticity in mutation supply leads to decreased repeatability across populations due to increased stochasticity in which beneficial mutations appear [103]. Finally, small populations should have a decreased supply of standing genetic variation that may become beneficial upon environmental change, although evidence for this prediction is mixed in natural populations [221]. Thus, small populations have a decreased rate of adaptation in a constant environment [37, 39], increased stochasticity in adaptation across populations in complex environments [103, 175, 190], and a decreased capability to respond to environmental change [15, 103, 221]. The second evolutionary constraint faced by small populations is that purifying selection is weaker, and genetic drift stronger, in small populations. More precisely, if a mutation alters a genotype’s fitness by s, it will be effectively neutral if: |s| (cid:28) 1 Ne (1.1) , where Ne is the effective population size [155]. As a result, if a mutation is deleterious (i.e., s < 0) and the absolute value of its size is much less than the reciprocal of the effective population size, its fixation probability is approximately that of a neutral mutation. In other words, small populations have a greater likelihood of fixing a deleterious mutation with a small effect size than do large populations. Conversely then, small populations have a lesser likelihood of maintaining the fixation of small-effect beneficial mutations. Given that small populations lack the ability to purge small-effect deleterious mutations, what are the consequences for adaptation? When discussing the effect of deleterious mutations on population mean fitness, the concept of a mutational load is often introduced [150]. The mutational load is often defined as the (deterministic) reduction in mean fitness from some optimal value due to segregating deleterious variation and mutation-selection balance [7, 167]. A similar concept exists for fitness reductions in populations due to weakened selection in small populations: the drift load [7, 167, 212]. The drift load is often incorporated into the mutational load [7] and can 3 be defined as variations to the deterministic predictions from mutational load calculations due to stochastic, drift-induced, fluctuations in mutation frequency [91]. Here, I will define a population’s drift load as the reduction in fitness due to the fixation of slightly-deleterious mutations [7]; this has been referred to previously as the “fixed drift load” to separate fitness reductions due to fixation from fitness reductions due to increased frequency of slightly-deleterious mutations in small populations [167]. Over time, the continual fixation of slightly-deleterious mutations increases a population’s drift load. If one assumes beneficial mutations are sufficiently rare and fitness loss is irreversible, this leads to large fitness declines in a process known as Muller’s ratchet [72, 151]. If selection is hard, and reductions in fitness reduce the number of offspring produced each generation, fixation of slightly-deleterious mutations will cause population size to decrease [123]. This results in a positive feedback loop between the fixation of slightly-deleterious mutations and population size decreases known as a mutational meltdown and may explain why asexual species go extinct at high rates [125]. Experimental evidence of both Muller’s ratchet [9, 26, 41] and mutational meltdowns [233] in laboratory microbial populations suggests both phenomena may be relevant to natural populations. While both theory and experiments suggest that fitness declines due to genetic drift occur, the assumption that small populations will continually decline in fitness has been repeatedly challenged. One proposed mechanism that could halt Muller’s ratchet is synergistic/negative epistasis between slightly-deleterious mutations [92]. If mutations become more deleterious as fitness decreases, then the fitness decline will halt as the strength of purifying selection increases [93]. While there are studies that show that deleterious mutations interact in a negatively-epistatic manner (e.g., [185]), the overall evidence is mixed [73]. Some theoretical evidence suggests that small populations should evolve positive epistasis, not negative epistasis [69]. It should also be noted that this effect is dependent on the exact distribution of deleterious fitness effects [25]. Another assumption behind Muller’s ratchet is that beneficial mutations are sufficiently rare such that small populations cannot fix compensatory mutations that reverse their decline in fitness. There is abundant evidence that 4 populations can easily fix compensatory mutations in response to fitness declines [18, 23, 49, 147, 166, 189]. Furthermore, theory predicts that small populations should be able fix beneficial mutations to halt fitness declines [66, 167]. Laboratory experiments with small viral populations have shown that the rate of beneficial mutations increases as fitness decreases [183]; this relationship halts the fitness decline of small populations. These pieces of evidence, both theoretical and experimental, suggest that while small populations do face a drift load, fitness will not decline continuously. Instead, small populations reach an equilibrium fitness where the fixation of slightly- deleterious mutations is balanced by the fixation of compensatory beneficial mutations [66]. While most of the research concerning the evolutionary dynamics of small populations is focused on fitness reduction, some have suggested that small populations may actually possess an adaptive advantage over large populations [81]. This idea is as follows. Because large populations face strong natural selection, they can quickly adapt to the nearest fitness peak. However, while this fitness peak may be a local optimum, it may not be a global optimum (i.e., the highest possible fitness peak in the fitness landscape). Thus large populations may adapt to locally optimal, but globally suboptimal, areas of the fitness landscapes. Small populations do not face such strong selective constraints. While they may climb locally-optimal fitness peaks, they may also fall- down locally-optimal peaks. While this results in a short-term fitness decline, it also allows small populations to locate other, potentially higher, fitness peaks and possibly evolve greater fitness than large populations in a process known as valley-crossing. This scenario is the basic idea behind Wright’s Shifting Balance theory for adaptation in small sexual populations [223]. Experimental evidence from experimental evolution suggests that it is possible for small popula- tions to achieve greater fitness than large populations, at least in the short term. The maximum fitness of an ensemble of small (or intermediate-sized) populations is often greater than the maximum fitness of an ensemble of large populations, although the average fitness of the large populations will be greater than the average fitness of the smaller populations [175, 178]. It has been shown that this requires a complex, multi-peak, fitness landscape [74, 81]. However, additional theory suggests that any small-population advantage may be limited. In asexual populations, large pop- 5 ulations valley cross faster than small populations [210]. In sexual populations, the relationship between valley crossing and population size depends on the recombination rate: high recombina- tion rates leads to faster valley crossing in small populations, while low recombination rates lead to faster valley crossing in large populations [211]. This positive relationship between the rate of valley crossing and population size is due to large populations having a greater likelihood of stochastic tunneling, or receiving a beneficial mutation on a segregating, but not fixed, deleterious genetic background [79, 208]. This advantage towards stochastic tunneling in large populations overwhelms any advantage small populations may have in valley-crossing due to their increased fixation of deleterious mutations. There is also some theoretical evidence that intermediate-sized populations, and not large populations, get stuck on locally-optimal fitness peaks, while both small and large populations valley-cross to globally-optimal peaks [153]. Under these conditions, large populations should have an evolutionary advantage towards adaptation, as they avoid getting stuck on local fitness peaks, but can valley-cross faster than small populations. Overall, previous studies support the conclusion that while small populations do have an adaptive advantage over large populations under certain conditions, small population size tends to lead to reduced adaptation. 1.2 Molecular Mechanisms of Evolution in Small Populations While the population genetics and evolutionary dynamics of small populations have been explored in some detail for many years, only recently have researchers begun to study how population size alters the molecular mechanisms of evolution. In this section, I will focus on previous research that seeks to explain how small populations should evolve differently from large populations not in terms of fitness (as previously discussed), but in terms of specific phenotypic traits and genetic architecture. Why should it be assumed that small population size may result in alternative evolutionary “strategies” compared to large populations? The loss of fitness that small populations experience due to genetic drift may allow them to climb fitness peaks otherwise inaccessible to large populations. Fitness differences result from alternative phenotypes and alternative genetic systems and the ability to access alternative areas of the fitness landscape suggests that small 6 populations should be able to access alternative genetic architectures. An example of how non- adaptive evolutionary forces can alter genetic architecture comes from populations that have evolved under strong mutational pressure, i.e., the “Survival of the Flattest” effect [220]. One of the most prominent theories concerning the differential evolution of genome architecture and complexity in small populations is Michael Lynch’s Mutational Hazard hypothesis (MH hy- pothesis; see [119] for an overview). The MH hypothesis starts from the assumption that one should expect a species’s genome architecture to be influenced by its population-genetic environment [119]. While many theories try to explain differences in genome architecture across species by invoking adaptive explanations, non-adaptive evolutionary processes, such as drift and mutation, can also al- ter genome architecture [119]. The MH hypothesis argues that many of the features that distinguish prokaryotic genomes from eukaryotic genomes, particularly multicellular eukaryotic genomes, are due to the small effective population sizes of both current-day eukaryotic populations [95] and possibly the ancestral eukaryote population [115]. The hypothesis argues that weakened selection in small populations prevents the removal of excess genomic content [118]. This excess genome space is slightly-deleterious due to its ability to potentially receive deleterious mutations in future generations [121]. In other words, excess genome content is mutationally-hazardous [128]. This evolutionary mechanisms of weakened selection against mutational hazards in small populations has been invoked to explain the evolution of large genomes composed of primarily non-coding DNA [121], the abundance of introns [114, 121], the presence of complex gene regulatory net- works [117], the origins of complex gene structure and organization [10, 115], and the maintenance of gene duplicates and subfunctionalization [121, 122] in multiceulluar eukaryotes. Meanwhile, prokaryotes, due to strong selection pressure, maintain small, compact genomes with modular operons and comparatively little non-coding content [116]. Later on in evolutionary time, this increased genetic complexity may then give rise to increased phenotypic complexity [94, 118, 121]. It should also be noted that small-population prokaryotes, such as bacterial endosymbionts, have reduced genome size compared to their free-living relatives [13, 100] due to a strong deletion bias in prokaryoties [99]. The MH hypothesis has been controversial and some follow-up empirical 7 studies have resulted in conflicting evidence (e.g., [8, 67, 214, 230]). In recent years, the metaphorical concept of “drift barriers” has been introduced into the evo- lutionary biology literature [35, 124, 135, 188, 224]. This concept, similar to those previously mentioned for the role of population size in evolution, argues that there is some theoretical bio- physical optimum “value" for phenotypic traits and it is population size that determines how close to that optimum a population can evolve [129]. This idea is of course similar to those proposed in population genetics, with fitness as the stand-in phenotypic trait. The concept of a drift barrier has been most often applied to the evolution of certain error rates, such as the mutation rate [188] and the transcriptional error rate [135], although some experimental data contradicts the concept of drift barriers [193]. For example, one expects the per-nucleotide mutation rate to be negatively correlated with population size, a prediction supported by mutation accumulation experiments and estimates from natural populations [188]. However, it can in theory be applied to any phenotypic trait of interest. And while the concept of drift barriers does not provide any specific predictions for how population size should alter genetic architecture, it does provide a conceptual framework for thinking about how population size alters evolutionary outcomes beyond changes in fitness. A third proposed mechanism underlying adaptation in small populations, and one relevant to results presented here, is the evolution of mutational robustness [38, 198, 218, 220]. The evolution of mutational robustness occurs when a population evolves genomic architecture that reduces the deleterious effect of mutations. The concept has often been argued to shape the genomes of species with high mutation rates and large population sizes, such as RNA virus [19, 106]. However, theoretical results have demonstrated that in fact small populations should be favored to evolve mutational robustness and large populations should evolve mutational fragility [48, 68, 97]. Further theory suggests that both small and large populations should evolve robustness to mutations, but through different evolutionary paths [169, 224]. Small population should evolve costly global robustness solutions (such as improved DNA repair machinery that reduces the mutation across the whole genome) while large populations should evolve local robustness solutions (such as reducing It should also be noted that the the deleteriousness of mutations at each locus individually). 8 evolution of mutational robustness in small populations is likely tied to the exact fitness landscape. Most of the fitness landscapes used contain only small-effect deleterious mutations and neutral mutations [97, 169], a fact I will return to in Chapter 3. None of these (related) theories concerning the evolution of mutational robustness in small populations have been extensively tested due to the difficulty of evolving novel traits in small popu- lations in the time at which evolution experiments are performed. However, there is evidence from obligate bacterial endosymbionts that small populations do evolve mutational robustness in nature. These bacteria overexpress genes encoding heat-shock molecular chaperones and chaperonins, such as DnaK and GroEL, compared to their free-living relatives [137]. This overexpression phenotype is thought to have evolved in response to deleterious mutation accumulation due to small popula- tion size and genetic drift [148]; overexpressing groEL in both Escherichia coli and Salmonella typhimurium subjected to mutation accumulation greatly restores fitness [52, 130]. Furthermore, endosymbiotic bacteria accumulate mutations at greater rates than free-living bacteria, suggesting these mutations are less deleterious in endosymbionts [137, 148]. Laboratory evolution experi- ments have shown that overexpression of groEL prevents the extinction of small populations, but is deleterious in large populations and thus removed by purifying selection [176]. This result does support the predictions for the global and local robustness theory for small and large populations, respectively [169]. Small bacterial populations also possess other phenotypes that may be related to mutational robustness, such as increased protein multi-functionality [86, 231] and lack of genes encoding parts of the DNA repair machinery [137, 149], but the relevance of these phenotypes to robustness remains to be explored. 1.3 Overview of Digital Experimental Evolution In order to overcome some of the difficulties with evolving small populations with alternative genetic architectures, I performed digital evolution experiments. Digital experimental evolution [1, 3, 12] is the computational equivalent of microbial experimental evolution [6, 45, 83]. Instead of a population of microbes evolving in a flask, one has a population of computer programs (“digital 9 organisms”) evolving in some digital world. Digital evolution serves as an intermediate step in model-system complexity between the mathematical models used in theoretical population genetics and the microbial systems used in typical evolution experiments. Digital evolution experiments are more complex than mathematical models and allow for the spontaneous discovery of new evolutionary phenomena that may be missed in simpler models. This ability for discovery, though, is a trade-off with the opaqueness and occasional arbitrariness of many digital evolution systems; the large number of parameters in digital evolution system can make it difficult to tease apart which parameters cause a given outcome. Sometimes, it is better to have a simpler model in order to aid in understanding; digital evolution experiments often lead to the development of these models (see Chapter 3). The advantage of digital evolution experiments over microbial evolution experiments is that one can perform experiments for a longer time at greater replication, one can perform experimental manipulations not yet possible in biological systems, and digital systems can test hypotheses not yet reachable for biological models. Digital evolution experiments can allow for the development of hypotheses in a more efficient manner than in biological systems, and these hypotheses can then be tested in biological systems. Of course, digital systems are not biological systems, and the relative simplicity of digital systems to biological systems means important details will always be missed. Furthermore, digital evolution studies cannot answer some questions of strong interest to evolutionary biologists, such as what genetic variation occurs in biological populations (as opposed to how evolution shapes genetic variation). While the trade-offs are not such that they eliminate the effectiveness of digital experimental evolution, they must be remembered when performing these studies. The digital evolution system used here is the Avida system [154]. In Avida, self-replicating computer programs compete within a population for limited memory space and limited CPU time. Each program (“avidian”) consists of a circular genome; each loci in the genome consists of one of the twenty-six possible instructions in the Avida genetic alphabet. Each (viable) avidian genome encodes the ability for reproduction and can also encode the ability to perform Boolean logic calculations that serve as phenotypic traits in Avida [109]. Avida populations evolve these traits, 10 and other general alterations to their replication machinery. During reproduction, mutations may be introduced into the offspring’s genome and these mutations may alter their bearer’s replication speed. Faster replicators produce more offspring per unit time, and thus will out-compete slower replicators. This results in selection for replication speed, not due to some selection algorithm as in most evolutionary simulations, but due to faster growth. Because avidian populations have phenotypic variation (in replication speed), individuals pass this variation to their offspring (due to self-replication) and there are differences in reproductive output due to this phenotypic variation, they undergo Darwinian evolution by natural selection [164]. Each chapter goes into the relevant details behind Avida for their contained experiments, so I will avoid repeating any further description. Instead, I will now review some of the results from previous Avida experiments relevant to evolution in small populations. Lenski et al. studied the interplay between genome complexity, robustness and epistasis and found that complex genotypes (i.e., large genome size) were more robust to deleterious mutations than simple genotypes (i.e., small genome size) due to a lesser likelihood of lethal mutations [108]. Later, Wilke et al. demonstrated that populations with high mutation rates evolve “flat” genomic architecture with a high likelihood of neutral mutations and populations with low mutation rates evolve “fit” genomic architecture with a high likelihood of deleterious mutations; they named this phenomenon the “survival of the flattest” effect [216]. In a follow-up study, Edlund and Adami showed that flat genotypes evolved robustness by evolving increased negative epistasis, although mutational effects were, on average, multiplicative for flat genotypes [44]. Further work demonstrated that this relationship between robustness and epistasis was prevalent in many computational evolution systems [216]. Misevic et al. showed that sexual reproduction protected against extinction due to the mutational load in small populations [144]. Avida experiments supported the hypothesis [68, 97] that small populations evolve mutational robustness while large populations evolve mutational fragility [48]. Additional survival of the flattest experiments have shown that population size does not alter the critical mutation rate at which the transition from fit genotypes to flat genotypes occur (although they only tested populations as small as 250 individuals [32]). Altogether, these Avida results show that: 1) 11 populations can evolve alternative genetic architectures under differing environmental conditions, 2) different avidian genetic architectures differ in mutational robustness, and 3) population size interacts with robustness in a non-trivial manner. While Avida has been often used to study the evolution of genomic architecture in regards to robustness, it was also developed to study the evolution of biological complexity. Avida has been used to test the development of a new metric for complexity, physical complexity [2, 4], and show that biological complexity tends to increase in populations under selection [5]. Lenski et al. 2003 demonstrated how traits of great complexity require the intermediate evolution of simpler traits to act as a “structure” on which to construct greater complexity [109]. This study also showed that deleterious mutations often precede the evolution of novel traits in a lineage’s evolutionary trajectory [109], a result later confirmed in Avida for general adaptation for populations evolving with a high mutation rate [33]. In recent years, Avida has been used to explore many topics in the evolution of complexity, such as the origin of somatic cells [63], the origins of division of labor [62], and the role of host-parasite interactions [56, 232] and other ecologies [29, 205] in driving the evolution of complexity. Together, the prior results demonstrate that Avida is a suitable model system to study the evolution of robustness and complexity in small populations. 1.4 Outline In Chapter 2, I present previously-published [101] results on the role of population size and genetic drift in driving the evolution of genetic complexity (measured as genome size) and pheno- typic complexity (measured as the number of phenotypic traits). In Chapter 3, I present previously- published results [102] on a phenomenon I call “drift robustness” and discuss how that shapes the evolution of small populations. In Chapter 4, I present results on how greater genome size can enhance the likelihood of extinction of small populations by altering their likelihood of lethal mutations. Finally, I conclude by discussing some further conclusions I missed in the original published versions of each chapter, by discussing some limitations of these research projects, and by discussing what I see are likely productive areas of future research. 12 CHAPTER 2 DIFFERENT EVOLUTIONARY PATHS TO COMPLEXITY FOR SMALL AND LARGE POPULATIONS OF DIGITAL ORGANISMS Thomas LaBar, Christoph Adami Originally published: PLoS Computational Biology, 12(12): e1005066. Figures have been renumbered from the published version. Reproduced with permission from the publisher: http://journals.plos.org/ploscompbiol/s/licenses-and-copyright 2.1 Abstract A major aim of evolutionary biology is to explain the respective roles of adaptive versus non- adaptive changes in the evolution of complexity. While selection is certainly responsible for the spread and maintenance of complex phenotypes, this does not automatically imply that strong se- lection enhances the chance for the emergence of novel traits, that is, the origination of complexity. Population size is one parameter that alters the relative importance of adaptive and non-adaptive processes: as population size decreases, selection weakens and genetic drift grows in importance. Because of this relationship, many theories invoke a role for population size in the evolution of complexity. Such theories are difficult to test empirically because of the time required for the evolution of complexity in biological populations. Here, we used digital experimental evolution to test whether large or small asexual populations tend to evolve greater complexity. We find that both small and large—but not intermediate-sized—populations are favored to evolve larger genomes, which provides the opportunity for subsequent increases in phenotypic complexity. However, small and large populations followed different evolutionary paths towards these novel traits. Small pop- ulations evolved larger genomes by fixing slightly deleterious insertions, while large populations fixed rare beneficial insertions that increased genome size. These results demonstrate that genetic drift can lead to the evolution of complexity in small populations and that purifying selection is not 13 powerful enough to prevent the evolution of complexity in large populations. 2.2 Author Summary Since the early days of theoretical population genetics, scientists have debated the role of population size in shaping evolutionary dynamics. Do large populations possess an evolutionary advantage towards complexity due to the strength of natural selection in these populations? Or do small populations have the advantage, as genetic drift allows small populations to cross fitness valleys that large populations are unlikely to traverse? There are many theories that predict whether large or small populations–those with strong selection or those with strong drift–should evolve the greatest complexity. Here, we use digital experimental evolution to examine the interplay between popu- lation size and the evolution of complexity. We found that genetic drift could lead to increased genome size and phenotypic complexity in very small populations. However, large populations also evolved large genomes and phenotypic complexity. Small populations evolved larger genomes through the fixation of slightly deleterious insertions, while large populations used rare beneficial insertions. Our results suggest that both strong drift and strong selection can allow populations to evolve similar complexity, but through different evolutionary trajectories. 2.3 Introduction The relative importance of adaptive (i.e., selection) versus non-adaptive (i.e., drift) mechanisms in shaping the evolution of complexity is still a matter of contention among evolutionary biologists [5, 22, 94, 118, 140, 191]. In molecular evolution, the role of non-adaptive evolutionary processes such as genetic drift and genetic draft are well-established [61, 89, 155]. Theoretical population-genetic principles argue that neutral evolution, not natural selection, drove the evolution of large, primarily non-functional, genomes [43, 128, 161]. Meanwhile, there exists abundant experimental evidence that natural selection is the main cause of evolutionary change [103, 194, 202], including the spread of novel adaptive phenotypes [20, 142] in experimental populations. However, it is still possible 14 that non-adaptive processes play a significant role in the evolution of complexity. For instance, genetic drift (or relaxed selection) may allow for the accumulation of mutations that can later lead to the evolution of novel complexity [118, 204]. Much of the work demonstrating the role of selection in driving the evolution of novel complex traits is based on experiments with large populations and strong selection [84]. In much smaller populations (i.e., those with fewer than 104 individuals), selection is weaker, and genetic drift begins to alter evolutionary dynamics [103, 131]. Therefore, to explain the role of adaptive vs. non-adaptive process in the evolution of complexity, one must explore the role of population size in the evolution of complexity. Both theoretical modeling and experiments suggest many possibilities for the relationship between population size and the evolution of complexity. There are two classes of evolutionary trajectories that would favor large populations in the evolution of complexity. First, populations could perform an adaptive walk (the fixation of a sequence of beneficial mutations) towards the evolution of a novel complex trait [181]. If this was the case, then larger populations would follow this trajectory faster than small populations due to their larger mutation supply. Experiments with microorganisms support the possible existence of adaptive trajectories towards complexity, as there is strong evidence that the mutations leading up to a phenotypic innovation in both Escherichia coli [168] and phage λ [24] were under positive selection. However, it is unclear whether adaptive mutations generally precede the evolution of complex traits or whether these large microbial populations can only take adaptive walks due to the intensity of selection in large populations. The second type of trajectory that favors large populations is the neutral walk (the fixation of a sequence of neutral mutations). While any individual neutral mutation has a low probability of fixation, a large population would be able to accumulate many neutral mutations at any given time allowing for the exploration of its fitness landscape. Work by Wagner and colleagues suggests that many phenotypic traits are connected to each other by sequences of phenotypically neutral mutations [203, 204]. If the evolution of complexity requires the fixation of deleterious mutations (for example, via valley-crossing), then the elimination of deleterious mutations by purifying selection may 15 limit the evolutionary advantage large populations may have. Wright was the first to propose an evolutionary advantage of small populations due to valley-crossing [223]. More recently, scientists have explored under which conditions small populations have an evolutionary advantage over large populations [81, 175]. A prominent theory that predicts that small (but not large) populations should evolve the greatest genomic complexity (and subsequently organismal complexity) is the Mutational Burden (or Mutational Hazard) hypothesis, proposed by Lynch and colleagues [118, 119, 121]. This hypothesis argues that genome size should be inversely correlated with the product of the effective population size and the mutation rate [94, 121]. Strong purifying selection against excessive genome size streamlines the genomes in large populations [13, 116, 235]. Meanwhile, weakened purifying selection and increased genetic drift in small populations results in the accumulation of slightly deleterious excess genome content [94, 119]. At a later time, this slightly deleterious genome content may be mutated into novel beneficial traits [96, 118]. However, recent work on valley-crossing in asexual populations (and sexual populations with a low recombination rate) showed that both small and large populations cross valleys more than intermediate-sized populations [153, 210, 211]. Therefore, it is not clear whether large or small populations are expected to evolve the greatest complexity when deleterious mutations are required. The long timescales required to observe the emergence of novelty and evolution of complexity make biological experiments to distinguish between these theories difficult to perform. To overcome this difficulty, we used digital experimental evolution [3] to test the role of population size on the evolution of genome size and phenotypic complexity in asexual organisms. Digital evolution has a long history of addressing macroevolutionary questions (such as the evolution of novel traits) experimentally [14, 226]. Digital populations can be manipulated in ways that biochemical organisms can not, making it possible to study aspects of the evolutionary process that are ordinarily too difficult to test [12]. In this regard, digital experimental evolution has the same goals as microbial experimental evolution: to use a well-controlled model system that is as simple as possible, to study “evolution in action” [45]. And while digital evolution studies cannot test hypotheses that depend on particular biochemical processes involved in cellular life, digital populations do undergo 16 selection, drift, and mutation, allowing for their use in testing hypotheses derived from theoretical population genetics. Thus, digital experimental evolution represents a well-suited model system to test the population genetics-based theories concerning the role of population size in the evolution of complexity. Here, we evolved populations ranging in size from 10 to 104 individuals, starting with a minimal-genome ancestor. We found that small populations do evolve greater genome sizes and phenotypic complexity (number of phenotypic traits) than intermediate-sized populations. These small populations evolve larger genomes primarily through increased fixation of slightly deleterious insertions. However, the small population sizes that enhance the evolution of phenotypic complex- ity also enhance the likelihood of population extinction. We also found that the largest populations evolved similar complexity to the smallest populations. Large populations evolved longer genomes and greater phenotypic complexity through the fixation of rare beneficial insertions instead. Large populations were able to discover these rare beneficial mutations due to an increased mutation supply. Finally, we found that a strong deletion bias can prevent the evolution of greater complexity in small, but not in large, populations. 2.4 Results To explore the effect of population size on the evolution of genome size and phenotypic complexity, we used the Avida digital evolution system [154]. Avida is a platform that allows researchers to perform evolution experiments inside of a computer, as the genetic code that evolves are actual computer programs of variable length. It has been used extensively in research in evolutionary biology [1, 3, 217], and is described in detail in Methods. We evolved one hundred replicate populations across a range of population sizes (10 − 104 individuals) for 2.5 × 105 generations. Many of the smallest populations (those with ten indi- viduals) did not survive the entire experiment. Therefore, we evolved one hundred additional small populations ranging from twenty individuals to ninety individuals in order to examine how the probability of extinction was related to the evolution of complexity. All populations with at 17 least thirty individuals survived for the entire experiment. Forty-seven of the populations with ten individuals went extinct, while only one of one hundred populations underwent extinction in the populations with twenty individuals. Extinction was a consequence of populations evolving large genomes that accumulated deleterious mutations and led to the production of only non-viable offspring. These extinct populations were not included in the statistics described below. 2.4.1 Genome Size Evolution Of the surviving populations, we first examined how genome size changes from the ancestral value of fifteen instructions. The size of the genome from every population size increased, on average (see Fig. 2.1 and panel A in Fig. 2.2). However, both the smallest and the largest populations evolved the largest genomes. Populations with ten individuals evolved a median genome size of 35 instructions, while populations with ten thousand individuals evolved a median genome size of 36 instructions. The median final genome size decreased as population size increased for populations with between ten and fifty individuals. However, from populations with fifty individuals to populations with ten thousand individuals, the median final genome size increased as population size increased. Figure 2.1: Final genome size as a function of population size. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. Next, we examined the dynamics of fixation of insertion mutations (insertions, for short) to 18 10100100010000Population Size101102103Genome Size Figure 2.2: Evolution of complexity for small population sizes. Statistics shown in the main text for population sizes ranging from 10 to 100 individuals. Data for populations with 10 and 100 individuals are the same as in Figure 2.1. A) Evolution of genome size. B) Proportion of fixed insertions that were slightly-deleterious. C) Proportion of fixed insertions that were under positive selection. D) Number of evolved novel phenotypic traits. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. explain why both the smallest and the largest populations evolved the largest genomes. For each experimental population, we counted every insertion that occurred on the fittest genotype’s ancestral lineage that went back to the ancestral genotype (the “line of descent", see Methods). The median number of insertions fixed follows the same trend as the evolution of genome size (Fig. 2.3). A large fraction of these fixed insertions are slightly deleterious in populations with fewer than one hundred individuals (see Fig. 2.4 and panel B in Fig. 2.2). However, no insertions are slightly deleterious, on average, in large populations with more than one hundred individuals. The opposite trend holds for beneficial insertions. The fraction of insertions that are under positive selection 19 ABCD increases with increasing population size, with the largest populations usually fixing only beneficial insertions (Fig. 2.5 and panel C in Fig. 2.2). These data demonstrate that small populations evolve larger genomes through the fixation of slightly deleterious insertions. However, large populations can evolve similarly large genomes through the fixation of rare beneficial insertions. Figure 2.3: The number of insertions fixed as a function of population size. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. Figure 2.4: Proportion of slightly-deleterious insertions as a function of population size. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. 20 10100100010000Population Size050100150200250300350400450Insertions Fixed10100100010000Population Size0.00.20.40.60.81.0Proportion of Fixed Insertions Figure 2.5: Proportion of insertions under positive selection as a function of population size. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. 2.4.2 Evolution of Phenotypic Complexity Next, we focus on the role of population size in the evolution of phenotypic complexity (defined as the number of phenotypic traits). In Avida, a phenotypic trait is a program’s ability to perform a certain mathematical operation on binary numbers (see Methods). The evolution of phenotypic complexity follows the same trend as the evolution of genome size (see Fig. 2.6 and panel D in Fig. 2.2). Populations with ten individuals evolved a median of four traits, while populations with one thousand and ten thousand individuals evolved a median of one trait. The rest of the population sizes evolved a median of zero traits. As an avidian’s fitness is primarily determined by its phenotypic traits in the Avida environment used here, the evolution of fitness showed a similar trend to the evolution of phenotypic complexity (Fig. 2.7). That the trend in genome size evolution and in phenotypic complexity evolution are mirrored suggests that the evolution of larger genomes enables the evolution of increased phenotypic com- plexity. To establish a link between the two, we performed two tests. First, we examined the correlation between genome size and phenotypic complexity across all populations. Phenotypic complexity is positively correlated with genome size (Fig. 2.8, Spearman’s ρ ≈ 0.72; p < 2.3 x 10−57 ), suggesting that it was the increased genome size that allowed for the evolution of increased 21 10100100010000Population Size0.00.20.40.60.81.0Proportion of Fixed Insertions Figure 2.6: Final number of evolved phenotypic traits as a function of population size. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. Figure 2.7: The evolved fitness relative to the ancestor genotype. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. 22 10100100010000Population Size0246810Evolved Traits10100100010000Population Size100101102103104105106107108Relative Fitness phenotypic complexity. However, there are two potential mechanisms that could cause an increased genome size to result in increased phenotypic complexity. On the one hand an increased genome size could simply allow more “room” for novel functional content. On the other hand, it could be that the increased genome size leads to a faster rate of evolution due to the increased genomic mu- tation rate. To examine the role of an increased mutation rate in driving the evolution of phenotypic complexity, we evolved a further one hundred populations of ten individuals with a fixed genomic mutation rate of 1.5 × 10−1 (i.e., the ancestral genomic mutation rate). Under this condition, no population went extinct (as opposed to forty-seven in the variable mutation rate treatment). The fixed genomic mutation rate populations evolved a median of 2 phenotypic traits compared to the variable genomic mutation rate populations that had evolved a median of 4 phenotypic traits (Fig. 2.9). These data demonstrate that the increased genomic mutation rate that follows from larger genomes does increase the evolution of phenotypic complexity. However, even with a fixed genomic mutation rate, the smallest populations still evolved a greater median number of traits (on average 2 traits) than every other population size. Thus, while an increased genomic mutation rate (due to increased sequence length) indeed enhances the evolution of phenotypic complexity, small populations still possess an evolutionary advantage due to drift-driven increases in genome size only. Figure 2.8: Correlation between the final genome size and the final number of evolved traits. Black circles represent the combined data from populations with 10, 100, 1000, and 10000 individuals. Only replicates that survived all 2.5×105 generations were included. 23 101102103Genome Size0246810Evolved Traits Figure 2.9: The effect of a fixed mutation rate on the evolution of phenotypic complexity. The variable genomic mutation rate treatment represents the data from when the genomic point mutation rate is 10−1 × L, were L is the genome size. The fixed genomic mutation rate treatment represents the data from when the genomic point mutation rate was fixed at 1.5 × 10−1, independent of the genome size. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. 2.4.3 Non-Functional Insertions In the previous experiments, large populations evolved larger genomes and greater phenotypic complexity because they fixed rare beneficial insertions. Next, we more closely examine the finding that beneficial insertions are necessary for the evolution of complexity in large populations. We repeated the experiments with the same population sizes and mutation rates, except we changed how insertions worked. Instead of inserting one of the twenty-six instructions that compose the Avida instruction set, we inserted “blank” instructions into the genome (see Methods for details). These blank instructions cannot be beneficial (on their own or in combination with existing instructions) and would have to be further mutated to lead to the evolution of phenotypic complexity. In this treatment, greater phenotypic complexity in large populations would require a two-step mutational process, as opposed to the single step in a beneficial insertion. We saw no qualitative difference in the trend between these experiments and the original experiments (Fig. 2.10). Very small and large populations still both evolved the largest genomes and the greatest phenotypic complexity. Populations of all sizes evolved longer genomes and more 24 Variable GenomicMutation RateFixed GenomicMutation Rate0246810Evolved Traits phenotypic traits in this treatment (Fig. 2.10) than in the original treatment (Fig. 2.1 and Fig. 2.6). The fraction of fixed insertions that were under positive selection decreased for every population size compared to the original experiments, as expected from the insertion of non-functional instructions (Fig. 2.11). We observed an increased rate of extinction in the very small populations, with only 2 populations with ten individuals and 25 populations with twenty individuals surviving the experiment. Population extinction was likely enhanced by the increased growth in genome size in these experiments as compared to the original experiments. Figure 2.10: The evolution of complexity in the non-functional insertion treatment. All subplots are a function of the population size. A) The final genome size. B) The final number of evolved phenotypic traits. Populations with 20 individuals are shown instead of those with ten individuals due to the high extinction rates of populations with ten individuals. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. 25 AB Figure 2.11: The proportion of fixed insertions that were under positive selection in the non-functional insertion treatment compared to the original treatment for populations with 104 individuals. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. 2.4.4 Deletion Bias Finally, we performed experiments to test whether the effect of a deletion bias (a higher fraction of deletions among all indels) alters the relationship between population size and the evolution of complexity. A biased ratio of deletion to insertion mutations is found in biological organisms across the tree of life, especially in bacteria [99, 143]. In these experiments we set the ratio of deletions to insertions as 9:1, but kept the total indel mutation rate as in the original experiments. In this treatment, only one population with ten individuals went extinct, as opposed to 47 populations in the original treatment. However, the advantage towards evolving complexity previously enjoyed by small populations vanished (Fig. 2.12). The median genome size increased as the population size increased for all populations sizes. Only the largest populations evolved a median number of novel phenotypic traits greater than zero. These results suggest that it is not only the role of genetic drift, but the equal frequency of insertions and deletions that results in the increased genome size and phenotypic complexity in small populations. 26 OriginalTreatmentNon-Functional InsertionTreatment0.00.20.40.60.81.0Proportion of Fixed Insertions Figure 2.12: The evolution of complexity in the deletion bias treatment. All subplots are a function of the population size. A) The final genome size. B) The final number of evolved phenotypic traits. Red lines are the median values for each population size. The upper and lower limits of each box denote the third and first quartile, respectively. Whiskers are 1.5 times the relevant quartile value. Plus signs denote those data points beyond the whiskers. Data represent only those populations that did not go extinct. 2.5 Discussion The idea that small populations could have an evolutionary advantage over large populations dates back to Wright and his Shifting Balance theory [223]. More recently, a potential small- population advantage has been demonstrated both theoretically [81] and experimentally [175], but only in regard to short-term increases in fitness. The Mutational Burden hypothesis provides an evolutionary mechanism that gives small populations an advantage towards increased phenotypic complexity [96, 118]. However, an experimental demonstration of this advantage is lacking. Our study provides further insight into the conditions that give small populations such an evolutionary advantage. We confirmed that small populations do evolve larger genomes due to the increased 27 AB fixation of slightly deleterious mutations, as predicted [121]. We also showed how small populations have an increased potential to later evolve increased phenotypic complexity in small populations through the larger genomes generated by increased genetic drift [94, 118]. As phenotypic traits are strongly beneficial in the Avida environment used here, these small populations used slightly deleterious genome expansions to cross fitness valleys and eventually reach novel fitness peaks. Our work also shows that this evolutionary advantage of small populations is limited by an increased rate of population extinction. Such a trend between the evolution of large genomes and an increased rate of extinction is seen in some multicellular eukaryote clades [200, 201]. These small populations are still likely to have a larger risk of extinction beyond that caused by population-genetic risks such as Muller’s ratchet [151] and mutational meltdowns [125, 233]. Ecological stressors increase extinction risk [134] and small populations are less able to adapt to detrimental environmental changes [221]. Our results concerning extinction, combined with the risk of other factors not examined here, suggest that the likelihood of a small population using genetic drift to evolve greater complexity without an increased risk of extinction may be limited. However, it is possible that multiple small populations could reduce the risk of extinction without reducing the evolution of complexity; future work should consider the interplay between population size and the evolution of complexity within a metapopulation of small populations. Large populations also evolved greater genome sizes and phenotypic complexity. In our original experiments, genome evolution in large populations was driven by the fixation of rare beneficial insertions (Fig. 2.6). While it is likely that many gene duplications are not under positive selection and lost due to genetic drift and mutation accumulation [120], some, especially those resulting in the amplification of gene expression, can be immediately beneficial and later lead to increased phenotypic complexity [17, 21, 77, 152]. Due to the increased mutation supply, these events would occur at a greater frequency in large populations [207] and possibly lead to an increased probability of the evolution of complexity there. However, we also found that large populations did not require this large supply of beneficial insertions. Even when insertion mutations added non-functional instructions and further point mutations were required to evolve functional traits, large populations 28 still evolved complexity similar to that evolved in small populations. These results suggest that purifying selection may not limit the evolution of complexity in large populations. Finally, we found that when deletions occur at a much greater frequency than insertions, only large populations have an evolutionary advantage towards complexity. As many bacteria do have a bias towards deletions [36, 100], this result suggests that large microbial populations can have an evolutionary advantage over small microbial populations for evolving novel traits after all. Such a trend where both large and small, but not intermediate-sized populations have an evolutionary advantage has already been theoretically proposed elsewhere. Weissman et al. showed that both small and large populations cross fitness valleys more easily than intermediate-sized populations [210]. Small populations valley-crossed due to genetic drift and large populations did so due to an increased supply of double mutants. Ochs and Desai also showed that intermediate- sized populations evolved to a lower fitness peak compared to small or large populations when valley-crossing was required for reaching a higher peak [153]. We found similar results, but from different evolutionary mechanisms. Here, populations needed to increase in genome size in order to evolve phenotypic complexity. Additionally, our populations evolved in a complex fitness landscape with many different possible paths to phenotypic complexity. While small populations did fix deleterious insertions to increase genome size, large populations evolved on a different path, either through beneficial insertions (Fig. 2.5) or neutral insertions (Fig. 2.9). It is possible that even larger populations than those evolved here would fix more deleterious insertions, as the likelihood of a further, beneficial mutation arising on the background of a segregating deleterious mutation increases as population size increases. However, our results emphasize that large populations may not be dependent on valley-crossing in some fitness landscapes if alternative evolutionary trajectories exist, even if these trajectories are rare. While the first maps of fitness landscapes suggested mutational paths are small in number [209], more recent work suggests that many indirect evolutionary trajectories exist in larger fitness landscapes [162]. The small population sizes that led to the evolution of greater phenotypic complexity via drift are very small (10 individuals). As biological populations of that size are unrealistic, we 29 may wonder whether small biological populations can actually evolve greater complexity due to increased genetic drift. However, there are reasons to believe that these results would generally hold for biological systems. The limited range of small population sizes that led to complexity is an Avida-specific result due to the severe fitness effect of insertion mutations in avidians with small sequence length. We found that for those sequences, most insertions are lethal (about 80%), and the rest are significantly detrimental, of the order 10% to 90%. To overcome a detrimental effect of 20% via drift, populations must be as small as N = 10. Insertion mutations in biological genomes are not nearly as detrimental, and therefore the critical population size to see evolution of complexity via drift is much larger. In E. coli, for example, the deleterious effect of insertions is between 1 and 3% [46]. We can therefore expect to see the effect of increased complexity due to drift in biological populations that are small, but not unreasonably small. Another possible avenue for future work suggested by this study is to use a simpler population genetics model to explore the same questions we attempted to answer here. Many previous theoretical studies have examined the relevance of valley-crossing to the evolution of complex traits in simple fitness landscapes [153, 210, 211]. One benefit of a simpler model is that it allows for a broader exploration of the relevant parameters involved in the interplay between population size, genome size, and the evolution of phenotypic complexity. While we were not able to perform large parameter searches using the Avida system, our work here establishes a possible relationship between the factors that influence the evolution of complexity in a fitness landscape with many possible mutational trajectories to novel traits [109]. These results should drive future theoretical studies on the evolution of genome size and phenotypic complexity using population genetics models with simpler fitness landscapes. Here we studied the evolution of complexity in haploid asexual digital organisms with an ancestral minimal genome on a frequency-independent fitness landscape. While beyond the scope of this work, it is worth considering how adjusting these genotype characteristics would alter our results. It is likely that the ancestral minimal genomes are a requirement for small populations to evolve the same number of novel traits as large populations. If the ancestor organism had a significant 30 amount of non-functional genome content, the mutation supply advantage that large populations have should result in an accelerated rate of phenotypic evolution in large populations [48]. The organisms used here, as in all Avida experiments, are haploid. It is possible that polyploidy would alter the results found here. However, the implementation of a ploidy cycle in Avida is non-trivial due to the mechanistic style of replication, and so presently other experimental systems would have to be used to explore the role of ploidy in the evolution of phenotypic complexity. It is unclear how sexual, instead of asexual, reproduction would change the results. While sex- ual reproduction can enhance adaptation by combining beneficial mutations that arise in different background, it can also break up beneficial combinations of mutations [157]. One result that may be altered by sexual reproduction is the rate of extinction in small populations, as sex has been found to reduce the rate of mutational meltdowns [127]. Weissman et al. also demonstrate that the large population advantage towards valley-crossing does not exist under high recombination rates [211]. Sexual reproduction has previously been studied using Avida, but it is more akin to homologous recombination in bacteria [145] (as there is no ploidy cycle). Future work should address the role of sexual recombination on the results shown here. Finally, the experiments performed here had no frequency-dependent fitness effects. Previous Avida studies showed that frequency-dependent interactions enhanced the evolution of complexity for a given population size [205, 232]. It is worth exploring how the presence of frequency-dependent selection alters the evolution of complexity, especially in small populations. The benefits of the diversity seen in frequency-dependent fitness landscapes may be reduced in small populations. The extensions to the experiments performed here would provide a more complete understanding of the role of adaptive and non-adaptive evolutionary processes in the origins of complexity. 31 2.6 Methods 2.6.1 Avida In order to experimentally test the role of population size and genetic drift in the evolution of complexity, we used the digital evolution system Avida version 2.14 [154]. In Avida, self-replicating computer programs (avidians) compete in a population for a limited supply of CPU (Central Processing Unit) time needed to successfully reproduce. Each avidian consists of a circular haploid genome of computer instructions. During its lifespan, an avidian executes the instructions that compose its genome. After executing certain instructions, it begins to copy its genome. This new copy will eventually be divided off from its mother (reproduction in most Avida experiments is asexual). Because an avidian passes on its genome to its descendants, there is heredity in Avida. As an avidian copies its genome, mutations may occur, resulting in imperfect transmission of hereditary information. This error-prone replication introduces variation into Avida populations. Finally, avidians that differ in instructions (their genetic code) also likely differ in their ability to self-replicate; this results in differential fitness. Therefore, because there is differential fitness, variation, and heredity, an Avida population undergoes evolution by natural selection [164]. This allows researchers to perform experimental evolution in Avida as in microbial systems [75, 84]. Avida has been successfully used as a model system to explore many topics concerning the evolution of complexity [5, 63, 108, 109, 232]. Twenty-six different instructions compose the Avida instruction set (see [154] for a more complete overview). These include instructions for genome replication, such as an instruction to allocate memory for a new daughter genome, an instruction to copy instructions from the mother genome into the daughter genome, and an instruction to divide off the new avidian. There are instructions that allow for the input, output, and manipulation of random numbers that are used in the performance of certain Boolean logic calculations (see below). There are also instructions for altering instruction execution, including conditional instructions and instructions for changing the next instruction location in the genome to be executed. It is important to note that the Avida 32 instruction set was not designed to mimic any biological organism. Instead, it was created in order to have an organism with mechanistic reproduction in a non-specified fitness landscape that allows for studies of evolutionary dynamics. The Avida world consists of a toroidal grid of N cells, where N is the (maximum) population size. When an avidian successfully divides, its offspring is placed into a cell in the population. While the default setting places the offspring into one of nine neighboring cells of the parent, here the offspring is placed into any cell in the entire population. This simulates a well-mixed environment without spatial structure. When there are empty cells in the population, new offspring are preferentially placed in an empty cell. However, if the population is at its carrying capacity, the individual who is currently occupying the selected cell is replaced by the new offspring (a new individual can also eliminate its parent if that cell is selected). This adds an element of genetic drift into the population as the individual to be removed is selected without regard to fitness. A population can also decrease in size by the death of individuals. An avidian will die without producing offspring if it executes 20L instructions without successfully undergoing division, where L is the avidian’s genome size. This can lead to population extinction in very small populations. Time in Avida is divided into updates, not generations. This method of keeping time was implemented in order to allow individuals to execute their genomes in parallel. During one update, a fixed number of instructions is executed across the entire population. The resource that is necessary to execute instructions (the CPU “energy”) is measured in SIPs (single instruction processing) units. By default, there are 30N SIPs available to the entire population per update, where N is the population size. SIPs are distributed among the individual genotypes within a population in proportion to the trait or traits displayed by an individual. The total amount of SIPs garnered by an individual from traits is called the “merit”. In a homogeneous population of one genotype (clones) where each individual has the same merit, each individual will obtain approximately 30 SIPs per update. However, in a heterogeneous population where merit differs between individuals, SIPs will be distributed in an uneven manner. That way, individuals with a greater merit will execute and/or replicate a larger proportion of their genome per update and 33 replicate faster, thus having a greater fitness. This places a strong selection pressure on evolving a greater merit. One generation has passed when the population has produced N offspring. Typically (depending on the complexity of an avidian) between 5 and 10 updates pass in one generation. A genotype’s merit is increased through the evolution of certain phenotypic traits that form a “digital metabolism” [3]. These phenotypic traits are the ability (or lack there-of) to perform certain Boolean logic calculations on random binary numbers that the environment provides. To do this, an avidian must have the right “genes”–in this case, the right sequence of instructions. First, during an avidian’s lifespan, instructions that allow for the input and output of these random binary numbers must be executed. Further instructions should manipulate those numbers so as to perform the rewarded computations. When a number is then written to the output, the Avida program checks to see whether a logic operation was successfully performed. If so, the the individual that performed the computation consumes a resource tied to the performance of that trait (there are many different codes, that is, combinations of instructions, that will trigger the reward). Resource consumption causes the offspring of that individual to have their merit modified by a factor set by the experimenter. Here, we use the “Logic-9” environment to reward the performance of nine one- and two-input logic functions [109]; see Table 2.1 for the names and specific rewards of each function. Each individual only gains a benefit from performing each function once per generation. There is an infinite supply of resources for the performance of each logic function in the present experiments, making fitness frequency-independent. Because the performance of these logic functions increases merit, they also increase fitness and are under strong positive selection. While increases in an individual’s merit increase replication speed and thus the individual’s fitness, fitness in Avida is implicit and not directly calculated. Unlike simulations of evolutionary dynamics, a genotype’s fitness is thus not set a priori by the experimenter. The only way to measure the fitness of an avidian is to run it through its lifecycle and examine its phenotype. This is similar in principle to how bacterial fitness cannot be calculated by examining an individual bacterium’s genome, but must be measured through a number of different experiments, such as competition assays [107]. A genotype’s fitness is determined by how many offspring it can produce per unit 34 Table 2.1: Merit rewards for the evolution of phenotypic traits. Boolean Logic Calculaton Merit Multiplier 2 2 4 4 8 8 16 16 32 NOT NAND ORNOT AND ANDNOT OR NOR XOR XNOR (Equals) time. Genotypes that can reproduce faster will out-compete other genotypes, all else being equal. Therefore, evolution will increase a population’s fitness through two means. The first is that the population will evolve individuals with a greater number of phenotypic traits and thus with a greater merit, as explained above. The second way to increase replication speed is by optimizing (shortening) the replication time. This occurs either by shrinking the genome, which results in fewer instructions that need to be copied and replicated, or by optimizing genome architecture for faster replication. Fitness w in Avida can be estimated by the following equation: w ≈ merit replication time (2.1) For an avidian to be able to successfully reproduce, it must first allocate memory for the new individual, copy its genome into the allocated memory space, and then divide off the daughter organism. As instructions are copied, the avidian may inaccurately copy some instructions into the newly allocated memory at a rate set by the experimenter. Additionally, upon division, insertions and deletions of a single instructions occur at (possibly different) rates set by the experimenter. Finally, larger insertions or deletions (indels) can occur when an avidian divides into two daughter genomes if the division occurs unevenly. In most cases, this results in the creation of one larger and one smaller genome and both of these are non-viable. However, in rare cases, one of these new genotypes is able to reproduce, resulting in a large change in genome size in that individual’s descendants. Because this mutation through inaccurate division is a characteristic of a genome and 35 thus emergent, the rate at which it occurs is not set by the experimenter. 2.6.2 Experimental Design We used four experimental designs (treatments) to explore how population size determines the evolution of complexity: the original experiments, the non-functional insertion experiments, the fixed genomic mutation rate experiments, and the deletion bias experiments. For all experiments, we evolved populations of size N={10, 100, 1000, 10000} for 2.5× 105 generations under 100-fold replication. For the original treatment, we also performed experiments with population sizes of N={20, 30, 40, 50, 60, 70, 80, 90}. All populations were initiated at full size N with an altered version of the standard length-100 Avida start organism [154]. The alteration was the removal of all non-essential genome content (85 nop-c instructions). This reduced the genome size of the ancestor organisms from 100 instructions to only 15 instructions. For the original experiments, point mutations occurred at a rate of 0.01 mutations per instruction copied, and insertions and deletions at 0.005 events per division. Insertions and deletions occur at most once per division. The ancestor thus started with a genomic mutation rate of 0.15 mutations per generation (0.01 mutations/instruction copied × fifteen instructions copied per generation), but this changes over the course of the experiment as genome size evolves. These experiments are similar to most standard Avida experiments, with the exception of a smaller genome size (fifteen instructions) for the ancestral organism. For the remainder of the experimental settings, one of the above settings was changed to examine a specific effect. For the experiments where the genomic mutation rate was fixed, point mutations occurred at a rate of 0.15 mutations per division, independently of genome size, which fixes the mutation rate at 0.15 mutations/genome/generation. For the non-functional insertion experiments, the mutation rates were the same as in the original experiments. However, instead of inserting one of the twenty-six instructions from the Avida instruction set (see [154] for the Avida instruction set), “blank" instructions called nop-x were inserted. These instructions have no function on their own or in combination with any other instruction. Finally, for the deletion bias experiments, 36 point mutations occurred at the same rate as in the standard experiments. However, insertions and deletions did not occur at the same rate. Insertions occurred at a rate of 0.001 per division and deletions occurred at a rate of 0.009 per division. This kept the total mutation rate equal to the other experimental treatments, while altering the ratio of insertions to deletions. 2.6.3 Data Analysis In order to analyze the evolution of complexity in each population, we extracted the individual with the greatest fitness at the end of each experiment (the “dominant” type). We then calculated relevant statistics for each of these genotypes by running them through Avida’s analyze mode. This mode allows us to run each genotype through its lifecycle in isolation, and calculate its fitness, its genome size, whether it performs any logic functions, and whether it produces viable offspring, among other characteristics. To measure the evolution of phenotypic complexity, we determined how many unique logic calculations each genotype could perform. Such a measure of complexity is similar to a measure of phenotypic complexity used previously [191] in population genetics. The relative fitness was calculated by dividing the analyzed fitness value by the ancestor’s fitness (0.244898). To examine why certain population sizes evolved larger genomes, we examined the “line of descent” (LOD) of the fittest type [109]. An LOD contains every intermediate genotype between the final individual with the greatest fitness and the ancestral genotype that initialized each popu- lation. This line provides a perfect “fossil record” to examine all of the mutations, insertions, and deletions that led to the final fittest genotype for each population. We also calculated the selection coefficient s for each mutation, defined as the ratio of the offspring’s fitness to the parent’s fitness minus one. We defined beneficial mutations as those with s > 0 and deleterious mutations as those with s < 0 (this ignores classifying slightly beneficial and slightly deleterious mutations as neutral.) We determined the number of beneficial insertion mutations by counting those insertions on the LOD with s > 1 N , where N is the population size. These are beneficial mutations that are not nearly-neutral and hence should be under positive selection. We note that using s > 1 N is only an 37 approximation, as the equation for a nearly neutral mutation is |s| (cid:28) 1 Ne , where Ne is the effective population size [156]. We also examined those mutations that had a slightly deleterious effect on fitness, i.e., those whose selection coefficient was − 1 N < s < 0. 2.7 Acknowledgments We thank members of the Adami lab for discussions. This work was supported in part by Michigan State University through computational resources provided by the Institute for Cyber-Enabled Research. 38 CHAPTER 3 EVOLUTION OF DRIFT ROBUSTNESS IN SMALL POPULATIONS Thomas LaBar, Christoph Adami Originally published: Nature Communications, 8:1012, 2017. Chapter (with minor corrections) reused with permission from the publisher: https://www.nature.com/reprints/permission-requests.html 3.1 Abstract Most mutations are deleterious and cause a reduction in population fitness known as the mutational load. In small populations, weakened selection against slightly-deleterious mutations results in an additional fitness reduction. Many studies have established that populations can evolve a reduced mutational load by evolving mutational robustness, but it is uncertain whether small populations can evolve a reduced susceptibility to drift-related fitness declines. Here, using mathematical modeling and digital experimental evolution, we show that small populations do evolve a reduced vulnerability to drift, or “drift robustness”. We find that, compared to genotypes from large popu- lations, genotypes from small populations have a decreased likelihood of small-effect deleterious mutations, thus causing small-population genotypes to be drift-robust. We further show that drift robustness is not adaptive, but instead arises because small populations can only maintain fitness on drift-robust fitness peaks. These results have implications for genome evolution in organisms with small effective population sizes. 3.2 Introduction One consequence of the power of adaptation is that the majority of mutations reduce their bearer’s fitness [51]. The recurring nature of these deleterious mutations results in an equilibrium reduc- tion of population fitness at mutation-selection balance. At the population level, this reduction 39 in fitness is known as the genetic or mutational load [7, 34, 90, 91]. As selection generally acts to increase a population’s mean fitness, one avenue for selection to increase mean fitness is to reduce the mutational load by altering mutation-selection balance and increasing mutational robustness [38, 218]. The evolution of mutational robustness has been demonstrated using the- oretical modeling [68, 97, 169, 198], digital experimental evolution [44, 48, 220], and microbial experimental evolution [146, 176, 179]. Recurring deleterious mutations are not the only strain on fitness. In small populations, genetic drift leads to the fixation of slightly-deleterious mutations that bring about a reduction in fitness [167, 212]. Over time, genetic drift can lead to continual fitness declines and ultimately population extinction [72, 125]. In asexual populations, this phenomenon of fitness decline is known as Muller’s ratchet [151] and is thought to play a role in the evolution of mitochondria [113], bacterial endosymbionts [148], the Y chromosome [65], and other obligate asexual lineages. Muller’s ratchet may explain why there are few long-lived obligate asexual species and may provide a selection pressure for the evolution of sexual recombination [54]. However, it was recently proposed that small populations do not continuously decline in fitness, but only do so until they reach drift-selection balance when the fixation of beneficial mutations counteracts the fixation of slightly-deleterious mutations [66, 167, 183, 212]. Furthermore, Muller’s ratchet may be limited in strength if small populations can alter drift-selection balance and evolve drift robustness. However, it is unknown if populations can evolve drift robustness, or what genetic and evolutionary mechanisms could cause drift robustness. Here, we propose a hypothesis concerning the evolution of drift robustness in small populations. Consider evolution on a single-peak fitness landscape (Fig. 3.1a). In a large population (defined here such that its effective population size is larger than the inverse of every selection coefficient in the landscape), natural selection will ultimately lead to the fixation of all beneficial mutations. In a small population, while selection may also lead to the fixation of these beneficial mutations, weakened purifying selection inherent to small populations will result in the subsequent loss of these beneficial mutations. Thus, while a large population can maintain itself at the top of the 40 fitness peak, a small population is unable to maintain fitness due to an increased rate of fixation of slightly-deleterious mutations. Therefore, this small population will not occupy the top of the fitness peak, but some lower area where the fixation of slightly-beneficial mutations and the fixation of slightly-deleterious mutations balance out [66]. Figure 3.1: Conceptual diagram of drift robustness. a) A single-peak fitness landscape. In this landscape, the large population (red circles) can climb to the top of the fitness peak, while the small population (black circles) can only maintain fitness on an intermediate part of the peak. b) A multi-peak fitness landscape. The large population evolves to the same fitness peak as in panel a. The small population evolves to the steeper, drift-robust peak. While this peak is still lower than the drift-fragile peak, the small population attains greater fitness than it would have on the drift-fragile peak in the single-peak fitness landscape Now, consider a fitness landscape with two fitness peaks, with one peak slightly higher than the other peak (Fig. 3.1b). We will denote the higher peak as the “drift-fragile” fitness peak. A population evolves towards this peak by fixing a sequence of small-effect beneficial mutations. As a consequence, the genotype at the top of the peak will have many small-effect deleterious mutations in its mutational neighborhood. We will denote the lower peak as the “drift-robust” fitness peak. A population evolves to this peak by fixing a sequence of large-effect beneficial mutations and the genotype at the top of the peak will have many large-effect deleterious mutations in its mutational neighborhood. The question is: how will small and large populations evolve on this fitness landscape? According to our hypothesis, large populations will evolve towards the drift-fragile fitness peak and small populations will evolve towards the drift-robust fitness peak. This hypothesis is 41 0.00.20.40.60.81.0Fitness0.00.20.40.60.81.0Fitnessab similar to the idea of the “Survival of the Flattest” effect, where mutationally-robust genotypes will out-compete fitter, but more mutationally-fragile genotypes at high mutation rates [220]. However, we stress that the evolutionary mechanism behind this trend is not the out-competition of drift-fragile genotypes by drift-robust genotypes in small populations. Instead, we propose that small populations evolve to drift-robust fitness peaks because these populations can only maintain themselves on drift-robust areas of the fitness landscape. If a small population would evolve towards a drift-fragile part of the fitness landscape, it would subsequently fix deleterious mutations and decrease in fitness until it evolved back to a drift-robust area. Large populations can easily maintain fitness in drift-fragile areas, and thus we expect them to evolve to the higher fitness peak. Here, we demonstrate that small populations should evolve drift robustness in accordance with our hypothesis. We first confirm the logic behind this hypothesis with a two-peak fitness landscape mathematical model and show that drift robustness will evolve in small, but not large, populations in a fitness landscape with a drift-fragile fitness peak and a lower-fitness drift-robust fitness peak. Then, we use digital experimental evolution with the Avida system [154] to test this hypothesis in a complex fitness landscape. We find that small populations of digital organisms evolve towards fitness peaks with a low likelihood of slightly-deleterious mutations, while large populations evolve towards fitness peaks with a high likelihood of slightly-deleterious mutations. We end by discussing the implications of these results for organisms exposed to strong genetic drift, including bacterial endosymbionts and RNA viruses. 3.3 Results 3.3.1 A mathematical model of drift robustness To test our drift robustness hypothesis, we designed a minimal mathematical model in order to test the conditions under which drift robustness will evolve in small populations, while drift fragility will evolve in large populations (see Methods). We designed a fitness landscape with a wild-type genotype with fitness w1 = 1 and two fitness peaks with w3 = 1 + s and w4 = 1 + s− , respectively 42 (Fig. 3.2a), where s is a fitness advantage (in percent), and  quantifies how much lower the fitness of the peak w4 is compared to w3. The lower of these peaks is the drift-robust fitness peak, as it can be reached from the wild-type genotype by fixing a strongly-beneficial mutation of size s − . As a consequence, this peak’s mutational neighborhood only consists of strongly-deleterious mutations. The drift-fragile peak is, in contrast, reached by first fixing an intermediate genotype with fitness w2 = 1 + s 2 and then fixing another mutation with the same fitness effect. Both these mutants are slightly-beneficial and thus the drift-fragile peak will have a mutational neighborhood of slightly-deleterious mutations. In an extended model (Fig. 3.2b), the drift-fragile peak has n − 1 intermediate steps that are reached with mutations of step size s/n, so that choosing n allows us to vary the steepness of the slope of the drift-fragile peak. When disregarding landscape structure, population genetics arguments imply that large popu- lations will fix on the higher of the two peaks, while if the population size is small, the difference in fitness between peaks is irrelevant and a population should fix on either peak with approximately equal probability. Instead, our model predicts that when deleterious mutations are more abundant than beneficial mutations, there is a broad range of parameter values where the small populations evolve to predominantly fix at the drift-robust fitness peak (even though it is of lower fitness) and the large populations evolve to the drift-fragile fitness peak (Fig. 3.3). For the minimal model, we derive a critical population size at which the small population shifts from fixing at the higher peak to the lower one instead, with the assumption that beneficial mutations are exponentially-distributed, (see Methods) Ncrit = 1 + (3.1) log κ−1 2 where κ = ub fitness effect ¯s . ¯s < 1 is the ratio between the beneficial mutation rate ub and the mean beneficial In the extended model, where n (and thus the slope of the drift-fragile peak) can vary, we find the critical population size to be: Ncrit = 1 + (n − 1)log κ−1 2 43 (3.2) Figure 3.2: The fitness landscapes for the Markov model to test for the evolution of drift robustness. Each circle represents one genotype and is labeled with its fitness. Each arrow represents the transition between one genotype to another (including the identical genotype) and is labeled with the transition probability. a) The fitness landscape for the minimal model. s represents the selection coefficient of the drift-fragile peak and  represents the small fitness difference between the drift-fragile peak and the drift-robust peak. ui j and πi j represent the mutation rate between genotypes and probability of fixation from one genotype to another, respectively. b) The fitness landscape for the extended model. Variables as in panel a. Transition probabilities omitted for clarity where n is the number of mutations required to reach the top of the drift fragile peak. This general equation makes the following predictions concerning how populations should shift from drift- fragile peaks to drift-robust peaks. As the fitness deficit of the drift-robust fitness peak increases (), the critical population size, and thus range of population sizes that lead to the evolution of drift robustness, also decrease (Fig. 3.3). In other words, small populations will only preferentially evolve towards the drift-robust fitness peak if the trade-off between drift robustness and fitness is not too severe. If the drift-robust peak results in extremely low fitness, the small population will evolve as far up the drift-fragile peak as it can while maintaining fitness. As the shallowness 44 1+s✏1+s2u41⇡4111+su14⇡141u41⇡41u21⇡21u12⇡12u23⇡23u32⇡321u32⇡321+s1+s✏11+ksn1+(n1)sn1+snab......1u21⇡21u23⇡231u12⇡12u14⇡14 Figure 3.3: Critical population size for shift between robust and fragile peaks. a) Results for various n values with κ = 0.01. b) Results for various κ values with n = 2. of the slope of the drift-fragile peak increases (i.e., n, or the number of mutations to reach the drift-fragile peak, increases), the critical population size also increases. This result argues that the range of population sizes leading to the evolution of drift robustness is greater as the mutations that lead to the drift-fragile peak are more frequent, with a decreased beneficial effect. Finally, as κ decreases [either by a decrease in the beneficial mutation rate (ub) or an increase in the height of the fitness peaks (s)] the critical population size increases, demonstrating that the larger the differential between the flux of beneficial mutations towards the peaks, the larger the critical population size. We should note here that κ has a weaker influence on Ncrit than  or n. This lesser influence, due to the equation containing log κ−1, exists because there is only a slight difference in the fixation probability of beneficial mutations between small and large populations. The relevant difference comes down to the lack of maintainability of these beneficial mutations in small populations, an effect captured by n, the number of beneficial mutations required to reach the drift-fragile peak. 3.3.2 Drift robustness in digital organisms The mathematical model supports our hypothesis for the evolution of drift robustness in small populations, but it rests on a number of assumptions that may alter the evolution of drift robustness in complex fitness landscapes. For instance, we assumed that populations can be viewed as monomorphic and evolution proceeds as transitions from one genotype to another (these models 45 00.020.040.060.080.10ǫ101102103104Critical Population Sizeκ = 0.01n = 2n = 3n = 500.020.040.060.080.10ǫ101102103104Critical Population Sizen = 2κ = 0.001κ = 0.01κ = 0.1ab are broadly known as origin-fixation models [136]). We also used a fitness landscape with only two fitness peaks, while biological fitness landscapes certainly contain many fitness peaks. To test if small populations evolve drift robustness in a complex fitness landscape, we used the digital evolution system Avida [154]. In Avida, a population of self-replicating computer programs (“avidians”) compete for the memory space and CPU time necessary for reproduction. During self-replication, random mutations occur, potentially altering the new avidian’s reproduction speed. When an avidian successfully reproduces, its offspring replaces a random individual in the population, resulting in genetic drift. As avidians that replicate faster will produce more offspring per unit time than avidians with slower replication speeds, faster replicators are selected for and spread mutations that enable faster replication. Because Avida populations undergo selection, mutation, and drift, they represent a digital model system to study fundamental questions concerning evolutionary dynamics. To test for the evolution of drift robustness in small avidian populations, we evolved 100 replicate populations at small (102 individuals) and 100 populations at large (104 individuals) population sizes for 105 generations. From each population, we isolated the most abundant genotype at the end of the experiment; we will refer to these genotypes as the 100 small-population genotypes and the 100 large-population genotypes. Small populations evolved a lower relative fitness than large populations (median = 1.85 vs. median = 2.05, Mann Whitney U = 2237.0, n = 100, p = 7.31 × 10−12 one-tailed), as expected for populations that experience a smaller beneficial mutation supply over the course of the experiment. 3.3.3 Small populations evolve an altered DFE To look for signs of drift robustness, we studied differences in the Distribution of Fitness Effects (DFE) of de-novo mutations for small-population genotypes and large-population genotypes. First, we generated every possible point mutation for all genotypes and combined these data into one DFE (Fig. 3.4a). Both show the typical properties of DFE’s found in biological organisms: most mutations are either lethal or have little effect [51]. However, there are some differences. Small- 46 population genotypes have an excess of neutral, beneficial, and strongly deleterious mutations (defined as viable deleterious mutations with a fitness effect greater than or equal to 5%; Fig. 3.4b), while large-population genotypes have an excess of small-effect deleterious mutations (defined as viable deleterious mutations with a fitness effect less than 5%; Fig. 3.4b). We confirmed that these trends hold when we calculated a DFE for each genotype (rather than one DFE for all genotypes from a given population size) as follows. Small population genotypes had a greater likelihood of beneficial mutations (median=0.0256 vs. median=0.0008, Mann Whitney U = 413.5, n = 100, p = 6.26 × 10−30 one-tailed), neutral mutations (median = 0.40 vs. median = 0.26, Mann Whitney U = 321.0, n = 100, p = 1.45 × 10−30 one-tailed), large-effect deleterious mutations (median = 0.084 vs. median = 0.054, Mann Whitney U = 2854.0, n = 100, p = 7.90 × 10−8 one-tailed), and a lesser likelihood of lethal mutations (median = 0.33 vs. median = 0.37, Mann Whitney U = 4031.5, n = 100, p = 9.00 × 10−3 one-tailed) and small-effect deleterious mutations (median = 0.11 vs. median = 0.31, Mann Whitney U = 124.5, n = 100, p = 5.13 × 10−33 one-tailed; Fig. 3.4c). Additionally, there was no difference in the average single-mutant relative fitness for small-population genotypes and large-population genotypes, even though there were fitness differences between the population-size treatments (median w = 0.612 vs. median w = 0.611, Mann Whitney U = 4890.0, n = 100, p = 0.39 one-tailed; Fig. 3.4d). 3.3.4 Small population genotypes are drift-robust The lack of small-effect deleterious mutations in small populations suggests that these populations adapted to drift-robust fitness peaks and that the large populations adapted to drift-fragile peaks. To test if these small-population genotypes are drift-robust, we took the 100 small-population genotypes and 100 large-population genotypes and measured these genotypes’ change in fitness when placed in an environment with strong genetic drift (i.e., low population size). We evolved 10 populations of 50 individuals for each genotype for 103 generations and measured their change in fitness. Small-population genotypes clearly declined less in fitness than large-population genotypes (median decline = 1% vs. median decline = 6%; Mann-Whitney U = 43959.5, n = 1000, 47 Figure 3.4: Differences in mutational effects between small-population genotypes and large-population genotypes. Black and red represent small-population and large-population genotypes, respectively. a) The combined distribution of fitness effects (DFE) across all 100 small-population genotypes and 100 large-population genotypes. b) Same data as in panel a, but grouped into different classes of mutations, where a small-effect deleterious mutation is defined as having an effect less than 5%. Here, and throughout, deleterious mutation refers to viable deleterious mutations, while lethal mutation refers to non-viable deleterious mutations. c) The likelihood of a small-effect deleterious mutation for small-population and large-population genotypes. Red lines are medians, edges of the box are first and third quartile, whiskers are at most 1.5 times the interquartile range, and the plus signs are outliers. d) The mean relative fitness of every possible point mutation (1250 mutations) for each genotype. 48 −1.0−0.8−0.6−0.4−0.20.0Selection Coefficient0.00.10.20.30.40.5Frequency of MutationsLethalLargeDel.SmallDel.NeutralBeneficial0.00.10.20.30.40.5Frequency of MutationsSmall-PopulationGenotypesLarge-PopulationGenotypes0.000.050.100.150.200.250.300.350.400.45Fraction Small-effect Deleterious MutationsSmall-PopulationGenotypesLarge-PopulationGenotypes0.10.20.30.40.50.60.70.8Mean Relative Mutant Fitnessabcd p = 1.44 × 10−273 one-tailed; Fig. 3.5a). Furthermore, a genotype’s decline in fitness is correlated with its likelihood of a small-effect deleterious mutation, supporting the idea that small populations have evolved to fitness peaks with a low likelihood of small-effect deleterious mutations due to the peak’s drift robustness (Spearman’s ρ = 0.80, p ≈ 0; Fig. 3.5b). Figure 3.5: Small-population genotypes are drift-robust due to a decreased likelihood of small-effect deleterious mutations. Black and red data points represent small-population genotypes and large-population genotypes, respectively. a) Relative fitness of the most-abundant genotype from every population during the drift robustness test. Each circle represents the relative fitness of one genotype from one replicate. b) Relationship between relative fitness in the drift robustness test (panel a) and the likelihood of small-effect deleterious mutations (Fig. 3.4c) 3.3.5 Drift robustness is not due to fitness differences The above results are consistent with the hypothesis that small populations evolve to drift-robust fitness peaks and large populations evolve to drift-fragile fitness peaks. However, one could argue the results are also consistent with evolution on a single-peaked fitness landscape (Fig. 3.1a). The small populations we examined have lower fitness than the large populations and thus could have a decreased likelihood of small-effect deleterious mutations and more robustness to drift because they are further down on the fitness peak and cannot climb the rest of the peak. To test if our results were due to the lower fitness of the small-population genotypes, we isolated genotypes of the same fitness from the evolutionary lineages of the small and large populations (see Methods for details). We then compared these genotypes’ likelihood of small-effect deleterious mutations. Genotypes from 49 Small-PopulationGenotypesLarge-PopulationGenotypes0.850.900.951.001.051.10Relative Fitness after Drift Test0.000.050.100.150.200.250.300.350.400.45Fraction Small-effect Deleterious Mutations0.850.900.951.001.051.10Relative Fitness after Drift Testab the small populations had a decreased likelihood of small-effect deleterious mutations compared to genotypes from large populations for every examined fitness value (Fig. 3.6). These results support the hypothesis that small populations have evolved to different fitness peaks than large populations and are not merely occupying a lower region of the same fitness peak. Figure 3.6: Fraction of small-effect deleterious mutations for genotypes from small populations and large populations with equal fitness. Each area separated by a dashed line shows genotypes with equal fitness (w). S and L represent small-population and large-population genotypes, respectively. Differences for each fitness value are significant. (Mann-Whitney U-test; Bonferroni-corrected p < 3 × 10−3). Box plots as described for Fig. 3.4c. 3.3.6 Epistatic mutations lead to drift robustness Next, we examined the mutations that enabled small populations to evolve towards drift-robust peaks. Our mathematical model has a drift-robust peak accessible by strongly-beneficial mutations and a drift-fragile peak accessible by slightly-beneficial mutations. Therefore, we first examined the distribution of fitness effects for maintained beneficial mutations (beneficial mutations whose fitness gain was at-least partially maintained during subsequent evolution) to see if small populations fixed more strongly-beneficial mutations (Fig. 3.7a). While small populations did fix a significantly large proportion of maintained strongly-beneficial (s > 0.05) mutations (median = 0.06 vs. median = 0.05, Mann Whitney U = 4067.5, n = 100, p = 0.01 one-tailed), the difference was slight. The fixation of strongly-beneficial mutations is not the only way small populations could climb drift-robust peaks. Small populations could also climb drift-robust fitness peaks through 50 SLSLSLSLSLSLSLSLSLSL0.00.10.20.30.40.5Fraction Small-effect Deleterious Mutationsw = 0.3721w = 0.375w = 0.378w = 0.381w = 0.384w = 0.3871w = 0.4w = 0.4848w = 0.4851w = 0.5 Figure 3.7: Evidence of small-population adaptation to drift-robust fitness peaks. a) Distribution of maintained beneficial mutational effects for (effects for mutations whose fitness gain was at-least partially maintained during subsequent evolution) small-population genotypes (black) and large-population genotypes (red). b) Spearman correlation coefficients between fitness and the likelihood of a deleterious mutation for each maintained beneficial mutation from each population. c) The likelihood of (viable) deleterious and lethal mutations, shown in magenta and green, respectively, in a representative small population’s lineage. The strong decrease in the likelihood of a (viable) deleterious mutation early in the population’s history is evidence of epistatic mutations resulting in drift robustness. d) Number of populations that fixed a maintained beneficial mutation that decreased the likelihood of a (viable) deleterious mutation by at least a specified amount. Colors as in panel a. 51 0.000.050.100.150.200.25Fitness Effect0.00.10.20.30.40.5FrequencySmall-PopulationGenotypesLarge-PopulationGenotypes−0.6.0.4.0.20.00.20.40.60.81.0Spearman Correlation Coefficient0500100015002000Mutation Number0.00.10.20.30.40.50.60.70.8Likeli ood of Mutational Effect(Viable) Delete)iousLet al−0.40.0.35.0.30.0.25.0.20.0.15.0.10.0.050.000.05Change in Deleterious Mutational Likelihood020406080100Number of Populationsabcd epistatic beneficial mutations that decreased the likelihood of small-effect deleterious mutations. By decreasing the likelihood of small-effect deleterious mutations, the likelihood that a future small- effect deleterious mutation will arise and fix is also decreased. Thus, these epistatic beneficial mutations can be maintained by small populations. To see if these types of mutations were fixed in the small populations, we first looked at the correlation between the fitness of maintained beneficial mutations and their genotypes’ likelihood of deleterious mutations for each population. In a non-epistatic fitness landscape, we would expect the likelihood of deleterious mutations to increase as fitness increases due to the fixation of, and the subsequent decrease in the likelihood of, beneficial mutations. However, in some epistatic fitness landscapes, this correlation is not guaranteed to exist, as the fixation of beneficial mutations may alter the fitness effects of mutations at other loci. In fact, small populations showed a significant decrease in the correlation between fitness and deleterious mutational likelihood when compared to large populations (median Spearman’s ρ = 0.24 vs. median Spearman’s ρ = 0.73, Mann Whitney U = 2082.0, n = 100, p = 5.07 × 10−13 one-tailed; Fig. 3.7b). This result is consistent with small populations evolving towards fitness peaks with a decreased likelihood of deleterious mutations through the fixation of epistatic mutations. Next, we looked for specific mutational signatures of the fixation of epistatic beneficial mutations in small populations. We found that 22 small populations had fixed beneficial mutations that reduced the likelihood of a deleterious mutation by 50%, while only 4 large populations did so. All of these mutations increased the likelihood of lethal mutations (mean increase = 72%, 2× S.E. = 11%; Fig. 3.7c). We then studied the magnitude of the decrease in the likelihood of deleterious mutations. Forty-five small populations fixed mutations that decreased this deleterious likelihood by at least 0.1, while only 13 large populations did so (Fig. 3.7d). All of these mutations also increased the likelihood of lethal mutations. Finally, we confirmed that these epistatic mutations specifically decreased the likelihood of small-effect deleterious mutations. Most of the decrease in the likelihood of deleterious mutations consisted of a decrease in small-effect deleterious mutations (median percentage of decrease = 83.9%, interquartile range = 20.6-97.1%), further suggesting that 52 small populations evolve drift robustness by fixing beneficial mutations that decrease the likelihood of small-effect deleterious mutations and increase the likelihood of lethal mutations. 3.3.7 Deleterious mutations drive the evolution of drift robustness Finally, to test whether drift robustness evolves in small populations because these populations can only maintain fitness in drift-robust areas of the fitness landscape, we performed further evolution experiments where deleterious mutations were prevented from occurring (see Methods for further details). In this environment, populations cannot decline in fitness, so small populations do not maintain fitness differently on drift-robust and drift-fragile fitness peaks. We evolved 100 small populations without deleterious mutations under the main experimental conditions. Small popula- tion genotypes evolved greater relative fitness without deleterious mutations than in the treatment with deleterious mutations (median = 2.05 vs. median = 1.85, Mann Whitney U = 2464.6, n = 100, p = 2.92 × 10−10 one-tailed; Fig. 3.8a). As expected in an environment where fitness maintenance was not a factor, small-population genotypes had a greater likelihood of small-effect deleterious mu- tations (median = 0.19 vs. median = 0.11, Mann Whitney U = 1769.0, n = 100, p = 1.47 × 10−15 one-tailed; Fig. 3.8b). These small-population genotypes were less robust to genetic drift (median fitness decline of 5% vs. median fitness decline of 1%, Mann Whitney U = 118333.0, n = 100, p = 2.25 × 10−192 one-tailed; Fig. 3.8c) and this decreased robustness correlates with their in- creased frequency of small-effect deleterious mutations (Spearman’s ρ = −0.43, p = 2.07× 10−45; Fig. 3.8d). These results suggest that small populations evolve to alternative areas of the fitness landscape if they can maintain small-effect beneficial mutations. 3.4 Discussion Our results suggest the following explanation for the evolution of drift robustness in small populations. Small populations cannot adapt to fitness peaks with a high likelihood of small-effect If small populations climb these peaks, genetic drift will cause them to deleterious mutations. 53 Figure 3.8: The evolution of drift robustness in small populations with or without deleterious mutations in the initial adaptation experiments. Black and blue data points represent small-population genotypes adapted with deleterious mutations and small-population genotypes adapted without deleterious mutations, respectively. a) Relative fitness to the ancestral genotype after 105 generations of adaptation. Box plots as described for Fig. 3.4c. b) Likelihood of small-effect deleterious mutations. c) Relative fitness of the most-abundant genotype from every population during the drift robustness test. Each circle represents the relative fitness of one genotype from one replicate. d) Relationship between relative fitness in the drift robustness test (panel c) and the likelihood of small-effect deleterious mutations (panel b). lose previously-fixed beneficial mutations, leading to a decrease in fitness. In other words, small populations cannot maintain themselves on drift-fragile fitness peaks. Thus, small populations, if they do adapt, must adapt to drift-robust fitness peaks. Due to the relative increased strength of selection, large populations do not face this constraint and adapt to drift-fragile peaks. Therefore, our results argue that small populations and large populations should evolve to different areas of the fitness landscape and evolve qualitatively-different genetic architecture. We should emphasize here that there are certain requirements for the evolution of drift robustness in small populations. First, the fitness landscape must contain multiple peaks; some peaks must 54 Small-PopulationGenotypesSmall-PopulationGenotypesNo Deleterious1.21.41.61.82.02.22.42.62.8Relative FitnessSmall-PopulationGenotypesSmall-PopulationGenotypesNo Deleterious0.000.050.100.150.200.250.300.350.400.45Fraction Small-effect Deleterious MutationsSmall-PopulationGenotypesSmall-PopulationGenotypesNo Deleterious0.800.850.900.951.001.051.10Relative Fitness after Drift Test0.000.050.100.150.200.250.300.350.400.45Fraction Small-effect Deleterious Mutations0.800.850.900.951.001.051.10Relative Fitness after Drift Testabcd be drift-robust with few small-effect deleterious mutations and some must be drift-fragile with many small-effect deleterious mutations. If there is only one fitness peak, small populations would still likely have a decreased likelihood of small-effect deleterious mutations. However, this would occur because these populations have failed to maintain small-effect beneficial mutations, not because they have evolved to drift-robust peaks. Second, the requirement of multiple fitness peaks further implies that this effect will only be seen in fitness landscapes with strong epistasis in parts of the landscape, as (sign) epistasis leads to multiple fitness peaks [165]. Third, there must be evolutionary trajectories between drift-robust fitness peaks and drift-fragile fitness peaks. Otherwise, small populations would only evolve downwards on a drift-fragile fitness peak. Finally, there must be more trajectories to drift-fragile fitness peaks than drift-robust fitness peaks. We are not the first to propose that small populations will evolve robustness mechanisms in response to their deleterious mutational burden. However, these mechanisms are usually discussed in terms of mutational robustness, not robustness to drift. Previous studies provided two charac- teristics of the evolution of mutational robustness in small populations. First, small populations should preferentially evolve to lower fitness peaks with more “redundancy,” defined as a decreased average deleterious mutational effect and large populations should evolve to fitness peaks with a high average deleterious mutational effect [48, 97]. Our results suggest opposite evolutionary trajectories for small and large populations. While our results concerning the evolution of drift robustness do suggest that small populations evolve to lower fitness peaks, and small populations do evolve more redundancy in terms of exactly-neutral mutations, these small populations do not evolve towards fitness peaks with a decreased deleterious mutational effect (Fig. 3.4d). In fact, they evolve towards fitness peaks with a minimal likelihood of small-effect deleterious mutations. This discrepancy likely exists due to the fitness landscape used to study the evolution of redundancy in small populations: the mutations in that fitness landscape were all small-effect deleterious mu- tations [97]. Thus, small populations could not maintain fitness except on the flattest of fitness landscapes [97]. In a version of this model with multiple fitness peaks (e.g., one with small-effect deleterious mutations and one with large-effect deleterious mutations), we expect that small pop- 55 ulations would evolve to the peak with large-effect deleterious mutations. Such an outcome was recently predicted for populations evolving at very high mutation rates [104], although a different model predicts that small populations should evolve to areas that minimize the deleterious effect of mutations [69], in accordance with Krakauer and Plotkin’s model [97]. The second characteristic of mutational robustness in small populations is that these populations should evolve “global” robustness mechanisms, such as error-correction mechanisms, that affect many loci [68, 169, 224]. There are no global error-correction mechanisms available to the avidian genomes here (although one could allow the evolution of mutation rates, e.g., [31]). However, we did find that small populations preferentially fixed epistatic mutations that strongly altered the likelihood of deleterious mutations (Fig. 3.6c, d). These mutations are global in the sense that they alter the fitness effects of mutations at multiple loci. However, unlike previous work that suggested small populations should fix global solutions that reduce the effect of deleterious mutations [68, 169, 224], we found that these mutations increased the likelihood of lethal mutations. We do not have strong evidence that this increased lethality is essential and expect that small populations could also fix mutations that increased the likelihood of neutral mutations while reducing the likelihood of deleterious mutations if they exist in the Avida fitness landscape. Generally, our results emphasize that the evolutionary process behind drift robustness is the trend to reduce the likelihood of small- effect deleterious mutations, which can be achieved in multiple ways. As the evolution of drift robustness relies on a number of conditions, we may ask which em- pirical fitness landscapes, or which organisms, meet these criteria? Candidates for organisms with drift-robust genomes include those that undergo severe bottlenecks during their lifecycle, includ- ing bacterial endosymbionts [148] and RNA viruses [234]. There is evidence that both bacterial endosymbionts [52, 86, 100, 176] and RNA viruses [47, 76] have evolved alternate genome archi- tectures in response to their population-genetic environment. In endosymbionts, drift robustness could be achieved by choosing rare codons in such a way that substitutions are highly deleteri- ous, and indeed proteins in Buchnera have been found to be exceptionally resistant to drift [192]. However, there has been to date no systematic study of how different organisms respond to strong 56 genetic drift. Future work with biological organisms should establish the circumstances that cause organisms to vary in their robustness to genetic drift. Furthermore, experimental evolution may be able to produce organisms with drift-robust genomes whose architecture can be studied directly. 3.5 Methods 3.5.1 Mathematical model of drift robustness We describe a model to study the minimal conditions required for the evolution of drift robustness. Based on our hypothesis, we need to study evolution on a fitness landscape with at least two fitness peaks: one drift-robust peak with few small-effect deleterious mutations and one drift-fragile peak with many small-effect deleterious mutations. We assume that deleterious mutations are more frequent than beneficial mutations and that beneficial mutations of large-effect are less frequent than beneficial mutations of small-effect. Finally, there must also be a mutational path between the drift-fragile peak and the drift-robust peak. Drift robustness on such a landscape would manifest itself when a population that predominantly occupies a high (drift-fragile) fitness peak when under selection at large population sizes, switches instead to the lower (drift-robust) fitness peak when the population is small. Below we will calculate the critical population size at which this switch occurs. We design a fitness landscape with four genotypes, represented by four nodes (Fig. 3.2a). Genotype 1 (the wild-type) has fitness w1 = 1 and is the ancestral genotype for our populations. Genotypes 2 and 3, with fitness w2 = 1 + s 2 and w3 = 1 + s, respectively represent the genotypes on the drift-fragile fitness peak (s is the size of the fitness benefit). Genotype 4, with fitness w4 = 1 + s − , illustrates the drift-robust fitness peak at lower fitness (lower by  > 0). In the extended version of this model that we present later, we discuss the case where an arbitrary number of mutations lie “on the path” towards the drift-fragile peak (thus increasing the peak’s fragility). The likelihood that a mutation on the genetic background of genotype i leads to genotype j is denoted by ui j and the probability of fixation of that mutation is denoted by πi j, with 0 < ui j, πi j ≤ 1. 57 the probability the population will not change is 1 − Therefore, the probability the population will evolve from genotype i to genotype j is ui j πi j and ui j πi j. Mutations cannot occur from the drift-fragile peak to the drift-robust peak and vice-versa, but an indirect path between them exists. To allow for this dynamic, back-mutations can occur. j We assume that evolution occurs in a mutation-limited environment (weak mutation, strong selection limit), where the population is almost always monoclonal. When a mutation arises, it will either go extinct or takeover the population. This assumption allows us to treat evolution as a Markov chain [50]. We then solve for the stationary distribution of mutants in the population, to calculate the likelihood a population with defined characteristics will evolve to either one fitness peak or the other. To solve the Markov chain, we first write down the transition matrix T as :  T = 1 − u12π12 − u14π14 u21π21 0 u41π41  . u12π12 1 − u21π21 − u23π23 u32π32 0 u14π14 u23π23 1 − u32π32 0 0 The stationary distribution (cid:174)x∗ = (x∗ 1, x∗ 2, x∗ with eigenvalue 1, i.e. (cid:174)xT = (cid:174)x. We are interested in the relative fraction R = x∗ fraction of occupation between the drift-robust and drift-fragile peaks and turns out to be 1 − u41π41 0 0 3, x∗ 4) is the left eigenvector of the transition matrix 4, which is the 3/x∗ R = u41π41u12π12u23π23 u14π14u21π21u32π32 πi j πji . (3.3) We first calculate the fractions Pi j = . Using Kimura’s probability of fixation [88] (a small s approximation of the exact formula of Sella and Hirsh [182, 219]) for an asexual Wright-Fisher process (N is the population size) we find so that P14 = e2(s−)(N−1) P12 = P23 = es(N−1) , , R = u41 u14 u12 u21 u23 u32 P12P23/P14 ≡ Me2(N−1) , 58 (3.4) (3.5) (3.6) where we introduced M = u41 u14 u12 u21 u23 u32 , which we now estimate. For simplicity, we assume that a deleterious mutation rate (for example, the “back mutation rate" u41) is given by µ, the overall mutation rate (thus assuming that most mutations are deleterious). We also assume that the mutation rate up the drift-fragile peak (u12 = u23 = ufragile) is greater than the mutation rate up the drift-robust peak (u14 = urobust); this is equivalent to assuming that small-effect mutations are more frequent than large-effect mutations. Then, one gets M = u41 u14 u23 u32 ufragileufragile u12 u21 µ Therefore, if urobust > = µ ufragile ufragile urobust µ µ = ufragile urobust ufragile µ , we find that M < 1 , (3.7) (3.8) thus allowing for a transition between the two peaks determined by the population size N. If we assume beneficial mutations follow a certain distribution, we can derive a precise critical population size at which this transition between fitness peaks occurs. Assume the beneficial mutation rate is pb(s) = ub µρ(s), where ρ(s) is the distribution function of mutations with benefit s, and ub is the likelihood that a mutation is beneficial. If mutations with larger benefit s are exponentially more unlikely (see, e.g., [55, 60, 64]), we can use the distribution function ρ(s) = ¯s e−s/¯s (here ¯s is the average beneficial effect) to show that 1 b(s/2) p2 pb(s) = ub ¯s . = pb(s − ) ≈ pb(s), while u12 = u23 u21 u32 ≡ κ < 1 . M = p2 b(s/2)/pb(s) = ub ¯s Then, for  small we find u14 u41 = pb(s/2), so that (3.9) (3.10) This result is expected to be general, as it simply states that the flux of beneficial mutations towards the peak with a shallower slope (smaller s, here 1 + s/2) is larger than the flux into the branch with steeper slope (larger s, here 1 + s − ). The critical point at which both the drift-fragile and the drift-robust peak are equally populated is determined from setting R = 1 in Eq. (3.6), which gives log κ−1 Ncrit = 1 + 2 . (3.11) 59 We show the critical population size as a function of the fitness deficit of the drift robust peak  in Figure 3.3. We see that, depending on the fitness deficit, the evolutionary dynamics prefer the drift-robust peak at small population sizes even though its peak height is inferior to the drift-fragile peak. These results do not depend on the explicit function we used to describe the distribution of beneficial mutations as long as that function is decreasing, nor does it depend on the specifics of the construction of the fitness landscape. Next, we created an extended version of our fitness-landscape model. Drift-fragility could be exacerbated by subdividing the height w3 = 1 + s into n increments (Fig. 3.2b, in the previous model n = 2). In this case, b(s/n) pn pb(s) = (ub ¯s )n−1 , (3.12) (3.13) and the critical population size becomes Ncrit = 1 + (n − 1)log κ−1 2 . The simple result Eq. (3.13) relies on an exact cancellation of the fixation probabilities of the intermediate n − 1 steps, and occurs for both the Kimura approximation as well as the exact Sella-Hirsh formula. 3.5.2 Avida Experimental evolution was carried out using the digital evolution system Avida version 2.14. Avida has previously been used to study many concepts that are difficult to test with biological systems [5, 62, 63, 101, 108, 232]. In Avida, a population of self-replicating computer programs undergoes Darwinian evolution. Each of the programs (“avidians”) consists of a genome of sequential computer instructions, drawn from an alphabet of twenty-six possible instructions. Together, these instructions encode the ability for an avidian to create a new daughter avidian, copy its genome into the new avidian, and divide off the offspring. During this process, mutations can be introduced into the offspring’s genome at a controlled rate, introducing genetic variation into the population. When a new offspring is placed into the population (and the population is at carrying 60 capacity), a random individual is replaced by the new avidian, a process that introduces genetic drift into Avida populations. Avidians differ in their replication speed due to different genomic sequences, so avidians that can replicate faster will out-compete slower-replicating types. Therefore, because variation is heritable, and because this variation leads to differential reproduction, an Avida population undergoes Darwinian evolution by natural selection. The Avida world consists of a toroidal grid of N cells, where N is the maximum population size. Each cell can be occupied by at most one avidian, although a cell may be empty. Upon reproduction, the offspring avidian is placed into an empty cell (if the population is below capacity) or into a random cell, where it replaces the already-present avidian. Although the default Avida setting places offspring into one of nine neighboring cells (including the parent) so as to emulate growth on a surface, in the present experiments any cell may be selected for replacement to simulate a well-mixed environment. Reproduction is asexual in all of the experiments performed here. Time in Avida is set according to “updates” (the time it takes for an avidian population to execute a give number of instructions). During each update, 30N instructions are executed across the population, where N is again the population size. In order to be able to execute its code, an avidian must have a resource, measured as “Single Instruction Processing” units (SIPs). At the beginning of each update, SIPs are distributed to programs in the population in proportion to a quantity called “merit”, which is related to a genotype’s ability to exploit the environment (see [154] for details). In the experiments performed here, merit was held constant across all individuals, so on average 30 SIPs were distributed to each individual every update. It should be noted that in most Avida experiments, populations can evolve the ability to perform certain Boolean logic calculations that can improve their merit and hence their fitness [109]. In the experiments performed here, the evolution of these logic calculations was set to be neutral and not under positive selection. Instead, the route for an avidian to improve its fitness was solely by reducing the number of instruction executions needed to copy its genome. A population will typically evolve a faster replication speed by increasing the number of instructions that copy instructions from the parent genome to the offspring genome. When this copy number increase 61 occurs, more instructions are copied per update, resulting in faster replication and greater fitness. This fitness landscape was used because the fitness landscape where logic calculations are under selection lack small-effect deleterious mutations, which would preclude the observation of drift robustness. This lack of small-effect deleterious mutations occurs due to antagonistic pleiotropy and trade-offs between the logic functions and genome replication. Because there are a fixed number of loci in the genome, the more loci dedicated to the logic functions, the fewer loci dedicated to genome replication. Therefore, because most loci are dedicated to logic functions, and mutations to these loci are strongly-deleterious, there are few small-effect deleterious mutations in the logic-function fitness landscape. Although Avida uses the update as its unit of time, experiments such as those performed here are often run for a given number of generations (the time it takes for the entire population to be replaced). The experiment ends when the average generation across all of the individuals in the population reaches a pre-specified number. Each individual’s generation counter is equal to its parent’s generation plus one. Therefore, while Avida experiments occur for a set number of generations, the population does not evolve with discrete generations. If fitness differs between individuals and lineages in the population, there can be variation in the individuals’ generations in the population. 3.5.3 Experimental Design We performed three sets of experiments here. First, initial adaptation experiments were performed to generate genotypes adapted to small and large population-size environments. We evolved 100 small populations (102 individuals) and 100 large populations (104 individuals) for 105 generations. The genomic mutation rate was set to 10−1 mutations/generation/genome and these mutations occurred upon division; offspring could differ by at most one mutation from their parent. The ancestor organism for the initial adaptation treatments was the default Avida ancestor, but with an altered genome length of 50 instructions. This alteration was performed by removing 50 nop-C instructions from the default genome (these instructions are inert). 62 The second experimental step was to perform a test to measure the drift robustness of individuals evolved at a small population size vs. individuals evolved at a large population size. From each small and large population, we used the most abundant individual to form a set of 100 small-population genotypes and 100 large-population genotypes per treatment. For each of these genotypes, we evolved 10 populations (2000 replicates in total) at a population size of 50 individuals for 103 generations. All other treatment parameters were the same as the initial adaptation experiments. The final set of experiments tested whether deleterious mutations were responsible for the evolution of drift robustness in small populations. We repeated the initial adaptation experiment and the drift robustness test under the same parameter settings as for the original treatment. However, during the initial adaptation experiment, we reverted any deleterious mutations that appeared in the population [33]. In this setup, the Avida world examines the fitness cost of every new point mutation. If this new mutant has decreased fitness relative to its parent, the mutant is prevented from entering into the population. 3.5.4 Data Analysis We calculated statistics for the evolved avidians using Avida’s Analyze Mode [154]. In Analyze Mode, the experimenter can run an avidian through its life-cycle (until reproduction) and calculate several genotype characteristics. Fitness was calculated as the ratio between the number of instruc- tions in the genome (the sequence length) to the number of instruction executions needed to copy the genome and reproduce (this is an unbiased predictor of the actual number of offspring). In order to calculate the distribution of fitness effects for each genotype and other related mutational measures, each point mutation was generated for each genotype (25 × L mutations, where L is the number of instructions in the genome). The fitness effect of each mutation was − 1, where wm is the fitness of the mutant and w0 was the fitness of the calculated as s = wm w0 genotype. The average mutational effect of each genotype is the arithmetic mean of these fitness effects. The fraction of mutations of a given fitness effect was calculated as the number of mutations with that fitness effect divided by 25L. 63 To estimate the distribution of fitness effects of fixed mutations for each genotype, we analyzed the line-of-descent (LOD) of these genotypes. The LOD of a genotype contains every genotype that led from the ancestral genotype to the genotype of interest; it represents a fossil record of that lineage [109]. We calculated the fitness effect of each mutation along the LOD as above. For calculating the change in the frequency of lethal and deleterious mutations along the LOD as in (Fig. 3.7c), we performed the calculations detailed above for each LOD genotype. In order to examine possible differences in the distribution of beneficial fixed effects between small populations and large populations, we had to identify the beneficial mutations that contributed to adaptation. This is non-trivial, as small populations fix more beneficial mutations than large populations due to their oscillations in fitness. In order to not include these transient fixed beneficial mutations, we selected the beneficial mutations from each population whose fitness gain was at least partially maintained during the future evolution of the population. We labeled a beneficial mutation on a population’s LOD as maintained if 1) it resulted in the lineage attaining a new fitness maximum, and 2) fitness never decreased below the previous fitness value on the LOD except for a transient amount of time. We defined a transient amount of time as less than five consecutive genotypes on the LOD having a lower fitness. This transient fitness decrease allowance is necessary due to the possibility of valley-crossing in Avida fitness landscapes [33]. To compare the fraction of small-effect deleterious mutations between genotypes from small populations and genotypes from large populations (Fig. 3.6), we first selected one genotype from each lineage for a given fitness value. If a lineage had multiple genotypes with the same fitness, as was often the case, we took the last genotype that appeared. Then, for each fitness value with more than 20 genotypes from both small and large populations, we calculated the fitness effect of every possible point mutation and the fraction of these mutations that were deleterious with a small effect size as described above. Statistical analyses were performed using the NumPy [197], SciPy [82], and Pandas [139] Python modules. Figures were created with the Matplotlib [78] Python module. The stationary distribution for the mathematical model was solved using Mathematica version 11.0.1.0 [222]. 64 3.5.5 Data Availability The Avida software is available for free use ( https://github.com/devosoft/avida). Avida configura- tion scripts, data from Avida experiments, statistical analysis and figure-generating scripts, as well as the Mathematica code, are available at the Dryad data repository (DOI:10.5061/dryad.nr780). 3.6 Acknowledgements T.L. acknowledges a Michigan State University Distinguished Fellowship, a BEACON fellow- ship, and the Russell B. Duvall award for support. This work was supported in part by Michigan State University through computational resources provided by the Institute for Cyber-Enabled Re- search. This material is based in part upon work supported by the National Science Foundation under Cooperative Agreement No. DBI-0939454. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. 65 CHAPTER 4 GENOME SIZE AND THE EXTINCTION OF SMALL POPULATIONS Thomas LaBar, Christoph Adami 4.1 Abstract Although extinction is ubiquitous throughout the history of life, the factors that drive extinction events are often difficult to decipher. Most studies of extinction focus on inferring causal factors from past extinction events, but these studies are constrained by our inability to observe extinction events as they occur. Here, we use digital evolution to avoid these constraints and study “extinction in action”. We focus on the role of genome size in driving population extinction and examine the genetic mechanisms behind the relationship between genome size and extinction. We find that the relationship between genome size and extinction is driven by two genetic mechanisms that in- crease a population’s lethal mutational burden: large genome size leads to both an increased lethal mutation rate and an increased likelihood of stochastic reproduction errors and non-viability. We further show that this increased lethal mutational burden is directly due to genome expansions, as opposed to subsequent adaptation after genome expansion. This result, contrary to the expectation that genome expansions should increase the neutrality of mutations, not their lethality, suggests a role for epistasis in driving extinction. These findings suggest that large genome size can enhance the extinction likelihood of small populations and may inform which natural populations are at an increased risk of extinction. 4.2 Introduction The ubiquity of extinction events throughout the history of life [80], and the increasing real- ization that Earth’s biosphere may be experiencing a sixth mass extinction [11] drives interest in determining the factors that cause certain species, but not others, to go extinct [134]. It is accepted 66 that a combination of genetic [160, 186], demographic [132, 141], environmental [112, 196], and ecological [30, 42, 163] factors contribute to species extinctions. Beyond those deterministic fac- tors, chance events also likely influence some extinction events [170, 195]. Here, we focus on the genetic factors influencing extinction, specifically the role of small population size and genetic drift [123]. In small populations, weakened purifying selection leads to increased fixation of small-effect deleterious mutations [213]. As multiple deleterious mutations fix, the absolute fitness of the population may decrease, resulting in a decrease in population size. This decreased population size further weakens selection, leading to the fixation of additional deleterious mutations and a further decrease in population size. This process continues until population extinction. This positive feedback loop between decreased population size and deleterious mutation fixation is known as a mutational meltdown [125]. Mathematical models of mutational meltdowns suggest that even intermediate-sized asexual populations can quickly go extinct [58, 126]. Likewise, small sexual populations are also vulnerable to fast meltdowns [105]. The concept of a mutational meltdown provides a population-genetic mechanism for extinction. However, it is still uncertain what factors beyond population size influence the likelihood of a meltdown. If deleterious mutation accumulation drives mutational meltdowns, then species with a greater genomic mutation rate should be at a greater risk of extinction [184, 233]. Another proposed genetic mechanism that could lead to population extinction are genome expansions (i.e., mutations that increase genome size). Indeed, there is some evidence that genome size positively correlates with extinction risk in certain clades of multicellular organisms [200, 201]. While the relationship between high mutation rates and extinction suggests that larger genome size heightens extinction risk solely by increasing mutation rates, the connection between genome size and extinction can be complicated. If genome expansions lead to increased neutrality, the overall genomic mutation rate may increase, but the deleterious mutation rate will remain constant. Species with larger genomes should only face an increased mutational burden if genome expansions lead to increased genome content under purifying selection. For example, potential detrimental 67 molecular interactions between an original genomic region and its duplicate may result in an increased mutational burden [40]. As genome expansions are likely to lead to many alterations in the distribution of mutational effects, it is still unclear which genetic mechanisms lead genome expansions to drive population extinction. It is difficult to test the role of genome size in extinction in both natural and laboratory model systems. Here, we use digital experimental evolution [3, 12, 75, 84, 164] to test whether genome expansions can drive population extinction. In a previous study with the digital evolution system Avida [154] on the role of population size in the evolution of complexity, we found that the smallest populations evolved the largest genomes and the most novel traits, but also had the greatest extinction rates [101]. Now, we use Avida to test explicitly the mechanisms behind the role of genome size in the extinction of small populations. Avida differs from previous models of extinction in small populations in the mode of selection. Unlike mutational meltdown models [123], where selection is hard and the accumulation of dele- terious mutations directly leads to population extinction, selection is primarily soft in Avida and deleterious mutations alter relative fitness (i.e., competitive differences between genotypes), not absolute fitness (i.e., differences in the number of viable offspring between genotypes). Extinction occurs in Avida through the accumulation of “lethal” mutations, or more precisely, “non-viable” mutations that prevent their bearer from reproducing. These non-viable avidians occupy a portion of the limited space allocated to an avidian population, thus reducing the effective population size and potentially causing extinction over time. We find multiple genetic mechanisms lead genome expansions to drive the extinction of small populations. Increased genome size not only leads to an increase in the genomic mutation rate, but specifically to an increase in the lethal mutation rate. Elevated lethal mutation rates in large-genome genotypes are likely due to detrimental interactions between ancestral genome regions and dupli- cated genome content. Additionally, we show that genotypes with large genomes have an elevated probability of stochastic replication errors during reproduction (i.e., stochastic viability), further elevating the likelihood of offspring non-viability and extinction. These results suggest that large 68 genome size does elevate the risk of population extinction due to an increased lethal mutational burden from multiple genetic mechanisms. 4.3 Results 4.3.1 Large genome size increases the extinction risk of small populations To test if genome expansions and large genome size enhanced the probability of population extinction, we evolved populations across a range of population sizes at both high (1.5 muta- tions/genome/generation) and low (0.15 mutations/genome/generation) mutation rates with either a fixed genome size or a variable genome size. Under the low mutation rate regime, populations with variable genome sizes had greater rates of extinction than those with fixed small genomes (Fig. 4.1A). Under the high mutation rate regime, there was no significant difference between populations with a variable genome size and populations with a constant genome size (Fig. 4.1A). Estimations of the time to extinction further support these trends: in the low mutation regime, populations where genome size could evolve went extinct in fewer generations than those where genome size was constant. There were no differences in the high mutation rate regime (Fig. 4.1B). Next, we compared the final evolved genome size between genotypes from extinct populations and surviving populations. Across the range of population sizes for which at least 10 populations both survived and went extinct, “extinct” genotypes evolved larger genomes than those “surviving” genotypes in the low mutation rate regime (Fig. 4.2A). In the high mutation rate regime, one population size (N = 15 individuals) led to surviving populations evolving larger genomes, while there was no statistically-significant difference for the other population sizes (Fig. 4.2B). Together, these results suggest that genome expansions and large genome size can enhance the risk of small population extinction if the initial mutation rate is too low for extinction to otherwise occur. We next focus on examining the mechanism behind the relationship between genome size and extinction in the low mutation rate populations. 69 Figure 4.1: Possibility of genome expansions increase extinction in low mutation rate populations. A) Number of extinct populations as a function of population size. Solid (dashed) lines represent variable (fixed) genome size populations. Black (white) circles represent low (high) mutation rate populations. Error bars are bootstrapped 95% confidence intervals (104 samples). B) Time to extinction for population size and mutation rate combinations. Lines and colors same as in panel A. Error bars represent two times the standard error of the mean. Data only shown for those treatments that resulted in at least ten extinct populations. Figure 4.2: Extinct populations evolved larger genomes. A) Final genome size for the low mutation rate populations as a function of population size. Populations that survived are shown with gray boxplots; populations that went extinct are shown with white boxplots. Red bars are the median value, boxes are the first and third quartile, upper/lower whiskers extend up to the 1.5 times the interquartile range, and circles are outliers. ** indicates p < 10−4, * indicates p < 10−2, and N.S. indicates p > 0.05 for the Mann-Whitney U-test. Population sizes where fewer than ten populations went extinct (or survived) not shown. B) Final genome size for the high mutation rate populations as a function of population size. Description same as in panel A. Population sizes where fewer than ten populations went extinct (or survived) not shown. 70 510152025Population Size020406080100Extinct Populations5101520Population Size02468Generations to Extinction (x104)AB5678101520Population Size02004006008001000Genome Size10121516172025Population Size02004006008001000AB*******N.S.N.S. 4.3.2 Extinction and large genome size is associated with increases in the lethal mutational load Avidian populations only face population-size reductions through one mechanism: parent avidians produce non-viable (or infertile) offspring that replace viable avidians. In other words, the lethal mutational load should drive population extinction. It is therefore possible that the increased genomic mutation rate that co-occurs with genome expansions specifically increased the genomic lethal mutation rate. The elevated lethal mutation rate then leads to increased rates of population extinction. We first tested whether larger genomes had an increased lethal mutation rate. Genome size was correlated with the lethal mutation rate across genotypes from all population sizes, supporting the hypothesis that increases in genome size result in increased lethal mutational load and eventually population extinction (Fig. 4.3A; Spearman’s ρ ≈ 0.75, p = 1.77 × 10−148). Next, we examined whether populations that went extinct had previously evolved greater lethal mutation rates than surviving populations. As with the trend for genome size, extinct populations evolved greater lethal mutations rates than surviving populations (Fig. 4.3B). Figure 4.3: Lethal mutation rate correlates with genome size and population extinction. A) The lethal mutation rate as a function of genome size for the final genotypes from each evolved low mutation rate population. White circles are extinct populations; gray circles are surviving populations. B) The lethal mutation rate for extinct and surviving populations across population sizes. Boxplots as previously described. Colors and significance symbols same as Figure 4.2. Population sizes where fewer than ten populations went extinct (or survived) not shown. The previous data support the hypothesis that genome expansions drive population extinction by increases in the lethal mutation rate and thus the lethal mutational load. However, it is unclear 71 02004006008001000Genome Size0.51.01.5Lethal Mutation Rate5678101520Population Size0.51.01.5AB****** whether genome expansions themselves increase the likelihood of lethal mutations (suggesting epistasis between genome expansions and the ancestral genome) or whether genome expansions merely potentiate future increases in the lethal mutation rate (due to subsequent adaptation). To test these two scenarios, we examined the evolutionary histories (i.e., lines-of-descent) for all N = 20 low mutation-rate populations. We then examined the relationship between changes in genome size and changes in the lethal mutation rate (Fig. 4.4A). When genome size was constant, the lethal mutation rate did not change on average (mean change = 7×10−4 mutations/genome/generation, 95% Confidence Interval (CI) = ±2 × 10−3 mutations/genome/generation). Genome size increases on average increased the lethal mutation rate (mean change = 0.024 mutations/genome/generation, 95% CI = ±0.0023 mutations/genome/generation), while genome size decreases on average decreased the lethal mutation rate (mean change = -0.033 mutations/genome/generation, 95% CI = ±0.0040 mutations/genome/generation). Additionally, the change in genome size positively correlates with the change in the lethal mutation rate (Fig. 4.4B; Spearman’s ρ = 0.67, p ≈ 0.0), suggesting that genome expansions directly lead to increases in the genomic lethal mutation rate. Figure 4.4: Insertions and deletions directly change the lethal mutation rate. A) Change in the lethal mutation rate as a function of a mutation’s effect on genome size. Boxplots as previously described. Each data point represents a genotype from a reduced evolutionary lineage of a population with 20 individuals. B) Relationship between a mutation’s change in genome size and the change in the lethal mutation rate. Data same as in panel A. Dashed lines represent no change. Data points comparing genotypes with equal genome size were excluded. 72 GenomeSizeConstantGenomeSizeIncreasedGenomeSizeDecreased-0.50.00.5Change in Lethal Mutation Rate-750-500-2500250Change in Genome Size-0.50.00.5AB 4.3.3 Lethal mutation rates and stochastic viability drive population extinction Finally, to establish the role of the lethal mutation rate in driving population extinction, we performed additional evolution experiments to test whether the prevention of lethal mutations would prevent population extinction. We repeated our initial experiments (Fig. 4.1), except offspring with lethal mutations were reverted to their parental genome (lethal-reversion treatment; see Methods for details). We also did the same experiment where deleterious, but non-lethal, mutations were reverted in order to test if deleterious mutations contributed to extinction. When populations evolved without deleterious mutations, extinction rates were similar to, if not greater than, those for populations that evolved with deleterious mutations (Fig. 4.5A). Populations that evolved with fixed-size genomes and without lethal mutations never went extinct, demonstrating how the lack of lethal mutations can prevent extinction (Fig. 4.5B). However, when these populations evolved with variable genome sizes, extinction still occurred, although at a lower rate than when lethal mutations were present (Fig. 4.5B). While these data demonstrate that lethal mutations do primarily drive extinction risk, the fact that extinction can still occur presumably without lethal mutations is unexpected and indicates that there is a second factor that relates genome size to extinction. This is surprising, as lethal mutations are the only direct mechanism to cause extinction in Avida. One possible explanation for extinction in the lethal-reversion populations is that mutants arise in these populations that are initially viable, but later become non-viable. In other words, these populations evolve stochastic viability, where characteristics of the random numbers the avidians input during their life-cycle affect their ability to reproduce. These genotypes with stochastic viability would, on occasion, not be detected as lethal mutants, and thus enter the population even when lethal mutations are reverted. As they reproduce, these stochastically-viable genotypes will input other numbers and thus become, in effect, non-viable and subsequently lead to population extinction. To check if the populations that went extinct without lethal mutations did evolve stochastic viability, we tested the viability of all 100 genotypes from the lethal-reversion, variable genome size N = 5 populations. We also performed the same tests with the 100 genotypes from the N = 8 populations that evolved 73 Figure 4.5: Evolution of stochastic viability contributes to extinction risk. A) Number of population extinctions (out of 100 replicates) as a function of population size for the deleterious-reversion (squares) and no-reversion (circles) treatments. Dashed and solid lines represent populations from fixed genome size and variable genome size treatments, respectively. Error bars are 95% confidence intervals generated using bootstrap sampling (104 samples). No-reversion treatment data same as in Fig. 4.1A. B) Number of population extinctions (out of 100 replicates) as a function of population size for the lethal-reversion (triangles) and no-reversion (circles) treatments. Other symbols same as in panel A. C) Percent of viability trials (out of 1000) for which a given genotype was not viable. Values between 0 and 1 indicate stochastic viability. “Original” refers to the 100 genotypes from the N = 8 populations that evolved with lethal mutations. “Revert-lethal” refers the 100 genotypes from the N = 5 lethal-reversion populations. Boxplots as previously described. D) Genome sizes for always-viable and stochastically-viable genotypes from both the N = 8 no-reversion populations and the N = 5 lethal reversion populations. 74 5101520Population Size020406080100Extinct Populations5101520Population Size020406080100Extinct PopulationsRevert-LethalSurvivedRevert-LethalExtinctOriginalSurvivedOriginalExtinct0.00.20.40.60.81.0Non-viable Trials (%)AllViableStochasticViable0246810Genome Size (x102)ABCD with lethal mutations to see if these mutants arose in our original populations. For both sets of genotypes, we found that some genotypes were stochastically viable (Fig. 4.5C). In fact, of the 23 genotypes from populations that went extinct in the lethal-reversion treatment, 19 displayed stochastic viability. No genotypes from surviving populations were stochastically-viable. Of the 42 genotypes from populations that went extinct among our original treatment genotypes, 8 displayed stochastic viability. Two genotypes from surviving populations were stochastically-viable, suggesting these populations would have an enhanced likelihood of extinction if the experiment had continued. Finally, we compared the genome sizes between genotypes from the lethal-reversion genotypes that were always measured as viable and those that measured as stochastic-viable. Stochastic-viable genotypes evolved larger genomes than deterministic-viable genotypes (Mann- Whitney U-test, median = 128 instructions versus median = 191 instructions, p < 2 × 10−4; Fig. 4.5D), further suggesting that increased genome size can lead to the evolution of stochastic viability and eventual population extinction. We comment on the relevance of stochastic viability in biological populations in the Discussion below. 4.4 Discussion We explored potential genetic mechanisms behind the relationship between genome size and the extinction of small populations. Genome expansions drive extinction because they increase the lethal mutation rate of small populations. Elevated lethal mutation rates arise through two genetic mechanisms. First, genome expansions directly increase the lethal mutation rate, suggesting that epistatic interactions between ancestral genome content and novel duplicated content lead to more lethal mutations. Second, genotypes with larger genomes have a greater likelihood of evolving stochastic viability. Both mechanisms contribute to the lethal mutational burden of small populations and together heightened the risk of population extinction. The relationship between genome expansions and increases in the lethal mutation rate is at first counterintuitive. It is classically thought that gene/genome duplications should lead to an increase in the rate of neutral mutations, not lethal mutations, due to increases in mutational robustness [70]. 75 Increases in the lethal mutation rate (and not the neutral mutation rate) should only occur if there are genetic interactions (i.e., epistasis) between the ancestral genome section and the duplicated genome section. Is there evidence for gene/genome duplications leading to increased mutational load, as opposed to increased robustness? Recently, it was argued that gene duplication can also result in increased mutational fragility, not just mutational robustness, if a duplicate gene evolves to interact with its ancestor [40]. However, more empirical studies are needed to determine whether genome expansions can elevate the mutational burden of a population to such a level that population extinction becomes a possibility. Our second proposed mechanism underlying the connection between genome size and extinction is the evolution of stochastic-viable genotypes that could only reproduce under some environmental conditions (here, random number inputs). The connection behind stochastic viability and extinction in small populations is intuitive. Mutations causing stochastic viability likely have a weak effect (due to their stochastic nature) and can fix in small populations due to weakened selection. After fixation, the lethality of these mutation may be stochastically revealed, and extinction occurs. However, studies on the functional consequences of mutations responsible for extinction are rare (although see [57, 174]) and it is uncertain whether these mutations arise in populations at high extinction risk. One suggestion that mutations with stochastic effects might be relevant to population extinction comes from microbial experimental evolution. It has been shown that small populations have reduced extinction risk if they overexpress genes encoding molecular chaperones that assist with protein folding [176]. These overexpressed chaperones presumably compensate for other mutations that cause increased rates of stochastic protein misfolding. Therefore, mutations responsible for an increased likelihood of protein misfolding may be an example of a class of mutations with a stochastic effect that enhance extinction risk. However, this is only speculation and further work is needed to determine if stochastic viability is a possible mechanism behind extinction risk. The most prominent model of small population extinction is the mutational meltdown model [123, 125, 126], which argues that even intermediate-sized asexual and sexual populations (i.e., 76 103 individuals) can go extinct on the order of thousands of generations. In contrast to mutational meltdown models, only very small populations go extinct in Avida, and extinction occurs on a longer timescale. The difference between our results and previous results from mutational meltdown models are likely due to differences in the character of selection between the two models. Selection is hard in mutational meltdown models, and the accumulation of deleterious mutations directly increases the probability that offspring will be non-viable [123]. In Avida, selection on deleterious mutations is soft and the accumulation of deleterious mutations is unrelated to the likelihood of non-viable offspring. The lethal mutation rate will only increase indirectly in Avida with genome expansions. Without the positive feedback loop between deleterious mutation accumulation and population size, avidian populations only evolve a high rate of non-viable mutants if they evolve large genomes, thus explaining the trends we saw here. These differences between extinction in hard selection models and the Avida selection model emphasizes the need to consider whether selection in biological populations is primarily hard or soft. Unfortunately, there has been little resolution on this question [172, 206]. There is some evidence that soft selection may be more prevalent than hard selection. For instance, soft selection has been invoked as an explanation for why humans are able to experience high rates of deleterious mutations per generation [27, 110]. Moreover, the persistence of small, isolated populations [16, 173, 225] suggests that not only is selection primarily soft in nature, but that the extinction dynamics we study here are relevant to a subset of biological populations. While large genome size may not be the factor that causes populations to decline, it could drive an already-reduced population to extinction. In a previous study, we observed that small populations evolved the largest genomes, the greatest phenotypic complexity, and the greatest rates of extinction [101]. This result raised the question of whether greater biological complexity itself could increase a population’s rate of extinction. Al- though we did not test whether increased phenotypic complexity had a role in extinction, we have shown that genome complexity, measured in terms of genome size, did drive small-population ex- tinction. While it is possible that phenotypic complexity also enhanced the likelihood of extinction, the Avida phenotypic traits likely do not increase the lethal mutation rate. Thus, both high extinction 77 rates and increased phenotypic complexity arise due to the same mechanism: greater genome size. This result illustrates an evolutionary constraint for small populations. While weakened selection and stronger genetic drift can lead to increases in biological complexity, small populations must also evolve genetic architectures that reduce the risk of extinction [176, 199]. Otherwise, small popu- lations cannot maintain greater complexity and their lethal mutational load drives them to extinction. 4.5 Methods 4.5.1 Avida Here we review those aspects of Avida (version 2.14; available at https://github.com/devosoft/avida) relevant to the current study (see [154] for a complete overview). In Avida, simple computer pro- grams (“avidians") compete for the resources required to undergo self-replication and reproduction. Each avidian consists of a genome of computer instructions drawn from a set of twenty-six avail- able instructions in the Avida genetic code. A viable asexual avidian genome must contain the instructions to allocate a new (offspring) avidian genome, copy the instructions from the parent genome to the offspring genome, and divide off the offspring genome into a new avidian. During this copying process, mutations may occur, introducing variation into the population. These novel mutations can then be passed onto future generations, resulting in heritable variation. This genetic variation causes phenotypic variation: avidians with different genomes may self-replicate at differ- ent speeds. As faster self-replicators will outcompete slower self-replicators, there is differential fitness between avidians. Therefore, given there is heritable variation and differential fitness, an Avida population undergoes Darwinian evolution [1, 164]. Avida has previously been used to test hypotheses concerning the evolution of genome size [71, 101], the role of population size in evolution [48, 101, 102, 144], and the consequences of population extinction [187, 227, 228, 229]. The Avida world consists of a grid of N cells; each cell can be occupied by at most one avidian. Thus, N is the maximum population size for the Avida environment. While avidian populations are usually at carrying capacity, the presence of lethal mutations can reduce their effective population 78 size below this maximum size. Here, offspring can be placed into any cell in the environment, simulating a well-mixed environment (i.e., no spatial structure). If a cell is occupied by another avidian, the new offspring will overwrite the occupant. The random placement of offspring avidians adds genetic drift to Avida populations, as avidians are overwritten without regard to fitness. Fitness for an avidian genotype is estimated as the ratio of the number of instructions a genotype executes per unit time to the number of instructions it needs to execute to reproduce. Therefore, there are two avenues for a population of avidians to increase fitness: increase their number of instructions executed per unit time or decrease the number of instruction executions needed for self-replication. Avidian populations can increase the number of instructions executed by evolving the ability to input random numbers and perform Boolean calculations on these numbers (a “computational metabolism”) [109]. They can also decrease the number of instruction executions necessary for reproduction by optimizing their replication machinery. There are a variety of different implementations of mutations in Avida. Here, we used settings that differed from the default in order to improve our ability to analyze the causes of population extinction (see Table 4.1 for a listing of changes to the default settings). Point mutations change one locus from one of the twenty-six Avida instructions to another random, uniformly chosen, instruction, occur upon division between parent and offspring, after replication. There is an equal probability that each instruction in the genome will receive a point mutation; thus, genome size determines the total genomic mutation rate. To model indels, we used so-called “slip” mutations. This mutational type will randomly select two loci in the genome and then, with equal probability, either duplicate or delete the instructions in the genome between those two loci. While the rate of indel mutations remains constant, the chance of large indel mutations increases as genome size grows. Finally, to ease our analysis, we required every offspring genotype to be identical to its parent’s genotype before the above mutations were applied at division. One aspect of Avida mutations that differs from traditional models of population extinction is the presence of lethal, or non-viable, mutations in addition to deleterious, and non-lethal, mutations. Instead, they prevent their bearer from successfully These mutations do not kill their bearer. 79 Table 4.1: Notable Avida parameters changed from default value. Parameter WORLD_ X WORLD_ Y BIRTH_ METHOD COPY_ MUT_ PROB DIV_ MUT_ PROB DIVIDE_ INS_ PROB DIVIDE_ DEL_ PROB DIVIDE_ SLIP_ PROB REQUIRE_ EXACT_ COPY 0 REVERT_ FATAL 0.0 REVERT_ DETRIMENTAL 0.0 N 1 4 0.0 µ 0.0 0.0 0.01 1 1.0 1.0 Default Value Changed Value Treatment 60 60 0 0.0075 0.0 0.05 0.05 0.0 All All All All All All All Variable Genome Size All Lethal-reversion Deleterious-reversion reproducing within the maximum allowed lifespan (i.e., they are non-viable). Here, we used the default maximum lifespan of 20 × L instruction executions, where L is the genome size. In other words, this setting limits the number of times an avidian can cycle through their genome in an attempt to reproduce. Such a setting must exist in order to allow avidian genomes to be analyzed. Otherwise, non-reproducing avidians could be analyzed forever, as the only way to decide if an avidian can reproduce is to actually execute the code in its genome. In Avida, it is possible to perform experiments where the appearance of mutations with certain effects is prevented [33]. For example, it is possible to revert a mutation of a particular predetermined effect after it has appeared. To enable this, the Avida program analyzes the fitness of every novel genotype that enters the population and, if the fitness is of the pre-set effect, the mutation is reverted. This system allows experimenters the ability to determine the relevance of certain mutational effects to evolution. However, mutations of certain effects can still enter the population if their fitness effects are stochastic. An avidian has stochastic fitness if its replication speed depends on characteristics of the random numbers it inputs in order to do its Boolean calculations. Some stored numbers may alter the order in which certain instructions are executed or copied into an offspring’s genome, thus altering fitness. 80 4.5.2 Experimental Design To study the role of genome size in the extinction of small populations, we first evolved populations across a range of per-site mutation rates (µ = 0.01 and µ = 0.1) and population sizes (N = {5, 6, 7, 8, 10, 15, 20} for µ = 0.01 and N = {10, 12, 15, 16, 17, 20, 25} for µ = 0.1). For each combination of population size and mutation rate we evolved 100 populations for at most 105 generations. Each population was initialized at carrying capacity with N copies of the default Avida ancestor (which has 100 instructions) with all excess instructions removed; this resulted in an ancestor with a genome of 15 instructions (only those needed for replication). Ancestral genotypes with per-site mutation rates of µ = 0.01 and µ = 0.1 thus have genomic mutation rates of U = 0.15 and U = 1.5 mutations/genome/generation, respectively. Genome size mutations (indels) occur at a fixed rate of 0.01 mutations/genome/generation for all treatments. Additionally, for each mutation rate and population size combination, an additional 100 populations were evolved in an environment where genome size was fixed. To directly test for the role of lethal and deleterious mutations in driving extinction, we evolved 100 populations at the low mutation rate population sizes under conditions where either lethal mutations or deleterious, but non-lethal, mutations were reverted (the “lethal-reversion” and “deleterious-reversion treatments”, respectively). 4.5.3 Data Analysis For all evolution experiments, we saved data on the most abundant (dominant) genotype every ten generations. The final saved dominant genotype was used in all analyses here. All data represents either genotypes at most ten generations before extinction (in the case of extinct populations) or genotypes from the end of the experiment (in the case of surviving populations). In order to calculate the lethal mutation rate and other relevant statistics for a genotype, we generated every single point mutation for that genotype and measured these mutants’ fitness using Avida’s Analyze mode. The lethal mutation rate was estimated as Ulethal = µ × L × plethal, where µ is the per-site mutation rate, L is the genome size, and plethal is the probability that a random mutation will be lethal. 81 4.5.3.1 Analysis of the relationship between genome expansions and changes in the lethal mutation rate To test whether genome expansions themselves were directly responsible for the increase in the lethal mutation rate or whether the lethal mutation rate increased after evolution in response to a genome expansion, we first reconstructed the line-of-descents (LODs) for each of the one hundred genotypes evolved in a population of 20 individuals with a per-site mutation rate of 0.01 mutations/site/generation (the low mutation rate). An LOD contains every intermediate genotype from the ancestral genotype to an evolved genotype and allows us to trace how genome size evolved over the course of the experiment [109]. We reduced these LOD to only contain the ancestral genotype, the genotypes that changed genome size, the genotype immediately preceding a change in genome size, and the final genotype. We measured the genome size and the lethal mutation rate for each of these remaining genotypes. Then, we measured the relationship between the change in genome size and the change in the lethal mutation rate for genome expansions, genome reductions, and the segments of evolutionary time where genome size was constant. 4.5.3.2 Analysis of stochastic viability In order to test the possibility that some of our populations had evolved stochastic viability, we analyzed each genotype from the N = 5 lethal-reversion populations and each genotype from the N = 8, µ = 0.01, original populations. These population sizes were chosen because they had the greatest equality between the number of extinct populations and the number of surviving populations. We performed 1000 viability trials, where a genotype was declared non-viable if it could not reproduce. A genotype was declared stochastically-viable if the number of non-viable trials was greater than 0 and less than 1000. Otherwise, it was defined as deterministically-viable. All data analysis beyond that using Avida’s Analyze Mode was performed using the Python packages NumPy version 1.12.1 [197], SciPy version 0.19.0 [82], and Pandas version 0.20.1 [139]; figures were generated using the Python package Matplotlib version 2.0.2 [78]. All Avida scripts and data analysis scripts used here are available at https://github.com/thomaslabar/ 82 LaBarAdami_GenomeSizeExtinction. 4.6 Acknowledgments We thank M. Zettlemoyer and C. Weisman for comments on drafts of the manuscript. This work was supported in part by Michigan State University through computational resources provided by the Institute for Cyber-Enabled Research. 83 CHAPTER 5 CONCLUSION This conclusion will consist of three sections. First, I will make some broad conclusions con- cerning the work I presented here that did not fit with any specific chapter. I will also address some chapter-specific conclusions that did not make the published version. Then, I will address limitations of this research and speculate on how these limitations may affect the results. Finally, I will end by discussing potential future work. 5.1 Further Conclusions If the main conclusions of the research present here could be summarized in one sentence, it would be the following. Population size qualitatively, and not just quantitatively, alters the outcome of the evolutionary process. Historically, the focus on the role of population size has been on how it alters a population’s evolutionary dynamics. In other words, small population size may limit adaptation [221], it may prevent the fixation of small-effect beneficial mutations [66], or it may increase the fixation of deleterious mutations [151], among other potential consequences of weak selection and strong drift. All of these effects are quantitative in the sense that while they may mathematically affect a population’s adaptive state by reducing fitness, small populations are still qualitatively similar to their large population size relatives. This view assumes that small populations are just less-adapted versions of large populations. They evolve similar phenotypes in similar environments compared to large populations, but in a less-optimal state. The results presented here argue for a larger difference between small and large populations. Instead of there solely being quantitative (i.e., fitness) differences between small and large pop- ulations, there are qualitative differences: small and large populations evolve towards alternative genetic architectures. These findings support other studies that have also showed qualitative differ- ences in the evolutionary trajectories of small and large populations (e.g., [94, 97, 119, 169, 171]). 84 However, previous attempts either are purely theoretical simulations (e.g., [97, 131], but see [171]) or rely on comparative genomics evidence (e.g., [121]), where it is difficult to control for every possible factor [214]. By using Avida, one can both control for only the role of population size in altering evolutionary outcomes and perfectly reconstruct the evolutionary trajectories that led to al- ternative genomic architectures. And while Avida populations likely possess different possibilities for genetic architecture than biological populations, the possible levels of control in Avida allowed for the discovery of a novel population-genetic principle (drift robustness) that should factor into some biological systems. The extent to which this general principle of small populations evolving qualitatively different from large populations applies across life remains to be explored. 5.1.1 Absence of mutational hazards in altering small population evolution In Chapter 2, I tested the role of population size in driving the evolution of genome complexity (measured as genome size) and the evolution of phenotypic complexity (measured as number of phenotypic traits). This study was inspired by Lynch’s Mutational Hazard (MH) hypothesis, which argues that genetic drift and mutational pressure should drive the evolution of genome complexity (one measure of which is genome size [121]) and potentiate the evolution of functional complexity [94, 118, 121]. Due to the differences in avidian genomes (asexual haploids) and the genomes of multicellular eukaryotes (primarily obligate sexual reproducers with ploidy greater than one), one must be cautious when claiming this work as a test of the Mutational Hazard hypothesis. However, one can speculate on how the conclusions from this work inform on the validity of the MH hypothesis. I did find evidence that genetic drift in small populations leads to greater genome complexity and phenotypic complexity (Chapter 2) as the MH hypothesis would predict. However, the MH hypothesis does not just predict evolutionary outcomes, but also the mechanism behind those outcomes. The MH hypothesis predicts it is the cost of excess DNA that drives selection against unneeded genome content. This cost originates from the potential of any section of the genome I did not find any evidence that this to be mutated in a manner deleterious to the organism. 85 selective pressure existed in my experiments. It is certainly the case that the reduction in fitness from deleterious mutations alters genomic architecture under some Avida conditions [220], but this occurs at much greater mutation rates than those used here. Why did I not see evidence of selection against mutational hazards? One possibility is the range of population sizes used in Avida, a possible limitation I discuss below. What these results suggest is that there are other parameters that alter the conclusions of the MH hypothesis beyond population size and mutational pressure. It is likely dependent on the exact structure of the fitness landscape and thus the distribution of fitness effects of de-novo mutations. I would speculate that all of these small-population evolution hypotheses: the MH hypothesis [119], global/local mechanisms of mutational robustness [169], the drift-barrier hypothesis [188], and drift robustness [102] are all aspects of the same overarching theory for evolution in small populations. However, this speculation remains for future theoretical work, both in terms of mathematical modeling and in terms of digital evolution. 5.1.2 The differences between mutational robustness and drift robustness A main conclusion from the discovery of drift robustness in small populations is that evolving robustness to the deleterious consequences of genetic drift is different from evolving robustness to the deleterious consequences of mutations, or mutational robustness. Drift robustness is caused by genetic architectures that limit the fixation of small-effect deleterious mutations. Drift robustness evolves by a small population either not fixing small-effect beneficial mutations (thus not increas- ing the likelihood of small-effect deleterious mutations and furthermore not adapting) or fixing adaptive mutations that increase the likelihood of large-effect deleterious mutations. This scenario does assume that a population cannot simultaneous adapt and increase the supply of neutral muta- tions. Mutational robustness theory argues that small populations should reduce either the rate of deleterious mutations or the effect size of said mutations [48, 68, 97, 131]. It is worth asking how previous work missed the distinction between drift robustness and mutational robustness. In some of the studies that demonstrated the evolution of mutational robustness in small popula- 86 tions [97, 169], all deleterious mutations were slightly-deleterious. In this scenario, these mutations fixed due to drift and lead to the evolution of increased neutrality (i.e., mutational robustness). This also occurred in a study using Fisher’s Geometric Model, where there is likely a significant density of slightly-deleterious mutations in the model’s distribution of fitness effects [68]. One of the re- quirements for drift robustness is the existence of multi-peak fitness landscapes, something absent in many of these models [68, 97]. Rajon and Masel’s model did include multiple fitness peaks and they did find that small and large populations evolved alternative robustness “solutions” [169]. However, their “robust” fitness peaks only led to a reduction in slightly-deleterious “molecular errors”, as all errors in this model were slightly-deleterious. My results emphasize the need to consider evolution on complex fitness landscapes with non-trivial distribution of fitness effects and to consider that the distribution of fitness effects is itself evolvable. 5.1.3 Drift robustness and bacterial endosymbionts The classic model systems for the study of mutational robustness in small populations are obligate bacterial endosymbionts [52]. Some pieces of evidence supporting mutational robustness include the overexpression of genes encoding molecular chaperones [52, 53, 176] and increased rates of gene evolution [148, 149, 177]. In light of the evidence presented here suggesting that small populations should be expected to evolve drift robustness, not mutational robustness, it is worth asking whether bacterial endosymbionts have really evolved drift robustness, not mutational robustness. Proponents of mutational robustness in bacterial endosymbionts argue that overexpression of molecular chaperones, or heat-shock proteins, epistatically alters the fitness landscape and reduces the deleterious effects of mutations [53]. In other words, the overexpression of molecular chap- erones decreases the phenotypic variation of mutations. However, recent work in Saccharomyces cerevisiae demonstrates that molecular chaperone overexpression increases, not decreases, pheno- typic variation on average [59]. This result suggests that this overexpression phenotype increases the average deleterious effect of de-novo mutations, if one assumes that most phenotypic varia- tion is deleterious. This consequence of overexpression is more aligned with the concept of drift 87 robustness, not mutational robustness, but rigorous tests are still needed for confirmation. There is evidence that endosymbionts have faster rates of evolution than their free-living rel- atives [148, 149], suggesting that mutations are more tolerable, or less deleterious; this trend is another sign of mutational robustness. However, recent work has shown that, for some endosym- bionts, while some genes do show faster rates of evolution, other genes show reduced rates of evolution [177]. This result is consistent with increased purifying selection in some genes, and thus an increased deleterious mutational load (either by increasing the prevalence of deleterious mutations or increasing their deleterious effect). Furthermore, these genes are not host-related, demonstrating that this increased purifying selection is not due to interactions with the host [177]. This result is also more consistent with drift robustness than mutational robustness. Finally, evidence now exists that some endosymbionts have evolved increased protein multi- functionality, compared to their free-living relatives [86, 231]. Increased protein multifunctionality, or increased pleiotropy, may be evidence of drift-robust adaptation. If a mutation hits a gene that encodes a multifunctional protein, multiple traits may face a loss of function. It stands to reason that these mutations on pleiotropic genes will be more deleterious than mutations in non-pleiotropic genes that affect only one function. Small populations may evolve increased pleiotropy because beneficial mutations may be more maintainable if they occur in, or lead to, pleiotropic genes. As with the other two pieces of evidence, its connection to drift robustness is speculative, but worthy of future exploration. 5.1.4 Drift robustness and the role of epistasis in small-population adaptation A prominent hypothesis suggest that epistasis, particularly negative (synergistic) epistasis between slightly-deleterious mutations, could limit drift-related fitness declines [92] (although see [25, 69]). I did not test directly for the presence of epistasis between slightly-deleterious mutations. Instead, I argued that it is the absence of small-effect deleterious mutations that causes drift robustness [102]. However, epistasis does play a large role in allowing small populations to adapt past their previ- ous drift barrier while maintaining drift-robust genomic architecture by increasing the severity of 88 previously small-effect deleterious mutations. Recent studies on the role of epistasis during adap- tation has shown that negative epistasis between beneficial mutations is present across a variety of species [28, 87, 98, 180]. My results provide an alternative example for how epistasis may shape the adaptive trajectories of populations, particularly those evolving in strong-drift environments. 5.2 Limitations 5.2.1 Avida as an intermediate between mathematical modeling and biological experiments One of the benefits of Avida, and digital evolution more broadly, is that it acts as a model system in-between simpler, but conceptually-clearer, mathematical modeling, and the complexities of biological systems. Thus, Avida can be used to discover novel phenomena that may be lacking in simple mathematical models, but obscured or difficult to decipher in biological systems. However, Avida’s intermediate position in model complexity can also be a drawback. While Avida allows for the discovery of novel phenomena lacking in many mathematical models, there is a benefit to simplicity unreachable in Avida experiments. Mathematical models allow one to narrow down possible causes of some evolutionary pattern until one understands the minimal requirements to reproduce the pattern. For example, the evidence for the evolution of drift robustness was better- supported by demonstrating its existence in a mathematical model (Chapter 3). And while not every problem can be modeled in such fashion, mathematical models do in many cases illustrate evolutionary dynamics in a clearer fashion than many Avida experiments. On the other end of the model-complexity spectrum, Avida lacks certain relevant characteristics of biological systems and is inferior in many ways to biological model systems. One of the assumptions behind using Avida to model biological evolution is that evolutionary dynamics are equivalent no matter the evolving substrate (computer code versus DNA). While I believe this is true, this assumption of similarity between Avida and biology may weaken when one begins to consider questions concerning the evolution of genetic architecture and how genotype-phenotype maps evolve. Put another way, Avida is an excellent system to study how evolution shapes genetic 89 variation. However, there is no guarantee that the characteristics of genetic variation in Avida, such as the possible distribution of mutational effects, are similar to that in biology. And while Avida can inform biologists of how evolution proceeds assuming the production of the specific variation in Avida, one must always keep in mind that if variation is different in biological systems than in Avida, the outcomes of the evolutionary process will also be different. While these limitations do not eliminate the usefulness of Avida as a model system to study the evolutionary process, they must be remembered during interpretation of the results. 5.2.2 Populations in Avida are much smaller than relevant biological populations When discussing my results from Chapter 2 and their implications for the Mutational Hazard hypothesis [119], I stated that I did not observe selection against mutational hazards in these experiments. One possible explanation for this result is that, compared to biological populations, all of my Avida experiments really use “small” populations. In the Mutational Hazard hypothesis, it was estimated that the effective population sizes of multicellular eukaryotes is approximately 104 individuals, which is the absolute population size of my largest populations. It is possible that evidence for selection against mutational hazards would be seen in larger Avidian populations. However, it is difficult to run Avida (or most agent-based models) at larger population sizes due to current computational limitations. One argument against the idea that a population size of 104 individuals is too small to see selection against mutational hazards in Avida is that even smaller avidian populations can fix anti-mutator (or mutation-decreasing) mutations if the deleterious mutational load is large [31], suggesting the selective capacity should still have existed at the population sizes I used in my experiment. It is also worth discussing whether this limitation in terms of Avida population sizes affected the conclusions of my drift robustness research. There are two consequences of small population size that are relevant for this consideration: weak selection in small populations and limited production of genetic variation in small populations. As far as weakened selection is concerned, In the fitness landscape the experiments did accurately represent small and large populations. 90 used to study drift robustness, the selection coefficient (s) of deleterious mutations did not appear to go below 10−3. Thus, populations of 104 individuals were large enough (i.e., N > 1|s|) that selection should have limited the frequency of every deleterious mutation. However, in regards to limited genetic variation in small populations, the implications are less clear. It is possible that differences in mutation supply altered the evolutionary outcomes between small populations and large populations in addition to the inability to fix small-effect beneficial mutations. The respective roles of weakened selection and limited mutation supply remain to be tested with both mathematical models and computational evolution studies. 5.2.3 Results apply to asexual haploid species evolving in a constant environment The final limitation of the experiments presented here is the limited experimental conditions used throughout. All experiments used asexual haploid genotypes evolving in a constant environment, as do most experimental evolution studies [83, 84]. While not a large issue for many studies of evolutionary dynamics, this limitation is relevant here. Other than obligate-host bacterial endosym- bionts, many, if not most, small populations are multicellular eukaryotes, which are obligate sexual species and often diploid [121]. These are also the small populations that are expected to evolve greater genetic complexity due to weakened selection and genetic drift [121]. By only performing experiments with asexual haploid organisms, it is not possible to test if elements of recombina- tion or ploidy alter evolutionary dynamics and lead to the evolution of greater/lesser complexity. Likewise, given that diploidy and recombination can alter the dynamics of deleterious mutations (e.g., [138, 159]), it is uncertain how multiple copies of each locus may alter the evolution of drift robustness in small populations. Furthermore, all experiments here were performed in constant environments. Given that one expects populations to evolve greater complexity in environments of greater complexity (if only to evolve regulatory machinery to handle environmental changes), the lack of environmental variability may be responsible for some of the results within. However, variable environments appears to increase the fixation of stochastically-deleterious mutations [35], and thus fluctuating environments may increase the range of population sizes that would lead to 91 drift-robust genetic architecture. An exploration of these additional factors remains for future studies. 5.3 Future Work The most impactful result of this dissertation is the discovery of drift robustness and its evolution in small populations. Therefore, the most productive future work would be spent developing a better understanding of how, and under which conditions, drift robustness would evolve in natural populations. The obvious first choice for extending the idea of drift robustness is to test whether drift robustness will still evolve under some of the conditions highlighted above as limitations, such as diploidy, sexual reproduction, and realistic population sizes. Testing the predictions of drift robustness using both bioinformatic approaches and experimental systems is also needed. In this section, I will discuss a few possibilities in greater detail. 5.3.1 What is the genetic mechanism behind drift robustness? In Chapter 3, when I proposed the concept of drift robustness, the idea was tested with both a mathematical model and digital evolution experiments. Both methods showed that the evolution of drift robustness was possible in small populations. However, I did not test whether the exact mechanism that led to drift robustness in the mathematical model was the same mechanism that led to drift robustness in the Avida populations. I only showed that the results from the Avida experiments were consistent with the mathematical model. The exact genetic mechanism behind the evolution of drift robustness in Avida remains to be explored. While there is no guarantee that an understanding of the genetic mechanisms of drift robustness in Avida will provide valuable insights about biological systems, it will likely lead to some testable hypotheses and insight into possible mechanisms. As I wrote in the Limitations section of this chapter, the role of limited mutation supply was not explored in either the mathematical model (in which variation in mutation supply does not exist by design) or the Avida experiments (in which 92 variation in mutation supply exists but was not tested). A study of the genetic mechanisms of drift robustness may reveal a role for genetic variation in addition to weakened selection. Fur- thermore, such a study may provide insight regarding the consequences of drift robustness beyond the distribution of fitness effects. For example, in order to maintain drift-robust genetic architec- ture, small populations may have to evolve altered epistasis, pleiotropy, modularity, complexity, or other genetic characteristics. Future work should use Avida to further explore the mechanisms and consequences of drift-robust adaptation in small populations. Such studies will also drive the production of more-sophisticated mathematical models of drift robustness, improvement of which is needed given the simplicity of the original model. 5.3.2 Drift robustness and sexual reproduction The connection between population size, genetic drift, and sexual reproduction has been a topic of discussion for a long time. Muller discussed the concept of a ratchet in asexual populations to argue that sexual populations suffer a lesser mutational load [151]. Others have argued that sex evolved in order to facilitate the combination of beneficial mutations that would otherwise be lost due to clonal interference [60]. This mechanism would only exist in large populations, as it requires multiple segregating beneficial mutations [133]. Otto and Barton argued that decreased variation in small populations will drive increases in recombination [158]. This decreased variation provides a role for recombination to combine alleles together in beneficial combinations not present in small populations. In large populations, the rate of recombination does not increase, as individuals with the optimal combination of alleles are likely already present. Sex can also evolve in finite large populations as a method of disassociating deleterious mutations from beneficial mutations [85, 138]. Given previous connections between small population size and the evolution of sexual recom- bination, it is worth asking if and how the requirement for drift robustness in small populations influences the evolution of sex. If the mutation rate is low enough so that mutations either fix or drift way, as in my model of drift robustness (Chapter 3), than it is unlikely the evolution of sex would occur, as detailed above. However, if there are multiple segregating mutations in both 93 the small and large populations, as in the Avida simulations, then there could be an interaction between the requirement for drift robustness and the evolution of sex. For example, one could hypothesize that the evolution of sexual reproduction would alleviate the need to evolve drift-robust genomic architecture, for similar reasons as Muller argued that sex would prevent a ratchet-like loss of fitness. Drift-robust genotypes and drift-fragile genotypes may also differ in their capabilities to evolve sexual reproduction due to their differences in mutational effects. All of these research directions will likely be fruitful for future work. 5.3.3 Drift robustness and its interaction with other robustness mechanisms Potential model systems in which drift robustness may evolve are bacterial endoymbionts, as these species already evolved one robustness trait: overexpression of molecular chaperones [52, 53]. However, in other ways, many obligate endosymbionts appear less robust. For example, many endosymbionts have lost many genes encoding DNA repair enzymes, suggesting they have evolved higher mutation rates than their free-living relatives [149]. Such an outcome is predicted for small populations [129] (although see [169, 171] for alternative predictions). It is unknown why small populations would evolve robustness through one mechanisms, but fragility through another mechanism. One possible explanation is that it is mutationally “easier” to evolve overexpression of a molec- ular chaperone than it is to maintain a set of DNA repair enzymes. Overexpression of a gene likely can occur through a single mutation. However, the DNA repair machinery is a large mutational target, and thus it may prove more evolutionarily-difficult to maintain. In other words, the most probable evolutionary outcome is to evolve the simple genetic mechanism (overexpression of molec- ular chaperones), which then allows for the loss of the complex genetic mechanism (DNA repair machinery). Such an evolutionary scenario could easily be modeled, either mathematically or com- putationally. Likewise, it may be not too difficult to test this hypothesis through experiments. One could subject microbial populations to mutation accumulation, and then perform small-population evolution experiments with clones from these populations. Then, one could examine the evolved 94 fitness recovery strategies to test this hypothesis. 5.3.4 Small population size and the evolution of functional complexity Here, I only focused on the role of population size in shaping the characteristics of genetic archi- tecture. However, it is conceivable that the same evolutionary factors that result in the evolution of alternative genomic architectures in small populations could also lead to the evolution of alternative phenotypic traits in these populations. For example, it has been proposed that small populations evolve alternative phenotypic characteristics because only small populations with these character- istics do not go extinct [199]. It has also been proposed that there is a relationship between large body size and small population size, as large body size drives increased resource consumption and thus a lower population size [119]. One of the proposals of the Mutational Hazard hypothesis and similar theories is that the absence of strong selection potentiates the future evolution of complexity, not only in terms of genome complexity [121], but also in terms of functional complexity [94, 118]. In other words, one should expect small populations to not only evolve alternative genetic architectures, but also alternative phenotypic traits, as different phenotypic traits likely have different selective strengths in different environments. An understanding of how a fundamental parameter, population size, leads to qualitatively-different evolutionary outcomes will lead to a more predictive theory of evolutionary dynamics. And while this dissertation only tested the role of population size in the evolution of genetic architecture, the success of population-genetic principles to explain the evolution of complexity and robustness suggests that such a theory of phenotypic evolution is within reach. 95 BIBLIOGRAPHY 96 BIBLIOGRAPHY [1] [2] [3] [4] [5] [6] [7] [8] [9] Christoph Adami. Introduction to Artificial Life. Springer Verlag, New York, 1998. Christoph Adami. What is complexity? BioEssays, 24(12):1085–1094, 2002. Christoph Adami. Digital genetics: Unravelling the genetic basis of evolution. Nature Reviews Genetics, 7:109–118, 2006. Christoph Adami and Nicolas J Cerf. Physical complexity of symbolic sequences. Physica D: Nonlinear Phenomena, 137(1-2):62–69, 2000. Christoph Adami, Charles Ofria, and Travis C Collier. Evolution of biological complexity. Proceedings of the National Academy of Sciences, 97:4463–4468, 2000. Julian Adams and Frank Rosenzweig. Experimental microbial evolution: History and conceptual underpinnings. Genomics, 104(6):393–398, 2014. Aneil F Agrawal and Michael C Whitlock. Mutation load: The fitness of individuals in populations where deleterious alleles are abundant. Annual Review of Ecology, Evolution, and Systematics, 43:115–135, 2012. Bin Ai, Zhao-Shan Wang, and Song Ge. Genome size is not correlated with effective population size in the Oryza species. Evolution, 66(10):3302–3310, 2012. Dan I Andersson and Diarmaid Hughes. Muller’s ratchet decreases fitness of a DNA-based microbe. Proceedings of the National Academy of Sciences, 93(2):906–907, 1996. [10] Raquel Assis, Alexey S Kondrashov, Eugene V Koonin, and Fyodor A Kondrashov. Nested genes and increasing organizational complexity of metazoan genomes. Trends in Genetics, 24(10):475–478, 2008. [11] Anthony D Barnosky, Nicholas Matzke, Susumu Tomiya, Guinevere OU Wogan, Brian Swartz, Tiago B Quental, Charles Marshall, Jenny L McGuire, Emily L Lindsey, Kaitlin C Maguire, et al. Has the Earth’s sixth mass extinction already arrived? Nature, 471(7336): 51–57, 2011. [12] Bérénice Batut, David P Parsons, Stephan Fischer, Guillaume Beslon, and Carole Knibbe. In silico experimental evolution: A tool to test evolutionary scenarios. BMC Bioinformatics, 14:1, 2013. [13] Bérénice Batut, Carole Knibbe, Gabriel Marais, and Vincent Daubin. Reductive genome evo- lution at both ends of the bacterial population size spectrum. Nature Reviews Microbiology, 12:841–850, 2014. [14] Graham Bell. Experimental macroevolution. Proceedings of the Royal Society B: Biological Sciences, 283:20152547, 2016. 97 [15] Graham Bell and Andrew Gonzalez. Evolutionary rescue can prevent extinction following environmental change. Ecology Letters, 12(9):942–948, 2009. [16] Andrea Benazzo, Emiliano Trucchi, James A Cahill, Pierpaolo Maisano Delser, Stefano Mona, Matteo Fumagalli, Lynsey Bunnefeld, Luca Cornetti, Silvia Ghirotto, Matteo Girardi, et al. Survival and divergence in a small group: The extraordinary genomic history of the endangered Apennine brown bear stragglers. Proceedings of the National Academy of Sciences, 114(45):E9589–E9597, 2017. [17] Ulfar Bergthorsson, Dan I Andersson, and John R Roth. Ohno’s dilemma: Evolution of new genes under continuous selection. Proceedings of the National Academy of Sciences, 104: 17004–17009, 2007. [18] Diana Blank, Luise Wolf, Martin Ackermann, and Olin K Silander. The predictability of molecular evolution during functional innovation. Proceedings of the National Academy of Sciences, 111(8):3044–3049, 2014. [19] Jesse D Bloom, Zhongyi Lu, David Chen, Alpan Raval, Ophelia S Venturelli, and Frances H Arnold. Evolution favors protein mutational robustness in sufficiently large populations. BMC Biology, 5(1):29, 2007. [20] Zachary D Blount, Christina Z Borland, and Richard E Lenski. Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli. Proceedings of the National Academy of Sciences, 105(23):7899–7906, 2008. [21] Zachary D Blount, Jeffrey E Barrick, Carla J Davidson, and Richard E Lenski. Genomic analysis of a key innovation in an experimental Escherichia coli population. Nature, 489: 513–518, 2012. John Tyler Bonner. The Evolution of Complexity by Means of Natural Selection. Princeton University Press, Princeton, 1988. [22] [23] Christina L Burch and Lin Chao. Evolution by small steps and rugged landscapes in the RNA virus φ6. Genetics, 151(3):921–927, 1999. [24] Alita R Burmeister, Richard E Lenski, and Justin R Meyer. Host coevolution alters the adaptive landscape of a virus. Proceedings of the Royal Society B: Biological Sciences, 283 (1839):20161528, 2016. [25] David Butcher. Muller’s ratchet, epistasis and mutation effects. Genetics, 141(1):431–437, 1995. [26] Lin Chao. Fitness of RNA virus decreased by Muller’s ratchet. Nature, 348(6300):454, 1990. [27] Brian Charlesworth. Why we are not dead one hundred times over. Evolution, 67(11): 3354–3361, 2013. 98 [28] Hsin-Hung Chou, Hsuan-Chao Chiu, Nigel F Delaney, Daniel Segrè, and Christopher J Marx. Diminishing returns epistasis among beneficial mutations decelerates adaptation. Science, 332(6034):1190–1192, 2011. [29] Stephanie S Chow, Claus O Wilke, Charles Ofria, Richard E Lenski, and Christoph Adami. Adaptive radiation from resource competition in digital organisms. Science, 305(5680): 84–86, 2004. [30] Miguel Clavero and Emili García-Berthou. Invasive species are a leading cause of animal extinctions. Trends in Ecology and Evolution, 20(3):110–110, 2005. [31] [32] Jeff Clune, Dusan Misevic, Charles Ofria, Richard E Lenski, Santiago F Elena, and Rafael Sanjuán. Natural selection fails to optimize mutation rates for long-term adaptation on rugged fitness landscapes. PLoS Computational Biology, 4(9):e1000187, 2008. Iñaki Comas, Andrés Moya, and Fernando González-Candelas. Validating viral quasispecies with digital organisms: A re-examination of the critical mutation rate. BMC Evolutionary Biology, 5(1):5, 2005. [33] Arthur W Covert, Richard E Lenski, Claus O Wilke, and Charles Ofria. Experiments on the role of deleterious mutations as stepping stones in adaptive evolution. Proceedings of the National Academy of Sciences, 110(34):E3171–E3178, 2013. James F Crow. Some possibilities for measuring selection intensities in man. Human Biology, 30(1):1–13, 1958. [34] [35] Ivana Cvijović, Benjamin H Good, Elizabeth R Jerison, and Michael M Desai. Fate of a mutation in a fluctuating environment. Proceedings of the National Academy of Sciences, 112(36):E5021–E5028, 2015. [36] Vincent Daubin and Nancy A Moran. Comment on “The origins of genome complexity”. Science, 306:978–978, 2004. [37] [38] J. Arjan G M. de Visser, Clifford W Zeyl, Philip J Gerrish, Jeffrey L Blanchard, and Richard E Lenski. Diminishing returns from mutation supply rate in asexual populations. Science, 283 (5400):404–406, 1999. J. Arjan G M. de Visser, Joachim Hermisson, Günter P Wagner, Lauren Ancel Meyers, Homayoun Bagheri-Chaichian, Jeffrey L Blanchard, Lin Chao, James M Cheverud, San- tiago F Elena, Walter Fontana, et al. Perspective: Evolution and detection of genetic robustness. Evolution, 57(9):1959–1972, 2003. [39] Michael M Desai, Daniel S Fisher, and Andrew W Murray. The speed of evolution and maintenance of variation in asexual populations. Current Biology, 17(5):385–394, 2007. [40] Guillaume Diss, Isabelle Gagnon-Arsenault, Anne-Marie Dion-Coté, Hélène Vignaud, Di- ana I Ascencio, Caroline M Berger, and Christian R Landry. Gene duplication can impart fragility, not robustness, in the yeast protein interaction network. Science, 355(6325):630– 634, 2017. 99 [41] Elizabeth Duarte, David Clarke, Andres Moya, Esteban Domingo, and John Holland. Rapid fitness losses in mammalian RNA virus clones due to Muller’s ratchet. Proceedings of the National Academy of Sciences, 89(13):6015–6019, 1992. [42] Robert R Dunn, Nyeema C Harris, Robert K Colwell, Lian Pin Koh, and Navjot S Sodhi. The sixth mass coextinction: Are most endangered species parasites and mutualists? Proceedings of the Royal Society of London B: Biological Sciences, 276(1670):3037–3045, 2009. [43] Sean R Eddy. The C-value paradox, junk DNA and ENCODE. Current Biology, 22: R898–R899, 2012. [44] Jeffrey A Edlund and Christoph Adami. Evolution of robustness in digital organisms. Artificial Life, 10(2):167–179, 2004. [45] Santiago F Elena and Richard E Lenski. Evolution experiments with microorganisms: The dynamics and genetic bases of adaptation. Nature Reviews Genetics, 4:457–469, 2003. [46] Santiago F Elena, Lynette Ekunwe, Neerja Hajela, Shenandoah A Oden, and Richard E Lenski. Distribution of fitness effects caused by random insertion mutations in Escherichia coli. Genetica, 102:349, 1998. [47] Santiago F Elena, Purificación Carrasco, José-Antonio Daròs, and Rafael Sanjuán. Mecha- nisms of genetic robustness in RNA viruses. EMBO Reports, 7(2):168–173, 2006. [48] Santiago F Elena, Claus O Wilke, Charles Ofria, and Richard E Lenski. Effects of population size and mutation rate on the evolution of mutational robustness. Evolution, 61:666–674, 2007. [49] Suzanne Estes and Michael Lynch. Rapid fitness recovery in mutationally degraded lines of Caenorhabditis elegans. Evolution, 57(5):1022–1030, 2003. [50] Warren J Ewens. Mathematical Population Genetics 1: Theoretical Introduction, volume 27. Springer Science & Business Media, 2004. [51] Adam Eyre-Walker and Peter D Keightley. The distribution of fitness effects of new muta- tions. Nature Reviews Genetics, 8(8):610, 2007. [52] Mario A Fares, Mario X Ruiz-González, Andrés Moya, Santiago F Elena, and Eladio Barrio. Endosymbiotic bacteria: groEL buffers against deleterious mutations. Nature, 417(6887): 398, 2002. [53] Mario Ali Fares, Andres Moya, and Eladio Barrio. GroEL and the maintenance of bacterial endosymbiosis. Trends in Genetics, 20(9):413–416, 2004. Joseph Felsenstein. The evolutionary advantage of recombination. Genetics, 78(2):737–756, 1974. [54] [55] Craig A Fogle, James L Nagle, and Michael M Desai. Clonal interference, multiple mutations and adaptation in large asexual populations. Genetics, 180(4):2163–2173, 2008. 100 [56] Miguel A Fortuna, Luis Zaman, Andreas Wagner, and Jordi Bascompte. Non-adaptive origins of evolutionary innovations increase network complexity in interacting digital organ- isms. Philosophical Transactions of the Royal Society B: Biological Sciences, 372(1735): 20160431, 2017. [57] Erin Fry, Sun K Kim, Sravanthi Chigurapti, Katelyn M Mika, Aakrosh Ratan, Alexander Dammermann, Brian J Mitchell, Webb Miller, and Vincent J Lynch. Functional architecture of deleterious genetic variants in the Wrangel Island mammoth genome. bioRxiv, page 137455, 2017. [58] Wilfried Gabriel, Michael Lynch, and R Burger. Muller’s ratchet and mutational meltdowns. Evolution, 47(6):1744–1757, 1993. [59] Kerry A Geiler-Samerotte, Yuan O Zhu, Benjamin E Goulet, David W Hall, and Mark L Siegal. Selection transforms the landscape of genetic variation interacting with Hsp90. PLoS Biology, 14(10):e2000465, 2016. [60] Philip J Gerrish and Richard E Lenski. The fate of competing beneficial mutations in an asexual population. Genetica, 102:127, 1998. [61] John H Gillespie. Genetic drift in an infinite population: The pseudohitchhiking model. Genetics, 155:909–919, 2000. [62] Heather J Goldsby, Anna Dornhaus, Benjamin Kerr, and Charles Ofria. Task-switching costs promote the evolution of division of labor and shifts in individuality. Proceedings of the National Academy of Sciences, 109(34):13686–13691, 2012. [63] Heather J Goldsby, David B Knoester, Charles Ofria, and Benjamin Kerr. The evolutionary origin of somatic cells under the dirty work hypothesis. PLoS Biology, 12:e1001858, 2014. [64] Benjamin H Good, Igor M Rouzine, Daniel J Balick, Oskar Hallatschek, and Michael M Desai. Distribution of fixed beneficial mutations and the rate of adaptation in asexual populations. Proceedings of the National Academy of Sciences, 109(13):4950–4955, 2012. [65] Isabel Gordo and Brian Charlesworth. The speed of Muller’s ratchet with background selection, and the degeneration of Y chromosomes. Genetics Research, 78(2):149–161, 2001. [66] Sidhartha Goyal, Daniel J Balick, Elizabeth R Jerison, Richard A Neher, Boris I Shraiman, and Michael M Desai. Dynamic mutation–selection balance as an evolutionary attractor. Genetics, 191(4):1309–1319, 2012. [67] T Ryan Gregory and Jonathan DS Witt. Population size and genome size in fishes: A closer look. Genome, 51(4):309–313, 2008. [68] Pierre-Alexis Gros and Olivier Tenaillon. Selection for chaperone-like mediated genetic robustness at low mutation rate: Impact of drift, epistasis and complexity. Genetics, 182(2): 555–564, 2009. 101 [69] Pierre-Alexis Gros, Hervé Le Nagard, and Olivier Tenaillon. The evolution of epistasis and its links with genetic robustness, complexity and drift in a phenotypic model of adaptation. Genetics, 182(1):277–293, 2009. [70] Zhenglong Gu, Lars M Steinmetz, Xun Gu, Curt Scharfe, Ronald W Davis, and Wen-Hsiung Li. Role of duplicate genes in genetic robustness against null mutations. Nature, 421(6918): 63–66, 2003. [71] Aditi Gupta, Thomas LaBar, Michael Miyagi, and Christoph Adami. Evolution of genome size in asexual digital organisms. Scientific Reports, 6:25786, 2016. [72] John Haigh. The accumulation of deleterious genes in a population—Muller’s ratchet. Theoretical Population Biology, 14(2):251–267, 1978. [73] Daniel L Halligan and Peter D Keightley. Spontaneous mutation accumulation studies in evolutionary genetics. Annual Review of Ecology, Evolution, and Systematics, 40:151–172, 2009. [74] Andreas Handel and Daniel E Rozen. The impact of population size on the evolution of asexual microbes on smooth versus rugged fitness landscapes. BMC Evolutionary Biology, 9(1):236, 2009. [75] Thomas Hindré, Carole Knibbe, Guillaume Beslon, and Dominique Schneider. New insights into bacterial adaptation through in vivo and in silico experimental evolution. Nature Reviews Microbiology, 10:352–365, 2012. [76] Edward C Holmes. The evolution and emergence of RNA viruses. Oxford University Press, 2009. [77] Austin L Hughes. The evolution of functionally novel proteins after gene duplication. Proceedings of the Royal Society of London B, 256:119–124, 1994. John D Hunter et al. Matplotlib: A 2D graphics environment. Computing in Science and Engineering, 9(3):90–95, 2007. [78] [79] Yoh Iwasa, Franziska Michor, and Martin A Nowak. Stochastic tunnels in evolutionary dynamics. Genetics, 166(3):1571–1579, 2004. [80] David Jablonski. Background and mass extinctions: The alternation of macroevolutionary regimes. Science, 231(4734):129–133, 1986. [81] Kavita Jain, Joachim Krug, and Su-Chan Park. Evolutionary advantage of small populations on complex fitness landscapes. Evolution, 65:1945–1955, 2011. [82] Eric Jones, Travis Oliphant, Pearu Peterson, et al. SciPy: Open source scientific tools for Python, 2001–. URL http://www. scipy. org, 73:86, 2015. [83] Rees Kassen. Experimental evolution and the nature of biodiversity. Roberts, 2014. 102 [84] Tadeusz J Kawecki, Richard E Lenski, Dieter Ebert, Brian Hollis, Isabelle Olivieri, and Michael C Whitlock. Experimental evolution. Trends in Ecology & Evolution, 27:547–560, 2012. [85] Peter D Keightley and Sarah P Otto. Interference among deleterious mutations favours sex and recombination in finite populations. Nature, 443(7107):89, 2006. [86] Yogeshwar D Kelkar and Howard Ochman. Genome reduction promotes increase in protein functional complexity in bacteria. Genetics, 193(1):303–307, 2013. [87] Aisha I Khan, Duy M Dinh, Dominique Schneider, Richard E Lenski, and Tim F Cooper. Negative epistasis between beneficial mutations in an evolving bacterial population. Science, 332(6034):1193–1196, 2011. [88] Motoo Kimura. On the probability of fixation of mutant genes in a population. Genetics, 47 (6):713–719, 1962. [89] Motoo Kimura. The neutral theory of molecular evolution. Cambridge University Press, New York, 1984. [90] Motoo Kimura and Takeo Maruyama. The mutational load with epistatic gene interactions in fitness. Genetics, 54(6):1337–1351, 1966. [91] Motoo Kimura, Takeo Maruyama, and James F Crow. The mutation load in small populations. Genetics, 48(10):1303–1312, 1963. [92] Alexey S Kondrashov. Muller’s ratchet under epistatic selection. Genetics, 136(4):1469– 1473, 1994. [93] Alexey S Kondrashov. Contamination of the genome by very slightly deleterious mutations: why have we not died 100 times over? Journal of Theoretical Biology, 175(4):583–594, 1995. [94] Eugene V Koonin. A non-adaptationist perspective on evolution of genomic complexity or the continued dethroning of man. Cell Cycle, 3:278–283, 2004. [95] Eugene V Koonin. Evolution of genome architecture. The International Journal of Bio- chemistry & Cell Biology, 41(2):298–306, 2009. [96] Eugene V Koonin. The Logic of Chance: The Nature and Origin of Biological Evolution. FT press, Upper Saddle River, 2012. [97] David C Krakauer and Joshua B Plotkin. Redundancy, antiredundancy, and the robustness of genomes. Proceedings of the National Academy of Sciences, 99(3):1405–1409, 2002. [98] Sergey Kryazhimskiy, Daniel P Rice, Elizabeth R Jerison, and Michael M Desai. Global epistasis makes adaptation predictable despite sequence-level stochasticity. Science, 344 (6191):1519–1522, 2014. 103 [99] Chih-Horng Kuo and Howard Ochman. Deletional bias across the three domains of life. Genome Biology and Evolution, 1:145–152, 2009. [100] Chih-Horng Kuo, Nancy A Moran, and Howard Ochman. The consequences of genetic drift for bacterial genome complexity. Genome Research, 19:1450–1454, 2009. [101] Thomas LaBar and Christoph Adami. Different evolutionary paths to complexity for small and large populations of digital organisms. PLoS Computational Biology, 12(12):e1005066, 2016. [102] Thomas LaBar and Christoph Adami. Evolution of drift robustness in small populations. Nature Communications, 8(1):1012, 2017. [103] Josianne Lachapelle, Joshua Reid, and Nick Colegrave. Repeatability of adaptation in experimental populations of different sizes. Proceedings of the Royal Society of London B: Biological Sciences, 282:20143033, 2015. [104] Yinghong Lan, Aaron Trout, Daniel M Weinreich, and C Scott Wylie. Natural selection can favor the evolution of ratchet robustness over evolution of mutational robustness. bioRxiv, page 121087, 2017. [105] Russell Lande. Risk of population extinction from fixation of new deleterious mutations. Evolution, 48(5):1460–1469, 1994. [106] Adam S Lauring, Judith Frydman, and Raul Andino. The role of mutational robustness in RNA virus evolution. Nature Reviews Microbiology, 11(5):327, 2013. [107] Richard E Lenski, Michael R Rose, Suzanne C Simpson, and Scott C Tadler. Long-term experimental evolution in Escherichia coli. I: Adaptation and divergence during 2,000 gen- erations. The American Naturalist, 138:1315–1341, 1991. [108] Richard E Lenski, Charles Ofria, Travis C Collier, and Christoph Adami. Genome com- plexity, robustness and genetic interactions in digital organisms. Nature, 400:661–664, 1999. [109] Richard E Lenski, Charles Ofria, Robert T Pennock, and Christoph Adami. The evolutionary origin of complex features. Nature, 423:139–144, 2003. [110] Yann Lesecque, Peter D Keightley, and Adam Eyre-Walker. A resolution of the mutation load paradox in humans. Genetics, 191(4):1321–1330, 2012. [111] Sasha F Levy, Jamie R Blundell, Sandeep Venkataram, Dmitri A Petrov, Daniel S Fisher, and Gavin Sherlock. Quantitative evolutionary dynamics using high-resolution lineage tracking. Nature, 519(7542):181, 2015. [112] Haley A Lindsey, Jenna Gallie, Susan Taylor, and Benjamin Kerr. Evolutionary rescue from extinction is contingent on a lower rate of environmental change. Nature, 494(7438): 463–467, 2013. 104 [113] Michael Lynch. Mutation accumulation in transfer RNAs: Molecular evidence for Muller’s ratchet in mitochondrial genomes. Molecular Biology and Evolution, 13(1):209–220, 1996. [114] Michael Lynch. Intron evolution as a population-genetic process. Proceedings of the National Academy of Sciences, 99(9):6118–6123, 2002. [115] Michael Lynch. The origins of eukaryotic gene structure. Molecular Biology and Evolution, 23:450–468, 2006. [116] Michael Lynch. Streamlining and simplification of microbial genome architecture. Annual Review of Microbiology, 60:327–349, 2006. [117] Michael Lynch. The evolution of genetic networks by non-adaptive processes. Nature Reviews Genetics, 8(10):803, 2007. [118] Michael Lynch. The frailty of adaptive hypotheses for the origins of organismal complexity. Proceedings of the National Academy of Sciences, 104(Suppl 1):8597–8604, 2007. [119] Michael Lynch. The Origins of Genome Architecture. Sinauer Associates, Sunderland, 2007. [120] Michael Lynch and John S Conery. The evolutionary fate and consequences of duplicate genes. Science, 290:1151–1155, 2000. [121] Michael Lynch and John S Conery. The origins of genome complexity. Science, 302: 1401–1404, 2003. [122] Michael Lynch and Allan Force. The probability of duplicate gene preservation by subfunc- tionalization. Genetics, 154(1):459–473, 2000. [123] Michael Lynch and Wilfried Gabriel. Mutation load and the survival of small populations. Evolution, 44(7):1725–1737, 1990. [124] Michael Lynch and Kyle Hagner. Evolutionary meandering of intermolecular interactions along the drift barrier. Proceedings of the National Academy of Sciences, 112(1):E30–E38, 2015. [125] Michael Lynch, Reinhard Bürger, D Butcher, and Wilfried Gabriel. The mutational meltdown in asexual populations. Journal of Heredity, 84:339–344, 1993. [126] Michael Lynch, John Conery, and Reinhard Bürger. Mutation accumulation and the extinction of small populations. American Naturalist, 146(4):489–518, 1995. [127] Michael Lynch, John Conery, and Reinhard Bürger. Mutational meltdowns in sexual popu- lations. Evolution, 49:1067–1080, 1995. [128] Michael Lynch, Louis-Marie Bobay, Francesco Catania, Jean-François Gout, and Mina Rho. The repatterning of eukaryotic genomes by random genetic drift. Annual Review of Genomics and Human Genetics, 12:347–366, 2011. 105 [129] Michael Lynch, Matthew S Ackerman, Jean-Francois Gout, Hongan Long, Way Sung, W Kelley Thomas, and Patricia L Foster. Genetic drift, selection and the evolution of the mutation rate. Nature Reviews Genetics, 17(11):704–714, 2016. [130] Sophie Maisnier-Patin, John R Roth, Åsa Fredriksson, Thomas Nyström, Otto G Berg, and Dan I Andersson. Genomic buffering mitigates the effects of deleterious mutations in bacteria. Nature Genetics, 37(12):1376, 2005. [131] Joanna Masel. Genetic drift. Current Biology, 21:R837–R838, 2011. [132] Diethart Matthies, Ingo Bräuer, Wiebke Maibom, and Teja Tscharntke. Population size and the risk of local extinction: Empirical evidence from rare plants. Oikos, 105(3):481–488, 2004. [133] John Maynard Smith. The Evolution of Sex. Cambridge University Press, 1978. [134] John Maynard Smith. The causes of extinction. Philosophical Transactions of the Royal Society B, 325:241–252, 1989. [135] David M McCandlish and Joshua B Plotkin. Transcriptional errors and the drift barrier. Proceedings of the National Academy of Sciences, 113(12):3136–3138, 2016. [136] David M McCandlish and Arlin Stoltzfus. Modeling evolution using the probability of fixation: History and implications. The Quarterly Review of Biology, 89(3):225–252, 2014. [137] John P McCutcheon and Nancy A Moran. Extreme genome reduction in symbiotic bacteria. Nature Reviews Microbiology, 10(1):13, 2012. [138] Michael J McDonald, Daniel P Rice, and Michael M Desai. Sex speeds adaptation by altering the dynamics of molecular evolution. Nature, 531(7593):233, 2016. [139] Wes McKinney. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O’Reilly Media, Inc., 2012. [140] Daniel W McShea and Robert N Brandon. Biology’s First Law: The Tendency for Diversity and Complexity to Increase in Evolutionary Systems. University of Chicago Press, Chicago, 2010. [141] Brett A Melbourne and Alan Hastings. Extinction risk depends strongly on factors con- tributing to stochasticity. Nature, 454(7200):100–103, 2008. [142] Justin R Meyer, Devin T Dobias, Joshua S Weitz, Jeffrey E Barrick, Ryan T Quick, and Richard E Lenski. Repeatability and contingency in the evolution of a key innovation in phage lambda. Science, 335:428–432, 2012. [143] Alex Mira, Howard Ochman, and Nancy A Moran. Deletional bias and the evolution of bacterial genomes. Trends in Genetics, 17:589–596, 2001. [144] Dusan Misevic, Richard E Lenski, and Charles Ofria. Sexual reproduction and Muller’s In Ninth International Conference on Artificial Life, pages ratchet in digital organisms. 340–345, 2004. 106 [145] Dusan Misevic, Charles Ofria, and Richard E Lenski. Sexual reproduction reshapes the genetic architecture of digital organisms. Proceedings of the Royal Society of London B, 273:457–464, 2006. [146] Rebecca Montville, Remy Froissart, Susanna K Remold, Olivier Tenaillon, and Paul E Turner. Evolution of mutational robustness in an RNA virus. PLoS Biology, 3(11):e381, 2005. [147] Francisco B-G Moore, Danieal E Rozen, and Richard E Lenski. Pervasive compensatory adaptation in Escherichia coli. Proceedings of the Royal Society of London B: Biological Sciences, 267(1442):515–522, 2000. [148] Nancy A Moran. Accelerated evolution and Muller’s ratchet in endosymbiotic bacteria. Proceedings of the National Academy of Sciences, 93(7):2873–2878, 1996. [149] Nancy A Moran, John P McCutcheon, and Atsushi Nakabachi. Genomics and evolution of heritable bacterial symbionts. Annual Review of Genetics, 42:165–190, 2008. [150] Hermann J Muller. Our load of mutations. American Journal of Human Genetics, 2(2):111, 1950. [151] Hermann Joseph Muller. The relation of recombination to mutational advance. Mutation Research, 1:2–9, 1964. [152] Joakim Näsvall, Lei Sun, John R Roth, and Dan I Andersson. Real-time evolution of new genes by innovation, amplification, and divergence. Science, 338:384–387, 2012. [153] Ian E Ochs and Michael M Desai. The competition between simple and complex evolutionary trajectories in asexual populations. BMC Evolutionary Biology, 15:55, 2015. [154] Charles Ofria, David M Bryson, and Claus O Wilke. Avida: A software platform for research in computational evolutionary biology. In Andrew Adamatzky Maciej Komosinski, editor, Artificial Life Models in Software, pages 3–35. Springer London, 2009. [155] Tomoko Ohta. The nearly neutral theory of molecular evolution. Annual Review of Ecology and Systematics, 23:263–286, 1992. [156] Tomoko Ohta and John H Gillespie. Development of neutral and nearly neutral theories. Theoretical Population Biology, 49:128–142, 1996. [157] Sarah P Otto. The evolutionary enigma of sex. The American Naturalist, 174:S1–S14, 2009. [158] Sarah P Otto and Nick H Barton. Selection for recombination in small populations. Evolution, 55(10):1921–1931, 2001. [159] Sarah P Otto and Aleeza C Gerstein. The evolution of haploidy and diploidy. Current Biology, 18(24):R1121–R1124, 2008. [160] Julian J O’Grady, Barry W Brook, David H Reed, Jonathan D Ballou, David W Tonkyn, and Richard Frankham. Realistic levels of inbreeding depression strongly affect extinction risk in wild populations. Biological Conservation, 133(1):42–51, 2006. 107 [161] Alexander F Palazzo and T Ryan Gregory. The case for junk DNA. PLoS Genetics, 10: e1004351, 2014. [162] Adam C Palmer, Erdal Toprak, Michael Baym, Seungsoo Kim, Adrian Veres, Shimon Bershtein, and Roy Kishony. Delayed commitment to evolutionary fate in antibiotic resistance fitness landscapes. Nature Communications, 6:7385, 2015. [163] Amy B Pedersen, Kate E Jones, Charles L Nunn, and Sonia Altizer. Infectious diseases and extinction risk in wild mammals. Conservation Biology, 21(5):1269–1279, 2007. [164] Robert T Pennock. Models, simulations, instantiations, and evidence: The case of digital evolution. Journal of Experimental & Theoretical Artificial Intelligence, 19:29–42, 2007. [165] Frank J Poelwijk, Sorin Tănase-Nicola, Daniel J Kiviet, and Sander J Tans. Reciprocal sign epistasis is a necessary condition for multi-peaked fitness landscapes. Journal of Theoretical Biology, 272(1):141–144, 2011. [166] Art Poon and Lin Chao. The rate of compensatory mutation in the DNA bacteriophage ϕX174. Genetics, 170(3):989–999, 2005. [167] Art Poon and Sarah P Otto. Compensating for our load of mutations: Freezing the meltdown of small populations. Evolution, 54(5):1467–1479, 2000. [168] Erik M Quandt, Jimmy Gollihar, Zachary D Blount, Andrew D Ellington, George Georgiou, and Jeffrey E Barrick. Fine-tuning citrate synthase flux potentiates and refines metabolic innovation in the Lenski evolution experiment. eLife, 4:e09696, 2015. [169] Etienne Rajon and Joanna Masel. Evolution of molecular error rates and the consequences for evolvability. Proceedings of the National Academy of Sciences, 108(3):1082–1087, 2011. [170] David M Raup. Extinction: Bad Genes or Bad Luck? WW Norton & Company, New York, 1992. [171] Yevgeniy Raynes, C Scott Wylie, Paul D Sniegowski, and Daniel M Weinreich. Sign of selection on mutation rate modifiers depends on population size. Proceedings of the National Academy of Sciences, 115(13):3422–3427, 2018. [172] David Reznick. Hard and soft selection revisited: How evolution by natural selection works in the real world. Journal of Heredity, 107(1):3–14, 2015. [173] Jacqueline A Robinson, Diego Ortega-Del Vecchyo, Zhenxin Fan, Bernard Y Kim, Clare D Marsden, Kirk E Lohmueller, Robert K Wayne, et al. Genomic flatlining in the endangered island fox. Current Biology, 26(9):1183–1189, 2016. [174] Rebekah L Rogers and Montgomery Slatkin. Excess of genomic defects in a woolly mammoth on Wrangel Island. PLoS Genetics, 13(3):e1006601, 2017. [175] Daniel E Rozen, Michelle G. J. L. Habets, Andreas Handel, and J. Arjan G. M. de Visser. Heterogeneous adaptive trajectories of small populations on complex fitness landscapes. PLoS One, 3:e1715–e1715, 2008. 108 [176] Beatriz Sabater-Muñoz, Maria Prats-Escriche, Roser Montagud-Martínez, Adolfo López- Cerdán, Christina Toft, José Aguilar-Rodríguez, Andreas Wagner, and Mario A Fares. Fitness trade-offs determine the role of the molecular chaperonin GroEL in buffering mutations. Molecular Biology and Evolution, 32(10):2681–2693, 2015. [177] Beatriz Sabater-Muñoz, Christina Toft, David Alvarez-Ponce, and Mario A Fares. Chance and necessity in the genome evolution of endosymbiotic bacteria of insects. The ISME Journal, 11(6):1291, 2017. [178] Merijn LM Salverda, Jeroen Koomen, Bertha Koopmanschap, Mark P Zwart, and J. Arjan G. M. de Visser. Adaptive benefits from small mutation supplies in an antibiotic resistance enzyme. Proceedings of the National Academy of Sciences, page 201712999, 2017. [179] Rafael Sanjuán, José M Cuevas, Victoria Furió, Edward C Holmes, and Andrés Moya. Selection for robustness in mutagenized RNA viruses. PLoS Genetics, 3(6):e93, 2007. [180] Sijmen Schoustra, Sungmin Hwang, Joachim Krug, and J. Arjan G. M. de Visser. Diminishing-returns epistasis among random beneficial mutations in a multicellular fun- gus. Proceedings of the Royal Society B: Biological Sciences, 283(1837):20161376, 2016. [181] Sijmen E Schoustra, Thomas Bataillon, Danna R Gifford, and Rees Kassen. The properties of adaptive walks in evolving populations of fungus. PLoS Biology, 7:2474, 2009. [182] Guy Sella and Aaron E Hirsh. The application of statistical physics to evolutionary biology. Proceedings of the National Academy of Sciences of the United States of America, 102(27): 9541–9546, 2005. [183] Olin K Silander, Olivier Tenaillon, and Lin Chao. Understanding the evolutionary fate of finite populations: The dynamics of mutational effects. PLoS Biology, 5(4):e94, 2007. [184] Tanya Singh, Meredith Hyun, and Paul Sniegowski. Evolution of mutation rates in hyper- mutable populations of Escherichia coli propagated at very small effective population size. Biology Letters, 13(3):20160849, 2017. [185] Mashaal Sohail, Olga A Vakhrusheva, Jae Hoon Sul, Sara L Pulit, Laurent C Francioli, Leonard H van den Berg, Jan H Veldink, Paul IW de Bakker, Georgii A Bazykin, Alexey S Kondrashov, et al. Negative selection in humans and fruit flies involves synergistic epistasis. Science, 356(6337):539–542, 2017. [186] Derek Spielman, Barry W Brook, and Richard Frankham. Most species are not driven to extinction before genetic factors impact them. Proceedings of the National Academy of Sciences, 101(42):15261–15264, 2004. [187] Giovanni Strona and Kevin D Lafferty. Environmental change makes robust ecological networks fragile. Nature Communications, 7:12462, 2016. [188] Way Sung, Matthew S Ackerman, Samuel F Miller, Thomas G Doak, and Michael Lynch. Drift-barrier hypothesis and mutation-rate evolution. Proceedings of the National Academy of Sciences, 109(45):18488–18492, 2012. 109 [189] Béla Szamecz, Gábor Boross, Dorottya Kalapis, Károly Kovács, Gergely Fekete, Zoltán Farkas, Viktória Lázár, Mónika Hrtyan, Patrick Kemmeren, Marian JA Groot Koerkamp, et al. The genomic landscape of compensatory evolution. PLoS Biology, 12(8):e1001935, 2014. [190] Ivan G Szendro, Jasper Franke, J. Arjan G. M. de Visser, and Joachim Krug. Predictability of evolution depends nonmonotonically on population size. Proceedings of the National Academy of Sciences, 110(2):571–576, 2013. [191] Olivier Tenaillon, Olin K Silander, Jean-Philippe Uzan, and Lin Chao. Quantifying organ- ismal complexity using a population genetic approach. PLoS One, 2:e217, 2007. [192] Christina Toft and Mario A Fares. Selection for translational robustness in Buchnera aphidi- cola, endosymbiotic bacteria of aphids. Molecular Biology and Evolution, 26(4):743–751, 2009. [193] Charles C Traverse and Howard Ochman. Conserved rates and patterns of transcription errors across bacterial growth states and lifestyles. Proceedings of the National Academy of Sciences, 113(12):3311–3316, 2016. [194] Michael Travisano, Judith A Mongold, Albert F Bennett, and Richard E Lenski. Experi- mental tests of the roles of adaptation, chance, and history in evolution. Science, 267:87–90, 1995. [195] Caroline B Turner, Zachary D Blount, and Richard E Lenski. Replaying evolution to test the cause of extinction of one ecotype in an experimentally evolved population. PloS One, 10 (11):e0142050, 2015. [196] Mark C Urban. Accelerating extinction risk from climate change. Science, 348(6234): 571–573, 2015. [197] Stefan Van Der Walt, S Chris Colbert, and Gael Varoquaux. The NumPy array: A structure for efficient numerical computation. Computing in Science & Engineering, 13(2):22–30, 2011. [198] Erik Van Nimwegen, James P Crutchfield, and Martijn Huynen. Neutral evolution of mutational robustness. Proceedings of the National Academy of Sciences, 96(17):9716– 9720, 1999. [199] Geerat J Vermeij and Richard K Grosberg. Rarity and persistence. Ecology Letters, 21(1): 3–8, 2018. [200] Alexander E Vinogradov. Selfish DNA is maladaptive: Evidence from the plant red list. Trends in Genetics, 19:609–614, 2003. [201] Alexander E Vinogradov. Genome size and extinction risk in vertebrates. Proceedings of the Royal Society of London B: Biological Sciences, 271:1701–1706, 2004. [202] Daniel A Wagenaar and Christoph Adami. Influence of chance, history, and adaptation on digital evolution. Artificial Life, 10:181–190, 2004. 110 [203] Andreas Wagner. Neutralism and selectionism: A network-based reconciliation. Nature Reviews Genetics, 9:965–974, 2008. [204] Andreas Wagner. The Origins of Evolutionary Innovations: A Theory of Transformative Change in Living Systems. Oxford University Press, New York, 2011. [205] Bess L Walker and Charles Ofria. Evolutionary potential is maximized at intermediate diversity levels. In C. Adami, D.M. Bryson, C. Ofria, and R. Pennock, editors, Proceedings of the 13th International Conference on the Synthesis and Simulation of Living Systems, pages 116–120, Cambridge, MA, 2012. MIT Press. [206] Bruce Wallace. Fifty Years of Genetic Load: An Odyssey. Cornell University Press, 1991. [207] J Bruce Walsh. How often do duplicated genes evolve new functions? Genetics, 139: 421–428, 1995. [208] Daniel M Weinreich and Lin Chao. Rapid evolutionary escape by large populations from local fitness peaks is likely in nature. Evolution, 59(6):1175–1182, 2005. [209] Daniel M Weinreich, Nigel F Delaney, Mark A DePristo, and Daniel L Hartl. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science, 312:111–114, 2006. [210] Daniel B Weissman, Michael M Desai, Daniel S Fisher, and Marcus W Feldman. The rate at which asexual populations cross fitness valleys. Theoretical Population Biology, 75: 286–300, 2009. [211] Daniel B Weissman, Marcus W Feldman, and Daniel S Fisher. The rate of fitness-valley crossing in sexual populations. Genetics, 186:1389–1410, 2010. [212] Michael C Whitlock. Fixation of new alleles and the extinction of small populations: Drift load, beneficial alleles, and sexual selection. Evolution, 54(6):1855–1861, 2000. [213] Michael C Whitlock, Reinhard Bürger, and Ulf Dieckmann. Fixation of new mutations in small populations. In Régis Ferrière, Ulf Dieckmann, and Denis Couvet, editors, Evo- lutionary Conservation Biology, chapter 9, pages 155–170. Cambridge University Press, Cambridge, 2004. [214] Kenneth D Whitney and Theodore Garland. Did genetic drift drive increases in genome complexity? PLoS Genetics, 6(8):e1001080, 2010. [215] Claus O Wilke. The speed of adaptation in large asexual populations. Genetics, 167(4): 2045–2053, 2004. [216] Claus O Wilke and Christoph Adami. Interaction between directional epistasis and average mutational effects. Proceedings of the Royal Society of London B: Biological Sciences, 268 (1475):1469–1474, 2001. [217] Claus O Wilke and Christoph Adami. The biology of digital organisms. Trends in Ecology & Evolution, 17:528–532, 2002. 111 [218] Claus O Wilke and Christoph Adami. Evolution of mutational robustness. Mutation Re- search/Fundamental and Molecular Mechanisms of Mutagenesis, 522(1):3–11, 2003. [219] Claus O Wilke and D Allan Drummond. Population genetics of translational robustness. Genetics, 173(1):473–481, 2006. [220] Claus O Wilke, Jia Lan Wang, Charles Ofria, Richard E Lenski, and Christoph Adami. Evolution of digital organisms at high mutation rates leads to survival of the flattest. Nature, 412(6844):331, 2001. [221] Yvonne Willi, Josh Van Buskirk, and Ary A Hoffmann. Limits to the adaptive potential of small populations. Annual Review of Ecology, Evolution, and Systematics, 37:433–458, 2006. [222] Wolfram Research, Inc. Mathematica, Version 11.0. Wolfram Research, Inc., Champaign, Illinois, 2016. [223] Sewall Wright. The roles of mutation, inbreeding, crossbreeding, and selection in evolution. In Proceedings of the Sixth International Congress of Genetics, volume 1, pages 356–366, 1932. [224] Kun Xiong, Jay P McEntee, David J Porfirio, and Joanna Masel. Drift barriers to quality control when genes are expressed at different levels. Genetics, 205(1):397–407, 2017. [225] Yali Xue, Javier Prado-Martinez, Peter H Sudmant, Vagheesh Narasimhan, Qasim Ayub, Michal Szpak, Peter Frandsen, Yuan Chen, Bryndis Yngvadottir, David N Cooper, et al. Mountain gorilla genomes reveal the impact of long-term population decline and inbreeding. Science, 348(6231):242–245, 2015. [226] Gabriel Yedid and Graham Bell. Macroevolution simulated with autonomously replicating computer programs. Nature, 420:810–812, 2002. [227] Gabriel Yedid, Charles Ofria, and Richard E Lenski. Historical and contingent factors affect re-evolution of a complex feature lost during mass extinction in communities of digital organisms. Journal of Evolutionary Biology, 21(5):1335–1357, 2008. [228] Gabriel Yedid, Charles A Ofria, and Richard E Lenski. Selective press extinctions, but not random pulse extinctions, cause delayed ecological recovery in communities of digital organisms. The American Naturalist, 173:E139–E154, 2009. [229] Gabriel Yedid, Jason Stredwick, Charles A Ofria, and Paul-Michael Agapow. A comparison of the effects of random and selective mass extinctions on erosion of evolutionary history in communities of digital organisms. PloS One, 7(5):e37233, 2012. [230] Soojin Yi and J Todd Streelman. Genome size is negatively correlated with effective popu- lation size in ray-finned fish. Trends in Genetics, 21(12):643–646, 2005. 112 [231] Eva Yus, Tobias Maier, Konstantinos Michalodimitrakis, Vera van Noort, Takuji Yamada, Wei-Hua Chen, Judith AH Wodke, Marc Güell, Sira Martínez, Ronan Bourgeois, et al. Impact of genome reduction on bacterial metabolism and its regulation. Science, 326(5957): 1263–1268, 2009. [232] Luis Zaman, Justin R Meyer, Suhas Devangam, David M Bryson, Richard E Lenski, and Charles Ofria. Coevolution drives the emergence of complex traits and promotes evolvability. PLoS Biology, 12:e1002023, 2014. [233] Clifford Zeyl, Melissa Mizesko, and J. Arjan G. M. de Visser. Mutational meltdown in laboratory yeast populations. Evolution, 55:909–917, 2001. [234] Mark P Zwart and Santiago F Elena. Matters of size: Genetic bottlenecks in virus infection and their potential impact on evolution. Annual Review of Virology, 2:161–179, 2015. [235] Mark P Zwart, Anouk Willemsen, José-Antonio Daròs, and Santiago F Elena. Experimental evolution of pseudogenization and gene loss in a plant RNA virus. Molecular Biology and Evolution, 31:121–134, 2014. 113