EXPLORING MASS EXTINCTION AND RECOVERY IN COMMUNITIES OF DIGITAL ORGANISMS

By

Gabriel Yedid

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Zoology
Ecology, Evolutionary Biology, and Behaviour Program

2007

ABSTRACT

In geological history, mass extinctions are periods of elevated biodiversity loss that accompany, or result from, major environmental crises. They often result in reduction or removal of previously successful incumbent taxa and major resculpting of biotas and ecosystems. The dynamics of recovery from mass extinctions remain contentious, and have not been resolved satisfactorily by analyses of the fossil record. In this dissertation, I use the Avida digital evolution software to examine dynamics of mass extinction and recovery in communities of short, self-replicating computer programs (digital organisms) that compete for limited resources, mutate, and evolve. The digital communities may evolve cross-feeding trophic interactions that permit ecological structure and dynamics.

In Chapter 1, I investigate general dynamics of extinction and recovery in Avida communities, focusing specifically on recovery dynamics. Replicate communities of digital organisms are subjected to nonselective, instantaneous pulse extinctions that kill off at random most members of the community, and also to highly selective press extinctions that involve a period of sharp reduction of resource availability. By several different measures of recovery that account for the communities’ ecological and phenotypic characteristics, recovery from press extinctions is markedly slower than recovery from pulse extinctions. A diversity measure that ignores the ecological and phenotypic characteristics of the digital organisms obscures differences in recovery dynamics following the two extinction types.
Dynamics of recovery from a pulse extinction are similar to the slower dynamics of a press extinction if the pulse survivors are chosen for life-history characteristics that resemble those of organisms from the end of the press episode, rather than chosen at random. My results demonstrate that delayed recoveries can be biologically and ecologically real, not artefactual, and may be general features resulting from highly selective extinction episodes.

“Complex” organismal traits, composed of multiple interlocking parts that work together, have evolved convergently in widely divergent clades separated by large periods of geological time, often after extinction of the earlier-evolved clade. In Chapter 2, I examine the loss and re-evolution of a complex computational trait, the function EQU, in press extinction communities. I find that re-evolution of EQU following the extinction episode is contingent on its pre-extinction evolution. EQU often evolves de novo in lineages that did not possess it at all prior to the extinction, and from immediate pre-extinction ancestors that were of low trophic position. I then inoculated new evolutionary processes, seeded with either the most abundant survivors of the clade in which EQU arose prior to the extinction (but that no longer performed EQU), or the actual pre-extinction ancestors of the organisms that did re-evolve EQU. EQU was quite likely to re-evolve even if the founder came from a source population where EQU did not re-evolve in the original experiment, and the actual end-extinction ancestors were not always superior in the number of subsidiary experiments where EQU re-evolved. Re-use of the ancestral genetic mechanism for performing EQU was highly variable, but often substantial. My results show that there is interplay between chance factors and adaptive pressures in whether or not a complex feature will re-evolve if it is lost due to extinction.
This thesis is dedicated to the memory of my grandmother, Bessie Stilman Kisilevsky, who passed away in January 2006. Words alone cannot describe how terribly you are missed by your family and friends.

ACKNOWLEDGEMENTS

I first met my thesis advisors, Richard E. Lenski and Charles Ofria, at a conference seven years ago. After spending time talking with them, and appreciating the possibilities inherent in their digital evolution research, I realized that working with Rich and Charles would be a logical next step for me. I have been fortunate beyond measure to have had them as thesis advisors. These past six years have seen excitement and depression, false starts and fruitful avenues, and hope lost and regained. Through it all, I have always had their tireless support, guidance, and (above all) patience.

It has been a fantastic experience to have Rich Lenski, a great scientist and one of the leaders in the field of experimental evolutionary biology, as my main advisor. I am deeply indebted to Rich for giving me the opportunity to work in his lab group, and the freedom to pursue unusual research directions. He gave me not only valuable advice on my research itself, but also guidance on how to develop important research and writing skills and habits. Rich always encouraged me to be more independent and self-reliant, and I appreciate his forthrightness in helping me be more self-critical of both my work and my attitude towards it. I sincerely hope that I have evolved into a more mature and capable scientist under his guidance.

As my co-advisor, Charles Ofria has been a bedrock of support for me. Having him as a co-advisor was cool: it was rather like having an advisor, older brother, and gaming buddy all in one. I may have at times frustrated him with my inability to think in more quantitative terms, and I cannot thank him enough for his patience and willingness to stick with me until I understood difficult concepts that flummoxed me.
Both Charles and Rich have been outstanding role models for me, both as scientists and as people. I hope I can live up to the examples they set; I could not ask for better.

I also thank my graduate committee members, Donald Hall, Robert Pennock, and Douglas Schemske, for their support and advice on my thesis, as well as their willingness to advise on what is admittedly an ambitious project with an unconventional experimental system. Their wisdom and insights, gleaned in both personal meetings and in classes I took with all of them, helped shape my thinking and improve my thesis in numerous ways, as well as challenging me to think more broadly and critically.

I could not go without a very heartfelt thanks to all my colleagues over the years from the Lenski and Ofria labs, for providing a stimulating and supportive environment at both work and play: Brian Baer, Zach Blount, Tim Cooper, Kristina Hillesland and Jason Stredwick, Christina Borland, Christopher Marx, Sean Sleight, Jeff Barrick, Brian Wade, Mike Wiser, Neerja Hajela, Marwa Adawe, Susi Remold, Bob Woods, Dule Misevic, and Elizabeth Ostrowski; Dehua Hang, Wei Huang, Sherri Goings, Matt Rupp, Dave Bryson, Kaben Nanlohy, Art Covert, and Bess Walker. A very special thanks to Jeff Clune, who really helped steer me in the right direction at a time when I was losing hope with my project. Also thanks to Wesley Elsberry for very helpful suggestions and insight while I was preparing my thesis defense.

A special thanks also to the paleontologists from the Geological Sciences department at Michigan State: Robert Anstey, Danita Brandt, Mike Gottfried, and all their students, for letting me participate in their paleontology discussion group. They allowed me to indulge my interests in long-term evolution, and gave invaluable advice and feedback during the course of my work.
While not in the paleo group, special thanks also go out to Jay Sobel, my partner in crime for TAing the undergrad evolution course in Fall 2005, and our intrepid instructors, Angie Roles and Heather Sahli.

I must also express my sincere thanks to my former master’s advisor, Dr. Graham Bell of McGill University. Under his tutelage, my interest in evolutionary biology was nurtured into a love and a lifelong vocation. He has been, and continues to be, a major influence in my scientific interests and world view.

Life is not just about work; it is also about having a good time with friends who are great people. I extend thanks to Uri and Rachel Levine, Ting Hong, the East Lansing Battletech gaming group, and MSU Hillel.

My many relatives have celebrated and supported my academic endeavors: Robert and Barbara Kisilevsky, Laurie and Alan Bultz; Giselle and Sheldon Prushan (and the rest of the Philadelphia clan); Diane and Mort Tuckman; Joyce Yedid and MS. Perinpanathan; and Danny and Suzy Yedid. They have never failed to be inquisitive of what tangent I happened to be on.

My parents, Alan Yedid and Zipporah Kisilevsky, and my brother, Joseph Yedid, have given me their unconditional love and support in all my endeavours and all my time here. Although they did not always agree with my choices, I have never doubted that they stood behind me and always had my best interests at heart. I hope all their love, and all that they have invested in me over the years, is reflected in my accomplishments now and in the future. The same goes for my parents-in-law, Guangzhong Zhao and Yuanhuai Wu, who give me their love and support from half a world away; I only wish I could properly express my gratitude to them.

It is with great sadness that I note the passing, during my years of work and study here, of my beloved grandmother, Bessie Stilman Kisilevsky, to whose memory this thesis is dedicated.
She is now with me in spirit, and I know that she would share our joy and happiness at the completion of this stage of my career, and my marriage this past year.

My final, and most special thanks, goes to my wife, Jing Zhao. She waited patiently during my years here in East Lansing, enduring what seemed like endless days of loneliness and epic long distance phone calls. In spite of all the obstacles and hardships, she has never wavered in her love and her support of me and my work. She has always been a source of strength and inspiration for me in my travails. Against all odds, our relationship has continued to grow and deepen, culminating in a promise of life commitment. And through her, my eyes have opened to the wonders and possibilities of another world.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

CHAPTER 1
SELECTIVE EXTINCTIONS CAUSE DELAYED ECOLOGICAL RECOVERY IN COMMUNITIES OF DIGITAL ORGANISMS
    Abstract
    Introduction
        The digital microcosm
        Instantiating extinction
        Defining recovery
        Species definitions
    Materials and Methods
        Experimental platform
        Environmental configuration
        Experimental methodology
        Analysis
    Results
        Press extinctions: functional degradation during press
        Press extinctions: recovery of functional activity
        Press extinctions: recovery of phenotypic diversity
        Press extinctions: recovery of stably coexisting eco-species
        Pulse extinctions: recovery of functional activity
        Pulse extinctions: recovery of phenotypic diversity
        Pulse extinctions: recovery of stably coexisting eco-species
        Average generation time in press vs. pulse
        Effect of density and community composition on recovery
    Discussion
        Functional recovery
        Diversity recovery
    Future directions

CHAPTER 2
CONVERGENCE, CONSTRAINTS, AND CONTINGENCY IN RE-EVOLUTION OF A COMPLEX TRAIT FOLLOWING EXTINCTION IN COMMUNITIES OF DIGITAL ORGANISMS
    Abstract
    Introduction
    Methods
        Experimental platform
        Experimental methodology
        Contingency of re-evolution of EQU
        Calculations for pre-extinction demographic metrics
        Tracing of lineages and clades
        “Replay” experiments with end-extinction organisms
        Ecological position of pre-extinction ancestors
        Functional genomics of EQU and retention of ancestral functional sites
    Results
        Contingency of re-evolution
        Effect of pre-extinction demography on re-evolution
        Ecological position of pre-extinction ancestors
        Functional degradation during press period
        Genealogical influence on re-evolution of EQU
        “Replay” experiments: differential tendency for qualitative re-evolution of EQU
        “Replay” experiments: differential time to re-evolution of EQU
        Functional genomics of EQU pre- and post-extinction: retention of functional sites
        Functional genomics of EQU pre- and post-extinction: re-use of ancestral sites
        Functional genomics of EQU pre- and post-extinction: overlap of EQU with other functions
        Functional genomics: case studies of re-evolution within a clade
    Discussion
        Measurements of pre-extinction demography do not reflect “deep history” effect
        Genetic architecture of ancestors from end of the extinction episode partially determines evolutionary fate of EQU
        Low ecological position of ancestors and extinction survivors is not an unusual outcome
        Homology is a matter of degree in re-evolved versions of EQU
        No clear evidence for facilitation through functional decoupling and independence
        Possible limitations of analyses
    Summary
    Future directions

APPENDIX 1
    Appendix A1.1. Logic operations in Avida
    Appendix A1.2. Trophic weighting of stably coexisting eco-species
    Appendix A1.3. Progress of an example community through time
    Appendix A1.4. Environment used for subsidiary “large world” experiments
    Appendix A1.5. Examples of ecological recovery dynamics in two subsidiary “large world” experiments
    Appendix A1.6. Dynamics of evolving functional output in control experiments
    Appendix A1.7. Individual examples of extinction and recovery using eco-species with trophic weighting applied
    Appendix A1.8. Examples of functional recovery dynamics at trophic level L3 in two representative pulse extinction communities
    Appendix A1.9. Phenotypic diversity vs. time, averaged over all 100 pulse extinction communities

APPENDIX 2
    Appendix 2.1. Categorization of replicate populations that re-evolved EQU
    Appendix 2.2. Fates of pre-extinction EQU clades during early recovery
    Appendix 2.3. Successful re-evolution of EQU in replay experiments
    Appendix 2.4. Differences in mean rank of re-evolution time between actual ancestor and most abundant survivor replay founders from the same replicate population
    Appendix 2.5. Retention of EQU functional sites between pre-extinction ancestor and end-extinction descendent
    Appendix 2.6. Number of EQU functional sites retained from the ancestral version of EQU that participate in the re-evolved version
    Appendix 2.7. Number of EQU knockout mutations in the pre-extinction ancestor that also affect two or more other functions
    Appendix 2.8. Steps needed to determine retention of EQU functional sites

REFERENCES

LIST OF TABLES

Table 1.1   Summary of recovery of ecological activity, showing time required to each pre-extinction level.
Table 2.1   Association between evolution of EQU before press extinction, and its subsequent evolution or re-evolution.
Table 2.2   Highest-level functions of immediate pre-extinction ancestors of organisms in which EQU re-evolved following the press episode.
Table 2.3   Number of replicates in which the listed function was evolved, and subsequently either retained through, or re-evolved during, the press.
Table 2.4   Number of descendent organisms of the pre-extinction EQU progenitor genotype remaining at the end of the extinction episode.
Table 2.5   Kruskal-Wallis tests for differences in number of end-extinction survivors of the pre-extinction EQU clade among re-evolution classes defined in Figure 2.2.
Table 2.6   Kruskal-Wallis test for differences in successful re-evolutions of EQU, in replay populations seeded with the most abundant end-extinction organism of the pre-extinction EQU clade.
Table 2.7   Kruskal-Wallis test for differences in successful re-evolutions of EQU, in replay populations seeded with the actual end-extinction ancestor of the genotype that re-evolved EQU after the press.
Table 2.8   Results of resampling analyses to assess heterogeneity, among founding organisms, of successful re-evolutions of EQU in replay populations seeded with the most abundant end-press survivor of the pre-extinction EQU clade.
Table 2.9   Results of resampling analyses to assess heterogeneity, among founding organisms, of successful re-evolutions of EQU in replay populations seeded with the actual end-press ancestor of the genotype that re-evolved EQU.
Table 2.10  Kruskal-Wallis table for differences among replicate classes in time needed to re-evolve EQU, in replay populations seeded with the most abundant surviving organism from the pre-extinction EQU clade.
Table 2.11  Kruskal-Wallis table for differences among replicate classes in time needed to re-evolve EQU in replay populations seeded with the actual end-extinction ancestor of the genotype that re-evolved EQU.
Table 2.12  Pairwise comparisons among mean ranks of time needed to re-evolve EQU in replay populations seeded with the most abundant survivor of the pre-extinction EQU clade.
Table 2.13  Pairwise comparisons among mean ranks of time needed to re-evolve EQU in replay populations seeded with the actual end-extinction ancestor of the organism that re-evolved EQU.
Table 2.14  Summary statistics for percentage of ancestral EQU functional sites remaining in focal organisms from the end of the extinction episode.
Table 2.15  One-way ANOVA for differences, between replicate classes, in percentage of ancestral EQU functional sites remaining in focal organisms from the end of the extinction episode.
Table 2.16  Pairwise comparisons among replicate classes of percentage of ancestral EQU functional sites retained through press.
Table 2.17  Summary nonparametric statistics for percentage of EQU knockout mutations that eliminate two or more other functions in focal pre-extinction organisms.
Table 2.18  Kruskal-Wallis table for differences among replicate classes in percentage of EQU knockout mutations that eliminate two or more other functions in focal pre-extinction organisms.
Table 2.19  Two major modes of acquisition of EQU.

LIST OF FIGURES

Figure 1.1   Schematic of the cascading trophic interactions used in this study.
Figure 1.2   Plots of percentage of replicate populations expressing a given trophic function at particular time points of interest.
Figure 1.3   Plots of recovery of functional output following press extinctions for trophic levels L0-L3.
Figure 1.4   Examples showing heterogeneous recovery dynamics in four press extinction replicates.
Figure 1.5   Phenotypic diversity vs. time, averaged over all 100 press extinction replicates.
Figure 1.6   Phenotypic recovery vs. time in four illustrative press extinction communities.
Figure 1.7   Number of stably coexisting eco-species vs. time.
Figure 1.8   Number of stably coexisting eco-species weighted by trophic position.
Figure 1.9   Plots of functional recovery following pulse extinctions for trophic levels L0-L3.
Figure 1.10  Comparison of functional recovery in press vs. pulse extinction communities over the first 3,000 updates of recovery.
Figure 1.11  Comparison of early stages of recovery of phenotypic diversity in pulse vs. press communities.
Figure 1.12  Comparison of recovery of stably coexisting eco-species between press and pulse experiments.
Figure 1.13  Comparison of recovery of trophic-weighted stably coexisting eco-species between press and pulse experiments.
Figure 1.14  Comparison of average generation time in pulse vs. press communities.
Figure 1.15  Comparison of functional recovery in low-density recovery experiments.
Figure 2.1   Schematic illustration of categorization of replicate populations based on re-evolution of EQU in the press extinction experiments.
Figure 2.2   Schematic illustrations of types of re-evolution of EQU.
Figure 2.3   Box plot of successful re-evolutions of EQU, in replay populations seeded with the most abundant end-press genotype of the pre-extinction EQU clade.
Figure 2.4   Box plot of successful re-evolutions of EQU, in replay populations seeded with the actual end-press ancestor of the genotype that re-evolved EQU.
Figure 2.5   Plot of multiple comparison tests for ranks of time needed to re-evolve EQU in replay populations seeded with the most abundant survivor of the pre-extinction EQU clade.
Figure 2.6   Plot of multiple comparison tests for ranks of time needed to re-evolve EQU in replay populations seeded with actual ancestor of the pre-extinction EQU clade.
Figure 2.7   Plot of multiple comparison tests among replicate classes for percentage of ancestral EQU functional sites retained through press.
Figure 2.8   Genotype-phenotype map of the last pre-extinction ancestor to perform EQU in the lineage of the re-evolved EQU genotype from replicate 8100 (Category IB).
Figure 2.9   Execution map of the last pre-extinction ancestor to perform EQU in the lineage of the organism that re-evolved EQU in replicate 8100 (Category IB).
Figure 2.10  Genotype-phenotype map of the re-evolved EQU genotype from replicate 8100 (Category IB).
Figure 2.11  Execution map of the re-evolved EQU genotype from replicate 8100 (Category IB).
Figure 2.12  Genotype-phenotype map of the immediate pre-extinction ancestor of the re-evolved EQU genotype from replicate 7000 (Category IA).
Figure 2.13  Execution map of the immediate pre-press ancestor of the re-evolved EQU genotype from replicate 7000 (Category IA).
Figure 2.14  Genotype-phenotype map of the re-evolved EQU genotype from replicate 7000.
Figure 2.15  Execution map of the re-evolved EQU genotype from replicate 7000.

CHAPTER 1

SELECTIVE EXTINCTIONS CAUSE DELAYED ECOLOGICAL RECOVERY IN COMMUNITIES OF DIGITAL ORGANISMS

ABSTRACT

The issue of delays in recovery from mass extinctions is getting increased attention in both empirical and modeling studies, with a key question being how the mechanisms of extinction and diversification affect the recovery process. We subjected communities of digital organisms with a cross-feeding trophic structure to instantaneous pulse extinctions, where survivors were chosen at random, and to prolonged press extinctions involving massive environmental alteration through a period of low resource availability. We found that functional activity at the base of the trophic pyramids in the communities recovers more rapidly than activity at higher levels, with the most extensive delays seen at the top level. These results are qualitatively similar to those observed in a number of paleontological studies. On all trophic levels, communities recovered from pulse extinctions, on average, markedly faster than communities subjected to the low-resource press episode. Also, post-press communities often either re-evolved top-level function at greatly reduced levels of expression, or failed to re-evolve it at all, so that recovery to pre-extinction levels of functional activity was generally not achieved in the allotted time.
When diversity was measured by the number of ecotypes that coexisted stably in an ecological setting, there was on average no discernible difference between press and pulse communities, though press communities showed a noticeable delay when these ecotypes were weighted by their ecological functions. Organisms also tended to evolve greatly reduced generation times during the press episode. In follow-up experiments where we seeded an empty environment with organisms taken from the end of the press episode, or with the shortest-generation organisms from the pre-extinction population, both showed a slower initial rate of ecological recovery than in experiments using the randomly-chosen pulse survivors. Organisms sampled from the end of press episodes had the slowest initial rate of recovery of all three, and the delay became amplified at higher trophic levels. These data indicate that adaptation during the press episode degrades genomic potential for evolvability, hindering the ability to re-evolve levels of function comparable to those that existed prior to the extinction. In spite of the exotic nature and relative simplicity of our system, the results we obtained have a number of parallels with patterns from the paleontological record. We suggest that at least some delayed recoveries from mass extinction involve the need to both re-evolve functional diversity and re-construct ecology lost during the extinction. Further, in the case of extinctions due to an extended disturbance, adaptive changes that hinder subsequent recovery must be overcome.

INTRODUCTION

At least five times during the history of life on Earth, mass extinctions have resulted in the loss of a large fraction of the planet’s biota (Hallam and Wignall 1997). These events are also notable for the new opportunities they create for surviving species, and new taxonomic and ecological patterns that emerge as a result of recovery and rediversification (Erwin 2001).
Some episodes of extinction and recovery such as the Triassic/Jurassic had relatively little long-term impact, while others have resulted in drastic shifts in the course of evolution, the end-Permian being the most extreme known example (Erwin 1998a,b). Increasing attention is therefore being paid to the dynamics of recovery from mass extinctions, particularly those variables that might promote or impede recovery. A wide range of evidence, from geochemical studies to fossil occurrences and time series analyses, seems to indicate that there are delays in the recovery of biological diversity that can span millions of years (D’Hondt et al. 1998, Looy et al. 1999, Benton et al. 2004). The idea that there are lags in recovery has gained currency, with delays being observed in both empirical investigations using fossil compendia (Kirchner and Weil 2000a,b; Kirchner 2002) and in a recent modeling study (Sole et al. 2002).

The dynamic of such recoveries remains a contentious issue. Early equilibrium-based models suggested that recovery after a mass extinction should rapidly refill ecological niches lost during the extinction episode (Carr and Kitchell 1980, Sepkoski 1984). However, these models have been criticized as being uninformative with respect to relevant biological and ecological mechanisms, and inconsistent with empirical work that suggests a more complex dynamic involving active rebuilding and re-evolution of collapsed ecological communities, rather than simple refilling of emptied ecospace (Hewzulla et al. 1999; Kirchner and Weil 2000b). Erwin (2001) suggested a need for process-based models that account for “synergistic interactions between ecosystem components” that are presumed to result in a positive feedback process. Such a dynamic could produce a rapid burst of diversification after a lag phase that represents the time needed for basic ecological interactions to be re-established.
These views have their roots in the differing niche concepts of Simpson (1944, 1953), who favoured a static niche-filling model, and Whittaker (1977), who postulated a self-augmenting phenomenology for diversity increase (Schluter 2000). In recent years, the latter view has garnered increased attention from some palaeontologists (e.g. Kirchner and Weil 2000b; Erwin 2001). However, empirical studies using fossil compendia have not yet resolved the issue. Results of some time series and power spectrum analyses of these compendia (Hewzulla et al. 1999, Kirchner and Weil 2000a,b; Kirchner 2002), from which delayed recovery is inferred, are attributed to the active rebuilding of collapsed ecological structures (as opposed to simply radiating into vacated niches). This rebuilding is held to create new evolutionary opportunities that spur further diversification. However, a similar study, using a compendium where occurrence times are adjusted for the incompleteness of the fossil record, has reported that delayed recoveries are artefactual, and largely disappear when the incomplete record is accounted for (Lu et al. 2006).

This paper investigates ecological dynamics of extinction and recovery in a digital model system, following a massive environmental perturbation. Using this type of model system offers a number of clear advantages. The records of key quantities and metrics are much more complete than what is possible with the geological fossil record, and free of the preservational artifacts that complicate quantitative analysis of fossil time series data. A digital system also offers the opportunity to bring a manipulative experimental approach to the problem of studying extinction and recovery. One can “replay life’s tape” (Gould 1989, Travisano et al. 1995, Yedid and Bell 2002), making changes at specific time points in what would otherwise be identical replicates.
This manipulability contrasts with the serial “natural experiments” of the geological record, where each “experiment” is unreplicated, and not independent of the previous ones. In these experiments, we examine only a single round of extinction and recovery, but do so with many independent replicates.

The digital microcosm

We use the Avida digital evolution system (Lenski et al. 2003, Chow et al. 2004, Ofria and Wilke 2004) as our platform for these experiments. Evolving systems of this type have been used to study a number of fundamental problems in evolution (Lenski et al. 2001, Yedid and Bell 2001, Yedid and Bell 2002, Ofria and Cooper 2002, Lenski et al. 2003, Chow et al. 2004, Misevic 2006). Avida maintains and monitors populations of digital organisms, which are short, self-replicating computer programs written in an assembly-type language. (Any program that can be written with conventional, commercial computer assembly languages can also be written with this language.) The organisms execute the programs encoded by their genomes, including commands that allow them to copy the instructions in their genomes and divide to produce a daughter organism. The copy instruction duplicates a single instruction from parent to daughter. During the duplication process, the instruction being copied has a probability of being miscopied, and changed to a different instruction from the one in the parent organism’s genome. Mutations from one instruction to any other are equally likely. In addition, there is a probability that, on cell division, a random instruction will be inserted into, or deleted from, the daughter cell. These kinds of genomic mutations can indirectly affect the phenotype of a digital organism, including its ability to self-replicate or perform other computational functions. Mutations can also be neutral in their phenotypic effects.
Thus, there is a genetic basis for adaptation and speciation (insofar as that can be defined for asexual organisms, not just in Avida, but for real biological asexuals). Since the range of variation possible in Avida organisms is indefinite (certainly astronomically large), the system is capable of open-ended evolution. The organisms in this study are completely asexual, but the focus here is on radiation into groupings that are functionally and phylogenetically distinct, followed by loss and regeneration of those groupings. These processes are relevant to both sexual and asexual organisms.

Each digital organism occupies a cell in a rectangular memory lattice, the size of which sets the maximum population size. When an organism divides, the daughter organism is placed in one of the eight immediately neighbouring cells, killing any previous occupant of that cell. Organisms that replicate faster have a selective advantage and can more quickly overwrite slower replicators. In Avida, organisms can accelerate the execution of their genomic instructions if they evolve the ability to perform certain logic functions (Lenski et al. 2003; see Appendix A1.1 for more details). All organisms receive a basal number of CPU cycles (the organisms’ “energy”) that enables their programs to run. If an organism can perform one or more logic functions, it metabolizes a corresponding resource into additional CPU cycles that accelerate the execution of its genome. This creates differences in the rates at which organisms execute their genomes, depending on which functions they perform and which resources can be accessed for additional energy. The values of the energy rewards an organism receives for performing certain computations and consuming certain resources can be set by the experimenter.
(Although Avida communities are instantiations of Darwinian evolution, metabolism is simulated in Avida, since many of its features are either abstract simplifications of, or do not correspond directly to, real-world biochemical metabolisms.)

The virtual environment in our experiments has depletable resources, such that an organism’s access to a given resource is reduced as the resource is consumed and depleted by competitors (Cooper and Ofria 2002, Chow et al. 2004). Some or all resources may be supplied exogenously by the system, while others may arise as by-products. This process permits the construction of cross-feeding, co-dependent environments with trophic interactions. Although the mappings of functions to resources must be specified in advance by the user, the particular organisms that use those resources, as well as the manner in which they do so, are left to evolution. We have set up biotic interactions to act in a simple facilitative manner: through “metabolism”, organisms consume resources and generate by-products that can themselves serve as resources for other individuals. Therefore, the disappearance of organisms producing certain resources results in both a loss of realized ecological breadth, and also in the extinction of other organisms dependent on those resources. Schematics of the interactive networks we use are shown in Figure 1.1. A key feature of these interactive networks is that they are bottom-heavy. For example, the simple logic function NOT is linked to three higher-level functions (AND, ORN, and OR), and must be performed three times (each time consuming a unit of the resource mapped to NOT) in order to produce one unit of each resource mapped to those higher-level functions. Similarly, three units of the AND resource must be consumed to produce one unit each of the ANDN, NOR, and XOR resources.
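The 3:1 conversions just described can be illustrated with a small sketch. This is not Avida code: the function name, the dictionary layout, and the assumption that every unit supplied is actually consumed are ours, chosen only to show how available resource shrinks from one trophic level to the next.

```python
# Illustrative sketch only (not Avida's implementation).
# Assumption: all names here are hypothetical; only the 3:1 ratio and the
# NOT -> AND/ORN/OR and AND -> ANDN/NOR/XOR links come from the text.
CONVERSION = 3.0  # units of a resource consumed per unit of EACH by-product

BYPRODUCTS = {
    "NOT": ["AND", "ORN", "OR"],
    "AND": ["ANDN", "NOR", "XOR"],
}

def byproduct_yield(resource, units_consumed):
    """Units of each by-product generated when `units_consumed` units
    of `resource` are metabolized."""
    per_product = units_consumed / CONVERSION
    return {name: per_product for name in BYPRODUCTS.get(resource, [])}

# Consume 90 units of the NOT resource, then all of the resulting AND resource.
level_up_one = byproduct_yield("NOT", 90.0)
level_up_two = byproduct_yield("AND", level_up_one["AND"])
print(level_up_one)
print(level_up_two)
```

Under these assumptions, each step up the network leaves one third as much of any single resource as was consumed below it, which is the bottom-heavy property the text emphasizes.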
This arrangement seems a reasonable emulation of a directed energy transfer system with thermodynamically decreasing efficiency between levels (Lindeman 1942). This setup produces a trophic pyramid where most of the total functionality of the community (and thus produced energy) is at the bottom, and decreases progressively at higher levels. Such an arrangement is not only ecologically realistic; we have found it is important for seeing certain extinction effects. Alternate environment setups that use either direct 1:1 conversions, or have all resources provided at equal inflow rates without any ecological interactions, produce an inverted trophic pyramid structure where the middle and higher levels have more total functionality, and produce more energy. Populations with an inverted trophic structure usually do not respond to the perturbation when resource inflows are lowered (see Methods). None of these resources have been set to act as poisons, and reciprocal cross-feeding between levels (“downward” movement of resources from higher to lower) has not been implemented, although these are available options for future work. Usually, the digital organisms evolve the ability to perform more than a single function (Cooper and Ofria 2002, Chow et al. 2004), which in the current context means that a given organism may sometimes use its own by-products as additional resources.

[Figure 1.1 diagram. Levels from top to bottom: top consumer (L4): EQU (16); 2nd level consumer (L3): ANDN (4), NOR (8), XOR (8); 1st level consumer (L2): AND (2), ORN (2), OR (4); primary production (L1): NOT (1), NAND (1). Conversion labels: 3 units.]

Figure 1.1. Schematic of the cascading trophic interactions used in this study. Resources are associated with each of the logic functions shown. The reward value for performing the particular function is shown in parentheses next to the function name. A line connecting resources signifies that the lower-level function consumes the incoming resource and produces a by-product that is available for the higher-level function. Only the resources associated with the lowest-level functions NOT and NAND are provided exogenously. Example conversion factors are shown to the right of one of the inflowing resources and on the connection arrows; in this case, three units of the NOT resource are required to produce one unit each of the resources for AND, ORN, and OR. Similarly, three units of the AND resource are required to produce one unit each of the ANDN, NOR, and XOR resources.

We do recognize, of course, that there are also shortcomings inherent in a digital system, such as the absence of biogeographic environmental variability (Jablonski 1998), the difficulty of applying conventional classification concepts, and the relative simplicity of the kinds and number of interactions among members of the biota, as well as between the biota and the physical environment. As mentioned previously, the digital organisms used here are strictly asexual. While introducing recombination into digital organisms is known to affect their genetic architecture and ability to adapt to rapidly changing environments (Misevic et al. 2006, Misevic 2006), we do not investigate recombination here. We prefer instead to first understand the dynamics for the simpler asexual system before moving on to the more complex reticulate dynamics of a sexual system. Also, many of the key analytical tools that have been developed for Avida are implemented only for asexual organisms. For example, tracing lines of descent is straightforward for asexual organisms, but not even applicable to sexual ones.

Instantiating extinction

We examine two kinds of extinction: press and pulse. In geological terms, pulse extinctions happen with sufficient speed and power that no adaptive change occurs during the extinction episode, while press extinctions occur over a longer period that (theoretically) allows for an adaptive response in affected populations (Erwin 1998a).
Pulse extinctions are implemented here as an instantaneous mass culling of organisms from the population, leaving a small number of survivors selected at random from the pool of viable organisms. They are simple “field of bullets” extinctions (Raup 1991) that do not directly induce any adaptive response in the survivors (though survivors may then radiate to fill any vacated niches). Press extinctions, on the other hand, are provoked by lowering the inflow rate of resources to near-starvation levels for a protracted period of time (a press). This episode creates conditions favouring different adaptations in the community, without a forced culling of the population. A press episode often lowers diversity via the altered selective pressures, at least over the short term. The closest biological analogue is a collapse of primary production. Such food web collapse through disruption of primary production has been implicated in several of the major extinctions in Earth history (Rhodes and Thayer 1991, Martin 1996, Hallam and Wignall 1997, Benton and Twitchett 2003). Previous simulation studies (Amaral and Meyer 1999, Solé et al. 2002) induced extinction through effects on the primary producer level. Our model also differs from previous ones (e.g. Solé et al. 2002) that emphasize cascading effects based on direct, targeted removals at the primary producer level. Here, we trigger a major alteration of the abiotic environment and let the ecosystem adjust and evolve entirely on its own. Ecological recovery is then later initiated by restoring resource inflows to pre-extinction levels. It should be emphasized that we do not (and cannot) compare different causes of mass extinction and how these might apply to the geological record. Instead, our focus here is on the community-level effects of the extinction and recovery processes, particularly the latter.
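The two extinction types can be contrasted in a minimal sketch. It is an illustration under stated assumptions, not the Avida mechanism: the organism records (dictionaries with a `viable` flag) and both function names are hypothetical, while the numerical values used in the example (3600 organisms, 4 survivors, an inflow of 200 units per update, a two-orders-of-magnitude reduction lasting 5000 updates from update 100,000) are those reported in the Materials and Methods.

```python
import random

def pulse_cull(population, n_survivors, rng):
    """Pulse extinction: keep `n_survivors` organisms drawn uniformly at
    random from the viable members of the population (no selection)."""
    viable = [org for org in population if org["viable"]]
    return rng.sample(viable, n_survivors)

def press_inflow(update, base_inflow, press_start, press_length,
                 reduction_factor=100.0):
    """Press extinction: resource inflow is divided by `reduction_factor`
    during the press window and equals `base_inflow` otherwise."""
    if press_start <= update < press_start + press_length:
        return base_inflow / reduction_factor
    return base_inflow

# Hypothetical population in which every tenth organism is non-viable.
rng = random.Random(1)
population = [{"id": i, "viable": i % 10 != 0} for i in range(3600)]
survivors = pulse_cull(population, 4, rng)

before = press_inflow(99_999, 200.0, 100_000, 5_000)
during = press_inflow(102_000, 200.0, 100_000, 5_000)
after = press_inflow(105_000, 200.0, 100_000, 5_000)
print(len(survivors), before, during, after)
```

The sketch makes the key asymmetry explicit: the pulse acts on the population directly and leaves the environment unchanged, whereas the press acts only on the environment and leaves the culling to selection.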
Defining recovery

Palaeontologists and palaeoecologists use a number of different definitions of recovery, which focus on different aspects of the recovery process and employ different sources of evidence (Erwin 2001). Most commonly employed are measures based on counts of the number of occupants of a particular taxonomic level and their stratigraphic spans. However, taxonomic ranks above species are not defined for the digital system employed here. We have examined the following aspects of the system in order to gauge recovery:

1) Total functional activity at each trophic level over the course of the experiment: This measures the total level of expression for each of the functions on a particular trophic level. For example, the expression levels for the functions NOT and NAND are summed together to yield the total amount of “primary producer” activity. This measure is calculated from the data produced by the Avida software during the evolutionary phase of a run. It is best understood as a measure of the efficiency of resource draw-down in a chemostat-type environment where resources are not spatially structured. Consumed resources provide “energy” for organisms that execute functions mapped to those resources. Through metabolism, this energy is in turn converted to resources that cascade up to organisms that perform functions on the next higher trophic level. With the trophic arrangement we use, the amount of available resource decreases at every level, correspondingly lowering the amount of potential energy that can be gained by organisms with higher-level trophic functions, and thus potential draw-down at those trophic levels. Thus, functions at the lower trophic levels have the highest total functional output, and the top level the least. This metric is most similar to the use of paleogeochemical data to infer historical ecosystem activity (usually productivity), but we measure it directly in real time.
Functional activity is a “common currency” for all biological activity in Avida, whereas in living organisms activity must be measured in different ways (e.g. evolved oxygen can be used to measure activity rates in photosynthetic organisms but not in heterotrophs). Total functional activity is also probably a better measure than total number of individuals expressing that function. We should note that a given organism can often perform multiple functions on different trophic levels, so functions are not independent of each other in their occurrence and output level. Similar complications occur in real ecosystems, such as omnivores that consume both plants and animals. Further, most Avida communities do not have strict representatives at each trophic level. For example, many organisms in a population have some level of “primary producer” activity, in addition to other functions they perform. Since Avida allows for the evolution of organisms with very high output levels of particular functions, a high level of primary production, for example, can potentially be maintained by a relatively small population. Conversely, a large number of organisms may have small per-capita output of a particular function, so the total output of that function may not be high even if many organisms perform it.

2) Number of phenotypes representing distinct functional combinations present in the community: Function expression is measured in a binary manner, such that two genotypes that perform the same set of functions, regardless of expression level and evolutionary history, are scored as the same phenotype. Groupings based on phenotypes are often not monophyletic, since organisms from multiple clades may independently evolve the same set of functions, though many organisms with the same phenotype will often share a recent common ancestor (especially for sets comprising several or higher-level functions).
This metric may underestimate diversity, because two genotypes that perform the same set of functions, but at different output levels and deriving different relative benefit from each, are combined into one group. It may also overestimate meaningful diversity: owing to the rather high rates of mutation used in these experiments, many new genotypic and phenotypic variants are constantly being introduced and going extinct. Not all phenotypes in a population can necessarily stably coexist in a constant ecological setting, so we employ the following additional metrics.

3) Number of stably coexisting eco-species: This measure is probably closest to the taxonomy-based measures used in most diversity-through-time studies. However, it is rather constrained by the small number of species possible in the digital system, and may therefore lack resolving power. Moreover, it is laborious to determine (see “Species definitions” below), so the number has been determined only for time points of particular interest, including right before the onset of the press episode, the end of the press episode, and after a recovery time equal (in absolute time) to that for pre-extinction evolution. There is a connection between the number of stably coexisting eco-species and phenotypes, in that a community with more phenotypes also typically supports more stable eco-species (which are a subset of the existing phenotypes) if mutation is prevented. An environment that does not support great phenotypic diversity also supports fewer stably coexisting eco-species.

4) Number of eco-species weighted by their approximate trophic position. A stable eco-species’ trophic position is determined based on the level of expression of functions assigned to particular trophic levels, thereby correcting for “omnivory”, or use of resources across multiple levels (Vander Zanden and Rasmussen 1996).
Thus, an organism that performs high-level functions, but also has significant expression of lower-level functions, will find its overall trophic position lower than if it is weighted simply by the highest-level function it performs. Base replicators (organisms with no functionality beyond self-replication), where they feature as members of an ecologically-stable community, are one-fold weighted. We introduce this correction because two communities (either from different replicates, or from the same replicate compared at different time points) may contain an identical number of eco-species, but be dissimilar in their ecological breadth. An example calculation of the metric for sample communities is given in Appendix A1.1.

Species definitions

We use as our working definition for species the number of stably-coexisting eco-species during periods of high resource inflow. Given the difficulties in applying conventional species concepts to clonal organisms, asexual species may be defined based on their ecological characteristics. Ecotypic divergence allows coexistence of monophyletic clades, each of which has an independent evolutionary trajectory (Cohan 2001, 2002; Rozen et al. 2005). Following Cooper and Ofria (2002), we isolate these eco-species by allowing evolution to proceed for a particular time, then turning off mutation, such that only the most-fit types that can coexist owing to minimal overlap in resource use are able to persist. In order to maintain this definition even during the press period (i.e. to find how many stably coexisting eco-species are present at the end of the press episode, right before restoration is initiated), resources are restored to pre-extinction levels in concert with turning off mutation. Otherwise, proliferation of organisms that are well-adapted to low-resource conditions occurs even without new mutations (data not shown), rather than isolation of those types best able to take advantage of the restored environment.
This method is less parameter-dependent than the species-clustering method of Chow et al. (2004). Eco-species can include very recently diverged organisms that coexist based on small ecological differences; an example is shown for the early radiation and end-press community portrayed in Appendix A1.2. A corollary of this fact is that this diversity measurement does not record losses of phylogenetic history resulting from niche invasions. Due to both the small number of ecological functions used here, and lack of spatial heterogeneity of resources in this version of Avida, clades to which the stably coexisting eco-species belong may be purged of genotypic diversity not only by periodic selection events within an eco-species’ own clade of origination (Cohan 2002), but also by offshoots from other clades whose niches overlap perfectly with a previous eco-species’ niche. For example, a given community compared at different time points may contain what appear to be identical sets of eco-species, but sequence comparisons or phylogenetic investigations can reveal when an eco-species from the earlier time period has been replaced by a “doppelganger” that expresses the same set of functions, yet is more closely related to one of the other eco-species. This is a key difference from Cohan’s (2002) ecotype definition. Another drawback of the eco-species method is that it does not permit estimation of relative species abundance, because the relative abundances of eco-species in the ecologically stable communities may not reflect the situation in the non-equilibrium, evolving community from which they are sampled (though the question is moot if there is any frequency-dependence in eco-species abundance).

MATERIALS AND METHODS

Experimental platform

All experiments are performed on a Beowulf cluster made up of Intel Pentium III, IV and AMD Athlon processors, using Avida v. 2.1. Configuration files are available at http://myxo.css.msu.edu/papers/.
Environmental configuration

All replicates are performed with populations of maximum size N=3600. An organism can die either when it is replaced by new-born individuals, or when its total instructions executed exceed 20 times its genome length (this prevents non-reproductive organisms from persisting indefinitely). The copy mutation rate is set at 0.005 per instruction copied; insertion/deletion mutations occur at a rate of 0.05 per division. Genome size is not constrained in these experiments, although a one-fold limit is placed on the genome length differential between parent and offspring cells in order to avoid certain complications (Misevic et al. 2006). Newly-born organisms replace randomly-chosen organisms in their immediate 8-cell neighbourhood, giving rise to spatially structured populations. Logic operations that are rewarded are those described in Lenski et al. (2003), with rewards scaled as a function of computational difficulty as described in Cooper and Ofria (2002) and Chow et al. (2004). Resources are globally available to all organisms, with no spatial structure. Only resources corresponding to the functions NOT and NAND are provided exogenously, at inflow rates of 200 units per update for each resource, following Cooper and Ofria (2002). All other resources arise as by-products of function execution, according to the stoichiometric scheme shown in Figure 1.1. Organisms can obtain a maximum of 25 units of resource or 0.25% of the total concentration (whichever is smaller) per completed computation; the latter prevents negative values arising from finite time steps.

Experimental methodology

We perform a total of 100 replicates for each treatment (except as indicated otherwise). Each replicate is seeded with a single copy of a handwritten “default” Avida ancestor, of genome length 50 instructions. The initial conditions for each replicate differ only in the value of the seed supplied to Avida’s random number generator.
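The mutation and resource-uptake rules given under the environmental configuration can be sketched as follows. This is a simplified illustration, not Avida source code. The rates (0.005 per instruction copied, 0.05 per division, the cap of 25 units or 0.25% of concentration) come from the text; everything else is an assumption made for the example: the 26-letter alphabet standing in for the instruction set, the function names, and the choice to apply the 0.05 rate separately to insertions and to deletions.

```python
import random

COPY_MUT_RATE = 0.005  # per instruction copied (from the text)
INDEL_RATE = 0.05      # per division (from the text); applying it separately
                       # to insertion and deletion is an assumption
# Hypothetical stand-in alphabet for the instruction set.
INSTRUCTION_SET = [chr(ord("a") + i) for i in range(26)]

def copy_genome(parent, rng):
    """Copy a genome one instruction at a time; a miscopied instruction
    becomes any OTHER instruction with equal probability."""
    child = []
    for inst in parent:
        if rng.random() < COPY_MUT_RATE:
            inst = rng.choice([x for x in INSTRUCTION_SET if x != inst])
        child.append(inst)
    return child

def divide_mutations(child, rng):
    """On division, possibly insert one random instruction and/or delete one."""
    child = list(child)
    if rng.random() < INDEL_RATE:
        child.insert(rng.randrange(len(child) + 1), rng.choice(INSTRUCTION_SET))
    if child and rng.random() < INDEL_RATE:
        del child[rng.randrange(len(child))]
    return child

def max_uptake(concentration):
    """Resource obtainable per completed computation: 25 units or 0.25% of
    the total concentration, whichever is smaller."""
    return min(25.0, concentration * 0.25 / 100.0)

rng = random.Random(42)
ancestor = [rng.choice(INSTRUCTION_SET) for _ in range(50)]  # length 50, as in the text
daughter = divide_mutations(copy_genome(ancestor, rng), rng)
print(len(ancestor), len(daughter), max_uptake(200.0), max_uptake(20000.0))
```

With a genome of 50 instructions and a copy mutation rate of 0.005, the expected number of copy mutations per replication is 0.25, so most daughters are exact copies of their parent.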
The following types of treatments are performed:

1) Pre-extinction evolution. Each replicate runs for 100,000 updates (an arbitrary unit of time where each individual in the population executes on average 30 instructions) of evolutionary time. This evolutionary phase is followed by an ecological phase of 100,000 updates, where mutation is turned off, allowing only stably coexisting ecotypes (as defined above) to co-exist. We repeat the replicates for the pre-extinction evolutionary phase with abbreviated run times of 1000, 5000, and 50,000 updates in order to further elucidate particular aspects of the pre-extinction radiation.

2) Press/recovery treatment. Each replicate runs for 100,000 updates exactly as above, including using the same initial random number seed. Resource inflows are then lowered by two orders of magnitude for 5000 updates, and then restored to pre-extinction levels for a subsequent 100,000 updates of evolution, followed by a 100,000 update ecological phase. These runs are also repeated with abbreviated recovery times of 1000, 5000, and 50,000 updates.

3) Press-only treatment. Each replicate runs for 100,000 updates exactly as above, including the same initial random number seed. Resource inflows are then lowered by two orders of magnitude for 5000 updates, and then restored to pre-extinction levels for 100,000 updates. However, in this treatment, mutation is turned off in concert with restoration of resource inflows, for reasons described above (see Introduction).

4) Uninterrupted evolution. Each replicate runs for 205,000 updates with no press treatment, followed by a 100,000 update ecological phase. This treatment serves as a control to see how evolution would have progressed in the absence of any perturbation.

5) Pulse extinctions. Each replicate runs for 100,000 updates, at which time an instantaneous mass cull of the population is performed, with no alteration of resource inflows.
Only viable organisms are chosen as survival candidates, and the survivors are picked at random from this candidate pool. This pulse extinction is followed by 100,000 updates of recovery and a 100,000 update ecological phase. We perform culls to 4 individuals from the pre-extinction size of 3600 organisms (99.9% extinction).

We also perform a few “large world” experiments that feature a total community size of N=10,800 and where 30 logic functions are rewarded (out of a possible set of 77), with an environmental configuration as shown in Appendix A1.4. These supplementary experiments examine the effect of increasing the size and environmental complexity of ecosystems on the general results obtained in the main experiments. The supply rates of inflowing resources are adjusted to scale with the larger population size, although mutation rates and the absolute time courses of the experiments remain the same. We cannot perform many such experiments, owing to the much longer computation time and much larger usage of computer resources they require.

In addition to the diversity metrics described above, we also examine the behaviour of the average generation time (total number of instructions required for an organism to execute its entire code and thereby replicate itself) for the press and pulse extinctions. This approach is taken in order to verify and characterize the adaptive response of organisms during the press episode. We expect that the low-resource press will favour organisms with shorter generation times, since they can make the most efficient use of scarce resources. The pulse extinctions, however, should give rise to more heterogeneous responses, since the organisms picked for survival might have generation times anywhere around the pre-extinction community average.

Analysis

Where our graphs show trajectories averaged over multiple replicates of a treatment, we present approximate upper and lower 95% confidence series around the average trajectory.
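One common way to build such approximate confidence series is sketched below. The text does not state how the intervals were computed, so the normal approximation used here (mean plus or minus 1.96 standard errors at each time point) is an assumption, as are the function name and the toy data.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_series(replicates, z=1.96):
    """Given equal-length trajectories (one per replicate), return a list of
    (lower, mean, upper) tuples, one per time point.
    Assumption: normal-approximation interval, mean +/- z * standard error."""
    series = []
    for values in zip(*replicates):  # all replicates at one time point
        m = mean(values)
        half_width = z * stdev(values) / sqrt(len(values))
        series.append((m - half_width, m, m + half_width))
    return series

# Toy data: three replicates observed at two time points.
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
bands = confidence_series(toy)
print(bands)
```

Because each time point is treated separately, a series built this way says nothing about the dependence between successive time points within a replicate.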
When these intervals exclude the point estimates for another treatment at a particular time, then those treatments are judged to be significantly different. Although time points are not independent in the series here, the differences are clear and compelling at many (if not most) time points.

RESULTS

Press extinctions: functional degradation during press period

The press period often results in functional degradation of most of the organisms in the population, which is more pronounced on higher trophic levels. Figure 1.2 shows the percentage of replicate populations containing any organisms that can perform the trophic functions at particular time points. The simple functions NOT and NAND tend to be robust through the press episode, or are lost only transiently and re-evolve quickly. These two functions are present in all 100 replicate populations just prior to the extinction episode (Figure 1.2a). At the end of the extinction episode, NOT is performed in all populations, and NAND in 99 of 100 populations (Figure 1.2b). The Level 2 functions ORN and OR also evolve before the extinction episode in all replicates, and are present at the end of the extinction episode in 92 of 100 and 61 of 100 populations, respectively (Figure 1.2a, 1.2b). More difficult functions (including the low-value function AND) are present in fewer than half the populations at the end of the extinction episode. The most difficult functions, XOR and EQU, evolve in 88 of 100 and 87 of 100 populations before the onset of extinction. Of these, only 9 of 88 (for XOR) and 8 of 87 (for EQU) populations retain these functions through the extinction episode (Figure 1.2b). At the end of recovery, all replicate populations have recovered 7 of 9 functions. However, only 61% and 64% of the replicates have organisms that perform XOR and EQU respectively, compared to 88% and 87% of replicates prior to extinction (Figure 1.2c).

Figure 1.2.
Plots of percentage of replicate populations expressing a given trophic function at particular time points of interest. Bars are colour-coded according to the trophic level of the listed functions: green bars, L0; blue bars, L1; purple bars, L2; red bar, L3. IMAGES ARE PRESENTED IN COLOUR. a) Immediately prior to extinction episode (~100,000 updates). b) At end of press episode (~105,000 updates). c) At end of recovery period (205,000 updates).

[Figure 1.2, panels (a)-(c): bar charts. X axis: Function (NOT, NAND, AND, ORN, OR, ANDN, NOR, XOR, EQU). Y axis: Percentage of replicates with function.]

Press extinctions: recovery of functional activity

We next address the dynamics of total activity at each trophic level before and after the press treatments. Clear differences in the recovery of total functional activity are evident between trophic levels, designated L0 through L3 (Figure 1.3a-b). The two lowest levels, L0 and L1, require on average 28,000 and 32,000 updates before recovery to pre-extinction output levels is attained. L2 experiences a much longer delay of about 85,000 updates, on average, before recovery to the immediate pre-extinction value. The curve for L3, which contains only the difficult function EQU, suggests that recovery for this level is highly heterogeneous, as evinced by the shape of the curve, the extremely wide confidence intervals, and the absence of full recovery, on average, even after 100,000 updates of restored resource inflow. The time required for EQU to re-evolve ranges from only a few tens to tens of thousands of updates among the 52/100 communities where this highest-level function re-evolves.
These patterns persist even when only those replicates that successfully re-evolved EQU are considered (Figure 1.3b), suggesting that even when EQU does re-evolve, there is often a failure to re-attain pre-extinction levels of expression in the allotted time.

Figure 1.3. Plots of recovery of functional output following press extinctions for trophic levels L0-L3. Data in panel (a) are averaged over all 100 replicates. Data in panel (b) are averaged over only the 52 replicates in which the function EQU was lost and re-evolved. Horizontal lines originating at immediate pre-extinction values indicate approximate times at which functional outputs, on average, re-attained their pre-extinction values. In all panels, the Y axis is total functional output (in thousands of executions), and the X axis is time (in thousands of updates). Trophic levels: L0, green curve; L1, blue curve; L2, purple curve; L3, red curve. IMAGES ARE PRESENTED IN COLOUR.

[Figure 1.3, panels (a) and (b): line plots of total functional output against time (1000s of updates).]

Examining some of the individual replicates reveals considerable heterogeneity in the results (Figure 1.4). Figure 1.4a plots results from a pure specialist community, wherein each eco-species from the ecologically stable pre-extinction community is a specialist on a single resource. This community fails to recover its pre-extinction output levels for EQU, although the function itself does eventually re-evolve (but note the considerable gap in time before this happens). The recoveries for trophic levels L0 and L1, by contrast, are very rapid once resources are restored. In fact, the total output for these levels at the end of the experiment is nearly four times the pre-extinction level. This outcome suggests that the performance of the functions on these trophic levels is qualitatively superior to the pre-extinction versions. Close inspection reveals that this community shifts to having more generalist eco-species following the press episode, compared to the previous dominance by specialists. L2 for this community displays intermediate behaviour, with an initial rapid re-evolution of functionality and increase in output followed by a deceleration, so that about 45,000 updates are required before pre-extinction output levels recover fully.

Figure 1.4. Examples showing heterogeneous recovery dynamics in four press extinction replicates. In all plots, Y axis is total functional output, and X axis is time (in thousands of updates). Colour coding for trophic levels is the same as in Figure 1.3. Panels (a)-(d) show example outcomes from four standard experiments. See text for details. IMAGES ARE PRESENTED IN COLOUR.

[Figure 1.4, panels (a)-(d): line plots of total functional output against time (1000s of updates).]

Other outcomes observed include (but are not limited to) behaviour similar to that described above, but in communities where the ecologically stable eco-species are non-specialists before the press (Figure 1.4b). All functional output levels are roughly similar, and the press was a transient perturbation, with community performance rapidly resuming its upward trend for L0 and L1, but with noticeably longer recovery times for L2 and L3.
In contrast to the previous example, L3 did recover its previous level of functional output in this community. In other communities, none of the trophic levels (except perhaps L0) fully recovers its previous functional output level in the allotted time (Figure 1.4c). Still other communities show longer delays at lower trophic levels before recovery to pre-extinction levels of functional output, whereas recovery on higher trophic levels is relatively more rapid (Figure 1.4d).

Further, despite the larger population size and greater environmental complexity, the “large world” experiments exhibit the same kinds of heterogeneity in recovery dynamics. Two example large world communities are portrayed in Appendix A1.4. We should note that in these plots, levels of expression usually do not come to any equilibrium. In these experiments, genomes can grow or shrink over time. In order to prevent persistent reductive evolution (Yedid and Bell 2001, 2002) and loss of genomic complexity, genomes are rewarded in proportion to their length, which avoids rewarding minimal replicators with little or no ecological functionality. Since increased functional output (which confers fitness) is linked to greater genome length, total output levels tend to increase over time along with length. There is evidently considerable room for gradual fitness improvement (that further draws down available resources) even after all functions evolve. Even with this size-bias mechanism in place, we obtain, in 7/100 replicates, a number of eco-species in the pre-extinction ecologically stable communities that do not feature an appreciable increase or decrease in length (± 5 instructions) over the ancestral organism, even though other eco-species in those same communities have grown considerably over the ancestral length; short and long genomes co-exist in the same populations.
Further, there is a general tendency toward improvement over time scales that exceed the pre- and post-extinction periods in these experiments. As shown in Appendix A1.5, control populations that are not subject to extinction continue to show increasing levels of functionality during the second 100,000 update period on all four trophic levels. Output levels may yet stabilize if the experiments are run for much longer times.

Press extinctions: recovery of phenotypic diversity

Following the press-type extinctions, the full pre-extinction phenotypic diversity is not, on average, recovered (Figure 1.5). This outcome is due largely to the fact that the most complex functions, XOR and EQU, often do not re-evolve in the allotted time. A slight continuation of the upward trend is, however, apparent towards the end of the experiments, suggesting that full recovery might eventually occur, albeit over a very long period relative to the pre-extinction diversification. When individual replicates are examined, even considering only the 52/100 populations that re-evolve EQU, we see diverse outcomes ranging from little recovery even by the end of the experiment (Figure 1.6a) through both lengthy and short delays with full recovery (Figure 1.6b, c) to rapid and full recovery of phenotypic diversity (Figure 1.6d).

Figure 1.5. Phenotypic diversity vs. time, averaged over all 100 press extinction replicates. Y axis is total number of phenotypes, X axis is time (in thousands of updates). Error series are twice the standard error (approximate 95% confidence intervals).

[Figure 1.5: line plot of number of phenotypes against time (1000s of updates).]

Figure 1.6. Phenotypic recovery vs. time in four illustrative press extinction communities. Y and X axes are as in Figure 1.5. All of these replicates re-evolved L3 (EQU) functionality after the press, yet they show a diversity of recovery patterns, ranging from long-term loss of phenotypic diversity (a), long and short delays (b, c) to rapid recovery (d). IMAGES ARE PRESENTED IN COLOUR.

[Figure 1.6, panels (a)-(d): line plots of number of phenotypes against time (1000s of updates).]

Press extinctions: recovery of stably coexisting eco-species

Figure 1.7a shows the initial diversification, loss, and recovery of the number of stably co-existing eco-species. Diversification from the seed ancestor is rapid, with between 4 and 5 stable eco-species, on average, already present by 1000 updates. Just prior to the press, communities contained on average six eco-species, a result consistent with previous work using this same species criterion (Ofria and Cooper 2002), as well as with a genotypic clustering approach for similar community size and inflow rate (Chow et al. 2004). There is again considerable variability between replicates, with some communities establishing multiple, coexisting eco-species very early in the run, while others do not reach their peak pre-extinction eco-species diversity until after 5000 updates. Pre-extinction diversity levels are generally maintained over time in the undisturbed control experiments, although a few replicates lose stable eco-species even in these controls (compare pre-extinction vs. end-experiment controls (green dots) in Figure 1.7a). Examples of rapid and delayed re-radiation following press extinctions are shown in Figures 1.7b and 1.7c, respectively. In our experiments, communities such as those shown in Figure 1.7c are atypical; the great majority of communities show recovery profiles similar to those in Figure 1.7b.

Figure 1.7. Number of stably coexisting eco-species vs. time. a) Each data point represents an average of 100 replicate runs. Mutation was stopped at the time points shown to sort out those species that could stably coexist. Y axis is number of ecologically stable eco-species, X axis is time (in thousands of updates). Error bars are approximate 95% confidence intervals. The green point and error bars at the end represent the number of eco-species isolated from control runs with no environmental perturbation. b, c) Examples of eco-species recovery dynamics from individual replicates. Two replicates featuring rapid re-radiation following the press extinction (panel b), and two replicates featuring delayed re-radiation following the press extinction (panel c).

[Figure 1.7, panels (a)-(c): plots of number of stable eco-species against time (1000s of updates).]

Figure 1.7a also shows that, by the end of the press episode, communities contain, on average, only about three stable eco-species. The number of eco-species at the end of the press episode ranges from as few as one to as many as eight, with the latter cases being communities that were largely resilient to the press treatment. It should also be noted here that not all of these end-press eco-species represent more than one pre-existing line of descent that survives the press episode.
Some certainly do, while others represent new diversifications that begin during the press period, after all but one pre-extinction line of descent has gone extinct. Once resource inflows are restored to their pre-extinction values, re-diversification is rapid, requiring about the same time as the initial radiation. By 1000 updates into the recovery, most populations have greater eco-species diversity than in the first 1000 updates of the initial radiation, albeit starting from a somewhat higher level in most cases. After 5000 updates of recovery, eco-species diversity is not significantly different from that of the corresponding time point in the initial radiation (Figure 1.7a). Most replicates still have not recovered the final pre-extinction diversity by that time, having not yet evolved the highest-level functions. However, the end-experiment communities have on average the same diversity of eco-species that was present before the press episode (Figure 1.7a).

A somewhat different picture emerges when trophic position is taken into account. Figure 1.8 shows that when eco-species are weighted by trophic position, there is often not a full ecological recovery even by the end of the experiments. Both eco-species loss and ecological degradation at the highest trophic levels are evident from the end-press data. Using this trophic weighting, the end-experiment treatment mean is significantly less than both the pre-extinction value and the end-experiment control value, indicating that communities often fail to re-evolve their prior ecological breadth. As mentioned previously for phenotypic diversity, this failure often stems from an inability to re-evolve (at least in the allotted time) the functions XOR and EQU, which are the most difficult of the nine one- and two-input logic functions (Lenski et al. 2003).
With respect to the cross-feeding trophic network studied here, the secondary consumer level is thus incomplete, and the top consumer level is unoccupied, even though other organisms are producing the resources necessary to support them. Thus, although these communities often contain the same number of eco-species as before, many of them are less diverse from this trophic perspective. Appendix A1.6 shows the same individual replicates as those shown in Figure 1.7b and 1.7c, but adjusted for the trophic positions of the stable eco-species.

Figure 1.8. Number of stably coexisting eco-species weighted by trophic position. Each data point represents an average of 100 replicate runs. Y axis is weighted number of eco-species, X axis is time (in thousands of updates). Error bars and final control data as in Figure 1.7a. Calculations for weighting scheme are described in Appendix 1.1. IMAGES ARE PRESENTED IN COLOUR.

[Figure 1.8: plot of number of trophic-weighted stable eco-species against time (1000s of updates).]
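The treatment comparisons in this section rest on approximate 95% confidence intervals, taken as twice the standard error of the replicate mean, together with the rule that treatments are judged different when these intervals exclude each other's point estimates. A minimal sketch of that comparison is given below. The function names and the symmetric reading of the exclusion rule are assumptions made for illustration only; they are not taken from the analysis scripts used in this study.

```python
import math
import statistics


def approx_ci95(values):
    """Return (mean, lower, upper) with bounds at mean -/+ 2 standard errors."""
    mean = statistics.fmean(values)
    # Standard error of the mean from the sample standard deviation.
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - 2 * se, mean + 2 * se


def judged_different(values_a, values_b):
    """Apply the interval-exclusion rule at one time point.

    Assumption: the rule is read symmetrically, so each treatment's
    interval must exclude the other treatment's point estimate.
    """
    mean_a, lo_a, hi_a = approx_ci95(values_a)
    mean_b, lo_b, hi_b = approx_ci95(values_b)
    a_excludes_b = not (lo_a <= mean_b <= hi_a)
    b_excludes_a = not (lo_b <= mean_a <= hi_b)
    return a_excludes_b and b_excludes_a


if __name__ == "__main__":
    # Hypothetical eco-species counts for a handful of replicates.
    end_press = [3, 2, 4, 3, 3, 2]
    pre_extinction = [6, 7, 5, 6, 6, 7]
    print(judged_different(end_press, pre_extinction))
```

Because successive time points in a series are not independent, a check of this kind is best read one time point at a time rather than as a test on the whole curve.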
Pulse extinctions: recovery of functional activity

Recall that press extinctions involve a prolonged period of an altered, less complex environment, whereas pulse extinctions reflect an instantaneous loss of individuals and diversity, but no sustained change in the environment. One might thus expect faster recovery from pulse than from press extinctions. As summarized in Table 1.1, the recovery times for pulse extinctions tend to be much shorter than for the press extinctions at all trophic levels. On average, recovery of trophic activity on the simple L0 and L1 levels requires fewer than 8000 updates after a pulse extinction, whereas full functional recovery from the press extinctions requires around 30,000 updates on both these trophic levels (Table 1.1, Figure 1.9a). The recovery of trophic activity at L2 to previous levels required, on average, around 19,950 updates, as compared to more than 85,000 updates for the press extinctions. In contrast to the press extinctions, on the highest and most complex trophic level L3, average functional activity often did recover to its pre-event level of expression (Figure 1.9a). In part, this recovery is because the corresponding EQU expression was often not completely extinguished by the pulse event.

Table 1.1. Summary of recovery of ecological activity, showing time required to reach pre-extinction level. All times are expressed in updates and based on the average time series.

Trophic level    Press extinction    Pulse extinction    Ratio press/pulse
L0               28,450              7,650               3.72
L1               32,200              6,900               4.67
L2               85,150              19,950              4.27
L3               > 100,000           37,050              > 2.70

When all replicates are considered, recovery of EQU expression takes, on average, 37,050 updates. EQU expression dropped to zero in 27 of 100 replicates, and these cases require, on average, 75,800 updates before recovery to pre-pulse levels (Figure 1.9b).
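The press/pulse ratios in Table 1.1 follow directly from the listed recovery times. The short sketch below recomputes them as a consistency check. The recovery times are transcribed from the table; the data layout, the function names, and the use of the 100,000-update recovery window as a lower bound for L3 under press are illustrative assumptions.

```python
# Length of the recovery period, used as a lower bound when recovery
# was not reached within the experiment (L3 under press).
RECOVERY_WINDOW = 100_000

# Recovery times in updates, transcribed from Table 1.1.
# None marks a level that had not recovered by the end of the run.
RECOVERY_TIMES = {
    "L0": {"press": 28_450, "pulse": 7_650},
    "L1": {"press": 32_200, "pulse": 6_900},
    "L2": {"press": 85_150, "pulse": 19_950},
    "L3": {"press": None, "pulse": 37_050},
}


def press_pulse_ratio(press, pulse, window=RECOVERY_WINDOW):
    """Return (ratio, is_lower_bound) for one trophic level."""
    if press is None:
        # Only a lower bound can be stated when press recovery is incomplete.
        return window / pulse, True
    return press / pulse, False


def format_ratios(table):
    """Return one formatted line per trophic level, e.g. 'L0: 3.72'."""
    lines = []
    for level, times in table.items():
        ratio, is_bound = press_pulse_ratio(times["press"], times["pulse"])
        prefix = ">" if is_bound else ""
        lines.append(f"{level}: {prefix}{ratio:.2f}")
    return lines


if __name__ == "__main__":
    for line in format_ratios(RECOVERY_TIMES):
        print(line)
```

Run as a script, this reproduces the ratio column of the table, with the L3 entry reported only as a lower bound.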
Variability in the recovery response among post-pulse communities is largely confined to L3, the highest trophic level (Appendix A1.7). When the initial phases of the recovery are examined in detail, with time suitably rescaled to make the beginning of the recovery zero for both extinction treatments, the pulse communities undergo a considerably faster rate of increase at all trophic levels than do the press communities (Figure 1.10). The faster recovery in the pulse communities occurs even though these communities often have, on average, lower levels of functional output for the first few hundred updates immediately after the end of the perturbation.

Figure 1.9. Plots of functional recovery following pulse extinctions for trophic levels L0-L3. Data in panel (a) are averaged over all 100 replicates; data in panel (b) are averaged over only those 27 replicates in which the function EQU was lost and re-evolved. Horizontal lines originating at pre-extinction values indicate approximate times at which functional outputs, on average, re-attained their pre-pulse values. In both panels, Y axis is total functional output at that level, and X axis is time (in updates). Axes and colours as in Figure 1.3. IMAGES ARE PRESENTED IN COLOUR.

[Figure 1.9, panels (a) and (b): line plots of total functional output against time (1000s of updates).]

Figure 1.10. Comparison of functional recovery in press vs. pulse extinction communities over the first 3,000 updates of recovery.
Time has been rescaled so that the beginning of the recovery period is zero for both extinction treatments. Curves show average of all 100 replicates. In all four panels, solid curves are pulse communities and dashed curves are press communities.

[Figure 1.10, panels (a)-(d): line plots of total L0, L1, L2 and L3 output (1000s of executions) against recovery time (1000s of updates), with curves labelled PRESS and PULSE.]

Pulse extinctions: recovery of phenotypic diversity

Unlike communities recovering from the press extinctions, most pulse communities quickly and fully recover phenotypic diversity comparable to their pre-extinction diversity, although a few are still a bit depressed, again typically on the higher trophic levels (Appendix A1.8). As described above for functional activity, when time is rescaled to the beginning of the recovery, the pulse communities show faster recovery of phenotypic diversity than the press communities, even though the pulse communities begin recovery, on average, with fewer phenotypes than the end-press communities (Figure 1.11).

Pulse extinctions: recovery of stably coexisting eco-species

Recall that stable eco-species are those organisms that can coexist stably when mutation, and hence further evolution, is stopped. Figure 1.12 compares the recovery of eco-species in press vs. pulse populations, with time offset as in Figures 1.10 and 1.11.
When measured by eco-species count, the initial rate of recovery does not differ significantly between the two treatment types, even though the post-pulse communities begin, on average, with slightly more eco-species. By the end of the recovery period, the press communities have a slightly, though not significantly, higher eco-species diversity than the pulse communities (t(0.05, 193) = -1.59, two-tailed p > 0.1).

Figure 1.11. Comparison of early stages of recovery of phenotypic diversity in pulse vs. press communities. Time has been rescaled so that the beginning of the recovery period is zero. Curves represent average of 100 replicates. Solid curve: pulse; dashed curve: press.

[Figure 1.11: line plot of number of phenotypes against recovery time (1000s of updates).]

Figure 1.12. Comparison of recovery of stably coexisting eco-species between press and pulse experiments. Time has been rescaled so that the beginning of the recovery period is zero. Each data point represents the average of 100 communities. Solid curve: pulse; dashed curve: press.

[Figure 1.12: line plot of number of stable ecotypes against recovery time (1000s of updates).]

Adjusting each eco-species for its trophic position again changes the picture somewhat (Figure 1.13). Because sampling of pulse communities is at random (from among the pool of viable genotypes), more of the pre-existing ecological and phylogenetic structure is preserved than in the press communities, allowing some carryover of organisms that perform top-level functions. On average, post-pulse communities tend to remain more trophically varied than post-press communities after their respective extinctions, and by 5000 updates most pulse communities have reached their end-experiment values (Figure 1.13, solid curve).
The press communities, by comparison, are in general more ecologically depauperate at the end of their perturbations, and remain so over the first 5000 updates of recovery. Even by 50,000 updates post-press, they still, on average, lag the pulse communities in ecological recovery (Figure 1.13, dashed curve). By the end of the experiment, after 100,000 updates of recovery, press and pulse communities are roughly equivalent from a trophic perspective (t(0.05, 193 df) = 0.782, two-tailed p > 0.43). The end-experiment average for both press and pulse treatments is lower than the extinction-free control (see Figure 1.8), owing in both cases to the failure of many post-extinction communities to re-evolve the most difficult trophic functions, XOR and EQU.

Figure 1.13. Comparison of recovery of trophic-weighted stably coexisting eco-species between press and pulse experiments. See Appendix 1.1 for details of weighting calculation. Time has been rescaled so that the beginning of the recovery period is zero. Each data point represents the average of 100 communities. Solid curve: pulse; dashed curve: press.

[Figure 1.13: line plot of number of trophic-weighted stable ecotypes against recovery time (1000s of updates).]

Average generation time in press vs. pulse

Recall our expectation that the low-resource press treatment would favour organisms with short generation time, able to make efficient use of scarce resources and with little or no investment in code for performing costly metabolic functions that would be futile during the press episode. Indeed, we observe a consistent trend towards organisms with shorter generation times in the press experiments, with a subsequent rebound following restoration of resource inflows (Figure 1.14, dashed curve after 100,000 updates).
The pre-extinction average generation time is 1356 instructions executed per replication, while that for the end of the press episode is 281.84 instructions executed per replication (paired t(0.05, 99 df) = 60.48, one-tailed p < 0.0001). In contrast, the pulse extinction experiments show no consistent response, as expected given random culling (paired t(0.05, 99 df) = -0.3527, two-tailed p > 0.72). Press extinction communities recover to a lower mean final generation time than do pulse extinction communities (1640.22 for press, 1802.89 for pulse, paired t(0.05, 99 df) = -5.43, two-tailed p < 0.0001).

Figure 1.14. Comparison of average generation time in pulse vs. press communities. Lines represent average of 100 replicates. Y axis is average generation time, X axis is absolute time (in thousands of updates). Data series are identical up to 100,000 updates, when treatments were imposed. Solid curve: pulse; dashed curve: press.

[Figure 1.14: line plot of average generation time (instructions executed) against time (1000s of updates).]

Effect of density and community composition on recovery

In Avida, because of the basal CPU cycles given to all organisms, organisms can persist without performing any logic-based trophic functions tied to resources, although with a very low rate of replication. Minimal replicators become common during the press episode, when resources are largely unavailable, because metabolically complex organisms waste energy by performing futile, yet costly, functions. Thus, at the end of the press episode, communities are still nearly full, but mostly with organisms well-adapted to low-resource conditions. Organisms that can take advantage of the suddenly restored resources following the press episode first must overcome a sea of competitors, whereas survivors of pulse extinctions are free to expand into a nearly empty ecosystem with greatly reduced competitive pressures.

To determine whether there is an effect of low density per se on the differences in recovery following pulse and press extinctions, we select (at random) a subset of 50 replicates, taking from each one only the stable eco-species from the end-press community. We then inoculate a single individual of each eco-species into an empty grid, with mutation rates and resource inflows the same as during the pre-extinction period, but with no accumulated standing resources. The objective is to determine how recovery time is influenced by seeding a low-density environment with these products of press-imposed selection. We then repeat these procedures, but instead seed communities with the organisms known to have survived the pulse extinctions. For all four trophic levels, the initial rate of recovery is slower when the community is seeded with organisms from the end of the press episode (Figure 1.15, dashed curves), compared to communities seeded with the pulse extinction survivors (Figure 1.15, solid curves). This result demonstrates that important changes evolve during the press that impede rapid re-evolution of the trophic functions to their pre-extinction levels.

Figure 1.15. Comparison of functional recovery in low-density recovery experiments. All lines are averages of 50 replicates. Solid curves: recovery using survivors of pulse extinctions. Dashed curves: recovery using stably coexisting eco-species from end of press episode. Dot-dashed curves: recovery using pre-extinction organisms with shortest generation times. a) Trophic level L0. b) Trophic level L1. c) Trophic level L2. d) Trophic level L3 (EQU).

[Figure 1.15, panels (a)-(d): line plots of total L0, L1, L2 and L3 output (1000s of executions) against recovery time (1000s of updates), with curves labelled PULSE, PRE-EXTINCTION, SHORT GENERATION and END-PRESS.]

Given the marked tendency towards organisms with shorter generation times during the press period, we perform a subsequent test to examine how recovery is influenced by introducing a similar bias into a post-pulse community. From each of the same 50 replicates, we select the four organisms with the shortest generation times from the pre-extinction community. As before, we inoculate an empty ecosystem with a single individual of each type, with no accumulated standing resources. The objective here is to determine whether shorter generation times and the resulting slowed recovery result from a simple sorting of pre-extinction diversity, or whether further adaptation during press conditions also contributes to delayed recovery. The results of this test are shown in the dot-dashed curves in Figure 1.15; for trophic level L3, only those replicates where the EQU function actually re-evolves are averaged and plotted. The initial recovery at levels L0 and L1 appears similar to the pulse recovery, but then decelerates and converges on the trajectory of recovery for communities seeded with end-press organisms (Figure 1.15a, b). The recovery of L2 output falls almost exactly between the pulse and end-press seeded recoveries (Figure 1.15c). There is no statistically discernible difference between the short-generation and end-press seeded recoveries on L3.
However, the sample sizes are smaller than for the experiments seeded with pulse survivors (only experiments where EQU re-evolved are plotted), and the small difference that is evident is likely driven by a few replicates. These results again indicate that a substantial part of the slower recovery seen in the press experiments arises because these communities have evolved organisms with short generation times that lack much of the machinery needed to re-evolve high trophic-level functions.

DISCUSSION

In this study, we use communities of digital organisms to investigate dynamics of recovery from instantaneous pulse extinctions, and from low-resource press extinctions. These communities feature cross-feeding food webs with trophic structure, where the digital organisms can evolve to consume and extract energy from supplied resources, and make by-products that themselves serve as resources for other individuals. This system offers the opportunity to instigate extinction either by a mass culling of the population without any environmental alteration (pulse) or by manipulating the environment, leading to evolutionary responses by members of the community (press). An important strength of our system is the ability to repeat replicates exactly up to a particular time point, then subject them to different treatments from that point onwards, and study the resulting differences.

Functional recovery

We observe clear differences in the dynamics of recovery from press versus pulse extinctions. When recovery is gauged by total functional activity on a particular trophic level, communities recover much faster, on average, from pulse extinctions than they do from press extinctions. The largest absolute differences are seen at the higher trophic levels, L2 and L3 (Figures 1.3, 1.9). In fact, many press extinction communities never recover their pre-extinction performance on level L3.
When we consider the relative differences between press and pulse recoveries, the differences are rather similar across trophic levels (Table 1). The lower levels with more easily evolved functions, L0 and L1, have 3.7-fold and 4.7-fold longer average recovery times, respectively, from press extinctions than from pulse extinctions, while the corresponding difference for L2 is about 4.7-fold. We cannot express this same ratio for L3 because its functional activity had still not reached its pre-extinction level by the end of those experiments.

There are a number of important differences between the pulse and press extinctions. A pulse extinction is an instantaneous sampling of the community, where survivors are, from an individual standpoint, no better or worse adapted than previously, except for possibly missing other community members upon whom they might depend for resources. Our main set of pulse extinction experiments are, in effect, “field of bullets” extinctions. Of course, if more organisms survive, then more of the pre-extinction ecological fabric and evolutionary history will be preserved, thus spurring more rapid recovery. The press, by contrast, requires the passage of time in order for its effects to become evident, and it provokes an adaptive response in the organisms to the changed environment (Figure 1.14, 1.15). The low-resource press treatment used in our study tends to favour organisms with shorter generation times and, as a consequence, simplified functionality. Pulse survivors, by contrast, have generation times distributed randomly around the pre-extinction average and retain their functionality. Using the classification of McGhee et al. (2004), the pulse extinctions can be seen as mostly “Category II”, producing great loss of biomass and diversity, but usually failing to eliminate key taxa or ecological traits, and thus having minimal ecological impact (though some exceptions certainly exist in our experiments).
The press extinctions, by contrast, can be considered mostly “Category I”: pre-extinction ecosystems experience near-total collapse during a selective extinction episode, and are replaced by new ecosystems that evolve during the recovery period. Our results therefore indicate that selection favouring organisms with short generation times, and the resulting “de-evolution” of their higher trophic functions, are major factors delaying recovery from press extinctions. This effect supports the “nonconstructive selectivity” hypothesis advanced by Raup (1984, 1991), since during “background” times (pre- and post-extinction) organisms are rewarded for honing their ecological functions, not drastically shortening their generation times. Yet, it is the latter trait that facilitates survival through the press episode. This effect is further supported by the delayed recovery that we observed in subsidiary experiments using organisms chosen for having short generation times. Evidently, the press episode damages not only the ecological fabric of the community, but also the genetic architectures and functional potential of the surviving digital organisms. As a consequence, recovery from the press extinctions at the higher trophic levels is limited both by availability of sufficient resources, and by the need to re-evolve the functions for using those resources. This latter component typically includes not only the time needed for the lost functions to re-evolve, but also additional time for the expression of those functions to recover to pre-extinction levels, on both per-capita and full-population bases.

Our results showing delays in recovery are comparable to the lags at higher trophic levels seen in the models of Solé et al. (2002). It should be stressed that they modeled recovery of species diversity per se, rather than ecosystem function, and their evolutionary model lacked the explicit population dynamics present in our model system.
In their model, the strength of links between species on different trophic levels is fixed for the lifetime of each species, and those species are simply present or absent. By contrast, the strength of ecological links in our system depends on both the per-capita expression of particular functions and the total number of individual organisms expressing those functions; moreover, both quantities can fluctuate considerably over time owing to demographic and evolutionary processes. Furthermore, their extinctions were all instantiated as pulse extinctions targeted at the primary producer level, which did not, by itself, include any adaptive component, in contrast to the press extinctions in this study.

Regardless of the type of extinction imposed, and in spite of the exotic nature of Avida’s digital organisms and communities, our results have clear and interesting parallels to data from the paleontological record. In particular, our results for delayed recovery of ecological activity at higher trophic levels are broadly analogous with the findings of a study using carbon isotopic flux data to infer extinction and recovery. D’Hondt et al. (1998) concluded that marine productivity recovered quickly after the Cretaceous-Tertiary mass extinction, whereas many species on higher trophic levels (ranging from zooplankton to marine megafauna) went extinct. As a consequence, the ecosystem as a whole took considerably longer to recover, particularly in the case of the megafauna, which required replacement through adaptive radiation from terrestrial stocks. Benton et al. (2004) also reported impoverishment of diversity for some 15 million years following the end-Permian extinction, including delayed re-evolution of top predators, based on analyses of macrofaunal fossils.

Another parallel between the digital ecosystems and paleontological record is seen in the post-extinction trend towards organisms with shorter generation times and functional simplicity.
Although often anecdotal, numerous studies have reported disproportionate extinction of larger bodied and more complex forms (Russell 1977, Fischer 1981, LaBarbera 1986, Jablonski 1986, 1996, Norris 1991, Stanley and Yang 1994, Arnold et al. 1995, Ross and Ross 1995, Saunders et al. 1999). For example, larger planktonic foraminifera disappeared at the K-T boundary, leaving survivors that were smaller and had simpler morphologies (Norris 1991, Arnold et al. 1995). Large foraminifera with complex architectures also declined during the end-Permian extinction (Stanley and Yang 1994, Ross and Ross 1995). While digital organisms lack morphology per se, they have other phenotypic characters, including genome size and functional complexity, such that the evolution of simpler phenotypes can be observed and quantified.

Diversity recovery

Diversity-through-time studies are based on source data derived from compendia of fossils (Kirchner and Weil 2000a,b, Kirchner 2002, Lu et al. 2006). In these studies, analyses of recovery dynamics must deal with how best to accommodate the phenomena of backward and forward “smearing” of recorded organisms, which influence estimated times of originations and extinctions in the fossil record. The studies of Kirchner and Weil (2000a,b) and Kirchner (2002) used the uncorrected database of Sepkoski (1992, 1997, 2002), while Lu et al. (2006) used a database prepared by Foote (2003) that incorporates corrections for effects of rock volume and preservation potential. These studies accounted for smearing effects in different ways, using different assumptions, and they reached opposite conclusions regarding the biological basis of lags in recovery. Kirchner and Weil (2000b) and Kirchner (2002) infer a “niche construction” phenomenon to explain their findings, wherein the rate of recovery is limited by the efficacy of diversification mechanisms and opportunities following the ecological breakdown caused by the mass extinction.
Their inference is based on statistical lags in the cross-correlation between extinction and origination time series derived from Sepkoski’s data. The signal of lag persists even if sampling is very incomplete, and also when the “Big Five” mass extinctions are excluded (Kirchner and Weil 2000b). Lu et al. (2006), by contrast, avoid making any explicit biological inferences, instead attributing the appearance of delayed recoveries to low preservation potential during the early stages of recovery periods. They conclude that diversification commences soon after extinction-causing perturbations have subsided, and that originations are usually rapid and even pulse-like (in geological time). They suggest that better preserved sections from later times, which contain more taxa, lead to inference of delayed episodes of diversification (and thus recovery). Thus, in their view, most recoveries from mass extinctions occur without prolonged delays. However, they acknowledge the strong evidence for delayed recovery following the end-Permian extinction. In any case, important questions remain as to the best methods to correct for preservation biases, and whether it is reasonable to assume “pulsed turnovers” in which all extinctions happen at the end of a geological interval, while all originations occur at the beginning of the succeeding one.

The empirical studies discussed above do not group fossil organisms by their ecological roles, but instead group them taxonomically or lump them together into a single dataset. Consideration of trophic position (when it can be ascertained), and other ecological factors, introduces different perspectives on extinction and recovery for fossil data. Most generally, the taxonomic and ecological severities of mass extinctions can be decoupled. Selective removal of specific traits or taxa may result in much larger ecological changes than elimination at random (Jablonski 2001, McKinney 2001).
Thus, extinctions that are not severe from the standpoint of taxonomic loss can be extremely disruptive to ecosystems if they remove ecologically important taxa (McGhee et al. 2004). The most severe disruptions can have very long-lasting repercussions. Benton et al. (2004) note that, following the end-Permian extinction, certain ecological guilds (including small insectivores, small piscivores, large herbivores, and top predators) had failed to reappear after approximately 15 million years of recovery, even though most ecosystem functions had been restored by this point.

Modeling and simulation approaches, including Solé et al. (2002) as well as the present study, can incorporate an explicit trophic component and analyze how trophic structure influences recovery dynamics. Both of these theoretical studies found greater delays in recovery at higher trophic levels, although using rather different types of data and assumptions. In addition to measurements of ecosystem function per se, we also used various counts including the number of stably-coexisting “species” (eco-species), total phenotypic diversity, as well as combined metrics (stable eco-species counts weighted by their trophic functions). Insofar as we can measure simple “taxonomic” diversity through stable eco-species, our results largely support the conclusion of Lu et al. (2006). That is, re-radiation of Avida eco-species following relief from press conditions is usually rapid and opportunistic. Previous modeling studies (Sepkoski 1978, 1979, 1984) also feature rapid radiations as regular features of diversification following low-diversity episodes.
Even with the tight constraints on stable eco-species diversity in our system, where the number of different resources is small, substantially delayed radiations based on stable eco-species counts are very much the exception (Figure 1.7c), with most recoveries occurring quickly following restoration of the environment to its pre-extinction conditions (Figure 1.7a, Appendix A1.6). However, when we take into account the trophic positions of the eco-species during the recovery, we reach a very different conclusion. From this perspective, recovery was often incomplete, even by the end of the experiment (compare end-experiment points for control and treatment in Figure 1.8).

Several of the diversity metrics used in our study, including the number of stably coexisting eco-species as well as the total count of distinct phenotypes, depend fundamentally on the presence or absence of trophic functions to differentiate distinct types. As stated above, both the initial diversification and recovery can be very rapid, especially using the count of stable eco-species. In the early stages of radiation and recovery, many of these eco-species are separated by rather small genetic and ecological differences, which are sufficient to permit two or more recently diverged types to partition resources and thereby stably coexist (see examples in Appendix 1.2). In some cases, one or several mutations are sufficient to produce a functional shift, one that builds upon some existing genetic pathway to perform a new trophic function (Lenski et al. 2003), in much the same way that new biological functions evolve from existing components that previously served other functions (Mortlock 1984, Bridgham et al. 2006). Of course, not all functions are equally easy to evolve or re-evolve.
If the genetic and functional potential in a community decays sufficiently, as is often the case during a sustained press-type extinction, then it becomes much more difficult to rebuild the highest level trophic functions. Thus, we see compelling evidence of the interaction between genetic factors, which are internal to the organisms, and the nature of the environmental perturbation leading to the extinction, which together influence the ecological and evolutionary dynamics of the recovery from a mass extinction.

The effect of selection for genomic and functional simplification during the press extinction in delaying the subsequent recovery of the community can also be interpreted in light of the findings of Yedid and Bell (2002). They showed that adaptive evolution sometimes follows “paths of no return” that severely constrain future options for evolving particular solutions. Adaptation to the press environment makes the digital organisms better able to tolerate the paucity of resources, but it also pushes them down evolutionary paths that inhibit their subsequent ability to adaptively re-radiate after the extinction episode. The time necessary for functions to both re-evolve qualitatively, and recover quantitatively, thus becomes limited by the ability to regenerate genomic potential and “evolvability” that may be compromised during the press episode. While one must be cautious in extending such results to the organic world, the implication is that full recovery from a biotic crisis may be hampered by adaptation to the transiently impoverished conditions.

FUTURE DIRECTIONS

The present study has focused on the recovery from mass extinctions, including the differences between extinction types and the variation among replicates subject to the same treatment. We have not examined why some pre-extinction communities are more resilient to the effects of extinction than others.
Despite the severe effects of the press extinction on most replicates, there were eight communities (out of 100) that were largely unaffected by this treatment, experiencing little or no loss of either diversity or functionality. While these press-resistant communities were a minority, it would nonetheless be of considerable interest to determine whether their greater resilience reflected some particular aspect of the trophic web or, alternatively, some “internal” factor that characterizes keystone species, such as genomic redundancy promoting functional robustness with respect to mutations (Lenski et al. 2006).

Another future direction concerns understanding the re-evolution of the EQU trait, which is the most complex of the one- and two-input logic functions (Lenski et al. 2003) and which was the “top consumer” in the trophic webs in the present study. Whenever a complex function is lost and then re-evolved, it is interesting to know whether the later instance of the function arose de novo, or was built from bits and pieces of the ancestral function. In future work (see Chapter 2), we intend to determine the precise ancestry of those organisms that re-evolved EQU following the press extinctions that led to simplified genomes and functionality. Specifically, we want to test whether the late EQU-performing organisms descend more often than expected by chance from ancestors that performed EQU before the press and lost that ability during the press episode.

A third area for future work concerns the effects that press and pulse extinctions have on various properties of the phylogenetic trees of the digital communities. The Avida system is well suited to address this issue because the relationships among all organisms can be determined with absolute certainty, hence removing any potentially confounding effects associated with the various methods used for phylogenetic reconstruction.
Particular features of tree shape that can be examined include their balance (evenness of distribution of leaves) and stemminess (ratio of interior to exterior branch lengths), as well as the overall retention of evolutionary history following the extinction (Nee and May 1997, Heard and Mooers 2002).

A fourth area for future work is further characterization of the changes in community composition and ecosystem structure that take place during recovery from the extinction. Several results (Figure 1.4, Figure 1.6a, Appendix 1.4c,d) suggest that press extinction communities in particular can undergo dramatic ecological reorganization during the recovery period, even when all functions re-evolve. We can better quantify these changes by using some measure of the distribution of functionality among organisms in the population, and seeing how this changes over time. The normalized mutual entropy approach of Gorelick et al. (2004) seems very promising, since it requires data that are very similar in nature to the phenotypes of Avida organisms. If this metric can be adapted successfully for use with Avida output, we can see how post-extinction communities differ from their pre-extinction predecessors, especially regarding division of labour (i.e. shifts from specialization to generalization or vice versa). We expect that communities subjected to press extinctions will show more marked changes in this metric (and thus more drastic shifts in ecological and community organization) than pulse extinction communities.

Finally, a fifth area for future work would be to examine the effect of recombination on the recovery dynamic. It is possible that some of the effects of delayed recovery seen in our experiments stem from the strictly asexual nature of the digital organisms used here. Long recovery times could result since beneficial mutations allowing niche expansion and increased efficiency of use must arise sequentially on the appropriate genetic backgrounds.
Allowing sexual recombination might more quickly create new backgrounds that bring beneficial mutations together and thus shorten the recovery times, especially for more difficult functions such as EQU.

CHAPTER 2

CONVERGENCE, CONSTRAINTS, AND CONTINGENCY IN RE-EVOLUTION OF A COMPLEX TRAIT FOLLOWING EXTINCTION IN COMMUNITIES OF DIGITAL ORGANISMS

ABSTRACT

The repeated, convergent evolution of complex traits remains one of the most interesting phenomena in evolutionary biology. Such traits have arisen in different taxa separated by considerable spans of geological time, but the underlying genetic and physiological bases cannot be studied in fossil taxa. We used communities of digital organisms (computer programs that self-replicate, mutate, and evolve) to study the evolution, loss, and re-evolution of a complex trait. These communities were subjected to press extinctions caused by a period of low resource availability. We examined the re-evolution of the computational function EQU (bit-wise logical equals) in communities where organisms bearing this complex trait had gone extinct. There was a strong tendency for the EQU function to re-evolve in communities where it was previously present, suggesting a community-level “deep history” effect. However, associations between the re-evolution of EQU, and either the percentage of the pre-press community expressing it, or its total expression summed over the pre-press community, were weak. In replicate populations where EQU did re-evolve, organisms that re-evolved EQU after the press tended to have immediate pre-extinction ancestors of low trophic position and functional complexity, evidently because these lower-level functions were more resilient during the press episode. Organisms in which EQU re-evolved often did not include that function as part of their pre-extinction evolutionary history; that is, EQU had previously evolved in a different pre-extinction clade.
We further examined replicates where the first organism in which EQU evolved before the extinction had descendents remaining at the end of the extinction episode. In most of these replicate populations, the pre-extinction EQU clade went extinct during the early phases of the recovery, though there were also a number of replicates where the clade persisted without EQU re-evolving. To see if these clades had lost the capacity to re-evolve EQU, we seeded one set of multiple sub-replicate “replay” populations using the most abundant remaining descendent organism of the pre-extinction EQU clade, and another set with the actual end-extinction ancestor of the organism in which EQU re-evolved following the press episode. Descendents of genotypes taken from replicates where EQU did not re-evolve in the original experiment were not demonstrably worse at re-evolving it than those taken from populations where EQU did re-evolve. Organisms sampled from within the same clade often varied markedly in their propensity to re-evolve EQU. In those lineages where EQU re-evolved in a lineage that previously possessed it, the amount of re-use of the ancestral mechanism was highly variable, but often quite substantial. Furthermore, the percentage of ancestral functional sites retained through the press episode was positively correlated with the probability of re-evolving EQU. The sizeable number of replicates where EQU re-evolved in a clade that did not previously possess it, or in clades where the ancestral mechanism had decayed considerably, demonstrates that there are a variety of distinct ways in which the EQU function may re-evolve, building on different sets of precursors. Replicated experiments starting with the same ancestor demonstrate that stochastic factors can lead to constraints on further adaptation, as re-evolution of EQU is not always certain even in cases where many building blocks for it are already in place.
INTRODUCTION

The phenomenon of convergent evolution is understood as the appearance of traits with broadly similar form and function in multiple clades that may be widely separated in space, time, or both. Convergent traits are held to be the result of adaptation to similar selective pressures, or represent the physically best possible solution to a particular aspect of an organism’s lifestyle (Patterson 1988, Salesa et al. 2006, Sumbre et al. 2006). Constraints of various kinds may also play a substantial role in the formation and use of such traits (Wake and Roth 1989, Wake 1991, Conway Morris 2003, Brakefield 2006, Weinrich et al. 2006).

Convergence pervades the natural world at all scales. Examples range from gross anatomy and physiology in megafauna and megaflora both living (Shimek and Kohn 1981, Blackburn 1992, 1993, Sinha and Kellogg 1996, Klok et al. 2002, Reznick et al. 2002, Whiting et al. 2003, Fain and Houde 2004, Springer et al. 2004, Sumbre et al. 2005, 2006) and extinct (Marshall 1980, Padian 1985, Wroe et al. 2005, Salesa et al. 2006), to biological molecules and the metabolic networks in which they participate (Tomarev and Piatigorsky 1993, Chen et al. 1997, Amoutzias et al. 2004, Shagin et al. 2004, Goetzman et al. 2005, Perry et al. 2006, Sweeney and Johnsen 2006). Although the overall form and function of convergent traits may be broadly similar, the details of how the structures develop and operate can differ markedly for given cases. One of the best known such examples is the difference in structure (and presumed developmental mechanisms) between pterosaur, bird and bat wings (Padian 1985); insect wings have completely different evolutionary and developmental origins (Marden and Thomas 2003). Similar solutions have evolved even with very different anatomies. For example, cephalopods lacking an endoskeleton use muscle contractions to emulate the limb flexions of jointed arms (Sumbre et al. 2005, 2006).
Convergence has apparently occurred not only in geographically and phylogenetically divergent clades (Conway Morris 2003, Fain and Houde 2004, Springer et al. 2004, Rich et al. 2005), but also in clades that have arisen during different eras of geological history (Padian 1985, Ji et al. 2006, Meng et al. 2006, Li et al. 2007). Convergent traits are particularly notable when they evolve anew following the extinction of the clade in which they evolved previously.

What makes convergence all the more remarkable is how often similar forms, along with features deemed to be “complex”, have evolved, suggesting a degree of predictability and determinism in evolutionary outcomes (Vermeij 2006). Within living taxa, powered flight has apparently re-evolved several times in stick insects (Phasmida), a clade that diversified from a putative wingless ancestor (Whiting et al. 2003). In plants, many families have evolved C4 photosynthesis (and its associated anatomical specializations) for adapting to warm climates, with at least four independent origins in the grass family alone (Sinha and Kellogg 1996). Perhaps the most spectacular neontological examples of convergently-evolved “complex” traits are camera-type eyes in various clades of both invertebrates and vertebrates (Gehring 1999, Conway Morris 2003, Sweeney and Johnsen 2004). Complex traits need not be confined to morphology and physiology: eusocial community organization, often involving physiologically specialized castes, has evolved in at least two major insect clades (Wilson and Hölldobler 2005), synalphid shrimp (Duffy et al. 2000), and bathyergid rodents (Bennett and Faulkes 2000).

“Complex” traits are normally understood as physiological structures or metabolic pathways that require the coordinated action of multiple interacting parts, genes and/or gene products to function, and that are often metabolically expensive to maintain.
The repeated evolution of complex traits (see examples in preceding section) remains one of the most compelling issues in evolutionary biology, given the difficulties thought to be inherent in their formation. Darwin reasoned that complex traits must evolve through incremental steps requiring many intermediate states, and involving co-option of new parts from precursors with other functions. Much evidence now exists that supports his general model (Lenski et al. 2003). As a special case of convergent evolution, re-evolution of complex traits in taxa separated by large spans of geological time (and often extinction episodes) remains poorly understood from both paleontological and neontological standpoints.

Repeated evolution of complex traits is of considerable interest not only intrinsically (Reznick et al. 2002, Lenski et al. 2003), but also in the context of whether or not such traits are re-evolved following the extinction of clades bearing them. While mass extinctions in particular are most notable for the new ecological opportunities they create for surviving taxa (Erwin 2001), they have lower-level effects in addition to the very broad macroecological re-sculpting which is their clearest hallmark. In addition to eliminating an appreciable fraction of biota across considerable spatial scales, mass extinctions can cause broad losses of evolutionary history. Clades (in whole or particular subsets of them) may go extinct either during background times, or during wide-ranging ecological crises and their aftermaths. Extinction of whole clades may result in the loss of any unique or defining characters their members may have evolved (Frazetta 1970, Frey et al. 1997). Jablonski (1986) postulated that useful adaptations that would otherwise permit expansion of resource use, or other environmental opportunities (e.g. improved predator avoidance), could arise in the wrong places and times, i.e. shortly before a mass extinction.
Such traits could re-evolve after considerable spans of geological time, in descendents of lineages that survived the extinction and radiated in its aftermath. However, the fossil record provides surprisingly few unambiguous examples of this phenomenon in particular; examples of convergence in gross form and ecological habit are more obvious. Such ambiguity is shown by the example of the shell-drilling habit in carnivorous naticid gastropods, which was claimed by Newton (1983) and Fürsich and Jablonski (1984) to have originated shortly before the end-Triassic mass extinction. Shell-drilling was seemingly lost during that event, re-originating roughly 120 million years later in a related group. The presence of Triassic naticids was, however, inferred from naticid-like drillholes found in other fossils dating from that time. A more recent, competing view suggests that naticids (and their shell-drilling habit) originated once in the Cretaceous, and all supposed Triassic-era naticid precursors were in fact herbivores, based on the feeding habits of the last living representative of the family in which these Triassic shell-drillers were thought to have arisen. An as-yet unidentified actor must be the source of the Triassic-era holes (Kase and Ishikawa 2003). These assertions are completely conjectural and potentially fallacious, since the clade may have been more ecologically and functionally diverse than suggested by the single extant remnant, and no other candidate has been suggested.

Other, clearer examples of re-evolution of a complex trait following extinction do exist, such as the rock-boring bivalves of the late Ordovician, which also perished in a mass extinction. The boring habit did not re-appear unequivocally until 100 million years later in the Triassic (Pojeta and Palmer 1976); in this case, the fossil organisms are found in association with their effect on the environment.
Among vertebrates, the evolution of wings and powered flight in three major tetrapod clades (pterosaurs, birds, and bats) is probably the best-known and clearest example (Padian 1985), the membranous, digit-supported wings of bats and the extinct pterosaurs being more comparable. While not as spectacular as powered flight, the use of patagia supported by dorsal rib projections for gliding has evolved in lizards during the Triassic, Cretaceous, and the modern agamid lizard Draco (Li et al. 2007). A functionally analogous structure not involving dorsal rib projections is known from a late Permian reptile (Frey et al. 1997). Finally, an additional caveat that must be considered when using fossil organisms is that the phylogenetic relationships among the taxa considered may not be well supported or resolved, particularly with respect to character polarity. “Simple” organisms that look like precursors to ones that bear a complex trait may in fact be derived and secondarily simplified (Bateman 1996).

In order to better understand re-evolution of complex traits following extinction, an experimental system that overcomes many of the difficulties inherent in using fossil data is desirable. Such a system now exists. The digital organisms of the Avida system (Wilke and Adami 2002, Ofria and Wilke 2004) have been used previously to study the evolution of complex features (Lenski et al. 2003). Avida is a tractable system that also has many features conducive to study of re-evolution of complex traits if they are lost by some means, such as an environmentally-mediated mass extinction (see Chapter 1). In the current chapter of this thesis we focus on a particular complex trait, the bitwise logic function EQU. Of the nine basic logic functions, EQU is the algorithmically most complex, and empirically most difficult to evolve in Avida; it does not evolve in settings where simpler precursor functions yield no fitness bonus to the digital organisms (Lenski et al. 2003).
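To make the target function concrete, the following minimal Python sketch shows one textbook way to compose bitwise EQU (logical XNOR) from NAND operations alone. It is an illustration only and is not Avida code: the names `MASK32`, `nand`, and `equ` are hypothetical, and the 32-bit word width and the restriction to NAND as the sole primitive are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit word width (an assumption of this sketch)


def nand(a: int, b: int) -> int:
    """Bitwise NAND of two words, truncated to 32 bits."""
    return ~(a & b) & MASK32


def equ(a: int, b: int) -> int:
    """Bitwise EQU (XNOR): each output bit is 1 where the input bits agree.

    Built from five NAND steps: four to form XOR, one more to invert it.
    """
    n1 = nand(a, b)       # NOT (a AND b)
    n2 = nand(a, n1)      # NOT (a AND NOT b)
    n3 = nand(b, n1)      # NOT (b AND NOT a)
    xor = nand(n2, n3)    # a XOR b
    return nand(xor, xor) # NOT (a XOR b)


if __name__ == "__main__":
    # Format the result of one example as a 32-bit binary string.
    print(format(equ(0b1100, 0b1010), "032b"))
```

The chained construction hints at why a function of this kind might be hard to reach without rewarded intermediates: several simpler operations must be assembled in the right order before any correct output appears.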
The specific questions we address here include:

1) Is a complex trait disproportionately more likely to re-evolve in populations that previously possessed it and lost it during the extinction episode? Or is its post-extinction evolution essentially independent of its evolution before the extinction?

2) Did the complex trait re-evolve preferentially in populations where a high proportion of individuals possessed the function? Did any surviving individuals descending from the pre-extinction organism that first performed that trait occur in the immediate post-extinction population, even if they had lost the trait? If so, what happened to these descendents during the recovery?

3) Did organisms from the end of a press extinction experiment that possessed the complex trait usually arise in a lineage where their immediate pre-extinction ancestor performed that function? Did the trait tend to re-evolve in lineages that included it in their pre-extinction history, even if the immediate pre-extinction ancestor did not perform it (i.e. secondary simplification)? Or did the lineage of the organism in which the complex trait re-evolved often not include the trait in its pre-extinction evolutionary history at all?

4) When the complex trait re-evolves after the extinction, is its expression mechanistically similar to the mechanism of expression that existed before the extinction episode? How much of the ancestral mechanism (if any) is preserved and re-used when the trait is re-evolved? How much does expression of the complex trait overlap with that of other traits?

METHODS

Experimental platform

All replicates are performed on a Beowulf cluster made up of Intel Pentium III, IV and AMD Athlon processors, using Avida v. 2.1. Organism execution trace files and genotype-phenotype maps are generated, and Huang’s (2005) information criterion is calculated, with Avida v. 2.3.1, using output files produced during the runs of the main experiments.
Configuration files are available at http://myxo.css.msu.edu/papers/. All statistical analyses are performed using the Statistics Toolbox in MATLAB v. 7.1 (The MathWorks Inc., Natick, MA, USA). Monte Carlo simulations (see below) and generation of execution trace plots are also performed in MATLAB; execution trace plots are generated using a script written by Matthew Rupp, of the Computer Science and Engineering Department at Michigan State University.

Experimental setup

The data used here come from the same experiments described in Chapter 1 of this thesis, with the same trophic cascade environment (see Figure 1.1). We use only the following experimental treatments:

1) Pre-extinction evolution. Each replicate is run for 100,000 updates of evolutionary time (an update is an arbitrary unit of time in which each individual in the population executes, on average, 30 instructions). This is followed by an ecological phase of 100,000 updates, where mutation is turned off, allowing sorting of the highest-fitness organisms that can co-exist with minimal ecological overlap.

2) Press/recovery treatment. Each replicate is run for 100,000 updates, as above. Resource inflows are then lowered by two orders of magnitude for 5000 updates (the press treatment), then restored to pre-extinction levels for a subsequent 100,000 updates of evolution, followed by a 100,000-update ecological phase.

The full state of the population, including lineages of extant organisms, is saved at three key time points: just prior to the onset of extinction, at the end of the extinction episode, and at the end of the recovery period. The experiments are then re-run with abbreviated recovery times of 1000, 2000, and 5000 updates, in order to determine the fates, during the early phases of recovery, of clades arising from the pre-extinction EQU progenitors (see below).
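The press/recovery treatment can be summarized as a sequence of phases with cumulative end times. The sketch below is a hypothetical Python representation, not an Avida configuration file: the names `Phase`, `inflow_multiplier` and `mutations_on` are invented here, the phases are assumed to run back to back, and mutation is assumed to remain on during the press.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Phase:
    name: str
    updates: int              # duration in Avida updates
    inflow_multiplier: float  # resource inflow relative to pre-extinction level
    mutations_on: bool        # assumed True except in the ecological phase


def press_recovery_schedule(recovery_updates: int = 100_000) -> List[Phase]:
    """Phases of the press/recovery treatment as described in the text."""
    return [
        Phase("pre-extinction evolution", 100_000, 1.0, True),
        Phase("press", 5_000, 0.01, True),  # inflow lowered by two orders of magnitude
        Phase("recovery", recovery_updates, 1.0, True),
        Phase("ecological phase", 100_000, 1.0, False),
    ]


def phase_end_times(schedule: List[Phase]) -> Dict[str, int]:
    """Cumulative update at which each phase ends, assuming no gaps between phases."""
    ends: Dict[str, int] = {}
    elapsed = 0
    for phase in schedule:
        elapsed += phase.updates
        ends[phase.name] = elapsed
    return ends
```

Under these assumptions, the three saved time points fall at updates 100,000, 105,000 and 205,000, and the abbreviated recoveries end at updates 106,000, 107,000 and 110,000 (for example, `phase_end_times(press_recovery_schedule(1_000))["recovery"]`).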
Contingency of re-evolution of EQU

Our first step in analysis is to divide the replicates into the five categories shown in Figure 2.1 (see legend for description). In addition to excluding from most analyses replicates where EQU is not lost, we also exclude a single population where EQU re-evolves, but no EQU-capable (EQU+) organisms are retained through the ecological filtering phase. With the remaining four categories, we perform a simple G-test contingency analysis (Sokal and Rohlf 1995) to determine whether or not the tendency of EQU to evolve after the extinction episode is independent of its prior evolution.

Figure 2.1. Schematic illustration of categorization of replicate populations based on re-evolution of EQU in the press extinction experiments. Time goes from earlier (at left) to later (at right). The press period is indicated by the grey rectangle. Hatched rectangles indicate periods of time where organisms that perform EQU are present in the population. Replicates are divided into the following categories:
I. Those where EQU evolves prior to the extinction episode, is lost during the extinction episode, and subsequently re-evolves.
II. Those where EQU evolves prior to the extinction episode, but fails to re-evolve afterwards (NRE).
III. Those where EQU does not evolve.
IV. Those where EQU does not evolve prior to the extinction episode, but evolves after (or more rarely, during) the extinction episode.
V. Those where EQU evolves, and is not lost during the extinction episode (ENL). These are excluded from most analyses, but are included for determination of EQU functional sites (see Appendix A2.5).
[Figure 2.1 image: panels (a) to (e); press period marked; horizontal axis TIME.]

Calculations for pre-extinction demographic metrics

We first compare replicates where EQU re-evolves with those where it does not, in order to ascertain whether there are consistent differences between the pre-extinction populations in i) the percentage of the population that performed EQU, and ii) the total number of times the EQU function is performed across the population.

We first tabulate, from the immediate pre-extinction populations, the number of organisms capable of performing EQU, and calculate what percentage of the population they comprise (nonviable organisms are excluded from the calculation). All percentages are arcsine-square-root transformed for statistical analysis. Our expectation is that replicates where EQU re-evolves have a larger percentage of the pre-extinction population performing EQU (a larger pool of potential ancestors). We also calculate the total level of expression of EQU across the populations at this time. Using the Lilliefors test in MATLAB, we verify that the values in each category (re-evolved vs. not re-evolved) do not deviate significantly from a normal distribution. In making this comparison, we implicitly assume that high total expression of EQU reflects the evolution of mechanistically robust versions of the function that are less likely to be completely erased by adaptation to press conditions (Chapter 1). Retention of this genetic “memory” would then facilitate subsequent re-evolution of EQU (which we later investigate in more detail).

Tracing of lineages and clades

We are also interested in whether or not EQU is more likely to re-evolve in replicates where the population at the end of the extinction episode contains a large number of descendents of pre-extinction organisms that performed EQU. Our guiding hypothesis is that lineages where EQU evolves following an extinction episode will usually descend from those that had evolved it previously. In order to simplify analyses, we select and trace the lineages of organisms able to perform EQU from simplified, ecologically stable populations.
This requires the following steps:

First, for each population where EQU evolves prior to the extinction, we trace the lineages of EQU-performing (EQU+) organisms from the pre-extinction ecologically stable populations back to the seed ancestor. In each lineage, we identify the first ancestral genotype able to perform the function (the EQU “progenitor”). This is usually the first genotype to have evolved the function in that replicate. There are, however, a number of cases where the time of origin of that ancestral genotype is later than the recorded time of first appearance of the function, indicating displacement of a previous EQU-performing clade during pre-extinction evolution.

Second, for each population where EQU re-evolves, we trace back the lines of descent of EQU-performing organisms from the end-recovery, ecologically stable populations. In each lineage, we note whether or not EQU is present anywhere along the line of descent in the pre-extinction period. We sub-categorize these populations into three classes, based on the presence of EQU in the line of descent (shown diagrammatically in Figure 2.2):

Category IA: The immediate pre-extinction ancestor is EQU+ (Figure 2.2a).
Category IB: The immediate pre-extinction ancestor is EQU-, but EQU is present somewhere along the line of descent, having been lost at some point prior to the extinction episode (Figure 2.2b).
Category IC: All genotypes along the line of descent during the pre-extinction period are EQU- (Figure 2.2c).

In cases where multiple organisms with distinct phenotypes are EQU+ (in addition to other functions), we perform lineage traces on all of them in order to verify whether or not EQU originated multiple times.

Figure 2.2. Schematic illustrations of types of re-evolution of EQU. Each circle represents a single genotype along the line of descent. Open circles are genotypes which do not perform EQU; filled circles are genotypes that perform EQU. “X” marks extinction of a lineage. Shaded areas indicate the low-resource press period.
a) Category IA. EQU re-evolves in a lineage where it occurred prior to the press episode. The immediate pre-extinction ancestor expresses EQU, and EQU was lost during the press episode.
b) Category IB. EQU re-evolves in a lineage where it occurred prior to the press episode. EQU expression was lost in the lineage sensu stricto prior to the press, but may have persisted in a sister clade into the press period.
c) Category IC. EQU re-evolves in a lineage where it was not performed prior to the press episode. Other lineages that performed EQU either go extinct during the press, or fail to re-evolve EQU if they persist into the recovery phase.
d) Not re-evolved. No lineage that survives the press episode re-evolves EQU.
[Figure 2.2 image: lineage diagrams for panels (a) to (d).]

We also note the highest-level trophic function (see Figure 1.1) the last pre-extinction ancestor performs, if it is not EQU. In some cases, the last pre-extinction ancestor is clearly a recent, degenerate descendent of a genotype with higher-level functionality, and so is scored with that ancestral genotype’s functionality instead.

Once we have identified the pre-extinction EQU “progenitor”, we then perform a clade trace. This determines whether or not a given EQU progenitor has any descendents remaining at some future time (relative to its own origin), and if so, how many (whether or not they can still perform EQU). We check for the number of descendents just before the onset of extinction, at the end of the extinction episode, and also at 1000, 2000, and 5000 updates into the recovery.
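The lineage classification and clade trace described above can be sketched in a few lines. This is a hypothetical Python illustration rather than the tools actually used: the `Genotype` record, its field names, and the rule that the immediate pre-extinction ancestor is the last genotype in the lineage to arise before the press begins are all assumptions made here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Genotype:
    gid: int
    parent: Optional[int]  # None for the seed ancestor
    born: int              # update at which the genotype arose
    equ: bool              # True if the genotype performs EQU


def line_of_descent(genotypes: Dict[int, Genotype], gid: int) -> List[Genotype]:
    """Lineage from the seed ancestor down to the focal genotype."""
    lineage: List[Genotype] = []
    current: Optional[int] = gid
    while current is not None:
        genotype = genotypes[current]
        lineage.append(genotype)
        current = genotype.parent
    lineage.reverse()
    return lineage


def equ_progenitor(lineage: List[Genotype]) -> Optional[Genotype]:
    """First genotype along the lineage that performs EQU, if any."""
    return next((g for g in lineage if g.equ), None)


def classify_reevolution(lineage: List[Genotype], press_start: int) -> str:
    """Assign Category IA, IB or IC from the pre-extinction part of a lineage."""
    pre_extinction = [g for g in lineage if g.born <= press_start]
    if pre_extinction[-1].equ:
        return "IA"  # immediate pre-extinction ancestor is EQU+
    if any(g.equ for g in pre_extinction):
        return "IB"  # EQU present earlier, lost before the press
    return "IC"      # EQU never present before the press


def clade_size(genotypes: Dict[int, Genotype], progenitor_gid: int,
               abundances: Dict[int, int]) -> int:
    """Number of living organisms descended from the given progenitor."""
    return sum(
        count
        for gid, count in abundances.items()
        if any(g.gid == progenitor_gid for g in line_of_descent(genotypes, gid))
    )
```

Here `abundances` maps each genotype present at a chosen time point to its number of living organisms, so calling `clade_size` once per saved population gives the descendent counts at each time point.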
All tracing of clades and lineages is performed using software tools built into Avida, though identification of (putative) ancestral genotypes requires scanning output by eye, using conventional text editors and spreadsheet programs.

“Replay” experiments with end-extinction organisms

We identify a number of cases where, although the pre-extinction EQU “progenitor” has a sizeable number of EQU- descendent organisms (i.e. a clade) at the end of the extinction episode (see Results), EQU either fails to re-evolve, or re-evolves in lineages from other clades where EQU has not evolved before the extinction episode (Category IC). To determine if EQU is actually incapable of re-evolving in descendents of these survivors, we select, from the end-extinction clade in each population, the most common surviving organism (in the case of ties, we pick the organism that historically had the most copies), and use it to inoculate a new evolutionary process (Yedid and Bell 2002); we refer to these as “replay” experiments. We seed twenty replays with each organism picked this way. For replicates in Categories IA, IB, and IC, we also seed replays with the known end-extinction ancestor organism from the lineage where EQU actually re-evolves, excluding cases in Category IA and IB replicates where the known ancestor is also the most abundant surviving organism in the pre-extinction EQU clade.

For each replay trial, we record whether or not EQU re-evolves, and if it does, the approximate time of its appearance in Avida updates (approximate because of the frequency of data sampling). In order to maintain a balanced experimental design, if EQU does not re-evolve, the time is taken to be the maximum length of the replay experiment. Because of this substitution, and because the time data are pathologically (often bimodally) distributed, nonparametric tests are employed for statistical analysis of time-related metrics.
To analyze the number of re-evolutions, we perform post-hoc multiple comparison tests with Tukey-Kramer HSD correction if the initial one-way ANOVA or Kruskal-Wallis test is significant. To compare the mean ranks of times of re-evolution between replays seeded with the actual ancestors and those seeded with the most abundant survivors of the pre-extinction EQU clade, we employ Wilcoxon rank-sum tests.

We also analyze whether or not the number of successful re-evolutions in the replays is homogeneous among different replicate populations in the same re-evolution class. Since EQU re-evolves in all replays seeded with certain founder organisms, we do not employ standard chi-square or G-based heterogeneity tests (Sokal and Rohlf 1995). Instead, we first calculate the variance of the fraction of successful re-evolutions among replicates within a given re-evolution class. We then perform Monte Carlo simulations where we generate, for each re-evolution class, 10,000 pseudo-replicate data sets with both the same replication structure and the same total number of successes and failures as the actual replay tests (sampling without replacement). Variances of success ratios are then calculated as described above for each set of pseudo-replicates, and the resulting values are used to form distributions that serve as the basis for significance tests. Observed variances for the actual replay experiments are deemed statistically significant if they lie within (or beyond) the 2.5% tails of the Monte Carlo distributions.

Ecological position of pre-extinction ancestors

The trophic cascade (Figure 1.1) allows us to consider re-evolution of EQU in an ecological context. We want to determine whether organisms of a particular trophic position tend to be the ancestors of organisms that re-evolve EQU following the extinction. We note the highest-level function performed by each of the immediate pre-extinction ancestors for each population, and derive an ecological scoring scheme.
i) Organisms that can only perform the functions NOT or NAND (or any combination of these) are “Level 0”;
ii) Organisms whose highest-level functions are AND, ORN, or OR are “Level 1”;
iii) Organisms whose highest-level functions are ANDN, NOR, or XOR are “Level 2”;
iv) Organisms whose highest-level function is EQU are “Level 3”.

Functional genomics of EQU and retention of ancestral functional sites

Digital organisms have been analyzed previously through functional genomic studies involving single-instruction knockout mutations (Lenski et al. 2003). We use these techniques to determine how much of the ancestral mechanism for performing EQU is retained in organisms that re-evolved it (representational diagrams of the steps taken are in Appendix 2.8). We first generate genotype-phenotype maps (Lenski et al. 2003) for either the immediate pre-extinction ancestors (Category IA replicates) or the last EQU+ pre-extinction ancestor (Category IB replicates), in order to assess which sites in the genome are functionally significant for performing EQU (sites affecting viability are not counted). This procedure is repeated for both the end-extinction ancestor and the first post-extinction EQU+ organism. Lineage traces for these focal organisms are then redone with sequence alignment, allowing location of the specific positions and identities of these sites, even in the presence of insertion/deletion mutations. We then identify which functional sites for EQU have remained completely conserved between the pre-extinction ancestor and the end-extinction and post-extinction descendents. For Category IC and NRE replicates, we repeat these procedures using the (aligned) lineage of the most abundant survivor of the pre-extinction EQU clade, though we can only go as far as the end of the extinction episode.
As a control, we also examine the replicates where EQU was not lost during the extinction episode (ENL), in order to see how much change in the genetic basis of EQU can occur under that circumstance.

Once all sites involved in coding for EQU have been identified in the pre-extinction organisms, we calculate the percentage of sites remaining unchanged in organisms from the end of the extinction episode. For replicates in Categories IA and IB, we also examine the percentage of these unchanged sites that participate in the re-evolved EQU. We only count sites that are identical by descent; we do not include sites that mutate away from, and then back to, the instruction present in the pre-extinction organism. For statistical analyses, all percentages are arcsine-square-root transformed.

For each pre-extinction ancestor, we also count the number of EQU knockout mutations that also eliminate two or more other functions, normalizing them by the total number of EQU knockouts. This provides a first approximation of how much expression of EQU overlaps with expression of other functions. For cases with only two functions, we count how many knockouts eliminate both. We exclude a single NRE replicate where the organism only performs EQU. Because the data are highly non-normally distributed in several replicate classes, nonparametric methods are again employed for statistical analysis.

We also investigate in detail two case studies of re-evolution of EQU within a single lineage. We provide two types of visual representation of the digital organisms. The first is a genotype-phenotype map derived from single-instruction knockout mutations (Lenski et al. 2003). Each individual instruction in the genome is systematically replaced by a null mutation, and the resulting effect on the organism’s phenotype is assayed and recorded. The output is shown as a table with different colours indicating loss or gain of function with each knockout.
The second is a graphical execution map developed by co-author Matthew Rupp. This latter representation is very useful since it permits a visual assessment of execution flow within a given organism, in a way not obvious from the genotype-phenotype map (see Figures 2.9, 2.11, 2.13, and 2.15 in Results). In particular, the organism’s copy loop, a section of code that is executed repeatedly in order for the organism to copy itself, is clearly indicated in this type of diagram. Sites that are critical for execution of ecological functions usually become consolidated in the copy loop, since functions can be rewarded more than once under the conditions used here. Thus, this general feature is shared by many organisms from different replicate populations, though the function combinations contained within the loop vary considerably both within and between populations, and can change over time within a single lineage.

RESULTS

Contingency of re-evolution

EQU is significantly more likely to re-evolve following the extinction episode in populations where it evolved previously (Table 2.1). Populations fall into the following categories: i) 52/100 evolved EQU, lost it, and re-evolved it; ii) 26/100 evolved it before the extinction episode but failed to re-evolve it afterwards; iii) 3/100 evolved it only during or after the press episode; iv) 10/100 never evolved it. Nine replicates are excluded from analysis as noted in the Methods, for a total sample size of 91. Nonetheless, we can reject the null hypothesis that re-evolution of EQU is independent of its prior evolution (corrected G = 8.46, critical χ² (1 df, 0.05) = 3.84, p < 0.005). There is a clear tendency for EQU to re-evolve in populations where it evolved before.

Table 2.1. Association between evolution of EQU before press extinction, and its subsequent evolution or re-evolution. See text for statistical analysis.

                                    EQU evolved or re-evolved after press?
EQU evolved before extinction?      Y       N       Total
Y                                   52      26      78
N                                   3       10      13
Total                               55      36      91

Effect of pre-extinction demography on re-evolution

The connection between the re-evolution of EQU and the pre-extinction demographic measurements of EQU expression is tenuous. On average, in pre-extinction populations of replicates where EQU re-evolves, 21.35% (±2.1, 2 SE; back-transformed value) of individuals in the population perform the function. In replicates where EQU did not re-evolve, an average of 18.24% (±2.07, 2 SE; back-transformed value) of individuals in the pre-extinction population perform EQU. The difference, while not large, is statistically significant (unequal-variance t = 2.07, one-tailed p = 0.022).

The total level of functional expression also has a weak (but significant) relationship with re-evolution of EQU. Populations where EQU re-evolves have, on average, a pre-extinction level of expression of 68038.37 (±9849.32, 2 SE) total executions, while populations where EQU does not re-evolve have an average expression level of 53176.65 (±12139.1, 2 SE) total executions. Despite the large variation in each group, the data are not significantly different from normal (Lilliefors statistic for re-evolved populations = 0.0982, critical value = 0.1229, p > 0.05; Lilliefors statistic for not re-evolved populations = 0.0849, critical value = 0.1706, p > 0.05). The difference between the two groups is again statistically significant, but small (equal-variance t (0.05, 76 df) = 1.816, one-tailed p = 0.032).

Ecological position of pre-extinction ancestors

In replicates where EQU re-evolves, a total of 41 out of 52 feature an immediate pre-extinction ancestor that is EQU- (Table 2.2). The immediate pre-extinction ancestor is a Level 0 (primary producer) organism in 4/52 replicates. In 21 out of 52 replicates, an organism with an ecological function on Level 1 is the immediate pre-extinction ancestor. These organisms usually perform the functions OR, ORN, or both.
Ancestral organisms that can perform more difficult functions (Levels 2 or 3) are obtained in 16/52 replicates and 11/52 replicates, respectively. There is a high occurrence of EQU re-evolving in lineages where all pre-extinction ancestors are EQU-. Of the 21 replicate populations containing a Level 1 pre-extinction ancestor, 4 of 21 are Category IB (EQU is previously present in the lineage), and the Level 1 ancestral phenotype arises via secondary simplification. The other 17 of 21 populations are Category IC, where EQU is never present in the lineage of the pre-extinction ancestor. By contrast, in the 16 populations where EQU re-evolves from an organism that expressed a Level 2 function just before the extinction episode, 7 of 16 are Category IB, and 9 of 16 are Category IC. Taken with the previous results, these demonstrate that most of the organisms where EQU re-evolves following the extinction do not descend from the pre-extinction incumbent EQU group, and many do not descend from any previous EQU group.

Table 2.2. Highest-level functions of immediate pre-extinction ancestors of organisms in which EQU re-evolved following the press episode.

Highest-level function performed      Trophic     Number of
by last pre-extinction ancestor       level       replicates (n=52)
NOT, NAND                             0           3
AND, ORN                              1           1
AND, OR                               1           1
ORN, OR                               1           6
XOR                                   2           8

Highest observed trophic level        Number of replicates
Level 0                               4
Level 1                               21
Level 2                               16
Level 3                               11

Functional degradation during press period

The press period often results in functional degradation of most of the organisms in the population, which is more pronounced on higher trophic levels (Table 2.3). The simple functions NOT and NAND tend to be robust through the press, or are lost only transiently and re-evolve quickly. These two functions are present in all 100 replicate populations both just prior to, and at the end of, the extinction episode.
The Level 1 functions ORN and OR also evolve before the extinction episode in all replicates, and are present at the end of the extinction episode in 92 of 100 and 61 of 100 populations, respectively. More difficult functions (including the low-value function AND) are present in fewer than half the populations at the end of the extinction episode. The most difficult functions, XOR and EQU, evolve in only 88 of 100 and 87 of 100 populations, respectively, before the onset of extinction. Of these, only 9 of 88 (for XOR) and 8 of 87 (for EQU) populations retain these functions through the extinction episode.

Table 2.3. Number of replicates in which the listed function was evolved, and subsequently either retained through, or re-evolved during, the press.

Function    Number of replicates evolving    Number of replicates where function
            function pre-extinction          is present at end of press episode
NOT         100/100                          100/100
NAND        100/100                          99/100
AND         100/100                          29/100
ORN         100/100                          92/100
OR          100/100                          61/100
ANDN        100/100                          29/100
NOR         100/100                          29/100
XOR         88/100                           9/88
EQU         87/100                           8/87

Genealogical influence on re-evolution of EQU

We ascertain the pre-extinction ancestry of EQU by tracing the lineages of the EQU+ organisms from simplified, ecologically stable populations back to the original ancestor (see Appendix A2.1). Many replicates are Category IC, where there are no EQU+ ancestral genotypes in the pre-extinction section of the lineage. Of the replicates where EQU does re-evolve, 11 are Category IA, 10 are Category IB, and 31 are Category IC (see Figure 2.2 for definitions). In all of these cases except one, we find that EQU originated only once. Even when there are multiple EQU+ ecologically stable organisms, all of these organisms have a post-extinction ancestor that is also EQU+. In the one case (r10500) where we did find multiple post-extinction origins of EQU, each EQU-performing lineage descends from the same immediate pre-extinction ancestor.
In order to find the EQU+ “progenitor” genotype for each replicate, we repeat this procedure using the pre-extinction ecologically stable populations. We again find only a single replicate (r2100) where EQU had originated multiple times, and only one of these progenitors has surviving descendents at the end of the extinction episode. The number of descendents of the EQU “progenitor” genotype present just at the end of the extinction episode (recall that the replicates are categorized based on what happens after the press) are shown for each replicate class in Table 2.4. (Also recall that replicates categorized as IA and IB cannot be zero because they must have descendents of the EQU “progenitor” at the end of the extinction episode, since it had already been determined that the post-extinction EQU+ organisms from these replicates had EQU+ pre- extinction ancestors.) In 19 of 31 Category IC replicates, and 13 of 26 replicates where EQU did not re-evolve (NRE), the EQU “progenitor” organism has no descendents at the end of the extinction episode, indicating genuine extinction of those clades. In all classes, the number of surviving descendents (where there were any) of the pre-extinction EQU “progenitor” ranged from a handful to most of the population, even in some Category IC and NRE populations. There are significant differences between classes if replicates with no surviving EQU clade descendents are included (Table 2.5a; Kruskal- Wallis x2 (0,05, 3 d9 = 24.94, p < 0.0001). However, when these replicates are excluded, there are no significant differences between classes in number of surviving descendents (Table 2.5b; Kruskal-Wallis x2 (0,05, 3 dt) = 5.82, p = 0.12). Illustrative examples of the fates of these survivors of the extinction episode are described in Appendix A22. 105 Cat. IA Cat. IB Cat. 
IC NRE (n=11) (n=10) (n=31) (n=26) 6 4 0 0 645 12 0 0 1950 13 0 0 2777 45 0 0 2827 596 0 0 2908 1048 0 0 2936 1126 0 0 2982 2919 0 0 3046 3055 0 0 3077 3085 0 0 3095 0 0 0 0 0 0 0 2 0 16 0 122 0 520 0- 1231 0 1876 0 1986 8 2057 17 2075 23 2461 131 2738 141 2930 451 2952 1863 2796 2835 2892 2975 Table 2.4. Number of descendent organisms of the pre-extinction EQU progenitor genotype remaining at the end of the extinction episode. Each datum represents a single replicate community in the relevant replicate class (total n=100). 106 Table 2.5. Kruskal-Wallis tests for differences in number of end-extinction survivors of the pre-extinction EQU clade among re-evolution classes defined in Figure 2.2. a) Test for all replicates that re-evolved EQU. Source SS df MS Chi-sq Prob>Chi—s¢L Class 11836 3 3945.3 24.94 < 0.00002 Error 24712 74 333.9 Total 36548 77 b) Test excluding replicates where the pre-extinction EQU clade went extinct Source SS df MS Chi-sq Prob>Chi-sq Class 1004.3 3 334.77 5.82 > 0.121 Error 6585.7 41 160.63 Total 7590.0 44 “Replay” experiments—differential tendency for qualitative re-evolution of EQU There are no significant differences among replicate classes in the number of replay experiments that re-evolve EQU when the most abundant survivors of the pre- extinction EQU clade are used to seed the replays (Table 2.6, Figure 2.3, Kruskal-Wallis x2 (005,3 d0 = 1.697, p > 0.63 8), nor when the actual end-extinction ancestor organisms are used as replay seeds (Table 2.7, Figure 2.4, Kruskal-Wallis x2 (0,05, 3 d0 = 5.97, p > 0.11). Replays seeded with the most abundant survivors of the pre-extinction EQU clades from NRE replicates are not demonstrably worse at re-evolving EQU than those seeded with ancestors from any of the other classes. 107 20 ~ + .—.———, fi— 4 l l D l x 53 18 . 1 g 3 E 16 _ K / 1 | I .. 2 / 1 g _ —\\ 1‘ .1 K E 14 ~ , 2— - a. \ / c I . 1% 12 F I a. _—I_J ___r_11 _ 25. . . g 10 — ,' '. —1— I ~ >. " 1 _L_ E x \ g 8 — ,1 15 ,_ 4 3 ‘ 1 E 6 _ _. 
___l___3 .4 3 z | 4 L I _ _J_ Cat. IA Cat. IB Cat. lC NRE Figure 2.3. Box plot of successful re-evolutions of EQU, in replay populations seeded with the most abundant end-press genotype of the pre-extinction EQU clade. X axis is replicate class, Y axis is number of replays that re-evolved EQU. Red lines in centre of boxes are sample medians; upper and lower lines of box are 25:11 and 75th percentiles of samples; intervals show span of samples (excluding outliers); notches are graphic confidence interval about the sample median. Red crosses indicate outliers in the sample. IMAGES ARE PRESENTED IN COLOUR. 108 20 _ fi— f I. T‘ __,_2, _|___ _ D (——L““ I I I I 8 18 ~ I / I I I ‘ .2 III I II I/ 2 16 _ ., _/_ I j It I‘ .1 o I I \ I i | I I \ / \ / h 14 ~ I I I I— ’ > # I 3 8 __3_. I” I .2 I I / I I 33' 12 " —J— I ’II I / \I‘ _ 3 I I—fiI I I. g: 10 r I I I 4 E I I I I o. I I I g 3 ‘ —I— I I I - g | I 9 .. : : — z | I 4 - I I ~ _L _1_ Cat. IA Cat. IB Cat. 10 Cat. IC_EE Figure 2.4. Box plot of successful re-evolutions of EQU, in replay populations seeded with the actual end-press ancestor of the genotype that re-evolved EQU. “Cat. IC_EE” are Category IC replicates where the pre-extinction EQU clade was extinct by the end of the press episode. X axis is replicate class, Y axis is number of replays that re-evolved EQU. Box plot elements are as in Figure 2.3. IMAGES ARE PRESENTED IN COLOUR. 109 Table 2.6. Kruskal-Wallis test for differences in successful re—evolutions of EQU, in replay populations seeded with the most abundant end-extinction organism of the pre- extinction EQU clade. Source SS df MS Chi-sq Prob>Cbi-sq Class 286.15 3 95.383 1.697 > 0.638 Error 7133.40 41 173.980 Total 7419.50 44 Table 2.7. Kruskal-Wallis test for differences in successful re-evolutions of EQU, in replay populations seeded with the actual end-extinction ancestor of the genotype that re- evolved EQU after the press. 
Source    SS         df    MS        Chi-sq    Prob>Chi-sq
Class     1192.60     3    397.53    5.97      > 0.11
Error     8393.90    45    186.53
Total     9586.50    48

This pattern holds even in those replicates where, in the original experiment, a large number of these descendents persist into the recovery without re-evolving EQU. Visual inspection of the data (Figure 2.3, Appendix A2.3.1b) indicates that Category IB has the largest between-replicate variability, containing both the most and least successful groups of replays. Heterogeneity among replicates in successful re-evolution of EQU is significant using distributions of variances derived from Monte Carlo simulations (Table 2.8) for Categories IA (variance = 0.0244, p = 0.0023) and IB (variance = 0.1058, p < 0.0001). The observed variance in successful re-evolution in replays is not significant for Category IC (variance = 0.0167, p = 0.1391) but is significant for NRE, driven largely by a single highly successful ancestor in this class (variance = 0.0224, p = 0.013).

Table 2.8. Results of resampling analyses to assess heterogeneity, among founding organisms, of successful re-evolutions of EQU in replay populations seeded with the most abundant end-press survivor of the pre-extinction EQU clade. P-values are based on 10,000 Monte Carlo simulations, where samples were drawn to preserve the total number of successes and failures between founders.

Replicate    Total #      Total #     Observed variance in       Resampling
class        successes    failures    proportion of successes    p
IA           167          53          0.0244                     0.0023
IB           129          71          0.1058                     < 0.0001
IC           148          72          0.0167                     0.1391
NRE          180          80          0.0224                     0.0130

In replay populations seeded with the actual end-extinction ancestors, observed variances of successful re-evolution are significantly more heterogeneous than expected for all classes (Table 2.9).
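The resampling analysis summarized in Table 2.8 can be sketched as follows. This is a minimal illustration in Python, not the code used in this study. It assumes that every founder seeds the same number of replays (20 here), pools all replay outcomes, reshuffles them among founders so that the total numbers of successes and failures are preserved, and asks how often the simulated variance in the proportion of successes is at least as large as the observed one. The function name, the use of the population variance, and the example counts are assumptions made for illustration only.

```python
import random
from statistics import pvariance


def heterogeneity_p_value(successes, replays_per_founder=20, n_sims=10_000, seed=0):
    """Monte Carlo test for heterogeneity among founders.

    successes: number of replays that re-evolved the trait, one entry per founder.
    Returns (observed variance of per-founder success proportions, p-value).
    """
    rng = random.Random(seed)
    n_founders = len(successes)
    total_success = sum(successes)
    total_trials = n_founders * replays_per_founder

    def variance_of_proportions(counts):
        return pvariance([c / replays_per_founder for c in counts])

    observed = variance_of_proportions(successes)

    # Pool every replay outcome; shuffling keeps the totals of 1s and 0s fixed.
    pooled = [1] * total_success + [0] * (total_trials - total_success)
    at_least_as_extreme = 0
    for _ in range(n_sims):
        rng.shuffle(pooled)
        counts = [
            sum(pooled[i * replays_per_founder:(i + 1) * replays_per_founder])
            for i in range(n_founders)
        ]
        if variance_of_proportions(counts) >= observed:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_sims


# Hypothetical per-founder success counts (out of 20 replays each), not data from this study.
example_counts = [20, 19, 3, 1, 20, 18, 2, 20, 6, 20]
variance, p = heterogeneity_p_value(example_counts, n_sims=2_000)
print(f"observed variance = {variance:.4f}, resampling p = {p:.4f}")
```

A small p-value under this scheme indicates that founders differ more in their success rates than would be expected if every replay in the class had the same chance of re-evolving EQU.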
We further analyze in this manner replays seeded with organisms from Category IC replicates where the pre-extinction EQU clade went extinct during the extinction episode (IC_EE), and find significant heterogeneity of variance here as well (Category IC_EE variance = 0.0757, p < 0.0001).

Table 2.9. Results of resampling analyses to assess heterogeneity, among founding organisms, of successful re-evolutions of EQU in replay populations seeded with the actual end-press ancestor of the genotype that re-evolved EQU. “IC, EQU extinct” are those Category IC replicates where the pre-extinction EQU clade was extinct by the end of the press. P-values are based on 10,000 Monte Carlo simulations, where samples were drawn to preserve the total number of successes and failures between founders.

Replicate          Total #      Total #     Observed variance in       Resampling
class              successes    failures    proportion of successes    p
IA                 164          36          0.0234                     0.0004
IB                 135          25          0.0753                     < 0.0001
IC                 155          65          0.0917                     < 0.0001
IC, EQU extinct    251          149         0.0796                     < 0.0001

The significant results indicate that history does play some role in the re-evolution of EQU within a replicate class. Even in the same conditions (same selective treatment, and similar type of pre-extinction EQU ancestry), chance factors (the random seeds used for the replays) produce historical constraints on adaptation in different replicate populations within a class. However, certain individual ancestors show a definite bias (either positive or negative) towards re-evolving EQU, regardless of what chance factors they encounter.

Since we have the lineages of the organisms where EQU re-evolves following the extinction episode, we can determine unequivocally the identity of the ancestor genotype from the end of the extinction episode that led to this re-evolution.
The actual ancestor is not always superior to other survivors of the pre-extinction EQU clade for re-evolving EQU in replay populations, again emphasizing the role of chance in the re-evolution of EQU. Appendix A2.3 details specific cases where the actual ancestor was deficient in this regard. Such examples illustrate that the choice of clone is very important in whether or not EQU re-evolves in the replays.

“Replay” experiments: differential time to re-evolution of EQU

Even in replays seeded with the same ancestor, the time required to re-evolve EQU can be highly variable. There are significant differences between classes in mean ranks of re-evolution time in replays seeded with the most abundant EQU clade survivor (Table 2.10, Kruskal-Wallis χ² (0.05, 3 df) = 11.348, p < 0.01), despite the large number of tied ranks due to coding for no re-evolution. Multiple comparison tests using Tukey-Kramer HSD correction showed that Category IA ancestors have the lowest overall mean rank and differ significantly only from IC. All other classes overlap with each other in mean rank of re-evolution time (Figure 2.5, Table 2.12). There are significant, and stronger, differences between the actual ancestor replays, including Category IC_EE replicates (Table 2.11, Kruskal-Wallis χ² (0.05, 3 df) = 122.44, p << 0.0001). Categories IA and IB do not differ significantly after HSD correction (Figure 2.6, Table 2.13). The two types of IC replicates differ significantly from IA and IB, and from each other.

Table 2.10. Kruskal-Wallis table for differences among replicate classes in time needed to re-evolve EQU, in replay populations seeded with the most abundant surviving organism from the pre-extinction EQU clade.

Source    SS             df     MS             Chi-sq     Prob>Chi-sq
Class     6.49 × 10^5      3    2.16 × 10^5    11.348     < 0.01
Error     4.74 × 10^7    836    56652
Total     4.80 × 10^7    839

Table 2.11. Kruskal-Wallis table for differences among replicate classes in time needed to re-evolve EQU in replay populations seeded with the actual end-extinction ancestor of the genotype that re-evolved EQU.

Source    SS             df     MS             Chi-sq     Prob>Chi-sq
Class     9.59 × 10^6      3    3.20 × 10^6    122.44     << 0.0001
Error     6.71 × 10^7    976    68757
Total     7.67 × 10^7    979

[Figure 2.5 plot. Classes shown: Cat. IA, Cat. IB, Cat. IC, NRE; horizontal axis: rank time.]

Figure 2.5. Plot of multiple comparison tests for ranks of time needed to re-evolve EQU in replay populations seeded with the most abundant survivor of the pre-extinction EQU clade. X axis is rank time, Y axis is re-evolution class. Error bars are two standard errors. Letters a-b indicate sets of classes that are NOT significantly different from each other after Tukey-Kramer HSD correction for multiple comparisons.

[Figure 2.6 plot. Classes shown: Cat. IA, Cat. IB, Cat. IC, Cat. IC_EE; horizontal axis: rank time.]

Figure 2.6. Plot of multiple comparison tests for ranks of time needed to re-evolve EQU in replay populations seeded with the actual ancestor of the pre-extinction EQU clade. “Cat. IC_EE” is those Category IC replicates where the pre-extinction EQU clade was extinct by the end of the extinction episode. X axis is rank time, Y axis is re-evolution class. Error bars are two standard errors. Letters a-c indicate sets of classes that are NOT significantly different from each other after Tukey-Kramer HSD correction for multiple comparisons.

Table 2.12. Pairwise comparisons among mean ranks of time needed to re-evolve EQU in replay populations seeded with the most abundant survivor of the pre-extinction EQU clade. Confidence intervals are based on Tukey-Kramer HSD correction for multiple comparisons. Compared sets of replicate classes are significantly different if confidence interval does not include zero.

Comparison    Estimated difference    Lower 95%    Upper 95%
              in mean ranks           HSD CI       HSD CI
IA  IB          -9.160                 -74.342       56.022
IA  IC         -65.692                -125.734       -5.650
IA  NRE        -53.664                -111.464        4.137
IB  IC         -56.532                -120.384        7.320
IB  NRE        -44.504                -106.253       17.246
IC  NRE         12.028                 -44.268       68.324

Table 2.13. Pairwise comparisons among mean ranks of time needed to re-evolve EQU in replay populations seeded with the actual end-extinction ancestor of the organism that re-evolved EQU. Confidence intervals are based on Tukey-Kramer HSD correction for multiple comparisons. Compared sets of replicate classes are significantly different if confidence interval does not include zero.

Comparison    Estimated difference    Lower 95%    Upper 95%
              in mean ranks           HSD CI       HSD CI
IA  IB           63.196                -13.073      139.466
IA  IC         -117.361               -187.615      -47.107
IA  IC_EE      -190.840               -253.114     -128.566
IB  IC         -180.557               -255.270     -105.845
IB  IC_EE      -254.036               -321.299     -186.773
IC  IC_EE       -73.479               -133.836      -13.122

We also compare, for each replicate population, the mean ranks of time needed to re-evolve EQU in replays seeded with each type of ancestor; the Wilcoxon rank-sum test results are summarized in Appendix A2.4. These tests again demonstrate the importance of the particular clone taken from each population in whether EQU re-evolves in a consistently rapid time, or with high variation in time among replay populations seeded with that clone.

Functional genomics of EQU pre- and post-extinction: retention of functional sites

We can identify functional sites required for expression of EQU in the sequences of the actual end-extinction ancestor organisms (for Categories IA, IB, and ENL), or the most abundant survivors of the pre-extinction EQU clade (Category IC and NRE); summary statistics are in Table 2.14. Differences between replicate classes in retention of functional EQU sites through the extinction episode are significant (Table 2.15, one-way ANOVA, F (0.05, 4 df) = 25.84, p << 0.0001).
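Tables 2.14 and 2.15 rely on the arcsine-square root transformation of percentage data, with means and confidence limits reported after back-transformation. The short Python sketch below illustrates the transformation and its inverse, and why a back-transformed mean generally differs from the untransformed mean. It is an illustration only: the function names and the example retention values are assumptions, not taken from the analyses in this study.

```python
import math


def arcsine_sqrt(p):
    """Transform a proportion in [0, 1] to asin(sqrt(p)), in radians."""
    return math.asin(math.sqrt(p))


def back_transform(y):
    """Invert the transform: map a value in [0, pi/2] back to a proportion."""
    return math.sin(y) ** 2


def back_transformed_mean(proportions):
    """Mean taken on the transformed scale, reported on the original scale."""
    transformed = [arcsine_sqrt(p) for p in proportions]
    return back_transform(sum(transformed) / len(transformed))


# Hypothetical fractions of retained functional sites, not data from this study.
retention = [0.09, 0.40, 0.75, 1.00]
raw_mean = sum(retention) / len(retention)
print(f"untransformed mean     = {100 * raw_mean:.2f}%")
print(f"back-transformed mean  = {100 * back_transformed_mean(retention):.2f}%")
```

Under this approach the test statistic is computed on the transformed values, while summary statistics are converted back to percentages for reporting, which is why Table 2.14 lists both untransformed and back-transformed averages.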
EQU functional sites are highly conserved in all replicates where EQU is retained through the extinction episode (“ENL”). EQU sites are perfectly preserved through the extinction episode in 7 of 8 ENL replicates (raw data in Appendix A2.5). Only one replicate out of eight features a lineage from the end-experiment ecologically stable population with any degradation of functional sites. (It should be noted that in this lineage, EQU was de-activated before the extinction episode and re-activated subsequently, with 17 of 19 functional sites remaining by the end of the extinction episode. Additionally, we verified that related lineages descending from the same pre-extinction EQU progenitor retained EQU through the extinction episode with no loss of functional sites. Following the extinction episode, EQU was re-activated in this particular lineage, and it ultimately prevailed through the ecological filtering.) In Category IC and NRE replicates with surviving descendents of the pre-extinction EQU clade, retention of functional sites is highly variable, ranging from 0% to 64.3% (Table 2.14, Appendix A2.5).

Table 2.14. Summary statistics for percentage of ancestral EQU functional sites remaining in focal organisms from the end of the extinction episode.

Class         Average % remaining    Average % remaining    Lower 95% CI          Upper 95% CI
              ancestral EQU          ancestral EQU          (back-transformed)    (back-transformed)
              functional sites       functional sites
              (untransformed)        (back-transformed)
IA (n=11)     37.55                  37.56                  22.82                 53.60
IB (n=10)     61.85                  64.56                  42.86                 83.50
IC (n=11)     25.50                  24.16                  16.31                 33.01
NRE (n=13)    28.54                  25.05                  13.43                 38.85
ENL (n=8)     98.68                  99.83                  98.47                 99.83

Table 2.15. One-way ANOVA for differences, between replicate classes, in percentage of ancestral EQU functional sites remaining in focal organisms from the end of the extinction episode. Data were arcsine-square root transformed.

Source    SS       df    MS       F        Prob>F
Class     6.518     4    1.630    25.84    << 0.0001
Error     3.027    48    0.063
Total     9.545    52
Retention of functional sites is, on average, higher in Category IA and IB replicates, ranging between 18% and 94% of functional sites in IA replicates, though only 2/11 replicates (r5000 and r7000) have better than 50% retention. Retention among IB replicates ranges from 9% to 100%, exceeding 50% in 6 of 10 replicates. After Tukey-Kramer correction, the ENL class differs significantly from all others in mean retention of EQU functional sites (Figure 2.7, Table 2.16). Category IB also differs from ENL, IC, and NRE, but overlaps in mean retention with IA.

[Figure 2.7 plot. Classes shown: Cat. IA, Cat. IB, Cat. IC, NRE, ENL; axis: % ancestral sites remaining (arcsine-square root transformed).]

Figure 2.7. Plot of multiple comparison tests among replicate classes for percentage of ancestral EQU functional sites retained through press. Data are arcsine-square root transformed. Error bars are two standard errors. Letters a-c indicate sets of classes that are NOT significantly different from each other after Tukey-Kramer HSD correction for multiple comparisons.

Table 2.16. Pairwise comparisons among replicate classes of percentage of ancestral EQU functional sites retained through press. Data are arcsine-square root transformed. Confidence intervals are based on Tukey-Kramer HSD correction for multiple comparisons. Compared sets of replicate classes are significantly different if confidence interval does not include zero.

Comparison    Estimated difference    Lower 95%    Upper 95%
              in means                HSD CI       HSD CI
IA   IB       -0.273                  -0.584        0.038
IA   IC        0.146                  -0.158        0.449
IA   NRE       0.136                  -0.156        0.427
IA   ENL      -0.870                  -1.200       -0.539
IB   IC        0.419                   0.108        0.730
IB   NRE       0.409                   0.110        0.708
IB   ENL      -0.596                  -0.934       -0.259
IC   NRE      -0.010                  -0.302        0.281
IC   ENL      -1.016                  -1.346       -0.685
NRE  ENL      -1.005                  -1.325       -0.686

We then check for correlation between the number of functional sites retained through the extinction episode, and the probability of re-evolving EQU obtained from replay experiments using the most abundant survivors of the pre-extinction EQU clade (for Category IC and NRE), or the actual end-extinction ancestors (for Categories IA and IB). Pearson correlation coefficients for Categories IA and IB are both positive (0.704 for IA, 0.635 for IB), and significantly different from zero at α = 0.05. Coefficients for Categories IC and NRE are smaller (-0.213 for IC, 0.24 for NRE), and do not differ significantly from zero.

Functional genomics of EQU pre- and post-extinction: re-use of ancestral sites

For Category IA and IB replicates, we also examine the percentage of functional sites in the pre-extinction version of EQU that participate in the re-evolved version of the function (data in Appendix A2.6). Percent re-use of ancestral sites ranges between 0% and 93.3% (with an average of 26.8%) for Category IA, and between 0% and 84.6% (with an average of 45.2%) for IB. These classes do not differ significantly in the percentage of re-used sites (equal variance t (0.05, 19 df) = -0.988, two-tailed p > 0.33), even after replicates with 0% retention are excluded.

Functional genomics of EQU pre- and post-extinction: overlap of EQU with other functions

The number of EQU knockout mutations that remove two or more other functions (where there are more than two) provides a first approximation as to how much the mechanism for EQU overlaps with that for other functions (data in Appendix A2.7).
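The overlap measure just described can be made concrete with a small Python sketch. It assumes that single-site knockout results are available as a mapping from genome position to the set of functions lost when that position is knocked out; this data layout, the function name, and the example values are hypothetical and are not drawn from the analysis tools or data used in this study.

```python
def percent_overlapping_knockouts(knockout_effects, focal="EQU", min_other=2):
    """Percentage of focal-function knockouts that also remove other functions.

    knockout_effects: dict mapping genome position -> set of function names
    lost when that position is knocked out (hypothetical layout).
    Returns None if no knockout removes the focal function.
    """
    focal_knockouts = [lost for lost in knockout_effects.values() if focal in lost]
    if not focal_knockouts:
        return None
    overlapping = sum(
        1 for lost in focal_knockouts if len(lost - {focal}) >= min_other
    )
    return 100.0 * overlapping / len(focal_knockouts)


# Hypothetical knockout results for one organism, not data from this study.
example_effects = {
    3: {"EQU"},
    7: {"EQU", "NOT", "NAND"},
    12: {"EQU", "XOR"},
    15: {"NOT"},
    20: {"EQU", "NOT", "NAND", "XOR"},
}
print(percent_overlapping_knockouts(example_effects))
```

In this made-up example four positions are required for EQU, and knocking out two of them also removes at least two other functions, so half of the EQU knockouts count as overlapping.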
ENL organisms have the highest median number of these knockouts, followed by Category IA, though variation is very large within all replicate classes (Table 2.17). There are no significant differences among classes (Table 2.18, Kruskal-Wallis χ² (0.05, 4 df) = 2.71, p > 0.61).

Table 2.17. Summary nonparametric statistics for percentage of EQU knockout mutations that eliminate two or more other functions in focal pre-extinction organisms.

Class         Median % of EQU knockouts     Minimum    Maximum
              eliminating two or more
              other functions
IA (n=11)     50                            0          94
IB (n=10)     47                            7          71
IC (n=11)     35                            10         93
NRE (n=12)    32                            0          100
ENL (n=8)     67                            14         81

Table 2.18. Kruskal-Wallis table for differences among replicate classes in percentage of EQU knockout mutations that eliminate two or more other functions in focal pre-extinction organisms.

Source    SS          df    MS        Chi-sq    Prob>Chi-sq
Class       621.80     4    155.45    2.71      0.61
Error     11087.00    47    235.89
Total     11709.00    51

Functional genomics: case studies of re-evolution within a clade

We provide two case study examples of re-evolution of EQU in the same clade. The first is from a Category IB replicate, r8100 (Figures 2.8-2.11). In the last EQU+ pre-extinction ancestor, all 15 functional sites for EQU (and other functions performed by this organism, though sites critical for replication are NOT counted) are contained within the copy loop (Figures 2.8, 2.9). In the re-evolved version, however, EQU is completely re-invented. All new functional sites arise outside of the reduced copy loop (Figures 2.10, 2.11). Though 8 of 15 ancestral functional sites remain in the descendent organism, none participate in the new version of the EQU function, even though they still code for simpler functions (Figure 2.10, ancestral instructions indicated with blue stars; Figure 2.11). These ancestral sites (and the functions they encoded) later decay, and a new copy loop structure evolves that encapsulates the sites for the re-invented EQU (data not shown).

Figure 2.8. Genotype-phenotype map of the last pre-extinction ancestor to perform EQU in the lineage of the re-evolved EQU genotype from replicate 8100 (Category IB). Genome sequence is shown at left. Labels at top denote replication and logic functions; associated colours show whether (green) or not (red) the organism can perform the function after an instruction is knocked out. The fill in each interior cell shows the effect of replacing that instruction with a null mutation.
Red: loss of indicated functions
Green: gain of indicated functions
Light blue: positive quantitative effect on indicated functions
Yellow: negative quantitative effect on indicated functions
Blank: no effect
Blue stars near an instruction indicate sites critical for performing EQU. IMAGES ARE PRESENTED IN COLOUR.

Figure 2.9. Execution map of the last pre-extinction ancestor to perform EQU in the lineage of the organism that re-evolved EQU in replicate 8100 (Category IB). Black tracks indicate forward movement of instruction pointer, grey tracks indicate backward movement. Genome instructions are in outer circle around coloured ring. Sites that are critical for ecological functions are located in the inner ring; coloured numbers correspond to listed functions in key at lower right of outer ring. Key indicates both which functions are performed by the organism, and how many times during a single execution cycle (at right of listed function). Colour coding at each position in the genome is by information criterion of Huang (2005). More intense shades of blue signify that fewer variant instructions are able to substitute at that position without the organism incurring a loss of fitness. More intense shades of red signify that the site is effectively neutral (i.e. any instruction will work at that position), or contains no information relevant to the tested function. Sites critical for viability are marked with white stars. IMAGES ARE PRESENTED IN COLOUR.

Figure 2.10. Genotype-phenotype map of the re-evolved EQU genotype from replicate 8100 (Category IB). Colour coding as in Figure 2.8. Blue stars indicate those sites from the pre-extinction EQU genotype (Figure 2.8) that remain unchanged in this descendent; none participate in the re-evolved version of the EQU function. IMAGES ARE PRESENTED IN COLOUR.

Figure 2.11. Execution map of the re-evolved EQU genotype from replicate 8100 (Category IB). Tracks and colour coding as in Figure 2.9. Sites critical for viability are marked with white stars. Note that all functional sites for EQU have now arisen outside the copy loop. IMAGES ARE PRESENTED IN COLOUR.

A contrasting case comes from a Category IA replicate, r7000 (Figures 2.12-2.15), where the organisms feature a partially nested structure, the copy loop being mostly contained within another loop (Figures 2.13, 2.15). In this case, the ancestral EQU mechanism is preserved largely intact and re-used. The ancestral form has 16 functional sites for EQU (Figures 2.12, 2.13), which are, again, also used in at least one other function. 13 of 18 functional sites for EQU in the re-evolved version (Figures 2.14, 2.15) are holdovers from the ancestral version, with little recruitment of new sites to the function. However, when EQU re-evolves in this lineage, the functions associated with it change, shifting from NOT, NAND, and XOR in the pre-extinction ancestor to NOT, NAND, AND, ORN, and ANDN. In both cases, a small number of functional sites lie outside the loops, though they are clearly different in the ancestral and re-evolved versions (Figures 2.13, 2.15).

Figure 2.12. Genotype-phenotype map of the immediate pre-extinction ancestor of the re-evolved EQU genotype from replicate 7000 (Category IA). Colour coding as in Figure 2.8. Blue stars near an instruction indicate sites critical for performing EQU. IMAGES ARE PRESENTED IN COLOUR.
Figure 2.13. Execution map of the immediate pre-press ancestor of the re-evolved EQU genotype from replicate 7000 (Category IA). Tracks and colour coding as in Figure 2.9. Sites critical for viability are marked with white stars. IMAGES ARE PRESENTED IN COLOUR.

Figure 2.14. Genotype-phenotype map of the re-evolved EQU genotype from replicate 7000. Colour coding as in Figure 2.8. Blue stars indicate sites from the pre-press EQU genotype (Figure 2.12) that remain unchanged in this descendent. 13/18 sites are retained from the ancestor and participate in the re-evolved version of the EQU function. IMAGES ARE PRESENTED IN COLOUR.

Figure 2.15. Execution map of the re-evolved EQU genotype from replicate 7000. Tracks and colour coding as in Figure 2.9. Sites critical for viability are marked with white stars. IMAGES ARE PRESENTED IN COLOUR.

We also illustrate two distinct ways in which EQU can (re-)evolve. The first is by co-option of a previously evolved function, such that EQU is expressed at high levels (multiple executions per replication) from its first appearance (Table 2.19a). The second (Table 2.19b) is when EQU evolves following a mutation that is strongly deleterious when it first occurs. The mutation results in degradation of previously present functionality and overall loss of fitness, but with acquisition of EQU at low levels of expression as a side effect. (Of note, the actual end-extinction ancestor from this replicate is not a very successful replay performer; EQU re-evolves in only 8 of 20 replays.) Despite the deleterious nature of the enabling mutation, such a beginning still marks the invasion of the EQU niche, and function expression can improve in small increments (compare with case study organism in Lenski et al. (2003)).

Table 2.19. Two major modes of acquisition of EQU.
Each line in each table is a different genotype from a single line of descent, with the oldest at top. Columns from left to right: genotype ID; genome length (in instructions); replication size (total number of instructions executed for one replication cycle); number of times the listed function (NOT through EQU) is executed over the course of a single replication cycle.

a) Acquisition of EQU at high expression by co-option of another function (in this case, NOR). This section of the lineage is from the re-evolution of EQU in replicate r1000.

Genotype ID   Genome   Replication   NOT   NAND   AND   ORN   OR   ANDN   NOR   XOR   EQU
              size     size
5287070       111      1312          0     0      54    0     0    0      105   0     0
5292230       111      1356          0     0      53    0     0    0      53    0     0
5302462       111      1356          0     0      53    0     0    0      53    0     0
5305957       111      1355          0     0      53    0     0    0      53    0     0
5311387       111      1355          0     0      53    0     0    0      53    0     0
5320463       111      1302          0     0      53    0     0    0      0     0     53
5336137       112      1311          0     0      54    0     0    0      0     0     53
5338819       112      1250          0     0      53    0     0    0      0     0     104
5344360       112      1251          0     0      53    0     0    0      0     0     104
5346973       112      1251          0     0      55    0     0    0      0     0     107
5350219       112      1251          0     0      53    0     0    0      0     0     104

b) Acquisition of EQU at low expression as a side effect of a mutation that is otherwise deleterious (in this case, reducing or eliminating NAND, ORN, ANDN, and XOR). This section of the lineage is from the re-evolution of EQU in replicate r3000.

Genotype ID   Genome   Replication   NOT   NAND   AND   ORN   OR   ANDN   NOR   XOR   EQU
              size     size
6641322       82       1336          0     80     0     36    0    8      73    32    0
6645501       82       1336          0     80     0     36    0    8      73    32    0
6651964       82       1336          0     80     0     36    0    8      73    32    0
6653551       82       1338          0     80     0     36    0    8      73    32    0
6655244       82       1338          0     80     0     36    0    8      73    32    0
6682686       82       1334          0     75     0     6     0    1      60    0     1
6686310       82       1331          0     73     0     2     0    1      57    0     1
6689247       83       1348          0     74     0     2     0    1      58    0     1
6691152       83       1348          0     74     0     2     0    1      58    0     1
6691522       83       1347          0     73     0     2     0    1      56    0     1
6692397       83       1347          0     73     0     2     0    1      56    0     1
6692875       84       1364          0     74     0     2     0    1      57    0     2
6693351       83       1343          0     68     0     7     0    1      56    0     2
6694778       83       1343          0     69     0     7     0    1      58    0     3
6698646       83       1341          0     68     0     7     0    1      56    0     4

DISCUSSION

A detailed investigation of convergently evolved complex traits is very difficult in living organisms, and even more so when using fossil organisms (see Introduction). In this study, we have overcome many of the difficulties inherent in paleontological studies by investigating the evolution of a complex trait in communities of digital organisms that experienced mass extinction and recovery. In this digital system, we can determine unambiguously the points at which a complex trait (in this case, the logic function EQU) arises in a lineage, when it was lost, and when it was regained. It is also possible to determine precisely whether organisms in which EQU re-evolves belong in the same lineage as, or a different clade from, the organism where EQU first evolved. Further, we can examine the genomes of those organisms and investigate the genetic basis of the expression of EQU, and compare the underlying mechanisms for pre- and post-extinction states. We demonstrate that:

1) Re-evolution of EQU is highly contingent on its evolution prior to the extinction.

2) Community-wide characteristics measured just prior to the extinction, such as total number of organisms expressing EQU, and the total expression of EQU across the population, are only weakly predictive of whether or not EQU re-evolved following the extinction.
3) In replicates where EQU re-evolves following the extinction episode, EQU often evolves de novo in lineages that do not evolve it before the extinction. This often happens following outright extinction of the previously evolved EQU clade, but also occurs sometimes when the previous clade survives to the end of the extinction episode (though any survivors can no longer perform EQU). There are fewer cases where EQU re-evolves in lineages that also evolved it at some point prior to the extinction episode.

4) Organisms sampled from the same community, and even the same clade, at the end of the extinction episode often differ considerably in their ability to re-evolve EQU when used as the seeds for evolutionary “replay” experiments. In a number of these replays, the actual end-extinction ancestor of the organism in which EQU re-evolves in the original experiment is poor at giving rise to EQU-performing descendents. In experiments where EQU does not re-evolve, but the pre-extinction EQU clade has surviving EQU- descendents at the end of the extinction episode, descendents of the organisms sampled for replays are no better or worse at re-evolving EQU than descendents of organisms from experiments where EQU does re-evolve following the extinction. Replay experiments seeded with the same founder organism often differ markedly in the time required to re-evolve EQU. Some consistently evolve EQU in a short time, while others display wide variance between replays.

5) The pre-extinction ancestors of organisms where EQU re-evolves are often of low trophic position. Lower-level ecological functions tend to be more robust, hence persistent, during the extinction episode.

6) The mechanistic basis of EQU is highly conserved when EQU is not lost during the extinction episode.
In replicates where EQU re-evolves in a lineage that possessed it prior to the extinction, there is considerable re-use of ancestral functional sites that remain intact after the extinction episode, though much variation is evident. Replicates where EQU evolves in a new clade, or fails to re-evolve, experience, on average, more decay of ancestral functional sites.

Measurements of pre-extinction demography do not reflect “deep history” effect

We started with a set of simple statistical analyses focusing on the pre-extinction communities, asking whether or not widespread occurrence of the EQU function in the community, or high community-wide levels of expression, differed between those replicate populations that re-evolved EQU and those that did not. In the broadest terms, re-evolution of EQU is strongly contingent on its evolution before the press (Table 2.1), implying a “deep history” effect. Many organisms at the end of the extinction episode apparently have some inherited characteristics that facilitate re-evolution of EQU, making this propensity a population-level property. However, we have not yet identified precisely what these characteristics are. In particular, the associations between re-evolution of EQU, and both community-wide occurrence and expression of it, are weak (p = 0.022 for occurrence, p = 0.032 for expression). Replicates where EQU re-evolves do not, on average, have a much larger fraction of EQU+ organisms just prior to the extinction episode (the difference is only about 3%). Nor does total expression differ much, only about 1.3-fold greater on average in replicates where EQU re-evolves. While simple and clear in the questions addressed, these analyses with broad-scale demographic parameters are largely devoid of any phylogenetic or ecological information which may be relevant to evolution of the trait.
Just knowing these quantities for the pre-extinction state of the population is uninformative for understanding what happens after the extinction episode, and does not reflect the “deep history” effect. In replicate r6400, for example, nearly 30% of the population expressed EQU just prior to the press episode, and practically the whole population descended from the pre-extinction EQU progenitor (3217 viable organisms out of 3600). Yet, in the original experiment, EQU did not re-evolve following the extinction episode, but did re-evolve in 14/20 “replays” seeded with the most abundant surviving organism from this replicate’s pre-extinction EQU clade. Even replays seeded with the actual end-extinction ancestor were not always particularly good at re-evolving EQU (Table 2.9), demonstrating variable predispositions for re-evolving EQU among organisms from the same community. The sequences, architectures, and preservation of functional sites in digital organisms can change over time (particularly during the press period), and it is on the backgrounds arising from the end-extinction states that EQU must re-evolve. For these reasons (and others we discuss below), the association between the pre-extinction community state and subsequent re-evolution of EQU is weak.

Genetic architecture of ancestors from end of the extinction episode partially determines evolutionary fate of EQU

In replay experiments, we start new evolutionary processes with a single organism that is either chosen from EQU- end-extinction survivors of the pre-extinction EQU clade (if available), or is the actual end-extinction ancestor from the lineage where EQU re-evolves after the extinction episode. These experiments provide further insight into re-evolution of EQU following the extinction.
In cases where the pre-extinction EQU clade still has surviving EQU- members at the end of the extinction episode, the clade often goes extinct within the first 5000 updates of the recovery (Appendix A2.2). Even if the pre-extinction clade persists into the recovery, EQU still might not re-evolve, either at all, or within that clade. The replay results suggest that in many cases, the result obtained for the original source replicate (EQU re-evolving or not) could easily have gone the other way. This was illustrated particularly by the occurrence of: i) actual end-extinction ancestors of the re-evolved EQU genotype whose descendents were very poor at re-evolving EQU in replays, and ii) replay populations seeded with organisms chosen from a replicate where EQU did not re-evolve (NRE), but where the function did re-evolve in most of the replay experiments. It is interesting that even in those NRE replicates that maintained a large number of pre-extinction EQU clade descendents (sometimes the whole population from the end of the extinction episode, see Appendix A2.2), the organism chosen to seed the replays is not notably poor in the number of replays where its descendents re-evolved EQU (Table 2.8, Appendix A2.3.1d). This finding highlights where EQU could have re-evolved from the same clade in the original experiment, but by chance did not. Additionally, in all replicate classes, the time required for EQU to re-evolve in replays seeded with the same ancestor could vary widely. The replay descendents of a given ancestor may not easily evolve EQU in the (absolute) time it took for the clade to go extinct in the original replicate population. This is particularly relevant for replicates where EQU does not re-evolve, or evolves in a new clade following the extinction. If only a very small number of descendents of the pre-extinction clade remain at the end of the extinction episode, the clade may simply be lost by drift in the earliest stages of the recovery.
In a replay population, however, the clade (if it is not the whole source population) is saved from extinction, placed in a “refuge”, and allowed to diversify without competition. Thus, the ease or difficulty in evolving EQU may be evident in a way not obvious from the original experimental result. The clade from r5400 provides a perfect example: the chosen organism was an excellent replay performer, but the clade contained only two survivors at the end of the extinction episode. As noted above, there are circumstances where the pre-extinction EQU clade may be very large, and thus may contain numerous sub-clades whose members are of heterogeneous functionality, and varying distances to accessing EQU. Thus, even within the pre-extinction EQU clade, the choice of clone can strongly influence the outcome of the replay experiments. Replicates r1000 and r10100 (where not all of the population coalesced to the pre-extinction EQU progenitor) are illustrative. In both these replicates, both the actual end-extinction ancestor and the most abundant survivor of the pre-extinction EQU clade descend from the same pre-extinction EQU progenitor. However, the most abundant survivor was very poor at re-evolving EQU (only in 6 of 20 and 4 of 20 replays for replicates r1000 and r10100 respectively). By contrast, the actual end-extinction ancestors from these populations re-evolved EQU in 20 of 20 replays, and usually in fewer than 5000 updates, indicating those organisms were already close to accessing EQU. In replicate r10600, the actual end-extinction ancestors and most abundant EQU clade survivors were practically identical in replay performance, indicating similar genetic accessibility of the EQU function in those organisms (if not close relatedness between them). It is also possible, though, that survivors of the pre-extinction EQU clade persist following the extinction, but remain ecologically displaced.
In replicate r2300 (Category IC), the pre-extinction EQU clade survived well into the recovery and at high frequency. The most abundant end-extinction survivor from this clade is not especially poor at re-evolving EQU (12 of 20 replays). However, an organism from another clade, where EQU had not previously evolved, was closer to accessing EQU at the end of the extinction episode, re-evolving EQU first in the original experiment, and seeding 20 successful replays (Appendices A2.2, A2.3). Thus, while there are genetic backgrounds that are more strongly biased towards evolving EQU (whether or not they descend from the pre-extinction clade), there are also elements of chance involved in which organism will hit upon it first. The random seeds of the replay experiments represent chance factors that can influence further adaptation, particularly in organisms whose descendents do not have an especially strong predisposition for re-evolving EQU.

Low ecological position of ancestors and extinction survivors is not an unusual outcome

We found that in an ecological context, the last pre-extinction ancestor from lineages where EQU re-evolves has a low position in the trophic cascade in 21 of 52 replicates (Figure 1.1, Table 2.2). Only 11 of 52 replicates feature an ancestor that performed EQU just prior to the onset of extinction. Lower-level functions are also more likely to be present in the population at the end of the extinction episode (with the notable exception of AND, see Table 2.3), having been either retained or lost only transiently. Given the ecology of this digital system, the extinction is more damaging to organisms with high-level ecological activity, which is correlated with more difficult and costly functions. Such organisms receive less resource from the base of the trophic structure, and usually disappear first during the extinction episode.
Organisms that do not invest heavily in these difficult functions, and usually lie lower in the trophic cascade, are better able to cope with a dearth of resources. Thus, they are more likely to leave descendents that can survive the extinction episode. However, we also observe cases where the last pre-extinction ancestor performs more difficult higher-level functions (Table 2.2). In Avida, these functions are often genetically correlated with lower-level ones. Even if a high-level function is knocked out, much of the underlying code also overlaps with other, simpler functions and so may persist. In addition, the ecology is cross-feeding, rather than phagotrophic. Organisms that possess an array of lower and higher-level functions can use the by-products of their own lower-level functions, which buffers them (at least temporarily) against the effects of the press period. Depending on the mechanistic bases of the functions, some instantiations will be more robust than others, and so may persist (with some decay) through the press. There are both ecological and genetic components to the phenomenon, and looking at either in isolation is not sufficient to account for the observed effects. It would be interesting to examine further what factors make digital organisms more robust to decay of their ecological functions, as such differences certainly exist in our populations (Table 2.3).

While detailed comparisons between the artificial communities in Avida and natural communities preserved in the fossil record are difficult, there are nonetheless some general and interesting comparisons that can be made between them. Our finding that the immediate pre-extinction ancestors of the re-evolved EQU genotypes are frequently themselves EQU- is reminiscent of the observation from paleontology that, following a mass extinction, the post-extinction occupants of a particular niche are very often not descended from the pre-extinction incumbent group.
Instead, these successors diversified from ancestors that had a different ecological role prior to extinction. These ancestors are often viewed as having been ecologically marginal, and suppressed by dominant taxa, prior to the extinction episode (Jablonski 1995). In the context of our experiments, the leveling of the ecological “playing field” caused by the press episode, including extinction of the EQU-performing organisms, created new opportunities for the descendents of all organisms, including those that were functionally simple, numerically rare, or otherwise marginal prior to the extinction episode. Also, in the fossil record, post-extinction communities are often reported to be dominated by generalist taxa with wide geographic distribution and broad environmental tolerances (Jablonski 1995, Erwin 1998a). Such assemblages typically include an abundance of so-called “disaster” taxa (Schubert and Bottjer 1992): organisms that are “weedy” early colonizers, and are usually competitively inferior, but that bloom opportunistically during the extinction episode and early recovery. Predominance of seemingly inferior competitors following mass extinctions has been reported for terrestrial plant communities recovering from the Permo-Triassic extinction (Looy et al. 2001).

In Avida communities, two types of organisms might be considered disaster taxa. One type would include extreme generalists that use every available resource but with low efficiency. The other type includes base replicators, which are organisms with no functionality beyond replication. The first type is not prevalent at the end of the extinction episode in our experiments, and should not be since there are not enough resources present during the press period for organisms with many functions to perform them all successfully; further, trying to perform every available function carries a very high metabolic cost.
Instead, we find most end-extinction communities are dominated by organisms with low-level ecological functionality, having little or no expression of trophic functions. Given the low resource levels and simplified conditions of the press extinction episodes, these digital organisms certainly look and behave like the disaster taxa reported from paleontological studies. Similar outcomes are reported from simulations with more explicitly defined food webs, unlike those that evolve in Avida (Roopnarine 2006). However, these parallels cannot be extended too far. Digital organisms with functionality beyond primary production do not simply equate to “herbivores” and “carnivores”, since true predation is not implemented here. The cross-feeding relationships of the digital organisms are more closely akin to those of the co-existing ecotypes reported in Rozen et al. (2005), in which a later-evolving type develops the ability to consume metabolites produced by an earlier-evolved type, enabling mutual co-existence. Thus, in Avida, even organisms that were primary producers prior to the extinction can ultimately give rise to post-extinction EQU+ genotypes, though this outcome is uncommon in our experiments (Table 2.2).

Homology is a matter of degree in re-evolved versions of EQU

In 31 of 52 replicate populations, EQU evolves de novo in a lineage that never contained that function before the extinction. In the 21 populations where EQU re-evolves in a lineage that possessed it previously, we found wide variation in the preservation and subsequent re-use of ancestral functional sites (Table 2.14). The proportion of functional sites retained is positively correlated with the probability of re-evolving EQU obtained from the replay experiments.
By contrast, end-extinction descendents in replicates where EQU does not re-evolve, or evolves in a new clade, generally experience greater loss of EQU functional sites through the extinction episode, at least in the most abundant survivor. In these replicate classes, there is no significant correlation between the probability of re-evolving EQU from that clade member and the number of retained sites. Since we find considerable, but varying, amounts of re-use of ancestral sites, “homology” in Avida is a matter of degree, rather than a binary property. The re-evolved EQU functions are extremely variable in the number and percentage of re-used sites. Some re-evolved versions of EQU feature only a handful of ancestral sites, making them practically new instantiations of the function. Others, such as the example shown in Figures 2.12-2.15, are clear cases of “suppressing the suppressor” (Laurent 1983): they feature very little decay, and are good founders for re-evolving EQU in replays (Appendices A2.3.1, A2.4.1a, A2.6). This re-use is perhaps expected, given the high occurrence of alleviating epistasis in Avida genomes (Lenski et al. 1999, Misevic et al. 2005), and that evolution of EQU itself requires co-option of previously evolved simpler functions (Lenski et al. 2003). Even if EQU expression is knocked out, much of the underlying code is used (and thus persists) in these other functions. Although we have little corresponding data from real biological systems, the prevalence of compensatory mutations supports the relevance of alleviating epistasis (Burch and Chao 2004, Sanjuan et al. 2004), while the importance of co-option is widely applicable as well, in both molecular and organismal contexts (Salvini-Plawen and Mayr 1977, Melendez-Hevia et al. 1996, Tomarev and Piatigorsky 1996, Reznick et al. 2002, True and Carroll 2002, Conway Morris 2003, Lenski et al. 2003, Gehring 2005, Liu and Ochman 2007).
No clear evidence for facilitation through function decoupling and independence

It has been proposed that repeated evolution of a trait within a clade can be facilitated by either tight (Whiting et al. 2003), or loose (Rich et al. 2005) correlations with other traits critical for development and viability. The latter possibility seems the more intuitively plausible and generally applicable. Wainwright et al. (2005) have suggested that complex systems featuring the properties of partial decoupling between interacting parts (permitting independent “tinkerings” with each part), and “many-to-one” mappings of form to function (different structures producing the same functional value), are more evolutionarily labile. However, we were not able to find direct evidence supporting either loose or tight coupling of functions in digital organisms. We could not find significant differences between classes in the number of EQU knockout mutations that also affected two or more other functions (Table 2.18). Individual examples suggest that EQU can evolve on genetic backgrounds that have either loose or tight genetic correlations with other functions, which in the latter case must first be broken (Table 2.19). We hypothesize that deleterious mutations (and the accompanying pleiotropic effects) of the kind shown in Table 2.19 break epistatic interactions in the genome, facilitating formation of new interactions and enabling niche expansion (as of this writing, work by other members of our lab group is ongoing to verify this). This hypothesis suggests EQU is more difficult to re-evolve if pre-existing epistatic interactions in the genome must be broken and new ones formed, rather than by simple co-option of loosely correlated functions. If correct, a more general corollary of this hypothesis is that organisms (at least asexual ones) sometimes incur losses of fitness prior to fitness increases that precede bouts of adaptive radiation (see also Lenski et al. 2003).
As noted above, some re-evolved versions of EQU (particularly when it is present in the lineage up to the onset of extinction), feature a very low percentage of retained sites, meaning most of the mechanism for the re-evolved EQU is either new or recruited from other components. Replicate r8100 provides a particularly interesting case, where 8/15 functional sites (~53%) are present in the sequence at the end of the extinction episode. Yet, none of these sites participate in the re-evolved EQU, which originates in a different section of the genome, with a completely new underlying mechanism (Figures 2.8-2.11). EQU re-evolves in only 3 of 20 replay populations seeded with the actual end-extinction ancestor from this community, suggesting strong historical constraints. Although striking, such radical re-invention of EQU within a surviving pre-extinction clade does not appear to be common. In contrast to the previous example, the end-extinction organism from replicate 10300 features 44% retention (4/9 sites) to the end of the extinction episode, and only 2/12 sites in the re-evolved EQU are ancestral. Yet, 20 of 20 replays seeded with this organism re-evolve EQU, suggesting other precursors for the function are already in place and readily used. Replicates where EQU evolves de novo following the extinction have no previous EQU mechanism to build on, though a variety of other building blocks are available depending on the functions performed by the ancestors (Lenski et al. 2003). These results support the idea that latent competencies in biological pathways can facilitate re-acquisition of complex traits (Lande 1978, Whiting et al. 2003). Thus, there is opportunity for both re-use and innovation in the evolution of EQU in Avida. Which of these predominates depends not just on how much ancestral material is available for re-use, but also on the genetic background as a whole.
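The retention and re-use figures quoted above are simple proportions, and a short sketch may help keep the two quantities distinct. The Python below is illustrative only and is not the analysis code used for this work. It assumes that functional sites can be identified by genome position (a simplification), the function names `retention` and `reuse` are hypothetical, and the site positions are invented solely to reproduce the counts reported for replicates r8100 and 10300. The size of the re-evolved mechanism in r8100 is not stated in the text and is arbitrary here.

```python
def retention(ancestral, surviving):
    """Fraction of ancestral EQU sites still present at the end of the extinction."""
    return len(ancestral & surviving) / len(ancestral)


def reuse(ancestral, reevolved):
    """Fraction of the re-evolved EQU mechanism's sites that are ancestral."""
    return len(ancestral & reevolved) / len(reevolved)


# Hypothetical site positions, chosen only to match the reported counts.
r8100_ancestral = set(range(15))       # 15 ancestral functional sites
r8100_surviving = set(range(8))        # 8 of them remain after the press
r8100_reevolved = set(range(40, 52))   # new mechanism elsewhere (size arbitrary)

r10300_ancestral = set(range(9))       # 9 ancestral functional sites
r10300_surviving = {0, 1, 2, 3}        # 4 of them remain after the press
r10300_reevolved = {0, 1} | set(range(20, 30))  # 12 sites, 2 of them ancestral

for name, anc, surv, reev in [
    ("r8100", r8100_ancestral, r8100_surviving, r8100_reevolved),
    ("r10300", r10300_ancestral, r10300_surviving, r10300_reevolved),
]:
    print(f"{name}: retention {retention(anc, surv):.0%}, re-use {reuse(anc, reev):.0%}")
```

The two quantities need not move together: a lineage can retain many ancestral sites yet re-use none of them, or retain few and still re-evolve the function readily from other precursors.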
Possible limitations of analyses

Despite using a largely transparent system such as Avida, certain methodological difficulties remain in addressing questions of re-evolution. One key limitation of our method is the use of ecologically simplified communities in order to detect the presence, expression levels, and evolutionary origin of EQU. These simplified communities almost always contain EQU+ organisms with a monophyletic origin of EQU. We cannot rule out that in the full community, there may have been organisms representing multiple independent origins of EQU which did not persist through the ecological filtering. However, the recorded time of (re-)origin of the EQU function is always much earlier than the termination point of the experiment’s evolution phase (see Methods). This gives us confidence that this method did not simply isolate organisms that had evolved EQU late in the experiment. A related point is that prior to the extinction episode, any clade in which EQU initially arises can be displaced by the clade that we did detect by the filtering method, so the first EQU+ organism in that lineage is not necessarily the one where EQU did in fact first evolve. We find only a single replicate with two independent (and historically deep) pre-extinction origins of EQU that persist through the ecological filtering, and only one of those EQU clades survives the extinction in the full experiment. Thus, long-term persistence of clades stemming from different EQU+ progenitors is likely to be a rare event. The later-arising EQU clade is the one of interest since, in most cases, the earlier clade will either go extinct prior to the treatment-driven extinction episode, or at best leave only secondarily simplified EQU- descendents.
Another methodological limitation is the use of only a single sequence from the population at the end of the extinction episode for determining the remaining number of ancestral EQU functional sites, since related genotypes could retain more or fewer sites than what we observe. This choice is completely justifiable in cases where the chosen organism is ancestral to the one where EQU re-evolves. However, for cases where EQU does not re-evolve, or evolves anew in another clade, our choice of the most abundant survivor of the pre-extinction EQU clade is less easily defensible since abundance might be inversely correlated with loss of functional sites. However, these organisms did not differ significantly in percentage of retained sites from actual end-extinction ancestor organisms whose immediate pre-extinction ancestors were EQU+ (Figure 2.7, Table 2.16). Thus, these most abundant survivors are comparable to organisms that were actual ancestors of the re-evolved EQU organisms.

SUMMARY

While it has been widely recognized since Darwin’s time that co-option and recruitment of pre-existing features to new functions is important for understanding the evolution of complex features (Gould and Vrba 1982, Reznick et al. 2002, True and Carroll 2002, Wilkins 2002, Lenski et al. 2003, Conway Morris 2003), a more detailed understanding remains elusive. Re-evolution of such traits is sometimes deemed prohibitively difficult by the logic of Dollo’s Law, particularly in cases where the underlying mechanism has been deleted or diverted (McShea 1996). However, the appearance of such traits may often be more easily accomplished than is supposed by Dollo’s Law. Combinations of selection and various types of constraints have shaped functionally analogous outcomes across many taxa, suggesting repeatability and predictability in evolutionary trajectories (Salvini-Plawen and Mayr 1977, Conway Morris 2003, Weinreich et al. 2006, Vermeij 2006).
Complex traits with multiple, interacting parts have apparently evolved repeatedly within clades where most of the basic building blocks are already present prior to the appearance of the trait (Kurtén 1963, Shimek and Kohn 1981, Reznick et al. 2002, Whiting et al. 2003, Rich et al. 2005, Li et al. 2007). Complex traits have occasionally re-evolved in clades where the underlying mechanism for the trait has seemingly been lost for long periods of time, such as shell coiling in limpets (Collin and Cipriani 2003) and sexual reproduction in oribatid mites (Domes et al. 2007). However, there are also cases of complex traits and unique morphologies that have not re-evolved once lost (Frazzetta 1970, Arnold et al. 1989, McShea 1996). Convergent evolution of complex traits in widely separated taxa may involve recruitment and redeployment of components already present in a distant common ancestor, as has been suggested for camera-type eyes in cephalopods and vertebrates (Ogura et al. 2004) and middle ear bones in monotremes and therians (Rich et al. 2005).

Our results suggest that a “deep history” effect operates in our experimental populations. Populations that had evolved EQU previously were much more likely to re-evolve it following a mass extinction. The immediate pre-extinction ancestors of organisms that re-evolved EQU following the extinction episode were often of low trophic position, so EQU did not usually arise in descendents of the incumbent group. Re-evolution of EQU occurred frequently in lineages where EQU had not evolved prior to the extinction episode, though with somewhat more difficulty than in lineages where EQU had occurred previously. In lineages where EQU occurred prior to the extinction, the probability of EQU re-evolving was positively correlated with the number of ancestral functional sites remaining in end-extinction organisms. We could not verify that EQU was more likely to re-evolve if it had greater overlap with other functions.
We showed that re-evolution of EQU in a lineage from which it was previously lost can (but does not always) involve considerable re-use of the remaining parts of the ancestral mechanism, while evolution in a new clade necessarily involves co-option of other pre-existing functions. In Avida, simple logic functions can be recruited in many combinations to more complex functions such as EQU. Avida features “many-to-one” genotype-phenotype mapping (many genotypes map to similar phenotypes), so it is not so surprising that EQU (re-)evolves with fairly high frequency. Such mappings are thought to exist in biological systems, as shown by the previously cited examples of convergent evolution. However, results of the replay experiments also demonstrate that even if many precursors are already in place, re-evolution of a complex trait is not always deterministic, and may be inhibited by chance factors producing historical constraints on further adaptation. Evolution of complex traits in Avida organisms (or any complex functional system) may be facilitated if precursors can be easily co-opted without having to break pre-existing epistatic interactions. These findings point to substantial roles for both chance and historically contingent effects in determining whether a complex trait is re-evolved.

FUTURE DIRECTIONS

As with previous work, it would be of great interest to see how using spatially heterogeneous resources alters these results. Multiple origins of EQU could be more often preserved through the ecological filtering, since independently originated sub-populations would cross-feed primarily from organisms in their local neighbourhood, rather than globally accessible pools. Multiple EQU-performing clades might then co-exist, some of which may survive (or go extinct during) the extinction, as well as possessing different underlying mechanisms for the trait.
Experiments using lower mutation rates may also alter the results for ancestry of re-evolved EQU genotypes, by preserving pre-extinction progenitors with higher-level functions. Alternative ways of removing EQU (such as post-hoc selection of progenitors by the experimenter) can also be explored, though many of these alternatives are probably less “natural” than the low-resource press episode used here. However, they are also much cleaner and can give more insight into the specific effects of the press treatment.

We can identify those factors that lead to the observed “deep history” effect, in which re-evolution of EQU is contingent on its prior evolution. We hypothesize that this is a community-level property, determined early in the history of a replicate population. Work in this direction should focus on the ancestors of the first organism to evolve EQU, or perhaps the most recent common ancestors of the entire pre-extinction community. By performing further replay experiments that include the extinction, we can determine whether or not “deep” ancestor organisms from populations where EQU does re-evolve are somehow more “evolvable” than comparable organisms from populations where EQU fails to re-evolve subsequently. The ability to start replays from organisms selected at any point along a recorded lineage makes such experiments straightforward in principle. Additionally, such experiments can be performed using whole populations as founders, rather than single organisms (though the latter case is certainly simpler).

As mentioned above, at the time of writing, work by other lab members is ongoing to elucidate the possible connection between certain key deleterious mutations and epistasis, which constrain the functional interrelationships between different sections of the genome.
We hypothesize that these key deleterious mutations break epistatic interactions and increase the evolvability of the genetic backgrounds on which they occur, even though the organism incurs a loss of fitness. When analytical techniques from this work have matured sufficiently, they can be turned towards investigating the differences between those organisms whose descendents evolve EQU easily, and those for which evolution of EQU is difficult. We hypothesize that in cases where (re-)evolution of EQU in descendents of replay founders is especially difficult, the penultimate ancestors of the genotypes where EQU (re-)evolves will have more constraining forms of epistasis compared to those that evolve EQU easily.

We are also considering directions more oriented toward concerns in paleobiology. One such area is the role of refuges and so-called “Lazarus” taxa: taxa that disappear from the fossil record during an extinction crisis and the early stages of the recovery phase, only to re-emerge later in the recovery. These taxa are thought to find shelter in relict patches of preferred habitat, where they can “ride out” the extinction episode. We could investigate this phenomenon in Avida by using the organisms from the pre-extinction ecologically stable communities as “Lazarus” representatives of the pre-extinction biota. Since these are saved to disk, we can re-run multiple iterations of the original experiments and re-introduce these “Lazarus” organisms at successively later points during recovery from the press extinction, and observe whether or not they can successfully invade and establish themselves in the post-extinction digital biota. Alternatively, we can use Avida to construct “dynamic refuges”, smaller regions with proportionally lower (but non-zero) amounts of available resource, in which these higher-functioning organisms can continue to live and evolve while the rest of the population is subjected to a press episode.
Evolution in these dynamic refuges can even continue for varying lengths of (absolute) recovery time in the main population before the evolved descendents of the pre-extinction “Lazarus” organisms are introduced back, in effect allowing these refuges to have varying degrees of physical and temporal isolation.

In closing, then, the Avida system offers future directions for further exploring issues in evolutionary biology, in general, and paleobiology, in particular, that, like recovery from mass extinctions, are not accessible to the comparative and experimental approaches used by most biologists.

APPENDIX 1

IMAGES ARE PRESENTED IN COLOUR.

Appendix A1.1. Logic operations in Avida. When a digital organism in Avida performs one of nine basic logic operations on one or two random 32-bit strings, and then outputs the bitwise-correct result, it obtains additional energy (in the form of additional CPU cycles) that accelerates the execution of the instructions in its genome. The logic rules for these nine basic operations are as follows:

Input    Logic operation
A  B     NOT  NAND  AND  ORN  OR  ANDN  NOR  XOR  EQU
1  1     0    0     1    1    1   0     0    0    1
1  0     0    1     0    0    1   1     0    1    0
0  1     1    1     0    1    1   0     0    1    0
0  0     1    1     0    1    0   0     1    0    1

For example, if bit A = 0 and bit B = 0, then (A EQU B) = 1. These rules are defined on single-bit inputs. In order for an organism to be rewarded for performing an operation, it must perform that operation correctly on all 32 bits of the input strings. The NOT operation is performed with only one input string.

Consider an organism that obtains the following two inputs, and then executes a series of instructions that results in the output string shown below.

Input A: 010101011100000000111010101100
Input B: 100001101010001111010110011110
Output:  001011001001110000010011001101

The organism would receive the energy reward for performing the EQU operation, because it correctly calculated the EQU function for all 32 pairs of the corresponding bits for inputs A and B, and output the correct result.
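The bitwise reward rule described above can be illustrated with a short sketch. The Python below is not Avida's implementation; the names `LOGIC_RULES` and `identify_operations` are hypothetical. It assumes that NOT acts on input A, and that ORN means (NOT A) OR B and ANDN means A AND (NOT B), an assumed operand order. The function accepts bit strings of any equal length, and the example strings are used exactly as they are printed in the text.

```python
# Single-bit rules for the nine operations (hypothetical helper names).
# Assumed operand order: NOT = NOT A, ORN = (NOT A) OR B, ANDN = A AND (NOT B).
LOGIC_RULES = {
    "NOT":  lambda a, b: 1 - a,
    "NAND": lambda a, b: 1 - (a & b),
    "AND":  lambda a, b: a & b,
    "ORN":  lambda a, b: (1 - a) | b,
    "OR":   lambda a, b: a | b,
    "ANDN": lambda a, b: a & (1 - b),
    "NOR":  lambda a, b: 1 - (a | b),
    "XOR":  lambda a, b: a ^ b,
    "EQU":  lambda a, b: 1 - (a ^ b),
}


def identify_operations(input_a, input_b, output):
    """Return the operations whose rule holds at every bit position."""
    if not (len(input_a) == len(input_b) == len(output)):
        raise ValueError("bit strings must have equal length")
    triples = [(int(a), int(b), int(o))
               for a, b, o in zip(input_a, input_b, output)]
    return [name for name, rule in LOGIC_RULES.items()
            if all(rule(a, b) == o for a, b, o in triples)]


# Example strings as printed in the text above.
INPUT_A = "010101011100000000111010101100"
INPUT_B = "100001101010001111010110011110"
OUTPUT = "001011001001110000010011001101"

matched = identify_operations(INPUT_A, INPUT_B, OUTPUT)
print(matched)
```

A single incorrect bit is enough for an operation to go unrewarded, which is why the check uses `all` over every position rather than counting matches.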
In the cross-feeding trophic network used for these experiments, successful execution of an operation results in consumption of the resource linked to that operation, receipt of the energetic reward (additional CPU cycles) associated with it, and production of a designated resource (possibly more than one) mapped to another operation on the next higher trophic level. In practice, the quantities of produced resources (which are available to all organisms anywhere in the population) are incremented by the Avida software upon successful completion of the resource-producing operation, and decremented by the activity of any organisms able to consume those resources (by performing the appropriate operations).

Appendix A1.2. Trophic weighting of stably coexisting eco-species. Consider the following two ecologically stable communities, each with six eco-species. Each line corresponds to one species. Functions highlighted in green are Level 1 (“primary producer”) functions, those in light blue Level 2 (“primary consumer”) functions, those in light purple Level 3 (“secondary consumer”) functions, and the red column is the “top consumer” function, EQU. Note the second community does not contain any eco-species that perform EQU.

[Table: function expression counts for the six eco-species of each community, the first labelled Community 9800; values not recoverable from the extracted text.]

The weighting scheme accounts for the relative proportion of functions performed on a particular trophic level. Thus, for example, two (or more) eco-species that perform EQU will be weighted differently depending on what proportion EQU represents of the total functions performed in each eco-species.

Consider another two example communities, both of which have eco-species that perform EQU. First, sum up the total number of functions performed for each eco-species.
[Table: function expression counts and row totals for the eco-species of the two example communities; values not recoverable from the extracted text.]

Second, convert each function in each eco-species to a proportion by dividing it by the organism’s total function expression:

[Table: proportional function expression for each eco-species; values not recoverable from the extracted text.]

Third, multiply the proportions obtained above by a vector that contains the weights for each level. Using the weights for the previous weighting scheme, an example would be:

[Worked example: one row of proportions multiplied by the level weights; values not recoverable from the extracted text.]

Finally, summing up all of the proportion-adjusted function weights, we obtain for the two communities:

First community, weighted score per eco-species: 1.249, 2.326, 3.000, 1.250, 5.000, 3.000 (sum 15.826)
Second community, weighted score per eco-species: 3.309, 1.666, 1.000, 1.750, 2.330, 2.244 (sum 12.299)

The second community’s lower score reflects the lower weight it receives due to the sizeable expression of Level 1 functions in most of its eco-species, despite the higher absolute expression of EQU.
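The weighting procedure described in this appendix can be summarised as a short sketch. The Python below is illustrative rather than the code used for the analyses: the function names, the assignment of functions to trophic levels, the level weights, and the example counts are all hypothetical placeholders, since the actual values are given in the tables of the original document.

```python
# Column order of the nine functions, as in the appendix tables.
FUNCTIONS = ["NOT", "NAND", "AND", "ORN", "OR", "ANDN", "NOR", "XOR", "EQU"]

# Hypothetical trophic level of each function, in the same order (assumption).
LEVELS = [1, 1, 2, 2, 2, 2, 3, 3, 4]

# Hypothetical weight for each trophic level (assumption).
LEVEL_WEIGHTS = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}


def species_score(counts, levels=LEVELS, weights=LEVEL_WEIGHTS):
    """Weight one eco-species by the proportion of its output on each level."""
    total = sum(counts)                   # step 1: total functions performed
    if total == 0:
        return 0.0
    score = 0.0
    for count, level in zip(counts, levels):
        proportion = count / total        # step 2: convert to a proportion
        score += proportion * weights[level]  # step 3: apply the level weight
    return score


def community_score(community, levels=LEVELS, weights=LEVEL_WEIGHTS):
    """Step 4: sum the proportion-adjusted weights over all eco-species."""
    return sum(species_score(row, levels, weights) for row in community)


# Hypothetical two-species community (counts per function, in FUNCTIONS order).
example_community = [
    [100, 100, 0, 0, 0, 0, 0, 0, 0],    # performs Level 1 functions only
    [0, 0, 50, 0, 50, 0, 0, 0, 100],    # half Level 2 output, half EQU
]
example_total = community_score(example_community)
print(example_total)
```

Because each eco-species is scored on proportions rather than raw counts, an eco-species that performs EQU alongside a large amount of lower-level output contributes less than one whose output is mostly EQU, which is the behaviour described above.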
[The tables on the following pages are printed rotated and are not recoverable from the source.]

Appendix A1.4 Environment used for subsidiary “large world” experiments. Example links are shown for clarity; all functions on levels L1, L2, and L3 have a link with all functions on the level immediately below. In this example, nine units of the resources for ECHO are required to produce one unit for each of the functions on level L1.
Top consumer (L3): 3AI(32), 3AM(32), 3AP(32), 3AU(32), 3AV(32), 3BI(32), 3BL(32), 3CG(32), 3C?(32)
2nd level consumer (L2): XOR(16), EQU(16), 3AL(16), 3BG(16), 3BM(16), 3AY(16), 3B?(16), 3BW(16), ?(16)
1st level consumer (L1): AND(8), ORN(8), OR(8), ANDN(8), NOR(8), 3AG(8), 3AQ(8), 3AX(8), ?(8)
Primary production (L0): ECHO(4), NOT(4), NAND(4)
[Diagram of example links between levels not recoverable; one link is labelled “9 units”. A “?” marks a label that is illegible in the source.]

Appendix A1.5 Examples of ecological recovery dynamics in two subsidiary “large world” experiments. In all plots, Y axis is total functional output, and X axis is time (in thousands of updates). Colour coding for trophic levels is the same as in Figure 2. Only two trophic levels are shown per panel due to large scaling differences between trophic levels in this experiment. Panels (a)-(b) are from one experiment, panels (c)-(d) are from a second.
a) L0 and L1, community 1
b) L2 and L3, community 1
c) L0 and L1, community 2
d) L2 and L3, community 2

[Plots A1.5a-d not recoverable from the source. Y axis: Total output (1000s of executions); X axis: Time (1000s updates).]

Appendix A1.6. Dynamics of evolving functional output in control experiments with no extinction treatment.
Axes and colour coding as in Figure 2.

[Plot not recoverable from the source. X axis: Time (1000s updates).]

Appendix A1.7. Individual examples of extinction and recovery using eco-species with trophic weighting applied. Y axis is weighted number of eco-species, X axis is time (in thousands of updates).
a) The two example communities shown in Figure 7b.
b) The two example communities shown in Figure 7c.

[Plots A1.7a and A1.7b not recoverable from the source.]

Appendix A1.8. Examples of functional recovery dynamics at trophic level L3 in two representative pulse extinction communities. Y axis is L3 functional output, X axis is time (in thousands of updates).

[Plot not recoverable from the source. Y axis: Total L3 output (1000s of executions); X axis: Time (1000s updates).]

Appendix A1.9. Phenotypic diversity vs. time, averaged over all 100 pulse extinction communities. Y axis is number of phenotypes, X axis is time (in thousands of updates). Error series are two standard errors (approximate 95% confidence intervals).

[Plot not recoverable from the source.]

APPENDIX 2

Appendix A2.1. Categorization of replicate populations that re-evolved EQU, by classes defined in Figure 2.
Columns from left to right:
i) random seed for population
ii) genotype IDs of the genotypes performing EQU in the pre-extinction, ecologically stable population
iii) indication of multiple independent origins of the EQU function in that replicate population, if more than one ecologically stable EQU-performing genotype was present
iv) genotype ID of the progenitor EQU-performing genotype of the pre-extinction, ecologically stable EQU genotypes
v) classification of replicate population based on ancestry of the end-recovery, ecologically stable EQU genotype (see text for description of classification).

Replicate    Genotype ID of ecologically    Multiple origins   Genotype ID of first EQU-        Replicate
population   stable EQU organism            of EQU?            performing ancestor in lineage   class
1000         3667214                                           222581                           IB
2000         3351458                                           853734                           IC
3000         4752758                                           131959                           IB
4000         3799758                                           415861                           IC
5000         3558273                                           421607                           IA
7000         4100261                                           195082                           IA
8000         3657850                                           423756                           IC
9000         3465718, 3462262, 3467365      N                  292394                           IA
1100         3734073                        N                  943363                           IA
2100         3503137, 3498670               Y                  526833, 2499150                  IC
8100         3591841                                           668227                           IB
9100         3564971                                           488367                           IA
10100        3645692                                           509646                           IB
2200         4010719                                           389593                           IC
3200         3447733                                           339848                           IB
9200         3617239                                           1777336                          IC
10200        3106416                                           470370                           IC
2300         3530096                                           1750368                          IC
3300         3771472                                           498773                           IC
5300         3364610                                           114498                           IC
6300         3064642, 3062950               N                  414416                           IC
7300         3629988                                           3462489                          IC
8300         3769025                                           2438434                          IC
10300        3351015                                           310461                           IB
3400         3510658                                           203354                           IC
7400         3591454                                           453952                           IA
9400         3374900, 3370595               N                  362999                           IC
1500         3596143                                           378381                           IB

(Appendix A2.1 continued)
[The columns of the continuation page are run together in the source and its rows cannot be realigned; the legible values are listed by column, in the order in which they appear.]
Replicate population: 5500, 6500, 7500, 10500, 1600, 2600, 8600, 9600, 10600, 1700, 2700, 4700, 5700
Genotype ID of ecologically stable EQU organism: 3635152, 3811481, 3360109, 3103520, 3653564, 3476221, 3461928, 3232907, 3571006, 3801768, 3264461, 4070671, 4070677, 3613576, 3283975, 3278344, 3279879, 3200783, 4785609, 3139328, 3875051, 3642479, 3636375, 3519469, 3521815, 3115004, 3863067, 3489811, 3587829, 3524562
Genotype ID of first EQU-performing ancestor in lineage: 1284234, 2575997, 280922, 623909, 407456, 2448288, 333815, 1156277, 744837, 545853, 3557695, 1734806, 549569, 1840831, 274292, 231273, 1558538, 371705, 398539, 1459214, 1180618, 578032, 338920, 526761
Replicate class: [illegible]

Total IA: 11
Total IB: 10
Total IC: 31

Appendix A2.2. Fates of pre-extinction EQU clades during early recovery.

For Category IC and NRE replicates that had surviving EQU⁻ descendents of the pre-extinction EQU clade at the end of the extinction episode, we performed clade traces for the pre-extinction EQU “progenitors” at 1000, 2000, and 5000 updates into the recovery, in order to ascertain the subsequent fates of those clades during the early phases of the recovery. Responses to the low-resource press episode ranged from outright extinction of the clade to proliferation of the clade in response to press conditions. In some replicates, such as r2700 (Cat. IC), only a handful of representatives remained by the end of the extinction episode and could easily be lost by drift. In other replicates, such as r2300 (also Cat. IC), a large number of descendents of the pre-extinction EQU “progenitor” remained, yet EQU still re-evolved in another clade. A large number of descendents of the pre-extinction EQU clade also remained in NRE replicates r3800, r4200, r6200, and r6400, yet EQU did not re-evolve following the extinction episode. Other replicates (e.g. r2000, r2400, r2600) had a moderate to large number of end-extinction descendents, and these clades experienced more gradual declines in the first 5000 updates of the recovery, eventually going extinct. Particularly noteworthy cases are replicates r5400 and r6500, where the pre-extinction EQU clade, after having declined through the press period, experienced a brief resurgence in the early phases of the recovery. However, by 5000 updates into the recovery, the clade had declined to “drift-death” levels, or went completely extinct.
Such cases, where the clade survived to the end of the extinction episode, but had no long-term future through the recovery, provide examples in Avida of “dead clades walking” (Jablonski 2001).

Table A2.2. Fates of pre-extinction EQU clades in those Category IC and not re-evolved (NRE) populations where there was at least one surviving individual from the pre-extinction EQU clade at the end of the press period. For each of these populations, the number of surviving individuals in the clade was measured at 1000, 2000, and 5000 updates into the recovery.

[Table A2.2 is printed rotated in the source and its entries are not recoverable.]

Appendix A2.3. Successful re-evolution of EQU in replay experiments.

The actual end-extinction ancestors were not always superior to the most abundant survivor of the pre-extinction EQU clade when it came to re-evolving EQU in replay experiments. The actual ancestors from replicates 1600 and 2200 (both Category IC) re-evolved EQU only 5/20 and 3/20 times respectively, whereas the most abundant EQU clade survivors from these populations re-evolved EQU in 14/20 and 15/20 trials.
Among Category IB populations, replicates 3000 and 9800 were deficient, with actual ancestors able to re-evolve EQU 8/20 times for both these populations, whereas the most abundant survivors re-evolved EQU in 17/20 and 15/20 trials, respectively. In replicate 8100 (also Category IB), where the actual ancestor was also the most abundant survivor, EQU was re-evolved only 3/20 times. Among the 20 Category IC replicates where the pre-extinction EQU clade went extinct during the press episode, 8/20 actual ancestors re-evolved EQU in fewer than half the replays. However, the converse situation was also observed. In a Category IB replicate (10100), the most abundant survivor re-evolved EQU 6/20 times, and the actual ancestor 20/20 times. Clearly, there can be a great deal of within-population variability in ability to re-evolve EQU.

Table A2.3.1. Number of successful re-evolutions of EQU in replay populations seeded with the most abundant surviving member of the pre-extinction EQU clade. a) Category IA. b) Category IB. c) Category IC. d) NRE.

A2.3.1a) Category IA
Replicate   # successes   # failures   # successes / total replays
5000        20            0            1
7000        15            5            0.75
9000        19            1            0.95
1100        15            5            0.75
9100        14            6            0.7
7400        17            3            0.85
4700        14            6            0.7
8700        8             12           0.4
1800        16            4            0.8
2800        15            5            0.75
8800        14            6            0.7

A2.3.1b) Category IB
Replicate   # successes   # failures   # successes / total replays
1000        6             14           0.3
3000        17            3            0.85
8100        3             17           0.15
10100       6             14           0.3
3200        20            0            1
10300       20            0            1
1500        13            7            0.65
10600       20            0            1
5700        9             11           0.45
9800        15            5            0.75

(Table A2.3.1 continued)

A2.3.1c) Category IC
Replicate   # successes   # failures   # successes / total replays
1600        14            6            0.7
2000        15            5            0.75
2100        14            6            0.7
2200        15            5            0.75
2300        12            8            0.6
2600        15            5            0.75
2700        13            7            0.65
5300        10            10           0.5
6500        17            3            0.85
7300        8             12           0.4
8000        15            5            0.75

A2.3.1d) NRE
Replicate   # successes   # failures   # successes / total replays
2400        10            10           0.5
2500        15            5            0.75
3800        10            10           0.5
4200        15            5            0.75
5400        20            0            1.0
5600        9             11           0.45
6200        14            6            0.7
6400        14            6            0.7
7200        14            6            0.7
8200        16            4            0.8
8500        16            4            0.8
9900        15            5            0.75
10000       12            8            0.6

Table A2.3.2. Number of successful re-evolutions of EQU in replay populations seeded with the actual end-extinction ancestor of the organism that re-evolved EQU. a) Category IA. b) Category IB. c) Category IC. d) Category IC replicates where the pre-extinction EQU clade was extinct by the end of the press episode.

A2.3.2a) Category IA
Replicate   # successes   # failures   # successes / total replays
5000        20            0            1
7000        20            0            1
9000        19            1            0.95
1100        18            2            0.9
9100        16            4            0.8
4700        16            4            0.8
8700        16            4            0.8
1800        12            8            0.6
2800        16            4            0.8
8800        11            9            0.55

A2.3.2b) Category IB
Replicate   # successes   # failures   # successes / total replays
1000        20            0            1
3000        8             12           0.4
10100       20            0            1
3200        20            0            1
1500        20            0            1
10600       20            0            1
5700        19            1            0.95
9800        8             12           0.4

(Table A2.3.2 continued)

A2.3.2c) Category IC
Replicate   # successes   # failures   # successes / total replays
1600        5             15           0.25
2000        10            10           0.5
2100        14            6            0.7
2200        3             17           0.15
2300        20            0            1
2600        20            0            1
2700        20            0            1
5300        16            4            0.8
6500        20            0            1
7300        14            6            0.7
8000        13            7            0.65

(Table A2.3.2 continued)

A2.3.2d) Category IC replicates,
EQU clade extinct
Replicate   # successes   # failures   # successes / total replays
1700        17            3            0.85
1900        20            0            1
2900        19            1            0.95
3300        7             13           0.35
3400        19            1            0.95
4000        11            9            0.55
4900        15            5            0.75
5500        3             17           0.15
6300        17            3            0.85
6800        9             11           0.45
6900        6             14           0.3
7300        13            7            0.65
7500        15            5            0.75
7700        16            4            0.8
8300        8             12           0.4
8600        5             15           0.25
9200        5             15           0.25
9400        9             11           0.45
10200       17            3            0.85
10500       20            0            1

Appendix A2.4. Differences in mean rank of re-evolution time between actual ancestor and most abundant survivor replay founders from the same replicate population.

There were significant differences in mean rank of re-evolution time between the most abundant pre-extinction EQU clade members and the actual ancestors for both Category IA and Category IB replicates. Differences between the actual ancestor and the most abundant surviving EQU clade member were statistically significant in 6/11 Category IC populations, 5/8 Category IB populations, and 3/10 Category IA populations. Reversals in the sign of the Z-value are present in all three classes, indicating that the differences are not all unidirectional within a class. The actual ancestor replay populations usually (but not always) had a shorter median time for re-evolving EQU. We did not perform this comparison for the two types of Category IC replicates, since all organisms came from different sets of populations.

Table A2.4. Comparison of median times needed to re-evolve EQU between replays seeded with the most abundant surviving EQU clade organism, and those seeded with the actual end-extinction ancestor from the same replicate population. All times are in Avida updates. Medians of 100001 updates signify that >50% of the replay populations failed to re-evolve EQU in the allotted time. All p-values are based on two-tailed tests.

A2.4a) Category IA replicates.
Replicate   Actual ancestor   Most abundant       Z-value   Wilcoxon   p-value
            median            survivor median               ranksum
5000        1363              1288                0.514     429.5      0.607
7000        163               33576               -5.407    210        < 0.0001
9000        1638              4188                -1.461    355.5      0.144
1100        6513              29246               -2.346    323        0.019
9100        11013             39888               -1.268    363        0.205
4700        18401             20626               -0.014    409        0.989
8700        11463             100001              -2.502    320        0.012
1800        31163             8888                0.699     436        0.484
2800        5825.5            27701               -1.537    353        0.124
8800        72776             16726               1.376     460        0.169

(Table A2.4 continued)

A2.4b) Category IB replicates.
Replicate   Actual ancestor   Most abundant       Z-value   Wilcoxon   p-value
            median            survivor median               ranksum
1000        3263              100001              -4.963    230        < 0.0001
3000        100001            13938               2.126     487        0.034
10100       1213              100001              -5.157    223        < 0.0001
3200        863               1013                -1.002    372.5      0.316
1500        2213              28413               -4.109    258        < 0.0001
10600       1875.5            1650.5              1.746     475        0.081
5700        5875.5            100001              -4.114    259        < 0.0001
9800        100001            39063               1.118     450.5      0.263

A2.4c) Category IC replicates.
Replicate   Actual ancestor   Most abundant       Z-value   Wilcoxon   p-value
            median            survivor median               ranksum
1600        100001            29800.5             2.413     493        0.016
2000        72420             21538               1.181     453        0.238
2100        26101             24988               -0.027    408.5      0.978
2200        100001            29888               4.088     548.5      < 0.0001
2300        663               71246               -5.419    210        < 0.0001
2600        17801             34126               -2.315    324        0.021
2700        1788              24126               -5.276    215        < 0.0001
5300        14882             68095               -1.237    365        0.216
6500        9100.5            9363                -0.271    399.5      0.787
7300        12801             100001              -2.029    338        0.043
8000        39688             10688               1.763     475        0.078

Appendix A2.5. Retention of EQU functional sites between pre-extinction ancestor and end-extinction descendent. For Categories IA, IB, and ENL, the end-extinction descendent is the actual ancestor of the organism that re-evolved EQU following the extinction episode. For Category IC and NRE, the end-extinction descendent is the most abundant survivor of the pre-extinction EQU clade.
Replicate   Replicate   Number of EQU        Number of sites remaining     % sites
class                   knockout sites       at end of extinction          remaining
                        in ancestor          episode
IA          5000        21                   14                            66.67
IA          7000        16                   15                            93.75
IA          9000        12                   6                             50.00
IA          1100        17                   5                             29.41
IA          9100        22                   8                             36.36
IA          7400        14                   4                             28.57
IA          4700        22                   4                             18.18
IA          8700        17                   4                             23.53
IA          1800        18                   5                             27.78
IA          2800        20                   4                             20.00
IA          8800        16                   3                             18.75
IB          1000        14                   12                            85.71
IB          3000        11                   1                             9.09
IB          8100        15                   9                             60.00
IB          10100       14                   9                             64.29
IB          3200        13                   13                            100.00
IB          10300       9                    4                             44.44
IB          1500        16                   12                            75.00
IB          10600       15                   12                            80.00
IB          5700        17                   12                            70.59
IB          9800        17                   5                             29.41

(Appendix A2.5 continued)
[For classes IC, ENL, and NRE the sites-remaining column is largely illegible in the source; the values below are recomputed from the knockout-site and percentage columns. The knockout-site count for replicate 10000 is missing here and is taken from Appendix A2.7.]
IC          1600        11                   1                             9.09
IC          2000        10                   3                             30.00
IC          2100        17                   2                             11.76
IC          2200        21                   8                             38.10
IC          2300        13                   2                             15.38
IC          2600        19                   1                             5.26
IC          2700        14                   5                             35.71
IC          5300        17                   7                             41.18
IC          6500        14                   5                             35.71
IC          7300        27                   9                             33.33
IC          8000        20                   5                             25.00
ENL         4100        16                   16                            100.00
ENL         5100        16                   16                            100.00
ENL         7100        17                   17                            100.00
ENL         4300        15                   15                            100.00
ENL         10400       14                   14                            100.00
ENL         3600        18                   18                            100.00
ENL         4800        15                   15                            100.00
ENL         10800       19                   17                            89.47
NRE         2400        15                   5                             33.33
NRE         2500        16                   2                             12.50
NRE         3800        18                   4                             22.22
NRE         4200        20                   5                             25.00
NRE         5400        14                   9                             64.29
NRE         5600        11                   5                             45.45
NRE         6200        19                   3                             15.79
NRE         6400        20                   0                             0.00
NRE         7200        20                   10                            50.00
NRE         8200        19                   9                             47.37
NRE         8500        14                   1                             7.14
NRE         9900        16                   7                             43.75
NRE         10000       24                   1                             4.17

Appendix A2.6. Number of EQU functional sites retained from the ancestral version of EQU that participate in the re-evolved version. See Results for statistical analysis.

Replicate   Replicate   Number of EQU knockout    Number of ancestral EQU      % sites
class                   sites in re-evolved       sites participating in       re-used
                        version of EQU            re-evolved EQU
IA          5000        15                        14                           93.33
IA          7000        18                        13                           72.22
IA          9000        10                        2                            20.00
IA          1100        11                        4                            36.36
IA          9100        12                        0                            0.00
IA          7400        11                        1                            9.09
IA          4700        20                        1                            5.00
IA          8700        16                        2                            12.50
IA          1800        15                        3                            20.00
IA          2800        16                        3                            18.75
IA          8800        13                        1                            7.69
IB          1000        15                        10                           66.67
IB          3000        18                        0                            0.00
IB          8100        15                        0                            0.00
IB          10100       14                        8                            57.14
IB          3200        13                        11                           84.62
IB          10300       12                        2                            16.67
IB          1500        16                        12                           75.00
IB          10600       16                        12                           75.00
IB          5700        17                        11                           64.71
IB          9800        17                        2                            11.76

Appendix A2.7.
Number of EQU knockout mutations in the pre-extinction ancestor that also affect two or more other functions, normalized by the total number of EQU knockouts. Replicates marked with an asterisk indicate that organism performs only one other function in addition to EQU. See Results for statistical analysis.

Replicate   Seed      Number of EQU knockouts     Total number of    Fraction of knockouts
class                 involving EQU and at        EQU knockouts      eliminating 2 or more
                      least 2 other functions                        other functions
IA          5000      12                          21                 0.57
IA          7000      7                           16                 0.44
IA          9000      4                           12                 0.33
IA          1100      0                           17                 0.00
IA          9100      17                          22                 0.77
IA          7400      1                           14                 0.07
IA          4700      11                          22                 0.50
IA          8700      13                          17                 0.76
IA          1800      17                          18                 0.94
IA          2800      4                           20                 0.20
IA          8800      12                          16                 0.75
IB          1000      7                           14                 0.50
IB          3000      1                           11                 0.09
IB          8100      4                           15                 0.27
IB          10100     1                           14                 0.07
IB          3200      9                           13                 0.69
IB          10300     2                           9                  0.22
IB          1500      10                          16                 0.63
IB          10600     7                           15                 0.47
IB          5700      12                          17                 0.71
IB          9800      8                           17                 0.47

(Appendix A2.7 continued)
[For classes IC, NRE, and ENL the first count column is illegible in the source; the values below are recomputed from the total and fraction columns.]
IC          1600      4                           11                 0.36
IC          2000      1                           10                 0.10
IC          2100*     5                           17                 0.29
IC          2200      14                          21                 0.67
IC          2300      2                           13                 0.15
IC          2600      9                           19                 0.47
IC          2700      4                           14                 0.29
IC          5300      8                           17                 0.47
IC          6500      13                          14                 0.93
IC          7300      7                           27                 0.26
IC          8000      7                           20                 0.35
NRE         2400      10                          15                 0.67
NRE         2500      16                          16                 1.00
NRE         3800*     6                           18                 0.33
NRE         5400      9                           14                 0.64
NRE         5600*     3                           11                 0.27
NRE         6200      14                          19                 0.74
NRE         6400      6                           20                 0.30
NRE         7200      13                          20                 0.65
NRE         8200      0                           19                 0.00
NRE         8500      4                           14                 0.29
NRE         9900      11                          16                 0.69
NRE         10000     6                           24                 0.25
ENL         4100*     12                          16                 0.75
ENL         5100      13                          16                 0.81
ENL         7100      13                          17                 0.76
ENL         4300      11                          15                 0.73
ENL         10400     2                           14                 0.14
ENL         3600*     5                           18                 0.28
ENL         4800      9                           15                 0.60
ENL         10800     8                           19                 0.42

Appendix A2.8. Steps needed to determine retention of EQU functional sites, and removal of other functions with EQU knockout mutations. IMAGES ARE PRESENTED IN COLOUR.

IDENTIFICATION OF FUNCTIONAL SITES: FOR ALL REPLICATE CLASSES EXCEPT CATEGORY IC_EE

1) Using saved Avida population data, produce lineage, with list of functions performed and aligned sequences,
of EQU organism from the ecologically stable population (Category IA, IB, and ENL), or the most abundant descendent genotype of the pre-extinction EQU progenitor (for Category IC and NRE).

2) Identify the last pre-extinction ancestor in the lineage that could perform EQU (all classes), and the first post-extinction genotype that performs EQU (for Categories IA and IB). Extract these genome files.

3) Produce genotype-phenotype maps for last pre-extinction ancestor and first post-extinction genotype that re-evolved EQU (Cat. IA and IB).

4) Using the genotype-phenotype map for the pre-extinction ancestor, identify knockout sites critical for EQU expression. Mark sites in aligned sequence.

5) Verify which sites in the aligned sequences remain unchanged (identical by descent) between the last pre-extinction ancestor and the end-press descendent.

RETENTION OF ANCESTRAL FUNCTIONAL SITES IN POST-EXTINCTION EQU ORGANISM: FOR CATEGORIES IA AND IB ONLY

6) Using the genotype-phenotype map for the first post-extinction organism where EQU re-evolved, identify knockout sites critical for EQU expression. Mark sites in aligned sequence.

7) Verify in the aligned sequences which sites from the pre-extinction ancestor remain unchanged (identical by descent) between the end-press ancestor and the first post-press descendent where EQU re-evolved. Count how many of these remaining ancestral sites also participate in the re-evolved EQU.

PLEIOTROPIC EFFECT OF EQU KNOCKOUTS: FOR ALL CLASSES EXCEPT CATEGORY IC

8) Using the genotype-phenotype maps of the pre-extinction ancestors, count how many EQU knockout sites also eliminate two or more other functions (aside from viability). If the organism has only two functions, count knockouts that eliminate both functions. Exclude any organisms that perform only EQU.
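The counting in steps 5, 7, and 8 of Appendix A2.8 can be sketched as follows. This is a hedged illustration and not the procedure's actual tooling: all function names, the gap symbol, and the toy data are hypothetical, aligned genomes are represented as equal-length strings with one character per instruction, and a site is treated as retained when the aligned instructions match, which is only a stand-in for the identity-by-descent check made on the real lineage.

```python
GAP = "-"  # assumed gap symbol in the aligned sequences


def retained_sites(ancestor, descendant, knockout_sites):
    """Steps 5 and 7: knockout sites left unchanged in the descendant."""
    if len(ancestor) != len(descendant):
        raise ValueError("aligned sequences must have equal length")
    return [i for i in knockout_sites
            if ancestor[i] != GAP and ancestor[i] == descendant[i]]


def percent_retained(ancestor, descendant, knockout_sites):
    """Percentage of the ancestor's EQU knockout sites that remain."""
    kept = retained_sites(ancestor, descendant, knockout_sites)
    return 100.0 * len(kept) / len(knockout_sites)


def reused_sites(retained, reevolved_knockout_sites):
    """Step 7: retained ancestral sites that the re-evolved EQU also needs."""
    return sorted(set(retained) & set(reevolved_knockout_sites))


def pleiotropic_fraction(lost_by_site, functions_performed):
    """Step 8: fraction of EQU knockouts that also remove other functions.

    lost_by_site maps a genome position to the set of functions lost when
    that site is knocked out. Organisms that perform only EQU are excluded
    (None is returned); with a single other function, losing it suffices.
    """
    others = set(functions_performed) - {"EQU"}
    if not others:
        return None
    threshold = min(2, len(others))
    equ_knockouts = [set(lost) for lost in lost_by_site.values() if "EQU" in lost]
    if not equ_knockouts:
        return None
    hits = sum(1 for lost in equ_knockouts if len(lost & others) >= threshold)
    return hits / len(equ_knockouts)


# Toy data (hypothetical): two aligned genomes and their EQU knockout sites.
ancestor = "abcdefgh"
descendant = "abxdefgy"
kept = retained_sites(ancestor, descendant, [0, 2, 4, 7])
print(kept)                                                 # [0, 4]
print(percent_retained(ancestor, descendant, [0, 2, 4, 7])) # 50.0
print(reused_sites(kept, [4, 5, 6]))                        # [4]
```

The percentage returned by `percent_retained` corresponds to the “% sites remaining” column of Appendix A2.5, and `pleiotropic_fraction` to the fraction column of Appendix A2.7.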
[The illustrative figures for these steps (genotype-phenotype maps and aligned sequences) are not recoverable from the source.]

REFERENCES

Amaral, L.A.N., and M. Meyer. 1999. Environmental changes, coextinction, and patterns in the fossil record. Physical Review Letters 82: 652-655.
Anton, M., and A. Galobart. 1999. Neck function and predatory behavior in the scimitar toothed cat Homotherium latidens (Owen). Journal of Vertebrate Paleontology 19: 771-784.
Arnold, A.J., D.C. Kelly, and W.C. Parker. 1995. Causality and Cope’s Rule: evidence from the planktonic Foraminifera. Journal of Paleontology 69: 203-210.
Arnold, S.J., P. Alberch, S. Csanyi, R.C. Dawkins, S.B. Emerson, B. Fritzsch, T.J. Herder, J. Maynard Smith, M.J. Starck, E.S. Vrba, G.P.
Wagner, and D.B. Wake. 1989. How do complex organisms evolve? Pp. 403-433 in Complex organismal functions: integration and evolution in the vertebrates (Dahlem Workshop Reports, Life Sciences Research Report 45, D.B. Wake and G. Roth, eds.). John Wiley and Sons, Chichester, UK.
Bateman, R.M. 1996. Nonfloral homoplasy and evolutionary scenarios in living and fossil land plants. Pp. 91-130 in Homoplasy: the recurrence of similarity in evolution (M.J. Sanderson, L. Hufford, eds.). Academic Press, San Diego, CA.
Bennett, N.C., and C.G. Faulkes. 2000. African Mole-rats: Ecology and Eusociality. Cambridge University Press, Cambridge, UK.
Benton, M.J., and R.J. Twitchett. 2003. How to kill (almost) all life: the end-Permian extinction event. Trends in Ecology and Evolution 18: 358-365.
Benton, M.J., V.P. Tverdokhlebov, and M.V. Surkov. 2004. Ecosystem remodeling among vertebrates at the Permian-Triassic boundary in Russia. Nature 432: 97-100.
Blackburn, D.G. 1992. Convergent evolution of viviparity, matrotrophy, and specializations for fetal nutrition in reptiles and other vertebrates. American Zoologist 32: 313-321.
Blackburn, D.G. 1993. Histology of the late-stage placentae in the matrotrophic skink Chalcides chalcides (Lacertilia, Scincidae). Journal of Morphology 216: 179-195.
Brakefield, P.M. 2006. Evo-devo and constraints on selection. Trends in Ecology and Evolution 21: 362-368.
Bridgham, J.T., S.M. Carroll, and J.W. Thornton. 2006. Evolution of hormone-receptor complexity by molecular exploitation. Science 312: 97-101.
Burch, C.L., and L. Chao. 2004. Epistasis and its relationship to canalization in the RNA virus φ6. Genetics 167: 559-567.
Carr, T.R., and J.A. Kitchell. 1980. Dynamics of taxonomic diversity. Paleobiology 6: 427-443.
Chen, L.B., A.L. DeVries, and C.H-C. Cheng. 1997. Convergent evolution of antifreeze glycoproteins in Antarctic notothenioid fish and Arctic cod.
Proceedings of the National Academy of Sciences of the United States of America 94: 3817-3822.
Chow, S.S., C.O. Wilke, C. Ofria, R.E. Lenski, and C. Adami. 2004. Adaptive radiation from resource competition in digital organisms. Science 305: 84-86.
Cohan, F.M. 2001. Bacterial species and speciation. Systematic Biology 50: 513-524.
Cohan, F.M. 2002. What are bacterial species? Annual Review of Microbiology 56: 457-487.
Collin, R., and R. Cipriani. 2003. Dollo’s Law and the re-evolution of shell coiling. Proceedings of the Royal Society of London B 270: 2551-2555.
Conway Morris, S. 2003. Life’s solution. Cambridge University Press, Cambridge, UK.
Cooper, T., and C. Ofria. 2002. Evolution of stable ecosystems in populations of digital organisms. Pp. 227-232 in R.K. Standish, M.A. Bedau, and H.A. Abbass (eds.): Eighth International Conference on Artificial Life, Dec 9-13, Sydney, NSW, Australia.
Churcher, C.S. 1985. Dental functional morphology in the marsupial sabre-tooth Thylacosmilus atrox (Thylacosmilidae) compared to that of felid sabre-tooths. Australian Mammalogy 8: 201-220.
D’Hondt, S., P. Donaghay, J.C. Zachos, D. Luttenberg, and M. Lindinger. 1998. Organic carbon fluxes and ecological recovery from the Cretaceous-Tertiary mass extinction. Science 282: 276-279.
Domes, K., R.A. Norton, M. Maraun, and S. Scheu. 2007. Re-evolution of sexuality breaks Dollo’s Law. Proceedings of the National Academy of Sciences of the United States of America 104: 7139-7144.
Duffy, J.E., C.L. Morrison, and R. Rios. 2000. Multiple origins of eusociality among sponge-dwelling shrimps. Evolution 54: 503-516.
Erwin, D.H. 1998a. The end and the beginning: recoveries from mass extinctions. Trends in Ecology and Evolution 13: 344-349.
Erwin, D.H. 1998b. After the end: recovery from extinction. Science 279: 1324-1325.
Erwin, D.H. 2000. Life’s downs and ups. Nature 404: 129-130.
Erwin, D.H. 2001. Lessons from the past: Biotic recoveries from mass extinctions.
Proceedings of the National Academy of Sciences of the USA 98: 5399-5403.
Fain, M.G., and P. Houde. 2004. Parallel radiations in the primary clades of birds. Evolution 58: 2558-2573.
Fischer, A.G. 1981. Climatic oscillations in the biosphere. Pp. 103-131 in M.H. Nitecki (ed.), Biotic crises in ecological and evolutionary time. Academic Press, New York.
Foote, M. 2003. Origination and extinction through the Phanerozoic: A new approach. Journal of Geology 111: 125-148.
Frazetta, T.H. 1970. From hopeful monsters to bolyerine snakes. American Naturalist 104: 55-72.
Frey, E., H.-D. Sues, and W. Munk. 1997. Gliding mechanism in the Late Permian reptile Coelurosauravus. Science 275: 1450-1452.
Fürsich, F.T., and D. Jablonski. 1984. Late Triassic naticid drillholes: Carnivorous gastropods gain a major adaptation but fail to radiate. Science 224: 78-80.
Gehring, W.J. 2005. New perspectives of eye development and the evolution of eyes and photoreceptors. Journal of Heredity 96: 171-184.
Gould, S.J., and E.S. Vrba. 1982. Exaptation: a missing term in the science of form. Paleobiology 8: 4-15.
Gorelick, R., S.M. Bertram, P.R. Killeen, and J.H. Fewell. 2004. Normalized mutual entropy in biology: quantifying division of labour. American Naturalist 164: 677-682.
Hallam, A., and P.B. Wignall. 1997. Mass extinctions and their aftermath. Oxford University Press, Oxford, UK, and New York.
Heard, S.B., and A.G. Mooers. 2002. Signatures of random and selective mass extinctions in phylogenetic tree balance. Systematic Biology 51: 889-897.
Hewzulla, D., M.C. Boulter, M.J. Benton, and J.M. Halley. 1999. Evolutionary patterns from mass originations and mass extinctions. Philosophical Transactions of the Royal Society of London B 354: 463-469.
Huang, W. 2005. Analyzing biological complexity with digital organisms. Ph.D. thesis, Department of Computer Science and Engineering. Michigan State University, East Lansing, Michigan, USA.
Jablonski, D. 1986.
Evolutionary consequences of mass extinctions. Pp. 313-330 in D.M. Raup, D. Jablonski (eds.): Life Science Research Reports (Dahlem Konferenzen) 36: Patterns and Processes in the History of Life. Springer-Verlag, Berlin.
Jablonski, D. 1995. Extinctions in the fossil record. Pp. 25-44 in Extinction events in Earth history (J.H. Lawton and R.M. May, eds.). Oxford University Press, Oxford, UK.
Jablonski, D. 1996. Body size and macroevolution. Pp. 256-289 in D. Jablonski, D.H. Erwin, and J.H. Lipps (eds.): Evolutionary Paleobiology. University of Chicago Press.
Jablonski, D. 1998. Geographic variation in the molluscan recovery from the end-Cretaceous extinction. Science 279: 1327-1330.
Jablonski, D. 2001. Lessons from the past: evolutionary impacts of mass extinctions. Proceedings of the National Academy of Sciences of the USA 98: 5393-5398.
Ji, Q., Z.-X. Luo, C.-X. Yuan, and A.R. Tabrum. 2006. A swimming mammaliaform from the Middle Jurassic and ecomorphological diversification of early mammals. Science 311: 1123-1127.
Kase, T., and M. Ishikawa. 2003. Mystery of naticid predation history solved: evidence from a living fossil species. Geology 31: 403-406.
Klok, C.J., R.D. Mercer, and S.L. Chown. 2002. Discontinuous gas-exchange in centipedes and its convergent evolution in tracheated arthropods. Journal of Experimental Biology 205: 1019-1029.
Kurtén, B. 1963. Return of a lost structure in the evolution of the felid dentition. Commentary on Biology 26: 1-12.
LaBarbera, M. 1986. The evolution and ecology of body size. Pp. 69-98 in D.M. Raup, D. Jablonski (eds.): Life Science Research Reports (Dahlem Konferenzen) 36: Patterns and Processes in the History of Life. Springer-Verlag, Berlin.
Lande, R. 1978. Evolutionary mechanisms of limb loss in tetrapods. Evolution 32: 73-92.
Laurent, R.F. 1983. Irreversibility: a comment on MacBeth’s interpretations. Systematic Zoology 32: 75.
Lenski, R.E., J.E. Barrick, and C. Ofria. 2006. Balancing robustness and evolvability.
PLoS Biology 4: e428.
Lenski, R.E., C. Ofria, T.C. Collier, and C. Adami. 1999. Genome complexity, robustness and genetic interactions in digital organisms. Nature 400: 661-664.
Lenski, R.E., C. Ofria, R.T. Pennock, and C. Adami. 2003. The evolutionary origin of complex features. Nature 423: 139-144.
Li, P.-P., K.-Q. Gao, L.-L. Hou, and X. Xu. 2007. A gliding lizard from the early Cretaceous of China. Proceedings of the National Academy of Sciences of the USA 104: 5507-5509.
Lindeman, R. 1942. The trophic-dynamic aspect of ecology. Ecology 23: 399-418.
Liu, R., and H. Ochman. 2007. Stepwise formation of the bacterial flagellar system. Proceedings of the National Academy of Sciences of the USA 104: 7116-7121.
Looy, C.V., W.A. Brugman, D.L. Dilcher, and H. Visscher. 1999. The delayed resurgence of equatorial forests after the Permian-Triassic ecological crisis. Proceedings of the National Academy of Sciences of the USA 96: 13857-13862.
Looy, C.V., R.J. Twitchett, D.L. Dilcher, and J.H.A.V.K.-V. Cittert. 2001. Life in the end-Permian dead zone. Proceedings of the National Academy of Sciences of the USA 98: 7879-7893.
Lu, P.J., M. Yogo, and C.R. Marshall. 2006. Phanerozoic marine biodiversity dynamics in light of the incompleteness of the fossil record. Proceedings of the National Academy of Sciences of the USA 103: 2736-2739.
Marshall, L.G. 1980. The great American interchange: an invasion-induced crisis for South American mammals. Pp. 133-230 in Biotic crises in ecological and evolutionary time (M.H. Nitecki, ed.). Academic Press, New York, NY.
Marden, J.H., and M.A. Thomas. 2003. Rowing locomotion by a stonefly that possesses the ancestral pterygote condition of co-occurring wings and abdominal gills. Biological Journal of the Linnean Society 79: 341-349.
Martin, R.E. 1996. Secular increase in nutrient levels through the Phanerozoic: implications for productivity, biomass, diversity, and extinction of the marine biosphere. Paleontological Journal 30: 637-643.
McGhee, G.R., P.M. Sheehan, D.J. Bottjer, and M.L. Droser. 2004. Ecological ranking of Phanerozoic biodiversity crises: ecological and taxonomic severities are decoupled. Palaeogeography, Palaeoclimatology, Palaeoecology 211: 289-297.
McKinney, M.L. 2001. Selectivity during extinctions. Pp. 198-202 in D.E.G. Briggs and P.R. Crowther (eds.): Paleobiology, vol. II. Blackwell Science, Oxford.
McShea, D.W. 1996. Complexity and homoplasy. Pp. 207-225 in Homoplasy: the recurrence of similarity in evolution (M.J. Sanderson and L. Hufford, eds.). Academic Press, San Diego, CA.
Melendez-Hevia, E., T.G. Waddell, and M. Cascante. 1996. The puzzle of the Krebs citric acid cycle: assembling the pieces of chemically feasible reactions, and opportunism in the design of metabolic pathways during evolution. Journal of Molecular Evolution 43: 293-303.
Meng, J., Y.-M. Hu, Y.-Q. Wang, X.-L. Wang, and C.-K. Li. 2006. A Mesozoic gliding mammal from northeastern China. Nature 444: 889-893.
Misevic, D., C. Ofria, and R.E. Lenski. 2006. Sexual reproduction reshapes the genetic architecture of digital organisms. Proceedings of the Royal Society B-Biological Sciences 273: 457-464.
Misevic, D. 2006. Digital sex: causes and consequences of recombination. PhD thesis, Department of Zoology. Michigan State University, East Lansing, MI, USA.
Mortlock, R.P. (ed.). 1984. Microorganisms as Model Systems for Studying Evolution. Plenum Press, New York.
Nee, S., and R.M. May. 1997. Extinction and the loss of evolutionary history. Science 278: 692-694.
Newton, C.R. 1983. Triassic origin of shell-boring gastropods. Geological Society of America Abstracts with Program 15: 652-653.
Norris, R.D. 1991. Biased extinction and evolutionary trends. Paleobiology 17: 388-399.
Ofria, C., and C. Wilke. 2004. Avida: a software platform for research in computational evolutionary biology. Journal of Artificial Life 10: 191-229.
Ogura, A., K. Ikeo, and T. Gojobori. 2004.
Comparative analysis of gene expression for convergent evolution of camera eye between octopus and human. Genome Research 14: 1555-1561.
Pojeta Jr., J., and T.J. Palmer. 1976. The origin of rock boring in mytilacean pelecypods. Alcheringa 1: 167-179.
Padian, K. 1985. The origins and aerodynamics of flight in extinct vertebrates. Palaeontology 28: 413-433.
Patterson, C. 1988. Homology in classical and molecular biology. Molecular Biology and Evolution 5: 603-625.
Radwanski, A. 1995. A unique, “trilobite-like” fossil - the isopod Cyclosphaeroma malogostianum sp. n. from the Lower Kimmeridgian of the Holy Cross Mountains, Central Poland. Acta Geologica Polonica 45: 9-25.
Raup, D.M. 1984. Evolutionary radiations and extinctions. Pp. 5-14 in H.D. Holland and A.F. Trendall (eds.), Patterns of Change in Earth Evolution (Dahlem Conference). Springer-Verlag, Berlin.
Raup, D.M. 1991. Extinction: bad genes or bad luck? W.W. Norton, New York.
Reznick, D.N., M. Mateos, and M.S. Springer. 2002. Independent origins and rapid evolution of the placenta in the fish genus Poeciliopsis. Science 298: 1018-1020.
Rhodes, M.C., and C.W. Thayer. 1991. Mass extinctions: ecological selectivity and primary production. Geology 19: 877-880.
Rich, T.H., J.A. Hopson, A.M. Musser, T.F. Flannery, and P. Vickers-Rich. 2005. Independent origins of middle ear bones in monotremes and therians. Science 307: 910-914.
Ross, C.A., and J.R.P. Ross. 1995. Foraminiferal zonation of the late Paleozoic depositional sequences. Marine Micropaleontology 26: 469-478.
Rozen, D.E., D. Schneider, and R.E. Lenski. 2005. Long-term experimental evolution in Escherichia coli. XIII. Phylogenetic history of a balanced polymorphism. Journal of Molecular Evolution 61: 171-180.
Russell, D. 1977. The biotic crisis at the end of the Cretaceous period. Syllogeus, National Museum of Natural Sciences of Canada 12: 11-23.
Salesa, M.J., M. Anton, S. Peigné, and J. Morales. 2006.
Evidence of a false thumb in a fossil carnivore clarifies the evolution of pandas. Proceedings of the National Academy of Sciences of the United States of America 103: 379-382.
Salvini-Plawen, L.V., and E. Mayr. 1977. On the evolution of photoreceptors and eyes. Evolutionary Biology 10: 203-267.
Sanjuan, R., A. Moya, and S.F. Elena. 2004. The contribution of epistasis to the architecture of fitness in an RNA virus. Proceedings of the National Academy of Sciences of the United States of America 101: 15376-15379.
Saunders, W.B., D.M. Work, and S.V. Nikovaeva. 1999. Evolution of complexity in Paleozoic ammonoid sutures. Science 286: 760-763.
Schluter, D. 2000. The ecology of adaptive radiation. Oxford University Press, Oxford, UK.
Sepkoski, J.J. 1978. A kinetic model of Phanerozoic taxonomic diversity. 1. Analysis of marine orders. Paleobiology 4: 223-251.
Sepkoski, J.J. 1979. A kinetic model of Phanerozoic taxonomic diversity. 2. Early Phanerozoic families and multiple equilibria. Paleobiology 5: 222-251.
Sepkoski, J.J. 1984. A kinetic model of Phanerozoic taxonomic diversity. 3. Post-Paleozoic families and mass extinctions. Paleobiology 10: 246-267.
Sepkoski, J.J. 1992. A compendium of fossil marine families. Milwaukee Public Museum Contributions to Biology and Geology 51: 1-125.
Sepkoski, J.J. 1997. Biodiversity: Past, present, and future. Journal of Paleontology 71: 533-539.
Sepkoski, J.J. 2002. A compendium of fossil marine and animal genera. Bulletin of American Paleontology 363: 1-563.
Sheehan, P.M., and T.A. Hansen. 1986. Detritus feeding as a buffer to extinction at the end of the Cretaceous. Geology 14: 868-870.
Shimek, R.L., and A.J. Kohn. 1981. Functional morphology and evolution of the toxoglossan radula. Malacologia 20: 423-438.
Simpson, G.G. 1944. Tempo and mode in evolution. Columbia University Press, New York.
Simpson, G.G. 1953. The major features of evolution. Columbia University Press, New York.
Sinha, N.R., and E.A. Kellogg. 1996.
Parallelism and diversity in multiple origins of C4 photosynthesis in the grass family. American Journal of Botany 83: 1458-1470.
Sokal, R.R., and F.J. Rohlf. 1995. Biometry, 3rd ed. W.H. Freeman and Company, New York, NY.
Solé, R.V., J.M. Montoya, and D.H. Erwin. 2002. Recovery after mass extinction: evolutionary assembly in large-scale biosphere dynamics. Philosophical Transactions of the Royal Society of London B 357: 697-707.
Springer, M.S., M.J. Stanhope, O. Madsen, and W.W. de Jong. 2004. Molecules consolidate the placental mammal tree. Trends in Ecology and Evolution 19: 430-438.
Stanley, S., and X. Yang. 1994. A double mass extinction at the end of the Paleozoic era. Science 266: 1340-1344.
Sumbre, G., G. Fiorito, T. Flash, and B. Hochner. 2005. Motor control of flexible octopus arms. Nature 433: 595-596.
Sumbre, G., G. Fiorito, T. Flash, and B. Hochner. 2006. Octopuses use a human-like strategy to control precise point-to-point arm movements. Current Biology 16: 767-772.
Sweeney, A., and S. Johnsen. 2004. Evolution of complex optics in squid lenses. Integrative and Comparative Biology 44: 649.
Taylor, P.D., and F.K. McKinney. 1996. An Archimedes-like cyclostome bryozoan from the Eocene of North Carolina. Journal of Paleontology 70: 218-229.
Tomarev, S.I., and J. Piatigorsky. 1996. Lens crystallins of invertebrates - Diversity and recruitment from detoxification enzymes and novel proteins. European Journal of Biochemistry 235: 449-465.
Travisano, M., J.A. Mongold, A.F. Bennett, and R.E. Lenski. 1995. Experimental tests of the roles of adaptation, chance, and history in evolution. Science 267: 87-90.
True, J.R., and S.B. Carroll. 2002. Gene co-option in physiological and morphological evolution. Annual Review of Cell and Developmental Biology 18: 53-80.
Turnbull, W.D. 1978. Another look at dental specializations in the extinct sabre-toothed marsupial, Thylacosmilus, compared with its placental counterparts. Pp.
399-414 in Development, function, and evolution of teeth (P.M. Butler and K.A. Joysey, eds.). Academic Press, London.
Vander Zanden, M.J., and J.B. Rasmussen. 1996. A trophic position model of pelagic food webs: Impact on contaminant bioaccumulation in lake trout. Ecological Monographs 66: 451-477.
Vermeij, G. 2006. Historical contingency and the purported uniqueness of evolutionary innovations. Proceedings of the National Academy of Sciences of the United States of America 103: 1804-1809.
Wainwright, P.C., M.E. Alfaro, D.I. Bolnick, and C.D. Hulsey. 2005. Many-to-one mapping of form to function: a general principle in organismal design? Integrative and Comparative Biology 45: 256-262.
Wake, D.B. 1991. Homoplasy: the result of natural selection, or evidence of design limitations? American Naturalist 138: 543-567.
Wake, D.B., and G. Roth. 1989. The linkage between ontogeny and phylogeny in the evolution of complex systems. Pp. 361-377 in Complex organismal functions: integration and evolution in the vertebrates (Dahlem Workshop Reports, Life Sciences Research Report 45, D.B. Wake and G. Roth, eds.). John Wiley and Sons, Chichester, UK.
Weinrich, D.M., N.F. Delaney, M.A. DePristo, and D.L. Hartl. 2006. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 312: 111-114.
Whiting, M.F., S. Bradler, and T. Maxwell. 2003. Loss and re-evolution of wings in stick insects. Nature 421: 264-267.
Whittaker, R.H. 1977. Communities and Ecosystems. Macmillan, New York.
Wilke, C.O., and C. Adami. 2002. The biology of digital organisms. Trends in Ecology and Evolution 17: 528-532.
Wilkins, A.S. 2002. The evolution of developmental pathways. Sinauer, Sunderland, Massachusetts.
Wilson, E.O., and B. Hölldobler. 2005. Eusociality: origin and consequences. Proceedings of the National Academy of Sciences of the United States of America 102: 13367-13371.
Wroe, S., C. McHenry, and J. Thomason. 2005.
Bite club: comparative bite force in big biting mammals and the prediction of predatory behaviour in fossil taxa. Proceedings of the Royal Society of London B 272: 619-625.
Yedid, G., and G. Bell. 2002. Macroevolution simulated with autonomously replicating computer programs. Nature 420: 810-812.