ENGINEERING SCALABLE DIGITAL MODELS TO STUDY MAJOR TRANSITIONS IN EVOLUTION

By

Matthew Andres Moreno

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Computer Science - Doctor of Philosophy
Ecology, Evolutionary Biology and Behavior - Dual Major

2022

ABSTRACT

Evolutionary transitions occur when previously-independent replicating entities unite to form more complex individuals. Such major transitions in individuality have profoundly shaped complexity, novelty, and adaptation over the course of natural history. Regard for their causes and consequences drives many fundamental questions in biology. Likewise, evolutionary transitions have been highlighted as a hallmark of true open-ended evolution in artificial life. As such, experiments with digital multicellularity promise to help realize computational systems with properties that more closely resemble those of biological systems, ultimately providing insights about the origins of complex life in the natural world and contributing to bio-inspired distributed algorithm design. Major challenges exist, however, in applying high-performance computing to the dynamic, large-scale digital artificial life simulations required for such work. This dissertation presents two new tools that facilitate such simulations at scale: the Conduit library for best-effort communication and the hstrat ("hereditary stratigraphy") library, which debuts novel decentralized algorithms to estimate phylogenetic distance between evolving agents.

Most current high-performance computing work emphasizes logical determinism: extra effort is expended to guarantee reliable communication between processing elements. When necessary, computation halts in order to await expected messages. Determinism does enable hardware-independent results and perfect reproducibility; however, adopting a best-effort communication model can substantially reduce synchronization overhead and allow dynamic (albeit potentially lossy) scaling of communication load to fully utilize available resources. We present a set of experiments that test the best-effort communication model implemented by the Conduit library on commercially available high-performance computing hardware. We find that best-effort communication enables significantly better computational performance under high thread and process counts and can achieve significantly better solution quality within a fixed time constraint.

In a similar vein, phylogenetic analysis in digital evolution work has traditionally used a perfect tracking model where each birth event is recorded in a centralized data structure. This approach, however, is difficult to scale robustly and efficiently to distributed computing environments where agents may migrate between a dynamic set of disjoint processing elements. To provide for phylogenetic analyses in these environments, we propose an approach to infer phylogenies via heritable genetic annotations. We introduce hereditary stratigraphy, an algorithm that enables tunable trade-offs between annotation memory footprint and accuracy of phylogenetic inference. Simulating inference over known lineages, we recover up to 85% of the information contained in the true phylogeny using only a 64-bit annotation.

We harness these tools in DISHTINY, a distributed digital evolution system designed to study digital organisms as they undergo major evolutionary transitions in individuality.
This system allows digital cells to form and replicate kin groups by selectively adjoining or expelling daughter cells. The capability to recognize kin-group membership enables preferential communication and cooperation between cells. We report group-level traits characteristic of fraternal transitions, including reproductive division of labor, resource sharing within kin groups, resource investment in offspring groups, asymmetrical behaviors mediated by messaging, morphological patterning, and adaptive apoptosis. In one detailed case study, we track the co-evolution of novelty, complexity, and adaptation over the evolutionary history of an experiment. We characterize ten qualitatively distinct multicellular morphologies, several of which exhibit asymmetrical growth and distinct life stages. Our case study suggests a loose relationship can exist among novelty, complexity, and adaptation.

The constructive potential inherent in major evolutionary transitions holds great promise for progress toward replicating the capability and robustness of natural organisms. Coupled with shrewd software engineering and innovative model design informed by evolutionary theory, contemporary hardware systems could plausibly already suffice to realize paradigm-shifting advances in open-ended evolution and, ultimately, scientific understanding of major transitions themselves. This work establishes important new tools and methodologies to support continuing progress in this direction.

Copyright by
MATTHEW ANDRES MORENO
2022

Time, funding, freedom, peace, encouragement, presumed competence, education, role models, advisorship, and colleagueship — for reparation of profound inadequacies in equitable and universal affordance of such privilege.

ACKNOWLEDGEMENTS

To my colleagues and collaborators in the DEVOLAB and BEACON, I benefited greatly from your insight and your camaraderie. Thank you. Notable mentions here include Dr. Acacia Ackles, Dr. Wolfgang Banzhaf, Cliff Bohm, Dr. Emily Dolson, Austin Ferguson, Jose Hernandez, Dr. Alex Lalejini, Dr. Josh Nahum, Dr. Anselmo Pontes, Kate Skocelas, and Dr. Anya Vostinar.

Thank you to my mentees for your valuable work. Sara Boyd and Tait Wecht made huge improvements to the Empirical library's web UI toolkit. Katherine Perry, Nathan Rizik, and Santiago Rodriguez Papa helped build software foundations for the experiments reported in this dissertation. I am grateful for the opportunity to have worked with each of you. On frustrating days with my own work, I am always glad to think of the good you're out to do. In particular, Santiago Rodriguez Papa merits special recognition for jumping in the trenches to help bring this dissertation over the finish line. Over the last four years, I have been endlessly entertained by your encyclopedic knowledge of amusing grotesqueries in society and technology. I have also been endlessly impressed by your determination and cleverness in engineering better ways to do almost everything. Thank you.

My own mentors invested time and personal support in my development. Thank you to Dr. Rex Cole, Dr. America Chambers, Dr. John Fowler, Dr. Simon Garnier, Dr. Jason Graham, Mary Peterson, and Dr. Adam Smith. Dr. America Chambers' encouraging and wise advisorship on my undergraduate thesis cemented my research interests and laid the foundation for my graduate career. I am also grateful for my time training under the devoted mathematics and computer science faculty at the University of Puget Sound. Thank you to Dr. Marisa Silver.
You are singularly responsible for getting me out the other end of middle school in one piece, with some serviceable writing ability to boot.

Without the benevolent grace of omnipotent administrative support staff, I could not have survived Michigan State University with a paycheck, health insurance, and graduation requirements. Thank you to Barbara Bloemers, Deanne Hubbell, Connie James, and Melissa Williams.

This dissertation has benefited from the advice of my committee members: Dr. Wolfgang Banzhaf, Dr. Emily Dolson, Dr. Charles Ofria, and Dr. Bill Punch. Thank you especially for refining the focus of this work (and preventing global deforestation from production of a print copy).

What hasn't been said about Dr. Charles Ofria across dozens of advisorial acknowledgments? I will add this: thank you most of all for planting yourself wholly in my corner. From the very beginning, you made clear that I had your unconditional and total support. I quickly grew to trust and rely on it. I am glad to have been able to share my challenges with you, both technical and personal. Thank you.

Thank you to my friends, family, and loved ones for your support.

This research was supported in part by NSF grants DEB-1655715 and DBI-0939454 as well as by Michigan State University through the computational resources provided by the Institute for Cyber-Enabled Research. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1424871. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

TABLE OF CONTENTS

Chapter 1 Introduction

Part I Designing Computational Infrastructure to Enable Scalable Digital Multicellularity Experiments

Chapter 2 Design and Scalability Analysis of Conduit: a Best-effort Communication Software Framework
Chapter 3 Methods to Enable Decentralized Phylogenetic Tracking in a Distributed Digital Evolution System

Part II Evolving Complexity, Novelty, and Adaptation in Digital Multicells

Chapter 4 Exploring Evolved Multicellular Life Histories in an Open-Ended Digital Evolution System
Chapter 5 A Case Study of Novelty, Complexity, and Adaptation in a Multicellular System
Chapter 6 Conclusion

BIBLIOGRAPHY

Appendix A Design and Scalability Analysis of Conduit: a Best-effort Communication Software Framework
Appendix B Methods to Enable Decentralized Phylogenetic Tracking in a Distributed Digital Evolution System
Appendix C Exploring Evolved Multicellular Life Histories in an Open-Ended Digital Evolution System
Appendix D Case Study of Novelty, Complexity, and Adaptation in a Multicellular System

Chapter 1
Introduction

Portions of this chapter are adapted from (Moreno and Ofria, 2019), (Moreno and Ofria, 2020), and (Moreno, 2020).

1.1 Major Evolutionary Transitions and Open-Ended Evolution

Emergence of new replicating entities from the union of simpler entities constitutes some of the most profound events in natural evolutionary history (Smith and Szathmary, 1997). In an evolutionary transition of individuality, a new, more complex replicating entity is derived from the combination of cooperating replicating entities that have irrevocably entwined their long-term fates (West et al., 2015). Eusocial insect colonies and multicellular organisms exemplify this phenomenon (Smith and Szathmary, 1997). Such transitions in individuality are essential to the evolution of the most complex forms of life. As such, these transitions have been highlighted as key research targets with respect to the question of open-ended evolution (Banzhaf et al., 2016; Ray and Thearling, 1996).

In particular, this dissertation focuses on fraternal transitions in individuality — events where closely-related kin come together or stay together to form a higher-level organism (Queller, 1997). Potential evolvability properties of fraternal collectives make them an attractive evolutionary substrate. Multicellular bodies configured through generative development (i.e., with indirect genetic representation) can promote scalable properties (Lipson et al., 2007) such as modularity, regularity, and hierarchy (Clune et al., 2011; Hornby, 2005). Developmental processes may also promote canalization (Stanley and Miikkulainen, 2003), for example through exploratory processes and compensatory adjustments (Gerhart and Kirschner, 2007).

Scientific understanding of fraternal transitions in individuality benefits from experimental work probing the origins of multicellularity. In the biological domain, Ratcliff et al. have demonstrated evolution of multicellularity in yeast, deriving fraternal clusters of cells that cling together in order to maximize their settling rate (Ratcliff et al., 2012). The contributions of Goldsby and collaborators are particularly notable among computational artificial life work on the origins of multicellularity. Goldsby's work extends the Avida model system (Ofria et al., 2009), breaking the toroidal grid into isolated pockets where colonies are grown up from a single progenitor cell. Direct selection for collective, colony-level characteristics drives evolution of cooperative cellular traits characteristic of a transition to colony-level individuality. When a colony meets selection criteria, a propagule from that colony is inoculated into a freshly-cleared population slot. Cells explicitly self-designate eligibility to parent a propagule. This clear distinction between somatic and gametogenic modes of reproduction has proven particularly useful in experiments studying the origin of soma (Goldsby et al., 2014) and multicellular entrenchment (Goldsby et al., 2020). Other work by Goldsby et al. has investigated the evolution of division of labor (Goldsby et al., 2012, 2010) and the evolution of morphological development (Goldsby et al., 2017).
This dissertation builds on Goldsby's work by relaxing simulation constraints to enable broad genetic determination of multicellular life history and allowing for unconstrained cellular interactions between multicellular bodies. This approach enables new perspectives in digital evolution work, especially with respect to biotic interactions.

1.2 Digital Evolution Models

Digital evolution techniques complement traditional wet-lab evolution experiments by enabling researchers to address questions that would be otherwise limited by:

• reproduction rate (which determines the number of generations that can be observed in a set amount of time),
• incomplete observations (every event in a digital system can be tracked),
• physically-impossible experimental manipulations (any event in a digital system can be arbitrarily altered), or
• resource- and labor-intensity (digital experiments can be automated).

Despite their versatility and rapid generational turnover, digital artificial life experiments generally operate at comparable or modest scales compared to laboratory biological evolution experiments. Although digital evolution techniques can feasibly simulate populations numbering in the millions, such experiments require simple agents with limited interactions. With more complex agents controlled by genetic programs, neural networks, or the like, feasible population sizes can dwindle down to thousands or even hundreds of agents. When considering major transitions to multicellularity — where individual organisms are composed of many agents — population sizes may drop to tens of organisms, far below desirable for many evolution experiments.

1.3 Putting Scale in Perspective

One example of a digital evolution platform is Avida, a popular software system for evolutionary experiments with self-replicating computer programs. In this system, a population of ten thousand digital organisms can undergo approximately twenty thousand generations (or about two hundred million individual replication cycles) per day (Ofria et al., 2009). Each flask in the Lenski Long-Term Evolution Experiment hosts a similar number of replication cycles; with an effective population size of 30 million E. coli that undergo a bit more than 6.6 doublings per day, the bacteria experience about 180 million replication events per day (Good et al., 2017). Likewise, in Ratcliff's work studying the evolution of multicellularity in S. cerevisiae, about six doublings per day occur among a population numbering on the order of a billion cells (Ratcliff et al., 2012). These numbers translate to approximately six billion cellular replication cycles elapsed per day in this system.

Although artificial life practitioners traditionally describe instances of their simulations as "worlds," with serial processing power their scale aligns (in naive terms) more with a single flask. Of course, such a comparison neglects profound disparities between Avidians and bacteria or yeast in terms of complexity. Natural organisms have vastly more information content in their genomes and their cellular state, as well as more (and more diverse) interactions with the environment and with other cells. Recent work with SignalGP has sought to address some of these shortcomings by developing digital evolution substrates suited to dynamic environmental and agent-agent interactions (Lalejini and Ofria, 2018) that more effectively incorporate state information (Lalejini et al., 2020, 2021; Moreno, 2019).
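For concreteness, the rough parity described above can be checked with back-of-envelope arithmetic, taking population size times daily generational turnover as a proxy for daily replication events (a coarse approximation using the round figures quoted above; the LTEE estimate in particular is approximate):

\[
\underbrace{10^{4}\ \text{organisms} \times 2\times10^{4}\ \tfrac{\text{generations}}{\text{day}}}_{\text{Avida}} = 2\times10^{8}\ \tfrac{\text{replications}}{\text{day}},
\qquad
\underbrace{3\times10^{7}\ \text{cells} \times 6.64\ \tfrac{\text{doublings}}{\text{day}}}_{\text{LTEE flask}} \approx 2\times10^{8},
\qquad
\underbrace{10^{9} \times 6}_{\text{yeast}} = 6\times10^{9}.
\]

By this coarse measure, a serial Avida instance and a single LTEE flask sustain daily replication throughput of the same order of magnitude, with the yeast system roughly thirtyfold beyond both.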
However, more sophisticated and interactive evolving agents will necessarily consume more CPU time on a per-replication-cycle basis — further shrinking the magnitude of experiments tractable with serial processing.

1.4 Thesis Statement

Scalable digital evolution systems leveraging best-effort communication will enable us to study key phenomena associated with open-ended evolution: the origins of novel traits and behaviors, complex organisms and ecologies, and major evolutionary transitions in individuality.

1.5 A Path of Expanding Computational Scale

The idea that orders-of-magnitude increases in compute power will open up qualitatively different possibilities with respect to open-ended evolution is both promising and well founded. Spectacular advances achieved with artificial neural networks over the last decade illuminate a possible path toward this outcome. As with digital evolution, artificial neural networks (ANNs) were traditionally understood as a versatile but auxiliary methodology — both techniques have been described as "the second best way to do almost anything" (Eiben and Smith, 2015; Miaoulis and Plemenos, 2008). However, the utility and ubiquity of ANNs has since increased dramatically. The development of AlexNet is widely considered pivotal to this transformation. AlexNet united methodological innovations from the field (such as big datasets, dropout, and ReLU) with GPU computing that enabled training of orders-of-magnitude-larger networks. In fact, some aspects of their deep learning architecture were expressly modified to accommodate multi-GPU training (Krizhevsky et al., 2012). By adapting existing methodology to exploit commercially available hardware, AlexNet spurred greater availability of compute resources to the research domain and, eventually, the introduction of custom hardware to expressly support deep learning (Jouppi et al., 2017).

Notably within the domain of artificial life, David Ackley has envisioned an ambitious design for modular distributed hardware at a theoretically unlimited scale (Ackley and Cannon, 2011). Progress toward realizing artificial life systems with such indefinite scalability seems likely to unfold as incremental achievements that spur additional interest and resources in a positive feedback loop with the development of methodology, software, and eventually specialized hardware to take advantage of those resources. In addition to developing hardware-agnostic theory and methodology, we believe that pushing the envelope of open-ended evolution will analogously require designing systems that leverage existing commercially-available parallel and distributed compute resources at circumstantially-feasible scales.

1.6 The Future is Parallel

Throughout much of the 20th century, serial processing enjoyed regular advances in computational capacity due to quickening clock cycles, burgeoning RAM caches, and increasingly clever packing together of instructions during execution. Since then, however, performance of serial processing has bumped up against apparent fundamental limits to the current technological foundations of computing (Sutter et al., 2005). Instead, advances in 21st century computing power have arrived largely via multiprocessing (Hennessy and Patterson, 2011, p. 55) and specialized hardware acceleration (e.g., GPU, FPGA, etc.) (Che et al., 2008).
Contemporary high-performance computing clusters link multiprocessors and accelerators with fast interconnects to enable coordinated work on a single problem (Hennessy and Patterson, 2011, p. 436). High-end clusters already make hundreds of thousands or millions of cores available. More loosely-affiliated banks of servers can also muster significant computational power. For example, Sentient Technologies notably employed a distributed network of over a million CPUs to run evolutionary algorithms (Miikkulainen et al., 2019). The availability of orders-of-magnitude greater parallel computing resources in ten and twenty years' time seems probable, whether through incremental advances with traditional silicon-based technology (Dongarra et al., 2014; Gropp and Snir, 2013) or via emerging, unconventional technologies such as bio-computing (Benenson, 2009) and molecular electronics (Xiang et al., 2016). Such emerging technologies could greatly expand the collections of computing devices that are feasible, albeit at the potential cost of component speed (Bonnet et al., 2013; Ellenbogen and Love, 2000) and perhaps also component reliability. Making effective use of massively parallel processing power may require fundamental shifts in existing programming practices.

1.7 Traditional Approaches to Digital Evolution at Scale Favor Isolation

Digital evolution practitioners have a rich history of leveraging distributed hardware. It is common practice to distribute multiple self-isolated instantiations of evolutionary runs across multiple hardware units. In scientific contexts, this practice yields replicate datasets that provide statistical power to answer research questions (Dolson and Ofria, 2017). In applied contexts, this practice yields many converged populations that can be scavenged for the best solutions overall (Hornby et al., 2006). Another established practice is to use "island models" where individuals are transplanted between populations residing on different pieces of distributed hardware. Koza and collaborators' genetic programming work with a 1,000-CPU Beowulf cluster typifies this approach (Bennett III et al., 1999).

In recent years, Sentient Technologies spearheaded evolutionary computation projects on an unprecedented computational scale, comprising over a million CPUs and capable of a peak performance of 9 petaflops (Miikkulainen et al., 2019). According to its proponents, the scale and scalability of this "DarkCycle" system was a key aspect of its conceptualization (Gilbert, 2015). Much of the assembled infrastructure was pieced together from heterogeneous providers and employed on a time-available basis (Blondeau et al., 2009). Unlike typical island models where selection occurs entirely independently on each CPU, this scheme transferred evaluation criteria between computational instances in addition to individual genomes (Hodjat and Shahrzad, 2013). Sentient Technologies also notably exploited a large pool of hardware accelerators (e.g., 100 GPUs) in work evolving neural network architectures, using them to perform each candidate architecture's costly model training and evaluation process (Miikkulainen et al., 2019).

Existing parallel and distributed digital evolution systems typically minimize interaction between simulation components on disjoint hardware. Such independence facilitates simple and efficient implementation.
This approach typically involves independent evaluation of sub-populations (i.e., island models) or individuals (i.e., primary-subordinate or controller-responder parallelism (Cantú-Paz, 2001)). Cases where evaluation of a single individual is parallelized often involve data-parallel evaluation over a set of independent test cases, which are subsequently consolidated into a single fitness profile (Harding and Banzhaf, 2007b; Langdon and Banzhaf, 2019).

However, several notable parallel and distributed digital evolution systems have incorporated rich interactions between parallelized simulation components. Harding applied GPU acceleration to cellular automata models of artificial development systems, which involve intensive interaction between spatially-distributed instantiations of a genetic program (Harding and Banzhaf, 2007a). Work on Network Tierra by Tom Ray featured arbitrary communication between digital organisms residing on different machines (Ray, 1995). More recently, in a continuation of much earlier work, Christian Heinemann's ongoing ALIEN project has leveraged GPU acceleration to perform physics-based simulation of soft body agents within a 2D arena (Heinemann, 2008).

1.8 Open-Ended Evolution at Scale Should Prioritize Interaction

We argue that open-ended artificial life systems should prioritize dynamic interactions between simulation elements situated across physically distributed hardware components.

Unlike most existing applications of distributed computing in digital evolution, open-ended evolution research demands dynamic interactions among distributed simulation elements. Many important natural phenomena, including ecologies, co-evolutionary dynamics, and social behavior, arise from interactions among individuals. Likewise, at the scale of an individual organism, developmental processes and emergent phenotypic functionality necessitate dynamic interactions. A best-effort communication model could enable maximization of available bandwidth (Byna et al., 2010) while avoiding scaling issues typically associated with communication-intensive distributed computing (Cardwell and Song, 2019). Under such a model, processes compute simulation updates unimpeded and incorporate communication from collaborating processes as it happens to become available in real time. As stochastic algorithms performing computational search with a broad set of acceptable outcomes, many digital evolution simulations are well suited to such a best-effort approach.

1.9 Digital Multicellularity Suits Distributed Computing

Multicellularity poses an attractive model to harness distributed computing power for digital evolution. The basic notion is to achieve simulation dynamics that outstrip the capabilities of individual hardware components via an interacting network of discrete cellular components simple enough to reside on individual pieces of hardware. Indeed, early thinking around composing digital organisms of differentiated components revolved around the possibility of multithreading and multiprocessing. However, this work eschewed a spatial model for cellular interaction in favor of a logical approach where "cellular" threads traversed logical space within a replicating program (Ofria et al., 1999; Ray and Hart, 2000). Only later did Goldsby's multicellularity experiments introduce a spatial model for digital multicellularity, in which cells composing each digital "multicell" occupied tiles in a unique two-dimensional subgrid (Goldsby et al., 2014).
The clonal colony of cells constituting each multicell exists within an isolated spatial domain provisioned by the simulation. Two distinct modes of reproduction occur in these experiments: (1) cells replicate within a multicell and (2) multicells reproduce by sending a single cell to found a new organism — the target multicell is sterilized then re-inoculated with the cell supplied by the parent. Although Goldsby did not pursue hardware acceleration of cell components within a multicell, such a spatial approach could facilitate parallelization. Assuming local interactions, cells in a spatial model communicate directly with relatively few other simulation elements (i.e., their neighbors). Such a limitation suits a distributed computing approach.

In fact, at truly vast scales where physical distance between hardware components limits viable communication, simulation topology that maps into three-dimensional space will become highly advantageous. (This argument is a foundational tenet of Ackley's "indefinite" scalability concept (Ackley and Cannon, 2011).) Ackley's recent work on emergent digital protocells exemplifies algorithm engineering grounded in spatial considerations with respect to potential underlying distributed physical hardware (Ackley, 2018, 2019).

The approach presented in this dissertation extends Goldsby's spatial model of digital multicellularity by developing mechanics to enable arbitrary interactions between multicells (e.g., competition, parental care for offspring, etc.) within a unified spatial realm. (The DISHTINY model incorporates other notable changes, as well, such as an event-driven genetic programming substrate and directionally-symmetric agent evaluation.)

1.10 Contributions

Deepening our scientific understanding of major evolutionary transitions in individuality provides crucial insight into how the remarkable diversity and complexity of biological life came to be and may facilitate replication of lifelike capabilities in silico. Digital evolution enables unique experimental approaches to investigate evolutionary questions, but computational limitations restrict the scope of systems that can be modeled. Such practicalities are particularly cumbersome to digital models of multicellularity. This dissertation develops and tests approaches to improve scalability of artificial life simulations and applies them to construct a scalable simulation system for digital multicellularity. We then use this system to study the relationships between major transitions, complexity, novelty, and adaptation. Contributions of this dissertation include:

• de novo production of complex multicellular organisms without employing a segregating topology to force such a transition,
• demonstrating metrics that can efficiently quantify complexity and adaptation in a system with implicit selection dynamics,
• characterizing the evolution of complexity, novelty, and adaptation of digital multicells in an open-ended system,
• implementing and evaluating techniques for general-purpose best-effort high-performance computing,
• developing and implementing new methodologies for scalable simulations of evolving digital multicells that allow for arbitrary interactions between multicells in a unified spatial realm, and
• providing a new technique for genome annotation to facilitate phylogenetic analyses in distributed digital evolution experiments.
The work described here aims to spur reciprocal innovations:

• distributed computing will expand the scope of experiments possible in artificial life systems by allowing us to evolve complex multicellular digital organisms, and
• the unique objectives and latitude of artificial life will foster novel algorithms and distributed computing techniques.

1.11 Outline

The remainder of this dissertation is divided up as follows:

Part I describes computational infrastructure developed to enable scalable digital multicellularity experiments.

• Chapter 2 presents the Conduit library for best-effort high-performance computing, experimentally demonstrating the scalability benefits of the best-effort approach, and
• Chapter 3 proposes and tests the "hereditary stratigraphy" approach to record phylogenetic information in decentralized artificial life experiments.

Although not delved into here, additional algorithm and software development work took place on regulation-enabled tag lookup and efficient event-driven virtual CPUs.

Part II reports experiments performed using the DISHTINY digital multicellularity framework.

• Chapter 4 surveys multicellular life histories evolved within the framework, and
• Chapter 5 studies the coevolution of complexity, novelty, and adaptation in a case study lineage.

Finally, Chapter 6 provides concluding remarks and describes directions in which this research should continue.

Part I
Designing Computational Infrastructure to Enable Scalable Digital Multicellularity Experiments

Chapter 2
Design and Scalability Analysis of Conduit: a Best-effort Communication Software Framework

Authors: Matthew Andres Moreno, Santiago Rodriguez Papa, and Charles Ofria

Portions of this chapter have appeared as (Moreno et al., 2021b) in the ACM Workshop on Parallel and Distributed Evolutionary Inspired Methods (WS-PDEIM) at the 2021 Genetic and Evolutionary Computation Conference (GECCO 2021) and as (Moreno et al., 2020) in the 6th International Workshop on Modeling and Simulation of and by Parallel and Distributed Systems (MSPDS 2020) at the 2020 International Conference on High Performance Computing & Simulation (HPCS 2020).

This chapter develops the Conduit C++ library for best-effort communication in parallel and distributed high-performance computing and tests it through a series of on-hardware experiments. We find that the best-effort approach significantly increases performance at high CPU count. Because real-time volatility affects the outcome of computation under the best-effort model, we additionally designed and measured a suite of quality of service metrics. Scaling experiments show that median quality of service generally remains stable as CPU count increases.

2.1 Introduction

The parallel and distributed processing capacity of high-performance computing (HPC) clusters continues to grow rapidly and enable profound scientific and industrial innovations (Gagliardi et al., 2019). These advances in hardware capacity and economy afford great opportunity, but also pose a serious challenge: developing approaches to effectively harness it. As HPC systems scale, it becomes increasingly difficult to write software that makes efficient use of available hardware and also provides reproducible results (or even near-perfectly reproducible results — i.e., up to effects from floating point non-transitivity) consistent with models of computation as being performed by a reliable digital machine (Heroux, 2014).
The bulk synchronous parallel (BSP) model, which is prevalent among HPC applications (Dongarra et al., 2014), illustrates the challenge. This model segments fragments of computation into sequential global supersteps, with fragments at superstep i depending only on data from strictly preceding fragments < i, often just i − 1. Computational fragments are assigned across a pool of available processing components. The BSP model assumes perfectly reliable messaging: all dispatched messages between computational fragments are faithfully delivered. In practice, realizing this assumption introduces overhead costs: secondary acknowledgment messages to confirm delivery and mechanisms to dispatch potential resends as the need arises. Global synchronization occurs between supersteps, with computational fragments held until their preceding superstep has completed (Valiant, 1990). This ensures that computational fragments will have at hand every single expected input, including those required from fragments located on other processing elements, before proceeding. So, supersteps only turn over once the entire pool of processing components has completed its work for that superstep. Put another way, all processing components stall until the most laggardly component catches up. In a game of double dutch with several jumpers, this would be like slowing the tempo to whoever is most slow-footed each particular turn of the rope.

Heterogeneous computational fragments, with some easy to process and others much slower, would result in poor efficiency under a naive approach where each processing element handled just one fragment. Some processing elements with easy tasks would finish early then idle while more difficult tasks carry on. To counteract such load imbalances, programmers can allow for "parallel slack" by ensuring computational fragments greatly outnumber processing elements or even performing dynamic load balancing at runtime (Valiant, 1990).

Unfortunately, hardware factors on the underlying processing elements ensure that inherent global superstep jitter will persist: memory access time varies due to cache effects, message delivery time varies due to network conditions, extra processing arises due to error detection and recovery, delays occur due to unfavorable process scheduling by the operating system, etc. (Dongarra et al., 2014). Power management concerns on future machines will likely introduce even more variability (Gropp and Snir, 2013). Worse yet, as we work with more and more processes, the expected magnitude of the worst-sampled jitter grows and grows — and in lockstep with it, our expected superstep duration. In the double dutch analogy, with enough jumpers, at almost every turn of the rope someone will need to stop and tie their shoe. The global synchronization operations underpinning the BSP model further hinder its scalability: irrespective of time to complete computational fragments within a superstep, the cost of performing a global synchronization operation increases with processor count (Dongarra et al., 2014).
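To make this baseline concrete, the listing below sketches a fully synchronous BSP superstep loop in MPI-flavored C++. This is an illustrative sketch rather than code from Conduit or from the benchmarks in this chapter; the two helper functions are hypothetical stand-ins for application work.

#include <mpi.h>

// Hypothetical stand-ins for application work; a real code would update
// local simulation state and exchange boundary data with neighbor processes.
void compute_local_fragments(int /*superstep*/) { /* ... */ }
void exchange_with_neighbors(MPI_Comm /*comm*/) { /* ... */ }

// A fully synchronous BSP superstep loop. No process may begin superstep
// i + 1 until every process has finished superstep i, so the collective
// advances at the pace of its most laggardly member.
void run_bsp(int num_supersteps, MPI_Comm comm) {
  for (int step = 0; step < num_supersteps; ++step) {
    compute_local_fragments(step);  // superstep i uses only data from < i
    exchange_with_neighbors(comm);  // reliable delivery: acks and resends
    MPI_Barrier(comm);              // global sync point between supersteps
  }
}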
Efforts to recover scalability by relaxing superstep synchronization fall under two banners. The first approach, termed "Relaxed Bulk-Synchronous Programming" (rBSP), hides latency by performing collective operations asynchronously, essentially allowing useful computation to be performed at the same time as synchronization primitives for a single superstep percolate through the collective (Heroux, 2014). So, the time cost required to perform that synchronization can be discounted, up to the time taken up by computational work at one superstep. Likewise, individual processes experiencing heavier workloads or performance degradation due to hardware factors can fall behind by up to a single superstep without slowing the entire collective. However, this approach cannot mask synchronization costs or cumulative performance degradation exceeding a single superstep's duration.

The second approach, termed relaxed barrier synchronization, forgoes global synchronization entirely (Kim et al., 1998). Instead, computational fragments at superstep i only wait on expected inputs from the subset of superstep i − 1 fragments that they directly interact with. Imagine a double-dutch routine where each jumper exchanges patty cakes with both neighboring jumpers at every turn of the rope. Relaxed barrier synchronization would dispense entirely with the rope. Instead, players would be free to proceed to their next round of patty cakes as soon as they had successfully patty-caked both neighbors. With n players, player 0 could conceivably advance n rounds ahead of player n − 1 (each player would be one round ahead of their right neighbor). Assuming fragment interactions form a graph structure that persists across supersteps, in the general case an individual fragment can fall behind by at most a number of supersteps equal to the graph diameter before causing the entire collective to slow (Gamell et al., 2015). Even though this approach can shield the collective from most one-off performance degradations of a single fragment (especially in large-diameter cases), persistently laggard hardware or extreme one-off degradations will ultimately still hobble efficiency. Dynamic task scheduling and migration aim to address this shortcoming, redistributing work in order to "catch up" delinquent fragments (Acun et al., 2014). With our double-dutch analogy, we could think of this as something like a team coach temporarily benching a jumper who skinned their knee and instructing the other jumpers to pick up their roles in the routine.

In addition to concerns over efficiency, resiliency poses another inexorable problem for massive HPC systems. At small scales, it can suffice to assume that failures occur negligibly, with any that do transpire likely to cause an (acceptably rare) global interruption or failure. At large scales, however, software crashes and hardware failures become the rule rather than the exception (Dongarra et al., 2014) — running a simulation to completion could even require so many retries as to be practically infeasible. A typical contemporary approach to improve resiliency is checkpointing: the system periodically records global state then, when a failure arises, progress is rolled back to the most recent global known-good state and runtime restarts (Hursey et al., 2007). Global checkpoint-based recovery is expensive, especially at scale, due to overhead associated with regularly recording global state, losing progress since the most recent checkpoint, and actually performing a global teardown and restart procedure. In fact, at large enough scales global recovery durations could conceivably exceed mean time between failures, making any forward simulation progress all but impossible (Dongarra et al., 2014).
The local failure, local recovery (LFLR) paradigm eschews global recovery by maintaining persistent state on a process-wise basis and providing a recovery function to initialize a step-in replacement process (Heroux, 2014; Teranishi and Heroux, 2014). In practice, such an approach can require keeping running logs of all messaging traffic in order to replay them for the benefit of any potential step-in replacement (Chakravorty and Kale, 2004). Returning once more to the double dutch analogy, LFLR would transpire as something like a handful of teammates pulling a stricken teammate aside to catch them up after an amnesia attack (rather than starting the entire team's routine back at the top of the current track). The intervening jumpers would have to remind the stricken teammate of a previously recorded position then discreetly re-feign some of their moves that the stricken teammate had cued off of between that recorded position and the amnesia episode.

The possibility of multiple simultaneous failures (perhaps, for example, of dozens of processes resident on a single node) poses an even more difficult, although not insurmountable, challenge for LFLR that would likely necessitate even greater overhead. One approach involves pairing up with a remote "buddy" process. The "buddy" hangs on to the focal process's snapshots and is carbon-copied on all of that process's messages in order to ensure an independently survivable log. Unfortunately, this could potentially require forwarding all messaging traffic between simulation elements coresident on the focal process to its buddy, dragging inter-node communication into some otherwise trivial simulation operations (Chakravorty and Kalé, 2007). Efforts to ensure resiliency beyond single-node failures currently appear unnecessary (Ni, 2016, p. 12). Even though LFLR saves the cost of global spin-down and spin-up, all processes will potentially have to wait for work lost since the last checkpoint to be recomputed, although in some cases this could be helped along by tapping idle hardware to take over delinquent work from the failed process and help catch it up (Dongarra et al., 2014).

Still more insidious to the reliable digital machine model, though, are soft errors — events where corruption of data in memory occurs, usually due to environmental interference (i.e., "cosmic rays") (Karnik and Hazucha, 2004). Further miniaturization and voltage reduction, which are assumed as a likely vehicle for continuing advances in hardware efficiency and performance, could conceivably worsen susceptibility to such errors (Dongarra et al., 2014; Kajmakovic et al., 2020). What makes soft errors so dangerous is their potential undetectability. Unlike typical hardware or software failures, which result in an explicit, observable outcome (i.e., an error code, an exception, or even just a crash), soft errors can transpire silently and lead to incorrect computational results without leaving anyone the wiser. Luckily, soft errors occur rarely enough to be largely neglected in most single-processor applications (except in the most safety-critical settings); however, at scale soft errors occur at a non-trivial rate (Scoles, 2018; Sridharan et al., 2015). Redundancy (be it duplicated hardware components or error correction codes) can reduce the rate of uncorrected (or at least undetected) soft errors, although at a non-trivial cost (Sridharan et al., 2015; Vankeirsbilck et al., 2015).
In some application domains with symmetries or conservation principles, the rate of soft errors (or, at least, silent soft errors) could also be reduced through so-called "skeptical" assertions at runtime (Dongarra et al., 2014), although this too comes at a cost.

Even if soft errors can be effectively eradicated — or at least suppressed to a point of inconsequentiality — the nondeterministic mechanics of fault recovery and dynamic task scheduling could conceivably make guaranteeing bitwise reproducibility at exascale effectively impossible, or at least an unreasonable engineering choice (Dongarra et al., 2014). However, the assumption of the reliable digital machine model remains near-universal within parallel and distributed algorithm design (Chakradhar and Raghunathan, 2010). Be it just costly or simply a practical impossibility, the worsening burden of synchronization, fault recovery, and error correction begs the question of whether it is viable to maintain, or even to strive to maintain, the reliable digital machine model at scale. Indeed, software and hardware that relax guarantees of correctness and determinism — a so-called "best-effort model" — have been shown to improve speed (Chakrapani et al., 2008), energy efficiency (Bocquet et al., 2018; Chakrapani et al., 2008), and scalability (Meng et al., 2009). Discussion around "approximate computing" overlaps significantly with "best-effort computing," although focusing more heavily on using algorithm design to shirk non-essential computation (i.e., reducing floating point precision, inexact memoization, etc.) (Mittal, 2016).

As technology advances, computing is becoming more distributed and we are colliding with physical limits for speed and reliability. Massively distributed systems are becoming inevitable, and indeed if we are to truly achieve "indefinite scalability" (Ackley and Cannon, 2011) we must shift from guaranteed accuracy to best-effort methods that operate asynchronously and degrade gracefully under hardware failure.

The suitability of the best-effort model varies from application to application. Some domains are clear cut in favor of the reliable digital machine model — for example, due to regulatory issues (Dongarra et al., 2014). However, a subset of HPC applications can tolerate — or even harness — occasionally flawed or even fundamentally nondeterministic computation (Chakradhar and Raghunathan, 2010). Various approximation algorithms or heuristics fall into this category, with notable work being done on best-effort stochastic gradient descent for artificial neural network applications (Dean et al., 2012; Niu et al., 2011; Noel and Osindero, 2014; Rhodes et al., 2019; Zhao et al., 2019). Best-effort, real-time computing approaches have also been used in some artificial life models (Ray, 1995). Likewise, algorithms relying on pseudo-stochastic methods that tend to exploit noise (rather than destabilize due to it) also make good candidates (Chakradhar and Raghunathan, 2010; Chakrapani et al., 2008). Real-time control systems that cannot afford to pause or retry, by necessity, fall into the best-effort category (Rahmati et al., 2011; Rhodes et al., 2019). For this dissertation we will, of course, focus on this latter case of systems well-suited to best-effort methods, as evolving systems already require noise to fuel variation.

This work distills best-effort communication from the larger issue of best-effort computing, paying it special attention and generally pretermitting the broader issue.
Specifically, we investigate the implications of relaxing synchronization and message delivery requirements. Under this model, the runtime strives to minimize message latency and loss, but guarantees elimination of neither. Instead, processes continue their compute work unimpeded and incorporate communication from collaborating processes as it happens to become available. We still assume that messages, if and when they are delivered, arrive with their contents intact.

We see best-effort communication as a particularly fruitful target for investigation. Firstly, synchronization constitutes the root cause of many contemporary scaling bottlenecks, well below the mark of thousands or millions of cores where runtime failures and soft errors become critical considerations. Secondly, future HPC hardware is expected to provide more heterogeneous, more variable (i.e., due to power management), and generally lower (relative to compute) communication bandwidth (Acun et al., 2014; Gropp and Snir, 2013); a best-effort approach suits these challenges. A best-effort communication model presents the possibility of runtime adaptation to effectively utilize available resources given the particular ratio of compute and communication capability at any one moment in any one rack.

Complex biological organisms exhibit characteristic best-effort properties: trillions of cells interact asynchronously while overcoming all but the most extreme failures in a noisy world. As such, bio-inspired algorithms present strong potential to benefit from best-effort communication strategies. For example, evolutionary algorithms commonly use guided stochastic methods (i.e., selection and mutation operators) resulting in a search process that does not guarantee optimality, but typically produces a diverse range of high-quality results. Indeed, island model genetic algorithms are easy to parallelize and have been shown to perform well with asynchronous migration (Izzo et al., 2009). Likewise, artificial life simulations commonly rely on a bottom-up approach and seek to model life-as-it-could-be evolving in a noisy environment akin to the natural world, yet distinct from it (Bonabeau and Theraulaz, 1994). Although perfect reproducibility and observability have uniquely enabled digital evolution experiments to ask and answer otherwise intractable questions (Bundy et al., 2021; Covert et al., 2013; Dolson et al., 2020; Dolson and Ofria, 2017; Fortuna et al., 2019; Goldsby et al., 2014; Grabowski et al., 2013; Lenski et al., 2003; Pontes et al., 2020; Zaman et al., 2011), the reliable digital machine model is not strictly necessary for all such work. Issues of distributed and parallel computing are of special interest within the artificial life subdomain of open-ended evolution (OEE) (Ackley and Small, 2014), which studies long-term dynamics of evolutionary systems in order to understand factors that affect potential to generate ongoing novelty (Taylor et al., 2016). Recent evidence suggests that the generative potential of at least some model systems is — at least in part — meaningfully constrained by available compute resources (Channon, 2019).

Much exciting work on best-effort computing has incorporated bespoke experimental hardware (Ackley and Williams, 2011; Chakrapani et al., 2008; Chippa et al., 2014; Cho et al., 2012; Rhodes et al., 2019). However, here, we focus on exploring best-effort communication among parallel and distributed elements within existing, commercially-available hardware.
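To illustrate what this model asks of the messaging layer, consider how one best-effort exchange step can be written directly against MPI's nonblocking primitives. The sketch below is our own illustrative rendition (handling a single channel between a pair of processes), not Conduit's implementation: the process publishes its state without waiting on delivery confirmation, then adopts whatever fresher neighbor state happens to have arrived, falling back on the last known value otherwise.

#include <mpi.h>

// One best-effort exchange step: publish our current state, then return
// the freshest neighbor state that happens to have arrived, never blocking.
double exchange_best_effort(
  double my_state, double last_known, int neighbor, MPI_Comm comm
) {
  // buffer and request persist across calls so the outgoing message
  // outlives this function's stack frame
  static double send_buf;
  static MPI_Request send_req = MPI_REQUEST_NULL;

  int done = 0;
  if (send_req != MPI_REQUEST_NULL)
    MPI_Test(&send_req, &done, MPI_STATUS_IGNORE);  // retire finished send
  if (send_req == MPI_REQUEST_NULL) {  // dispatch only when the line is
    send_buf = my_state;               // clear; otherwise skip this update
    MPI_Isend(&send_buf, 1, MPI_DOUBLE, neighbor, 0, comm, &send_req);
  }

  // drain any backlog of arrived messages, keeping only the most recent
  int flag = 1;
  while (flag) {
    MPI_Iprobe(neighbor, 0, comm, &flag, MPI_STATUS_IGNORE);
    if (flag)
      MPI_Recv(&last_known, 1, MPI_DOUBLE, neighbor, 0, comm,
               MPI_STATUS_IGNORE);
  }
  return last_known;  // possibly stale: the best-effort bargain
}

Even this minimal rendition must hand-manage buffer lifetimes and request retirement, the kind of bookkeeping a dedicated library interface could hide.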
Existing software libraries, though, do not explicitly expose a convenient best-effort communication interface for such work. As such, best-effort approaches remain rare in production software, and efforts to study best-effort communication must make use of a combination of limited existing support and the development of new software tools.

The Message Passing Interface (MPI) standard (Gropp et al., 1996) represents the mainstay for high-performance computing applications. This standard exposes communication primitives directly to the end user. MPI's nonblocking communication primitives, in particular, are sufficient to program distributed computations with relaxed synchronization requirements. Although the explicit, imperative nature of the MPI protocols enables precise control over execution, it also poses significant expense in terms of programmability. This cost manifests in terms of reduced programmer productivity and software quality, while increasing domain knowledge requirements and the effort required to tune for performance due to program brittleness (Gu and Becchi, 2019; Tang et al., 2014).

In response to programmability concerns, many frameworks have arisen to offer useful parallel and distributed programming abstractions. Task-based frameworks such as Charm++ (Kale and Krishnan, 1993), Legion (Bauer et al., 2012), Cilk (Blumofe et al., 1996), and Threading Building Blocks (TBB) (Reinders, 2007) describe the dependency relationships among computational tasks and associated data, relying on an associated runtime to automatically schedule and manage execution. These frameworks assume a deterministic relationship between tasks. In a similar vein, programming languages and extensions like Unified Parallel C (UPC) (El-Ghazawi and Smith, 2006) and Chapel (Chamberlain et al., 2007) rely on programmers to direct execution, but equip them with powerful abstractions, such as global shared memory. However, Chapel's memory model explicitly forbids data races and UPC ultimately relies on a barrier model for data transfer.

To bridge these shortcomings, we employ a new software framework, the Conduit C++ Library for Best-Effort High Performance Computing (Moreno et al., 2021b). The Conduit library provides tools to perform best-effort communication through a flexible, intuitive interface, with uniform inter-operation of serial, parallel, and distributed modalities. Although Conduit currently implements distributed functionality via MPI intrinsics, in future work we will explore lower-level protocols like InfiniBand Unreliable Datagrams (Kashyap, 2006; Koop et al., 2007).

Here, we present a set of on-hardware experiments to empirically characterize Conduit's best-effort communication model. In order to survey across workload profiles, we tested performance under both a communication-intensive graph coloring solver and a compute-intensive artificial life simulation. First, we determine whether best-effort communication strategies can benefit performance compared to the traditional perfect communication model. We considered two measures of performance: computational steps executed per unit time and solution quality achieved within a fixed-duration run window. We compare the best-effort and perfect-communication strategies across processor counts, expecting to see the marginal benefit from best-effort communication increase at higher processor counts. We focus on weak scaling, growing overall problem size proportional to processor count.
Put another way, we hold problem size per processor constant (as opposed to strong scaling, where the problem size is held fixed while processor count increases). This approach prevents interference from shifts in processes' workload profiles in observation of the effects of scaling up processor count. To survey across hardware configurations, we tested scaling CPU count via threading on a single node and scaling CPU count via multiprocessing with each process assigned to a distinct node. In addition to a fully best-effort mode and a perfect communication mode, we also tested two intermediate, partially synchronized modes: one where the processor pool completed a global barrier (i.e., they aligned at a synchronization point) at predetermined, rigidly scheduled timepoints and another where global barriers occurred on a rolling basis spaced out by fixed-length delays from the end of the last synchronization. (Our motivation for these intermediate synchronization modes was interest in the effect of clearing any potentially-unbounded accumulation of message backlogs on laggard processes.)

Second, we sought to more closely characterize variability in message dispatch, transmission, and delivery under the best-effort model. Unlike under perfect communication, real-time volatility affects the outcome of computation under the best-effort model. Because real-time processing speed degradations and message latency or loss alter inputs to simulation elements, characterizing the distribution of these phenomena across processing components and over time is critical to understanding the actual computation being performed. For example, consistently faster execution or lower messaging latency for some subset of processing elements could violate uniformity or symmetry assumptions within a simulation. It is even possible to imagine reciprocal interactions between real-time best-effort dynamics and simulation state. In the case of a positive feedback loop, the magnitude of effects might become extreme. For example, in artificial life scenarios, agents may evolve strategies that selectively increase messaging traffic so as to encumber neighboring processing elements or even cause important messages to be dropped.

We monitor five aspects of real-time behavior, which we refer to as quality of service metrics (Karakus and Durresi, 2017):

• wall-time simulation update rate ("simstep period"),
• simulation-time message latency,
• wall-time message latency,
• steadiness of message inflow ("delivery clumpiness"), and
• delivery failure rate.

In an initial set of experiments, we use the graph coloring problem to test this suite of quality of service metrics across runtime conditions expected to strongly influence them. We compare

• increasing compute workload per simulation update step,
• within-node versus between-node process placement, and
• multithreading versus multiprocessing.

We perform these experiments using a graph coloring solver configured to maximize communication relative to computation (i.e., just one simulation element per CPU) in order to maximize sensitivity of quality of service to the runtime manipulations.

Finally, we extend our understanding of performance scaling from the preceding experiments by analyzing how each quality of service metric fares as problem size and processor count grow together, a "weak scaling" experiment.
This analysis would detect a scenario where raw performance remains stable under weak scaling, but quality of service (and, therefore, potentially quality of computation) degrades.

2.2 Methods

We performed two benchmarks to compare the performance of Conduit's best-effort approach to a traditional synchronous model. We tested our benchmarks across both a multithread, shared-memory context and a distributed, multinode context. In each hardware context, we assessed performance in two algorithmic contexts: a communication-intensive distributed graph coloring problem (Section 2.2.2) and a compute-intensive digital evolution simulation (Section 2.2.1). The latter benchmark — presented in Section 2.2.1 — grew out of the original work developing the Conduit library to support large-scale experimental systems to study open-ended evolution. The former benchmark — presented in Section 2.2.2 — complements the first by providing a clear definition of solution quality. Metrics to define solution quality in the open-ended digital evolution context remain a topic of active research.

2.2.1 Digital Evolution Benchmark

The digital evolution benchmark runs the DISHTINY (DIStributed Hierarchical Transitions in Individuality) artificial life framework. This system is designed to study major transitions in evolution, events where lower-level organisms unite to form a self-replicating entity. The evolution of multicellularity and eusociality exemplify such transitions. Previous work with DISHTINY has explored methods for selecting traits characteristic of multicellularity such as reproductive division of labor, resource sharing within kin groups, resource investment in offspring, and adaptive apoptosis (Moreno and Ofria, 2019).

DISHTINY simulates a fixed-size toroidal grid populated by digital cells. Cells can sense attributes of their immediate neighbors, can communicate with those neighbors through arbitrary message passing, and can interact with neighboring cells cooperatively through resource sharing or competitively through antagonistic competition to spawn daughter cells into limited space. Cell behavior is controlled by SignalGP event-driven linear genetic programs (Lalejini and Ofria, 2018). Full details of the DISHTINY simulation are available in Moreno and Ofria (2022).

We use Conduit-based messaging channels to manage all interactions between neighboring cells. Conduit models messaging channels as independent objects. However, support is provided for behind-the-scenes consolidation of communication along these channels between pairs of processes. Pooling joins together exactly one message per messaging channel to create a fixed-size consolidated message. Aggregation joins together arbitrarily many messages per channel to create a variable-size consolidated message. (A sketch contrasting the two strategies follows the list below.)

During a computational update, each cell advances its internal state and pushes information about its current state to neighbor cells. Several independent messaging layers handle disparate aspects of cell-cell interaction, including

• Cell spawn messages, which contain arbitrary-length genomes (seeded at 100 12-byte instructions with a hard cap of 1000 instructions). These are handled every 16 updates and use Conduit's built-in aggregation support for inter-process transfer.
• Resource transfer messages, consisting of a 4-byte float value. These are handled every update and use Conduit's built-in pooling support for inter-process transfer.
• Cell-cell communication messages, consisting of arbitrarily many 20-byte packets dispatched by genetic program execution. These are handled every 16 updates and use Conduit's built-in aggregation support for inter-process transfer.
• Environmental state messages, consisting of a 216-byte struct of data. These are handled every 8 updates and use Conduit's built-in pooling support for inter-process transfer.
• Multicellular kin-group size detection messages, consisting of a 16-byte bitstring. These are handled every update and use Conduit's built-in pooling support for inter-process transfer.
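The sketch below illustrates the distinction between pooling and aggregation described above. It is a minimal conceptual illustration under assumed type names, not the Conduit implementation.

```cpp
// Conceptual sketch of the two consolidation strategies (hypothetical
// types; illustrative only, not the Conduit implementation).
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

struct ResourceMsg { float amount; };  // fixed 4-byte payload
struct CommPacket { char data[20]; };  // 20-byte packet payload

// Pooling: exactly one message per channel, yielding one fixed-size
// consolidated message per pair of communicating processes.
template <std::size_t NumChannels>
struct PooledMessage {
  std::array<ResourceMsg, NumChannels> slots;  // slot i <-> channel i
};

// Aggregation: arbitrarily many messages per channel, yielding a
// variable-size consolidated message tagged with channel ids.
struct AggregatedMessage {
  std::vector<std::pair<int, CommPacket>> packets;  // (channel id, payload)
};
```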
Implementing all cell-cell interaction via Conduit-based messaging channels allows the simulation to be parallelized, potentially, down to the granularity of individual cells. These messaging channels allow cells to communicate using the same interface whether they are placed within the same thread, across different threads, or across different processes. However, in practice, for this benchmarking we assign 3600 cells to each thread or process. Because all cell-cell interactions occur via Conduit-based messaging channels, logically-neighboring cells can interact fully whether or not they are located on the same thread or process (albeit with potential irregularities due to best-effort limitations). An alternate approach to evolving large populations might be an island model, where Conduit-based messaging channels would be used solely to exchange genomes between otherwise independent populations (Bennett III et al., 1999). However, we chose to instead parallelize DISHTINY as a unified spatial realm in order to enable parent-offspring interaction and leave the door open for future work with multicells that exceed the scope of an individual thread or process.

Mode  Description
0     Barrier sync every update
1     Rolling barrier sync
2     Fixed barrier sync
3     No barrier sync
4     No inter-cpu communication

Table 2.1: Asynchronicity modes used for benchmarking experiments, arranged from most to least synchronized.

2.2.2 Graph Coloring Benchmark

The graph coloring benchmark employs a graph coloring algorithm designed for distributed WLAN channel selection (Leith et al., 2012). In this algorithm, nodes begin by randomly choosing a color. Each computational update, nodes test for any neighbor with the same color. If and only if a conflicting neighbor is detected, nodes randomly select another color. The probability of selecting each possible color is stored in an array associated with each node. Before selecting a new color, the stored probability of selecting the current (conflicting) color is decreased by a multiplicative factor b. We used b = 0.1, as suggested by Leith et al. Likewise, the stored probability of selecting all others is increased by a multiplicative factor. Regardless of whether their color changed, nodes always transmit their current color to their neighbors.

Our benchmarks focus on weak scalability, using a fixed problem size of 2 048 graph nodes per thread or process. These nodes were arranged in a two-dimensional grid topology where each node had three possible colors and four neighbors. We implement the algorithm with a single Conduit communication layer carrying graph color as an unsigned integer. We used Conduit's built-in pooling feature to consolidate color information into a single MPI message between pairs of communicating processes each update. We performed five replicates, each with a five second simulation runtime.
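For concreteness, the following sketch gives one plausible reading of the color update rule described above. Only the multiplicative penalty factor b = 0.1 on the conflicting color is specified by the text; the renormalization scheme (which increases all other colors' probabilities by a common multiplicative factor) is an assumption on our part.

```cpp
// Sketch of the per-node update rule for distributed graph coloring
// (after Leith et al., 2012). The renormalization of non-conflicting
// colors is an assumption; the text specifies only the factor b.
#include <random>
#include <vector>

constexpr double kB = 0.1;  // multiplicative penalty factor b

struct GraphNode {
  int color;
  std::vector<double> probs;  // selection probability per color; sums to 1

  void Update(const std::vector<int>& neighbor_colors, std::mt19937& rng) {
    // Test for any neighbor sharing our current color.
    bool conflict = false;
    for (const int c : neighbor_colors) {
      if (c == color) { conflict = true; break; }
    }
    if (!conflict) return;  // keep current color

    // Penalize the conflicting color by multiplicative factor b, then
    // renormalize; renormalization increases every other color's
    // probability by a common multiplicative factor.
    probs[color] *= kB;
    double total = 0.0;
    for (const double p : probs) total += p;
    for (double& p : probs) p /= total;

    // Randomly select another color per the stored probabilities.
    std::discrete_distribution<int> dist(probs.begin(), probs.end());
    color = dist(rng);
  }
};
```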
Solution error was measured as the number of graph color conflicts remaining at the end of the benchmark.

2.2.3 Asynchronicity Modes

For both benchmarks, we compared performance across a spectrum of synchronization settings, which we term "asynchronicity modes" (Table 2.1). Asynchronicity mode 0 represents traditional fully-synchronous methodology. Under this treatment, full barrier synchronization was performed between each computational update. Asynchronicity mode 3 represents fully asynchronous methodology. Under this treatment, individual threads or processes performed computational updates freely, incorporating input from other threads or processes on a fully best-effort basis.

During early development of the library, we discovered episodes where unprocessed messages built up faster than they could be processed — even if they were being skipped over to only get the latest message. In some instances, this strongly degraded quality of service or even caused runtime instability. We opted for MPI communication primitives that could consume many backlogged messages per call and increased buffer size to address these issues, but remained interested in the possibility of partial synchronization to clear potential message backlogs. So, we included two partially-synchronized treatments: asynchronicity modes 1 and 2. In asynchronicity mode 1, threads and processes alternated between performing computational updates for a fixed-time duration and executing a global barrier synchronization. For the graph coloring benchmark, work was performed in 10ms chunks. For the digital evolution benchmark, which is more computationally intensive, work was performed in 100ms chunks. In asynchronicity mode 2, threads and processes executed global barrier synchronizations at predetermined time points. In both experiments, global barrier synchronization occurred on second-hand ticks of the UTC clock.

Finally, asynchronicity mode 4 disables all inter-thread and inter-process communication, including barrier synchronization. We included this mode to isolate the impact on performance of communication between threads and processes from other factors potentially affecting performance, such as cache crowding. In this run mode for the graph coloring benchmark, all calls to send messages between processes or threads were skipped (except after the benchmark concluded, when assessing solution quality). Because of its larger footprint, incorporating logic into the digital evolution simulation to disable all inter-thread and inter-process messaging was impractical. Instead, we launched multiple instances of the simulation as fully-independent processes and measured the performance of each.
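A minimal sketch of how these synchronization policies might wrap a simulation's update loop appears below. The Update, Done, and UTC-tick helpers are hypothetical placeholder scaffolding, not code from our benchmarks.

```cpp
// Sketch of the synchronization policies in Table 2.1 wrapped around a
// simulation loop (hypothetical scaffolding; illustrative only).
#include <mpi.h>
#include <chrono>

namespace {

bool Done() {  // placeholder end-of-run test
  static int updates_elapsed = 0;
  return ++updates_elapsed > 1000;
}

void Update() { /* one compute phase + best-effort communication phase */ }

bool UtcSecondBoundaryPassed() {  // true once per tick of the second hand
  using namespace std::chrono;
  static auto last =
      duration_cast<seconds>(system_clock::now().time_since_epoch());
  const auto now =
      duration_cast<seconds>(system_clock::now().time_since_epoch());
  if (now == last) return false;
  last = now;
  return true;
}

}  // namespace

enum class AsyncMode {
  kBarrierEveryUpdate,  // mode 0
  kRollingBarrier,      // mode 1
  kFixedBarrier,        // mode 2
  kNoBarrier            // mode 3 (mode 4 additionally disables messaging)
};

void RunLoop(const AsyncMode mode, const std::chrono::milliseconds chunk) {
  using clock_t = std::chrono::steady_clock;
  auto next_sync = clock_t::now() + chunk;
  while (!Done()) {
    Update();
    switch (mode) {
      case AsyncMode::kBarrierEveryUpdate:  // lockstep with all processes
        MPI_Barrier(MPI_COMM_WORLD);
        break;
      case AsyncMode::kRollingBarrier:
        // barrier after each fixed-duration work chunk, with the delay
        // measured from the end of the last synchronization
        if (clock_t::now() >= next_sync) {
          MPI_Barrier(MPI_COMM_WORLD);
          next_sync = clock_t::now() + chunk;
        }
        break;
      case AsyncMode::kFixedBarrier:  // barrier at predetermined timepoints
        if (UtcSecondBoundaryPassed()) MPI_Barrier(MPI_COMM_WORLD);
        break;
      case AsyncMode::kNoBarrier:  // fully best-effort; never wait
        break;
    }
  }
}
```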
2.2.4 Quality of Service Metrics

The best-effort communication model eschews effort to insulate computation from real-time message delivery dynamics. Because these dynamics are difficult to predict a priori and can bias computation, thorough empirical runtime measurements are necessary to understand the results of such computation. To this end, we developed a suite of quality of service metrics. Figure 2.1 provides space-time diagrams illustrating the metrics presented in this section. For the purposes of these metrics, we assume that simulations proceed in an iterative fashion with alternating compute and communication phases. For short, we refer to a single compute-communication cycle as a "simstep."

We derive formulas for metrics in terms of independent observations preceding and succeeding a "snapshot" window, during which the simulation and any associated best-effort communication proceeds unimpeded. Snapshot observations are taken at one minute intervals over the course of each of our replicate experiments. The following section, 2.2.5, details the experimental apparatus used to generate the quality of service metrics reported in this work.

Figure 2.1: Quality of service metrics: (a) clumpiness, (b) delivery failure rate, (c) latency, and (d) simstep period. Each illustration is a space-time diagram, with A and B representing independent processes. The vertical axis depicts the passage of time, from top to bottom. Solid black arrows represent message delivery. The left panel of each metric's diagram depicts a scenario with a lower ("better") value for that metric compared to the right panel, which depicts a higher ("worse") value for that metric.

Simstep Period

We calculate the amount of wall-time elapsed per simulation update cycle ("simstep period") during a snapshot window as

\[ \text{simstep period} = \frac{\text{walltime}_{\text{after}} - \text{walltime}_{\text{before}}}{\text{update count}_{\text{after}} - \text{update count}_{\text{before}}}. \]

Figure 2.1d compares a scenario with a low simstep period to a scenario with a higher simstep period.

Simstep Latency

This metric reports the number of simulation iterations that elapse between message dispatch and message delivery. Figure 2.1c compares a scenario with low latency to a scenario with higher latency.

To insulate against imperfect clock synchronization between processes, we estimate one-way latency from a round-trip measure. As part of our instrumentation, each simulation element maintains an independent zero-initialized "touch counter" associated with every neighbor simulation element it communicates with. Dispatched messages originating from each simulation element are bundled with the value of the touch counter associated with the target element. When a message is received back at the originating element from the target element, the touch counter is set to 1 + the bundled touch count. In this manner, the touch counter increments by two for each successful round trip completed. (Because simulation elements are arranged as a toroidal mesh, all interaction between simulation elements is reciprocal.) We therefore calculate one-way latency during a snapshot window as

\[ \text{simstep latency} = \frac{\text{update count}_{\text{after}} - \text{update count}_{\text{before}}}{\max\left(\text{touch count}_{\text{after}} - \text{touch count}_{\text{before}},\, 1\right)}. \]

Note that if no touches elapsed during the snapshot window, we make a best-case assumption that one might elapse immediately after the end of the snapshot window (i.e., we count at least one elapsed touch).

Wall-time Latency

Wall-time latency is closely related to simstep latency, except that it reports time in terms of elapsed wall time instead of simulation updates. To calculate wall-time latency, we apply a conversion to simstep latency based on simstep period,

\[ \text{wall-time latency} = \text{simstep latency} \times \text{simstep period}. \]

This metric directly reflects the real-time performance of message transmission. Although it follows directly from the interaction between simstep period and simstep latency, it complements simstep latency's convenient interpretation in terms of potential simulation mechanics (e.g., simulation elements tending to see data from two updates ago versus from ten).
In addition to simstep latency, Figure 2.1c is also representative of wall-time latency — the difference being interpretation of the y axis in terms of wall-time instead of elapsed simulation updates.

Delivery Failure Rate

Delivery failure rate measures the fraction of messages sent that are dropped. The only condition where messages are dropped is when a send buffer fills. (Under the existing MPI-based implementation, messages that queue on the send buffer are guaranteed for delivery.) So, we can calculate

\[ \text{delivery failure rate} = 1 - \frac{\text{successful send count}_{\text{after}} - \text{successful send count}_{\text{before}}}{\text{attempted send count}_{\text{after}} - \text{attempted send count}_{\text{before}}}. \]

Delivery Clumpiness

Delivery clumpiness seeks to quantify the extent to which message arrival is consolidated to a subset of message pull attempts. That is, it captures the extent to which independently dispatched messages arrive in bundles rather than as an even stream. If messages all arrive in independent pull attempts, then clumpiness will be zero. At the point where the pigeonhole principle applies (num arriving messages ≥ num pull attempts), clumpiness will also be zero so long as every pull attempt is laden. If all messages arrive during a single pull attempt, then clumpiness will approach 1.

We formulate clumpiness as the complement of steadiness. (Reporting clumpiness provides a lower-is-better interpretation consistent with the rest of the quality of service metrics.) Steadiness, in turn, stems from three component statistics,

\[ \text{num laden pulls elapsed} = \text{laden pull count}_{\text{after}} - \text{laden pull count}_{\text{before}}, \]
\[ \text{num messages received} = \text{message count}_{\text{after}} - \text{message count}_{\text{before}}, \]
\[ \text{num pulls attempted} = \text{pull attempt count}_{\text{after}} - \text{pull attempt count}_{\text{before}}. \]

Here, we refer to pull attempts that successfully retrieve a message as "laden." We combine num messages received and num pulls attempted to derive

\[ \text{num opportunities for laden pulls} = \min\left(\text{num messages received},\, \text{num pulls attempted}\right). \]

Then, to calculate steadiness,

\[ \text{steadiness} = \frac{\text{num laden pulls elapsed}}{\text{num opportunities for laden pulls}}. \]

Finally, we find delivery clumpiness as 1 − steadiness. Figure 2.1a compares a scenario with low clumpiness to a scenario with higher clumpiness.
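Taken together, these definitions reduce to simple arithmetic over before/after counter observations. The sketch below restates them in code; the Snapshot struct and its field names are hypothetical stand-ins for our instrumentation counters.

```cpp
// Sketch of the snapshot-differenced quality of service calculations from
// Section 2.2.4 (hypothetical Snapshot struct; illustrative only).
#include <algorithm>

struct Snapshot {  // counters sampled before and after a snapshot window
  double walltime;  // seconds
  long update_count, touch_count;
  long laden_pull_count, message_count, pull_attempt_count;
  long successful_send_count, attempted_send_count;
};

double SimstepPeriod(const Snapshot& a, const Snapshot& b) {
  return (b.walltime - a.walltime) / (b.update_count - a.update_count);
}

double SimstepLatency(const Snapshot& a, const Snapshot& b) {
  // count at least one elapsed touch if none were observed (best case)
  const long touches = std::max(b.touch_count - a.touch_count, 1L);
  return double(b.update_count - a.update_count) / touches;
}

double WalltimeLatency(const Snapshot& a, const Snapshot& b) {
  return SimstepLatency(a, b) * SimstepPeriod(a, b);
}

double DeliveryFailureRate(const Snapshot& a, const Snapshot& b) {
  const double successes =
      b.successful_send_count - a.successful_send_count;
  const double attempts = b.attempted_send_count - a.attempted_send_count;
  return 1.0 - successes / attempts;
}

double DeliveryClumpiness(const Snapshot& a, const Snapshot& b) {
  const double laden = b.laden_pull_count - a.laden_pull_count;
  const double received = b.message_count - a.message_count;
  const double attempts = b.pull_attempt_count - a.pull_attempt_count;
  const double opportunities = std::min(received, attempts);
  return 1.0 - laden / opportunities;  // complement of steadiness
}
```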
2.2.5 Quality of Service Experiments

Quality of service experiments executed the graph coloring algorithm described in Section 2.2.2. In order to maximize communication intensity, only one graph vertex was assigned per CPU. Ten experimental replicates were performed for each condition surveyed. Slightly over five minutes of runtime was afforded to each replicate, with snapshot observations taken at one minute intervals. The first snapshot observation was taken one minute after the beginning of runtime. Snapshot observations lasted one second, with the graph coloring algorithm running fully unhampered during the entire snapshot. This was accomplished by collecting and recording data via a separate thread. That thread collected and recorded a first tranche of snapshot data, spin waited for one second, and then recorded a second tranche.

Because the underlying system runs in real-time while being observed, state changes can occur during data collection (somewhat akin to photographic motion blur). Therefore, some intuitive invariants — like strictly non-negative delivery failure rates — do not hold in some cases. However, the magnitude of such violations is generally minor. Further, because data collection procedures were consistent across treatments, statistical comparisons between treatments remain sound, even if direct interpretation of reported metrics should be taken with a grain of salt.

Snapshots were performed independently for each process at each timepoint. So, for example, for two processes over the five minute window of a single replicate, ten snapshots were collected. For statistical tests comparing treatments, snapshots were aggregated by replicate by both mean and median. For each quality of service statistic we estimate the mean — which captures effects of extreme-magnitude outliers — and the median — which better represents typicality — across these window samples.

Statistical comparisons across treatment conditions are performed via regression. We use ordinary least squares regression to analyze means (Geladi and Kowalski, 1986) and quantile regression to analyze medians (Koenker and Hallock, 2001). For comparisons between dichotomous, categorical treatment conditions, one condition is coded as 0 and the other as 1. In the case of ordinary least squares regression, this boils down to an independent t-test. Although quantile regression on categorical predictors is not precisely equivalent to a direct test on medians between two groups (i.e., Mood's median test), there is precedent for this approach (Konstantopoulos et al., 2019; Petscher and Logan, 2014).

Most statistics reported here can be calculated just as well in terms of incoming or outgoing messages. That is, most statistics can be generated via data from instrumentation attached to message "inlets" or data from instrumentation attached to message "outlets" with no obvious reason to prefer one over the other. As "inlet-" and "outlet-"derived statistics are nearly identical in all cases, we simply report the mean over these two measurements.

2.2.6 Code, Data, and Reproducibility

Benchmarking Experiments

Benchmarking experiments were performed on Michigan State University's High Performance Computing Center, a cluster of hundreds of heterogeneous x86 nodes linked with InfiniBand interconnects. For multithread experiments, benchmarks for each thread count were collected from the same node. For multiprocess experiments, each process was assigned to a distinct node in order to ensure results were representative of performance in a distributed context. All multiprocess benchmarks were recorded from the same collection of nodes. Hostnames are recorded for each benchmark data point. For an exact accounting of hardware architectures used, these hostnames can be cross-referenced with a table included with the data that summarizes the cluster's node configurations.

Code for the distributed graph coloring benchmark is available at https://github.com/mmore500/conduit under demos/channel_selection. Code for the digital evolution simulation benchmark is available at https://github.com/mmore500/dishtiny. Exact versions of software used are recorded with each benchmark data point. Data is available via the Open Science Framework at https://osf.io/7jkgp/ and https://osf.io/72k5n (Foster and Deardorff, 2017). A live, in-browser notebook for all reported statistics and data visualizations is available via Binder at https://mybinder.org/v2/gh/mmore500/conduit/binder?filepath=binder%2Fdate%3D2021%2Bproject%3D72k5n (Project Jupyter et al., 2018).
Quality of Service Experiments

Quality of service experiments were carried out on Michigan State University's High Performance Computing Center lac cluster, consisting of 28-core Intel(R) Xeon(R) CPU E5-2680 v4 2.40GHz nodes. All statistical comparisons are performed between observations from the same job allocation (except in the case where intranode and internode configurations were compared; those experiments were performed on separate allocations using comparable nodes on the same cluster).

The benchmarking experiments described in Section 2.2.6 used a send/receive buffer size of 2. However, due to the high communication intensity of the graph coloring problem with just one simulation element per CPU, quality of service experiments required a larger buffer size of 64 to maintain runtime stability. In early work developing the Conduit library, we discovered that real-time messaging channels can enter a destabilizing positive feedback spiral when incoming messages take longer to handle (e.g., skip past or read) than to send. Under such conditions, when a process exchanging messages with a partner process experiences a delay, it sends fewer messages to that partner process. Due to fewer incoming messages, the partner process can update more rapidly, increasing the incoming message load on the delayed process. This effect can snowball, degrading a partnership intended for even, two-way message exchange into an effectively unilateral producer-consumer relationship where (potentially unbounded) work piles up on the consumer. To interrupt such a scenario, we use the bulk message pull call MPI_Testsome to ensure fast message consumption under backlogged conditions, so that receiver workload remains closer to constant under high-traffic situations (instead of having to pull messages down one-by-one); a sketch of this pattern appears at the end of this section. Larger receive buffer size, as configured for the quality of service experiments, increases the effectiveness of the bulk message consumption countermeasure.

Code for the distributed graph coloring benchmark is available at https://github.com/mmore500/conduit under demos/channel_selection. Exact versions of software used are recorded with each benchmark data point. Data is available via the Open Science Framework at https://osf.io/72k5n/ (Foster and Deardorff, 2017). A live, in-browser notebook for all reported statistics and data visualizations is available via Binder at https://mybinder.org/v2/gh/mmore500/conduit/binder?filepath=binder%2Fdate%3D2021%2Bproject%3D72k5n (Project Jupyter et al., 2018).
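The following sketch illustrates bulk message consumption with MPI_Testsome over a ring of posted receives. It is a minimal illustration of the countermeasure described above, not the Conduit implementation; the ReceiveRing type is hypothetical.

```cpp
// Sketch of bulk message consumption with MPI_Testsome, which clears many
// backlogged messages per call (hypothetical type; illustrative only).
#include <mpi.h>
#include <vector>

constexpr int kBufferSize = 64;  // receive buffer size used in QoS runs

struct ReceiveRing {
  std::vector<int> payloads = std::vector<int>(kBufferSize);
  std::vector<MPI_Request> requests =
      std::vector<MPI_Request>(kBufferSize);

  void PostAll(const int source) {
    for (int i = 0; i < kBufferSize; ++i)
      MPI_Irecv(&payloads[i], 1, MPI_INT, source, /*tag=*/0,
                MPI_COMM_WORLD, &requests[i]);
  }

  // Consume every completed receive in a single call, then repost;
  // returns the number of backlogged messages cleared.
  int PullBulk(const int source) {
    std::vector<int> indices(kBufferSize);
    int outcount = 0;
    MPI_Testsome(kBufferSize, requests.data(), &outcount,
                 indices.data(), MPI_STATUSES_IGNORE);
    for (int j = 0; j < outcount; ++j) {
      const int i = indices[j];
      // handle or skip past payloads[i] here, then repost the slot
      MPI_Irecv(&payloads[i], 1, MPI_INT, source, /*tag=*/0,
                MPI_COMM_WORLD, &requests[i]);
    }
    return outcount;
  }
};
```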
2.3 Results and Discussion

Sections 2.3.1 and 2.3.2 compare execution performance under the best-effort communication model versus the perfect communication model. In particular, both sections investigate how the impact of best-effort communication on performance relates to CPU count scale. Section 2.3.1 covers multithreading and Section 2.3.2 covers multiprocessing.

The next sections investigate how system configuration affects quality of service. Specifically, these sections cover the impact of

• increasing compute workload per simulation update step (Section 2.3.3),
• within-node versus between-node process placement (Section 2.3.4), and
• multithreading versus multiprocessing (Section 2.3.5).

Section 2.3.6 tests how quality of service changes with CPU count. This analysis fleshes out the performance-centric picture of best-effort scalability established in Sections 2.3.1 and 2.3.2. Section 2.3.7 tests how inclusion of an apparently faulty node (i.e., one that provided exceptionally poor quality of service) affects global quality of service. This experiment provides insight into the robustness of best-effort approaches to single-point failure.

2.3.1 Performance: Multithread Benchmarks

We first tested how performance on the graph coloring and digital evolution benchmarks fared when increasing thread count on a single hardware node.

Figure 2.2a presents per-CPU algorithm update rate for the graph coloring benchmark at 1, 4, 16, and 64 threads. Update rate performance decreased with increasing multithreading across all asynchronicity modes. This performance degradation was rather severe — per-CPU update rate decreased by 61% between 1 and 4 threads and by about another 75% between 4 and 64 threads. Surprisingly, this issue appears largely unrelated to inter-thread communication, as it was also observed in asynchronicity mode 4, where all inter-thread communication is disabled. Perhaps per-CPU update rate degradation under threading was induced by strain on a limited system resource like memory cache or access to the system clock (which was used to control run timing). This unexpectedly severe phenomenon merits further investigation in future work with this benchmark.

Nevertheless, we were able to observe significantly better performance of best-effort asynchronicity modes 1, 2, and 3 at high thread counts. At 64 threads, these run modes significantly outperformed the fully-synchronized mode 0 (p < 0.05, non-overlapping 95% confidence intervals). Likewise, as shown in Figure 2.2b, best-effort asynchronicity modes were able to deliver significantly better graph coloring solutions within the allotted compute time than the fully-synchronized mode 0 (p < 0.05, non-overlapping 95% confidence intervals).

Figure 2.2c shows per-CPU algorithm update rate for the digital evolution benchmark at 1, 4, 16, and 64 threads. Similarly to the graph coloring benchmark, update rate performance decreased with increasing multithreading across all asynchronicity modes — including mode 4, which eschews inter-thread communication. Even without communication between threads, with 64 threads each thread performed updates at only 61% the rate of a lone thread. At 64 threads, best-effort asynchronicity modes 1, 2, and 3 exhibit about 43% the update-rate performance of a lone thread. Although best-effort inter-thread communication only exhibits half the update-rate performance of completely decoupled execution at 64 threads, this update-rate performance is roughly 2.1× that of the fully-synchronous mode 0. Indeed, best-effort modes significantly outperform the fully-synchronous mode on the digital evolution benchmark at both 16 and 64 threads (p < 0.05, non-overlapping 95% confidence intervals).

Figure 2.2: Multithread benchmark results: (a) graph coloring per-thread update rate (higher is better), (b) graph coloring solution conflicts (lower is better), and (c) digital evolution per-thread update rate (higher is better). Bars represent bootstrapped 95% confidence intervals.
2.3.2 Performance: Multiprocess Benchmarks

Next, we tested how performance on the graph coloring and digital evolution benchmarks fared when scaling with fully independent processes located on different hardware nodes.

Figure 2.3a shows per-CPU algorithm update rate for the graph coloring benchmark at 1, 4, 16, and 64 processes. Unlike the multithreaded benchmark, multiprocess graph coloring exhibits consistent update-rate performance across process counts under asynchronicity mode 4, where inter-CPU communication is entirely disabled. This matches the unsurprising expectation that, with comparable hardware, a single process should exhibit the same mean performance as any number of completely decoupled processes. At 64 processes, best-effort asynchronicity mode 3 with the graph coloring benchmark exhibits about 63% the update-rate performance of single-process execution. This represents a 7.8× speedup compared to fully-synchronous mode 0. Indeed, best-effort mode 3 enables significantly better per-CPU update rates at 4, 16, and 64 processes (p < 0.05, non-overlapping 95% confidence intervals).

Likewise, as shown in Figure 2.3b, best-effort asynchronicity mode 3 yields significantly better graph-coloring results within the allotted time at 4, 16, and 64 processes (p < 0.05, non-overlapping 95% confidence intervals). Interestingly, partial-synchronization modes 1 and 2 exhibited highly inconsistent solution quality at the 16 and 64 process count benchmarks. Fixed-timepoint barrier sync (mode 2) had particularly poor performance at 64 processes (note the log-scale axis). We suspect this was caused by a race condition where workers would anchor sync points to different fixed timepoints based on slightly different startup times (i.e., process 0 syncs at seconds 0, 1, 2... while process 1 syncs at seconds 1, 2, 3...).

Figure 2.3c presents per-CPU algorithm update rate for the digital evolution benchmark at 1, 4, 16, and 64 processes. Relative performance fares well at high process counts under this relatively computation-heavy workload. With 64 processes, fully best-effort simulation retains about 92% the update rate performance of single-process simulation. This represents a 2.1× speedup compared to the fully-synchronous run mode 0. Best-effort mode 3 significantly outperforms the per-CPU update rate of fully-synchronous mode 0 at process counts 16 and 64 (p < 0.05, non-overlapping 95% confidence intervals).

Figure 2.3: Multiprocess benchmark results: (a) graph coloring per-process update rate (higher is better), (b) graph coloring solution conflicts (lower is better), and (c) digital evolution per-process update rate (higher is better). Bars represent bootstrapped 95% confidence intervals.

2.3.3 Quality of Service: Computation vs. Communication

Having shown performance benefits of best-effort communication on the graph coloring and digital evolution benchmarks in Sections 2.3.1 and 2.3.2, we next seek to more fully characterize the best-effort approach using a holistic suite of proposed quality of service metrics. This section evaluates how a simulation's ratio of communication intensity to computational work affects these quality of service metrics. The graph coloring benchmark serves as our experimental model.

For this experiment, arbitrary compute work (detached from the underlying algorithm) was added to the simulation update process. We used a call to the std::mt19937 random number engine as a unit of compute work. In microbenchmarks, we found that one work unit consumed about 35ns of walltime and 21ns of compute time. We performed 5 treatments, adding 0, 64, 4 096, 262 144, or 16 777 216 units of compute work to the update process. For each treatment, measurements were made on a pair of processes split across different nodes.
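As a concrete sketch, the added-work manipulation can be expressed as below. The helper name is hypothetical; the volatile qualifier simply guards against the compiler optimizing the loop away.

```cpp
// Sketch of the added-compute-work manipulation: one unit of work is one
// call to a std::mt19937 engine (~35ns walltime per call, per our
// microbenchmarks). The helper name is hypothetical.
#include <cstdint>
#include <random>

void DoAddedWork(std::mt19937& rng, const std::uint64_t num_work_units) {
  for (std::uint64_t i = 0; i < num_work_units; ++i) {
    volatile std::uint32_t discard = rng();  // prevent loop elision
    (void)discard;
  }
}

// Treatments added 0, 64, 4 096, 262 144, or 16 777 216 work units to
// each simulation update.
```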
Simstep Period

Unsurprisingly, we found a direct relationship between per-update computational workload and the walltime required per computational update. Supplementary Figures A.24 and A.26 depict the distribution of walltime per computational update across snapshots. Once added compute work supersedes the light compute work already associated with the graph coloring algorithm update step (at around 64 work units), simstep period scales in direct proportion with compute work. Indeed, we found a significant positive relationship between both mean and median simstep period and added compute work (Supplementary Figures A.32 and A.34). At 0 units of added compute work, mean and median simstep period were both 14.7 µs. At 16 777 216 units of added compute work, mean simstep period was 611ms and median simstep period was 507ms. Supplementary Tables A.17 and A.18 detail numerical results of these regressions.

Simstep Latency

Unsurprisingly, again, we observed a negative relationship between the number of simulation steps elapsed during message transit and added computational work. Put simply, longer update steps provide more time for messages to transit. Supplementary Figures A.25 and A.21 show the distribution of simstep latency across compute workloads. With no added compute work, messages take between 20 and 100 simulation steps to transit (mean: 48.0 updates; median: 42.5 updates). At maximum compute work per update, messages arrive at a median 1.00 update latency. Regression analysis confirms a significant negative relationship between both mean and median log simstep latency and log added compute work (Supplementary Figures A.33 and A.29). Supplementary Tables A.17 and A.18 detail numerical results of these regressions.

Walltime Latency

Effects of compute work on walltime latency highlight an important caveat in interpretation of this metric. At 0, 64, and 4 096 work units, walltime latency measures ≈ 1 ms (means: 708 µs, 788 µs, 902 µs; medians: 622 µs, 640 µs, 738 µs). However, once simstep period grows to ≈ 10 ms at 262 144 work units (an order of magnitude in excess of the walltime latency observed at low compute loads), walltime latency increases with added compute work. At 16 777 216 compute work units, a 1.00s median walltime latency is observed.

Because our computational model assumes on-demand message delivery with a communication phase occurring only once per simulation update, message transmission speed is fundamentally limited by simulation update period. If a message is dispatched while its recipient is busy doing computational work, the soonest it can be received will be when that recipient completes the computational phase of its update. In order to measure transmission time fully independent of delays due to on-demand delivery, additional instrumentation would be necessary.
However, when this latency is greater than a few simsteps, this measure is reasonably representative of message transmission time. Supplementary Figures A.20 and A.22 show the distribution of walltime latency across computational workloads. Supplementary Figures A.28 and A.30 summarize regression between walltime latency and added compute work. Supplementary Tables A.17 and A.18 detail numerical results of those regressions.

Delivery Clumpiness

We observed a significant negative relationship between computation workload and delivery clumpiness. At low computational intensity, we observed clumpiness greater than 0.95, meaning that fewer than 5% of pull requests were laden with fresh messages (at 0 compute work mean: 0.96, median: 0.96). However, at high computational intensity clumpiness reached 0, indicating that messages arrived as a steady stream (at 16 777 216 compute work mean: 0.00, median: 0.00). Presumably, the reduction in clumpiness is due to increased real-time separation between dispatched messages. Supplementary Figure A.23 shows the effect of computational workload on the distribution of observed clumpiness values. We found a significant negative relationship between both mean and median clumpiness and computational intensity. Supplementary Figure A.31 visualizes these regressions and Supplementary Tables A.17 and A.18 provide numerical details.

Delivery Failure Rate

We did not observe any delivery failures across all replicates and all compute workloads. So, compute workload had no observable effect on delivery reliability. Supplementary Figure A.27 shows the distribution of delivery failure rates across computation workloads and Supplementary Figure A.35 shows regressions of delivery failure rate against computational workload. See Supplementary Tables A.17 and A.18 for numerical details.

2.3.4 Quality of Service: Intranode vs. Internode

This section tests the effect of process assignment on best-effort quality of service, comparing multi-node and single-node assignments. The graph coloring benchmark again serves as our experimental model. For this experiment, processes were either assigned to the same node or were assigned to different nodes. In both cases, we used two processes.

Simstep Period

Simstep period was significantly slower under internode conditions than under intranode conditions. When processes shared the same node, simstep period was around 9 µs (mean: 9.06 µs; median: 9.08 µs). Under internode conditions, simstep period was around 14 µs (mean: 14.5 µs; median: 14.4 µs). Supplementary Figures A.40 and A.42 depict the distribution of walltime per computational update across intranode and internode conditions.

This result is presumably attributable to an increased walltime cost for calls to the MPI implementation backing internode communication compared to the MPI implementation backing intranode communication. Although this effect is clearly detectable, its magnitude is modest given the minimal computational intensity of the simulation update step — internode dispatch is only ≈ 56% more expensive than intranode dispatch.

Both mean and median simstep period increased significantly under internode conditions. (Supplementary Figures A.48 and A.50 visualize these regressions and Supplementary Tables A.19 and A.20 detail numerical results.)

Simstep Latency

Significantly more simulation updates transpired during message transmission under internode conditions compared to intranode conditions.
Supplementary Figures A.41 and A.37 compare the distributions of simstep latency across these conditions. Simstep latency was around 1 update for intranode communication (mean: 1.00 updates; median: 0.75 updates) and around 40 updates for internode communication (mean: 41.6 updates; median: 37.4 updates).

Regression analysis confirms the significant effect of process placement on simstep latency (Supplementary Figures A.49 and A.45). Supplementary Tables A.19 and A.20 detail numerical results of these regressions.

Walltime Latency

Significantly more walltime elapsed during message transmission under internode conditions compared to intranode conditions. Walltime latency was less than 10 µs for intranode communication (mean: 7.70 µs; median: 6.94 µs). Internode communication had approximately 50× greater walltime latency, at around 500 µs (mean: 600 µs; median: 551 µs). Supplementary Figures A.36 and A.38 show the distributions of walltime latency for intra- and inter-node communication. Regression analysis confirmed a significant increase in walltime latency under inter-node communication (Supplementary Figures A.44, A.46; Supplementary Tables A.19 and A.20).

Delivery Clumpiness

Delivery clumpiness was minimal under intranode communication and very high under internode communication. Under intranode conditions, we observed a mean clumpiness value of 0.014 and a median of 0.002. Under internode conditions, we observed mean and median clumpiness values of 0.96. Supplementary Figures A.39 and A.39 show the distributions of clumpiness for intra- and inter-node communication. Regression analysis confirmed a significant increase in clumpiness under inter-node communication (Supplementary Figures A.47, A.47; Supplementary Tables A.19 and A.20).

Delivery Failure Rate

Somewhat counterintuitively, a significantly higher proportion of deliveries failed for intranode communication than for internode communication. We observed a delivery failure rate of around 0.3 for intranode communication (mean: 0.33; median: 0.30) and no delivery failures for internode communication (mean: 0.00; median: 0.00). In some intranode snapshot windows, we observed a delivery failure rate as high as 0.8. Supplementary Figures A.39 and A.39 show the distributions of delivery failure rate for intra- and inter-node communication.

Because of Conduit's current MPI-based implementation, messages only drop when the underlying send buffer fills; queued messages are guaranteed for delivery. The slower simstep period under internode allocation could improve stability of the send buffer due to more time, on average, between send attempts. Underlying buffering or consolidation by the MPI backend for internode communication might also play a role by allowing data to be moved out of the userspace send buffer more promptly. Regression analysis confirmed a significant increase in delivery failure under intra-node communication (Supplementary Figures A.47, A.47; Supplementary Tables A.19 and A.20).

2.3.5 Quality of Service: Multithreading vs. Multiprocessing

This section compares best-effort quality of service under multithreading and multiprocessing schemes. We hold hardware configuration constant by restricting multiprocessing to cores of a single hardware node, as is the case for multithreading. However, inter-process communication occurred via MPI calls while inter-thread communication occurred via shared memory access mediated by a C++ std::mutex. The graph coloring benchmark again serves as our experimental model.
Both treatments used a single pair of CPUs.

Simstep Period

Multithreading enabled faster simulation update turnover than multiprocessing. Under multithreading, simstep period was around 5 µs (mean: 4.60 µs; median: 4.64 µs). Simstep period for multiprocessing was around 9 µs (mean: 9.00 µs; median: 9.04 µs). Supplementary Figures A.56 and A.58 depict the distribution of walltime per computational update for both multiprocessing and multithreading. This result falls in line with expectations that interaction via shared memory incurs lower overhead than via MPI calls.

Regression analysis showed that both mean and median simstep period were significantly slower under multiprocessing compared to multithreading. (Supplementary Figures A.64 and A.66 visualize these regressions and Supplementary Tables A.21 and A.22 detail numerical results.)

Walltime Latency

No significant difference in walltime latency was detected between multiprocessing and multithreading. In the median case, walltime latency was approximately 5 µs for multithreading and 8 µs for multiprocessing. However, a pair of extreme outliers among snapshot windows — with walltime latencies of approximately 12ms — drove multithreading walltime latency much higher in the mean case (451 µs). In the median case, multiprocessing walltime latency was 8.56 µs.

Cache invalidation or mutex contention provide possible explanations for the observed episodes of extreme multithreading latency, although a magnitude on the order of milliseconds for such effects is surprising. Multithreading appears to provide marginally lower latency service in the median case, but at the cost of vulnerability to extreme high-latency disruptions.

Supplementary Figures A.52 and A.54 show the distributions of walltime latency for multithread and multiprocess runs. Regression analysis did not detect any significant difference in walltime latency between multithreading and multiprocessing (Supplementary Figures A.60, A.62; Supplementary Tables A.21 and A.22).

Simstep Latency

No significant difference in simstep latency was detected between multiprocessing and multithreading. In the median case, multiprocessing offered marginally lower simstep latency than multithreading. Median simstep latency was 0.84 updates under multiprocessing and 1.10 updates under multithreading. However, just as for walltime latency, extreme magnitude outliers (≈ 2 000 simsteps) boosted mean simstep latency for multithreading. Mean simstep latency was 0.94 updates under multiprocessing and 78.0 updates under multithreading. Supplementary Figures A.57 and A.53 compare the distributions of simstep latency across these conditions.

Direct measurements of simstep period and walltime latency suggest that a faster simstep period, rather than slower walltime latency, explains the marginally higher simstep latency under multithreading. Regression analysis detected no significant effect of threading versus processing on simstep latency in both the mean and median cases (Supplementary Figures A.65 and A.61). Supplementary Tables A.21 and A.22 detail numerical results of these regressions.

Delivery Clumpiness

Multithreading exhibited higher median clumpiness and greater variance in clumpiness than multiprocessing. Under multithreading, clumpiness was nearly 1 within some snapshot windows and less than 0.1 within others. Under multiprocessing, clumpiness was consistently less than 0.1.
Supplementary Figures A.55 and A.55 show the distributions of clumpiness under both multiprocessing and multithreading. Multithreading median clumpiness was 0.54; multiprocessing median clumpiness was 0.03. Multithreading and multiprocessing mean clumpiness values were 0.56 and 0.03, respectively. Regression analysis confirmed significantly greater clumpiness under multithreading compared to multiprocessing (Supplementary Figures A.63, A.63; Supplementary Tables A.21 and A.22).

Delivery Failure Rate

We observed a higher proportion of deliveries fail for multiprocessing than for multithreading. (This is as expected; the multithread implementation directly wrote updates to a piece of shared memory, so there was no send buffer to backlog and induce message drops.) Multiprocessing exhibited both mean and median delivery failure rates of 0.38. In individual multiprocessing snapshot windows, we observed delivery failure rates ranging from less than 0.1 to as high as 0.7. We observed no multithreaded delivery failures. Supplementary Figures A.55 and A.55 show the distributions of delivery failure rate for multithreading and multiprocessing. Regression analysis confirmed a significant increase in delivery failure under multiprocessing (Supplementary Figures A.63, A.63; Supplementary Tables A.21 and A.22).

2.3.6 Quality of Service: Weak Scaling

Sections 2.3.2 and 2.3.1 showed how best-effort communication could improve application performance, particularly when scaling up processor count. Multiprocess performance scales well under the best-effort approach, with overlapping performance estimate intervals for 16 and 64 processor counts on both surveyed benchmark problems.

This section aims to flesh out a more holistic picture of the effects of increasing processor count on best-effort computation by considering a comprehensive suite of quality of service metrics. Our particular interest is in which, if any, aspects of quality of service degrade under larger processing pools. To address these questions, we performed weak scaling experiments on 16, 64, and 256 processes using the graph coloring benchmark.

To broaden the survey, we tested scaling with different numbers of processors allocated per node and different numbers of simulation elements assigned per processor. For the first variable, we tested scaling on allocations with each processor hosted on an independent node and on allocations where each node hosted an average of four processors. This allowed us to examine how quality of service fared in homogeneous network conditions, where all communication between processes was inter-node, compared to heterogeneous conditions, where some inter-process communication was inter-node and some was intra-node. For the second variable, we tested with 2 048 simulation elements ("simels") per processor (consistent with the benchmarking experiments performed in Sections 2.3.2 and 2.3.1) and with just one simulation element per processor. This allowed us to vary the amount of computational work performed per process.

Simstep Period

Supplementary Figures A.5 and A.7 survey the distributions of simstep periods observed within snapshot windows. Across process counts, simstep period registers around 80 µs with one simel and around 200 µs with 2 048 simels. However, on heterogeneous allocations (4 CPUs per node) this metric is more variable, spanning up to an order of magnitude. Outlier observations range up to around 10ms with 2 048 simels and up to slightly less than 100ms inlet / 4s outlet with 1 simel.
We performed an ordinary least squares (OLS) regression to test how mean simstep period changed with processor count. In all cases except one simel per CPU with four CPUs per node, mean simstep period increased significantly with processor count from 16 to 64 to 256 CPUs. However, from 64 to 256 processors, mean simstep period increased significantly only with one simel per CPU and one CPU per node. Between 64 and 256 processes, mean simstep period actually decreased significantly for runs with 2 048 simels per CPU. Figure 2.4 and Supplementary Figure A.13 visualize reported OLS regressions. Supplementary Tables A.5 and A.7 provide numerical details on reported OLS regressions.

Median simstep period exhibited the same relationships with processor count, tested with quantile regression. Supplementary Figures A.18 and A.19 visualize the corresponding quantile regressions. Supplementary Tables A.13 and A.15 report numerical details on those quantile regressions.

Except for the extreme case of one simel per CPU and one CPU per node, simstep period quality of service is stable in scaling from 64 to 256 processes.

Walltime Latency

Walltime latency sits at around 500 µs for one-simel runs and around 2ms for 2 048-simel runs. However, variability is greater for heterogeneous (four CPUs per node) allocations. Extreme outliers of up to almost 100ms inlet / 2s outlet occur in four CPUs per node, one-simel runs. In 256 process, 2 048-simel, one CPU per node runs, outliers of more than 10s occur. Supplementary Figures A.1 and A.3 show the distribution of walltime latencies observed across run conditions.

We performed OLS regressions to test how mean walltime latency changed with processor count. Over 16, 64, and 256 processes, mean walltime latency increased significantly with processor count only with 2 048 simels per CPU. Between 64 and 256 processes, mean walltime latency increased significantly with processor count only for one CPU per node with 2 048 simels per CPU. Supplementary Figures A.9 and A.11 show these regressions. Supplementary Tables A.1 and A.3 provide numerical details.

Next, we performed quantile regressions to test how processor count affected median walltime latency. Over 16, 64, and 256 processes, median walltime latency increased significantly only with 4 CPUs per node and 2 048 simels per CPU. Over 64 and 256 processes, there was no significant relationship between processor count and median walltime latency under any condition. Figure 2.5 and Supplementary Figure A.16 show regression results. Supplementary Tables A.9 and A.11 provide numerical details.

Simstep Latency

Simstep latency sits around 7 updates for runs with one simel per CPU and around 1.2 updates for runs with 2 048 simels per CPU. For runs with one simel per CPU, outlier snapshot windows reach up to 50 updates under homogeneous allocations and up to almost 100 updates under heterogeneous allocations. The 2 048 simels per CPU, one CPU per node, 256 process condition exhibited outliers of up to almost 8 000 updates simstep latency. Supplementary Figures A.6 and A.2 show the distribution of simstep latencies observed across run conditions.

Over 16, 64, and 256 processes, mean simstep latency increased with process count only under 1 CPU per node, 2 048 simels per CPU conditions. The same was true over just 64 to 256 processes. Supplementary Figures A.12 and A.10 show the OLS regressions performed, with Supplementary Tables A.6 and A.2 providing numerical details.
For median simstep latency, however, there was no condition where latency increased significantly with process count. Figure 2.6 and Supplementary Figure A.15 show the quantile regressions performed, with Supplementary Tables A.14 and A.10 providing numerical details.

Delivery Clumpiness

For one-simel-per-CPU runs, median delivery clumpiness registered between 0.8 and 0.6. On 2 048-simel-per-CPU runs, median delivery clumpiness was lower, at around 0.4. Supplementary Figure A.4 shows the distribution of delivery clumpiness values observed across run conditions.

Using OLS regression, we found no evidence of mean clumpiness worsening with increased process count. In fact, over 16, 64, and 256 processes, clumpiness significantly decreased with process count in all conditions except four CPUs per node with 2 048 simels per CPU. Figure 2.7 and Supplementary Table A.4 detail regressions performed to test the relationship between mean clumpiness and process count. Median delivery clumpiness exhibited the same relationships with processor count, tested with quantile regression. Supplementary Figure A.17 and Supplementary Table A.12 detail regressions between median clumpiness and process count.

Delivery Failure Rate

Typical delivery failure rate was near zero, except with one simel per CPU and four CPUs per node, where median delivery failure rate was approximately 0.1. However, outlier delivery failure rates of up to 0.7 were observed with 1 CPU per node, 2 048 simels per CPU, and 256 processes. Outlier delivery failure rates of up to 0.2 were observed with 4 CPUs per node, 2 048 simels per CPU, and 256 processes. Supplementary Figure A.8 shows the distribution of delivery failure rates observed across run conditions.

Mean delivery failure rate increased significantly between 64 and 256 processes with 1 CPU per node and 2 048 simels per CPU as well as with 4 CPUs per node and 1 simel per CPU. However, median delivery failure rate only increased significantly with processor count with 4 CPUs per node and 1 simel per CPU. Supplementary Figure A.14 and Supplementary Table A.8 detail the OLS regression testing mean delivery failure rate against processor count. Figure 2.8 and Supplementary Table A.16 detail the quantile regression testing median delivery failure rate against processor count.

2.3.7 Quality of Service: Faulty Hardware

The extreme magnitude of outliers for metrics reported in Section 2.3.6 prompted further investigation of the conditions under which these outliers arose. Closer inspection revealed that the most extreme outliers were all associated with snapshots on a single node: lac-417. So, we acquired two separate 256 process allocations on the lac cluster: one including lac-417 and one excluding lac-417.

Supplementary Figures A.72, A.74, A.73, A.69, A.68, A.70, A.71, and A.75 compare the distributions of quality of service metrics between allocations with and without lac-417. Extreme outliers are present exclusively in the lac-417 allocation for walltime latency, simstep latency, and delivery failure rate. Otherwise, the metrics' distributions across snapshots are very similar between allocations.

Supplementary Figures A.80, A.82, A.81, A.77, A.76, A.78, A.79, A.79, and A.83 chart OLS and quantile regressions of quality of service metrics on job composition. Mean walltime latency, simstep latency, and delivery failure rate are all significantly greater with lac-417. Surprisingly, mean simstep period is significantly longer without lac-417.
However, there is no significant difference in median value for any quality of service metric between allocations including or excluding lac-417. This stability of metric medians within allocations containing lac-417 — which have significantly different means due to outlier values induced by the presence of lac-417 — demonstrates how the best-effort system maintains overall quality of service stability despite defective or degraded components. Supplementary Tables A.23 and A.24 provide numerical details on the regressions reported above.

Figure 2.4: Ordinary least squares regressions of Simstep Period Inlet (ns) against log processor count for the weak scaling experiment (Section 2.3.6). Lower is better. Panels: (a) complete ordinary least squares regression plot (observations are means per replicate); (b) estimated regression coefficient for complete regression (zero corresponds to no effect); (c) piecewise ordinary least squares regression plot; (d) estimated regression coefficient for rightmost partial regression. Top row shows complete regression and bottom row shows piecewise regression. Ordinary least squares regression estimates the relationship between independent variable and mean of response variable. Error bands and bars are 95% confidence intervals. Note that log is base 4, so processor counts correspond to 16, 64, and 256.
[Figure 2.5 (plots omitted): Quantile regressions of Latency Walltime Inlet (ns) against log processor count for the weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression; bottom row shows piecewise regression. Panels: (a) complete quantile regression plot, observations are medians per replicate; (b) estimated regression coefficient for the complete regression, zero corresponds to no effect; (c) piecewise quantile regression plot; (d) estimated regression coefficient for the rightmost partial regression. Quantile regression estimates the relationship between the independent variable and the median of the response variable. Note that log is base 4, so processor counts correspond to 16, 64, and 256.]

[Figure 2.6 (plots omitted): Quantile regressions of Latency Simsteps Inlet against log processor count for the weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression; bottom row shows piecewise regression. Panels as in Figure 2.5. Quantile regression estimates the relationship between the independent variable and the median of the response variable. Note that log is base 4, so processor counts correspond to 16, 64, and 256.]

[Figure 2.7 (plots omitted): Ordinary least squares regressions of Delivery Clumpiness against log processor count for the weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression; bottom row shows piecewise regression. Panels: (a) complete OLS regression plot, observations are means per replicate; (b) estimated regression coefficient for the complete regression, zero corresponds to no effect; (c) piecewise OLS regression plot; (d) estimated regression coefficient for the rightmost partial regression. OLS regression estimates the relationship between the independent variable and the mean of the response variable. Error bands and bars are 95% confidence intervals. Note that log is base 4, so processor counts correspond to 16, 64, and 256.]
[Figure 2.8 (plots omitted): Quantile regressions of Delivery Failure Rate against log processor count for the weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression; bottom row shows piecewise regression. Panels: (a) complete quantile regression plot, observations are medians per replicate; (b) estimated regression coefficient for the complete regression, zero corresponds to no effect; (c) piecewise quantile regression plot; (d) estimated regression coefficient for the rightmost partial regression. Quantile regression estimates the relationship between the independent variable and the median of the response variable. Note that log is base 4, so processor counts correspond to 16, 64, and 256.]

2.4 Conclusion

The fundamental motivation for best-effort communication is efficient scalability. Our results confirm that best-effort communication can fulfill this goal. We found that the best-effort approach significantly increases performance at high CPU count. This finding was consistent across the communication-intensive graph coloring benchmark and the computation-intensive digital evolution benchmark. The computation-heavy digital evolution benchmark yielded the strongest scaling efficiency, achieving at 64 processes 92% of the update rate of single-process execution. We observed the greatest relative speedup under distributed communication-heavy workloads: about 7.8× on the graph coloring benchmark. In the case of the graph coloring benchmark, we also found that best-effort communication can help achieve tangibly better solution quality within a fixed time constraint.

Because real-time volatility affects the outcome of computation under the best-effort model, raw execution speed does not suffice to fully understand the consequences of the best-effort communication model. In order to characterize real-time dynamics under the best-effort model, we designed and measured a suite of quality of service metrics: simstep period, simstep latency, wall-time latency, delivery failure rate, and delivery clumpiness. We performed several experiments to validate and characterize these metrics.
Comparing quality of service between multithreading and multiprocessing, we found that multithreading had lower runtime overhead cost but that multiprocessing reduced delivery erraticity, curbing especially extreme poor quality of service outlier events. We found better quality of service, especially with respect to latency, for processes occupying the same node. Finally, varying the ratio of computational work to communication, we found that lower communication intensity was associated with less volatile quality of service.

In order for best-effort communication to succeed in facilitating scale-up, median quality of service must stabilize with increasing CPU count. Put another way, best-effort communication cannot succeed at scale if communication quality tends toward complete degradation. In Section 2.3.6, we used weak scaling experiments to test the effect of scale-up on quality of service at 16, 64, and 256 processes. Under a lower communication-intensity task parameterization, we found that all median quality of service metrics were stable when scaling from 64 to 256 processes. Under maximal communication intensity, we found in one case that median simstep period degraded from around 80 µs to around 85 µs. In another case, median message delivery failure rate increased from around 7% to around 9%. Such minor (and, in most cases, nil) degradation in median quality of service despite maximal communication intensity bodes well for the viability of best-effort communication at scale.

Resilience is a second major motivating factor for best-effort computing. In another promising result, we found that the presence of an apparently faulty compute node did not degrade median performance or quality of service. Despite extreme quality of service degradation measured among that node and its clique, collective performance and quality of service remained steady. In effect, the best-effort approach successfully decoupled global performance from the worst performer. Such so-called "straggler effects" plague traditional approaches to large-scale high-performance computing (Aktaş and Soljanin, 2019), so avoiding them is a major boon.

Development of the Conduit library stemmed from a practical need for an abstract, prepackaged best-effort communication interface to support our digital evolution research. Because real-time effects are fundamentally application-dependent and arise without any explicit in-program specification (and therefore may be unanticipated), it is important to be able to perform quality of service profiling case-by-case in applications of best-effort communication. The instrumentation used in these experiments is written as wrappers around the library's Inlet and Outlet classes that may be enabled via a compile-time configuration switch. This makes data generation for quality of service analysis trivial to perform in any system built with the Conduit library. We hope that making this library and its quality of service metrics available to the community can reduce domain expertise and programmability barriers to taking advantage of the best-effort communication model to efficiently leverage burgeoning parallel and distributed computing power.

In future work, it may be of interest to design systems that monitor and proactively react to real-time quality of service conditions, for example, by imposing a variable cost for cell-cell messaging on agents based on traffic levels or by increasing per-update resource generation for agents on slow-running nodes.
We are eager to investigate how Conduit's best-effort communication model scales to much larger process counts, on the order of thousands of cores.

Chapter 3
Methods to Enable Decentralized Phylogenetic Tracking in a Distributed Digital Evolution System

Authors: Matthew Andres Moreno, Emily Dolson, and Charles Ofria

This chapter is adapted from (Moreno et al., 2022b), which underwent peer review and appeared in the proceedings of the 2022 Conference on Artificial Life (ALIFE 2022). This chapter presents a novel algorithm ("hereditary stratigraphy") to facilitate reconstruction-based phylogenetic studies in digital evolution systems. This approach enables efficient, accurate phylogenetic reconstruction with tunable, explicit trade-offs between annotation memory footprint and reconstruction accuracy. We can estimate, for example, the MRCA generation of two genomes within 10% relative error with 95% confidence up to a depth of a trillion generations with genome annotations smaller than a kilobyte. Simulated inference over known lineages recovers up to 85.70% of the information contained in the original tree using 64-bit annotations.

3.1 Introduction

In traditional serially-processed digital evolution experiments, phylogenetic trees can be tracked perfectly as they progress (Bohm et al., 2017; Lalejini et al., 2019; Wang et al., 2018) rather than reconstructed afterward, as must be done in most biological studies of evolution. Such direct phylogenetic tracking enables experimental possibilities unique to digital evolution, such as perfect reconstruction of the sequence of phylogenetic states that led to a particular evolutionary outcome (Dolson et al., 2020; Lenski et al., 2003).

In a shared-memory context, it is not difficult to maintain a complete phylogeny by ensuring that offspring retain a permanent reference to their parent (or vice versa). As simulations progress, however, memory usage would balloon if all simulated organisms were stored permanently. Garbage collecting extinct lineages and saving older history to disk greatly ameliorates this issue (Bohm et al., 2017; Dolson et al., 2019). If sufficient memory or disk space can be afforded to log all reproduction events, recording a perfect phylogeny in a distributed context is also not especially difficult. Processes could maintain records of each reproduction event, storing the parent organism (and its associated process) with all generated offspring (and their destination processes). As long as organisms are uniquely identified globally, these "dangling ends" could be joined in postprocessing to weave a continuous global phylogeny. Of course, for the huge population sizes made possible by distributed systems, such stitching may become a demanding task in and of itself. Additionally, even small amounts of lost or corrupted data could fundamentally degrade tracking by disjoining large tree subsections.

However, if memory and disk space are limited, distributed phylogeny tracking becomes a more burdensome challenge. A naive approach might employ a server model to maintain a central store of phylogenetic data. Processes would dispatch notifications of birth and death events to the server, which would curate (and garbage collect) phylogenetic history much the same as current serial phylogenetic tracking implementations. Unfortunately, this server model approach would present scalability challenges: burden on the server process would worsen in direct proportion to processor count.
This approach would also be similarly brittle to any lost or corrupted data. A more scalable approach might record birth and death events only on the process(es) where they unfold. However, lineages that went extinct locally could not be safely garbage collected until the extinction of their offspring's lineages on other processes could be confirmed. Garbage collection would thus require extinction notifications to wind back across the processes each lineage had traversed. Again, this approach would be brittle to loss or corruption of data.

In a distributed context, and especially a distributed, best-effort context, phylogenetic reconstruction (as opposed to tracking) could prove simpler to implement, more efficient at runtime, and more robust to data loss, while providing sufficient information to address experimental questions of interest. However, phylogenetic reconstruction from genomes under a traditional model of divergence through gradual accumulation of random mutations poses its own difficulties, including
• accounting for heterogeneity in evolutionary rates (i.e., the rate at which mutations accumulate due to divergent mutation rates or selection pressures) between lineages (Lack and Van Den Bussche, 2010),
• performing sequence alignment (Casci, 2008),
• mutational saturation (Hagstrom et al., 2004),
• appropriately selecting and applying complex reconstruction algorithms (Kapli et al., 2020), and
• computational intensity (Sarkar et al., 2010).
The computational flexibility of digital artificial life experiments provides a unique opportunity to overcome these challenges: designing heritable genome annotations specifically to ensure simple, efficient, and effective phylogenetic reconstruction. For maximum applicability of such a solution, these annotations should be phenotypically neutral heritable instrumentation (Stanley and Miikkulainen, 2002) that can be applied to any digital genome.

In this paper, we present "hereditary stratigraphy," a novel heritable genome annotation system to facilitate post-hoc phylogenetic inference on asexual populations. This system allows explicit control over trade-offs between space complexity and accuracy of phylogenetic inference. Instead of modeling genome components diverging through a neutral mutational process, we keep a record of historical checkpoints that allow comparison of two lineages to identify the range of time in which they diverged. Careful management of these checkpoints allows for a variety of trade-off options, including:
• linear space complexity and fixed-magnitude inference error,
• constant space complexity and inference error linearly proportional to phylogenetic depth, and
• logarithmic space complexity and inference error linearly proportional to time elapsed since the MRCA (which we suspect will be the most broadly useful trade-off).
In Methods, we motivate and explain the hereditary stratigraphy approach. Then, in Results and Discussion, we simulate post-hoc inference on known phylogenies to assess the quality of phylogenetic reconstruction enabled by the hereditary stratigraphy method.

3.2 Methods

This section will introduce intuition for the strategy of our hereditary stratigraph approach, define the vocabulary we developed to describe aspects of this approach, overview configurable aspects of the approach, present mathematical exposition of the properties of space complexity and inference quality under particular configurations, and then recap digital experiments that demonstrate this approach in an applied setting.
3.2.1 Hereditary Strata and the Hereditary Stratigraphic Column

Our algorithm, particularly the vocabulary we developed to describe it, draws loose inspiration from the concept of geological stratigraphy: inference of natural history through analysis of successive layers of geological material (Steno, 1916). As an introductory intuition, suppose a body of rock built up through regular, discrete events depositing geological material. In such a scenario, we could easily infer the age of the body of rock by counting up the number of layers present. Next, imagine making a copy of the rock body in its partially-formed state and then moving it far away. As time runs forward on these two rock bodies, independent layering processes will cause consistent disparity in the layers forming on each, forward from their point of separation. To deduce the historical relationship of these rock bodies, we could simply align and compare their layers. Layers from their base up through the first disparity would correspond to shared ancestry; further disparate layers would correspond to diverged ancestry. Figure 3.1 depicts the process of comparing columns for phylogenetic inference.

[Figure 3.1 (diagram omitted): Inferring the generation of the most-recent common ancestor (MRCA) of two hereditary stratigraphic columns "A" and "B". Columns are aligned at corresponding generations. Then, the first generation with disparate "fingerprints" is determined. This provides a hard upper bound on the generation of the MRCA: these strata must have been deposited along separate lines of descent. Searching backward for the first commonality preceding that disparity provides a soft lower bound on the generation of the MRCA: these strata evidence common ancestry but might collide by chance. Some strata may have been eliminated from the columns, as shown, in order to save space at the cost of increasing uncertainty of MRCA generation estimates.]

Shifting now from intuition to implementation, a fixed-length, randomly-generated binary tag provides a suitable "fingerprint" mechanism mirroring our metaphorical "rock layers." We call this "fingerprint" tag a differentia. The width of this tag controls the probability of spurious collisions between independently generated instances. At 64 bits wide, the tag effectively functions as a UID: collisions between randomly generated tags are so unlikely (p < 5.42 × 10^-20) that they can essentially be ignored. At the other end of the spectrum, collision probability would be 1/256 for a single byte and 1/2 for a single bit. In the case of narrow differentia, to set a lower bound on the MRCA generation, one must backtrack through common strata from the last commonality until the probability of that many successive spurious collisions satisfies the desired confidence level (e.g., 95% confidence). Even then, there remains a possibility of the true MRCA falling before the estimated lower bound.
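The arithmetic behind these collision probabilities and confidence thresholds can be illustrated directly. The following is a sketch of the reasoning above, not the hstrat library API.

    import math

    def collision_probability(width_bits: int) -> float:
        # chance two independently generated differentia match spuriously
        return 2.0 ** -width_bits

    def strata_needed_for_confidence(width_bits: int,
                                     confidence: float = 0.95) -> int:
        # smallest run of consecutive matching strata for which the chance
        # that all matches are spurious collisions drops below 1 - confidence
        p = collision_probability(width_bits)
        return math.ceil(math.log(1.0 - confidence) / math.log(p))

    print(collision_probability(64))         # ~5.42e-20; effectively a UID
    print(collision_probability(8))          # 1/256 for single-byte differentia
    print(strata_needed_for_confidence(1))   # 5 consecutive 1-bit matches
    print(strata_needed_for_confidence(64))  # a single 64-bit match suffices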
Note, however, that no matter the width of the differentia, the generation of the first discrepancy provides a hard upper bound on the generation of the MRCA.

In accordance with our geological analogy, we refer to the packet of data accumulated each generation as a stratum. This packet contains the differentia and, although not employed in this work, could hold other arbitrary user-defined data (e.g., simulation timestamp, phenotype characteristics, etc.). Again in accordance with the geological analogy, we refer to the chronological stack of strata that accumulate over successive generations as a hereditary stratigraphic column.

3.2.2 Stratum Retention Policy

As currently stated, strata in each column will accumulate proportionally to the length of evolutionary history simulated. In an evolutionary run with thousands or millions of generations, this approach would soon become intractable, particularly when columns are serialized and transmitted between distributed computing elements. To solve this problem, we can trade off precision for compactness by strategically deleting strata from columns as time progresses. Figure 3.2 overviews how stratum deposit and stratum elimination might progress over two generations under the hereditary stratigraphic column scheme. Different patterns of deletion will lead to different trade-offs, both in terms of the scaling relationship of column size to generations elapsed and in terms of the arrangement of inference precision over evolutionary history (i.e., focusing precision on more recent evolutionary history versus spreading it evenly over the entire history).

[Figure 3.2 (diagram omitted): Cartoon illustration of the stratum deposit process. This process marks the elapse of a generation when a hereditary stratigraphic column is inherited by an offspring. First, a new stratum is appended to the end of the column with a randomly-generated "fingerprint." This "fingerprint" distinguishes strata that were generated along disparate lines of descent (e.g., 0xd01a for 3rd Generation A and 0xe74a for 3rd Generation B). Then, the column's configured stratum retention policy is applied to "prune" the column by eliminating strata from specific generations. Although this cartoon depicts an empty space for eliminated strata, the underlying data structure behind a column can condense to reduce space complexity.]

We refer to the rule set used to selectively eliminate strata over time as the "stratum retention policy." We explore several different retention policy designs here, and implement our software to allow for free, modular interchange of retention policies. Our software allows specification of a policy as either a "predicate" or a "generator." The predicate method requires a function that takes the generation of a stratum and the current number of strata deposited and returns whether that stratum should be retained at that point in time. The generator method requires a function that takes the current number of strata deposited and yields the set of generations that should be deleted at that point in time. Although the predicate form of a policy is useful for analyzing and proving properties of policies, the generator form is generally more efficient in practice. We provide equivalent predicate and generator implementations for each stratum retention policy discussed here.
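To make these two specification forms concrete, here is a minimal sketch for the fixed resolution policy introduced in Section 3.2.3 below. The function names are illustrative stand-ins, not the hstrat library's actual API.

    r = 10  # retain a stratum every rth generation

    def fixed_resolution_predicate(stratum_generation: int,
                                   num_strata_deposited: int) -> bool:
        # predicate form: should the stratum deposited at `stratum_generation`
        # still be retained once `num_strata_deposited` strata exist?
        return (stratum_generation % r == 0
                or stratum_generation == num_strata_deposited - 1)

    def fixed_resolution_generator(num_strata_deposited: int):
        # generator form: which generations should be deleted upon the
        # `num_strata_deposited`th deposit? more efficient in practice,
        # since it only visits the stratum that newly becomes prunable
        prior = num_strata_deposited - 2  # previously the most recent deposit
        if prior >= 0 and prior % r != 0:
            yield prior

    # the two forms agree: pruning with the generator leaves exactly the
    # strata the predicate approves
    retained = set()
    for n in range(1, 101):
        retained.add(n - 1)  # deposit the stratum for generation n-1
        retained -= set(fixed_resolution_generator(n))
    assert retained == {
        g for g in range(100) if fixed_resolution_predicate(g, 100)
    }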
Strata elimination causes a stratum's position within the column data structure to no longer correspond to the generation in which it was deposited. Therefore, it may seem necessary to store the generation of deposit within the stratum data structure. However, for all deterministic retention policies, a perfect mapping exists backward from column index to recover the generation deposited, without storing it. We provide this formula for each stratum retention policy surveyed here. Finally, for each policy we provide a formula to calculate the exact number of strata retained under any parameterization after n generations.

The next subsections introduce several stratum retention policies, explain the intuition behind their implementation, and elaborate their space complexity and resolution properties. For each policy, patterns of stratum retention are illustrated in Figure 3.3. The formulas for the number of strata retained after n generations, the formulas to calculate stratum deposit generation from column index, and the retention predicate specifications of each policy are available in Supplementary Section B.4. The generator specification of each policy is available in Supplementary Section B.5. For tapered depth-proportional resolution and recency-proportional resolution, the accuracy of MRCA estimation can also be explored via an interactive in-browser web applet at https://hopth.ru/bi.

3.2.3 Fixed Resolution Stratum Retention Policy

The fixed resolution retention policy imposes a fixed absolute upper bound r on the spacing between retained strata. The strategy is simple: permanently retain a stratum every rth generation. (For arbitrary reasons of implementation convenience, we also require each stratum to be retained during at least the generation it is deposited.) See the top panel of Figure 3.3.

This retention policy suffers from linear growth in a column's memory footprint with respect to the number of generations elapsed: every rth generation, a new stratum is permanently retained. For this reason, it is likely not useful in practice except potentially in scenarios where the number of generations is small and fixed in advance. We include it here largely for illustrative purposes, as a gentle introduction to retention policies.

3.2.4 Depth-Proportional Resolution Stratum Retention Policy

The depth-proportional resolution policy ensures that spacing between retained strata will be less than or equal to a proportion 1/r of the total number of strata deposited, n. Achieving this limit on uncertainty requires retaining sufficient strata so that no more than n/r generations elapse between any two adjacent retained strata. This policy accumulates retained strata at a fixed interval until twice as many as r are at hand. Then, every other retained stratum is purged and the cycle repeats with a new, twice-as-wide interval between retained strata; a sketch of this cycle follows.
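The following is a minimal sketch of the accumulate-then-purge cycle just described. It is an illustration of the idea, not the hstrat implementation, and it simplifies by always retaining the most recent deposit.

    def depth_proportional_retained(n: int, r: int) -> list[int]:
        # generations retained after n strata have been deposited (n >= 1):
        # keep strata at a fixed interval, doubling the interval (purging
        # every other retained stratum) whenever more than 2r would be held
        interval = 1
        while (n - 1) // interval + 1 > 2 * r:
            interval *= 2
        retained = [g for g in range(n) if g % interval == 0]
        if retained[-1] != n - 1:
            retained.append(n - 1)  # always keep the most recent deposit
        return retained

    # spot check: spacing between retained strata stays within the n/r bound
    for n in range(1, 2049):
        kept = depth_proportional_retained(n, r=8)
        gaps = [b - a for a, b in zip(kept, kept[1:])]
        assert all(gap <= max(n // 8, 1) for gap in gaps)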
[Figure 3.3 (visualizations omitted): Comparison of stratum retention policies. Policy visualizations show retained strata in black, with time progressing along the y-axis from top to bottom. New strata are introduced along the diagonal and then "drip" downward as a vertical line until eliminated. The set of retained strata present within a column at a particular generation g can be read as the intersections of retained vertical lines with a horizontal line with intercept g. Visualizations are provided for two parameterizations of each policy: one where the maximum uncertainty of MRCA generation estimates would be 512 generations and one where it would be 128 generations. Summarized properties, where n is generations elapsed and m is generations since the MRCA: fixed resolution, O(n) space complexity with O(1) MRCA uncertainty; depth-proportional and tapered depth-proportional resolution, O(1) space complexity with O(n) MRCA uncertainty; recency-proportional resolution, O(log(n)) space complexity with O(m) MRCA uncertainty.]

See the second from top panel of Figure 3.3. When comparing stratigraphic columns from different generations, the resolution guarantee holds in terms of the number of generations experienced by the older of the two columns. Because this retention policy is deterministic, for two columns with the same policy, every stratum held by the older column is also guaranteed to be present in the younger column (unless it has not yet been deposited on the younger column). Therefore, the strata that would enable the desired resolution when comparing two columns of the same age are guaranteed to be available, even when one column has elapsed more generations. Because the number of strata retained under this policy is bounded as 2r + 1, space complexity scales as O(1) with respect to the number of strata deposited. It follows that MRCA generation estimate uncertainty scales as O(n) with respect to the number of strata deposited.

3.2.5 Tapered Depth-Proportional Resolution Stratum Retention Policy

This policy refines the depth-proportional resolution policy to provide a more stable column memory footprint over time. The naive depth-proportional resolution policy builds up strata until twice as many are present as needed, then purges half of them all at once. The tapered depth-proportional resolution policy functions identically, except that it removes unnecessary strata gradually, from back to front, as new strata are deposited, instead of eliminating them simultaneously. See the third from top panel of Figure 3.3.
The column footprint stability of this variation makes it easier to parameterize our experiments to ensure comparable end-state column footprints for fair comparison between retention policies, in addition to making this policy likely better suited to most use cases. By design, this policy has the same space complexity and MRCA estimation uncertainty scaling relationships with number of generations elapsed as the naive depth-proportional resolution policy.

3.2.6 MRCA-Recency-Proportional Resolution Stratum Retention Policy

The MRCA-recency-proportional resolution policy ensures that the distance between the retained strata surrounding any generation point will be less than or equal to a user-specified proportion 1/r of the number of generations elapsed since that generation. This policy can be constructed recursively. So, to begin, consider choosing the generation g of the first stratum after the root ancestor that we will retain when n generations have elapsed. A simple geometric analysis reveals that providing the guaranteed resolution for the worst-case generation within the window between generation 0 and generation g (i.e., generation g − 1) requires g ≤ ⌊n/(r + 1)⌋.

We now have an upper bound on the generation of the first stratum we must retain. However, we must also guarantee that a stratum at this generation is actually available for us to retain (i.e., that it has not been purged out of the column at a previous time point). We do this by picking the generation that is the highest power of 2 less than or equal to our bound. If we repeat this procedure as we recurse, we are guaranteed that this generation's stratum will have been preserved across all previous time points. Why does this work? Consider a sequence where all elements are spaced out by strictly nonincreasing powers of 2, and consider the first element of the list: all multiples of this first element will be included in the list. So, when we ratchet up g to 2g as n increases, we are guaranteed that 2g has been retained. This principle generalizes recursively down the list. It is similar in spirit to the strictly-doubling interval sizes used in the depth-proportional resolution stratum retention policies described above. Truncating to the nearest power of 2 less than or equal to our bound means that our recursive step size is, at worst, halved. So, because step size remains a constant fraction of the remaining generations n (at worst n/(2(r + 1))), the number of steps made (and the number of strata retained) scales as O(log(n)) with respect to the number of strata deposited.

Num Gens Elapsed    r = 1    r = 4    r = 10    r = 100
1.0 × 10^3             18       26       41        80
1.0 × 10^6             32       50       85       184
1.0 × 10^9             51       79      134       293
1.0 × 10^12            64      102      177       396

Table 3.1: Number of strata retained after one thousand, one million, one billion, and one trillion generations under the recency-proportional resolution stratum retention policy. Four policy parameterizations are shown: the first where MRCA generation can be determined between two extant columns with a guaranteed relative error of 100%, the second 25%, the third 10%, and the fourth 1%. A column's memory footprint will be a constant factor of these retained counts based on the fingerprint differentia width chosen. For example, if single-byte differentia were used, the column's memory footprint in bits would be 8× the number of strata retained.
Table 3.1 provides exact figures for the number of strata retained under different parameterizations of the recency-proportional retention policy between one thousand and one trillion generations. As for MRCA generation estimate uncertainty, in the worst case it scales as O(n) with respect to the greater number of strata deposited. However, with respect to estimating the generation of the MRCA for lineages diverged any fixed number of generations ago, uncertainty scales as O(1).

How does space complexity scale with respect to the policy's specified resolution r? Through extrapolation from OEIS sequences A063787 and A056791 via guess and check (Oeis, 2021a,b), we posited the exact number of strata retained after n generations as

    \mathrm{HammingWeight}(n) + \sum_{1}^{r} \left( \left\lfloor \log_2 \left( \lfloor n/r \rfloor \right) \right\rfloor + 1 \right).

This expression has been unit tested extensively to ensure perfect reliability. Approximating and applying logarithmic properties, this policy's space complexity can be calculated within a constant factor as

    \log(n) + \log\left( \frac{n^r}{r!} \right).

To analyze the relationship between space complexity and resolution r, we examine the ratio of space complexities induced when scaling resolution r up by a constant factor f > 1. Evaluating this ratio as r → ∞, we find that space complexity scales directly proportional to f,

    \lim_{r \to \infty} \frac{\log(n) + \log\left( \frac{n^{fr}}{(fr)!} \right)}{\log(n) + \log\left( \frac{n^{r}}{r!} \right)} = f.

Evaluating this ratio as n → ∞, we find that this scaling relationship is never worse than directly proportional for any r,

    \lim_{n \to \infty} \frac{\log(n) + \log\left( \frac{n^{fr}}{(fr)!} \right)}{\log(n) + \log\left( \frac{n^{r}}{r!} \right)} = \frac{fr + 1}{r + 1} = f \cdot \frac{r + 1/f}{r + 1} \le f.

3.2.7 Computational Experiments

In order to assess the practical performance of the hereditary stratigraph approach in an applied setting, we simulated the process of stratigraph propagation over known "ground truth" phylogenies extracted from pre-existing digital evolution simulations (Hernandez et al., 2022). These simulations propagated populations of between 100 and 165 bitstrings for between 500 and 5,000 synchronous generations under the NK fitness landscape model (Kauffman and Weinberger, 1989). In order to ensure coverage of a variety of phylogenetic conditions, we sampled a variety of selection schemes that impose profoundly different ecological regimens (Dolson and Ofria, 2018):
• EcoEA Selection (Goings et al., 2012),
• Lexicase Selection (Helmuth et al., 2014),
• Random Selection, and
• Sharing Selection (Goldberg et al., 1987).
Supplementary Table B.8 provides full details on the conditions each ground truth phylogeny was drawn from. The phylogenies themselves are available with our supplementary material.

For each ground truth phylogeny, we tested combinations of three configuration parameters:
• target end-state memory footprints for extant columns (64, 512, and 4096 bits),
• differentia width (1, 8, and 64 bits), and
• stratum retention policy (tapered depth-proportional resolution and recency-proportional resolution).
Stratum retention policies were parameterized so that the maximum number of strata possible were present at the end of the experiment without exceeding the target memory footprint. If the target memory footprint is exceeded by the sparsest possible parameterization of a retention policy, then that sparsest possible parameterization was used. Supplementary Tables B.1 to B.5 provide the calculated parameterizations and memory footprints of extant columns.
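This parameterization rule can be sketched as follows, assuming a hypothetical helper num_strata_retained(resolution, n_generations) and assuming retained count grows monotonically with the resolution parameter; neither assumption is part of the published interface.

    def densest_fitting_resolution(
        num_strata_retained,       # callable: (resolution, n_generations) -> int
        n_generations: int,
        differentia_bits: int,
        target_footprint_bits: int,
        max_resolution: int = 10_000,
    ) -> int:
        # choose the densest resolution whose end-state column fits the
        # target footprint; fall back to the sparsest if even it overflows
        budget = target_footprint_bits // differentia_bits  # strata that fit
        best = 1  # sparsest possible parameterization
        for resolution in range(1, max_resolution + 1):
            if num_strata_retained(resolution, n_generations) <= budget:
                best = resolution
            else:
                break  # by assumed monotonicity, denser only grows
        return best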
In order to assess the viability of phylogenetic inference using hereditary stratigraphic columns from extant organisms, we used the end-state stratigraphs to reconstruct an estimate of the actual ground truth phylogenetic histories. The first step in reconstructing a phylogenetic tree for the history of an extant population at the end of an experiment is to construct a distance matrix by calculating all pairwise phylogenetic distances between extant columns. We defined the phylogenetic distance between two extant columns as the sum of each extant organism's generational distance back to the generation of their MRCA, estimated as the mean of the upper and lower 95% confidence bounds. Figure 3.4 provides a cartoon summary of the process of calculating phylogenetic distance between two extant columns. We then used the unweighted pair group method with arithmetic mean (UPGMA) reconstruction tool provided by the BioPython package to generate estimated phylogenetic trees (Cock et al., 2009; Sokal, 1958). After generating the reconstructed tree topology, we performed a second pass to adjust branch lengths so that each internal tree node sat at the mean of its estimated 95% confidence generation bounds.
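The distance-matrix-plus-UPGMA step just described can be sketched with BioPython's distance-based tree constructors. The extant population, generations, and MRCA confidence bounds below are hypothetical placeholders; only the BioPython calls reflect a real API.

    from Bio.Phylo.TreeConstruction import DistanceMatrix, DistanceTreeConstructor

    def phylo_distance(gen_a, gen_b, mrca_lo, mrca_hi):
        # sum of each organism's generational distance back to the MRCA,
        # with MRCA generation estimated as the mean of its confidence bounds
        mrca_est = (mrca_lo + mrca_hi) / 2
        return (gen_a - mrca_est) + (gen_b - mrca_est)

    names = ["x", "y", "z"]               # extant organisms
    gen = {"x": 500, "y": 500, "z": 500}  # generations elapsed per organism
    mrca_bounds = {  # mock 95% confidence bounds on MRCA generation
        ("x", "y"): (400, 450),
        ("x", "z"): (100, 150),
        ("y", "z"): (100, 150),
    }

    # build the lower-triangular distance matrix BioPython expects
    matrix = []
    for i, a in enumerate(names):
        row = [
            phylo_distance(gen[a], gen[b], *mrca_bounds[tuple(sorted((a, b)))])
            for b in names[:i]
        ]
        matrix.append(row + [0])  # zero on the diagonal

    tree = DistanceTreeConstructor().upgma(DistanceMatrix(names, matrix))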
[Figure 3.4 (diagram omitted): Cartoon illustration of column inheritance along a phylogenetic tree and the process used to infer phylogenetic history from extant columns. This scenario supposes a stratum retention policy where only strata from even generations are retained. The common ancestor of the focal clade, at generation 2, is shown at top. Generation 3 columns inherit that ancestor's strata and each append a new stratum. Generation 4 columns append another new stratum, then eliminate their generation 3 strata. Finally, another generation elapses to yield generation 5 strata. Suppose that only generation 5 strata are extant; greyed-out columns are not directly observable. Phylogenetic history is deduced by pairwise comparison between extant columns. For each pair, a phylogenetic distance can be computed as the sum of generations elapsed for each extant column after the estimated generation of their MRCA. These pairwise distances can then be fed into a phylogeny reconstruction algorithm.]

3.2.8 Software and Data

As part of this work, we published the hstrat Python library, with a stable public-facing API intended to enable incorporation in other projects and with extensive documentation and unit testing, on GitHub at https://github.com/mmore500/hstrat and on PyPI. In the near future, we intend to complete and publish a corresponding C++ library. Supporting software materials can be found on GitHub at https://github.com/mmore500/hereditary-stratigraph-concept. Supporting computational notebooks are available for in-browser use via BinderHub at https://hopth.ru/bk (Ragan-Kelley and Willing, 2018). Our work benefited from many pieces of open source scientific software (Bostock et al., 2011; Hunter, 2007; Meurer et al., 2017; Paradis et al., 2004; Smith, 2020b,c; Sukumaran and Holder, 2010; Ushey et al., 2022; Virtanen et al., 2020; Waskom, 2021; Wickham et al., 2022). The ground truth phylogenies used in this work, as well as supplementary figures, tables, and text, are available via the Open Science Framework at https://osf.io/4sm72/ (Foster and Deardorff, 2017; Moreno et al., 2022a). Phylogenetic data associated with this project is stored in the Alife Community Data Standards format (Lalejini et al., 2019).

3.3 Results and Discussion

In this section, we analyze the quality of reconstructions of known phylogenetic trees using hereditary stratigraphy. Figure 3.5 compares an example reconstruction from columns using tapered depth-proportional stratum retention, an example reconstruction using recency-proportional stratum retention, and the underlying ground truth phylogeny. Interactive in-browser visualizations comparing all reconstructed phylogenies to their corresponding ground truth are available at https://hopth.ru/bi.

[Figure 3.5 (trees omitted): Example phylogeny reconstructions of a ground-truth lexicase selection phylogeny from inference on extant hereditary stratigraphic columns. Panels: (a) ground truth phylogeny; (b) 1-bit fingerprint differentia, tapered depth-proportional resolution stratum retention predicate, 64-bit target column footprint; (c) 1-bit fingerprint differentia, MRCA-recency-proportional resolution stratum retention predicate, 64-bit target column footprint. Shaded error bars on reconstructions indicate 95% confidence intervals for the true generation of tree nodes. Arbitrary color is added to enhance distinguishability.]

3.3.1 Reconstruction Accuracy

Measuring tree similarity is a challenging problem, with many conflicting approaches that each provide different information (Smith, 2020a). Ideally, we would use a metric of reconstruction accuracy that 1) is commonly used, so that sufficient context exists to understand what constitutes a good value, 2) behaves consistently across different types of trees, and 3) behaves reasonably for the types of trees common in artificial life data. Unfortunately, these objectives are somewhat in conflict. The primary source of this problem is multifurcations: nodes from which more than two lineages branch at once. In reconstructed phylogenies in biology, multifurcations are generally assumed to be the result of insufficient information; it is thought that the real phylogeny had multiple bifurcations that occurred so close together that the reconstruction algorithm is unable to separate them. In artificial life phylogenies, however, we have the opposite problem. When we perfectly track a phylogeny, it is common for us to know that a multifurcation did in fact occur. However, it is challenging for our reconstructions to properly identify multifurcations, because doing so requires perfectly lining up multiple divergence times. Many of the most popular tree distance metrics interpret the difference between a multifurcation and a set of bifurcations as a dramatic change in topology. For some use cases, this change in topology may indeed be meaningful, although research on the extent of this problem is limited.
Nevertheless, we suspect that for the majority of use cases, the tiny branch lengths between the internal nodes will make this source of error relatively minor.

To overcome this obstacle, we measured our reconstruction accuracy using multiple metrics. We primarily focus on Mutual Clustering Information (as implemented in the R TreeDist package) (Smith, 2020c), which is a direct measure of the quantity of information in the ground truth phylogeny that was successfully captured in the reconstruction. It is relatively unaffected by failure to perfectly reproduce multifurcations. For the purposes of easy comparison to the literature, we also measured the Clustering Information Distance (Smith, 2020c).

Across ground truth phylogenies, we were able to reconstruct the phylogenetic topology with between 47.75% and 85.70% of the information contained in the original tree using a 64-bit column memory footprint, between 47.75% and 80.36% using a 512-bit column memory footprint, and between 51.13% and 83.53% using a 4096-bit column memory footprint. While the Clustering Information Distance reached its maximum possible score (1.0) for the heavily-multifurcated EcoEA phylogeny, it agreed with the Mutual Clustering Information score for less multifurcated phylogenies, such as fitness sharing. Using the recency-proportional resolution retention policy and a 4096-bit column memory footprint, we were able to reconstruct a fitness sharing phylogeny with a Clustering Information Distance of only 0.2923471 from the ground truth. For context, that result is comparable to the distance between phylogenies reconstructed from two closely-related proteins in H3N2 flu (0.25) (Jones et al., 2021). To build further intuition, we strongly encourage readers to refer to our interactive web reconstruction.

Figure 3.6 summarizes error reconstructing the fitness sharing selection phylogeny in terms of the Mutual Clustering Information metric (Smith, 2022).

[Figure 3.6 (plot omitted): Proportion of information present in the ground-truth fitness sharing phylogeny that was captured by our reconstruction, across various retention policies. Higher is better (1 is perfect). RPR is the recency-proportional resolution policy and TDPR is the tapered depth-proportional resolution policy.]

The phylogenies reconstructed from the EcoEA condition performed comparably, with lexicase and random selection faring somewhat worse (Moreno et al., 2022a). In the case of random selection, we suspect that this reduced performance is the result of having many nodes that originated very close together at the end of the experiment. As expected, we did observe overall more accurate reconstructions from columns that were allowed to occupy larger memory footprints.
3.3.2 Differentia Size

Among the surveyed ground truth phylogenies and target column footprints, we consistently found that smaller differentia yielded equally or more accurate phylogenetic reconstructions. The stronger performance of narrow differentia was particularly apparent in low-memory-footprint scenarios, where overall phylogenetic inference power was weaker. Overall, single-bit differentia outperformed 64-bit differentia under 20 conditions, were indistinguishable under 6 conditions, and were worse under 4 conditions. We used Clustering Information Distance to perform these comparisons. Full results are available in Supplementary Section B.2. Although narrower differentia have less distinguishing power on their own, their smaller size allows more to be packed into the memory footprint to cover more generations, which seems to help reconstruction power. We must note that narrower differentia can pack more thoroughly into the footprint caps we imposed on column size, so their extant columns tended to have slightly more overall bits. However, this imbalance was small enough (in most cases < 10%) that we believe it is unlikely to fully account for the stronger performance of narrow-differentia configurations.

3.3.3 Retention Policy

Across the surveyed ground truth phylogenies and target column memory footprints, we found that the recency-proportional resolution stratum retention policy generally yielded better phylogenetic reconstructions: reconstruction quality was better in 28 conditions, equivalent in 13 conditions, and worse in 4 conditions. Again, this effect was most apparent in the small-stratum-count scenarios where overall inference power was weaker. We used Clustering Information Distance to perform these comparisons. Full results are available in Supplementary Section B.3. The stronger performance of recency-proportional resolution is likely due to its denser retention of recent strata, which helps to resolve the more numerous (and therefore typically more tightly spaced) phylogenetic events in the near past (Zhaxybayeva and Gogarten, 2004). Recency-proportional resolution tended to fit fewer strata within the prescribed memory footprints (except in cases where it could not fit within the footprint at all), so its stronger performance cannot be attributed to more retained bits in the end-state extant columns.

3.4 Conclusion

To our knowledge, this work provides a novel design for digital genome components that enables phylogenetic inference on asexual populations. This provides a viable alternative to perfect phylogenetic tracking, which is complex and possibly cumbersome in distributed computing scenarios, especially with fallible nodes. Our approach enables flexible, explicit trade-offs between space complexity and inference accuracy. Hereditary stratigraphic columns are efficient: our approach can estimate, for example, the MRCA generation of two genomes within 10% error with 95% confidence up to a depth of a trillion generations with genome annotations smaller than a kilobyte. They are also powerful: we were able to achieve tree reconstructions recovering up to 85.70% of the information contained in the original tree with only a 64-bit memory footprint. This and other methodology to enable decentralized observation and analysis of evolving systems will be essential for artificial life experiments that use distributed and best-effort computing approaches.
Such systems will be crucial to enabling advances in the field of artificial life, particularly with respect to the question of open-ended evolution (Ackley and Cannon, 2011; Moreno et al., 2021a,b). More work is called for to further enable experimental analyses in distributed, best-effort systems while preserving those systems' efficiency and scalability. As parallel and distributed computing becomes increasingly ubiquitous and begins to more widely pervade artificial life systems, hereditary stratigraphy should serve as a useful technique in this toolbox.

Important work extending and analyzing hereditary stratigraphy remains to be done. Analyses should be performed to expound the MRCA resolution guarantees of stratum retention policies when using narrow (i.e., single-bit) differentia. Constant-space-complexity stratum retention policies that preferentially retain a denser sampling of more-recent strata should be developed and analyzed. Extensions to sexual populations should be explored, including the possibility of annotating and tracking individual genome components instead of whole-genome individuals. An alternate approach might be to define a preferential inheritance rule so that, at each generation slot within a column, a single differentia sweeps over an entire interbreeding population. Optimization of tree reconstruction from extant hereditary stratigraphs remains an open question, too, particularly with regard to properly handling multifurcations. It would be particularly valuable to develop methodology to annotate inner nodes of trees reconstructed from hereditary stratigraphs with confidence levels.

The problem of designing genomes to maximize phylogenetic reconstructability raises unique questions about phylogenetic estimation. Such a backward problem, optimizing genomes to make analyses trivial as opposed to the usual process of optimizing analyses to genomes, puts questions about the genetic information analyses operate on in a new light. In particular, it would be interesting to derive upper bounds on phylogenetic inference accuracy given genome size and generations elapsed.

Part II
Evolving Complexity, Novelty, and Adaptation in Digital Multicells

Chapter 4
Exploring Evolved Multicellular Life Histories in an Open-Ended Digital Evolution System

Authors: Matthew Andres Moreno and Charles Ofria

This chapter is adapted from (Moreno and Ofria, 2022), which appeared in the Frontiers in Ecology and Evolution Models in Ecology and Evolution special issue, Digital Evolution: Insights for Biologists. This chapter introduces the DISHTINY framework, which enables evolution experiments with digital multicells. Indeed, in evolutionary experiments, we repeatedly observed group-level traits that are characteristic of a fraternal transition. These included reproductive division of labor, resource sharing within kin groups, resource investment in offspring groups, asymmetrical behaviors mediated by messaging, morphological patterning, and adaptive apoptosis. We report eight case studies from replicates where transitions occurred and explore the diverse range of adaptive evolved multicellular strategies.

4.1 Introduction

An evolutionary transition in individuality is an event where independently replicating entities unite to replicate as a single, higher-level individual (Smith and Szathmary, 1997). These transitions are understood as essential to natural history's remarkable record of complexification and diversification (Smith and Szathmary, 1997).
Likewise, artificial life researchers have highlighted transitions in individuality as a mechanism that is missing in digital systems, but necessary for achieving the evolution of complexity and diversity that we witness in nature (Banzhaf et al., 2016; Taylor et al., 2016).

Fraternal evolutionary transitions in individuality are transitions in which the higher-level replicating entity is derived from the combination of cooperating kin that have entwined their long-term fates (West et al., 2015). Multicellular organisms and eusocial insect colonies exemplify this phenomenon (Smith and Szathmary, 1997), given that both are sustained and propagated through the cooperation of lower-level kin. This work focuses on fraternal transitions. Although not our focus here, egalitarian transitions — events in which non-kin unite, such as the genesis of mitochondria by symbiosis of free-living prokaryotes and eukaryotes (Smith and Szathmary, 1997) — also constitute essential episodes in natural history.

In nature, major fraternal transitions occur sporadically, with few extant transitional forms, making them challenging to study. For instance, on the order of 25 independent origins of eukaryotic multicellularity are known (Grosberg and Strathmann, 2007), with most transitions having occurred hundreds of millions of years ago (Libby and Ratcliff, 2014). Recent work in experimental evolution (Gulli et al., 2019; Koschwanez et al., 2013; Ratcliff et al., 2015; Ratcliff and Travisano, 2014), mechanistic modeling (Hanschen et al., 2015; Staps et al., 2019), and digital evolution (Goldsby et al., 2012, 2014) complements traditional post hoc approaches focused on characterizing the record of natural history. These systems each instantiate the evolutionary transition process, allowing targeted manipulations to test hypotheses about the requisites, mechanisms, and evolutionary consequences of fraternal transitions.

Digital evolution, computational model systems designed to instantiate evolution in abstract algorithmic substrates rather than directly emulating any specific biological system (Dolson and Ofria, 2021; Wilke and Adami, 2002), occupies a sort of middle ground between wet work and mechanistic modeling. This approach offers a unique conjunction of experimental capabilities that complements work in both of those disciplines. Like modeling, digital evolution affords rapid generational turnover, complete observability (every event in a digital system can be tracked), and complete manipulability (every event in a digital system can be arbitrarily altered). However, as with in vivo experimental evolution, digital evolution systems can exhibit rich evolutionary dynamics stemming from complex, rugged fitness landscapes (LaBar and Adami, 2017) and sophisticated agent behaviors (Grabowski et al., 2013).

Our work here follows closely in the intellectual vein of Goldsby's deme-based digital evolution experiments (Goldsby et al., 2012, 2014). In her studies, high-level organisms exist as a group of cells within a segregated, fixed-size subspace. High-level organisms must compete for a limited number of subspace slots. Individual cells that comprise an organism are controlled by heritable computer programs that allow them to self-replicate, interact with their environment, and communicate with neighboring cells. Goldsby's work defines two modes of cellular reproduction: tissue accretion and offspring generation. In this way, somatic and gametogenic modes of reproduction are explicitly differentiated.
Under tissue accretion, a cell copies itself into a neighboring position within the group's subspace. Under offspring generation, a population slot is cleared to make space for a daughter organism, which is then seeded with a single daughter cell from the parent organism. Goldsby's model abstracts away developmental cost to focus on resource competition between groups. Cells grow freely within an organism, but fecundity depends on the collective profile of computational tasks (usually mathematical functions) performed within the organism. When an organism accumulates sufficient resource, a randomly chosen subspace is cleared and a single cell from the replicating organism is used as a propagule to seed the new organism. This setup mirrors the dynamics of biological multicellularity, in which cell proliferation may either grow an existing multicellular body or found a new multicellular organism.

Here, we take several steps to develop a computational environment that removes the enforcement and rigid regulation of multiple organismal levels. Specifically, we remove the explicitly segregated subspaces and we let multicells interact with each other more freely. We demonstrate the emergence of multicellularity where each organism manages its own spatial distribution and reproductive process. This spatially unified approach enables more nuanced interactions among organisms, albeit at the cost of substantially more complicated analyses. Instead of a single explicit interface to mediate interactions among high-level organisms, such interactions must emerge via many cell-cell interfaces. Novelty can occur in terms of interactions among competitors, among organism-level kin, or even within the building blocks that make up hierarchical individuality. Experimentally studying fraternal transitions in a digital system where key processes (reproductive, developmental, homeostatic, and social) occur implicitly within a unified framework can provide unique insights into nature. For example, pervasive, arbitrary interactions between multicells introduce the possibility of strong influence of biotic selection.

However, in our system, multicells do not emerge from an entirely impartial substrate. We do explicitly provide some framework to facilitate fraternal transitions in individuality by allowing cells to readily designate distinct hereditary groups. Offspring cells may either remain part of their parent's hereditary group or found a new group. Cells can recognize group members, thus allowing targeted communication and resource sharing with kin. We reward cells for performing tasks designed to require passive collaboration among hereditary group members. As such, cells that form hereditary groups to maximize advantage on those tasks stand to increase their inclusive fitness.

In previous work introducing the DISHTINY (DIStributed Hierarchical Transitions in IndividualitY) framework, we evolved parameters for manually designed cell-level strategies to explore fraternal transitions in individuality (Moreno and Ofria, 2019). In this work, we extend DISHTINY to incorporate a more dynamic, event-driven genetic programming representation called SignalGP, which was designed to facilitate dynamic interactions among agents and between agents and their environment (Lalejini and Ofria, 2018). As expected, with the addition of cell controllers capable of nearly arbitrary computation, we see a far more diverse set of behaviors and strategies arise.
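The hereditary group bookkeeping described above reduces to a simple rule at each birth event. The following Python sketch, with hypothetical names and under the simplifying assumption of a single (non-hierarchical) grouping level, illustrates the choice offspring face between growing the parent's group and founding a new one.

```python
import itertools

# Hypothetical bookkeeping for DISHTINY-style hereditary groups.
group_id_counter = itertools.count()

class Cell:
    def __init__(self, group_id=None):
        # A cell with no assigned ID founds a brand-new hereditary group.
        self.group_id = next(group_id_counter) if group_id is None else group_id

def reproduce(parent, found_new_group):
    # Daughter either inherits the parent's hereditary group ID (growing
    # the group) or receives a fresh ID (founding a new group).
    if found_new_group:
        return Cell()                      # expelled propagule
    return Cell(group_id=parent.group_id)  # adsorbed into parent's group

founder = Cell()
daughter = reproduce(founder, found_new_group=False)
propagule = reproduce(founder, found_new_group=True)
assert daughter.group_id == founder.group_id
assert propagule.group_id != founder.group_id
```

In the full system, this decision is made by the cell's evolved genetic program rather than by an external flag.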
Here, we perform case studies to characterize notable multicellular phenotypes that evolved via this more dynamic genetic programming underpinning. Each case study strain was chosen by screening the entire set of replicate evolutionary runs for signs of the trait under investigation and then manually selecting the most promising strain(s) for further investigation. Case studies presented therefore represent an anecdotal sampling, rather than an exhaustive summary, with respect to each trait of interest. Our goal is to explore a breadth of possible evolutionary outcomes under the DISHTINY framework. We see this as a precursory step toward hypothesis-driven work contributing to open questions about fraternal transitions in individuality.

4.2 Materials and Methods

We performed simulations in which cells evolved open-ended behaviors to make decisions about resource sharing, reproductive timing, and apoptosis. We will first describe the environment and hereditary grouping system cells evolved under and then describe the behavior-control system cells used.

(a) Overview of a single SignalGP instance. SignalGP program modules contain ordered sets of instructions that activate and execute independently in response to tagged signals. These modules are depicted as rectangular lists with bitstring tags protruding from the SignalGP instance. Signals can originate from any of three sources: (1) internally, from execution of "Signal" instructions within a program's modules, (2) from the outside environment, or (3) from other agents executing "Message" instructions. Graphic provided courtesy of Alexander Lalejini.

(b) How individual SignalGP instances are organized into DISHTINY cells. DISHTINY cells are depicted as gray squares. Each DISHTINY cell is controlled by independent execution of the cell's genetic program on four distinct SignalGP instances, depicted as colored circles. Each of the four independent instances manages cell behavior with respect to a single cardinal direction: sensing environmental state, receiving intercellular messages, and determining cell actions. The special role of each instance is depicted as a reciprocal arrow to the corresponding instance in the neighboring cell. (All four instances sense non-directional environmental cues, and non-directional actions may be taken by any instance.) These four instances can communicate with one another via intracellular messaging, indicated by smaller reciprocal arrows among instances within a cell.

Figure 4.1: Schematic illustrations of how an individual SignalGP instance functions and how SignalGP instances control DISHTINY cells. Execution of cells' genetic programs on SignalGP instances controls cell behavior in our model.
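As a rough illustration of the tag-based activation that Figure 4.1a describes, the sketch below dispatches a tagged signal to the program module whose bitstring tag matches it most closely. Simple bitwise similarity is used here for concreteness; the actual matching metric, thresholds, and data layout of SignalGP differ, so treat every detail as an assumption.

```python
def bit_similarity(tag_a, tag_b):
    # Fraction of matching bits between two equal-length bitstring tags.
    return sum(a == b for a, b in zip(tag_a, tag_b)) / len(tag_a)

def dispatch(signal_tag, modules):
    # Activate the module whose tag best matches the incoming signal's tag.
    return max(modules, key=lambda m: bit_similarity(signal_tag, m["tag"]))

modules = [
    {"tag": "0111", "body": "...instructions..."},
    {"tag": "1010", "body": "...instructions..."},
]
print(dispatch("0110", modules)["tag"])  # -> "0111"
```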
4.2.1 Cells and Hereditary Groups

Cells occupy individual tiles on a 60-by-60 toroidal grid. Over discrete time steps ("updates"), cells can collect a resource. Collected resource decays at a rate of 0.1% per update. This decay incentivizes quick use of collected resource, but is gradual enough not to prevent even the most naive cells from eventually accumulating enough resource to reproduce. Once sufficient resource accrues, cells may pay one unit of resource to place a daughter cell on an adjoining tile of the toroidal grid (i.e., reproduce), replacing any existing cell already there. Daughter cells inherit their parent's genetic program, except for any novel mutations that arise. Mutations included whole-function duplication and deletion, bit flips on tags for instructions and functions, instruction and argument substitutions, and slip mutation of instruction sequences. We used standard SignalGP mutation parameters from (Lalejini and Ofria, 2018), but only applied mutations to 1% of daughter cells at birth. Daughter cells may also inherit hereditary group ID, introduced and discussed below.

Cells accrue resource via a cooperative resource-collection process. The simulation distributes large amounts of resource within certain spatial bounds in discrete, intermittent events. Working in a group allows cells to more fully collect available resource during these events. Cooperating in medium-sized groups (on the order of 100 cells) accelerates per-cell resource collection rate. Unicellular, too-small, or too-large groups collect resource at a lesser per-cell rate. As an arbitrary side effect of the simulation algorithm employed to instantiate the cooperative resource distribution process, groups with a roughly circular layout collect resource faster than irregularly-shaped groups. Cooperative resource collection unfolds as an entirely passive process on the part of the cells, influenced only by a group's spatial layout. Full details on the simulation algorithm that determines cooperative resource collection rates appear in Supplementary Section C.2.

Cells may grow a cooperative resource-collecting group through cell proliferation. We refer to these cooperative, resource-collecting groups as "hereditary groups." As cells reproduce, they can choose to adsorb daughter cells onto the parent's hereditary group or expel those offspring to found a new hereditary group. These decisions affect the spatial layout of these hereditary groups and, in turn, individual cells' resource-collection rates.

To promote group turnover, we counteract established hereditary groups' advantage with a simple aging scheme. As hereditary groups age over elapsed updates and somatic generations, their constituent cells lose the ability to regenerate somatic tissue and then, soon after, to collect resource. A complete description of the group aging mechanisms used appears in Supplementary Section C.3.

Because new hereditary group IDs arise first in a single cell and disseminate exclusively among direct descendants of that progenitor cell, hereditary groups are reproductively bottlenecked. This clonal (or "staying together") multicellular life history stands in contrast with an aggregative (or "coming together") life cycle, where chimeric groups arise via fusion of potentially loosely-related lineages (Staps et al., 2019). Such clonal development is known to strengthen between-organism selection effects (Grosberg and Strathmann, 2007).
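The per-update resource bookkeeping just described can be summarized in a few lines. The following Python sketch is a simplification under assumed values: only the 0.1% decay rate and the one-unit reproduction cost come from the text above, while the inflow rate is an arbitrary placeholder.

```python
DECAY_RATE = 0.001       # collected resource decays 0.1% per update
REPRODUCTION_COST = 1.0  # one unit of resource buys a reproduction attempt

def update_stockpile(stockpile, inflow):
    # Collect this update's resource, then apply stockpile decay.
    return (stockpile + inflow) * (1.0 - DECAY_RATE)

stockpile, births = 0.0, 0
for update in range(5000):
    stockpile = update_stockpile(stockpile, inflow=0.002)  # arbitrary inflow
    if stockpile >= REPRODUCTION_COST:
        # Pay the cost to place a daughter cell on an adjoining tile.
        stockpile -= REPRODUCTION_COST
        births += 1
print(births)
```

Because decay is proportional, the stockpile saturates near inflow/decay; reproduction therefore remains possible only when inflow is sufficient relative to decay, which is why even naive cells eventually reproduce under the configured rates.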
In this work, we screen for fraternal transitions in individuality with respect to these hereditary groups by evaluating three characteristic traits of higher-level organisms: resource sharing, reproductive division of labor, and apoptosis. We can further screen for the evolution of complex multicellularity by assessing cell-cell messaging, regulatory patterning, and functional differentiation between cells within hereditary groups (Knoll, 2011).

4.2.2 Hierarchical Nesting of Hereditary Groups

The succession of fraternal transitions in natural history — for example, to multicellularity and then to eusociality (Smith and Szathmary, 1997) — underscores the constructive power of evolution to harness emergent structures as building blocks for further novelty. Such substructure can also provide scaffolding for differentiation and division of labor within an organism (Wilson, 1984). To explore these dynamics, in some experimental conditions we incorporated a hierarchical extension to the hereditary grouping scheme described above.

Hierarchical levels are introduced into the system by providing a mechanism for groups of hereditary groups to form. We accomplish this through two separate, but overlaid, instantiations of the hereditary grouping scheme. We refer to each independent hereditary grouping system as a "level." The hierarchical extension allows two levels of hereditary grouping, identified here as L0 and L1. L0 instantiates a smaller, inner grouping embedded inside a L1 grouping. Without the hierarchical extension, only L0 is present.1 We refer to the highest hereditary grouping level present in a simulation as the "apex" level. Under the hierarchical extension, each cell contained a pair of separate hereditary group IDs — the first for L0 and the second for L1. During reproduction, daughter cells could either

1. inherit both the L0 and L1 hereditary group IDs,
2. inherit the L0 hereditary group ID but not the L1 hereditary group ID, or
3. inherit neither hereditary group ID.

In order to enforce hierarchical nesting of hereditary group IDs, daughter cells could not inherit just the L1 hereditary group ID.

1 We chose to number these levels using the computer science convention of zero-based indexing (as opposed to the everyday practice of counting up from one) to maintain consistency with source code and data sets associated with this work.

Hierarchical hereditary group IDs are strictly nested: all cells are members of exactly one L0 hereditary group and one L1 hereditary group. No cell can be a member of two L0 hereditary groups or two L1 hereditary groups. Likewise, no L0 hereditary group can appear within more than one L1 hereditary group. As a concrete illustration of this scheme, Figure 4.6a depicts hierarchically-nested hereditary groupings assumed by an evolved strain.

4.2.3 Cell-Level Organisms

Our experiments use cell-level digital organisms controlled by genetic programs subject to mutations and selective pressures that stem from local competition for limited space. We employ the SignalGP event-driven genetic programming representation. As sketched in Figure 4.1a, this representation is specially designed to express function-like modules of code in response to internal signals or external stimuli. This process can be considered somewhat akin to gene expression. In our experiments, virtual CPUs can execute responses to up to 24 signals at once, with any further signals usurping the longest-running modules.
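This signal-handling cap lends itself to a simple scheduling rule. The sketch below illustrates one plausible reading of the usurpation behavior just described (a new signal evicting the longest-running active module once all 24 slots are busy); the data layout and names are hypothetical, not SignalGP internals.

```python
MAX_CONCURRENT = 24  # cap on simultaneously executing signal responses

def receive_signal(running, new_module):
    # `running` is a list of [age_in_steps, module] entries for responses
    # currently executing on the virtual CPU.
    if len(running) >= MAX_CONCURRENT:
        # Any further signal usurps the longest-running active module.
        running.remove(max(running, key=lambda entry: entry[0]))
    running.append([0, new_module])

running = [[age, f"module_{age}"] for age in range(24)]
receive_signal(running, "fresh_module")
assert len(running) == MAX_CONCURRENT
assert [23, "module_23"] not in running  # oldest response was evicted
```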
The event-driven framework facilitates the evolution of dynamic interactions between digital organisms and their environment (including other organisms) (Lalejini and Ofria, 2018). Special module components allow evolving programs to sense and interact with their environment through mechanisms including resource sharing, hereditary group sensing, apoptosis, cell reproduction, and arbitrary cell-cell messaging. Modules can also include general-purpose computational elements like conditionals and loops, which allows cells to evolve sophisticated behaviors conditioned on current (and even previous) local conditions. A simple "regulatory" system provides special CPU instructions that dynamically adjust which modules are activated by particular signals.

In our simulation, the directionality of some inputs and outputs must be accounted for (e.g., specifying which neighbor to share resource with). To accomplish this, we provide each cell an independent SignalGP hardware instance to manage inputs and outputs with respect to each specific cell neighbor. So there are four virtual hardware sets per cell, one for each cardinal direction.2 Figure 4.1b overviews the configuration of the four SignalGP instances that constitute a single cell. Supplementary Sections C.4, C.5, C.6, and C.7 provide full details of the digital evolution substrate underpinning this work.

2 This approach differs from existing work evolving digital organisms in grid-based problem domains, where directionality is managed by a within-cell "facing" state that determines the source direction for inputs and the target direction for outputs (Biswas et al., 2014; Goldsby et al., 2014, 2018; Grabowski et al., 2010; Lalejini and Ofria, 2018); see Supplemental Section C.4 for further detail.

4.2.4 Surveyed Evolutionary Conditions

To broaden our exploration of possible evolved multicellular behaviors in this system, we surveyed several evolutionary conditions. In one manipulation, we explored the effect of enabling hierarchical structure within hereditary groups, such that parent cells can choose to keep offspring in their same sub-group, in just the same full group, or expel them entirely to start a new group. Cells can sense and react to the level of hereditary ID commonality shared with each neighbor. This manipulation presents an opportunity for hierarchical individuality or for a mechanism to mediate differentiation within a multicell, but does not enforce it. In a second manipulation, we explored the importance of explicitly selecting for medium-sized groups (as had been needed to maximize resource collection) by removing this incentive. Instead, the system distributed resource at a uniform per-cell rate.

We combined these two manipulations to yield four surveyed conditions:

1. "Flat-Even": one hereditary group level (flat) with uniform resource inflow (even). In-browser simulation: https://hopth.ru/i.
2. "Flat-Wave": one hereditary group level (flat) with group-mediated resource collection (wave). In-browser simulation: https://hopth.ru/j.
3. "Nested-Even": two hierarchically-nested hereditary group levels (nested) with uniform resource inflow (even). In-browser simulation: https://hopth.ru/k.
4. "Nested-Wave": two hierarchically-nested hereditary group levels (nested) with group-mediated resource collection (wave). In-browser simulation: https://hopth.ru/l.

Supplementary Section C.8 provides full details for each of the four surveyed evolutionary conditions.
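These four conditions form a full 2 × 2 factorial design over group structure and resource distribution, as the short Python sketch below makes explicit (the variable names are illustrative only).

```python
from itertools import product

# The 2x2 factorial design: hereditary group structure x resource inflow.
structures = ["Flat", "Nested"]  # one vs. two nested hereditary group levels
inflows = ["Even", "Wave"]       # uniform vs. group-mediated resource inflow

conditions = [f"{s}-{i}" for s, i in product(structures, inflows)]
print(conditions)  # ['Flat-Even', 'Flat-Wave', 'Nested-Even', 'Nested-Wave']
```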
For each condition, we simulated 40 replicate populations for up to 1,048,576 (2^20) updates. During this time, on the order of 4,000 cellular generations and 500 apex-level group generations elapsed in these runs. (Full details appear in Supplementary Table C.2.) Due to variability in simulation speed, four replicates only completed 262,144 updates. All analyses involving inter-replicate comparisons were therefore performed at this earlier time point.

4.3 Results

To characterize the general selective pressures induced by the surveyed environmental conditions, we assessed the prevalence of characteristic multicellular traits among evolved genotypes across replicates. In the case of an evolutionary transition in individuality, we would expect cells to modulate their own reproductive behavior to prioritize group interests above individual cell interests. In DISHTINY, cell reproduction inherently destroys an immediate neighbor cell. As such, we would expect somatic growth to occur primarily at group peripheries in a higher-level individual. Supplementary Figure C.1 compares cellular reproduction rates between the interior and exterior of apex-level hereditary groups. For all treatments, phenotypes with depressed interior cellular reproduction rates dominated across replicates (non-overlapping 95% CI). By update 262,144 (about 1,000 cellular generations; see Supplementary Table C.2), all four treatment conditions appear to select for some level of reproductive cooperation among cells.

Across replicate evolutionary runs in all four treatments, we also found that resource was transferred among registered kin at a significantly higher mean rate than to unrelated neighbors (non-overlapping 95% CI). Genetic programs controlling cells can sense whether any particular neighbor shares a common hereditary group ID. Thus, selective activation of resource sharing behavior toward hereditary group members might have evolved, which would provide one possible explanation for this observation.3 However, cells are also capable of conditioning behavior on whether a particular neighbor is direct kin (i.e., a parent or child). To test whether this resource sharing was solely an artifact of sharing between direct cellular kin, we also assessed mean sharing to registered kin that were not immediate cellular relatives. Mean sharing between such cells also exceeded sharing among unrelated neighbors (non-overlapping 95% CI). Thus, all four treatments appear to select for functional cooperation among wider kin groups. Supplementary Section C.12 presents these results in detail.

4.3.1 Qualitative Life Histories

Although cooperative cell-level phenotypes were common among evolved hereditary groups, across replicates functional and reproductive cooperation arose via diverse qualitative life histories. To provide a general sense of the types of life histories we observed in this system, Figure 4.2 shows time lapses of representative multicellular groups evolved in different replicates. Figure 4.2a depicts an example of a naive life history in which — beyond the cellular progenitor of a propagule group — the parent and propagule groups exhibit no special cooperative relationship. In Figure 4.2b, propagules repeatedly bud off of parent groups to yield a larger network of persistent parent-child cooperators. In Figure 4.2c, propagules are generated at the extremities of parent groups and then rapidly replace most or all of the parent group.
Finally, in Figure 4.2d, propagules are generated at the interior of a parent group and replace it from the inside out.

To better understand the multicellular strategies that evolved in this system, we investigated the mechanisms and adaptiveness of notable phenotypes that evolved in several individual evolutionary replicates. In the following sections, we present these investigations as a series of case studies.

4.3.2 Case Study: Burst Lifecycle

We wondered how the strain exhibiting the "burst" lifecycle in Figure 4.2d determined when and where to originate its propagules. To assess whether gene regulation instructions played a role in this process, we prepared two knockout strains. In the first, gene regulation instructions were replaced with no-operation (Nop) instructions (so that gene regulation state would remain baseline). In the second, the reproduction instructions to spawn a propagule were replaced with Nop instructions.

3 Alternately, to the same end, resource sharing behavior could instead be suppressed in the opposite case, when a neighbor holds a different hereditary group ID.

(a) Naive (frames at updates 0, 128, 256, 384, 512; animation: https://hopth.ru/x, in-browser simulation: https://hopth.ru/1). The offspring group is birthed at the exterior of the parent group. Parent and offspring groups then compete with each other for space just the same as they do with other groups.

(b) Adjoin (frames at updates 0, 32, 64, 128, 512; animation: https://hopth.ru/y, in-browser simulation: https://hopth.ru/2). The offspring group begins as a single cell at the exterior of the parent group. Parent and offspring groups then exclusively expend reproductive effort to compete with other groups. This results in a stable interface between the parent and offspring groups as the offspring group grows over time.

(c) Sweep (frames at updates 0, 72, 144, 216, 288; animation: https://hopth.ru/z, in-browser simulation: https://hopth.ru/3). The offspring group begins as a single cell at the exterior of the parent group. The offspring group then grows rapidly into the parent group, resulting in a near-complete transfer of simulation space into the offspring group. Multiple offspring groups may simultaneously grow over the parent, as is the case here.

(d) Burst (frames at updates 0, 96, 192, 288, 384; animation: https://hopth.ru/0, in-browser simulation: https://hopth.ru/4). The offspring group begins as a single cell at the interior of the parent group. Over time, the offspring group grows over the parent group from the inside out. Multiple offspring groups may develop simultaneously, as is the case here.

Figure 4.2: Time lapse examples of qualitative life histories evolved under the Nested-Wave treatment. From left to right within each row, frames depict the progression of simulation state within a subset of the simulation grid. L1 hereditary groups are differentiated by grayscale tone and separated by solid black borders. L0 hereditary groups are separated by dashed gray borders. In each example, the focal parent L1 group is colored purple and the focal offspring group orange.

(a) Regulation visualizations (wild type, propagule knockout, and regulation knockout). (b) Interior propagule rate by genotype (wild type, propagule knockout, regulation knockout).

Figure 4.3: Analysis of a wild type strain with a "burst" lifecycle, evolved under the "Nested-Wave" treatment, exhibiting interior propagule generation. Subfigure 4.3a compares gene regulation between analyzed strains. Group layouts are overlaid via borders between cells.
Black borders divide L1 groups and white borders divide L0 groups. Borders between L1 groups are underlined in red for greater visibility. Within these group layouts, regulation state for each cell's four directional SignalGP instances is color coded using a PCA mapping from regulatory state to three-dimensional RGB coordinates. (The PCA mapping is calculated uniquely for each L1 hereditary group.) Within a L1 hereditary group, color similarity among tile quarters indicates that the corresponding SignalGP instances exhibit similar regulatory state. However, the particular hue of a SignalGP instance has no significance. In the case of identical regulatory state (here, due to the absence of genetic regulation in a knockout strain), this color coding appears gray. Wild type interior propagules are annotated with red arrows. Subfigure 4.3b compares the mean number of interior propagules observed per L1 hereditary group. Error bars indicate 95% confidence. View an animation of wild type gene regulation at https://hopth.ru/t. View the wild type strain in a live in-browser simulation at https://hopth.ru/g.

Figure 4.3a depicts the gene regulation phenotypes of these strains. Figure 4.3b compares interior propagule generation between the strains, confirming the direct mechanistic role of gene regulation in promoting interior propagule generation (non-overlapping 95% CI). In head-to-head match-ups, the wild type strain outcompetes both the regulation-knockout (20/20; p < 0.001; two-tailed Binomial test) and the propagule-knockout strains (20/20; p < 0.001; two-tailed Binomial test). The deficiency of the propagule-knockout strain confirms the adaptive role of interior propagule generation. Likewise, the deficiency of the regulation-knockout strain affirms the adaptive role of gene regulation in the focal wild type strain.

4.3.3 Case Study: Cell-cell Messaging

We discovered adaptive cell-cell messaging in two evolved strains. Here, we discuss a strain evolved under the Flat-Wave treatment where cell-cell messaging disrupts directional and spatial uniformity of resource sharing. Supplementary Section C.13 overviews an evolved strain where cell-cell messaging appears to intensify expression of a contextual tit-for-tat policy between hereditary groups.

Figure 4.4 depicts the cell-cell messaging, resource sharing, and resource stockpile phenotypes of the wild type strain side-by-side with corresponding phenotypes of a cell-cell messaging knockout strain. In the wild type strain, cell-cell messaging emanates from an irregular collection of cells — in some regions grid-like and in others more sparse — broadcasting to all neighboring cells. Resource sharing appears more widespread in the knockout strain than in the wild type. However, messaging's suppressive effect on resource sharing is neither spatially nor directionally homogeneous. Relative to the knockout strain, cell-cell messaging increases variance in the cardinal directionality of net resource sharing (WT: mean 0.28, S.D. 0.07, n = 54; KO: mean 0.17, S.D. 0.07, n = 69; p < 0.001, bootstrap test). Cell-cell messaging also increases variance of resource sharing density with respect to spatial quadrants demarcated by the hereditary group's spatial centroid (WT: mean 0.23, S.D. 0.07, n = 52; KO: mean 0.16, S.D. 0.08, n = 68; p < 0.001, bootstrap test).
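The bootstrap tests invoked above can be realized in several ways; the dissertation does not spell out its exact procedure, so the self-contained Python sketch below shows one common variant (a percentile bootstrap on the difference in group means) with entirely fabricated data and hypothetical names.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def bootstrap_mean_diff_test(a, b, n_boot=10_000, seed=1):
    # Percentile bootstrap on the difference in group means; the two-tailed
    # p-value is (roughly) the share of resampled differences that fail to
    # preserve the observed direction, doubled.
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    crossings = 0
    for _ in range(n_boot):
        boot_a = [rng.choice(a) for _ in a]
        boot_b = [rng.choice(b) for _ in b]
        if (mean(boot_a) - mean(boot_b)) * observed <= 0:
            crossings += 1
    return min(1.0, 2 * crossings / n_boot)

# Hypothetical per-group variance measurements, for illustration only.
wt = [0.28, 0.31, 0.22, 0.35, 0.27, 0.30]
ko = [0.17, 0.15, 0.20, 0.12, 0.18, 0.16]
print(bootstrap_mean_diff_test(wt, ko))  # small p-value -> significant
```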
We used competition experiments to confirm the fitness advantage both of cell-cell messaging (20/20; p < 0.001; two-tailed Binomial test) and (using a separate knockout strain) resource sharing (20/20; p < 0.001; two-tailed Binomial test). The fitness advantage of irregularized sharing might stem from a corresponding increase in the fraction of cells with enough resource stockpiled to reproduce (WT: mean 0.18, S.D. 0.11, n = 54; KO: mean 0.06, S.D. 0.08, n = 69; p < 0.001, bootstrap test).

Figure 4.4: Visualization of phenotypic traits of a wild type strain evolved under the "Flat-Wave" treatment and the corresponding intercell messaging knockout strain (columns: messaging, resource sharing, resource stockpile; rows: wild type, messaging knockout). For these visualizations, group layouts are overlaid via borders between cells. Black borders divide L0 hereditary groups. In the messaging visualization, color coding represents the volume of incoming messages: white represents no incoming messages, and the magenta-to-blue gradient runs from one incoming message to the maximum observed incoming message traffic. As expected, unlike the wild type strain, the messaging knockout strain exhibits no messaging activity. In the resource sharing visualization, color coding represents the amount of incoming resource: white represents no incoming resource, and the magenta-to-blue gradient runs from the minimum to the maximum observed incoming resource. The wild type strain exhibits much sparser resource sharing than the messaging knockout strain. In the resource stockpile visualization, white represents zero-resource stockpiles, blue represents stockpiles with just under enough resource to reproduce, green represents stockpiles with enough resource to reproduce, and yellow represents more than enough resource to reproduce. The wild type groups contain more cells with rich resource stockpiles (green and yellow) than the messaging knockout strain. View an animation of the wild type strain at https://hopth.ru/p. View the wild type strain in a live in-browser simulation at https://hopth.ru/e.

Figure 4.5: Visualization of phenotypic traits of a wild type strain evolved under the "Nested-Wave" treatment and the corresponding relative stockpile sensing knockout strain (columns: resource stockpile, resource sharing; rows: wild type, relative stockpile sensing knockout). For these visualizations, group layouts are overlaid via borders between cells. Black borders divide L1 hereditary groups and dashed gray borders divide L0 hereditary groups. In the resource stockpile visualization, white represents zero-resource stockpiles, blue represents stockpiles with just under enough resource to reproduce, green represents stockpiles with enough resource to reproduce, and yellow represents more than enough resource to reproduce. The wild type groups contain more cells with rich resource stockpiles (green and yellow) than the knockout strain. In the resource-sharing visualization, white represents no incoming resource and the magenta-to-blue gradient runs from the minimum to the maximum observed amount of incoming shared resource. The wild type strain exhibits less resource sharing than the knockout strain. View an animation of the wild type strain at https://hopth.ru/s. View the wild type strain in a live in-browser simulation at https://hopth.ru/h.
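The directional-variance statistics reported above summarize how unevenly a group's net sharing is distributed across the four cardinal directions. The exact formula is not given in this chapter, so the Python sketch below should be read as one plausible reading (variance of each direction's fraction of total net sharing), with hypothetical names and made-up data.

```python
from statistics import pvariance

def directional_share_variance(net_shared_by_direction):
    # net_shared_by_direction maps each cardinal direction to the net
    # resource shared along it. Uniform sharing gives variance 0;
    # lopsided sharing gives higher values.
    total = sum(net_shared_by_direction.values())
    if total == 0:
        return 0.0
    fractions = [amount / total for amount in net_shared_by_direction.values()]
    return pvariance(fractions)

print(directional_share_variance({"N": 3.0, "E": 1.0, "S": 0.5, "W": 0.1}))
```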
4.3.4 Case Study: Gradient-conditioned Cell Behavior

To further assess how multicellular groups process and employ spatial and directional information, we investigated whether successful multicellular strategies evolved where cells condition their behavior based on the resource concentration gradient within a multicellular group. We discovered a strain that employs a dynamic strategy where cells condition their own resource-sharing behavior based on the relative abundance of their own resource stockpiles compared to their neighbors'. This strain appears to use this information to selectively suppress resource sharing. This strain's wild type outcompeted a variant where cells' capacity to assess the relative richness of neighboring resource stockpiles was knocked out (20/20; p < 0.001; two-tailed Binomial test). Figure 4.5 contrasts the sparser wild type resource-sharing phenotype with that of the knockout strain. This result raises the question of whether more sophisticated morphological patterning might evolve within the experimental system. Next, in Section 4.3.5, we examine a strain that exhibited striking genetically driven morphological patterning of hereditary groups.

4.3.5 Case Study: Morphology

Figure 4.6a shows one of the more striking examples of genetically encoded hereditary group patterning we observed. In this strain, which arose in a Nested-Even treatment replicate, L0 hereditary groups arrange as elongated, one-cell-wide strands. Knocking out intracell messaging disrupts the stringy arrangement of L0 hereditary groups, shown in Figure 4.6b. Figure 4.6c compares the distribution of cells' L0 same-hereditary-group neighbor counts for L1 groups of nine or more cells. Compared to the knockout variant, many fewer wild-type cells have three or four L0 same-hereditary-group neighbors, consistent with the one-cell-wide strands (non-overlapping 95% CI). However, we also observed that wild-type L0 hereditary groups were overall smaller than those of the knockout strain (WT: mean 2.1, S.D. 1.5; messaging knockout: mean 4.3, S.D. 5.1; p < 0.001; bootstrap test). So, we set out to determine whether smaller L0 group size alone was sufficient to explain these observed differences in neighbor count. We compared a dimensionless shape factor describing group stringiness (perimeter divided by the square root of area) between the wild type and messaging knockout strains. Between L0 group size four (the smallest size at which stringiness can emerge on a grid) and L0 group size six (the largest size for which we had sufficient replicate wild type observations), the wild type exhibited significantly greater stringiness (Figure 4.6d; size 4: p < 0.01, bootstrap test; size 5: p < 0.01, bootstrap test; size 6: non-overlapping 95% CI). This confirms that more sophisticated patterning, beyond just smaller L0 group size, is at play to create the observed one-cell-wide L0 strand morphology.

(a) Wild type. (b) Messaging knockout. (c) Distribution of L0 same-hereditary-group neighbor counts. (d) L0 hereditary group stringiness measure versus group size.
Figure 4.6: Comparison of a wild type strain evolved under the "Nested-Even" treatment with stringy L0 hereditary groups and the corresponding intracellular-messaging knockout strain. Subfigures 4.6a and 4.6b visualize hereditary group layouts; color hue denotes and black borders divide L1 hereditary groups, while color saturation denotes and white borders divide L0 hereditary groups. Smaller, thinner, and more elongated L0 groups can be seen in the wild type strain than in the knockout strain. Subfigures 4.6c and 4.6d quantify the morphological effect of the intracellular-messaging knockout. In the formula for shape factor given in Subfigure 4.6d, P refers to group perimeter and A refers to group area. Error bars indicate 95% confidence. View an animation of the wild type strain at https://hopth.ru/q. View the wild type strain in a live in-browser simulation at https://hopth.ru/f.

Competition experiments failed to show a fitness effect of this strain's morphological patterning. The wild type strain won competitions about as often as the knockout strain (6/20). Thus, it seems this trait either emerged by drift, hitchhiked along as the genetic background of a selective sweep, or was advantageous against a divergent competitor earlier in evolutionary history.

4.3.6 Case Studies: Apoptosis

Figure 4.7: Comparison of wild type strains and corresponding apoptosis knockout strains (columns: Strain A, Strain B; rows: wild type, apoptosis knockout). In all visualizations, color hue denotes and black borders divide apex-level hereditary groups. In Strain A visualizations, color saturation denotes and white borders divide L0 hereditary groups. (Strain B evolved under a flat treatment.) Black tiles are dead. These dead tiles, all due to apoptosis, can be seen in both strains' wild types. Dead tiles appear to be clustered contiguously or near-contiguously at group peripheries in both strains, with more dead tiles apparent in Strain A than Strain B. View an animation of wild type strain A at https://hopth.ru/m. View an animation of wild type strain B at https://hopth.ru/n. View wild type strain A in a live in-browser simulation at https://hopth.ru/b. View wild type strain B in a live in-browser simulation at https://hopth.ru/c.

Finally, we assessed whether cell self-sacrifice played a role in multicellular strategies evolved across our survey. Screening replicate evolutionary runs by apoptosis rate flagged two strains with several orders of magnitude greater activity. In strain A, evolved under the Nested-Even treatment, apoptosis accounts for 2% of cell mortality. In strain B, evolved under a flat treatment, 15% of mortality is due to apoptosis. To test the adaptive role of apoptosis in these strains, we performed competition experiments against apoptosis knockout strains, in which all apoptosis instructions were replaced with Nop instructions. Figure 4.7 compares the wild type hereditary group structures of these strains to their corresponding knockouts. Apoptosis contributed significantly to fitness in both strains (strain A: 18/20, p < 0.001, two-tailed Binomial test; strain B: 20/20, p < 0.001, two-tailed Binomial test). The success of strategies incorporating cell suicide is characteristic of evolutionary conditions favoring altruism, such as kin selection or a transition from cell-level to collective individuality.
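The head-to-head competition results reported throughout this chapter are evaluated with two-tailed Binomial tests against the null hypothesis of even odds. Assuming SciPy is available, one way to reproduce the arithmetic for a 20/20 sweep is:

```python
from scipy.stats import binomtest

# 20 wins out of 20 head-to-head competitions, tested against the null
# hypothesis of even odds (p = 0.5) between competitors.
result = binomtest(k=20, n=20, p=0.5, alternative="two-sided")
print(result.pvalue)  # ~1.9e-06
```

Even a perfect 20/20 sweep yields p = 2 × 0.5^20 ≈ 1.9 × 10^-6, comfortably below the p < 0.001 threshold reported.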
To discern whether spatial or temporal targeting of apoptosis contributed to fitness, we competed wild type strains against apoptosis-knockout strains in which we externally triggered cell apoptosis with spatially and temporally uniform probability. In one set of competition experiments, the knockout strain's apoptosis probability was based on the observed apoptosis rate of the wild type strain's monoculture. In a second set of competition experiments, the knockout strain's apoptosis probability was based on the observed apoptosis rate of the population in the evolutionary run from which the wild type strain was harvested. In both sets of experiments, on both strains, wild type strains outcompeted knockout strains with uniform apoptosis probabilities (strain A monoculture rate: 18/20, p < 0.001, two-tailed Binomial test; strain A population rate: 19/20, p < 0.001, two-tailed Binomial test; strain B monoculture rate: 20/20, p < 0.001, two-tailed Binomial test; strain B population rate: 20/20, p < 0.001, two-tailed Binomial test).

4.4 Discussion

In this work, we selected for fraternal transitions in individuality among digital organisms controlled by genetic programs. Because — unlike previous work (Goldsby et al., 2012, 2014) — we provided no experimentally prescribed mechanism for collective reproduction, we observed the emergence of several distinct life histories. Evolved strategies exhibited intercellular communication, coordination, and differentiation. These included endowment of offspring propagule groups, asymmetrical intra-group resource sharing, asymmetrical inter-group relationships, morphological patterning, gene-regulation-mediated life cycles, and adaptive apoptosis.

Across treatments, we observed resource sharing and reproductive cooperation among registered kin groups. These outcomes arose even in treatments where registered kin groups lacked functional significance (i.e., resource was distributed evenly), suggesting that reliable kin recognition alone might be sufficient for aspects of fraternal collectivism to evolve in systems where population members compete antagonistically for limited space or resources and spatial mixing is low. Beyond their functional consequences, physical mechanisms such as cell attachment may merit consideration simply as kin recognition tools.

In future work, we are eager to undertake experiments investigating open questions pertaining to major evolutionary transitions, such as the role of pre-existing phenotypic plasticity (Clune et al., 2007; Lalejini and Ofria, 2016), pre-existing environmental interactions, pre-existing reproductive division of labor, and how transitions relate to increases in organizational (Goldsby et al., 2012), structural, and functional (Goldsby et al., 2014) complexity. Expanding the scope of our existing work to directly study evolutionary dynamics and evolutionary histories will be crucial to such efforts. In particular, we plan to investigate mechanisms to evolve greater collective sophistication among agents. The modular design of SignalGP lends itself to the possibility of exploring sexual recombination. We are interested in exploring extensions that allow cell groups to develop neural and vascular networks (Moreno and Ofria, 2020).
We hypothesize that selective pressures related to intra-group coordination and inter-group conflict might spur developmental and structural infrastructure that could be co-opted to evolve agents proficient at unrelated tasks like navigation, game-playing, or reinforcement learning.

Unfortunately, however, experiments with multicellularity are specially constrained by a fundamental limitation of digital evolution research: processing power (Moreno, 2020). This limitation, which commonly manifests as smaller population sizes than natural populations (Liard et al., 2018), only compounds when the unit of selection shifts to computationally expensive groups of dozens or hundreds of component individuals. Ongoing work with DISHTINY is testing approaches to harness increasingly abundant parallel processing power for digital evolution simulation (Moreno et al., 2021b). The spatial, distributed nature of our approach potentially affords a route to achieve large-scale digital multicellularity experiments consisting of millions, instead of thousands, of cells via high-performance parallel computing.

We hope that such technical efforts will also benefit other computational work exploring a broader range of conceptual models of multicellularity. For instance, this work assumes incessant, pervasive biotic interaction via competition for space. However, many natural systems exhibit more intermittent, sparse encounters between multicells, and such selective interactions have been hypothesized as key to the evolution of complexity and diversity (Soros and Stanley, 2014). Also crucial to explore, and unaccounted for in this work, are the dynamics of cell migration in development (Horwitz and Webb, 2003) and the motility of multicells (Arnellos and Keijzer, 2019). It seems certain that the varied conditions and mechanistic richness of biological reality can only be fully explored through a plurality of conceptual models and model systems.

Chapter 5
A Case Study of Novelty, Complexity, and Adaptation in a Multicellular System

Authors: Matthew Andres Moreno, Santiago Rodriguez Papa, and Charles Ofria

This chapter is adapted from (Moreno et al., 2021a), which underwent peer review and appeared in the proceedings of the Fourth Workshop on Open-Ended Evolution (OEE4) at the 2021 Conference on Artificial Life (ALIFE 2021). This chapter reports trajectories of novelty, complexity, and adaptation in a case study from the DISHTINY simulation system. This case study lineage produced ten qualitatively distinct multicellular morphologies, several of which exhibit asymmetrical growth and distinct life stages. We find that a loose — sometimes divergent — relationship can exist among novelty, complexity, and adaptation.

5.1 Introduction

The challenge, and promise, of open-ended evolution has animated decades of inquiry and discussion within the artificial life community (Packard et al., 2019). The difficulty of devising models that produce continuing open-ended evolution suggests profound philosophical or scientific blind spots in our understanding of the natural processes that gave rise to contemporary organisms and ecosystems. Already, pursuit of open-ended evolution has yielded paradigm-shifting insights. For example, novelty search demonstrated how processes promoting non-adaptive diversification can ultimately yield adaptive outcomes that were previously unattainable (Lehman and Stanley, 2011).
Such work lends insight into fundamental questions in evolutionary biology, such as the relevance — or irrelevance — of natural selection with respect to increases in complexity (Lehman, 2012; Lynch, 2007) and the origins of evolvability (Kirschner and Gerhart, 1998; Lehman and Stanley, 2013). Evolutionary algorithms devised in support of open-ended evolution models also promise to deliver tangible broader impacts for society. Possibilities include the generative design of engineering solutions, consumer products, art, video games, and AI systems (Kenneth O. Stanley, 2017; Nguyen et al., 2015).

Preceding decades have witnessed advances toward defining — quantitatively and philosophically — the concept of open-ended evolution (Bedau et al., 1998; Dolson et al., 2019; Lehman and Stanley, 2012) as well as investigating causal phenomena that promote open-ended dynamics such as ecological dynamics, selection, and evolvability (Dolson, 2019; Huizinga et al., 2018; Soros and Stanley, 2014). The concept of open-endedness is fundamentally characterized by intertwined generation of novelty, functional complexity, and adaptation (Taylor et al., 2016). How and how closely these phenomena relate to one another remains an open question. Here, we aim to complement ongoing work to develop a firmer theoretical understanding of the relationship between novelty, complexity, and adaptation by exploring the evolution of these phenomena through a case study using the DISHTINY digital multicellularity framework. We apply a suite of qualitative and quantitative measures to assess how these qualities can change over evolutionary time and in relation to one another.

5.2 Methods

5.2.1 Simulation

The DISHTINY simulation environment tracks cells occupying tiles on a toroidal grid (size 120 × 120 by default). Cells collect a uniform inflow of continuous-valued resource. This resource can be spent in increments of 1.0 to attempt asexual reproduction into any of a cell's four adjacent cells. A cell can only be replaced if it commands less than 1.0 resource. If a cell rebuffs a reproduction attempt, its resource stockpile decrements by 1.0, down to a minimum of 0.0.

In order to facilitate the formation of coherent multicellular groups, the DISHTINY framework provides a mechanism for cells to form groups and detect group membership. Groups arise through cellular reproduction. When a cell proliferates, it may choose to initiate its offspring as a member of its kin group, thereby growing it, or induce the offspring to found a new kin group. This process is similar to the growth of biological multicellular tissues, where cell offspring can be retained as members of the tissue or permanently expelled.

We incentivize group formation by providing an additional resource inflow bonus based on group size. Per-cell resource collection rate increases linearly with group size up to a cap of 12 members. Past 12 members, the decay rate of cells' resource stockpiles begins increasing exponentially. In short, groups that are too small forgo the bonus and groups that are too large incur a penalty. These mechanisms select for medium-sized groups; the harsh penalization of oversized groups, in particular, prevents any single group from consuming the entire population. In order to ensure group turnover, we force groups to fragment into unicells after 8,192 (2^13) updates.
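The group-size incentives just described pair a linear collection bonus with an exponential over-size penalty. The sketch below illustrates that shape; only the cap of 12 members comes from the text above, while the base rates and the exact exponential form are placeholder assumptions.

```python
BONUS_CAP = 12     # group size at which the linear collection bonus saturates
BASE_DECAY = 0.001  # placeholder baseline stockpile decay rate

def per_cell_collection_rate(group_size, base_rate=1.0):
    # Collection bonus grows linearly with group size, capped at 12 members.
    return base_rate * min(group_size, BONUS_CAP)

def stockpile_decay_rate(group_size):
    # Past the cap, stockpile decay increases exponentially with excess size
    # (the doubling-per-excess-member form here is an assumption).
    excess = max(0, group_size - BONUS_CAP)
    return BASE_DECAY * 2 ** excess

for size in (1, 6, 12, 16):
    print(size, per_cell_collection_rate(size), stockpile_decay_rate(size))
```

Under any parameterization of this shape, intermediate group sizes maximize net per-cell resource income, which is the selective pressure toward medium-sized groups described above.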
In Chapter 4, we established that this framework can select for traits characteristic of multicellularity, such as cooperation, coordination, and reproductive division of labor. We also found that more case studies of interest arose when two nested levels of group membership were tracked, as opposed to a single, un-nested level of group membership. With nested group membership, group growth still occurs by cellular reproduction. Cells are given the choice to retain offspring within both groups, to expel offspring from both groups, or to expel offspring from the innermost group only. Section 4.2.2 provides greater detail on group membership and hierarchical group membership in DISHTINY. In this work, we allow for nested kin groups.

Figure 5.1: Overview of genome execution. Tagged events and messages (shown as bells and envelopes, respectively) activate module execution on virtual cores. Simulation state can also be read directly using sensor instructions to access input registers. Special instructions write to output registers, allowing interaction with the simulation, and generate tagged messages, allowing interaction with other virtual CPUs.

Figure 5.2: Overview of the DISHTINY system. Cells occupy slots on a toroidal grid (Subfigure a). As cells reproduce, they may grow their existing kin group (shown here by color) or splinter off to found new ones. Each cell, shown here bounded within black squares, is controlled by four virtual CPUs, referred to as "cardinals" and shown here within triangles (Subfigure b). Cardinals within a cell can interact via message passing (blue conduits). Cardinals can interact with the corresponding cardinal in their neighboring cell through message passing or simulation intrinsics (i.e., resource sharing, offspring spawning, etc.), represented here by purple conduits. These inter-cell interactions may span physical hardware threads or processes. All virtual CPUs within a cell independently execute the same linear genetic program (Subfigure c). Tagged subsections of this linear genetic program ("modules") activate in response to stimuli.

In addition to controlling reproduction behavior, evolving genomes can also share resources with adjacent cells, perform apoptosis (recovering a small amount of resource that may be shared with neighboring cells), and pass arbitrary messages to neighboring cells. Cell behaviors are controlled by event-driven genetic programs in which linear GP modules are activated in response to cues from the environment or neighboring agents; signals are handled in quasi-parallel on up to 32 virtual cores (Figure 5.1) (Lalejini and Ofria, 2018). Each cell contains four independent virtual CPUs, all of which execute the same genetic program (Figure 5.2a). Each CPU manages interactions with a single neighboring cell.
We refer to a CPU managing interactions with a particular neighbor as a "cardinal" (as in "cardinal direction"). These CPUs may communicate via intra-cellular message passing. Full details on the instruction set and event library used, as well as simulation logic and parameter settings, appear in supplementary material; Supplementary Section D.5 provides full detail on simulation components and parameters.

5.2.2 Evolution

We performed evolution in three-hour windows for compatibility with our compute cluster's scheduling system. We refer to these windows as "stints." We randomly generated one-hundred-instruction genomes at the outset of the initial stint, stint 0. At the end of each three-hour window, the system harvested and stored genomes in a population file. We then seeded subsequent stints with the previous stint's population. No simulation state besides genome content was preserved between stints. In addition to simplifying implementation concerns, re-seeding each stint ensured that strains retained the capability to grow from a well-mixed inoculum. This facilitated later competition experiments between strains.

In order to ensure heterogeneity of biotic environmental factors experienced by evolving cells, we imposed a diversity maintenance scheme. In this scheme, descendants of a single stint 0 progenitor cell that proliferated to constitute more than half of the population were penalized with resource loss. The severity of the penalty increased with increasing prevalence beyond half of the population. Thus, we ensured that descendants of at least two distinct stint 0 progenitors remained over the course of the simulation. We arbitrarily chose a strain for primary study — we refer to this strain as the "focal" strain and others as "background" strains. In our case study, there was only one background strain in addition to this focal strain.

In our screen for case studies, we evolved 40 independent populations for 101 stints. We selected population 16005 from among these 40 to profile as a case study due to its distinct asymmetrical group morphology. At the conclusion of each stint, we selected the most abundant genome within the population as a representative specimen. We performed a suite of follow-up analyses on each representative specimen to characterize aspects of complexity, detailed in the following subsections. To ensure that specimens were consistently sampled from descendants of the same stint 0 progenitor, we only considered genomes with the lowest available stint 0 progenitor ID.

5.2.3 Phenotype-neutral Nopout

After harvesting representative specimens from each stint, we filtered out genome instructions that had no impact on the simulation. To accomplish this, we performed sequential single-site "nopouts," in which individual genome instructions were disabled by replacing them with a Nop instruction.1 We reverted nopouts that altered a strain's phenotype and kept those that did not. To determine whether phenotypic alteration occurred, we seeded an independent, mutation-disabled simulation with the strain in question and ran it side-by-side with an independent, mutation-disabled simulation of the wildtype strain. If any divergence in resource concentration was detected between the two strains within a 2,048-update window, the single-site nopout was reverted. We continued this process until no single-site nopouts were possible without altering the genome's phenotype.
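The sequential nopout procedure is a greedy loop over genome sites. The following Python sketch mirrors the algorithm described above, with the phenotype-divergence check abstracted into a callback; the toy divergence test at the bottom is purely illustrative and stands in for the side-by-side simulation comparison.

```python
def phenotype_neutral_nopout(genome, phenotypes_diverge):
    # Greedily disable instructions one site at a time, keeping each nopout
    # only if the resulting strain's phenotype is indistinguishable from
    # wild type under the supplied divergence check.
    candidate = list(genome)
    for site, instruction in enumerate(genome):
        if instruction == "Nop":
            continue
        trial = list(candidate)
        trial[site] = "Nop"
        if not phenotypes_diverge(trial, genome):
            candidate = trial  # the nopout was phenotype-neutral; keep it
    return candidate

# Hypothetical stand-in: pretend only "SpawnPropagule" sites are expressed.
def toy_diverges(variant, wildtype):
    return variant.count("SpawnPropagule") != wildtype.count("SpawnPropagule")

genome = ["Inc", "SpawnPropagule", "Nop", "Dec", "SpawnPropagule"]
print(phenotype_neutral_nopout(genome, toy_diverges))
# -> ['Nop', 'SpawnPropagule', 'Nop', 'Nop', 'SpawnPropagule']
```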
To speed up evaluation, we performed these step-by-step, side-by-side comparisons on a smaller toroidal grid of just 100 tiles. This process left us with a “Phenotype-neutral Nopout” variant of the wildtype genome in which all remaining instructions contributed to the phenotype. However, in further analyses we discovered that 21 phenotype-neutral nopouts from our case study were not actually neutral: competition experiments revealed they were significantly less fit than the wildtype strain. This might be due to insufficient spatial or temporal scope to observe expression of particular genome sites in our test for phenotypic divergence.

5.2.4 Estimating Critical Fitness Complexity

Next, we sought to detect genome instructions that contributed to strain fitness. For each remaining op instruction in the Phenotype-neutral Nopout variant, we took the wildtype strain and applied a nopout at the corresponding site. We then competed this variant against the wildtype strain. Evaluating only the op instructions remaining in the Phenotype-neutral Nopout variant allowed us to decrease the number of fitness competitions we had to perform.

We initialized fitness competitions by seeding a population half-and-half with the two strains. We ran these competitions for 10 minutes (about 4,200 updates) on a 60 × 60 toroidal grid, after which we assessed the relative abundances of descendants of both seeded strains. To determine whether fitness differed significantly between the wildtype and a variant strain, we compared the relative abundance of the strains observed at the end of competitions against outcomes from 20 control wildtype-vs-wildtype competitions. We fit a t-distribution to the abundance outcomes observed under the control wildtype-vs-wildtype competitions and deemed outcomes that fell outside the central 98% probability density of that distribution a significant difference in fitness. This allowed us to screen for fitness effects of single-site nopouts while performing only a single competition per site.

This process left us with a “Fitness-noncritical Nopout” variant of the wildtype genome in which all remaining instructions contributed significantly to fitness. We called the number of remaining instructions the strain’s “critical fitness complexity.” We adjusted this figure downward for the expected 1% rate of false-positive fitness differences among tested genome sites. This metric mirrors the MODES complexity metric described in (Dolson et al., 2019) and the approximation of sequence complexity advanced in (Adami et al., 2000).

5.2.5 Estimating State Interface Complexity

In addition to estimating the number of genome sites that contribute to fitness, we measured the number of different environmental cues and the number of different output mechanisms that cells adaptively incorporated into behavior. One possible way to take this measure would be to disable event cues, sensor instructions, and output registers one by one and test for changes in fitness. However, this approach would fail to distinguish context-dependent input/output from merely contingent input/output. For example, a cell might happen to depend on a sensor being set at a certain frequency but not on the actual underlying simulation information the sensor represents.
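For illustration, here is a minimal sketch of the screening statistic from Section 5.2.4 (which the interface complexity measures below reuse in t-test form). The control outcomes are randomly generated placeholders, not real assay data:

```python
import numpy as np
from scipy import stats

# Placeholder stand-ins for end-of-competition relative abundances.
rng = np.random.default_rng(1)
control = rng.normal(0.5, 0.05, size=20)  # 20 wildtype-vs-wildtype controls
variant_outcome = 0.31                    # one variant-vs-wildtype competition

# Fit a t-distribution to the control outcomes; deem the variant
# significantly fitness-differing if it falls outside the central 98%
# probability density (i.e., beyond the 1st or 99th percentile).
df, loc, scale = stats.t.fit(control)
lo, hi = stats.t.ppf([0.01, 0.99], df, loc=loc, scale=scale)
significant = not (lo <= variant_outcome <= hi)
```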
To isolate context-dependent input/output state interactions, we tested the fitness effect of swapping particular input/output states between CPUs rather than completely disabling them. That is, for example, CPU b would be forced to perform the output generated by CPU a, or CPU b would be shown the input meant for CPU a. We performed this manipulation on half the population in a fitness competition for each individual component of the simulation’s introspective state (44 sensor states relating to the status of a CPU’s own cell), extrospective state (61 sensor states relating to the status of a neighboring cell), and writable state (18 output states, 10 of which control cell behavior and 8 of which act as global memory for the CPU). (A full description of each piece of introspective, extrospective, and writable state is listed in supplementary material.) We deemed a state fitness-critical if this manipulation resulted in decreased fitness at significance p < 0.01 using a t-test parameterized by 20 control wildtype-vs-wildtype competitions. We describe the number of states that cells interact with to contribute to fitness as “State Interface Complexity.”

5.2.6 Estimating Messaging Interface Complexity

In addition to estimating the number of input/output states cells use to interact with the environment, we also estimated the number of distinct intra-cellular messages cardinals within a cell use to coordinate and the number of distinct inter-cellular messages cells use to coordinate. As with state interface complexity, distinguishing context-dependent behavior from contingent behavior is critical to attaining a meaningful measurement. For example, a cardinal might happen to depend on always receiving an inter-cellular message from a neighbor or an intra-cellular message from another cardinal; even if that message carries no meaningful information, blocking it would decrease fitness. So, instead of simply discarding messages to test for a fitness effect, we re-routed messages back to the sending cardinal instead of their intended recipient. We deemed a message fitness-critical if this manipulation resulted in decreased fitness at significance p < 0.01 using a t-test parameterized by 20 control wildtype-vs-wildtype competitions. We refer to the number of distinct messages that cells send to contribute to fitness as “Messaging Interface Complexity.”

We refer to the sum of State Interface Complexity, Intra-messaging Interface Complexity, and Inter-messaging Interface Complexity as “Cardinal Interface Complexity.”

5.2.7 Estimating Adaptation

In order to assess ongoing changes in fitness, we performed fitness competitions between the representative focal strain specimen sampled at each stint and the focal strain population from the preceding stint. (Recall from Section 5.2.2 that, due to the diversity maintenance procedure, two completely independent strains coexisted over the course of the experiment: the “focal” strain selected for analysis and a “background” strain.) Using the population from the preceding stint as the competitive baseline (rather than the representative specimen) ensured more focused, consistent measurement of the fitness properties of the specimen at the current stint (e.g., preventing skewed results from a sampled “dud” at the preceding stint). We performed 20 independent replicates of each competition. Competing strains were well-mixed within the full-sized toroidal grid at the outset of each competition, which lasted for 10 minutes of wall time.
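As a quick check of the win-count thresholds described next, the tail probability under the two-tailed binomial null for 20 replicate competitions can be computed directly:

```python
from scipy.stats import binom

# Under the null, each of the 20 replicate competitions is a fair coin
# flip. Two-tailed probability of an outcome at least as extreme as
# 18 wins (or, symmetrically, 2 or fewer wins) out of 20:
p_two_tailed = 2 * binom.sf(17, 20, 0.5)  # sf(17) = P(X >= 18)
print(round(p_two_tailed, 5))  # ~0.0004, below the p < 0.005 cutoff
```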
These 10-minute competitions sufficed to simulate about 8,000 updates at stint 0 and about 2,000 updates at stint 100 (Supplementary Figures D.10, D.11, and D.12). We determined that a gain of fitness had occurred if the current stint specimen constituted a population majority at the conclusion of more than 17 of those competitions, corresponding to a significance level of p < 0.005 under the two-tailed binomial null hypothesis. Likewise, we deemed winning fewer than 3 competitions a significant fitness loss.

5.2.8 Implementation

We employed multithreading to speed up execution. We split the simulation into four 60 × 60 subgrids. Each subgrid executed asynchronously, using the Conduit C++ library presented in Chapter 2 to orchestrate best-effort, real-time interactions between simulation elements on different threads. This approach is inspired by Ackley’s notion of indefinite scalability (Ackley and Small, 2014). In other work benchmarking the system, we have demonstrated that this approach improves scalability: the simulation scales to 4 threads with 80% efficiency, up to 64 threads with 40% efficiency, and up to 64 nodes with 80% efficiency (Chapter 2).

Over the 101 three-hour evolutionary stints performed to evolve the case study, 7,565,309 simulation updates elapsed. This translates to 74,904 updates per stint, or about 6.9 updates per second. However, the update-processing rate was not uniform across stints: the simulation slowed about 77% as stints progressed. Supplementary Figure D.9 shows elapsed updates for each stint. During stint 0, 176,816 updates elapsed (about 16.3 updates per second). During stint 100, only 41,920 updates elapsed (about 3.8 updates per second).

Although working asynchronously, threads processed similar numbers of updates during each stint. The mean standard deviation of update-processing rate between threads was 2%. The mean difference in update-processing rate between the fastest and slowest threads was 5%. The maximum values of these statistics observed during a stint were 9% and 20%, respectively, both at stint 44. Supplementary Figure D.9b shows the distribution of elapsed updates across threads for each stint evolved during the case study.

Software is available under an MIT License at https://github.com/mmore500/dishtiny. All data is available via the Open Science Framework at https://osf.io/prq49. Supplementary material is available via the Open Science Framework at https://osf.io/gekc8.

5.3 Results

5.3.1 Evolutionary History

Due to the parallel nature of the experimental framework, we did not perform perfect phylogeny tracking; Chapter 3 discusses the challenges of parallelizing perfect phylogeny tracking in depth. However, we did track the total number of ancestors seeded into stint 0 with extant descendants. At the end of stints 0 and 1, three distinct original phylogenetic roots were present in the population. From stint 2 onward, only two distinct original phylogenetic roots were present.

We performed follow-up analyses on specimens sampled from the lowest original phylogenetic root ID present in the population. (This approach was designed to choose an arbitrary strain as focal; barring extinction, that same strain will then be identified as focal consistently across subsequent stints. Phylogenetic root ID had no functional consequences; it is simply an arbitrary basis for focal strain selection.) For the first two stints, the focal strain was root ID 2,378. During stint 2, original phylogenetic root 2,378 went extinct, so all further follow-up analyses were sampled from descendants of ancestor 12,634.

We also tracked the number of genomes reconstituted at the outset of each stint with extant descendants at the end of that stint. This count grows from approximately 10 around stint 15 to upwards of 30 around stint 40 (Supplementary Figure D.4a).
Among descendants of the lowest original phylogenetic root, the number of independent lineages spanning a stint also increases, from around 5 to around 15 (Supplementary Figure D.4b). This decrease in stint-by-stint phylogenetic consolidation correlates with the waning number of simulation updates performed per stint (Supplementary Figures D.4c and D.4d). More complete phylogenetic data will be necessary in future experiments to address questions about the possibility of long-term stable coexistence beyond the two strains supported under the explicit diversity maintenance scheme.

On the specimen from stint 100 used in the final case study, an evolutionary history of 20,212 cell generations had elapsed. Of these cellular reproductions, 11,713 (58%) had full kin group commonality, 7,174 (35%) had partial kin group commonality, and 1,325 (7%) had no kin group commonality. On this specimen, 1,672 mutation events had elapsed. During these events, 7,240 insertion-deletion alterations and 26,153 point mutations had occurred. This strain experienced a selection pressure of 18% over its evolutionary history, meaning that only 82% of the mutations expected given the number of elapsed cellular reproductions were actually present.

In order to characterize the evolutionary history of the experiment in greater detail, we performed a parsimony-based phylogenetic reconstruction on the sampled representative specimens from each stint, shown in Figure 5.3.

Figure 5.3: Phylogeny of sampled focal strain representatives across stints, reconstructed using a parsimony algorithm (Cock et al., 2009). Each leaf node corresponds to a sampled representative. Representatives from stints 0 and 1, which share no common ancestry with representatives from other stints, are excluded. Numbers refer to the stint each representative was sampled from. Color coding and parentheticals of stint labels correspond to qualitative morph codes described in Table 5.1.

We used genomes’ fixed-length blocks of 35 64-bit tags that mediate environmental interactions as the basis for this reconstruction. These tag blocks underwent bitwise mutation over the course of the experiment. (In future experiments, we plan to incorporate new methodology for “hereditary stratigraph” genome annotations expressly designed to facilitate phylogenetic reconstruction; see Chapter 3.) Supplementary Figure D.5 shows Hamming distance between all pairs of tag blocks. We additionally tried several other tree inference methods, discussed in supplementary material; however, these yielded lower-quality reconstructions.
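For illustration, the sketch below shows how such tag blocks can drive tree building via pairwise Hamming distances and agglomerative clustering; this simplified stand-in uses random bits and is not the parsimony procedure actually employed:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

# Stand-in tag blocks: one row per specimen, 35 tags x 64 bits unpacked
# into a flat binary vector (real blocks would come from sampled genomes).
rng = np.random.default_rng(0)
tag_blocks = rng.integers(0, 2, size=(10, 35 * 64))

# Pairwise Hamming distances between specimens' tag blocks.
dists = pdist(tag_blocks, metric="hamming")

# Agglomerative clustering yields a rough tree whose merge heights
# reflect accumulated bitwise mutations along lineages.
tree = linkage(dists, method="average")
```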
Although the phylogeny of stint representatives includes many instances that do not constitute a strict lineage (i.e., each stint’s representative descending directly from the preceding stint’s representative), we did not observe evidence of long-term coexistence of clades over more than ten stints.

5.3.2 Qualitative Morphological Categorizations

We performed a qualitative survey of the evolved life histories along the evolutionary timeline by analyzing video recordings of monocultures of representative specimens from each stint. Table 5.1 summarizes the ten morphological categories we grouped specimens into. In brief, specimens from early stints largely grew as unicells or small multicellular groups (morphs a, b). Then, the specimen from stint 14 grew as larger, symmetrical groups (morph d). At stint 15, a distinct, asymmetrical horizontal-bar morphology evolved (morph e). At stint 45, a delayed secondary spurt of group growth in the vertical direction arose (morph g). This morphology was sampled frequently until stint 60, when morph e began to be sampled primarily again. However, morph g was observed as late as stint 90. Phylogenetic analysis (Figure 5.3) indicates that observations of morph e at stint 53 and onward are instances of secondary loss rather than retention of trait e by a separate lineage coexisting with the lineage expressing morph g. Three separate reversion events from morph g to morph e appear likely. Interestingly, morph g individuals at stints 89 and 90 appear to represent subsequent trait re-gain after reversion from morph g to morph e.

Table 5.1 provides more detailed descriptions of each qualitative morph category as well as video and a still image example of each. Supplementary Table D.1 provides morph categorization for each stint as well as links to view the stint’s specimen in a video or in-browser web simulation.

5.3.3 Fitness

Of the 100 competition assays performed, 57 indicated significant fitness gain, 23 were neutral, and 20 indicated significant fitness loss (shown in the upper right of Figure 5.4, at the intersection of the “Biotic Background, Without” column and “Assay Subject, Specimen” row). We were surprised by the frequency of deleterious outcomes, leading us to perform a second set of experiments to investigate whether these outcomes could be explained as sampling of “dud” representatives.

Morph a: Individual cells, no multicellular kin groups. Resource use is low; most cells simply hoard resource until their stockpile is beyond sufficient to reproduce. Only a handful of cells intermittently expend resource. Video: https://hopth.ru/21/b=prq49+s=16005+t=0+v=video+w=specimen

Morph b: Mostly individual cells, with some two-, three-, and four-cell groups evenly spread out. Resource usage occurs in short spurts in one or two adjacent cells. Video: https://hopth.ru/21/b=prq49+s=16005+t=1+v=video+w=specimen

Morph c: Large multicellular groups dominate, consisting of hundreds of cells. Group growth is unchecked and continues until cells’ resource stockpiles are entirely depleted by the excess group size penalty. Video: https://hopth.ru/21/b=prq49+s=16005+t=2+v=video+w=specimen

Morph d: Clear groups of 10 to 15 cells in size form. Cell proliferation appears somewhat more active at the periphery of groups compared to the interior.
Video: https://hopth.ru/21/b=prq49+s=16005+t=14+v=video+w=specimen

Morph e: Groups are visibly elongated along the horizontal axis. After initial development, some gradual, irregular growth occurs along the vertical axis. Video: https://hopth.ru/21/b=prq49+s=16005+t=15+v=video+w=specimen

Morph f: Groups are horizontally elongated similarly to morphology e, but have a greater consistent vertical thickness of three or four cells. Video: https://hopth.ru/21/b=prq49+s=16005+t=39+v=video+w=specimen

Morph g: Initial group growth is almost entirely horizontal, with groups usually taking up only one row of cells. However, after an apparent timing cue, groups perform a brief bout of aggressive vertical growth. Video: https://hopth.ru/21/b=prq49+s=16005+t=45+v=video+w=specimen

Morph h: Groups grow horizontally and then proliferate vertically on a timing cue, like morph g. However, after that timing cue, cell proliferation is incessant with almost no resource retention. Video: https://hopth.ru/21/b=prq49+s=16005+t=59+v=video+w=specimen

Morph i: Irregular groups of mostly fewer than ten cells. Incessant proliferation with almost no resource retention leads to rapid group turnover. Video: https://hopth.ru/21/b=prq49+s=16005+t=74+v=video+w=specimen

Morph j: Groups grow horizontally and then proliferate vertically on a timing cue, like morph g. However, several viable horizontal-bar offspring groups form before force-fragmentation. Video: https://hopth.ru/21/b=prq49+s=16005+t=100+v=video+w=specimen

Table 5.1: Qualitative morph phenotype categorizations. Color coding of morph IDs has no significance beyond guiding the eye in scatter plots where points are labeled by morph. Snapshots visualize the spatial layout of kin groups on the toroidal grid at a fixed point in time; each cell corresponds to a small square tile. Color hue denotes and black borders divide outermost kin groups, while color saturation denotes and white borders divide innermost kin groups.

Figure 5.4: Distributions of adaptation assay outcomes over all stints. For each adaptation assay, three outcomes were possible: significant fitness gain, significant fitness loss, or no significant fitness change (“neutral”). Significance cutoff p < 0.005 was used. A fitness loss (color coded red) corresponds to winning 2 or fewer competitions out of 20 against the preceding stint’s focal strain population. A fitness gain (color coded green) corresponds to winning 18 or more competitions out of 20. Neutral fitness outcomes are color coded yellow. Outcome counts are accumulated over experiments from stint 1 through stint 100. The upper row shows results for the sampled focal strain genome; the lower row shows results for the entire focal strain population. See Figure 5.5 for explanation of competition biotic backgrounds. See Figure 5.6 for joint distributions of fitness outcomes across biotic backgrounds.

In these competition assays, we competed the entire focal strain population against the focal strain population from the preceding stint.
However, we observed a similar result: 50 assays indicated significant fitness gain, 34 were neutral, and 16 indicated significant fitness loss (shown in the lower right of Figure 5.4, at the intersection of the “Biotic Background, Without” column and “Assay Subject, Population” row).

Next, we investigated whether the presence of the background strain as a “biotic background” influenced fitness. We repeated the two experiments described above (specimen and population competition assays), but inserted the background strain as half of the initial well-mixed population. In one assay setup, we used the background strain population from the current stint; we refer to this as “contemporary biotic background.” In another, which we call “prefatory biotic background,” we used the background strain population from the previous stint. We refer to the original competition assays absent the background strain as “without biotic background.” Figure 5.5 summarizes these competition assay designs.

After incorporating the background strain into our measure of fitness, we detected fewer whole-population deleterious outcomes: only six under contemporary biotic background conditions and only three under prefatory biotic background conditions (Figure 5.4).

Figure 5.5: Detail of adaptation assay design. Top panel shows progress of the original evolutionary experiment over one stint. A diversity maintenance procedure was used to ensure long-term coexistence of at least two strains over the course of the experiment by penalizing any strain that occupied more than half of thread-local population space. A “focal strain” was arbitrarily chosen for study; we refer to the other strain as the “background strain.” Adaptation assays in the lower panels measure fitness change over the course of that stint against the population from the preceding stint. The middle panel shows measurement of adaptation of the representative specimen sampled for analysis at each stint. The bottom panel shows measurement of adaptation of the entire focal strain population at each stint. Competitors were mixed in even proportion into the environment.
Bar heights represent the initial relative proportions of assay participants at the beginning of each competition. Adaptation was measured as change in population composition over a 10-minute competition window; we call this measurement of population composition change a “prevalence assay.” Competition experiments were performed absent the background strain, with the background strain population from the preceding stint, or with the background strain population from the current stint, shown separately in each panel.

Figure 5.6: Joint distribution of adaptation assay outcomes across biotic backgrounds. Subfigure (a) shows the joint distribution of adaptation assay outcomes on the representative specimen from the focal strain, with diversity maintenance during competition; Subfigure (b) shows the same with diversity maintenance disabled during competition; Subfigure (c) shows the joint distribution for the focal strain population, with diversity maintenance during competition. For each adaptation assay, three outcomes were possible: significant fitness gain, significant fitness loss, or no significant fitness change (“neutral”). Significance cutoff p < 0.005 was used. A fitness loss (color coded red) corresponds to winning 2 or fewer competitions out of 20 against the preceding stint’s focal strain population. A fitness gain (color coded green) corresponds to winning 18 or more competitions out of 20. Neutral fitness outcomes are color coded yellow. Outcome counts are accumulated over experiments from stint 1 through stint 100; counts in each subfigure therefore sum to 100.
Column position in the facet grid indicates outcome with contemporary biotic background, row position indicates outcome with prefatory biotic background, and bar color and x position indicate outcome without biotic background. See Figure 5.5 for explanation of competition biotic backgrounds. See Figure 5.7 for detail on the joint distribution of outcomes with and without diversity maintenance, which were mostly identical.

To determine whether the presence of the background strain caused the overall reduction in whole-population deleterious outcomes, we performed control competitions under biotic background conditions, but with the focal strain population substituted for the background strain population (Supplementary Figure D.18). Under these conditions, nine of the stints where whole-population deleterious outcomes had been detected came up neutral and one, surprisingly, tested significantly adaptive (Supplementary Figure D.17). Dose-dependent fitness effects and/or reduced experimental sensitivity of the biotic background assay appear to play at least a partial role in explaining the reduction in detected whole-population deleterious outcomes. However, 10 stints still tested significantly deleterious with the control focal strain biotic background in addition to without biotic background.

Four stints do provide strong, direct evidence of a selective effect by the background strain: four whole-population outcomes that were deleterious without biotic background were significantly advantageous in the presence of both the prefatory and contemporary background strain populations (Figure 5.6c). All four of these stints exhibited whole-population deleterious outcomes under the control focal strain biotic background, indicating that the observed fitness sign change was specifically due to the presence of the background strain (Supplementary Figure D.17). Additionally, we detected two outcomes that were deleterious without biotic background but significantly adaptive under the prefatory biotic background and neutral under the contemporary biotic background (Figure 5.6c). Control focal strain biotic background experiments again suggest that the background strain, specifically, is responsible for this effect (Supplementary Figure D.17).

We also found one whole-population outcome that was significantly advantageous without biotic background and in the presence of the prefatory background strain population but significantly deleterious in the presence of the contemporary background strain, possibly suggesting an “arms race”-like evolutionary innovation on the part of the background strain over that stint (Figure 5.6c).

Nonetheless, we still saw three whole-population outcomes that were significantly deleterious under all three conditions (Figure 5.6c). These outcomes were also deleterious under the control focal strain biotic background experiments (Supplementary Figure D.17). Muller’s ratchet (Andersson and Hughes, 1996) or maladaptation due to environmental change (Brady et al., 2019) may provide possible explanations, but a definitive answer will require further study.

We also performed fitness assays on individual sampled specimens with both biotic backgrounds. Out of 100 stints tested, we observed 20 significantly deleterious outcomes without biotic background, 23 under prefatory biotic background, and 12 under contemporary biotic background (Figure 5.4). Unlike the whole-population deleterious outcomes discussed above, some deleterious outcomes among sampled specimens are not surprising. Evolving populations naturally contain standing variation in fitness (Martin and Roques, 2016), so occasional sampling of less-fit individuals should be expected.
Reciprocally, we observed 57 significantly adaptive outcomes without biotic background, 44 with prefatory biotic background, and 48 with contemporary biotic background (Figure 5.4).

Figure 5.7: Joint distribution of competition experiments performed under biotic background conditions with diversity maintenance enabled and disabled. Subfigure (a) shows prefatory biotic background outcomes with and without diversity maintenance; Subfigure (b) shows contemporary biotic background outcomes with and without diversity maintenance. Color coding denotes outcome without diversity maintenance and x position denotes outcome with diversity maintenance. Note that both plots show distributions for adaptation assays on representative specimens; competition experiments without diversity maintenance were not performed for population-level adaptation. See Figure 5.5 for explanation of competition biotic backgrounds.

Greater sensitivity of the “without biotic background” adaptation assay could account for the counterintuitive detection of more adaptive outcomes under abiotic conditions (i.e., in the absence of the background strain).

As before with the population-level adaptation assays, we detected four specimen outcomes that were deleterious without biotic background but significantly advantageous under both tested background strain populations (Figure 5.6a). Additionally, and again as before, we detected two outcomes that were deleterious without biotic background but significantly adaptive under the prefatory biotic background and neutral under the contemporary biotic background (Figure 5.6a). Control focal strain biotic background experiments confirm that the background strain, specifically, is responsible for these effects (Supplementary Figure D.17).

We found no specimen outcomes that were advantageous under the prefatory biotic background but deleterious under the contemporary background. However, we found three stints with the opposite dynamic: specimen outcomes deleterious under prefatory biotic background but advantageous under contemporary biotic background (Figure 5.6a), further suggesting coincident, interacting evolutionary innovations along the focal and background strain lineages.

To better characterize the mechanism behind fitness effects caused by the background strain, we performed additional specimen adaptation assays under biotic background conditions with diversity maintenance disabled. This analysis allowed us to test whether action of the diversity maintenance mechanism, rather than direct interactions between the focal and background strains, caused the observed fitness effects. Figure 5.7 compares adaptation assay outcomes with and without diversity maintenance under both the prefatory and contemporary biotic background conditions. Outcomes were generally similar; we observed only one sign-change difference: one specimen outcome was beneficial under prefatory biotic background conditions without diversity maintenance but deleterious with diversity maintenance.
Further, as shown in Figure 5.6b, without diversity maintenance we still observed four outcomes that were advantageous only under biotic conditions and instead tested deleterious under abiotic conditions. So, biotic selective effects cannot be explained as an artifact of activation of the diversity maintenance scheme. (We also conducted specimen adaptation assays with diversity maintenance disabled under the control focal strain biotic background; in these experiments, we again found no evidence of impact from the diversity maintenance scheme on results; Supplementary Figure D.17.)

Significant increases in fitness occur throughout the evolutionary history of the case study, but not at every stint. Figure 5.8 summarizes the outcomes of all adaptation assays stint by stint across evolutionary history. Neutral outcomes appear to occur more frequently at later stints. This may be indicative of slower evolutionary innovation, but may also stem in part from simulation of fewer generations during evolutionary stints (Supplementary Figure D.9) and during competition experiments (Supplementary Figure D.12) due to slower execution of later genomes.

Figure 5.9 shows the magnitudes of calculated fitness differentials for all adaptation assays. Fitness differentials during the first 40 stints are generally of higher magnitude than later fitness differentials, although a strong fitness differential occurs at stint 93. Although the emergence of morphology d was associated with significant increases in fitness in some specimen assays and morphologies e and g were associated with significant increases in fitness across all specimen assays (Figure 5.8), the magnitude of these fitness differentials appears ordinary compared to fitness differentials at other stints (Figure 5.9). Supplementary Figure D.13 shows mean end-competition prevalence across assays, telling a similar story.

In addition to competition assays, we also measured the growth rate of specimen strains by tracking doubling time (in updates) when seeded into quarter-full toroidal grids (Figure 5.10). Morph b exhibited a fast growth rate early on that was never matched by later morphs. This measure appears to be a poor overall proxy for fitness, highlighting the importance of biotic aspects of the simulation environment, which are not present in the empty space the assayed cells double into.

5.3.4 Fitness Complexity

Figure 5.11 plots critical fitness complexity of specimens drawn from across the case study’s evolutionary history. Critical fitness complexity reaches more than 20 under morph b, jumps to more than 40 under morph d, and drops to slightly more than 30 for morph e. It peaks at 48 sites around stint 39, then levels out and decreases. This decrease may be due in part to declining sensitivity of competition experiments: slower simulation meant that fewer updates executed within the fixed-duration jobs (Supplementary Figure D.10).

Figure 5.8: Summary of adaptation assay outcomes for the sampled representative specimen (top) and population-level adaptation (bottom).
Each cell is color coded by outcome: significant fitness gain (p < 0.005), neutral, or significant fitness loss (p < 0.005). Color coding and parentheticals of stint labels correspond to qualitative morph codes described in Table 5.1. See Figure 5.5 for explanation of competition biotic backgrounds.

Figure 5.9: Median calculated fitness differential outcomes of competition experiments. Zero fitness differential corresponds to a neutral result, color mapped to white. Blue indicates positive fitness differential (fitness gain) compared to the previous stint and red indicates negative fitness differential (fitness loss). Color coding and parentheticals of stint labels correspond to qualitative morph codes described in Table 5.1. Note that color intensity is plotted on a symlog scale due to the distribution of fitness differentials over multiple orders of magnitude. Upper panels show results for the sampled focal strain genome; lower panels show results for the entire focal strain population. See Figure 5.5 for explanation of competition biotic backgrounds.
Figure 5.10: Growth rate estimated from doubling time experiments, measuring time for a monoculture to grow from 0.25 maximum population size to 0.5 maximum population size.

Figure 5.11: Critical fitness complexity: the number of single-site nopouts that significantly decrease fitness, adjusted for expected false positives. Color coding and letters correspond to qualitative morph codes described in Table 5.1. Dotted vertical line denotes emergence of morph e. Dashed vertical line denotes emergence of morph g.

Phylogenetic analysis (Figure 5.3) suggests independent origins of the critical fitness complexity in morph d and morph e: the morph d specimen from stint 14 is more closely related to the morph b specimen from stint 13 than to the morph e specimen from stint 15. Likewise, specimens of the lower-complexity morphs i and b that appear past stint 70 appear to have independent evolutionary origins.

5.3.5 Interface Complexity

Figure 5.12 summarizes cardinal interface complexity, as well as its constituent components, for specimens drawn from across the case study’s evolutionary history.

Notably, cardinal interface complexity more than doubles, from 6 interactions to 17 interactions, coincident with the emergence of morph e (Figure 5.12a). This is due to simultaneous increases in extrospective state sensing (2 to 9 states; Figure 5.12f), introspective state sensing (1 to 4 states; Figure 5.12e), and writable state usage (1 to 2 states; Figure 5.12b). The emergence of morph g coincided with an increase in writable state interface complexity from 1 to 3, as shown in Figure 5.12b. However, morph g was not associated with changes in other aspects of cardinal interface complexity. The greatest observed cardinal interface complexity was 22 interactions, at stints 54 and 67.

5.3.6 Genome Size

Figure 5.13 shows evolutionary trajectories of three genome size metrics in sampled focal strain specimens. Instruction count and module count increased from 100 and 5 to around 800 and 30, respectively, between stints 0 and 40. Within this period, at stint 24, instruction count jumped from around 600 to more than 800 and module count jumped by about 5. This was coincident with detection in our adaptation assays of population-level sign-change mediation of adaptation by the background strain (Figure 5.8). In sampled specimen fitness assays at stint 24, we detected significant increases in fitness in the presence of the background strain but no significant change in fitness in its absence.

Between stints 40 and 90, module count gradually increased to around 40 while instruction count remained stable. Then, at stint 93, instruction count jumped to around 1,500 and module count jumped to around 60. This was coincident with the strong fitness differentials observed at stint 93 (Figure 5.9).

To better understand the functional effects of changes in genome size, we additionally measured the number of instructions that affected agent phenotype, shown as “phenotype complexity” in Figure 5.13c. This measure can be considered akin to a count of “active” sites. Phenotype complexity varied greatly stint to stint. The median value increased from nearly 0 to around 200 sites between stints 0 and 40.
Figure 5.12: Interface complexity estimates. (a) Cardinal interface complexity: the total number of distinct fitness-contributing interactions between a virtual CPU controlling cell behavior and its surroundings (the sum of panels b through f). (b) Writable state interface complexity: the number of output states that contribute to fitness (see Supplementary Figure D.3 for detail). (c) Intermessage interface complexity: the number of distinct inter-cell messages that contribute to fitness. (d) Intramessage interface complexity: the number of distinct intra-cell messages that contribute to fitness. (e) Introspective interface complexity: the number of states of a CPU’s own cell viewed in ways that contribute to fitness (see Supplementary Figure D.2 for detail). (f) Extrospective interface complexity: the number of states viewed in neighboring cells that contribute to fitness (see Supplementary Figure D.1 for detail). Color coding and letters correspond to qualitative morph codes described in Table 5.1. Dotted vertical line denotes emergence of morph e. Dashed vertical line denotes emergence of morph g.

Figure 5.13: Genome size of sampled focal strain specimens: (a) instruction count, (b) module count, and (c) phenotype complexity. Instruction count is the total number of instructions present in the genome. Module count is the number of tagged linear GP modules available for activation by signals from the environment, from other agents, or from within an agent. Phenotype complexity is the number of genome sites that contribute to phenotype, measured as the number of sites remaining after phenotype-neutral nopout (Section 5.2.3); it gives a sense of the number of “active” instructions that influence agents’ behavior. Color coding and letters correspond to qualitative morph codes described in Table 5.1. Dotted vertical line denotes emergence of morph e. Dashed vertical line denotes emergence of morph g.

Between stints 40 and 90, we observed phenotype complexity values ranging from less than 100 to more than 500. Morph g specimens appear to show particularly great variance in phenotype complexity. The first observed morph g specimen, at stint 45, exhibited relatively low phenotype complexity of around 100 active sites.
The highest phenotype complexity values, of around 700, were measured from three specimens of morphs e and g in the last ten stints.

5.4 Discussion

Throughout the case study lineage, we describe an evolutionary sequence of ten qualitatively distinct multicellular morphologies (Table 5.1). The emergence of some, but not all, of these morphologies coincided with an increase in fitness compared to the preceding population. Outcomes from the first observed morphology c specimen are significantly deleterious in all contexts. Likewise, morphology f, while advantageous in the absence of the background strain, appeared neutral in its presence (Figure 5.8). However, the geneses of morphologies e and g are associated with significant fitness gain in all contexts (Figure 5.8). This latter set of novelties might be described as “innovations,” which Hochberg et al. define as qualitative novelty associated with an increase in fitness (Hochberg et al., 2017). Interestingly, the magnitudes of the fitness differentials associated with the emergence of morphologies e and g do not appear to fall outside the bounds of other stint-to-stint fitness differentials (Figure 5.9).

The relationship between innovation and complexity also appears loose. The emergence of morphology d was accompanied by a spike in critical fitness complexity (from 25 sites at stint 13 to 43 sites at stint 14). However, the emergence of morphology i coincided with a loss of critical fitness complexity (from more than 30 sites to fewer than 10 sites). The specimen of morph i at stint 77, which phylogenetic analysis suggests may have independent trait origin from the specimen at stint 75, exhibited significant fitness gain across all contexts despite this sharp loss of complexity. Phylogenetic analysis suggests that morphology e was not a direct descendant of morphology d; so, the emergence of morphology e appears to have coincided with a more modest increase in fitness complexity, from 25 sites to 31 sites. Similarly, the emergence of morphology g, with 42 critical sites at stint 45, coincided with a relatively modest increase from 39 critical sites at stint 44.

We also see evidence that increases in complexity do not imply qualitative novelty in morphology. In Figure 5.11, we can observe notable increases in critical fitness complexity that did not coincide with apparent morphological innovation. For example, fitness complexity jumped from 11 sites at stint 11 to 27 sites at stint 12 while morphology b was retained. In addition, a more gradual increase in fitness complexity was observed from 27 sites at stint 16 to 46 sites at stint 36, all under consistent morphology e.

Finally, we also observed disjointedness between alternate measures of functional complexity. Notably, critical fitness complexity increased by 18 sites with the emergence of morph d, but interface complexity increased only marginally. The subsequently observed morph e had nearly triple the interface complexity of morph d (17 interactions vs. 6 interactions) but 12 sites lower critical fitness complexity. In addition, the gradual increase in critical fitness complexity between stints 15 and 36 under morphology e is not accompanied by a clear change in interface complexity (Figures 5.12a and 5.11). These apparent inconsistencies between metrics for functional complexity evidence the multidimensionality of this idea and underscore well-known difficulties in attempts to describe and quantify it (Böttcher, 2018).
5.5 Conclusion

Complexity and novelty are not inevitable outcomes of Darwinian evolution (Stanley, 2017). Instead, how and why some lineages within some model systems evolve complexity and novelty merits explanation. Efforts to develop substrates and conditions sufficient to observe the evolution of complexity and novelty play a crucial role in validating the sufficiency of theory. Additionally, the subsequent availability of complexity and novelty potential within experimental substrates enables work to test and refine theory. The artificial life research community has a rich track record to these ends.

The case study reported here tracks a lineage over two phenotypic innovations and several-fold increases in complexity. DISHTINY relaxes common simulation constraints (Goldsby et al., 2012, 2014), enabling broad genetic determination of multicellular life history and allowing for unconstrained cellular interactions between multicellular bodies. As such, this case study opens new windows into the evolutionary origins of complexity and novelty, especially with respect to biotic interactions.

Our case study exhibits loose coupling between novelty, complexity, and adaptation. We observe instances where novelty coincides with adaptation and instances where it does not. We observe instances where increases in complexity coincide with adaptation and instances where decreases in complexity coincide with adaptation. We observe instances where innovation coincides with spikes in complexity and instances where it does not. We even observe contradiction between metrics that measure different aspects of functional complexity. For example, the specimen sampled at stint 15 had nearly triple the interface complexity of the specimen sampled at stint 14 but lower critical fitness complexity. Loose coupling between the conceptual threads of novelty, complexity, and adaptation in this case study highlights the importance of considering these factors independently when developing open-ended evolution theory: direct coupling among them cannot be assumed.

Our observation of significant selective effects by the background strain suggests it may serve a crucial role in understanding the focal strain. Future work should characterize trajectories of adaptation, novelty, and complexity in this background strain. Additionally, the success of the biotic background in fleshing out our adaptation assays suggests that complexity measures could be improved through similar incorporation of the biotic background. It would be particularly interesting to measure the contribution of the background strain to complexity as the difference between complexity statistics with and without the biotic background. To more systematically test the role of biotic selection in facilitating the evolution of complexity, future experiments might test for differences in the rate of high-complexity evolutionary outcomes between evolution experiments with and without long-term coexistence between lineages (i.e., diversity maintenance mechanism enabled versus disabled).

This case study highlights the potential usefulness of toolbox-based approaches to analyzing open-ended evolution systems, in which an array of analyses are performed to distinguish disparate dimensions of open-endedness (Dolson et al., 2019). Our findings emphasize, in particular, the critical role of biotic context in such analyses. In future work, we are interested in further extending this toolbox.
One priority will be estimating epistatic contributions to fitness without resorting to all-pairs knockouts or other even more extensive assays. Such methodology will be crucial for systems where fitness is implicit and expensive to measure.

Chapter 6

Conclusion

Portions of this chapter are adapted from Moreno and Ofria (2020).

The complexity, novelty, and diversity found in the natural world continually inspire scientific curiosity about their origins, just as the ingenuity of natural adaptations spurs engineers to try to replicate their designs. In this dissertation, I have pushed forward parallel and distributed high-performance computing techniques and leveraged them to perform digital evolution experiments studying complexity, novelty, diversity, and adaptation in evolved multicells.

This chapter details the contributions of this dissertation, describes avenues for future research, and then provides some closing reflections.

6.1 Contribution

Part I of this dissertation developed algorithm engineering for computational scale-up of digital evolution experiments. In addition to proposed algorithms and reported experimental results, each chapter’s accompanying open source software library will directly enable real-world applications within the broader community. Although the methods and software in this part are tailored to digital evolution, we anticipate potential for other applications within distributed computing.

Chapter 2 implemented and tested a best-effort communication framework (Conduit) on commercially available high-performance computing hardware. Conduit’s median performance on several quality-of-service metrics remains stable in scaling experiments up to 256 processes. In separate experiments, I demonstrated how the best-effort approach can provide better-quality solutions to a graph coloring problem within a fixed time limit. At 64 processes, best-effort communication yielded a 2× faster update rate on a compute-intensive problem and a 7× faster update rate on a communication-intensive problem.

Chapter 3 presented the “hereditary stratigraphy” algorithm for phylogenetic analyses in decentralized, best-effort artificial life experiments. This approach supports tunable trade-offs between inference precision and annotation memory footprint. We derive several alternate asymptotic trade-offs and report strategies to attain each. Simulated reconstructions of phylogenies taken from real experiments demonstrate end-to-end viability of the approach, with up to 85% of original phylogenetic information recovered under reconstruction from 64-bit annotations.

Part II of this dissertation introduced DISHTINY, a new framework for experiments evolving digital multicells. Application of engineering techniques from Part I to DISHTINY yields efficient scalability, with scale-up from one to 64 processes incurring only 8% performance degradation. Experiments in these chapters survey evolved multicellular life histories within the system, using case studies to characterize complexity, novelty, adaptation, morphology, and mechanisms.

Chapter 4 described four qualitative life histories that arose across 40 DISHTINY evolutionary replicates. Phenotypic traits characteristic of multicellularity corroborate the occurrence of fraternal transitions in individuality across replicates.
Part II of this dissertation introduced DISHTINY, a new framework for experiments evolving digital multicells. Application of engineering techniques from Part I to DISHTINY yields efficient scalability, with scale-up from one to 64 processes incurring only 8% performance degradation. Experiments in this part survey evolved multicellular life histories within the system, using case studies to characterize complexity, novelty, adaptation, morphology, and mechanisms.

Chapter 4 described four qualitative life histories that arose across 40 DISHTINY evolutionary replicates. Phenotypic traits characteristic of multicellularity corroborate the occurrence of fraternal transitions in individuality across replicates. Observed traits include reproductive division of labor, resource sharing within kin groups, resource investment in offspring groups, asymmetrical behaviors mediated by messaging, morphological patterning, and adaptive apoptosis. These findings validate simulation design, confirming sufficiency of agent implementation and selective pressures to produce diverse multicellular traits. This work also builds baseline intuition for DISHTINY life histories, providing a foundation for further work with the system.

Chapter 5 tracks the evolution of novelty, complexity, and adaptation along a case study lineage. Ten qualitatively distinct multicellular morphologies occurred along this lineage, several of which exhibited asymmetrical growth or distinct life stages. This chapter develops and applies a suite of adaptation and complexity measures. These include competition experiments under various background conditions, a doubling time assay, knockout competitions to count active genome sites, knockout competitions to count adaptive genome sites, and decontextualization competitions to count distinct adaptive environmental interactions. Measures of novelty, complexity, and adaptation trace loosely coupled, sometimes divergent, trajectories along the case study lineage. This result reinforces the paradigm shift away from reductively distilling these phenomena into common symptoms of an implicit, underlying evolutionary "progress." Additionally, adaptation assays indicate significant biotic selection effects, raising questions about the role of co-evolution in this strain's evolutionary history.

6.2 Future Work

Important work remains across the breadth of topics explored in this dissertation. This section briefly highlights several pertinent open questions and unsolved problems.

Decentralized methods for diversity maintenance remain an open question. Although diversity maintenance can readily be performed on a per-process basis using a finite resource model, how to generalize this approach to a distributed context remains unclear. (The current per-process approach may prove insufficient if smaller per-process cell counts or larger group sizes reduce the number of multicells occupying a single process too far.) Perhaps, in addition to enabling post-hoc analyses, hereditary stratigraph annotations could guide phylogeny-aware interventions to maintain diversity during simulation.

Sexual recombination plays a central role in natural history (Smith and Szathmary, 1997) and genetic programming (Poli et al., 2008). Incorporating sexual recombination into DISHTINY could enable digital evolution experiments probing the intersection between fraternal transitions in individuality, the evolution of sex, and the evolution of complexity. However, work has yet to be performed on sexual recombination with event-driven genetic programming encodings. It will be of particular interest to determine whether such encodings' distinct, tagged modules provide an effective basis for semantic crossover, as sketched below. Further, in contrast to many natural systems, genetic programming work overwhelmingly employs monoploid (rather than polyploid) genomic structure. This approach avoids the difficulty of integrating co-execution of two separate programs into a single phenotype profile. Tagged modules, however, could support co-expression among multiple alleles of the same gene, potentially enabling more effective recombination and more salient digital evolution models for research on the evolution of sex.
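As a sketch of what tag-mediated alignment might look like, the hedged example below represents each parent program as a list of (tag, body) modules, pairs modules across parents by best tag match, and swaps aligned module bodies rather than splicing at arbitrary positions. The representation and match threshold are hypothetical illustrations, not SignalGP's actual encoding.

```python
import random

def tag_similarity(tag_a, tag_b, width=16):
    """Fraction of matching bits between two width-bit integer tags."""
    mismatches = bin((tag_a ^ tag_b) & ((1 << width) - 1)).count("1")
    return 1.0 - mismatches / width

def tag_aligned_crossover(parent_a, parent_b, threshold=0.75):
    """Recombine module lists [(tag, body), ...] by best tag match."""
    child = []
    for tag_a, body_a in parent_a:
        # Find the homolog in the other parent with the most similar tag.
        best_tag, best_body = max(
            parent_b, key=lambda module: tag_similarity(tag_a, module[0])
        )
        if tag_similarity(tag_a, best_tag) >= threshold and random.random() < 0.5:
            # Adopt the homologous body; keep the child's own tag so that
            # existing tag-based references to this module stay intact.
            child.append((tag_a, best_body))
        else:
            child.append((tag_a, body_a))
    return child
```

Because swaps occur only between modules with closely matched tags, recombination tends to exchange functionally analogous code, which is the intuition behind expecting tagged modules to support semantic crossover.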
Sexual recombination also constitutes important unexplored territory for distributed phylogenetic inference on digital evolution agents. As presented in Chapter 3, hereditary stratigraphy assumes asexual lineages. One possible strategy for generalizing this methodology to sexual lineages would be applying annotations to individual genome sites to track independent gene trees. Another possibility would be to apply a gene drive mechanism to annotations so that a single consensus differentia coalesces within each stratum. This would distinguish genetically isolated subpopulations, providing a basis for species tree reconstruction.
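A minimal sketch of the gene drive idea follows, under the assumption that both parents share a deterministic retention policy and therefore identical sets of retained generations; this is a hypothetical mechanism for illustration, not an implemented hstrat feature. At each stratum the child inherits the smaller of its parents' differentiae, so the minimum value "drives" toward fixation within any interbreeding subpopulation.

```python
def gene_drive_recombine(parent_a, parent_b):
    """Recombine two annotations, given as lists of (generation, differentia).

    Taking the elementwise minimum differentia means the smallest value at
    each stratum sweeps through an interbreeding population, converging on a
    single consensus differentia per stratum within that population.
    """
    # A shared deterministic retention policy implies aligned generations.
    assert [gen for gen, _ in parent_a] == [gen for gen, _ in parent_b]
    return [
        (gen, min(diff_a, diff_b))
        for (gen, diff_a), (_, diff_b) in zip(parent_a, parent_b)
    ]
```

Once consensus differentiae have fixed, mismatches between subpopulations' consensus values mark genetic isolation, supplying the signal for the species tree reconstruction described above.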
Direct efforts to evolve emergent multicellular functionality should also be considered. Multicellular motility could be selected for by scaling resource collection rate with distance from the site where a group originated. More sophisticated inter- and intra-group interactions could be selected for by introducing discrete tokens with resource value that differs among cells or among groups (perhaps determined via a hash of cell or group ID and token ID). Such efforts could extend to selecting for multicells that solve simple pattern detection tasks. This goal would require careful consideration of how to "wire" input/output controls into multicell collectives and how to make problem instances available on demand to multicells in a distributed setting, but could have powerful applications. The ability to show multicells outperforming individual unicells, or even collections of unicells, on such problems would be an exciting result. Past work exploring the introduction of neuron-like cell-cell interconnects into the DISHTINY model could serve as a stepping stone toward these objectives (Moreno and Ofria, 2020).

6.3 Closing Remarks

Above all, this dissertation pursues larger-scale, more dynamic digital evolution models. This requires reconciliation of orthogonal, perhaps even somewhat conflicting, aims: engineering for efficient scalability and relaxation of programmed-in model constraints on multicells. However, the artificial life ethos of "life as it could be" furnishes a uniquely pliant testbed for approaches to distributed computing that radically depart from established practice (Forbes, 2000). Best-effort approaches explored first in this context could prove useful in broader realms of high-performance computing, particularly hard real-time and machine learning applications (Rhodes et al., 2019).

We are excited to see the dawning adoption of high-performance computing hardware advance the fecundity of open-ended artificial life models. In conjunction with progress in theory, paleontology, and laboratory-based experiments, such work will play an instrumental role in fleshing out our account of natural history. Indeed, many fundamental questions remain to be addressed, particularly notable among them the likely multifaceted and interconnected mechanisms shaping biological complexity. Artificial life systems, in particular, will be increasingly well-positioned to untangle the origins of biological complexity in relation to fitness, genetic drift over elapsed evolutionary time, mutational load, genetic recombination (sex and horizontal gene transfer), ecology, historical contingency, and key innovations. Such insight can make practical, real-world impact: understanding evolution helps us predict and influence it (e.g., managing natural ecosystems, mitigating antimicrobial resistance) as well as harness it for automated design through evolutionary algorithms.

That the small sampling of experiments reported here yielded a wide variety of evolved behaviors and individual life histories lends credence to the notion that natural history's breadth is not surprising so much as it is inevitable. Further research teasing apart the constructive potential inherent in major evolutionary transitions promises better capability to shape them and to produce computing systems that reflect the capability and robustness of natural organisms.

BIBLIOGRAPHY

Ackley, D. and Small, T. (2014). Indefinitely scalable computing = artificial life engineering. In ALIFE 14: The Fourteenth International Conference on the Synthesis and Simulation of Living Systems, pages 606–613. MIT Press.

Ackley, D. H. (2018). Digital protocells with dynamic size, position, and topology. In ALIFE 2018: The 2018 Conference on Artificial Life, pages 83–90. MIT Press.

Ackley, D. H. (2019). Building a survivable protocell for a corrosive digital environment. In ALIFE 2019: The 2019 Conference on Artificial Life, pages 111–118. MIT Press.

Ackley, D. H. and Cannon, D. C. (2011). Pursue robust indefinite scalability. In Proceedings of the 13th USENIX Conference on Hot Topics in Operating Systems, HotOS'13, page 8, USA. USENIX Association.

Ackley, D. H. and Williams, L. R. (2011). Homeostatic architectures for robust spatial computing. In 2011 Fifth IEEE Conference on Self-Adaptive and Self-Organizing Systems Workshops, pages 91–96. IEEE.

Acun, B., Gupta, A., Jain, N., Langer, A., Menon, H., Mikida, E., Ni, X., Robson, M., Sun, Y., Totoni, E., et al. (2014). Parallel programming with migratable objects: Charm++ in practice. In SC'14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 647–658. IEEE.

Adami, C., Ofria, C., and Collier, T. C. (2000). Evolution of biological complexity. Proceedings of the National Academy of Sciences, 97(9):4463–4468.

Aktaş, M. F. and Soljanin, E. (2019). Straggler mitigation at scale. IEEE/ACM Transactions on Networking, 27(6):2266–2279.

Andersson, D. I. and Hughes, D. (1996). Muller's ratchet decreases fitness of a dna-based microbe. Proceedings of the National Academy of Sciences, 93(2):906–907.

Arnellos, A. and Keijzer, F. (2019). Bodily complexity: Integrated multicellular organizations for contraction-based motility. Frontiers in Physiology, 10.

Baig, U. I., Bhadbhade, B. J., and Watve, M. G. (2014). Evolution of aging and death: what insights bacteria can provide. The Quarterly Review of Biology, 89(3):209–233.

Banzhaf, W., Baumgaertner, B., Beslon, G., Doursat, R., Foster, J. A., McMullin, B., De Melo, V. V., Miconi, T., Spector, L., Stepney, S., and White, R. (2016). Defining and simulating open-ended novelty: Requirements, guidelines, and challenges. Theory in Biosciences, 135(3):131–161.

Bauer, M., Treichler, S., Slaughter, E., and Aiken, A. (2012). Legion: Expressing locality and independence with logical regions. In SC'12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, pages 1–11. IEEE.

Bedau, M. A., Snyder, E., and Packard, N. H. (1998). A classification of long-term evolutionary dynamics. In Artificial Life VI: Proceedings of the Sixth International Conference on Artificial Life, pages 228–237. MIT Press.
Benenson, Y. (2009). Biocomputers: from test tubes to live cells. Molecular BioSystems, 5(7):675–685.

Bennett III, F. H., Koza, J. R., Shipman, J., and Stiffelman, O. (1999). Building a parallel computer system for $18,000 that performs a half peta-flop per day. In Proceedings of the 1st Annual Conference on Genetic and Evolutionary Computation - Volume 2, pages 1484–1490.

Biswas, R., Bryson, D., Ofria, C., and Wagner, A. (2014). Causes vs benefits in the evolution of prey grouping. In ALIFE 14: The Fourteenth International Conference on the Synthesis and Simulation of Living Systems, pages 641–648. MIT Press.

Blondeau, A., Cheyer, A., Hodjat, B., and Harrigan, P. (2009). Distributed network for performing complex algorithms. US Patent App. 12/267,287.

Blumofe, R. D., Joerg, C. F., Kuszmaul, B. C., Leiserson, C. E., Randall, K. H., and Zhou, Y. (1996). Cilk: An efficient multithreaded runtime system. Journal of Parallel and Distributed Computing, 37(1):55–69.

Bocquet, M., Hirtzlin, T., Klein, J.-O., Nowak, E., Vianello, E., Portal, J.-M., and Querlioz, D. (2018). In-memory and error-immune differential rram implementation of binarized deep neural networks. In 2018 IEEE International Electron Devices Meeting (IEDM), pages 20–6. IEEE.

Bohm, C., G., N. C., and Hintze, A. (2017). MABE (Modular Agent Based Evolver): A framework for digital evolution research. In ECAL 2017, the Fourteenth European Conference on Artificial Life, pages 76–83.

Bonabeau, E. W. and Theraulaz, G. (1994). Why do we need artificial life? Artificial Life, 1(3):303–325.

Bonnet, J., Yin, P., Ortiz, M. E., Subsoontorn, P., and Endy, D. (2013). Amplifying genetic logic gates. Science, 340(6132):599–603.

Bostock, M., Ogievetsky, V., and Heer, J. (2011). D3 data-driven documents. IEEE Transactions on Visualization and Computer Graphics, 17(12):2301–2309.

Böttcher, T. (2018). From molecules to life: quantifying the complexity of chemical and biological systems in the universe. Journal of Molecular Evolution, 86(1):1–10.

Brady, S. P., Bolnick, D. I., Angert, A. L., Gonzalez, A., Barrett, R. D., Crispo, E., Derry, A. M., Eckert, C. G., Fraser, D. J., Fussmann, G. F., et al. (2019). Causes of maladaptation. Evolutionary Applications, 12(7):1229–1242.

Bundy, J., Ofria, C., and Lenski, R. E. (2021). How the footprint of history shapes the evolution of digital organisms. bioRxiv.

Byna, S., Meng, J., Raghunathan, A., Chakradhar, S., and Cadambi, S. (2010). Best-effort semantic document search on gpus. In Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics Processing Units, pages 86–93.

Cantú-Paz, E. (2001). Master-slave parallel genetic algorithms. In Efficient and Accurate Parallel Genetic Algorithms, pages 33–48. Springer.

Cardwell, D. and Song, F. (2019). An extended roofline model with communication-awareness for distributed-memory hpc systems. In Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, pages 26–35.

Casci, T. (2008). Lining up is hard to do. Nature Reviews Genetics, 9(8):573–573.

Chakradhar, S. T. and Raghunathan, A. (2010). Best-effort computing: Re-thinking parallel software and hardware. In Design Automation Conference, pages 865–870. IEEE.

Chakrapani, L. N., Korkmaz, P., Akgul, B. E., and Palem, K. V. (2008). Probabilistic system-on-a-chip architectures. ACM Transactions on Design Automation of Electronic Systems (TODAES), 12(3):1–28.
Chakravorty, S. and Kale, L. V. (2004). A fault tolerant protocol for massively parallel systems. In 18th International Parallel and Distributed Processing Symposium, page 212. IEEE.

Chakravorty, S. and Kalé, L. V. (2007). A fault tolerance protocol with fast fault recovery. In 2007 IEEE International Parallel and Distributed Processing Symposium, pages 1–10. IEEE.

Chamberlain, B. L., Callahan, D., and Zima, H. P. (2007). Parallel programmability and the chapel language. The International Journal of High Performance Computing Applications, 21(3):291–312.

Channon, A. (2019). Maximum individual complexity is indefinitely scalable in geb. Artificial Life, 25(2):134–144.

Che, S., Li, J., Sheaffer, J. W., Skadron, K., and Lach, J. (2008). Accelerating compute-intensive applications with gpus and fpgas. In 2008 Symposium on Application Specific Processors, pages 101–107. IEEE.

Cheney, N., MacCurdy, R., Clune, J., and Lipson, H. (2014). Unshackling evolution: evolving soft robots with multiple materials and a powerful generative encoding. ACM SIGEVOlution, 7(1):11–23.

Chippa, V. K., Mohapatra, D., Roy, K., Chakradhar, S. T., and Raghunathan, A. (2014). Scalable effort hardware design. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 22(9):2004–2016.

Cho, H., Leem, L., and Mitra, S. (2012). Ersa: Error resilient system architecture for probabilistic applications. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 31(4):546–558.

Clune, J., Ofria, C., and Pennock, R. T. (2007). Investigating the emergence of phenotypic plasticity in evolving digital organisms. In Costa, F. A. e., Rocha, L. M., Costa, E., Harvey, I., and Coutinho, A., editors, Proceedings of the 9th European Conference on Advances in Artificial Life, pages 74–83, Berlin, Heidelberg. Springer-Verlag.

Clune, J., Stanley, K. O., Pennock, R. T., and Ofria, C. (2011). On the performance of indirect encoding across the continuum of regularity. IEEE Transactions on Evolutionary Computation, 15(3):346–367.

Cock, P. J., Antao, T., Chang, J. T., Chapman, B. A., Cox, C. J., Dalke, A., Friedberg, I., Hamelryck, T., Kauff, F., Wilczynski, B., et al. (2009). Biopython: freely available python tools for computational molecular biology and bioinformatics. Bioinformatics, 25(11):1422–1423.

Covert, A. W., Lenski, R. E., Wilke, C. O., and Ofria, C. (2013). Experiments on the role of deleterious mutations as stepping stones in adaptive evolution. Proceedings of the National Academy of Sciences, 110(34):E3171–E3178.

Dean, J., Corrado, G., Monga, R., Chen, K., Devin, M., Mao, M., Ranzato, M., Senior, A., Tucker, P., Yang, K., Le, Q., and Ng, A. (2012). Large scale distributed deep networks. In Pereira, F., Burges, C. J. C., Bottou, L., and Weinberger, K. Q., editors, Advances in Neural Information Processing Systems, volume 25. Curran Associates, Inc.

Dolson, E., Banzhaf, W., and Ofria, C. (2018). Applying ecological principles to genetic programming. In Genetic Programming Theory and Practice XV, pages 73–88. Springer.

Dolson, E., Lalejini, A., Jorgensen, S., and Ofria, C. (2020). Interpreting the tape of life: Ancestry-based analyses provide insights and intuition about evolutionary dynamics. Artificial Life, 26(1):58–79.

Dolson, E. and Ofria, C. (2017). Spatial resource heterogeneity creates local hotspots of evolutionary potential. In ECAL 2017, the Fourteenth European Conference on Artificial Life, pages 122–129.
Dolson, E. and Ofria, C. (2018). Ecological theory provides insights about evolutionary computation. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, pages 105–106.

Dolson, E. and Ofria, C. (2021). Digital evolution for ecology research: A review. Frontiers in Ecology and Evolution, 9.

Dolson, E. L. (2019). On the Constructive Power of Ecology in Open-Ended Evolving Systems. PhD thesis, Michigan State University.

Dolson, E. L., Vostinar, A. E., Wiser, M. J., and Ofria, C. (2019). The modes toolbox: Measurements of open-ended dynamics in evolving systems. Artificial Life, 25(1):50–73.

Dongarra, J., Hittinger, J., Bell, J., Chacon, L., Falgout, R., Heroux, M., Hovland, P., Ng, E., Webster, C., and Wild, S. (2014). Applied mathematics research for exascale computing. Technical report, Lawrence Livermore National Laboratory (LLNL).

Downing, K. L. (2015). Intelligence emerging: adaptivity and search in evolving neural systems. MIT Press.

Eiben, A. and Smith, J. (2015). Introduction to Evolutionary Computing. Springer, Berlin.

El-Ghazawi, T. and Smith, L. (2006). Upc: Unified parallel c. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, SC '06, page 27–es, New York, NY, USA. Association for Computing Machinery.

Ellenbogen, J. C. and Love, J. C. (2000). Architectures for molecular electronic computers. i. logic structures and an adder designed from molecular electronic diodes. Proceedings of the IEEE, 88(3):386–426.

Forbes, N. (2000). Life as it could be: Alife attempts to simulate evolution. IEEE Intelligent Systems and their Applications, 15(6):2–7.

Fortuna, M. A., Barbour, M. A., Zaman, L., Hall, A. R., Buckling, A., and Bascompte, J. (2019). Coevolutionary dynamics shape the structure of bacteria-phage infection networks. Evolution, 73(5):1001–1011.

Foster, E. D. and Deardorff, A. (2017). Open science framework (osf). Journal of the Medical Library Association, 105(2):203.

Gagliardi, F., Moreto, M., Olivieri, M., and Valero, M. (2019). The international race towards exascale in europe. CCF Transactions on High Performance Computing, pages 1–11.

Gamell, M., Teranishi, K., Heroux, M. A., Mayo, J., Kolla, H., Chen, J., and Parashar, M. (2015). Local recovery and failure masking for stencil-based applications at extreme scales. In SC'15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1–12. IEEE.

Geladi, P. and Kowalski, B. R. (1986). Partial least-squares regression: a tutorial. Analytica Chimica Acta, 185:1–17.

Gerhart, J. and Kirschner, M. (2007). The theory of facilitated variation. Proceedings of the National Academy of Sciences, 104(suppl 1):8582–8589.

Gilbert, D. (2015). Artificial intelligence is here to help you pick the right shoes. International Business Times.

Goings, S., Goldsby, H., Cheng, B. H., and Ofria, C. (2012). An ecology-based evolutionary algorithm to evolve solutions to complex problems. In ALIFE 2012: The Thirteenth International Conference on the Synthesis and Simulation of Living Systems, pages 171–177. MIT Press.

Goldberg, D. E., Richardson, J., et al. (1987). Genetic algorithms with sharing for multimodal function optimization. In Genetic Algorithms and their Applications: Proceedings of the Second International Conference on Genetic Algorithms, pages 41–49. Hillsdale, NJ: Lawrence Erlbaum.

Goldsby, H., Kerr, B., and Ofria, C. (2020). Major transitions in digital evolution. In Evolution in Action: Past, Present and Future, pages 333–347. Springer.
Goldsby, H. J., Dornhaus, A., Kerr, B., and Ofria, C. (2012). Task-switching costs promote the evolution of division of labor and shifts in individuality. Proceedings of the National Academy of Sciences, 109(34):13686–13691.

Goldsby, H. J., Knoester, D. B., and Ofria, C. (2010). Evolution of division of labor in genetically homogenous groups. In Pelikan, M. and Branke, J., editors, Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation, pages 135–142, New York, NY. ACM.

Goldsby, H. J., Knoester, D. B., Ofria, C., and Kerr, B. (2014). The evolutionary origin of somatic cells under the dirty work hypothesis. PLOS Biology, 12(5):1–11.

Goldsby, H. J., Young, R. L., Hofmann, H. A., and Hintze, A. (2017). Increasing the complexity of solutions produced by an evolutionary developmental system. In Pelikan, M. and Branke, J., editors, Proceedings of the Genetic and Evolutionary Computation Conference Companion, pages 57–58, New York, NY. ACM.

Goldsby, H. J., Young, R. L., Schossau, J., Hofmann, H. A., and Hintze, A. (2018). Serendipitous scaffolding to improve a genetic algorithm's speed and quality. In Proceedings of the Genetic and Evolutionary Computation Conference, GECCO '18, pages 959–966, New York, NY, USA. Association for Computing Machinery.

Good, B. H., McDonald, M. J., Barrick, J. E., Lenski, R. E., and Desai, M. M. (2017). The dynamics of molecular evolution over 60,000 generations. Nature, 551(7678):45–50.

Grabowski, L. M., Bryson, D. M., Dyer, F. C., Ofria, C., and Pennock, R. T. (2010). Early evolution of memory usage in digital organisms. In Fellermann, H., Dörr, M., Hanczyc, M. M., Laursen, L. L., Maurer, S. E., Merkle, D., Monnard, P., Støy, K., and Rasmussen, S., editors, Artificial Life XII: Proceedings of the Twelfth International Conference on the Synthesis and Simulation of Living Systems, pages 224–231, Odense, Denmark. MIT Press.

Grabowski, L. M., Bryson, D. M., Dyer, F. C., Pennock, R. T., and Ofria, C. (2013). A case study of the de novo evolution of a complex odometric behavior in digital organisms. PLOS ONE, 8(4):1–10.

Gropp, W., Lusk, E., Doss, N., and Skjellum, A. (1996). A high-performance, portable implementation of the mpi message passing interface standard. Parallel Computing, 22(6):789–828.

Gropp, W. and Snir, M. (2013). Programming for exascale computers. Computing in Science & Engineering, 15(6):27–35.

Grosberg, R. K. and Strathmann, R. R. (2007). The evolution of multicellularity: A minor major transition? Annual Review of Ecology, Evolution, and Systematics, 38(1):621–654.

Gu, R. and Becchi, M. (2019). A comparative study of parallel programming frameworks for distributed gpu applications. In Proceedings of the 16th ACM International Conference on Computing Frontiers, pages 268–273.

Gulli, J. G., Herron, M. D., and Ratcliff, W. C. (2019). Evolution of altruistic cooperation among nascent multicellular organisms. Evolution, 73(5):1012–1024.

Hagstrom, G. I., Hang, D. H., Ofria, C., and Torng, E. (2004). Using avida to test the effects of natural selection on phylogenetic reconstruction methods. Artificial Life, 10(2):157–166.

Hanschen, E. R., Shelton, D. E., and Michod, R. E. (2015). Evolutionary transitions in individuality and recent models of multicellularity. In Ruiz-Trillo, I. and Nedelcu, A. M., editors, Evolutionary Transitions to Multicellular Life, pages 165–188. Springer, Dordrecht, Netherlands.

Harding, S. and Banzhaf, W. (2007a). Fast genetic programming and artificial developmental systems on gpus. In 21st International Symposium on High Performance Computing Systems and Applications (HPCS'07), pages 1–7. IEEE.
Harding, S. and Banzhaf, W. (2007b). Fast genetic programming on gpus. In European Conference on Genetic Programming, pages 90–101. Springer.

Heinemann, C. (2008). Artificial life environment. Informatik-Spektrum, 31(1):55–61.

Helmuth, T., Spector, L., and Matheson, J. (2014). Solving uncompromising problems with lexicase selection. IEEE Transactions on Evolutionary Computation, 19(5):630–643.

Hennessy, J. L. and Patterson, D. A. (2011). Computer architecture: a quantitative approach. Elsevier.

Hernandez, J. G., Lalejini, A., and Dolson, E. (2022). What Can Phylogenetic Metrics Tell us About Useful Diversity in Evolutionary Algorithms? In Banzhaf, W., Trujillo, L., Winkler, S., and Worzel, B., editors, Genetic Programming Theory and Practice XVIII, pages 63–82. Springer Singapore, Singapore.

Heroux, M. A. (2014). Toward resilient algorithms and applications. arXiv preprint arXiv:1402.3809.

Hochberg, M. E., Marquet, P. A., Boyd, R., and Wagner, A. (2017). Innovation: an emerging focus from cells to societies. Philosophical Transactions of the Royal Society B: Biological Sciences, 372(1735):20160414.

Hodjat, B. and Shahrzad, H. (2013). Distributed evolutionary algorithm for asset management and trading. US Patent 8,527,433.

Hornby, G., Globus, A., Linden, D., and Lohn, J. (2006). Automated antenna design with evolutionary algorithms. In Space 2006, page 7242. American Institute of Aeronautics and Astronautics.

Hornby, G. S. (2005). Measuring, enabling and comparing modularity, regularity and hierarchy in evolutionary design. In Proceedings of the 7th Annual Conference on Genetic and Evolutionary Computation, pages 1729–1736.

Horwitz, R. and Webb, D. (2003). Cell migration. Current Biology, 13(19):R756–R759.

Huizinga, J., Stanley, K. O., and Clune, J. (2018). The emergence of canalization and evolvability in an open-ended, interactive evolutionary system. Artificial Life, 24(3):157–181.

Hunter, J. D. (2007). Matplotlib: A 2d graphics environment. Computing in Science & Engineering, 9(3):90–95.

Hursey, J., Squyres, J. M., Mattox, T. I., and Lumsdaine, A. (2007). The design and implementation of checkpoint/restart process fault tolerance for open mpi. In 2007 IEEE International Parallel and Distributed Processing Symposium, pages 1–8. IEEE.

Izzo, D., Rucinski, M., and Ampatzis, C. (2009). Parallel global optimisation meta-heuristics using an asynchronous island-model. In 2009 IEEE Congress on Evolutionary Computation, pages 2301–2308. IEEE.

Jones, J. E., Le Sage, V., Padovani, G. H., Calderon, M., Wright, E. S., and Lakdawala, S. S. (2021). Parallel evolution between genomic segments of seasonal human influenza viruses reveals rna-rna relationships. eLife, 10:e66525.

Jouppi, N. P., Young, C., Patil, N., Patterson, D., Agrawal, G., Bajwa, R., Bates, S., Bhatia, S., Boden, N., Borchers, A., et al. (2017). In-datacenter performance analysis of a tensor processing unit. In Proceedings of the 44th Annual International Symposium on Computer Architecture, pages 1–12.

Kajmakovic, A., Diwold, K., Kajtazovic, N., and Zupanc, R. (2020). Challenges in mitigating soft errors in safety-critical systems with cots microprocessors. In PESARO 2020, The Tenth International Conference on Performance, Safety and Robustness in Complex Systems and Applications, pages 13–18. IARIA.

Kale, L. V. and Krishnan, S. (1993). Charm++: a portable concurrent object oriented system based on c++. In Proceedings of the Eighth Annual Conference on Object-oriented Programming Systems, Languages, and Applications, pages 91–108.
Kapli, P., Yang, Z., and Telford, M. J. (2020). Phylogenetic tree building in the genomic age. Nature Reviews Genetics, 21(7):428–444.

Karakus, M. and Durresi, A. (2017). Quality of service (qos) in software defined networking (sdn): A survey. Journal of Network and Computer Applications, 80:200–218.

Karnik, T. and Hazucha, P. (2004). Characterization of soft errors caused by single event upsets in cmos processes. IEEE Transactions on Dependable and Secure Computing, 1(2):128–143.

Kashyap, V. (2006). Ip over infiniband (ipoib) architecture. The Internet Society, 22.

Kauffman, S. A. and Weinberger, E. D. (1989). The nk model of rugged fitness landscapes and its application to maturation of the immune response. Journal of Theoretical Biology, 141(2):211–245.

Kim, J.-S., Ha, S., and Jhon, C. S. (1998). Relaxed barrier synchronization for the bsp model of computation on message-passing architectures. Information Processing Letters, 66(5):247–253.

Kirschner, M. and Gerhart, J. (1998). Evolvability. Proceedings of the National Academy of Sciences, 95(15):8420–8427.

Knoll, A. H. (2011). The multiple origins of complex multicellularity. Annual Review of Earth and Planetary Sciences, 39(1):217–239.

Koenker, R. and Hallock, K. F. (2001). Quantile regression. Journal of Economic Perspectives, 15(4):143–156.

Konstantopoulos, S., Li, W., Miller, S., and van der Ploeg, A. (2019). Using quantile regression to estimate intervention effects beyond the mean. Educational and Psychological Measurement, 79(5):883–910.

Koop, M. J., Sur, S., Gao, Q., and Panda, D. K. (2007). High performance mpi design using unreliable datagram for ultra-scale infiniband clusters. In Proceedings of the 21st Annual International Conference on Supercomputing, pages 180–189.

Koschwanez, J. H., Foster, K. R., and Murray, A. W. (2013). Improved use of a public good selects for the evolution of undifferentiated multicellularity. eLife, 2:e00367.

Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25:1097–1105.

LaBar, T. and Adami, C. (2017). Evolution of drift robustness in small populations. Nature Communications, 8(1):1–12.

Lack, J. B. and Van Den Bussche, R. A. (2010). Identifying the confounding factors in resolving phylogenetic relationships in vespertilionidae. Journal of Mammalogy, 91(6):1435–1448.

Lalejini, A., Dolson, E., Bohm, C., Ferguson, A. J., Parsons, D. P., Rainford, P. F., Richmond, P., and Ofria, C. (2019). Data standards for artificial life software. In ALIFE 2019: The 2019 Conference on Artificial Life, pages 507–514. MIT Press.

Lalejini, A., Moreno, M. A., and Ofria, C. (2020). Case study of adaptive gene regulation in dishtiny. Preprint via Open Science Framework at https://osf.io/kqvmn.

Lalejini, A., Moreno, M. A., and Ofria, C. (2021). Tag-based regulation of modules in genetic programming improves context-dependent problem solving. Genetic Programming and Evolvable Machines, 22(3):325–355.

Lalejini, A. and Ofria, C. (2016). The evolutionary origins of phenotypic plasticity. In Gershenson, C., Froese, T., Siqueiros, J. M., Aguilar, W., Izquierdo, E., and Sayama, H., editors, Proceedings of the Artificial Life Conference 2016, pages 372–379, Cambridge, MA. MIT Press.
Lalejini, A. and Ofria, C. (2018). Evolving event-driven programs with signalgp. In Proceedings of the Genetic and Evolutionary Computation Conference, pages 1135–1142.

Langdon, W. B. and Banzhaf, W. (2019). Continuous long-term evolution of genetic programming. In ALIFE 2019: The 2019 Conference on Artificial Life, pages 388–395. MIT Press.

Lehman, J. (2012). Evolution through the Search for Novelty. PhD thesis, University of Central Florida.

Lehman, J. and Stanley, K. O. (2011). Abandoning objectives: Evolution through the search for novelty alone. Evolutionary Computation, 19(2):189–223.

Lehman, J. and Stanley, K. O. (2012). Beyond open-endedness: Quantifying impressiveness. In ALIFE 2012: The Thirteenth International Conference on the Synthesis and Simulation of Living Systems, pages 75–82. MIT Press.

Lehman, J. and Stanley, K. O. (2013). Evolvability is inevitable: Increasing evolvability without the pressure to adapt. PloS One, 8(4):e62186.

Leith, D. J., Clifford, P., Badarla, V., and Malone, D. (2012). Wlan channel selection without communication. Computer Networks, 56(4):1424–1441.

Lenski, R. E., Ofria, C., Pennock, R. T., and Adami, C. (2003). The evolutionary origin of complex features. Nature, 423(6936):139–144.

Liard, V., Parsons, D., Rouzaud-Cornabas, J., and Beslon, G. (2018). The complexity ratchet: Stronger than selection, weaker than robustness. In ALIFE 2018: The 2018 Conference on Artificial Life, pages 250–257, Tokyo, Japan.

Libby, E. and Ratcliff, W. C. (2014). Ratcheting the evolution of multicellularity. Science, 346(6208):426–427.

Lipson, H. et al. (2007). Principles of modularity, regularity, and hierarchy for scalable systems. Journal of Biological Physics and Chemistry, 7(4):125–128.

Lynch, M. (2007). The frailty of adaptive hypotheses for the origins of organismal complexity. Proceedings of the National Academy of Sciences, 104(suppl 1):8597–8604.

Martin, G. and Roques, L. (2016). The nonstationary dynamics of fitness distributions: asexual model with epistasis and standing variation. Genetics, 204(4):1541–1558.

Meng, J., Chakradhar, S., and Raghunathan, A. (2009). Best-effort parallel execution framework for recognition and mining applications. In 2009 IEEE International Symposium on Parallel & Distributed Processing, pages 1–12. IEEE.

Meurer, A., Smith, C. P., Paprocki, M., Čertík, O., Kirpichev, S. B., Rocklin, M., Kumar, A., Ivanov, S., Moore, J. K., Singh, S., et al. (2017). Sympy: symbolic computing in python. PeerJ Computer Science, 3:e103.

Miaoulis, G. and Plemenos, D. (2008). Intelligent Scene Modelling Information Systems, volume 181. Springer.

Miikkulainen, R., Liang, J., Meyerson, E., Rawal, A., Fink, D., Francon, O., Raju, B., Shahrzad, H., Navruzyan, A., Duffy, N., et al. (2019). Evolving deep neural networks. In Artificial Intelligence in the Age of Neural Networks and Brain Computing, pages 293–312. Elsevier.

Mittal, S. (2016). A survey of techniques for approximate computing. ACM Computing Surveys (CSUR), 48(4):1–33.

Moreno, M. A. (2019). Evaluating function dispatch methods in signalgp. Preprint via Open Science Framework at https://osf.io/rmkcv.

Moreno, M. A. (2020). Profiling foundations for scalable digital evolution methods. Preprint via Open Science Framework at https://osf.io/tcjfy.

Moreno, M. A., Dolson, E., and Ofria, C. (2022a). Hereditary stratigraph concept supplement. Available at https://osf.io/4sm72.
Moreno, M. A., Dolson, E., and Ofria, C. (2022b). Hereditary Stratigraphy: Genome Annotations to Enable Phylogenetic Inference over Distributed Populations. In ALIFE 2022: The 2022 Conference on Artificial Life, page 64. MIT Press.

Moreno, M. A. and Ofria, C. (2019). Toward open-ended fraternal transitions in individuality. Artificial Life, 25(2):117–133.

Moreno, M. A. and Ofria, C. (2020). Practical steps toward indefinite scalability: In pursuit of robust computational substrates for open-ended evolution. Preprint via Open Science Framework at https://doi.org/10.17605/OSF.IO/53VGH.

Moreno, M. A. and Ofria, C. (2022). Exploring evolved multicellular life histories in a open-ended digital evolution system. Frontiers in Ecology and Evolution, 10.

Moreno, M. A., Papa, S. R., and Ofria, C. (2020). Conduit: A c++ library for best-effort high performance computing. In Proceedings of the 6th International Workshop on Modeling and Simulation of and by Parallel and Distributed Systems at the 2020 International Conference on High Performance Computing & Simulation, HPCS 2020.

Moreno, M. A., Papa, S. R., and Ofria, C. (2021a). Case study of novelty, complexity, and adaptation in a multicellular system. In OEE4: The Fourth Workshop on Open-Ended Evolution.

Moreno, M. A., Papa, S. R., and Ofria, C. (2021b). Conduit: A c++ library for best-effort high performance computing. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, GECCO '21, pages 1795–1800, New York, NY, USA. Association for Computing Machinery.

Nguyen, A. M., Yosinski, J., and Clune, J. (2015). Innovation engines: Automated creativity and improved stochastic optimization via deep learning. In Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, GECCO '15, pages 959–966, New York, NY, USA. Association for Computing Machinery.

Ni, X. (2016). Mitigation of failures in high performance computing via runtime techniques. PhD thesis, University of Illinois.

Niu, F., Recht, B., Re, C., and Wright, S. J. (2011). Hogwild!: a lock-free approach to parallelizing stochastic gradient descent. In Proceedings of the 24th International Conference on Neural Information Processing Systems, pages 693–701.

Noel, C. and Osindero, S. (2014). Dogwild!: distributed hogwild for cpu & gpu. In NIPS Workshop on Distributed Machine Learning and Matrix Computations, pages 693–701.

OEIS (2021a). Sequence a056791. The on-line encyclopedia of integer sequences. Available at https://oeis.org/A056791.

OEIS (2021b). Sequence a063787. The on-line encyclopedia of integer sequences. Available at https://oeis.org/A063787.

Ofria, C., Adami, C., Collier, T. C., and Hsu, G. K. (1999). Evolution of differentiated expression patterns in digital organisms. In European Conference on Artificial Life, pages 129–138. Springer.

Ofria, C., Bryson, D. M., and Wilke, C. O. (2009). Avida, pages 3–35. Springer London, London.

Ofria, C., Dolson, E., Lalejini, A., Fenton, J., Moreno, M. A., Jorgensen, S., Miller, R., Stredwick, J., Zaman, L., Schossau, J., Gillespie, L., G, N. C., and Vostinar, A. (2019). Empirical c++ scientific software library for research, education, & public engagement.

Packard, N., Bedau, M. A., Channon, A., Ikegami, T., Rasmussen, S., Stanley, K. O., and Taylor, T. (2019). An overview of open-ended evolution: Editorial introduction to the open-ended evolution ii special issue. Artificial Life, 25(2):93–103.
Paradis, E., Claude, J., and Strimmer, K. (2004). Ape: analyses of phylogenetics and evolution in r language. Bioinformatics, 20(2):289–290.

Petscher, Y. and Logan, J. A. (2014). Quantile regression in the study of developmental sciences. Child Development, 85(3):861–881.

Poli, R., Langdon, W. B., and McPhee, N. F. (2008). A Field Guide to Genetic Programming. Lulu Enterprises, UK Ltd.

Pontes, A. C., Mobley, R. B., Ofria, C., Adami, C., and Dyer, F. C. (2020). The evolutionary origin of associative learning. The American Naturalist, 195(1):E1–E19.

Project Jupyter, Bussonnier, M., Forde, J., Freeman, J., Granger, B., Head, T., Holdgraf, C., Kelley, K., Nalvarte, G., Osheroff, A., Pacer, M., Panda, Y., Perez, F., Ragan-Kelley, B., and Willing, C. (2018). Binder 2.0 - Reproducible, interactive, sharable environments for science at scale. In Akici, F., Lippa, D., Niederhut, D., and Pacer, M., editors, Proceedings of the 17th Python in Science Conference, pages 113–120.

Queller, D. C. (1997). Cooperators since life began. The Quarterly Review of Biology, 72(2):184–188.

Ragan-Kelley, B. and Willing, C. (2018). Binder 2.0 - reproducible, interactive, sharable environments for science at scale. In Proceedings of the 17th Python in Science Conference (F. Akici, D. Lippa, D. Niederhut, and M. Pacer, eds.), pages 113–120.

Rahmati, D., Murali, S., Benini, L., Angiolini, F., De Micheli, G., and Sarbazi-Azad, H. (2011). Computing accurate performance bounds for best effort networks-on-chip. IEEE Transactions on Computers, 62(3):452–467.

Ratcliff, W. C., Denison, R. F., Borrello, M., and Travisano, M. (2012). Experimental evolution of multicellularity. Proceedings of the National Academy of Sciences, 109(5):1595–1600.

Ratcliff, W. C., Fankhauser, J. D., Rogers, D. W., Greig, D., and Travisano, M. (2015). Origins of multicellular evolvability in snowflake yeast. Nature Communications, 6:6102.

Ratcliff, W. C. and Travisano, M. (2014). Experimental evolution of multicellular complexity in saccharomyces cerevisiae. BioScience, 64(5):383–393.

Ray, T. (1995). A proposal to create a network-wide biodiversity reserve for digital organisms. Technical Report TR-H-133, ATR.

Ray, T. S. and Hart, J. F. (2000). Evolution of differentiation in multithreaded digital organisms. Artificial Life, 7:132–140.

Ray, T. S. and Thearling, K. (1996). Evolving parallel computation. Complex Systems, 10(3):229–237.

Reinders, J. (2007). Intel threading building blocks: outfitting C++ for multi-core processor parallelism. O'Reilly Media, Inc.

Rhodes, O., Peres, L., Rowley, A., Gait, A., Plana, L., Brenninkmeijer, C., and Furber, S. (2019). Real-time cortical simulation on neuromorphic hardware. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 378(2164):1–21.

Sarkar, S., Majumder, T., Kalyanaraman, A., and Pande, P. P. (2010). Hardware accelerators for biocomputing: A survey. In Proceedings of 2010 IEEE International Symposium on Circuits and Systems, pages 3789–3792. IEEE.

Scoles, S. (2018). Cosmic ray showers crash supercomputers. Here's what to do about it. Wired.

Smith, J. M. and Szathmary, E. (1997). The major transitions in evolution. Oxford University Press.

Smith, M. R. (2020a). Information theoretic generalized robinson-foulds metrics for comparing phylogenetic trees. Bioinformatics, 36(20):5007–5013.

Smith, M. R. (2020b). ms609/treedistdata: v1.0.0.

Smith, M. R. (2020c). TreeDist: Distances between Phylogenetic Trees. R package version 2.5.0.
Smith, M. R. (2022). Robust analysis of phylogenetic tree space. Systematic Biology.

Sokal, R. R. (1958). A statistical method for evaluating systematic relationships. Univ. Kansas, Sci. Bull., 38:1409–1438.

Soros, L. and Stanley, K. (2014). Identifying necessary conditions for open-ended evolution through the artificial life world of chromaria. In ALIFE 14: The Fourteenth International Conference on the Synthesis and Simulation of Living Systems, pages 793–800. MIT Press.

Sridharan, V., DeBardeleben, N., Blanchard, S., Ferreira, K. B., Stearley, J., Shalf, J., and Gurumurthi, S. (2015). Memory errors in modern systems: The good, the bad, and the ugly. ACM SIGARCH Computer Architecture News, 43(1):297–310.

Stanley, K. O., Lehman, J., and Soros, L. (2017). Open-endedness: The last grand challenge you've never heard of. O'Reilly Radar / AI & ML.

Stanley, K. O. and Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary Computation, 10(2):99–127.

Stanley, K. O. and Miikkulainen, R. (2003). A taxonomy for artificial embryogeny. Artificial Life, 9(2):93–130.

Staps, M., van Gestel, J., and Tarnita, C. E. (2019). Emergence of diverse life cycles and life histories at the origin of multicellularity. Nature Ecology & Evolution, 3(8):1197–1205.

Steno, N. (1916). The prodromus of Nicolaus Steno's dissertation concerning a solid body enclosed by process of nature within a solid, volume 11. University of Michigan Press.

Sukumaran, J. and Holder, M. T. (2010). Dendropy: a python library for phylogenetic computing. Bioinformatics, 26(12):1569–1571.

Sutter, H. et al. (2005). The free lunch is over: A fundamental turn toward concurrency in software. Dr. Dobb's Journal, 30(3):202–210.

Tang, C., Bouteiller, A., Herault, T., Venkata, M. G., and Bosilca, G. (2014). From mpi to openshmem: Porting lammps. In Workshop on OpenSHMEM and Related Technologies, pages 121–137. Springer.

Taylor, T., Bedau, M., Channon, A., Ackley, D., Banzhaf, W., Beslon, G., Dolson, E., Froese, T., Hickinbotham, S., Ikegami, T., et al. (2016). Open-ended evolution: Perspectives from the oee workshop in york. Artificial Life, 22(3):408–423.

Teranishi, K. and Heroux, M. A. (2014). Toward local failure local recovery resilience model using mpi-ulfm. In Proceedings of the 21st European MPI Users' Group Meeting, pages 51–56.

Ushey, K., Allaire, J., and Tang, Y. (2022). reticulate: Interface to 'Python'. Available at https://rstudio.github.io/reticulate/.

Valiant, L. G. (1990). A bridging model for parallel computation. Communications of the ACM, 33(8):103–111.

Vankeirsbilck, J., Hallez, H., and Boydens, J. (2015). Soft error protection in safety critical embedded applications: An overview. In 2015 10th International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC), pages 605–610. IEEE.

Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Millman, K. J., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., Carey, C. J., Polat, İ., Feng, Y., Moore, E. W., VanderPlas, J., Laxalde, D., Perktold, J., Cimrman, R., Henriksen, I., Quintero, E. A., Harris, C. R., Archibald, A. M., Ribeiro, A. H., Pedregosa, F., van Mulbregt, P., and SciPy 1.0 Contributors (2020). SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, 17:261–272.

Wang, R., Clune, J., and Stanley, K. O. (2018). Vine: an open source interactive data visualization tool for neuroevolution. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, pages 1562–1564.
Waskom, M. L. (2021). seaborn: statistical data visualization. Journal of Open Source Software, 6(60):3021.

West, S. A., Fisher, R. M., Gardner, A., and Kiers, E. T. (2015). Major evolutionary transitions in individuality. Proceedings of the National Academy of Sciences, 112(33):10112–10119.

Wickham, H., François, R., Henry, L., and Müller, K. (2022). dplyr: A Grammar of Data Manipulation. Available at https://dplyr.tidyverse.org.

Wilke, C. O. and Adami, C. (2002). The biology of digital organisms. Trends in Ecology & Evolution, 17(11):528–532.

Wilson, E. O. (1984). The relation between caste ratios and division of labor in the ant genus pheidole (hymenoptera: Formicidae). Behavioral Ecology and Sociobiology, 16(1):89–98.

Xiang, D., Wang, X., Jia, C., Lee, T., and Guo, X. (2016). Molecular-scale electronics: from concept to function. Chemical Reviews, 116(7):4318–4440.

Zaman, L., Devangam, S., and Ofria, C. (2011). Rapid host-parasite coevolution drives the production and maintenance of diversity in digital organisms. In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, pages 219–226.

Zhao, X., Papagelis, M., An, A., Chen, B. X., Liu, J., and Hu, Y. (2019). Elastic bulk synchronous parallel model for distributed deep learning. In 2019 IEEE International Conference on Data Mining (ICDM), pages 1504–1509. IEEE.

Zhaxybayeva, O. and Gogarten, J. P. (2004). Cladogenesis, coalescence and the evolution of the three domains of life. TRENDS in Genetics, 20(4):182–187.

Appendix A

Design and Scalability Analysis of Conduit: a Best-effort Communication Software Framework

A.1 Weak Scaling

This section provides full results from the weak scaling experiments discussed in Section 2.3.6.

Figure A.1: Distribution of Latency Walltime Inlet (ns) for individual snapshot measurements for weak scaling experiment (Section 2.3.6). Lower is better. Panel (a) shows the distribution for each snapshot without outliers; panel (b) shows it with outliers.

Figure A.2: Distribution of Latency Simsteps Outlet for individual snapshot measurements for weak scaling experiment (Section 2.3.6). Lower is better. Panel (a) shows the distribution for each snapshot without outliers; panel (b) shows it with outliers.
1e9 Cpus Per Node = 1 Cpus Per Node = 4 1e6 Cpus Per Node = 1 Cpus Per Node = 4 1.2 2.0 Latency Walltime Outlet (ns) Latency Walltime Outlet (ns) Num Simels Per Cpu = 1 Num Simels Per Cpu = 1 1.0 1.5 0.8 1.0 0.6 0.4 0.5 0.2 0.0 0.0 1e6 1e10 1.4 Latency Walltime Outlet (ns) Num Simels Per Cpu = 2048 Latency Walltime Outlet (ns) Num Simels Per Cpu = 2048 1.2 4 1.0 3 0.8 0.6 2 0.4 0.2 1 0.0 16 64 256 16 64 256 16 64 256 16 64 256 Num Processes Num Processes Num Processes Num Processes (a) Distribution of Latency Walltime Outlet (ns) for (b) Distribution of Latency Walltime Outlet (ns) for each snapshot, without outliers. each snapshot, with outliers. Figure A.3: Distribution of Latency Walltime Outlet (ns) for individual snapshot measurements for weak scaling experiment (Section 2.3.6). Lower is better. 131 Cpus Per Node = 1 Cpus Per Node = 4 Cpus Per Node = 1 Cpus Per Node = 4 1.0 0.9 Num Simels Per Cpu = 1 Num Simels Per Cpu = 1 0.8 0.8 Delivery Clumpiness Delivery Clumpiness 0.7 0.6 0.6 0.4 0.5 0.4 0.2 0.3 0.0 0.8 0.7 Num Simels Per Cpu = 2048 Num Simels Per Cpu = 2048 0.7 0.6 Delivery Clumpiness Delivery Clumpiness 0.6 0.5 0.5 0.4 0.4 0.3 0.3 0.2 0.2 0.1 0.1 0.0 0.0 16 64 256 16 64 256 16 64 256 16 64 256 Num Processes Num Processes Num Processes Num Processes (a) Distribution of Delivery Clumpiness for each (b) Distribution of Delivery Clumpiness for each snapshot, without outliers. snapshot, with outliers. Figure A.4: Distribution of Delivery Clumpiness for individual snapshot measurements for weak scaling experiment (Section 2.3.6). Lower is better. Cpus Per Node = 1 Cpus Per Node = 4 1e7 Cpus Per Node = 1 Cpus Per Node = 4 200000 Simstep Period Inlet (ns) Simstep Period Inlet (ns) 8 Num Simels Per Cpu = 1 Num Simels Per Cpu = 1 150000 6 4 100000 2 50000 0 1e6 1e7 1.0 Num Simels Per Cpu = 2048 Num Simels Per Cpu = 2048 2.5 Simstep Period Inlet (ns) Simstep Period Inlet (ns) 0.8 2.0 0.6 1.5 0.4 1.0 0.2 16 64 256 16 64 256 16 64 256 16 64 256 Num Processes Num Processes Num Processes Num Processes (a) Distribution of Simstep Period Inlet (ns) for each (b) Distribution of Simstep Period Inlet (ns) for each snapshot, without outliers. snapshot, with outliers. Figure A.5: Distribution of Simstep Period Inlet (ns) for individual snapshot measurements for weak scaling experiment (Section 2.3.6). Lower is better. 132 Cpus Per Node = 1 Cpus Per Node = 4 Cpus Per Node = 1 Cpus Per Node = 4 17.5 15.0 80 Latency Simsteps Inlet Num Simels Per Cpu = 1 Latency Simsteps Inlet Num Simels Per Cpu = 1 12.5 60 10.0 7.5 40 5.0 20 2.5 0.0 0 8000 2.0 7000 Num Simels Per Cpu = 2048 Num Simels Per Cpu = 2048 1.8 Latency Simsteps Inlet Latency Simsteps Inlet 6000 1.6 5000 1.4 4000 1.2 3000 1.0 2000 0.8 1000 0.6 0 16 64 256 16 64 256 16 64 256 16 64 256 Num Processes Num Processes Num Processes Num Processes (a) Distribution of Latency Simsteps Inlet for each (b) Distribution of Latency Simsteps Inlet for each snapshot, without outliers. snapshot, with outliers. Figure A.6: Distribution of Latency Simsteps Inlet for individual snapshot measurements for weak scaling experiment (Section 2.3.6). Lower is better. 
Cpus Per Node = 1 Cpus Per Node = 4 1e9 Cpus Per Node = 1 Cpus Per Node = 4 4 200000 Simstep Period Outlet (ns) Num Simels Per Cpu = 1 Simstep Period Outlet (ns) Num Simels Per Cpu = 1 3 150000 2 100000 1 50000 0 1e6 1e7 1.0 Num Simels Per Cpu = 2048 Num Simels Per Cpu = 2048 2.5 Simstep Period Outlet (ns) Simstep Period Outlet (ns) 0.8 2.0 0.6 1.5 0.4 1.0 0.2 16 64 256 16 64 256 16 64 256 16 64 256 Num Processes Num Processes Num Processes Num Processes (a) Distribution of Simstep Period Outlet (ns) for (b) Distribution of Simstep Period Outlet (ns) for each snapshot, without outliers. each snapshot, with outliers. Figure A.7: Distribution of Simstep Period Outlet (ns) for individual snapshot measurements for weak scaling experiment (Section 2.3.6). Lower is better. 133 Cpus Per Node = 1 Cpus Per Node = 4 Cpus Per Node = 1 Cpus Per Node = 4 0.5 0.5 Num Simels Per Cpu = 1 Num Simels Per Cpu = 1 0.4 0.4 Delivery Failure Rate Delivery Failure Rate 0.3 0.3 0.2 0.2 0.1 0.1 0.0 0.0 0.7 Num Simels Per Cpu = 2048 Num Simels Per Cpu = 2048 0.0002 0.6 Delivery Failure Rate Delivery Failure Rate 0.5 0.0001 0.4 0.0000 0.3 0.2 −0.0001 0.1 0.0 −0.0002 16 64 256 16 64 256 16 64 256 16 64 256 Num Processes Num Processes Num Processes Num Processes (a) Distribution of Delivery Failure Rate for each (b) Distribution of Delivery Failure Rate for each snapshot, without outliers. snapshot, with outliers. Figure A.8: Distribution of Delivery Failure Rate for individual snapshot measurements for weak scaling experiment (Section 2.3.6). Lower is better. 134 Ordinary Least Squares Regression Cpus Per Node = 1 1e8 Cpus Per Node = 4 625000 1.50 Latency Walltime Inlet (ns) 600000 Num Simels Per Cpu = 1 1.25 575000 1.00 Estimated Statistic = Latency Walltime Inlet (ns) Mean | Num Processes = 16, 64, 256 Cpus Per Node = 1 Cpus Per Node = 4 550000 0.75 0 100000 525000 0.50 −5000 Num Simels Per Cpu = 1 Absolute Effect Size −10000 0 500000 0.25 −15000 0.00 −100000 475000 −20000 −25000 −200000 1e7 1e6 −30000 −300000 Num Simels Per Cpu = 2048 2.8 Latency Walltime Inlet (ns) 2.0 −35000 2.6 1e6 350000 1.5 7 Num Simels Per Cpu = 2048 2.4 300000 6 Absolute Effect Size 1.0 250000 2.2 5 200000 0.5 2.0 4 150000 3 0.0 1.8 100000 2 1 50000 2 3 4 2 3 4 Log Num Processes Log Num Processes 0 0 (a) Complete ordinary least squares regression plot. (b) Estimated regression coefficient for complete re- Observations are means per replicate. gression. Zero corresponds to no effect. Ordinary Least Squares Regression Cpus Per Node = 1 1e8 Cpus Per Node = 4 625000 1.50 Latency Walltime Inlet (ns) 600000 Num Simels Per Cpu = 1 1.25 Estimated Statistic = Latency Walltime Inlet (ns) Mean | Num Processes = 64, 256 575000 1.00 Cpus Per Node = 1 Cpus Per Node = 4 550000 50000 0.75 200000 Num Simels Per Cpu = 1 525000 Absolute Effect Size 40000 0.50 150000 500000 30000 0.25 100000 475000 20000 0.00 50000 1e7 1e6 10000 0 0 Num Simels Per Cpu = 2048 2.8 Latency Walltime Inlet (ns) −50000 2.0 1e7 2.6 0 1.4 Num Simels Per Cpu = 2048 1.5 −50000 2.4 1.2 Absolute Effect Size −100000 2.2 1.0 1.0 −150000 0.8 2.0 0.5 0.6 −200000 1.8 0.4 −250000 0.2 2 3 4 2 3 4 −300000 Log Num Processes Log Num Processes 0.0 (c) Piecewise ordinary least squares regression plot. (d) Estimated regression coefficient for rightmost par- Observations are means per replicate. tial regression. Zero corresponds to no effect. Figure A.9: Ordinary least squares regressions of Latency Walltime Inlet (ns) against log processor count for weak scaling experiment (Section 2.3.6). 
Lower is better. Top row shows complete regression and bottom row shows piecewise regression. Ordinary least squares regression estimates relationship between independent variable and mean of response variable. Error bands and bars are 95% confidence intervals. Note that log is base 4, so processor counts correspond to 16, 64, and 256. 135 Ordinary Least Squares Regression Cpus Per Node = 1 Cpus Per Node = 4 9.5 9.0 Latency Simsteps Outlet Num Simels Per Cpu = 1 9 8.5 Estimated Statistic = Latency Simsteps Outlet Mean | Num Processes = 16, 64, 256 Cpus Per Node = 1 Cpus Per Node = 4 8 8.0 0.0 7.5 −0.2 0.04 Num Simels Per Cpu = 1 7 Absolute Effect Size 7.0 −0.4 0.02 6.5 −0.6 6 0.00 6.0 −0.8 −1.0 −0.02 12 −1.2 1.50 −0.04 Num Simels Per Cpu = 2048 Latency Simsteps Outlet −1.4 10 1.45 4.0 0.05 8 1.40 Num Simels Per Cpu = 2048 3.5 0.04 6 1.35 Absolute Effect Size 3.0 0.03 4 1.30 2.5 0.02 2.0 2 1.25 0.01 1.5 0 1.20 0.00 1.0 0.5 −0.01 2 3 4 2 3 4 Log Num Processes Log Num Processes 0.0 −0.02 (a) Complete ordinary least squares regression plot. (b) Estimated regression coefficient for complete re- Observations are means per replicate. gression. Zero corresponds to no effect. Ordinary Least Squares Regression Cpus Per Node = 1 Cpus Per Node = 4 9.5 9.0 Latency Simsteps Outlet Num Simels Per Cpu = 1 9 Estimated Statistic = Latency Simsteps Outlet Mean | Num Processes = 64, 256 8.5 Cpus Per Node = 1 Cpus Per Node = 4 0.4 8 8.0 0.04 Num Simels Per Cpu = 1 7.5 0.2 Absolute Effect Size 0.02 7 7.0 0.0 6.5 0.00 6 6.0 −0.2 −0.02 −0.4 −0.04 12 1.50 Num Simels Per Cpu = 2048 −0.6 Latency Simsteps Outlet 10 1.45 8 0.04 1.40 Num Simels Per Cpu = 2048 8 0.02 Absolute Effect Size 1.35 6 6 0.00 1.30 4 4 1.25 −0.02 2 1.20 −0.04 2 2 3 4 2 3 4 −0.06 Log Num Processes Log Num Processes 0 (c) Piecewise ordinary least squares regression plot. (d) Estimated regression coefficient for rightmost par- Observations are means per replicate. tial regression. Zero corresponds to no effect. Figure A.10: Ordinary least squares regressions of Latency Simsteps Outlet against log processor count for weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression and bottom row shows piecewise regression. Ordinary least squares regression estimates relationship between independent variable and mean of response variable. Error bands and bars are 95% confidence intervals. Note that log is base 4, so processor counts correspond to 16, 64, and 256. 136 Ordinary Least Squares Regression Cpus Per Node = 1 1e8 Cpus Per Node = 4 625000 1.50 Latency Walltime Outlet (ns) Num Simels Per Cpu = 1 600000 1.25 575000 1.00 Estimated Statistic = Latency Walltime Outlet (ns) Mean | Num Processes = 16, 64, 256 550000 0.75 Cpus Per Node = 1 Cpus Per Node = 4 0 525000 0.50 −5000 0.04 Num Simels Per Cpu = 1 Absolute Effect Size −10000 500000 0.25 0.02 −15000 475000 0.00 −20000 0.00 −25000 1e7 1e6 −0.02 −30000 Latency Walltime Outlet (ns) −0.04 Num Simels Per Cpu = 2048 2.0 2.8 −35000 2.6 1e6 1.5 350000 7 Num Simels Per Cpu = 2048 2.4 300000 1.0 6 Absolute Effect Size 250000 2.2 5 200000 0.5 4 2.0 150000 3 0.0 1.8 100000 2 1 50000 2 3 4 2 3 4 Log Num Processes Log Num Processes 0 0 (a) Complete ordinary least squares regression plot. (b) Estimated regression coefficient for complete re- Observations are means per replicate. gression. Zero corresponds to no effect. 
Figure A.11: Ordinary least squares regressions of Latency Walltime Outlet (ns) against log processor count for weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression and bottom row shows piecewise regression; observations are means per replicate, and companion panels show the estimated regression coefficient for the complete and rightmost partial regressions, where zero corresponds to no effect. Ordinary least squares regression estimates relationship between independent variable and mean of response variable. Error bands and bars are 95% confidence intervals. Note that log is base 4, so processor counts correspond to 16, 64, and 256.

Figure A.12: Ordinary least squares regressions of Latency Simsteps Inlet against log processor count for weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression and bottom row shows piecewise regression; observations are means per replicate, and companion panels show the estimated regression coefficient for the complete and rightmost partial regressions, where zero corresponds to no effect. Ordinary least squares regression estimates relationship between independent variable and mean of response variable. Error bands and bars are 95% confidence intervals. Note that log is base 4, so processor counts correspond to 16, 64, and 256.

Figure A.13: Ordinary least squares regressions of Simstep Period Outlet (ns) against log processor count for weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression and bottom row shows piecewise regression; observations are means per replicate, and companion panels show the estimated regression coefficient for the complete and rightmost partial regressions, where zero corresponds to no effect. Ordinary least squares regression estimates relationship between independent variable and mean of response variable. Error bands and bars are 95% confidence intervals. Note that log is base 4, so processor counts correspond to 16, 64, and 256.

Figure A.14: Ordinary least squares regressions of Delivery Failure Rate against log processor count for weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression and bottom row shows piecewise regression; observations are means per replicate, and companion panels show the estimated regression coefficient for the complete and rightmost partial regressions, where zero corresponds to no effect. Ordinary least squares regression estimates relationship between independent variable and mean of response variable. Error bands and bars are 95% confidence intervals. Note that log is base 4, so processor counts correspond to 16, 64, and 256.

Figure A.15: Quantile Regressions of Latency Simsteps Outlet against log processor count for weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression and bottom row shows piecewise regression; observations are medians per replicate, and companion panels show the estimated regression coefficient for the complete and rightmost partial regressions, where zero corresponds to no effect. Quantile regression estimates relationship between independent variable and median of response variable. Note that log is base 4, so processor counts correspond to 16, 64, and 256.
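The quantile regressions of Figures A.15 through A.19 target the conditional median rather than the conditional mean, which makes them less sensitive to the heavy-tailed outliers visible in the distribution figures above. A minimal sketch of this variant, again assuming statsmodels and an illustrative DataFrame layout with synthetic stand-in data:

```python
# Minimal sketch of a median (quantile) regression against log4 process
# count; the DataFrame `df` and its column names are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# stand-in data: 3 processor counts x 10 replicates x 4 snapshots
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "num_processes": np.repeat([16, 64, 256], 40),
    "replicate": np.tile(np.repeat(np.arange(10), 4), 3),
    "latency_simsteps_outlet": rng.lognormal(1.0, 0.5, 120),
})

# one observation per replicate: the median over that replicate's snapshots
medians = (
    df.groupby(["num_processes", "replicate"])["latency_simsteps_outlet"]
    .median()
    .reset_index()
)
medians["log_num_processes"] = np.log(medians["num_processes"]) / np.log(4)

# q=0.5 fits the conditional median rather than the conditional mean
fit = smf.quantreg(
    "latency_simsteps_outlet ~ log_num_processes", medians
).fit(q=0.5)
print(fit.params["log_num_processes"])          # estimated slope
print(fit.conf_int().loc["log_num_processes"])  # 95% confidence interval
```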
Figure A.16: Quantile Regressions of Latency Walltime Outlet (ns) against log processor count for weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression and bottom row shows piecewise regression; observations are medians per replicate, and companion panels show the estimated regression coefficient for the complete and rightmost partial regressions, where zero corresponds to no effect. Quantile regression estimates relationship between independent variable and median of response variable. Note that log is base 4, so processor counts correspond to 16, 64, and 256.

Figure A.17: Quantile Regressions of Delivery Clumpiness against log processor count for weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression and bottom row shows piecewise regression; observations are medians per replicate, and companion panels show the estimated regression coefficient for the complete and rightmost partial regressions, where zero corresponds to no effect. Quantile regression estimates relationship between independent variable and median of response variable. Note that log is base 4, so processor counts correspond to 16, 64, and 256.

Figure A.18: Quantile Regressions of Simstep Period Inlet (ns) against log processor count for weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression and bottom row shows piecewise regression; observations are medians per replicate, and companion panels show the estimated regression coefficient for the complete and rightmost partial regressions, where zero corresponds to no effect. Quantile regression estimates relationship between independent variable and median of response variable. Note that log is base 4, so processor counts correspond to 16, 64, and 256.

Figure A.19: Quantile Regressions of Simstep Period Outlet (ns) against log processor count for weak scaling experiment (Section 2.3.6). Lower is better. Top row shows complete regression and bottom row shows piecewise regression; observations are medians per replicate, and companion panels show the estimated regression coefficient for the complete and rightmost partial regressions, where zero corresponds to no effect. Quantile regression estimates relationship between independent variable and median of response variable. Note that log is base 4, so processor counts correspond to 16, 64, and 256.

Table A.1: Full Ordinary Least Squares Regression results of Latency Walltime Inlet (ns) against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
mean | - | 1 | 1 | 16/64/256 | -19 000 [-35 000, -2 400] | -0.033 [-0.062, -0.0043] | 30 | 0.026
mean | + | 1 | 2048 | 16/64/256 | 5.5e+06 [3.5e+06, 7.5e+06] | 2.5 [1.6, 3.4] | 30 | 4.8e-06
mean | 0 | 4 | 1 | 16/64/256 | -110 000 [-350 000, 120 000] | -0.13 [-0.4, 0.14] | 30 | 0.33
mean | + | 4 | 2048 | 16/64/256 | 230 000 [110 000, 340 000] | 0.11 [0.057, 0.17] | 30 | 0.0003
mean | - | 1 | 1 | 16/64 | -63 000 [-92 000, -34 000] | -0.11 [-0.16, -0.061] | 20 | 0.00024
mean | + | 1 | 2048 | 16/64 | 300 000 [170 000, 430 000] | 0.14 [0.077, 0.2] | 20 | 0.00014
mean | 0 | 4 | 1 | 16/64 | -320 000 [-900 000, 260 000] | -0.37 [-1, 0.3] | 20 | 0.26
mean | + | 4 | 2048 | 16/64 | 660 000 [530 000, 800 000] | 0.33 [0.27, 0.4] | 20 | 5.5e-09
mean | 0 | 1 | 1 | 64/256 | 26 000 [-990, 52 000] | 0.046 [-0.0018, 0.093] | 20 | 0.058
mean | + | 1 | 2048 | 64/256 | 1.1e+07 [6.5e+06, 1.5e+07] | 4.9 [3, 6.8] | 20 | 3.9e-05
mean | 0 | 4 | 1 | 64/256 | 93 000 [-44 000, 230 000] | 0.11 [-0.051, 0.27] | 20 | 0.17
mean | - | 4 | 2048 | 64/256 | -210 000 [-310 000, -110 000] | -0.11 [-0.16, -0.055] | 20 | 0.00039

Table A.2: Full Ordinary Least Squares Regression results of Latency Simsteps Outlet against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
mean | - | 1 | 1 | 16/64/256 | -1.1 [-1.4, -0.73] | -0.13 [-0.17, -0.087] | 30 | 5.5e-07
mean | + | 1 | 2048 | 16/64/256 | 2.9 [1.8, 4] | 2.2 [1.3, 3] | 30 | 1.1e-05
mean | NaN | 4 | 1 | 16/64/256 | inf [nan, nan] | inf [nan, nan] | 30 | nan
mean | 0 | 4 | 2048 | 16/64/256 | 0.017 [-0.017, 0.052] | 0.013 [-0.012, 0.038] | 30 | 0.3
mean | - | 1 | 1 | 16/64 | -2 [-2.7, -1.4] | -0.24 [-0.32, -0.17] | 20 | 1.9e-06
mean | - | 1 | 2048 | 16/64 | -0.092 [-0.17, -0.013] | -0.069 [-0.13, -0.0097] | 20 | 0.025
mean | - | 4 | 1 | 16/64 | -1.4 [-2.1, -0.71] | -0.17 [-0.26, -0.087] | 20 | 0.00053
mean | 0 | 4 | 2048 | 16/64 | 0.048 [-0.035, 0.13] | 0.036 [-0.026, 0.097] | 20 | 0.24
mean | 0 | 1 | 1 | 64/256 | -0.099 [-0.56, 0.36] | -0.012 [-0.067, 0.043] | 20 | 0.65
mean | + | 1 | 2048 | 64/256 | 5.8 [3.6, 8] | 4.4 [2.7, 6.1] | 20 | 3.8e-05
mean | NaN | 4 | 1 | 64/256 | inf [nan, nan] | inf [nan, nan] | 20 | nan
mean | 0 | 4 | 2048 | 64/256 | -0.013 [-0.066, 0.04] | -0.0095 [-0.049, 0.03] | 20 | 0.62
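In these tables, the effect sign column condenses each regression to "+", "-", "0", or "NaN". A plausible reading, consistent with the rows shown here, is that the sign simply records whether the coefficient's 95% confidence interval excludes zero at the stated p < 0.05 level; the helper below is a hypothetical illustration of that rule, not code from the original analysis.

```python
# Hypothetical helper illustrating how the effect-sign column could be
# derived from a regression coefficient's 95% confidence interval.
import math

def effect_sign(ci_lower: float, ci_upper: float) -> str:
    if not (math.isfinite(ci_lower) and math.isfinite(ci_upper)):
        return "NaN"  # multicollinearity or inf/NaN observations
    if ci_lower > 0:
        return "+"    # significantly positive at p < 0.05
    if ci_upper < 0:
        return "-"    # significantly negative at p < 0.05
    return "0"        # interval spans zero: no significant effect

# e.g., the first row of Table A.1, with 95% CI [-35 000, -2 400]
assert effect_sign(-35_000, -2_400) == "-"
# a CI spanning zero yields "0"
assert effect_sign(-0.017, 0.052) == "0"
```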
Table A.3: Full Ordinary Least Squares Regression results of Latency Walltime Outlet (ns) against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
mean | - | 1 | 1 | 16/64/256 | -20 000 [-36 000, -3 000] | -0.035 [-0.064, -0.0054] | 30 | 0.022
mean | + | 1 | 2048 | 16/64/256 | 5.5e+06 [3.5e+06, 7.4e+06] | 2.5 [1.6, 3.4] | 30 | 5e-06
mean | NaN | 4 | 1 | 16/64/256 | inf [nan, nan] | inf [nan, nan] | 30 | nan
mean | + | 4 | 2048 | 16/64/256 | 230 000 [110 000, 340 000] | 0.11 [0.056, 0.17] | 30 | 0.00034
mean | - | 1 | 1 | 16/64 | -65 000 [-95 000, -36 000] | -0.12 [-0.17, -0.063] | 20 | 0.00021
mean | + | 1 | 2048 | 16/64 | 290 000 [160 000, 430 000] | 0.13 [0.073, 0.19] | 20 | 0.00021
mean | 0 | 4 | 1 | 16/64 | -2.5e+06 [-7.2e+06, 2.2e+06] | -0.82 [-2.4, 0.73] | 20 | 0.28
mean | + | 4 | 2048 | 16/64 | 670 000 [530 000, 810 000] | 0.33 [0.26, 0.4] | 20 | 8e-09
mean | 0 | 1 | 1 | 64/256 | 26 000 [-1 500, 53 000] | 0.045 [-0.0026, 0.093] | 20 | 0.062
mean | + | 1 | 2048 | 64/256 | 1.1e+07 [6.5e+06, 1.5e+07] | 4.8 [2.9, 6.7] | 20 | 4e-05
mean | NaN | 4 | 1 | 64/256 | inf [nan, nan] | inf [nan, nan] | 20 | nan
mean | - | 4 | 2048 | 64/256 | -210 000 [-320 000, -110 000] | -0.11 [-0.16, -0.054] | 20 | 0.00045

Table A.4: Full Ordinary Least Squares Regression results of Delivery Clumpiness against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
mean | - | 1 | 1 | 16/64/256 | -0.036 [-0.045, -0.026] | -0.044 [-0.056, -0.033] | 30 | 1.6e-08
mean | - | 1 | 2048 | 16/64/256 | -0.03 [-0.044, -0.015] | -0.078 [-0.12, -0.041] | 30 | 0.00021
mean | - | 4 | 1 | 16/64/256 | -0.021 [-0.033, -0.0087] | -0.033 [-0.052, -0.014] | 30 | 0.0016
mean | 0 | 4 | 2048 | 16/64/256 | 0.028 [-0.0021, 0.058] | 0.1 [-0.0077, 0.21] | 30 | 0.067
mean | - | 1 | 1 | 16/64 | -0.055 [-0.074, -0.036] | -0.068 [-0.092, -0.044] | 20 | 1.1e-05
mean | 0 | 1 | 2048 | 16/64 | -0.034 [-0.07, 0.001] | -0.092 [-0.19, 0.0027] | 20 | 0.056
mean | 0 | 4 | 1 | 16/64 | -0.022 [-0.053, 0.0087] | -0.034 [-0.082, 0.013] | 20 | 0.15
mean | 0 | 4 | 2048 | 16/64 | 0.038 [-0.038, 0.11] | 0.14 [-0.14, 0.41] | 20 | 0.31
mean | - | 1 | 1 | 64/256 | -0.017 [-0.033, -0.0002] | -0.021 [-0.041, -0.00025] | 20 | 0.048
mean | 0 | 1 | 2048 | 64/256 | -0.025 [-0.052, 0.003] | -0.065 [-0.14, 0.008] | 20 | 0.078
mean | - | 4 | 1 | 64/256 | -0.02 [-0.035, -0.0039] | -0.031 [-0.055, -0.0061] | 20 | 0.017
mean | 0 | 4 | 2048 | 64/256 | 0.018 [-0.035, 0.071] | 0.065 [-0.13, 0.25] | 20 | 0.48
Table A.5: Full Ordinary Least Squares Regression results of Simstep Period Inlet (ns) against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
mean | + | 1 | 1 | 16/64/256 | 8 800 [7 400, 10 000] | 0.12 [0.11, 0.14] | 30 | 1.4e-13
mean | + | 1 | 2048 | 16/64/256 | 150 000 [97 000, 190 000] | 0.085 [0.056, 0.11] | 30 | 1.5e-06
mean | NaN | 4 | 1 | 16/64/256 | nan [nan, nan] | nan [nan, nan] | 30 | nan
mean | + | 4 | 2048 | 16/64/256 | 160 000 [89 000, 220 000] | 0.1 [0.058, 0.15] | 30 | 5.6e-05
mean | + | 1 | 1 | 16/64 | 12 000 [9 200, 15 000] | 0.17 [0.13, 0.21] | 20 | 4.5e-08
mean | + | 1 | 2048 | 16/64 | 360 000 [350 000, 380 000] | 0.21 [0.2, 0.22] | 20 | 7e-21
mean | NaN | 4 | 1 | 16/64 | -inf [nan, nan] | nan [nan, nan] | 20 | nan
mean | + | 4 | 2048 | 16/64 | 450 000 [430 000, 480 000] | 0.3 [0.28, 0.31] | 20 | 6.2e-20
mean | + | 1 | 1 | 64/256 | 5 600 [3 700, 7 500] | 0.08 [0.052, 0.11] | 20 | 8.8e-06
mean | - | 1 | 2048 | 64/256 | -72 000 [-83 000, -61 000] | -0.042 [-0.049, -0.036] | 20 | 5.8e-11
mean | NaN | 4 | 1 | 64/256 | inf [nan, nan] | nan [nan, nan] | 20 | nan
mean | - | 4 | 2048 | 64/256 | -140 000 [-160 000, -120 000] | -0.093 [-0.11, -0.079] | 20 | 4.4e-11

Table A.6: Full Ordinary Least Squares Regression results of Latency Simsteps Inlet against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
mean | - | 1 | 1 | 16/64/256 | -1 [-1.4, -0.7] | -0.13 [-0.17, -0.086] | 30 | 5.7e-07
mean | + | 1 | 2048 | 16/64/256 | 2.8 [1.7, 3.9] | 2.2 [1.4, 3.1] | 30 | 1.1e-05
mean | - | 4 | 1 | 16/64/256 | -0.64 [-0.97, -0.3] | -0.079 [-0.12, -0.037] | 30 | 0.00055
mean | 0 | 4 | 2048 | 16/64/256 | 0.016 [-0.016, 0.047] | 0.012 [-0.012, 0.036] | 30 | 0.32
mean | - | 1 | 1 | 16/64 | -2 [-2.6, -1.4] | -0.24 [-0.31, -0.17] | 20 | 2e-06
mean | - | 1 | 2048 | 16/64 | -0.081 [-0.16, -0.0058] | -0.064 [-0.12, -0.0045] | 20 | 0.036
mean | - | 4 | 1 | 16/64 | -1.4 [-2.1, -0.69] | -0.17 [-0.26, -0.085] | 20 | 0.00055
mean | 0 | 4 | 2048 | 16/64 | 0.043 [-0.033, 0.12] | 0.032 [-0.025, 0.09] | 20 | 0.25
mean | 0 | 1 | 1 | 64/256 | -0.09 [-0.53, 0.35] | -0.011 [-0.065, 0.043] | 20 | 0.67
mean | + | 1 | 2048 | 64/256 | 5.7 [3.5, 8] | 4.5 [2.8, 6.2] | 20 | 3.8e-05
mean | 0 | 4 | 1 | 64/256 | 0.1 [-0.39, 0.6] | 0.013 [-0.049, 0.074] | 20 | 0.67
mean | 0 | 4 | 2048 | 64/256 | -0.011 [-0.061, 0.038] | -0.0086 [-0.046, 0.029] | 20 | 0.64
Table A.7: Full Ordinary Least Squares Regression results of Simstep Period Outlet (ns) against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
mean | + | 1 | 1 | 16/64/256 | 8 700 [7 400, 10 000] | 0.13 [0.11, 0.14] | 30 | 1.2e-13
mean | + | 1 | 2048 | 16/64/256 | 150 000 [98 000, 190 000] | 0.087 [0.058, 0.12] | 30 | 1.2e-06
mean | NaN | 4 | 1 | 16/64/256 | nan [nan, nan] | nan [nan, nan] | 30 | nan
mean | + | 4 | 2048 | 16/64/256 | 150 000 [87 000, 220 000] | 0.1 [0.058, 0.15] | 30 | 5.5e-05
mean | + | 1 | 1 | 16/64 | 12 000 [9 100, 15 000] | 0.17 [0.13, 0.21] | 20 | 4.2e-08
mean | + | 1 | 2048 | 16/64 | 360 000 [350 000, 380 000] | 0.22 [0.21, 0.22] | 20 | 8.8e-21
mean | NaN | 4 | 1 | 16/64 | -inf [nan, nan] | nan [nan, nan] | 20 | nan
mean | + | 4 | 2048 | 16/64 | 440 000 [420 000, 470 000] | 0.3 [0.28, 0.31] | 20 | 3.4e-19
mean | + | 1 | 1 | 64/256 | 5 600 [3 700, 7 500] | 0.081 [0.053, 0.11] | 20 | 7.1e-06
mean | - | 1 | 2048 | 64/256 | -69 000 [-79 000, -59 000] | -0.041 [-0.047, -0.035] | 20 | 2.2e-11
mean | NaN | 4 | 1 | 64/256 | inf [nan, nan] | nan [nan, nan] | 20 | nan
mean | - | 4 | 2048 | 64/256 | -140 000 [-160 000, -120 000] | -0.092 [-0.11, -0.078] | 20 | 6.5e-11

Table A.8: Full Ordinary Least Squares Regression results of Delivery Failure Rate against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
mean | NaN | 1 | 1 | 16/64/256 | 0 [0, 0] | nan [nan, nan] | 30 | nan
mean | + | 1 | 2048 | 16/64/256 | 0.0015 [0.0011, 0.0018] | -13 [-9.6, -16] | 30 | 2.5e-09
mean | + | 4 | 1 | 16/64/256 | 0.015 [0.0088, 0.02] | 0.15 [0.091, 0.21] | 30 | 1.7e-05
mean | 0 | 4 | 2048 | 16/64/256 | -0.00075 [-0.0016, 9.3e-05] | -0.49 [-1, 0.06] | 30 | 0.079
mean | NaN | 1 | 1 | 16/64 | 0 [0, 0] | nan [nan, nan] | 20 | nan
mean | 0 | 1 | 2048 | 16/64 | 8.6e-06 [-0.00013, 0.00015] | -0.073 [1.1, -1.3] | 20 | 0.9
mean | 0 | 4 | 1 | 16/64 | -5.7e-08 [-0.012, 0.012] | -5.8e-07 [-0.12, 0.12] | 20 | 1
mean | 0 | 4 | 2048 | 16/64 | -0.00047 [-0.0026, 0.0016] | -0.31 [-1.7, 1.1] | 20 | 0.64
mean | NaN | 1 | 1 | 64/256 | 0 [0, 0] | nan [nan, nan] | 20 | nan
mean | + | 1 | 2048 | 64/256 | 0.0029 [0.0026, 0.0032] | -25 [-23, -28] | 20 | 6.7e-14
mean | + | 4 | 1 | 64/256 | 0.029 [0.025, 0.033] | 0.3 [0.26, 0.34] | 20 | 1.7e-11
mean | 0 | 4 | 2048 | 64/256 | -0.001 [-0.0026, 0.00055] | -0.66 [-1.7, 0.36] | 20 | 0.19
Table A.9: Full Quantile Regression results of Latency Walltime Inlet (ns) against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
median | - | 1 | 1 | 16/64/256 | -26 000 [-47 000, -6 000] | -0.049 [-0.087, -0.011] | 30 | 0.013
median | 0 | 1 | 2048 | 16/64/256 | 61 000 [-29 000, 150 000] | 0.029 [-0.014, 0.072] | 30 | 0.18
median | 0 | 4 | 1 | 16/64/256 | -1 400 [-23 000, 20 000] | -0.003 [-0.05, 0.044] | 30 | 0.9
median | + | 4 | 2048 | 16/64/256 | 200 000 [25 000, 380 000] | 0.11 [0.014, 0.21] | 30 | 0.027
median | - | 1 | 1 | 16/64 | -63 000 [-110 000, -18 000] | -0.12 [-0.2, -0.034] | 20 | 0.0083
median | + | 1 | 2048 | 16/64 | 260 000 [30 000, 500 000] | 0.12 [0.014, 0.23] | 20 | 0.029
median | 0 | 4 | 1 | 16/64 | -43 000 [-110 000, 21 000] | -0.095 [-0.24, 0.046] | 20 | 0.17
median | + | 4 | 2048 | 16/64 | 690 000 [290 000, 1.1e+06] | 0.38 [0.16, 0.6] | 20 | 0.002
median | 0 | 1 | 1 | 64/256 | 2 900 [-27 000, 33 000] | 0.0055 [-0.051, 0.062] | 20 | 0.84
median | 0 | 1 | 2048 | 64/256 | -96 000 [-230 000, 35 000] | -0.045 [-0.11, 0.017] | 20 | 0.14
median | 0 | 4 | 1 | 64/256 | 24 000 [-3 100, 51 000] | 0.052 [-0.0068, 0.11] | 20 | 0.08
median | 0 | 4 | 2048 | 64/256 | -130 000 [-310 000, 48 000] | -0.071 [-0.17, 0.026] | 20 | 0.14

Table A.10: Full Quantile Regression results of Latency Simsteps Outlet against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
median | - | 1 | 1 | 16/64/256 | -1.1 [-1.6, -0.55] | -0.13 [-0.2, -0.069] | 30 | 0.00025
median | 0 | 1 | 2048 | 16/64/256 | -0.037 [-0.08, 0.0057] | -0.029 [-0.063, 0.0045] | 30 | 0.087
median | - | 4 | 1 | 16/64/256 | -0.47 [-0.76, -0.18] | -0.068 [-0.11, -0.026] | 30 | 0.0027
median | 0 | 4 | 2048 | 16/64/256 | -0.003 [-0.063, 0.057] | -0.0022 [-0.047, 0.042] | 30 | 0.92
median | - | 1 | 1 | 16/64 | -2 [-3, -1.1] | -0.25 [-0.38, -0.13] | 20 | 0.00033
median | 0 | 1 | 2048 | 16/64 | -0.096 [-0.24, 0.051] | -0.075 [-0.19, 0.04] | 20 | 0.19
median | - | 4 | 1 | 16/64 | -1.1 [-1.7, -0.43] | -0.16 [-0.25, -0.063] | 20 | 0.0025
median | 0 | 4 | 2048 | 16/64 | -0.0014 [-0.17, 0.17] | -0.001 [-0.12, 0.12] | 20 | 0.99
median | 0 | 1 | 1 | 64/256 | -0.32 [-0.81, 0.18] | -0.04 [-0.1, 0.022] | 20 | 0.19
median | 0 | 1 | 2048 | 64/256 | -0.0014 [-0.061, 0.058] | -0.0011 [-0.048, 0.046] | 20 | 0.96
median | 0 | 4 | 1 | 64/256 | -0.017 [-0.43, 0.4] | -0.0024 [-0.063, 0.058] | 20 | 0.93
median | 0 | 4 | 2048 | 64/256 | -0.0068 [-0.081, 0.068] | -0.005 [-0.06, 0.05] | 20 | 0.85
Table A.11: Full Quantile Regression results of Latency Walltime Outlet (ns) against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
median | - | 1 | 1 | 16/64/256 | -26 000 [-47 000, -4 900] | -0.048 [-0.086, -0.0091] | 30 | 0.017
median | 0 | 1 | 2048 | 16/64/256 | 60 000 [-32 000, 150 000] | 0.028 [-0.015, 0.071] | 30 | 0.19
median | 0 | 4 | 1 | 16/64/256 | -3 400 [-26 000, 19 000] | -0.0074 [-0.056, 0.041] | 30 | 0.75
median | + | 4 | 2048 | 16/64/256 | 200 000 [13 000, 380 000] | 0.11 [0.0073, 0.21] | 30 | 0.036
median | - | 1 | 1 | 16/64 | -69 000 [-120 000, -21 000] | -0.13 [-0.22, -0.038] | 20 | 0.0077
median | + | 1 | 2048 | 16/64 | 260 000 [7 700, 520 000] | 0.12 [0.0036, 0.24] | 20 | 0.044
median | 0 | 4 | 1 | 16/64 | -44 000 [-110 000, 20 000] | -0.096 [-0.24, 0.044] | 20 | 0.17
median | + | 4 | 2048 | 16/64 | 710 000 [300 000, 1.1e+06] | 0.38 [0.16, 0.6] | 20 | 0.0017
median | 0 | 1 | 1 | 64/256 | 3 100 [-26 000, 33 000] | 0.0057 [-0.049, 0.06] | 20 | 0.83
median | 0 | 1 | 2048 | 64/256 | -100 000 [-240 000, 38 000] | -0.048 [-0.11, 0.018] | 20 | 0.14
median | 0 | 4 | 1 | 64/256 | 24 000 [-9 100, 57 000] | 0.052 [-0.02, 0.12] | 20 | 0.14
median | 0 | 4 | 2048 | 64/256 | -150 000 [-310 000, 13 000] | -0.08 [-0.17, 0.0072] | 20 | 0.07

Table A.12: Full Quantile Regression results of Delivery Clumpiness against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
median | - | 1 | 1 | 16/64/256 | -0.038 [-0.051, -0.024] | -0.047 [-0.063, -0.03] | 30 | 2.7e-06
median | - | 1 | 2048 | 16/64/256 | -0.031 [-0.055, -0.0064] | -0.079 [-0.14, -0.017] | 30 | 0.015
median | - | 4 | 1 | 16/64/256 | -0.024 [-0.035, -0.014] | -0.036 [-0.052, -0.021] | 30 | 5.5e-05
median | 0 | 4 | 2048 | 16/64/256 | 0.019 [-0.024, 0.061] | 0.057 [-0.072, 0.19] | 30 | 0.37
median | - | 1 | 1 | 16/64 | -0.057 [-0.087, -0.027] | -0.07 [-0.11, -0.033] | 20 | 0.00091
median | 0 | 1 | 2048 | 16/64 | -0.042 [-0.11, 0.024] | -0.11 [-0.28, 0.063] | 20 | 0.2
median | - | 4 | 1 | 16/64 | -0.033 [-0.062, -0.0034] | -0.049 [-0.093, -0.0052] | 20 | 0.031
median | 0 | 4 | 2048 | 16/64 | 0.03 [-0.14, 0.2] | 0.091 [-0.41, 0.6] | 20 | 0.71
median | 0 | 1 | 1 | 64/256 | -0.017 [-0.043, 0.0085] | -0.021 [-0.053, 0.011] | 20 | 0.18
median | 0 | 1 | 2048 | 64/256 | -0.024 [-0.072, 0.024] | -0.063 [-0.19, 0.061] | 20 | 0.3
median | 0 | 4 | 1 | 64/256 | -0.014 [-0.036, 0.0071] | -0.022 [-0.054, 0.011] | 20 | 0.18
median | 0 | 4 | 2048 | 64/256 | 0.015 [-0.04, 0.07] | 0.045 [-0.12, 0.21] | 20 | 0.58
Table A.13: Full Quantile Regression results of Simstep Period Inlet (ns) against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
median | + | 1 | 1 | 16/64/256 | 8 600 [6 900, 10 000] | 0.12 [0.098, 0.15] | 30 | 6.7e-11
median | + | 1 | 2048 | 16/64/256 | 140 000 [33 000, 240 000] | 0.081 [0.019, 0.14] | 30 | 0.012
median | 0 | 4 | 1 | 16/64/256 | 2 000 [-470, 4 400] | 0.03 [-0.0072, 0.067] | 30 | 0.11
median | + | 4 | 2048 | 16/64/256 | 140 000 [27 000, 250 000] | 0.087 [0.017, 0.16] | 30 | 0.016
median | + | 1 | 1 | 16/64 | 10 000 [7 000, 13 000] | 0.15 [0.1, 0.19] | 20 | 2.9e-06
median | + | 1 | 2048 | 16/64 | 360 000 [330 000, 380 000] | 0.21 [0.2, 0.22] | 20 | 2.7e-17
median | 0 | 4 | 1 | 16/64 | 4 900 [-3 300, 13 000] | 0.074 [-0.051, 0.2] | 20 | 0.23
median | + | 4 | 2048 | 16/64 | 370 000 [320 000, 410 000] | 0.23 [0.2, 0.26] | 20 | 6.8e-12
median | + | 1 | 1 | 64/256 | 6 400 [3 700, 9 000] | 0.091 [0.053, 0.13] | 20 | 9e-05
median | - | 1 | 2048 | 64/256 | -79 000 [-95 000, -63 000] | -0.046 [-0.056, -0.037] | 20 | 6.5e-09
median | 0 | 4 | 1 | 64/256 | 300 [-2 800, 3 400] | 0.0045 [-0.043, 0.052] | 20 | 0.85
median | - | 4 | 2048 | 64/256 | -91 000 [-120 000, -59 000] | -0.058 [-0.078, -0.037] | 20 | 1.2e-05

Table A.14: Full Quantile Regression results of Latency Simsteps Inlet against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
median | - | 1 | 1 | 16/64/256 | -1.1 [-1.6, -0.62] | -0.14 [-0.2, -0.079] | 30 | 6.6e-05
median | 0 | 1 | 2048 | 16/64/256 | -0.034 [-0.074, 0.0068] | -0.027 [-0.06, 0.0055] | 30 | 0.1
median | - | 4 | 1 | 16/64/256 | -0.42 [-0.72, -0.12] | -0.063 [-0.11, -0.018] | 30 | 0.0074
median | 0 | 4 | 2048 | 16/64/256 | -0.0031 [-0.059, 0.053] | -0.0023 [-0.044, 0.04] | 30 | 0.91
median | - | 1 | 1 | 16/64 | -1.9 [-2.7, -1.1] | -0.24 [-0.35, -0.14] | 20 | 0.00017
median | 0 | 1 | 2048 | 16/64 | -0.089 [-0.23, 0.051] | -0.072 [-0.19, 0.041] | 20 | 0.2
median | - | 4 | 1 | 16/64 | -1.1 [-1.7, -0.36] | -0.16 [-0.26, -0.054] | 20 | 0.005
median | 0 | 4 | 2048 | 16/64 | -0.0057 [-0.15, 0.14] | -0.0043 [-0.12, 0.11] | 20 | 0.94
median | 0 | 1 | 1 | 64/256 | -0.39 [-0.85, 0.064] | -0.051 [-0.11, 0.0082] | 20 | 0.088
median | 0 | 1 | 2048 | 64/256 | -0.0032 [-0.063, 0.057] | -0.0026 [-0.051, 0.046] | 20 | 0.91
median | 0 | 4 | 1 | 64/256 | 0.00035 [-0.42, 0.42] | 5.2e-05 [-0.062, 0.062] | 20 | 1
median | 0 | 4 | 2048 | 64/256 | 0.0012 [-0.073, 0.076] | 0.00089 [-0.055, 0.057] | 20 | 0.97
Table A.15: Full Quantile Regression results of Simstep Period Outlet (ns) against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
median | + | 1 | 1 | 16/64/256 | 8 500 [6 800, 10 000] | 0.12 [0.099, 0.15] | 30 | 3.2e-11
median | + | 1 | 2048 | 16/64/256 | 140 000 [35 000, 240 000] | 0.083 [0.021, 0.14] | 30 | 0.011
median | 0 | 4 | 1 | 16/64/256 | 1 700 [-460, 3 800] | 0.026 [-0.0071, 0.059] | 30 | 0.12
median | + | 4 | 2048 | 16/64/256 | 140 000 [28 000, 240 000] | 0.088 [0.018, 0.16] | 30 | 0.015
median | + | 1 | 1 | 16/64 | 10 000 [7 000, 13 000] | 0.14 [0.1, 0.19] | 20 | 1.3e-06
median | + | 1 | 2048 | 16/64 | 350 000 [330 000, 370 000] | 0.21 [0.19, 0.22] | 20 | 1e-16
median | 0 | 4 | 1 | 16/64 | 4 700 [-3 000, 12 000] | 0.072 [-0.047, 0.19] | 20 | 0.22
median | + | 4 | 2048 | 16/64 | 370 000 [320 000, 420 000] | 0.24 [0.21, 0.27] | 20 | 6.6e-12
median | + | 1 | 1 | 64/256 | 6 500 [4 000, 9 000] | 0.094 [0.057, 0.13] | 20 | 4.1e-05
median | - | 1 | 2048 | 64/256 | -75 000 [-94 000, -57 000] | -0.045 [-0.056, -0.034] | 20 | 9.9e-08
median | 0 | 4 | 1 | 64/256 | 630 [-2 200, 3 500] | 0.0097 [-0.034, 0.054] | 20 | 0.65
median | - | 4 | 2048 | 64/256 | -90 000 [-120 000, -60 000] | -0.058 [-0.078, -0.039] | 20 | 6.6e-06

Table A.16: Full Quantile Regression results of Delivery Failure Rate against log processor count (Section 2.3.7). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.
Columns: statistic | effect sign | Cpus Per Node | Num Simels Per Cpu | Num Processes | absolute effect size [95% CI] | relative effect size [95% CI] | n | p.
median | NaN | 1 | 1 | 16/64/256 | 0 [nan, nan] | nan [nan, nan] | 30 | nan
median | NaN | 1 | 2048 | 16/64/256 | 0 [nan, nan] | nan [nan, nan] | 30 | nan
median | + | 4 | 1 | 16/64/256 | 0.019 [0.0076, 0.029] | 0.34 [0.14, 0.54] | 30 | 0.0016
median | NaN | 4 | 2048 | 16/64/256 | 0 [nan, nan] | nan [nan, nan] | 30 | nan
median | NaN | 1 | 1 | 16/64 | 0 [nan, nan] | nan [nan, nan] | 20 | nan
median | NaN | 1 | 2048 | 16/64 | 0 [nan, nan] | nan [nan, nan] | 20 | nan
median | 0 | 4 | 1 | 16/64 | 0.026 [-0.023, 0.074] | 0.47 [-0.41, 1.4] | 20 | 0.28
median | NaN | 4 | 2048 | 16/64 | 0 [nan, nan] | nan [nan, nan] | 20 | nan
median | NaN | 1 | 1 | 64/256 | 0 [nan, nan] | nan [nan, nan] | 20 | nan
median | NaN | 1 | 2048 | 64/256 | 0 [nan, nan] | nan [nan, nan] | 20 | nan
median | + | 4 | 1 | 64/256 | 0.018 [0.0038, 0.032] | 0.33 [0.069, 0.59] | 20 | 0.016
median | NaN | 4 | 2048 | 64/256 | 0 [nan, nan] | nan [nan, nan] | 20 | nan

A.2 Computation vs. Communication

This section provides full results from computation vs. communication experiments discussed in Section 2.3.3.
Figure A.20: Distribution of Latency Walltime Inlet (ns) for individual snapshot measurements for computation vs. communication experiment (Section 2.3.3). Lower is better. Panels show the distribution for each snapshot without and with outliers.

Figure A.21: Distribution of Latency Simsteps Outlet for individual snapshot measurements for computation vs. communication experiment (Section 2.3.3). Lower is better. Panels show the distribution for each snapshot without and with outliers.

Figure A.22: Distribution of Latency Walltime Outlet (ns) for individual snapshot measurements for computation vs. communication experiment (Section 2.3.3). Lower is better. Panels show the distribution for each snapshot without and with outliers.

Figure A.23: Distribution of Delivery Clumpiness for individual snapshot measurements for computation vs. communication experiment (Section 2.3.3). Lower is better. Panels show the distribution for each snapshot without and with outliers.

Figure A.24: Distribution of Simstep Period Inlet (ns) for individual snapshot measurements for computation vs. communication experiment (Section 2.3.3). Lower is better. Panels show the distribution for each snapshot without and with outliers.

Figure A.25: Distribution of Latency Simsteps Inlet for individual snapshot measurements for computation vs. communication experiment (Section 2.3.3). Lower is better. Panels show the distribution for each snapshot without and with outliers.

Figure A.26: Distribution of Simstep Period Outlet (ns) for individual snapshot measurements for computation vs. communication experiment (Section 2.3.3). Lower is better. Panels show the distribution for each snapshot without and with outliers.

Figure A.27: Distribution of Delivery Failure Rate for individual snapshot measurements for computation vs. communication experiment (Section 2.3.3). Lower is better. Panels show the distribution for each snapshot without and with outliers.
166 Ordinary Least Squares Regression Log Latency Walltime Inlet (ns) 20 18 Estimated Statistic = Latency Walltime Inlet (ns) Mean 16 0.04 Absolute Effect Size 0.02 14 0.00 12 −0.02 0 1 2 3 4 −0.04 Log Compute Work (a) Ordinary least squares regression plot. Observa-(b) Estimated regression coefficient for ordinary least tions are means per replicate. squares regression. Zero corresponds to no effect. Quantile Regression Log Latency Walltime Inlet (ns) 20 18 Estimated Statistic = Latency Walltime Inlet (ns) Median 16 40 Absolute Effect Size 14 30 20 12 10 0 1 2 3 4 Log Compute Work 0 (c) Quantile regression plot. Observations are medians(d) Estimated regression coefficient for quantile regres- per replicate. sion. Zero corresponds to no effect. Figure A.28: Regressions of Latency Walltime Inlet (ns) against log computational intensity for computation vs. communication experiment (Section 2.3.3). Lower is better. Ordinary least squares regression (top row) estimates relationship between dependent variable and mean of response variable. Quantile regression (bottom row) estimates relationship between independent variable and median of response variable. Error bands and bars are 95% confidence intervals. 167 Ordinary Least Squares Regression Log Latency Simsteps Outlet 4 Estimated Statistic = Latency Simsteps Outlet Mean 3 0.04 2 Absolute Effect Size 0.02 1 0.00 0 −0.02 0 1 2 3 4 −0.04 Log Compute Work (a) Ordinary least squares regression plot. Observa-(b) Estimated regression coefficient for ordinary least tions are means per replicate. squares regression. Zero corresponds to no effect. Quantile Regression Log Latency Simsteps Outlet 4 3 Estimated Statistic1e−6 = Latency Simsteps Outlet Median 1.5 2 1.0 Absolute Effect Size 1 0.5 0.0 0 −0.5 0 1 2 3 4 −1.0 Log Compute Work −1.5 (c) Quantile regression plot. Observations are medians(d) Estimated regression coefficient for quantile regres- per replicate. sion. Zero corresponds to no effect. Figure A.29: Regressions of Latency Simsteps Outlet against log computational intensity for computation vs. communication experiment (Section 2.3.3). Lower is better. Ordinary least squares regression (top row) estimates relationship between dependent variable and mean of response variable. Quantile regression (bottom row) estimates relationship between independent variable and median of response variable. Error bands and bars are 95% confidence intervals. 168 Ordinary Least Squares Regression Log Latency Walltime Outlet (ns) 20 18 Estimated Statistic = Latency Walltime Outlet (ns) Mean 16 0.04 Absolute Effect Size 0.02 14 0.00 12 −0.02 0 1 2 3 4 −0.04 Log Compute Work (a) Ordinary least squares regression plot. Observa-(b) Estimated regression coefficient for ordinary least tions are means per replicate. squares regression. Zero corresponds to no effect. Quantile Regression Log Latency Walltime Outlet (ns) 20 18 Estimated Statistic = Latency Walltime Outlet (ns) Median 16 40 Absolute Effect Size 14 30 20 12 10 0 1 2 3 4 Log Compute Work 0 (c) Quantile regression plot. Observations are medians(d) Estimated regression coefficient for quantile regres- per replicate. sion. Zero corresponds to no effect. Figure A.30: Regressions of Latency Walltime Outlet (ns) against log computational intensity for computa- tion vs. communication experiment (Section 2.3.3). Lower is better. Ordinary least squares regression (top row) estimates relationship between dependent variable and mean of response variable. 
Quantile regression (bottom row) estimates relationship between independent variable and median of response variable. Error bands and bars are 95% confidence intervals. 169 Ordinary Least Squares Regression 1.0 Delivery Clumpiness Estimated Statistic = Delivery Clumpiness Mean 0.00 0.8 −0.05 0.6 Absolute Effect Size −0.10 0.4 −0.15 0.2 −0.20 0.0 0 1 2 3 4 −0.25 Log Compute Work (a) Ordinary least squares regression plot. Observa-(b) Estimated regression coefficient for ordinary least tions are means per replicate. squares regression. Zero corresponds to no effect. Quantile Regression 1.0 Delivery Clumpiness Estimated Statistic = Delivery Clumpiness Median 0.8 0.00 0.6 −0.05 Absolute Effect Size 0.4 −0.10 0.2 −0.15 0.0 −0.20 −0.25 0 1 2 3 4 Log Compute Work −0.30 (c) Quantile regression plot. Observations are medians(d) Estimated regression coefficient for quantile regres- per replicate. sion. Zero corresponds to no effect. Figure A.31: Regressions of Delivery Clumpiness against log computational intensity for computation vs. communication experiment (Section 2.3.3). Lower is better. Ordinary least squares regression (top row) esti- mates relationship between dependent variable and mean of response variable. Quantile regression (bottom row) estimates relationship between independent variable and median of response variable. Error bands and bars are 95% confidence intervals. 170 Ordinary Least Squares Regression Log Simstep Period Inlet (ns) 20.0 17.5 Estimated Statistic = Simstep Period Inlet (ns) Mean 15.0 35 30 Absolute Effect Size 12.5 25 20 10.0 15 7.5 10 0 1 2 3 4 5 Log Compute Work 0 (a) Ordinary least squares regression plot. Observa-(b) Estimated regression coefficient for ordinary least tions are means per replicate. squares regression. Zero corresponds to no effect. Quantile Regression Log Simstep Period Inlet (ns) 20.0 17.5 Estimated Statistic = Simstep Period Inlet (ns) Median 15.0 30 12.5 25 Absolute Effect Size 20 10.0 15 7.5 10 0 1 2 3 4 5 Log Compute Work 0 (c) Quantile regression plot. Observations are medians(d) Estimated regression coefficient for quantile regres- per replicate. sion. Zero corresponds to no effect. Figure A.32: Regressions of Simstep Period Inlet (ns) against log computational intensity for computation vs. communication experiment (Section 2.3.3). Lower is better. Ordinary least squares regression (top row) estimates relationship between dependent variable and mean of response variable. Quantile regression (bottom row) estimates relationship between independent variable and median of response variable. Error bands and bars are 95% confidence intervals. 171 Ordinary Least Squares Regression Log Latency Simsteps Inlet 4 Estimated Statistic = Latency Simsteps Inlet Mean 3 0.04 2 Absolute Effect Size 0.02 1 0.00 0 −0.02 0 1 2 3 4 −0.04 Log Compute Work (a) Ordinary least squares regression plot. Observa-(b) Estimated regression coefficient for ordinary least tions are means per replicate. squares regression. Zero corresponds to no effect. Quantile Regression Log Latency Simsteps Inlet 4 3 Estimated Statistic = Latency Simsteps Inlet Median 1e−6 1.5 2 1.0 Absolute Effect Size 1 0.5 0.0 0 −0.5 0 1 2 3 4 −1.0 Log Compute Work −1.5 (c) Quantile regression plot. Observations are medians(d) Estimated regression coefficient for quantile regres- per replicate. sion. Zero corresponds to no effect. Figure A.33: Regressions of Latency Simsteps Inlet against log computational intensity for computation vs. communication experiment (Section 2.3.3). Lower is better. 
Figure A.34: Regressions of Simstep Period Outlet (ns) against log computational intensity for computation vs. communication experiment (Section 2.3.3). Lower is better. Panel layout and interpretation as in Figure A.28.

Figure A.35: Regressions of Delivery Failure Rate against log computational intensity for computation vs. communication experiment (Section 2.3.3). Lower is better. Panel layout and interpretation as in Figure A.28.
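The tables that follow summarize these regressions numerically: for each quality of service metric, a signed significance call, absolute and relative effect sizes with 95% confidence intervals, the sample size n, and a p value. A plausible reading (an assumption for illustration, not a confirmed detail of the original pipeline) is that each row is read off a fitted regression, with the sign set to + or − only when the coefficient is significant at p < 0.05 and 0 otherwise, as sketched below on hypothetical data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical per-replicate data: clumpiness declining with compute work.
rng = np.random.default_rng(1)
df = pd.DataFrame({"log_compute_work": np.tile(np.arange(5.0), 10)})
df["delivery_clumpiness"] = (
    0.8 - 0.05 * df["log_compute_work"] + rng.normal(0.0, 0.02, len(df))
)

fit = smf.ols("delivery_clumpiness ~ log_compute_work", data=df).fit()

coef = fit.params["log_compute_work"]                      # absolute effect size
lo, hi = fit.conf_int(alpha=0.05).loc["log_compute_work"]  # 95% CI bounds
p = fit.pvalues["log_compute_work"]

# Effect sign as tabulated: +/- when significant at p < 0.05, else 0.
sign = "0" if p >= 0.05 else ("+" if coef > 0 else "-")
print(f"{coef:.3g} [{lo:.3g}, {hi:.3g}] n={len(df)} p={p:.2g} sign={sign}")
```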
Table A.17: Full Ordinary Least Squares Regression results of quality of service metrics against log computational intensity for computation vs. communication experiment (Section 2.3.3). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.

Metric | Statistic | Effect Sign | Num Nodes | Num Processes | Num Simels per CPU | Absolute Effect Size [95% CI] | Relative Effect Size [95% CI] | n | p
Latency Walltime Inlet (ns) | mean | NaN | 1 | 1 | 2 | inf [nan, nan] | inf [nan, nan] | 50 | nan
Latency Walltime Outlet (ns) | mean | NaN | 1 | 1 | 2 | inf [nan, nan] | inf [nan, nan] | 50 | nan
Latency Simsteps Inlet | mean | NaN | 1 | 1 | 2 | inf [nan, nan] | inf [nan, nan] | 50 | nan
Latency Simsteps Outlet | mean | NaN | 1 | 1 | 2 | inf [nan, nan] | inf [nan, nan] | 50 | nan
Delivery Failure Rate | mean | NaN | 1 | 1 | 2 | 0 [0, 0] | nan [nan, nan] | 50 | nan
Delivery Clumpiness | mean | - | 1 | 1 | 2 | -0.25 [-0.28, -0.23] | -0.26 [-0.29, -0.24] | 50 | 2.8e-28
Simstep Period Inlet (ns) | mean | + | 1 | 1 | 2 | 37 [36, 37] | 0.0025 [0.0024, 0.0026] | 50 | 7.1e-55
Simstep Period Outlet (ns) | mean | + | 1 | 1 | 2 | 36 [35, 37] | 0.0025 [0.0024, 0.0025] | 50 | 1.1e-54

Table A.18: Full Quantile Regression results of quality of service metrics against log computational intensity for computation vs. communication experiment (Section 2.3.3). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.

Metric | Statistic | Effect Sign | Num Nodes | Num Processes | Num Simels per CPU | Absolute Effect Size [95% CI] | Relative Effect Size [95% CI] | n | p
Latency Walltime Inlet (ns) | median | + | 1 | 1 | 2 | 45 [45, 45] | 7.2e-05 [7.2e-05, 7.2e-05] | 50 | 5.8e-107
Latency Walltime Outlet (ns) | median | + | 1 | 1 | 2 | 45 [45, 45] | 7.2e-05 [7.2e-05, 7.2e-05] | 50 | 1.8e-103
Latency Simsteps Inlet | median | 0 | 1 | 1 | 2 | 6.1e-08 [-1.5e-06, 1.6e-06] | 1.5e-09 [-3.5e-08, 3.8e-08] | 50 | 0.94
Latency Simsteps Outlet | median | 0 | 1 | 1 | 2 | 6.1e-08 [-1.4e-06, 1.6e-06] | 1.4e-09 [-3.5e-08, 3.7e-08] | 50 | 0.94
Delivery Failure Rate | median | NaN | 1 | 1 | 2 | 0 [nan, nan] | nan [nan, nan] | 50 | nan
Delivery Clumpiness | median | - | 1 | 1 | 2 | -0.26 [-0.31, -0.22] | -0.27 [-0.32, -0.23] | 50 | 1.5e-16
Simstep Period Inlet (ns) | median | + | 1 | 1 | 2 | 30 [30, 30] | 0.0021 [0.0021, 0.0021] | 50 | 1.9e-139
Simstep Period Outlet (ns) | median | + | 1 | 1 | 2 | 30 [30, 30] | 0.0021 [0.0021, 0.0021] | 50 | 1.1e-143

A.3 Intranode vs Internode

This section provides full results from intranode vs. internode experiments discussed in Section 2.3.4.

Figure A.36: Distribution of Latency Walltime Inlet (ns) for individual snapshot measurements for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panels: (a) without outliers; (b) with outliers; x-axis: 0 = intranode, 1 = internode.

Figure A.37: Distribution of Latency Simsteps Outlet for individual snapshot measurements for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panels as in Figure A.36.
Figure A.38: Distribution of Latency Walltime Outlet (ns) for individual snapshot measurements for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panels as in Figure A.36.

Figure A.39: Distribution of Delivery Clumpiness for individual snapshot measurements for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panels as in Figure A.36.

Figure A.40: Distribution of Simstep Period Inlet (ns) for individual snapshot measurements for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panels as in Figure A.36.

Figure A.41: Distribution of Latency Simsteps Inlet for individual snapshot measurements for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panels as in Figure A.36.

Figure A.42: Distribution of Simstep Period Outlet (ns) for individual snapshot measurements for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panels as in Figure A.36.

Figure A.43: Distribution of Delivery Failure Rate for individual snapshot measurements for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panels as in Figure A.36.
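The paired panels above show each distribution twice, once with box-plot fliers suppressed and once with them drawn, since rare extreme snapshots would otherwise compress the interquartile detail. A minimal sketch of this with/without-outliers view, assuming seaborn and hypothetical lognormal latency data (the column names are stand-ins), follows.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical snapshot measurements for two treatments (0 = intranode,
# 1 = internode); the heavier lognormal tail stands in for outlier snapshots.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "treatment": np.repeat([0, 1], 500),
    "latency_walltime_inlet_ns": np.concatenate([
        rng.lognormal(10.0, 0.3, 500),  # intranode: lower, tighter latency
        rng.lognormal(12.0, 0.6, 500),  # internode: higher, heavier tail
    ]),
})

fig, (ax_trim, ax_full) = plt.subplots(1, 2, figsize=(8, 4))
# Panel (a): hide outlier points beyond the whiskers.
sns.boxplot(data=df, x="treatment", y="latency_walltime_inlet_ns",
            showfliers=False, ax=ax_trim)
# Panel (b): the same distributions with outliers drawn.
sns.boxplot(data=df, x="treatment", y="latency_walltime_inlet_ns",
            showfliers=True, ax=ax_full)
plt.show()
```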
Figure A.44: Regressions of Latency Walltime Inlet (ns) against categorically coded treatment for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panels: (a) ordinary least squares regression plot, observations are means per replicate; (b) estimated ordinary least squares regression coefficient, where zero corresponds to no effect; (c) quantile regression plot, observations are medians per replicate; (d) estimated quantile regression coefficient, where zero corresponds to no effect. Ordinary least squares regression (top row) estimates the relationship between the categorical independent variable and the mean of the response variable; quantile regression (bottom row) estimates the relationship between the categorical independent variable and the median of the response variable. Error bands and bars are 95% confidence intervals.

Figure A.45: Regressions of Latency Simsteps Outlet against categorically coded treatment for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panel layout and interpretation as in Figure A.44.

Figure A.46: Regressions of Latency Walltime Outlet (ns) against categorically coded treatment for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panel layout and interpretation as in Figure A.44.
Figure A.47: Regressions of Delivery Clumpiness against categorically coded treatment for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panel layout and interpretation as in Figure A.44.

Figure A.48: Regressions of Simstep Period Inlet (ns) against categorically coded treatment for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panel layout and interpretation as in Figure A.44.

Figure A.49: Regressions of Latency Simsteps Inlet against categorically coded treatment for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panel layout and interpretation as in Figure A.44.
Figure A.50: Regressions of Simstep Period Outlet (ns) against categorically coded treatment for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panel layout and interpretation as in Figure A.44.

Figure A.51: Regressions of Delivery Failure Rate against categorically coded treatment for intranode vs. internode experiment (Section 2.3.4). Lower is better. Panel layout and interpretation as in Figure A.44.
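In Figures A.44 through A.51, the treatment enters the regression as a dummy-coded 0/1 variable, so the fitted slope is simply the estimated intranode-to-internode contrast. The sketch below illustrates this with statsmodels on hypothetical per-replicate data; the variable names and magnitudes are stand-ins.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical per-replicate means with a dummy-coded treatment:
# 0 = intranode, 1 = internode, as in the plots above.
rng = np.random.default_rng(1)
df = pd.DataFrame({"internode": np.repeat([0, 1], 10)})
df["latency_walltime_inlet_ns"] = (
    20_000 + 550_000 * df["internode"] + rng.normal(0, 30_000, len(df))
)

# With a single 0/1 regressor, the OLS slope is the difference between
# treatment group means; quantile regression at q=0.5 gives the
# analogous contrast between group medians.
ols_fit = smf.ols("latency_walltime_inlet_ns ~ internode", data=df).fit()
quant_fit = smf.quantreg(
    "latency_walltime_inlet_ns ~ internode", data=df
).fit(q=0.5)

print(ols_fit.params["internode"], quant_fit.params["internode"])
```

This is why the coefficient panels in these figures read directly as treatment effects: zero means no estimated difference between the two conditions.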
Table A.19: Full Ordinary Least Squares Regression results of quality of service metrics against log processor count for weak scaling experiment (Section 2.3.6). Listed results include both piecewise and complete regression. Ordinary least squares regression estimates the relationship between the independent variable and the mean of the response variable; quantile regression estimates the relationship between the independent variable and the median of the response variable. Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.

Metric | Statistic | Effect Sign | Num Nodes | Num Processes | Num Simels per CPU | Absolute Effect Size [95% CI] | Relative Effect Size [95% CI] | n | p
Latency Walltime Inlet (ns) | mean | + | 1-2 | 1 | 2 | 600 000 [530 000, 660 000] | 77 [68, 85] | 20 | 2.1e-13
Latency Walltime Outlet (ns) | mean | + | 1-2 | 1 | 2 | 590 000 [530 000, 650 000] | 77 [68, 85] | 20 | 1.1e-13
Latency Simsteps Inlet | mean | + | 1-2 | 1 | 2 | 41 [36, 45] | 41 [36, 45] | 20 | 1.9e-13
Latency Simsteps Outlet | mean | + | 1-2 | 1 | 2 | 40 [36, 45] | 40 [36, 45] | 20 | 1e-13
Delivery Failure Rate | mean | - | 1-2 | 1 | 2 | -0.33 [-0.35, -0.31] | -1 [-1.1, -0.94] | 20 | 4e-18
Delivery Clumpiness | mean | + | 1-2 | 1 | 2 | 0.94 [0.93, 0.96] | 68 [67, 69] | 20 | 2.6e-30
Simstep Period Inlet (ns) | mean | + | 1-2 | 1 | 2 | 5 500 [5 100, 5 900] | 0.6 [0.56, 0.65] | 20 | 1.3e-16
Simstep Period Outlet (ns) | mean | + | 1-2 | 1 | 2 | 5 400 [5 100, 5 800] | 0.6 [0.56, 0.64] | 20 | 7.1e-17

Table A.20: Full Quantile Regression results of quality of service metrics against log processor count for weak scaling experiment (Section 2.3.6). Listed results include both piecewise and complete regression. Ordinary least squares regression estimates the relationship between the independent variable and the mean of the response variable; quantile regression estimates the relationship between the independent variable and the median of the response variable. Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.

Metric | Statistic | Effect Sign | Num Nodes | Num Processes | Num Simels per CPU | Absolute Effect Size [95% CI] | Relative Effect Size [95% CI] | n | p
Latency Walltime Inlet (ns) | median | + | 1-2 | 1 | 2 | 550 000 [530 000, 560 000] | 78 [76, 81] | 20 | 2.6e-23
Latency Walltime Outlet (ns) | median | + | 1-2 | 1 | 2 | 540 000 [530 000, 560 000] | 78 [76, 80] | 20 | 1.5e-24
Latency Simsteps Inlet | median | + | 1-2 | 1 | 2 | 37 [35, 39] | 38 [36, 39] | 20 | 2.1e-19
Latency Simsteps Outlet | median | + | 1-2 | 1 | 2 | 37 [35, 39] | 37 [35, 39] | 20 | 6.7e-19
Delivery Failure Rate | median | - | 1-2 | 1 | 2 | -0.32 [-0.32, -0.32] | -1 [-1, -0.99] | 20 | 5.7e-39
Delivery Clumpiness | median | + | 1-2 | 1 | 2 | 0.96 [0.95, 0.97] | 610 [600, 620] | 20 | 4.6e-31
Simstep Period Inlet (ns) | median | + | 1-2 | 1 | 2 | 5 600 [5 200, 6 000] | 0.62 [0.58, 0.66] | 20 | 1.3e-16
Simstep Period Outlet (ns) | median | + | 1-2 | 1 | 2 | 5 500 [5 200, 5 800] | 0.61 [0.58, 0.65] | 20 | 3.7e-18
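Tables A.19 and A.20 list both piecewise and complete regressions. One plausible mechanic, sketched below under the assumption that the piecewise fits simply split the processor-count range (the split point and data here are hypothetical), is to run the same regression on each subrange and on the full range.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical weak-scaling observations over log processor count.
rng = np.random.default_rng(1)
df = pd.DataFrame({"log_cpus": np.repeat(np.arange(7.0), 5)})
df["simstep_period_ns"] = (
    5_000 + 600 * df["log_cpus"] + rng.normal(0, 200, len(df))
)

# Complete regression over the full range...
complete = smf.ols("simstep_period_ns ~ log_cpus", data=df).fit()

# ...and piecewise regressions over subranges, e.g., splitting where
# scaling crosses from within-node to across-node (threshold assumed).
threshold = 3.0
lower = smf.ols("simstep_period_ns ~ log_cpus",
                data=df[df["log_cpus"] <= threshold]).fit()
upper = smf.ols("simstep_period_ns ~ log_cpus",
                data=df[df["log_cpus"] >= threshold]).fit()

for name, fit in [("complete", complete), ("lower", lower), ("upper", upper)]:
    print(name, fit.params["log_cpus"])
```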
A.4 Multithreading vs Multiprocessing

This section provides full results from multithreading vs. multiprocessing experiments discussed in Section 2.3.5.

Figure A.52: Distribution of Latency Walltime Inlet (ns) for individual snapshot measurements for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panels: (a) without outliers; (b) with outliers; x-axis: 0 = multithreading, 1 = multiprocessing.

Figure A.53: Distribution of Latency Simsteps Outlet for individual snapshot measurements for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panels as in Figure A.52.

Figure A.54: Distribution of Latency Walltime Outlet (ns) for individual snapshot measurements for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panels as in Figure A.52.

Figure A.55: Distribution of Delivery Clumpiness for individual snapshot measurements for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panels as in Figure A.52.

Figure A.56: Distribution of Simstep Period Inlet (ns) for individual snapshot measurements for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panels as in Figure A.52.

Figure A.57: Distribution of Latency Simsteps Inlet for individual snapshot measurements for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panels as in Figure A.52.

Figure A.58: Distribution of Simstep Period Outlet (ns) for individual snapshot measurements for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panels as in Figure A.52.

Figure A.59: Distribution of Delivery Failure Rate for individual snapshot measurements for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panels as in Figure A.52.
Figure A.60: Regressions of Latency Walltime Inlet (ns) against categorically coded treatment for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panel layout and interpretation as in Figure A.44; here, 0 = multithreading and 1 = multiprocessing.

Figure A.61: Regressions of Latency Simsteps Outlet against categorically coded treatment for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panel layout and interpretation as in Figure A.60.

Figure A.62: Regressions of Latency Walltime Outlet (ns) against categorically coded treatment for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panel layout and interpretation as in Figure A.60.
Figure A.63: Regressions of Delivery Clumpiness against categorically coded treatment for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panel layout and interpretation as in Figure A.60.

Figure A.64: Regressions of Simstep Period Inlet (ns) against categorically coded treatment for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panel layout and interpretation as in Figure A.60.

Figure A.65: Regressions of Latency Simsteps Inlet against categorically coded treatment for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panel layout and interpretation as in Figure A.60.
Figure A.66: Regressions of Simstep Period Outlet (ns) against categorically coded treatment for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panel layout and interpretation as in Figure A.60.

Figure A.67: Regressions of Delivery Failure Rate against categorically coded treatment for multithreading vs. multiprocessing experiment (Section 2.3.5). Lower is better. Panel layout and interpretation as in Figure A.60.
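The tables that follow report absolute effect sizes in each metric's native units alongside relative effect sizes. One plausible normalization, assumed here purely for illustration and not confirmed by the source, divides the fitted coefficient by the model's baseline (intercept) response; under that reading, near-zero baselines would explain the extreme or infinite relative values in some rows.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical per-replicate data: 0 = multithreading, 1 = multiprocessing.
rng = np.random.default_rng(1)
df = pd.DataFrame({"multiprocessing": np.repeat([0, 1], 10)})
df["simstep_period_ns"] = (
    4_600 + 4_500 * df["multiprocessing"] + rng.normal(0, 150, len(df))
)

fit = smf.ols("simstep_period_ns ~ multiprocessing", data=df).fit()

absolute_effect = fit.params["multiprocessing"]
# Assumed normalization: express the effect relative to the baseline
# (intercept) response of the reference treatment.
relative_effect = absolute_effect / fit.params["Intercept"]
print(absolute_effect, relative_effect)
```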
Table A.21: Full Ordinary Least Squares Regression results of quality of service metrics against categorically coded treatment for multithreading vs. multiprocessing experiment (Section 2.3.5). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.

Metric | Statistic | Effect Sign | Num Nodes | Num Processes | Num Simels per CPU | Absolute Effect Size [95% CI] | Relative Effect Size [95% CI] | n | p
Latency Walltime Inlet (ns) | mean | 0 | 2 | 1 | 1/2 | -440 000 [-1e+06, 170 000] | -0.98 [-2.3, 0.37] | 20 | 0.15
Latency Walltime Outlet (ns) | mean | 0 | 2 | 1 | 1/2 | -450 000 [-1.1e+06, 160 000] | -0.98 [-2.3, 0.35] | 20 | 0.14
Latency Simsteps Inlet | mean | 0 | 2 | 1 | 1/2 | -76 [-180, 28] | -0.99 [-2.3, 0.36] | 20 | 0.14
Latency Simsteps Outlet | mean | 0 | 2 | 1 | 1/2 | -77 [-180, 27] | -0.99 [-2.3, 0.34] | 20 | 0.14
Delivery Failure Rate | mean | + | 2 | 1 | 1/2 | 0.38 [0.33, 0.44] | -1.4e+06 [-1.2e+06, -1.5e+06] | 20 | 6e-12
Delivery Clumpiness | mean | - | 2 | 1 | 1/2 | -0.53 [-0.62, -0.44] | -0.94 [-1.1, -0.78] | 20 | 5.8e-10
Simstep Period Inlet (ns) | mean | + | 2 | 1 | 1/2 | 4 500 [4 200, 4 700] | 0.97 [0.91, 1] | 20 | 1.5e-18
Simstep Period Outlet (ns) | mean | + | 2 | 1 | 1/2 | 4 400 [4 100, 4 700] | 0.95 [0.89, 1] | 20 | 7.4e-17

Table A.22: Full Quantile Regression results of quality of service metrics against categorically coded treatment for multithreading vs. multiprocessing experiment (Section 2.3.5). Significance level p < 0.05 used. Inf or NaN values may occur due to multicollinearity or due to inf or NaN observations.

Metric | Statistic | Effect Sign | Num Nodes | Num Processes | Num Simels per CPU | Absolute Effect Size [95% CI] | Relative Effect Size [95% CI] | n | p
Latency Walltime Inlet (ns) | median | 0 | 2 | 1 | 1/2 | 2 700 [-180, 5 600] | 0.51 [-0.034, 1.1] | 20 | 0.064
Latency Walltime Outlet (ns) | median | 0 | 2 | 1 | 1/2 | 2 500 [-350, 5 400] | 0.47 [-0.064, 1] | 20 | 0.081
Latency Simsteps Inlet | median | 0 | 2 | 1 | 1/2 | -0.29 [-0.69, 0.12] | -0.25 [-0.6, 0.1] | 20 | 0.15
Latency Simsteps Outlet | median | 0 | 2 | 1 | 1/2 | -0.3 [-0.72, 0.11] | -0.26 [-0.62, 0.099] | 20 | 0.14
Delivery Failure Rate | median | + | 2 | 1 | 1/2 | 0.38 [0.37, 0.38] | inf [inf, inf] | 20 | 2e-27
Delivery Clumpiness | median | - | 2 | 1 | 1/2 | -0.53 [-0.59, -0.47] | -0.97 [-1.1, -0.87] | 20 | 2.6e-13
Simstep Period Inlet (ns) | median | + | 2 | 1 | 1/2 | 4 500 [4 000, 4 900] | 0.96 [0.87, 1.1] | 20 | 2.3e-14
Simstep Period Outlet (ns) | median | + | 2 | 1 | 1/2 | 4 400 [4 000, 4 800] | 0.94 [0.86, 1] | 20 | 1.3e-14

A.5 With lac-417 vs. Sans lac-417

This section provides full results from faulty hardware experiments discussed in Section 2.3.7.

Figure A.68: Distribution of Latency Walltime Inlet (ns) for individual snapshot measurements for faulty hardware experiment (Section 2.3.7). Lower is better. Panels: (a) without outliers; (b) with outliers; x-axis: 0 = with lac-417, 1 = sans lac-417.
Figure A.69: Distribution of Latency Simsteps Outlet for individual snapshot measurements for faulty hardware experiment (Section 2.3.7). Lower is better. Panels as in Figure A.68.

Figure A.70: Distribution of Latency Walltime Outlet (ns) for individual snapshot measurements for faulty hardware experiment (Section 2.3.7). Lower is better. Panels as in Figure A.68.

Figure A.71: Distribution of Delivery Clumpiness for individual snapshot measurements for faulty hardware experiment (Section 2.3.7). Lower is better. Panels as in Figure A.68.

(a) Distribution of Simstep Period Inlet