2.! I: ,3 “nu-.3"? .lv 31: |. 3:... . . hfifig 4" a. . Ira ti. . .— A: .1 .ua..fe.mc, an, {ax firing. min in: x is ad .«mmw 533;. ($1.: a! i 1 LIBRARY Michigan State University This is to certify that the dissertation entitled THE EVOLUTIONARY ORIGINS OF MEMORY USE IN NAVIGATION presented by LAURA M. GRABOWSKI has been accepted towards fulfillment of the requirements for the Ph.D. degree in Computer Science (W Major Wssor’s Signature 5-29-03 Date MSU is an Affirmative Action/Equal Opportunity Employer PLACE IN RETURN BOX to remove this checkout from your record. To AVOID FINES return on or before date due. MAY BE RECALLED with earlier due date if requested. DATE DUE DATE DUE DATE DUE 5/08 KrlProleoc8Pres/CIRCIDateDuo.indd THE EVOLUTIONARY ORIGINS OF MEMORY USE IN NAVIGATION By Laura M. Grabowski A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Computer Science 2009 ABSTRACT THE EVOLUTIONARY ORIGINS OF MEMORY USE IN NAVIGATION By Laura M. Grabowski Many organisms are able to cope with environmental variations through an array of innate behaviors. When environments change too quickly for evolutionary processes, or when the variation is more unique to a particular place, time, or individual organism’s experiences, innate responses may need to be augmented by more flexible responses that use individual life experience. In such situations, memory becomes necessary. Generally, memory refers to the stored record of experience, and is of great interest across many disciplines. I focus this study on the question of how evolution produces complex mem- ory structures, by examining the interaction of environment and memory during evolution in an environment where navigation is an important behavior. Digital organisms evolve to navigate within their environments, based on different sensory cues. The different environments present differing problems that require different uses of memory. The simplest environment is an idealized version of the chemical attractant gradients in the environments of bacteria such as Escherichia. 0012'. These experi- ments demonstrated the evolution of both a chemotaxis—like response (2'. e., organisms evolved to move up the virtual gradient and approach the highest concentration), and a rudimentary one—timestep memory. Inspired by maze-learning experiments with bees, in the next suite of experi- ments, the digital organisms evolved in environments with “paths,” formed by sen- sory cues in the environment. I used several different types of paths, and each path type placed different demands for memory use on the evolving organisms. The re- sults of the first group of path-following experiments demonstrated the evolution of “reflex” actions, where organisms responded in a fixed way, but differentially, to the environmental cues. For the second suite of experiments, I focused on evolving a one-bit individual memory. One experiment evolved a one-bit life-long memory, i.e., remembering a binary decision, in the form of which direction to turn in the current environment. This meant that an organism needed to store and refer to a single individual experience in future decision—making, but the content of the information did not change during the individual’s lifetime. In the next set of experiments, I focused on evolving a volatile, “short-term“ individual memory. In these environ- ments, new experiences that influenced future decisions happened at unpredictable times, requiring the the stored experiences to be updated frequently. The results of the experiments demonstrate that robust, flexible behavior can evolve even in simple environments. Organisms that evolved in each of the ex- perimental environments exhibited clever and flexible behavior that demonstrated simple behavioral intelligence, revealing capacities such as gathering and differential use of environmental information, and the ability to use prior individual experience to guide future actions. Copyright by LAURA M. GRABOWSKI 2009 DEDICATION To my family, who supported me always, and endured so much: my husband Tom Grabowski, daughters Caitlin and Miranda, my parents Pat and Lee Miesle, my sisters Julie Grossman and Paula May. ACKNOWLEDGMENTS I gratefully acknowledge the following people for their contributions and assistance with this work. First, many thanks to my outstanding committee, co-advisors Dr. Charles Ofria and Dr. Robert T. Pennock, and committee members Dr. Fred. C. Dyer and Dr. Philip K. McKinley, for their mentoring, support, encouragement, and tolerance. I am deeply grateful for all that I have learned from each of you, and I feel privileged to have the benefit of your help and advice. I owe much to my colleagues in the MSU Digital Evolution Laboratory. I would like to thank several members of the lab who made significant contributions to this work. David M. Bryson had the initial inspiration for the “dream grid” that stream- lined my work in the project, and also wrote the implementation. Dr. Wesley R. Elsberry provided much assistance in many areas, from ideas, to proof-reading, to IA'I‘E‘X debugging, to moral and social support. Brian Baer helped me repeatedly with all sorts of technical issues, even when I should have been able to fix them myself. Benjamin E. Beckmann and David B. Knoester helped with early exper- imental setups for this work, and were always ready with helpful suggestions and new ideas for improvement. Heather J. Goldsby graciously shared her presentation format with me for my defense. Dr. Chris Strelioff helped me with some pesky LATEX problems. Thanks also to all the lab members for lots of ideas and hard questions along the way. I am fortunate to have many supportive friends, both personal and professional. Several of them deserve special recognition for discussions and ideas as this work took shape. Drs. Katy and Dirk Luchini Colbry provided both technical and moral support, and some excellent dinners. Dr. Diane Blackwood helped with statistics, and lots of camaraderie on weekends. Pamela Rose Fox and Dr. Ann Kathleen Shea helped by always believing that I could succeed at this. Thanks to each of you, and all those whom I neglected to include here, for giving me the tools and the fortitude to complete this work. vi TABLE OF CONTENTS List of Tables .................................. ix List of Figures .................................. xi 1 Introduction and Background 1 1.1 Introduction ................................ 1 1.1.1 Motivations ............................ 1 1.1.2 Central Issues and Questions ................... 4 1.2 Background ................................ 7 1.2.1 Memory and Learning ...................... 7 1.2.2 Memory and Learning in Insect Navigation ........... 12 1.2.3 Computational Approaches to Memory, Learning, and Evolution 16 1.2.4 Avida: Overview ......................... 19 2 Evolving Rudimentary Memory 24 2.1 Evolving Intelligent Tactic Response and a One-timestep Memory . . 26 2.1.1 Orientation and Movement in Avida .............. 26 2.1.2 Experiments and Results ..................... 27 2.2 Further Experiments: Robustness to Sensor Noise ........... 35 2.2.1 Methods .............................. 35 2.2.2 Results ............................... 36 2.3 Conclusions ................................ 42 3 Evolving Precursors to Memory: Reflex Behaviors 43 3.1 The Avida State Grid .......................... 44 3.1.1 State grid paths .......................... 45 3.1.2 Path traversal task ........................ 46 3.1.3 State grid instructions ...................... 47 3.2 Experiments ................................ 49 3.2.1 State Grids and Experiment Design ............... 49 3.2.2 Results and Discussion ...................... 51 3.3 Conclusions ................................ 74 4 Evolving One-bit Memory 75 4.1 Evolving One-bit Lifetime Memory ................... 76 4.1.1 Experiments ............................ 76 4.1.2 Results and Discussion ...................... 78 l 4.2 Evolving One-bit Volatile Memory .................... 86 4.2.1 Experiments ............................ 88 4.2.2 Results and Discussion ...................... 89 4.3 Evolving Complexity ........................... 99 4.3. 1 Experiments ............................ 100 4.3.2 Results and Discussion ...................... 102 4.4 Conclusions ................................ 109 vii 5 Conclusions and Future Work 5.1 Conclusions ................................ 5.2 Future Work ................................ 5.2.1 Extensions of the Current Work ................. 5.2.2 Future Directions ......................... APPENDIX A Annotated Avida Code of Selected Evolved Genomes A.1 Avida Instruction Set for Experiments ................. A.2 Annotated Avida Code of Evolved Organisms ............. A.2.1 Single—direction Paths Example Organism ........... A.2.2 Right-left Turn Paths Example Organism ............ A.2.3 Cue-once Paths Example Organism ............... A.2.4 Irregular Paths Example Organism ............... BIBLIOGRAPHY viii 111 111 112 112 115 122 123 123 127 127 132 136 139 144 2.1 2.2 3.1 3.2 3.3 3.4 3.5 3.6 4.1 4.2 4.3 4.4 LIST OF TABLES New Avida instructions used for experiments. ............. 30 Memory treatments and simulated noise levels tested .......... 35 State values for state grid experiments. The definition of a state grid includes a listing of the state of each cell in the grid. The sg-sense instruction returns the value shown for each state ............ 48 N OP—modified behavior of 2f-grt-X and if-equ-X instructions ...... 48 Instruction set used for experiments. Instructions shown in italics are not part of the default instruction set. See also Appendix Al for explanation of instructions ......................... 49 Avida instructions for example single-direction path organism. . . . . 61 Number of genomic positions required for path following and the re- sulting dynamic plasticity ratios (DPR) for the final dominants from the replicate experiments with the highest average maximum task quality (AMTQ) in single-direction path experiments. Three repli- cates tied for the top tank (2.6., their AMTQ values were the same), and two tied for the next rank. The rank 9 organism evolved as a one-direction “specialist” 2.6., it could function well only on right- turn paths), and so was not considered in the DPR measurements. . 64 Avida instructions for example right-left turn path organism ...... 72 Avida instructions for example cue-once path organism ......... 85 Avida instructions for example irregular path organism ......... 98 Summary of ancestors and environments for transplant experiments. . 100 Transplant experiments, ancestral and evolved task quality. Summary of ancestral average maximum task quality (AMTQ) and evolved AMTQ for surviving organisms in transplant experiments. Ancestral AMTQ was measured at the end of 3 generations with no mutations occurring. Evolved AMTQ was taken at the end of 250,000 updates of evolution. Ancestral AMTQ entries that appear as “—” indicate that the organism was not able to survive in the no-mutations test en- vironment; the corresponding Evolved AMTQ values show that these organisms were able to adapt and survive after mutations in later evolution. ................................. 101 A1 Avida Instruction Set ........................... 124 ix A2 A3 A4 A5 A6 A7 A8 A9 Complete Genome of Example Organism, Single—direction Paths Ex- periments ................................. 129 Genome Detail for Example Organism, Single-direction Paths Exper- iments ................................... 130 Complete Genome of Example Organism, Right—left Turn Paths Ex- periments. ................................. 133 Genome Detail. for Example Organism, Right-left Turn Paths Exper- iments .................................... 135 Complete Genome of Example Organism, Cue—once Paths Experiments. 136 Genome Detail for Example Organism, Cue-once Paths Experiments . 138 Complete Genome of Example Organism, Irregular Paths. ...... 140 Genome Detail for Example Organism, Irregular Paths Experiments . 141 1.1 1.2 1.3 1.4 2.1 2.2 LIST OF FIGURES Images in the dissertation are presented in color. An Avida digital organism. The organism comprises a circular genome, a virtual CPU with three 32-bit registers, two stacks, and four heads (FLOW, IP, READ, WRITE). Numerical values can be input from the environment, and results are output to the environment. (Lenski et al., 2003, p. 139) ............................ Avida facing, torus or bounded grid interior. Valid facings for an Avida organism in a torus or in the interior (2.6., non-edge) cells of a bounded grid. The organism may be oriented toward any of its eight neighbor cells ................................ Avida facing, bounded grid corner. Valid facings for an Avida organ- ism at the corner of a bounded grid. Only three valid facing directions exist in bounded grid corners. ...................... Avida facing, bounded grid edge. Valid facings for an Avida organism at the edge of a bounded grid. Facings toward the grid edge are invalid; the organism’s facing is automatically adjusted so that it never faces off the edge of the grid. ................... Example of distance as an idealized gradient. The “concentration” source location is at the top of the hill. Note that it is not necessary for the peak to be in the center of the environment (from Grabowski et al., 2008) ................................. Average best distance ratio for evolution of taxis experiments using sense-and-remember instruction (both the current and the previous sensed value are given to the organism, but comparisons must be per- formed in addition to this instruction) and sense-now (the organism must remember previously sensed values and perform comparisons in addition to this instruction). The average represents all organisms in the population at each update (total of approximately 1,000,000- 2,000,000 total organisms per run), averaged over all 50 replicates of each treatment (Grabowski et al., 2008). ................ xi 20 22 23 23 28 31 2.3 2.4 2.5 2.6 2.7 3.1 Sample trajectories of evolved organisms using implicit memory (top) with the sense-and-remember instruction (both the current and the previous sensed value are given to the organism), and evolved mem- ory (bottom), using the sense-now instruction (the organism must remember previously sensed values). All rotation uses the tumble instruction, and the organisms must use additional instructions to perform comparisons. The trajectories suggest that the organisms are using the sensed information to track to their target locations. . . Average best distance ratio for tumble-and—move and rotate-right-one- and-move instructions. The figure shows the average best distance ratio over the entire population, for 100 replicates of each instruction set. Experiments ran for 100,000 updates (approx. 10,000-15,000 generations) (from Grabowski et al., 2008). .............. Sample trajectories of organisms from different memory treatments, :I:10% simulated sensor noise. Trajectories shown for the most abun- dant genotype at the end of evolution runs for the replicate popula- tions with the three highest Best Distance Ratios for the two memory treatments (implicit memory, top: both the current and the previous sensed value are given to the organism; and evolved memory, bottom: the organism must remember previously sensed values). All rotation uses the tumble instruction, and the organisms must use additional instructions to perform comparisons. .................. Sample trajectories of organisms from implicit memory treatment, :I:50% and i75% simulated sensor noise. Trajectories shown for the most abundant genotype at the end of evolution runs for the repli- cate populations with the three highest Best Distance Ratios for the two noise treatments (:I:50%, top, and i75%, bottom). All rotation uses the tumble instruction, and the organisms must use additional instructions to perform comparisons. .................. Sample trajectories for maximum sensor noise and random walk or- ganism. Trajectories shown for the most abundant genotype at the end of evolution runs for the replicate populations with the three highest Best Distance Ratios for the random distance value treatment (top). Random walk trajectories were generated with the same hand— coded organism, injected into three different locations. All rotation uses the tumble instruction. ....................... Example state grid “path.” This grid illustrates a single-turn, left— turn—only environment. Blue circles indicate the “nutrient” state, yellow triangles show cells with the “signpost” state, x appears in each cell with the “empty” state, and the organism’s initial location is shown by the green star ......................... xii 32 50 3.2 Average log(f2tness) and average maximum task quality (AMTQ) in single-direction paths experiments. In this case, log( fitness) and task quality track each other closely, and are both good measures of algorithm performance. The curves also show that the average task quality over all 50 final dominant organisms is near 0.6, indicating that, on average, the organisms successfully traversed over half of each path by the end of the evolution run. ............... 53 3.3 Distribution of average maximum task quality (AMTQ), individual single-direction paths. Paths 1 and 2 are right-turn—only paths, Paths 3 and 4 are left—turn—only paths. Although the medians for right-turn and left—turn paths appear different, there is no significant difference in the AMTQ distributions (Mann-Whitney U—test, p > 0.05 for all pairs of paths; see Table ?? for specific 1) values). ........... 54 3.4 Average maximum task quality (AMTQ) and population average from the case study population in single-direction path experiments. The plot shows the AMTQ for each path separately. ............ 55 3.5 Trajectories of an example evolved organism on paths that were ex- perienced during evolution ......................... 57 3.6 Trajectories of an example evolved organism (same organism as shown in Figure 3.5) on novel right-turn path. Figure (a) shows the organ- ism’s trajectory from the original initial position for the path, and Figure (b) shows the trajectory from a new initial position that cre- ated an extended distance to the first turn on the path (7 steps added to the distance to the first turn). .................... 62 3.7 Tfajectories of evolved organism (same organism as shown in Fig— ure 3.5) on novel left-turn path. Figure (a) shows the organism’s tra- jectory starting at the path’s original initial position, and Figure (b) shows the trajectory from a new initial position that shortens the distance to first turn (unused nutrient locations from original path appear for comparison). ......................... 63 3.8 Example right-left turn path. ................... i. . . 66 3.9 Average maximum task quality (AMTQ) and population average task quality for all 50 replicate populations in right-left turn path experi- ments. Task quality shown is averaged over all five paths experienced during evolution. ............................. 67 3.10 Distribution of average maximum task quality, right-left turn exper- iments, individual paths. There is no significant difference in the AMT Q distributions (Kruskal—Wallis Test, p = 0.950) .......... 68 xiii 3.11 3.12 3.13 4.1 4.2 4.3 4.4 4.5 4.6 Average maximum task quality (AMTQ) and population task quality average, top population in right-left turn path experiments. Task quality is shown for each individual path experienced during evolution. 69 Trajectories of example evolved organism from right—left turn experi- ments. ................................... Trajectories of example evolved organism from right-left turn experi- ments on novel paths containing both right and left turns. Figure (a) shows the organism’s trajectory on a path in a 20 x 20 grid, while Figure (b) illustrates the trajectory on a 19 x 17 grid path. The dimensions of both of these grids differ from the dimensions of the evolutionary environments (2.6., 25 X 25). ............... Example state grid from “cue-once” experiments. The state grids for this experiments have the “cue-once” directional cue and “repeat- last” states. The one directional cue, shown by the yellow triangle, signals whether the environment is a right-turn or left—turn environ- ment (here, a left-turn environment). Subsequent turns (brown dia- monds) are signaled by the “repeat-last” state, that gives the same return value in both right and left turn environments, serving as a cue to “repeat the last turn.” ...................... Average maximum task quality (AMTQ) and population average, all 50 replicate populations in cue-once experiments. Task quality shown is averaged over all four paths experienced during evolution. ..... Average maximum task quality (AMTQ) and population average, top performing population in cue-once path experiments, paths shown separately .................................. Distribution of average task quality values, individual cue—once paths. Paths 1 and 2 are right-turn only paths, Paths 3 and 4 are left-turn only paths. There is no significant difference in the AMTQ distribu- tions (Kruskal-Wallis Test, p = 0.805). ................. Trajectories of example evolved organism on paths that were experi- enced during evolution in cue-once experiments. ............ Tfajectories of example evolved organism from cue—once experiments, tested on novel paths: (a) Trajectory on a path using the right/ left turn states at every turn, as in the earlier right-left turn path exper- iments; (b) Trajectory on a path that incorporates right, left, and repeat-last states .............................. xiv 80 82 86 4.7 Example state grid with irregular path. A “new” turn direction (the first turn or a change from the previous turn direction) is cued by the unique right or left turn sense value. Subsequent turns in that same direction (brown diamonds) are signaled by the “repeat-last” state, that gives the same return value when preceded by either a right or left turn, cueing to “repeat the last turn.” ............... 88 4.8 Average maximum task quality (AMTQ), irregular paths. The plot shows the combined AMTQ and the overall population average task quality for all four paths experienced during evolution. ........ 90 4.9 Distribution of average maximum task quality (AMTQ)values, indi- vidual irregular paths. There is no significant difference between the AMTQ distributions (Kruskal-Wallis Test, p = 0.238) .......... 91 4.10 Average maximum task quality (AMTQ) for a single replicate in ir- regular paths environment. The plots shows the combined AMTQ and the population average task quality for each individual path ex- perienced during evolution. This population had the highest AMTQ at the end Of the 250,000 updates of evolution .............. 92 4.11 Trajectories of evolved organism, irregular path experiments. Tra- jectories of example organism on paths that were experienced during evolution in irregular path experiments. In both (a) and (b), the organism stops after encountering one empty cell. ........... 93 4.12 Example trajectory, example evolved organism from irregular path experiments on a novel path ........................ 94 4.13 Average maximum task quality (AMTQ), evolved vs. default ances- tor, right-left turn paths .......................... 104 4.14 Average maximum task quality (AMTQ), evolved 213. default ances- tor, cue—once paths ............................. 105 4.15 Average maximum task quality (AMTQ), evolved vs. default ances- tor, irregular paths ............................. 106 XV Chapter 1 Introduction and Background 1.1 Introduction 1.1.1 Motivations More than fifty years ago, Herbert Simon and Allen Newell created the first artificial intelligence (AI) system, Logic Theorist (Newell & Simon, 1956). At the time, the outlook for AI seemed bright, and many computer scientists optimistically projected that machines with human-level intelligence were no more than a couple of decades away, at worst. More than half a century later, this optimism has dimmed con- siderably. Despite many impressive successes in AI and dramatic improvements in hardware, the ability to create machines with high-level, general purpose intelligence continues to seem decades away, at best. Why has AI been unable to deliver on that early promise? Undoubtedly, there are many contributing factors. One aspect of the difficulty is that human-level intelligence is a highly complex facility, or system of facilities. In discussing Logic Theorist, Newell and Simon (1956, p. 61) identified characteristics of these types of complex systems: 1. There is a large number of different kinds of processes, all of which are important, although not necessarily essential, to the performance of the total system; 2. The uses of the processes are not fixed and invariable, but are highly contingent upon the outcome of previous processes and on information received from the environment; 3. The same processes are used in many different contexts to accom— plish similar functions towards different ends, and this often results in organizations of processes that are hierarchical, iterative, and recursive in nature. In many ways, this description of complexity is a good fit for both artificial and biological systems. The idea of complexity leads to another aspect of the failure of AI. The perspective that drove the first thirty to forty years of AI research was highly anthropocentric, focusing on mimicking human cognition and attempting to recreate aspects of the most complex intelligence known from scratch. This perspec- tive placed much emphasis on higher-order propositional intelligence. Propositional 2ntellz'gence relates to the human capacity for reasoning and abstract thought and enables humans to do things such as solve logic problems and play chess. This also overlooks the fact that, in general, humans are not very good at pure propositional logic, and tend to use a more fuzzy form. This approach tended to overlook other, simpler forms of intelligence. Many species demonstrate what is termed behavioral intelligence, intelligence that relates to effective, successful behavior. The interest in propositional intelligence—knowing that—rather than behavioral intelligence— knowing how—necessitated a top-down approach to designing intelligent machines. There is an obvious problem with using top-down methodology to approach intel- ligence: we still know relatively little about the computations and mechanisms of intelligence. Moreover, this was certainly not the way that biology produced i11- telligence. Evidence from evolutionary biology indicates that, just as the modern human body evolved from earlier physical forms, modern human intelligence evolved from earlier forms of intelligence. Therefore, rather than undertaking the Herculean task of designing AI from the top down, we can instead take a page from nature’s book and explore intelligence from the bottom up, finding the circumstances that permit intelligence to evolve. This bottom-up approach allows us to examine the evolution of simpler, fundamental capabilities that may facilitate the emergence of more complex intelligence. There are many aspects to intelligent behavior. A key capability is memory. Memory is a necessary condition for many kinds of intelligent behavior. Without memory, organisms cannot draw from their own experiences to make decisions. If organisms have no memory, they cannot learn, they cannot reliably return to im- portant places in the environment, and they cannot recognize many features of the world around them. Memory provides a tool that organisms can use to move be— yond the here-and-now, using their record of experience to build predictions of future outcomes. In this study, I investigate how evolution constructs this crucial building block of intelligent behavior. Digital evolution (Adami, Ofria, & Collier, 2000) provides the tools for exploring issues, such as evolving memory, from the bottom up and offers certain advantages over working with living organisms. Studying evolution 2n 321200 is much faster than working with living organisms, environmental conditions can be tightly controlled, and processes of interest can be readily isolated for study. Digital evolution is also transparent, so that we can trace everything that is happening during evolution, without influencing the system. Such an evolutionary approach can provide in- sights that are valuable in both biological and computational research contexts, by illuminating fundamental evolutionary issues and by finding surprising solutions to computational problems. Digital evolution, like natural evolution, may also be able to discover solutions when our intuitions as designers fail. In fact, computational evolution approaches have a track record of improving on existing computational approaches to complex problems, such as automated state diagram construction (Goldsby, Knoester, Cheng, McKinley, 82: Ofria, 2007), used in software require— ments engineering, self-healing (Knoester, McKinley, & Ofria, 2007), and adaptive resource—aware behavior (Beckmann, McKinley, & Ofria, 2007). 1.1.2 Central Issues and Questions Organisms must be able to respond appropriately to the enviromnent in order to maximize their chances of survival. Aspects of their environment will often vary based on time, space, or circumstance, and organisms must react to these variations. Evolution can incorporate environmental change that is slow-paced, encompassing several generations (Nolfi & Floreano, 2000). Evolution discovered a variety of mechanisms that allow all organisms, from the simplest to the most complex, to respond to variable factors in their environments. Some of these mechanisms involve reflexive behavioral routines, such as the response of bacteria like Eschem'chz'a c012 (E. col2) to move toward food, or innate behavioral preferences and patterns, as observed in many insects (Dukas & Bernays, 2000). These repertoires of fixed, innate behaviors allow organisms to successfully manage some well-defined sets of potential circumstances. However, when the variation becomes less predictable due to more possible outcomes, because of dependences on time, place, or individual organisms’ experiences, or simply due to more rapid environmental changes, mechanisms with additional flexibility are needed. In the face of quicker or more unique variation, memory and learning may provide the advantage, allowing individuals to adjust behavior according to the local world state (Dukas, 2008; Shettleworth, 1998). How do environment, memory, and learning interact in an evolutionary context? This question is of great interest on the biological side of evolving intelligence. It also has a direct connection to computer science in many ways, for example in the context Of robot navigation: historically, algorithms for robot navigation have been particularly brittle when confronted with unpredictability related to locally unique features or events. Fundamental investigation focusing on the interplay between environment, memory, and learning can provide key insights for both the biological and computational perspectives. There are many dimensions to the evolution of memory and learning, and the related questions are challenging. How memory and learning evolve is a central issue, comprising a number of interwoven questions, relating to the evolutionary pathways and building blocks of memory and learning. In this dissertation, I investigate how evolution builds complex memory struc- tures that can be used for learning behaviors. My approach uses evolution 2n 321260, in virtual environments that present different navigation problems. To successfully solve these problems, evolution must discover mechanisms for memory and simple learning, and for differentially responding to environmental cues. With the experi— ments described here, I addressed four central questions: 0 What env2r0nmental cond2t2ons promote the evolvt20n of a refl6x2ve response to a part2cular one? This can be viewed as evolving innate “reflex” responses, 2.6., responses that require no learning, and do not change with experience. Although this is not a memory or learning behavior, it is a key step in two regards. First, evolving a fixed innate behavior still allows organisms to alter their responses according to features of the current environmental circum- stance. Second, as seen in classical conditioning, sets of innate responses may form the basis for conditioned responses, 2.6., simple learning. From a high- level perspective, reflexes can be thought of as a sort of ancestral or genetic memory: evolution “hard-coded” the reflexive responses for frequently experi— enced, potentially harmful or beneficial environmental conditions. 0 What cond2t2ons lead to more plast2c responses? Phenotyp2c plast2c2ty is the ability of a given genotype to express different phenotypes in response to differ- ing environmental conditions. Some plastic changes are reversible, while others are not. Reversible plasticity is often considered the simplest form of learning, since it provides a capacity for altering behavior based on prior experience (Dukas, 1998). o What cond2t2ons lead to the transtt2on from a set of plast2c responses to the use of memory? This question addresses the transition from a suite of innate responses, tailored to handle environmental variation, to the storage and use Of individual experience to guide behavior. 0 Is it harder to evolve a l2fe-long memory than a volat2le one? We can view this question as comparing phenotypic plasticity with more active memory processes. Phenotypic plasticity involves evolving genetically specified traits or behavioral options, whereas more active memory processes require the abil- ity to continually update the information and associations. From a simpler perspective, we can relate this question to mechanisms for storing information for differing lengths of time. Some information is useful for a long time, while other information might be used and discarded. What mechanisms evolve to support these different types of memory? I use navigation as the behavior domain for the work presented here. Navigation behavior fits well with the idea of evolving behavioral intelligence, since navigation is an example of “knowing how” to manage particular problems of survival. The ability to interact efficiently and effectively with its environment is a prerequisite for an organism’s survival (Shettleworth, 1998). Many tasks that are pivotal to survival, such as foraging, finding shelter, and avoiding danger, require reliable strategies for moving through the environment. Natural evolution equipped organisms with a va- riety of navigation strategies, reflecting the varying challenges involved in different navigational tasks (Roche, Mangaong, Commins, & O’Mara, 2005). Even funda- mental navigation strategies build on other capabilities that are widely considered components of intelligence, including sensing and responding to cues in the environ— ment, and memory. In the context of the current study, navigation is the platform for investigating how evolution builds memory structures in response to environments that challenge survival. The current work is organized as follows. The remainder of this chapter explores background and related work in memory and learning across several different fields of study. Chapter 2 presents work related to evolving gradient following behavior and elementary memory use capabilities. Chapter 3 discusses evolving reflexive behaviors as precursors to memory and learning. Chapter 4 explores evolving one—bit memory for both short- and long-term uses. Chapter 5 presents conclusions and plans for future work. 1.2 Background 1.2.1 Memory and Learning Definitions The notions of memory and learning traditionally relate to separate bodies of re- search. Depending on the research context, memory and learning may be defined strictly, loosely, or hardly at all. The term memory most often refers to the process that allows animals to guide their behavior based on information from their own past experience, or to the stored record of the experience itself; memory research focuses largely on how information is stored and retrieved. It is remarkably difficult to synthesize a representative definition of learn2ng. All of these are definitions of learning: 0 “Learning . . . is a change in state resulting from experience” (Shettleworth, 1998, p. 100). 0 “Learning is the acquisition of neuronal representations of new information” (Dukas, 2008, p. 146). 0 [Learning is] “more or less lasting changes in the innate behavioral mechanisms under the influence of the outer world” (Tinbergen, 1951, p. 143). These definitions run the gamut from broadly inclusive to highly restrictive. Per- haps the crux of the difficulty was identified by Todd and Miller (1990, p. 307), that learning is a multi—faceted process, and “must be viewed not as a single monolithic process, but as the diverse set of distinct mechanisms, abilities, and dynamics that ” ethologists and psychologists have shown it to be. One common thread through these definitions is that learning refers to changes in behavior as a result of experi- ence. As such, discussions of learning imply the existence of memory, whether over brief periods of time (short-term memory) or for more lasting durations (long-term memory) (Dukas, 2008). Questions persist regarding the distinction between memory and learning. Shet— tleworth (1998) suggests that the traditional memory/ learning dichotomy should be abandoned in light of the more recent emphasis on multiple memory systems. Evi- dence from many areas of research supports the multiple memory system hypothesis, especially in humans. The multiple memory system hypothesis is that memory is or- ganized in different systems, according to the type of information that they mediate, and that these different types of information are processed and stored in different brain areas. Forms of memory may be characterized by how long they last, whether they involve accumulated knowledge or unique experiences, and whether memory is expressed explicitly, through conscious remembering, or implicitly, by changes in performance speed or bias (Eichenbaum, 2008). Certainly, memory and learning are intimately connected, and it is diflicult to imagine the evolution of one without the other. On one hand, memory is a prerequisite for learning: an organism cannot learn from its experiences if it has no means to retain and reuse information about those experiences. On the other hand, memory cannot evolve completely free of cost, and a costly system will not evolve unless its use provides some adaptive advantage (Dukas, 1999). Phenotypic plasticity refers to the ability of an individual organism to change from one genetically specified phenotype, which includes its behavior, to another, based on changes in that organism’s environment. Dukas (1998) discusses the im— portance of phenotypic plasticity and its relationship to learning. Some phenotypic plasticity is irreversible, and is seen in a phenotype that is determined at one time during development, and then remains unchanged for the rest of the organism’s lifetime, such as alternate wing patterns on butterflies. Other plastic traits are reversible: an individual can change those traits repeatedly in response to an envi- ronmental change. Examples of this sort of reversible plasticity are gain or loss of muscle strength according to the level of physical activity, and changes in shell thick- ness in the gastropod, Littorz'na obtasata, relative to exposure to predators (Trussell & Smth, 2000). Reversible plasticity implies that the organism maintains a sensory capacity to recognize a particular cue from the environment, and a mechanism to alter something in its phenotype in response to that cue. This reversible plasticity allows adaptation when the environment changes frequently within the individual’s lifetime, and permits a genetically determined association between stimuli from the environment and responses to those stimuli. Basic types of reversible plasticity may be considered as early forms of learning, since reversible plasticity provides a capac- ity for altering behavior based on prior experience. Learning is, itself, a complex and advanced form of plasticity. Simpler plasticity differs from more complex forms of learning in that plastic responses are still predetermined. That is, the available responses are genetically pre-determined; which response occurs is determined by the environmental conditions. Learning, by contrast, is more open—ended: responses are not already stored, ready for use, but need to be discovered (Dukas, 1998). Through the remainder of this work, I will use terms as follows. 0 Memory: the mechanisms and processes of storing information about prior experience. 0 Phenotypic plasticity: the ability of an individual organism to change from one genetically specified phenotype, which includes its behavior, to another, in response to environmental changes. 0 Learning: “the acquisition of or change in memory that allows a subject to alter its subsequent responses to certain stimuli” (Dukas, 1998, p. 133). Memory, Learning, and Evolution Memory and learning play crucial roles in intelligent behavior. Even simple organ- isms exhibit some capacity'for storing and reusing information about their environ- ments; these capacities can be thought of as rudimentary memory capabilities. The ability of E. 0012 to compare the present and past gradient concentrations and be- have accordingly is a simple form of memory (Koshland, 1979), and the slime mold Physarum polycephalvm has the ability to find the minimum length path between two points in a maze (Nakagaki, Yamanda, & Toth, 2000). Our intuition tells us that memory and learning must often be beneficial, since they are so widespread in nature. Nolfi and Floreano suggest that learning serves several different adaptive functions: 0 Learning supplements evolution by enabling an organism to adapt to environ- mental change that happens too quickly to be tracked by evolution. 0 The combination of ontogenetic adaptation (learning) and phylogenetic adap— tation (evolution) might be able to exploit more environmental information than evolution alone. 0 Learning may guide evolution, as suggested by Baldwin (Baldwin, 1896) and Waddington (Waddington, 1942). Baldwin proposed what he called organ2c select20n, a process through which acquired characteristics could be indirectly inherited. Through organic selection, individuals that have the ability to ac- quire a beneficial trait during their lifetime will be selected for; therefore, the capab2l2ty for acquiring that trait will be passed on to offspring. Since learn— ing takes time, Baldwin suggested that evolution might tend to push trait acquisition to earlier points in life and thus select individuals who already have those beneficial traits at birth, eliminating the need to learn them. This indirect genetic assimilation of learned traits was later defined by Wadding- ton as canalization, which describes developmental robustness to changes in environment and genotype. 0 Learning might allow complex phenotypes to be produced from much shorter genotypes by extracting some information necessary for building the phenotype from the environment. (Nolfi & Floreano, 2000). There is some debate regarding the selective forces that promote the evolution of learning. Stephens (1991) identified two opposing forces credited with this role— environmental change and environmental regularity—and presented a simple model that supported his argument that “the pattern of predictability in relation to an animal’s life history determines the evolutionary value of learning” (Stephens, 1991, p. 77). The model takes into account the influence of three factors—change, regu— larity, and value—on the evolution of animal learning. Change and regularity are 10 measured by two terms: “between-generation persistence” describes to what ex- tent environmental states in the parental generation predict states in the offspring’s generation; “within-generation persistence” specifies how well today predicts tomor- row within the lifetime of the individual. This model emphasizes how the pattern of environmental change relates to animal’s life history. Stephens stated that this model suggests that the most correct statement one can make is that learning is an adaptation to within-lifetime regularity and some environmental un- predictability; this unpredictability may occur either within or between generations (Stephens, 1991, p.85). Memory and learning may also incur costs. Some of these costs may be associ- ated with the development and maintenance of mechanisms that are precursors to memory and learning (Dukas, 1999). Other costs may include a delay in the acqui- sition of fitness due to suboptimal behavior during learning, increased unreliability related to the possibility of learning the wrong things, delayed reproduction time, and energy costs of the learning process itself (Mayley, 1996). Many mathematical models of memory (for example Kacelnik and Krebs (1985)) that are based on opti- mality modeling make the assumption that memory is cost-free and focus on how to weigh experience and what to forget. In such models, the implicit cost of memory relates to the use of irrelevant or old information, which might result in suboptimal performance (Dukas, 1999). Even though the fitness benefits of memory and learning are relatively well un— derstood (Dukas, 1998; Dukas & Bernays, 2000; Dukas & Duan, 2000), less is known about their costs, and evidence for the costs is difficult to come by. At present, the primary empirical evidence of the evolutionary costs of learning stems from a se- ries of experiments by Mery and Kawecki, performed with fruit flies, Drosoph2la melanogaster. The results of one study (Mery & Kawecki, 2003) suggested an evolu- tionary trade-off between learning ability and competitive ability: improved learning ability in experimental fly populations was consistently associated with a decline of larval competitive ability, compared to control populations. In another study, Mery 11 and Kawecki (2004) forced flies to repeatedly use their learning ability by exposing the fly populations to two-day cycles of alternating substrate conditions. Flies from lines that had been selected for higher learning ability had lower egg-laying rates than flies from lines that had not been exposed to selection for increased learning ability. These results suggest that egg-laying was impaired by either the accumula- tion of memory interference, or the energy costs of information collection, processing, and storage. A third study (Mery & Kawecki, 2005) showed that flies subjected to training that produced long-term memory died sooner under extreme stress (absence of food and water) than flies subjected to control treatments, suggesting that the formation and maintenance of long-term memory produced added strain. 1.2.2 Memory and Learning in Insect Navigation At the beginning Of the 20th century, there was a generally held belief that insect behavior was guided nearly exclusively by instinct. The supposition was that nav- igation mechanisms worked because evolution had tuned the insects’ sensorimotor responses and behaviors to function in the animals’ typical environments, leaving insects incapable of flexibly accommodating unexpected change or of learning charac- teristics of particular environments (Collett, 1993). Over time, however, researchers accumulated mounting evidence of learning in various insect species. By the close of the 20th century, the weight of evidence led to the widespread acknowledgment that learning plays an integral part in the decision-making of many insects (Dukas, 2008). Navigation behavior has provided many insights into insect memory and learning. Insects employ sophisticated navigation strategies that use a variety of environmen- tal information. Insects, particularly bees and ants, are excellent model systems for the study of navigation: they are fascinating in their own right, and the strategies that they employ are similar to those used by birds and mammals. The more mod- est neural resources of insects may be analyzed more effectively to reveal essential components of navigation (Collett, 1993; Collett & Collett, 2002). 12 Ants, bees, and other insects use an array of innate strategies for navigation. These strategies include path integration (Miiller & Wehner, 1988), which is the continual updating of distance and direction to a reference location (6. g., the nest), and responses to landmarks and beacons in the environment (Graham, Fauria, & Collett, 2003; Collett, Graham, & Durier, 2003). The interaction of these strategies may reduce navigational errors and provide more robust navigational performance (Collett & Graham, 2004), and may provide a scaffold for acquiring routes between the nest and other important sites in the environment (Collett et al., 2003). Research has provided evidence of memory and learning in ants and bees in numerous contexts related to navigation. Honeybees demonstrate mechanisms for storing motor sequences, linking motor sequences to visual stimuli, and expectations of particular visual stimuli at specific points along the route (Collett & Collett, 2002; Collett, Collett, & Wehner, 2001; Collett, Fry, & Wehner, 1993; Wehner, Michel, & Antonsen, 1996). Bees in general learn landmarks relative to the celestial compass (von Frisch, 1967), and this association can be used on overcast days to determine the compass direction (Dyer & Gould, 1981). Bees are born with a template of the solar ephemeris function (the daily pattern of solar movement through the sky), but learn the details of that function for their particular location and time of year through experience, and can estimate the location of the sun for times they have not experienced (Dickinson, 1994; Dyer & Dickinson, 1994, 1996). Honeybees associate both distant and local landmarks with previously traveled routes between the hive and food (Dyer, 1991), and ants and bees may use interconnected memories of associations between contextual cues and landmark memories to reliably recognize visual landmarks and faithfully execute learned travel routes (Collett & Collett, 2002). For long—distance navigation, honeybees depend on large features in the terrain and celestial cues but use local features to guide them to a goal (Dyer, 1998). Honeybees and other insects perform learning flights to memorize visual landmarks near a newly-discovered food source, a behavior called “turn back and look” (Lehrer, 1991). The length of these flights appears to be adjusted by the bees in response to changing visual information needs (Wei, Rafalko, & Dyer, 2002). 13 Studies of maze learning in animals provide insight into the mechanisms involved in navigation (e.g. Gallistel, 1990). Maze learning studies with insects may be of particular interest, since many bees and ants often follow fixed routes from the nest to a foraging site (Collett et al., 2003). In learning a maze, an insect is learning to follow a well-defined path. To follow a path means to obey a sequence of instructions. The instruc- tions could be inherent in the external world, as when water flows down a river valley along a route imposed by the contours of the land. But the fact that bees can be trained along arbitrary routes . . . and through mazes suggests that to some extent bees do have an internal repre— sentation of sequences of instructions (Collett et al., 1993, p. 693). Bees have been trained to fly through mazes of varying complexity. Studies by Collett and colleagues (Collett et al., 1993; Collett & Baron, 1995) used small mazes to investigate bees’ ability to learn motor or sensorimotor sequences. One study (1993) presented three different series of experiments that forced bees to fly along prescribed routes and through obstacles in a large box in a laboratory. The primary conclusion of this study is that bees remember sensory and motor information that allows them to reproduce a complex route. The bees to some degree learn a sequence of detailed instructions and do not rely solely on the external world to dictate their path. These results suggest that what is stored is a linked series of vectors, each vector a “command” for flying a certain distance in a certain direction (Collett et al., 1993). A study by Zhang and colleagues (Zhang, Bartsch, & Srinivasan, 1996) examined whether honey bees use specific visual cues to learn to fly through structurally complex mazes. Bees were trained to fly through mazes in the presence or absence of visual cues (2.6., color marks). The tested bees showed the ability to learn routes marked by a visual one (color mark), and routes with no cues, with varying degrees of success in the different treatments. The authors report several conclusions from their results. Bees are capable of learning a complex task, such as this maze solving task, if they learn through step-by—step training, by acquiring a succession of “rules” 14 for negotiating the maze ( e. 9., food is present at a location tagged with a certain color, and a path labeled with that same color leads to food). Bees can apply learned rules in new contexts, varying in spatial configuration, spectral domain (color), and both spatial and spectral domains simultaneously. While learning to follow the color marks, the bees also learn a set of motor commands for a correct sequence of actions for negotiating the maze. Bees are able to learn to navigate a maze even in the absence of internal marks (2.6., visual or scent) and external marks. A later study by Zhang and colleagues investigated bees’ ability to learn and remember two different sets of visual stimuli (Zhang, Lehrer, & Srinivasan, 1999). Bees were trained to fly through a compound Y—maze. While on route to a food reward, the bees were presented with two different visual stimuli sequences. The results suggest that the bees were able to successfully store the two different se- quences at the same time. Another study (Zhang, Mizutani, & Srinivasan, 2000) probed whether bees learn and recognize structural regularity in the mazes. For these experiments, bees were trained and tested in four different types of mazes: constant-turn, where turns are always in the same direction; zig-zag, where each turn alternates direction; irregular, which has no apparent pattern of turns; variable irregular, where bees had to learn several irregular mazes at the same time. The bees performed best in the constant-turn mazes, somewhat poorer in the zigzag mazes, still worse in the irregular mazes, and poorest of all in the variable irregular mazes. The authors state that these results demonstrate that the bees’ performance in the various configurations depends on the structural regularity of the mazes, and the ease with which the bees can recognize and learn that regularity. Bees provide an example of behavioral intelligence that can inform my experi- mental design, and so maintain a strong connection between my experiments and their biological motivations. I based my experimental environments on selected maze configurations from the maze—learning experiments with bees, discussed above. As highlighted in the preceding discussion, bees use a variety of navigation strategies that demonstrate use of different memory capabilities. Maze environments allow for systematic manipulation of environmental features to probe aspects of memory 15 use and mechanisms, and are easily constructed in virtual settings. By using these types of environments in the work presented here, I am able to probe specific issues relating to the evolution of memory. 1.2.3 Computational Approaches to Memory, Learning, and Evolution In the context of computer science, the concepts Of memory and learning take on particular connotations. The term “memory” is linked primarily to hardware- related issues, and the more specific “associative memory” points to specific neural- network based implementations, often combined with self-organization (Kohonen, 1977, 2001). Machine learning is an active and rich research area, encompassing a variety of methods, including supervised learning (Kotsiantis, 2007, review), un- supervised learning (Kotsiantis & Pintelas, 2004, review), and reinforcement learn- ing (Kaelbling, Littman, & Moore, 1996; Sutton & Barto, 1998, review). Machine learning focuses on how to engineer machines that learn, or behave in a manner that resembles learning, as opposed to evolving learning machines. Generally speaking, evolutionary computation employs algorithmic methods in- spired by Darwinian natural selection to find solutions to computational problems. Traditional evolutionary computation, comprised of such areas as evolutionary algo- rithms, evolutionary strategies, genetic algorithms and evolutionary programming, is largely concerned with optimization problems. These approaches also tend to be highly applied, with a focus on using evolutionary processes in solving particular in- stances of problems, in contrast to investigating how general solutions arise through evolution (De Jong, 2006). Other approaches are more concerned with evolving machines that exhibit some life-like behavior. Evolutionary robotics is one such methodology. Evolutionary robotics refers to efforts to develop robots that autonomously evolve and adapt to their environments (Floreano, Mondada, Perez-Uribe, & Roggen, 2004). This strat- egy emphasizes the importance of embodiment, autonomy, and adaptation. In this 16 technique, adaptation most often occurs as evolution in neural network controllers through the use of genetic algorithms (Floreano, 1997). Although not truly within the realm of evolutionary robotics, Valentino Braitenberg’s (1986) work could be considered a precursor of that field. In these thought experiments, Braitenberg en- visioned a series of simple wheeled robots, with different kinds of sensors that are connected in various ways to the wheel motors. When the robots are placed on a table, they exhibit different behaviors, such as approaching lights, running away ' from lights, and remaining close to the light. In the course of traveling around the table top, some robots will fall off the table, making it necessary to build copies of the robots in order to maintain a given number of robots on the table. In copying the robots, the builder is bound to make mistakes, resulting in new designs and potential improvements over time. This process of selective copying and random errors is clearly Darwinian evolution, and Braitenberg’s delightful thought experi— ments provide an excellent jumping-off point for later work in evolving actual robot behavior. The field of evolutionary robotics has dealt extensively with several facets of evolving memory and learning. One aspect is phenotypic plasticity, the ability of a genotype to express differently in different environments. Nolfi, Miglino, and Parisi (1994) studied this topic by evolving neural network “brains” for virtual robots in environments that alternated between light and dark; the dynamics of the environ- ments were designed such that behaviors that were successful in one environment would be unsuccessful in the other environment. Individuals that evolved under these conditions were able to tune their behavior appropriately for both kinds of environments, adapting within an individual “lifetime” to environmental changes. Although not within the scope of robotics, a closely related study by Stanley, Bryant, and Miikkulainen (2003) used NeuroEvolution of Augmenting Topologies (NEAT) to compare neural networks that never changed connection weights to those that did. In this experimental environment, food switched randomly between nu- tritious and poisonous. The fixed-weight networks found solutions that worked for both environmental conditions: different inputs to the networks produced different 17 behaviors. There is increasing interest in the evolution of learning and the interaction be- tween learning and evolution within the evolutionary robotics community. This interest is spurred by several purposes that are shared by other approaches and methodologies. One purpose is to examine the performance advantages of combin- ing evolutionary adaptation and learning; another purpose is to understand the role played by the interaction of these two adaptive processes, which employ different mechanisms and occur at differing time scales (Nolfi & Floreano, 2002). A study by Floreano and Urzelai (2000) is a strong example of the latter. They evolved neural networks with local synaptic plasticity and compared them to fixed-weight networks in a two—step task. The networks evolved to turn on a light and then move to a grey square. The results showed that local learning rules helped networks alter function— ality quickly, facilitating moving from one task to the other. Blynel and Floreano (2003) explored the ability of continuous time recurrent neural networks (CTRNNs) to show capabilities that resemble reinforcement learning, in the context of T-Maze and double T-Maze navigation tasks. The robot had to find and “remember” the location of a reward zone. The learning in this case occurred Without modifica- tion of synapse strengths, coming about instead from internal network dynamics. This work was directly related to an earlier study by Yamauchi and Beer (1994), that investigated the evolution of agents capable Of combining reactive, sequential, and learning behavior, using CTRNNs to control their agents. A study by Todd and Miller (1990) explored the conditions under which simple artificial creatures are more likely to evolve learning mechanisms for differentiating edible from poisonous food. In this model, learning was a particular connection between a color sensor and the single motor neuron of a neural network, and the genetic specification that controlled the development of the learning structure was specified by a single gene. Tuci et al. (2002) applied evolutionary robotics methodology to studying the evolu- tion of learning behavior from an ecological perspective, which treats each instance of learning as a specialized capability that is shaped by selective pressures; from this perspective, learning can be understood only by reference to the organism or 18 its ancestor. Their model required a robot to learn the relationship between a light and the location of its target. The robot had to interact with its environment to learn the relationship between the light and the target; in one environment, motion toward the light took the robot toward its target, but in another environment, the relationship between light and target was inverted. Results of their experiments show that artificial evolution can be used to combine low-level building blocks to produce controllers that are capable of associative learning. 1 .2.4 Avida: Overview The Avida Digital Evolution platform (Lenski, Ofria, Pennock, & Adami, 2003; Ofria & Wilke, 2004) is a widely used digital evolution software system. Digital evolution (Adami et al., 2000) is a form of evolutionary computation that places a popula- tion of self-replicating computer programs— “digital organisms,” or “Avidians”—in a user-defined computational environment without an explicit fitness function. Avida provides a virtual environment, but real evolution occurs: the digital organisms self- replicate, mutate, and compete, satisfying Dennett’s definition of an evolutionary process (Dennett, 2002). Digital evolution can be used as a tool to provide a better understanding of biological processes, and as a method for applying lessons learned from biology to computational and engineering problems. Avida affords the oppor- tunity to study evolution in detail, and look “inside” the process as it happens. These are both difficult things to achieve with a natural organism, even one as sim- ple as a bacterium. Since Avida is another instance of evolution, we can use Avida for generalizations about evolution. According to John Maynard Smith (1992), So far, we have been able to study only one evolving system, and we cannot wait for interstellar flight to provide us with a second. If we want to discover generalizations about evolving systems, we will have to look at artificial ones (Maynard Smith, 1992, p. 772). Pennock (2007) discusses these issues in detail. 19 nub —————— 1 mm»: Ix = 00010110....I lhift-l I mp I Y = 10000100... ' In“! ”ax; 11610613.. , Figure 1.1: An Avida digital organism. The organism comprises a circular genome, a virtual CPU with three 32—bit registers, two stacks, and four heads (FLOW, IP, READ, WRITE). Numerical values can be input from the environment, and results are output to the environment. (Lenski et al., 2003, p. 139) The Avida world is a discrete two—dimensional grid of cells, containing a popu— lation of individual digital organisms, with at most one organism in each grid cell. Each individual organism (Figure 1.1) comprises a circular list of assembly—like in— structions, its “genome,” and a virtual central processing unit (CPU). The virtual CPU consists of three general purpose registers, two stacks, and four heads (FLOW, used as a jump target; IP, an instruction pointer that denotes the next command to be executed; READ; and WRITE). Executing the instructions in the organism’s genome acts on the elements of the virtual CPU, incurring a cost measured in virtual CPU cycles. Executing Avida instructions accomplishes all functions in the Avida 20 world, such as gathering information from the environment or performing logic op- erations. The Avida instruction set is Turing-complete (Ofria, Adami, & Collier, 2002) and extensible, affording ease in expanding the system’s capabilities through adding new instructions. An Avidian replicates by copying its genome into a block of memory that will be its offspring’s genome. Errors in the copying process produce differences between the genomes of parent and offspring. These differences are matat20ns; these mutations take the form of inserting or deleting an instruction, or changing one instruction to another instruction. The Avida instruction set has the property of remaining syntactically correct, even when mutations occur (Ofria et al., 2002). A newly produced offspring is placed into a randomly selected grid cell, overwriting any organism occupying that grid cell. This gives an adaptive advantage to organisms that can replicate more quickly: the organism must compete for the limited space in the grid. Organisms that replicate sooner than others will have a higher proportion of descendants in future populations. Avidians can speed up their execution, and so replicate sooner, by accumulating “metabolic rate” bonuses by performing user— specified tasks. Metabolic rate is used to allocate virtual CPU cycles. Organisms with higher metabolic rates are given more virtual CPU cycles in a unit of time than organisms with lower metabolic rates, and so are able to execute more quickly and produce offspring sooner. Each Avida organism has a facing, 2.6., the direction in which it is oriented. An organism must always have a valid facing, meaning that it must face a cell that is connected to the organism’s cell. Because of differences between the geometries available in Avida, there are varying numbers of valid facings in certain geometries. The bounded grid geometry creates a world with defined “edges.” Grid cells at the edges have no connection to the cells outside the grid boundary. In a bounded grid geometry, most grid cells have eight valid facing directions (North, Northeast, East, etc., Figure 1.2); grid cells that are along the edge or in the corners of the grid have fewer valid facing directions (three in corners, Figure 1.3, and five at edges, Figure 1.4). A torus geometry creates an infinite plane, with no edges. In a torus 21 00 OO o t»>< O O DO GOO 000 O 0 w 0 DO GOO GOO O >§o oox 0A0 o O O 1 . 1 X O O o No oYo of- o Xoo o>'X X0 0X 00 Figure 1.3: Avida facing, bounded grid corner. Valid facings for an Avida organism at the corner of a bounded grid. Only three valid facing directions exist in bounded grid corners. OO w I O X 0 Figure 1.4: Avida facing, bounded grid edge. Valid facings for an Avida organism at the edge of a bounded grid. Facings toward the grid edge are invalid; the organism’s facing is automatically adjusted so that it never faces off the edge of the grid. o’\o o","o o/‘o X00 o>'