A COMPUTATIONAL THEORY OF LEARNING IN DISTRIBUTED SYSTEMS

Adrian V. Sannier II

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Electrical Engineering and Systems Science

1988

ABSTRACT

A COMPUTATIONAL THEORY OF LEARNING IN DISTRIBUTED SYSTEMS

By Adrian V. Sannier II

A distributed computing system engaged in a learning task is subject to certain developmental considerations which arise from the potential for functional interaction that exists between the system's constituent processes. As individual processes adapt in response to the task environment, they must simultaneously adapt to the behavior of the other processes in the system, in order to form an integrated and coordinated whole. Effective development depends on the ability of the system to achieve an appropriate balance between the self-assertive and integrative tendencies of each of the system processes.

This dissertation specifies a method for developing a coordinated, goal-directed, distributed system from an initial set of independent, uncoordinated processes. The approach presented here is derived from theories which describe the evolution of biological communities. Two software systems which demonstrate the theory's utility are discussed in some detail: the first, Asgard, is an artificial life simulation; the second, Midgard, develops distributed, rule-based, adaptive load balancers for a certain class of inter-LAN networks.

To Banj, for everything . . .

ACKNOWLEDGEMENTS

First, I would like to express my gratitude to my major professor and friend, Dr. Erik D. Goodman. He gave me just the right mix of freedom and guidance to keep me going, and kept faith with this project even when I had none. My association with Erik, and his lovely wife Cheryl, is one I will always treasure. Special thanks also to the members of my guidance committee, Drs. James Resh, Carl Page, Thomas Manetsch, and Donald Straney, for their helpful observations and suggestions. I am particularly grateful to Don Straney, for helping me fill in some of the many gaps in my biological background. Thanks for all the time spent in our many hours of discussion -- they were one of the highlights of my doctoral program.

I also owe a great debt to the A.H. Case Center for CAEM -- it taught me everything I know about computers and a good deal more than that. I feel extremely fortunate to have been associated with as interesting and bright a group of people as have been through the Center. To Ojer and Janey, Albert and Sarah, DPH, Guy, and Remi, Joyce, Jack, and Captain Jupiter, the Yoss man, the wizard of Haas, my Consultants, and, of course, the Users; in short, to anyone who ever touched a terminal at Case: thanks to all of you -- it was the best time of my life.

Some things cannot be adequately expressed in words. One such thing is the gratitude I feel toward my parents for their unfailing love, strength, and support throughout the whole of my life. I couldn't have done this, or anything else, without you.
TABLE OF CONTENTS

LIST OF FIGURES ................ vii
LIST OF TABLES ................ viii
CHAPTER I  INTRODUCTION ................ 1
CHAPTER II  TOWARDS A COMPUTATIONAL THEORY OF DISTRIBUTED ADAPTIVE SELF-ORGANIZATION ................ 10
  2.0 Introduction ................ 10
  2.1 Symbiosis ................ 10
  2.2 Living Transitions ................ 13
    2.2.1 Replicator to Prokaryote ................ 14
    2.2.2 Prokaryote to Eukaryote ................ 22
    2.2.3 The Origins of Multicellularity ................ 27
  2.3 Distributed Adaptive Self-Organization ................ 32
    2.3.1 The Common Thread ................ 33
    2.3.2 A Computational Theory ................ 37
  2.4 Summary ................ 41
CHAPTER III  A DISTRIBUTED GENETIC ADAPTIVE SYSTEM ................ 42
  3.0 Introduction ................ 42
  3.1 The Basic Model ................ 42
  3.2 Composite Programs -- Composite Processes ................ 44
  3.3 Process Initiation ................ 45
    3.3.1 The distributed genetic algorithm ................ 46
      3.3.1.1 Michigan and Pitt ................ 47
      3.3.1.2 A synthesis ................ 49
    3.3.2 Structural and Spatial Coherence ................ 55
    3.3.3 A Reproductive Continuum ................ 57
    3.3.4 Composite programs ................ 60
  3.4 Summary ................ 62
CHAPTER IV  ASGARD ................ 63
  4.0 Introduction ................ 63
  4.1 System Organization ................ 64
  4.2 The Environment ................ 68
  4.3 Experiments ................ 73
  4.4 Summary ................ 83
CHAPTER V  MIDGARD ................ 85
  5.0 Introduction ................ 85
  5.1 Load Balancing ................ 86
  5.2 Model Specification ................ 88
    5.2.1 An Augmented Classifier Language ................ 89
    5.2.2 Midgard ................ 92
    5.2.3 Relationship with DGAS ................ 95
  5.3 Evaluation ................ 100
    5.3.1 Phase 1 ................ 101
    5.3.2 Phase 2 ................ 106
  5.4 Summary ................ 111
CHAPTER VI  SUMMARY AND CONCLUSIONS ................ 112
BIBLIOGRAPHY ................ 119

LIST OF FIGURES

Figure 1. Gene Splinting ................ 18
Figure 2. A Reproductive Continuum ................ 58
Figure 3. Example Asgard Animal Program ................ 66
Figure 4. Asgard ................ 71
Figure 5. Asgard Startup Population Transient With Non-Graduated Introduction ................ 74
Figure 6. Asgard Population Startup Transient With Graduated Introduction ................ 76
Figure 7. On-Line Progress of a Typical Homogeneous Experiment - FE1 ................ 104
Figure 8. Summary of Homogeneous Network Results - FE1 ................ 105
Figure 9. Summary of Homogeneous Network Results - FE2 ................ 107
Figure 10. On-Line Progress of a Typical Heterogeneous Network Experiment ................ 109
Figure 11. Summary of Heterogeneous Network Results ................ 110

LIST OF TABLES

Table 1. A Working Set of Asgard Parameters ................ 77
Table 2. Mean Service Times tij for a Job of Type i on a Processor of Type j ................ 108

CHAPTER I
INTRODUCTION

Left to her own devices, the common ant is no match for the forces in the world capable of destroying her. Fortunately for her though, no ant is an island. Each individual ant's efforts coordinate with the efforts of the other members of her colony and together they take the measure of their surroundings, construct a nest, locate and deliver food, defend themselves from the various agents of entropy, and reproduce their kind. Much of what they do together, they do implicitly, by instinct. The individual members of the colony cooperate by rote and promote their mutual survival.
It is this kind of implicitly coordinated effort which allows life to exist, in all its forms, at all its levels. The colony of ants is a step in an enormously complex ladder of living things, each rung populated by living systems whose continued existence depends on the behavior of the systems around them. This interdependence extends downward in scale as well as upward. Our individual ant is herself composed of millions of cells, each performing the specialized functions which, on the one hand, perpetuate their own individual lives, and, on the other, go together to make up the life processes of the ant. Each of those cells is itself composed of individual organelles, and so it goes. In an intricate and subtle network, the hierarchies of living systems extend from the vast ecologies which cover continents to the microscopic chemical gradients which make up the inner workings of cells.

What makes the story all the more remarkable is that all these elaborate strategies, each one linked to untold others, are constantly in a state of flux, continually adapting to the changing properties of the environment, in a process generally known as evolution. As a rule, when a population of living organisms reproduces over a period of time, it responds to its surroundings; the population's genetic structure changes through successive generations, gradually reflecting discoveries about which combinations of traits the environment has favored and which it has not. The process works through a combination of natural selection and the action of certain genetic mechanisms that manipulate the information stored within the gene pool, to produce populations adapted to an ever-changing environment.

Often, the environment an individual faces is shaped in great measure by the behavior of the living system which surrounds him. The individuals in a population interact functionally as well as reproductively, affecting each other's survival -- sometimes favorably, sometimes unfavorably. An ant with an aberrant strategy affects not only her own survival, but the survival of the rest of the colony; a larger than usual ant, with increased load bearing ability, will improve the colony's chances as well as her own. Each generation is a trial of a new set of survival strategies, tested individually and in concert against the rigors of surrounding conditions. From this cycle of testing, evaluating, and recreating the various and sundry strategies, combinations of strategies, and combinations of combinations, the natural system is able to generate, recognize, and consolidate mutually compatible strategies into a form which gives them stability and insures their proliferation so long as they continue to prove viable.

As computing resources become increasingly distributed, increasingly parallel, the individual processing element becomes more and more like the individual ant. The tasks computer systems perform are becoming ever more complex, and the systems designed to address them no longer rely on the behavior of a single, sequential machine, but on the coordinated, simultaneous action of anywhere from 10 to 100,000 concurrent processes. The modern factory floor already contains a whole host of distributed intelligence, from the microprocessors which control the senses and local behavior of robots, to the line controllers, to the network of micros which handle the payroll.
The advent of these massively distributed, parallel systems has produced a whole new set of research questions: by what methods, and to what extent, can these systems maintain fault tolerance; how can effective cooperation be maintained with a minimum of communication; can these systems, particularly in real-time applications, be made more robust with respect to changing conditions? Many of these questions, and others like them, have already been faced and solved by living systems. The idea I would like to explore in this dissertation is that many of the desirable properties of living systems -- their fault tolerance, their ability to coordinate with a minimum of communication, their robustness in the face of changing circumstances, their hierarchical control structure -- were produced gradually, as a direct consequence of their evolution. Studying this process should provide us with mechanisms for producing these effects artificially in distributed computing systems.

A distributed computing system that is adapting to a task domain has much in common with an evolving community. Like the individuals in a functionally interacting population, each processor in the distributed system is faced with an environment with a structure richer than it alone can represent, a problem more complex than it alone can solve. At the outset, when little is known about the problem domain, exploration of the problem space proceeds most rapidly if each processor fends for itself, so that the system examines a multitude of different strategies under various conditions. As the system matures, however, individual strategies will begin to impinge on one another more and more, and the processors must begin to adapt not only to the task at hand but also to each other's behavior. If the mechanisms which allow the living system to coadapt behaviors and forge independent individuals into a coherent and cooperating whole can be abstracted, they might go a long way towards solving many of the problems which emerge in distributed computing, particularly in their use as learning systems. This dissertation is a step in that direction.

When learning is regarded as the process by which a system acquires information about its environment and uses that information to alter its behavior, marked similarities between learning and evolution emerge [Pringle 51]. The central idea behind the research presented here is that a useful computational model of learning in distributed systems can be constructed from a computational model of a particular kind of evolving system.

Using biological models to develop computational approaches is nothing new. Since the early 1950's, researchers in computer science have used biology to drive their research. One of the first attempts at artificial learning, Friedberg's pioneering study of behavior-specified programming [Friedberg 58], used a learning algorithm based on a simple model of evolution. Another early study in learning, by Fogel, Owens and Walsh [66], used a similar approach in an attempt to design an automaton capable of learning regular languages. Today, teams of researchers throughout the country are attempting to develop "intelligent" systems using neural networks, a computational abstraction based on the structure of the animal nervous system [Wasserman 87, Lippmann 87, Cowan 88]; others seek to develop artificial adaptive behavior by modeling facets of the immune system [Farmer 85].
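To make the shape of these early systems concrete, the following sketch shows the bare evolutionary learning loop that studies like Friedberg's and Fogel's share: propose a variant of a current behavior, score it against the task, and keep it if it does at least as well. The sketch is my own construction, not a reproduction of either system; the bit-string "behavior", the matching-based fitness function, and the mutation rate are all illustrative assumptions.

import random

TARGET = [1, 0, 1, 1, 0, 1, 0, 0]   # stands in for some desired behavior

def fitness(candidate):
    # count the positions where the candidate's behavior matches the target
    return sum(c == t for c, t in zip(candidate, TARGET))

def mutate(candidate, rate=0.1):
    # flip each bit with small probability, yielding a nearby variant
    return [1 - b if random.random() < rate else b for b in candidate]

def evolve(generations=500):
    best = [random.randint(0, 1) for _ in TARGET]   # random initial behavior
    for _ in range(generations):
        variant = mutate(best)
        if fitness(variant) >= fitness(best):       # selection step
            best = variant
    return best

print(evolve())   # tends toward TARGET under mutation plus selection

The loop discovers the target purely by generating and selecting variation; nothing task-specific beyond the fitness score is involved.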
As tools in the study of complex biological phenomena, computational models may prove an important new means of understanding biological systems as information processors. A constructive, computational process description of a biological phenomenon presents a useful alternative to less explicit written descriptions or explicit but cumbersome mathematical models which are often intractable for all but the simplest cases. Computational descriptions can elucidate the implications of the set of assumptions underlying a biological theory from an information processing perspective. As a number of authors in the biological sciences have noted, the processing of information is a critical concept in understanding the regulation and development of biological systems [Prigogine 80, Wiley 82]. Thus the new vantage point afforded by computational models should provide valuable insights into the study of the characteristics of living systems.

A good example of the use of a computational description in this context can be found in a study by Holland [76a] exploring the spontaneous emergence of self-replicating molecules. One of the sticking points in the current view of the origins of life is that the chance assembly of the complex proteins and self-replicating nucleic acid chains on which the theory depends is all too unlikely. Arguments exist on both sides of the question, but they tend to lack precision, since the models advanced to account for spontaneous emergence rely on complicated chemical interactions which obscure the probabilities of formation. By reformulating the question in terms of abstract entities which change form according to specific rules and probabilities, Holland was able to more explicitly characterize the interactions between the various constraints which govern "replicator" development. While his 1976 paper is only an exploratory effort, it presents the beginnings of a framework which sheds new light on the nature of the role which enzymes play in reducing the expected waiting times for the spontaneous emergence of self-replicating compounds.

Holland's "alpha-universe" formulation is a computational abstraction of the "primordial soup", where the laws of chemistry are abstracted as the transition rules of a Lindenmayer grammar embedded in a cellular automaton. Substrings in this grammar (abstract "chemicals") change over time according to these transition rules and the form of the substrings which are proximate to them. Holland showed that dramatic decreases of up to several orders of magnitude can be obtained in the expected waiting time for the emergence of self-replicating strings if the transition rules of the grammar are such that certain simple substrings act as "enzymes", increasing the probability of occurrence of more complex substrings. By studying the properties of his "alpha-universe", Holland was able to make specific assertions about the conditions under which certain simple configurations can act to increase the probabilities of formation for other, more complex configurations. Moreover, his model gave rise to fairly complex behavior (the spontaneous emergence of self-replicating strings) from a relatively simple set of transition rules.

The computational technique is by no means comprehensive; computational models like the aforementioned fail to capture many facets of the biological phenomena they describe.
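The quantitative heart of Holland's result can be illustrated with a toy program. The sketch below is my own construction, not the alpha-universe itself: it collapses the grammar and cellular automaton into two abstract events with invented probabilities -- a simple "enzyme" that forms at some rate and, once present, multiplies the formation probability of a more complex configuration.

import random

BASE_P = 1e-4     # assumed per-step chance the complex string forms unaided
ENZYME_P = 5e-3   # assumed per-step chance the simple enzyme forms
BOOST = 50        # assumed catalytic factor once the enzyme is present

def waiting_time(with_enzyme, rng):
    # count steps until the complex string first appears
    enzyme_present = False
    step = 0
    while True:
        step += 1
        if with_enzyme and not enzyme_present:
            enzyme_present = rng.random() < ENZYME_P
        p = BASE_P * (BOOST if enzyme_present else 1)
        if rng.random() < p:
            return step

rng = random.Random(1)
for label, flag in [("without enzyme", False), ("with enzyme", True)]:
    mean = sum(waiting_time(flag, rng) for _ in range(100)) / 100
    print(label, round(mean))

With these assumed rates the mean waiting time falls from roughly 10,000 steps to a few hundred, a reduction of more than an order of magnitude. Like Holland's model, the sketch retains none of the real chemistry; only the probabilistic skeleton remains.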
But the level of abstraction they enforce allows a researcher to study more easily the interactions of information that occur between several complex processes. These models deal solely with abstract entities, letting the level of physical detail fall away to present the system's information processing functions in sharper focus. The new insights which the computational perspective can bring to the study of biological systems promise to make it an important tool in the study of complex living processes.

As results in computer science, models of biological information fill a crucial void. While much of the research in the field of computer science comes under the heading of computational mathematics, a large body of research is more akin to experimental physics than mathematics. Particularly in the field of artificial intelligence, research results are often presented as algorithms or software systems which have been developed as solutions to specific problems. A set of problem-dependent data structures are defined and a complex series of operations are performed on them. The system's performance is then described in terms of these structures and operations, but, particularly if the algorithms are non-deterministic or involve heuristic methods, these "process descriptions" [Marr 77] are incomplete at best. While the systems are often of interest in and of themselves, sometimes dramatically demonstrating computing capabilities, time has shown them to be difficult to generalize or extend [Marr 77]. In terms of the scientific method, such results act as both hypothesis and experiment -- and since the hypothesis does not describe a more general class of events, even when the experiment succeeds, it gives only a little guidance as to how future systems should be constructed, which features are crucial and should be retained, and which are redundant or ineffective and need to be improved.

Marr, an early researcher in artificial intelligence and computer vision, suggested an alternative, which aims at the development of a theory of information processing beyond the level of process description. For Marr, the ideal result in computer science involves: the isolation of a particular information processing problem, the formulation of a computational theory for it, the construction of an algorithm which implements it, and a practical demonstration that the algorithm is successful. At the crux of his approach is the observation and modeling of biological information processing. By studying various animal vision mechanisms Marr was able to identify some common features which exploit the same "natural constraints" to reduce the complexity of image resolution. In so doing, he framed what he called a "computational theory" of low-level vision processing, a hardware and implementation independent, "abstract formulation of what is to be computed and why". The computational theory focuses on the invariant properties of a calculation, irrespective of the algorithmic details of implementing it. These theories guide the scientific process in the traditional way. They are framed from observations of existing phenomena and subsequently tested (by implementing algorithms based on their explanation of the computation) and modified in accordance with the results of experiment. The rest of this dissertation proceeds along these lines.
With Marr's concerns in mind, Chapter 2 begins with a discussion of three living systems which began as sets of individual units with little or no coordinated interaction, and ended up as highly cooperative, hierarchically organized distributed systems. By discussing some biological theories which account for their emergence we will attempt to identify their common thread and unify them into a single computational theory of distributed adaptive self-organization.

In Chapter 3, we will discuss the distributed genetic adaptive system (DGAS), an algorithm which implements the theory developed in Chapter 2. The chapter begins with a description of the learning mechanism used by the system, a variant of the genetic algorithm originally due to Holland. Following this, the DGAS is described in terms of a set of mechanisms capable of identifying and consolidating emergent symbiotic behaviors between independent, distributed units operating in a metric problem space.

Chapter 4 presents Asgard, a software system which implements the DGAS. Designed as an artificial world populated by large numbers of "artificial organisms", Asgard showed the ability of the DGAS to develop a coherent, coordinated hierarchical distributed system from originally independent, uncoordinated individual units. The program is described in some detail and the results of experimentation are presented.

Chapter 5 describes Midgard, a second test of the theory. In Midgard, a modified version of the DGAS is applied to develop a distributed system which balances the computational load in an inter-LAN network. The network is a set of autonomous processors which can communicate with one another and allow resource sharing. In a distributed computer system of this kind, often some processors become heavily loaded, with many jobs waiting in their queues, while other processors stand idle, or nearly so. Clearly gains in overall efficiency can be obtained by migrating jobs from heavily loaded processors to lightly loaded ones. Protocols which attempt to mediate these migrations to increase overall throughput are called load balancing protocols. Each processor in the network uses the protocol to evaluate its loading and the loading of the other systems, to decide whether to accept jobs from other processors or to send some of its own to another system, and to resolve any conflicts in communication which arise. In applying the DGAS to this area, the load balancing protocol is gradually developed from coadapted sets of processes which control the scheduling behavior of the individual processors in a simulated network. Efficiency feedback is used to direct the adaptive process, producing a viable system of distributed schedulers which cooperate with one another to increase network efficiency and decrease waiting times.

Chapter 6 is a discussion of results and some concluding remarks.

CHAPTER II
TOWARDS A COMPUTATIONAL THEORY OF DISTRIBUTED ADAPTIVE SELF-ORGANIZATION

2.0 Introduction

The transition from distributed independent units to coherent, cooperative whole has been made countless times by living systems. In this chapter, we will examine theories of how that transition has occurred at three distinct levels of organization: the step from primitive replicators to prokaryotes, from prokaryotes to eukaryotes, and from simple eukaryotes to multicellular organisms. Each of the theories we will examine emerges from recent work in biology centered around reasserting the role of symbiosis in the generation of evolutionary novelty.
By comparing the common features of the theories proposed to explain each of the three transitions, we will develop a more formal statement of the problem of distributed self-organization and attempt to isolate the computations that have been performed via different algorithms at each level to foster each transition. These common computations will form the basis for a computational theory of distributed self-organization which is elaborated in the last section of the chapter.

2.1 Symbiosis

The primary mechanism of evolution, as described by the neo-Darwinian synthesis, is the natural selection of random variation. The biosphere randomly generates survival strategies, embodied in replicating DNA, which are maintained in proportion to their effectiveness in competing with other strategies for available resources. Competition among individuals is what weeds out the weak, in a nature "red in tooth and claw" where only the robust reproduce. Yet in spite of this competitive atmosphere, current research is revealing that living things are tremendously interdependent, not only in the food chain, but in many less competitive ways. At all organizational levels, from sub-cellular viruses [Reanney 74] to the biosphere as a whole [Lovelock 88], stable symbiotic partnerships are pervasive. Moreover, the idea that evolving symbioses are a source of novelty is enjoying a renaissance.

The idea of symbiosis as a factor in the evolutionary development of new forms is not new. As early as 1883 [Margulis 81], biologists looked to symbiosis to account for the emergence of radical forms, particularly the development of the eukaryotic cell. In 1927, I.E. Wallin advanced the claim that the mitochondria of eukaryotes had their origins as free-living, respiring prokaryotes which had invaded other host cells and adapted to live within their cytoplasm. His theory asserted the importance of symbiosis in the development of novelty, not only at the cellular level, but at all levels of organization. Wallin identified the selection of symbioses as the primary source of evolutionary novelty:

Just as reproduction insures the perpetuation of existing species, the author believes that Symbionticism insures the origin of new species. [Wallin 27]

At the time, Wallin's claims were rejected by mainstream biology. What little hard evidence he presented was suspect and the links between symbiosis and the evolution of novelty were dismissed as unfalsifiable [Margulis 81]. Particularly in Wallin's day, before ultra-structural and molecular analysis -- but even today -- demonstrating the symbiotic origin of a novel form poses a dilemma. When the individual partners in a symbiosis can be clearly identified, then no new form exists. If, on the other hand, a new form has arisen from the integration of symbionts, the boundaries between the partners may have blurred to the point that they share much of the same apparatus for their maintenance; a new form has arisen but the evidence of the separate natures of its constituent symbionts will have virtually disappeared. D.C. Smith [79] has likened the identification of closely linked symbionts to the problem of identifying the vanished Cheshire Cat solely from the traces left by his grin. Since Wallin's time, however, new techniques in microbiology have been used to establish a better case.
Margulis' serial endosymbiosis theory [Taylor 74], which claims that eukaryotes evolved from simpler forms through the integration of a series of symbionts, is now a part of the accepted explanation for their origin. The idea that symbioses have been a major source of evolutionary novelty is now accepted by most biologists [Margulis 81].

One of the primary obstacles to the idea that symbiosis plays a role in the development of new forms has been the belief that genetic exchange between members of different species was all but impossible. This view is changing; ideas from many quarters have led to the breakdown of the barrier. Advances in the evolutionary role of viruses, plasmids, transposons and other mechanisms of inter-cellular gene exchange have forced a reconsideration; inter-specific recombination events do occur, and with meaningful effect. According to Jeon and Danielli [71]:

There is significant evidence that organisms are not limited for their evolution to genes that belong to the gene pool of their species. Rather it seems more plausible that in the time-scale of evolution the whole of the gene pool of the biosphere is available to all organisms and that the more dramatic steps and apparent discontinuities in evolution are in fact attributable to very rare events involving the adoption of part or all of a foreign genome.

Anderson [70] goes even further:

Viral transduction is a key mechanism for transporting segments of DNA across species and phylum barriers and evolution depends largely on this transfer.

That mechanisms for transferring genetic information between neighboring cells allow for the strengthening of symbiotic associations has been advanced in support of the idea that symbiosis may be an important force in the evolution of species [Taylor 79] -- a metasexual phenomenon resulting in "the breakdown of genetic, physiological and spatial isolation between organisms that diverged in the ancient past" [Margulis 76].

The symbiotic paradigm is attractive for my purposes since it presents an avenue for addressing the problem of distributed self-organization as the coadaptation of originally independent processes. Inherent in the idea of symbiotic coevolution is the notion that independent and dissimilar distributed units may spontaneously form partnerships and gradually evolve into a more highly organized and coadapted whole. Any "hardware independent" mechanisms which can be identified from the study of living systems that have developed into coherent, integrated wholes should be useful in constructing a theory to guide the construction of adaptive algorithms to foster the coordination of distributed but interacting computing units.

2.2 Living Transitions

In the next three sections, we will examine the application of the symbiotic perspective to three distinct stages in the evolution of life. Each of the stages is marked by the spontaneous coordination of distributed elements over time. At each stage, the basic units manipulated will be of different complexities, subject to different constraints, on different scales of size. Thus we expect a good deal of divergence in the physical mechanisms which accomplish the transition toward higher organization. At the same time, however, we expect that, in some abstract way, the "computations" these various mechanisms perform should be in many ways analogous.
The identification of these computations will put us on the way toward a computational theory of distributed adaptive self-organization, a phenomenon which marks each of the almost mystical "quantum leaps" which evolving life has performed.

2.2.1 Replicator to Prokaryote

The article of faith underlying current scientific investigations into the origins of life is that pre-biotic organics arose as a natural result of the laws of chemistry from more primitive compounds present in the atmosphere and oceans of the early earth. These relatively complex organics are thought to have recombined under the forces of chemistry and natural selection to form replicating molecules enclosed within lipid membranes. Eventually, through further recombination and selection, these enclosed replicating molecules, the precursors of prokaryotic cells, are believed to have developed the ability to use information encoded within them to synthesize proteins, an especially powerful class of physically active effector molecules. Once capable of self-replication and protein synthesis, these informational molecules expanded their range of phenotypic expression, increasing the complexity, diversity, and intricacy of the chemical interactions they were together capable of producing.

Since the beginning of this century, various pieces of the progression from inanimate chemical compounds to simple cells have been explored, with some measure of success. In particular, evidence to support the idea that the building blocks of life were the spontaneous result of chemical reactions between simpler compounds likely to have been present on primitive earth has accumulated steadily since Stanley Miller first synthesized an amino acid via electrical discharge in a mixture of methane, ammonia and water in 1951 [Miller 53]. Subsequent efforts in what is now called pre-biotic chemistry have resulted in the laboratory synthesis, under conditions simulating the early earth, of over 25 different nucleic acid derivatives, over 40 amino acids and their derivatives, and a number of carbohydrates [e.g., Sagan 71, Ponnamperuma 68, White 76]. Other research has shown plausible routes by which nucleotides linking with each other via hydrogen bonding could have led to the familiar 3',5' double-helical DNA. Several researchers have had success in developing self-replicating nucleic acid chains starting from much simpler precursors [Spiegelman 67, Goulian 68, Khorana 70]. A plausible theory that the 3',5' backbone may have had survival advantage over the disruptive 2',5' bond which is so commonly produced by attempts at artificial DNA synthesis under the conditions of early earth has been given by David Usher [77]. Usher describes how a repeated cycle of warming and cooling under varying humidity favors the formation of the protective 3',5' coil over the more easily hydrolyzed 2',5' chain. This work suggests that DNA may have been the fundamental molecule of life from the very beginning.

Despite progress in various stages which the spontaneous emergence theory requires, many gaps still exist. Nevertheless, enough evidence exists to conclude that at some point, some kind of self-replicating system did emerge from inanimate precursors as a result of natural processes. In its simplest form, this system was likely a self-replicating nucleotide chain in a solution of pre-biotic organics and water, perhaps enclosed by a lipid vesicle which protected it from disruptive effects.
The subsequent evolution of these replicators into the diverse and interdependent prokaryotes is thought to be the result of two effects: the natural selection of variation and symbiotic association.

The selection pressure to which these nucleotide chains were subject resulted from variations in the stability of individual molecules. Prior to the emergence of replication, those molecules which were most stable, i.e., those which remained intact longest, and whose chance rates of formation were highest, became increasingly prevalent. Molecules which formed only rarely, or broke up with high probability shortly after formation, decreased in proportion to these more robust strains. Once the first nucleotide chains emerged, however, selection pressures changed in kind. Nucleotides do more than last; they can also react with their environment in a completely new way to produce copies of themselves; and with the advent of replication came new avenues to stability. These new molecules, the first replicators, had gained the beginnings of a phenotype. Given that the waters of early earth at the time of the first polynucleotides were rich in the pre-biotic organics required for their formation, their individual rates of increase were heavily dependent on their behavior as replicators, on their ability to interact with the environment to create more of their kind. In this phase, those molecules prospered which were the best at making stable copies of themselves.

Dawkins [76] has identified the three distinct pressures that natural selection exerted on them as: longevity, fecundity, and fidelity. Longevity was the same pressure exerted on pre-replication molecules; those molecules whose rate of formation was greater than their rate of decay increased in numbers. For a replicator, though, the stability benefit is twofold, since a replicating molecule's behavior affects its rate of formation. The longer a given molecule lasts, the more copies it can produce. So, other things being equal, long-lasting replicators have selective advantage over short lived ones. The second pressure, fecundity, has to do with the rate at which a molecule produces copies. Other things being equal, molecules which reproduce fastest will increase in numbers fastest. The last pressure, fidelity, refers to the integrity of the copying process. If two molecules have the same longevity and fecundity, the molecule whose copying process is most faithful, i.e., which makes the fewest mistakes, will increase the fastest. If replication is the sole means of phenotypic expression, those replicators increase fastest that last longest, reproduce most rapidly, and produce the most faithful copies.

In addition to the potential for replication, nucleotide chains have another property which afforded them a second phenotypic outlet, a property which allowed them to interact with one another and directly influence not only the rate at which they reproduce, but also the rate at which chains associated with them reproduced. Laboratory experiments suggest that the first replicating chains were short, no more than 10 bases long [Day 84]. Due to flaws in the copying process, which must have been far less reliable than present day mechanisms, many different "strains" of chain should have arisen, each one associated with a different sequence of bases. When short chains of differing lengths, containing a wide variety of base sequences, are allowed to combine in solution, a process known as gene splinting [Day 84] occurs (see Figure 1).
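The pairing logic behind splinting, described in prose below, can be caricatured in a few lines. This sketch is my own toy with invented sequences; it treats strands as parallel rather than antiparallel and ignores bonding energetics entirely.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(chain):
    # the base-pairing partner of each base in the chain
    return "".join(COMPLEMENT[b] for b in chain)

def splint_join(left, right, splint):
    # a splint whose first half pairs with the tail of `left` and whose
    # second half pairs with the head of `right` holds the two chains
    # together, yielding one longer strand
    half = len(splint) // 2
    tail_ok = complement(splint[:half]) == left[-half:]
    head_ok = complement(splint[half:]) == right[:len(splint) - half]
    return left + right if tail_ok and head_ok else None

print(splint_join("GAA", "CCT", "TTGG"))   # -> GAACCT: chains joined
print(splint_join("GAA", "GGT", "TTGG"))   # -> None: no complementary match

The point of the caricature is that which chains associate, and how strongly, is determined entirely by their base sequences -- exactly the property that lets selection act on the associations themselves.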
Base pairing occurs between chains which share a complementary sequence, and if the chains are of different lengths, a set of unmatched bases may form a "splint", extending as a single chain from the double stranded complex. If this splint encounters another chain, whose end bases are arranged in the complementary sequence, a longer double stranded molecule forms, possibly extending its own "splint". Through this splinting process, associations of various strengths could form between the different chain varieties, characterized by their particular base sequences. The frequency and strength of these associations would have varied with the extent to which those associations promoted the longevity, fidelity and fecundity of their constituent parts. As the chains grew longer, the strength of association between sub-chains would have varied with the distance between them. Those composite chains which increased the longevity, fecundity, and fidelity of their component parts were the first symbioses.

Figure 1. Gene Splinting

Associations formed between proximate single-stranded molecules with complementary base pair segments, and those associations which conferred survival advantage strengthened and proliferated.

At some point, by an as yet unknown evolutionary path, the phenotypic expression of the polynucleotides somehow expanded to include protein synthesis. While many of the individual steps are still unclear, it is assumed that some of the longer replicating molecules developed tRNA and gained some rudimentary synthetic capabilities which, through the action of association and selection, eventually resulted in a minimal cell. The components of these cells were most likely some mix of the following [Morowitz 67]: some strands of replicating DNA (containing, in total, about 50 cistrons), strands of RNA, both transfer and messenger, some catalyzing enzymes (producible by internal protein synthesis), a nutrient rich cytoplasm, and an enclosing lipid membrane.

At this point the nature of selection on the confederated replicators overseeing these systems would have changed again; the wide range of phenotypic possibilities which protein synthesis provided changed the emphasis of the selection process. Protein synthesis so escalated the rate and range of chemical expression that rather than favoring those nucleotide chains with the best inherent replication abilities, chains were favored whose synthesized protein complexes resulted in the most efficient replication of the overall system. The basis for selection remained the same -- longevity, fecundity and fidelity -- but the means for securing improvement expanded along multiple paths, as the enzymes produced by decoding the replicators as protein templates catalyzed new reactions, yielding new compounds. Proteins could speed reaction rates to increase fecundity, align molecules more precisely to increase copying fidelity, and construct more and more elaborate physical and chemical defenses against entropy to increase longevity. And the various substances indirectly produced and metabolized by the various different nucleotide combinations introduced a new level of interaction.

To illustrate the effect the introduction of protein synthesis had on the evolution of the replicators, we will consider one of the most potent chemical avenues which was opened -- the metabolic pathway. By the time the minimal cell appeared, much of the store of pre-biotic substances had been depleted [Day 84].
Where once complex molecules which provided the raw materials for the cell's reactions and the energy which drove them -- riboses, purines, protein and non-protein amino acids -- had been available in abundance, changes in external environmental conditions reduced their abiotic rate of formation. As the supply grew shorter, competition arose between cells for their acquisition. One possible result of this dwindling of basic resources could easily have been general extinction. The various cells, each with differing capabilities for acquiring their needed inputs, might have been gradually crowded out by ever more efficient acquirers. All the while, the store of basic resources would have dwindled until the organisms which survived the competitive milieu finally died from chemical starvation. While this may, in fact, have occurred in some cellular populations, that it did not occur in every one is made manifest by the present prokaryotic world. In the face of dwindling resources, cellular populations adapted, developing mechanisms for synthesizing the needed but disappearing complex molecules from simpler, more readily available ones. Today, "prokaryotic microbes can metabolize nearly any organic compound made by nearly any organism" [Margulis 70]. The modern prokaryotes, the descendants of the first minimal cells, are the most metabolically diverse of living creatures. All the metabolic pathways known to living creatures are known to prokaryotes; the origin of these pathways is a story of symbioses.

Many of these reaction paths involve the synthesis of one compound from several simpler precursors, and involve the sequential action of many different enzymes. The individual steps comprising the reaction chain confer advantage on the organism only as a whole; the intermediate products involved may or may not be of use. The existence of these non-functional intermediates made accounting for pathway evolution one of the classic problems of explaining a macroevolutionary phenomenon in terms of micro-evolutionary steps. The present theory of pathway formation is due almost entirely to the insight of Horowitz [45]:

the evolution of the basic syntheses proceeded in stepwise manner, ..., but the order of attainment of the individual steps has been in the reverse direction from that in which the synthesis proceeds -- i.e., the last step in the chain was the first to be acquired in the course of evolution, the penultimate step next, and so on.
The symbiosis here exists on two levels: at the top level, cells which produce complementary enzymes are mutualistic; but underlying their association is a symbiosis between the recombinant base sequences, the genes, which code for the enzymes which actually catalyze the reactions. Among the mechanisms available, first physical cohesion and finally genetic exchange and incorporation of the relevant 22 information (through transformation or viral transduction) may have achieved advantage over previous forms giving rise to a new species containing a replicating composite capable of the entire pathway. We have identified two occurrences of the integration of previously independent units in the transition from simple replicator to simple cell. Both have depended on the emergence, selection, and improvement of symbiotic associations between separate processes. The first, the deveIOpment of long nucleotide chains from shorter, "less than ten base" units was accomplished via the preservation of chance interactions between units with complementary base pairs that improved the replicating abilities of the constituent parts. Subsequent interactions, between ever longer chains, tested associations of associations, retaining those which proved most advantageous. In the second instance, replicating units within distinct cells encoding for complementary enzymes conferred greatest advantage on their respective cells when proximate to one another. This advantage resulted in more offspring for the spatially associated cells, leading to increasing levels of interaction, culminating in the fusion of the codes into a single replicating strand. While the physical mechanisms differed, the common principle of selection of association was a driving force in these transitions. 2.2.2 Prokaryote to Eukaryote By the early Precambrian, some 2 billion years ago, the prokaryotic microbes had evolved most of the biochemical pathways characteristic of modern organisms [Margulis 70]. The abiotic organics which drove early evolution were replaced by substances produced biotically from simpler precursors. The anaerobic atmosphere of the early earth was transformed by the oxygen-producing photosynthetic microbes. The ozone layer had formed, reducing the amount of ultraviolet radiation which had provided much of the energy used by earlier life forms. 23 The biosphere became its patchy, recognizable, competitive self: associations, symbioses, parasitisms and predations evolved. Many different kinds of relations among microbes, more and less involved, came into play [Margulis 70]. From these beginnings, in the space of roughly 1 billion years, a new kind of organism, radically different from these prokaryotes arose. These new organisms, the eukaryotes, formed rapidly, and their evolved descendants today show a markedly greater level of complexity than modern prokaryotes. The currently accepted theory of eukaryote evolution, the serial endosymbiosis theory, or SET [Taylor 7 4], is largely due to the work of Lynn Margulis. Margulis’s thesis is that a series of endosymbioses between various strains of prokaryotes resulted in the eukaryote "cell"; according to Margulis. the eukaryote is actually a polygenomic unit, containing at least three separate sets of replicating informational molecules. The theory holds that two of the major organelles, the mitochondria and plastids, arose from the gradual integration of distinct prokaryotic endosymbionts with a host microbe. 
In Margulis' words:

Eukaryotic cells are considered to have originated as communities of cooperating entities that had joined together in a definite order; with time, the members of the cooperative, already skilled in their specialties, became organelles [Margulis 81].

According to the SET, a large, relatively unspecialized microbe served as host, the so-called protoeukaryote [Margulis 70], which powered itself by anaerobic fermentation of glucose to pyruvate by the Embden-Meyerhof pathway. By some means, either invasion or phagocytosis, a smaller aerobic Gram-negative microbe entered the plasma membrane of the host. This microbe, the protomitochondrion, contained the Krebs-cycle enzymes and the cytochrome system for total oxidation of carbohydrates to carbon dioxide and water. Originally free living, it is presumed to have been at least a minimal replicating system, capable of controlling its own reproduction.

Once inside the host cell, the protomitochondrion remained undigested (by unknown mechanisms) and began to reproduce in the host's cytoplasm. A mutualistic symbiosis developed. The host gained a new set of metabolic paths providing new sources of energy and making possible a new set of syntheses [Margulis 81]. The protomitochondrion, for its part, gained safe haven -- an osmotically controlled, chemically buffered habitat. In time the two became completely dependent on one another. Only under very special circumstances can modern eukaryotes survive mitochondrial dedifferentiation or loss [Margulis 81]. The mitochondrion gradually lost many of the traits of a free-living organism; protected from the outside environment, able to exploit the host's reproductive mechanisms, the protomitochondria lost their cell walls (but not their thin cell membranes) and the ability to produce many of the enzymes associated with reproduction. Nevertheless, many functions of free-living cells were retained; mitochondria supervise their own reproduction with mitochondrial DNA and synthesize many of their own proteins. They produce their own ribosomal and transfer RNA.

While at one time some biologists were skeptical that such endosymbioses could have evolved, a similar phenomenon has been observed in a particular strain of laboratory amoebas [Jeon 76]. This strain of amoeba spontaneously contracted a bacterial infection; the infection involved a large number of bacteria per amoeba, on average 40,000, and at the outset the infection was "very harmful" to the host cells, drastically decreasing their rate of growth. After less than 1000 generations, however (a period of 5 years), the "infected" amoebas had regained their vigor and, moreover, had become dependent on the infecting bacteria for their survival. A series of micrurgical studies of the amoebas showed that an infected nucleus, when combined with the cytoplasm of an uninfected amoeba, did not form viable cells, while the reverse combination did. Jeon concludes that the "infected" amoebas have become completely dependent on the bacteria and cannot survive in their absence. Neither do the infecting bacteria seem to be able to survive outside the amoebas. The infecting bacteria are, however, able to infect other amoebal species, if properly treated.

It may at first seem strange to cite a laboratory example of the development of a symbiont which confers no advantage on its host (the "infected" amoebal strain was no more fit than uninfected strains) when propounding selection of symbioses as a principal means of adaptive evolution.
Taylor [79] made a similar observation in a review of Jeon's results:

Infected amoeba appear to have no obvious advantage over non-infected cells, a point worth recalling when questions of the selective value of symbioses are raised.

Jeon's results, of course, do not demonstrate that all symbiotic associations will prove more robust than all other competitors, regardless of external conditions. What they do show is that cellular mechanisms are sufficiently robust to withstand and adapt to massive intracellular infection. The rapid rate of coadaptation suggests that these mechanisms may even be pre-adapted to foster such associations. Those who would argue for symbiosis as a force in the evolution of complexity certainly do not argue that all symbioses necessarily form to mutual advantage, only that those which do confer advantage tend to increase in frequency and strength over time, and are selected for, and that those which confer disadvantage decrease in frequency and strength over time, and are selected against.

The plastids, the other organelle whose origins are accounted for by the SET, are thought to have formed in much the same way as mitochondria. Already specialized bacteria, primitive photosynthesizing cyanobacteria, either invaded, or were engulfed but not digested by, a mitochondria-bearing protoeukaryotic host. Upon entry, these microbes entered into symbiotic association with the host, giving it photosynthetic capability in exchange for the advantages of a living cell as a habitat. Their subsequent integration with the host is thought to have followed a pattern similar to that given for mitochondrial development. The organelles retained some of their own informational molecules, but lost many of the characteristics of free living cells, replacing these characteristics by exploiting resources provided by the host.

While no true intermediate forms have been identified, suggestive examples have been found. Paramecium bursaria is an algal-ciliate symbiosis similar to the kind expected in the formation of plastids. The protoplastids in this case are the photosynthetic algae Zoochlorellae, a close relative of the free-living Chlorellae. Paramecium containing the symbiont Zoochlorellae can survive even when other food sources are scarce, provided there is adequate light; Paramecium without the symbionts cannot. As is usual in stable symbiotic relationships, the rates of reproduction of host and proto-organelle are approximately equal. The symbiont reproduces on its own and is cytoplasmically transferred to Paramecium offspring in order to maintain the association.

Mitochondria and plastids may not be the only organelles gained through the selection of symbiotic associations. Several other bodies have been identified as possible endosymbionts [Smith 79]. Many of the DNA-containing bodies found within cells may well be of free-living origin, but may never be positively identified as such; if symbionts grow more and more integrated with their hosts, the identification of their origins becomes increasingly difficult.

According to the SET as propounded by Margulis [86a], the intracellular relationship between the various bacterial forms began as a "loose confederation", a multicellular colony of specialized prokaryotes enclosed within a single plasma membrane. At first
Gradually, as the processes associated with mitosis developed, the individual members became increasingly coadapted, and their interrelationships grew more rigid; much of this coordination is probably accomplished through information encoded in nuclear DNA. Once again, the selective retention and strengthening of positive interactions between distributed units resulted in the development of a coordinated whole composed of specialized, cooperative processes.

2.2.3 The Origins of Multicellularity

Once the cellular level of organization was reached, selection began on associations between cells. The end result has been a diversity of multicellular forms, most of them eukaryotic, which represent yet another quantum leap of complexity and capability. In their external behavior, the multicellular organisms interact on an entirely different scale of size and time than preceding forms of life and have created and exploited niches completely unreachable by their smaller components. Multicellular life has developed mechanisms of tremendous complexity, and used these mechanisms to develop radically new behaviors. This increase in external complexity has been matched by an increase in the complexity of their construction. In place of simple fission and mitosis are the complex processes of embryonic development, an intricate balance of meiosis and mitotic differentiation which produces organisms with trillions of specialized cells from a single cell.

Much of the phylogeny of multicellularity, as well as most of the mechanisms by which multicellular development proceeds, is largely unknown. The secrets of the two seem to be inextricably linked:

    The fact that differentiation is still a mystery means that the last of the great Precambrian events -- the origin of multicellularity -- is a problem for which we have no solution [Barbieri 85].

Nevertheless, theories of their origins do exist. The most plausible starting place is that the multicellular forms arose from associations between unicellular organisms. In its simplest form, this is all that multicellularity is; the line that separates colonial organisms from those somehow "truly" multicellular is arbitrary at best. On the way to considering the origin of what are usually thought of as "true" multicellular forms -- the organisms which develop by differentiation of a single-celled embryo -- we will examine first the development of a different kind of multicellularity, the kind found in organisms like Mixotricha paradoxa. A popular freshman biology text [Wallace 81] defines a multicellular organism as:

    one that is composed of a number of cells that cooperatively carry out the functions of life, cells that cannot survive independently under natural conditions.

Under such a definition, the many obligate symbioses found in the biosphere must be counted as multicellular organisms. In organisms like Mixotricha, different specialized unicellular organisms, whose ancestral forms were originally free-living, evolved into symbiotic complexes whose parts became so integrated that independent life was no longer possible. These associations begin as weak chance interactions, accidents of proximity. If an association proves mutually beneficial, however, each member of the association produces more offspring when in the presence of the slightly complementary other than in its absence. As the relative concentrations of the complementary types increase in a given area, so does the probability of forming partnerships.
So long as the complex proves viable, selection begins to act on the partners as a genetic complex, favoring variations in the complex which increase the integration and specialization of the partners.

The resulting organisms can be models of distributed coordination. Mixotricha is a complex of no less than five separate species of organisms. The protist host contains two different species of bacteria, one responsible for the digestion of cellulose, the other for cellular respiration. On its surface, two different species of spirochetes -- anchored to special sites on the host's plasma membrane with the aid of another species of "anchoring" bacteria -- provide motility. In all, the complex is composed of well over a million cells. The spirochetes are capable of coordinated movement through some unknown mechanism. Each cell of the complex maintains its own replicating machinery, and the entire complex reproduces itself from the coordinated action of each cell. It is unculturable outside of its specialized environment [Margulis 81].

While there are examples of integrated, "multicellular" forms composed of specialized cells from distinct species, they are far more rare than the "true" multicellular organisms, the multicellular clones. The theory of their origins also begins with chance association by accident of proximity. Colonies are most likely the result of the failure of daughter cells to separate completely from the parent during division. If the physical coherence -- the association between mother and daughter -- conferred advantage on the partners, by whatever means, the tendency to remain connected after division would be favored, resulting in primitive colonial organization [Margulis 86b].

The emergence of colonies sets the stage for differentiation. The physical composition of the colony is a major determinant of each cell's external environment. This composition is in permanent flux as new cells are added and others die or are removed. When new cells are added, the potential for variation exists as a result of imperfect copying processes and variations in the intra-colonial environment. From these variations, simple specializations can arise (e.g., the assumption of specific orientations of individual parts with respect to the rest of the colony). What follows is a kind of clonal selection [Burnet 69]. Differentiated cells which confer advantage on the colony are more likely to be retained by the colony. To the extent that such differentiations are useful, colonies which produce them will tend to grow in size and spawn other colonies, provided the differentiations have a heritable basis.

At the outset, colonial organisms spread themselves in irregular fashion; new colonies are formed whenever one or more cells break away from the parent colony. Those colonies formed by groups which contain cells from the differentiated types will gain selective advantage over those which are missing them. The associations between the primitively differentiated cells within these colonies will have been selected for. Examples of this kind of organization still exist, for example the green alga Gonium. Any mechanism which arises to maintain the association of the advantageously differentiated types will be selected for by the same logic. Variations in one or more of the cell types which tend to organize the production of new colonies, and insure that the proto-colonies contain all the beneficial differentiated types, will be selected for.
One possible mechanism a differentiated colony may have used to perpetuate itself would have been to control the construction of its proto-colonies and regulate the types of cells comprising them. As they evolved, these processes would begin to insure the appropriate mix of cellular variants, providing a means for strengthening the association between the various specializations. The work of a number of authors [e.g., Anderson 70, Reanney 74, Jeon and Danielli 71, Temin 76, Steele 81] suggests a second, more powerful mechanism by which colonies may have insured the presence of the symbiotic collection of differentiated types in their offspring colonies. Work with the various means of DNA transfer akin to viral transduction has led some researchers to believe that these mechanisms have been a major evolutionary force, allowing the intercellular exchange of genetic information. The presence of such mechanisms gives replicating molecules yet another avenue for strengthening advantageous associations. Given that transduction allows the transfer of informational molecule fragments from one specialized cell to another, cells can arise in colonies which contain the "blueprints" for several different specialized types. Such cells, set off on their own, would then be capable, by mitotic division, of producing each of the specialized cell types encoded within them. Emergent mechanisms to regulate the differentiation process would be selected for in proportion to their ability to insure appropriate development.

The interaction of the two mechanisms -- construction of mixed proto-colonies and integration of differentiated types -- gives us a plausible set of selectable associations with which to develop colonial organisms with some level of cell specialization. In organisms like these, each cell in the colony is a potential germ-line replicator. Variations arise as the colony grows, and those which prove advantageous are preserved and have the potential to generate more copies of themselves in "offspring" colonies.

But what of those organisms which arise from a single cell? According to the Weismann doctrine, the somatic gene line is completely separated from the germ line. Clearly, through the process of development and maturation, some clonal selection will occur, as variants of the original type either flourish or fail. An organism composed of thousands, or millions, even trillions of cells, has many encounters with the environment in which some cell combinations succeed where others fail. But according to accepted biological theory, none of the genetic information stored in the somatic cells can be propagated. While this "somatic information" affects the survival chances of the organism, if the germ-line genes cannot be altered to incorporate the genetic modifications stored in the somatic line, the offspring of that organism will be left unaltered by the information gleaned from its parent's somatic selection process.

An alternative view has been provided by Steele [81]. In recent work with immunological inheritance, Steele has identified a mechanism whereby information gained through clonal selection can be transmitted via C-type RNA viruses to germ-line cells, where it is incorporated into DNA via reverse transcription [Temin 76]. The idea is similar in character to the selection of cells in a colony. Each division potentially produces a variant type which finds itself in contact with other cells and the external environment.
Those complexes which work well together survive best, replicate most, and have the greatest chance of being preserved by transduction into gamete cells. Many mechanisms are missing in Steele's theory, and his ideas are not without their opponents; nevertheless, this new area of research provides yet another mechanism for increasing the strength of association between cooperative processes.

2.3 Distributed Adaptive Self Organization

In the previous three sections, we have reviewed several different natural transitions which have resulted in the generation of cooperative, distributed wholes from originally independent distributed units:

(1) the emergence of long nucleotide chains from associations between independent but mutually compatible, naturally occurring informational molecule fragments;

(2) the emergence of integrated gene complexes capable of complex, multistage chemical reactions from associations between independent, spatially distributed genes;

(3) the emergence of hierarchically structured eukaryotic cells from associations between originally independent, spatially distributed prokaryotes; and

(4) the emergence of simple multicellular organisms from associations between independent, spatially distributed cells.

The physical mechanisms involved in each transition were of course very different from one another: the first involved certain properties of chemical attraction; the second, transduction between neighboring entities; the third, the intracellular mechanisms which establish and promote endosymbionts; the last, a combination of physical cohesion and transduction. The selection pressures which drove each level were also varied: explicit chemical stability in the first case, level of adaptedness of encoded enzyme complexes in the second, and organismic viability in the third and fourth cases. Yet despite the differences in the entities manipulated and the problems encountered, there is a common thread in each transition -- the identification, selection and strengthening of symbiotic associations.

2.3.1 The Common Thread

At any point in time, the fundamental units in each of the four examples can be viewed as distributed processes, each of which interacts with a portion of the environment -- its context -- in a characteristic way, based on encoded information which controls the process. The units are interchangeable in that the scheme for decoding this internal program is the same for each of the units. The processes continually take in information from their contexts, perform internal operations, and alter those contexts by producing output information. Each process is of limited complexity and capability, and the nature of each unit's performance is similar.

In each of the natural transitions described in the previous section, the set of active processes is dynamic; processes cease functioning and are removed from the set, others are initiated and added to it. A process's longevity, the length of time it remains active, is a stochastic function whose mean increases with the process's relative adaptedness to its environment. While active, each process is periodically capable of initiating a new process, in a new context, controlled by an internal program which is a more or less exact replica of itself. The rate at which a process spawns "offspring" processes is also a stochastic function whose mean increases with adaptedness. Short of spontaneous emergence, errors in the copying process provide the only source for potentially adaptive variation in these systems. (A minimal sketch of this process abstraction appears below.)
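To make the abstraction concrete, the following sketch models longevity and fecundity as stochastic functions of adaptedness, with imperfect copying as the only source of variation. The particular survival and reproduction probabilities, and the bit-string program representation, are illustrative assumptions, not quantities specified by the theory:

    import random

    def step(process, fitness, base_death=0.5, base_birth=0.5):
        """One time step for an active process.

        Mean longevity (1 / death probability) and mean fecundity
        (per-step birth probability) both rise with adaptedness.
        Returns (survives, offspring_or_None).
        """
        f = fitness(process)                        # relative adaptedness >= 0
        p_death = base_death / (1.0 + f)            # better adapted -> lives longer
        p_birth = base_birth * f / (1.0 + f)        # better adapted -> spawns more
        if random.random() < p_death:
            return False, None
        offspring = copy_with_variation(process) if random.random() < p_birth else None
        return True, offspring

    def copy_with_variation(process, p_error=0.01):
        """Imperfect copying: each program symbol may be miscopied."""
        return "".join(random.choice("01") if random.random() < p_error else bit
                       for bit in process)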
Make the copying process error-free and there is no avenue for change; if it is fraught with errors, though, the system loses its capacity to "remember" adaptive behaviors. But how, if fidelity is a selection pressure, can any mechanisms exist to introduce variation into the copying processes? It is easy enough to account for the persistence of point mutation errors, introduced by entropy; even though fidelity achieves selective advantage, systems made of physical components may be unperfectable -- but random variation does not make for much of a search algorithm. Adaptation would proceed far more quickly if means could be developed to concentrate variation in particularly promising areas. But, once again, if fidelity is a fundamental selection pressure, how can mechanisms which introduce specific kinds of variation persist?

Despite the seeming contradiction, such mechanisms do exist. Sex certainly does, and a number of authors [e.g., Waddington 61, Mayr 70] have claimed the existence of other genetically controlled mechanisms which permit certain kinds of change and prevent others. These mechanisms act to prevent changes in proportion not to their ease of prevention, but to their tendency to produce non-viable variants. If fidelity is the test, such mechanisms should not become fixed in a population. The question is a complex one, but at least part of the answer can be gained by analyzing the question from the point of view of Dawkins' [76] "replicator", the basic unit of replication (in the current case a process). When the environment provides outlets for phenotypic expression aside from replication, even though a set of gene-controlled processes interact so closely that they reproduce as a unit, selection for fidelity will act primarily at the lowest level, on the individual processes. The complex will receive its primary selection pressure from the effects its phenotype has on the complex's longevity and fecundity. A process, or set of processes, that is capable of influencing the variability of the other processes which its offspring are in association with will influence the phenotype of the offspring complexes. If this variation process increases offspring survivability, the "change regulating" process will have improved the longevity and fecundity of its kind, and its gene will persist.

In our description, then, we will not characterize the copying fidelity of processes in the same way as their longevities and fecundities. Instead, we will say that the copies produced are subject to some source of variation, be it random or adaptive. Thus, as processes spawn other processes, at certain stages variations will arise, diverging in input/output behavior from the original, eventually producing processes which can be called novel.

One of the most important characteristics common to each of our example systems is that the individual processes in the system are not isolated from one another. When the contexts of two processes overlap, those processes may interact directly -- some of the outputs of one process acting as inputs to the other, or else the same input acting on both processes. Even if their contexts are disjoint, two processes may interact indirectly, through an intermediary process which interacts directly with both of them. These interactions change the environments of the processes involved, and thus may affect their performance -- their relative adaptedness -- for better or worse.
According to the ideas we presented in section 2.2, coordination between originally independent processes has arisen as the gradual result of the selection of stable interactions, or associations. Margulis [81] describes these associations as follows:

    Stable symbiotic partnerships are more fit than the partners are individually, in the sense that they leave more offspring when they are together than when they are apart. The genes of symbiotic partners are in close proximity; natural selection acts upon them as a unit.

At the core of each symbiotic relationship we have discussed are true replicators, the molecule fragments which encode context-dependent information and increase their frequency in the overall gene pool by associating themselves with other replicators. Selection operates on the phenotypic expression of these associations in particular contexts, furthering and strengthening those associations that produce effects which confer advantage.

Associations at different levels of organization are maintained by different physical mechanisms. It makes little difference to the replicating molecule whether its association with its symbiont genes is secured via the phosphate-sugar backbone of a single DNA strand, the centromere of a chromosome, the nuclear membrane of a eukaryotic cell, the plasma membrane of a cell containing endosymbionts, the cohesion of cells in a colony, or any other means. What does matter is the strength of the association and its selective value with respect to local environmental conditions.

Selection of association acts in two ways. First, the frequency of association between partner processes tends to increase with the reproductive advantage conferred by the partnership. Over time, the probability that a newly generated process of type A will share a context and interact with some process of type B increases to the extent that type A and type B processes form stable associations. Second, the strength of association between symbionts also tends to increase with reproductive advantage. (All associations are not equal. The strength of association between processes varies from non-interaction at one extreme to fusion at the other.) In each of the situations examined, gradations in integration between partners exist, and those symbionts which persist over longer periods of time tend to become increasingly interdependent [Margulis 81]. For a given process type in association, as the strength of symbiosis with the partner type increases, the interaction with the partner type becomes an increasingly significant portion of the given type's environment. As copies of the given type are generated, those variants which improve performance by increasing adaptation to the properties of the symbiont type will gain survival advantage, furthering the integration process.

To summarize, then, the apparent mechanism which developed coordinated distributed systems from originally uncoordinated units in our examples involved the emergence, selection and development of symbiotic associations. Each system is composed of a dynamic set of processes; the processes have limited "lifespans" during which they interact with portions of the environment and periodically initiate other processes. These offspring processes are replicas of their parent process, but are subject to some form of variation. When advantageous associations arise between processes, those associations increase in frequency and strength so long as they prove viable.
As two types become more and more intimately associated, selection begins to favor those variants which form the most integrated complexes. If the system environment is such that composite processes can confer survival advantage on their constituent sub-processes, the processes in the system become increasingly integrated. Given a sufficient source of variation and adequate resources to maintain the process "population", this adaptation process should be able to continue so long as increased performance in the environment is possible.

2.3.2 A Computational Theory

From the preceding observations, we will attempt to construct a more formal statement of the conditions sufficient to induce the phenomenon of distributed self-organization, which will serve as our computational theory.

Let E be an information environment specifying an information space U and a performance function F_E. Let P = {p_1, ..., p_i, ...} be a set of programs which can manipulate information in U. An active process a is then defined as an ordered triple (p_a, I_a, O_a), where p_a in P is the program controlling the process, I_a ⊆ U is the input space of a, and O_a ⊆ U is the output space of a. The performance of a process a under its environment is given by the positive real-valued function F_E(a). If F_E(a) > F_E(a'), a is better adapted to E than a'.

Processes remain active for a finite span of time, during which they periodically spawn processes a' = (p_a', I_a', O_a'). The offspring process program, p_a', is produced by copying the parent program, p_a, but we assume that the copying process is subject to adaptive variation.

A distributed information system, S, is then defined as the ordered triple (E, P, A), where E is the system environment, P is the set of possible programs in the system, and A is a dynamic set which, at any time t, is A(t) = {a_1, a_2, ..., a_k(t)}, the set of processes active in the system at t. The processes in A(t) can be partitioned into m classes, C_1, C_2, ..., C_m, 1 ≤ m ≤ k(t), based on their controlling program, such that:

    for all a in C_i and a' in C_j:  p_a = p_a'  if and only if  i = j.

We expand the domain of F_E to include sets A' ⊆ A as follows:

    F_E(A') = ( Σ_{a in A'} F_E(a) ) / |A'|.

Define Interact(a, a') as a boolean-valued function which is true if and only if either:

1) (I_a ∩ I_a') ≠ ∅, or (O_a ∩ I_a') ≠ ∅, or (O_a' ∩ I_a) ≠ ∅; or

2) there exists a'' such that Interact(a, a'') and Interact(a'', a').

We say a process a is associated with process a' if a's interaction with a' improves a's fitness. To capture this idea, we define another boolean function Assoc(a, a') as follows. Let C_{i|j} = { a in C_i | there exists a' in C_j such that Interact(a, a') }. Then:

    Assoc(a, a')  if and only if  Interact(a, a') and F_E(C_{i|j}) > F_E(C_i - C_{i|j}).

We expand the domain of Assoc to include sets A', A'' ⊆ A as follows. Let A'|A'' = { a' in A' | there exists a'' in A'' such that Interact(a', a'') }; then:

    Assoc(A', A'')  if and only if  A'|A'' ≠ ∅ and F_E(A'|A'') > F_E(A' - A'|A'').

Every association is assumed to have a strength which reflects its probability of persistence, i.e., the probability that a and a' will remain associated while active. We model this as the real-valued function:

    Strength(a, a') = 0, if not Assoc(a, a');
    Strength(a, a') = k, otherwise,

where k in (0, 1) denotes the strength of the association. We are now in a position to define the level of system coordination in a distributed information system, LC(S), in terms of the strength of the associations between its component processes. (A sketch of the predicates defined so far is given below.)
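As an illustration only, the following sketch realizes Interact, Assoc and Strength over a toy population in which a process is a tuple (program_id, inputs, outputs) with finite input and output sets. The representation, the one-hop truncation of the transitive closure, and the fixed strength constant k are assumptions of the example, not part of the theory:

    def interact_direct(a, b):
        """Ia ∩ Ib ≠ ∅, or Oa ∩ Ib ≠ ∅, or Ob ∩ Ia ≠ ∅."""
        return bool((a[1] & b[1]) or (a[2] & b[1]) or (b[2] & a[1]))

    def interact(a, b, population):
        """Direct interaction, or indirect via an intermediary
        (transitive closure truncated to one hop for brevity)."""
        if interact_direct(a, b):
            return True
        return any(interact_direct(a, c) and interact_direct(c, b)
                   for c in population if c not in (a, b))

    def mean_fitness(procs, fe):
        return sum(fe(p) for p in procs) / len(procs) if procs else 0.0

    def class_given(ci, cj, population):
        """C_{i|j}: members of Ci that interact with some member of Cj."""
        return [a for a in ci if any(interact(a, b, population) for b in cj)]

    def assoc(ci, cj, population, fe):
        """Assoc at the class level: the interacting subset of Ci
        outperforms the rest of Ci."""
        ci_j = class_given(ci, cj, population)
        rest = [a for a in ci if a not in ci_j]
        return bool(ci_j) and mean_fitness(ci_j, fe) > mean_fitness(rest, fe)

    def strength(a, b, ci, cj, population, fe, k=0.5):
        """0 if not associated; otherwise a constant k in (0, 1)
        (k is fixed here purely for illustration)."""
        return k if (interact(a, b, population)
                     and assoc(ci, cj, population, fe)) else 0.0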
First, let FI(i, j) denote the frequency of interaction between processes in class C_i and processes in class C_j. Let SA(i, j) denote the strength of association between processes in class C_i and processes in class C_j, with Assoc counted as 1 when true and 0 when false:

    SA(i, j) = 0, if Σ_{(a, a') in C_i × C_j} Assoc(a, a') = 0;

    SA(i, j) = Σ_{(a, a') in C_i × C_j} Strength(a, a') / Σ_{(a, a') in C_i × C_j} Assoc(a, a'), otherwise.

And let AA(i, j) denote the advantage of association conferred on processes in class C_i by their association with processes of class C_j, defined as:

    AA(i, j) = [ F_E(C_{i|j}) - F_E(C_i - C_{i|j}) ] / F_E(C_i).

The level of coordination of class i processes with class j processes, LC(i, j), is given by:

    LC(i, j) = FI(i, j) * SA(i, j) * AA(i, j).

The level of coordination of a system S is then:

    LC(S) = Σ_{i=1...m} Σ_{j=1...m} LC(i, j) / m².

Having identified the relevant variables, we are now ready to set out the condition which our observations indicate will produce increases in the level of coordination of a system S. If processes are spawned such that, whenever F_E(C_{i|j}) > F_E(C_i):

    d/dt FI(i, j) > 0 and d/dt SA(i, j) > 0,

then:

    d/dt LC(S) > 0.

2.4 Summary

In this chapter we have attempted, by observation of the origins of several cases of distributed process integration in biological systems, to develop a computational theory of distributed adaptive self-organization to serve as a guide for the design of algorithms which develop artificial distributed adaptive systems. In sections 2.1-2.3 we examined the development of four different instances of the self-organization of distributed information processors. By abstracting the hardware-independent features of each system, we identified the process of selection of association as a crucial mechanism in the automatic coordination of originally independent units. In the last section, we attempted to formalize the ideas expressed in the other sections.

CHAPTER III

A DISTRIBUTED GENETIC ADAPTIVE SYSTEM

3.0 Introduction

In this chapter we will discuss an algorithm which implements some of the ideas developed in the previous chapter. The model presented here, the DGAS, uses a distributed genetic algorithm to manipulate independent, computationally limited processes which interact with one another in a common environment. The purpose of the genetic algorithm in this context is twofold. First, it is used to provide the source of directed variation which the computational theory suggests will increase the rate of individual process adaptation. Second, when certain mate selection and offspring placement strategies are incorporated into the model, the distributed genetic procedure provides a mechanism for identifying and strengthening beneficial associations between processes operating in a common problem domain.

3.1 The Basic Model

In this idealized model of the natural genetic system, the task environment is defined in terms of a space, visualized as a two-dimensional integer grid of infinite extent, much like the standard setting for cellular automata [Burks 70]. Within this grid, environmental conditions are defined in a local way; i.e., conditions exist in regions of the space and vary with time and/or the action of the living systems in the region. Living systems are modeled as abstract processes, each one controlled by a finite, encoded program, which operate at some location in the grid and interact with the environmental conditions present in some region surrounding them. (A minimal sketch of such a grid environment is given below.)
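A minimal sketch of such a grid environment, assuming (for illustration) a sparse dictionary of conditions keyed by integer coordinates; the class and method names, and the single scalar condition per cell, are conveniences of the example rather than part of the model:

    from collections import defaultdict

    class GridEnvironment:
        """Unbounded 2-D integer grid with locally varying conditions."""

        def __init__(self):
            self.conditions = defaultdict(int)   # (x, y) -> local condition value

        def region(self, x, y, radius=1):
            """Conditions in the square region surrounding a process at (x, y)."""
            return {(i, j): self.conditions[(i, j)]
                    for i in range(x - radius, x + radius + 1)
                    for j in range(y - radius, y + radius + 1)}

        def alter(self, x, y, delta):
            """Processes act on the environment by changing local conditions."""
            self.conditions[(x, y)] += delta

    env = GridEnvironment()
    env.alter(0, 0, 5)           # some condition established at the origin
    print(env.region(0, 0))      # a process at (0, 0) senses its surroundings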
Each process is capable of:

1) detecting the conditions present in its immediate external environment;

2) detecting the conditions present in its internal environment;

3) reacting to sets of external and internal states by either establishing some internal condition or altering the conditions present in its immediate external environment.

Each process's performance with respect to the environment is evaluated in terms of a positive real value called its strength; in the DGAS, strength is the "currency" of fitness. A process begins with a certain initial strength, and its value then changes with time as the process interacts with its environment. It is intended as an analog to energy in the natural system, in that it provides the basis for selection. Strength is used in the DGAS to regulate both the longevity and fecundity of processes. Each process expends a certain amount of its strength periodically in order to remain active. Since a process can only increase its strength by obtaining rewards from the environment, if a process's rate of reward remains below the "price" of remaining active for a sufficient length of time, its strength falls below the activity threshold and it is terminated and removed from the grid. If, on the other hand, its rate of reward exceeds the price of remaining active, its strength will grow until it reaches the somewhat higher reproduction threshold, and it spawns an offspring process; a portion of the offspring's initial strength is subtracted from the parent's, restricting the rate of individual reproduction. Thus individual longevity and fecundity are both dependent on relative adaptedness to the environment. The strength of a process is not a sufficient measure of its fitness, however, since strength is periodically reduced discontinuously, not in response to decreased performance, but as a result of reproducing. Thus a better measure of the individual fitness of a process a is the rate of increase of strength, F_E(a) = d/dt S(a), where S(a) denotes a's strength.

Functional interaction between processes is mediated by the environmental grid. In the DGAS, processes which are proximate to one another on the grid experience similar environmental conditions and can simultaneously affect their mutual environment. Since the fitness of a given process is a function of its performance within its immediate environment, the actions of processes in the same region as the given process can affect its fitness, either positively or negatively.

3.2 Composite Programs -- Composite Processes

According to our observations, in order for distributed self-organization to occur in a system, that system must allow for the formation of stable associations between processes. In the DGAS, the mechanism used emerges from consideration of multicellular organisms, whose behavior emerges from the interplay of large numbers of interacting, spatially distributed processes. During ontogeny, a genetic program, encoded in nuclear and cytoplasmic DNA, replicates itself, more or less exactly, trillions of times. Despite the similarity between the programs controlling them, however, the cells containing these replicas of the original program do not all perform alike. Instead, they differentiate into many types, each of which performs a specialized function. None of the individual units operates on the scale of the organism, yet together the operations which these differentiated cells perform implicitly coordinate to produce its behavior.
This process of differentiation is controlled, at least in large part, by the action and organization of the original program. So multicellular organisms can be viewed as composite processes, driven by multiple copies of a single composite program which contains the information required to control each process in the organism. The composite program contains a developmental and functional plan for an entire system. By replicating itself many times and placing the offspring in appropriate environments, a single composite program produces a set of independent, functionally interacting processes which operate in a common context toward a common goal. Furthermore, all of the information about the system is located in each component, simplifying communication of the group strategy. The complete expression of a composite program is a holonic process hierarchy [Koestler 76] whose structure is characteristic of living systems.

The composite program is the principal method of stable process association in the DGAS. Using the mechanisms described in section 3.3, the DGAS forms composites from specialized programs which control independent processes that have exhibited symbiotic behavior. Two programs, each of which produces a distinct specialized behavior, are consolidated into a single program capable of producing either behavior. Which of the encoded behaviors a process driven by a composite program will exhibit is determined by the internal and external environmental conditions in which it is placed. The thrust of the DGAS is to develop a single composite program which, when replicated, produces and controls a set of differentiated processes which cooperate toward a common goal.

3.3 Process Initiation

In the DGAS, successful processes are in some sense those which produce offspring processes. From Chapter 2, offspring processes serve two purposes in the context of distributed adaptive self-organization. First, they are the principal source of adaptive variation. Second, how offspring are produced, and in what context they begin their operation, has a great deal to do with whether or not association information will affect population development. According to the theory, if our model is to generate a set of coordinated processes, contexts for offspring must be chosen so that the probability and strength of association with complementary processes increases with the benefit conferred by the association. In the natural system, where association is primarily a spatial phenomenon and the mechanisms for increasing the frequency and strength of association usually operate at close range (e.g., cohesion, transformation, viral transduction), this is accomplished implicitly by spatially biasing offspring placement. Since interaction is also spatial in the DGAS, a technique modeled on the natural system is used. In the following sections we will discuss how the distributed genetic procedure of the DGAS acts as the source of adaptive variation as well as the primary agent of selection of association.

3.3.1 The distributed genetic algorithm

The algorithm used to control process initiation in the DGAS is a new variant of the genetic algorithm first described by Holland [76]. As used here, "genetic algorithm" refers to the class of algorithms emanating from work done in the late 1960's and early 1970's by some members of the Logic of Computers Group under Holland's direction at the University of Michigan.
These early explorations into incorporating models of the genetic mechanism as the core of a function optimization algorithm came together to form the basis for Holland's book "Adaptation in Natural and Artificial Systems", which has become the theoretical cornerstone of genetic algorithm research. The genetic algorithm is a high-level abstraction of natural evolution which ignores most levels of physical detail in order to illuminate more clearly the transfer of information inherent in the evolutionary mechanism. Holland's approach to describing evolution as a biological information system is an example of an information processing problem approached in the Marrian style. The theory underlying the genetic algorithm allows the problem domain to be described in abstract terms, identifies the necessary component computations which drive evolving systems, and explains the information processing functions that they perform. Properly interpreted, elements of this theory apply to any process, natural or artificial, which directs an evolving system.

Since 1975, a number of researchers have explored several important variations of the basic reproductive plan, applying them in a number of different contexts. Originally developed as function optimizers [Cavicchio 70, Hollstein 71, Frantz 72], genetic algorithms have since been successfully applied to a variety of learning problems: in game playing contexts [Smith 80, Fujiki 87], as the learning component of artificial animals [Holland 78, Wilson 85], and as the adaptive kernel of the Classifier System [Holland 86b, Burks 86], a rule-based adaptive scheme which has been successfully applied in several different problem domains. These variations are the result of changes in the organization of the algorithm, extensions or modifications of the genetic operators, concessions to problem-specific factors, or enhancements designed to overcome obstacles of implementation. Some empirical studies of the genetic algorithm have also been performed, the most notable of these being [DeJong 75], [Baker 87], and [Goldberg 87].

3.3.1.1 Michigan and Pitt

In recent years, much of the research in genetic algorithms has followed one of two approaches, which have come to be known in the literature as the Michigan and Pitt approaches [Grefenstette 87], after the locations of their first proponents, Holland and Smith, respectively. Briefly, the genetic algorithms used in the Pitt approach are those similar in organization to the learning component of Smith's LS-1 [80]. LS-1 maintains a population of multiple-rule production programs whose only interaction is reproductive; each program maintains its own message list and operates in isolation. At each iteration of the improvement cycle, each program in the population is tested independently, on several instances of the problem, and evaluated on the basis of its performance in the domain, as well as certain factors related to its internal composition. This evaluation produces a single number for each program which describes its fitness. The programs and their fitnesses are then fed to a genetic algorithm which produces new programs which are offspring of the previous ones by applying a hierarchically organized crossover operator to cross individuals at both the rule and sub-rule level. Repeated application of this cycle of testing and recombination results in a sequence of production systems which show increasing acumen in the problem domain. (A sketch of this cycle is given below.)
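A minimal sketch of the Pitt-style improvement cycle just described. The flat list-of-rules program representation, the single-point crossover, and the fitness-proportionate selection stand in for LS-1's richer hierarchical operators and are assumptions of the example:

    import random

    def crossover(p1, p2):
        """Single-point crossover at the rule level (LS-1 also crosses
        within rules; omitted here for brevity). Programs are lists of
        rules of length >= 2."""
        cut = random.randint(1, min(len(p1), len(p2)) - 1)
        return p1[:cut] + p2[cut:]

    def pitt_generation(population, evaluate, n_trials=5):
        """One Pitt-style cycle: score each whole program in isolation
        over several problem instances, then breed a new population with
        fitness-proportionate parent selection (fitnesses assumed > 0)."""
        fitness = [sum(evaluate(prog) for _ in range(n_trials)) / n_trials
                   for prog in population]
        return [crossover(*random.choices(population, weights=fitness, k=2))
                for _ in range(len(population))]

    # Usage: evaluate() is the domain-specific performance measure.
    # population = pitt_generation(population, evaluate)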
The genetic algorithms used in the Michigan approach are those similar in organization to the ones used by Holland and Reitman [78], Booker [82], and Goldberg [83]. Unlike the Pitt approach, the Michigan method maintains a population of individual rules which interact with one another both functionally and reproductively. Rules compete with one another for the right to post messages on a single global message list, and, through chaining, may increase or decrease the fitness of other rules. The interaction is overseen by some version of Holland's bucket brigade algorithm [Holland 85], which apportions credit among rules that establish the conditions necessary for other rules, which produce successful external responses, to fire. The bucket brigade provides the system with an "inner environment" [Pylyshyn 86], which regulates its internal activity. Within the context of this inner environment, the individual rules in the population interact and are evaluated in concert with one another. At certain intervals, a set of the strongest rules is selected from the population and fed into a genetic procedure which crosses them to produce new rules, which take the place of those rules in the population which have been least productive.

The difference between Pitt and Michigan lies in the character of the structures manipulated and the purpose behind their manipulation. Under the Pitt approach, the individual structures manipulated by the algorithm are of arbitrary complexity; under the Michigan approach, individual complexity is fixed. At Michigan, individual structures interact; at Pitt they remain isolated. The Michigan approach uses its genetic procedure solely to develop individual structures which are adapted to the system's internal environment; overall system integration depends on the effectiveness of the algorithm which creates this internal environment (e.g., the bucket brigade). The Pitt approach, on the other hand, uses its genetic algorithm to develop individual structures that are internally and externally adapted; the genetic procedure puts together units which are internally consistent and perform well as a unit in the external environment.

3.3.1.2 A synthesis

The approach I have taken in the DGAS is a hybrid of Michigan's and Pitt's. Like the Michigan systems, the individual structures in the DGAS interact with one another in a common environment, affecting one another's fitnesses. Like the Pitt systems, the individual structures manipulated are of arbitrary complexity, and the genetic algorithm is used to provide both individual adaptation and internal coordination.

The distributed genetic algorithm used in the DGAS is one of the few which has been classified as parallel [Sannier 87]. In its most general form it is a truly distributed, massively parallel implementation, where each individual is responsible for its own reproduction. The algorithm manipulates processes which can be modeled as composite data structures; associated with process a are the two items:

1) a.program, which represents a's controlling program, and

2) a.strength, which represents a's current strength,

as well as a number of problem-dependent fields. The system-wide control algorithm which initiates and oversees the system processes is given more formally below:

    main ()
    {
        cobegin ( for all processes a in A(0) )
        {
            a.strength = S0;
            active (a);
        }
        oversee ();
        wait (termination);
    }

All processes a in the initial set of processes A(0) are launched concurrently, each with an initial strength of S0.
Oversee() is a problem-dependent global procedure which periodically checks the performance of the system as a whole, updates certain global values and determines if the system is sufficiently evolved to terminate. Each active process behaves as follows:

    active (process)
    {
        while (not terminated)
        {
            execute (process.program, process.strength);
            check_thresholds ( Thr, Cr, Pd );
            if ( process.strength > Thr )
            {
                offspring_process = reproduce (process);
                report ();
                offspring_process.strength = S0;
                process.strength -= Cr;
                active ( offspring_process );
            }
            else if ( process.strength < Tht or (with probability Pd) )
                set_event (terminated);
        }
        report_termination (process);
    }

Until it is terminated, each process executes its program, interacting with its environment for a fixed period of time, performing the actions which deplete the process's strength and receiving rewards which increase it. After each execution phase, a read-only global data area is checked to determine the current threshold values: Thr, the process's reproductive threshold; Pd, the probability of death; and Cr, the cost of reproduction. If, after an execution phase, a process's strength is less than the termination threshold, Tht, the process terminates, reporting its termination to the overall system. If, instead, a process's strength is greater than the reproduction threshold, the process produces an offspring process and activates it in a chosen context with initial strength S0; the act of reproduction decreases the parent's strength by an amount Cr. The method of reproduction is outlined below:

    reproduce ( process )
    {
        offspring_process.program = process.program;
        begin ( with probability Pc )
        {
            mate_process = mate_select (process);
            offspring_process.program = crossover (offspring_process.program,
                                                   mate_process.program);
        }
        else begin ( with probability Pi )
            offspring_process.program = invert (offspring_process.program);
        else begin ( with probability Pm )
            offspring_process.program = mutate (offspring_process.program);
        set_context (offspring_process);
        return offspring_process;
    }

An offspring is driven by a program which is either an exact replica of the parent, a cross between the parent and a "mate" program chosen by the mate_select strategy, an inverted version of the parent, or a mutated version of the parent. The crossover, inversion and mutation operators used in the DGAS are taken from the LS-1 system [Smith 80]. In the DGAS, as in most genetic algorithms, mutation and inversion are "background operators", which have a relatively small probability of occurrence. Their primary purpose is to serve as a source of "raw variation". For a more complete description the interested reader should consult [Holland 76] or [Holland 86a]. The program set is assumed to be encoded such that Smith's hyperplane analysis holds [Smith 80].

The primary search operator in the DGAS is crossover. In order to be properly considered in the class of genetic algorithms [DeJong 87], an algorithm must satisfy the so-called "schema theorems" (principally Lemmas 6.2.1, 6.2.2 and 6.2.3 in Holland [76]). These theorems establish the implicitly parallel search action of the crossover operator and give the genetic algorithm its adaptive power. In order for these results to apply to a given reproductive plan which utilizes a crossover operator, the operator must satisfy certain conditions (Lemmas 6.2.1 and 6.2.2) and the plan must allocate trials to individuals based on their "usefulness", defined as the ratio of their performance to mean population performance.
Since the crossover operator we use has been verified as standard by Smith [80], it remains to be shown whether or not the reproductive plan specified above maintains the sampling criterion required by these Lemmas, i.e., that the expected number of offspring M(a) of a given process a be proportional to what Holland [75] terms its "usefulness", given by F_E(a) / F_E(A).

Theorem: Consider the set of viable programs in a population A, denoted by Av = { a in A | F_E(a) > 0 }. Given the reproductive plan specified above, if Cr = k * F_E(Av) and Pd = 1/k, k in (1, ∞), the expected number of offspring for a given program a in Av, M(a), is given by F_E(a) / F_E(Av).

Proof: The expected lifespan of a viable process a, whose mean rate of strength increase d/dt S(a) > 0, is simply the inverse of its probability of "death" in a given time step. Since in order to produce an offspring a process must increase its strength by Cr units, the expected number of offspring of a is given by:

    M(a) = d/dt S(a) * (1/Cr) * (1/Pd),

where d/dt S(a) is the rate of increase of strength of process a and 1/Pd is a's expected lifespan. Since F_E(a) = d/dt S(a), substituting the conditions gives:

    M(a) = F_E(a) * 1/( k * F_E(Av) ) * k = F_E(a) / F_E(Av).  □

So, by the theorem, the viable programs in the population reproduce "genetically" whenever the reproductive cost is expressed as a positive multiple of the average fitness of the viable population, the multiple associated with the probability of individual "death" in any given time step. The interpretation of the constant is natural. So long as k is greater than 1, Pd is a valid probability and Cr, the strength required to generate an offspring process, is a positive multiple of the mean increase in strength obtained by reproducing individuals during each execution phase. So the average individual can be expected to produce a single offspring during its lifetime; individuals with greater than average fitness can be expected to produce more than one offspring, while those with less than average fitness can expect less than one offspring.

Since the algorithm is genetic, it follows that when operating on properly encoded structures, it will provide a good source of adaptive variation, even under the influence of time-varying, stochastic environmental feedback [Holland 75]. But, in addition to its properties as an adaptive search algorithm, the plan proposed above, when embedded in the DGAS, has an added dimension. When appropriate mate selection and offspring placement strategies are used (mate_select and set_context), the resulting algorithm acts to implicitly identify symbiotic associations and increase their frequency and strength in proportion to their selective advantage.

3.3.2 Structural and Spatial Coherence

Standard implementations of the genetic algorithm pay almost no attention to the selection of crossover mates and offspring contexts. Mates are selected at random, sometimes biased by relative fitness. As for the selection of offspring context, under both the Michigan and Pitt approaches there is no choice; all individuals have identical contexts. Since the DGAS's environment is defined in terms of a space with locally varying conditions, however, more flexibility is possible in mate selection and offspring placement strategies. As experiments with Asgard will show, the selection of appropriate strategies determines the power of the system to identify, promote and select associations.
Motivated by the importance of proximity to interactions between processes in biological systems, I was led to introduce a spatial bias to the mate selection and offspring placement strategies. In place of random selection, mates in the DGAS are chosen so that the probability of an offspring emerging from a union of two parent processes is inversely related to the distance between them. Similarly, when an offspring is placed in the grid, the probability that that offspring is found a distance x from its parents decreases inversely with the magnitude of x. These spatial dependencies are important factors in promoting the formation of composite processes.

In the spatially distributed populations of the DGAS, two kinds of process groups emerge from the action of the genetic operators and regional variations in environmental conditions. These groups derive their coherence from different properties, and each serves a different function. The first of these -- what I have called the structurally coherent groups -- contain processes driven by programs which are the same or very nearly so. Typically, the members of a structurally coherent group will exhibit the same kinds of behaviors in response to similar environmental pressures. In sufficiently large, unevenly distributed populations, we can expect a number of different structurally coherent groups -- corresponding to a variety of different survival strategies -- to develop. According to a study by Grosso [85], if the individual processes in an unevenly distributed population are computationally limited, in the sense that they are incapable of responding, simultaneously, to all the demands and regularities present in their environment, a phenomenon he refers to as functional specialization will occur. Evolving populations of processes subject to different selection pressures begin to pursue different survival strategies and, after a time, are separable into distinct groups based on their patterns of behavior and the structure of the programs which encode for this behavior. In a different analysis of evolving systems, Katriss's model [81] predicts that the variation and adaptation of these strategies will continue until further development is infeasible; in the DGAS this means, among other things, that individual strategies will steadily evolve to meet their computational limits. When processes develop to this point, they become mutually exclusive, in the following sense. Given that processes are computationally limited, two different specialized strategies which push those limits are mutually exclusive in that a single process cannot adequately perform the two strategies simultaneously.

The second kind of group arises as a consequence of the spatial dependencies built into the model. In the DGAS, collections of processes which occupy particular regions of the environmental grid form what I have called spatially coherent groups, or subpopulations. Almost all interprocess interaction takes place within these groups; not only do the individuals within them share a common context, but when mate selection and offspring placement are spatially biased, reproductive activity is also concentrated within them. If the population density is sufficiently low, individuals will tend to cluster in spatially coherent groups, particularly when the environment is "patchy" -- certain regions of the environment are more hospitable than others. (A sketch of the distance-biased strategies appears below.)
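A minimal sketch of the spatially biased mate selection and offspring placement described above. The inverse-distance weighting and the Manhattan metric are assumptions of the example -- the model specifies only that probability falls off inversely with distance:

    import random

    def distance(a, b):
        """Manhattan distance on the grid (choice of metric is assumed)."""
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    def mate_select(parent_pos, candidates):
        """Choose a mate with probability inversely related to distance."""
        weights = [1.0 / (1 + distance(parent_pos, pos)) for pos in candidates]
        return random.choices(candidates, weights=weights, k=1)[0]

    def set_context(parent_pos, max_radius=8):
        """Place an offspring near its parent: the displacement magnitude x
        is drawn with probability proportional to 1/x."""
        radii = range(1, max_radius + 1)
        x = random.choices(radii, weights=[1.0 / d for d in radii])[0]
        dx = random.choice([-1, 1]) * random.randint(0, x)
        dy = random.choice([-1, 1]) * (x - abs(dx))   # total displacement = x
        return (parent_pos[0] + dx, parent_pos[1] + dy)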
The individuals within such "spatial niches" [Perry 84] will form semi-isolated subpopulations whose members interact both functionally and reproductively, and tend to place their offspring back in the niche with high probability. Since the members of spatially coherent groups interact functionally, if a niche is inhabited by individuals from different structurally coherent groups, the potential for symbioses arising between the different types exists. In the next two sections we will discuss how the crossover operator, by manipulating individuals within and between these groups, is able to implicitly increase the frequency and strength of interaction between groups which interact symbiotically.

3.3.3 A Reproductive Continuum

Crossovers of parents within and between the groups described above produce offspring processes of different characters, each of which performs a different function in the DGAS. The reproductive operations induced by the crossover operator take on the character of a continuum, producing different kinds of offspring depending on the degree of structural similarity between the crossing parents. This continuum is depicted in Figure 2, linking two naturally occurring operations, replication and recombination, with a third operator which I refer to as hybridization.

[Figure 2. A Reproductive Continuum -- replication, recombination and hybridization arranged along an axis of parental similarity.]

Admittedly, it may seem odd to assert that any kind of continuum exists between replication and recombination, since so many differences exist between them. Replication, the process by which a process produces an identical copy of itself, seems to have little to do with recombination, the process which produces a new individual by "mixing" two parent processes. In the natural system, they occur at different levels in the biological hierarchy, and the physical processes which implement them are quite distinct. Nevertheless, we argue that, from an information processing perspective, replication and recombination can be considered as points on a logical continuum of reproductive operations by which new processes can be produced from existing processes of varying degrees of similarity. While only certain regions of this continuum are realized in the natural genetic system, we have incorporated all of them in our model.

Replication lies at one extreme of this hypothetical continuum, occurring in the model either from the action of the replication operator or as a crossover between identical parents. The program of the resulting offspring process is an exact replica of the parent process's program. Nevertheless, the behavior of parent and offspring may be completely dissimilar, since in our model a process's actions and responses are dictated not only by its program but by its internal and external context. Replication is most interesting in our model when the program of the parent process is composite. When this is the case, the sections of the program which become active in the offspring may differ substantially from those which are active in the parent, due to differences in their respective contexts. For example, a process whose program contains instructions for controlling two mutually exclusive processes, each one sensitive to a distinct set of internal and external cues, can produce offspring which differentiate into one of two types.
While each process is driven by an exact replica of the parent program, different components of this composite program (prescribing different behaviors) are active in each of the offspring types, due to differences in the internal and external environments in which they are placed. As a result, the two types exhibit two different behaviors.

When crossovers occur between closely related processes, i.e., nonidentical members of the same structurally coherent group, the offspring process will be driven by a program which looks less like a replica of its parents and more like a mix of two similar, but distinct, parents. In place of replication, we get recombination. Those familiar with genetic algorithms will recognize recombination as the standard image of crossover. The programs of two processes, closely related variants of the same structural group, cross to produce a program which shares sections of both parents' code. A process driven by this program will exhibit behaviors characteristic of both parents, as well as new behaviors arising from epistatic interactions between the spliced sections. Recombination, then, is the DGAS's source of adaptive variation. It has been studied extensively by a number of genetic algorithm researchers [Holland 86a, Booker 82, Smith 80], and has been shown, under a general set of conditions, to generate individuals of increasing fitness through an intrinsically parallel search of potential programs.

As the parents become less and less similar and the behaviors they encode for increasingly specialized, crossovers in our model tend toward hybridizations. We define hybridization as a crossover between processes driven by radically different, mutually exclusive programs -- each of which codes for a separate process activated by a distinct set of internal and external cues -- producing a new, composite program which contains instructions for both processes. The formation of composite programs is the DGAS's means for strengthening the association between processes. In hybridization, two parent processes, each of which is driven by a different, specialized program, cross to form a composite program, capable of controlling an offspring process which exhibits either behavior, depending on its external and internal context.

3.3.4 Composite programs

Evolution in the DGAS occurs as an interaction of the operations described above. Each operation serves a different function. Recombinations between processes driven by programs from the same structural group produce variants which explore the space of programs near that strategy. Selection of the most fit results in a continual refinement of the strategy, both with respect to its external environment and its "process environment" -- the effects produced by those processes in its spatial group with which it interacts most frequently. Recombination provides the adaptive variation needed to develop programs adjusted to a continually changing environment and the behavior patterns of associated processes.

The other two operations -- hybridization and replication -- are directly related to the development of symbiotic association. To examine their effects, consider the development of a "spatial niche" which originally contains processes from two different structural groups, C1 and C2. Each group contains a number of different variants of the basic strategy, produced by mutations and inversions of the basic type, as well as the offspring of recombinant crosses.
Let V1 ⊂ C1 and V2 ⊂ C2 denote two variant subclasses whose members interact to mutual advantage. In functional form, given Assoc(C1, C2), with V1 = C1|C2 and V2 = C2|C1:

V1 ≠ ∅ and FE(V1) > FE(C1 - V1), and
V2 ≠ ∅ and FE(V2) > FE(C2 - V2).

Since offspring are allocated on the basis of relative fitness, the variant groups will tend to increase their numbers more quickly than the rest of the group:

d/dt ( |V1| / |C1| ) > 0 and d/dt ( |V2| / |C2| ) > 0

and

d/dt ( |V1| / (|C1| + |C2|) ) > 0 and d/dt ( |V2| / (|C1| + |C2|) ) > 0.

Since offspring placement is spatially biased, most of these offspring will be placed in the same spatial niche. The result is an increase in the variant types and the associations between them, so that

d/dt P(V1, V2) > 0 and d/dt F(V1, V2) > 0,

where P and F denote the proportion and frequency of the association. So an evidenced symbiotic association will tend to increase in proportion and frequency as a result of spatial offspring placement. As we discussed above, some of the offspring produced will be crosses between variants within each of the symbiotic types, and those which act to strengthen the relationship will tend to be retained. Other offspring, however, will be crosses between the two variants, qualifying as hybrid crosses in the continuum described in section 3.3.3. The frequency of hybrid crosses will tend to increase with the increasing proportions of V1 and V2, eventually forming a composite program V1V2, containing the programs for both processes. Given that the two variants follow mutually exclusive strategies, the full expression of the composite is only evidenced by a set of replicate processes, some of which have differentiated as type V1, others as type V2. Again, since offspring placement is spatially biased, the original composite, as well as its composite offspring, will most probably interact. The composite program strengthens the association between the variant strategies, since a single composite process can insure, by replication, the presence of the complementary process pairs. This in turn confers selective advantage, insuring the increase of the composite.

3.4 Summary

To summarize, then: the key to the development of successful composite processes in the DGAS lies in hybridizations that occur within structurally diverse subpopulations between symbiotically related parents. When a spatially coherent group -- composed of processes from more than one structural type, each of which exhibits a different specialized behavior -- persists over time, spatial biases in mate selection and offspring placement increase the frequency of hybrid offspring, generated by parents from different program types within the spatial niche. Those offspring which successfully hybridize mutually beneficial behaviors gain selective advantage. Thus hybridizations which occur between the processes in spatially coherent groups whose members are symbiotically related give rise to composite program types which make structurally explicit the implicit, spatial coherence that exists between the members of such groups. Replications of these composite types produce a set of independent computational units which act as an implicitly coordinated, distributed system.
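Before turning to Asgard, the growth conditions of section 3.3.4 can be illustrated numerically. The toy model below assumes arbitrary fitness values satisfying FE(V1) > FE(C1 - V1) and FE(V2) > FE(C2 - V2); it is not the DGAS, only a demonstration that fitness-proportional offspring allocation drives the variant proportions upward.

# Toy illustration of the growth inequalities: under fitness-proportional
# allocation, advantaged variants V1 and V2 rise in proportion within their
# groups. Fitness values are arbitrary assumptions, not from the dissertation.

pop = {"V1": 10.0, "C1_rest": 90.0, "V2": 10.0, "C2_rest": 90.0}
fitness = {"V1": 1.2, "C1_rest": 1.0, "V2": 1.2, "C2_rest": 1.0}

for t in range(50):
    total = sum(pop[k] * fitness[k] for k in pop)
    size = sum(pop.values())
    # Next generation: expected offspring counts, population size held fixed.
    pop = {k: size * pop[k] * fitness[k] / total for k in pop}

# The proportion |V1| / |C1| has grown from 0.1 toward 1: d/dt(|V1|/|C1|) > 0.
print(round(pop["V1"] / (pop["V1"] + pop["C1_rest"]), 3))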
CHAPTER IV

ASGARD

4.0 Introduction

To investigate whether or not the DGAS would behave as expected, I developed the software system Asgard, which implements the model in a specific problem context. Experiments with Asgard have illustrated some of the properties of the DGAS and verified that it can foster the adaptive self-organization of a set of distributed processes by forming composite programs from the spatially biased interaction of computationally limited adaptive units. The processes in Asgard simulate the behavior and adaptation of a community of simple animals which evolve under the action of a distributed genetic algorithm. As such, it falls in the class of systems which have come to be known as "artificial life simulations". The artificial life domain has been an important one in genetic algorithm research; genetic life simulations have also been developed by Holland and Reitman [78], Booker [82], and Wilson [85]. Unlike these systems, however, the main objective of Asgard is the study of the evolution of a distributed population of interacting processes, rather than the development of a single, functionally isolated individual. In the sections which follow we will describe Asgard's organization and review the results of experimentation.

4.1 System Organization

Asgard's environment is defined in terms of a finite, toroidal, two-dimensional grid, with a resolution of 160 x 60. The grid is divided into 4 equal quadrants, each of which can be displayed on a Tektronix 4105 graphics terminal. Associated with each location in the grid is a non-negative integer value representing the amount of "food" present at that location. "Food" is placed in the grid according to a pre-specified placement function, which depends on the time and location as well as the action of the artificial animals which "inhabit" the grid. These animals occupy locations in the grid, and move about with time in search of food. Associated with each animal is a strength. In order to "stay alive", the animals expend units of their strength. Whenever an animal moves to a location in the grid with a non-zero amount of "food" in it, it consumes all the food in the cell up to its capacity. Each unit of "food" an animal consumes increases its strength by one unit. So in order to survive, an animal's average "food" consumption must be at least one unit per time step. If an animal's rate of consumption is less than one unit per time step, it eventually runs out of energy and "dies"; if its average rate exceeds one unit per time step, its strength increases. When an animal's strength increases by Cr units, it produces an offspring and its strength decreases by Cr.

Each animal can be characterized by the movement patterns it exhibits in the presence of different patterns and densities of "food" in the grid around it. An animal's movement is controlled by its individual program, a binary-encoded circular list of labeled instructions of two different types: Move and Food?. The order of instruction execution is controlled by a program counter that specifies which instruction in the list is to be performed next. Move instructions take one argument which specifies one of eight directions of movement (N, NE, E, SE, etc.). When executed, Move(x) changes the animal's grid location by one unit in the specified direction and advances the program counter to the next instruction in the circular list. Food? instructions take two arguments drawn from the set of legal instruction labels. When a Food? instruction in an animal's program is executed, the eight grid locations which surround the process's current location are checked for the presence of food.
If "food" is present in any of those locations, the program counter skips to the next instruction in the circular list whose label matches the label specified by the first argument; if "food" is not present, the program counter skips to the next statement whose label matches the label specified by the second argument. An example is given in Figure 3, together with a potential movement pattern. Asgard is an almost direct implementation of the DGAS. The animals of Asgard are concurrently executing distributed processes, each one capable of : l)detecting, through the execution of Food? instructions, the conditions in its immediate external environment, in terms of the "food" concentrations which exist there; 2)detecting its internal state, specified by the position of its program counter; and 3)reacting to its external and internal states, by altering the position of its program counter and/or potentially altering its current external environment by executing a series of Move instructions. Each process’s immediate external environment is defined in terms of the concentrations of "food" in the region of the grid which surrounds it. Its internal state is defined by the position of its program counter. A process can modify its external environment in two ways; either by consuming "food" or by moving to a different location in the grid. By executing Food? instructions, the process can detect the state of its external environment and, based on this information, can change its internal state, potentially altering its future external response pattern. 1 Move S Food? 2 l 2 Move E Food? 2 3 3 Move W Food? 2 3 Figure 3. Example Asgard Animal Program 67 Each experiment with the Asgard system begins with a population of processes whose controlling- programs are formed as a fixed length set of instructions drawn according to a random distribution from the set of all legal instructions. The initial "food" conditions are set at each grid location and the processes are started concurrently at randomly chosen grid locations. A global process oversees the system and periodically adds to the amount of "food" contained in the grid according to a pre- specified placement function. The objective is to produce a population which responds to the regularities of this function. The algorithm produces a sequence of populations whose members interact with each other through common local environments and are allocated offspring on the basis of their individual rates of consumption. Over time, the successive populations will contain individuals adapted not only to the conditions of the artificial environment, but also to the behavior patterns of the other members of the population. Let A(t) denote the population at time t and let a be any member of A(t); A(O) denotes the initial, random population. The problem dependent field a.location, representing process a’s current grid location, is added to the fields a.program and a.strength. Asgard’s overall control process is the control algorithm for the DGAS : main () { cobegin ( for all a in A(O) ) I SO; random_grid_location (); a.strength a.location active (a); } oversee ( ): } (The algorithm used in most of the experiments with Asgard differs somewhat fiom the one given above. Experimental trials revealed an undesirable property of the 68 model which necessitated its alteration. For a complete description of this change see section 4.3) ~ The algorithm for each active process is exactly that given for the DGAS (see section 3.3. 1.2), with the value of Th1 = 1. 
The execution of a process program is given as:

execute ( process ) {
    process.strength = process.strength - 1;
    process.location = execute_next_instruction (process.program, process.location);
    meal = Min ( Food (process.location), process_capacity );
    process.strength = process.strength + meal;
}

The cost of instruction execution is extracted from a process's strength and the instruction is then executed, potentially modifying the process's location. Whatever food is present in the process's current location, up to its capacity, is then consumed and so added to process strength. Processes reproduce according to the procedure reproduce given in section 3.3.1.2; mate selection and offspring placement are both spatially biased, so that offspring most often arise from parents in the same region of the grid and are themselves placed in that region.

4.2 The Environment

The task set before the animals of Asgard is to identify and exploit the regularities present in the pre-specified pattern of "food" placement in the presence of other animals. There is a single resource, "food", in limited supply, which each animal requires to live. There is no possibility of substitution, no animal behavior which can "manufacture" it from simpler sources; every unit of "food" consumed by one animal is one more unit made unavailable for the others. It is hard to imagine a more competitive situation.

Even under these intensely competitive conditions, however, certain types of food placement patterns can still foster cooperation between animals. The placement pattern which acted as the learning environment for the experiments we will discuss here is one of these; the amount of "food" placed in the grid is dependent on the consumption behavior of certain spatial groups. An environmental measure, Ei(t), which I call efficiency, determines how much of the potential "food" supply is actually placed in the grid at any given time. Ei(t) varies between 0 and 1, reaching its maximum whenever the individuals of group i cooperate to maintain their aggregate consumption level at the pre-specified optimum. Efficiency is the only environmental variable under the direct influence of the animals in Asgard. As such, it provides us with a mechanism for determining the level of adaptedness -- and the level of association -- of animal groups. The most fit animals in Asgard are those capable of maintaining the highest efficiency ratings. If a particular structural group maintains higher efficiencies when in spatial groups containing individuals from a different structural group than when in spatial groups from which this type is absent, we can then say that the two structural groups are associated.

More specifically, the three basic regularities of the "seasonal placement plan" to which the population can adapt are:

1) "Food" can appear in only 1/8 of the space, concentrated in eight evenly spaced fertile regions. Each animal must periodically visit one of these fertile regions in order to survive.

2) The productive capacity of each fertile region oscillates periodically between a maximum and a minimum value. Each fertile area can support many more animals during its "summer" than its "winter", so competition for "food" within a particular area becomes increasingly fierce as winter approaches.

3) The amount of "food" actually produced during a given time step by a fertile region is linked to the amount of "food" consumed by the animals in the region during the previous time step.
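As a stand-in for the efficiency measure Ei(t) introduced above, the sketch below shows one plausible computation with the stated properties: it peaks at 1 when a group's aggregate consumption sits at the pre-specified optimum and falls off as the group over- or under-consumes. The linear falloff and the scaling of placement by efficiency are assumptions for illustration, not Asgard's actual definitions.

def efficiency(total_consumed, optimum):
    # One plausible E_i(t) in [0, 1]: maximal when the group's aggregate
    # consumption equals the optimum, decreasing linearly with over- or
    # under-consumption. (Assumed form, not the dissertation's.)
    if optimum <= 0:
        return 0.0
    return max(0.0, 1.0 - abs(total_consumed - optimum) / optimum)

def food_placed(M_i, E_i):
    # "Food" actually placed scales the region's seasonal potential M_i(t);
    # the multiplicative coupling is likewise an assumption.
    return M_i * E_i

print(efficiency(100, 100))  # 1.0 -- the group holds the optimum
print(efficiency(150, 100))  # 0.5 -- over-consumption is penalized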
At any given time, an optimal consumption level exists for a region, and the animals in that region can over- or under-consume with respect to this level, as a group, during a given time step, resulting in decreased "food" production in the next time step.

In each of the four quadrants of the grid (see Figure 4), two 10x15 areas, called "farms", are established as fertile. The "farms" are the only areas in the grid where "food" can be found; the rest of the grid is desert-like. Each time step, Ki(t) units of "food" are produced by the i'th farm and distributed randomly within it (i = 1,...,8). Ki(t) is itself a function of two other quantities: Mi(t), the seasonal productive potential of farm i; and Ei(t), the consumption efficiency within farm i, a value derived from the total consumption within farm i during the previous period, Ci(t-1). The exact form of these relationships is as follows:

Mi(t) = B sin( (2πt / Y) + π(-1)^i ) + M0,  B < M0

[...]

...classifiers of the form (Queue length of host > x) AND (Queue length of host < y) → Action are crucial. When the number of conditions in each classifier is fixed, as is usually the case, comparison predicates like the two in the example can only be performed by the coordinated action of several specially designed classifiers. For example, in her study of the computational properties of classifier systems, Forrest [85] devised an algorithm to perform comparisons between integers within the standard paradigm. The algorithm requires three classifiers for each bit in the binary representation of the numbers, and takes three iterations to complete. It is also exceedingly brittle, in the sense that the removal or corruption of a single classifier in the comparison chain most often produces unreliable results. Thus, even if classifiers implementing the algorithm were a part of each scheduler at the outset of the learning process, the likelihood of maintaining them as functioning units throughout the standard adaptation cycle would be small, making the formation of numerical comparison predicates much less likely than the formation of simple predicates. For a scheduler to evolve a comparison algorithm for itself, even if possible, would require much of the adaptive force of the genetic algorithm, thus reducing its effectiveness in the intended application domain.

One solution to this dilemma is to introduce a second level of semantics into the interpretation of classifier conditions to allow the expression of arbitrarily complex boolean operations as simple, single-condition predicates. This innovation allows for the creation of specially designed classifier languages capable of the high-level operations necessary for proper problem solution, while preserving the simple syntactic properties which make the classifier language robust under genetic manipulation. It represents yet another use of classifier "tagging". (The fixed-length conditions and messages of classifier systems are often considered as broken into separate fields, one of which is often referred to as the tag field [Booker 82]. We let Tag(x) denote a function which returns the tag field of message or condition x.)

In the standard classifier, an unnegated condition is satisfied if and only if there exists a message on the current message list which matches the pattern expressed by that condition. (The negation of an unsatisfied condition is satisfied.) In Midgard, the standard classifier language is augmented with a set of predicate functions F = {F1, ..., Fk}, which take the arguments of conditions into the values 1 and 0.
Associated with the function set F is a set of identifying tags T = {t1, ..., tk}. The boolean-valued functions in F operate on conditions whose tag fields correspond to one of the identifying tags, i.e., on all conditions cnd such that Tag(cnd) ∈ T. An unnegated condition is satisfied in our system if it is satisfied under the standard system, or if its tag field corresponds to an identifying tag ti whose associated function Fi returns true when applied to the argument of the condition. In functional notation:

Let cl be a classifier of the form (cnd1, ..., cndp) → m, where each cndi, i = 1, ..., p, is a condition ∈ {0,1,#}^n, and m is a message ∈ {0,1}^n. Let ml denote a message list in ML = P({0,1}^n), the set of possible message lists [Forrest 85]. We then define the classifier function cl: ML → ML as:

cl(ml) = m  if Satisfy(cndi, ml) ∀ i = 1, ..., p;
cl(ml) = ∅  if ∃ i such that ¬Satisfy(cndi, ml);

where Match(cnd, ml) is a boolean-valued function which is true iff ∃ a message m' ∈ ml that matches cnd according to the standard classifier system's rules of matching, and Satisfy(cnd, ml) is a boolean-valued function: {0,1,#}^n x ML → {0,1}, given by

Satisfy(cnd, ml) = Match(cnd, ml) ∨ [ Tag(cnd) = ti ∈ T ∧ Fi(cnd) ].

5.2.2 Midgard

The individual scheduler programs in Midgard are fixed-length sets of 67-bit classifiers, composed of three 16-bit condition fields, each with one additional negation bit, and one 16-bit action field. Each field is broken down into two subfields: a four-bit tag field and a 12-bit argument field. Two values of the tag field are reserved as identifying tags for two predefined predicate functions, Q>=X and Q<=X. These functions allow a single condition to test the length of the host queue against an integer value encoded in the condition's argument field. Another two tag field values are reserved for Midgard's two types of effector messages, Accept: {0,1}^n → {0,1} and Migrate: {0,1}^n → {0,1}^n. When a message whose tag field matches one of the two effector identifiers is posted on the message list, the corresponding function is executed, taking the posted message as its argument.

Migrate contains an interpretation algorithm which constructs a legal network address from any 12-bit argument field in {0,1}^12. When Migrate(m) is executed by a node with an incoming job, a network address is constructed from the argument field of m and migration requests are sent to all nodes which match this address. Currently in Midgard, legal addresses are either individual nodes, clusters of nodes, the nodes in a particular class, or all other nodes in the network. If Migrate(m) is executed by a node which has no incoming job, then it has no effect. Accept contains an interpretation algorithm which evaluates any argument field in {0,1}^12 and returns {0,1}. Whenever Accept(m) is executed, the argument field of m is evaluated. If the host node has just received a submitted job, the value of the argument field indicates whether the job is to be accepted (1) or a network-wide migration attempted (0). If the host node has just received a migration request, the value of the argument field determines whether the request will be accepted (1) or rejected (0).

An example may serve to illustrate. Consider two processors: A, a class 1 processor in cluster 2, and B, a class 1 processor in cluster 3. A is heavily loaded, with five jobs waiting in its job queue, while B is currently idle. A job of class 3 arrives at node A, waking A's scheduler.
A's message list contains three messages:

(1) Host is class 1, in cluster 2, machine A
(2) Incoming job is class 3
(3) Job was submitted

A's scheduler program contains a classifier which specifies that:

If (Job was submitted) (Host is class 1) (Q>=X, X=3) → MIGRATE (Cluster 3).

The first two conditions match messages (3) and (1) on the current message list and so are satisfied. The third condition calls the predefined predicate function Q>=X to compare the host's queue length to 3, returning true if the queue length is greater than or equal to 3. So the third condition is also satisfied and the classifier fires, posting a Migrate effector message. Migrate is executed, issuing several migration requests, one to each node in Cluster 3, requesting one of them to agree to process A's incoming class 3 job. On receiving its copy of this message, B's scheduler awakes. B's message list also initially contains 3 messages:

(1) Host is class 1, in cluster 3, machine B
(2) Incoming job is class 3
(3) Request from Host A, class 1, cluster 2

Suppose B's production memory contains the following two classifiers:

(a) If (Host is class 1) (Internal message Red) → ACCEPT (1)
(b) If (Request from class 1) (Incoming job is class 3) (Q<=X, X=2) → (Post internal message Red)

On the first iteration, classifier (b) fires, posting an internal message of a particular configuration, denoted here as Red. On the second iteration, classifier (a) fires, issuing a message to A indicating B's willingness to accept the job. On receipt, A sends the job to B to be processed, and, on completion, the results are returned to node A.

Special cases exist when a migration request goes unaccepted or else is accepted by more than one processor. If a migration request is not accepted by any notified processor, the request expires and the incoming job is placed at the end of the job queue of the processor to which it was originally submitted. When a migration request is accepted by more than one processor, the first acceptance received is honored and the job is immediately migrated to that processor.
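The matching semantics exercised in this example can be collected into a short sketch. This is an illustration of the ACL's Satisfy rule, not Midgard's code: the message encoding, the tag values assigned to Q>=X and Q<=X, and the helper names are all assumptions.

# Sketch of ACL condition matching: a condition is satisfied by an ordinary
# message match OR, when its tag names a predefined predicate, by evaluating
# that predicate on its argument field. (Encodings here are hypothetical.)

def match(cond, msgs):
    # Standard classifier matching: '#' is a wildcard position.
    return any(len(m) == len(cond) and
               all(c in ("#", b) for c, b in zip(cond, m)) for m in msgs)

def q_ge(arg_bits, queue_len):   # predicate bound to tag "Q>=X"
    return queue_len >= int(arg_bits, 2)

def q_le(arg_bits, queue_len):   # predicate bound to tag "Q<=X"
    return queue_len <= int(arg_bits, 2)

PREDICATES = {"0001": q_ge, "0010": q_le}   # tag assignments are assumed

def satisfy(cond, msgs, queue_len):
    # Satisfy(cnd, ml) = Match(cnd, ml) OR (Tag(cnd) in T AND F_t(arg)).
    tag, arg = cond[:4], cond[4:]
    if tag in PREDICATES and PREDICATES[tag](arg, queue_len):
        return True
    return match(cond, msgs)

# Node A's third condition (Q>=X with X=3) against its queue of five jobs:
cond = "0001" + format(3, "012b")            # 4-bit tag + 12-bit argument
print(satisfy(cond, msgs=[], queue_len=5))   # True -> the classifier can fire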
5.2.3 Relationship with DGAS

Midgard retains many of the features of the DGAS directly. The individual schedulers on the network are concurrently executing distributed processes, each one capable of:

1) detecting the conditions in its local external environment, in terms of the class, cluster, id and queue length of its host node, by checking rule conditions against messages posted by detectors and evaluating instances of the simple predicates Q<=X and Q>=X;

2) detecting its internal state, by checking rule conditions against messages posted on its message list by its own rules; and

3) reacting to its external and internal states, by posting messages to its message list and/or executing Migrate and Accept actions which potentially alter the routing of tasks in the network.

Each process's immediate external environment is defined in terms of the characteristics of the processor on which it is running. Its internal state is defined by the messages posted on its message list. A process can modify its external environment either by issuing migration requests to transfer jobs to other nodes, accepting jobs sent or migrated to it, or refusing requests for migration.

One of the principal differences between Midgard and the standard DGAS presented in Chapter 3 is that the number of processes running at any given time in Midgard is fixed. Each node must at all times have a scheduler, and each scheduler must run on its own node, so the number of active processes must be the same as the number of nodes in the network. A second, perhaps more obvious, difference is the lack of meaning which can be assigned to the spatial relationships between nodes. In Asgard, the environment is defined in terms of local conditions which vary with distance; thus spatial proximity carries with it a wealth of context information. The closer two individuals are on Asgard's grid, the more intimately connected are the details of their contexts. In Midgard, however, two nodes may be adjacent to one another physically, but the characteristics of the network (the classes of jobs submitted, their patterns of submission, the characteristics of the nodes, etc.) may be such that they hardly ever interact, at any level. Without an association between physical proximity and contextual interaction, spatially biased mating and offspring placement will not have the same effect as predicted in the standard DGAS. To overcome these differences, the controlling processes of Midgard are arranged somewhat differently from those described in the previous two chapters.

Let A(t) denote the set of schedulers active during generation t. A(0) then denotes the initial, random scheduler population. Midgard's overall control algorithm can be given as:

main () {
    oversee ();
    t = 0;
    while ( unterminated ) {
        cobegin ( for all a in A(t) )
            active (a, Days_end);
        wait (Days_end);
        A(t+1) = construct ( A(t) );
        t += 1;
    }
}

All the schedulers in the network are active for a fixed period, a "day", after which the global event Days_end is set. At the end of each "day", a new population of processes A(t+1) is constructed from A(t), and this new population is activated to control the network for the next "day". The process active in Midgard is as follows:

active (process, Days_end) {
    Until (Days_end) {
        wait (request_or_submission);
        set_detectors ();
        decision = execute (process.program);
        perform (decision);
    }
}

Until the end of the "day", a process waits for job submissions or migration requests to arrive. When a request or submission arrives, the detectors are set appropriately, initializing the process's message list. Execute returns a decision based on the outcome of a fixed number of iterations of the process's production program. This decision (send a migration request, accept or reject a request or submission) is then carried out by the procedure perform. During the "day", the processes in A(t) schedule many jobs, and statistics are collected on the local nodes regarding their performance. At the end of the day, a new population is constructed from the old via the procedure construct:

construct ( A(t) ) {
    for ( Ai(t), i = 1, ..., m ) {
        evaluate ( Ai(t) );
        for ( each a in Ai(t) )
            compute selection probability p(a) = FE(a) / Σ(a' ∈ Ai(t)) FE(a');
        for ( j = 1, ..., |Ai(t)| ) {
            select program a from Ai(t);
            aj(t+1) = reproduce (a);
        }
    }
}

The population A(t) is divided into m subpopulations, A1(t), ..., Am(t), where m denotes the number of processor classes represented in the network. The processes in Ai(t) are all those processes which are running on a node of class i at time t. For each class of processes Ai(t), class means for the performance variables are computed and FE(a), a weighted sum of a's performance values, is calculated for all a in Ai(t). Based on these values, a set of selection probabilities p(a) is then computed. The new population is generated by selecting processes from the previous population according to this selection distribution. Offspring processes of the selected processes are generated by the routine reproduce, and these offspring become the members of the new population.
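A runnable rendering of construct's selection step may be useful. The sketch below implements fitness-proportional selection within each machine class; reproduce is reduced to copying, and the population and FE values are hypothetical examples.

import random

def construct(subpops):
    # Build the next day's schedulers: within each processor class, select
    # in proportion to FE(a), then produce offspring. ('reproduce' is a
    # plain copy here; crossover and mutation are omitted from the sketch.)
    new = {}
    for cls, members in subpops.items():
        weights = [m["fe"] for m in members]           # p(a) = FE(a) / sum FE
        chosen = random.choices(members, weights=weights, k=len(members))
        new[cls] = [dict(m) for m in chosen]           # reproduce(a) stand-in
    return new

# Hypothetical two-class population; FE values are arbitrary examples.
pop = {1: [{"name": "a1", "fe": 0.9}, {"name": "a2", "fe": 0.1}],
       2: [{"name": "b1", "fe": 0.5}, {"name": "b2", "fe": 0.5}]}
print([m["name"] for m in construct(pop)[1]])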
Two factors are associated with similarity of context between scheduling processes in the Midgard network. First, interaction between two processes is clearly proportional to the number of jobs they exchange with one another. Between process a and process a', the strength of interaction is given by:

SI(a, a') = [ migrs(a, a') + migrs(a', a) ] / [ Σ(a'' ∈ A) migrs(a, a'') + Σ(a'' ∈ A) migrs(a'', a) ]

where migrs(a, b) denotes the number of jobs migrated from a to b. Thus, the greater the interaction, the more the processes share the same context. Second, two processes within the same class experience similar conditions in the sense that they operate under the same service time distribution. In Midgard, we exploit these two factors in mate selection and offspring placement to create groups analogous to the spatial and structural groups described in Chapter 3.

Offspring placement in Midgard is determined by the class membership of the selected parent; an offspring of a process running on a machine of class i will itself run on a node of class i. The machine classes thus act as subpopulations in Midgard; the offspring of the processes in class i on day t are the active processes on the class i nodes on day t+1. This tends to concentrate structurally coherent groups within classes.

Just as crossover served two functions in Asgard, in Midgard crossover is used to produce both recombinations and hybridizations. With fixed probability q, mates are selected within class boundaries, most often producing recombinant offspring. With probability 1-q, mates are chosen based on their level of interaction, producing hybrids which link strategies operating in the same context.

select_mate (process) {
    With probability q
        choose mate from within process's class;
    else
        choose mate from A(t) according to strength of interaction;
}

Since the SI's sum to 1 and SI(a, a') ≥ 0 for all a' in A(t), the SI's form a valid probability distribution.
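The interaction measure and the mixed mate-selection rule can likewise be sketched directly; the migration table and the value q = 0.7 below are hypothetical stand-ins for observed network traffic and Midgard's actual parameter.

import random

def SI(a, b, migrs, procs):
    # Strength of interaction: jobs exchanged between a and b in either
    # direction, normalized by all of a's migration traffic (as in the text).
    traffic = sum(migrs.get((a, x), 0) + migrs.get((x, a), 0) for x in procs)
    if traffic == 0:
        return 0.0
    return (migrs.get((a, b), 0) + migrs.get((b, a), 0)) / traffic

def select_mate(a, procs, classes, migrs, q=0.7):
    # With probability q choose within a's class (recombination); otherwise
    # choose in proportion to strength of interaction (hybridization).
    others = [x for x in procs if x != a]
    if random.random() < q:
        return random.choice([x for x in others if classes[x] == classes[a]])
    weights = [SI(a, x, migrs, procs) for x in others]
    return random.choices(others, weights=weights, k=1)[0]

# Toy data: three schedulers and a hypothetical migration count table.
procs = ["a", "b", "c"]
classes = {"a": 1, "b": 1, "c": 2}
migrs = {("a", "c"): 8, ("c", "a"): 4, ("a", "b"): 1}
print(select_mate("a", procs, classes, migrs))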
5.3 Evaluation

Experiments with Midgard were designed to explore the feasibility of using distributed genetic adaptive systems to devise strategies to coordinate the action of interconnected computing resources. Midgard was tested in two phases. In the first phase, the set of experiments described in section 5.3.1, a homogeneous network, containing processors from a single class, is sent jobs of a single type. Two different evaluation functions, FE1 and FE2, are tested to determine whether or not a set of implicitly coordinated distributed processors can be generated which cooperate to effectively balance the network's computing load. In the second phase, the set of experiments described in section 5.3.2, a heterogeneous network, composed of processors from two distinct classes, receives tasks of two different types. The goal is to generate a single composite program which, when replicated on all the nodes of the network, produces differentiated behaviors on each class of machine, which cooperate together to effectively balance the network load.

5.3.1 Phase 1

For all the simulations in phase 1 we use a fixed network model with one class of processor and one a priori job classification. Fifty identical nodes are arranged in 5 connected clusters of 10 nodes each. Task arrival to each of these nodes is modeled as a Poisson process with mean λ. The service time for each job is a sample from an exponential distribution with mean s. Each of the connecting LANs is assumed to have a "perfect" protocol capable of queuing messages and job transfers, processing them in the order received. Actual message and job transfer times (excluding any waiting time) are modeled as exponential processes with means s/10^7 and s/10^4 respectively. The total network load, ρ, is given by λs. For each evaluation function we performed a set of trials at each of three load values, ρ = .2, .5, and .8, recording the minimum mean network response time generated. As a control, we consider the mean response time at ρ = .2, .5, and .8 for the two limiting cases: the case where each node processes its own incoming jobs (the isolationist policy described above), and the theoretical lower limit, where each job is processed without waiting.

Each simulation run is a repeated cycle of testing, evaluation and recombination. Our objective is to determine a fitness function for Midgard's genetic algorithm that drives successive scheduler populations toward optimum mean network response time. Given some particular pattern of performance, we want to apportion credit to those schedulers whose decisions decreased network response time and blame to those whose decisions increased it. To test a particular population of schedulers, however, up to several thousand tasks are submitted to the nodes of the network to be processed, resulting in many thousands of individual scheduler decisions. The eventual result of each decision cannot be immediately determined; thus any "reward and punishment" scheme will necessarily involve some rather complicated record keeping, as well as an elaborate scheme of IPC to transfer this information back and forth between the nodes. Instead we chose to pursue an evaluation strategy similar to Smith's [80], identifying a set of four local performance measures for each scheduler whose weighted sum gives a good estimate of each scheduler's relative contribution to overall network performance. The four measures we use are:

1) Mean response time, Resp;
2) Mean queue length, Qlen;
3) Mean utilization, Util;
4) Mean migration activity, Mact.

We consider relative values for each measure and normalize them so that values near 1 indicate good performance with respect to the measure, 0 poor performance. Testing has shown that, when used independently, the four measures exert pressures in different directions, and that none is sufficient, on its own, to control recombination. A critic based solely on minimizing Resp or Qlen rewards nodes which effectively migrate tasks which might otherwise overload them, but it also punishes any node which accepts migrated jobs. Thus a node which migrates all of its jobs and accepts none is optimal with respect to either of these measures, but is clearly pathological with respect to overall network performance. On the other hand, a critic based solely on maximizing Util rewards nodes which accept tasks that exploit their under-utilized resources, but also punishes any node which sends such a task. Thus a node which accepts any and all jobs offered it is optimal with respect to this measure, even though it is also clearly pathological with respect to overall network performance. The last measure, Mact, is a composite measure that estimates the amount of migration activity at each node.
Since requesting and accepting jobs are the only actions which schedulers in the network can take, maximizing this measure insures that the nodes in the network do not slip into isolationism. This can be important, particularly at the outset, when migratory behavior tends to be random, and often less effective than simply doing nothing. Used on its own, however, it will almost invariably develop a strategy of random scheduling, in which all stations issue global migration requests for every job submitted to them and offer to accept every migration request offered them.

The first attempt at combining them, FE1, is the equally weighted sum of the four measures. We ran a number of experiments at each of the three load values. Figure 7 shows the performance of a typical run with respect to mean network response time (ρ = .8). Beginning from a randomly generated population of schedulers which maintains the mean network response at 11*s, the first few populations get steadily worse, driving network response time up to a maximum of 25*s. As the members of the population become increasingly coadapted, network response time declines. Eventually a population of schedulers is developed whose combined action reduces network response time to 1.7*s.

[Figure 7. On-Line Progress of a Typical Homogeneous Experiment - FE1. Mean network response time (in multiples of the mean service time, s) plotted against generations.]

The performance of the "average-best" over several trials at each load is summarized in Figure 8. Using the equally weighted fitness function, Midgard was consistently able to generate a scheduler population capable of improving the network load balance by at least 65% at each load.

[Figure 8. Summary of Homogeneous Network Results - FE1. Mean network response time (in multiples of the mean service time, s) plotted against network load ρ, for the optimum, isolationist, and achieved policies.]

The policies discovered, however, were somewhat less than satisfactory from an overhead perspective. In each case, the system consistently evolved strategies which generated more migrations than necessary, and in low load situations even approached random scheduling. Analysis of the performance of the system with respect to each of the individual performance measures suggested that this undesirable behavior was due to the influence of Mact, which escalated the migration activity of the system, sometimes even at the expense of the other measures. To remedy this, we considered a second fitness function, FE2, whose weights vary with time. At the outset, each of the four measures is equally weighted, but as time goes on, Mact's weight approaches 0, while Qlen's and Util's weights each approach .3 and Resp's weight approaches .4. The results of several trials are summarized in Figure 9. For each of the three values of ρ, scheduling policies were consistently discovered capable of near optimal performance. In addition, the strategies developed require less than .5% of the total network channel capacity.
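The two fitness functions differ only in their weight schedules, which a short sketch makes explicit. The dissertation states only the initial and final weights; the linear interpolation between them is an assumed schedule shape.

def fe1(resp, qlen, util, mact):
    # FE1: the equally weighted sum of the four normalized measures.
    return 0.25 * (resp + qlen + util + mact)

def fe2(resp, qlen, util, mact, t, t_max):
    # FE2: weights vary with time. They start equal; Mact's weight decays
    # toward 0 while Resp approaches .4 and Qlen, Util approach .3 each.
    # (Linear interpolation over the run is an assumed schedule shape.)
    f = min(t / t_max, 1.0)                 # progress through the run
    w_resp = 0.25 + f * (0.40 - 0.25)
    w_qlen = 0.25 + f * (0.30 - 0.25)
    w_util = 0.25 + f * (0.30 - 0.25)
    w_mact = 0.25 * (1.0 - f)
    return w_resp * resp + w_qlen * qlen + w_util * util + w_mact * mact

# Early on, migration activity still earns credit; late in the run it does not.
print(fe2(0.8, 0.7, 0.6, 0.9, t=0,   t_max=100))
print(fe2(0.8, 0.7, 0.6, 0.9, t=100, t_max=100))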
5.3.2 Phase 2

In each experiment in phase 2 we use a fixed network model with two classes of processor and three a priori job classifications. Fifty-four nodes are arranged in 6 connected clusters of 9 nodes each. Each cluster is a homogeneous collection of processors from a single class, and there are three such clusters for each of the two processor classes. Task arrival to each of these nodes is modeled as a two-dimensional Poisson process with mean (λ1, λ2). The service time for each job is a sample from an exponential distribution with mean μij, where i denotes the class of the executing processor and j the job type (see Table 2). Each of the connecting LANs is assumed to have a "perfect" protocol capable of queuing messages and job transfers, processing them in the order received, as in phase 1. The total network load, ρ, is given by [.5(s) + .5(10s)]λ = 5.5sλ. Actual message and job transfer times (excluding any waiting time) are modeled as exponential processes with means s/10^7 and s/10^4 respectively.

               Processor class
               1         2
  Job type 1   s         10s
  Job type 2   10s       s

Table 2. Mean Service Times μij for a Job of Type j on a Processor of Class i.

I performed a set of trials at each of three load values, ρ = .2, .5, and .8, recording the minimum mean network response time generated. As a control, I again consider the mean response time at ρ = .2, .5, and .8 for the two limiting cases: the isolationist policy and the theoretical lower limit, where each job is processed without waiting by the machine which executes it most quickly.

Using function FE2, I ran a number of experiments at each of the three load values. Figure 10 shows the performance of a typical run with respect to mean network response time (ρ = .8). Beginning from a randomly generated population of schedulers which maintains the mean network response time at 14s, network response time increases steadily, peaking at 29*s after 10 generations. However, as the individual schedulers become ever more adapted to one another, network response time declines. After 100 generations, a population of schedulers is developed whose combined action has reduced network response time to 1.8*s. In every trial, after 120 generations, the members of both subpopulations converged to a single program type, which contained specialized rules that behave differently depending upon the class of machine on which they are run.

[Figure 10. On-Line Progress of a Typical Heterogeneous Network Experiment. Mean network response time (in multiples of the minimum mean service time, s) plotted against generations.]

The performance of the "average-best" over several trials at each load is summarized in Figure 11. Over several trials at each load value, Midgard generated a scheduler which, when replicated over the entire network, was capable of balancing the load to, on average, within 5% of the theoretical optimum.

[Figure 11. Summary of Heterogeneous Network Results. Mean network response time (in multiples of the minimum mean service time, s) plotted against network load ρ, for the optimum, isolationist, and achieved policies.]

5.4 Summary

In this chapter we have presented a model for applying a distributed genetic algorithm to a problem in the field of computer networks. The results presented show that the genetic algorithm can be successful in coadapting the behavior of distributed computing resources to produce desirable global behaviors. Tests on homogeneous networks show that the individual nodes of Midgard are able to develop a global scheduling policy for a class of networks which achieves a load balance at or near the theoretical optimum, using only local information.
Heterogeneous experiments yield similar results, developing composite programs containing rules which produce different behaviors depending upon the characteristics of the machine on which the program is running. When such a composite is replicated throughout the system, the differentiated processes operating on the different nodes cooperate implicitly, routing jobs to the machines which can perform them most efficiently.

CHAPTER VI

SUMMARY AND CONCLUSIONS

This dissertation has focused on the development of a computational theory of distributed adaptive self-organization derived from consideration of evolving biological systems. We began by drawing an analogy between the individuals in an evolving community and the independent, but interconnected, processes in a distributed computing system engaged in a learning task. The idea we have explored here is that coherence between the components of living systems emerges as a result of coadaptive development, and that elements of this process can be abstracted to induce goal-directed integration in artificial computing systems.

We first examined some of the biological literature, in order to identify part of the natural mechanism responsible for integrating originally independent individuals into coherent wholes. In particular, we focused on the role symbiosis has played in the emergence of evolutionary novelty. We examined three separate natural transitions involving increases in complexity which were due, at least in part, to the identification and selection of symbiotic association. While the physical mechanisms of interaction and fusion differed in each case, the same basic pattern of development was evinced.

Equating the independent organisms with computational processes, we developed an abstract description of the three transitions in an attempt to capture the essence of their common coadaptive process. We considered each system as composed of a set of interacting processes of finite duration, which are periodically capable of initiating "offspring" processes. These offspring are replicas of the parent processes, but are subject to a source of adaptive variation. We described a coadaptation process wherein, when advantageous associations between processes arise, those associations increase in frequency and strength so long as they continue to confer advantage. As processes become more intimately associated, selection favors those which form the most integrated complexes. These complexes provide a means for insuring the continued persistence of the association, and result in an increase in overall system coherence.

A formal statement of this process as a computational theory of distributed adaptive self-organization formed the basis for the rest of the work. It asserted conditions under which a certain class of distributed systems could increase their level of integration and organization. These conditions require that the system have some mechanism for detecting the existence of beneficial interactions between processes and increasing their strength and frequency of occurrence.

The distributed genetic adaptive system (DGAS) was developed as an algorithm to implement the newly developed theory. Under this system, a group of concurrent processes occupy locations in a tessellated space which defines each process's immediate environment. Associated with each process is a measure of its performance, continually updated by a task-specific evaluation function.
The process set is manipulated by a new variant of the genetic algorithm, which satisfies the requirements of our computational theory and provides the system with the adaptive variation and selection of association necessary to increase the level of system coordination and improve its performance. This new algorithm is a truly distributed version of the original genetic procedure and represents a synthesis of the two major directions in current genetic algorithm research. Like those in the Pitt approach, our algorithm manipulates units of arbitrary complexity, relying on the genetic operators to develop internal coherence as well as improve external performance. Like the Michigan approach, the individual units manipulated by our procedure interact to affect one another's performance. The algorithm acts as an efficient source of adaptive variation by biasing the allocation of "offspring" toward those processes which have been observed as the best performers. By statistically applying an appropriate crossover operator to produce these offspring, trials are biased not only toward those individuals observed most fit, but also implicitly toward those sets of individuals (schemata) observed as most fit.

The introduction of spatially biased mate selection and offspring placement strategies introduces a new dimension to the genetic algorithm, allowing it to encourage the development and eventual integration of symbiotic strategies. Advantageous interactions between neighboring processes confer advantage, resulting in increased numbers of "offspring" for these processes. Spatially biased offspring placement insures that these offspring will themselves interact with increasing likelihood. Crossovers between these individuals in the DGAS can result in the emergence of composite programs, capable of producing one of several behaviors. Composite processes produced by the hybridization of processes which are symbiotically related make structurally explicit the implicit spatial coherence that exists between them. Replications of these composite types result in a set of independent processes which act as an implicitly coordinated, distributed system.

To test the effectiveness of the DGAS, the software system Asgard was developed. Asgard is an implementation of the distributed genetic algorithm applied to an "artificial life" domain. The tessellated space provides an environment for the process set, a population of artificial "animals" which move about the grid consuming the "food" which is distributed throughout. Each animal's movement pattern is controlled by its own program, a series of instructions drawn from a task-specific language. Each "animal" is modeled as a process which executes its program based on input from its surrounding environment and which reproduces according to the rules of the distributed genetic algorithm. The learning task specified by the "food" placement strategy imposes constraints on the actions of the processes in the domain, both individually and in aggregate. Tests with the Asgard system were described which showed the capability of the DGAS to develop a system of interdependent, implicitly coordinated processes organized toward a common goal. An example of one of these trials was discussed at length, describing the gradual transformation of a population of "animals" from random, uncoordinated, inefficient individuals to an organized group of groups, whose combined efforts resulted in dramatic increases in task performance.
The final result was a single, composite program whose replicated copies spontaneously organized into a set of coordinated groups which function together as a coherent whole. The Asgard trials demonstrated that the distributed processes which evolve under a DGAS can exhibit coordinated behavior without explicit communication. Despite the encouraging results, several characteristics of the system were described which hampered further experimentation. Asgard's environment was extremely sensitive to slight changes in its internal parameters, making population sizes difficult to reliably control. Extensions of Asgard to more useful domains proved awkward, due to the difficulty of meaningfully translating realistic problems into Asgard's metric grid. This suggested that in order to apply the DGAS to more practical problems, some substitute source for the contextual clues provided by the environmental space would have to be developed.

These considerations prompted the development of a second software system, Midgard, which utilized a modified version of the DGAS to address a more realistic problem from the field of distributed computing. The processes of Midgard corresponded to a set of independent rule-based load schedulers operating on the nodes of an inter-LAN network. These schedulers interacted with one another to allocate incoming tasks to specific network nodes. The goal was to develop a coordinated scheduler set which cooperated to reduce the mean network response time. The approach taken in Midgard differed from previous approaches in two respects. First, unlike previous attempts, the schedulers in Midgard are rule-based rather than parameter-based. Second, in order to demonstrate the ability of our algorithm to develop implicitly coordinated units, capable of cooperative behavior with a minimum of communication, the scheduling processes in Midgard do not exchange statistics about the global state of the network; all scheduling decisions are based on information local to the node.

In order to apply a genetic procedure to the schedulers in this context, we found it necessary to extend the standard genetic rule language, the Classifier Language (CL), developing what we called the Augmented Classifier Language (ACL). The ACL provides for the incorporation of pre-defined, task-specific boolean-valued functions as simple, binary-encoded predicates. The ACL preserves the desirable features of the CL -- robustness under genetic manipulation and computational completeness -- but allows certain operations which cannot be practically expressed in the standard language to be incorporated easily.

Tests with the Midgard system provided more verification of our theory of distributed adaptive self-organization. Experiments with a homogeneous network showed that Midgard was able to develop a global scheduling policy for a class of networks which achieved a load balance at or near the theoretical optimum, using only local information. Heterogeneous experiments yielded similar results, developing composite programs which contained rules for several different behaviors, which are interpreted differently depending upon the characteristics of the machine on which the program is run. When such a composite is replicated throughout the system, the differentiated processes operating on the different nodes cooperate implicitly, routing jobs to the machines which can perform them most efficiently.
This study has established the feasibility of developing a coordinated set of concurrent processes organized toward a common objective through the process of selection of association. Formulated as an implementation-independent model, our computational theory can serve as a guide to the construction of algorithms other than the DGAS intended to drive the learning behavior of distributed computing systems. Our particular implementation, the DGAS, was successful in generating an interesting set of behaviors in the "artificial life" domain, developing complex, well-integrated behaviors from the interactions of simple, computationally limited, non-communicating processes. Our application in the load balancing domain, while not completely realistic, demonstrated the practical potential of the technique, and provided an alternative to a spatially based model.

Finally, while useful, the theory we have developed here is by no means complete. A number of areas inviting further research can be identified. Since the development of our algorithm, a number of improvements to the genetic procedure have been developed, which regulate the application of the various operators and introduce new operators at various stages of population development to increase the variability of the population and control its convergence properties [Sirag 87]. Further experimentation with our Augmented Classifier Language might include incorporating a rule-level credit assignment scheme like the bucket brigade to increase the internal coherence of the automatically generated rule-based processes. Increases in attainable complexity might also be obtained through the automatic assignment of function tags to developed processes of a certain fitness, allowing them to be referenced as simple predicates.

Perhaps the most fertile areas for further research are to be found in extending the biological analogy begun here. At present, the complexity of the interactions between processes is limited due to a lack of process-controlled regulatory mechanisms to oversee the differentiation process. Further biological modeling may provide the abstractions necessary to devise a system of process construction or interprocess communication which will facilitate the development of more tightly correlated multi-process systems.

BIBLIOGRAPHY

Anderson, N. (1970). Evolutionary significance of virus infection. Nature, 227: 1346-1347.

Baker, J.E. (1985). Adaptive selection methods for genetic algorithms. In: Proceedings of an International Conference on Genetic Algorithms and Their Applications, July, Pittsburgh, PA.

Blackbourn, D.J., Taylor, F.J.R., Blackbourn, J. (1975). Foreign organelle retention by ciliates. Journal of Protozoology, 20(2): 286-288.

Booker, L. (1982). Intelligent Behavior as an Adaptation to the Task Environment. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Burks, A.W., ed. (1970). Essays on Cellular Automata. Urbana, IL: University of Illinois Press.

Burks, A.W. (1986). A radically non-Von Neumann architecture for learning and discovery. In: CONPAR 86: Conference on Algorithms and Hardware for Parallel Processing, Wolfgang Handler, ed. September 17-19, Springer-Verlag, Berlin.

Burnet, F.M. (1969). Cellular Immunology. Melbourne: Melbourne University Press.

Cavicchio, D.J. (1970). Adaptive Search Using Simulated Evolution. Doctoral dissertation, University of Michigan, Ann Arbor, MI.
Cleveland, L.R., Grimstone, A.V. (1964). The fine structure of the flagellate Mixotricha paradoxa and its associated microorganisms. Proceedings of the Royal Society of London, B 157: 668-683.

Cowan, J.D., Sharp, D.H. (1988). Neural nets and artificial intelligence. Daedalus, 117(1): 85-121.

Dawkins, R. (1976). The Selfish Gene. New York, NY: Oxford University Press.

Dawkins, R. (1982). The Extended Phenotype. New York, NY: Oxford University Press.

Day, W. (1984). Genesis on Planet Earth. New Haven, CN: Yale University Press.

Dayhoff, M. (1972). Atlas of Protein Sequence and Structure: Vol. 5. Washington, DC: National Biomedical Research Foundation.

DeJong, K.A. (1975). Analysis of the Behavior of a Class of Genetic Adaptive Systems. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

DeJong, K.A. (1987). On using genetic algorithms to search program spaces. In: Genetic Algorithms and Their Applications: Proceedings of the 2nd International Conference on Genetic Algorithms, Grefenstette, ed., July, M.I.T., Cambridge, MA.

Farmer, J.D., Packard, N.H., Perelson, A.S. (1985). The immune system and artificial intelligence. In: Proceedings of an International Conference on Genetic Algorithms and Their Applications, July, Pittsburgh, PA.

Fogel, L., Owens, A., Walsh, M. (1966). Artificial Intelligence Through Simulated Evolution. New York, NY: Wiley.

Forrest, S. (1985). A Study of Parallelism in the Classifier System and its Application to Classification in KL-ONE Semantic Networks. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Frantz, D.R. (1972). Non-Linearities in Genetic Adaptive Search. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Friedberg, R.M. (1958). A learning machine, Parts I and II. IBM Journal of Research and Development, 2(1): 2-13.

Fujiki, C., Dickinson, J. (1987). Using the genetic algorithm to generate LISP source code to solve the Prisoner's Dilemma. In: Genetic Algorithms and Their Applications: Proceedings of the 2nd International Conference on Genetic Algorithms, Grefenstette, ed., July, M.I.T., Cambridge, MA.

Fuller, S., Siewiorek, D. (1973). Some observations on semiconductor technology and the architecture of large digital modules. Computer, 6(10): 14-21.

Goldberg, D.E. (1983). Computer-Aided Gas Pipeline Operation using Genetic Algorithms and Rule Learning. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Goldberg, D.E. (1987). Nonstationary function optimization using genetic algorithms with dominance and diploidy. In: Genetic Algorithms and Their Applications: Proceedings of the 2nd International Conference on Genetic Algorithms, Grefenstette, ed., July, M.I.T., Cambridge, MA.

Goulian, M., Kornberg, A., Sinsheimer, R. (1968). Enzymatic synthesis of DNA, XXIV: synthesis of infectious phage phiX174 DNA. Proceedings of the National Academy of Sciences, USA, 58: 2321-2328.

Grefenstette, J.J. (1987). Multilevel credit assignment in a genetic learning system. In: Genetic Algorithms and Their Applications: Proceedings of the 2nd International Conference on Genetic Algorithms, Grefenstette, ed., July, M.I.T., Cambridge, MA.

Grosso, P.B. (1985). Computer Simulation of Genetic Adaptation: Parallel Subcomponent Interaction in a Multilocus Model. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Holland, J.H. (1975). Adaptation in Natural and Artificial Systems. Ann Arbor, MI: University of Michigan Press.
Holland, J.H., Reitman, J.S. (1978). Cognitive systems based on adaptive algorithms. In: Pattern-Directed Inference Systems, Waterman, D.A., Hayes-Roth, F., eds. New York, NY: Academic Press.

Holland, J.H. (1985). Properties of the bucket brigade. In: Proceedings of an International Conference on Genetic Algorithms and Their Applications, July, Pittsburgh, PA.

Holland, J.H., Holyoak, K., Nisbett, R., Thagard, P. (1986a). Induction: Processes of Inference, Learning, and Discovery. Cambridge, MA: MIT Press.

Holland, J.H. (1986b). Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems. In: Machine Learning: An Artificial Intelligence Approach, Volume II, Michalski, R.S., Carbonell, J.G., Mitchell, T.M., eds. Los Altos, CA: Morgan Kaufmann Publishers, Inc.

Hollstein, R.B. (1971). Artificial Genetic Adaptation in Computer Control Systems. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Horowitz, N.H. (1945). On the evolution of biochemical synthesis. Proceedings of the National Academy of Sciences, USA, 31: 153-157.

Jeon, K.W., Danielli, J.F. (1971). Micrurgical studies with large free-living amoebas. International Review of Cytology, 30: 49-89.

Jeon, K.W., Jeon, M.S. (1976). Endosymbiosis in amoebae: recently established endosymbionts have become required cytoplasmic components. Journal of Cellular Physiology, 89: 337-344.

Kaniss, R.C. (1981). Evolutionary Change in Hierarchical Systems: A General Theory. Doctoral dissertation, Cornell University, Ithaca, NY.

Khorana, H.G., Agarwal, K.L. (1970). Total synthesis of the gene for an alanine transfer ribonucleic acid from yeast. Nature, 227: 27-34.

Koestler, A. (1976). The Ghost in the Machine. New York, NY: Random House.

Lippmann, R. (1987). An introduction to computing with neural networks. IEEE ASSP Magazine, April: 4-22.

Livny, M., Melman, M. (1982). Load balancing in homogeneous broadcast distributed systems. Performance Evaluation Review, 11(1): 47-56.

Locke, M., ed. (1968). The Emergence of Order in Developing Systems. New York, NY: Academic Press.

Lovelock, J.E. (1988). No longer willful, Gaia becomes respectable. Science, 240: 393-395.

Margulis, L. (1970). Origin of Eukaryotic Cells. New Haven, CT: Yale University Press.

Margulis, L. (1976). A review: genetic and evolutionary consequences of symbiosis. Experimental Parasitology, 39: 277-349.

Margulis, L. (1981). Symbiosis in Cell Evolution. San Francisco, CA: W.H. Freeman and Company.

Margulis, L. (1982). Early Life. Boston, MA: Science Books International.

Margulis, L., Chase, D., Guerrero, R. (1986a). Microbial communities. BioScience, 36(3).

Margulis, L., Sagan, D. (1986b). Origins of Sex. New Haven, CT: Yale University Press.

Margulis, L., Sagan, D. (1986c). Microcosmos. New York, NY: Summit Books.

Marr, D. (1977). Artificial intelligence - a personal view. Artificial Intelligence, 9: 37-48.

Mayr, E. (1970). Population, Species and Evolution. Cambridge, MA: Belknap Press.

Miller, S.L. (1953). A production of amino acids under possible primitive earth conditions. Science, 117: 528-529.

Morowitz, H.J. (1967). Biological self-replicating systems. Progress in Theoretical Biology, 1: 35-38.

Oro, J., Sherwood, E., Eichberg, J., Epps, D. (1978). Formation of phospholipids under primitive earth conditions and the role of membranes in prebiological evolution. In: Light Transducing Membranes, D. Deamer, ed. New York, NY: Academic Press.
Perry, Z.A. (1984). Experimental Study of Speciation in Ecological Niche Theory Using Genetic Algorithms. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Ponnamperuma, C., Gabel, N. (1968). Current status of chemical studies on the origin of life. Space Life Sciences, 1: 64-96.

Prigogine, I., Nicolis, G., Babloyantz, A. (1972). Thermodynamics of evolution. Physics Today, 25(11): 23-28; 25(12): 38-44.

Prigogine, I. (1980). From Being to Becoming. San Francisco, CA: W.H. Freeman.

Pringle, J.W.S. (1951). On the parallel between learning and evolution. Behaviour, 3: 90-110.

Pylyshyn, Z.W. (1986). Computation and Cognition. Cambridge, MA: MIT Press.

Ramamritham, K., Stankovic, J.A. (1984). Dynamic task scheduling in hard real-time distributed systems. IEEE Software, 1(3): 65-75.

Reanney, D.C. (1974). Viruses and evolution. International Review of Cytology, 37: 21-52.

Richmond, M.H. (1979). 'Cells' and 'organisms' as a habitat for DNA. Proceedings of the Royal Society of London, B204: 235-250.

Riedl, R. (1978). Order in Living Organisms. New York, NY: John Wiley and Sons.

Sagan, C., Khare, B. (1971). Long wavelength ultraviolet photoproduction of amino acids on primitive earth. Science, 173: 417-420.

Sannier, A.V., Goodman, E.D. (1987). Genetic learning procedures in distributed environments. In: Genetic Algorithms and Their Applications: Proceedings of the 2nd International Conference on Genetic Algorithms, Grefenstette, ed., July, M.I.T., Cambridge, MA.

Sirag, D.J., Weisser, P.T. (1987). Toward a unified thermodynamic genetic operator. In: Genetic Algorithms and Their Applications: Proceedings of the 2nd International Conference on Genetic Algorithms, Grefenstette, ed., July, M.I.T., Cambridge, MA.

Smith, D.C. (1979). From extracellular to intracellular: the establishment of a symbiosis. Proceedings of the Royal Society of London, B204: 115-130.

Smith, S.F. (1980). A Learning System Based on Genetic Adaptive Algorithms. Doctoral dissertation, University of Pittsburgh, Pittsburgh, PA.

Spiegelman, S. (1967). An in vitro analysis of a replicating molecule, 80th Jubilee Lecture. American Scientist, 55: 3-68.

Steele, E.J. (1981). Somatic Selection and Adaptive Evolution. Chicago, IL: University of Chicago Press.

Stone, H. (1977). Multiprocessor scheduling with the aid of network flow algorithms. IEEE Transactions on Software Engineering, SE-3(1): 85-93.

Taylor, F.J.R. (1974). Implications and extensions of the serial endosymbiosis theory of the origin of eukaryotes. Taxon, 23: 229-258.

Taylor, F.J.R. (1979). Symbionticism revisited: a discussion of the evolutionary impact of intracellular symbioses. Proceedings of the Royal Society of London, B204: 267-286.

Taylor, J.T. (1976). Captain Jim's drunken dream. Country Roads Music, Inc. (BMI).

Temin, H.M. (1976). The DNA provirus hypothesis. Science, 192: 1075-1080.

Thompson, J.N. (1982). Interaction and Coevolution. New York, NY: John Wiley and Sons.

Towsley, D., Mirchandaney, R. (1987). The effects of communication delays on the performance of load balancing policies in distributed systems. In: Proceedings of the 2nd International Workshop on Applied Math. and Perf./Rel. Models in Computer Communications, May.

Usher, D.A. (1977). Early chemical evolution of nucleic acids: a theoretical model. Science, 196: 311-313.

Waddington, C.H. (1969). Paradigm for an evolutionary process. In: Towards a Theoretical Biology, Vol. II. Chicago, IL: Aldine.

Wallace, R., King, J.L., Sanders, G. (1981). Biology: The Science of Life. Glenview, IL: Scott, Foresman and Company.

Wallin, I.E. (1927). Symbionticism and the Origin of Species. Baltimore, MD: Williams and Wilkins.
Wasserman, P.D., Schwartz, T. (1987). Neural networks: Part 1. IEEE Expert, 2(4): 10-14.

Whatley, J., John, P., Whatley, F.R. (1979). From extracellular to intracellular: the establishment of mitochondria and chloroplasts. Proceedings of the Royal Society of London, B204: 165-187.

White, H.B. (1976). Coenzymes as fossils of an earlier metabolic state. Journal of Molecular Evolution, 7: 101-104.

Wiley, E.O., Brooks, D.R. (1982). Victims of history - a nonequilibrium approach to evolution. Systematic Zoology, 31(1): 1-24.

Wilson, S.W. (1985). Knowledge growth in an artificial animal. In: Proceedings of an International Conference on Genetic Algorithms and Their Applications, July, Pittsburgh, PA.