A COMPUTATIONAL THEORY OF LEARNING IN DISTRIBUTED SYSTEMS

Adrian V. Sannier II

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Electrical Engineering and Systems Science

1988

ABSTRACT

A COMPUTATIONAL THEORY OF LEARNING IN DISTRIBUTED SYSTEMS

By Adrian V. Sannier II

A distributed computing system engaged in a learning task is subject to certain developmental considerations which arise from the potential for functional interaction that exists between the system's constituent processes. As individual processes adapt in response to the task environment, they must simultaneously adapt to the behavior of the other processes in the system, in order to form an integrated and coordinated whole. Effective development depends on the ability of the system to achieve an appropriate balance between the self-assertive and integrative tendencies of each of the system processes.

This dissertation specifies a method for developing a coordinated, goal-directed, distributed system from an initial set of independent, uncoordinated processes. The approach presented here is derived from theories which describe the evolution of biological communities. Two software systems which demonstrate the theory's utility are discussed in some detail: the first, Asgard, is an artificial life simulation; the second, Midgard, develops distributed, rule-based, adaptive load balancers for a certain class of inter-LAN networks.

To Banj, for everything . . .

ACKNOWLEDGEMENTS

First, I would like to express my gratitude to my major professor and friend, Dr. Erik D. Goodman. He gave me just the right mix of freedom and guidance to keep me going, and kept faith with this project even when I had none. My association with Erik, and his lovely wife Cheryl, is one I will always treasure. Special thanks also to the members of my guidance committee, Drs. James Resh, Carl Page, Thomas Manetsch, and Donald Straney, for their helpful observations and suggestions. I am particularly grateful to Don Straney, for helping me fill in some of the many gaps in my biological background. Thanks for all the time spent in our many hours of discussion -- they were one of the highlights of my doctoral program.

I also owe a great debt to the A.H. Case Center for CAEM -- it taught me everything I know about computers and a good deal more than that. I feel extremely fortunate to have been associated with as interesting and bright a group of people as have been through the Center. To Ojer and Janey, Albert and Sarah, DPH, Guy, and Remi, Joyce, Jack, and Captain Jupiter, the Yoss man, the wizard of Haas, my Consultants, and, of course, the Users; in short, to anyone who ever touched a terminal at Case: thanks to all of you -- it was the best time of my life.

Some things cannot be adequately expressed in words. One such thing is the gratitude I feel toward my parents for their unfailing love, strength, and support throughout the whole of my life. I couldn't have done this, or anything else, without you.
TABLE OF CONTENTS

LIST OF FIGURES ................ vii
LIST OF TABLES ................ viii
CHAPTER I  INTRODUCTION ................ 1
CHAPTER II  TOWARDS A COMPUTATIONAL THEORY OF DISTRIBUTED ADAPTIVE SELF-ORGANIZATION ................ 10
  2.0 Introduction ................ 10
  2.1 Symbiosis ................ 10
  2.2 Living Transitions ................ 13
    2.2.1 Replicator to Prokaryote ................ 14
    2.2.2 Prokaryote to Eukaryote ................ 22
    2.2.3 The Origins of Multicellularity ................ 27
  2.3 Distributed Adaptive Self-Organization ................ 32
    2.3.1 The Common Thread ................ 33
    2.3.2 A Computational Theory ................ 37
  2.4 Summary ................ 41
CHAPTER III  A DISTRIBUTED GENETIC ADAPTIVE SYSTEM ................ 42
  3.0 Introduction ................ 42
  3.1 The Basic Model ................ 42
  3.2 Composite Programs -- Composite Processes ................ 44
  3.3 Process Initiation ................ 45
    3.3.1 The distributed genetic algorithm ................ 46
      3.3.1.1 Michigan and Pitt ................ 47
      3.3.1.2 A synthesis ................ 49
    3.3.2 Structural and Spatial Coherence ................ 55
    3.3.3 A Reproductive Continuum ................ 57
    3.3.4 Composite programs ................ 60
  3.4 Summary ................ 62
CHAPTER IV  ASGARD ................ 63
  4.0 Introduction ................ 63
  4.1 System Organization ................ 64
  4.2 The Environment ................ 68
  4.3 Experiments ................ 73
  4.4 Summary ................ 83
CHAPTER V  MIDGARD ................ 85
  5.0 Introduction ................ 85
  5.1 Load Balancing ................ 86
  5.2 Model Specification ................ 88
    5.2.1 An Augmented Classifier Language ................ 89
    5.2.2 Midgard ................ 92
    5.2.3 Relationship with DGAS ................ 95
  5.3 Evaluation ................ 100
    5.3.1 Phase 1 ................ 101
    5.3.2 Phase 2 ................ 106
  5.4 Summary ................ 111
CHAPTER VI  SUMMARY AND CONCLUSIONS ................ 112
BIBLIOGRAPHY ................ 119

LIST OF FIGURES

Figure 1. Gene Splinting ................ 18
Figure 2. A Reproductive Continuum ................ 58
Figure 3. Example Asgard Animal Program ................ 66
Figure 4. Asgard ................ 71
Figure 5. Asgard Startup Population Transient With Non-Graduated Introduction ................ 74
Figure 6. Asgard Population Startup Transient With Graduated Introduction ................ 76
Figure 7. On-Line Progress of a Typical Homogeneous Experiment - FE1 ................ 104
Figure 8. Summary of Homogeneous Network Results - FE1 ................ 105
Figure 9. Summary of Homogeneous Network Results - FE2 ................ 107
Figure 10. On-Line Progress of a Typical Heterogeneous Network Experiment ................ 109
Figure 11. Summary of Heterogeneous Network Results ................ 110

LIST OF TABLES

Table 1. A Working Set of Asgard Parameters ................ 77
Table 2. Mean Service Times tij for a Job of Type i on a Processor of Type j ................ 108

CHAPTER I
INTRODUCTION

Left to her own devices, the common ant is no match for the forces in the world capable of destroying her. Fortunately for her though, no ant is an island. Each individual ant's efforts coordinate with the efforts of the other members of her colony and together they take the measure of their surroundings, construct a nest, locate and deliver food, defend themselves from the various agents of entropy, and reproduce their kind. Much of what they do together, they do implicitly, by instinct. The individual members of the colony cooperate by rote and promote their mutual survival.
It is this kind of implicitly coordinated effort which allows life to exist, in all its forms, at all its levels. The colony of ants is a step in an enormously complex ladder of living things, each rung populated by living systems whose continued existence depends on the behavior of the systems around them. This interdependence extends downward in scale as well as upward. Our individual ant is herself composed of millions of cells, each performing the specialized functions which, on the one hand, perpetuate their own individual lives, and, on the other, go together to make up the life processes of the ant. Each of those cells is itself composed of individual organelles, and so it goes. In an intricate and subtle network, the hierarchies of living systems extend from the vast ecologies which cover continents to the microscopic chemical gradients which make up the inner workings of cells.

What makes the story all the more remarkable is that all these elaborate strategies, each one linked to untold others, are constantly in a state of flux, continually adapting to the changing properties of the environment, in a process generally known as evolution. As a rule, when a population of living organisms reproduces over a period of time, it responds to its surroundings; the population's genetic structure changes through successive generations, gradually reflecting discoveries about which combinations of traits the environment has favored and which it has not. The process works through a combination of natural selection and the action of certain genetic mechanisms that manipulate the information stored within the gene pool, to produce populations adapted to an ever-changing environment.

Often, the environment an individual faces is shaped in great measure by the behavior of the living system which surrounds him. The individuals in a population interact functionally as well as reproductively, affecting each other's survival -- sometimes favorably, sometimes unfavorably. An ant with an aberrant strategy affects not only her own survival, but the survival of the rest of the colony; a larger than usual ant, with increased load bearing ability, will improve the colony's chances as well as her own. Each generation is a trial of a new set of survival strategies, tested individually and in concert against the rigors of surrounding conditions. From this cycle of testing, evaluating, and recreating the various and sundry strategies, combinations of strategies, and combinations of combinations, the natural system is able to generate, recognize, and consolidate mutually compatible strategies into a form which gives them stability and insures their proliferation so long as they continue to prove viable.

As computing resources become increasingly distributed, increasingly parallel, the individual processing element becomes more and more like the individual ant. The tasks computer systems perform are becoming ever more complex, and the systems designed to address them no longer rely on the behavior of a single, sequential machine, but on the coordinated, simultaneous action of anywhere from 10 to 100,000 concurrent processes. The modern factory floor already contains a whole host of distributed intelligence, from the microprocessors which control the senses and local behavior of robots, to the line controllers, to the network of micros which handle the payroll.
The advent of these massively distributed, parallel systems has produced a whole new set of research questions: by what methods, and to what extent, can these systems maintain fault tolerance; how can effective cooperation be maintained with a minimum of communication; can these systems, particularly in real-time applications, be made more robust with respect to changing conditions? Many of these questions, and others like them, have already been faced and solved by living systems. The idea I would like to explore in this dissertation is that many of the desirable properties of living systems -- their fault tolerance, their ability to coordinate with a minimum of communication, their robustness in the face of changing circumstances, their hierarchical control structure -- were produced gradually, as a direct consequence of their evolution. Studying this process should provide us with mechanisms for producing these effects artificially in distributed computing systems.

A distributed computing system that is adapting to a task domain has much in common with an evolving community. Like the individuals in a functionally interacting population, each processor in the distributed system is faced with an environment with a structure richer than it alone can represent, a problem more complex than it alone can solve. At the outset, when little is known about the problem domain, exploration of the problem space proceeds most rapidly if each processor fends for itself, so that the system examines a multitude of different strategies under various conditions. As the system matures, however, individual strategies will begin to impinge on one another more and more, and the processors must begin to adapt not only to the task at hand but also to each other's behavior. If the mechanisms which allow the living system to coadapt behaviors and forge independent individuals into a coherent and cooperating whole can be abstracted, they might go a long way towards solving many of the problems which emerge in distributed computing, particularly in their use as learning systems. This dissertation is a step in that direction.

When learning is regarded as the process by which a system acquires information about its environment and uses that information to alter its behavior, marked similarities between learning and evolution emerge [Pringle 51]. The central idea behind the research presented here is that a useful computational model of learning in distributed systems can be constructed from a computational model of a particular kind of evolving system.

Using biological models to develop computational approaches is nothing new. Since the early 1950's, researchers in computer science have used biology to drive their research. One of the first attempts at artificial learning, Friedberg's pioneering study of behavior-specified programming [Friedberg 58], used a learning algorithm based on a simple model of evolution. Another early study in learning, by Fogel, Owens and Walsh [66], used a similar approach in an attempt to design an automaton capable of learning regular languages. Today, teams of researchers throughout the country are attempting to develop "intelligent" systems using neural networks, a computational abstraction based on the structure of the animal nervous system [Wasserman 87, Lippmann 87, Cowan 88]; others seek to develop artificial adaptive behavior by modeling facets of the immune system [Farmer 85].
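To make the shape of these early systems concrete, the following sketch shows the bare evolutionary learning loop that studies like Friedberg's and Fogel's share: propose a variant of a current behavior, score it against the task, and keep it if it does at least as well. The sketch is my own construction, not a reproduction of either system; the bit-string "behavior", the matching-based fitness function, and the mutation rate are all illustrative assumptions.

import random

TARGET = [1, 0, 1, 1, 0, 1, 0, 0]   # stands in for some desired behavior

def fitness(candidate):
    # count the positions where the candidate's behavior matches the target
    return sum(c == t for c, t in zip(candidate, TARGET))

def mutate(candidate, rate=0.1):
    # flip each bit with small probability, yielding a nearby variant
    return [1 - b if random.random() < rate else b for b in candidate]

def evolve(generations=500):
    best = [random.randint(0, 1) for _ in TARGET]   # random initial behavior
    for _ in range(generations):
        variant = mutate(best)
        if fitness(variant) >= fitness(best):       # selection step
            best = variant
    return best

print(evolve())   # tends toward TARGET under mutation plus selection

The loop discovers the target purely by generating and selecting variation; nothing task-specific beyond the fitness score is involved.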
As tools in the study of complex biological phenomena, computational models may prove an important new means of understanding biological systems as information processors. A constructive, computational process description of a biological phenomenon presents a useful alternative to less explicit written descriptions or explicit but cumbersome mathematical models which are often intractable for all but the simplest cases. Computational descriptions can elucidate the implications of the set of assumptions underlying a biological theory from an information processing perspective. As a number of authors in the biological sciences have noted, the processing of information is a critical concept in understanding the regulation and development of biological systems [Prigogine 80, Wiley 82]. Thus the new vantage point afforded by computational models should provide valuable insights into the study of the characteristics of living systems.

A good example of the use of a computational description in this context can be found in a study by Holland [76a] exploring the spontaneous emergence of self-replicating molecules. One of the sticking points in the current view of the origins of life is that the chance assembly of the complex proteins and self-replicating nucleic acid chains on which the theory depends is all too unlikely. Arguments exist on both sides of the question, but they tend to lack precision, since the models advanced to account for spontaneous emergence rely on complicated chemical interactions which obscure the probabilities of formation. By reformulating the question in terms of abstract entities which change form according to specific rules and probabilities, Holland was able to more explicitly characterize the interactions between the various constraints which govern "replicator" development. While his 1976 paper is only an exploratory effort, it presents the beginnings of a framework which sheds new light on the nature of the role which enzymes play in reducing the expected waiting times for the spontaneous emergence of self-replicating compounds.

Holland's "alpha-universe" formulation is a computational abstraction of the "primordial soup", where the laws of chemistry are abstracted as the transition rules of a Lindenmayer grammar embedded in a cellular automaton. Substrings in this grammar (abstract "chemicals") change over time according to these transition rules and the form of the substrings which are proximate to them. Holland showed that dramatic decreases of up to several orders of magnitude can be obtained in the expected waiting time for the emergence of self-replicating strings if the transition rules of the grammar are such that certain simple substrings act as "enzymes", increasing the probability of occurrence of more complex substrings. By studying the properties of his "alpha-universe", Holland was able to make specific assertions about the conditions under which certain simple configurations can act to increase the probabilities of formation for other, more complex configurations. Moreover, his model gave rise to fairly complex behavior (the spontaneous emergence of self-replicating strings) from a relatively simple set of transition rules.

The computational technique is by no means comprehensive; computational models like the aforementioned fail to capture many facets of the biological phenomena they describe.
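The quantitative heart of Holland's result can be illustrated with a toy program. The sketch below is my own construction, not the alpha-universe itself: it collapses the grammar and cellular automaton into two abstract events with invented probabilities -- a simple "enzyme" that forms at some rate and, once present, multiplies the formation probability of a more complex configuration.

import random

BASE_P = 1e-4     # assumed per-step chance the complex string forms unaided
ENZYME_P = 5e-3   # assumed per-step chance the simple enzyme forms
BOOST = 50        # assumed catalytic factor once the enzyme is present

def waiting_time(with_enzyme, rng):
    # count steps until the complex string first appears
    enzyme_present = False
    step = 0
    while True:
        step += 1
        if with_enzyme and not enzyme_present:
            enzyme_present = rng.random() < ENZYME_P
        p = BASE_P * (BOOST if enzyme_present else 1)
        if rng.random() < p:
            return step

rng = random.Random(1)
for label, flag in [("without enzyme", False), ("with enzyme", True)]:
    mean = sum(waiting_time(flag, rng) for _ in range(100)) / 100
    print(label, round(mean))

With these assumed rates the mean waiting time falls from roughly 10,000 steps to a few hundred, a reduction of more than an order of magnitude. Like Holland's model, the sketch retains none of the real chemistry; only the probabilistic skeleton remains.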
But the level of abstraction they enforce allows a researcher to study more easily the interactions of information that occur between several complex processes. These models deal solely with abstract entities, letting the level of physical detail fall away to present the system's information processing functions in sharper focus. The new insights which the computational perspective can bring to the study of biological systems promise to make it an important tool in the study of complex living processes.

As results in computer science, models of biological information fill a crucial void. While much of the research in the field of computer science comes under the heading of computational mathematics, a large body of research is more akin to experimental physics than mathematics. Particularly in the field of artificial intelligence, research results are often presented as algorithms or software systems which have been developed as solutions to specific problems. A set of problem-dependent data structures are defined and a complex series of operations are performed on them. The system's performance is then described in terms of these structures and operations, but, particularly if the algorithms are non-deterministic or involve heuristic methods, these "process descriptions" [Marr 77] are incomplete at best. While the systems are often of interest in and of themselves, sometimes dramatically demonstrating computing capabilities, time has shown them to be difficult to generalize or extend [Marr 77]. In terms of the scientific method, such results act as both hypothesis and experiment -- and since the hypothesis does not describe a more general class of events, even when the experiment succeeds, it gives only a little guidance as to how future systems should be constructed, which features are crucial and should be retained, and which are redundant or ineffective and need to be improved.

Marr, an early researcher in artificial intelligence and computer vision, suggested an alternative, which aims at the development of a theory of information processing beyond the level of process description. For Marr, the ideal result in computer science involves: the isolation of a particular information processing problem, the formulation of a computational theory for it, the construction of an algorithm which implements it, and a practical demonstration that the algorithm is successful. At the crux of his approach is the observation and modeling of biological information processing. By studying various animal vision mechanisms Marr was able to identify some common features which exploit the same "natural constraints" to reduce the complexity of image resolution. In so doing, he framed what he called a "computational theory" of low-level vision processing, a hardware and implementation independent, "abstract formulation of what is to be computed and why". The computational theory focuses on the invariant properties of a calculation, irrespective of the algorithmic details of implementing it. These theories guide the scientific process in the traditional way. They are framed from observations of existing phenomena and subsequently tested (by implementing algorithms based on their explanation of the computation) and modified in accordance with the results of experiment. The rest of this dissertation proceeds along these lines.
With Marr's concerns in mind, Chapter 2 begins with a discussion of three living systems which began as sets of individual units with little or no coordinated interaction, and ended up as highly cooperative, hierarchically organized distributed systems. By discussing some biological theories which account for their emergence we will attempt to identify their common thread and unify them into a single computational theory of distributed adaptive self-organization.

In Chapter 3, we will discuss the distributed genetic adaptive system (DGAS), an algorithm which implements the theory developed in Chapter 2. The chapter begins with a description of the learning mechanism used by the system, a variant of the genetic algorithm originally due to Holland. Following this, the DGAS is described in terms of a set of mechanisms capable of identifying and consolidating emergent symbiotic behaviors between independent, distributed units operating in a metric problem space.

Chapter 4 presents Asgard, a software system which implements the DGAS. Designed as an artificial world populated by large numbers of "artificial organisms", Asgard showed the ability of the DGAS to develop a coherent, coordinated hierarchical distributed system from originally independent, uncoordinated individual units. The program is described in some detail and the results of experimentation are presented.

Chapter 5 describes Midgard, a second test of the theory. In Midgard, a modified version of the DGAS is applied to develop a distributed system which balances the computational load in an inter-LAN network. The network is a set of autonomous processors which can communicate with one another and allow resource sharing. In a distributed computer system of this kind, often some processors become heavily loaded, with many jobs waiting in their queues, while other processors stand idle, or nearly so. Clearly gains in overall efficiency can be obtained by migrating jobs from heavily loaded processors to lightly loaded ones. Protocols which attempt to mediate these migrations to increase overall throughput are called load balancing protocols. Each processor in the network uses the protocol to evaluate its loading and the loading of the other systems, to decide whether to accept jobs from other processors or to send some of its own to another system, and to resolve any conflicts in communication which arise. In applying the DGAS to this area, the load balancing protocol is gradually developed from coadapted sets of processes which control the scheduling behavior of the individual processors in a simulated network. Efficiency feedback is used to direct the adaptive process, producing a viable system of distributed schedulers which cooperate with one another to increase network efficiency and decrease waiting times.

Chapter 6 is a discussion of results and some concluding remarks.

CHAPTER II
TOWARDS A COMPUTATIONAL THEORY OF DISTRIBUTED ADAPTIVE SELF-ORGANIZATION

2.0 Introduction

The transition from distributed independent units to coherent, cooperative whole has been made countless times by living systems. In this chapter, we will examine theories of how that transition has occurred at three distinct levels of organization: the step from primitive replicators to prokaryotes, from prokaryotes to eukaryotes, and from simple eukaryotes to multicellular organisms. Each of the theories we will examine emerges from recent work in biology centered around reasserting the role of symbiosis in the generation of evolutionary novelty.
By comparing the common features of the theories proposed to explain each of the three transitions, we will develop a more formal statement of the problem of distributed self-organization and attempt to isolate the computations that have been performed via different algorithms at each level to foster each transition. These common computations will form the basis for a computational theory of distributed self-organization which is elaborated in the last section of the chapter.

2.1 Symbiosis

The primary mechanism of evolution, as described by the neo-Darwinian synthesis, is the natural selection of random variation. The biosphere randomly generates survival strategies, embodied in replicating DNA, which are maintained in proportion to their effectiveness in competing with other strategies for available resources. Competition among individuals is what weeds out the weak, in a nature "red in tooth and claw" where only the robust reproduce. Yet in spite of this competitive atmosphere, current research is revealing that living things are tremendously interdependent, not only in the food chain, but in many less competitive ways. At all organizational levels, from sub-cellular viruses [Reanney 74] to the biosphere as a whole [Lovelock 88], stable symbiotic partnerships are pervasive. Moreover, the idea that evolving symbioses are a source of novelty is enjoying a renaissance.

The idea of symbiosis as a factor in the evolutionary development of new forms is not new. As early as 1883 [Margulis 81], biologists looked to symbiosis to account for the emergence of radical forms, particularly the development of the eukaryotic cell. In 1927, I.E. Wallin advanced the claim that the mitochondria of eukaryotes had their origins as free-living, respiring prokaryotes which had invaded other host cells and adapted to live within their cytoplasm. His theory asserted the importance of symbiosis in the development of novelty, not only at the cellular level, but at all levels of organization. Wallin identified the selection of symbioses as the primary source of evolutionary novelty:

Just as reproduction insures the perpetuation of existing species, the author believes that Symbionticism insures the origin of new species. [Wallin 27]

At the time, Wallin's claims were rejected by mainstream biology. What little hard evidence he presented was suspect and the links between symbiosis and the evolution of novelty were dismissed as unfalsifiable [Margulis 81]. Particularly in Wallin's day, before ultra-structural and molecular analysis -- but even today -- demonstrating the symbiotic origin of a novel form poses a dilemma. When the individual partners in a symbiosis can be clearly identified, then no new form exists. If, on the other hand, a new form has arisen from the integration of symbionts, the boundaries between the partners may have blurred to the point that they share much of the same apparatus for their maintenance; a new form has arisen but the evidence of the separate natures of its constituent symbionts will have virtually disappeared. D.C. Smith [79] has likened the identification of closely linked symbionts to the problem of identifying the vanished Cheshire Cat solely from the traces left by his grin. Since Wallin's time, however, new techniques in microbiology have been used to establish a better case.
Margulis' serial endosymbiosis theory [Taylor 74], which claims that eukaryotes evolved from simpler forms through the integration of a series of symbionts, is now a part of the accepted explanation for their origin. The idea that symbioses have been a major source of evolutionary novelty is now accepted by most biologists [Margulis 81].

One of the primary obstacles to the idea that symbiosis plays a role in the development of new forms has been the belief that genetic exchange between members of different species was all but impossible. This view is changing; ideas from many quarters have led to the breakdown of the barrier. Advances in the evolutionary role of viruses, plasmids, transposons and other mechanisms of inter-cellular gene exchange have forced a reconsideration; inter-specific recombination events do occur, and with meaningful effect. According to Jeon and Danielli [71]:

There is significant evidence that organisms are not limited for their evolution to genes that belong to the gene pool of their species. Rather it seems more plausible that in the time-scale of evolution the whole of the gene pool of the biosphere is available to all organisms and that the more dramatic steps and apparent discontinuities in evolution are in fact attributable to very rare events involving the adoption of part or all of a foreign genome.

Anderson [70] goes even further:

Viral transduction is a key mechanism for transporting segments of DNA across species and phylum barriers and evolution depends largely on this transfer.

That mechanisms for transferring genetic information between neighboring cells allow for the strengthening of symbiotic associations has been advanced in support of the idea that symbiosis may be an important force in the evolution of species [Taylor 79] -- a metasexual phenomenon resulting in "the breakdown of genetic, physiological and spatial isolation between organisms that diverged in the ancient past" [Margulis 76].

The symbiotic paradigm is attractive for my purposes since it presents an avenue for addressing the problem of distributed self-organization as the coadaptation of originally independent processes. Inherent in the idea of symbiotic coevolution is the notion that independent and dissimilar distributed units may spontaneously form partnerships and gradually evolve into a more highly organized and coadapted whole. Any "hardware independent" mechanisms which can be identified from the study of living systems that have developed into coherent, integrated wholes should be useful in constructing a theory to guide the construction of adaptive algorithms to foster the coordination of distributed but interacting computing units.

2.2 Living Transitions

In the next three sections, we will examine the application of the symbiotic perspective to three distinct stages in the evolution of life. Each of the stages is marked by the spontaneous coordination of distributed elements over time. At each stage, the basic units manipulated will be of different complexities, subject to different constraints, on different scales of size. Thus we expect a good deal of divergence in the physical mechanisms which accomplish the transition toward higher organization. At the same time, however, we expect that, in some abstract way, the "computations" these various mechanisms perform should be in many ways analogous.
The identification of these computations will put us on the way toward a computational theory of distributed adaptive self-organization, a phenomenon which marks each of the almost mystical "quantum leaps" which evolving life has performed.

2.2.1 Replicator to Prokaryote

The article of faith underlying current scientific investigations into the origins of life is that pre-biotic organics arose as a natural result of the laws of chemistry from more primitive compounds present in the atmosphere and oceans of the early earth. These relatively complex organics are thought to have recombined under the forces of chemistry and natural selection to form replicating molecules enclosed within lipid membranes. Eventually, through further recombination and selection, these enclosed replicating molecules, the precursors of prokaryotic cells, are believed to have developed the ability to use information encoded within them to synthesize proteins, an especially powerful class of physically active effector molecules. Once capable of self-replication and protein synthesis, these informational molecules expanded their range of phenotypic expression, increasing the complexity, diversity, and intricacy of the chemical interactions they were together capable of producing.

Since the beginning of this century, various pieces of the progression from inanimate chemical compounds to simple cells have been explored, with some measure of success. In particular, evidence to support the idea that the building blocks of life were the spontaneous result of chemical reactions between simpler compounds likely to have been present on primitive earth has accumulated steadily since Stanley Miller first synthesized an amino acid via electrical discharge in a mixture of methane, ammonia and water in 1951 [Miller 53]. Subsequent efforts in what is now called pre-biotic chemistry have resulted in the laboratory synthesis, under conditions simulating the early earth, of over 25 different nucleic acid derivatives, over 40 amino acids and their derivatives, and a number of carbohydrates [e.g., Sagan 71, Ponnamperuma 68, White 76]. Other research has shown plausible routes by which nucleotides linking with each other via hydrogen bonding could have led to the familiar 3',5' double-helical DNA. Several researchers have had success in developing self-replicating nucleic acid chains starting from much simpler precursors [Spiegelman 67, Goulian 68, Khorana 70]. A plausible theory that the 3',5' backbone may have had survival advantage over the disruptive 2',5' bond which is so commonly produced by attempts at artificial DNA synthesis under the conditions of early earth has been given by David Usher [77]. Usher describes how a repeated cycle of warming and cooling under varying humidity favors the formation of the protective 3',5' coil over the more easily hydrolyzed 2',5' chain. This work suggests that DNA may have been the fundamental molecule of life from the very beginning.

Despite progress in various stages which the spontaneous emergence theory requires, many gaps still exist. Nevertheless, enough evidence exists to conclude that at some point, some kind of self-replicating system did emerge from inanimate precursors as a result of natural processes. In its simplest form, this system was likely a self-replicating nucleotide chain in a solution of pre-biotic organics and water, perhaps enclosed by a lipid vesicle which protected it from disruptive effects.
The subsequent evolution of these replicators into the diverse and interdependent prokaryotes is thought to be the result of two effects: the natural selection of variation and symbiotic association.

The selection pressure to which these nucleotide chains were subject resulted from variations in the stability of individual molecules. Prior to the emergence of replication, those molecules which were most stable, i.e., those which remained intact longest, and whose chance rates of formation were highest, became increasingly prevalent. Molecules which formed only rarely, or broke up with high probability shortly after formation, decreased in proportion to these more robust strains. Once the first nucleotide chains emerged, however, selection pressures changed in kind. Nucleotides do more than last; they can also react with their environment in a completely new way to produce copies of themselves; and with the advent of replication came new avenues to stability. These new molecules, the first replicators, had gained the beginnings of a phenotype. Given that the waters of early earth at the time of the first polynucleotides were rich in the pre-biotic organics required for their formation, their individual rates of increase were heavily dependent on their behavior as replicators, on their ability to interact with the environment to create more of their kind. In this phase, those molecules prospered which were the best at making stable copies of themselves.

Dawkins [76] has identified the three distinct pressures that natural selection exerted on them as: longevity, fecundity, and fidelity. Longevity was the same pressure exerted on pre-replication molecules; those molecules whose rate of formation was greater than their rate of decay increased in numbers. For a replicator, though, the stability benefit is twofold, since a replicating molecule's behavior affects its rate of formation. The longer a given molecule lasts, the more copies it can produce. So, other things being equal, long-lasting replicators have selective advantage over short lived ones. The second pressure, fecundity, has to do with the rate at which a molecule produces copies. Other things being equal, molecules which reproduce fastest will increase in numbers fastest. The last pressure, fidelity, refers to the integrity of the copying process. If two molecules have the same longevity and fecundity, the molecule whose copying process is most faithful, i.e., which makes the fewest mistakes, will increase the fastest. If replication is the sole means of phenotypic expression, those replicators increase fastest that last longest, reproduce most rapidly, and produce the most faithful copies.

In addition to the potential for replication, nucleotide chains have another property which afforded them a second phenotypic outlet, a property which allowed them to interact with one another and directly influence not only the rate at which they reproduce, but also the rate at which chains associated with them reproduced. Laboratory experiments suggest that the first replicating chains were short, no more than 10 bases long [Day 84]. Due to flaws in the copying process, which must have been far less reliable than present day mechanisms, many different "strains" of chain should have arisen, each one associated with a different sequence of bases. When short chains of differing lengths, containing a wide variety of base sequences, are allowed to combine in solution, a process known as gene splinting [Day 84] occurs (see Figure 1).
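The pairing logic behind splinting, described in prose below, can be caricatured in a few lines. This sketch is my own toy with invented sequences; it treats strands as parallel rather than antiparallel and ignores bonding energetics entirely.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(chain):
    # the base-pairing partner of each base in the chain
    return "".join(COMPLEMENT[b] for b in chain)

def splint_join(left, right, splint):
    # a splint whose first half pairs with the tail of `left` and whose
    # second half pairs with the head of `right` holds the two chains
    # together, yielding one longer strand
    half = len(splint) // 2
    tail_ok = complement(splint[:half]) == left[-half:]
    head_ok = complement(splint[half:]) == right[:len(splint) - half]
    return left + right if tail_ok and head_ok else None

print(splint_join("GAA", "CCT", "TTGG"))   # -> GAACCT: chains joined
print(splint_join("GAA", "GGT", "TTGG"))   # -> None: no complementary match

The point of the caricature is that which chains associate, and how strongly, is determined entirely by their base sequences -- exactly the property that lets selection act on the associations themselves.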
Base pairing occurs between chains which share a complementary sequence, and if the chains are of different lengths, a set of unmatched bases may form a "splint", extending as a single chain from the double stranded complex. If this splint encounters another chain, whose end bases are arranged in the complementary sequence, a longer double stranded molecule forms, possibly extending its own "splint". Through this splinting process, associations of various strengths could form between the different chain varieties, characterized by their particular base sequences. The frequency and strength of these associations would have varied with the extent to which those associations promoted the longevity, fidelity and fecundity of their constituent parts. As the chains grew longer, the strength of association between sub-chains would have varied with the distance between them. Those composite chains which increased the longevity, fecundity, and fidelity of their component parts were the first symbioses.

Figure 1. Gene Splinting

Associations formed between proximate single-stranded molecules with complementary base pair segments, and those associations which conferred survival advantage strengthened and proliferated.

At some point, by an as yet unknown evolutionary path, the phenotypic expression of the polynucleotides somehow expanded to include protein synthesis. While many of the individual steps are still unclear, it is assumed that some of the longer replicating molecules developed tRNA and gained some rudimentary synthetic capabilities which, through the action of association and selection, eventually resulted in a minimal cell. The components of these cells were most likely some mix of the following [Morowitz 67]: some strands of replicating DNA (containing, in total, about 50 cistrons), strands of RNA, both transfer and messenger, some catalyzing enzymes (producible by internal protein synthesis), a nutrient rich cytoplasm, and an enclosing lipid membrane.

At this point the nature of selection on the confederated replicators overseeing these systems would have changed again; the wide range of phenotypic possibilities which protein synthesis provided changed the emphasis of the selection process. Protein synthesis so escalated the rate and range of chemical expression that rather than favoring those nucleotide chains with the best inherent replication abilities, chains were favored whose synthesized protein complexes resulted in the most efficient replication of the overall system. The basis for selection remained the same -- longevity, fecundity and fidelity -- but the means for securing improvement expanded along multiple paths, as the enzymes produced by decoding the replicators as protein templates catalyzed new reactions, yielding new compounds. Proteins could speed reaction rates to increase fecundity, align molecules more precisely to increase copying fidelity, and construct more and more elaborate physical and chemical defenses against entropy to increase longevity. And the various substances indirectly produced and metabolized by the various different nucleotide combinations introduced a new level of interaction.

To illustrate the effect the introduction of protein synthesis had on the evolution of the replicators, we will consider one of the most potent chemical avenues which was opened -- the metabolic pathway. By the time the minimal cell appeared, much of the store of pre-biotic substances had been depleted [Day 84].
Where once complex molecules which provided the raw materials for the cell's reactions and the energy which drove them -- riboses, purines, protein and non-protein amino acids -- had been available in abundance, changes in external environmental conditions reduced their abiotic rate of formation. As the supply grew shorter, competition arose between cells for their acquisition. One possible result of this dwindling of basic resources could easily have been general extinction. The various cells, each with differing capabilities for acquiring their needed inputs, might have been gradually crowded out by ever more efficient acquirers. All the while, the store of basic resources would have dwindled until the organisms which survived the competitive milieu finally died from chemical starvation. While this may, in fact, have occurred in some cellular populations, that it did not occur in every one is made manifest by the present prokaryotic world. In the face of dwindling resources, cellular populations adapted, developing mechanisms for synthesizing the needed but disappearing complex molecules from simpler, more readily available ones. Today, "prokaryotic microbes can metabolize nearly any organic compound made by nearly any organism" [Margulis 70]. The modern prokaryotes, the descendants of the first minimal cells, are the most metabolically diverse of living creatures. All the metabolic pathways known to living creatures are known to prokaryotes; the origin of these pathways is a story of symbioses.

Many of these reaction paths involve the synthesis of one compound from several simpler precursors, and involve the sequential action of many different enzymes. The individual steps comprising the reaction chain confer advantage on the organism only as a whole; the intermediate products involved may or may not be of use. The existence of these non-functional intermediates made accounting for pathway evolution one of the classic problems of explaining a macroevolutionary phenomenon in terms of micro-evolutionary steps. The present theory of pathway formation is due almost entirely to the insight of Horowitz [45]:

the evolution of the basic syntheses proceeded in stepwise manner, ..., but the order of attainment of the individual steps has been in the reverse direction from that in which the synthesis proceeds -- i.e., the last step in the chain was the first to be acquired in the course of evolution, the penultimate step next, and so on.
The symbiosis here exists on two levels: at the top level, cells which produce complementary enzymes are mutualistic; but underlying their association is a symbiosis between the recombinant base sequences, the genes, which code for the enzymes which actually catalyze the reactions. Among the mechanisms available, first physical cohesion and finally genetic exchange and incorporation of the relevant 22 information (through transformation or viral transduction) may have achieved advantage over previous forms giving rise to a new species containing a replicating composite capable of the entire pathway. We have identified two occurrences of the integration of previously independent units in the transition from simple replicator to simple cell. Both have depended on the emergence, selection, and improvement of symbiotic associations between separate processes. The first, the deveIOpment of long nucleotide chains from shorter, "less than ten base" units was accomplished via the preservation of chance interactions between units with complementary base pairs that improved the replicating abilities of the constituent parts. Subsequent interactions, between ever longer chains, tested associations of associations, retaining those which proved most advantageous. In the second instance, replicating units within distinct cells encoding for complementary enzymes conferred greatest advantage on their respective cells when proximate to one another. This advantage resulted in more offspring for the spatially associated cells, leading to increasing levels of interaction, culminating in the fusion of the codes into a single replicating strand. While the physical mechanisms differed, the common principle of selection of association was a driving force in these transitions. 2.2.2 Prokaryote to Eukaryote By the early Precambrian, some 2 billion years ago, the prokaryotic microbes had evolved most of the biochemical pathways characteristic of modern organisms [Margulis 70]. The abiotic organics which drove early evolution were replaced by substances produced biotically from simpler precursors. The anaerobic atmosphere of the early earth was transformed by the oxygen-producing photosynthetic microbes. The ozone layer had formed, reducing the amount of ultraviolet radiation which had provided much of the energy used by earlier life forms. 23 The biosphere became its patchy, recognizable, competitive self: associations, symbioses, parasitisms and predations evolved. Many different kinds of relations among microbes, more and less involved, came into play [Margulis 70]. From these beginnings, in the space of roughly 1 billion years, a new kind of organism, radically different from these prokaryotes arose. These new organisms, the eukaryotes, formed rapidly, and their evolved descendants today show a markedly greater level of complexity than modern prokaryotes. The currently accepted theory of eukaryote evolution, the serial endosymbiosis theory, or SET [Taylor 7 4], is largely due to the work of Lynn Margulis. Margulis’s thesis is that a series of endosymbioses between various strains of prokaryotes resulted in the eukaryote "cell"; according to Margulis. the eukaryote is actually a polygenomic unit, containing at least three separate sets of replicating informational molecules. The theory holds that two of the major organelles, the mitochondria and plastids, arose from the gradual integration of distinct prokaryotic endosymbionts with a host microbe. 
In Margulis' words:

Eukaryotic cells are considered to have originated as communities of cooperating entities that had joined together in a definite order; with time, the members of the cooperative, already skilled in their specialties, became organelles [Margulis 81].

According to the SET, a large, relatively unspecialized microbe served as host, the so-called protoeukaryote [Margulis 70], which powered itself by anaerobic fermentation of glucose to pyruvate by the Embden-Meyerhof pathway. By some means, either invasion or phagocytosis, a smaller aerobic Gram-negative microbe entered the plasma membrane of the host. This microbe, the protomitochondrion, contained the Krebs-cycle enzymes and the cytochrome system for total oxidation of carbohydrates to carbon dioxide and water. Originally free living, it is presumed to have been at least a minimal replicating system, capable of controlling its own reproduction.

Once inside the host cell, the protomitochondrion remained undigested (by unknown mechanisms) and began to reproduce in the host's cytoplasm. A mutualistic symbiosis developed. The host gained a new set of metabolic paths providing new sources of energy and making possible a new set of syntheses [Margulis 81]. The protomitochondrion, for its part, gained safe haven -- an osmotically controlled, chemically buffered habitat. In time the two became completely dependent on one another. Only under very special circumstances can modern eukaryotes survive mitochondrial dedifferentiation or loss [Margulis 81]. The mitochondrion gradually lost many of the traits of a free-living organism; protected from the outside environment, able to exploit the host's reproductive mechanisms, the protomitochondria lost their cell walls (but not their thin cell membranes) and the ability to produce many of the enzymes associated with reproduction. Nevertheless, many functions of free-living cells were retained; mitochondria supervise their own reproduction with mitochondrial DNA and synthesize many of their own proteins. They produce their own ribosomal and transfer RNA.

While at one time some biologists were skeptical that such endosymbioses could have evolved, a similar phenomenon has been observed in a particular strain of laboratory amoebas [Jeon 76]. This strain of amoeba spontaneously contracted a bacterial infection; the infection involved a large number of bacteria per amoeba, on average 40,000, and at the outset the infection was "very harmful" to the host cells, drastically decreasing their rate of growth. After less than 1000 generations, however (a period of 5 years), the "infected" amoebas had regained their vigor and, moreover, had become dependent on the infecting bacteria for their survival. A series of micrurgical studies of the amoebas showed that an infected nucleus, when combined with the cytoplasm of an uninfected amoeba, did not form viable cells, while the reverse combination did. Jeon concludes that the "infected" amoebas have become completely dependent on the bacteria and cannot survive in their absence. Neither do the infecting bacteria seem to be able to survive outside the amoebas. The infecting bacteria are, however, able to infect other amoebal species, if properly treated.

It may at first seem strange to cite a laboratory example of the development of a symbiont which confers no advantage on its host (the "infected" amoebal strain was no more fit than uninfected strains) when propounding selection of symbioses as a principal means of adaptive evolution.
Taylor [79] made a similar observation in a review of Jeon's results:

Infected amoeba appear to have no obvious advantage over non-infected cells, a point worth recalling when questions of the selective value of symbioses are raised.

Jeon's results, of course, do not demonstrate that all symbiotic associations will prove more robust than all other competitors, regardless of external conditions. What they do show is that cellular mechanisms are sufficiently robust to withstand and adapt to massive intracellular infection. The rapid rate of coadaptation suggests that these mechanisms may even be pre-adapted to foster such associations. Those who would argue for symbiosis as a force in the evolution of complexity certainly do not argue that all symbioses necessarily form to mutual advantage, only that those which do confer advantage tend to increase in frequency and strength over time, and are selected for, and that those which confer disadvantage decrease in frequency and strength over time, and are selected against.

The plastids, the other organelle whose origins are accounted for by the SET, are thought to have formed in much the same way as mitochondria. Already specialized bacteria, primitive photosynthesizing cyanobacteria, either invaded, or were engulfed but not digested by, a mitochondria-bearing protoeukaryotic host. Upon entry, these microbes entered into symbiotic association with the host, giving it photosynthetic capability in exchange for the advantages of a living cell as a habitat. Their subsequent integration with the host is thought to have followed a pattern similar to that given for mitochondrial development. The organelles retained some of their own informational molecules, but lost many of the characteristics of free living cells, replacing these characteristics by exploiting resources provided by the host.

While no true intermediate forms have been identified, suggestive examples have been found. Paramecium bursaria is an algal-ciliate symbiosis similar to the kind expected in the formation of plastids. The protoplastids in this case are the photosynthetic algae Zoochlorellae, a close relative of the free-living Chlorellae. Paramecium containing the symbiont Zoochlorellae can survive even when other food sources are scarce, provided there is adequate light; Paramecium without the symbionts cannot. As is usual in stable symbiotic relationships, the rates of reproduction of host and proto-organelle are approximately equal. The symbiont reproduces on its own and is cytoplasmically transferred to Paramecium offspring in order to maintain the association.

Mitochondria and plastids may not be the only organelles gained through the selection of symbiotic associations. Several other bodies have been identified as possible endosymbionts [Smith 79]. Many of the DNA-containing bodies found within cells may well be of free-living origin, but may never be positively identified as such; if symbionts grow more and more integrated with their hosts, the identification of their origins becomes increasingly difficult.

According to the SET as propounded by Margulis [86a], the intracellular relationship between the various bacterial forms began as a "loose confederation", a multicellular colony of specialized prokaryotes enclosed within a single plasma membrane. At first
Gradually, as the processes associated with mitosis developed, the individual members became increasingly coadapted, and their interrelationships grew more rigid; much of this coordination is probably accomplished through information encoded in nuclear DNA. Once again, the selective retention and strengthening of positive interactions between distributed units resulted in the development of a coordinated whole composed of specialized, cooperative processes.

2.2.3 The Origins of Multicellularity

Once the cellular level of organization was reached, selection began on associations between cells. The end result has been a diversity of multicellular forms, most of them eukaryotic, which represent yet another quantum leap of complexity and capability. In their external behavior, the multicellular organisms interact on an entirely different scale of size and time than preceding forms of life and have created and exploited niches completely unreachable by their smaller components. Multicellular life has developed mechanisms of tremendous complexity, and used these mechanisms to develop radically new behaviors. This increase in external complexity has been matched by an increase in the complexity of their construction. In place of simple fission and mitosis are the complex processes of embryonic development, an intricate balance of meiosis and mitotic differentiation which produces organisms with trillions of specialized cells from a single cell.

Much of the phylogeny of multicellularity, as well as most of the mechanisms by which multicellular development proceeds, is largely unknown. The secrets of the two seem to be inextricably linked:

    The fact that differentiation is still a mystery means that the last of the great Precambrian events -- the origin of multicellularity -- is a problem for which we have no solution [Barbieri 85].

Nevertheless, theories of their origins do exist. The most plausible starting place is that the multicellular forms arose from associations between unicellular organisms. In its simplest form, this is all that multicellularity is; the line that separates colonial organisms from those somehow "truly" multicellular is arbitrary at best. On the way to considering the origin of what are usually thought of as "true" multicellular forms -- the organisms which develop by differentiation of a single-celled embryo -- we will examine first the development of a different kind of multicellularity, the kind found in organisms like Mixotricha paradoxa. A popular freshman biology text [Wallace 81] defines a multicellular organism as:

    one that is composed of a number of cells that cooperatively carry out the functions of life, cells that cannot survive independently under natural conditions.

Under such a definition, the many obligate symbioses found in the biosphere must be counted as multicellular organisms. In organisms like Mixotricha, different specialized unicellular organisms, whose ancestral forms were originally free-living, evolved into symbiotic complexes whose parts became so integrated that independent life was no longer possible. These associations begin as weak chance interactions, accidents of proximity. If an association proves mutually beneficial, however, each member of the association produces more offspring when in the presence of the slightly complementary other than in its absence. As the relative concentrations of the complementary types increase in a given area, so does the probability of forming partnerships.
So long as the complex proves viable, selection begins to act on the partners as a genetic complex, favoring variations in the complex which increase the integration and specialization of the partners.

The resulting organisms can be models of distributed coordination. Mixotricha is a complex of no less than five separate species of organisms. The protist host contains two different species of bacteria, one responsible for the digestion of cellulose, the other for cellular respiration. On its surface, two different species of spirochetes -- anchored to special sites on the host's plasma membrane with the aid of another species of "anchoring" bacteria -- provide motility. In all, the complex is composed of well over a million cells. The spirochetes are capable of coordinated movement through some unknown mechanism. Each cell of the complex maintains its own replicating machinery, and the entire complex reproduces itself from the coordinated action of each cell. It is unculturable outside of its specialized environment [Margulis 81].

While there are examples of integrated, "multicellular" forms composed of specialized cells from distinct species, they are far more rare than the "true" multicellular organisms, the multicellular clones. The theory of their origins also begins with chance association by accident of proximity. Colonies are most likely the result of the failure of daughter cells to separate completely from the parent during division. If the physical coherence -- the association between mother and daughter -- conferred advantage on the partners, by whatever means, the tendency to remain connected after division would be favored, resulting in primitive colonial organization [Margulis 86b].

The emergence of colonies sets the stage for differentiation. The physical composition of the colony is a major determinant of each cell's external environment. This composition is in permanent flux as new cells are added and others die or are removed. When new cells are added, the potential for variation exists as a result of imperfect copying processes and variations in the intra-colonial environment. From these variations, simple specializations can arise (e.g., the assumption of specific orientations of individual parts with respect to the rest of the colony). What follows is a kind of clonal selection [Burnet 69]. Differentiated cells which confer advantage on the colony are more likely to be retained by the colony. To the extent that such differentiations are useful, colonies which produce them will tend to grow in size and spawn other colonies, provided the differentiations have a heritable basis.

At the outset, colonial organisms spread themselves in irregular fashion; new colonies are formed whenever one or more cells break away from the parent colony. Those colonies formed by groups which contain cells from the differentiated types will gain selective advantage over those which are missing them. The associations between the primitively differentiated cells within these colonies will have been selected for. Examples of this kind of organization still exist, for example the green alga Gonium. Any mechanism which arises to maintain the association of the advantageously differentiated types will be selected for by the same logic. Variations in one or more of the cell types which tend to organize the production of new colonies, and insure that the proto-colonies contain all the beneficial differentiated types, will be selected for.
One possible mechanism a differentiated colony may have used to perpetuate itself would have been to control the construction of its proto-colonies and regulate the types of cells comprising them. As they evolved, these processes would begin to insure the appropriate mix of cellular variants, providing a means for strengthening the association between the various specializations. The work of a number of authors [e.g., Anderson 70, Reanney 74, Jeon and Danielli 71, Temin 76, Steele 81] suggests a second, more powerful mechanism by which colonies may have insured the presence of the symbiotic collection of differentiated types in their offspring colonies. Work with the various means of DNA transfer akin to viral transduction has led some researchers to believe that these mechanisms have been a major evolutionary force, allowing the intercellular exchange of genetic information. The presence of such mechanisms gives replicating molecules yet another avenue for strengthening advantageous associations. Given that transduction allows the transfer of informational molecule fragments from one specialized cell to another, cells can arise in colonies which contain the "blueprints" for several different specialized types. Such cells, set off on their own, would then be capable, by mitotic division, of producing each of the specialized cell types encoded within them. Emergent mechanisms to regulate the differentiation process would be selected for in proportion to their ability to insure appropriate development.

The interaction of the two mechanisms -- construction of mixed proto-colonies and integration of differentiated types -- gives us a plausible set of selectable associations with which to develop colonial organisms with some level of cell specialization. In organisms like these, each cell in the colony is a potential germ-line replicator. Variations arise as the colony grows, and those which prove advantageous are preserved and have the potential to generate more copies of themselves in "offspring" colonies.

But what of those organisms which arise from a single cell? According to the Weismann doctrine, the somatic gene line is completely separated from the germ line. Clearly, through the process of development and maturation, some clonal selection will occur, as variants of the original type either flourish or fail. An organism composed of thousands, or millions, even trillions of cells, has many encounters with the environment in which some cell combinations succeed where others fail. But according to accepted biological theory, none of the genetic information stored in the somatic cells can be propagated. While this "somatic information" affects the survival chances of the organism, if the germ-line genes cannot be altered to incorporate the genetic modifications stored in the somatic line, the offspring of that organism will be left unaltered by the information gleaned from its parent's somatic selection process.

An alternative view has been provided by Steele [81]. In recent work with immunological inheritance, Steele has identified a mechanism whereby information gained through clonal selection can be transmitted via C-type RNA viruses to germ-line cells, where it is incorporated into DNA via reverse transcription [Temin 76]. The idea is similar in character to the selection of cells in a colony. Each division potentially produces a variant type which finds itself in contact with other cells and the external environment.
Those complexes which work well together survive best, replicate most, and have the greatest chance of being preserved by transduction into gamete cells. Many mechanisms are missing in Steele's theory, and his ideas are not without their opponents; nevertheless, this new area of research provides yet another mechanism for increasing the strength of association between cooperative processes.

2.3 Distributed Adaptive Self Organization

In the previous three sections, we have reviewed several different natural transitions which have resulted in the generation of cooperative, distributed wholes from originally independent distributed units:

(1) the emergence of long nucleotide chains from associations between independent but mutually compatible, naturally occurring informational molecule fragments;

(2) the emergence of integrated gene complexes capable of complex, multistage chemical reactions from associations between independent, spatially distributed genes;

(3) the emergence of hierarchically structured eukaryotic cells from associations between originally independent, spatially distributed prokaryotes; and

(4) the emergence of simple multicellular organisms from associations between independent, spatially distributed cells.

The physical mechanisms involved in each transition were of course very different from one another: the first involved certain properties of chemical attraction; the second, transduction between neighboring entities; the third, the intracellular mechanisms which establish and promote endosymbionts; the last, a combination of physical cohesion and transduction. The selection pressures which drove each level were also varied: explicit chemical stability in the first case, level of adaptedness of encoded enzyme complexes in the second, and organismic viability in the third and fourth cases. Yet despite the differences in the entities manipulated and the problems encountered, there is a common thread in each transition -- the identification, selection and strengthening of symbiotic associations.

2.3.1 The Common Thread

At any point in time, the fundamental units in each of the four examples can be viewed as distributed processes, each of which interacts with a portion of the environment -- its context -- in a characteristic way, based on encoded information which controls the process. The units are interchangeable in that the scheme for decoding this internal program is the same for each of the units. The processes continually take in information from their contexts, perform internal operations, and alter those contexts by producing output information. Each process is of limited complexity and capability, and the nature of each unit's performance is similar.

In each of the natural transitions described in the previous section, the set of active processes is dynamic; processes cease functioning and are removed from the set, others are initiated and added to it. A process's longevity, the length of time it remains active, is a stochastic function whose mean increases with the process's relative adaptedness to its environment. While active, each process is periodically capable of initiating a new process, in a new context, controlled by an internal program which is a more or less exact replica of itself. The rate at which a process spawns "offspring" processes is also a stochastic function whose mean increases with adaptedness. Short of spontaneous emergence, errors in the copying process provide the only source for potentially adaptive variation in these systems. (A minimal sketch of this process abstraction appears below.)
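To make the abstraction concrete, the following sketch models longevity and fecundity as stochastic functions of adaptedness, with imperfect copying as the only source of variation. The particular survival and reproduction probabilities, and the bit-string program representation, are illustrative assumptions, not quantities specified by the theory:

    import random

    def step(process, fitness, base_death=0.5, base_birth=0.5):
        """One time step for an active process.

        Mean longevity (1 / death probability) and mean fecundity
        (per-step birth probability) both rise with adaptedness.
        Returns (survives, offspring_or_None).
        """
        f = fitness(process)                        # relative adaptedness >= 0
        p_death = base_death / (1.0 + f)            # better adapted -> lives longer
        p_birth = base_birth * f / (1.0 + f)        # better adapted -> spawns more
        if random.random() < p_death:
            return False, None
        offspring = copy_with_variation(process) if random.random() < p_birth else None
        return True, offspring

    def copy_with_variation(process, p_error=0.01):
        """Imperfect copying: each program symbol may be miscopied."""
        return "".join(random.choice("01") if random.random() < p_error else bit
                       for bit in process)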
Make the copying process error-free and there is no avenue for change; if it is fraught with errors, though, the system loses its capacity to "remember" adaptive behaviors. But how, if fidelity is a selection pressure, can any mechanisms exist to introduce variation into the copying processes? It is easy enough to account for the persistence of point mutation errors, introduced by entropy; even though fidelity achieves selective advantage, systems made of physical components may be unperfectable -- but random variation does not make for much of a search algorithm. Adaptation would proceed far more quickly if means could be developed to concentrate variation in particularly promising areas. But, once again, if fidelity is a fundamental selection pressure, how can mechanisms which introduce specific kinds of variation persist?

Despite the seeming contradiction, such mechanisms do exist. Sex certainly does, and a number of authors [e.g., Waddington 61, Mayr 70] have claimed the existence of other genetically controlled mechanisms which permit certain kinds of change and prevent others. These mechanisms act to prevent changes in proportion not to their ease of prevention, but to their tendency to produce non-viable variants. If fidelity is the test, such mechanisms should not become fixed in a population. The question is a complex one, but at least part of the answer can be gained by analyzing the question from the point of view of Dawkins' [76] "replicator", the basic unit of replication (in the current case a process). When the environment provides outlets for phenotypic expression aside from replication, even though a set of gene-controlled processes interact so closely that they reproduce as a unit, selection for fidelity will act primarily at the lowest level, on the individual processes. The complex will receive its primary selection pressure from the effects its phenotype has on the complex's longevity and fecundity. A process, or set of processes, that is capable of influencing the variability of the other processes which its offspring are in association with will influence the phenotype of the offspring complexes. If this variation process increases offspring survivability, the "change regulating" process will have improved the longevity and fecundity of its kind, and its gene will persist.

In our description, then, we will not characterize the copying fidelity of processes in the same way as their longevities and fecundities. Instead, we will say that the copies produced are subject to some source of variation, be it random or adaptive. Thus, as processes spawn other processes, at certain stages variations will arise, diverging in input/output behavior from the original, eventually producing processes which can be called novel.

One of the most important characteristics common to each of our example systems is that the individual processes in the system are not isolated from one another. When the contexts of two processes overlap, those processes may interact directly -- some of the outputs of one process acting as inputs to the other, or else the same input acting on both processes. Even if their contexts are disjoint, two processes may interact indirectly, through an intermediary process which interacts directly with both of them. These interactions change the environments of the processes involved, and thus may affect their performance -- their relative adaptedness -- for better or worse.
According to the ideas we presented in section 2.2, coordination between originally independent processes has arisen as the gradual result of the selection of stable interactions, or associations. Margulis [81] describes these associations as follows:

    Stable symbiotic partnerships are more fit than the partners are individually, in the sense that they leave more offspring when they are together than when they are apart. The genes of symbiotic partners are in close proximity; natural selection acts upon them as a unit.

At the core of each symbiotic relationship we have discussed are true replicators, the molecule fragments which encode context-dependent information and increase their frequency in the overall gene pool by associating themselves with other replicators. Selection operates on the phenotypic expression of these associations in particular contexts, furthering and strengthening those associations that produce effects which confer advantage.

Associations at different levels of organization are maintained by different physical mechanisms. It makes little difference to the replicating molecule whether its association with its symbiont genes is secured via the phosphate-sugar backbone of a single DNA strand, the centromere of a chromosome, the nuclear membrane of a eukaryotic cell, the plasma membrane of a cell containing endosymbionts, the cohesion of cells in a colony, or any other means. What does matter is the strength of the association and its selective value with respect to local environmental conditions.

Selection of association acts in two ways. First, the frequency of association between partner processes tends to increase with the reproductive advantage conferred by the partnership. Over time, the probability that a newly generated process of type A will share a context and interact with some process of type B increases to the extent that type A and type B processes form stable associations. Second, the strength of association between symbionts also tends to increase with reproductive advantage. (All associations are not equal. The strength of association between processes varies from non-interaction at one extreme to fusion at the other.) In each of the situations examined, gradations in integration between partners exist, and those symbionts which persist over longer periods of time tend to become increasingly interdependent [Margulis 81]. For a given process type in association, as the strength of symbiosis with the partner type increases, the interaction with the partner type becomes an increasingly significant portion of the given type's environment. As copies of the given type are generated, those variants which improve performance by increasing adaptation to the properties of the symbiont type will gain survival advantage, furthering the integration process.

To summarize, then, the apparent mechanism which developed coordinated distributed systems from originally uncoordinated units in our examples involved the emergence, selection and development of symbiotic associations. Each system is composed of a dynamic set of processes; the processes have limited "lifespans" during which they interact with portions of the environment and periodically initiate other processes. These offspring processes are replicas of their parent process, but are subject to some form of variation. When advantageous associations arise between processes, those associations increase in frequency and strength so long as they prove viable.
As two types become more and more intimately associated, selection begins to favor those variants which form the most integrated complexes. If the system environment is such that composite processes can confer survival advantage on their constituent sub-processes, the processes in the system become increasingly integrated. Given a sufficient source of variation and adequate resources to maintain the process "population", this adaptation process should be able to continue so long as increased performance in the environment is possible.

2.3.2 A Computational Theory

From the preceding observations, we will attempt to construct a more formal statement of the conditions sufficient to induce the phenomenon of distributed self-organization, which will serve as our computational theory.

Let E be an information environment specifying an information space U and a performance function F_E. Let P = {p_1, ..., p_i, ...} be a set of programs which can manipulate information in U. An active process a is then defined as an ordered triple (p_a, I_a, O_a), where p_a in P is the program controlling the process, I_a ⊆ U is the input space of a, and O_a ⊆ U is the output space of a. The performance of a process a under its environment is given by the positive real-valued function F_E(a). If F_E(a) > F_E(a'), a is better adapted to E than a'.

Processes remain active for a finite span of time, during which they periodically spawn processes a' = (p_a', I_a', O_a'). The offspring process program, p_a', is produced by copying the parent program, p_a, but we assume that the copying process is subject to adaptive variation.

A distributed information system, S, is then defined as the ordered triple (E, P, A), where E is the system environment, P is the set of possible programs in the system, and A is a dynamic set which, at any time t, is A(t) = {a_1, a_2, ..., a_k(t)}, the set of processes active in the system at t. The processes in A(t) can be partitioned into m classes, C_1, C_2, ..., C_m, 1 ≤ m ≤ k(t), based on their controlling program, such that:

    for all a in C_i and a' in C_j:  p_a = p_a'  if and only if  i = j.

We expand the domain of F_E to include sets A' ⊆ A as follows:

    F_E(A') = ( Σ_{a in A'} F_E(a) ) / |A'|.

Define Interact(a, a') as a boolean-valued function which is true if and only if either:

1) (I_a ∩ I_a') ≠ ∅, or (O_a ∩ I_a') ≠ ∅, or (O_a' ∩ I_a) ≠ ∅; or

2) there exists a'' such that Interact(a, a'') and Interact(a'', a').

We say a process a is associated with process a' if a's interaction with a' improves a's fitness. To capture this idea, we define another boolean function Assoc(a, a') as follows. Let C_{i|j} = { a in C_i | there exists a' in C_j such that Interact(a, a') }. Then:

    Assoc(a, a')  if and only if  Interact(a, a') and F_E(C_{i|j}) > F_E(C_i - C_{i|j}).

We expand the domain of Assoc to include sets A', A'' ⊆ A as follows. Let A'|A'' = { a' in A' | there exists a'' in A'' such that Interact(a', a'') }; then:

    Assoc(A', A'')  if and only if  A'|A'' ≠ ∅ and F_E(A'|A'') > F_E(A' - A'|A'').

Every association is assumed to have a strength which reflects its probability of persistence, i.e., the probability that a and a' will remain associated while active. We model this as the real-valued function:

    Strength(a, a') = 0, if not Assoc(a, a');
    Strength(a, a') = k, otherwise,

where k in (0, 1) denotes the strength of the association. We are now in a position to define the level of system coordination in a distributed information system, LC(S), in terms of the strength of the associations between its component processes. (A sketch of the predicates defined so far is given below.)
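As an illustration only, the following sketch realizes Interact, Assoc and Strength over a toy population in which a process is a tuple (program_id, inputs, outputs) with finite input and output sets. The representation, the one-hop truncation of the transitive closure, and the fixed strength constant k are assumptions of the example, not part of the theory:

    def interact_direct(a, b):
        """Ia ∩ Ib ≠ ∅, or Oa ∩ Ib ≠ ∅, or Ob ∩ Ia ≠ ∅."""
        return bool((a[1] & b[1]) or (a[2] & b[1]) or (b[2] & a[1]))

    def interact(a, b, population):
        """Direct interaction, or indirect via an intermediary
        (transitive closure truncated to one hop for brevity)."""
        if interact_direct(a, b):
            return True
        return any(interact_direct(a, c) and interact_direct(c, b)
                   for c in population if c not in (a, b))

    def mean_fitness(procs, fe):
        return sum(fe(p) for p in procs) / len(procs) if procs else 0.0

    def class_given(ci, cj, population):
        """C_{i|j}: members of Ci that interact with some member of Cj."""
        return [a for a in ci if any(interact(a, b, population) for b in cj)]

    def assoc(ci, cj, population, fe):
        """Assoc at the class level: the interacting subset of Ci
        outperforms the rest of Ci."""
        ci_j = class_given(ci, cj, population)
        rest = [a for a in ci if a not in ci_j]
        return bool(ci_j) and mean_fitness(ci_j, fe) > mean_fitness(rest, fe)

    def strength(a, b, ci, cj, population, fe, k=0.5):
        """0 if not associated; otherwise a constant k in (0, 1)
        (k is fixed here purely for illustration)."""
        return k if (interact(a, b, population)
                     and assoc(ci, cj, population, fe)) else 0.0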
First, let FI(i, j) denote the frequency of interaction between processes in class C_i and processes in class C_j. Let SA(i, j) denote the strength of association between processes in class C_i and processes in class C_j, with Assoc counted as 1 when true and 0 when false:

    SA(i, j) = 0, if Σ_{(a, a') in C_i × C_j} Assoc(a, a') = 0;

    SA(i, j) = Σ_{(a, a') in C_i × C_j} Strength(a, a') / Σ_{(a, a') in C_i × C_j} Assoc(a, a'), otherwise.

And let AA(i, j) denote the advantage of association conferred on processes in class C_i by their association with processes of class C_j, defined as:

    AA(i, j) = [ F_E(C_{i|j}) - F_E(C_i - C_{i|j}) ] / F_E(C_i).

The level of coordination of class i processes with class j processes, LC(i, j), is given by:

    LC(i, j) = FI(i, j) * SA(i, j) * AA(i, j).

The level of coordination of a system S is then:

    LC(S) = Σ_{i=1...m} Σ_{j=1...m} LC(i, j) / m².

Having identified the relevant variables, we are now ready to set out the condition which our observations indicate will produce increases in the level of coordination of a system S. If processes are spawned such that, whenever F_E(C_{i|j}) > F_E(C_i):

    d/dt FI(i, j) > 0 and d/dt SA(i, j) > 0,

then:

    d/dt LC(S) > 0.

2.4 Summary

In this chapter we have attempted, by observation of the origins of several cases of distributed process integration in biological systems, to develop a computational theory of distributed adaptive self-organization to serve as a guide for the design of algorithms which develop artificial distributed adaptive systems. In sections 2.1-2.3 we examined the development of four different instances of the self-organization of distributed information processors. By abstracting the hardware-independent features of each system, we identified the process of selection of association as a crucial mechanism in the automatic coordination of originally independent units. In the last section, we attempted to formalize the ideas expressed in the other sections.

CHAPTER III

A DISTRIBUTED GENETIC ADAPTIVE SYSTEM

3.0 Introduction

In this chapter we will discuss an algorithm which implements some of the ideas developed in the previous chapter. The model presented here, the DGAS, uses a distributed genetic algorithm to manipulate independent, computationally limited processes which interact with one another in a common environment. The purpose of the genetic algorithm in this context is twofold. First, it is used to provide the source of directed variation which the computational theory suggests will increase the rate of individual process adaptation. Second, when certain mate selection and offspring placement strategies are incorporated into the model, the distributed genetic procedure provides a mechanism for identifying and strengthening beneficial associations between processes operating in a common problem domain.

3.1 The Basic Model

In this idealized model of the natural genetic system, the task environment is defined in terms of a space, visualized as a two-dimensional integer grid of infinite extent, much like the standard setting for cellular automata [Burks 70]. Within this grid, environmental conditions are defined in a local way; i.e., conditions exist in regions of the space and vary with time and/or the action of the living systems in the region. Living systems are modeled as abstract processes, each one controlled by a finite, encoded program, which operate at some location in the grid and interact with the environmental conditions present in some region surrounding them. (A minimal sketch of such a grid environment is given below.)
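A minimal sketch of such a grid environment, assuming (for illustration) a sparse dictionary of conditions keyed by integer coordinates; the class and method names, and the single scalar condition per cell, are conveniences of the example rather than part of the model:

    from collections import defaultdict

    class GridEnvironment:
        """Unbounded 2-D integer grid with locally varying conditions."""

        def __init__(self):
            self.conditions = defaultdict(int)   # (x, y) -> local condition value

        def region(self, x, y, radius=1):
            """Conditions in the square region surrounding a process at (x, y)."""
            return {(i, j): self.conditions[(i, j)]
                    for i in range(x - radius, x + radius + 1)
                    for j in range(y - radius, y + radius + 1)}

        def alter(self, x, y, delta):
            """Processes act on the environment by changing local conditions."""
            self.conditions[(x, y)] += delta

    env = GridEnvironment()
    env.alter(0, 0, 5)           # some condition established at the origin
    print(env.region(0, 0))      # a process at (0, 0) senses its surroundings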
Each process is capable of:

1) detecting the conditions present in its immediate external environment;

2) detecting the conditions present in its internal environment;

3) reacting to sets of external and internal states by either establishing some internal condition or altering the conditions present in its immediate external environment.

Each process's performance with respect to the environment is evaluated in terms of a positive real value called its strength; in the DGAS, strength is the "currency" of fitness. A process begins with a certain initial strength, and its value then changes with time as the process interacts with its environment. It is intended as an analog to energy in the natural system, in that it provides the basis for selection. Strength is used in the DGAS to regulate both the longevity and fecundity of processes. Each process expends a certain amount of its strength periodically in order to remain active. Since a process can only increase its strength by obtaining rewards from the environment, if a process's rate of reward remains below the "price" of remaining active for a sufficient length of time, its strength falls below the activity threshold and it is terminated and removed from the grid. If, on the other hand, its rate of reward exceeds the price of remaining active, its strength will grow until it reaches the somewhat higher reproduction threshold, and it spawns an offspring process; a portion of the offspring's initial strength is subtracted from the parent's, restricting the rate of individual reproduction. Thus individual longevity and fecundity are both dependent on relative adaptedness to the environment. The strength of a process is not a sufficient measure of its fitness, however, since strength is periodically reduced discontinuously, not in response to decreased performance, but as a result of reproducing. Thus a better measure of the individual fitness of a process a is the rate of increase of strength, F_E(a) = d/dt S(a), where S(a) denotes a's strength.

Functional interaction between processes is mediated by the environmental grid. In the DGAS, processes which are proximate to one another on the grid experience similar environmental conditions and can simultaneously affect their mutual environment. Since the fitness of a given process is a function of its performance within its immediate environment, the actions of processes in the same region as the given process can affect its fitness, either positively or negatively.

3.2 Composite Programs -- Composite Processes

According to our observations, in order for distributed self-organization to occur in a system, that system must allow for the formation of stable associations between processes. In the DGAS, the mechanism used emerges from consideration of multicellular organisms, whose behavior emerges from the interplay of large numbers of interacting, spatially distributed processes. During ontogeny, a genetic program, encoded in nuclear and cytoplasmic DNA, replicates itself, more or less exactly, trillions of times. Despite the similarity between the programs controlling them, however, the cells containing these replicas of the original program do not all perform alike. Instead, they differentiate into many types, each of which performs a specialized function. None of the individual units operates on the scale of the organism, yet together the operations which these differentiated cells perform implicitly coordinate to produce its behavior.
This process of differentiation is controlled, at least in large part, by the action and organization of the original program. So multicellular organisms can be viewed as composite processes, driven by multiple copies of a single composite program which contains the information required to control each process in the organism. The composite program contains a developmental and functional plan for an entire system. By replicating itself many times and placing the offspring in appropriate environments, a single composite program produces a set of independent, functionally interacting processes which operate in a common context toward a common goal. Furthermore, all of the information about the system is located in each component, simplifying communication of the group strategy. The complete expression of a composite program is a holonic process hierarchy [Koestler 76] whose structure is characteristic of living systems.

The composite program is the principal method of stable process association in the DGAS. Using the mechanisms described in section 3.3, the DGAS forms composites from specialized programs which control independent processes that have exhibited symbiotic behavior. Two programs, each of which produces a distinct specialized behavior, are consolidated into a single program capable of producing either behavior. Which of the encoded behaviors a process driven by a composite program will exhibit is determined by the internal and external environmental conditions in which it is placed. The thrust of the DGAS is to develop a single composite program which, when replicated, produces and controls a set of differentiated processes which cooperate toward a common goal.

3.3 Process Initiation

In the DGAS, successful processes are in some sense those which produce offspring processes. From Chapter 2, offspring processes serve two purposes in the context of distributed adaptive self-organization. First, they are the principal source of adaptive variation. Second, how offspring are produced, and in what context they begin their operation, has a great deal to do with whether or not association information will affect population development. According to the theory, if our model is to generate a set of coordinated processes, contexts for offspring must be chosen so that the probability and strength of association with complementary processes increases with the benefit conferred by the association. In the natural system, where association is primarily a spatial phenomenon and the mechanisms for increasing the frequency and strength of association usually operate at close range (e.g., cohesion, transformation, viral transduction), this is accomplished implicitly by spatially biasing offspring placement. Since interaction is also spatial in the DGAS, a technique modeled on the natural system is used. In the following sections we will discuss how the distributed genetic procedure of the DGAS acts as the source of adaptive variation as well as the primary agent of selection of association.

3.3.1 The distributed genetic algorithm

The algorithm used to control process initiation in the DGAS is a new variant of the genetic algorithm first described by Holland [76]. As used here, "genetic algorithm" refers to the class of algorithms emanating from work done in the late 1960's and early 1970's by some members of the Logic of Computers Group under Holland's direction at the University of Michigan.
These early explorations into incorporating models of the genetic mechanism as the core of a function optimization algorithm came together to form the basis for Holland's book "Adaptation in Natural and Artificial Systems", which has become the theoretical cornerstone of genetic algorithm research. The genetic algorithm is a high-level abstraction of natural evolution which ignores most levels of physical detail in order to illuminate more clearly the transfer of information inherent in the evolutionary mechanism. Holland's approach to describing evolution as a biological information system is an example of an information processing problem approached in the Marrian style. The theory underlying the genetic algorithm allows the problem domain to be described in abstract terms, identifies the necessary component computations which drive evolving systems, and explains the information processing functions that they perform. Properly interpreted, elements of this theory apply to any process, natural or artificial, which directs an evolving system.

Since 1975, a number of researchers have explored several important variations of the basic reproductive plan, applying them in a number of different contexts. Originally developed as function optimizers [Cavicchio 70, Hollstein 71, Frantz 72], genetic algorithms have since been successfully applied to a variety of learning problems: in game playing contexts [Smith 80, Fujiki 87], as the learning component of artificial animals [Holland 78, Wilson 85], and as the adaptive kernel of the Classifier System [Holland 86b, Burks 86], a rule-based adaptive scheme which has been successfully applied in several different problem domains. These variations are the result of changes in the organization of the algorithm, extensions or modifications of the genetic operators, concessions to problem-specific factors, or enhancements designed to overcome obstacles of implementation. Some empirical studies of the genetic algorithm have also been performed, the most notable of these being [DeJong 75], [Baker 87], and [Goldberg 87].

3.3.1.1 Michigan and Pitt

In recent years, much of the research in genetic algorithms has followed one of two approaches, which have come to be known in the literature as the Michigan and Pitt approaches [Grefenstette 87], after the locations of their first proponents, Holland and Smith, respectively. Briefly, the genetic algorithms used in the Pitt approach are those similar in organization to the learning component of Smith's LS-1 [80]. LS-1 maintains a population of multiple-rule production programs whose only interaction is reproductive; each program maintains its own message list and operates in isolation. At each iteration of the improvement cycle, each program in the population is tested independently, on several instances of the problem, and evaluated on the basis of its performance in the domain, as well as certain factors related to its internal composition. This evaluation produces a single number for each program which describes its fitness. The programs and their fitnesses are then fed to a genetic algorithm which produces new programs which are offspring of the previous ones by applying a hierarchically organized crossover operator to cross individuals at both the rule and sub-rule level. Repeated application of this cycle of testing and recombination results in a sequence of production systems which show increasing acumen in the problem domain. (A sketch of this cycle is given below.)
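A minimal sketch of the Pitt-style improvement cycle just described. The flat list-of-rules program representation, the single-point crossover, and the fitness-proportionate selection stand in for LS-1's richer hierarchical operators and are assumptions of the example:

    import random

    def crossover(p1, p2):
        """Single-point crossover at the rule level (LS-1 also crosses
        within rules; omitted here for brevity). Programs are lists of
        rules of length >= 2."""
        cut = random.randint(1, min(len(p1), len(p2)) - 1)
        return p1[:cut] + p2[cut:]

    def pitt_generation(population, evaluate, n_trials=5):
        """One Pitt-style cycle: score each whole program in isolation
        over several problem instances, then breed a new population with
        fitness-proportionate parent selection (fitnesses assumed > 0)."""
        fitness = [sum(evaluate(prog) for _ in range(n_trials)) / n_trials
                   for prog in population]
        return [crossover(*random.choices(population, weights=fitness, k=2))
                for _ in range(len(population))]

    # Usage: evaluate() is the domain-specific performance measure.
    # population = pitt_generation(population, evaluate)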
The genetic algorithms used in the Michigan approach are those similar in organization to the ones used by Holland and Reitman [78], Booker [82], and Goldberg [83]. Unlike the Pitt approach, the Michigan method maintains a population of individual rules which interact with one another both functionally and reproductively. Rules compete with one another for the right to post messages on a single global message list, and, through chaining, may increase or decrease the fitness of other rules. The interaction is overseen by some version of Holland's bucket brigade algorithm [Holland 85], which apportions credit among rules that establish the conditions necessary for other rules, which produce successful external responses, to fire. The bucket brigade provides the system with an "inner environment" [Pylyshyn 86], which regulates its internal activity. Within the context of this inner environment, the individual rules in the population interact and are evaluated in concert with one another. At certain intervals, a set of the strongest rules is selected from the population and fed into a genetic procedure which crosses them to produce new rules, which take the place of those rules in the population which have been least productive.

The difference between Pitt and Michigan lies in the character of the structures manipulated and the purpose behind their manipulation. Under the Pitt approach, the individual structures manipulated by the algorithm are of arbitrary complexity; under the Michigan approach, individual complexity is fixed. At Michigan, individual structures interact; at Pitt they remain isolated. The Michigan approach uses its genetic procedure solely to develop individual structures which are adapted to the system's internal environment; overall system integration depends on the effectiveness of the algorithm which creates this internal environment (e.g., the bucket brigade). The Pitt approach, on the other hand, uses its genetic algorithm to develop individual structures that are internally and externally adapted; the genetic procedure puts together units which are internally consistent and perform well as a unit in the external environment.

3.3.1.2 A synthesis

The approach I have taken in the DGAS is a hybrid of Michigan's and Pitt's. Like the Michigan systems, the individual structures in the DGAS interact with one another in a common environment, affecting one another's fitnesses. Like the Pitt systems, the individual structures manipulated are of arbitrary complexity, and the genetic algorithm is used to provide both individual adaptation and internal coordination.

The distributed genetic algorithm used in the DGAS is one of the few which has been classified as parallel [Sannier 87]. In its most general form it is a truly distributed, massively parallel implementation, where each individual is responsible for its own reproduction. The algorithm manipulates processes which can be modeled as composite data structures; associated with process a are the two items:

1) a.program, which represents a's controlling program, and

2) a.strength, which represents a's current strength,

as well as a number of problem-dependent fields. The system-wide control algorithm which initiates and oversees the system processes is given more formally below:

    main ()
    {
        cobegin ( for all processes a in A(0) )
        {
            a.strength = S0;
            active (a);
        }
        oversee ();
        wait (termination);
    }

All processes a in the initial set of processes A(0) are launched concurrently, each with an initial strength of S0.
Oversee() is a problem-dependent global procedure which periodically checks the performance of the system as a whole, updates certain global values and determines if the system is sufficiently evolved to terminate. Each active process behaves as follows:

    active (process)
    {
        while (not terminated)
        {
            execute (process.program, process.strength);
            check_thresholds ( Thr, Cr, Pd );
            if ( process.strength > Thr )
            {
                offspring_process = reproduce (process);
                report ();
                offspring_process.strength = S0;
                process.strength -= Cr;
                active ( offspring_process );
            }
            else if ( process.strength < Tht or (with probability Pd) )
                set_event (terminated);
        }
        report_termination (process);
    }

Until it is terminated, each process executes its program, interacting with its environment for a fixed period of time, performing the actions which deplete the process's strength and receiving rewards which increase it. After each execution phase, a read-only global data area is checked to determine the current threshold values: Thr, the process's reproductive threshold; Pd, the probability of death; and Cr, the cost of reproduction. If, after an execution phase, a process's strength is less than the termination threshold, Tht, the process terminates, reporting its termination to the overall system. If, instead, a process's strength is greater than the reproduction threshold, the process produces an offspring process and activates it in a chosen context with initial strength S0; the act of reproduction decreases the parent's strength by an amount Cr. The method of reproduction is outlined below:

    reproduce ( process )
    {
        offspring_process.program = process.program;
        begin ( with probability Pc )
        {
            mate_process = mate_select (process);
            offspring_process.program = crossover (offspring_process.program,
                                                   mate_process.program);
        }
        else begin ( with probability Pi )
            offspring_process.program = invert (offspring_process.program);
        else begin ( with probability Pm )
            offspring_process.program = mutate (offspring_process.program);
        set_context (offspring_process);
        return offspring_process;
    }

An offspring is driven by a program which is either an exact replica of the parent, a cross between the parent and a "mate" program chosen by the mate_select strategy, an inverted version of the parent, or a mutated version of the parent. The crossover, inversion and mutation operators used in the DGAS are taken from the LS-1 system [Smith 80]. In the DGAS, as in most genetic algorithms, mutation and inversion are "background operators", which have a relatively small probability of occurrence. Their primary purpose is to serve as a source of "raw variation". For a more complete description the interested reader should consult [Holland 76] or [Holland 86a]. The program set is assumed to be encoded such that Smith's hyperplane analysis holds [Smith 80].

The primary search operator in the DGAS is crossover. In order to be properly considered in the class of genetic algorithms [DeJong 87], an algorithm must satisfy the so-called "schema theorems" (principally Lemmas 6.2.1, 6.2.2 and 6.2.3 in Holland [76]). These theorems establish the implicitly parallel search action of the crossover operator and give the genetic algorithm its adaptive power. In order for these results to apply to a given reproductive plan which utilizes a crossover operator, the operator must satisfy certain conditions (Lemmas 6.2.1 and 6.2.2) and the plan must allocate trials to individuals based on their "usefulness", defined as the ratio of their performance to mean population performance.
Since the crossover operator we use has been verified as standard by Smith [80], it remains to be shown whether or not the reproductive plan specified above maintains the sampling criterion required by these Lemmas, i.e., that the expected number of offspring M(a) of a given process a be proportional to what Holland [75] terms its "usefulness", given by F_E(a) / F_E(A).

Theorem: Consider the set of viable programs in a population A, denoted by Av = { a in A | F_E(a) > 0 }. Given the reproductive plan specified above, if Cr = k * F_E(Av) and Pd = 1/k, k in (1, ∞), the expected number of offspring for a given program a in Av, M(a), is given by F_E(a) / F_E(Av).

Proof: The expected lifespan of a viable process a, whose mean rate of strength increase d/dt S(a) > 0, is simply the inverse of its probability of "death" in a given time step. Since in order to produce an offspring a process must increase its strength by Cr units, the expected number of offspring of a is given by:

    M(a) = d/dt S(a) * (1/Cr) * (1/Pd),

where d/dt S(a) is the rate of increase of strength of process a and 1/Pd is a's expected lifespan. Since F_E(a) = d/dt S(a), substituting the conditions gives:

    M(a) = F_E(a) * 1/( k * F_E(Av) ) * k = F_E(a) / F_E(Av).  □

So, by the theorem, the viable programs in the population reproduce "genetically" whenever the reproductive cost is expressed as a positive multiple of the average fitness of the viable population, the multiple associated with the probability of individual "death" in any given time step. The interpretation of the constant is natural. So long as k is greater than 1, Pd is a valid probability and Cr, the strength required to generate an offspring process, is a positive multiple of the mean increase in strength obtained by reproducing individuals during each execution phase. So the average individual can be expected to produce a single offspring during its lifetime; individuals with greater than average fitness can be expected to produce more than one offspring, while those with less than average fitness can expect less than one offspring.

Since the algorithm is genetic, it follows that when operating on properly encoded structures, it will provide a good source of adaptive variation, even under the influence of time-varying, stochastic environmental feedback [Holland 75]. But, in addition to its properties as an adaptive search algorithm, the plan proposed above, when embedded in the DGAS, has an added dimension. When appropriate mate selection and offspring placement strategies are used (mate_select and set_context), the resulting algorithm acts to implicitly identify symbiotic associations and increase their frequency and strength in proportion to their selective advantage.

3.3.2 Structural and Spatial Coherence

Standard implementations of the genetic algorithm pay almost no attention to the selection of crossover mates and offspring contexts. Mates are selected at random, sometimes biased by relative fitness. As for the selection of offspring context, under both the Michigan and Pitt approaches there is no choice; all individuals have identical contexts. Since the DGAS's environment is defined in terms of a space with locally varying conditions, however, more flexibility is possible in mate selection and offspring placement strategies. As experiments with Asgard will show, the selection of appropriate strategies determines the power of the system to identify, promote and select associations.
Motivated by the importance of proximity to interactions between processes in biological systems, I was led to introduce a spatial bias to the mate selection and offspring placement strategies. In place of random selection, mates in the DGAS are chosen so that the probability of an offspring emerging from a union of two parent processes is inversely related to the distance between them. Similarly, when an offspring is placed in the grid, the probability that that offspring is found a distance x from its parents decreases inversely with the magnitude of x. These spatial dependencies are important factors in promoting the formation of composite processes.

In the spatially distributed populations of the DGAS, two kinds of process groups emerge from the action of the genetic operators and regional variations in environmental conditions. These groups derive their coherence from different properties, and each serves a different function. The first of these -- what I have called the structurally coherent groups -- contain processes driven by programs which are the same or very nearly so. Typically, the members of a structurally coherent group will exhibit the same kinds of behaviors in response to similar environmental pressures. In sufficiently large, unevenly distributed populations, we can expect a number of different structurally coherent groups -- corresponding to a variety of different survival strategies -- to develop. According to a study by Grosso [85], if the individual processes in an unevenly distributed population are computationally limited, in the sense that they are incapable of responding, simultaneously, to all the demands and regularities present in their environment, a phenomenon he refers to as functional specialization will occur. Evolving populations of processes subject to different selection pressures begin to pursue different survival strategies and, after a time, are separable into distinct groups based on their patterns of behavior and the structure of the programs which encode for this behavior. In a different analysis of evolving systems, Katriss's model [81] predicts that the variation and adaptation of these strategies will continue until further development is infeasible; in the DGAS this means, among other things, that individual strategies will steadily evolve to meet their computational limits. When processes develop to this point, they become mutually exclusive, in the following sense. Given that processes are computationally limited, two different specialized strategies which push those limits are mutually exclusive in that a single process cannot adequately perform the two strategies simultaneously.

The second kind of group arises as a consequence of the spatial dependencies built into the model. In the DGAS, collections of processes which occupy particular regions of the environmental grid form what I have called spatially coherent groups, or subpopulations. Almost all interprocess interaction takes place within these groups; not only do the individuals within them share a common context, but when mate selection and offspring placement are spatially biased, reproductive activity is also concentrated within them. If the population density is sufficiently low, individuals will tend to cluster in spatially coherent groups, particularly when the environment is "patchy" -- certain regions of the environment are more hospitable than others. (A sketch of the distance-biased strategies appears below.)
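A minimal sketch of the spatially biased mate selection and offspring placement described above. The inverse-distance weighting and the Manhattan metric are assumptions of the example -- the model specifies only that probability falls off inversely with distance:

    import random

    def distance(a, b):
        """Manhattan distance on the grid (choice of metric is assumed)."""
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    def mate_select(parent_pos, candidates):
        """Choose a mate with probability inversely related to distance."""
        weights = [1.0 / (1 + distance(parent_pos, pos)) for pos in candidates]
        return random.choices(candidates, weights=weights, k=1)[0]

    def set_context(parent_pos, max_radius=8):
        """Place an offspring near its parent: the displacement magnitude x
        is drawn with probability proportional to 1/x."""
        radii = range(1, max_radius + 1)
        x = random.choices(radii, weights=[1.0 / d for d in radii])[0]
        dx = random.choice([-1, 1]) * random.randint(0, x)
        dy = random.choice([-1, 1]) * (x - abs(dx))   # total displacement = x
        return (parent_pos[0] + dx, parent_pos[1] + dy)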
The individuals within such "spatial niches" [Perry 84] will form semi-isolated subpopulations whose members interact both functionally and reproductively, and tend to place their offspring back in the niche with high probability. Since the members of spatially coherent groups interact functionally, if a niche is inhabited by individuals from different structurally coherent groups, the potential for symbioses arising between the different types exists. In the next two sections we will discuss how the crossover operator, by manipulating individuals within and between these groups, is able to implicitly increase the frequency and strength of interaction between groups which interact symbiotically.

3.3.3 A Reproductive Continuum

Crossovers of parents within and between the groups described above produce offspring processes of different characters, each of which performs a different function in the DGAS. The reproductive operations induced by the crossover operator take on the character of a continuum, producing different kinds of offspring depending on the degree of structural similarity between the crossing parents. This continuum is depicted in Figure 2, linking two naturally occurring operations, replication and recombination, with a third operator which I refer to as hybridization.

[Figure 2. A Reproductive Continuum -- replication, recombination and hybridization arranged along an axis of parental similarity.]

Admittedly, it may seem odd to assert that any kind of continuum exists between replication and recombination, since so many differences exist between them. Replication, the process by which a process produces an identical copy of itself, seems to have little to do with recombination, the process which produces a new individual by "mixing" two parent processes. In the natural system, they occur at different levels in the biological hierarchy, and the physical processes which implement them are quite distinct. Nevertheless, we argue that, from an information processing perspective, replication and recombination can be considered as points on a logical continuum of reproductive operations by which new processes can be produced from existing processes of varying degrees of similarity. While only certain regions of this continuum are realized in the natural genetic system, we have incorporated all of them in our model.

Replication lies at one extreme of this hypothetical continuum, occurring in the model either from the action of the replication operator or as a crossover between identical parents. The program of the resulting offspring process is an exact replica of the parent process's program. Nevertheless, the behavior of parent and offspring may be completely dissimilar, since in our model a process's actions and responses are dictated not only by its program but by its internal and external context. Replication is most interesting in our model when the program of the parent process is composite. When this is the case, the sections of the program which become active in the offspring may differ substantially from those which are active in the parent, due to differences in their respective contexts. For example, a process whose program contains instructions for controlling two mutually exclusive processes, each one sensitive to a distinct set of internal and external cues, can produce offspring which differentiate into one of two types.
While each process is driven by an exact replica of the parent program, different components of this composite program (prescribing different behaviors) are active in each of the offspring types, due to differences in the internal and external environments in which they are placed. As a result, the two types exhibit two different behaviors.

When crossovers occur between closely related processes, i.e., nonidentical members of the same structurally coherent group, the offspring process will be driven by a program which looks less like a replica of its parents and more like a mix of two similar, but distinct, parents. In place of replication, we get recombination. Those familiar with genetic algorithms will recognize recombination as the standard image of crossover. The programs of two processes, closely related variants of the same structural group, cross to produce a program which shares sections of both parents' code. A process driven by this program will exhibit behaviors characteristic of both parents, as well as new behaviors arising from epistatic interactions between the spliced sections. Recombination, then, is the DGAS's source of adaptive variation. It has been studied extensively by a number of genetic algorithm researchers [Holland 86a, Booker 82, Smith 80], and has been shown, under a general set of conditions, to generate individuals of increasing fitness through an intrinsically parallel search of potential programs.

As the parents become less and less similar and the behaviors they encode for increasingly specialized, crossovers in our model tend toward hybridizations. We define hybridization as a crossover between processes driven by radically different, mutually exclusive programs -- each of which codes for a separate process activated by a distinct set of internal and external cues -- producing a new, composite program which contains instructions for both processes. The formation of composite programs is the DGAS's means for strengthening the association between processes. In hybridization, two parent processes, each of which is driven by a different, specialized program, cross to form a composite program, capable of controlling an offspring process which exhibits either behavior, depending on its external and internal context.

3.3.4 Composite programs

Evolution in the DGAS occurs as an interaction of the operations described above. Each operation serves a different function. Recombinations between processes driven by programs from the same structural group produce variants which explore the space of programs near that strategy. Selection of the most fit results in a continual refinement of the strategy, both with respect to its external environment and its "process environment" -- the effects produced by those processes in its spatial group with which it interacts most frequently. Recombination provides the adaptive variation needed to develop programs adjusted to a continually changing environment and the behavior patterns of associated processes.

The other two operations -- hybridization and replication -- are directly related to the development of symbiotic association. To examine their effects, consider the development of a "spatial niche" which originally contains processes from two different structural groups, C1 and C2. Each group contains a number of different variants of the basic strategy, produced by mutations and inversions of the basic type, as well as the offspring of recombinant crosses.
Let V1 ⊂ C1 and V2 ⊂ C2 denote two variant subclasses whose members interact to mutual advantage. In functional form, given Assoc(C1, C2), with V1 = C1|C2 and V2 = C2|C1:

V1 ≠ ∅ and FE(V1) > FE(C1 - V1), and
V2 ≠ ∅ and FE(V2) > FE(C2 - V2).

Since offspring are allocated on the basis of relative fitness, the variant groups will tend to increase their numbers more quickly than the rest of the group:

d/dt ( |V1| / |C1| ) > 0 and d/dt ( |V2| / |C2| ) > 0

and

d/dt ( |V1| / (|C1| + |C2|) ) > 0 and d/dt ( |V2| / (|C1| + |C2|) ) > 0.

Since offspring placement is spatially biased, most of these offspring will be placed in the same spatial niche. The result is an increase in the variant types and the associations between them, so that

d/dt P(V1, V2) > 0 and d/dt F(V1, V2) > 0,

where P and F denote the proportion and frequency of the association. So an evidenced symbiotic association will tend to increase in proportion and frequency as a result of spatial offspring placement. As we discussed above, some of the offspring produced will be crosses between variants within each of the symbiotic types, and those which act to strengthen the relationship will tend to be retained. Other offspring, however, will be crosses between the two variants, qualifying as hybrid crosses in the continuum described in section 3.3.3. The frequency of hybrid crosses will tend to increase with the increasing proportions of V1 and V2, eventually forming a composite program V1V2, containing the programs for both processes. Given that the two variants follow mutually exclusive strategies, the full expression of the composite is only evidenced by a set of replicate processes, some of which have differentiated as type V1, others as type V2. Again, since offspring placement is spatially biased, the original composite, as well as its composite offspring, will most probably interact. The composite program strengthens the association between the variant strategies, since a single composite process can insure, by replication, the presence of the complementary process pairs. This in turn confers selective advantage, insuring the increase of the composite.

3.4 Summary

To summarize, then: the key to the development of successful composite processes in the DGAS lies in hybridizations that occur within structurally diverse subpopulations between symbiotically related parents. When a spatially coherent group -- composed of processes from more than one structural type, each of which exhibits a different specialized behavior -- persists over time, spatial biases in mate selection and offspring placement increase the frequency of hybrid offspring, generated by parents from different program types within the spatial niche. Those offspring which successfully hybridize mutually beneficial behaviors gain selective advantage. Thus hybridizations which occur between the processes in spatially coherent groups whose members are symbiotically related give rise to composite program types which make structurally explicit the implicit, spatial coherence that exists between the members of such groups. Replications of these composite types produce a set of independent computational units which act as an implicitly coordinated, distributed system.
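Before turning to Asgard, the growth conditions of section 3.3.4 can be illustrated numerically. The toy model below assumes arbitrary fitness values satisfying FE(V1) > FE(C1 - V1) and FE(V2) > FE(C2 - V2); it is not the DGAS, only a demonstration that fitness-proportional offspring allocation drives the variant proportions upward.

# Toy illustration of the growth inequalities: under fitness-proportional
# allocation, advantaged variants V1 and V2 rise in proportion within their
# groups. Fitness values are arbitrary assumptions, not from the dissertation.

pop = {"V1": 10.0, "C1_rest": 90.0, "V2": 10.0, "C2_rest": 90.0}
fitness = {"V1": 1.2, "C1_rest": 1.0, "V2": 1.2, "C2_rest": 1.0}

for t in range(50):
    total = sum(pop[k] * fitness[k] for k in pop)
    size = sum(pop.values())
    # Next generation: expected offspring counts, population size held fixed.
    pop = {k: size * pop[k] * fitness[k] / total for k in pop}

# The proportion |V1| / |C1| has grown from 0.1 toward 1: d/dt(|V1|/|C1|) > 0.
print(round(pop["V1"] / (pop["V1"] + pop["C1_rest"]), 3))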
CHAPTER IV

ASGARD

4.0 Introduction

To investigate whether or not the DGAS would behave as expected, I developed the software system Asgard, which implements the model in a specific problem context. Experiments with Asgard have illustrated some of the properties of the DGAS and verified that it can foster the adaptive self-organization of a set of distributed processes by forming composite programs from the spatially biased interaction of computationally limited adaptive units. The processes in Asgard simulate the behavior and adaptation of a community of simple animals which evolve under the action of a distributed genetic algorithm. As such, it falls in the class of systems which have come to be known as "artificial life simulations". The artificial life domain has been an important one in genetic algorithm research; genetic life simulations have also been developed by Holland and Reitman [78], Booker [82], and Wilson [85]. Unlike these systems, however, the main objective of Asgard is the study of the evolution of a distributed population of interacting processes, rather than the development of a single, functionally isolated individual. In the sections which follow we will describe Asgard's organization and review the results of experimentation.

4.1 System Organization

Asgard's environment is defined in terms of a finite, toroidal, two-dimensional grid, with a resolution of 160 x 60. The grid is divided into 4 equal quadrants, each of which can be displayed on a Tektronix 4105 graphics terminal. Associated with each location in the grid is a non-negative integer value representing the amount of "food" present at that location. "Food" is placed in the grid according to a pre-specified placement function, which depends on the time and location as well as the action of the artificial animals which "inhabit" the grid. These animals occupy locations in the grid, and move about with time in search of food. Associated with each animal is a strength. In order to "stay alive", the animals expend units of their strength. Whenever an animal moves to a location in the grid with a non-zero amount of "food" in it, it consumes all the food in the cell up to its capacity. Each unit of "food" an animal consumes increases its strength by one unit. So in order to survive, an animal's average "food" consumption must be at least one unit per time step. If an animal's rate of consumption is less than one unit per time step, it eventually runs out of energy and "dies"; if its average rate exceeds one unit per time step, its strength increases. When an animal's strength increases by Cr units, it produces an offspring and its strength decreases by Cr.

Each animal can be characterized by the movement patterns it exhibits in the presence of different patterns and densities of "food" in the grid around it. An animal's movement is controlled by its individual program, a binary-encoded circular list of labeled instructions of two different types: Move and Food?. The order of instruction execution is controlled by a program counter that specifies which instruction in the list is to be performed next. Move instructions take one argument which specifies one of eight directions of movement (N, NE, E, SE, etc.). When executed, Move(x) changes the animal's grid location by one unit in the specified direction and advances the program counter to the next instruction in the circular list. Food? instructions take two arguments drawn from the set of legal instruction labels. When a Food? instruction in an animal's program is executed, the eight grid locations which surround the process's current location are checked for the presence of food.
If "food" is present in any of those locations, the program counter skips to the next instruction in the circular list whose label matches the label specified by the first argument; if "food" is not present, the program counter skips to the next statement whose label matches the label specified by the second argument. An example is given in Figure 3, together with a potential movement pattern. Asgard is an almost direct implementation of the DGAS. The animals of Asgard are concurrently executing distributed processes, each one capable of : l)detecting, through the execution of Food? instructions, the conditions in its immediate external environment, in terms of the "food" concentrations which exist there; 2)detecting its internal state, specified by the position of its program counter; and 3)reacting to its external and internal states, by altering the position of its program counter and/or potentially altering its current external environment by executing a series of Move instructions. Each process’s immediate external environment is defined in terms of the concentrations of "food" in the region of the grid which surrounds it. Its internal state is defined by the position of its program counter. A process can modify its external environment in two ways; either by consuming "food" or by moving to a different location in the grid. By executing Food? instructions, the process can detect the state of its external environment and, based on this information, can change its internal state, potentially altering its future external response pattern. 1 Move S Food? 2 l 2 Move E Food? 2 3 3 Move W Food? 2 3 Figure 3. Example Asgard Animal Program 67 Each experiment with the Asgard system begins with a population of processes whose controlling- programs are formed as a fixed length set of instructions drawn according to a random distribution from the set of all legal instructions. The initial "food" conditions are set at each grid location and the processes are started concurrently at randomly chosen grid locations. A global process oversees the system and periodically adds to the amount of "food" contained in the grid according to a pre- specified placement function. The objective is to produce a population which responds to the regularities of this function. The algorithm produces a sequence of populations whose members interact with each other through common local environments and are allocated offspring on the basis of their individual rates of consumption. Over time, the successive populations will contain individuals adapted not only to the conditions of the artificial environment, but also to the behavior patterns of the other members of the population. Let A(t) denote the population at time t and let a be any member of A(t); A(O) denotes the initial, random population. The problem dependent field a.location, representing process a’s current grid location, is added to the fields a.program and a.strength. Asgard’s overall control process is the control algorithm for the DGAS : main () { cobegin ( for all a in A(O) ) I SO; random_grid_location (); a.strength a.location active (a); } oversee ( ): } (The algorithm used in most of the experiments with Asgard differs somewhat fiom the one given above. Experimental trials revealed an undesirable property of the 68 model which necessitated its alteration. For a complete description of this change see section 4.3) ~ The algorithm for each active process is exactly that given for the DGAS (see section 3.3. 1.2), with the value of Th1 = 1. 
The execution of a process program is given as:

execute ( process ) {
    process.strength = process.strength - 1;
    process.location = execute_next_instruction (process.program, process.location);
    meal = Min ( Food (process.location), process_capacity );
    process.strength = process.strength + meal;
}

The cost of instruction execution is extracted from a process's strength and the instruction is then executed, potentially modifying the process's location. Whatever food is present in the process's current location, up to its capacity, is then consumed and so added to process strength. Processes reproduce according to the procedure reproduce given in section 3.3.1.2; mate selection and offspring placement are both spatially biased, so that offspring most often arise from parents in the same region of the grid and are themselves placed in that region.

4.2 The Environment

The task set before the animals of Asgard is to identify and exploit the regularities present in the pre-specified pattern of "food" placement in the presence of other animals. There is a single resource, "food", in limited supply, which each animal requires to live. There is no possibility of substitution, no animal behavior which can "manufacture" it from simpler sources; every unit of "food" consumed by one animal is one more unit made unavailable for the others. It is hard to imagine a more competitive situation.

Even under these intensely competitive conditions, however, certain types of food placement patterns can still foster cooperation between animals. The placement pattern which acted as the learning environment for the experiments we will discuss here is one of these; the amount of "food" placed in the grid is dependent on the consumption behavior of certain spatial groups. An environmental measure, Ei(t), which I call efficiency, determines how much of the potential "food" supply is actually placed in the grid at any given time. Ei(t) varies between 0 and 1, reaching its maximum whenever the individuals of group i cooperate to maintain their aggregate consumption level at the pre-specified optimum. Efficiency is the only environmental variable under the direct influence of the animals in Asgard. As such, it provides us with a mechanism for determining the level of adaptedness -- and the level of association -- of animal groups. The most fit animals in Asgard are those capable of maintaining the highest efficiency ratings. If a particular structural group maintains higher efficiencies when in spatial groups containing individuals from a different structural group than when in spatial groups from which this type is absent, we can then say that the two structural groups are associated.

More specifically, the three basic regularities of the "seasonal placement plan" to which the population can adapt are:

1) "Food" can appear in only 1/8 of the space, concentrated in eight evenly spaced fertile regions. Each animal must periodically visit one of these fertile regions in order to survive.

2) The productive capacity of each fertile region oscillates periodically between a maximum and a minimum value. Each fertile area can support many more animals during its "summer" than its "winter", so competition for "food" within a particular area becomes increasingly fierce as winter approaches.

3) The amount of "food" actually produced during a given time step by a fertile region is linked to the amount of "food" consumed by the animals in the region during the previous time step.
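As a stand-in for the efficiency measure Ei(t) introduced above, the sketch below shows one plausible computation with the stated properties: it peaks at 1 when a group's aggregate consumption sits at the pre-specified optimum and falls off as the group over- or under-consumes. The linear falloff and the scaling of placement by efficiency are assumptions for illustration, not Asgard's actual definitions.

def efficiency(total_consumed, optimum):
    # One plausible E_i(t) in [0, 1]: maximal when the group's aggregate
    # consumption equals the optimum, decreasing linearly with over- or
    # under-consumption. (Assumed form, not the dissertation's.)
    if optimum <= 0:
        return 0.0
    return max(0.0, 1.0 - abs(total_consumed - optimum) / optimum)

def food_placed(M_i, E_i):
    # "Food" actually placed scales the region's seasonal potential M_i(t);
    # the multiplicative coupling is likewise an assumption.
    return M_i * E_i

print(efficiency(100, 100))  # 1.0 -- the group holds the optimum
print(efficiency(150, 100))  # 0.5 -- over-consumption is penalized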
At any given time, an optimal consumption level exists for a region, and the animals in that region can over- or under-consume with respect to this level, as a group, during a given time step, resulting in decreased "food" production in the next time step.

In each of the four quadrants of the grid (see Figure 4), two 10x15 areas, called "farms", are established as fertile. The "farms" are the only areas in the grid where "food" can be found; the rest of the grid is desert-like. Each time step, Ki(t) units of "food" are produced by the i'th farm and distributed randomly within it (i = 1,...,8). Ki(t) is itself a function of two other quantities: Mi(t), the seasonal productive potential of farm i; and Ei(t), the consumption efficiency within farm i, a value derived from the total consumption within farm i during the previous period, Ci(t-1). The exact form of these relationships is as follows:

Mi(t) = B sin( (2πt / Y) + π(-1)^i ) + M0,  B < M0

[...]

...classifiers of the form (Queue length of host > x) AND (Queue length of host < y) → Action are crucial. When the number of conditions in each classifier is fixed, as is usually the case, comparison predicates like the two in the example can only be performed by the coordinated action of several specially designed classifiers. For example, in her study of the computational properties of classifier systems, Forrest [85] devised an algorithm to perform comparisons between integers within the standard paradigm. The algorithm requires three classifiers for each bit in the binary representation of the numbers, and takes three iterations to complete. It is also exceedingly brittle, in the sense that the removal or corruption of a single classifier in the comparison chain most often produces unreliable results. Thus, even if classifiers implementing the algorithm were a part of each scheduler at the outset of the learning process, the likelihood of maintaining them as functioning units throughout the standard adaptation cycle would be small, making the formation of numerical comparison predicates much less likely than the formation of simple predicates. For a scheduler to evolve a comparison algorithm for itself, even if possible, would require much of the adaptive force of the genetic algorithm, thus reducing its effectiveness in the intended application domain.

One solution to this dilemma is to introduce a second level of semantics into the interpretation of classifier conditions to allow the expression of arbitrarily complex boolean operations as simple, single-condition predicates. This innovation allows for the creation of specially designed classifier languages capable of the high-level operations necessary for proper problem solution, while preserving the simple syntactic properties which make the classifier language robust under genetic manipulation. It represents yet another use of classifier "tagging". (The fixed-length conditions and messages of classifier systems are often considered as broken into separate fields, one of which is often referred to as the tag field [Booker 82]. We let Tag(x) denote a function which returns the tag field of message or condition x.)

In the standard classifier, an unnegated condition is satisfied if and only if there exists a message on the current message list which matches the pattern expressed by that condition. (The negation of an unsatisfied condition is satisfied.) In Midgard, the standard classifier language is augmented with a set of predicate functions F = {F1, ..., Fk}, which take the arguments of conditions into the values 1 and 0.
Associated with the function set F is a set of identifying tags T = {t1, ..., tk}. The boolean-valued functions in F operate on conditions whose tag fields correspond to one of the identifying tags, i.e., on all conditions cnd such that Tag(cnd) ∈ T. An unnegated condition is satisfied in our system if it is satisfied under the standard system, or if its tag field corresponds to an identifying tag ti whose associated function Fi returns true when applied to the argument of the condition. In functional notation:

Let cl be a classifier of the form (cnd1, ..., cndp) → m, where each cndi, i = 1, ..., p, is a condition ∈ {0,1,#}^n, and m is a message ∈ {0,1}^n. Let ml denote a message list in ML = P({0,1}^n), the set of possible message lists [Forrest 85]. We then define the classifier function cl: ML → ML as:

cl(ml) = m  if Satisfy(cndi, ml) ∀ i = 1, ..., p;
cl(ml) = ∅  if ∃ i such that ¬Satisfy(cndi, ml);

where Match(cnd, ml) is a boolean-valued function which is true iff ∃ a message m' ∈ ml that matches cnd according to the standard classifier system's rules of matching, and Satisfy(cnd, ml) is a boolean-valued function: {0,1,#}^n x ML → {0,1}, given by

Satisfy(cnd, ml) = Match(cnd, ml) ∨ [ Tag(cnd) = ti ∈ T ∧ Fi(cnd) ].

5.2.2 Midgard

The individual scheduler programs in Midgard are fixed-length sets of 67-bit classifiers, composed of three 16-bit condition fields, each with one additional negation bit, and one 16-bit action field. Each field is broken down into two subfields: a four-bit tag field and a 12-bit argument field. Two values of the tag field are reserved as identifying tags for two predefined predicate functions, Q>=X and Q<=X. These functions allow a single condition to test the length of the host queue against an integer value encoded in the condition's argument field. Another two tag field values are reserved for Midgard's two types of effector messages, Accept: {0,1}^n → {0,1} and Migrate: {0,1}^n → {0,1}^n. When a message whose tag field matches one of the two effector identifiers is posted on the message list, the corresponding function is executed, taking the posted message as its argument.

Migrate contains an interpretation algorithm which constructs a legal network address from any 12-bit argument field in {0,1}^12. When Migrate(m) is executed by a node with an incoming job, a network address is constructed from the argument field of m and migration requests are sent to all nodes which match this address. Currently in Midgard, legal addresses are either individual nodes, clusters of nodes, the nodes in a particular class, or all other nodes in the network. If Migrate(m) is executed by a node which has no incoming job, then it has no effect. Accept contains an interpretation algorithm which evaluates any argument field in {0,1}^12 and returns {0,1}. Whenever Accept(m) is executed, the argument field of m is evaluated. If the host node has just received a submitted job, the value of the argument field indicates whether the job is to be accepted (1) or a network-wide migration attempted (0). If the host node has just received a migration request, the value of the argument field determines whether the request will be accepted (1) or rejected (0).

An example may serve to illustrate. Consider two processors: A, a class 1 processor in cluster 2, and B, a class 1 processor in cluster 3. A is heavily loaded, with five jobs waiting in its job queue, while B is currently idle. A job of class 3 arrives at node A, waking A's scheduler.
A's message list contains three messages:

(1) Host is class 1, in cluster 2, machine A
(2) Incoming job is class 3
(3) Job was submitted

A's scheduler program contains a classifier which specifies that:

If (Job was submitted) (Host is class 1) (Q>=X, X=3) → MIGRATE (Cluster 3).

The first two conditions match messages (3) and (1) on the current message list and so are satisfied. The third condition calls the predefined predicate function Q>=X to compare the host's queue length to 3, returning true if the queue length is greater than or equal to 3. So the third condition is also satisfied and the classifier fires, posting a Migrate effector message. Migrate is executed, issuing several migration requests, one to each node in Cluster 3, requesting one of them to agree to process A's incoming class 3 job. On receiving its copy of this message, B's scheduler awakes. B's message list also initially contains 3 messages:

(1) Host is class 1, in cluster 3, machine B
(2) Incoming job is class 3
(3) Request from Host A, class 1, cluster 2

Suppose B's production memory contains the following two classifiers:

(a) If (Host is class 1) (Internal message Red) → ACCEPT (1)
(b) If (Request from class 1) (Incoming job is class 3) (Q<=X, X=2) → (Post internal message Red)

On the first iteration, classifier (b) fires, posting an internal message of a particular configuration, denoted here as Red. On the second iteration, classifier (a) fires, issuing a message to A indicating B's willingness to accept the job. On receipt, A sends the job to B to be processed, and, on completion, the results are returned to node A.

Special cases exist when a migration request goes unaccepted or else is accepted by more than one processor. If a migration request is not accepted by any notified processor, the request expires and the incoming job is placed at the end of the job queue of the processor to which it was originally submitted. When a migration request is accepted by more than one processor, the first acceptance received is honored and the job is immediately migrated to that processor.
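The matching semantics exercised in this example can be collected into a short sketch. This is an illustration of the ACL's Satisfy rule, not Midgard's code: the message encoding, the tag values assigned to Q>=X and Q<=X, and the helper names are all assumptions.

# Sketch of ACL condition matching: a condition is satisfied by an ordinary
# message match OR, when its tag names a predefined predicate, by evaluating
# that predicate on its argument field. (Encodings here are hypothetical.)

def match(cond, msgs):
    # Standard classifier matching: '#' is a wildcard position.
    return any(len(m) == len(cond) and
               all(c in ("#", b) for c, b in zip(cond, m)) for m in msgs)

def q_ge(arg_bits, queue_len):   # predicate bound to tag "Q>=X"
    return queue_len >= int(arg_bits, 2)

def q_le(arg_bits, queue_len):   # predicate bound to tag "Q<=X"
    return queue_len <= int(arg_bits, 2)

PREDICATES = {"0001": q_ge, "0010": q_le}   # tag assignments are assumed

def satisfy(cond, msgs, queue_len):
    # Satisfy(cnd, ml) = Match(cnd, ml) OR (Tag(cnd) in T AND F_t(arg)).
    tag, arg = cond[:4], cond[4:]
    if tag in PREDICATES and PREDICATES[tag](arg, queue_len):
        return True
    return match(cond, msgs)

# Node A's third condition (Q>=X with X=3) against its queue of five jobs:
cond = "0001" + format(3, "012b")            # 4-bit tag + 12-bit argument
print(satisfy(cond, msgs=[], queue_len=5))   # True -> the classifier can fire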
5.2.3 Relationship with DGAS

Midgard retains many of the features of the DGAS directly. The individual schedulers on the network are concurrently executing distributed processes, each one capable of:

1) detecting the conditions in its local external environment, in terms of the class, cluster, id and queue length of its host node, by checking rule conditions against messages posted by detectors and evaluating instances of the simple predicates Q<=X and Q>=X;

2) detecting its internal state, by checking rule conditions against messages posted on its message list by its own rules; and

3) reacting to its external and internal states, by posting messages to its message list and/or executing Migrate and Accept actions which potentially alter the routing of tasks in the network.

Each process's immediate external environment is defined in terms of the characteristics of the processor on which it is running. Its internal state is defined by the messages posted on its message list. A process can modify its external environment either by issuing migration requests to transfer jobs to other nodes, accepting jobs sent or migrated to it, or refusing requests for migration.

One of the principal differences between Midgard and the standard DGAS presented in Chapter 3 is that the number of processes running at any given time in Midgard is fixed. Each node must at all times have a scheduler, and each scheduler must run on its own node, so the number of active processes must be the same as the number of nodes in the network. A second, perhaps more obvious, difference is the lack of meaning which can be assigned to the spatial relationships between nodes. In Asgard, the environment is defined in terms of local conditions which vary with distance; thus spatial proximity carries with it a wealth of context information. The closer two individuals are on Asgard's grid, the more intimately connected are the details of their contexts. In Midgard, however, two nodes may be adjacent to one another physically, but the characteristics of the network (the classes of jobs submitted, their patterns of submission, the characteristics of the nodes, etc.) may be such that they hardly ever interact, at any level. Without an association between physical proximity and contextual interaction, spatially biased mating and offspring placement will not have the same effect as predicted in the standard DGAS. To overcome these differences, the controlling processes of Midgard are arranged somewhat differently from those described in the previous two chapters.

Let A(t) denote the set of schedulers active during generation t. A(0) then denotes the initial, random scheduler population. Midgard's overall control algorithm can be given as:

main () {
    oversee ();
    t = 0;
    while ( unterminated ) {
        cobegin ( for all a in A(t) )
            active (a, Days_end);
        wait (Days_end);
        A(t+1) = construct ( A(t) );
        t += 1;
    }
}

All the schedulers in the network are active for a fixed period, a "day", after which the global event Days_end is set. At the end of each "day", a new population of processes A(t+1) is constructed from A(t), and this new population is activated to control the network for the next "day". The process active in Midgard is as follows:

active (process, Days_end) {
    Until (Days_end) {
        wait (request_or_submission);
        set_detectors ();
        decision = execute (process.program);
        perform (decision);
    }
}

Until the end of the "day", a process waits for job submissions or migration requests to arrive. When a request or submission arrives, the detectors are set appropriately, initializing the process's message list. Execute returns a decision based on the outcome of a fixed number of iterations of the process's production program. This decision (send a migration request, accept or reject a request or submission) is then carried out by the procedure perform. During the "day", the processes in A(t) schedule many jobs, and statistics are collected on the local nodes regarding their performance. At the end of the day, a new population is constructed from the old via the procedure construct:

construct ( A(t) ) {
    for ( Ai(t), i = 1, ..., m ) {
        evaluate ( Ai(t) );
        for ( each a in Ai(t) )
            compute selection probability p(a) = FE(a) / Σ(a' ∈ Ai(t)) FE(a');
        for ( j = 1, ..., |Ai(t)| ) {
            select program a from Ai(t);
            aj(t+1) = reproduce (a);
        }
    }
}

The population A(t) is divided into m subpopulations, A1(t), ..., Am(t), where m denotes the number of processor classes represented in the network. The processes in Ai(t) are all those processes which are running on a node of class i at time t. For each class of processes Ai(t), class means for the performance variables are computed and FE(a), a weighted sum of a's performance values, is calculated for all a in Ai(t). Based on these values, a set of selection probabilities p(a) is then computed. The new population is generated by selecting processes from the previous population according to this selection distribution. Offspring processes of the selected processes are generated by the routine reproduce, and these offspring become the members of the new population.
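A runnable rendering of construct's selection step may be useful. The sketch below implements fitness-proportional selection within each machine class; reproduce is reduced to copying, and the population and FE values are hypothetical examples.

import random

def construct(subpops):
    # Build the next day's schedulers: within each processor class, select
    # in proportion to FE(a), then produce offspring. ('reproduce' is a
    # plain copy here; crossover and mutation are omitted from the sketch.)
    new = {}
    for cls, members in subpops.items():
        weights = [m["fe"] for m in members]           # p(a) = FE(a) / sum FE
        chosen = random.choices(members, weights=weights, k=len(members))
        new[cls] = [dict(m) for m in chosen]           # reproduce(a) stand-in
    return new

# Hypothetical two-class population; FE values are arbitrary examples.
pop = {1: [{"name": "a1", "fe": 0.9}, {"name": "a2", "fe": 0.1}],
       2: [{"name": "b1", "fe": 0.5}, {"name": "b2", "fe": 0.5}]}
print([m["name"] for m in construct(pop)[1]])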
Two factors are associated with similarity of context between scheduling processes in the Midgard network. First, interaction between two processes is clearly proportional to the number of jobs they exchange with one another. Between process a and process a', the strength of interaction is given by:

SI(a, a') = [ migrs(a, a') + migrs(a', a) ] / [ Σ(a'' ∈ A) migrs(a, a'') + Σ(a'' ∈ A) migrs(a'', a) ]

where migrs(a, b) denotes the number of jobs migrated from a to b. Thus, the greater the interaction, the more the processes share the same context. Second, two processes within the same class experience similar conditions in the sense that they operate under the same service time distribution. In Midgard, we exploit these two factors in mate selection and offspring placement to create groups analogous to the spatial and structural groups described in Chapter 3.

Offspring placement in Midgard is determined by the class membership of the selected parent; an offspring of a process running on a machine of class i will itself run on a node of class i. The machine classes thus act as subpopulations in Midgard; the offspring of the processes in class i on day t are the active processes on the class i nodes on day t+1. This tends to concentrate structurally coherent groups within classes.

Just as crossover served two functions in Asgard, in Midgard crossover is used to produce both recombinations and hybridizations. With fixed probability q, mates are selected within class boundaries, most often producing recombinant offspring. With probability 1-q, mates are chosen based on their level of interaction, producing hybrids which link strategies operating in the same context.

select_mate (process) {
    With probability q
        choose mate from within process's class;
    else
        choose mate from A(t) according to strength of interaction;
}

Since the SI's sum to 1 and SI(a, a') ≥ 0 for all a' in A(t), the SI's form a valid probability distribution.
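The interaction measure and the mixed mate-selection rule can likewise be sketched directly; the migration table and the value q = 0.7 below are hypothetical stand-ins for observed network traffic and Midgard's actual parameter.

import random

def SI(a, b, migrs, procs):
    # Strength of interaction: jobs exchanged between a and b in either
    # direction, normalized by all of a's migration traffic (as in the text).
    traffic = sum(migrs.get((a, x), 0) + migrs.get((x, a), 0) for x in procs)
    if traffic == 0:
        return 0.0
    return (migrs.get((a, b), 0) + migrs.get((b, a), 0)) / traffic

def select_mate(a, procs, classes, migrs, q=0.7):
    # With probability q choose within a's class (recombination); otherwise
    # choose in proportion to strength of interaction (hybridization).
    others = [x for x in procs if x != a]
    if random.random() < q:
        return random.choice([x for x in others if classes[x] == classes[a]])
    weights = [SI(a, x, migrs, procs) for x in others]
    return random.choices(others, weights=weights, k=1)[0]

# Toy data: three schedulers and a hypothetical migration count table.
procs = ["a", "b", "c"]
classes = {"a": 1, "b": 1, "c": 2}
migrs = {("a", "c"): 8, ("c", "a"): 4, ("a", "b"): 1}
print(select_mate("a", procs, classes, migrs))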
5.3 Evaluation

Experiments with Midgard were designed to explore the feasibility of using distributed genetic adaptive systems to devise strategies to coordinate the action of interconnected computing resources. Midgard was tested in two phases. In the first phase, the set of experiments described in section 5.3.1, a homogeneous network, containing processors from a single class, is sent jobs of a single type. Two different evaluation functions, FE1 and FE2, are tested to determine whether or not a set of implicitly coordinated distributed processors can be generated which cooperate to effectively balance the network's computing load. In the second phase, the set of experiments described in section 5.3.2, a heterogeneous network, composed of processors from two distinct classes, receives tasks of two different types. The goal is to generate a single composite program which, when replicated on all the nodes of the network, produces differentiated behaviors on each class of machine, which cooperate together to effectively balance the network load.

5.3.1 Phase 1

For all the simulations in phase 1 we use a fixed network model with one class of processor and one a priori job classification. Fifty identical nodes are arranged in 5 connected clusters of 10 nodes each. Task arrival to each of these nodes is modeled as a Poisson process with mean λ. The service time for each job is a sample from an exponential distribution with mean s. Each of the connecting LANs is assumed to have a "perfect" protocol capable of queuing messages and job transfers, processing them in the order received. Actual message and job transfer times (excluding any waiting time) are modeled as exponential processes with means s/10^7 and s/10^4 respectively. The total network load, ρ, is given by λs. For each evaluation function we performed a set of trials at each of three load values, ρ = .2, .5, and .8, recording the minimum mean network response time generated. As a control, we consider the mean response time at ρ = .2, .5, and .8 for the two limiting cases: the case where each node processes its own incoming jobs (the isolationist policy described above), and the theoretical lower limit, where each job is processed without waiting.

Each simulation run is a repeated cycle of testing, evaluation and recombination. Our objective is to determine a fitness function for Midgard's genetic algorithm that drives successive scheduler populations toward optimum mean network response time. Given some particular pattern of performance, we want to apportion credit to those schedulers whose decisions decreased network response time and blame to those whose decisions increased it. To test a particular population of schedulers, however, up to several thousand tasks are submitted to the nodes of the network to be processed, resulting in many thousands of individual scheduler decisions. The eventual result of each decision cannot be immediately determined; thus any "reward and punishment" scheme will necessarily involve some rather complicated record keeping, as well as an elaborate scheme of IPC to transfer this information back and forth between the nodes. Instead we chose to pursue an evaluation strategy similar to Smith's [80], identifying a set of four local performance measures for each scheduler whose weighted sum gives a good estimate of each scheduler's relative contribution to overall network performance. The four measures we use are:

1) Mean response time, Resp;
2) Mean queue length, Qlen;
3) Mean utilization, Util;
4) Mean migration activity, Mact.

We consider relative values for each measure and normalize them so that values near 1 indicate good performance with respect to the measure, 0 poor performance. Testing has shown that, when used independently, the four measures exert pressures in different directions, and that none is sufficient, on its own, to control recombination. A critic based solely on minimizing Resp or Qlen rewards nodes which effectively migrate tasks which might otherwise overload them, but it also punishes any node which accepts migrated jobs. Thus a node which migrates all of its jobs and accepts none is optimal with respect to either of these measures, but is clearly pathological with respect to overall network performance. On the other hand, a critic based solely on maximizing Util rewards nodes which accept tasks that exploit their under-utilized resources, but also punishes any node which sends such a task. Thus a node which accepts any and all jobs offered it is optimal with respect to this measure, even though it is also clearly pathological with respect to overall network performance. The last measure, Mact, is a composite measure that estimates the amount of migration activity at each node.
Since requesting and accepting jobs are the only actions which schedulers in the network can take, maximizing this measure insures that the nodes in the network do not slip into isolationism. This can be important, particularly at the outset, when migratory behavior tends to be random, and often less effective than simply doing nothing. Used on its own, however, it will almost invariably develop a strategy of random scheduling, in which all stations issue global migration requests for every job submitted to them and offer to accept every migration request offered them.

The first attempt at combining them, FE1, is the equally weighted sum of the four measures. We ran a number of experiments at each of the three load values. Figure 7 shows the performance of a typical run with respect to mean network response time (ρ = .8). Beginning from a randomly generated population of schedulers which maintains the mean network response at 11*s, the first few populations get steadily worse, driving network response time up to a maximum of 25*s. As the members of the population become increasingly coadapted, network response time declines. Eventually a population of schedulers is developed whose combined action reduces network response time to 1.7*s.

[Figure 7. On-Line Progress of a Typical Homogeneous Experiment - FE1. Mean network response time (in multiples of the mean service time, s) plotted against generations.]

The performance of the "average-best" over several trials at each load is summarized in Figure 8. Using the equally weighted fitness function, Midgard was consistently able to generate a scheduler population capable of improving the network load balance by at least 65% at each load.

[Figure 8. Summary of Homogeneous Network Results - FE1. Mean network response time (in multiples of the mean service time, s) plotted against network load ρ, for the optimum, isolationist, and achieved policies.]

The policies discovered, however, were somewhat less than satisfactory from an overhead perspective. In each case, the system consistently evolved strategies which generated more migrations than necessary, and in low load situations even approached random scheduling. Analysis of the performance of the system with respect to each of the individual performance measures suggested that this undesirable behavior was due to the influence of Mact, which escalated the migration activity of the system, sometimes even at the expense of the other measures. To remedy this, we considered a second fitness function, FE2, whose weights vary with time. At the outset, each of the four measures is equally weighted, but as time goes on, Mact's weight approaches 0, while Qlen's and Util's weights each approach .3 and Resp's weight approaches .4. The results of several trials are summarized in Figure 9. For each of the three values of ρ, scheduling policies were consistently discovered capable of near optimal performance. In addition, the strategies developed require less than .5% of the total network channel capacity.
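The two fitness functions differ only in their weight schedules, which a short sketch makes explicit. The dissertation states only the initial and final weights; the linear interpolation between them is an assumed schedule shape.

def fe1(resp, qlen, util, mact):
    # FE1: the equally weighted sum of the four normalized measures.
    return 0.25 * (resp + qlen + util + mact)

def fe2(resp, qlen, util, mact, t, t_max):
    # FE2: weights vary with time. They start equal; Mact's weight decays
    # toward 0 while Resp approaches .4 and Qlen, Util approach .3 each.
    # (Linear interpolation over the run is an assumed schedule shape.)
    f = min(t / t_max, 1.0)                 # progress through the run
    w_resp = 0.25 + f * (0.40 - 0.25)
    w_qlen = 0.25 + f * (0.30 - 0.25)
    w_util = 0.25 + f * (0.30 - 0.25)
    w_mact = 0.25 * (1.0 - f)
    return w_resp * resp + w_qlen * qlen + w_util * util + w_mact * mact

# Early on, migration activity still earns credit; late in the run it does not.
print(fe2(0.8, 0.7, 0.6, 0.9, t=0,   t_max=100))
print(fe2(0.8, 0.7, 0.6, 0.9, t=100, t_max=100))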
5.3.2 Phase 2

In each experiment in phase 2 we use a fixed network model with two classes of processor and three a priori job classifications. Fifty-four nodes are arranged in 6 connected clusters of 9 nodes each. Each cluster is a homogeneous collection of processors from a single class, and there are three such clusters for each of the two processor classes. Task arrival to each of these nodes is modeled as a two-dimensional Poisson process with mean (λ1, λ2). The service time for each job is a sample from an exponential distribution with mean μij, where i denotes the class of the executing processor and j the job type (see Table 2). Each of the connecting LANs is assumed to have a "perfect" protocol capable of queuing messages and job transfers, processing them in the order received, as in phase 1. The total network load, ρ, is given by [.5(s) + .5(10s)]λ = 5.5sλ. Actual message and job transfer times (excluding any waiting time) are modeled as exponential processes with means s/10^7 and s/10^4 respectively.

               Processor class
               1         2
  Job type 1   s         10s
  Job type 2   10s       s

Table 2. Mean Service Times μij for a Job of Type j on a Processor of Class i.

I performed a set of trials at each of three load values, ρ = .2, .5, and .8, recording the minimum mean network response time generated. As a control, I again consider the mean response time at ρ = .2, .5, and .8 for the two limiting cases: the isolationist policy and the theoretical lower limit, where each job is processed without waiting by the machine which executes it most quickly.

Using function FE2, I ran a number of experiments at each of the three load values. Figure 10 shows the performance of a typical run with respect to mean network response time (ρ = .8). Beginning from a randomly generated population of schedulers which maintains the mean network response time at 14s, network response time increases steadily, peaking at 29*s after 10 generations. However, as the individual schedulers become ever more adapted to one another, network response time declines. After 100 generations, a population of schedulers is developed whose combined action has reduced network response time to 1.8*s. In every trial, after 120 generations, the members of both subpopulations converged to a single program type, which contained specialized rules that behave differently depending upon the class of machine on which they are run.

[Figure 10. On-Line Progress of a Typical Heterogeneous Network Experiment. Mean network response time (in multiples of the minimum mean service time, s) plotted against generations.]

The performance of the "average-best" over several trials at each load is summarized in Figure 11. Over several trials at each load value, Midgard generated a scheduler which, when replicated over the entire network, was capable of balancing the load to, on average, within 5% of the theoretical optimum.

[Figure 11. Summary of Heterogeneous Network Results. Mean network response time (in multiples of the minimum mean service time, s) plotted against network load ρ, for the optimum, isolationist, and achieved policies.]

5.4 Summary

In this chapter we have presented a model for applying a distributed genetic algorithm to a problem in the field of computer networks. The results presented show that the genetic algorithm can be successful in coadapting the behavior of distributed computing resources to produce desirable global behaviors. Tests on homogeneous networks show that the individual nodes of Midgard are able to develop a global scheduling policy for a class of networks which achieves a load balance at or near the theoretical optimum, using only local information.
Heterogeneous experiments yield similar results, developing composite programs containing rules which produce different behaviors depending upon the characteristics of the machine on which the program is running. When such a composite is replicated throughout the system, the differentiated processes operating on the different nodes cooperate implicitly, routing jobs to the machines which can perform them most efficiently.

CHAPTER VI

SUMMARY AND CONCLUSIONS

This dissertation has focused on the development of a computational theory of distributed adaptive self-organization derived from consideration of evolving biological systems. We began by drawing an analogy between the individuals in an evolving community and the independent, but interconnected, processes in a distributed computing system engaged in a learning task. The idea we have explored here is that coherence between the components of living systems emerges as a result of coadaptive development, and that elements of this process can be abstracted to induce goal-directed integration in artificial computing systems.

We first examined some of the biological literature, in order to identify part of the natural mechanism responsible for integrating originally independent individuals into coherent wholes. In particular, we focused on the role symbiosis has played in the emergence of evolutionary novelty. We examined three separate natural transitions involving increases in complexity which were due, at least in part, to the identification and selection of symbiotic association. While the physical mechanisms of interaction and fusion differed in each case, the same basic pattern of development was evinced.

Equating the independent organisms with computational processes, we developed an abstract description of the three transitions in an attempt to capture the essence of their common coadaptive process. We considered each system as composed of a set of interacting processes of finite duration, which are periodically capable of initiating "offspring" processes. These offspring are replicas of the parent processes, but are subject to a source of adaptive variation. We described a coadaptation process wherein, when advantageous associations between processes arise, those associations increase in frequency and strength so long as they continue to confer advantage. As processes become more intimately associated, selection favors those which form the most integrated complexes. These complexes provide a means for insuring the continued persistence of the association, and result in an increase in overall system coherence.

A formal statement of this process as a computational theory of distributed adaptive self-organization formed the basis for the rest of the work. It asserted conditions under which a certain class of distributed systems could increase their level of integration and organization. These conditions require that the system have some mechanism for detecting the existence of beneficial interactions between processes and increasing their strength and frequency of occurrence.

The distributed genetic adaptive system (DGAS) was developed as an algorithm to implement the newly developed theory. Under this system, a group of concurrent processes occupy locations in a tessellated space which defines each process's immediate environment. Associated with each process is a measure of its performance, continually updated by a task-specific evaluation function.
The process set is manipulated by a new variant of the genetic algorithm, which satisfies the requirements of our computational theory and provides the system with the adaptive variation and selection of association necessary to increase the level of system coordination and improve its performance. This new algorithm is a truly distributed version of the original genetic procedure and represents a synthesis of the two major directions in current genetic algorithm research. Like those in the Pitt approach, our algorithm manipulates units of arbitrary complexity, relying on the genetic operators to develop internal coherence as well as improve external performance. Like the Michigan approach, the individual units manipulated by our procedure interact to affect one another's performance. The algorithm acts as an efficient source of adaptive variation by biasing the allocation of "offspring" toward those processes which have been observed as the best performers. By statistically applying an appropriate crossover operator to produce these offspring, trials are biased not only toward those individuals observed most fit, but also implicitly toward those sets of individuals (schemata) observed as most fit.

The introduction of spatially biased mate selection and offspring placement strategies introduces a new dimension to the genetic algorithm, allowing it to encourage the development and eventual integration of symbiotic strategies. Advantageous interactions between neighboring processes confer advantage, resulting in increased numbers of "offspring" for these processes. Spatially biased offspring placement insures that these offspring will themselves interact with increasing likelihood. Crossovers between these individuals in the DGAS can result in the emergence of composite programs, capable of producing one of several behaviors. Composite processes produced by the hybridization of processes which are symbiotically related make structurally explicit the implicit spatial coherence that exists between them. Replications of these composite types result in a set of independent processes which act as an implicitly coordinated, distributed system.

To test the effectiveness of the DGAS, the software system Asgard was developed. Asgard is an implementation of the distributed genetic algorithm applied to an "artificial life" domain. The tessellated space provides an environment for the process set, a population of artificial "animals" which move about the grid consuming the "food" which is distributed throughout. Each animal's movement pattern is controlled by its own program, a series of instructions drawn from a task-specific language. Each "animal" is modeled as a process which executes its program based on input from its surrounding environment and which reproduces according to the rules of the distributed genetic algorithm. The learning task specified by the "food" placement strategy imposes constraints on the actions of the processes in the domain, both individually and in aggregate. Tests with the Asgard system were described which showed the capability of the DGAS to develop a system of interdependent, implicitly coordinated processes organized toward a common goal. An example of one of these trials was discussed at length, describing the gradual transformation of a population of "animals" from random, uncoordinated, inefficient individuals to an organized group of groups, whose combined efforts resulted in dramatic increases in task performance.
The final result was a single, composite program whose replicated copies spontaneously organized into a set of coordinated groups which function together as a coherent whole. The Asgard trials demonstrated that the distributed processes which evolve under a DGAS can exhibit coordinated behavior without explicit communication. Despite the encouraging results, several characteristics of the system were described which hampered further experimentation. Asgard's environment was extremely sensitive to slight changes in its internal parameters, making population sizes difficult to reliably control. Extensions of Asgard to more useful domains proved awkward, due to the difficulty of meaningfully translating realistic problems into Asgard's metric grid. This suggested that in order to apply the DGAS to more practical problems, some substitute source for the contextual clues provided by the environmental space would have to be developed.

These considerations prompted the development of a second software system, Midgard, which utilized a modified version of the DGAS to address a more realistic problem from the field of distributed computing. The processes of Midgard corresponded to a set of independent rule-based load schedulers operating on the nodes of an inter-LAN network. These schedulers interacted with one another to allocate incoming tasks to specific network nodes. The goal was to develop a coordinated scheduler set which cooperated to reduce the mean network response time. The approach taken in Midgard differed from previous approaches in two respects. First, unlike previous attempts, the schedulers in Midgard are rule-based rather than parameter-based. Second, in order to demonstrate the ability of our algorithm to develop implicitly coordinated units, capable of cooperative behavior with a minimum of communication, the scheduling processes in Midgard do not exchange statistics about the global state of the network; all scheduling decisions are based on information local to the node.

In order to apply a genetic procedure to the schedulers in this context, we found it necessary to extend the standard genetic rule language, the Classifier Language (CL), developing what we called the Augmented Classifier Language (ACL). The ACL provides for the incorporation of pre-defined, task-specific boolean-valued functions as simple, binary-encoded predicates. The ACL preserves the desirable features of the CL -- robustness under genetic manipulation and computational completeness -- but allows certain operations which cannot be practically expressed in the standard language to be incorporated easily.

Tests with the Midgard system provided more verification of our theory of distributed adaptive self-organization. Experiments with a homogeneous network showed that Midgard was able to develop a global scheduling policy for a class of networks which achieved a load balance at or near the theoretical optimum, using only local information. Heterogeneous experiments yielded similar results, developing composite programs which contained rules for several different behaviors, which are interpreted differently depending upon the characteristics of the machine on which the program is run. When such a composite is replicated throughout the system, the differentiated processes operating on the different nodes cooperate implicitly, routing jobs to the machines which can perform them most efficiently.
This study has established the feasibility of developing a coordinated set of concurrent processes organized toward a common objective through the process of selection of association. Formulated as an implementation-independent model, our computational theory can serve as a guide to the construction of algorithms other than the DGAS intended to drive the learning behavior of distributed computing systems. Our particular implementation, the DGAS, was successful in generating an interesting set of behaviors in the "artificial life" domain, developing complex, well-integrated behaviors from the interactions of simple, computationally limited, non-communicating processes. Our application in the load balancing domain, while not completely realistic, demonstrated the practical potential of the technique, and provided an alternative to a spatially based model.

Finally, while useful, the theory we have developed here is by no means complete. A number of areas inviting further research can be identified. Since the development of our algorithm, a number of improvements to the genetic procedure have been developed, which regulate the application of the various operators and introduce new operators at various stages of population development to increase the variability of the population and control its convergence properties [Sirag 87]. Further experimentation with our Augmented Classifier Language might include incorporating a rule-level credit assignment scheme like the bucket brigade to increase the internal coherence of the automatically generated rule-based processes. Increases in attainable complexity might also be obtained through the automatic assignment of function tags to developed processes of a certain fitness, allowing them to be referenced as simple predicates.

Perhaps the most fertile areas for further research are to be found in extending the biological analogy begun here. At present, the complexity of the interactions between processes is limited due to a lack of process-controlled regulatory mechanisms to oversee the differentiation process. Further biological modeling may provide the abstractions necessary to devise a system of process construction or interprocess communication which will facilitate the development of more tightly correlated multi-process systems.

BIBLIOGRAPHY

Anderson, N. (1970). Evolutionary significance of virus infection. Nature, 227: 1346-1347.

Baker, J.E. (1985). Adaptive selection methods for genetic algorithms. In: Proceedings of an International Conference on Genetic Algorithms and Their Applications, July, Pittsburgh, PA.

Blackbourn, D.J., Taylor, F.J.R., Blackbourn, J. (1975). Foreign organelle retention by ciliates. Journal of Protozoology, 20(2): 286-288.

Booker, L. (1982). Intelligent Behavior as an Adaptation to the Task Environment. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Burks, A.W., ed. (1970). Essays on Cellular Automata. Urbana, IL: University of Illinois Press.

Burks, A.W. (1986). A radically non-Von Neumann architecture for learning and discovery. In: CONPAR 86: Conference on Algorithms and Hardware for Parallel Processing, Wolfgang Handler, ed. September 17-19, Springer-Verlag, Berlin.

Burnet, F.M. (1969). Cellular Immunology. Melbourne: Melbourne University Press.

Cavicchio, D.J. (1970). Adaptive Search Using Simulated Evolution. Doctoral dissertation, University of Michigan, Ann Arbor, MI.
Cleveland, L.R., Grimstone, A.V. (1964). The fine structure of the flagellate Mixotricha paradoxa and its associated microorganisms. Proceedings of the Royal Society of London, B 157: 668-683.

Cowan, J.D., Sharp, D.H. (1988). Neural nets and artificial intelligence. Daedalus, 117(1): 85-121.

Dawkins, R. (1976). The Selfish Gene. New York, NY: Oxford University Press.

Dawkins, R. (1982). The Extended Phenotype. New York, NY: Oxford University Press.

Day, W. (1984). Genesis on Planet Earth. New Haven, CN: Yale University Press.

Dayhoff, M. (1972). Atlas of Protein Sequence and Structure: Vol. 5. Washington, DC: National Biomedical Research Foundation.

DeJong, K.A. (1975). Analysis of the Behavior of a Class of Genetic Adaptive Systems. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

DeJong, K.A. (1987). On using genetic algorithms to search program spaces. In: Genetic Algorithms and Their Applications: Proceedings of the 2nd International Conference on Genetic Algorithms, Grefenstette, ed., July, M.I.T., Cambridge, MA.

Farmer, J.D., Packard, N.H., Perelson, A.S. (1985). The immune system and artificial intelligence. In: Proceedings of an International Conference on Genetic Algorithms and Their Applications, July, Pittsburgh, PA.

Fogel, L., Owens, A., Walsh, M. (1966). Artificial Intelligence Through Simulated Evolution. New York, NY: Wiley.

Forrest, S. (1985). A Study of Parallelism in the Classifier System and its Application to Classification in KL-ONE Semantic Networks. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Frantz, D.R. (1972). Non-Linearities in Genetic Adaptive Search. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Friedberg, R.M. (1958). A learning machine, Parts I and II. IBM Journal of Research and Development, 2(1): 2-13.

Fujiki, C., Dickinson, J. (1987). Using the genetic algorithm to generate LISP source code to solve the Prisoner's Dilemma. In: Genetic Algorithms and Their Applications: Proceedings of the 2nd International Conference on Genetic Algorithms, Grefenstette, ed., July, M.I.T., Cambridge, MA.

Fuller, S., Siewiorek, D. (1973). Some observations on semiconductor technology and the architecture of large digital modules. Computer, 6(10): 14-21.

Goldberg, D.E. (1983). Computer-Aided Gas Pipeline Operation using Genetic Algorithms and Rule Learning. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Goldberg, D.E. (1987). Nonstationary function optimization using genetic algorithms with dominance and diploidy. In: Genetic Algorithms and Their Applications: Proceedings of the 2nd International Conference on Genetic Algorithms, Grefenstette, ed., July, M.I.T., Cambridge, MA.

Goulian, M., Kornberg, A., Sinsheimer, R. (1968). Enzymatic synthesis of DNA, XXIV: synthesis of infectious phage phiX174 DNA. Proceedings of the National Academy of Sciences, USA, 58: 2321-2328.

Grefenstette, J.J. (1987). Multilevel credit assignment in a genetic learning system. In: Genetic Algorithms and Their Applications: Proceedings of the 2nd International Conference on Genetic Algorithms, Grefenstette, ed., July, M.I.T., Cambridge, MA.

Grosso, P.B. (1985). Computer Simulation of Genetic Adaptation: Parallel Subcomponent Interaction in a Multilocus Model. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Holland, J.H. (1975). Adaptation in Natural and Artificial Systems. Ann Arbor, MI: University of Michigan Press.
Holland, J.H., Reitman, J.S. (1978). Cognitive systems based on adaptive algorithms. In: Pattern-Directed Inference Systems, Waterman, D.A., Hayes-Roth, F., eds. New York, NY: Academic Press.

Holland, J.H. (1985). Properties of the bucket brigade. In: Proceedings of an International Conference on Genetic Algorithms and Their Applications, July, Pittsburgh, PA.

Holland, J.H., Holyoak, K., Nisbett, R., Thagard, P. (1986a). Induction: Processes of Inference, Learning, and Discovery. Cambridge, MA: MIT Press.

Holland, J.H. (1986b). Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems. In: Machine Learning: An Artificial Intelligence Approach, Volume II, Michalski, R.S., Carbonell, J.G., Mitchell, T.M., eds. Los Altos, CA: Morgan Kaufmann Publishers, Inc.

Hollstein, R.B. (1971). Artificial Genetic Adaptation in Computer Control Systems. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Horowitz, N.H. (1945). On the evolution of biochemical synthesis. Proceedings of the National Academy of Sciences, USA, 31: 153-157.

Jeon, K.W., Danielli, J.F. (1971). Micrurgical studies with large free-living amoebas. International Review of Cytology, 30: 49-89.

Jeon, K.W., Jeon, M.S. (1976). Endosymbiosis in amoebae: recently established endosymbionts have become required cytoplasmic components. Journal of Cellular Physiology, 89: 337-344.

Kaniss, R.C. (1981). Evolutionary Change in Hierarchical Systems: A General Theory. Doctoral dissertation, Cornell University, Ithaca, NY.

Khorana, H.G., Agarwal, K.L. (1970). Total synthesis of the gene for an alanine transfer ribonucleic acid from yeast. Nature, 227: 27-34.

Koestler, A. (1976). The Ghost in the Machine. New York, NY: Random House.

Lippmann, R. (1987). An introduction to computing with neural networks. IEEE ASSP Magazine, April: 4-22.

Livny, M., Melman, M. (1982). Load balancing in homogeneous broadcast distributed systems. Performance Evaluation Review, 11(1): 47-56.

Locke, M., ed. (1968). The Emergence of Order in Developing Systems. New York, NY: Academic Press.

Lovelock, J.E. (1988). No longer willful, Gaia becomes respectable. Science, 240: 393-395.

Margulis, L. (1970). Origin of Eukaryotic Cells. New Haven, CT: Yale University Press.

Margulis, L. (1976). A review: genetic and evolutionary consequences of symbiosis. Experimental Parasitology, 39: 277-349.

Margulis, L. (1981). Symbiosis in Cell Evolution. San Francisco, CA: W.H. Freeman and Company.

Margulis, L. (1982). Early Life. Boston, MA: Science Books International.

Margulis, L., Chase, D., Guerrero, R. (1986a). Microbial communities. BioScience, 36(3).

Margulis, L., Sagan, D. (1986b). Origins of Sex. New Haven, CT: Yale University Press.

Margulis, L., Sagan, D. (1986c). Microcosmos. New York, NY: Summit Books.

Marr, D. (1977). Artificial intelligence - a personal view. Artificial Intelligence, 9: 37-48.

Mayr, E. (1970). Population, Species and Evolution. Cambridge, MA: Belknap Press.

Miller, S.L. (1953). A production of amino acids under possible primitive earth conditions. Science, 117: 528-529.

Morowitz, H.J. (1967). Biological self-replicating systems. Progress in Theoretical Biology, 1: 35-38.

Oro, J., Sherwood, E., Eichberg, J., Epps, D. (1978). Formation of phospholipids under primitive earth conditions and the role of membranes in prebiological evolution. In: Light Transducing Membranes, D. Deamer, ed. New York, NY: Academic Press.
Perry, Z.A. (1984). Experimental Study of Speciation in Ecological Niche Theory Using Genetic Algorithms. Doctoral dissertation, University of Michigan, Ann Arbor, MI.

Ponnamperuma, C., Gabel, N. (1968). Current status of chemical studies on the origin of life. Space Life Sciences, 1: 64-96.

Prigogine, I., Nicolis, G., Babloyantz, A. (1972). Thermodynamics of evolution. Physics Today, 25(11): 23-28; 25(12): 38-44.

Prigogine, I. (1980). From Being to Becoming. San Francisco, CA: W.H. Freeman.

Pringle, J.W.S. (1951). On the parallel between learning and evolution. Behaviour, 3: 90-110.

Pylyshyn, Z.W. (1986). Computation and Cognition. Cambridge, MA: MIT Press.

Ramamritham, K., Stankovic, J.A. (1984). Dynamic task scheduling in hard real-time distributed systems. IEEE Software, 1(3): 65-75.

Reanney, D.C. (1974). Viruses and evolution. International Review of Cytology, 37: 21-52.

Richmond, M.H. (1979). 'Cells' and 'organisms' as a habitat for DNA. Proceedings of the Royal Society of London, B204: 235-250.

Riedl, R. (1978). Order in Living Organisms. New York, NY: John Wiley and Sons.

Sagan, C., Khare, B. (1971). Long wavelength ultraviolet photoproduction of amino acids on primitive earth. Science, 173: 417-420.

Sannier, A.V., Goodman, E.D. (1987). Genetic learning procedures in distributed environments. In: Genetic Algorithms and Their Applications: Proceedings of the 2nd International Conference on Genetic Algorithms, Grefenstette, ed., July, M.I.T., Cambridge, MA.

Sirag, D.J., Weisser, P.T. (1987). Toward a unified thermodynamic genetic operator. In: Genetic Algorithms and Their Applications: Proceedings of the 2nd International Conference on Genetic Algorithms, Grefenstette, ed., July, M.I.T., Cambridge, MA.

Smith, D.C. (1979). From extracellular to intracellular: the establishment of a symbiosis. Proceedings of the Royal Society of London, B204: 115-130.

Smith, S.F. (1980). A Learning System Based on Genetic Adaptive Algorithms. Doctoral dissertation, University of Pittsburgh, Pittsburgh, PA.

Spiegelman, S. (1967). An in vitro analysis of a replicating molecule, 80th Jubilee Lecture. American Scientist, 55: 3-68.

Steele, E.J. (1981). Somatic Selection and Adaptive Evolution. Chicago, IL: University of Chicago Press.

Stone, H. (1977). Multiprocessor scheduling with the aid of network flow algorithms. IEEE Transactions on Software Engineering, SE-3(1): 85-93.

Taylor, F.J.R. (1974). Implications and extensions of the serial endosymbiosis theory of the origin of eukaryotes. Taxon, 23: 229-258.

Taylor, F.J.R. (1979). Symbionticism revisited: a discussion of the evolutionary impact of intracellular symbioses. Proceedings of the Royal Society of London, B204: 267-286.

Taylor, J.T. (1976). Captain Jim's drunken dream. Country Roads Music, Inc. (BMI).

Temin, H.M. (1976). The DNA provirus hypothesis. Science, 192: 1075-1080.

Thompson, J.N. (1982). Interaction and Coevolution. New York, NY: John Wiley and Sons.

Towsley, D., Mirchandaney, R. (1987). The effects of communication delays on the performance of load balancing policies in distributed systems. In: Proceedings of the 2nd International Workshop on Applied Math. and Perf./Rel. Models in Computer Communications, May.

Usher, D.A. (1977). Early chemical evolution of nucleic acids: a theoretical model. Science, 196: 311-313.

Waddington, C.H. (1969). Paradigm for an evolutionary process. In: Towards a Theoretical Biology, Vol. II. Chicago, IL: Aldine.

Wallace, R., King, J.L., Sanders, G. (1981). Biology: The Science of Life. Glenview, IL: Scott, Foresman and Company.

Wallin, I.E. (1927). Symbionticism and the Origin of Species. Baltimore, MD: Williams and Wilkins.
Wasserman, P.D., Schwartz, T. (1987). Neural networks: Part 1. IEEE Expert, 2(4): 10-14.

Whatley, J., John, P., Whatley, F.R. (1979). From extracellular to intracellular: the establishment of mitochondria and chloroplasts. Proceedings of the Royal Society of London, B204: 165-187.

White, H.B. (1976). Coenzymes as fossils of an earlier metabolic state. Journal of Molecular Evolution, 7: 101-104.

Wiley, E.O., Brooks, D.R. (1982). Victims of history - a nonequilibrium approach to evolution. Systematic Zoology, 31(1): 1-24.

Wilson, S.W. (1985). Knowledge growth in an artificial animal. In: Proceedings of an International Conference on Genetic Algorithms and Their Applications, July, Pittsburgh, PA.