LIBRARY
Michigan State University

This is to certify that the dissertation entitled "On Some Topological Issues in Message-Passing Multiprocessor Architectures," presented by Guy Warren Zimmerman, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Computer Science.

MSU is an Affirmative Action/Equal Opportunity Institution.

ON SOME TOPOLOGICAL ISSUES IN MESSAGE-PASSING MULTIPROCESSOR ARCHITECTURES

By

Guy Warren Zimmerman

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Computer Science

1990

ABSTRACT

ON SOME TOPOLOGICAL ISSUES IN MESSAGE-PASSING MULTIPROCESSOR ARCHITECTURES

By

Guy Warren Zimmerman

Advances in computer technology have made it possible to construct machines with very large numbers of processors. Among such machines are the distributed-memory machines, termed Multicomputers, which have the potential to deliver very high performance for a large class of applications. A major component in the design of a Multicomputer system is its underlying interconnection topology (usually modeled by a graph), since it has a significant influence on virtually every other aspect of the system, such as communication capabilities and reliability.

This dissertation considers some aspects of communication and reliability in a Multicomputer system which are related to the underlying topology of the system. In particular, a type of communication known as broadcasting is studied. The roles of trees in completing broadcasting, and a near-optimal broadcasting algorithm for a specific Multicomputer system known as the De Bruijn networks, are explained. Other partial results, such as the distribution of the broadcast times of trees of different orders, are given.
The reliability, or fault-tolerance, issues discussed here are of a topological nature. Specifically, a new approach to system-level fault-tolerance in Multicomputers is presented. In this approach a bound is placed on the maximum number of connections allowed at each processor, and the number and configuration of redundant components is then determined to achieve a specified level of fault-tolerance. The approach is applied to the design of fault-tolerant ring topologies. Designs are presented which can tolerate up to three processor failures.

To my wife, Janet

ACKNOWLEDGMENTS

I would like to thank my advisor, Abdol-Hossein Esfahanian, for all of his assistance during my Ph.D. studies, for his well chosen words of encouragement, and for his confidence in me. I would also like to thank Dr. Lionel Ni, Dr. Anthony Wojcik, and Dr. Bruce Sagan for their service on my committee and for their helpful comments and suggestions regarding the preparation of this dissertation. I would like to thank my wife, Janet Ergo Zimmerman, without whose support and sacrifice I could not have completed my Doctoral work. I would like to thank my parents, Jack and Jane Zimmerman, for instilling in me the importance of learning and for giving me the opportunity to pursue a college education. Finally, I wish to thank Jack and Carol Ergo and my grandparents, Mable and Harry Raymond, for their years of support and encouragement.

TABLE OF CONTENTS

List of Tables
List of Figures
1. Introduction
1.1 Multiprocessors and Multicomputers
1.2 Motivation and Problem Statement
1.3 Thesis Organization
2. Graphs and GMP
2.1 Graph Definitions, Notation, and Terminology
2.2 GMP - A software package for graph manipulation
2.2.1 An overview of GMP
2.2.2 Utilization of GMP
3. Broadcasting in Multicomputers
3.1 Background
3.2 Broadcasting in General Graphs
3.3 Broadcasting in Trees
3.3.1 Minimum broadcast trees
3.3.2 Characteristics of general minimum broadcast trees
3.3.3 Broadcast times for general trees
3.4 Broadcasting in Binary DeBruijn Graphs
3.4.1 Binary DeBruijn graphs
3.4.2 A distributed broadcast algorithm for BDG(n)
4. Fault-Tolerant Loop Architectures
4.1 Background
4.2 Chordal Rings as k-ft Cycle Topologies
4.2.1 Chordal rings
4.2.2 Fault tolerance of CR(N,w)
4.2.3 Comparison with previous results
5. Conclusion
5.1 Conclusions and Summary of Research Contributions
5.2 Suggestions for Future Study
Bibliography

LIST OF TABLES

Table 3-1. The values of M(t, Δ)
Table 3-2. Distribution of broadcast times for all trees T, 4 ≤ |T| ≤ 28
Table 3-2 (Cont'd)

LIST OF FIGURES

Figure 2-1. Figure 2-2. Figure 2-3. Figure 2-4. Figure 2-5. Figure 2-6. Figure 3-1. Figure 3-2. Figure 3-3. Figure 3-4. Figure 3-5. Figure 3-6. Figure 3-7. Figure 3-8. Figure 4-1. Figure 4-2. Figure 4-3. Figure 4-4. Figure 4-5. Figure 4-6. Figure 4-7. Figure 4-8. Figure 4-9.
Figure 4-10. F = {0,1,2,2i−1,2i,2i+1}, N−2i ≡ 2 mod 4
Figure 4-11.
Figure 4-12. F = {0,1,2,N−16}
Figure 4-13.
Figure 4-14. F = {0,1,2,N−8}
Figure 4-15. F = {0,1,2,N−4}
Figure 4-16. F = {0,3,6,7,8}
Figure 4-17. F = {0,1,2,3,9}
Figure 4-18.
Figure 4-19. F = {0,3,12,13,14}
Figure 4-20. F = {0,3,10,2i−1,2i,2i+1}, N−2i ≡ 2 mod 4
Figure 4-21.
Figure 4-22. F = {0,4,2i−1,2i,2i+1}, N−2i ≡ 2 mod 4
Figure 4-23.
Figure 2-1. The three principal GMP windows
Figure 2-2. The Std.
Graphs menu The Miscellaneous menu - The Algorithms menu The Find Cycles algorithm is invoked on a Chordal Ring The Goodies Menu Two example broadcasts in a graph Two examples of MBTs Three rooted MBTs and their common underlying free tree ................... The unique trees Mam“), for I: =- 2,3,4 MBT (2‘)is constructed from two c0pies of MBT(2’) An MBT not having MBT(23) as a subgraph The graph 806(4) - An example broadcast in BBC (4) using Algorithm 1 The Chordal Ring CR(20.3) A 16-cycle in CR (20.3) - [0,1] A Hamiltonian cycle in CR(20,3) An 18-cycle in CR (203) - {0.3} An induction-by-two illustration An induction-by-four illustration An induction-by-six, version A illustration An induction-by-six, version B illustration . An advance-by-four illustration 1. F I: {0,12,2i-l,2i.2i+1] , N-2r' I0 mod 4 3. F :3 {0.1.2,N-12} 8. F a: {0,3,4,10,ll} 1. F = [0,3,10,2i—l,2i.2i+l},N-2i IOmod 4 - 3. F = [0,4,2i-l,2i,2i+l],N-2¢° n0mod4 - ‘wvvvwv- 16 17 18 19 21 25 30 3 l 32 34 36 48 57 59 61 61 65 67 68 69 7 1 71 72 72 73 75 76 77 78 79 80 80 81 Figure 4-24. Figure 4-25. Figure 4-26. Figure 4-27. Figure 428. Figure 4-29. Figure 4-30. Figure 4-31. Figure 4-32. Figure 4-33. Figure 4-34. Figure 4-35. Figure 4-36. Figrn'e 437. Figure 4-38. Figure 4.39 Figure 4-40 Figure 4-41. Figure 4-42. Figure 4-43. Figure 4-44. Figure 4-45. Figure 446. Figure 4-47. Figure 4—48. Figure 4-49. Figure 4-50 F ={0,4.N-8.N-7] . - - F a [094N49N‘5] F I: [0,4,N-4,N-3} - F = {05.10.11} - F = [0.5.12] F 3 {0.5.13} F a: [0,5,14,15,16] F = [0,5,2i-1,2i,2i+1},N-2i 32mm! 4 F a: {0,5,2i-l,2i,2i+l},N-2i 10mm! 4 F s [0.6.7,12,13] F s {0,63,13,14} F a (0.6.7.8,15] F '10.6.7.N-7l F a [0,6,7,N—8} F :- {0.6,7.N-9} - . F s [0.6.7.41' ,4i +1] . F = [0.6.7,4i 4241' +3} F = {0.6.7,N-10] - F :3 [0,73,15,16] F a {0.7.8.N-9} F = [0.7.8.N—9N-10] F a [0,7,8,N—ll] F = {0.7.8.2j +1} F = [0.7.8.20] F = [0.7.8.N-2j} F = {0.2} +1.1] F ={02j21t} viii 82 83 85 86 87 88 89 89 91 92 93 94 95 96 98 100 101 101 102 103 104 105 106 Chapter 1 Introduction Advances in computer technology have made it possible to construct machines with very large numbers of processors. Such machines have the potential to deliver very high performance for a large class of applications. In fact, the development of multiple- processor computers has made it possible to sharply reduce the amount of time required to solve large scale problems and as a consequence, to greatly increase the scale of prob- lems which can be solved in a reasonable amount of time. These problems are most often of a scientific nature, are very CPU intensive, and typically require large amounts of memory. Some examples of such problems include: aerodynamic simulations, weather forecasting, petroleum exploration, image processing, and artificial intelligence [HwBr84]. 1.1. Multiprocessors and Multicomputers At the highest level, the major components of the architecture of a multiprocessor system are the processors, memory unit(s) and the interconnection network through which the processors communicate with each other and access the memory unit(s). 2 Interconnection networks can be classified in two categories: static networks and dynamic networks [Feng81]. The difierence is that the physical connections may change in a dynamic network, while they always remain fixed in a static network. Examples of static networks include the ring, tree, 2-D mesh and hypercube. Dynamic networks include single-stage and multistage structures. 
Single-stage is also called recirculating network, which is composed of a stage of switch boxes cascaded to a link connection pat- tern. A typical multistage interconnection network in a multiprocessor system with N processors consists of logkN stages with N /k k X]: switch boxes at each stage, where 2 S k S 8. A multistage network can usually connect an arbitrary input to an arbitrary output. Examples of multistage networks include Omega (shufile-exchange) and Delta networks. Dynamic networks ofl’er greater flexibility at the cost of additional hardware and the overhead of managing the interconnection network [Agra83, BhAg83]. Over the years, two broad categories of multiple-processor computer systems have emerged, based largely on the degree of autonomy of the processors and the way the memory is organized and controlled. The first of these classes of systems is referred to as SIMD, an acronym for Single Instruction Multiple Data. SIMD machines are character- ized by the fact that at any given instant, every enabled processor in the system is execut- ing the same instruction as all the other processors, although each may be processing difl’erent data. For many applications with inherent massive data parallelism, SIMD machines can provide high performance at a reasonable cost. Several successful SIMD machines have been developed, including MPP, ICL’s DAP, and the Connection Machine [Flan77, Batc80, I-Iill85]. One of the best features of these machines is their case of programming. This feature is also the principal drawback in that the ease of pro- gramming is made possible by the limited range of applications which can be efi‘ectively executed. The other major category of multiple-processor machines is the MIMD class; MIMD being an acronym for Multiple Instruction Multiple Data. As the name indicates, at any given time each processor in the system may be executing a difierent instruction, and is typically processing difi‘erent data. These machines are a more general class of multiple-processor computer systems, since they may simulate the operation of SIMD machines. MIMD machines are further categorized by the way the memory is organized and managed. A shared-memory machine has a single global memory which is usually equally accessible to all its processors. Since having a single memory can lead to memory contention problems, the memory is often physically organized into modules and the processors access the modules through a communication network, most fre- quently multistage. Even so, the possibility for contention for access to specific modules can still be a problem. Processors communicate with one another by accessing pre- arranged locations in the shared-memory. Examples of shared memory machines include the BBN Butterfly, the NYU Ultracomputer, the IBM RP3 , the Sequent Balance/Symmetry, the Encore Multimax and the Cray X-MP/Y-MP [Gott83, Lars84, Crow85, PfisSS, Sten88, RCCI‘90]. The chief advantage of shared memory machines is that they allow the programmer the impression of a single address space. The major disadvantages are that the machines do not scale well and contention for memory access can lead to so-called "hot—spots" which can degrade performance. The second type of MIMD machine is referred to as a distributed-memory machine. Here each processor in the system has its own local memory (which it controls) and the processors are connected by a communication network (most frequently static). The pro- grammer does not have the advantages of a single global address space. 
Processors com- municate by passing messages through the communication network, and a message may have to go through several intermediate processors before reaching its destination. For this reason, these systems have also been referred to as message-passing or point-to-point systems. A number of research and development projects have been undertaken to con- struct such machines [Seit85, I-IMSC86, SAFM88]. Among the existing systems, the most dominant one is the hypercube which is commercially available in different variations including the Mark III, the iPSC-l, and the Ncube [PTLP85, Gui-1886, GrRe86]. In contrast to shared-memory machines, message passing architectures gen- erally scale better and are more robust. However, they are more diflicult to program efiectively; and since communication is usually slower, they are better suited to problems where the communication/computation ratio is relatively small. This dissertation con- cerns several problems related to this type of machine, which we will refer to as a Multi- computer (MC). 1.2. Motivation and Problem Statement There are many aspects to the design of an MC. As in the development of every computer there is the broad hardware/ software dichotomy; and within each of these two areas, one may consider many difl‘erent levels of design. Here we will be concerned with system-level issues. System-level means the consideration of the MC in a macroscopic fashion. In terms of software, the issues are the development of operating systems, com- pilers, programming tools, etc., all of which must address the MC as a complete entity. In the hardware area, system-level means viewing the MC as a collection of processor/memory-pairs connected by communication links. These two areas are highly interdependent, and a successful design must consider both areas as they relate to each other and to the design goals. In terms of hardware, at the system-level the selection of the topology of the inter- connection network is fundamental since it has a significant influence on virtually every other aspect of the MC including communication capabilities and system reliability. In a certain sense, the topology is the MC. Communication characteristics such as message delay and routing are directly related to the underlying topology [FaKr83]. Many research articles have focused on the relationship between the topology and one or more of these characteristics. As a result, many topologies for MCs have been proposed including cube-connected cycles, binary tree, 2D-mesh, binary DeBruijn networks, and k-ary n-cube [Prad81, PrVu81, HsYZS7, HwGh87, Dun88, ChCD88, Dall90]. This dissertation will examine how the topology of an MC influences its communication capa- bilities and reliability. Since MCs rely on passing messages, the performance of an MC is significantly tied to the types of communication that can be accomplished, and how quickly and efliciently they can be done. Experience has shown that several communication paradigms are use- ful in developing algorithms for MCs. These paradigms include: one-to-one where a pro- cessor wishes to send a message to another specific processor; one-to-all where a proces- sor wishes to send a message to all the other processors in the system; and finally one-to- many, which is a generalization of the previous two. These classes have also been referred to as unicast, broadcast and multicast, respectively. 
Some of the issues in implementing these communication paradigms are: the number of messages that a pro- cessor may simultaneously send or receive, the type of routing strategy (e.g., static or dynamic), and the type of switching strategy (e.g. packet switched. circuit switched, wormhole, virtual cut/through) [KeKl‘l9, DaSe87]. The performance of each of these communication schemes is afiected by the topology. In this dissertation we consider the problem of broadcasting in MCs and we assume that each processor may send or receive only one message at a time. The large number of components (such as processors and memories) and the overall complexity of such systems serve to exacerbate reliability problems, and much researeh has been done in studying reliability issues in MCs [KuRe80, PrRe82]. Several types of reliability have been distinguished based principally on the nature of the applications the MC is to execute. A highly available system may have frequent failures, but the failures are such that the system can be restored to proper functionality in a short time; these sys- tems are "up" most of the time. The Electronic Switching System (ESS) used by AT&T is designed for high availability [Prad86]. An ultra reliable system is one which has very few failures over a specified time interval. This type of reliability is demanded by applications where failures can have catastrophic results. Examples of such applications include operation of aircraft and medical monitoring equipment. One approach to developing reliable systems is to use ultra-reliable components in the construction. However, this can be an expensive proposition and does not always adequately address the problem. Another alternative is the so-called fault avoidance approach in which the system is designed to avoid faults from occurring. The approach we will consider in this dissertation is referred to as fault-tolerance. As the name implies, the idea is to design systems in which faults may be tolerated, and a system will be said to be fault-tolerant if it can remain functional in the presence of failures. At the system level, two types of failures are considered: the failure of a set of processors, and the failure of a set of communication links. What constitutes a functional system depends largely on the type of applications that the system will execute. Two basic functionality criterion have received much attention. According to one of these, an MC is considered to be functional as long as there is a non-faulty communi- cation path between all pairs of non-faulty processors. In graph theoretic terms, the underlying topology is connected. This type of functionality is applicable to the so- called coarse grain application, in which the algorithms are relatively insensitive to topology. Typically, each processor is made responsible for executing some subset of the tasks required to solve some (large) problem; in the event of a failure, the tasks are assigned to other processors. This is the so called fault-tolerance through degraded per- formance approach [ArLe81, Esfa88, RaGA85]. The second functionality criterion considers an MC to be functional only when a desired topology is contained in the system. In the past few years, much work has been done in developing eflicient, high performance parallel algorithms for various MCs. For many of these algorithms, the existence of certain topologies is a significant factor in delivering the desired performance. 
For such applications, the system should be able to provide a specific topology throughout the execution of the algorithm. This criterion was first formulated by Hayes, and subsequently a number of fault-tolerant (in the sense of the second criterion) topologies have been proposed [Haye76]. The basic approach in this case is to employ redundant components (links or processors) throughout the system. In the event of a failure, the system could then be reconfigured to exclude the faulty com- ponent by using one of the redundant ones. In this dissertation, fault tolerance will refer to the second functionality criterion, and we will consider only processor failures. Broadcasting and fault-tolerance have much in common in that they are both inti- mately tied to the topology of the MC. In addition, broadcasting can be an integral part of the reconfiguration process for a fault-tolerant system. Also, a broadcasting scheme for a fault-tolerant system must be robust enough to accommodate faulty system com- ponents. A number of papers have addressed problems common to these two areas [Bien88, LeHa88, Lie588]. 1.3. Thesis Organization As indicated, this dissertation will focus on two problems both of which are related to the topology of the interconnection network. Such a topology is conveniently modeled as a graph where the nodes of the graph represent the processors and the edges represent the communication links. Many problems in Computer Science and Engineering can be so formulated. However, it is often the case that such models are only practical for "small" instances of problems, since even modestly sized graphs can become incredibly complicated and unwieldy to work with. In addition, many of the operations one wishes to perform on such models are of a trial and error genre, and such operations become increasingly labor-intensive for all but the smallest graphs. To address this problem, we have developed a software package for graph manipulation which has been instrumental in obtaining some of the results in this dissertation and other research as well. This pack- age is described in Chapter 2, where we also present graph theoretic definitions and ter- minology which will be used throughout the dissertation. Broadcasting in MCs is discussed in Chapter 3. In particular, we examine the role that trees play in the broadcasting process. A partial characterization of a class of graphs called minimum broadcast trees is given. Finally, we present a distributed algorithm for doing broadcasting in networks based on Binary DeBruijn graphs. Fault-tolerance in MCs is the subject of Chapter 4. In this chapter we introduce a new approach to system-level fault-tolerance. The approach is applied to the case when the topology of the MC is a ring. For this case, a class of graphs called Chordal Rings is shown to be an Optimal solution. In the final chapter we present our conclusions, summarize our research contribu- tions and suggest some directions for future study. Chapter 2 Graphs and GMP We will use graphs to model the topology of an MC. The vertices of the graph represent processors and the edges of the graph represent the communication links. With this model in mind, we will use the terms vertex, node, and processor interchangeably and similarly for the terms edge and link. In this chapter, we present our mph definitions, notation and terminology which are common to both of the two problem areas that we consider in subsequent chapters. 
Definitions and terms which are specific to a problem area will be introduced in the context in which they are needed. We also describe a software graph manipulation package which we have developed and which has been useful in our research.

2.1. Graph Definitions, Notation, and Terminology

In this section we present our graph theoretic definitions and notation. Graph theoretic terms not defined here can be found in [Hara72]. Let G(V,E) be a finite graph without loops or multiple edges, with the vertex set V = V(G) and the edge set E = E(G). The order (size) of G(V,E) is equal to the cardinality of the vertex set (edge set), and is denoted by |V(G)| (|E(G)|) or simply |V| (|E|) when the context is clear. If an edge e = (u,v) ∈ E, then vertices u and v are said to be adjacent, and the edge e is said to be incident to these vertices. For a vertex v ∈ V, I(G:v) represents the set of all edges incident to v in G. Two vertices connected by an edge will be referred to as neighbors. The degree of a vertex v is d(v) = |I(G:v)|. The degree sequence of G is a non-decreasing list of the degrees of all the vertices in G. The minimum and maximum degrees of a graph G are δ(G) = min{d(v) | v ∈ V} and Δ(G) = max{d(v) | v ∈ V}, respectively. A graph G is r-regular if for all v ∈ V, d(v) = r. A graph is a cubic graph if it is 3-regular.

A graph H which has all of its vertices and edges in G is a subgraph of G, denoted H ⊆ G. If G is isomorphic to a subgraph of H, we say G may be embedded in H. If H is a subgraph of G, then G is a supergraph of H. A spanning subgraph of G is a subgraph containing all the vertices in G. For a set F ⊆ V(G), the notation G − F represents the subgraph of G obtained by removing from G all the vertices in F along with their incident edges.

A path P is a sequence of distinct vertices v0, v1, ..., vn−1, vn where (vi, vi+1) ∈ E(G), 0 ≤ i ≤ n−1. The length of the path is the number of edges in the path. A graph is connected if every pair of vertices is joined by a path. A cycle is a sequence of vertices v0, v1, ..., vn−1, vn where (vi, vi+1) ∈ E(G), 0 ≤ i ≤ n−1, n ≥ 3, v0 = vn, and all the other vertices are distinct. The cycle of order N, also called an N-cycle, will be denoted CN. When convenient, a cycle will be specified by listing, in order, the vertices in the cycle. For example, x0, x1, x2, x3, x0 defines a 4-cycle. A Hamiltonian graph contains a cycle passing through all its vertices. The distance between two vertices u and v, dist(u,v), is defined as the length of a shortest path joining these vertices. The diameter of a graph G, D(G), is the largest value of dist(u,v) in G.

A connected acyclic graph is also called a tree. The vertices of degree one in a tree T(V,E) are called the leaves of T. A rooted tree is one in which a specific node is labeled as the root. When no such designation is made, the tree is said to be free. The level of a node in a rooted tree is the length of the unique path from the root to the node. The height of the tree is the number of levels, or the length of the longest path from the root to any node in the tree. A spanning subgraph which is also a tree is called a spanning tree. G is a bipartite graph if V(G) can be partitioned into two disjoint subsets V1 and V2 such that every edge of G joins a vertex u ∈ V1 with a vertex v ∈ V2. In such a case G may be denoted G(V1,V2,E).

2.2. GMP - A Software Package for Graph Manipulation

In this section we give a brief description of the software package for graph manipulation that we developed.
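Before turning to the package itself, the definitions of Section 2.1 can be made concrete with a small amount of code. The following C sketch is a minimal illustration only (it is not the representation used inside GMP, whose internals are not shown here): it stores a graph of at most 100 vertices as an adjacency matrix, computes vertex degrees, and obtains all-pairs distances by breadth-first search, from which the diameter D(G) follows.

```c
#include <stdio.h>

#define MAXV 100            /* GMP also limits graphs to 100 vertices */
#define INF  (-1)           /* marks "no path" */

static int n;                /* order |V| */
static int adj[MAXV][MAXV];  /* adjacency matrix: adj[u][v] = 1 iff (u,v) in E */

/* degree d(v) = number of edges incident to v */
int degree(int v) {
    int d = 0;
    for (int u = 0; u < n; u++)
        d += adj[v][u];
    return d;
}

/* dist[u] = length of a shortest path from s to u (INF if unreachable) */
void bfs(int s, int dist[]) {
    int queue[MAXV], head = 0, tail = 0;
    for (int u = 0; u < n; u++) dist[u] = INF;
    dist[s] = 0;
    queue[tail++] = s;
    while (head < tail) {
        int u = queue[head++];
        for (int v = 0; v < n; v++)
            if (adj[u][v] && dist[v] == INF) {
                dist[v] = dist[u] + 1;
                queue[tail++] = v;
            }
    }
}

/* diameter D(G) = largest dist(u,v) over all pairs; assumes G is connected */
int diameter(void) {
    int dist[MAXV], d = 0;
    for (int s = 0; s < n; s++) {
        bfs(s, dist);
        for (int u = 0; u < n; u++)
            if (dist[u] > d) d = dist[u];
    }
    return d;
}

int main(void) {
    /* the 4-cycle x0, x1, x2, x3, x0 from Section 2.1 */
    n = 4;
    int edges[4][2] = { {0,1}, {1,2}, {2,3}, {3,0} };
    for (int i = 0; i < 4; i++) {
        adj[edges[i][0]][edges[i][1]] = 1;
        adj[edges[i][1]][edges[i][0]] = 1;
    }
    printf("d(x0) = %d, D(G) = %d\n", degree(0), diameter()); /* prints 2, 2 */
    return 0;
}
```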
A more detailed description, including a user’s manual, pro- mmmers manual, as well as the source code can be found in [ZiEs88]. We will describe the functionality of the package, its utilization to date and some suggestions for improve- ment. Graph theory has long become recognized as one of the more useful research tools in Computer Science and Engineering. Extensive use of mph theory has been made in areas such as topological design of networks, complexity theory, and design and analysis of algorithms. This dissertation considers two such problems. In the course of our work in mph theory, we had increasingly noticed the need for a software package which would enable the user to interactively manipulate graphs, test conjectures, etc. No such package was then available, partially due to the unavailability of afl'ordable mphics workstations. In the summer of 1986, when such workstations became accessible to us, we undertook a project to develop such a software package. We dubbed it GMP, an acronym for Graph Manipulation Package. The package was written in C and currently runs in the SunView environment on SUN workstations. The first version of GMP became available in the winter of 1987. It is an interactive program which allows users to visually manipulate mphs with up to 100 vertices. There 12 are essentially two major facets of the program. First, GMP allows the user to construct and modify mphs on the screen. This construction can be done manually using a mouse. Further, a menu of standard mphs is available, or the user can create a graph of-line and then lead it into GMP. Second, GMP allows the user to investigate the interaction of mph algorithms and mphs. A number of classic algorithms are built into GMP, and the program has been designed to allow users to easily incorporate new algorithms into the system. Facilities exist for storage and retrieval of mphs, as well as for obtaining publi- cation quality hardcopies of created mphs. In developing GMP, our initial intent was to produce an exploratory tool rather than a results-oriented, computational promm ala LINPACK, etc. We wanted something which would allow us to easily test conjectures and investigate ideas. This orientation had important consequences on the overall design. One such consequence was the deci- sion to limit the maximum size of mphs to 100 vertices. While the specific number was somewhat arbitrary, it seemed that any number much higher would have decreased the user’s ability to get a good understanding of what was going on. Also, our experience was that any size monitor gets crowded with more than 100 vertices. Our design goal also affected various aspects of the implementation. For example, data structures were selected for their simplicity and universality, rather than economy of memory, etc. Further, in adding an algorithm to the package, we generally chose a version which was easier to implement, rather than one with the best performance. This was in keeping with the above philosophy, and it was thought that for ”small" mphs, the decreased perfor- mance was probably minimal. 2.2.1. An overview of GMP The GMP user interface consists of three main windows: the message window, the control window, and the canvas window (Figure 2—1). Virtually all user input to GMP is done via the mouse, although the keyboard is required for some functions. The rummage l3 window is the long horizontal window at the top of the screen. One purpose of this win- dow is to provide feedback and instructions to users. 
This window also contains five pulldown menus which are accessed by clicking the RIGHT mouse button over the menu title (Figures 2-2 - 2-5). Each of these menus is described in more detail later. The long vertical window on the right is the control window. From this window the user can ini- tiate file system commands to store and retrieve mphs and select which manipulation command is to be active when the cursor is in the canvas window. The large square win- dow is the canvas window. This is the window in which the mph is drawn and manipu- lated. Most user feedback occurs through this window. As stated earlier, GMP has two major overlapping functions: 1) creating and mani- pulating mphs, and 2) invoking algorithms which act on the created mphs. Graphs can be created in several ways. Some standard mph definitions are available in the Std Graphs pulldown menu in the message panel. Selecting one of these options will create a mph of the specified type using the default number of vertices. In Figure 2-2, the Std. Graphs menu is shown with the Binary Tree option highlighted. The mph which is gen- erated by this command is shown in the canvas window. Note also the default parame- ters window, in which the parameter number of vertices is set at 20. This indicates the order of the mph that will be created by the Std. Graphs Menu. Graphs can be created manually using one of the manipulation commands selected from the control window. For example, select Add Vertex, move the cursor into the canvas and click where you would like a new vertex to be placed. In addition, you can type an adjacency matrix into a file, add an appropriate GMP header and then lead this mph into GMP. The vertices and edges of mphs created using one of these three methods all have default values for their parameters: edge weight, node weight, edge type, node type, etc. The default values for each may be set by the user, and the user may also manually set the value of any parameter for a particular vertex and/or edge. 14 The Miscellaneous menu (see Figure 2-3), as the name implies, contains a number of commands for various purposes. Among these are: displaying/hiding mph labels, manipulating the weights associated with the mph vertices/edges, and altering the lay- out of the mph. Figure 2-3 shows the main miscellaneous menu along with the Weight Options submenu. The Show Edge Weights option is highlighted and the results of this command are seen in the canvas: all edges in the mph shown have weight equal to 1.0. Users invoke algorithms on created mphs via the Algorithms menu (see Figure 2- 4). The Shortest Path algorithm is highlighted and the results of the command are displayed in the canvas. In this example, a shortest path from a source vertex (black— ened) to all the other vertices in the mph is shown by the ”double" edges. In addition, the distance from the source to each vertex is displayed next to the vertex. The figure also shows an example of another mph type available in the Std. Graphs menu: 2D- Mesh. A second example of the results of an algorithm is seen in Figure 2-5. The Find cycles algorithm was invoked and the user requested a search for a cycle of order 26. Such a cycle was found and displayed in the canvas using double edges. As mentioned, the package was designed to allow users to easily add new algo- rithms. A programmer’s manual is available containing prommming guidelines, exam- ples and a description of user-accessible utility procedures. 
User implemented algorithms are accessed through the User Algs menu. This menu is user-definable and may contain an unlimited number of user algorithms. Finally, the Goodies menu (Figure 2-6) allows the user to perform a variety of operations which alter the visual appearance of the graph.

2.2.2. Utilization of GMP

Since its inception, GMP has been used in conjunction with graph theory related courses in both the Mathematics and Computer Science departments at Michigan State University. In particular, in the Computer Science course entitled Analysis of Graph Algorithms, students have used the package in a discovery mode to investigate basic concepts in graph theory. The interactive and visual features have been extremely motivating in this context. The package also provides a framework from which one can develop and test new graph algorithms, as well as to study existing ones. Students have written and/or coded algorithms as projects for this class.

In addition to its instructional use, GMP has also been a valuable research tool. Briefly, it automates many of the mundane operations that a graph theorist would normally do by hand. This automation allows many conjectures to be easily tested and refined, a task which is often too onerous to be done manually. The package has been instrumental in obtaining results reflected in recent publications. In particular, GMP has expedited the formulation of a new approach to system-level fault-tolerance and communication paradigms for multicomputers [EsZi89, ZiEs90]. In addition, some of the proofs in Chapter 4 of this dissertation would not have been possible without it. It also served as a tool for producing all the graph related graphics in this dissertation and other recent publications.

The package has also been made available to researchers at other institutions including Western Michigan University, Grand Valley State University, George Washington University, Georgia Institute of Technology, Rutgers University, Boston University, University of Iowa, and Université Paris-Sud. The responses have all been positive. To date, GMP is unique in its category of software.

Figure 2-1. The three principal GMP windows.
Figure 2-2. The Std. Graphs menu.
Figure 2-3. The Miscellaneous menu.
Figure 2-4. The Algorithms menu.
Figure 2-5. The Find Cycles algorithm is invoked on a Chordal Ring.
Figure 2-6. The Goodies Menu.

Chapter 3

Broadcasting in Multicomputers

One type of information exchange that often arises in MCs and communication networks in general (e.g., computer networks, spy networks) is referred to as broadcasting. Here, one member of the network, called the originator, wishes to send a message to all the other members in the network, as rapidly as possible. The underlying topology of the network plays a significant role in how broadcasting should be done. In some networks, by default, every message transmitted is received by all members of the network, whereas in some others, a transmitted message is received by no more than one other member of the network. These networks are respectively referred to as broadcast networks and point-to-point networks [Tane88]. Here we are concerned with the problem of broadcasting in MultiComputer networks as defined in Chapter 1. However, since our focus is at the topological level, our discussion is also applicable to general point-to-point networks, and we hereafter refer simply to networks.

There are many applications of broadcasting in such networks. In general terms, broadcasting is necessary whenever some information must be relayed to all the other members of the network. Some specific examples include: synchronizing processors in a distributed network, reconfiguring processors to achieve a desired logical interconnection, diagnosing the state of a distributed system, and communicating results from parallel algorithm computations [Prad85].
We will examine the role that trees play in broadcasting and we present a disu'ibuted broadcasting algorithm for a particular class of networks based on Binary DeBruijn Graphs. 3.1. Background We first consider the broadcasting process in general. In this process, one node called the originator wishes to send a message to all the other nodes in the network. The originator begins the process by making a "call" to another node in the graph, informing it of the message. Subsequently, the informed nodes call their uninformed neighbors and the process continues until all nodes in the network are informed. It has been suggested [FHMP79] that each node be involved in at most one call during each time interval, although calls between distinct pairs of nodes may take place concurrently. We assume that all calls take the same amount of time; the time interval during which a call takes place will be referred to as a time step or simply step. Thus during the first step of the broadcasting process, the originator calls another node in the graph. At the end of the first step, two nodes are informed. Since there is no benefit gained from a node receiving the same message twice, we specify that each node (except the originator) be the receiver of exactly one call during the broadcast. 24 Several types of broadcasting have been distinguished in the literature [Farl80]. In local broadcasting a node may only call one of its neighbors. In line broadcasting a node may call any other node to which it is connected by a path; however, the edges used to make the call must not be used by any other call during that step. Path broadcasting is similar to line broadcasting, except that it is further required that no node is used in more than one call in any time step. The difierence between line and path broadcasting is that in the former, a node may allow several calls to pass through during any step, whereas in the latter each node may participate in at most one call per step. In both cases, however, only the starting node in a broadcast path need be informed; all the intermediate nodes in the path can act as a switch and relay the message without being aware of its contents. This assumption may not be very realistic; if a node is to relay a message it might as well examine its contents since, in this context, the same message will be sent to it eventually. Here, we will consider only local broadcasting and unless stated otherwise, we will use the terms broadcasting and local broadcasting interchangeably. The broadcast process is illustrated in Figure 3-1. The top graph represents the topology of a communication network. Broadcasts originating from nodes e and f are shown in the middle and bottom graphs respectively. The edges used in the broadcast are highlighted with arrows to indicate the sender/receiver and are labeled with the time step in which they are used. Unused edges are shown as dashed lines. Note that the broadcast was completed in five time steps in the middle graph; the lower graph used only four. It is clearly desirable to complete broadcasting in a minimum number of time steps. For a general graph with N nodes, the minimum amount of time to complete a broadcast is at least [log N 1 (all logarithms are in base 2). This result follows from the fact that the number of informed nodes can at most double after each step of the process and may be proved formally by a simple inductive argument [FHMP79]. 
Clearly, the "location" of the originator within the graph affects the amount of time needed; an originator which is "in the middle" of the graph may require fewer time steps than one on the "periphery".

Figure 3-1. Two example broadcasts in a graph.

For example, in Figure 3-1, broadcasting from node a will require more time than broadcasting from node e. In addition, the sequence in which an informed node informs its uninformed neighbors affects the number of time steps for a particular broadcast. In the middle graph of Figure 3-1, e sends messages to nodes b, a, and f, in that order. However, it is not difficult to see that the number of steps can be reduced to four by using the order f, b, a. These considerations motivate the following definitions.

Definition 3.1. The minimum time (in number of time steps) required to complete broadcasting in a graph G, when node v is the originator, is denoted bt(G:v). The minimum is taken over all possible orderings of calls.

Definition 3.2. Let bt(G) be the best possible broadcast time for a particular graph G. We have bt(G) = min{bt(G:v) | v ∈ V}. Note that bt(G) is not necessarily equal to ⌈log |V(G)|⌉.

Definition 3.3. Let BT(G) be the worst possible broadcast time for a particular graph G. We have BT(G) = max{bt(G:v) | v ∈ V}.

Definition 3.4. The broadcast center of a graph G, BC(G) = {v ∈ V | bt(G:v) = bt(G)}, is the set of all nodes which, when chosen as the originator, can achieve the best possible broadcast time for G. This set is clearly nonempty and may have more than one element.

Definition 3.5. A calling schedule for a broadcast is a set of statements of the form "node i calls node j during time step t". The calling schedule is a legal calling schedule if it satisfies the criteria [Farl79]:
i) During a given time step, each node may participate in at most one call.
ii) For each call scheduled for step t, the sender is either the originator or was called during a step s, s < t.

Consider the set SMBT(N) of minimum broadcast trees of order N, that is, trees T of order N for which broadcasting can be completed in ⌈log N⌉ time steps, and in particular the case N = 2^k (k > 0). These trees contain the maximum number of nodes that can be informed in k time steps. The trees in this class (also known as binomial trees) are, in a sense, completely defined with respect to many different properties, some of which are given below. We will utilize these properties in the characterization theorems for general MBTs in the next section.

Property 1. For each value of k > 0, the set SMBT(2^k) has exactly one element. That is, up to isomorphism, there is a unique minimum broadcast tree with 2^k nodes, and we will denote it by MBT(2^k). Figure 3-4 illustrates some examples of trees from this class. The labels indicate the time step in which each node becomes informed.

Figure 3-4. The unique trees MBT(2^k), for k = 2, 3, 4.

Property 2. MBT(2^k) can be constructed by combining two copies of MBT(2^(k-1)); the tree can be decomposed into two symmetric halves. The construction is depicted in Figure 3-5. Let u be an originator in one copy of MBT(2^(k-1)) and v be an originator in a second copy of MBT(2^(k-1)). A single edge is added between the originators, yielding a tree with exactly 2^k nodes. To see that this is also an MBT, choose u (or v, arbitrarily) to be the originator for the new tree. Initially, u calls v, using 1 time unit. Since each of u and v is the originator in their respective subtrees, the remaining nodes may be informed in k−1 steps. It is clear that all trees MBT(2^k) can be so constructed.

Property 3. The degree sequence of MBT(2^k) is completely determined and can be computed recursively.
In particular, there are two nodes of degree k: one of which must be the originator of the broadcast, and the other must be the first node to become informed in the broadcast process. And, for each d, 1 ≤ d ≤ k−1, there are 2^(k−d) nodes of degree d.

Property 4. There is exactly one calling schedule for MBT(2^k), once the originator is fixed.

Property 5. The number of nodes at any level l in MBT(2^k) is given by the binomial coefficient C(k, l), hence the name binomial tree.

Property 6. The number of levels in MBT(2^k) is exactly k.

Property 7. The number of leaves in MBT(2^k) is exactly 2^(k−1).

3.3.2. Characteristics of general minimum broadcast trees

We now present some results for general MBTs in the form of several theorems. These theorems represent some necessary (and sufficient) conditions for a tree to be a minimum broadcast tree. These results will serve as a first step in characterizing minimum broadcast graphs.

Theorem 3-1: A tree T(V,E) of order N is a member of SMBT(N) if and only if T is a subgraph of MBT(2^k), where 2^(k−1) < N ≤ 2^k.

Figure 3-5. MBT(2^4) is constructed from two copies of MBT(2^3).

Proof: (⇒) By definition, there is a legal calling schedule C which will complete the broadcast in T in k = ⌈log N⌉ time steps. Begin the broadcasting. Let step τ be the first step during which some node has no neighbors left to inform, and let S(τ) = {v ∈ V | v and all of its neighbors were informed before step τ}. For each node v ∈ S(τ), add a new node u_v to V and add a new edge to E connecting u_v and v. Inform all the newly created nodes. Continue this process during each subsequent step, through time step k. It is clear that in each step we inform as many nodes as possible (by adding new nodes where necessary, it is assured that the number of informed nodes must double) and that all nodes are informed in k steps. But MBT(2^k) is the only tree with these properties. Thus we have constructed MBT(2^k) by adding nodes and edges to T, and hence the tree T is a subgraph of MBT(2^k).

(⇐) Suppose T ⊆ MBT(2^k). By Property 4, there is a unique calling schedule C' for MBT(2^k). Now remove all elements of C' which refer to a node not in T. We thus obtain a legal calling schedule C, C ⊆ C', for T which completes the broadcast in k time steps. Thus T ∈ SMBT(N). ∎

Observe that the above proof implies that we can embed T in MBT(2^k) in such a way that the originator of T is embedded onto the originator of MBT(2^k). It should be noted that it is not necessarily the case that MBT(2^(k−1)) is a subgraph of an MBT on N vertices, where 2^(k−1) < N ≤ 2^k. This is illustrated in Figure 3-6, where an MBT with 9 nodes (right) is constructed by inserting a node (labeled c) into the tree MBT(2^3) (left). It is clear from the figure that the original tree (left) is not a subgraph of the new one, illustrating that an MBT need not have MBT(2^k), k > 2, as a subgraph.

Figure 3-6. An MBT not having MBT(2^3) as a subgraph.

Theorem 3-2: Let T(V,E) ∈ SMBT(N). Then Δ(T) ≤ ⌈log N⌉.

Proof: Let v ∈ V be a node whose degree d(v) > ⌈log N⌉. Suppose v is an originator of T. Since a tree is acyclic, v must inform each of its adjacent nodes, and this takes more than ⌈log N⌉ steps, contradicting the definition of T. Now suppose v is not an originator. It takes at least 1 time step to inform v, using one of the edges incident to v; v then has d(v)−1 uninformed neighbors. Since the only path to these nodes is through v, v must send the message to each of them in turn. This requires d(v)−1 calls.
Thus the time to complete the broadcast is at least d(v) > ⌈log N⌉, again violating minimality. Thus no such v exists. ∎

Note that the converse of the above theorem is not necessarily true; it is easy to construct trees with small maximum degree which are not MBTs. For example, consider a path P of order 2^k, k > 2. To be an MBT, it would have to be possible to complete a broadcast in P in t ≤ k time steps. It is clear that this cannot be done, the reason being that each vertex (other than the originator) may only inform one other vertex, and there are more vertices in the tree than can be informed in the required number of time steps.

An interesting related question is: how many nodes can be informed in a tree with maximum degree Δ after t time steps? In Lemma 3-1, a recurrence relation is established which relates the maximum number of nodes in a tree T which can be informed during time step τ to the maximum degree of the tree, Δ.

Lemma 3-1: Let T be a tree with maximum degree Δ and MAX(τ,Δ) be the maximum number of nodes in T which can be informed during time step τ of a local broadcast in T. Then the following recurrence relation holds:

MAX(τ,Δ) = 2^(τ−1),                                               1 ≤ τ ≤ Δ
MAX(τ,Δ) = MAX(τ−1,Δ) + MAX(τ−2,Δ) + ··· + MAX(τ−Δ+1,Δ),          τ > Δ

Proof: The correctness of the recurrence relation may be seen as follows. Consider an infinite rooted tree in which all nodes have degree Δ. With the root of the tree being the originator, we wish to know how many nodes of this tree may be informed during time step τ. During each step τ ≤ Δ, each informed node has unused edges which may be used to inform an uninformed node. Thus we generate MBT(2^τ), and consequently 2^(τ−1) nodes are informed during time step τ. Now, consider any τ > Δ. Every node which is first informed at time step τ−Δ+1 has exactly Δ−1 edges by which it can inform other nodes. Since it may inform only one node per time step, it will inform one node during time steps τ−Δ+2, τ−Δ+3, ..., τ. Thus every node which becomes informed at step τ−Δ+1 will inform a node during time step τ. Similarly, every node which becomes informed at time step τ−Δ+k will inform a node during steps τ−Δ+k+1, τ−Δ+k+2, ..., τ. This yields the stated recurrence relation. ∎

The relation of Lemma 3-1 can be used to determine the maximum number of nodes which can be informed in a tree T with maximum degree Δ, in t time steps. Let M(t,Δ) represent this number. Counting the originator, we have

M(t,Δ) = 1 + Σ_{i=1}^{t} MAX(i,Δ)     (3.1)

Table 3-1 shows the values of M(t,Δ) for Δ = 2, 3, 4, 5 and t = 1, ..., 20. The entries in the second through fifth columns of the table are the values of M(t,Δ). The last column gives the lower bound on the number of nodes which must be informed in order for a tree to be an MBT.

Table 3-1. The values of M(t,Δ).

  t   Δ=2    Δ=3     Δ=4      Δ=5     Minimum
  0     1      1       1        1         0
  1     2      2       2        2         1
  2     4      4       4        4         2
  3     6      8       8        8         4
  4     8     14      16       16         8
  5    10     24      30       32        16
  6    12     40      56       62        32
  7    14     66     104      120        64
  8    16    108     192      232       128
  9    18    176     354      448       256
 10    20    286     652      864       512
 11    22    464    1200     1666      1024
 12    24    752    2208     3212      2048
 13    26   1218    4062     6192      4096
 14    28   1972    7472    11936      8192
 15    30   3192   13744    23008     16384
 16    32   5166   25280    44350     32768
 17    34   8360   46498    85488     65536
 18    36  13528   85524   164784    131072
 19    38  21890  157304   317632    262144
 20    40  35420  289328   612256    524288

For example, with t = 5, 17 or more nodes must be informed for the tree to be an MBT. If 16 or fewer are informed, then by definition, this cannot represent an MBT. The table indicates, in particular, that the largest MBT with maximum degree 2 has 6 nodes.
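The entries of Table 3-1 can be regenerated mechanically from Lemma 3-1 and equation (3.1). The short C program below is only a sketch of that computation (it is not the program used to produce the dissertation's tables); the extra 1 in M(t,Δ) accounts for the originator.

```c
#include <stdio.h>

/* MAX(tau, delta) from Lemma 3-1: nodes newly informed during step tau
   in a tree of maximum degree delta.  Assumes tau <= 63. */
long max_informed(int tau, int delta) {
    long MAX[64];
    for (int s = 1; s <= tau; s++) {
        if (s <= delta) {
            MAX[s] = 1L << (s - 1);   /* 2^(s-1) while unused edges remain */
        } else {
            MAX[s] = 0;               /* sum of the previous delta-1 terms */
            for (int j = s - delta + 1; j <= s - 1; j++)
                MAX[s] += MAX[j];
        }
    }
    return MAX[tau];
}

/* M(t, delta) from equation (3.1): total nodes informed after t steps,
   including the originator. */
long total_informed(int t, int delta) {
    long m = 1;
    for (int i = 1; i <= t; i++)
        m += max_informed(i, delta);
    return m;
}

int main(void) {
    /* Reproduces the t = 7 row of Table 3-1: 14, 66, 104, 120. */
    for (int delta = 2; delta <= 5; delta++)
        printf("M(7,%d) = %ld\n", delta, total_informed(7, delta));
    return 0;
}
```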
Theorem 3-3: Let T(V,E) ∈ SMBT(N) and let λ be the number of levels in T. Then ⌈log N⌉/2 ≤ λ ≤ ⌈log N⌉.

Proof: The right-hand inequality can be seen by recalling that, by Theorem 3-1, T is a subgraph of MBT(2^k), where k = ⌈log N⌉. By Property 6, the number of levels in MBT(2^k) is k = ⌈log N⌉. We prove the left-hand inequality by contradiction. Suppose λ < ⌈k/2⌉. First, note that T cannot have fewer than 2^(k−1)+1 nodes (otherwise it would not belong to SMBT(N)). By Property 5, the number of nodes in MBT(2^k) has a binomial distribution over the levels of the tree. Embed T onto MBT(2^k) and then remove all nodes in levels ⌈k/2⌉ and greater. We now consider two cases.

Case 1: k is odd. The distribution of nodes is symmetric with respect to the levels in the tree. Here we have removed exactly half of the nodes from MBT(2^k), so that the number of remaining nodes is 2^(k−1).

Case 2: k is even. The distribution of nodes is asymmetric. Here we have removed the middle level as well as the levels below it, and the number of nodes remaining is less than 2^(k−1).

In both cases we must have removed nodes from T in the pruning process, since the pruned tree has fewer nodes than T is required to have. This yields the contradiction. ∎

Theorem 3-4: Let T(V,E) ∈ SMBT(N) and let ρ be the number of leaves in T. Then ρ ≤ 2^(⌈log N⌉−1).

Proof: Let k = ⌈log N⌉ and embed T within MBT(2^k). Recall Property 7. Now prune MBT(2^k) by removing leaves, one at a time, to yield T. Clearly, the pruning process cannot increase the number of leaves. ∎

3.3.3. Broadcast times for general trees

We have seen that for an arbitrary tree T, ⌈log N⌉ ≤ bt(T) ≤ N−1, and by applying some of the above results we can further refine these bounds. Even so, this represents a large range of broadcast times. We examined all possible trees on N nodes (1 ≤ N ≤ 10) and noticed that in a large percentage of these, broadcasting can be done in the optimal or near-optimal number of steps. This motivated us to investigate the distribution of broadcast times over all the trees of a given order.

In [WROM86], an algorithm is presented which generates all free trees of a given order. Combining this algorithm with the algorithm to compute bt(T) from [SlCH81], we wrote a computer program to calculate the distribution mentioned above for all orders N, 4 ≤ N ≤ 28. The correctness of the program was verified by carefully checking that each algorithm was accurately translated into the C programming language and by comparing the results of the program for trees up to order 10 with previously known results. The results are given in Table 3-2, from which we make the following observations.

Table 3-2. Distribution of broadcast times for all trees T, 4 ≤ |T| ≤ 28.
Olderof Tree 1 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 l 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 l 2 4 2 1 0 0 0 0 O 0 0 0 0 0 0 4 O l l 7 17 28 42 46 45 29 16 4 l 0 0 0 5 0 0 l 1 3 14 52 147 370 788 1543 2727 4516 6867 9758 12715 6 0 0 0 l 1 3 7 30 105 3% 1293 3935 1M0 28407 69110 159211 7 0 0 0 0 1 l 3 7 19 63 229 848 3134 “”51 36354 114449 8 0 0 0 0 0 l l 3 7 19 47 149 494 1840 6974 26142 9 0 0 0 0 0 0 1 l 3 7 19 47 127 359 1136 3977 10 O 0 0 0 0 0 0 1 1 3 7 19 47 127 330 926 11 0 0 0 0 0 0 0 0 1 1 3 7 19 47 127 330 12 0 0 O 0 0 0 0 0 0 1 l 3 7 19 47 127 13 0 0 0 0 0 0 0 0 0 0 1 l 3 7 19 47 14 0 0 0 0 0 0 0 0 0 0 O l l 3 7 19 15 0 0 0 0 0 0 0 0 0 0 0 0 1 1 3 7 l6 0 0 0 0 0 0 0 O 0 0 0 0 0 l 1 3 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 41 Table 3-2 (cont’d). Order of Tree 1 20 21 22 23 24 25 26 27 28 1 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0 0 5 15334 16814 16818 15022 11966 8249 4900 2390 954 6 350268 739433 1505525 2963910 5657301 1048m29 18865609 33m4506 5613583 7 343884 989511 2741135 7336115 1%39628 48059182 118313484 284672624 670891043 8 94433 328241 1097455 3542358 1 1075203 33654560 99685018 288543207 817941510 9 14990 57687 219887 816259 2936970 10237157 34637275 1 14047524 366427678 10 2732 8971 32196 13918 478692 1853958 7(354-86 26016218 93624639 11 889 2424 6938 21293 71723 262261 101 1898 39853” 15655171 12 330 889 2378 6506 18162 53365 168262 578628 2146253 13 127 330 889 2378 6450 17577 48756 139014 417062 14 47 127 330 889 2378 6450 17510 47986 132470 15 19 47 127 330 889 2378 6450 17510 47907 16 7 19 47 127 330 889 2378 6450 17510 17 3 7 19 47 127 330 889 2378 6450 18 1 3 7 19 47 127 330 889 2378 19 1 l 3 7 19 47 127 330 889 20 0 l 1 3 7 19 47 127 330 21 0 0 1 l 3 7 19 47 127 22 0 0 0 1 l 3 7 19 47 23 0 0 0 0 1 1 3 7 19 24 0 0 0 0 0 1 1 3 7 25 0 0 0 0 0 0 1 1 3 26 0 0 0 0 0 0 0 l 1 27 0 0 0 0 0 0 0 0 1 42 Observations from Table 3-2. 1 The distribution is heavily skewed towards the lower values of 1:. That is, in the vast majority of trees of a given order , broadcasting can be accomplished in near the theoretical minimum time. 2 A pattern is evident in the "tails" of each column. Most obvious is the diagonal of 1’s at the end of each column. Moreover, moving from left to right, the entries in each tail become fixed after a certain point. For each column after the "10" column, the last 4 entries are always 7,3,1,l. In fact the number of "fixed entries" increases by one for each even tree order. 3.4. Broadcasting in Binary DeBruijn Graphs Binary De Bruijn graphs (BDG) are graphs whose interconnections are determined by a simple coding scheme based on a numbering of the vertices in the graph. Networks modeled after binary De Bruijn graphs will be referred to as binary De Bruijn networks (BDNs). It has been shown that many network topologies can be embedded within BDNs. The ring, one dimensional array, complete binary tree and shuflie exchange net- works are some examples of such networks. Each of these networks is particularly well suited for solving certain classes of problems [SaPr89]. For example, the shuflle exchange network admits efiicient computation of FFT; a one dimensional array network is effective in solving problems in the pipeline class, among which is matrix-vector multi- plication. 
A single BDN can be configured to act as any of a number of special-purpose networks and thus can serve as a general-purpose network for the efficient solution of a diverse set of problems. These facts make BDNs an attractive choice for the architecture of an MC.

We have noted some of the issues related to broadcasting in general graphs. For an arbitrary graph, determining an optimal broadcast schedule is NP-hard. We have given a lower bound for bt(G) and noted that the value of bt(G) is related to the maximum degree of G. In this section, we consider the problem of broadcasting in BDNs. We first give a precise definition of binary De Bruijn graphs and some of their properties. Then we present a distributed algorithm for broadcasting in BDNs which requires 2·log2(N) − 1 time steps, where N is the number of nodes in the network.

3.4.1. Binary De Bruijn graphs

The binary De Bruijn graph, BDG(n), has 2^n vertices and can be constructed using Procedure 3.1. It is not difficult to see that each vertex (b_{n−1}, b_{n−2}, ..., b_0) in BDG(n) is adjacent to vertices (b_{n−2}, b_{n−3}, ..., b_0, a) and (a, b_{n−1}, b_{n−2}, ..., b_1), where a ∈ {0,1}. This makes BDG(n) a regular graph with the degree of each vertex equal to four.

Procedure 3.1. Construct BDG(n).
1. Label the 2^n vertices using distinct binary n-tuples (b_{n−1}, b_{n−2}, ..., b_0).
2. Connect (via an edge) each vertex (b_{n−1}, b_{n−2}, ..., b_0) to vertices (b_{n−2}, b_{n−3}, ..., b_0, 0) and (b_{n−2}, b_{n−3}, ..., b_0, 1).

The fact that Δ(BDG(n)) = 4 (the number of connections at each processing element is fixed) and that the diameter of BDG(n) is equal to n means that the diameter of the network increases only logarithmically with the number of vertices. This makes BDG(n) particularly attractive as an interconnection network. Additionally, many useful topologies are contained in BDG(n), including a 2^n-vertex linear array, a 2^n-vertex ring, a (2^n − 1)-vertex complete binary tree, a (3·2^(n−2) − 2)-vertex tree machine, and a 2^n-vertex one-step shuffle-exchange graph [SaPr89]. As mentioned in Section 3.4, each of these topologies is useful for solving various classes of problems. The fact that De Bruijn graphs can accommodate these topologies makes them an especially powerful topology.

The construction given in Procedure 3.1 implies that the two vertices with n-tuple labels (0,0,...,0) and (1,1,...,1) have a loop edge. Further, the two vertices with labels (0,1,0,1,...) and (1,0,1,0,...) (i.e., alternating 0's and 1's) have two edges between them. In a practical situation, the loop edges would not be used and the doubled edge would be implemented as a single edge. Figure 3-7 shows BDG(4) with the above modifications. In what follows, BDG(n) will refer to this modified graph.

Figure 3-7. The graph BDG(4).
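Procedure 3.1, together with the modification just described, is easy to realize directly. The sketch below is ours, not part of the dissertation; bdg and eccentricity are illustrative names and Python is used only for convenience. It builds the modified BDG(n), with self-loops dropped and the doubled edge collapsed, and checks the order, the degree bound, connectivity, and the logarithmic distance bound from the all-zero vertex.

# Illustrative sketch (not from the dissertation): Procedure 3.1 with the practical modifications.
from collections import deque

def bdg(n):
    """Modified BDG(n): each vertex v is joined to its left shifts with an appended 0 or 1."""
    mask = 2 ** n - 1
    adj = {v: set() for v in range(2 ** n)}
    for v in range(2 ** n):
        for a in (0, 1):
            u = ((v << 1) & mask) | a          # drop b_{n-1}, shift left, append a
            if u != v:                         # skip the loop edges at 00...0 and 11...1
                adj[v].add(u)
                adj[u].add(v)                  # the right-shift adjacency is the same undirected edge
    return adj

def eccentricity(adj, s):
    """Longest shortest-path distance from s, and the number of vertices reached."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return max(dist.values()), len(dist)

n = 4
G = bdg(n)
assert len(G) == 2 ** n
assert max(len(nbrs) for nbrs in G.values()) <= 4      # at most four connections per vertex
ecc, reached = eccentricity(G, 0)
assert reached == 2 ** n                               # the graph is connected
assert ecc == n      # vertex 00...0 reaches every vertex within n hops, and 11...1 needs exactly n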
3.4.2. A distributed broadcast algorithm for BDG(n)

In this section we present an algorithm which generates a calling schedule for broadcasting in BDG(n). Observe that bt(BDG(n)) ≥ n; however, in general bt(BDG(n)) > n. This is due to the fact that Δ(BDG(n)) is a constant. The formula (3.1) from Section 3.3.2 can be used to rigorously prove that BDG(n) is not a minimum broadcast graph in general. Our broadcasting algorithm for BDG(n) generates a calling schedule that requires exactly 2n−1 time steps to complete broadcasting, irrespective of the choice of the originator. Also, the control of the broadcast process described by the algorithm is distributed, in the sense that each vertex can determine to whom it should transmit the message. It is only required that the label of the originator of the broadcast accompany the message.

Central to our broadcast algorithm is an integer-valued function BTS(A), called the broadcast time step function, defined on binary m-tuples A. Let A = (a_{m−1}, a_{m−2}, ..., a_0) and define K(a_i, a_j), 0 ≤ i, j ≤ m−1, as follows:

K(a_i, a_j) = 2 if a_i ⊕ a_j = 0, and 1 otherwise,

where ⊕ is the binary exclusive-OR operation. BTS is used to determine the time step at which a vertex becomes informed. Now, if m ≤ 1 then BTS(A) = 0; otherwise BTS(A) is defined as:

BTS(A) = Σ_{i=0}^{m−2} K(a_i, a_{i+1}).

Example 3.1. Let A = (1,0,1,0,0,1,0). Then BTS(A) = 1+1+2+1+1+1 = 7.

Note that the value of BTS(A) is maximum if and only if a_{m−1} = a_{m−2} = ··· = a_0, and the maximum value is 2(m−1).

Intuitively, our broadcasting algorithm proceeds as follows. Each vertex will inform at most two of its adjacent vertices. In particular, for a vertex B = (b_{n−1}, b_{n−2}, ..., b_0), let LSNE(B) and LSCE(B) be two of its adjacent vertices whose labels, respectively, are LSNE(B) = (b_{n−2}, b_{n−3}, ..., b_0, b_0) and LSCE(B) = (b_{n−2}, b_{n−3}, ..., b_0, b̄_0), where b̄_0 is the bit complement of b_0. (LSNE and LSCE stand for Left Shift with Normal Extension and Left Shift with Complement Extension, respectively.) Vertex B will attempt to inform vertices LSCE(B) and LSNE(B), in that order, provided that these vertices have not already been informed. Observe that this scheme implies that a vertex can potentially be informed by two other vertices, since for any vertex A = (a_{n−1}, a_{n−2}, ..., a_0) there are exactly two other distinct vertices X and Y such that either LSNE(X) = LSNE(Y) = A or LSCE(X) = LSCE(Y) = A, namely X = (0, a_{n−1}, a_{n−2}, ..., a_1) and Y = (1, a_{n−1}, a_{n−2}, ..., a_1). The function BTS will be used to check whether a vertex is already informed or not.

We proceed by stating our broadcast algorithm, Algorithm 1, formally. An example of its execution is given, and then we prove a theorem which implies the correctness of the algorithm. It is assumed that the broadcast originator is vertex S = (s_{n−1}, s_{n−2}, ..., s_0) and that this information accompanies the broadcast message. Each vertex B in BDG(n), once informed of the broadcast message, will perform Algorithm 1.

Algorithm 1.
1: Find the largest i such that B = (b_{n−1}, b_{n−2}, ..., b_0) = (s_{i−1}, s_{i−2}, ..., s_0, b_{n−i−1}, b_{n−i−2}, ..., b_0).
2: /* check LSCE neighbor */
   IF LSCE(B) = S goto 4.
   ELSE find the largest j such that LSCE(B) = (s_{j−1}, s_{j−2}, ..., s_0, b_{n−j−2}, b_{n−j−3}, ..., b_0, b̄_0).
3: IF BTS(s_0, b_{n−j−2}, b_{n−j−3}, ..., b_0, b̄_0) > BTS(s_0, b_{n−i−1}, b_{n−i−2}, ..., b_0) THEN inform LSCE(B).
4: /* check LSNE neighbor */
   IF LSNE(B) = S STOP.
   ELSE find the largest k such that LSNE(B) = (s_{k−1}, s_{k−2}, ..., s_0, b_{n−k−2}, b_{n−k−3}, ..., b_0, b_0).
5: IF BTS(s_0, b_{n−k−2}, b_{n−k−3}, ..., b_0, b_0) > BTS(s_0, b_{n−i−1}, b_{n−i−2}, ..., b_0) THEN inform LSNE(B). STOP.

Example 3.2. Let us trace the execution of Algorithm 1 for vertex B = (1,0,0,1) when the network is BDG(4) and the originator of the broadcast is vertex S = (0,0,1,0). In this case, LSNE(B) is the vertex (0,0,1,1) and LSCE(B) is the vertex (0,0,1,0). The value of i found in step 1 is 2. Since LSCE(1,0,0,1) = (0,0,1,0) = S, execution continues from step 4. In step 4, the value of k is found to be 1. Thus, in step 5 we have BTS(s_0, b_1, b_0, b_0) = BTS(0,0,1,1) = 2+1+2 and BTS(s_0, b_1, b_0) = BTS(0,0,1) = 1+2, and therefore, since BTS(s_0, b_1, b_0, b_0) > BTS(s_0, b_1, b_0), LSNE(B) will be informed.
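Algorithm 1 and the function BTS can be exercised directly. The sketch below is ours, not part of the dissertation; Python and the helper names bts, lsne, lsce, sched_time, plan and broadcast are illustrative. Labels are n-character bit strings, sched_time(v, s) is the quantity BTS(s_0, b_{n−i−1}, ..., b_0) used in steps 3 and 5, and broadcast simulates the resulting calling schedule, reproducing Example 3.2 and the 2n−1 bound claimed by Theorem 3-5 below.

# Illustrative sketch (not from the dissertation): the BTS function and Algorithm 1 on BDG(n).
def bts(bits):
    """BTS: 2 for every pair of equal adjacent bits, 1 for every unequal pair."""
    return sum(2 if a == b else 1 for a, b in zip(bits, bits[1:]))

def lsne(b):                                    # Left Shift with Normal Extension
    return b[1:] + b[-1]

def lsce(b):                                    # Left Shift with Complement Extension
    return b[1:] + ('1' if b[-1] == '0' else '0')

def sched_time(v, s):
    """BTS(s_0, b_{n-i-1}, ..., b_0), with i the largest prefix of v matching the low-order bits of s."""
    n = len(s)
    i = max(i for i in range(n + 1) if v[:i] == s[n - i:])
    return bts(s[-1] + v[i:])

def plan(b, s):
    """Steps 2-5 of Algorithm 1: the neighbours b will call, LSCE first, then LSNE."""
    return [nb for nb in (lsce(b), lsne(b))
            if nb != s and sched_time(nb, s) > sched_time(b, s)]

def broadcast(n, s):
    """Simulate the calling schedule; each informed vertex makes one call per step."""
    informed, pending, step = {s: 0}, {s: plan(s, s)}, 0
    while any(pending.values()):
        step += 1
        for v in [w for w, todo in pending.items() if todo]:
            u = pending[v].pop(0)
            if u not in informed:               # the BTS test is intended to make this always true
                informed[u] = step
                pending[u] = plan(u, s)
    return informed

n, s = 4, '0010'                                # BDG(4) with the originator of Example 3.2
times = broadcast(n, s)
assert len(times) == 2 ** n and max(times.values()) <= 2 * n - 1
assert all(times[v] <= sched_time(v, s) for v in times)     # never later than T(v) of Theorem 3-5
assert times['1001'] < times['0011']            # vertex 9 is informed before it calls vertex 3
for v in range(2 ** n):                         # the 2n-1 bound holds for every originator
    t = broadcast(n, format(v, '04b'))
    assert len(t) == 2 ** n and max(t.values()) <= 2 * n - 1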
The broadcast process in BDG(4) is illustrated in Figure 3-8. The arrows indicate the direction of the broadcast and highlight the spanning tree that is induced by the broadcasting. Note that, as in Example 3.2, node 9 informs node 3 and not node 2, since node 2 is the originator.

Figure 3-8. An example broadcast in BDG(4) using Algorithm 1.

Theorem 3-5: Let vertex S = (s_{n−1}, s_{n−2}, ..., s_0) be the originator of a broadcast in BDG(n). Also, let A = (a_{n−1}, a_{n−2}, ..., a_0) be an arbitrary vertex in BDG(n) and find the largest i such that A = (a_{n−1}, a_{n−2}, ..., a_0) = (s_{i−1}, s_{i−2}, ..., s_0, a_{n−i−1}, a_{n−i−2}, ..., a_0). Then, using Algorithm 1, vertex A is informed at the end of time step

T(A) = BTS(s_0, a_{n−i−1}, a_{n−i−2}, ..., a_0).

Moreover, at the end of time step 2n−1 all vertices are informed.

Proof: We prove the theorem by induction on the number of time steps. Clearly, only the originator S has T(S) = BTS(s_0) = 0. Now suppose the theorem is true for all vertices B with T(B) ≤ k, and let A be a vertex such that T(A) = k+1. Consider the vertex X = (s_i, s_{i−1}, ..., s_0, a_{n−i−1}, a_{n−i−2}, ..., a_1), where i is defined as in the statement of the theorem. Observe that vertex X is adjacent to vertex A in BDG(n), since we have either LSNE(X) = A or LSCE(X) = A. Also,

T(A) = BTS(s_0, a_{n−i−1}, a_{n−i−2}, ..., a_0)
     = BTS(s_0, a_{n−i−1}, a_{n−i−2}, ..., a_1) + BTS(a_1, a_0)
     = T(X) + BTS(a_1, a_0).

Since BTS(a_1, a_0) ≥ 1, we have T(X) ≤ k. By the induction hypothesis, vertex X was informed by the end of time step T(X). We now consider the following two cases:

Case 1: a_1 ⊕ a_0 = 1. This means that BTS(a_1, a_0) = 1 and thus we have T(X) = k. Also, in this case LSCE(X) = A, and therefore, by step 3 of Algorithm 1, vertex A will be informed at the end of time step k+1.

Case 2: a_1 ⊕ a_0 = 0. This implies BTS(a_1, a_0) = 2 and thus we have T(X) = k−1. In this case LSNE(X) = A, and therefore, by step 5 of Algorithm 1, vertex A will be informed at the end of time step k+1.

In both cases, vertex A is informed at the end of time step T(A). Note that when A = (s̄_0, s̄_0, ..., s̄_0), i.e., every bit of A is the complement of s_0, T(A) is maximum. In this case T(A) = 2n−1. Therefore, at the end of time step 2n−1 all vertices are informed. ∎

The above theorem and the inequalities in step 3 and step 5 of Algorithm 1 ensure that each vertex receives the broadcast message exactly once. Thus Algorithm 1 satisfies the criteria of local broadcasting.

Chapter 4

Fault-Tolerant Loop Architectures

The large number of components (processors, memory units, etc.) in an MC, coupled with the demands placed on them by the types of applications for which they are intended, makes reliability a very important concern. We noted in Chapter 1 that one method of improving reliability is to design a system in such a way that it will be able to tolerate component failures and still remain operational. For an MC, fault-tolerance at the system level (also referred to as the topological level) implies that the types of faults to be tolerated are processor and/or link failures, and an MC (by an MC we will often mean its topology) is said to be fault-tolerant if it can remain functional in the presence of such failures. It is, however, the topological requirements of the application that essentially determine when an MC is considered to be functional. Two basic functionality criteria have received much attention.
According to one of these criteria, an MC is considered functional as long as there is a nonfaulty communication path between each pair of nonfaulty processors [ArLe81, PrRe82, RaGA85, Boes86, Esfa88]. In other words, the underlying topology of the MC should remain connected in the presence of certain failures. Both processor and link failures have been considered in the literature. This fault-tolerance model is particularly applicable to large-scale MCs that are to execute concurrent algorithms whose performance is mostly insensitive to infrequent changes in the topology of the system. Such MCs generally permit graceful degradation.

The second functionality criterion considers an MC functional as long as a desired topology is contained in the system. In the past few years much work has been done in developing parallel algorithms and the best MCs for their execution [Quin87]. For these algorithms, the existence of certain topologies is a significant factor in delivering the desired performance. Thus, for such applications, the system should be able to provide a specific topology throughout the execution of the algorithm. The existing work on such fault-tolerant topologies has mainly considered processor failures. The basic approach to achieving fault-tolerance when this functionality criterion is used is to employ system-wide redundancy. Two topologies that have received much attention are the ring and the tree [Haye76, KwTo81, RaAE84, LoFu87, DuHa88].

Here, fault-tolerance will refer to the second functionality criterion, and we will consider only processor failures. We assume that all failures are permanent and that some diagnosis mechanism is available to detect and isolate the faulty processors. Further, it is assumed that in the event of processor failures, some mechanism exists that allows the system to be "reconfigured" (using redundant processors) to maintain the desired topology. This also implies that, as a part of this process, the tasks assigned to a faulty processor can be successfully shifted to a fault-free processor. Neither the diagnosis nor the reconfiguration problems will be addressed here [YaHa86].

In designing such a fault-tolerant topology, the number of faults to be tolerated is specified a priori. Given this value, say k, a topology is determined such that the desired sub-topology can be realized given any collection of k faults. In previous work, the primary design criterion has been to employ the minimum number of redundant processors, in deference to other parameters such as the number of communication links. The overriding rationale for this choice has largely been economic: processors are usually the most expensive component. However, using the absolute minimum number of spare processors has typically forced these designs to incorporate a very large number of redundant communication links; in graph-theoretic terms, each vertex has very "high" degree. From a practical perspective this presents a problem, in that processors have only a finite (usually quite small) number of communication ports. For large-scale systems these designs can only be used to achieve low levels of fault tolerance, since for high levels the number of ports needed would exceed the number available. Additionally, in terms of VLSI and WSI technology, one of the primary design concerns is overcoming pin-limitation and layout problems; designs which require large numbers of links only exacerbate these problems [ChCD88, Dall88].
In this chapter, we present an alternate primary design criterion and employ it in the design of fault-tolerant loop topologies. The basic idea is to place a bound on the maximum number of connections at each processor and then determine the number and configuration of redundant components required to achieve a specified level of fault-tolerance. Loops are a common interconnection topology and are important in their own right. Moreover, since many other topologies have loops "embedded" within them, results relating to the construction of "low"-degree fault-tolerant loops can give insight into the feasibility of such constructions for other topologies.

4.1. Background

In this section we introduce some additional terminology and definitions and give a formal statement of the problem we will consider. As we have done throughout this dissertation, we model the topology of an MC by a graph G(V,E). In particular, a loop interconnection network connecting N processors is modeled by the cycle C_N. The failure of a processor is modeled by the removal of the corresponding vertex and its incident edges in the graph. As previously mentioned, in this model an MC is functional as long as a desired structure is contained within the system. This criterion was first formulated by Hayes and can be stated formally as follows [Haye76]. Let G(V,E) represent the topology of an MC and D(V,E) be a desired structure. Then G is k-fault-tolerant (k-ft) with respect to D if, for any set of faulty vertices F ⊆ V(G) with |F| = k, the graph D ⊆ G − F. We also say that G is a k-ft D graph. Thus, "G is a 1-ft N-cycle" means that C_N ⊆ G − {v} for each v ∈ V(G). It is clear from this definition that if G is a k-ft D graph then |V(G)| ≥ |V(D)| + k. For a given graph D and specified integers k and m, the set of all k-ft D graphs G with |V(G)| = |V(D)| + m will be denoted by Γ_k[D, m].

Using the above definition, several classes of problems can be formulated by imposing certain requirements on the graph G. The problem originally proposed by Hayes can be stated as: Given a graph D and a positive integer k, construct an "optimal" graph G which is k-ft with respect to D. The optimality criteria considered in [Haye76] are:
(a) |V(G)| = |V(D)| + k;
(b) |E(G)| is as small as possible subject to (a).

In [Haye76] three candidates for the graph D are considered, namely cycles, trees, and simple paths. In particular, a construction is given for G when D is a cycle of even order. In this case, G is a (k+2)-regular graph. A similar construction is also given for the case of D being a cycle of odd order. The work of Hayes was followed by the results in [KwTo81, RaAE84, LoFu87, DuHa88]. In these papers the authors have adhered to the same optimality criterion but imposed additional restrictions on the set F. For example, in [RaAE84] the case of D being a binary tree is considered with the additional restriction that failures only occur at different levels of the tree.

The criterion (a) alone dictates that the number of spares be exactly equal to the desired level of fault tolerance. When no additional restriction is placed on the set F, this criterion implies δ(G) ≥ δ(D) + k. Thus the degree of each vertex in G is at least proportional to the number of faults to be tolerated. Since current technology limits the number of connections at each processing node (and it seems this constraint will persist for some time to come [ChCD88, Dall88]), the above optimality criteria may not be viable in certain situations.
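Hayes' definition can be checked exhaustively for small cases, which is also how the chordal-ring designs developed in Section 4.2 can be validated. The sketch below is ours, not part of the dissertation; is_k_ft, has_cycle_of_length and chordal_ring are illustrative names, Python is used only for convenience, and the test graph CR(m,w) anticipates Procedure 4.1 of Section 4.2.1. The two assertions preview Theorem 4-2: with two spares, CR(8,3) tolerates any single processor failure with respect to C_6, but not every pair of failures.

# Illustrative sketch (not from the dissertation): brute-force check of the k-ft definition.
from itertools import combinations

def chordal_ring(m, w):
    """CR(m, w) as in Procedure 4.1 below: an m-cycle plus the chords (2i, (2i + w) mod m)."""
    adj = {v: set() for v in range(m)}
    def add(a, b):
        adj[a].add(b); adj[b].add(a)
    for v in range(m):
        add(v, (v + 1) % m)
    for i in range(m // 2):
        add(2 * i, (2 * i + w) % m)
    return adj

def has_cycle_of_length(adj, L, removed=frozenset()):
    """Backtracking search for a cycle on exactly L of the surviving vertices."""
    def extend(path, used):
        if len(path) == L:
            return path[0] in adj[path[-1]]
        return any(extend(path + [u], used | {u})
                   for u in adj[path[-1]] if u not in used and u not in removed)
    return any(extend([v], {v}) for v in adj if v not in removed)

def is_k_ft(adj, N, k):
    """Hayes' definition: C_N is a subgraph of G - F for every k-subset F of V(G)."""
    return all(has_cycle_of_length(adj, N, frozenset(F)) for F in combinations(adj, k))

G = chordal_ring(8, 3)
assert is_k_ft(G, 6, 1)          # CR(8, 3) is a 1-ft C_6 graph (two spares, cf. Theorem 4-2)
assert not is_k_ft(G, 6, 2)      # but it is not 2-ft: e.g. F = {1, 3} leaves vertex 2 "trapped"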
To address the above concern, we propose the following modified version of the problem. Given a desired graph D and positive integers k and r, construct a graph G such that:
(a) D is a subgraph of G − F for any set F ⊆ V(G) with |F| = k;
(b) G is r-regular; and
(c) no graph satisfying (a) and (b) above has fewer vertices than G.

It is clear that this problem may not have a solution for certain choices of D, k, and r. A case in point: when r = 2 and D = C_N, there is no solution G for any k. Thus 3 is the smallest value of r for which a solution may exist in the case D = C_N. In the sequel we will consider the modified problem for the case r = 3 and D = C_N, N even. We will show that no solutions using exactly k spares exist; thus we must incorporate "extra" redundancy into the design. In fact, we show that the minimum number of spares required is bounded below by 2k, that is, |V(G)| ≥ |V(D)| + 2k, and we will exhibit a class of graphs, called chordal rings, for which this lower bound is achievable for k = 1, 2, 3. Further, this result will be shown to be "optimal" in the sense that higher levels of fault tolerance cannot be achieved for this class of graphs using the minimum number of spares.

4.2. Chordal Rings as k-ft Cycle Topologies

In this section we define the chordal ring class of graphs, and we examine the extent to which they can serve as fault-tolerant graphs for C_N. The results are presented in the form of several theorems, organized as follows. Theorem 4-1 establishes a lower bound on the number of spares required by the use of cubic (3-regular) graphs. Theorems 4-2 and 4-3 establish that certain members of the chordal ring class are 1-ft and 2-ft for C_N. Two lemmas are given to establish Theorem 4-4, which provides a necessary condition for any graph G to be a member of the class Γ_k[C_N, 2k]. This theorem is used in proving Theorem 4-5, which establishes that 3-ft is the highest level of fault-tolerance that can be achieved using chordal rings. Finally, in Theorem 4-6, we prove that this level of fault tolerance can be achieved by the chordal rings CR(M, 7). We begin with some additional definitions that we make use of in the proofs.

Definition 4.1. A fault-set F is a set of faulty vertices.

Definition 4.2. Given a graph G and a desired graph D, a vertex v is unusable with respect to a given fault set F if no subgraph of G − F isomorphic to D contains v.

Example 4.1. Let G be 3-regular and let the desired graph D be a cycle. Let {u_1, u_2, u_3} ⊆ V(G) and {(u_1,u_2), (u_2,u_3)} ⊆ E(G). Then the fault set {u_1, u_3} makes u_2 unusable, because u_2 has only one edge connected to a non-faulty vertex in G, implying that u_2 cannot be a vertex in any cycle. It is "trapped" between two faulty vertices. Note that by this definition any faulty vertex is unusable. A vertex is usable if it is not unusable.

Definition 4.3. Given a graph G, a fault set F, and a desired graph D, the set of all unusable vertices induced by F will be denoted U(F), or simply U.

Clearly, a necessary condition for a graph to be k-ft for C_N is that there be N usable vertices for every fault set F of order k. In other words, the maximum cardinality of an unusable set may not exceed the total number of spare vertices. As we have noted, the tradeoff in using cubic graphs for k-ft cycles is that we must include "extra" spare vertices in the design. Theorem 4-1 establishes the lower bound (cited earlier) on the amount of redundancy required.

Theorem 4-1: Let G ∈ Γ_k[C_N, m] be 3-regular and connected. Then m ≥ 2k−1.

Proof: By definition,
there is an N-cycle in G. Label the vertices of this cycle v_0, v_1, ..., v_{N−1} and consider the fault set F = {v_1, v_3, ..., v_{2k−1}}. As in Example 4.1, the failure of the vertices v_{2i−1} and v_{2i+1}, 1 ≤ i ≤ k−1, "traps" the vertex v_{2i}, making it unusable. Thus all the vertices "between" such pairs of vertices in F are unusable. This gives {v_1, v_2, ..., v_{2k−1}} ⊆ U, and hence |U| ≥ 2k−1. ∎

Since we are considering designs for even N, Theorem 4-1 and the fact that there exists no cubic graph of odd order imply that the minimum number of spares is at least 2k. Hereafter we restrict our attention to cubic graphs having this minimum number of spares, i.e., those in the class Γ_k[C_N, 2k]. In addition, we will consider only the cases where k < N/2. This seems a reasonable requirement for large-scale systems and eliminates many special cases in the proofs.

4.2.1. Chordal rings

A chordal ring of degree 3, hereafter referred to simply as a chordal ring, is a 3-regular Hamiltonian graph. Clearly, chordal rings are of even order. Following [ArLe81], we define a specific class of chordal rings, CR(N,w), which can be constructed using Procedure 4.1. The parameter w is called the chord length, which is required to be odd and at least 3. Note that CR(N,w) is bipartite, with v_i and v_{i+1} being in different partitions. Without loss of generality, we will assume that w ≤ N/2. The chordal ring CR(20,3) is shown in Figure 4-1. (For the sake of readability, we have labeled the graphs in the figures using integer labels; i.e., vertex v_i is labeled simply i.)

Procedure 4.1. Construct CR(M,w).
1. Construct a cycle of order M, labeling the vertices 0, 1, ..., M−1.
2. For each i, 0 ≤ i < (M−1)/2, add the edge (2i, j), where j = (2i + w) mod M.

Figure 4-1. The chordal ring CR(20,3).

4.2.2. Fault tolerance of CR(N,w)

As we noted earlier, two solutions for 1-ft cycles were given in [Haye76]. The solution for even cycles has Δ = 4, while the solution for odd cycles is cubic. We have noticed that Hayes' solution for 1-ft C_{2j+1} (using one spare) can also serve as a 1-ft C_{2j} (using two spares). In fact, Hayes' design is the chordal ring CR(2j+2, 3). Thus such chordal rings belong to Γ_1[C_{2j}, 2]. The following theorem establishes a stronger result for 1-ft cycles.

Theorem 4-2: For any even N ≥ 4, CR(N+2, w) ∈ Γ_1[C_N, 2], where w is as defined above.

Proof: Let N be given and let the vertex set of CR(N+2, w) be V = {v_0, v_1, ..., v_{N+1}}. Without loss of generality, let F = {v_0}. Then it is not difficult to see that the N-cycle v_1, v_2, v_3, ..., v_{N+1−w}, v_{N+1}, v_N, ..., v_{N+3−w}, v_1 is contained in CR(N+2, w) − F. ∎

For the 2-ft case, we have proved [ZiEs89] that no cubic graph exists for N = 4, and we thus consider N ≥ 6. The next theorem establishes that for such N, the chordal rings CR(N+4, 3) are 2-ft for C_N.

Theorem 4-3: For any even N ≥ 6, CR(N+4, 3) ∈ Γ_2[C_N, 4].

Proof: Let N be given and let the vertex set of CR(N+4, 3) be V = {v_0, v_1, ..., v_{N+3}}. To establish the result, we must show that for any fault set F = {x, y} ⊆ V, C_N ⊆ CR(N+4, 3) − F. Since the design uses 4 spare vertices, for any fault set F as above, two additional vertices will not be used in the resulting cycle. The two faulty vertices and the two additional ones will be referred to as the inactive vertices for the fault set F. Without loss of generality, we let x = v_0 and consider the possibilities for y. There are 3 cases.

Case 1: y = v_1. Consider the two paths: v_3, v_4, v_7, v_8,
° ' ' ’v4i-1’v4i’ ° ’ ' ’vN-l’vN’vN-o-l V2’V5’Volv9’ ' ° ' ’v4i-2’v4i +1, ' ' ' ’vN-3’vN-2’VN +1 ° These two paths have only the vertex v” +1 in common. A cycle of order N can be formed by concatenating the two paths and including the edge between v2 and v3. Note that the vertices vN +2 and vN +3 are inactive in this cycle. An example of such a cycle is given in Figure 4-2 for CR(20.3). The edges in the 16-cycle are shown as solid lines; dashed lines represent unused edges. In the remaining two cases we will exploit the Hamiltonian cycle shown in Figure 4—3 to establish the result. This cycle is made up of paths of the form v25 “,vz‘. ,v2i +3; that is, an edge fiom the "peripheral" cycle followed by a chord in the "opposite" direction. It is clear that such a cycle always exists in CR(M.3). The salient feature of this Hamil- tonian cycle is that we may remove any two "endpoints" of a chord and using an edge Figure 4-2. A 16-cycle in CR (20.3) - {0,1}. previously not used. obtain a new cycle without altering any other edge or vertex in the original cycle. The order of the new cycle is obviously two less than that of the initial one. For example. we may remove the vertices v0 and v3 in Figure 43 and employ the edge (v 1.v 2) to obtain the cycle shown in Figure 44. Case 2. y = vj . j¢1,3 In this case, an N -cycle can be obtained by removing the chords (i.e.. their endpoints) corresponding to the faulty vertices as discussed above. Case 3. y = v3 . Here. the two faulty vertices are the endpoints of a chord. To obtain a cycle of the correct size. we choose the endpoints of any other chord in the mph (with the exception of the pair V]. V” _3) and remove them to obtain a cycle of the desired Size as in case 2. Thus for all combinations of 2 failures we can obtain a cycle of order N. I 60 We have Shown that CR (N +2k .w) e I‘,‘[CN ,2k] for k =l.2 and w =3. We will next Show that the 3-ft is the best that can be achieved from chordal rings using the minimum number of spare vertices. To do this we need the following results. Lemma 4-1 : Let G be a 3-regular and connected mph. Further. let G e I‘kICN.2k]. Then the girth ofG.g(G)2k+1. Proof. By contradiction. Suppose G has a j-cycle C j. j S k. Consider the set A = [v lvd Cj .v is adjacent to some vertex u e Cj}. Call A the adjacent set of Cj and let I A | = m. Note that m Sj. since each vertex in Cj has one incident edge not in the cycle with which to connect to another vertex in G . Suppose all the vertices in A are faulty. The vertices in Cj are "trapped" by the fault set A and thus (A U Cl.) C U. Since G is k-ft for CN and m S k. there is an N -cycle in G which does not include any vertices in A UCj. Let wo,w1. . . . .wN_1,w0 be such a cycle. Since G is connected, there is a path between some vertex vo eA to a vertex “’0 in the N -cycle which does not include any other vertex in A. Let this path be P = v 0.ul,u 2. . . . , P,w0. Note that the path P may be of length 1. Now extend P to P ’ by concatenating with it the path formed by the vertices of the N-cycle: P’ = vo,ul,u2. . . . . vaoth . . . .wN_1. So. we have established that there is a path beginning at v0. of length at least N . which does not con- tain any vertices of A u Cj other than v0. Let us relabel P’, denoting the first 2k-2m +2 vertices as vo.v1.v2. . . . ,vz,‘_2m +1. Now consrder the fault set: F =A U [v1.v3, . . . 'v2k-7JI4-l} - {v0} which contains | F | =m + (k-m+l)-l = k nodes. Thismakesthe set [A UCJ. U {v1.v2, . . . 
’v2k-2m+1}] CU and thus I U| 2m + j + 2k—2m+122k+1,which contradicts the definition ofG. I Lemma 4-2: Let G be as in Lemma4.l, then g(G) ¢k+1. Proof. By contradiction. Suppose G has a k + l-cycle C , and let A be the adjacent set of this cycle as in Lemma 4.1. If IA |Sk. then the proof ofLemma 4.1 also holds here. 61 Figure 4-4. An l8-cycle in CR (20,3) - {0.3} 62 and (C uF)cU. This implies | U | 2k+1+k =2k+1 which contradicts the definition of G . I Lemmas 4.1 and 4.2 give us the following theorem. Theorem 4-4: Let G be a 3-regular and connected mph. Further. let G e I‘,‘[CN.2k]. Theng(G)2k-I-2. I We are now ready to state the following theorem which indicates that 3-ft is the highest level of fault-tolerance that can be attained with Chordal Rings using the minimum number of spare vertices. Theorem 4-5: CR (N +2k ,w) e I“ [CN ,2k] for any k24. Proof. For any values of N and w as defined above, the mph CR (N .w) contains the cycle induced by the edge set: {(vo.v1).(v1.vz).(v2.vw,2).(vw,2.vw,1).(vw,1.vw).(vw #0)} . This implies that g (CR (M .w ))S6 and therefore from Theorem 4 the conclusion holds for k25. For k =4. consider the fault set F = {VO’VZ’V4’VN+2k—w+2}' This makes the set F’ = FVW1’v3’VN+2k—w+l’vN+2k-w«I-31 C U‘ Since five of these vertices all belong to one set of a bipartition of CR (N +21: .10 ). the largest cycle that can exist in CR (N +2k.w) —F’ is C” +2.40. This gives the desired result for k =4. I The final theorem of this chapter. establishes that the chordal rings CR (N +6.7) are 3-ft for C”. N 2 20. These mphs have chord length equal to seven. Note that since g (CR (M .3)) = 4. we may conclude from theorem 4-4 that CR (N +6.3) 4 1‘3[C~,6] for any N. In addition. for the chordal rings CR (N +6.5). we have found counterexamples which Show that the mphs are not member of 1‘3[CN ,6] for arbitrarily large values of N. 63 Theorem 4-6: CR (N .7) e I‘3[CM .6]. where M =N —6 and N 2 26. Proof. We need to show that for any fault set F = {xy .2 }, CM : CR (N .7)—F. We proceed by specifying fault-sets F , and demonstrating that an M -cycle is present in the presence of each such fault-set. For some fault sets, a Specific cycle will be given explicitly. In the remaining cases. the existence of an M ~cycle in a specific mph or set of mphs along with an inductive argument will demonstrate that the fault-set can be tolerated for every CR (N .7). where N is as in the statrnent of the theorem. The proof employs four types of inductive arguments. which are illustrated in Figures 4-5, 4-6. 4-7. 4-8. A fifth type of argument (non-inductive) shows how a special class of faults can be handled. Each of these arguments is described below. In our approach, each fault set is characterized in terms of the relative distances between its member vertices; the induc- tive argument is always done in such a way as to preserve these relative distances, and hence the fault-set itself. . Figure 4-5 illustrates the most straightforward inductive variation. In the figure, all the faults are contained "within" one chord. The M -cycle is apparent. It is clear that we may add a pair of vertices to this mph. (anywhere except between vertices 26 and 5) extending the cycle by two. and not altering the fault set. We thus obtain a chordal ring of order N +2, which has an M +2 cycle for the given fault-set. The key characteristic here is that there be an edge in the M ~cycle of the form (2i +1.2i +2) such that there is no chord in the M -cycle which begins at the vertices 2i -4. 2i -2, 2i. Intuitively. 
no edge of the M -cycle "jumps" past the vertices 2i +1. 2i +2. It is clear that in such a circumstance. we may add any even number of vertices between vertices 2i +1 and 2i +2, which will establish the pattern for all chordal rings. We will refer to this as the induct-by-two (1B2) pattern. The second inductive variation is Show in Figure 4-6. The key characteristic is that there be an edge in the M -cycle of the form (2i ,2i +7) (i.e. a chord) and that the edges (2i +2.2i +9) and (2i -2,2i +5) are not used in the M -cycle. Intuitively. imagine drawing Figure 4-5. An induction-by-two illustration. a straight line from the center of the circle, between two vertices such that the only edge of the M -cycle intersected is a chord. Observe that such a cycle can be extended by adding four vertices between vertices 2i +3 and 2i +4, as shown in the bottom mph. In the example Shown, 2i = 10 and the four new vertices are labeled "N" for "new". To make use of this pattern, we will demonstrate the desired cycle in mphs of orders N and N +2. and then use this property to induct by four to yield the conclusion. We will refer to this as the induct-by-four (1B4) pattern. Figure 4-7 illustrates the first of two "induct-by-Six" patterns. The key feature in this first pattern is that the M -cycle "doubles-back" on itself. Intuitively. a line can be drawn which does not intersect any edge of the M -cycle. Note that "doubling back" implies the existence of two "corners": vertices 5 and 6 are corners in the upper mph. Observe that any such cycle can be extended to a cycle with 6 additional vertices as shown in the lower mph. The extension can be accomplished by removing the peri- pheral edge (6.7). adding 6 new vertices between vertices 5 and 6 and reforming the 65 Figure 4-6. An induction-by-four illustration. 66 cycle as shown. Note that the extension can be done at either corner. Since this process adds six vertices. to use it inductively it will be necessary to establish that a cycle exists for a given fault-set in mphs with orders N ,N +2,N +4. We will refer to this as the induct-by-six-A (IB6A) pattern, Figure 4-7. The fourth variation will be referred to as the induct-by-six-B (IB6B) pattern, illus- trated in Figure 4-8. The characteristic here is that the M -cycle uses three consecutive chords (2i .2i +7). (2i +2.2i +9) and (2i +4.2i +13) and does not use the edge (2i +5,2i+6). The cycle is extended by adding six vertices between 2i +5 . 2i+6 as shown (2i = 8 in the example). Again. this situation may be intuitively characterized as being able to draw a line from the center of the circle between a pair of vertices (2 j +1.2j) which intersect exactly three edges of the cycle. all of which are chords. The final type of argument we use is illustrated in Figure 4—9. The idea is to identify a cycle pattern for a specific fault-set and observe that the fault set can be "pushed" for- ward and the cycle reformed. In the top mph, the fault is F = {0.3.4.5}. The cycle uses the edge (1,2) and then "skips over" vertices 3.4.5 using a chord. The cycle then traverses groups of four vertices connected by peripheral edges. with each group con- nected to the next by a chord. Observe in the middle mph that the fault-set has been advanced forward by four vertices. i.e. F = {0.7.8.9}. and the cycle reformed. Essen- tially. one of the groups of four vertices has been displaced (backwards) by the advanced fault set. This process can be continued until all the groups of four have been displaced, as seen in the bottom mph. 
Note that typically. as in the figure. the cycles we exhibit have specific patterns before and after the fault-set being advanced. The number of ver- tices needed by these patterns (before and after) varies between fault-classes. but typi— cally meets a certain congruence relation and range criteria. For example. the fault class above can be described as follows. F = {0.2i—l,2i,2i+1}. where 4S2i SN—l4 and N -2i 5 2 mod 4. This argument will sometimes be used in conjunction with the four inductive arguments in establishing the result for certain fault-sets. We will refer to this 67 Figure 4-7. An induction-by—six, version A illustration. 68 Figure 4-8. An induction-by-six. version B illustration. 69 Figure 4-9. An advance-by-four illustration. 70 as the advance-by-four (AB4) pattern. Since the mphs we are considering are bipartite. we may assume without loss of generality that x = 0 and then consider the possible values for y and z . Note that when y is odd. by symmetry we need only consider values of z in the range 2y S 2 S (N +y —l)/2. Also. when y is even. we need only consider values of z where 2y S 2 S N -2y -1. We proceed by considering several cases based on choices for y . For each case. we provide a series of specific fault sets, numbered for convenience. CASE 1. y e {1.2}. Fault Set: 1 F = {0.1,2,3.4,N-1]. Here. all the faults are contained "inside" a chord, which can be used to "short circuit" the faulty vertices. See Figure 4-5. This also covers the faultF = {0.1,2,N-3,N-2,N—1]. Fault Set: 2 F = {0.1,2.2i-1,2i,2i-1} , where N-2i I 2 mod 4 and 6 S 2i SN—6. See Figure 4-10. We may apply AB4. The requirement N —2i 5 2mod 4 simply guarantees that the number of vertices after the fault is a multiple of four so that the group of four pattern will work. Fault Set: 3 F = [0,1,2,2i-l.2i,2i-1] , where N-2i s. 0 mod 4 and 6 SN-2i SN—20. See Figure 4-11. We may apply AB4. Here the path after the fault-set requires 18 ver- tices. The inequality aboves ensures that there are enough vertices for this pattern. The above fault-sets cover all the odd vertices and all even vertices except N -l6,N -12,N-8,N —4, which are considered next. Fault Set: 4 F = {0,1,2,N-16]. See Figure 442. Apply IB2 to the edge (3,4). Fault Set: 5 F = {0.1.2,N—12]. See Figure 4-13. Apply IB2 to the edge (3.4). Fault Set: 6 F = {0.1.2,N-8}. See Figure 4.14. Note that the cycle doubles back at vertices 5 and 6. We may apply IB6A to either corner. It is important to note that we are not altering the fault pattern, since in this case we may consider the faults to be 71 Figure 4-11. F = {0.1.2,2i-1.2i,2i+1} , N—2i 50 mod 4 72 Figure 4-12. F = {0.1.2,N-16}. Figure 4-13.F = {0.1.2,N-12}. Figure 4-14. F = {0.1.2,N-8}. 74 "contained" between N —8 and 2 (in a clockwise sense from N —8) and altering the graph outside of this containment preserves the relative distance between faults. Fault Set: 7 F = {0.1.2,N—4}. See Figure 4-15. Apply IB6A to either corner. We have covered all possible fault-sets of the form {0.1.2.2 }. completing this case. CASE 2. y = 3, 6S2 SN/2+l. Fault Set: 8 F = {03.6.7.8}. See Figure 4-16. Apply IB6A to either corner. Fault Set: 9 F = {0.12.3.9}. See Figure 417. Apply IB6A to vertex N -1. Fault Set: 10 F = {0.3.4.10.11}. See Figure 4-18. Apply IB6A to vertex 12. Fault Set: 11 F = {0,3,12,13,14}. See Figure 4-19. Apply IB2 to the edge (15,16). Fault Set: 12 F = {0.3.10.2i-l.2i ,2i +1 }. Where 14 S 2i S N-10 and N—2i E 2 mod 4. See Figure 4-20. Apply AB4. Fault Set: 13 = {0.3.10.2i-1,2i.2i+1}. 
Where 14 S2i SN—16 and N-2i a 0mod 4. See Figure 4-21. Apply AB4. This completes case 2. CASE 3. y = 4. 8 S2 SN-3. Fault Set: 14 F = {0,4,2i-1.2i.2i+l}. Where 8 S 2i SN-10 and N-2i a 2 mod 4. See Figure 4-22. Apply AB4. Fault Set: 15 F = {0.4,2i—l.2i.2i+l}. Where 8 S2i SN-12 and N-2i50mod4. See Figure 4-23. Apply AB4. 15. F = {0.1.2,N—4}. Figure 4- .8}. 7 9 {0.3.6 Figure 4-16. F {0,1,2,3.9]. 17. F Figure 4- ,10,11}. = {0,3,4 Figure 4-18. F 79 Figure 4-19. F = {0,3,12,13,14}. Figure 4-20. F = {0.3,10.2i-l,2i,2i+1}.N—2i =2mod 4. Figure 4-21. F = {0.3,10.2i-1,2i,2i+1],N—2i EOmod 4. 81 Figure 4-22. F = {O,4,2i-1,2i,2i+l].N-2i 52mm! 4. Figure 4-23. F = {0.4,2i-1,2i,2i+1}.N—2i 50mm! 4. 82 The previous sets cover all possible values of z in the range 8 S 2 S N —8. The next three fault-sets complete this case. Fault Set: 16 F = [0,4,N-8,N—7}. See Figtue 4-24. We may apply IB2 to the edge (5.6). Fault Set: 17 F = {0,4,N —6,N—5}. See Figure 4-25. Apply IB6A to either corner. Fault Set: 18 F = {0.4,N-4,N-3}. See Figlue 4-26. Apply IB6A to either corner. CASE 4. y = 5.10 S z SN/2+2. Fault Set: 19 F = {0,5,10,11}. See Figure 4.27. Apply 1B4 between vertices N -1 and 0. Figure 4-24. F = {0.4,N-8,N—7}. Figme 4-25. F = {0,4,N—6,N—5}. 84 -3}. [0.4.51 -4,N Figure 4-26. F 85 {0,5,10,11}. Figure 4-27. F 86 Fault Set: 20 F = {0.5.12}. See Figure 4-28. Apply IB2 to the edge (13.14). Fault Set: 21 F = {0.5.13}. See Figure 4-29. Apply IB6B between vertices 15 and 16. Fault Set: 22 F = {0,5,14,15,16}. See Figure 4-30. Apply IB2 to the edge (17.18). The previous fault-sets are sufiicient (for this case) for N =26 and N =28. The following two complete this case forN 2 30 Fault Set: 23 F = {0.5,2i-l,2i.2i+l}. where 14S2i SN-10 and N-2i §2mod 4. See Figure 4-31. Apply AB4. Fault Set: 24 F = {0.5,2i-l,2i,2i+1}. where 14S2i SN-l6 and N-2i IOMOd 4. See Figure 4-32. Apply AB4. This complete the case {0,5,2 }. Figure 4-28. F = {0.5.12}. Figure 4-29. F = {0.5.13}. Figure 4-30. F = {0,5,14,15,16}. CASE 5. y e {6,7}.12Sz SN—6. Fault Set: 25 F = {0,6,7.12,13}. See Figure 4-33. Apply IB6B between N -l and 0. Fault Set: 26 F = {0,6,7.13.l4}. See Figure 4-34. Apply IB6B between N -1 and 0. Fault Set: 27 F = {0.6.7.8,15}. See Figure 4-35. Apply IB6B between 23 and 24. Fault Set: 28 F = {0,6,7.N-7}. See Figure 4-36. Apply I82 to the edge (17,18). 89 Figure 4-32. F = {0.5.2i—1,2i,2i+1].N-2i aOmod 4. Figure 4.33. F = {0.6.7,12.13}. = {0,6,7.13.l4}. Figure 4-34. F Figure 4-35. F = {0,678.15}. Figure 4-36. F = {0.6.7,N—7}. Fault Set: 29 F = {0.6.7,N-8}. See Figure 4-37. Apply IB6B between vertices l7 and 18. Fault Set: 30 F = {0.6,7.8,N-9]. See Figure 4-38. Apply IB2 to the edge (9,10). Fault Set: 31 F = {0,6,7.4i .4i +1 }. See Figure 4-39. The range of 4i is qualified as fol- lows. 16 S 4i SN—12. where N—4i E 0 mod 6 (top), 16 S 4i SN—l4, where N—4i £2mod 6 (middle), and 16S4i SN-16, where N-4i =4mod 6 (bottom). To see this result, first fix 4i = 16. Then we may apply IB6B between vertices N —l and 0. This verifies this choice of 4i. Now note that we also apply 1B4 between 11 and 12. Figure 4-37. F = {0.6.7,N-8}. 95 Figure 4-38. F = {0.6.7,N—9}. Fault Set: 32 F = {0.6.7.4i +2.4i +3}. See Figure 4-40. The range of 4i is qualified as follows. 16 S 4i S N-12, where N—4i = 0 mod4 (top). 16 S 4i S N—14, where N —4i 5 2 mod 4 (bottom). To see this result, first fix 4i = 16. Then we may apply IB4 between vertices l9 and 20. This verifies this choice of 4i . Now note that we also apply IB4 between 7 and 8. 
The above fault-sets cover all possible values of 2 except for z = N —10, which fol- lows. Fault Set: 33 F = {0,6,7.8,N-10}. See Figure 4-41. Apply IB4 between 7 and 8. This completes the case {0.6.7.2}. CASE 6. y e{7.8},16 S 2 SN—8. Fault Set: 34 F = {0.7.8.15.l6]. See Figure 4.42, Apply IB6B between N -1 and 0. Fault Set: 35 F = {0.7.8,N-9}. See Figure 4-43. Apply IB2 to the edge (13,14). Figure 449.1i = {0.6,7.4i,4i +1 }. 97 Figure 4-40. F = {0.6,7.4i+2.4i+3}. 98 Figure 4-41. F = {0.6.7,N-10}. {0.7.8.15,l6]. Figure 4-42. F 100 Figure 4-43. F = {0.7,8,N—9}. We have thus far shown that any fault-set in which two vertices are at a distance of at most eight can be tolerated. It is not diflicult to see that for N = 26. two faults must meet this requirement. and thus the proof is complete for Fault Set: 36 F = {0.7,8,N-9,N—10}. See Figure 4-44. Apply IB4 to vertices 9 and 10. For reasons similar to the above. the previous fault-set completes the proof for N = 28. Fault Set: 37 F = {0.7.8,N-11}. See Figtue 4-45. Apply IB2 to 9 and 10. For reasons similar to the above. the previous fault-set completes the proof for N = 30. Fault Set: 38 F = {0.7.8.2j+l}. where 17 S2j+l SN-ll. See Figure 4-46. We may apply IB2 to the edge (9.10) and also IB6B to the vertices N —l and 0. allowing us to cover all odd vertices in the specified range. Fault Set: 39 F = {0.7.8,20}.N 232 . See Figure 4-47. Apply IB4 to 25 and 26. Fault Set: 40 F = {0.7,8,N-2j }, 10 S 2j SN -22. See Figure 4-48. Apply IB2 to the edge (23.24). 101 Figure 4-45. F = {0.7.8,N—11}. Figure 4-46. F = {0.7.8,2j+1}. 103 Figure 4.47, F = {0.7.8.20}. Figure 4-48. F = {0.7.8.N—2j }. This completes the case of {0.7.8.2} and we now consider our final two cases. CASE 7. y = 2i+l.2i+129 Fault Set: 41 F = {0.2i+l.2i+1+z} , where 9S2i+1