—
_—

‘—
=
—_
—
—
—
—
'—
—
—
—
‘—
—
—

 

 

lIBRARY
Michigan State
University

 

 

 

This is to certify that the

thesis entitled
”we/L {76‘5de leg/IM/r'; Ana? 5')"

presented by
I? ”I“

has been accepted towards fulﬁllment
of the requirements for

Megan in 0,ij M64,
*- SéMWCSV/(L: .

7%]: K

Major professor

Date 2/§./gg ’

0-7639 MS U is an Afﬁrmative Action/Equal Opportunity Institution

 

 

MSU

LIBRARIES
—:_.

 

 

RETURNING MATERIALS:
Place in book drop to
remove this checkout from
your record. FINES will
be charged if book is
returned after the date
stamped below.

 

 

 

 

NETWORK SYSTEM RELIABILITY ANALYSIS

By

FENG HSU

A THESIS

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

MASTER OF SCIENCE

Department of Statistics and Probability

December, 16 1987

ABSTRACT

NETWORK SYSTEM RELIABILITY ANALYSIS
By
Feng Hsu

The reliable performance of a network system for a mission
under various conditions is of utmost importance in many
industrial, military, and everyday life situations. Main effort
of this thesis research have been devoted to develop algorithms
and optimization methods for network reliability problems. In
chapter 1, two algorithms so-called MDT and DBM methods are
presented respectively for qualitative and quantitative
evaluation of large networks and the relevant theorems are
proved. The three sections in chapter 2 are spent on the
optimility problems of network reliability, in which a useful
optimal redundancy allocation method is presented and in
addition, two useful mesures called the maintenance importance
‘ ( MIs,t(Xi) ) and the diagnosis importance ( DIs,t(Xi) ) are
presented for deriving optimal policies used in network
reliability maintenance and failure diagnosis. Examples and

comments are included correspondingly to each of the topics in

the thesis.

To my parents, Mr. Ziwei Xu, and Mrs. Shensu Xie,

and a special gift on the 60'th birthday of my father

ACKNOWLEDGMENT

I would like to express my heartly thanks to my major
advisor, professor Martin Fox for his greatly appreciated
guidance, suggestions, comments, carefully readings and
crrections, which has made it possible for me to complete
this research project. Thanks are also expressed to my
graduate committee members, professor R. Schlueter and
professor Soumen Ghosh for thier very helpful comments and

suggestions.

ii

Abstract

Chapter

Section

1.1
1.2
Section
2.1

2.2

Chapter

Section

1.1
1.2

Section

2.1
2.2
2.3

2.4

REFERENCES

CONTENTS

Network reliability analysis

Qualitative analysis through
fault-tree and MDT method

Theorems and definitions

Algorithms and examples

Quantitative analysis by DBM method
DBM method and its application
Algorithms and examples

Some optimal policies for

network system reliability
Redundancy policy for series-paralell
system under resource constraints using
L-M method

Problem formulation and solution
Example

Some optimal policies used for
reliability maintenance and failure
diagnosis without cost constraint
Maintenance importance

Diagnosing importance

Example

Comments and More Examples on
Failure Diagnosis Policy

10

1A

17

22

27

27

28

3O

33

33

35

36

Al

45

Chapter 1 Network Reliability Analysis

Section 1 Qualitative Analysis of Network Reliability

Through Fault-tree and MDT Method

In this section an efficient approach is presented for
qualitative analysis of network reliability through PTA (Fault-
tree Technique) and finding MCS (Minimal Cut Sets) by the MDT
(Modified Dual-graph Transformation) method. It is generally
accepted that for the purpose of qualitative analysis for
networks (or any complex system) one must establish its Fault-
tree by first finding its MCS. Therefore the enumeration of all
MCS seperating a specified node pair is a fundamental step in
reliability evaluation of network systems which are frequently
encountered in communications, electronics, computers and power
systems, etc. It is well know in graph theory [1-2] that the
algorithms existing for enumerating MCS are not satisfactory so
far, while the algorithms for finding MPS (Minimal Path Sets) are
much more time efficient. Hence, based on the principle of graph
theory, a so called MDT method is presented below for enumerating
the MCS of networks by employing any of the existing efficient

MPS algorithms [2, 3-5].

P0

1.1 Theorems and Definitions

Definition 1
If, for any two vertices in a graph G, there
is at least one path between them, then G is

said to be a connected graph.

Definition 2
A network is a connected graph, which has a
finite number of vertices and a finite number

of edges.

Let C(V, E) be the notation of a network, where V and E are the

sets of vertices and edges in the network, respectively.

Definition 3
Consider a network G(V,E) drawn in the plane
in such a way that each vertex vi 6 V is
represented by a point, each edge is
represented by a continous curve connecting
the two points which represent its end
vertices and no two curves, which represent
edges, share any points except in their ends.

Such a drawing is called a Plane Network.

Throughout the reminder of this chapter, we assume that all

networks discussed are plane networks.

Definition A

 

Definition 5

Definition 6

Definition 7

Let C(V,E) be a connected plane graph, K be an
edge set. Then K C E is called a cutset if it

is a minimal seperating set of edges, ie, the
removal of K from G interrupts its connectivity,
but no proper subset of K has this property.

In other words, a cutset seperates G into two

connected components.

The network C(v, E) is said to be the Dual of
a connected grapn C(V, E), if there is a l - l
correspondence f: E a E, such that a set of
edges T forms a simple circuit in G if and only
if f(T) ( the corresponding set of edge in C )

froms a cut set in C.

A point V1 is a Boundary Point if there exists
a curve connecting vi to the exterior of the
convex hull of the network which does not

intersect the network.

An edge of a network G is a Boundary Edge if

each point on the edge is a boundary point.

Definition 8
A cycle C of a network G is a Boundary Cycle if at

least one boundary edge exists in C.

Theorem 1 Let G be a network and x be a boundary vertex
of G and assume that 3 at least one cycle
through x. Then, 3 at least one boundary
cycle through x.

Proof Suppose the theorem is false. Take an arbitary

.cycle CO through x and let e

be an edge in

O

CO. Then 3 a point yO 6 e0 which is not a
boundary point. Every curve connecting yO to

the exterior of the convex hall of the network

intersects the network.

Then, 3 a cycle C1 ¢ Co through x
which includes e0 and a subset of this set of
intersections. This construction can be

repeated, each time creating a new cycle by

taking en an edge of Cn and 01’ C2, ...Cn
distinct.
Thus, 3 an infinite sequence of

cycles leading to a contradiction.

Definition 9

PI

01

Let C(V, E) be a plane network and let C(V, E) be
a dual of G. If s and t are two boundary vertices
of G, (s, t E V) then, by drawing two half curves
from s and t respectively, the exterior of the
network G is divided into two parts.
Let s' and t' be points,, one on each side.
of the two parts of the exterior of G given
above, and let C(V, E) be a dual of G with
s' 6 v . Form a network 6*(V*, E*) as
as follows:

3% ._
V = V U {t'}

Find a boundary cycle through 5' with a
e' ~ .
boundary edge x - s', e' E E and replace such
e' e'
x - s' by x - t', so that the new graph is a

.plane graph.

Repeat 2 for as many cycles of C as possible

without destroying any path from s' to t' which
*

already existed. The new network C is called

a Modified Dual Network of G.

a .
Figure 2 G is a modified dual graph of C (Figure 1):

 

 

 

 

 

 

 

S
a" i
*9
$\ 1,
5 . , . t
0£ 7) o; 9(- 'f,
i
L 18
Ix;
2‘;
- *
C Figure l C Figure 2

*
Construction of the above modified network G from G is illustrated

in Figure 3.

r--
"-‘

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(c) 3rd stage

Figure 3

In the following theorem a notation s will be used to represent

the equivalency between any two edge sets.

Theorem 2.
Let S be the set of all MCS seperating s and t
in the network G. Let 6* be a modified dual
network of G and P' be the set of all MP3 from
s' to t’ in C*. Then we have S a P'.

Proof .
Let S be the set of all MCS's and T be the set

of all cycles in a usual dual network C of C.

Thus, each Sk E S is equivalent to some

rk 6 T , That is, Sk a rk e T , where

rk passes through at least one vertex v' with

v' E V and v' in the interior of C.

Morever, S and t are also seperated by r see

k'
(Figure 4).

If we suppose that M is the set of all such

rk 6 T , we then have:
M s S (1)
Let P’k to be any minimal path of 6* which
goes from s’ to t'. If s' and t' are merged
into one point say 3', then any path from s' to

t' will become a cycle of C through 5' and all
edges containing 5' are edges contained in each MPS

*
of C . That is P'k is equivalent to a cycle

rk of G.

C)

Thus, P k E rk E T (2)
Since rk a Sk E S, it follows that
a P k 6 M
For V P'k E P', it follows that
P' E M (3)

0

On the contrary, if each r, e M and s and

1

)‘c .
t' are the source and sink of G , respectively,

then we can break each cycle rk such that both

3' and t' are to be connected by edges of each

r without disturbing the 1 - l correspondence

k

of edges between E and E' as given in
Definition 5, and meanwhile, either 5 or t of G

is no longer inside any cycle rk of G

Hence, r is a minimal path of Gx by which

k

s and t is separated.

Hence, rk E M
I I = I
and P k E P = rk ~ P k Q P
= P'k e M Q P'
So, M Q P' (4)

is proved to be true. the theorem then follows

from (1). (3), (4) #

‘0

'X 'x .
—>-
5 X: t
7: 1"

(Figure 4)

 

 

 

Definition 10 A Fault-tree is a system logical diagram
composed of logical operational symbols
which explicitly shows the logical interactions
between each basic components of the system,
and gives all the possible system failure

information that may be existed in the system.

[6-10]

By applying the above Theorems, the problem of finding a
given network's MCS simply becomes a problem of enumerating the
minimal path of its modified dual graph. Therefore, any of the
efficient path-finding algorithms can be used to find the minimal
cut sets of a given network and, then, the Fault-tree of such a
network can be easily obtained by simply plugging all the MCS
into a tree with each of its leaves represent a possible failure

event, which equivalently corresonding to a MCS. (see Figure 5)

lo

1.2 Algorithm and Examples
Algorithm
*
(1) Obtain the modified dual network G of G.

(2) Enumerate all the path sets P' between s'

and t' .in G}. and then get the MCS S of

C by theorem 2, ie, S = P’.

(3) Drawing the fault-tree of the network by

applying all the MCS obtained in step (2).

The construction of fault tree stated in step (3) can be
very time consuming if a large complex network is encountered. In
recent years much effort has been spent to develop algorithms and
techniques, for computer-aided automatic synthesis of fault trees
of large networks or any complex systems, and several algorithms
have been presented [11-13]. However, for a network with small
number of components, the corresponding fault tree can be easily
drawn. That is, as is shown in Figure 5 we can first start out
from the source (vertex 5') and let 5' be the 'root' of the tree,
and then draw a tree rooted at s' by representing all the edges
in each of the MCS as an individual path from s' to a 'leaf' (ie,

vertex t') respectively.

ll

For example, the MCS and the fault-tree (Figure 5) of

network in Figure 2 is obtained by using above algorithm:

L

\

 

 

 

 

 

%-
(7t7t35) }; ,, X;
’ (Xx. web)
('3': 3(5 X; XI) M 03.0)19) 7‘1 (Xaﬂﬁi’l‘e’h)
(XI. Xt, Yr) ' (mum
(Figure 5)

The above fault-tree is a simple version which uses node and
'branches' instead of using logical OR , AND symbols
respectively. A formal fault-tree of the same network is given

below. see (Figure 5'):

12

 

 

 

 

 

 

 

 

O
. L/
g».— ‘

 

 

 

Figure 5'

The above fault tree (Figure 5') is logically equivalent to
the simple version (Figure 5), the only difference is that Figure
5' uses the logical AND, OR symbols (see Figure 5") to represent
the relationship between components instead of using nodes and

'branches' in Figure 5. Since the formal fault tree is easier to

construct by using a computer, it is more widely used than the

simple version.

 

 

 

(he—s..-

(AnB-=C) (AUB-=C)

AND gate OR gate

Figure 5"

14

Section 2 Quantitative Analysis by DBM Method

A meaningful masure of the reliability and availability of a
network system is the terminal reliability between a given pair
of vertices, defined as the probability that there exists at
least one path between these two nodes. This parameter depends on
.both the network topology and the reliability of all elements
composing the network and is relevant inot only for network

analysis but also for synthesis of reliable networks.[14]

In recent years much effort has been spent to develop
algorithms [15-17] for computating the network reliability.
However the difficulty is that the complexity of the algorithms
remains exponential with the increase of number of elements,
therefore the need remains to widen the class of tractable

networks.

It is well known that for the quantitative evaluation of
system failure probability (reliability), one must transfer the
structure function (a Boolean S-O-P form) into the disjoint-S-O-
P (sum of product) form. J. M Cargal has shown [18] that the
crucial problem of determining the network reliability is to
assess the desired probability with much less effort than is
currently extended and higher algebra is most likely the avenue

for such a task. To attain the same object by trying the DBM - a

15

Disjoint Boolean Manipulation method in this section, we have the
very helpful approach of calculating the network reliability. The
substance of the DBM method is the minimization principle of a
Boolean function in disjoint sense. In other words, the advantage
of using the DBM method is that it minimizes the number of
Boolean terms of sytem's logical expression while eliminating any
logical redundencies. The ordinary minimization principle of
Boolean functions is to minimize the total number of symbols of
intersection and union manipulations. Using DBM we can not only
simplyfy the computation of network reliability but also make the
design of switching circuits much efficient and easier.

For example, by the ordinary Boolean principle A U B is
minimal for an OR operations with two inputs, but A U A'B or B U
B'A is much more convenient for probability computations. The
difference between these principles seems to be very little when
n is small, but it may become very large eventually when n is
large. In fact the expression of system reliability given by the
sum of events X + X + ... + Xn (these X

l 2 1

MP8 in the network or any compex system that can be constructed

can express a MCS or

logically by Fault-tree technique) may contain 2n_1 terms, but
that of disj01nt sum event X1 + X1 X2 + ... + X 1...X n-lxn
contains only n terms.

Rosenthal has shown [19] that computing the probability of
a desired event (system success or failure) for an arbitry system

structure function, like computing the reliability of a general

network, is NP difficult, ie, the amount of computation increases

exponentially with the number of components in the network.

2.1 DBM Method and Its Application

(1) Nomenclature
UB(X)=UB(X1,...Xn) denotes a usual Boolean expression,
an event-expression for system success or
failure that contains only Boolean variables

and operations.

DB(X)»DB(X1,...Xn) denotes a disjoint Boolean expression
that can be used directly for computing the
system reliability.

X denotes an event.

denotes the indicator of Xi.

(2) Assumptions:

1. A network system S which consists of n-independent

elements could be expressed as:

S = {el,e2,...en ).

We assume that each of these elements has two states:

l7

functioning and failure. The state of ei could be

characterized by the binary variable xi as following:

0 (failure)

1 (functioning)

Thus, x1 is the indicator of the event that the ith component

 

functions.
Definition 11 The system structure function denoted by
9(x1, x2,...xn) is a Boolean function which
is the indicator of the event of system functioning.
Definition 12 a system failure function denoted by

 

W(x1, x2,...xn) is a Boolean function which

is the indicator of the event of system failure.

Let xri be the indicator of the event that the ith component

of the rth MP5 is functioning. The x are x

.,x and, for
ri n

1...

r # s, it is possible to have x . a x for some i and j.

r1 sj

If we enumerate the MCS and MP3 of S, the structure function

and the failure function will be written, respectively, as:

¢(x) - ¢(xl,x2,....xn) = Z eri xri e rth MPS
r1
r(x) = w(xl,x2 ..... xn) = E gxqj xqj e qth MCS

Let P = (p1,p2,...pn) denote the reliability vector of components

18

e1,e2,...en. Because of the assumption that the components

are independent of one another, the domain of g and of W are

extended by setting:

9(P) = El¢<X>1
W(P) = E[W(X)]
Thus,
¢(P) - ¢(pl,p2,....pn) = X Ilpri is the'probability of the
r i
system functioning.
W(P) - \I!(pl,p2 ..... pn) = X Hqu is the probability of the
qj

system failing.

Where, ﬁCP) = 1 - 0(P) since ¢(x) = 1 - W(x).

2 In graphs only arcs may be faulty. Nodes have 0 failure

probability. The reliability of each arc is given.

3. Component (arc) failures are independent.

4. Given a network and two of its particular nodes, 3 and t, all

simple paths from (or all minimal cut sets seperating) s and t

are known or can be found.

(3)

19

Operations (UBM):

Logical
interaction

1. AND operation
2. OR operation

3. NAND operation

4. NOR operation

Using Venn diagrams

OBM
rules

AB
A+B

(AB)'= A'+B’

(A+B)'= A'B'

the OBM and

interpreted as following:

...-M-..—

- -—.—...--—_—

. _

~—-—~_--—
— -— ..——.--__ __

...—...c‘ . ~, - -
-..—...—

..--—--. ...—- -——..--._--.~..q

L c- a--.
—— -... q“...-

—- ...—H——

-_. .-—-- ‘._—_.

 

 

 

A+B by UDM

B --

uu_¢..~ 0...... o -w-...

...-.7

(Figure 6)

Rules & comparision between DBM and usual

DBM

rules

AB
'A+A'B or B+B'A (Fig 6)
(AB)’- A'+AB' or
(AB)'= B'+BA' (Fig 7)

(A+B)'= (A+A'B)'= A'B’ or
(A+B)'= (Bl-B'A)'=== A'B'

UBM

can ‘be

rules

 

 

 

A+B by DBM

(Sa+S ”Sb
shows the reTation of area)

’7)

FJ

 

 

 

 

 

 

 

 

(AB)' by UDM (AB)' by DBM

(Figure 7)

According to the addition and multiplication theorems of

probability, it is easy to see that:

The rules of the two types are equivalent completely
for the calculation of probability.

Althrough the expressions of DB algebra are non-
unique, they are equivalent to each other.

Using DBM to decompose the logical structure of
system, the disjoint Boolean function can be obtained
directly from the logical structure of the network
(or any systems).

By extending, absorbing and summarizing the structure

functions using the logical identities such as:

A + AB - A AB + AB = AB

AA - A AB + AB'= A AA'= 0

the disjoint S-O-P expression of the system can be

eventually obtained.

2.2 Algorithm and Examples

(4) Examples:

A given network is shown in Figure 8 and its fault-tree shown in

Figure 9:

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(Figure 8)

 

 

 

 

 

 

 

 

 

 

 

(Figure 9)

 

1. Using Fratta's algorithm: [20]

OB(X) a ABB + ABE + ABCD + ABDF + AHB + AHE + AHDC + AHDF

- AB + AHE + AHDC + AHDF.

Simplify the above structure function into the disjoint-S-O-P

form by the following steps:

a. (AB)'(AHE + AHDC + AHDF) a (A'+ B')(AHE + AHDC + AHDF)

= B'AHE + B'AHDC + B'AHDF

23

b. (B'AHE)'(B'AHDC + B'AHDF) = (B + A'+ H'+ E')(B’AHDC + B'AHDF)

: E'B'AHDC + E’B'AHDF

c. (E'B'AHDC)'(E'B'AHDF) = (E + B + A' + H' + D' +C')E'B'AHDF

= C'E'B'AHDF

Finally, the disjoint S-O-P form is:

UB(X) = AB + B'AHE + E’B'AHDC + C'E'B'AHDF (l)

2. Using the DBM method:

The disjoint S-O-P form can be obtainted directly (in one
step) from the system structure diagram (in either

network or Fault-tree form):

DB(X) = A(B+B'H)[B+B'E + B'E'D(C+C'F)]
= ABB + ABB'E + ABB'E'DC + ABB'E'DC'F + AB'HB
+ AB'HB'E + AB'HB'E'DC + AB'HB'E'DC'F

= AB + AB'ME + AB'HE'DC + AB'HE'DC'F (2)

Obviously, (l) a (2)

It is easy to see that in the DBM approch the amount of
computation is reduced significantly, and the computer
implementation of calculating the reliabilities between
any given pair of boundary vertices of a network,_can be
more easily performed in a shorter & unique program.
Suppose the element reliabilities of the given network
are as follows:
pa=0.9 pbnO.7 pch.8 pdw0.97
Faro-95 ph=0.7 pf=0.8
Then the system reliability of the network is:
RS = O.9*O.7 + O.9*O.3*O.7*O.95 + 0.9*O.3*O.7*0.05*O.97*O.8
+0.9*0.3*0.7*0.05*O.97*O.2*O.8

- 0.818 ' #
Combining section 1 and section 2 we can form an algorithm for
reliability evaluation (quantitatively & quantitatively) of the

network systems:
Algorithm
*
1. Apply the MDT algorithm to given network and get G of C

and all the MCS in the system.

2. Construct the system structre function ¢(x) using all
the MCS information obtained in step 2 (or by

applying the DBM method).

3. Reduce ¢(x) first by general boolean identities ie:

I")
(fl

X+XY=X XY+XYuXY XY+XY'=X XX=X XX'=O .

Apply the DBM rules directly to the simpfied ¢(x)

from 3.

Any more redundent terms? if yes, go to 4. if no go to 6.

Input all the failure probabilities of every component Xi'

Calculate the reliability directly for the whole

system and then stop.

Chapter 2 Some Optimal Policies for System Reliability

Section 1 Redundency Policy for Series-paralell System

Under Resource Constraints Using L-M Method

This section is devoted to discuss the problem of allocating
redundent components subject to 'resource constraints so as to
optimize some measure of system performance. Since the 19503,
many models and solution procedures have been developed for
various of these problems. The solution procedures can be divided
into two categries: heuristic and exact methods. While useful
heuristics are easy to implement and have modest computational
requirements, they do not guarantee optimal solutions. All known
methods which guarantee optimal solutions are enumerative, such
as integer and dynamic programming. These exact methods have
computational requirements that grow exponentially with the size

of conponents. Tillman & Frank [21] provide an excellent survey

of research on the problem.

The optimization problem presented here is a rather simple
and useful method in practice, which gives approximately the
optimal solution for minimizing the whole system cost under the
constraint that the whole system must meet the designed

reliability requirement. This approach rather than just

27

maximazing the system reliability subject to element constraints,

is most commonly used [22—24].

1.1 Problem Formulation and Solution

 

Suppose there are n different stages (subsystems)
coneeted serialy with each other, . where the ith
(i=1 ..... n) subsystem consist of mi identical
conponents connected paralelly with each other. Assume
also the compoent failures are independent. Given that
the failure probability is qi for any element in the
ith subsystem and that its cost is Ci’ the problem is
to find an optimal allocation of the elements for each‘
subsystem which will minimizing the total system cost
under the constraint that the whole system must be at
least as reliable as is required by the designer. Thus
let R0, be the minimum reliability requirement the

' whole system must meet. That is:

Evaluate the decision variables mi such that

n
Minimizes f(mi) = min E: Cimi (1)
=1

n .
Subject to R = |”“| (1 - q. )2 R (2)

and m , m ,...m are inte ers.
l 2 n g
where Ci’ qi and R0 are given. Since, the higher
the system reliability required, the more element

redundencies need to be added on to each of the
subsystems. so the solution (m1,m2,...mn) which
minimizes the total cost must also meet the requirement
that R 2 R0. However, a little increase in R0 will
result in a big cost for the system, Therefore, to
make the problem simpler we can get an approximation

for the problem by making the following two

assumptions:

a. Use equality (m) in (2) rather than inequality

(2)

b. Take each ”i as a continous variable (vector) and
the logrithm in both sides of (l) and round off

solutions at the very end.

c. Following assumptions (a) & (b) we can easily
apply the Lagrange multiplier method to solve the

problem.

[‘3
LC)

 

 

Let H(ml,m2 ...... mu, 7)
n
mi
__ X clu + 7 (21nd - qi ) - 10%)
1=1
m.
1
6H -qi lnqi
5;. = Ci + 7 ( —————— a? ) = O (l = l, 2, n)
i l
l - qi
mi Ci
9- (3)
i Ci+7lnqi
Ci
Then, m1 = 1n ( C.+7lnq. ) / lnqi (A)
i i
From (2) & (3) and (A) we get:
7“ -{ R lhI(C +71nq )/|_|1nq } = 0 (5)
0 . i i . i

1 l

we can get the solution 70 of (5) and then by (A) we geti

 

) / lnqi (6)

1.2 Example

The system is given as shown in (Figure 10),
it consists of three subsystems, and each of
m different

which is composed of ml, m2, 3

elements parallel. (components in a

Solution:

30

a

subsystem aSsumed all the same)

 

 

 

i .
z , .
_.__._::::::}”__- li_{“‘_:::}... ..ir-’*-j___
(Fig 10)

The failure probabilities of the three kinds
elements are assumed to be 0.02, 0.01, 0.03
and their cost are $3000, $4000, $5000
respectively. The best policy is sought for
allocating those elements so as to minimize
the total cost under the condition that the
system reliability must no less than

0.99.

q1 = 0.02, q2 = 0.01, q3 = 0.03. CI = 3

a 4, c3 - 5 (unit)

find (m1, m2. m3)

(l)

(2)

31

st. min{3ml + Amz + Sm3}
_ mi
subject to | | (1 - q. ) 2 0.99
i=1
By (5).

13 + 30372 - 2977 + 94 = O
solving the equation yields 70 = ~99.806.
Then, by (6),
ml = 1.24), 1112 = 1.032, m3 = 1.215
Rounding up, we finally get:
m1 = m2 = m3 a 2
system reliability is
2 2 2
[1-(0.02) ][l-(0.01) }[l-(0.03) ] = .9986 > .99

80 (ml, m3) = (2, 2, 2) is the optimal policy for

m2,
allocating the three different kinds of elements to the
system, which yields a total cost of (3 + 4 + 5)*2 = 24
(1000 dollars) to the system.

This can be verified as following that reduction

any of the elements mi will violate the system

reliability constraint :

If m2 = 1, system reliability is

[l-(O.02)2][l-(0.Ol)1][1-(0.03)2] = 0.988 < 0.99
If m3 = 1, system reliability is

I)
[l-(0.02)“][1-(0.01)2][1-(0.03)1] = 0.969 < 0.99

On the other hand, suppose we want the total cost be

$22 which is less than $24. let m1 = 3, m2 = 2 m3 = 1,

32

Then the system reliability becomes
[l—(0.02)3][l-(0.01)2][l-(0.03)1] a 0.9699 < 0.99
which violates the R0 constraint. This has verified

that the privous solution for the problem is truly a

optimal solution.

Section 2 Some Optimal Polices Used for Reliability Maintenance

 

And Failure Diagnosing Without Cost Constraint

Based on the results from chapter 1, once we have the
disjoint S~0-P form of system failure function (or success
function), we can also develop some useful mesures which can be
used in determining the optimal polices for system reliabiliy
maintenance and repairing diagnosis. In this section two measures
that may be useful in network reliability maintainence and

failure diagnosis will be discussed [25-26].

2.1 Maintenance Importance: { MIS t(xi) }

For a functioning network system, the primary interest
for the system operator and the maintenance engineer is
to know which one of the components among the system is
more important to the performance of the whole system.

In other words, we want to know the optimal policy

which gives a best time scheduling (or priorities) for
each of the components in a routine maintenance project
which will substantially reduce the risk of system
break-downs. However, without considering the cost, the
main effort is to determine the dependencies of the
network reliability on the reliability of a individul
element. Therefore it is quite natural that the
maintenance importance of the components should be

defined as follows:
HIS t(Xi) = Pr{ system failed I component Xi failed )

This is the conditional probability of system failure

(no path from s to t exists anymore) given that component

Xi has failed. As already described in chapter 1
(section 2), HIS t(Xi) is the expectation of the
failure function given that xi=0, ie,

~ Y a n =

M15 t ii) E[&(x) | xi 0] (2)

Similarly, 0(x) = 1 - W(x) so that,

E[o(x) | xi=0] = 1 - HIS t(Xi) (3)

2.

2

34

Diagnosing Importance: { DI (Xi) ).

For repairing a failed network system, one must
first find out which one of the elements is most likely

to be the failed one which has caused the break-down of

'the whole system. In other words, in order to fix the

system in a lowest cost and shortest time, one must
know the optimal priority ordering for cheking those
components. Therefore, a so-called Diagnosing

Importance DIS t<xi) should be defined as:

1

DIS t(Xi) = Pr{component Xi failed I system failed} (4)

DI

This is the conditional probability that component Xi
failed given the whole sytem has failed already.
Likewise, we can derive DIS t(Xi) by Bayes' rule and

conditioning to the failure of component xi and finally

conditioning to system failure, ie:

(Xi) Pr{Xi failed, system failed} / Pr{system failed}

P{system failed I xi failed}P{Xi failed} / w(p)

= (1'Pi)"“s,c(xi) / Mp) <5)

2.3 Example

For the network system given in chapter 1 (Figure 8)
the component's reliabilities are given as:

p = 0.9 pb = 0.7 pC = 0.8 p8 = 0.95

= 0.8 pd = 0.97 p} = 0.7

1
From (3), MIS,t(Xi) = 1 - E[e(x) | xi=0] and,
W(x) = l - 0(x)

80, W(P) = 1 - ﬁ(P) = 1 - ( papb + paqbphpe
+ paqbphqepdpc
+ paqbphqepdqcpf )

= l - 0.818 = 0.182.

Let w(xi=0) = E[W(x) | xi=0], which is the expectation of the

system failure function given that component Xi is failed.
Then, from (2) we get:

MI (A)

\II(xa = 0) = l - 0 = 1

1 - { 0.9*1*0.7*0.95

2
H
A
on
V
I!

\I!(xb = 0)
+ 0.9*l*0.7*0.05*0.97*0.8

+ 0.9*1*0.7*0.05*0.97*0.2*0.8}

= l - { 0.5985 + 0.02444 + 0.004888 }

= 0.372

3E

HIS t(C) = \I!(xC 2 0) = l - { 0.9*0.7 + 0.3*0.9*0.7*0.95
+ l*0.05*0.3*0.9*O.7*0.97*0.8 }

= 1 - { 0.63 + 0.17955 + 0.0073332}

= 0.183
Similarly,
“Is,t(D) = \II(xd = 0) = 0.1905
MIS,C(H) = II!(Xh = O) = 0.37
“Is,t(E) = II!(xe = 0) = 0.358
MIS t(F) = ‘I!(xf = 0) = 0.183

From (5) and the above results we get:

1*(l—O.9)/O.]82

!!

01 (A) == ms C<A><1 - p > / Mp)

s,t a

= 0. 549

DIS t(B) = M15,t(B)(l - pb) / W(p) = 0.372(1-0.7)/0.182
= 0.613

DIs,t(C) = MIs,t(C)(1 - pc) / w(p) r 0.183(1-0.8)/0.182

= 0.201
Similarly,
DIs,t(D) = MIs,t(D)(1 - pd) / W(p) = 0.0314
. g - T = /
DIs,t(E) MIs,t(E)(1 pe) / V(p) 0.0984
DIs,t(H) = MIs,t(H)(l - ph) / 0(p) = 0.609
= . - r =
DIs,t(F) MIS,t(F)(l pf) / W\p) 0.201

From the above results:

(a) If the system is functioning, the maintenance importances

for the components are listed as follows:

37

(ie, list of MIS t(Xi)from largest to smallest)

D

lst : component A 2nd : component B
3rd : component H 4th : component E
5th : component D 6th : component C or F

Obviously, for the functioning system the optimal
maintenance scheduling for each of the components should

follow the priorities as below:

lst maxinumH-‘II 01.)} ie, component A
. s,t l
1
After finishing examination and repair of component A
the next one will be:

maxi mum ( M I

Xi)’ given A is ok} which can be
i

s,t(

evaluated as maximum(W(xi=0, xa=l)} = 0.3025
i

ie, component B
After finishing component B, and so on we finally get

3rd component E 4th component H

5th component D 6th component C or F

It can be noticed that this actual optimal maintenance
scheduling for the given system is just slightly different

from the list of maintenance importances obtained before.

If the system has failed, the diagnosis importances for each
.of the components are listed below:

(ie, list of DIq t(Xi) from largest to smallest)

lst : coponent B 2nd : component H

3rd : component A 4th : component C or F

5th : component E 6th : component D

From above list, we can decide that component B ( with

largest DIS (Xi) ) . should be the one to check first.
However, the optimal diagnosis policy for checking out the
failed component is more complicated than that of the
maintenance cases described above, and which will depend on

different conditions of the system.

Suppose that on checking component B it was found that it did
not fail. Then the next one (or the only one for this

particular example) to check is:

maximum{ DI
i s,t

(Xi’ given B is ok) } which is obviously

component A . That is, for this particular example if system
has failed and first found out B did not failed, then we can
immediately decide that component A must be the failed one
which has causted the failure of the system.

However, for diagnosing the failure of a general network, if
we first find out component Xj did not failed, then the next

component needs to check will be:

‘maximum{ DIs,t(xi’ given kj is ok) } (1 # j)

max:mum{ DIs,t(Xi I xj=l) }

maximum { Pr(Xi failed I system failed, Xj 0k) }
i

Let in denote the event that component Xi failed

F denote the event that system failed

Then, DI (x. I x.=1)
s,t i j

Pr( FX | F , xj=l }

i s

Pr( F ., F , x. = l }
x1 s jg

Pr( F , x.
S J

l }

ﬂ

Pr{ F3 | F ., x.

l } P{ F ., x. = 1 }
x1 3 xi 3

 

Pr{ F5 , xj = 1 }

Pr( F8 | in, xj = 1 } Pr( in)
Pr{ F5 | xj a 1 }

M1s t( xi | xj = 1 ) ( 1 - pi)

7

W . = l
( X3 )

W( Xi = 0, xi = l ) ( 1 - pi )

 

0(x. = l)
J

40

Thus, by repeatedly using formula (5) we can always find a priority

list for the purpose of system failure diagnosis. That is:

maximum{ DIS
i

a.
L.

(Xi’ given Xj is ok)‘} (i # j)

W(xi-0,xj=1)(1-Pi)

 

= maximum( } (i ¢ j).
i @(x. e l)
J
2.4 Comments and More Examples on Failure Diagnosis Policy

Since the profit related optimal diagnosis policy for a
certain system depends on the actual conditions of the system, it
is not easy to determine an optimal diagnosis strategy which
yields a minimal expected total cost. The following two
examples are employed to show that no simple optimal diagnosis

strategy exists for general network systems.

Example 1

Consider a two components (A1 and A2) series system given

below ( Figure 11):

,4“ 5’, A, t’.
W'm

Figure 11

Here, p., denote the reliability and diagnosing cost of

i ci

component Ai respectively. Let F denote the event that

41

system failed and Ai denote the event that component Ai failed.

Then, we have

 

 

ql
P(Al I F) =
q1 + plq2
q
P(A2 I F) n 2
(11 + plq2

Suppose that the system has failed and let the possible diagnosis
policies be:

Policy 1: check component A1 first

Policy 2: check component A2 first

Let CP1 and CPQ denote the expected total costs involved in

policy 1 and policy 2 respectively. Then we have:

 

 

c 0
2‘2
CPl _ c1 + +
q1 p1‘12
C q
l l
CP2 — c2 + +
q1 p1q2
C q ‘ C q
2 2 l l
and, CP1 - CP2 = C1 - c2 +
ql +p1Q2

= (clplq2 - C2pqu) / [q1 + plqzl.

C1 p2q1
Obviously, if -—+ < , then CP1 < CP2 and
C2 pqu

 

policy 1 is better. Otherwise, policy 2 is better.

Example 2

For the three component system given below (Figure 12),

5 C.)

I

r—-————

A . C, I “J“?
__lon-{"”“'“}~___ﬁ_, plm_lMlmn

Figure 12

q
P(A | F) - 1 ;
q1 + p19233

 

. q (q + p q )
P<B I F) _ 2 1 1 3 ;
q1 + plq2q3.

 

q (q + p q )
P(C I F) - 3 1 1 2 ;
ql + plqzq3

 

q2‘13

 

P(BC | F) —
ql + qu2q3

Suppose that the system has failed and let the possible diagnosis

policies be:

Policy 1: Check component A first. Then,

(02 + C3>q2q3

 

 

 

CPl w c1 +
q1 + plq2q3
Policy 2: Check component B first. Then,
C 9
,, 1 l
CE2 c2 + .
q1 + p1q2q3
Policy 3: Check component C first. Then,
C q
l l
r) =
013 C3 +
(11 + plq2q3
Let Cm = min( c2, c3 ), and

c q
:7. C + ...-1;-..
m

CPm = min( 0P2, CP3 ) .
q1 + p1q2q3

Then, the numerator of CP1 — CPm is:

C1p1q2q3 + (C2 + C3>q2q3 ' Cm(ql + p1q2q3)°

If Cm = c2, the numerator becomes:

C1p1q2q3 '-62q1(1 ‘ qzq3) + C3q2q3'

Set c3 = kc2 ( k 2 1) Then, this quantity is:

Clp1q2q3 ‘ C2[q1(l - q2q3) - kq2q3] = f(ci’pi’k)'

If f < 0 than policy 1 is optimal.

44

Otherwise, policy 2 is optimal.

These examples show that we cannot find a general easy algorithm

for the failure diagnosis of general networks.

[5}

[6]

 

REFERENCES
Barlow R. E., Fussell J.B., Singpuwalla N.D. (ed)
"Reliability and fault tree analysis", SIAM, 1975

Shimon Even, Graph Algorithms, Computer science press, Inc.
Rockville, Maryland 20850,]985

K.K. Aggorwal, Suresh Rai, "Reliability evaluation in computer
communication network", IEEE Trans. Reliability, R-30, 1981
Nakagawa. H. Baysian, "Decomposition method for computing the
reliability of an orented network", IEEE Trans. Reliability,
R-25, No.2, 1976

John A. Buzacott, Clement C. Feng, "An algorithm for symbolic

.reliability computation with path-sets or cut-sets", IEEE

Trans. Reliability, vol R-36, No.1, 1987
R. G. Bennetts, "On the analysis of fault trees", IEEE Trans.
Reliability, vol R-24, No.3, 1975

Dean B. Wheeler, "Fault tree analysis using hit manipulation",

[10]

0"!
’4

[13}

[14]

[15]

[16]

IEEE Trans. Reliability, vol R-26, No.2, 1977

Barlow R. E., Proschan F., Statistical Theory of Reliability
and Life Testing, N.Y. Holt, Rinehart and Winston, 1975

Shi Dinghua, "A unified algorithm for computer-aided fault-
tree analysis", Scientia Sinica (Series A) vol 27, No.1, 1984
Feng Hsu, Shi Dinghua, "Problems and development of fault tree
analysis", Report on 4th National China Symposium of Mathema-
tical Theory of Reliability, Shanghai, Nov. 1-6,1983, Acta
Automatica Sinica, No.4, 1984

P. Camarda, F. Corsi, "An efficient simple algorithm for fault

tree automatic synthesis from the reliability graph", IEEE

Trans. Reliability, vol R-27, No.3, 1978

S.A. Lapp, C.J. Powers, "Computer aided synthesis of fault-
trees", IEEE Trans. Reliability, vol R-26, 1977

Feng Hsu, Shi Dinghua,”Computer-aided fault tree construction"
A selected Paper on 2nd China national conference on reliabi-
lity, China science press, vol.2, Hay, 1985

H. Frank, I.T. Frisch, "Analysis and design of survivable
networks", IEEE Trans. Communication Technol., vol COM-18,
pp501-519, October, 1970

C. Singh, S. Asgarpoor, "Reliability evaluation of flow
networks using delta-star transformations", IEEE Trans. Relia-
bility, vol R-35, No.4, 1986

Luig, Fratta, Ugo G. Montanari, "A recursive method based on
case analysis for computing network terminal reliability",

IEEE Trans. Communication, vol COM-26, No.8, 1978

[17]

[18]

to
H

I."

127-}

{231'

[25]

Liao Jiongsheng, "A new approach for fault tree analysis",
Scientia Sinica (Series A), vol. 25, No.9, 1982

Cargal, J.M., "An alternative fault-tree algebra", IEEE Trans.
Reliability, R-29, No.33, 1980

Rosenthal, A., "Reliability and fault tree analysis", SIAM,
1975, ppl35-152

Luigi Fratta, "A boolean algebra method for computing the
terminal reliability in a communication network", IEEE Trans.
Circuit Theory, vol CT-20, No.3, May, 1973

Frank A. Tillman, “Optimization techniques for system relia-
bility with redundancy - a review", IEEE Trans. Reliability,
vol R-26, No.3, 1977

Yuji Nakagawa, Kyoichi Nakashima, "A heuristic method for
determining optimal reliability allocation”, IEEE Trans.
Reliability, vol R-26, No.3, 1977

Robert L. Bulfin, "Optimal allocation of redundant components
for large systems", IEEE Trans. Reliability, vol R-34, No.3,
1985

Yoshio Hattori, "Reliability optimization with multiple
properties and integer variables", IEEE Trans. Reliability,
vol R-28, No.1, 1979

Martin Fox, Dorian Feldman, NOTES ON PROBABILITY, Department
of Statistics and Probability, Michigan State University,
August, 1986

Frederick S. Hillier, Gerald J. Lieberman, Introduction to

Operations Research, Stanford University, January,1980

   

"'TIIIIIIIIIIIITHIIIEIIHIIIIIIIEs