PLACE IN RETURN BOX to remove this checkout from your record.
TO AVOID FINES Mum on or before duo due.

  

  

,DATE DUE DATE DUE DATE DUE

J
:

   

 

 

 

 

 

 

    
  

~ [
__
l [:1
LI l
JI ll 1

MSU Is An Afﬂrmdivc Action/Equal Oppoﬂunity Institution
cWWd

 

 

 

 

 

 

 

 

 

 

 

LJ

JU

 

 

 

 

 

 

#

 

 

 

 

 

 

 

 

 

 

——_———-

ARTIFICIAL NEURAL NETWORKS FOR

CONSTRAINED AND UNCONSTRAINED OPTIMIZATION

By

Jiahan Chen

A DISSERTATION

Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY

Department of Electrical Engineering

1992

ABSTRACT

ARTIFICIAL NEURAL NETWORKS FOR
CONSTRAINED AND UNCONSTRAINED OPTIMIZATION

By

Jiahan Chen

Finding an accurate solution to linear and nonlinear programming problem formu-
lations is a challenge. Fast, accurate and cost-effective methods for solving simultane-
ous linear and nonlinear equations, for example those found in large-scale power sys-
tem control problems, have been sought for decades. In this dissertation, new
approaches for solving these problems using artiﬁcial neural networks (ANNs) are
presented. Foremost, the reason for degenerating accuracy of previously developed
ANNs for constrained optimization problems is discovered, and a new combination
penalty function for a neural network is proposed which can ensure that an equilibrium
point is sufﬁciently close to the optimal point. Then, an ANN for unconstrained optim-
ization is proposeed which leads to development of linear and nonlinear equation

solvers. Correspondingly, network formulations for the solvers are deﬁned, and condi-

_tions for network stability and convergence are identiﬁed and proven. An architecture

design for the solvers, in which a network is constructed from a neuron array and a
resistance connection matrix, is described and the hardware complexity is given. Furth-
ermore, the linear and nonlinear equation solvers are applied to solving power load
ﬂow and contingency analysis problems. The related issues in these applications, such

as formulation modiﬁcation and a relationship between the transmission line

parameters in a power system and the stability property of the neural network, are
analyzed. In addition, other applications of the linear equation solver are discussed.
These include matrix inversion, determining the stability of a linear control system,
and determining matrix singularity. Simulation experiments show encouraging results
for test examples with linear and nonlinear programming, linear system problems, and
power system control problems. Moreover, it is shown that the time complexity of the
linear and nonlinear equation solvers is problem size independent so that real-time
computing for large-scale power systems will become possible when the approaches

are implemented by VLSI technology.

ii

ACKNOWLEDGMENTS

It is my great pleasure to thank my major advisor, Dr. Michael A. Shanblatt, for
his guidance and encouragement throughout the years of my graduate studies. I espe-
cially appreciate his constructive critiques and intellectually challenging questions that

initiated signiﬁcant research directions.

I also want to thank my other Ph.D. Committee memberers, Dr. P. David Fisher,
Dr. Robert A. Schlueter, Dr. Gerald L. Park, and Dr. Joseph Gardiner, for their valu-

able comments and suggestions on this work.

Furthermore, I wish to appreciate the ﬁnancial support from the Center for
Remote Sensing and the Case Center, Michigan State University, which has lasted for

more than four years.

Finally, I wish to dedicate this dissertation to my parents for getting me started
right and encouraging me to pursue higher education, my wife, Zhenying Shen, for her
love, patience, understanding, and support, and my lovely daughters, Hallie and
Quanda.

TABLE OF CONTENTS

LIST OF TABLES ................................................................................................... vii
LIST OF FIGURES ................................................................................................. viii
Chapter 1. Introduction ........................................................................................... 1
1.1 Overview .................................................................................................. 1

1.2 Problem Statement .................................................................................. 3

1.3 Research Tasks ....................................................................................... 5

1.4 Organization of the Dissertation ............................................................ 8
Chapter 2. Artiﬁcial Neural Networks .................................................................. 11
2.1 Background .............................................................................................. 11

2.1.1 Basic Concepts ........................................................................ 12

2.1.2 Features ..................................................................................... 14

2.1.3 Leaming .................................................................................... 15

2.2 ANN Architecture .................................................................................... 19

2.2.1 Feedback Model ........................................................................ 19

2.2.2 Feedforward Model ................................................................... 22

2.2.3 Cellular Model .......................................................................... 23

2.3 ANN Applications ................................................................................... 24

2.3.1 Categories of Applications ............................ 24

2.3.2 Methodologies of Neural Computing Applications ................. 25

2.3.3 A Procedure for Optimization Applications ............................ 27

Chapter 3. Neural Networks for Linear and Nonlinear Programming ............ 35
3.1 Introduction .............................................................................................. 35

3.2 Methodology Review ............................................................................... 37

3.2.1 The Penalty Function Method .................................................. 38

3.2.2 The Lagrange Multiplier method ............................................. 38

iv

3.2.3 ANN Techniques ...................................................................... 39

 

 

3.3 A New Combination Penalty Function ................................................... 41

3.3.1 The Boundary Situation ............................................................ 41

3.3.2 A New Combination Penalty Function .................................... 43

3.4 Hardware Conﬁguration ........................................................................... 44

3.5 Experiments and Simulation Results ....................................................... 47

3.5.1 Examples ................................................................................... 48

3.5.2 Discussion ................................................................................. 49

3.6 Summary .................................................................................................. 50
Chapter 4. Solving Linear System Problems Using

An Artiﬁcial Neural Network ............................................................. 58

4.1 Introduction .............................................................................................. 59

4.2 Methodology Review ............................................................................... 60

4.2.1 The Traditional Methods .......................................................... 60

4.2.2 Systolic Arrays .......................................................................... 63

4.2.3 The MIMD Method .................................................................. 64

4.3 A Neural Network Solution to Linear Equations ................................... 66

4.4 The Relationship to Linear Systems ....................................................... 67

4.5 Architecture - .......................................................................... 72

4.6 Other Applications ................................................................................... 75

4.6.1 Matrix Inversion ........................................................................ 75

4.6.2 Determining the Stability of a Linear Control System ........... 76

4.6.3 Determining the Singularity of a Matrix ................................. 77

4.7 Simulation Results ................................................................................... 79

4.7.1 Veriﬁcation - ....................................................... 79

4.7.2 Hardware Simulation ................................................................ 81

4.7.3 Hardware Complexity ............................................................... 82

4.8 Summary .................................................................................................. 83

Chapter 5. ANN Techniques for Power System Control Analysis .................... 97

5.1 Instruction ................................................................................................. 98

5.2 Problem Statement and Methodology Review ....................................... 99

5.2.1 Problem Statement .................................................................... 99

5.2.2 Methodology Review ................................................................ 101

5.3 ANN Formulation for Power Load Flow ............................................... 104

 

5.3.1 ANN Solution to Nonlinear Equations .................................... 104

5.3.2 The Full Power Load Flow ...................................................... 108

5.3.3 The Decoupled load Flow ....................................................... 113

5.3.4 The DC Load Flow ................................................................... 114

5.3.5 Contingency Analysis ............................................................... 115

5.4 Architecture .............................................................................................. 115

5.4.1 Veriﬁcation ................................................................................ 118

5.5 Summary .................................................................................................. 122
Chapter 6. Conclusion ............................................................................................. 136
6.1 Summary - - - - ....................................................................... 136

6.2 Contributions ............................................................................................ 138

6.3 Future Research ....................................................................................... 139
Appendix A. Proof of Theorem 5.1 ......................................................................... 141
Appendix B. The Bus and Branch Data for the 39-bus Power System ................. 144
Appendix C. Simulation Results for Power Systems .............................................. 146
BIBLIOGRAPHY ..................................................................................................... 148

2.1
2.2
3.1
4.1
4.2
4.3
4.4
4.5
4.6
5.1
5.2
5.3
5.4
BI
3.2
Cl
C. 1

LIST OF TABLES

Comparison of the nervous system and an ANN system. ............................... 34
Weights and thresholds of a neural logic unit. ................................................ 34
Comparison results for linear and nonlinear programming. .......................... 57
The time complexity for linear system computing. ........................................ 93
The connection of resistors. ............................................................................ 93
Linear equations solutions. .............................................................................. 94
Matrix inversion. .............................................................................................. 95
Stability. ........................................................................................................... 96
Singularity. ........................................................................................................ 96
A Space estimation of the architecture for FLF. .............................................. 134
A space estimation of the architecture for DLF. ............................................. 134
A space estimation of the architecture for DCLF. .......................................... 134
The contingency analysis. ................................................................................ 135
The bus data for the 39-bus system. .............................................................. 144
The branch data for the 39-bus system. ......................................................... 145
The 7-bus system comparison results. ............................................................ 146
The 39-bus system comparison results. .......................................................... 147

1.1
2.1
2.2
2.3
2.4
2.5
2.6
2.7
3.1

3.2
3.3
3.4
3.5
3.6
3.7
3.8
3.9
4.1
4.2
4.3

4.4
4.5
4.6
4.7
4.8
4.9

LIST OF FIGURES

A classiﬁcation of computer architectures. ......................................................
A processing element. .......................................................................................
A logic unit. .....................................................................................................
The Hopﬁeld network. ......................................................................................
The Kennedy-Chua network. ...........................................................................
The block diagram of Rordriquez network. .....................................................
The basic switched-capacitor integrating neuron. ..........................................
A cellular network. ..........................................................................................

Equipotential surfaces in space and their relationship to the feasible
region boundary. ..............................................................................................
Different penalty functions. .............................................................................

A sectioned penalty function. ..........................................................................
A combination penalty function. .....................................................................
Quadratic programming circuit. ......................................................................
A modiﬁed scheme for constrained ampliﬁer unit. ........................................
Trajectories for example 3.1(a). ......................................................................
Trajectories for example 3.2 (for 13 = 0.415). ...............................................
Trajectories for example 3.3. ...........................................................................
A systolic array for LU decomposition. .........................................................
A systolic array for linear state equations. .....................................................
The image of ¢(x1, x?) from equation (4.22), an ellipse-parabolic

surface. .............................................................................................................
A network architecture for the linear equation solver. ..................................
Applications of the ANN solver. ....................................................................
Error curves for examples. ..............................................................................
Trajectories for an example using different initial points. .............................
Trajectories of multi-solution (a singularity matrix). .....................................

The result of SPICE simulation (two neurons). .............................................

10
29
30
30
31
32
32
33

51
51
52
52
53
54
55
56
56
84
85

88
89
89
9O

4.10 The result of SPICE simulation (three neurons). ........................................... 91

4.11 The result of SPICE simulation (ﬁve neurons). ............................................. 91
4.12 A relationship between the chip area and the size. ....................................... 92
5.1 The neural network for classiﬁcation. ............................................................. 123
5.2 A power system consisting of 3 subsystems. ................................................. 124
5.3 A network for the full power load ﬂow. ........................................................ 125
5.4 A network for the decoupled load ﬂow. ......................................................... 126
5.5 A network for the DC load ﬂow. .................................................................... 127
5.6 The 7-bus system. ............................................................................................ 128
5.7 Trajectories of angle for the 7-bus system. .................................................... 129
5.8 Trajectories of voltage for the 7-bus system. ................................................. 130
5.9 Trajectories of angle for the 39-bus system. .................................................. 131
5.10 Trajectories of voltage for the 39-bus system. .............................................. 132
5.11 The contingency analysis for a 5-bus system. ............................................... 133

Chapter 1

Introduction

 

Since the late I980s, Artiﬁcial Neural Networks (ANNs) have captivated the
interest of researchers. The use of ANNs introduces fundamentally new concepts into
machine intelligence computing problems. ANNs have been successfully applied in
areas such as pattern recognition, associative memory, adaptive control, intelligent
robotics, and optimization. This dissertation presents a new approach for improving
the performance of artiﬁcial neural networks for linear and nonlinear programming,
and develops a neural network-based linear and nonlinear equation solver. This
includes the theoretical analysis, network formulation, architecture conﬁguration, and
approach veriﬁcation. This chapter begins with an overview of neural computing, fol-

lowed by the problems to be solved and research tasks of this work.

1.1 Overview

The birth of the electronic digital computer, the Von Neumann machine, marked
a new era of information processing in human civilization. Since then, computers have
developed from the ﬁrst generation to the ﬁfth generation, as devices, architectures and
computing mechanisms have advanced. From the electronic tube, to the transistor, to

the integrated circuit, to VLSI chips, the devices have become smaller, faster, and

cheaper. A variety of architectures such as SISD, SIMD, MISD, and MIMD have been
used in the Von Neumann-based paradigm. Due to the control-driven property, it is
difﬁculty to obtain a higher throughput in Von Neumann computers. Consequently, the
dataﬂow and reduction computers have been proposed [Dec89]. The former is data-
driven, that is, program instructions are executing as soon as their operands are ready,
and the later is demand-driven, that is, program instructions are carrying out when
results are needed for other calculations. All of these computers are called conven-
tional computers because their computing is based on the algorithm-oriented processes.
Of course, these computers have played very important roles in scientiﬁc, industrial,
and commercial sectors, because they are much superior to the human brain in speed
and accuracy for conventional computation tasks. They, however, are far inferior to the

human brain in intelligence, perception, fault tolerance, and intolerance to fuzzy input.

A nascent research area — the sixth generation, brain-like computers — has
emerged, which requires cooperation among the following scientiﬁc disciplines: biol-
ogy, psychology, mathematics, and computer science and engineering. The research
area is divided into three aspects: Science, Technology and Applications [SoSo88].
Scientists are exploring intrinsic principles such as perception, cognition, and intelli-
gence. Technologists are developing new techniques for adaptive learning, pattern
recognition and decision making. At the same time, they are envisioning and imple-
menting new hardware architectures, new processing algorithms, and new silicon dev-
ices. Application engineers will enhance information processing capabilities dramati-
cally in some areas, such as pattern recognition, computer vision, robotics, and optimi-
zation. In addition, they will open new ways to developing much more intelligent
machines. Consequently, tremendous changes will take place in the future. However,
it is impossible for the neural computers to replace the the conventional computers.
Because it has been proven that the conventional computers are more suitable for tasks

such as logical reason, sequential mathematical operations, planning, and language

understanding. In other words, both the neural computers and the conventional comput-
ers will be developed harmoniously. Furthermore, a hybrid computer may be
developed in the future which associates the real-time computing property of the neural
computer with the features of the conventional computer, such as general purpose and
program ﬂexibility, so that an excellent overall performance can be achieved. Figure

1.1 shows a classiﬁcation of computer architectures.

With the distinct features of asynchronous parallel processing, continuous-time
dynamics, and global interaction of network elements, artiﬁcial neural networks create
an opportunity for enhancing real-time control and decision-making processes. A major
candidate in these processes is large-scale power system control problems because they
are complex and time consuming. In this dissertation, a set of fundamental research
tasks aimed toward developing artiﬁcial neural network techniques for power system
control will be performed. The results of this research will lay the fundamental
mathematical and analytical groundwork to take advantage of neural computational
concepts in this application area. This will be accomplished by a series of research
tasks which will lead to implementing the algorithms and architectures from this
research in VLSI. This may ultimately be the key to providing a robust computing
environment for enhanced on-line control and decision-making, a development that

would greatly improve the security of electric power systems.

1.2 Problem Statement

Artiﬁcial neural networks, especially feedback networks, have been used to solve
optimization problems. However, the computing accuracy from the present models can-
not meet the demand for some applications. For example, the relative error is 1-3% for
linear and nonlinear programming [ChLi84, KeCh88, Roet90]. The problem here is to

ﬁnd the reason for degenerating computing accuracy and a method for improving

accuracy.

Solving linear equations is a fundamental computation task. The time complexity
of all the present methods is size-dependent: 0(n3) for the software approach, 0(n)
for the hardware approach [HwCh82, Pret86, CoRo87, JoJe88, Jelo90]. As the scale of
the system increases, the computing efﬁciency becomes a critical issue. In order to

overcome the problem of the size-independence, a new approach should be developed.

On-line control of large-scale power systems has always been weakened by an
inability to solve large-scale complex power load ﬂow and contingency analysis prob—
lems in time frames required for real—time response to rapidly changing operating con-
ditions [BuHW82, CoDK86, MoPG87, BaEM89]. The problem has been aggravated by
the lack of robust algorithms and implementation technology which would allow for
the design of economically feasible dedicated high-speed computational hardware

architecture.

Artiﬁcial neural networks have shown great potential for solving complex compu-
tation problems in real-time. A lot of progress has been made in ANN theory, architec-
tures, and applications [TuHW86, AtSu89, Aget89, WiLe90, Fiet89, SoPa89, MaSh90,
TaTr91, TrWa91, WeEl91]. Therefore, a signiﬁcant opportunity exists for applying the
ANN technology to the above problems. By utilizing these computational advances
and linking them to power system analysis, the research work will focus on:

(1) analyzing artiﬁcial neural networks for linear and nonlinear programming from
the viewpoint of optimization theory;
(2) developing an ANN solver for general linear equations;

(3) developing ANN techniques for the power contingency analysis;

(4) exploring a method for solving nonlinear equations by artiﬁcial neural networks

and applying it to solve the load ﬂow problem.

1.3 Research Tasks

The following research plan is used to achieve the goals of this research in a

stepwise and overlapping fashion and to set the stage for subsequent developmental

research further exploiting the expected results.

Task 1:

Objective:

Signiﬁcance:

Approach:

Task 2:

ANN Performance Improvement

Analyze various network formulations of feedback models from the
viewpoint of optimization theory in order to discover the reason for

degenerating computing accuracy.

This analysis will generate insights which will be useful in developing a
new method for improving performance of ANN for optimization prob-

lems.

ANN feedback models for optimization will be investigated ﬁrst. The
role of the feasible region boundary for linear and nonlinear program-
ming with constraints will be identiﬁed. In reviewing general optimiza-
tion techniques, the penalty function method will be analyzed, particu-
larly its behavior in a neighborhood range of the boundary conditions.
Based on these analyses, a new combination penalty function will be
proposed which exhibits good behavior in both the boundary neighbor-
hood and far away regions. This function must also be easily imple-
mented with physical circuits. Then, one ANN model, the Kennedy-
Chua network, for example, will be chosen to test the new penalty func-
tion with a modiﬁed circuit scheme for the constraint ampliﬁer unit, and

to check the computing accuracy for linear and nonlinear programming.

ANN Solver for Linear Equations

Objective:

Signiﬁcance:

Approach:

Establish a relation between the quadratic optimization and the linear
equations solution, and map the latter into an artiﬁcial neural network

SU’UCIUI‘C.

This step in this research will lay a theoretical foundation in linking
artiﬁcial neural networks and linear equations solution. In consequence,
a new application area of ANN, a completely parallel equation solver,

will be explored.

Quadratic programming without constraints and its application will be
investigated. In fact, there exists a relation between the optimal point of
quadratic function and the solution of linear equations. The idea here is
to map the general linear equations (Ax = b) into an ANN model. The
primary theories used in deriving the formulation will include matrix
theory, differential equation theory, and optimization theory [BaSt70,
HiSm74, BaSh79, Cro80, HiLi86, Ort87b]. It is essential to guarantee
stability and convergence of an ANN model. The stability of the qua-
dratic optimization network depends on the coefﬁcient matrix A , partic-
ularly on the eigenvalues of A. Based on this initial understanding, the
problem can be divided into three cases:

(1) A is a symmetric positive deﬁnite matrix. The optimization network
is equivalent to linear equations solution in a sense of equilibrium.

(2) A is an arbitrary nonsingular matrix. It is necessary to ﬁnd an
equivalent transformation L such that L(A) is a symmetric positive
deﬁnite matrix. Case 2, therefore, will be turned into case 1.

(3) A is a symmetric or asymmetric matrix with positive real part of
eigenvalues. There exists an unique stationary point of the optimization

network which is satisﬁed with the linear equations.

Task 3:

Objective:

Signiﬁcance:

Approach:

Task 4:

Objective:

Signiﬁcance:

Correspondingly, a hardware conﬁguration of ANN solver will be

sought in a modular fashion.
Steady-State Contingency and Power Load Flow Formulations

Use the models and mappings to express contingency analysis and

power load ﬂow as corresponding artiﬁcial neural network formulations.

This task will provide a case study veriﬁcation of the models and map-
pings. Furthermore, the results will indicate the capability for getting a

feasible solution to power system control problems by ANN.

The results from Task 1 and Task 2 will be applied in the contingency
analysis and power load ﬂow problems. Some approximation techniques
will be used for simplifying formulations. A mapping relationship
between the optimization networks and the nonlinear equations solution
will be sought. An emphasis will be put on how to get a feasible solu-
tion for power load ﬂow from the viewpoint of the security. In order to
work more efﬁciently, the formulations will be divided into three steps:
(1) the DC contingency, (2) the AC contingency and decoupled load
ﬂow, and (3) the full load ﬂow.

Simulations

Produce simulations of artiﬁcial neural network approaches to the new
penalty function, the ANN solver, the contingency analysis and power

load ﬂow.

The simulations will demonstrate the degree of practicality of imple-
menting these approaches. The comparison will show the advantages
and disadvantages of the approaches proposed with respect to other

approaches.

Approach: Simulation programs for all the above approaches will be developed in
C language. They will be used to provide performance metrics for com-
parison with other approaches. The simulations will also be used to
study convergence properties, computing accuracy, and network sensi-
tivity regarding certain parameters, for example, the input coefﬁcients
for the new penalty function, the time increment for numerical integra-
tion, and the initial value for power load ﬁow. According to the simula-
tion results, possible modiﬁcation and reformulation of the networks will
be done by going back to the previous tasks for the purpose of perfor-

mance improvement.

1.4 Organization of the Dissertation

The remainder of the dissertation is organized as follows. Chapter 2 contains a
background of artiﬁcial neural networks. It begins with the basic concepts of neurons
and neural networks, and then it brieﬂy describes the distinct features of neural com-
puting, followed by the discussion of learning methods and artiﬁcial neural network
architectures. It ends with an outline of ANN applications and a general procedure for

solving problems using the neural network technique.

Chapters 3, 4 and 5 form the backbone of the dissertation, which deal with

Research Tasks I, 2 and 3, respectively.

Chapter 3 presents a method for improving the computing accuracy of artiﬁcial
neural networks for linear and nonlinear programming. Based on investigation of boun-
dary situation of optimization problems with constraints, a new combination penalty
function is proposed which can ensure that the equilibrium point is sufﬁciently close to
the Optimal point. A modiﬁed ANN architecture with the new penalty function circuit

scheme is given.

Chapter 4 presents a new method for solving linear system problems using an
artiﬁcial neural network. Based on a mapping between the solution of linear equations
and quadratic minimization, solving linear equations can be viewed as ﬁnding the
minimum of a quadratic function. The conditions for stability and convergence of the
neural network-based dynamic system are identiﬁed and proven. A feedback ANN
model with symmetric or asymmetric connections for optimization is proposed to form
an ANN linear equation solver which can be implemented with VLSI technology. In
addition, other applications of the approach are discussed. These include matrix inver—
sion, determining the stability of a linear control system, and determining matrix singu-
larity. ’

Chapter 5 presents a new approach for solving nonlinear equations using an
artiﬁcial neural network technique. From an engineering point of view, a formulation
of an ANN nonlinear equation solver is proposed which can signiﬁcantly simplify
hardware architecture. The conditions for stability and convergence of neural network-
based on the formulation are proven. Furthermore, ANN formulations for the full
power load ﬂow, the decoupled load ﬂow, and the DC load ﬂow are given, and

corresponding architectures for them are proposed, which can be implemented with

VLSI technology.

Finally, Chapter 6 summarizes this work, identiﬁes the major contributions, and

points out the future research.

10

 

 

33:00

 

Biased

 

.mocaeroS 339:8 do couaocmmmﬂo < .2 Semi

 

 

Ban

 

 

 

 

 

 

as Boa

 

 

 

 

 

12:02

 

 

 

 

 

 

93:2

 

 

 

Q52

Dam

 

 

 

 

Dam

 

 

=80:qu

 

 

 

 

 

 

30c San—

 

 

 

 

 

Eaﬁnoz :o>

 

 

 

 

 

883800

 

 

 

 

 

 

\

3:333:00

/\

 

11

Chapter 2

Artiﬁcial Neural Networks

 

In this Chapter, a brief review of the development of artificial neural networks is
given. Then, the basic concepts, features, learning algorithms, and architectures of
neural networks are described. Finally, some issues of ANN applications are dis-

cussed.

2.1 Background

The notation and concept of V neural networks began in the 19405 with a linking
between biology and mathematics. By the late 19505, the ﬁrst neural network simula-
tions emerged [WiHo60]. After that, some learning rules, system control theories, and
algorithms for updating connection weights were released. Neural network research,
however, suddenly declined in the late 19605 due to a book titled Perceptrons written
by Marvin Minsky and Seymour Papert [MiPa69]. These leading artiﬁcial intelligence
scientists favored the Von Neumann machine and over-criticized the immature neural
networks’ shortcomings. The research was not revived until 1982 when John Hopﬁeld
presented a paper to the National Academy of Science [Hop82]. Hopﬁeld’s theory and
experiment re-convinced contemporary scientists of the beneﬁts and feasibility of

ANNs. A great return to neural network research followed. Since then, many

12

researchers have been involved in biological neuron modeling, neural dynamics, learn-
ing algorithms, neural architectures, applications and their implementations [RuHW86,
Mea88, MeIs89, GrKu89]. The following sections summarize the basic issues of neural

network technology.

2.1.1 Basic concepts

A neuron model is an abstract description of a neural cell. There are two major
categories of models [DaDe91]. The ﬁrst category attempts to represent the behavior of
nerve cells in the deepest mathematical details, often involving differential equations.
This model is sufﬁcient for use as a tool to validate and predicate the behavior of the
human nerve system. Based on a number of simpliﬁcations, the second category of
models places focus on the cause-effect and functions of nerve cells. Most neural net-
work models, including the ones discussed below, use the simpliﬁed model rather than

the detailed models.

Figure 2.1 illustrates a simpliﬁed neuron model, called a processing element. Neu-
ron i receives a set of input signals, (x1, x2, x"), each of which is multiplied by a
weighting factor, wki. The combination function adds up all weighted input signals to
produce the summation u,-. The transfer function maps the summation signal to an out-
put signal v,- according to a certain relationship. The normalized range of output sig-

nals is between 0 and 1. The output signals are then transmitted to other neurons.

The weighting factors have a signiﬁcant meaning for a real or artiﬁcial system. A
positive weighting factor represents an excitation connection because the level of
activation will increase from the positive weighted input. Conversely, a negative
weighting factor represents an inhibitory connection because the level of activation will
be reduced. The weighted connections play an important role in a neural network, as

these connections form a weight matrix which characterizes the dynamic behavior of

13

the neural network to a great extent.

A variety of functions, such as linear, piece-linear, threshold, and sigmoidal, can

be used to represent the transfer functions. Among them, the sigmoidal function

1
V1=f(u.-)= —— (2.1)
is most commonly used for the transfer function. The reason is that the sigmoidal

function has the following useful properties:
(1) The function is similar to the response function in the biological neuron.
(2) The output is bounded from both above and below;
(3) The function is continuous and continuously differentiable;

(4) The function is monotonically increasing, because
e

> o. (2.2)
(1 + 6“")2

f’(ui) =

At the kernel of neural computing is the artiﬁcial neural network (ANN) which

has been deﬁned as: massively parallel, interconnected networks of simple (usually
adaptive) elements and their hierarchical organizations which are intended to interact
with the objects of the real world in the same way as the biological nervous system
does" [Koh87]. It is important to point out that the ANN is not a model of the
human brain which is much more complex, but a model inspired by the human brain’s

parallel processing aspect. Table 2.1 indicates the similarities between the human ner-

vous system and an ANN system [KaBr89].

A threshold function is a simple transfer function. For the binary output case, as
shown in Figure 2.1, if u; .>_ T, then v,- = 1, otherwise, v,- = 0, where T is a threshold.
ANNs are often built in a multi-layer architecture (see Sections 2.2.2.2 and 2.2.2.3) so
that they can perform more complex information processing. As an example, Figure

2.2 illustrates that four basic logic problems (AND, OR, XOR, XNOR) can be solved

14

by an identical neural network with different weights and thresholds. Table 2.2 lists

their corresponding values.

2.1.2 Features

Artiﬁcial neural network technology introduces completely different concepts 'into
computing, that is, non-algorithmic computing, learning, distributed memory, fault

tolerance, and parallel processing [KaBr89].

2.1.2.1 Non-algorithm Computing

Conventional computing is an algorithm-oriented process in which each task is
described in an exact program and is fulﬁlled by serial or partially parallel machine
instructions. Neural computing is a dynamics-oriented process in which each task is
mapped into a proper network architecture, and is fulﬁlled by the network’s inherent

convergent behavior.

2.1.2.2 Learning

Many desired applications of conventional computer technology can not be
economically developed because the software needed is often too expensive to design,
write, and debug. Ideal neural networks overcome this bottleneck because they do n0t
have to be programmed for an application but theoretically learn an application by
training through examples. In other words, neural networks possess artiﬁcial intelli-

gence based on their accumulated "experience".

2.1.2.3 Distributed Memory

15

Conventional computer memory stores data and instructions in an address-
accessed manner. Neural computing technology stores information in a distributed

interconnected weight matrix.

2.1.2.4 Fault Tolerance

Conventional computer is very sensitive to its component failures, because the
computer is composed of a set of different functional units without redundance or with
little redundance. Neural networks can keep in operation although a few components
fail to work, because they have redundant connections and form a homogeneous struc-

ture [NeSY92].

2.1.2.5 Parallel Processing

Parallel processing in conventional computers is often based on time overlapping
(e.g., pipeline) and spatial duplication (e.g., vector register and systolic array) such that
an instruction is divided into a series of sub-instructions which can be executed simul-
taneously. Neural computing technology has an inherent parallel behavior because it
uses numbers of processing elements with adaptive connections such that signals can

be transmitted and processed in parallel throughout the network.

2.1.3 Learning

In order to imitate human intelligence, a neural network must learn how to organ-
ize itself. This means the network will determine its connection weights by training for
meeting the demands of a particular problem. Leaning has been deﬁned as " a change
in behavior which is to a signiﬁcant degree permanent in nature and which results
from activity, training or observation" [Jac88]. There are three basic learning methods:

supervised learning, reinforcement learning, and unsupervised leaning. In supervised

16

learning, the input and the corresponding target for an example are known. The learn-
ing process adjusts connection weights within a network such that the difference
between the output and the target approaches to a minimum. In reinforcement learning,
the target is not given in an exact quantity, but in a "good" or "bad" fashion. In unsu-
pervised learning, without the target description for an example, a network will
develop its own classiﬁcation by extracting features and using the dynamic relationship

between the inputs and the connection weights.

Learning rules are exact algorithms which give an explicit description for updat-
ing connection weights in a network dynamically. Four well-known learning rules are

brieﬂy described below.

2.1.3.1 The Delta Rule

The Delta Rule is expressed in
AWij = e * (0-01),“ 0“ (2.3)

where WU is the connection weight from element i to element j, e is a learning con-
stant, tj is the target activation of unit j, a j is the actual activation of unit j, and a,- is

the input node’s activation.

As supervised learning, the Delta rule is very efﬁcient for two-layer networks. A
well-known learning rule, Back Propagation, is used for three-layer and multi-layer
networks. An interesting link between back propagation learning and multivariate non-
linear least square estimation is established in [Ang89]. The purpose of learning is
updating weight matrix according to training data such that the squared error between
the desired output and the computed output is minimized. Using a normalization form,
the squared error can be written as

02(S,Y)="17i121(5ﬁ " in)2 (24)
F 1=

l7

1 n o h OH I H] 2
= ‘22 f 2W1}; f Ewe in ‘ Y}:
”j=1z=1 k=1 i=1
where
X i]- is the input value,

Yij is the desired output value,

Si]- is the computed output value,

I , o , h are the number of input, output, and hidden neurons, respectively,
n is the number of training exemplars,

W’” is the weight matrix between the input layer and the hidden layer,
W0” is the weight matrix between the hidden layer and the output layer,

f is the transfer function.

From a mathematical point of view, the problem can be viewed as solving the set of

equations

 

 

a[02(s, Y)] = 6[D2(S, Y)] =

(2.5)
awk’j’ :9ng

Here the number of equations is p, (= n -0) and the number of unknown variables is
p2 (= I -h + h '0 ). There are three cases for the general equations. If p1 = p2, the prob-
lem is determined since there may exist a unique solution. If p1 > p2, the problem is
overdeterrnined since there may exist no solution. If p1 < p2, the problem is under-
deterrnined since there may exist inﬁnite solutions. For coping with all of the cases, a
solution can be found in the sense of the least square estimation. Using the Newton-

Raphson iteration [Sca66], the updated weight scheme can be obtained as

a[oz(s, Y)]

 

 

w,f,.”(m +1) = wg’on) — or aw til-”(m) (2.6a)
2
w,2”(m +1) = w,2”(m) - a a”) (S ' Y” (2.6b)

3W3”(m )

18

where m is an iteration index, and or is a learning factor (0 < or < l). a is based on

the following assumption for the Newton-Raphson method

_1__
f ’(X‘m’)

0t. (2.6c)

2.1.3.2 Adaptive Resonance Theory (ART)

ART is as an unsupervised learning rule. By using long term and short term
memories, ART can classify the input and enhance its memory capability. This means
if the input pattern stored in short term memory is similar to the expected pattern
stored in long term memory, the input is classiﬁed to that category. If there is no simi-

larity between the input pattern and expected patterns, a new class is formed.

2.1.3.3 Kohonen Learning

The Kohoren Learning Rule is written as
W?” = W?“ + a (x,- — Wf’d) (2.7)

where X,- = (1:1,, 121': ° - - xni} is the input vector for processing element i, W,- is
the weight vector associated with inputs for processing element i, and a is a learning

constant.

The Kohonen learning rule is useful for unsupervised learning.

2.1.3.4 Boltzmann Learning

This is a reinforcement learning rule [I-IiSe86]. The signiﬁcant features of
Boltzmann learning are that a probability function, the Boltzmann distribution in this
case, is used to adjust connection weights, and simulated annealing is used to control

dynamical settling of the network into a minimum energy state.

19

The above learning rules are widely used in feedforward neural networks. For
feedback neural networks, the learning rule is often deﬁned as the state updating of a

network [Hec90]. For example,

It
new = sgn (2 W.,-xf"‘ — T.) (2.8)
i=1

where x,- is the state, sgn is a sign function, T,- is a threshold, and Wu is the connec-
tion weight. In general, wij- takes a ﬁxed value, and wij = wji.

Another aspect of learning within a feedback neural network is that an energy
function E is deﬁned and E is always decreasing as the state is updating. Eventually,

the network arrives at a stable state when the learning process converges [KaBr89,

Hec90].
2.2 ANN Architectures

Many ANN architectures have been proposed for solving a variety of problems
such as optimization, pattern recognition, computer vision, and associative memory
[Mea88, Hop84, TaH086, AnR088, FoTa88, TaTr91i, MoAG91]. According to their
topologies, the architectures can be divided into three categories: feedback neural net-

works, feedforward neural networks and cellular neural networks.

2.2.1 Feedback Neural Networks

In this type of neural network there exi5t feedback paths from outputs to inputs of
neurons. A key dynamic issue of these networks is that the energy derivative must be

strictly decreasing so that the system equilibrium state can be reached.

20

2.2.1.1 The Hopﬁeld Model

The signiﬁcance of the Hopﬁeld model is that it can be built with analog circuit
components and is suitable for analog VLSI implementation [Hop84, TaH086]. In this
model, each processing element is an ampliﬁer with a capacitor C; and a resistor p,- at
the input node, and a sigmoidal transfer function from input iii to output vi. Process-
ing element i is connected to processing element j via a ﬁnite conductance Ti]. all of
which form a symmetric connection weight matrix T. The architecture of the general
Hopﬁeld network is shown in Figure 2.3. By using KCL, the time derivative of input

potential u,- can be written as

(3.3%: £724). __“_é.+,. (2.9)
t dt j=1 l] j Ri t
where
1 1 n 1 1 n
_=_+ _=_+ T.., (2.10)
R; Pt El Rij Pt ,2 U

I,- is an input bias current for neuron i.

An energy function is deﬁned by integral of equation (2.9) as

E=——ZZT,-J v,vj -2::I,-,-v + Z— lijif 1(§;)d§;. (2.11)

2i: I]: -l i=1R o
The time derivative of the energy function can be found as

d_E__i_1_df(u.-() agz
dt _ i‘Cl dug (av,-

 

—)2 . (2.12)

Because v,- = f (u,) is monotonically increasing, and C,- is positive, g:— S 0 for all t,
the value of the energy function is dstrictly decreasing and becomes zero only at equili-

brium point at which -aav—= —C,-d —— =0 for all i. As described at the end of section

2.1.3.4, this represents the learning in the Hopﬁeld model [KaBr89].

21

The Hopﬁeld model has been used in combinatorial optimization, linear program-
ming, dynamic programming, signal and image processing applications [Hop84,

ChMS91, N aCh92]

2.2.1.2 The Kennedy-Chua Model

Kennedy and Chua proposed a cononical circuit model with feedback [ChLi84,
KeCh88]. Their model can be applied to both linear and nonlinear programming. In
addition, the networks’ structural parameters are in correspondence with the
coefﬁcients of the objective function and constraints descriptions. Figure 2.4 illustrates
an architecture of the model, where p-cells are constraint ampliﬁer units, and j-cells

are integrators (neurons).

The basic relation holds for the circuits as given in [KeCh88]

(2.13)

where ¢(v) is the objective function, g(v) are constraints, i- = pj (gj (v)). C,- is capaci-

tance, and v is the node voltages v1, v2, V”.

The energy function can be deﬁned as
m 81' (V)
E(V) = ¢<v> + 2 (Mods. (2.14)
i=1

The derivative of energy function is

2

dE " dVr
_ = = - C. _ . 2.15
dt Er ‘( dt ) ( )
. (1"; 2 dB .
Obvrously, C,- > 0, (717-) .>. 0, => 7 S 0. Therefore, E(v) 15 a Lyapunov

function which ensures that the system is completely stable.

22

Because the linear programming network of Hopﬁeld obeys the same unifying
stationary cocontent theorem as the cononical nonlinear programming circuit of Ken-
nedy and Chua, the Hopﬁeld model can be regarded as a special case of the Kennedy-
Chua model [KeCh87].

2.2.1.3 The Rordriguez Model

Although it is similar to the Kennedy-Chua model in principle and purpose, the
Rordriguez model has a distinct feature. The RC-active technique is replaced by SC-
reactive (Switched-Capacitor) technique which is more suitable for VLSI implementa-

tion [Roet90].

Figure 2.5 is a block diagram of this model. The time derivative of the objective

function ‘I’(‘) can be expressed as

d‘I’

_ ” _ ' d” _ dx‘ 2 (216)
d. '2377- ("——>—.—“2‘27’ ‘

. . . d ‘l’
where t 15 a posrtrve parameter. Therefore, 7‘— S 0.

In order to implement a discontinuous pseudo-cost function, a basic switched-
capacitor integrating "neuron" is used in this model. Figure 2.6 shows this scheme. By
using even and odd clock phases, the operation of Figure 2.6 becomes threshold-

controlled [Roet90].

2.2.2 Feedforward Neural Networks

A feedforward network is composed of multiple layers of neurons. Each layer
forms an individual group to perform a speciﬁed information task. The whole network
is connected from layer to layer, but there exists no connection in the same layer. The
signals propagate from the ﬁrst layer (input layer) to the last layer (output layer). The

transfer functions used may be different from layer to layer, but they must be the same

23

within a given layer.

2.2.2.1 The Two-layer Model

This model consists of only an input-layer and an output-layer. It can be used in
simple information processing. One kind of two-layer model has been used in an asso-

ciative memory [K0587].

2.2.2.2 The Three-layer Model

When the representation provided by the outside world is such that the similarity
structure of the input and output patterns are very different, the two—layer network
without internal stages will be unable to perform the necessary mapping. The three-
layer model is composed of an input-layer, an output-layer, and a hidden—layer. This
model is often used in pattern recognition based on some learning algorithms [I-Iec87,

WiLe90].

2.2.2.3 The Multi-layer Model

This model is similar to that of the three-layer model except with more than one
hidden layers. The multi-layer is capable to handle more complex information process-
ing tasks. One thing here worth pointing out is that the hidden layers must use non-
linear transfer functions. Because multiple hidden layers with linear transfer function
are equivalent to a single hidden layer [DaDe91]. In other words, it makes no sense to

use multiple hidden layered network when linear transfer functions are used.

2.2.3 The Cellular Neural Networks

In fact, many live organs are built as a cellular structure so that connections and

interactions among cells within an organ are more efﬁcient. Chua and Yang developed

 

24

a model of cellular neural networks [ChYa88a, ChYa88b]. An example of a two
dimensional cellular neural network with 4 x 4 cells is shown in Figure 2.7, where
each cell consists of linear capacitors, linear resistors, linear and nonlinear controlled

sources and independent sources.

Due to their binary value outputs, the cellular neural networks have been used in
image processing for feature extraction and character recognition [RoKa82, ChYa88b].
Moreover, the nearest neighbor interactive property and regularity of cellular neural

networks makes them suitable for VLSI implementation.

2.3 ANN Applications

By far the largest area of research in neurocomputing is that of applications.
Meanwhile, it has been shown that some signiﬁcant achievements in neurocomputing
were associated with applications [WiHo60, StPi63, Hop82, CaGr87]. In this Section,

application categories, methodology and related techniques are described.

2.3.1 Categories of Applications

Neurocomputing applications can be roughly divided into four classes: Pattern

Recognition, Control, Data Analysis, and Optimization.

Pattern recognition is the major application area of neurocomputing [Gro76,
Hec90]. The input of the neurocomputing system is spatial image data or time serial
signals. The neural network fulﬁlls a mapping from an input vector to an output
classiﬁcation. Associative memory can be regarded as a special example of this
category.

Adaptive control has a great signiﬁcance for real world problems. By supervised

traing or reinforced training, neural network controllers attempt to adjust the system

25

state by minimizing the difference between actual output and desired output. The
Vision-Based Broomstick Balancer [ToWi88] and Automobile Autopilot [She88] are

good examples of this kind of application.

Data analysis is the highest potential application area for emulating human brain
behavior to some extent. The ultimate purpose of data analysis is usually to extract
concentrated information from fuzzy raw data. Based on the knowledge from expert
systems, the trained neural network can make decisions for a variety of input informa-
tion in a very short time. For example, the One-Minute Manager has shown surpris-

ingly good performance for a neural network based medical diagnosis system [Bla86].

Optimization is a straight forward application of neural networks, because neuro—
computing is a dynamic process such that an equilibrium point of the system is often
associated with an optimal solution for a given problem. Recently, several achieve-
ments have been made for different problems in this area such as combinatorial
optimization [Hop84], linear and nonlinear programming [TaH086, KeCh88, MaSh92
ChMS92], and dynamic programming [ChMS91]. Since the dissertation focuses on this

area, the issues concerned in great detail will be provided in the following Chapters.

2.3.2 Methodology of Neurocomputing Applications

In order to solve real problems with neurocomputing efﬁciently, it is necessary to
combine neurocomputing with other technological methods. In other words, methodol-
ogy plays an important role in neurocomputing application projects. In the following

three key issues will be brieﬂy discussed.

2.3.2.1 Problem Solving Methodology

It is essential for application researchers to have a comprehensive understanding
of the work domain for which the application is intended. Therefore, the primary

methodology for neurocomputing applications is to "work in reverse" [Hec90]. That is,

26

ﬁrst, a list of neural networks is constructed by the domain specialists. The list
describes each network’s problem solving capabilities and training requirements. Then,
a set of operational functions is developed. Furthermore, applications in each of these

areas are considered, and candidate applications are evaluated as well.

2.3.2.2 Functional Speciﬁcation

The purpose of functional speciﬁcation is to deﬁne an application project pre-
cisely so that the developers who are not domain specialists can work on the project
more efﬁciently. Generally, the functional speciﬁcation can be outlined as follows

[l-lec90]:
(1) Description: Describe the capabilities of application project to be developed.

(2) System Interface: Identify requirements for interfaces between the neurocomput-

ing system and other systems.

(3) Human Interface: Identify requirements or constraints regarding the interface

between the developed system and humans.

(4) Performance: Provide requirements on the overall performance of the system,

such as accuracy, speed, reliability, size and mass.

(5) Economic factors: Estimate development cost and potential beneﬁt of the project.

2.3.2.3 Technological Development

Once the functional speciﬁcations have been made, technical development will
begin. This includes mathematical mapping, gathering training data, developing learn-
ing algorithms, and modifying network architecture. Of course, technical details
depend on the particular application. A procedure for optimization applications is

presented next.

27

2.3.3 A Procedure for Optimization Applications

Since ANN s for optimization problems will be developed and applied in the fol-
lowing Chapters, a general procedure for solving this kind of problems is described

below:

(1) Translate a real problem into an optimization problem;

(2) Develop a dynamical system for the optimization problem;
(3) Choose proper variables, and encode them;

(4) Map the the dynamical system into a neural network;

(5) Deﬁne the energy function for the network;

(6) Develop a learning algorithm for updating connection weight, or identify and

prove the conditions for stability and convergence of the network;

(7) Choose a proper initial point, and ﬁnd a solution to the problem by the network

(simulation or physical implementation);
(8) Improve the performances of the network.

In this procedure, heuristics are frequently used. For example, in the Traveling-
Salesman Problem (TSP), the best path is always chosen from a city to one of its four
nearest neighbors; if two cities A and B are as far apart as possible, they will tend to
occur near opposite sites of the closed route [HoTa85]. In the pattern recognition, the
expert’s knowledge often gives a good hint for feature extraction. In other words,
human intelligence still plays a dominant role in the design of ANN application sys-

tems.

Loosely speaking, ANN applications can be divided into two categories:
stochastic-oriented and deterministic-oriented. Pattern recognition, classiﬁcation, robot
vision and event forecasting are the stochastic applications, whereas the well-deﬁned
optimization such as the TSP, linear and nonlinear programming problems are the

deterministic applications. In the stochastic applications, the computing accuracy

28

depends on learning algorithm and variable choice. The relative error is measured by
the comparison between the computed result and the actual value, with a range from
1% to 8% [EOMD91, PaEM91, SrLC9l]. In the deterministic applications, there exists
an exact solution to the problem so that the relative error can be deﬁned quantitatively.
The computing accuracy depends on the mathematic formulation and architecture. In
addition, the environmental situation exerts an inﬂuence on the accuracy. The
inﬂuential factors include parameters such as the time increment for the Runge—Kutta
numerical integration in the software simulation, and the quality index of circuit com-
ponents in the hardware implementation. From an engineering point of view, it is
essential to improve the computing accuracy for ANN applications. This issue will be

discussed In more detail in the next Chapter.

29

Input

 

Combination Transfer
function function

Figure 2.1. A processing element.

30

 

1
A Tl W a w5

w2

9 X
w3

 

 

 

 

 

 

 

 

 

 

 

 

 

w6
4
. . w c
Figure 2.2. A logic unit.
I
11 2 In
T22
T11
if? P? pn
111 C1 ': “2 C2 T On Cu 7
l o o o
' l
V1 V2 VII

Figure 2.3. The Hopﬁeld network.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

‘2

 

 

4%
2

 

 

 

 

 

. ' ' 0' ' FZDZ—J
I? l 2 2 Eggp—
BT G
Fig 2.4. e Kennedy hua ne ork.

 

'2 2 2'2 2 22,1

 

 

 

 

%

 

 

 

 

B‘P
5?.
Nonlinear akp
Network .372
to calculate
a
VP (1’) '
‘8)?
'37

32

 

 

 

 

/\

i I

 

 

 

 

Figure 2.5. The block diagram of Rordriguez network .

 

 

 

 

 

 

 

 

 

even 81
,1- _1_. ’
even;- F0odd A
Y1 “T -_1_—
, - 3
y? ’ ° _1_
h
even T £00dd +
y2 + 0 -_E
O
O
even 3 m
’m - ° ’ _1_ /
hgo . I
even 4.) C O
/ O

 

 

Tp(sm)

Yout

Figure 2.6. The basic switched-capacitor integrating neuron.

lC(1,1)

 

 

C(2,1)

 

 

 

 

 

 

 

 

C(3,1)

 

 

 

 

 

 

 

 

 

 

 

C(4,1) _—

 

 

33

 

 

 

 

 

 

 

C(1.4)

 

 

 

 

 

 

 

 

Figure 2.7. A cellular network .

C(2,4)

C(3,4)

 

C(4,4)

 

Table 2.1. Comparison of the nervous system and an ANN system.

 

 

 

 

 

 

 

 

Nervous system ANN system
=
Neuron Processing element
Dendritcs Combining function
Cell body Transfer function
Axons Element output
Synapses Weights

 

 

Table 2.2. Weights and thresholds of a neural logic unit.

 

 

 

 

 

Operation w1 w2 W3 W4 w5 w5 T1 T2 T3 T4 T; X
AND 1 0 0 l 0.5 0.5 1 1 1 l 1 A - B
OR 1 0 0 1 1 1 1 1 l 1 1 A + B
XOR 1 -1 -1 1 l 1 l 1 l 1 1 A GB
XNOR 0.5 -l 0.5 -l 1 1 1 l 1 0 l A G B

 

 

 

 

 

 

 

 

 

 

 

 

 

 

35

Chapter 3

Neural Networks for Linear

and Nonlinear Programming

 

In Chapter 3, a method for improving the performance of artiﬁcial neural net-
works for linear and nonlinear programming is presented. By analyzing the behavior
of the conventional penalty function, the reason for the inherent degenerating accuracy
is discovered. Based on this, a new combination penalty function is proposed which
can ensure that the equilibrium point is sufficiently close to the optimal point. A
known neural network model has been modiﬁed using the new penalty function and a
corresponding circuit scheme is given. Simulation results show that the relative error

for linear and nonlinear programming is substantially reduced by the new method.

3.1 Introduction

Artiﬁcial Neural Networks (ANNs) have been applied as models of computation
for solving a wide variety of problems in such diverse ﬁelds as combinatorial optimi-

zation [l-lop84, TaH086, FoTa88], computer vision [RuHW86], and pattern recognition

36

[GaGr87]. In particular, some networks have been used for solving linear and non-
linear programming problems. There are three well-known models among these net-

works.

Tank and Hopﬁeld used sigmoidal input-output cells as a neuron model and
mapped the cost function and constraints into close-looped networks [Hop84, TaH086].
When a constraint violation occurs, the magnitude and direction of the violation is fed
back to adjust the states of the neurons in the network. Their model can be directly
implemented with analog circuits, however, the global optimal solution can be obtained

only under the assumption of very high gain and inﬁnite conductance.

Kennedy and Chua used integrator cells to model neurons and mapped the cost
function and constraints into a canonical nonlinear circuit [KeCh88]. Their model can
be applied to both linear and nonlinear programming. In addition, the networks’ struc-
tural parameters are in correspondence with the coefﬁcients of the objective function
and constraint descriptions. The shortcoming of their network, however, is that the

optimal solution can only be guaranteed with inﬁnite conductance.

Rodriguez-Vazquez, et al., also used integrator cells for the neurons and mapped
the cost function and constraints into a switched-capacitor network [Roet90]. Their
pseudo-cost function is not continuous from the feasible region to the infeasible region.
The distinct feature of their model is in a substitution of the RC-active technique by an
SCoreactive technique which is more suitable for VLSI implementation. However, a
new problem arises — there is no equilibrium point in the system because of discon-
tinuous behavior.

Due to the limitations mentioned above, the relative error of results from these
models is 1% to 3% [KeCh88, ChLi84], although, it is recognized that these are case-
dependent results [MaSh89]. The motivation for this work is to develop a new method

for improving the computing accuracy of linear and nonlinear programming.

37

In this Chapter, the role of the boundary for linear and nonlinear programming is
investigated. By analyzing the behavior of the conventional penalty function, the rea-
son for degenerating its accuracy is discovered. Based on this, a new combination
penalty function is proposed which can ensure that the equilibrium point is sufﬁciently
close to the optimal point. By using the new penalty function, a modiﬁed ANN model
is described and an alternative circuit scheme for the constraint ampliﬁer unit is given.
The stability and convergence of the modiﬁed model is addressed. Finally, a few

examples of linear and nonlinear programming are solved by simulation of the new

model.

3.2 Methodology Review

A large class of problems in science and engineering can be formulated as
optimization problems in which a minimum or maximum solution is sought to an
objective function subject to certain constraints. From a mathematical point of view,
however, the case of maximizing a concave function is equivalent to that of minimiz-
ing a convex function [BaSh79]. Therefore, only the latter will be discussed here.

Consider the following optimization formulation:
Minimize
$00 (3.1)
subject to
gt(X)20 i=1.'°°,m (3.2)

Where ¢(x) is a nonlinear strictly convex function, x e R", and gi(x) is convex for all
1'. The nonempty convex set S=[x:g,- (x) .>_ 0, i =1, 2, ..., m} is called the feasible

region.

38

The penalty function and the Lagrange multiplier are two well-known methods
for solving the problem [BaSh79, HiLi86]. The basic idea is to translate the con-
strained minimization problem into a new unconstrained minimization problem by

introducing a modiﬁed objective function.

3.2.1 The Penalty Function Method

Deﬁne a new objective function as

if (x) = ¢<x> + 11010!) (3.3)
where
(1(x) = m p,- (x) (3.4)
1'21
and '
0 if 810‘) 2 0.
pl (X) = (3.5)

lgio‘) IP ifg,-(x) < 0.

For the conventional penalty function, u > 0, and p is a positive integer, in general,
p 2 2.

A theoretical analysis has shown that only when it approaches inﬁnity can the
solution to (3.3) be the same as the solution to (3.1) and (3.2) [BaSh79]. Two prob-
lems arise: it is difﬁcult to realize inﬁnite p. physically, and too large a u may result in

an ill-conditioned problem [BaSh79].

3.2.2 The Lagrange Multiplier Method

This method can only handle the exclusive equality constraints. By introducing a

relaxed function Si , the inequality constraints can be transformed into equalities:

39

q.- = 810‘) - S. = 0 (3.6)
where S,- is a non-negative function depending on a relaxed variable xMi, generally
taking the form as

s, = x,,,2. (3.7)

Thus, the problem is changed to minimizing a modiﬁed objective function without
constraints
a: m
(D (X) = $00 + 22w,- (3.8)
i=0
where h,- is a unknown constant. A minimum point must satisfy

#

act
8x-

I

 

=0 fori=1,2,-~,n

(It = 0 fori = 1, 2’ ..., m. (3.9)

Consequently, solving the minimization problem with equality constraints is equivalent
to ﬁnding a root of the simultaneous equations with real coefﬁcients as formulated in

(3.9).

3.2.3 ANN Techniques

Tank and Hopﬁeld ﬁrst developed a simple optimization neural network and
applied it to the linear programming problem [TaH086]. Their network consists of two

parts: an ampliﬁer unit and a constraint unit. They are associated by the following

equations:
Cd“i ( ”" m'af’ (310)
.— : — . + —— + .— .
t dt at R Ell] all;
__ (a, + R + Elk-d!)
J:

and

40

v,- = g(ui) (3.11)
where g (-) is a sigmoid function.
¢(v) = vTa (3.12)
is the objective function subject to the constraints
fj(v = Vde - bj 2 0 (3-13)
for j = 1 to m, and i j- = q(fj(v)) is the penalty function.

The energy function for the network is deﬁned as

m fj(v)

E(v) = vTa + %£g“1(©d§ + 2 g q(g)dg (3.14)
j=l

n
l:

l

and the derivative of the energy function can be found as

d. 777

l

dv‘: dlli
£2 " (3.15)

n dg(u,) du;
C- ( )2.

d8 (ui)

 

Srnce g (u,) rs a monotone 1ncreas1ng function, 15 posrtrve. Thus, 7 S 0 1nd1-

i
cating that the system will tend to move toward a local minimum point.
In a modiﬁcation to this model, Kennedy and Chua proposed a neural network for
nonlinear programming described by a set of ﬁrst-order differential equations in (2.13).
The energy function E (x) is deﬁned in (2.14). It has been shown that this network can

cope with linear and nonlinear programming [KeCh88].

Networks of this genre were further developed by Ma and Shanblatt in which a
variety of ANN optimization models for linear, quadratic and general nonlinear pro-

gramming were analyzed. They showed that the optimization network formed by

1=1

it = — V¢(x) - s[§ g j+(x)Vg )- (x)] (3.16)

41

fulﬁlls both the Kuhn-Tucker optimality conditions and the criteria of the penalty func-
tion method [MaSh91]. Under an assumption of convexity, the network described by
the dynamics of equation (3.16) traverses along the surface of the energy function and
eventually settles on a local minimum. Furthermore, a two—phase optimization network
model was proposed such that different dynamics are used as the phase is changed by
a predetermined timing switch. As a consequence, the exact solutions of the network

can be guaranteed by adjusting the penalty parameter.

3.3 A New Combination Penalty Function

Since the penalty function method is widely used in ANN optimization models, it
is essential to investigate the behavior of the conventional penalty function technique

in order to propose an improved network for optimization problems.

3.3.1 The Boundary Situation

Let x" be the solution to (3.1) without constraints, and let x‘ be the solution to

(3.1) and (3.2) with constraints. The following Theorem can be stated.

Theorem 3.1: If x"e S, then xi: x"; otherwise, x‘e B(S), where S is bounded

and B(S) is the boundary of S.
Proof can be found in [BaSh79].

From Theorem 3.1, the constrained problem can be divided into two cases: case 1
in which the solution lies in the interior of the feasible region, and, case 2 where the
solution lies on the boundary of the feasible region. For the former, the solution is the
same as that without constraints and can be expressed as a set of algebraic equations
by setting the partial differentials of ¢(x) equal to zero. For the latter, (nonlinear pro-

gramming with constraints —— NLPW), obtaining the solution is more difﬁcult due to

42

possible complex situations on the boundary. In practice, most optimization problems

belong to case 2 and therefore it will be examined in detail.

For linear programming (LP) without constraints there is no minimum point. The
minimum point for LP with constraints (LPW) can only be on the boundary of the
feasible region [HiLi86] provided the feasible region is bounded. Thus, a strategy for

improving computing accuracy will be useful to both NLPW and LPW.

Consider a gradient system which is often used for optimization by ANNs:
it = -V¢" (x) = -[V¢(x) + Vuor(x)]. (3.17)

Due to the convex assumption, the unique equilibrium point (i.e., the minimum point

for (3.1) and (3.2)) must satisfy
i=0 ’ (3m)
which means

V¢(x) = —Vu0t(x). (3.19)

Figure 3.1 illustrates the situation for case 2 where dashed curves indicate equipo-
tential surfaces of ¢(x). Let xb be the solution of the problem, described by equations
(3.1) and (3.2), lying on the boundary of the feasible region, i.e., x,, = x'. Let xbo be
a point outside the feasible region S but in a neighborhood of xb so that
¢(xbo) < ¢(xb). According to the assumption of convexity, and equation (3.19), it can
be concluded that vectors V¢(xb) and Vua(xb) are in the opposite direction as shown
in Figure 3.1. Let l(xb) = (xb,5, xb), called a small "left" neighborhood of xb. If the
penalty function is too weak to increase ua(x) enough to offset the decreasing ¢(x) in
the range l(xb), then d)’ (Xbo) may be slightly lower than (if (xb) , or almost equal to
¢’ (xb ), where xboel(xb). For the conventional penalty function, it is clear that
'810‘66) H"1 is so small that its offset can be almost ignored. This is the reason why

the computing accuracy of the previous models, which use the conventional penalty

43

function, is not satisfactory.

3.3.2 A New Combination Penalty Function

According to the above analysis, it is desirable to ﬁnd a new penalty function
which can ensure that a solution to (3.3) is sufﬁciently close to that of the original

problem (3.1) and (3.2) [ChMS92].

a .
The objective is to make 8 (p, ) increase rapidly. A fraction-exponent function
81' x5- ’
can be chosen as
0 if gi(x) 2 0,
Pt (X) = ﬂ _ (3.20)

I81(X)| p ifg,-(x)<0
where p is a positive integer. Correspondingly, the derivative of p (-) is

8p,(x) _ 0 if 810‘) 2 0.
agi (X) p+1
p

 

_1- (3.21)
lgi(X)|p ifgi(X)<O

Suppose g,~(x) = x, Figure 3.2 illustrates three example penalty functions’ deriva-
l
tives |x |, Ix lz, and Ix | 8. Obviously, the last one has a good behavior in [(0).

1
One new problem arises: for [gi(x) la 1, the growth of | gi(x) I” is much smaller
than that of a linear function, therefore, the penalty function has a reduced inﬂuence on

(f (x) as |gi(x) |increases. At this time, however, |gi(x) | becomes large enough to be a

penalty. Therefore, a sectioned function can be used as

r

o ifgi(x)20,
ﬂ 1

p.(x)-_-i lg.-(x)| p irg,(x)<0and|g,(x)|p 2|8t(X)|. (3.22)

r

lgz(x)l2 if8s<xl<0and|a<x>lp <Ig.-(x)|.

 

44

An illustration of the effect of the derivatives of this sectioned function is shown in
Figure 3.3.

Furthermore, simplifying equation (3.22) by comprehensively utilizing

2:1.
| g,(x') | P and |g,~(x) |2 , provides a combination penalty function of the form

0 If gt (X) .>_ 0,
pi(x) = ﬂl (3.23)

k1 I81“) I p + k2 lgi(x)|2 ifg,-(x) < 0
where In > 0, k2 > 0. The purpose 0f putting these two coefﬁcients together is to sim-

plify implementing circuits which will be discussed in Section 3.4.

Equation 3.23 is the new desired penalty function, whose derivative is shown in
Figure 3.4. Comparing Figure 3.3 and Figure 3.4, it is apparent that the latter is
smoother and can yield a more accurate result. Since the derivative of the latter is con-
tinuous, and the value of the combination function at any given x (x < 0) is greater
than that of the corresponding selectioned function. In addition, the hardware imple-
mentation for this combination penalty function will be more efﬁcient than that of a

sectioned function which requires a comparator (see Section 3.4).

Note that the combination penalty function is still a continuous function since it is

the sum of two continuous functions.
3.4 Hardware Conﬁguration

As described in Section 2.2, the feedback neural networks are built in a "closed"
form so that they can be more easily implemented. Among them, Kennedy and Chua’s
network is a continuous model, and can handle linear and nonlinear situations without
a structural revision [KeCh88]. By modifying of the energy function and constraint

ampliﬁer circuit of Kennedy and Chua’s canonical nonlinear programming model, an

45

implementation of the combination penalty function proposed in Section 3.3 is dis-

cussed here.
Consider a quadratic programming problem:
Minimize

¢(x) = ATx + é—xT Gx (3.24)

subject to
g(x) = Bx - e .>_ 0 (325)

where A and x are n-vectors, g and e are m-vectors, B is an mxn matrix, and G is an
nxn symmetric, positive deﬁnite matrix. If G = 0, then this becomes a linear program-

ming problem.

Deﬁne the energy function as

500 = no + 35pm) (3.26)
j=1

where p j (x) is the new penalty function from equation (3.23), and the dynamical equa-

tion at node i can be written as

__'_ = - __ _z—'-—', (3.27)

Due to the continuity of pj(-), the derivative of the energy function can be

obtained as follows:

_ = 2—— + 2—1—4 (3.28)

 

 

46

= =—z<—)

i=1

n(,-),-dxdx
.:,(t)tdd

2
Obviously, (7)) _ 0, :> €5— S 0. Therefore, E (x) is a Lyapunov function which

ensures that the system is completely stable.

Based on the above, the circuit implementation will be straightforward. Accord-

ing to (3.24) and (3.25), variable x can be represented as voltage, and integrators are

ag-
used to solve dynamic equation (3.27), where 5J— is constant so that it can be rmple-
xi
. . . . apj
mented as a resrstance. The derrvatrve of the penalty function, 3g—, can be
1'

represented as current. Thus, a circuit scheme is shown in Figure 3.5, where p-cells are
constraint ampliﬁer units, j-cells are integrators and the interconnection network is
resistive. This network is similar to Kennedy and Chua’s model with the exception of
the constraint ampliﬁer units. The penalty function in Kennedy and Chua’s model can
be written as

0 if g j (x) 2 O

i 2 .
EIgJ-(xﬂ 1ng-(x)<0

where R0, and R,- are resistances. The model proposed here uses the new penalty
function of equation (3.23) for the constraint ampliﬁer unit. Comparing (3.23) and
(3.29) (see Figure 3.4), it is observed that the former will shrink the equilibrium region
efﬁciently and puts the trajectory back to the boundary quickly. Thus, the network will

converge to a stable result more quickly.

As mentioned above, current is used to represent the derivative of the penalty
function, i.e.,
if g,(x) 2 0,

i. =3g_= I (330)
I kp;-—L lam!” +2k2lg (x)| “81m”

47

p 1
— _ k — _, 3.31
[(1 — and 2- ( )

if 811x) 2 O,
i. = _= 1 (3.32)
Isa-(x) l P + |g,~(x)| ifgr(X) < 0.

This describes the new constraint ampliﬁer circuit. A problem remains in the imple-

1
mentation of x p which requires a complex circuit. By using the mathematical mani-

pulation

i 3. in,

x” =emp=ep , (3.33)

the radication is replaced by a simple logarithm, proportion, and exponentiation. The
circuit scheme for the modiﬁed constraint ampliﬁer unit is shown in Figure 3.6, where
T1 is the IN inverter, T2 is the comparator, T3 is the log unit, T4 is the proportion
unit, T5 is the antilog unit, and T6 is the sum unit; in T4, p of pR is the parameter for
the new penalty function. In fact, the circuit consists of operational ampliﬁers, transis-
tors and resistors with a regularity in conﬁguration so that it can be implemented by

VLSI techniques without difﬁculty.
3.5 Experiments and Simulation Results

According to the approach in Section 3.3 and the architecture proposed in Section
3.4, a simulation was developed in C language. The fourth-order Runge-Kutta numeri-
cal integration [WeRe66] was used for simulating the integrator cell in Figure 3.5.
The parameters for the penalty function in formulas (3.3) and (3.32) take values: 11 = 1

~ 1.5, p = 10. The typical cases (linear, quadratic and 3rd-order programming) are

48

considered. The three experimental examples are chosen from the recent literature

[Hop84, ChLi84, KeCh88, Roet90] to provide comparative data. In all of the exam-

ples, the initial points are located in different directions from the optimal point, includ-

ing the outside, the inside, and the boundary, of the feasible region to test the capabil-

ity of the new penalty function. The examples and their results are described below.

3.5.1 Examples

Example 1 [Hop84, ChLi84, KeCh88, Roet90]:

Minimize

subject to

and

where

(a)C1=
(b) C]:

(06'1—

((1)01

The

¢(XI,X2)=C1X1+ C212

ix, _x2 _ 22,
12 12
%x1+x2 S 3—25-,
—x135,
1255,

—l,c2=—1;
-1,c2= 1;
1,c2— 1,
l,c2——1

simulation results are listed in Table 3.1 and the trajectories for (a) are

shown in Figure 3.7. In this Figure 3.7 the solid lines represent the trajectories and

the dashed lines represent the boundary of the feasible region. (The same convention

is also used in the following examples).

49

Example 2 [ChLi84, KeCh88, Roet90]:

Minimize
¢(x1,x2,x3) = 0.4.rt1+-%(5)c12 + 8r22 + 4x32) - 3x112 — 3x2x3
subject to
x, + x2 + x3 2 1
and x1, x2,x3 2 0.

The results are listed in Table 3.1, and the 2-D trajectories on the x3 = 0.415 pro-

jection plane are shown in Figure 3.8 for clarity.

Example 3 [ChLi84, KeCh88, Roet90]:

Minimize
¢(x1,x2) = 0.4x1+ x12 +x22 - xlxz + 316x13
subject to
x] + 0.51:2 2 0.4,
0.5161 + X2 2 0.5,
and x1, x2 2 0.

The results are again listed in Table 3.1 and the corresponding trajectories are

shown in Figure 3.9.

3.5.2 Discussion

Starting at an initial point, each trajectory in the above examples is moving
towards the boundary. It then continuously converges to the equilibrium point on the
boundary. The relative errors are 0.03 - 0.08%, 0.02%, and 0.73% for example 1, 2,

and 3 (linear, quadratic, and 3th-order nonlinear programming), respectively. For com-

parison, the results of hardware implementation measurements by Chua and Lin

 

50

[ChLi84], and Kennedy and Chua [KeCh88] are also listed in Table 3.1.

3.6 Summary

NLPW and NPW problems have been studied from the aspect of computing accu-
racy. The conventional penalty function’s weakness has been found. With a very big
coefﬁcient u, the term uor(x) still may not provide a large enough "penalty" to (if (x)
in the "left" neighborhood of the boundary due to the polynomial behavior of (1040.
Based on this, a new combination penalty function has been proposed which exhibits
good behavior in both the boundary neighborhood and far away regions. By manipulat-
ing Equation (3.33), a modiﬁed circuit scheme for the new penalty function is given.
The circuit can be implemented in analog VLSI. The simulation for the modiﬁed Ken-
nedy and Chua model has shown satisfactory results for the experimental examples as
evidenced by a signiﬁcant decrease in the resultant errors. This method can be used in

other ANN models and orher optimization algorithms as well.

This Chapter has presented an improved ANN model for the optimization with
constraints. The other category, the optimization without constraints will be further

investigated in Chapters 4 and 5.

51

 

VHOKXb )

 

 

Figure 3.1. Equipotential surfaces in space and their

relationship to the feasible region boundary.

 

 

PQU)
ll
Ix Ii]? .. 1.0
II I ‘—' 0.5
Ix l2
J O I: x
-0.5 0.5

 

Figure 3.2. Different penalty functions.

52

 

1 1

#15

—1.0

—0.5

 

 

J
-1.5 -1.0 -0.5 0

Figure 3.3. A sectioned penalty function.

P'i(x)

l

(l

,2,0

-1.5

—1.0

 

 

 

-1.5

Figure 3.4. A combination penalty function.

 

 

 

 

 

 

 

 

 

 

 

? 3 Z?“

 

 

 

3

 

o
‘ g
e

 

 

 

 

 

Figure 3.5.

 

 

 

 

 

 

AT

 

Quadratic programming circuit

 

54

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

F1 1 e O 0 I ll
1 O

55

 

   
 
  

 

x
r?
(7,8)
(-1.7)
\\
/Optimal point:

r .................. 5. ............... (5.00310,S.00246)

E (10.3)

I 1,1 \\
-3 0 ( ) 5 37 a,”

L’ ’ ’ r5

 

Figure 3.7. Trajectories for example 3.1 (a).

56

X2

(1,1)

  
   
 

(0.1,0.7)
0.585 1

Optimal point:

(-0.2,0.3) (025196033278)
/

 

 

>1 1
\ (0.6,-0.1)

Figure 3.8. Trajectories for example 3.2 ( for X3 = 0.415 ).

12
’l (1,1)

0.8

  
     

\ o o
‘\ Optimal pomt.

(0.2.0.5) \ (0.34197,32842)

(0.8,0. l)

 

03,-0.3)

Figure. 3.9. Trajectories for example 3.3.

Table 3.1. Comparison results for linear and nonlinear programming.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Example Theoretical Chua & Lin Kennedy & Chua Proposed Network

x1=5.000 xl=5.075 x1=5.00310

Ela x2=5.000 x2=4.895 x2=5.00246
error=2.1% error=0.06%

x1=7.000 x1=7.088 x1=7.00524

Elb “241000 =0.017 x2=0.00032
error=1.26% error=0.08%
xl=-5.000 x1=-4.976 x1=-5.00064
51° x2=—5.000 x2=4.978 x2=-5.00150
error=0.48% error=0.03%
x1=-5.000 x1=-4.966 x1=-4.99920

51“ =5.000 x2=4.906 x2=5.00091
error=l.88% error=0.02%

xl=0.2520 xl=0.258 xl=0.257 xl=0.25196

E2 x2=0.3328 x2=0.329 x2=0.332 x2=0.33278
x3=0.4150 x3=0.407 x3=0.412 x3=0.41495
error=2.38% error=1.98% error=0.03%

x1=0.3395 x1=0.336 x1=0.3406 x1=0.34197

E3 x2=0.3302 x2=0.320 x2=0.3385 x2=0.32842
error=3.09% error=2.51 error=0.73%

 

 

58

Chapter 4

Solving Linear System Problems Using

An Artiﬁcial Neural Network

 

In this Chapter, a new method for solving linear system problems using an
artiﬁcial neural network is presented. Foremost, a mapping between the solution of
linear equations and quadratic minimization is established so that solving linear equa-
tions can be viewed as ﬁnding the minimum of a quadratic function. The conditions for
stability and convergence of the neural network-based dynamic system are proven,
providing a mathematical basis for the approach. A feedback ANN model with sym-
metric or asymmetric connections for optimization is proposed to form an ANN linear
equation solver which can be implemented with VLSI technology. Moreover, it is
shown that the time complexity of this technique is problem size independent. In addi-
tion, other applications of the approach are discussed. These include matrix inversion,

determining the stability of a linear control system, and determining matrix singularity.

52

 

P30)

l
lrzo

 

 

 

 

 

J 1 1 1_>
-1.5 -1.0 -0.5 0 0.5
Figure 3.3. A sectioned penalty function.
P '1' (X )
l
r20
\\\\ {-1.5
.......... _10
-0.5
I 1 >
-1.5 0 0.5

Figure 3.4. A combination penalty function.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TE

 

 

2%

 

 

 

 

 

 

 

 

 

 

 

AT

 

Figure 3.5. Quadratic programming circuit.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

N
Vl LII——
u y :y
CF}! >M—yz r133
:1:— :
R
__/l/V1_

 

 

 

 

 

 

 

Figure 3.6. A modiﬁed scheme for the constraint ampliﬁer unit.

we N4 ”ML—o

Y6

55

 

   
  
  

 

(7.8)
('127)
\\
/Optimal point:

,. ................. 5. ............... (500310500246)

(10,3)
5.. 0 (1.1) 5 ()7 >11

L ’ t5

 

Figure 3.7. Trajectories for example 3.1 (a).

56

     

X 2
l (1,1)
(0.1 ,0.7)
0.585 .
Optimal point:
(-0.2,0.3) (025196033278)
/

  

 

 

\‘0.585 >x1
\(0.6,-0.1)
Figure 3.8. Trajectories for example 3.2 ( for x3 = 0.415 ).
X 2
’l (1,1)
0.8

(-0.2,0.5)

“N

  
     

\ Optimal point:
‘, (0.34197,32842)

(0.8,0. 1)

 

03,-0.3)

Figure. 3.9. Trajectories for example 3.3.

57

Table 3.1. Comparison results for linear and nonlinear programming.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TiExample theoretical Chua & Lin Kennedy & Chua Proposed Network
x1=5.000 xl=5.075 x1=5.00310
E13 x2=5.000 x2=4.895 x2=5.00246
error=2.1% error=0.06%
x1=7.000 x1=7.088 x1=7.00524
Elb x2=0.000 x2=-0.017 x2=0.00032
error=1.26% error=0.08%
xl=-5.000 x1=—4.976 x1=-5.00064
E1“ x2=-5.000 x2=4.978 x2=—5.00150
error=0.48% error=0.03%
xl=-5.000 x1=-4.966 xl=-4.99920
Eld x2=5.000 x2=4.906 x2=5.00091
error=l.88% error=0.02%
xl=0.2520 xl=0.258 xl=0.257 xl=0.25196
E2 x2=0.3328 x2=0.329 x2=0.332 x2=0.33278
x3=0.4150 x3=0.407 x3=0.412 x3=0.41495
error=2.38% error=l.98% error=0.03%
x1=0.3395 x1=0.336 x1=0.3406 x1=0.34197
E3 x2=0.3302 x2=0.320 x2=0.3385 x2=0.32842
; error=3.09% error=2.51 error=0.73%

 

 

 

 

58

C hapter 4

S Olving Linear System Problems Using

An Artiﬁcial Neural Network

 

In this Chapter, a new method for solving linear system problems using an
artiﬁcial neural network is presented. Foremost, a mapping between the solution of
linear equations and quadratic minimization is established so that solving linear equa-
tions can be viewed as ﬁnding the minimum of a quadratic function. The conditions for
stability and convergence of the neural network-based dynamic system are proven,
Provz’ding a mathematical basis for the approach. A feedback ANN model with sym-
metric or asymmetric connections for optimization is proposed to form an ANN linear
equation solver which can be implemented with VLSI technology. Moreover, it is
shown that the time complexity of this technique is problem size independent. In addi-
tion. other applications of the approach are discussed. These include matrix inversion,

derermining the stability of a linear control system, and determining matrix singularity.

59

4.1 Introduction

The linear system is an important mathematical model as it forms a kernel for
solving complex problems such as ﬁnite element analysis, and the numerical solution
of differential equations. In addition, under some conditions, a nonlinear system can
be replaced by a linear system for approximation analysis. Therefore, linear system
computing techniques play a key role in a wide variety of disciplines. The basic func-
tions in linear system manipulation are ﬁnding a solution to simultaneous linear equa-
tions [Cra56, Sti63, FoM067], inverting a matrix, ﬁnding the determinant of a matrix
[WiRe71, Par80, JoRi82], ﬁnding eigenvalues and eigenvectors of a matrix [601.083,
Joh87, LiZS91]. The recent availability of advanced computers has had a signiﬁcant
impact on all spheres of linear system computation. In the past three decodes, much
effort has been devoted to the development of new linear equation solvers for large-
scale systems [Sto73, HwCh80, l-leh82, Mol83, CoR087, BiVo91, Wri9l]. The
major concern of these effects is to speed up the solution procedure in order to ﬁnd a
feasible result as soon as possible. These techniques typically take advantage of one

or more of the following methodologies:
o decomposing the system into small subsystems;
o distributing computation tasks to a parallel architecture;
0 increasing data independency;
o reducing communication cost.

All of these methods, however, are fundamentally based on the traditional technique of
elimination and separation processes. Their time complexity is size-dependent: 0(n3)
for the software approach and 0(n) for the hardware approach [I-IwCh80, Pret86]. In
this Chapter, an artiﬁcial neural network approach and architecture for linear system

manipulation is developed whose time complexity is size-independent.

60

The Chapter is organized as follows. First, general linear system computing tech-
niques are reviewed. Then, a mapping between the linear equation solution and qua-
dratic minimization is established so that solving linear equations becomes equivalent
to ﬁnding the minimum of the quadratic function. The conditions for stability and con-
vergence of such a quadratic gradient system are established. Based on these
mathematical results, a conﬁguration of an artiﬁcial neural network linear equation
solver is given. Applications of this approach, such as matrix inversion, determining

the stability of a linear control system and determining the singularity of a matrix, are

described.

4-2 Methodology Review

Typical linear system computing techniques such as Gaussian elimination, sys-

tolic arrays, and MIMD algorithms are summarized as follows.

4-2.1 The Traditional Methods
A set of simultaneous linear equations can be written in matrix form as
A x = b (4.1)

Where A is n x n matrix, D and x are known and unknown column vectors, respec-
trvely. When A is nonsingular, there exists an unique solution to the equations. Only

thlS case is considered because most applications deal with this situation.

4-2-1.l Gaussian Elimination

Gaussian elimination [FoMo67] is the most important method for solving linear
System problems and serves as the basis for a variety of other approaches. The tech-

n‘ . . . . .
que can be consrdered as a process of eltmrnaung variables by the elementary

61

 

transformation
, 01101“
aij = aij ‘ ——1 (4.2a)
011
and
, 011191
bi = bl "’ . (4-2b)
011

This process is repeated until all entries below the main diagonal are zero so that the

coefﬁcient matrix is reduced to an upper triangular matrix in the form:

- I l- l- - F ’1
an’arz .aln x1 171
I I I
0 022 -02n 12 ’72
° ' ' ' ' = I . (4.3)
0 0 0a,,,,’ x,, b,,’

-l l- d b d

 

 

 

 

 

 

'I‘hen, back-substitution, the solution vector can be obtained.

4-2.1.2 Gauss-Jordan Elimination
This is a numerical method based on the elementary row and column transforma-
tions of column-augmented matrices
[A ]-[x Y] = [b I] (4.4)
Where A is a known matrix, Y is the unknown inverse matrix of A , I is the identity
matrix, b and x are known and unknown column vectors, respectively [WiRe71].

When A is reduced to the identity matrix by a sequence of the elementary row or

Colutnn transformations, the right-hand of (4.4) becomes the inverse of A .

4'2-l.3 LU Decomposition

Suppose A is an n x n nonsingular matrix, then A can be uniquely reduced into a

Upper triangular matrix U by a sequence of equivalent transformations.

62

U = AW” = Mn_1M,,_2...M2M1A. (4.5)
Set
1 0 0 0
”’21 1 0 0
_ _ _ ”131 ”132 0 0
L=M11M21 "’Mn_11= . .’ (46)
. . 1 0
_mnl mn2 ' ° ' mn(n—1) IJ

 

 

then, the LU decomposition of matrix A can be expressed as

A = MI'IMZ‘I - - - M,_,-1U =LU. (4.7)

In fact, the LU decomposition can be obtained during a process of Gaussian elim-
ination [Ort87b], that is, the ﬁnal upper triangular matrix is U, and the multipliers at

each step are the entries of the lower triangular matrix L :

011

 

mi] = — fori = 2, ..., n; (4.8a)
011
a- ’

"1,, = ‘1, fori = 3, n; (4.8b)
022
a. ’

m,3 = —“—, fori = 4, n; (4.8c)
033

and so on. It should be noted that aij’ is updated at each step of elimination.
The advantages of the LU decomposition of a matrix are:

(1) The solution of a triangular set of equations is quick. The triangularized sys-

tem can be solved by forward and backward substitution.

(2) The determinant of a matrix is just the product of the diagonal entries of

matrix U veriﬁed by

det (A) = det (LU) = det (L )-det(U) = l-det (U) = ﬁuii. (4.9)

i=1

63

(3) The inverse of A can be found by

A'1 = (NH. (4.10)

(4) The solution accuracy of linear equations can be improved by using LU fac-

torization iteratively.

4.2.2 Systolic Arrays

A systolic array is a computing network which consists of a set of interconnected
processor elements (PE) each capable of performing some arithmetic operations
[Dec89]. The array maximizes the computational concurrence by multiprocessing and
pipeline processing techniques very effectively reducing computation time. The major
design issues here include mapping a computational algorithm into a systolic array,
and specifying the array in terms of communication topology and the arithmetic opera-
tion of individual PEs. Driven by many practical applications, the LU decomposition
method has been implemented in a systolic array architecture [HwCh82, HwBr84].
Figure 4.1. shows a systolic array for the LU decomposition without pivoting where A
is 4 x 4. In general, (n — 1)2 M cells, and (n — 1) D cells are required for a n x n
matrix. The array has (4n —2) l/O ports, and its time complexity is (3n +1)

[HwCh80].
Another interestingexample is a systolic array system for linear state equations
x(t) =A’x(t) + Cu(t) (4-11)

where x(t) is the state vector of size n, u(t) is the input vector of size m, and A’, and
C are n x n and n x m matrices, respectively [JoJe88]. Using the backward Euler

method in discrete form [JoRi82], the problem can be decomposed into two simple

stages:

(1) compute B(t,,)=B’+ Cu(tn), where 8’: x- , h is the time increment

_1_
h

64

(h = In " tit—1);
, 1 .
(2) solve the linear equations Ax(t,,) = B(tn), where A = (A + I-Z), I rs the

identity matrix.

These two stages are mapped to two systolic arrays, one for the matrix-vector
multiplication-accumulation, the other one for the Gauss-Jordan elimination. Figure 4.2
illustrates their topologies and PE operations. The latency of this technique is
(n +m—1) and (4n —3) for srage 1, and stage 2, respectively. When n > m, this struc-
ture can overlap the two computations so that the system latency is (4n —3) which has

been shown to be the optimal for this problem [JoJe88].

4.2.3 The MIMD Method

The Multi-Instruction and Multi-Data (MIMD) methods can be divided into two
categories based on their data distribution scheme: the row/column distribution and the
square mesh distribution [BiVo91]. In general, these methods can only cope with spe-

cial cases, such as triangular systems, and banded systems, efﬁciently.

4.2.3.1 Triangular Systems

The grid-oriented structure is used for solving triangular system, because the
structure has good load balance property and a low communication overhead
[BiVo91]. This method is effective for solving the lower triangular system. Suppose
there are p processors which are organized in a Q x Q mesh (p = Q2) each with
local memory. Each process is executed on one processor, denoted by (s, t),
0 S s, t <Q. The communication occurs only among the direct neighbors, i. e., two

processors (s , t) and (s ’, t’) are able to communicate if and only if

|s—s’|+|t—t’|=1. (4.12)

65

A theoretical analysis [BiVo91] indicates that the total time required by the paral-
lel algorithm is

2

TS

p + (40 + 5)n, (4.13)

L
p
where at is a ratio between communication to computation. Considering the time for

the best sequential algorithm

 

 

Ts =n2—n, (4.14)

the Speedup is
n - 1
. (4.15)

E- + (40: + 5)

p
and an efﬁciency of 50 percent or more can be achieved if p S n .

4a + 5
4.2.3.2 Banded Linear Systems
A banded linear system is described as

where A is an n x n nonsingular matrix and Ail =0 if |i —j |> k, and k is an
integer (usually 1 < k < n). Obviously, the bandwidth of the matrix is 2k + 1. Such
systems arise in a great deal of applications, such as the numerical solution of

differential equations and the optimal control.

This method can be thought of as the Gaussian elimination with a certain res-
tricted pivoting strategy [Wri9l]. First, the matrix is broken up along the diagonal into
a number (say p) blocks (called sub-blocks), and each sub-block is separated by a
small dense k x k block (called separator). Then, A reduced system is formed from
the rows. of matrix A corresponding to the separators. After the reduced system is

solved, and the remaining variables are recovered using data stored during the initial

66

factorization stage. The experiments for problems with n = 2000 to 5000, k = 2 to 5
using 2, 4, 8, 16 processors showed that the maximum speedup over a single processor
was 4.2 to 5.7 (p = 16). When 2 S p S 4, the speedup is very low due to the boun-

dary condition operation cost.

The time complexity of the above methods is size-dependent as indicated in Table

4.1.

4.3 A Neural Network Solution to Linear Equations

An neural network linear equation solver based on the optimization network discussed
in Chapter 3 ﬁrst requires the establishment of a mapping between the solution of

linear equations and quadratic minimization.
Consider the linear system of equations
A x = b (4.17)
where A is an n x n real coefﬁcient matrix, D is a real column vector, and x is a vector
of unknowns. A must be nonsingular for a unique solution to exist.

Let ¢(x) be a quadratic function of the form
¢(x) = %xTA x — 0711 (4.18)

where A is a real symmetric n x n matrix, x and b are real column vectors.

A minimum point of ¢(x) must satisfy the necessary condition

_341.
8x1
a_¢
8X2

at

8x,,

 

 

 

d

67

Correspondingly, the partial derivative of the right hand side of equation (4.31) should

also be zero. It can be written in simple matrix form as
A x — b = 0 (4.20)

due to the symmetry of A. Equations (4.19) and (4.20) indicate the important fact that
the minimum point of a quadratic function must satisfy the corresponding linear equa-
tions. In other words, by deﬁning an artiﬁcial neural network for the quadratic minim-

ization problem as

i=—Vm0 «20

= - V(—;-XTA x - bTx)
= - (Ax — b),

the solution to associated the linear equations is found when the network settles on its
minimum point. This network is called an ANN linear equation solver. An important
issue for the ANN linear equation solver is ensuring its stability as will be discussed in

next section.
4.4 The Relationship to Linear Systems

The following theorems establish conditions for guaranteeing a unique bounded
minimum point of a quadratic function for the ANN linear equation solver. Moreover,
a general formulation of the ANN linear equation solver is proposed which can cope

with both symmetric and asymmetric coefﬁcient matrices.

Theorem 4.1: If A is a positive deﬁnite matrix, then the unique minimum point

of the quadratic function is the solution to the linear equations.

68

A formal proof can be found in [Ort87b]. A geometric explanation is given here

for clarity. For the two dimensional case, suppose
_ 011 0 _ b1
A — [0 022] and b— [b2]

Substituting the above A and b into equation (4.18) and manipulating algebraically,

where a11> 0, 022 > 0.

¢(x1, x2) can be rewritten as

 

 

1 0110 11 xi
¢(X1,XZ) = ‘2"[X1.XZ] 0 022 x2 ‘— [ble] X2 (4.22)
_ 1 2 1 2
- 3011M + 30212 " brxr ‘ b2x2
2
1 2 bl l 2 1 2 b2 b2 2 1 b1 1 b2
-—a [(x -2——x +(—)]+ —a [(x —2-—x +(——)]-——-———
2 11 I 011 1 011 2 22 2 022 2 022 2 011 2 022
b1 2 2 2
(x1- —) (xz- —)
_ _ 1 _1: 22,
2 2 2 2 2 an 022
( —) ( —)
011 022
The image of ¢(x1,x2), as shown in Figure 4.3, is an ellipse-parabolic surface whose
. . bl b2 . . 1
apex rs the lowest pornt at (a_' T)’ Wthh rs exactly equal to A' b. When
11 22

an: an at 0 and 011022 > a12a21, the statement is still true but the main-axis of
ellipse is rotated. For higher dimension cases, a similar image of ¢(x) will be a super
ellipse-parabolic surface whose apex coordinates are still the solution to the linear

equations.

The positive deﬁniteness of A , however, is a special case. In order to develop a
general technique, Theorem 4.1 must be extended as follows for ﬁnding the solution to

arbitrary linear equations.

69

Theorem 4.2: If A is an arbitrary nonsingular matrix, then there exists a transfor-
mation L such that L(A) is positive deﬁnite, and the new equations are equivalent to

the original ones.

Proof: Deﬁne

L(A ) = ATA, (4.23)

then
(ATA )T = ATA,
in other words, ATA is symmetric. At the same time,
xT(ATA )x = (xTAT)A x
= (A 10% x)
=||Ax||2>0, forx¢0

where H - ”represents the Euclidean norm. The inequality holds because of the non-

singularity of A. Therefore, ATA is positive deﬁnite.

Multiplying both sides of equation (4.17) by AT gives:

A TA x = A To (4.24a)
A TA x = b’ (4.24b)

where
b’ = A To. (4.24c)

Next, it must be shown that the solution to equation (4.24b) is satisﬁed with equation

(4.17) as well. The solution to (4.24b) can be expressed as
x = (ATA )"b'.
Using the relation between b’ and b in (4.240), the solution is given by

x=(ATA)-1ATb

70

=A-1(AT)-1ATb
=A-1Ib

=A“b. [:1

Furthermore, when A satisﬁes a certain condition, the solution can be directly

obtained without the transformation. The following theorem explains this situation.

Theorem 4.3: If A is an arbitrary (symmetric or otherwise) nonsingular matrix
with eigenvalues whose real part is positive, then there exists a unique stationary point
of a quadratic minimization problem which is satisﬁed by the solution to the

corresponding linear equations.
Proof: In artiﬁcial neural network theory, a gradient system is often used for
ﬁnding an equilibrium point

dr“
7 = —§§— for all i. (4.25)

Associating equations (4.19) and (4.20), the dynamic behavior of a quadratic system

can be described by a set of ﬁrst-order ordinary linear differential equations in matrix

form as
x = —A x + bu (4.26a)
where
{0 for 1 < 0
“ = 1 for 1 2 0. (426”)

When A is asymmetric, the energy function, ¢(x), takes on a modiﬁed form of equa-
tion (4.18) (see Section 4.5). In this case, however, the mathematical model of the

neural network can be directly deﬁned by equation (4.26a).

If the system is relaxed, then using the state transition matrix ul(t, I) [Che84], the

solution for t 2 0 can be found as

71

!

x(t) = w(t, 0)x0 + W“, 1')de (4.27)
0

where
ur(t, r) = e‘“’ ‘ T). (4.28)

The ﬁrst term of equation (4.27) comes from the initial condition x(0), and the second
term comes from a constant input bu (2 b). A stable solution exists if the real part of
the eigenvalues of matrix (-—A) are negative [BaSt70]. In this case, the ﬁrst term of
(4.27) will approach zero as t —-> oo. An equilibrium solution to the system will be
determined only by the second term of (4.27). Taking the limit of (4.27) as t—->oo
yields

1

lim x(t) = x(0)lime"“ + Iimje-M' ”hm (4.29)

l-—)00 l—)°° l-)°°0

t

= 0 + A-lbrime-Mt ‘W

l—)00

= A“b(eO — e-°°)

=A"b.

In other words, the equilibrium point of the differential equations must satisfy the

corresponding linear equations.

Moreover, it is easy to show that the signs of the eigenvalues of A are opposite

to those of —A , because
A x = Xx => —A x = 41.x => (—A )x = (—7.)x (4.30)

where 7» is an eigenvalue of A.

72

Therefore, A with eigenvalues whose real part is positive will guarantee that a

unique solution to the linear equation system can be found. C]
4.5 Architecture

A theoretical basis for an ANN linear equation solver has been presented in the
previous sections. In what follows, an architecture for the ANN linear equation solver
is proposed.

Artiﬁcial neural network computing is a dynamical process which is mapped into
a network architecture with an associated dynamical feature. The computing is per-
formed by the network’s inherent convergent behavior. There are three types of ANN
architectures: feedback, feedforward, and cellular models. The feedback neural net-
works are organized in a simple and regular form and they can easily express the
equality relation of a dynamical system. Therefore, they are the logical choice for
building the linear equation solver. As mentioned in Section 4.3, the function of the
linear equation solver is equivalent to that of an unconstrained optimization network.
The ANN solver can be constructed by a feedback minimization network. The present
models, however, work well only with the symmetric feedback connections [TaI-Io86,
KeCh88]. The stability property of a system, determined by matrix A , is the key issue
for design of the ANN architecture. As discussed in Section 4.4, matrix A in the case
of Theorems 4.1 and 4.2 (after transformation) is a positive deﬁnite matrix whose
eigenvalues are positive real numbers [Che84], whereas eigenvalues of matrix A in
Theorem 4.3 can be written as it,- = or,- + jBi, where (1,- > 0, j = 4:7. Theorems 4.1
and 4.2 can be regarded as a special case of Theorem 4.3 where [3,- 50. As in
Theorem 4.3, the architecture proposed here is not restricted to matrices with sym-

metric connections. This signiﬁcantly broadens its range of applications.

73

Figure 4.4 shows a conﬁguration of the linear equation solver composed of
integration units, a resistive network, and feedback links. The resistive network is
organized according to equation (4.263). Thus, the following relation is held for each

integrator (j-cell):

dxl- n

—dt ='(ZarjX'-br'1) (4.31)
1:1

where x,- is the voltage at node i, and the 1 in the term bi-l is a DC source voltage

(IV). The terms aijxj and b,-1 are thus interpreted as current terms. The negative

summation is implemented by connecting all current inputs to the negative (minus) ter-

minal of the integrator. Let the normalization values of current be 0.1 ma, then the

value of the resistors can be calculated as

10

 

J ' laij l
01'

where R)!- and Rb,- are resistors representing values in matrix A and vector b, respec-
tively. The lower ends of RI} and Rb,- are connected to the input of integrator i,
whereas the connections of their upper ends depend on the values of aij and b,- as

indicated in Table 4.2.

In order to verify the stability and convergence of this network (without the sym-
metry restriction), a shift transformation is carried out for the dynamical system as

described by equation (4.26a).
)2 = —(A x - b) ' (4.33a)
=—mx—A{)

=—A(x—x‘)

74

where
y = x - xi (4.33b)

and

Because of the constant (unknown) vector x‘ , the time derivative of y must satisfy
51 : )2 = —A y. (4.34)

When the condition of Theorem 4.3 is satisﬁed for A, there must exist a unique solu-

tion P for the Lyapunov equation
PA + ATP = Q (435)

where Q is an arbitrary positive deﬁnite matrix, and P is a positive deﬁnite matrix

(the solution) [BaSt70].

Deﬁne the energy function E (y) as
E (y) = We y. (4.36)
It is apparent that E (y) > 0 for y 4: 0 because of the positive deﬁniteness of P.
The derivative of the energy function for y at 0 can be found as
E(y) = yTPy + 9pr (4.37)
= -yTPA y - yTATPy
= —yT(PA + ATP)y
= -yTQ y < 0.

The above equation indicates that the energy function is always decreasing with time.
Therefore, the energy function is a global Lyapunov function which guarantees that the

network is completely stable and there exists a unique equilibrium point at

75

ye = 0. (4.38)
Using equation (4.33b), the corresponding equilibrium point for x is
xe = ye + x* (4.39)
= 0 + x‘

= A-lb.

This means that when the network arrives at the equilibrium state, the output x is the

exact solution of the linear equations.

4.6 Other Applications

The technique for solving linear equations is of fundamental importance to many
computing tasks. The method described in the previous sections can be further

extended to other applications.

4.6.] Matrix Inversion

If matrix A satisﬁes the conditions of Theorems 4.1 or 4.3, then the inverse of A

can be found by the following procedure.

Let
A"1=()’1Y2”'yrz)v
then
'10 0 .. 0'
010..0
AA-‘=A(y,y2--y,,)=l= 0 0 1 " 0 (4.40)
.00011.

 

 

where y,- is the ith column vector of A" and I is the identity matrix. The above

76

equations can be rewritten in separate form as

Ayl = C' (4.41)

l

for i = 1, 2, n, where e,- is a base vector (1 at the ith entry and 0 at all others).
Therefore, ﬁnding the inverse of A is equivalent to solving equation (4.41) n times

with a base vector as the right hand side.

As an extension of the approach, if matrix A is negative deﬁnite or Re (M) < 0
for all i, where it,- is an eigenvalue of A, then —A must satisfy the conditions of

Theorems 4.1 or 4.3. As a consequence, the inverse of A can be directly found by
A ‘1 = —(—A )-1 (4.42)

where (—A )‘l is obtained from the ANN linear equation solver.

4.6.2 Determining the Stability of 3 Linear Control System

Criteria such as Routh-Hurwitz and Nyquist are most often used to determine the
stability of a linear control system [Che84]. For large-scale systems, the process
involved in using either of the above criteria becomes very time-consuming due to the
computational complexity. A completely parallel approach based on Theorem 4.3 is
proposed here. From control theory, a linear system is stable in the sense of Lyapunov
if and only if all eigenvalues of A have no positive real parts. (For zero eigenvalues,
the order of the Jordan blocks should be one [Che84].) Without loss of generality, let
the input be zero. Then the control system can be written as

x = —A x
{ (4.43)

y=cx+e

where x is a vector of state variables, y is a vector of outputs, c is a constant matrix,
and e is a constant vector. Thus, the problem can be converted to solving the related
linear equations. Using equation (4.27) and starting at any x(0) ¢ 0, the ﬁnal state,

x(oo), can be expressed as

77

1

lim x(t) = x(0)lime"" + rimje-A“ ”>021: (4.44)
0

t—wo t—ioo l—-)°°

x(0)e"‘(l " 1) + 0

0 or ﬁnite value (stable)
z oo (unstable).

Correspondingly, the linear equations

can be solved with a starting point x(0) at 0. When a bounded solution is found the
system is stable and matrix A is called a stable matrix; otherwise, the system is

unstable.

4.6.3 Determining the Singularity of a Matrix

Since singular matrices have different properties than nonsingular matrices, it is
desirable to determine whether or not a matrix is singular. For example, when A
represents a linear transformation, a nonsingular A always indicates l-to-l mapping
whereas a singular A indicates m-to-l mapping. The traditional methods for determin-
ing the singularity of a matrix include the LU decomposition and the similarity

transformation. The former calculates the determinant of matrix A by
det (A) = det (LU ) (4.46)
= det(L)-det(U)
= 1-det(U )
= det(U )
where L is a lower triangular matrix with a unit diagonal, and U is an upper triangular

matrix. The time complexity for this method is 0(%n3) for the software approach,

78

and 0(3n — 5) for the hardware approach [HwBr84, Pret86].

The purpose of the similarity transformation is to ﬁnd the eigenvalues of A. If
there exist any zero eigenvalues, then the matrix is singular. The computational cost

for this is as high as 0(n5) [Pret86].
The singularity of a matrix can be distinguished by the following two operations:

(i) Determine whether matrix A (or —A) is stable. If yes, go to step (ii); other-
wise, st0p. (In this case the method fails because eigenvalues of A lie in both
right and left halves of the complex plane so that the ﬁnal state is. not

bounded)

(ii) Arbitrarily choose some b at 0. Solve the set of equations A x = b (or
—A x = b) for two different starting points. If a unique solution can be found,
then matrix A is nonsingular. If the solution is dependent on the initial point,

or no solution can be found (i.e., no equilibrium point so that the state vari-

'(k+1)

, — xi“) |> e for all i, where e is a convergence

ables keep changing: Ix
limit, (k) is an iteration index), then the matrix A is singular. For singular
matrix A, the solution situation depends on vector b; if augmented matrix
[A b] has the same rank as matrix A , then there exist multiple solutions. Oth-
erwise (rank [A b] > rank [A]) there is no solution. Thus, this technique not
only gives a judgement as to the singularity, but also indicates the rank rela-

tionship between [A b] and [A], that is, for the multi-solution case,

rank (A) = rank (A b), and for the no-solution case, rank [A b] > rank [A ].

For clarity, the above procedures for the different applications and their relation-
ship are summarized in Figure 4.5. In fact, this approach is valid for the single-side
eigenvalues "distribution". If all eigenvalues of A lie in the right half of the complex
plane, then A is used; if all eigenvalues of A lie in the left half of the complex plane,

then —A is used.

79

4.7 Simulation Results

Experiments include both software and hardware simulations. The former
attempts to verify the correctness of the approach, and the latter provides evidence of

the size-independent property of the ANN solver.

4.7.1 Veriﬁcation

A simulation is designed in the C language and based on the architecture of Fig-
ure 4.4. Integrators are simulated by a fourth-order Runge-Kutta numerical integration

(explicit formulation), and the resistive networks are simulated by ID and 2-D arrays.

4.7.1.1 Linear Equation Solution

Experimental examples are arbitrarily chosen for the three cases in Section 4.4
from n = 2 to n~ = 20. The ANN linear equation solver does indeed ﬁnd a unique
solution for every example no matter what initial point is chosen. Table 4.3 illustrates
three examples associated with three situations in Section 4.4 for matrix A. The con-

vergence is determined by

lin‘H) — xi“) |< E for all i (4.47)

where xi“) is the value of x,- at the kth iteration. e = 0.1E-7 is used for all the exam-
ples. The time increment dt used for the Runge-Kutta integration is 0.05 for example 1

and 2, and 0.001 for example 3. The simulation time is deﬁned as
T = dt-N (4.48)

where N is the number of iteration steps required for convergence. The experiments
show that T has an obvious relationship to the initial point; the further the initial point
is from the equilibrium point, the more iterations are required. For examples 1 to 3, T

is 18.05, 7.0 and 14.67, respectively.

80

In order to measure the accuracy, an error e is deﬁned as

M:

e:

l

n
(201'ij "' bj )2 (4.49)
lj=l

which represents the sum of the square of the difference between the left hand side
and the right hand side for each equation. In the ideal case, e is zero. The error
curves for the three examples are presented in Figure 4.6. Given enough iterations, the
error approaches zero for all praCtical purposes. Four trajectories for example 1 are

shown in Figure 4.7, indicating that the solution is initial-point independent.

4.7.1.2 Inverse Matrix Computation

Table 4.4 lists the matrix inversions (examples 4 and 5) where matrix E is a

measure of the error determined by
E = A'lA — I (4.50)

where I is the identity matrix. Ideally, E should be a zero matrix. The simulation

shows that all entries of E are very close to zero (0.000000001 to 0.000001).

4.7.1.3 Stability

Two examples of the stability judgement are presented in Table 4.5. Example 6
always results in a' bounded solution using the technique described in Section 4.6.
This indicates that the system is stable. Example 7 results in inﬁnite solutions, which
indicates the system is unstable. The corresponding eigenvalues are calculated and

listed in Table 4.5 for checking consistency.

4.7.1.4 Singularity

Table 4.6 gives the results of two example singularity calculations (No. 8 and 9).

Figure 4.8 shows the trajectories of a multi-solution for the following equation with a

81

singular matrix

11; :31 [21418].

It is obvious that all the solutions are on a co-linear pattern indicated by a dashed line

00
02

in Figure 4.8. Moreover, it is easy to shown that eigenvectors associated with eigen-

values X = 0 and I» = 3 of this matrix are

x2 = —2xl
and

x2 = 4x1,

respectively, indicated by dotted lines in Figure 4.8. Some interesting observations are
gleaned from this experiment. The line for all the solutions is parallel to eigenvector
x2=—2x1. Moreover, all the trajectories are straight lines and they are parallel to

eigenvector x2 = 4x 1.

4.7.2 Hardware Simulation

.The mathematical model has veriﬁed the correctness of the proposed approach.
The objective of the hardware simulation is to check the time-property of the ANN
linear equation solver architecture proposed in Section 4.5. SPICE, a computer-aided
design tool for circuit analysis is used to verify the ANN linear equation solver at the
component level. A simulated solver is built with 1, 2, 3, 5, and 10 neurons, respec-
tively. To provide comparison, all data are normalized so that starting points are
located at the origin and solutions are within (-1, 1). The results indicate that the reso-
lution time T5 is in the same range for all the examples. T, depends only on the time
constant of the integrator, I (= C 0R0). For example, when 't = 211s, T, = 2.25ms to
2.45ms; when 1: = 0.211s, T3 = 0.223ms to 0.247ms. The slight deviation in Ts is that

the distance from the staring point to an equilibrium point is not identical. Figures 4.9,

82

4.10, and 4.11 show the curves of transient analysis for the simulations with t = 211s.

The preliminary conclusion here is that the solution time is size-independent.

4.7.3 Hardware Complexity

In the development of computing technology, the computing time and the device
space are always associated with each Other. An estimate of the hardware complexity

of the ANN linear equation solver is derived as follows.

Let the size of a system be n, then the total numbers of resistors and integrators

2 + n, and n, respectively. Assume the

in the architecture shown in Figure 4.4 are n
average resistance of resistors is 50K Q, and the routing area is 40% of the circuit area

[AlHo87], then the chip area of the solver for a CMOS implementation is
s =1.4[(n2 + n)a + nb] (4.51)

where a is the unit area for a resistor (a = 6.25x10’3mm2), and b is the unit area for
an integrator (b = 1.5x10‘2mm 2) [AlH087, Rei87]. For clarity, a relationship between

the chip area and the size n is illustrated in Figure 4.12.

Connectivity is another important issue in hardware implementation. The topology
of an architecture, the connection order of a component, and the number of crossovers
in a circuit are three key factors of the connectivity. As shown in Figure 4.4, the con-
nection pattern of the solver takes a matrix form which can be implemented in a "bus"
structure so that connections among components are signiﬁcantly simpliﬁed. It is easy

to show that the number of crossovers of this network is

N

CI'OSS

= n2 — 1. (4.52)

This is another advantage of a feedback network paradigm as it can be shown that
New” for a three-layer feedforward network is on the order 0(n 4). Moreover, routing
network is straightforward. As shown in Figure 4.4, there are three bus lines: horizon-

tal (H), vertical (V), and feedback (F). H lines are on layer 1, V lines and F lines are

I _jiﬂizz‘

83

on layer 2. On each crossover, an isolated "stamp" is made between H and V. For cir-
cuit components, each resistor is on layer 1 with connections to H and V, and each

integrator is on layer 1 with connections to F and H.

The network requires n l/O pins for operands and additional pins for voltage

sources, control signals, and ground.

4.8 Summary

A new approach to solving linear systems using an artiﬁcial neural network has
been proposed. A mapping between the linear equation solution and a quadratic
minimization problem without constraints has been established. Three theorems in Sec-
tion 4.4 guarantee the stability and convergence of the quadratic system so that a
unique accurate solution can be obtained. Correspondingly, an ANN architecture for
the solver is given which can be implemented in VLSI technology. Because of the
inherent features of dynamic convergence and parallel processing in ANNs, the time
complexity of the approach is size-independent. The approach can also be used for
ﬁnding the inverse of a matrix, determining the stability of a linear control system, and
determining the singularity of a matrix. Simulation experiments have veriﬁed the

theoretical correctness of the technique.

 

84

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Z
Z

b e
d=a*b+c
. 0 a f o f
c
b qze/f

Figure 4.1. A systolic array for LU decomposition (4x4).

Stage 2

Figure 4.2. A systolic array for linear state equations .

86

¢ (x1. x2)

 

 

 

Figure 4.3. The image of 9051. 12) from eqaution (4.22), an ellipse-parabolic surface.

87

1V (or -1V)

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 4.4. A 3x3 network architecture for the linear equation solver.

 

InputA
b = 0
x0 #0

 

 

 

 

 

 

 

 

 

Solve
Ax = b
(b at 0)

 

88

 

 

 

 

 

Solve

 

(b¢0)

Unique No
A solutio ‘7
Yes

Find 24—]

 

 

 

 

 

 

 

—Ax=b

 

 

Figure 4.5. Applications of the ANN solver.

 

 

89

 

 

 

 

  

e
ll
1.2 A —— Example 1
- - - - Example 2
1 ._ ......... Example 3
0.8 -
0.6 —
0.4 -—
0.2 -
0 l I 1 I I: Iteration
100 150 200 250 300
Figure 4.6. Error curves for examples.
(15,5)
(1.1)
0 / olution: T x1
/ (12.714,-1.429)
('20'3)

 

 

(18,-5)

Figure 4.7. Trajectories for an example using different initial points.

90

   
  

eigenvector x2 = 4x1 (for l = 3)

(2.4)

(-4,2) (3,1)

/ =x1

(0.2-3)

 

eigenvector x2 = —2x1 (for 7. = 0)

Figure 4.8. Trajectories of multi-solution (a singularity matrix).

 

 

 

xi
1 0’: Equilibrium time = 2.45 ms
XI
0.5 -
x 2
1 1 I 1 1 1 > time (ms )
0 0.5 1.0 1.5 2.0 2.5 3.0

 

Figure 4.9. The result of SPICE simulation (two neurons).

_.. __- ...,..n‘
MA -

if

91

 

 

 

 

11'
41
1,0 . Equilibrium time = 2.25 ms
1 2
0.5 -
X 3
4 ‘ ‘ ‘ ' 1 > time (ms)
0 05 10 15 20 25 30
-0.5 I
11
-1.0 -

 

Figure 4.10. The result of SPICE simulation (three neurons).

 

 

 

 

 

 

 

It
1 ()4 Equilibrium time = 2.35 ms
1 3
0.5 -
x 2
x1
I 5
x 4
J J I l l 1 Ar time (ms)
0 0.5 1.0 1.5 2.0 2.5 3.0

Figure 4.11. The result of SPICE simulation (ﬁve neurons).

92

chip area
(cm 2)

100_l

90-1

80—

70-—

50—

40—

30-4

20—

10—

 

 

l I I I
200 ’ 400 600 800

Figure 4.12. A relationship beteewn the chip area and the size.

1000

93

Table 4.1. The time complexity for linear system computing.

 

 

 

 

 

 

 

 

Method Time Steps
Gaussian Elimination %n3+0 (n2)
Gauss-Jordan Elimination n3
LU Decomposition by Software %n3
LU Decomposition by Systolic Array 3n +1
Jacobi Transformation 0 (n 3)-0 (n 2)
Householder Reduction (n —2)°0 (n 3)
or. Algorithm 0 (n3)-o (n)

 

 

 

 

Table 4.2. The connection of resistors.

 

Value Positive Negative Zero

 

a,- 1' xj —xj unconnected

 

b,- -1 V +1 V unconnected

 

 

 

 

 

 

where

94

Table 4.3. Linear equations solutions.

 

 

 

 

 

 

 

 

 

 

 

 

 

N 0. Case A b Solution Error
1 Positive A1 12.0 12.7142744064 0.11E-9
deﬁnite 3.5 -1.4285681248
3.0 0.6346439123
Positive 4.0 03502970338
2 real part A2 12.0 2.9066379070 2.3E-10
of eigenvalues -6.0 -2.8384561539
8.0 1.4461678267
2.3 03921020627
4.1 06200531721
1.4 -5.0949549675
2.6 -3.2620067596
3 Arbitatry A3 -4.2 0.3290435970 4.7E-6
-2.8 1.7367919683
12.0 0.5648825169
-21.0 -1.3450118303
10.0 -3.6507625580
7.7 -2.1458382607
1.0 0.5
A1: 0.5 2.0
"11.0 1.5 -1.2 0.6 1.2-
2.0 4.9 0.5 -O.8 0.5
A2— 1.2 0.6 5.0 1.8 1.4
0.5 -l.5 1.2 4.2 1.1
_0.5 -0.5 1.7 2.0 5.7_
12.0 -110 1.5 -1.2 0.6 1.2 -2.0 -1.1 0.8 0.6”
1.2 -0.6 -2.1 2.5 —0.9 1.0 2.0 1.3 0.5 —0.8
1.4 1.2 —1.3-2.1-2.1 1.2 0.6 3.0 1.8 1.4
2.3 2.1 -1.6—2.5 1.8 1.5 -l.5 1.2 3.2 1.1
A 3.1 2.0 —1.2 -1.2 1.8 0.5 -0.5 1.7 2.0 2.7
3’ 0.9 0.5 —O.6 1.5 2.5 3.1—2.1-1.61.3 1.6
-1.3 -1.6 -2.1 2.1 1.6 2.5 2.1 3.4 —2.4 1.2
0.5 0.2 0.7 2.4 1.3 -3.1—2.5 0.8 0.2 0.7
1.4 2.1 -1.5 -1.2 1.7 1.4 -5.6 -7.8 2.4 1.1
.8.2 7.1 -5.2 0.2 -0.8 1.6 3.2 6.1 1.7 3.2-

 

 

 

95

Table 4.4. Matrix inversion.

 

 

 

 

 

 

 

 

 

No. A Inversion Error (15=A.A-l — I)
4 A4 A;1 E4
42 42'1 52
where
0.8 0.4 0.3
A4: 0.4 1.2 0.8
0.3 0.8 1.6

1.50234556 —0.46948257 -0.04694886
A4‘1 = —0.46948251 1.39671147 —0.61032742
—0.04694865 -0.61032748 0.93896627

0.00000113 0.00000006 -0.00000056
E 4 = 0.00000037 0.00000107 0.00000052
—0.00000002 0.00000055 0.00000054

[009709457 -0.4654792 0.04247883 —-0.03103656 -0.02080171
-0.04244077 0.24640946 —0.04885584 0.08177510 -0.01646140
A 2'1 = —0.00934555 —0.05908667 0.23789524 -0.09630578 -0.03269460
-0.02375325 0.10910744 -0.07 685770 0.31758156 -0.04698043
_-0.001 1 1820 0.00503705 -0.05199542 -0.07281359 0.20205469 1

 

 

F0.00000018 0.00000004 -0.00000006 -0.00000001-0.00000009T
0.00000029 0.00000036 0.00000019 0.00000005 0.00000024
E 2 = —0.00000033 0.00000012 0.000000030 —0.00000003 -0.00000024
0.00000048 -0.00000014 0.00000029 0.00000012 0.00000035
.—0.00000008 0.00000002 —0.00000005 0.00000000 0.00000012.

 

 

96

Table 4.5. Stability.

 

No. -A Solution Stable Eigenvalues

 

711:6.710624
6 A 5 Exist Yes M=2.144688+i0.784808
703:2.144688—1' 0.784808

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

11:5.970251
7 A 6 Inﬁnite No 221:2.420508
23=—3.390759
where
3.0 2.0 1.0
A5: 1.0 4.0 2.0
2.0 1.0 4.0
-3.0 2.0 1.0
A6 = 1.0 4.0 2.0
2.0 1.0 4.0
Table 4.6. Singularity.
No. A 117 Solution Singularity
8 A7 (1.0 4.0) Multiple Yes
(2.0 4.0) No
9 A8 (1.0 4.0) Unique No
where
1.0 0.5
A7 = 4.0 2 0

97

Chapter 5

ANN Techniques for Power System

Control Analysis

 

This Chapter presents a new approach for solving nonlinear equations using an
artiﬁcial neural network technique which shows great potential for more efﬁciently
solving a class of power system problems. Foremost, methodologies for solving non-
linear equations addressing power load ﬂow and contingency analysis problems are
reviewed. Then, a mapping between the solution of nonlinear equations and
quadratic-nonlinear minimization is established so that solving nonlinear equations
can be viewed as ﬁnding the minimum of a quadratic-nonlinear function. From an
engineering point of view, a new formulation for an ANN nonlinear equation solver is
proposed which can signiﬁcantly simplify hardware architecture. The conditions for
stability and convergence of neural network based on the new formulation are proven,
providing a mathematical basis for the approach. Furthermore, ANN formulations for
the full power load ﬂow, the decoupled load ﬂow, and the DC load ﬂow are given, and
corresponding architectures for them are proposed, which can be implemented with

VLSI technology.

98

5.1 Introduction

On-line control of large-scale power syStems is a difﬁcult problem exacerbated by
the computational complexity of solving large-scale load ﬂow and contingency analysis
formulations in time frames required for real-time response to rapidly changing operat-
ing conditions. The problem has been aggravated by the lack of robust algorithms and
implementation technology which would allow for the design of economically feasible

dedicated high-speed computational hardware.

A large class of problems in science and engineering, including a group of these
power system analysis, formulations can be formulated as optimization problems. Con-
ventional computers are sometimes adequate for solving this kind of problem providing
that it is within reasonable bounds. Most real world problems, however, are too large
to be solved in this manner. As described in the previous chapters, a new computa-
tional trend, artiﬁcial neural network computation, has recently emerged for solving
these difﬁcult problems. Hopﬁeld and Tank proposed a neural network for solving
linear programming [TaH086]. Kennedy and Chua developed a canonical form of
neural networks for ﬁnding the minimum for nonlinear programming [KeCh88]. For
combinatorial optimization problems, such as the Traveling Salesman Problem (TSP),
several algorithms and architectures have been reported [HoTa85, AiFS90, WiP388].
One of the signiﬁcant features of these approaches is that the time complexity is size
insensitive so that solving large-scale systems in real-time is possible if these
approaches are implemented in hardware. Linking ANN optimization networks with
techniques for solving nonlinear equations, this Chapter proposes an artiﬁcial neural
network approach and architecture for power load ﬂow and contingency analysis with
the following characteristics: A mapping is established between an optimization prob-
lem and nonlinear equations solution. A simpler formulation is proposed for an ANN

nonlinear equation solver so that the method is more suitable for VLSI

99

implementation. Advantage of automatical convergence is taken so that the time com-
plexity of the solver is size-independent. The formulation uniform is made for solving
both power load ﬂow and contingency analysis, and resulting in the identical procedure
for single and multiple outages in contingency analysis. A straightforward process is
provided for solving the problem without training and testing. An accurate result is

guaranteed for power load ﬂow and contingency analysis.

This Chapter is organized as follows. First, the problem descriptions for power
load ﬂow and contingency analysis are given, and a variety of approaches to these
problems are reviewed. Then, an ANN approach for solving nonlinear equations is pro-
posed in which a mapping between an Optimization problem and nonlinear equation
solution is established. A formulation of an ANN solver for the general nonlinear
equations is deﬁned, and the conditions for the stability and convergence of networks
are proven. Based on the general nonlinear equation solver, the ANN formulations for
the full power load ﬂow, the decoupled load ﬂow and the DC load ﬂow are
represented. Furthermore, ANN architectures for three cases of power load ﬂow, and

the simulation results are presented.

5.2 Methodology Review

A description for power load ﬂow and contingency problems is given ﬁrst, and

then typical computational techniques for solving the problems are reviewed.

5.2.1 Problem Statement
5.2.1.1 Power Load Flow

The power load ﬂow is an analysis problem which deals with ﬁnding the
steady-state solution of bus voltage magnitudes and phase angles for given load

demands at the various load busses and for assumed generation levels. The load ﬂow

100

is of fundamental importance to power systems control because many practical applica-
tions, such as transmission planning, contingency analysis, VAR/voltage analysis, on-
line control, and security enhancement, require a load ﬂow solution [Deb88, Coet86,

Moet87].

Let k(i) denote the set of buses connected to bus i. At bus i, PI, Qi, PDI, QB,-
represent generated real, and reactive power, and load real and reactive power, respec-
tively. Let V,- and 5,- be the magnitude and the phase angle of the complex voltage at

bus i. The load ﬂow problem can be written using these terms as

Pi ‘ P0; = V120,; - V,- Z VI-[GI-I-cos5ij + BijsinBII], (5.1a)

I€k(i)
Qt ‘ QDi = —vi28ii - vi 2 V1101) $11151) ' 31190550], (5113)

1““)

where
01.- E 2 (Ga-,- + 01,-). (5.1c)
jeka)

Bit 5 351' + 2 (331,- + 31,-). (5.1d)

jek(i)
5"}: E 8" — 8]" (5.18)

GI-I- and Bi!- are the conductance and admittance of the transmission line between bus i

and bus j. For simplicity, these equations can be written vector form as
P(8, V) = 0
{Om V) = 0. (5.2)

Because all parameters in a power system are under consideration, this forrnula-

tion is called the full power load ﬂow (FLF).

5.2.1.2 Steady-State Contingency Analysis

101

The steady-state contingency analysis focuses on predicting the power load ﬂow
following some event such as a transmission line outage or a transformer outage. For
contingency evaluation, it is assumed that real and reactive loads, real power genera-
tion, and generator bus voltage magnitudes are unchanged before and after the outage.
The objective is to judge the security of the power system against the outage by solv-
ing the load ﬂow problem with changed parameters of the system topology. By com-

paring the solution with the constraint, a conclusion of security can be easily reached.

5.2.2 Methodology Review

5.2.2.1 Pattern Classiﬁcation Method

If the operational conditions of a power system are regarded as the input vari-
ables, and the judgement of static security for the system is regarded as the output
variable, then the contingency analysis can be treated as a typical pattern classiﬁcation
problem. Artiﬁcial neural networks have been applied to problems of this type
[Aget89, Fiet89, SoPa89, HuSh9I]. The approach involves two phases: off-line training
for determining parameters of the neural network, and on-line testing for contingency
analysis. A feedforward neural network model is often used for building the classiﬁer
[[Aget89]. Figure 5.1 shows a neural network architecture, where S, P, and Q, are
apparent power, real power and reactive power, respectively, 1.0 is a constant bias for
all thresholds. Each circle represens a processing element (neuron). There are three
layers (input, hidden and output) and two connection matrices (WW, and WM). The
back-propagation learning rule is used for training the network so that the connection
weights are adaptively updated. When the values of the connection weights converge,
the learning process is ﬁnished and the neural network is used to test different con-
tingency situations. The attractive feature of this method is that the testing is a

straight forward process which can be performed in real-time. However, the

102

convergence of the learning process is case dependent so that the general conditions

for convergence are still subject to investigation [Deb88].

5.2.2.2 Newton-Raphson Method

One common way to solve the FLF problem is using the Newton-Raphson

method [Ort87a]. By deﬁning the Jacobian matrix of (5.2) as

 

 

'21: 21:”
J = 33 32; , (5.3)
(55‘ 57‘
an iteration solution can be found by
' (k+1) (It)
30.41)] = 3(1)] + J“(8"".V("’)[ 35] (5.4)

 

 

where k is the iteration index.

5.2.2.3 Estimation Method

Since transmission line resistances are much smaller than the corresponding reac-
tances, angular differences across a transmission line are small, and the voltage magni-
tudes are close to the normal values (VI = 1.0 p.u. ), some approximations can be used
to simplifying the problem. These include the decoupled load ﬂow (DLF) and the DC
load ﬁow (DCLF). The solution to DLF takes the form

809] [8,102.4 0 ]T [AP‘ka‘k’ 0
l

+ _ (5.5)
V“) 0 Blood 0 AQ(k)/vload(k)

v(tt+1) =

 

5(k+l )]

 

where Bmde, 8,004, dee, and V100,, are node-connection matrix, load-connection

matrix, node-voltage matrix, and load-voltage matrix, respectively.

In DCLF, the variation of the voltage magnitude is ignored so that power load

ﬂow degenerates to a linear equation of the form

103

A 5 = P (5.6)

where A is a connection matrix, 5 and P are phase angle vector and active power vec-

tor, respectively. Therefore, the technique for solving linear equations can be used.

5.2.2.4 Multi-level Screening Method

The objective of contingency analysis is to identify the critical branches which
may suffer from violations. In general, the number of critical branches is very small so
that approximate techniques can be used for ﬁltering noncritical branches at the begin—
ning, and more accurate analysis can be used at each higher screening level with a
reduced set of candidates. In this way, a signiﬁcant amount of computing time can be
saved. Furthermore, a pre-screening algorithm has been developed for reducing the
number of branches for the ﬁrst process [Bret91]. An important assumption of this
algorithm is the contingency localization since the effects of power system contingency
are highly localized near the outaged branch. Consequently, the power system network
can be divided into three parts: the outaged inside network, the stiff boundary and the
rest of the network, where the ﬁrst and the second parts constitute the local cut-off net-
work. The third part can be ignored during the contingency analysis as it is considered
far away from the outage. This approach can handle a single contingency situation.
Tests of a 600-bus system have showed that the CPU time can be reduced by about
50% compared with the complete bounding method. The relative error was 1-2% com-

pared with the full power load ﬂow solutions [Bret91].

5.2.2.5 Homotopy Method

Homotopy is a global convergent method for solving nonlinear systems of poly-

nomial equations [ChMY78]. A homotopy function is deﬁned as

H(x,t)=(1—t)S(x)+t T(x) (5.7)

104

where t is a real parameter varying from 0 to l, T(x) is the target system to be solved,
and S (x) is a simplier system for initial situations, called the starting system. For

power load ﬂow problem, S (x) is often chosen as

 

 

alxl2 — b1
2 — b

S (x) = 02x3..- 2 (5.8)
a,,x,,2 - b,,

The procedure for homotopy method is deﬁned as:

(i) Let t,- =0. Randomly choose complex vectors a: (a1, 02, ..., an),
b = (b1, b2, ---, b,,), and solve S(x) = 0 to get a set of roots which are the

starting points of homotopy curvers.

(ii) Update t,- by tI-"ew = t?“ + At, where At is stepping distance. If t,- < 1.0, go to
(iii); otherwise, stop.

(iii) Use an iterative method, such as Newton Raphson method, to solve homotopy

equation for each t,-

H(X, ti) = (1 — II)S(X) + tiT(X) = 0. (5.9)

(iv) go to step (ii).

This method can systematically ﬁnd all possible solutions at the same time.
Moreover, the Speed of this process can be improved by means of dynamically adjust-

ing At and using parallel computation techniques [Saet89].

5.3 ANN Formulation for Power Load Flow

5.3.1 ANN Solution to Nonlinear Equations

Consider the set of nonlinear equations

f](x]s x29 ... 9 xn) = O
”x" x211... ’ x") = 0 (51%)
fn(x1, x2, , X") = O
or
f(x) = O (5.10b)

where x is a variable vector and f is a function vector. The Jacobian matrix of f(x) is

deﬁned as

 

 

 

 

 

 

 

ECLBLL af,
8x1 8x2 8x"
af_2 ﬂ... afz
J<x>= 8).“ 672 III a?" . (5.11)
afn afn afn
8x1 8X2 ax"

Note that J is a function matrix of x. In most cases, it is desirable to evaluate J at a
given point, say x0. As a consequence, J becomes a constant matrix, represented by

J (x0). Before deriving the ANN formulation, some deﬁnitions are given.

Deﬁnition 5.1: If the Jacobian matrix at x, J (x), is not a zero matrix for x e D c R”,

then it is called an unvanished Jacobian matrix on D.

Definition 5.2: If all eigenvalues of the Jacobian matrix at x satisfy Re (M) > O for

x e D , then it is called an eigen—positive Jacobian matrix.

Next, a mapping between an optimization problem without constraints and non-

linear equation solution can be established.

Proposition 5.1: If the Jacobian matrix of nonlinear equations (5.10) is unvanished for

any x, then an optimum point is a solution to the nonlinear equations.

Proof: Deﬁne an objective function

¢<x> = ifﬂx). (5.12)
i=1

106

then an optimum point of ¢(x) must satisfy the necessary condition

that is,

Since the Jacobian matrix is unvanished, the above equation implies that

 

 

 

 

 

F

 

 

 

 

 

 

 

 

 

 

 

19..“
8X1
E.
812
VtD =
34>
8x"
af1 afz
3; + ZfZSX—i’
8f 8f
2f1-5;-21— + 2&3:-
an a}... H
Bx" zfzax,
3a. 9:: an
3x1 8x2 ax"
if; a: afz
8x1 812 8x”
59f" afn afn
8x1 8X2 ax"
"ff
217 f3 =
fa.
Ffl 0
f2 = 0
fn.

 

Soc

0

 

 

 

_f.

f2

 

 

 

 

 

 

 

(5.13a)

(5.13b)

(5.13c)

(5.13d)

(5.13c)

In other words, the minimum point satisﬁes the corresponding nonlinear equations. D

107

n
Proposition 5.] indicates that a minimum point of 2 f f(x) is a solution to the
i=0 .

nonlinear equation. However, if a gradient system is directly derived from this objec-
tive function, then the corresponding artiﬁcial neural network for the problem becomes
complex since it encounters both f(x) and J(x). Alternatively, a simplier formulation

is proposed below.

For the set of nonlinear equations described by (5.10), deﬁne a dynamic system

 

 

as
X] f](x19 XZ, ... 9x")
. i (x,x,-".X)
x= ...2 =- f2 1.3... " . (5.14)
x.” fn(xlax29 9xn)‘
L a

 

 

When the system approaches an equilibrium, then
it = 0.
At that time, there must exist the equality
f(x) = 0.

We call this network, described in (5.14), an ANN nonlinear equation solver, because
equation (5.14) can be regarded as a simpliﬁed formulation of an artiﬁcial neural net-
work for optimization [KeCh88]. An important issue for the ANN nonlinear equation

solver is ensuring its stability as will be discussed the following Theorem.

Theorem 5.1: If f(x) is continuously differentiable and its Jacobian matrix is eigen-
positive, then the ANN nonlinear equation solver is stable in the neighborhood of a

solution point and its equilibrium point must be satisﬁed with the nonlinear equations.
Proof is given in the Appendix A.

Theorem 5.1 establishes a formulation for solving general nonlinear equations

using a newly developed artiﬁcial neural network. In what follows, this formulation

108

will be applied to power system control problems [ChSh92b].

5.3.2 The Full Power Load Flow

As prescribed by equations (5.2) and (5.14), the neural network for the full power

load ﬂow is formed as

8 = - P(5, V) (5.15a)
v = — Q(8, V) (5.15b)

where 8 and V are the phase angle vector and the voltage vector, respectively. It is
apparent that the Jacobian matrix described in (5.3) is continuously differential in the
domain 8 e R" and V e R". In order to investigate the eigen-positive property of the
Jacobian matrix at the neighborhood of an equilibrium point, the following deﬁnition

and theorem must be introduced.

Deﬁnition 5.3: If matrix A satisﬁes

an. > O, (5.163)
61,-,- 2 Z laij I, (5.16b)
jati
and
at: 2 2 la}.- I.' (5.16c)
jati

then A is said to be diagonally dominant. Furthermore, if only inequality is held in the

above, A is said to be strictly diagonal-dominant. For example, matrices A1 and A2

below
4.0 0.5 -0.5 —1.0 4.0 1.5 —O.5 —1.0
A _ 1.0 3.0 1.0 0.3 A _ 1.0 3.0 1.0 0.3
1— 1.0 .5 2.8 1.0 2" 1.0 —O.5 2.5 1.0
1.0-1.0 0.3 3.5 1.0 -1.0 0.3 3.5

are strictly diagonal-dominant, and diagonal-dominant, respectively.

109

Theorem 5.2: A strictly diagonally dominant matrix is eigen-positive.
Proof can be found in [Jen77].

In general, there exist multiple solutions to power load ﬂow due to its nonlinear
behavior. From a practical point of view, however, the feasible solution is the most
important because the power system operates under this situation. As derived in
Theorem 5.1, a nonlinear system usually features a local stability. Thus, a problem
arises in the choice of an appropriate initial point for an ANN solver in order to make
sure that the equilibrium point is the desired feasible solution. Fortunately, this prob-
lem can be solved by setting the initial point at the ideal operation point for the power
system. That is, all phase angles are 0, and the magnitude of all bus voltages is 1. It is
certain that the objective of power system control is to make the difference between
the system state (8,V) and the ideal state (8 0 = 0, V0 = 1) as small as possible by
means of shunting capacitor banks, and adjusting transformer taps. Usually, an allow-
able range for the difference is deﬁned as W — V°|< 0.05 and IS - 5° |< 0.15 (8 in
radian). Next, the Jacobian matrix at the ideal point is discussed. Of course, in this
case the Jacobian matrix only depends on topology parameters: conductance and sus-
ceptance of transmission lines, 8, C. By plugging the values of 5,- and V,- (0,1) for all

i into equation (5.3), the Jacobian matrix becomes

:- 1

011..01n b1]..b1n

anl'°ann bnl"bnn

 

 

“0’”: C11"Ctn d11"drn (5'17)
_cn1"c,,,, dnl ' ° dmd
where
011‘:- X 31;; (5.17a)
JEkU)
a-- =B-- (i ¢j); (5.17b)

110

bit = 20H " Z GU‘, (5.17C)
jekm

bij = -G,-,- (i #1); (5.17a)

Cit =- Z GU; (5.17e)

jek(i)

Cij = Gij (i $1); (5.170

dii = -23.-.- + 2 Bi); (5.17g)
jeka)

d-- :3. (i ¢j), (5.17h)

U ‘1

It is apparent that every diagonal entry (a 611;) of the Jacobian matrix is positive

it"

because
Bi,- < 0 => an = - 2 8,,- > 0 (5.18a)
j€k(i)
and
da- = —28ii + 2 Bi} (5.18b)
16/60)
=‘2(Bst + Z Bsij + 2 Bij)+ Z 3t;
16W) j€k(i) j€k(i)
=-2Bsi " 2 2 Bsij " 2 Bij-
jeka) jeka)

In power systems, 2 IBij |represent the sum of susceptance of the all transmission
jek(i)

lines to bus 1' , and [85,- |+ . EJ 85,)- I represent the sum of susceptance of the capacx-
[6 t

tor bank at bus 1' , and susceptance of the all I'l-equivalent sections at the side of bus

1'. In general, the former is much larger than the latter, that is,

Z IBij|>2(|Bsil+ 2‘. IBsijl)=>dii>O- (5.18c)
jek(i) jek(i)

Using the notation

111

Jeko) bed

where BiO represents admittance of the transmission line between bus 1' and the slack

bus (bus 0), and

Bi} 1f IhCl‘C iS a connection between bus I and j
l
J

0 otherwise, (5.20)

then the conditions for making Jacobian matrix strictly dominant are found as
lBto |> 2%“ (5213)
“’10 + 2831' + 2 Z Bsij I) 22'ng |+ I G“) I. (5.21b)

jek(i) j!=t'
In other words, when the parameters of a power system are satisﬁed with the condi-

tions in (5.21), the neural network formulated in (5.15) is stable in the neighborhood

of (0, 1) so that a feasible solution can be found by the network.

It is worth pointing out that the strict diagonal-dominance is a sufﬁcient condition
for an eigen-positive matrix. In some cases, a matrix is still eigen-positive, even
though it isn’t strictly diagonal-dominant. This situation will be discussed later because

it often appears in the decoupled load ﬂow and the DC load ﬂow.

Deﬁnition 5.4: If a matrix can be written as a block form, and all its non-zero
entries are in the diagonal blocks, then the matrix is called the diagonal-decoupled

matrix; otherwise, it is called the diagonal-undecoupled matrix.

Theorem 5.3: If A is a symmetric, diagonal-undecoupled and dominant matrix

with at least one strict dominant row (column) such that 0k); > Zak}- and a,“ > 2011:
jatk j¢k

for some k, then A is an eigen-positive matrix.
Proof: All eigenvalues of A must be real due to the symmetry [HiSm74], i.e.,

Im (2.1) E O.

112

According to Gerschgorin Theorem [HiSm74], every eigenvalue 1,- satisﬁes

I7~t ' aii IS Z |an I (5.223)
jar“
and
ll»: - at: IS 2 la); | (5.22b)
jaet'

where an- is a diagonal entry of the matrix. A graphic interpretation for the above rela-
tionship is that 1.,- must lie within a union of row-discs (center at aa- and radius =

2 I00 I) and column-discs (center at an- and radius = 2 la]; I) in the complex plane.
I.“- jati

Using the dominant condition in (5.16a-5.16c), it is concluded that all eigenvalues
are in the right half of the complex plane and X,- Z 0, since they are all real. Next,

matrix A must be shown to be nonsingular. Without loss of generality, let

a“ = Z IaijI fori = 1, 2,°--, n—l (5.23a)
j¢i
am > Elan-l (5.23b)
jam A

using the row elementary transformation reduces A into an upper-triangle matrix. It
can be shown that the diagonal dominance holds because of the diagonal un-decoupled

assumption on A [HiSm74]. Therefore,

(21,002 z|aij<k>| (5.24a)
jati
and
an“) > o (5.24b)

where (k) is the kth reduction such that (It-I") = O for j < i .

The elementary transformation does not change the determinant of the matrix,

det (A ) = a {(1% )5) ---a,,(,':-” > 0. (5.25)

113

In other words, there is no zero eigenvalue in A so that
M > 0 for all 1'.

Therefore, A is an eigen-positive matrix. El

It is worthy mentioning that the diagonal un-decoupled feature is not a necessary

condition for an eigen-positive matrix. For instance, two matrices

I
J
ﬂ
J

OOOONN

C1: C2:

 

 

 

 

OOOOIQN
OOOONN
OOUJUJOO
OOUJUJOO
A-DsOOOO
UrJ>OOOO
l_
OOOOUJN
OOWWOO
OOJ>UJOO
A-bOOOO
LII-hOOOO
l

I

are diagonal decoupled. However, C 1 is singular, and C 2 is nonsingular, because there
exists at least one diagonal block with zero eigenvalue in C1, whereas in C 2 each
diagonal block has a strictly dominant row which makes the diagonal block nonsingu-
lar. In a power system, a diagonal-decoupled matrix indicates that the system consists
of a group of buses in which there only exist connections between the slack bus and
the individual groups, and there is no interconnection between the groups. Figure 5.2
shows a power system composed of three groups A 1, A2, A3. These groups are con—
nected to the slack bus 5. It is obvious that case C 1 never occurs because in each
group, there is at least one bus (say bus k) connected to the slack bus so that ak,c is

strictly dominant in the corresponding block matrix.

Based on Theorem 5.3, two Corollaries can be deduced for the decoupled load

ﬂow and the DC load ﬂow.

5.3.3 Decoupled Power Load Flow

Corollary 5.1: Decoupled power (DCF) load ﬂow can be solved by the ANN

solver.

114

Proof: By ignoring the conductance of transmission lines, the ANN solver for

DLF can be formed as

8: — [— v,- 2; v1.30. sins-j — (P, - Pol-)1 (5.26a)
jek(i)
{1 = — {—VI 28‘? -' vi 2 Vj+ BU C0380 - (Qi ‘- QDi )1. (5.26b)
mun

Under the assumption of sinBU = 0 for all i , j, the corresponding Jacobian matrix at

(0, 1) is

J (0, 1) = (5.27)

OHOdnldm

)- u

 

 

which is a decoupled matrix. Obviously, every row is dominant, and there exists at
least one strictly dominant row in each block. According to Theorems 5.2 and 5.3, the
Jacobian matrix is eigen-positive which guarantees that the decoupled power load ﬂow

can be solved by the ANN solver. 1:]

5.3.4 The DC Power Load Flow

Corollary 5.2: The DC power load flow (DCLF) can be solved by the ANN

solver.

Proof: As mentioned in Section 5.2, the decoupled power ﬂow degenerates to a

linear equation. Consequencely, the ANN solver can be formulated as

S = — (A 5 — P) (5.28)
where
P1- 511
P 2 52
P = ' , 5 = ' (5.29)
LP"- I8".

 

 

 

 

115

and

 

r
" 2 Bil (. .)
_I lek(i) 1 =1
Note that the Jacobian matrix is a constant matrix
J = A , (5.31)

and A is still a diagonally dominant matrix with at least one strict dominant row so
that A is eigen-positive. In this case, a unique solution can be found by the ANN

solver due to the linear behavior of the system [Mea88]. Cl
5.3.5 Contingency Analysis

As described in Section 5.2, a kernel problem of the contingency analysis is to
solve the power load ﬂow with changed parameters of the power system. An assump-
tion in the contingency analysis is that the whole network remains unisolated before
and after outages so that the above topological conditions are unchanged. Therefore,
the ANN formulations for the different cases of power load ﬂow can be directly used
for the contingency analysis. It is worthy to point out that the multi-outage (the param-
eter change of transmission lines can be regarded as a speciﬁc case of transmission
line outage) share the identical ANN formulation with the single-outage. Consequently,
the architecture proposed in next Section will handle all the situations of power load

ﬂow and contingency analysis.

5.4 Architecture

A theoretical basis for an ANN nonlinear equation solver and the ANN formula-
tions for power load ﬂow have been presented in Section 5.3. In what follows, an

architecture of the ANN nonlinear equation solver for power load ﬂow is proposed.

116

Figure 5.3 shows the architecture of the ANN nonlinear equation solver for the
full power load ﬂow. From a structural point of view, the architecture can be divided
into two parts: the 8 component (the upper part in Figure 5.3) and the V component
(the lower part in Figure 5.3). These two components can be viewed as the mapping of
equations (5.15a) and (5.15b), respectively. This network is composed of neurons,
resistive matrices, analog multipliers, sinusoidal generators, feedback links and DC
voltage sources. The neurons are implemented with integrators (I - units) which
govern the dynamical process of the network for ﬁnding an equilibrium point. The
dynamical relations in (5.15) are held by connecting all current outputs from
P, B , G , G), and Q , G , B, B), to the inputs of integrators for 8 and V, respectively,
where the negative terminal of the integrator is used for implementing the negative
sign in (5.15). Another distinct feature of this approach is that the feedback connec-
tions 8 and G are not restricted to symmetric matrices. This signiﬁcantly broadens its

range of applications.

In most power systems, the phase angle 8 is usually near zero so that the follow-

ing polynomials can be used to approximating the sinusoidal function

53

sin8 :3 5 — '3' (5.323)
52

C030 "-3 1 - —2-. (5.32b)

In other words, the sinusoidal generator is replaced by a polynomial generator.

There are three nonlinear operational blocks in the right part of Figure 5.3. All
of them are organized in n x n arrays. In the sinusoidal generator block (Sin lCos ),
the outputs on the diagonal are sin 8,- and cos 8,, and the outputs off the diagonal are
sin (8,- — 81-) and cos (8,- — 8}), where i is the row-index, j is the column-index. If
there exists no connection between bus i and bus j, the output at (i, j) of the array is

zero. Similarly, in the voltage multiplication block (M1), the outputs on the diagonal

117

are V}, and the outputs off the diagonal are Viv]. if there exists a connection between
bus i and bus j. In the joint multiplication block (M 2), the outputs on the diagonal are
Visin 8,- and Vicos 8i, and the outputs off the diagonal are ViVJ-sin(8,- - 8}),

Vi Vj cos (8,- — 8 j) if there exists a connection between bus i and bus j.

As shown in Figure 5.3, Viz from the voltage multiplication block is fedback to
the B”, G), network, and all outputs from the joint multiplication block are fedback to
the B , G networks. It is necessary to point out that the original power load ﬂow for—
mulation (5.1a, 5.1b) is obtained from decomposition of the complex power so that
G, B, P , Q can be regarded as the unitless values. As a consequence, they can be
implemented in a resistive matrix network. In general, the dimensions of these matrices
are n x n for G, B, and n x I for G”, B”, P, Q. As described in Section 4.5, let the
normalization values of current be 0.1 ma, then the value of the resistors can be calcu-

lated as

10

IO.-

lk Q (5.33)
J

 

RGU =

where R65), is a resistor representing values in matrix G. If Gij = 0, R0,,- is inﬁnite. In
this case, the feedback to the corresponding position is suspended. R3. R6", R3”, RP

and RQ are calculated and handled in the similar manner.

Let n be the number of buses (excluding the slack bus) in a power system, it is
then straightforward to ﬁnd the number of circuit components. Furthermore, using
design data for CMOS technology [AlHo87, Rei87], a space estimation for building
this nonlinear equation solver for the full power load ﬂow is given in Table 5.1, where
0 represents the area for a resistor with an average resistance 50k Q, and b represents
the space area for an operational ampliﬁer which is the basic circuit to build the
integrator, the sinusoidal generator and the multiplier. Let the routing area be 40% of

the circuit area [AlHo87], then the space area of a chip for a n-bus power system is

s =1.4[(18n2 + 2n)b + (4n2 + 4n)a]. (5.34)

118

For the decoupled power load flow, an architecture takes a simpler form shown in
Figure 5.4. Without G blocks, this architecture makes the remaining same as that of
the full load ﬂow shown in Figure 5.3. A space estimation is given in Table 5.2. The

space area of a chip for a n—bus power system is

s =1.4[(18n2 + 2n)b + (2n2 + 4n )0]. (5.35)

For the DC load ﬁow, all the blocks associated with V, G, the sinusoidal genera-
tor and multiplications are ignored so that the architecture becomes signiﬁcantly sim-
ple as shown in Figure 5.5. A space estimation is given in Table 5.3. In fact, the
space area of a chip for a n—bus power system is the same as that of the linear equa-

tion solver described in (4.51). For convenience, it is written here again

5 = 1.4[(nb + (n2 + n)a]. (5.36)

Using equations (5.34) to (5.36), the actual chip area for a 100—bus power system
is 41.38cm2, 39.63cm2, and 0.905em2 for FLF, DLF, and DCLF, respectively. The
chip area for DCLF is much smaller than that of FLF and DLF due to without non-

linear structural components.

5.4.1 Veriﬁcation

To verify the computational correctness of the proposed neural network and archi-
tecture, simulations are fulﬁlled, where integrators are simulated by a fourth-order
Runge-Kutta numerical integration (explicit formulation), the resistive networks are
simulated by 1-D and 2-D arrays, and the sinusoidal units are simulated by function-

call.

5.4.1.1 The full power load ﬂow

There are two cases in the FLF:

119

(1) Find both phase angle and voltage amplitude for each bus except the slack bus
implying that there are 2n variables in the system. This case is called the
PQ—bus situation because the real power P and the reactive power Q for all
buses are speciﬁed. This is the general case as was discussed in the previous sec-

tions.

(2) Find the phase angle and voltage amplitude for load buses and ﬁnd the phase
angle only for generator buses implying that there are (Zn—g) variables in the
system, where g is number of generators. (in this case the load bus is a PQ-bus,
and the generator bus is a PV-bus [Ber86]) This case is called the
mix—bus (PQ -PV) situation, because the real power P and the reactive power Q
for all loads, and the real power P and the magnitude V for all generator buses
are speciﬁed. Obviously, this situation is a special case for the full power load

ﬂow which can still be covered by the formulation (5.15).

Here two examples are described to show these two situations. The ﬁrst example
is a 7-bus power system, shown in Figure 5.6 representing the PQ -bus situation. The
original data for transmission lines provided in resistance and reactance (in the
parentheses in Figure 5.6), a transformation from impedance to admittance is carried
out ﬁrst. The second example is the 39-bus in New England power system which
represents the mix—bus situation. The data for the 39-bus system are given in Appen-

dix B.

The convergence for the ANN simulation is determined by
|x,-<"+‘> - xi“) |< e for all i (5.37)

where x,-(") is the value of x,- at the kth iteration. e = 0.15—3 and e = 0.1E—7 are used
for the both examples. The time increment used for the Runge-Kutta integration is 0.05

for the 7-bus case and 0.001 for the 39-bus system. The simulation time is deﬁned as

T = dt -N (5.38)

120

where N is the number of iterations required for convergence. All starting points are
set to 0 and l for 8 and V, respectively. The equilibrium points are found for all the
examples. The simulation time is dependent on e: for 8 = 0.1E—3, T is 2.15 and 2.35
for the 7-bus and the 39-bus cases, respectively; for e = 0.1E—7, T is 13.15 and 14.55
for the 7-bus and the 39-bus cases, respectively. Furthermore, T has the same rela-
tionship to the initial point as that of linear equation solver described in Section
4.7.1.1. These results and the result from a hardware simulation for the ANN linear
equation solver [ChSh92a], whose formulation is the same as that of the DCLF, have
shown that the time complexity of this approach is size-independent due to its parallel

architecture and circuit dynamic properties.

In order to measure the accuracy, the absolute error Ed, and the relative error E,

are deﬁned as

 

1
k _
Ea = [2(1‘1'- Rj)2:l 2 (5.393)
i=1
Ea
Er = k (5.3%)
Z ”’1' I
i=1
where
2n for all P-Q buses
= 2n — g for mixed buses (5390)

and L,- and Rt are the values for the left hand side and the right hand side of each
equation in (5.1a) and (5.1b), respectively. Thus, the absolute error represents the sum
of the difference between the left hand sides and the right hand sides, whereas the
relative error represents a percentage of the absolute error over the absolute sum of the

, left hand sides (total real and reactive power).

For comparison, these two examples are solved by the Newton-Raphson method

as well. Tables Cl and C2 in Appendix C present the simulation results and the

121

relative errors for the 7-bus system and the 39-bus system from the ANN approach,
the Newton-Raphson version and the homotopy method as found in [Guo90]. It is
apparent that the accuracy of this approach is best. The trajectories of 8 and V, shown
in Figures 5.7-5.10, indicate that the network approaches its equilibrium without oscil-

lation.
5.4.1.2 Contingency analysis

A 5-bus power system is chosen to test contingency analysis using the ANN for-
mulations for the decoupled load ﬂow and the DC load ﬂow. Figure 5.11 shows the
topology of the system. As mentioned in Section 3, this approach can uniformly fulﬁll
the contingency analysis for all the cases. In Figure 5.11, the dashed line represents the
single transmission line outage, the dotted lines represent the multiple transmission line

outage, and the numbers in parentheses represent parameter change.

The ANN simulation procedure for contingency analysis is the same as ﬁnding
solution to the full power load ﬂow. Here the simulation parameters take values:
8 = 0.1E-3 and dt = 0.05. Table 5.4 presents the simulation results. In order to indi-
cate the values of the approximation, the simulation result of the FLF for the same
data is given as well. Apparently, the result of the DLF is very close to that of the
FLF, whereas the result of DCLF has some difference from the FLF when the actual
voltage magnitude has a large deviation from 1.0 for some buses. The security judge-

ment is based on the following constraints:
Iv, - v0|< 0.05
|8,- — 80 | < 9°

Non-security cases are marked (*) in Table 5.4.

122

5.5 Summary

A new approach for solving problems of power load flow and contingency
analysis using an artiﬁcial neural network has been proposed. A mapping between the
nonlinear equation solution and quadratic-nonlinear minimization problem without con-
straints has been established. Furthermore, a simple formulation given by equation
(5.14) for the artiﬁcial neural network has been deﬁned and its stability condition has
been identiﬁed. The ANN formulations for the full power load ﬂow, the decoupled
load ﬂow and the DC load ﬂow are obtained and their stability can be sufﬁciently
guaranteed by using the diagonal dominance property of the power systems.
Correspondingly, an ANN architecture for the three formulations is given which can be
implemented in VLSI technology. The simulation experiments have shown excellent
results for power load ﬂow and contingency analysis. Due to dynamic convergence
and distributed parallel processing in ANNs, the time complexity of the approach is
size-independent. Therefore, this approach can provide the real-time solution for large-

scale systems if it is implemented in hardware.

123

Input layer Hidden layer Output layer

s_. [f
<2 all

Figure 5.1. The neural network for classiﬁcation.

 

124

 

0 CD 0

Figure 5.2. A power system consisting of 3 subsystems.

125

 

 

Source . l
°—_] 6 f ViVjcos8ij

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

r 1 1 1 3-1 L.f;'"
r 0
v I. —
Source l V? *1”?—
I I T
Q G B 311
1 1 1 1 l-) V .M...
__L_+
—V‘.Vjsin8i}.

Figure 5 .3. A network for the full power load ﬂow.

126

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

V-Vjcos8..
l (J
Source
°—_l I
P B
8 ‘ 5'
I l ’J’ 5 4 cm
W "
2 _’ M2
Source 'Vt r>
° I V Y
Q B 311
V
l i l >;I V—p M1 _
-_I-_I .
—ViVj-sin8ij

Figure 5.4 . A network for the decoupled load ﬂow.

 

 

 

127

Source

O.___

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

III—l I m

 

 

Figure 5.5. A network for the DC load ﬂow.

 

 

128

 
   
   

 

 

 

  
 
 
  

 

 

 

 

 

 

P = 0.076 p = 0.039 P66 = 0.900
31 k b D5 D1 — 0 300
ac US QDS = 0.016 QDI = 0478 Q66 - .
I (0.054+j0.223) (0.013+j0.042) (0.082+j0. 192)
j0.0510
I j0.0424
: (0.057+j0.174) I
a ..
In
C. (0.058+j0.176) r:
9., r\
+ -:
2: a
+
E :3
(0.024+j0.100) (0.024+j0.100) g
. _l. . I
100717 I I)00200 1- j0.0273
PD4 = 0.183 PD3 =0.135 P02 = 0.942
QD4 = 0.127 QD3 = 0,053 QD2 = 0.190

Figure 5.6. The 7-bus system.

129

 

 

 

 

 

 

 

 

 

8
(degree)
bus6
5 _
200 400 600 bus]
0 m 1 . Iteration
busS
bus4
bus3
-5 .4
bus2

 

 

Figure 5.7. The trajectory of angle for the 7-bus system.

130

 

bus6
1.05—

1.00 l l l

 

Iteration

bus4

 

 

095‘“ busS

 

busl

 

bus3

 

busZ

 

Figure 5.8. The trajectory of voltage for the 7-bus system.

131

(degree)

15—

10—

 

Iteration

 

 

 

-5 - The bus number of curves from the top to bottom:
36, 38, 35, 33, 32, 34, 37, 31, 29, 22, 23, 19, 28, 20, 30, 21, 25, 10, 26
13,11, 2,12, 24,16, 6,14,17, 27,15, 5,1,18, 3, 4, 7, 9, 8

Figure 5.9. The trajectory of angle for the 39-bus system.

132

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

V
1.05
\ -
500 1000 1500 2000
1.00 I I I 1 Iteration
V
0.95—

 

 

 

 

The bus number of curves from top to buttom:
36, 35, 30, 37, 38, 25, 22, 29, 23, 28, 26, 1, 2, 34, 33, 27, 24, 21, 17
19,18, 3,16, 9, 20, 32, 31,15,10,14,13,11, 4, 6, 5, 7, 8,12

Figure 5.10. The trajectory of voltage for the 39-bus system.

133

 

 

 

 

 

Slack bus P64 = 1.9
e o awas
V4: 1.05
—1 _ 1-j7 _
. 145
1-1103 1-15
2 1 -j10
(1.5 -j15)
j0.05
PD3= 1.35
- Q03=05

 

Figure 5.11. The contingency analysis for a 5-bus system.

 

134

Table 5.1. A space estaimation of the architecture for FLF.

 

 

 

 

 

Resistors Integrators Sin & Cos generators Multipliers
Count 4n 2+4n 2n 2:22 2:22
Unit are: a b 5b 4b
Total area (4n2+4n)a 2nb 10an 8an

 

 

 

 

 

Note: a =6.25x10'3mm 2, b =1 .5x10‘2mm 2.

Table 5.2. A space estaimation of the architecture for DLF.

 

 

 

 

 

Resistors Integrators Sin & Cos generators Multipliers
Count 2n 2aI-4n 2n 2n2 2n2
Unit area a b 5b 4b
Total area (2n2+4n )a 2nb 10n2b 8an

 

 

 

 

 

Table 5.3. A space estaimation of the architecture for DCLF.

 

 

 

 

 

 

 

Resistors Integrators
Count n2+n n
Unit area a b
Total area (n 2+n )a nb

 

 

 

 

 

Table 5.4. The contingency analysis.

135

 

 

 

 

 

 

Type FLF DLF DCLF

1.0323/-4.6884 l.0154/-4.4897 1.0/-4.5878

Before 0.9971/-7.9609 1.0189/-7.6689 1.0/4.8452
0.9805/-6.4424 0.9996l-6.4237 l.0/-6.5378
l.0500/0.6858 1.0500/ 1.1810 l.0/ 1.3753
l.0067/-4.9227 l.0185/-4.7066 1.0/-2.2617

Single 1.0171/-9.l931 * 1.0373/-8.7828 * 1.0/-3.6459
outage 0.9741/-5.9313 0.9930/-5.9910 1.0/-3.0517
1.0500/0.6871 1.0500] 1.1818 1.0/3.2250
0.9936/-4.0240 l.0061/-3.8121 1.0/-1.8823

Multi- 0.9581/-lO.3693 * 0.9857l-10.0404 * 1.0/-5.5124
outage 0.9050/-12.2557 * 0.9442l—12.5443 * 1.0/-4.6721
1.0500/3.3805 1.0500/3.8428 1.0/4.0918
1.0120/-4.1638 1.0214/-3.9919 1.0/-5.6934

Parameter 1.0051/—7.5489 l.0236/-7.3278 l.0/-8.9020 *
change 0.9864l-6.2728 l.0032/-6.3175 l.0/-7.5798
l.0500/-0.0180 1.0500/0.3987 1.0/06439

 

 

 

 

 

PM“

136

Chapter 6

Conclusions

 

New approaches for ﬁnding an accurate solution to linear and nonlinear pro-
gramming and solving power system control problems using artiﬁcial neural network
techniques have been described in this dissertation. This chapter starts with a sum-
mary of the research work, followed by an identiﬁcation of the major contributions of

this work. A discussion of future research issues concludes the chapter.

6.1 Summary

Improving the performance of ANNs for linear and nonlinear programming, and
developing a fast approach for large-scale system computations are challenging prob-
lems in neurocomputing. Based on optimization theory and ANN technology, promis-

ing solutions to these problems have been obtained from this work.

The computing accuracy of constrained optimization from the previous ANN
models does not meet the demand for some applications. By analyzing the behavior of
the conventional penalty function in the neighborhood of the feasible region boundary,
which is often used in these ANN models, the reason for its inherent degenerating
accuracy has been discovered. Based on this, a new combination penalty function has

been developed. This function can provide enough "penalty" to offset the decreasing of

IWIH

137

¢(x) in both the range of [(xb) and far away from the feasible region boundary so
that an equilibrium point close to the optimal point can be located. A corresponding
neural network formulation has been given whose stability is guaranteed by the
assumption of a convex objective function and the continuously differentiable property
of the combination penalty function. Based on unconstrained optimization, an ANN-
based technique has been developed for solving simultaneous equations. For the linear
system, a mapping between the solution of linear equations and quadratic minimization
has been established so that solving linear equations can be viewed as ﬁnding the
minimum of a quadratic function. A general formulation for the ANN linear equation
solver has been deﬁned, and the stability of the network has been established. A rela-
tionship between the ANN solver and the linear dynamic system is identiﬁed, broaden-
ing the sc0pe of potential applications. An additional merit of the formulation is that
symmetry of the coefﬁcient matrix is not required, again broadening the range of
applications. For the nonlinear system, a mapping between the solution of nonlinear
equations and quadratic-nonlinear minimization has been established so that solving
nonlinear equations can be viewed as ﬁnding the minimum of a quadratic-nonlinear
function. A formulation for the ANN nonlinear equation solver has been deﬁned and
proofs of local stability and convergence of the solver have been given which provide

a sound theoretical basis.

Based on these formulations, ANN architectures have been developed at the algo-
rithm and structural level. The feedback neural network paradigm was used to express
the dynamic behavior of the objective function. A constraint ampliﬁer circuit for the
derivative of the combination penalty function has been proposed for the constrained
optimization network. The ampliﬁer is simpler than previous versions due to the
replacement of the radication operation. Another distinct feature of the architecture for
the linear and nonlinear equation solvers is that the resistance connection matrix can be

symmetric or asymmetric. The hardware complexity of the solvers has been estimated

138

as 0(n2a + nb) for the linear equation solver, and 0(n2a + nzb) for the nonlinear
equation solver, where a(=6.25x10‘3mm2) is the unit area for a resistor, and

b(=1.5x10’2mm 2) is the estimated unit area per neuron.

The ANN linear and nonlinear equation solvers have be applied to solving the
example power system control problems of power load ﬂow and contingency analysis.
The speciﬁc ANN formulations for the full power load ﬂow, the decoupled load ﬂow
and the DC load ﬂow have been established. The network stability for these formula-
tions has been further analyzed. Moreover, the usefulness of the ANN linear equation
solver has been extended. Other applications include matrix inversion, detemrining the

stability of a linear control system, and determining matrix singularity.

The computational correctness of the approaches and architectures have been
veriﬁed by simulations and it was discovered that the relative errors for linear and non-

linear programming examples are substantially reduced.

6.2 Contributions

The major contributions of this research are:

(1) An improved artiﬁcial neural network for linear and nonlinear programming
has been developed. The reason for the degenerating computing accuracy of
previous networks has been discovered and a new combination penalty func-
tion has been proposed. Both play key roles in improving the computational

accuracy for ANN constrained optimization.

(2) This work has established a new link between ANN theory and unconstrained
optimization. Correspondingly, an ANN linear equation solver has been
developed at the algorithm and structural level, and a theoretical basis for the

ANN nonlinear equation solver has been given.

139

(3) A direct ANN method (without training and testing) for power load ﬂow and
contingency analysis has been developed. The ANN formulations, their stabil-
ity property and the relationship with the parameters of a power system have

been provided.

6.3 Future Research

To implement the approaches pr0posed in this dissertation, a new method for a
variable resistor connection matrix is needed. Recently, experimental programmable
arrays have been reported for this purpose [MaHY90, FiFS91]. However, their adju-
stable range is too narrow to meet the demand of real problems. Studies in this area
are continuing. Secondly, to take advantage of the program ﬂexibility of the conven-
tional computer, and the real-time computing property of the ANN equation solver, a
hybrid interface should be developed. This interface should be able to provide data
communication, A/D and D/A conversions, and computing task control between a digi-
tal computer and a neural network. Another issue in implementation is to explore a
new method for handling the l/O problem for large systems; system decomposition

being the most likely starting point.

Feedback network based learning is another challenging issue as the capability of
the network will be greatly enhanced with built-in learning. There have been very few
implementable algorithms presented in this area so far [BaAC91, SuCL91]. New
research work includes some implementation-oriented efforts in architecture and device

development.

Finally, the ANN linear equation solver described in this thesis may be further
used for ﬁnding eigenvalues and associated eigenvectors. A relationship between state
trajectories of a neural network and eigenvectors of the corresponding coefﬁcient

matrix can be obtained by expanding the matrix singularity analysis. Based on this

140

relationship and numerical techniques, eigenvalues can been found by solving a set of

linear equations.

141

APPENDIX A

Proof of Theorem 5.1

Let x“ be an equilibrium point. After change of variable
y=x—{, (an
equation (5.14) can be written as
i = i = —r<y + x‘) = —q(y>. (a2)

Note that y = 0 is an equilibrium point of the system. It is easy to show that the fol-

lowing equality is held
Ja=xU=Ju=0) a»
Using the mean value theorem of derivative [Ap057], q(y) can be written as
qty) = (1(0) + 33c» (a.4)

- 951
-o+ayeu

_ £1 it. _. is.
— ay(0)er Iayw) ay(0)Iy

=Jy+mw
where
Jrﬁmgm=ﬂm—ﬂ®y 03
oy By By

and c is a point on a line segment from the origin to x. According to the norm ine-
quality relation and the assumption of continuity of the derivative, the following quali-

tative expressions can be obtained:

142

23 _i'L .
llgty)|l5||ay(c) ay(0)|| Ilyll. (a.6)
lim m =0. (a.7)
11y1iso Ilyll
and
Ilq(y)IISOIIyll. (a.8)

An eigen-positive J guarantees there must exist a solution P to the Lyapunov equation
F] + JTP = Q (a.9)

where P , Q are positive deﬁnite matrices [BaSt70].

Deﬁne the energy function as
50) = yTPy (a-lO)
where E (y) > 0 for y¢0. The derivative of the energy function is
E(y)=iTPy+yPy (a.11)
= (—J y - g(y)T Py + yTP (-J y - g(y)
= -yTJTPy + yTPJy + gT(y) + yTPgU)
= —yT<JTP + P! )y + 2g<y)TPy

= -yTQ y + 2gT(y)Py

2
S-yTQy+ 26H}D II'IIYII

2 2
s-xm..<Q) ~11y1g+ 2a11P11-11y11

2
=—<>»...<Q>- 2a11P11>~11y112

where

l.

143

lemm(Q) is the minimum eigenvalue for Q, and lmx(P) is the maximum eigenvalue

for P. When 0 satisﬁes

Xmin(Q)
< —, (a.13)
2 HP ll
then
E(y) < 0. (a.14)

Without loss of generality, let Q be the identity matrix, then Xmm(Q) = 1. Inequality

(3.18) can be further simpliﬁed as

l

< —, (3.15)
2 HP: ll

0

where P, is the positive deﬁnite matrix determined by equation (3.14) for Q = l .

Inequality (a.15) guarantees that the energy function E (y) is a Lyapunov function.
Therefore, the network based on equation (5.14) is stable in the neighborhood of the

origin in y-coordinate. When a starting point satisﬁes the condition

llq(y) |l< JI—ﬂ-y (a.16)

ZIIPIII,

the dynamical system will approach an equilibrium point, the origin (ye = 0) due to

change of variable. In x-coordinate, the equilibrium point is expressed
xc = ye + x‘ , (a.17)

which is the exact solution to the nonlinear equation. C]

 

144

APPENDIX B

Table 8.1. The bus data for the 39-bus system.

 

Number

Name

Qo

Po

Qc

V.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Po pacified

0 01THETAG 1.000
1 OIALPHA 0.0 0.0 0.0 0.0

2 01 KAPPA 0.0 0.0 0.0 0.0

3 03ETA 3.22 0.024 0.0 0.0

4 03THETA 5.00 1.84 0.0 0.0

5 03IOTA 0.0 0.0 0.0 0.0

6 07GAMMA 0.0 0.0 0.0 0.0

7 06LAMMBDA 2.338 0.84 0.0 0.0

8 07MU 5.22 1.766 0.0 0.0

9 OSNU 0.0 0.0 0.0 0.0

10 05X1 0.0 0.0 0.0 0.0

1 l 05MICRON 0.0 0.0 0.0 0.0

12 06PI 0.085 0.88 0.0 0.0

13 07RHO 0.0 0.0 0.0 0.0

14 07510 MA 0.0 0.0 0.0 0.0

15 07TAU 3.20 1.53 0.0 0.0

16 06EPS 1LON 3 .294 0.323 0.0 0.0

17 06PH1 0.0 0.0 0.0 0.0

18 07CH1 1.58 0.30 0.0 0.0

19 041331 0.0 0.0 0.0 0.0
20 (”OMEGA 6.80 1.03 0.0 0.0
21 03ALPHA 2.74 1.15 0.0 0.0
22 06BETA 0.0 0.0 0.0 0.0
23 ()7BETA 2.47 0.846 0.0 0.0
24 06DELTA 3.086 -0.922 0.0 0.0
25 OlGAMMA 2.24 0.472 0.0 0.0
26 01 DELTA 1.39 0.17 0.0 0.0
27 04EPS [LON 2.81 0.755 0.0 0.0
28 01 EPS [LON 2.06 0.276 0.0 0.0
29 OIZETA 2.835 0.269 0.0 0.0
30 01 KAPPAG 0.0 0.0 2.50 1.362 1.046
31 07GAM MAG 0.092 0.046 5.73 2.0458 0.982
32 04XIG 0.0 0.0 6.5 2.1036 0.985
33 04PSIG 0.0 0.0 6.32 1.1579 0.999
34 040M EGAG 0.0 0.0 5.08 1.6 1.011
35 06BETAG 0.0 0.0 6.5 2.0967 1.050
36 07BETAG 0.0 0.0 5.6 1.0214 1.065
37 OlGAMMAG 0.0 0.0 5.4 0.0409 1.029
38 OIZETAG 0.0 0.0 8.3 0.236 1.028

 

 

 

 

 

 

 

 

145

Table 8.2. The branch data for the 39-bus system.

umber Bus Bus R
l l . . 11
1

A

\l

l
1
l
I
l
1
l
1
l
1
l

 

146

APPENDIX C

Table C] The 7—bus system comparison results.

 

 

 

 

 

 

 

 

 

 

Method ANN Newton Homotopy
Bus 0: v0/80 1.00/0.00 1.00/0.00 1.00/0.00
Bus 1: v1/81 0.9435/0.2153 0.9435/0.22 0.9775/-3.2639
Bus 2: v2/82 0.8922/-6.4871 0.8923/—6.48 0.9203/-8.2981
Bus 3: v3/83 0.9207/-4.2131 0.9208/-4.21 0.9416/-5.9225
Bus 4: v4/84 0.9591/-1.2851 0.9592/—1.28 0.9732/-2.7507
Bus 5: v5/85 0.9538/-0.2468 0.9539/-0.25 0.9802/-2.9283
Bus 6: 126/86 1.0573/8.7592 1.0573/8.76 1.0887/5.0118
Relative Error 81: 0.1518% 82: 0.0007% 0.4788% 34.1466%

 

 

 

 

 

Nma(UEI=

ore—322:015—7

(2) 8 unit: degree.
(3) Bus 0 is the slack bus.

147

Table C2 The 39-bus system comparison results.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Method ANN Newton
Bus 0 (slack bus) 1.00/0.00 1.00/0.00
Bus 1 l.0158/l.6351 1.0158/l.64
Bus 2 1.0154/4.3654 1.0154/4.37
Bus 3 0.9885/1.3287 O.9885/1.33
Bus 4 0.9516/0.4860 0.9516/0.49
Bus 5 0.9494/l.8284 0.9494/ 1.83
Bus 6 0.9510/2.6l93 0.9510/2.62
Bus 7 0.9411/0.1487 0.9411/0.15
Bus 8 0.9409/-0.4l89 0.9409/-0.42
Bus 9 0.9875/-0.2177 0.9875/~0.22
Bus 10 0.9599/5.2923 0.9599/5/29
Bus -1 1 0.9556/4.3801 0.9556/4.38
Bus 12 0.9363/4.3576 0.9363/4.36
Bus 13 0.9581/4.48l4 0.9581/4.48
Bus 14 0.9585/2.5919 0.9585/2.59
Bus 15 0.9681/2.0828 0.9681/208
Bus 16 0.9880/3.5988 0.9879/3.60
Bus 17 0.9914/2.5000 0.9914/2.50
Bus 18 0.9888/1.5884 0.9888/l.59
Bus 19 0.9903/88008 0.9903/8.80
Bus 20 0.9865/73955 0.9865/7.40
Bus 21 0.9952/6.1719 0.9952/7.l7
Bus 22 1.0220/ 10.8770 1.0220/1088
Bus 23 1.0208/ 10.6511 l.0208/10.65
Bus 24 0.9964/3.7238 0.9964/3.72
Bus 25 l.0258/5.7526 l.0258/5.75
Bus 26 1.0167/4.4514 1.0167/4.45
Bus 27 0.9988/2.3169 0.9988/2.32
Bus 28 l.0194/8.1715 1.0194/8.l7
Bus 29 1.0213/1 1.0836 1.0213/1108
Bus 30 (generator) 1.0460/68071 1.0460/681
Bus 3] (generator) 0.9820/11.3()00 0.9820/11.30
Bus 32 (generator) 0.9850/ 13.1951 0.985/ 13.20
Bus 33 (generator) 0.9990/ 13.9817 0.9990/13.20
Bus 34 (generator) 1.01 lO/l2.5860 1.0110/12.59
Bus 35 (generator) l.0500/ 15.8461 l.0500/15.85
Bus 36 (generator) 1.0650/ 18.6514 1.0650/18.65
Bus 37 (generator) l.0290/ 12.5581 1.0290/12.56
Bus 38 (generator) 1.0280/ 18.1446 1.0280/18.15
Relative Error 8;: 4.47% 82: 3.04% 6.49%

 

 

Note: (1)8, = 0.1E-3, £2 = 0.1E—7.
(2) voltage/angle, 8 unit: degree.

 

[A get89]

[AiNF90]

[A 1H087]

[Ang89]

[AnR088]

[Ap057]

[AtSu89]

[BaAC91]

[BaEM89]

[BaSh79]

[BaSt70]

148

BIBLIOGRAPHY

M. E. Aggoune, et al., "Artiﬁcial Neural Networks for Power System Static
Security Assessment," ISCAS, pp. 490-494, 1989.

S. V. B. Aiyer, M. Niranjan, and F. Fallside, "A Theoretical Investigation
into the Performance of the HOpﬁeld Model," IEEE Trans. on Neural Net-
works, Vol. 1, No. 2, pp204-215, June, 1990.

P. E. Allen and D. R. Holberg, CMOS Analog Circuit Design, Holt,
Rinehart & Winston, New York, 1987.

 

J. E. Angus, "On the Connection between Neural Network Learning and
Multivariate Nonlinear Least Squares Estimation," Neural Networks, Vol.
1, No. 1, January, 1989.

J. A. Anderson and E. Rosefeld, Neurocomputing: Foundations of
Research, The MIT Press, Cambridge, MA, 1988.

 

T. M. Apostol, Mathematical Analysis, Addison-Wesley, Reading, Mas-
sachusetts, 1957.

 

L. E. Atlas and Y. Suzuki, "Digital Systems for Artiﬁcial Neural Net-
works," IEEE Circuits and Devices Magazine, November, 1989.

V. C. Barbosa, L. Alfredo, and V. Carvalho, "Learning in Analog Hopﬁeld
Networks," Proceedings of the International Joint Conference on Neural
Networks, 11, pp.l83-186, 1991.

M. L. Baughman, N. A. Eisner and P. S. Merrill, "Optimizing Combined
Cogeneration and Thermal Storage System: An Engineering Economic
Approach," IEEE Trans. on Power Systems, Vol. 4, pp. 974-980, August
1989.

M. S. Bazaraa and C. M. Shetty, Nonlinear Programming Theory and
Application, John Wiley & Sons, New York, 1979.

 

S. Barnett and C. Storey, Matrix Methods in Stability Theory, Barnes &
Noble, Inc., New York. 1970.

 

 

[BiVo91]

[Bla86]

[Bret91]

149

R. H. Bisseling and J. G. Vorst, "Parallel Triangular System Solving on a
Mesh Network of Transputers," SIAM Journal on Scientiﬁc and Statistical
Computing, Vol. 12, No. 4, pp. 787-799, July 1991.

K. Blanchard, The One-Minute Manager, Morrell Publishing, New York,
1986.

 

V. Brandwajn, et al., "Pre-Screening of Single Contingencies Causing Net-
work Topology Changes," IEEE Trans. on Power Systems, Vol. 6, No. 1,
pp. 30-36, Feb. 1991.

[BuHW82] R. C. Burchett, H. H. Happ and K. A. Wirgau, "Large Scale Optimal

[CaGr87]

[Che84]

[ChLi84]

[ChMS9l]

[ChMS92]

Power Flow," IEEE Trans. on Power Apparatus and Systems," Vol. 101,
pp. 3722- 3732, October, 1982.

G. Carpenter and S. Groosberg, "A massively Parallel Architecture for a
Self-organizing Neural Pattern Recognition Machine," Computer Vision,
Graphics and Image Processing, Vol. 37, pp. 54-115, 1987.

C.-T. Chen, Linear System Theory and Design, Holt, Rinehart & Winston,
Inc., New York, 1984.

 

L. O. Chua and G-N. Lin, "Nonlinear Programming without Computation,"
IEEE Trans. on Circuits and Systems, Vol. 31, pp. 182-188, February,
1984.

C. Chiu, C.-Y. Maa, and M. A. Shanblatt, "Energy Function Analysis of
Dynamic Programming Neural Networks," IEEE Trans. on Neural Net-
works, Vol. 2, No. 4, pp. 418-426, July, 1991.

J. Chen, C-Y. Maa, and M. A. Shanblatt, "Improved Artiﬁcial Neural Net-
works for Linear and Nonlinear Programming," to appear, the International
Journal of Neural Systems, Vol. 2, No. 4, 1992.

[ChSh92a] J. Chen and M. A. Shanblatt, "Solving Linear Systems Using an Artiﬁcial

Neural Network," (In review, IEEE Trans. on Neural Networks, 1992.)

[ChSh92b] J. Chen and M. A. Shanblatt, "Artiﬁcial Neural Network Techniques for

Power Load Flow and Contingency Analysis," (In review, IEEE Trans. on
Power Systems, 1992.)

150

[ChYa88a] L. O. Chua and L. Yang, "Cellular Neural Networks: Theory," IEEE Trans.

on Circuits and Systems, Vol. 35, pp. 1257-1271, October, 1988.

[ChYa88b] L. O. Chua and L. Yang, "Cellular Neural Networks: Applications," IEEE

Trans. on Circuits and Systems, Vol. 35, pp. 1273-1290, October, 1988.

[CoDK86] G. C. Contaxis, C. Delkis and G. Korres, "Decoupled Optimal Load Flow

[Coet86]

[CoRo87]

[Cra56]

[Cro80]

[DaDe91]

[Deb88]

[Dec89]

Using Linear or Quadratic Programming," IEEE Trans. on Power Systems,
Vol. 1, pp. 1-7, May, 1986.

G. C. Contaxis, et al., Decoupled Optimal Load Flow Using Linear or
Quadratic Programming," IEEE Trans. on Power System, Vol. 1, pp1-7,
May, 1986.

P. Comon and Y. Robert, "A Systolic Array for Computing BA"1," IEEE
Trans. on Acoustics, Speech, and Signal Processing, Vol. 35, No. 6, pp.
717-723, June, 1987.

S. H. Crandall, Engineering Analysis: A Survey of Numerical Procedures,
McGraw-Hill, Inc., New York, 1956.

J. Cronin, Differential Equations, Marcel Dekker, Inc., New York, 1980.

 

J. Dalton and A. Deshmane, "Artiﬁcial Neural Networks," IEEE Potential,
Vol. 10, No. 2, April 1991.

A. S. Debs, Modern Power Systems Control and Operation, Kluwer
Academic Publishers, Boston, 1988.

A. L. DeCegama, Parallel Processing Architectures and VLSI Hardware,
Prentice Hall, Englewood Cliffs, New Jersey, 1989.

[ELMA91] M. A. El-Sharkawi and R. J. Marks 11, (Edited), IEEE Proceedings of the

First International Forum on Applications of Neural Networks to Power
Systems, Seattle, WA, 1991.

[EOMD91] M. A. El-Sharkawi, 5. Oh, R. J. Marks 11 and M. J. Damborg, "Short Term

Electric Load Forecasting Using an Adaptively Trained Layered Percep-
tron," IEEE Proceedings of the Firm International Forum on Applications
of Neural Networks to Power Systems. Seattle, WA, 1991.

' Irv-win

[Fiet89]

[FiFS91]

[FoMo67]

[FoTa88]

[GaGr87]

[GoLo83]

[GrKu89]

[Gro76]

[G uo90]

[Hec87]

[Hec90]

[HiLi86]

151

R. Fischl, et al., "Screening Power System Contingencies Using A Back-
Propagation Trained Multiperceptron," ISCAS, pp. 486-489, 1989.

W. A. Fisher, R. J. Fujimoto, and R. C. Smithson, "A Programmable, Ana-
log Neural Network Processor," IEEE Trans. on Neural Networks, Vol. 2,
No. 2, pp. 222-229, March, 1991.

G. E. Forsythe and C. B. Moler, Computer Solution of Linear Algebraic
Systems, Prentice-Hall, Englewood, NJ, 1967.

 

Y. S. Foo and Y. Takefuji, "Integer Linear Programming Neural Networks
for Job-Shop Scheduling," IEEE International Conference on Neural Net-
works, Vol. II, pp. 341-348, 1988.

G. Carpenter and S. Grossberg, "A Massively Parallel Architecture for a
Self-organizing Neural Pattern Recognition Machine," Computer Vision,
Graphics and Image Processing, Vol. 37, pp. 54-115, 1987.

G. H. Golub, and C. F. Van Loan, Matrix Computations, John Hopkins
University Press, Baltimore, 1983.

 

S. Grossberg and M. Kuperstein, Neural Dynamics of Adaptive Sensory
Motor Control, Pergamon Press, Inc., Elmsford, New York, 1989.

 

S. Grossberg, "Adaptive Pattern Classiﬁcation and Universal Recording:
Parallel Development and Coding of Neural Feature Detectors," Biological
Cybernetics, Vol. 23, pp. 121-134, 1976.

S. Guo, Determining the Steady Solutions of Nonlinear Models of Power
Systems, Ph.D. Dissertation, EE Dept., Michigan State University, East
Lansing, 1990.

 

R. Hecht-Nielsen, "Counterpropagation Networks," IEEE FirSt Interna-
tional Conference on Neural Networks, Vol. 2, pp. 19-32, 1987.

R. Hecht-Nielsen, Neurocomputing, Addison-Wesley, Reading, MA 1990.

 

F. S. Hillier and G. J. Lieberman, Introduction to Operation Research,
Holden-Day, Inc., Oakland, CA, 1986.

 

[HiSe86]
[HiSm74]
[11013821
[11013841
[HoTa85]
[HuSh91]
[HwCh80]
[HwCh82]

[HwBr84]
[Jac88]

[J eJ 090]

152

G. Hinton and T. Sejnowski, "Learning and Relearning in Boltzmann
Machines," Parallel Distributed Processing Explorations and the Micro-
structures of Cognition, Edited by Rumelhart and McClelland, pp. 282-317,
MIT Press, 1986.

 

 

M. W. Hirsch and S. Smale, Differential Equations, Dynamical Systems,
and Linear Algebra, Academic Press, New York, 1974.

 

 

J. J. Hopﬁeld, "Neural Networks and Physical Systems with Emergent Col-
lective Computational Abilities," Proceedings of the National Academy of
Science, Vol. 79, pp. 2554-2558, 1982.

J. J. Hopﬁeld, "Neurons with Graded Response Have Collective Computa-
tional Properties Like Two-State Neurons," Proc. Natl. Acad. Sci., Vol. 81,
pp. 3088-3092, 1984.

J. J. Hopﬁeld and D. W. Tank, "’Neural’ Computational of Decisions in
Optimization problem," Biol. Cybern., Vol. 52, pp. 141-152, 1985.

K. C. Hui and M. J. Short, "A neural Networks Approach to Voltage Secu-
rity Monitoring and Control," Proceedings of the First International Forum
on Applications of Neural Networks to Power Systems, Seatle, WA, July,
1991.

K. Hwang and Y.-H. Cheng, "VLSI Computing Structures for Solving
Large-Scale Linear System of Equations," Proceedings of International
Conference on Parallel Processing, pp. 217-227, 1980.

K. Hwang and Y-H Cheng, "Partioned Matrix Algorithms for VLSI Arith-
metic Systems," IEEE Trans. on Computer, Vol. 31, pp. 1215-1224,
December 1982.

K. Hwang and F. A. Briggs, Computer Architecture and Parallel Pro-
cessing, McGraw-Hill Inc., New York, 1984.

A. Jackson, "Machine Learning," Expert Systems, Vol. 5, pp. 132, Oxford,
May 1988.

C-W. Jen and S-J. Jou, "Design of One-Dimensional Systolic-Array Sys-
tems for Linear State Equations," IEE Proceedings, Vol. 137, pp. 185-192,
June 1990.

[Jen77]

[JoJe88]

[J oh87]

[J oRi82]

[KaBr89]

[KeCh87]

[KeCh88]

[Koh87]

[K0587]

[Lip87]

[LiZS91]

153

Alan Jennings, Matrix Computation for Engineers and Scientists, John
Wiley & Sons, New York, 1977.

 

S-J. Jou and C-W. Jen, "Design of a Systolic Array System for Linear
State Equations," IEE Proceedings, Vol. 135, pp. 211-218, October 1988.

S. L. Johnson, "Solving Tridiagonal Systems on Ensemble Architectures,"
SIAM Journal on Scientiﬁc and Statistical Computing, No. 8, pp. 354-392,
1987.

L. W. Johnson and R. D. Riess, Numerical Analysis, 2nd ed., Addison-
Wesley, Reading, Massachusetts, 1982.

 

K. N. Kama and D. M. Breen, "An Artiﬁcial Neural Networks Tutorial:
Part 1 — Basics," Neural Networks, Vol. 1, No. 1, pp. 4-23, 1989.

M. P. Kennedy and L. O. Chua, "Unifying the Tank and Hopﬁeld Linear
Programming Circuit and the Cononical Nonlinear Programming Circuit of
Chua and Lin," IEEE Trans. on Circuits and SyStems, Vol. 34, pp. 210-
214, February 1987. '

M. P. Kennedy and L. O. Chua, "Neural Networks for Linear and Non-
linear Programming," IEEE Trans. on Circuits and Systems, Vol. 35, pp.
554-562, May 1988.

T. Kohonen, "State of the Art in Neural Computing," IEEE First Interna-
tional Conference on Neural Networks, Vol. 1, pp. 81, 1987.

B. Kosko, "Competitive Adaptive Bi-directional Associative Memories,"
IEEE First International Conference on Neural Networks, Vol. 2, pp. 759-
766, 1987.

R. P. Lippmann, "An Introduction to Computing with Neural Nets," IEEE
ASSP Magazine, April 1987.

T.-Y. Li, H. Zhang, and X.-H. Sun, "Parallel Homotopy Algorithm for the
Symmetric Tridiagonal Eigenvalue Problem." SIAM Journal on Scientiﬁc
and Statistical Computing, Vol. 12, No. 3, pp. 469-487, May 1991.

[MaHY89] A. Masski, Y. Hirai, and M. Yamada, "Neural Networks in CMOS: A case

study," IEEE Circuits and Devices Magazine, Vol. 6, pp. 12, July 1990.

[MaSh89]

[MaSh90]

[MaSh9l]

[Mea88]

[Me1589]

[MiPa69]

154

C-Y. Maa and M. A. Shanblatt, "Adjustment of Parameters for Signal
Decision Networks," Proceedings of International Joint Conference on
Neural Networks, Washington D. C., Vol. II, pp. 589, 1989.

C-Y. Maa and M. A. Shanblatt, "A Neural Net Technique for Economic
Power Dispatching with Consideration of Transmission Losses," (In Publi-
cation) 1990.

C-Y Maa and M. A. Shanblatt, "Linear and Quadratic Programming Neural
Network Analysis," to appear , IEEE Trans. on Neural Networks, 1992.

C. A. Mead, Analog VLSI and Neural Systems, Addison-Wesley, 1988.

 

C. Mead and M. Ismail. Analog VLSI Implementation of Neural Systems,
Kluwer Academic Publishers, 1989.

 

M. Minsky and S. Papert, Perceptrons, MIT Press, 1969.

 

[MoAG91] A. Moore, J. Allman, and R. M. Goodman, "A Real-Time Neural System

[Moet87]

[Mol83]

[MoPG87]

[NaCh92]

[N eS Y92]

for Color Constancy," IEEE Trans. on Neural Networks, Vol. 2, No. 2, pp.
237-247, March, 1991.

A. Monticelli, et al., Security-Constrained Optimal Power Flow with Post-
Contingency Corrective Rescheduling," IEEE Trans. on Power Systems,
Vol. 2, pp. 175-182, February, 1987.

D. I. Moldovan, "On the Design of Algorithms for VLSI Systolic Arrays,"
Proceedings of the IEEE, Vol. 71, No. 1, January 1983.

A. Monticelli, M. V. Pereira and S. Granville, "Security-Constrained
Optimal Power Flow with Post-Contingency Corrective Rescheduling,"
IEEE Trans. on Power Systems, Vol. 2, pp. 175-182, February, 1987.

N. M. Nasrabadi and C. Y. Choo, "Hopﬁeld Network for Stereo Vision
Correspondence," IEEE Trans. on Neural Networks, Vol. 3, No. 1, pp. 5-
13, January 1992.

C. Neti, M. H. Schneider, and E. D. Young, "Maximally Fault Tolerant
Neural Networks," IEEE Trans. on Neural Networks, Vol. 3, No. 1, pp.
14-23, January 1992.

[On87a]

[Ort87b]

[PaEM91]

[Par80]

[Pret86]

[Re187]

[Roet90]

[RoKa82]

[Rot91]

155

J. M. Ortega, Iterative Solution of Nonlinear Equations in Several Vari
ables, Academic Press, New York, 1987.

 

J. M. Ortega, Matrix Theory, Plenum Press, New York, 1987.

 

D. C. Park, M. A. El-Sharkawi, and R. J. Marks 11, "An Adaptively
Trained Neural Network," IEEE Trans. on Neural Networks, Vol. 2, No. 3,
May, 1991.

B. N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, Engle-
wood Cliffs, NJ, 1980.

 

W. H. Press, et al., Numerical Recipes: The Art of Scientiﬁc Computigg,
Cambridge University Press, New York, 1986.

D. K. Reinhard, Instruction to Integrated Circuit Engineering, Houghton
Mifﬁin Company, Boston, 1987.

 

A. Rodriguez-Vazquez et al., "Nonlinear Switched-Capacitor ’Neural’ Net-
works for Optimization Problems," IEEE Trans. on Circuits and Systems,
Vol. 37. PP. 384-398, March 1990.

A. Rosenfeld and A. C. Kak, Digital Picture Processing, Academic Press,
New York, 1982.

 

M. W. ROth, "Analogical Versus Logical or Connectionist Versus Symbolic
or Scruffy Versus Neat," IEEE Trans. on Neural Networks, Vol. 2, No. 6,
pp. 545-546, November, 1991.

[RuHW86] D. Rumelhart, G. Hinton and R. Williams, "Learning Internal Representa-

[Saet89]

[Sar91]

tions by Error Propagation," Parallel Distributed Processing: Foundations,
Edited by Rumelhart and McClelland, pp. 318-362, MIT Press, 1986.

 

F. M. Salam, et al. "Parallel Processing for the Steady State Solutions of
Large-Scale Non-Linear Models of Power Systems," IEEE International
Symposium on Circuits and Systems, Portland, May, 1989.

P. Saratchandran, "Dynamic Programming Approach to Optimal Weight
Section in Multilayer Neural Networks," IEEE Trans. on Neural Networks,
Vol. 2, No. 4, pp. 465-466, July, 1991.

[Sca66]

[She88]

[SoPa89]

[808088]

[SrLC91]

[Sti63]

[Sto73]

[StPi63]

[SuCL91]

[TaH086]

[TaTr91 ]

156

J. B. Scarborough, Numerical Mathematical Analysis, The Johns Hopkins
Press, Baltimore, 1966.

 

J. F. Shepanski, "Fast Learning in Artiﬁcial Neural Systems, Multilayer
Perceptron Training Using Optimal Estimation," Proceedings of the Inter-
national Conference on Neural Networks, 1., pp. 465-475, IEEE Press, New
York, July 1988.

D. J. Sobajic and Y-H. Pao, "Artiﬁcial Neural-Net Based Dynamic Security
Assessment for Electric Power System," IEEE Trans. on Power Systems,
Vol. 4, pp. 220-228, February, 1989.

B. Soucek and M. Soucek, Neural and Massively Parallel Computers: The
Six Generation, John Wiley & Sons, New York, 1988.

 

D. Srinivasan, A. C. Liew, and J. S. P. Chen, "Short Term Forecasting
Using Neural Network Approach," Proceedings of the First International
Forum on Applications of Neural Networks to Power Systems, Seatle, WA,
July, 1991.

E. L. Stiefel, An Introduction to Numerical Mathematics, Academic
Press, New York, 1963.

 

H. S. Stone, "An Efﬁcient Parallel Algorithm for the Solution of a Tridiag-
onal Linear System of Equations," Journal of ACM, Vol. 20, No. 1, pp.
27-38, 1973.

K. Steinbuch and V. A. W. Piske, "Learning Matrices and Their Applica-
tions," IEEE Trans. on Electron. Comput., vol. 12, pp. 846-862, 1963.

G.-Z. Sun, H.-H. Chen, and Y.-C. Lee, "A fast On-Line Learning Algo-
rithm for Recurrent Neural Networks," Proceedings of the International
Joint Conference on Neural Networks, II, pp.13-l8, 1991.

D. W. Tank and J. I. Hopﬁeld, "Simple ’Neural’ Optimization Networks:
An A/D Converter, Signal Decision Circuit, and a Linear Programming
Circuit," IEEE Trans. on Circuits and Systems, Vol. 33, pp. 533—541, May
1986.

A. Tabatabai and T. P. Troudet, "A Neural Net Based Architecture for the
Segmentation of Mixed Gray-Level and Binary Pictures," IEEE Trans. on

[TOW188]

[TrWa91 ]

[WeEl9 1]

[WiHo60]

[W iLe90]

[WiPa88]

[WiRe71]

[Wri91]

157

Circuits and Systems, Vol. 38, No. 1, pp. 66-77, 1991.

V. V. Tolat and B. Widrow, "An Adaptive ’Broom Balancer’ with Visual
Inputs," Proceedings of the International Conference on Neural Networks,
11, pp641-647, IEEE Press, New York, July 1988.

T. P. Troudet and S. M. Walters, "Neural Network Architecture for
Crossbar Switch Control," IEEE Trans. on Circuits and Systems, Vol. 38,
No. 1, pp. 42-56, January 1991.

S. Weerasooriya and M. A. El-Sharkawi, "Use of Karhunen-Loe‘ve
Expansion in Training Neural Networks for Static Security Assessment,"
Proceedings of the First International Forum on Applications of Neural
Networks to Power Systems, Seatle, WA, July, 1991.

B. Widrow and M. Hoff, "Adaptive Switching Circuits," 1960 IRE WES-
CON Connection Record, Part 4, pp. 96-104, New York, 1960.

B. Widrow and M. A. Lehr, "30 Years of Adaptive Neural Networks: Per-
ceptron, Madaline, and Backpropagation," Proceedings of the IEEE, Vol.
78, No. 9, pp. 1415-1442, September, 1990.

V. Wilson and G. S. Pawley, "On the Stability of the TSP Problem Algo-
rithm of Hopﬁeld and Tank," Biol. Cybern., Vol. 58, pp. 63-70, 1988.

J. H. Wilkinson and C. Reinsch, Linear Algebra, Vol. 2 Handbook for
Automatic Computation, Springer-Verlag, New York, 1971.

 

S. J. Wright, "Parallel Algorithms for Banded Linear Systems," SIAM Jour-
nal on Scientific and Statistical Computing, Vol. 12, No. 4, July 1991.