’N"."

\ .l." v

c ' ;::."-
My ;

, #29:).32‘3 4‘ \

51.41.... U3"

af‘:’-:_ ~ ' '3-

‘ VI.
’1:{

 

q?u§n‘.; .lh‘L

‘ 37‘:

if:

5!;
”72?? $5,}; ; t;.
’1 “9&1!” I"‘.
. ‘z'ﬁ‘wg" .3, I:

,-‘-:v
n 1"."

-1“
2 «15’. _

{£51

:1.

A.
'."."r"

*-
9,.

y
"ﬁﬂg E:
'9

3:}: Want.

. .
4-

, . v 7 ‘3 '1' u
I v- u. .‘v'
.“.. ; ‘I I‘m

.-~','.°' _;,L-,-‘-‘.o‘ ‘<H-."-.1‘f=."|‘- ‘
I J
'n‘ .
“NIH

, of: 7"
. n 5-3;: cf 1..
”’ 3302921. 5.3533,- .
wr1=nYﬁ:;x':L.
.-— fig-“371"“: '__
2:27.; ":5; _

 

mama
Q.\

:2 27:) / “/90

IUIllHllI”lllHllJHllllllHHlll”l1llHll’JllHHllllllll

31293 00767 8828
1 LIBRARY
b Mulligan State
University £
k - ~.......—-r

 

 

This is to certify that the

dissertation entitled

Neural Networks for Nonlinear Programming

presented by

Chia-Yiu Maa

has been accepted towards fulﬁllment
of the requirements for

Ph.D. Electrical Engineering

 

 

degree in

 

 

Date 27/ 1/ ?/

MS U is an Afﬁrmative Action/Equal Opportunity Institution 0- 12771

 

PLACE ll RETURN BOX to remove this checkout from your record;
TO AVOID FINES return on or hetero date duo.

DATE DUE DATE DUE DATE DUE I,

as U 0 12..
{mi

 

 

 

 

 

MC

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

[:3

 

 

 

 

 

 

 

 

j
—.J
J

 

 

 

 

 

 

 

 

 

 

MSU Is An Afﬂrmdlve AetlaVEqual Opportunity Institution

cWMt

 

 

 

 

 

NEURAL NETWORKS FOR NONLINEAR PROGRAMMING
By

Chia-Yiu Maa

A DISSERTATION

Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of

DOCTOR OF PI-HLOSOPHY

Department of Electrical Engineering

1991

ABSTRACT

NEURAL NETWORKS FOR NONLINEAR PROGRAMNIIN G

By

Chia-Yiu Maa

Artiﬁcial neural networks (ANN s) for optimization are analyzed from the
viewpoint of optimization theory. A unifying optimization network theory for linear
programming, quadratic programming, convex programming, and nonlinear program-
min g is derived. A 2-phase optimization network is proposed which can obtain both
the exact solution, in contrast to the approximate solution by Kennedy and Chua’s
networks, as well as the corresponding Lagrange multipliers associated with each con-
straint. The quality of the solutions obtained by the optimization ANN s is quantiﬁed

through simulation.

The applicability of the optimization ANN s for solving real-world problems is
demonstrated with examples of the economic power dispatching problem and the
optimal power ﬂow problem. It is shown that the mapping technique of the optimiza-
tion ANNs is simple and that they are able to handle various kinds of constraint sets.
Furthermore, it is demonstrated that the optimization ANN s attain a better objective

function value.

Overall, this work lays a solid groundwork for optimization ANN s in both
theoretical and practical aspects.

ACKNOWLEDGMENTS

I would like to express my sincere gratitude to Dr. M. Shanblatt, my major advi-
sor for his guidance and encouragement throughout the years of my graduate studies.
I also want to thank Dr. D. Reinhard, Dr. T. Grotjohn, Dr. R. Enbody, and Dr. J.

Gardiner for their kindness to be my committee and their valuable comments and

suggestions on this work.

Finally, I wish to dedicate this dissertation to my parents for getting me started
right and continued support throughout my studies, my wife Mei-Ling for her patience,

understanding, and love, and my lovely daughter Melody.

TABLE OF CONTENTS

LIST OF TABLES ..................................................................................................... v
LIST OF FIGURES ................................................................................................... vi
1. INTRODUCTION ................................................................................................ 1
1.1 Problem Statement .................................................................................. 4
1.2 Approach ................................................................................................. 6
1.3 Overview of the Dissertation ................................................................. 9
II. BACKGROUND .................................................................................................. 10
2.1 Neurobiological Review ......................................................................... 10
2.2 Simpliﬁed Neuron Models and Network Topologies ........................... 12
2.3 Literature Review ................................................................................... 17
2.4 The Hopﬁeld Model ............................................................................... 19
III. ANALYSIS OF NEURAL NETWORKS FOR OPTIMIZATION ................... 24
3.1 Mathematical Background ...................................................................... 24
3.2 Networks Analysis .................................................................................. 29
3.2.1 The Model by Tank and Hopﬁeld .......................................... 29
3.2.2 The Model by Kennedy and Chua .......................................... 31
3.2.3 The Model by Rodriguez-Vazquez, et al. ............................. 33
3.3 Linear Programming Network for Problems with Hypercube
Feasible Region ...................................................................................... 35
3.3.1 The Equilibrium of the System ............................................... 36
3.3.2 The Initial State in the Feasible Region ................................. 37
3.3.3 Example of a 2-dimensional Hypercube ................................ 38
3.3.4 Further Analysis ...................................................................... 40
3.3.5 Extensions ................................................................................ 42

IV. OPTIMIZATION NETWORK FORMULATION ............................................ 44
4.1 Linear Programming Network Theory ................................................... 44
4.2 Quadratic Programming Network Theory ............................................. ' 55
4.3 Nonlinear Programming Network Theory ............................................. 62
4.4 A Two-Phase Optimization Network ..................................................... 74
V. NETWORK SIMULATION ......................................... 79
5.1 Linear Programming ............................................................................... 79
5.2 Quadratic Programming ......................................................................... 95
5.3 Nonlinear Programming ......................................................................... 106
VI. CASE STUDIES ................................................................................................ 121
6.1 Economic Power Dispatch ..................................................................... 121
6.2 Optimal Power Flow .............................................................................. 132
VII. CONCLUSION ................................................................................................... 143
7.1 Summary ................................................................................................. 143
7.2 Contributions ........................................................................................... 147
BIBLIOGRAPHY ...................................................................................................... 149

iv

1.1
5.1

5.2
6.1
6.2
6.3

6.4
6.5
6.6
6.7

LIST OF TABLES

Characteristics of ANN s and conventional computers. .............................

The equilibria of the network for (LPI) for different
values of s. ......................................................................................

The equilibrium points for different networks for (NP 7). ........................
Equilibrium points of two networks for Case 1 of Example 6.1. .............
Equilibrium points of two networks for Case 2 of Example 6.1. .............

Comparison of the simulation results for Example 6.2 with the results
in [153]. ...........................................................................................

Simulation results for Example 6.3. .....................................................
Bus data for the 5-bus system. ............................................................
Line data for the 5-bus system. ...........................................................

Power ﬂow bus output for the 5-bus system. ........................................

85
115
125
127

132
136
138
138
142

2.1
2.2
2.3

2.4

2.5
2.6
3.1
3.2
3.3
3.4
3.5
4.1
4.2
4.3

4.4

4.5

4.6
5.1
5.2

LIST OF FIGURES

A biological neuron. ...........................................................................

Simpliﬁed representation of a neuron. ..................................................

Neuron response functions. (a) Linear function. (b) Nonlinear ramp

function. (e) Step function. (d) Sigmoidal function. ...............................

Interconnection topologies. (a) One-layer laterally connected neural
network. (b) Two—layer feed-forward neural network. (c) Two-layer

feedback neural network. ...................................................................
A basic neuron of the Hopﬁeld model. .................................................
A general Hopﬁeld network. ...............................................................
Linear programming network by Tank and Hopﬁeld. .............................
The integrator used in Kennedy and Chua’s model. ...............................
Trajectories starting in the feasible region. ...........................................
Trajectories starting in the infeasible region. .........................................
Trajectories for s=oo. , .........................................................................
The block diagram of the systein of equation 4.1. .................................
An illustration of the relationship among K, H, and F . .........................

The relationship between the equilibrium set of equation 4.1 and the
set of minimizers. (a) They are of the same size. (b) The equilibrium
set is smaller in size than the set of the minimizers. (c) The equilibrium

set is larger in size than the set of the minimizers. ................................

(a) Quasiconvex. (b) Quasiconcave. (c) Pseudoconvex. ((1) Not

pseudoconvex and not quasiconvex. ....................................................

(a) Gradient projection method [147]. (b) The dynamics of equation

4.1 on the boundary of the feasible region. ..........................................
The block diagram of the two-phase optimization network. ....................
The feasible region of (U’ 1). ..............................................................

Contours of E 2(x) for (LP 1). (a) The contour for s=10.

ll
l3

13

15
20
21
30
32
39
40
43
46
53

54

71

73
75
8O

(b) Detailed plot of (a) around f=(5, 5). (c) The contour for s=1.

((1) Detailed plot of (c) around x'=(5, 5). .............................................. 81 .
5.3 Trajectories of equation (4.1) for (LPl) with 3:10. (a) A broader view.
(b) A view around it = (4.992, 5.060). ................................................. 86

5.4 Trajectories of the 2-phase network for (LPI) with s=10, e=0.2.
(a) Trajectory of x. (b) Trajectories of E 2(x) and f (x) with respect to

time. ................................................................................................ 87
5.5 Trajectories of E 2(x) and f (x) for phase 2 with s=10, 8:02, and

x0 =[4.8, 4.8]T. (a) Trajectory of 520:). (b) Trajectory of f (x). ............. 89
5.6 Trajectories of equation 4.1 for (LP?) near (5.0, 5.5). (a) s = 2.

(b)s=2->10.(c)s=10. ............................................................... 90
5.7 Trajectories of equation 4.1 for (LP3) near the point (-5.5, -5.5).

(a):=2.(b)s=2—>10.(c)s=10. ................................................ 93
5.8 Trajectories of x of the (QP) network for (QPI). ................................. 95
5.9 Trajectories of x of equation 4.1 for (QP7) with s=50. .......................... 96
5.10 Simulation results of the 2-phase network for (QP7) with 3:50 and

e=0.2. (a) The trajectory of x. (b) The trajectories of 2.3. ...................... 98
5.11 Trajectories for (QP3) with s=50 and e=0.2. (a) Using the network of

equation 4.1. (b) Using the 2-phase network. ....................................... 99

5.12 The simulation results of the (LS) problem with initial condition

x0 =[0, 0, 0]T. (a) The trajectories of the state variables. (b) The

trajectory of E 2(x). ........................................................................... 101
5.13 The trajectories of x for the (LS) problem with different initial

conditions. (a) x,=[4, 4, 41’. (b) x0 =[11, .4, -4]T. (c) x0=[-4, 11,

4]? (d) x0=[11, 11, 417‘. (e) x,=[-4, -4, 11?. (t) x,=[11, 4, 1117'.

(g) x0=[-4, ll, 11]T. (h) xo=[ll, 11, 11]T. ......................................... 102
5.14 Contours of E 2(x) for (NPl). (a) The contour for s=0.5. (b) The

contour for s=5. (c) The contour for 3:50. .......................................... 108
5.15 Trajectories of x of equation 4.1 for (NPI) with 3:50. ......................... 112
5.16 Trajectories of x for (NPl) using the 2-phase network with s=10 and

8:0.2. (a) x0 = (0.75, 0.75). (b) x0 = (—0.75, 0.75). ............................. 114

5.17 Trajectories of x for different networks for (NP 2). (a) Equation 4.1

5.18
5.19
6.1

6.2

6.3

6.4

6.5
6.6
6.7
6.8

with s=10. (b) 2-phase network with 3:10 and e=0.2. .........................
(a) f (x) for (NP 3). (b) f2(x) for (NP3). .............................................
The u'ajectories of x of equation 5.4 versus E (x) for (NP 3). .................

Simulation results for Case 1 of Example 6.1 using equation 4.1 with
s=50 and x0 =[400, 300, 150]T. (a) Trajectories of x. (b) Trajectories
off(x) and E2(x). ............................................................................
Simulation results for Case 1 of Example 6.1 using the 2-phase network
with s=50, e=0.2 and x0 =[400, 300, 150]T. (a) Trajectories of x.

(b) Trajectories of f (x) and E 2(x ). .....................................................
Simulation results for Case 2 of Example 6.1 using equation 4.1 with
s=50 and x0 =[400, 300, 150]T. (a) Trajectories of x. (b) Trajectories
off(x) and E2(x). ............................................................................
Simulation results for Example 6.2 using equation 4.1 with s=50 and

x0 =[160, 40, 120]T. (a) Trajectories of x. (b) Trajectories of f (x) and

E 2(x ). (c) The trajectory of PL . .........................................................

Power system network for Example 6.3. ..............................................
The surface of h12(x). ........................................................................
The surface of h§(x). ........................................................................
A 5-bus power system used in Example 6.4. ........................................

119

124

126

128

135
135
137

CHAPTER I
INTRODUCTION

Conventional digital computers are very good at executing well-formulated
sequences of instructions represented by the stored program. There are some tasks,
however, which are very cumbersome to solve by conventional digital computers.
These include vision, speech, pattern recognition with distorted data, and information
retrieval where only partial input information is given. These tasks, on the other
hand, are accomplished and performed well by the human brain. The basic process-
ing elements of the human brain are neurons, which are electrochemical devices with
response times in the range of milliseconds. In the human brain there are approxi-
mately 1011 neurons and each of them may be connected to thousands of other neu-
rons. It is not yet well understood what interconnection structure organizes the neu-
rons, nor how this massively parallel interconnected system (a biological neural net-

work) interacts, stores and retrieves memory, and manipulates our thoughts.

In contrast, artiﬁcial nemal networks (ANN s) are machine models of the biologi-
cal neural networks with the aim of achieving human-like performance. Recently
there has been a resurgence in the ﬁeld of ANN 8 due» to new network topologies
(feed-forward multilayer network, Hopﬁeld feedback network) and algorithms (back
propagation, stochastic neural network), implementation techniques (digital VLSI tech-
nology, analog VLSI technology, electro-optics technology), and various emerging

applications.

ANN s do not always outperform conventional (sequential or parallel) computers.
Rather they provide a different approach to attack certain problems which are not
easily solvable using conventional computers. The key characteristics of ANN s are
listed in Table 1.1 and contrasted with the corresponding characteristics of conven-

tional computers.

Table l. 1. Characteristics of ANN s and conventional computers.

 

 

 

Characteristics ANN 8 Conventional Computers
Memory Distributed; Associative Localized; Speciﬁc
Fault-Tolerance Inherent Not Inherent
Pattern Recognition Ability Fast Slow
Classiﬁcation Excellent Poor
Partial Information Retrieval Excellent Poor
Error Correction Ability Excellent Poor
Learning Ability Excellent Poor
Math & Algorithmic Ability Poor Excellent
Timing Scheme Asynchronous **

Execution Mode Highly Parallel **
Processing Element Simple Unit **
Connectivity High **

 

 

 

 

** These characteristics are system-dependent.

The featm'e possessed by ANN s which diﬂ‘ers most from conventional computers
is that ANNs store information in their structure rather than in speciﬁc locations. All
the parameters (connection weights, external biases, thresholds of neurons, initial
states of neurons) collectively determine the information stored in the network. As a
result, if some of the interconnections are disconnected or some of the neurons fail,
the function of the network is preserved qualitatively. This provides the inherent abil-
ity of fault tolerance and sometimes the ability to retrieve the full output data pattern
with only partial input information. For pattern recognition, correlations of the input

patterns and output pattern are stored in the network. With a distorted or noisy input

ANN s do not always outperform conventional (sequential or parallel) computers.
Rather they provide a different approach to attack certain problems which are not
easily solvable using conventional computers. The key characteristics of ANN s are
listed in Table 1.1 and contrasted with the corresponding characteristics of conven-

tional computers.

Table 1.1. Characteristics of ANN s and conventional computers.

 

 

 

 

 

Characteristics ANN 5 Conventional Computers
Memory Distributed; Associative Localized; Speciﬁc
Fault-Tolerance Inherent Not Inherent
Pattern Recognition Ability Fast Slow
Classiﬁcation Excellent Poor
Partial Information Retrieval Excellent Poor
Error Correction Ability Excellent Poor
Learning Ability Excellent Poor
Math & Algorithmic Ability Poor Excellent
Timing Scheme Asynchronous **

Execution Mode Highly Parallel **
Processing Element Simple Unit **
Connectivity High **

 

 

** These characteristics are system-dependent.

The feattn'e possessed by ANN s which diﬂ'ers most from conventional computers
is that ANN 5 store information in their structure rather than in speciﬁc locations. All
the parameters (connection weights, external biases, thresholds of neurons, initial
states of neurons) collectively determine the information stored in the network. As a
result, (if some of the interconnections are disconnected or some of the neurons fail,
the function of the network is preserved qualitatively. This provides the inherent abil-
ity of fault tolerance and sometimes the ability to retrieve the full output data pattern
with only partial input information. For pattern recognition, correlations of the input

patterns and output pattern are stored in the network. With a distorted or noisy input

pattern applied, a well-trained ANN is able to map it to an output whose correspond—
ing input pattern best matches the applied one. Training, also called learning or adap-
tation, is the process determining the connection weights, usually over time, to
improve performance. Massive parallelism is another feature possessed by neural net-
works which is necessary for high performance computation for applications like
speech and pattern recognition.

Because of the immaturity of ANN research, various kinds of network structures
have been proposed and tested for different applications. There is no agreement on
which network best ﬁts a particular application. Neither is there a uniﬁed way to
classify the existing ANN s. One way, for example, is classifying them by topological
groupings of single-layered or multi-layered; feed-forward or feedback; fully con-
nected, nearest-neighbor connected, or hexagonally connected. ANN s can also be
divided into two categories depending on whether their usage is neuroscience-oriented
or engineering-oriented. The former tries to model simple nerve systems of some
animals and implement the functions of the model through software or hardware.
Then the implementation is used as a paradigm to validate and predict the behavior of
the original nerve system. The latter aims at mimicking the biological neurons and
their networks with some adaptation and modiﬁcation based upon the available analyt-
ical methods and available implementation technology in order to exploit the
decision-making functions.

For the engineering-oriented ANNs, one strong area of application is the solving
of constrained optimization problems. The ANN which is most widely used and cited
for such an application is the Hopﬁeld network [1-7]. This is an one-layered, fully
connected, feedback network. A often-adopted procedure to solve a speciﬁc problem

using the Hopﬁeld network includes the following steps:

i) select a mapping (representation, transformation, or encoding) such that the out-

puts of the network correspond to the solution to the problem;

ii) choose a proper energy function, bounded from below, whose minimum

corresponds to a feasible solution to the problem;

iii) derive network connection weights and bias inputs, which properly embed the

objective function and constrains of the problem into the network; and,

iv) choose initial values for the neurons in such a way that the network converges to

. a stable state which is a feasible solution to the problem.

Currently, each of the above steps is based on ad hoc procedures [6-44]. A lot
of effort, however, is being placed on formulating rule-based methodologies to obtain
parameters, derive energy functions, and choose initial values [12-13,18-24,45-50,57-
60]. For a speciﬁc problem, there are various choices at each step of the above pro-
cedure, but, except in a few cases [18-1923-2458-60], most of the work reported to
date does not guarantee, or at least analytically guarantee, that the state to which the

network converges is a feasible solution to the problem.

1.1 Problem Statement

ANN s, particularly the Hopﬁeld network, have been used to solve optimization
problems such as linear programming, nonlinear programming, and dynamic program-
ming. The stability and convergence of the Hopﬁeld network is ensured due to its
gradient descent nature [6]. But a reasonably formulated network, like the one used
for linear programming is not sufﬁcient to guarantee convergence to a feasible solu-
tion of the original problem. In fact, as will be shown in this dissertation, the linear
programming network by Hopﬁeld will converge to a point which is in general not
close enough to an optimal solution. For certain networks, some states of conver-
gence (local minima) may even turn out to be infeasible with respect to the original
problem. An example of this phenomenon is the traveling salesman problem as

ii) choose a proper energy function, bounded ﬁom below, whose minimum

corresponds to a feasible solution to the problem;

iii) derive network connection weights and bias inputs, which properly embed the

objective function and constrains of the problem into the network; and,

iv) choose initial values for the neurons in such a way that the network converges to

a stable state which is a feasible solution to the problem.

Currently, each of the above steps is based on ad hoc procedures [6-44]. A lot
of effort, however, is being placed on formulating rule-based methodologies to obtain
parameters, derive energy functions, and choose initial values [12—13,18-24,45-50,57-
60]. For a speciﬁc problem, there are various choices at each step of the above pro-
cedure, but, except in a few cases [lS-l9,23-24,58-60], most of the work reported to
date does not guarantee, or at least analytically guarantee, that the state to which the

network converges is a feasible solution to the problem.

1.1 Problem Statement

ANN s, particularly the Hopﬁeld network, have been used to solve optimization
problems such as linear programming, nonlinear programming, and dynamic program-
ming. The stability and convergence of the Hopﬁeld network is ensured due to its
gradient descent nature [6]. But a reasonably formulated network, like the one used
for linear programming is not sufﬁcient to guarantee convergence to a feasible solu-
tion of the original problem. In fact, as will be shown in this dissertation, the linear
programming network by Hopﬁeld will converge to a point which is in general not
close enough to an optimal solution. For certain networks, some states of conver-
gence (local minima) may even turn out to be infeasible with respect to the original

problem. An example of this phenomenon is the traveling salesman problem as

solved by the Hopﬁeld network, where some converged states set up a traveling

schedule to visit some cities more than once and not to visit some other cities at all.

To address this problem, a more thorough theoretical knowledge of the ANN
characteristics and more knowledge of the relationship between the ANN s and optimi-
zation theory is needed. As a result of this shallow understanding, trial-and-error
approaches are currently adopted in choosing the parameters of the network for solv-
ing various optimization problems. And what’s worse is the fact that the set of
parameters resulting from the trail-and-error procedures are case-dependent. Conse-
quently, the application variety of ANN s is limited, especially for applications requir-
ing real-time response. An urgent need is a more thorough establishment of the
theoretical analysis for ANN s from the viewpoint of optimization theory. A key ques-
tion is whether there is a general type network suitable for particular classes of optim-
ization applications. If a general network is possible, then a proper procedure for
mapping classes of optimization problems into this general network is desirable.
Additionally, guidelines are needed to verify whether a given network will converge
to the desired optimal solution(s) ornot. If an exact solution is not achievable due to
the ﬁniteness of the network parameters, a network which will converge to an approx-
imate solution is sought.

For fully exploring the computational power of ANN s in optimization problems,
an investigation involving the following fundamental research tasks is to be done in
this dissertation.

(1) Analysis of ANN s from the viewpoint of optimization theory.

(2) Developing generalized network(s) for solving basic optimization problems such
as linear programming, quadratic programming, and nonlinear programming.

(3) Quantifying, through simulation, the potential advantages and disadvantages of
the generalized network(s).

6

(4) Demonstrating the applicability of the generalized network(s) for solving some

optimization problems requiring real-time response.

The problems to be demonstrated are the economic power dispatching problem and

the optimal power ﬂow problem.

1.2 Approach

The following plan is organized to achieve the goals of this research in a step-

wise and overlapping fashion and to set the stage for subsequent developmental

research further exploiting the anticipated results.

Task 1:

Objective:

Approach:

Network Analysis

Various network formulations of the Hopﬁeld model will be analyzed
from the viewpoint of optimization theory in order to identify the rea-
sons why and when the network succeeds or fails.

The ﬁrst step in this research is to analyze optimization formulations of
the Hopﬁeld model. Various network formulations reported in recent
literature are classiﬁed into different categories to be analyzed in a sys-
tematic way from the viewpoint of optimization theory. The primary
theories used in the analysis include the Kuhn-Tucker optimality condi-
tions for constrained optimization and the penalty function methods
which translate constrained optimization problems to unconstrained
problems. Network formulations are translated to forms which can then
be analyzed for optimality conditions and the other criteria mentioned
above. Results of the analysis are compared with reported experiments

and used to verify the adequateness of the network formulations in the

Task 2:

Objective:

Approach:

Task 3:

Objective:

Approach:

literature. Based on the analysis results, guidelines for checking the

propriety of various network formulations are also sought.

Optimization Network F ormulation

Formulate basic optimization networks and their extensions, and

develop, if possible, a generalized network structure.

With the analysis results obtained in Task 1, some optimization net-
works can be formulated in such a way that the stability, convergence,
and optimality of convergence of these networks are theoretically
assured. The network formulation is started on some basic problems
like linear programming, quadratic programming, and nonlinear pro-
grammin g with afﬁne (linear) constraints. Optimization networks for
more complicated optimization problems, such as nonlinear program—
ming with nonlinear constraints and other combinatorial optimization
problems, are examined next as possible extensions of those basic net—
works. All the developed networks are checked with the previously
derived guidelines to see whether they are legitimate. Based on the
experience of formulating various networks, a generalized network is
thus developed and a procedure for mapping classes of optimization

problems into this network is sought.

Network Simulation

Produce simulations of the optimization networks developed in the last

task and provide some benchmark comparison parameters.

Simulation programs for each of the optimization networks are

developed. For each optimization network there is a set of ﬁrst order

Task 4:

Objective:

Approach:

differential equations which can be solved by standard numerical
analysis techniques. Simulation programs are used to provide perfor-
mance metrics for comparison with other approaches. Metrics to be
extracted include speed of convergence (throughput) and a measure of
computation accuracy (quality). The simulation is also used to study
the network sensitivity with respect to certain parameters, for example,
the input resistance and capacitance of each neuron and the time incre-
ment used solved the differential equations. Based upon the simulation
results, possible modiﬁcation and re-formulation of the networks are
taken by going back to Task 2 for the purpose of performance improve-

ment.

Case Study

Apply the developments in new optimization networks to real en gineer-
ing optimization problems: economic power dispatching (EPD) and
optimal power ﬂow (OPF).

For the EPD problem, two cases, with and without the consideration of

the transmission line losses, are solved by using the developed network

r/formulations and their results are compared with the results obtained by

other traditional methods. The formulation of the OFF is modiﬁed by
delineating the various constraint types for a complete OPF so that this
reduced model can be handled by the developed networks. Simulations
of the OFF will be made thereafter and the results will be compared to
benchmark OPF formulations.

1.3 Overview of the Dissertation

Chapter 2 covers the background knowledge pertaining to neural networks. It
includes a biological review of neural networks, some simpliﬁed neuron models and
network topologies, and a literature review of optimization-related work done on
neural networks. Because of its pioneering role in applying ANN to optimization
problem, the Hopﬁeld feedback network is covered in Chapter 2 to brieﬂy introduce
the basic idea behind optimization neural networks.

Chapter 3 starts out with an introduction of the requisite mathematical back-
ground, followed by the analyses of three existing optimization neural networks. It
ends with an in-depth study of the linear programming problem with hypercube feasi-
ble region as solving by the Kennedy and Chua’s network [19].

The core of this dissertation is in Chapter 4 in which the theoretical results for
various optimization networks are derived. The derivation of the linear programming
network is given ﬁrst. Then, the results are extended to the quadratic programming
network and ﬁnally the nonlinear programming network. The least squares problem is
shown to be solvable by the quadratic programming network. A two-phase network
described last is capable of converging to the exact solution of a optimization problem

and obtaining the corresponding Lagrange multipliers as well

Chapter 5 gives the simulation results of various optimization problems using the
developed network structures. Chapter 6 illustrates the applicability of the optimiza-
tion network to real-world problems by two case studies: economic power dispatch

and optimal power ﬂow. The conclusion of this work is given in Chapter 7.

CHAPTER H
BACKGROUND

This chapter starts with a brief foundational review of neurobiology followed by
some simpliﬁed models of neurons and different network interconnection topologies.
A literature review of ANN applications to optimization problems is given next. The
Hopﬁeld feedback network is covered independently in the last section because of its

pioneering role in applying ANN techniques to optimization problems.
2.1 Neurobiological Review

The major parts of a typical biological neuron include a nucleus, cell body, axon
hillock, axon, synapses, and dendrites as shown in Figme 2.1. Dendrites are the
receivers of incoming signal. When the incoming signal (a smooth varying analog
voltage) reaches a certain value, the nerve cell ﬁres and the axon hillock generates a
pulse (action potential). The output of a typical neuron consists of a series of action
potentials each about 1 millisecond long. If a neuron has a strong input, action poten-
tials are generated at a high rate. If the input is weak or absent, action potentials are
produced at a low rate. The mean rate of the generation of action potentials as a
function of the input follows the form of a sigmoid function. The effective input usu-

ally refers to the short time average or running integral of an excitation; its frequency

10

-11-

of generating action potentials is considered as the effective output [61].

The axon is very resistive and is in charge of transmitting action potentials to
synapses through which a neuron interacts with up to thousands of other neurons. A
synapse consists of two parts, the presynaptic membrane and the postsynaptic mem—
brane. They are separated by a gap called the synaptic cleft which is about 500A
wide. Inside the presynaptic membrane there are small and diffusible molecules
called neurotransmitters that are released into the cleft in response to an action poten-
tial. The released neurotransmitters diﬂ’use to the postsynaptic membrane where they
combine with certain receptor molecules causing a depolarization of the postsynaptic
membrane. The depolarization signal is collected by the dendrite of the second neu-
ron and sent to its cell body [62]. The response of the postsynaptic membrane is a
graded response rather than a pulse. With an excitatory synapse the postsynaptic
potential will be more positive, whereas with an inhibitory synapse the postsynaptic

potential will be more negative.

 

dendrites nucleus
hr

F5) fig...
W W

 

 

endbde \

 

 

 

cell body

Figure 2.1. A biological neuron.

-12-

2.2 Simpliﬁed Neuron Models and Network Topologies

Neuron models constructed for the purpose of studying the function of the brain
generally involve very complicated mechanisms and thus result in complex structures.
These models fall into the category of neuroscience-oriented ANN s and are of pri-
mary interest to neurobiologists. On the other hand, engineering-oriented neuron
models which are used for the purpose of improving the artiﬁcial (machine) process-
ing of data or information are of primary interest in this work.

The simplest neuron model sums the weighted inputs and passes the result
through a certain function. The outcome of the function, considered as the output of
the neuron, branches out via weighted connections to the inputs of other neurons. Let
a,- ’s be the inputs from other neurons, (0;,- be the weight of the connection from the
output of neuron i to the input of neuron j , f (o) be the neural function, and y j be the
output of the neuron j . As mentioned in the previous section, the ﬁring of action
potentials of a neuron is a type of threshold mechanism. To model this mechanism
requires two more variables, 0j and (00,- , denoting the threshold value and its weight
for neuron j , respectively. This simpliﬁed representation of a neuron is shown in Fig-

ure 2.2.

From the computation viewpoint, this neuron model can be thought of as a pro-
cessing element (PE). The function f (0) maps input values to a prespeciﬁed range
and is generally of the following four types: linear, nonlinear ramp, step, and sig-
moidal. These input/output relationships are shown in Figure 2.3. The sigmoidal
function is the most pervasive because it is bounded, monotonic, nondecreasing and
provides a graded, nonlinear response which most resembles a real neuron. The range
of the sigmoidal function is sometimes changed from [0,1] to [-1,l], depending on the
application, giving symmetry with respect to the origin.

-13-

 

 

Figure 2.2. Simpliﬁed representation of a neuron.

 

NV

 

(a)
Figure 2.3. Neuron response functions. (a) Linear function. (b) Nonlinear ramp func-
tion. (c) Step function. (d) Sigmoidal function.

-14-

Figure 2.3. (cont’d.).

+f(X) +7 if xzy

f(x)= x if lxl<y
-y K xS—y

 

. 'y is the saturation level

 

+7 ﬂ x>0
f(X) =

0 otherwise

 

 

m

 

f(X) = 1

1+e'x

 

‘@

-15-

The connection weight in the model corresponds to the correlating strength
between the presynaptic potential and the postsynaptic potential. A positive weight
models a excitatory synapse and a negative weight models a inhibitory synapse. A
positive connection weight implies a positive correlation between the connected neu-
rons or a rewarding relationship. A negative weight implies negative correlation or a
punishing relationship.

The interconnection structure of neurons in human brain is very complicated but
is thought not to be random. There is evidence showing that both the retina and cor-
tex are organized into layers of cells with interconnections within and between layers.
Connections within a layer are referred to as intra-ﬁeld connections, lateral connec-
tions, or short-term memory (STM). Connections between layers are referred to as
inter-ﬁeld connections, ﬁeld connections, or long-term memory (LTM). The intra—
ﬁeld connections are usually considered unidirectional while inter-ﬁeld connections
may propagate signals in a feed-forward and/or feedback direction. Three intercon-
nection topologies are given in Figure 2.4. The networks shown in Figures 2.4 (b)

and (c) can be extended to n layers.

,2
CO.
_' 0..

(a)
Figure 2.4. Interconnection topologies. (a) One-layer laterally connected neural net-
work. (b) Two-layer feed-forward neural network. (c) Two—layer feedback neural net-
work.

-17-

As an example of the functional difference between these interconnections, con-
sider the XOR problem [63]. This problem is not solvable using a single layer neural
network as was discovered early on in the ANN research. It can now be solved easily
using a multilayered trained network. This training, however, is nontrivial and it is
still an open question on how to best train a complicated ANN. The connection pat-
terns within and between the layers are not necessarily fully connected as is the case
illustrated. These connections may be nearest-neighbor type connections, connected
according to certain patterns, or randomly connected with ﬁxed fan-in and fan-out.
Other connection topologies, such as mesh, feature map, and three-dimensional arrays,
have also been studied [4749,6164]. Though not shown in Figure 2.4, every connec-

tion is weighted.

2.3 Literature Review

According to the chronologically edited book by Anderson and Rosenfeld. which
contains over 40 important historical papers in this ﬁeld, the idea of modern neural
networks can be traced back to as early as late 19th century [65]. Recent interest has
been sparked mainly due to the works of Hopﬁeld [4—7], Grossberg [66-69], Kohonen
[61,70], McClelland [71-74], and Rumelhart [71-76]. An in-dcpth study of neural net-
work research and applications up to 1987 has been done by DARPA [77]. In this
study, the storage of a neural network is measured in terms of interconnects, and the
speed of a network is described in terms of interconnects-per-second. Also in this
study, interconnects versus interconnects-per-second charts have been used the ﬁrst

time to measrn'e the computational capabilities of neural networks.

Tank and Hopﬁeld ﬁrst applied an ANN -based technique to solve optimization
problems like the traveling salesman problem and linear programming [67]. In the

case of linear programming, the objective function and inequality constraints are

-13-

mapped into a closed-loop network in such a way that constraint violations loop back
to adjust the states of the neurons. The overall energy functions of a network so
designed decreases until it reaches a minimum. Under a high—gain limit assumption
often placed on certain neurons, the corresponding output of the network then presum-
ably approaches a solution to the original problem. However, this presumption is not

true in general as will be seen in the next chapter.

Motivated by the work of Tank and Hopﬁeld, an abundance of research has been
done on applying the ANN s to other optimization problems. Some work has been
done on the justiﬁcation of the ANN model used for the traveling salesman problem
and possible model modiﬁcations as well as extensions [8-13,78-79]. There is
increasing interest in applying ANN s to various linear programming problems, such as
integer linear programming and problems with equality constraints and to related
topics such as nonlinear programming and dynamic programming [l4-24,80.82].

For engineering-related optimization applications, ANN s have been developed to
solve the placement and the routing problems in VLSI design [86-93], general
computer-aided design [94—99], and power systems engineering problems like security
monitoring, contingency classiﬁcation, and economic power dispatching [ZS-26,100-
109]. More general optimization-related applications using ANN-based techniques
can be found in [2746,110-117]. Most of the works cited, however, have not yet
been vindicated theoretically, and this limits their applicability.

Some researchers have also sought to combine both the ANN models and
another new optimization technique, namely, simulated annealing [118] in order to
explore the merits of the two [IS-24,51-54,119-122]. But the drawbacks are that the
amount of time needed for the annealing process is too long plus there is no simple

way to implement such models using any currently available technology.

Analysis is being carried out on some of the ANN models used for
optimization-related applications [SS-60,123-132]. But, except for a few papers

-19-

reported to date [IS-20,2324], most of these analyses were not undertaken from the
viewpoint of optimization theory. As a result, the network so designed may converge
to a point which is useless to the original problem. Though the ANN model used in
[20] was developed based on some optimization methods, they merely replaced the
discrete procedures by a set of differential equations, which do not necessarily guaran-
tee the stability nor the convergence of the network. We will justify our argument in
the next chapter when the models in [6,19,24] are carefully studied under a unifying
optimization ANN theory developed in the process of this work.

2.4 The Hopﬁeld Model

The Hopﬁeld model falls into the category of a one-layer, laterally connected
network; it is sometimes referred to as the feedback network [133]. One of the major
contributions of the Hopﬁeld model is that it can be built with analog circuit com-
ponents and .is suitable for analog VLSI implementation [134—142]. In this model
each neuron, with input u,- and output V,- , is modeled as an ampliﬁer with a capacitive
element Ci and a resistive element p,- at the input node. These components partially
deﬁne the time constant of the neuron. The output of neuron j is connected to the
input of neuron i via a ﬁnite conductance Tij . This conductance models the synapse
and is symmetric, i.e., Tij = Tﬁ. Figure 2.5 illustrates the basic structure of a
Hopﬁeld neuron. The input-output relationship of the ampliﬁer is sigmoidal. The
excitatory synapse (Tij >0) and the inhibitory synapse (TU <0) are implemented by con-
necting the conductance to the normal output and inverted output of ampliﬁer j,
respectively.

A general Hopﬁeld network is shown in Figure 2.6. The rate of change of u; is
determined by the following equation derived using KCL.

-20-

 

 

 

V.
1
Figure 2..5 A basic neuron of the HOpﬁeld model.
C( ‘) irw ) 1“ +1
— = 00 .-—u. — -— t
8 dt i=1 1] J 8 pi l
n l
=JET if Vj -(zTij + —)u,- +113 (2.1)
where
V]. = fj(uj). (2.2)
Let R,-j = -7-}—, 1%.] = 3— + i—, and choose fj(.) =f(.) for all j. Then the above
r} r r 1R1]

equations can be rewritten as
dug u,-
Ci(7)= 1'ng if Vj- R—i + I‘- (2.3)

where

Vj =f(uj)’ (2.4)

-21-

 

 

 

V1 V2 V3

Figure 2.6. A general Hopﬁeld network.

and
uj = f’1(Vj).
The ﬁrst integral or the energy function of equation 2.3 is

"53 55’0”“ ’ 21’ V" + rgk—tlv‘f-leodt.

i=1j=1

E:

 

(2.5)

(2.6)

-22-

3E

for -87.. = C (d; ) The time derivative of the energy function can be found by
applying the chain rule as
as: = 55.3335;
. greats:
V,- dui dt

 

 

(2.7)

Since f (u;) is monotonically increasing, %50 for all t. As a result, the value of the

energy function is strictly decreasing and becomes zero only at equilibrium points at

du-
which£=—C—— =0foralli.

3V ‘ dt
If uj in equation 2.4 is replaced by M)" where I. is a constant representing the

neuron gain, then equation 2.5 becomes
1 _
u.- = if ‘03-). (2.8)

Hopﬁeld asserted that if A is chosen to be large enough, then the third term on the
right hand side of equation 2.6 is negligible compared to the other terms and thus can
be dropped [5]. This leads to the following:

C(—-—)- .. 21wj + 1 (2.9)

and

l n n n
1:

i=1 j=l

-23-

Note that these two equations are valid only for the high gain limit assumption, that
is, when 2. is very large.

Equations 2.3 - 2.6 actually deﬁne a gradient system and thus guarantee no oscil-
lations or any complicated behavior in the system [133]. Furthermore, it has been
proven that such a system has only a ﬁnite number of isolated equilibria and they are
bounded [143-144]. The network can thus be envisioned as a system which tends to
ﬁnd a path leading to a local minimum in the energy surface. This surface is collec-
tively deﬁned by the network parameters. Those isolated equilibrium points may
correspond to memory patterns in associative memory, patterns in a pattern recogni-

tion problem, or locally optimal solutions to an optimization problem.

CHAPTER III
ANALYSIS OF NEURAL NETWORKS FOR OPTIMIZATION

The requisite mathematical backgron is given in the ﬁrst section to introduce
the notations and the basic theorems used in this dissertation. The ANN models
described in [6,19,24] are studied in detail and justiﬁed by the theorems stated in Sec-
tion 3.1. A thorough analysis of the linear programming neural network for problems
with hypercube feasible region is covered in the last section. It serves to demonstrate

the dynamics of the optimization network in [19].

3.1 Mathematical Background

The following notation and conventions are used throughout this dissertation.
XCR" is said to be convex iff for any a,beX implies [a,b]cX, where
[a,b]={xeR" lx=ka+(l—K)b, 0951}. Let X CR" be a nonempty convex set, then
f:X-—)R is said to be convex iffox-t—(l-My )5 1f (x)+(l—)t)f(y), for any x,yeX
and for 09.51. The function f :X —>R is concave if -f is convex. An affine func-
tion f :X -)R is a function which is convex and concave. K is a cone in R” iff
wceK for any xeK and for any (120. K is a convex cone in R" iff K is a cone
and K is convex. Note that x={Ay IyZO}, where A is an nxm matrix, is a closed

convex cone. H is a hyperplane in R" iff there exists aeR”, a¢0, and oak, such

25

that H = {xeRdl<a,x> = a}, where <.,.> is the Euclidean inner product on R", and
a is a normal vector of H. Hyperplanes H and H ' are said to be parallel iff their
normals are proportional. Let H = H(a,0t) = {xeR" |<a,x> = a], and a¢0, then the
corresponding closed half spaces are deﬁned by H+(a,0t) = {xeR" |<a,x> 2 a}, and
H_(a,a) = {xeR" |<a,x> S a]. Note that the quadratic function

f(x)= -;—xTAx +aTx +b

is convex (strictly convex) on R " iff A is positive semideﬁnite (positive deﬁnite),
where A is a symmetric nxn matrix, aeR", and beR.

Let the program (P )' be of the following form:
Minimize f (x) subject to constraints
810050. 8,0050. h1(x)=0. 12,,(x)=o,

where f and the 35’s are functions on R" and the hj’s are functions on R" for mSn.
(P) is said to be a convex program if f and the g,- ’s are convex functions on R" and
the hj’s are afﬁne functions on R". A vector x is called a feasible solution to (P)
iff 1: satisﬁes the r+m constraints of (P). In other words, the feasibility set to (P) is

the (possibly empty) set

K0 = Kln...nK,nL1n...an
where

Ki={x|g,-(x)30], i=1,...,r,

and

 

"' (P) and other capital letters, such as (LP), (QP), and (NP), when enclosed with
parentheses represent mathematical programs.

26

Lj={x Ihj (x )=0], j=1,....,ﬂl

When K0 is empty, (P) is said to be infeasible; otherwise, (P) is feasible. If (P) is
a convex program, K0 is necessarily convex. For xeKo, the binding set at x is the
set I={i|g,-(x)=0}. Let xeKo; x is said to be a regular point if the gradients,
Vgi(x), th (x ), i E] (x), lSjSm, are linearly independent.

The following theorem is known as the Kuhn-Tucker optimality theorem. (For
proof see [145].)
Theorem 3.1: Let (P) be a convex program in the notation above. Let f be a feasi—
ble solution to (P). Suppose each g,- and it} is differentiable at 2'. Assume further
that f is a regular point. Then i is an optimal solution to (P) if and only if there
exists 2.:[11 1,17 and u=[tt1 um? together with I that satisfy:

(i) 1,20, gi(f')SO, and A;g,-(f)==0, i =1,...,r;

and

(ii) VﬂfH ixtvtztm +§u,Vh,-(f> = o. n

i=1 j=1. —
The variables 3.,- and 11; are known as Lagrange multipliers . Without the
assumption of the convexity of the functions, the conditions (i) and (ii) are only the

necessary conditions for f to be a local minimizer.
Theorem 3.2: Let (P) be a program in the notation above. Let f be a feasible solu-
tion to (P). Suppose each g,- is diﬂ‘erentiable at 5.". Assume further that f is a regu-
lar point. If it solves (P) locally, then then: exists HA, x, 17‘ and u=[]l1 u... ]T
together with 1' that satisfy:

(i) 2.520, g;(5t')S0, and A;g,-(f')=0, i =l,...,r;

and

27
r 1!:
(ii) Vf(f)+ E‘Vg‘m +51”th = o. a
For proof of above theorem see [146].
Deﬁne uj+=max{uj,O], and uj:min{uj,0}. Then, uj=ujt+ur for ISj_<_m. The
corresponding term in condition (ii) above can be written as
uJ-th = (Prim-Wk; (3.1)
= uj+th-(-uj-)th
= uj+Vh,-+(—u,--)(—Vh,->
= llj+th+(—Hj-)V(-hj)-
Since only one of pit or uj- is nonzero and hi =0, condition (ii) can be extended by

stipulating mutually exclusive terms for h 130 and h j 20.

i=1 j=l

The Lagrange multipliers are now forced to be all non-negative. Let g,+2j_1=h-,
g,+2j=-hj, A,+2j_1=p.j+, and 2.,+2j=—p.j—, the extended program (P’) can now be
expressed as:

Minimize f (x) subject to g;(x)SO, 15i$r+2m

Under these notations, Theorem 3.1 may be restated as the following corollary.

Corollary 3.1: Under the assumptions of Theorem 3.1 and the notations above, it“ is
an optimal solution to (P’ ) if and only if there exists 2. = Del 1H2," 1T together with
x that satisfy

(1) 1320, 8; (i750, 811d A18; m, i=1,...,f;

and

28

(ii) Vf (r) + 'gxthtm = o. n
i=1

Next, deﬁne an alternative binding set to be I’(x)= {i liq-$0, r+l$i$r+2m ],

then condition (ii) of Corollary 3.1 can be expressed as

Vf(f)+ 2 A,Vg,(x)+ E xivgicf) =0- (3.3)
iel(f) 36”?)

This implies that —V f (1?) lies in the closed convex cone spanned by Vg, (r') for
i e! (f)ul' (f). If f is a regular point, then -Vf (f) can be uniquely decomposed into
a positive linear combination of Vg; , ieI (f')uI' (2’).

Deﬁne g;+(x )=max{0, g;(x)], i.e., g;+(x) is the magnitude of the violation of the
i th constraint in (P’) where 151' Sr+Zrn . The following theorem is known as the
penalty function theorem. (For proof see [147].)
Theorem 3.3: Let (P’ ) be the extended program stated above for f eC1 and g,-eC1
where 151' Sr-I-Zm . Let {s,‘ 11" be a nonnegative, strictly increasing sequence tending
to inﬁnity. Deﬁne the function

L(s. x) =f(x) + {fume (3.4)

i =1

Let the minimizer of L(sk, x) be xk. Then any limit point of the sequence {xkli' is
an optimal solution to (P’) and, equivalently, to (P). Ftn'thermore, if x], —)x' and f is
a regular point, then stg,-+(xk )—)k,- , which is the Lagrange multiplier associated with
(P’ ). CI

Note that the penalty function L (s , x) is called the energy function in [6]. Later
it will be shown that it is a qualiﬁed Lyapunov function for a neural network. The
following corollary is a direct result from Theorem 3.3.

Corollary 3.2: Let the notations and assumptions be as deﬁned in Theorem 3.3. Then

given 8>0, there exists a sufﬁcient large 3 such that the minimizer of L (s , x) lies in

29

N(0, e), where N(0, e) = {xl llx-flke, item and 0 is the set of minimizers of
(P’). E]

3.2 Networks-Analysis

There are cm'rently three ANN models proposed for solving nonlinear program-
ming problems [6,19,24]. Consider ﬁrst only the case of linear programming. The
linear program (LP) considered here is of the following form:

Minimize f(x)=aTx
subject to g (x) = Dx -b 50,

where D is an mxn mauix, beR’", aeR", and xeR". If equality constraints are
considered as well, then each of them can be replaced by two inequality constraints as
shown in the last section so that the following discussion still holds. Note that

Vf (x) = a and Vg (x) = DT. For simplicity in notation, denote g*=[g f...g,,‘,‘]T.
3.2.1 The Model by Tank and Hopﬁeld

The network su'ucture proposed by Tank and Hopﬁeld for solving the linear pro-
gramming problems is shown in Figure 3.1 [6]. a,— and b,- are implemented by current
sources. The voltage outputs, x,- , on the upper right of the ﬁgme are the variables in
the linear programming problem. The outputs, g f, on the lower left of the ﬁgure,
measure the constraint satisfaction (violation). s in the rectangles is a large positive

constant.

Their model can be described in compact form by

i = C"{-Vf (I) - Vs (x)g“(X) - %R’1x}s. (3.5)

30

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

"1
"2
T "a
Q x"
-b
D D 1
9f 9 11 12 D
D 4:
9 21 D 2
95' ¢ + 2n
9 031 D 4’3
95* ‘ +
-b
9 DIM Dmn m
9,; F + m

 

Figure 3.1. Linear programming network by Tank and Hopﬁeld.

where C is a n xn diagonal matrix due to the self-capacitance of each ampliﬁer. R is
"I
a n xn diagonal matrix with lax—013+; where —-1- the self-conductance of
Rii j=1 Pi Pi
each ampliﬁer. The so called energy function is chosen to be

2

m n xi
Etc) =f<x> + zero)? + 23—. (3.6)
i=1 i=12‘gRi

Taking the derivative of E 1(x) with respect to time yields

31
= (—Ci)Tx
= 47me (3.7)

for all t . Equality holds only at equilibrium points since C is a positive, diagonal
matrix. Note that Vf and Vg are constant vectors for linear programming and g+(x)
is just a vector. Replace g+(x) by a vector 7t and take C to be the identity matrix,

then the minimizer f of El occurs when

—Vf (5r) - Vg or». - 71-12-15: = o. (3.8)

Comparing this equation with the condition (ii) of Theorem 3.2 shows that either the
system described by equation 3.5 does not have an equilibrium or else, even if it does,

the equilibrium would not be a solution to the program (LP).

Suppose s is sufﬁciently large, as suggested by Tank and Hopﬁeld, the last terms
of equation 3.5 and equation 3.8 can be neglected. In this case equation 3.8 can be
viewed as fulﬁlling the necessary conditions of Theorem 3.2. But since it,- (=gf) is
required to be positive for some j , the equilibrium of equation 3.5 has to lie in the
infeasible region of the program (LP). Depending on a particular program, the equili-
brium may be quite far from the true minimizer of (LP) which is generally on a
corner (or boundary) of the feasible region. This drawback makes their model unreli-
able in solving linear programming problems even though the model has a big advan-

tage for hardware implementation.
3.2.2 The Model by Kennedy and Chua

The model developed by Kennedy and Chua [19] is based on the Chua’s previ-
ous work in [148]. The basic components in their network are integrators as shown in

Figure 3.2. Their network formulation requires more hardware components to form

32

 

 

 

 

 

 

 

 

Frgure 3.2 The integrator used in Kennedy and Chua’s model.

the integrator in its analog circuit implementation when compared to the Tank and
Hopﬁeld network. But it is superior in that it circumvents the undesired terms in
equations 3.5 and 3.6, namely, the terms due to self-conductance. Their model can be

described by
J? = C ’1{-Vf (X) - 3V8 008 +00} (39)

where C and s are deﬁned as in equation 3.5. For argument sake, C is normally
taken to be an identity matrix. This model has been used to solve both linear pro-
gramming and quadratic programming problems. The corresponding energy function

is

33
E2(X)=f(1)+ gimp»? (3.10)
j=l

Kennedy and Chua showed that E 2(x) is a Lyapunov function for the system of equa-
tion 3.9. This ensures that the system will converge to a stable equilibrium point
without oscillation. Their network analysis is primarily based on the nonlinear circuit
theory derived in [149].

Comparing equations 3.4 and 3.10 indicates that their work actually fulﬁlls the
penalty function method for a ﬁxed penalty parameter. But they fail to justify the
assumption required for the penalty function theorem (Theorem 3.3) to hold. Nor do
they clarify the relations between the equilibrium point of the network and the true
minimizer to the original program, since there could be more than one equilibrium
point with respect to one minimizer unless the regularity of the minimizer is assumed.
A straightforward analysis from the viewpoint of optimization theory is undertaken in
the next chapter to establish a more sound theoretical foundation for Kennedy and

Chua’s network.
3.2.3 The Model by Rodriguez-Vazquez, et al.

The energy functions of the previous two networks are variations of penalty
function methods, since they are formed by adding the cost functions with penalty
terms. The penalty terms are derived by taking the magnitude of the constraint viola-
tion squared times a penalty parameter. According to the penalty function theorem,
the true minimizer can only be obtained when the penalty parameter s is inﬁnite.
This is impossible to achieve in practice. To cope with this difﬁculty, Rodriguez-
Vazquez, et al. proposed a network model which is formed by two mutually exclusive
subsystems [2A]. (Mutually exclusive here means only one of the two will contribute

at a time.)

34

Let u, be the feasibility index of x, i.e., ux=1 if xeKo; otherwise, u;=0. Their
model can be expressed as

i = -quf(X) - ng(x)g+(x). (3.11)

The corresponding energy function is

Ego) = uxf<x> + gimp»? (3.12)
j=l

The trajectory of the system moves along -Vf (x) if u,=1; otherwise, it moves
according to the negative gradients of the violated constraints. The combined effort
of these two mutually exclusive subsystems forces the conglomerate trajectory to
move toward the boundary of the feasible region. As long as it hits the boundary, the
trajectory chatters around the boundary and, at the same time, it also approaches the
optimum point. But a new problem arises: there is no equilibrium point in this sys-
tem, since the condition (ii) of Theorem 3.2 can not be met. The trajectory bounces
back and forth in a neighborhood of the minimizer, though the neighborhood can be
made very small. The authors, however, have suggested that the system can be
viewed as stable if the variation of the solution is bounded. They also suggested

another model with the following system equation and energy function:

i = —u,Vf (x) - ng(x)(1—ux). (3.13)
and
330:) = uxf (x) + sﬁgrtx) (3.14)
j=l

But the same problem still exists for this model.

35

3.3 Linear Programming Network for Problem with Hypercube Feasible Region

The linear programming problem genre considered here is the minimization of a

cost function
f (x) = aTx (3.15)

subject to xe [0,1]", i.e., 09,51 for every i, where x,- is the ith component of the n-
vector x . Let 1,, and 0,, be n-vectors of all ones and zeros, respectively. Then the
constraints can be represented in matrix form as

g(x) = [35:1] = [3:11 + [31"] =Dx + b S0. (3.16)

f(x) is a hyperplane in R" and the feasible region is a unit hypercube in R". The
general assumption is made on the system that -a is not parallel to the normal vector
of any hyperplane g ,- =0, where g ,- is the j th component of g (x ). This ensures the
uniqueness of the optimum point, or in this case the minimizer of the cost function.
The minimizer of the cost function is one of the 2" cOrners of the hypercube depend-

ing on the normal vector of the hyperplane.
Let J = { j | g j>0} be the set of indices of the constraints which are violated. Let
It: card (J), the cardinality of J, i.e., the number of components of J. Then g 1* takes

on the magnitudes of constraint violations. Consider next a network structure given

by the piecewise linear differential equation
' = -[D,Tg,+ ] = -[of(o,x+b,)++a]. (3.17)

This is similar to equation 3.5 without considering the term due to self-conductance.
Note that this structure of the dynamical system described varies with time since J
changes from time to time. D, is a kxn matrix consisting of the jth rows of D, for

every j a]; similarly b 1 consists of the j th components of b . Observe that since

36

there are at most n constraint violations, D J is at most an n xn matrix.

3.3.1 The equilibrium of the system

Under the assumption placed on -a , the minimizer of the cost function is at an
intersection of n of Zn hyperplanes g ,- =0. Recall that by the deﬁnition in Section 3.1,
I is the set of indices of these hyperplanes. By the Kuhn-Tucker theorem for

optimality, the necessary and sufﬁcient condition for a point to be a minimizer is that

Vf + Zungj = 0 (3.18)
jel
where u ,- >0. For any hypercube, the ng ’s are orthogonal for j e] . Therefore, equa-
tion 3.18 can be viewed as decomposing Vf into n orthogonal vectors ij ’s, each
along the direction of -ng.
Note that Vf =a and ng=DjT. Also, (D,-x+b,-)+ measures the one-sided dis-
tance of x to gj=0. If we denote (Djx+b,-)+ by w-, then the equilibrium of equation
3.17 must satisfy

Vf + 219,-ng = 0. (3.19)
jeJ
Matching I With I and w,- with u,- shows that the equilibrium of the system fulﬁlls
the Kuhn-Tucker optimality condition. Furthermore the equilibrium is unique due to
the orthogonality of gj=0 for jeJ. Also observe that wj=(D,-x+b,-)+>0 for jeJ indi-
cating that the equilibrium lies in the infeasible region where n constraints are
violated.

37

3.3.2 The initial state in the feasible region

When the initial state of the system is any point in the feasible region, then J is
simply empty and the trajectory moves in the direction of -a until it hits either one of
the hyperplanes or an intersection of certain hyperplanes. Assume the former is the
case. On the hyperplane g ,- =0, the trajectory still follows the direction of -a and thus
it eventually enters into the other side of the hyperplane. At this time, D J becomes a
lxn matrix denoted by D ,- (small j ). The dynamics of the system are now described
by

r = -D,T(D,x+b,)+-a. (3.20)

-D,-T(D,-x +b,- )+ is a vector in the direction of —D,-T with magnitude proportional to
(Djx+b,-)+. If we decompose —a into two parts as —a = ale-t-aj, where a, is the pro-

jection of -a in the direction of D}, then equation 3.20 becomes
° _ T
x — [—Dj (Djx+bj)++a,-] + “I" (3.21)

As the system evolves in time, it reaches a point where the ﬁrst two terms on the
right hand side of equation 3.21 cancel each other. Thereafter, the trajectory rolls
along with a 1: until it hits another hyperplane.

When there is at least one constraint violation, similar to the case with the initial

state in the infeasible region, -a can be decomposed as -a=2a,-+aj, where a,- is the
jeJ

projection of -a on the span of DjT for 1's] and a, is the portion of .0. not in this
span. Note that a ,- is ﬁxed and unique for each j a] due to the orthogonality of the
normal vectors D}: I, however, is not ﬁxed. The differential equation of the system

for k21 can now be written as

x = -};D,T(D,-x+b,.)+ + 2c, + a,. (3.22)
jeJ jeJ

38

Extending the above argument to 1: >1 illustrates that as the trajectory moves
from one side of a hyperplane g ,- =0 to the other side of the hyperplane, the projection
a ,- will be continually reduced by DjT(D,-x+b,- )+ until it becomes zero. Although the
trajectory may hit another hyperplane, the orthogonality of the DjT’s nonetheless
guarantees that a ,- —D,-T(D jx +b ,- )+ eventually will become zero. It is clear at this point
that if a trajectory starting in the feasible region hits the intersection of certain hyper-
planes, each a,- will be diminished gradually by the corresponding term DjT(D,-x+b,- )+
and the result stated above still holds.

Since -a is not parallel to any hyperplane by assumption, it can be decomposed
into n orthogonal projections. As the trajectory moves from region to region, it will
eventually enter a region where n of the Zn constraints are violated. In this region
the trajectory settles to the unique equilibrium point on which -a is fully represented

by -zD,T(D,.x+b,)+.
jeJ

3.3.3 Example of a 2-dimensional hypercube

Due to the orthogonality of the hyperplanes, the trajectories of the system can be
illustrated by considering the case of a 2-dimensional hypercube. In Figure 3.3, the
vector —a is shown on the upper right comer together with its two orthogonal projec-
tions. The intersection of the two dotted lines is the equilibrium point of the system.
The broadened gray lines are trajectories of the system corresponding to different irri-

tial states.

Figure 3.4 gives several examples with the initial states in the infeasible region.

The corresponding -ED,.T(D,-x+b,- )+ with respect to each initial point is drawn by a
jeJ

solid line segment along with vector -a shown by a dotted line segment. The direc-
tion of the trajectory changes whenever it crosses a hyperplane. Once the trajectory

39

 

XZ=1 E

 

 

 

 

 

Figure 3.3. Trajectories starting in the feasible region.

reaches one of the dotted lines, i.e., one of the orthogonal projections of -a , it stays
on the line and moves toward the equilibrium point. The equilibrium point will
always lie in one of the four regions shown with light gray background in Figure 3.4.
If we consider that the normal vectors of the hyperplanes separate these four regions
from each other as bases of R 2, then there is only one region in which -a will have

positive coordinates. This is another interpretation of equation 3.17.

 

 

Frgure 3.4. Trajectories starting in the infeasible region.

3.3.4 Further Analysis

Consider now the system deﬁned by

x = -s [DJT(DJx+bJ)+]-a

= -s

 

'eJ

sz(D,x+b,-)+]-a
I

(3.23)
which is same as equation 3.9 but more explicitly expressed. When s=l, equation

3.23 is the same as equation 3.17. When s>1, denoting vj=s(D,-x+b,-)+ for jeJ, the
equilibrium of equation 3.23 must satisfy

41
Vf + ZvngJ- =0. " (3.24)
jeJ
The corresponding terms of equations 3.24 and 3.18, namely I=J and uj=vj, can be
u.
matched. Since u,- is ﬁxed for a given cost function, (Djx+b,- )+=-;J- is the equili-

brium of equation 3.23. This means that the distance from the equilibrium point x to
g 1:0 is reduced by a factor of s when compared to the case where s=1 (equation
3.19). By choosing s sufﬁciently large, the equilibrium point can thus be moved arbi-
trarily close to the intersection of the corresponding n hyperplanes, which is the

minimizer of the cost function.

Next, the energy function E is deﬁned as

)3 [(0 ,x +12,- )+]2. (3.25)

E: (”+1
f 21's]

The time derivative of E can is derived as

dE ddei

7:59am

_-. JET a+2st(D,-x+b,-)+
is]

T

= at it < o, (3.26)

for all x¢0. Thus E is a Lyapunov function for the system. Hence the equilibrium is
asymptotically stable by the asymptotical stability theorem [150]. In fact, it is also
asymptotically stable in the global sense due to the unboundedness of (D ,x +b ,- )+ as

le ll ->oo.

For s sufﬁciently large, equation 3.25 is a form of the penalty function method.

But, if we use the notation of v ,- , then equation 3.25 becomes

E = f(x) + % gvj(ojx+b,)+, (3.27)
16

42

which is a form of the Lagrangian function method. So the forms of both methods
are implicitly embedded in the network structure described by equation 3.23. As fol-
lows from the penalty function method theorem, the equilibrium of equation 3.23
approaches the minimizer of the cost function as s -)oo. This, however, is impossible
to implement in practice. A sufﬁciently large 5 will generally result in an equilibrium
state which is a reasonably good approximation to the minimizer of the cost function.
A diagram of the trajectories of equation 3.23 for s=oo is shown in Figure 3.5 for
a 2-dimensional hypercube. Whenever the trajectory lies in the infeasible region, it
will be forced to move directly to either the closest hyperplane or the intersection of
the hyperplanes if more than one constraint is violated. Then, it will move according
to —a to one of the hyperplanes g ,~ =0, where jel . Once it reaches such a hyper—
plane, the trajectory slides on the hyperplane toward an intersection of hyperplanes

with indices j e] .

3.3.5 Extensions

The results of above analysis are directly applicable to cases where the hyper-
cube is being scaled up or down, translated from the origin to any point, and/or
rotated at any angle. It is also applicable to the region deﬁned by
[ll,u1]x - - ° x[l,,, ,,] and its scaling, translation, and rotation, as long as the ortho—
gonality of the hyperplanes (g i=0 for j 61 ) is preserved. If the orthogonality is not
preserved, the equilibrium of the system deﬁned by equation 3.23 may not be unique,
though the local asymptotic stability of the desired equilibrium still holds.

An extended theoretical argument for nonlinear programming in general is given

in the next chapter.

43

 

 

X2=l

x2=0

 

Figure 3.5. Trajectories for s=oo.

CHAPTER IV
OPTIMIZATION NETWORK FORMULATION

A unifying mathematical framework for the Kennedy and Chua network for
linear and quadratic programming is given in the ﬁrst two sections followed by its
extension to more complicated nonlinear programming problems. A two-phase optim-
ization network model is then proposed which can obtain both the exact solution, in
contrast to the approximate solution by Kennedy and Chua’s network, as well as the

corresponding Lagrange multipliers associated with each constraint.

4.1 Linear Programming Network Theory

Let the linear program (LP) be deﬁned as in Section 3.2 and its objective func-
tions is referred as f I (x). In this section, the network by Kennedy and Chua for
linear programming is justiﬁed ﬁ'om the viewpoint of optimization theory. As has
been pointed out in Section 3.2.2, E2(x) is precisely L(s, x) for a ﬁxed penalty
parameter s . Thus we have the following proposition which is a restatement of

Corollary 3.2.

Proposition 4.1: Let 0 be the set of minimizers of a feasible (LP). If 0 is bounded
and contains only regular points, then given e>0, there exists a sufﬁciently large 5

such that M, the set of the rrrinimizers of the corresponding E2(x ), satisﬁes

45

mirﬂIx—flke for xeM. Cl

area

Note that M is convex due to the fact that the sublevel set of a convex function,
namely E2(x ), is convex. Without the assumption of the boundedness of 0 , the pro-
position is still true. But when 0 is unbounded, any bounded subset of 0 is
sufﬁcient for obtaining a minimizer. This is due to the fact that if an (LP) has a ﬁnite
optimum value, it must have a ﬁnite minimizer. Thus for 0 unbounded, we can place
some additional, suitable bounding constraints into the original program so that the

following discussion still holds.

Take C to be I in equation 3.9 and rewrite it as

r = -Vf (x) — sLigj+(x)VgJ-(x)]. (4.1)
‘=1

The block diagram of the system of described by equation 4.1 is drawn in Figure 4.1.
Deﬁne sgj+=vj and J (x)={ j lgj+(x)>0, ISjSm ]. The equilibrium of equation 4.1
occurs when

0 = VfCE) + sﬁgfawttjtr)
j=l

= Vf (50+ 2 vngjCi). (4-2)
fem)

Since for linear programming problems, ng=dj, where d, is the jth row of D , equa-

tion 4.2 can be expressed as

me+ z v,d,-=0. (4.3)
jaw)

Proposition 4.2: If the (LP) is feasible with ﬁnite optimum value, then the system
described by equation 4.1 has an equilibrium. CI

Proof: Let 1? be a minimizer to the (LP). Then, according to Corollary 3.1, there
exists kj>0 for jelo‘c') such that

46

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

x
CD *’ I x r
-‘ '
Vf *—
258i+V8t°
i ._,
sg,+._

 

 

 

Figure 4.1. The block diagram of the system of equation 4.1.

16“?)

Choosing J (xHa') and vj=7tj, veriﬁes the Proposition. El
Proposition 4.3: Let f be a solution to a feasible (LP) with a ﬁnite optimum value.
If x" is not regular, then the equilibrium of equation 4.1 corresponding to x is not

unique. CI

47

Proof: Since it" is not regular, {dj )1. e 1(3) are linearly dependent. This implies that

there exists two subsets of I ()7), say I' and I" where I’ztl” , such that

a + Enid,- = O, for (ll->0, (4.5)
jel’
and
a + z pjd, = o, for B,>o. (4.6)
jel”

Let aj=sgj+(xl) for jeI’ and Bj=sgj+(x7) for jeI”. Since I’ael”, then there is at
least one a,- ¢Bj and, consequently, at least one g ,- (x1)¢g ,- (x?) Thus the equilibrium

corresponding to f is not unique. [3
Note that even though the equilibrium points corresponding to r are not unique,
they all result in same value of E 2(x ), since any equilibrium point of equation 4.1 is a

minimizer to E2(x) as will be shown later.

In fact, it is observed that any convex combination of equations 4.5 and 4.6

satisﬁes the equilibrium condition. That is to say, for 038$],

a + 6 2 ajdj + (1—6) 2; ma, = o. (4.7)

jer jer'
Let i=I’UI" and r=card (f). Denote a as the vector in R' with its element equal to
at,- if jel’ , otherwise 0. Similarly, let p be the vector in R' with its element equal to
B,- if jeI” , otherwise 0. Then, equation 4.7 implies that the line segment [(1, B] C
R1 is mapped to a single point, —a, in the closed convex cone {y |y=27,~d,-, yj>0}.

je
More generally, from equation 4.3 -a lies in the closed convex cone {y ly= 2 7,- dj.
1' 61(1)

7} >0} as seen from equation 4.4.

For linear programming f (x) and (g j+(x ))2 are convex and continuously

differentiable as is E 2(x ). Furthermore,

48

(152:3an

;|;

_ 'T V m .+V .
—x f+s£gj g,
i=1

 

= 47150 (4.8)

holds with equality only at the equilibrium .2 of equation 4.1. Thus if the (LP) has a
ﬁnite optimum value, E2(x) is a Lyapunov function of equation 4.1 and achieves its
minimum at 52.

Proposition 4.4: Let (LP) be a feasible program and E 2(x) be correspondingly
deﬁned. Then the set of equilibrium points of equation 4.1 is the set of minimizers of
E 2(x ). El

Proof: Let M, be the set of minimizers of E2(x) and M2 the set of equilibrium points

of equation 4.1. For xeMz, we have

VE2(X) = .1. = 0

and

V2E2(x) = Vg , (x )ngot) 2 o,

for Vg ,=[Vg ,- ] ,6 1(1). Thus x satisﬁes the necessary condition of a minimum of
E2(x ). That is, M ,c M 2. To show the converse, i.e., Mzc M1, we proceed as fol-
lows.

Since M I: M 2 and M110, there is at least one equilibrium point of equation
4.1, say it, which is a minimizer of 5,. Observe that V2520?) is strictly positive

except along the direction of y at which

yTVgrngfcb = o. (4.9)

49

So i is a minimum except along the direction y . Therefore, E2(x) must be examined
along the direction of y. Equation 4.9 implies vg}(r)y =0, i.e., yeNull(Vg,T(5i)),
where Null (ngoz )) is the nullity of VgJT. Since VE 2(52 )=o implies
—aeRange (Vg, (r )), we have aTy=0 due to the fact that the linear subspace of
Null (V gJTCc' )) is perpendicular to the linear subspace of Range (VgJCE )). Now

VE2(5i+y)=a+ 2‘, g,-+(i+y)Vg,-(i+y)
1916+”

=a + 2 g,-+(r+y>Vg,-(i)
i610?)

=0 4 2 [vgf(r+y) - bIngC?)
iela)

=a + z [ng5: -b]Vg,-(x)
few)

=a + 2 groove-(i)
i610)

= V820?) = 0, (4.10)

if J(x+y)=J(J‘2). Thus 5t+y, for yeNuiI(Vg,T(i)), is an equilibrium of equation 4.1

as long as it does not evoke any new constraint violation. Moreover,

Ezc+yl=aT<r+y>+i 2; (SfCZH‘DZ
jeJ(x+y)

= 0T} + ‘3' 2 ($4.637)?
i616)

aTr + g 2 (Vs-Tao) - b)2
i616)

Ht

aT

+ % z (ngi -b)2
ism)

.. s ..
aTx +3 2 (gj+(x))2
few!)

so
= 52(2), (4.11)

ifJ(i+y)=J(J‘i). This implies i-l-y is also a minimizer of E2, and thus Mzc M1. D
Proposition 4.5: Let 0 be the set of minimizers of a feasible (LP). If 0 is bounded
and contains only regular points, then there exists a sufﬁciently large s such that
J(i)=I(x'), for)? an equilibrium of equation 4.1 and 1760. D

Proof: From Proposition 4.4, it is a minimizer of E2(x ). According to Theorem 3.3,
5i-—>x'e 0 and v ,- 4). ,- as s —)oo. Since the convex cone spanned by ng (f) is closed,

there exists a sufﬁciently large s such that J (52)=I(x). CI

Note that 0 is necessarily convex. Also for x1, x260 , we have I (x1)=I(x2) and
the corresponding Lagrange multipliers 3e,- are the same.
Proposition 4.6: If the unique minimizer f of a feasible (LP) is a regular point, then
there exists a suﬂiciently large s such that the equilibrium of equation 4.1 x with
respect to f is unique. [3
Proof: let so be the parameter that satisﬁes Proposition 4.5 so that J (5i )=I (2'), for it
an equilibrium of equation 4.1. Since 2' is regular and unique, it implies that
card (I (f))=n , which furthermore implies card (J (5i ))=n by the choice of so . In this
case equation 4.2 is equivalent to solving a system of n linear equations for n unk-
nowns, namely, the v,- ’s. Since ng (i ), for j eJ (J‘i ), are linearly independent by the

choice of so , v is uniquely determined. But

vj = sogj +(1)-=So(V8jx “bj)9 (4-12)
so i is uniquely determined, again by the linear independency of ng (x ), for j eJ (5i ).
D

The above proposition considers only the case for which card (I (f))=n . Suppose
now that 0 contains more than one point, i.e., card (I (f))<n. let so be deﬁned as in
. Proposition 4.6 and 56o be a corresponding equilibrium of equation 4.1, then

51

0 +50 2 8j+(io)dj = 0. (4.13)
I510?)

As s changes continuously from so to co, :2 moves continuously from so to an is 0.

If now so is raised to s1, the system moves ﬁem J‘io to 521 so that

a +31 2 gj+(il)dj = 0. (4.14)
felt?)

Comparing equation 4.14 to 4.13, the constraint violation has been reduced by a factor

so . . . . . .
of T. Also at the trme of swrtchrng so to s, the system moves 1n the direction of a
1

as can be seen by evaluating the dynamics of the system at so .

ili, =—[a +51 2 gj+(io)dj]
I510?)

= -[(a + (304's 1'30) 2 git-(fa )dj]
1.5“!)

= -<sr-so) 2‘, gj+(io)dj
felt?)

—a
= -(S 1.30)—
o

S 1’50

 

= ( )a. (4.15)

0

Multiplying equation 4.13 by s, and equation 4.14 by so and taking their
difference, we have
0 = (SI " a)“ +3130 2 (8j+(io)‘gj+(il))dj
161(1)

=(s1-so)a +s1so E (dJTJ‘io -b,- — dfil +bj)dj
i610?)

=(Sr —s.)a ms. 2: dis. -ir)d,-
felt!)

52

= (r1 -so)a + slso 2 djdfao -5i,). (4.16)
i616)

If d, is normalized, i.e., lldj IE1, then dde-TCio -J‘21) is an orthogonal projection of
J'c'o - it, along the direction of d,. So equation 4.16 actually says that the sum of the
projections of J‘io -J‘21 on the dj’s is in the direction of a. From equation 4.13 it is
clear that —a lies in the convex cone spanned by {dj )1. e ,m. Also when switching so
to s 1, the trajectory of the system will not move in any direction perpendicular the
subspace spanned by (d,- )1.6 , ,5. Thus the following fact is noted:

Fact 4.1: Let io, 52,, 1', so, and S, be deﬁned as above. Ifxo-i-y approaches f+yeO
as s changes ﬁem so to co, where y is perpendicular to {d,- 11.6“”, then, if so is
switched to s1 at x=5io, x1+y is the equilibrium of the system. C!

The above fact can be geometrically interpreted. First the following deﬁnitions
are helpful. Let K CR " , then a hyperplane H is said to be a supporting hyperplane of
K iﬁ‘ KCH+ or KcH_, and cl(K)nH is nonempty, where cl(K) is the closure of K.
Let K CR” be closed and convex, then FcK is called a face of K iff there exists a
supporting hyperplane H of K such that F = H nK . This relationship is illustrated in
Figure 4.2 in which K is represented by a gray background, H by a straight line, and
F by a broadened line segment.

When there is more than one minimizer, the set of minimizers of a linear pro—
gram (LP) actually forms a face. In Figure 4.3 the set of minimizers is represented
by broadened line segments. The straight lines in the ﬁgure are the hyperplanes that
deﬁne the feasible regions, which are shown with gray background. The equilibrium
points of equation 4.1 are plotted by dotted line segments. There are three possible
cases depending on whether the size of the equilibrium points of equation 4.1 is
larger, smaller, or equal to the size of the set of the minimizers. (The size of a com-

pact set S may be deﬁned to be maxgllx-yll.) Three possible cases are illustrated in

x0e

53

 

Figure 4.2. An illustration of the relationship among K , H, and F .

Figure 4.3. Any combination of these three cases is also possible. The arrows in Fig-
lue 4.3 depict the changing of the equilibrium set as the system is switched to a new
s value.

Using the results obtained thus far we are able to conclude the following
theorem. This is a very important result since it assures the complete stability of the
network described by equation 4.1.

Theorem 4.1: Under the notations and assumptions of Proposition 4.1, then, given
t-:>O, there exists a sufﬁciently large s such that the system is completely stable and it

satisﬁes minIIJT-xllqe. C!
re 0

Proof: Let s1 be the parameter that satisﬁes Proposition 4.5. Given £>0, by Proposi-
tion 4.1 there exists a sufﬁciently large s2>s1 such that llx—flke, where x is an equili-
brium of equation 4.1 and 1760. Since 0 is bounded, by Proposition 4.1 the set M
of equilibrium points of equation 4.1 is bounded. (Otherwise, for 0 bounded and M
unbounded the conclusion of Proposition 4.1 would not be true.) Using E 2(1) as the

Lyapunov function for the system, LeSalle’s Theorem [150] ensures that the system is

54

 
    
 

"Ill'lﬂlllulﬂlulllll III

(a)

| minimizers
; equilibria

(C)

Figrne 4.3. The relationship between the equilibrium set of equation 4.1 and the set of
minimizers. (a) They are of the same size. (b) The equilibrium set is smaller in size
than the set of the minimizers. (c) The equilibrium set is larger in size than the set of
the minimizers.

completely stable in the sense that every trajectory will converge to a point in M

without oscillation. El

55

4.2 Quadratic Programnring Network Theory

Consider next the case of quadratic programming. Let the quadratic program
(QP) be of the following form:

Minimize fo(x) = é-xTAx + 07x
subject to g(x) = Dx-bSO,
where A is a symmeuic, positive semideﬁnite matrix. It is clear that V f o (x )=A.x +a.

First the following lemma is needed.

Lemma 4.1: Let B be a symmetric, positive semideﬁnite matrix. Then Bx =0 if and
only if xTBx =0. U
Proof: The only if part is clear, since for Bx =0, it follows that

xTBx = xT(Bx) = 0.

The converse can be shown by using the Cauchy-Schwarz inequality; that is, if Q is

symmetric, positive semideﬁnite then the following is true:
(ﬂay)2 s (xTQnyTQy). (4.17)
for any x and y. By assumption xTQx=0, we have
0 s (£ny s (xTQnyTQy) = o. (4.18)
for any value of y. This implies leQy |=0, for any y. But this implies xTQ=0, i.e.,

Qx=0. The proof is complete. El

Similar to Proposition 4.1 for linear programming, we have the following propo-
sition for quadratic programming which is a direct result of Corollary 3.2.

Proposition 4.7: Let 0 be the set of minimizers of a feasible (QP ). If 0 is bounded

and contains only regular points, then given e>0, there exists a sufﬁciently large s

56

such that M, the set of the minimizers of the corresponding E 2(x ), satisﬁes

minllx-x‘lke for xeM. [3
£60

Proposition 4.8: Let (QP) be a feasible program and E 2(x) be correspondingly
deﬁned. Then the set of equilibrium points of equation 4.1 is the set of the minimiz-
ers of 82(x ). El

Proof: Let M, be the set of the minimizers of 52(1) and M2 the set of equilibrium

points of equation 4.1. For xeMz, we have

VE2(X) = "X. = O

and

Wax) = A t Vs: (x )VXJT (x) 2 0. (4.19)

for Vg,=[Vg,-]je,(,,). If the A matrix in the deﬁnition of fo (x) is positive deﬁnite,
then the Hessian of E2 (equation 4.19) is positive deﬁnite. Thus the equilibrium of

equation 4.1 is necessarily and sufﬁciently the unique minimizer of E 2(x ).

If, however, A is only positive semideﬁnite, then x satisﬁes only the necessary
condition of a minimum. That is to say M 1: M 2. To show the converse, i.e.,

M 2: M 12 we proceed as follows.

Since M 1c M 2 and M 1150, there is at least one equilibrium of equation 4.1, say
i , which is a minimizer of E 2. Observe that V2E 2(5'c) is strictly positive except along

the direction of y at which

yT(A + Varngltrm = o. (4.20)

So J? is a strict minimum except along the direction of y . We need only examine
E 2(x) along the direction of y . Equation 4.20 implies ngﬁ )y =0, i.e.,
y eNuiI (vg,T(r)). Equation 4.20 also implies yTAy =0, and this implies Ay=0 by
Lemma 4.1. Together we have yeNull (A )nNuu(Vg,T(r)). Also, since VE2(x)=0

57

implies —Ax-a eRange (Vg, (i )), we have
0 = yT(AJ‘i + a)
= yTAj't + yTa

= yTa, (4.21)

due to the fact that y eNuii (A )rwuu (ngor )) and that the linear subspace of
Null (VgJTCc' )) is perpendicular to the linear subspace of Range (Vg, (i )).

Now

VEZCHy) =A(x +y) + a + 2 g,-+(x+y)Vg,-(f+)’)
jeJ(x+y)

=Ax +a + 2 gj+(x+y)Vg,-(i)
. jeltx)

=Ajt +a + E [ng(i+y)-b]V8j(i)
i816)

=Ai +a + z [VgJ-Ti -b]Vg,-(x)
i616)

=A55 +0 + )3 8j+(i)V8j(i)
161(1)

= V520?) = o, (4.22)

if J(J‘i+y)=J(i). Thus 5t+y, for yeNuii(A)nNuii(vg,T(r)), is an equilibrium of

equation 4.1 as long as it does not evoke any new constraint violation. Moreover,

Erato)=(ic+y)TA(i+y)+aTtiz+y>+i z (8j+(i+)’))2
jeJ(1+y)

=xTAr + a7)? + % E (8,°+(J'5"')’))2
felt!)

=iTAi' + a7: + g- z (vg,T(r+y) - 12)2
felt?)

= “TAx +aTx + g 2 (ng2 — b)2
jela)

58

=2“: + oTr + % z (,gfci»2
isle)

if J(x+y)=.l(x'). This implies 5i+y is also a minimizer of E2, and thus M2: M1.
The proof is complete. El

Proposition 4.9: Let 0 be the set of minimizers of a feasible (QP ). If 0 is bounded
and contains only regular points, then there exists a sufﬁciently large s such that
J(J‘i)=I(x'), forx' an equilibrium of equation 4.1 and 2'60. D

Proof: Same proof as of Proposition 4.5. D

Proposition 4.10: If the unique minimizer r of a feasible program (QP) is a regular
point, then there exists a sufﬁciently large 3 such that the equilibrium :2 of equation
4.1 with respect to x is unique. El

Proof: let so be the parameter that satisﬁes Proposition 4.9 so that J (r )=I (r), for 52
an equilibrium of equation 4.1. Since 1? is regular and unique, it implies that
card (I (x’))=card (J (x' ))=n by the choice of so and Vnga) is of full rank. The latter
ensures that ng, )ngm is positive deﬁnite. Since it is an equilibrium of equation
4.1, from equation 4.2 and using J (i)=I(J't') we have

A: + a + So 2 ng(x')(Vg,-x' - bj) = O. (4.24)
felt!)
This is the same as
A? + a + sngIm[Vg,Tmi - by] = 0. (4.25)

By rearranging the variable, we have

[A + sng, VgT )).? + a - sngImb, = 0. (4.26)

(2') Hi

59

Observe that A + so Vgl(z)V317(r) is positive deﬁnite, since A is positive semideﬁnite

and VngVgJTm positive deﬁnite. Hence 5i can be uniquely solved as
-1
x = -[A 4» so VgIngITm] [a — snglme]. Cl (4.27)

The above proposition considers only the case for which card (I (f))=n . Suppose
now that 0 contains more than one point, i.e., card (I (f))<n . Let so be deﬁned as in
Proposition 4.10 and let to be a corresponding equilibrium of equation 4.1, then

Afio + a + so V813)[V811(z)io - bl] = 0. (4.28)

As s changes continuously from so to oo, 56 moves continuously from 52o to an is 0.

If now we raise so to s1, the system moves from J‘Zo to 521 so that
Ail + a + sIVgl(f)[ngmxl - by] = 0. (4.29)

Assume that so ail—9x" as so ->s1—>oo. Let yeNull (A )nNulI (V3170?) )) and
J (xo+y)=J(5io ). As the system of equation 4.1 is changed by raising s from so to s,
at x=xo +y , its dynamics are described by

rim =A(5io+y) + a +s1 z ng<rngf<a+y> - 12,)
jelm

=Aro + a + so 2 Vg,(r)(vg,-Tro — 12,-). (4.30)
law?)

This is same as the dynamics of equation 4.1 when changing so to s, at x=x'o. By

Proposition 4.8, zTa=0 for any zeNull (A )nNuIi(vg,T(ro)). It follows that

2T1? lit.” = 2T Ago +0 +51 2 ngmwgfio -bl')
1'6“!)

= s1 2‘, zTVg,(r)(Vg,Tro - b,-)
felt?)

= 0, (4.31)

60

since zeNull (VgJTa )) implies that z is perpendicular to g ,- for jeI (2'). Equations
4.30 and 4.31 actually say that any vector in Null (A )nNull (ngor )) remains
unchanged as the system evolves in time. This indicates the following fact.
Fact 4.2: Let 55o, 521, f, so, s1, and y be deﬁned as above. If xo+y approaches
f+y60 as so —>oo, then, as so is switched to s, at x=7‘io, the system obtains an equili-
brium at x1+y . D

Similar to the case of linear programming network, for the sake of practicality,
the results for the quadratic programming network are summarized in the following
theorem.
Theorem 4.2: Under the notations and assumptions of Proposition 4.7, then, given
e>0, there exists a sufﬁciently large s such that the system is completely stable and it

satisﬁes minllx—ftlke. El

feO
Proof: Basically the same proof as for Theorem 4.1. [3

Note that the results of above argument still hold even if I (if) is empty, i.e., no
binding constraints. In this case the minizer lies in the interior of the feasibility set.
This is a very strong result with a myriad of applications. In particular, the following

two cases are examined.
Case (A): Solving B,,,o,x=b
B is assumed to be of full rank. This problem can be converted into minimizing

f(x) = élle—HP over R". But
f(x) = %xTBTBx - bTBx + #71; (4.32)

and

Vf (x) = BT(Bx—b ). (4.33)

61

Since 3TB is symmetric and positive deﬁnite by the assumption placed on B, the
optimization network formulation described by equation 4.1 for the case of quadratic

programming can be applied. Deﬁne

i = _-Vf (X). (4.34)

It follows that 9%:4Wf (x )II2s0 with equality holds only at Bx =b. From

Theorem 4.2 the unique equilibrium of equation 4.1, x=B‘1b, is globally asymptoti-
cally stable. Thus the problem is solved without actually calculating the inverse

matrix of B .
Case (B): Solving Bmxnx=b

In this case, assume rank (B )=n<m . This is a least squares problem. Similarly,
we transform this problem into minimizing f (x) = élle-bll2 over R " and deﬁne a
system as equation 4.34. By the above theorems, the system converges asymptotically
to an equilibrium it which satisﬁes

0: B T (B: -b ).
This implies that
it = (Mm-13%.

But this is exactly the least squares solution to Bx =b . Again the problem is solved

without computing the inverse matrix.

The results derived can also be applied to the above two cases when rank (B )<n ,
though the equilibrium under such a condition is not unique. The above derivation is

summarized as the following corollary.

Corollary 4.1: Let B,,,,,,,x=b , where B is a constant matrix, b a constant vector, and
n Sm . If B is of full rank, then the dynamic system of equation 4.34 uniquely solves
the problem B,,,x,,x=b. If B is not of full rank, then depending on the initial states,

62

the system of equation 4.34 approaches one of the solutions. [3

Least squares problems with linear constraints can be solved as well by the same
network formulation as described ill equation 4.1. But the solution thus obtained is an

approximation to the exact solution due to the ﬁniteness of the penalty parameter s .

4.3 Nonlinear Programming Network Theory

Deﬁne the program (PP) to be

Minimize fo (x) subject to g (x )=Dx-b50,

where f o (x) is a Cl convex function and g (x) is similarly deﬁned as in (QP ).

Proposition 4.11: let 0 be the set of minimizers of a feasible (PP). If 0 is
bounded and contains only regular points, then given e>0, there exists a sufﬁciently
large s such that M , the set of the minimizers of the corresponding E 2(x) satisﬁes

minllx-x1|<e for xeM. El
feO

Proof: The proof follows from Corollary 3.2. III

This proposition implies that there exists a sufﬁciently large s such that M is

bounded. (Otherwise, there exists xeM and minllx -x1|>e which thus it draws a con-
£60

tradiction.)

Proposition 4.12: Let (PP) be a feasible program and E 2(x) be correspondingly
deﬁned. Then the equilibrium points of equation 4.1 are the minimizers of E2(x ). El
Proof: Since E2(x) is convex and continuously differentiable on R" , the critical
points of E2(x) are the minimizers. This can be seen from Theorem 3.1 by taking it

and it to be zero since there are no constraints for E2(x ). But the critical points

satisfy

63
0 = VE2(X) = ‘1‘.

Thus the equilibrium points of equation 4.1 are just the minimizers of E 2(x ). D
It is known that if f is a C2 (twice-continuously differentiable) real-valued func-

tion on an Open convex set S in R" , then f is convex if and only if its Hessian

matrix

V2f(x) = [L (1‘)]

3x,- 3x,-

is positive semideﬁnite for every x e S [145]. Using this fact, the implications of Pro-
position 4.12 can be extended.

Proposition 4.13: Let (PP) be a feasible program and E2(x) be correspondingly
deﬁned. Assume that the Hessian matrix of f o (x) is positive deﬁnite. Then the
equilibrium point of equation 4.1 is the unique minimizer of E 2(x ). Cl

Proof: By Proposition 4.12 the equilibrium of equation 4.1 is the minimizer of E 2(x ).

Furthermore, we have that

vzrt'zct) = szo (x) + vg,(x)vg,T(x) (4.35)

is positive deﬁnite, since szo (x) is positive deﬁnite and Vg, (x )Vgﬂx) is positive
semideﬁnite. Thus x is necessarily and sufﬁciently a strictly local minimizer of
E 2(x ). Due to the convexity of E 2(x ), we can conclude that i is the global (and thus
unique) minimizer of E2(x ). El

Proposition 4.14: Let 0 be the set of minimizers of a feasible (PP). If 0 is
bounded and contains only regular points, then there exists a sufﬁciently large s such
that J(i)=l(f), where it is an equilibrium of equation 4.1 and £60. El

Proof: Same proof as of Proposition 4.5. D

64

Theorem 4.3: Under the notations and assumptions of Proposition 4.11, then, given
e>0, there exists a sufﬁciently large s such that the system is completely stable and 5?

satisﬁes minlIf-J‘ilke. CI
is 0

Proof: let it, be the parameter that satisﬁes both Propositions 4.11 and 4.14. Since
0 is bounded, by Proposition 4.11 there exists a s2>s1 such that the minimizers M of
E20!) are bounded. Using the fact that E 2(x) is a Lyapunov function for the system
of equation 4.1, LeSalle’s theorem [150] ensures that the system is completely stable

in the sense that every trajectory converges to a point in M without oscillation. EJ
Corollary 4.2: Under the notations and assumptions of Proposition 4.13, then, given
€>0, there exists a sufﬁciently large s such that the unique equilibrium :2 is globally
asymptotically stable and llf-x'lke. El

Proof: let s, be the parameter satisﬁes Proposition 4.11. Using the uniqueness of Si
and applying Theorem 4.3, the result follows. E!

The discussion in this section up to now is just an extension of the quadratic pro-
gramming network with the objective function of the program allowed to be any Cl
convex function. The results obtained thus far apply to the program with linear equal-
ity constraints as well. In what follows, the discussion is extended to programs with
nonlinear constraints, either equalities or inequalities.

Consider/next the case of convex programming.

Lemma 4.2: let (CP) be a convex program. Then E 2(x) is a convex function. CI

Proof: By deﬁnition,

52(1) =f(x) + g ﬁerce)? + ivy-(of. (4.36)
i=1

i=1

where f and the g,- ’s are C1 convex functions on R" and the h,- ’s are afﬁne functions

on R " . The Hessian matrix of the second term on the right hand side of equation

65

4.36 is

WBEtgr-Txnz] = s

i=1 i6]

VSJVSJT + Egi+vz i],

 

r
which is clearly positive semideﬁnite. Thus 2 (gi+(x ))2 is a convex function.

i=1
Since each h ,- is an afﬁne function, they can each be expressed as

T

for some vector c, and constant e,, ISjSm. By checking the positive
semideﬁniteness of the Hessian matrix, it can be shown that the last term of equation

4.34 is also a convex function.

Now E 2(x) is a sum of three convex functions, so it must be a convex functions
as well. CI

If E 2(x) for some (CP) is bounded below, then by the formulation of equation
4.1 the system will converge to a minimizer of E2(x ). So the results obtained for
(PP) still hold for the case of convex programs.
Proposition 4.15: If the basic program (PP) is replaced by a convex program (CP)
in Propositions 4.11 - 4.14, Theorem 4.3, and Corollary 4.2 (with the assumption that
the minimizers of (CP) are bounded and contain only regular points), then their
results still hold. CI

Proposition 4.14 implies that for a sufﬁciently large s , the violated constraints at
an equilibrium 1‘? of equation 4.1 are the same as the binding constraints at a minim-

iw 5". For (PP). ifJ(i)=l(f). then
Vgiem) = ngertr) (4.37)

since the gradient of a violated constraint is a constant vector. But this is not always

true for the case of convex programming. The gradient of a violated constraint for a

66

convex program (CP) is a continuous vector function rather than a constant vector. It
is due to this continuity and the assumption of regularity of the minimizers that the
conclusion of Theorem 3.3 holds. For a convex program, it is only guaranteed that
equation 4.37 holds when s=oo. (For further discussion see Chapter 12 of [151].)

All the programs discussed thus far consider equality constraints (h ,- ’s) with only
afﬁne forms. The strategy to map such constraints into the optimization network is to
replace them by two inequality constraints, hJ-ZO and ll,- 50. Since each of these two
inequalities is again a convex function, the results of Theorem 3.1 are applicable. In
order to solve more general problems by the optimization network technique, we want
to relax the restriction on the afﬁneness of equality constraints. First the following

theorem is needed which is an extension of Theorem 3.1. (For proof see [146]).

Theorem 4.4: Let (P) be a program in the notation described in Section 3.1. Let f
be a feasible solution to (P). Suppose that f is a regular point. Further suppose that
f, g,, ieI(x’), and h-, lSjSm, are convex and all are differentiable at 1’. Then 1? is
a global optimal solution to (P) if and only if there exists Hit, 2.,1T20 and
ll=[tt1 um]T>0 together with x that satisfy

(i) l,g,-(x') = 0 for i=1,...,r

and

i=1

i=1
Corollary 4.3: let (P) be a program with the same notations and assumptions of
Theorem 4.4 except that h-, lSjSm, are all concave. Then i? is a global optimal
solution to (P) if and only if there exists Hit, mTzo and p.=[].11 umlT<0
together with f such that the conditions (i) and (ii) of Theorem 4.4 hold. El

Proof: Since -h,-(x) is convex for all j and

lle-hj) = (‘llehJ-r

67

the result follows by applying Theorem 4.4. D

Since the h,- ’s are not assumed to be afﬁne, the minimizers of the problem may
be isolated, rather than a convex set. If h,- is a convex function, then hjSO deﬁnes a
convex set. Similarly, —h,- S0 deﬁnes a convex set for It,- concave. Now using the
notation of a extended program, i.e., letting g,+2,-_1=h,- and g,+2,-=—h,-, leads to the
following corollary.
Corollary 4.4: Let (P’) be the extended program stated above for convex functions
feCl and g,eC1, 1SiSr, and convex or concave functions hjeCl, lSjSm. Let
L(s,x) and s be correspondingly deﬁned. Let x), be a minimizer of L(s,,,x).
Assume x), —)f for some £60, where 0 is the set of minimizers of (P’). Suppose 0
is bounded and contains only regular points. Then for It,- convex, skg,‘.‘,2,-_1(x,,)-9u,-
and soggy (x), )-)0; for hj concave, skg,12j-l(x,,)-)0 and skg,12j(x,,)—)(—tlj ), where
uj is the corresponding Lagrange multiplier at x‘. [3
Proof: Rewrite equation 3.4 as

us. x) =f(x) + g iot+<x>>2+ feta-ha)? + ions-(xi)? (4.38)
i=1 j=l

i=1

Applying Theorems 3.3 and 4.4, it follows that if h ,- is convex then
skg,’f,2j_1(x,,)—)ll,- and soggy-(o )—)0. By applying Theorem 3.3 and Corollary 4.3 it
follows that if It, is concave, spg,12j-1(x,,)—)0 and skgﬁozj (x), )—)(-|.lj ). D

In fact, since g,";2,-_1 (x) and g,";2,-(x) are continuous and mutually exclusive for
lSjSm , there exists a sufﬁciently large s), such that g,*;2,-_1(x,,) > 0 and g,T,2,-(x,,) = 0

for hj CODVCX, 811d 8,12j_1 (1*) = 0 and g,++2,-(x,,) > 0 for hi concave.

Proposition 4.16: Let so be the parameter stated above. Then L(s,,,x) is a convex
function in a neighborhood of x,, . D

68

Proof: Due to the choice of s,, and the continuity of {g,-+},'I12"‘, there exists a neigh-

borhood N of x,, so that for xeN, s,g,+;2,._, (x) > 0 and soggy-(x) = 0 if h, is con—
vex; skg,‘.*,2j_1 (x) = 0 and skg,‘_‘,2,-(x) > 0 if h,- is concave. Now for xeN, if hj is

convex,
(atom at»2 + (erotic)? = (3.12)-. (x »2 = (hj“(x))2
is convex since It,” is convex. Similarly for xeN , if h,- is concave,
(rater (x »2 + (gape)? = (gas,- (x ))2 = «are»?
is convex since (—h,- )+ is convex. As a sum of some convex functions, it follows that
L(s,,,x) is convex. D

To see that It," is convex provided h ,- is convex, consider the following lemma

Lemma 4.3: let g (x) be a convex function on a nonempty convex set X cR " , then

, _ g(x) ifgtxl>o
g (x)- 0 ifgu)g)

is a convex function over X . D

Proof: Choose x, y eX. By the convexity of g , it follows that for 0951
sax-«14034) S 1300+ (14030)

S MIG!) + (1-1)8+()').

If g(lxi-(l-lc)y)>0, then g+(2.x+(1-lt)y) =g(1x+(l-lv)y). and we are done. If
g (2.x +(1-1)y )50, then

Elia-”(140” = 0 S 48"(1) + (140840)

and the proof is complete. D
The neighborhood mentioned in Proposition 4.16 need not to be a R " ball

around x,,, i.e., B8 = {xeR"| |lx-x,,||<e}. In fact, it may be any shape convex set in

69

which L(sk, x) is convex. By Proposition 4.16 there exists an s>s,, and a neighbor-
hood N such that E 2(x) is convex over N. Thus if the boundedness of E 2(X) is
assumed, then the network formulation of equation 4.1 is again useful to converge
locally to one of the minimizers of E2(x) provided the starting point lies in N. If the
set 0 of minimizers is isolated, there is more than one neighborhood and E2(x) is
convex over each one. To restrict the discussion to the local stability of equation 4.1
over a particular neighborhood, assume there is only one isolated minimizer set, and

correspondingly one such neighborhood.

Proposition 4.17: let (P’) be a program in its extended form such that f e C1 and
g,-eCl, ISiSr, are convex functions, and hjeCI, lSjSm, are convex or concave
functions. let N be a neighborhood that satisﬁes Proposition 4.16. Suppose that the
set of minimizers of (P’) is bounded and contains only regular points. Then if the
program (PP) is replaced by a program (P) in Propositions 4.11 - 4.14, Theorem 4.3.
and Corollary 4.2, the results still hold locally over N. D
Although in Proposition 4.17 the stability of equation 4.1 is gude only

locally, in practice the N neighborhood can be very large as will be shown in some
examples. Also in Proposition 4.17 the convexity and concavity are assumed
throughout R " , but this is not absolutely necessary so for the proposition to hold. It
is sufﬁcient to assume that the convexity and concavity of the corresponding functions
hold on a nonempty open set, say OCR " , and restrict all discussion to (2 rather than
R" . That is to

MlmmIZC' ' ' f (X)

subject to g,-(x)S0, for i=1 to r,

hj(x)=0, for j=l to m,

and

70

er,

where f and g,- ’s are convex, differentiable on the open set (2, and h,- ’s are convex or

concave, differentiable on 9.

let f :X —-)R , where X is a nonempty convex set in R". The function f is said

to be quasiconvex if, for each x, and xzeX , the following inequality is true:
f Our + (1401(2) S maxtf (xi). f (1:2)} for each 460.1).

The function f is said to be quasiconcave if -f is quasiconvex. Let S be a
nonempty Open set in R " , and let g :S —>R be differentiable on S. The function g is
said to be pseudoconvex if for each x 1, x26 S with Vg (x 1)T (x 2—x 1)20 we have
g(x2)2g (x1), or equivalently, if g(x7)<g(xl) then Vg(xl)T(x2-x1)<0. The function
g is said to be pseudoconcave if -g is pseudoconvex. The pseudoconvexity of f
ensures that if Vf (f)=0, then f is a global minimum of f. Figure 4.4 shows some
examples of quasiconvex and pseudoconvex functions. Both quasiconvex and pseu-
doconvex functions assure that there are only global minimizers. But quasiconvex
functions may have saddle points and pseudoconvex functions may not. Therefore,
for pseudoconvex functions, the point at which the gradient vanishes is a global
minimizer. Note that under diﬁerentiability, convexity implies pseudoconvexity, and

under lower semi—continuity, pseudoconvexity implies quasiconvexity.

Thus, for a program (P) the results of Theorem 4.4 still hold by properly relax-

ing the convexity assumptions.

Theorem 4.5: Let (P) be a program in the notation described in Section 3.1. Let f
be a feasible solution to (P). Suppose that r is a regular point. Further suppose that
f is pseudoconvex, g,- is quasiconvex for 1'6! (1?), and hi is quasiconvex for 151' Sm .
And all are differentiable at x'. Then i? is a global optimal solution to (P) if and only
if there exists X=Dt1 1,]720 and [.l=[].l1 um]7>0 together with x' that satisfy

71

saddlg point
W paint A

(a) (b)
(C) ((0

Figure 4.4. (a) Quasiconvex. (b) Quasiconcave. (c) Pseudoconvex. ((1) Not Pseu-
doconvex and not quasiconvex.

(i) log-(f) = 0 for i=1,...,r

and

f m
(ii) VfO?) + 24iV8i(X-) + EMthjO?) = 0. Cl
i=1 j=l
Corollary 4.5: Let (P) be a program with same notations and assumptions as in
Theorem 4.5 except that h j, lSjSm , are all quasiconcave. Then if is a global optimal
solution to (P) if and only if there exists hot, 1,]T20 and ll=[ll1 um17<0

together with x' such that the conditions (i) and (ii) of Theorem 4.5 hold. El

72

Proof: Same proof as Corollary 4.3. C]

Now following a similar argument used to derive Proposition 4.17 it can be
shown that the network formulation of equation 4.1 is also useful to obtain (locally)
an approximate solution to (P) with proper pseudo— and quasi-convexity assumptions.
Corollary 4.6: Let (P’) be an extended program such that f e C1 is a pseudoconvex
(:1 function, g,- is a quasiconvex Cl function for 1sisr, and hjecl is either
quasiconvex or quasiconcave for 151' Sm. Let N be a neighborhood that satisﬁes Pro-
position 4.16. Suppose that the set of minimizers of (P’ ) is bounded and contains
only regular points. Then if the program (PP) is replaced by a program (P) in Pro-
positions 4.11 - 4.14, Theorem 4.3, and CorOllary 4.2, the results still hold locally

overN. CI

Note that the network formulation of equation 4.1 may be cOnsidcred as a con-
tinuous approximation of the gradient projection method [147]. Figure 4.5(a) depicts
the idea of the gradient projection method. The negative gradient of the objective
function is projected onto the tangent surface of the active constraint set in order to
ﬁnd a point y . Then a new point x,,+1 is found along the direction perpendicular to
the tangent plane of x,,. Figure 4.5(b) illustrates the dynamics of equation 4.1 when
the trajectory is on the boundary of the feasible region. The linear programming
problem with hypercube feasible region described in Section 3.3 is a good example
for illustrating the similarity between the gradient projection method and the dynamics

of equation 4.1.

73

 

(b)

Figure 4.5. (a) Gradient projection method [147]. (b) The dynamics of equation 4.1 on
the boundary of the feasible region.

74

4.4 A Two-Phase Optimization Network

In previous sections of this chapter it has shown that there exists a sufﬁciently
large s so the network formulation as described by equation 4.1 is guaranteed to con-
verge to an approximate solution for a large class of nonlinear programming prob—
lems. In what follows, a two—phase optimization network model, which can obtain
both the exact solution to the problem as well as the corresponding Lagrange multi-
pliers associated to each constraint is proposed. For linear programming problems,
the network solves both the primal and the dual problem exactly.

For the sake of argument, assume, unless otherwise explained, that the program
(P) considered in this section is a convex program for which f 6C1 and g,-eCl,
lsiSr, are convex functions, and hjecl, lsjsm, are afﬁne functions. It is clear
that if the set M of minimizers of (P) contains only regular points, for feM it fol-
lows that q 5 card (I (17)) s n . The penalty function used here is

Ms. x) =f(x) + ﬂioﬂxlf + ire-Rn]. (4.39)

i=1 j=l

It is formed based on the program (P) rather than on its extended form.

The block diagram of a two-phase optimization network is shown in Figure 4.6.
The network operates under different dynamics as the phase is changed by a predeter-
mined timing switch. For 0St<tl, the network operates according to the following

dynamics:
at = -Vf(x) — s[Vg,(x)gf(x) + Vh (x)h(x)]. (4.40)
It is just in this case the same as the system described by equation 4.1 except that the

sign of h ,- for 15 j Sm may be either positive or negative. The subsystems within the

two large rectangles do not contribute for t<tl. When tZtl, the dynamics of the

75

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

' ' “to“ . ‘5‘“

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 4.6. The block diagram of the two-phase optimization network.

76

network becomes

ii = —Vf (x) — Vg,(sgf + 2.) — Vh (sh + u), (4.41)
i = soar). (4.42)

and
ti = 8(Sh ). (4.43)

where e is a small positive constant. For this network there is no restriction on the

initial condition of x while the initial values of It. and it are set to be zero vectors.

Due to the network formulation it is easy to check that the equilibrium of the
system occurs when gf(x) = 0, h(x) = 0, It > 0, and

Vf (x) + Avgyot) + th (x) = 0. (4.44)

But this actually satisﬁes the optimality conditions of Theorem 3.3, and thus a equili-
brium point of the two-phase network is nothing but a global minimizer to a convex
program (P ).

The rationale behind the two-phase network formulation is the following. In
phase 1 (t<t1) it follows from Theorem 3.3 and Proposition 4.15 that for a sufﬁciently
large s , equation 4.40 converges to an equilibrium i at which sg,-"'(x) and shj (it) are
very close to 2., and ll,- , respectively, where IL,- and it} are the corresponding
Lagrange multipliers deﬁned in Theorem 3.1. Assume s is chosen such that
J(i) =I(x') for EEO, the minimizers of (P). By choosing t1 properly, the trajectory
of the system can be assumed to be within a small neighborhood of i, say
865,8) = {xeR’Hlx-filkS], such that the approximation of k,- and ll,- by sg,-+(x) and
sh ,- (x) respectively is qualitatively preserved for x EB (x ,5).

In Phase 2 (t2: 1), the network begins to shift the directional vector sg,-+(x) gra-
dually to 1,, and shj(x) to llj. By imposing a small vector 8, the updating of

77

equations 4.42 and 4.43 is comparatively much slower than that of equation 4.41.
Approximation of such dynamics is possible by considering 7t. and [.l. to be ﬁxed.
Then it can be seen that equation 4.41 is seeln'ng a minimum point of the augmented

Lagrangian function

Lo(s, x) =f(x) + ”g(x) + llTh(x) + %[llg+(x)||2 + ||h(x)||2]. (4.45)

With a small enough 8, the stability of equation 4.41 is qualitatively preserved and the

system is driven toward a equilibrium point, which is a minimizer of (P).
In practice 8 may be chosen to be 2%- leaving the network with only one

preselected parameter. But using a 8 independent of 3 gives more freedom to control
the dynamics of the network. Also if the initial condition of x is in the feasible '
region, simulation results show that phase 2 of the network alone is sufﬁcient to
ensure the convergence to a minimizer of a convex program (P). For a more general
program such as those covered in the last section, the convergence is only locally
assured.

The network formulation proposed here is similar to the multiplier method, or the
augmented Lagrangian method [147,152]. The multiplier method for the equality

constrained problem (EP)
minimize f (x)
subject to h(x) = 0

is to transform the problem into a successive process. Within each iteration an x,, is

sought to minimize
l,(x,ll,,)=f(x)+tl{h(x)+%||h(x)||2. (4.46)

If x,, is found, then ll), is updated according to

Ilk+1 = 111: + Sh (xii)- (4.47)

78

Then, an x“, is sought to minimize I, (x, um): and so on. Equation 4.43 is the
same, in essence, as equation 4.47 with the former being a continuous approximation
to the latter.

In practice, the multiplier method converts the inequality constraint g ,- (x) S 0 to
an equivalent equality constraint g, (x )+z,- = 0 by adding a dummy nonnegative vari-
able 2, . Then the problem is solved successively by ﬁnding x,, and 2,, 20 that minim-

ize
f(x)+2{[g(x)+z]+-;—||g(x)+z||2, (4.48)

and updating 2,, similar to equation 4.47. Finally, a mapping scheme similar to the
two-phase network can be formed to deal with only equality constraints. The state

variables of such a network include x , z , 2., and tr.

CHAPTER V
NETWORK SIMULATION

To demonstrate the behavior of the networks proposed in the last chapter and to
validate their properties, some examples for various problems have been performed
using the ACSL (Advanced Continuous Simulation Language) software package (Sun
version). The numerical algorithm used to integrate equations 4.1 and 4.38 to 4.41 is
a 4—th order Runge-Kutta method.

5.1 Linear Programming

For linear programming, consider the following problem (LP 1) taken from [6]:

Minimize f(x)=-x1-x2

subject to
-i _ _3_5
81(X)— 1211 I2 12 S0,
5 35
82(1) = 311442" 7 S 0.

g3(x) = -x1— 5 .<. 0,

g4(x) =x2 - 5 s 0.

79

80

The corresponding feasible region is the gray area in Figure 5.1.

 

Figure 5.1. The feasible region of (LP,).

It is easy to verify that the optimal solution to this problem is x=(5.0, 5.0)T and
the corresponding Lagrange multipliers are 21:0, 21:04, 23:0, and 24:06. To illus-
trate the variation of E2(x) with respect to different values of s , the contours of E 2(X)
for s=l and s=10 are shown in Figure 5.2. Comparing Figures 5.3(b) and (d), we see
that the larger the value of s, the closer the minimizer of E 2(x) to the optimal solu-
tion. This is implied by Theorem 3.3. Since E 2(x) is radially unbounded, the unique
equilibrium of equation 4.1, i.e., the minimizer of E 2(X) is thus globally asymptoti-
cally stable for some large s. For this example, s 2 1 is more than sufﬁcient. The

equilibrium point of the network and the terms sgj+(x) are given in Table 5.1 for

81

 

 

 

Ov

 

-2 . Sur-

 

-5..

 

 

-7. 51b

 

-10d I
=10 -7.5 -5 -2.5

U

‘P

u)-

p

o q)-
L)

04»
N
w
we»
4
(ll

J‘1

Figure 5.2. Contours of E 2(1) for (LP 1). (a) The contour for s=10.

12

U‘
ax

5

0

L"
N

(I.

4.8

82

 

Jrr

Figure 5.2. (cont’d.) (b) Detailed plot of (a) around f=(5, 5).

x2

83

7.54-

-2.s«»

-54.

-7.5<-

 

 

‘10". L I v
-10 -7.5 -5 -2.5 0 2.5 5 7.5 13

Figure 5.2. (cont’d.) (c) The contour for s=1.

84

 

 

 

5.6
5.4
12
5.2<P
S4)-
4 8 1 L A g
4.6 4 8 S 5.2 5 4
x1

Figure 5.2. (cont’d.) (d) Detailed plot of (c) around f=(5, 5).

85

Table 5.1. The equilibria of the network for (LP 1) for different values of s.

 

 

Parameter Equilibrium 52 r -f H sgf‘Ct)

s= 0.2 (4.600, 8.000) (0.400, 3.000) (0.0, 0.4, 0.0, 0.6)
s= 1 (4.920, 5.600) (0.080, 0.600) (0.0, 0.4, 0.0, 0.6)
s= 2 (4.960, 5.300) (0.040, 0.300) (0.0, 0.4, 0.0, 0.6)
s= 10 (4.992, 5.060) (0.008, 0.060) (0.0, 0.4, 0.0, 0.6)

 

 

 

 

 

different values of s . It is evident that IL? -x ll decreases with respect to the increase of

s. It is in fact linear.

To demonstrate the dynamics of equation 4.1 for this problem, trajectories with
various initial conditions have been plotted on Figure 5.3 for s=10. Figure 5.3(a)
illustrates the trajectories converging to the equilibrium i=(4.992, 5.060) whereas Fig-
ure 5.3(b) shows the trajectories around 52 . The sliding eﬂ‘ect of each trajectory along

the active constraint can be seen clearly in these ﬁgmes.

Next, the 2-phase network formulation is applied to this problem. In phase 1,
the trajectories are identical to those in Figure 5.3. In phase 2, the trajectories are
slowly moving ﬁom x to 'x' as shown in Figure 5.4(a) with s=10 and $0.2. The
corresponding trajectories of the objective function and E g(x) With respect to time are
given in Figure 5.4(b). It is clear how they approach asymptotically the optimum
value ~10.0. In Figure 5.4(b) the line atop is the trajectory of E2(x) which is slightly

large than f (x) as long as there are constraint violations.

As mentioned in Section 4.4 if the initial point is in the feasible region, the net-
work structlue in phase 2 is suﬂicient to converge to a minimizer. To see this effect,
a simulation using only the phase-2 network structure is done for s =10, e=0.2, and

xo =[4.8, 4.817. The simulation results show that the trajectory of x is nearly identical

 

86

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

o
(a) CF n r-
"'h U
8 K C
a /
{>108 \-———-'—’l
x /
a /
:3
LP
0
‘3' ti ‘
-- u
L’10.0 '5.00 0.00 5.00 10.0
X ( 1)
(b) g
UlI.‘ '3 r.
o /
9.
Lr; A
C)
o
AD I-
emi r
X
c:
o
or
v.
o
o
m K
at. it {a
4.800 4.900 5.000 5.100 5.200

X(1)

I:igur'e 5.3. Trajectories of equation 4.1 for (LP,) with s=10. (a) A broad view. (b) A
View around it = (4.992, 5.060).

87

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(a) §
to
O
.9
m.
D \
O
AD
mm
X
C)
O
07
V.
O
O
03
V4. 800 4. 900 5.000 5.100 5.200
X(1)
(b) 8 8
0') C7
9'7 9'
Q Q
03 CD
C2" 9'
L0 L0
33 <2:
cm or
.. ._ .
4x“? 2?
x
5 A
,3 e
a We
CD CD
C?" ‘r'5
L—
53 .9.
0.] O.
'7 70.00 10.0 T20.0 30.0 40.0

Figure 5.4. Trajectories of the 2-phase network for (01) with s=10, e=0.2. (a) Tra-
jectory of x. (b) Trajectories of E 2(X) and f (x) with respect to time.

88

to the one in Figure 5.4(a). The corresponding trajectories of E 2(x) and f (x) are
given in Figures 5.5(a) and (b), respectively. The lines atop in Figure 5.5 are for the
case using only the phase-2 network structure. It is clear that they begin to exhibit

the asymptotical behavior only after few generic time steps.

If the initial condition is not in the feasible region, using only the phase-2 net-
work structure may not lead to convergence. This can be seen by that fact that equa-
tion 4.40 only increases the value of 2. For ﬁxed s and e, choosing any initial point
from the infeasible region will result in a positive value of 2.,- for some i. If the irri-
tial point is far enough, 2,- will become larger than the correct Lagrange multiplier
before the state trajectory approaches the boundary of the feasible region. Once this
happens, there is no way to bring the value of the over-estimated 2,- back down.
And, the trajectory remains in the infeasible region since it can not enter into the
feasible region except through the minimizer with correct Lagrange multiplier. Even-

tually, the system diverges.

However, if the initial condition is restricted in a bounded region, it is possible
to ﬁnd a small enough 8 such that the phase-2 network structure is stable over this
region. But the drawback is that the smaller the value of e, the slower the rate to
equilibrium.

To illustrate the phenomenon described by Fact 4.1, consider the program (LP 7):

Minimizef(x) = -x2

subject to the same constraints as (LPl). Notice from Figure 5.1 that the minimizers
of (LPZ) lie on the line segment x2=5 within the feasible region. If s is chosen to be
2 in equation 4.1, then its equilibrium points are the line segment x2=5.5 within

5 35
g(x) = 3x, +x2- 3— $0

89

(a)

'0.960

 

 

-0.972

 

E2(X»(10
“0.985

 

~0.997

 

 

 

 

 

 

“1.010

 

0.00 10.0 .E0.0 30.0 40.0

Cb)

 

-0.960

 

'0.972

-0.985

 

f (x)x10

“0.997

 

 

ﬁr

0.00 10.0 .F0.0 30.0 40.0

 

 

 

 

 

 

“1.010

Figure 5.5. Trajectories of E2(x) and f (x) for phase 2 with s=10, 8:02, and
JCo=[4-8. 4.8]T. (a) Trajectory of E2(x ). (b) Trajectory of f (x).

90

and

g3(x) = —x1 — 5 .<_ 0.

In this case, the size of the equilibrium set is smaller than the size of the mimmizer
set. Trajectories of equation 4.1 near (5.0, 5.5) are shown in Figure 5.6(a). If we
raise s from 2 to 10 after the system settles on the line segment x2=5.5, then all the
trajectories will move in parallel to x2=5.1 as shown in Figure 5.6(b). This is exactly
what have been described in Fact 4.1. Note that the trajectories of s=2->10 are
different from that of s=10. The latter have been shown in Figure 5.6(c) as contrast
to the trajectories in Figure 5.6(b).

 

 

 

 

 

 

 

 

 

 

 

 

 

O

m [- r-

.o' {3 ’2 .

o / /

O h

(I; r )
(SL0- , / 11
X

D

O. \

0

Ln

val—pf

4.00 4.50 X01100 5.50 6.00

Figure 5.6. Trajectories of equation 4.1 for (LPZ) near (5.0, 5.5). (a) s = 2.

(b)

(c)

91

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0
Ln n 1'- I-
6 1:3 I
o //
D
as //r “
8 /
a”; / 11
x
8
U). L/
0
”3H,,
'4. 00 4. 50 X ('51.)00 s. 50 5. 00
o
I.”
(D. n /' I: I”
8. / /
o / /
Am
Ntn'
x (
8 \/ 0
In.
0
L”
4‘4
4.00 4.50 5.50 5.00

5.00
X(1)

Figure 5.6. (cont’d.) (b) s = 2 -) 10. (c) s = 10.

92

For the case where the size of the equilibrium set is larger than the size of the

minimizer set, consider the program (LP 3):

Minimizef(x) =x1

subject to the same constraints as (LPl). From Figure 5.1 it may be seen that the
minimizers of (LP?) lie on the line segment x 1=-5 within the feasible region. Choos-
ing s=2 in equation 4.1, the equilibrium points are the line segment x2=-5.5 within

__5_ _ -22
8100-1211 x2 1250

g4(x) =x2 - 5 S0.

Trajectories of equation 4.1 near (-5.5, -5.5) are shown in Figure 5.7(a). If we raise s
from 2 to 10 after the system settles on the line segment x1=-5.5, then all trajectories .
will move in parallel to x1=—5.l as shown in Figure 5.7(b).

The trajectory curving toward the upper-right in the middle of Figure 5.7(b) is
due to the fact that the size of the equilibrium set is reduced as s increases. (In fact,
the equilibrium set would eventually be identical to the set of minimizers when s
becomes inﬁnity.) This trajectory would have been a straight line along g1(x)=0 if s
had been changed continuously. Since s=2—910 abruptly, the early stage of this tra-
jOCtory tends to move strictly to the right and thus results in new constraint violation
as it crosses to the other side of g 1(x )=0. Now the asymptotic nature of the system
talias place to remedy such a violation and drives the trajectory toward the correct end
P0th of the equilibrium set. The trajectories of equation 4.1 with s =10 are given in

Figure 5.7(c) for comparison to the trajectories in Figure 5.7(b).

93

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(a) 3.
.-
t ,4. .
t- I
D
O
ui 3
' 13 a
D /
Am
0, .
stay/7
8. //
to 1
't/
O
1.0
to
“6.50 -6.00 -s.so -s.00 -4.so
IX(1)
(b) trot
4:
I i- It.
I...
O
o- t
L.“ i

 

H
Vol

 

o /

m

m' '
I

 

 

 

 

 

 

 

a r
x (:—"’/1//

8 /

to / 1:

ti/

8

to

~6.so —6.00 -s.so -s.00 -4.so

X(1)

E3111? 5.7. Trajectories of equation 4.1 for (LP3) near the point (-5.5, -5.5).
(a)s =2. (b)s =2—) 10.

94

 

-4.SO

 

 

*5. 00

 

/

 

 

 

 

V1

/

 

 

om.T

 

co .m:

Sour.
ANVX

 

 

/

co .wt

omd

“S. 50

X(1)

'6. 00

I'6. 50

= 10.

Figure 5.7. (cont’d.) (c) s

95

5.2 Quadratic Programming

For quadratic programming, consider a program (QP l):

minimizef(x)=x12 +x22 +x1x2+3x1+3x2

subject to the same constraints as (LP,). The minimum of f (x) occurs at
x, = x2 = —1. Since the unique minimizer lies in the interior of the feasible region, it
follows from the theorems derived in Section 4.2 that the unique equilibrium of the
(QP) network is exactly the minimizer. The simulation results of the trajectories of x
for the correspondingly formed (QP) network are shown in Figure 5.8. The equili—
brium r = [-l, —1]T clearly exhibits asymptotic stability. There is no need for using

a 2-phase network structure.

10 0
to
P
1

 

 

500
\

 

X(2)
9.

1

“S 00

 

 

 

 

 

 

r10.0

 

10.0 -s.00 0.00 5.00 10.0
X(1)

Figure 5.8. Trajectories of x of the (QP) network for (QP 1).

96

Suppose now the objective function in (QP 1) is replaced by

f(x)=x12 + x22 +rpt2 — 30x1 - 30x2,

and this new program is denoted by (QP 2). The unconstrained minimizer of f (x) of
(QP?) is [10, 10]. The simulation results using equation 4.1 are given in Figure 5.9.
Wherever the initial points of x are, the trajectories approach to the equilibrium
x = [4.97778, 5.174517.

It can be shown that the minimizer of (QP?) is r = [5.0, 5.0]T at which J(r) =

{2,4}. Solving the equation

Vf(f)+ 2 lngr-(f)=0.
i610?)

we get the corresponding Lagrange multipliers, 21:00, 24:60, 23:00, and 24:90.

 

10 0
ti

3

*1

 

5.00

 

000

 

 

 

 

 

 

 

 

a r~
. x /
O
D
LP'
0
2 r: o
L'10.0 ‘5.00 X61000 5.00 10.0

Figure 5.9. Trajectories of x of equation 4.1 for (QPZ) with s=50.

97

Using the 2-phase network formulation for (QPZ), a simulation is run with s=50,
8:02, and initial condition xo =[4.8, 4.8]T. The trajectories of x and 2 are shown in
Figure 5.10(a) and (b), respectively. The network is switched from phase 1 to phase
2 at T=2.0. Figure 5.10 clearly shows the dynamics of the network during phase 2 in
which trajectories moves asymptotically toward the equilibrium f = [5.0, 5.0]T and
2 = [0.0, 6.0, 0.0, 9.017.

To make the example more representative, add one equality constraint x1 = 3 to
(QPI), and call this new program (QP3). In the network formulation this equality

constraint is replaced by
g5(x)=x1- 3 S 0

g6(x)=-x1+3 50

The minimizer of (QP3) is f = [3, -%]T at which 1(f) = {1,6}. The theoretical

values of Lagrange multipliers are

8 76'
A: —9 s 9 a 9—1.-
[3 0000 9]

Simulation results for the network of equation 4.1 for (QP 3) are shown in Figure
5.11(a) for s=50 and 8:0.2. All trajectories lead toward the equilibrium point
it = [2.84279, -1.77791]T. For the correspondingly formulated 2-phase network, a
simulation is performed with the initial condition xo = [2.5, -1]T and the same s and
8. The resulting trajectory of x is given in Figure 5.11(b). Again, the 2-phase net-
work demonsuates its capability to tune the state variable x to the exact minimizer.
Though not shown in ﬁgure, the ﬁnal values of 2(t) for the 2-phase network are the
exact Lagrange multiplers described above.

As mentioned in Section 4.2 one of the applications of the quadratic program-

min g optimization network is for solving the least squares problem Bx =b . For

98

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

CD
(a) g
to
8 \
to
O
O
no
(\m'
X
D
O
m
,,-
O
O
°°.
v4. 800 4. 900 5.000 5.100 s. 200
X(1)
0’) 8 D. 8 O.
._°- 52— __-- 2
,—
tn o in c: -“ /
N m ix tn
CD." N.‘ C54 t\ /
a 8 a 8 /
£6‘£d‘£d‘£m’ e
1.0 O I.” o
N If) N m
d- N" to" oi
OJ 0 D O
O O C) C)
o‘ o" o" o’
.00 10.0 T20.0 30.0 40.

Figure 5.10. Simulation results of the 2-phase network for (QP?) with s=50 and
8=0.2. (a) The trajectory of x. (b) The trajectories of 2.,-.

99

 

 

 

 

 

 

 

 

 

 

 

(a) O.
Dr D
"" 1r
I‘{
o
Aoof5 .-
No a _—————-/ '
X 1
<3 //77//A
c:
L.“ /
o
E? 11 LL
“10.0 'S.00 0.00 5.00 10.0
X ( 1)
(b)

*1.00

 

-125

 

 

 

 

 

 

 

 

 

 

O
Am
~34
X
m /
rx
7‘
O
D.
N
' 2.50 2.75 3.00 3.25 3.50

X(1)

Figure 5.11. Trajectories for (QP3) with s=50 and moi. (a) Using the network of
equation 4.1. (b) Using the 2-phase network.

100

illustration sake, consider the following least squares problem (LS) for which

I .

H

HO

m

ll
t—IHHD—lt—b
OOHHH

 

 

and

 

 

The least squares estimator calculated by the normal equation is

-1
5i =[BT31-IBTb = 10 .
-3

Since there is no constraint in this problem, E 2(x) = f (x) = ~34le -b 112. Also since

B is of full rank, E 2(x) is strictly convex and the unique equilibrium i=[-1, 10, —3]T
of equation 4.32 is globally asymptotically stable. A simulation with initial condition
x=[0, 0, 0]T has been performed and the trajectories of x and of E2(x) are plotted in
Figure 5.12. Note that the xj, ISan , does not approach 1, in a monotonic (increas-
ing or decreasing) manner as seen in Figure 5.12(a). But the network does converge
monotonically in the sense of E 2(x) (see Figure 5.12(b)). More importantly, even
though the network has not yet converged to its equilibrium, E 2(1) becomes very
close to its ﬁnal value in only few generic time steps. Eight more simulations with
various initial conditions were done to vindicate the stability of ii and the results are

shown in Figures 5.l3(a)-(h).

101

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(a) 3 5’.- ‘3-
r
8 8 8 /
«5'1 tar as
8 8 8
or .5- to
3:. K1 5 [
x8 x8 x8
or ~‘« s;
8. 8. 8. _
8 8 8
v" 1:4 v.
' . 10.00 4.00 8.00 T 12.0 16.0 20.0
(b) 8
a;
8
(it
8.
23
x
r 3
N8
tries
91
6k
8.
c’0.00 4.00 8.00 12.0 16.0 20.0

T

Figure 5.12. The simulation results of the (LS) problem with initial condition
xo =[0, 0, 0]T. (a) The trajectories of the state variables. (b) The trajectory of E 2(x ).

102

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(a) 0. =2 =2

...'1 :1 :

ff

3 o o /

c o o

to" to" a5

8 8 8

to“ 1.1")"1 tn'

3:. K] (7') /

N'" N'" N

8. 8. 8. m

8 8 8 ‘

v" v.‘ v.

' l ' .00 4.00 8.00 T 12.0 16.0 20.0
(b) =1 0. O.

:1 ::q :

W

8 8 8 /

an" an" to

8 8 8 \ /

or" 1:1" 111'

:- N :7)

N" N" N

8 8. 8. [\¥

8 8 8

v‘J v" v.

' ' ' .00 4.00 8.00 T 12.0 16.0 20.0

Figure 5.13. The trajectories of x for the (LS) problem with different initial condi-
tions. (a) xo 4.4, -4, 417‘. (b) xo =[11, -4, 4?.

(o)

(d)

103

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

=1 0. =1
:1 Z‘ :1?
8 8 8
06“ do" as
8 8 8
.o" to" to
Z (:1 8
s." 8H s;
8. 8. 8.
8 8., 8.
T 3' 70,00 4.00 6.00 T 12.0 16.0 20.0
D. D. 9
:4 :- 2%
Pi
8 8 8 /
«:51 «5" a5
8 8 8
ui- .o- to
2. a 8
x8 x8 x8
~'- N" ~'
8. 8. 8.
8 8 8
v.4 '3" v“,
1 ' «1.00 4.00 8.00 T 12.0 16.0 20.0

Figure 5.13. (cont’d.). (c)xo=[-4, 11,-41T. (d)xo=[11, 11, 417.

e)

(f)

104

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0. O. O.
:- :- :5
If
8 8 8 ,///
8- e~ e
8 8 8
m~ h~ m
2 N 8 /
x8 x8 x8 A,
~~ Ni N
1
8 8 8 '
u-"1 a..." .—: k \W
o 8 8
O
77 I 7.00 4.00 ‘ 6.00T 12.0 16.0 20.0
o_ Q. ‘3.
f
8 8 8 ,//
to" 85" to
8 8 8
8- mi 8
2 E 8
x8 x8 x8
as ~- 8 /
3 8. 8. \\
Q o 8
O D
I 7 78m mm 8m1_ we we mm

Figure 5.13.(cont’d.). (e)xo=[-4,-4, 1117‘. (f) xo=[ll,-4, 111T.

104

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

0. O. ‘3.
r'—‘
8 8 8 /
8“ «5q tn'
8 8 8
m- as m
a 8 8 1
N- N8 N
'1
8 8. 8. a.
8 8 8
o, 44 o?
- . . 00 400 8.00T 120 m0 m0
/
8 8 8 /
Q. DJ 0
8 8 8
1n" tn" tn
2 8 8
x8 x8 x8
N." N" N/
5’. 8.] 8.;
T T 7.m mm am1_ we we we

Figure 5.13.(cont’d.). (e) xo=[-4, .4, 1117. (r) xo=[ll,-4, 111T.

(E)

(h)

 

X(1)

 

X(2)

 

X(2)

 

Figure 5.13

8.00

X13)

8.00

5.00

X131

2.!!!

5.00

“1.111

105

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

8.00

12.0

16.0

 

 

 

 

 

 

rill)

 

 

 

 

 

 

 

 

“4.00

8.00

T

12.0

16.0

.(cont’d.). (g) xo=[-4, 11,111T. (h) xo=[ll, 11, HF.

106

5.3 Nonlinear Programming

Consider the following program (NP 1):
Minimize f(x) =x,2 + (x2 -1)2
subject to g(x) = x2 — x12 = 0.

Note that f (x) is strictly convex on R 2 as shown by the following derivation. For
2.8(0, 1),
f 09: + (1-2)y)
= 0x. + (May 112 +012 + (l—Mya - 1)2
= our + (Hm? + mics-1) + (moo-1112
= 22x12 + 22(1-2.)x1y1 + (140ny + 2282—1)2 + 22(1-2)(x2-1)(yz-1)
= 420:? + (12’1)21 + (1— )2er + 02—021 + 280-481;». +(12“1)0’2“'1)1
= 81x? + (xi-021+ (22-2)[x12 + (12.1)21'1'0-MD'12 + 02-1121

= [(1-702 - (18)]er + 02-021 + zul—xxxryr + (xi-norm

ﬂ

= Mot) + (l-w 0) + W-l)ffH12“1)2+y12+02“1)2“2-Xry1402-0024)]

 

= if (x) + (1-2)f (v) + Ml—l) _(xr “ xi)2 + ((xo-l) - 02-1))2]
< 2f (x) + (1-7»)f 0’).

since 2(2-1) < 0 and (x, - y,)2 + ((xz-l) — (342-1))2 > 0. To verify the convexity of
g (x), by deﬁnition it follows that

80x + (1403’)
= 0112 '1' (1'4”?) ‘ (411 ‘1' (1’4” 122

= 71x2 + (1-20)., .. [7121.12 + 22(1-2.)x1yr + (14023121

107

= let, — x12) + (14002 - (140212) “ ”(l—W 1
= xg (x) + (1-2)g (x) — 2241-70101
s 2g (x) + (140g (x).

if xly120. Thus g(x) is convex on the closed half spaces {xeRzlleO} and

{xeR21x1S0}, but not on R2.

The program (NPl) is equivalent to ﬁnding points on the parabola x12=x2 closest
to the point (0, 1). To solve the problem precisely, substitute x12=x2 into f (x) and

 

solve
2 _
12+(12—1)—C.
This gives
1:1: )14c — 3
X2: 2 .

It can be seen geometrically that there is only one solution of x2. This implies

em, = % and x2 = -;-. Correspondingly, x1 = :t-‘(l—in Thus the minimizer of (NP 1) is
x = 8%. é)

at which f(x) = in
E 2(x) for the above program (NP 1) is
Etc) =x? + (xi— 112 + %(12 -x.2>2.

The contours of E g(x) for different values of s are shown in Figure 5.14. When 3 is
small, the shape of the contour of E 2(1) is dominated by f (x). As s increases, it
tends toward the shape of the parabola g (x) with two minimizers

 

108

 

 

 

 

 

 

 

 

1.75../ \
C j

 

Figure 5.14. Contours of E2(x) for (NP 1). (a) The contour for s=0.5.

V

Jr2

 

 

0.75
0.5"
0.25'-\ /
°‘§\ : ‘r : i t i :7 4
-1 -0.75 -0.5 -0.25 0 0.25 0.5 0.75 1

I1

Figure 5.14. (cont’d.) (b) The contour for s=5.

110

21-

1 .75"

0.75'

0.5‘

0.25:l

 

 

 

Figure 5.14. (cont’d.) (c) The contour for s=50.

111

derived in the following.

By direct calculation, it follows that

 

_ 2x1(sx12 -sx2 +1)
V52“) — -sx12 + (s+2)x2 — 2
and
2+s(6ot,2 - 2.x?) -2sxl
2 _
V 520‘) "[ -2sxl s+2 '

By setting VE2(x) = 0 the critical points of E 2(x) are found to be

 

=,(0 ——2'2—') and Xe2= (iVS—ZEE ,-l").

For x = ,1, the eigenvalues of V2E 2(x) are

4—2s
s +2

 

and (s +2)

and they are positive for s<2. Thus x = xol is the minimizer for E2(x) when s<2.

Similarly, the eigenvalues of VZE 2(x) at x = 1.2 are

 

(3s +2) 8‘! (3s +2)2 - 4(4s

-3)
2 >0

 

for s>2. Hence, xoz is the minimizer for E 2(x) when s>2. As s -—)oo, the minimizer

x2 approaches (i— —), which are the exact minimizers of (NP 1).

«1" 2
Converting the program (NP 1) into the form of equation 4.1, the simulation

results of the trajectories of x with s=50 and various initial points are given in Figure

5.15. The trajectories of x, except for initial points with x1=0, converge to either

521 = (0.69282, 0.5) or 22 = (-0.69282, 0.5)

112

 

 

 

 

 

 

 

 

 

 

o
O J‘
8'" r} '- 1— ll
ca
on
N / \
o
no
N4: 11‘
x
o
o
o- / \
o
c:
--' 1 U11.
‘“2.00 “1.00 X01100 1.00 2.00

Figure 5.15. Trajectories of x of equation 4.1 for (NP 1) with s=50.

.—

depending on whether x, of the initial point is greater or smaller the zero. In fact, it
can be shown that x, is asymptotically stable on {xeRzlxl>0] whereas 122 is asymp-
totically stable on {x eRzlxl<0]. Though it was mentioned in Section 4.3 that for
nonlinear programming the asymptotic stability of the equilibrium points of equation
4.1 are held locally for sufﬁciently large s , this example shows for (NP 1) that the
basins of attraction of the two equilibrium points nearly cover R 2 except for the line
x1=0. For s=50 if a trajectory starts out with x1=0, it stays on x1=0 and moves

toward

x = (0.0, 0.03846),

113

which is a saddle point originating from xel as s>2. Since in hardware irhplementa—
tion one can not select a point which is precisely zero, the effect of this saddle point
is actually unobservable. Thus the network may be thought of almost completely

stable in practice.

A 2—phase network has been used to shift the equilibrium points

s-2 i) to (i-l? 1

(i 23’2 (’3'

).

The simulation results are shown in Figure 5.16. The little hook near the end of the
trajectory is due to the effect of phase-2 dynamics. By the choice of s and e for the
2-phase network in solving (NP 1), the time for phase-2 to converge is roughly triple
of the time for phase-l. Generally speaking, the smaller 8, the longer the time for

phase-2 to converge.

Consider now the following nonlinear program (NP 7) quoted from [19]:

I?

Minimize f(x)=x12 +Jc22 —x1x2+0.4x2+ 30
subject to g(x) = x1 + 0.5x2 2 0.4,
g(x) = 0.5xl + x2 2 0.5,
83(1)=1120:
g4(x)=x220.

This is in fact a convex program because f (x) is a convex function and the feasibility
set is convex. The convexity of f (1) may be seen by deriving the Hessian matrix of

f(X).

OBI-+2 -1
szu)=[ _l 2 J.

114

(a)

 

0.75

. />

 

0.65

 

X(2)
0.55

0.45

 

 

 

 

 

 

0.35

 

0.60 0.70 0.80 0.90 1.00
X(1)

(b)

0. 75

 

055
/\

 

 

 

 

 

 

 

 

 

Ln
Am
No’
x \
-5
ID
V”
0'
Ln
m
o'
-1.00 -0.90 ~0.80 -O. 70 -O.60
X ( 1)

Figure 5.16. Trajectories of x for (NP 1) using the 2-phase network with s=10 and
8:02. (a) x0 = (0.75, 0.75). (b) xo = (—0.75, 0.75).

115

and showing that its eigenvalues
(0.1x1+2)_+\/0.01x12+1

are positive for x1 2 0.

Using equation 4.1 to solve (NPZ), the trajectories of x of are shown in Figure
5.17(a). The simulations are done with s=10. The equilibrium of the network is
given in the ﬁrst entry of Table 5.2 and contrasted to the equilibrium obtained by a
2-phase network with s=10 and 8:02. As seen from the table, the solution obtained
by the 2-phase network is very accurate. The trajectories of the 2-phase network with
initial points (0.25, 0.25) and (0.45, 0.45) are plotted in Figure 5.17(b). The line seg—
ment near the middle of the ﬁgure is caused by the phase-2 dynamics.

Table 5.2. The equilibrium points for different networks for (NP 2).

 

 

 

 

 

Network formulation x1 x2
Equation 4.1 with s=10 0.3023760 0.2825410
2-phase net with s=10, 8:0.2 0.3395630 0.3302180
Exact solution 0.3395628 0.3302186

 

 

As shown in Section 4.3 the network formulation of equation 4.1 may be used
ﬁnd minimizers of a pseudo-convex function subject to some quasi-convex and quasi-
concave constraints. If an equilibrium 2' of equation 4.1 occurs at the interior of the
feasibility set, V f (i )=0 must hold. By the properties of a pseudo-convex function it
follows that i is a global minimizer of f . Suppose now that the objective function is
f:X—>R and feel, where x is an open interval in R. Let (P1) be the problem to
minimize f over S , where S is a subinterval of X . Mapping (P 1) to the network of
equation 4.1, the network dynamics drive the trajectory continuously along the des-
cending direction on the surface of f (x ). The trajectory ends at either the boundary
of S or a point in S for which Vf (x)=0. If the former happens, then f is strictly

116

 

 

 

 

 

 

 

 

 

 

 

(a) g
4" 0 ".
K) /
O /
c: Z//////
,Jn
<3de ’"
>< /
#8 //////,////
/
O X
c:
c3
c5 ii t3
0.00 0.25 0.50 0.75 1.00
X(1)
(b) y;
0' ’3
S?
c5

 

035
\

 

 

 

 

 

 

 

 

‘3
X
c: [/////
m
d 0]
Ln
N
c5

0.25 0.30 0.40 0.45

0.35
X ( 1 )
Figure 5.17. Trajectories of x for diﬂ'erent networks for (NP 2). (a) Equation 4.1 with
s=10. (b) 2-phase network with s=10 and $0.2.

117

monotonic (either increasing or decreasing) on S . Both cases lead to an interesting

application, namely, solving p (x )=0 over an interval.

Let p :s ->R and peCz, for s an closed interval in R. Let (P 2) be the problem

to solve p (x)=0 for xeS . Assume (P2) is feasible, i.e., there exists an xeS such that

p (x )=0. Assume that p is strictly monotonic on S. Let E (x )=-%-p 2(x), then

For) =p’(x)p(x) (5.1)
and
15"0c)=<p’0c»2 +p(x)p”<x). (5.2)

For p (50:0 we have E’ (f)=0 and E" (x')=(p’ (f))2>0 by the strictly monotonic
assumption of p. Thus )7 is a local minimizer of E(x) and and a solution to p (x)=0.
In fact, 1? is also the global minimizer of E (x) on S, since the strictly monotonic

assumption assures that E (x) is pseudo—convex on S with a unique minimizer.

To relax the above argument to more general functions, assume that p e C 1.

Note that since p (f)=0, it follows

mm = limp—(W)
II—)0 h '

Now taking the limit of the diﬂ‘erence quotient of E’ (x) at f, we get

E'(x-+h) - E’(5:')

 

 

 

lim

h-)0 h

= lim p’<f+h>p em.) *- p’mpm
II—)0 h

= m p’(f+h )p (f+h)
h—-)0 h

= . r- . p(f+h)
$30? (““335 h

= (p’(f))2 > 0. (5.3)

118

Thus if is again a local minimizer of E (x) on S. By the suictly monotonic assump-

tion ofp(x) on S,i’is also a global minimizer ofE(x) on S.

Since E (x) is pseudo-convex in the above, choose the network

15 = -E’ (X) = -p (X)p’(X) (5.4)
such that
L35) =p(X)p’(x)i (5-5)
= -(p (x )p’ (x ))2.

Since by the assumption of strict monotonicity p’ (x )3'50 for x e S , the unique equili-
brium of equation 5.4 is f at which E (x) achieves its minimum. The idea behind this
technique is that since E (x) is pseudo-convex with its minimum in S , then it is a

valid Lyapunov function for the system of equation 5.4.

Same technique is applicable to solving any feasible problem q (x )=0 on a closed
interval S, where q is such that q2 is C1 pseudo-convex on S. Thus (12 may be a
valid Lyapunov function for equation 5.4. It is possible for 42 to have more than one
minimizer. If qu1, and if q2 has only a local minimum equal to zero, ﬁnitely many
local maximizers, and no saddle points, then this technique is also usable. Since for
such a function q , except when the initial point is one of the maximizers, the system
of equation 5.4 will converge to one of the minimizers f of q2 for which q(x')=0. In
hardware implementation, the effect of the ﬁnitely many maximizers is unobservable.

Thus the system may be regarded as almost completely stable in practice.

Consider the following problem (NP 3):

Solve f(x) =x3 - 9x2 + 23x - 15 = 0.

The function f (x) is plotted in Figure 5.l8(a). It has three roots at l, 3, and 5. The
proﬁle of f2(x) is shown in Figure 5.18(b). f2(x) has three minimizers, 1, 3, and 5.

119

(a)

for)“

 

 

(b)

 

f2(X)

lO-

 

 

.—
N
U)
A
M

Figure 5.18. (a) f(x) for (NP3). (b) f2(x) for (NP3).

120

and two local maximizers,
4.1547 and 1.8453.

Solving this problem by equation 5.4, the trajectories of of x versus E (x) are shown
in Figure 5.19. The asymptotical stability of equation 5.4 on its three equilibria can
be clearly seen in the ﬁgure. The regions of attraction for x=l, 3, and 5 are respec-
tively (—oo, 1.8435), (1.8435, 4.1547), and (4.1547, co). They indeed cover almost all

R except two points. It is thus clear the system is almost completely stable.

 

10.0

8.00
———.=.>
ﬂ

 

 

 

 

 

 

 

 

 

 

 

 

 

8.
§
§ f" R f’ R
/ \
a. / \
. \/
°0.50 1.50 2.50 3.50 4.50 5.50
x

Figure 5.19. The trajectories of x of equation 5.4 versus E (x) for (NP 3).

CHAPTER VI
CASE STUDIES

Two optimization problems encountered in power system engineering are solved
by the developed network formulation in this chapter. They are the economic power
dispatch (EPD) problem and the optimal power ﬂow (OPF) problem. The results of
this case study demonstrate the applicability of the developed network for solving the

optimization problems encountered in real engineering situations.

6.1 Economic Power Dispatch

The EPD is a classical problem in power system optimization. The goal of EPD
is to determine the amount of power to be produced by each generating unit in the
system such that the load (demand) can be met with a minimum total generation cost.
The power system model typically consists of n thermal-generating units connected to
a single load R . Let x,- be the power generated by the i th unit and f i be the genera-
tion cost rate function of that unit. f ,- is generally approximated by a quadratic poly-
nomial of the form

f i(xi) = bi + aixi + Aiixiz (6.1)
where b; , a,- , and 4,; are positive constants. Each x, is restricted between

(1:,- m , Iran] as determined by the generation limits of the unit. Without

121

122
considering the transmission line losses, the EPD problem (D 1) may be expressed as:

Minimize f(x) = if;(x,~)

i=1

subject to ximei Sxim for i=l,...,n

II
and 2x,- —R = 0.
'=1

(D 1) is actually a quadratic program and thus the results of Section 4.2 may be
applied. Traditional methods to solve this problem can be found in [153].

If the line losses are taken into account, the EPD problem changes its form to
(D 2):

Minimize f(x)= inc.)

i=1

SUbjCCt t0 13mg x,- 51,-” for i=l,...,n

and for, —PL)-R =0
'=l

where PL represents the line losses. A general version of PL may be expressed as
the quadratic function

I! n n
,2 PL = zzxiBijxj + 23,-1; + 80' (6'2)
i=lj=l i=1

Due to the nonlinearity of PL it is more difﬁcult to solve (D7) than (D 1). An itera-
tive process is normally adopted to solve (0;) (see Chapter 4 of [153]). The
developed optimization network is superior in that with same mapping technique it is
applicable no matter whether the line losses are considered or not.

Example 6.1 [153]: A power system consists three generating units. The minimum
and maximum output of units 1, 2, and 3 are [150 MW, 600 MW], [100 MW, 400
MW], and [50 MW, 200 MW], respectively. A total of 850 MW is to be delivered at

123

a minimum overall cost.
Case I -' 311pposc the cost rates for the units are
f 1(x1) = 561 + 7.92::1 + 0.001562x12,
f 20:7) = 310 + 7.85x2 + 0.00194x22,
and
f 3(x 3) = 78 + 7.97x3 + 0.00482x32.

Matching the corresponding terms of f (x) to the objective function of a quadratic
Progmm (QP ). gives

0 0 0.009

7.92
a = 7.85 ,
7.97

0.003124 0 0
A = 0 0.00388 0 .
54

and

[561]
c = 310 .
78

It is clear that A is positive deﬁnite. Furthermore, the feasibility set of this problem
is convex and compact. Therefore, the results in Section 4.2 can be applied presum-
ing that the minimizer set 0 contains only regular points.

The simulations of the network of equation 4.1 are done with s =50 and the ini-
tial condition x0 =[400, 300, 150]T . The trajectories of the state variables are plotted
in Figme 6.1(a); the trajectories of f (x) and E2(x) are shown in Figure 6.1(b). The
asymptotic nature of the equilibrium point is clearly seen in these ﬁgures. The ﬁnal
values of the state variables, E 2(X ). and f (x) are given in Table 6.1 as is the exact
solution. The error between the simulation result and the exact solution is less than

0.1% on average. The value of E 2(1) is slightly larger than the objective function.

 

124

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

O O O
(a) m. in. L“.
(0‘1 “3'1 LO
O O O
O C) O
m" m" m’
N N N F
Clo Clo CDC)
m" m" m'
x x x
C) O O
C) O D
N“ N“ N
ck
O O C)
1.0 L0 Ln
0'4 o“ o‘
0.00 0.50 ’ 1.00 1.50 2.00
T .103
E O
(b) a 5
N
(13'? odé
Ln LD
03 CD
2 9
oo" oo'
0 0
E5 1235
X". ".
“org-(Mn
5 :5
N
7‘ in
Ln L0
or m
2 <2
(15“ oo'
2 O
2 5
co" 00'
0.00 0.50 1.00 1.50 2.00
T .103

Figure 6.1. Simulation results for Case 1 of Example 6.1 using equation 4.1 with
s=50 and x,,=[400, 300, 15017”. (a) Trajectories of x. (b) Trajectories of f (x) and
52(1)-

125

This is due to the fact that the equilibrium of the network always lies in the infeasible
region and thus results in a small positive value in the second term on the right hand

side of equation 3.10.

The simulation results for Case 1 using the 2-phase network with s =50 and e are
shown in Figure 6.2. The system is switched to phase-2 at t=1000. Normally,
phase-2 dynamics requires longer times to converge, but for this particular case the
phase-2 network converges rather quickly to its equilibrium as can be seen ﬁom Fig-
ure 6.2(b). From Table 6.1 the ﬁnal values of state variables of the 2-phase network

are the same as the exact solution.

Table 6.1. Equilibrium points of two networks for Case 1 of Example 6.1.

 

Networks x1 x2 x3 f (x) E 2(X)

 

Equation 4.1, s=50 393.10 334.52 122.20 8192.68 8193.52
2-phase net, s=50, $0.2 393.17 334.60 122.23 8194.36 8194.36

 

Exact Solution 393.17 334.60 122.23 8194.36 8194.36

 

 

 

 

 

 

 

 

Case 2: Suppose now that due to ﬂuctuation of the resource price, the cost rate

of unit 1 becomes
f1(x1) = 459 + 6.4st + 0.00128x12.

After adjusting the corresponding terms in the network, the simulation is performed
with the same parameter s and initial condition. The results are listed in Table 6.2.
Again, the results of using l-phase network (equation 4.1) approximate the exact solu-
tion closely while the results of using 2-phase network match it precisely. The trajec-
tories of the network variables for equation 4.1 are plotted in Figure 6.3. Unit 1 in

this case must produce its maximum output. In [153], a different scheme is required

126

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(a) 8 8 Si
O O O
O O O
m- m- m‘
“do “do “éo‘i‘
:m rm :m
m" «i- m'
:3 {\‘t m s/f
x x x a
O O O
O O O
N" N's a;
rim
C.) O O
”1] L” L“.
o o‘ o
0.00 1.00 2.00 3.00 4.00
T .103
'5 S
N N
«5‘ (0%
Ln Ln
CD to
2 52
an- an s
O 0
L0 to
g9: 'ée
fx‘m" Am
.5 ~25:
a... in \
L0 L0
to m
2 9.2
m'- m' t
.9. O
2 §
«SJ m'
0.00 1.00 2.00 3.00 4.00
T .103

Figme 6.2. Simulation results for Case 1 of Example 6.1 using the 2—phase network
with s=50, $0.2 and x0 =[400, 300, 150]T. (a) Trajectories of x . (b) Trajectories of

HIM“ 520‘)-

127

to handle the case when state variables are achieving their extremum. Oilr network

formulation, however, can apply to such a case without any change.

C]

Table 6.2. Equilibrium points of two networks for Case 2 of Example 6.1.

 

Networks x1 x2 X3 f(X) 52(X)

 

Equation 4.1, s=50 600.01 187.00 62.82 7250.63 7251.37
2-phase net, s=50, $0.2 600.00 187.13 62.87 7252.11 7252.11

 

Exact Solution 600.00 187.13 62.87 7252.11 7252.11

 

 

 

 

 

 

 

 

The above example does not take the line losses into account. To demonstrate
the capability of the developed network formulation in solving the EPD problem with

line losses, consider the following.

Example 6.2: This example problem is taken from Example 4B in [153]. In this

power system there are three generation units with unit dispatch limits
50.0 MW le S 200 MW,
37.5 MW 5 x2 S 150 MW,
45.0 MW S x3 S 180 MW.

A total of 210 MW is to be delivered at a minimal overall cost. The generation cost

rates are
f 1(x1) = 213.1 + 11.669x1 + 0.00533x12,
f 20:7) = 200.0 + 10.333x2 + 0.00889x3,

f 303) = 240.0 + 10.3331:3 + 0.00741x}.

 

Fig
5::

128

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(a) a a a
ton to'- to'
o O O /
O O O
m" m" 111'
N N N F
CO CO CO
2"“ 7L“ 2““
to" m'- m‘
E a £8 E
X X X
D O D
D O O
(\I-l N-t
(N
O O O
W L0 in
o" o" o'
0.00 0.50 1.00 1.50 2.00
T .103
C) O
(b) 5% B
r\" l\'
O O
O 0
v- v-
r\" N.
225. "is
x ._ 3: .
AN 1\
:3, 5
N
“e In
O O
O O
N N
r\" l\.
C) O
9] 2
l'\
0.00 0.50 1.00 1.50 2.00
T *103

Figure 6.3. Simulation results for Case 2 of Example 6.1 using equation 4.1 with
s=50 and x0 =[400, 300, 150]T. (a) Trajectories of x. (b) Trajectories of f (x) and
E2(1)-

128

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(a) a a a
to" to'“ to'
o o o /
o o o
m" Ln" m'
N N N F
00 oo oo
7‘“ 7‘“ :m
m" m" m'
3 N 8E
X X x
o o c:
c: o o
N" N“ N
(H
o c:
m m or
o" o" d
0.00 0.50 1.00 1.50 2.00
T .103
o o
(b) B 8
m" r<g
a a\
o o
v- v
ix" n.
1225 '55
X . X .
AM' AN
5 5 \L
N
‘5 it)
o o
o o
N N
(\"1 TN.
0 o
9. 9.
l\." N.
0.00 0.50 1.00 1.50 2.00
T .103

Figure 6.3. Simulation results for Case 2 of Example 6.1 using equation 4.1 with
s=50 and x,=[400, 300, 150?. (a) Trajectories of x. (b) Trajectories of f (x) and
E2(1)-

129

3
It is clear that the objective function f (x )=2 f ,- (xi) is a convex function. The
i =1

corresponding B coefﬁcients of the line losses PL are given by

6.760 0.953 -0.507
[BU]: 0.953 5.210 0.901 x104,
-0.507 0.901 2.940

—0.07660
[Bi] = ‘0.00342 ,
0.01890
and
B, = 4.0357.

Since [B ,- j] is positive deﬁnite, the equality constraint for this problem is a concave
function. The feasible region, deﬁned by the unit dispatch limits, is a rectangular
solid which is clearly convex. Therefore, the results in Section 4.3 can be applied
presuming that the set of optimal solutions 0 is nonempty and contains only regular
points.

The problem is mapped into the network formulation of equation 4.1. Simula-
tions of the network are done with s =50 and initial condition x, =[160, 40, 120]T
MW. The trajectories of the state variables are plotted in Figure 6.4(a). f (x) and
520:) are ploited in Figure 6.4(b) and the line losses in Figure 6.4(c). Again, the
asymptotic nature of the network equilibrium is seen in these ﬁgures. As the system
evolves with time, E 2(x) is almost identical to f (x) due to the appropriate choice of
s . The ﬁnal values of all variables are shown in Table 6.3 as are the results as given
in [153]. The proposed network achieves a better objective value with less line loss.
More importantly, even though the network has not yet converged to its equilibrium,
E205) becomes very close to its ﬁnal value in only few generic time steps. This may
be explained by the slow manifold effect stemming from the smallness of [B i J- ].

 

 

130

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(a) 8 8 8
N1 N1 (\i
ii. 8. 8.“
N N U
(Bo Do Do
:0, :0, :O,
3 R] E?)
x x x V
O O D
L“. m. L“.
Oq O“ O
07
O O O
O O O
o" o" o'
0.00 1.00 2.00 3.00 4.00
T .102
(b) 8 8
m- .n'
O O
m Ln
v.“ v‘ﬁr
ca, 1'2:
:28 3'28.
Avﬁ :v
5 Y4
"H in
O O
in 1n
m- to
O O
D O
m" m'
0.00 1.00 2.00 3.00 4.00
T «l02

Figure 6.4. Simulation results for Example 6.2 using equation 4.1 with s=50 and
x, =[160, 40, 12017. (a) Trajectories ofx. (b) Trajectories of f (x) and 1520:).

131

 

20.0

 

15.0

 

PL
10.0

 

5.00

 

 

 

 

 

 

0.00

 

0.00 1.00 2.00 3.00 4.00
T .102

Figure 6.4. (cont’d.) (c) The trajectory of PL.

Simulations with 8 possible dispatch limits as initial condition have been done and the
results conﬁrm the complete stability of the system as well as the uniqueness of the
equilibrium in this example.

This problem has also been solved by the 2-phase network for s =50, $0.2, and
x0 =[ 160, 40, 120]T . The trajectories of the variables of the 2-phase network are not
shown since they are nearly the same as the trajectories in Figure 6.4. The ﬁnal
results are listed in Table 6.3. Though the exact solution of the problem is not avail-
able, the results obtained by the 2-phase network may be regarded as nearly exact.
This is because at the results obtained by the 2-phase network. the constraint viola-
tions are less than 2x10‘2, which is considered negligible. 1:!

132

Table 6.3. Comparison of the simulation results for Example 6.2 with the results in [15 3].

 

 

Methods x1 x2 x3 PL f (x) E 2(x)
Equation 4.1. s=50 72.8876 70.2605 75.4437 8.8484 3162.35 3164.00
2-phase net, s=50, $0.2 72.9822 70.3250 75.5580 8.8652 3165.64 3165.64
Method in [153] 60.2677 79.4462 80.1503 9.8650 3168.62 --

 

 

 

 

 

 

 

 

 

6.2 Optimal Power Flow

The OPF is essentially a static optimization problem of power system operations.
The aim is to ﬁnd an optimal generation schedule that evokes a minimal production
cost and simultaneously satisﬁes the power ﬂow equations and the operational con-

straints. The general formulation of the OPF is:
Minimize f (u , x)
subject to g(u, x) S 0

h(u,x) = 0

where u is the set of independent variables, x is the set of dependent variables,
3 (u , x) reﬂects the operational constraints, and h (u , x) corresponds to the load ﬂow

equations.

The cost functions f (u , x) of the OPF problem are similar to those of the EPD
problem. But, OPF problems are more difﬁcult to solve because of the inclusion of
the smooth but non-convex power ﬂow equations. Also the OPF problems differ from
the basic load ﬂow problem in the presence of a performance index criterion (i.e., the
minimization of a cost function) and the explicit inclusion of inequality constraints
)1 (u , x). These constraints refer to lower and upper limits on real and reactive power

generation, power ﬂows on lines and transformers, and voltage levels.

133

Approaches currently used to solve the OPF problems include quadratic program-
ming, separable programming, and the Newton method [154]. These approaches, if
successful, converge to one of the local minimizers of the problem depending on the
starting point of the iterative process. If the developed optimization network tech-
nique is regarded as a continuous mode of the discrete, iterative optimization process,
it is expected that the region of attraction of the network is similar to the region that
results in convergence when using the traditional approaches. Moreover, the equili-
brium points of the network should be very close to the solution obtained by other

approaches.

Example 6.3: For a power system, Pg, (Qsi) is the real (reactive) power generation at
bus 1', Pd,- (Q43) is the real (reactive) power load at bus 1' , and V,- is the voltage mag-
nitude at bus 1' with a phase angle 5,- . Consider now the system shown in Figure 6.5

[155]. The costs of generation for generator] and 2 are, respectively,

flap“) = 1 + P81 + 300892

 

 

 

 

 

P313? P82=? 52:?
V1=1.O V2=1.0
Bus 1 Bus 2
l y = 1 - j10 i
Pdt=3.0 sz=1.0

2 generator : load

Figure 6.5. Power system network for Example 6.3.

134

and

112(ng = 0.5 + 0.51982 + 0.5(Pg 7)?
Bus 1 is taken to be the swing bus (i.e., 51:0). By the power ﬂow equation, we have
P31 - 3 =1 - cos82-10sin82
and
P82 - 1= 1 - cosﬁz + 10sin52.

Replacing P31’ P32’ and 52 by x1, x2, and x3, respectively, produces the following
nonlinear programming problem:

Minimize f(x) =f1(xl) +f2(12)

subject to h1(x) = cosx3 + lOsinx3 + x1 — 4 = 0.

h2(X) = 00813 - 1081M}; +12 - 2 = O.

The non-convex nature of h 12 (x) and 1122 (x), as shown respectively in Figures
6.6 and 6.7, shows the difﬁculty encountered in solving such a problem. But the
smoothness of these two functions suggests a high probability that the E2(x) thus
formed will have a local convexity around the points for which h 1(x )=0 and h2(x )=0.
By presuming the existence of such points, this problem is mapped into the network

of equation 4.1.

Since there is no operation limits in this particular example, the initial values for
x1 and x2 are both chosen to be zero. If desired, artiﬁcial constraints x120 and x220
may be added to the network to ensure that the solution of these two variables stays
positive. The phase angle variable (x 3 in this example) is generally assumed to be
close to 0 [156], thus x3(0)=0 is chosen. The simulation results are tabulated in Table
6.4 with s=100 and s =200 and are compared to the solution obtained by the 2-phase

network with s =100 and $0.1. Though the exact solution to the problem is not

135

 

 

 

  
 

144%.. 92... .
X

@456
7. / 1“. ll...“

  

 

     

 

   

 

 

     

   

 

 

Figure 6.6. The surface of hf(x ).

 

 

 

 

 

 

-141
m _ . .
a F4 7 ﬂ 0
8 l 0
6 0‘ :V 0
5 0
8 C a N w 0
3 2 1 0

I230)

Figure 6.7. The surface of 7122 (x ).

Table 6.4. Simulation results for Example 6.3.

136

 

 

 

 

 

 

 

 

NCtWOl'kS x1 XZ X3 f(X) E2(X)
Equation 4.1, S =100 0.52666 3.45379 0.23881 10.55“) 10.7147
Equation 4.1, S =200 0.53467 3.48660 0.25015 10.7138 10.7146
2-phase 1181. S =1“), 6:0.1 0.53938 3.52372 0.25187 10.8824 10.8824

 

 

available, the solution obtained by the 2-phase network may nonetheless be regarded
as nearly exact since at this solution the constraint violations are less than lxlO“.
Various initial points conﬁrm the robustness of results obtained by these three net-
works. The exact Lagrange multipliers for h 1(x) and h 2(x) obtained by the 2-phase

network are

x, = -4.23629 and it, = -4.02371,

respectively. CI

It is worth noting that although a large value of s results in an equilibrium point
closer to the exact solution, thus giving a better initial point for the second phase
dynamics of a 2—phase network, a 2-phase network with a smaller value of s can still
converge the exact solution. This phenomenon eases the burden of selecting s in the

2—phase network.

Though the operation limits are not included in above example, they may be
easily incorporated into the network, since they are linear and deﬁne a hyper-rectangle
(naturally convex) in R " . Other criteria, such as pollution criterion, security criterion,
and load shedding criterion, may replace or be added to the operation cost criterion to
make the OPF problem more general. The same network formulation may be applied
to all these different OPF problems as long as the objective functions are C1 func-

tions.

137

Furthermore, when compared to the OPF method in [155], the optimization net-
work technique in this thesis is also superior in that ﬁrst, there is no need for Jacobian
matrix inversion. This is a necessity for most other OPF methods and is very time-
consuming. Second, the solution for the power ﬂow equations is obtained simultane-
ously since the power ﬂow constraints are satisﬁed automatically when the OPF is
solved. In fact, viewing the power ﬂow equations (PFE) as a sub-problem of a gen-
eric OPF problem with a null objective function, the optimization network technique

can solve the PFE as well. To illustrate such an idea, consider the follow example.

Bus 1 (P 9 Bus 2

 

 

 

Bus 3 Bus 4

.1. l

 

 

 

 

 

 

 

 

 

Bus 5

2 ‘ generator 1 load

Figure 6.8. A 5-bus power system used in Example 6.4.

Example 6.4: A 5-bus power system is connected as in Figme 6.8. The bus data and
the line data for this system are given in Tables 6.5 and 6.6, respectively. All the

138

variables in Table 6.5 have been normalized. Bus 1 is the swing bus for which the
voltage is constant and its phase angle is zero (81:0). Bus 2 is a generation bus, i.e.,
a PV bus for which the voltage and the real power are constant. Busses 3-5 are

strictly load busses (PQ busses) because their real power P and reactive power Q are

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

constant.
Table 6.5. Bus data for the 5-bus system.
Bus # Gen Voltage P load Q load
(pu MW) (pu KV) (pu MW) (pu MVAR)
Swing l 0.00 1.05 0.00 0.00
2 0.80 1.07 0.00 0.00
3 0.00 1.05 0.50 , -0.30
4 0.00 1.05 0.50 0.30
5 0.00 1.05 0.50 -0.20
Table 6.6. Line data for the 5-bus system.
Line# From bus# To bus# R X G B
1 1 2 0.10 0.20 2.00 4.00
2 l 3 0.30 0.40 1.20 -1.60
3 l 4 0.10 0.30 1.00 -3.00
4 2 4 0.15 0.20 2.40 -3.20
5 3 5 0.10 0.20 2.00 -4.00
6 4 5 0.10 0.30 1.00 -3.00

 

 

The impedance of the transmission line from bus 1' to j is
Zij =Rij '11-in3

and the corresponding admittance is

139

A matrix Yb,“ called the bus admittance matrix is often used when deriving the power
ﬂow equations [156]. Yb“, is a symmetric nxn matrix (n is the number of busses in

the system) with elements

ya = sum of admittances of H-equivalent circuit elements incident to the ith

bus, and

ya = -(admittance of H-equivalent circuit element bridging the ith and k th

busses).

Denoting y”, = g”, + jbik, the power ﬂow equations at bus 1' are

P,- = f; IV, IIV, I[g,.,cos(5,—8,) + biksin(5,-—8k)] (6.3)
h=l
and
Qi = i IV‘ ”Vt IIgik 8111(05—0") — bikcos(8i—8k )1, (6.4)
k=l

where 8,- is the phase angle of the voltage V,- at bus 1' , and P,- (Q,) is the real (reac-

tive) power delivered ﬁ'om bus 1'. Also

P, = P8,- - P1,- (6.5)
and

Q: = Qgr' - Qli (6'6)
where Psi+sti is the complex power generation and P1,- +jQ,,- is the complex power
load at bus 1' .

By equations 6.3 and 6.4, a system with n busses has 2n power ﬂow equations,
but only part of them are used to solve the system variables V,- and 8;. For the exam-
ple considered here, both real and reactive power ﬂow equations are needed for busses
3-5 since they are strictly load busses. The real power ﬂow equation for bus 2 is also

140

needed because it is a voltage controlled generation bus. The real and reactive gen-
eration of bus 1 and the reactive generation of bus 2 do not contribute to the solution
of the problem; they are automatically obtained when the problem is solved by the
seven other power ﬂow equations. Since 81 and V1 are ﬁxed for the swing bus and
V2 is ﬁxed for the voltage-controlled bus, there are only seven variables left in the
system, 8,- for i=2 to 4 and Vi for j=3 to 5. It is thus a nonlinear system with seven

equations and seven unknowns as follows.
h1(V,8) = V28 22 ‘1' V2[V1(8 21°°S(52'5l)+b2lSin(52‘51))
+ V418 24°08(52-54)'*b2451n(52"54))] - P32 = 0
M055) = V328 33 + V31V1(g31°°S(53‘51)+b313in(53-5l))
‘1' V503 35COS(53-55)+b358in(53‘55))] + P13 = 0
h3(V,8) = 'Vaibas + ValVl(gslSin(53—5l)’b31°08(53'5l))
+ V518 3581n(53—55)-b35008(53-55))] + Q13 = 0
h4(V:5) = V38 44 4' V4lVl(841°°S(54'51)+b413in(54‘51))
+ V20 42°°S(54‘57)+b 4251110442»
+ V5(g 45cos(54—55)+b45sin(84-55))] + PM = 0
h 5(V ,6) = -V}b44 + V4[V1(g 41sin(54—61)—b 4lcos(84—81))
'1' V2@ 425i“(54‘52)‘b 42°°S(54‘52))
+ V5(g 455i“(54‘55)‘b45005(54‘55»] + QM = 0
h6(Vs5) = V528 55 + V5[V3(8 53008(55'53)+b538in(55—53))
+ V40 5400805442147 548in(55'54))] + P15 = 0
h7(V.5) = 41521255 + V5[V3(853$in(5r53)-b53008(55—53))

4’ V40! 545in(85-84)-b 540080754)” + Q15 = 0

141
To apply the optimization network, an energy function E (V ,8) is chosen as

7
Fw&=%zwwm. mm

1' =1

Denoting x = [V3 V4 VS a, 83 84 8sz and substituting in the values of V1, V2, and

81, the energy function becomes

E -1 7 2
(X) - 3211.- (x). (6.8)
i=1

Viewing equation 6.8 as a nonlinear programming problem with the objective

.7
function Ehﬂx) and no constraints, the one-phase Kennedy and Chua network [19]
i=1

is sufﬁcient to solve the problem. This is similar to the case where the minimizer of
a convex problem lies in the relative interior of the feasible region. In the problem
considered here the feasible region is simply the whole space. With such a viewpoint
in mind, the network structure for solving this PFE problem is formulated as

7
i = «Emmi/hm». (6.9)

i=1

The network of equation 6.9 is simulated by using the initial conditions given in
Table 6.5. Following the common practice for a ﬂat start in solving PFE, the initial
values of V3, V4, and V5 are chosen to be the same as the voltage magnitude of the
swing bus. Though not shown in Table 6.5g.the initial values of all phase angles are
zero. Simulation results are given in Table 6.7. The values of P81, Q81’ and Q32 are
obtained by substituting V,- ’s and 8,- ’s into equations 6.3 to 6.6. The results have
been conﬁrmed by a traditional method converging to the exact solution for the prob-
lem. Compared to the traditional methods for solving the PFE problem, the optimiza-
tion network technique eliminates the most time-consuming computation (inversion of

the Jacobian matrix) and results in the same exact solution. El

142

Table 6.7. Power ﬂow bus output for the 5—bus system.

 

 

 

Bus # Voltage Phase Angle Generation Load
degrees radians PG QG PL QL
1 1.050 0.000 0.0000 0.926 0.016 0.000 0.000
2 1.070 0.837 0.0146 0.800 0.170 0.000 0.000
3 0.964 -l6.529 -0.2885 0.000 0.000 0.500 -0.300
4 0.958 -5.932 -0.1035 0.000 0.000 0.500 0.300
5 0.959 -l6.530 -0.2885 0.000 0.000 0.500 -0.200

 

 

 

 

 

 

 

 

 

CHAPTER VH
CONCLUSION

Solving optimization problems is one of the major applications for en gineerin g-
oriented ANNs. Due to the lack of a more theoretically sound basis, however, the
Optimization formulations of artiﬁcial neural networks have been limited in applicabil-
ity and have caused overstatement of what the artiﬁcial neural networks can do, thus
abating their creditability. This shortcoming has been resolved as far as solving non-
linear programming problems by ANN s is concerned because of the outcome of this

work.

7.1 Summary

Based 6ft optimization theory, it has been shown that the network by Kennedy
and Chua, or equivalently the network of equation 4.1 fulﬁlls both the Kuhn-Tucker
optimality conditions and the penalty function method. From this viewpoint, the net-
work structure of Hopﬁeld and Tank is thus invalid and the network by Rodriguez—
Vazquez, et al. is a special case of equation 4.1 for s =oo.

The optimization network theory was derived in detail in Chapter 4. The key
idea is to use the energy function or the penalty function as a Lyapunov function.

The optimization network formed by equation 4.1 traverses along the surface of the

143

144

energy function in a manner of steepest descent and eventually settles on a local
minimum of the energy function. If the problem is a linear program, a quadratic pro-
gram,oraconvexprogram,thislocalminimumisalsoaglobalminimum. Proper
pseudo- and quasi-convexity assumptions placed on the original problem also lead to
a global minimum. For a nonlinear program, however, the minimum is ensured only
locally. By the theorem of the penalty function method, the local minimum obtained
by the network is an approximation to the exact solution, and these two can be moved
arbitrarily close by making the penalty parameter sufﬁciently large.

One interesting application of the quadratic programming network is in solving
the least squares problem. The network may be used to solve regular least squares
problems as well as the non-negative least squares and least distance problem [157].
The regular least squares problem seeks the least squares solutions in the sense of I 2
norm. The network in fact may be used to solve the least square solutions in the
sense of I, norm, for 1<p <oo, with or without constraints. This is because the 1,,

norm for 1<p <oo has been shown to be strictly convex [158].

It has also been shown in Chapter 4 that the optimization network formulation by
equation 4.1 is much like the gradient projection method. But they differ in that the
latter results in a solution in the feasible region while the former may end up at a
point in the infeasible region whenever there are binding constraints in the exact solu-
tion. The reason for such a slight infeasibility of the solution is due the fact that the
energy function used by the optimization network originates from the penalty function
method.

For most nonlinear programming problems, the approximate solution obtained by
the network is acceptable. But there are cases when the feasibility is very crucial to
the problem. A new 2-phase network formulation has been developed in Chapter 4 to
achieve such a goal. This network additionally provides the corresponding Lagrange
multipliers for each constraint. This is specially useful for linear programming

145

problems because the Lagrange multipliers solve the dual problem of the primary pro-
gram. Though the 2-phase network has not been proven in a rigid argument, the simu-

lation results nevertheless show its robustness in converging to an exact solution.

Through simulation, it has been shown in Chapter 6 that the networks (l-phase
and 2-phase) are useful in solving EPD problems with or without the consideration of
transmission line losses, as well as OPF problems with all kinds of constraints as long
as the constraints are continuously differentiable. The network of equation 4.1 also
solves exactly h(x)=0 for h:R"—)R"' ecl, no matter whether it is linear or not, as
exempliﬁed by the power ﬂow equations problem. Reduced to one dimension, this is
equivalent to ﬁnding the root of f(x)=0 for f:R—>ReC1.

The optimization ANN s developed in this work are applicable to a very large
class of (constrained or unconstrained) nonlinear programming problems with f e C1
and gjze C1 for all i. For the unconstrained problems, they may be solved exactly by
equation 4.1, similarly to the constrained problems with the solution in the relative
interior of the feasible region. In such cases, the network performs similarly to a gra-
dient descent method. For a general constrained problem, the 2-phase network may
be used. If a problem has only constraints, h (x )=0 and g (x )SO, the objective func-
tion is formed by summing 112(1) and (g+(x))2 and the sum may then be regarded an
as unconstrained problem and be solved accordingly. A good example is the solution

of the power ﬂow equations as shown in Chapter 6.

Note that in applying the optimization network, the problem must be feasible.
Furthermore, the regularity assumption and the boundedness of the solution set must
be met. This raises no difﬁculty at all, since in the practice of nonlinear programming
the regularity of the solution is normally presumed for large systems, and the bound-
edness of the solution may be achieved by adding some artiﬁcial upper and/or lower

bound constraints.

146

As mentioned above the optimization ANN s are similar in theory to many of the
existing optimization methods. In fact, the networks may be regarded as continuous
optimization systems in coth to the discrete nature of many optimization processes.
Any nonlinear programming problem which is solvable by the traditional discrete
methods is also solvable by the optimization ANN s developed in this work. Tradi-
tional discrete optimization methods such as the Newton’s method and Newton-
Raphson method often demand the calculation of the inverse of the Jacobian matrix
and then update the state variables by difference equations, which is a function of the
inverse Jacobian. The calculation of the inverse Jacobian grows factorially in compu-
tational complexity with respect to the size of the Jacobian and thus restricts the
applicability of these discrete optimization methods to large-scale systems. The
optimization ANN s, on the other hand, require no matrix inversion and, consequently,

have more potential for use with larger systems.

The above argument disagrees with the generalized networks proposed by
Tsirukis, er al.[159] In their networks, the diﬁ‘erence equations used in uadiﬁonal
discrete optimization processes are transformed into differential equations in which the
inverse Jacobian is included. But since the differential equations describe a dynamic
system and there is no software algorithm nor hardware structure that can calculate
the inverse Jacobian instantaneously (not even for a small system), their networks are
not hardware implementable. Even when implementing their networks in software,
the inverse J acobian in the differential equations actually slows down the convergence
rate when compared to the software implementation of difference equations in tradi-
tional optimization methods. This make their networks less useful. So a direct copy
from difference equations to differential equations does not necessarily yield any

advantage.

7.2

(1)

(2)

(3)

(4)

147

Contributions

The following are the salient contributions of this dissertation:

ANN s for optimization have been analyzed from the viewpoint of optimization

theory leading to discovery of the reasons why the ANN s succeed or fail.

The optimization network theory for linear programming, quadratic program-
ming, convex programming, and nonlinear programming has been derived and a
2-phase optimization network has been developed. These results lay a
mathematically sound foundation for the optimization ANN s and extend the
applicability of ANN s.

The quality of the solutions obtained by the optimization ANN s has been
quantiﬁed through simulation. The solutions of the l-phase networks are adju-
stable by tuning the penalty parameter and the solutions of the 2-phase networks
are exact.

The applicability of the optimization ANN s for solving real-world problems like
the economic power dispatching problem and the optimal power ﬂow problem
has been demonstrated. It was shown that the mapping technique of the optimi-
zation ANN s is simple and that they are able to handle various kinds of con-
straint sets. It was demonstrated that the optimization ANN s attain a better

objective function value.

As a whole, this work lays a solid groundwork for the optimization ANN s in

both the theoretical and practical aspects. As far as solving nonlinear programming

problems by ANN s is concerned, the work is completed, except perhaps a more rigid

analysis of the 2-phase network. The optimization ANN s also suggest a possible

structure for the next-generation analog computer. Future research should emphasize

148

developing suitable hardware implementation for the optimization ANN s so as to fully
exploit their capability.

BIBLIOGRAPHY

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

BIBLIOGRAPHY

Hopﬁeld, J.J., "Artiﬁcial Neural Networks," IEEE Circuits and Devices Maga-
zine, pp. 3-10, September 1988.

Tank, D.W. and Hopﬁeld, J J ., "Collective Computation in Neuronlike Cir-
cuits," Scieniﬁc American, pp. 104-114, December 1987.

Hopﬁeld, J.J. and Tank, D.W., "Computing with Neural Circuits: A Model,"
Science, vol. 233, no. 4764, pp. 625-633, August 8, 1986.

Hopﬁeld, J.J., "Neural Networks and Physical Systems with Emergent Collec-
tive Computational Abilities," Proceedings of the National Academy of Science
USA, vol. 79. PP. 2554-2558, April 1982.

Hopﬁeld, J .J ., "Neurons with Graded Response Have Collective Computational
Properties like those of Two-state Neurons,“ Proceedings of the National
Academy of Science USA, vol. 81, pp. 3088-3092, May 1984.

Tank, D.W. and Hopﬁeld, J.J., "Simple Neural Optimization Networks: An AID
Converter, Signal Decision Network, and a Linear Programming Circuit," IEEE
Transactions on Circuits and Systems, vol. CAS-33, no. 5, pp. 533-541, 1986.

Hopﬁeld, J.J. and Tank, D.W., "’Neural’ Computation of Decisions in Optimi-
zation Pioblems," Biological Cybernetics, vol. 52, pp. 141-152, 1985.

Bagherzadeh, N., et al., "On Parallel Execution of the Traveling Salesman
Problem on a Neural Network Model," Proceedings of the IEEE First Interna-
tional Conference on Neural Networks, San Diego, CA, vol. II], pp. 317-324,
July 1987.

Brandt, RD., et al., "Alternative Networks for Solving the Traveling Salesman
Problem and the List-Matching Problem," Proceedings of the IEEE Interna-
tional Conference on Neural Networks, San Diego, CA, vol. II, pp. 333-340,
July 1988.

149

[10]

[11]

[12]

[13]

[14]

[15]

[16]

[17]

[13]

[19]

150

Szu, H., "Fast TSP Algorithm Based on Binary Nemon Output and Analog
Neuron Input Using the Zero-Diagonal Interconnect Matrix and Necessary and
Sufﬁcient Constraints of the Permutation Matrix," Proceedings of the IEEE
International Conference on Neural Networks, San Diego, CA, vol. II, pp.
259-266, July 1988.

Wacholder, E., Han, J. , and Mann, R.C., "An Extension of the Hopﬁeld-Tank
Model for Solution of the Multiple Traveling Salesmen Problem," Proceedings
of the IEEE International Conference on Neural Networks, San Diego, CA, vol.
II, pp. 305-324, July 1988.

Gunn, J P. and Weidlich, R.B., "A Derivative of the Hopﬁeld-Tank Neural N et-
work Model that Reliably Solved the Traveling Salesman Problem," Proceed-
ings of International Joint Conference on Neural Networks, Washington, DC,
vol. II, pp. 588, June 1989.

Angeniol, B., et al., "Self-Organizing Feasture Maps and the Traveling Sales-
man Problem," Neural Networks, vol. 1, pp. 289-293, 1988.

Culioli, J.-C., et al., "Neural Network Models for Linear Programming,"
Proceedings of International Joint Conference on Neural Networks, Washing-
ton, D.C., pp. 293-296, January 1990.

Culioli, J.-C., et al., "A Neural Network for Explicitly Bounded Linear Pro-
gramming," Proceedings of International Joint Conference on Neural Net-
works, Washington, D.C., pp. 381-384, January 1990.

Foo, Y.S. and Takefuji, Y., "Integer Linear Programming Neural Networks for
Job-Shop Scheduling," Proceedings of the IEEE International Conference on
Neural Networks, San Diego, CA, vol. II, pp. 341-348, July 1988.

Yao, Y. and Yang, Q, "Programming Neural Networks: A Dynamic-Static
Mode ," Proceedings of International Joint Conference on Neural Networks,
Washington, DC, pp. 345-348, January 1990.

Kennedy, MP. and Chua, L.O., "Unifying the Tank and Hopﬁeld Linear Pro-
gramming Circuit and the Canonical Nonlinear Programming Circuit of Chua
and Lin," IEEE Transactions on Circuits and Systems, vol. CAS-34, no. 2, pp.
210-214, February 1987.

Kennedy, MP. and Chua, L.O., "Neln'al Networks for Nonlinear Program-
ming," IEEE Transactions on Circuits and Systems, vol. CAS-35, no. 5, pp.
554—562, May 1988.

[20]

[21]

[22]

[23]

[24]

[25]

[26]

[27]

[28]

[29]

[30]

151

Tsirukis, A.G., Reklaitis, G.V. and Tenorio, M.F., "Nonlinear Optimization
Using Generalized Hopﬁeld Networks," Neural Computation, vol. 1, no. 4, pp.
511-521, 1989.

Smith, M.J.S. and Portrnann, C.L., "Practical Design and Analysis of a Simple
’Neural’ Optimization Circuit," IEEE Transactions on Circuits and Systems,
vol. 36, no. 1, pp. 42-50, January 1989.

Rodriguez-Vazquez, A., et al., "Analog Integrated Neural-Like Circuits for
Nonlinear Programming," Proceedings of 32nd Midwest Symposium on Circuits
and Systems, Charnpaign, IL, pp. 234-237, August 1989.

Rodriguez-Vazquez, A., et al., "Analog Neural Networks for Constrained
Optimization," IEEE International Symposium on Circuits and Systems, New
Orleans, LA, May 1990.

Rodriguez-Vazquez, A., et al., "Nonlinear Switched-Capacitor ’Neural’ N et-
works for Optimization Problems," IEEE Trans. on Circuits and Systems, vol.
CAS-37, no. 4, April 1990.

Atlas, L. and Conner, 1., "Existing and Potential ANN Techniques for Power
Systems Applications," IEEE International Symposium on Circuits and Sys-
tems, New Orleans, LA, May 1990.

Damborg, M.J., El-Sharkawi, MA. and Marks, R., "Potential Applications of
Artiﬁcial Neural Networks to Power System Operation," IEEE International
Symposium on Circuits and Systems, New Orleans, LA, May 1990.

Baum, E.B., "Towards Practical ’Neural’ Computations for Combinatorial
Optimization Problems," Proceedings of the AIP Conference on Neural Net-
works for Computing, Snowbird, UT, no. 151, pp. 53-58, 1986.

Jeﬁrey, W. and Rosner, R., "Neural Network Processing as a Tool for Function
Optimization," Proceedings of the AIP Conference on Neural Networks for
Computing, Snowbird, UT, no. 151, pp. 241-246, 1986.

Jeﬂ'rey, W. and Rosner, R., "Optimization Algoritluns: Simulated Annealing
and Neural Network Processing," Astrophysical Journal, vol. 310, pp. 473-481,
November 1986.

Kamgar-Parsi, B. and Kamgar—Parsi, B., "An Efﬁcient Model of Neural Net-
works for Optimization," Proceedings of the IEEE First International Confer-
ence on Neural Networks, San Diego, CA, vol. 111, pp. 785-790, July 1987.

[31]

[32]

[33]

[34]

[35]

[36]

[37]

[38]

[39]

[40]

152

Kuczewski, R.M., "Neural Network Approaches to Multi-Target Tracking,"
Proceedings of the IEEE First International Conference on Neural Networks,
San Diegon, CA, vol. IV, pp. 619-633, July 1987.

Ramanujarn, J. and Sadayappan, P., "Optimization by Neural Networks,"
Proceedings of the IEEE International Conference on Neural Networks, San
Diego, CA, vol. II, pp. 325-332, July 1988.

Tagliarini, GA. and Page, E.W., "Learning in Systematically Designed Net-
works," Proceedings of International Joint Conference on Neural Networks,
Washington, DC, vol. I, pp. 497-502, June 1989.

Bankes, S.C., "Constrained Differential Optimization for Temporally Dynamic
Problems," Proceedings of International Joint Conference on Neural Networks,
Washington, DC, vol. II, p. 587, June 1989.

Kajiura, M., Akiyama, Y. and Anzai, Y., "Neural Networks vs. Tree Search in
Puzzle Solving," Proceedings of International Joint Conference on Neural Net-
works, Washington, D.C., vol. II, p. 588, June 1989.

Kitamura, S. and Qing, P., "Neural Network Application to Solve Fredholm
Integral Equations of the First Kind, " Proceedings of International Joint
Conference on Neural Networks, Washington, DC, vol. II, p. 589, June 1989.

Li, T., Fang, L. and Wilson, W.H., "Parallel Approximate Solution of the 0/1
Knapsack Optimization on Competitive Neural Networks," Proceedings of
International Joint Conference on Neural Networks, Washington, DC, vol. II,
p. 589, June 1989.

Kamgar-Parsi, B. and Kamgar-Parsi, B., "On Problem Solving with Hopﬁeld
Neural Networks," Proceedings of International Joint Conference on Neural
Networks, Washington, DC, vol. II, p. 589, June 1989.

Peng, Y., "A Connectionist Solution for Vertex Cover Problems," Proceedings
of International Joint Conference on Neural Networks, Washington, DC, vol.
II, p. 590, June 1989.

Rignot, E.J.M., "Three-Dimensional Point Pattern Tracking Using a Completely
Determined Hopﬁeld Network Independent of the Input Data," Proceedings of
International Joint Conference on Neural Networks, Washington, DC, vol. II,
p. 590, June 1989.

[41]

[42]

[43]

[44]

[45]

[46]

[47]

[43]

[49]

[50]

[51]

153

Thrift, P., "Function Optimization Using a Lattice of Neurons," Proceedings of
International Joint Conference on Neural Networks, Washington, DC, vol. II,
p. 591, June 1989.

Tanaka, T., et al., "Optimal Task Assignment Using Neural Netwo ,"
Proceedings of International Joint Conference on Neural Networks, Washing-
ton, D.C., vol. II, p. 591, June 1989.

Graupe, D. and Liu, R., "A Neural Network Approach to Decomposing Surface
EMG Signals," Proceedings of 32nd Midwest Symposium on Circuits and Sys-
tems, Champaign, IL, pp. 740-743, August 1989.

Mjolsness, E., Ganett, C. and Miranker, W.L., "Multiscale Optimization in
Neural Nets: A Preliminary Report," International Joint Conference on Neural
Networks, San Diego, CA, June 1990.

Maa, C.-Y. and Shanblatt, M.A., "Adjustment of Parameters for Signal Deci-
sion Networks," Proceedings of International Joint Conference on Neural Net-
works, Washington, D.C., vol. II, p. 589, June 1989.

Zak, M., Toomarian, N. and Barhen, 1., "Creative Dynamics Approach to
Optimization Problems," International Joint Conference on Neural Networks,
San Diego, CA, June 1990.

Chua, L.O. and Yang, L., "Cellular Nemal Networks: Theory," IEEE Transac-
tions on Circuits and System}, vol. CAS-35, no. 10, pp. 1257-1272, October
1988.

Chua, L0. and Yang, L., "Cellular Neural Networks: Applications," IEEE
Transactions on Circuits and Systems, vol. CAS-35, no. 10, pp. 1273-1290,
October 1988.

Lippmann, R.P., "An Introduction to Computing with Neural Nets," IEEE
ASSP Magazine. pp. 4-22, April 1987.

Guez, A., Protopopsecu, V. and Barhen, J ., "On the Stability, Storage Capacity,
and Design of Nonlinear Continuous Nerual Networks," IEEE Transactions on
Systems, Man, and Cybernetics, vol. 18, no. 1, pp. 80-87, 1988.

Foo, Y.S. and Takefuji, Y., "Stochastic Neural Networks for Solving Job-Shop
Scheduling: Part 1. Problem Representation," Proceedings of the IEEE Interna-
tional Conference on Neural Networks, San Diego, CA, vol. II. PP. 275-282,
July 1988.

[52]

[53]

[54]

[55]

[56]

[57]

[58]

[59]

[60]

1611

154

Foo, Y.S. and Takefuji, Y., "Stochastic Neural Networks for Solving Job-Shop
Scheduling: Part 2. Architecture and Simulation," Proceedings of the IEEE
International Conference on Neural Networks, San Diego, CA, vol. II, pp.
283-290, July 1988.

Levy, BC and Adams, M.B., "Global Optimization with Stochastic Neural
Networks," Proceedings of the IEEE First International Conference on Neural
Networks, San Diego, CA, vol. III, pp. 681-689, July 1987.

Peterson, C. and Soderberg, B., "A New Method for Mapping Optimization
Problems onto Neural Networks," International Journal of Neural Systems, vol.
1, no. 1, pp. 3-22, 1989.

Bruck, J. and Goodman, J .W., "A Generalized Convergence Theorem for
Neural Networks and its Applications in Combinatorial Optimization,"
Proceedings of the IEEE First International Conference on Neural Networks,
San Diego, CA, vol. 111, pp. 649-656, July 1987.

Kennedy, MP. and Chua, L.O., "Circuit Theoretic Solutions for Neural Net-
works - An Old Approach to a New Problem," Proceedings of the IEEE First
International Conference on Neural Networks, San Diego, CA, vol. II, pp.
169-176, July 1987.

Grujic, LT. and Michel, A.N., "Qualitative Analysis of Neural Networks Under
Structural Perturbations," IEEE International Symposium on Circuits and Sys-
tems, New Orleans, LA, May 1990.

Michel, A.N., Farrell, J.A. and Porod, W., "Qualitative Analysis of Neural Net-
works," IEEE Transactions on Circuits and Systems, vol. 36, no. 2, pp. 229-
243, 1989.

Li, J.-H., Michel, AN. and Porod, W., "Analysis and Synthesis of a Class of
Neural Networks: Variable Structure Systems with Inﬁnite Gain," IEEE Tran-
sactions on Circuits and Systems, vol. 36, no. 5, pp. 713-731, 1989.

Li, J.-H., Michel, AN. and Porod, W., "Analysis and Synthesis of a Class of
Neural Networks: Linear Systems Operating on a Closed Hypercube," IEEE
Transactions on Circuits and Systems, vol. 36, no. 11, pp. 1405-1422, 1989.

Kohonen, T., Sef-Organization and Associative Memory, Series in Information
Sciences, vol. 8, Spring-Verlag, 2nd ed., 1988.

[62]

[63]

[64]

[65]

[66]

[67]

[68]

[69]

[70]

[71]

[72]

[73]

[74]

155

Stryer, L., Biochemistry, WH. Freeman and Company, 2nd ed., 1981.
Minsky, M. and Papert, S., Perceptrons, MIT Press, MA, 1969.

Kung, S.Y. and Hwang, J.N., "Parallel Architectures for Artiﬁcial Neural Nets,"
Proceedings of the IEEE International Corference on Neural Networks, San
Diego, CA, vol. II, pp. 165-172, July 1988.

Anderson, J .A. and Rosenfeld, E., Eds., Neurocomputing: Foundations of
Reserach, MIT Press, MA, 1988.

Grossberg, 8., "How does a Brain Build a Cognitive Code?" Psychological
Review, vol. 87, pp. 1-51, 1980.

Cohen, MA. and Grossberg, 8., "Absolute Stability of Global Pattern Forma-
tion and Parallel Memory Storage by Competitive Neural Networ " IEEE
Transactions on Systems, Man, and Cybernetics, vol. 13, no. 5, pp. 815-826,
1983.

Grossberg, S., The Adaptive Brain, [1: Vision, Speech, Language, and Motor
Control, North-Holland, Amsterdam, 1987.

Grossberg, S., Eds., Neural Networks and Natural Intelligence, MIT Press,
MA, 1988.

Kohonen, T., "Self-organized Formation of Topologically Correct Feature
Maps," Biological Cybernetics, vol. 43, pp. 59-69, 1982.

McClelland, J .L. and Rumelhart, D.E., "An Interactive Activation Model of
Context Effects in Letter Perception: Part 1. An Account of Basic Findings,"
Psychological Review, vol. 88, pp. 375-407, 1981.

McClelland, J .L. and Rumelhart, D.E., "An Interactive Activation Model of
Context Effects in Letter Perception: Part 2. The Contextual Enhancement
Effect and Some Tests and Extensions of the Model," Psychological Review,
vol. 89, pp. 60-94, 1982.

McClelland, J .L. and Rumelhart, D.E., Eds., Parallel Distributed Processing:
Explorations in the Microstructures of Cognition, vol. 1, MIT Press, MA, 1986.

McClelland, J .L. and Rumelhart, D.E., Eds., Parallel Distributed Processing:
Explorations in the Microstructures of Cognition, vol. 2, MIT Press, MA, 1986.

[75]

[76]

[77]

[78]

[79]

[80]

[81]

182]

[83]

[84]

[85]

156

Rumelhart, D.E., Hinton, GE, and Williams, R.J., 'Learning Internal
Representations by Error Propagation," Parallel Distributed Processing:
Explorations in the Microstructures of Cognition, vol. 1, DE. Rumelhart and
J.L. McClelland (Eds), MIT Press, MA, pp. 318-362, 1986.

Rumelhart, D.E., Hinton, GE, and Williams, R.J., "Learning Representations
by Back-Propagating Errors," Nature, vol. 323, pp. 533-536, 1986.

DARPA Neural Network Study, AFCEA International Press, VA, 1988.

Wilson, G.V. and Pawley, G.S., "On the Stability of the Travelling Salesman
Problem Algorithm of Hopﬁeld and Tank," Biological Cybernetics, vol. 58, pp.
63-70, 1988.

Kahng, A.B., "Traveling Salesman Heuristics and Embedding Dimension in the
Hopﬁeld Model," Proceedings of International Joint Conference on Neural
Networks, Washington, DC, vol. I, pp. 513-520, June 1989.

Maa, C.-Y. and Shanblatt, M.A., "Stability of Linear Programming Neural Net-
work for Problems with Hypercube Feasible Region," International Joint
Conference on Neural Networks, San Diego, CA, June 1990.

Maa, C.-Y. and Shanblatt, M.A., "Improved Linear Programming Neural Net-
works," Proceedings of 32nd Midwest Symposium on Circuits and Systems,
Champaign, IL, pp. 740-743, August 1989.

Abe, S. and Kawakami, J ., "Theories on the Hopﬁeld Nemal Networks with
Inequality Constraints," Proceedings of International Joint Conference on
Neural Networks, Washington, DC, pp. 349-352, January 1990.

Chiu, G, Maa, C.-Y. and Shanblatt, M.A., "An Artiﬁcial Neural Network
Algorithm for Dynamic Programming," International Journal of Neural Sys-
tems, to appear.

Rauch, HE. and W'marske, T., "Neural Networks for Routing Communication
Trafﬁc," IEEE Control Systems Magazine. PP. 26-31, April 1988.

Zhang, L. and Thomopoulos, S.C.A., "Neural Network Implementation of the
Shortest Path Algorithm for Trafﬁc Routing in Communication Networks,"
Proceedings of International Joint Conference on Neural Networks, Washing-
ton, D.C., vol. II, p. 591, June 1989.

[86]

[87]

[88]

[89]

[90]

[91]

[92]

[93]

[94]

[95]

[96]

157

Zhang, C.-X. and Mylinski, D.A., "VLSI Placement with a Neural Network
Model," IEEE International Symposium on Circuits and Systems, New Orleans,
LA, May 1990.

Prasitjutrakul, S. and Kubitz, W.J., "Path-Delay Constrained Floorplanning: A
Mathematical Programming Approach for Initial Placement," Proceedings of
Design Automation Conference, Las Vegas, NV, pp. 364-369, June 1989.

Herrigel, A. and W. Fichtner, W., "An Analytic Optimization Technique for
Placement of Marco-Cells," Proceedings of Design Automation Conference,
Las Vegas, NV, pp. 376-381, June 1989.

Libeskind-Hadas, R. and Liu, C.L., "Solutions to the Module Orientation and
Rotation Problems by Neural Computation Networks," Proceedings of Design
Automation Conference, Las Vegas, NV, pp. 400-405, June 1989.

Caviglia, C., et al., "Neural Algorithms for Cell Placement in VLSI Design,"
Proceedings of International Joint Conference on Neural Networks, Washing-
ton, D.C., vol. I, pp. 573-580, June 1989.

N aft, J, "Neuropt: Neurocomputing for Multiobjective Design Optimization for
Printed Circuit Board Component Placement," Proceedings of International
Joint Conference on Neural Networks, Washington, DC, vol. I, pp. 503-506,
June 1989.

Melsa, PJ.W. and Kenney, J.B., "A Neural Network Solution for Routing in
Three Stage Interconnection Networks," IEEE International Symposium on Cir-
cuits and Systems, New Orleans, LA, May 1990.

Green, ADP. and Noakes, P.D., "Netn'al Network - Their Use for the Routing
of Integrated Circuits," Proceedings of 32nd Midwest Symposium on Circuits
and Systems, Champaign, IL, pp. 501-504, August 1989.

Yih, J.-S. and Mazumder, P., "A Neural Network Design for Circuit Partition-
ing," Proceedings of Design Automation Conference, Las Vegas, NV, pp. 406-
411, June 1989.

Yu, M.-L., "A Study of the Applicability of Hopﬁeld Decision Neural Nets to
VLSI CAD," Proceedings of Design Automation Conference, Las Vegas, NV,
pp. 412—417, June 1989.

Sculley, TL. and Brooke, M.A., "A Neural Network Approach to High Perfor-
mance Analog Circuit Design," Proceedings of 32nd Midwest Symposium on

158

Circuits and Systems, Champaign, IL, pp. 497-500, August 1989.

[97] Osowski, 5., "Electrical Circuits Optimization Using Standard Network Ana-
lysers," IEEE International Symposium on Circuits and Systems, New Orleans,
LA, May 1990.

[98] Starzyk, J.A. and El-Gamal, M., "Artiﬁcial Neural Network for Testing Analog
Circuits," IEEE International Symposium on Circuits and Systems, New Orle-
ans, LA, May 1990.

[99] Xiangrning, X. and Spence, R., "A Fast Constrained Optimization Algorithm
for IC Design," IEEE International Symposium on Circuits and Systems, New
Orleans, LA, May 1990.

[100] Thomas, R.J., et al., "On-Line Security Classiﬁcation Using an Artiﬁcial Neural
Network," IEEE International Symposium on Circuits and Systems, New Orle-
ans, LA, May 1990.

[101] Chow, J.-C., et al., "An Improved Hopﬁeld Model for Power System Con-
tingency Classiﬁcation," IEEE International Symposium on Circuits and Sys-
tems, New Orleans, LA, May 1990.

[102] Pao, Y.-H., Sobajic, DJ. and Nyo, W., "Real-Time Security Monitoring of
Electric Power Systems Using Parallel Associative Memories," IEEE Interna-
tional Symposium on Circuits and Systems, New Orleans, LA, May 1990.

[103] Mori, H. and Tsuzuki, 8., "Determination of Power System Topological Obser-
vability Using the Boltzmann Machine," IEEE International Symposium on Cir-
cuits and Systems, New Orleans, LA, May 1990.

[104] Karady, G. and Hubele, NF, "Conceptual Approach to the Application of
Neural Network for Short Term Load Forecasting," IEEE International Sympo-
sium on Circuits and Systems, New Orleans, LA, May 1990.

[105] Maa, C.-Y. and Shanblatt, M.A., "A Constrained Optimization Neural Net
Technique for Economic Power Dispatch," IEEE International Symposium on
Circuits and Systems, New Orleans, LA, May 1990.

[106] Maa, C.-Y. and Shanblatt, M.A., "A Neural Net Technique for Economic
Power Dispatching with Consideration of Transmission Losses," submitted to,
International Journal of Neural Systems.

159

[107] Matsuda, S. and Akimoto, Y. "The Representation of Large Numbers in Neural
Networks and its Application to Economical Load Dispatching of Electric
Power," Proceedings of International Joint Conference on Neural Networks,
Washington, D.C., vol. I, pp. 587-592, June 1989.

[108] Aggoune, ME, et al., "Artiﬁcial Neural Networks for Power System Static
Security Assessment," Proceedings of IEEE International Symposium on Cir-
cuits and System, Portland, OR, pp. 490-494, May 1989.

[109] Fischl, R., et al., "Screening Power System Contingencies Using a Back-
propagation Trained Multiperceptron," Proceedings of IEEE International Sym-
posium on Circuits and System, Portland, OR, pp. 486-489, May 1989.

[110] Chan, E.H., "Alarm Pattern Recognition by Using Artiﬁcial Neural Networks,"
IEEE International Symposium on Circuits and Systems, New Orleans, LA,
May 1990.

[111] Di Zitti, E., et al., "Emcient Emulation of Neural Networks on Concurrent
Architectures for Optimization Problems," IEEE International Symposium on
Circuits and Systems, New Orleans, LA, May 1990.

[112] Park, S., "Hopﬁeld Neural Network for AR Spectral Estimator," IEEE Interna-
tional Symposium on Circuits and Systems, New Orleans, LA, May 1990.

[113] Gao, K., Ahmad, MD. and Swamy, M.N.S., "A Neural Network Least-Square
Estimator," International Joint Conference on Neural Networks, San Diego,
CA, June 1990.

[114] Takefuji, Y. and Lee, K.C., "A Two Step Sorting Algorithm," International
Joint Conference on Neural Networks, San Diego, CA, June 1990.

[115] Sudharsanan, SJ. and Sundareshan, M.K., "Equilibrium Uniqueness and Global
Exponential Stability of a Neural Network for Optimization Applications,"
Proceedings of International Joint Conference on Neural Networks, Washing-
ton, D.C., pp. 472-475, January 1990.

[116] Sudharsanan, SJ. and Sundareshan, M.K., "Neural Network Computational
Algorithm for Least Squares Estimation Problem," Proceedings of International
Joint Conference on Neural Networks, Washington, D.C., vol. II, pp. 590, June
1989.

[117] Szu, H.H., "Reconﬁgurable Neural Nets by Energy Convergence Learning Prin-
ciple Based on Extended McCulloch-Pitts Neurons and Synapses," Proceedings

160

of International Joint Conference on Neural Networks, Washington, D.C., vol.
I, pp. 485-496, June 1989.

[118] van Laarhoven, P.J.M. and Aarts, E.H.L., Simulated Annealing: Theory and
Applications, Reidel, Dordrecht, 1987.

[119] Van den Bout, DE. and Miller 111, T.K., "Graph Partitioning Using Annealed
Neural Networks," Proceedings of International Joint Conference on Neural
Networks, Washington, D.C., vol. L pp. 521-528, June 1989.

[120] Akiyarna, Y., et al., "Combinatorial Optimization with Gaussian Machines,"
Proceedings of International Joint Conference on Neural Networks, Washing-
ton, D.C., vol. I, pp. 533-540, June 1989.

[121] de Carvalho, L.A.V. and Barbosa, V.C., "Toward a Stochastic Neural Model
for Combinatorial Optimization," Proceedings of International Joint Conference
on Neural Networks, Washington, D.C., vol. II, pp. 587, June 1989.

[122] Wong, W.S. and Funka-Lea, C.A., "An Elastic Net Solution to Obstacle
Avoidance Tour Planning," International Joint Conference on Neural Networks,
San Diego, CA, June 1990.

[123] Davenport, MR. and Hoffmann, G.W., "An Extension of the Hopﬁeld Neural
Network to Include Hidden Nemons," IEEE International Symposium on Cir-
cuits and Systems, New Orleans, LA, May 1990.

[124] Yue, T.W. and Fu, L.C., "Ineffectiveness in Solving Combinatorial Optimiza-
tion Problems Using a Hopﬁeld Network: A New Perspective from Aliasing
Eﬁ‘ect," International Joint Conference on Neural Networks, San Diego, CA,
June 1990.

[125] Hellstrom, BJ. and Kanal, L.N., "The Deﬁnition of Necessary Hidden Units in
Neural Networks for Combinatorial Optimization," International Joint Confer-
ence on Neural Networks, San Diego, CA, June 1990.

[126] Abe, S., "Theories on the Hopﬁeld Neural Networks," Proceedings of Interna-
tional Joint Conference on Neural Networks, Washington, D.C., vol. I, pp.
557-564, June 1989.

[127] Abu-Mostafa, Y.S. and St. Jacques, J .-M., "Information Capacity of the
Hopﬁeld Mode ," IEEE Transactions on Information Theory, vol. lT-31, no. 4,
pp. 461-464, 1985.

161

[128] McEliece, R.J., et al., "The Capacity of the Hopﬁeld Associative Memory,"
IEEE Transactions on Information Theory, vol. IT-33, no. 4, pp. 461-482,
1987.

[129] Davis, G.W., "Sensitivity Analysis of Hopﬁeld Neural Net," Proceedings of
IEEE First International Conference on Neural Networks, San Diego, CA, vol.
III. PP. 325-528, July 1987.

[130] Reibling, LA. and Olinger, M.D., "A Neural Network Implementation of Paral-
lel Serach for Multiple Pa ," Proceedings of International Joint Conference
on Neural Networks, Washington, D.C., pp. 293-296, January 1990.

[131] Coleland, B .R., "Global Minima within the Hopﬁeld Hypercube, " Proceedings
of International Joint Conference on Neural Networks, Washington, D. C, pp.
37 7 380, January 1990.

[132] Hirsch, M.W., "Convergent Activation Dynamics in Continuous Time Net-
works," Neural Networks, vol. 2, pp. 331-349, 1989.

[133] Salam, F., "A Tutorial Workshop on Neural Nets and Their Engineering Irnple-
mentations," 31st Midwest Symposium on Circuits and Systems, St. Louis,
August 1988.

[134] Graf, H.P., et. al., "VLSI Implementation of a Neural Network Memory with
Several Hundred Neurons," Proceedings of the AIP Conference on Neural Net-
works for Computing, Snowbird, UT, no. 151, pp. 182-187, 1986.

[135] Graf, H.P., Hubbard, W., Jackel, L.D., and deVegvar, P.G.N., "A CMOS Asso-
ciative Memory Chip," Proceedings of the IEEE First International Conference
on Neural Networks, San Diego, CA, vol. III, pp. 461-468, July 1987.

[136] Graf, H.P., Jackel, L.D., and Hubbard, W.E., "VLSI Implementation of a
Neural Network Model," Computer, vol. 21, no. 3, pp. 41-49, March 1988.

[137] Sivilotti, M.A., Emerling, MR, and Mead, C.A., "VLSI Architectmes for
Implementation of Neural Networks," Proceedings of the AIP Conference on
Neural Networks for Computing, Snowbird, UT, no. 151, pp. 408-413, 1986.

[138] Sivilotti, M.A., Mahowald, M.A., and Mead, C.V, "Real-time Visual Computa-
tions Using Analog CMOS Processing Arrays," Advanced Research in VLSI:
Proceedings of the 1987 Starn'ord Conference, P. Losleben (Ed), MIT Press,
MA, pp. 295-312, 1987.

162

[139] Mead, Carver, Analog VLSI and Neural Systems, Addison-Wesley, MA, 1989. ,

[140] Hopﬁeld, J ., "Neural Networks: Algorithm and Microhardware," Tutorial #7,
International Joint Conference on Neural Networks, Washington, D.C., June
1989.

[141] Jackel, L., "VLSI Technology and Neural Network Chips," Tutorial #8, Inter-
national Joint Conference on Neural Networks, Washington, D.C., June 1989.

[142] Murray, AF. and Smith, A.V.W., "Asynchronous VLSI Nelual Networks
Using Pulse-Stream Arithmetic," IEEE Journal of Solid-State Circuit, vol. 23,
no. 3, pp. 688-697, 1989.

[143] Salam, F., "A Formulation for the Design of Nemal Processors," Proceedings
of the IEEE International Conference on Neural Networks, San Diego, CA, vol.
I, pp. 173-180, July 1988.

[144] Li, J.H., Michel, A.N., and Porod, W., "Qualitative Analysis and Synthesis of a
Class of Neural Networks," IEEE Transactions on Circuits and Systems, vol.
35, no. 8, pp. 976-986, August 1988.

[145] Rockafellar, R.T., Convex Analysis, Princeton University Press, NJ, 1979.

[146] Bazaraa, MS. and Shetty, C.M., Nonlinear Programming, John Wiley & Sons,
NY, 1979.

[147] Luenberger, D.G., Introduction to Linear and Nonlinear Programming, Chapter
12, Addison-Wesley, MA, 1973.

[148] Chua, LC. and Lin, G.-N., "Nonlinear Programming without Computation,"
IEEE Transactions on Circuits and Systems, vol. GAS-31, no. 2, pp. 182-188,
Feb. 1984.

[149] Chua, L.O. and Wang N.N., "Complete Stability of Autonomous Reciprocal
Nonlinear Networks," Circuit Theory and Applications, vol. 6, pp. 211-241,
1978.

[150] Vidyasagar, M., Nonlinear System Analysis, Chapter 5, Prentice-Hall, NJ, 1978.

[151] Fletcher, R., Practical Methods of Optimization, 2nd ed., John Wiley & Sons,
NY, 1987.

163

[152] Bertsekas, D.P., Constrained Optimization and Lagrange Multiplier Methods,
Academic Press, NY, 1982.

[153] Wood, AF. and Wollenberg, B.F, Power Generation Operation and Control,
Chapter 3, John Wiley & Sons, NY, 1984.

[154] Stott, B., Alsac, O., and Monticelli, A.J., "Security Analysis and Optimization,"
Proceedings of the IEEE, vol. 75, no. 12, pp. 1623-1644, Dec. 1987.

[155] Debs, A.S., Modern Power Systems Control and Operation, Kluwer Academic
Publishers, MA, 1988.

[156] Bergen, A.R., Power Systems Analysis, Prentice-Hall, NJ, 1986.

[157] Lawson, CL. and Hanson, R.J., Solving Least Squares Problems, Prentice-Hall,
NJ , 1974.

[158] Sreedharan, V.P., "An Algorithm for Non-negative Norm Minimal Solutions,"
Numer. F unct. Anal. and Optimiz., vol. 9, pp. 193-232, 1987.

[159] Tsirukis, A.G., Reklaitis, G.V., and Tenorio, M.F, "Nonlinear Optimization
Using Generalized Hopﬁeld N etworkks," Neural Computation, vol. 1, no. 4, pp.
511-521, 1989.

ICHIGRN STRT

1111! 1

E UNIV. LIBRRRIES
11111111111111111111
7678828

3129300