’N"." \ .l." v c ' ;::."- My ; , #29:).32‘3 4‘ \ 51.41.... U3" af‘:’-:_ ~ ' '3- ‘ VI. ’1:{ q?u§n‘.; .lh‘L ‘ 37‘: if: 5!; ”72?? $5,}; ; t;. ’1 “9&1!” I"‘. . ‘z'fi‘wg" .3, I: ,-‘-:v n 1"." -1“ 2 «15’. _ {£51 :1. A. '."."r" *- 9,. y "fiflg E: '9 3:}: Want. . . 4- , . v 7 ‘3 '1' u I v- u. .‘v' .“.. ; ‘I I‘m .-~','.°' _;,L-,-‘-‘.o‘ ‘10.(c)s=10. ............................................................... 90 5.7 Trajectories of equation 4.1 for (LP3) near the point (-5.5, -5.5). (a):=2.(b)s=2—>10.(c)s=10. ................................................ 93 5.8 Trajectories of x of the (QP) network for (QPI). ................................. 95 5.9 Trajectories of x of equation 4.1 for (QP7) with s=50. .......................... 96 5.10 Simulation results of the 2-phase network for (QP7) with 3:50 and e=0.2. (a) The trajectory of x. (b) The trajectories of 2.3. ...................... 98 5.11 Trajectories for (QP3) with s=50 and e=0.2. (a) Using the network of equation 4.1. (b) Using the 2-phase network. ....................................... 99 5.12 The simulation results of the (LS) problem with initial condition x0 =[0, 0, 0]T. (a) The trajectories of the state variables. (b) The trajectory of E 2(x). ........................................................................... 101 5.13 The trajectories of x for the (LS) problem with different initial conditions. (a) x,=[4, 4, 41’. (b) x0 =[11, .4, -4]T. (c) x0=[-4, 11, 4]? (d) x0=[11, 11, 417‘. (e) x,=[-4, -4, 11?. (t) x,=[11, 4, 1117'. (g) x0=[-4, ll, 11]T. (h) xo=[ll, 11, 11]T. ......................................... 102 5.14 Contours of E 2(x) for (NPl). (a) The contour for s=0.5. (b) The contour for s=5. (c) The contour for 3:50. .......................................... 108 5.15 Trajectories of x of equation 4.1 for (NPI) with 3:50. ......................... 112 5.16 Trajectories of x for (NPl) using the 2-phase network with s=10 and 8:0.2. (a) x0 = (0.75, 0.75). (b) x0 = (—0.75, 0.75). 
............................. 114 5.17 Trajectories of x for different networks for (NP 2). (a) Equation 4.1 5.18 5.19 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 with s=10. (b) 2-phase network with 3:10 and e=0.2. ......................... (a) f (x) for (NP 3). (b) f2(x) for (NP3). ............................................. The u'ajectories of x of equation 5.4 versus E (x) for (NP 3). ................. Simulation results for Case 1 of Example 6.1 using equation 4.1 with s=50 and x0 =[400, 300, 150]T. (a) Trajectories of x. (b) Trajectories off(x) and E2(x). ............................................................................ Simulation results for Case 1 of Example 6.1 using the 2-phase network with s=50, e=0.2 and x0 =[400, 300, 150]T. (a) Trajectories of x. (b) Trajectories of f (x) and E 2(x ). ..................................................... Simulation results for Case 2 of Example 6.1 using equation 4.1 with s=50 and x0 =[400, 300, 150]T. (a) Trajectories of x. (b) Trajectories off(x) and E2(x). ............................................................................ Simulation results for Example 6.2 using equation 4.1 with s=50 and x0 =[160, 40, 120]T. (a) Trajectories of x. (b) Trajectories of f (x) and E 2(x ). (c) The trajectory of PL . ......................................................... Power system network for Example 6.3. .............................................. The surface of h12(x). ........................................................................ The surface of h§(x). ........................................................................ A 5-bus power system used in Example 6.4. ........................................ 119 124 126 128 135 135 137 CHAPTER I INTRODUCTION Conventional digital computers are very good at executing well-formulated sequences of instructions represented by the stored program. There are some tasks, however, which are very cumbersome to solve by conventional digital computers. 
These include vision, speech, pattern recognition with distorted data, and information retrieval where only partial input information is given. These tasks, on the other hand, are accomplished and performed well by the human brain. The basic processing elements of the human brain are neurons, which are electrochemical devices with response times in the range of milliseconds. In the human brain there are approximately 10^11 neurons, and each of them may be connected to thousands of other neurons. It is not yet well understood what interconnection structure organizes the neurons, nor how this massively parallel interconnected system (a biological neural network) interacts, stores and retrieves memory, and manipulates our thoughts.

In contrast, artificial neural networks (ANNs) are machine models of the biological neural networks with the aim of achieving human-like performance. Recently there has been a resurgence in the field of ANNs due to new network topologies (feed-forward multilayer network, Hopfield feedback network) and algorithms (back propagation, stochastic neural network), implementation techniques (digital VLSI technology, analog VLSI technology, electro-optics technology), and various emerging applications. ANNs do not always outperform conventional (sequential or parallel) computers. Rather, they provide a different approach to attack certain problems which are not easily solvable using conventional computers. The key characteristics of ANNs are listed in Table 1.1 and contrasted with the corresponding characteristics of conventional computers.

Table 1.1. Characteristics of ANNs and conventional computers.
Characteristics                   ANNs                       Conventional Computers
Memory                            Distributed; Associative   Localized; Specific
Fault-Tolerance                   Inherent                   Not Inherent
Pattern Recognition Ability       Fast                       Slow
Classification                    Excellent                  Poor
Partial Information Retrieval     Excellent                  Poor
Error Correction Ability          Excellent                  Poor
Learning Ability                  Excellent                  Poor
Math & Algorithmic Ability        Poor                       Excellent
Timing Scheme                     Asynchronous               **
Execution Mode                    Highly Parallel            **
Processing Element                Simple Unit                **
Connectivity                      High                       **

** These characteristics are system-dependent.

The feature possessed by ANNs which differs most from conventional computers is that ANNs store information in their structure rather than in specific locations. All the parameters (connection weights, external biases, thresholds of neurons, initial states of neurons) collectively determine the information stored in the network. As a result, if some of the interconnections are disconnected or some of the neurons fail, the function of the network is preserved qualitatively. This provides the inherent ability of fault tolerance and sometimes the ability to retrieve the full output data pattern with only partial input information. For pattern recognition, correlations of the input patterns and output patterns are stored in the network. With a distorted or noisy input pattern applied, a well-trained ANN is able to map it to an output whose corresponding input pattern best matches the applied one. Training, also called learning or adaptation, is the process of determining the connection weights, usually over time, to improve performance. Massive parallelism is another feature possessed by neural networks which is necessary for high-performance computation in applications like speech and pattern recognition.

Because of the immaturity of ANN research, various kinds of network structures have been proposed and tested for different applications. There is no agreement on which network best fits a particular application.
Neither is there a unified way to classify the existing ANNs. One way, for example, is to classify them by topological groupings: single-layered or multi-layered; feed-forward or feedback; fully connected, nearest-neighbor connected, or hexagonally connected. ANNs can also be divided into two categories depending on whether their usage is neuroscience-oriented or engineering-oriented. The former tries to model simple nerve systems of some animals and implement the functions of the model through software or hardware. The implementation is then used as a paradigm to validate and predict the behavior of the original nerve system. The latter aims at mimicking the biological neurons and their networks, with some adaptation and modification based upon the available analytical methods and available implementation technology, in order to exploit their decision-making functions.

For the engineering-oriented ANNs, one strong area of application is the solving of constrained optimization problems. The ANN most widely used and cited for such an application is the Hopfield network [1-7]. This is a one-layer, fully connected, feedback network. An often-adopted procedure to solve a specific problem using the Hopfield network includes the following steps:

i) select a mapping (representation, transformation, or encoding) such that the outputs of the network correspond to the solution to the problem;

ii) choose a proper energy function, bounded from below, whose minimum corresponds to a feasible solution to the problem;

iii) derive network connection weights and bias inputs, which properly embed the objective function and constraints of the problem into the network; and,

iv) choose initial values for the neurons in such a way that the network converges to a stable state which is a feasible solution to the problem.

Currently, each of the above steps is based on ad hoc procedures [6-44].
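As a toy illustration of steps i)-iv) (the problem data, penalty weight s, step size, and iteration count below are invented for this sketch and are not taken from any of the cited works), a small linear program min c'x subject to Ax ≤ b can be embedded in an energy function E(x) = c'x + (s/2) Σ_i max(0, a_i'x - b_i)^2, with simple gradient descent on E standing in for the network dynamics:

```python
import numpy as np

# Hypothetical problem data: minimize x1 + 2*x2 subject to
# x1 >= 0, x2 >= 0, and x1 + x2 <= 1 (optimum at the vertex (0, 0)).
c = np.array([1.0, 2.0])
A = np.array([[-1.0, 0.0],
              [0.0, -1.0],
              [1.0, 1.0]])
b = np.array([0.0, 0.0, 1.0])
s = 50.0   # penalty weight embedding the constraints into the energy

def energy(x):
    # E(x) = c.x + (s/2) * sum of squared constraint violations
    v = np.maximum(A @ x - b, 0.0)
    return c @ x + 0.5 * s * (v @ v)

def grad(x):
    v = np.maximum(A @ x - b, 0.0)
    return c + s * (A.T @ v)

# Gradient descent plays the role of the continuous network dynamics
# (step iv: the state settles at a minimum of E).
x = np.array([0.5, 0.5])
for _ in range(4000):
    x = x - 0.001 * grad(x)
```

Note that with a finite penalty weight s the minimum of E sits slightly outside the feasible region (here x settles near (-0.02, -0.04) rather than at (0, 0)); this gap between the network's equilibrium and the true optimum is exactly the kind of approximation issue examined later in this work.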
A lot of effort, however, is being placed on formulating rule-based methodologies to obtain parameters, derive energy functions, and choose initial values [12-13,18-24,45-50,57-60]. For a specific problem, there are various choices at each step of the above procedure, but, except in a few cases [18-19,23-24,58-60], most of the work reported to date does not guarantee, or at least analytically guarantee, that the state to which the network converges is a feasible solution to the problem.

1.1 Problem Statement

ANNs, particularly the Hopfield network, have been used to solve optimization problems such as linear programming, nonlinear programming, and dynamic programming. The stability and convergence of the Hopfield network is ensured due to its gradient descent nature [6]. But a reasonably formulated network, like the one used for linear programming, is not sufficient to guarantee convergence to a feasible solution of the original problem. In fact, as will be shown in this dissertation, the linear programming network by Hopfield will converge to a point which is in general not close enough to an optimal solution. For certain networks, some states of convergence (local minima) may even turn out to be infeasible with respect to the original problem. An example of this phenomenon is the traveling salesman problem as solved by the Hopfield network, where some converged states set up a traveling schedule that visits some cities more than once and does not visit some other cities at all.

To address this problem, a more thorough theoretical knowledge of ANN characteristics and of the relationship between ANNs and optimization theory is needed. As a result of this shallow understanding, trial-and-error approaches are currently adopted in choosing the parameters of the network for solving various optimization problems. What is worse is the fact that the sets of parameters resulting from the trial-and-error procedures are case-dependent.
Consequently, the application variety of ANNs is limited, especially for applications requiring real-time response. An urgent need is a more thorough establishment of the theoretical analysis of ANNs from the viewpoint of optimization theory. A key question is whether there is a general type of network suitable for particular classes of optimization applications. If a general network is possible, then a proper procedure for mapping classes of optimization problems into this general network is desirable. Additionally, guidelines are needed to verify whether a given network will converge to the desired optimal solution(s) or not. If an exact solution is not achievable due to the finiteness of the network parameters, a network which will converge to an approximate solution is sought.

To fully explore the computational power of ANNs in optimization problems, an investigation involving the following fundamental research tasks is undertaken in this dissertation.

(1) Analysis of ANNs from the viewpoint of optimization theory.

(2) Developing generalized network(s) for solving basic optimization problems such as linear programming, quadratic programming, and nonlinear programming.

(3) Quantifying, through simulation, the potential advantages and disadvantages of the generalized network(s).

(4) Demonstrating the applicability of the generalized network(s) for solving some optimization problems requiring real-time response. The problems to be demonstrated are the economic power dispatching problem and the optimal power flow problem.

1.2 Approach

The following plan is organized to achieve the goals of this research in a stepwise and overlapping fashion and to set the stage for subsequent developmental research further exploiting the anticipated results.
Task 1: Network Analysis

Objective: Various network formulations of the Hopfield model will be analyzed from the viewpoint of optimization theory in order to identify the reasons why and when the network succeeds or fails.

Approach: The first step in this research is to analyze optimization formulations of the Hopfield model. Various network formulations reported in recent literature are classified into different categories to be analyzed in a systematic way from the viewpoint of optimization theory. The primary theories used in the analysis include the Kuhn-Tucker optimality conditions for constrained optimization and the penalty function methods which translate constrained optimization problems into unconstrained problems. Network formulations are translated to forms which can then be analyzed for optimality conditions and the other criteria mentioned above. Results of the analysis are compared with reported experiments and used to verify the adequacy of the network formulations in the literature. Based on the analysis results, guidelines for checking the propriety of various network formulations are also sought.

Task 2: Optimization Network Formulation

Objective: Formulate basic optimization networks and their extensions, and develop, if possible, a generalized network structure.

Approach: With the analysis results obtained in Task 1, some optimization networks can be formulated in such a way that the stability, convergence, and optimality of convergence of these networks are theoretically assured. The network formulation is started on some basic problems like linear programming, quadratic programming, and nonlinear programming with affine (linear) constraints. Optimization networks for more complicated optimization problems, such as nonlinear programming with nonlinear constraints and other combinatorial optimization problems, are examined next as possible extensions of those basic networks. All the developed networks are checked against the previously derived guidelines to see whether they are legitimate. Based on the experience of formulating various networks, a generalized network is then developed and a procedure for mapping classes of optimization problems into this network is sought.

Task 3: Network Simulation

Objective: Produce simulations of the optimization networks developed in the last task and provide some benchmark comparison parameters.

Approach: Simulation programs for each of the optimization networks are developed. For each optimization network there is a set of first-order differential equations which can be solved by standard numerical analysis techniques. Simulation programs are used to provide performance metrics for comparison with other approaches. Metrics to be extracted include speed of convergence (throughput) and a measure of computation accuracy (quality). The simulation is also used to study the network sensitivity with respect to certain parameters, for example, the input resistance and capacitance of each neuron and the time increment used to solve the differential equations. Based upon the simulation results, possible modification and re-formulation of the networks are undertaken by going back to Task 2 for the purpose of performance improvement.

Task 4: Case Study

Objective: Apply the developments in new optimization networks to real engineering optimization problems: economic power dispatching (EPD) and optimal power flow (OPF).

Approach: For the EPD problem, two cases, with and without the consideration of transmission line losses, are solved by using the developed network formulations, and their results are compared with the results obtained by other traditional methods. The formulation of the OPF is modified by delineating the various constraint types for a complete OPF so that this reduced model can be handled by the developed networks.
Simulations of the OPF will be made thereafter and the results will be compared to benchmark OPF formulations.

1.3 Overview of the Dissertation

Chapter 2 covers the background knowledge pertaining to neural networks. It includes a biological review of neural networks, some simplified neuron models and network topologies, and a literature review of optimization-related work done on neural networks. Because of its pioneering role in applying ANNs to optimization problems, the Hopfield feedback network is covered in Chapter 2 to briefly introduce the basic idea behind optimization neural networks.

Chapter 3 starts out with an introduction of the requisite mathematical background, followed by analyses of three existing optimization neural networks. It ends with an in-depth study of the linear programming problem with a hypercube feasible region as solved by Kennedy and Chua's network [19].

The core of this dissertation is in Chapter 4, in which the theoretical results for various optimization networks are derived. The derivation of the linear programming network is given first. Then, the results are extended to the quadratic programming network and finally the nonlinear programming network. The least squares problem is shown to be solvable by the quadratic programming network. A two-phase network described last is capable of converging to the exact solution of an optimization problem and obtaining the corresponding Lagrange multipliers as well.

Chapter 5 gives the simulation results of various optimization problems using the developed network structures. Chapter 6 illustrates the applicability of the optimization networks to real-world problems with two case studies: economic power dispatch and optimal power flow. The conclusion of this work is given in Chapter 7.

CHAPTER II

BACKGROUND

This chapter starts with a brief foundational review of neurobiology followed by some simplified models of neurons and different network interconnection topologies.
A literature review of ANN applications to optimization problems is given next. The Hopfield feedback network is covered independently in the last section because of its pioneering role in applying ANN techniques to optimization problems.

2.1 Neurobiological Review

The major parts of a typical biological neuron include a nucleus, cell body, axon hillock, axon, synapses, and dendrites, as shown in Figure 2.1. Dendrites are the receivers of incoming signals. When the incoming signal (a smoothly varying analog voltage) reaches a certain value, the nerve cell fires and the axon hillock generates a pulse (action potential). The output of a typical neuron consists of a series of action potentials, each about 1 millisecond long. If a neuron has a strong input, action potentials are generated at a high rate. If the input is weak or absent, action potentials are produced at a low rate. The mean rate of the generation of action potentials as a function of the input follows the form of a sigmoid function. The effective input usually refers to the short-time average or running integral of an excitation; the frequency of generating action potentials is considered as the effective output [61].

The axon is very resistive and is in charge of transmitting action potentials to synapses, through which a neuron interacts with up to thousands of other neurons. A synapse consists of two parts, the presynaptic membrane and the postsynaptic membrane. They are separated by a gap called the synaptic cleft, which is about 500 Å wide. Inside the presynaptic membrane there are small and diffusible molecules called neurotransmitters that are released into the cleft in response to an action potential. The released neurotransmitters diffuse to the postsynaptic membrane, where they combine with certain receptor molecules, causing a depolarization of the postsynaptic membrane. The depolarization signal is collected by the dendrite of the second neuron and sent to its cell body [62].
The response of the postsynaptic membrane is a graded response rather than a pulse. With an excitatory synapse the postsynaptic potential will be more positive, whereas with an inhibitory synapse the postsynaptic potential will be more negative.

Figure 2.1. A biological neuron.

2.2 Simplified Neuron Models and Network Topologies

Neuron models constructed for the purpose of studying the function of the brain generally involve very complicated mechanisms and thus result in complex structures. These models fall into the category of neuroscience-oriented ANNs and are of primary interest to neurobiologists. On the other hand, engineering-oriented neuron models, which are used for the purpose of improving the artificial (machine) processing of data or information, are of primary interest in this work.

The simplest neuron model sums the weighted inputs and passes the result through a certain function. The outcome of the function, considered as the output of the neuron, branches out via weighted connections to the inputs of other neurons. Let a_i be the inputs from other neurons, ω_ij be the weight of the connection from the output of neuron i to the input of neuron j, f(·) be the neural function, and y_j be the output of neuron j. As mentioned in the previous section, the firing of action potentials of a neuron is a type of threshold mechanism. To model this mechanism requires two more variables, θ_j and ω_0j, denoting the threshold value and its weight for neuron j, respectively. This simplified representation of a neuron is shown in Figure 2.2.

From the computation viewpoint, this neuron model can be thought of as a processing element (PE). The function f(·) maps input values to a prespecified range and is generally of one of the following four types: linear, nonlinear ramp, step, and sigmoidal. These input/output relationships are shown in Figure 2.3.
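This processing element can be sketched in a few lines of code (the function and variable names below are mine, and the threshold is assumed to enter as a weighted subtraction, y_j = f(Σ_i ω_ij a_i - ω_0j θ_j)), together with the four response types of Figure 2.3:

```python
import math

def linear(x):
    return x

def ramp(x, gamma=1.0):
    # Saturates at +/-gamma; linear for |x| < gamma.
    return max(-gamma, min(gamma, x))

def step(x):
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    # Bounded, monotonic, graded response in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(a, w, theta=0.0, w0=1.0, f=sigmoid):
    # Weighted sum of the inputs minus the weighted threshold, passed through f.
    net = sum(wi * ai for wi, ai in zip(w, a)) - w0 * theta
    return f(net)
```

For example, neuron_output([1.0, 1.0], [0.5, 0.5], theta=0.8, f=step) returns 1.0, because the net input 1.0 exceeds the threshold contribution 0.8.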
The sigmoidal function is the most pervasive because it is bounded, monotonic, and nondecreasing, and provides a graded, nonlinear response which most resembles a real neuron. The range of the sigmoidal function is sometimes changed from [0,1] to [-1,1], depending on the application, giving symmetry with respect to the origin.

Figure 2.2. Simplified representation of a neuron.

Figure 2.3. Neuron response functions. (a) Linear function: f(x) = x. (b) Nonlinear ramp function: f(x) = γ if x ≥ γ; x if |x| < γ; -γ if x ≤ -γ. (c) Step function: f(x) = 1 if x > 0; 0 otherwise. (d) Sigmoidal function: f(x) = 1/(1 + e^(-x)).

The connection weight in the model corresponds to the correlating strength between the presynaptic potential and the postsynaptic potential. A positive weight models an excitatory synapse and a negative weight models an inhibitory synapse. A positive connection weight implies a positive correlation between the connected neurons, or a rewarding relationship. A negative weight implies negative correlation, or a punishing relationship.

The interconnection structure of neurons in the human brain is very complicated but is thought not to be random. There is evidence showing that both the retina and cortex are organized into layers of cells with interconnections within and between layers. Connections within a layer are referred to as intra-field connections, lateral connections, or short-term memory (STM). Connections between layers are referred to as inter-field connections, field connections, or long-term memory (LTM). The intra-field connections are usually considered unidirectional, while inter-field connections may propagate signals in a feed-forward and/or feedback direction. Three interconnection topologies are given in Figure 2.4. The networks shown in Figures 2.4 (b) and (c) can be extended to n layers.

Figure 2.4. Interconnection topologies. (a) One-layer laterally connected neural network.
(b) Two-layer feed-forward neural network. (c) Two-layer feedback neural network.

As an example of the functional difference between these interconnections, consider the XOR problem [63]. This problem is not solvable using a single-layer neural network, as was discovered early on in ANN research. It can now be solved easily using a multilayered trained network. This training, however, is nontrivial, and it is still an open question how to best train a complicated ANN. The connection patterns within and between the layers are not necessarily fully connected as in the case illustrated. These connections may be nearest-neighbor type connections, connected according to certain patterns, or randomly connected with fixed fan-in and fan-out. Other connection topologies, such as mesh, feature map, and three-dimensional arrays, have also been studied [47-49,61-64]. Though not shown in Figure 2.4, every connection is weighted.

2.3 Literature Review

According to the chronologically edited book by Anderson and Rosenfeld, which contains over 40 important historical papers in this field, the idea of modern neural networks can be traced back to as early as the late 19th century [65]. Recent interest has been sparked mainly by the works of Hopfield [4-7], Grossberg [66-69], Kohonen [61,70], McClelland [71-74], and Rumelhart [71-76]. An in-depth study of neural network research and applications up to 1987 has been done by DARPA [77]. In this study, the storage of a neural network is measured in terms of interconnects, and the speed of a network is described in terms of interconnects-per-second. Also in this study, interconnects versus interconnects-per-second charts were used for the first time to measure the computational capabilities of neural networks.

Tank and Hopfield first applied an ANN-based technique to solve optimization problems like the traveling salesman problem and linear programming [67].
In the case of linear programming, the objective function and inequality constraints are mapped into a closed-loop network in such a way that constraint violations feed back to adjust the states of the neurons. The overall energy function of a network so designed decreases until it reaches a minimum. Under a high-gain limit assumption often placed on certain neurons, the corresponding output of the network then presumably approaches a solution to the original problem. However, this presumption is not true in general, as will be seen in the next chapter.

Motivated by the work of Tank and Hopfield, an abundance of research has been done on applying ANNs to other optimization problems. Some work has been done on the justification of the ANN model used for the traveling salesman problem and possible model modifications as well as extensions [8-13,78-79]. There is increasing interest in applying ANNs to various linear programming problems, such as integer linear programming and problems with equality constraints, and to related topics such as nonlinear programming and dynamic programming [14-24,80-82]. For engineering-related optimization applications, ANNs have been developed to solve the placement and routing problems in VLSI design [86-93], general computer-aided design [94-99], and power systems engineering problems like security monitoring, contingency classification, and economic power dispatching [25-26,100-109]. More general optimization-related applications using ANN-based techniques can be found in [27-46,110-117]. Most of the works cited, however, have not yet been vindicated theoretically, and this limits their applicability.

Some researchers have also sought to combine the ANN models with another new optimization technique, namely simulated annealing [118], in order to explore the merits of the two [18-24,51-54,119-122].
But the drawbacks are that the amount of time needed for the annealing process is too long, and there is no simple way to implement such models using any currently available technology. Analysis has been carried out on some of the ANN models used for optimization-related applications [55-60,123-132]. But, except for a few papers reported to date [18-20,23-24], most of these analyses were not undertaken from the viewpoint of optimization theory. As a result, the network so designed may converge to a point which is useless to the original problem. Though the ANN model used in [20] was developed based on some optimization methods, the authors merely replaced the discrete procedures by a set of differential equations, which do not necessarily guarantee the stability or the convergence of the network. We will justify our argument in the next chapter, where the models in [6,19,24] are carefully studied under a unifying optimization ANN theory developed in the course of this work.

2.4 The Hopfield Model

The Hopfield model falls into the category of a one-layer, laterally connected network; it is sometimes referred to as the feedback network [133]. One of the major contributions of the Hopfield model is that it can be built with analog circuit components and is suitable for analog VLSI implementation [134-142]. In this model each neuron, with input $u_i$ and output $V_i$, is modeled as an amplifier with a capacitive element $C_i$ and a resistive element $\rho_i$ at the input node. These components partially define the time constant of the neuron. The output of neuron $j$ is connected to the input of neuron $i$ via a finite conductance $T_{ij}$. This conductance models the synapse and is symmetric, i.e., $T_{ij} = T_{ji}$. Figure 2.5 illustrates the basic structure of a Hopfield neuron. The input-output relationship of the amplifier is sigmoidal.
The excitatory synapse ($T_{ij} > 0$) and the inhibitory synapse ($T_{ij} < 0$) are implemented by connecting the conductance to the normal output and the inverted output of amplifier $j$, respectively. A general Hopfield network is shown in Figure 2.6. The rate of change of $u_i$ is determined by the following equation, derived using KCL:

Figure 2.5. A basic neuron of the Hopfield model.

$$C_i \frac{du_i}{dt} = \sum_{j=1}^{n} T_{ij}(V_j - u_i) - \frac{u_i}{\rho_i} + I_i = \sum_{j=1}^{n} T_{ij} V_j - \Big(\sum_{j=1}^{n} T_{ij} + \frac{1}{\rho_i}\Big)u_i + I_i, \quad (2.1)$$

where

$$V_j = f_j(u_j). \quad (2.2)$$

Let $R_{ij} = \frac{1}{T_{ij}}$, $\frac{1}{R_i} = \frac{1}{\rho_i} + \sum_j \frac{1}{R_{ij}}$, and choose $f_j(\cdot) = f(\cdot)$ for all $j$. Then the above equations can be rewritten as

$$C_i \frac{du_i}{dt} = \sum_{j=1}^{n} T_{ij} V_j - \frac{u_i}{R_i} + I_i, \quad (2.3)$$

where

$$V_j = f(u_j) \quad (2.4)$$

and

$$u_j = f^{-1}(V_j). \quad (2.5)$$

Figure 2.6. A general Hopfield network.

The first integral, or the energy function, of equation 2.3 is

$$E = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} T_{ij}V_iV_j - \sum_{i=1}^{n} I_iV_i + \sum_{i=1}^{n}\frac{1}{R_i}\int_0^{V_i} f^{-1}(V)\,dV, \quad (2.6)$$

for which $-\frac{\partial E}{\partial V_i} = C_i\frac{du_i}{dt}$. The time derivative of the energy function can be found by applying the chain rule:

$$\frac{dE}{dt} = \sum_{i=1}^{n}\frac{\partial E}{\partial V_i}\frac{dV_i}{dt} = -\sum_{i=1}^{n} C_i\frac{dV_i}{du_i}\Big(\frac{du_i}{dt}\Big)^2. \quad (2.7)$$

Since $f(u_i)$ is monotonically increasing, $\frac{dE}{dt}\le 0$ for all $t$. As a result, the value of the energy function is strictly decreasing, and its time derivative becomes zero only at equilibrium points, at which $\frac{du_i}{dt} = 0$ for all $i$.

If $u_j$ in equation 2.4 is replaced by $\lambda u_j$, where $\lambda$ is a constant representing the neuron gain, then equation 2.5 becomes

$$u_i = \frac{1}{\lambda}f^{-1}(V_i). \quad (2.8)$$

Hopfield asserted that if $\lambda$ is chosen large enough, then the third term on the right-hand side of equation 2.6 is negligible compared to the other terms and thus can be dropped [5]. This leads to the following:

$$C_i\frac{du_i}{dt} = \sum_{j=1}^{n} T_{ij}V_j + I_i \quad (2.9)$$

and

$$E = -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} T_{ij}V_iV_j - \sum_{i=1}^{n} I_iV_i. \quad (2.10)$$

Note that these two equations are valid only under the high-gain limit assumption, that is, when $\lambda$ is very large.

Equations 2.3 - 2.6 actually define a gradient system and thus guarantee no oscillations or any other complicated behavior in the system [133].
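As a concrete check on the descent property above, the following minimal sketch integrates equations 2.3 and 2.6 with Euler steps and verifies that the energy never increases (equation 2.7). The network size, random weights, step size, and the choice $f = \tanh$ are illustrative assumptions of ours, not values from the text.

```python
import numpy as np

# Hedged sketch: Euler integration of equations 2.3-2.6 with f = tanh.
rng = np.random.default_rng(0)
n = 4
T = rng.normal(size=(n, n))
T = (T + T.T) / 2          # symmetric synapses, T_ij = T_ji
np.fill_diagonal(T, 0.0)
I = rng.normal(size=n)     # bias currents I_i
C = np.ones(n)             # unit capacitances
R = np.ones(n)             # unit input resistances

def f(u):                  # sigmoidal amplifier
    return np.tanh(u)

def energy(V):             # equation 2.6; for tanh, the integral of f^{-1}
    integral = V * np.arctanh(V) + 0.5 * np.log(1 - V**2)
    return -0.5 * V @ T @ V - I @ V + np.sum(integral / R)

u = 0.1 * rng.normal(size=n)
dt = 0.01
E0 = energy(f(u))
E_prev = E0
for _ in range(2000):
    V = f(u)
    du = (T @ V - u / R + I) / C   # equation 2.3
    u += dt * du
    E = energy(f(u))
    assert E <= E_prev + 1e-7      # dE/dt <= 0 (equation 2.7)
    E_prev = E
```

The monotone decrease of `E` along the trajectory is exactly the Lyapunov property the text appeals to; the closed-form integral term holds only for the tanh choice of $f$.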
Furthermore, it has been proven that such a system has only a finite number of isolated equilibria and that they are bounded [143-144]. The network can thus be envisioned as a system which tends to find a path leading to a local minimum in the energy surface. This surface is collectively defined by the network parameters. Those isolated equilibrium points may correspond to memory patterns in associative memory, patterns in a pattern recognition problem, or locally optimal solutions to an optimization problem.

CHAPTER III

ANALYSIS OF NEURAL NETWORKS FOR OPTIMIZATION

The requisite mathematical background is given in the first section to introduce the notation and the basic theorems used in this dissertation. The ANN models described in [6,19,24] are studied in detail and justified by the theorems stated in Section 3.1. A thorough analysis of the linear programming neural network for problems with a hypercube feasible region is covered in the last section. It serves to demonstrate the dynamics of the optimization network in [19].

3.1 Mathematical Background

The following notation and conventions are used throughout this dissertation. $X \subset R^n$ is said to be convex iff for any $a, b \in X$, $[a,b] \subset X$, where $[a,b] = \{x \in R^n \mid x = \lambda a + (1-\lambda)b,\ 0 \le \lambda \le 1\}$. Let $X \subset R^n$ be a nonempty convex set; then $f: X \to R$ is said to be convex iff $f(\lambda x + (1-\lambda)y) \le \lambda f(x) + (1-\lambda)f(y)$ for any $x, y \in X$ and for $0 \le \lambda \le 1$. The function $f: X \to R$ is concave if $-f$ is convex. An affine function $f: X \to R$ is a function which is both convex and concave. $K$ is a cone in $R^n$ iff $\alpha x \in K$ for any $x \in K$ and any $\alpha \ge 0$. $K$ is a convex cone in $R^n$ iff $K$ is a cone and $K$ is convex. Note that $K = \{Ay \mid y \ge 0\}$, where $A$ is an $n \times m$ matrix, is a closed convex cone. $H$ is a hyperplane in $R^n$ iff there exist $a \in R^n$, $a \ne 0$, and $\alpha \in R$ such that $H = \{x \in R^n \mid \langle x, a\rangle = \alpha\}$, where $\langle\cdot,\cdot\rangle$ is the Euclidean inner product on $R^n$ and $a$ is a normal vector of $H$. Hyperplanes $H$ and $H'$ are said to be parallel iff their normals are proportional.
Let $H = H(a,\alpha) = \{x \in R^n \mid \langle x,a\rangle = \alpha\}$ with $a \ne 0$; then the corresponding closed half-spaces are defined by $H_+(a,\alpha) = \{x \in R^n \mid \langle x,a\rangle \ge \alpha\}$ and $H_-(a,\alpha) = \{x \in R^n \mid \langle x,a\rangle \le \alpha\}$. Note that the quadratic function $f(x) = \frac{1}{2}x^TAx + a^Tx + b$ is convex (strictly convex) on $R^n$ iff $A$ is positive semidefinite (positive definite), where $A$ is a symmetric $n \times n$ matrix, $a \in R^n$, and $b \in R$.

Let the program (P)* be of the following form: Minimize $f(x)$ subject to the constraints

$$g_1(x) \le 0,\ \ldots,\ g_r(x) \le 0,$$
$$h_1(x) = 0,\ \ldots,\ h_m(x) = 0,$$

where $f$ and the $g_i$'s are functions on $R^n$ and the $h_j$'s are functions on $R^n$ for $m \le n$. (P) is said to be a convex program if $f$ and the $g_i$'s are convex functions on $R^n$ and the $h_j$'s are affine functions on $R^n$. A vector $x$ is called a feasible solution to (P) iff $x$ satisfies the $r+m$ constraints of (P). In other words, the feasibility set of (P) is the (possibly empty) set $K_0 = K_1\cap\cdots\cap K_r\cap L_1\cap\cdots\cap L_m$, where $K_i = \{x \mid g_i(x) \le 0\}$, $i = 1,\ldots,r$, and $L_j = \{x \mid h_j(x) = 0\}$, $j = 1,\ldots,m$.

* (P) and other capital letters, such as (LP), (QP), and (NP), when enclosed in parentheses represent mathematical programs.

When $K_0$ is empty, (P) is said to be infeasible; otherwise, (P) is feasible. If (P) is a convex program, $K_0$ is necessarily convex. For $x \in K_0$, the binding set at $x$ is the set $I = \{i \mid g_i(x) = 0\}$. Let $x \in K_0$; $x$ is said to be a regular point if the gradients $\nabla g_i(x)$, $i \in I(x)$, and $\nabla h_j(x)$, $1 \le j \le m$, are linearly independent. The following theorem is known as the Kuhn-Tucker optimality theorem. (For proof see [145].)

Theorem 3.1: Let (P) be a convex program in the notation above. Let $\bar{x}$ be a feasible solution to (P). Suppose each $g_i$ and $h_j$ is differentiable at $\bar{x}$. Assume further that $\bar{x}$ is a regular point. Then $\bar{x}$ is an optimal solution to (P) if and only if there exist $\lambda = [\lambda_1 \cdots \lambda_r]^T$ and $\mu = [\mu_1 \cdots \mu_m]^T$ that satisfy:

(i) $\lambda_i \ge 0$, $g_i(\bar{x}) \le 0$, and $\lambda_i g_i(\bar{x}) = 0$, $i = 1,\ldots,r$; and

(ii) $\nabla f(\bar{x}) + \sum_{i=1}^{r}\lambda_i\nabla g_i(\bar{x}) + \sum_{j=1}^{m}\mu_j\nabla h_j(\bar{x}) = 0$.

The variables $\lambda_i$ and $\mu_j$ are known as Lagrange multipliers.
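To make Theorem 3.1 concrete, here is a minimal numeric check on a small convex program of our own choosing (not one of the dissertation's examples), whose minimizer and multiplier are known in closed form.

```python
import numpy as np

# Hedged numeric check of Theorem 3.1 on a toy convex program:
#   minimize f(x) = 0.5*||x||^2  subject to g(x) = 1 - x1 - x2 <= 0.
# The minimizer is x_bar = (0.5, 0.5) with multiplier lambda_1 = 0.5.
x_bar = np.array([0.5, 0.5])
lam = 0.5

grad_f = x_bar                      # gradient of 0.5*||x||^2
g = 1.0 - x_bar.sum()               # constraint value at x_bar
grad_g = np.array([-1.0, -1.0])     # gradient of g

# Condition (i): lambda >= 0, g(x_bar) <= 0, lambda * g(x_bar) = 0.
assert lam >= 0 and g <= 1e-12 and abs(lam * g) < 1e-12
# Condition (ii): grad f + lambda * grad g = 0.
assert np.allclose(grad_f + lam * grad_g, 0.0)
```

Here the constraint is binding ($g(\bar{x}) = 0$), so complementary slackness holds with a strictly positive multiplier; moving $\bar{x}$ off the hyperplane breaks condition (ii).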
Without the assumption of the convexity of the functions, conditions (i) and (ii) are only necessary conditions for $\bar{x}$ to be a local minimizer.

Theorem 3.2: Let (P) be a program in the notation above. Let $\bar{x}$ be a feasible solution to (P). Suppose each $g_i$ is differentiable at $\bar{x}$. Assume further that $\bar{x}$ is a regular point. If $\bar{x}$ solves (P) locally, then there exist $\lambda = [\lambda_1 \cdots \lambda_r]^T$ and $\mu = [\mu_1 \cdots \mu_m]^T$ that satisfy:

(i) $\lambda_i \ge 0$, $g_i(\bar{x}) \le 0$, and $\lambda_i g_i(\bar{x}) = 0$, $i = 1,\ldots,r$; and

(ii) $\nabla f(\bar{x}) + \sum_{i=1}^{r}\lambda_i\nabla g_i(\bar{x}) + \sum_{j=1}^{m}\mu_j\nabla h_j(\bar{x}) = 0$.

For proof of the above theorem see [146].

Define $\mu_j^+ = \max\{\mu_j, 0\}$ and $\mu_j^- = \min\{\mu_j, 0\}$. Then $\mu_j = \mu_j^+ + \mu_j^-$ for $1 \le j \le m$. The corresponding term in condition (ii) above can be written as

$$\mu_j\nabla h_j = (\mu_j^+ + \mu_j^-)\nabla h_j = \mu_j^+\nabla h_j - (-\mu_j^-)\nabla h_j = \mu_j^+\nabla h_j + (-\mu_j^-)\nabla(-h_j). \quad (3.1)$$

Since only one of $\mu_j^+$ or $\mu_j^-$ is nonzero and $h_j = 0$, condition (ii) can be extended by stipulating mutually exclusive terms for $h_j \le 0$ and $h_j \ge 0$. The Lagrange multipliers are now forced to be all non-negative. Let $g_{r+2j-1} = h_j$, $g_{r+2j} = -h_j$, $\lambda_{r+2j-1} = \mu_j^+$, and $\lambda_{r+2j} = -\mu_j^-$; the extended program (P′) can now be expressed as: Minimize $f(x)$ subject to $g_i(x) \le 0$, $1 \le i \le r+2m$.

Under these notations, Theorem 3.1 may be restated as the following corollary.

Corollary 3.1: Under the assumptions of Theorem 3.1 and the notations above, $\bar{x}$ is an optimal solution to (P′) if and only if there exists $\lambda = [\lambda_1 \cdots \lambda_{r+2m}]^T$ that satisfies

(i) $\lambda_i \ge 0$, $g_i(\bar{x}) \le 0$, and $\lambda_i g_i(\bar{x}) = 0$, $i = 1,\ldots,r+2m$; and

(ii) $\nabla f(\bar{x}) + \sum_{i=1}^{r+2m}\lambda_i\nabla g_i(\bar{x}) = 0$.

Next, define an alternative binding set to be $I'(x) = \{i \mid \lambda_i \ne 0,\ r+1 \le i \le r+2m\}$; then condition (ii) of Corollary 3.1 can be expressed as

$$\nabla f(\bar{x}) + \sum_{i\in I(\bar{x})}\lambda_i\nabla g_i(\bar{x}) + \sum_{i\in I'(\bar{x})}\lambda_i\nabla g_i(\bar{x}) = 0. \quad (3.3)$$

This implies that $-\nabla f(\bar{x})$ lies in the closed convex cone spanned by the $\nabla g_i(\bar{x})$ for $i \in I(\bar{x})\cup I'(\bar{x})$. If $\bar{x}$ is a regular point, then $-\nabla f(\bar{x})$ can be uniquely decomposed into a positive linear combination of the $\nabla g_i$, $i \in I(\bar{x})\cup I'(\bar{x})$.
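The $\mu_j = \mu_j^+ + \mu_j^-$ split of equation 3.1 can be checked on a toy program of our own choosing with a negative equality multiplier; the example data below are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Hedged sketch of equation 3.1 on a toy program:
#   minimize 0.5*||x||^2  subject to h(x) = x1 - 1 = 0.
# At x_bar = (1, 0): grad f = (1, 0), grad h = (1, 0), so mu = -1.
x_bar = np.array([1.0, 0.0])
grad_f = x_bar
grad_h = np.array([1.0, 0.0])
mu = -1.0

mu_plus, mu_minus = max(mu, 0.0), min(mu, 0.0)
assert np.isclose(mu_plus + mu_minus, mu)

# Extended program (P'): h <= 0 gets multiplier mu+, and -h <= 0 gets -mu-.
# Both extended multipliers are non-negative, and condition (ii) still holds.
lam_1, lam_2 = mu_plus, -mu_minus
assert lam_1 >= 0 and lam_2 >= 0
assert np.allclose(grad_f + lam_1 * grad_h + lam_2 * (-grad_h), 0.0)
```

Only one of the two extended multipliers is nonzero, which is exactly the mutual exclusivity the text relies on when forming (P′).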
Define $g_i^+(x) = \max\{0, g_i(x)\}$; i.e., $g_i^+(x)$ is the magnitude of the violation of the $i$th constraint in (P′), where $1 \le i \le r+2m$. The following theorem is known as the penalty function theorem. (For proof see [147].)

Theorem 3.3: Let (P′) be the extended program stated above for $f \in C^1$ and $g_i \in C^1$, where $1 \le i \le r+2m$. Let $\{s_k\}_1^\infty$ be a nonnegative, strictly increasing sequence tending to infinity. Define the function

$$L(s, x) = f(x) + \frac{s}{2}\sum_{i=1}^{r+2m}(g_i^+(x))^2. \quad (3.4)$$

Let the minimizer of $L(s_k, x)$ be $x_k$. Then any limit point of the sequence $\{x_k\}_1^\infty$ is an optimal solution to (P′) and, equivalently, to (P). Furthermore, if $x_k \to \bar{x}$ and $\bar{x}$ is a regular point, then $s_k g_i^+(x_k) \to \lambda_i$, the Lagrange multiplier associated with (P′).

Note that the penalty function $L(s,x)$ is called the energy function in [6]. Later it will be shown that it is a qualified Lyapunov function for a neural network. The following corollary is a direct result of Theorem 3.3.

Corollary 3.2: Let the notations and assumptions be as defined in Theorem 3.3. Then given $\varepsilon > 0$, there exists a sufficiently large $s$ such that the minimizer of $L(s,x)$ lies in $N(\Theta, \varepsilon)$, where $N(\Theta, \varepsilon) = \{x \mid \|x - \bar{x}\| < \varepsilon,\ \bar{x} \in \Theta\}$ and $\Theta$ is the set of minimizers of (P′).

3.2 Network Analysis

There are currently three ANN models proposed for solving nonlinear programming problems [6,19,24]. Consider first only the case of linear programming. The linear program (LP) considered here is of the following form:

Minimize $f(x) = a^Tx$ subject to $g(x) = Dx - b \le 0$,

where $D$ is an $m \times n$ matrix, $b \in R^m$, $a \in R^n$, and $x \in R^n$. If equality constraints are considered as well, then each of them can be replaced by two inequality constraints as shown in the last section, so that the following discussion still holds. Note that $\nabla f(x) = a$ and $\nabla g(x) = D^T$. For simplicity of notation, denote $g^+ = [g_1^+ \cdots g_m^+]^T$.

3.2.1 The Model by Tank and Hopfield

The network structure proposed by Tank and Hopfield for solving linear programming problems is shown in Figure 3.1 [6].
$a_i$ and $b_i$ are implemented by current sources. The voltage outputs $x_i$, on the upper right of the figure, are the variables of the linear programming problem. The outputs $g_j^+$, on the lower left of the figure, measure the constraint satisfaction (violation). The $s$ in the rectangles is a large positive constant. Their model can be described in compact form by

$$\dot{x} = sC^{-1}\Big\{-\nabla f(x) - \nabla g(x)g^+(x) - \frac{1}{s}R^{-1}x\Big\}. \quad (3.5)$$

Figure 3.1. Linear programming network by Tank and Hopfield.

Here $C$ is an $n \times n$ diagonal matrix due to the self-capacitance of each amplifier, and $R$ is an $n \times n$ diagonal matrix with $\frac{1}{R_{ii}} = \frac{1}{\rho_i} + \sum_{j=1}^{m}D_{ji}$, where $\frac{1}{\rho_i}$ is the self-conductance of each amplifier. The so-called energy function is chosen to be

$$E_1(x) = f(x) + \frac{1}{2}\sum_{j=1}^{m}(g_j^+(x))^2 + \sum_{i=1}^{n}\frac{x_i^2}{2sR_{ii}}. \quad (3.6)$$

Taking the derivative of $E_1(x)$ with respect to time yields

$$\frac{dE_1}{dt} = \nabla E_1^T\dot{x} = \Big(-\frac{1}{s}C\dot{x}\Big)^T\dot{x} = -\frac{1}{s}\dot{x}^TC\dot{x} \le 0 \quad (3.7)$$

for all $t$. Equality holds only at equilibrium points, since $C$ is a positive diagonal matrix. Note that $\nabla f$ and $\nabla g$ are constant for linear programming and $g^+(x)$ is just a vector. Replace $g^+(x)$ by a vector $\lambda$ and take $C$ to be the identity matrix; then the minimizer $\bar{x}$ of $E_1$ occurs when

$$-\nabla f(\bar{x}) - \nabla g(\bar{x})\lambda - \frac{1}{s}R^{-1}\bar{x} = 0. \quad (3.8)$$

Comparing this equation with condition (ii) of Theorem 3.2 shows that either the system described by equation 3.5 does not have an equilibrium or else, even if it does, the equilibrium is not a solution to the program (LP). Suppose $s$ is sufficiently large, as suggested by Tank and Hopfield; then the last terms of equations 3.5 and 3.8 can be neglected. In this case equation 3.8 can be viewed as fulfilling the necessary conditions of Theorem 3.2. But since $\lambda_j$ ($= g_j^+$) is required to be positive for some $j$, the equilibrium of equation 3.5 has to lie in the infeasible region of the program (LP).
Depending on the particular program, the equilibrium may be quite far from the true minimizer of (LP), which is generally on a corner (or boundary) of the feasible region. This drawback makes their model unreliable for solving linear programming problems, even though the model has a big advantage for hardware implementation.

3.2.2 The Model by Kennedy and Chua

The model developed by Kennedy and Chua [19] is based on Chua's previous work in [148]. The basic components in their network are integrators, as shown in Figure 3.2. Their network formulation requires more hardware components to form the integrator in its analog circuit implementation when compared to the Tank and Hopfield network. But it is superior in that it circumvents the undesired terms in equations 3.5 and 3.6, namely, the terms due to self-conductance.

Figure 3.2. The integrator used in Kennedy and Chua's model.

Their model can be described by

$$\dot{x} = C^{-1}\{-\nabla f(x) - s\nabla g(x)g^+(x)\}, \quad (3.9)$$

where $C$ and $s$ are defined as in equation 3.5. For argument's sake, $C$ is normally taken to be an identity matrix. This model has been used to solve both linear programming and quadratic programming problems. The corresponding energy function is

$$E_2(x) = f(x) + \frac{s}{2}\sum_{j=1}^{m}(g_j^+(x))^2. \quad (3.10)$$

Kennedy and Chua showed that $E_2(x)$ is a Lyapunov function for the system of equation 3.9. This ensures that the system will converge to a stable equilibrium point without oscillation. Their network analysis is primarily based on the nonlinear circuit theory derived in [149]. Comparing equations 3.4 and 3.10 indicates that their work actually implements the penalty function method for a fixed penalty parameter. But they fail to justify the assumption required for the penalty function theorem (Theorem 3.3) to hold.
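As an illustrative (not authoritative) sketch, the Kennedy-Chua dynamics of equation 3.9 can be Euler-integrated on a toy box-constrained LP of our own choosing; the problem data, step size, and penalty parameter below are assumptions.

```python
import numpy as np

# Hedged simulation of equation 3.9 (C = I) on a toy LP:
#   minimize x1 + x2  subject to 0 <= x_i <= 1  (minimizer at corner (0,0)),
# written as g(x) = D x - b <= 0.
a = np.array([1.0, 1.0])
D = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])
s = 50.0

def E2(x):                          # equation 3.10
    gp = np.maximum(D @ x - b, 0.0)
    return a @ x + 0.5 * s * gp @ gp

x = np.array([0.8, 0.6])
dt = 1e-3
for _ in range(20000):
    gp = np.maximum(D @ x - b, 0.0)
    x += dt * (-a - s * D.T @ gp)   # x_dot = -grad f - s * grad g * g+
# The equilibrium sits slightly OUTSIDE the feasible region, roughly at
# (-1/s, -1/s), i.e., at distance ~1/s from the true minimizer (0, 0);
# s * g+ at the equilibrium approximates the Lagrange multipliers (1, 1).
```

This reproduces the behavior claimed in the text: $E_2$ decreases along the trajectory, and the fixed-$s$ equilibrium only approximates the true minimizer.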
Nor do they clarify the relation between the equilibrium point of the network and the true minimizer of the original program, since there could be more than one equilibrium point with respect to one minimizer unless the regularity of the minimizer is assumed. A straightforward analysis from the viewpoint of optimization theory is undertaken in the next chapter to establish a sounder theoretical foundation for Kennedy and Chua's network.

3.2.3 The Model by Rodriguez-Vazquez, et al.

The energy functions of the previous two networks are variations of penalty function methods, since they are formed by adding penalty terms to the cost functions. The penalty terms are derived by taking the squared magnitude of the constraint violation times a penalty parameter. According to the penalty function theorem, the true minimizer can only be obtained when the penalty parameter $s$ is infinite. This is impossible to achieve in practice. To cope with this difficulty, Rodriguez-Vazquez, et al. proposed a network model which is formed by two mutually exclusive subsystems [24]. (Mutually exclusive here means only one of the two will contribute at a time.)

Let $u_x$ be the feasibility index of $x$, i.e., $u_x = 1$ if $x \in K_0$; otherwise, $u_x = 0$. Their model can be expressed as

$$\dot{x} = -u_x\nabla f(x) - s\nabla g(x)g^+(x). \quad (3.11)$$

The corresponding energy function is

$$E_3(x) = u_x f(x) + \frac{s}{2}\sum_{j=1}^{m}(g_j^+(x))^2. \quad (3.12)$$

The trajectory of the system moves along $-\nabla f(x)$ if $u_x = 1$; otherwise, it moves according to the negative gradients of the violated constraints. The combined effort of these two mutually exclusive subsystems forces the conglomerate trajectory to move toward the boundary of the feasible region. Once it hits the boundary, the trajectory chatters around the boundary and, at the same time, approaches the optimum point. But a new problem arises: there is no equilibrium point in this system, since condition (ii) of Theorem 3.2 cannot be met.
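The chattering behavior of equation 3.11 can be sketched in discrete time on the same kind of toy box-constrained LP (our choice of data). One caveat of the sketch: a small feasibility tolerance `tol` stands in for the exact boundary test, which a discrete-time simulation cannot hit exactly.

```python
import numpy as np

# Hedged sketch of the switched model of equation 3.11 on a toy LP:
#   minimize x1 + x2 over [0,1]^2, written as g(x) = D x - b <= 0.
a = np.array([1.0, 1.0])
D = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])
s, dt, tol = 20.0, 0.01, 1e-4

x = np.array([0.8, 0.6])
recent_speed = 0.0
for k in range(5000):
    gp = np.maximum(D @ x - b, 0.0)
    u_x = 1.0 if gp.max() <= tol else 0.0     # feasibility index
    v = -u_x * a - s * D.T @ gp               # equation 3.11
    if k >= 4800:
        recent_speed = max(recent_speed, np.linalg.norm(v))
    x += dt * v
# x ends up chattering near the minimizer (0, 0) but never settles: in the
# feasible phase the velocity is -a, which never vanishes, so the switched
# system has no equilibrium point.
```

The recorded `recent_speed` stays on the order of $\|a\|$ late in the run, illustrating the bounded but never-vanishing variation the authors describe.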
The trajectory bounces back and forth in a neighborhood of the minimizer, though the neighborhood can be made very small. The authors, however, have suggested that the system can be viewed as stable if the variation of the solution is bounded. They also suggested another model with the following system equation and energy function:

$$\dot{x} = -u_x\nabla f(x) - s\sum_{j\in J(x)}\nabla g_j(x)\,(1-u_x) \quad (3.13)$$

and

$$E_3'(x) = u_x f(x) + s\sum_{j=1}^{m}g_j^+(x), \quad (3.14)$$

where $J(x)$ denotes the index set of the violated constraints. But the same problem still exists for this model.

3.3 Linear Programming Network for Problems with a Hypercube Feasible Region

The genre of linear programming problems considered here is the minimization of a cost function

$$f(x) = a^Tx \quad (3.15)$$

subject to $x \in [0,1]^n$, i.e., $0 \le x_i \le 1$ for every $i$, where $x_i$ is the $i$th component of the $n$-vector $x$. Let $1_n$ and $0_n$ be $n$-vectors of all ones and all zeros, respectively. Then the constraints can be represented in matrix form as

$$g(x) = \begin{bmatrix} x - 1_n \\ -x \end{bmatrix} = \begin{bmatrix} I_n \\ -I_n \end{bmatrix}x + \begin{bmatrix} -1_n \\ 0_n \end{bmatrix} = Dx + b \le 0. \quad (3.16)$$

The level sets of $f(x)$ are hyperplanes in $R^n$, and the feasible region is the unit hypercube in $R^n$. The general assumption is made on the system that $-a$ is not parallel to the normal vector of any hyperplane $g_j = 0$, where $g_j$ is the $j$th component of $g(x)$. This ensures the uniqueness of the optimum point, in this case the minimizer of the cost function. The minimizer of the cost function is one of the $2^n$ corners of the hypercube, depending on the normal vector of the hyperplane.

Let $J = \{j \mid g_j > 0\}$ be the set of indices of the constraints which are violated. Let $k = \mathrm{card}(J)$, the cardinality of $J$, i.e., the number of components of $J$. Then $g_J^+$ takes on the magnitudes of the constraint violations. Consider next a network structure given by the piecewise linear differential equation

$$\dot{x} = -[D_J^Tg_J^+ + a] = -[D_J^T(D_Jx + b_J)^+ + a]. \quad (3.17)$$

This is similar to equation 3.5 without the term due to self-conductance. Note that the structure of the dynamical system so described varies with time, since $J$ changes from time to time.
$D_J$ is a $k \times n$ matrix consisting of the $j$th rows of $D$ for every $j \in J$; similarly, $b_J$ consists of the $j$th components of $b$. Observe that since there are at most $n$ constraint violations, $D_J$ is at most an $n \times n$ matrix.

3.3.1 The Equilibrium of the System

Under the assumption placed on $-a$, the minimizer of the cost function is at an intersection of $n$ of the $2n$ hyperplanes $g_j = 0$. Recall that, by the definition in Section 3.1, $I$ is the set of indices of these hyperplanes. By the Kuhn-Tucker theorem for optimality, the necessary and sufficient condition for a point to be a minimizer is that

$$\nabla f + \sum_{j\in I}\mu_j\nabla g_j = 0, \quad (3.18)$$

where $\mu_j > 0$. For any hypercube, the $\nabla g_j$'s are orthogonal for $j \in I$. Therefore, equation 3.18 can be viewed as decomposing $\nabla f$ into $n$ orthogonal components, each along the direction of some $-\nabla g_j$.

Note that $\nabla f = a$ and $\nabla g_j = D_j^T$. Also, $(D_jx + b_j)^+$ measures the one-sided distance of $x$ to $g_j = 0$. If we denote $(D_jx + b_j)^+$ by $w_j$, then the equilibrium of equation 3.17 must satisfy

$$\nabla f + \sum_{j\in J}w_j\nabla g_j = 0. \quad (3.19)$$

Matching $J$ with $I$ and $w_j$ with $\mu_j$ shows that the equilibrium of the system fulfills the Kuhn-Tucker optimality condition. Furthermore, the equilibrium is unique due to the orthogonality of the hyperplanes $g_j = 0$ for $j \in J$. Also observe that $w_j = (D_jx + b_j)^+ > 0$ for $j \in J$, indicating that the equilibrium lies in the infeasible region where $n$ constraints are violated.

3.3.2 The Initial State in the Feasible Region

When the initial state of the system is any point in the feasible region, then $J$ is simply empty and the trajectory moves in the direction of $-a$ until it hits either one of the hyperplanes or an intersection of certain hyperplanes. Assume the former is the case. On the hyperplane $g_j = 0$, the trajectory still follows the direction of $-a$ and thus eventually enters the other side of the hyperplane. At this time, $D_J$ becomes a $1 \times n$ matrix denoted by $D_j$ (small $j$). The dynamics of the system are now described by

$$\dot{x} = -D_j^T(D_jx + b_j)^+ - a. \quad (3.20)$$
$-D_j^T(D_jx + b_j)^+$ is a vector in the direction of $-D_j^T$ with magnitude proportional to $(D_jx + b_j)^+$. If we decompose $-a$ into two parts as $-a = a_j + a_\perp$, where $a_j$ is the projection of $-a$ in the direction of $D_j^T$ and $a_\perp$ is the remainder, then equation 3.20 becomes

$$\dot{x} = [-D_j^T(D_jx + b_j)^+ + a_j] + a_\perp. \quad (3.21)$$

As the system evolves in time, it reaches a point where the first two terms on the right-hand side of equation 3.21 cancel each other. Thereafter, the trajectory rolls along with $a_\perp$ until it hits another hyperplane.

When there is at least one constraint violation, similarly to the case with the initial state in the infeasible region, $-a$ can be decomposed as $-a = \sum_{j\in J}a_j + a_\perp$, where $a_j$ is the projection of $-a$ onto the span of $D_j^T$ for $j \in J$ and $a_\perp$ is the portion of $-a$ not in this span. Note that $a_j$ is fixed and unique for each $j \in J$ due to the orthogonality of the normal vectors $D_j^T$; $J$, however, is not fixed. The differential equation of the system for $k \ge 1$ can now be written as

$$\dot{x} = -\sum_{j\in J}D_j^T(D_jx + b_j)^+ + \sum_{j\in J}a_j + a_\perp. \quad (3.22)$$

Extending the above argument to $k > 1$ illustrates that, as the trajectory moves from one side of a hyperplane $g_j = 0$ to the other, the projection $a_j$ is continually offset by $D_j^T(D_jx + b_j)^+$ until their difference becomes zero. Although the trajectory may hit another hyperplane, the orthogonality of the $D_j^T$'s nonetheless guarantees that $a_j - D_j^T(D_jx + b_j)^+$ eventually becomes zero. It is clear at this point that if a trajectory starting in the feasible region hits the intersection of certain hyperplanes, each $a_j$ will be diminished gradually by the corresponding term $D_j^T(D_jx + b_j)^+$, and the result stated above still holds. Since $-a$ is not parallel to any hyperplane by assumption, it can be decomposed into $n$ orthogonal projections. As the trajectory moves from region to region, it will eventually enter a region where $n$ of the $2n$ constraints are violated.
In this region the trajectory settles to the unique equilibrium point, at which $-a$ is fully represented by $-\sum_{j\in J}D_j^T(D_jx + b_j)^+$.

3.3.3 Example of a 2-Dimensional Hypercube

Due to the orthogonality of the hyperplanes, the trajectories of the system can be illustrated by considering the case of a 2-dimensional hypercube. In Figure 3.3, the vector $-a$ is shown at the upper right corner together with its two orthogonal projections. The intersection of the two dotted lines is the equilibrium point of the system. The broadened gray lines are trajectories of the system corresponding to different initial states.

Figure 3.3. Trajectories starting in the feasible region.

Figure 3.4 gives several examples with the initial states in the infeasible region. The corresponding $-\sum_{j\in J}D_j^T(D_jx + b_j)^+$ with respect to each initial point is drawn as a solid line segment, along with the vector $-a$ shown as a dotted line segment. The direction of the trajectory changes whenever it crosses a hyperplane. Once the trajectory reaches one of the dotted lines, i.e., one of the orthogonal projections of $-a$, it stays on the line and moves toward the equilibrium point. The equilibrium point will always lie in one of the four regions shown with a light gray background in Figure 3.4. If we consider the normal vectors of the hyperplanes separating these four regions from each other as bases of $R^2$, then there is only one region in which $-a$ has positive coordinates. This is another interpretation of equation 3.17.

Figure 3.4. Trajectories starting in the infeasible region.

3.3.4 Further Analysis

Consider now the system defined by

$$\dot{x} = -s\,[D_J^T(D_Jx + b_J)^+] - a = -s\Big[\sum_{j\in J}D_j^T(D_jx + b_j)^+\Big] - a, \quad (3.23)$$

which is the same as equation 3.9 but expressed more explicitly. When $s = 1$, equation 3.23 is the same as equation 3.17. When $s > 1$, denoting $v_j = s(D_jx + b_j)^+$ for $j \in J$, the equilibrium of equation 3.23 must satisfy

$$\nabla f + \sum_{j\in J}v_j\nabla g_j = 0. \quad (3.24)$$
The corresponding terms of equations 3.24 and 3.18, namely $I = J$ and $\mu_j = v_j$, can be matched. Since $\mu_j$ is fixed for a given cost function, $(D_jx + b_j)^+ = \frac{\mu_j}{s}$ at the equilibrium of equation 3.23. This means that the distance from the equilibrium point $x$ to $g_j = 0$ is reduced by a factor of $s$ when compared to the case where $s = 1$ (equation 3.19). By choosing $s$ sufficiently large, the equilibrium point can thus be moved arbitrarily close to the intersection of the corresponding $n$ hyperplanes, which is the minimizer of the cost function.

Next, the energy function $E$ is defined as

$$E = f(x) + \frac{s}{2}\sum_{j\in J}[(D_jx + b_j)^+]^2. \quad (3.25)$$

The time derivative of $E$ is derived as

$$\frac{dE}{dt} = \sum_i\frac{\partial E}{\partial x_i}\frac{dx_i}{dt} = \Big[a + s\sum_{j\in J}D_j^T(D_jx + b_j)^+\Big]^T\dot{x} = -\dot{x}^T\dot{x} < 0 \quad (3.26)$$

for all $\dot{x} \ne 0$. Thus $E$ is a Lyapunov function for the system, and hence the equilibrium is asymptotically stable by the asymptotic stability theorem [150]. In fact, it is also asymptotically stable in the global sense due to the unboundedness of $(D_Jx + b_J)^+$ as $\|x\| \to \infty$. For $s$ sufficiently large, equation 3.25 is a form of the penalty function method. But, if we use the notation of $v_j$, then equation 3.25 becomes

$$E = f(x) + \frac{1}{2}\sum_{j\in J}v_j(D_jx + b_j)^+, \quad (3.27)$$

which is a form of the Lagrangian function method. So the forms of both methods are implicitly embedded in the network structure described by equation 3.23. As follows from the penalty function theorem, the equilibrium of equation 3.23 approaches the minimizer of the cost function as $s \to \infty$. This, however, is impossible to implement in practice. A sufficiently large $s$ will generally result in an equilibrium state which is a reasonably good approximation to the minimizer of the cost function. A diagram of the trajectories of equation 3.23 for $s = \infty$ is shown in Figure 3.5 for a 2-dimensional hypercube.
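The factor-of-$1/s$ reduction of the equilibrium's distance to the binding hyperplanes can be checked numerically. In the sketch below (problem data, step size, and iteration counts are our illustrative choices), the minimizer is the corner $(0,0)$ and the predicted equilibrium of equation 3.23 is $(-a_1/s, -a_2/s)$.

```python
import numpy as np

# Hedged numeric check of the 1/s scaling of equation 3.23's equilibrium on
# a toy 2-D hypercube program: minimize x1 + 2*x2 over [0,1]^2.
a = np.array([1.0, 2.0])
D = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])

def equilibrium(s, steps=40000, dt=5e-4):
    x = np.array([0.5, 0.5])
    for _ in range(steps):
        gp = np.maximum(D @ x - b, 0.0)
        x += dt * (-s * D.T @ gp - a)      # equation 3.23
    return x

for s in (1.0, 10.0, 50.0):
    x_eq = equilibrium(s)
    # Here mu = (1, 2), so the equilibrium violation of each binding
    # constraint is mu_j / s, i.e., x_eq = (-1/s, -2/s).
    assert np.allclose(x_eq, [-1.0 / s, -2.0 / s], atol=1e-3)
```

Increasing $s$ by a factor of ten shrinks the equilibrium's infeasibility by the same factor, which is the scaling argument made above.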
Whenever the trajectory lies in the infeasible region, it will be forced to move directly to either the closest hyperplane or, if more than one constraint is violated, the intersection of the violated hyperplanes. Then it will move according to $-a$ toward one of the hyperplanes $g_j = 0$, $j \in I$. Once it reaches such a hyperplane, the trajectory slides along the hyperplane toward an intersection of hyperplanes with indices $j \in I$.

3.3.5 Extensions

The results of the above analysis are directly applicable to cases where the hypercube is scaled up or down, translated from the origin to any point, and/or rotated by any angle. They are also applicable to the region defined by $[l_1, u_1]\times\cdots\times[l_n, u_n]$ and its scalings, translations, and rotations, as long as the orthogonality of the hyperplanes ($g_j = 0$ for $j \in I$) is preserved. If the orthogonality is not preserved, the equilibrium of the system defined by equation 3.23 may not be unique, though the local asymptotic stability of the desired equilibrium still holds. An extended theoretical argument for nonlinear programming in general is given in the next chapter.

Figure 3.5. Trajectories for $s = \infty$.

CHAPTER IV

OPTIMIZATION NETWORK FORMULATION

A unifying mathematical framework for the Kennedy and Chua network for linear and quadratic programming is given in the first two sections, followed by its extension to more complicated nonlinear programming problems. A two-phase optimization network model is then proposed which can obtain both the exact solution, in contrast to the approximate solution from Kennedy and Chua's network, and the corresponding Lagrange multipliers associated with each constraint.

4.1 Linear Programming Network Theory

Let the linear program (LP) be defined as in Section 3.2 and refer to its objective function as $f_1(x)$. In this section, the network by Kennedy and Chua for linear programming is justified from the viewpoint of optimization theory.
As has been pointed out in Section 3.2.2, $E_2(x)$ is precisely $L(s,x)$ for a fixed penalty parameter $s$. Thus we have the following proposition, which is a restatement of Corollary 3.2.

Proposition 4.1: Let $\Theta$ be the set of minimizers of a feasible (LP). If $\Theta$ is bounded and contains only regular points, then given $\varepsilon > 0$, there exists a sufficiently large $s$ such that $M$, the set of minimizers of the corresponding $E_2(x)$, satisfies $\min_{\bar{x}\in\Theta}\|x - \bar{x}\| < \varepsilon$ for $x \in M$.

Note that $M$ is convex due to the fact that the sublevel set of a convex function, namely $E_2(x)$, is convex. Without the assumption of the boundedness of $\Theta$, the proposition is still true. But when $\Theta$ is unbounded, any bounded subset of $\Theta$ is sufficient for obtaining a minimizer. This is due to the fact that if an (LP) has a finite optimum value, it must have a finite minimizer. Thus for $\Theta$ unbounded, we can place some additional, suitable bounding constraints into the original program so that the following discussion still holds.

Take $C$ to be $I$ in equation 3.9 and rewrite it as

$$\dot{x} = -\nabla f(x) - s\Big[\sum_{j=1}^{m}g_j^+(x)\nabla g_j(x)\Big]. \quad (4.1)$$

The block diagram of the system described by equation 4.1 is drawn in Figure 4.1.

Figure 4.1. The block diagram of the system of equation 4.1.

Define $sg_j^+ = v_j$ and $J(x) = \{j \mid g_j^+(x) > 0,\ 1 \le j \le m\}$. The equilibrium of equation 4.1 occurs when

$$0 = \nabla f(\bar{x}) + s\sum_{j=1}^{m}g_j^+(\bar{x})\nabla g_j(\bar{x}) = \nabla f(\bar{x}) + \sum_{j\in J(\bar{x})}v_j\nabla g_j(\bar{x}). \quad (4.2)$$

Since for linear programming problems $\nabla g_j = d_j$, where $d_j$ is the $j$th row of $D$, equation 4.2 can be expressed as

$$\nabla f(\bar{x}) + \sum_{j\in J(\bar{x})}v_jd_j = 0. \quad (4.3)$$

Proposition 4.2: If the (LP) is feasible with a finite optimum value, then the system described by equation 4.1 has an equilibrium.

Proof: Let $\bar{x}$ be a minimizer of the (LP). Then, according to Corollary 3.1, there exists $\lambda_j > 0$ for $j \in I(\bar{x})$ such that

$$a + \sum_{j\in I(\bar{x})}\lambda_jd_j = 0. \quad (4.4)$$

Choosing $J(\bar{x}) = I(\bar{x})$ and $v_j = \lambda_j$ verifies the proposition.

Proposition 4.3: Let $\bar{x}$ be a solution to a feasible (LP) with a finite optimum value.
If $\bar{x}$ is not regular, then the equilibrium of equation 4.1 corresponding to $\bar{x}$ is not unique.

Proof: Since $\bar{x}$ is not regular, the $\{d_j\}_{j\in I(\bar{x})}$ are linearly dependent. This implies that there exist two subsets of $I(\bar{x})$, say $I'$ and $I''$ with $I' \ne I''$, such that

$$a + \sum_{j\in I'}\alpha_jd_j = 0, \quad \alpha_j > 0, \quad (4.5)$$

and

$$a + \sum_{j\in I''}\beta_jd_j = 0, \quad \beta_j > 0. \quad (4.6)$$

Let $\alpha_j = sg_j^+(x^1)$ for $j \in I'$ and $\beta_j = sg_j^+(x^2)$ for $j \in I''$. Since $I' \ne I''$, there is at least one $\alpha_j \ne \beta_j$ and, consequently, at least one $g_j(x^1) \ne g_j(x^2)$. Thus the equilibrium corresponding to $\bar{x}$ is not unique.

Note that even though the equilibrium points corresponding to $\bar{x}$ are not unique, they all result in the same value of $E_2(x)$, since any equilibrium point of equation 4.1 is a minimizer of $E_2(x)$, as will be shown later. In fact, it is observed that any convex combination of equations 4.5 and 4.6 satisfies the equilibrium condition. That is to say, for $0 \le \theta \le 1$,

$$a + \theta\sum_{j\in I'}\alpha_jd_j + (1-\theta)\sum_{j\in I''}\beta_jd_j = 0. \quad (4.7)$$

Let $\tilde{I} = I'\cup I''$ and $\tilde{r} = \mathrm{card}(\tilde{I})$. Denote by $\alpha$ the vector in $R^{\tilde{r}}$ with elements equal to $\alpha_j$ if $j \in I'$ and 0 otherwise. Similarly, let $\beta$ be the vector in $R^{\tilde{r}}$ with elements equal to $\beta_j$ if $j \in I''$ and 0 otherwise. Then equation 4.7 implies that the line segment $[\alpha,\beta]\subset R^{\tilde{r}}$ is mapped to a single point, $-a$, in the closed convex cone $\{y \mid y = \sum_{j\in\tilde{I}}\gamma_jd_j,\ \gamma_j > 0\}$. More generally, from equation 4.3, $-a$ lies in the closed convex cone $\{y \mid y = \sum_{j\in I(\bar{x})}\gamma_jd_j,\ \gamma_j > 0\}$, as seen from equation 4.4.

For linear programming, $f(x)$ and the $(g_j^+(x))^2$ are convex and continuously differentiable, as is $E_2(x)$. Furthermore,

$$\frac{dE_2(x)}{dt} = \Big[\nabla f + s\sum_{j=1}^{m}g_j^+\nabla g_j\Big]^T\dot{x} = -\dot{x}^T\dot{x} \le 0 \quad (4.8)$$

holds, with equality only at an equilibrium $\hat{x}$ of equation 4.1. Thus if the (LP) has a finite optimum value, $E_2(x)$ is a Lyapunov function for equation 4.1 and achieves its minimum at $\hat{x}$.

Proposition 4.4: Let (LP) be a feasible program and let $E_2(x)$ be correspondingly defined. Then the set of equilibrium points of equation 4.1 is the set of minimizers of $E_2(x)$.
□

Proof: Let M₁ be the set of minimizers of E2(x) and M₂ the set of equilibrium points of equation 4.1. For x ∈ M₂ we have ∇E2(x) = −ẋ = 0 and

$$\nabla^2 E_2(x) = s\,\nabla g_J(x)\nabla g_J^T(x) \ge 0,$$

for ∇g_J = [∇g_j], j ∈ J(x). Thus every x ∈ M₂ satisfies the necessary conditions for a minimum of E2(x); and since every minimizer of E2 is a critical point of E2 and hence an equilibrium of equation 4.1, M₁ ⊂ M₂. To show the converse, i.e., M₂ ⊂ M₁, we proceed as follows. Since M₁ ⊂ M₂ and M₁ ≠ ∅, there is at least one equilibrium point of equation 4.1, say x̄, which is a minimizer of E2. Observe that ∇²E2(x̄) is strictly positive except along a direction y at which

$$y^T\nabla g_J(\bar{x})\nabla g_J^T(\bar{x})\,y = 0. \tag{4.9}$$

So x̄ is a minimum except along such a direction y, and E2(x) must be examined along y. Equation 4.9 implies ∇g_Jᵀ(x̄)y = 0, i.e., y ∈ Null(∇g_Jᵀ(x̄)), where Null(∇g_Jᵀ(x̄)) is the null space of ∇g_Jᵀ. Since ∇E2(x̄) = 0 implies −a ∈ Range(∇g_J(x̄)), we have aᵀy = 0, due to the fact that the subspace Null(∇g_Jᵀ(x̄)) is orthogonal to the subspace Range(∇g_J(x̄)). Now

$$\begin{aligned}
\nabla E_2(\bar{x}+y) &= a + s\sum_{j\in J(\bar{x}+y)} g_j^+(\bar{x}+y)\,\nabla g_j(\bar{x}+y) \\
&= a + s\sum_{j\in J(\bar{x})} \big[\nabla g_j^T(\bar{x}+y) - b_j\big]\nabla g_j(\bar{x}) \\
&= a + s\sum_{j\in J(\bar{x})} \big[\nabla g_j^T\bar{x} - b_j\big]\nabla g_j(\bar{x}) \\
&= a + s\sum_{j\in J(\bar{x})} g_j^+(\bar{x})\,\nabla g_j(\bar{x}) = \nabla E_2(\bar{x}) = 0,
\end{aligned} \tag{4.10}$$

if J(x̄+y) = J(x̄). Thus x̄+y, for y ∈ Null(∇g_Jᵀ(x̄)), is an equilibrium of equation 4.1 as long as it does not evoke any new constraint violation. Moreover,

$$E_2(\bar{x}+y) = a^T(\bar{x}+y) + \frac{s}{2}\sum_{j\in J(\bar{x}+y)}\big(g_j^+(\bar{x}+y)\big)^2 = a^T\bar{x} + \frac{s}{2}\sum_{j\in J(\bar{x})}\big(g_j^+(\bar{x})\big)^2 = E_2(\bar{x}), \tag{4.11}$$

using aᵀy = 0 and ∇g_jᵀy = 0, if J(x̄+y) = J(x̄). This implies that x̄+y is also a minimizer of E2, and thus M₂ ⊂ M₁. □

Proposition 4.5: Let Θ be the set of minimizers of a feasible (LP). If Θ is bounded and contains only regular points, then there exists a sufficiently large s such that J(x̄) = I(x̂), for x̄ an equilibrium of equation 4.1 and x̂ ∈ Θ. □

Proof: From Proposition 4.4, x̄ is a minimizer of E2(x). According to Theorem 3.3, x̄ → x̂ ∈ Θ and v_j → λ_j as s → ∞.
Since the convex cone spanned by the ∇g_J(x̂) is closed, there exists a sufficiently large s such that J(x̄) = I(x̂). □

Note that Θ is necessarily convex. Also, for x̂₁, x̂₂ ∈ Θ, we have I(x̂₁) = I(x̂₂), and the corresponding Lagrange multipliers λ_j are the same.

Proposition 4.6: If the unique minimizer x̂ of a feasible (LP) is a regular point, then there exists a sufficiently large s such that the equilibrium x̄ of equation 4.1 with respect to x̂ is unique. □

Proof: Let s₀ be a parameter that satisfies Proposition 4.5, so that J(x̄) = I(x̂) for x̄ an equilibrium of equation 4.1. Since x̂ is regular and unique, card(I(x̂)) = n, which furthermore implies card(J(x̄)) = n by the choice of s₀. In this case equation 4.2 is equivalent to solving a system of n linear equations for n unknowns, namely the v_j's. Since the ∇g_j(x̄), j ∈ J(x̄), are linearly independent by the choice of s₀, v is uniquely determined. But

$$v_j = s_0\,g_j^+(\bar{x}) = s_0\big(\nabla g_j^T\bar{x} - b_j\big), \tag{4.12}$$

so x̄ is uniquely determined, again by the linear independence of the ∇g_j(x̄), j ∈ J(x̄). □

The above proposition considers only the case card(I(x̂)) = n. Suppose now that Θ contains more than one point, i.e., card(I(x̂)) < n. [The intervening discussion, including equations 4.13–4.16 and the statement of Fact 4.1, is illegible in the source; Fact 4.1, which concerns how the equilibrium set moves when s is raised after the system has settled, is invoked again in the simulations of Chapter 5.] For the sake of practicality, the results are summarized in the following theorem.

Theorem 4.1: Under the notation and assumptions of Proposition 4.1, given ε > 0, there exists a sufficiently large s such that the system is completely stable and satisfies min_{x̂∈Θ} ||x̂ − x̄|| ≤ ε. □

Proof: Let s₁ be a parameter that satisfies Proposition 4.5. Given ε > 0, by Proposition 4.1 there exists a sufficiently large s₂ > s₁ such that ||x̄ − x̂|| ≤ ε, where x̄ is an equilibrium of equation 4.1 and x̂ ∈ Θ. Since Θ is bounded, by Proposition 4.1 the set M of equilibrium points of equation 4.1 is bounded. (Otherwise, for Θ bounded and M unbounded, the conclusion of Proposition 4.1 could not hold.) Using E2(x) as the Lyapunov function for the system, LaSalle's theorem [150] ensures that the system is

Figure 4.3. The relationship between the equilibrium set of equation 4.1 and the set of minimizers.
(a) They are of the same size. (b) The equilibrium set is smaller than the set of minimizers. (c) The equilibrium set is larger than the set of minimizers.

completely stable in the sense that every trajectory will converge to a point in M without oscillation. □

4.2 Quadratic Programming Network Theory

Consider next the case of quadratic programming. Let the quadratic program (QP) be of the following form:

Minimize f₀(x) = ½ xᵀA x + aᵀx
subject to g(x) = Dx − b ≤ 0,

where A is a symmetric, positive semidefinite matrix. It is clear that ∇f₀(x) = Ax + a. First the following lemma is needed.

Lemma 4.1: Let B be a symmetric, positive semidefinite matrix. Then Bx = 0 if and only if xᵀBx = 0. □

Proof: The "only if" part is clear, since for Bx = 0 it follows that xᵀBx = xᵀ(Bx) = 0. The converse can be shown by using the Cauchy–Schwarz inequality; that is, if Q is symmetric positive semidefinite, then

$$(x^TQy)^2 \le (x^TQx)(y^TQy) \tag{4.17}$$

for any x and y. By the assumption xᵀQx = 0, we have

$$0 \le (x^TQy)^2 \le (x^TQx)(y^TQy) = 0 \tag{4.18}$$

for any value of y. This implies |xᵀQy| = 0 for any y. But this implies xᵀQ = 0, i.e., Qx = 0. The proof is complete. □

Similar to Proposition 4.1 for linear programming, we have the following proposition for quadratic programming, which is a direct result of Corollary 3.2.

Proposition 4.7: Let Θ be the set of minimizers of a feasible (QP). If Θ is bounded and contains only regular points, then given ε > 0 there exists a sufficiently large s such that M, the set of minimizers of the corresponding E2(x), satisfies min_{x̂∈Θ} ||x − x̂|| ≤ ε for x ∈ M. □

Proposition 4.8: Let (QP) be a feasible program and E2(x) be correspondingly defined. Then the set of equilibrium points of equation 4.1 is the set of minimizers of E2(x). □

Proof: Let M₁ be the set of minimizers of E2(x) and M₂ the set of equilibrium points of equation 4.1. For x ∈ M₂ we have ∇E2(x) = −ẋ = 0 and

$$\nabla^2 E_2(x) = A + s\,\nabla g_J(x)\nabla g_J^T(x) \ge 0$$
(4.19)

for ∇g_J = [∇g_j], j ∈ J(x). If the matrix A in the definition of f₀(x) is positive definite, then the Hessian of E2 (equation 4.19) is positive definite. Thus the equilibrium of equation 4.1 is necessarily and sufficiently the unique minimizer of E2(x). If, however, A is only positive semidefinite, then x satisfies only the necessary conditions for a minimum; that is to say, M₁ ⊂ M₂. To show the converse, i.e., M₂ ⊂ M₁, we proceed as follows. Since M₁ ⊂ M₂ and M₁ ≠ ∅, there is at least one equilibrium of equation 4.1, say x̄, which is a minimizer of E2. Observe that ∇²E2(x̄) is strictly positive except along a direction y at which

$$y^T\big(A + s\,\nabla g_J(\bar{x})\nabla g_J^T(\bar{x})\big)y = 0. \tag{4.20}$$

So x̄ is a strict minimum except along such a direction y. We need only examine E2(x) along the direction of y. Equation 4.20 implies ∇g_Jᵀ(x̄)y = 0, i.e., y ∈ Null(∇g_Jᵀ(x̄)). Equation 4.20 also implies yᵀAy = 0, and this implies Ay = 0 by Lemma 4.1. Together we have y ∈ Null(A) ∩ Null(∇g_Jᵀ(x̄)). Also, since ∇E2(x̄) = 0 implies −Ax̄ − a ∈ Range(∇g_J(x̄)), we have

$$0 = y^T(A\bar{x} + a) = y^TA\bar{x} + y^Ta = y^Ta, \tag{4.21}$$

due to the fact that y ∈ Null(A) ∩ Null(∇g_Jᵀ(x̄)) and that the subspace Null(∇g_Jᵀ(x̄)) is orthogonal to the subspace Range(∇g_J(x̄)). Now

$$\begin{aligned}
\nabla E_2(\bar{x}+y) &= A(\bar{x}+y) + a + s\sum_{j\in J(\bar{x}+y)} g_j^+(\bar{x}+y)\,\nabla g_j(\bar{x}+y) \\
&= A\bar{x} + a + s\sum_{j\in J(\bar{x})} \big[\nabla g_j^T(\bar{x}+y) - b_j\big]\nabla g_j(\bar{x}) \\
&= A\bar{x} + a + s\sum_{j\in J(\bar{x})} \big[\nabla g_j^T\bar{x} - b_j\big]\nabla g_j(\bar{x}) \\
&= A\bar{x} + a + s\sum_{j\in J(\bar{x})} g_j^+(\bar{x})\,\nabla g_j(\bar{x}) = \nabla E_2(\bar{x}) = 0,
\end{aligned} \tag{4.22}$$

if J(x̄+y) = J(x̄). Thus x̄+y, for y ∈ Null(A) ∩ Null(∇g_Jᵀ(x̄)), is an equilibrium of equation 4.1 as long as it does not evoke any new constraint violation. Moreover,

$$E_2(\bar{x}+y) = \tfrac{1}{2}(\bar{x}+y)^TA(\bar{x}+y) + a^T(\bar{x}+y) + \frac{s}{2}\sum_{j\in J(\bar{x}+y)}\big(g_j^+(\bar{x}+y)\big)^2 = \tfrac{1}{2}\bar{x}^TA\bar{x} + a^T\bar{x} + \frac{s}{2}\sum_{j\in J(\bar{x})}\big(g_j^+(\bar{x})\big)^2 = E_2(\bar{x}), \tag{4.23}$$

using Ay = 0, aᵀy = 0, and ∇g_jᵀy = 0, if J(x̄+y) = J(x̄). This implies x̄+y is also a minimizer of E2, and thus M₂ ⊂ M₁. The proof is complete. □
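Proposition 4.8 can be exercised numerically. The sketch below uses a hypothetical two-variable QP (not one of the thesis examples): it integrates equation 4.1 with explicit Euler steps and then checks that the right-hand side of equation 4.1 vanishes at the settled point, i.e., that the simulated equilibrium is a critical point of E2.

```python
# Euler simulation of the network dynamics (equation 4.1) for a small QP:
#   minimize 1/2 x^T A x + a^T x   subject to   d^T x - b <= 0.
# Hypothetical data (not from the thesis): A = 2I, a = (-2,-2), d = (1,1), b = 1,
# so the constrained minimizer is (0.5, 0.5) and the penalty equilibrium sits
# slightly outside the feasible region, with a violation of order 1/s.

s, dt = 50.0, 0.005
A = [[2.0, 0.0], [0.0, 2.0]]
a = [-2.0, -2.0]
d, b = [1.0, 1.0], 1.0

def rhs(x):
    """Right-hand side of equation 4.1: -(grad f0 + s * g+ * grad g)."""
    gplus = max(d[0] * x[0] + d[1] * x[1] - b, 0.0)
    return [-(A[0][0] * x[0] + A[0][1] * x[1] + a[0] + s * gplus * d[0]),
            -(A[1][0] * x[0] + A[1][1] * x[1] + a[1] + s * gplus * d[1])]

x = [0.0, 0.0]
for _ in range(20000):
    r = rhs(x)
    x = [x[0] + dt * r[0], x[1] + dt * r[1]]

# At an equilibrium the right-hand side (i.e., -grad E2) is zero.
equilibrium_residual = max(abs(v) for v in rhs(x))
```

Setting ẋ = 0 for this data gives (A + s ddᵀ)x̄ = −a + s b d, whose solution here is x̄₁ = x̄₂ = 52/102 ≈ 0.5098, consistent with the minimizer (0.5, 0.5) plus an O(1/s) violation.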
Proposition 4.9: Let Θ be the set of minimizers of a feasible (QP). If Θ is bounded and contains only regular points, then there exists a sufficiently large s such that J(x̄) = I(x̂), for x̄ an equilibrium of equation 4.1 and x̂ ∈ Θ. □

Proof: Same proof as that of Proposition 4.5. □

Proposition 4.10: If the unique minimizer x̂ of a feasible program (QP) is a regular point, then there exists a sufficiently large s such that the equilibrium x̄ of equation 4.1 with respect to x̂ is unique. □

Proof: Let s₀ be a parameter that satisfies Proposition 4.9, so that J(x̄) = I(x̂) for x̄ an equilibrium of equation 4.1. Since x̂ is regular and unique, card(I(x̂)) = card(J(x̄)) = n by the choice of s₀, and ∇g_{J(x̄)} is of full rank. The latter ensures that ∇g_{J(x̄)}∇g_{J(x̄)}ᵀ is positive definite. Since x̄ is an equilibrium of equation 4.1, from equation 4.2 and using J(x̄) = I(x̂) we have

$$A\bar{x} + a + s_0\sum_{j\in J(\bar{x})} \nabla g_j(\bar{x})\big(\nabla g_j^T\bar{x} - b_j\big) = 0. \tag{4.24}$$

This is the same as

$$A\bar{x} + a + s_0\,\nabla g_{J(\bar{x})}\big[\nabla g_{J(\bar{x})}^T\bar{x} - b_J\big] = 0. \tag{4.25}$$

By rearranging,

$$\big[A + s_0\,\nabla g_{J(\bar{x})}\nabla g_{J(\bar{x})}^T\big]\bar{x} + a - s_0\,\nabla g_{J(\bar{x})}\,b_J = 0. \tag{4.26}$$

Observe that A + s₀∇g_{J(x̄)}∇g_{J(x̄)}ᵀ is positive definite, since A is positive semidefinite and ∇g_{J(x̄)}∇g_{J(x̄)}ᵀ is positive definite. Hence x̄ can be uniquely solved as

$$\bar{x} = -\big[A + s_0\,\nabla g_{J(\bar{x})}\nabla g_{J(\bar{x})}^T\big]^{-1}\big[a - s_0\,\nabla g_{J(\bar{x})}\,b_J\big]. \qquad\Box \tag{4.27}$$

The above proposition considers only the case card(I(x̂)) = n. Suppose now that Θ contains more than one point, i.e., card(I(x̂)) < n. [Part of this discussion, including equations 4.28–4.29, is illegible in the source; it considers raising the penalty parameter from s₀ to s₁ → ∞.] Let y ∈ Null(A) ∩ Null(∇g_Jᵀ(x̄₀)) and J(x̄₀+y) = J(x̄₀). As the system of equation 4.1 is changed by raising s from s₀ to s₁ at x = x̄₀ + y, its dynamics are described by

$$\dot{x}\big|_{x=\bar{x}_0+y} = A(\bar{x}_0+y) + a + s_1\sum_{j\in J(\bar{x}_0)} \nabla g_j\big(\nabla g_j^T(\bar{x}_0+y) - b_j\big) = A\bar{x}_0 + a + s_1\sum_{j\in J(\bar{x}_0)} \nabla g_j\big(\nabla g_j^T\bar{x}_0 - b_j\big). \tag{4.30}$$

This is the same as the dynamics of equation 4.1 when changing s₀ to s₁ at x = x̄₀. By Proposition 4.8, zᵀa = 0 for any z ∈ Null(A) ∩ Null(∇g_Jᵀ(x̄₀)). It follows that

$$z^T\dot{x}\big|_{x=\bar{x}_0+y} = z^T\Big[A\bar{x}_0 + a + s_1\sum_{j\in J(\bar{x}_0)}\nabla g_j\big(\nabla g_j^T\bar{x}_0 - b_j\big)\Big] = s_1\sum_{j\in J(\bar{x}_0)} z^T\nabla g_j\,\big(\nabla g_j^T\bar{x}_0 - b_j\big)$$
$$=\;0, \tag{4.31}$$

since z ∈ Null(∇g_Jᵀ(x̄₀)) implies that z is perpendicular to ∇g_j for j ∈ J(x̄₀). Equations 4.30 and 4.31 actually say that any vector in Null(A) ∩ Null(∇g_Jᵀ(x̄₀)) remains unchanged as the system evolves in time. This indicates the following fact.

Fact 4.2: Let x̄₀, x̄₁, x̂, s₀, s₁, and y be defined as above. If x̄₀ + y approaches x̂ + y ∈ Θ as s₀ → ∞, then, as s₀ is switched to s₁ at x = x̄₀, the system obtains an equilibrium at x̄₁ + y. □

Similar to the case of the linear programming network, for the sake of practicality the results for the quadratic programming network are summarized in the following theorem.

Theorem 4.2: Under the notation and assumptions of Proposition 4.7, given ε > 0, there exists a sufficiently large s such that the system is completely stable and satisfies min_{x̂∈Θ} ||x − x̂|| ≤ ε. □

Proof: Basically the same proof as for Theorem 4.1. □

Note that the results of the above argument still hold even if I(x̂) is empty, i.e., there are no binding constraints. In this case the minimizer lies in the interior of the feasible set. This is a very strong result with a myriad of applications. In particular, the following two cases are examined.

Case (A): Solving B_{n×n} x = b

B is assumed to be of full rank. This problem can be converted into minimizing f(x) = ½||Bx − b||² over Rⁿ. But

$$f(x) = \tfrac{1}{2}x^TB^TBx - b^TBx + \tfrac{1}{2}b^Tb \tag{4.32}$$

and

$$\nabla f(x) = B^T(Bx - b). \tag{4.33}$$

Since BᵀB is symmetric and positive definite by the assumption placed on B, the optimization network formulation described by equation 4.1 for the case of quadratic programming can be applied. Define

$$\dot{x} = -\nabla f(x). \tag{4.34}$$

It follows that dE/dt = −||∇f(x)||² ≤ 0, with equality holding only at Bx = b. From Theorem 4.2 the unique equilibrium of equation 4.1, x̄ = B⁻¹b, is globally asymptotically stable. Thus the problem is solved without actually calculating the inverse matrix of B.
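Case (A) can be sketched directly: integrating ẋ = −Bᵀ(Bx − b) drives x to the solution of Bx = b without ever forming B⁻¹. The 2×2 system below is a hypothetical example, not one from the thesis.

```python
# Solving B x = b by integrating x' = -B^T (B x - b), i.e., the negative
# gradient flow of 1/2 ||Bx - b||^2 (equations 4.33-4.34).  The unique
# equilibrium is x = B^{-1} b, reached without inverting B.
# Hypothetical system: 3*x1 + x2 = 5 and x1 + 2*x2 = 5, with solution (1, 2).

B = [[3.0, 1.0], [1.0, 2.0]]
b = [5.0, 5.0]

def residual(x):
    """r = B x - b."""
    return [B[0][0] * x[0] + B[0][1] * x[1] - b[0],
            B[1][0] * x[0] + B[1][1] * x[1] - b[1]]

x, dt = [0.0, 0.0], 0.01
for _ in range(10000):
    r = residual(x)
    # Euler step of x' = -B^T r
    x[0] -= dt * (B[0][0] * r[0] + B[1][0] * r[1])
    x[1] -= dt * (B[0][1] * r[0] + B[1][1] * r[1])
```

Because BᵀB is positive definite here, the flow is globally asymptotically stable and the Euler iteration converges for any sufficiently small step size.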
Case (B): Solving B_{m×n} x = b

In this case, assume rank(B) = n < m, so that the system is overdetermined and is solved in the least-squares sense by minimizing f(x) = ½||Bx − b||², exactly as in Case (A). [The remainder of this case, the heading of the next section, and the definition of the basic program (PP), whose objective is a general C¹ convex function, are illegible in the source, as is part of the statement of the following proposition.]

Proposition 4.11: Let Θ be the set of minimizers of a feasible (PP). If Θ is bounded and contains only regular points, then given ε > 0 there exists a sufficiently large s such that M, the set of minimizers of the corresponding E2(x), satisfies min_{x̂∈Θ} ||x − x̂|| ≤ ε for x ∈ M. (Were this not so, a contradiction would be drawn.)

Proposition 4.12: Let (PP) be a feasible program and E2(x) be correspondingly defined. Then the equilibrium points of equation 4.1 are the minimizers of E2(x). □

Proof: Since E2(x) is convex and continuously differentiable on Rⁿ, the critical points of E2(x) are its minimizers. This can be seen from Theorem 3.1 by taking λ and μ to be zero, since there are no constraints on the minimization of E2(x). But the critical points satisfy 0 = ∇E2(x) = −ẋ. Thus the equilibrium points of equation 4.1 are just the minimizers of E2(x). □

It is known that if f is a C² (twice continuously differentiable) real-valued function on an open convex set S in Rⁿ, then f is convex if and only if its Hessian matrix

$$\nabla^2 f(x) = \Big[\frac{\partial^2 f}{\partial x_i\,\partial x_j}(x)\Big]$$

is positive semidefinite for every x ∈ S [145]. Using this fact, the implications of Proposition 4.12 can be extended.

Proposition 4.13: Let (PP) be a feasible program and E2(x) be correspondingly defined. Assume that the Hessian matrix of f₀(x) is positive definite. Then the equilibrium point of equation 4.1 is the unique minimizer of E2(x). □

Proof: By Proposition 4.12 the equilibrium of equation 4.1 is a minimizer of E2(x). Furthermore, we have that

$$\nabla^2 E_2(x) = \nabla^2 f_0(x) + s\,\nabla g_J(x)\nabla g_J^T(x) \tag{4.35}$$

is positive definite, since ∇²f₀(x) is positive definite and ∇g_J(x)∇g_Jᵀ(x) is positive semidefinite. Thus x̄ is necessarily and sufficiently a strict local minimizer of E2(x). Due to the convexity of E2(x), we can conclude that x̄ is the global (and thus unique) minimizer of E2(x). □

Proposition 4.14: Let Θ be the set of minimizers of a feasible (PP). If Θ is bounded and contains only regular points, then there exists a sufficiently large s such that J(x̄) = I(x̂), where x̄ is an equilibrium of equation 4.1 and x̂ ∈ Θ.
□

Proof: Same proof as that of Proposition 4.5. □

Theorem 4.3: Under the notation and assumptions of Proposition 4.11, given ε > 0, there exists a sufficiently large s such that the system is completely stable and x̄ satisfies min_{x̂∈Θ} ||x̂ − x̄|| ≤ ε. □

Proof: Let s₁ be a parameter that satisfies both Propositions 4.11 and 4.14. Since Θ is bounded, by Proposition 4.11 there exists an s₂ > s₁ such that the set M of minimizers of E2(x) is bounded. Using the fact that E2(x) is a Lyapunov function for the system of equation 4.1, LaSalle's theorem [150] ensures that the system is completely stable in the sense that every trajectory converges to a point in M without oscillation. □

Corollary 4.2: Under the notation and assumptions of Proposition 4.13, given ε > 0, there exists a sufficiently large s such that the unique equilibrium x̄ is globally asymptotically stable and ||x̂ − x̄|| ≤ ε. □

Proof: Let s₁ be a parameter that satisfies Proposition 4.11. Using the uniqueness of x̄ and applying Theorem 4.3, the result follows. □

The discussion in this section up to now is just an extension of the quadratic programming network, with the objective function of the program allowed to be any C¹ convex function. The results obtained thus far apply to programs with linear equality constraints as well. In what follows, the discussion is extended to programs with nonlinear constraints, either equalities or inequalities. Consider next the case of convex programming.

Lemma 4.2: Let (CP) be a convex program. Then E2(x) is a convex function. □

Proof: By definition,

$$E_2(x) = f(x) + \frac{s}{2}\sum_{i=1}^{r}\big(g_i^+(x)\big)^2 + \frac{s}{2}\sum_{j=1}^{m} h_j^2(x), \tag{4.36}$$

where f and the g_i's are C¹ convex functions on Rⁿ and the h_j's are affine functions on Rⁿ. The Hessian matrix of the second term on the right-hand side of equation 4.36 is

$$\nabla^2\Big[\frac{s}{2}\sum_{i=1}^{r}\big(g_i^+(x)\big)^2\Big] = s\sum_{i\in J(x)}\Big[\nabla g_i\nabla g_i^T + g_i^+\,\nabla^2 g_i\Big],$$

which is clearly positive semidefinite. Thus Σ_{i=1}^{r} (g_i^+(x))² is a convex function.
Since each h_j is an affine function, it can be expressed as h_j(x) = c_jᵀx + e_j for some vector c_j and constant e_j, 1 ≤ j ≤ m. By checking the positive semidefiniteness of the Hessian matrix, it can be shown that the last term of equation 4.36 is also a convex function. Now E2(x) is a sum of three convex functions, so it must be a convex function as well. □

If E2(x) for some (CP) is bounded below, then by the formulation of equation 4.1 the system will converge to a minimizer of E2(x). So the results obtained for (PP) still hold for the case of convex programs.

Proposition 4.15: If the basic program (PP) is replaced by a convex program (CP) in Propositions 4.11–4.14, Theorem 4.3, and Corollary 4.2 (with the assumption that the set of minimizers of (CP) is bounded and contains only regular points), then their results still hold. □

Proposition 4.14 implies that for a sufficiently large s, the violated constraints at an equilibrium x̄ of equation 4.1 are the same as the binding constraints at a minimizer x̂. For (PP), if J(x̄) = I(x̂), then

$$\nabla g_{J(\bar{x})}(\bar{x}) = \nabla g_{I(\hat{x})}(\hat{x}), \tag{4.37}$$

since the gradient of a violated (affine) constraint is a constant vector. But this is not always true for the case of convex programming. The gradient of a violated constraint of a convex program (CP) is a continuous vector function rather than a constant vector. It is due to this continuity and the assumption of regularity of the minimizers that the conclusion of Theorem 3.3 holds. For a convex program, equation 4.37 is only guaranteed to hold when s = ∞. (For further discussion see Chapter 12 of [151].)

All the programs discussed thus far consider equality constraints (the h_j's) of affine form only. The strategy for mapping such constraints into the optimization network is to replace each of them by two inequality constraints, h_j ≤ 0 and −h_j ≤ 0. Since each of these two inequalities is again a convex function, the results of Theorem 3.1 are applicable.
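The split of an equality constraint into two inequalities interacts simply with the quadratic penalty: pointwise, (h⁺)² + ((−h)⁺)² = h², so the split pair contributes exactly the h_j² term of equation 4.36. A quick numerical check of this identity (on arbitrary, illustrative sample values):

```python
# Pointwise identity behind the equality-constraint split: for any value h,
#   (max(h, 0))^2 + (max(-h, 0))^2 == h^2,
# so penalizing the pair h <= 0 and -h <= 0 reproduces the h^2 penalty
# term of equation 4.36.

def plus(v):
    """The (.)^+ operator: max(v, 0)."""
    return max(v, 0.0)

samples = [-2.5, -1.0, 0.0, 0.3, 4.0]   # arbitrary test values
errors = [abs(plus(h) ** 2 + plus(-h) ** 2 - h ** 2) for h in samples]
max_error = max(errors)
```

Since exactly one of the two terms is nonzero for h ≠ 0, the identity holds exactly, not just approximately.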
In order to solve more general problems by the optimization network technique, we want to relax the restriction that the equality constraints be affine. First the following theorem, which is an extension of Theorem 3.1, is needed. (For a proof see [146].)

Theorem 4.4: Let (P) be a program in the notation described in Section 3.1. Let x̂ be a feasible solution to (P). Suppose that x̂ is a regular point. Further suppose that f, the g_i for i ∈ I(x̂), and the h_j for 1 ≤ j ≤ m are convex and all are differentiable at x̂. Then x̂ is a global optimal solution to (P) if and only if there exist λ = [λ₁ … λ_r]ᵀ ≥ 0 and μ = [μ₁ … μ_m]ᵀ > 0 together with x̂ that satisfy

(i) λ_i g_i(x̂) = 0 for i = 1, …, r, and

(ii) ∇f(x̂) + Σ_{i=1}^{r} λ_i ∇g_i(x̂) + Σ_{j=1}^{m} μ_j ∇h_j(x̂) = 0. □

Corollary 4.3: Let (P) be a program with the same notation and assumptions as Theorem 4.4, except that the h_j, 1 ≤ j ≤ m, are all concave. Then x̂ is a global optimal solution to (P) if and only if there exist λ = [λ₁ … λ_r]ᵀ ≥ 0 and μ = [μ₁ … μ_m]ᵀ < 0 together with x̂ such that conditions (i) and (ii) of Theorem 4.4 hold. □

Proof: Since −h_j(x) is convex for all j and μ_j(−h_j) = (−μ_j)h_j, the result follows by applying Theorem 4.4. □

Since the h_j's are not assumed to be affine, the minimizers of the problem may be isolated points rather than a convex set. If h_j is a convex function, then h_j ≤ 0 defines a convex set. Similarly, −h_j ≤ 0 defines a convex set for h_j concave. Now using the notation of an extended program, i.e., letting g_{r+2j−1} = h_j and g_{r+2j} = −h_j, leads to the following corollary.

Corollary 4.4: Let (P′) be the extended program stated above for convex functions f ∈ C¹ and g_i ∈ C¹, 1 ≤ i ≤ r, and convex or concave functions h_j ∈ C¹, 1 ≤ j ≤ m. Let L(s, x) and s_k be correspondingly defined. Let x_k be a minimizer of L(s_k, x). Assume x_k → x̂ for some x̂ ∈ Θ, where Θ is the set of minimizers of (P′). Suppose Θ is bounded and contains only regular points.
Then for h_j convex, s_k g_{r+2j−1}^+(x_k) → μ_j and s_k g_{r+2j}^+(x_k) → 0; for h_j concave, s_k g_{r+2j−1}^+(x_k) → 0 and s_k g_{r+2j}^+(x_k) → (−μ_j), where μ_j is the corresponding Lagrange multiplier at x̂. □

Proof: Rewrite equation 3.4 as

$$L(s,x) = f(x) + \frac{s}{2}\Big[\sum_{i=1}^{r}\big(g_i^+(x)\big)^2 + \sum_{j=1}^{m}\big(h_j^+(x)\big)^2 + \sum_{j=1}^{m}\big((-h_j)^+(x)\big)^2\Big]. \tag{4.38}$$

Applying Theorems 3.3 and 4.4, it follows that if h_j is convex then s_k g_{r+2j−1}^+(x_k) → μ_j and s_k g_{r+2j}^+(x_k) → 0. By applying Theorem 3.3 and Corollary 4.3, it follows that if h_j is concave, s_k g_{r+2j−1}^+(x_k) → 0 and s_k g_{r+2j}^+(x_k) → (−μ_j). □

In fact, since g_{r+2j−1}^+(x) and g_{r+2j}^+(x) are continuous and mutually exclusive for 1 ≤ j ≤ m, there exists a sufficiently large s_k such that g_{r+2j−1}^+(x_k) > 0 and g_{r+2j}^+(x_k) = 0 for h_j convex, and g_{r+2j−1}^+(x_k) = 0 and g_{r+2j}^+(x_k) > 0 for h_j concave.

Proposition 4.16: Let s_k be the parameter stated above. Then L(s_k, x) is a convex function in a neighborhood of x_k. □

Proof: Due to the choice of s_k and the continuity of {g_i^+}, i = 1, …, r+2m, there exists a neighborhood N of x_k such that for x ∈ N, g_{r+2j−1}^+(x) > 0 and g_{r+2j}^+(x) = 0 if h_j is convex, and g_{r+2j−1}^+(x) = 0 and g_{r+2j}^+(x) > 0 if h_j is concave. Now for x ∈ N, if h_j is convex,

$$\big(g_{r+2j-1}^+(x)\big)^2 + \big(g_{r+2j}^+(x)\big)^2 = \big(g_{r+2j-1}^+(x)\big)^2 = \big(h_j^+(x)\big)^2$$

is convex, since h_j^+ is convex. Similarly, for x ∈ N, if h_j is concave,

$$\big(g_{r+2j-1}^+(x)\big)^2 + \big(g_{r+2j}^+(x)\big)^2 = \big(g_{r+2j}^+(x)\big)^2 = \big((-h_j)^+(x)\big)^2$$

is convex, since (−h_j)^+ is convex. As a sum of convex functions, it follows that L(s_k, x) is convex on N. □

To see that h_j^+ is convex provided h_j is convex, consider the following lemma.

Lemma 4.3: Let g(x) be a convex function on a nonempty convex set X ⊂ Rⁿ. Then

$$g^+(x) = \begin{cases} g(x) & \text{if } g(x) > 0 \\ 0 & \text{if } g(x) \le 0 \end{cases}$$

is a convex function over X. □

Proof: Choose x, y ∈ X. By the convexity of g, it follows that for 0 ≤ λ ≤ 1,

$$g(\lambda x + (1-\lambda)y) \le \lambda g(x) + (1-\lambda)g(y) \le \lambda g^+(x) + (1-\lambda)g^+(y).$$

If g(λx + (1−λ)y) > 0, then g⁺(λx + (1−λ)y) = g(λx + (1−λ)y), and we are done. If g(λx + (1−λ)y) ≤ 0, then

$$g^+(\lambda x + (1-\lambda)y) = 0 \le \lambda g^+(x) + (1-\lambda)g^+(y),$$

and the proof is complete.
□

The neighborhood mentioned in Proposition 4.16 need not be a ball B_δ = {x ∈ Rⁿ : ||x − x_k|| ≤ δ} around x_k. [Part of this discussion is illegible in the source.] There exist s > s_k and a neighborhood N such that E2(x) is convex over N. Thus if the boundedness of E2(x) is assumed, then the network formulation of equation 4.1 is again useful: it converges locally to one of the minimizers of E2(x) provided the starting point lies in N. If the set Θ of minimizers is isolated, there is more than one such neighborhood, and E2(x) is convex over each one. To restrict the discussion to the local stability of equation 4.1 over a particular neighborhood, assume there is only one isolated minimizer set and, correspondingly, one such neighborhood.

Proposition 4.17: Let (P′) be a program in its extended form such that f ∈ C¹ and g_i ∈ C¹, 1 ≤ i ≤ r, are convex functions, and h_j ∈ C¹, 1 ≤ j ≤ m, are convex or concave functions. Let N be a neighborhood that satisfies Proposition 4.16. Suppose that the set of minimizers of (P′) is bounded and contains only regular points. Then if the program (PP) is replaced by the program (P) in Propositions 4.11–4.14, Theorem 4.3, and Corollary 4.2, the results still hold locally over N. □

Although in Proposition 4.17 the stability of equation 4.1 is guaranteed only locally, in practice the neighborhood N can be very large, as will be shown in some examples. Also, in Proposition 4.17 the convexity and concavity are assumed throughout Rⁿ, but this is not absolutely necessary for the proposition to hold. It is sufficient to assume that the convexity and concavity of the corresponding functions hold on a nonempty open set, say Ω ⊂ Rⁿ, and to restrict all discussion to Ω rather than Rⁿ. That is,

Minimize f(x)
subject to g_i(x) ≤ 0, for i = 1 to r,
h_j(x) = 0, for j = 1 to m,
x ∈ Ω,

where f and the g_i's are convex and differentiable on the open set Ω, and the h_j's are convex or concave and differentiable on Ω.

Let f : X → R, where X is a nonempty convex set in Rⁿ.
The function f is said to be quasiconvex if, for each x₁ and x₂ ∈ X, the following inequality is true:

$$f(\lambda x_1 + (1-\lambda)x_2) \le \max\{f(x_1),\, f(x_2)\} \quad \text{for each } \lambda\in(0,1).$$

The function f is said to be quasiconcave if −f is quasiconvex. Let S be a nonempty open set in Rⁿ, and let g : S → R be differentiable on S. The function g is said to be pseudoconvex if for each x₁, x₂ ∈ S with ∇g(x₁)ᵀ(x₂ − x₁) ≥ 0 we have g(x₂) ≥ g(x₁); or, equivalently, if g(x₂) < g(x₁), then ∇g(x₁)ᵀ(x₂ − x₁) < 0. These notions are illustrated in Figure 4.4.

Theorem 4.5: [The hypotheses are partly illegible in the source; they parallel Theorem 4.4 with f pseudoconvex and the constraint functions quasiconvex.] Then x̂ is a global optimal solution to (P) if and only if there exist λ = [λ₁ … λ_r]ᵀ ≥ 0 and μ = [μ₁ … μ_m]ᵀ > 0 together with x̂ that satisfy

Figure 4.4. (a) Quasiconvex. (b) Quasiconcave. (c) Pseudoconvex. (d) Not pseudoconvex and not quasiconvex.

(i) λ_i g_i(x̂) = 0 for i = 1, …, r, and

(ii) ∇f(x̂) + Σ_{i=1}^{r} λ_i ∇g_i(x̂) + Σ_{j=1}^{m} μ_j ∇h_j(x̂) = 0. □

Corollary 4.5: Let (P) be a program with the same notation and assumptions as in Theorem 4.5, except that the h_j, 1 ≤ j ≤ m, are all quasiconcave. Then x̂ is a global optimal solution to (P) if and only if there exist λ = [λ₁ … λ_r]ᵀ ≥ 0 and μ = [μ₁ … μ_m]ᵀ < 0 together with x̂ such that conditions (i) and (ii) of Theorem 4.5 hold. □

Proof: Same proof as Corollary 4.3. □

Now, following an argument similar to the one used to derive Proposition 4.17, it can be shown that the network formulation of equation 4.1 is also useful for obtaining (locally) an approximate solution to (P) under appropriate pseudo- and quasi-convexity assumptions.

Corollary 4.6: Let (P′) be an extended program such that f ∈ C¹ is a pseudoconvex function, g_i ∈ C¹ is quasiconvex for 1 ≤ i ≤ r, and h_j ∈ C¹ is either quasiconvex or quasiconcave for 1 ≤ j ≤ m. Let N be a neighborhood that satisfies Proposition 4.16. Suppose that the set of minimizers of (P′) is bounded and contains only regular points. Then if the program (PP) is replaced by the program (P) in Propositions 4.11–4.14, Theorem 4.3, and Corollary 4.2, the results still hold locally over N. □

Note that the network formulation of equation 4.1 may be considered a continuous approximation of the gradient projection method [147].
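The quasiconvexity definition above is easy to test numerically on a grid. The sketch below checks it for f(x) = x³ on [−2, 2], a standard example of a function that is quasiconvex (being monotone) without being convex; the grid and the function are illustrative choices, not taken from the thesis.

```python
# Grid check of the quasiconvexity inequality
#   f(t*x1 + (1-t)*x2) <= max(f(x1), f(x2))   for t in (0, 1).
# Every monotone function of one variable is quasiconvex (and quasiconcave),
# even though f(x) = x^3 is clearly not convex.

def f(x):
    return x ** 3

points = [-2.0 + 0.25 * k for k in range(17)]    # grid on [-2, 2]
lambdas = [0.1 * k for k in range(1, 10)]        # t values in (0, 1)

quasiconvex = all(
    f(t * x1 + (1 - t) * x2) <= max(f(x1), f(x2)) + 1e-12
    for x1 in points for x2 in points for t in lambdas
)
```

A grid check of this kind can of course only refute quasiconvexity, not prove it; here it simply illustrates how the definition is applied.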
Figure 4.5(a) depicts the idea of the gradient projection method. The negative gradient of the objective function is projected onto the tangent surface of the active constraint set in order to find a point y. Then a new point x_{k+1} is found along the direction perpendicular to the tangent plane at x_k. Figure 4.5(b) illustrates the dynamics of equation 4.1 when the trajectory is on the boundary of the feasible region. The linear programming problem with hypercube feasible region described in Section 3.3 is a good example for illustrating the similarity between the gradient projection method and the dynamics of equation 4.1.

Figure 4.5. (a) Gradient projection method [147]. (b) The dynamics of equation 4.1 on the boundary of the feasible region.

4.4 A Two-Phase Optimization Network

In the previous sections of this chapter it has been shown that there exists a sufficiently large s for which the network formulation described by equation 4.1 is guaranteed to converge to an approximate solution for a large class of nonlinear programming problems. In what follows, a two-phase optimization network model is proposed that can obtain both the exact solution to the problem and the corresponding Lagrange multipliers associated with each constraint. For linear programming problems, the network solves both the primal and the dual problem exactly.

For the sake of argument, assume, unless otherwise stated, that the program (P) considered in this section is a convex program for which f ∈ C¹ and g_i ∈ C¹, 1 ≤ i ≤ r, are convex functions, and h_j ∈ C¹, 1 ≤ j ≤ m, are affine functions. It is clear that if the set M of minimizers of (P) contains only regular points, then for x̂ ∈ M it follows that q ≤ card(I(x̂)) ≤ n. The penalty function used here is

$$L(s,x) = f(x) + \frac{s}{2}\Big[\sum_{i=1}^{r}\big(g_i^+(x)\big)^2 + \sum_{j=1}^{m} h_j^2(x)\Big]. \tag{4.39}$$

It is formed based on the program (P) rather than on its extended form. The block diagram of a two-phase optimization network is shown in Figure 4.6.
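The phase-2 equations are not fully legible in this copy of the thesis, so the following is only a plausible sketch of the two-phase idea, under the assumption that phase 2 grows multiplier estimates λ_i in proportion to the violations g_i^+(x) (rate parameter ε) while x descends the gradient of the resulting Lagrangian-plus-penalty. The toy LP, its data, and the switching time are all hypothetical.

```python
# Two-phase optimization network: a plausible sketch (the thesis' exact
# phase-2 equations 4.40-4.43 are illegible in this copy).  Phase 1 runs the
# penalty dynamics of equation 4.1; phase 2 additionally grows multiplier
# estimates lam_i in proportion to the violations g_i^+(x), so the residual
# violation is driven to zero and lam converges to the exact multipliers.
# Hypothetical toy LP: minimize x1 + x2 subject to 1 - x1 - x2 <= 0,
# -x1 <= 0, -x2 <= 0; the minimizers lie on x1 + x2 = 1 and the multiplier
# of the first constraint is 1.

def grad_f(x):
    return [1.0, 1.0]

def constraints(x):
    """Constraints g_i(x) <= 0 as (value, gradient) pairs."""
    return [(1.0 - x[0] - x[1], [-1.0, -1.0]),
            (-x[0], [-1.0, 0.0]),
            (-x[1], [0.0, -1.0])]

s, eps, dt = 50.0, 0.2, 0.005
x = [0.5, 0.5]
lam = [0.0, 0.0, 0.0]

for step in range(40000):
    phase2 = step >= 4000                 # predetermined timing switch t1
    dx = [-g for g in grad_f(x)]
    for i, (gi, dgi) in enumerate(constraints(x)):
        viol = max(gi, 0.0)
        w = (lam[i] + s * viol) if phase2 else (s * viol)
        dx[0] -= w * dgi[0]
        dx[1] -= w * dgi[1]
        if phase2:
            lam[i] += dt * eps * s * viol  # multipliers only grow with violation
    x[0] += dt * dx[0]
    x[1] += dt * dx[1]
```

Phase 1 alone stops at x₁ + x₂ = 1 − 1/s (the usual penalty offset); in this sketch phase 2 removes the offset and leaves λ at the exact multiplier values, which is the behavior the text attributes to the two-phase network.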
The network operates under different dynamics as the phase is changed by a predetermined timing switch. For 0 ≤ t < t₁ (phase 1), the network runs under the dynamics of equation 4.1. [The phase-2 dynamics, equations 4.40–4.43, are illegible in the source; in phase 2 (t ≥ t₁) the multiplier estimates λ_i are adjusted according to the constraint violations g_i^+(x), with rate parameter ε.] An equilibrium of the two-phase network satisfies [conditions partly illegible in the source, paralleling Theorem 3.3] and

$$\nabla f(x) + \nabla g_J(x)\,\lambda + \nabla h(x)\,\mu = 0. \tag{4.44}$$

But this actually satisfies the optimality conditions of Theorem 3.3, and thus an equilibrium point of the two-phase network is nothing but a global minimizer of the convex program (P). The rationale behind the two-phase network formulation is the following. In phase 1 (t < t₁), [the remainder of this discussion, and the opening of the simulation examples of Chapter 5, are missing from the source.]

Figure 5.3. Trajectories of equation 4.1 for (LP1) with s=10. (a) A broad view. (b) A view around x̄ = (4.992, 5.060).

Figure 5.4. Trajectories of the 2-phase network for (LP1) with s=10, ε=0.2. (a) Trajectory of x. (b) Trajectories of E2(x) and f(x) with respect to time.

[…] to the one in Figure 5.4(a). The corresponding trajectories of E2(x) and f(x) are given in Figures 5.5(a) and (b), respectively. The upper curves in Figure 5.5 are for the case using only the phase-2 network structure; it is clear that they begin to exhibit the asymptotic behavior only after a few generic time steps.

If the initial condition is not in the feasible region, using only the phase-2 network structure may not lead to convergence. This can be seen from the fact that equation 4.40 only increases the value of λ. For fixed s and ε, choosing any initial point in the infeasible region will result in a positive value of λ_i for some i. If the initial point is far enough away, λ_i will become larger than the correct Lagrange multiplier before the state trajectory approaches the boundary of the feasible region.
Once this happens, there is no way to bring the value of the over-estimated λ_i back down, and the trajectory remains in the infeasible region, since it cannot enter the feasible region except through the minimizer with the correct Lagrange multiplier. Eventually, the system diverges. However, if the initial condition is restricted to a bounded region, it is possible to find a small enough ε such that the phase-2 network structure is stable over this region. The drawback is that the smaller the value of ε, the slower the convergence to equilibrium.

To illustrate the phenomenon described by Fact 4.1, consider the program (LP2): Minimize f(x) = −x₂ subject to the same constraints as (LP1). Notice from Figure 5.1 that the minimizers of (LP2) lie on the line segment x₂ = 5 within the feasible region. If s is chosen to be 2 in equation 4.1, then its equilibrium points are the line segment x₂ = 5.5 within

g(x) = 3x₁ + x₂ − … ≤ 0 [the constant is illegible in the source]

Figure 5.5. Trajectories of E2(x) and f(x) for phase 2 with s=10, ε=0.2, and x₀ = [4.8, 4.8]ᵀ. (a) Trajectory of E2(x). (b) Trajectory of f(x).
(b) (c) 91 0 Ln n 1'- I- 6 1:3 I o // D as //r “ 8 / a”; / 11 x 8 U). L/ 0 ”3H,, '4. 00 4. 50 X ('51.)00 s. 50 5. 00 o I.” (D. n /' I: I” 8. / / o / / Am Ntn' x ( 8 \/ 0 In. 0 L” 4‘4 4.00 4.50 5.50 5.00 5.00 X(1) Figure 5.6. (cont’d.) (b) s = 2 -) 10. (c) s = 10. 92 For the case where the size of the equilibrium set is larger than the size of the minimizer set, consider the program (LP 3): Minimizef(x) =x1 subject to the same constraints as (LPl). From Figure 5.1 it may be seen that the minimizers of (LP?) lie on the line segment x 1=-5 within the feasible region. Choos- ing s=2 in equation 4.1, the equilibrium points are the line segment x2=-5.5 within __5_ _ -22 8100-1211 x2 1250 g4(x) =x2 - 5 S0. Trajectories of equation 4.1 near (-5.5, -5.5) are shown in Figure 5.7(a). If we raise s from 2 to 10 after the system settles on the line segment x1=-5.5, then all trajectories . will move in parallel to x1=—5.l as shown in Figure 5.7(b). The trajectory curving toward the upper-right in the middle of Figure 5.7(b) is due to the fact that the size of the equilibrium set is reduced as s increases. (In fact, the equilibrium set would eventually be identical to the set of minimizers when s becomes infinity.) This trajectory would have been a straight line along g1(x)=0 if s had been changed continuously. Since s=2—910 abruptly, the early stage of this tra- jOCtory tends to move strictly to the right and thus results in new constraint violation as it crosses to the other side of g 1(x )=0. Now the asymptotic nature of the system talias place to remedy such a violation and drives the trajectory toward the correct end P0th of the equilibrium set. The trajectories of equation 4.1 with s =10 are given in Figure 5.7(c) for comparison to the trajectories in Figure 5.7(b). 93 (a) 3. .- t ,4. . t- I D O ui 3 ' 13 a D / Am 0, . stay/7 8. // to 1 't/ O 1.0 to “6.50 -6.00 -s.so -s.00 -4.so IX(1) (b) trot 4: I i- It. I... 
Figure 5.7. Trajectories of equation 4.1 for (LP3) near the point (-5.5, -5.5). (a) s = 2. (b) s = 2 -> 10. (c) s = 10.

5.2 Quadratic Programming

For quadratic programming, consider the program (QP1):

Minimize f(x) = x1^2 + x2^2 + x1x2 + 3x1 + 3x2

subject to the same constraints as (LP1). The minimum of f(x) occurs at x1 = x2 = -1. Since the unique minimizer lies in the interior of the feasible region, it follows from the theorems derived in Section 4.2 that the unique equilibrium of the (QP) network is exactly the minimizer. The simulated trajectories of x for the correspondingly formed (QP) network are shown in Figure 5.8. The equilibrium x̄ = [-1, -1]T clearly exhibits asymptotic stability. There is no need for a 2-phase network structure.

Figure 5.8. Trajectories of x of the (QP) network for (QP1).

Suppose now the objective function in (QP1) is replaced by

f(x) = x1^2 + x2^2 + x1x2 - 30x1 - 30x2,

and denote this new program by (QP2). The unconstrained minimizer of f(x) of (QP2) is [10, 10]T. The simulation results using equation 4.1 are given in Figure 5.9. Wherever the initial points of x are, the trajectories approach the equilibrium x̂ = [4.97778, 5.17451]T. It can be shown that the minimizer of (QP2) is x̄ = [5.0, 5.0]T, at which J(x̄) = {2, 4}. Solving the equation

∇f(x̄) + Σ_{i∈J(x̄)} λi ∇gi(x̄) = 0,

we get the corresponding Lagrange multipliers λ1 = 0.0, λ2 = 6.0, λ3 = 0.0, and λ4 = 9.0.

Figure 5.9. Trajectories of x of equation 4.1 for (QP2) with s=50.

Using the 2-phase network formulation for (QP2), a simulation is run with s=50, ε=0.2, and initial condition x0 = [4.8, 4.8]T.
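Because the minimizer of (QP1) is interior to the feasible region, the penalty term of equation 4.1 is inactive near it and the network reduces to plain gradient descent on f. The following minimal sketch integrates that gradient flow with forward Euler; the step size, iteration count, and starting point are illustrative choices, not values from the simulations above.

```python
# Forward-Euler integration of x' = -grad f(x) for (QP1),
# f(x) = x1^2 + x2^2 + x1*x2 + 3*x1 + 3*x2.
# Near the interior minimizer the penalty term of equation 4.1 vanishes,
# so the network behaves as this plain gradient flow.

def grad_f(x1, x2):
    # grad f = (2 x1 + x2 + 3, x1 + 2 x2 + 3)
    return (2.0 * x1 + x2 + 3.0, x1 + 2.0 * x2 + 3.0)

def integrate(x1, x2, step=0.05, iters=500):
    for _ in range(iters):
        g1, g2 = grad_f(x1, x2)
        x1 -= step * g1
        x2 -= step * g2
    return x1, x2

x1, x2 = integrate(4.0, 4.0)   # approaches the minimizer (-1, -1)
```

The Hessian of f has eigenvalues 1 and 3, so any Euler step below 2/3 is stable; 0.05 converges comfortably within 500 steps.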
The trajectories of x and λ are shown in Figure 5.10(a) and (b), respectively. The network is switched from phase 1 to phase 2 at T = 2.0. Figure 5.10 clearly shows the dynamics of the network during phase 2, in which the trajectories move asymptotically toward the equilibria x̄ = [5.0, 5.0]T and λ̄ = [0.0, 6.0, 0.0, 9.0]T.

To make the example more representative, add one equality constraint x1 = 3 to (QP1), and call this new program (QP3). In the network formulation this equality constraint is replaced by

g5(x) = x1 - 3 ≤ 0
g6(x) = -x1 + 3 ≤ 0.

The minimizer of (QP3) is x̄ = [3, -9/5]T, at which J(x̄) = {1, 6}. The theoretical values of the Lagrange multipliers follow from solving ∇f(x̄) + λ1∇g1(x̄) + λ6∇g6(x̄) = 0. Simulation results for the network of equation 4.1 for (QP3) are shown in Figure 5.11(a) for s=50 and ε=0.2. All trajectories lead toward the equilibrium point x̂ = [2.84279, -1.77791]T. For the correspondingly formulated 2-phase network, a simulation is performed with the initial condition x0 = [2.5, -1]T and the same s and ε. The resulting trajectory of x is given in Figure 5.11(b). Again, the 2-phase network demonstrates its capability to tune the state variable x to the exact minimizer. Though not shown in the figure, the final values of λ(t) for the 2-phase network are the exact Lagrange multipliers described above.

As mentioned in Section 4.2, one of the applications of the quadratic programming optimization network is solving the least squares problem Bx = b.

Figure 5.10. Simulation results of the 2-phase network for (QP2) with s=50 and ε=0.2. (a) The trajectory of x. (b) The trajectories of λi.
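In the least squares case there are no constraint terms and the network is the linear gradient flow ẋ = -B^T(Bx - b). The sketch below uses an illustrative full-rank B and b (stand-ins, not the data of the dissertation's (LS) example) and checks the first-order optimality condition B^T(Bx - b) = 0, i.e. the normal equations, at convergence.

```python
# Euler integration of the least-squares gradient flow x' = -B^T (B x - b).
# B (full rank) and b are illustrative stand-ins; any consistent data work.

B = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1],
     [1, 1, 1],
     [1, 0, 0]]
b = [1.0, 2.0, 3.0, 4.0, 5.0]

def residual(x):
    # B x - b
    return [sum(Bi[j] * x[j] for j in range(3)) - bi for Bi, bi in zip(B, b)]

def grad(x):
    # B^T (B x - b)
    r = residual(x)
    return [sum(B[i][j] * r[i] for i in range(5)) for j in range(3)]

def solve(x, step=0.05, iters=4000):
    for _ in range(iters):
        g = grad(x)
        x = [xj - step * gj for xj, gj in zip(x, g)]
    return x

x_ls = solve([0.0, 0.0, 0.0])
# At the equilibrium the gradient B^T (B x - b) vanishes (normal equations).
```

Since B has full rank, B^T B is positive definite and the flow is globally asymptotically stable, mirroring the behavior reported for the (LS) example.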
Figure 5.11. Trajectories for (QP3) with s=50 and ε=0.2. (a) Using the network of equation 4.1. (b) Using the 2-phase network.

For illustration's sake, consider a least squares problem (LS) with a full-rank matrix B of 0-1 entries and right-hand side b. The least squares estimator calculated from the normal equation is

x̂ = (B^T B)^{-1} B^T b = [-1, 10, -3]^T.

Since there are no constraints in this problem, E2(x) = f(x) = (1/2)||Bx - b||^2. Also, since B is of full rank, E2(x) is strictly convex and the unique equilibrium x̄ = [-1, 10, -3]T of equation 4.32 is globally asymptotically stable. A simulation with initial condition x0 = [0, 0, 0]T has been performed, and the trajectories of x and of E2(x) are plotted in Figure 5.12. Note that the xj, 1 ≤ j ≤ n, do not approach x̄j in a monotonic (increasing or decreasing) manner, as seen in Figure 5.12(a). But the network does converge monotonically in the sense of E2(x) (see Figure 5.12(b)). More importantly, even before the network has converged to its equilibrium, E2(x) comes very close to its final value in only a few generic time steps. Eight more simulations with various initial conditions were done to verify the stability of x̄; the results are shown in Figures 5.13(a)-(h).

Figure 5.12. The simulation results of the (LS) problem with initial condition x0 = [0, 0, 0]T. (a) The trajectories of the state variables. (b) The trajectory of E2(x).

Figure 5.13. The trajectories of x for the (LS) problem with different initial conditions. (a) x0 = [-4, -4, -4]T. (b) x0 = [11, -4, -4]T.
Figure 5.13. (cont'd.) (c) x0 = [-4, 11, -4]T. (d) x0 = [11, 11, -4]T. (e) x0 = [-4, -4, 11]T. (f) x0 = [11, -4, 11]T. (g) x0 = [-4, 11, 11]T. (h) x0 = [11, 11, 11]T.

5.3 Nonlinear Programming

Consider the following program (NP1):

Minimize f(x) = x1^2 + (x2 - 1)^2
subject to g(x) = x2 - x1^2 = 0.

Note that f(x) is strictly convex on R^2, as shown by the following derivation. For λ ∈ (0, 1) and x ≠ y, using the identity (λa + (1-λ)b)^2 = λa^2 + (1-λ)b^2 - λ(1-λ)(a - b)^2,

f(λx + (1-λ)y) = (λx1 + (1-λ)y1)^2 + (λx2 + (1-λ)y2 - 1)^2
= λf(x) + (1-λ)f(y) - λ(1-λ)[(x1 - y1)^2 + ((x2 - 1) - (y2 - 1))^2]
< λf(x) + (1-λ)f(y),

since λ(1-λ) > 0 and (x1 - y1)^2 + ((x2 - 1) - (y2 - 1))^2 > 0 whenever x ≠ y. To examine g(x), the same identity gives

g(λx + (1-λ)y) = λx2 + (1-λ)y2 - (λx1 + (1-λ)y1)^2
= λg(x) + (1-λ)g(y) + λ(1-λ)(x1 - y1)^2
≥ λg(x) + (1-λ)g(y),

so g is concave on R^2 (as expected, since -x1^2 is concave and x2 is linear); equivalently, -g is convex. Since g enters (NP1) as an equality constraint, this sign convention is immaterial to the penalty term g^2(x) used below.

The program (NP1) is equivalent to finding the points on the parabola x2 = x1^2 closest to the point (0, 1). To solve the problem precisely, substitute x1^2 = x2 into f(x) and solve

x2 + (x2 - 1)^2 = c.

This gives

x2 = (1 ± sqrt(4c - 3))/2.

It can be seen geometrically that there is only one solution for x2. This implies c_min = 3/4 and x2 = 1/2. Correspondingly, x1 = ±1/sqrt(2). Thus the minimizers of (NP1) are x̄ = (±1/sqrt(2), 1/2), at which f(x̄) = 3/4.

E2(x) for the above program (NP1) is

E2(x) = x1^2 + (x2 - 1)^2 + (s/2)(x2 - x1^2)^2.

The contours of E2(x) for different values of s are shown in Figure 5.14. When s is small, the shape of the contour of E2(x) is dominated by f(x). As s increases, it tends toward the shape of the parabola g(x) = 0, with the two minimizers derived in the following.

Figure 5.14. Contours of E2(x) for (NP1). (a) The contour for s=0.5. (b) The contour for s=5. (c) The contour for s=50.

By direct calculation, it follows that

∇E2(x) = [ 2x1(s x1^2 - s x2 + 1),  -s x1^2 + (s + 2)x2 - 2 ]^T

and

∇²E2(x) = [ 2 + s(6x1^2 - 2x2)   -2s x1
            -2s x1               s + 2 ].

By setting ∇E2(x) = 0, the critical points of E2(x) are found to be

x_e1 = (0, 2/(s + 2))   and   x_e2 = (± sqrt((s - 2)/(2s)), 1/2).

For x = x_e1, the eigenvalues of ∇²E2(x) are (4 - 2s)/(s + 2) and (s + 2), and they are both positive for s < 2. Thus x_e1 is the minimizer of E2(x) when s < 2. Similarly, the eigenvalues of ∇²E2(x) at x = x_e2 are

[(3s - 2) ± sqrt((3s - 2)^2 - 16(s - 2))]/2 > 0   for s > 2.

Hence x_e2 gives the minimizers of E2(x) when s > 2. As s -> ∞, the minimizers x_e2 approach (±1/sqrt(2), 1/2), which are the exact minimizers of (NP1).

Converting the program (NP1) into the form of equation 4.1, the simulated trajectories of x with s=50 and various initial points are given in Figure 5.15. The trajectories of x, except for initial points with x1 = 0, converge to either

x̂1 = (0.69282, 0.5)   or   x̂2 = (-0.69282, 0.5),

depending on whether x1 of the initial point is greater or smaller than zero. In fact, it can be shown that x̂1 is asymptotically stable on {x ∈ R^2 | x1 > 0}, whereas x̂2 is asymptotically stable on {x ∈ R^2 | x1 < 0}.

Figure 5.15. Trajectories of x of equation 4.1 for (NP1) with s=50.

Though it was mentioned in Section 4.3 that for nonlinear programming the asymptotic stability of the equilibrium points of equation 4.1 holds only locally for sufficiently large s, this example shows for (NP1) that the basins of attraction of the two equilibrium points nearly cover R^2, excluding only the line x1 = 0. For s=50, if a trajectory starts out with x1 = 0, it stays on x1 = 0 and moves toward x = (0.0, 0.03846), which is the saddle point originating from x_e1 for s > 2. Since in hardware implementation one cannot select a point which is precisely zero, the effect of this saddle point is actually unobservable. Thus the network may be regarded as almost completely stable in practice.

A 2-phase network has been used to shift the equilibrium points x_e2 = (± sqrt((s - 2)/(2s)), 1/2) to the exact minimizers (±1/sqrt(2), 1/2).
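The critical-point formulas for E2(x) can be verified by direct evaluation; the sketch below checks, for s = 50, that the gradient vanishes at both x_e1 = (0, 2/(s+2)) and x_e2 = (sqrt((s-2)/(2s)), 1/2), and that x_e2 matches the simulated equilibrium value 0.69282.

```python
import math

# Numerical check of the critical points of
# E2(x) = x1^2 + (x2 - 1)^2 + (s/2) * (x2 - x1^2)^2  for (NP1), s = 50.
s = 50.0

def grad_E2(x1, x2):
    # Gradient derived in the text.
    return (2.0 * x1 * (s * x1**2 - s * x2 + 1.0),
            -s * x1**2 + (s + 2.0) * x2 - 2.0)

x_e1 = (0.0, 2.0 / (s + 2.0))                    # saddle point for s > 2
x_e2 = (math.sqrt((s - 2.0) / (2.0 * s)), 0.5)   # minimizer for s > 2

g_e1 = grad_E2(*x_e1)
g_e2 = grad_E2(*x_e2)
# Both gradients vanish; x_e2 ~ (0.69282, 0.5) approaches (1/sqrt(2), 1/2)
# as s grows without bound.
```

For s = 50, sqrt((s-2)/(2s)) = sqrt(0.48) = 0.69282..., in agreement with the equilibria observed in Figure 5.15.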
The simulation results are shown in Figure 5.16. The little hook near the end of each trajectory is due to the effect of the phase-2 dynamics. For the choice of s and ε used for the 2-phase network in solving (NP1), the time for phase 2 to converge is roughly triple the time for phase 1. Generally speaking, the smaller ε, the longer the time for phase 2 to converge.

Figure 5.16. Trajectories of x for (NP1) using the 2-phase network with s=10 and ε=0.2. (a) x0 = (0.75, 0.75). (b) x0 = (-0.75, 0.75).

Consider now the following nonlinear program (NP2), quoted from [19]:

Minimize f(x) = x1^3/30 + x1^2 + x2^2 - x1x2 + 0.4x2

subject to

g1(x) = x1 + 0.5x2 ≥ 0.4,
g2(x) = 0.5x1 + x2 ≥ 0.5,
g3(x) = x1 ≥ 0,
g4(x) = x2 ≥ 0.

This is in fact a convex program, because f(x) is a convex function over the feasible set and the feasible set is convex. The convexity of f(x) may be seen by deriving the Hessian matrix of f(x),

∇²f(x) = [ 0.2x1 + 2   -1
           -1          2 ],

and showing that its eigenvalues (0.1x1 + 2) ± sqrt(0.01x1^2 + 1) are positive for x1 ≥ 0.

Using equation 4.1 to solve (NP2), the trajectories of x are shown in Figure 5.17(a). The simulations are done with s=10. The equilibrium of the network is given in the first entry of Table 5.2 and contrasted with the equilibrium obtained by a 2-phase network with s=10 and ε=0.2. As seen from the table, the solution obtained by the 2-phase network is very accurate. The trajectories of the 2-phase network with initial points (0.25, 0.25) and (0.45, 0.45) are plotted in Figure 5.17(b). The line segment near the middle of the figure is caused by the phase-2 dynamics.

Table 5.2. The equilibrium points for different networks for (NP2).
Network formulation              x1         x2
Equation 4.1 with s=10           0.3023760  0.2825410
2-phase net with s=10, ε=0.2     0.3395630  0.3302180
Exact solution                   0.3395628  0.3302186

As shown in Section 4.3, the network formulation of equation 4.1 may also be used to find minimizers of a pseudo-convex function subject to quasi-convex and quasi-concave constraints. If an equilibrium x̄ of equation 4.1 occurs in the interior of the feasibility set, ∇f(x̄) = 0 must hold. By the properties of a pseudo-convex function, it follows that x̄ is a global minimizer of f. Suppose now that the objective function is f: X -> R with f ∈ C1, where X is an open interval in R. Let (P1) be the problem of minimizing f over S, where S is a subinterval of X. Mapping (P1) to the network of equation 4.1, the network dynamics drive the trajectory continuously along a descent direction on the surface of f(x). The trajectory ends either at the boundary of S or at a point in S for which ∇f(x) = 0. If the former happens, then f is strictly monotonic (either increasing or decreasing) on S. Both cases lead to an interesting application, namely, solving p(x) = 0 over an interval.

Figure 5.17. Trajectories of x for different networks for (NP2). (a) Equation 4.1 with s=10. (b) 2-phase network with s=10 and ε=0.2.

Let p: S -> R with p ∈ C2, where S is a closed interval in R. Let (P2) be the problem of solving p(x) = 0 for x ∈ S. Assume (P2) is feasible, i.e., there exists an x̄ ∈ S such that p(x̄) = 0, and assume that p is strictly monotonic on S. Let E(x) = (1/2)p^2(x); then

E'(x) = p'(x)p(x)     (5.1)

and, since p(x̄) = 0,

E''(x̄) = (p'(x̄))^2 + p''(x̄)p(x̄) = (p'(x̄))^2 > 0     (5.2)

by the strict monotonicity assumption on p. Thus x̄ is a local minimizer of E(x) and a solution to p(x) = 0. In fact, x̄
is also the global minimizer of E(x) on S, since the strict monotonicity assumption ensures that E(x) is pseudo-convex on S with a unique minimizer.

To relax the above argument to more general functions, assume only that p ∈ C1. Note that, since p(x̄) = 0, it follows that

p'(x̄) = lim_{h->0} p(x̄ + h)/h.

Now, taking the limit of the difference quotient of E'(x) at x̄, we get

lim_{h->0} [E'(x̄ + h) - E'(x̄)]/h
= lim_{h->0} p'(x̄ + h)p(x̄ + h)/h
= lim_{h->0} p'(x̄ + h) · lim_{h->0} p(x̄ + h)/h
= (p'(x̄))^2 > 0.     (5.3)

Thus x̄ is again a local minimizer of E(x) on S. By the strict monotonicity assumption on p(x), x̄ is also a global minimizer of E(x) on S.

Since E(x) is pseudo-convex, as shown above, choose the network

ẋ = -E'(x) = -p(x)p'(x)     (5.4)

so that

Ė(x) = p(x)p'(x)ẋ = -(p(x)p'(x))^2.     (5.5)

Since, by the assumption of strict monotonicity, p'(x) ≠ 0 for x ∈ S, the unique equilibrium of equation 5.4 is x̄, at which E(x) achieves its minimum. The idea behind this technique is that, since E(x) is pseudo-convex with its minimum in S, it is a valid Lyapunov function for the system of equation 5.4.

The same technique is applicable to solving any feasible problem q(x) = 0 on a closed interval S, where q is such that q^2 is a C1 pseudo-convex function on S; q^2 may then serve as a valid Lyapunov function for equation 5.4. It is possible for q^2 to have more than one minimizer. If q ∈ C1, and if every local minimum of q^2 equals zero and q^2 has finitely many local maximizers and no saddle points, then this technique is also usable: for such a function q, except when the initial point is one of the maximizers, the system of equation 5.4 converges to one of the minimizers x̄ of q^2, for which q(x̄) = 0. In hardware implementation, the effect of the finitely many maximizers is unobservable, so the system may be regarded as almost completely stable in practice.

Consider the following problem (NP3): Solve

f(x) = x^3 - 9x^2 + 23x - 15 = 0.

The function f(x) is plotted in Figure 5.18(a).
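The dynamics of equation 5.4 applied to this cubic, ẋ = -f(x)f'(x), can be sketched with forward Euler. The local maximizers of f^2(x) are the roots of f'(x) = 3x^2 - 18x + 23, i.e. 3 ± 2/sqrt(3) ≈ 1.8453 and 4.1547, and starting points on either side of them select the corresponding root. The step size, iteration count, and starting points below are illustrative choices.

```python
# Forward-Euler integration of equation 5.4, x' = -f(x) f'(x),
# for f(x) = x^3 - 9x^2 + 23x - 15 (roots 1, 3, and 5).

def f(x):
    return x**3 - 9.0 * x**2 + 23.0 * x - 15.0

def fp(x):
    # f'(x); its roots 3 +/- 2/sqrt(3) separate the basins of attraction
    return 3.0 * x**2 - 18.0 * x + 23.0

def settle(x, step=0.002, iters=20000):
    for _ in range(iters):
        x -= step * f(x) * fp(x)
    return x

# Starting points in the basins (-inf, 1.8453), (1.8453, 4.1547),
# (4.1547, inf) converge to the roots 1, 3, 3, and 5 respectively:
roots = [settle(x0) for x0 in (0.5, 2.5, 4.0, 5.3)]
```

Note that 4.0 lies below the basin boundary 4.1547, so it is attracted to the middle root 3 rather than to 5, consistent with the regions of attraction described for Figure 5.19.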
It has three roots, at 1, 3, and 5. The profile of f^2(x) is shown in Figure 5.18(b): f^2(x) has three minimizers, 1, 3, and 5, and two local maximizers, 1.8453 and 4.1547.

Figure 5.18. (a) f(x) for (NP3). (b) f^2(x) for (NP3).

Solving this problem by equation 5.4, the trajectories of x versus E(x) are shown in Figure 5.19. The asymptotic stability of equation 5.4 at its three equilibria can be clearly seen in the figure. The regions of attraction for x = 1, 3, and 5 are, respectively, (-∞, 1.8453), (1.8453, 4.1547), and (4.1547, ∞). They indeed cover all of R except two points. It is thus clear that the system is almost completely stable.

Figure 5.19. The trajectories of x of equation 5.4 versus E(x) for (NP3).

CHAPTER VI

CASE STUDIES

Two optimization problems encountered in power system engineering are solved by the developed network formulation in this chapter: the economic power dispatch (EPD) problem and the optimal power flow (OPF) problem. The results of these case studies demonstrate the applicability of the developed network to optimization problems encountered in real engineering situations.

6.1 Economic Power Dispatch

The EPD is a classical problem in power system optimization. The goal of EPD is to determine the amount of power to be produced by each generating unit in the system such that the load (demand) can be met with minimum total generation cost. The power system model typically consists of n thermal generating units connected to a single load R. Let xi be the power generated by the i-th unit and fi be the generation cost rate function of that unit. fi is generally approximated by a quadratic polynomial of the form

fi(xi) = bi + ai xi + Aii xi^2     (6.1)

where bi, ai, and Aii are positive constants. Each xi is restricted to the interval [xi_min, xi_max] determined by the generation limits of the unit. Without considering the transmission line losses, the EPD problem (D1) may be expressed as:

Minimize f(x) = Σ_{i=1}^{n} fi(xi)
subject to xi_min ≤ xi ≤ xi_max for i = 1, ..., n
and Σ_{i=1}^{n} xi - R = 0.

(D1) is actually a quadratic program, and thus the results of Section 4.2 may be applied. Traditional methods to solve this problem can be found in [153]. If the line losses are taken into account, the EPD problem changes its form to (D2):

Minimize f(x) = Σ_{i=1}^{n} fi(xi)
subject to xi_min ≤ xi ≤ xi_max for i = 1, ..., n
and (Σ_{i=1}^{n} xi) - PL - R = 0,

where PL represents the line losses. A general version of PL may be expressed as the quadratic function

PL = Σ_{i=1}^{n} Σ_{j=1}^{n} xi Bij xj + Σ_{i=1}^{n} Bi xi + B0.     (6.2)

Due to the nonlinearity of PL, it is more difficult to solve (D2) than (D1). An iterative process is normally adopted to solve (D2) (see Chapter 4 of [153]). The developed optimization network is superior in that the same mapping technique is applicable whether or not the line losses are considered.

Example 6.1 [153]: A power system consists of three generating units. The minimum and maximum outputs of units 1, 2, and 3 are [150 MW, 600 MW], [100 MW, 400 MW], and [50 MW, 200 MW], respectively. A total of 850 MW is to be delivered at minimum overall cost.
Each x, is restricted between (1:,- m , Iran] as determined by the generation limits of the unit. Without 121 122 considering the transmission line losses, the EPD problem (D 1) may be expressed as: Minimize f(x) = if;(x,~) i=1 subject to ximei Sxim for i=l,...,n II and 2x,- —R = 0. '=1 (D 1) is actually a quadratic program and thus the results of Section 4.2 may be applied. Traditional methods to solve this problem can be found in [153]. If the line losses are taken into account, the EPD problem changes its form to (D 2): Minimize f(x)= inc.) i=1 SUbjCCt t0 13mg x,- 51,-” for i=l,...,n and for, —PL)-R =0 '=l where PL represents the line losses. A general version of PL may be expressed as the quadratic function I! n n ,2 PL = zzxiBijxj + 23,-1; + 80' (6'2) i=lj=l i=1 Due to the nonlinearity of PL it is more difficult to solve (D7) than (D 1). An itera- tive process is normally adopted to solve (0;) (see Chapter 4 of [153]). The developed optimization network is superior in that with same mapping technique it is applicable no matter whether the line losses are considered or not. Example 6.1 [153]: A power system consists three generating units. The minimum and maximum output of units 1, 2, and 3 are [150 MW, 600 MW], [100 MW, 400 MW], and [50 MW, 200 MW], respectively. A total of 850 MW is to be delivered at 123 a minimum overall cost. Case I -' 311pposc the cost rates for the units are f 1(x1) = 561 + 7.92::1 + 0.001562x12, f 20:7) = 310 + 7.85x2 + 0.00194x22, and f 3(x 3) = 78 + 7.97x3 + 0.00482x32. Matching the corresponding terms of f (x) to the objective function of a quadratic Progmm (QP ). gives 0 0 0.009 7.92 a = 7.85 , 7.97 0.003124 0 0 A = 0 0.00388 0 . 54 and [561] c = 310 . 78 It is clear that A is positive definite. Furthermore, the feasibility set of this problem is convex and compact. Therefore, the results in Section 4.2 can be applied presum- ing that the minimizer set 0 contains only regular points. 
The simulations of the network of equation 4.1 are done with s=50 and the initial condition x0 = [400, 300, 150]T. The trajectories of the state variables are plotted in Figure 6.1(a); the trajectories of f(x) and E2(x) are shown in Figure 6.1(b). The asymptotic nature of the equilibrium point is clearly seen in these figures. The final values of the state variables, E2(x), and f(x) are given in Table 6.1, as is the exact solution. The error between the simulation result and the exact solution is less than 0.1% on average. The value of E2(x) is slightly larger than the objective function.

Figure 6.1. Simulation results for Case 1 of Example 6.1 using equation 4.1 with s=50 and x0 = [400, 300, 150]T. (a) Trajectories of x. (b) Trajectories of f(x) and E2(x).

This is due to the fact that the equilibrium of the network always lies in the infeasible region and thus produces a small positive value in the second term on the right-hand side of equation 3.10.

The simulation results for Case 1 using the 2-phase network with s=50 and ε=0.2 are shown in Figure 6.2. The system is switched to phase 2 at t=1000. Normally, the phase-2 dynamics require a longer time to converge, but for this particular case the phase-2 network converges rather quickly to its equilibrium, as can be seen from Figure 6.2(b). From Table 6.1, the final values of the state variables of the 2-phase network are the same as the exact solution.

Table 6.1. Equilibrium points of two networks for Case 1 of Example 6.1.
Networks                   x1      x2      x3      f(x)     E2(x)
Equation 4.1, s=50         393.10  334.52  122.20  8192.68  8193.52
2-phase net, s=50, ε=0.2   393.17  334.60  122.23  8194.36  8194.36
Exact solution             393.17  334.60  122.23  8194.36  8194.36

Case 2: Suppose now that, due to fluctuation of the resource price, the cost rate of unit 1 becomes

f1(x1) = 459 + 6.48x1 + 0.00128x1^2.

After adjusting the corresponding terms in the network, the simulation is performed with the same parameter s and initial condition. The results are listed in Table 6.2. Again, the results of the 1-phase network (equation 4.1) approximate the exact solution closely, while the results of the 2-phase network match it precisely. The trajectories of the network variables for equation 4.1 are plotted in Figure 6.3. Unit 1 in this case must produce its maximum output. In [153], a different scheme is required to handle the case when state variables reach their extreme values. Our network formulation, however, applies to such a case without any change. □

Figure 6.2. Simulation results for Case 1 of Example 6.1 using the 2-phase network with s=50, ε=0.2, and x0 = [400, 300, 150]T. (a) Trajectories of x. (b) Trajectories of f(x) and E2(x).

Table 6.2. Equilibrium points of two networks for Case 2 of Example 6.1.

Networks                   x1      x2      x3      f(x)     E2(x)
Equation 4.1, s=50         600.01  187.00  62.82   7250.63  7251.37
2-phase net, s=50, ε=0.2   600.00  187.13  62.87   7252.11  7252.11
Exact solution             600.00  187.13  62.87   7252.11  7252.11

The above example does not take the line losses into account. To demonstrate the capability of the developed network formulation in solving the EPD problem with line losses, consider the following.
Example 6.2: This example problem is taken from Example 4B in [153]. In this power system there are three generating units, with unit dispatch limits

50.0 MW ≤ x1 ≤ 200 MW,
37.5 MW ≤ x2 ≤ 150 MW,
45.0 MW ≤ x3 ≤ 180 MW.

A total of 210 MW is to be delivered at minimal overall cost. The generation cost rates are

f1(x1) = 213.1 + 11.669x1 + 0.00533x1^2,
f2(x2) = 200.0 + 10.333x2 + 0.00889x2^2,
f3(x3) = 240.0 + 10.333x3 + 0.00741x3^2.

Figure 6.3. Simulation results for Case 2 of Example 6.1 using equation 4.1 with s=50 and x0 = [400, 300, 150]T. (a) Trajectories of x. (b) Trajectories of f(x) and E2(x).

The optimization ANNs developed in this work are applicable to a very large class of (constrained or unconstrained) nonlinear programming problems with f ∈ C1 and gi ∈ C1 for all i. Unconstrained problems may be solved exactly by equation 4.1, as may constrained problems whose solutions lie in the relative interior of the feasible region; in such cases the network performs similarly to a gradient descent method. For a general constrained problem, the 2-phase network may be used. If a problem has only constraints, h(x) = 0 and g(x) ≤ 0, an objective function is formed by summing h^2(x) and (g+(x))^2, and the sum may then be regarded as an unconstrained problem and solved accordingly. A good example is the solution of the power flow equations, as shown in Chapter 6.
Note that in applying the optimization network, the problem must be feasible. Furthermore, the regularity assumption and the boundedness of the solution set must be met. This raises no difficulty at all: in the practice of nonlinear programming, the regularity of the solution is normally presumed for large systems, and the boundedness of the solution set may be achieved by adding artificial upper and/or lower bound constraints.

As mentioned above, the optimization ANNs are similar in theory to many of the existing optimization methods. In fact, the networks may be regarded as continuous optimization systems, in contrast to the discrete nature of many optimization processes. Any nonlinear programming problem which is solvable by the traditional discrete methods is also solvable by the optimization ANNs developed in this work. Traditional discrete optimization methods, such as Newton's method and the Newton-Raphson method, often demand the calculation of the inverse of the Jacobian matrix and then update the state variables by difference equations that are functions of the inverse Jacobian. The cost of computing the inverse Jacobian grows rapidly with the size of the Jacobian and thus restricts the applicability of these discrete optimization methods to large-scale systems. The optimization ANNs, on the other hand, require no matrix inversion and, consequently, have more potential for use with larger systems.

The above argument disagrees with the generalized networks proposed by Tsirukis et al. [159]. In their networks, the difference equations used in traditional discrete optimization processes are transformed into differential equations in which the inverse Jacobian is included.
But since the differential equations describe a dynamic system, and since there is no software algorithm nor hardware structure that can calculate the inverse Jacobian instantaneously (not even for a small system), their networks are not hardware-implementable. Even when their networks are implemented in software, the inverse Jacobian in the differential equations actually slows down the convergence rate compared to the software implementation of difference equations in traditional optimization methods. This makes their networks less useful. A direct copy from difference equations to differential equations therefore does not necessarily yield any advantage.

7.2 Contributions

The following are the salient contributions of this dissertation:

(1) ANNs for optimization have been analyzed from the viewpoint of optimization theory, leading to discovery of the reasons why the ANNs succeed or fail.

(2) The optimization network theory for linear programming, quadratic programming, convex programming, and nonlinear programming has been derived, and a 2-phase optimization network has been developed. These results lay a mathematically sound foundation for the optimization ANNs and extend the applicability of ANNs.

(3) The quality of the solutions obtained by the optimization ANNs has been quantified through simulation. The solutions of the 1-phase networks are adjustable by tuning the penalty parameter, and the solutions of the 2-phase networks are exact.

(4) The applicability of the optimization ANNs to real-world problems like the economic power dispatch problem and the optimal power flow problem has been demonstrated. It was shown that the mapping technique of the optimization ANNs is simple and that they are able to handle various kinds of constraint sets. It was also demonstrated that the optimization ANNs attain a better objective function value.
As a whole, this work lays a solid groundwork for the optimization ANNs in both the theoretical and practical aspects. As far as solving nonlinear programming problems by ANNs is concerned, the work is complete, except perhaps for a more rigorous analysis of the 2-phase network. The optimization ANNs also suggest a possible structure for the next-generation analog computer. Future research should emphasize developing suitable hardware implementations for the optimization ANNs so as to fully exploit their capability.

BIBLIOGRAPHY

[1] Hopfield, J.J., "Artificial Neural Networks," IEEE Circuits and Devices Magazine, pp. 3-10, September 1988.

[2] Tank, D.W. and Hopfield, J.J., "Collective Computation in Neuronlike Circuits," Scientific American, pp. 104-114, December 1987.

[3] Hopfield, J.J. and Tank, D.W., "Computing with Neural Circuits: A Model," Science, vol. 233, no. 4764, pp. 625-633, August 8, 1986.

[4] Hopfield, J.J., "Neural Networks and Physical Systems with Emergent Collective Computational Abilities," Proceedings of the National Academy of Sciences USA, vol. 79, pp. 2554-2558, April 1982.

[5] Hopfield, J.J., "Neurons with Graded Response Have Collective Computational Properties like those of Two-state Neurons," Proceedings of the National Academy of Sciences USA, vol. 81, pp. 3088-3092, May 1984.

[6] Tank, D.W. and Hopfield, J.J., "Simple Neural Optimization Networks: An A/D Converter, Signal Decision Network, and a Linear Programming Circuit," IEEE Transactions on Circuits and Systems, vol. CAS-33, no. 5, pp. 533-541, 1986.

[7] Hopfield, J.J. and Tank, D.W., "'Neural' Computation of Decisions in Optimization Problems," Biological Cybernetics, vol. 52, pp. 141-152, 1985.

[8] Bagherzadeh, N., et al., "On Parallel Execution of the Traveling Salesman Problem on a Neural Network Model," Proceedings of the IEEE First International Conference on Neural Networks, San Diego, CA, vol. III, pp. 317-324, July 1987.