ARTIFICIAL NEURAL NETWORKS FOR CONSTRAINED AND UNCONSTRAINED OPTIMIZATION

By Jiahan Chen

A DISSERTATION submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY, Department of Electrical Engineering, 1992.

ABSTRACT

ARTIFICIAL NEURAL NETWORKS FOR CONSTRAINED AND UNCONSTRAINED OPTIMIZATION

By Jiahan Chen

Finding an accurate solution to linear and nonlinear programming problem formulations is a challenge. Fast, accurate and cost-effective methods for solving simultaneous linear and nonlinear equations, for example those found in large-scale power system control problems, have been sought for decades. In this dissertation, new approaches for solving these problems using artificial neural networks (ANNs) are presented. Foremost, the reason for the degenerating accuracy of previously developed ANNs for constrained optimization problems is discovered, and a new combination penalty function for a neural network is proposed which can ensure that an equilibrium point is sufficiently close to the optimal point. Then, an ANN for unconstrained optimization is proposed which leads to the development of linear and nonlinear equation solvers. Correspondingly, network formulations for the solvers are defined, and conditions for network stability and convergence are identified and proven. An architecture design for the solvers, in which a network is constructed from a neuron array and a resistance connection matrix, is described and the hardware complexity is given. Furthermore, the linear and nonlinear equation solvers are applied to solving power load flow and contingency analysis problems. The related issues in these applications, such as formulation modification and a relationship between the transmission line parameters in a power system and the stability property of the neural network, are analyzed. In addition, other applications of the linear equation solver are discussed. These include matrix inversion, determining the stability of a linear control system, and determining matrix singularity. Simulation experiments show encouraging results for test examples with linear and nonlinear programming, linear system problems, and power system control problems. Moreover, it is shown that the time complexity of the linear and nonlinear equation solvers is problem size independent, so that real-time computing for large-scale power systems will become possible when the approaches are implemented in VLSI technology.

ACKNOWLEDGMENTS

It is my great pleasure to thank my major advisor, Dr. Michael A. Shanblatt, for his guidance and encouragement throughout the years of my graduate studies. I especially appreciate his constructive critiques and intellectually challenging questions that initiated significant research directions. I also want to thank my other Ph.D. Committee members, Dr. P. David Fisher, Dr. Robert A. Schlueter, Dr. Gerald L. Park, and Dr. Joseph Gardiner, for their valuable comments and suggestions on this work. Furthermore, I gratefully acknowledge the financial support from the Center for Remote Sensing and the Case Center, Michigan State University, which has lasted for more than four years.
Finally, I wish to dedicate this dissertation to my parents for getting me started right and encouraging me to pursue higher education, my wife, Zhenying Shen, for her love, patience, understanding, and support, and my lovely daughters, Hallie and Quanda.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Chapter 1. Introduction
1.1 Overview
1.2 Problem Statement
1.3 Research Tasks
1.4 Organization of the Dissertation

Chapter 2. Artificial Neural Networks
2.1 Background
2.1.1 Basic Concepts
2.1.2 Features
2.1.3 Learning
2.2 ANN Architecture
2.2.1 Feedback Model
2.2.2 Feedforward Model
2.2.3 Cellular Model
2.3 ANN Applications
2.3.1 Categories of Applications
2.3.2 Methodologies of Neural Computing Applications
2.3.3 A Procedure for Optimization Applications

Chapter 3. Neural Networks for Linear and Nonlinear Programming
3.1 Introduction
3.2 Methodology Review
3.2.1 The Penalty Function Method
3.2.2 The Lagrange Multiplier Method
3.2.3 ANN Techniques
3.3 A New Combination Penalty Function
3.3.1 The Boundary Situation
3.3.2 A New Combination Penalty Function
3.4 Hardware Configuration
3.5 Experiments and Simulation Results
3.5.1 Examples
3.5.2 Discussion
3.6 Summary

Chapter 4. Solving Linear System Problems Using An Artificial Neural Network
4.1 Introduction
4.2 Methodology Review
4.2.1 The Traditional Methods
4.2.2 Systolic Arrays
4.2.3 The MIMD Method
4.3 A Neural Network Solution to Linear Equations
4.4 The Relationship to Linear Systems
4.5 Architecture
4.6 Other Applications
4.6.1 Matrix Inversion
4.6.2 Determining the Stability of a Linear Control System
4.6.3 Determining the Singularity of a Matrix
4.7 Simulation Results
4.7.1 Verification
4.7.2 Hardware Simulation
4.7.3 Hardware Complexity
4.8 Summary

Chapter 5. ANN Techniques for Power System Control Analysis
5.1 Introduction
5.2 Problem Statement and Methodology Review
5.2.1 Problem Statement
5.2.2 Methodology Review
5.3 ANN Formulation for Power Load Flow
5.3.1 ANN Solution to Nonlinear Equations
5.3.2 The Full Power Load Flow
5.3.3 The Decoupled Load Flow
5.3.4 The DC Load Flow
5.3.5 Contingency Analysis
5.4 Architecture
5.4.1 Verification
5.5 Summary

Chapter 6. Conclusion
6.1 Summary
6.2 Contributions
6.3 Future Research

Appendix A. Proof of Theorem 5.1
Appendix B. The Bus and Branch Data for the 39-bus Power System
Appendix C. Simulation Results for Power Systems
BIBLIOGRAPHY

LIST OF TABLES

2.1 Comparison of the nervous system and an ANN system.
2.2 Weights and thresholds of a neural logic unit.
3.1 Comparison results for linear and nonlinear programming.
4.1 The time complexity for linear system computing.
4.2 The connection of resistors.
4.3 Linear equations solutions.
4.4 Matrix inversion.
4.5 Stability.
4.6 Singularity.
5.1 A space estimation of the architecture for FLF.
5.2 A space estimation of the architecture for DLF.
5.3 A space estimation of the architecture for DCLF.
5.4 The contingency analysis.
B.1 The bus data for the 39-bus system.
B.2 The branch data for the 39-bus system.
C.1 The 7-bus system comparison results.
C.2 The 39-bus system comparison results.

LIST OF FIGURES

1.1 A classification of computer architectures.
2.1 A processing element.
2.2 A logic unit.
2.3 The Hopfield network.
2.4 The Kennedy-Chua network.
2.5 The block diagram of the Rodriguez network.
2.6 The basic switched-capacitor integrating neuron.
2.7 A cellular network.
3.1 Equipotential surfaces in space and their relationship to the feasible region boundary.
3.2 Different penalty functions.
3.3 A sectioned penalty function.
3.4 A combination penalty function.
3.5 Quadratic programming circuit.
3.6 A modified scheme for the constrained amplifier unit.
3.7 Trajectories for example 3.1(a).
3.8 Trajectories for example 3.2 (for β = 0.415).
3.9 Trajectories for example 3.3.
4.1 A systolic array for LU decomposition.
4.2 A systolic array for linear state equations.
4.3 The image of φ(x1, x2) from equation (4.22), an ellipse-parabolic surface.
4.4 A network architecture for the linear equation solver.
4.5 Applications of the ANN solver.
4.6 Error curves for examples.
4.7 Trajectories for an example using different initial points.
4.8 Trajectories of multi-solution (a singular matrix).
4.9 The result of SPICE simulation (two neurons).
4.10 The result of SPICE simulation (three neurons).
4.11 The result of SPICE simulation (five neurons).
4.12 A relationship between the chip area and the size.
5.1 The neural network for classification.
5.2 A power system consisting of 3 subsystems.
5.3 A network for the full power load flow.
5.4 A network for the decoupled load flow.
5.5 A network for the DC load flow.
5.6 The 7-bus system.
5.7 Trajectories of angle for the 7-bus system.
5.8 Trajectories of voltage for the 7-bus system.
5.9 Trajectories of angle for the 39-bus system.
5.10 Trajectories of voltage for the 39-bus system.
5.11 The contingency analysis for a 5-bus system.
Chapter 1
Introduction

Since the late 1980s, Artificial Neural Networks (ANNs) have captivated the interest of researchers. The use of ANNs introduces fundamentally new concepts into machine intelligence computing problems. ANNs have been successfully applied in areas such as pattern recognition, associative memory, adaptive control, intelligent robotics, and optimization. This dissertation presents a new approach for improving the performance of artificial neural networks for linear and nonlinear programming, and develops a neural network-based linear and nonlinear equation solver. This includes the theoretical analysis, network formulation, architecture configuration, and approach verification. This chapter begins with an overview of neural computing, followed by the problems to be solved and the research tasks of this work.

1.1 Overview

The birth of the electronic digital computer, the Von Neumann machine, marked a new era of information processing in human civilization. Since then, computers have developed from the first generation to the fifth generation, as devices, architectures and computing mechanisms have advanced. From the electronic tube, to the transistor, to the integrated circuit, to VLSI chips, the devices have become smaller, faster, and cheaper. A variety of architectures such as SISD, SIMD, MISD, and MIMD have been used in the Von Neumann-based paradigm. Due to the control-driven property, it is difficult to obtain a higher throughput in Von Neumann computers. Consequently, dataflow and reduction computers have been proposed [Dec89]. The former are data-driven, that is, program instructions are executed as soon as their operands are ready, and the latter are demand-driven, that is, program instructions are carried out when results are needed for other calculations. All of these computers are called conventional computers because their computing is based on algorithm-oriented processes. Of course, these computers have played very important roles in scientific, industrial, and commercial sectors, because they are much superior to the human brain in speed and accuracy for conventional computation tasks. They, however, are far inferior to the human brain in intelligence, perception, fault tolerance, and tolerance of fuzzy input. A nascent research area, the sixth generation of brain-like computers, has emerged, which requires cooperation among the following scientific disciplines: biology, psychology, mathematics, and computer science and engineering. The research area is divided into three aspects: Science, Technology and Applications [SoSo88]. Scientists are exploring intrinsic principles such as perception, cognition, and intelligence. Technologists are developing new techniques for adaptive learning, pattern recognition and decision making. At the same time, they are envisioning and implementing new hardware architectures, new processing algorithms, and new silicon devices. Application engineers will enhance information processing capabilities dramatically in some areas, such as pattern recognition, computer vision, robotics, and optimization. In addition, they will open new ways to develop much more intelligent machines. Consequently, tremendous changes will take place in the future. However, it is impossible for neural computers to replace conventional computers, because it has been proven that conventional computers are more suitable for tasks such as logical reasoning, sequential mathematical operations, planning, and language understanding. In other words, both neural computers and conventional computers will be developed harmoniously. Furthermore, a hybrid computer may be developed in the future which associates the real-time computing property of the neural computer with the features of the conventional computer, such as general purpose and program flexibility, so that an excellent overall performance can be achieved. Figure 1.1 shows a classification of computer architectures.

With the distinct features of asynchronous parallel processing, continuous-time dynamics, and global interaction of network elements, artificial neural networks create an opportunity for enhancing real-time control and decision-making processes. A major candidate in these processes is large-scale power system control problems because they are complex and time consuming. In this dissertation, a set of fundamental research tasks aimed toward developing artificial neural network techniques for power system control will be performed. The results of this research will lay the fundamental mathematical and analytical groundwork to take advantage of neural computational concepts in this application area. This will be accomplished by a series of research tasks which will lead to implementing the algorithms and architectures from this research in VLSI. This may ultimately be the key to providing a robust computing environment for enhanced on-line control and decision-making, a development that would greatly improve the security of electric power systems.

1.2 Problem Statement

Artificial neural networks, especially feedback networks, have been used to solve optimization problems. However, the computing accuracy of the present models cannot meet the demand for some applications. For example, the relative error is 1-3% for linear and nonlinear programming [ChLi84, KeCh88, Roet90]. The problem here is to find the reason for the degenerating computing accuracy and a method for improving accuracy.

Solving linear equations is a fundamental computation task. The time complexity of all the present methods is size-dependent: O(n^3) for the software approach, O(n) for the hardware approach [HwCh82, Pret86, CoRo87, JoJe88, Jelo90]. As the scale of the system increases, the computing efficiency becomes a critical issue. In order to overcome this size dependence, a new approach should be developed.

On-line control of large-scale power systems has always been weakened by an inability to solve large-scale complex power load flow and contingency analysis problems in the time frames required for real-time response to rapidly changing operating conditions [BuHW82, CoDK86, MoPG87, BaEM89]. The problem has been aggravated by the lack of robust algorithms and implementation technology which would allow for the design of economically feasible dedicated high-speed computational hardware architecture.

Artificial neural networks have shown great potential for solving complex computation problems in real time. A lot of progress has been made in ANN theory, architectures, and applications [TuHW86, AtSu89, Aget89, WiLe90, Fiet89, SoPa89, MaSh90, TaTr91, TrWa91, WeEl91]. Therefore, a significant opportunity exists for applying the ANN technology to the above problems.
By utilizing these computational advances and linking them to power system analysis, the research work will focus on: (1) analyzing artificial neural networks for linear and nonlinear programming from the viewpoint of optimization theory; (2) developing an ANN solver for general linear equations; (3) developing ANN techniques for the power contingency analysis; (4) exploring a method for solving nonlinear equations by artificial neural networks and applying it to solve the load flow problem.

1.3 Research Tasks

The following research plan is used to achieve the goals of this research in a stepwise and overlapping fashion and to set the stage for subsequent developmental research further exploiting the expected results.

Task 1: ANN Performance Improvement

Objective: Analyze various network formulations of feedback models from the viewpoint of optimization theory in order to discover the reason for degenerating computing accuracy.

Significance: This analysis will generate insights which will be useful in developing a new method for improving the performance of ANNs for optimization problems.

Approach: ANN feedback models for optimization will be investigated first. The role of the feasible region boundary for linear and nonlinear programming with constraints will be identified. In reviewing general optimization techniques, the penalty function method will be analyzed, particularly its behavior in a neighborhood of the boundary conditions. Based on these analyses, a new combination penalty function will be proposed which exhibits good behavior in both the boundary neighborhood and far-away regions. This function must also be easily implemented with physical circuits. Then, one ANN model, the Kennedy-Chua network, for example, will be chosen to test the new penalty function with a modified circuit scheme for the constraint amplifier unit, and to check the computing accuracy for linear and nonlinear programming.

Task 2: ANN Solver for Linear Equations

Objective: Establish a relation between quadratic optimization and the solution of linear equations, and map the latter into an artificial neural network structure.

Significance: This step in the research will lay a theoretical foundation linking artificial neural networks and the solution of linear equations. In consequence, a new application area of ANNs, a completely parallel equation solver, will be explored.

Approach: Quadratic programming without constraints and its application will be investigated. In fact, there exists a relation between the optimal point of a quadratic function and the solution of linear equations. The idea here is to map the general linear equations (Ax = b) into an ANN model. The primary theories used in deriving the formulation will include matrix theory, differential equation theory, and optimization theory [BaSt70, HiSm74, BaSh79, Cro80, HiLi86, Ort87b]. It is essential to guarantee stability and convergence of an ANN model. The stability of the quadratic optimization network depends on the coefficient matrix A, particularly on the eigenvalues of A. Based on this initial understanding, the problem can be divided into three cases: (1) A is a symmetric positive definite matrix. The optimization network is equivalent to the linear equations solution in the sense of equilibrium. (2) A is an arbitrary nonsingular matrix. It is necessary to find an equivalent transformation L such that L(A) is a symmetric positive definite matrix. Case 2, therefore, will be turned into case 1. (3) A is a symmetric or asymmetric matrix whose eigenvalues have positive real parts. There exists a unique stationary point of the optimization network which satisfies the linear equations. Correspondingly, a hardware configuration of the ANN solver will be sought in a modular fashion.

Task 3: Steady-State Contingency and Power Load Flow Formulations

Objective: Use the models and mappings to express contingency analysis and power load flow as corresponding artificial neural network formulations.

Significance: This task will provide a case study verification of the models and mappings. Furthermore, the results will indicate the capability for getting a feasible solution to power system control problems by ANN.

Approach: The results from Task 1 and Task 2 will be applied in the contingency analysis and power load flow problems. Some approximation techniques will be used for simplifying formulations. A mapping relationship between the optimization networks and the nonlinear equations solution will be sought. An emphasis will be put on how to get a feasible solution for power load flow from the viewpoint of security. In order to work more efficiently, the formulations will be divided into three steps: (1) the DC contingency, (2) the AC contingency and decoupled load flow, and (3) the full load flow.

Task 4: Simulations

Objective: Produce simulations of the artificial neural network approaches to the new penalty function, the ANN solver, the contingency analysis and power load flow.

Significance: The simulations will demonstrate the degree of practicality of implementing these approaches. The comparison will show the advantages and disadvantages of the approaches proposed with respect to other approaches.

Approach: Simulation programs for all the above approaches will be developed in the C language. They will be used to provide performance metrics for comparison with other approaches. The simulations will also be used to study convergence properties, computing accuracy, and network sensitivity regarding certain parameters, for example, the input coefficients for the new penalty function, the time increment for numerical integration, and the initial value for power load flow. According to the simulation results, possible modification and reformulation of the networks will be done by going back to the previous tasks for the purpose of performance improvement.

1.4 Organization of the Dissertation

The remainder of the dissertation is organized as follows. Chapter 2 contains a background of artificial neural networks. It begins with the basic concepts of neurons and neural networks, then briefly describes the distinct features of neural computing, followed by a discussion of learning methods and artificial neural network architectures. It ends with an outline of ANN applications and a general procedure for solving problems using the neural network technique. Chapters 3, 4 and 5 form the backbone of the dissertation and deal with Research Tasks 1, 2 and 3, respectively.

Chapter 3 presents a method for improving the computing accuracy of artificial neural networks for linear and nonlinear programming. Based on an investigation of the boundary situation of optimization problems with constraints, a new combination penalty function is proposed which can ensure that the equilibrium point is sufficiently close to the optimal point. A modified ANN architecture with the new penalty function circuit scheme is given.
Chapter 4 presents a new method for solving linear system problems using an artificial neural network. Based on a mapping between the solution of linear equations and quadratic minimization, solving linear equations can be viewed as finding the minimum of a quadratic function. The conditions for stability and convergence of the neural network-based dynamic system are identified and proven. A feedback ANN model with symmetric or asymmetric connections for optimization is proposed to form an ANN linear equation solver which can be implemented with VLSI technology. In addition, other applications of the approach are discussed. These include matrix inversion, determining the stability of a linear control system, and determining matrix singularity.

Chapter 5 presents a new approach for solving nonlinear equations using an artificial neural network technique. From an engineering point of view, a formulation of an ANN nonlinear equation solver is proposed which can significantly simplify the hardware architecture. The conditions for stability and convergence of the neural network based on the formulation are proven. Furthermore, ANN formulations for the full power load flow, the decoupled load flow, and the DC load flow are given, and corresponding architectures for them are proposed, which can be implemented with VLSI technology.

Finally, Chapter 6 summarizes this work, identifies the major contributions, and points out future research.

[Figure 1.1. A classification of computer architectures.]

Chapter 2
Artificial Neural Networks

In this chapter, a brief review of the development of artificial neural networks is given. Then, the basic concepts, features, learning algorithms, and architectures of neural networks are described. Finally, some issues of ANN applications are discussed.

2.1 Background

The notion and concept of neural networks began in the 1940s with a linking between biology and mathematics. By the late 1950s, the first neural network simulations emerged [WiHo60]. After that, some learning rules, system control theories, and algorithms for updating connection weights were released. Neural network research, however, suddenly declined in the late 1960s due to a book titled Perceptrons written by Marvin Minsky and Seymour Papert [MiPa69]. These leading artificial intelligence scientists favored the Von Neumann machine and over-criticized the immature neural networks' shortcomings. The research was not revived until 1982 when John Hopfield presented a paper to the National Academy of Science [Hop82]. Hopfield's theory and experiment re-convinced contemporary scientists of the benefits and feasibility of ANNs. A great return to neural network research followed. Since then, many researchers have been involved in biological neuron modeling, neural dynamics, learning algorithms, neural architectures, applications and their implementations [RuHW86, Mea88, MeIs89, GrKu89]. The following sections summarize the basic issues of neural network technology.

2.1.1 Basic Concepts

A neuron model is an abstract description of a neural cell. There are two major categories of models [DaDe91]. The first category attempts to represent the behavior of nerve cells in the deepest mathematical detail, often involving differential equations. This type of model is sufficient for use as a tool to validate and predict the behavior of the human nervous system.
Based on a number of simplifications, the second category of models focuses on the cause-effect behavior and functions of nerve cells. Most neural network models, including the ones discussed below, use the simplified model rather than the detailed models.

Figure 2.1 illustrates a simplified neuron model, called a processing element. Neuron i receives a set of input signals, (x_1, x_2, ..., x_n), each of which is multiplied by a weighting factor, w_ki. The combination function adds up all weighted input signals to produce the summation u_i. The transfer function maps the summation signal to an output signal v_i according to a certain relationship. The normalized range of output signals is between 0 and 1. The output signals are then transmitted to other neurons. The weighting factors have a significant meaning for a real or artificial system. A positive weighting factor represents an excitatory connection because the level of activation will increase from the positive weighted input. Conversely, a negative weighting factor represents an inhibitory connection because the level of activation will be reduced. The weighted connections play an important role in a neural network, as these connections form a weight matrix which characterizes the dynamic behavior of the neural network to a great extent.

A variety of functions, such as linear, piecewise-linear, threshold, and sigmoidal, can be used to represent the transfer functions. Among them, the sigmoidal function

    v_i = f(u_i) = 1 / (1 + e^{-u_i})        (2.1)

is most commonly used for the transfer function. The reason is that the sigmoidal function has the following useful properties: (1) The function is similar to the response function in the biological neuron. (2) The output is bounded from both above and below. (3) The function is continuous and continuously differentiable. (4) The function is monotonically increasing, because

    f'(u_i) = e^{-u_i} / (1 + e^{-u_i})^2 > 0.        (2.2)

At the kernel of neural computing is the artificial neural network (ANN), which has been defined as "massively parallel, interconnected networks of simple (usually adaptive) elements and their hierarchical organizations which are intended to interact with the objects of the real world in the same way as the biological nervous system does" [Koh87]. It is important to point out that the ANN is not a model of the human brain, which is much more complex, but a model inspired by the human brain's parallel processing aspect. Table 2.1 indicates the similarities between the human nervous system and an ANN system [KaBr89].

A threshold function is a simple transfer function. For the binary output case, as shown in Figure 2.1, if u_i >= T, then v_i = 1; otherwise, v_i = 0, where T is a threshold. ANNs are often built in a multi-layer architecture (see Sections 2.2.2.2 and 2.2.2.3) so that they can perform more complex information processing. As an example, Figure 2.2 illustrates that four basic logic problems (AND, OR, XOR, XNOR) can be solved by an identical neural network with different weights and thresholds. Table 2.2 lists their corresponding values.
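As a concrete illustration of how such a unit computes, the following C sketch implements a two-input logic unit in the layout of Figure 2.2: two hidden threshold neurons feeding one output threshold neuron. The weights and thresholds shown are one workable choice for XOR and are illustrative assumptions only; they are not necessarily the values listed in Table 2.2.

/* Illustrative two-layer threshold logic unit (2 inputs, 2 hidden, 1 output).
 * The weights/thresholds below realize XOR; they are example values only. */
#include <stdio.h>

static int threshold_neuron(double u, double T) { return u >= T ? 1 : 0; }

static int logic_unit(int a, int b)
{
    /* hidden neuron 1 acts as OR, hidden neuron 2 as AND */
    int h1 = threshold_neuron(1.0 * a + 1.0 * b, 1.0);   /* OR  */
    int h2 = threshold_neuron(1.0 * a + 1.0 * b, 2.0);   /* AND */
    /* output neuron: OR minus AND gives XOR */
    return threshold_neuron(1.0 * h1 - 1.0 * h2, 1.0);
}

int main(void)
{
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            printf("%d XOR %d = %d\n", a, b, logic_unit(a, b));
    return 0;
}

Changing only the weights and thresholds, the same structure realizes AND, OR, and XNOR, which is the point made by Table 2.2.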
2.1.2 Features

Artificial neural network technology introduces completely different concepts into computing, that is, non-algorithmic computing, learning, distributed memory, fault tolerance, and parallel processing [KaBr89].

2.1.2.1 Non-algorithm Computing

Conventional computing is an algorithm-oriented process in which each task is described in an exact program and is fulfilled by serial or partially parallel machine instructions. Neural computing is a dynamics-oriented process in which each task is mapped into a proper network architecture and is fulfilled by the network's inherent convergent behavior.

2.1.2.2 Learning

Many desired applications of conventional computer technology cannot be economically developed because the software needed is often too expensive to design, write, and debug. Ideal neural networks overcome this bottleneck because they do not have to be programmed for an application but theoretically learn an application by training through examples. In other words, neural networks possess artificial intelligence based on their accumulated "experience".

2.1.2.3 Distributed Memory

Conventional computer memory stores data and instructions in an address-accessed manner. Neural computing technology stores information in a distributed interconnected weight matrix.

2.1.2.4 Fault Tolerance

A conventional computer is very sensitive to its component failures, because the computer is composed of a set of different functional units without redundancy or with little redundancy. Neural networks can keep operating although a few components fail to work, because they have redundant connections and form a homogeneous structure [NeSY92].

2.1.2.5 Parallel Processing

Parallel processing in conventional computers is often based on time overlapping (e.g., pipelining) and spatial duplication (e.g., vector registers and systolic arrays) such that an instruction is divided into a series of sub-instructions which can be executed simultaneously. Neural computing technology has an inherent parallel behavior because it uses numbers of processing elements with adaptive connections such that signals can be transmitted and processed in parallel throughout the network.

2.1.3 Learning

In order to imitate human intelligence, a neural network must learn how to organize itself. This means the network will determine its connection weights by training to meet the demands of a particular problem. Learning has been defined as "a change in behavior which is to a significant degree permanent in nature and which results from activity, training or observation" [Jac88]. There are three basic learning methods: supervised learning, reinforcement learning, and unsupervised learning. In supervised learning, the input and the corresponding target for an example are known. The learning process adjusts connection weights within a network such that the difference between the output and the target approaches a minimum. In reinforcement learning, the target is not given as an exact quantity, but in a "good" or "bad" fashion. In unsupervised learning, without a target description for an example, a network will develop its own classification by extracting features and using the dynamic relationship between the inputs and the connection weights. Learning rules are exact algorithms which give an explicit description for updating connection weights in a network dynamically. Four well-known learning rules are briefly described below.

2.1.3.1 The Delta Rule

The Delta Rule is expressed as

    Δw_ij = ε (t_j - a_j) a_i        (2.3)

where w_ij is the connection weight from element i to element j, ε is a learning constant, t_j is the target activation of unit j, a_j is the actual activation of unit j, and a_i is the input node's activation. As a supervised learning rule, the Delta Rule is very efficient for two-layer networks. A well-known learning rule, Back Propagation, is used for three-layer and multi-layer networks.
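Before turning to back propagation, the Delta Rule update of Eq. (2.3) can be sketched in a few lines of C, the language used for the simulations in this work. The array sizes, names, and learning constant are illustrative assumptions.

/* Minimal sketch of one Delta Rule update, Eq. (2.3):
 * delta_w[i][j] = epsilon * (t[j] - a_out[j]) * a_in[i].
 * Layer sizes and the learning constant are illustrative. */
#define N_IN  3
#define N_OUT 2

void delta_rule_update(double w[N_IN][N_OUT],
                       const double a_in[N_IN],    /* input activations a_i   */
                       const double a_out[N_OUT],  /* actual activations a_j  */
                       const double target[N_OUT], /* target activations t_j  */
                       double epsilon)             /* learning constant       */
{
    for (int j = 0; j < N_OUT; j++) {
        double err = target[j] - a_out[j];          /* (t_j - a_j)             */
        for (int i = 0; i < N_IN; i++)
            w[i][j] += epsilon * err * a_in[i];     /* weight change i -> j    */
    }
}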
An interesting link between back propagation learning and multivariate nonlinear least squares estimation is established in [Ang89]. The purpose of learning is to update the weight matrix according to training data such that the squared error between the desired output and the computed output is minimized. Using a normalized form, the squared error can be written as

    D^2(S, Y) = (1/n) Σ_{j=1}^{o} Σ_{l=1}^{n} (S_jl - Y_jl)^2
              = (1/n) Σ_{j=1}^{o} Σ_{l=1}^{n} [ f( Σ_{k=1}^{h} w_kj^{OH} f( Σ_{i=1}^{I} w_ik^{IH} X_il ) ) - Y_jl ]^2        (2.4)

where X_il is the input value, Y_jl is the desired output value, S_jl is the computed output value, I, o, h are the numbers of input, output, and hidden neurons, respectively, n is the number of training exemplars, W^IH is the weight matrix between the input layer and the hidden layer, W^OH is the weight matrix between the hidden layer and the output layer, and f is the transfer function. From a mathematical point of view, the problem can be viewed as solving the set of equations

    ∂[D^2(S, Y)] / ∂w_kj^{OH} = ∂[D^2(S, Y)] / ∂w_ik^{IH} = 0.        (2.5)

Here the number of equations is p1 (= n·o) and the number of unknown variables is p2 (= I·h + h·o). There are three cases for the general equations. If p1 = p2, the problem is determined since there may exist a unique solution. If p1 > p2, the problem is overdetermined since there may exist no solution. If p1 < p2, the problem is underdetermined since there may exist infinitely many solutions. For coping with all of the cases, a solution can be found in the sense of the least squares estimation. Using the Newton-Raphson iteration [Sca66], the updated weight scheme can be obtained as

    w_kj^{OH}(m+1) = w_kj^{OH}(m) - α ∂[D^2(S, Y)] / ∂w_kj^{OH}(m)        (2.6a)

    w_ik^{IH}(m+1) = w_ik^{IH}(m) - α ∂[D^2(S, Y)] / ∂w_ik^{IH}(m)        (2.6b)

where m is an iteration index and α is a learning factor (0 < α < 1). α is based on the following assumption for the Newton-Raphson method:

    f'(x^(m)) ≈ 1/α.        (2.6c)

2.1.3.2 Adaptive Resonance Theory (ART)

ART is an unsupervised learning rule. By using long term and short term memories, ART can classify the input and enhance its memory capability. This means that if the input pattern stored in short term memory is similar to an expected pattern stored in long term memory, the input is classified to that category. If there is no similarity between the input pattern and the expected patterns, a new class is formed.

2.1.3.3 Kohonen Learning

The Kohonen Learning Rule is written as

    W_i^new = W_i^old + α (X_i - W_i^old)        (2.7)

where X_i = (x_1i, x_2i, ..., x_ni) is the input vector for processing element i, W_i is the weight vector associated with the inputs for processing element i, and α is a learning constant. The Kohonen learning rule is useful for unsupervised learning.

2.1.3.4 Boltzmann Learning

This is a reinforcement learning rule [HiSe86]. The significant features of Boltzmann learning are that a probability function, the Boltzmann distribution in this case, is used to adjust connection weights, and simulated annealing is used to control dynamical settling of the network into a minimum energy state.

The above learning rules are widely used in feedforward neural networks. For feedback neural networks, the learning rule is often defined as the state updating of a network [Hec90]. For example,

    x_i^new = sgn( Σ_{j=1}^{n} w_ij x_j^old - T_i )        (2.8)

where x_i is the state, sgn is a sign function, T_i is a threshold, and w_ij is the connection weight. In general, w_ij takes a fixed value, and w_ij = w_ji.
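The state-updating rule of Eq. (2.8) can be sketched as follows. The network size, the ±1 coding of states, and the asynchronous sweep order are illustrative assumptions.

/* Sketch of the state-update rule of Eq. (2.8) for a feedback network
 * with fixed symmetric weights w[i][j] and thresholds T[i].
 * Here sgn() maps nonnegative sums to +1 and negative sums to -1. */
#define N 4   /* illustrative network size */

static int sgn(double x) { return x >= 0.0 ? 1 : -1; }

/* One asynchronous sweep; returns how many states changed. */
int update_states(int x[N], const double w[N][N], const double T[N])
{
    int changed = 0;
    for (int i = 0; i < N; i++) {
        double u = -T[i];
        for (int j = 0; j < N; j++)
            u += w[i][j] * x[j];        /* sum_j w_ij * x_j^old - T_i */
        int x_new = sgn(u);
        if (x_new != x[i]) { x[i] = x_new; changed++; }
    }
    return changed;
}

With symmetric weights, repeated sweeps of this kind settle into a state that no longer changes, which is the convergence to a stable state described next.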
Another aspect of learning within a feedback neural network is that an energy function E is defined, and E is always decreasing as the state is updated. Eventually, the network arrives at a stable state when the learning process converges [KaBr89, Hec90].

2.2 ANN Architectures

Many ANN architectures have been proposed for solving a variety of problems such as optimization, pattern recognition, computer vision, and associative memory [Mea88, Hop84, TaH086, AnR088, FoTa88, TaTr91, MoAG91]. According to their topologies, the architectures can be divided into three categories: feedback neural networks, feedforward neural networks and cellular neural networks.

2.2.1 Feedback Neural Networks

In this type of neural network there exist feedback paths from outputs to inputs of neurons. A key dynamic issue of these networks is that the energy derivative must be strictly decreasing so that the system equilibrium state can be reached.

2.2.1.1 The Hopfield Model

The significance of the Hopfield model is that it can be built with analog circuit components and is suitable for analog VLSI implementation [Hop84, TaH086]. In this model, each processing element is an amplifier with a capacitor C_i and a resistor ρ_i at the input node, and a sigmoidal transfer function from input u_i to output v_i. Processing element i is connected to processing element j via a finite conductance T_ij, all of which form a symmetric connection weight matrix T. The architecture of the general Hopfield network is shown in Figure 2.3. By using KCL, the time derivative of the input potential u_i can be written as

    C_i du_i/dt = Σ_{j=1}^{n} T_ij v_j - u_i/R_i + I_i        (2.9)

where

    1/R_i = 1/ρ_i + Σ_{j=1}^{n} 1/R_ij = 1/ρ_i + Σ_{j=1}^{n} T_ij        (2.10)

and I_i is an input bias current for neuron i. An energy function is defined by integration of equation (2.9) as

    E = -(1/2) Σ_i Σ_j T_ij v_i v_j - Σ_i I_i v_i + Σ_i (1/R_i) ∫_0^{v_i} f^{-1}(ξ) dξ.        (2.11)

The time derivative of the energy function can be found as

    dE/dt = -Σ_i C_i [1 / (df(u_i)/du_i)] (dv_i/dt)^2.        (2.12)

Because v_i = f(u_i) is monotonically increasing and C_i is positive, dE/dt <= 0 for all t; the value of the energy function is strictly decreasing, and dE/dt becomes zero only at an equilibrium point, at which ∂E/∂v_i = -C_i du_i/dt = 0 for all i. As described at the end of Section 2.1.3.4, this represents the learning in the Hopfield model [KaBr89]. The Hopfield model has been used in combinatorial optimization, linear programming, dynamic programming, and signal and image processing applications [Hop84, ChMS91, NaCh92].

2.2.1.2 The Kennedy-Chua Model

Kennedy and Chua proposed a canonical circuit model with feedback [ChLi84, KeCh88]. Their model can be applied to both linear and nonlinear programming. In addition, the networks' structural parameters are in correspondence with the coefficients of the objective function and constraint descriptions. Figure 2.4 illustrates an architecture of the model, where p-cells are constraint amplifier units and j-cells are integrators (neurons). The basic relation for the circuits, as given in [KeCh88], is

    C_i dv_i/dt = -[ ∂φ(v)/∂v_i + Σ_{j=1}^{m} i_j ∂g_j(v)/∂v_i ]        (2.13)

where φ(v) is the objective function, g(v) are the constraints, i_j = p_j(g_j(v)), C_i is a capacitance, and v is the vector of node voltages v_1, v_2, ..., v_n. The energy function can be defined as

    E(v) = φ(v) + Σ_{i=1}^{m} ∫_0^{g_i(v)} p_i(s) ds.        (2.14)

The derivative of the energy function is

    dE/dt = Σ_i (∂E/∂v_i)(dv_i/dt) = -Σ_i C_i (dv_i/dt)^2.        (2.15)

Obviously, C_i > 0 and (dv_i/dt)^2 >= 0, so dE/dt <= 0. Therefore, E(v) is a Lyapunov function which ensures that the system is completely stable.
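To make the behavior of such a network concrete, the following C sketch numerically integrates the gradient dynamics associated with the energy function (2.14) for a small quadratic program. The test problem, the piecewise-linear constraint nonlinearity p(s) = s/R for s < 0, and the values of C, R and the step size are illustrative assumptions, not parameters taken from [KeCh88].

/* Illustrative Euler simulation of Kennedy-Chua style dynamics,
 *   C dv/dt = -(grad phi(v) + p(g(v)) * grad g(v)),
 * for: minimize phi(v) = (v1 - 2)^2 + (v2 - 1)^2
 *      subject to g(v) = 2 - v1 - v2 >= 0,
 * whose exact constrained optimum is (1.5, 0.5). */
#include <stdio.h>

int main(void)
{
    double v1 = 0.0, v2 = 0.0;            /* initial node voltages             */
    const double C = 1.0, R = 0.01;       /* capacitance, penalty "resistance" */
    const double dt = 1e-3;

    for (int k = 0; k < 20000; k++) {
        double g = 2.0 - v1 - v2;                 /* constraint value            */
        double p = (g < 0.0) ? g / R : 0.0;       /* constraint-amplifier output */
        double dv1 = -(2.0 * (v1 - 2.0) + p * (-1.0)) / C;   /* dg/dv1 = -1 */
        double dv2 = -(2.0 * (v2 - 1.0) + p * (-1.0)) / C;   /* dg/dv2 = -1 */
        v1 += dt * dv1;
        v2 += dt * dv2;
    }
    printf("equilibrium near v1 = %.4f, v2 = %.4f\n", v1, v2);
    return 0;
}

With the finite R used here the simulated equilibrium settles near (1.505, 0.505), slightly outside the feasible region, rather than at the exact optimum (1.5, 0.5); this kind of finite-conductance error is precisely the accuracy issue examined in Chapter 3.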
Because the linear programming network of Hopfield obeys the same unifying stationary cocontent theorem as the canonical nonlinear programming circuit of Kennedy and Chua, the Hopfield model can be regarded as a special case of the Kennedy-Chua model [KeCh87].

2.2.1.3 The Rodriguez Model

Although it is similar to the Kennedy-Chua model in principle and purpose, the Rodriguez model has a distinct feature: the RC-active technique is replaced by the SC-reactive (switched-capacitor) technique, which is more suitable for VLSI implementation [Roet90]. Figure 2.5 is a block diagram of this model. The time derivative of the objective function Ψ(·) can be expressed as

    dΨ/dt = Σ_i (∂Ψ/∂x_i)(dx_i/dt) = -τ Σ_i (dx_i/dt)^2        (2.16)

where τ is a positive parameter. Therefore, dΨ/dt <= 0. In order to implement a discontinuous pseudo-cost function, a basic switched-capacitor integrating "neuron" is used in this model. Figure 2.6 shows this scheme. By using even and odd clock phases, the operation of Figure 2.6 becomes threshold-controlled [Roet90].

2.2.2 Feedforward Neural Networks

A feedforward network is composed of multiple layers of neurons. Each layer forms an individual group to perform a specified information task. The whole network is connected from layer to layer, but there exists no connection within the same layer. The signals propagate from the first layer (input layer) to the last layer (output layer). The transfer functions used may be different from layer to layer, but they must be the same within a given layer.

2.2.2.1 The Two-layer Model

This model consists of only an input layer and an output layer. It can be used in simple information processing. One kind of two-layer model has been used in an associative memory [KoS87].

2.2.2.2 The Three-layer Model

When the representation provided by the outside world is such that the similarity structure of the input and output patterns is very different, the two-layer network without internal stages will be unable to perform the necessary mapping. The three-layer model is composed of an input layer, an output layer, and a hidden layer. This model is often used in pattern recognition based on some learning algorithms [Hec87, WiLe90].

2.2.2.3 The Multi-layer Model

This model is similar to the three-layer model except that it has more than one hidden layer. The multi-layer model is capable of handling more complex information processing tasks. One thing worth pointing out here is that the hidden layers must use nonlinear transfer functions, because multiple hidden layers with linear transfer functions are equivalent to a single hidden layer [DaDe91]. In other words, it makes no sense to use a multiple-hidden-layer network when linear transfer functions are used.
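A forward pass through a three-layer feedforward network of this kind can be sketched in a few lines of C. The layer sizes and the use of sigmoidal units in both the hidden and output layers are illustrative assumptions.

/* Minimal sketch of a forward pass through a three-layer (input, hidden,
 * output) feedforward network with sigmoidal hidden and output units.
 * Sizes and weights are illustrative. */
#include <math.h>

#define N_IN  2
#define N_HID 3
#define N_OUT 1

static double sigmoid(double u) { return 1.0 / (1.0 + exp(-u)); }

void forward_pass(const double x[N_IN],
                  const double w_ih[N_IN][N_HID],  /* input->hidden weights  */
                  const double w_ho[N_HID][N_OUT], /* hidden->output weights */
                  double y[N_OUT])
{
    double h[N_HID];

    for (int j = 0; j < N_HID; j++) {              /* hidden layer */
        double u = 0.0;
        for (int i = 0; i < N_IN; i++)
            u += w_ih[i][j] * x[i];
        h[j] = sigmoid(u);
    }
    for (int k = 0; k < N_OUT; k++) {              /* output layer */
        double u = 0.0;
        for (int j = 0; j < N_HID; j++)
            u += w_ho[j][k] * h[j];
        y[k] = sigmoid(u);
    }
}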
Moreover, the nearest neighbor interactive property and regularity of cellular neural networks makes them suitable for VLSI implementation. 2.3 ANN Applications By far the largest area of research in neurocomputing is that of applications. Meanwhile, it has been shown that some significant achievements in neurocomputing were associated with applications [WiHo60, StPi63, Hop82, CaGr87]. In this Section, application categories, methodology and related techniques are described. 2.3.1 Categories of Applications Neurocomputing applications can be roughly divided into four classes: Pattern Recognition, Control, Data Analysis, and Optimization. Pattern recognition is the major application area of neurocomputing [Gro76, Hec90]. The input of the neurocomputing system is spatial image data or time serial signals. The neural network fulfills a mapping from an input vector to an output classification. Associative memory can be regarded as a special example of this category. Adaptive control has a great significance for real world problems. By supervised traing or reinforced training, neural network controllers attempt to adjust the system 25 state by minimizing the difference between actual output and desired output. The Vision-Based Broomstick Balancer [ToWi88] and Automobile Autopilot [She88] are good examples of this kind of application. Data analysis is the highest potential application area for emulating human brain behavior to some extent. The ultimate purpose of data analysis is usually to extract concentrated information from fuzzy raw data. Based on the knowledge from expert systems, the trained neural network can make decisions for a variety of input informa- tion in a very short time. For example, the One-Minute Manager has shown surpris- ingly good performance for a neural network based medical diagnosis system [Bla86]. Optimization is a straight forward application of neural networks, because neuro— computing is a dynamic process such that an equilibrium point of the system is often associated with an optimal solution for a given problem. Recently, several achieve- ments have been made for different problems in this area such as combinatorial optimization [Hop84], linear and nonlinear programming [TaH086, KeCh88, MaSh92 ChMS92], and dynamic programming [ChMS91]. Since the dissertation focuses on this area, the issues concerned in great detail will be provided in the following Chapters. 2.3.2 Methodology of Neurocomputing Applications In order to solve real problems with neurocomputing efficiently, it is necessary to combine neurocomputing with other technological methods. In other words, methodol- ogy plays an important role in neurocomputing application projects. In the following three key issues will be briefly discussed. 2.3.2.1 Problem Solving Methodology It is essential for application researchers to have a comprehensive understanding of the work domain for which the application is intended. Therefore, the primary methodology for neurocomputing applications is to "work in reverse" [Hec90]. That is, 26 first, a list of neural networks is constructed by the domain specialists. The list describes each network’s problem solving capabilities and training requirements. Then, a set of operational functions is developed. Furthermore, applications in each of these areas are considered, and candidate applications are evaluated as well. 
2.3.2.2 Functional Specification The purpose of functional specification is to define an application project pre- cisely so that the developers who are not domain specialists can work on the project more efficiently. Generally, the functional specification can be outlined as follows [l-lec90]: (1) Description: Describe the capabilities of application project to be developed. (2) System Interface: Identify requirements for interfaces between the neurocomput- ing system and other systems. (3) Human Interface: Identify requirements or constraints regarding the interface between the developed system and humans. (4) Performance: Provide requirements on the overall performance of the system, such as accuracy, speed, reliability, size and mass. (5) Economic factors: Estimate development cost and potential benefit of the project. 2.3.2.3 Technological Development Once the functional specifications have been made, technical development will begin. This includes mathematical mapping, gathering training data, developing learn- ing algorithms, and modifying network architecture. Of course, technical details depend on the particular application. A procedure for optimization applications is presented next. 27 2.3.3 A Procedure for Optimization Applications Since ANN s for optimization problems will be developed and applied in the fol- lowing Chapters, a general procedure for solving this kind of problems is described below: (1) Translate a real problem into an optimization problem; (2) Develop a dynamical system for the optimization problem; (3) Choose proper variables, and encode them; (4) Map the the dynamical system into a neural network; (5) Define the energy function for the network; (6) Develop a learning algorithm for updating connection weight, or identify and prove the conditions for stability and convergence of the network; (7) Choose a proper initial point, and find a solution to the problem by the network (simulation or physical implementation); (8) Improve the performances of the network. In this procedure, heuristics are frequently used. For example, in the Traveling- Salesman Problem (TSP), the best path is always chosen from a city to one of its four nearest neighbors; if two cities A and B are as far apart as possible, they will tend to occur near opposite sites of the closed route [HoTa85]. In the pattern recognition, the expert’s knowledge often gives a good hint for feature extraction. In other words, human intelligence still plays a dominant role in the design of ANN application sys- tems. Loosely speaking, ANN applications can be divided into two categories: stochastic-oriented and deterministic-oriented. Pattern recognition, classification, robot vision and event forecasting are the stochastic applications, whereas the well-defined optimization such as the TSP, linear and nonlinear programming problems are the deterministic applications. In the stochastic applications, the computing accuracy 28 depends on learning algorithm and variable choice. The relative error is measured by the comparison between the computed result and the actual value, with a range from 1% to 8% [EOMD91, PaEM91, SrLC9l]. In the deterministic applications, there exists an exact solution to the problem so that the relative error can be defined quantitatively. The computing accuracy depends on the mathematic formulation and architecture. In addition, the environmental situation exerts an influence on the accuracy. 
Figure 2.1. A processing element.

Figure 2.2. A logic unit.

Figure 2.3. The Hopfield network.

Figure 2.4. The Kennedy and Chua network.

Figure 2.5. The block diagram of the Rodriguez network.

Figure 2.6. The basic switched-capacitor integrating neuron.

Figure 2.7. A cellular network.

Table 2.1. Comparison of the nervous system and an ANN system.

    Nervous system      ANN system
    Neuron              Processing element
    Dendrites           Combining function
    Cell body           Transfer function
    Axons               Element output
    Synapses            Weights

Table 2.2. Weights and thresholds of a neural logic unit (the weights w1 through w6 and thresholds T1 through T5 realizing the AND, OR, XOR, and XNOR operations).

Chapter 3

Neural Networks for Linear and Nonlinear Programming

In Chapter 3, a method for improving the performance of artificial neural networks for linear and nonlinear programming is presented. By analyzing the behavior of the conventional penalty function, the reason for the inherent degenerating accuracy is discovered. Based on this, a new combination penalty function is proposed which can ensure that the equilibrium point is sufficiently close to the optimal point. A known neural network model has been modified using the new penalty function and a corresponding circuit scheme is given. Simulation results show that the relative error for linear and nonlinear programming is substantially reduced by the new method.

3.1 Introduction

Artificial Neural Networks (ANNs) have been applied as models of computation for solving a wide variety of problems in such diverse fields as combinatorial optimization [Hop84, TaHo86, FoTa88], computer vision [RuHW86], and pattern recognition [GaGr87]. In particular, some networks have been used for solving linear and nonlinear programming problems. There are three well-known models among these networks. Tank and Hopfield used sigmoidal input-output cells as a neuron model and mapped the cost function and constraints into closed-loop networks [Hop84, TaHo86]. When a constraint violation occurs, the magnitude and direction of the violation are fed back to adjust the states of the neurons in the network. Their model can be directly implemented with analog circuits; however, the globally optimal solution can be obtained only under the assumption of very high gain and infinite conductance. Kennedy and Chua used integrator cells to model neurons and mapped the cost function and constraints into a canonical nonlinear circuit [KeCh88]. Their model can be applied to both linear and nonlinear programming.
In addition, the network's structural parameters correspond directly to the coefficients of the objective function and constraint descriptions. The shortcoming of their network, however, is that the optimal solution can only be guaranteed with infinite conductance. Rodriguez-Vazquez, et al., also used integrator cells for the neurons and mapped the cost function and constraints into a switched-capacitor network [Roet90]. Their pseudo-cost function is not continuous from the feasible region to the infeasible region. The distinct feature of their model is the substitution of the RC-active technique by an SC-active technique which is more suitable for VLSI implementation. However, a new problem arises: there is no equilibrium point in the system because of the discontinuous behavior.

Due to the limitations mentioned above, the relative error of results from these models is 1% to 3% [KeCh88, ChLi84], although it is recognized that these are case-dependent results [MaSh89]. The motivation for this work is to develop a new method for improving the computing accuracy of linear and nonlinear programming.

In this Chapter, the role of the boundary for linear and nonlinear programming is investigated. By analyzing the behavior of the conventional penalty function, the reason for its degenerating accuracy is discovered. Based on this, a new combination penalty function is proposed which can ensure that the equilibrium point is sufficiently close to the optimal point. Using the new penalty function, a modified ANN model is described and an alternative circuit scheme for the constraint amplifier unit is given. The stability and convergence of the modified model are addressed. Finally, a few examples of linear and nonlinear programming are solved by simulation of the new model.

3.2 Methodology Review

A large class of problems in science and engineering can be formulated as optimization problems in which a minimum or maximum of an objective function is sought subject to certain constraints. From a mathematical point of view, the case of maximizing a concave function is equivalent to that of minimizing a convex function [BaSh79]. Therefore, only the latter will be discussed here. Consider the following optimization formulation:

    \text{Minimize } \phi(x)                                        (3.1)
    \text{subject to } g_i(x) \ge 0, \quad i = 1, \dots, m          (3.2)

where φ(x) is a nonlinear strictly convex function, x is in R^n, and g_i(x) is convex for all i. The nonempty convex set S = {x : g_i(x) ≥ 0, i = 1, 2, ..., m} is called the feasible region.

The penalty function and the Lagrange multiplier are two well-known methods for solving the problem [BaSh79, HiLi86]. The basic idea is to translate the constrained minimization problem into a new unconstrained minimization problem by introducing a modified objective function.

3.2.1 The Penalty Function Method

Define a new objective function as

    \phi^*(x) = \phi(x) + \mu \alpha(x)                             (3.3)

where

    \alpha(x) = \sum_{i=1}^{m} p_i(x)                               (3.4)

and

    p_i(x) = 0                 \quad \text{if } g_i(x) \ge 0,
    p_i(x) = |g_i(x)|^{p}      \quad \text{if } g_i(x) < 0.         (3.5)

For the conventional penalty function, µ > 0, and p is a positive integer, in general p ≥ 2. A theoretical analysis has shown that only when µ approaches infinity can the solution to (3.3) be the same as the solution to (3.1) and (3.2) [BaSh79]. Two problems arise: it is difficult to realize an infinite µ physically, and too large a µ may result in an ill-conditioned problem [BaSh79].
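To make the behavior of the conventional penalty term concrete, the short C sketch below evaluates the modified objective of (3.3) with the power-law penalty of (3.5) for a single constraint. The particular objective, constraint, and parameter values are illustrative assumptions only.

```c
#include <stdio.h>
#include <math.h>

/* Conventional penalty term of (3.5): |g|^p outside the feasible region, 0 inside. */
static double penalty(double g, int p)
{
    return (g >= 0.0) ? 0.0 : pow(fabs(g), (double)p);
}

int main(void)
{
    /* Example (assumed): minimize phi(x) = x^2 subject to g(x) = x - 1 >= 0.    */
    /* The constrained minimum is x = 1, on the boundary of the feasible region. */
    double mu = 10.0;   /* penalty coefficient */
    int p = 2;          /* penalty exponent    */
    double x;

    for (x = 0.90; x <= 1.101; x += 0.05) {
        double g = x - 1.0;
        double phi_star = x * x + mu * penalty(g, p);
        printf("x = %5.2f   g(x) = %6.2f   phi* = %8.5f\n", x, g, phi_star);
    }
    /* For x slightly inside the infeasible region (x < 1) the penalty grows only */
    /* like |g|^2, so phi* can remain below phi*(1); this weak "left neighborhood"*/
    /* behavior is analyzed in Section 3.3.1.                                     */
    return 0;
}
```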
3.2.2 The Lagrange Multiplier Method

This method can directly handle only equality constraints. By introducing a relaxed function S_i, the inequality constraints can be transformed into equalities:

    q_i = g_i(x) - S_i = 0                                          (3.6)

where S_i is a non-negative function depending on a relaxed (slack) variable x_{n+i}, generally taking the form

    S_i = x_{n+i}^2 .                                               (3.7)

Thus, the problem is changed to minimizing a modified objective function without constraints,

    \Phi(x) = \phi(x) + \sum_{i=1}^{m} \lambda_i q_i                (3.8)

where each λ_i is an unknown constant. A minimum point must satisfy

    \partial \Phi / \partial x_i = 0 \quad \text{for } i = 1, 2, \dots, n
    q_i = 0                          \quad \text{for } i = 1, 2, \dots, m.     (3.9)

Consequently, solving the minimization problem with equality constraints is equivalent to finding a root of the simultaneous equations with real coefficients formulated in (3.9).

3.2.3 ANN Techniques

Tank and Hopfield first developed a simple optimization neural network and applied it to the linear programming problem [TaHo86]. Their network consists of two parts: an amplifier unit and a constraint unit. They are associated by the following equations:

    C \frac{du_i}{dt} = -\left( a_i + \frac{u_i}{R} + \sum_{j} i_j d_{ji} \right)       (3.10)

and

    v_i = g(u_i)                                                    (3.11)

where g(·) is a sigmoid function,

    \phi(v) = v^T a                                                 (3.12)

is the objective function subject to the constraints

    f_j(v) = v^T d_j - b_j \ge 0                                    (3.13)

for j = 1 to m, and i_j = q(f_j(v)) is the penalty function. The energy function for the network is defined as

    E(v) = v^T a + \frac{1}{R} \sum_{i=1}^{n} \int_0^{v_i} g^{-1}(\xi)\, d\xi + \sum_{j=1}^{m} \int_0^{f_j(v)} q(\xi)\, d\xi        (3.14)

and the derivative of the energy function can be found as

    \frac{dE}{dt} = -\sum_{i=1}^{n} C_i \frac{dg(u_i)}{du_i} \left( \frac{du_i}{dt} \right)^2 .       (3.15)

Since g(u_i) is a monotone increasing function, dg(u_i)/du_i is positive. Thus dE/dt ≤ 0, indicating that the system will tend to move toward a local minimum point.

In a modification to this model, Kennedy and Chua proposed a neural network for nonlinear programming described by the set of first-order differential equations in (2.13). The energy function E(x) is defined in (2.14). It has been shown that this network can cope with linear and nonlinear programming [KeCh88].

Networks of this genre were further developed by Ma and Shanblatt, in which a variety of ANN optimization models for linear, quadratic and general nonlinear programming were analyzed. They showed that the optimization network formed by

    \dot{x} = -\nabla \phi(x) - s \left[ \sum_{j=1}^{m} g_j^{+}(x)\, \nabla g_j(x) \right]      (3.16)

fulfills both the Kuhn-Tucker optimality conditions and the criteria of the penalty function method [MaSh91]. Under an assumption of convexity, the network described by the dynamics of equation (3.16) traverses the surface of the energy function and eventually settles on a local minimum. Furthermore, a two-phase optimization network model was proposed such that different dynamics are used as the phase is changed by a predetermined timing switch. As a consequence, exact solutions of the network can be guaranteed by adjusting the penalty parameter.

3.3 A New Combination Penalty Function

Since the penalty function method is widely used in ANN optimization models, it is essential to investigate the behavior of the conventional penalty function technique in order to propose an improved network for optimization problems.

3.3.1 The Boundary Situation

Let x^u be the solution to (3.1) without constraints, and let x* be the solution to (3.1) and (3.2) with constraints. The following Theorem can be stated.

Theorem 3.1: If x^u is in S, then x* = x^u; otherwise, x* lies in B(S), where S is bounded and B(S) is the boundary of S.

Proof can be found in [BaSh79].
From Theorem 3.1, the constrained problem can be divided into two cases: case 1, in which the solution lies in the interior of the feasible region, and case 2, in which the solution lies on the boundary of the feasible region. For the former, the solution is the same as that without constraints and can be expressed as a set of algebraic equations by setting the partial derivatives of φ(x) equal to zero. For the latter (nonlinear programming with constraints, NLPW), obtaining the solution is more difficult due to possible complex situations on the boundary. In practice, most optimization problems belong to case 2, and therefore it will be examined in detail.

For linear programming (LP) without constraints there is no minimum point. The minimum point for LP with constraints (LPW) can only be on the boundary of the feasible region [HiLi86], provided the feasible region is bounded. Thus, a strategy for improving computing accuracy will be useful to both NLPW and LPW.

Consider a gradient system which is often used for optimization by ANNs:

    \dot{x} = -\nabla \phi^*(x) = -[\nabla \phi(x) + \nabla \mu \alpha(x)].        (3.17)

Due to the convexity assumption, the unique equilibrium point (i.e., the minimum point for (3.1) and (3.2)) must satisfy

    \dot{x} = 0,                                                    (3.18)

which means

    \nabla \phi(x) = -\nabla \mu \alpha(x).                         (3.19)

Figure 3.1 illustrates the situation for case 2, where dashed curves indicate equipotential surfaces of φ(x). Let x_b be the solution of the problem described by equations (3.1) and (3.2), lying on the boundary of the feasible region, i.e., x_b = x*. Let x_bo be a point outside the feasible region S but in a neighborhood of x_b so that φ(x_bo) < φ(x_b). According to the assumption of convexity and equation (3.19), it can be concluded that the vectors ∇φ(x_b) and ∇µα(x_b) point in opposite directions, as shown in Figure 3.1. Let l(x_b) = (x_bδ, x_b) be called a small "left" neighborhood of x_b. If the penalty function is too weak to increase µα(x) enough to offset the decreasing φ(x) in the range l(x_b), then φ*(x_bo) may be slightly lower than φ*(x_b), or almost equal to φ*(x_b), where x_bo lies in l(x_b). For the conventional penalty function, it is clear that |g_i(x_bo)|^{p-1} is so small there that its offset can be almost ignored. This is the reason why the computing accuracy of the previous models, which use the conventional penalty function, is not satisfactory.

3.3.2 A New Combination Penalty Function

According to the above analysis, it is desirable to find a new penalty function which can ensure that a solution to (3.3) is sufficiently close to that of the original problem (3.1) and (3.2) [ChMS92]. The objective is to make the derivative ∂p_i(x)/∂g_i(x) increase rapidly near the boundary. A fraction-exponent function can be chosen as

    p_i(x) = 0                            \quad \text{if } g_i(x) \ge 0,
    p_i(x) = |g_i(x)|^{\frac{p+1}{p}}     \quad \text{if } g_i(x) < 0         (3.20)

where p is a positive integer. Correspondingly, the derivative of p_i(·) is

    \frac{\partial p_i(x)}{\partial g_i(x)} = 0                                        \quad \text{if } g_i(x) \ge 0,
    \frac{\partial p_i(x)}{\partial g_i(x)} = \frac{p+1}{p}\,|g_i(x)|^{\frac{1}{p}}     \quad \text{if } g_i(x) < 0.      (3.21)

This derivative grows rapidly as g_i(x) approaches zero from below, so the penalty remains effective in the "left" neighborhood of the boundary; far from the feasible region, however, a conventional quadratic term is more effective. Combining the two terms gives the new penalty function

    p_i(x) = 0                                                         \quad \text{if } g_i(x) \ge 0,
    p_i(x) = k_1 |g_i(x)|^{\frac{p+1}{p}} + k_2 |g_i(x)|^2             \quad \text{if } g_i(x) < 0        (3.23)

where k_1 > 0 and k_2 > 0. The purpose of putting these two coefficients together is to simplify the implementing circuits, which will be discussed in Section 3.4.

Equation (3.23) is the new desired penalty function, whose derivative is shown in Figure 3.4. Comparing Figure 3.3 and Figure 3.4, it is apparent that the latter is smoother and can yield a more accurate result: the derivative of the latter is continuous, and the value of the combination function at any given x (x < 0) is greater than that of the corresponding sectioned function. In addition, the hardware implementation of this combination penalty function is more efficient than that of a sectioned function, which requires a comparator (see Section 3.4). Note that the combination penalty function is still a continuous function since it is the sum of two continuous functions.
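The following C sketch contrasts the derivative of the conventional quadratic penalty with that of the combination penalty of (3.23) for small constraint violations, assuming the coefficient choice k1 = p/(p+1) and k2 = 1/2 discussed in Section 3.4. The sampled violation values are illustrative.

```c
#include <stdio.h>
#include <math.h>

/* Derivative of the conventional penalty |g|^p (p = 2) with respect to g,   */
/* evaluated for a violation g < 0: d/dg |g|^2 has magnitude 2|g|.           */
static double conventional_dpdg(double g)
{
    return 2.0 * fabs(g);
}

/* Derivative magnitude of the combination penalty of (3.23),                */
/*   p_i = k1*|g|^((p+1)/p) + k2*|g|^2,                                      */
/* with k1 = p/(p+1) and k2 = 0.5 (assumed coefficient choice), giving       */
/*   |dp_i/dg| = |g|^(1/p) + |g|.                                            */
static double combination_dpdg(double g, int p)
{
    return pow(fabs(g), 1.0 / (double)p) + fabs(g);
}

int main(void)
{
    double viol[] = { 1e-4, 1e-3, 1e-2, 1e-1 };  /* small constraint violations */
    int p = 10, i;

    printf("%10s %18s %18s\n", "|g|", "conventional", "combination");
    for (i = 0; i < 4; i++) {
        double g = -viol[i];
        printf("%10.4f %18.6f %18.6f\n",
               fabs(g), conventional_dpdg(g), combination_dpdg(g, p));
    }
    /* Near the boundary (|g| -> 0) the conventional derivative vanishes,    */
    /* while the combination term keeps a substantial restoring "current".   */
    return 0;
}
```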
3.4 Hardware Configuration

As described in Section 2.2, feedback neural networks are built in a "closed" form so that they can be more easily implemented. Among them, Kennedy and Chua's network is a continuous model, and can handle linear and nonlinear situations without a structural revision [KeCh88]. By modifying the energy function and constraint amplifier circuit of Kennedy and Chua's canonical nonlinear programming model, an implementation of the combination penalty function proposed in Section 3.3 is discussed here.

Consider a quadratic programming problem:

    \text{Minimize } \phi(x) = A^T x + \tfrac{1}{2} x^T G x         (3.24)
    \text{subject to } g(x) = B x - e \ge 0                         (3.25)

where A and x are n-vectors, g and e are m-vectors, B is an m x n matrix, and G is an n x n symmetric, positive definite matrix. If G = 0, this becomes a linear programming problem. Define the energy function as

    E(x) = \phi(x) + \mu \sum_{j=1}^{m} p_j(x)                      (3.26)

where p_j(x) is the new penalty function of equation (3.23). The dynamical equation at node i can then be written as

    \frac{dx_i}{dt} = -\frac{\partial E(x)}{\partial x_i}
                    = -\left( \frac{\partial \phi(x)}{\partial x_i} + \mu \sum_{j=1}^{m} \frac{\partial p_j(x)}{\partial g_j(x)} \frac{\partial g_j(x)}{\partial x_i} \right).      (3.27)

Due to the continuity of p_j(·), the derivative of the energy function can be obtained as follows:

    \frac{dE}{dt} = \sum_{i=1}^{n} \frac{\partial E}{\partial x_i} \frac{dx_i}{dt}
                  = -\sum_{i=1}^{n} \left( \frac{dx_i}{dt} \right)^2 .        (3.28)

Obviously, (dx_i/dt)^2 ≥ 0, so dE/dt ≤ 0. Therefore, E(x) is a Lyapunov function which ensures that the system is completely stable.

Based on the above, the circuit implementation is straightforward. According to (3.24) and (3.25), the variable x can be represented as a voltage, and integrators are used to solve the dynamic equation (3.27), where ∂g_j/∂x_i is constant so that it can be implemented as a resistance. The derivative of the penalty function, ∂p_j/∂g_j, can be represented as a current. Thus, a circuit scheme is shown in Figure 3.5, where p-cells are constraint amplifier units, j-cells are integrators, and the interconnection network is resistive. This network is similar to Kennedy and Chua's model with the exception of the constraint amplifier units. The penalty function in Kennedy and Chua's model can be written as

    p_j(x) = 0                              \quad \text{if } g_j(x) \ge 0,
    p_j(x) = \frac{1}{R_j}\,|g_j(x)|^2      \quad \text{if } g_j(x) < 0        (3.29)

where R_0 and R_j are resistances. The model proposed here uses the new penalty function of equation (3.23) for the constraint amplifier unit. Comparing (3.23) and (3.29) (see Figure 3.4), it is observed that the former shrinks the equilibrium region efficiently and puts the trajectory back to the boundary quickly. Thus, the network converges to a stable result more quickly.

As mentioned above, a current is used to represent the derivative of the penalty function, i.e.,

    i_j = \frac{\partial p_j}{\partial g_j} = 0                                                         \quad \text{if } g_j(x) \ge 0,
    i_j = \frac{\partial p_j}{\partial g_j} = k_1 \frac{p+1}{p}\,|g_j(x)|^{\frac{1}{p}} + 2 k_2 |g_j(x)|  \quad \text{if } g_j(x) < 0.      (3.30)

Let

    k_1 = \frac{p}{p+1} \quad \text{and} \quad k_2 = \frac{1}{2},    (3.31)

so that

    i_j = 0                                          \quad \text{if } g_j(x) \ge 0,
    i_j = |g_j(x)|^{\frac{1}{p}} + |g_j(x)|           \quad \text{if } g_j(x) < 0.       (3.32)

This describes the new constraint amplifier circuit. A problem remains in the implementation of x^{1/p}, which requires a complex circuit. By using the mathematical manipulation

    x^{\frac{1}{p}} = e^{\ln x^{1/p}} = e^{\frac{\ln x}{p}},         (3.33)

the radication is replaced by a simple logarithm, proportion, and exponentiation.
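A small numerical check of the constraint amplifier relations (3.32) and (3.33): the fractional power |g|^(1/p) is evaluated both directly and through the logarithm/proportion/exponentiation path that the circuit uses. The function names and sampled values below are illustrative assumptions.

```c
#include <stdio.h>
#include <math.h>

/* Constraint amplifier output current of (3.32): zero when the constraint   */
/* is satisfied, |g|^(1/p) + |g| when it is violated.                        */
static double amplifier_current(double g, int p)
{
    if (g >= 0.0)
        return 0.0;
    return pow(fabs(g), 1.0 / (double)p) + fabs(g);
}

/* The same fractional power computed as in (3.33): x^(1/p) = exp(ln(x)/p),  */
/* i.e., a log unit, a proportion (divide by p), and an antilog unit.        */
static double root_by_log(double x, int p)
{
    return exp(log(x) / (double)p);
}

int main(void)
{
    int p = 10;
    double g;

    for (g = -0.5; g <= -0.05; g += 0.1) {
        double direct  = amplifier_current(g, p);
        double via_log = root_by_log(fabs(g), p) + fabs(g);
        printf("g = %5.2f   i (direct) = %.6f   i (log path) = %.6f\n",
               g, direct, via_log);
    }
    return 0;
}
```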
The circuit scheme for the modified constraint amplifier unit is shown in Figure 3.6, where T1 is the inverter, T2 is the comparator, T3 is the log unit, T4 is the proportion unit, T5 is the antilog unit, and T6 is the sum unit; in T4, the factor p of the resistance pR is the parameter of the new penalty function. The circuit consists of operational amplifiers, transistors and resistors with a regular configuration, so it can be implemented with VLSI techniques without difficulty.

3.5 Experiments and Simulation Results

According to the approach in Section 3.3 and the architecture proposed in Section 3.4, a simulation was developed in the C language. Fourth-order Runge-Kutta numerical integration [WeRe66] was used for simulating the integrator cell in Figure 3.5. The parameters for the penalty function in formulas (3.3) and (3.32) take the values µ = 1 to 1.5 and p = 10. The typical cases (linear, quadratic and 3rd-order programming) are considered. The three experimental examples are chosen from the recent literature [Hop84, ChLi84, KeCh88, Roet90] to provide comparative data. In all of the examples, the initial points are located in different directions from the optimal point, including outside, inside, and on the boundary of the feasible region, to test the capability of the new penalty function. The examples and their results are described below.

3.5.1 Examples

Example 1 [Hop84, ChLi84, KeCh88, Roet90]:

    \text{Minimize } \phi(x_1, x_2) = c_1 x_1 + c_2 x_2
    \text{subject to } \tfrac{5}{12} x_1 - x_2 \le \tfrac{35}{12},
                       \tfrac{5}{2} x_1 + x_2 \le \tfrac{35}{2},
                       -x_1 \le 5, \quad x_2 \le 5,

where (a) c_1 = -1, c_2 = -1; (b) c_1 = -1, c_2 = 1; (c) c_1 = 1, c_2 = 1; (d) c_1 = 1, c_2 = -1.

The simulation results are listed in Table 3.1 and the trajectories for (a) are shown in Figure 3.7. In Figure 3.7 the solid lines represent the trajectories and the dashed lines represent the boundary of the feasible region. (The same convention is also used in the following examples.)

Example 2 [ChLi84, KeCh88, Roet90]:

    \text{Minimize } \phi(x_1, x_2, x_3) = 0.4 x_1 + \tfrac{1}{2}(5 x_1^2 + 8 x_2^2 + 4 x_3^2) - 3 x_1 x_2 - 3 x_2 x_3
    \text{subject to } x_1 + x_2 + x_3 \ge 1 \quad \text{and} \quad x_1, x_2, x_3 \ge 0.

The results are listed in Table 3.1, and the 2-D trajectories on the x_3 = 0.415 projection plane are shown in Figure 3.8 for clarity.

Example 3 [ChLi84, KeCh88, Roet90]:

    \text{Minimize } \phi(x_1, x_2) = 0.4 x_1 + x_1^2 + x_2^2 - x_1 x_2 + \tfrac{3}{16} x_1^3
    \text{subject to } x_1 + 0.5 x_2 \ge 0.4, \quad 0.5 x_1 + x_2 \ge 0.5, \quad \text{and} \quad x_1, x_2 \ge 0.

The results are again listed in Table 3.1 and the corresponding trajectories are shown in Figure 3.9.

3.5.2 Discussion

Starting at an initial point, each trajectory in the above examples moves toward the boundary. It then converges to the equilibrium point on the boundary. The relative errors are 0.03 to 0.08%, 0.02%, and 0.73% for examples 1, 2, and 3 (linear, quadratic, and 3rd-order nonlinear programming), respectively. For comparison, the results of hardware implementation measurements by Chua and Lin [ChLi84], and Kennedy and Chua [KeCh88], are also listed in Table 3.1.
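The following C sketch reproduces the flavor of these experiments for Example 1, case (a): the constraints are rewritten in the form g_j(x) ≥ 0, the combination penalty current of (3.32) drives a simple gradient network, and forward Euler integration stands in for the fourth-order Runge-Kutta routine used in the dissertation. The parameter values, the integration scheme, and the initial point are illustrative assumptions.

```c
#include <stdio.h>
#include <math.h>

#define M 4            /* number of constraints                                  */
#define MU 1.5         /* penalty coefficient (assumed, within the 1-1.5 range)  */
#define P 10           /* penalty exponent                                        */
#define DT 0.001       /* small time increment: the fractional-power current     */
                       /* changes rapidly near the boundary                       */
#define STEPS 50000

/* Constraints of Example 1 written as g_j(x) >= 0. */
static void constraints(const double x[2], double g[M])
{
    g[0] = 35.0 / 12.0 - (5.0 / 12.0) * x[0] + x[1];
    g[1] = 35.0 / 2.0  - (5.0 / 2.0)  * x[0] - x[1];
    g[2] = x[0] + 5.0;
    g[3] = 5.0 - x[1];
}

/* Gradients dg_j/dx_i of the (linear) constraints. */
static const double dg[M][2] = {
    { -5.0 / 12.0,  1.0 },
    { -5.0 / 2.0,  -1.0 },
    {  1.0,         0.0 },
    {  0.0,        -1.0 }
};

/* Constraint amplifier current of (3.32). */
static double current(double g)
{
    return (g >= 0.0) ? 0.0 : pow(-g, 1.0 / (double)P) + (-g);
}

int main(void)
{
    double x[2] = { -2.0, -3.0 };   /* an initial point inside the feasible region */
    double g[M];
    int k, i, j;

    for (k = 0; k < STEPS; k++) {
        double dx[2];
        constraints(x, g);
        for (i = 0; i < 2; i++) {
            dx[i] = 1.0;            /* -d(phi)/dx_i for phi = -x1 - x2 (case a) */
            for (j = 0; j < M; j++)
                dx[i] += MU * current(g[j]) * dg[j][i];
        }
        for (i = 0; i < 2; i++)
            x[i] += DT * dx[i];
    }
    printf("equilibrium: x1 = %.4f, x2 = %.4f (optimal point is (5, 5))\n",
           x[0], x[1]);
    return 0;
}
```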
3.6 Summary

NLPW and LPW problems have been studied from the aspect of computing accuracy. The conventional penalty function's weakness has been found: even with a very large coefficient µ, the term µα(x) may not provide a large enough "penalty" to φ*(x) in the "left" neighborhood of the boundary, due to the polynomial behavior of µα(x). Based on this, a new combination penalty function has been proposed which exhibits good behavior both in the boundary neighborhood and in regions far away from it. By manipulating equation (3.33), a modified circuit scheme for the new penalty function is given. The circuit can be implemented in analog VLSI. The simulation of the modified Kennedy and Chua model has shown satisfactory results for the experimental examples, as evidenced by a significant decrease in the resultant errors. This method can be used in other ANN models and other optimization algorithms as well.

This Chapter has presented an improved ANN model for optimization with constraints. The other category, optimization without constraints, will be further investigated in Chapters 4 and 5.

Figure 3.1. Equipotential surfaces in space and their relationship to the feasible region boundary.

Figure 3.2. Different penalty functions.

Figure 3.3. A sectioned penalty function.

Figure 3.4. A combination penalty function.

Figure 3.5. Quadratic programming circuit.

Figure 3.6. A modified scheme for the constraint amplifier unit.

Figure 3.7. Trajectories for example 3.1 (a).

Figure 3.8. Trajectories for example 3.2 (for x3 = 0.415).

Figure 3.9. Trajectories for example 3.3.

Table 3.1. Comparison results for linear and nonlinear programming.

  Example   Theoretical       Chua & Lin        Kennedy & Chua     Proposed Network
  E1a       x1 = 5.000                          x1 = 5.075         x1 = 5.00310
            x2 = 5.000                          x2 = 4.895         x2 = 5.00246
                                                error = 2.1%       error = 0.06%
  E1b       x1 = 7.000                          x1 = 7.088         x1 = 7.00524
            x2 = 0.000                          x2 = -0.017        x2 = 0.00032
                                                error = 1.26%      error = 0.08%
  E1c       x1 = -5.000                         x1 = -4.976        x1 = -5.00064
            x2 = -5.000                         x2 = -4.978        x2 = -5.00150
                                                error = 0.48%      error = 0.03%
  E1d       x1 = -5.000                         x1 = -4.966        x1 = -4.99920
            x2 = 5.000                          x2 = 4.906         x2 = 5.00091
                                                error = 1.88%      error = 0.02%
  E2        x1 = 0.2520       x1 = 0.258        x1 = 0.257         x1 = 0.25196
            x2 = 0.3328       x2 = 0.329        x2 = 0.332         x2 = 0.33278
            x3 = 0.4150       x3 = 0.407        x3 = 0.412         x3 = 0.41495
                              error = 2.38%     error = 1.98%      error = 0.03%
  E3        x1 = 0.3395       x1 = 0.336        x1 = 0.3406        x1 = 0.34197
            x2 = 0.3302       x2 = 0.320        x2 = 0.3385        x2 = 0.32842
                              error = 3.09%     error = 2.51%      error = 0.73%

Chapter 4

Solving Linear System Problems Using An Artificial Neural Network

In this Chapter, a new method for solving linear system problems using an artificial neural network is presented. Foremost, a mapping between the solution of linear equations and quadratic minimization is established so that solving linear equations can be viewed as finding the minimum of a quadratic function. The conditions for stability and convergence of the neural network-based dynamic system are proven, providing a mathematical basis for the approach. A feedback ANN model with symmetric or asymmetric connections for optimization is proposed to form an ANN linear equation solver which can be implemented with VLSI technology. Moreover, it is shown that the time complexity of this technique is problem size independent. In addition, other applications of the approach are discussed. These include matrix inversion, determining the stability of a linear control system, and determining matrix singularity.
4.1 Introduction

The linear system is an important mathematical model, as it forms a kernel for solving complex problems such as finite element analysis and the numerical solution of differential equations. In addition, under some conditions, a nonlinear system can be replaced by a linear system for approximation analysis. Therefore, linear system computing techniques play a key role in a wide variety of disciplines. The basic functions in linear system manipulation are finding a solution to simultaneous linear equations [Cra56, Sti63, FoMo67], inverting a matrix, finding the determinant of a matrix [WiRe71, Par80, JoRi82], and finding the eigenvalues and eigenvectors of a matrix [GoLo83, Joh87, LiZS91]. The recent availability of advanced computers has had a significant impact on all spheres of linear system computation. In the past three decades, much effort has been devoted to the development of new linear equation solvers for large-scale systems [Sto73, HwCh80, Heh82, Mol83, CoRo87, BiVo91, Wri91]. The major concern of these efforts is to speed up the solution procedure in order to find a feasible result as soon as possible.
These techniques typically take advantage of one or more of the following methodologies:

  o decomposing the system into small subsystems;
  o distributing computation tasks to a parallel architecture;
  o increasing data independency;
  o reducing communication cost.

All of these methods, however, are fundamentally based on the traditional techniques of elimination and separation processes. Their time complexity is size-dependent: O(n^3) for the software approach and O(n) for the hardware approach [HwCh80, Pret86]. In this Chapter, an artificial neural network approach and architecture for linear system manipulation is developed whose time complexity is size-independent.

The Chapter is organized as follows. First, general linear system computing techniques are reviewed. Then, a mapping between the linear equation solution and quadratic minimization is established so that solving linear equations becomes equivalent to finding the minimum of a quadratic function. The conditions for stability and convergence of such a quadratic gradient system are established. Based on these mathematical results, a configuration of an artificial neural network linear equation solver is given. Applications of this approach, such as matrix inversion, determining the stability of a linear control system, and determining the singularity of a matrix, are described.

4.2 Methodology Review

Typical linear system computing techniques such as Gaussian elimination, systolic arrays, and MIMD algorithms are summarized as follows.

4.2.1 The Traditional Methods

A set of simultaneous linear equations can be written in matrix form as

    A x = b                                                          (4.1)

where A is an n x n matrix, and b and x are known and unknown column vectors, respectively. When A is nonsingular, there exists a unique solution to the equations. Only this case is considered because most applications deal with this situation.

4.2.1.1 Gaussian Elimination

Gaussian elimination [FoMo67] is the most important method for solving linear system problems and serves as the basis for a variety of other approaches. The technique can be considered as a process of eliminating variables by the elementary transformation

    a_{ij}' = a_{ij} - \frac{a_{i1} a_{1j}}{a_{11}}                   (4.2a)

and

    b_i' = b_i - \frac{a_{i1} b_1}{a_{11}}.                           (4.2b)

This process is repeated until all entries below the main diagonal are zero, so that the coefficient matrix is reduced to an upper triangular matrix in the form

    \begin{bmatrix} a_{11} & a_{12}' & \cdots & a_{1n}' \\ 0 & a_{22}' & \cdots & a_{2n}' \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn}' \end{bmatrix}
    \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} =
    \begin{bmatrix} b_1 \\ b_2' \\ \vdots \\ b_n' \end{bmatrix}.      (4.3)

Then, by back-substitution, the solution vector can be obtained.
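A compact C sketch of the elimination step (4.2) and the back-substitution on the triangular system (4.3) is given below; it omits pivoting and error checking, and the 3-by-3 test system is an illustrative assumption.

```c
#include <stdio.h>

#define N 3

/* Gaussian elimination without pivoting: reduce A to upper triangular form, */
/* updating b with the same elementary transformations as in (4.2).          */
static void eliminate(double a[N][N], double b[N])
{
    int i, j, k;
    for (k = 0; k < N - 1; k++)
        for (i = k + 1; i < N; i++) {
            double m = a[i][k] / a[k][k];
            for (j = k; j < N; j++)
                a[i][j] -= m * a[k][j];
            b[i] -= m * b[k];
        }
}

/* Back-substitution on the triangular system of (4.3). */
static void back_substitute(const double a[N][N], const double b[N], double x[N])
{
    int i, j;
    for (i = N - 1; i >= 0; i--) {
        double s = b[i];
        for (j = i + 1; j < N; j++)
            s -= a[i][j] * x[j];
        x[i] = s / a[i][i];
    }
}

int main(void)
{
    double a[N][N] = { { 4.0, 1.0, 0.0 },
                       { 1.0, 3.0, 1.0 },
                       { 0.0, 1.0, 2.0 } };
    double b[N] = { 5.0, 5.0, 3.0 };    /* solution of this test system is (1, 1, 1) */
    double x[N];

    eliminate(a, b);
    back_substitute(a, b, x);
    printf("x = (%f, %f, %f)\n", x[0], x[1], x[2]);
    return 0;
}
```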
4.2.1.2 Gauss-Jordan Elimination

This is a numerical method based on elementary row and column transformations of the column-augmented matrices

    [A] \cdot [x \;\; Y] = [b \;\; I]                                 (4.4)

where A is a known matrix, Y is the unknown inverse matrix of A, I is the identity matrix, and b and x are known and unknown column vectors, respectively [WiRe71]. When A is reduced to the identity matrix by a sequence of elementary row or column transformations, the right-hand side of (4.4) becomes the inverse of A.

4.2.1.3 LU Decomposition

Suppose A is an n x n nonsingular matrix; then A can be uniquely reduced to an upper triangular matrix U by a sequence of equivalent transformations:

    U = A^{(n-1)} = M_{n-1} M_{n-2} \cdots M_2 M_1 A.                (4.5)

Set

    L = M_1^{-1} M_2^{-1} \cdots M_{n-1}^{-1} =
    \begin{bmatrix}
      1      & 0      & \cdots & 0            & 0 \\
      m_{21} & 1      & \cdots & 0            & 0 \\
      m_{31} & m_{32} & \cdots & 0            & 0 \\
      \vdots & \vdots & \ddots & 1            & 0 \\
      m_{n1} & m_{n2} & \cdots & m_{n(n-1)}   & 1
    \end{bmatrix};                                                    (4.6)

then the LU decomposition of matrix A can be expressed as

    A = M_1^{-1} M_2^{-1} \cdots M_{n-1}^{-1} U = L U.                (4.7)

In fact, the LU decomposition can be obtained during a process of Gaussian elimination [Ort87b]; that is, the final upper triangular matrix is U, and the multipliers at each step are the entries of the lower triangular matrix L:

    m_{i1} = \frac{a_{i1}}{a_{11}},   \quad \text{for } i = 2, \dots, n;      (4.8a)
    m_{i2} = \frac{a_{i2}'}{a_{22}'}, \quad \text{for } i = 3, \dots, n;      (4.8b)
    m_{i3} = \frac{a_{i3}'}{a_{33}'}, \quad \text{for } i = 4, \dots, n;      (4.8c)

and so on. It should be noted that a_{ij}' is updated at each step of elimination. The advantages of the LU decomposition of a matrix are as follows (a software sketch follows the list):

(1) The solution of a triangular set of equations is quick. The triangularized system can be solved by forward and backward substitution.

(2) The determinant of a matrix is just the product of the diagonal entries of matrix U, verified by

    \det(A) = \det(LU) = \det(L) \cdot \det(U) = 1 \cdot \det(U) = \prod_{i=1}^{n} u_{ii}.       (4.9)

(3) The inverse of A can be found by

    A^{-1} = (LU)^{-1}.                                               (4.10)

(4) The solution accuracy of linear equations can be improved by using the LU factorization iteratively.
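The following C sketch records the elimination multipliers of (4.8) to form L while reducing A to U, then uses (4.9) to evaluate the determinant. It assumes no pivoting is required, and the small test matrix is illustrative.

```c
#include <stdio.h>

#define N 3

/* LU decomposition by Gaussian elimination without pivoting: on return,   */
/* u holds the upper triangular factor and l the unit lower triangular     */
/* factor built from the multipliers of (4.8).                             */
static void lu_decompose(const double a[N][N], double l[N][N], double u[N][N])
{
    int i, j, k;
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++) {
            u[i][j] = a[i][j];
            l[i][j] = (i == j) ? 1.0 : 0.0;
        }
    for (k = 0; k < N - 1; k++)
        for (i = k + 1; i < N; i++) {
            l[i][k] = u[i][k] / u[k][k];       /* multiplier m_ik */
            for (j = k; j < N; j++)
                u[i][j] -= l[i][k] * u[k][j];
        }
}

int main(void)
{
    double a[N][N] = { { 2.0, 1.0, 1.0 },
                       { 4.0, 3.0, 3.0 },
                       { 8.0, 7.0, 9.0 } };
    double l[N][N], u[N][N];
    double det = 1.0;
    int i;

    lu_decompose(a, l, u);
    for (i = 0; i < N; i++)
        det *= u[i][i];                        /* equation (4.9) */
    printf("det(A) = %f\n", det);              /* expected: 2*1*2 = 4 for this A */
    return 0;
}
```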
Suppose there are p processors which are organized in a Q x Q mesh (p = Q2) each with local memory. Each process is executed on one processor, denoted by (s, t), 0 S s, t k, and k is an integer (usually 1 < k < n). Obviously, the bandwidth of the matrix is 2k + 1. Such systems arise in a great deal of applications, such as the numerical solution of differential equations and the optimal control. This method can be thought of as the Gaussian elimination with a certain res- tricted pivoting strategy [Wri9l]. First, the matrix is broken up along the diagonal into a number (say p) blocks (called sub-blocks), and each sub-block is separated by a small dense k x k block (called separator). Then, A reduced system is formed from the rows. of matrix A corresponding to the separators. After the reduced system is solved, and the remaining variables are recovered using data stored during the initial 66 factorization stage. The experiments for problems with n = 2000 to 5000, k = 2 to 5 using 2, 4, 8, 16 processors showed that the maximum speedup over a single processor was 4.2 to 5.7 (p = 16). When 2 S p S 4, the speedup is very low due to the boun- dary condition operation cost. The time complexity of the above methods is size-dependent as indicated in Table 4.1. 4.3 A Neural Network Solution to Linear Equations An neural network linear equation solver based on the optimization network discussed in Chapter 3 first requires the establishment of a mapping between the solution of linear equations and quadratic minimization. Consider the linear system of equations A x = b (4.17) where A is an n x n real coefficient matrix, D is a real column vector, and x is a vector of unknowns. A must be nonsingular for a unique solution to exist. Let ¢(x) be a quadratic function of the form ¢(x) = %xTA x — 0711 (4.18) where A is a real symmetric n x n matrix, x and b are real column vectors. A minimum point of ¢(x) must satisfy the necessary condition _341. 8x1 a_¢ 8X2 at 8x,, d 67 Correspondingly, the partial derivative of the right hand side of equation (4.31) should also be zero. It can be written in simple matrix form as A x — b = 0 (4.20) due to the symmetry of A. Equations (4.19) and (4.20) indicate the important fact that the minimum point of a quadratic function must satisfy the corresponding linear equa- tions. In other words, by defining an artificial neural network for the quadratic minim- ization problem as i=—Vm0 «20 = - V(—;-XTA x - bTx) = - (Ax — b), the solution to associated the linear equations is found when the network settles on its minimum point. This network is called an ANN linear equation solver. An important issue for the ANN linear equation solver is ensuring its stability as will be discussed in next section. 4.4 The Relationship to Linear Systems The following theorems establish conditions for guaranteeing a unique bounded minimum point of a quadratic function for the ANN linear equation solver. Moreover, a general formulation of the ANN linear equation solver is proposed which can cope with both symmetric and asymmetric coefficient matrices. Theorem 4.1: If A is a positive definite matrix, then the unique minimum point of the quadratic function is the solution to the linear equations. 68 A formal proof can be found in [Ort87b]. A geometric explanation is given here for clarity. For the two dimensional case, suppose _ 011 0 _ b1 A — [0 022] and b— [b2] Substituting the above A and b into equation (4.18) and manipulating algebraically, where a11> 0, 022 > 0. 
¢(x1, x2) can be rewritten as 1 0110 11 xi ¢(X1,XZ) = ‘2"[X1.XZ] 0 022 x2 ‘— [ble] X2 (4.22) _ 1 2 1 2 - 3011M + 30212 " brxr ‘ b2x2 2 1 2 bl l 2 1 2 b2 b2 2 1 b1 1 b2 -—a [(x -2——x +(—)]+ —a [(x —2-—x +(——)]-——-——— 2 11 I 011 1 011 2 22 2 022 2 022 2 011 2 022 b1 2 2 2 (x1- —) (xz- —) _ _ 1 _1: 22, 2 2 2 2 2 an 022 ( —) ( —) 011 022 The image of ¢(x1,x2), as shown in Figure 4.3, is an ellipse-parabolic surface whose . . bl b2 . . 1 apex rs the lowest pornt at (a_' T)’ Wthh rs exactly equal to A' b. When 11 22 an: an at 0 and 011022 > a12a21, the statement is still true but the main-axis of ellipse is rotated. For higher dimension cases, a similar image of ¢(x) will be a super ellipse-parabolic surface whose apex coordinates are still the solution to the linear equations. The positive definiteness of A , however, is a special case. In order to develop a general technique, Theorem 4.1 must be extended as follows for finding the solution to arbitrary linear equations. 69 Theorem 4.2: If A is an arbitrary nonsingular matrix, then there exists a transfor- mation L such that L(A) is positive definite, and the new equations are equivalent to the original ones. Proof: Define L(A ) = ATA, (4.23) then (ATA )T = ATA, in other words, ATA is symmetric. At the same time, xT(ATA )x = (xTAT)A x = (A 10% x) =||Ax||2>0, forx¢0 where H - ”represents the Euclidean norm. The inequality holds because of the non- singularity of A. Therefore, ATA is positive definite. Multiplying both sides of equation (4.17) by AT gives: A TA x = A To (4.24a) A TA x = b’ (4.24b) where b’ = A To. (4.24c) Next, it must be shown that the solution to equation (4.24b) is satisfied with equation (4.17) as well. The solution to (4.24b) can be expressed as x = (ATA )"b'. Using the relation between b’ and b in (4.240), the solution is given by x=(ATA)-1ATb 70 =A-1(AT)-1ATb =A-1Ib =A“b. [:1 Furthermore, when A satisfies a certain condition, the solution can be directly obtained without the transformation. The following theorem explains this situation. Theorem 4.3: If A is an arbitrary (symmetric or otherwise) nonsingular matrix with eigenvalues whose real part is positive, then there exists a unique stationary point of a quadratic minimization problem which is satisfied by the solution to the corresponding linear equations. Proof: In artificial neural network theory, a gradient system is often used for finding an equilibrium point dr“ 7 = —§§— for all i. (4.25) Associating equations (4.19) and (4.20), the dynamic behavior of a quadratic system can be described by a set of first-order ordinary linear differential equations in matrix form as x = —A x + bu (4.26a) where {0 for 1 < 0 “ = 1 for 1 2 0. (426”) When A is asymmetric, the energy function, ¢(x), takes on a modified form of equa- tion (4.18) (see Section 4.5). In this case, however, the mathematical model of the neural network can be directly defined by equation (4.26a). If the system is relaxed, then using the state transition matrix ul(t, I) [Che84], the solution for t 2 0 can be found as 71 ! x(t) = w(t, 0)x0 + W“, 1')de (4.27) 0 where ur(t, r) = e‘“’ ‘ T). (4.28) The first term of equation (4.27) comes from the initial condition x(0), and the second term comes from a constant input bu (2 b). A stable solution exists if the real part of the eigenvalues of matrix (-—A) are negative [BaSt70]. In this case, the first term of (4.27) will approach zero as t —-> oo. An equilibrium solution to the system will be determined only by the second term of (4.27). 
Taking the limit of (4.27) as t—->oo yields 1 lim x(t) = x(0)lime"“ + Iimje-M' ”hm (4.29) l-—)00 l—)°° l-)°°0 t = 0 + A-lbrime-Mt ‘W l—)00 = A“b(eO — e-°°) =A"b. In other words, the equilibrium point of the differential equations must satisfy the corresponding linear equations. Moreover, it is easy to show that the signs of the eigenvalues of A are opposite to those of —A , because A x = Xx => —A x = 41.x => (—A )x = (—7.)x (4.30) where 7» is an eigenvalue of A. 72 Therefore, A with eigenvalues whose real part is positive will guarantee that a unique solution to the linear equation system can be found. C] 4.5 Architecture A theoretical basis for an ANN linear equation solver has been presented in the previous sections. In what follows, an architecture for the ANN linear equation solver is proposed. Artificial neural network computing is a dynamical process which is mapped into a network architecture with an associated dynamical feature. The computing is per- formed by the network’s inherent convergent behavior. There are three types of ANN architectures: feedback, feedforward, and cellular models. The feedback neural net- works are organized in a simple and regular form and they can easily express the equality relation of a dynamical system. Therefore, they are the logical choice for building the linear equation solver. As mentioned in Section 4.3, the function of the linear equation solver is equivalent to that of an unconstrained optimization network. The ANN solver can be constructed by a feedback minimization network. The present models, however, work well only with the symmetric feedback connections [TaI-Io86, KeCh88]. The stability property of a system, determined by matrix A , is the key issue for design of the ANN architecture. As discussed in Section 4.4, matrix A in the case of Theorems 4.1 and 4.2 (after transformation) is a positive definite matrix whose eigenvalues are positive real numbers [Che84], whereas eigenvalues of matrix A in Theorem 4.3 can be written as it,- = or,- + jBi, where (1,- > 0, j = 4:7. Theorems 4.1 and 4.2 can be regarded as a special case of Theorem 4.3 where [3,- 50. As in Theorem 4.3, the architecture proposed here is not restricted to matrices with sym- metric connections. This significantly broadens its range of applications. 73 Figure 4.4 shows a configuration of the linear equation solver composed of integration units, a resistive network, and feedback links. The resistive network is organized according to equation (4.263). Thus, the following relation is held for each integrator (j-cell): dxl- n —dt ='(ZarjX'-br'1) (4.31) 1:1 where x,- is the voltage at node i, and the 1 in the term bi-l is a DC source voltage (IV). The terms aijxj and b,-1 are thus interpreted as current terms. The negative summation is implemented by connecting all current inputs to the negative (minus) ter- minal of the integrator. Let the normalization values of current be 0.1 ma, then the value of the resistors can be calculated as 10 J ' laij l 01' where R)!- and Rb,- are resistors representing values in matrix A and vector b, respec- tively. The lower ends of RI} and Rb,- are connected to the input of integrator i, whereas the connections of their upper ends depend on the values of aij and b,- as indicated in Table 4.2. In order to verify the stability and convergence of this network (without the sym- metry restriction), a shift transformation is carried out for the dynamical system as described by equation (4.26a). 
4.5 Architecture

A theoretical basis for an ANN linear equation solver has been presented in the previous sections. In what follows, an architecture for the ANN linear equation solver is proposed.

Artificial neural network computing is a dynamical process which is mapped into a network architecture with an associated dynamical feature. The computing is performed by the network's inherent convergent behavior. There are three types of ANN architectures: feedback, feedforward, and cellular models. The feedback neural networks are organized in a simple and regular form and can easily express the equality relation of a dynamical system. Therefore, they are the logical choice for building the linear equation solver. As mentioned in Section 4.3, the function of the linear equation solver is equivalent to that of an unconstrained optimization network, so the ANN solver can be constructed from a feedback minimization network. The existing models, however, work well only with symmetric feedback connections [TaHo86, KeCh88].

The stability property of the system, determined by matrix A, is the key issue for the design of the ANN architecture. As discussed in Section 4.4, matrix A in the case of Theorems 4.1 and 4.2 (after transformation) is a positive definite matrix whose eigenvalues are positive real numbers [Che84], whereas the eigenvalues of matrix A in Theorem 4.3 can be written as λ_i = α_i + jβ_i, where α_i > 0 and j = sqrt(-1). Theorems 4.1 and 4.2 can be regarded as a special case of Theorem 4.3 in which β_i = 0 for all i. As in Theorem 4.3, the architecture proposed here is not restricted to matrices with symmetric connections. This significantly broadens its range of applications.

Figure 4.4 shows a configuration of the linear equation solver composed of integration units, a resistive network, and feedback links. The resistive network is organized according to equation (4.26a). Thus, the following relation is held at each integrator (j-cell):

    \frac{dx_i}{dt} = -\left( \sum_{j=1}^{n} a_{ij} x_j - b_i \cdot 1 \right)        (4.31)

where x_i is the voltage at node i, and the 1 in the term b_i · 1 is a DC source voltage (1 V). The terms a_{ij} x_j and b_i · 1 are thus interpreted as current terms. The negative summation is implemented by connecting all current inputs to the negative (minus) terminal of the integrator. Let the normalization value of current be 0.1 mA; then the values of the resistors can be calculated as

    R_{ij} = \frac{10}{|a_{ij}|}, \qquad R_{bi} = \frac{10}{|b_i|}    (4.32)

where R_{ij} and R_{bi} are resistors representing values in matrix A and vector b, respectively. The lower ends of R_{ij} and R_{bi} are connected to the input of integrator i, whereas the connections of their upper ends depend on the values of a_{ij} and b_i, as indicated in Table 4.2.

In order to verify the stability and convergence of this network (without the symmetry restriction), a shift transformation is carried out for the dynamical system described by equation (4.26a):

    \dot{x} = -(A x - b) = -(A x - A x^*) = -A(x - x^*)               (4.33a)

where

    y = x - x^*                                                       (4.33b)

and x^* = A^{-1} b. Because of the constant (unknown) vector x^*, the time derivative of y must satisfy

    \dot{y} = \dot{x} = -A y.                                         (4.34)

When the condition of Theorem 4.3 is satisfied for A, there must exist a unique solution P of the Lyapunov equation

    P A + A^T P = Q                                                   (4.35)

where Q is an arbitrary positive definite matrix, and P is a positive definite matrix (the solution) [BaSt70]. Define the energy function E(y) as

    E(y) = y^T P y.                                                   (4.36)

It is apparent that E(y) > 0 for y ≠ 0 because of the positive definiteness of P. The derivative of the energy function for y ≠ 0 can be found as

    \dot{E}(y) = \dot{y}^T P y + y^T P \dot{y}
               = -y^T A^T P y - y^T P A y
               = -y^T (P A + A^T P) y
               = -y^T Q y < 0.                                        (4.37)

The above equation indicates that the energy function is always decreasing with time. Therefore, the energy function is a global Lyapunov function, which guarantees that the network is completely stable and that there exists a unique equilibrium point at

    y_e = 0.                                                          (4.38)

Using equation (4.33b), the corresponding equilibrium point for x is

    x_e = y_e + x^* = 0 + x^* = A^{-1} b.                             (4.39)

This means that when the network arrives at the equilibrium state, the output x is the exact solution of the linear equations.

4.6 Other Applications

The technique for solving linear equations is of fundamental importance to many computing tasks. The method described in the previous sections can be further extended to other applications.

4.6.1 Matrix Inversion

If matrix A satisfies the conditions of Theorems 4.1 or 4.3, then the inverse of A can be found by the following procedure. Let A^{-1} = (y_1\ y_2\ \cdots\ y_n); then

    A A^{-1} = A (y_1\ y_2\ \cdots\ y_n) = I                          (4.40)

where y_i is the ith column vector of A^{-1} and I is the identity matrix. The above equations can be rewritten in separate form as

    A y_i = e_i                                                       (4.41)

for i = 1, 2, ..., n, where e_i is a base vector (1 at the ith entry and 0 at all others). Therefore, finding the inverse of A is equivalent to solving equation (4.41) n times with a base vector as the right-hand side.

As an extension of the approach, if matrix A is negative definite, or Re(λ_i) < 0 for all i, where λ_i is an eigenvalue of A, then -A must satisfy the conditions of Theorems 4.1 or 4.3. As a consequence, the inverse of A can be directly found by

    A^{-1} = -(-A)^{-1}                                               (4.42)

where (-A)^{-1} is obtained from the ANN linear equation solver.
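A software analogue of the inversion procedure of (4.40) and (4.41): each column of the inverse is obtained by running the same solver with a base vector e_i on the right-hand side. The routine below reuses the gradient dynamics of (4.26a) with forward Euler integration; the test matrix and the integration scheme are illustrative assumptions.

```c
#include <stdio.h>
#include <math.h>

#define N 2
#define DT 0.01
#define EPS 1e-9

/* Solve A y = e using the network dynamics dy/dt = -(A y - e). */
static void solve_column(const double a[N][N], const double e[N], double y[N])
{
    int i, j, settled = 0;
    for (i = 0; i < N; i++) y[i] = 0.0;
    while (!settled) {
        double dy[N];
        settled = 1;
        for (i = 0; i < N; i++) {
            dy[i] = e[i];
            for (j = 0; j < N; j++)
                dy[i] -= a[i][j] * y[j];
        }
        for (i = 0; i < N; i++) {
            y[i] += DT * dy[i];
            if (fabs(DT * dy[i]) > EPS)
                settled = 0;
        }
    }
}

int main(void)
{
    /* A positive definite test matrix (assumed); its exact inverse is        */
    /* [ 2 -1 ; -1 1 ].                                                       */
    double a[N][N] = { { 1.0, 1.0 },
                       { 1.0, 2.0 } };
    double inv[N][N];
    int col, i;

    for (col = 0; col < N; col++) {            /* one solve per base vector e_i */
        double e[N] = { 0.0, 0.0 }, y[N];
        e[col] = 1.0;
        solve_column(a, e, y);
        for (i = 0; i < N; i++)
            inv[i][col] = y[i];
    }
    printf("A^-1 = [ %6.3f %6.3f ; %6.3f %6.3f ]\n",
           inv[0][0], inv[0][1], inv[1][0], inv[1][1]);
    return 0;
}
```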
Correspondingly, the linear equations can be solved with a starting point x(0) at 0. When a bounded solution is found the system is stable and matrix A is called a stable matrix; otherwise, the system is unstable. 4.6.3 Determining the Singularity of a Matrix Since singular matrices have different properties than nonsingular matrices, it is desirable to determine whether or not a matrix is singular. For example, when A represents a linear transformation, a nonsingular A always indicates l-to-l mapping whereas a singular A indicates m-to-l mapping. The traditional methods for determin- ing the singularity of a matrix include the LU decomposition and the similarity transformation. The former calculates the determinant of matrix A by det (A) = det (LU ) (4.46) = det(L)-det(U) = 1-det(U ) = det(U ) where L is a lower triangular matrix with a unit diagonal, and U is an upper triangular matrix. The time complexity for this method is 0(%n3) for the software approach, 78 and 0(3n — 5) for the hardware approach [HwBr84, Pret86]. The purpose of the similarity transformation is to find the eigenvalues of A. If there exist any zero eigenvalues, then the matrix is singular. The computational cost for this is as high as 0(n5) [Pret86]. The singularity of a matrix can be distinguished by the following two operations: (i) Determine whether matrix A (or —A) is stable. If yes, go to step (ii); other- wise, st0p. (In this case the method fails because eigenvalues of A lie in both right and left halves of the complex plane so that the final state is. not bounded) (ii) Arbitrarily choose some b at 0. Solve the set of equations A x = b (or —A x = b) for two different starting points. If a unique solution can be found, then matrix A is nonsingular. If the solution is dependent on the initial point, or no solution can be found (i.e., no equilibrium point so that the state vari- '(k+1) , — xi“) |> e for all i, where e is a convergence ables keep changing: Ix limit, (k) is an iteration index), then the matrix A is singular. For singular matrix A, the solution situation depends on vector b; if augmented matrix [A b] has the same rank as matrix A , then there exist multiple solutions. Oth- erwise (rank [A b] > rank [A]) there is no solution. Thus, this technique not only gives a judgement as to the singularity, but also indicates the rank rela- tionship between [A b] and [A], that is, for the multi-solution case, rank (A) = rank (A b), and for the no-solution case, rank [A b] > rank [A ]. For clarity, the above procedures for the different applications and their relation- ship are summarized in Figure 4.5. In fact, this approach is valid for the single-side eigenvalues "distribution". If all eigenvalues of A lie in the right half of the complex plane, then A is used; if all eigenvalues of A lie in the left half of the complex plane, then —A is used. 79 4.7 Simulation Results Experiments include both software and hardware simulations. The former attempts to verify the correctness of the approach, and the latter provides evidence of the size-independent property of the ANN solver. 4.7.1 Verification A simulation is designed in the C language and based on the architecture of Fig- ure 4.4. Integrators are simulated by a fourth-order Runge-Kutta numerical integration (explicit formulation), and the resistive networks are simulated by ID and 2-D arrays. 4.7.1.1 Linear Equation Solution Experimental examples are arbitrarily chosen for the three cases in Section 4.4 from n = 2 to n~ = 20. 
4.7 Simulation Results

Experiments include both software and hardware simulations. The former attempts to verify the correctness of the approach, and the latter provides evidence of the size-independent property of the ANN solver.

4.7.1 Verification

A simulation was designed in the C language based on the architecture of Figure 4.4. Integrators are simulated by fourth-order Runge-Kutta numerical integration (explicit formulation), and the resistive networks are simulated by 1-D and 2-D arrays.

4.7.1.1 Linear Equation Solution

Experimental examples are arbitrarily chosen for the three cases in Section 4.4, from n = 2 to n = 20. The ANN linear equation solver does indeed find a unique solution for every example, no matter what initial point is chosen. Table 4.3 illustrates three examples associated with the three situations in Section 4.4 for matrix A. Convergence is determined by

    |x_i^{(k+1)} - x_i^{(k)}| < \varepsilon \quad \text{for all } i        (4.47)

where x_i^{(k)} is the value of x_i at the kth iteration. ε = 0.1E-7 is used for all the examples. The time increment dt used for the Runge-Kutta integration is 0.05 for examples 1 and 2, and 0.001 for example 3. The simulation time is defined as

    T = dt \cdot N                                                    (4.48)

where N is the number of iteration steps required for convergence. The experiments show that T has an obvious relationship to the initial point: the further the initial point is from the equilibrium point, the more iterations are required. For examples 1 to 3, T is 18.05, 7.0, and 14.67, respectively.

In order to measure the accuracy, an error e is defined as

    e = \sum_{j=1}^{n} \left( \sum_{i=1}^{n} a_{ji} x_i - b_j \right)^2        (4.49)

which represents the sum of the squares of the differences between the left-hand side and the right-hand side of each equation. In the ideal case, e is zero. The error curves for the three examples are presented in Figure 4.6. Given enough iterations, the error approaches zero for all practical purposes. Four trajectories for example 1 are shown in Figure 4.7, indicating that the solution is initial-point independent.

4.7.1.2 Inverse Matrix Computation

Table 4.4 lists the matrix inversions (examples 4 and 5), where matrix E is a measure of the error determined by

    E = A^{-1} A - I                                                  (4.50)

where I is the identity matrix. Ideally, E should be a zero matrix. The simulation shows that all entries of E are very close to zero (0.000000001 to 0.000001).

4.7.1.3 Stability

Two examples of the stability judgement are presented in Table 4.5. Example 6 always results in a bounded solution using the technique described in Section 4.6, which indicates that the system is stable. Example 7 results in an infinite (unbounded) solution, which indicates that the system is unstable. The corresponding eigenvalues are calculated and listed in Table 4.5 for checking consistency.

4.7.1.4 Singularity

Table 4.6 gives the results of two example singularity calculations (No. 8 and 9). Figure 4.8 shows the trajectories of a multi-solution case for the following equation with a singular matrix:

    \begin{bmatrix} 1 & 0.5 \\ 4 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2 \\ 8 \end{bmatrix}.

It is obvious that all the solutions lie on a co-linear pattern, indicated by a dashed line in Figure 4.8. Moreover, it is easy to show that the eigenvectors associated with eigenvalues λ = 0 and λ = 3 of this matrix are x2 = -2x1 and x2 = 4x1, respectively, indicated by dotted lines in Figure 4.8. Some interesting observations are gleaned from this experiment: the line of all solutions is parallel to the eigenvector x2 = -2x1, and all the trajectories are straight lines parallel to the eigenvector x2 = 4x1.
T, depends only on the time constant of the integrator, I (= C 0R0). For example, when 't = 211s, T, = 2.25ms to 2.45ms; when 1: = 0.211s, T3 = 0.223ms to 0.247ms. The slight deviation in Ts is that the distance from the staring point to an equilibrium point is not identical. Figures 4.9, 82 4.10, and 4.11 show the curves of transient analysis for the simulations with t = 211s. The preliminary conclusion here is that the solution time is size-independent. 4.7.3 Hardware Complexity In the development of computing technology, the computing time and the device space are always associated with each Other. An estimate of the hardware complexity of the ANN linear equation solver is derived as follows. Let the size of a system be n, then the total numbers of resistors and integrators 2 + n, and n, respectively. Assume the in the architecture shown in Figure 4.4 are n average resistance of resistors is 50K Q, and the routing area is 40% of the circuit area [AlHo87], then the chip area of the solver for a CMOS implementation is s =1.4[(n2 + n)a + nb] (4.51) where a is the unit area for a resistor (a = 6.25x10’3mm2), and b is the unit area for an integrator (b = 1.5x10‘2mm 2) [AlH087, Rei87]. For clarity, a relationship between the chip area and the size n is illustrated in Figure 4.12. Connectivity is another important issue in hardware implementation. The topology of an architecture, the connection order of a component, and the number of crossovers in a circuit are three key factors of the connectivity. As shown in Figure 4.4, the con- nection pattern of the solver takes a matrix form which can be implemented in a "bus" structure so that connections among components are significantly simplified. It is easy to show that the number of crossovers of this network is N CI'OSS = n2 — 1. (4.52) This is another advantage of a feedback network paradigm as it can be shown that New” for a three-layer feedforward network is on the order 0(n 4). Moreover, routing network is straightforward. As shown in Figure 4.4, there are three bus lines: horizon- tal (H), vertical (V), and feedback (F). H lines are on layer 1, V lines and F lines are I _jiflizz‘ 83 on layer 2. On each crossover, an isolated "stamp" is made between H and V. For cir- cuit components, each resistor is on layer 1 with connections to H and V, and each integrator is on layer 1 with connections to F and H. The network requires n l/O pins for operands and additional pins for voltage sources, control signals, and ground. 4.8 Summary A new approach to solving linear systems using an artificial neural network has been proposed. A mapping between the linear equation solution and a quadratic minimization problem without constraints has been established. Three theorems in Sec- tion 4.4 guarantee the stability and convergence of the quadratic system so that a unique accurate solution can be obtained. Correspondingly, an ANN architecture for the solver is given which can be implemented in VLSI technology. Because of the inherent features of dynamic convergence and parallel processing in ANNs, the time complexity of the approach is size-independent. The approach can also be used for finding the inverse of a matrix, determining the stability of a linear control system, and determining the singularity of a matrix. Simulation experiments have verified the theoretical correctness of the technique. 84 Z Z b e d=a*b+c . 0 a f o f c b qze/f Figure 4.1. A systolic array for LU decomposition (4x4). Stage 2 Figure 4.2. A systolic array for linear state equations . 
4.8 Summary

A new approach to solving linear systems using an artificial neural network has been proposed. A mapping between the linear equation solution and a quadratic minimization problem without constraints has been established. Three theorems in Section 4.4 guarantee the stability and convergence of the quadratic system so that a unique, accurate solution can be obtained. Correspondingly, an ANN architecture for the solver is given which can be implemented in VLSI technology. Because of the inherent features of dynamic convergence and parallel processing in ANNs, the time complexity of the approach is size-independent. The approach can also be used for finding the inverse of a matrix, determining the stability of a linear control system, and determining the singularity of a matrix. Simulation experiments have verified the theoretical correctness of the technique.

[Figure 4.1. A systolic array for LU decomposition (4×4); cell operations d = a·b + c and q = e/f.]
[Figure 4.2. A systolic array for linear state equations.]
[Figure 4.3. The image of φ(x1, x2) from equation (4.22): an ellipse-parabolic surface.]
[Figure 4.4. A 3×3 network architecture for the linear equation solver (sources of 1 V or −1 V).]
[Figure 4.5. Applications of the ANN solver: solve Ax = b (b ≠ 0), test whether the solution is unique, and find A⁻¹.]
[Figure 4.6. Error curves for examples 1–3 (error e versus iteration, up to 300 iterations).]
[Figure 4.7. Trajectories for an example using different initial points (15, 5), (1, 1), (−20, −3), and (18, −5); solution at (12.714, −1.429).]
[Figure 4.8. Trajectories of a multi-solution case (a singular matrix); eigenvectors x2 = 4x1 (λ = 3) and x2 = −2x1 (λ = 0).]
[Figure 4.9. The result of SPICE simulation (two neurons); equilibrium time = 2.45 ms.]
[Figure 4.10. The result of SPICE simulation (three neurons); equilibrium time = 2.25 ms.]
[Figure 4.11. The result of SPICE simulation (five neurons); equilibrium time = 2.35 ms.]
[Figure 4.12. The relationship between the chip area (cm²) and the size n (up to n = 1000).]

Table 4.1. The time complexity for linear system computing.

    Method                                  Time Steps
    Gaussian Elimination                    (1/3)n^3 + O(n^2)
    Gauss-Jordan Elimination                n^3
    LU Decomposition by Software            (1/3)n^3
    LU Decomposition by Systolic Array      3n + 1
    Jacobi Transformation                   O(n^3)–O(n^2)
    Householder Reduction                   (n − 2)·O(n^3)
    QR Algorithm                            O(n^3)–O(n)

Table 4.2. The connection of resistors.

    Value    Positive    Negative    Zero
    a_ij     x_j         −x_j        unconnected
    b_i      −1 V        +1 V        unconnected

Table 4.3. Linear equation solutions.

    No. 1 (positive definite, A1): b = (12.0, 3.5);
        solution = (12.7142744064, −1.4285681248); error = 0.11E−9.
    No. 2 (positive real parts of eigenvalues, A2): b = (3.0, 4.0, 12.0, −6.0, 8.0);
        solution = (0.6346439123, 0.3502970338, 2.9066379070, −2.8384561539, 1.4461678267); error = 2.3E−10.
    No. 3 (arbitrary, A3): b = (2.3, 4.1, 1.4, 2.6, −4.2, −2.8, 12.0, −21.0, 10.0, 7.7);
        solution = (0.3921020627, 0.6200531721, −5.0949549675, −3.2620067596, 0.3290435970, 1.7367919683, 0.5648825169, −1.3450118303, −3.6507625580, −2.1458382607); error = 4.7E−6.

where A1 = [1.0 0.5; 0.5 2.0]. [The entries of the 5×5 matrix A2 and the 10×10 matrix A3 appear with the original table.]
Table 4.4. Matrix inversion.

    No. 4:  A4 = [0.8 0.4 0.3; 0.4 1.2 0.8; 0.3 0.8 1.6]

            A4⁻¹ = [  1.50234556  −0.46948257  −0.04694886
                     −0.46948251   1.39671147  −0.61032742
                     −0.04694865  −0.61032748   0.93896627 ]

            E4   = [  0.00000113   0.00000006  −0.00000056
                      0.00000037   0.00000107   0.00000052
                     −0.00000002   0.00000055   0.00000054 ]

    No. 5:  A2 is the 5×5 matrix of Table 4.3. [Its computed inverse A2⁻¹ and error matrix E2, with entries on the order of 10⁻⁷, appear with the original table.]

Table 4.5. Stability.

    No.  A    Solution   Stable   Eigenvalues
    6    A5   Exists     Yes      λ1 = 6.710624, λ2 = 2.144688 + i0.784808, λ3 = 2.144688 − i0.784808
    7    A6   Infinite   No       λ1 = 5.970251, λ2 = 2.420508, λ3 = −3.390759

where

    A5 = [ 3.0 2.0 1.0; 1.0 4.0 2.0; 2.0 1.0 4.0 ]
    A6 = [−3.0 2.0 1.0; 1.0 4.0 2.0; 2.0 1.0 4.0 ]

Table 4.6. Singularity.

    No.  A    b            Solution   Singular
    8    A7   (1.0, 4.0)   Multiple   Yes
              (2.0, 4.0)   None
    9    A8   (1.0, 4.0)   Unique     No

where A7 = [1.0 0.5; 4.0 2.0].

Chapter 5

ANN Techniques for Power System Control Analysis

This chapter presents a new approach for solving nonlinear equations using an artificial neural network technique which shows great potential for more efficiently solving a class of power system problems. Foremost, methodologies for solving nonlinear equations addressing power load flow and contingency analysis problems are reviewed. Then, a mapping between the solution of nonlinear equations and quadratic-nonlinear minimization is established so that solving nonlinear equations can be viewed as finding the minimum of a quadratic-nonlinear function. From an engineering point of view, a new formulation for an ANN nonlinear equation solver is proposed which can significantly simplify the hardware architecture. The conditions for stability and convergence of the neural network based on the new formulation are proven, providing a mathematical basis for the approach. Furthermore, ANN formulations for the full power load flow, the decoupled load flow, and the DC load flow are given, and corresponding architectures for them are proposed, which can be implemented with VLSI technology.

5.1 Introduction

On-line control of large-scale power systems is a difficult problem, exacerbated by the computational complexity of solving large-scale load flow and contingency analysis formulations in the time frames required for real-time response to rapidly changing operating conditions. The problem has been aggravated by the lack of robust algorithms and implementation technology which would allow for the design of economically feasible, dedicated, high-speed computational hardware.

A large class of problems in science and engineering, including this group of power system analysis formulations, can be formulated as optimization problems. Conventional computers are sometimes adequate for solving this kind of problem provided that its size is within reasonable bounds. Most real-world problems, however, are too large to be solved in this manner. As described in the previous chapters, a new computational trend, artificial neural network computation, has recently emerged for solving these difficult problems.
Hopfield and Tank proposed a neural network for solving linear programming problems [TaHo86]. Kennedy and Chua developed a canonical form of neural networks for finding the minimum in nonlinear programming [KeCh88]. For combinatorial optimization problems, such as the Traveling Salesman Problem (TSP), several algorithms and architectures have been reported [HoTa85, AiFS90, WiPS88]. One of the significant features of these approaches is that the time complexity is size-insensitive, so that solving large-scale systems in real time is possible if the approaches are implemented in hardware.

Linking ANN optimization networks with techniques for solving nonlinear equations, this chapter proposes an artificial neural network approach and architecture for power load flow and contingency analysis with the following characteristics. A mapping is established between an optimization problem and the nonlinear equation solution. A simpler formulation is proposed for an ANN nonlinear equation solver so that the method is more suitable for VLSI implementation. Advantage is taken of automatic convergence so that the time complexity of the solver is size-independent. The formulation is made uniform for solving both power load flow and contingency analysis, resulting in an identical procedure for single and multiple outages in contingency analysis. A straightforward process is provided for solving the problem without training and testing. An accurate result is guaranteed for power load flow and contingency analysis.

This chapter is organized as follows. First, the problem descriptions for power load flow and contingency analysis are given, and a variety of approaches to these problems are reviewed. Then, an ANN approach for solving nonlinear equations is proposed in which a mapping between an optimization problem and the nonlinear equation solution is established. A formulation of an ANN solver for general nonlinear equations is defined, and the conditions for the stability and convergence of the networks are proven. Based on the general nonlinear equation solver, the ANN formulations for the full power load flow, the decoupled load flow and the DC load flow are presented. Furthermore, ANN architectures for the three cases of power load flow, and the simulation results, are presented.

5.2 Methodology Review

A description of the power load flow and contingency problems is given first, and then typical computational techniques for solving these problems are reviewed.

5.2.1 Problem Statement

5.2.1.1 Power Load Flow

The power load flow is an analysis problem which deals with finding the steady-state solution of bus voltage magnitudes and phase angles for given load demands at the various load busses and for assumed generation levels. The load flow is of fundamental importance to power system control because many practical applications, such as transmission planning, contingency analysis, VAR/voltage analysis, on-line control, and security enhancement, require a load flow solution [Deb88, Coet86, Moet87].

Let k(i) denote the set of buses connected to bus i. At bus i, P_i, Q_i, P_Di, Q_Di represent the generated real and reactive power and the real and reactive load power, respectively. Let V_i and δ_i be the magnitude and the phase angle of the complex voltage at bus i. The load flow problem can be written using these terms as

    P_i − P_Di = V_i^2 G_ii − V_i Σ_{j∈k(i)} V_j [ G_ij cos δ_ij + B_ij sin δ_ij ],     (5.1a)

    Q_i − Q_Di = −V_i^2 B_ii − V_i Σ_{j∈k(i)} V_j [ G_ij sin δ_ij − B_ij cos δ_ij ],    (5.1b)

where

    G_ii ≡ Σ_{j∈k(i)} (G_sij + G_ij),                                                   (5.1c)
    B_ii ≡ B_si + Σ_{j∈k(i)} (B_sij + B_ij),                                            (5.1d)

    δ_ij ≡ δ_i − δ_j,                                                                   (5.1e)

and G_ij and B_ij are the conductance and susceptance of the transmission line between bus i and bus j. For simplicity, these equations can be written in vector form as

    P(δ, V) = 0,
    Q(δ, V) = 0.                                                                        (5.2)

Because all parameters in a power system are under consideration, this formulation is called the full power load flow (FLF).

5.2.1.2 Steady-State Contingency Analysis

The steady-state contingency analysis focuses on predicting the power load flow following some event such as a transmission line outage or a transformer outage. For contingency evaluation, it is assumed that the real and reactive loads, the real power generation, and the generator bus voltage magnitudes are unchanged before and after the outage. The objective is to judge the security of the power system against the outage by solving the load flow problem with the changed parameters of the system topology. By comparing the solution with the constraints, a conclusion on security can easily be reached.

5.2.2 Methodology Review

5.2.2.1 Pattern Classification Method

If the operational conditions of a power system are regarded as the input variables, and the judgement of static security for the system is regarded as the output variable, then contingency analysis can be treated as a typical pattern classification problem. Artificial neural networks have been applied to problems of this type [Aget89, Fiet89, SoPa89, HuSh91]. The approach involves two phases: off-line training for determining the parameters of the neural network, and on-line testing for contingency analysis. A feedforward neural network model is often used for building the classifier [Aget89]. Figure 5.1 shows such a neural network architecture, where S, P, and Q are apparent power, real power and reactive power, respectively, and 1.0 is a constant bias for all thresholds. Each circle represents a processing element (neuron). There are three layers (input, hidden and output) and two connection matrices. The back-propagation learning rule is used for training the network so that the connection weights are adaptively updated. When the values of the connection weights converge, the learning process is finished and the neural network is used to test different contingency situations. The attractive feature of this method is that the testing is a straightforward process which can be performed in real time. However, the convergence of the learning process is case dependent, so the general conditions for convergence are still subject to investigation [Deb88].

5.2.2.2 Newton-Raphson Method

One common way to solve the FLF problem is the Newton-Raphson method [Ort87a]. Defining the Jacobian matrix of (5.2) as

    J = [ ∂P/∂δ   ∂P/∂V ]
        [ ∂Q/∂δ   ∂Q/∂V ],                                                              (5.3)

an iterative solution can be found by

    [ δ^(k+1) ]   [ δ^(k) ]                            [ ΔP^(k) ]
    [ V^(k+1) ] = [ V^(k) ] + J^{-1}(δ^(k), V^(k))     [ ΔQ^(k) ],                       (5.4)

where k is the iteration index.
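To make the iteration (5.4) concrete, the sketch below runs a plain Newton-Raphson loop, with a finite-difference Jacobian standing in for (5.3), on a toy two-bus mismatch obtained from (5.1a) and (5.1b) with all conductances set to zero; the line susceptance and load values are illustrative assumptions, not data used elsewhere in this work.

```python
# Minimal Newton-Raphson loop for a mismatch vector f(x) = 0, with the
# Jacobian formed by finite differences.  The toy mismatch models one PQ bus
# behind a purely reactive line to a slack bus (V0 = 1, delta0 = 0).
import numpy as np

def newton_raphson(f, x0, tol=1e-8, max_iter=50, h=1e-6):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.max(np.abs(fx)) < tol:
            break
        # finite-difference Jacobian: column j is the derivative w.r.t. x_j
        J = np.column_stack([(f(x + h * e) - fx) / h for e in np.eye(len(x))])
        x = x + np.linalg.solve(J, -fx)       # Newton update: solve J dx = -f(x)
    return x

def mismatch(z, B=-10.0, PD=0.5, QD=0.2):
    """P and Q mismatches of (5.1a)/(5.1b) with G = 0 for a single PQ bus."""
    d, V = z
    P = -V * B * np.sin(d) + PD               # (5.1a) with generation P_i = 0
    Q = -V * V * B + V * B * np.cos(d) + QD   # (5.1b) with generation Q_i = 0
    return np.array([P, Q])

print(newton_raphson(mismatch, [0.0, 1.0]))   # converges in a few iterations
```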
5.2.2.3 Estimation Method

Since transmission line resistances are much smaller than the corresponding reactances, angular differences across a transmission line are small, and the voltage magnitudes are close to their nominal values (V_i ≈ 1.0 p.u.), some approximations can be used to simplify the problem. These include the decoupled load flow (DLF) and the DC load flow (DCLF). The solution to the DLF takes the form

    [ δ^(k+1) ]   [ δ^(k) ]   [ B_node^{-1}       0       ] [ ΔP^(k)/V_node^(k) ]
    [ V^(k+1) ] = [ V^(k) ] + [      0       B_load^{-1}  ] [ ΔQ^(k)/V_load^(k) ],      (5.5)

where B_node, B_load, V_node, and V_load are the node-connection matrix, load-connection matrix, node-voltage matrix, and load-voltage matrix, respectively. In the DCLF, the variation of the voltage magnitude is ignored, so the power load flow degenerates to a linear equation of the form

    A δ = P                                                                             (5.6)

where A is a connection matrix, and δ and P are the phase angle vector and active power vector, respectively. Therefore, the technique for solving linear equations can be used.

5.2.2.4 Multi-level Screening Method

The objective of contingency analysis is to identify the critical branches which may suffer from violations. In general, the number of critical branches is very small, so approximate techniques can be used for filtering noncritical branches at the beginning, and more accurate analysis can be used at each higher screening level with a reduced set of candidates. In this way, a significant amount of computing time can be saved. Furthermore, a pre-screening algorithm has been developed for reducing the number of branches for the first process [Bret91]. An important assumption of this algorithm is contingency localization, since the effects of a power system contingency are highly localized near the outaged branch. Consequently, the power system network can be divided into three parts: the outaged inside network, the stiff boundary, and the rest of the network, where the first and second parts constitute the local cut-off network. The third part can be ignored during the contingency analysis as it is considered far away from the outage. This approach can handle a single contingency situation. Tests on a 600-bus system have shown that the CPU time can be reduced by about 50% compared with the complete bounding method; the relative error was 1-2% compared with the full power load flow solutions [Bret91].

5.2.2.5 Homotopy Method

Homotopy is a globally convergent method for solving nonlinear systems of polynomial equations [ChMY78]. A homotopy function is defined as

    H(x, t) = (1 − t) S(x) + t T(x)                                                     (5.7)

where t is a real parameter varying from 0 to 1, T(x) is the target system to be solved, and S(x) is a simpler system for the initial situation, called the starting system. For the power load flow problem, S(x) is often chosen as

    S(x) = [ a_1 x_1^2 − b_1;  a_2 x_2^2 − b_2;  … ;  a_n x_n^2 − b_n ].                 (5.8)

The procedure for the homotopy method is defined as follows:

(i) Let t_i = 0. Randomly choose complex vectors a = (a_1, a_2, …, a_n) and b = (b_1, b_2, …, b_n), and solve S(x) = 0 to get a set of roots which are the starting points of the homotopy curves.

(ii) Update t_i by t_i^new = t_i^old + Δt, where Δt is the stepping distance. If t_i < 1.0, go to (iii); otherwise, stop.

(iii) Use an iterative method, such as the Newton-Raphson method, to solve the homotopy equation for each t_i:

    H(x, t_i) = (1 − t_i) S(x) + t_i T(x) = 0.                                          (5.9)

(iv) Go to step (ii).

This method can systematically find all possible solutions at the same time. Moreover, the speed of this process can be improved by means of dynamically adjusting Δt and using parallel computation techniques [Saet89].
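The procedure (i)-(iv) can be sketched in a few lines. The example below tracks one homotopy curve for a toy two-equation polynomial target system; the target system, the random starting coefficients, the analytic Jacobians and the step size Δt are all illustrative choices, not the power load flow formulation itself.

```python
# Sketch of the homotopy procedure (i)-(iv) for a toy polynomial system.
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(2) + 1j * rng.standard_normal(2)   # random complex a_i
b = rng.standard_normal(2) + 1j * rng.standard_normal(2)   # random complex b_i

T  = lambda x: np.array([x[0]**2 - 2.0, x[1]**2 - x[0]])    # target system T(x) = 0
dT = lambda x: np.array([[2*x[0], 0.0], [-1.0, 2*x[1]]])
S  = lambda x: a * x**2 - b                                  # starting system (5.8)
dS = lambda x: np.diag(2 * a * x)

x = np.sqrt(b / a)        # one root of S(x) = 0: the start of a homotopy curve
dt, t = 0.01, 0.0
while t < 1.0:            # step t from 0 to 1 (steps (ii)-(iv))
    t = min(t + dt, 1.0)
    for _ in range(10):   # Newton correction on H(x, t) = 0  (step (iii), eq. 5.9)
        H  = (1 - t) * S(x) + t * T(x)
        dH = (1 - t) * dS(x) + t * dT(x)
        x = x - np.linalg.solve(dH, H)
print(x)                  # lands on a root of T: x1^2 = 2 and x2^2 = x1
```

Starting from each of the 2^n roots of S(x) = 0 in turn (here 4 of them) traces out all the homotopy curves, which is how the method finds every solution of the target system.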
5.3 ANN Formulation for Power Load Flow

5.3.1 ANN Solution to Nonlinear Equations

Consider the set of nonlinear equations

    f_1(x_1, x_2, …, x_n) = 0
    f_2(x_1, x_2, …, x_n) = 0
        ⋮
    f_n(x_1, x_2, …, x_n) = 0                                                           (5.10a)

or

    f(x) = 0                                                                            (5.10b)

where x is a variable vector and f is a function vector. The Jacobian matrix of f(x) is defined as

    J = [ ∂f_1/∂x_1  ∂f_1/∂x_2  …  ∂f_1/∂x_n ]
        [ ∂f_2/∂x_1  ∂f_2/∂x_2  …  ∂f_2/∂x_n ]
        [     ⋮           ⋮              ⋮    ]
        [ ∂f_n/∂x_1  ∂f_n/∂x_2  …  ∂f_n/∂x_n ].                                          (5.11)

Note that J is a function matrix of x. In most cases, it is desirable to evaluate J at a given point, say x_0; as a consequence, J becomes a constant matrix, represented by J(x_0). Before deriving the ANN formulation, some definitions are given.

Definition 5.1: If the Jacobian matrix at x, J(x), is not a zero matrix for x ∈ D ⊂ R^n, then it is called an unvanished Jacobian matrix on D.

Definition 5.2: If all eigenvalues of the Jacobian matrix at x satisfy Re(λ_i) > 0 for x ∈ D, then it is called an eigen-positive Jacobian matrix.

Next, a mapping between an optimization problem without constraints and the nonlinear equation solution can be established.

Proposition 5.1: If the Jacobian matrix of the nonlinear equations (5.10) is unvanished for any x, then an optimum point is a solution to the nonlinear equations.

Proof: Define an objective function

    φ = Σ_{i=1}^{n} f_i^2(x).                                                           (5.12)

Then an optimum point of φ(x) must satisfy the necessary condition ∇φ = 0, that is,

    ∇φ = [ ∂φ/∂x_1 ]   [ 2f_1 ∂f_1/∂x_1 + 2f_2 ∂f_2/∂x_1 + ⋯ + 2f_n ∂f_n/∂x_1 ]
         [ ∂φ/∂x_2 ] = [ 2f_1 ∂f_1/∂x_2 + 2f_2 ∂f_2/∂x_2 + ⋯ + 2f_n ∂f_n/∂x_2 ] = 2 J^T f = 0.    (5.13)
         [    ⋮    ]   [                          ⋮                           ]
         [ ∂φ/∂x_n ]   [ 2f_1 ∂f_1/∂x_n + 2f_2 ∂f_2/∂x_n + ⋯ + 2f_n ∂f_n/∂x_n ]

Since the Jacobian matrix is unvanished, the above equation implies that f_1 = f_2 = ⋯ = f_n = 0. In other words, the minimum point satisfies the corresponding nonlinear equations. □

Proposition 5.1 indicates that a minimum point of Σ_i f_i^2(x) is a solution to the nonlinear equations. However, if a gradient system is derived directly from this objective function, then the corresponding artificial neural network for the problem becomes complex, since it involves both f(x) and J(x). Alternatively, a simpler formulation is proposed below. For the set of nonlinear equations described by (5.10), define a dynamic system as

    ẋ = [ ẋ_1 ]     [ f_1(x_1, x_2, …, x_n) ]
        [ ẋ_2 ] = − [ f_2(x_1, x_2, …, x_n) ]
        [  ⋮  ]     [           ⋮           ]
        [ ẋ_n ]     [ f_n(x_1, x_2, …, x_n) ].                                          (5.14)

When the system approaches an equilibrium, ẋ = 0; at that time, the equality f(x) = 0 must hold. We call this network, described in (5.14), an ANN nonlinear equation solver, because equation (5.14) can be regarded as a simplified formulation of an artificial neural network for optimization [KeCh88]. An important issue for the ANN nonlinear equation solver is ensuring its stability, as discussed in the following theorem.

Theorem 5.1: If f(x) is continuously differentiable and its Jacobian matrix is eigen-positive, then the ANN nonlinear equation solver is stable in the neighborhood of a solution point, and its equilibrium point satisfies the nonlinear equations.

The proof is given in Appendix A. Theorem 5.1 establishes a formulation for solving general nonlinear equations using a newly developed artificial neural network. In what follows, this formulation will be applied to power system control problems [ChSh92b].
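As an illustration of the formulation, the following sketch integrates the solver dynamics ẋ = −f(x) of (5.14) with a plain forward-Euler step for a small two-variable system whose Jacobian is strictly diagonally dominant (and hence eigen-positive) near the root; the example system, step size and stopping threshold are illustrative assumptions rather than data from this dissertation.

```python
# Minimal sketch of the ANN nonlinear equation solver of equation (5.14),
# x_dot = -f(x), integrated with a forward-Euler step.
import numpy as np

def ann_nonlinear_solver(f, x0, dt=0.01, eps=1e-8, max_steps=200_000):
    """Integrate x_dot = -f(x) until successive states differ by less than eps."""
    x = np.asarray(x0, dtype=float)
    for step in range(1, max_steps + 1):
        x_new = x + dt * (-f(x))             # equation (5.14)
        if np.all(np.abs(x_new - x) < eps):
            return x_new, dt * step
        x = x_new
    return x, dt * max_steps

# f(x) = 0 with Jacobian [[2*x1, 1], [1, 3]], which is strictly diagonally
# dominant (hence eigen-positive, Theorem 5.2) around the root near (1.32, 0.56).
f = lambda x: np.array([x[0]**2 + x[1] - 2.3, x[0] + 3.0 * x[1] - 3.0])
x, T = ann_nonlinear_solver(f, x0=[1.0, 1.0])
print(x, T, f(x))                            # residual f(x) approaches zero
```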
5.3.2 The Full Power Load Flow

As prescribed by equations (5.2) and (5.14), the neural network for the full power load flow is formed as

    δ̇ = −P(δ, V),                                                                       (5.15a)
    V̇ = −Q(δ, V),                                                                       (5.15b)

where δ and V are the phase angle vector and the voltage vector, respectively. It is apparent that the Jacobian matrix described in (5.3) is continuously differentiable in the domain δ ∈ R^n and V ∈ R^n. In order to investigate the eigen-positive property of the Jacobian matrix in the neighborhood of an equilibrium point, the following definition and theorem must be introduced.

Definition 5.3: If matrix A satisfies

    a_ii > 0,                                                                           (5.16a)
    a_ii ≥ Σ_{j≠i} |a_ij|,                                                              (5.16b)
    a_ii ≥ Σ_{j≠i} |a_ji|,                                                              (5.16c)

then A is said to be diagonally dominant. Furthermore, if strict inequality holds in (5.16b) and (5.16c), A is said to be strictly diagonally dominant. For example, the matrices A1 and A2 below,

    A1 = [ 4.0  0.5  −0.5  −1.0 ]        A2 = [ 4.0  1.5  −0.5  −1.0 ]
         [ 1.0  3.0   1.0   0.3 ]             [ 1.0  3.0   1.0   0.3 ]
         [ 1.0  0.5   2.8   1.0 ]             [ 1.0 −0.5   2.5   1.0 ]
         [ 1.0 −1.0   0.3   3.5 ],            [ 1.0 −1.0   0.3   3.5 ],

are strictly diagonally dominant and diagonally dominant, respectively.

Theorem 5.2: A strictly diagonally dominant matrix is eigen-positive. The proof can be found in [Jen77].

In general, there exist multiple solutions to the power load flow due to its nonlinear behavior. From a practical point of view, however, the feasible solution is the most important, because the power system operates in this situation. As derived in Theorem 5.1, a nonlinear system usually features only local stability. Thus, a problem arises in the choice of an appropriate initial point for an ANN solver in order to make sure that the equilibrium point is the desired feasible solution. Fortunately, this problem can be solved by setting the initial point at the ideal operating point of the power system: all phase angles are 0 and the magnitude of all bus voltages is 1. Indeed, the objective of power system control is to make the difference between the system state (δ, V) and the ideal state (δ^0 = 0, V^0 = 1) as small as possible by means of shunt capacitor banks and adjustment of transformer taps. Usually, an allowable range for the difference is defined as |V − V^0| < 0.05 and |δ − δ^0| < 0.15 (δ in radians).

Next, the Jacobian matrix at the ideal point is discussed. Of course, in this case the Jacobian matrix depends only on the topology parameters: the conductance and susceptance of the transmission lines, G and B. By plugging the values δ_i = 0 and V_i = 1 for all i into equation (5.3), the Jacobian matrix becomes

    J(0, 1) = [ a_11 … a_1n   b_11 … b_1n ]
              [   ⋮       ⋮      ⋮      ⋮  ]
              [ a_n1 … a_nn   b_n1 … b_nn ]
              [ c_11 … c_1n   d_11 … d_1n ]
              [   ⋮       ⋮      ⋮      ⋮  ]
              [ c_n1 … c_nn   d_n1 … d_nn ],                                             (5.17)

where

    a_ii = −Σ_{j∈k(i)} B_ij,                                                            (5.17a)
    a_ij = B_ij  (i ≠ j),                                                               (5.17b)
    b_ii = 2G_ii − Σ_{j∈k(i)} G_ij,                                                     (5.17c)
    b_ij = −G_ij  (i ≠ j),                                                              (5.17d)
    c_ii = −Σ_{j∈k(i)} G_ij,                                                            (5.17e)
    c_ij = G_ij  (i ≠ j),                                                               (5.17f)
    d_ii = −2B_ii + Σ_{j∈k(i)} B_ij,                                                    (5.17g)
    d_ij = B_ij  (i ≠ j).                                                               (5.17h)

It is apparent that every diagonal entry of the Jacobian matrix is positive, because

    B_ij < 0  ⟹  a_ii = −Σ_{j∈k(i)} B_ij > 0                                            (5.18a)

and

    d_ii = −2B_ii + Σ_{j∈k(i)} B_ij
         = −2( B_si + Σ_{j∈k(i)} B_sij + Σ_{j∈k(i)} B_ij ) + Σ_{j∈k(i)} B_ij
         = −2B_si − 2Σ_{j∈k(i)} B_sij − Σ_{j∈k(i)} B_ij.                                 (5.18b)

In power systems, Σ_{j∈k(i)} |B_ij| represents the sum of the susceptances of all the transmission lines connected to bus i, and |B_si| + Σ_{j∈k(i)} |B_sij| represents the sum of the susceptance of the capacitor bank at bus i and the susceptances of all Π-equivalent sections at the side of bus i. In general, the former is much larger than the latter, that is,

    Σ_{j∈k(i)} |B_ij| > 2( |B_si| + Σ_{j∈k(i)} |B_sij| )  ⟹  d_ii > 0.                   (5.18c)

Using the notation

    Ĝ_ij = { G_ij   if there is a connection between bus i and bus j,
           { 0      otherwise,                                                          (5.20)

where B_i0 represents the susceptance of the transmission line between bus i and the slack bus (bus 0), the conditions for making the Jacobian matrix strictly diagonally dominant are found to be

    |B_i0| > Σ_{j≠i} |Ĝ_ij|,                                                            (5.21a)
    |B_i0 + 2B_si + 2Σ_{j∈k(i)} B_sij| > Σ_{j≠i} |Ĝ_ij| + |G_i0|.                        (5.21b)

In other words, when the parameters of a power system satisfy the conditions in (5.21), the neural network formulated in (5.15) is stable in the neighborhood of (0, 1), so that a feasible solution can be found by the network.
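The definition and theorem above are easy to check numerically. The sketch below tests conditions (5.16a)-(5.16c) row- and column-wise for the two example matrices A1 and A2 and prints the smallest real part of their eigenvalues; by Theorem 5.2 the strictly dominant A1 must come out eigen-positive, while A2 satisfies the dominance conditions only with equality in some rows and columns.

```python
# Numerical check of Definition 5.3 and Theorem 5.2 on the example matrices.
import numpy as np

def diagonally_dominant(A, strict=False):
    """Check conditions (5.16a)-(5.16c) row- and column-wise."""
    d = np.diag(A)
    row = np.sum(np.abs(A), axis=1) - np.abs(d)   # sum_{j != i} |a_ij|
    col = np.sum(np.abs(A), axis=0) - np.abs(d)   # sum_{j != i} |a_ji|
    if strict:
        return bool(np.all(d > 0) and np.all(d > row) and np.all(d > col))
    return bool(np.all(d > 0) and np.all(d >= row) and np.all(d >= col))

A1 = np.array([[4.0, 0.5, -0.5, -1.0],
               [1.0, 3.0,  1.0,  0.3],
               [1.0, 0.5,  2.8,  1.0],
               [1.0, -1.0, 0.3,  3.5]])
A2 = np.array([[4.0, 1.5, -0.5, -1.0],
               [1.0, 3.0,  1.0,  0.3],
               [1.0, -0.5, 2.5,  1.0],
               [1.0, -1.0, 0.3,  3.5]])

for name, A in (("A1", A1), ("A2", A2)):
    print(name,
          "strict:", diagonally_dominant(A, strict=True),
          "dominant:", diagonally_dominant(A),
          "min Re(eig):", round(float(np.min(np.linalg.eigvals(A).real)), 4))
```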
It is worth pointing out that strict diagonal dominance is a sufficient condition for an eigen-positive matrix. In some cases, a matrix is still eigen-positive even though it is not strictly diagonally dominant. This situation is discussed below because it often appears in the decoupled load flow and the DC load flow.

Definition 5.4: If a matrix can be written in block form and all its non-zero entries are in the diagonal blocks, then the matrix is called diagonal-decoupled; otherwise, it is called diagonal-undecoupled.

Theorem 5.3: If A is a symmetric, diagonal-undecoupled and dominant matrix with at least one strictly dominant row (column), such that a_kk > Σ_{j≠k} a_kj and a_kk > Σ_{j≠k} a_jk for some k, then A is an eigen-positive matrix.

Proof: All eigenvalues of A must be real due to the symmetry [HiSm74], i.e., Im(λ_i) ≡ 0. According to the Gerschgorin theorem [HiSm74], every eigenvalue λ_i satisfies

    |λ_i − a_ii| ≤ Σ_{j≠i} |a_ij|                                                       (5.22a)

and

    |λ_i − a_ii| ≤ Σ_{j≠i} |a_ji|,                                                      (5.22b)

where a_ii is a diagonal entry of the matrix. A graphic interpretation of the above relationship is that λ_i must lie within the union of the row discs (centered at a_ii with radius Σ_{j≠i} |a_ij|) and the column discs (centered at a_ii with radius Σ_{j≠i} |a_ji|) in the complex plane. Using the dominance conditions in (5.16a)-(5.16c), it is concluded that all eigenvalues are in the right half of the complex plane and λ_i ≥ 0, since they are all real.

Next, matrix A must be shown to be nonsingular. Without loss of generality, let

    a_ii = Σ_{j≠i} |a_ij|  for i = 1, 2, …, n−1,                                         (5.23a)
    a_nn > Σ_{j≠n} |a_nj|.                                                              (5.23b)

Using elementary row transformations, reduce A to an upper-triangular matrix. It can be shown that the diagonal dominance is preserved because of the diagonal-undecoupled assumption on A [HiSm74]. Therefore,

    a_ii^(k) ≥ Σ_{j≠i} |a_ij^(k)|                                                       (5.24a)

and

    a_nn^(k) > 0,                                                                       (5.24b)

where (k) denotes the kth reduction, such that a_ij^(k) = 0 for j < i. The elementary transformations do not change the determinant of the matrix, so

    det(A) = a_11^(1) a_22^(2) ⋯ a_nn^(n−1) > 0.                                        (5.25)

In other words, there is no zero eigenvalue of A, so λ_i > 0 for all i. Therefore, A is an eigen-positive matrix. □

It is worth mentioning that the diagonal-undecoupled feature is not a necessary condition for an eigen-positive matrix. For instance, consider two diagonal-decoupled matrices C1 and C2, each built from 2×2 diagonal blocks. C1 is singular whereas C2 is nonsingular, because there exists at least one diagonal block with a zero eigenvalue in C1, while in C2 each diagonal block has a strictly dominant row which makes that diagonal block nonsingular.

In a power system, a diagonal-decoupled matrix indicates that the system consists of groups of buses in which there exist connections only between the slack bus and the individual groups, and there is no interconnection between the groups. Figure 5.2 shows a power system composed of three groups A1, A2, A3; these groups are connected to the slack bus S. It is obvious that case C1 never occurs, because in each group there is at least one bus (say bus k) connected to the slack bus, so that a_kk is strictly dominant in the corresponding block matrix. Based on Theorem 5.3, two corollaries can be deduced, for the decoupled load flow and the DC load flow.
5.3.3 Decoupled Power Load Flow

Corollary 5.1: The decoupled power load flow (DLF) can be solved by the ANN solver.

Proof: By ignoring the conductances of the transmission lines, the ANN solver for the DLF can be formed as

    δ̇_i = −[ −V_i Σ_{j∈k(i)} V_j B_ij sin δ_ij − (P_i − P_Di) ],                         (5.26a)
    V̇_i = −[ −V_i^2 B_ii − V_i Σ_{j∈k(i)} V_j B_ij cos δ_ij − (Q_i − Q_Di) ].             (5.26b)

Under the assumption sin δ_ij ≈ 0 for all i, j, the corresponding Jacobian matrix at (0, 1) is

    J(0, 1) = [ a_11 … a_1n        0        ]
              [   ⋮       ⋮                  ]
              [ a_n1 … a_nn        0        ]
              [      0        d_11 … d_1n   ]
              [                  ⋮       ⋮   ]
              [      0        d_n1 … d_nn   ],                                           (5.27)

which is a decoupled matrix. Obviously, every row is dominant, and there exists at least one strictly dominant row in each block. According to Theorems 5.2 and 5.3, the Jacobian matrix is eigen-positive, which guarantees that the decoupled power load flow can be solved by the ANN solver. □

5.3.4 The DC Power Load Flow

Corollary 5.2: The DC power load flow (DCLF) can be solved by the ANN solver.

Proof: As mentioned in Section 5.2, the DC power flow degenerates to a linear equation. Consequently, the ANN solver can be formulated as

    δ̇ = −(A δ − P)                                                                      (5.28)

where

    P = [ P_1; P_2; … ; P_n ],    δ = [ δ_1; δ_2; … ; δ_n ],                             (5.29)

and the entries of A follow (5.17a) and (5.17b):

    a_ij = { −Σ_{l∈k(i)} B_il,   i = j,
           {  B_ij,              i ≠ j and j ∈ k(i),
           {  0,                 otherwise.                                             (5.30)

Note that the Jacobian matrix is the constant matrix

    J = A,                                                                              (5.31)

and A is still a diagonally dominant matrix with at least one strictly dominant row, so A is eigen-positive. In this case, a unique solution can be found by the ANN solver due to the linear behavior of the system [Mea88]. □

5.3.5 Contingency Analysis

As described in Section 5.2, the kernel problem of contingency analysis is to solve the power load flow with changed parameters of the power system. An assumption in the contingency analysis is that the whole network remains unisolated before and after outages, so the topological conditions above are unchanged. Therefore, the ANN formulations for the different cases of power load flow can be used directly for contingency analysis. It is worth pointing out that multiple outages (a parameter change of a transmission line can be regarded as a specific case of a transmission line outage) share an identical ANN formulation with the single outage. Consequently, the architecture proposed in the next section handles all the situations of power load flow and contingency analysis.

5.4 Architecture

A theoretical basis for an ANN nonlinear equation solver and the ANN formulations for power load flow have been presented in Section 5.3. In what follows, an architecture of the ANN nonlinear equation solver for power load flow is proposed.

Figure 5.3 shows the architecture of the ANN nonlinear equation solver for the full power load flow. From a structural point of view, the architecture can be divided into two parts: the δ component (the upper part in Figure 5.3) and the V component (the lower part in Figure 5.3). These two components can be viewed as the mapping of equations (5.15a) and (5.15b), respectively. The network is composed of neurons, resistive matrices, analog multipliers, sinusoidal generators, feedback links and DC voltage sources. The neurons are implemented with integrators (I-units), which govern the dynamical process of the network for finding an equilibrium point. The dynamical relations in (5.15) are enforced by connecting all current outputs from the P, B, G, G_ii networks and from the Q, G, B, B_ii networks to the inputs of the integrators for δ and V, respectively, where the negative terminal of the integrator is used to implement the negative sign in (5.15). Another distinct feature of this approach is that the feedback connections B and G are not restricted to symmetric matrices. This significantly broadens its range of applications.
In most power systems, the phase angle δ is usually near zero, so the following polynomials can be used to approximate the sinusoidal functions:

    sin δ ≈ δ − δ^3/3!,                                                                 (5.32a)
    cos δ ≈ 1 − δ^2/2!.                                                                 (5.32b)

In other words, the sinusoidal generator is replaced by a polynomial generator.

There are three nonlinear operational blocks in the right part of Figure 5.3, all organized as n × n arrays. In the sinusoidal generator block (Sin/Cos), the outputs on the diagonal are sin δ_i and cos δ_i, and the outputs off the diagonal are sin(δ_i − δ_j) and cos(δ_i − δ_j), where i is the row index and j is the column index. If there is no connection between bus i and bus j, the output at (i, j) of the array is zero. Similarly, in the voltage multiplication block (M1), the outputs on the diagonal are V_i^2, and the outputs off the diagonal are V_i V_j if there is a connection between bus i and bus j. In the joint multiplication block (M2), the outputs on the diagonal are V_i sin δ_i and V_i cos δ_i, and the outputs off the diagonal are V_i V_j sin(δ_i − δ_j) and V_i V_j cos(δ_i − δ_j) if there is a connection between bus i and bus j.

As shown in Figure 5.3, V_i^2 from the voltage multiplication block is fed back to the B_ii, G_ii network, and all outputs from the joint multiplication block are fed back to the B, G networks. It is necessary to point out that the original power load flow formulation (5.1a, 5.1b) is obtained from a decomposition of the complex power, so G, B, P and Q can be regarded as unitless values. As a consequence, they can be implemented in a resistive matrix network. In general, the dimensions of these matrices are n × n for G and B, and n × 1 for G_ii, B_ii, P and Q. As described in Section 4.5, let the normalization value of the current be 0.1 mA; then the values of the resistors can be calculated as

    R_Gij = 10 / |G_ij|  kΩ,                                                            (5.33)

where R_Gij is a resistor representing a value in matrix G. If G_ij = 0, R_Gij is infinite; in this case, the feedback to the corresponding position is suspended. R_B, R_Gii, R_Bii, R_P and R_Q are calculated and handled in a similar manner.

Let n be the number of buses (excluding the slack bus) in a power system; it is then straightforward to find the number of circuit components. Furthermore, using design data for CMOS technology [AlHo87, Rei87], a space estimate for building this nonlinear equation solver for the full power load flow is given in Table 5.1, where a represents the area for a resistor with an average resistance of 50 kΩ, and b represents the area for an operational amplifier, which is the basic circuit used to build the integrator, the sinusoidal generator and the multiplier. Let the routing area be 40% of the circuit area [AlHo87]; then the chip area for an n-bus power system is

    s = 1.4[(18n^2 + 2n)b + (4n^2 + 4n)a].                                              (5.34)

For the decoupled power load flow, the architecture takes the simpler form shown in Figure 5.4. Apart from removing the G blocks, this architecture keeps the remaining structure the same as that of the full load flow shown in Figure 5.3. A space estimate is given in Table 5.2; the chip area for an n-bus power system is

    s = 1.4[(18n^2 + 2n)b + (2n^2 + 4n)a].                                              (5.35)

For the DC load flow, all the blocks associated with V, G, the sinusoidal generator and the multiplications are eliminated, so the architecture becomes significantly simpler, as shown in Figure 5.5. A space estimate is given in Table 5.3. In fact, the chip area for an n-bus power system is the same as that of the linear equation solver described in (4.51); for convenience, it is written here again:

    s = 1.4[nb + (n^2 + n)a].                                                           (5.36)

Using equations (5.34) to (5.36), the actual chip area for a 100-bus power system is 41.38 cm², 39.63 cm², and 0.905 cm² for the FLF, DLF, and DCLF, respectively. The chip area for the DCLF is much smaller than that of the FLF and DLF because it contains no nonlinear structural components.
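The area figures quoted above are easy to verify; the following short calculation evaluates (5.34)-(5.36) for n = 100 with the unit areas a and b of Tables 5.1-5.3 and reproduces, to rounding, the 41.38, 39.63 and 0.905 cm² values.

```python
# Evaluate the chip-area expressions (5.34)-(5.36) for a 100-bus system.
a = 6.25e-3   # resistor unit area, mm^2
b = 1.5e-2    # operational-amplifier unit area, mm^2

def area_cm2(n, kind):
    if kind == "FLF":
        s = 1.4 * ((18*n**2 + 2*n) * b + (4*n**2 + 4*n) * a)   # (5.34)
    elif kind == "DLF":
        s = 1.4 * ((18*n**2 + 2*n) * b + (2*n**2 + 4*n) * a)   # (5.35)
    else:  # DCLF
        s = 1.4 * (n * b + (n**2 + n) * a)                     # (5.36)
    return s / 100.0      # mm^2 -> cm^2

for kind in ("FLF", "DLF", "DCLF"):
    print(kind, round(area_cm2(100, kind), 3))
```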
5.4.1 Verification

To verify the computational correctness of the proposed neural network and architecture, simulations are performed in which the integrators are simulated by fourth-order Runge-Kutta numerical integration (explicit formulation), the resistive networks are simulated by 1-D and 2-D arrays, and the sinusoidal units are simulated by function calls.

5.4.1.1 The full power load flow

There are two cases in the FLF:

(1) Find both the phase angle and the voltage magnitude for each bus except the slack bus, implying that there are 2n variables in the system. This case is called the PQ-bus situation, because the real power P and the reactive power Q for all buses are specified. This is the general case, as discussed in the previous sections.

(2) Find the phase angle and voltage magnitude for load buses and the phase angle only for generator buses, implying that there are (2n − g) variables in the system, where g is the number of generators. (In this case the load bus is a PQ-bus and the generator bus is a PV-bus [Ber86].) This case is called the mixed-bus (PQ-PV) situation, because the real power P and the reactive power Q for all loads, and the real power P and the magnitude V for all generator buses, are specified. Obviously, this situation is a special case of the full power load flow which can still be covered by formulation (5.15).

Two examples are described here to show these two situations. The first example is a 7-bus power system, shown in Figure 5.6, representing the PQ-bus situation. The original data for the transmission lines are provided as resistance and reactance (in the parentheses in Figure 5.6), so a transformation from impedance to admittance is carried out first. The second example is the 39-bus New England power system, which represents the mixed-bus situation. The data for the 39-bus system are given in Appendix B.

Convergence of the ANN simulation is determined by

    |x_i^(k+1) − x_i^(k)| < ε  for all i                                                (5.37)

where x_i^(k) is the value of x_i at the kth iteration. ε = 0.1E−3 and ε = 0.1E−7 are used for both examples. The time increment used for the Runge-Kutta integration is 0.05 for the 7-bus case and 0.001 for the 39-bus system. The simulation time is defined as

    T = dt · N                                                                          (5.38)

where N is the number of iterations required for convergence. All starting points are set to 0 and 1 for δ and V, respectively. The equilibrium points are found for all the examples. The simulation time depends on ε: for ε = 0.1E−3, T is 2.15 and 2.35 for the 7-bus and the 39-bus cases, respectively; for ε = 0.1E−7, T is 13.15 and 14.55 for the 7-bus and the 39-bus cases, respectively. Furthermore, T has the same relationship to the initial point as that of the linear equation solver described in Section 4.7.1.1. These results, together with the result from a hardware simulation of the ANN linear equation solver [ChSh92a], whose formulation is the same as that of the DCLF, show that the time complexity of this approach is size-independent due to its parallel architecture and circuit dynamic properties.
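To tie the pieces together, a compact sketch of this kind of verification run is given below for a toy two-bus system (one slack bus and one PQ bus behind a purely reactive line) rather than the 7-bus or 39-bus data: the FLF network (5.15) is integrated with a simple Euler step, using the polynomial approximation (5.32) and the convergence test (5.37). All line and load values are illustrative assumptions.

```python
# Sketch of an FLF verification run: integrate (5.15) for a toy two-bus case.
B, PD, QD = -10.0, 0.5, 0.2                  # line susceptance and PQ-bus load (p.u.)
sin_p = lambda d: d - d**3 / 6.0             # polynomial approximation (5.32a)
cos_p = lambda d: 1.0 - d**2 / 2.0           # polynomial approximation (5.32b)

def mismatches(d, V):
    """P(delta, V) and Q(delta, V) of (5.2) for the two-bus case with G = 0."""
    P = -V * B * sin_p(d) + PD               # from (5.1a)
    Q = -V * V * B + V * B * cos_p(d) + QD   # from (5.1b)
    return P, Q

d, V = 0.0, 1.0                              # flat start: delta = 0, V = 1
dt, eps = 0.05, 1e-8
for step in range(1, 100_000):
    P, Q = mismatches(d, V)
    d_new, V_new = d - dt * P, V - dt * Q    # Euler step of (5.15a), (5.15b)
    converged = max(abs(d_new - d), abs(V_new - V)) < eps   # test (5.37)
    d, V = d_new, V_new
    if converged:
        break
print(d, V, dt * step)                       # equilibrium angle, voltage, T = dt*N
```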
In order to measure the accuracy, the absolute error E_a and the relative error E_r are defined as

    E_a = [ Σ_{i=1}^{k} (L_i − R_i)^2 ]^{1/2},                                          (5.39a)

    E_r = E_a / Σ_{i=1}^{k} |L_i|,                                                      (5.39b)

where

    k = { 2n       for all PQ buses,
        { 2n − g   for mixed buses,                                                     (5.39c)

and L_i and R_i are the values of the left-hand side and the right-hand side of each equation in (5.1a) and (5.1b), respectively. Thus, the absolute error represents the difference between the left-hand sides and the right-hand sides, whereas the relative error represents the percentage of the absolute error over the absolute sum of the left-hand sides (the total real and reactive power).

For comparison, these two examples are also solved by the Newton-Raphson method. Tables C1 and C2 in Appendix C present the simulation results and the relative errors for the 7-bus system and the 39-bus system from the ANN approach, the Newton-Raphson version, and the homotopy method as found in [Guo90]. It is apparent that the accuracy of this approach is the best. The trajectories of δ and V, shown in Figures 5.7-5.10, indicate that the network approaches its equilibrium without oscillation.

5.4.1.2 Contingency analysis

A 5-bus power system is chosen to test contingency analysis using the ANN formulations for the decoupled load flow and the DC load flow. Figure 5.11 shows the topology of the system. As mentioned in Section 5.3, this approach can uniformly fulfill the contingency analysis for all the cases. In Figure 5.11, the dashed line represents the single transmission line outage, the dotted lines represent the multiple transmission line outage, and the numbers in parentheses represent a parameter change. The ANN simulation procedure for contingency analysis is the same as finding the solution to the full power load flow. Here the simulation parameters take the values ε = 0.1E−3 and dt = 0.05. Table 5.4 presents the simulation results. In order to indicate the quality of the approximations, the simulation result of the FLF for the same data is given as well. Apparently, the result of the DLF is very close to that of the FLF, whereas the result of the DCLF differs from the FLF when the actual voltage magnitude has a large deviation from 1.0 for some buses. The security judgement is based on the following constraints:

    |V_i − V^0| < 0.05,
    |δ_i − δ^0| < 9°.

Non-security cases are marked (*) in Table 5.4.

5.5 Summary

A new approach for solving problems of power load flow and contingency analysis using an artificial neural network has been proposed. A mapping between the nonlinear equation solution and a quadratic-nonlinear minimization problem without constraints has been established. Furthermore, a simple formulation, given by equation (5.14), for the artificial neural network has been defined and its stability condition has been identified. The ANN formulations for the full power load flow, the decoupled load flow and the DC load flow are obtained, and their stability can be sufficiently guaranteed by using the diagonal dominance property of the power systems. Correspondingly, an ANN architecture for the three formulations is given which can be implemented in VLSI technology. The simulation experiments have shown excellent results for power load flow and contingency analysis. Due to dynamic convergence and distributed parallel processing in ANNs, the time complexity of the approach is size-independent. Therefore, this approach can provide a real-time solution for large-scale systems if it is implemented in hardware.
[Figure 5.1. The neural network for classification (input, hidden and output layers).]
[Figure 5.2. A power system consisting of 3 subsystems connected to the slack bus.]
[Figure 5.3. A network for the full power load flow.]
[Figure 5.4. A network for the decoupled load flow.]
[Figure 5.5. A network for the DC load flow.]
[Figure 5.6. The 7-bus system, with bus loads and generation (p.u.) and transmission line impedances.]
[Figure 5.7. The trajectory of angle for the 7-bus system.]
[Figure 5.8. The trajectory of voltage for the 7-bus system.]
[Figure 5.9. The trajectory of angle for the 39-bus system. The bus numbers of the curves from top to bottom: 36, 38, 35, 33, 32, 34, 37, 31, 29, 22, 23, 19, 28, 20, 30, 21, 25, 10, 26, 13, 11, 2, 12, 24, 16, 6, 14, 17, 27, 15, 5, 1, 18, 3, 4, 7, 9, 8.]
[Figure 5.10. The trajectory of voltage for the 39-bus system. The bus numbers of the curves from top to bottom: 36, 35, 30, 37, 38, 25, 22, 29, 23, 28, 26, 1, 2, 34, 33, 27, 24, 21, 17, 19, 18, 3, 16, 9, 20, 32, 31, 15, 10, 14, 13, 11, 4, 6, 5, 7, 8, 12.]
[Figure 5.11. The contingency analysis for a 5-bus system (single outage shown dashed, multiple outage dotted, parameter changes in parentheses).]

Table 5.1. A space estimate of the architecture for the FLF.

                 Resistors      Integrators   Sin & Cos generators   Multipliers
    Count        4n^2 + 4n      2n            2n^2                   2n^2
    Unit area    a              b             5b                     4b
    Total area   (4n^2 + 4n)a   2nb           10n^2 b                8n^2 b

    Note: a = 6.25×10⁻³ mm², b = 1.5×10⁻² mm².

Table 5.2. A space estimate of the architecture for the DLF.

                 Resistors      Integrators   Sin & Cos generators   Multipliers
    Count        2n^2 + 4n      2n            2n^2                   2n^2
    Unit area    a              b             5b                     4b
    Total area   (2n^2 + 4n)a   2nb           10n^2 b                8n^2 b

Table 5.3. A space estimate of the architecture for the DCLF.

                 Resistors      Integrators
    Count        n^2 + n        n
    Unit area    a              b
    Total area   (n^2 + n)a     nb
Table 5.4. The contingency analysis (voltage magnitude / phase angle for each bus; non-secure cases marked *).

    Type               FLF                 DLF                 DCLF
    Before             1.0323/−4.6884      1.0154/−4.4897      1.0/−4.5878
                       0.9971/−7.9609      1.0189/−7.6689      1.0/4.8452
                       0.9805/−6.4424      0.9996/−6.4237      1.0/−6.5378
                       1.0500/0.6858       1.0500/1.1810       1.0/1.3753
    Single outage      1.0067/−4.9227      1.0185/−4.7066      1.0/−2.2617
                       1.0171/−9.1931 *    1.0373/−8.7828 *    1.0/−3.6459
                       0.9741/−5.9313      0.9930/−5.9910      1.0/−3.0517
                       1.0500/0.6871       1.0500/1.1818       1.0/3.2250
    Multi-outage       0.9936/−4.0240      1.0061/−3.8121      1.0/−1.8823
                       0.9581/−10.3693 *   0.9857/−10.0404 *   1.0/−5.5124
                       0.9050/−12.2557 *   0.9442/−12.5443 *   1.0/−4.6721
                       1.0500/3.3805       1.0500/3.8428       1.0/4.0918
    Parameter change   1.0120/−4.1638      1.0214/−3.9919      1.0/−5.6934
                       1.0051/−7.5489      1.0236/−7.3278      1.0/−8.9020 *
                       0.9864/−6.2728      1.0032/−6.3175      1.0/−7.5798
                       1.0500/−0.0180      1.0500/0.3987       1.0/0.6439

Chapter 6

Conclusions

New approaches for finding an accurate solution to linear and nonlinear programming and for solving power system control problems using artificial neural network techniques have been described in this dissertation. This chapter starts with a summary of the research work, followed by an identification of the major contributions of this work. A discussion of future research issues concludes the chapter.

6.1 Summary

Improving the performance of ANNs for linear and nonlinear programming, and developing a fast approach for large-scale system computations, are challenging problems in neurocomputing. Based on optimization theory and ANN technology, promising solutions to these problems have been obtained in this work. The computing accuracy of constrained optimization from the previous ANN models does not meet the demand of some applications. By analyzing the behavior of the conventional penalty function, which is often used in these ANN models, in the neighborhood of the feasible region boundary, the reason for its inherent degenerating accuracy has been discovered. Based on this, a new combination penalty function has been developed. This function can provide enough "penalty" to offset the decrease of
A formulation for the ANN nonlinear equation solver has been defined and proofs of local stability and convergence of the solver have been given which provide a sound theoretical basis. Based on these formulations, ANN architectures have been developed at the algo- rithm and structural level. The feedback neural network paradigm was used to express the dynamic behavior of the objective function. A constraint amplifier circuit for the derivative of the combination penalty function has been proposed for the constrained optimization network. The amplifier is simpler than previous versions due to the replacement of the radication operation. Another distinct feature of the architecture for the linear and nonlinear equation solvers is that the resistance connection matrix can be symmetric or asymmetric. The hardware complexity of the solvers has been estimated 138 as 0(n2a + nb) for the linear equation solver, and 0(n2a + nzb) for the nonlinear equation solver, where a(=6.25x10‘3mm2) is the unit area for a resistor, and b(=1.5x10’2mm 2) is the estimated unit area per neuron. The ANN linear and nonlinear equation solvers have be applied to solving the example power system control problems of power load flow and contingency analysis. The specific ANN formulations for the full power load flow, the decoupled load flow and the DC load flow have been established. The network stability for these formula- tions has been further analyzed. Moreover, the usefulness of the ANN linear equation solver has been extended. Other applications include matrix inversion, detemrining the stability of a linear control system, and determining matrix singularity. The computational correctness of the approaches and architectures have been verified by simulations and it was discovered that the relative errors for linear and non- linear programming examples are substantially reduced. 6.2 Contributions The major contributions of this research are: (1) An improved artificial neural network for linear and nonlinear programming has been developed. The reason for the degenerating computing accuracy of previous networks has been discovered and a new combination penalty func- tion has been proposed. Both play key roles in improving the computational accuracy for ANN constrained optimization. (2) This work has established a new link between ANN theory and unconstrained optimization. Correspondingly, an ANN linear equation solver has been developed at the algorithm and structural level, and a theoretical basis for the ANN nonlinear equation solver has been given. 139 (3) A direct ANN method (without training and testing) for power load flow and contingency analysis has been developed. The ANN formulations, their stabil- ity property and the relationship with the parameters of a power system have been provided. 6.3 Future Research To implement the approaches pr0posed in this dissertation, a new method for a variable resistor connection matrix is needed. Recently, experimental programmable arrays have been reported for this purpose [MaHY90, FiFS91]. However, their adju- stable range is too narrow to meet the demand of real problems. Studies in this area are continuing. Secondly, to take advantage of the program flexibility of the conven- tional computer, and the real-time computing property of the ANN equation solver, a hybrid interface should be developed. This interface should be able to provide data communication, A/D and D/A conversions, and computing task control between a digi- tal computer and a neural network. 
Another issue in implementation is to explore a new method for handling the I/O problem for large systems; system decomposition is the most likely starting point. Feedback-network-based learning is another challenging issue, as the capability of the network would be greatly enhanced with built-in learning; very few implementable algorithms have been presented in this area so far [BaAC91, SuCL91]. New research work includes some implementation-oriented efforts in architecture and device development.

Finally, the ANN linear equation solver described in this thesis may be further used for finding eigenvalues and associated eigenvectors. A relationship between the state trajectories of a neural network and the eigenvectors of the corresponding coefficient matrix can be obtained by extending the matrix singularity analysis. Based on this relationship and numerical techniques, eigenvalues can be found by solving a set of linear equations.

APPENDIX A

Proof of Theorem 5.1

Let x* be an equilibrium point. After the change of variable

    y = x − x*,                                                                         (a.1)

equation (5.14) can be written as

    ẏ = ẋ = −q(y),                                                                      (a.2)

where q(y) ≡ f(y + x*). Note that y = 0 is an equilibrium point of this system. It is easy to show that the following equality holds:

    J(x = x*) = J(y = 0).                                                               (a.3)

Using the mean value theorem for derivatives [Ap057], q(y) can be written as

    q(y) = q(0) + (∂q/∂y)(c) y
         = 0 + (∂q/∂y)(0) y + [ (∂q/∂y)(c) − (∂q/∂y)(0) ] y
         = J y + g(y),                                                                  (a.4)

where

    g(y) ≡ [ (∂q/∂y)(c) − (∂q/∂y)(0) ] y                                                (a.5)

and c is a point on the line segment from the origin to y. From the norm inequality relation and the assumed continuity of the derivative, the following qualitative expressions are obtained:

    ‖g(y)‖ ≤ ‖ (∂q/∂y)(c) − (∂q/∂y)(0) ‖ ‖y‖,                                            (a.6)

    lim_{‖y‖→0} ‖g(y)‖ / ‖y‖ = 0,                                                        (a.7)

and

    ‖g(y)‖ ≤ σ ‖y‖.                                                                     (a.8)

An eigen-positive J guarantees that there exists a solution P to the Lyapunov equation

    P J + J^T P = Q,                                                                    (a.9)

where P and Q are positive definite matrices [BaSt70]. Define the energy function

    E(y) = y^T P y,                                                                     (a.10)

where E(y) > 0 for y ≠ 0. The derivative of the energy function is

    Ė(y) = ẏ^T P y + y^T P ẏ
         = (−J y − g(y))^T P y + y^T P (−J y − g(y))
         = −y^T (J^T P + P J) y − 2 g^T(y) P y
         = −y^T Q y − 2 g^T(y) P y
         ≤ −λ_min(Q) ‖y‖^2 + 2 σ ‖P‖ ‖y‖^2,                                              (a.11)

where λ_min(Q) is the minimum eigenvalue of Q and ‖P‖ = λ_max(P) is the maximum eigenvalue of P. When σ satisfies

    σ < λ_min(Q) / (2 ‖P‖),                                                             (a.13)

then

    Ė(y) < 0.                                                                           (a.14)

Without loss of generality, let Q be the identity matrix; then λ_min(Q) = 1, and inequality (a.13) can be further simplified to

    σ < 1 / (2 ‖P_I‖),                                                                  (a.15)

where P_I is the positive definite matrix determined by equation (a.9) for Q = I. Inequality (a.15) guarantees that the energy function E(y) is a Lyapunov function. Therefore, the network based on equation (5.14) is stable in the neighborhood of the origin in the y-coordinate. When a starting point satisfies the condition

    ‖g(y)‖ < ‖y‖ / (2 ‖P_I‖),                                                           (a.16)

the dynamical system will approach an equilibrium point, the origin (y_e = 0), because of the change of variable. In the x-coordinate, the equilibrium point is

    x_e = y_e + x*,                                                                     (a.17)

which is the exact solution to the nonlinear equation. □

APPENDIX B

Table B.1. The bus data for the 39-bus system.

Number Name Qo Po Qc V.
Po pacified 0 01THETAG 1.000 1 OIALPHA 0.0 0.0 0.0 0.0 2 01 KAPPA 0.0 0.0 0.0 0.0 3 03ETA 3.22 0.024 0.0 0.0 4 03THETA 5.00 1.84 0.0 0.0 5 03IOTA 0.0 0.0 0.0 0.0 6 07GAMMA 0.0 0.0 0.0 0.0 7 06LAMMBDA 2.338 0.84 0.0 0.0 8 07MU 5.22 1.766 0.0 0.0 9 OSNU 0.0 0.0 0.0 0.0 10 05X1 0.0 0.0 0.0 0.0 1 l 05MICRON 0.0 0.0 0.0 0.0 12 06PI 0.085 0.88 0.0 0.0 13 07RHO 0.0 0.0 0.0 0.0 14 07510 MA 0.0 0.0 0.0 0.0 15 07TAU 3.20 1.53 0.0 0.0 16 06EPS 1LON 3 .294 0.323 0.0 0.0 17 06PH1 0.0 0.0 0.0 0.0 18 07CH1 1.58 0.30 0.0 0.0 19 041331 0.0 0.0 0.0 0.0 20 (”OMEGA 6.80 1.03 0.0 0.0 21 03ALPHA 2.74 1.15 0.0 0.0 22 06BETA 0.0 0.0 0.0 0.0 23 ()7BETA 2.47 0.846 0.0 0.0 24 06DELTA 3.086 -0.922 0.0 0.0 25 OlGAMMA 2.24 0.472 0.0 0.0 26 01 DELTA 1.39 0.17 0.0 0.0 27 04EPS [LON 2.81 0.755 0.0 0.0 28 01 EPS [LON 2.06 0.276 0.0 0.0 29 OIZETA 2.835 0.269 0.0 0.0 30 01 KAPPAG 0.0 0.0 2.50 1.362 1.046 31 07GAM MAG 0.092 0.046 5.73 2.0458 0.982 32 04XIG 0.0 0.0 6.5 2.1036 0.985 33 04PSIG 0.0 0.0 6.32 1.1579 0.999 34 040M EGAG 0.0 0.0 5.08 1.6 1.011 35 06BETAG 0.0 0.0 6.5 2.0967 1.050 36 07BETAG 0.0 0.0 5.6 1.0214 1.065 37 OlGAMMAG 0.0 0.0 5.4 0.0409 1.029 38 OIZETAG 0.0 0.0 8.3 0.236 1.028 145 Table 8.2. The branch data for the 39-bus system. umber Bus Bus R l l . . 11 1 A \l l 1 l I l 1 l 1 l 1 l 146 APPENDIX C Table C] The 7—bus system comparison results. Method ANN Newton Homotopy Bus 0: v0/80 1.00/0.00 1.00/0.00 1.00/0.00 Bus 1: v1/81 0.9435/0.2153 0.9435/0.22 0.9775/-3.2639 Bus 2: v2/82 0.8922/-6.4871 0.8923/—6.48 0.9203/-8.2981 Bus 3: v3/83 0.9207/-4.2131 0.9208/-4.21 0.9416/-5.9225 Bus 4: v4/84 0.9591/-1.2851 0.9592/—1.28 0.9732/-2.7507 Bus 5: v5/85 0.9538/-0.2468 0.9539/-0.25 0.9802/-2.9283 Bus 6: 126/86 1.0573/8.7592 1.0573/8.76 1.0887/5.0118 Relative Error 81: 0.1518% 82: 0.0007% 0.4788% 34.1466% Nma(UEI= ore—322:015—7 (2) 8 unit: degree. (3) Bus 0 is the slack bus. 147 Table C2 The 39-bus system comparison results. Method ANN Newton Bus 0 (slack bus) 1.00/0.00 1.00/0.00 Bus 1 l.0158/l.6351 1.0158/l.64 Bus 2 1.0154/4.3654 1.0154/4.37 Bus 3 0.9885/1.3287 O.9885/1.33 Bus 4 0.9516/0.4860 0.9516/0.49 Bus 5 0.9494/l.8284 0.9494/ 1.83 Bus 6 0.9510/2.6l93 0.9510/2.62 Bus 7 0.9411/0.1487 0.9411/0.15 Bus 8 0.9409/-0.4l89 0.9409/-0.42 Bus 9 0.9875/-0.2177 0.9875/~0.22 Bus 10 0.9599/5.2923 0.9599/5/29 Bus -1 1 0.9556/4.3801 0.9556/4.38 Bus 12 0.9363/4.3576 0.9363/4.36 Bus 13 0.9581/4.48l4 0.9581/4.48 Bus 14 0.9585/2.5919 0.9585/2.59 Bus 15 0.9681/2.0828 0.9681/208 Bus 16 0.9880/3.5988 0.9879/3.60 Bus 17 0.9914/2.5000 0.9914/2.50 Bus 18 0.9888/1.5884 0.9888/l.59 Bus 19 0.9903/88008 0.9903/8.80 Bus 20 0.9865/73955 0.9865/7.40 Bus 21 0.9952/6.1719 0.9952/7.l7 Bus 22 1.0220/ 10.8770 1.0220/1088 Bus 23 1.0208/ 10.6511 l.0208/10.65 Bus 24 0.9964/3.7238 0.9964/3.72 Bus 25 l.0258/5.7526 l.0258/5.75 Bus 26 1.0167/4.4514 1.0167/4.45 Bus 27 0.9988/2.3169 0.9988/2.32 Bus 28 l.0194/8.1715 1.0194/8.l7 Bus 29 1.0213/1 1.0836 1.0213/1108 Bus 30 (generator) 1.0460/68071 1.0460/681 Bus 3] (generator) 0.9820/11.3()00 0.9820/11.30 Bus 32 (generator) 0.9850/ 13.1951 0.985/ 13.20 Bus 33 (generator) 0.9990/ 13.9817 0.9990/13.20 Bus 34 (generator) 1.01 lO/l2.5860 1.0110/12.59 Bus 35 (generator) l.0500/ 15.8461 l.0500/15.85 Bus 36 (generator) 1.0650/ 18.6514 1.0650/18.65 Bus 37 (generator) l.0290/ 12.5581 1.0290/12.56 Bus 38 (generator) 1.0280/ 18.1446 1.0280/18.15 Relative Error 8;: 4.47% 82: 3.04% 6.49% Note: (1)8, = 0.1E-3, £2 = 0.1E—7. (2) voltage/angle, 8 unit: degree. 
BIBLIOGRAPHY

[Aget89] M. E. Aggoune, et al., "Artificial Neural Networks for Power System Static Security Assessment," ISCAS, pp. 490-494, 1989.
[AiNF90] S. V. B. Aiyer, M. Niranjan, and F. Fallside, "A Theoretical Investigation into the Performance of the Hopfield Model," IEEE Trans. on Neural Networks, Vol. 1, No. 2, pp. 204-215, June 1990.
[AlHo87] P. E. Allen and D. R. Holberg, CMOS Analog Circuit Design, Holt, Rinehart & Winston, New York, 1987.
[Ang89] J. E. Angus, "On the Connection between Neural Network Learning and Multivariate Nonlinear Least Squares Estimation," Neural Networks, Vol. 1, No. 1, January 1989.
[AnRo88] J. A. Anderson and E. Rosenfeld, Neurocomputing: Foundations of Research, The MIT Press, Cambridge, MA, 1988.
[Apo57] T. M. Apostol, Mathematical Analysis, Addison-Wesley, Reading, Massachusetts, 1957.
[AtSu89] L. E. Atlas and Y. Suzuki, "Digital Systems for Artificial Neural Networks," IEEE Circuits and Devices Magazine, November 1989.
[BaAC91] V. C. Barbosa, L. Alfredo, and V. Carvalho, "Learning in Analog Hopfield Networks," Proceedings of the International Joint Conference on Neural Networks, II, pp. 183-186, 1991.
[BaEM89] M. L. Baughman, N. A. Eisner and P. S. Merrill, "Optimizing Combined Cogeneration and Thermal Storage System: An Engineering Economic Approach," IEEE Trans. on Power Systems, Vol. 4, pp. 974-980, August 1989.
[BaSh79] M. S. Bazaraa and C. M. Shetty, Nonlinear Programming Theory and Application, John Wiley & Sons, New York, 1979.
[BaSt70] S. Barnett and C. Storey, Matrix Methods in Stability Theory, Barnes & Noble, Inc., New York, 1970.
[BiVo91] R. H. Bisseling and J. G. Vorst, "Parallel Triangular System Solving on a Mesh Network of Transputers," SIAM Journal on Scientific and Statistical Computing, Vol. 12, No. 4, pp. 787-799, July 1991.
[Bla86] K. Blanchard, The One-Minute Manager, Morrell Publishing, New York, 1986.
[Bret91] V. Brandwajn, et al., "Pre-Screening of Single Contingencies Causing Network Topology Changes," IEEE Trans. on Power Systems, Vol. 6, No. 1, pp. 30-36, February 1991.
[BuHW82] R. C. Burchett, H. H. Happ and K. A. Wirgau, "Large Scale Optimal Power Flow," IEEE Trans. on Power Apparatus and Systems, Vol. 101, pp. 3722-3732, October 1982.
[CaGr87] G. Carpenter and S. Grossberg, "A Massively Parallel Architecture for a Self-organizing Neural Pattern Recognition Machine," Computer Vision, Graphics and Image Processing, Vol. 37, pp. 54-115, 1987.
[Che84] C.-T. Chen, Linear System Theory and Design, Holt, Rinehart & Winston, Inc., New York, 1984.
[ChLi84] L. O. Chua and G-N. Lin, "Nonlinear Programming without Computation," IEEE Trans. on Circuits and Systems, Vol. 31, pp. 182-188, February 1984.
[ChMS91] C. Chiu, C.-Y. Maa, and M. A. Shanblatt, "Energy Function Analysis of Dynamic Programming Neural Networks," IEEE Trans. on Neural Networks, Vol. 2, No. 4, pp. 418-426, July 1991.
[ChMS92] J. Chen, C-Y. Maa, and M. A. Shanblatt, "Improved Artificial Neural Networks for Linear and Nonlinear Programming," to appear, International Journal of Neural Systems, Vol. 2, No. 4, 1992.
[ChSh92a] J. Chen and M. A. Shanblatt, "Solving Linear Systems Using an Artificial Neural Network," in review, IEEE Trans. on Neural Networks, 1992.
[ChSh92b] J. Chen and M. A. Shanblatt, "Artificial Neural Network Techniques for Power Load Flow and Contingency Analysis," in review, IEEE Trans. on Power Systems, 1992.
[ChYa88a] L. O. Chua and L. Yang, "Cellular Neural Networks: Theory," IEEE Trans. on Circuits and Systems, Vol. 35, pp. 1257-1271, October 1988.
[ChYa88b] L. O. Chua and L. Yang, "Cellular Neural Networks: Applications," IEEE Trans. on Circuits and Systems, Vol. 35, pp. 1273-1290, October 1988.
[CoDK86] G. C. Contaxis, C. Delkis and G. Korres, "Decoupled Optimal Load Flow Using Linear or Quadratic Programming," IEEE Trans. on Power Systems, Vol. 1, pp. 1-7, May 1986.
[Coet86] G. C. Contaxis, et al., "Decoupled Optimal Load Flow Using Linear or Quadratic Programming," IEEE Trans. on Power Systems, Vol. 1, pp. 1-7, May 1986.
[CoRo87] P. Comon and Y. Robert, "A Systolic Array for Computing BA^-1," IEEE Trans. on Acoustics, Speech, and Signal Processing, Vol. 35, No. 6, pp. 717-723, June 1987.
[Cra56] S. H. Crandall, Engineering Analysis: A Survey of Numerical Procedures, McGraw-Hill, Inc., New York, 1956.
[Cro80] J. Cronin, Differential Equations, Marcel Dekker, Inc., New York, 1980.
[DaDe91] J. Dalton and A. Deshmane, "Artificial Neural Networks," IEEE Potentials, Vol. 10, No. 2, April 1991.
[Deb88] A. S. Debs, Modern Power Systems Control and Operation, Kluwer Academic Publishers, Boston, 1988.
[Dec89] A. L. DeCegama, Parallel Processing Architectures and VLSI Hardware, Prentice Hall, Englewood Cliffs, New Jersey, 1989.
[ELMA91] M. A. El-Sharkawi and R. J. Marks II (Editors), IEEE Proceedings of the First International Forum on Applications of Neural Networks to Power Systems, Seattle, WA, 1991.
[EOMD91] M. A. El-Sharkawi, S. Oh, R. J. Marks II and M. J. Damborg, "Short Term Electric Load Forecasting Using an Adaptively Trained Layered Perceptron," IEEE Proceedings of the First International Forum on Applications of Neural Networks to Power Systems, Seattle, WA, 1991.
[Fiet89] R. Fischl, et al., "Screening Power System Contingencies Using a Back-Propagation Trained Multiperceptron," ISCAS, pp. 486-489, 1989.
[FiFS91] W. A. Fisher, R. J. Fujimoto, and R. C. Smithson, "A Programmable, Analog Neural Network Processor," IEEE Trans. on Neural Networks, Vol. 2, No. 2, pp. 222-229, March 1991.
[FoMo67] G. E. Forsythe and C. B. Moler, Computer Solution of Linear Algebraic Systems, Prentice-Hall, Englewood Cliffs, NJ, 1967.
[FoTa88] Y. S. Foo and Y. Takefuji, "Integer Linear Programming Neural Networks for Job-Shop Scheduling," IEEE International Conference on Neural Networks, Vol. II, pp. 341-348, 1988.
[GaGr87] G. Carpenter and S. Grossberg, "A Massively Parallel Architecture for a Self-organizing Neural Pattern Recognition Machine," Computer Vision, Graphics and Image Processing, Vol. 37, pp. 54-115, 1987.
[GoLo83] G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, 1983.
[GrKu89] S. Grossberg and M. Kuperstein, Neural Dynamics of Adaptive Sensory-Motor Control, Pergamon Press, Inc., Elmsford, New York, 1989.
[Gro76] S. Grossberg, "Adaptive Pattern Classification and Universal Recoding: Parallel Development and Coding of Neural Feature Detectors," Biological Cybernetics, Vol. 23, pp. 121-134, 1976.
[Guo90] S. Guo, Determining the Steady Solutions of Nonlinear Models of Power Systems, Ph.D. Dissertation, EE Dept., Michigan State University, East Lansing, 1990.
[Hec87] R. Hecht-Nielsen, "Counterpropagation Networks," IEEE First International Conference on Neural Networks, Vol. 2, pp. 19-32, 1987.
[Hec90] R. Hecht-Nielsen, Neurocomputing, Addison-Wesley, Reading, MA, 1990.
[HiLi86] F. S. Hillier and G. J. Lieberman, Introduction to Operations Research, Holden-Day, Inc., Oakland, CA, 1986.
[HiSe86] G. Hinton and T. Sejnowski, "Learning and Relearning in Boltzmann Machines," Parallel Distributed Processing: Explorations in the Microstructure of Cognition, edited by Rumelhart and McClelland, pp. 282-317, MIT Press, 1986.
[HiSm74] M. W. Hirsch and S. Smale, Differential Equations, Dynamical Systems, and Linear Algebra, Academic Press, New York, 1974.
[Hop82] J. J. Hopfield, "Neural Networks and Physical Systems with Emergent Collective Computational Abilities," Proceedings of the National Academy of Sciences, Vol. 79, pp. 2554-2558, 1982.
[Hop84] J. J. Hopfield, "Neurons with Graded Response Have Collective Computational Properties Like Two-State Neurons," Proc. Natl. Acad. Sci., Vol. 81, pp. 3088-3092, 1984.
[HoTa85] J. J. Hopfield and D. W. Tank, "'Neural' Computation of Decisions in Optimization Problems," Biol. Cybern., Vol. 52, pp. 141-152, 1985.
[HuSh91] K. C. Hui and M. J. Short, "A Neural Networks Approach to Voltage Security Monitoring and Control," Proceedings of the First International Forum on Applications of Neural Networks to Power Systems, Seattle, WA, July 1991.
[HwCh80] K. Hwang and Y.-H. Cheng, "VLSI Computing Structures for Solving Large-Scale Linear System of Equations," Proceedings of International Conference on Parallel Processing, pp. 217-227, 1980.
[HwCh82] K. Hwang and Y.-H. Cheng, "Partitioned Matrix Algorithms for VLSI Arithmetic Systems," IEEE Trans. on Computers, Vol. 31, pp. 1215-1224, December 1982.
[HwBr84] K. Hwang and F. A. Briggs, Computer Architecture and Parallel Processing, McGraw-Hill Inc., New York, 1984.
[Jac88] A. Jackson, "Machine Learning," Expert Systems, Vol. 5, pp. 132, Oxford, May 1988.
[JeJo90] C-W. Jen and S-J. Jou, "Design of One-Dimensional Systolic-Array Systems for Linear State Equations," IEE Proceedings, Vol. 137, pp. 185-192, June 1990.
[Jen77] A. Jennings, Matrix Computation for Engineers and Scientists, John Wiley & Sons, New York, 1977.
[JoJe88] S-J. Jou and C-W. Jen, "Design of a Systolic Array System for Linear State Equations," IEE Proceedings, Vol. 135, pp. 211-218, October 1988.
[Joh87] S. L. Johnson, "Solving Tridiagonal Systems on Ensemble Architectures," SIAM Journal on Scientific and Statistical Computing, No. 8, pp. 354-392, 1987.
[JoRi82] L. W. Johnson and R. D. Riess, Numerical Analysis, 2nd ed., Addison-Wesley, Reading, Massachusetts, 1982.
[KaBr89] K. N. Karna and D. M. Breen, "An Artificial Neural Networks Tutorial: Part 1 - Basics," Neural Networks, Vol. 1, No. 1, pp. 4-23, 1989.
[KeCh87] M. P. Kennedy and L. O. Chua, "Unifying the Tank and Hopfield Linear Programming Circuit and the Canonical Nonlinear Programming Circuit of Chua and Lin," IEEE Trans. on Circuits and Systems, Vol. 34, pp. 210-214, February 1987.
[KeCh88] M. P. Kennedy and L. O. Chua, "Neural Networks for Linear and Nonlinear Programming," IEEE Trans. on Circuits and Systems, Vol. 35, pp. 554-562, May 1988.
[Koh87] T. Kohonen, "State of the Art in Neural Computing," IEEE First International Conference on Neural Networks, Vol. 1, pp. 81, 1987.
[Kos87] B. Kosko, "Competitive Adaptive Bi-directional Associative Memories," IEEE First International Conference on Neural Networks, Vol. 2, pp. 759-766, 1987.
[Lip87] R. P. Lippmann, "An Introduction to Computing with Neural Nets," IEEE ASSP Magazine, April 1987.
[LiZS91] T.-Y. Li, H. Zhang, and X.-H. Sun, "Parallel Homotopy Algorithm for the Symmetric Tridiagonal Eigenvalue Problem," SIAM Journal on Scientific and Statistical Computing, Vol. 12, No. 3, pp. 469-487, May 1991.
[MaHY89] A. Masaki, Y. Hirai, and M. Yamada, "Neural Networks in CMOS: A Case Study," IEEE Circuits and Devices Magazine, Vol. 6, pp. 12, July 1990.
[MaSh89] C-Y. Maa and M. A. Shanblatt, "Adjustment of Parameters for Signal Decision Networks," Proceedings of International Joint Conference on Neural Networks, Washington, D.C., Vol. II, pp. 589, 1989.
[MaSh90] C-Y. Maa and M. A. Shanblatt, "A Neural Net Technique for Economic Power Dispatching with Consideration of Transmission Losses," in publication, 1990.
[MaSh91] C-Y. Maa and M. A. Shanblatt, "Linear and Quadratic Programming Neural Network Analysis," to appear, IEEE Trans. on Neural Networks, 1992.
[Mea88] C. A. Mead, Analog VLSI and Neural Systems, Addison-Wesley, 1988.
[MeIs89] C. Mead and M. Ismail, Analog VLSI Implementation of Neural Systems, Kluwer Academic Publishers, 1989.
[MiPa69] M. Minsky and S. Papert, Perceptrons, MIT Press, 1969.
[MoAG91] A. Moore, J. Allman, and R. M. Goodman, "A Real-Time Neural System for Color Constancy," IEEE Trans. on Neural Networks, Vol. 2, No. 2, pp. 237-247, March 1991.
[Moet87] A. Monticelli, et al., "Security-Constrained Optimal Power Flow with Post-Contingency Corrective Rescheduling," IEEE Trans. on Power Systems, Vol. 2, pp. 175-182, February 1987.
[Mol83] D. I. Moldovan, "On the Design of Algorithms for VLSI Systolic Arrays," Proceedings of the IEEE, Vol. 71, No. 1, January 1983.
[MoPG87] A. Monticelli, M. V. Pereira and S. Granville, "Security-Constrained Optimal Power Flow with Post-Contingency Corrective Rescheduling," IEEE Trans. on Power Systems, Vol. 2, pp. 175-182, February 1987.
[NaCh92] N. M. Nasrabadi and C. Y. Choo, "Hopfield Network for Stereo Vision Correspondence," IEEE Trans. on Neural Networks, Vol. 3, No. 1, pp. 5-13, January 1992.
[NeSY92] C. Neti, M. H. Schneider, and E. D. Young, "Maximally Fault Tolerant Neural Networks," IEEE Trans. on Neural Networks, Vol. 3, No. 1, pp. 14-23, January 1992.
[Ort87a] J. M. Ortega, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1987.
[Ort87b] J. M. Ortega, Matrix Theory, Plenum Press, New York, 1987.
[PaEM91] D. C. Park, M. A. El-Sharkawi, and R. J. Marks II, "An Adaptively Trained Neural Network," IEEE Trans. on Neural Networks, Vol. 2, No. 3, May 1991.
[Par80] B. N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, Englewood Cliffs, NJ, 1980.
[Pret86] W. H. Press, et al., Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, New York, 1986.
[Rei87] D. K. Reinhard, Introduction to Integrated Circuit Engineering, Houghton Mifflin Company, Boston, 1987.
[Roet90] A. Rodriguez-Vazquez, et al., "Nonlinear Switched-Capacitor 'Neural' Networks for Optimization Problems," IEEE Trans. on Circuits and Systems, Vol. 37, pp. 384-398, March 1990.
[RoKa82] A. Rosenfeld and A. C. Kak, Digital Picture Processing, Academic Press, New York, 1982.
[Rot91] M. W. Roth, "Analogical Versus Logical or Connectionist Versus Symbolic or Scruffy Versus Neat," IEEE Trans. on Neural Networks, Vol. 2, No. 6, pp. 545-546, November 1991.
[RuHW86] D. Rumelhart, G. Hinton and R. Williams, "Learning Internal Representations by Error Propagation," Parallel Distributed Processing: Foundations, edited by Rumelhart and McClelland, pp. 318-362, MIT Press, 1986.
[Saet89] F. M. Salam, et al., "Parallel Processing for the Steady State Solutions of Large-Scale Non-Linear Models of Power Systems," IEEE International Symposium on Circuits and Systems, Portland, May 1989.
"Parallel Processing for the Steady State Solutions of Large-Scale Non-Linear Models of Power Systems," IEEE International Symposium on Circuits and Systems, Portland, May, 1989. P. Saratchandran, "Dynamic Programming Approach to Optimal Weight Section in Multilayer Neural Networks," IEEE Trans. on Neural Networks, Vol. 2, No. 4, pp. 465-466, July, 1991. [Sca66] [She88] [SoPa89] [808088] [SrLC91] [Sti63] [Sto73] [StPi63] [SuCL91] [TaH086] [TaTr91 ] 156 J. B. Scarborough, Numerical Mathematical Analysis, The Johns Hopkins Press, Baltimore, 1966. J. F. Shepanski, "Fast Learning in Artificial Neural Systems, Multilayer Perceptron Training Using Optimal Estimation," Proceedings of the Inter- national Conference on Neural Networks, 1., pp. 465-475, IEEE Press, New York, July 1988. D. J. Sobajic and Y-H. Pao, "Artificial Neural-Net Based Dynamic Security Assessment for Electric Power System," IEEE Trans. on Power Systems, Vol. 4, pp. 220-228, February, 1989. B. Soucek and M. Soucek, Neural and Massively Parallel Computers: The Six Generation, John Wiley & Sons, New York, 1988. D. Srinivasan, A. C. Liew, and J. S. P. Chen, "Short Term Forecasting Using Neural Network Approach," Proceedings of the First International Forum on Applications of Neural Networks to Power Systems, Seatle, WA, July, 1991. E. L. Stiefel, An Introduction to Numerical Mathematics, Academic Press, New York, 1963. H. S. Stone, "An Efficient Parallel Algorithm for the Solution of a Tridiag- onal Linear System of Equations," Journal of ACM, Vol. 20, No. 1, pp. 27-38, 1973. K. Steinbuch and V. A. W. Piske, "Learning Matrices and Their Applica- tions," IEEE Trans. on Electron. Comput., vol. 12, pp. 846-862, 1963. G.-Z. Sun, H.-H. Chen, and Y.-C. Lee, "A fast On-Line Learning Algo- rithm for Recurrent Neural Networks," Proceedings of the International Joint Conference on Neural Networks, II, pp.13-l8, 1991. D. W. Tank and J. I. Hopfield, "Simple ’Neural’ Optimization Networks: An A/D Converter, Signal Decision Circuit, and a Linear Programming Circuit," IEEE Trans. on Circuits and Systems, Vol. 33, pp. 533—541, May 1986. A. Tabatabai and T. P. Troudet, "A Neural Net Based Architecture for the Segmentation of Mixed Gray-Level and Binary Pictures," IEEE Trans. on [TOW188] [TrWa91 ] [WeEl9 1] [WiHo60] [W iLe90] [WiPa88] [WiRe71] [Wri91] 157 Circuits and Systems, Vol. 38, No. 1, pp. 66-77, 1991. V. V. Tolat and B. Widrow, "An Adaptive ’Broom Balancer’ with Visual Inputs," Proceedings of the International Conference on Neural Networks, 11, pp641-647, IEEE Press, New York, July 1988. T. P. Troudet and S. M. Walters, "Neural Network Architecture for Crossbar Switch Control," IEEE Trans. on Circuits and Systems, Vol. 38, No. 1, pp. 42-56, January 1991. S. Weerasooriya and M. A. El-Sharkawi, "Use of Karhunen-Loe‘ve Expansion in Training Neural Networks for Static Security Assessment," Proceedings of the First International Forum on Applications of Neural Networks to Power Systems, Seatle, WA, July, 1991. B. Widrow and M. Hoff, "Adaptive Switching Circuits," 1960 IRE WES- CON Connection Record, Part 4, pp. 96-104, New York, 1960. B. Widrow and M. A. Lehr, "30 Years of Adaptive Neural Networks: Per- ceptron, Madaline, and Backpropagation," Proceedings of the IEEE, Vol. 78, No. 9, pp. 1415-1442, September, 1990. V. Wilson and G. S. Pawley, "On the Stability of the TSP Problem Algo- rithm of Hopfield and Tank," Biol. Cybern., Vol. 58, pp. 63-70, 1988. J. H. Wilkinson and C. Reinsch, Linear Algebra, Vol. 
[Wri91] S. J. Wright, "Parallel Algorithms for Banded Linear Systems," SIAM Journal on Scientific and Statistical Computing, Vol. 12, No. 4, July 1991.