ADAPTIVE CONTROL OF NONLINEAR SYSTEMS USING NEURAL NETWORKS

By

Fu-Chuang Chen

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Electrical Engineering

1990

ABSTRACT

ADAPTIVE CONTROL OF NONLINEAR SYSTEMS USING NEURAL NETWORKS

By Fu-Chuang Chen

Layered neural networks are used in the adaptive control of nonlinear discrete-time systems. The control algorithm is described and two convergence results are provided. The first result shows that the plant output converges to zero in the adaptive regulation system. The second result shows that the error between the plant output and the reference command converges to a bounded ball in the adaptive tracking system. Computer simulations at the end of this thesis verify the theoretical results.

To my parents, Chan-Chwang Chen and Hu Li-Shueh Chen, and my wife, Hway-Ming Ker

ACKNOWLEDGEMENTS

I wish to thank Dr. Hassan K. Khalil for his patience and guidance, and my committee members, Dr. R. O. Barr, Dr. P. M. FitzSimons, Dr. C. R. MacCluer, Dr. F. M. A. Salam, and Dr. R. Schlueter, for their help and valuable suggestions.

TABLE OF CONTENTS

LIST OF FIGURES ............................................ vii
1. INTRODUCTION ............................................ 1
   1.1 Neural Computing Research ............................... 2
   1.2 Neural Networks in Control ............................... 3
   1.3 Feedback Linearization of Minimum-Phase Nonlinear Discrete-Time Systems ........................... 9
2. LINEARIZING FEEDBACK CONTROL ......................... 12
3. ADAPTIVE CONTROL USING NEURAL NETWORKS .............
20
   3.1 Method 1 ............................................ 20
      3.1.1 Control Law for Method 1 ......................... 23
      3.1.2 Updating Rule for Method 1 ....................... 25
   3.2 Method 2 ............................................ 26
      3.2.1 Control Law for Method 2 ......................... 27
      3.2.2 Updating Rule for Method 2 ....................... 27
   3.3 Comparison between Method 1 and Method 2 ................ 28
4. CONVERGENCE RESULT: PART ONE ......................... 29
5. CONVERGENCE RESULT: PART TWO ......................... 40
6. SIMULATION .............................................. 56
   6.1 Identification ......................................... 57
   6.2 Regulation using Neural Networks without Bias Weights ......... 63
   6.3 Regulation using Neural Networks with Bias Weights ........... 71
   6.4 Tracking: 1. The Plant is Stable ........................... 78
   6.5 Tracking: 2. The Plant is Unstable ......................... 85
   6.6 Controlling a Relative-Degree-Two System: 1. The Pendulum ..... 92
   6.7 Controlling a Relative-Degree-Two System: 2. The Inverted Pendulum ............................. 96
7. CONCLUSION ............................................ 100
REFERENCES ............................................... 102

LIST OF FIGURES

Figure 1.1 .................................................... 3
Figure 1.2 .................................................... 6
Figure 1.3 .................................................... 7
Figure 1.4 .................................................... 8
Figure 3.1 ................................................... 21
Figure 3.2 ................................................... 23
Figure 3.3 ................................................... 24
Figure 3.4 ................................................... 26
Figure 5.1 ................................................... 42
Figure 5.2 ...................................................
44
Figure 6.1 ................................................... 56
Figure 6.2 ................................................... 60
Figure 6.3 ................................................... 61
Figure 6.4 ................................................... 62
Figure 6.5 ................................................... 66
Figure 6.6 ................................................... 67
Figure 6.7 ................................................... 68
Figure 6.8 ................................................... 69
Figure 6.9 ................................................... 70
Figure 6.10 .................................................. 73
Figure 6.11 .................................................. 74
Figure 6.12 .................................................. 75
Figure 6.13 .................................................. 76
Figure 6.14 .................................................. 77
Figure 6.15 .................................................. 79
Figure 6.16 .................................................. 80
Figure 6.17 .................................................. 81
Figure 6.18 .................................................. 82
Figure 6.19 .................................................. 83
Figure 6.20 .................................................. 84
Figure 6.21 .................................................. 86
Figure 6.22 .................................................. 87
Figure 6.23 .................................................. 88
Figure 6.24 .................................................. 89
Figure 6.25 .................................................. 90
Figure 6.26 .................................................. 91
Figure 6.27 .................................................. 92
Figure 6.28 .................................................. 94
Figure 6.29 ..................................................
95
Figure 6.30 .................................................. 97
Figure 6.31 .................................................. 98
Figure 6.32 .................................................. 99

1 Introduction

Linearization by feedback [15] is a promising approach to the control of nonlinear systems. The essence of the idea is to transform a state space model of the plant into new coordinates in which the nonlinearities can be canceled (fully or partially) by feedback. The major challenge in performing such cancellation is the need for precise models of the nonlinearities. One approach to this challenge is adaptive control, in which the controller learns the nonlinearities on line. This idea has been investigated for continuous-time systems [17,18] under the assumption that the nonlinearities can be parametrized linearly in some unknown parameters. In this thesis we investigate a similar scheme for discrete-time systems, but we do not assume that the nonlinearities depend linearly on unknown parameters. Instead, we explore the use of layered neural networks to model the nonlinearities. In the discrete-time self-tuning adaptive control scheme, the linearizing control is generated from the information provided by the neural network. Then the observed error is used to train the neural network to improve its approximation of the unknown nonlinear plant.

A review of neural network research is given in sections 1.1 and 1.2. Section 1.3 provides some background on feedback linearization. In chapter 2 we derive an output feedback linearizing control for a discrete-time nonlinear system represented by an input-output model. The relative degree of the system may be higher than one. In chapter 3, a neural network architecture is suggested for modeling nonlinear systems. We describe two different methods for applying layered neural networks to adaptive control problems and provide the associated learning rules.
The theoretical results of this research are presented in chapters 4 and 5, where we show local convergence properties based on different network models and different learning rules. Simulation results are provided in chapter 6. Chapter 7 concludes the thesis.

1.1 Neural Computing Research

Theoretical brain research has been published in the contexts of theoretical biology, mathematical psychology, cybernetics, pattern recognition, the theory of adaptive systems, and others. Recently the terms “neural computing” and “neural networks” have been adopted to address more practical issues such as vision, sensory-motor control, associative memory, supervised learning, unsupervised learning, robotics, etc. Although these studies have rather diverse origins, they often share one common objective: to implement new types of computers.

Traditional digital computers do very well on tasks for which we know explicit solution procedures. However, it is very difficult to program a digital computer to solve problems such as vision and speech recognition. The reasons are: first, we do not have enough information about how these tasks are actually performed in the brain; second, even if sufficient knowledge about the function of the brain were available, it might not be computable by digital computers. Decades of research in artificial intelligence support these arguments. The most successful subfield of AI is expert systems; expert systems are programs that solve specific problems using information collected from domain experts. In contrast, the results are much more limited when AI techniques are applied to vision and language understanding problems.

Artificial neural networks are networks of interconnected processing elements (i.e., “neurons”). Each neuron can have multiple input signals, but only one output signal. Different interconnection topologies and learning rules determine different neural network paradigms.
Artificial neural networks are considered models of the brain, and they are intended to interact with the real world in the same way biological nervous systems do (at least, that was the original purpose). Most existing neural network architectures are constructed to reproduce specific brain functions. Since our knowledge about how the brain functions is still very limited, existing artificial neural networks may be too simple compared with their biological counterparts. However, as our knowledge and experience increase, new and more sophisticated neural network models will replace the old ones.

“Neural computing” became a very active research area starting in the mid-1980s. Not all research efforts in this field are biologically motivated. In particular, in engineering applications, artificial neural networks can be viewed as new tools for attacking traditionally difficult problems. Historical reviews and current developments in neural computing can be found in [20,21,22].

1.2 Neural Networks in Control

Figure 1.1 A layered neural network with two nonlinear hidden layers

In this section we concentrate on layered neural networks, since they are the most prevalent network architecture studied for identification and control applications. A layered neural network, shown in Figure 1.1, consists of an input layer, an output layer, and at least one layer of nonlinear neurons. The nonlinear neurons sum incoming signals and generate output signals according to some predefined functions. The neurons are interconnected layer by layer: the output of one neuron, multiplied by a weight, becomes an input of neurons in the next layer. Layered neural networks have good potential for control applications because they can approximate nonlinear functions.
It was noted more than two decades ago by Minsky and Papert [23] that by inserting “nonlinear hidden neurons” between the input layer and the output layer, the XOR mapping (which is a nonlinear mapping) can be represented by the network. Recently it has been shown by Funahashi [24], Cybenko [25], Hornik et al. [26], and Hecht-Nielsen [3], using different techniques, that layered neural networks can approximate any “well-behaved” nonlinear function to any desired accuracy. The theorem shown by Funahashi is quoted here.

Theorem
Let φ(x) be a nonconstant, bounded, and monotonically increasing continuous function. Let K be a compact subset of R^n and f(x_1, ..., x_n) be a real-valued continuous function on K. Then for any ε > 0, there exist an integer N and real constants c_i, θ_i (i = 1, ..., N), w_ij (i = 1, ..., N, j = 1, ..., n) such that

f̂(x_1, ..., x_n) = Σ_{i=1}^{N} c_i φ( Σ_{j=1}^{n} w_ij x_j − θ_i )    (1)

satisfies max_{x∈K} |f(x_1, ..., x_n) − f̂(x_1, ..., x_n)| < ε.

In other words, given any function f(x_1, ..., x_n) and any ε > 0, there exists a three-layer network f̂(x_1, ..., x_n) with linear input and output layers and a hidden layer whose output functions are φ(x), such that max_{x∈K} |f(x_1, ..., x_n) − f̂(x_1, ..., x_n)| < ε. A similar result for neural networks with more than one hidden layer can be derived from the theorem above or shown from scratch [24].

Notice that the theorem is an existence result. It does not give an estimate of the number of neurons needed to approximate a nonlinear function to a specified error bound, nor does it say how to choose the weights. In control applications some ad hoc procedures are used to determine a suitable network size. The next crucial issue is to train the network to approximate a given function. The back propagation algorithm [1] is a widely accepted method to train a neural network to approximate a function.
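To make the form in equation (1) concrete, here is a small self-contained sketch: a one-hidden-layer network of exactly that form, trained by plain gradient descent on squared error in the spirit of back propagation. All choices below (the sigmoid φ, twelve hidden units, the target sin(x), the learning rate) are arbitrary illustrations, not values from the text.

```python
import math
import random

def phi(x):
    """Sigmoid: a nonconstant, bounded, monotonically increasing function."""
    return 1.0 / (1.0 + math.exp(-x))

def f_hat(x, w, theta, c):
    """Equation (1): sum_i c_i * phi(w_i * x - theta_i), with n = 1 input."""
    return sum(ci * phi(wi * x - ti) for wi, ti, ci in zip(w, theta, c))

random.seed(0)
N = 12                                  # hidden-layer size (ad hoc, as the text notes)
w = [random.uniform(-2, 2) for _ in range(N)]
theta = [random.uniform(-2, 2) for _ in range(N)]
c = [random.uniform(-1, 1) for _ in range(N)]

# Target function on the compact set K = [-pi, pi], sampled on a grid.
xs = [-math.pi + i * math.pi / 20 for i in range(41)]

def max_err():
    return max(abs(math.sin(x) - f_hat(x, w, theta, c)) for x in xs)

err_before = max_err()
eta = 0.1
for _ in range(1000):                   # gradient descent on E = 0.5 * e^2
    for x in xs:
        e = f_hat(x, w, theta, c) - math.sin(x)
        for i in range(N):
            h = phi(w[i] * x - theta[i])
            dw = e * c[i] * h * (1 - h) * x     # dE/dw_i
            dth = -e * c[i] * h * (1 - h)       # dE/dtheta_i
            dc = e * h                          # dE/dc_i
            w[i] -= eta * dw
            theta[i] -= eta * dth
            c[i] -= eta * dc
err_after = max_err()
print(err_before, err_after)
```

With these (untuned) settings the maximum error over the grid shrinks as training proceeds, illustrating both the approximation form of the theorem and the gradient-based weight adjustment discussed in the text.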
If there is a difference between the function output and the network output for the same input, the difference can be used in the back propagation algorithm to adjust the weights of the neural network so as to reduce the error. The training is usually a time-consuming process, and researchers have suggested modifications to the original back propagation algorithm to increase the learning speed [32,33]. No theoretical result is yet available on the convergence of the training. However, many applications reported in the literature have confirmed the value of applying layered neural networks to various problems, e.g., [1-14].

Some recent papers on the application of neural networks to control and identification problems are reviewed in the following examples.

Example 1.1: [4,5,6] If the input vector U(k) of a nonlinear system can be uniquely determined from its output vector Y(k) through a static mapping U(k) = f(Y(k)), then layered neural networks can be used to learn this mapping and generate controls. For example, the dynamical equations of a robot can be rearranged into

U(k) = f(q(k), q̇(k), q̈(k)).

Figure 1.2 See Example 1.1 of Section 1.2 for description.

Example 1.2: [7] Consider a dynamical system

x(k+1) = f_1(x(k), u(k)).

Assume that the order of the plant (i.e., the number of states) is known, say n, and that the states are physically measurable. The state at time k + 2 is

x(k+2) = f_1(x(k+1), u(k+1)) = f_1(f_1(x(k), u(k)), u(k+1)) = f_2(x(k), u(k), u(k+1)).

Repeating the process, one concludes that the state at time k + n is determined by the state at time k and the controls from time k to k + n − 1, i.e.,

x(k+n) = f_n(x(k), U),    (2)

where U = [u(k), u(k+1), ..., u(k+n−1)]^T. Assume that equation (2) is uniquely invertible for U.
Then U can be solved as

U = g(x(k), x(k+n)).    (3)

Figure 1.3 See Example 1.2 of Section 1.2 for description.

A layered neural network, as shown in Figure 1.3, can be used to approximate (3):

U_c = ĝ(x(k), x(k+n), W),

where W denotes the network weights. At each time step k, the training of the neural network can be described as follows:

• Input the states x(k−n) and x(k) to the neural network.
• The error (U − U_c) is used to update the weights of the neural network.

It was suggested to implement this learning and control scheme on line. □

Example 1.3: [8,9] Dynamical systems identification. Layered neural networks can be used to identify a class of unknown nonlinear systems

y(k+1) = f(y(k), y(k−1), ..., y(k−n+1), u(k), u(k−1), ..., u(k−m+1)),

where n ≥ m. As depicted in Figure 1.4(a), this is essentially a function approximation problem. At each time step k, the control u(k) as well as all of the relevant past inputs and outputs are applied to the neural network input layer. The error between the output of the neural network and the plant output y(k+1) is used to train the network.

Figure 1.4 See Example 1.3 of Section 1.2 for description.

Adaptive control. For a special class of unknown nonlinear systems of the form

y(k+1) = f(y(k), y(k−1), ..., y(k−n+1)) + g(y(k), y(k−1), ..., y(k−n+1)) u(k),

where the control u(k) appears linearly, layered neural networks can be used in the self-tuning framework (Figure 1.4(b), with the Model block taken as 1) or in the model reference framework (Figure 1.4(b)) to adaptively control the system. The neural network is used to learn the characteristics of the plant on line and generate appropriate controls to be applied to the plant, either to cancel the plant dynamics (self-tuning) or to make the closed-loop system follow the output of a desired model (model reference).
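For the input-affine model above, the self-tuning idea reduces to a simple linearizing control law: u(k) = (y_m(k+1) − f̂(·)) / ĝ(·), where f̂ and ĝ are the network's estimates. A minimal sketch with a hypothetical first-order plant, in which the true f and g stand in for the trained estimates (everything below is made up for illustration):

```python
import math

# Hypothetical first-order plant y(k+1) = f(y(k)) + g(y(k)) * u(k).
# In the actual scheme f and g are unknown and a neural network supplies
# estimates f_hat, g_hat; here the true functions play that role so the
# structure of the control law is visible.
f = lambda y: 0.5 * math.sin(y)
g = lambda y: 2.0 + math.cos(y)          # bounded away from zero
ref = lambda k: math.sin(0.1 * k)        # reference command y_m(k)

y = 0.0
errs = []
for k in range(100):
    u = (ref(k + 1) - f(y)) / g(y)       # linearizing control: cancel f, scale by 1/g
    y = f(y) + g(y) * u                  # plant update gives y(k+1)
    errs.append(abs(y - ref(k + 1)))

max_track_err = max(errs)
print(max_track_err)
```

With perfect estimates the cancellation is exact and the tracking error stays at the level of floating-point roundoff; with learned estimates the error would instead shrink as training improves the approximation.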
Details about neural-network-based self-tuning adaptive control will be provided in chapter 3. □

There are other approaches suggested in recent papers. In [10] and [2], neural nets are used directly as controllers, but this approach bears a less direct connection to traditional control design methods. Other work includes the application of CMAC neural networks to robotics control problems [11] and to reinforcement learning problems [12]. Favorable simulation results related to these techniques are available in the references listed. The research in applying neural networks to control problems is still at the stage of proposing ways to incorporate neural networks into control systems. Few theoretical results are available to date, although there have been some attempts to obtain theoretical results [13,14].

1.3 Feedback Linearization of Minimum-Phase Nonlinear Discrete-Time Systems

The concepts of zero dynamics and the minimum-phase property for nonlinear continuous-time systems were introduced by Isidori and coworkers [15]. They were adapted to the discrete-time case by Monaco and Normand-Cyrot [16]. Consider a single-input/single-output system of the form

x(k+1) = f(x(k), u(k))    (4)
y(k) = h(x(k))

with x(k) ∈ R^n, u(k) ∈ R, y(k) ∈ R, and f and h analytic functions on their domains. Denote by f_0 the undriven state dynamics f(·, 0) and by f_0^j the j-times iterated composition of f_0. The system is said to be of relative degree d if

∂[h ∘ f_0^k ∘ f(x, u)] / ∂u ≡ 0 for 0 ≤ k < d − 1, and ∂[h ∘ f_0^{d−1} ∘ f(x, u)] / ∂u ≢ 0.

First, the relative degree d may be greater than 1. Second, since f_0 and g_0 depend on past inputs, the system may become internally unstable after the feedback control, if it exists, cancels the plant dynamics. These two issues are well known for linear discrete-time systems [19,27].
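The meaning of relative degree can be checked numerically. In a hypothetical linear plant y(k+1) = 0.5 y(k) + u(k−1) (not an example from the text), the input enters with a one-step internal delay, so d = 2 and a change in u(k) first appears in the output at time k + 2:

```python
# Hypothetical linear plant with relative degree d = 2:
#   y(k+1) = 0.5 * y(k) + u(k-1)
# A change in u(k) first influences the output at time k + 2.

def simulate(u, steps=10):
    y = [0.0]
    for k in range(steps):
        u_prev = u[k - 1] if k >= 1 else 0.0   # u(k-1), zero initial history
        y.append(0.5 * y[k] + u_prev)
    return y

u_a = [0.0] * 10
u_b = [0.0] * 10
u_b[3] = 1.0                                    # perturb the input at k = 3

y_a, y_b = simulate(u_a), simulate(u_b)
first_diff = min(k for k in range(len(y_a)) if y_a[k] != y_b[k])
print(first_diff)                               # prints 5, i.e. 3 + d with d = 2
```

The d-step delay between input and output is exactly why the control computed at time k can only be judged by its effect d steps later, which shapes the d-step-ahead form derived next.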
In particular, it is shown in [27] that the system (10) can be converted into

y_{k+d} = Σ_{i=0}^{n−1} α_i y_{k−i} + Σ_{i=0}^{m+d−1} β_i u_{k−i}    (11)
        = [ Σ_{i=0}^{n−1} α_i y_{k−i} + Σ_{i=1}^{m+d−1} β_i u_{k−i} ] + β_0 u_k.

Then the control u_k in (11) can be determined in terms of past inputs and past outputs to cancel the plant dynamics, and the effect of the control u_k will show up at the plant output d steps later.

The purpose of this section is to derive the nonlinear counterpart of (11) for the nonlinear system (9) and to define the zero dynamics associated with (9). The work of Monaco and Normand-Cyrot [16] suggests that important properties of system (9) may be revealed if the system is put into state space form and a suitable coordinate transformation is performed on the model. We select the state variables as the current output and all past inputs and outputs up to the most delayed input or output on the right-hand side of (9), i.e.,

x_1(k) = y_{k−n+1}
...
x_{n−1}(k) = y_{k−1}
x_n(k) = y_k
x_{n+1}(k) = u_{k−d−m+1}
...
x_{n+m+1}(k) = u_{k−d+1}
...
x_{n+m+d−1}(k) = u_{k−1}.

Let x(k) be the state vector. A state space model of (9) is constructed accordingly as

x_1(k+1) = x_2(k)
...
x_{n−1}(k+1) = x_n(k)    (12)
x_n(k+1) = f_0(x_n(k), x_{n−1}(k), ..., x_1(k), x_{n+m}(k), x_{n+m−1}(k), ..., x_{n+1}(k))
           + g_0(x_n(k), ..., x_1(k), x_{n+m}(k), ..., x_{n+1}(k)) x_{n+m+1}(k)
         = f_0(x_1(k), ..., x_{n+m}(k)) + g_0(x_1(k), ..., x_{n+m}(k)) x_{n+m+1}(k)
x_{n+1}(k+1) = x_{n+2}(k)
...
x_{n+m+1}(k+1) = x_{n+m+2}(k)
...
x_{n+m+d−1}(k+1) = u_k
y(k) = x_n(k).

There are (n + m + d − 1) states. The state space representation (12) is, in general, a nonminimal realization. However, no difficulty arises from working with this nonminimal realization since the redundant dynamics are stable (for linear systems the uncontrollable/unobservable eigenvalues are at the origin). In the following we derive a transformation that transforms system (12) into the form (7).
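As a sanity check on the realization (12), one can simulate a hypothetical plant of this form with n = 2, m = 1, d = 2 both directly from the input-output model and through the (n + m + d − 1 = 4)-state model, and confirm that the two outputs coincide. The functions f_0, g_0 and the input sequence below are made up for illustration:

```python
import math

# Hypothetical plant with n = 2, m = 1, d = 2:
#   y(k+1) = f0(y(k), y(k-1), u(k-2)) + g0(y(k), y(k-1), u(k-2)) * u(k-1)
# States per (12), n + m + d - 1 = 4 of them:
#   x1 = y(k-1), x2 = y(k), x3 = u(k-2), x4 = u(k-1)
f0 = lambda y0, y1, u2: 0.3 * math.sin(y0) + 0.1 * y1 * u2
g0 = lambda y0, y1, u2: 1.0 + 0.5 * math.cos(u2)

u = [math.sin(0.3 * k) for k in range(40)]

# Direct simulation of the input-output model (zero initial conditions).
y = [0.0, 0.0]
for k in range(1, 39):
    u2 = u[k - 2] if k >= 2 else 0.0
    y.append(f0(y[k], y[k - 1], u2) + g0(y[k], y[k - 1], u2) * u[k - 1])

# Simulation through the state space realization (12).
x1, x2, x3, x4 = 0.0, 0.0, 0.0, u[0]         # state at time k = 1
ys = {1: x2}
for k in range(1, 39):
    x1, x2, x3, x4 = x2, f0(x2, x1, x3) + g0(x2, x1, x3) * x4, x4, u[k]
    ys[k + 1] = x2                            # output: y(k) = x_n(k) = x2

max_gap = max(abs(y[k] - ys[k]) for k in range(1, 40))
print(max_gap)
```

The two simulations perform the same arithmetic in the same order, so the gap is exactly zero; the redundant states x3, x4 simply carry the delayed inputs forward, which is what makes the realization nonminimal but harmless.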
x_n(k+2) = y(k+2) = f_0(x_1(k+1), ..., x_n(k+1), ..., x_{n+m}(k+1))
                    + g_0(x_1(k+1), ..., x_n(k+1), ..., x_{n+m}(k+1)) x_{n+m+1}(k+1).    (13)

After substituting (12) into (13), we have

x_n(k+2) = y(k+2) = f_1(x_1(k), ..., x_{n+m+1}(k)) + g_1(x_1(k), ..., x_{n+m+1}(k)) x_{n+m+2}(k).

By applying the same technique recursively, one gets

x_n(k+3) = f_2(x_1(k), ..., x_{n+m+2}(k)) + g_2(x_1(k), ..., x_{n+m+2}(k)) x_{n+m+3}(k)
...
x_n(k+d−1) = f_{d−2}(x_1(k), ..., x_{n+m+d−2}(k)) + g_{d−2}(x_1(k), ..., x_{n+m+d−2}(k)) x_{n+m+d−1}(k).

Then the following state transformation is suggested,

z(k) = [ z_{1,1}(k), ..., z_{1,n}(k), z_{1,n+1}(k), ..., z_{1,n+d−1}(k), z_{2,1}(k), ..., z_{2,m}(k) ]^T
     = [ x_1(k), ..., x_n(k), x_n(k+1), ..., x_n(k+d−1), x_{n+1}(k), ..., x_{n+m}(k) ]^T,

where x_n(k+1) = f_0(·) + g_0(·) x_{n+m+1}(k), ..., x_n(k+d−1) = f_{d−2}(·) + g_{d−2}(·) x_{n+m+d−1}(k).