ON ALGORITHMS FOR NONLINEAR PREDICTION

Dissertation for the Degree of Ph. D.
MICHIGAN STATE UNIVERSITY
SHASHI PHOHA
1976

     
 

LIB Atlﬁ A? Y
Mitsheceam SEN-La»

Luv-8:31? as“

v WWI-1",”

This is to certify that the

thesis entitled

ON ALGORITHMS FOR NONLINEAR PREDICTION
presented by
Shashi Phoha

has been accepted towards fulﬁllment
of the requirements for

Ph.D. degree inwand
Probability

 

%/ fcéeé ’

Major professor

DateL/4/7/ﬂ
V // T

 

Ll/Zﬁj

G/d"

ABSTRACT
ON ALGORITHMS FOR NONLINEAR PREDICTION
By

Shashi Phoha

Any claim to prediction, i.e. to foretell the future in what-
ever limited sense, of a dynamic process, is based on a quantitative
understanding of the model. Most processes in nature are such that
their future behaviour is not completely determined by their past.
For such random phenomena the idea of perfect prediction is replaced
by that of a conditional distribution given the past.

In 1959 in their paper in the Harald Cramér volume, N. Wiener
and P. Masani formally defined the non-linear prediction problem for
a univariate strictly stationary stochastic process and obtained an
algorithm for determining the nonlinear predictor in terms of the
moments of the process. The corresponding non-linear prediction
problem for multivariate processes takes into consideration the in-
formation contained in the correlative behaviour of these variables.
In this thesis, by introducing certain operations on vectors to de—
fine a matrix-algebraic structure on the past, a determinate mathe-
matical problem is established and its solution demonstrated by a
corresponding algorithm.

In an attempt to obtain a more efficient algorithm for the

problem of non-linear prediction for multivariate processes we proceed

Shashi Phoha

along N. Wiener's idea of linearizing statistics of time-series to
obtain the non-linear predictor. The latter half of the thesis deals
with transferring the non-linear prediction problem of a univariate
stationary process to linear prediction of a related infinite-variate

process.

ON ALGORITHMS FOR NONLINEAR PREDICTION

By
-. ."\.
Q \' ilk?" “

Shashi Phoha

A DISSERTATION
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY
Department of Statistics and Probability

1976

TO MY PARENTS

ii

ACKNOWLEDGEMENTS

I wish to express my sincere thanks to Professor H. Salehi
whose constant encouragement and guidance made this work possible.
Sincere appreciation is also extended to Professor V. Mandrekar whose
valuable criticisms gave a shape and form to unformulated ideas. I
wish to thank Professors J. Hannan, H.L. Koul and J. Shapiro for
serving on my committee. And for the very understanding attitude
and personal interest one almost takes for granted here, I wish to
thank all members of the faculty. Let me also thank Mrs. Noralee

Burkhardt for her excellent typing of this thesis.

iii

TABLE OF CONTENTS

Page
INTRODUCTION . . . . . . . . . . . . . . . . . . . . . 1
Chapter
I. PRELIMINARIES . . . . . . . . . . . . . 5
II. MULTIVARIATE NONLINEAR PREDICTION . . . . . . . . . . 16
III. EXTENSION OF THE ALGORITHM FOR LINEAR PREDICTION
TO BANACH SPACE VALUED STATIONARY STOCHASTIC
PROCESSES O O O O O O O O O O 0 O I O O O O O O O O O 43
IV. ON LINEARIZING STATISTICS OF TIME-SERIES FOR
NONLINEAR PREDICTION . . . . . . . . . . . . . . . . . 56
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . 66

iv

INTRODUCTION

The ultimate objective of modeling a dynamic system is, of
course, to predict or control the output of the system by observing
and manipulating the inputs. As opposed to the hitherto acceptable
idealization that the laws of nature could be formulated linearly,
technology in its trends towards both refinements and magnifications
of scope and complexity, has been pointing with increasing insistence
to the fact that in the formulation of natural laws, modern require-
ments of precision forbid the suppression of nonlinear elements, for
the suitable formulations are almost without exception, nonlinear.
Fundamentals of nonlinear prediction theory as an outstanding con-
tribution of World War II, came out of N. Wiener's work on predicting
aircraft paths for purposes of fire control. The nonlinear theory is
still in a juvenile state. This is a difficult terrain in which the
going has been rough. But beyond the challenge to master difficulties,
the demand for clearing of paths is arising for more mundane considera-
tions. Wiener's work is geared towards applications. In particular,
the theory modified and extended in various ways, constitutes a tool
of considerable potential in forecasting and regulation.

In prediction theory of a stationary stochastic process,
there is first the problem of defining the prediction, i.e., of

showing that we have a determinate mathematical problem, and of

demonstrating its theoretical solution. There is then a theoretical
analysis of time and spectral domains. Finally, there is an attempt
to use the time and spectral analysis to determine the predictor at a
more efficient level. The program for linear prediction has largely
been carried out in the multivariate case by N. Wiener and P. Masani
[20], [21]. Linear predictors for weakly stationary stochastic pro—
cesses are obtained and a factorization of the spectral density under
the boundedness condition yields an algorithm for computing the gen-
erating function. An algorithm for computing the linear predictor

is then obtained from the generating function. This work was gen-
eralized to Hilbert space valued random variables by A.G. Miamee [7].
A needed generalization to random variables defined on a Banach space,
is the content of Chapter III.

.For nonlinear prediction of a univariate, strictly stationary,
discrete parameter stochastic process, work has been done at the
first level by N. Wiener and P. Masani [19]; and the ultimate objec-
tive to determine the relevant conditional expectation has been es-
tablished. The predictor, which is the conditional expectation, is
obtained in terms of the moments of the process by reducing the prob-
lem to a projection on the Lz-closure of the algebra generated by the
present and past of the process. In Chapter II of this thesis the
corresponding nonlinear problem is defined for multivariate random
variables. By introducing certain operators on matrices to define a
matrix-algebraic structure on the past, a determinate mathematical

problem is established and its solution demonstrated.

But a solution to the problem -- a solution as an algorithm
which produces a numerical estimate from numerical observations,
perhaps with the intercession of a digital computer, has not yet been
satisfactorily obtained in the literature. Chapters III and IV of
this thesis deal with transferring the univariate nonlinear prediction
to linear prediction of an infinite-variate stationary stochastic
process.

In his paper [18] in the Berkeley Symposium, N. Wiener sug-
gested a closely related method of linearizing statistics of time-
series to obtain the non-linear predictor. Chapter IV of the thesis
deals with this method. For real—valued, measurable, univariate
random variables {fn: —m < n < m}, with all powers of fn Lebesgue
integrable, and an ergodic measure preserving transformation T, he
suggested develOping a linear theory of multiple prediction for random
variables Xn whose components are products of all combinations of
f1(Tn1),...,fv(Tnv). The autocorrelations demanded are expressions

of the form

n n n
E{fl(T 1) x f2(T 2) x...x £v(r v)}.

A dense subspace of L2 generated by {fn: -m < n < m} and
consisting of finite products of {fn}:m, can then be arranged as an
infinite-variate stationary stochastic process {Xn: -m < n < m}
whose linear past at any stage is exactly the nonlinear past (poly-
nomials and their limits) of the univariate process {fnz -m < n < m}.
Due to duplication of subfactors in the components, however, the

vectors Xn(m) do not possess a Hilbert norm. Xn's may then possess

a Banach norm. The linear predictor for {Xn: -w < n < on} may then
be obtained by methods established in Chapter III, which further gives
the nonlinear predictor for the univariate process {fn: —m < n < 00}.
Conditions on the moments of {fn} that would guarantee the bounded-
ness condition for {Xn: -w < n < co} have not been obtained explicitly.
An example of a two state Markov chain sheds light on the procedures
and some of the intricacies involved.

Following the work of Wiener and Masani much interest has been
aroused in the areas of nonlinear prediction. Some of the more recent
'work related to this problem appears in [A], [8], [9], [10], [ll], [14],

[16].

CHAPTER I

PRELIMINARIES

While observing a random quantity associated with some long
enduring mechanism in nature, from the remote past to the present
moment, we obtain a sequence of readings {Xk: k = O,-l,-2,...}. For
the problems of predicting a future value based on this information
we shall be considering {xk: k = O,-1,-2,...} as part of a stationary
stochastic process {fnz n = O, + 1, + 2,...} defined on a probability

space (Q,F,P) such that
xn = fn(m0), mo in 9, -w < n < m .

Now fn may be square integrable and hence lie in some Hilbert space

H. Following Masani in [5] we have

1.1. The Gram-matricial structure of HN for a fixedgpositive

 

 

integer N.
Let
fl
HN={f=Z :f €H,1<i<N}.
i _ _.
fN

For f,g in HN, the matricial inner-product of the ordered

pair f,g is defined to be the N x N matrix

(fag) = [(f )]

O

i’ gj NXN

Orthogonality in HN is defined as
fig¢°(f,g)=0

where 0 denotes the N x N matrix all whose entries are 0. Also

define Hf“2 = trace (f,f). Linear combinations in ‘HN are formed

with N x N matrix coefficients.

A subset M of H” is a linear manifold if and only if
f,g E M: for any N x N matrices A,B, Af + Bge M .

1.2. Lemma. Ref. [20] Lemma 5.8 p. 131.

 

(i) M is a closed linear manifold of HN if and only if M = MN
where M is a closed linear manifold of H. In fact M consists
of all coordinates of elements of M.

(ii) If f is an element of HN and M is a closed linear manifold
of HN then there exists a unique f in HN satisfying one of

the following equivalent conditions

(1) feM and f-EIM

(2) feM and (f-f,f-f):(f-g,f-g) forany

gtE M-
Corollary.

In particular for H = L2(O,F,P), let L2 denote the space
HN. Then M is a closed linear manifold of L2 iff M - MN, where
M is the closed linear manifold of L consisting of all co-ordinates

2
of elements of M.

Notation.

f of Lemma 1.2(ii) is called the projection of f onto M
and is written as (flM).

In this thesis n will always denote an integer and

{-00 < n < 00} denotes the set {n: n = O, :11, i 2,...}.

1.2.1. Definition.

 

A N-variate weakly stationary stochastic process is a bi-

sequence {fn: -°° < n < 0°} such that fn€ HN and the Gram matrix

depends only on m-n.

Associated with a N-variate weakly stationary stochastic pro-
cess {fn: -w < n < 00} are the linear present and past subspaces Mn
for -m < n < m. For a fixed integer n, Mn is defined to be the

closed linear manifold of HN spanned by {sz k.: n}. Also

2. The Linear Prediction Problem.

 

Let {fn: -w < n < 0o} be a N-variate weakly stationary
stochastic process with covariance bisequence {Rk: -m < k < w} and

let vi: 1. Determine

(1) matrices Aim) such that
E = (f |MO) = lim :3 Aén)f_k
V “ n+m k=0

(11) cv = (fV - Ev, fv - Ev).

3. The Linear Prediction Problem for a Banach Space Valued Stationary

 

Stochastic Process.

3.1. Notation.

For a Banach space X let X* denote the Banach space of all
conjugate bounded linear functionals on X.

For Banach spaces X, Y, let B(X,V) denote the Banach space
of all bounded linear Operators on X to Y.

An Operator f in B(X,X*) is defined to be nonnegative if
for each x E X, (f(x)) (x) Z O.

In case X is a real Banach space we further assume for non—

negative operators that

(f(x)) (y) = (f(y)) (x) for all x,y in x.

We will alternately use the notation (f(x), y) for (f(x)) (y).
* *
Let 8+(X,X ) denote the set of all elements of B(X,X ) which
are nonnegative.

3.2. Definitions.

 

For a Banach space X and a Hilbert space K and A, B
*
elements of B(X,K), (A,B) is the unique bounded operator B A. de-

*
fined from X to X . It follows that
((A,B) X.y) = (Ax, By). v x G X.V 3'5 V -

Further let A l B if (A,B) = O.
3.3. Lemma.
Let {fnz -m < n < on} be a stochastic process such that for

each n, fn is defined on the probability space (Q,F§P) and takes

values in a Banach space V. If for each x* in y*, x*(fn) E L2(S2,F,P)
for -m < n < m, we may identify the process {fn: dw < n < co} with

a process {gn: —m < n < m} where an is in B(X,K) for X = V* and
K :- L2(O,F,P).

Proof.

Define 56: X + L2(Q,F,P) as follows

i: _ *f c * X
€n(X)(w)-X(n(m)), w~§2,x e .

3.4 Definition.

A bisequence {gnz -m < n < m} of elements of B(X,K) where
X is a Banach space and K is Hilbert space, is called a B(X,K)
valued weakly stationary stochastic process if the Operator §;gn in
B(X,X*) depends only on m-n. And then the Operator sequence
R(n) - 535“ defined for —m < n < m is called the covariance bi-
sequence of the process.

3.4.1. Assumption.

 

Assume from now on that X is separable.
3.4.2. The concepts and theorems from here on up to the end of Section
5 are outlined by A.G. Miamee [7].

.With the stationary stochastic process {§n: -m < n < m} are
associated the following subspaces

M‘Io the subspace of K spanned by {gk(x): -co < k < co, x6 X}.

Mn the subspace of K spanned by {§k(x): -m < k: n, x6 X},

and M_OD = I] M .
-co<n<oo

10

3.4.3. The process {Enz -w < n < co} is said to be
(i) singular if M-” = Mn for -m < n < m

(ii) nondeterministic if M_co ;:Mn for some finite n,

(iii) regular if M_m = {O}.

4. Time Domain Analysis.

 

For a B(X,K) valued regular weakly stationary stochastic
process [5“: -w < n < 00} there exist mutually orthogonal isometries

SR and Ak 6 B(X,K) such that

E = Z S _ 9
n k=O n k Ak

convergence being in the weak sense. Ref. [7] Theorem 3.3.7, p. 18.

5. Spectral Analysis.

 

5.1. Definitions: Ref. [22], p. 130-132.

 

A function f defined on a measure space (S,B,u), with values
in a Banach space V, is said to be finitely-valued if it is of the

n
form 2 IB yi, yi E V and n < m, where B 's are disjoint measurable

1=1 i 1
sets of finite u-measure.

The function f is said to be strongly measurable if there
exists a sequence of finitely-valued functions strongly convergent to
f a.e. on S.

Further, f is said to be Bochner integrable if there exists a
sequemce of finitely-valued functions fn strongly convergent to f
a.e. in such a way that

lim fhf(s) - fn(s)“u(ds) = o.
n+m S

11

For any B e B, the Bochner integral of f over B is then defined
by

f f(s)u(ds) = s-lim f I (s)f (s)u(ds).

B new 3 B n
Bgmagk: Note that f is Bochner integrable if and only if f is
strongly measurable and {hfhdu < m.
5.1.1. For the weakly stationary stochastic process
{£n: -® < n < 00} which is B(X,K) valued, M; denotes the closed
linear subspace of the Hilbert space K, generated by

{Enx: x E X, -m < n < 0°}. The shift Operator U defined on M; as

follows
Uinx = €n+lx’ x E X, -m < n < w
2n -iO
has a spectral resolution U = ER-f e E(d6), where E is a projec-
0

tion valued measure over ([0,2n],B), B being the o-algebra of Borel

sets. Ref. [5], p. 359-360. Now define for BJE B,
_ *

*
Now F(6) = §0E(0,6]§O is B(X,X*) valued such that for x,y t X

1 2" -inO
(Enx,€0Y) = 65; g e B(dBIEOX’EOY)
=-%; Ewe-i“e(E(de)EOX.§0y)
= %;_Z"e-ine(£; B(de)€OX.y)
i.e. (R(n)x,y) = l-'iﬂe-ine(F(d9)xey)°

2r 0

12

Definition.

 

F defined in 5.1.1 is called the spectral distribution func-
tion for the process {€n: -w < n < co}.

Assumption.

 

Assume from now onwards that there exists a B+(X,X*) valued
function f(6) defined on [O,ZN) such that

(i) £(e) is strongly measurable

(ii) f(6) is Bochner integrable
and (iii) for each Borel measurable A c [O,ZN)

F(A) = f f(e)de.
A

This f(6) will then be called the spectral density of the process
{€n: -w < n < m}. In general none of the above assumptions may hold.
5.2. Let K be a separable Hilbert space and let L2(K) denote the
Hilbert space of all K-valued functions on the unit circle which have

a square summable norm with the inner product of gl,g2 in L2(K)

defined as
2n
8
(g ,g ) = l/2n f (g (e1 ), g (eie))d9.
l 2 0 l 2
Define
0+ 2" -1 e 19
L2 (K) = {g E L2(K): 1/2n f e n g(e )d6 = O for n < 0}
O
0. 2n éine 16
L2 (K) = {g e 12(K): l/2w f e g(e )d6 = 0 for n > 0}.
- 0

5.3. Further, define a weakly measurable B(X,K) valued function

A(eie) on the unit circle to be

l3

analytic, if for each x 6 X, A(eie)(x) is in Lg+(K),
conjugate analytic, if for each x E X, A(eie)(x) E Lg-(K).

5.2.1. Factorization of the Spectral Density.

 

*
A nonnegative weakly summable B(X,X ) valued function de-
fined on the unit circle is said to be factorable if there exists a
Hilbert space K and a conjugate analytic B(X,K) valued function

16 19

*
A(e ) defined on the unit circle such that f(eie) = A (eie)A(e ),

in the sense that (f(eie) x)(y) = (A(eie)x, A(eie)y) for all x E X,
y_€ X.

Regarding factorization of the spectral density of a B(X,K)
valued regular stationary stochastic process, the following has been
established in [7] Theorem 3.3.12, p. 23.

5.3.2. Theorem.
The spectral distribution F of a regular B(X,K) valued

stationary stochastic process is absolutely continuous and

-§§ <F<e16)x><x) = H¢(eie)xh2

where

¢<e19>(x) = z e'ikeAk<x), Ak.£ B(X,K).
k=0

5.3.3. Definition.

 

O as defined in the above theorem is called a generating
function of the stationary stochastic B(X,K) valued process

{€n: -w < n < co}.

l4

6. Transferring the Linear Prediction Problem from the Time Domain to

the Spectral Domain.

For the N—variate weakly stationary stochastic processes {fn: -0 < n < co}
an elementary solution to the linear prediction problem is summarized

in [5] by choosing Aim) such that

n (n)
f - g f_ I f , f_ , ..., f ,
V k=0 Ak k 0 l
whence

(n) (n)
[no , ... An ] (f0, f0) (f0, f_l)...(f

 

 

This method is not efficient when n is increasing as our observed

data accumulates. Wiener and Masani transferred the problem to the

spectral domain instead, establishing an isomorphism between the time

domain and the spectral domain and thus obtaining the linear predictor

at a more efficient level. This was done in [21] for a multivariate

real-valued stationary stochastic process. A.G. Miamee [7] has extended

their results to the case of a B(K,K) valued stochastic process for

some Hilbert space K. He established in [7] Chapter VII that under

regularity conditions on the spectral density, the matricial frequency-

response or transfer function Yv that corresponds to the linear predictor

A

fv in the time domain, is given as
-iv6 i6 -1 16
Yv [e ¢(e )]0+ ¢> (e )
where «eig is a generating function of the stochastic process.
In chapter 111 we obtain a factorization of the spectral density

for B(X, K) valued regular stochastic process and thus obtain the linear

predictor and the prediction error matrix Gv’ for any v Z_l, for such

15

a process. Chapter IV proceeds, to suggest the possibility of utilizing
this analysis for nonlinear prediction problem. Results in chapter IV

are not yet complete.

CHAPTER II

MULTIVARIATE NONLINEAR PREDICTION

In genuine applicationscxuaalmost always works with systems
whose states must be described by several random variables. Weather
forecasts depend on a set of interdependent variables, for example.
Although it is reasonable that ideas and methods should be develOped
first for the univariate case, ultimately one must be able to cope
with the multivariate situation.

The price {pm} and demand {dn} of a certain commodity,
tabulated by days, in themselves form univariate time-series, which are
by no means independent. Nonlinear predictors for each of the pro-
cesses {pn: -m < n < w} and {dn: —w < n < co} may be obtained under
regularity conditions by methods of Wiener and Masani in [19]. However
the predictors 6V and dv for v 3_1, thus obtained would ignore
the information contained in the correlative behavior of pn and (1H
for n.: 0. It would therefore be natural to consider
{(dn): -m < n < m} as a multivariate process and to define a corresr
pausing structure on the past which formalizes the notions of non-
linear predictor and prediction error for a multivariate process. In
fact with the appropriate formalism, all of the work of Wiener and
Masani in [19] generalizes to this case. We proceed to introduce the

same in this chapter and to obtain the corresponding generalization.

16

17

l. Multivariate Stationarity.

 

Let f f —m < n < m, be real random variables

nl’fn2’°°" nN’ .
defined on a probability space (X,A,u). Then the multivariate

stochastic process {fn: -w < n < 0o} where fn = fnl is defined
n2

.an

to be strictly stationary if and only if for each integer v > O,

c c =
u[f E Bl’ f \ B2,...,f \ B ]

n1+v n2+v nk+v k
u[fn1 6 Bl’ fn2 E 32,...,fnk.€ Bk]’ whatever B1,...,Bk Borel measur-
able in RN and -w < n < n <...< n < w.

2 k

1.1. Note.

 

(i) The multivariate stochastic process {fnz -m < n < w}

is strictly stationary if and only if for every integer v > 0

ulw E S: fi+vj(w) E Aij’ (1,3) 6 I] =

pr E S: fij(w) e A1 9 (193) e I]

j

whatever finite I c {—m < n < m} x {1,2,...,N} and Borel measurable
Aij c R.

(ii) If {fn: -m < n < Go} is a multivariate strictly stationary
process then the univariate stochastic processes {fn : —m < n < w}

j
for j = 1,2,...,N are all strictly stationary.

1.2. A Characterization of Multivariate Strictly Stationary Stochastic
Processes.
Notation.

For s,t eRN let

18

St=
1

"NZ
m
n

1
Theorem.

The process {fn: -m < n < m} is strictly stationary iff for

N
each tl,...,tk<E R and -m < nl < r12 <...< nk < m
t f + t f + + t f
1 n1 2 112 k nk
and
t f + t f +...+ t f
1 nl+l 2 n2+l k nk+l

are identically distributed.

Proof.

If {fn: -w < n < Go} is strictly stationary then

fn +1 + thn +1 +...+ tkfn +1 is a finite linear combination of

l 2 k
univariate random variables and the result then follows from Note

t1

1.1(ii).

Conversely for every v > O and t = (t
j j1,...’t

for l:j_<_k andany u‘ER-

N k
E[exp iu(t1fn +...+tkfn )]= E[exp{i Z X

in
1 k . j=l 2=1

ut f

2

1 nj
k N

E[exp{i Z Z
j=l i=1

}]

“tjefnj+le

E[exp{iu(tlfnl+l + t2fn2+l +...+ tkfnk+1)}]°

Since u and t 's were chosen arbitrarily, the characteristic

J

function of the joint distributions of {fn l: l_: j‘: k, 1.: 2.: N}

. J
and {fn +11; 1.: j_: k, l_: 1.: N} agree.

3 m
Stationarity of {fn}_0° follows by Note 1.1(i).

19
1.3. Theorem.

CD

With the multivariate strictly stationary process {fn}_m
defined on a complete probability space .(X,A,u), can be associated a
probability space (Q,F,P), a measure preserving transformation T on

Q onto itself and a random variable f on R such that the stochastic

process {fglfw defined by

End.) = f(an), m e :2

G)

has the same joint distribution functions as {fn}_m, i.e. (in obvious

notation)

F =F ,—m<n <000<n (we

Proof.

 

Let Q be the set of all bisequences of elements of RF, A
the o-algebra generated by cylinder sets and u the Kolmogorov measure
obtained by using the well-known Kolmogorov construction. Ref. [1]
Theorem 2.2, p. 605. Letting T be the forward shift of co—ordinates,
f is obtained as the projection on the 0th co-ordinate, F as the
completion of the o—algebra generated by A and P as the extension

of u to F.

2. Preliminaries and Notation.

 

Let HN denote the family of all N x N real matrices.

H:

For 1_5 p < m, LP(Q,F,P) a {f - .1 : f is an equivalence
. fN
*
class of (Q,F)-measurable functions and f{/f f}de < m}. For
' Q

f1,f2 e L2(§2,F,P) the gramian 1s defined as

*
(fl’fz) E fflfzdp ‘ [(fli’ij)]NXN

20

where f -

k for k = 1,2. Lw(Q,F,P) = {f: f is an equivalence

class of (Q,F) measurable functions such that there exists an
aGRN with If | <a
i - i’
2.1. For a strictly stationary N-dimensional stochastic process
{fnz -°°<n<°°} with fn€ Loo, define for -°°<n<oo and qlo

3n = o{f;;(A): A c R, A Borel measurable, -m < k.: n,

1:j:MUIAgQ:NM=OL

B = O B
-00 —oo< n<oo
. -1
BO,q - otfkj(A). A Borel measurable in R, —q §_k‘; O,

lijiN}U {As O: P(A)=O}

where 0 denotes the generated o-algebra.

Further define

Nn as the set of all RN—valued Bn-measurable functions on O,

NO q as the set of all RN-valued 80 q measurable functions
on O,
M - N n L
n n 2
M = N n L

0.q 0.q 2'

2.2. A Matricial Algebra.

 

Definitions.

 

N
2.2.1. For xl,x2€ R let

21

x + x x22 +...+ x1N x2N

x + x x23 +...+ x1N x21

x + x x +...+ x

21 IN sz-I/

Note.

(1) If co-ordinates of X1 and x2 were arranged along two dif-

Lferent circles as follows

x
x1N 1 x
12

x13

 

M

then x1 * x2 is a N x 1 vector whose kth co-ordinate is obtained
as the sum of products of corresponding entries on the two circles
after the second circle is rotated anti-clockwise through k-l units.
(ii) The aim of introducing this special type of multiplication is

to be able to obtain all finite products of co-ordinates of x1 and

x2 through the operations of * and premultiplication by suitable

N x N- matrices on x1 and x2. More precisely, for any product

ﬁg xlix2j for I, J c {1,2,...,N}, we have
I
jEJ
n x x
151 lizj ={H(I X)*(I X)}
JGJO 151111 1:32

j€J

where 111 denotes the N x N matrix having 1 in the (l,i)th place

22

and all other entries 0, and H on the right hand side denotes *
multiplication from right to left.

Note that by premultiplication with a suitable elementary
matrix, the position of the finite product on the left hand side of
the equality may be suitably changed.

(iii) (a) Note that * multiplication is not commutative.

(b) Also, * multiplication is not associative since

iIiIiI *IIII =I§I
{III *(iI}*IiI =I§I

(c) For A c HN, A(x1 * x2) # Axl * x2 in general. For example

I: :1 (131% *(xjn'II =,n°(

It: ‘3] (:11) It) If)

2.2.2. In the absence of these prOperties we make the following definition.

 

and

while

Any expression of the form

2.2.3. + Z An (A x * A x *...* A x )

a
O l,...,n£ n1 n1 n2 n2 n2 nk

2,...,n2.: r, An # O in HN, a0 is N x 1 vector of
i

scalars and 2 a sum over finitely many terms, is called a polynomial

for 1.: n1,n

23

in x1,...,xr. Note that the * multiplication operation is always

performed from right to left.

Denote a polynomial in x .,xr as P(x1,...,xr).

1,..

Note.
(1) P(x1,...,xr) is a N x 1 vector each of whose co-ordinates is
obtained by finite operations of addition and (ordinary) multiplication

on the co-ordinates of x x . Each co-ordinate of P(Xl"'°’xr)

1,..., r

is therefore some polynomial in {xijz 1 :_i :_r, 1 §_j §_N} with

scalar coefficients. We may thus write

’pl(xij)
I
; 92(xij)
P(xl,...,xr) =‘ I . 1.: 1.: r !
' ' ' lijiN’
PN(xij)

where p1,...,pN are polynomials in scalar coefficients.

(ii) Note that sum of two polynomials is also a polynomial and so is
the pre-multiplication by an element of RN. Furthermore the *
multiplication of two polynomials is also a polynomial as may be de-

duced from the following
*
(Alxl + A2x2) (Bly1 + B2y2)
= * * * * .
(Alxl Blyl) + (Alxl Bzyz) + (Azxz Blyl) + (Azxz 32y2)

(iii) The Degree of a Polynomial.
Degree of a summand A (A x *...* A x ) is de-
111,...,n£ n1 H1 112' nl

fined to be 2. The degree of a polynomial P(X1,...,xr) is the

maximum of the degrees of its summands.

24

Lemma.

 

The degree of a polynomial

p1(xij)
P(xl,...,xr) = I : 1 5-1 ﬁ-r
' 1:3:N

‘pN(xij)

is the maximum of the degrees of the ordinary polynomials p1,...,pN.

Proof.

 

Let the degree of P(x ,...,x ) be m. Then there exists a
l r

summand of this polynomial of the form

for 1_: n

1,...,nm i r. Since An # O for i = 1,2,...,m, each
i
N x 1 vector An xn has at least one nonzero entry. The co-ordinate
i i
polynomials of the N x 1 vector A x * A x therefore have
n n n n
m-l m-l m m

degree at most 2 and at least one of these has degree exactly 2. Con—
tinuing to perform the * multiplication operation from right to left
it is easy to see that the maximum of the degrees of the polynomials
is exactly m. Since P(x1,...,xr) consists of terms of the above
form, the result follows.

2.2.4. A Matricial Algebra of N—variate Measurable Functions.

 

A subset A of the family of all measurable functions on
(Q,F) into RN is called a matricial algebra if and onlyij it is
closed for the following binary operations

(1) f1,f2 e A and A1,A2 e HN = Alfl + AzfzeA

(ii) fl,f2 GA: f * £26 A.

l

25

Note.

 

(i) If co—ordinates of f1 and f2 are bounded in absolute value

so are the co-ordinates of f1 * f2.
(ii) Intersection of a family of matricial algebras is again a
matricial algebra. Hence given a subfamily E of A, the inter-
section of all matricial algebras containing E is the smallest
matrcial algebra containing E.
2.3. Notation.

For any set A let AN denote the set of all N x l vectors

a'
of elements of A, i.e. AN = ((.1) : aiE A\.
J

< o
.aN,

:N.
For any positive integer n let P: denote the family of

 

n-variate polynomials in r variables xl,...,xr.
Theorem.
N l N
Pr _ (PNr)
Proof.
N 1 N
Pr‘S (PNr) by Note 2.2.2(i).
pl 1 N
Conversely for ' 6 (PNr) ,
pN
2 3.1 p1 N 1’2,"
. — Z I O , where I is the elementary matrix
. E . 2
p i=1 .
\ N 0

obtained by interchanging the lst and £th rows of the identity matrix.

Also each p2 itself is a finite linear combination of finite products

26

of {xij: l :_i §_r, l §_j §_N}. So if

MW

Pg = x

a H
““0 (tartan) 15

m=1

where k < m and I(m) contains a finite number of elements,

repetitions allowed, of {1 §_i :_r} x {1,2,...,N}, then

W
O k .
I = Z A { H I .x }
° m=l I(m) (i,j)€I(m) 1‘] i
.0,
where AI(m) is the N x N matrix with aI(m) in the (1,1)th place
and 0 elsewhere and Ilj is the matrix with l in the (l,j)th place and

O elsewhere. Note that H here denotes * multiplication. Thus by

2.3.1 above it follows that (P;I)N.E PE. Hence the result.

Corollary.

Consider the set {xr: r in a countable set I} of N x l
vectors. PN, the family of polynomials over this set, is defined as
the union of the families of polynomials over any finite subset of

{xr: r'E I]. It is immediate from the theorem then that

PN = (P1)N.
Here p1 is the family of all polynomials in {xrj: 1 §_r < m,
1 < j < N}
2.4. Let An denote the smallest matricial algebra containing the
1‘
1

function 1 =

and {fk: k‘: n}.

27

Let A denote the smallest matricial algebra containing

9
the function 1 and {fk: —q :_k.5_0}.

Let An denote the linear algebra generated by the function 1
and {fij:-oo<if_n,lij_<_N].

Let A q denote the linear algebra generated by the function

1 and {f ° -q £_i 5_O, l :_j £_N}.

ij’

Lemma.
1
A = {a + Z A ( H A f ):
n O n ,...JIEIZ'n1"°°’n£ i=1 n1 n1
1 2
I is a finite subset of {k: —oo < k_<_ n},An 6 HN}}.
1

Proof.

Since An. is closed for * multiplication and addition and
premultiplication by elements of HN,

R
An_3_{a0+ )3 An,” (II A ):
nl,...,n£EI l =

I a finite subset of {k: -oo < k: n], An 6 HN} .
1

Conversely the family on the right is a matricial algebra since it is
closed for the required operations. It also contains the function 1
and {sz k.g n}. Hence it contains An. Hence the result

Them-

For the strictly stationary process {fn: -w < n < m}

— —N
A =A =(An)

 

A bar on top denotes closure.

 

 

 

 

fit

LL)
H

I;

the

the

Cor
C95

is

28

EEQQI:
An is the family PN of polynomials over (£1: -m < i < n}.
Also An is the family P1 of polynomials over {fij: -w < i < n,
l §_j §_N}. So by Theorem 2.3, An = A2. Taking closures in the trace
norm gives the result.
Similarly for A0,
2.5. The moments of the stochastic process {fn: -m < n < m} are de-
fined as

=ff xf , X...Xf dP(w)
Q n

(I
(“19.11) (n29j2)!°°°9(nk9jk) nljl 232 kjk

for -<=o<nlin2_<_...5_nk and l_<_jl,j2,...,jk_<_N.

Birkhoff's Ergodic Theorem. Ref. [17] 52.2.
If T is a measure-preserving transformation on (Q,F,P)

then for any random variable f defined on 9 such that Elf] < m

:3

Iin—l— 2 f(va) = E[fIT](m) a.e.,
n+1
n*w v 0

where T is the family of T-invariant measurable sets. Further if

T is ergodic, then

f(va) = E[f] a.s.

1
lim-——-
n+1 O

M \J

"ML"

Corollary 1. For a multivariate strictly stationary stochastic pro-
cess {fn: -m < n < m}, if the unique shift Operator defined in 1.3

is ergodic then letting TN denote the inflation of T to N-dimensions

29

1 v 1,]
lim -- 2 T { f * f *...* .f (m)}
v—m v+l “=0 N Iljlnl Iljzn2 Iljrnr
a.e.

" “(111%) (n2,j2)....,(nr.jr)

whatever —m < n1.: n2.:..2g nr < m and 1.: jl,j2,...,jr :_N.

Proof. The shift Operator T is measure-preserving. Also

N

T“{I f * I, f *...* I f }
N j1 111 J2 n2 jr “r

Using Birkhoff's ergodic theorem the result now follows.

Corollgry 2. For a multivariate stationary stochastic process
{fn: -m < n < m}, if the unique shift operator T is ergodic then

for any polynomial P in N-dimensional variables fn ,...,fn with

. 1 r
matricial coefficients in HN
1 m p =
lim ;:I Z TN{P(fn ,...,fn )} E{P(fn ,fn ,...,fn )}.

p30 1 r 1 2 r

30

Proof. Note that

'U
N v
A
H‘
:3

H-

LJ.
v
...;
A
He
|A
N
U
...;
[A
(.Jo
|A
2

p (f .): l 5_i :_r, l 5.j :_N,’
N 1113 /

where p1,p2,...,pN are some polynomials with real coefficients.

The ergodic theorem holds for each coordinate now. This gives the

desired result.

Remark. The assumption of ergodicity of the stationary process is not
necessary in this thesis but with that assumption Birkhoff's ergodic
thedrem justifies the possibility of determining moments of the process
from data collected from time-series Observations. Ref. [19] 53, pp.
195-196. The assumption of stationarity reduces significantly the

number of moments to be determined.

3. Statement of the Problem

 

3.1. Definition. For a strictly stationary stochastic process
{fn: -w < n < CD} with zero expectations and for any v > 0, define

the predictor fv with lead v of the function fv as

j/E(fvllBO)
A I E(fOZIBO)
'f =I .

v .
E<va|BO>/

' Let the right hand side of the equation be denoted by B(fv|80)'

31

3.2. The Prediction Problem.

 

Given the moments for

O . . .
(DJ-931) (“2932) 9 ° ' 9 9 (nk9Jk)

-m < n1_i n2.:..s: nk < m and l _<_jl,j2,...,jk i N of the multi-

variate strictly stationary stochastic process {fn: -m < n < m}

with fn EL.» and with zero expectations, to determine polynomials

¢q of q + l N-dimensional variables with matricial coefficients

 

such that
E= =' .
v E[fleO] 11m Oq(f0,f1, ,f_q)
qv-H'D
Note.
1. Since the joint distributions F of f ,f ,...,f
n ,...,n n n n
l q l 2 q

have compact support, their moments characterize them completely [12]

Cor. 1.1, p. 11.

2. The assumption fn E Lm is made to reduce the prediction prob-
lem to a projection on the closure of the algebra generated by the past.
The weaker condition of existence of all moments is not sufficient.

Ref. [19] Theorem 6.5.

4. Main Theorems.

 

4.1. The following are based on [5].

* 5
L2 with the inner—product (fl,f2) - {f flfzdP}

which yields
the trace norm, is complete (almost everywhere equal functions are
identified).
Theorem.

For v > O, fv is the orthogonal projection in L2 of fv

on M0. E[fv|80,q] is the orthogonal projection in L2 of fv on

M0.q'

32

Proof.

_ _ N
M0 Non L2 is a subspace of L2 - L2.

Hence M0, the space of all co-ordinates of elements of MO, is

_ N
In fact M0 — L2(BO).

L2(BO). Also as in the proof of Lemma 2.8 of [5], p. 354.

/ .
(fleo) = /E(valMo) E(stIBO)

/

I ._
I B(fOZIMO) — B(fvleO)

I3, =.~

\ z ' 7
\ M M )./ M 3 )/
.\ le O/ le O//

= B(fVIBO).

Similarly for projections on M

4.2. Theorem.

A function f E AQ)q iff there exists a Baire function

s: (11.1‘1)‘1+1 +'RN such that f = ¢(f0,f ,f

-l,ooo -q)o
4.3. Theorem '
For F a distribution function on (RN)q with a compact
carrier Jq for some compact set J c RN and for 1.: p < m, let
Lp F be the space of all RN-valued functions O on (RN)q which
9
are measurable w.r.t. the Lebesgue—Stieltjes measure generated by F
p I *
and are such that IIOI dF < m where lol R] O O. Then under the norm
H¢HF p = { f§|¢|de}1/p, the polynomials in q-variables with values in
’ J
RN and coefficients in HN form an everywhere dense subset of Lp F'

Proof.

Lp F = {¢: (RN)q + R : f{¢*¢}de < m}.

As in 2.3 P: denotes the family of polynomials in q N-variate vari-

I‘llll II If «I
III" \III
||Ill|ul IIIOIII

33

ables {xi}:=l' (Péq)N is the family of polynomials in Nq variables
. N _ 1 N
{xij' 1 §_i :_q, 1 §_j :_N}. By Theorem 2.3 we have Pq f (PNq) .

1
Now PNq is a dense subset of the space L F of all Lebesgue-Stieltjes
P.

measurable functions w defined on RNq into RN with
it
[(n 4,)de < a, Ref. [19] §6.2, p. 203. Hence for each coordinate

¢i of ¢ 6 Lp,F there corresponds for given 5 > 0, pi.6 Fiﬁ such

P
u 1
that H¢i - pi”: < e/Np. And then p = :1' E (PNq

pN

)N = P2 is such

that

p, _Zp/2 -1~_:p
th¢-FhF’p [{X(¢i pi) I dF: M) ZN¢1 pin < 5.

Hence P: is everywhere dense in Lp F'
9

4.4. Theorem:

on

(1) M0 is the closure in L2 of the linear manifold U M

q=0 O’q

(ii) M0,c1 = A0,Cl

(iii) M0 = A0.
Proof.

Let M0 denote the subspace of L2
ordinates of elements of MO. Similarly define M

consisting of all co-

. Then b Lemma
0.q y

1.2 of Chapter I

Now let f'E M Since the truncations {fX of f

0' lflgn}

approach f in the trace norm, each co-ordinate of f is approximable

by simple Bo measurable functions. Since 80 is the Borel algebra

34

co

generated by LJ 80 q’ simple Borel measurable functions are

q=O ’ k

approximable by simple functions of the form 2 aiIF with

w m i=1 i
F' E U B which obviously belong to LJ N . Such functions
1 =0 O.q _ DA

9 q—O
being bounded are in L and hence in. U (N n L ) = IJ M .

2 _ 0"] 2 _ Oeq
q—O q-O

Hence the result.
(ii) Let f'E MO Then by Theorem 4.2, f = ¢(f0,f_l,...,f_q) where

¢: (RN)q+1 + RN is a Baire function.

Now f E [2, hence o 6 La F where F denotes the joint dis-

tribution of {f ° -q‘: i_: 0}, because

1°

“fo2 = huiz

’

Also F has a compact carrier. Hence by Theorem 4.3 there exists a

sequence of real polynomials Qn of q+1 N-variate variables such that

OJ
m
:5

+00.

4.4.1 HO - QnHF + 0

Now define wn=Qn(fO,f_l,...,f ). Then tyne Aoq. Since

" 9

f6 L2, tyne L2 and Hf - wnHLZ = ”‘1’ - QnHF° By 4.4.1 then

f = lim wn in L2-norm.

n+m
(iii) From Definition 2.4 it follows immediately that A0 = $20 A0,q'
Also from (ii) above I: A0, is everywhere dense in
I: M . By (1) therEEgre, A0 is everywhere dense in M0 i.e.

q_0 on
AO - Mo.

35

Corollaries.
(i) For the strictly stationary stochastic process

{fn: -m < n < m} and for v > O

fv = lim E[fleO’q] in L2 .

q—HD
This is immediate from Theorem 4.4(i) above.
(ii) For the strictly stationary stochastic process

{fnz -m < n < m} there exists a sequence ¢q of Baire functions

on (RN)q+1 to RN such that
fv = lim ¢q(f0,f_l,...,f_q) in L2.
q-W
Proof.

From Corollary (i) f = lim E[f IM ].
V V Oeq
q—mo
Also by Theorem 4.2 there exists a Baire function ¢q such

that E[fleO,q] = ¢q(f0,f_l,...,f_ ). Hence the result.
(iii) For the strictly stationary stochastic process
{fn: -m < n < 0°} and for any integer v > 0, there exists a sequence

of polynomials Qq in q+1 N-variate variables with coefficients in

HN and values in RN such that

fv = lim Qq(f0,f

q-wo

_1,...,f_q) in L2.

Proof.

By Corollary (ii) above there exist Baire functions ¢q

such that “Ev - ¢q(f0,f f_q)h2+ 0. Furthermore for given

-1’...’

e > 0 there exist polynomials Qq of the above type such that

 

 

I‘llllllll‘

36

E_.

C _
lieq - Qqhz < I? Thus [I¢q(f0,fl,...,f ) - Qq(f0,f_1,...,f_q)[.2 < 2“.

'9
Hence fv - lim Qq(f0,f

q—W

_l,...,f_q) in L2.

5. Computation of the Predictor.

 

In this section we shall obtain an orthonormal basis for the

space MO =LAb in L2 with the trace norm. The expansion of the pre-

dictor in this basis is then actually computable from the given data.

Define a subset H c L2 to be linearly independent iff for

any finite collection h .,hn 6 H and A1 6 HN with l_: i‘: n,

1,0.

zAihi=O_=Ai=O vi with l_<_i:n.

Define the span S(H) of a subset H of L2 to be the

family

11

{1:1 Aihi: A1 é HN, h1 E H and O < n < m} .

A subset H of a subspace M of L2 is said to be a basis

for M iff H is linearly independent and spans M.
A basis H of a subspace M of L2 is said to be orthogonal

iff for hl # h2 E H

*
(h1,h2) = Ehlh2 = 0 .

An orthogonal basis H is said to be orthonormal if in

addition V h 6 H

37

.”h"2 = trace(h,h) = 1.

5.1. Lemma.

 

A subset HCIL2 is linearly independent in L2 iff the

 

.h '
corresponding set H = {hj : l §_j §_N, h = :1 E H} is linearly
independent in L . .
2 \hN
Proof.
Let hl,...,hn be a finite set which is linearly independent
in H. Then
n N
2 Z a .h , = O ...
.= .= i; 13 ” 811 812 a1N h11
1 1 J l O O 0 h
... 12 +...
. . I I
\- s
‘\thl."
an1 ... anN 'hnl g
+ O .;. O E = . w aij = 0 for 1 §_i §_n, l §_j :_N.
0 ° 0 - hnN °
\0/
Conversely if {hij: 1 :_i §_n, l §_j §_N} is linearly independent in
L2 then hl"'°’hn must be linearly independent in L2 since for
k ' k
matrices Ak [aileXN’ kzl Akhk - O a aij - O for l_: 1.: N,

l §_j §_N and I C {1,2,...,n}, i.e. Ak = O for each k E I. Hence

{h h } is linearly independent in L2.

1’...,n
The case when the set H is not finite is immediate now.

5.1.1. A linear arrangement of the set H = {fij: «w < i :_O,

l £_j 5_N} is:

38

f ...
01’ ’fON’f-11"°°’f-IN’f-2I""’f-ki""
Then H = {hnz 0 :_n < m} with h0 = 1 and for each n > 0
h =
n f n-1 n-1

-L_ﬁ—]sn-L N IN

h 9- d n
w ere [N] enotes the integral part of E} One method of arranging

all finite products of elements of H in a row is given by Wiener and

Masani as follows. Ref. [19] §7, p. 205.

Let pk = k+1th prime, and V j > 1 with the prime factoriza-

tion

j+l = pk pk ... pkr , O :_k

define

R1 R2 kr

{sz 1 :_j < 00} is then a linear rearrangement of all finite products
of elements of H.

Thus
M0 = clos Sioj: 0.: j < m} where O0 = 1

M0 is therefore the closure of the subspace of L2 spanned

¢j=01j<o§.

O

by the set M ={

O...

39

5.2. _A;Basis for A0.

 

Lemma.
A subset M C A forms an orthonormal basis for A0 in

0
if and only if the corresponding set M ={ : h E M} forms

05"

L2

0000

an orthonormal basis for A0 in L2.

Proof.

 

By Theorem 2.4 A0 = A3. By Lemma 1.2(i) of Chapter I the

result follows since coefficients in linear combinations of elements

of A0 in L come from HN and

j<

00} need not be linearly independent

2

0:!"

II

O...

Now the set {¢j: 0

IA

in general Ref. [19], 57.3. Although the following procedure of
.computation of the predictOr is possible without having a linearly
independent set at this stage, we may not be able to keep track of the
number of variables involved in the approximating polynomials. Of
course the computations get more complicated also. We therefore make

the following

Assumption.

 

The strictly stationary stochastic process {fnz -w < n < 00}
is such that for any finite set I c {-m < n < OD} x {1,2,...,N}, the

spectrum of the joint distribution of {f : (i,j) 6 I} has positive

13

Lebesgue measure in the III dimensional space, where III denotes

the cardinality of the set I.

40

5.2.1. Lemma.
Under the above assumptions the set {¢j: 0 :_j < co} is
linearly independent.

Proof.
n

Let u = z c o , c
i=1 i i i

61's are univariate random variables. Let the factors occurring in the

# 0, be any linear combination. Note that

products Oi for i = 1,2,...,n be {f (i,j) E I}. Then

ij‘
w = Q<fij: (i,j) E I) for some non-zero polynomial Q in III

univariate variables and

”WH2 = f|Q(xij: (i,j) e I)|2dFI(xi )

j

where FI denotes the joint distribution of {fijz (i,j) E I}. Note

that FI has compact support. Since Q can vanish only on at most
a III - 1 dimensional algebraic surface which is of zero III-dimen-

sional Lebesgue measure and the spectrum of F has positive measure,

I
it follows that {HwH > O, i.e. w # 0. Hence {¢j: O < j < 0R} is

linearly independent.
5.2.2. Corollary.

Under the above assumptions the set {¢ ° 0 §_j < 00} can be

j'

orthonormalized to obtain an orthonormal basis {wj: O §_j < 0°} for

A0 in L2 as follows:

41

‘I’ a
O 1
wl - ¢l//<¢l.¢l)
(¢09¢0) (¢03¢1) o o o (¢O)¢k_l) ¢O
l .
W - _—__—_—— (¢ 9¢ ) (¢ 9¢ ) .. . . (¢ 9¢ ) ¢
k m l 0 1 l l k-l l

 

 

(¢k9¢0) (¢ka¢l) o o o (¢k,¢k_l) ¢k

where Ak = det[(¢1,¢j)]O 3.1’j.i k’ k > 1.
5.2.3. Corollary.
. n
For every integer v > O, fvj = E[fvaBO] = :1: k:0(fvj,wk)wk.

5.3. Computation of the Predictor.

 

Theorem.
For every v > O, fv = ii: Qn(f0,f_l,...,f_m ) where mn is an
increasing sequence of nonnegative integers and Qn is a polynomial

in mn+l N-variate variables with co—efficients in HN, these co—

efficients being computable in terms of the moments

O .
(n19j1)(n2’j2) ° ° 0 (uijk)
Proof.

As in the univariate case (Ref. [19] Theorem 7.9, p. 209), let
“j be the subscript of the last prime in the factorization of j+l

and let, for any n > O

m =maX so. .
n {U1’. sun}

Then wk is expressible as a linear polynomial in terms of

{¢j: 0.: j.§ k}, each one of which is further expressible as a poly-

42

nomial in {hjz l_: j_: k}.

Thus from the Corollary 5.2.3 there exist real polynomials

qnj such that
. n
f = lim 2 (f .,w )w = lim q (f : -m §_i :_O, l §_£ f N).
vj n+m k=O vj k k n+w nj ii nj
Thus
qn (fij -mn-i i < O, 1‘: j < N)
l
qn2(fij: -mn: i _<_ 0, 1: j :N)
f =11!“ ooooooooooooooooooooooooooooooo o
W
qu(fiJ -mn ﬁ'i §_O, 1.: j_: N .
N
= C
Since A0,m A0,m , there exist polynomials Qn" A0,m
n n n
such that
5.3.1 A =
fv lim Qn(f0’f-l’f-2’°'°’f-m ).
n+m n

Note: Actually qnj is a linear polynomial in mn+l variables and

A

lim qnj(h£: -m < E §_O)

vj new n'-
= lim q (f : -m < 2 §_O)
n-m “3 4%] .n-Iﬁm “

So the coordinates of the predictor are obtained through the
procedure of this section and are written in terms of limits of

polynomials in N-variate variables in 5.3.1.

CHAPTER III
EXTENSION OF THE ALGORITHM FOR LINEAR

PREDICTION TO BANACH SPACE VALUED
STATIONARY STOCHASTIC PROCESSES

An algorithm for computation of the linear predictor for Banach
space valued random variables is obtained in this chapter. Time domain
and spectral analysis for such processes, including Wold Cramer con-
cordance and necessary and sufficient conditions for factorability of
the spectral density were obtained by A. G. Miammee Ref. [7], Chapter
III. However the algorithm of Wiener and Masani under the boundedness
condition, was obtained only for Hilbert space valued random variables
using Fourier analysis of infinite matrix valued functions Ref. [7],
Chapter VII. Under an extension of the boundedness condition of
Wiener and Masani on the spectral density of the Banach space valued
stationary stochastic process, a corresponding algorithm for computing

the generating function and the linear predictor is obtained.

1.1 The Boundedness condition on the Spectral Density

 

Let the spectral density fe , of the B(X,K) valued stationary
stochastic process {an: -m < n < m} satisfy
0 < m(6)A*A 5 f6 5 M(6)A*A a.e. 6 e [O,2h)

1 11(6)
m(6)’ m(6)

 

for some A: X + K with “A” = l and M(6), summable.

43

44

1.2. Lemma.

 

 

f
*
Under boundedness condition 1.1 for Ne=-—g — A A with
agnm)+mm 9
6 2 ’

M(e) - m(e)
stimm+nw)<L

 

Proof.

Let f6 denote the quadratic form of f6’ i.e.

V x.€ X, fe(x) = (fax,x).

HI
A
x
+

'~<
v
II

fe(x) + fe(y) + (fGX.y) + (fey.X)

H\

A
N
I
"4

V
II

ie(x) + Ency) — (fax.y) - (fey.x).
But fe is Hermitian, i.e. (fex,y) = (fay,x). So
(f x,y) = lIE (x+y) - f (x-y)]-
O 4 6 6
Similarly

(Ax.Ay> = %[(A*A(x+y).x+y) - <A*A<x-y),x-y>].

Therefore
(fax,y) *
(Nn<x).y> = -——;;;—— - (A Ax.y>
f (x+y) E (x-y)
=‘IIt-9————— - <A*A(x+y).x+y) - {-9---- <A*A<x-Y>’X-Y)}I-
a9 39

Due to boundedness assumption 1.1 then

45

M(6)

|<N6<x>.y)|;;%[t;——— (A*A<x+y),x+y) - (A*A<x+y),x+y)} -
6

{3&91 (A*A(x-y),x-y) - (A*A(x—y),x-y)}] a.e. e
9

=_i[M(e) - m(e)
4 a

 

 

{(A*Ax,x) + (A*Ay,y)} + (”(9) 2 9(9) _
e e

s *
{(A Ax,y) + (A Ay,x)}] a.e. 6 .

 

 

 

 

 

Note that M(6) : m(6) - 2 = O, which gives
6
M(O A* + *A
I(Ne(x),y)I.: M(e; ; 2E2; {( Ax’x) 2 (A y’y>} a.e. 6

Thus
1.2.1. ”N H < “(9) ’ m(e) sup “Ax”2 + “Ayuz a.e e

6 B "’ M(6) + m(6) x:IIxII=l

Y=HYH=1
§_ggg; ; 2E2; a.e. 6 (since IIAII2 = 1).

1.3. Lemma.

Let the spectral density f satisfy boundedness condition
1.1 and the image AX be dense in K. If A is one—to-one onto

then

*
(a) A is one-to-one

(b) A*‘1 = (A'1)*

 

2) x

AX

and
(c) |A*“1N6A51(k),i| 5_ﬁgg; :nEEgIHRH “in a.e. e for R52 e AX.
Proof.

* *
(a) Since AX is dense in K, A is defined on K to X

* * *
as follows V k6 K, A (k) = x where x (y) = (k,Ay) V y‘E X.

46

Further, for each k,kd 5 K

t *
A (k) = A (k')¢= VyE X. (k,Ay) = (k'.Ay)
e we AX. (k.£) = (16.2)
c: V26 K, (k,2) = (k',2)
(since AX is dense in K)
c: k = k'

*
Hence A is one-to-one.

* * * *
(b) Range of A = {x 6 X. :51 k5 K with x (y) = (k,Ay) VyE X}.

_ - *
Also A 1 is defined on a dense subspace of K, and domain of (A 1)

*
is a subspace of X . In fact

*

{x

0((A'1>*)

m

* ..
x.*: site K with x (A 12) =<1t,2) vie AX}

* * *
= {x e X.: ake K with x (y) = (k,Ay) VyE X}

(since A is one-to-one onto AX from X).

Therefore 0(A*'1) = 0((A’1)*).

* *-
Also for x E D.(A 1)

*-1 * *

A (x) = kw x (y) = (k,Ay) VyE X-
e x*(A‘12) = (k,AA'li) V2 5 AX
e x*(A‘19.) - (k,2.) V2 5 A-X

- * *
e (A 1) (x ) =
(c) From 1.2.1 in the proof of the previous lemma 1.2, for x,yE X

INe (x),yI < ”(9) " m(e) LIL—ML- a. e.

M(6) + m(6)

 

Thus for k,2,e A(X)

47

 

 

 

 

I(A*"1N9A'1(k>,2)l = I(NnA"l(k),A'12)l
< M(6) — m(6) AAA-I(mliz + helm? e
-— M(6) + m(6) 2 a°e'
n M(6) — m(e):llkll2 + IIRIIZ a e e
M(6) + m(6) 2 ° ° °
Thus
2 2
:*- -li < M(6) - m(6) k' +. Q
“A lNeA llB-— .:”E M(6) + m(6) 2
I E1
III—1
= Me) - mm) 6
M(6) + m(6) °

Hence the result.

1.4. Relationship with the Case of Hilbert Space Valued Random Vari-

 

ables.

Main Theorem I.

 

If the spectral density fe satisfies the boundedness con-
dition and A: X + K is one-to—one and AX dense in K then there
is a unique stationary stochastic process (nu: 1m < n < 00} which

is B(K,K) valued and is such that

(i) Rn(n) = A*-1Rg(n)A—l on AX

2
M(6) + m(6

lNeA'l] = A*-1f€(6)A-l on AX

 

_ *-
(ii) fn(6) ' ) [IK + A

where IK denotes the identity Operator on K and f€(6) is f

in our previous notation.

6

Proof.

2
"' M(6) + m(6) UK

a bounded operator defined on AX. Let ge denote its unique con-

 

*— - *.. .—
By Lemma 1.3 A lf€(6)A 1 + A 1NeA 1] is

tinuous extension to K. g6 is then a B+(K,K) valued function

48

defined on the unit circle. Further g(6) is strongly measurable
since f(6) is assumed to be so. Also by 1.3(c) .“ge“.€ L1[0,2n).
Hence g6 '18 Bochner integrable. Thus for any n, EDA-1 defined on

AX is such that V k.€ AX

2n

. -1 :2 -.l_ *-1 -1
.hgnA (k)h - 2” g (A feA (k),k)d6
1 2n

= 27 (I) (gn<k>.k>de

I A

IIgIILl[0,2n)IIkII2 < w.

Hence gnA-l admits of a unique continuous extension to K, say nn.
(nu: -m < n < 00} is then a B(K,K) valued stochastic process. In
fact {nnz -m < n < 00} is stationary as shown in the following
reasoning.

For k,£ 6 K we must show that (nnk,nm£) depends only on
m—n. Since A(X) is dense in K, 3 sequences {xp}, {yq} in X

such that

A(x ) + k in K
P
and A(yp) + R in K.

Then, since nn and nm are bounded

(nnk.nm£) 11m (nn(Axp).nm(Ayp))

p-W

= lim (tnA'1(Axp),th‘1(Ayp))

p—hm

= 11 ,
p+2 (Enxp Emyp)

2
= lim -—'f
P+00 0

H
-i(n-m)6
e (fexp,yp)d6

49

which depends on m and n only through n-m.

Furthermore
2" -i(n-m)6 -l -1
(nnk.nm2) = 1m m-E— fe (feA (Axp),A (Ayp))d6
pw 0
2N
_- _ *_ _
= Iim-—— f e 1(“ m>e(A 1f A 1(Ax ),Ay )d6
2n 6 p p
pw 0
2N
_ _ *_ -
= 1—-f e i(“ m) {lim (A 1f A l(Ax ),Ay )}de
2n 0 6 P P
p-NX)
2n
=-%; f e-i(n-m)e(ge(k),(1))d9a
O

the last two steps being true since ge is a bounded operator. Thus

on A(X)
_ _ -1 -1 _ *-1 -1 _ *-1 -1
Rn - (nn.n0) - (EnA ,EOA ) - A (En,€O)A - A RE A
n n
f (e) = A*'1f (6)A-1.
n E
n D

and due to continuity of all functions involved, Rn and fn (6) are

*-1 -1 n *-l -1
the unique continuous extensions of A RE A and A fg (6)A

n n
respectively to K.

1.5. Factorization of the spectral density.

 

Corollary.

If O6: K + K is the generating function given in [7] 7.3.5,

for the B(K,K) valued stationary stochastic process {nn: -m < n < 0°}

and if fe satisfies the boundedness condition and A: X + K is

one-to-one and AX dense in K then ¢6A: X + K is such that

*
fe - (¢nA) <¢nA>.

50

Proof.

By Theorem 1.4, ge is the unique continuous extension of

*_ -
A 1f A 1. So that

f = A 86A

*3! _ *
A ¢e¢oA - (¢9A) (¢9A).

1.6. The Prediction Error Matrix and the Predictor for a Banach
Space Valued Process.

For the B(K,K) valued stationary stochastic process
{nn: -w < n < Go} a schematic algorithm to obtain the prediction error
matrix Gn and the linear predictor av of nv for v > 0 based
on the past Inn: n.: O}, is given in [7], Chapter VII. We shall now
find the same for the B(X,K) valued process {§n: -w < n < 00}.
1.6.1. Notation.

For 3 c B(X,K) let
3(3) denote the smallest (strongly) closed subspace of B(X,K)
containing the set {SB: 8 E S, B 6 B(X,X)}
0(3) denote the smallest closed subspace of K containing the set
{Sx: S E S, x e X}.
We shall use the same notation also for subsets S of B(K,K). In

this notation then
5(3) = B(X,O(S)) (Ref. [7], 3.2.3, Chapter III, p. 9).

1.6.2. Definition.

 

For the stationary stochastic B(X,X) valued process

{En: -m < n < 00} define

51

Mn = O{Ek: k.: n}, Mn = oIEkx: x E X; k :_n}, -® < n < w
M = n M , M = n M
n n

M = E{Ek: -w < k < co}, Mco = O{Ekx: x 6 X, -m < k < 0°}.

00

Furthermore let Bn’ Bn for -w :.n i w denote the corresponding
subspaces for the B(K,K) valued stationary process {nn: -m < n < 00}.
1.6.3. Main Theorem II.

The two stationary stochastic processes {Enz -w < n < 00}
and (nu: -m < n < 00} are further related as follows
(i) For each integer v > 0, if Ev denotes the projection of Ev

on M0 and nv the prOJection of nv on B0 then

*

ii G = A G A

() n ”n

(iii) Ev = lim 2 Evkg-k
n+0° k=0

iv6 16 -1

where EV is the kth Fourier coefficient of [e- O(e )]O+O ,

k

O being the generating function of the process {nn: -w < n < 00}.
Proof.

(1) Note that E = n A for each k. Also AX is dense in K

k k

and nk is bounded. Therefore for each k

o{£kx: x E X} = O{nk2: t 6 K}.

Now for each x E X

52

nvA(x) (nVIBO)(AX)

= (nvAxlBO)

= (g A-leIB )
v 0

= (EvXIBO)

= (EVXIMO)

_ = (anIMO)(x>

H
m
C
A
><
v

(ii) Now for each x E X, y E X

(5*GnA(X))(y) ((nl - ﬁl)(Ax), (n1 — Ol)(Ay)) (by definition of Gn)

(nle - nle, nle — nlAy)

(flA'1(Ax> - 51(x).zlA‘1(Ax) - zl<y>>

(using nn = EnA-l on AX and (1))

(61x - EIX.€1x - EIX)

G€(x).

(iii) For each integer v > 0,
n

h (x) = lim ( z E n )(x) Ref. [7], 7.4.11, Chapter VII,
v vk -k .
n+m k=0 p 108.

Therefore

53

€v(X) = nvA(X) = nv(AX)
n
= lim ( Z Evkn-k)(AX)
n-*0° k=0
n -1
= lim ( z Eka_kA )(Ax)
n+00 k=0
" -1
= lim 2 E t A (Ax)
n+m k=0 vk -k
n
= lim 2 E E (x).
n+w k=0 vk -k

1.7. Note.

 

For results in this chapter the boundedness assumption 1.1 was
made on the spectral density of the process and it was further assumed
that the map A: X + K be one-to-one with the image of X dense in K.
The restriction of AX being dense in K is easily deleted by re-
placing K by the Hilbert space H generated by AX in defining
the process {nn: -w < n < 00}. Generalization when A is not one-to-
one calls for a closer look and may be handled as follows: Let K(P)

denote the kernel of any operator P. Then due to boundednessassumption 1.1
1.7.1 K(A) = K(fe) a.e. e

where f6 denotes the quadratic form of f Let the quotient space,

6.
denoted by I, be such that

v x t x,. 15;“;- inf IIx - sII = d(x,K(A))
)

where ; is the equivalence class x + K(A) of elements of x. Now

(X,“ h) is a Banach space. Ref. [3], P. 140.

54

The linear map AQ defined on it as follows

AQ(§) = Ax for Q t Y
is continuous in the norm of X. This is shown as follows

IIAQII= sup IAQ<§>II= a. ,stu.
Hxh=l x:d(x,K(A))=l

Also for each x 6 X with d(x,K(A)) = l, x = x + 6 - 6 whatever

6 6 K(A). So

.hAx“ i inf {IA(x + 6)“ + hAdh}_: inf {“x + 5“} 5_1
6€K(A) deR(A)

since A6 = 0 for O E K(A) and hA“ = 1.

Hence A is continuous. Furthermore A is such that

Q Q

(AQAQ(;)’;) = <Ax.Ay> = (A*Ax.y) for x,y e x.

Thus

*
: M(6)A A a.e. 6

m(6)A*A _: f Q Q

Q Q 9

and A : X + K is one-to-one, and without loss of generality A (X)

Q

is dense in K.

Q

To make sense of the definition of EnAQ

under AQ’ we must have an uniquely defined on X. It is here that

we would need the assumption of linearity of En. Let, for x and y

on the image of X

in X, Ax = Ay. Then

55

h€n(X) - £n(y)h2 = han<x - y>h2
= (£n(x - y), £n(x - y))
= (50(x - y). €O(x - y))
1 2n .
= as fe(x - y)d6
= 0 (due to 1.7.1).

So if x = y mod K(A) then gn(x) = €n(y). Hence ‘V x.€ X we may

define €n(;) = €n(x). And the preceding procedures now apply to

-1
Q

dictor for the process {€n: -w < n < co}.

EnA to ultimately yield the prediction error matrix and the pre-

CHAPTER IV

ON LINEARIZING STATISTICS OF TIME-SERIES

FOR NONLINEAR PREDICTION

The problem of nonlinear prediction of a stationary stochastic
process was dealt with, in the second chapter of this thesis at the level
of defining the predictor and showing that we have a determinate mathe-
matical problem and then obtaining the predictor by a more or less direct
attack. In order to utilize the time and spectral domain analysis to
obtain an algorithm for the predictor at a more efficient level, N. Weiner
in [18] suggests a method of relating the nonlinear prediction problem
to linear prediction of an infinite-variate stationary stochastic process.

In this chapter we will explore this relationship further.

1. The Related Infinite-Variate Process

 

Let {fn : -m < n < 00} be a univariate strictly stationary stochastic
process defined on a probability space (9, F, P), with zero expectations.
Let T denote the shift associated with the process. Further assume
that there exists a > ()such that IfCI §_a.
Let H = { H f1 : I C {0,-l,-2,...} and I finite}, where elements
of H are writiZﬁ as products of £1 with decreasing indices. Let
{¢j' : 1 §_j' <w} be a linear arrangement of H as in 5.1 of Chapter II.
Let {Oj : l j_j<w} be the subfamily {Oj. : j' odd}. Note that for
j odd, j + 1 contains a factor of 2 and p0 - 2 being the first prime,
the family {O : l §_j < 00} consists of those finite products of‘{fi :Li §_0},

J
written in descending order, which begin with f0.

56

1.1.

1.2.

57
Let ¢O\
¢1
e1 _

/
Due to duplication of subfactors, for men , X (m) may not be square
0

X0 =

summable. Now the shift T acting successively on any finite product

H fi defines a stochastic process in its own right since
icI

Tn .= .
(n f1) H fi+11

181 161
(Ref. Thm. 1.3, Chapter 1).

Define for -w<n < w

n
T O0

n
X = T :¢1

Tn:¢j
Notation I
Let Xnk denote the kth coordinate ofXn for 0.: k < w.
lem. Mk denote the family of square integrable functions which are
measurable w.r.t. the c-algebra generated by{fn : n §_k }, along
with all null sets 0 §_k < w. Also let Mk denote the closed manifold

of L2(Q) generated by the co—ordinate functions {Xn1 : 0 §_i < w, n §_k}.

Main Theorem

 

{Xn : - w < n < on} is a Banach space,€m- valued strictly stationary
stochastic process with shift which is the inflation of T. Further-
more this process is related to the univariate process {fn : -'w < n < w}
as follows

(i) for -m < k < w

Mk = Mk’

58

(ii) for v > 0
(fle0) = XvO.

where X; denotes the linear predictor of Xv based on the present and

past of the process {Xn : -w < n < 00}.

Proof

 

(i) The polynomials in {fn : n §_k} form a dense subset of Mk'
Mk’ on the other hand, is the closure of finite linear combi-
nations of products from {fn : n §_k}.

Both MR and MR therefore, contain a common dense set. Both
being closed, it follows that
Mk = Mk'

(ii) then follows from the definition of predictors.

1.3. MAIN RESULT I .

 

The nonlinear predictor (fV'MO) may thus be obtained as the
0th coordinate of the linear predictor RV of the ‘zm-valued strictly
stationary random process {Xn : - m < n < w}.

1.4 Let C()denote the set of all infinite vector sequences that tend to

zero. C()is then a (separable) Banach space and its dual space £'

is separable.

1.4.1. MAIN RESULT II

 

Under the assumption that Rh 6 C0, the process {En : -w < n < co}
1
*
defined from the separable Banach space CO = .K to L2(O) as follows

1
y mi . £n(x) (o) = x<xn<wn

may be identified with the process {Xn : -w < n < 00I.

59

This identification yields the linear predictor by methods used
in Chapter III, under the boundedness condition on the spectral density
of {En : -m < n < OD}. The problem of translating this condition in
terms of moments of the process {fn : —m < n < co} is still open.

An attempt to obtain extension of the work of Robertson Ref., [10], [11],

has not been made here. This is to be investigated later.

Special Case

 

we shall, now, consider a particular stationary stochastic process

{fn : -w < n < m} and its related infinite variate process
{Xn : -m < n < co} to shed some light on the difficulty in translating

the condition of boundedness of the spectral density of {Xn: -w < n < co}
in terms of moments of {fn : -w < n < co}. Let {fn : -w < n < 0°} be

a two state stationary Markov process with state space {1, -1} and the

l - a]
P = I
l r a a J

l] = P [Xn

transition matrix

9

such that P [Xn -l] = %'for -m < n < w.

Then for each n > 0, the n step transition matrix is

I l 1 n ' 1 l n
I— _ a- _ — — -
2 + 2 (2 l) 2 2 (2O 1)
PD = 1 1 l l n
n
2 .. 2 (Za—l) 2 + 2 (20’1) J

All moments of finite products of {fa}0° are then obtainable as

—00

v
follows. Since for any integer v > O, fn is l or fn according as v
is even or odd respectively, to obtain moments of finite products of

{fn}0° we need just evaluate B(fn f ... f ) for -w < n1 < <

1 n2 “R

-—00
< :1k < w. Notice that for an integer v > 0

n2

6O

B(vaf0=l) - E(vaf0=-l) = P“(1,1) — PV(1,-1) -{PV(-1,1) - PV(-1,-1)}

l.
2

+ Ian—i)" 4% - l, (mi—1)") - g A, (zen-1)" — § -% (2a-1>V}

2 (2o-1)V.

Similarly

f(fnlf0=1) + B(fVIfO = -1) = PV(1,1) - PV(1,-1) + PV(-1,1) - P“(-1,—1) = 0.

Now
f(fofn1 fn2 ... fnk) = f:(fn1 fn2 ... fnkIfO = 1) P(f0 = 1) —
E(fnl fn2 ... fnklfo = -1) P (f0 = -1)
8.]... = — =—
2[B(fn1 fn2 ... fnkIfO 1) f(fn1 fnz ... fnklfo 1)]

1
= E-[E(fn2 ...fnkIfnl - l, f = 1) P(fn.l = llf0=1>

O

=-l,f0=l) P(fnl =-1|f0=l)

-E(f ...f f
n2 nkI n1

-E(fn2 ...fnkIfnl =l,fO=-l) P(fnl =1If
+E(fn2 ...fnkIfnl - —1, f0=-1) P(fnl=-1If0=—l)]

l n n
- §{E(fn2 ...fnkIfnl =1) P 1 (1,1)-E(fn2 ...fnklfnl= -1)P l(-1.1)

—E(f ...fnkIf

- 91 s- “1 _ _
n2 n1 -1) P (-l,l)+E(fn2...fnkIfnl 1)? ( 1, 1) 1

61

= listf f If -1) {P“1(1 l) p“1 1 1 } _
2 n2"° Bk nl- ’ — (- s ) +E(fn2ooofnklfn1 - -1)

xiPnl(-l.-l) -Pn1<-1.1>}1

= l{B(f f If =1) (2 -1>n1 + B(f f If “ -1)(2 1>nl
2 n ... n n a n ... n n - o- ]

2 k 1 2 k 1

1 n1
2 “2 “k “1 2 “k 1

Again repeating the same process

n "n “n
=1) {P 2 l<1.1>+Pn2 1(4.1)}

E(f f f ) = 1-(2 —l)n1 [B(f
0 n ...n 2 a n3... 2

f |fn
1 k “k

_E(fn ...fnklfn =-1) {Pnz-n1(1,-1)+p“2'“1 (-1,-1) 1

3 2

_ 1 n1 1 n2‘n1 1 1 n2"“1
- -2-(2a-1) [B(fn . ..fn lfn =1){ ‘2' + (20"1) + 7 - f(ZG-l) }

3 k 2

nz-nl nz‘n

-E(fn ...fn |f = —1) §-- %-(2a-1) + %-— %-(2a—1) }

3 k 2

1 n1
s _(2a_1) [E(f ...f If =1) -E(f ...f If = -1)]
2 n3 nk n2 n3 nk n2

Proceeding thus we obtain

n n “n
E(f(fnl...fnk) ='%(2a-1) 1 (2s-1) 3 2 =-1)]

...[E(fnkIf -1)+(-1)kE(fnkIfnk_l

nk-l

n +(n3—n )+... n -n _
- %{(2a-l) l 2 }x 2 (2a-l) k k 1 if k is odd

0 if k is even

62

n +n +...+n —(n +n +... + n )
l k 4 k-l
= (2a—1) 3 2 if k is odd

0 if k is even

We will actually consider the corresponding prediction problem for
{gn: -~<n<m} where gn = 6 fn for some fixed 6 with O<6<l. Now if the
condition of boundedness of the spectral density of the related infinite
variate process {xnz -w<n<W} was satisfied, we could determine the
nonlinear predictor for the process {gn: —w<n<m}, hence for the process
{fn: -w<n<w}.

The spectral density for the process {Xn: -w<n<w} is given as

follows:
f (1 ) - : E(X x ) ieh <1 <m
e ,j _ h=_oo 01- hj e 0— ’j
b th th
where th = T X0j is the h shift of the j coordinate of X0, T being

the shift for the stationary stochastic process {gn: -m<n<m}

Let X = g g ...g and X = g g ...gm .
Oi 0 n1 {n1 Oj 0 ml J

Then

on

f (1.3) = 2 B(8 8 -..g s s -.-g + ) e
9 h=_s 0 n1 1 m1+h 91 h

16h
= 2 B(X X ) e + Z B(g g ...g g gm +h...gm )e
i h h +h
|h|_<__nI 0 j lhl>nI 0 nl nI 1 J

63

16h
= z E(X , xh,) e + 0, if I + J is odd
01 J
lhlipl
'6 nl+n —n2+ ...+n -nI_l+m +m3-m2+.
B(xoi xh,) e1 h + z (2a-1) 3 I l
Ihlinl J h >nI

if I and J are both odd

n +n -n +...+(h-nI)+(m2+h—ml—h)+

2 B(XOi xh.) sigh + 2 {(Zo—l) 1 3 2
lhlfPI ' J lhl>n1
16h

X e } if I and J are both even

= 2 16h
Ihlinl B(x01 th) e if I + J is odd

..‘HD -m

J J-l

...+ +h- -h)
(mJ mJ—l

6 16h
2 E(X X ) e1 h + c(i,j) 2 e if I, J odd
Ihl<n ”1 hi Ih|>
—-I n1
6 h i h
2 B(X Xh,) e1 h + c(i,j) Z (Zn-l) e 6 if I, J even,
01 3

where c(i,j) is the appropriate constant bounded by l.

The spectral density in all these cases is bounded above by some

M(6) which is integrable since

19h
(i)| Z B(g g ---g 3 g ) e I
Ihlinl 0 n1 nI h ”1+“

64

i(n +1)6
(ii) c(i,j) X ' eieh = 2 c(i,j) e I

lhl>n1 l - 816

which is integrable on the unit circle,

n +1
(111) |c(i,j) 2 (201—1)h eiehl :_c(i,j) 2(2a-l) I z (201—1)h
|h|>nI h>0

n +1
= 2c(i,j) (2a-1) I
l - (Zn-l)

nI+l
= c(i,j) (Zn-1)
l - a

The spectral density, however, is not bounded away from 0. Consider the

diagonal element fe(i,i) where XOi is gogl...gI.

Then
ieh
ZI
||r£1|<IEmehj e [:6 x'2I

which tends to O as I tends to w.

Also
16h

Ilﬁl>1 E K(ﬁ th e tends to 0 as I tends to w.

Thus there exists a subsequence {in} such that f (in, in) tends to 0

6

as n tends to w. Hence fe cannot be bounded away from zero.
1
So if we consider the infinite variate process {ETE1 gn: -w<n<w},

its Spectral density will also not be bounded away from zero.

Remark.

The above example shows that a non—deterministic stochastic

process with bounded spectral density may very well have its spectral

6S

density not bounded away from zero. Since the boundedness condition

1.1 of Chapter III is not satisfied, to use Wiener and Masani's algorithm
for computing the predictor one must prove a stronger version of our
algorithm techniques given in Chapter III which would be valid in the
absence of a positive lower bound. Once such an algorithm is avail-

able, examples such as above can be more fully investigated.

BIBLIOGRAPHY

10.

ll.

12.

BIBLIOGRAPHY

Doob, J.L., Stochastic Processes, Wiley, 1960.

Gangolli, R., Wide sense stationary sequences of distributions on
Hilbert space and the factorization of operator valued functions.
J. Math. Mech. 12(1963), 893-910.

Kato, T., Perturbation Thoery for Linear Operators, Springer-Verlag
New York Inc., 1960.

Mandrekar, V. and Salehi, H., The square integrability of operator-
valued functions with respect to a non-negative operator-valued
measure and Kolmogorov isomorphism theorem. Indiana Univ. Math.
J. 20(1970), 545-563.

Masani, P., Recent trends in Multivariate Prediction Theory, Multi-
variate Analysis, Academic Press, New York, 1966, 351-382.

, The Prediction Theory of Multivariate Stochastic Processes,

III, Acta Math. 104(1960), 142-162.

Miamee, A.G., Banach Space Valued Stationary Stochastic Processes and
factorization of nonnegative operator valued functions on a Banach
space. Ph.D. thesis, MSU (1973). (To be published in Sankhya).

Nadkarni, M.G., Prediction theory of infinite variate weakly stationary
stochastic processes. Sankhya, Series A, Vol. 32, Part 2 (1970),
145-172.

, On a paper of Remesh Gangolli, J. Math. Mech. 17(1967), 403—405.

Robertson, J.B., An algorithm for computing the non-linear predictor,
Ann. Math. Statist. 42(1971), 75-80.

’ On a problem in nonlinear prediction theory. Ann. Math.
Statist. 34(1965), 1554-1555.

Rosenblatt, M., Stationary processes as shifts of functions of in-

dependent random variables, J. Math. Mech., 8(1959),
665-681.

66

13.

14.

15.

16.

17.

18.

19.

20.

21.

22.

67

Rozanov, Yu, Some approximation problems in the theory of stationary
processes, J. Multivariate Anal., 2(1972), 135-144.

Salehi, H., On determination of the optimal factor of a non-negative
matrix valued function, Proc. Amer. Math. Soc. 29(1971), 383-389.

Shohat, J.A. and Tamashin, J.D., The problem of moments, American
Mathematical Society Surveys, No. 1, New York (1943).

Sobel, E., Prediction of a noise distorted mutlivariate non—stationary
signal. J. Appl. Prob. 4(1967), 330-342.

Spitzer, P., Random fields and interacting particle systems, Notes on
Lectures given at the 1971 MAA summer seminar.

Wiener, N., Nonlinear prediction and dynamics. Proc. of Third Berkeley
Symposium on Math. Stat. and Prob. III(1956), 247-252.

Wiener, N. and Masani, P., Nonlinear Prediction, Probability and
Statistics, The Harald Cramer Volume, Ed. by Ulf Grenader, Stockholm

(1959), 190-212.

, The prediction theory of multivariate stochastic processes,
1., Acta Math. 98(1957), 111-150.

, The prediction theory of multivariate stochastic processes,
11-. Acta Math. 99(1958), 93-137.

Yosida, K., Functional Analysis, Springer-Verlag New York Inc., 1971.

“711171171111;:1111111111711“

783