THE UNIFORM APPROXIMATION OF A FUNCTION AND ITS DERIVATIVES BY POLYNOMIALS

Thesis for the Degree of Ph. D.
MICHIGAN STATE UNIVERSITY
FREDERICK JAMES SCHUURMANN
1967

This is to certify that the thesis entitled "The Uniform Approximation of a Function and its Derivatives by Polynomials" presented by Frederick James Schuurmann has been accepted towards fulfillment of the requirements for the Ph. D. degree in Mathematics.

D. Moursund, Major professor
Date: July 10, 1967

ABSTRACT

THE UNIFORM APPROXIMATION OF A FUNCTION AND ITS DERIVATIVES BY POLYNOMIALS

by Frederick James Schuurmann

Let X be a compact subset of the real line, and let I be a finite collection of nonnegative integers including 0. The function f(x) and the base functions {φ_i(x)}_{i=1}^n are assumed to be in C^q(X), while the weight functions w_k(x) for k ∈ I are continuous on X. The problem is to find real scalars a_1, a_2, ..., a_n which minimize

    max_{(x,k) ∈ X×I} | w_k(x) (d^k/dx^k)[ f(x) − Σ_{i=1}^n a_i φ_i(x) ] | .

The main objective is to provide an efficient method for the computation of a best approximation on a digital computer when X is a closed interval.

First the problem of the existence of a best approximation is discussed. Then the characterization of a best approximation is treated, both to provide a basis for the computational algorithm and to provide a method of determining when an approximation is a best approximation. Results are also obtained concerning the dimension of the space of best approximations.

Next, sufficient conditions are given for the uniform convergence of the polynomial and its derivatives to the function and its derivatives as the number of base functions increases. It was also found that, with the appropriate hypotheses, the approximation of a function and its first derivative on an interval by certain classes of base functions is unique.

Finally, two algorithms which use linear programming to find best approximations on finite point sets are presented, and convergence theorems are given. Several computational examples are also presented.

THE UNIFORM APPROXIMATION OF A FUNCTION AND ITS DERIVATIVES BY POLYNOMIALS

By
Frederick James Schuurmann

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Department of Mathematics
1967

ACKNOWLEDGMENTS

The author is deeply indebted to his major professor Dr. David G. Moursund for suggesting this thesis problem and for his patient and helpful guidance during the preparation of this thesis.

The author would also like to thank Michigan State University for making available the computing facilities used in the development of the algorithms and the calculation of the examples in this paper.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
INTRODUCTION

Chapter
I. EXISTENCE AND CHARACTERIZATION
   1. Introduction
   2. Existence
   3. Characterization
II. CONVERGENCE CONSIDERATIONS
   1. Uniform Convergence of the Polynomial to the Function
   2. Continuous Dependence of the Approximating Polynomial on the Function
   3. The de la Vallée Poussin Algorithm
III. UNIQUENESS RESULTS
   1. An Example of Nonuniqueness
   2. A General Uniqueness Theorem
   3. The Trigonometric Polynomials
   4. A Special Class of Polynomials
IV. COMPUTATIONAL ALGORITHMS FOR A BEST APPROXIMATION
   1. Introduction
   2. The Continuity of e(T)
   3. Two Algorithms
   4. Convergence of the Algorithms
   5. Computational Procedures
   6. Computational Examples
BIBLIOGRAPHY
INDEX OF NOTATIONS

INTRODUCTION

Many advances have been made in the study of approximation theory since the rise of the electronic computer. The computer has stimulated the study of approximation theory because approximations are necessary in the efficient handling of many problems, and the computer provides the means whereby approximations may be computed. Many problems which once were dismissed as impractical because of the difficulties of computation are now solvable using an electronic computer.

A recent bibliography of approximation theory, a survey of recent Russian literature in approximation theory, and a number of papers on approximation theory are given in Garabedian [10]. Another, earlier survey of the literature in approximation theory is given by Buck [3,4]. Recently, several new books have been published which deal exclusively with approximation theory. Among these are Rice [23] and Cheney [5]. Both of these books contain extensive bibliographies.

The problem considered in this paper is that of approximating a given function f by a polynomial in such a way as to make one or more of the derivatives of the approximating polynomial approximate the corresponding derivatives of the function f in some prescribed manner.

In the first chapter, the existence and characterization of best approximations are treated extensively. Next the uniform convergence of best approximations to f and the continuous dependence of best approximations on f are studied. A theoretical algorithm for the computation of a best approximation is also presented. Then several classes of problems are described which have unique best approximations. Finally the computational problems of finding a best approximation are considered. Two algorithms are given and convergence theorems presented. Several numerical examples are also given.

CHAPTER I

EXISTENCE AND CHARACTERIZATION

1. Introduction

Let X be a compact subset of the real line, and let I be a finite collection of nonnegative integers including 0. The function f(x) and the base functions φ_1(x), φ_2(x), ..., φ_n(x), for fixed n ≥ 1, are assumed to be in C^q(X), where q is the largest integer in I. For each k ∈ I let w_k(x) be a continuous weight function on X. (We show in Lemma 1.5 that there is no loss of generality in assuming that the weight functions are nonnegative.) The following standard notation will be used:

    D^k g(x) = (d^k/dx^k) g(x) ,    D = D^1 ,    D f(x_0) = df(x)/dx evaluated at x = x_0 .

1.1 Definition. Let T be a closed subset of X×I and let g ∈ C^q(X). Then define

    M_T[g(x)] = max_{(x,k) ∈ T} | w_k(x) D^k g(x) | .

If T = X×I we will write M[g(x)] for M_T[g(x)]. (A set T ⊂ X×I is called closed if the sets V_k = {x : (x,k) ∈ T} for k ∈ I are all closed subsets of X.)

1.2 Problem. Find real scalars a_1, a_2, ..., a_n such that

    M[ f(x) − Σ_{i=1}^n a_i φ_i(x) ]

is a minimum. A solution to this problem is called a best approximation to f(x) on X×I with weight functions {w_k(x)}.

As a notational convenience, points in E^n are represented by α = (a_1, a_2, ..., a_n), β = (b_1, b_2, ..., b_n), etc., while polynomials are represented by

    P(x,α) = Σ_{i=1}^n a_i φ_i(x) .

Also, for any closed set T ⊂ X×I, let

    e(T) = min_{α ∈ E^n} M_T[ f(x) − P(x,α) ] .

It is shown in Theorem 1.16 that this minimum exists. e(T) will be called the deviation of a best approximation to f(x) on the set T.
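Restricted to a finite point set T, the pseudonorm M_T and the objective of Problem 1.2 are straightforward to evaluate numerically; the computational algorithms of Chapter IV work with exactly such finite restrictions. The following is a minimal modern sketch, not part of the original development: it assumes Python with numpy, and the helper names, the choice f(x) = e^x, and the monomial base functions are illustrative assumptions only.

```python
import numpy as np

def weighted_deviation(a, points, f_derivs, phi_derivs, weights):
    """M_T[f - P(., a)] on a finite point set T = [(x, k), ...].

    f_derivs(x, k)   -> D^k f(x)                    (assumed user-supplied)
    phi_derivs(x, k) -> array of D^k phi_i(x), i = 1..n
    weights(x, k)    -> w_k(x)
    """
    dev = 0.0
    for (x, k) in points:
        residual = f_derivs(x, k) - np.dot(a, phi_derivs(x, k))
        dev = max(dev, abs(weights(x, k) * residual))
    return dev

# Illustration: f(x) = e^x, base functions x^(i-1), I = {0, 1}, w_0 = w_1 = 1.
n = 4
f_derivs = lambda x, k: np.exp(x)                       # D^k e^x = e^x
phi_derivs = lambda x, k: np.array(
    [np.prod(range(i - k + 1, i + 1)) * x**(i - k) if i >= k else 0.0
     for i in range(n)])                                # D^k x^i
weights = lambda x, k: 1.0
T = [(x, k) for x in np.linspace(-1.0, 1.0, 21) for k in (0, 1)]
a = np.array([1.0, 1.0, 0.5, 1.0 / 6.0])                # truncated Taylor coefficients
print(weighted_deviation(a, T, f_derivs, phi_derivs, weights))
```

A routine of this form supplies the inner evaluation step for the discrete algorithms discussed later.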
We define R(T) = {aeEn:MT[f(x) - P(x,a)] = e(T)} to be the set of all best approximations to f(x) on T. When T is XXI we will write R for R(T) and e for e(T). The norm to be used on the polynomial coefficients which are points of En is lldll = Maxlail i=l,2,...,n Throughout this paper the symbol T, with or without a subscript or superscript, will denote a compact subset of XXI. 1.3 Theorem. MTEg] is defined for all gqu(X) and has the properties: (a) 0 :_MT[g] < a (b) MTEg] = 0 if g s O (c) MTEtg] = ItIMTEg] for any real scalar t (d) MTEg + h] :_MT[g] + MTEh]. The first three prOperties are obvious. The fourth can be proved using the triangle inequality for real num- bers. Since we allow the weight functions to be zero, we cannot prove the converse of (b); hence MTEg] is a pseudo- norm. l.A Corollary. The set R(T) of best approximations is convex. Proof: If P(x,a) and P(x,8) are two best approximations and O i t i 1, then MTEf - tP(x.a) — (1 — t)P(x,B)] MT[t(f(x) - P(x,a)) + (l - t)(f(x) - P(x,8))] IA MT[t(f(X) - P(X,G))] + MT[(1 - t)(f(X) - P(X,B))] tMT[f(x) - P(X.a)] + (l - t)MT[f(x) - P(x,B)] = e(T) Observe that the inequality in the above relation is actually an equality for all t. If this were not true, some linear combination of P(x,a) and P(x,8) would be a better approximation than either P(x,a) or P(x,8). Therefore, any convex combination of two best approxi- mations is a best approximation. Q. E. D. 1.5 Lemma. Let {wk(x)} for keI be continuous weight functions and MlEg] = Max wk(x)Dkg(x) , M2[g] = Maxllwk(x)IDkg(x) (x,k)eXXI (x,k)eXXI Then P(x,a) is a best approximation to f(x) in the pseudo— norm Ml if and only if P(x,a) is a best approximation to f(x) in the pseudonorm M2. Proof: P(x,a) is a best approximation in M1[g] if it minimizes Max lwk(x)Dk[f(x) — P(x,a)]! . (1) (x,k)eXXI In the case of M2[g], P(x,a) must minimize Maxllwk(x)||Dk[f(x) - P(x,a)]| (x,k)eXXI . (2) It is evident that for a given a, the quantities (l) and (2) are identical. This implies that an a which minimizes one of them minimizes both, and completes the proof. Since the set R of best approximations does not change if the absolute value of the weight function is substituted for the weight function, we will henceforth assume that all weight functions are nonnegative. In order to study some properties of the base functions of the approximating polynomials, we introduce the following matrices. 1.6 Definition. The interpolation matrix of a set Q which is a set of m ordered pairs{(x1,ki)}i:ll or ordered triplesl{(xi,ki,si)}i:ll is the an matrix B whose entry in the ith row and Jth column is k b = D ie 13 J O for any particular problem under consideration. 1.8 m. If MT[P(x,c)] = 0 implies Hall = 0, then T contains a set Q of n points such that det[WIM(Q)] # 0. (Here det denotes the determinant.) Proof: Assume that the conclusion is false. Let T' be a set of m points from T for which B = WIM(T') has rank m < n, where m is the maximum rank of any weighted inter- polation matrix in points from T. Let (xo,ko) be any point from T and define T" = T'U{(xo,ko)}. Then the rank of B' = WIM(T") is m and the row bm+l of 8’ corresponding to (xo,ko) can be written as a linear combination of the rows of B. Hence in matrix notation we have bm+ = YB 1 where y is a row matrix giving the proper multiples of the rows of B which are needed to produce bm+1' Next consider the space of solutions to the problem But = O. This problem has a solution space of dimension n - m, and since n > m there exists a nonzero solution do. 
The weighted value of this polynomial P(x,ao) at (xo’ko) t _ t _ _ m+lao _ YBGO - 00 Hence MT[P(X’°O)] — O and Ilaoll f O which contradicts the hypothesis of this is given by b lemma. Q. E. D. 2. Existence 1.9 Theorem. G(a) a MT[f(x) - P(x,a)] is a continuous function for deEn. Proof: Let K = Max MT[¢i(x)]. If K = O, i=l,2,...,n n G(a) is constant for all aeE and hence is continuous. If K ¢ 0, let a0 be a point of En. Given any e‘>0 let 5 = e/nK. Then Ila - aoll < 5 implies |G(B) - G(ao)l = IMT[f(X) - P(X,B)] - MT[f(X) - P(X,ao)]| n :IMT[f(x) — P(x,B) - f(X) + P(x,ao)]| = MTtiz (ai — bi)¢i(x)] =l ~nK < Max Iai — bi Ila - 8| -nK < s. Q. E. D. ‘ i=1,2,...,n ° 10 1.10 Corollary. The set R(T) of best approximations is closed. Pgoof: If R(T) is empty, the result is trivial. Other- wise R(T) = {a:G(a) = e(T)} and hence R(T) = G-l[e(T)]. Since G(a) is continuous and e(T) is a closed subset of the real numbers, R(T) is closed. Q. E. D. 1.11 Lommo. If aeR(T) (the set of best approximations on the set T) then MT[P(x,a)] i 2-MT[f(x)]. Pooof: MT[P(x,a)] - MT[f(x)] i MT[f(x) — P(x,a)] i MT[f(x)] where the second inequality holds because P(x,a) a 0 cannot be a better approximation than any best approximation. The result follows from these inequalities. Q. E. D. 1.12 Definition. A family of polynomials {P(x,a)} for m i=1 or a set Q' = {(x ,k ,s )} m if there exists some constant k i i 1 i=1 ass is said to be bounded at a set of points Q = {(x1,k1)} 2 such that ID 1P(xi,a)|: z for all aeS. 1.13 @333. Let {P(x,a)} for aeSCEn be a family of poly- nomials in n base functions which is bounded by B' at the n points of T<2XXI. Then if det[WIM(T)] # 0, there exists a constant B such that IIaII: B for all ces. Pgoof: Since det[WIM(T)] ¢ 0, any polynomial in these n base functions is determined by the values of the poly- nomial at the points of T. Then if V(xi’ki)’ i = 1,2,...,n gives the values of a polynomial P(x,a) at the points of 11 T and L J is the cofactor of CiJ from the ith row and Jth 1 column of WIM(T), the coefficients of P(x,a) are n = [ Z L aJ i=1 13V(xi,ki)]/det[w1M(T)], J = l,2,...,n. Then if L = MaleiJl. E = Idet£w1M(T)j| and 13 ; |V(Xi,k1)| :B' for i = l,2,...,l’1 WC have [A laJI nLB'/E for all ces. Thus B = nLB'/E is a uniform bound on ||c|| for ces. Q. E. D. 1.1M Theorem. The set R(T) of best approximations on the set T is bounded if MT[P(x,a)] = 0 implies IIaII = 0. Proof: Lemma 1.8 implies that T contains a set Q = {(xi’k1)}I=1 such that det[W1M(Q)] # 0 if MT[P(x,o)] = 0 implies ||a|| = 0. Then letting W = Min Iwk (xi)| k=l,2,...,n 1 and using Lemma 1.11, ki ki 2MT[f(x)] 1 MT[P(x,a)] :||wki(xi)D P(xi,a)| i WID P(xi,a)| for i = 1,2,..,r1 and aeR(T). Since det[WIM(Q)] # 0 implies that w > o, it follows that IDkiP(xi,a)l : 2MT[f(x)]/W for (xi,k1)eQ and oeR(T). Hence Lemma 1.13 implies that R(T) is bounded. Q. E. D. n 1.15 Corollary. If Q = {(xi’ki)}i=1 is a set of points from XXI such that det[WIM(Q)] # 0, then there exists a constant B such that ||c|| < B for all aeR(TlL)Q) for any Tl CXXI . 12 Proof: Let T' = TlLIQ; then from the proof of Theorem 1.1A we have ki 2M[f(x)] 1 2MT'[f(x)] 1 WID P(xi,o)| for i = 1,2,...,n and any aeR(T'). This implies that k 2M[f(x)] _>_ ID 1P(xk,a)| for 1 = 1,2,...,n and anR(TlUQ). W TlCXXI Then Lemma 1.13 implies that there exists a constant B such that ||c|| < B for all aeR(T1U Q) for any TlCXXI. Q. E. D. 1.16 Theorem. The set R(T) of best approximations on the set T is nonempty. £31929 Case I.--Assume that MT[P(x,a)] = 0 implies that IIaII = 0. 
By Theorem 1.1“ the set R(T) is bounded. Then C(a) = MT[f(x) - P(x,a)] is a continuous function on a compact set and hence attains its minimum value. Thus a best approximation exists and R(T) is nonempty. Case II.--If MT[P(x,o)] = 0 does not imply that Ilall = 0, there is a nontrivial polynomial P(x,8) such that MT[P(x,B)] = o. This means that n wk(x)DkP(x,B) = wk(x) z b Dk¢i(x) = o for (x,k)eT i=1 1 where some b1 is nonzero. Therefore, the set of base 13 functions may be reduced in number by at least one without affecting the values that the weighted polynomial can attain on T. i.e., With this reduced set of base functions, we can approximate f(x) on T Just as closely as we could using the original set of base functions. This reduction in the number of base functions may be repeated until those remaining satisfy the assumption of Case I. Q. E. D. 3. Characterization Instead of having one error function as in the approxi- mation of a function by polynomials, we have an error function for each keI. 1.17 Definition. Lk(x,c) 2 wk(x)Dk[f(x) — P(x,a)]. The functions Lk(x,o), which are defined for all (X,k)cXXI and aeEn, are called weighted error functions. Throughout this section on characterization, T will be assumed to be a closed subset of XXI such that e(T) > 0. Such sets T exist because we are assuming that e = e(XXI) > O. 1.18 Definition. An ordered triple (xo,k,s)eXXIx{-1,l} is called an extremum with respect to the approximation P(x,a) to f(x) on the closed set TCZXXI if (xo,k)eT and Lk(xo,a) = sd where d = MT[f(x) - P(x,a)]. 1.19 Definition. If acEn and Tc:XXI is closed, let C(T,a) = {(x,k,s)eXXIX{—1,l}:(x,k)eT and Lk(x,a) = sd} where d = MT[f(x) - P(x,c)]. If T = XXI, let C(a) = C(T,a). 14 The set C(T,o) is called the set of extrema of the approxi- mation P(x,o) to f(x) in the pseudonorm MT[g(x)]. Since MT[g(x)] gives the maximum absolute value of a finite number of continuous functions on a compact set, every approximation must have at least one extremum. This means that C(T,a) is nonempty for each aeEn. The points of C(T,a) are of considerable importance in the comput- tational schemes for solving Problem 1.2 which we discuss in Chapter IV. Here we are interested in the set C(T,o) when aeR(T). We shall give certain theorems characterizing the points of C(T,o) in this case. 1.20 Theorem. There exists an aOeR(T) such that for every BeR(T) and keI, C(T,do)<:C(T,B). 32223: We have previously shown that R(T) is closed, con- vex, and nonempty. Let m be the dimension of R(T). If m = O, R(T) is a single point, the best approximation is unique, and the theorem is true. If m 1 1 the set R(T) has a nonempty interior. Let do be in the interior of R(T). It_will be shown that this point satisfies the assertion of the theorem. Let B be any other point in R(T) which is not the same as so. Extend the line segment 3;? beyond do to some other point aleR(T) so that al’ao and B are on the same line with a between a o 1 Then oo.must be a point of the hyperplane and B. Let (Xo,k,s)eC(T,do). 15 k _ . wk(xO)D [f(xO) - P. (5) Suppose, for example, that s = l and that w (x )Dk[f(x ) — P - P(x c )1 > e(T) k 0 o o’ 1 which contradicts the assumption that aleR(T). Thus 8 must be a point of the hyperplane from line (5) since 8eR(T). This same conclusion is also reached if s = -1. This implies that every BeR(T) must satisfy line (5) and hence we conclude that (xo,k,s)eC(T,B) for all BeR(T). Q. E. D. We have also proved the following: 1.21 Corollary. 
If a’is an interior point of R(T) then k k D P(Xo,a) = D P(XO,B) for every (xo,k,s)eC(T,u) and every BeR(T). 1.22 Definition. The set MES(T) = r\C(T,B) is called the BER(T) minimal extremal set (or MES) of the best approximations to f(x) on the set T. 16 The previous theorem shows that MES(T) is the extremal set of any interior point of R(T). Since Lk(x,a) is continuous in x for fixed k and a, we observe that C(T,a) is closed; hence we have the following: 1.23 Corollary. If T is any closed, nonempty subset of XXI, then MES(T) is compact and nonempty. So far we do not have any method for determining if a given approximation to f(x) is a best approximation. To alleviate this difficulty, we use the following definition and theorem. 1.2“ Definition. A polynomial P(x,a) is said to satisfy m i’ki’si)}i=l (alsokcalled points) where (xi,ki,si)eXXIX{-1,1}, if sgnED iP(xi,o)] = 'Si’ 1 = 1,2,...,m. Condition A with respect to a set of triples {(x 1.25 Theorem. A polynomial P(x,u) is a best approximation to f(x) on the set T if and only if there is no polynomial P(x,8) which satisfies Condition A with respect to C(T,o) the extremal set of the approximation P(x,o). Pgoof: Assume that P(x,o) is not a best approximation so that there exists P(x,y) such that MT[f(x) — P(x,y)] = d' < MT[f(x) - P(x,u)] = d. Let P(x,8) = P(x,o) - P(x,y). Then for any (xo,k,s)eC(T,a) we have w (x )Dk[f(x ) — P(X 0.)] = sd k 0 o o’ ' Since wk(xo) > 0 it follows that 17 DkP(x e) = Dk[f(x > - P - P1 o’ o o’ o o’ is greater than or less than s(d' - d)/wk(xo) as s is -1 or +1 respectively. Therefore, P(x,8) satisfies Condition A with respect to C(T,a). Let d = MT[f(x) - P(x,o)] and assume that there exists a polynomial P(x,8) satisfying Condition A with respect to C(T,a). Then let U = Closure {(x,k)eT:sgnDkP(x,B) = sgnDk[f(x) - P(x,a)]} d' = Max ILk(x,a)l r = Max Iwk(x)l (x’k)€U (X,k)€T s = Max IDkP(x,B)| t = Min{(d-d')/2rs,d'/rs}. (x,k)eT Then wk(X){Dk[f(X) - P 0 which 1 1 2 2 separates the origin and the convex hull of Y. Thus for Z nn 20 each n-point Yi of Y, the inequality o-Yi < 0 is satis- fied. Thus P(x,u) is a polynomial satisfying Condition A with respect to the points of Y. Assume that there exists a polynomial P(x,a) satis- fying Condition A with respect to the points of Y. Then a-Yi < O for all n-points Yi of Y. Then since Y is com- pact, there exists a constant b > 0 such that a-Yi < —b for all Y of Y. Thus the origin is not in the convex i hull of the n-points of Y. Q. E. D. Next we have a theorem which gives an upper bound on the number of points in a MCS. It states that a MCS can have at most n+1 points. 1.31 Theorem. If Y is a closed subset of XXIX{-l,l} and if the n-points of Y contain.the origin in their convex hull, then there is a set of k i n + l n—points of Y whose convex hull contains the origin. This theorem is an immediate consequence of the Theorem of Caratheodory which is stated below as given in Cheney [6] p. 17. 1.32 Theorem of Caratheodory. Let A be a subset of n- dimensional linear space. Every point of the convex hull of A is expressible as a convex linear combination of n+1 (or fewer) elements of A. In ordinary Chebyshev approximation by polynomials in a Chebyshev set of base functions a MCS must contain exactly n+1 points. This is because the interpolation 21 matrix in any n distinct points is nonsingular and hence for any n distinct points we have a polynomial satisfying Condition A. 
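Theorem 1.30 turns the question of whether some polynomial satisfies Condition A with respect to a finite set of triples into a finite-dimensional test: the origin of E^n must lie in the convex hull of the n-points s_i (D^{k_i}φ_1(x_i), ..., D^{k_i}φ_n(x_i)). That test is itself a small linear-programming feasibility problem. The sketch below is an illustration under assumptions not made in the text (Python with numpy/scipy, and the helper names are invented); it is applied to the set B_1 of Example 1.33 below, for which the thesis exhibits an explicit convex combination producing the origin.

```python
import numpy as np
from scipy.optimize import linprog

def origin_in_convex_hull(Y):
    """Theorem 1.30 test.  Y is an (m, n) array whose rows are the n-points
    s_i * (D^{k_i} phi_1(x_i), ..., D^{k_i} phi_n(x_i)) of a candidate set.
    Returns True when the origin lies in their convex hull, i.e. when NO
    polynomial satisfies Condition A with respect to the set."""
    m, n = Y.shape
    # Feasibility LP: find lambda >= 0 with sum(lambda) = 1 and Y^T lambda = 0.
    A_eq = np.vstack([Y.T, np.ones((1, m))])
    b_eq = np.append(np.zeros(n), 1.0)
    res = linprog(c=np.zeros(m), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * m, method="highs")
    return res.success

def n_point(x, k, s, n=4):
    """n-point of the triple (x, k, s) for the base functions phi_i(x) = x^(i-1)."""
    d = [0.0] * n
    for i in range(n):                      # D^k x^i
        if i >= k:
            c = 1.0
            for j in range(i - k + 1, i + 1):
                c *= j
            d[i] = c * x ** (i - k)
    return s * np.array(d)

# The set B1 of Example 1.33 (phi_i(x) = x^(i-1), n = 4):
B1 = [(-1, 0, -1), (-1 / np.sqrt(3), 1, -1), (0, 0, 1), (1, 0, -1)]
print(origin_in_convex_hull(np.array([n_point(*p) for p in B1])))  # True: no Condition A polynomial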
In the more general case a MCS may have fewer than n+1 points, and there may be more than one MCS having fewer than n+1 points. Also, there may be two or more minimal characterization sets which do not have the same number of points. These special cases will be shown in the following examples. 1.33 Example. Let I = {0,1}, w0(x) i-l wl(x) E 1. n = A, ¢i(x) = x for i = 1,2,3,” and X = {-l,-1//3,0,1//§,1}. The function f(x) is defined on XXI by the following table. x f(x) f'(X) -l -l O -1. 0 .1 ./3 O l 0 i 0 1 /3 l -l 0 Let B l {(-l,0,—1),(-1//3,1,-l),(0,0,l),(l,0,-l)}. CD II 2 ,{(-l,0,-l),(l//§,l,l),(0,0’1)’(1,0,_l)}. 22 Then B' = {(-1,1,-l,1),(0,-l,2//3,-1),(l,0,0,0),(-l,-l,-l,-l)} is the set of n—points of both B1 and B2. The origin can be expressed as a convex combination of these points of E” by using the multiples 2 + /§ /§ 2 2 - /§ 2(A + /§) A + /§ A + /3 2(u + /§) Thus using Theorem 1.30 there is no polynomial satisfying Condition A with respect to the points of B1 or B2. The nonsignularity of the generalized Vandermonde matrix im- plies that there is a polynomial satisfying Condition A with respect to any 3 point subset of either B1 or B2. Thus B1 and B2 are both minimal characterization sets. The sets B1 and B2 is the origin of E“. Since there is no polynomial satis- are both subsets of C(0) where 0 fying Condition A with respect to B1 or B2, there can not be any polynomial which satisfies Condition A with re- Spect to C(O). Hence a = 0 is a best approximation, and the sets B1 and B are both in MCS(XXI) if 0 is an interior 2 point of R. To show that 0 is interior to R we will determine R. Since 0 is a best approximation, e = M[f(x) - P(x,o)] = 1 for a = 0 and the following inequalities must be satisfied. 23 -1 i f(xi) - P(xi,o) i —1 i f'(xi) — P'(xi,a) < x1 = -1, x2 = -l//3, x l l i 1.2.3.“.5 (a) i 1,2,3,U,5 (b) 3 = o, xu = 1//§, x = 1. In the following computation we will refer to these inequalities by a letter and a number. to line (b) with i = 5. From lines (a) and (b) we have -1< Then adding and -1< Adding these we ..l< -1: -1 - a1 + a2 - a3 + an -1 - al - a2 - a3 - an dividing by 2 -1 - a - a l 3 -l + a obtain a3 1 0. -1 - a2 + (2//3)a3 - an -1 + a2 + (2//3)a3 + a“ Adding and dividing by 2 e.g., (b) 5 refers from (a) 1 from (a) 5. from (a) 3. (7) from (b) 2 from (b) A. -1 i -1 + (2/ 3)a3 implies O : (2//3)a3 and using (7) we know a3 = 0. 24 Then equation (a)3 implies that al 1 0. -l i —1 - a1 + a2 + an from (a) l -1 i -l — al - a2 - au from (a) 5. Adding and dividing by 2 -l o -1 - al and hence al 1 0 and thus a1 = 0. Equation (b)A implies that l - a2 - a“ i l and hence -a2 - an i 0. Then (a) 5 implies that 0 1 -a2 - an and hence a2 = an. From the computation using lines (a) and (b) we know that the best approximations are polynomials of the form au(x3 - x). Since this family of polynomials is zero at the points of BlLIB2, the solution space R must be deter- mined by the other points of XXI. Using the equations (b)l or (b)5 we have —1 1 -a2 + an i l or -1 1 2a“ 1 1. Then it can be verified that all of the other inequalities from (a) and (b) are satisfied for these values of an. Thus the set R of best approximations is au(x3 — x) for -l/2 i an 1 1/2. Therefore, a = 0 is interior to R, C(O) is the set MES(XXI), and the sets B and B l 2 are in MCS(XXI). 1.34 Example. Let x = {-2,-1,0,l,2}, I wo(x) a wl(x) 5 1 and ¢i(x) = x1_1 for i = 1,2,3. 
The {0,1}, n = 3, function f(x) is defined on XXI by the following table: 25 x f(x) f'(x) -2 l O -1 —1 0 O O —l l l O 2 -l O From ordinary Chebyshev approximation theory we know that B = {(-2,0,l),(-1,0,—l),(1,0,l),(2,0,-l)} l is a MCS and that a = 0 is the unique best approximation with e = 1. Now we will show that B = {(-l,O,-l),(O,l,—l),(l,0,l)} 2 is also a MCS. We must show that there is no polynomial satisfying Condition A with respect to B2. Thus we need to verify that the following inequalities are inconsistent. a1 + a2 + a3 < a1 — 32 + a3 < (8) (9) (10) Adding (8) and (10) we have a2 < 0, which contradicts (9). Therefore, B2 is a MCS. Since both B and B are subsets 2 26 of C(O) = MES(XXI), the sets B and B are in MCS(XXI). 1 2 Thus the various minimal characterization sets for a best approximation (those in MCS(XXI)) need not have the same number of points. 1.35 Theorem. If T is a finite subset of XXI, the dimension of R(T) is n - k where n is the number of base functions and k is the rank of B = WIMEMES(T)]. Proof: Let To = {(x,k):(x,k,s)eMES(T)} and let T = T ~ T 1 o' If P(x,y) is a best approximation, where y is in the in- terior of R(T), (if the best approximation is unique, let P(x,y) be this unique best approximation) then MTl[f(x) — P(X,Y)] = d < e(T). This is because the maxi- mum is taken over points which are not points of MES(T) and because T is finite. Then the continuity of MTl[f(x) - P(x,y)] in y implies that there exists an e > 0 such that MTl[f(x) - P(x,y) - P(x,a)] < e(T) for oeN(o,e) a small neighborhood of the origin in En. Next consider the solution space S of But = 0. S has dimension n-k where k is the rank of B. We also know that MTO[P(x,o)] = 0 and MTO[f(x) — P(x,y) - P(x,u)] = e(T) for all aeS. Then for aeN(o,e)f\S it follows that MT[f(x) - P(x,y) - P(x,y)] = e(T) since T = TotlTl. Since N(o,e)/\S has dimension n-k, the dimension of R(T) must be at least n-k. But any two best approximations must be equal at the points of To' Then if the dimension of R(T) is greater than n-k, the solution space of But = 0 must 27 also be greater than n-k. This is a contradiction to the assumption that k is the rank of B. Q. E. D. In the following corollaries, T is assumed to be a closed subset of XXI and not necessarily finite. 1.36 Corollary. If T is any closed subset of XXI, the dimension of R(T) is bounded above by n-k. 1.37 Corollary. If R(T) is not a single point and is a boundary point of R(T), then C(T,o) contains at least one more point than MES(T). Pooofz If not, then the proof of Theorem 1.35 implies that for some a > 0 there is a neighborhood N(a,e) of a such that N(a,e)/lSCR(T). This contradicts the assumption that o is on the boundary of R(T). 1.38 Corollary. If the rank of WIMEMES(T)] is equal to n, then the set of best approximations R(T) is a single point and the best approximation is unique. 1.39 Definition. A polynomial P(x,o) is said to have a ziro at the point (xo,ko) or the point (xo,kO D oP(xo,d) = 0. ,so) if 1.40 Lemma. If Q is a MCS of k i n + 1 points, then the ranks of both IM(Q) and IM(QO) are 1-1 where QO is any k—l point subset of Q. 28 3193;: Let m be the rank of IM(Q). This means that the n-points of Q (to be denoted NQ) span a subspace of En of dimension m. (The rows of IM(Q) are the n-points of Q except for the factor of -1 if s = -1.) The origin of En is in the convex hull of NQ because Q is a MCS. 
Apply- ing the Theorem of Caratheodory to this m dimensional sub— space, the origin may be expressed as a convex combination of some set of not more than m+1 points from NQ. Since Q is a MCS, the origin cannot be expressed as a linear combination of any subset of NQ and hence k :_m + 1. If k < m + l, the rank of IM(Q) is k and there exists a poly- nomial satisfying Condition A with respect to Q. Thus k = m + 1 and the rank of IM(Q) is khl. Suppose that the rank of IM(QO) is less than k-l for some k-l point subset Q0 of Q. This implies that there exists a polynomial P(x) which is zero at the points of Q0 and nonzero at (xo,ko,so) = Q ~ Q0. Since Q is a MCS there exists a polynomial V(x) which satisfies Condition A with respect to Q0 such that 30 # -sgn[Dk°V(xo)]. Then k if D °V(xo) s o, k k k V(x) - 2[D OV(xO)]sgn[D °P(xo)]P(x)/|D"P(xo)| is equal to V(x) at the points of Q0 and is equal to k‘ k 0 o -D V(xo) at (xo,ko,so). If D V(xo) = 0, then k k V(x) — sosgn[D °P(xo)]P(x)/|D °P(xo)| 29 is equal to V(x) at the points of Q0 and is equal to -sO at (xo,ko,so). Thus in either case, there exists a poly- nomial satisfying Condition A with respect to Q. This is a contradiction and implies that the rank of IM(QO) is k-l. Q. E. D. 1.A1 Theorem. If any set in MCS(T) contains n+1 points then the best approximation is unique. Pooog: Let Q be a set of n+1 points which is in MCS(T). Since we are assuming that e(T) > 0 in this section, the ranks of WIM(Q) and IM(Q) are the same. (The weight functions are nonzero on Q.) Since Q is a subset of MES(T) we may apply Corollary 1.38 and Lemma 1.A0. Q. E. D. CHAPTER II CONVERGENCE CONSIDERATIONS 1. Uniform Convergence of the Po1ynomial to the Function In uniform approximation theory one basic property that any set of approximating polynomials should possess is that of the uniform convergence of the polynomials of best approximation to the function being approximated as the number of base functions increases. This is desirable because when approximating a function, a certain maximum allowable deviation is usually specified and the number of base functions is chosen so that this requirement can be satisfied. If the approximating polynomials do not converge uniformly to the function, then for some speci- fied maximum deviation 6 > 0 we may use as many base functions as we please, but the deviation e of the best approximation will always be greater than c. The follow- ing example illustrates a typical difficulty which must be avoided. 2.1 Example. Let f(x) = sin x + l, I = {0,1}, X = [—l,l], ¢i(x) = x1 for i = 1,2,... and wo(x) wl(x) a 1. Then 30 31 we know that DP(x,a) can approximate Df(x) as closely as desired using the Weierstrass Approximation Theorem, but the difference between f(x) and P(x,a) is 1 at x = 0 for any polynomial in these base functions. The following theorem gives conditions on the base functions which are sufficient to overcome this difficulty. m i-l q 2.2 Theorem. Let {¢i(x)}i=1 include the set {x }1:1 and let X = [a,b], where b > a. Suppose that for any f(x)qu(X) and any 2' > 0 there exists an n and a poly- O nomial P(x,B) in no base functions such that IDq[f(x) - P(x,B)]I < e' on X. Then given any a > 0 n there exists an n and a polynomial P(x,o) = z ai¢1(x) i=1 such that M[f(x) - P(x,a)] < s. Proof: Let a > 0 be given and let 8' = e/E where E = Max [Iwk(x)l(b - a)q-k]. This maximum exists (x,k)eXXI because the functions wk(x) are continuous on X for keI. 
i-l q Obtain a polynomial P(x,8) which has {x }i=l among its base functions and for which the coefficients of{x1-l}?_=1 are zero, such that qu[f(x) - P(x,8)]| < e' on X. Let T(x) be the Taylor expansion of P(x,8) - f(x) about the point x = a with q terms and define P(x,a) = P(x,B) - T(x). Then for any XeX IDq[f(x) — P(x,a)1| = IDq[f(x> - P(x,8)]l < e' More generally we have 32 IDk[f(x) - P(x,a)]l < e'(b - a)q’k for k = 0,1,...,q This is true because Dk[f(a) — P(a,a)] = 0 for k = 0,1,...,q-l. Hence lwk(x)Dk[f(x) — P(x,o)]|< Maxlwk(x)|e'(b - a)q-k §_Ee' = c xeX for all keI and xeX. Q. E. D. 2. Continuous Dependence of the Approximating Polynomial on the Function The next theorem is an extension of a theorem by Maehly and Witzgall [1“]. It relates the closeness of P(x,af) and P(x,ug), (best approximations to f(x) and g(x) respectively in the pseudonorm M) as a function of the closeness of f(x) and g(x) where g(x) is considered fixed. As expected, under the appropriate hypotheses the best approximations to f(x) and g(x) will be close to each other if f(x) is sufficiently close to g(x). 2.3 Theorem. Let f(x) and g(x) be in Cq(X) with sets of best approximations Rf and R8 respectively on XXI, with respect to the weight functions {wk(X)}keI' If MCS(XXI)g (the set of minimal characterization sets which are sub- sets of MES(XXI) for the function g(x)) contains a MCS of n+1 points, then there exists a constant B which depends n only on g(x), {¢i(x)}i=1 and{wk(x)}kEI such that 33 Max le[P(x,af) - P(x,ag)]l i B Max le[f(x) - g(x)]l (x,k)eXXI (x,k)eXXI where “g is the unique best approximation to g(x) and of is any best approximation in Rf. n+1 Proof: Let Q s {(xi’ki’si)}i=l be a MCS from MCS(XxI)g and let ef and eg be the deviations of P(x,af) and P(x,ag) from f(x) and g(x) respectively. (Since it is assumed from the beginning of this paper that e, the deviation of the best approximations to a function, is nonzero for any function under consideration, we will assume that eg > 0.) Also let a = Max IDk[f(x) — g(x)]l and w = Max lwk(x)|. (x,k)EXXI (X,k)eXxI Then ef _<_ M[f(x) - P(x,ag>l §_ M[f(x) - g(x)] + M[g(x) - P(x,ag)] 1 W6 + eg and hence -W6 1 eg - ef (1) We also know that k siwki(xi)D'i[g(xi) - P(xi,ag)] = eg (2) k Siwki(xi)D i[f(xi) - P(xi,af)] i ef (3) 34 for i = 1,2,..un+l; upon subtracting (3) from (2) and using line (1) we have ki ki siwki(xi)[D P(xi,af - ag)] 1 siwki(x1)D [f(xi) — g(xi)] +6 - g ef 1 - 2W6. Thus letting a = of - cg we see that k siD iP(xi,o) 1.- ggi for i = l,2,...n+l (4) where z = Min Iwk (x1)|. (z is not zero because i=1,..”n+1 i e > 0. g ) To complete the proof we shall prove the following lemma. 2.4 Lemma. Let P(x,a) satisfy k D iP(xi,a) 1 -2W6/z for i = l,2,...,n+l, where 5’i _ n+1 Q - {(xi’ki’si)}i=l is a MCS of n+1 points. Then there is a constant B independent of 6 such that Max IDkP(X,q)I 1 B6. (x,k)eXXI Proof: Assume that such a constant does not exist. Then for each integer m there exists an “m such that P(x,am) satisfies (4) and Max leP(x,am)| > m6. As a conse- (x,k)eXXI , quence of Lemma 1.40 and Lemma 1.13, the sequence {P(x,am)} is bounded at most n-l points of Q. In addition, for each polynomial in the sequence there must be a point 35 k l (xi,k1,si)eQ such that sgnED P(xi,am)] # -s (Other— 1' wise we would have a polynomial satisfying Condition A with respect to Q.) Thus for each am there exists a point (xT,ki,xi) Q such that k 0 : siD iP(x?,am) 1 2W6/z. 
Choose a subsequence of {P(x,am)}, also to be de— ki noted {P(x,am)}, so that the sequences {D P(xi,am)} converge to 11 or diverge to :w. This divides Q into two k parts, Q1 and Q2 where Ql is where {D iP(xi,am)} is bounded and Q2 is the rest. Thus Q1 and Q2 are both nonempty and Q1 contains no more than n-l points. For each (Xj’kj’sj) in Q1 consider the functions P(x,8,) such that ) II (D I >a “J D P(xJ.,BJ ) I O k D JP(xi,B for xite, 1 # j. J Then specify P(x,BJ) at as many more points to make it unique. (We know that these polynomials exist by Lemma 1.40.) Consider the function P(X,Efi) E P(x,am) + 2 P(x,81). XieTl k For each point of Q2, {D lP(xij'mfl diverges to siw. 36 For points of Q1 we have R k i — i D P(xi,am) = D P(Xi,dm) + s - A i i' k Thus for m large enough, sgnED iP(x1,§'m)] = 31 for i = 1,2,..”n+1 and therefore -P(x,§h) satisfies Condition A with respect to Q. This is a contradiction and com- pletes the proof of the lemma and Theorem 2.3. A more general result which is applicable to all approximation problems under consideration is obtained in the following theorem. 2.5 Theorem. Let M be a pseudonorm over XXI with weight functions wk(x) for keI. Let f(x) and g(x) be functions with sets of best approximations Rf and Rg respectively. Then given a > 0 there exists a 6 >0 depending only on e and g(x) such that if afeRf, then M[g(x) - f(x)] < 6 implies that Min M[P(X,ag) — P(x,af)] < e. R age 3 Proof: If this theorem is not true, there exists an so > 0 and a sequence of functions {fm(x)} and their corresponding sets of best approximations Rf from m which we obtain a sequence {P(x,am)} which satisfies the following: 37 Min M[P(x,og) - P(x,am)] 1 e R 0‘se a O for all m. (5) l M[g(X) - fm(X)] < 5- Then using Lemma 1.11 and the triangle inequality we have M[P(x,am)] 1,2°M[fm(x)] 1 2(M[g(x)] + l/m) i 2M[g(x)] + 2 for all m. Since we are assuming that M[P(x,a)] = 0 implies IIaII = 0, there exists a set T of n points which has a nonsingular interpolation matrix. Lemma 1.13 implies that the co- efficients of {P(x,cm)} are uniformly bounded for all m and hence there is a convergent subsequence, which also will be denoted {P(x,am)}, which converges to a poly- nomial P(x,cm). Then the assumption given in line (5) implies that/P(x,aw) is not a best approximation to g(x). Therefore there exists a constant e' > 0 such that if ageRg then M[g(x) — P(x,og)] + e' = M[g(x) — P(x,am)] i M[g(x) - fm(X)] + MEfm(X) - P(X,Um)] + M[P(x,am) — P(x,am)]. (6) 38 Then determine N so that m 1 N implies that M[g(x) - fm(x)] i e'/4 and M[P(x,um) - P(x,am)] i e'/4. (7) Therefore, for m 1 N it follows from (6) and (7) that M[g(x) - P(x,ag)] + €'/2 i.M[fm(X) - P(x,am)]. (8) But we also have MEfm(x) - P(x,ag)] :_M[fm(x) - g(x)] + M[g(x) - P(x,ag)] i e'/4 + M[g(x) - P(x,og)] for m 1 N. (9) Thus (8) and (9) imply that there exists an m such that M[fm(x) - P(x,ag)] + e'/4 i e'/2 + M[g(x) — P(x,ag)] i MEfm(x) - P(x,am)]. This is a contradiction since P(x,dm) is a best approxi- mation to fm(x). Q. E. D. 3. The de la Vallée Poussin Algorithm In this section it will be assumed that X = [a,b] where a < b. Moreover, the approximation problems under consideration all satisfy Assumption 1.7 and hence from Lemma 1.8 there exists a set Q0 of n points from XXI for 39 which det[W1M(QO)] # O. From the set XXI choose a count- able dense subset Q' which contains the set Q0. Let Q" = {(x,k)eXXI:wk(x) = O} and Q = Q' ~ Q". 
Since the weight functions are continuous, Q is dense in XXI except where the weight functions are identically zero on an interval. In addition, since det[WIM(QO)] # O, we know that QO C, Q. Let Tm be subsets of Q defined for m 1 n and con— taining m points such that QoCTkCTk+l for k = n,n+l,... where the limiting set of points Tm as m approaches m is Q. Also, let 5m (the spacing or density of the set Tm in Q) be defined as 6' = Sup {Min[lx - x'l + (b-a)|k-k'|]}. (x,k)eQ (x',x')eTm (10) The quantity b-a is used in this definition so that am < b-a implies that there are at least two points (x,k)eTm with second coordinates equal for each keI. Then one would hope that for ameR(Tm) and a*eR it would follow that lim M[f(x) — P(x,am)] = M[f(x) — P(x,a*)] = e m+oo More specifically, if {(P(x,o )} is any sequence with terms m from R(Tm), one would hope that any convergent subsequence of {P(x,am)} would converge to a polynomial in R. These no results are proved below, with stronger conclusions being obtained in certain cases. 2.6 Definition. f(x) is said to satisfy a Halder con- dition on [a,b] with exponent 9 if there exists a con— stant A such that |f(y) - f(X)| : Aly - x|p p > 0 for all x and y in [a,b]. 2.7 Remark. If f'(x) satisfies a H81der condition on [a,b] with exponent p§_l, then f(x) also satisfies a Halder condition on [a,b] with exponent p. This follows from the boundedness of f'(x) on [a,b] and the fact that if f(x) satisfies a Holder condition with exponent 1, then it also satisfies a Holder condition with exponent p for 0 < p <1. (See Natanson [21] p. 72.) 2.8 Theorem. Let qu(x) and {Dq¢i(x)}i:l satisfy a Holder condition with exponent p :1. Then there exists n a constant K depending on f(x), {¢i(x)}i=l and {wk(x)}kEI such that o 1 M[f(x) — P(x,am)] - e : K(6m)° for m 1.n and any umeR(Tm). (Here 6m is defined in line (10).) 41 Proof: Choose N so that m 1_N implies that am < b-a and then assume that m 1_N in this proof. Let (y,k)eXXI be such that M£f - P(X,am)] = Iwk(y)Dk[f(y) - P(y,am)]| for a fixed omsR(Tm); also, let (z,k) be a point in Tm such that Iy-zl < 6m. Let e and e(Tm) denote the deviations of P(x,a*) and P(x,am) from f(x) on XXI and Tm respectively where a*eR and ameR(Tm). Then M[f(x) - P(x,am)l = Iwk(y)Dk[f(y) - P(y,am)]| 1 |wk(y)Dk[f(y) - f(z)]| + |wk(y)Dk[f(z) — P(z,am)]| + Iwk4 1 2 Ll(x) wl(x)[Df(x) - 1/2] (4-x)(5/2-x) if 2 < x < 4 It may be verified that the extreme values of Lo(x) on [0,4] are 1 and -l at x = 0 and x = 4 respectively. The extreme value of Ll(x) is l for x = 2. Then the claim is that Q = {(0,0,1),(4,0,-1),(2,1,l)} is in MCS(XXI) and since the deviation of the approximation is not attained at any other point of XXI it follows that Q is equal to MES(XXI). Using Theorem 1.30 it must be shown that the origin is in the convex hull of the n-points of Q which are (1,0,0), (-l,—4,-l6) and (0,1,4). Using the multiples 1/6, 1/6 and 2/3 respectively, we have the origin as a convex combination of these points. This shows that there is no polynomial satisfying Condition A with respect to Q. Then using the properties of the generalized Vandermonde matrix, any two point subset Q' of Q does have a poly- nomial satisfying condition A with Respect to Q'. Thus QeMCS(XXI), e = 1, and x/2 - l is a best approximation. Next we observe that ax(x-4) is the only family of polynomials in the base functions under consideration which is zero at the points of Q. 
Hence using Corollary 1.21, the space of best approximations is P(x,a) E x/2 - 1 + ax(x-4) where a is restricted to some interval containing zero. Then it may be verified that the space of best approximations 46 is P(x,a) for ae[—(2/2 — l)/8,(2/2 - l)/8]. The weighted error curves are given in the following graphs for various values of a. a1\‘ a2\ a {A au\ a5\ Lo(x) = wo(x)[f(x) - P(x,a)] for a = a1, i = l,2,3,4,5 Ll(x) = wl(x)[Df(x) - DP(x,a)] for a = a1, 1 = l,2,3,4,5 a1 = (2/2 - 1)/8 , a2 = al/2 , a3 = c, a“ = -a2, a = -a 5 l' 47 2. A General Uniqpeness Theorem We have shown by example that a solution to a problem of approximating a function and its derivative need not be unique. However, certain problems do have unique solutions. Such a unique solution can sometimes be detected using the results of Chapter I. For example if one has a solution to a given approximation problem and it has a MCS of n+1 points, then it follows from Theorem 1.41 that this solution is unique. A much more desirable result is one which can be used to claim uniqueness before a solution is found. Such a result will hold provided that {¢i(x)}1:1 and f(x) satisfy certain conditions. First we shall indicate conditions on the base functions which will be used in a uniqueness proof. Throughout this chapter the notation |s| will be used to indicate the cardinality of the finite point set S. We shall first consider the special base functions ¢i(x) = x1-1 for i = 1,2,...,n. We shall use the following properties of these base functions: (a) D2¢i(x) exists on [a,b] for i = 1,2,...,n. (b) DP(x,a) = iElaiD¢i(x) has at most n-2 zeros on [a,b] if DP(x,a) z 0 on [a,b]. (c) At most one base function is constant on [a,b]. In the following two lemmas certain point sets of n and n-1 points are described which have interpolation matrices of rank n and n-1 respectively. A MES cannot have an interpolation matrix for which the rank is equal to the 48 number of points. If it did, there would exist a poly- nomial satisfying Condition A with respect to this MES. Thus we are able to establish a lower bound on the number of points in a MES. This lower bound is essential to the proof of uniqueness theorems. 3.2 Remark. If it is desired to interpolate the function 0 at the points of a finite set QCZXXI by a polynomial in n base functions, a homogeneous system of m = IQI linear equations in n unknowns must be solved. From the theory of linear equations, the dimension of the solution space will be n-r where r is the rank of the interpolation matrix of Q in n base functions. 3.3 Definition. A polynomial P(x,a) has a zero at the point (xo,k) or the point (xo,k,s) if DkP(xO,a) = 0. 3.4 14311112- Given any set TC[a,b] such that |T| = n-l, define Q' = TX{1}. Then if n > 1 the interpolation matrix IM(Q') has rank n-l. £5993: Assume that IM(Q') has rank k i n - 2. Consider the problem of interpolating the function 0 at the points of Q'. The dimension of the solution space is n - k 1 2. Since at most one base function is constant on [a,b], there exists a polynomial P(x,a) such that DP(x,a) i 0 on [a,b] and DP(x,a) has n-l zeros on [a,b]. This contradicts property (b) and completes the proof. Q. E. D. 49 3.5 gommo. Let S,T' and T" be finite sets such that T"C{a,b}, T'CSC[a,b] and [S] + |T'UT"| = n; also, define T = T'LJT" and Q = SX{0}L}TX{1}. Then IM(Q), the interpolation matrix of Q, is nonsingular if n > 2 or S in nonempty. Ppoofl: First we observe that n > 2 implies that ISI 1 1. 
Then assume that the interpolation matrix for Q is singular and that n > 2. This implies the existence of a nontrivial polynomial P(x,a) with zeros at the points of Q. Since T" can only contain the end points of [a,b], we may apply Rolle's Theorem to every adjacent pair of points from SX{0}, if there are any, to conclude that DP(x,a) has at least ISI - 1 zeros on [a,b] in addition to the |T| zeros of TX{1}. This implies that DP(x,a) has at least ISI + |T| - 1 = IQI - l = n - l zeros which contradicts property (b) of the base functions under consideration. Thus DP(x,o) E 0 and since ISI 1 1, it follows that P(x,a) e- 0. Q. E. D. In Corollary 1.21 we established that any two best approximations must agree at the points of a MES. In the following theorem we find that under the appropriate hypotheses the first derivatives of any two best approxi- mations must also agree at the points of MES which are interior to XXI. 50 3.6 Theorem. Let (xo,k,s) be a point from MES(XXI) such that x0 is in the interior of [a,b]. Then if Dwk(xo) and Dk+l[f(xo) - P(xoo)] exist for aeEn it follows that Dk+1 Dk+1 P(xo,c) = f(xo) + seDwk(xO)/[wk(xo)]2 for all deR. k Proof: For any aeR, Lk(xo,o) = wk(xo)D [f(xo) - P(xo,a)] has a relative extrema at X0 in the interior of [a,b]. Since the derivative of Lk(x,a) exists at xO it must be zero there. Thus k+ 1[ Dwk(xO)Dk[f(xO) — P(xo,c)] + D f(x ) — P(xo,a)]wk(xo) = o O and since we assume that e > 0 it follows that wk(xo) # 0 and hence Dk+lP(xo,a) = Dk+l[f(xo)] + Dwk(xO)Dk[f(xO) — P(xo,a)]/wk(xo) Dk +1f(xo) + seDwk(xO)/[wk(xo)]2 since Lk(xo,a) = se. Q. E. D. The proof of the following uniqueness theorem follows the pattern of the proof used in the classical case of ordinary uniform approximation theory. If there are two best approximations, it follows from Corollary 1.21 and Theorem 3.6 that the difference of the two is a poly- nomial with a certain number of zeros. This number is 51 dependent on the number of points in the MES. Under appropriate conditions this number of zeros-is sufficient to insure that the difference polynomial is identically zero. This completes the proof of uniqueness. This same theorem is given in Moursund [16,17]. The proof here is considerably simplified, and easily extends to other cases of interest. 3.7 Theorem. Let X = [a,b], I = {0,1}, and let Dwo(x), le(x), and D2f(x) exist on [a,b]. If ¢1(x) = xi—l for i = 1,2,...J1then one of the following is true: (a) The best approximation is unique. (b) The best approximation is unique except for an additive constant, and if P(x,a) is any best approximation then DP(x,B) is the unique best approximation to Df(x) with weight function wl(x). Ppoog: For notational convenience we define the following sets. MES(XXI) is the minimal extremal set of points for the best approximations on XXI. Then let 0 II {(x,k):(x,k,s)eMES(XXI)} 0 so = Qofl(a,b)X{0} Sa = Qof){(a,0)} s = b Qon{(b,o)} 52 To = QO()(a,b)X{l} Ta = QOI){(a,1)} Tb = Q0 fl{(b,l)} E: II {(x,0):(x,0,s)eSO and (x,l,s)eTo} * = .. S SO W The set 8* is the set of interior extrema from S0 which do not have the same x coordinate as the interior extrema from To. Case I.--We first consider the special case when n f. 2 and SOUSaUSb is empty. If n = 1 then MES(XXI) must contain at least one point since every approximation has at least one extremum. If n = 2 the set MES(XXI) must con- tain at least two points because Lemma 3.4 implies that there exists a polynomial satisfying Condition A with re- spect to any single point. 
Therefore ITol + lTaI + ITbI 1 n and using Corollary 1.21 the difference between two best approximations is a nontrivial polynomial P(x,a), which is zero at the points of MES(XXI). Thus DP(x,o) has at least n zeros which implies that DP(x,d) a 0 using property (b) of the base functions. Thus the best approximations are unique except for an additive constant. 53 Case II.--The remaining case is when n > 2 or SOLisaLISb is not empty. We begin by showing that 2 ITOI + IT + lTbl + |S*| + lsal + lsbl = m > n. <1) a | We shall show that if (1) is not satisfied then there exists a polynomial satisfying Condition A with respect to MES(XXI). Assume that m 1_n and let S'CIX ~ {xeX:(x,k,s)eMES(Xxl)} such that IS'I = n—m. Also define U) ll {x:(x,k)eTOU S*USaUSb}U S' Tl {x:(x,k)eTO} T" = {X:(x,k)sTaU Tb}. Since all of the sets used in the definition of S, T' and T" are pairwise disjoint, ISI ITOI + |s*| + lsal + Isbl + ls'l lT'l |T| IT"! = IT 0 + IT 3' al and it follows that the sets S, T' and T" satisfy the hypothesis of Lemma 3.5. This implies that IM(Q) is non- singular (the Q is from Lemma 3.5). Since QOCLQ, the rank of IM(QO) is equal to IQOI. Thus there exists a polynomial satisfying Condition A with respect to MES(XXI) and the truth of line (1) is verified by this contradiction. 54 Every best approximation must satisfy Corollary 1.21. Then if there is more than one best approximation, the difference between two of them is a nontrivial polynomial P(x,a) which is zero at every point of MES(XXI), or equivalently on Q0. Moreover, since the hypothesis of Theorem 3.6 are satisfied for any (xo,k,s) where x0 is interior to [a,b], it follows that DP(x,a) must be zero at all points (x,k)eS*LlTO. Thus P(x,a) has double zeros at the points of T0 and single zeros at each point of TatJTb, which means that DP(x,o) has at least 2|TO| + ITaI + ITbI zeros. (See Definition 3.3.) The polymonial DP(x,o) also has zeros at the points of 8*. Then because of the manner in which 8* was defined we may conclude that DP(x,o) must have at least m' = 2|TO| + lTal + lTbl + |S*| zeros. From line (1) we see that m' + ISaI + ISbI > n, and since |sa| + Is 1 2 it follows that m' > n-2. This contra- b I diets property (b) of the polynomials under consideration and implies that DP(x,o) s 0. Therefore, the first derivatives of any two best approximations are equal and the best approximations may differ by a constant. If SOLISaLJSb is not empty, P(x,a) (which is a constant) must have at least one zero) and hence P(x,a) e 0 and the best approximation is unique. Q. E. D. 55 In the next theorem, a class of polynomials is described which has the same three properties that were used to prove uniqueness in Theorem 3.7. Since these three properties are the only properties of the base functions that were used in the proof, Theorem 3.7 and its proof are also valid for this class of base functions. 3.8 Theorem. Let g(x) satisfy the following conditions: (a) g(x) is strictly monotone on [a,b]. (b) Dg(x) has no zeros on [a,b] (c) D2g(x) exists on [a,b] Then letting [g(x)lj"l n P(x,a) = z aJ J=l it follows that DP(x,a) can have at most n-2 zeros on [a,b], where zeros of multiplicity 2 or more are counted as two zeros. Proof: Assume that DP(x,o) has k zeros of multiplicity 1 denoted by'{xi}il:_l and m zeros of multiplicity 2 or more k+m denoted by {xi}i=k+l , where k + 2m > n - 2. 
Then n DP(xi,a) = Dg(x1) z a = _ «1‘2 0 3:2 J(J l)[g(xi)] for i = l,2,...,k+m, and n o = D2P = [Ds(xi)]2323aj(J-l)(J-2)[g(x1)lj'3 2 n 3-2 + D g(X ) 2 a (J—l)[s(x )1 for i = k+l,...,k+m. 56 Since Dg(x) is nonzero on [a,b] it follows that n J-2 for y1 = g(xi) , i = 1,...,k+m, and <> n < 33 0 = Q y 5 Z a J-l)(J-2)(y ) ‘ for y1 = g(xi) , i = k+l,...,k+m. Moreover, g(x) is strictly monotone on [a,b], and hence the numbers {yi}§:T are all distinct. Thus Q(Y), which is a polynomial in y of degree n—2, has at least n-l zeros. This is a contradiction and completes the proof. 3.9 Corollary. If g(x) satisfies the hypothesis of Theorem 3.8, Theorem 3.7 is true for the base functions _ i-l _ ¢i(X) - [g(x)] for i - 1,2,...,n EEQQL‘ These base functions obviously satisfy properties (a) and (0) listed at the beginning of this section. More- over, the proof of Theorem 3.7 did not consider zeros of any derivative higher than the second and hence these base functions also satisfy property (b) as it was used in the proof. Since properties (a),(b),and (c) are the only properties of the base functions used in the proof of Theorem 3.7 and the lemmas leading up to it, the proof of this corollary is complete. 57 3. The Trigonometric Polynomials In this section on trigonometric polynomials, X will be the half open interval [0,2n) where the topology of X is the ordinary topology on [0,2n] with the points 0 and Zn being identified with each other. The polynomials will be represented by sin(ix). "MW P(x,a) = a1 + a2icos(ix) + a 1 2i+1 i P(x,a) will be called a polynomial of degree k if Ia2k| + |a2k+1| > 0. It is well known that a trigonometric polynomial of degree k > 0 has at most 2k zeros counting multiplicity, on the interval X. Thus one can easily show: 3.10 Lemma. Given sets S and T such that TCSCX and |S| + |T| = 2k+l, the interpolation matrix IM(Q) of Q = Sx{0}lJTX{1} in n 2k+l base functions is nonsingular. 3.11 Theorem. Let X [0,2n), I = {0,1}, and let Dwo(x), le(x), and D2f(x) exist on X; then if the approximating polynomials are the trigonometric poly- nomials of degree k, one of the following is true: (a) The best approximation in unique. (b) The best approximation is unique except for an additive constant and if P(x,a) is any best approximation, then DP(x,B) is the unique best approximation to Df(x) with weight function wl(x). 58 The proof of this theorem is almost identical to that of Case 11 of Theorem 3.7. Using the notation intro- are empty (and duced there, the sets Sa’ S Ta’ and T b’ b hence not used) since X has no end points. The two differences between these two proofs are as follows: (1) Lemma 3.10 is used here instead of Lemma 3.5. (2) DP(x,o) in n base functions can have at most n-l = 2k zeros for trigonometric polynomials of degree k instead of the n-2 zeros as assumed in property (b) of section (2). 4. A Special Class of Polynomials This section will deal with the special case where i+k for i = 1,2,...,n where k 1 0 is a fixed ¢1(X) = x integer. Let I = {0,1} and X = [a,b] where 0 < a < b. (Results similar to those obtained in this section hold if a < b < O.) The functions D2f(x), Dw0(x) and le(x) will be assumed to exist on [a,b]. It should be noted that both P(x,a) and DP(x,a) in these base functions can have at most n-l zeros in the interval [a,b]. This is because P(x,a) is a polynomial of degree n+k with k+l zeros at x O and DP(x,a) is a polynomial of degree n+k-1 with k zeros at x = 0. From this property of the base functions under consideration we have the following lemma. 
59 3.12 Lom_m_§_. Given sets S and T' such that T'cSc[a,b] and a set T" which is empty or is equal to {b} where |s| + IT'U T"| = n, define T = T'U T" and Q = SX{0}UTX{1}. Then IM(Q) is nonsingular. 3322:: Assume that IM(Q) is singular. Then the problem of interpolating 0 at the points of Q has a nontrivial solution P(x,a). If |S| + IT'I = n, this polynomial P(x,a) has n zeros in [a,b] counting multiplicity. This implies that P(x,a) a 0 and hence IM(Q) must be nonsingular. If |s| + IT'I < n it follows that T" is not empty. Then applying Rolle's Theorem to P(x,a) on the whole real line we find that DP(x,o) has at least k zeros at x = 0 and at least n-l more zeros in (0,b). The zeros counted cannot include the zero at the point b since b is the furthest point from x = 0 of all points in [a,b]. Thus DP(x,a) has at least k+(n-1)+1 = k+n zeros which means that DP(x,o) 2 0. Moreover, P(x,a) has a zero at x = 0 so P(x,a) E 0 and IM(Q) must also be nonsingular in this case. Q. E. D. 3.13 Lemma. Let t t .,tm be m distinct points in 1, 2,0. (a,b). Suppose D3g(x) exists on [a,b] and g(a) = g(b) = 0, and Dg(ti) = D2g(ti) = 0 for i = l,2,...,m. Then there exists at least 2m+l zeros of Dg(x) in (a,b) counting multiplicity. 6O Epoog: If Dg(ti) = D2g(ti) = D3g(ti) = 0 for some i, the conclusion is true. Therefore, assume that D3g(t1) # 0 for i = l,2,...,m. Then the points ti for i = 1,2,...,m are relative maximum or minimum points for Dg(x). If the 2m zeros which we assume for Dg(x) are the only ones that Dg(x) has in (a,b), it follows that Dg(x) is either nonnegative or nonpositive on [a,b]. Hence g(x) is a monotone function on [a,b] and g(x) s 0. Here we have a contradiction and the proof is complete. 3.14 Theorem. Under the assumptions of this section and the assumption that wl(a) = 0, the best approximation is always unique. The assumption that wl(a) = 0 implies that the points (a,l,1) and (a,1,—l) cannot be in the set MES(XXI). This restriction of the points of MES(XXI) is necessary in order that the method of proof used in Theorem 3.7 may also be used here. The proof of this theorem is almost identical to that of Case II of Theorem 3.7. Using the notation intro— duced there, the set Ta is empty (and hence not used) since w1(a) = 0. The three differences are as follows: (1) Lemma 3.12 is used here instead of Lemma 3.5. (2) In counting the zeros that DP(x,o) must have we use Lemma 3.13. (3) DP(x,o) in n base functions can have at most n-l zeros for the polynomials under 61 consideration rather than the n-2 zeros as assumed in property (b) of section 2. Finally, since there is no constant among the base functions, two best approximations cannot differ by a constant. Thus DP(x,o) 5 0 implies that P(x,a) s 0. CHAPTER IV COMPUTATIONAL ALGORITHMS FOR A BEST APPROXIMATION 1. Introduction The most basic tool in the computation of uniform approximations is linear programming, which can be used to find best approximations on finite point sets. The approximation problem under consideration may be stated as a linear programming problem as follows: find a point a = (al,a2,...,an) in En such that n Iwk(x)Dk[f(x) — z ai¢i(x)]l : e for (x,k)eXXI (1) i=1 where e is to be a minimum. This is equivalent to the problem: maximize -e subject to the constraints wk(x)Dkf(x) 1 -e + wk(x)DkP(x,a) for (x,k)eXXI. -wk(x)Dkf(x) 1 —e — wk(x)DkP(x,d) Since (1) implies that e is nonnegative, the quantity -e is bounded above. 
In Chapter II the de la Vallée Poussin Algorithm for finding a best approximation was given. In this algorithm each term of the sequence of polynomials which converges to a best approximation is a best approximation on some finite point set. Therefore, linear programming may be used to generate this sequence. This algorithm is acceptable if we do not demand an approximation which is very close to a best approximation. If a high degree of accuracy is desired, this algorithm is applicable but impractical because of the large number of constraints that must be included in the linear program. To eliminate the need for solving linear programming problems with large numbers of constraints, an algorithm based on the characterization of best approximations will be developed.

In the case of the uniform approximation of a function by polynomials over a Chebyshev set of base functions, the Remez Algorithm is a very efficient method for generating a sequence of polynomials which converges to a best approximation. At each step of this algorithm a best approximation is computed on n+1 points. This particular linear programming problem can be reduced to solving a system of n+1 simultaneous linear equations in n+1 unknowns. Thus at no time in the process does one need to solve a linear programming problem involving a very large number of constraints.

In the more general problem to be considered here, which includes the weighted approximation of a function and its derivatives by polynomials in a general set of base functions, the direct extension of the Remez Algorithm does not necessarily converge to a best approximation. Moreover, the general n+1 point problem is not easily converted into a system of simultaneous linear equations except in certain special cases (see Moursund and Stroud [18]). Hence we will rely heavily upon the standard linear programming algorithm, and develop an algorithm which uses the fact that the deviation e(T) of a best approximation on a compact set T ⊂ X×I is, under suitable conditions, a continuous function of T. In what follows, d(T,T′) denotes the distance between the compact subsets T and T′ of X×I.

4.1 Definition. The sequence {T_m} of compact subsets of X×I converges to T_0 if for any ε > 0 there exists an integer N = N(ε) such that m ≥ N implies that d(T_m,T_0) < ε.

4.2 Lemma. Let the sequence {T_m} converge to T_0 and let S be a compact subset of E_n; then for any ε > 0 there exists an integer N = N(ε) such that m ≥ N implies

    |M_{T_m}[f(x) - P(x,α)] - M_{T_0}[f(x) - P(x,α)]| < ε    for all α ∈ S.

Proof: The function w_k(x) D^k[f(x) - P(x,α)] is a continuous function of x, k, and α on X×I×E_n, and hence uniformly continuous on the compact set X×I×S. Thus given any ε > 0 there exists a δ(ε) such that |x_1 - x_2| + |k_1 - k_2| < δ implies that

    |w_{k_1}(x_1) D^{k_1}[f(x_1) - P(x_1,α)] - w_{k_2}(x_2) D^{k_2}[f(x_2) - P(x_2,α)]| < ε

for (x_1,k_1) and (x_2,k_2) in X×I and for all α ∈ S. Since {T_m} converges to T_0 there exists an integer N = N(δ) such that m ≥ N implies that d(T_m,T_0) < δ. Therefore it follows that

    |M_{T_m}[f(x) - P(x,α)] - M_{T_0}[f(x) - P(x,α)]| < ε    for m ≥ N and α ∈ S.  Q. E. D.

4.3 Definition. Let T ⊂ X×I be a set of m points; then Nb(T,δ) denotes the set of all m-point subsets Q of X×I whose points can be put into one-to-one correspondence with the points of T in such a way that corresponding points are less than δ apart.

4.4 Lemma. Let T be a set of n points for which det[WIM(T)] ≠ 0; then there exists a δ > 0 such that Q ∈ Nb(T,δ) implies det[WIM(Q)] ≠ 0.

Proof: Since a determinant is a continuous function of its elements, and since all of the elements of WIM(Q) are continuous on X×I, the quantity det[WIM(Q)] is a continuous function of the points of Q. Then if det[WIM(T)] is nonzero, it must be nonzero in some neighborhood of T. Q. E. D.
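If d(T,T′) is taken to be the ordinary Hausdorff distance between compact subsets of X×I — an assumption, since the thesis's own definition of d is not reproduced above — then Definition 4.1 can be illustrated on finite point sets as in the sketch below, where the distance between two points of X×I is |x_1 - x_2| + |k_1 - k_2|, the modulus used in Lemma 4.2.

    # Hausdorff distance between finite subsets of X×I (illustrative assumption).
    import math

    def hausdorff(T, T2):
        """T, T2: lists of (x, k) pairs."""
        def d(p, q):
            return abs(p[0] - q[0]) + abs(p[1] - q[1])
        forward = max(min(d(p, q) for q in T2) for p in T)
        backward = max(min(d(p, q) for q in T) for p in T2)
        return max(forward, backward)

    # The sets T_i of Example 4.9 below converge to T_0: d(T_i, T_0) = 1/i.
    T0 = [(1, 0), (0, 0), (-1, 0), (1/math.sqrt(3), 1), (-1/math.sqrt(3), 1)]
    Ti = lambda i: [(1 - 1/i, 0), (0, 0), (-1, 0), (1/math.sqrt(3), 1), (-1/math.sqrt(3), 1)]
    print([round(hausdorff(Ti(i), T0), 4) for i in (5, 10, 100)])   # 0.2, 0.1, 0.01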
4.5 Theorem. Let {T_m} be a sequence of compact subsets of X×I which converges to a compact set T_0 ⊂ X×I. If T_0 contains a set of n points T′ such that det[WIM(T′)] ≠ 0, then lim_{m→∞} e(T_m) = e(T_0).

Proof: We know that

    e(T_m) - e(T_0) ≤ M_{T_m}[f(x) - P(x,α_0)] - e(T_0),

where α_0 ∈ R(T_0), the set of best approximations to f(x) on the set T_0. Then Lemma 4.2 implies that for any ε > 0 there exists an integer N = N(ε) such that m ≥ N implies

    M_{T_m}[f(x) - P(x,α_0)] - M_{T_0}[f(x) - P(x,α_0)] < ε.

Thus for m ≥ N we have e(T_m) - e(T_0) < ε.

The set T_0 contains a set T′ of n points such that det[WIM(T′)] ≠ 0. Thus by Lemma 4.4 there exists a δ_0 > 0 such that Q ∈ Nb(T′,δ_0) implies that det[WIM(Q)] ≠ 0. Since {T_m} converges to T_0 ⊃ T′, there exists an integer N_1 = N_1(δ_0) such that m ≥ N_1 implies that there exists a Q_m ∈ Nb(T′,δ_0) such that Q_m ⊂ T_m. Then Lemma 1.11 implies that

    M_{Q_m}[P(x,α_m)] ≤ M_{T_m}[P(x,α_m)] ≤ 2M_{T_m}[f(x)] ≤ 2M[f(x)]        (2)

where α_m ∈ R(T_m), the set of best approximations on the set T_m. Let B_{ij}[WIM(Q)] denote the cofactor of the entry in the ith row and jth column of WIM(Q), and define

    B = max |B_{ij}[WIM(Q)]|,        E = min |det[WIM(Q)]|,        (3)

where the maximum is taken over Q ∈ Nb(T′,δ_0) and i,j = 1,2,...,n, and the minimum over Q ∈ Nb(T′,δ_0). Since the coefficients of a polynomial P(x,α) are uniquely determined by the values it has on any point set Q = {(x_i,k_i)}_{i=1}^{n} in Nb(T′,δ_0), these coefficients are given by

    a_j = [Σ_{i=1}^{n} B_{ij}[WIM(Q)] w_{k_i}(x_i) D^{k_i} P(x_i,α)] / det[WIM(Q)],    j = 1,2,...,n.

Then if α_m ∈ R(T_m) for m ≥ N_1 it follows from lines (2) and (3) that

    |a_{mj}| ≤ 2nB·M[f(x)]/E,    j = 1,2,...,n,

where α_m = (a_{m1},a_{m2},...,a_{mn}). Let S be the set of all α ∈ E_n for which ||α|| ≤ 2nB·M[f(x)]/E. It follows then from Lemma 4.2 that for any ε > 0 there exists an integer N_2 = N_2(ε) such that m ≥ N_2 implies

    M_{T_0}[f(x) - P(x,α)] - M_{T_m}[f(x) - P(x,α)] < ε    for all α ∈ S

(which includes all α_m for m > N_1). Thus

    e(T_0) - e(T_m) ≤ M_{T_0}[f(x) - P(x,α_m)] - M_{T_m}[f(x) - P(x,α_m)] < ε

for m ≥ N_2. Q. E. D.

4.6 Corollary. Under the hypotheses of Theorem 4.5 there exists an integer N such that R(T_m) is contained in some compact set S for all m ≥ N.

4.7 Corollary. Let {T_m} converge to a set T′_0. Then lim_{m→∞} e(T_m) ≤ e(T′_0) if the limit exists.

This corollary follows directly from the first part of the proof of Theorem 4.5.

4.8 Corollary. Suppose T′ is a set of n points for which det[WIM(T′)] ≠ 0. If T′ ⊂ T_m for m = 1,2,..., and if lim_{m→∞} T_m = T_0, then lim_{m→∞} e(T_m) = e(T_0).

The following is an example where e(T) is not a continuous function of T.

4.9 Example. Let X = {-1, -1/√3, 0, 1/√3} ∪ [4/5,1], n = 4, I = {0,1}, φ_i(x) = x^{i-1} for i = 1,2,3,4, and w_0(x) ≡ w_1(x) ≡ 1. The function f(x) is defined by the following table, where k is to be determined later.

        x          f(x)    Df(x)
       -1           -1       0
       -1/√3         0      -1
        0            1       0
        1/√3         0       1
       [4/5,1]       k       0

Also let T_0 = {(1,0),(0,0),(-1,0),(1/√3,1),(-1/√3,1)} and T_i = {(1-1/i,0),(0,0),(-1,0),(1/√3,1),(-1/√3,1)} for i ≥ 5. Then {T_i} converges to T_0. The value of e(T_i) on any set T_i for i ≥ 5 can be determined from the following system of equations, where we let t_i = 1 - 1/i:

    [ 1   t_i   t_i²    t_i³    s_1 ] [ a_1 ]   [  k ]
    [ 1   -1    1       -1      s_2 ] [ a_2 ]   [ -1 ]
    [ 1    0    0        0      s_3 ] [ a_3 ] = [  1 ]
    [ 0    1    2/√3     1      s_4 ] [ a_4 ]   [  1 ]
    [ 0    1   -2/√3     1      s_5 ] [ e_i ]   [ -1 ]

In this system of equations the quantities s_j for j = 1,...,5 are to be determined to minimize the absolute value of e_i, where each s_j may be chosen to be 1, -1, or 0. Using Cramer's rule,

    e_i = (kB_1 - B_2 + B_3 + B_4 - B_5) / (Σ_{j=1}^{5} s_j B_j),

where B_j is the cofactor of s_j, and it is noted that B_j ≠ 0 for j = 2,3,4,5. Then if we choose s_j = sgn(B_j) it follows that e(T_i) = |e_i| will be a minimum. Moreover, B_1 = 0, and hence e(T_i) is independent of k. Therefore we may conclude that e(T_i) ≤ 1 for all i ≥ 5 and any k.
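The assertions that B_1 = 0 and that e(T_i) ≤ 1 are easy to check numerically. The sketch below is only an illustration (numpy and the helper name are not part of the thesis): it forms the 5×5 coefficient matrix above, computes the cofactors B_j of the column containing the s_j, and evaluates e_i for the choice s_j = sgn(B_j).

    # Numerical check for Example 4.9: B_1 = 0, so e(T_i) does not depend on k.
    import numpy as np

    def cofactors_of_last_column(A):
        """Cofactors of the entries in the last column of the square matrix A."""
        n = A.shape[0]
        return np.array([(-1) ** (row + n - 1)
                         * np.linalg.det(np.delete(np.delete(A, row, 0), n - 1, 1))
                         for row in range(n)])

    i = 5
    t = 1 - 1 / i
    r3 = np.sqrt(3)
    # columns: a_1, a_2, a_3, a_4, and a placeholder column standing for the s_j
    A = np.array([[1, t, t**2, t**3, 0],
                  [1, -1, 1, -1, 0],
                  [1, 0, 0, 0, 0],
                  [0, 1, 2 / r3, 1, 0],
                  [0, 1, -2 / r3, 1, 0]], dtype=float)
    B = cofactors_of_last_column(A)
    rhs = np.array([0.0, -1, 1, 1, -1])        # the k-term drops out because B_1 = 0
    e_i = (B @ rhs) / np.sum(np.sign(B) * B)   # the choice s_j = sgn(B_j)
    print(B[0], abs(e_i))                      # B_1 is (numerically) zero and |e_i| <= 1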
Next consider the approximation problem on the set T_0. Since the polynomials do not satisfy the requirement that M_{T_0}[P(x,α)] = 0 implies ||α|| = 0, we may reduce the number of base functions by at least 1. Since x = x³ at the points of T_0, we will use the base functions {1, x, x²}. Consider the subset T′ = {(1,0),(-1,0),(1/√3,1),(-1/√3,1)} of T_0. Then e(T_0) ≥ |e′| where e′ is given by

    [ 1    1    1      s_1 ] [ a_1 ]   [  k ]
    [ 1   -1    1      s_2 ] [ a_2 ]   [ -1 ]
    [ 0    1    2/√3   s_3 ] [ a_3 ] = [  1 ]
    [ 0    1   -2/√3   s_4 ] [ e′  ]   [ -1 ]

Using Cramer's rule and the same notation and arguments as above, we have

    e′ = (kB_1 - B_2 + B_3 - B_4) / (Σ_{j=1}^{4} s_j B_j).

Since B_1 = 4/√3, we may make |e′|, and hence e(T_0), as large as we desire by choosing k properly. Thus e(T) is not continuous at the point set T_0.

3. Two Algorithms

In this section we shall discuss two algorithms which can be implemented on a computer, and which (theoretically) can be used to solve the interval version of the problem of this thesis. Each algorithm involves the determination of a sequence of point sets, and uses linear programming to calculate best approximations to f(x) on the finite point sets.

Since the continuity of e(T) depends on the presence of a set T′ of n points which has a weighted interpolation matrix of rank n, we must be able to determine such a set T′ for use in the computational algorithm. (Such a set T′ exists because of the restrictions on the base functions and Lemma 1.8.) In problems of computational interest such a set may be found quite easily. For example, if the base functions {φ_i(x)}_{i=1}^{n} are a Chebyshev set on [a,b], then any n points from [a,b]×{0} at which w_0(x) is nonzero will have a nonsingular weighted interpolation matrix. Also, from the continuity of det[WIM(T′)] in the points of T′, we know that if any nontrivial polynomial P(x,α) in the base functions {φ_i(x)}_{i=1}^{n} can have at most a finite number of zeros on X, then in any neighborhood of a set of n points from [a,b]×{0} at which w_0(x) is nonzero there will be a set T′ for which WIM(T′) is nonsingular. Moreover, it should be noted that such a set T′ is independent of the function f(x) being approximated, and hence if a number of functions are to be approximated on X×I with the same base functions and weight functions, a set T′ may be determined once and used for all functions being approximated. The convergence of the following algorithms depends on the existence and use of this point set T′.

Algorithm I

Let T′ be a set of n points such that det[WIM(T′)] ≠ 0, and let ε_0 ≥ 0 be some fixed real number. (In the theorems which follow we shall show that the use of ε_0 = 0 gives a convergent computational scheme in certain special problems. In actual computational practice, most problems fall into this class. The use of ε_0 > 0 gives a slightly different algorithm which will tend to be computationally slower than the ε_0 = 0 algorithm. However, we shall show that the algorithm converges in all cases if ε_0 > 0.) Then choose some finite point set T″_0 of at least n+1 points from X×I such that e(T″_0) > 0.
Next determine a best approximation P(x,ao) to f(x) on T"O. (This may be done using linear programming.) Let QO be a subset of the extremal point set C(T"O,oo) such that QO contains some minimal characterization set from MCS(T"O) and let T0 = {(x,k):(x,k,s)er}. (The set Q0 and the sets Qm for any integer m 1 1 given below are easily determined from the linear program by taking the points of C(T"O,ao) or C(Tnm’am) respectively which correspond to the slack variables which are set equal to zero to obtain a solution of the linear programming problem. The set QO or Qm will usually include all the points of c(T"o,oO) or C(T"m,dm) respectively.) Then let t = 0 and define the sequence of point sets {Tm} and the sequence of approximations {P(x,om)} recursively as follows (m = 0 initially). 0b- tain a point T'm = {(xm,km)} where ILk(x,om)l attains its maximum on XXI. Let t = T" = TmUT m m if t o ””1 TmUT' UT'i if t > o i=m-t 75 ll and find a best approximation P(x,am+l) t0 f(x) on T m+1 (again we may use linear programming). Let Qm+l be a sub- set of the extremal point set C(T ) such that Qm+l ) m+l’am+1 contains some minimal characterization set from MCS(T"m+l and let Tm+1 = ((x,k):(x,k,s)eQm+l}. Then if e(Tm+l increase t by l. ) _>_ e(Tm) + so set t = o and if e(Tm+l) < e(Tm) + co Algorithm II This algorithm differs from Algorithm I only in the determination of the set T'm which is defined as follows. For each approximation P(x,am) we define the set B = {(x,k)eXXIzlLk(x,om)| 1 e(Tm)} The set B consists of at most a countable number of closed intervals (allowing single points as degenerate intervals) from XXI. Then combine the intervals from B into larger sets J1,J2,.. 3 satisfies the following requirements. Let J* be any .,J so that this collection of sets {Ji} fixed element in{Ji}, then (a) only one of the following is true for all (x,k)eJ*:Lk(x,am) : e(Tm) or Lk(x,am) i - e(Tm), (b) {k:(x,k)eJ*} contains only one integer, (c) if (x,k) and (y,k) are any two points from J* such that x < y then there is no (z,k)eB such that x < z < y and Lk(x,a ) < - se(T ) where m '— m s = sgn[Lk(X,um)l, 76 (d) if J' is any element from {J1} which is adja- cent to J* then sgn[Lk(x',om)] = -sgn[Lk(x*,om)] where (x',k) J' and (x*,k)eJ*. Since the function Lk(x,cm) is continuous and e(Tm) > 0 the collection of sets {J1} consists of at most a finite number of subsets of XXI. Then from each set J1, pick out a point where lLk(x,am)l attains its maximum value on J1 and let T'm consist of all these points. (In computational practice the set T'm can usually be defined to be the set of all points (x,k) which are relative maxima of ILk(x,am)I over XXI for which ILk(x,om)| 1 e(Tm). The only difficulty that this could present is that T'm might contain an infinite number of points, or T'm might contain too many points for efficient computation of a best approximation on a finite point set.) 4. Convergence of the Algorithms The first convergence theorem says that for any approximation problem under consideration in this chapter, the polynomial P(x,am) as generated by either of the two algorithms with so > 0 will be close to a best approxi- mation if m is taken large enough. First we observe that the proof of the following lemma follows directly from the compactness of XXI. 77 4.10 Lemma. Let {(xm,k 1}” m m=1 be a sequence of points m—l from XXI and let Qm = U (Xi’ki)’ Then given any a > 0 ‘ i=1 there exists an integer N = N(s) such that m 1 N implies that d[(xm,km),Qm] < c. 4.11 Theorem. 
4. Convergence of the Algorithms

The first convergence theorem says that for any approximation problem under consideration in this chapter, the polynomial P(x,α_m) as generated by either of the two algorithms with ε_0 > 0 will be close to a best approximation if m is taken large enough. First we observe that the proof of the following lemma follows directly from the compactness of X×I.

4.10 Lemma. Let {(x_m,k_m)}_{m=1}^{∞} be a sequence of points from X×I and let Q_m = ∪_{i=1}^{m-1} (x_i,k_i). Then given any ε > 0 there exists an integer N = N(ε) such that m ≥ N implies that d[(x_m,k_m),Q_m] < ε.

4.11 Theorem. Let ε_0 > 0 for Algorithm I be fixed, and let T′ be a set of n points such that det[WIM(T′)] ≠ 0. Then given any ε > 0 and any finite point set T″_0 for which e(T″_0) > 0, there exists an integer N, depending on ε_0, ε, T′, and T″_0, such that if Algorithm I is followed starting with T″_0, it follows that

    M[f(x) - P(x,α_m)] - e ≤ ε    for m ≥ N,

and

    e - e(T_m) ≤ ε    for m ≥ N.

Proof: The sequence {e(T_m)} is monotone increasing and bounded above, and hence converges to some real number e′ ≤ e. Let m_0 be some integer for which e′ - e(T_{m_0}) < ε_0. Then it follows that T′ ⊂ T″_m ⊂ T″_{m+1} for all m > m_0. Since T′ ⊂ T″_m for m > m_0, Corollary 1.15 implies that there exists a compact set S ⊂ E_n such that R(T_m) ⊂ S for all m > m_0. Thus the weighted error functions L_k(x,α), defined for all k ∈ I, are uniformly continuous in α and x for α ∈ S and x ∈ X. Therefore, given any ε > 0 there exists a δ such that |L_k(x_1,α) - L_k(x_2,α)| < ε for all α ∈ S, k ∈ I, and x_1,x_2 ∈ X satisfying |x_1 - x_2| < δ. Then from Lemma 4.10 there exists an integer N = N(δ) such that d[(x_m,k_m),T″_m] < δ for m ≥ N. Thus for some (x′,k′) ∈ T″_m we have

    |L_{k_m}(x_m,α_m)| - |L_{k′}(x′,α_m)| ≤ |L_{k_m}(x_m,α_m) - L_{k′}(x′,α_m)| < ε.

Since |L_{k′}(x′,α_m)| ≤ e(T_m) ≤ e and e ≤ M[f(x) - P(x,α_m)] = |L_{k_m}(x_m,α_m)|, we have M[f(x) - P(x,α_m)] - e < ε and e - e(T_m) < ε for all m ≥ N. Q. E. D.

The above theorem and proof are easily extended to Algorithm II without any significant changes.

The second convergence theorem is one which is applicable in certain special cases. It states that for certain problems the two algorithms will converge to a best approximation with ε_0 = 0 if we start with a point set T″_0 for which e(T″_0) is close enough to e. We shall first prove three lemmas to simplify the proof of the theorem.

4.12 Lemma. Let P(x,α) be a best approximation to f(x) on T ⊂ X×I, where MCS(T) contains a set of n+1 points. If |L_{k_0}(x_0,α)| > e(T) for some (x_0,k_0) ∈ X×I, then e(T ∪ {(x_0,k_0)}) > e(T).

Proof: Assume that e(T ∪ {(x_0,k_0)}) = e(T). Let α′ be any best approximation in R(T ∪ {(x_0,k_0)}); then α′ is also in R(T). Theorem 1.41 implies that R(T) contains one unique best approximation, and thus that α = α′. This contradicts the assumption that |L_{k_0}(x_0,α)| > e(T). Q. E. D.

4.13 Lemma. Let {f_m(x)}_{m=1}^{∞} be a uniformly convergent sequence of continuous functions on a compact set X which converges to f(x), and for which there exists a sequence {x_m}_{m=1}^{∞} converging to x_0 and satisfying |f_m(x_m)| = max_{x∈X} |f_m(x)|. Then max_{x∈X} |f(x)| = |f(x_0)|.

Proof: For each m and any x ∈ X, |f_m(x)| ≤ |f_m(x_m)|. Then using the triangle inequality we have

    |f_m(x)| ≤ |f_m(x_m)| ≤ |f(x_0)| + |f_m(x_m) - f_m(x_0)| + |f_m(x_0) - f(x_0)|

for any x ∈ X. Then taking the limit as m → ∞ we have |f(x)| ≤ |f(x_0)| for all x ∈ X. (In taking this limit we must use the fact that a uniformly convergent sequence of continuous functions is equicontinuous. See Goffman [11], p. 106.) Q. E. D.

4.14 Lemma. Let lim_{m→∞} T_m = T_0, lim_{m→∞} e(T_m) = e(T_0), and lim_{m→∞} β_m = β_0, where β_m ∈ R(T_m) for all m. Then β_0 ∈ R(T_0).

Proof: Since e(T_m) = M_{T_m}[f(x) - P(x,β_m)] for all m, it follows that
Then there exists a con- stant B < e such that if T"O is any finite subset of XXI for which e(T"o) > B and if either Algorithm I or Algorithm II is applied with so = 0 beginning with the set T"O, it follows that each approximation P(x,am) will be the unique best approximation on T" (and hence also on Tm) and m {P(x,am)} will converge to the best approximation on XXI. Proof: First we shall show that there exists a constant B < e such that if T is,a finite subset of XXI and e(T) > B, then every set in MCS(T) contains n+1 points. If this were not so, then for each integer m i 1 there exists a set of km < n+1 points TIn such that e(Tm) i e - l/m. Then {Tm} has a convergent subsequence, also denoted {Tm}, converging to a set To' But Corollary 4.7 implies that film e(Tm) i e(TO m-Hao and hence that e = e(TO) since 21m e(Tm) = e. Since e(TO) = m+co it follows that RCLR(TO) and since the sets in MCS(XXI) con- tain n+1 points, the set R contains one unique best approxi- mation 0:. Thus MES(TO) C;C(To,a) C:C(a) = MES(XXI) ) e 81 and hence any MCS in MCS(TO) is also in MCS(XXI). But the sets in MCS(TO) can consist of at most n points since TO contains at most n points. This contradicts the assumption that the sets from MCS(XXI) all contain n+1 points. The set T' from the algorithms need not be con— sidered in this proof since so = O and hence t = 0 through- out the algorithm. Assume that e(T"O) > B. Since e(Tm) : e(T"O) for all m, the sets in MCS(T"O) all con- sist of n+1 points and hence by Theorem 1.41 each am is the unique best approximation on the set Tm for m 1 O. From the three sequences {P(x,am)}, {Tm}, and {T'm} which are generated by the algorithm, choose three corre- sponding subsequences also denoted {P(x,am)} {Tm}, and {T'm} which converge to P(x,aw), Tm, and T'0° respectively and for which MTm[f(x) - P(x,am)] = e(Tm) and MT'm[f(x) = P(x,am)] = M[f(x) — P(x,am)]. (The convergent sequence {P(x,am)} exists by Corollary 4.6.) The sequence {e(Tm)} is monotone increasing and bounded above which means that Aim e(Tm) exists. Corollary 4.7 implies that e(Tw) :_e(T:;m> B for all m and hence that the set MCS(Tm) contains a MCS, denoted Q*, of n+1 points. Let T* = {(x,k):(x,k,s)eQ*}. Then Lemma 1.40 implies that the rank of WIM(T*) = WIM(Q*) has rank n. Thus T*<;T0° and. hence the hypotheses of Theorem 4.5 are satisfied which means that film e(Tm) = e(Tm). Then Lemma 4.14 implies m+°° that P(x,am) is the unique best approximation to f(x) on Tm. 82 Assume that P(x,am) is not a best approximation of f(x) on XXI. From Lemma 4.13 we know that T'0° contains an absolute maximum point (x',k') of |Lk(x,aw)| over XXI. Since P(x,am) is not a best approximation, ILk,(x',am)| > e(Tm) and hence by Lemma 4.12 it follows that e(TmLJT'm) > e(Tw). Moreover, 21m e(TmLJT'm) = e(Tm(/T'ml by Theorem 4.5 since T*<2T0° and the $;:k of WIM(T*) is n. Thus for m large enough, e(TmcIT'm) > e(Tm) which is im- possible. Thus e(T'mLJTm) = e(Tw) = e and P(x,am) is the unique best approximation to f(x) on XXI. Thus the original sequence P(x,am) generated by the algorithm converges to P(x,am). In general we do not know if the hypotheses of the previous theorem are satisfied before solving the problem. However, in computational practice Algorithm II with s0 = O has proved to be an efficient method of computing best approximations even when e(T"O) is not close to e as re- quired by Theorem 4.14. 
This is true because for most problems of computational interest the quantity e(T"m) be- comes larger than the number B from Theorem 4.5 for quite small m and hence the theorem may be applicable from this point on. In problems where MCS(XXI) contains a set of less than n+1 points or where some points of a set in MCS(XXI) are very close together, the convergence may be quite slow and a high degree of accuracy obtainable only with multiple precision arithmetic. 83 5. Computational Procedures In the actual computational use of Algorithm II using linear programming to find best approximations to f(x) on finite point sets, certain serious difficulties were encountered. The most serious difficulty was that of the extreme inaccuracy of the linear programming solution to certain approximation problems on finite point sets as computed by the simplex algorithm. It was also found that the number of eliminations of the simplex algorithm used in finding best approximations was quite large. However, there is a simple procedure which has been very successful in eliminating both of these diffi- culties without changing any of the theory behind the application of linear programming to the solution of the approximation problem under consideration. Instead of solving the system of inequalities of line (1) page 62 on a finite point set T"m while minimizing e, the following procedure is used. Using the notation of the algorithms and defining P(x,a_m) s 0, find a best approximation P(x,a*) to f(x) - P(x,am_l) on the set T"m and set am = am-l + a*. This means that the linear pro- gramming problem corresponding to line (1) page 62 is modified so that for a fixed am-l we find a point a* = (al,a2,...,an) in En such that Iwk(x)Dk[f(x) — P(x,a — P(x,a*)]| : e m-l) 84 for (x,k)sT"m where e is to be a minimum. Then am is * defined to be am_1 + a . The following example shows how this procedure affects the speed and accuracy of the simplex algorithm solution to specific linear programming problems. (The first method will be called Method I and the later modi- fication will be called Method II.) 4.16 Example. Let I = {0,1,2,3,4}, X = [-l,1], n = 9 1-1 x ¢1(x) = for 1 = l,2,...,9, f(x) = sin(x), wo(x) 5 1 Wl(x) = .l(1 — x2) W3(x) = .001(l - x10) W2(x) = .01(1 — x6) Wu(x) a .0001 The set T"O was chosen to be ten equally spaced points from each set XX{k} for ksI; thus T"O consists of 50 points from XXI. Using single precision arithmetic (ten decimal places of accuracy) on the CDC 3600, several terms of the sequence {P(x,am)} were generated using Algorithm II with s0 = O. The following table gives a comparison of Method I and Method II and indicates the advantages of Method II in both speed and accuracy. The columns marked "Eliminations" indicates the number of eliminations used in the simplex algorithm in the solution of the linear program. 85 Method I Method II e(T"i) Eliminations e(T"i) Eliminations i 7.05269xlo'8 36 7.05269x10"8 36 0 8.98299x10‘8 49 8.98362xlo'8 22 l 1.32349x10'23 59 8.98948x10‘8 28 2 Letting a2 and 0'2 denote the third approximation to f(x) as given by Method I and Method II respectively, it was found that M[f(x) - P(x,a2>] 3.53572xlo'5 8 M[f(x) - P(x,a'2)] 8.99017x10’ The third linear program which used Method I and required 59 eliminations has a computed deviation e(T"2) which is essentially zero and hence gives no indication concerning the true deviation of a best approximation on the set T"2. 
Moreover the set Q2 as determined by the linear program did not contain a set from MCS(T"2) and hence the algorithm could not be continued. (It should be noted that the set T"l was the same in both Method I and Method II. The set T"2 was not the same for both methods because the com- puted solutions from R(T"l) were not the same for both methods.) 86 There is one other difficulty which could occur in theory but which has not been observed in actual compu- tational practice. The sets T1 = {(x,k):(x,k,s)sQi} could have an increasing number of points as 1 becomes large and hence the whole purpose of solving the problem on a small number of points would be defeated. If this would ever become a problem, the set Ti could be replaced by a subset T* of no more than n+1 points by determining T* so that e(Ti) = e(T*). This can be done by taking various n+1 point subsets of Ti and calculating the deviation on these subsets. 6. Computational Examples Many approximations were computed on the CDC 3600 using single precision arithmetic (ten decimal places of accuracy) and Algorithm II with s0 = O. The following examples are given to illustrate some of the unusual things that can occur in the computation of best approxi- mations. 4.17 Example. This example is the computer solution to the approximation problem given in Example 3.1 which does not have a unique best approximation. It was found that the best approximation obtained by the algorithm depended on the initial point set T"O. In each case a best approxi- mation was determined in two steps of the algorithm. Thus P(x,al) is a best approximation in each case. This occurred because the set 87 {(x,k):(x,k,s)sMES(XXI)} was included in the set T"l and thus a best approximation could be computed exactly. The various sets T"O and best approximations P(x,al) are given in Table I. 4.18 Example. Let X = [-l,l], I = {0,1}, n = 5, 1—1 for 1 = 1,2,....5, wo(x) a 1, Wl(x) = .15(l — x5), ¢i(x) = x f(x) = cos(xL Df(x) = -sin(x). The initial set T"O was chosen to be 6 equally spaced points from XX{O} including the end points, and 6 equally spaced points from XX{l}, including the end points. In Table II the successive x values of the point sets T1 are given in the columns headed (l), (2),...,(6) for values of i = 0,1,...,8. The values which are followed by an asterisk are points (x,l) from XX{l} while those without an asterisk are from XX{O}. The column headed "Elim." indicates the number of eliminations that were needed to find a best approximation on the set T" using the simplex algorithm and Method II. The set i T contained one additional point, namely (0.,0), which 1 is not shown in the table. It should be noted that the extremal points in columns (4) and (5) are approaching each other as the algorithm proceeds. It has been observed that when this occurs, the rate of convergence of the algorithm is slower than for similar problems where the extremal points do not come to- gether. In situations where points come together it may become necessary to take sO to be positive in the algorithms. 8 8 mloaxmommzmm.a nloaxoaamamm.z oooooo.H *ommwow. *mzmoom. mmoooo. *mzmwow.l oooooo.HI o m mloaxmmowzmm.: mloaxommwzmm.: oooooo.H *ommwow. *mzmmon. mmooon. *mmmwom.l oooooo.Hl N m mloaxzmmommm.z mloaxmomwzmm.z oooooo.H *ommmow. *mmamow. mmoooo. «mmooow.l oooooo.al m o mloaxomzmmmm.: mIonNHmnzmm.: oooooo.H *ommmow. *zwmmow. oooooo. *mmooom.l oooooo.HI OH m muofixsoqmsmm.s muofixmommmmm.z oooooo.a *ommmfis. *qsmmos. ooocoo. 
*mmooos.u oooooo.au NH 2 mloaxwoommmm.: mIonoHHHomm.: oooooo.H mmfimaz. *wzwmow. oooooo. *mmwoom.l oooooo.HI ma m mloaxwzmonmo.m slonmomoomz.z oooooo.H obmqom. *mmwmmo. oooooo. *Nmmmmo.l 5mmmw2.I ma m macaxommmsow.m maoaxsmmms0m.s cooooo.a ooooow. *mmssmm. *mfimsum. *smssmo.u *mflasmm.u ma H mIonmmmommm.w mloaxmmmmmam.m oooooo.H ooooom. *ooooom. *ooooom. *oooooo.l *ooooom.l ma 0 flhfis.xoa I Axvco: Afi:eom goo Amv Ago Amo Amo AHV .sfiflm a HH mqmqfl AAHA.HV «AHaoOv nACa.:v nAOa.MV AAOA.NV «Ara.va AmNHMOo «mNMo noHIV AAHA.:V AAHnoHV «AHn.OV AAOAoMV «AOnnmv «AOA.HVW AN. amo| noHIV AAHnozv AAHA.NV nAHA.Ov «ADA-MV aAOa.va AmNH. «Q A.H|v 0:9 HO H mqm¢8 10. ll. 12. BIBLIOGRAPHY Dean N. Arden, The solution of linear programming GI problems, Mathematical Methods for Digital Computers, eds. Anthony Ralston and Herbert S. Wilf, Wiley, 1960, pp. 263-279. Achieser, Theory of Approximation, Ungar, New York, 1965. Buck, Linear spaces and approximation theory, On Numerical Approximation, ed. Rudolph Langer, Univ. of Wis. Press, 1959, pp. 11-23. . Buck, Survey of recent Russian literature on approxi- mation, On Numerical ApprOximation, ed. Rudolph Langer, Univ. of Wis. Press, 1959, pp. 341-359. Cheney, Introduction to Approximation Theory, McGraw-Hill, New York, 1966. Dantzig, Linear Programming and Extensions, Prince- ton University Press, Princeton, N. J., 1963. Davis, Interpolation and Approximation, Blaisdell, New York, 1963. Davis, Approximation theory in the first two decades of electronic computers, Approximation of Functions, ed. H. L. Garabedian, Elsevier, Amsterdam, 1965, pp. 152—163. Denman, Minimax polynomial approximation, Math of Comp., 20 (1966), pp. 257-265. Garabedian, Approximation of Functions, Elsevier, Amsterdam, 1965. _ . Goffman, Real Functions, Holt, Rinehart and Winston, 1963. Lorentz, Russian literature on approximation in 1958-1964, Approximation of Functions, ed. H. L. Garabedian, Elsevier, Amsterdam, 1965, pp. 191—215. 89 13. 14. 15. l6. 17. 18. 19. 20. 21. 22. 23. 24. 25. T. O. 90 G. Lorentz, Approximation of Functions, Holt, Rinehart and Winston, New York, 1966. J. Maehly, and Ch. Witzgall, Tchebysheff—Approxi- mation in Kleinen Intervallen I, Approximation durch Polynome, Numer. Math., 2 (1960), pp. 142- 150. Michael, Topologies on spaces of subsets, Trans. Amer. Math. Soc., 71 (1951), pp. 151-182. . G. Moursund, Optimal approximation of functions: Chebyshev-type approximations, Ph.D. Thesis, University of Wisconsin, 1963. G. Moursund, Chebyshev approximation of a function and its derivatives, Math. Comp., 18 (1964), pp. G. Moursund and A. H. Stroud, The best Ceby§ev approximation to a function and its derivative on n + 2 points, J. SIAM Numer. Anal., 2 (1965), pp. 15-23. G. Moursund, Some computational aspects of the uniform approximation of a function and its derivative, J. SIAM Numer. Anal., 2 (1965), pp. 464-472. G. Moursund, Chebyshev solution of n + 1 linear equations in n unknowns, J. ACM 12 (1965), pp. 383-387. P. Natanson, Constructive Theory of Functions, United States Atomic Energy Commission, Washington, D. C., 1961. . Ya. Remez, General Computational Methods of Chebyshev Approximation. The Problem with Linear and Real Parameters, United States Atomic Energy Commission, Washington, D. C., 1962. . R. Rice, The Approximation of Functions, Vol. I: Linear Theory, Addison-Wesley, Reading, Mass., 196E. J. Rivlin and H. S. Shapiro, A unified approach to certain problems of approximation and minimization, J. Soc. Indust. Appl. Math., 9 (1941), pp. 670-699. 
25. O. Shisha, The Chebyshev polynomial of best approximation to a given function on an interval, Math. of Comp., 20 (1966), pp. 266-271.
26. A. F. Timan, Simultaneous approximation of functions and their derivatives on the whole real axis, Amer. Math. Soc. Translations, Series 2, 44 (1965), pp. 1-11.

INDEX OF NOTATIONS

f(x); {φ_i(x)}_{i=1}^{n}; w_k(x); M_T[·], M[·]; α, β, γ; P(x,α); e(T); R(T); R; ||α||; T, T′, T_i, T_m; IM(Q), IM(T); WIM(Q), WIM(T); det; L_k(x,α); C(T,α); C(α); (x,k,s); Extrema, Extremum; Minimal Extremal Set; MES(T); MES; Condition A; MCS; Minimal Characterization Set; MCS(T); n-point; α_m; |S|; d(T,T′); Nb(T,δ).