THE UNIFORM APPROXIMATION OF A FUNCTION AND ITS DERIVATIVES BY POLYNOMIALS

Thesis for the Degree of Ph. D.
MICHIGAN STATE UNIVERSITY
FREDERICK JAMES SCHUURMANN
1967

This is to certify that the thesis entitled "The Uniform Approximation of a Function and its Derivatives by Polynomials" presented by Frederick James Schuurmann has been accepted towards fulfillment of the requirements for the Ph. D. degree in Mathematics.

D. Moursund, Major professor
Date: July 10, 1967

ABSTRACT

THE UNIFORM APPROXIMATION OF A FUNCTION AND ITS DERIVATIVES BY POLYNOMIALS

by Frederick James Schuurmann

Let X be a compact subset of the real line, and let I be a finite collection of nonnegative integers including 0. The function f(x) and the base functions {φ_i(x)}_{i=1}^n are assumed to be in C^q(X), while the weight functions w_k(x) for k ∈ I are continuous on X. The problem is to find real scalars a_1, a_2, ..., a_n which minimize

    max_{(x,k) ∈ X×I} | w_k(x) (d^k/dx^k)[ f(x) − Σ_{i=1}^n a_i φ_i(x) ] | .

The main objective is to provide an efficient method for the computation of a best approximation on a digital computer when X is a closed interval.

First the problem of the existence of a best approximation is discussed. Then the characterization of a best approximation is treated, both to provide a basis for the computational algorithm and to provide a method of determining when an approximation is a best approximation. Results are also obtained concerning the dimension of the space of best approximations.

Next, sufficient conditions are given for the uniform convergence of the polynomial and its derivatives to the function and its derivatives as the number of base functions increases. It was also found that, with the appropriate hypotheses, the approximation of a function and its first derivative on an interval by certain classes of base functions is unique.

Finally, two algorithms which use linear programming to find best approximations on finite point sets are presented, and convergence theorems are given. Several computational examples are also presented.

THE UNIFORM APPROXIMATION OF A FUNCTION AND ITS DERIVATIVES BY POLYNOMIALS

By
Frederick James Schuurmann

A THESIS
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Department of Mathematics
1967

ACKNOWLEDGMENTS

The author is deeply indebted to his major professor Dr. David G. Moursund for suggesting this thesis problem and for his patient and helpful guidance during the preparation of this thesis.

The author would also like to thank Michigan State University for making available the computing facilities used in the development of the algorithms and the calculation of the examples in this paper.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
INTRODUCTION

Chapter
I. EXISTENCE AND CHARACTERIZATION
   1. Introduction
   2. Existence
   3. Characterization
II. CONVERGENCE CONSIDERATIONS
   1. Uniform Convergence of the Polynomial to the Function
   2. Continuous Dependence of the Approximating Polynomial on the Function
   3. The de la Vallée Poussin Algorithm
III. UNIQUENESS RESULTS
   1. An Example of Nonuniqueness
   2. A General Uniqueness Theorem
   3. The Trigonometric Polynomials
   4. A Special Class of Polynomials
IV. COMPUTATIONAL ALGORITHMS FOR A BEST APPROXIMATION
   1. Introduction
   2. The Continuity of e(T)
   3. Two Algorithms
   4. Convergence of the Algorithms
   5. Computational Procedures
   6. Computational Examples
BIBLIOGRAPHY
INDEX OF NOTATIONS

INTRODUCTION

Many advances have been made in the study of approximation theory since the rise of the electronic computer. The computer has stimulated the study of approximation theory because approximations are necessary in the efficient handling of many problems, and the computer provides the means whereby approximations may be computed. Many problems which once were dismissed as impractical because of the difficulties of computation are now solvable using an electronic computer.

A recent bibliography of approximation theory, a survey of recent Russian literature in approximation theory, and a number of papers on approximation theory are given in Garabedian [10]. Another, earlier survey of the literature in approximation theory is given by Buck [3,4]. Recently, several new books have been published which deal exclusively with approximation theory. Among these are Rice [23] and Cheney [5]. Both of these books contain extensive bibliographies.

The problem considered in this paper is that of approximating a given function f by a polynomial in such a way as to make one or more of the derivatives of the approximating polynomial approximate the corresponding derivatives of the function f in some prescribed manner.

In the first chapter, the existence and characterization of best approximations are treated extensively. Next the uniform convergence of best approximations to f and the continuous dependence of best approximations on f are studied. A theoretical algorithm for the computation of a best approximation is also presented. Then several classes of problems are described which have unique best approximations. Finally the computational problems of finding a best approximation are considered. Two algorithms are given and convergence theorems presented. Several numerical examples are also given.

CHAPTER I

EXISTENCE AND CHARACTERIZATION

1. Introduction

Let X be a compact subset of the real line, and let I be a finite collection of nonnegative integers including 0. The function f(x) and the base functions φ_1(x), φ_2(x), ..., φ_n(x), for fixed n ≥ 1, are assumed to be in C^q(X), where q is the largest integer in I. For each k ∈ I let w_k(x) be a continuous weight function on X. (We show in Lemma 1.5 that there is no loss of generality in assuming that the weight functions are nonnegative.) The following standard notation will be used:

    D^k g(x) = (d^k/dx^k) g(x) ,    D = D^1 ,    D f(x_0) = df(x)/dx evaluated at x = x_0 .

1.1 Definition. Let T be a closed subset of X×I and let g ∈ C^q(X). Then define

    M_T[g(x)] = max_{(x,k) ∈ T} | w_k(x) D^k g(x) | .

If T = X×I we will write M[g(x)] for M_T[g(x)]. (A set T ⊂ X×I is called closed if the sets V_k = {x : (x,k) ∈ T} for k ∈ I are all closed subsets of X.)

1.2 Problem. Find real scalars a_1, a_2, ..., a_n such that

    M[ f(x) − Σ_{i=1}^n a_i φ_i(x) ]

is a minimum. A solution to this problem is called a best approximation to f(x) on X×I with weight functions {w_k(x)}.

As a notational convenience, points in E^n are represented by α = (a_1, a_2, ..., a_n), β = (b_1, b_2, ..., b_n), etc., while polynomials are represented by

    P(x,α) = Σ_{i=1}^n a_i φ_i(x) .

Also, for any closed set T ⊂ X×I, let

    e(T) = min_{α ∈ E^n} M_T[ f(x) − P(x,α) ] .

It is shown in Theorem 1.16 that this minimum exists. e(T) will be called the deviation of a best approximation to f(x) on the set T.
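Restricted to a finite point set T, the pseudonorm M_T and the objective of Problem 1.2 are straightforward to evaluate numerically; the computational algorithms of Chapter IV work with exactly such finite restrictions. The following is a minimal modern sketch, not part of the original development: it assumes Python with numpy, and the helper names, the choice f(x) = e^x, and the monomial base functions are illustrative assumptions only.

```python
import numpy as np

def weighted_deviation(a, points, f_derivs, phi_derivs, weights):
    """M_T[f - P(., a)] on a finite point set T = [(x, k), ...].

    f_derivs(x, k)   -> D^k f(x)                    (assumed user-supplied)
    phi_derivs(x, k) -> array of D^k phi_i(x), i = 1..n
    weights(x, k)    -> w_k(x)
    """
    dev = 0.0
    for (x, k) in points:
        residual = f_derivs(x, k) - np.dot(a, phi_derivs(x, k))
        dev = max(dev, abs(weights(x, k) * residual))
    return dev

# Illustration: f(x) = e^x, base functions x^(i-1), I = {0, 1}, w_0 = w_1 = 1.
n = 4
f_derivs = lambda x, k: np.exp(x)                       # D^k e^x = e^x
phi_derivs = lambda x, k: np.array(
    [np.prod(range(i - k + 1, i + 1)) * x**(i - k) if i >= k else 0.0
     for i in range(n)])                                # D^k x^i
weights = lambda x, k: 1.0
T = [(x, k) for x in np.linspace(-1.0, 1.0, 21) for k in (0, 1)]
a = np.array([1.0, 1.0, 0.5, 1.0 / 6.0])                # truncated Taylor coefficients
print(weighted_deviation(a, T, f_derivs, phi_derivs, weights))
```

A routine of this form supplies the inner evaluation step for the discrete algorithms discussed later.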
We define R(T) = {aeEn:MT[f(x) - P(x,a)] = e(T)} to be the set of all best approximations to f(x) on T. When T is XXI we will write R for R(T) and e for e(T). The norm to be used on the polynomial coefficients which are points of En is lldll = Maxlail i=l,2,...,n Throughout this paper the symbol T, with or without a subscript or superscript, will denote a compact subset of XXI. 1.3 Theorem. MTEg] is defined for all gqu(X) and has the properties: (a) 0 :_MT[g] < a (b) MTEg] = 0 if g s O (c) MTEtg] = ItIMTEg] for any real scalar t (d) MTEg + h] :_MT[g] + MTEh]. The first three prOperties are obvious. The fourth can be proved using the triangle inequality for real num- bers. Since we allow the weight functions to be zero, we cannot prove the converse of (b); hence MTEg] is a pseudo- norm. l.A Corollary. The set R(T) of best approximations is convex. Proof: If P(x,a) and P(x,8) are two best approximations and O i t i 1, then MTEf - tP(x.a) — (1 — t)P(x,B)] MT[t(f(x) - P(x,a)) + (l - t)(f(x) - P(x,8))] IA MT[t(f(X) - P(X,G))] + MT[(1 - t)(f(X) - P(X,B))] tMT[f(x) - P(X.a)] + (l - t)MT[f(x) - P(x,B)] = e(T) Observe that the inequality in the above relation is actually an equality for all t. If this were not true, some linear combination of P(x,a) and P(x,8) would be a better approximation than either P(x,a) or P(x,8). Therefore, any convex combination of two best approxi- mations is a best approximation. Q. E. D. 1.5 Lemma. Let {wk(x)} for keI be continuous weight functions and MlEg] = Max wk(x)Dkg(x) , M2[g] = Maxllwk(x)IDkg(x) (x,k)eXXI (x,k)eXXI Then P(x,a) is a best approximation to f(x) in the pseudo— norm Ml if and only if P(x,a) is a best approximation to f(x) in the pseudonorm M2. Proof: P(x,a) is a best approximation in M1[g] if it minimizes Max lwk(x)Dk[f(x) — P(x,a)]! . (1) (x,k)eXXI In the case of M2[g], P(x,a) must minimize Maxllwk(x)||Dk[f(x) - P(x,a)]| (x,k)eXXI . (2) It is evident that for a given a, the quantities (l) and (2) are identical. This implies that an a which minimizes one of them minimizes both, and completes the proof. Since the set R of best approximations does not change if the absolute value of the weight function is substituted for the weight function, we will henceforth assume that all weight functions are nonnegative. In order to study some properties of the base functions of the approximating polynomials, we introduce the following matrices. 1.6 Definition. The interpolation matrix of a set Q which is a set of m ordered pairs{(x1,ki)}i:ll or ordered triplesl{(xi,ki,si)}i:ll is the an matrix B whose entry in the ith row and Jth column is k b = D ie 13 J O for any particular problem under consideration. 1.8 m. If MT[P(x,c)] = 0 implies Hall = 0, then T contains a set Q of n points such that det[WIM(Q)] # 0. (Here det denotes the determinant.) Proof: Assume that the conclusion is false. Let T' be a set of m points from T for which B = WIM(T') has rank m < n, where m is the maximum rank of any weighted inter- polation matrix in points from T. Let (xo,ko) be any point from T and define T" = T'U{(xo,ko)}. Then the rank of B' = WIM(T") is m and the row bm+l of 8’ corresponding to (xo,ko) can be written as a linear combination of the rows of B. Hence in matrix notation we have bm+ = YB 1 where y is a row matrix giving the proper multiples of the rows of B which are needed to produce bm+1' Next consider the space of solutions to the problem But = O. This problem has a solution space of dimension n - m, and since n > m there exists a nonzero solution do. 
The weighted value of this polynomial P(x,ao) at (xo’ko) t _ t _ _ m+lao _ YBGO - 00 Hence MT[P(X’°O)] — O and Ilaoll f O which contradicts the hypothesis of this is given by b lemma. Q. E. D. 2. Existence 1.9 Theorem. G(a) a MT[f(x) - P(x,a)] is a continuous function for deEn. Proof: Let K = Max MT[¢i(x)]. If K = O, i=l,2,...,n n G(a) is constant for all aeE and hence is continuous. If K ¢ 0, let a0 be a point of En. Given any e‘>0 let 5 = e/nK. Then Ila - aoll < 5 implies |G(B) - G(ao)l = IMT[f(X) - P(X,B)] - MT[f(X) - P(X,ao)]| n :IMT[f(x) — P(x,B) - f(X) + P(x,ao)]| = MTtiz (ai — bi)¢i(x)] =l ~nK < Max Iai — bi Ila - 8| -nK < s. Q. E. D. ‘ i=1,2,...,n ° 10 1.10 Corollary. The set R(T) of best approximations is closed. Pgoof: If R(T) is empty, the result is trivial. Other- wise R(T) = {a:G(a) = e(T)} and hence R(T) = G-l[e(T)]. Since G(a) is continuous and e(T) is a closed subset of the real numbers, R(T) is closed. Q. E. D. 1.11 Lommo. If aeR(T) (the set of best approximations on the set T) then MT[P(x,a)] i 2-MT[f(x)]. Pooof: MT[P(x,a)] - MT[f(x)] i MT[f(x) — P(x,a)] i MT[f(x)] where the second inequality holds because P(x,a) a 0 cannot be a better approximation than any best approximation. The result follows from these inequalities. Q. E. D. 1.12 Definition. A family of polynomials {P(x,a)} for m i=1 or a set Q' = {(x ,k ,s )} m if there exists some constant k i i 1 i=1 ass is said to be bounded at a set of points Q = {(x1,k1)} 2 such that ID 1P(xi,a)|: z for all aeS. 1.13 @333. Let {P(x,a)} for aeSCEn be a family of poly- nomials in n base functions which is bounded by B' at the n points of T<2XXI. Then if det[WIM(T)] # 0, there exists a constant B such that IIaII: B for all ces. Pgoof: Since det[WIM(T)] ¢ 0, any polynomial in these n base functions is determined by the values of the poly- nomial at the points of T. Then if V(xi’ki)’ i = 1,2,...,n gives the values of a polynomial P(x,a) at the points of 11 T and L J is the cofactor of CiJ from the ith row and Jth 1 column of WIM(T), the coefficients of P(x,a) are n = [ Z L aJ i=1 13V(xi,ki)]/det[w1M(T)], J = l,2,...,n. Then if L = MaleiJl. E = Idet£w1M(T)j| and 13 ; |V(Xi,k1)| :B' for i = l,2,...,l’1 WC have [A laJI nLB'/E for all ces. Thus B = nLB'/E is a uniform bound on ||c|| for ces. Q. E. D. 1.1M Theorem. The set R(T) of best approximations on the set T is bounded if MT[P(x,a)] = 0 implies IIaII = 0. Proof: Lemma 1.8 implies that T contains a set Q = {(xi’k1)}I=1 such that det[W1M(Q)] # 0 if MT[P(x,o)] = 0 implies ||a|| = 0. Then letting W = Min Iwk (xi)| k=l,2,...,n 1 and using Lemma 1.11, ki ki 2MT[f(x)] 1 MT[P(x,a)] :||wki(xi)D P(xi,a)| i WID P(xi,a)| for i = 1,2,..,r1 and aeR(T). Since det[WIM(Q)] # 0 implies that w > o, it follows that IDkiP(xi,a)l : 2MT[f(x)]/W for (xi,k1)eQ and oeR(T). Hence Lemma 1.13 implies that R(T) is bounded. Q. E. D. n 1.15 Corollary. If Q = {(xi’ki)}i=1 is a set of points from XXI such that det[WIM(Q)] # 0, then there exists a constant B such that ||c|| < B for all aeR(TlL)Q) for any Tl CXXI . 12 Proof: Let T' = TlLIQ; then from the proof of Theorem 1.1A we have ki 2M[f(x)] 1 2MT'[f(x)] 1 WID P(xi,o)| for i = 1,2,...,n and any aeR(T'). This implies that k 2M[f(x)] _>_ ID 1P(xk,a)| for 1 = 1,2,...,n and anR(TlUQ). W TlCXXI Then Lemma 1.13 implies that there exists a constant B such that ||c|| < B for all aeR(T1U Q) for any TlCXXI. Q. E. D. 1.16 Theorem. The set R(T) of best approximations on the set T is nonempty. £31929 Case I.--Assume that MT[P(x,a)] = 0 implies that IIaII = 0. 
By Theorem 1.1“ the set R(T) is bounded. Then C(a) = MT[f(x) - P(x,a)] is a continuous function on a compact set and hence attains its minimum value. Thus a best approximation exists and R(T) is nonempty. Case II.--If MT[P(x,o)] = 0 does not imply that Ilall = 0, there is a nontrivial polynomial P(x,8) such that MT[P(x,B)] = o. This means that n wk(x)DkP(x,B) = wk(x) z b Dk¢i(x) = o for (x,k)eT i=1 1 where some b1 is nonzero. Therefore, the set of base 13 functions may be reduced in number by at least one without affecting the values that the weighted polynomial can attain on T. i.e., With this reduced set of base functions, we can approximate f(x) on T Just as closely as we could using the original set of base functions. This reduction in the number of base functions may be repeated until those remaining satisfy the assumption of Case I. Q. E. D. 3. Characterization Instead of having one error function as in the approxi- mation of a function by polynomials, we have an error function for each keI. 1.17 Definition. Lk(x,c) 2 wk(x)Dk[f(x) — P(x,a)]. The functions Lk(x,o), which are defined for all (X,k)cXXI and aeEn, are called weighted error functions. Throughout this section on characterization, T will be assumed to be a closed subset of XXI such that e(T) > 0. Such sets T exist because we are assuming that e = e(XXI) > O. 1.18 Definition. An ordered triple (xo,k,s)eXXIx{-1,l} is called an extremum with respect to the approximation P(x,a) to f(x) on the closed set TCZXXI if (xo,k)eT and Lk(xo,a) = sd where d = MT[f(x) - P(x,a)]. 1.19 Definition. If acEn and Tc:XXI is closed, let C(T,a) = {(x,k,s)eXXIX{—1,l}:(x,k)eT and Lk(x,a) = sd} where d = MT[f(x) - P(x,c)]. If T = XXI, let C(a) = C(T,a). 14 The set C(T,o) is called the set of extrema of the approxi- mation P(x,o) to f(x) in the pseudonorm MT[g(x)]. Since MT[g(x)] gives the maximum absolute value of a finite number of continuous functions on a compact set, every approximation must have at least one extremum. This means that C(T,a) is nonempty for each aeEn. The points of C(T,a) are of considerable importance in the comput- tational schemes for solving Problem 1.2 which we discuss in Chapter IV. Here we are interested in the set C(T,o) when aeR(T). We shall give certain theorems characterizing the points of C(T,o) in this case. 1.20 Theorem. There exists an aOeR(T) such that for every BeR(T) and keI, C(T,do)<:C(T,B). 32223: We have previously shown that R(T) is closed, con- vex, and nonempty. Let m be the dimension of R(T). If m = O, R(T) is a single point, the best approximation is unique, and the theorem is true. If m 1 1 the set R(T) has a nonempty interior. Let do be in the interior of R(T). It_will be shown that this point satisfies the assertion of the theorem. Let B be any other point in R(T) which is not the same as so. Extend the line segment 3;? beyond do to some other point aleR(T) so that al’ao and B are on the same line with a between a o 1 Then oo.must be a point of the hyperplane and B. Let (Xo,k,s)eC(T,do). 15 k _ . wk(xO)D [f(xO) - P. (5) Suppose, for example, that s = l and that w (x )Dk[f(x ) — P - P(x c )1 > e(T) k 0 o o’ 1 which contradicts the assumption that aleR(T). Thus 8 must be a point of the hyperplane from line (5) since 8eR(T). This same conclusion is also reached if s = -1. This implies that every BeR(T) must satisfy line (5) and hence we conclude that (xo,k,s)eC(T,B) for all BeR(T). Q. E. D. We have also proved the following: 1.21 Corollary. 
If a’is an interior point of R(T) then k k D P(Xo,a) = D P(XO,B) for every (xo,k,s)eC(T,u) and every BeR(T). 1.22 Definition. The set MES(T) = r\C(T,B) is called the BER(T) minimal extremal set (or MES) of the best approximations to f(x) on the set T. 16 The previous theorem shows that MES(T) is the extremal set of any interior point of R(T). Since Lk(x,a) is continuous in x for fixed k and a, we observe that C(T,a) is closed; hence we have the following: 1.23 Corollary. If T is any closed, nonempty subset of XXI, then MES(T) is compact and nonempty. So far we do not have any method for determining if a given approximation to f(x) is a best approximation. To alleviate this difficulty, we use the following definition and theorem. 1.2“ Definition. A polynomial P(x,a) is said to satisfy m i’ki’si)}i=l (alsokcalled points) where (xi,ki,si)eXXIX{-1,1}, if sgnED iP(xi,o)] = 'Si’ 1 = 1,2,...,m. Condition A with respect to a set of triples {(x 1.25 Theorem. A polynomial P(x,u) is a best approximation to f(x) on the set T if and only if there is no polynomial P(x,8) which satisfies Condition A with respect to C(T,o) the extremal set of the approximation P(x,o). Pgoof: Assume that P(x,o) is not a best approximation so that there exists P(x,y) such that MT[f(x) — P(x,y)] = d' < MT[f(x) - P(x,u)] = d. Let P(x,8) = P(x,o) - P(x,y). Then for any (xo,k,s)eC(T,a) we have w (x )Dk[f(x ) — P(X 0.)] = sd k 0 o o’ ' Since wk(xo) > 0 it follows that 17 DkP(x e) = Dk[f(x > - P - P1 o’ o o’ o o’ is greater than or less than s(d' - d)/wk(xo) as s is -1 or +1 respectively. Therefore, P(x,8) satisfies Condition A with respect to C(T,a). Let d = MT[f(x) - P(x,o)] and assume that there exists a polynomial P(x,8) satisfying Condition A with respect to C(T,a). Then let U = Closure {(x,k)eT:sgnDkP(x,B) = sgnDk[f(x) - P(x,a)]} d' = Max ILk(x,a)l r = Max Iwk(x)l (x’k)€U (X,k)€T s = Max IDkP(x,B)| t = Min{(d-d')/2rs,d'/rs}. (x,k)eT Then wk(X){Dk[f(X) - P 0 which 1 1 2 2 separates the origin and the convex hull of Y. Thus for Z nn 20 each n-point Yi of Y, the inequality o-Yi < 0 is satis- fied. Thus P(x,u) is a polynomial satisfying Condition A with respect to the points of Y. Assume that there exists a polynomial P(x,a) satis- fying Condition A with respect to the points of Y. Then a-Yi < O for all n-points Yi of Y. Then since Y is com- pact, there exists a constant b > 0 such that a-Yi < —b for all Y of Y. Thus the origin is not in the convex i hull of the n-points of Y. Q. E. D. Next we have a theorem which gives an upper bound on the number of points in a MCS. It states that a MCS can have at most n+1 points. 1.31 Theorem. If Y is a closed subset of XXIX{-l,l} and if the n-points of Y contain.the origin in their convex hull, then there is a set of k i n + l n—points of Y whose convex hull contains the origin. This theorem is an immediate consequence of the Theorem of Caratheodory which is stated below as given in Cheney [6] p. 17. 1.32 Theorem of Caratheodory. Let A be a subset of n- dimensional linear space. Every point of the convex hull of A is expressible as a convex linear combination of n+1 (or fewer) elements of A. In ordinary Chebyshev approximation by polynomials in a Chebyshev set of base functions a MCS must contain exactly n+1 points. This is because the interpolation 21 matrix in any n distinct points is nonsingular and hence for any n distinct points we have a polynomial satisfying Condition A. 
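Theorem 1.30 turns the question of whether some polynomial satisfies Condition A with respect to a finite set of triples into a finite-dimensional test: the origin of E^n must lie in the convex hull of the n-points s_i (D^{k_i}φ_1(x_i), ..., D^{k_i}φ_n(x_i)). That test is itself a small linear-programming feasibility problem. The sketch below is an illustration under assumptions not made in the text (Python with numpy/scipy, and the helper names are invented); it is applied to the set B_1 of Example 1.33 below, for which the thesis exhibits an explicit convex combination producing the origin.

```python
import numpy as np
from scipy.optimize import linprog

def origin_in_convex_hull(Y):
    """Theorem 1.30 test.  Y is an (m, n) array whose rows are the n-points
    s_i * (D^{k_i} phi_1(x_i), ..., D^{k_i} phi_n(x_i)) of a candidate set.
    Returns True when the origin lies in their convex hull, i.e. when NO
    polynomial satisfies Condition A with respect to the set."""
    m, n = Y.shape
    # Feasibility LP: find lambda >= 0 with sum(lambda) = 1 and Y^T lambda = 0.
    A_eq = np.vstack([Y.T, np.ones((1, m))])
    b_eq = np.append(np.zeros(n), 1.0)
    res = linprog(c=np.zeros(m), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * m, method="highs")
    return res.success

def n_point(x, k, s, n=4):
    """n-point of the triple (x, k, s) for the base functions phi_i(x) = x^(i-1)."""
    d = [0.0] * n
    for i in range(n):                      # D^k x^i
        if i >= k:
            c = 1.0
            for j in range(i - k + 1, i + 1):
                c *= j
            d[i] = c * x ** (i - k)
    return s * np.array(d)

# The set B1 of Example 1.33 (phi_i(x) = x^(i-1), n = 4):
B1 = [(-1, 0, -1), (-1 / np.sqrt(3), 1, -1), (0, 0, 1), (1, 0, -1)]
print(origin_in_convex_hull(np.array([n_point(*p) for p in B1])))  # True: no Condition A polynomial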
In the more general case a MCS may have fewer than n+1 points, and there may be more than one MCS having fewer than n+1 points. Also, there may be two or more minimal characterization sets which do not have the same number of points. These special cases will be shown in the following examples. 1.33 Example. Let I = {0,1}, w0(x) i-l wl(x) E 1. n = A, ¢i(x) = x for i = 1,2,3,” and X = {-l,-1//3,0,1//§,1}. The function f(x) is defined on XXI by the following table. x f(x) f'(X) -l -l O -1. 0 .1 ./3 O l 0 i 0 1 /3 l -l 0 Let B l {(-l,0,—1),(-1//3,1,-l),(0,0,l),(l,0,-l)}. CD II 2 ,{(-l,0,-l),(l//§,l,l),(0,0’1)’(1,0,_l)}. 22 Then B' = {(-1,1,-l,1),(0,-l,2//3,-1),(l,0,0,0),(-l,-l,-l,-l)} is the set of n—points of both B1 and B2. The origin can be expressed as a convex combination of these points of E” by using the multiples 2 + /§ /§ 2 2 - /§ 2(A + /§) A + /§ A + /3 2(u + /§) Thus using Theorem 1.30 there is no polynomial satisfying Condition A with respect to the points of B1 or B2. The nonsignularity of the generalized Vandermonde matrix im- plies that there is a polynomial satisfying Condition A with respect to any 3 point subset of either B1 or B2. Thus B1 and B2 are both minimal characterization sets. The sets B1 and B2 is the origin of E“. Since there is no polynomial satis- are both subsets of C(0) where 0 fying Condition A with respect to B1 or B2, there can not be any polynomial which satisfies Condition A with re- Spect to C(O). Hence a = 0 is a best approximation, and the sets B1 and B are both in MCS(XXI) if 0 is an interior 2 point of R. To show that 0 is interior to R we will determine R. Since 0 is a best approximation, e = M[f(x) - P(x,o)] = 1 for a = 0 and the following inequalities must be satisfied. 23 -1 i f(xi) - P(xi,o) i —1 i f'(xi) — P'(xi,a) < x1 = -1, x2 = -l//3, x l l i 1.2.3.“.5 (a) i 1,2,3,U,5 (b) 3 = o, xu = 1//§, x = 1. In the following computation we will refer to these inequalities by a letter and a number. to line (b) with i = 5. From lines (a) and (b) we have -1< Then adding and -1< Adding these we ..l< -1: -1 - a1 + a2 - a3 + an -1 - al - a2 - a3 - an dividing by 2 -1 - a - a l 3 -l + a obtain a3 1 0. -1 - a2 + (2//3)a3 - an -1 + a2 + (2//3)a3 + a“ Adding and dividing by 2 e.g., (b) 5 refers from (a) 1 from (a) 5. from (a) 3. (7) from (b) 2 from (b) A. -1 i -1 + (2/ 3)a3 implies O : (2//3)a3 and using (7) we know a3 = 0. 24 Then equation (a)3 implies that al 1 0. -l i —1 - a1 + a2 + an from (a) l -1 i -l — al - a2 - au from (a) 5. Adding and dividing by 2 -l o -1 - al and hence al 1 0 and thus a1 = 0. Equation (b)A implies that l - a2 - a“ i l and hence -a2 - an i 0. Then (a) 5 implies that 0 1 -a2 - an and hence a2 = an. From the computation using lines (a) and (b) we know that the best approximations are polynomials of the form au(x3 - x). Since this family of polynomials is zero at the points of BlLIB2, the solution space R must be deter- mined by the other points of XXI. Using the equations (b)l or (b)5 we have —1 1 -a2 + an i l or -1 1 2a“ 1 1. Then it can be verified that all of the other inequalities from (a) and (b) are satisfied for these values of an. Thus the set R of best approximations is au(x3 — x) for -l/2 i an 1 1/2. Therefore, a = 0 is interior to R, C(O) is the set MES(XXI), and the sets B and B l 2 are in MCS(XXI). 1.34 Example. Let x = {-2,-1,0,l,2}, I wo(x) a wl(x) 5 1 and ¢i(x) = x1_1 for i = 1,2,3. 
The {0,1}, n = 3, function f(x) is defined on XXI by the following table: 25 x f(x) f'(x) -2 l O -1 —1 0 O O —l l l O 2 -l O From ordinary Chebyshev approximation theory we know that B = {(-2,0,l),(-1,0,—l),(1,0,l),(2,0,-l)} l is a MCS and that a = 0 is the unique best approximation with e = 1. Now we will show that B = {(-l,O,-l),(O,l,—l),(l,0,l)} 2 is also a MCS. We must show that there is no polynomial satisfying Condition A with respect to B2. Thus we need to verify that the following inequalities are inconsistent. a1 + a2 + a3 < a1 — 32 + a3 < (8) (9) (10) Adding (8) and (10) we have a2 < 0, which contradicts (9). Therefore, B2 is a MCS. Since both B and B are subsets 2 26 of C(O) = MES(XXI), the sets B and B are in MCS(XXI). 1 2 Thus the various minimal characterization sets for a best approximation (those in MCS(XXI)) need not have the same number of points. 1.35 Theorem. If T is a finite subset of XXI, the dimension of R(T) is n - k where n is the number of base functions and k is the rank of B = WIMEMES(T)]. Proof: Let To = {(x,k):(x,k,s)eMES(T)} and let T = T ~ T 1 o' If P(x,y) is a best approximation, where y is in the in- terior of R(T), (if the best approximation is unique, let P(x,y) be this unique best approximation) then MTl[f(x) — P(X,Y)] = d < e(T). This is because the maxi- mum is taken over points which are not points of MES(T) and because T is finite. Then the continuity of MTl[f(x) - P(x,y)] in y implies that there exists an e > 0 such that MTl[f(x) - P(x,y) - P(x,a)] < e(T) for oeN(o,e) a small neighborhood of the origin in En. Next consider the solution space S of But = 0. S has dimension n-k where k is the rank of B. We also know that MTO[P(x,o)] = 0 and MTO[f(x) — P(x,y) - P(x,u)] = e(T) for all aeS. Then for aeN(o,e)f\S it follows that MT[f(x) - P(x,y) - P(x,y)] = e(T) since T = TotlTl. Since N(o,e)/\S has dimension n-k, the dimension of R(T) must be at least n-k. But any two best approximations must be equal at the points of To' Then if the dimension of R(T) is greater than n-k, the solution space of But = 0 must 27 also be greater than n-k. This is a contradiction to the assumption that k is the rank of B. Q. E. D. In the following corollaries, T is assumed to be a closed subset of XXI and not necessarily finite. 1.36 Corollary. If T is any closed subset of XXI, the dimension of R(T) is bounded above by n-k. 1.37 Corollary. If R(T) is not a single point and is a boundary point of R(T), then C(T,o) contains at least one more point than MES(T). Pooofz If not, then the proof of Theorem 1.35 implies that for some a > 0 there is a neighborhood N(a,e) of a such that N(a,e)/lSCR(T). This contradicts the assumption that o is on the boundary of R(T). 1.38 Corollary. If the rank of WIMEMES(T)] is equal to n, then the set of best approximations R(T) is a single point and the best approximation is unique. 1.39 Definition. A polynomial P(x,o) is said to have a ziro at the point (xo,ko) or the point (xo,kO D oP(xo,d) = 0. ,so) if 1.40 Lemma. If Q is a MCS of k i n + 1 points, then the ranks of both IM(Q) and IM(QO) are 1-1 where QO is any k—l point subset of Q. 28 3193;: Let m be the rank of IM(Q). This means that the n-points of Q (to be denoted NQ) span a subspace of En of dimension m. (The rows of IM(Q) are the n-points of Q except for the factor of -1 if s = -1.) The origin of En is in the convex hull of NQ because Q is a MCS. 
Apply- ing the Theorem of Caratheodory to this m dimensional sub— space, the origin may be expressed as a convex combination of some set of not more than m+1 points from NQ. Since Q is a MCS, the origin cannot be expressed as a linear combination of any subset of NQ and hence k :_m + 1. If k < m + l, the rank of IM(Q) is k and there exists a poly- nomial satisfying Condition A with respect to Q. Thus k = m + 1 and the rank of IM(Q) is khl. Suppose that the rank of IM(QO) is less than k-l for some k-l point subset Q0 of Q. This implies that there exists a polynomial P(x) which is zero at the points of Q0 and nonzero at (xo,ko,so) = Q ~ Q0. Since Q is a MCS there exists a polynomial V(x) which satisfies Condition A with respect to Q0 such that 30 # -sgn[Dk°V(xo)]. Then k if D °V(xo) s o, k k k V(x) - 2[D OV(xO)]sgn[D °P(xo)]P(x)/|D"P(xo)| is equal to V(x) at the points of Q0 and is equal to k‘ k 0 o -D V(xo) at (xo,ko,so). If D V(xo) = 0, then k k V(x) — sosgn[D °P(xo)]P(x)/|D °P(xo)| 29 is equal to V(x) at the points of Q0 and is equal to -sO at (xo,ko,so). Thus in either case, there exists a poly- nomial satisfying Condition A with respect to Q. This is a contradiction and implies that the rank of IM(QO) is k-l. Q. E. D. 1.A1 Theorem. If any set in MCS(T) contains n+1 points then the best approximation is unique. Pooog: Let Q be a set of n+1 points which is in MCS(T). Since we are assuming that e(T) > 0 in this section, the ranks of WIM(Q) and IM(Q) are the same. (The weight functions are nonzero on Q.) Since Q is a subset of MES(T) we may apply Corollary 1.38 and Lemma 1.A0. Q. E. D. CHAPTER II CONVERGENCE CONSIDERATIONS 1. Uniform Convergence of the Po1ynomial to the Function In uniform approximation theory one basic property that any set of approximating polynomials should possess is that of the uniform convergence of the polynomials of best approximation to the function being approximated as the number of base functions increases. This is desirable because when approximating a function, a certain maximum allowable deviation is usually specified and the number of base functions is chosen so that this requirement can be satisfied. If the approximating polynomials do not converge uniformly to the function, then for some speci- fied maximum deviation 6 > 0 we may use as many base functions as we please, but the deviation e of the best approximation will always be greater than c. The follow- ing example illustrates a typical difficulty which must be avoided. 2.1 Example. Let f(x) = sin x + l, I = {0,1}, X = [—l,l], ¢i(x) = x1 for i = 1,2,... and wo(x) wl(x) a 1. Then 30 31 we know that DP(x,a) can approximate Df(x) as closely as desired using the Weierstrass Approximation Theorem, but the difference between f(x) and P(x,a) is 1 at x = 0 for any polynomial in these base functions. The following theorem gives conditions on the base functions which are sufficient to overcome this difficulty. m i-l q 2.2 Theorem. Let {¢i(x)}i=1 include the set {x }1:1 and let X = [a,b], where b > a. Suppose that for any f(x)qu(X) and any 2' > 0 there exists an n and a poly- O nomial P(x,B) in no base functions such that IDq[f(x) - P(x,B)]I < e' on X. Then given any a > 0 n there exists an n and a polynomial P(x,o) = z ai¢1(x) i=1 such that M[f(x) - P(x,a)] < s. Proof: Let a > 0 be given and let 8' = e/E where E = Max [Iwk(x)l(b - a)q-k]. This maximum exists (x,k)eXXI because the functions wk(x) are continuous on X for keI. 
i-l q Obtain a polynomial P(x,8) which has {x }i=l among its base functions and for which the coefficients of{x1-l}?_=1 are zero, such that qu[f(x) - P(x,8)]| < e' on X. Let T(x) be the Taylor expansion of P(x,8) - f(x) about the point x = a with q terms and define P(x,a) = P(x,B) - T(x). Then for any XeX IDq[f(x) — P(x,a)1| = IDq[f(x> - P(x,8)]l < e' More generally we have 32 IDk[f(x) - P(x,a)]l < e'(b - a)q’k for k = 0,1,...,q This is true because Dk[f(a) — P(a,a)] = 0 for k = 0,1,...,q-l. Hence lwk(x)Dk[f(x) — P(x,o)]|< Maxlwk(x)|e'(b - a)q-k §_Ee' = c xeX for all keI and xeX. Q. E. D. 2. Continuous Dependence of the Approximating Polynomial on the Function The next theorem is an extension of a theorem by Maehly and Witzgall [1“]. It relates the closeness of P(x,af) and P(x,ug), (best approximations to f(x) and g(x) respectively in the pseudonorm M) as a function of the closeness of f(x) and g(x) where g(x) is considered fixed. As expected, under the appropriate hypotheses the best approximations to f(x) and g(x) will be close to each other if f(x) is sufficiently close to g(x). 2.3 Theorem. Let f(x) and g(x) be in Cq(X) with sets of best approximations Rf and R8 respectively on XXI, with respect to the weight functions {wk(X)}keI' If MCS(XXI)g (the set of minimal characterization sets which are sub- sets of MES(XXI) for the function g(x)) contains a MCS of n+1 points, then there exists a constant B which depends n only on g(x), {¢i(x)}i=1 and{wk(x)}kEI such that 33 Max le[P(x,af) - P(x,ag)]l i B Max le[f(x) - g(x)]l (x,k)eXXI (x,k)eXXI where “g is the unique best approximation to g(x) and of is any best approximation in Rf. n+1 Proof: Let Q s {(xi’ki’si)}i=l be a MCS from MCS(XxI)g and let ef and eg be the deviations of P(x,af) and P(x,ag) from f(x) and g(x) respectively. (Since it is assumed from the beginning of this paper that e, the deviation of the best approximations to a function, is nonzero for any function under consideration, we will assume that eg > 0.) Also let a = Max IDk[f(x) — g(x)]l and w = Max lwk(x)|. (x,k)EXXI (X,k)eXxI Then ef _<_ M[f(x) - P(x,ag>l §_ M[f(x) - g(x)] + M[g(x) - P(x,ag)] 1 W6 + eg and hence -W6 1 eg - ef (1) We also know that k siwki(xi)D'i[g(xi) - P(xi,ag)] = eg (2) k Siwki(xi)D i[f(xi) - P(xi,af)] i ef (3) 34 for i = 1,2,..un+l; upon subtracting (3) from (2) and using line (1) we have ki ki siwki(xi)[D P(xi,af - ag)] 1 siwki(x1)D [f(xi) — g(xi)] +6 - g ef 1 - 2W6. Thus letting a = of - cg we see that k siD iP(xi,o) 1.- ggi for i = l,2,...n+l (4) where z = Min Iwk (x1)|. (z is not zero because i=1,..”n+1 i e > 0. g ) To complete the proof we shall prove the following lemma. 2.4 Lemma. Let P(x,a) satisfy k D iP(xi,a) 1 -2W6/z for i = l,2,...,n+l, where 5’i _ n+1 Q - {(xi’ki’si)}i=l is a MCS of n+1 points. Then there is a constant B independent of 6 such that Max IDkP(X,q)I 1 B6. (x,k)eXXI Proof: Assume that such a constant does not exist. Then for each integer m there exists an “m such that P(x,am) satisfies (4) and Max leP(x,am)| > m6. As a conse- (x,k)eXXI , quence of Lemma 1.40 and Lemma 1.13, the sequence {P(x,am)} is bounded at most n-l points of Q. In addition, for each polynomial in the sequence there must be a point 35 k l (xi,k1,si)eQ such that sgnED P(xi,am)] # -s (Other— 1' wise we would have a polynomial satisfying Condition A with respect to Q.) Thus for each am there exists a point (xT,ki,xi) Q such that k 0 : siD iP(x?,am) 1 2W6/z. 
Choose a subsequence of {P(x,am)}, also to be de— ki noted {P(x,am)}, so that the sequences {D P(xi,am)} converge to 11 or diverge to :w. This divides Q into two k parts, Q1 and Q2 where Ql is where {D iP(xi,am)} is bounded and Q2 is the rest. Thus Q1 and Q2 are both nonempty and Q1 contains no more than n-l points. For each (Xj’kj’sj) in Q1 consider the functions P(x,8,) such that ) II (D I >a “J D P(xJ.,BJ ) I O k D JP(xi,B for xite, 1 # j. J Then specify P(x,BJ) at as many more points to make it unique. (We know that these polynomials exist by Lemma 1.40.) Consider the function P(X,Efi) E P(x,am) + 2 P(x,81). XieTl k For each point of Q2, {D lP(xij'mfl diverges to siw. 36 For points of Q1 we have R k i — i D P(xi,am) = D P(Xi,dm) + s - A i i' k Thus for m large enough, sgnED iP(x1,§'m)] = 31 for i = 1,2,..”n+1 and therefore -P(x,§h) satisfies Condition A with respect to Q. This is a contradiction and com- pletes the proof of the lemma and Theorem 2.3. A more general result which is applicable to all approximation problems under consideration is obtained in the following theorem. 2.5 Theorem. Let M be a pseudonorm over XXI with weight functions wk(x) for keI. Let f(x) and g(x) be functions with sets of best approximations Rf and Rg respectively. Then given a > 0 there exists a 6 >0 depending only on e and g(x) such that if afeRf, then M[g(x) - f(x)] < 6 implies that Min M[P(X,ag) — P(x,af)] < e. R age 3 Proof: If this theorem is not true, there exists an so > 0 and a sequence of functions {fm(x)} and their corresponding sets of best approximations Rf from m which we obtain a sequence {P(x,am)} which satisfies the following: 37 Min M[P(x,og) - P(x,am)] 1 e R 0‘se a O for all m. (5) l M[g(X) - fm(X)] < 5- Then using Lemma 1.11 and the triangle inequality we have M[P(x,am)] 1,2°M[fm(x)] 1 2(M[g(x)] + l/m) i 2M[g(x)] + 2 for all m. Since we are assuming that M[P(x,a)] = 0 implies IIaII = 0, there exists a set T of n points which has a nonsingular interpolation matrix. Lemma 1.13 implies that the co- efficients of {P(x,cm)} are uniformly bounded for all m and hence there is a convergent subsequence, which also will be denoted {P(x,am)}, which converges to a poly- nomial P(x,cm). Then the assumption given in line (5) implies that/P(x,aw) is not a best approximation to g(x). Therefore there exists a constant e' > 0 such that if ageRg then M[g(x) — P(x,og)] + e' = M[g(x) — P(x,am)] i M[g(x) - fm(X)] + MEfm(X) - P(X,Um)] + M[P(x,am) — P(x,am)]. (6) 38 Then determine N so that m 1 N implies that M[g(x) - fm(x)] i e'/4 and M[P(x,um) - P(x,am)] i e'/4. (7) Therefore, for m 1 N it follows from (6) and (7) that M[g(x) - P(x,ag)] + €'/2 i.M[fm(X) - P(x,am)]. (8) But we also have MEfm(x) - P(x,ag)] :_M[fm(x) - g(x)] + M[g(x) - P(x,ag)] i e'/4 + M[g(x) - P(x,og)] for m 1 N. (9) Thus (8) and (9) imply that there exists an m such that M[fm(x) - P(x,ag)] + e'/4 i e'/2 + M[g(x) — P(x,ag)] i MEfm(x) - P(x,am)]. This is a contradiction since P(x,dm) is a best approxi- mation to fm(x). Q. E. D. 3. The de la Vallée Poussin Algorithm In this section it will be assumed that X = [a,b] where a < b. Moreover, the approximation problems under consideration all satisfy Assumption 1.7 and hence from Lemma 1.8 there exists a set Q0 of n points from XXI for 39 which det[W1M(QO)] # O. From the set XXI choose a count- able dense subset Q' which contains the set Q0. Let Q" = {(x,k)eXXI:wk(x) = O} and Q = Q' ~ Q". 
Since the weight functions are continuous, Q is dense in XXI except where the weight functions are identically zero on an interval. In addition, since det[WIM(QO)] # O, we know that QO C, Q. Let Tm be subsets of Q defined for m 1 n and con— taining m points such that QoCTkCTk+l for k = n,n+l,... where the limiting set of points Tm as m approaches m is Q. Also, let 5m (the spacing or density of the set Tm in Q) be defined as 6' = Sup {Min[lx - x'l + (b-a)|k-k'|]}. (x,k)eQ (x',x')eTm (10) The quantity b-a is used in this definition so that am < b-a implies that there are at least two points (x,k)eTm with second coordinates equal for each keI. Then one would hope that for ameR(Tm) and a*eR it would follow that lim M[f(x) — P(x,am)] = M[f(x) — P(x,a*)] = e m+oo More specifically, if {(P(x,o )} is any sequence with terms m from R(Tm), one would hope that any convergent subsequence of {P(x,am)} would converge to a polynomial in R. These no results are proved below, with stronger conclusions being obtained in certain cases. 2.6 Definition. f(x) is said to satisfy a Halder con- dition on [a,b] with exponent 9 if there exists a con— stant A such that |f(y) - f(X)| : Aly - x|p p > 0 for all x and y in [a,b]. 2.7 Remark. If f'(x) satisfies a H81der condition on [a,b] with exponent p§_l, then f(x) also satisfies a Halder condition on [a,b] with exponent p. This follows from the boundedness of f'(x) on [a,b] and the fact that if f(x) satisfies a Holder condition with exponent 1, then it also satisfies a Holder condition with exponent p for 0 < p <1. (See Natanson [21] p. 72.) 2.8 Theorem. Let qu(x) and {Dq¢i(x)}i:l satisfy a Holder condition with exponent p :1. Then there exists n a constant K depending on f(x), {¢i(x)}i=l and {wk(x)}kEI such that o 1 M[f(x) — P(x,am)] - e : K(6m)° for m 1.n and any umeR(Tm). (Here 6m is defined in line (10).) 41 Proof: Choose N so that m 1_N implies that am < b-a and then assume that m 1_N in this proof. Let (y,k)eXXI be such that M£f - P(X,am)] = Iwk(y)Dk[f(y) - P(y,am)]| for a fixed omsR(Tm); also, let (z,k) be a point in Tm such that Iy-zl < 6m. Let e and e(Tm) denote the deviations of P(x,a*) and P(x,am) from f(x) on XXI and Tm respectively where a*eR and ameR(Tm). Then M[f(x) - P(x,am)l = Iwk(y)Dk[f(y) - P(y,am)]| 1 |wk(y)Dk[f(y) - f(z)]| + |wk(y)Dk[f(z) — P(z,am)]| + Iwk4 1 2 Ll(x) wl(x)[Df(x) - 1/2] (4-x)(5/2-x) if 2 < x < 4 It may be verified that the extreme values of Lo(x) on [0,4] are 1 and -l at x = 0 and x = 4 respectively. The extreme value of Ll(x) is l for x = 2. Then the claim is that Q = {(0,0,1),(4,0,-1),(2,1,l)} is in MCS(XXI) and since the deviation of the approximation is not attained at any other point of XXI it follows that Q is equal to MES(XXI). Using Theorem 1.30 it must be shown that the origin is in the convex hull of the n-points of Q which are (1,0,0), (-l,—4,-l6) and (0,1,4). Using the multiples 1/6, 1/6 and 2/3 respectively, we have the origin as a convex combination of these points. This shows that there is no polynomial satisfying Condition A with respect to Q. Then using the properties of the generalized Vandermonde matrix, any two point subset Q' of Q does have a poly- nomial satisfying condition A with Respect to Q'. Thus QeMCS(XXI), e = 1, and x/2 - l is a best approximation. Next we observe that ax(x-4) is the only family of polynomials in the base functions under consideration which is zero at the points of Q. 
Hence using Corollary 1.21, the space of best approximations is P(x,a) E x/2 - 1 + ax(x-4) where a is restricted to some interval containing zero. Then it may be verified that the space of best approximations 46 is P(x,a) for ae[—(2/2 — l)/8,(2/2 - l)/8]. The weighted error curves are given in the following graphs for various values of a. a1\‘ a2\ a {A au\ a5\ Lo(x) = wo(x)[f(x) - P(x,a)] for a = a1, i = l,2,3,4,5 Ll(x) = wl(x)[Df(x) - DP(x,a)] for a = a1, 1 = l,2,3,4,5 a1 = (2/2 - 1)/8 , a2 = al/2 , a3 = c, a“ = -a2, a = -a 5 l' 47 2. A General Uniqpeness Theorem We have shown by example that a solution to a problem of approximating a function and its derivative need not be unique. However, certain problems do have unique solutions. Such a unique solution can sometimes be detected using the results of Chapter I. For example if one has a solution to a given approximation problem and it has a MCS of n+1 points, then it follows from Theorem 1.41 that this solution is unique. A much more desirable result is one which can be used to claim uniqueness before a solution is found. Such a result will hold provided that {¢i(x)}1:1 and f(x) satisfy certain conditions. First we shall indicate conditions on the base functions which will be used in a uniqueness proof. Throughout this chapter the notation |s| will be used to indicate the cardinality of the finite point set S. We shall first consider the special base functions ¢i(x) = x1-1 for i = 1,2,...,n. We shall use the following properties of these base functions: (a) D2¢i(x) exists on [a,b] for i = 1,2,...,n. (b) DP(x,a) = iElaiD¢i(x) has at most n-2 zeros on [a,b] if DP(x,a) z 0 on [a,b]. (c) At most one base function is constant on [a,b]. In the following two lemmas certain point sets of n and n-1 points are described which have interpolation matrices of rank n and n-1 respectively. A MES cannot have an interpolation matrix for which the rank is equal to the 48 number of points. If it did, there would exist a poly- nomial satisfying Condition A with respect to this MES. Thus we are able to establish a lower bound on the number of points in a MES. This lower bound is essential to the proof of uniqueness theorems. 3.2 Remark. If it is desired to interpolate the function 0 at the points of a finite set QCZXXI by a polynomial in n base functions, a homogeneous system of m = IQI linear equations in n unknowns must be solved. From the theory of linear equations, the dimension of the solution space will be n-r where r is the rank of the interpolation matrix of Q in n base functions. 3.3 Definition. A polynomial P(x,a) has a zero at the point (xo,k) or the point (xo,k,s) if DkP(xO,a) = 0. 3.4 14311112- Given any set TC[a,b] such that |T| = n-l, define Q' = TX{1}. Then if n > 1 the interpolation matrix IM(Q') has rank n-l. £5993: Assume that IM(Q') has rank k i n - 2. Consider the problem of interpolating the function 0 at the points of Q'. The dimension of the solution space is n - k 1 2. Since at most one base function is constant on [a,b], there exists a polynomial P(x,a) such that DP(x,a) i 0 on [a,b] and DP(x,a) has n-l zeros on [a,b]. This contradicts property (b) and completes the proof. Q. E. D. 49 3.5 gommo. Let S,T' and T" be finite sets such that T"C{a,b}, T'CSC[a,b] and [S] + |T'UT"| = n; also, define T = T'LJT" and Q = SX{0}L}TX{1}. Then IM(Q), the interpolation matrix of Q, is nonsingular if n > 2 or S in nonempty. Ppoofl: First we observe that n > 2 implies that ISI 1 1. 
Then assume that the interpolation matrix for Q is singular and that n > 2. This implies the existence of a nontrivial polynomial P(x,a) with zeros at the points of Q. Since T" can only contain the end points of [a,b], we may apply Rolle's Theorem to every adjacent pair of points from SX{0}, if there are any, to conclude that DP(x,a) has at least ISI - 1 zeros on [a,b] in addition to the |T| zeros of TX{1}. This implies that DP(x,a) has at least ISI + |T| - 1 = IQI - l = n - l zeros which contradicts property (b) of the base functions under consideration. Thus DP(x,o) E 0 and since ISI 1 1, it follows that P(x,a) e- 0. Q. E. D. In Corollary 1.21 we established that any two best approximations must agree at the points of a MES. In the following theorem we find that under the appropriate hypotheses the first derivatives of any two best approxi- mations must also agree at the points of MES which are interior to XXI. 50 3.6 Theorem. Let (xo,k,s) be a point from MES(XXI) such that x0 is in the interior of [a,b]. Then if Dwk(xo) and Dk+l[f(xo) - P(xoo)] exist for aeEn it follows that Dk+1 Dk+1 P(xo,c) = f(xo) + seDwk(xO)/[wk(xo)]2 for all deR. k Proof: For any aeR, Lk(xo,o) = wk(xo)D [f(xo) - P(xo,a)] has a relative extrema at X0 in the interior of [a,b]. Since the derivative of Lk(x,a) exists at xO it must be zero there. Thus k+ 1[ Dwk(xO)Dk[f(xO) — P(xo,c)] + D f(x ) — P(xo,a)]wk(xo) = o O and since we assume that e > 0 it follows that wk(xo) # 0 and hence Dk+lP(xo,a) = Dk+l[f(xo)] + Dwk(xO)Dk[f(xO) — P(xo,a)]/wk(xo) Dk +1f(xo) + seDwk(xO)/[wk(xo)]2 since Lk(xo,a) = se. Q. E. D. The proof of the following uniqueness theorem follows the pattern of the proof used in the classical case of ordinary uniform approximation theory. If there are two best approximations, it follows from Corollary 1.21 and Theorem 3.6 that the difference of the two is a poly- nomial with a certain number of zeros. This number is 51 dependent on the number of points in the MES. Under appropriate conditions this number of zeros-is sufficient to insure that the difference polynomial is identically zero. This completes the proof of uniqueness. This same theorem is given in Moursund [16,17]. The proof here is considerably simplified, and easily extends to other cases of interest. 3.7 Theorem. Let X = [a,b], I = {0,1}, and let Dwo(x), le(x), and D2f(x) exist on [a,b]. If ¢1(x) = xi—l for i = 1,2,...J1then one of the following is true: (a) The best approximation is unique. (b) The best approximation is unique except for an additive constant, and if P(x,a) is any best approximation then DP(x,B) is the unique best approximation to Df(x) with weight function wl(x). Ppoog: For notational convenience we define the following sets. MES(XXI) is the minimal extremal set of points for the best approximations on XXI. Then let 0 II {(x,k):(x,k,s)eMES(XXI)} 0 so = Qofl(a,b)X{0} Sa = Qof){(a,0)} s = b Qon{(b,o)} 52 To = QO()(a,b)X{l} Ta = QOI){(a,1)} Tb = Q0 fl{(b,l)} E: II {(x,0):(x,0,s)eSO and (x,l,s)eTo} * = .. S SO W The set 8* is the set of interior extrema from S0 which do not have the same x coordinate as the interior extrema from To. Case I.--We first consider the special case when n f. 2 and SOUSaUSb is empty. If n = 1 then MES(XXI) must contain at least one point since every approximation has at least one extremum. If n = 2 the set MES(XXI) must con- tain at least two points because Lemma 3.4 implies that there exists a polynomial satisfying Condition A with re- spect to any single point. 
Therefore ITol + lTaI + ITbI 1 n and using Corollary 1.21 the difference between two best approximations is a nontrivial polynomial P(x,a), which is zero at the points of MES(XXI). Thus DP(x,o) has at least n zeros which implies that DP(x,d) a 0 using property (b) of the base functions. Thus the best approximations are unique except for an additive constant. 53 Case II.--The remaining case is when n > 2 or SOLisaLISb is not empty. We begin by showing that 2 ITOI + IT + lTbl + |S*| + lsal + lsbl = m > n. <1) a | We shall show that if (1) is not satisfied then there exists a polynomial satisfying Condition A with respect to MES(XXI). Assume that m 1_n and let S'CIX ~ {xeX:(x,k,s)eMES(Xxl)} such that IS'I = n—m. Also define U) ll {x:(x,k)eTOU S*USaUSb}U S' Tl {x:(x,k)eTO} T" = {X:(x,k)sTaU Tb}. Since all of the sets used in the definition of S, T' and T" are pairwise disjoint, ISI ITOI + |s*| + lsal + Isbl + ls'l lT'l |T| IT"! = IT 0 + IT 3' al and it follows that the sets S, T' and T" satisfy the hypothesis of Lemma 3.5. This implies that IM(Q) is non- singular (the Q is from Lemma 3.5). Since QOCLQ, the rank of IM(QO) is equal to IQOI. Thus there exists a polynomial satisfying Condition A with respect to MES(XXI) and the truth of line (1) is verified by this contradiction. 54 Every best approximation must satisfy Corollary 1.21. Then if there is more than one best approximation, the difference between two of them is a nontrivial polynomial P(x,a) which is zero at every point of MES(XXI), or equivalently on Q0. Moreover, since the hypothesis of Theorem 3.6 are satisfied for any (xo,k,s) where x0 is interior to [a,b], it follows that DP(x,a) must be zero at all points (x,k)eS*LlTO. Thus P(x,a) has double zeros at the points of T0 and single zeros at each point of TatJTb, which means that DP(x,o) has at least 2|TO| + ITaI + ITbI zeros. (See Definition 3.3.) The polymonial DP(x,o) also has zeros at the points of 8*. Then because of the manner in which 8* was defined we may conclude that DP(x,o) must have at least m' = 2|TO| + lTal + lTbl + |S*| zeros. From line (1) we see that m' + ISaI + ISbI > n, and since |sa| + Is 1 2 it follows that m' > n-2. This contra- b I diets property (b) of the polynomials under consideration and implies that DP(x,o) s 0. Therefore, the first derivatives of any two best approximations are equal and the best approximations may differ by a constant. If SOLISaLJSb is not empty, P(x,a) (which is a constant) must have at least one zero) and hence P(x,a) e 0 and the best approximation is unique. Q. E. D. 55 In the next theorem, a class of polynomials is described which has the same three properties that were used to prove uniqueness in Theorem 3.7. Since these three properties are the only properties of the base functions that were used in the proof, Theorem 3.7 and its proof are also valid for this class of base functions. 3.8 Theorem. Let g(x) satisfy the following conditions: (a) g(x) is strictly monotone on [a,b]. (b) Dg(x) has no zeros on [a,b] (c) D2g(x) exists on [a,b] Then letting [g(x)lj"l n P(x,a) = z aJ J=l it follows that DP(x,a) can have at most n-2 zeros on [a,b], where zeros of multiplicity 2 or more are counted as two zeros. Proof: Assume that DP(x,o) has k zeros of multiplicity 1 denoted by'{xi}il:_l and m zeros of multiplicity 2 or more k+m denoted by {xi}i=k+l , where k + 2m > n - 2. 
Then n DP(xi,a) = Dg(x1) z a = _ «1‘2 0 3:2 J(J l)[g(xi)] for i = l,2,...,k+m, and n o = D2P = [Ds(xi)]2323aj(J-l)(J-2)[g(x1)lj'3 2 n 3-2 + D g(X ) 2 a (J—l)[s(x )1 for i = k+l,...,k+m. 56 Since Dg(x) is nonzero on [a,b] it follows that n J-2 for y1 = g(xi) , i = 1,...,k+m, and <> n < 33 0 = Q y 5 Z a J-l)(J-2)(y ) ‘ for y1 = g(xi) , i = k+l,...,k+m. Moreover, g(x) is strictly monotone on [a,b], and hence the numbers {yi}§:T are all distinct. Thus Q(Y), which is a polynomial in y of degree n—2, has at least n-l zeros. This is a contradiction and completes the proof. 3.9 Corollary. If g(x) satisfies the hypothesis of Theorem 3.8, Theorem 3.7 is true for the base functions _ i-l _ ¢i(X) - [g(x)] for i - 1,2,...,n EEQQL‘ These base functions obviously satisfy properties (a) and (0) listed at the beginning of this section. More- over, the proof of Theorem 3.7 did not consider zeros of any derivative higher than the second and hence these base functions also satisfy property (b) as it was used in the proof. Since properties (a),(b),and (c) are the only properties of the base functions used in the proof of Theorem 3.7 and the lemmas leading up to it, the proof of this corollary is complete. 57 3. The Trigonometric Polynomials In this section on trigonometric polynomials, X will be the half open interval [0,2n) where the topology of X is the ordinary topology on [0,2n] with the points 0 and Zn being identified with each other. The polynomials will be represented by sin(ix). "MW P(x,a) = a1 + a2icos(ix) + a 1 2i+1 i P(x,a) will be called a polynomial of degree k if Ia2k| + |a2k+1| > 0. It is well known that a trigonometric polynomial of degree k > 0 has at most 2k zeros counting multiplicity, on the interval X. Thus one can easily show: 3.10 Lemma. Given sets S and T such that TCSCX and |S| + |T| = 2k+l, the interpolation matrix IM(Q) of Q = Sx{0}lJTX{1} in n 2k+l base functions is nonsingular. 3.11 Theorem. Let X [0,2n), I = {0,1}, and let Dwo(x), le(x), and D2f(x) exist on X; then if the approximating polynomials are the trigonometric poly- nomials of degree k, one of the following is true: (a) The best approximation in unique. (b) The best approximation is unique except for an additive constant and if P(x,a) is any best approximation, then DP(x,B) is the unique best approximation to Df(x) with weight function wl(x). 58 The proof of this theorem is almost identical to that of Case 11 of Theorem 3.7. Using the notation intro- are empty (and duced there, the sets Sa’ S Ta’ and T b’ b hence not used) since X has no end points. The two differences between these two proofs are as follows: (1) Lemma 3.10 is used here instead of Lemma 3.5. (2) DP(x,o) in n base functions can have at most n-l = 2k zeros for trigonometric polynomials of degree k instead of the n-2 zeros as assumed in property (b) of section (2). 4. A Special Class of Polynomials This section will deal with the special case where i+k for i = 1,2,...,n where k 1 0 is a fixed ¢1(X) = x integer. Let I = {0,1} and X = [a,b] where 0 < a < b. (Results similar to those obtained in this section hold if a < b < O.) The functions D2f(x), Dw0(x) and le(x) will be assumed to exist on [a,b]. It should be noted that both P(x,a) and DP(x,a) in these base functions can have at most n-l zeros in the interval [a,b]. This is because P(x,a) is a polynomial of degree n+k with k+l zeros at x O and DP(x,a) is a polynomial of degree n+k-1 with k zeros at x = 0. From this property of the base functions under consideration we have the following lemma. 
59 3.12 Lom_m_§_. Given sets S and T' such that T'cSc[a,b] and a set T" which is empty or is equal to {b} where |s| + IT'U T"| = n, define T = T'U T" and Q = SX{0}UTX{1}. Then IM(Q) is nonsingular. 3322:: Assume that IM(Q) is singular. Then the problem of interpolating 0 at the points of Q has a nontrivial solution P(x,a). If |S| + IT'I = n, this polynomial P(x,a) has n zeros in [a,b] counting multiplicity. This implies that P(x,a) a 0 and hence IM(Q) must be nonsingular. If |s| + IT'I < n it follows that T" is not empty. Then applying Rolle's Theorem to P(x,a) on the whole real line we find that DP(x,o) has at least k zeros at x = 0 and at least n-l more zeros in (0,b). The zeros counted cannot include the zero at the point b since b is the furthest point from x = 0 of all points in [a,b]. Thus DP(x,a) has at least k+(n-1)+1 = k+n zeros which means that DP(x,o) 2 0. Moreover, P(x,a) has a zero at x = 0 so P(x,a) E 0 and IM(Q) must also be nonsingular in this case. Q. E. D. 3.13 Lemma. Let t t .,tm be m distinct points in 1, 2,0. (a,b). Suppose D3g(x) exists on [a,b] and g(a) = g(b) = 0, and Dg(ti) = D2g(ti) = 0 for i = l,2,...,m. Then there exists at least 2m+l zeros of Dg(x) in (a,b) counting multiplicity. 6O Epoog: If Dg(ti) = D2g(ti) = D3g(ti) = 0 for some i, the conclusion is true. Therefore, assume that D3g(t1) # 0 for i = l,2,...,m. Then the points ti for i = 1,2,...,m are relative maximum or minimum points for Dg(x). If the 2m zeros which we assume for Dg(x) are the only ones that Dg(x) has in (a,b), it follows that Dg(x) is either nonnegative or nonpositive on [a,b]. Hence g(x) is a monotone function on [a,b] and g(x) s 0. Here we have a contradiction and the proof is complete. 3.14 Theorem. Under the assumptions of this section and the assumption that wl(a) = 0, the best approximation is always unique. The assumption that wl(a) = 0 implies that the points (a,l,1) and (a,1,—l) cannot be in the set MES(XXI). This restriction of the points of MES(XXI) is necessary in order that the method of proof used in Theorem 3.7 may also be used here. The proof of this theorem is almost identical to that of Case II of Theorem 3.7. Using the notation intro— duced there, the set Ta is empty (and hence not used) since w1(a) = 0. The three differences are as follows: (1) Lemma 3.12 is used here instead of Lemma 3.5. (2) In counting the zeros that DP(x,o) must have we use Lemma 3.13. (3) DP(x,o) in n base functions can have at most n-l zeros for the polynomials under 61 consideration rather than the n-2 zeros as assumed in property (b) of section 2. Finally, since there is no constant among the base functions, two best approximations cannot differ by a constant. Thus DP(x,o) 5 0 implies that P(x,a) s 0. CHAPTER IV COMPUTATIONAL ALGORITHMS FOR A BEST APPROXIMATION 1. Introduction The most basic tool in the computation of uniform approximations is linear programming, which can be used to find best approximations on finite point sets. The approximation problem under consideration may be stated as a linear programming problem as follows: find a point a = (al,a2,...,an) in En such that n Iwk(x)Dk[f(x) — z ai¢i(x)]l : e for (x,k)eXXI (1) i=1 where e is to be a minimum. This is equivalent to the problem: maximize -e subject to the constraints wk(x)Dkf(x) 1 -e + wk(x)DkP(x,a) for (x,k)eXXI. -wk(x)Dkf(x) 1 —e — wk(x)DkP(x,d) Since (1) implies that e is nonnegative, the quantity -e is bounded above. 
In Chapter II the de la Vallée Poussin Algorithm for finding a best approximation was given. In this algorithm each term of the sequence of polynomials which converges to a best approximation is a best approximation on some finite point set. Therefore, linear programming may be used to generate this sequence. This algorithm is acceptable if we do not demand an approximation which is very close to a best approximation. If a high degree of accuracy is desired, this algorithm is applicable but impractical because of the large number of constraints that must be included in the linear program. To eliminate the need for solving linear programming problems with large numbers of constraints, an algorithm based on the characterization of best approximations will be developed.

In the case of the uniform approximation of a function by polynomials over a Chebyshev set of base functions, the Remez Algorithm is a very efficient method for generating a sequence of polynomials which converges to a best approximation. At each step of this algorithm a best approximation is computed on n+1 points. This particular linear programming problem can be reduced to solving a system of n+1 simultaneous linear equations in n+1 unknowns. Thus at no time in the process does one need to solve a linear programming problem involving a very large number of constraints.

In the more general problem to be considered here, which includes the weighted approximation of a function and its derivatives by polynomials in a general set of base functions, the direct extension of the Remez Algorithm does not necessarily converge to a best approximation. Moreover, the general n+1 point problem is not easily converted into a system of simultaneous linear equations except in certain special cases (see Moursund and Stroud [18]). Hence we will rely heavily upon the standard linear programming algorithm, and develop an algorithm which uses the fact that the deviation e(T) of a best approximation on a compact set T ⊂ X×I is, under suitable conditions, a continuous function of T. In what follows, d(T,T′) denotes the distance between the compact subsets T and T′ of X×I.

4.1 Definition. The sequence {T_m} of compact subsets of X×I converges to T_0 if for any ε > 0 there exists an integer N = N(ε) such that m ≥ N implies that d(T_m,T_0) < ε.

4.2 Lemma. Let the sequence {T_m} converge to T_0 and let S be a compact subset of E_n; then for any ε > 0 there exists an integer N = N(ε) such that m ≥ N implies

    |M_{T_m}[f(x) - P(x,α)] - M_{T_0}[f(x) - P(x,α)]| < ε    for all α ∈ S.

Proof: The function w_k(x) D^k[f(x) - P(x,α)] is a continuous function of x, k, and α on X×I×E_n, and hence uniformly continuous on the compact set X×I×S. Thus given any ε > 0 there exists a δ(ε) such that |x_1 - x_2| + |k_1 - k_2| < δ implies that

    |w_{k_1}(x_1) D^{k_1}[f(x_1) - P(x_1,α)] - w_{k_2}(x_2) D^{k_2}[f(x_2) - P(x_2,α)]| < ε

for (x_1,k_1) and (x_2,k_2) in X×I and for all α ∈ S. Since {T_m} converges to T_0 there exists an integer N = N(δ) such that m ≥ N implies that d(T_m,T_0) < δ. Therefore it follows that

    |M_{T_m}[f(x) - P(x,α)] - M_{T_0}[f(x) - P(x,α)]| < ε    for m ≥ N and α ∈ S.  Q. E. D.

4.3 Definition. Let T ⊂ X×I be a set of m points; then Nb(T,δ) denotes the set of all m-point subsets Q of X×I whose points can be put into one-to-one correspondence with the points of T in such a way that corresponding points are less than δ apart.

4.4 Lemma. Let T be a set of n points for which det[WIM(T)] ≠ 0; then there exists a δ > 0 such that Q ∈ Nb(T,δ) implies det[WIM(Q)] ≠ 0.

Proof: Since a determinant is a continuous function of its elements, and since all of the elements of WIM(Q) are continuous on X×I, the quantity det[WIM(Q)] is a continuous function of the points of Q. Then if det[WIM(T)] is nonzero, it must be nonzero in some neighborhood of T. Q. E. D.
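If d(T,T′) is taken to be the ordinary Hausdorff distance between compact subsets of X×I — an assumption, since the thesis's own definition of d is not reproduced above — then Definition 4.1 can be illustrated on finite point sets as in the sketch below, where the distance between two points of X×I is |x_1 - x_2| + |k_1 - k_2|, the modulus used in Lemma 4.2.

    # Hausdorff distance between finite subsets of X×I (illustrative assumption).
    import math

    def hausdorff(T, T2):
        """T, T2: lists of (x, k) pairs."""
        def d(p, q):
            return abs(p[0] - q[0]) + abs(p[1] - q[1])
        forward = max(min(d(p, q) for q in T2) for p in T)
        backward = max(min(d(p, q) for q in T) for p in T2)
        return max(forward, backward)

    # The sets T_i of Example 4.9 below converge to T_0: d(T_i, T_0) = 1/i.
    T0 = [(1, 0), (0, 0), (-1, 0), (1/math.sqrt(3), 1), (-1/math.sqrt(3), 1)]
    Ti = lambda i: [(1 - 1/i, 0), (0, 0), (-1, 0), (1/math.sqrt(3), 1), (-1/math.sqrt(3), 1)]
    print([round(hausdorff(Ti(i), T0), 4) for i in (5, 10, 100)])   # 0.2, 0.1, 0.01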
4.5 Theorem. Let {T_m} be a sequence of compact subsets of X×I which converges to a compact set T_0 ⊂ X×I. If T_0 contains a set of n points T′ such that det[WIM(T′)] ≠ 0, then lim_{m→∞} e(T_m) = e(T_0).

Proof: We know that

    e(T_m) - e(T_0) ≤ M_{T_m}[f(x) - P(x,α_0)] - e(T_0),

where α_0 ∈ R(T_0), the set of best approximations to f(x) on the set T_0. Then Lemma 4.2 implies that for any ε > 0 there exists an integer N = N(ε) such that m ≥ N implies

    M_{T_m}[f(x) - P(x,α_0)] - M_{T_0}[f(x) - P(x,α_0)] < ε.

Thus for m ≥ N we have e(T_m) - e(T_0) < ε.

The set T_0 contains a set T′ of n points such that det[WIM(T′)] ≠ 0. Thus by Lemma 4.4 there exists a δ_0 > 0 such that Q ∈ Nb(T′,δ_0) implies that det[WIM(Q)] ≠ 0. Since {T_m} converges to T_0 ⊃ T′, there exists an integer N_1 = N_1(δ_0) such that m ≥ N_1 implies that there exists a Q_m ∈ Nb(T′,δ_0) such that Q_m ⊂ T_m. Then Lemma 1.11 implies that

    M_{Q_m}[P(x,α_m)] ≤ M_{T_m}[P(x,α_m)] ≤ 2M_{T_m}[f(x)] ≤ 2M[f(x)]        (2)

where α_m ∈ R(T_m), the set of best approximations on the set T_m. Let B_{ij}[WIM(Q)] denote the cofactor of the entry in the ith row and jth column of WIM(Q), and define

    B = max |B_{ij}[WIM(Q)]|,        E = min |det[WIM(Q)]|,        (3)

where the maximum is taken over Q ∈ Nb(T′,δ_0) and i,j = 1,2,...,n, and the minimum over Q ∈ Nb(T′,δ_0). Since the coefficients of a polynomial P(x,α) are uniquely determined by the values it has on any point set Q = {(x_i,k_i)}_{i=1}^{n} in Nb(T′,δ_0), these coefficients are given by

    a_j = [Σ_{i=1}^{n} B_{ij}[WIM(Q)] w_{k_i}(x_i) D^{k_i} P(x_i,α)] / det[WIM(Q)],    j = 1,2,...,n.

Then if α_m ∈ R(T_m) for m ≥ N_1 it follows from lines (2) and (3) that

    |a_{mj}| ≤ 2nB·M[f(x)]/E,    j = 1,2,...,n,

where α_m = (a_{m1},a_{m2},...,a_{mn}). Let S be the set of all α ∈ E_n for which ||α|| ≤ 2nB·M[f(x)]/E. It follows then from Lemma 4.2 that for any ε > 0 there exists an integer N_2 = N_2(ε) such that m ≥ N_2 implies

    M_{T_0}[f(x) - P(x,α)] - M_{T_m}[f(x) - P(x,α)] < ε    for all α ∈ S

(which includes all α_m for m > N_1). Thus

    e(T_0) - e(T_m) ≤ M_{T_0}[f(x) - P(x,α_m)] - M_{T_m}[f(x) - P(x,α_m)] < ε

for m ≥ N_2. Q. E. D.

4.6 Corollary. Under the hypotheses of Theorem 4.5 there exists an integer N such that R(T_m) is contained in some compact set S for all m ≥ N.

4.7 Corollary. Let {T_m} converge to a set T′_0. Then lim_{m→∞} e(T_m) ≤ e(T′_0) if the limit exists.

This corollary follows directly from the first part of the proof of Theorem 4.5.

4.8 Corollary. Suppose T′ is a set of n points for which det[WIM(T′)] ≠ 0. If T′ ⊂ T_m for m = 1,2,..., and if lim_{m→∞} T_m = T_0, then lim_{m→∞} e(T_m) = e(T_0).

The following is an example where e(T) is not a continuous function of T.

4.9 Example. Let X = {-1, -1/√3, 0, 1/√3} ∪ [4/5,1], n = 4, I = {0,1}, φ_i(x) = x^{i-1} for i = 1,2,3,4, and w_0(x) ≡ w_1(x) ≡ 1. The function f(x) is defined by the following table, where k is to be determined later.

        x          f(x)    Df(x)
       -1           -1       0
       -1/√3         0      -1
        0            1       0
        1/√3         0       1
       [4/5,1]       k       0

Also let T_0 = {(1,0),(0,0),(-1,0),(1/√3,1),(-1/√3,1)} and T_i = {(1-1/i,0),(0,0),(-1,0),(1/√3,1),(-1/√3,1)} for i ≥ 5. Then {T_i} converges to T_0. The value of e(T_i) on any set T_i for i ≥ 5 can be determined from the following system of equations, where we let t_i = 1 - 1/i:

    [ 1   t_i   t_i²    t_i³    s_1 ] [ a_1 ]   [  k ]
    [ 1   -1    1       -1      s_2 ] [ a_2 ]   [ -1 ]
    [ 1    0    0        0      s_3 ] [ a_3 ] = [  1 ]
    [ 0    1    2/√3     1      s_4 ] [ a_4 ]   [  1 ]
    [ 0    1   -2/√3     1      s_5 ] [ e_i ]   [ -1 ]

In this system of equations the quantities s_j for j = 1,...,5 are to be determined to minimize the absolute value of e_i, where each s_j may be chosen to be 1, -1, or 0. Using Cramer's rule,

    e_i = (kB_1 - B_2 + B_3 + B_4 - B_5) / (Σ_{j=1}^{5} s_j B_j),

where B_j is the cofactor of s_j, and it is noted that B_j ≠ 0 for j = 2,3,4,5. Then if we choose s_j = sgn(B_j) it follows that e(T_i) = |e_i| will be a minimum. Moreover, B_1 = 0, and hence e(T_i) is independent of k. Therefore we may conclude that e(T_i) ≤ 1 for all i ≥ 5 and any k.
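The assertions that B_1 = 0 and that e(T_i) ≤ 1 are easy to check numerically. The sketch below is only an illustration (numpy and the helper name are not part of the thesis): it forms the 5×5 coefficient matrix above, computes the cofactors B_j of the column containing the s_j, and evaluates e_i for the choice s_j = sgn(B_j).

    # Numerical check for Example 4.9: B_1 = 0, so e(T_i) does not depend on k.
    import numpy as np

    def cofactors_of_last_column(A):
        """Cofactors of the entries in the last column of the square matrix A."""
        n = A.shape[0]
        return np.array([(-1) ** (row + n - 1)
                         * np.linalg.det(np.delete(np.delete(A, row, 0), n - 1, 1))
                         for row in range(n)])

    i = 5
    t = 1 - 1 / i
    r3 = np.sqrt(3)
    # columns: a_1, a_2, a_3, a_4, and a placeholder column standing for the s_j
    A = np.array([[1, t, t**2, t**3, 0],
                  [1, -1, 1, -1, 0],
                  [1, 0, 0, 0, 0],
                  [0, 1, 2 / r3, 1, 0],
                  [0, 1, -2 / r3, 1, 0]], dtype=float)
    B = cofactors_of_last_column(A)
    rhs = np.array([0.0, -1, 1, 1, -1])        # the k-term drops out because B_1 = 0
    e_i = (B @ rhs) / np.sum(np.sign(B) * B)   # the choice s_j = sgn(B_j)
    print(B[0], abs(e_i))                      # B_1 is (numerically) zero and |e_i| <= 1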
Next consider the approximation problem on the set T_0. Since the polynomials do not satisfy the requirement that M_{T_0}[P(x,α)] = 0 implies ||α|| = 0, we may reduce the number of base functions by at least 1. Since x = x³ at the points of T_0, we will use the base functions {1, x, x²}. Consider the subset T′ = {(1,0),(-1,0),(1/√3,1),(-1/√3,1)} of T_0. Then e(T_0) ≥ |e′| where e′ is given by

    [ 1    1    1      s_1 ] [ a_1 ]   [  k ]
    [ 1   -1    1      s_2 ] [ a_2 ]   [ -1 ]
    [ 0    1    2/√3   s_3 ] [ a_3 ] = [  1 ]
    [ 0    1   -2/√3   s_4 ] [ e′  ]   [ -1 ]

Using Cramer's rule and the same notation and arguments as above, we have

    e′ = (kB_1 - B_2 + B_3 - B_4) / (Σ_{j=1}^{4} s_j B_j).

Since B_1 = 4/√3, we may make |e′|, and hence e(T_0), as large as we desire by choosing k properly. Thus e(T) is not continuous at the point set T_0.

3. Two Algorithms

In this section we shall discuss two algorithms which can be implemented on a computer, and which (theoretically) can be used to solve the interval version of the problem of this thesis. Each algorithm involves the determination of a sequence of point sets, and uses linear programming to calculate best approximations to f(x) on the finite point sets.

Since the continuity of e(T) depends on the presence of a set T′ of n points which has a weighted interpolation matrix of rank n, we must be able to determine such a set T′ for use in the computational algorithm. (Such a set T′ exists because of the restrictions on the base functions and Lemma 1.8.) In problems of computational interest such a set may be found quite easily. For example, if the base functions {φ_i(x)}_{i=1}^{n} are a Chebyshev set on [a,b], then any n points from [a,b]×{0} at which w_0(x) is nonzero will have a nonsingular weighted interpolation matrix. Also, from the continuity of det[WIM(T′)] in the points of T′, we know that if any nontrivial polynomial P(x,α) in the base functions {φ_i(x)}_{i=1}^{n} can have at most a finite number of zeros on X, then in any neighborhood of a set of n points from [a,b]×{0} at which w_0(x) is nonzero there will be a set T′ for which WIM(T′) is nonsingular. Moreover, it should be noted that such a set T′ is independent of the function f(x) being approximated, and hence if a number of functions are to be approximated on X×I with the same base functions and weight functions, a set T′ may be determined once and used for all functions being approximated. The convergence of the following algorithms depends on the existence and use of this point set T′.

Algorithm I

Let T′ be a set of n points such that det[WIM(T′)] ≠ 0, and let ε_0 ≥ 0 be some fixed real number. (In the theorems which follow we shall show that the use of ε_0 = 0 gives a convergent computational scheme in certain special problems. In actual computational practice, most problems fall into this class. The use of ε_0 > 0 gives a slightly different algorithm which will tend to be computationally slower than the ε_0 = 0 algorithm. However, we shall show that the algorithm converges in all cases if ε_0 > 0.) Then choose some finite point set T″_0 of at least n+1 points from X×I such that e(T″_0) > 0.
Next determine a best approximation P(x,ao) to f(x) on T"O. (This may be done using linear programming.) Let QO be a subset of the extremal point set C(T"O,oo) such that QO contains some minimal characterization set from MCS(T"O) and let T0 = {(x,k):(x,k,s)er}. (The set Q0 and the sets Qm for any integer m 1 1 given below are easily determined from the linear program by taking the points of C(T"O,ao) or C(Tnm’am) respectively which correspond to the slack variables which are set equal to zero to obtain a solution of the linear programming problem. The set QO or Qm will usually include all the points of c(T"o,oO) or C(T"m,dm) respectively.) Then let t = 0 and define the sequence of point sets {Tm} and the sequence of approximations {P(x,om)} recursively as follows (m = 0 initially). 0b- tain a point T'm = {(xm,km)} where ILk(x,om)l attains its maximum on XXI. Let t = T" = TmUT m m if t o ””1 TmUT' UT'i if t > o i=m-t 75 ll and find a best approximation P(x,am+l) t0 f(x) on T m+1 (again we may use linear programming). Let Qm+l be a sub- set of the extremal point set C(T ) such that Qm+l ) m+l’am+1 contains some minimal characterization set from MCS(T"m+l and let Tm+1 = ((x,k):(x,k,s)eQm+l}. Then if e(Tm+l increase t by l. ) _>_ e(Tm) + so set t = o and if e(Tm+l) < e(Tm) + co Algorithm II This algorithm differs from Algorithm I only in the determination of the set T'm which is defined as follows. For each approximation P(x,am) we define the set B = {(x,k)eXXIzlLk(x,om)| 1 e(Tm)} The set B consists of at most a countable number of closed intervals (allowing single points as degenerate intervals) from XXI. Then combine the intervals from B into larger sets J1,J2,.. 3 satisfies the following requirements. Let J* be any .,J so that this collection of sets {Ji} fixed element in{Ji}, then (a) only one of the following is true for all (x,k)eJ*:Lk(x,am) : e(Tm) or Lk(x,am) i - e(Tm), (b) {k:(x,k)eJ*} contains only one integer, (c) if (x,k) and (y,k) are any two points from J* such that x < y then there is no (z,k)eB such that x < z < y and Lk(x,a ) < - se(T ) where m '— m s = sgn[Lk(X,um)l, 76 (d) if J' is any element from {J1} which is adja- cent to J* then sgn[Lk(x',om)] = -sgn[Lk(x*,om)] where (x',k) J' and (x*,k)eJ*. Since the function Lk(x,cm) is continuous and e(Tm) > 0 the collection of sets {J1} consists of at most a finite number of subsets of XXI. Then from each set J1, pick out a point where lLk(x,am)l attains its maximum value on J1 and let T'm consist of all these points. (In computational practice the set T'm can usually be defined to be the set of all points (x,k) which are relative maxima of ILk(x,am)I over XXI for which ILk(x,om)| 1 e(Tm). The only difficulty that this could present is that T'm might contain an infinite number of points, or T'm might contain too many points for efficient computation of a best approximation on a finite point set.) 4. Convergence of the Algorithms The first convergence theorem says that for any approximation problem under consideration in this chapter, the polynomial P(x,am) as generated by either of the two algorithms with so > 0 will be close to a best approxi- mation if m is taken large enough. First we observe that the proof of the following lemma follows directly from the compactness of XXI. 77 4.10 Lemma. Let {(xm,k 1}” m m=1 be a sequence of points m—l from XXI and let Qm = U (Xi’ki)’ Then given any a > 0 ‘ i=1 there exists an integer N = N(s) such that m 1 N implies that d[(xm,km),Qm] < c. 4.11 Theorem. 
4. Convergence of the Algorithms

The first convergence theorem says that for any approximation problem under consideration in this chapter, the polynomial P(x,α_m) as generated by either of the two algorithms with ε_0 > 0 will be close to a best approximation if m is taken large enough. First we observe that the proof of the following lemma follows directly from the compactness of X×I.

4.10 Lemma. Let {(x_m,k_m)}_{m=1}^{∞} be a sequence of points from X×I and let Q_m = ∪_{i=1}^{m-1} (x_i,k_i). Then given any ε > 0 there exists an integer N = N(ε) such that m ≥ N implies that d[(x_m,k_m),Q_m] < ε.

4.11 Theorem. Let ε_0 > 0 for Algorithm I be fixed, and let T′ be a set of n points such that det[WIM(T′)] ≠ 0. Then given any ε > 0 and any finite point set T″_0 for which e(T″_0) > 0, there exists an integer N, depending on ε_0, ε, T′, and T″_0, such that if Algorithm I is followed starting with T″_0, it follows that

    M[f(x) - P(x,α_m)] - e ≤ ε    for m ≥ N,

and

    e - e(T_m) ≤ ε    for m ≥ N.

Proof: The sequence {e(T_m)} is monotone increasing and bounded above, and hence converges to some real number e′ ≤ e. Let m_0 be some integer for which e′ - e(T_{m_0}) < ε_0. Then it follows that T′ ⊂ T″_m ⊂ T″_{m+1} for all m > m_0. Since T′ ⊂ T″_m for m > m_0, Corollary 1.15 implies that there exists a compact set S ⊂ E_n such that R(T_m) ⊂ S for all m > m_0. Thus the weighted error functions L_k(x,α), defined for all k ∈ I, are uniformly continuous in α and x for α ∈ S and x ∈ X. Therefore, given any ε > 0 there exists a δ such that |L_k(x_1,α) - L_k(x_2,α)| < ε for all α ∈ S, k ∈ I, and x_1,x_2 ∈ X satisfying |x_1 - x_2| < δ. Then from Lemma 4.10 there exists an integer N = N(δ) such that d[(x_m,k_m),T″_m] < δ for m ≥ N. Thus for some (x′,k′) ∈ T″_m we have

    |L_{k_m}(x_m,α_m)| - |L_{k′}(x′,α_m)| ≤ |L_{k_m}(x_m,α_m) - L_{k′}(x′,α_m)| < ε.

Since |L_{k′}(x′,α_m)| ≤ e(T_m) ≤ e and e ≤ M[f(x) - P(x,α_m)] = |L_{k_m}(x_m,α_m)|, we have M[f(x) - P(x,α_m)] - e < ε and e - e(T_m) < ε for all m ≥ N. Q. E. D.

The above theorem and proof are easily extended to Algorithm II without any significant changes.

The second convergence theorem is one which is applicable in certain special cases. It states that for certain problems the two algorithms will converge to a best approximation with ε_0 = 0 if we start with a point set T″_0 for which e(T″_0) is close enough to e. We shall first prove three lemmas to simplify the proof of the theorem.

4.12 Lemma. Let P(x,α) be a best approximation to f(x) on T ⊂ X×I, where MCS(T) contains a set of n+1 points. If |L_{k_0}(x_0,α)| > e(T) for some (x_0,k_0) ∈ X×I, then e(T ∪ {(x_0,k_0)}) > e(T).

Proof: Assume that e(T ∪ {(x_0,k_0)}) = e(T). Let α′ be any best approximation in R(T ∪ {(x_0,k_0)}); then α′ is also in R(T). Theorem 1.41 implies that R(T) contains one unique best approximation, and thus that α = α′. This contradicts the assumption that |L_{k_0}(x_0,α)| > e(T). Q. E. D.

4.13 Lemma. Let {f_m(x)}_{m=1}^{∞} be a uniformly convergent sequence of continuous functions on a compact set X which converges to f(x), and for which there exists a sequence {x_m}_{m=1}^{∞} converging to x_0 and satisfying |f_m(x_m)| = max_{x∈X} |f_m(x)|. Then max_{x∈X} |f(x)| = |f(x_0)|.

Proof: For each m and any x ∈ X, |f_m(x)| ≤ |f_m(x_m)|. Then using the triangle inequality we have

    |f_m(x)| ≤ |f_m(x_m)| ≤ |f(x_0)| + |f_m(x_m) - f_m(x_0)| + |f_m(x_0) - f(x_0)|

for any x ∈ X. Then taking the limit as m → ∞ we have |f(x)| ≤ |f(x_0)| for all x ∈ X. (In taking this limit we must use the fact that a uniformly convergent sequence of continuous functions is equicontinuous. See Goffman [11], p. 106.) Q. E. D.

4.14 Lemma. Let lim_{m→∞} T_m = T_0, lim_{m→∞} e(T_m) = e(T_0), and lim_{m→∞} β_m = β_0, where β_m ∈ R(T_m) for all m. Then β_0 ∈ R(T_0).

Proof: Since e(T_m) = M_{T_m}[f(x) - P(x,β_m)] for all m, it follows that
Then there exists a con- stant B < e such that if T"O is any finite subset of XXI for which e(T"o) > B and if either Algorithm I or Algorithm II is applied with so = 0 beginning with the set T"O, it follows that each approximation P(x,am) will be the unique best approximation on T" (and hence also on Tm) and m {P(x,am)} will converge to the best approximation on XXI. Proof: First we shall show that there exists a constant B < e such that if T is,a finite subset of XXI and e(T) > B, then every set in MCS(T) contains n+1 points. If this were not so, then for each integer m i 1 there exists a set of km < n+1 points TIn such that e(Tm) i e - l/m. Then {Tm} has a convergent subsequence, also denoted {Tm}, converging to a set To' But Corollary 4.7 implies that film e(Tm) i e(TO m-Hao and hence that e = e(TO) since 21m e(Tm) = e. Since e(TO) = m+co it follows that RCLR(TO) and since the sets in MCS(XXI) con- tain n+1 points, the set R contains one unique best approxi- mation 0:. Thus MES(TO) C;C(To,a) C:C(a) = MES(XXI) ) e 81 and hence any MCS in MCS(TO) is also in MCS(XXI). But the sets in MCS(TO) can consist of at most n points since TO contains at most n points. This contradicts the assumption that the sets from MCS(XXI) all contain n+1 points. The set T' from the algorithms need not be con— sidered in this proof since so = O and hence t = 0 through- out the algorithm. Assume that e(T"O) > B. Since e(Tm) : e(T"O) for all m, the sets in MCS(T"O) all con- sist of n+1 points and hence by Theorem 1.41 each am is the unique best approximation on the set Tm for m 1 O. From the three sequences {P(x,am)}, {Tm}, and {T'm} which are generated by the algorithm, choose three corre- sponding subsequences also denoted {P(x,am)} {Tm}, and {T'm} which converge to P(x,aw), Tm, and T'0° respectively and for which MTm[f(x) - P(x,am)] = e(Tm) and MT'm[f(x) = P(x,am)] = M[f(x) — P(x,am)]. (The convergent sequence {P(x,am)} exists by Corollary 4.6.) The sequence {e(Tm)} is monotone increasing and bounded above which means that Aim e(Tm) exists. Corollary 4.7 implies that e(Tw) :_e(T:;m> B for all m and hence that the set MCS(Tm) contains a MCS, denoted Q*, of n+1 points. Let T* = {(x,k):(x,k,s)eQ*}. Then Lemma 1.40 implies that the rank of WIM(T*) = WIM(Q*) has rank n. Thus T*<;T0° and. hence the hypotheses of Theorem 4.5 are satisfied which means that film e(Tm) = e(Tm). Then Lemma 4.14 implies m+°° that P(x,am) is the unique best approximation to f(x) on Tm. 82 Assume that P(x,am) is not a best approximation of f(x) on XXI. From Lemma 4.13 we know that T'0° contains an absolute maximum point (x',k') of |Lk(x,aw)| over XXI. Since P(x,am) is not a best approximation, ILk,(x',am)| > e(Tm) and hence by Lemma 4.12 it follows that e(TmLJT'm) > e(Tw). Moreover, 21m e(TmLJT'm) = e(Tm(/T'ml by Theorem 4.5 since T*<2T0° and the $;:k of WIM(T*) is n. Thus for m large enough, e(TmcIT'm) > e(Tm) which is im- possible. Thus e(T'mLJTm) = e(Tw) = e and P(x,am) is the unique best approximation to f(x) on XXI. Thus the original sequence P(x,am) generated by the algorithm converges to P(x,am). In general we do not know if the hypotheses of the previous theorem are satisfied before solving the problem. However, in computational practice Algorithm II with s0 = O has proved to be an efficient method of computing best approximations even when e(T"O) is not close to e as re- quired by Theorem 4.14. 
This is true because for most problems of computational interest the quantity e(T"m) be- comes larger than the number B from Theorem 4.5 for quite small m and hence the theorem may be applicable from this point on. In problems where MCS(XXI) contains a set of less than n+1 points or where some points of a set in MCS(XXI) are very close together, the convergence may be quite slow and a high degree of accuracy obtainable only with multiple precision arithmetic. 83 5. Computational Procedures In the actual computational use of Algorithm II using linear programming to find best approximations to f(x) on finite point sets, certain serious difficulties were encountered. The most serious difficulty was that of the extreme inaccuracy of the linear programming solution to certain approximation problems on finite point sets as computed by the simplex algorithm. It was also found that the number of eliminations of the simplex algorithm used in finding best approximations was quite large. However, there is a simple procedure which has been very successful in eliminating both of these diffi- culties without changing any of the theory behind the application of linear programming to the solution of the approximation problem under consideration. Instead of solving the system of inequalities of line (1) page 62 on a finite point set T"m while minimizing e, the following procedure is used. Using the notation of the algorithms and defining P(x,a_m) s 0, find a best approximation P(x,a*) to f(x) - P(x,am_l) on the set T"m and set am = am-l + a*. This means that the linear pro- gramming problem corresponding to line (1) page 62 is modified so that for a fixed am-l we find a point a* = (al,a2,...,an) in En such that Iwk(x)Dk[f(x) — P(x,a — P(x,a*)]| : e m-l) 84 for (x,k)sT"m where e is to be a minimum. Then am is * defined to be am_1 + a . The following example shows how this procedure affects the speed and accuracy of the simplex algorithm solution to specific linear programming problems. (The first method will be called Method I and the later modi- fication will be called Method II.) 4.16 Example. Let I = {0,1,2,3,4}, X = [-l,1], n = 9 1-1 x ¢1(x) = for 1 = l,2,...,9, f(x) = sin(x), wo(x) 5 1 Wl(x) = .l(1 — x2) W3(x) = .001(l - x10) W2(x) = .01(1 — x6) Wu(x) a .0001 The set T"O was chosen to be ten equally spaced points from each set XX{k} for ksI; thus T"O consists of 50 points from XXI. Using single precision arithmetic (ten decimal places of accuracy) on the CDC 3600, several terms of the sequence {P(x,am)} were generated using Algorithm II with s0 = O. The following table gives a comparison of Method I and Method II and indicates the advantages of Method II in both speed and accuracy. The columns marked "Eliminations" indicates the number of eliminations used in the simplex algorithm in the solution of the linear program. 85 Method I Method II e(T"i) Eliminations e(T"i) Eliminations i 7.05269xlo'8 36 7.05269x10"8 36 0 8.98299x10‘8 49 8.98362xlo'8 22 l 1.32349x10'23 59 8.98948x10‘8 28 2 Letting a2 and 0'2 denote the third approximation to f(x) as given by Method I and Method II respectively, it was found that M[f(x) - P(x,a2>] 3.53572xlo'5 8 M[f(x) - P(x,a'2)] 8.99017x10’ The third linear program which used Method I and required 59 eliminations has a computed deviation e(T"2) which is essentially zero and hence gives no indication concerning the true deviation of a best approximation on the set T"2. 
Moreover the set Q2 as determined by the linear program did not contain a set from MCS(T"2) and hence the algorithm could not be continued. (It should be noted that the set T"l was the same in both Method I and Method II. The set T"2 was not the same for both methods because the com- puted solutions from R(T"l) were not the same for both methods.) 86 There is one other difficulty which could occur in theory but which has not been observed in actual compu- tational practice. The sets T1 = {(x,k):(x,k,s)sQi} could have an increasing number of points as 1 becomes large and hence the whole purpose of solving the problem on a small number of points would be defeated. If this would ever become a problem, the set Ti could be replaced by a subset T* of no more than n+1 points by determining T* so that e(Ti) = e(T*). This can be done by taking various n+1 point subsets of Ti and calculating the deviation on these subsets. 6. Computational Examples Many approximations were computed on the CDC 3600 using single precision arithmetic (ten decimal places of accuracy) and Algorithm II with s0 = O. The following examples are given to illustrate some of the unusual things that can occur in the computation of best approxi- mations. 4.17 Example. This example is the computer solution to the approximation problem given in Example 3.1 which does not have a unique best approximation. It was found that the best approximation obtained by the algorithm depended on the initial point set T"O. In each case a best approxi- mation was determined in two steps of the algorithm. Thus P(x,al) is a best approximation in each case. This occurred because the set 87 {(x,k):(x,k,s)sMES(XXI)} was included in the set T"l and thus a best approximation could be computed exactly. The various sets T"O and best approximations P(x,al) are given in Table I. 4.18 Example. Let X = [-l,l], I = {0,1}, n = 5, 1—1 for 1 = 1,2,....5, wo(x) a 1, Wl(x) = .15(l — x5), ¢i(x) = x f(x) = cos(xL Df(x) = -sin(x). The initial set T"O was chosen to be 6 equally spaced points from XX{O} including the end points, and 6 equally spaced points from XX{l}, including the end points. In Table II the successive x values of the point sets T1 are given in the columns headed (l), (2),...,(6) for values of i = 0,1,...,8. The values which are followed by an asterisk are points (x,l) from XX{l} while those without an asterisk are from XX{O}. The column headed "Elim." indicates the number of eliminations that were needed to find a best approximation on the set T" using the simplex algorithm and Method II. The set i T contained one additional point, namely (0.,0), which 1 is not shown in the table. It should be noted that the extremal points in columns (4) and (5) are approaching each other as the algorithm proceeds. It has been observed that when this occurs, the rate of convergence of the algorithm is slower than for similar problems where the extremal points do not come to- gether. In situations where points come together it may become necessary to take sO to be positive in the algorithms. 8 8 mloaxmommzmm.a nloaxoaamamm.z oooooo.H *ommwow. *mzmoom. mmoooo. *mzmwow.l oooooo.HI o m mloaxmmowzmm.: mloaxommwzmm.: oooooo.H *ommwow. *mzmmon. mmooon. *mmmwom.l oooooo.Hl N m mloaxzmmommm.z mloaxmomwzmm.z oooooo.H *ommmow. *mmamow. mmoooo. «mmooow.l oooooo.al m o mloaxomzmmmm.: mIonNHmnzmm.: oooooo.H *ommmow. *zwmmow. oooooo. *mmooom.l oooooo.HI OH m muofixsoqmsmm.s muofixmommmmm.z oooooo.a *ommmfis. *qsmmos. ooocoo. 
*mmooos.u oooooo.au NH 2 mloaxwoommmm.: mIonoHHHomm.: oooooo.H mmfimaz. *wzwmow. oooooo. *mmwoom.l oooooo.HI ma m mloaxwzmonmo.m slonmomoomz.z oooooo.H obmqom. *mmwmmo. oooooo. *Nmmmmo.l 5mmmw2.I ma m macaxommmsow.m maoaxsmmms0m.s cooooo.a ooooow. *mmssmm. *mfimsum. *smssmo.u *mflasmm.u ma H mIonmmmommm.w mloaxmmmmmam.m oooooo.H ooooom. *ooooom. *ooooom. *oooooo.l *ooooom.l ma 0 flhfis.xoa I Axvco: Afi:eom goo Amv Ago Amo Amo AHV .sfiflm a HH mqmqfl AAHA.HV «AHaoOv nACa.:v nAOa.MV AAOA.NV «Ara.va AmNHMOo «mNMo noHIV AAHA.:V AAHnoHV «AHn.OV AAOAoMV «AOnnmv «AOA.HVW AN. amo| noHIV AAHnozv AAHA.NV nAHA.Ov «ADA-MV aAOa.va AmNH. «Q A.H|v 0:9 HO H mqm¢8 10. ll. 12. BIBLIOGRAPHY Dean N. Arden, The solution of linear programming GI problems, Mathematical Methods for Digital Computers, eds. Anthony Ralston and Herbert S. Wilf, Wiley, 1960, pp. 263-279. Achieser, Theory of Approximation, Ungar, New York, 1965. Buck, Linear spaces and approximation theory, On Numerical Approximation, ed. Rudolph Langer, Univ. of Wis. Press, 1959, pp. 11-23. . Buck, Survey of recent Russian literature on approxi- mation, On Numerical ApprOximation, ed. Rudolph Langer, Univ. of Wis. Press, 1959, pp. 341-359. Cheney, Introduction to Approximation Theory, McGraw-Hill, New York, 1966. Dantzig, Linear Programming and Extensions, Prince- ton University Press, Princeton, N. J., 1963. Davis, Interpolation and Approximation, Blaisdell, New York, 1963. Davis, Approximation theory in the first two decades of electronic computers, Approximation of Functions, ed. H. L. Garabedian, Elsevier, Amsterdam, 1965, pp. 152—163. Denman, Minimax polynomial approximation, Math of Comp., 20 (1966), pp. 257-265. Garabedian, Approximation of Functions, Elsevier, Amsterdam, 1965. _ . Goffman, Real Functions, Holt, Rinehart and Winston, 1963. Lorentz, Russian literature on approximation in 1958-1964, Approximation of Functions, ed. H. L. Garabedian, Elsevier, Amsterdam, 1965, pp. 191—215. 89 13. 14. 15. l6. 17. 18. 19. 20. 21. 22. 23. 24. 25. T. O. 90 G. Lorentz, Approximation of Functions, Holt, Rinehart and Winston, New York, 1966. J. Maehly, and Ch. Witzgall, Tchebysheff—Approxi- mation in Kleinen Intervallen I, Approximation durch Polynome, Numer. Math., 2 (1960), pp. 142- 150. Michael, Topologies on spaces of subsets, Trans. Amer. Math. Soc., 71 (1951), pp. 151-182. . G. Moursund, Optimal approximation of functions: Chebyshev-type approximations, Ph.D. Thesis, University of Wisconsin, 1963. G. Moursund, Chebyshev approximation of a function and its derivatives, Math. Comp., 18 (1964), pp. G. Moursund and A. H. Stroud, The best Ceby§ev approximation to a function and its derivative on n + 2 points, J. SIAM Numer. Anal., 2 (1965), pp. 15-23. G. Moursund, Some computational aspects of the uniform approximation of a function and its derivative, J. SIAM Numer. Anal., 2 (1965), pp. 464-472. G. Moursund, Chebyshev solution of n + 1 linear equations in n unknowns, J. ACM 12 (1965), pp. 383-387. P. Natanson, Constructive Theory of Functions, United States Atomic Energy Commission, Washington, D. C., 1961. . Ya. Remez, General Computational Methods of Chebyshev Approximation. The Problem with Linear and Real Parameters, United States Atomic Energy Commission, Washington, D. C., 1962. . R. Rice, The Approximation of Functions, Vol. I: Linear Theory, Addison-Wesley, Reading, Mass., 196E. J. Rivlin and H. S. Shapiro, A unified approach to certain problems of approximation and minimization, J. Soc. Indust. Appl. Math., 9 (1941), pp. 670-699. 
25. O. Shisha, The Chebyshev polynomial of best approximation to a given function on an interval, Math. of Comp., 20 (1966), pp. 266-271.
26. A. F. Timan, Simultaneous approximation of functions and their derivatives on the whole real axis, Amer. Math. Soc. Translations, Series 2, 44 (1965), pp. 1-11.

INDEX OF NOTATIONS

f(x); {φ_i(x)}_{i=1}^{n}; w_k(x); M_T[·], M[·]; α, β, γ; P(x,α); e(T); R(T); R; ||α||; T, T′, T_i, T_m; IM(Q), IM(T); WIM(Q), WIM(T); det; L_k(x,α); C(T,α); C(α); (x,k,s); Extrema, Extremum; Minimal Extremal Set; MES(T); MES; Condition A; MCS; Minimal Characterization Set; MCS(T); n-point; α_m; |S|; d(T,T′); Nb(T,δ).