LEAST DISTANCE ALGORITHMS FOR LINEAR SYSTEMS

By

Said Bahi

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the Degree of DOCTOR OF PHILOSOPHY

Department of Mathematics

1993

This is to certify that the dissertation entitled "Least Distance Algorithms for Linear Systems", presented by Said Bahi, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Mathematics.

ABSTRACT

LEAST DISTANCE ALGORITHMS FOR LINEAR SYSTEMS

By Said Bahi

In this thesis, we investigate numerical methods for three classes of approximation problems. We begin with the study of methods for approximating a non-negative solution of an overdetermined system of linear equations. We define a best approximate solution of the system $Ax = b$, $x \ge 0$, to be the vector $x \ge 0$ which minimizes the norm of the residual $r(x) = b - Ax$, for a strictly convex and smooth norm. We then consider a system of linear equations $Ax = b$ which has a non-negative solution. We present a method for computing, amongst all these non-negative solutions, the one of least norm when the space $R^n$ is equipped with a strictly convex norm. Finally, given a system of linear inequalities $Ax \ge b$, we suggest an algorithm that settles the question of the feasibility of the system and, if it is feasible, computes the unique solution of minimal norm. For each of these three approximation problems, we study a dual problem whose solution leads to the characterization of the solutions of the original problem. Algorithms for computing the solutions are presented and their convergence proved. Numerical results for different $l^p$ approximation problems are discussed.

ACKNOWLEDGMENT

I would like to thank my advisor, Professor V.P. Sreedharan, for his guidance in the research leading to this thesis and for all the discussions we have had on this particular subject and many others. I am especially grateful for his enthusiasm, which has made the last two years so enjoyable. I am also grateful to the members of my guidance committee, Professors D. Dunninger, G. Ludden, W. Sledd and D. Yen. I further thank the faculty and staff of the Department of Mathematics at Michigan State University, in particular Mrs. Barbara Miller, for their support of my effort. Finally, I wish to express my gratitude to Dr. D. Horner, Mrs. Elda Keaton and the staff of the International Center at Michigan State for their help, and to Mrs. Monique Bidaoui.

Contents

0 Introduction
1 A Least Distance Algorithm For A Smooth Strictly Convex Norm
2 An Algorithm For Minimal Norm Solution Of A Linear System
3 On The Minimum Norm Solution Of A System Of Linear Inequalities
4 Numerical Results
List of References

Chapter 0. Introduction

Least squares problems have received a considerable amount of attention from different authors, either for the special case of the Euclidean norm or for the more general case where the norm considered is one of the $l^p$-norms, $1 \le p \le \infty$.
We shall study numerical methods for computing solutions of the problem that we state generally as follows. Let a system of linear equations (resp. linear inequalities) be given and let $\|\cdot\|$ be a norm. If the system is overdetermined, find the "best" non-negative solutions, those which minimize the $\|\cdot\|$-error. If the system is feasible, find the unique solution of minimal norm. This problem will be considered for general abstract norms; for more practical use, this work applies to the interesting case of the $l^p$-norms. First, we restate the original problem, specifying the stages of this thesis, as follows.

Problem 1. Given a linear system $Ax = b$ that has no non-negative exact solution, find $x \ge 0$ for which $Ax$ is closest to $b$.

Problem 2. Given the system of linear equations $Ax = b$, $x \ge 0$, which is assumed feasible, find the vector $x \ge 0$ that solves the system and has minimal norm.

Problem 3. Consider the system of linear inequalities $Ax \ge b$. Find the solution $x$ with least norm.

In Problem 3 we make no feasibility assumption; incidentally, the proposed method will answer the question of whether the system is consistent or not. Our aim is to propose implementable, convergent algorithms that solve these problems. After the algorithms are specified, their feasibility and convergence are proved for any smooth, strictly convex norm. We test these algorithms for the usual $l^p$-norm, $1 < p < \infty$,

$$\|x\|_p = \Big(\sum_i |x_i|^p\Big)^{1/p}. \qquad (0.1)$$

In the first chapter we are concerned with Problem 1. Let $A$ be an $m \times n$ real matrix and let $b$ be a real $m$-vector such that the system $Ax = b$, $x \ge 0$, has no exact solution. Let the norm $\|\cdot\|$ on $R^m$ be smooth and strictly convex. We seek $x \ge 0$ solving

$$\|b - Ax\| \ (\min), \quad x \ge 0. \qquad (0.2)$$

A more general formulation of this problem can be stated as follows: given a closed convex cone $K$ in a finite-dimensional Banach space $X$ with a smooth, strictly convex norm $\|\cdot\|$ and a point $b \in X \setminus K$, find $z \in K$ that minimizes the distance $\|b - z\|$, $z \in K$. We point out also that our results, including our algorithm, can be formulated in infinite-dimensional spaces, as was done in [9].

This problem has been extensively studied for the norms $\|\cdot\|_2$, $\|\cdot\|_1$ and $\|\cdot\|_\infty$; the last is referred to as the Chebyshev problem. Since the $l^1$ and $l^\infty$ norms fail to be strictly convex or smooth, differentiable optimization techniques cannot be applied, and special methods have been developed to handle these problems. In each of these cases, thanks to linear programming, the special structure of the norm may be used; see for example Barrodale and Young [1]. The case of the $l^2$ norm,

$$\|b - Ax\|_2 \ (\min), \quad x \ge 0,$$

has received the most consideration. For a more complete discussion see Lawson and Hanson [5], where an efficient finite algorithm called the NNLS algorithm, together with a Fortran implementation, is proposed. With no sign constraints, the problem $\|b - Ax\|_p$ (min) for any $p$, $1 < p < \infty$, has also been studied by different authors; see Fletcher, Grant and Hebden [3], Owens [8], Owens and Sreedharan [10], Sreedharan [15], Spath [13] and Watson [21]. It presents less difficulty, since the set of approximants $K = \{ Ax \mid x \in R^n \}$ is a linear subspace. If the condition $x \ge 0$ is added, the set of approximants is no longer a linear subspace: in this case we seek the $y$ in the cone

$$K = \{ Ax \mid x \ge 0, \ x \in R^n \}$$

which is closest to $b$.
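For orientation, here is Problem 1 in the $l^2$ case, solved with SciPy's implementation of the Lawson and Hanson NNLS algorithm cited above. This is a minimal sketch; the matrix $A$ and vector $b$ are made-up illustrative data.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))   # overdetermined: m = 8 > n = 3
b = rng.standard_normal(8)

x, rnorm = nnls(A, b)             # minimizes ||b - Ax||_2 subject to x >= 0
print("x =", x)                   # non-negative by construction
print("||b - Ax||_2 =", rnorm)
```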
Sreedharan in [20] developed an algorithm for solving this approximation problem for any smooth, strictly convex norm, with applications to the minimization problem

$$\min \{ \|b - Ax\| \mid x \in R^n, \ x \ge 0 \}.$$

In [20] a duality theory was developed and a characterization of the solution was given. The algorithm we propose in chapter 1 uses these ideas. We give a dual problem and review the related duality theory. After stating the algorithm, we show that all its steps are valid, and then the proof of convergence is given. It is worth mentioning that our algorithm does not depend explicitly on the special structure of the $l^p$ norm, but only on its strict convexity and smoothness. Numerical results for the norms $\|\cdot\|_p$, for various $p$, are presented in chapter 4.

General algorithms (such as in Polak [11], Zangwill [22], Zoutendijk [23]) that minimize smooth, strictly convex functions subject to linear constraints could be used to minimize the function $f(x) = \|b - Ax\|_p$ subject to $x \ge 0$. Or use may be made of the nonsmooth optimization algorithms in Sreedharan ([16], [17], [18], [19]) and others. All of these algorithms, to guarantee global convergence, require their built-in "anti-jamming" precautions, that is, procedures that circumvent the possibility of the generated sequence clustering at, or even converging to, non-optimal points. In contrast to these, the algorithm presented here for the solution of the problem at hand has no anti-jamming precautions, because none is needed, and it is globally convergent.

General algorithms like those in Gill, et al. [4] deal with the optimization of a twice continuously differentiable objective function subject to linear constraints, and as such are not applicable to our problem, since in general our objective function, the norm, need not be twice differentiable throughout $R^m \setminus \{0\}$. In particular, for the $l^p$ norms with $1 < p < 2$, there are non-zero points in $R^m$, $m \ge 2$, where the norms fail to be twice differentiable. Despite this lack of differentiability, one may try to apply second order methods, whose validity is predicated on the continuity of the Hessian of the objective function. Numerical answers obtained on specific examples are then checked a posteriori to see whether they are the "right" answers. Such tests lend credence to the applicability of second order methods even when the relevant hypotheses are violated, but such findings do not prove the convergence of the algorithm. We, however, in the present work, give a proof of convergence of our algorithm, which includes the $l^p$ case, $1 < p < 2$.

In chapter 2, we consider the following minimization problem:

$$\|x\| \ (\min), \quad Ax = b, \ x \ge 0.$$

The norm is assumed to be smooth and strictly convex, and the system $Ax = b$ is assumed to have a non-negative exact solution. This problem has been studied for the $l^2$-norm with linear inequality constraints in [5], and for general norms in [6]. Lawson and Hanson in [5] gave an algorithm, called the LDP (Least Distance Programming) algorithm, that solves the problem

$$\|x\|_2 \ (\min), \quad Ax \ge b.$$

It uses a transformation of the problem into an NNLS (Non-Negative Least Squares) problem; a Fortran code was also presented. As we pointed out in the discussion of Problem 1, general purpose algorithms may also be applied to solve Problem 2 when the norm considered is the $l^p$-norm. However, the objective function $f(x) = \|x\|_p$ poses the same differentiability problems noted above for $1 < p < 2$.
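To make the "general purpose" approach concrete, the following sketch minimizes $f(x) = \|b - Ax\|_p^p$ over $x \ge 0$ with a bound-constrained quasi-Newton method. This is exactly the kind of method the thesis contrasts with: only first derivatives are needed (and these exist for $1 < p < \infty$), but no special guarantees are available near points where the norm is not twice differentiable. The data and helper names are ours.

```python
import numpy as np
from scipy.optimize import minimize

def make_objective(A, b, p):
    def f_and_grad(x):
        r = b - A @ x
        f = np.sum(np.abs(r) ** p)
        # d/dx sum |r_i|^p = -p * A^T (|r|^(p-1) * sign(r))
        g = -p * (A.T @ (np.abs(r) ** (p - 1) * np.sign(r)))
        return f, g
    return f_and_grad

rng = np.random.default_rng(1)
A, b, p = rng.standard_normal((10, 4)), rng.standard_normal(10), 1.5
res = minimize(make_objective(A, b, p), np.ones(4), jac=True,
               bounds=[(0, None)] * 4, method="L-BFGS-B")
print(res.x, res.fun ** (1 / p))   # minimizer and its l^p residual norm
```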
Moreover, as was outlined in [6], during machine implementation some precautions would have to be taken to circumvent the possibility that the generated sequence of approximations clusters at, or converges to, a non-optimal solution, even though the convergence of the sequence is proved in theory. On the other hand, our statements, including the proposed algorithm, do not depend explicitly on the norm, need no extra differentiability of the objective function, and the global convergence of the algorithm is proved. This work was motivated by similar algorithms in [6] and [20]. A major step in this algorithm involves the solution, at each iteration cycle $k$, of an $l^2$ problem

$$Ax = b, \quad x \ge 0, \quad \|x - a_k\|_2 \ (\min),$$

where $a_k$ is defined at the beginning of each iteration. The feasibility of the algorithm will be studied and its convergence proved.

Next, we examine in chapter 3 the problem

$$\|x\|_p \ (\min), \quad Ax \ge b,$$

for any $p$, $1 < p < \infty$. Lawson and Hanson studied a similar problem when $p = 2$; our work is an extension of the so-called LDP algorithm in [5]. This problem is solved via the algorithm of chapter 1. A dual problem will be presented and its correspondence with the original problem studied. A related least distance problem will be given and used, as mentioned above, to find the solution to our problem. After presenting the algorithm, we prove its feasibility and that it computes the solution if one exists. The fact that no assumption regarding the feasibility of the system $Ax \ge b$ is involved explains the usefulness of this algorithm; moreover, its implementation is particularly simple.

In chapter 4, some numerical results are given and a subalgorithm is presented. These results show that the coded algorithm does well compared to other algorithms in the literature, for different $p$-norms, when $p$ is in the range $(2, \infty)$ and not far from 2.

Chapter 1. A Least Distance Algorithm For A Smooth Strictly Convex Norm

We study in this chapter the system of linear equations

$$Ax = b, \qquad (1.1)$$

where $x$ is a non-negative $n$-vector in $R^n$, $A$ is an $m \times n$ real matrix, and $b$ is in $R^m$. Let $\|\cdot\|$ be a smooth, strictly convex norm on $R^m$. In many applications the system above is overdetermined; in this case, one seeks to minimize the error $r(x) = b - Ax$. We shall be concerned with the following problem:

$$(P) \quad \min \{ \|b - Ax\| \mid x \in R^n, \ x \ge 0 \}.$$

Our objective is to give an implementable algorithm to solve (P) and prove its convergence. In practice the proposed algorithm computes simultaneously a solution of (P), a solution of its dual (P'), and their common value. At each iteration cycle, an approximation to the solution of (P) is computed and then used to find an approximation of (P'); the latter enables us to compute a new, improved approximation of (P).

In [20], Sreedharan suggested a dual problem (P'), and the relation between problems (P) and (P') was studied. We review briefly problem (P').

We begin by establishing some notation and definitions. Given two vectors $x$ and $y$ in $R^n$, the usual inner product is denoted by $(\cdot,\cdot)$:

$$(x, y) = \sum_i x_i y_i.$$

A partial order in $R^n$ is defined as follows: for any $u = (u_1, \ldots, u_n) \in R^n$, $u \ge 0$ iff $u_i \ge 0$, $i = 1, \ldots, n$. If $A$ is a matrix, its transpose is denoted by $A^T$. Associated with the above inner product is the usual Euclidean norm $\|\cdot\|_2$. In formulating the duality theorems, we use the well-known notion of dual norm. Let $\|\cdot\|$ be a norm on $R^m$, $m \ge 1$; the dual norm $\|\cdot\|'$ on $R^m$ is defined by

$$\|y\|' = \max \{ (x, y) \mid \|x\| = 1 \}.$$

For $v \ne 0$, a $\|\cdot\|$-dual vector $v'$ satisfies $\|v'\|' = 1$ and $(v', v) = \|v\|$; a $\|\cdot\|'$-dual vector $v^*$ satisfies $\|v^*\| = 1$ and $(v^*, v) = \|v\|'$. When the norm is smooth and strictly convex these dual vectors are unique, and the maps $v \mapsto v'$ and $v \mapsto v^*$ are odd, continuous and positively homogeneous of degree zero on $R^m \setminus \{0\}$. Let $K = \{ Ax \mid x \ge 0, \ x \in R^n \}$ and let $K^\circ = \{ z \in R^m \mid A^T z \le 0 \}$ be its polar cone.

Theorem 1.1 ([20]) The problems

$$\min \{ \|b - z\| \mid z \in K \}, \qquad (1.6)$$

$$\max \{ (b, z) \mid z \in K^\circ, \ \|z\|' = 1 \}, \qquad (1.7)$$

have the same value for any norm $\|\cdot\|$ on $R^m$.
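The dual-norm machinery just introduced is easy to check numerically for the $l^p$ norms, where $\|\cdot\|' = \|\cdot\|_q$ with $p + q = pq$ and the $\|\cdot\|$-dual vector of $v$ is $v'_i = (|v_i|/\|v\|_p)^{p-1}\,\mathrm{sgn}\,v_i$. A minimal sketch, with helper names of our choosing:

```python
import numpy as np

def lp_norm(v, p):
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

def dual_exponent(p):
    return p / (p - 1.0)          # q with 1/p + 1/q = 1

def dual_vector(v, p):
    """||.||_p-dual v' of a non-zero v: ||v'||_q = 1 and (v', v) = ||v||_p."""
    return (np.abs(v) / lp_norm(v, p)) ** (p - 1.0) * np.sign(v)

v, p = np.array([3.0, -1.0, 2.0]), 1.5
q = dual_exponent(p)
vp = dual_vector(v, p)
print(lp_norm(vp, q))             # ~ 1
print(vp @ v, lp_norm(v, p))      # equal: (v', v) = ||v||_p
```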
From this, the following corollary is immediate.

Corollary 1.2 Let $R^m$ be equipped with a norm $\|\cdot\|$. Then the dual pair (P) and (P') have the same value.

An important result in [20] will be used to solve (P). It characterizes the solutions of (P) in the case when the norm $\|\cdot\|$ is strictly convex and the system $Ax = b$ has no non-negative exact solution.

Theorem 1.3 Let the norm $\|\cdot\|$ be strictly convex. The following are equivalent.

(i) $\bar x \ge 0$ is a non-negative least error $\|\cdot\|$ solution of the system $Ax = b$.

(ii) There exists $y \in R^m$, $\|y\|' = 1$, $A^T y \le 0$, such that

$$A\bar x = b - (b, y)\,y^*. \qquad (1.8)$$

Moreover, $(y, A\bar x) = 0$ and $y$ solves (P'). If in addition the norm $\|\cdot\|$ is smooth, then (i) and (ii) are equivalent to

(iii) Let $r = b - A\bar x$; then

$$A^T r' \le 0, \qquad (1.9)$$

and

$$(r', A\bar x) = 0. \qquad (1.10)$$

An immediate consequence of the theorem follows easily when $\|\cdot\| = \|\cdot\|_2$. In this case clearly $r' = r/\|r\|_2$ for any $r \ne 0$. The vector $x \ge 0$ is a solution of (P) in the $l^2$-sense iff the residual $r = b - Ax$ satisfies

$$A^T r \le 0 \quad \text{and} \quad (r, Ax) = 0. \qquad (1.11)$$

Now we give our algorithm for solving (P). It is assumed that the norm $\|\cdot\|$ is smooth and strictly convex and that the system $Ax = b$, $x \ge 0$, has no exact solution. The main results are the feasibility of the algorithm and the theorem concerning its convergence; proofs that all the steps of the algorithm are valid will also be given.

Algorithm 1.1

Step 0. Find $x_0$, an NNLS solution of $Ax = b$. Let $r_0 = b - Ax_0$, $y_0 = r_0/\|r_0\|'$, $k = 0$.

Step 1. Find $x_k$, an NNLS solution of $Ax = b - (b, y_k)\,y_k^*$. Let $r_k$ be the residual $r_k = b - (b, y_k)\,y_k^* - Ax_k$.

Step 2. If $r_k = 0$, GO TO step 8; else continue.

Step 3. If $(b, r_k) \ge (b, y_k)\|r_k\|' + \tfrac14\|r_k\|_2^2$, set $y_{k+1} = r_k/\|r_k\|'$ and GO TO step 7.

Step 4. Let $\mu_k = \big((b, r_k) - \tfrac12\|r_k\|_2^2\big)/(b, y_k)$.

Step 5. Find $\alpha_k > 0$ such that $\|y_k + \alpha_k r_k\|' = 1 + \alpha_k\mu_k$.

Step 6. Set $y_{k+1} = (y_k + \tfrac12\alpha_k r_k)/\|y_k + \tfrac12\alpha_k r_k\|'$.

Step 7. Increase $k$ by 1 and RETURN to step 1.

Step 8. $x_k$ is a non-negative least error $\|\cdot\|$ solution of $Ax = b$. The computation is complete.

We make a few observations. A crucial statement in the algorithm is step 5, defining the number $\alpha_k$. Any efficient algorithm for finding the zeros of a function can be used in this situation; later (chapter 4) we suggest a procedure to compute the $\alpha_k$ occurring in step 5. The stopping criterion given in step 2 is a very convenient one in the proof of convergence. From a computational point of view, a more practical stopping rule is to require that the so-called relative duality gap go to zero. So, in writing a routine to implement the algorithm, step 2 is replaced by

Step 2'. If $\big(\|b - Ax_k\| - (b, y_k)\big)/\|b - Ax_k\| \le \epsilon$, where $\epsilon$ is a fixed stopping rule parameter, GO TO step 8 (the computation is complete); else proceed.

All the other steps of the algorithm remain unchanged.

Remark. Later, in proposition 1.6, we shall show that the sequence $((b, y_k))$ generated by the algorithm is strictly increasing. We claim that $(b, y_k) > 0$ for all $k$. To see this, let $r_0$ be the NNLS residual of the system $Ax = b$, $x \ge 0$, as defined in step 0. Recall that the system is assumed to have no non-negative exact solution, so $r_0 \ne 0$. From equation (1.11) we get

$$(b, r_0) = \|r_0\|_2^2 > 0.$$

Once more, by step 0 of the algorithm, $y_0 = r_0/\|r_0\|'$, so that $(b, y_0) > 0$, which, together with monotonicity, proves the claim.

We now turn our focus to the convergence theory. We begin by proving the validity of the various steps of the algorithm.
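Before the analysis, here is a compact transcription of Algorithm 1.1 for the $l^p$ norm, reusing the $l^p$ helpers above together with SciPy's nnls and brentq. The $\|\cdot\|'$-dual vector is $y^*_i = (|y_i|/\|y\|_q)^{q-1}\,\mathrm{sgn}\,y_i$. This is an illustrative sketch, not the author's code; in particular the bracketing strategy for step 5 (tiny left endpoint, doubling right endpoint) is our own assumption.

```python
import numpy as np
from scipy.optimize import nnls, brentq

def lp_norm(v, p):
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

def star_dual(y, q):
    """||.||'-dual vector y*: ||y*||_p = 1 and (y*, y) = ||y||_q."""
    return (np.abs(y) / lp_norm(y, q)) ** (q - 1.0) * np.sign(y)

def algorithm_1_1(A, b, p, eps=1e-8, maxit=500):
    q = p / (p - 1.0)
    x, _ = nnls(A, b)                                # step 0
    r = b - A @ x
    y = r / lp_norm(r, q)
    for _ in range(maxit):
        beta = b @ y
        c = b - beta * star_dual(y, q)
        x, _ = nnls(A, c)                            # step 1
        r = c - A @ x
        res = lp_norm(b - A @ x, p)                  # step 2': duality gap
        if res - beta <= eps * res:
            return x, y
        if b @ r >= beta * lp_norm(r, q) + 0.25 * (r @ r):   # step 3
            y = r / lp_norm(r, q)
            continue
        mu = (b @ r - 0.5 * (r @ r)) / beta          # step 4
        F = lambda a: 1 + a * mu - lp_norm(y + a * r, q)     # step 5
        hi = 1.0
        while F(hi) > 0:                             # F < 0 for large a
            hi *= 2.0
        alpha = brentq(F, 1e-12, hi)
        z = y + 0.5 * alpha * r                      # step 6
        y = z / lp_norm(z, q)
    return x, y
```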
In the next theorem, we prove that under the assumption that the primal norm $\|\cdot\|$ is smooth, the number $\alpha_k > 0$ defined by step 5 always exists.

Theorem 1.4 Let the norm $\|\cdot\|$ on $R^m$ be smooth. Then, if the algorithm is at the stage of entering step 5, there exists $\alpha_k > 0$ such that

$$\|y_k + \alpha_k r_k\|' = 1 + \alpha_k\mu_k, \qquad (1.12)$$

where $\mu_k$ is defined by

$$\mu_k = \big((b, r_k) - \tfrac12\|r_k\|_2^2\big)/(b, y_k).$$

Proof. Since the norm $\|\cdot\|$ is smooth, the dual norm is strictly convex. Consider the strictly convex function $f : R \to R$ defined by

$$f(\lambda) = \|y_k + \lambda r_k\|' \qquad (1.13)$$

and let $l(\lambda) = 1 + \lambda\mu_k$. Then $f(0) = l(0) = 1$, $l'(0) = \mu_k$ and $f'(0) = (y_k^*, r_k)$. (For details about $f'$, see [14].) We note that the algorithm enters step 5 only if $r_k \ne 0$. By step 1 and (1.11), we obtain

$$\|r_k\|_2^2 = (b, r_k) - (b, y_k)(y_k^*, r_k). \qquad (1.14)$$

Using (1.14), we have

$$f'(0) - l'(0) = (y_k^*, r_k) - \mu_k = \big((b, y_k)(y_k^*, r_k) - (b, r_k) + \tfrac12\|r_k\|_2^2\big)/(b, y_k) = -\tfrac12\|r_k\|_2^2/(b, y_k) < 0.$$

This implies that $f(\alpha) - l(\alpha) < 0$ for some $\alpha > 0$. Note that step 5 is executed only if step 3 is answered negatively, i.e.,

$$(b, r_k) < (b, y_k)\|r_k\|' + \tfrac14\|r_k\|_2^2. \qquad (1.15)$$

In this case, since $\|y_k\|' = 1$, for any $\lambda > 0$ we have

$$f(\lambda) - l(\lambda) = \|\lambda r_k + y_k\|' - \lambda\mu_k - 1 \ge \lambda\|r_k\|' - \|y_k\|' - \lambda\mu_k - 1 = \lambda\big(\|r_k\|'(b, y_k) - (b, r_k) + \tfrac12\|r_k\|_2^2\big)/(b, y_k) - 2 > \lambda\,\|r_k\|_2^2/\big(4(b, y_k)\big) - 2,$$

due to (1.15). Hence $f(\lambda) - l(\lambda)$ goes to $\infty$ as $\lambda \to \infty$, so there exists $\lambda > 0$ such that $f(\lambda) - l(\lambda) > 0$. Because the function $f(\lambda) - l(\lambda)$ is continuous, there exists $\alpha_k > 0$ such that (1.12) holds as claimed, and the proof is complete.

Proposition 1.5 Let $r_k$ and $y_k$ be as defined in the algorithm. Then

(a) $y_k + \alpha r_k \ne 0$ for all $\alpha$, if $r_k \ne 0$;

(b) $A^T y_k \le 0$ and $\|y_k\|' = 1$, for all $k$.

Proof. To prove (a), suppose $y_k + \alpha r_k = 0$ for some $\alpha$; since $\|y_k\|' = 1$, necessarily $\alpha \ne 0$, so $r_k = a y_k$ with $a = -1/\alpha$. The algorithm defines $r_k$ by $r_k = b - (b, y_k)y_k^* - Ax_k$. Because of the second half of (1.11), we get

$$\|r_k\|_2^2 = \big(b - (b, y_k)y_k^*, \, r_k\big). \qquad (1.16)$$

Hence

$$\|r_k\|_2^2 = a(b, y_k) - a(b, y_k)(y_k^*, y_k) = 0,$$

since $(y_k^*, y_k) = \|y_k\|' = 1$. This implies $r_k = 0$, a contradiction.

To prove (b), once more by (1.11) we have $A^T r_0 \le 0$, so that by step 0 of the algorithm $A^T y_0 \le 0$. A simple induction shows that $A^T y_k \le 0$ and $\|y_k\|' = 1$ for all $k$. The proof is complete.

The next result shows that the sequence $(y_k)$ built by the algorithm leads to an improved approximation of the value of (P') after each iteration cycle.

Proposition 1.6 Let $(y_k)$ be the sequence generated by the algorithm. Then the sequence $((b, y_k))$ is strictly increasing and convergent.

Proof. Two cases have to be considered. Either (a) the question in step 3 is answered affirmatively, in which case

$$(b, r_k) \ge (b, y_k)\|r_k\|' + \tfrac14\|r_k\|_2^2, \qquad (1.17)$$

or (b) the question in step 3 is answered negatively, so that

$$(b, r_k) < (b, y_k)\|r_k\|' + \tfrac14\|r_k\|_2^2. \qquad (1.18)$$

It is easy to see that case (a) yields a strict increase. By step 3 of the algorithm, $y_{k+1} = r_k/\|r_k\|'$. Inequality (1.17) implies

$$(b, y_{k+1}) = (b, r_k)/\|r_k\|' \ge (b, y_k) + \tfrac14\|r_k\|_2^2/\|r_k\|'.$$

Thus $(b, y_{k+1}) > (b, y_k)$ (since $r_k \ne 0$), as desired.

We now take up the second case (b). Because the norm $\|\cdot\|$ is smooth, the dual norm $\|\cdot\|'$ is strictly convex, and by proposition 1.5(a), $y_k$ and $r_k$ are linearly independent. We have

$$\|y_k + \tfrac12\alpha_k r_k\|' = \|\tfrac12(y_k + \alpha_k r_k) + \tfrac12 y_k\|' < \tfrac12\|y_k + \alpha_k r_k\|' + \tfrac12\|y_k\|' = 1 + \tfrac12\alpha_k\mu_k \quad \text{(due to step 5)}.$$

We conclude that

$$(b, y_k)\,\|y_k + \tfrac12\alpha_k r_k\|' < (b, y_k) + \tfrac12\alpha_k(b, r_k) - \tfrac14\alpha_k\|r_k\|_2^2, \qquad (1.19)$$

which, in view of step 6, implies

$$(b, y_k) < \big((b, y_k + \tfrac12\alpha_k r_k) - \tfrac14\alpha_k\|r_k\|_2^2\big)/\|y_k + \tfrac12\alpha_k r_k\|'$$
$$< (b, y_{k+1}),$$

as claimed. To prove the convergence of the sequence $((b, y_k))$, we simply observe that because $\|y_k\|' = 1$, we have

$$(b, y_k) \le \|b\| \cdot \|y_k\|' = \|b\|.$$

The proof is complete.

The next result will be needed in the proof of the convergence theorem. It also gives further insight into the existence and uniqueness of the number $\alpha_k > 0$ defined in step 5.

Lemma 1.7 Let $\alpha_k > 0$ be as defined by step 5 of the algorithm. Then $\alpha_k$ satisfies the inequality

$$\big((y_k + \alpha_k r_k)^*, r_k\big) \ \ge\ \big((b, r_k) - \tfrac12\|r_k\|_2^2\big)/(b, y_k). \qquad (1.20)$$

Proof. By the definition of the dual vector, we have

$$\|y_k + \alpha_k r_k\|' = \big((y_k + \alpha_k r_k)^*, y_k\big) + \alpha_k\big((y_k + \alpha_k r_k)^*, r_k\big). \qquad (1.21)$$

Using the inequality

$$\big((y_k + \alpha_k r_k)^*, y_k\big) \le \|(y_k + \alpha_k r_k)^*\| \cdot \|y_k\|' = 1,$$

and (1.12), we obtain

$$1 + \alpha_k\big((b, r_k) - \tfrac12\|r_k\|_2^2\big)/(b, y_k) = \|y_k + \alpha_k r_k\|' \le 1 + \alpha_k\big((y_k + \alpha_k r_k)^*, r_k\big), \qquad (1.22)$$

which proves the desired inequality (1.20).

We are now able to prove the convergence theorem.

Theorem 1.8 Either (a) the algorithm solves (P) and (P') in a finite number of iterations; or (b) the algorithm generates infinite sequences $(x_k)$ and $(y_k)$ such that the sequence $(y_k)$ converges to the solution of (P') and every cluster point of $(x_k)$ is a solution of (P).

Proof. To prove (a), we note that the sequences $(x_k)$ and $(y_k)$ are finite (i.e., the algorithm terminates in a finite number of iterations) if and only if $r_k = 0$ for some fixed integer $k$. In this case, by theorem 1.3, $x_k$ and $y_k$ are solutions of (P) and (P') respectively.

To prove (b), let $(x_k)$, $(y_k)$, $(r_k)$ and $(\alpha_k)$ be the infinite sequences generated by the algorithm. Let

$$d = \min \{ \|b - Ax\| \mid x \ge 0, \ x \in R^n \}. \qquad (1.23)$$

Since the system $Ax = b$, $x \ge 0$, is assumed to have no non-negative exact solution, $d > 0$. By (1.16),

$$\|r_k\|_2^2 = \big(b - (b, y_k)y_k^*, r_k\big) \le \big(\|b\|_2 + d\,\|y_k^*\|_2\big)\|r_k\|_2.$$

Let $M > 0$ be such that $\|z\|_2 \le M\|z\|$ for any $z \in R^m$. Then, because $\|y_k^*\| = 1$, we obtain

$$\|r_k\|_2 \le \|b\|_2 + Md. \qquad (1.24)$$

Hence the sequence $(r_k)$ is bounded. Our goal is to prove that $(r_k)$ converges to zero. Suppose this is not true; then, by (1.24), there exists a subsequence, denoted again $(r_k)$, such that $r_k \to r \ne 0$. There are two cases to investigate.

Case 1. Assume that step 5 of the algorithm is executed for an infinite number of indices $k$, so that

$$(b, r_k) < (b, y_k)\|r_k\|' + \tfrac14\|r_k\|_2^2. \qquad (1.25)$$

By taking a subsequence if necessary, we may assume that

$$\|r_k\|_2 \ge \delta \quad \text{for all } k \ge 0, \text{ for some } \delta > 0. \qquad (1.26)$$

Now,

$$\alpha_k\|r_k\|' - 1 \le \|y_k + \alpha_k r_k\|' = 1 + \alpha_k\big((b, r_k) - \tfrac12\|r_k\|_2^2\big)/(b, y_k) \quad \text{(because of step 5)},$$

from which one obtains

$$\alpha_k\big(\|r_k\|'(b, y_k) - (b, r_k) + \tfrac12\|r_k\|_2^2\big) \le 2(b, y_k),$$

and hence, due to (1.25), $\tfrac14\alpha_k\|r_k\|_2^2 < 2(b, y_k)$. By weak duality and (1.23) this implies

$$\alpha_k\|r_k\|_2^2 < 8(b, y_k) \le 8d.$$

Using (1.26), we see that

$$0 < \alpha_k < 8d/\delta^2. \qquad (1.27)$$

We have thus shown that the positive sequence $(\alpha_k)$ is bounded from above. Since $\|y_k\|' = 1$, by passing to a further subsequence of $(k)$, which we again call $(k)$ for simplicity, we may assume that there exist a non-negative number $a$ and a vector $y$ such that $\alpha_k \to a$ and $y_k \to y$. Letting $k \to \infty$ in (1.20), we get

$$\big((y + ar)^*, r\big) \ge \big((b, r) - \tfrac12\|r\|_2^2\big)/(b, y) \qquad (1.28)$$

(this is possible because the map $z \mapsto z^*$ is continuous on $R^m \setminus \{0\}$). We will prove that both possibilities, $a = 0$ and $a > 0$, lead to contradictions.

If $a = 0$, then (1.28) becomes

$$(b, y)(y^*, r) \ge (b, r) - \tfrac12\|r\|_2^2. \qquad (1.29)$$

If one takes the limit as $k \to \infty$ in (1.16), one has

$$\|r\|_2^2 = (b, r) - (b, y)(y^*, r). \qquad (1.30)$$

Inequality (1.29) and equation (1.30) together imply

$$\|r\|_2^2 \le \tfrac12\|r\|_2^2,$$

which implies $r = 0$, leading to a contradiction, as claimed.
If $a > 0$, then by step 6 of the algorithm

$$y_{k+1} \to (y + \tfrac12 a r)/\|y + \tfrac12 a r\|' = \bar y \qquad (1.31)$$

as $k \to \infty$. Both subsequences $((b, y_k))$ and $((b, y_{k+1}))$ have the same limit; let $\rho$ be this limit. Then

$$(b, y) = (b, \bar y) = \rho. \qquad (1.32)$$

Using equations (1.31) and (1.32) simultaneously gives

$$(b, y)\,\|y + \tfrac12 a r\|' = (b, y + \tfrac12 a r) = (b, y) + \tfrac12 a(b, r).$$

Rewriting this equation yields

$$\|y + \tfrac12 a r\|' = 1 + \tfrac12 a(b, r)/(b, y). \qquad (1.33)$$

Allowing $k \to \infty$ in the equation (1.12) defining $\alpha_k$, we obtain

$$(b, y)\,\|y + a r\|' = (b, y) + a\big((b, r) - \tfrac12\|r\|_2^2\big).$$

Because the norm $\|\cdot\|$ is smooth, we apply the same argument as in the proof of proposition 1.6, using the strict convexity of the dual norm $\|\cdot\|'$ (cf. equation (1.19)), to conclude that

$$(b, y)\,\|y + \tfrac12 a r\|' < (b, y) + \tfrac12 a(b, r) - \tfrac14 a\|r\|_2^2. \qquad (1.34)$$

It is clear now that equations (1.33) and (1.34) combined imply

$$0 < -\tfrac14 a\|r\|_2^2,$$

leading to the desired contradiction. The proof of case 1 is complete.

Case 2. Suppose that step 3 is executed for all but a finite number of iterations. Passing to a subsequence denoted again $(k)$, we may assume $y_k \to y$ and $r_k \to r$ as $k \to \infty$. In order to arrive at a contradiction, assume that $r \ne 0$. By step 3 of the algorithm,

$$y_{k+1} = r_k/\|r_k\|'. \qquad (1.35)$$

The subsequence $(y_{k+1})$ converges to $\bar y = r/\|r\|'$. Thus $\bar y$, besides $y$, is a cluster point of the sequence $(y_k)$. Because the original numerical sequence $((b, y_k))$ is convergent (proposition 1.6), one has

$$(b, y) = (b, \bar y). \qquad (1.36)$$

Letting $k \to \infty$ on both sides of the inequality defining step 3, we find

$$(b, r) \ge (b, y)\|r\|' + \tfrac14\|r\|_2^2,$$

i.e.,

$$(b, \bar y)\|r\|' \ge (b, y)\|r\|' + \tfrac14\|r\|_2^2 \quad \big(\text{since } (b, r) = (b, \bar y)\|r\|'\big).$$

In view of equation (1.36), this implies $r = 0$, yielding a contradiction. So, in both cases, we have proved that the sequence $(r_k)$ converges to zero, as claimed.

It remains to prove that the sequence $(y_k)$ converges to the solution of problem (P'). Let $y$ be a cluster point of the sequence $(y_k)$. Then there exists a subsequence, denoted again $(y_k)$, converging to $y$. From step 1 of the algorithm, we have

$$r_k = b - (b, y_k)\,y_k^* - Ax_k. \qquad (1.37)$$

Since $r_k \to 0$, the sequence $(Ax_k)$ converges to $b - (b, y)y^*$. From the fact that the cone $\{Ax \mid x \ge 0, \ x \in R^n\}$ is closed, we get

$$b - (b, y)y^* = A\bar x, \quad \text{for some } \bar x \ge 0.$$

Now applying theorem 1.3, we see that $y$ solves problem (P').

The same argument applies to any cluster point $\bar x$ of the sequence $(x_k)$. Let $(x_{k'})$ be a subsequence converging to $\bar x$. By taking a further subsequence, if necessary, we may assume that $y_{k'} \to y$. Considering equation (1.37) and letting $k' \to \infty$, we obtain

$$A\bar x = b - (b, y)y^* \quad (\text{since } r_k \to 0).$$

Once more applying theorem 1.3, we conclude that $\bar x$ is a solution of problem (P), as claimed. This completes the proof.

The algorithm studied in this chapter can be generalized as follows; we keep the previous notation. At each iteration cycle of algorithm 1.1, the new dual approximation is given by

$$y_{k+1} = (y_k + \tfrac12\alpha_k r_k)/\|y_k + \tfrac12\alpha_k r_k\|'. \qquad (1.38)$$

As was done in [14], a natural extension results from choosing an arbitrary coefficient $\lambda_k$, belonging to a set to be determined, as a substitute for the coefficient $1/2$ in equation (1.38). We set down these changes in the following generalized algorithm.

Algorithm 1.2

Step 0 to step 5. Same as in algorithm 1.1.

Step 6. Let

$$y_{k+1} = (y_k + \lambda_k\alpha_k r_k)/\|y_k + \lambda_k\alpha_k r_k\|',$$

where $\lambda_k \in \Lambda$, a fixed non-empty compact subset of the open interval $(0, 1)$.

Step 7. Increase $k$ by 1 and RETURN to step 1.

Step 8. $x_k$ is a non-negative least error $\|\cdot\|$ solution of $Ax = b$. The computation is complete.
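For concreteness, the generalized step 6 is a one-line change to the chapter-1 sketch; with lam = 0.5 it reduces to Algorithm 1.1's update. A minimal sketch, with names of our choosing:

```python
import numpy as np

def dual_update(y, r, alpha, lam, q):
    """Algorithm 1.2, step 6: y_{k+1} = (y + lam*alpha*r)/||y + lam*alpha*r||_q."""
    z = y + lam * alpha * r
    return z / np.sum(np.abs(z) ** q) ** (1.0 / q)
```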
A careful inspection shows that most of the results concerning algorithm 1.1 carry over to the present situation; only a few changes are needed in the proof of convergence.

Theorem 1.9 Let $\Lambda$ be a non-empty compact subset of the open interval $(0, 1)$. Assume the norm $\|\cdot\|$ to be smooth and strictly convex. Let the sequence $(y_k)$ be defined as in step 6 of the algorithm by

$$y_{k+1} = (y_k + \lambda_k\alpha_k r_k)/\|y_k + \lambda_k\alpha_k r_k\|',$$

where $\lambda_k \in \Lambda$ is arbitrary. Then

(a) the sequence $((b, y_k))$ generated by algorithm 1.2 is strictly increasing and convergent;

(b) the sequence $(y_k)$ converges to the solution of problem (P').

Proof. Recall that the smoothness of the norm $\|\cdot\|$ implies the strict convexity of the dual norm $\|\cdot\|'$. From this observation we obtain

$$\|y_k + \lambda_k\alpha_k r_k\|' < \lambda_k\|y_k + \alpha_k r_k\|' + (1 - \lambda_k)\|y_k\|' = \lambda_k(1 + \alpha_k\mu_k) + (1 - \lambda_k) = 1 + \lambda_k\alpha_k\big((b, r_k) - \tfrac12\|r_k\|_2^2\big)/(b, y_k), \qquad (1.39)$$

the middle equality being due to step 5. Rewriting (1.39), we get

$$(b, y_k) < \big((b, y_k + \lambda_k\alpha_k r_k) - \tfrac12\lambda_k\alpha_k\|r_k\|_2^2\big)/\|y_k + \lambda_k\alpha_k r_k\|' < (b, y_{k+1}),$$

since $\lambda_k > 0$, $\alpha_k > 0$ and $r_k \ne 0$. Thus the sequence $((b, y_k))$ is strictly increasing. The rest of the proof of (a) is similar to the proof of proposition 1.6.

To prove the convergence of $(y_k)$ claimed in (b), we show that

$$\lim r_k = 0. \qquad (1.40)$$

As in theorem 1.8, we assume $r_k \to r \ne 0$ and show that this leads to a contradiction. The difference is that we have to pass to an additional subsequence of $(\lambda_k)$, which we still denote $(\lambda_k)$, such that $\lambda_k \to \lambda \in \Lambda$. The two cases $a = 0$ and $a > 0$ are studied. The case $a = 0$ carries over exactly as in the proof of theorem 1.8, while in the case $a > 0$, letting $k \to \infty$ gives

$$y_{k+1} \to (y + \lambda a r)/\|y + \lambda a r\|' = \bar y. \qquad (1.41)$$

Since the subsequences $((b, y_k))$ and $((b, y_{k+1}))$ have the same limit, equation (1.41) shows that

$$(b, y)\,\|y + \lambda a r\|' = (b, y) + \lambda a(b, r). \qquad (1.42)$$

Now let $k \to \infty$ in (1.39). We get

$$(b, y)\,\|y + \lambda a r\|' \le (b, y) + \lambda a(b, r) - \tfrac12\lambda a\|r\|_2^2. \qquad (1.43)$$

Combining equation (1.42) and inequality (1.43) leads to

$$0 \le -\tfrac12\lambda a\|r\|_2^2 < 0,$$

which contradicts the assumption that $r \ne 0$. The rest of the proof is completed exactly as in theorem 1.8. The reader will note that this theorem was motivated by similar work in [14].

Chapter 2. An Algorithm For Minimal Norm Solution Of A Linear System

In this chapter, we consider a system of real linear equations

$$Ax = b, \qquad (2.1)$$

where $A$ is an $m \times n$ real matrix, $b$ a real $m$-vector, and $x$ a real $n$-vector. Under the assumption that the system (2.1) has a non-negative solution, we shall be concerned with the following problem: amongst all the non-negative solutions of (2.1), select the one which has least norm $\|\cdot\|$, when $R^n$ is equipped with a strictly convex norm $\|\cdot\|$ (e.g. the $l^p$-norm, $1 < p < \infty$). This problem will be referred to as problem (P):

$$(P) \quad \min \{ \|x\| \mid Ax = b, \ x \in R^n, \ x \ge 0 \}.$$

We assume that $b \ne 0$, for otherwise the problem is trivial. The notation and definitions are those of chapter 1. We present an algorithm for computing the solution of (P); all the steps in this algorithm will be shown to be feasible, and its global convergence will then be proved. To solve the given problem, a dual problem, denoted (P'), will be associated with (P), and an outline of the correspondence between (P) and (P') will be given. The main application of this work is the $l^p$-norm case: find $x \in R^n$ that solves the $l^p$-problem

$$\text{minimize } \{ \|x\|_p \mid Ax = b, \ x \in R^n, \ x \ge 0 \}. \qquad (2.2)$$
A worthwhile remark to make here is that the objective function in problem (2.2) need not be twice differentiable on $R^n \setminus \{0\}$; this is in particular the case when $1 < p < 2$. For $p = 2$, problem (2.2) was studied in Lawson and Hanson [5], who refer to it as the Least Distance Programming (LDP) problem. A finite algorithm for solving LDP was given in [5]. We use this algorithm as follows: at each iteration cycle of our algorithm, the LDP problem

$$Ax = b, \quad x \ge 0, \quad \|x - a_k\|_2 \ (\min),$$

where $(a_k)$ is a sequence defined by the algorithm, is solved using the LDP algorithm of Lawson and Hanson.

Briefly, we recall some definitions and notation from chapter 1. A norm $\|\cdot\|$ is said to be strictly convex if the unit sphere contains no line segment on its surface; in other words,

$$\|x\| = \|y\| = \|(x + y)/2\| = 1 \ \Rightarrow\ x = y.$$

The norm is smooth if through each point of the unit sphere in $R^n$ there passes exactly one hyperplane supporting the closed unit ball. If the norm $\|\cdot\|$ is strictly convex (resp. smooth), then the $\|\cdot\|$ (resp. $\|\cdot\|'$) dual vector is unique (see chapter 1 for the definitions of the dual norm and dual vectors). Moreover, the correspondence $x \mapsto x'$ (resp. $x \mapsto x^*$) from $R^n \setminus \{0\}$ into the $\|\cdot\|'$ (resp. $\|\cdot\|$) unit sphere is odd, continuous and positively homogeneous of degree zero. In the important case of the $l^p$-norm, defined as usual by

$$\|x\|_p = \Big(\sum_i |x_i|^p\Big)^{1/p},$$

with $1 < p < \infty$, we have $\|\cdot\|_p' = \|\cdot\|_q$, where $p + q = pq$. We close this review by recalling that for any smooth, strictly convex norm $\|\cdot\|$,

$$v^{*\prime} = v/\|v\|' \quad \text{and} \quad v^{\prime *} = v/\|v\|, \quad \text{if } v \ne 0.$$

Let

$$K = \{ x \in R^n \mid x \ge 0, \ Ax = b \}.$$

Given problem (P), we associate a dual problem ([6]):

$$(P') \quad \max \{ (b, y) \mid \xi \in R^n, \ \xi \ge 0, \ y \in R^m, \ \|\xi + A^T y\|' \le 1 \},$$

where $A^T$ is the transpose of the matrix $A$. The relation between problem (P) and problem (P') is investigated in the next two results.

Lemma 2.1 (weak duality) Suppose $K$ is non-empty. Let the norm $\|\cdot\|$ on $R^n$ be arbitrary. Then

$$\text{value of (P')} \le \text{value of (P)}. \qquad (2.3)$$

Proof. Let $x \in K$, and let $\xi$ and $y$ be as defined in problem (P'). Then

$$(b, y) = (Ax, y) = (x, A^T y) \le (x, A^T y + \xi) \quad (\text{since } x \ge 0, \ \xi \ge 0) \le \|x\| \cdot \|A^T y + \xi\|' \le \|x\|,$$

which implies (2.3), as claimed.

Theorem 2.2 (Duality [6]) If $\|\cdot\|$ is an arbitrary norm on $R^n$ and $K$ is non-empty, then the problems (P) and (P') have the same value.

The next theorem, due to Sreedharan and Nikolopoulos, states a useful characterization of the solution of (P).

Theorem 2.3 ([6]) Let the norm $\|\cdot\|$ on $R^n$ be strictly convex. Suppose $K$ non-empty. The following are equivalent.

(i) $\bar x$ is the non-negative minimal norm solution of problem (P).

(ii) $A\bar x = b$, $\bar x \ge 0$, and there exist $\xi \in R^n$, $\xi \ge 0$, and $y \in R^m$ such that

$$(b, y) > 0, \quad \|A^T y + \xi\|' = 1, \qquad (2.4)$$

and

$$\bar x = (b, y)\,(A^T y + \xi)^*. \qquad (2.5)$$

Furthermore,

$$\big(\xi, (A^T y + \xi)^*\big) = 0. \qquad (2.6)$$

If in addition the norm $\|\cdot\|$ is smooth, then

$$\bar x' = \xi + A^T y, \quad \text{and} \quad \|\bar x\| = (\xi + A^T y, \bar x). \qquad (2.7)$$

Finally, $(y, \xi)$ solves (P'), and $(b, y) = \|\bar x\|$.

A characterization of the solution of problem (P) in the $l^2$-norm case follows immediately from theorem 2.3. This is an important ingredient in the subsequent development of this chapter. We record it as

Corollary 2.4 If $\|\cdot\| = \|\cdot\|_2$, then $\bar x$ is the solution of problem (P) iff $\bar x \in K$,

$$\bar x = A^T y + \xi, \qquad (2.8)$$

and

$$(\xi, \bar x) = 0, \qquad (2.9)$$

for some $y \in R^m$, $\xi \in R^n$, $\xi \ge 0$.

It should be noted that when $\|\cdot\| = \|\cdot\|_2$, then $\bar x' = \bar x/\|\bar x\|_2$, so that (2.8) and (2.9) are simply easy consequences of equations (2.6) and (2.7).
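The $l^2$ subproblem that drives the chapter-2 algorithm is $\min \|x - a\|_2$ subject to $Ax = b$, $x \ge 0$. Lawson and Hanson solve it with their finite LDP/NNLS machinery; as a minimal stand-in, the sketch below uses SciPy's general SLSQP solver instead. The helper name solve_ldp and the choice of SLSQP are ours, not [5]'s.

```python
import numpy as np
from scipy.optimize import minimize

def solve_ldp(A, b, a):
    """Euclidean projection of a onto {x >= 0 : Ax = b}."""
    n = A.shape[1]
    res = minimize(lambda x: 0.5 * np.sum((x - a) ** 2),
                   np.maximum(a, 0),              # rough starting point
                   jac=lambda x: x - a,
                   constraints=[{"type": "eq",
                                 "fun": lambda x: A @ x - b,
                                 "jac": lambda x: A}],
                   bounds=[(0, None)] * n,
                   method="SLSQP")
    return res.x
```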
We are now prepared to state our algorithm for computing the solution of problem (P). We assume that the system of linear equations $Ax = b$, $x \ge 0$, is feasible and that the norm $\|\cdot\|$ is strictly convex and smooth.

Algorithm 2.1

Step 0. Find $x_0$, the solution of $Ax = b$, $x \ge 0$, $\|x\|_2$ (min). Let $g_0 = x_0/\|x_0\|'$, $\beta_0 = (g_0, x_0)$ and $k = 0$.

Step 1. Set $a_k = \beta_k g_k^*$. Find $x_{k+1}$, the solution of $Ax = b$, $x \ge 0$, $\|x - a_k\|_2$ (min). Let $u_k = x_{k+1} - a_k$.

Step 2. If $u_k = 0$, stop; $x_{k+1}$ is the solution of (P). Else continue.

Step 3. Set $\gamma_k = (u_k, x_{k+1})$.

Step 4. If $\gamma_k \ge \beta_k\|u_k\|' + \tfrac14\|u_k\|_2^2$, let $g_{k+1} = u_k/\|u_k\|'$ and $\beta_{k+1} = \gamma_k/\|u_k\|'$, and GO TO step 8; else continue.

Step 5. Let $\mu_k = (\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k$.

Step 6. Find $\alpha_k > 0$ such that $\|g_k + \alpha_k u_k\|' = 1 + \alpha_k\mu_k$.

Step 7. Let

$$g_{k+1} = (g_k + \tfrac12\alpha_k u_k)/\|g_k + \tfrac12\alpha_k u_k\|', \quad \text{and} \quad \beta_{k+1} = (\beta_k + \tfrac12\alpha_k\gamma_k)/\|g_k + \tfrac12\alpha_k u_k\|'.$$

Step 8. Increase $k$ by 1 and RETURN to step 1.

Later in this chapter the stopping rule of step 2 will be used as follows. We will show that the constructed sequence $(u_k)$ converges to zero; because of this convergence, it will be proven that the algorithm converges to the solution of problem (P). For implementation purposes, however, we shall use the more realistic stopping rule, similar to the one used in chapter 1, called the relative duality gap criterion. In other words, we replace the condition $u_k = 0$ of step 2 by the condition

$$(\|x_{k+1}\| - \beta_k)/\|x_{k+1}\| \le \eta,$$

where $\eta > 0$ is a stopping rule parameter. We record this in

Algorithm 2.2

Step 0. Find $x_0$, $g_0$ and $\beta_0$ as in step 0 of algorithm 2.1; let $\eta > 0$ be a stopping parameter.

Step 1. Same as step 1 in algorithm 2.1.

Step 2. If $(\|x_{k+1}\| - \beta_k)/\|x_{k+1}\| \le \eta$, stop; $x_{k+1}$ is the solution of (P). Else proceed.

Steps 3 to 8. Same as in algorithm 2.1.

Our main interest in the upcoming sections of this chapter is to prove that the various steps of the algorithm are valid, that the important step 6 defining $\alpha_k > 0$ can always be carried out, and finally that the algorithm leads effectively to the solution of problem (P).

Lemma 2.5 ([6]) Let $x_0, g_0, \beta_0$ be defined as in step 0 of the algorithm. Then there exist $\xi_0 \in R^n$, $\xi_0 \ge 0$, $y_0 \in R^m$ such that

$$g_0 = \xi_0 + A^T y_0, \quad \|g_0\|' = 1, \quad \text{and} \quad \beta_0 = (b, y_0) > 0. \qquad (2.10)$$

Proof. By corollary 2.4, there exist $\xi \in R^n$, $\xi \ge 0$, $z \in R^m$ such that

$$x_0 = \xi + A^T z, \quad \text{and} \quad (\xi, x_0) = 0.$$

But $g_0 = x_0/\|x_0\|'$, thus $g_0 = \xi_0 + A^T y_0$, where $\xi_0 = \xi/\|x_0\|'$ and $y_0 = z/\|x_0\|'$. To prove the last statement of (2.10), we note that

$$\beta_0 = (g_0, x_0) = \|x_0\|_2^2/\|x_0\|' > 0,$$

and, due to (2.9),

$$\beta_0 = (\xi_0 + A^T y_0, x_0) = (y_0, Ax_0) = (y_0, b),$$

which completes the proof.

Proposition 2.6 (a) Let $u_k \ne 0$ be as defined in the algorithm. Then

$$g_k + \alpha u_k \ne 0, \quad \text{for all } \alpha > 0. \qquad (2.11)$$

(b) Let $\alpha_k > 0$ be as defined by step 6 of the algorithm. Then there exist $\xi_{k+1} \in R^n$, $\xi_{k+1} \ge 0$, $y_{k+1} \in R^m$ such that

$$g_{k+1} = \xi_{k+1} + A^T y_{k+1}, \quad \|g_{k+1}\|' = 1, \quad \text{and} \quad \beta_{k+1} = (y_{k+1}, b). \qquad (2.12)$$

Proof. To prove (a) we proceed by induction on $k$. Let $k = 0$ and suppose (2.11) does not hold, i.e., there exists $\alpha > 0$ such that $g_0 = -\alpha u_0$. Then

$$\|u_0\|_2^2 = (u_0, x_1 - a_0) = -\alpha^{-1}(g_0, x_1 - a_0) = -\alpha^{-1}\big((g_0, x_1) - (g_0, a_0)\big) = -\alpha^{-1}\big((x_1, \xi_0 + A^T y_0) - \beta_0\big),$$

because of (2.10) and step 1 (note $(g_0, a_0) = \beta_0(g_0^*, g_0) = \beta_0\|g_0\|' = \beta_0$). Continuing,

$$\|u_0\|_2^2 = -\alpha^{-1}\big((Ax_1, y_0) + (x_1, \xi_0) - \beta_0\big) = -\alpha^{-1}\big(\beta_0 + (x_1, \xi_0) - \beta_0\big) \quad (\text{since } b = Ax_1) = -\alpha^{-1}(x_1, \xi_0) \le 0.$$

This yields a contradiction, since $x_1 \ge 0$, $\xi_0 \ge 0$ and $u_0 \ne 0$, which proves (2.11) for $k = 0$, as desired.

To prove (b): by the definition of $x_1$ in step 1 of the algorithm and corollary 2.4, there exist $\xi \in R^n$, $\xi \ge 0$, $z \in R^m$ such that

$$u_0 = x_1 - a_0 = \xi + A^T z, \qquad (2.13)$$

and

$$(z, b) = (u_0, x_1) = \gamma_0.$$

Suppose $\alpha_0$ is determined by step 6 of the algorithm.
Then, because of (2.11), $g_0 + \tfrac12\alpha_0 u_0$ is nonzero. Let

$$y_1 = (y_0 + \tfrac12\alpha_0 z)/\|g_0 + \tfrac12\alpha_0 u_0\|', \quad \text{and} \quad \xi_1 = (\xi_0 + \tfrac12\alpha_0\xi)/\|g_0 + \tfrac12\alpha_0 u_0\|'.$$

From (2.10) and (2.13), we obtain

$$A^T y_1 + \xi_1 = \big(\xi_0 + A^T y_0 + \tfrac12\alpha_0(\xi + A^T z)\big)/\|g_0 + \tfrac12\alpha_0 u_0\|' = (g_0 + \tfrac12\alpha_0 u_0)/\|g_0 + \tfrac12\alpha_0 u_0\|' = g_1.$$

Now,

$$(y_1, b) = \big((y_0, b) + \tfrac12\alpha_0(z, b)\big)/\|g_0 + \tfrac12\alpha_0 u_0\|' = (\beta_0 + \tfrac12\alpha_0\gamma_0)/\|g_0 + \tfrac12\alpha_0 u_0\|' = \beta_1.$$

This completes the proof of (2.12) for $k = 0$. The same argument applies for any integer $k$ if we assume the proposition true for $k - 1$.

Remark. It will be shown later that the sequence $(\beta_k)$ generated by the algorithm is strictly increasing. This, combined with (2.10), implies that $\beta_k > 0$ for all $k$. We conclude that step 5 is properly formulated.

Proposition 2.7 Suppose the algorithm is at the stage of executing step 5. Let $\mu_k$ be defined by

$$\mu_k = (\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k.$$

Then there exists $\alpha_k > 0$ such that

$$\|g_k + \alpha_k u_k\|' = 1 + \alpha_k\mu_k. \qquad (2.14)$$

Proof. Let the function $f : R \to R$ be defined by $f(\lambda) = \|g_k + \lambda u_k\|'$, and let $l(\lambda) = 1 + \lambda\mu_k$. Then $f$ is strictly convex. Moreover, $f(0) = \|g_k\|' = 1 = l(0)$. Also $f'(0) = (g_k^*, u_k)$ (see chapter 1). We have

$$\gamma_k/\beta_k = (u_k, x_{k+1})/\beta_k \quad (\text{by step 3}) = (u_k, u_k + a_k)/\beta_k \quad (\text{because of step 1}) = \|u_k\|_2^2/\beta_k + (u_k, g_k^*) \quad (\text{since } a_k = \beta_k g_k^*).$$

If the algorithm is at the stage of executing step 5, then $u_k \ne 0$. So

$$f'(0) - l'(0) = (g_k^*, u_k) - (\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k = -\tfrac12\|u_k\|_2^2/\beta_k < 0.$$

Hence there must exist $\lambda > 0$ such that $f(\lambda) - l(\lambda) < 0$. Now, $\alpha_k > 0$ in step 6 is sought only if step 4 of the algorithm is answered negatively. This means that $\gamma_k$ satisfies

$$\gamma_k < \beta_k\|u_k\|' + \tfrac14\|u_k\|_2^2. \qquad (2.15)$$

Using this inequality, we have

$$f(\lambda) - l(\lambda) = \|g_k + \lambda u_k\|' - 1 - \lambda(\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k \ge \lambda\|u_k\|' - \|g_k\|' - 1 - \lambda(\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k = \lambda\big(\beta_k\|u_k\|' - \gamma_k + \tfrac12\|u_k\|_2^2\big)/\beta_k - 2 > \lambda\,\|u_k\|_2^2/(4\beta_k) - 2,$$

the last step being due to (2.15). From this we get $f(\lambda) - l(\lambda) \to \infty$ as $\lambda \to \infty$. Therefore $f(\lambda) - l(\lambda) > 0$ for some $\lambda > 0$. Since $f - l$ is continuous, there must exist $\alpha_k > 0$ solving $f(\lambda) - l(\lambda) = 0$; thus equation (2.14) holds. The proof is complete.

Proposition 2.8 Let the norm $\|\cdot\|$ be smooth. Then

(a) the sequence $(\beta_k)$ generated by the algorithm is strictly increasing;

(b) either the sequence $(\beta_k)$ is finite, or it is a convergent infinite sequence.

Proof. Two cases are to be distinguished. The first case is straightforward. Assume step 4 of the algorithm is executed, which occurs if

$$\gamma_k \ge \beta_k\|u_k\|' + \tfrac14\|u_k\|_2^2.$$

So, from the definition of $\beta_{k+1}$, we obtain

$$\beta_{k+1} = \gamma_k/\|u_k\|' \ge \beta_k + \tfrac14\|u_k\|_2^2/\|u_k\|' > \beta_k$$

(since $u_k \ne 0$). Now, assume that step 4 of the algorithm is answered negatively; then $\beta_{k+1}$ is defined by step 7. Let $\alpha_k > 0$ be as defined in step 6. Recall that we are assuming the norm $\|\cdot\|$ to be smooth, so the dual norm $\|\cdot\|'$ is strictly convex. It follows that

$$\|g_k + \tfrac12\alpha_k u_k\|' < \tfrac12\|g_k + \alpha_k u_k\|' + \tfrac12\|g_k\|' = 1 + \tfrac12\alpha_k(\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k.$$

Consequently, we get

$$\beta_k < \big((\beta_k + \tfrac12\alpha_k\gamma_k) - \tfrac14\alpha_k\|u_k\|_2^2\big)/\|g_k + \tfrac12\alpha_k u_k\|' < (\beta_k + \tfrac12\alpha_k\gamma_k)/\|g_k + \tfrac12\alpha_k u_k\|' = \beta_{k+1}.$$

This proves that, in all cases, the sequence $(\beta_k)$ is strictly increasing, as claimed. To prove (b), suppose that the sequence $(\beta_k)$ is infinite. To show its convergence, we need only prove that it is bounded, which follows immediately from the weak duality lemma:

$$\beta_k = (y_k, b) \le \|x\|, \qquad (2.16)$$

for any fixed feasible solution $x \in K$. Hence $(\beta_k)$ is bounded from above. The proof is complete.

After these preliminaries concerning the feasibility of Algorithm 2.1, we begin the study of its convergence.
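Before turning to the convergence proofs, here is a compact sketch of Algorithm 2.1 for $\|\cdot\| = \|\cdot\|_p$, wired together from the pieces above (the solve_ldp helper from the previous sketch and SciPy's brentq). The stopping rule is the relative duality gap of Algorithm 2.2; as before, the bracketing used for step 6 is our own assumption, not the text's.

```python
import numpy as np
from scipy.optimize import brentq

def algorithm_2_1(A, b, p, eta=1e-8, maxit=200):
    q = p / (p - 1.0)
    norm = lambda v, s: np.sum(np.abs(v) ** s) ** (1.0 / s)
    star = lambda v: (np.abs(v) / norm(v, q)) ** (q - 1.0) * np.sign(v)
    x = solve_ldp(A, b, np.zeros(A.shape[1]))          # step 0
    g = x / norm(x, q)
    beta = g @ x
    for _ in range(maxit):
        a = beta * star(g)                             # step 1
        x = solve_ldp(A, b, a)
        u = x - a
        if norm(x, p) - beta <= eta * norm(x, p):      # step 2 (Alg. 2.2)
            return x
        gamma = u @ x                                  # step 3
        if gamma >= beta * norm(u, q) + 0.25 * (u @ u):    # step 4
            g, beta = u / norm(u, q), gamma / norm(u, q)
            continue
        mu = (gamma - 0.5 * (u @ u)) / beta            # step 5
        F = lambda t: 1 + t * mu - norm(g + t * u, q)  # step 6
        hi = 1.0
        while F(hi) > 0:
            hi *= 2.0
        alpha = brentq(F, 1e-12, hi)
        d = norm(g + 0.5 * alpha * u, q)               # step 7
        g, beta = (g + 0.5 * alpha * u) / d, (beta + 0.5 * alpha * gamma) / d
    return x
```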
The next two results will be needed in the convergence theorem.

Lemma 2.9 Let $\alpha_k > 0$ be as defined in step 6 of the algorithm. Then

$$\beta_k\big((g_k + \alpha_k u_k)^*, u_k\big) + \tfrac12\|u_k\|_2^2 \ \ge\ \gamma_k. \qquad (2.17)$$

Proof. First, we recall that by the definition of the dual vectors and the construction of $g_k$, we have

$$\|(g_k + \alpha_k u_k)^*\| = 1 \quad \text{and} \quad \|g_k\|' = 1.$$

This implies

$$\big((g_k + \alpha_k u_k)^*, g_k\big) \le \|(g_k + \alpha_k u_k)^*\| \cdot \|g_k\|' = 1.$$

Due to the definition of the dual norm, we also have the equation

$$\|g_k + \alpha_k u_k\|' = \big((g_k + \alpha_k u_k)^*, g_k\big) + \alpha_k\big((g_k + \alpha_k u_k)^*, u_k\big).$$

This, combined with the equation (2.14) defining $\alpha_k$, yields

$$1 + \alpha_k(\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k = \|g_k + \alpha_k u_k\|' \le 1 + \alpha_k\big((g_k + \alpha_k u_k)^*, u_k\big),$$

which, since $\alpha_k > 0$, implies (2.17), as desired.

Theorem 2.10 Let $a_k$, $x_k$ and $u_k$ be as defined in the algorithm. Then the sequences $(a_k)$, $(x_k)$ and $(u_k)$ are bounded.

Proof. If the algorithm terminates in a finite number of iterations, we are done. Consider the case where the sequences are infinite. Let $d > 0$ be the value of the minimization problem (P). Then, as mentioned above,

$$\beta_k = (b, y_k) \le d, \quad \text{for all } k. \qquad (2.18)$$

Now, from step 1 of the algorithm, we see that

$$\|a_k\|_2 = \beta_k\|g_k^*\|_2 \le M\beta_k\|g_k^*\| = M\beta_k \quad (\text{since } \|v^*\| = 1 \text{ for all } v \ne 0) \le Md,$$

where $M > 0$ is such that $\|v\|_2 \le M\|v\|$ for all $v \in R^n$. Hence $(a_k)$ is a bounded sequence. To prove that the sequence $(x_k)$ is bounded, let $\tilde x$ be any fixed feasible solution of problem (P). Because $x_{k+1}$ is the minimizer of the problem

$$Ax = b, \quad x \ge 0, \quad \|x - a_k\|_2 \ (\min)$$

(this minimization is done in step 1 of the algorithm at each iteration cycle), we have

$$\|x_{k+1}\|_2 \le \|x_{k+1} - a_k\|_2 + \|a_k\|_2 \le \|\tilde x - a_k\|_2 + \|a_k\|_2 \le \|\tilde x\|_2 + 2\|a_k\|_2 \le \|\tilde x\|_2 + 2Md.$$

Thus $(x_k)$ is a bounded sequence. From this it follows clearly that the sequence $(u_k)$, $u_k = x_{k+1} - a_k$, is also bounded. This completes the proof.

We can now put everything together to prove that the algorithm converges to the solution of problem (P).

Proposition 2.11 ([6]) Let $(u_k)$ be the sequence generated by the algorithm. If $u_k = 0$ for some $k$, then $x_{k+1}$ is the solution of problem (P).

Proof. $u_k = 0$ implies $x_{k+1} = a_k = \beta_k g_k^*$ (because of step 1 of the algorithm). By virtue of proposition 2.6,

$$\beta_k = (b, y_k) \quad \text{and} \quad g_k = \xi_k + A^T y_k.$$

This gives us the relation

$$x_{k+1} = (b, y_k)\,(\xi_k + A^T y_k)^*.$$

Moreover, $\|\xi_k + A^T y_k\|' = 1$ and $(b, y_k) > 0$. Using theorem 2.3, it is obvious now that $x_{k+1}$ solves (P).

Theorem 2.12 (a) If the algorithm terminates after a finite number $k$ of iterations, then $x_{k+1}$ is the solution of problem (P). (b) If the algorithm generates an infinite sequence $(x_k)$, then it converges to the solution of (P).

Proof. Algorithm 2.1 terminates in a finite number $k$ of iterations iff $u_k = 0$. In this case, by proposition 2.11, the algorithm computes the unique solution $x_{k+1}$ of problem (P). This proves (a).

To prove (b), assume that the algorithm is executed for an infinite number of indices $k$. By theorem 2.10, the sequence $(\gamma_k)$ defined by $\gamma_k = (u_k, x_{k+1})$, where $(x_k)$ and $(u_k)$ are generated by the algorithm, is a bounded sequence. Hence, by passing to a subsequence if necessary, we may assume that $\gamma_k \to \gamma$. Since $0 < \beta_k \le d$ (due to (2.18)) and $\|g_k\|' = 1$, let us pass to a further subsequence, denoted again by $(k)$, such that

$$\beta_k \to \beta \quad \text{and} \quad g_k \to g.$$

Our first goal is to establish that

$$\lim u_k = 0. \qquad (2.19)$$

Suppose the claim were false. Then, once more by theorem 2.10, there exists a subsequence, denoted $(k)$, such that

$$u_k \to u \ne 0. \qquad (2.20)$$

Case 1. Step 6 is executed for an infinite number of indices $k$.
We claim that the corresponding sequence of positive numbers $(\alpha_k)$ is bounded from above. Because of (2.20), we can pick a subsequence, denoted once more by $(k)$, such that

$$\|u_k\|_2 \ge \delta, \quad \forall k, \text{ for some } \delta > 0. \qquad (2.21)$$

Now, using (2.14) it follows that

$$\alpha_k\|u_k\|' - 1 \le \|g_k + \alpha_k u_k\|' = 1 + \alpha_k(\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k.$$

This shows that

$$\alpha_k\big(\beta_k\|u_k\|' - \gamma_k + \tfrac12\|u_k\|_2^2\big) \le 2\beta_k. \qquad (2.22)$$

Recall that step 6 is executed only in case

$$\gamma_k < \beta_k\|u_k\|' + \tfrac14\|u_k\|_2^2. \qquad (2.23)$$

Combining (2.22) and (2.23) we get

$$\tfrac14\alpha_k\|u_k\|_2^2 < 2\beta_k.$$

Together with (2.21) and (2.18), this shows that

$$0 < \alpha_k < 8d/\delta^2.$$

Thus the sequence $(\alpha_k)$ is bounded, as claimed. Passing to a further subsequence, if necessary, we may assume that there exists $a \ge 0$ such that $\alpha_k \to a$ as $k \to \infty$. If we let $k \to \infty$ in (2.17), then by continuity of the map $z \mapsto z^*$ on $R^n \setminus \{0\}$, it follows that

$$\beta\big((g + au)^*, u\big) + \tfrac12\|u\|_2^2 \ge \gamma. \qquad (2.24)$$

We distinguish two possibilities, $a = 0$ or $a > 0$. We prove that each of these possibilities leads to a contradiction; we then conclude that $\lim u_k = 0$.

If $a = 0$, then (2.24) becomes

$$\beta(g^*, u) + \tfrac12\|u\|_2^2 \ge \gamma. \qquad (2.25)$$

From the definition of $u_k$ in step 1 of the algorithm, we obtain

$$\|u_k\|_2^2 = (x_{k+1} - a_k, u_k) = (x_{k+1}, u_k) - (a_k, u_k) = \gamma_k - \beta_k(g_k^*, u_k). \qquad (2.26)$$

Passing to the limit on both sides of (2.26) leads to

$$\|u\|_2^2 = \gamma - \beta(g^*, u). \qquad (2.27)$$

Inequality (2.25) and equation (2.27) force $u$ to satisfy

$$\|u\|_2^2 \le \tfrac12\|u\|_2^2,$$

which is a contradiction to (2.20). So $u = 0$, as claimed.

Suppose now that $a > 0$. Using step 7 of the algorithm and allowing $k \to \infty$, we get

$$\beta_{k+1} \to (\beta + \tfrac12 a\gamma)/\|g + \tfrac12 au\|' = \bar\beta.$$

Note that $\beta_k = (y_k, b)$ and $\beta_{k+1} = (y_{k+1}, b)$ both have the same limit, so $\bar\beta = \beta$. This yields the equation

$$\beta\,\|g + \tfrac12 au\|' = \beta + \tfrac12 a\gamma. \qquad (2.28)$$

Once more allowing $k \to \infty$ in the equation (2.14) defining $\alpha_k$ gives

$$\beta\,\|g + au\|' = \beta + a(\gamma - \tfrac12\|u\|_2^2). \qquad (2.29)$$

Applying the strict convexity of the dual norm $\|\cdot\|'$ implies

$$\beta\,\|g + \tfrac12 au\|' < \beta\big(\tfrac12\|g\|' + \tfrac12\|g + au\|'\big) = \tfrac12\beta + \tfrac12\big(\beta + a(\gamma - \tfrac12\|u\|_2^2)\big) \quad (\text{because of (2.29)}) = \beta + \tfrac12 a(\gamma - \tfrac12\|u\|_2^2). \qquad (2.30)$$

Inserting this in (2.28) shows that

$$\beta + \tfrac12 a\gamma < \beta + \tfrac12 a(\gamma - \tfrac12\|u\|_2^2),$$

which implies

$$\tfrac14 a\|u\|_2^2 < 0,$$

leading to the sought contradiction with (2.20). So $u = 0$ and the whole sequence $(u_k)$ converges to zero, as desired.

Case 2. Assume that the condition in step 4 of the algorithm is satisfied for all but a finite number of indices $k$. As in the first case we proceed by contradiction. So, let us suppose that $(u_k)$ does not converge to zero. Then there exist a subsequence, which we denote again by $(k)$, and a nonzero vector $u$ such that

$$\lim u_k = u \ne 0. \qquad (2.31)$$

Since step 4 is answered affirmatively, we have

$$\gamma_k \ge \beta_k\|u_k\|' + \tfrac14\|u_k\|_2^2, \qquad (2.32)$$

for all $k$. Letting $k \to \infty$ in the above inequality implies

$$\gamma \ge \beta\|u\|' + \tfrac14\|u\|_2^2. \qquad (2.33)$$

By step 4, $\beta_{k+1} = \gamma_k/\|u_k\|'$. As $k \to \infty$, this converges to

$$\beta = \gamma/\|u\|'. \qquad (2.34)$$

Combining inequality (2.33) and equation (2.34), we obtain

$$\tfrac14\|u\|_2^2 \le 0,$$

a contradiction with (2.31). We have thus proved that in all cases the sequence $(u_k)$ generated by the algorithm converges to zero, establishing (2.19), as claimed.

We now prove that the algorithm converges to the unique solution of problem (P). Let $x^*$ be any cluster point of the sequence $(x_k)$ and let $(x_{k'})$ be a subsequence converging to $x^*$. Writing the relation in step 1 of the algorithm for all $k'$, we have

$$x_{k'+1} = u_{k'} + \beta_{k'}g_{k'}^*, \qquad (2.35)$$

where

$$g_{k'} = A^T y_{k'} + \xi_{k'}, \quad \|g_{k'}\|' = 1, \quad \text{and} \quad \beta_{k'} = (y_{k'}, b), \quad \forall k' \quad (\text{by proposition 2.6}).$$
Since $\|g_{k'}\|' = 1$, by passing to a further subsequence, again denoted $(k')$, we have

$$g_{k'} \to g, \quad \|g\|' = 1.$$

It has been proven that $\lim u_{k'} = 0$, so if we let $k' \to \infty$ in (2.35) we get

$$x^* = \beta g^*, \quad \|g\|' = 1. \qquad (2.36)$$

Clearly, $x^*$ is feasible, since $K = \{ x \in R^n \mid x \ge 0, \ Ax = b \}$ is closed. In other words, $Ax^* = b$ and $x^* \ge 0$. We have seen earlier in the proof that $\beta = \lim (y_{k'}, b)$, where $y_{k'} \in R^m$, $\xi_{k'} \ge 0$ and $\|A^T y_{k'} + \xi_{k'}\|' = 1$. By the weak duality lemma and equation (2.36) we have

$$(y_{k'}, b) \le \|x^*\| = \beta. \qquad (2.37)$$

Letting $k' \to \infty$ in (2.37) yields equality. This shows that every cluster point of the sequence $(x_k)$ is a solution of problem (P). Due to the uniqueness of the solution, we conclude that $(x_k)$ converges to the unique solution of (P). This completes the proof of the theorem.

Chapter 3. On The Minimum Norm Solution Of A System Of Linear Inequalities

Lawson and Hanson proposed in their book ([5], chapter 23) an algorithm for solving the so-called LDP (Least Distance Programming) problem. The algorithm computes the minimal norm solution of a system of linear inequalities. The norm used by the two authors for the objective function was the $l^2$-norm, and the problem was

$$\text{Minimize } \|x\|_2 \quad \text{subject to } Gx \ge h.$$

The formulation of the solution to the problem was based on the Kuhn-Tucker optimality conditions. When the norm $\|\cdot\|_2$ is replaced by the $l^p$-norm, $1 < p < \infty$, in the objective function, the problem loses this structure, and a different characterization of the solution has to be considered. We shall propose a generalization of the LDP algorithm, given by Lawson and Hanson, to the $l^p$-case, for any $p$, $1 < p < \infty$. Keeping in mind the difference between the two problems, we will follow closely, whenever possible, the approach in [5]. This is made possible, simpler and more elegant because of the availability of the least distance algorithm presented in chapter 1 of this work.

The problem considered, and referred to as problem (P), is

$$(P) \quad \|x\| \ (\min), \quad Ax \ge b.$$

Here $A$ is an $m \times n$ real matrix, $b$ a real $m$-vector, $x$ a real $n$-vector. The norm $\|\cdot\|$ is assumed to be strictly convex and smooth. To solve the given problem (P), and with an eye toward the characterization of the solution, a dual problem, referred to as (P'), will be introduced, and we will investigate the relation between (P) and (P'). A related least distance problem, similar to the one studied in chapter 1, will then be used to solve (P). The main contribution of this work resides in giving an explicit formula to compute the vector of minimal $\|\cdot\|_p$-norm solving (P).

We introduce briefly the notion of the dual vector for the norm $\|\cdot\|$ on $R^n$. Let $\|\cdot\|'$ denote the dual norm. For any vector $v \in R^n$, $v \ne 0$, a $\|\cdot\|$-dual vector $v'$ is defined by

$$\|v'\|' = 1, \quad (v', v) = \|v\|. \qquad (3.1)$$

Similarly, the $\|\cdot\|'$-dual vector $v^*$ is defined by

$$\|v^*\| = 1, \quad (v^*, v) = \|v\|'. \qquad (3.2)$$

When $\|\cdot\| = \|\cdot\|_p$, $1 < p < \infty$, is the usual $l^p$-norm, then $\|\cdot\|' = \|\cdot\|_q$, where $p + q = pq$. In terms of coordinates, the dual vectors are given by

$$v'_i = (|v_i|/\|v\|_p)^{p-1}\,\mathrm{sgn}\,v_i, \quad i = 1, \ldots, n,$$

$$v^*_i = (|v_i|/\|v\|_q)^{q-1}\,\mathrm{sgn}\,v_i, \quad i = 1, \ldots, n.$$

The following two identities are useful and will often be referred to later:

$$v'^* = v/\|v\| \quad \text{and} \quad v^{*\prime} = v/\|v\|', \qquad (3.3)$$

for any $v \ne 0$.
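A quick numerical check of (3.1) through (3.3) for the $l^p$ norms: both dual vectors in coordinates, their defining identities, and $v'^* = v/\|v\|$. Sketch only; the helper names are ours.

```python
import numpy as np

p = 3.0
q = p / (p - 1.0)
norm  = lambda v, s: np.sum(np.abs(v) ** s) ** (1.0 / s)
prime = lambda v: (np.abs(v) / norm(v, p)) ** (p - 1.0) * np.sign(v)  # v'
star  = lambda v: (np.abs(v) / norm(v, q)) ** (q - 1.0) * np.sign(v)  # v*

v = np.array([2.0, -1.0, 0.5])
assert np.isclose(norm(prime(v), q), 1) and np.isclose(prime(v) @ v, norm(v, p))  # (3.1)
assert np.isclose(norm(star(v), p), 1) and np.isclose(star(v) @ v, norm(v, q))    # (3.2)
assert np.allclose(star(prime(v)), v / norm(v, p))                                # (3.3)
```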
Set

$$K = \{ x \in R^n \mid Ax \ge b \}.$$

If the region $K$ is non-empty, then, due to the compactness of the $\|\cdot\|$ unit ball in $R^n$, a solution of problem (P) exists; it is unique due to the strict convexity of the norm $\|\cdot\|$. Given problem (P), we associate a dual problem

$$(P') \quad \max \{ (b, y) \mid y \in R^m, \ y \ge 0, \ \|A^T y\|' \le 1 \},$$

where $A^T$ is the transpose of the matrix $A$. The next theorem generalizes slightly the duality theorem 3.1 in [6] to the case of linear inequality constraints. It establishes the relation between problem (P) and its dual (P'). Since $0 \in K$ iff $b \le 0$, in which case the value of (P) is zero, we explicitly exclude this easy case.

Theorem 3.1 Assume that $K$ is non-empty and that $0 \notin K$. Then, problems (P) and (P') have the same value.

Proof. We begin by showing the classical weak duality inequality. Let $x \in K$ and $y \in R^m$, $y \ge 0$, such that $\|A^T y\|' \le 1$. Then

$$(b, y) \le (Ax, y) \quad (\text{since } y \ge 0) = (x, A^T y) \le \|x\| \cdot \|A^T y\|' \le \|x\|. \qquad (3.4)$$

Hence, value of (P') $\le$ value of (P). To prove the reverse inequality, let

$$d = d(0, K) = \inf \{ \|x\| \mid x \in K \}.$$

By the duality theorem in Nirenberg [7],

$$d(0, K) = \max \{ -\sigma(z) \mid \|z\|' \le 1 \}, \qquad (3.5)$$

where $\sigma$ denotes the support function of the convex set $K$ and where we have adopted the notation $d(a, X) = \inf \{ \|a - x\| \mid x \in X \}$. Let $z_0$ be the maximizer of (3.5). By the definition of $\sigma$, we see that

$$-\sigma(z_0) = -\sup \{ (z_0, x) \mid x \in K \} = -\sup \{ (z_0, x) \mid Ax \ge b \} = \inf \{ (-z_0, x) \mid Ax \ge b \} = \sup \{ (b, y) \mid A^T y = -z_0, \ y \ge 0 \}. \qquad (3.6)$$

The last step is due to the duality theory of linear programming. Now, since $\sup \{ (b, y) \mid A^T y = -z_0, \ y \ge 0 \}$ is finite, there exists $\bar y \ge 0$ such that

$$A^T\bar y = -z_0, \quad \bar y \ge 0, \quad \text{and} \quad (b, \bar y) = \sup \{ (b, y) \mid A^T y = -z_0, \ y \ge 0 \}.$$

Furthermore, $\|A^T\bar y\|' = \|z_0\|' \le 1$, so $\bar y$ is feasible for (P'). Equation (3.6) combined with (3.4) implies the claimed equality.

The following theorem is due to Sreedharan-Nikolopoulos [6], in a slightly different form. Only a few changes are needed in the proof; we include it for completeness. It states a characterization of the solution of problem (P) and establishes the relation between the solutions of (P) and (P'). Because of theorem 3.1, only a minor modification is needed for its use in our case.

Theorem 3.2 Assume that $K$ is non-empty and that $0 \notin K$. Then the following are equivalent.

(i) $\bar x$ solves problem (P).

(ii) $A\bar x \ge b$ and there exists $y \in R^m$, $y \ge 0$, such that

$$\|A^T y\|' = 1, \quad (b, y) > 0, \quad \text{and} \quad \bar x = (b, y)\,(A^T y)^*.$$

Proof. Due to theorem 3.1, it is easy to see that (ii) implies (i): since $\|A^T y\|' = 1$,

$$\|\bar x\| = (b, y)\,\|(A^T y)^*\| = (b, y),$$

so the norm of the feasible point $\bar x$ equals the dual value $(b, y)$, and optimality follows from (3.4).

To prove the converse, let $y \ge 0$ be a solution of problem (P'). Then

$$\|\bar x\| = (b, y) = \rho > 0.$$

Since $A\bar x \ge b$ and $y \ge 0$, we have

$$(A^T y, \bar x) = (y, A\bar x) \ge (y, b) = \rho.$$

On the other hand,

$$(A^T y, \bar x) \le \|A^T y\|' \cdot \|\bar x\| \le \rho.$$

Thus

$$(A^T y, \bar x) = \rho \quad \text{and} \quad \|A^T y\|' = 1.$$

So

$$(\bar x/\rho, A^T y) = \|A^T y\|'.$$

By uniqueness of the $\|\cdot\|'$-dual and since $\|\bar x/\rho\| = 1$, we have

$$\bar x/\rho = (A^T y)^*,$$

which implies $\bar x = (b, y)(A^T y)^*$, completing the proof.

Theorem 3.3 Assume that $K$ is non-empty and that $0 \notin K$. Then $\bar x$ is the solution of problem (P) iff $A\bar x \ge b$ and there exists $y \in R^m$, $y \ge 0$, such that

$$\|A^T y\|' = 1, \qquad (3.7)$$

and

$$\bar x' = A^T y. \qquad (3.8)$$

Furthermore, $\|\bar x\| = (b, y)$, and $y$ is the solution of (P').

Proof. If $\bar x$ is the solution of (P), then by theorem 3.2 and positive homogeneity of the dual map,

$$\bar x' = \big((A^T y)^*\big)' = A^T y/\|A^T y\|' = A^T y,$$

with $y \ge 0$ and $\|A^T y\|' = 1$, which proves the "only if" part.

"If" part: By (3.8) and (3.3),

$$\bar x/\|\bar x\| = \bar x'^* = (A^T y)^*.$$

Thus

$$1 = \|\bar x'\|' = \|A^T y\|' = \big(A^T y, (A^T y)^*\big) = (A^T y, \bar x/\|\bar x\|).$$

So

$$\|\bar x\| = (A^T y, \bar x) = (y, A\bar x) \ge (y, b).$$

Now,

$$(b, y) \le (y, A\bar x) = (A^T y, \bar x) \le \|A^T y\|' \cdot \|\bar x\| = \|\bar x\|.$$

This implies $\|\bar x\| = (b, y)$. Therefore, $\bar x$ and $y$ solve (P) and (P') respectively, as desired.
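The Theorem 3.3 certificate is easy to test numerically for a candidate pair $(x, y)$ in the $l^p$ setting: feasibility $Ax \ge b$, $y \ge 0$, $\|A^T y\|_q = 1$, and $x' = A^T y$. A minimal sketch; the helper names, tolerance, and return convention are ours.

```python
import numpy as np

def check_certificate(A, b, x, y, p, tol=1e-8):
    q = p / (p - 1.0)
    norm  = lambda v, s: np.sum(np.abs(v) ** s) ** (1.0 / s)
    prime = lambda v: (np.abs(v) / norm(v, p)) ** (p - 1.0) * np.sign(v)
    ok = (np.all(A @ x >= b - tol) and np.all(y >= -tol)
          and abs(norm(A.T @ y, q) - 1) <= tol           # (3.7)
          and np.allclose(prime(x), A.T @ y, atol=tol))  # (3.8)
    return ok, norm(x, p), b @ y   # the two values agree at optimality
```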
As mentioned in the discussion at the beginning of this chapter, a key step toward the solution of problem (P) lies in our ability to compute effectively a least distance solution of a related problem. Before we formulate this problem, we set down some notation that will be used throughout this chapter. Let $E$ be an $m \times n$ real matrix and let $c \in R^m$. We consider the convex cone

$$C = \{ z \in R^m \mid E^T z \le 0 \}$$

and its negative polar

$$C^\circ = \{ Ex \mid x \ge 0, \ x \in R^n \}.$$

Given $E$ and $c$, we introduce the following minimization problem:

$$\min \{ \|c - y\|' \mid y \in C^\circ \}. \qquad (3.9)$$

With (3.9) is associated the following dual problem:

$$\max \{ (c, z) \mid z \in C, \ \|z\| = 1 \}. \qquad (3.10)$$

Problem (3.9) and its dual (3.10) have the same value (see chapter 1). The relation between (3.9) and its dual (3.10) was studied in Sreedharan [20], where the following theorem is proved.

Theorem 3.4 Assume that $c \notin C^\circ$. Let $\hat u \ge 0$ be such that $E\hat u$ solves problem (3.9), and let $y$ be the maximizer of (3.10). Then

$$E\hat u = c - (c, y)\,y' \qquad (3.11)$$

and

$$(E^T y, \hat u) = 0. \qquad (3.12)$$

Proof. See theorems 3.5 and 3.7 in Sreedharan [20]. Note the interchange of primes and stars, since our minimization problem uses the dual norm.

Corollary 3.5 ([20]) If $\hat u \ge 0$ is such that $E\hat u$ solves problem (3.9), then

$$E^T r' \le 0, \qquad (3.13)$$

where $r = c - E\hat u$.

Remark. Since the map $z \mapsto z'$ is odd and positively homogeneous of degree zero on $R^m \setminus \{0\}$, it is easily seen from (3.11) that

$$r' = \big((c, y)\,y'\big)' = y/\|y\| = y \quad (\text{since } \|y\| = 1). \qquad (3.14)$$

We are now ready to give the algorithm for solving (P) when $\|\cdot\| = \|\cdot\|_p$. The reader will note that for the validity of the present algorithm it is not sufficient to assume that the norm $\|\cdot\|$ is just strictly convex and smooth: the special structure of the $l^p$ norm will be used. A careful look at our proof reveals that the new requirement is the following: let $v \ne 0$ and let $v'$ be its $\|\cdot\|$-dual; then $v'_i = 0$ implies $v_i = 0$, where $v = (v_1, \ldots, v_n)$ and $v' = (v'_1, \ldots, v'_n)$.

The algorithm starts by solving a problem of type (3.9); then it proceeds to compute the solution of (P). We state this algorithm for the norms $\|\cdot\| = \|\cdot\|_p$ and $\|\cdot\|' = \|\cdot\|_q$, where $p + q = pq$. Given the matrix $A$ and the vector $b$ defining problem (P), let

$$E = [A, b]^T \quad \text{and} \quad c = [0, \ldots, 0, 1]^T;$$

here $E$ is an $(n + 1) \times m$ matrix and $c$ an $(n + 1)$-vector.

Algorithm 3.1

Step 0. If $b \le 0$, then $x = 0$ solves (P); GO TO step 6.

Step 1. Find $\hat u \ge 0$, a solution of the problem $\|c - Eu\|_q$ (min), $u \ge 0$.

Step 2. Compute the residual $r = c - E\hat u$.

Step 3. If $r = 0$, the feasible region of (P) is empty; GO TO step 6. Else proceed.

Step 4. Compute $r' = (r'_1, \ldots, r'_{n+1})$, the $\|\cdot\|_p$ dual of $r$. Let

$$x_j = -r'_j/r'_{n+1}, \quad j = 1, \ldots, n.$$

Step 5. Accept $x$ as the solution of (P).

Step 6. The computation is complete.

Before proceeding any further, some comments are in order here. To find $\hat u \ge 0$ in step 1, Algorithm 1.2 of chapter 1 can be used. Step 3 answers the question of the feasibility of the system $Ax \ge b$. To determine feasibility alone we may use the $l^2$-residual; if the problem is feasible, we actually start all over from step 1 with the given $l^q$ norm. If the region $K$ is non-empty (the system has a solution), then the algorithm computes the solution of the system which has minimum norm. If step 3 is answered affirmatively, the region $K$ is empty, and in this case we exit the algorithm.
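The wiring of Algorithm 3.1 is short. In the sketch below, step 1's non-negative least-error problem in the $l^q$ norm is delegated to the chapter-1 routine (the algorithm_1_1 sketch above, called with exponent $q$ so the error is measured in $\|\cdot\|_q$). Illustrative only; the names and the feasibility tolerance are our choices.

```python
import numpy as np

def algorithm_3_1(A, b, p, tol=1e-10):
    m, n = A.shape
    if np.all(b <= 0):                          # step 0: x = 0 is optimal
        return np.zeros(n)
    q = p / (p - 1.0)
    E = np.vstack([A.T, b[None, :]])            # E = [A, b]^T, (n+1) x m
    c = np.zeros(n + 1); c[-1] = 1.0
    u, _ = algorithm_1_1(E, c, q)               # step 1: ||c - Eu||_q (min), u >= 0
    r = c - E @ u                               # step 2
    rq = np.sum(np.abs(r) ** q) ** (1.0 / q)
    if rq <= tol:                               # step 3: Ax >= b infeasible
        return None
    rp = (np.abs(r) / rq) ** (q - 1.0) * np.sign(r)   # step 4: ||.||_p dual of r
    return -rp[:n] / rp[-1]                     # step 5
```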
We start by proving the feasibility of the algorithm.

Proposition 3.6 Let r = c − Eū be the residual given by steps 1 and 2 of the algorithm. If step 3 is answered negatively, then

    r'_{n+1} = ||r||_q > 0.

Proof. Note that since the case b ≤ 0 has already been handled, 0 ∉ K. In this case, let y be the solution of the dual problem (3.10). It follows from (3.12) and (3.14) that

    (E^T r', ū) = (E^T y, ū) = (y, Eū) = 0    (due to (3.12)).        (3.15)

This implies that

    0 = (E^T r', ū) = (r', Eū)
      = (r', c − r)    (due to step 2)
      = (r', c) − ||r||_q.                                            (3.16)

Step 3 of the algorithm is answered negatively iff r ≠ 0. Hence from (3.16) we obtain

    r'_{n+1} = (r', c) = ||r||_q > 0    (because of (3.1)).           (3.17)

This completes the proof.

In the next result, we show that the stopping criterion of the proposed algorithm is well formulated. We also show the feasibility of the system Ax ≥ b if r ≠ 0.

Theorem 3.7 Let r be the residual vector, with n + 1 components, given by steps 1 and 2 of the algorithm. Then:

(a) If x is as defined in step 4, x is the minimal norm solution of problem (P).
(b) If r = 0, the system Ax ≥ b is inconsistent.

Proof. Let us prove (a). From step 4 of the algorithm, we have

    x_j = −r'_j / r'_{n+1} = −r'_j / ||r||_q,   j = 1, ..., n.        (3.18)

To verify the feasibility of x, we need to show Ax ≥ b. By (3.17), we have

    −||r||_q [x, −1]^T = (−r'_{n+1}) [x, −1]^T
                       = (−r'_{n+1}) [−r'₁/r'_{n+1}, ..., −r'_n/r'_{n+1}, −1]^T
                       = [r'₁, ..., r'_n, r'_{n+1}]^T
                       = r'.

Hence

    −[A, b] [x, −1]^T ||r||_q = E^T r'.                               (3.19)

This, combined with inequality (3.13) of Corollary 3.5, implies

    (b − Ax) ||r||_q = E^T r' ≤ 0.

Thus Ax ≥ b, as claimed. If x = 0 then, due to the feasibility we just proved, b ≤ 0 and we are done. So assume that x ≠ 0. Then, by step 4, r'_j ≠ 0 for some j, 1 ≤ j ≤ n. As observed earlier, this implies r_j ≠ 0, due to the special form of the ||.||_p dual in the case of the l^p norm. So r̄ = (r₁, ..., r_n) ≠ 0. Now we have

    r'_j = (|r_j| / ||r||_q)^{q−1} sgn r_j = |r_j|^{q−1} sgn r_j / ||r||_q^{q−1},   j = 1, ..., n.

From this it follows that

    −x_j = |r_j|^{q−1} sgn r_j / (||r||_q ||r||_q^{q−1})
         = |r_j|^{q−1} sgn r_j / ||r||_q^q
         = ((|r_j|^{q−1} / ||r̄||_q^{q−1}) sgn r_j) · (||r̄||_q^{q−1} / ||r||_q^q),   j = 1, ..., n.

So,

    −x = r̄' · (||r̄||_q^{q−1} / ||r||_q^q).

Using the fact that the map z ↦ z' is odd and positively homogeneous of degree zero on R^n \ {0}, we get

    −x' = (r̄')' = r̄ / ||r̄||_q.                                       (3.20)

By the definition of E and c, we have r̄ = −A^T ū. Thus

    x' = A^T ū / ||r̄||_q = A^T (ū / ||r̄||_q),   ū ≥ 0.               (3.21)

This implies that x satisfies equation (3.8) with y = ū/||r̄||_q, y ≥ 0 and ||A^T y||_q = 1. We have verified that x satisfies the conditions of Theorem 3.3, so that x is the solution of problem (P). The proof of (a) is now complete.

To prove (b), assume that r = 0 and that there exists a solution x of the system Ax ≥ b. We have

    A^T ū = 0,   b^T ū = 1.                                           (3.22)

By step 1 of the algorithm, ū ≥ 0, so

    1 = b^T ū ≤ (Ax)^T ū = (x, A^T ū) = 0    (by (3.22)),

a contradiction. Thus, the system Ax ≥ b has no solution, as claimed. This completes the proof.

We close this chapter with the following observations.

1. With the algorithm of Chapter 1 at hand, the present algorithm for finding the minimal norm solution of a system of linear inequalities is easy to implement.

2. A consequence of this algorithm is that it determines whether the system of linear inequalities under consideration is consistent or not, as shown in step 3. If this is all one is interested in, then one would use the l²-norm in place of the l^q-norm in solving the problem stated in step 1.
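Observation 1 above is easy to substantiate: once ū is available, the only l^p-specific computation left is the dual-vector formula used in step 4. A minimal sketch (the helper names are hypothetical) is given below; the printed checks confirm the two defining identities of the dual vector, (z, z') = ||z||_q and ||z'||_p = 1.

```python
import numpy as np

def dual_vector(z, q):
    """||.||_p dual of a nonzero z in (R^k, ||.||_q), p = q/(q-1):
    z'_j = (|z_j| / ||z||_q)^(q-1) sgn z_j."""
    return np.sign(z) * (np.abs(z) / np.linalg.norm(z, ord=q)) ** (q - 1)

def x_from_residual(r, q):
    """Step 4 of Algorithm 3.1: x_j = -r'_j / r'_{n+1}, j = 1, ..., n."""
    rp = dual_vector(r, q)
    return -rp[:-1] / rp[-1]

z = np.array([3.0, -1.0, 2.0])
q = 1.5
p = q / (q - 1.0)
zp = dual_vector(z, q)
print(np.isclose(zp @ z, np.linalg.norm(z, ord=q)))   # (z, z') = ||z||_q
print(np.isclose(np.linalg.norm(zp, ord=p), 1.0))     # ||z'||_p = 1
```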
Chapter 4

Numerical results

In this chapter we discuss the computational aspects of the algorithm presented in Chapter 1 of this work. At this time we will not investigate the numerical results of the algorithm of Chapter 2; the full implementation of that algorithm will be presented elsewhere. The main computational difficulty encountered in this algorithm is finding α_k > 0 such that

    ||y_k + α_k r_k||' = 1 + α_k ν_k.

This search occurs in step 5 at each iteration cycle. Suppose that we are at the stage of entering step 5. Let the function F be defined by

    F(α) = 1 + α ν_k − ||y_k + α r_k||'.                              (4.1)

We are searching for α_k > 0 such that F(α_k) = 0. It is well known that

    (d/dα) ||y_k + α r_k||' = ((y_k + α r_k)', r_k)                   (4.2)

(see, for example, [14]).

Assume that the search for α_k has been reduced to an interval (β, γ), γ > β ≥ 0, with F(β) > 0 and F(γ) < 0. We begin by fitting a quadratic q(α) on the interval [β, γ] as follows:

    q(β) = F(β) =: F₁,
    q(γ) = F(γ) =: F₂,                                                (4.3)
    q'(β) = F'(β) =: F₁'.

We seek the root α̂ of q. If |F(α̂)| ≤ η and α̂ ∈ (β, γ), where η is a given tolerance parameter, then we set α_k = α̂ and return to the main algorithm. If the stopping condition |F(α̂)| ≤ η is not met but α̂ belongs to (β, γ), we reduce the interval of search by setting

    γ = α̂   if F(α̂) < 0,
    β = α̂   if F(α̂) > 0,

and then apply the routine to the new, reduced interval (β, γ), until an acceptable α_k is obtained. In the case when the root α̂ is not in the interval of search (β, γ), or if the quadratic interpolation has no real root, we consider the quadratic fitting as not suitable. A linear interpolation is then performed to determine α̂, i.e.

    α̂ = (β F₂ − γ F₁) / (F₂ − F₁),                                    (4.4)

followed by an update of the interval (β, γ), as was done in the quadratic fitting case.

Let

    q(α) = A(α − β)² + F₁'(α − β) + F₁                                (4.5)

be the quadratic interpolation defined via (4.3). It is easily seen that

    A = (F₂ − F₁ − F₁'(γ − β)) / (γ − β)².                            (4.6)

The roots of q are

    α − β = (−F₁' ± sqrt(F₁'² − 4AF₁)) / 2A.                          (4.7)

Recall that the quadratic interpolation is considered under the conditions F₁ > 0 and F₂ < 0. So q attains its maximum in (−∞, α̂), and it follows that q'(α̂) = 2A(α̂ − β) + F₁' ≤ 0. Thus, the only relevant root in (4.7) is

    α̂ − β = (−F₁' − sqrt(F₁'² − 4AF₁)) / 2A.

The following subalgorithm is based on the above discussion. For a further refinement of the interval of search (β, γ), we include a bisection to be performed at each iteration cycle of the subalgorithm.

4.1 Subalgorithm

Step 0. Let β = 0 and let γ be such that F(γ) < 0. Let ε > 0.
Step 1. Let F₁ = F(β), F₁' = F'(β) and F₂ = F(γ).
Step 2. Let h = γ − β.
Step 3. Compute A = (F₂ − F₁ − F₁'h)/h². If A = 0, GO TO step 7; else proceed.
Step 4. Let Δ = F₁'² − 4AF₁. If Δ < 0, GO TO step 7; else proceed.
Step 5. Set α̂ = β + (−F₁' − sqrt(Δ))/2A.
Step 6. If β < α̂ < γ, GO TO step 8; else proceed.
Step 7. Let α̂ = (βF₂ − γF₁)/(F₂ − F₁).
Step 8. If |F(α̂)| ≤ ε, set α_k = α̂ and RETURN to the main program; else proceed.
Step 9. If F(α̂) < −ε, set γ = α̂ and GO TO step 11; else proceed.
Step 10. Set β = α̂.
Step 11. Let α̂ = (β + γ)/2.
Step 12. If |F(α̂)| ≤ ε, set α_k = α̂ and RETURN to the main program; else proceed.
Step 13. If F(α̂) < −ε, set γ = α̂ and RETURN to step 1; else proceed.
Step 14. Set β = α̂ and RETURN to step 1.

A Newton method can also be incorporated within this subalgorithm. As noted in Chapter 1, in the l^p case the dual of a given non-zero vector z is given by

    z'_i = (|z_i| / ||z||_q)^{q−1} sgn z_i,   i = 1, ..., m.

This particularly simple formula for z' makes the calculation of the derivative in (4.2) immediate. We use the Newton iterations as follows. Via the quadratic model, we determine α̂ belonging to (β, γ); then we start the Newton iteration at α̂. We compute

    α̃ = α̂ − F(α̂)/F'(α̂).                                              (4.8)

Having this new approximation, we check whether α̃ belongs to (β, γ) and whether it yields an actual decrease in |F|. Only under these conditions do we continue the Newton iterations, until an acceptable α_k is reached. If one of the conditions

    (a) α̃ ∈ (β, γ),
    (b) |F(α̃)| < |F(α̂)|,

is not met, we exit the Newton iteration and return to the quadratic model.
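A direct transcription of Subalgorithm 4.1, with the Newton refinement guarded by conditions (a) and (b), might look like the sketch below. Here F and Fp stand for the function F of (4.1) and its derivative obtained from (4.2); the test function at the end is a hypothetical stand-in with root α = 0.4, used only to exercise the bracketing logic.

```python
import numpy as np

def find_alpha(F, Fp, gamma, eps=1e-10, max_iter=100):
    """Safeguarded search for F(alpha) = 0 in the spirit of Subalgorithm 4.1:
    quadratic interpolation (4.5)-(4.7), the secant fallback (4.4), a bisection
    refinement, and guarded Newton steps (4.8).  Assumes F(0) > 0, F(gamma) < 0."""
    beta = 0.0
    for _ in range(max_iter):
        F1, F1p, F2 = F(beta), Fp(beta), F(gamma)
        h = gamma - beta
        A = (F2 - F1 - F1p * h) / h**2                      # (4.6)
        disc = F1p**2 - 4.0 * A * F1
        if A != 0.0 and disc >= 0.0:
            a = beta + (-F1p - np.sqrt(disc)) / (2.0 * A)   # relevant root of (4.7)
        else:
            a = None
        if a is None or not (beta < a < gamma):
            a = (beta * F2 - gamma * F1) / (F2 - F1)        # secant step (4.4)
        for _ in range(20):                  # guarded Newton iterations (4.8):
            a_new = a - F(a) / Fp(a)         # stop as soon as the iterate leaves
            if not (beta < a_new < gamma) or abs(F(a_new)) >= abs(F(a)):
                break                        # (beta, gamma) or |F| stops decreasing
            a = a_new
        if abs(F(a)) <= eps:
            return a
        if F(a) < 0.0:
            gamma = a
        else:
            beta = a
        a = 0.5 * (beta + gamma)             # bisection refinement
        if abs(F(a)) <= eps:
            return a
        if F(a) < 0.0:
            gamma = a
        else:
            beta = a
    return 0.5 * (beta + gamma)

# Hypothetical test function with root alpha = 0.4: F(a) = 1 + a/4 - sqrt(0.81 + a).
F = lambda a: 1.0 + 0.25 * a - np.sqrt(0.81 + a)
Fp = lambda a: 0.25 - 0.5 / np.sqrt(0.81 + a)
print(find_alpha(F, Fp, gamma=1.0))          # approximately 0.4
```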
Algorithm 1.2, suggested in Chapter 1, was coded in Fortran 77 on a SPARCstation. The norm considered is the l^p-norm

    ||x||_p = (Σ_i |x_i|^p)^{1/p}

for various values of p. The dual norm is the ||.||_q norm, q = p/(p − 1). The stopping rule parameter within which we consider the tolerance for the duality gap as acceptable is ε = 10⁻⁶. The code was run in double precision. We calculated the sequence (α_k) using the subalgorithm outlined in this chapter. We also incorporated a Newton iteration scheme, as discussed earlier, to determine each α_k.

The coded version of the main algorithm seems to do much better for p ≥ 2 than for p in the range (1, 2). When p > 2 and not far from 2, the convergence compares well with [20]. We also found that the sequence ((b, y_k)) increases and the duality gap decreases monotonically. However, in some cases the sequence (α_k) poses more problems; e.g., it may oscillate, with subsequences approaching two different limits. The case 1 < p < 2 does not do as well. For example, for p = 1.8 a significantly larger number of iterations was needed to reach the same acceptable tolerance of 10⁻⁶. The sequence ((b, y_k)) still increased monotonically, and the duality gap decreased in the same way. In conclusion, compared to the algorithm in [20], the present algorithm seems to perform well for values of p larger than 2, but not as well for p in the range (1, 2) with regard to the number of iterations.

The following linear system is taken from Barrodale and Young [1] and was used by Sreedharan [20] and others ([8], [10]) for 1 < p < ∞:

     1.52 x₁        = 1.025
          x₁ +  x₂ =
          x₁ + 2x₂ = 0.475
          x₁ + 3x₂ = 0.01
          x₁ + 4x₂ = −0.475
    −1.005 x₁ + 5x₂ =

We record the results in the following table.

      p     x₁         x₂         ρ          iterations
      5.0   0.258716   0.000334   1.472222   149
      4.5   0.253714   0.001339   1.507273    59
      4.0   0.261537   0.000031   1.546597    28
      3.8   0.261783   0.000000   1.568348    27
      3.5   0.261952   0.000000   1.607494    24
      3.0   0.261791   0.000000   1.697914    20
      2.5   0.260704   0.000000   1.842740    15
      2.0   0.258333   0.000000   2.102851     1
      1.8   0.239018   0.0019     2.280894   165

                  Table 4.1: 1.8 ≤ p ≤ 5

Algorithm 1.2 was also run, with the same Fortran 77 code, for various values of λ_k = 1/θ_k. The results are presented for p = 3 in Table 4.2; the tolerance parameter is ε = 10⁻⁶. For p = 3, the numerical results suggest that the algorithm converges faster for λ_k near 1/2; the convergence tends to be slower for λ_k far away from 1/2. We used the same example as above.

      θ_k    x₁           x₂    ρ            iterations
       1.8   0.26179422   0.0   1.69791476    16
       1.9   0.26179173   0.0   1.69791476    18
       2.0   0.26179170   0.0   1.69791476    20
       2.2   0.26179069   0.0   1.69791476    20
       2.5   0.26179026   0.0   1.69791476    23
       2.8   0.26179038   0.0   1.69791476    26
       3.0   0.26179090   0.0   1.69791476    29
       4.0   0.26179037   0.0   1.69791476    39
       5.0   0.26179041   0.0   1.69791476    50
      10.0   0.26179035   0.0   1.69791476   104

                     Table 4.2: p = 3
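For completeness, the duality-gap stopping rule mentioned above reduces to a one-line test. The particular form of the gap shown here, ||b − A x_k||_p − (b, y_k), is an assumption inferred from the duality framework of this thesis, not a quotation of the Fortran code.

```python
import numpy as np

def duality_gap_ok(A, b, x_k, y_k, p, eps=1e-6):
    """Hypothetical form of the stopping test: accept the iterate when the
    duality gap ||b - A x_k||_p - (b, y_k) falls below the tolerance eps."""
    return np.linalg.norm(b - A @ x_k, ord=p) - b @ y_k <= eps
```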
Bibliography

[1] Barrodale, I. and Young, A., Algorithms for best L₁ and L∞ linear approximation on a discrete set, Numer. Math., 8, (1966), 295-306.

[2] Cheney, E. W., "Introduction to Approximation Theory", 2nd ed., McGraw-Hill, New York, 1982.

[3] Fletcher, R., Grant, J. A. and Hebden, M. D., The calculation of linear best Lp approximation, Computer J., 14, (1971), 276-279.

[4] Gill, P. E., Murray, W., Saunders, M. A. and Wright, M. H., Linearly constrained optimization, in Nonlinear Optimization 1981, M. J. D. Powell, ed., Academic Press, London (1982), 123-139.

[5] Lawson, C. L. and Hanson, R. J., "Solving Least Squares Problems", Prentice-Hall, Englewood Cliffs, NJ (1974).

[6] Nikolopoulos, P. V. and Sreedharan, V. P., An algorithm for computing non-negative minimal norm solutions, Numer. Funct. Anal. and Optimiz., to appear.

[7] Nirenberg, L., "Functional Analysis", Lectures given in 1960-61, notes by Lesley Sibner, New York University, (1961).

[8] Owens, R. W., An algorithm for best approximate solutions of Ax = b with a smooth strictly convex norm, Numer. Math., 29, (1977), 83-91.

[9] Owens, R. W. and Sreedharan, V. P., An algorithm for approximation by elements of a cone in a Banach space, Numer. Funct. Anal. and Optimiz., 10, (1989), 1161-1189.

[10] Owens, R. W. and Sreedharan, V. P., Least squares methods to minimize errors in a smooth, strictly convex norm, J. Approx. Theory, 2, (1993), 180-198.

[11] Polak, E., "Computational Methods in Optimization", Academic Press, New York, NY (1971).

[12] Rockafellar, R. T., "Convex Analysis", Princeton University Press, Princeton, NJ, 1969.

[13] Späth, H., "Mathematical Algorithms for Linear Regression", Academic Press, Orlando, FL, in press.

[14] Sreedharan, V. P., Least squares algorithms for finding solutions of overdetermined linear equations which minimize error in an abstract norm, Numer. Math., 17, (1971), 387-401.

[15] Sreedharan, V. P., Least squares algorithms for finding solutions of overdetermined systems of linear equations which minimize error in a smooth strictly convex norm, J. Approx. Theory, 8, (1973), 46-61.

[16] Sreedharan, V. P., A subgradient projection algorithm, J. Approx. Theory, 35, (1982), 111-126.

[17] Sreedharan, V. P., Subgradient projection algorithm II, J. Approx. Theory, 41, (1984), 217-243.

[18] Sreedharan, V. P., Extensions of subgradient projection algorithms, J. Approx. Theory, 47, (1986), 228-239.

[19] Sreedharan, V. P., ε-Subgradient projection algorithm, J. Approx. Theory, 51, (1987), 27-46.

[20] Sreedharan, V. P., An algorithm for non-negative norm minimal solutions, Numer. Funct. Anal. and Optimiz., 9, (1987), 193-232.

[21] Watson, G. A., "Approximation Theory and Numerical Methods", John Wiley and Sons, New York, NY (1980).

[22] Zangwill, W. I., "Nonlinear Programming: A Unified Approach", Prentice Hall, Englewood Cliffs, NJ, (1969).

[23] Zoutendijk, G., "Methods of Feasible Directions", Elsevier, Amsterdam (1960).