LEAST DISTANCE ALGORITHMS FOR LINEAR SYSTEMS

By

Said Bahi

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the Degree of DOCTOR OF PHILOSOPHY

Department of Mathematics

1993

This is to certify that the dissertation entitled "Least Distance Algorithms for Linear Systems", presented by Said Bahi, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Mathematics.

ABSTRACT

LEAST DISTANCE ALGORITHMS FOR LINEAR SYSTEMS

By Said Bahi

In this thesis, we investigate numerical methods for three classes of approximation problems. We begin with the study of methods for approximating a non-negative solution of an overdetermined system of linear equations. We define a best approximate solution of the system $Ax = b$, $x \ge 0$, to be the vector $x \ge 0$ which minimizes the norm of the residual $r(x) = b - Ax$, for a strictly convex and smooth norm. We then consider a system of linear equations $Ax = b$ which has a non-negative solution. We present a method for computing, amongst all these non-negative solutions, the one of least norm when the space $R^n$ is equipped with a strictly convex norm. Finally, given a system of linear inequalities $Ax \ge b$, we suggest an algorithm that settles the question of the feasibility of the system and, if it is feasible, computes the unique solution of minimal norm. For each of these three approximation problems, we study a dual problem whose solution leads to the characterization of the solutions of the original problem. Algorithms for computing the solutions are presented and their convergence proved. Numerical results for different $l^p$ approximation problems are discussed.

ACKNOWLEDGMENT

I would like to thank my advisor, Professor V.P. Sreedharan, for his guidance in the research leading to this thesis and for all the discussions we have had on this particular subject and many others. I am especially grateful for his enthusiasm, which has made the last two years so enjoyable. I am also grateful to the members of my guidance committee, Professors D. Dunninger, G. Ludden, W. Sledd and D. Yen. I further thank the faculty and staff of the Department of Mathematics at Michigan State University, in particular Mrs. Barbara Miller, for their support of my effort. Finally, I wish to express my gratitude to Dr. D. Horner, Mrs. Elda Keaton and the staff of the International Center at Michigan State for their help, and to Mrs. Monique Bidaoui.

Contents

0 Introduction
1 A Least Distance Algorithm For A Smooth Strictly Convex Norm
2 An Algorithm For Minimal Norm Solution Of A Linear System
3 On The Minimum Norm Solution Of A System Of Linear Inequalities
4 Numerical Results
List of References

Chapter 0. Introduction

Least squares problems have received a considerable amount of attention from different authors, either for the special case of the Euclidean norm or for the more general case where the norm considered is one of the $l^p$-norms, $1 \le p \le \infty$.
We shall study numerical methods for computing solutions of the problem that we state generally as follows. Let a system of linear equations (resp. linear inequalities) be given and let $\|\cdot\|$ be a norm. If the system is overdetermined, find the "best" non-negative solutions, those which minimize the $\|\cdot\|$-error. If the system is feasible, find the unique solution of minimal norm. This problem will be considered for general abstract norms; for more practical use, this work applies to the interesting case of the $l^p$-norms. First, we restate the original problem, specifying the stages of this thesis, as follows.

Problem 1. Given a linear system $Ax = b$ that has no non-negative exact solution, find $x \ge 0$ for which $Ax$ is closest to $b$.

Problem 2. Given the system of linear equations $Ax = b$, $x \ge 0$, which is assumed feasible, find the vector $x \ge 0$ that solves the system and has minimal norm.

Problem 3. Consider the system of linear inequalities $Ax \ge b$. Find the solution $x$ with least norm.

In Problem 3 we make no feasibility assumption; incidentally, the proposed method will answer the question of whether the system is consistent or not. Our aim is to propose implementable, convergent algorithms that solve these problems. After the algorithms are specified, their feasibility and convergence are proved for any smooth, strictly convex norm. We test these algorithms for the usual $l^p$-norm, $1 < p < \infty$,

$$\|x\|_p = \Big(\sum_i |x_i|^p\Big)^{1/p}. \qquad (0.1)$$

In the first chapter we are concerned with Problem 1. Let $A$ be an $m \times n$ real matrix and let $b$ be a real $m$-vector such that the system $Ax = b$, $x \ge 0$, has no exact solution. Let the norm $\|\cdot\|$ on $R^m$ be smooth and strictly convex. We seek $x \ge 0$ solving

$$\|b - Ax\| \ (\min), \quad x \ge 0. \qquad (0.2)$$

A more general formulation of this problem can be stated as follows: given a closed convex cone $K$ in a finite-dimensional Banach space $X$ with a smooth, strictly convex norm $\|\cdot\|$ and a point $b \in X \setminus K$, find $z \in K$ that minimizes the distance $\|b - z\|$, $z \in K$. We point out also that our results, including our algorithm, can be formulated in infinite-dimensional spaces, as was done in [9].

This problem has been extensively studied for the norms $\|\cdot\|_2$, $\|\cdot\|_1$ and $\|\cdot\|_\infty$; the last is referred to as the Chebyshev problem. Since the $l^1$ and $l^\infty$ norms fail to be strictly convex or smooth, differentiable optimization techniques cannot be applied, and special methods have been developed to handle these problems. In each of these cases, thanks to linear programming, the special structure of the norm may be used; see for example Barrodale and Young [1]. The case of the $l^2$ norm,

$$\|b - Ax\|_2 \ (\min), \quad x \ge 0,$$

has received the most consideration. For a more complete discussion see Lawson and Hanson [5], where an efficient finite algorithm called the NNLS algorithm, together with a Fortran implementation, is proposed. With no sign constraints, the problem $\|b - Ax\|_p$ (min) for any $p$, $1 < p < \infty$, has also been studied by different authors; see Fletcher, Grant and Hebden [3], Owens [8], Owens and Sreedharan [10], Sreedharan [15], Spath [13] and Watson [21]. It presents less difficulty, since the set of approximants $K = \{ Ax \mid x \in R^n \}$ is a linear subspace. If the condition $x \ge 0$ is added, the set of approximants is no longer a linear subspace: in this case we seek the $y$ in the cone

$$K = \{ Ax \mid x \ge 0, \ x \in R^n \}$$

which is closest to $b$.
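For orientation, here is Problem 1 in the $l^2$ case, solved with SciPy's implementation of the Lawson and Hanson NNLS algorithm cited above. This is a minimal sketch; the matrix $A$ and vector $b$ are made-up illustrative data.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))   # overdetermined: m = 8 > n = 3
b = rng.standard_normal(8)

x, rnorm = nnls(A, b)             # minimizes ||b - Ax||_2 subject to x >= 0
print("x =", x)                   # non-negative by construction
print("||b - Ax||_2 =", rnorm)
```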
Sreedharan in [20] developed an algorithm for solving this approximation problem for any smooth, strictly convex norm, with applications to the minimization problem

$$\min \{ \|b - Ax\| \mid x \in R^n, \ x \ge 0 \}.$$

In [20] a duality theory was developed and a characterization of the solution was given. The algorithm we propose in chapter 1 uses these ideas. We give a dual problem and review the related duality theory. After stating the algorithm, we show that all its steps are valid, and then the proof of convergence is given. It is worth mentioning that our algorithm does not depend explicitly on the special structure of the $l^p$ norm, but only on its strict convexity and smoothness. Numerical results for the norms $\|\cdot\|_p$, for various $p$, are presented in chapter 4.

General algorithms (such as in Polak [11], Zangwill [22], Zoutendijk [23]) that minimize smooth, strictly convex functions subject to linear constraints could be used to minimize the function $f(x) = \|b - Ax\|_p$ subject to $x \ge 0$. Or use may be made of the nonsmooth optimization algorithms in Sreedharan ([16], [17], [18], [19]) and others. All of these algorithms, to guarantee global convergence, require their built-in "anti-jamming" precautions, that is, procedures that circumvent the possibility of the generated sequence clustering at, or even converging to, non-optimal points. In contrast to these, the algorithm presented here for the solution of the problem at hand has no anti-jamming precautions, because none is needed, and it is globally convergent.

General algorithms like those in Gill, et al. [4] deal with the optimization of a twice continuously differentiable objective function subject to linear constraints, and as such are not applicable to our problem, since in general our objective function, the norm, need not be twice differentiable throughout $R^m \setminus \{0\}$. In particular, for the $l^p$ norms with $1 < p < 2$, there are non-zero points in $R^m$, $m \ge 2$, where the norms fail to be twice differentiable. Despite this lack of differentiability, one may try to apply second order methods, whose validity is predicated on the continuity of the Hessian of the objective function. Numerical answers obtained on specific examples are then checked a posteriori to see whether they are the "right" answers. Such tests lend credence to the applicability of second order methods even when the relevant hypotheses are violated, but such findings do not prove the convergence of the algorithm. We, however, in the present work, give a proof of convergence of our algorithm, which includes the $l^p$ case, $1 < p < 2$.

In chapter 2, we consider the following minimization problem:

$$\|x\| \ (\min), \quad Ax = b, \ x \ge 0.$$

The norm is assumed to be smooth and strictly convex, and the system $Ax = b$ is assumed to have a non-negative exact solution. This problem has been studied for the $l^2$-norm with linear inequality constraints in [5], and for general norms in [6]. Lawson and Hanson in [5] gave an algorithm, called the LDP (Least Distance Programming) algorithm, that solves the problem

$$\|x\|_2 \ (\min), \quad Ax \ge b.$$

It uses a transformation of the problem into an NNLS (Non-Negative Least Squares) problem; a Fortran code was also presented. As we pointed out in the discussion of Problem 1, general purpose algorithms may also be applied to solve Problem 2 when the norm considered is the $l^p$-norm. However, the objective function $f(x) = \|x\|_p$ poses the same differentiability problems noted above for $1 < p < 2$.
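To make the "general purpose" approach concrete, the following sketch minimizes $f(x) = \|b - Ax\|_p^p$ over $x \ge 0$ with a bound-constrained quasi-Newton method. This is exactly the kind of method the thesis contrasts with: only first derivatives are needed (and these exist for $1 < p < \infty$), but no special guarantees are available near points where the norm is not twice differentiable. The data and helper names are ours.

```python
import numpy as np
from scipy.optimize import minimize

def make_objective(A, b, p):
    def f_and_grad(x):
        r = b - A @ x
        f = np.sum(np.abs(r) ** p)
        # d/dx sum |r_i|^p = -p * A^T (|r|^(p-1) * sign(r))
        g = -p * (A.T @ (np.abs(r) ** (p - 1) * np.sign(r)))
        return f, g
    return f_and_grad

rng = np.random.default_rng(1)
A, b, p = rng.standard_normal((10, 4)), rng.standard_normal(10), 1.5
res = minimize(make_objective(A, b, p), np.ones(4), jac=True,
               bounds=[(0, None)] * 4, method="L-BFGS-B")
print(res.x, res.fun ** (1 / p))   # minimizer and its l^p residual norm
```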
Moreover, as was outlined in [6], during machine implementation some precautions would have to be taken to circumvent the possibility that the generated sequence of approximations clusters at, or converges to, a non-optimal solution, even though the convergence of the sequence is proved in theory. On the other hand, our statements, including the proposed algorithm, do not depend explicitly on the norm, need no extra differentiability of the objective function, and the global convergence of the algorithm is proved. This work was motivated by similar algorithms in [6] and [20]. A major step in this algorithm involves the solution, at each iteration cycle $k$, of an $l^2$ problem

$$Ax = b, \quad x \ge 0, \quad \|x - a_k\|_2 \ (\min),$$

where $a_k$ is defined at the beginning of each iteration. The feasibility of the algorithm will be studied and its convergence proved.

Next, we examine in chapter 3 the problem

$$\|x\|_p \ (\min), \quad Ax \ge b,$$

for any $p$, $1 < p < \infty$. Lawson and Hanson studied a similar problem when $p = 2$; our work is an extension of the so-called LDP algorithm in [5]. This problem is solved via the algorithm of chapter 1. A dual problem will be presented and its correspondence with the original problem studied. A related least distance problem will be given and used, as mentioned above, to find the solution to our problem. After presenting the algorithm, we prove its feasibility and that it computes the solution if one exists. The fact that no assumption regarding the feasibility of the system $Ax \ge b$ is involved explains the usefulness of this algorithm; moreover, its implementation is particularly simple.

In chapter 4, some numerical results are given and a subalgorithm is presented. These results show that the coded algorithm does well compared to other algorithms in the literature, for different $p$-norms, when $p$ is in the range $(2, \infty)$ and not far from 2.

Chapter 1. A Least Distance Algorithm For A Smooth Strictly Convex Norm

We study in this chapter the system of linear equations

$$Ax = b, \qquad (1.1)$$

where $x$ is a non-negative $n$-vector in $R^n$, $A$ is an $m \times n$ real matrix, and $b$ is in $R^m$. Let $\|\cdot\|$ be a smooth, strictly convex norm on $R^m$. In many applications the system above is overdetermined; in this case, one seeks to minimize the error $r(x) = b - Ax$. We shall be concerned with the following problem:

$$(P) \quad \min \{ \|b - Ax\| \mid x \in R^n, \ x \ge 0 \}.$$

Our objective is to give an implementable algorithm to solve (P) and prove its convergence. In practice the proposed algorithm computes simultaneously a solution of (P), a solution of its dual (P'), and their common value. At each iteration cycle, an approximation to the solution of (P) is computed and then used to find an approximation of (P'); the latter enables us to compute a new, improved approximation of (P).

In [20], Sreedharan suggested a dual problem (P'), and the relation between problems (P) and (P') was studied. We review briefly problem (P').

We begin by establishing some notation and definitions. Given two vectors $x$ and $y$ in $R^n$, the usual inner product is denoted by $(\cdot,\cdot)$:

$$(x, y) = \sum_i x_i y_i.$$

A partial order in $R^n$ is defined as follows: for any $u = (u_1, \ldots, u_n) \in R^n$, $u \ge 0$ iff $u_i \ge 0$, $i = 1, \ldots, n$. If $A$ is a matrix, its transpose is denoted by $A^T$. Associated with the above inner product is the usual Euclidean norm $\|\cdot\|_2$. In formulating the duality theorems, we use the well-known notion of dual norm. Let $\|\cdot\|$ be a norm on $R^m$, $m \ge 1$; the dual norm $\|\cdot\|'$ on $R^m$ is defined by

$$\|y\|' = \max \{ (x, y) \mid \|x\| = 1 \}.$$

For $v \ne 0$, a $\|\cdot\|$-dual vector $v'$ satisfies $\|v'\|' = 1$ and $(v', v) = \|v\|$; a $\|\cdot\|'$-dual vector $v^*$ satisfies $\|v^*\| = 1$ and $(v^*, v) = \|v\|'$. When the norm is smooth and strictly convex these dual vectors are unique, and the maps $v \mapsto v'$ and $v \mapsto v^*$ are odd, continuous and positively homogeneous of degree zero on $R^m \setminus \{0\}$. Let $K = \{ Ax \mid x \ge 0, \ x \in R^n \}$ and let $K^\circ = \{ z \in R^m \mid A^T z \le 0 \}$ be its polar cone.

Theorem 1.1 ([20]) The problems

$$\min \{ \|b - z\| \mid z \in K \}, \qquad (1.6)$$

$$\max \{ (b, z) \mid z \in K^\circ, \ \|z\|' = 1 \}, \qquad (1.7)$$

have the same value for any norm $\|\cdot\|$ on $R^m$.
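The dual-norm machinery just introduced is easy to check numerically for the $l^p$ norms, where $\|\cdot\|' = \|\cdot\|_q$ with $p + q = pq$ and the $\|\cdot\|$-dual vector of $v$ is $v'_i = (|v_i|/\|v\|_p)^{p-1}\,\mathrm{sgn}\,v_i$. A minimal sketch, with helper names of our choosing:

```python
import numpy as np

def lp_norm(v, p):
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

def dual_exponent(p):
    return p / (p - 1.0)          # q with 1/p + 1/q = 1

def dual_vector(v, p):
    """||.||_p-dual v' of a non-zero v: ||v'||_q = 1 and (v', v) = ||v||_p."""
    return (np.abs(v) / lp_norm(v, p)) ** (p - 1.0) * np.sign(v)

v, p = np.array([3.0, -1.0, 2.0]), 1.5
q = dual_exponent(p)
vp = dual_vector(v, p)
print(lp_norm(vp, q))             # ~ 1
print(vp @ v, lp_norm(v, p))      # equal: (v', v) = ||v||_p
```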
From this, the following corollary is immediate.

Corollary 1.2 Let $R^m$ be equipped with a norm $\|\cdot\|$. Then the dual pair (P) and (P') have the same value.

An important result in [20] will be used to solve (P). It characterizes the solutions of (P) in the case when the norm $\|\cdot\|$ is strictly convex and the system $Ax = b$ has no non-negative exact solution.

Theorem 1.3 Let the norm $\|\cdot\|$ be strictly convex. The following are equivalent.

(i) $\bar x \ge 0$ is a non-negative least error $\|\cdot\|$ solution of the system $Ax = b$.

(ii) There exists $y \in R^m$, $\|y\|' = 1$, $A^T y \le 0$, such that

$$A\bar x = b - (b, y)\,y^*. \qquad (1.8)$$

Moreover, $(y, A\bar x) = 0$ and $y$ solves (P'). If in addition the norm $\|\cdot\|$ is smooth, then (i) and (ii) are equivalent to

(iii) Let $r = b - A\bar x$; then

$$A^T r' \le 0, \qquad (1.9)$$

and

$$(r', A\bar x) = 0. \qquad (1.10)$$

An immediate consequence of the theorem follows easily when $\|\cdot\| = \|\cdot\|_2$. In this case clearly $r' = r/\|r\|_2$ for any $r \ne 0$. The vector $x \ge 0$ is a solution of (P) in the $l^2$-sense iff the residual $r = b - Ax$ satisfies

$$A^T r \le 0 \quad \text{and} \quad (r, Ax) = 0. \qquad (1.11)$$

Now we give our algorithm for solving (P). It is assumed that the norm $\|\cdot\|$ is smooth and strictly convex and that the system $Ax = b$, $x \ge 0$, has no exact solution. The main results are the feasibility of the algorithm and the theorem concerning its convergence; proofs that all the steps of the algorithm are valid will also be given.

Algorithm 1.1

Step 0. Find $x_0$, an NNLS solution of $Ax = b$. Let $r_0 = b - Ax_0$, $y_0 = r_0/\|r_0\|'$, $k = 0$.

Step 1. Find $x_k$, an NNLS solution of $Ax = b - (b, y_k)\,y_k^*$. Let $r_k$ be the residual $r_k = b - (b, y_k)\,y_k^* - Ax_k$.

Step 2. If $r_k = 0$, GO TO step 8; else continue.

Step 3. If $(b, r_k) \ge (b, y_k)\|r_k\|' + \tfrac14\|r_k\|_2^2$, set $y_{k+1} = r_k/\|r_k\|'$ and GO TO step 7.

Step 4. Let $\mu_k = \big((b, r_k) - \tfrac12\|r_k\|_2^2\big)/(b, y_k)$.

Step 5. Find $\alpha_k > 0$ such that $\|y_k + \alpha_k r_k\|' = 1 + \alpha_k\mu_k$.

Step 6. Set $y_{k+1} = (y_k + \tfrac12\alpha_k r_k)/\|y_k + \tfrac12\alpha_k r_k\|'$.

Step 7. Increase $k$ by 1 and RETURN to step 1.

Step 8. $x_k$ is a non-negative least error $\|\cdot\|$ solution of $Ax = b$. The computation is complete.

We make a few observations. A crucial statement in the algorithm is step 5, defining the number $\alpha_k$. Any efficient algorithm for finding the zeros of a function can be used in this situation; later (chapter 4) we suggest a procedure to compute the $\alpha_k$ occurring in step 5. The stopping criterion given in step 2 is a very convenient one in the proof of convergence. From a computational point of view, a more practical stopping rule is to require that the so-called relative duality gap go to zero. So, in writing a routine to implement the algorithm, step 2 is replaced by

Step 2'. If $\big(\|b - Ax_k\| - (b, y_k)\big)/\|b - Ax_k\| \le \epsilon$, where $\epsilon$ is a fixed stopping rule parameter, GO TO step 8 (the computation is complete); else proceed.

All the other steps of the algorithm remain unchanged.

Remark. Later, in proposition 1.6, we shall show that the sequence $((b, y_k))$ generated by the algorithm is strictly increasing. We claim that $(b, y_k) > 0$ for all $k$. To see this, let $r_0$ be the NNLS residual of the system $Ax = b$, $x \ge 0$, as defined in step 0. Recall that the system is assumed to have no non-negative exact solution, so $r_0 \ne 0$. From equation (1.11) we get

$$(b, r_0) = \|r_0\|_2^2 > 0.$$

Once more, by step 0 of the algorithm, $y_0 = r_0/\|r_0\|'$, so that $(b, y_0) > 0$, which, together with monotonicity, proves the claim.

We now turn our focus to the convergence theory. We begin by proving the validity of the various steps of the algorithm.
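Before the analysis, here is a compact transcription of Algorithm 1.1 for the $l^p$ norm, reusing the $l^p$ helpers above together with SciPy's nnls and brentq. The $\|\cdot\|'$-dual vector is $y^*_i = (|y_i|/\|y\|_q)^{q-1}\,\mathrm{sgn}\,y_i$. This is an illustrative sketch, not the author's code; in particular the bracketing strategy for step 5 (tiny left endpoint, doubling right endpoint) is our own assumption.

```python
import numpy as np
from scipy.optimize import nnls, brentq

def lp_norm(v, p):
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

def star_dual(y, q):
    """||.||'-dual vector y*: ||y*||_p = 1 and (y*, y) = ||y||_q."""
    return (np.abs(y) / lp_norm(y, q)) ** (q - 1.0) * np.sign(y)

def algorithm_1_1(A, b, p, eps=1e-8, maxit=500):
    q = p / (p - 1.0)
    x, _ = nnls(A, b)                                # step 0
    r = b - A @ x
    y = r / lp_norm(r, q)
    for _ in range(maxit):
        beta = b @ y
        c = b - beta * star_dual(y, q)
        x, _ = nnls(A, c)                            # step 1
        r = c - A @ x
        res = lp_norm(b - A @ x, p)                  # step 2': duality gap
        if res - beta <= eps * res:
            return x, y
        if b @ r >= beta * lp_norm(r, q) + 0.25 * (r @ r):   # step 3
            y = r / lp_norm(r, q)
            continue
        mu = (b @ r - 0.5 * (r @ r)) / beta          # step 4
        F = lambda a: 1 + a * mu - lp_norm(y + a * r, q)     # step 5
        hi = 1.0
        while F(hi) > 0:                             # F < 0 for large a
            hi *= 2.0
        alpha = brentq(F, 1e-12, hi)
        z = y + 0.5 * alpha * r                      # step 6
        y = z / lp_norm(z, q)
    return x, y
```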
In the next theorem, we prove that under the assumption that the primal norm $\|\cdot\|$ is smooth, the number $\alpha_k > 0$ defined by step 5 always exists.

Theorem 1.4 Let the norm $\|\cdot\|$ on $R^m$ be smooth. Then, if the algorithm is at the stage of entering step 5, there exists $\alpha_k > 0$ such that

$$\|y_k + \alpha_k r_k\|' = 1 + \alpha_k\mu_k, \qquad (1.12)$$

where $\mu_k$ is defined by

$$\mu_k = \big((b, r_k) - \tfrac12\|r_k\|_2^2\big)/(b, y_k).$$

Proof. Since the norm $\|\cdot\|$ is smooth, the dual norm is strictly convex. Consider the strictly convex function $f : R \to R$ defined by

$$f(\lambda) = \|y_k + \lambda r_k\|' \qquad (1.13)$$

and let $l(\lambda) = 1 + \lambda\mu_k$. Then $f(0) = l(0) = 1$, $l'(0) = \mu_k$ and $f'(0) = (y_k^*, r_k)$. (For details about $f'$, see [14].) We note that the algorithm enters step 5 only if $r_k \ne 0$. By step 1 and (1.11), we obtain

$$\|r_k\|_2^2 = (b, r_k) - (b, y_k)(y_k^*, r_k). \qquad (1.14)$$

Using (1.14), we have

$$f'(0) - l'(0) = (y_k^*, r_k) - \mu_k = \big((b, y_k)(y_k^*, r_k) - (b, r_k) + \tfrac12\|r_k\|_2^2\big)/(b, y_k) = -\tfrac12\|r_k\|_2^2/(b, y_k) < 0.$$

This implies that $f(\alpha) - l(\alpha) < 0$ for some $\alpha > 0$. Note that step 5 is executed only if step 3 is answered negatively, i.e.,

$$(b, r_k) < (b, y_k)\|r_k\|' + \tfrac14\|r_k\|_2^2. \qquad (1.15)$$

In this case, since $\|y_k\|' = 1$, for any $\lambda > 0$ we have

$$f(\lambda) - l(\lambda) = \|\lambda r_k + y_k\|' - \lambda\mu_k - 1 \ge \lambda\|r_k\|' - \|y_k\|' - \lambda\mu_k - 1 = \lambda\big(\|r_k\|'(b, y_k) - (b, r_k) + \tfrac12\|r_k\|_2^2\big)/(b, y_k) - 2 > \lambda\,\|r_k\|_2^2/\big(4(b, y_k)\big) - 2,$$

due to (1.15). Hence $f(\lambda) - l(\lambda)$ goes to $\infty$ as $\lambda \to \infty$, so there exists $\lambda > 0$ such that $f(\lambda) - l(\lambda) > 0$. Because the function $f(\lambda) - l(\lambda)$ is continuous, there exists $\alpha_k > 0$ such that (1.12) holds as claimed, and the proof is complete.

Proposition 1.5 Let $r_k$ and $y_k$ be as defined in the algorithm. Then

(a) $y_k + \alpha r_k \ne 0$ for all $\alpha$, if $r_k \ne 0$;

(b) $A^T y_k \le 0$ and $\|y_k\|' = 1$, for all $k$.

Proof. To prove (a), suppose $y_k + \alpha r_k = 0$ for some $\alpha$; since $\|y_k\|' = 1$, necessarily $\alpha \ne 0$, so $r_k = a y_k$ with $a = -1/\alpha$. The algorithm defines $r_k$ by $r_k = b - (b, y_k)y_k^* - Ax_k$. Because of the second half of (1.11), we get

$$\|r_k\|_2^2 = \big(b - (b, y_k)y_k^*, \, r_k\big). \qquad (1.16)$$

Hence

$$\|r_k\|_2^2 = a(b, y_k) - a(b, y_k)(y_k^*, y_k) = 0,$$

since $(y_k^*, y_k) = \|y_k\|' = 1$. This implies $r_k = 0$, a contradiction.

To prove (b), once more by (1.11) we have $A^T r_0 \le 0$, so that by step 0 of the algorithm $A^T y_0 \le 0$. A simple induction shows that $A^T y_k \le 0$ and $\|y_k\|' = 1$ for all $k$. The proof is complete.

The next result shows that the sequence $(y_k)$ built by the algorithm leads to an improved approximation of the value of (P') after each iteration cycle.

Proposition 1.6 Let $(y_k)$ be the sequence generated by the algorithm. Then the sequence $((b, y_k))$ is strictly increasing and convergent.

Proof. Two cases have to be considered. Either (a) the question in step 3 is answered affirmatively, in which case

$$(b, r_k) \ge (b, y_k)\|r_k\|' + \tfrac14\|r_k\|_2^2, \qquad (1.17)$$

or (b) the question in step 3 is answered negatively, so that

$$(b, r_k) < (b, y_k)\|r_k\|' + \tfrac14\|r_k\|_2^2. \qquad (1.18)$$

It is easy to see that case (a) yields a strict increase. By step 3 of the algorithm, $y_{k+1} = r_k/\|r_k\|'$. Inequality (1.17) implies

$$(b, y_{k+1}) = (b, r_k)/\|r_k\|' \ge (b, y_k) + \tfrac14\|r_k\|_2^2/\|r_k\|'.$$

Thus $(b, y_{k+1}) > (b, y_k)$ (since $r_k \ne 0$), as desired.

We now take up the second case (b). Because the norm $\|\cdot\|$ is smooth, the dual norm $\|\cdot\|'$ is strictly convex, and by proposition 1.5(a), $y_k$ and $r_k$ are linearly independent. We have

$$\|y_k + \tfrac12\alpha_k r_k\|' = \|\tfrac12(y_k + \alpha_k r_k) + \tfrac12 y_k\|' < \tfrac12\|y_k + \alpha_k r_k\|' + \tfrac12\|y_k\|' = 1 + \tfrac12\alpha_k\mu_k \quad \text{(due to step 5)}.$$

We conclude that

$$(b, y_k)\,\|y_k + \tfrac12\alpha_k r_k\|' < (b, y_k) + \tfrac12\alpha_k(b, r_k) - \tfrac14\alpha_k\|r_k\|_2^2, \qquad (1.19)$$

which, in view of step 6, implies

$$(b, y_k) < \big((b, y_k + \tfrac12\alpha_k r_k) - \tfrac14\alpha_k\|r_k\|_2^2\big)/\|y_k + \tfrac12\alpha_k r_k\|'$$
$$< (b, y_{k+1}),$$

as claimed. To prove the convergence of the sequence $((b, y_k))$, we simply observe that because $\|y_k\|' = 1$, we have

$$(b, y_k) \le \|b\| \cdot \|y_k\|' = \|b\|.$$

The proof is complete.

The next result will be needed in the proof of the convergence theorem. It also gives further insight into the existence and uniqueness of the number $\alpha_k > 0$ defined in step 5.

Lemma 1.7 Let $\alpha_k > 0$ be as defined by step 5 of the algorithm. Then $\alpha_k$ satisfies the inequality

$$\big((y_k + \alpha_k r_k)^*, r_k\big) \ \ge\ \big((b, r_k) - \tfrac12\|r_k\|_2^2\big)/(b, y_k). \qquad (1.20)$$

Proof. By the definition of the dual vector, we have

$$\|y_k + \alpha_k r_k\|' = \big((y_k + \alpha_k r_k)^*, y_k\big) + \alpha_k\big((y_k + \alpha_k r_k)^*, r_k\big). \qquad (1.21)$$

Using the inequality

$$\big((y_k + \alpha_k r_k)^*, y_k\big) \le \|(y_k + \alpha_k r_k)^*\| \cdot \|y_k\|' = 1,$$

and (1.12), we obtain

$$1 + \alpha_k\big((b, r_k) - \tfrac12\|r_k\|_2^2\big)/(b, y_k) = \|y_k + \alpha_k r_k\|' \le 1 + \alpha_k\big((y_k + \alpha_k r_k)^*, r_k\big), \qquad (1.22)$$

which proves the desired inequality (1.20).

We are now able to prove the convergence theorem.

Theorem 1.8 Either (a) the algorithm solves (P) and (P') in a finite number of iterations; or (b) the algorithm generates infinite sequences $(x_k)$ and $(y_k)$ such that the sequence $(y_k)$ converges to the solution of (P') and every cluster point of $(x_k)$ is a solution of (P).

Proof. To prove (a), we note that the sequences $(x_k)$ and $(y_k)$ are finite (i.e., the algorithm terminates in a finite number of iterations) if and only if $r_k = 0$ for some fixed integer $k$. In this case, by theorem 1.3, $x_k$ and $y_k$ are solutions of (P) and (P') respectively.

To prove (b), let $(x_k)$, $(y_k)$, $(r_k)$ and $(\alpha_k)$ be the infinite sequences generated by the algorithm. Let

$$d = \min \{ \|b - Ax\| \mid x \ge 0, \ x \in R^n \}. \qquad (1.23)$$

Since the system $Ax = b$, $x \ge 0$, is assumed to have no non-negative exact solution, $d > 0$. By (1.16),

$$\|r_k\|_2^2 = \big(b - (b, y_k)y_k^*, r_k\big) \le \big(\|b\|_2 + d\,\|y_k^*\|_2\big)\|r_k\|_2.$$

Let $M > 0$ be such that $\|z\|_2 \le M\|z\|$ for any $z \in R^m$. Then, because $\|y_k^*\| = 1$, we obtain

$$\|r_k\|_2 \le \|b\|_2 + Md. \qquad (1.24)$$

Hence the sequence $(r_k)$ is bounded. Our goal is to prove that $(r_k)$ converges to zero. Suppose this is not true; then, by (1.24), there exists a subsequence, denoted again $(r_k)$, such that $r_k \to r \ne 0$. There are two cases to investigate.

Case 1. Assume that step 5 of the algorithm is executed for an infinite number of indices $k$, so that

$$(b, r_k) < (b, y_k)\|r_k\|' + \tfrac14\|r_k\|_2^2. \qquad (1.25)$$

By taking a subsequence if necessary, we may assume that

$$\|r_k\|_2 \ge \delta \quad \text{for all } k \ge 0, \text{ for some } \delta > 0. \qquad (1.26)$$

Now,

$$\alpha_k\|r_k\|' - 1 \le \|y_k + \alpha_k r_k\|' = 1 + \alpha_k\big((b, r_k) - \tfrac12\|r_k\|_2^2\big)/(b, y_k) \quad \text{(because of step 5)},$$

from which one obtains

$$\alpha_k\big(\|r_k\|'(b, y_k) - (b, r_k) + \tfrac12\|r_k\|_2^2\big) \le 2(b, y_k),$$

and hence, due to (1.25), $\tfrac14\alpha_k\|r_k\|_2^2 < 2(b, y_k)$. By weak duality and (1.23) this implies

$$\alpha_k\|r_k\|_2^2 < 8(b, y_k) \le 8d.$$

Using (1.26), we see that

$$0 < \alpha_k < 8d/\delta^2. \qquad (1.27)$$

We have thus shown that the positive sequence $(\alpha_k)$ is bounded from above. Since $\|y_k\|' = 1$, by passing to a further subsequence of $(k)$, which we again call $(k)$ for simplicity, we may assume that there exist a non-negative number $a$ and a vector $y$ such that $\alpha_k \to a$ and $y_k \to y$. Letting $k \to \infty$ in (1.20), we get

$$\big((y + ar)^*, r\big) \ge \big((b, r) - \tfrac12\|r\|_2^2\big)/(b, y) \qquad (1.28)$$

(this is possible because the map $z \mapsto z^*$ is continuous on $R^m \setminus \{0\}$). We will prove that both possibilities, $a = 0$ and $a > 0$, lead to contradictions.

If $a = 0$, then (1.28) becomes

$$(b, y)(y^*, r) \ge (b, r) - \tfrac12\|r\|_2^2. \qquad (1.29)$$

If one takes the limit as $k \to \infty$ in (1.16), one has

$$\|r\|_2^2 = (b, r) - (b, y)(y^*, r). \qquad (1.30)$$

Inequality (1.29) and equation (1.30) together imply

$$\|r\|_2^2 \le \tfrac12\|r\|_2^2,$$

which implies $r = 0$, leading to a contradiction, as claimed.
If $a > 0$, then by step 6 of the algorithm

$$y_{k+1} \to (y + \tfrac12 a r)/\|y + \tfrac12 a r\|' = \bar y \qquad (1.31)$$

as $k \to \infty$. Both subsequences $((b, y_k))$ and $((b, y_{k+1}))$ have the same limit; let $\rho$ be this limit. Then

$$(b, y) = (b, \bar y) = \rho. \qquad (1.32)$$

Using equations (1.31) and (1.32) simultaneously gives

$$(b, y)\,\|y + \tfrac12 a r\|' = (b, y + \tfrac12 a r) = (b, y) + \tfrac12 a(b, r).$$

Rewriting this equation yields

$$\|y + \tfrac12 a r\|' = 1 + \tfrac12 a(b, r)/(b, y). \qquad (1.33)$$

Allowing $k \to \infty$ in the equation (1.12) defining $\alpha_k$, we obtain

$$(b, y)\,\|y + a r\|' = (b, y) + a\big((b, r) - \tfrac12\|r\|_2^2\big).$$

Because the norm $\|\cdot\|$ is smooth, we apply the same argument as in the proof of proposition 1.6, using the strict convexity of the dual norm $\|\cdot\|'$ (cf. equation (1.19)), to conclude that

$$(b, y)\,\|y + \tfrac12 a r\|' < (b, y) + \tfrac12 a(b, r) - \tfrac14 a\|r\|_2^2. \qquad (1.34)$$

It is clear now that equations (1.33) and (1.34) combined imply

$$0 < -\tfrac14 a\|r\|_2^2,$$

leading to the desired contradiction. The proof of case 1 is complete.

Case 2. Suppose that step 3 is executed for all but a finite number of iterations. Passing to a subsequence denoted again $(k)$, we may assume $y_k \to y$ and $r_k \to r$ as $k \to \infty$. In order to arrive at a contradiction, assume that $r \ne 0$. By step 3 of the algorithm,

$$y_{k+1} = r_k/\|r_k\|'. \qquad (1.35)$$

The subsequence $(y_{k+1})$ converges to $\bar y = r/\|r\|'$. Thus $\bar y$, besides $y$, is a cluster point of the sequence $(y_k)$. Because the original numerical sequence $((b, y_k))$ is convergent (proposition 1.6), one has

$$(b, y) = (b, \bar y). \qquad (1.36)$$

Letting $k \to \infty$ on both sides of the inequality defining step 3, we find

$$(b, r) \ge (b, y)\|r\|' + \tfrac14\|r\|_2^2,$$

i.e.,

$$(b, \bar y)\|r\|' \ge (b, y)\|r\|' + \tfrac14\|r\|_2^2 \quad \big(\text{since } (b, r) = (b, \bar y)\|r\|'\big).$$

In view of equation (1.36), this implies $r = 0$, yielding a contradiction. So, in both cases, we have proved that the sequence $(r_k)$ converges to zero, as claimed.

It remains to prove that the sequence $(y_k)$ converges to the solution of problem (P'). Let $y$ be a cluster point of the sequence $(y_k)$. Then there exists a subsequence, denoted again $(y_k)$, converging to $y$. From step 1 of the algorithm, we have

$$r_k = b - (b, y_k)\,y_k^* - Ax_k. \qquad (1.37)$$

Since $r_k \to 0$, the sequence $(Ax_k)$ converges to $b - (b, y)y^*$. From the fact that the cone $\{Ax \mid x \ge 0, \ x \in R^n\}$ is closed, we get

$$b - (b, y)y^* = A\bar x, \quad \text{for some } \bar x \ge 0.$$

Now applying theorem 1.3, we see that $y$ solves problem (P').

The same argument applies to any cluster point $\bar x$ of the sequence $(x_k)$. Let $(x_{k'})$ be a subsequence converging to $\bar x$. By taking a further subsequence, if necessary, we may assume that $y_{k'} \to y$. Considering equation (1.37) and letting $k' \to \infty$, we obtain

$$A\bar x = b - (b, y)y^* \quad (\text{since } r_k \to 0).$$

Once more applying theorem 1.3, we conclude that $\bar x$ is a solution of problem (P), as claimed. This completes the proof.

The algorithm studied in this chapter can be generalized as follows; we keep the previous notation. At each iteration cycle of algorithm 1.1, the new dual approximation is given by

$$y_{k+1} = (y_k + \tfrac12\alpha_k r_k)/\|y_k + \tfrac12\alpha_k r_k\|'. \qquad (1.38)$$

As was done in [14], a natural extension results from choosing an arbitrary coefficient $\lambda_k$, belonging to a set to be determined, as a substitute for the coefficient $1/2$ in equation (1.38). We set down these changes in the following generalized algorithm.

Algorithm 1.2

Step 0 to step 5. Same as in algorithm 1.1.

Step 6. Let

$$y_{k+1} = (y_k + \lambda_k\alpha_k r_k)/\|y_k + \lambda_k\alpha_k r_k\|',$$

where $\lambda_k \in \Lambda$, a fixed non-empty compact subset of the open interval $(0, 1)$.

Step 7. Increase $k$ by 1 and RETURN to step 1.

Step 8. $x_k$ is a non-negative least error $\|\cdot\|$ solution of $Ax = b$. The computation is complete.
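For concreteness, the generalized step 6 is a one-line change to the chapter-1 sketch; with lam = 0.5 it reduces to Algorithm 1.1's update. A minimal sketch, with names of our choosing:

```python
import numpy as np

def dual_update(y, r, alpha, lam, q):
    """Algorithm 1.2, step 6: y_{k+1} = (y + lam*alpha*r)/||y + lam*alpha*r||_q."""
    z = y + lam * alpha * r
    return z / np.sum(np.abs(z) ** q) ** (1.0 / q)
```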
A careful inspection shows that most of the results concerning algorithm 1.1 carry over to the present situation; only a few changes are needed in the proof of convergence.

Theorem 1.9 Let $\Lambda$ be a non-empty compact subset of the open interval $(0, 1)$. Assume the norm $\|\cdot\|$ to be smooth and strictly convex. Let the sequence $(y_k)$ be defined as in step 6 of the algorithm by

$$y_{k+1} = (y_k + \lambda_k\alpha_k r_k)/\|y_k + \lambda_k\alpha_k r_k\|',$$

where $\lambda_k \in \Lambda$ is arbitrary. Then

(a) the sequence $((b, y_k))$ generated by algorithm 1.2 is strictly increasing and convergent;

(b) the sequence $(y_k)$ converges to the solution of problem (P').

Proof. Recall that the smoothness of the norm $\|\cdot\|$ implies the strict convexity of the dual norm $\|\cdot\|'$. From this observation we obtain

$$\|y_k + \lambda_k\alpha_k r_k\|' < \lambda_k\|y_k + \alpha_k r_k\|' + (1 - \lambda_k)\|y_k\|' = \lambda_k(1 + \alpha_k\mu_k) + (1 - \lambda_k) = 1 + \lambda_k\alpha_k\big((b, r_k) - \tfrac12\|r_k\|_2^2\big)/(b, y_k), \qquad (1.39)$$

the middle equality being due to step 5. Rewriting (1.39), we get

$$(b, y_k) < \big((b, y_k + \lambda_k\alpha_k r_k) - \tfrac12\lambda_k\alpha_k\|r_k\|_2^2\big)/\|y_k + \lambda_k\alpha_k r_k\|' < (b, y_{k+1}),$$

since $\lambda_k > 0$, $\alpha_k > 0$ and $r_k \ne 0$. Thus the sequence $((b, y_k))$ is strictly increasing. The rest of the proof of (a) is similar to the proof of proposition 1.6.

To prove the convergence of $(y_k)$ claimed in (b), we show that

$$\lim r_k = 0. \qquad (1.40)$$

As in theorem 1.8, we assume $r_k \to r \ne 0$ and show that this leads to a contradiction. The difference is that we have to pass to an additional subsequence of $(\lambda_k)$, which we still denote $(\lambda_k)$, such that $\lambda_k \to \lambda \in \Lambda$. The two cases $a = 0$ and $a > 0$ are studied. The case $a = 0$ carries over exactly as in the proof of theorem 1.8, while in the case $a > 0$, letting $k \to \infty$ gives

$$y_{k+1} \to (y + \lambda a r)/\|y + \lambda a r\|' = \bar y. \qquad (1.41)$$

Since the subsequences $((b, y_k))$ and $((b, y_{k+1}))$ have the same limit, equation (1.41) shows that

$$(b, y)\,\|y + \lambda a r\|' = (b, y) + \lambda a(b, r). \qquad (1.42)$$

Now let $k \to \infty$ in (1.39). We get

$$(b, y)\,\|y + \lambda a r\|' \le (b, y) + \lambda a(b, r) - \tfrac12\lambda a\|r\|_2^2. \qquad (1.43)$$

Combining equation (1.42) and inequality (1.43) leads to

$$0 \le -\tfrac12\lambda a\|r\|_2^2 < 0,$$

which contradicts the assumption that $r \ne 0$. The rest of the proof is completed exactly as in theorem 1.8. The reader will note that this theorem was motivated by similar work in [14].

Chapter 2. An Algorithm For Minimal Norm Solution Of A Linear System

In this chapter, we consider a system of real linear equations

$$Ax = b, \qquad (2.1)$$

where $A$ is an $m \times n$ real matrix, $b$ a real $m$-vector, and $x$ a real $n$-vector. Under the assumption that the system (2.1) has a non-negative solution, we shall be concerned with the following problem: amongst all the non-negative solutions of (2.1), select the one which has least norm $\|\cdot\|$, when $R^n$ is equipped with a strictly convex norm $\|\cdot\|$ (e.g. the $l^p$-norm, $1 < p < \infty$). This problem will be referred to as problem (P):

$$(P) \quad \min \{ \|x\| \mid Ax = b, \ x \in R^n, \ x \ge 0 \}.$$

We assume that $b \ne 0$, for otherwise the problem is trivial. The notation and definitions are those of chapter 1. We present an algorithm for computing the solution of (P); all the steps in this algorithm will be shown to be feasible, and its global convergence will then be proved. To solve the given problem, a dual problem, denoted (P'), will be associated with (P), and an outline of the correspondence between (P) and (P') will be given. The main application of this work is the $l^p$-norm case: find $x \in R^n$ that solves the $l^p$-problem

$$\text{minimize } \{ \|x\|_p \mid Ax = b, \ x \in R^n, \ x \ge 0 \}. \qquad (2.2)$$
A worthwhile remark to make here is that the objective function in problem (2.2) need not be twice differentiable on $R^n \setminus \{0\}$; this is in particular the case when $1 < p < 2$. For $p = 2$, problem (2.2) was studied in Lawson and Hanson [5], who refer to it as the Least Distance Programming (LDP) problem. A finite algorithm for solving LDP was given in [5]. We use this algorithm as follows: at each iteration cycle of our algorithm, the LDP problem

$$Ax = b, \quad x \ge 0, \quad \|x - a_k\|_2 \ (\min),$$

where $(a_k)$ is a sequence defined by the algorithm, is solved using the LDP algorithm of Lawson and Hanson.

Briefly, we recall some definitions and notation from chapter 1. A norm $\|\cdot\|$ is said to be strictly convex if the unit sphere contains no line segment on its surface; in other words,

$$\|x\| = \|y\| = \|(x + y)/2\| = 1 \ \Rightarrow\ x = y.$$

The norm is smooth if through each point of the unit sphere in $R^n$ there passes exactly one hyperplane supporting the closed unit ball. If the norm $\|\cdot\|$ is strictly convex (resp. smooth), then the $\|\cdot\|$ (resp. $\|\cdot\|'$) dual vector is unique (see chapter 1 for the definitions of the dual norm and dual vectors). Moreover, the correspondence $x \mapsto x'$ (resp. $x \mapsto x^*$) from $R^n \setminus \{0\}$ into the $\|\cdot\|'$ (resp. $\|\cdot\|$) unit sphere is odd, continuous and positively homogeneous of degree zero. In the important case of the $l^p$-norm, defined as usual by

$$\|x\|_p = \Big(\sum_i |x_i|^p\Big)^{1/p},$$

with $1 < p < \infty$, we have $\|\cdot\|_p' = \|\cdot\|_q$, where $p + q = pq$. We close this review by recalling that for any smooth, strictly convex norm $\|\cdot\|$,

$$v^{*\prime} = v/\|v\|' \quad \text{and} \quad v^{\prime *} = v/\|v\|, \quad \text{if } v \ne 0.$$

Let

$$K = \{ x \in R^n \mid x \ge 0, \ Ax = b \}.$$

Given problem (P), we associate a dual problem ([6]):

$$(P') \quad \max \{ (b, y) \mid \xi \in R^n, \ \xi \ge 0, \ y \in R^m, \ \|\xi + A^T y\|' \le 1 \},$$

where $A^T$ is the transpose of the matrix $A$. The relation between problem (P) and problem (P') is investigated in the next two results.

Lemma 2.1 (weak duality) Suppose $K$ is non-empty. Let the norm $\|\cdot\|$ on $R^n$ be arbitrary. Then

$$\text{value of (P')} \le \text{value of (P)}. \qquad (2.3)$$

Proof. Let $x \in K$, and let $\xi$ and $y$ be as defined in problem (P'). Then

$$(b, y) = (Ax, y) = (x, A^T y) \le (x, A^T y + \xi) \quad (\text{since } x \ge 0, \ \xi \ge 0) \le \|x\| \cdot \|A^T y + \xi\|' \le \|x\|,$$

which implies (2.3), as claimed.

Theorem 2.2 (Duality [6]) If $\|\cdot\|$ is an arbitrary norm on $R^n$ and $K$ is non-empty, then the problems (P) and (P') have the same value.

The next theorem, due to Sreedharan and Nikolopoulos, states a useful characterization of the solution of (P).

Theorem 2.3 ([6]) Let the norm $\|\cdot\|$ on $R^n$ be strictly convex. Suppose $K$ non-empty. The following are equivalent.

(i) $\bar x$ is the non-negative minimal norm solution of problem (P).

(ii) $A\bar x = b$, $\bar x \ge 0$, and there exist $\xi \in R^n$, $\xi \ge 0$, and $y \in R^m$ such that

$$(b, y) > 0, \quad \|A^T y + \xi\|' = 1, \qquad (2.4)$$

and

$$\bar x = (b, y)\,(A^T y + \xi)^*. \qquad (2.5)$$

Furthermore,

$$\big(\xi, (A^T y + \xi)^*\big) = 0. \qquad (2.6)$$

If in addition the norm $\|\cdot\|$ is smooth, then

$$\bar x' = \xi + A^T y, \quad \text{and} \quad \|\bar x\| = (\xi + A^T y, \bar x). \qquad (2.7)$$

Finally, $(y, \xi)$ solves (P'), and $(b, y) = \|\bar x\|$.

A characterization of the solution of problem (P) in the $l^2$-norm case follows immediately from theorem 2.3. This is an important ingredient in the subsequent development of this chapter. We record it as

Corollary 2.4 If $\|\cdot\| = \|\cdot\|_2$, then $\bar x$ is the solution of problem (P) iff $\bar x \in K$,

$$\bar x = A^T y + \xi, \qquad (2.8)$$

and

$$(\xi, \bar x) = 0, \qquad (2.9)$$

for some $y \in R^m$, $\xi \in R^n$, $\xi \ge 0$.

It should be noted that when $\|\cdot\| = \|\cdot\|_2$, then $\bar x' = \bar x/\|\bar x\|_2$, so that (2.8) and (2.9) are simply easy consequences of equations (2.6) and (2.7).
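The $l^2$ subproblem that drives the chapter-2 algorithm is $\min \|x - a\|_2$ subject to $Ax = b$, $x \ge 0$. Lawson and Hanson solve it with their finite LDP/NNLS machinery; as a minimal stand-in, the sketch below uses SciPy's general SLSQP solver instead. The helper name solve_ldp and the choice of SLSQP are ours, not [5]'s.

```python
import numpy as np
from scipy.optimize import minimize

def solve_ldp(A, b, a):
    """Euclidean projection of a onto {x >= 0 : Ax = b}."""
    n = A.shape[1]
    res = minimize(lambda x: 0.5 * np.sum((x - a) ** 2),
                   np.maximum(a, 0),              # rough starting point
                   jac=lambda x: x - a,
                   constraints=[{"type": "eq",
                                 "fun": lambda x: A @ x - b,
                                 "jac": lambda x: A}],
                   bounds=[(0, None)] * n,
                   method="SLSQP")
    return res.x
```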
We are now prepared to state our algorithm for computing the solution of problem (P). We assume that the system of linear equations $Ax = b$, $x \ge 0$, is feasible and that the norm $\|\cdot\|$ is strictly convex and smooth.

Algorithm 2.1

Step 0. Find $x_0$, the solution of $Ax = b$, $x \ge 0$, $\|x\|_2$ (min). Let $g_0 = x_0/\|x_0\|'$, $\beta_0 = (g_0, x_0)$ and $k = 0$.

Step 1. Set $a_k = \beta_k g_k^*$. Find $x_{k+1}$, the solution of $Ax = b$, $x \ge 0$, $\|x - a_k\|_2$ (min). Let $u_k = x_{k+1} - a_k$.

Step 2. If $u_k = 0$, stop; $x_{k+1}$ is the solution of (P). Else continue.

Step 3. Set $\gamma_k = (u_k, x_{k+1})$.

Step 4. If $\gamma_k \ge \beta_k\|u_k\|' + \tfrac14\|u_k\|_2^2$, let $g_{k+1} = u_k/\|u_k\|'$ and $\beta_{k+1} = \gamma_k/\|u_k\|'$, and GO TO step 8; else continue.

Step 5. Let $\mu_k = (\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k$.

Step 6. Find $\alpha_k > 0$ such that $\|g_k + \alpha_k u_k\|' = 1 + \alpha_k\mu_k$.

Step 7. Let

$$g_{k+1} = (g_k + \tfrac12\alpha_k u_k)/\|g_k + \tfrac12\alpha_k u_k\|', \quad \text{and} \quad \beta_{k+1} = (\beta_k + \tfrac12\alpha_k\gamma_k)/\|g_k + \tfrac12\alpha_k u_k\|'.$$

Step 8. Increase $k$ by 1 and RETURN to step 1.

Later in this chapter the stopping rule of step 2 will be used as follows. We will show that the constructed sequence $(u_k)$ converges to zero; because of this convergence, it will be proven that the algorithm converges to the solution of problem (P). For implementation purposes, however, we shall use the more realistic stopping rule, similar to the one used in chapter 1, called the relative duality gap criterion. In other words, we replace the condition $u_k = 0$ of step 2 by the condition

$$(\|x_{k+1}\| - \beta_k)/\|x_{k+1}\| \le \eta,$$

where $\eta > 0$ is a stopping rule parameter. We record this in

Algorithm 2.2

Step 0. Find $x_0$, $g_0$ and $\beta_0$ as in step 0 of algorithm 2.1; let $\eta > 0$ be a stopping parameter.

Step 1. Same as step 1 in algorithm 2.1.

Step 2. If $(\|x_{k+1}\| - \beta_k)/\|x_{k+1}\| \le \eta$, stop; $x_{k+1}$ is the solution of (P). Else proceed.

Steps 3 to 8. Same as in algorithm 2.1.

Our main interest in the upcoming sections of this chapter is to prove that the various steps of the algorithm are valid, that the important step 6 defining $\alpha_k > 0$ can always be carried out, and finally that the algorithm leads effectively to the solution of problem (P).

Lemma 2.5 ([6]) Let $x_0, g_0, \beta_0$ be defined as in step 0 of the algorithm. Then there exist $\xi_0 \in R^n$, $\xi_0 \ge 0$, $y_0 \in R^m$ such that

$$g_0 = \xi_0 + A^T y_0, \quad \|g_0\|' = 1, \quad \text{and} \quad \beta_0 = (b, y_0) > 0. \qquad (2.10)$$

Proof. By corollary 2.4, there exist $\xi \in R^n$, $\xi \ge 0$, $z \in R^m$ such that

$$x_0 = \xi + A^T z, \quad \text{and} \quad (\xi, x_0) = 0.$$

But $g_0 = x_0/\|x_0\|'$, thus $g_0 = \xi_0 + A^T y_0$, where $\xi_0 = \xi/\|x_0\|'$ and $y_0 = z/\|x_0\|'$. To prove the last statement of (2.10), we note that

$$\beta_0 = (g_0, x_0) = \|x_0\|_2^2/\|x_0\|' > 0,$$

and, due to (2.9),

$$\beta_0 = (\xi_0 + A^T y_0, x_0) = (y_0, Ax_0) = (y_0, b),$$

which completes the proof.

Proposition 2.6 (a) Let $u_k \ne 0$ be as defined in the algorithm. Then

$$g_k + \alpha u_k \ne 0, \quad \text{for all } \alpha > 0. \qquad (2.11)$$

(b) Let $\alpha_k > 0$ be as defined by step 6 of the algorithm. Then there exist $\xi_{k+1} \in R^n$, $\xi_{k+1} \ge 0$, $y_{k+1} \in R^m$ such that

$$g_{k+1} = \xi_{k+1} + A^T y_{k+1}, \quad \|g_{k+1}\|' = 1, \quad \text{and} \quad \beta_{k+1} = (y_{k+1}, b). \qquad (2.12)$$

Proof. To prove (a) we proceed by induction on $k$. Let $k = 0$ and suppose (2.11) does not hold, i.e., there exists $\alpha > 0$ such that $g_0 = -\alpha u_0$. Then

$$\|u_0\|_2^2 = (u_0, x_1 - a_0) = -\alpha^{-1}(g_0, x_1 - a_0) = -\alpha^{-1}\big((g_0, x_1) - (g_0, a_0)\big) = -\alpha^{-1}\big((x_1, \xi_0 + A^T y_0) - \beta_0\big),$$

because of (2.10) and step 1 (note $(g_0, a_0) = \beta_0(g_0^*, g_0) = \beta_0\|g_0\|' = \beta_0$). Continuing,

$$\|u_0\|_2^2 = -\alpha^{-1}\big((Ax_1, y_0) + (x_1, \xi_0) - \beta_0\big) = -\alpha^{-1}\big(\beta_0 + (x_1, \xi_0) - \beta_0\big) \quad (\text{since } b = Ax_1) = -\alpha^{-1}(x_1, \xi_0) \le 0.$$

This yields a contradiction, since $x_1 \ge 0$, $\xi_0 \ge 0$ and $u_0 \ne 0$, which proves (2.11) for $k = 0$, as desired.

To prove (b): by the definition of $x_1$ in step 1 of the algorithm and corollary 2.4, there exist $\xi \in R^n$, $\xi \ge 0$, $z \in R^m$ such that

$$u_0 = x_1 - a_0 = \xi + A^T z, \qquad (2.13)$$

and

$$(z, b) = (u_0, x_1) = \gamma_0.$$

Suppose $\alpha_0$ is determined by step 6 of the algorithm.
Then, because of (2.11), $g_0 + \tfrac12\alpha_0 u_0$ is nonzero. Let

$$y_1 = (y_0 + \tfrac12\alpha_0 z)/\|g_0 + \tfrac12\alpha_0 u_0\|', \quad \text{and} \quad \xi_1 = (\xi_0 + \tfrac12\alpha_0\xi)/\|g_0 + \tfrac12\alpha_0 u_0\|'.$$

From (2.10) and (2.13), we obtain

$$A^T y_1 + \xi_1 = \big(\xi_0 + A^T y_0 + \tfrac12\alpha_0(\xi + A^T z)\big)/\|g_0 + \tfrac12\alpha_0 u_0\|' = (g_0 + \tfrac12\alpha_0 u_0)/\|g_0 + \tfrac12\alpha_0 u_0\|' = g_1.$$

Now,

$$(y_1, b) = \big((y_0, b) + \tfrac12\alpha_0(z, b)\big)/\|g_0 + \tfrac12\alpha_0 u_0\|' = (\beta_0 + \tfrac12\alpha_0\gamma_0)/\|g_0 + \tfrac12\alpha_0 u_0\|' = \beta_1.$$

This completes the proof of (2.12) for $k = 0$. The same argument applies for any integer $k$ if we assume the proposition true for $k - 1$.

Remark. It will be shown later that the sequence $(\beta_k)$ generated by the algorithm is strictly increasing. This, combined with (2.10), implies that $\beta_k > 0$ for all $k$. We conclude that step 5 is properly formulated.

Proposition 2.7 Suppose the algorithm is at the stage of executing step 5. Let $\mu_k$ be defined by

$$\mu_k = (\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k.$$

Then there exists $\alpha_k > 0$ such that

$$\|g_k + \alpha_k u_k\|' = 1 + \alpha_k\mu_k. \qquad (2.14)$$

Proof. Let the function $f : R \to R$ be defined by $f(\lambda) = \|g_k + \lambda u_k\|'$, and let $l(\lambda) = 1 + \lambda\mu_k$. Then $f$ is strictly convex. Moreover, $f(0) = \|g_k\|' = 1 = l(0)$. Also $f'(0) = (g_k^*, u_k)$ (see chapter 1). We have

$$\gamma_k/\beta_k = (u_k, x_{k+1})/\beta_k \quad (\text{by step 3}) = (u_k, u_k + a_k)/\beta_k \quad (\text{because of step 1}) = \|u_k\|_2^2/\beta_k + (u_k, g_k^*) \quad (\text{since } a_k = \beta_k g_k^*).$$

If the algorithm is at the stage of executing step 5, then $u_k \ne 0$. So

$$f'(0) - l'(0) = (g_k^*, u_k) - (\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k = -\tfrac12\|u_k\|_2^2/\beta_k < 0.$$

Hence there must exist $\lambda > 0$ such that $f(\lambda) - l(\lambda) < 0$. Now, $\alpha_k > 0$ in step 6 is sought only if step 4 of the algorithm is answered negatively. This means that $\gamma_k$ satisfies

$$\gamma_k < \beta_k\|u_k\|' + \tfrac14\|u_k\|_2^2. \qquad (2.15)$$

Using this inequality, we have

$$f(\lambda) - l(\lambda) = \|g_k + \lambda u_k\|' - 1 - \lambda(\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k \ge \lambda\|u_k\|' - \|g_k\|' - 1 - \lambda(\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k = \lambda\big(\beta_k\|u_k\|' - \gamma_k + \tfrac12\|u_k\|_2^2\big)/\beta_k - 2 > \lambda\,\|u_k\|_2^2/(4\beta_k) - 2,$$

the last step being due to (2.15). From this we get $f(\lambda) - l(\lambda) \to \infty$ as $\lambda \to \infty$. Therefore $f(\lambda) - l(\lambda) > 0$ for some $\lambda > 0$. Since $f - l$ is continuous, there must exist $\alpha_k > 0$ solving $f(\lambda) - l(\lambda) = 0$; thus equation (2.14) holds. The proof is complete.

Proposition 2.8 Let the norm $\|\cdot\|$ be smooth. Then

(a) the sequence $(\beta_k)$ generated by the algorithm is strictly increasing;

(b) either the sequence $(\beta_k)$ is finite, or it is a convergent infinite sequence.

Proof. Two cases are to be distinguished. The first case is straightforward. Assume step 4 of the algorithm is executed, which occurs if

$$\gamma_k \ge \beta_k\|u_k\|' + \tfrac14\|u_k\|_2^2.$$

So, from the definition of $\beta_{k+1}$, we obtain

$$\beta_{k+1} = \gamma_k/\|u_k\|' \ge \beta_k + \tfrac14\|u_k\|_2^2/\|u_k\|' > \beta_k$$

(since $u_k \ne 0$). Now, assume that step 4 of the algorithm is answered negatively; then $\beta_{k+1}$ is defined by step 7. Let $\alpha_k > 0$ be as defined in step 6. Recall that we are assuming the norm $\|\cdot\|$ to be smooth, so the dual norm $\|\cdot\|'$ is strictly convex. It follows that

$$\|g_k + \tfrac12\alpha_k u_k\|' < \tfrac12\|g_k + \alpha_k u_k\|' + \tfrac12\|g_k\|' = 1 + \tfrac12\alpha_k(\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k.$$

Consequently, we get

$$\beta_k < \big((\beta_k + \tfrac12\alpha_k\gamma_k) - \tfrac14\alpha_k\|u_k\|_2^2\big)/\|g_k + \tfrac12\alpha_k u_k\|' < (\beta_k + \tfrac12\alpha_k\gamma_k)/\|g_k + \tfrac12\alpha_k u_k\|' = \beta_{k+1}.$$

This proves that, in all cases, the sequence $(\beta_k)$ is strictly increasing, as claimed. To prove (b), suppose that the sequence $(\beta_k)$ is infinite. To show its convergence, we need only prove that it is bounded, which follows immediately from the weak duality lemma:

$$\beta_k = (y_k, b) \le \|x\|, \qquad (2.16)$$

for any fixed feasible solution $x \in K$. Hence $(\beta_k)$ is bounded from above. The proof is complete.

After these preliminaries concerning the feasibility of Algorithm 2.1, we begin the study of its convergence.
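Before turning to the convergence proofs, here is a compact sketch of Algorithm 2.1 for $\|\cdot\| = \|\cdot\|_p$, wired together from the pieces above (the solve_ldp helper from the previous sketch and SciPy's brentq). The stopping rule is the relative duality gap of Algorithm 2.2; as before, the bracketing used for step 6 is our own assumption, not the text's.

```python
import numpy as np
from scipy.optimize import brentq

def algorithm_2_1(A, b, p, eta=1e-8, maxit=200):
    q = p / (p - 1.0)
    norm = lambda v, s: np.sum(np.abs(v) ** s) ** (1.0 / s)
    star = lambda v: (np.abs(v) / norm(v, q)) ** (q - 1.0) * np.sign(v)
    x = solve_ldp(A, b, np.zeros(A.shape[1]))          # step 0
    g = x / norm(x, q)
    beta = g @ x
    for _ in range(maxit):
        a = beta * star(g)                             # step 1
        x = solve_ldp(A, b, a)
        u = x - a
        if norm(x, p) - beta <= eta * norm(x, p):      # step 2 (Alg. 2.2)
            return x
        gamma = u @ x                                  # step 3
        if gamma >= beta * norm(u, q) + 0.25 * (u @ u):    # step 4
            g, beta = u / norm(u, q), gamma / norm(u, q)
            continue
        mu = (gamma - 0.5 * (u @ u)) / beta            # step 5
        F = lambda t: 1 + t * mu - norm(g + t * u, q)  # step 6
        hi = 1.0
        while F(hi) > 0:
            hi *= 2.0
        alpha = brentq(F, 1e-12, hi)
        d = norm(g + 0.5 * alpha * u, q)               # step 7
        g, beta = (g + 0.5 * alpha * u) / d, (beta + 0.5 * alpha * gamma) / d
    return x
```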
The next two results will be needed in the convergence theorem.

Lemma 2.9 Let $\alpha_k > 0$ be as defined in step 6 of the algorithm. Then

$$\beta_k\big((g_k + \alpha_k u_k)^*, u_k\big) + \tfrac12\|u_k\|_2^2 \ \ge\ \gamma_k. \qquad (2.17)$$

Proof. First, we recall that by the definition of the dual vectors and the construction of $g_k$, we have

$$\|(g_k + \alpha_k u_k)^*\| = 1 \quad \text{and} \quad \|g_k\|' = 1.$$

This implies

$$\big((g_k + \alpha_k u_k)^*, g_k\big) \le \|(g_k + \alpha_k u_k)^*\| \cdot \|g_k\|' = 1.$$

Due to the definition of the dual norm, we also have the equation

$$\|g_k + \alpha_k u_k\|' = \big((g_k + \alpha_k u_k)^*, g_k\big) + \alpha_k\big((g_k + \alpha_k u_k)^*, u_k\big).$$

This, combined with the equation (2.14) defining $\alpha_k$, yields

$$1 + \alpha_k(\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k = \|g_k + \alpha_k u_k\|' \le 1 + \alpha_k\big((g_k + \alpha_k u_k)^*, u_k\big),$$

which, since $\alpha_k > 0$, implies (2.17), as desired.

Theorem 2.10 Let $a_k$, $x_k$ and $u_k$ be as defined in the algorithm. Then the sequences $(a_k)$, $(x_k)$ and $(u_k)$ are bounded.

Proof. If the algorithm terminates in a finite number of iterations, we are done. Consider the case where the sequences are infinite. Let $d > 0$ be the value of the minimization problem (P). Then, as mentioned above,

$$\beta_k = (b, y_k) \le d, \quad \text{for all } k. \qquad (2.18)$$

Now, from step 1 of the algorithm, we see that

$$\|a_k\|_2 = \beta_k\|g_k^*\|_2 \le M\beta_k\|g_k^*\| = M\beta_k \quad (\text{since } \|v^*\| = 1 \text{ for all } v \ne 0) \le Md,$$

where $M > 0$ is such that $\|v\|_2 \le M\|v\|$ for all $v \in R^n$. Hence $(a_k)$ is a bounded sequence. To prove that the sequence $(x_k)$ is bounded, let $\tilde x$ be any fixed feasible solution of problem (P). Because $x_{k+1}$ is the minimizer of the problem

$$Ax = b, \quad x \ge 0, \quad \|x - a_k\|_2 \ (\min)$$

(this minimization is done in step 1 of the algorithm at each iteration cycle), we have

$$\|x_{k+1}\|_2 \le \|x_{k+1} - a_k\|_2 + \|a_k\|_2 \le \|\tilde x - a_k\|_2 + \|a_k\|_2 \le \|\tilde x\|_2 + 2\|a_k\|_2 \le \|\tilde x\|_2 + 2Md.$$

Thus $(x_k)$ is a bounded sequence. From this it follows clearly that the sequence $(u_k)$, $u_k = x_{k+1} - a_k$, is also bounded. This completes the proof.

We can now put everything together to prove that the algorithm converges to the solution of problem (P).

Proposition 2.11 ([6]) Let $(u_k)$ be the sequence generated by the algorithm. If $u_k = 0$ for some $k$, then $x_{k+1}$ is the solution of problem (P).

Proof. $u_k = 0$ implies $x_{k+1} = a_k = \beta_k g_k^*$ (because of step 1 of the algorithm). By virtue of proposition 2.6,

$$\beta_k = (b, y_k) \quad \text{and} \quad g_k = \xi_k + A^T y_k.$$

This gives us the relation

$$x_{k+1} = (b, y_k)\,(\xi_k + A^T y_k)^*.$$

Moreover, $\|\xi_k + A^T y_k\|' = 1$ and $(b, y_k) > 0$. Using theorem 2.3, it is obvious now that $x_{k+1}$ solves (P).

Theorem 2.12 (a) If the algorithm terminates after a finite number $k$ of iterations, then $x_{k+1}$ is the solution of problem (P). (b) If the algorithm generates an infinite sequence $(x_k)$, then it converges to the solution of (P).

Proof. Algorithm 2.1 terminates in a finite number $k$ of iterations iff $u_k = 0$. In this case, by proposition 2.11, the algorithm computes the unique solution $x_{k+1}$ of problem (P). This proves (a).

To prove (b), assume that the algorithm is executed for an infinite number of indices $k$. By theorem 2.10, the sequence $(\gamma_k)$ defined by $\gamma_k = (u_k, x_{k+1})$, where $(x_k)$ and $(u_k)$ are generated by the algorithm, is a bounded sequence. Hence, by passing to a subsequence if necessary, we may assume that $\gamma_k \to \gamma$. Since $0 < \beta_k \le d$ (due to (2.18)) and $\|g_k\|' = 1$, let us pass to a further subsequence, denoted again by $(k)$, such that

$$\beta_k \to \beta \quad \text{and} \quad g_k \to g.$$

Our first goal is to establish that

$$\lim u_k = 0. \qquad (2.19)$$

Suppose the claim were false. Then, once more by theorem 2.10, there exists a subsequence, denoted $(k)$, such that

$$u_k \to u \ne 0. \qquad (2.20)$$

Case 1. Step 6 is executed for an infinite number of indices $k$.
We claim that the corresponding sequence of positive numbers $(\alpha_k)$ is bounded from above. Because of (2.20), we can pick a subsequence, denoted once more by $(k)$, such that

$$\|u_k\|_2 \ge \delta, \quad \forall k, \text{ for some } \delta > 0. \qquad (2.21)$$

Now, using (2.14) it follows that

$$\alpha_k\|u_k\|' - 1 \le \|g_k + \alpha_k u_k\|' = 1 + \alpha_k(\gamma_k - \tfrac12\|u_k\|_2^2)/\beta_k.$$

This shows that

$$\alpha_k\big(\beta_k\|u_k\|' - \gamma_k + \tfrac12\|u_k\|_2^2\big) \le 2\beta_k. \qquad (2.22)$$

Recall that step 6 is executed only in case

$$\gamma_k < \beta_k\|u_k\|' + \tfrac14\|u_k\|_2^2. \qquad (2.23)$$

Combining (2.22) and (2.23) we get

$$\tfrac14\alpha_k\|u_k\|_2^2 < 2\beta_k.$$

Together with (2.21) and (2.18), this shows that

$$0 < \alpha_k < 8d/\delta^2.$$

Thus the sequence $(\alpha_k)$ is bounded, as claimed. Passing to a further subsequence, if necessary, we may assume that there exists $a \ge 0$ such that $\alpha_k \to a$ as $k \to \infty$. If we let $k \to \infty$ in (2.17), then by continuity of the map $z \mapsto z^*$ on $R^n \setminus \{0\}$, it follows that

$$\beta\big((g + au)^*, u\big) + \tfrac12\|u\|_2^2 \ge \gamma. \qquad (2.24)$$

We distinguish two possibilities, $a = 0$ or $a > 0$. We prove that each of these possibilities leads to a contradiction; we then conclude that $\lim u_k = 0$.

If $a = 0$, then (2.24) becomes

$$\beta(g^*, u) + \tfrac12\|u\|_2^2 \ge \gamma. \qquad (2.25)$$

From the definition of $u_k$ in step 1 of the algorithm, we obtain

$$\|u_k\|_2^2 = (x_{k+1} - a_k, u_k) = (x_{k+1}, u_k) - (a_k, u_k) = \gamma_k - \beta_k(g_k^*, u_k). \qquad (2.26)$$

Passing to the limit on both sides of (2.26) leads to

$$\|u\|_2^2 = \gamma - \beta(g^*, u). \qquad (2.27)$$

Inequality (2.25) and equation (2.27) force $u$ to satisfy

$$\|u\|_2^2 \le \tfrac12\|u\|_2^2,$$

which is a contradiction to (2.20). So $u = 0$, as claimed.

Suppose now that $a > 0$. Using step 7 of the algorithm and allowing $k \to \infty$, we get

$$\beta_{k+1} \to (\beta + \tfrac12 a\gamma)/\|g + \tfrac12 au\|' = \bar\beta.$$

Note that $\beta_k = (y_k, b)$ and $\beta_{k+1} = (y_{k+1}, b)$ both have the same limit, so $\bar\beta = \beta$. This yields the equation

$$\beta\,\|g + \tfrac12 au\|' = \beta + \tfrac12 a\gamma. \qquad (2.28)$$

Once more allowing $k \to \infty$ in the equation (2.14) defining $\alpha_k$ gives

$$\beta\,\|g + au\|' = \beta + a(\gamma - \tfrac12\|u\|_2^2). \qquad (2.29)$$

Applying the strict convexity of the dual norm $\|\cdot\|'$ implies

$$\beta\,\|g + \tfrac12 au\|' < \beta\big(\tfrac12\|g\|' + \tfrac12\|g + au\|'\big) = \tfrac12\beta + \tfrac12\big(\beta + a(\gamma - \tfrac12\|u\|_2^2)\big) \quad (\text{because of (2.29)}) = \beta + \tfrac12 a(\gamma - \tfrac12\|u\|_2^2). \qquad (2.30)$$

Inserting this in (2.28) shows that

$$\beta + \tfrac12 a\gamma < \beta + \tfrac12 a(\gamma - \tfrac12\|u\|_2^2),$$

which implies

$$\tfrac14 a\|u\|_2^2 < 0,$$

leading to the sought contradiction with (2.20). So $u = 0$ and the whole sequence $(u_k)$ converges to zero, as desired.

Case 2. Assume that the condition in step 4 of the algorithm is satisfied for all but a finite number of indices $k$. As in the first case we proceed by contradiction. So, let us suppose that $(u_k)$ does not converge to zero. Then there exist a subsequence, which we denote again by $(k)$, and a nonzero vector $u$ such that

$$\lim u_k = u \ne 0. \qquad (2.31)$$

Since step 4 is answered affirmatively, we have

$$\gamma_k \ge \beta_k\|u_k\|' + \tfrac14\|u_k\|_2^2, \qquad (2.32)$$

for all $k$. Letting $k \to \infty$ in the above inequality implies

$$\gamma \ge \beta\|u\|' + \tfrac14\|u\|_2^2. \qquad (2.33)$$

By step 4, $\beta_{k+1} = \gamma_k/\|u_k\|'$. As $k \to \infty$, this converges to

$$\beta = \gamma/\|u\|'. \qquad (2.34)$$

Combining inequality (2.33) and equation (2.34), we obtain

$$\tfrac14\|u\|_2^2 \le 0,$$

a contradiction with (2.31). We have thus proved that in all cases the sequence $(u_k)$ generated by the algorithm converges to zero, establishing (2.19), as claimed.

We now prove that the algorithm converges to the unique solution of problem (P). Let $x^*$ be any cluster point of the sequence $(x_k)$ and let $(x_{k'})$ be a subsequence converging to $x^*$. Writing the relation in step 1 of the algorithm for all $k'$, we have

$$x_{k'+1} = u_{k'} + \beta_{k'}g_{k'}^*, \qquad (2.35)$$

where

$$g_{k'} = A^T y_{k'} + \xi_{k'}, \quad \|g_{k'}\|' = 1, \quad \text{and} \quad \beta_{k'} = (y_{k'}, b), \quad \forall k' \quad (\text{by proposition 2.6}).$$
Since $\|g_{k'}\|' = 1$, by passing to a further subsequence, again denoted $(k')$, we have

$$g_{k'} \to g, \quad \|g\|' = 1.$$

It has been proven that $\lim u_{k'} = 0$, so if we let $k' \to \infty$ in (2.35) we get

$$x^* = \beta g^*, \quad \|g\|' = 1. \qquad (2.36)$$

Clearly, $x^*$ is feasible, since $K = \{ x \in R^n \mid x \ge 0, \ Ax = b \}$ is closed. In other words, $Ax^* = b$ and $x^* \ge 0$. We have seen earlier in the proof that $\beta = \lim (y_{k'}, b)$, where $y_{k'} \in R^m$, $\xi_{k'} \ge 0$ and $\|A^T y_{k'} + \xi_{k'}\|' = 1$. By the weak duality lemma and equation (2.36) we have

$$(y_{k'}, b) \le \|x^*\| = \beta. \qquad (2.37)$$

Letting $k' \to \infty$ in (2.37) yields equality. This shows that every cluster point of the sequence $(x_k)$ is a solution of problem (P). Due to the uniqueness of the solution, we conclude that $(x_k)$ converges to the unique solution of (P). This completes the proof of the theorem.

Chapter 3. On The Minimum Norm Solution Of A System Of Linear Inequalities

Lawson and Hanson proposed in their book ([5], chapter 23) an algorithm for solving the so-called LDP (Least Distance Programming) problem. The algorithm computes the minimal norm solution of a system of linear inequalities. The norm used by the two authors for the objective function was the $l^2$-norm, and the problem was

$$\text{Minimize } \|x\|_2 \quad \text{subject to } Gx \ge h.$$

The formulation of the solution to the problem was based on the Kuhn-Tucker optimality conditions. When the norm $\|\cdot\|_2$ is replaced by the $l^p$-norm, $1 < p < \infty$, in the objective function, the problem loses this structure, and a different characterization of the solution has to be considered. We shall propose a generalization of the LDP algorithm, given by Lawson and Hanson, to the $l^p$-case, for any $p$, $1 < p < \infty$. Keeping in mind the difference between the two problems, we will follow closely, whenever possible, the approach in [5]. This is made possible, simpler and more elegant because of the availability of the least distance algorithm presented in chapter 1 of this work.

The problem considered, and referred to as problem (P), is

$$(P) \quad \|x\| \ (\min), \quad Ax \ge b.$$

Here $A$ is an $m \times n$ real matrix, $b$ a real $m$-vector, $x$ a real $n$-vector. The norm $\|\cdot\|$ is assumed to be strictly convex and smooth. To solve the given problem (P), and with an eye toward the characterization of the solution, a dual problem, referred to as (P'), will be introduced, and we will investigate the relation between (P) and (P'). A related least distance problem, similar to the one studied in chapter 1, will then be used to solve (P). The main contribution of this work resides in giving an explicit formula to compute the vector of minimal $\|\cdot\|_p$-norm solving (P).

We introduce briefly the notion of the dual vector for the norm $\|\cdot\|$ on $R^n$. Let $\|\cdot\|'$ denote the dual norm. For any vector $v \in R^n$, $v \ne 0$, a $\|\cdot\|$-dual vector $v'$ is defined by

$$\|v'\|' = 1, \quad (v', v) = \|v\|. \qquad (3.1)$$

Similarly, the $\|\cdot\|'$-dual vector $v^*$ is defined by

$$\|v^*\| = 1, \quad (v^*, v) = \|v\|'. \qquad (3.2)$$

When $\|\cdot\| = \|\cdot\|_p$, $1 < p < \infty$, is the usual $l^p$-norm, then $\|\cdot\|' = \|\cdot\|_q$, where $p + q = pq$. In terms of coordinates, the dual vectors are given by

$$v'_i = (|v_i|/\|v\|_p)^{p-1}\,\mathrm{sgn}\,v_i, \quad i = 1, \ldots, n,$$

$$v^*_i = (|v_i|/\|v\|_q)^{q-1}\,\mathrm{sgn}\,v_i, \quad i = 1, \ldots, n.$$

The following two identities are useful and will often be referred to later:

$$v'^* = v/\|v\| \quad \text{and} \quad v^{*\prime} = v/\|v\|', \qquad (3.3)$$

for any $v \ne 0$.
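A quick numerical check of (3.1) through (3.3) for the $l^p$ norms: both dual vectors in coordinates, their defining identities, and $v'^* = v/\|v\|$. Sketch only; the helper names are ours.

```python
import numpy as np

p = 3.0
q = p / (p - 1.0)
norm  = lambda v, s: np.sum(np.abs(v) ** s) ** (1.0 / s)
prime = lambda v: (np.abs(v) / norm(v, p)) ** (p - 1.0) * np.sign(v)  # v'
star  = lambda v: (np.abs(v) / norm(v, q)) ** (q - 1.0) * np.sign(v)  # v*

v = np.array([2.0, -1.0, 0.5])
assert np.isclose(norm(prime(v), q), 1) and np.isclose(prime(v) @ v, norm(v, p))  # (3.1)
assert np.isclose(norm(star(v), p), 1) and np.isclose(star(v) @ v, norm(v, q))    # (3.2)
assert np.allclose(star(prime(v)), v / norm(v, p))                                # (3.3)
```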
Set

$$K = \{ x \in R^n \mid Ax \ge b \}.$$

If the region $K$ is non-empty, then, due to the compactness of the $\|\cdot\|$ unit ball in $R^n$, a solution of problem (P) exists; it is unique due to the strict convexity of the norm $\|\cdot\|$. Given problem (P), we associate a dual problem

$$(P') \quad \max \{ (b, y) \mid y \in R^m, \ y \ge 0, \ \|A^T y\|' \le 1 \},$$

where $A^T$ is the transpose of the matrix $A$. The next theorem generalizes slightly the duality theorem 3.1 in [6] to the case of linear inequality constraints. It establishes the relation between problem (P) and its dual (P'). Since $0 \in K$ iff $b \le 0$, in which case the value of (P) is zero, we explicitly exclude this easy case.

Theorem 3.1 Assume that $K$ is non-empty and that $0 \notin K$. Then, problems (P) and (P') have the same value.

Proof. We begin by showing the classical weak duality inequality. Let $x \in K$ and $y \in R^m$, $y \ge 0$, such that $\|A^T y\|' \le 1$. Then

$$(b, y) \le (Ax, y) \quad (\text{since } y \ge 0) = (x, A^T y) \le \|x\| \cdot \|A^T y\|' \le \|x\|. \qquad (3.4)$$

Hence, value of (P') $\le$ value of (P). To prove the reverse inequality, let

$$d = d(0, K) = \inf \{ \|x\| \mid x \in K \}.$$

By the duality theorem in Nirenberg [7],

$$d(0, K) = \max \{ -\sigma(z) \mid \|z\|' \le 1 \}, \qquad (3.5)$$

where $\sigma$ denotes the support function of the convex set $K$ and where we have adopted the notation $d(a, X) = \inf \{ \|a - x\| \mid x \in X \}$. Let $z_0$ be the maximizer of (3.5). By the definition of $\sigma$, we see that

$$-\sigma(z_0) = -\sup \{ (z_0, x) \mid x \in K \} = -\sup \{ (z_0, x) \mid Ax \ge b \} = \inf \{ (-z_0, x) \mid Ax \ge b \} = \sup \{ (b, y) \mid A^T y = -z_0, \ y \ge 0 \}. \qquad (3.6)$$

The last step is due to the duality theory of linear programming. Now, since $\sup \{ (b, y) \mid A^T y = -z_0, \ y \ge 0 \}$ is finite, there exists $\bar y \ge 0$ such that

$$A^T\bar y = -z_0, \quad \bar y \ge 0, \quad \text{and} \quad (b, \bar y) = \sup \{ (b, y) \mid A^T y = -z_0, \ y \ge 0 \}.$$

Furthermore, $\|A^T\bar y\|' = \|z_0\|' \le 1$, so $\bar y$ is feasible for (P'). Equation (3.6) combined with (3.4) implies the claimed equality.

The following theorem is due to Sreedharan-Nikolopoulos [6], in a slightly different form. Only a few changes are needed in the proof; we include it for completeness. It states a characterization of the solution of problem (P) and establishes the relation between the solutions of (P) and (P'). Because of theorem 3.1, only a minor modification is needed for its use in our case.

Theorem 3.2 Assume that $K$ is non-empty and that $0 \notin K$. Then the following are equivalent.

(i) $\bar x$ solves problem (P).

(ii) $A\bar x \ge b$ and there exists $y \in R^m$, $y \ge 0$, such that

$$\|A^T y\|' = 1, \quad (b, y) > 0, \quad \text{and} \quad \bar x = (b, y)\,(A^T y)^*.$$

Proof. Due to theorem 3.1, it is easy to see that (ii) implies (i): since $\|A^T y\|' = 1$,

$$\|\bar x\| = (b, y)\,\|(A^T y)^*\| = (b, y),$$

so the norm of the feasible point $\bar x$ equals the dual value $(b, y)$, and optimality follows from (3.4).

To prove the converse, let $y \ge 0$ be a solution of problem (P'). Then

$$\|\bar x\| = (b, y) = \rho > 0.$$

Since $A\bar x \ge b$ and $y \ge 0$, we have

$$(A^T y, \bar x) = (y, A\bar x) \ge (y, b) = \rho.$$

On the other hand,

$$(A^T y, \bar x) \le \|A^T y\|' \cdot \|\bar x\| \le \rho.$$

Thus

$$(A^T y, \bar x) = \rho \quad \text{and} \quad \|A^T y\|' = 1.$$

So

$$(\bar x/\rho, A^T y) = \|A^T y\|'.$$

By uniqueness of the $\|\cdot\|'$-dual and since $\|\bar x/\rho\| = 1$, we have

$$\bar x/\rho = (A^T y)^*,$$

which implies $\bar x = (b, y)(A^T y)^*$, completing the proof.

Theorem 3.3 Assume that $K$ is non-empty and that $0 \notin K$. Then $\bar x$ is the solution of problem (P) iff $A\bar x \ge b$ and there exists $y \in R^m$, $y \ge 0$, such that

$$\|A^T y\|' = 1, \qquad (3.7)$$

and

$$\bar x' = A^T y. \qquad (3.8)$$

Furthermore, $\|\bar x\| = (b, y)$, and $y$ is the solution of (P').

Proof. If $\bar x$ is the solution of (P), then by theorem 3.2 and positive homogeneity of the dual map,

$$\bar x' = \big((A^T y)^*\big)' = A^T y/\|A^T y\|' = A^T y,$$

with $y \ge 0$ and $\|A^T y\|' = 1$, which proves the "only if" part.

"If" part: By (3.8) and (3.3),

$$\bar x/\|\bar x\| = \bar x'^* = (A^T y)^*.$$

Thus

$$1 = \|\bar x'\|' = \|A^T y\|' = \big(A^T y, (A^T y)^*\big) = (A^T y, \bar x/\|\bar x\|).$$

So

$$\|\bar x\| = (A^T y, \bar x) = (y, A\bar x) \ge (y, b).$$

Now,

$$(b, y) \le (y, A\bar x) = (A^T y, \bar x) \le \|A^T y\|' \cdot \|\bar x\| = \|\bar x\|.$$

This implies $\|\bar x\| = (b, y)$. Therefore, $\bar x$ and $y$ solve (P) and (P') respectively, as desired.
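The Theorem 3.3 certificate is easy to test numerically for a candidate pair $(x, y)$ in the $l^p$ setting: feasibility $Ax \ge b$, $y \ge 0$, $\|A^T y\|_q = 1$, and $x' = A^T y$. A minimal sketch; the helper names, tolerance, and return convention are ours.

```python
import numpy as np

def check_certificate(A, b, x, y, p, tol=1e-8):
    q = p / (p - 1.0)
    norm  = lambda v, s: np.sum(np.abs(v) ** s) ** (1.0 / s)
    prime = lambda v: (np.abs(v) / norm(v, p)) ** (p - 1.0) * np.sign(v)
    ok = (np.all(A @ x >= b - tol) and np.all(y >= -tol)
          and abs(norm(A.T @ y, q) - 1) <= tol           # (3.7)
          and np.allclose(prime(x), A.T @ y, atol=tol))  # (3.8)
    return ok, norm(x, p), b @ y   # the two values agree at optimality
```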
As mentioned in the discussion at the beginning of this chapter, a key step toward the solution of problem (P) lies in our ability to compute effectively a least distance solution of a related problem. Before we formulate this problem, we set down some notation that will be used throughout this chapter. Let $E$ be an $m \times n$ real matrix and let $c \in R^m$. We consider the convex cone

$$C = \{ z \in R^m \mid E^T z \le 0 \}$$

and its negative polar

$$C^\circ = \{ Ex \mid x \ge 0, \ x \in R^n \}.$$

Given $E$ and $c$, we introduce the following minimization problem:

$$\min \{ \|c - y\|' \mid y \in C^\circ \}. \qquad (3.9)$$

With (3.9) is associated the following dual problem:

$$\max \{ (c, z) \mid z \in C, \ \|z\| = 1 \}. \qquad (3.10)$$

Problem (3.9) and its dual (3.10) have the same value (see chapter 1). The relation between (3.9) and its dual (3.10) was studied in Sreedharan [20], where the following theorem is proved.

Theorem 3.4 Assume that $c \notin C^\circ$. Let $\hat u \ge 0$ be such that $E\hat u$ solves problem (3.9), and let $y$ be the maximizer of (3.10). Then

$$E\hat u = c - (c, y)\,y' \qquad (3.11)$$

and

$$(E^T y, \hat u) = 0. \qquad (3.12)$$

Proof. See theorems 3.5 and 3.7 in Sreedharan [20]. Note the interchange of primes and stars, since our minimization problem uses the dual norm.

Corollary 3.5 ([20]) If $\hat u \ge 0$ is such that $E\hat u$ solves problem (3.9), then

$$E^T r' \le 0, \qquad (3.13)$$

where $r = c - E\hat u$.

Remark. Since the map $z \mapsto z'$ is odd and positively homogeneous of degree zero on $R^m \setminus \{0\}$, it is easily seen from (3.11) that

$$r' = \big((c, y)\,y'\big)' = y/\|y\| = y \quad (\text{since } \|y\| = 1). \qquad (3.14)$$

We are now ready to give the algorithm for solving (P) when $\|\cdot\| = \|\cdot\|_p$. The reader will note that for the validity of the present algorithm it is not sufficient to assume that the norm $\|\cdot\|$ is just strictly convex and smooth: the special structure of the $l^p$ norm will be used. A careful look at our proof reveals that the new requirement is the following: let $v \ne 0$ and let $v'$ be its $\|\cdot\|$-dual; then $v'_i = 0$ implies $v_i = 0$, where $v = (v_1, \ldots, v_n)$ and $v' = (v'_1, \ldots, v'_n)$.

The algorithm starts by solving a problem of type (3.9); then it proceeds to compute the solution of (P). We state this algorithm for the norms $\|\cdot\| = \|\cdot\|_p$ and $\|\cdot\|' = \|\cdot\|_q$, where $p + q = pq$. Given the matrix $A$ and the vector $b$ defining problem (P), let

$$E = [A, b]^T \quad \text{and} \quad c = [0, \ldots, 0, 1]^T;$$

here $E$ is an $(n + 1) \times m$ matrix and $c$ an $(n + 1)$-vector.

Algorithm 3.1

Step 0. If $b \le 0$, then $x = 0$ solves (P); GO TO step 6.

Step 1. Find $\hat u \ge 0$, a solution of the problem $\|c - Eu\|_q$ (min), $u \ge 0$.

Step 2. Compute the residual $r = c - E\hat u$.

Step 3. If $r = 0$, the feasible region of (P) is empty; GO TO step 6. Else proceed.

Step 4. Compute $r' = (r'_1, \ldots, r'_{n+1})$, the $\|\cdot\|_p$ dual of $r$. Let

$$x_j = -r'_j/r'_{n+1}, \quad j = 1, \ldots, n.$$

Step 5. Accept $x$ as the solution of (P).

Step 6. The computation is complete.

Before proceeding any further, some comments are in order here. To find $\hat u \ge 0$ in step 1, Algorithm 1.2 of chapter 1 can be used. Step 3 answers the question of the feasibility of the system $Ax \ge b$. To determine feasibility alone we may use the $l^2$-residual; if the problem is feasible, we actually start all over from step 1 with the given $l^q$ norm. If the region $K$ is non-empty (the system has a solution), then the algorithm computes the solution of the system which has minimum norm. If step 3 is answered affirmatively, the region $K$ is empty, and in this case we exit the algorithm.
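The wiring of Algorithm 3.1 is short. In the sketch below, step 1's non-negative least-error problem in the $l^q$ norm is delegated to the chapter-1 routine (the algorithm_1_1 sketch above, called with exponent $q$ so the error is measured in $\|\cdot\|_q$). Illustrative only; the names and the feasibility tolerance are our choices.

```python
import numpy as np

def algorithm_3_1(A, b, p, tol=1e-10):
    m, n = A.shape
    if np.all(b <= 0):                          # step 0: x = 0 is optimal
        return np.zeros(n)
    q = p / (p - 1.0)
    E = np.vstack([A.T, b[None, :]])            # E = [A, b]^T, (n+1) x m
    c = np.zeros(n + 1); c[-1] = 1.0
    u, _ = algorithm_1_1(E, c, q)               # step 1: ||c - Eu||_q (min), u >= 0
    r = c - E @ u                               # step 2
    rq = np.sum(np.abs(r) ** q) ** (1.0 / q)
    if rq <= tol:                               # step 3: Ax >= b infeasible
        return None
    rp = (np.abs(r) / rq) ** (q - 1.0) * np.sign(r)   # step 4: ||.||_p dual of r
    return -rp[:n] / rp[-1]                     # step 5
```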
We start by proving the feasibility of the algorithm.

Proposition 3.6 Let r = c − Eū be the residual given by steps 1 and 2 of the algorithm. If step 3 is answered negatively, then

    r'_{n+1} = ||r||_q > 0.

Proof. Note that since the case b ≤ 0 has already been handled, 0 ∉ K. In this case, let y be the solution of the dual problem (3.10). It follows from (3.12) and (3.14) that

    (E^T r', ū) = (E^T y, ū) = (y, Eū) = 0    (due to (3.12)).        (3.15)

This implies that

    0 = (E^T r', ū) = (r', Eū)
      = (r', c − r)    (due to step 2)
      = (r', c) − ||r||_q.                                            (3.16)

Step 3 of the algorithm is answered negatively iff r ≠ 0. Hence from (3.16) we obtain

    r'_{n+1} = (r', c) = ||r||_q > 0    (because of (3.1)).           (3.17)

This completes the proof.

In the next result, we show that the stopping criterion of the proposed algorithm is well formulated. We also show the feasibility of the system Ax ≥ b if r ≠ 0.

Theorem 3.7 Let r be the residual vector, with n + 1 components, given by steps 1 and 2 of the algorithm. Then:

(a) If x is as defined in step 4, x is the minimal norm solution of problem (P).
(b) If r = 0, the system Ax ≥ b is inconsistent.

Proof. Let us prove (a). From step 4 of the algorithm, we have

    x_j = −r'_j / r'_{n+1} = −r'_j / ||r||_q,   j = 1, ..., n.        (3.18)

To verify the feasibility of x, we need to show Ax ≥ b. By (3.17), we have

    −||r||_q [x, −1]^T = (−r'_{n+1}) [x, −1]^T
                       = (−r'_{n+1}) [−r'₁/r'_{n+1}, ..., −r'_n/r'_{n+1}, −1]^T
                       = [r'₁, ..., r'_n, r'_{n+1}]^T
                       = r'.

Hence

    −[A, b] [x, −1]^T ||r||_q = E^T r'.                               (3.19)

This, combined with inequality (3.13) of Corollary 3.5, implies

    (b − Ax) ||r||_q = E^T r' ≤ 0.

Thus Ax ≥ b, as claimed. If x = 0 then, due to the feasibility we just proved, b ≤ 0 and we are done. So assume that x ≠ 0. Then, by step 4, r'_j ≠ 0 for some j, 1 ≤ j ≤ n. As observed earlier, this implies r_j ≠ 0, due to the special form of the ||.||_p dual in the case of the l^p norm. So r̄ = (r₁, ..., r_n) ≠ 0. Now we have

    r'_j = (|r_j| / ||r||_q)^{q−1} sgn r_j = |r_j|^{q−1} sgn r_j / ||r||_q^{q−1},   j = 1, ..., n.

From this it follows that

    −x_j = |r_j|^{q−1} sgn r_j / (||r||_q ||r||_q^{q−1})
         = |r_j|^{q−1} sgn r_j / ||r||_q^q
         = ((|r_j|^{q−1} / ||r̄||_q^{q−1}) sgn r_j) · (||r̄||_q^{q−1} / ||r||_q^q),   j = 1, ..., n.

So,

    −x = r̄' · (||r̄||_q^{q−1} / ||r||_q^q).

Using the fact that the map z ↦ z' is odd and positively homogeneous of degree zero on R^n \ {0}, we get

    −x' = (r̄')' = r̄ / ||r̄||_q.                                       (3.20)

By the definition of E and c, we have r̄ = −A^T ū. Thus

    x' = A^T ū / ||r̄||_q = A^T (ū / ||r̄||_q),   ū ≥ 0.               (3.21)

This implies that x satisfies equation (3.8) with y = ū/||r̄||_q, y ≥ 0 and ||A^T y||_q = 1. We have verified that x satisfies the conditions of Theorem 3.3, so that x is the solution of problem (P). The proof of (a) is now complete.

To prove (b), assume that r = 0 and that there exists a solution x of the system Ax ≥ b. We have

    A^T ū = 0,   b^T ū = 1.                                           (3.22)

By step 1 of the algorithm, ū ≥ 0, so

    1 = b^T ū ≤ (Ax)^T ū = (x, A^T ū) = 0    (by (3.22)),

a contradiction. Thus, the system Ax ≥ b has no solution, as claimed. This completes the proof.

We close this chapter with the following observations.

1. With the algorithm of Chapter 1 at hand, the present algorithm for finding the minimal norm solution of a system of linear inequalities is easy to implement.

2. A consequence of this algorithm is that it determines whether the system of linear inequalities under consideration is consistent or not, as shown in step 3. If this is all one is interested in, then one would use the l²-norm in place of the l^q-norm in solving the problem stated in step 1.
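Observation 1 above is easy to substantiate: once ū is available, the only l^p-specific computation left is the dual-vector formula used in step 4. A minimal sketch (the helper names are hypothetical) is given below; the printed checks confirm the two defining identities of the dual vector, (z, z') = ||z||_q and ||z'||_p = 1.

```python
import numpy as np

def dual_vector(z, q):
    """||.||_p dual of a nonzero z in (R^k, ||.||_q), p = q/(q-1):
    z'_j = (|z_j| / ||z||_q)^(q-1) sgn z_j."""
    return np.sign(z) * (np.abs(z) / np.linalg.norm(z, ord=q)) ** (q - 1)

def x_from_residual(r, q):
    """Step 4 of Algorithm 3.1: x_j = -r'_j / r'_{n+1}, j = 1, ..., n."""
    rp = dual_vector(r, q)
    return -rp[:-1] / rp[-1]

z = np.array([3.0, -1.0, 2.0])
q = 1.5
p = q / (q - 1.0)
zp = dual_vector(z, q)
print(np.isclose(zp @ z, np.linalg.norm(z, ord=q)))   # (z, z') = ||z||_q
print(np.isclose(np.linalg.norm(zp, ord=p), 1.0))     # ||z'||_p = 1
```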
Chapter 4

Numerical results

In this chapter we discuss the computational aspects of the algorithm presented in Chapter 1 of this work. At this time we will not investigate the numerical results of the algorithm of Chapter 2; the full implementation of that algorithm will be presented elsewhere. The main computational difficulty encountered in this algorithm is finding α_k > 0 such that

    ||y_k + α_k r_k||' = 1 + α_k ν_k.

This search occurs in step 5 at each iteration cycle. Suppose that we are at the stage of entering step 5. Let the function F be defined by

    F(α) = 1 + α ν_k − ||y_k + α r_k||'.                              (4.1)

We are searching for α_k > 0 such that F(α_k) = 0. It is well known that

    (d/dα) ||y_k + α r_k||' = ((y_k + α r_k)', r_k)                   (4.2)

(see, for example, [14]).

Assume that the search for α_k has been reduced to an interval (β, γ), γ > β ≥ 0, with F(β) > 0 and F(γ) < 0. We begin by fitting a quadratic q(α) on the interval [β, γ] as follows:

    q(β) = F(β) =: F₁,
    q(γ) = F(γ) =: F₂,                                                (4.3)
    q'(β) = F'(β) =: F₁'.

We seek the root α̂ of q. If |F(α̂)| ≤ η and α̂ ∈ (β, γ), where η is a given tolerance parameter, then we set α_k = α̂ and return to the main algorithm. If the stopping condition |F(α̂)| ≤ η is not met but α̂ belongs to (β, γ), we reduce the interval of search by setting

    γ = α̂   if F(α̂) < 0,
    β = α̂   if F(α̂) > 0,

and then apply the routine to the new, reduced interval (β, γ), until an acceptable α_k is obtained. In the case when the root α̂ is not in the interval of search (β, γ), or if the quadratic interpolation has no real root, we consider the quadratic fitting as not suitable. A linear interpolation is then performed to determine α̂, i.e.

    α̂ = (β F₂ − γ F₁) / (F₂ − F₁),                                    (4.4)

followed by an update of the interval (β, γ), as was done in the quadratic fitting case.

Let

    q(α) = A(α − β)² + F₁'(α − β) + F₁                                (4.5)

be the quadratic interpolation defined via (4.3). It is easily seen that

    A = (F₂ − F₁ − F₁'(γ − β)) / (γ − β)².                            (4.6)

The roots of q are

    α − β = (−F₁' ± sqrt(F₁'² − 4AF₁)) / 2A.                          (4.7)

Recall that the quadratic interpolation is considered under the conditions F₁ > 0 and F₂ < 0. So q attains its maximum in (−∞, α̂), and it follows that q'(α̂) = 2A(α̂ − β) + F₁' ≤ 0. Thus, the only relevant root in (4.7) is

    α̂ − β = (−F₁' − sqrt(F₁'² − 4AF₁)) / 2A.

The following subalgorithm is based on the above discussion. For a further refinement of the interval of search (β, γ), we include a bisection to be performed at each iteration cycle of the subalgorithm.

4.1 Subalgorithm

Step 0. Let β = 0 and let γ be such that F(γ) < 0. Let ε > 0.
Step 1. Let F₁ = F(β), F₁' = F'(β) and F₂ = F(γ).
Step 2. Let h = γ − β.
Step 3. Compute A = (F₂ − F₁ − F₁'h)/h². If A = 0, GO TO step 7; else proceed.
Step 4. Let Δ = F₁'² − 4AF₁. If Δ < 0, GO TO step 7; else proceed.
Step 5. Set α̂ = β + (−F₁' − sqrt(Δ))/2A.
Step 6. If β < α̂ < γ, GO TO step 8; else proceed.
Step 7. Let α̂ = (βF₂ − γF₁)/(F₂ − F₁).
Step 8. If |F(α̂)| ≤ ε, set α_k = α̂ and RETURN to the main program; else proceed.
Step 9. If F(α̂) < −ε, set γ = α̂ and GO TO step 11; else proceed.
Step 10. Set β = α̂.
Step 11. Let α̂ = (β + γ)/2.
Step 12. If |F(α̂)| ≤ ε, set α_k = α̂ and RETURN to the main program; else proceed.
Step 13. If F(α̂) < −ε, set γ = α̂ and RETURN to step 1; else proceed.
Step 14. Set β = α̂ and RETURN to step 1.

A Newton method can also be incorporated within this subalgorithm. As noted in Chapter 1, in the l^p case the dual of a given non-zero vector z is given by

    z'_i = (|z_i| / ||z||_q)^{q−1} sgn z_i,   i = 1, ..., m.

This particularly simple formula for z' makes the calculation of the derivative in (4.2) immediate. We use the Newton iterations as follows. Via the quadratic model, we determine α̂ belonging to (β, γ); then we start the Newton iteration at α̂. We compute

    α̃ = α̂ − F(α̂)/F'(α̂).                                              (4.8)

Having this new approximation, we check whether α̃ belongs to (β, γ) and whether it yields an actual decrease in |F|. Only under these conditions do we continue the Newton iterations, until an acceptable α_k is reached. If one of the conditions

    (a) α̃ ∈ (β, γ),
    (b) |F(α̃)| < |F(α̂)|,

is not met, we exit the Newton iteration and return to the quadratic model.
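A direct transcription of Subalgorithm 4.1, with the Newton refinement guarded by conditions (a) and (b), might look like the sketch below. Here F and Fp stand for the function F of (4.1) and its derivative obtained from (4.2); the test function at the end is a hypothetical stand-in with root α = 0.4, used only to exercise the bracketing logic.

```python
import numpy as np

def find_alpha(F, Fp, gamma, eps=1e-10, max_iter=100):
    """Safeguarded search for F(alpha) = 0 in the spirit of Subalgorithm 4.1:
    quadratic interpolation (4.5)-(4.7), the secant fallback (4.4), a bisection
    refinement, and guarded Newton steps (4.8).  Assumes F(0) > 0, F(gamma) < 0."""
    beta = 0.0
    for _ in range(max_iter):
        F1, F1p, F2 = F(beta), Fp(beta), F(gamma)
        h = gamma - beta
        A = (F2 - F1 - F1p * h) / h**2                      # (4.6)
        disc = F1p**2 - 4.0 * A * F1
        if A != 0.0 and disc >= 0.0:
            a = beta + (-F1p - np.sqrt(disc)) / (2.0 * A)   # relevant root of (4.7)
        else:
            a = None
        if a is None or not (beta < a < gamma):
            a = (beta * F2 - gamma * F1) / (F2 - F1)        # secant step (4.4)
        for _ in range(20):                  # guarded Newton iterations (4.8):
            a_new = a - F(a) / Fp(a)         # stop as soon as the iterate leaves
            if not (beta < a_new < gamma) or abs(F(a_new)) >= abs(F(a)):
                break                        # (beta, gamma) or |F| stops decreasing
            a = a_new
        if abs(F(a)) <= eps:
            return a
        if F(a) < 0.0:
            gamma = a
        else:
            beta = a
        a = 0.5 * (beta + gamma)             # bisection refinement
        if abs(F(a)) <= eps:
            return a
        if F(a) < 0.0:
            gamma = a
        else:
            beta = a
    return 0.5 * (beta + gamma)

# Hypothetical test function with root alpha = 0.4: F(a) = 1 + a/4 - sqrt(0.81 + a).
F = lambda a: 1.0 + 0.25 * a - np.sqrt(0.81 + a)
Fp = lambda a: 0.25 - 0.5 / np.sqrt(0.81 + a)
print(find_alpha(F, Fp, gamma=1.0))          # approximately 0.4
```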
Algorithm 1.2, suggested in Chapter 1, was coded in Fortran 77 on a SPARCstation. The norm considered is the l^p-norm

    ||x||_p = (Σ_i |x_i|^p)^{1/p}

for various values of p. The dual norm is the ||.||_q norm, q = p/(p − 1). The stopping rule parameter within which we consider the tolerance for the duality gap as acceptable is ε = 10⁻⁶. The code was run in double precision. We calculated the sequence (α_k) using the subalgorithm outlined in this chapter. We also incorporated a Newton iteration scheme, as discussed earlier, to determine each α_k.

The coded version of the main algorithm seems to do much better for p ≥ 2 than for p in the range (1, 2). When p > 2 and not far from 2, the convergence compares well with [20]. We also found that the sequence ((b, y_k)) increases and the duality gap decreases monotonically. However, in some cases the sequence (α_k) poses more problems; e.g., it may oscillate, with subsequences approaching two different limits. The case 1 < p < 2 does not do as well. For example, for p = 1.8 a significantly larger number of iterations was needed to reach the same acceptable tolerance of 10⁻⁶. The sequence ((b, y_k)) still increased monotonically, and the duality gap decreased in the same way. In conclusion, compared to the algorithm in [20], the present algorithm seems to perform well for values of p larger than 2, but not as well for p in the range (1, 2) with regard to the number of iterations.

The following linear system is taken from Barrodale and Young [1] and was used by Sreedharan [20] and others ([8], [10]) for 1 < p < ∞:

     1.52 x₁        = 1.025
          x₁ +  x₂ =
          x₁ + 2x₂ = 0.475
          x₁ + 3x₂ = 0.01
          x₁ + 4x₂ = −0.475
    −1.005 x₁ + 5x₂ =

We record the results in the following table.

      p     x₁         x₂         ρ          iterations
      5.0   0.258716   0.000334   1.472222   149
      4.5   0.253714   0.001339   1.507273    59
      4.0   0.261537   0.000031   1.546597    28
      3.8   0.261783   0.000000   1.568348    27
      3.5   0.261952   0.000000   1.607494    24
      3.0   0.261791   0.000000   1.697914    20
      2.5   0.260704   0.000000   1.842740    15
      2.0   0.258333   0.000000   2.102851     1
      1.8   0.239018   0.0019     2.280894   165

                  Table 4.1: 1.8 ≤ p ≤ 5

Algorithm 1.2 was also run, with the same Fortran 77 code, for various values of λ_k = 1/θ_k. The results are presented for p = 3 in Table 4.2; the tolerance parameter is ε = 10⁻⁶. For p = 3, the numerical results suggest that the algorithm converges faster for λ_k near 1/2; the convergence tends to be slower for λ_k far away from 1/2. We used the same example as above.

      θ_k    x₁           x₂    ρ            iterations
       1.8   0.26179422   0.0   1.69791476    16
       1.9   0.26179173   0.0   1.69791476    18
       2.0   0.26179170   0.0   1.69791476    20
       2.2   0.26179069   0.0   1.69791476    20
       2.5   0.26179026   0.0   1.69791476    23
       2.8   0.26179038   0.0   1.69791476    26
       3.0   0.26179090   0.0   1.69791476    29
       4.0   0.26179037   0.0   1.69791476    39
       5.0   0.26179041   0.0   1.69791476    50
      10.0   0.26179035   0.0   1.69791476   104

                     Table 4.2: p = 3
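For completeness, the duality-gap stopping rule mentioned above reduces to a one-line test. The particular form of the gap shown here, ||b − A x_k||_p − (b, y_k), is an assumption inferred from the duality framework of this thesis, not a quotation of the Fortran code.

```python
import numpy as np

def duality_gap_ok(A, b, x_k, y_k, p, eps=1e-6):
    """Hypothetical form of the stopping test: accept the iterate when the
    duality gap ||b - A x_k||_p - (b, y_k) falls below the tolerance eps."""
    return np.linalg.norm(b - A @ x_k, ord=p) - b @ y_k <= eps
```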
Bibliography

[1] Barrodale, I. and Young, A., Algorithms for best L₁ and L∞ linear approximation on a discrete set, Numer. Math., 8, (1966), 295-306.

[2] Cheney, E. W., "Introduction to Approximation Theory", 2nd ed., McGraw-Hill, New York, 1982.

[3] Fletcher, R., Grant, J. A. and Hebden, M. D., The calculation of linear best Lp approximation, Computer J., 14, (1971), 276-279.

[4] Gill, P. E., Murray, W., Saunders, M. A. and Wright, M. H., Linearly constrained optimization, in Nonlinear Optimization 1981, M. J. D. Powell, ed., Academic Press, London (1982), 123-139.

[5] Lawson, C. L. and Hanson, R. J., "Solving Least Squares Problems", Prentice-Hall, Englewood Cliffs, NJ (1974).

[6] Nikolopoulos, P. V. and Sreedharan, V. P., An algorithm for computing non-negative minimal norm solutions, Numer. Funct. Anal. and Optimiz., to appear.

[7] Nirenberg, L., "Functional Analysis", Lectures given in 1960-61, notes by Lesley Sibner, New York University, (1961).

[8] Owens, R. W., An algorithm for best approximate solutions of Ax = b with a smooth strictly convex norm, Numer. Math., 29, (1977), 83-91.

[9] Owens, R. W. and Sreedharan, V. P., An algorithm for approximation by elements of a cone in a Banach space, Numer. Funct. Anal. and Optimiz., 10, (1989), 1161-1189.

[10] Owens, R. W. and Sreedharan, V. P., Least squares methods to minimize errors in a smooth, strictly convex norm, J. Approx. Theory, 2, (1993), 180-198.

[11] Polak, E., "Computational Methods in Optimization", Academic Press, New York, NY (1971).

[12] Rockafellar, R. T., "Convex Analysis", Princeton University Press, Princeton, NJ, 1969.

[13] Späth, H., "Mathematical Algorithms for Linear Regression", Academic Press, Orlando, FL, in press.

[14] Sreedharan, V. P., Least squares algorithms for finding solutions of overdetermined linear equations which minimize error in an abstract norm, Numer. Math., 17, (1971), 387-401.

[15] Sreedharan, V. P., Least squares algorithms for finding solutions of overdetermined systems of linear equations which minimize error in a smooth strictly convex norm, J. Approx. Theory, 8, (1973), 46-61.

[16] Sreedharan, V. P., A subgradient projection algorithm, J. Approx. Theory, 35, (1982), 111-126.

[17] Sreedharan, V. P., Subgradient projection algorithm II, J. Approx. Theory, 41, (1984), 217-243.

[18] Sreedharan, V. P., Extensions of subgradient projection algorithms, J. Approx. Theory, 47, (1986), 228-239.

[19] Sreedharan, V. P., ε-Subgradient projection algorithm, J. Approx. Theory, 51, (1987), 27-46.

[20] Sreedharan, V. P., An algorithm for non-negative norm minimal solutions, Numer. Funct. Anal. and Optimiz., 9, (1987), 193-232.

[21] Watson, G. A., "Approximation Theory and Numerical Methods", John Wiley and Sons, New York, NY (1980).

[22] Zangwill, W. I., "Nonlinear Programming: A Unified Approach", Prentice Hall, Englewood Cliffs, NJ, (1969).

[23] Zoutendijk, G., "Methods of Feasible Directions", Elsevier, Amsterdam (1960).