This is to certify that the dissertation entitled "Homotopy Methods and Algorithms for Real Symmetric Eigenproblems," presented by Kuiyuan Li, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Mathematics. Major professor. Date: July 18, 1991.

HOMOTOPY METHODS AND ALGORITHMS FOR REAL SYMMETRIC EIGENPROBLEMS

By Kuiyuan Li

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY, Department of Mathematics, 1991

ABSTRACT

HOMOTOPY METHODS AND ALGORITHMS FOR REAL SYMMETRIC EIGENPROBLEMS

By Kuiyuan Li

This thesis discusses homotopy methods and algorithms for real symmetric eigenproblems. It contains two parts. In the first part, a new algorithm is presented for finding all the eigenvalues and the corresponding eigenvectors of a symmetric tridiagonal matrix. The algorithm is based on the homotopy continuation approach coupled with the strategy of 'Divide and Conquer'. Evidenced by the numerical results, the algorithm provides a considerable advance over previous attempts in using the homotopy method for symmetric eigenvalue problems. Numerical comparisons of the algorithm with the methods in the widely used EISPACK library as well as Cuppen's 'Divide and Conquer' method are presented. It appears that our algorithm is strongly competitive in terms of speed, accuracy and orthogonality, and leads in speed in almost all the cases. The performance of the parallel version of the algorithm is also presented.

In the second part, a homotopy method for finding all eigenpairs of a real symmetric matrix pencil (A, B) is given, where A and B are real n x n symmetric matrices and B is a positive semidefinite or ill-conditioned positive definite matrix. A reduction of the pencil (A, B) to a pencil (Ã, B̃) is given, where Ã is an unreduced symmetric tridiagonal matrix and B̃ is a positive definite diagonal matrix. One can easily form the eigenpair (x, λ) of the pencil (A, B) from the eigenpair (y, λ) of the pencil (Ã, B̃). Furthermore, a formula is presented for finding the number of the finite eigenvalues of the pencil (A, B) without actually solving the generalized eigenproblem. By choosing the initial pencil properly, the homotopy curves are very well separated and, in general, very flat and easy to follow. The homotopy algorithm is compared with the QZ algorithm. The numerical results show that the homotopy algorithm leads in speed in all the cases.

To my daughter Lei.

ACKNOWLEDGMENTS

I would like to thank Professor Tien-Yien Li, my dissertation advisor, for his constant encouragement and support during my graduate study at Michigan State University. I would also like to thank him for suggesting the problems and the helpful directions which made this work possible.

I would like to thank Professor R. O. Hill, under whose guidance I took a reading course in Symmetric Eigenvalue Problems which laid the foundation for this research. I would also like to thank my dissertation committee members Professor Q. Du, Professor D. R. Dunninger, Professor R. O. Hill, and Professor D. Yen for their valuable suggestions and their time.
Contents

Part I  The Homotopy Method for Real Symmetric Tridiagonal Eigenproblem

1 The Homotopy Method for Real Symmetric Tridiagonal Eigenproblem  2
  1.1 Introduction  2
  1.2 Preliminary Analysis  5

2 The Homotopy Algorithms for Real Symmetric Tridiagonal Eigenproblem  16
  2.1 Introduction  16
  2.2 Initiating at t = 0  18
  2.3 Prediction  18
  2.4 Correction  19
  2.5 Checking  20
  2.6 Detection of a cluster and subspace iteration  20
  2.7 Step size selection  21
  2.8 Terminating at t = 1  21

3 Numerical Results of the Homotopy Algorithms for Real Symmetric Tridiagonal Eigenproblem  23
  3.1 Test Matrices  23
  3.2 A serial comparison with the existing methods  24
  3.3 An indirect comparison with the existing parallel algorithms  28

Part II  The Homotopy Method for Real Symmetric Generalized Eigenproblem

4 Introduction  41
5 Reduction  44
6 Preliminary Analysis  61
7 The Homotopy Algorithms for Generalized Eigenproblem  65
  7.1 Reduction  65
  7.2 Initiating at t = 0  66
  7.3 Prediction  66
  7.4 Correction  67
  7.5 Checking  68
  7.6 Detection of a cluster  69
  7.7 Step size selection  70
  7.8 Terminating at t = 1  70
  7.9 Forming the eigenpairs of (A, B)  71
8 Numerical Results of the Homotopy Method for Symmetric Generalized Eigenproblem  72

List of Tables

3.1 Execution Time (second) of computed eigenvalues and eigenvectors  25
3.2 The residual of computed eigenvalues and eigenvectors  26
3.3 The orthonormality of computed eigenvectors  27
3.4 Execution Time (second) of computed eigenvalues and eigenvectors from type 1 to 7  29
3.5 The residual and the orthonormality of computed eigenvalues and eigenvectors from type 1 to 7  30
3.6 Execution Time (second) of computed eigenvalues and eigenvectors from type 8 to 14  31
3.7 The residual and the orthonormality of computed eigenvalues and eigenvectors from type 8 to 14  32
3.8 Execution Time (second) of computed eigenvalues and eigenvectors from type 15 to 21  33
3.9 The residual and the orthonormality of computed eigenvalues and eigenvectors from type 15 to 21  34
3.10 Execution time (second), speed-up and efficiency of HOMO on [1, 2, 1] and T2 matrices  35
3.11 Execution time (second), speed-up and efficiency of HOMO on Wilkinson and Random matrices  36
3.12 A comparison on the Alliant FX/8 for TQL2 vs. TREEQL on [1, 2, 1] matrices  37
3.13 Speed-up over TQL2 on the Alliant FX/8 for computing all the eigenpairs of the [-1, 2, -1] matrix of order 500  38
3.14 A comparison on the Alliant FX/8 for TQL2 vs. TREEQL on random matrices  38
8.1 Execution Time (second) of computed eigenpairs of generalized eigenproblems  74
8.2 The residual and orthonormality of computed eigenvectors of generalized eigenproblems  75

List of Figures

1.1 Homotopy curves  11
1.2 The eigenvalue curves of the [1, i, 1] matrix with D = [0, i, 0]  14
1.3 The eigenvalue curves of a random matrix with α_i < α_{i+1} on the diagonal and with D = [0, α_i, 0]  15
3.1 Efficiency of the homotopy method on [1, 2, 1], T2, Random matrices and W_n^+  39
3.2 Speed-up of the homotopy method on [1, 2, 1], T2, Random matrices and W_n^+  39
6.1 The generalized homotopy curves  64

PART I

The Homotopy Method for Real Symmetric Tridiagonal Eigenproblem

Chapter 1

The Homotopy Method for Real Symmetric Tridiagonal Eigenproblem

1.1 Introduction

In this chapter, we propose a new algorithm, based on the continuation approach, for finding all the eigenvalues and eigenvectors of a symmetric tridiagonal matrix. Let A be an n x n real symmetric tridiagonal matrix of the form

$$
A = \begin{pmatrix}
\alpha_1 & \beta_2 & & \\
\beta_2 & \alpha_2 & \ddots & \\
 & \ddots & \ddots & \beta_n \\
 & & \beta_n & \alpha_n
\end{pmatrix}. \tag{1.1}
$$

In (1.1), if some β_i = 0, then R^n clearly decomposes into two complementary subspaces invariant under A. Thus the eigenproblem decomposes in an obvious way into two smaller subproblems. Therefore we will assume that each β_i ≠ 0. That is, A is unreduced.

Our algorithm employs the strategy of 'Divide and Conquer'. First of all, the matrix A is divided into two blocks by setting one of the β's equal to zero. Namely, we let

$$
D = \begin{pmatrix} D_1 & 0 \\ 0 & D_2 \end{pmatrix} \tag{1.2}
$$

where

$$
D_1 = \begin{pmatrix}
\alpha_1 & \beta_2 & & \\
\beta_2 & \alpha_2 & \ddots & \\
 & \ddots & \ddots & \beta_k \\
 & & \beta_k & \alpha_k
\end{pmatrix},
\qquad
D_2 = \begin{pmatrix}
\alpha_{k+1} & \beta_{k+2} & & \\
\beta_{k+2} & \alpha_{k+2} & \ddots & \\
 & \ddots & \ddots & \beta_n \\
 & & \beta_n & \alpha_n
\end{pmatrix}.
$$

We then calculate the eigenvalues of the unreduced matrices D_1 and D_2 by using the most efficient algorithm available. Different from Cuppen's 'Divide and Conquer' method [8], our algorithm conquers the matrix A by the homotopy H : R^n x R x [0, 1] → R^n x R, defined by

$$
H(x, \lambda, t) = (1-t)\begin{pmatrix} \lambda x - Dx \\ \frac{x^T x - 1}{2} \end{pmatrix}
+ t \begin{pmatrix} \lambda x - Ax \\ \frac{x^T x - 1}{2} \end{pmatrix}
= \begin{pmatrix} \lambda x - [(1-t)D + tA]x \\ \frac{x^T x - 1}{2} \end{pmatrix}
= \begin{pmatrix} \lambda x - A(t)x \\ \frac{x^T x - 1}{2} \end{pmatrix} \tag{1.3}
$$

where A(t) = (1 - t)D + tA and D is called an initial matrix. It can be easily seen that the solution set of H(x, λ, t) = 0 in (1.3) consists of disjoint smooth curves, each of which joins an eigenpair of D to one of A [5, 6, 7, 20]. We call each of these curves a homotopy curve or an eigenpath. Thus, by following the eigenpaths emanating from the eigenpairs of D at t = 0, we can reach all the eigenpairs of A at t = 1.

Theorem 1.1 in the next section shows that the eigenvalue component λ(t) of each eigenpath (x(t), λ(t)) is monotonic in t. On the other hand, by the Hoffman-Wielandt theorem [31],

$$
\sum_{i=1}^{n} (\lambda_i - \lambda_i(t))^2 \le \|A(t) - A\|_F^2 = 2(1-t)^2 \beta_{k+1}^2 \quad \text{for any } t \in [0,1] \tag{1.4}
$$

where ||·||_F is the Frobenius norm and where λ_i and λ_i(t) are the i-th eigenvalues of A and A(t) respectively. Therefore, all the eigenvalue curves λ_i(t) can be quite flat if β_{k+1} (= 0 in D) is very small, especially when n is very large. As a consequence, the eigenvalue curves λ_i(t) are very easy to follow. We shall describe our curve following algorithm in Chapter 2.

The search for fast reliable methods for handling symmetric eigenproblems has produced a number of methods, most notably the QR algorithm, the bisection Sturm sequence method with inverse iteration [14, 15] and the 'Divide and Conquer' method [8, 9, 28]. In Chapter 3, we shall present our numerical results, along with comparisons with these methods.
It appears that our algorithm is strongly competitive in terms of speed, accuracy and orthogonality, and leads in speed in almost all the cases. Modern scientific computing is marked by the advent of vector and parallel com- puters and the search for algorithms that are to a large extent parallel in nature. A further advantage of our method is that it is to a larger degree parallel, in the sense that each eigenpath is followed independently of the others. This inherent na- ture of the homotopy method makes the parallel implementation much simpler than the other methods. In Chapter 3, we also show the performance of our parallel al- gorithm and an indirect comparison with both Divide and Conquer (D&C) [9] and Bisection/Multisections (B / M) [23], which are currently considered the only parallel algorithms available for symmetric eigenproblems. The very high efficiency of our method and its natural parallelism make the algorithm an excellent candidate for a variety of architectures. Theoretical aspects of the continuation approach to the eigenvalue problems have been studied in [5, 6, 7, 20]. A first attempt was made in [18] to make the method computationally efficient. Its parallel version appeared in [22]. Evidenced by the numerical results, our algorithms given here provide a considerable improvement over the algorithms in [18, 22]. 1 .2 Preliminary Analysis Let (A)1 denote the lower (n — 1) x (n — l) submatrix of A , (A)1 denote the upper (12 — 1) x (n — l) submatrix of A. By a straightforward verification, one can prove the following lemma. Lemma 1.1 For matrices M and N and a real number 0, r . \ Ml det _ ._... ._ .. _ __ _ =detM-detN—c3det(M)1~det(N)‘. (1.5) Let M be a k x k unreduced symmetric tridiagonal matrix and let { £1 < {2 < < £1. } and { m < 172 < < 771,-1 } be the eigenvalues of M and (M )1 respectively. Then, by Cauchy’s interlacing theorem [24], £1 < 111 < 62 < < 1]];_1 < fig. (1.6) For D1 and D2 in (1.2), let f1 = det(D1 —M), f2 = dct(D2—AI), f3 = det((D1)1 — «\I) and f4 = det((D2)1 — AI). Let (:r(t),A(t)) be an eigenpath of the homotopy H(z,A,t) = 0 in (1.3), then for each 0 S t S 1, A(t) is an eigenvalue of A(t) = (l - t)D + tA in (1.3), where D is of the form in (1.2). 5 Theorem 1.1 If all the eigenvalues of D are distinct then i) Either A(t) is constant for all t in [0,1] or strictly monotonic. ii) A(t)A(t) > o for t small, if in) ¢ 0. Proof: Since | \ D1 —— A(t) I I l tflm det(A(t)—A(t)I)=det — ————— ——— — —.—— ————— — =0, Wk“ 1 K I / from ( 1.5), we have f1(A(t))f3(A(t)) - tzflinfsflallfdiial) = 0- (1-7) If there exists a to in [0,1] for which f3(A(to))f4(A(to)) = 0 then either f3(A(to)) = 0 or f.(A(to)) = 0; say f3(A(to)) = 0. It follows from (1.6), f1(A(to)) 76 0. Hence, f2(A(to)) = 0 in ( 1.7); accordingly, * f1(*(10))f2(1(t0)) - t21913+:f:+(A(to))f4(A(to)) E 0. This implies det(A(t) — A(to)I) = 0. Thus, A(t) = A(to) for all t in [0,1]. Assume f3(A(t))f4(A(t)) 96 0 for any t in [0,1]. Write A(t) = 53A“). Differentiat- ing (1.7) with respect to t yields, dl-A-lf1(/\(t))fg(A(t)) — #193,,f3(A(t))f4(A(t))]A(t) = 2tflt+1fa(«\(t))f4(k(t)) (1-8) (Iii-[fl“(0)506”‘t2fi2+1f3(3(t))f40(t))l '7‘" 0 for any t E (0t1]° We claim that gqxlfn(l\(t))f2(k(t)) - t’fit+1fs(A(t))f4(A(t))] H, 7‘ 0- (1-9) 6 For otherwise, ginsenfzwor twaifaoannmn)![:11 vastness)» .._.o d d =3; we» t=012(x(0))+f1(x(0))5 sou» I,=o=0. Since A(O) is an eigenvalue of D, we have f1(A(0))fg(A(0)) = 0. If f1(A(0)) = 0 then fg(A(0)) at 0 since all eigenvalues of D are distinct. Hence, fi- f1(A(t))|¢=o = 0. 
Consequently, A(0) is a multiple eigenvalue of D1 which contradicts to the fact that D1 is unreduced. The proof of (1.9) for the case f2(A(0)) = 0 follows by the same argument. Now, from (1.8), 2tflt+1fa(k(t))f4(»\(t)) ailmenfaou» — t’fl2+1fa(A(t))f.(A(t))] A(t) = which implies A(t) 75 0 for any t 6 (0,1]. Obviously, A(t) is continuous, hence, A(t) is strictly positive or strictly negative. Namely, A(t) is strictly monotonic. NOW let 9 = fifs , f = fli+1f3f4 811d ’1 = (9 — t’f); = Mtthlla then A = ”N" 1(1) = —41’f2 ‘9 713”)” + 8t’£;;f- + 3,? ' (1.10) Since f’(g — t’f),\;,/h3 and f; f /h2 are continuous in [0,1], they are bounded. By (1.10), A(O) =]1_r.nA(t) =]i_rr.12f/h Hence A(t)A(t) > 0 for t small since A(t) = 2t f /h. Q.E.D. Theorem 1.2 (Hoffman and Wielandt [31]) Let M be an n x n symmetric matrix. Let M’ E M + E where E is a symmetric perturbation of M. Denote the eigenvalues ofM by { £1 5 {3 S S 5,. }, the eigenvalues ofM’ by { £1 5 f; S S 6:, }, and the eigenvalues ofE by { 71 _<_ 72 S S 7“}, then 81.42)“ s 27.?- (1.11) i=1 i=1 Applying the above theorem on M = A and M’ = A(t) = (l — t)D + tA = A + E with f | l 0 l I (t "' llflkn E: _ ___ ___ _ ___ ___. _ (t — ”film I l 0 \ I i then, (1.11) gives in. _ 1.11))2 5 2(1 — 0219;“ for all t in [0,11 (1.12) i=1 where A.- and A,-(t) are the eigenvalues of A and A(t) respectively. From (1.12), together with the conclusion in Theorem 1.1 that each A,-(t) is mono— tonic in t, we see that the smaller 5H1 is the flatter the eigenvalue curves are, especially when n is very large. To make the eigenvalue curves easy to follow we in- tend to choose ,6)...“ as small as we can for k in certain range as described in Chapter 2. Let A(t) = D + Y(t), where A(t), D and Y(t) are real symmetric, and A,-(t), £5 and 11,-(t) be the eigenvalues of A(t), D and Y(t), respectively, with A;(t) S A5.” (t), £i S {ii-1 and Hi“) S Ili+1(t):5= 1:2v-wn " 1 Theorem 1.3 For any i, j satisfying 1 S i + j — 1 S n, and t 6 [0,1], the following inequalities hold: f.“ + MU) S I‘m-10) (1.13) and Ann-64“) S £n+1~i + [Inn-1' (t) (1-14) Proof: Let RB", RH, and Billy R5"l be the subspaces of Ru defined implicitly by sTDs . sTDs E min Mtg-1 1:73 C.- = max min T RP“ 8.1.1?!“ 8 8 8 TY t TY t 11,-(t) = max min 8 T( )8 5 min —-—-—----—s T( )3. RJ-laIRJ-l s s “my-é) s s In fact, Rf," is the span of the eigenvectors corresponding to the i — l smallest eigenvalues of D. Similarly Rial) is the span of the eigenvectors corresponding to the j —l smallest eigenvalues of Y(t). These subspaces may or may not have a nontrivial intersection. Let S be the subspace of the smallest dimension containing both R‘B‘ and Iii-(,1). Write k a dim(S) + 1. Now, k—l:dim(S')S(i—l)+(j—l) __ A,((A)1 + cal) for any i, 1 S i S n —1 i.e., p;—6,~an>0, lSiSn—l. By (1.15) and (1.16), ’\1S6lS/‘ISA2S”'S6n—IS/‘n-lSAno Hence Ai-t-l—AiZI‘i’aizcaa 1SiSn—l and lsInsin (A,-+1 - A ,-)_ > clsmin“l (01,-...1— 0;). Q.E.D. Corollary 1.3 If a] 02 (A): - = (A)1 '- an-1 an then l.<‘r.11 lsr.r1in__1(¢:r.'+1— or). (1.17) 12 Proof: (1.17) follows immediately from Lemma 1.2, since a; — (11 (All — (Ah - = 0. an - Ora—1 Q.E.D. Let A(t) = (1 - t)D + tA, where D is a diagonal matrix consisting of the diagonal elements of A, then Theorem 1.4 15113340144 (t) - A;(t)) _>_ clsrrgi,p_l(a,-+1 — 01') t 6 [0,1]. Proof: (AW — (A(t)). — a1 = «(Ar — (A11 — a1) _ (1 ,8) +(1-t)d¢'ag(az - a1 — 0,03 - a2 — 01"‘1011 — 0.1-1 — a), where a = 615%i3—1(a’+1 - 01,-), 0 < c S 1. 
Clearly, the second term of the right hand side of (1.18) is positive semidefinite and the first term is positive semidefinite by assumption. Hence, (A(t))_1 − (A(t))^1 − αI is positive semidefinite for t ∈ [0,1]. By Lemma 1.2,

$$
\min_{1 \le i \le n-1} \bigl(\lambda_{i+1}(t) - \lambda_i(t)\bigr) \ge c \min_{1 \le i \le n-1} (\alpha_{i+1} - \alpha_i), \qquad t \in [0,1].
$$

Q.E.D.

If A satisfies the conditions in Lemma 1.2, we may choose the initial matrix D as a diagonal matrix consisting of the diagonal elements of A. Then A(t) is an unreduced symmetric tridiagonal matrix and the eigenvalue curves are not only distinct, but also very well separated. There is a lower bound on the gap between any two eigenvalue curves, so the eigenvalue curves are easy to follow.

Example 1.1 A = [1, i, 1], where i = 1, 2, ..., 20. If we let D = diag{1, 2, ..., 20}, then all the eigenvalue curves are very well separated. See Figure 1.2.

Figure 1.2: The eigenvalue curves of the [1, i, 1] matrix with D = [0, i, 0].

Figure 1.3: The eigenvalue curves of a random matrix with α_i < α_{i+1} on the diagonal and with D = [0, α_i, 0].

Example 1.2 A is a real symmetric tridiagonal matrix whose diagonal and subdiagonal elements are random numbers between 0 and 1, and whose diagonal elements satisfy α_i < α_{i+1}, and D is a diagonal matrix consisting of the diagonal elements of A; then all the eigenvalue curves are very well separated. See Figure 1.3.

Chapter 2

The Homotopy Algorithms for Real Symmetric Tridiagonal Eigenproblem

2.1 Introduction

The basic features of the curve-following scheme of our algorithm to follow the eigenpath (x(t), λ(t)) are:

(i) Initiating at t = 0
(ii) Prediction
(iii) Correction
(iv) Checking
(v) Detection of a cluster and subspace iteration
(vi) Step-size selection
(vii) Terminating at t = 1.

We begin by giving an overview of our algorithm, followed by a detailed explanation of these features.

Simple computation shows that Newton's method for the nonlinear problem of n + 1 equations

$$
F(\lambda, x) = \begin{pmatrix} Ax - \lambda x \\ \frac{x^T x - 1}{2} \end{pmatrix} = 0 \tag{2.1}
$$

in the n + 1 variables λ, x_1, x_2, ..., x_n at (λ^{(ν)}, x^{(ν)}) is the inverse iteration

$$
(A - \lambda^{(\nu)} I)\, y = x^{(\nu)}
$$

and

$$
\lambda^{(\nu+1)} = \lambda^{(\nu)} + \frac{(x^{(\nu)})^T x^{(\nu)} + 1}{2\, (x^{(\nu)})^T y}, \qquad
x^{(\nu+1)} = (\lambda^{(\nu+1)} - \lambda^{(\nu)})\, y.
$$

By making the initial matrix D close to A, the eigenpairs of D should be excellent starting points for applying Newton's method on the eigenproblem (2.1). Based on this observation, our algorithm, in simple terms, is designed to use the homotopy continuation method as a backup of Newton's method applied to (2.1). Namely, we first solve for the eigenvalues of the initial matrix D by using the most efficient method available, and apply the inverse iteration on (2.1), using each eigenvalue of D as a shift. Then we switch to Rayleigh quotient iteration (RQI), an inverse iteration with the Rayleigh quotient as the shift, to speed up the convergence. This is mainly equivalent to choosing the starting step size h = 1 in the usual curve following scheme with zero order prediction and Newton correction to follow the eigenpath (x(t), λ(t)) of the homotopy H(x, λ, t) = 0 in (1.3). We check the Sturm sequence at the convergent point, and if this procedure fails to provide the right eigenpair, we shall cut the step size in half.
That is, we repeat the process by applying the inverse iteration on

$$
\lambda x - A(t)x = 0, \qquad \frac{x^T x - 1}{2} = 0
$$

with t = 0.5, where A(t) = (1 − t)D + tA as in (1.3), and then switch to RQI to come back to the right eigenpath (x(t), λ(t)). Assuming that after i steps the approximate value (x(t_i), λ(t_i)) is known, we always choose the step size h = 1 − t_i at (x(t_i), λ(t_i)). In this way, we follow the eigenpath from t = 0 to t = 1.

2.2 Initiating at t = 0

As mentioned in Chapter 1, we intend to choose k for which β_{k+1} is as small as possible. To make the sizes of the blocks D_1 and D_2 roughly the same, we limit the choice of k to the range n/2 − j ≤ k ≤ n/2 + j, where j is roughly equal to n/20, and find the smallest β_{k+1} by local sorting. Once the initial matrix D is decided, different from the homotopy algorithms in [18, 22], where all the eigenvalues and the eigenvectors of D are calculated in order to start following the eigenpaths, our algorithm only calculates the eigenvalues of D_1 and D_2. These eigenvalues are obtained by using the most efficient method available. We require the accuracy to stay only within one-half or even one-third of the working precision. With this strategy, a considerable amount of computing time is saved.

2.3 Prediction

Assume that after i steps the approximate value (x̄(t_i), λ̄(t_i)) on the eigenpath (x(t), λ(t)) at t_i is known and the next step size h is determined; that is, t_{i+1} = t_i + h. We want to find an approximate value (x̄(t_{i+1}), λ̄(t_{i+1})) of (x(t_{i+1}), λ(t_{i+1})) on the eigenpath at t_{i+1}. Notice that (x̄(t_{i+1}), λ̄(t_{i+1})) is an approximate eigenpair of A(t_{i+1}). Since H(x(t), λ(t), t) = 0, we have

$$
A(t)x(t) = \lambda(t)x(t), \qquad x(t)^T x(t) = 1.
$$

Differentiating both equations with respect to t yields

$$
(A - D)x(t) + A(t)\dot{x}(t) = \dot{\lambda}(t)x(t) + \lambda(t)\dot{x}(t), \qquad x(t)^T \dot{x}(t) = 0. \tag{2.2}
$$

For t = t_i, multiplying (2.2) on the left by x^T(t_i) yields

$$
\dot{\lambda}(t_i) = x^T(t_i)(A - D)x(t_i) = 2\beta_{k+1}\, x_k(t_i)\, x_{k+1}(t_i) \tag{2.3}
$$

where x(t_i) = (x_1(t_i), ..., x_n(t_i))^T. In view of (2.3), we use the Euler predictor to predict the eigenvalue at t_{i+1}, namely,

$$
\lambda^0(t_{i+1}) = \lambda(t_i) + \dot{\lambda}(t_i)\, h.
$$

It is easy to see that λ̇(0) = 0 in (2.3). Consequently, λ^0(t_1) always equals λ(0). To predict the eigenvector, we use the inverse power method on A(t_{i+1}) with shift λ^0(t_{i+1}). That is, we solve

$$
\bigl(A(t_{i+1}) - \lambda^0(t_{i+1}) I\bigr)\, y^0(t_{i+1}) = x(t_i)
$$

and let

$$
x^0(t_{i+1}) = \frac{y^0(t_{i+1})}{\|y^0(t_{i+1})\|}.
$$

At t_i = 0, since we skip the calculation of the eigenvectors of D, x(0) is not available. We choose a random vector for x(0).

2.4 Correction

As a corrector, we use the standard RQI, starting with x^0(t_{i+1}). To be more precise, at (x^{j-1}(t_{i+1}), λ^{j-1}(t_{i+1})), j ≥ 1, let

$$
\lambda^j(t_{i+1}) = x^{j-1}(t_{i+1})^T A(t_{i+1})\, x^{j-1}(t_{i+1}),
$$

then solve

$$
\bigl(A(t_{i+1}) - \lambda^j(t_{i+1}) I\bigr)\, y^j(t_{i+1}) = x^{j-1}(t_{i+1})
$$

and let

$$
x^j(t_{i+1}) = \frac{y^j(t_{i+1})}{\|y^j(t_{i+1})\|}.
$$

We repeat the above process to within half of the working precision if single precision is used and one-third of the working precision if double precision is used when t_{i+1} < 1, since precision in determining the curve itself is only of secondary interest. We polish (x^j(t_{i+1}), λ^j(t_{i+1})) at the end of the path (t_{i+1} = 1) by iterating the Rayleigh quotient to machine precision. The stopping point (x^j(t_{i+1}), λ^j(t_{i+1})) of RQI will be taken as an approximate eigenpair (x̄(t_{i+1}), λ̄(t_{i+1})) of A(t_{i+1}). The cubic convergence rate of RQI makes the corrector highly efficient.
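To make the prediction and correction steps of Sections 2.3 and 2.4 concrete, here is a minimal sketch in Python with NumPy. It is illustrative only and not the thesis implementation: dense solves stand in for tridiagonal ones, the precision-dependent stopping rules are reduced to a single tolerance, and the function names are ours.

```python
import numpy as np

def euler_predict(lam, x, beta_k1, k, h):
    """Euler predictor of Section 2.3: lambda_dot(t_i) = 2 beta_{k+1} x_k x_{k+1}."""
    lam_dot = 2.0 * beta_k1 * x[k - 1] * x[k]        # 0-based: x[k-1] = x_k, x[k] = x_{k+1}
    return lam + h * lam_dot

def inverse_power_step(A_t, shift, x_prev):
    """Eigenvector prediction: one inverse power step with the predicted shift."""
    y = np.linalg.solve(A_t - shift * np.eye(A_t.shape[0]), x_prev)
    return y / np.linalg.norm(y)

def rqi(A_t, x0, tol=1e-10, max_iter=20):
    """Rayleigh quotient corrector of Section 2.4 on A(t) = (1 - t) D + t A."""
    x = x0 / np.linalg.norm(x0)
    lam = x @ A_t @ x
    for _ in range(max_iter):
        lam = x @ A_t @ x                            # Rayleigh quotient shift
        try:
            y = np.linalg.solve(A_t - lam * np.eye(len(x)), x)
        except np.linalg.LinAlgError:
            break                                    # shift is (numerically) an eigenvalue
        x = y / np.linalg.norm(y)
        if np.linalg.norm(A_t @ x - lam * x) <= tol * np.linalg.norm(A_t, np.inf):
            break
    return x, lam
```

In the algorithm proper, the first solve uses the Euler-predicted shift λ^0(t_{i+1}) (the inverse power step above) and, for t_{i+1} < 1, the RQI loop is stopped at the reduced precision described in Section 2.4 rather than at machine precision.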
2.5 Checking

When (x̄(t_{i+1}), λ̄(t_{i+1})) is taken as an approximate eigenpair of A(t_{i+1}), the Sturm sequence at λ̄(t_{i+1}) + ε is computed to check that, if we are trying to follow the curve of the i-th eigenvalue, we are still on that curve. Here, ε is chosen as half of the working precision if single precision is used and one-third of the working precision if double precision is used. If the check fails, we reduce the step size to h/2 and repeat the whole process once again, beginning with the eigenvalue prediction in Section 2.3.

2.6 Detection of a cluster and subspace iteration

At t_i = 0, when all the eigenvalues of D, λ_1(0) < λ_2(0) < ... < λ_n(0), are available, we let δ = max(10^{-5}, 10^{-2}(λ_n(0) − λ_1(0))/n) if double precision is used (or δ = max(10^{-3}, 10^{-2}(λ_n(0) − λ_1(0))/n) if single precision is used). Set λ_i and λ_j in the same group if |λ_i(0) − λ_j(0)| < δ. If the number of the eigenvalues in any group is bigger than 1, then a cluster is detected.

At t_i ≠ 0 or 1, when (x̄(t_i), λ̄(t_i)) is taken as an approximate eigenpair of A(t_i), after the checking step in Section 2.5 we compute the Sturm sequences at λ̄(t_i) ± δ for the purpose of finding the number of eigenvalues of A(t_i) in the interval (λ̄(t_i) − δ, λ̄(t_i) + δ). When this number is bigger than 1, a cluster of eigenvalues of A(t_i) is detected.

In those cases, the corresponding eigenvectors are ill-conditioned and this ill-conditioning can cause inefficiency in the algorithm. To remedy this problem, the inverse power method [24, 31] with λ̄(t_i) as shift is used to construct an approximation of the corresponding eigenspace S of dimension m (= the number of the eigenvalues in the cluster) of A(t_i). This approximate eigenspace S is used as an initial subspace of the subspace iterations at t_{i+1} when we approximate the eigenpairs of A(t_{i+1}).

2.7 Step size selection

In the first attempt, we always choose the step size h = 1 − t_i at t_i < 1. If, after the prediction and the correction steps, the checking step fails, we reduce the step size to h/2, as mentioned in Section 2.5. This extremely liberal choice of step size can be justified because of the observations we described in the beginning of this chapter (Section 2.5) as well as the effective checking algorithm. Indeed, since the initial matrix D is chosen to be so close to A, in our experience most of the eigenpairs of A can be reached in one step, i.e., h = 1.

A very small step size can also cause inefficiency in the algorithm. Therefore, we impose a minimum γ on the step size h. If h < γ, we simply give up following the eigenpath and the corresponding eigenpair of A will be calculated at the end of the algorithm by the method of bisection with inverse iterations (see Section 2.8). We usually choose γ = 0.25.

2.8 Terminating at t = 1

At t = 1, when an approximate eigenvalue λ̄(1) is reached, we compute the Sturm sequence at λ̄(1) + ε with ε = machine precision to ensure the correct order. If the checking fails, we have jumped onto a wrong eigenpath. More precisely, suppose we are following the i-th eigenpair and the checking algorithm detects that we have reached the j-th eigenpair instead. In this situation, we save the j-th eigenpair before the step size is cut. By saving the j-th eigenpair, the computation of following the j-th eigenpair is no longer needed. As mentioned in Section 2.7, we may give up following some eigenpaths to avoid adopting a step size that is too small.
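The checks in Sections 2.5, 2.6 and 2.8 all reduce to one primitive: counting, with a Sturm sequence, how many eigenvalues of the symmetric tridiagonal matrix A(t) lie below a given shift. The following is a small illustrative sketch (Python/NumPy, not the thesis code); eigenvalues are indexed in increasing order here, and the zero-pivot guard is a crude stand-in for a proper pivot threshold.

```python
import numpy as np

def sturm_count(alpha, beta, sigma):
    """Number of eigenvalues of the symmetric tridiagonal matrix with diagonal
    alpha[0..n-1] and off-diagonal beta[0..n-2] that are less than sigma
    (= number of negative pivots in the LDL^T factorization of T - sigma*I)."""
    d = alpha[0] - sigma
    count = 1 if d < 0 else 0
    for i in range(1, len(alpha)):
        if d == 0.0:
            d = -np.finfo(float).eps                 # crude guard against a zero pivot
        d = (alpha[i] - sigma) - beta[i - 1] ** 2 / d
        if d < 0:
            count += 1
    return count

def still_on_curve(alpha, beta, lam, i, eps):
    """Checking step (Section 2.5): if lam is the i-th smallest eigenvalue,
    then exactly i eigenvalues of A(t) should lie below lam + eps."""
    return sturm_count(alpha, beta, lam + eps) == i
```

The same count, evaluated at λ̄ ± δ, gives the cluster size of Section 2.6, and bisection on it supplies the fallback eigenvalues mentioned in Sections 2.7 and 2.8.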
Without any extra computation, we know exactly which eigenpairs are lost at t = 1. In order to find these eigenpairs, we first use the bisection to find the eigenvalues up to the half working precision and then use the inverse iteration and the RQI or subspace iteration (if there are some clusters) to find the eigenpairs. 22 Chapter 3 Numerical Results of the Homotopy Algorithms for Real Symmetric Tridiagonal Eigenproblem 3.1 Test Matrices The homotopy continuation algorithm is in its preliminary stage, and much de- velopment and testing are necessary. But the numerical results on the examples we have looked at seem remarkable. Our testing matrices are: (1) The Toeplitz matrix [1, 2, l], i.e., all its diagonal elements are 2 and off- diagonal elements are 1. (2) The random matrix with both diagonal and off-diagonal elements being uni- formly distributed random numbers between 0 and 1. (3) The Wilkinson matrix W,;". i.e., the matrix [l,d;,1], where d,- = abs((n + l)/2 — i), i = 1,2, ...,n with n odd. (4) The matrix [1, 11,-, 1], where p.- = i x 10". (5) The matrix T; : same as matrix [2, 8, 2] except the first diagonal element 23 a] = 4. (6) The glued Wilkinson matrix W}? The matrix consists of j copies of Wilkinson matrices W: along the diagonal and the off-diagonal elements 19.31,.” = 10’“, where i=1,...,j-l. (7) The LAPACK test matrices which include 21 type matrices. 3.2 A serial comparison with the existing meth- ods. For symmetric tridiagonal matrices, the routine TQL2 in EISPACK [27] im- plements explicit QL-iteration to find all the eigenpairs. EISPACK also includes a Sturm sequence with inverse iteration method ( BISECT+ TINIVT) which is much faster than TQL2. However, it may fail to provide more accurate eigenvectors when the corresponding eigenvalues are very close. A new version by Jessup [15] (B / III) has considerably improved the reliability and the accuracy of the inverse iteration. The ‘Divide and Conquer’ method for symmetric tridiagonal matrix was suggested by Cuppen [8] and was implemented, combining with a deflation and a robust root finding technique, by Dongarra and Sorenson [9] (TREEQL) (see also [28]). We shall show the computational results comparing the homotopy continuation method HOMO with those obtained by the methods TQL2, B/III and TREEQL mentioned above. The computations were done on a Sun SPARC station 1. Table 3.1, 3.2 and 3.3, show the comparisons on the first 6 type test matrices listed in Section 3.1 in terms of speed, accuracy and orthogonality respectively. The homotopy method appears to be strongly competitive in every category and leads in speed by a considerable margin in comparison with all other methods in most of the cases. Tables 3.4 to 3.9 show the comparisons with (B/III) i.e., DSTEBZ+ DSTEIN, which is the latest code based on bisection with inverse iteration, in terms of speed, accuracy and orthogonality respectively on the LAPACK test matrices. The LAPACK test matrices of type 1 to type 7 are diagonal matrices. 
Table 3.4 to 3.6 show that 24 Matrix Order Execution Time (second) N HOMO B/III TREEQL TQL2 64 2.03 2.43000 5.90999 17.0600 [1,2,1] 125 5.88 8.58000 37.6199 114.460 256 30.25 35.7000 302.859 904.659 499 100.04 152.920 984.460 2416.14 64 1.13 2.44000 6.09000 17.3200 Random 125 3.79 8.53000 31.4100 115.880 256 14.90 34.4500 158.120 949.939 499 53.19 133.620 235.889 2482.68 65 0.97 1.84000 2.73999 16.1000 w: 125 3.97 6.21000 8.20001 108.890 255‘ 16.40 22.5700 31.8398 879.959 499 57.35 95.5000 57.8300 3869.33 64 1.97 2.43000 5.91000 17.1500 [1,,1.,1] 125 6.78 8.64999 37.6500 115.699 256 30.61 34.1900 303.449 901.810 499 107.04 129.370 984.410 2424.88 64 1.80 2.39000 5.80000 16.6800 T2 125 6.85 8.60000 37.3700 115.000 256 28.24 34.8600 165.250 939.859 499 108.46 174.760 979.738 2506.93 64 1.80 1.73000 2.07999 12.2000 w; 128 8.69 6.07000 6.90000 67.5600 256_ 38.85 22.5100 29.0700 490.779 512 144.05 89.2200 162.040 5303.37 Table 3.1: Execution Time (second) of computed eigenvalues and eigenvectors. 25 Matrix Order [ max; llAza ’, Aatallz/Amu | TQL2 [ N HOMO B/III I TREEQL 64 1.9107015 4.9769016 1.9027015 9.8365015 [1,2,1] 125 1.9519015 8.4237016 3.3617015 1.0949014 256 2.3533015 1.2825015 6.0929015 2.7538014 499 2.2251015 1.7640015 7.5172015 3.7564014 64 1.9319016 3.6661016 2.4581014 6.0589015 Random 125 2.0485016 3.5621016 5.6541014 1.2686014 256 3.9846015 3.6238016 7.1488014 5.3928014 499 1.0330013 9.2496015 4.8200014 5.7588014 65 5.4172016 5.4991015 1.8801013 8.7914014 w: 125 3.3967016 7.1627015 1.0842012 3.4443013 255 7.7880016 1.4301014 1.0868012 6.7159013 499 9.1037016 2.8363014 1.6269011 6.5775012 64 4.9149015 5.0881016 2.4436015 8.8198015 [1,11.,1] 125 3.2254015 7.0876016 3.8144015 1.1196014 256 4.4093015 1.1424015 5.4271015 2.7578014 499 4.9410015 1.7580015 8.4456015 3.7155014 64 7.1616016 1.7737015 5.5986015 1.5815014 T2 125 7.3060016 1.8509015 6.5573015 2.6714014 256 8.1746016 2.2977015 1.3413014 5.0148014 499 8.0805016 3.7362015 1.7603014 8.1773014 64 8.1580016 3.2592015 4.3889014 3.1906014 w; 128 1.2509015 1.3014014 4.3246013 1.1476013 256 2.1321015 1.4015014 3.8488012 6.6732013 512 4.2327015 2.3889014 3.8490012 1.7667012 Table 3.2: The residual of computed eigenvalues and eigenvectors. 26 Matrix Order max-JKXTX —I);J|/Am., N HOMO B/III PTREEQL TQL2 64 1.7133015 9.4041015 1.4836015 5.9952015 [1,2,1] 125 1.3399014 3.5176014 3.6237015 7.5495015 256 1.1554014 2.1010014 1.4941014 1.5987014 499 5.3994014 4.1999014 1.9569014 2.4868014 64 2.3654015 2.6645015 1.0890014 7.3274015 Random 125 8.5588014 5.5839015 4.1540014 1.1324014 256 4.0178013 3.9968015 1.9257013 3.6415014 499 1.0330013 9.2496015 4.8200014 4.2410014 65 4.7465017 2.2204015 1.1102015 7.5495015 w: 125 1.7693017 3.2163015 1.2585015 1.2656014 255 1.2167017 4.3258015 "4.4408015 2.5091014 499 9.3353018 1.2278013 8.2156015 6.9722014 64 3.3790015 7.6699015 1.0852015 4.6629015 [1,15, 1] 125 1.2375014 4.3372014 1.2751014 6.6613015 256 3.9662014 1.6639013 8.4026015 1.3322014 499 1.8082013 1.3981013 6.9169014 2.4424014 64 5.7249016 7.9102015 9.3588016 5.9952015 T2 125 1.8625015 1.3832014 3.7492015 1.3322014 256 3.2107014 1.9904014 1.0485014 2.1760014 499 6.6773014 1.5714014 1.9133014 4.9293014 64 7.9556017 3.5527015 6.6613016 7.9936015 w; 128 6.2610016 1.4708013 1.1366015 9.1038015 256 7.0303017 1.0394013 4.4408015 2.3758014 512 8.4508017 1.4259013 1.1324014 2.6645014 Table 3.3: The orthonormality of computed eigenvectors 27 both algorithms work very well. 
The matrices of type 9, l7 and 21 have a large cluster with dimension around 4n/5, where n is the order of matrices. On these matrices, while HOMO is not as fast as B/III its accuracy is still competitive. The matrices of type 10 and 18 have a even larger cluster with dimension n - 1. Although HOMO works, time consuming is out of comparison. Tables 3.4 to 3.9 clearly show that the homotopy method leads in speed by a considerable margin in comparison with B / III on all other types of the LAPACK test matrices. 3.3 An indirect comparison with the existing par- allel algorithms Scientific and engineering research has become increasingly dependent upon the de- velopment and implementation of efficient parallel algorithms on modern high - performance computers. Developing algorithms for advanced computers suitable for eigenvalue problems has produced several algorithms, such as Divide and Conquer (D&C)[9] and Bisection/Multisection (B/M)[23] for symmetric tridiagonal matrices. The homotopy algorithm is to a large degree parallel since each eigenpath can be fOllowed independently. This inherent nature of the homotopy method makes the parallel implementation much simpler than other methods. In our parallel algorithm, after all the eigenvalues of D are computed and put in the increasing order, we assign each processor to trace roughly 11 / p eigencurves, where n is the dimension of matrix A and p is the number of the processors being used. Let the first processor trace the first n/ P smallest eigencurves from the smallest to the largest and let the second processor trace the second n / p smallest eigencurves, and so on. We present, in this section, the numerical results of the parallel implementation of our algorithm. All examples were executed on BUTTERFLY GP 1000, a shared memory multiprocessor machine. 28 Table 3.4: Execution Time (second) of computed eigenvalues and eigenvectors from type 1 to 7. Matrix Order Execution The (second) 7". 17 110140 31111 11.41.7(B/IIDIHOMO 10 0.00 0.00 . 10.111: 02 0.00 0.00000000 Type 04 2.0000000: 2.0000000: 1 120 7.0007000: 1.0000000: :50 0.020000 0.000017 500 1.17000 1.40000 1.10 10 0.00 1.0000000: 100411: 0: 1.0000200: 0.00000000 7". 04 2.0000000: 2.0000000: 2 120 0.0001000: 0.0000000: :00 0.010000 0.000000 000 1.10000 1.02000 1.12 10 0.00 0.00000000 14.111. 0: 0.00 0.00000000 '1‘". 04 2.0000000: 0.0002000: 0 120 0.0002100: 0.0011000: :50 0.000007 0.040000 000 1.17000 1.00007 1.14 10 0.00 0.00 14.411: 0: 1.0000200: 0.00 r". 04 2.0000000: 2.0000000: 4 120 0.0002100: 0.0000000: :50 0.020000 0.000000 1100 1.17000 1.04000 1.14 10 0.00 0.00 11.411: 0: 0.00 1.0000200: r". 04 2.0000500: 2.0000000: 0 120 7.0000000: 0.0000000: :00 0.000007 0.040000 000 1.10000 1.00000 1.10 10 0.00 0.00 00.411: 0: 1.0000200: 0.00070000 Type 04 2.0000500: 2.0000500: 0 1:0 0.0070000: 0.0000000: :50 0.020000 0.000000 000 1.1000: 1.02007 1.12 10 0.00 0.00000000 10541-1: 02 0.00020000 1.0000200: Type 04 2.0000000: 2.0000000: 7 120 0.0070000: 0.0000000: :00 0.020000 0.000000 1100 1.10000 1.04000 1.14 29 Mom: Otdu mu.- IIM.‘ - Airing/8m I Duij KXTX - ”Lil/x770“ I 32 0.0 0.0 0.0 0.0 Type 04 0.0 0.0 0.0 0.0 3 130 0 0 0.0 0 0 0 0 250 0.0 0.0 0.0 0.0 500 0.0 0.0 0.0 0.0 10 0.0 0.0 0.0 0.0 32 0.0 0.0 0.0 0.0 Type M 0.0 0.0 0.0 0.0 3 120 0 0 0.0 0 0 0 0 250 0.0 0.0 0.0 0.0 000 0.0 0.0 0.0 0.0 10 0.0 0.0 0.0 0.0 33 0.0 0.0 0.0 0.0 Type 04 0.0 0.0 0.0 0.0 d 12. 
0 0 0.0 0 0 0 0 L 250 0 0 0.0 0 0 0 0 500 0.0 0.0 0.0 0.0 )0 0.0 0.0 0.0 0.0 32 0.0 0.0 0.0 0.0 Type 04 0.0 0.0 0.0 0.0 5 130 0 0 0.0 0 0 0 0 350 0.0 0.0 0.0 0.0 000 0.0 0.0 0.0 0.0 10 0.0 0.0 0.0 0.0 32 0.0 0.0 0.0 0.0 Type 04 0.0 0.0 0.0 0.0 0 120 0.0 0.0 0.0 0.0 3“ 0.0 0.0 0.0 0.0 500 0.0 0.0 0.0 0.0 10 0.0 0.0 0.0 0.0 32 0.0 0.0 0.0 0.0 Typo 04 0.0 0.0 0.0 0.0 7 120 0.0 0.0 0.0 0.0 3“ 0.0 0.0 0.0 0.0 m 0.0 0.0 0.0 0.0 Table 3.5: The residual and the orthonormality of computed eigenvalues and eigen- vectors from type 1 to 7. 30 Table 3.6: Execution Time (second) of computed eigenvalues and eigenvectors from type 8 to 14. 1100 1'18 00400 380000100 Time (0000“) Type 11 nouo 31111 11.41.( 10 0.00M0‘02 1 .000003-01 . 1400112 32 0.700000 1 .01000 Type 44 2.47000 4.04000 4 120 11.3300 15.3000 254 33.3000 40.1001 500 142.500 224.000 1 .50 10 0.1 1000 1 .000003-01 ll“ fix 32 1.24000 0.03000 Type 04 10.0100 4.04000 0 120 1 10.000 15.3000 250 037.030 154.510 500 7334.24 000.000 0. 14 10 ‘ 0.000003-02 “51718 32 ‘ 0.50000 1'". 04 ' 0.01000 10 120 " 24.2200 250 ‘ 171.450 500 ‘ 1 103. 1 1 ’ 10 0.0000000: 0.10000 11.1112 02 0.7100'00 1.00000 Type 44 2.44000 3.44000 1 1 124 11 .1400 24.2000 254 40.3500 57.00“ 500 174.370 217.410 1.25 10 1 .00000001 0.1 1000 1400010 32 0.530001 1.07000 Type 04 0.22000 4.17000 12 120 12.2300 14.7000 250 40.0300 00.00“ 500 150.332 220. 100 1 .44 10 0. 1 10000 0.13000 Hun-in 32 0.47000 0.04000 Type 44 2.37000 4.44000 13 120 0.05001 14.00” 250 33.0000 50.5000 5“ 157.010 223.441 1.42 10 5.000003-02 0.12000 1400010 32 0.520000 0.00000 Typo 04 2.40000 0.02000 14 124 4.70004 14.7700 250 41.3700 55.1000 5” 137.410 217.320 1.53 ' The dime-0100 cl 050 0050100000 10 3000000 than 410/5. 31 non-1x 00000 .1086 "As.- - 15:5", [1,... nos,- .1' [1(7): - 1),- .1' mm Typo 11 110140 a/m 110110 111m 10 1.110000." 1.401000.“ 2.0001 10.10 4.4400011.” 02 2.1000011.“ 2.0005011.” 1.0010011.“ 1.0000011." Type 04 1.0000010.“ 2.0000111.“ 0.0000011.” 2.000720.“ 0 120 0.4110211.“ 2.0440013.“ 0.0400011.“ 4.5000013.» 200 0.0100011." 0.0-000013.10 1.0050211.“ 1.0211110.“ soo 0.000740.“ 0.0100411." 2.0200413.“ 2.0010010.“ ‘ 10 0.1120211." 1.0070013." 0.001 10413.10 0.0000013." 02 2.1100111.“ 1.000100.“ 0.020040.“ 4.440000.“ Typ0 04 0.1010111." 1.0071411." 0.101020.“ 4.040240.“ 0 120 1.0004011.“ 1.4002011.“ 0.1101511.“ 1.0110010." 200 1.1470011.“ 2.0002211." 1.1000413.“ 1.2110011.” 1100 1.0000010." 0.0-100013.10 1.7700011.“ 2.100410." 10 - 0.014400." 0.000000.” 02 - 0.5120413." — 5.1500004: 'rypo 04 - 1.001000." - 0.1210013.“ 10 120 - 0.0014011.“ - 0.224100.“ 200 - 0.014700.“ - 1.2002113.” 000 - 0.0001711." - 0.2200011." 10 1.0200113." 1.4420211.“ 1.004000.“ 1.401000." 02 1.0000011.“ 2.0112011." 1.7000211.“ 1.0000011." 'rypo 04 1.0070010.“ 2.0000011.“ 1.011000." 0.007000.” 11 120 0.0000011.“ 2.0002113.“ 2.0001013.“ 2.0124010.“ 200 0.01 17011.10 2.0011210.“ 4.0040111." 1.4404010.” :00 1.0402011." 0.4404411.“ 1.1000213. 14 0007171110 10 2.2000011." 1.0404013.“ 0.200020." 1.012000.“ 02 ”01000.10 2.0127013.“ 4.4100411.“ 0.4400013." 'rypo 04 0.0200011.“ 2.2000011.“ 1.000440.” 4.007000.“ 12 120 4.0000011.“ 2.0101411.“ 4.0200011.» 0.0020011.“ 200 4.001000.“ 2.0020011.“ 0.0000011.“ 0.0400013.“ 000 1.2400111.“ 0.000000.“ 1.400400.“ 0.007010.“ 10 1.040140.“ 0.122200." 1.171000.“ 1.100700." 02 0.1100011.“ 1.0002011.“ . ”202013.10 1.000000." Typo 04 1.010010.“ 2.0000711.“ 1.1000013." 1.0000011.“ 10 120 0.0014010.“ 2.0014013.“ 0.0402013.“ 4.0010211.“ 200 1.0071 115.10 2.2122213." 1.0700010." 1.0100010." 000 0.000020." 
2.4210411.» 1.2010011.” 0.404000." 10 0.4712011." 2.1021011.“ 1.407240." 2.000000." 02 1.0000011." 1.0014111.“ 0.420000." 1.404070.“ 'rypo 04 2.4001711." 1.010000.“ 1.200100." 0.1407011.» 14 100 4.2400011.“ 2.0100413.“ 0.0001413." 2.020400. 10 2110 2.0110011." 2.0000110.“ 0.4702011.” "10010.10 000 1.1101011.“ 2.0100010.“ 1.0142011.“ 2.0100013.“ Table 3.7: The residual and the orthonormality of computed eigenvalues and eigen- ‘ 1'00 0100000100 0! 100 0050100430 10 3000100 0000 401/5. vectors from type 8 to 14. 32 -' Table 3.8: Execution Time (second) of computed eigenvalues and eigenvectors from type 15 to 21. 0000110 00400 Ixmthl Til-0 (0000.1!) rm 17 110000 31111 100110 ((B/III) O 10 0.11000 0.10000 . 10.111: 02 0.00000 1.00000 2‘". 00 0.02000 0.07000 10 120 0.07000 10.0000 000 00.7100 00.0000 000 100.000 220.710 1.00 10 ”00070.02 0.12000 100011; 02 0.000001 1.00000 rm 00 2.00001 0.00000 10 120 0.00007 10.1000 200 02.0100 000001 000 100.000 227.001 1.01 10 0.000000.” 0.10000 00.1010 02 2.00000 0.070001 1”. 00 10.0700 0.00001 17 120 100.000 20.0000 200 000.000 102.210 000 0010.00 000.100 0.11 10 - 0000021002 100111: 02 0 0.000000 Tm 00 - 0.00001 10 120 - 20.2000 200 - 170.000 000 - 1100.00 - 10 0.100000 0.110000 00.111: 02 0.000000 1.02000 Type 00 2.00000 0.01001 10 120 0.71007 10.7000 200 00.7000 07.0000 000 107.070 210.000 1.00 10 ”00070.02 0.110000 10001111 02 0.000007 1.00000 Type 00 0.70000 0.00000 20 120 10.0000 10.0000 200 02.0002 00.2000 000 100.100 201.700 1.00 10 7.000000.02 7.00000502 1000110 02 2.07000 1.01000 Type 04 14.1000 0.00001 21 120 102.200 20.0200 200 1110.00 100.700 000 0702.00 002.001 0.12 ' Th0 dine-.000- 0! the 0050100000 10 3000007 050. 00,5. 33 140001: 07407 1010:.- "As.- -1,-s.-|bllm .uid' KXTX - ”MI/1m 1'”. N 110000 31111 30000 I 31m 10 ”10000.17 0.010771110— m 52 1352.20. 1. 1.545.20- 1. 1.4407404. 7.45.5.04. Type .4 5.4.0070- 1. 2..52..04. 1.5522504. 5......045 1. 12. 1.7551204. 1.22.2.0- 1. 5.7740204. 7.2....04. 2.. ...554004. 3.1552104. ..5.7240- 1. 4.4.4.5044 500 ..5577504. 5.54.5204. 2.745750- 14 2.0.2020. 14 10 ..2..51047 1.215.104. 53505.0. 1. 4.440.204. 52 7.35512047 2.012.104. ....1550- 1. 7.771550. 1. Typo .4 1.132.704. 2.2550204. 1.7211504. 1.0055704. 1. 12. 2.242.2047 2.2125504. 240277045 2.275550. 15 25. 4.575.30- 1. 1.525540. 1. 5.4172.04. 5.5445504. .00 1.145.504. 2.5722704. 1.70.15045 1.24725044 10 5.51.510- 17 2..1022047 1. 1 10220- 1. 5.5505504. 52 ..4.712047 1.575070- 1. 2.5555104. 4.440.20- 1. '1'pr .4 ...1.25047 2.1.52.047 4.215.404. 1.1055704. 17 12. 2.54522047 1.545200. 1. 7.2255004. 2.....00- 1. 35. ..45502047 1.5720204. 1.34544044 5.0025504. .00 2.552210. 1. 1.....50- 1. 2.0.721044 5.2050004. 10 ' 1.515250- 1. 7.45.5404. 52 ' 1.075520. 1. 1 .15.4.0- 15 Type .4 ‘ 0.3624004? 8.055000. 1. 1. 12. ‘ 1.22.2204. ‘ 1 ..0025045 2.. ‘ 4.4755204. ‘ 2.1.2.0045 500 ‘ 5.7144304. ‘ 1.5244404. 10 2.401.30- 1. 5.2.7.30. 1 7 ..472.30- 1. 4.440.20- 1. 53 7.425.40- 17 1.550.204. 4..5.1104. 7.1.”00- 1. Type 00 2.000070." 1.200000. 10 7.100000. 10 0.000000. 10 12 13. 5.1.1500- 1. 1..522204. 1.52.3045 1.144.504. 25. ..271270. 1. 2.12.7504. 5.5.5.2045 4.5754404. 5N 1.20.7504. 2.1277204. 7..5512045 1....75045 10 7.515.0047 1.020270- 1. 2.7522104. ...10550o 1. .2 5.7224704. 1.”.2104. ..4220.0- 1. 5.02.020. 15 Type .4 1.4.22.04. 1.2015404. 7.502.104. 5.1111704. 20 12. 2.5402.0- 1. 1.....10- 1. 1.5555104. 2.4.50.0- 15 2.. 4.4504504. 2.52.0404. 5.0225004. 4.2522504. .00 ..051520- 1. 2.3070404. 2.4.5.50- 1. 5.2555104. 10 5.1.5.2047 ......5047 1.1751204. 5.52.5.04. 
53 5.57254047 1.0077504. 2.4427404. ....2750- 1. Type .4 4.5552.047 1.0125404. 5.225.50- 1. ....15.0- 1. 21 13. 5.5.4.1047 1.1.71.0- 1. 2.5472404. ....1550o 1. 25. ..354200- 17 1.1435704. 1.24544044 ....17.04. 500 ..04..5047 1.14.0504. 2.5.5.7044 1.75.5104. ‘ Th0 dime-0k- “ the 0.5090000 10 3700000 050. 400/5. Table 3.9: The residual and the orthonormality of computed eigenvalues and eigen- vectors from type 15 to 21. 34 The speed-up is defined as execution time using one processor P execution time using p processors and the efliciency is the ratio of the speed-up over p, the number of processors being used. Table 3.10 shows the execution time as well as the speed-up S, and the efficiency SP/p of our algorithm HOMO on matrices [1, 2, l] and T2 by using p processors. For the purpose of comparison with other methods, the speed up of our method over TQL2, 71.811726, on [1, 2, 1] are also listed. Similar results on random matrices and Wilkinson matrices are shown on Table 3.11. We list in Table 3.12, 3.13 and 3.14 some of the results of D&C and B/M. It is somewhat difficult to compare our results with theirs directly since their results were executed on different machines. Using TQL2, an indirect comparison may be obtained. In Table 3.12 (Table 8.3[9]), when eight processors are being used, the speed-up of D&C algorithm(SESUPD) over TQL2 is 9.4 for matrix [1, 2, 1] of order 100 whereas ours is 27.70 for the same matrix of order 125, and the speed-up of D&C algorithm(SESUPD) over TQL2 is 20.00 for order 400 whereas ours is 146.19 for order 499. In Table 3.14 (Table 8.4[9]), when eight processors are being used, the speed-up of D&C algorithm(SESUPD) over TQL2 is 12.1 for random matrix of order 100 whereas ours is 46.16 for the same matrix of order 125 (see Table 3.11), and the speedup of D&C algorithm(SESUPD) over TQL2 is 60.7 for order 400 whereas ours is 220.00 for order 499. Table 3.13 (Table 7b[23]) shows the speed-up of B / M (two versions: TREPSl and TREPS2) and D800 over TQL2 for the matrix [—1,2, —1] of order 500. It indicates that TQL2/SESUPD = 27.1 and TQL2/TREPS2 = 131.5 whereas TQL2/HOMO = 146.19 for the matrix [1,2,1] of order 499 on Table 3.10. This result suggests that the speed-up of the homotopy algorithm is at least as good as TREPSZ. Figure 3.1 shows the efficiency of the matrices [1, 2, 1], T2, Random matrices and W: of the order 499 and Figure 3.2 shows the speed-up of the matrices [1, 2, 1], T2, Random matrices and W: of the order 499. 35 [1, 2, 1] matrix 0.... 0.... HOMO g, TQL2 TQL2 li— (can...) 3, p (lu'l‘tms) (00.11....) ___—T 7.99 .1.0 1.00 27.55 3.44 9.41 1.0 1.00 26.33 65 2 4.51 1.8 0.89 6.10 4.93 1.9 0.95 4 2.77 2.9 0.72 9.93 2.88 3.3 0.82 1 29.18 1.0 1.00 176.7 6.06 34.98 1.0 1.00 177.33 125 2 15.25 1.9 0.96 11.59 17.55 2.0 1.00 4 8.76 3.3 0.83 20.18 9.34 3.7 0.94 8 6.38 4.6 0.57 27.70 6.14 5.7 0.71 1 122.37 1.0 1.00 1457. 11.91 143.46 1.0 1.00 1497. 2 69.93 1.9 0.97 20.85 71.76 2.0 1.00 255 4 33.08 3.7 0.92 44.07 37.2 3.9 0.96 8 20.38 6.0 0.75 71.53 22.4 6.4 0.80 16 14.28 8.6 0.54 102.08 14.85 9.7 0.60 1 477.91 1.0 1.00 10889 22.79 550.90 1.0 1.00 11198 2 242.01 2.0 0.99 45.00 276.56 2.0 1.00 499 4 125.45 3.8 0.95 86.80 140.48 3.9 0.98 8 74.49 6.4 0.80 146.19 81.72 6.7 0.84 16 47.41 10.1 0.63 229.69 52.27 10.5 0.66 Table 3.10: Execution time (second), speed-up and efficiency of HOMO on [1, 2, 1] and T2 matrices. 36 Random matrix ll ' Wilkinson matrix 0.... ...... 110140 5, TQL2 3912 HOMO g, TQL2 N p (lu‘l‘ime) S, L (Bu’l‘imo) HOMO (Bu’l‘imo) S, p (24m...) 
  N    p |  HOMO     S_p   S_p/p   TQL2      TQL2/HOMO |  HOMO     S_p   S_p/p   TQL2
         |                 (Random matrix)             |        (Wilkinson matrix)
  65   1 |    6.66   1.0   1.00     29.48      4.43    |    6.27   1.0   1.00     25.61
       2 |    3.90   1.7   0.85                7.56    |    3.35   1.9   0.94
       4 |    2.44   2.7   0.68               12.08    |    2.03   3.0   0.77
 125   1 |   21.11   1.0   1.00    187.4       8.88    |   22.10   1.0   1.00    172.05
       2 |   11.11   1.9   0.95               16.87    |   11.41   1.9   0.97
       4 |    6.04   3.5   0.87               31.03    |    6.20   3.6   0.89
       8 |    4.06   5.2   0.65               46.16    |    4.26   5.2   0.65
 255   1 |   80.55   1.0   1.00   1490.       18.50    |   86.69   1.0   1.00   1367.
       2 |   41.59   1.9   0.97               35.83    |   44.27   2.0   0.98
       4 |   22.23   3.6   0.91               67.03    |   23.45   3.7   0.92
       8 |   14.35   5.6   0.70              103.83    |   15.19   5.7   0.71
      16 |   10.57   7.6   0.48              140.96    |   10.55   8.2   0.51
 499   1 |  301.78   1.0   1.00  11447.       37.93    |  328.24   1.0   1.00   9949.
       2 |  155.11   1.9   0.97               73.80    |  166.82   2.0   0.98
       4 |   82.47   3.7   0.92              138.80    |   88.89   3.7   0.92
       8 |   52.03   5.8   0.73              220.00    |   56.16   5.8   0.73
      16 |   39.19   7.7   0.48              292.09    |   40.13   8.2   0.51

Table 3.11: Execution time (second), speed-up and efficiency of HOMO on Wilkinson and Random matrices. (HOMO = execution time in seconds, S_p = speed-up on p processors, S_p/p = efficiency, TQL2 = execution time of TQL2, TQL2/HOMO = speed-up over TQL2.)

 N            100    200    300    400
 TQL2/SESUPD  9.4    15.4   17.7   20.00

Table 3.12: A comparison on the Alliant FX/8 for TQL2 vs. TREEQL on [1, 2, 1] matrices.

 i   Algorithm        Time(TQL2 on 1 CE)/Time(Alg. i on 8 CEs)   Time(TQL2 on 8 CEs)/Time(Alg. i on 8 CEs)
 1   TREPS1                        32.9                                            7
 2   TREPS2                       131.5                                           28
 3   TQL2                           4.7                                            1
 4   BISECT+TINVIT                  3.6                                            0.8
 5   SESUPD                        27.1                                            5.8

Table 3.13: Speed-up over TQL2 on the Alliant FX/8 for computing all the eigenpairs of the [-1, 2, -1] matrix of order 500.

 N            100    200    300    400
 TQL2/SESUPD  12.1   19.5   38.8   60.7

Table 3.14: A comparison on the Alliant FX/8 for TQL2 vs. TREEQL on random matrices.

Figure 3.1: Efficiency of the homotopy method on [1, 2, 1], T2, Random matrices and W_n^+ (n = 500; p = number of processors; vertical axis: efficiency S_p/p).

Figure 3.2: Speed-up of the homotopy method on [1, 2, 1], T2, Random matrices and W_n^+ (n = 500; p = number of processors; vertical axis: speed-up S_p).

PART II

The Homotopy Method for Real Symmetric Generalized Eigenproblem

Chapter 4

Introduction

Consider the symmetric positive semidefinite generalized eigenproblem

$$
Ax = \lambda Bx \tag{4.1}
$$

where A and B are n x n real symmetric matrices and B is positive semidefinite. The pair (A, B) is called the pencil of the eigenproblem (4.1), and an eigenpair of (4.1) is called an eigenpair of the pencil (A, B). When B is singular, (4.1) has fewer than n eigenvalues.

Let XΛX^T be a spectral decomposition of B, where X is an orthogonal matrix and Λ is a diagonal matrix. Then (4.1) is equivalent to Ãy = λΛy with Ã = X^T A X and y = X^T x. Hence, we may assume the matrix B in (4.1) is a diagonal matrix with nonnegative diagonal elements. In this case, the MDR reduction [3] can further reduce A to a symmetric tridiagonal matrix and, through the reduction, keep B as a diagonal matrix. Therefore, we will assume hereafter that, in (4.1), A is a symmetric tridiagonal matrix of the form

$$
A = \begin{pmatrix}
\alpha_1 & \beta_2 & & \\
\beta_2 & \alpha_2 & \ddots & \\
 & \ddots & \ddots & \beta_n \\
 & & \beta_n & \alpha_n
\end{pmatrix} \tag{4.2}
$$

and B = diag(b_1, b_2, ..., b_n) with b_i ≥ 0. If β_i = 0 for some i, 2 ≤ i ≤ n, then R^n can clearly be decomposed into two complementary subspaces invariant under A. Thus the generalized eigenproblem Ax = λBx is decomposed in an obvious way into two smaller subproblems. Hence, we will assume that each β_i ≠ 0. That is, A is unreduced. In this part of the work, we shall describe our homotopy algorithm for solving eigenpairs of the pencil (A, B) in (4.1).
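Before turning to the homotopy itself, the first reduction step described above can be illustrated with a minimal Python/NumPy sketch (not the thesis code, and not the MDR reduction): B is factored as B = XΛX^T and the pencil is transformed by congruence, so that (X^T A X, Λ) has the same eigenvalues, with eigenvectors related by y = X^T x. The function name `diagonalize_B` is ours.

```python
import numpy as np

def diagonalize_B(A, B):
    """Congruence transform of the pencil (A, B), B symmetric positive semidefinite:
        A x = lambda B x   <=>   A_tilde y = lambda Lam y,   y = X^T x,
    where B = X diag(lam) X^T is a spectral decomposition of B and
    A_tilde = X^T A X.  Eigenvectors map back via x = X y."""
    lam, X = np.linalg.eigh(B)              # X orthogonal, lam >= 0 up to roundoff
    A_tilde = X.T @ A @ X
    return A_tilde, np.diag(lam), X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 6
    A = rng.standard_normal((n, n)); A = (A + A.T) / 2
    C = rng.standard_normal((n, n - 2)); B = C @ C.T      # positive semidefinite, rank n - 2
    A_tilde, Lam, X = diagonalize_B(A, B)
    # the two pencils are congruent, hence share the same eigenvalues:
    assert np.allclose(X @ A_tilde @ X.T, A)
    assert np.allclose(X @ Lam @ X.T, B)
```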
Let D be an n x n symmetric tridiagonal matrix and consider the homotopy H : Rn x R x [0, 1] —-§ Rn x R, defined by ABa: — D2: /\B:1: — A1: HWJJ) =(1-t) xTBz—l +t zTBz-l 2 2 ABz - [(1 — t)D + tA]a: zTBa: — 1 (4.3) 2 ABa: — A(t)x = $733: — 1 2 where A(t) = (1 — t)D + tA. The pencil (D, B) is called an initial pencil. In Chapter 5, we shall give a reduction which shows that the eigenvalues of the pencil (A, B) are the same eigenvalues of a pencil (A, B) with an unreduced sym- metric tridiagonal matrix A and a positive definite diagonal matrix B. And, from the reduction, one can easily form the eigenpair (:c, A) of the pencil (A, B) from the eigenpair (y, )1) of the pencil (A, B). Therefore, we shall assume the diagonal elements b,- of B are all positive. In such case, we shall prove, in Chapter 6, that the solution set of H (z, A,t) = 0 in (4.3) consists of disjoint smooth curves, each of which joins an eigenpair of the pencil (D, B) to an eigenpair of the pencil (A, B). We call each 42 of these curves a homotopy curve or an eigenpath. Thus, by following the eigenpaths emanating from the eigenpairs of the pencil (D, B) at t = 0, we can reach all the eigenpairs of the pencil (A, B) at t = 1. The algorithm of following an eigenpath will be given in detail in Chapter 7. Some numerical results along with comparison with the QZ- method will be presented in Chapter 8. It appears that our algorithm is strongly competitive in terms of speed, accuracy and orthogonality and leads in speed in all the cases. 43 Chapter 5 Reduction Let A be an unreduced, symmetric tridiagonal matrix in (4.2) and b = diag(bI, b2,...,b,,) with b: > 0 and b,- _>_ 0 for i = 2,3, ...,n. In this chapter, we give the details of the reduction of the pencil (A, B) to a pencil (A, B), where A is still unreduced tridiagonal and B is positive definite and diagonal. Lemma 5.1 Assume A: ————JEmfi}-“‘ andB=(Bl 0): \ where B1 is a positive definite diagonal matrix with dim(B1) = dim(A1) = m1 and 0 is the zero matrix with dim(0) = dim(A‘,’) = n - m1. If det(A‘,’) 96 0, then the pencil (A, B) has the same eigenvalues as the eigenproblem: (A1 + S)y = A313/, where S is a diagonal matrix. Proof: Clearly, with x = (x1, x2, ...,xn)T, Ax = ABx is equivalent to and {$1\ 3m: —l 0 Km) 0 1 6 ( fim+1xm ) +A‘,’ ) Since det(A‘1’) gé 0, (A‘,’)"1 exists. From (5.2), ( mmx+1 ‘ aim; +2 (2..) = 443-1 Let (w1, wg, ..., wn_,,,,)T be the solution of A? rum (44..-... J 10: K flmi +131»: +1 ) / 4...,“ \ 3m1+2 = AB] 31111—1 (4,.) 0 (1) 0 m then from (5.3) and (5.4), me = —wlflm,+1xm,. Let f 0 then (5.1) can be written as: (A1 + S) { Batu-131711 \ I.” Km) (5.1) (5.2) (5.3) (5.4) (5.5) Clearly A1 + S is an unreduced symmetric tridiagonal matrix and the eigenvalues of (5.5) are exactly the eigenvalues of the pencil (A, B). Furthermore, if x = (x1,x2, ...,xm,)T is the eigenvector of (5.5) corresponding to the eigenvalue A, then x = (x1, x3, ..., xm,, —cwl, ..., —cw,,_,,,,)T, where c = 5",”.le , is the eigenvector of the pencil (A, B) corresponding to the same eigenvalue A. Q.E.D. Lemma 5.2 Assume / . K A l _ _ .1- ' €79.11- .. 19m,“ I Bl A = || A(l). : and B = 0 , ----lgmzfl ---- B2 .. firm-H I I K . A2 1 where B,-,i = 1,2, are positive definite diagonal matrices with dim(B1) = dim(A1) = m; and dim(Bg) = dim(A2) = n — mg. 0 is the zero matrix with dim(0) = dim(A?) = mg - m1. If det(A?) 
rf 0, then the pencil (A, B) has the same eigen- values as the eigenproblem: M“ l l (3‘ ) +3 y=A y, (5.6) A2 82 where S is a symmetric matrix and the matrix A1 +S is unreduced, symmetric and tridiagonal. Proof: Clearly, Ax = ABx can be rewritten as: ( 31 A f 0 l f x; A A1 5 “l" 0 =A31 E , (5.7) K 3m / Kflmmzmm ) K 2..., ) 46 ( firm-Hahn: \ +A‘,’ and Since det(A‘1’) 76 0, from (5.8) (zmfll $m1+2 ting-1 K‘m J { zmfi-l \ f 0 A tun-{’2 + 0 3m: ) Kflmz-Hzmfi-l / { 3mg.” \ ( 17mg.” \ 3m?” = A 32 3m2+2 4M0“ K flmz+13M2+l / =0 (5@ 59) (5m) Let w = (wl, ..., mm.m1 )T and v = (v1, ...,vm,-m,)T be the solutions of (1) (0) 0 0 0 E Alw = , and Alv = , i 0 K 0 l K I J then, from (5.10), flm1+lzm1+l = “wlflrznl-I-lzml _ vlflm1+1fimg+lxm2+l flinz'i-lxma = 'wmg—m1flm14-116m2-I-13m1 "" ”Ma—M1fl72n34-lzmTI-1' Since A? is unreduced, clearly v1 7e 0. A? is symmetric, so is its inverse. Thus, v1 = w,,,,..,,.,. Let S be an (n — mg + m1) x (n — mg + m1) matrix of form 47 K 0 l 0 “ml 53., +1 ”vlfimi +1 fimfi-l "(ml)". raw -v1.3m.+iflm2+1 —”m,—m116,2n,+1 . ...(m1+1)"‘row 0 K 0) Then, (5.7) and (5.9) can be rewritten as: f 271 A ( 1'1 \ A1 + S xi": = A Bl 2m; . (5.11) A2 3771344 32 3m2+1 Clearly, the matrix A1 + S is unreduced, symmetric and tridiagonal and the eigenvalues of (5.11) are the same eigenvalues of the pencil (A, B). Furthermore, if (x1, ..., xm, 2m“, ..., xn)T is the eigenvector of (5.11) correspond- ing to the eigenvalue A, then for e = 19",,“ xm,“ and d = flmfilxmv (x x —w — v d —w c — v .d )T l, 00., mx, [C l ’00., m2_m1 gun—m‘ ,3m2+1,...,$n is the eigenvector of the pencil (A, B) corresponding to the same eigenvalue A. Q.E.D. 48 If (B; A (A1 4: A 01 :1: A‘,’ :1: B: B2 , weletA= . (5.12) ASL, :1: If (B; l (A, at A 01 :1: A‘,’ at B: B2 ,weletA= . (5.13) -. A, 4: K 0,) K * 42) where 05’: are zero matrices with dim(0,-) = dim(A?), and B,-’s arepositive definite diagonal matrices with dim(B.-) = dim(A.-), i = 1,2, ...,r. Theorem 5.1 If det(A?) 96 0, i = l,2,...,r, then the pencil (A, B) has the same eigenvalues as the eigenproblem: ((A1 ) ) (B. ) I » A B E 2 + S y = A 2 . y, (5.14) KK Ar) 1 K Br) where S is a symmetric matrix and the matrix on the left hand side is an unreduced symmetric tridiagonal matrix. Proof: By the same line of argument used in Lemma 5.1 and Lemma 5.2, the proof can be immediately achieved by mathematics induction. Q.E.D. In the following discussion, let (A)1 and (A)1 denote the lower (n — 1) x (n - l) submatrix and the upper (n - 1) x (n — 1) submatrix of A respectively, then (A)1 = ((4)1)1 =((A)‘)1~ 49 Lemma 5.3 Assume A: —————-€"2+3-—-- and B: Bl , 0 where B1 is a positive definite diagonal matrix with dim(B1) = dim(A1) : m1 and 0 is a zero matrix with dim(0) : dim(A‘l’) = n — m1. If det(A‘1’) = 0, then the pencil (A, B) has the same eigenvalues as the eigenproblem: (Amy = M3019, Proof: Clearly, Ax : ABx can be rewritten as ( x; A f 0 A r x; A A, 4., + 0 = AB, 3” (5.15) K 3m; j \ fimj-I-lznu-I-l ) K 3m) J and ( flint-Hz?!“ \ ( $731+! \ 0 + A‘,’ ““7 = 0. ~ (5.16) K o J K .. J Since det(A‘1’) : 0 and (5.16) has a solution, / (fling-Hz?!“ \ \ rank 0 ,A‘,’ = ran/4A?) = n — m1 — 1. 50 This implies flm+1xm : 0, since A? is unreduced tridiagonal. Since 5m,” 74 0, xml : 0. Replacing x,,., by 0 in (5.15), we have (x1)(0) (x1) A1 : + E = AB; 3 . (5.17) 3m1—1 0 31m —1 K 0 l K flmi-Hzmi-H ) K 0 ) From the last row of (5.17), flmlxml-l = —flm1+13m1+1- (5.18) Hence, $1 31 (All! 5 = A(Blh 5 . (5.19) xml-l Jinn—l Clearly, the eigenvalues of (5.19) are the same eigenvalues of the pencil (A, B). 
Moreover, from (5.16) and (5.18),

$$A_1^0\Bigl(-\tfrac{\beta_{m_1}}{\beta_{m_1+1}}x_{m_1-1},\,x_{m_1+2},\,\ldots,\,x_n\Bigr)^T = 0. \qquad (5.20)$$

When (5.19) is solved, $x_{m_1-1}$ is known, so (5.20) has a unique solution since $\operatorname{rank}(A_1^0) = n - m_1 - 1$. Hence if $(x_1,\ldots,x_{m_1-1})^T$ is the eigenvector of (5.19) corresponding to the eigenvalue $\lambda$, then for $c = -\beta_{m_1}/\beta_{m_1+1}$,

$$(x_1,\ldots,x_{m_1-1},\,0,\,c\,x_{m_1-1},\,x_{m_1+2},\ldots,x_n)^T$$

is the eigenvector of the pencil $(A, B)$ corresponding to the same eigenvalue $\lambda$. Q.E.D.

Lemma 5.4 Assume

$$A = \begin{pmatrix} A_1 & * & \\ * & A_1^0 & * \\ & * & A_2\end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} B_1 & & \\ & 0 & \\ & & B_2\end{pmatrix},$$

where $B_i$, $i = 1, 2$, are positive definite diagonal matrices with $\dim(B_1) = \dim(A_1) = m_1$ and $\dim(B_2) = \dim(A_2) = n - m_2$, and $0$ is a zero matrix with $\dim(0) = \dim(A_1^0) = m_2 - m_1$. If $\det(A_1^0) = 0$, then the pencil $(A, B)$ has the same eigenvalues as the eigenproblem

$$\left(\begin{pmatrix}(A_1)^1 & & \\ & 0 & \\ & & (A_2)_1\end{pmatrix} + S\right)y = \lambda\left(\begin{pmatrix}(B_1)^1 & & \\ & 0 & \\ & & (B_2)_1\end{pmatrix} + T\right)y, \qquad (5.21)$$

where $S$ is a symmetric matrix and the matrix on the left-hand side in (5.21) is an unreduced symmetric tridiagonal matrix; $T$ is a diagonal matrix and the matrix on the right-hand side is positive definite and diagonal.

Proof: If $m_2 - m_1 = 1$, i.e., $A_1^0$ is a $1\times 1$ matrix, then $A_1^0 = 0$ since $\det(A_1^0) = 0$. Hence $Ax = \lambda Bx$ can be rewritten as

$$A_1(x_1,\ldots,x_{m_1})^T + (0,\ldots,0,\beta_{m_1+1}x_{m_1+1})^T = \lambda B_1(x_1,\ldots,x_{m_1})^T,$$

$$\beta_{m_1+1}x_{m_1} + \beta_{m_2+1}x_{m_2+1} = 0, \qquad (5.22)$$

and

$$(\beta_{m_2+1}x_{m_2},0,\ldots,0)^T + A_2(x_{m_2+1},\ldots,x_n)^T = \lambda B_2(x_{m_2+1},\ldots,x_n)^T.$$

Notice that $x_{m_2} = x_{m_1+1}$, and that $B_1$ and $B_2$ are positive definite. Using (5.22) to eliminate $x_{m_1}$, the equations above may be rewritten as

$$(A_1)^1(x_1,\ldots,x_{m_1-1})^T + \Bigl(0,\ldots,0,\,-\tfrac{\beta_{m_1}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1}\Bigr)^T = \lambda(B_1)^1(x_1,\ldots,x_{m_1-1})^T,$$

$$\beta_{m_1}x_{m_1-1} - \frac{\alpha_{m_1}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1} + \beta_{m_1+1}x_{m_2} = -\lambda\,\frac{b_{m_1}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1}, \qquad (5.23)$$

$$\beta_{m_2+1}x_{m_2} + \alpha_{m_2+1}x_{m_2+1} + \beta_{m_2+2}x_{m_2+2} = \lambda b_{m_2+1}x_{m_2+1}, \qquad (5.24)$$

and

$$(\beta_{m_2+2}x_{m_2+1},0,\ldots,0)^T + (A_2)_1(x_{m_2+2},\ldots,x_n)^T = \lambda(B_2)_1(x_{m_2+2},\ldots,x_n)^T.$$

Solving for $x_{m_2}$ in (5.23) and substituting it into (5.24) yields

$$-\frac{\beta_{m_1}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_1-1} + \Bigl(\alpha_{m_1}\frac{\beta_{m_2+1}^2}{\beta_{m_1+1}^2} + \alpha_{m_2+1}\Bigr)x_{m_2+1} + \beta_{m_2+2}x_{m_2+2} = \lambda\Bigl(b_{m_2+1} + \frac{\beta_{m_2+1}^2}{\beta_{m_1+1}^2}b_{m_1}\Bigr)x_{m_2+1}.$$

Let $S$ and $T$ be $(n - m_2 + m_1 - 1)\times(n - m_2 + m_1 - 1)$ matrices whose only nonzero entries are

$$S_{m_1-1,m_1} = S_{m_1,m_1-1} = -\frac{\beta_{m_1}\beta_{m_2+1}}{\beta_{m_1+1}},\qquad S_{m_1,m_1} = \alpha_{m_1}\frac{\beta_{m_2+1}^2}{\beta_{m_1+1}^2} + \alpha_{m_2+1},\qquad S_{m_1,m_1+1} = S_{m_1+1,m_1} = \beta_{m_2+2},$$

$$T_{m_1,m_1} = b_{m_2+1} + \frac{\beta_{m_2+1}^2}{\beta_{m_1+1}^2}b_{m_1}.$$

Then

$$\left(\begin{pmatrix}(A_1)^1 & & \\ & 0 & \\ & & (A_2)_1\end{pmatrix} + S\right)(x_1,\ldots,x_{m_1-1},x_{m_2+1},\ldots,x_n)^T = \lambda\left(\begin{pmatrix}(B_1)^1 & & \\ & 0 & \\ & & (B_2)_1\end{pmatrix} + T\right)(x_1,\ldots,x_{m_1-1},x_{m_2+1},\ldots,x_n)^T. \qquad (5.25)$$

Clearly,

$$b_{m_2+1} + \frac{\beta_{m_2+1}^2}{\beta_{m_1+1}^2}b_{m_1} > 0,$$

and the matrix on the left-hand side in (5.25) is unreduced symmetric tridiagonal. The eigenvalues of (5.25) are the same as the eigenvalues of the pencil $(A, B)$.

Furthermore, after solving (5.25), $x_{m_1}$ and $x_{m_2}$ can be obtained from (5.22) and from (5.23). Hence if $(x_1,\ldots,x_{m_1-1},x_{m_2+1},\ldots,x_n)^T$ is the eigenvector of (5.25) corresponding to the eigenvalue $\lambda$, then $(x_1,\ldots,x_{m_1-1},x_{m_1},x_{m_2},x_{m_2+1},\ldots,x_n)^T$ is the eigenvector of the pencil $(A, B)$ corresponding to the same eigenvalue $\lambda$.

If $A_1^0$ is $(m_2 - m_1)\times(m_2 - m_1)$ with $m_2 - m_1 \geq 2$, then $Ax = \lambda Bx$ can be written as

$$(A_1)^1(x_1,\ldots,x_{m_1-1})^T + (0,\ldots,0,\beta_{m_1}x_{m_1})^T = \lambda(B_1)^1(x_1,\ldots,x_{m_1-1})^T, \qquad (5.26)$$

$$\beta_{m_1}x_{m_1-1} + \alpha_{m_1}x_{m_1} + \beta_{m_1+1}x_{m_1+1} = \lambda b_{m_1}x_{m_1}, \qquad (5.27)$$

$$(\beta_{m_1+1}x_{m_1},0,\ldots,0)^T + A_1^0(x_{m_1+1},\ldots,x_{m_2})^T + (0,\ldots,0,\beta_{m_2+1}x_{m_2+1})^T = 0, \qquad (5.28)$$

$$\beta_{m_2+1}x_{m_2} + \alpha_{m_2+1}x_{m_2+1} + \beta_{m_2+2}x_{m_2+2} = \lambda b_{m_2+1}x_{m_2+1}, \qquad (5.29)$$

and

$$(\beta_{m_2+2}x_{m_2+1},0,\ldots,0)^T + (A_2)_1(x_{m_2+2},\ldots,x_n)^T = \lambda(B_2)_1(x_{m_2+2},\ldots,x_n)^T. \qquad (5.30)$$

Since $\operatorname{rank}(A_1^0) = m_2 - m_1 - 1$ and $A_1^0$ is unreduced tridiagonal,

$$A_1^0(1,\,y_1,\,\ldots,\,y_{m_2-m_1-1})^T = 0 \qquad (5.31)$$

has a unique solution.
Multiplying $(1, y_1, \ldots, y_{m_2-m_1-1})$ on the left in (5.28) yields

$$\beta_{m_1+1}x_{m_1} = -y_{m_2-m_1-1}\,\beta_{m_2+1}x_{m_2+1}.$$

Now (5.26) and (5.27) can be rewritten as

$$(A_1)^1(x_1,\ldots,x_{m_1-1})^T + \Bigl(0,\ldots,0,\,-\tfrac{y_{m_2-m_1-1}\beta_{m_1}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1}\Bigr)^T = \lambda(B_1)^1(x_1,\ldots,x_{m_1-1})^T \qquad (5.32)$$

and

$$\beta_{m_1}x_{m_1-1} - \frac{y_{m_2-m_1-1}\,\alpha_{m_1}\,\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1} + \beta_{m_1+1}x_{m_1+1} = -\lambda\,\frac{y_{m_2-m_1-1}\,b_{m_1}\,\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1}. \qquad (5.33)$$

Since $\det(A_1^0) = 0$ and $A_1^0$ is unreduced tridiagonal, $\det((A_1^0)_1) \neq 0$. Therefore

$$(A_1^0)_1\, v = (1,0,\ldots,0)^T \quad\text{and}\quad (A_1^0)_1\, w = (0,\ldots,0,1)^T$$

have solutions. Write $v = (v_1,\ldots,v_{m_2-m_1-1})^T$ and $w = (w_1,\ldots,w_{m_2-m_1-1})^T$. $(A_1^0)_1$ is symmetric, so is its inverse; so $w_1 = v_{m_2-m_1-1}$. From (5.28),

$$x_{m_1+2} = -v_1\beta_{m_1+2}x_{m_1+1} - w_1\beta_{m_2+1}x_{m_2+1} \quad\text{and}\quad x_{m_2} = -w_1\beta_{m_1+2}x_{m_1+1} - w_{m_2-m_1-1}\beta_{m_2+1}x_{m_2+1}. \qquad (5.34)$$

It follows from (5.31) that

$$(A_1^0)_1(y_1,\ldots,y_{m_2-m_1-1})^T = -(\beta_{m_1+2},0,\ldots,0)^T,$$

thus

$$y_{m_2-m_1-1} = -\beta_{m_1+2}v_{m_2-m_1-1} = -\beta_{m_1+2}w_1. \qquad (5.35)$$

Substituting (5.34) into (5.29), we have

$$-w_1\beta_{m_1+2}\beta_{m_2+1}x_{m_1+1} + \bigl(\alpha_{m_2+1} - w_{m_2-m_1-1}\beta_{m_2+1}^2\bigr)x_{m_2+1} + \beta_{m_2+2}x_{m_2+2} = \lambda b_{m_2+1}x_{m_2+1}. \qquad (5.36)$$

Similarly, substituting (5.35) into (5.32) and (5.33) yields

$$\beta_{m_1}x_{m_1-1} + \frac{w_1\alpha_{m_1}\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1} + \beta_{m_1+1}x_{m_1+1} = \lambda\,\frac{w_1 b_{m_1}\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1} \qquad (5.37)$$

and

$$(A_1)^1(x_1,\ldots,x_{m_1-1})^T + \Bigl(0,\ldots,0,\,\tfrac{w_1\beta_{m_1}\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1}\Bigr)^T = \lambda(B_1)^1(x_1,\ldots,x_{m_1-1})^T. \qquad (5.38)$$

We may solve for $x_{m_1+1}$ in (5.37) and substitute it into (5.36), to obtain

$$\frac{w_1\beta_{m_1}\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_1-1} + \Bigl(\alpha_{m_2+1} - w_{m_2-m_1-1}\beta_{m_2+1}^2 + \alpha_{m_1}\bigl(\tfrac{w_1\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}\bigr)^2\Bigr)x_{m_2+1} + \beta_{m_2+2}x_{m_2+2} = \lambda\, b\, x_{m_2+1}, \qquad (5.39)$$

where

$$b = b_{m_2+1} + b_{m_1}\Bigl(\frac{w_1\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}\Bigr)^2 > 0.$$

Let $S$ and $T$ be $(n - m_2 + m_1 - 1)\times(n - m_2 + m_1 - 1)$ matrices whose only nonzero entries are

$$S_{m_1-1,m_1} = S_{m_1,m_1-1} = s,\qquad S_{m_1,m_1} = t,\qquad S_{m_1,m_1+1} = S_{m_1+1,m_1} = \beta_{m_2+2},\qquad T_{m_1,m_1} = b,$$

where

$$s = \frac{w_1\beta_{m_1}\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}, \qquad t = \alpha_{m_2+1} - w_{m_2-m_1-1}\beta_{m_2+1}^2 + \alpha_{m_1}\Bigl(\frac{w_1\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}\Bigr)^2.$$

From (5.30), (5.38) and (5.39), we have

$$\left(\begin{pmatrix}(A_1)^1 & & \\ & 0 & \\ & & (A_2)_1\end{pmatrix} + S\right)(x_1,\ldots,x_{m_1-1},x_{m_2+1},\ldots,x_n)^T = \lambda\left(\begin{pmatrix}(B_1)^1 & & \\ & 0 & \\ & & (B_2)_1\end{pmatrix} + T\right)(x_1,\ldots,x_{m_1-1},x_{m_2+1},\ldots,x_n)^T. \qquad (5.40)$$

Obviously, the matrix on the left-hand side in (5.40) is unreduced tridiagonal and the matrix on the right-hand side is positive definite and diagonal. The eigenvalues of (5.40) are the same as the eigenvalues of the pencil $(A, B)$.

Furthermore, after solving (5.40), we can calculate $x_{m_1+1}$ from (5.36), $x_{m_1}$ from (5.27) and $x_{m_1+2},\ldots,x_{m_2}$ from (5.28). Hence, the eigenvector of the pencil $(A, B)$ corresponding to the eigenvalue $\lambda$ can be easily formed from the eigenvector of (5.40) corresponding to the same eigenvalue $\lambda$. Q.E.D.

Theorem 5.2 Assume $\det(A_i^0) = 0$ for $i = 1, 2, \ldots, r$.

(i) If $A$ and $B$ are of the same form as in (5.12), then the pencil $(A, B)$ has the same eigenvalues as the eigenproblem

$$\Bigl(\operatorname{diag}\bigl((A_1)^1,\, 0,\, ((A_2)_1)^1,\, 0,\, \ldots,\, (A_r)_1\bigr) + S\Bigr)v = \lambda\Bigl(\operatorname{diag}\bigl((B_1)^1,\, 0,\, ((B_2)_1)^1,\, 0,\, \ldots,\, (B_r)_1\bigr) + T\Bigr)v, \qquad (5.41)$$

where $S$ is a symmetric matrix and the matrix on the left-hand side is unreduced symmetric tridiagonal, and $T$ is a diagonal matrix and the matrix on the right-hand side is positive definite and diagonal.

(ii) If $A$ and $B$ are of the same form as in (5.13), then the pencil $(A, B)$ has the same eigenvalues as the eigenproblem

$$\Bigl(\operatorname{diag}\bigl((A_1)^1,\, 0,\, ((A_2)_1)^1,\, 0,\, \ldots,\, ((A_r)_1)^1\bigr) + S\Bigr)u = \lambda\Bigl(\operatorname{diag}\bigl((B_1)^1,\, 0,\, ((B_2)_1)^1,\, 0,\, \ldots,\, ((B_r)_1)^1\bigr) + T\Bigr)u, \qquad (5.42)$$

where $S$ is a symmetric matrix and the matrix on the left-hand side in (5.42) is unreduced symmetric tridiagonal, and $T$ is a diagonal matrix and the matrix on the right-hand side is positive definite and diagonal.

Proof: By the same arguments used in Lemma 5.3 and Lemma 5.4, the proof follows easily by mathematical induction. Q.E.D.
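As a small illustration of the reduction (this example is added for exposition and does not appear elsewhere in the text), take $n = 3$, $m_1 = 2$ and

$$A = \begin{pmatrix}2 & 1 & 0\\ 1 & 3 & 1\\ 0 & 1 & 0\end{pmatrix}, \qquad B = \operatorname{diag}(1, 1, 0),$$

so that $A_1 = \begin{pmatrix}2 & 1\\ 1 & 3\end{pmatrix}$, $A_1^0 = (0)$ and $\det(A_1^0) = 0$. Lemma 5.3 says the finite eigenvalues of $(A, B)$ are those of $(A_1)^1 y = \lambda (B_1)^1 y$, i.e. $2y = \lambda y$, giving the single finite eigenvalue $\lambda = 2$; direct expansion gives $\det(A - \lambda B) = \lambda - 2$, which confirms this. The count of Theorem 5.4 (stated next, in Chapter 5) agrees: $f_A = \operatorname{rank}(B) - \sum_i\bigl(1 - \operatorname{sign}(|\det(A_i^0)|)\bigr) = 2 - 1 = 1$. The eigenvector is recovered as in Lemma 5.3 with $c = -\beta_{m_1}/\beta_{m_1+1} = -1$: from $y = (1)$ one obtains $x = (1, 0, -1)^T$, and indeed $Ax = (2, 0, 0)^T = 2Bx$.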
Theorem 5.3 If $A$ is an unreduced symmetric tridiagonal matrix and $B$ is a positive semidefinite diagonal matrix, then the pencil $(A, B)$ can be reduced to a positive definite pencil $(\tilde A, \tilde B)$, where $\tilde A$ is still unreduced symmetric tridiagonal and $\tilde B$ is positive definite diagonal.

Proof: The proof follows immediately from Theorem 5.1 and Theorem 5.2. Q.E.D.

Let $f_A$ denote the number of finite eigenvalues of the pencil $(A, B)$. By Theorem 5.1 and Theorem 5.2, we obtain the following result.

Theorem 5.4

$$f_A = \operatorname{rank}(B) - \sum_{i=1}^{r}\bigl(1 - \operatorname{sign}(|\det(A_i^0)|)\bigr).$$

Since $B$ is diagonal, $\operatorname{rank}(B)$ is just the number of nonzero diagonal elements of $B$. If zero is an eigenvalue of $A_i^0$, then $\operatorname{sign}(|\det(A_i^0)|) = 0$; otherwise $\operatorname{sign}(|\det(A_i^0)|) = 1$. Since $A_i^0$ is unreduced, the Sturm sequence [24, 31] can be used to check whether zero is an eigenvalue of $A_i^0$, so $f_A$ can be easily computed without actually solving the generalized eigenvalue problem.

Chapter 6

Preliminary Analysis

Let $A$ be an $n\times n$ real symmetric tridiagonal matrix and $B$ an $n\times n$ positive definite diagonal matrix of the form

$$A = \begin{pmatrix}\alpha_1 & \beta_2 & & & \\ \beta_2 & \alpha_2 & \beta_3 & & \\ & \beta_3 & \ddots & \ddots & \\ & & \ddots & \alpha_{n-1} & \beta_n \\ & & & \beta_n & \alpha_n\end{pmatrix}, \qquad B = \operatorname{diag}(b_1, b_2, \ldots, b_n), \qquad (6.1)$$

where $b_i > 0$ and $\beta_i \neq 0$ for all $i$.

Our algorithm employs the same 'Divide and Conquer' strategy as in Part I. First of all, the matrix $A$ is divided into two blocks by setting one of the $\beta$'s to zero. Namely, for the initial pencil $(D, B)$ in (4.3), we let

$$D = \begin{pmatrix}D_1 & 0\\ 0 & D_2\end{pmatrix}, \qquad (6.2)$$

where

$$D_1 = \begin{pmatrix}\alpha_1 & \beta_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_k\\ & & \beta_k & \alpha_k\end{pmatrix}, \qquad D_2 = \begin{pmatrix}\alpha_{k+1} & \beta_{k+2} & & \\ \beta_{k+2} & \alpha_{k+2} & \ddots & \\ & \ddots & \ddots & \beta_n\\ & & \beta_n & \alpha_n\end{pmatrix},$$

and then let

$$B = \begin{pmatrix}B_1 & 0\\ 0 & B_2\end{pmatrix} \quad\text{with}\quad \dim(B_i) = \dim(D_i),\ i = 1, 2.$$

We calculate the eigenvalues of the pencils $(D_i, B_i)$, $i = 1, 2$, by using the most efficient algorithm available. Then our algorithm conquers the pencil $(A, B)$ by the homotopy in (4.3).

Theorem 6.1 The solution set of (4.3) consists of disjoint smooth curves, and each curve joins one eigenpair of the pencil $(D, B)$ and one eigenpair of the pencil $(A, B)$.

Proof: Differentiating $H(x, \lambda, t)$ with respect to $(x, \lambda)$ in (4.3) yields

$$H_{(x,\lambda)}(x, \lambda, t) = \begin{pmatrix}\lambda B - A(t) & Bx\\ x^T B & 0\end{pmatrix}.$$

We claim that if $(x, \lambda, t)$ satisfies $H(x, \lambda, t) = 0$, then $H_{(x,\lambda)}(x, \lambda, t)$ is nonsingular. To see this, we show that $H_{(x,\lambda)}(x, \lambda, t)y = 0$ has only the trivial solution. Clearly $\dim(\ker(\lambda B - A(t))) = 1$ and $x^T B x \neq 0$, since for each $t$, $\lambda B - A(t)$ is unreduced tridiagonal and $B$ is positive definite. Suppose $y$ satisfies $H_{(x,\lambda)}y = 0$; write $y = (y_1^T, y_2)^T$, where $y_1 \in \mathbb{R}^n$ and $y_2 \in \mathbb{R}$. Then

$$(\lambda B - A(t))y_1 + y_2 Bx = 0, \qquad (6.3)$$
$$x^T B y_1 = 0. \qquad (6.4)$$

Since $x^T B x \neq 0$ and $x^T(\lambda B - A(t)) = 0$, (6.3) implies $y_2 = 0$. Hence

$$(\lambda B - A(t))y_1 = 0, \quad\text{or}\quad B^{\frac12}\bigl(\lambda I - B^{-\frac12}A(t)B^{-\frac12}\bigr)B^{\frac12}y_1 = 0.$$

That is, $B^{\frac12}y_1 \in \ker(\lambda I - \hat A(t))$ with $\hat A(t) = B^{-\frac12}A(t)B^{-\frac12}$. It is easy to see that $\hat A(t)$ is also unreduced, since $B^{-\frac12}$ is a diagonal matrix with positive diagonal elements. Thus $\dim(\ker(\lambda I - \hat A(t))) = 1$. On the other hand,

$$(\lambda I - \hat A(t))B^{\frac12}x = \bigl(\lambda I - B^{-\frac12}A(t)B^{-\frac12}\bigr)B^{\frac12}x = B^{-\frac12}(\lambda B - A(t))x = 0.$$

Thus $B^{\frac12}x \in \ker(\lambda I - \hat A(t))$, and hence $B^{\frac12}x = cB^{\frac12}y_1$ for a certain nonzero constant $c$. From (6.4),

$$x^T B y_1 = (B^{\frac12}x)^T(B^{\frac12}y_1) = c\,(B^{\frac12}y_1)^T(B^{\frac12}y_1) = 0.$$

Therefore $y_1 = 0$, since $B^{\frac12}$ is positive definite. Hence $y = 0$. By a repeated application of the implicit function theorem, the assertion of the theorem follows. Q.E.D.
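The divide step of (6.2) and the homotopy pencil used in Theorem 6.1 are easy to set up numerically. The following Python sketch is purely illustrative (the function names and the use of numpy/scipy are ours); it assumes the homotopy of (4.3) has the form $A(t) = (1-t)D + tA$ with the normalization $x^T B x = 1$, as used in the proof above.

import numpy as np
from scipy.linalg import eigh  # solves the generalized symmetric-definite problem

n = 8
rng = np.random.default_rng(0)
alpha = rng.random(n)                  # diagonal of A
beta = rng.random(n - 1) + 0.1         # off-diagonal of A (nonzero, so A is unreduced)
b = rng.random(n) + 0.1                # B = diag(b) is positive definite

A = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
B = np.diag(b)

# Divide: zero the coupling beta_{k+1} to split A into the two blocks D_1, D_2.
k = n // 2
D = A.copy()
D[k, k - 1] = D[k - 1, k] = 0.0

# Eigenvalues of the pencils (D_1, B_1) and (D_2, B_2) are the starting points at t = 0.
start_vals = np.concatenate([eigh(D[:k, :k], B[:k, :k], eigvals_only=True),
                             eigh(D[k:, k:], B[k:, k:], eigvals_only=True)])

def A_of_t(t):
    """Homotopy matrix A(t) = (1 - t) D + t A (assumed form of (4.3))."""
    return (1.0 - t) * D + t * A

def H(x, lam, t):
    """Residual of the homotopy: (A(t) x - lam B x, x^T B x - 1)."""
    return np.concatenate([A_of_t(t) @ x - lam * (B @ x), [x @ (B @ x) - 1.0]])

print("eigenvalues of (D, B):", np.sort(start_vals))
print("eigenvalues of (A, B):", eigh(A, B, eigvals_only=True))

Each starting eigenvalue at $t = 0$ is then continued to $t = 1$ along its own curve, which is what Theorems 6.2 and 6.3 below make precise.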
Let $\xi_i$ and $\lambda_i(t)$ be the eigenvalues of the pencils $(D, B)$ and $(A(t), B)$, respectively. Then the following theorems follow immediately from the results we proved in Part I.

Theorem 6.2 If the eigenvalues of $(D, B)$ are distinct, then

(i) either $\lambda_i(t)$ is constant for all $t$ in $[0, 1]$ or strictly monotonic;

(ii) $\dot\lambda_i(t)\,\ddot\lambda_i(t) > 0$ for $t$ small, if $\dot\lambda_i(t) \not\equiv 0$.

Theorem 6.3 For any $t \in [0, 1]$,

$$\xi_{i-1} \leq \lambda_i(t) \leq \xi_{i+1}, \quad i = 2, 3, \ldots, n-1, \qquad \lambda_1(t) \leq \xi_2, \qquad \lambda_n(t) \geq \xi_{n-1}.$$

From Theorem 6.2 and Theorem 6.3, every homotopy curve must be one of those in Figure 6.1. Each homotopy curve is bounded by two consecutive dotted lines, and no homotopy curve can cross any dotted line.

Figure 6.1: The generalized homotopy curves ($\lambda$ versus $t$; the curves defined by $H(x, \lambda, t) = 0$ run from the eigenvalues of $(D, B)$ at $t = 0$ to the eigenvalues of $(A, B)$ at $t = 1$).

Chapter 7

The Homotopy Algorithms for Generalized Eigenproblem

Our algorithm for finding all the eigenpairs of the pencil $(A, B)$, with unreduced symmetric tridiagonal matrix $A$ and positive semidefinite diagonal matrix $B$, is based on the following steps:

(i) Reduction
(ii) Initiating at $t = 0$
(iii) Prediction
(iv) Correction
(v) Checking
(vi) Detection of a cluster
(vii) Step-size selection
(viii) Terminating at $t = 1$
(ix) Forming the eigenpairs of the pencil $(A, B)$.

7.1 Reduction

If $B$ is positive semidefinite, the pencil $(A, B)$ will be reduced to a pencil $(\tilde A, \tilde B)$ with $\tilde B$ positive definite by the reduction described in Chapter 5.

7.2 Initiating at t = 0

As we mentioned in Part I, by making the initial matrix $D$ in (6.2) close to $A$, the eigenpairs of the pencil $(D, B)$ become excellent starting points. We intend to choose $k$ for which $\beta_{k+1}$ is as small as possible. To make the sizes of the blocks $D_1$ and $D_2$ roughly the same, we limit the choice of $k$ to the range $n/2 - j \leq k \leq n/2 + j$, where $j$ is roughly equal to $n/20$, and find the smallest $\beta_{k+1}$ by local sorting. When the initial matrix $D$ is decided, we calculate the eigenvalues of the pencils $(D_1, B_1)$ and $(D_2, B_2)$ by using the most efficient method available. We only require the accuracy to stay within one-half or even one-third of the working precision. With this strategy, a considerable amount of computing time is saved.

7.3 Prediction

Assume that after $i$ steps the approximate value $(\tilde x(t_i), \tilde\lambda(t_i))$ on the eigenpath $(x(t), \lambda(t))$ at $t_i$ is known and the next step size $h$ is determined; that is, $t_{i+1} = t_i + h$. We want to find an approximate value $(x^0(t_{i+1}), \lambda^0(t_{i+1}))$ of $(x(t_{i+1}), \lambda(t_{i+1}))$ on the eigenpath at $t_{i+1}$. Notice that $(x^0(t_{i+1}), \lambda^0(t_{i+1}))$ is an approximate eigenpair of the pencil $(A(t_{i+1}), B)$. Since $H(x(t), \lambda(t), t) = 0$, we have

$$A(t)x(t) = \lambda(t)Bx(t), \qquad x(t)^T B x(t) = 1.$$

Differentiating both equations with respect to $t$ yields

$$(A - D)x(t) + A(t)\dot x(t) = \dot\lambda(t)Bx(t) + \lambda(t)B\dot x(t), \qquad x(t)^T B \dot x(t) = 0. \qquad (7.1)$$

For $t = t_i$, multiplying (7.1) on the left by $x^T(t_i)$ yields

$$\dot\lambda(t_i) = x^T(t_i)(A - D)x(t_i) = 2\beta_{k+1}x_k(t_i)x_{k+1}(t_i), \qquad (7.2)$$

where $x(t_i) = (x_1(t_i), \ldots, x_n(t_i))^T$. In view of (7.2), we use the Euler predictor to predict the eigenvalue at $t_{i+1}$, namely

$$\lambda^0(t_{i+1}) = \tilde\lambda(t_i) + \dot\lambda(t_i)\,h.$$

It is easy to see that $\dot\lambda(0) = 0$ in (7.2). Consequently, $\lambda^0(t_1)$ always equals $\tilde\lambda(0)$.

To predict the eigenvector, we use inverse iteration on $(A(t_{i+1}), B)$ with shift $\lambda^0(t_{i+1})$. That is, we solve

$$\bigl(A(t_{i+1}) - \lambda^0(t_{i+1})B\bigr)\,y^0(t_{i+1}) = \tilde x(t_i)$$

and let

$$x^0(t_{i+1}) = \frac{y^0(t_{i+1})}{\bigl(y^0(t_{i+1})^T B\, y^0(t_{i+1})\bigr)^{1/2}}.$$

At $t_i = 0$, since we skip the calculation of the eigenvectors of the pencil $(D, B)$, $x(0)$ is not available; we choose a random vector for $x(0)$.
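A rough sketch of one prediction step follows, in the notation of the sketch after Theorem 6.1 (function and variable names are ours). It assumes $A(t) = (1-t)D + tA$, that $\beta_{k+1} = A_{k+1,k}$ is the coupling zeroed in (6.2), and that vectors are normalized so that $x^T B x = 1$.

import numpy as np

def predict(A, D, B, k, x_prev, lam_prev, t_next, h):
    """One prediction step of Section 7.3 (sketch).

    x_prev, lam_prev : approximate eigenpair of (A(t_i), B) with x_prev^T B x_prev = 1
    t_next = t_i + h : target value of the homotopy parameter
    """
    beta_k1 = A[k, k - 1]                      # beta_{k+1} in 1-based notation
    # Euler predictor, using (7.2): lambda'(t_i) = 2 beta_{k+1} x_k(t_i) x_{k+1}(t_i)
    lam_dot = 2.0 * beta_k1 * x_prev[k - 1] * x_prev[k]
    lam0 = lam_prev + lam_dot * h

    # One step of inverse iteration on (A(t_{i+1}), B) with shift lambda^0
    A_next = (1.0 - t_next) * D + t_next * A
    y = np.linalg.solve(A_next - lam0 * B, x_prev)
    x0 = y / np.sqrt(y @ (B @ y))              # enforce x^T B x = 1
    return x0, lam0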
7.4 Correction

When $B$ is positive definite, the generalized Rayleigh quotient $\rho(u) = (u^T A u)/(u^T B u)$ enjoys the following properties [24]:

(i) Stationarity. Since

$$\operatorname{grad}\rho(u) \equiv \nabla\rho(u) = \frac{2\,(Au - \rho(u)Bu)^T}{u^T B u},$$

$\rho$ is stationary at the eigenvectors of the pencil $(A, B)$.

(ii) Minimum residual.

$$\|(A - \sigma B)u\|_{B^{-1}}^2 \geq \|Au\|_{B^{-1}}^2 - |\rho(u)|^2\,\|Bu\|_{B^{-1}}^2,$$

where $\|x\|_{B^{-1}}^2 = x^T B^{-1} x$, with equality if and only if $\sigma = \rho(u)$.

(iii) Monotonicity.

$$\|(A - \rho_{k+1}B)x_{k+1}\|_{B^{-1}} \leq \|(A - \rho_k B)x_k\|_{B^{-1}} \quad\text{for all } k.$$

(iv) Cubic convergence.

Therefore, we use the generalized RQI as a corrector, starting with $x^0(t_{i+1})$. To be more precise, at $(x^{j-1}(t_{i+1}), \lambda^{j-1}(t_{i+1}))$ ($j \geq 1$) let

$$\lambda^j(t_{i+1}) = x^{j-1}(t_{i+1})^T A(t_{i+1})\, x^{j-1}(t_{i+1}),$$

then solve

$$\bigl(A(t_{i+1}) - \lambda^j(t_{i+1})B\bigr)\,y^j(t_{i+1}) = x^{j-1}(t_{i+1})$$

and let

$$x^j(t_{i+1}) = \frac{y^j(t_{i+1})}{\bigl(y^j(t_{i+1})^T B\, y^j(t_{i+1})\bigr)^{1/2}}.$$

We repeat the above process to within half of the working precision if single precision is used, and one-third of the working precision if double precision is used, when $t_{i+1} < 1$, since precision in determining the curve itself is only of secondary interest. We polish $(x^j(t_{i+1}), \lambda^j(t_{i+1}))$ at the end of the path ($t_{i+1} = 1$) by iterating the Rayleigh quotient to machine precision. The stopping point $(x^j(t_{i+1}), \lambda^j(t_{i+1}))$ of RQI is taken as the approximate eigenpair $(\tilde x(t_{i+1}), \tilde\lambda(t_{i+1}))$ of the pencil $(A(t_{i+1}), B)$. The cubic convergence rate of RQI makes the corrector highly efficient.
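The corrector can be sketched as follows. This is not the thesis code: the fixed tolerance stands in for the "half / one-third of the working precision" rule above, and the normalization $x^T B x = 1$ is assumed throughout.

import numpy as np

def generalized_rqi(A_t, B, x0, tol=1e-8, max_iter=20):
    """Generalized Rayleigh quotient iteration on the pencil (A(t), B) (Section 7.4, sketch)."""
    x = x0 / np.sqrt(x0 @ (B @ x0))          # start with x^T B x = 1
    lam = x @ (A_t @ x)                       # Rayleigh quotient, since x^T B x = 1
    for _ in range(max_iter):
        try:
            y = np.linalg.solve(A_t - lam * B, x)
        except np.linalg.LinAlgError:         # the shift hit an eigenvalue exactly
            break
        x = y / np.sqrt(y @ (B @ y))          # renormalize so that x^T B x = 1
        lam_new = x @ (A_t @ x)
        if abs(lam_new - lam) <= tol * max(1.0, abs(lam_new)):
            lam = lam_new
            break
        lam = lam_new
    return x, lam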
7.5 Checking

For

$$A = \begin{pmatrix}\alpha_1 & \beta_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_n\\ & & \beta_n & \alpha_n\end{pmatrix} \quad\text{and}\quad B = \operatorname{diag}(b_1, \ldots, b_n)$$

with nonzero $\beta$'s and $b_i > 0$, the polynomials defined by

$$p_0(\lambda) = 1,\qquad p_1(\lambda) = \alpha_1 - \lambda b_1,$$
$$p_r(\lambda) = (\alpha_r - \lambda b_r)\,p_{r-1}(\lambda) - \beta_r^2\, p_{r-2}(\lambda), \quad r = 2, 3, \ldots, n,$$

form a generalized Sturm sequence. Thus, the number of eigenvalues of $(A, B)$ strictly greater than $\lambda$ is equal to the number of agreements in sign of consecutive members of the Sturm sequence, with the convention that if $p_r(\lambda) = 0$, then $p_r(\lambda)$ is taken to have the opposite sign of $p_{r-1}(\lambda)$.

When $(\tilde x(t_{i+1}), \tilde\lambda(t_{i+1}))$ is taken as an approximate eigenpair of the pencil $(A(t_{i+1}), B)$, the generalized Sturm sequence at $\tilde\lambda(t_{i+1}) + \epsilon$ is computed to check that, if we are trying to follow the curve corresponding to the $j$-th largest eigenvalue, we are still on that curve. Here, $\epsilon$ is chosen as half of the working precision if single precision is used and one-third of the working precision if double precision is used. If the check fails, we reduce the step size to $h/2$ and repeat the whole process once again, beginning with the eigenvalue prediction in Section 7.3.
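A sketch of the generalized Sturm count used in this checking step (and again in Section 7.8) is given below; it counts agreements in sign of consecutive members of the sequence, with the zero-sign convention above. The recurrence is left unscaled, so a production version would have to guard against overflow of the $p_r$.

def count_greater(alpha, beta, b, lam):
    """Number of eigenvalues of the pencil (A, B) strictly greater than lam (sketch).

    alpha[i] = alpha_{i+1}, beta[i] = beta_{i+2}, b[i] = b_{i+1} in the notation of (6.1).
    """
    n = len(alpha)
    signs = [1]                                    # sign of p_0 = 1
    p_prev, p_curr = 1.0, alpha[0] - lam * b[0]    # p_0, p_1
    signs.append(1 if p_curr > 0 else (-1 if p_curr < 0 else -signs[-1]))
    for r in range(2, n + 1):
        p_next = (alpha[r - 1] - lam * b[r - 1]) * p_curr - beta[r - 2] ** 2 * p_prev
        signs.append(1 if p_next > 0 else (-1 if p_next < 0 else -signs[-1]))
        p_prev, p_curr = p_curr, p_next
    # agreements in sign of consecutive members = eigenvalues strictly greater than lam
    return sum(1 for s0, s1 in zip(signs, signs[1:]) if s0 == s1)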
7.6 Detection of a cluster

At $t_i = 0$, when all the eigenvalues of the pencil $(D, B)$,

$$\lambda_1(0) < \lambda_2(0) < \cdots < \lambda_n(0),$$

are available, we let $\delta = \max(10^{-5}, 10^{-2}(\lambda_n(0) - \lambda_1(0))/n)$ if double precision is used (or $\delta = \max(10^{-3}, 10^{-2}(\lambda_n(0) - \lambda_1(0))/n)$ if single precision is used). Set $\lambda_i$ and $\lambda_j$ in the same group if $|\lambda_i(0) - \lambda_j(0)| < \delta$. If the number of eigenvalues in any group is bigger than 1, then a cluster is detected.

At $t_i \neq 0$ or $1$, when $(\tilde x(t_i), \tilde\lambda(t_i))$ is taken as an approximate eigenpair of the pencil $(A(t_i), B)$, after the checking step in Section 7.5 we compute the Sturm sequences at $\tilde\lambda(t_i) \pm \delta$ to find the number of eigenvalues of $(A(t_i), B)$ in the interval $(\tilde\lambda(t_i) - \delta, \tilde\lambda(t_i) + \delta)$. When this number is bigger than 1, a cluster of eigenvalues of $(A(t_i), B)$ is detected.

In those cases, the corresponding eigenvectors are ill-conditioned, and such ill-conditioning can cause inefficiency of the algorithm. We simply give up following the eigenpath, and the corresponding eigenpair of the pencil $(A, B)$ will be calculated at the end of the algorithm (see Section 7.8).

7.7 Step size selection

In the first attempt, we always choose the step size $h = 1 - t_i$ at $t_i < 1$. If after the prediction and correction steps the checking step fails, we reduce the step size to $h/2$ as mentioned in Section 7.5. Since the initial pencil $(D, B)$ is chosen to be so close to $(A, B)$, in our experience most of the eigenpairs of the pencil $(A, B)$ can be reached in one step, i.e., $h = 1$. A very small step size can also cause inefficiency of the algorithm. Therefore, we impose a minimum $\gamma$ on the step size $h$. If $h < \gamma$, we simply give up following the eigenpath and the corresponding eigenpair of the pencil $(A, B)$ will be calculated at the end of the algorithm (see Section 7.8). We usually choose $\gamma$ as 0.25.

7.8 Terminating at t = 1

At $t = 1$, when an approximate eigenvalue $\tilde\lambda(1)$ is reached, we compute the Sturm sequence at $\tilde\lambda(1) + \epsilon$, with $\epsilon$ equal to the machine precision, to ensure the correct order. If the checking fails, we have jumped onto a wrong eigenpath. More precisely, suppose we are following the $i$-th eigenpair and the checking algorithm detects that we have reached the $j$-th eigenpair instead. In this situation, we will save the $j$-th eigenpair before the step size is cut. By saving the $j$-th eigenpair, the computation of following the $j$-th eigenpath is no longer needed.

As mentioned in Sections 7.6 and 7.7, we may give up following some eigenpaths to avoid adopting a step size that is too small or the situation when a cluster is encountered. Without any extra computation, we know exactly which eigenpairs are lost at $t = 1$. In order to find these eigenpairs, we first use bisection to find the eigenvalues up to half the working precision and then use inverse iteration and the RQI. If there is a cluster, then we do bisection to find the eigenvalues up to machine precision, then use inverse iteration to find the corresponding eigenvectors. In this case, to guarantee orthogonality, we orthonormalize the eigenvectors belonging to the same cluster while we are using the inverse iteration.

7.9 Forming the eigenpairs of (A, B)

If $B$ is positive semidefinite, the pencil $(A, B)$ was reduced to a pencil $(\tilde A, \tilde B)$ with $\tilde B$ positive definite. From Theorem 5.1 and Theorem 5.2, the eigenvalues of the pencil $(\tilde A, \tilde B)$ are the same as the eigenvalues of the pencil $(A, B)$. Although the eigenvectors are different, the eigenvectors of the pencil $(A, B)$ can be easily formed from those of the pencil $(\tilde A, \tilde B)$ with few computations. The formulas are given in Chapter 5.

Chapter 8

Numerical Results of the Homotopy Method for Symmetric Generalized Eigenproblem

We shall show the computational results comparing the homotopy continuation method GHOMO with the QZ method. The computations were done on a Sun SPARCstation 1.

Our testing examples consist of the following types of pencils $(A, B)$:

Type 1. $A$ is an unreduced symmetric tridiagonal matrix with both diagonal and off-diagonal elements being uniformly distributed random numbers between 0 and 1. $B$ is a diagonal matrix with the first $n/2$ diagonal elements being uniformly distributed random numbers between 0 and 1, and the last $n/2$ being zeros.

Type 2. $A$ is an unreduced symmetric tridiagonal matrix with both diagonal and off-diagonal elements being uniformly distributed random numbers between 0 and 1. $B$ is a diagonal matrix with the first $3n/10$ and the last $3n/10$ diagonal elements being uniformly distributed random numbers between 0 and 1, and the rest being zeros.

Type 3. $A$ is the Toeplitz matrix $[1, 2, 1]$. $B$ is a diagonal matrix with the first $n/2$ diagonal elements being 1, and the rest being zeros.

Type 4. $A$ is the Toeplitz matrix $[1, 2, 1]$. $B$ is a diagonal matrix with the first $3n/10$ and the last $3n/10$ diagonal elements being 1, and the rest being zeros.

Type 5. $A$ is an unreduced symmetric tridiagonal matrix with both diagonal and off-diagonal elements being uniformly distributed random numbers between 0 and 1. $B$ is a diagonal matrix with all diagonal elements being random numbers between 0 and 1.
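For concreteness, a Type 1 test pencil of order $n$ could be generated as in the following sketch (the thesis itself does not list code, and the generator name is ours):

import numpy as np

def type1_pencil(n, rng=None):
    """Type 1 test pencil (A, B): random unreduced tridiagonal A; B diagonal with
    the first n/2 entries uniform in (0, 1) and the remaining entries zero."""
    rng = np.random.default_rng() if rng is None else rng
    alpha = rng.random(n)                 # diagonal entries of A
    beta = rng.random(n - 1)              # off-diagonal entries of A (zero has probability 0)
    A = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    b = np.concatenate([rng.random(n // 2), np.zeros(n - n // 2)])
    return A, np.diag(b)

The other types are obtained analogously, replacing the random tridiagonal part by the Toeplitz matrix $[1, 2, 1]$ and changing which diagonal entries of $B$ are zero.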
Table 8.1 shows the comparison in terms of speed with the QZ method. Table 8.2 shows the accuracy and orthogonality of the homotopy method. The homotopy method appears to be strongly competitive and leads in speed by a considerable margin in comparison with the QZ method in all the cases.

Table 8.1: Execution Time (second) of computed eigenpairs of generalized eigenproblems.

Matrix    Order N    GHOMO       QZ        Ratio (QZ/GHOMO)
Type 1       50       0.23       6.49        28.21
            100       0.75      48.20        64.26
            200       2.85     328.19       115.15
            400      12.39    2546.06       205.49
Type 2       50       0.28       6.23        22.25
            100       1.14      45.67        40.06
            200       4.86     321.86        66.22
            400      15.24    1802.32       118.26
Type 3       50       0.31       6.15        19.83
            100       1.60      46.96        29.35
            200       6.46     352.67        54.59
            400      23.68    2796.63       118.10
Type 4       50       0.52       5.82        10.15
            100       2.18      44.20        20.27
            200       8.80     336.50        38.23
            400      34.84    2694.60        77.34
Type 5       50       0.72      11.17        15.51
            100       2.81      74.66        26.56
            200      12.74     488.53        38.34
            400      47.70    3866.14        81.05

Table 8.2: The residual and orthonormality of computed eigenvectors of generalized eigenproblems.

Matrix    Order N    max_i ||Ax_i - lambda_i B x_i||_2 / ||A||    max_{i,j} |(X^T B X - I)_{ij}|
Type 1       50       3.9107905787880D-16       2.8617639677154D-16
            100       3.4241406339558D-16       2.6417938095675D-15
            200       4.3779590340543D-16       2.0037668814815D-15
            400       5.2265929394497D-16       5.1552956553653D-15
Type 2       50       3.4923828496565D-16       1.3678005594614D-15
            100       4.1134132219334D-16       1.0126081452260D-15
            200       5.1246002221589D-16       1.8957582132953D-13
            400       4.9856607322645D-16       1.0934604011632D-14
Type 3       50       1.1454950034247D-16       3.4233801826002D-16
            100       1.6564408035527D-16       1.1470895221836D-15
            200       2.1624971847800D-16       3.6096263986950D-15
            400       2.4734721400459D-16       2.1752300627827D-14
Type 4       50       1.0336086782520D-16       2.4870346559132D-16
            100       2.0867616866469D-16       2.3841690810960D-15
            200       2.1954327917052D-16       6.4657627842233D-15
            400       2.9860202732850D-16       2.6556085482256D-14
Type 5       50       2.1611472844832D-16       2.2100520796971D-15
            100       4.3084863486435D-16       1.7434653812219D-15
            200       4.6005546307787D-16       5.2256957128428D-15
            400       6.0472727089284D-16       2.1235908790871D-14

Bibliography

[1] R. Barlow and D. Evans, A parallel organization of the bisection algorithm, The Computer Journal, 22 (1977), pp. 267-269.

[2] H. Bernstein and M. Goldstein, Parallel implementation of bisection for the calculation of eigenvalues of tridiagonal symmetric matrices, Computing, 37 (1986), pp. 85-91.

[3] A. Bunse-Gerstner, An algorithm for the symmetric generalized eigenvalue problem, Linear Algebra Appl., 58 (1984), pp. 43-68.

[4] H. Bowdler, R. Martin, and J. Wilkinson, The QR and QL algorithms for symmetric matrices, Numer. Math., 11 (1968), pp. 227-240.

[5] M. T. Chu, A simple application of the homotopy method to symmetric eigenvalue problems, Linear Algebra Appl., 59 (1984), pp. 85-90.

[6] M. T. Chu, A note on the homotopy method for linear algebraic eigenvalue problems, Linear Algebra Appl., 105 (1988), pp. 225-236.

[7] M. T. Chu, T. Y. Li and T. Sauer, Homotopy method for general lambda-matrix problems, SIAM J. Matrix Anal. Appl., 9 (1988), pp. 528-536.

[8] J. J. M. Cuppen, A divide and conquer method for the symmetric tridiagonal eigenproblem, Numer. Math., 36 (1981), pp. 177-195.

[9] J. J. Dongarra and D. C. Sorensen, A fully parallel algorithm for symmetric eigenvalue problems, SIAM J. Sci. Stat. Comput., 8 (1987), pp. 139-154.

[10] G. Fix and R. Heiberger, An algorithm for the ill-conditioned generalized eigenvalue problem, SIAM J. Numer. Anal., 9 (1972), pp. 78-88.

[11] F. Rellich, Perturbation Theory of Eigenvalue Problems, Gordon and Breach Science Publishers, New York, NY, 1969.

[12] G. Golub and C. Van Loan, Matrix Computations, The Johns Hopkins Press, Baltimore, MD, 1983.

[13] R. O. Hill, Interlacing of eigenvalues of symmetric, tridiagonal matrices, preprint.

[14] I. Ipsen and E. Jessup, Solving the symmetric tridiagonal eigenvalue problem on the hypercube, SIAM J. Sci. Stat. Comput., 11 (1990), pp. 203-229.

[15] E. Jessup, Parallel Solution of the Symmetric Tridiagonal Eigenproblem, Ph.D. Thesis, Yale University, 1989.

[16] K. Li and T. Y. Li, An algorithm for symmetric tridiagonal eigenproblems -- divide and conquer with homotopy continuation, preprint.

[17] T. Y. Li, The homotopy method for solving real symmetric generalized eigenvalue problem, preprint.

[18] T. Y. Li and N. H. Rhee, Homotopy algorithm for symmetric eigenvalue problems, Numer. Math., 55 (1989), pp. 256-280.

[19] T. Y. Li and T. Sauer, Homotopy method for generalized eigenvalue problems, Linear Algebra Appl., 91 (1987), pp. 65-74.

[20] T. Y. Li, T. Sauer and J. Yorke, Numerical solution of a class of deficient polynomial systems, SIAM J. Numer. Anal., 24 (1987), pp. 435-451.

[21] T. Y. Li, Z. Zeng, and L. Cong, Solving eigenvalue problems of real nonsymmetric matrices with real homotopy, preprint.

[22] T. Y. Li, H. Zhang and X. H. Sun, Parallel homotopy algorithm for symmetric tridiagonal eigenvalue problem, to appear in SIAM J. Sci. Stat. Comput.

[23] S. S. Lo, B. Philippe and A. Sameh, A multiprocessor algorithm for the symmetric tridiagonal eigenvalue problem, SIAM J. Sci. Stat. Comput., 8 (1987), pp. 155-165.

[24] B. N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, Englewood Cliffs, NJ, 1980.

[25] A. Sameh and D. Kuck, A parallel QR algorithm for symmetric tridiagonal matrices, IEEE Trans. Computers, C-26 (1977), pp. 147-153.

[26] G. Peters and J. H. Wilkinson, Ax = lambda Bx and the generalized eigenproblem, SIAM J. Numer. Anal., 7 (1970), pp. 479-492.

[27] B. T. Smith et al., Matrix Eigensystem Routines -- EISPACK Guide, 2nd edition, Springer-Verlag, New York, 1976.

[28] D. C. Sorensen and P. Tang, On the orthogonality of eigenvectors computed by divide-and-conquer techniques, preprint.

[29] G. W. Stewart, Introduction to Matrix Computations, Academic Press, New York, 1973.

[30] J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, Springer-Verlag, 1980.

[31] J. H. Wilkinson, The Algebraic Eigenvalue Problem, Oxford University Press, New York, 1965.

[32] J. H. Wilkinson and C. Reinsch, Handbook for Automatic Computation: Linear Algebra, Springer-Verlag, 1971.