This is to certify that the dissertation entitled "Homotopy Methods and Algorithms for Real Symmetric Eigenproblems," presented by Kuiyuan Li, has been accepted towards fulfillment of the requirements for the Ph.D. degree in Mathematics. Major professor. Date: July 18, 1991.

HOMOTOPY METHODS AND ALGORITHMS FOR REAL SYMMETRIC EIGENPROBLEMS

By Kuiyuan Li

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY, Department of Mathematics, 1991

ABSTRACT

HOMOTOPY METHODS AND ALGORITHMS FOR REAL SYMMETRIC EIGENPROBLEMS

By Kuiyuan Li

This thesis discusses homotopy methods and algorithms for real symmetric eigenproblems. It contains two parts. In the first part, a new algorithm is presented for finding all the eigenvalues and the corresponding eigenvectors of a symmetric tridiagonal matrix. The algorithm is based on the homotopy continuation approach coupled with the strategy of 'Divide and Conquer'. Evidenced by the numerical results, the algorithm provides a considerable advance over previous attempts in using the homotopy method for symmetric eigenvalue problems. Numerical comparisons of the algorithm with the methods in the widely used EISPACK library as well as Cuppen's 'Divide and Conquer' method are presented. It appears that our algorithm is strongly competitive in terms of speed, accuracy and orthogonality, and leads in speed in almost all the cases. The performance of the parallel version of the algorithm is also presented.

In the second part, a homotopy method for finding all eigenpairs of a real symmetric matrix pencil (A, B) is given, where A and B are real n x n symmetric matrices and B is a positive semidefinite or ill-conditioned positive definite matrix. A reduction of the pencil (A, B) to a pencil (Ã, B̃) is given, where Ã is an unreduced symmetric tridiagonal matrix and B̃ is a positive definite diagonal matrix. One can easily form the eigenpair (x, λ) of the pencil (A, B) from the eigenpair (y, λ) of the pencil (Ã, B̃). Furthermore, a formula is presented for finding the number of the finite eigenvalues of the pencil (A, B) without actually solving the generalized eigenproblem. By choosing the initial pencil properly, the homotopy curves are very well separated and, in general, very flat and easy to follow. The homotopy algorithm is compared with the QZ algorithm. The numerical results show that the homotopy algorithm leads in speed in all the cases.

To my daughter Lei.

ACKNOWLEDGMENTS

I would like to thank Professor Tien-Yien Li, my dissertation advisor, for his constant encouragement and support during my graduate study at Michigan State University. I would also like to thank him for suggesting the problems and the helpful directions which made this work possible.

I would like to thank Professor R. O. Hill, under whose guidance I took a reading course in Symmetric Eigenvalue Problems which laid the foundation for this research. I would also like to thank my dissertation committee members Professor Q. Du, Professor D. R. Dunninger, Professor R. O. Hill, and Professor D. Yen for their valuable suggestions and their time.
Contents

Part I  The Homotopy Method for Real Symmetric Tridiagonal Eigenproblem

1 The Homotopy Method for Real Symmetric Tridiagonal Eigenproblem  2
  1.1 Introduction  2
  1.2 Preliminary Analysis  5

2 The Homotopy Algorithms for Real Symmetric Tridiagonal Eigenproblem  16
  2.1 Introduction  16
  2.2 Initiating at t = 0  18
  2.3 Prediction  18
  2.4 Correction  19
  2.5 Checking  20
  2.6 Detection of a cluster and subspace iteration  20
  2.7 Step size selection  21
  2.8 Terminating at t = 1  21

3 Numerical Results of the Homotopy Algorithms for Real Symmetric Tridiagonal Eigenproblem  23
  3.1 Test Matrices  23
  3.2 A serial comparison with the existing methods  24
  3.3 An indirect comparison with the existing parallel algorithms  28

Part II  The Homotopy Method for Real Symmetric Generalized Eigenproblem

4 Introduction  41
5 Reduction  44
6 Preliminary Analysis  61
7 The Homotopy Algorithms for Generalized Eigenproblem  65
  7.1 Reduction  65
  7.2 Initiating at t = 0  66
  7.3 Prediction  66
  7.4 Correction  67
  7.5 Checking  68
  7.6 Detection of a cluster  69
  7.7 Step size selection  70
  7.8 Terminating at t = 1  70
  7.9 Forming the eigenpairs of (A, B)  71
8 Numerical Results of the Homotopy Method for Symmetric Generalized Eigenproblem  72

List of Tables

3.1 Execution Time (second) of computed eigenvalues and eigenvectors  25
3.2 The residual of computed eigenvalues and eigenvectors  26
3.3 The orthonormality of computed eigenvectors  27
3.4 Execution Time (second) of computed eigenvalues and eigenvectors from type 1 to 7  29
3.5 The residual and the orthonormality of computed eigenvalues and eigenvectors from type 1 to 7  30
3.6 Execution Time (second) of computed eigenvalues and eigenvectors from type 8 to 14  31
3.7 The residual and the orthonormality of computed eigenvalues and eigenvectors from type 8 to 14  32
3.8 Execution Time (second) of computed eigenvalues and eigenvectors from type 15 to 21  33
3.9 The residual and the orthonormality of computed eigenvalues and eigenvectors from type 15 to 21  34
3.10 Execution time (second), speed-up and efficiency of HOMO on [1, 2, 1] and T2 matrices  35
3.11 Execution time (second), speed-up and efficiency of HOMO on Wilkinson and Random matrices  36
3.12 A comparison on the Alliant FX/8 for TQL2 vs. TREEQL on [1, 2, 1] matrices  37
3.13 Speed-up over TQL2 on the Alliant FX/8 for computing all the eigenpairs of the [-1, 2, -1] matrix of order 500  38
3.14 A comparison on the Alliant FX/8 for TQL2 vs. TREEQL on random matrices  38
8.1 Execution Time (second) of computed eigenpairs of generalized eigenproblems  74
8.2 The residual and orthonormality of computed eigenvectors of generalized eigenproblems  75

List of Figures

1.1 Homotopy curves  11
1.2 The eigenvalue curves of the [1, i, 1] matrix with D = [0, i, 0]  14
1.3 The eigenvalue curves of a random matrix with α_i < α_{i+1} on the diagonal and with D = [0, α_i, 0]  15
3.1 Efficiency of the homotopy method on [1, 2, 1], T2, Random matrices and W_n^+  39
3.2 Speed-up of the homotopy method on [1, 2, 1], T2, Random matrices and W_n^+  39
6.1 The generalized homotopy curves  64

PART I

The Homotopy Method for Real Symmetric Tridiagonal Eigenproblem

Chapter 1

The Homotopy Method for Real Symmetric Tridiagonal Eigenproblem

1.1 Introduction

In this chapter, we propose a new algorithm, based on the continuation approach, for finding all the eigenvalues and eigenvectors of a symmetric tridiagonal matrix. Let A be an n x n real symmetric tridiagonal matrix of the form

$$
A = \begin{pmatrix}
\alpha_1 & \beta_2 & & \\
\beta_2 & \alpha_2 & \ddots & \\
 & \ddots & \ddots & \beta_n \\
 & & \beta_n & \alpha_n
\end{pmatrix}. \tag{1.1}
$$

In (1.1), if some β_i = 0, then R^n clearly decomposes into two complementary subspaces invariant under A. Thus the eigenproblem decomposes in an obvious way into two smaller subproblems. Therefore we will assume that each β_i ≠ 0. That is, A is unreduced.

Our algorithm employs the strategy of 'Divide and Conquer'. First of all, the matrix A is divided into two blocks by setting one of the β's equal to zero. Namely, we let

$$
D = \begin{pmatrix} D_1 & 0 \\ 0 & D_2 \end{pmatrix} \tag{1.2}
$$

where

$$
D_1 = \begin{pmatrix}
\alpha_1 & \beta_2 & & \\
\beta_2 & \alpha_2 & \ddots & \\
 & \ddots & \ddots & \beta_k \\
 & & \beta_k & \alpha_k
\end{pmatrix},
\qquad
D_2 = \begin{pmatrix}
\alpha_{k+1} & \beta_{k+2} & & \\
\beta_{k+2} & \alpha_{k+2} & \ddots & \\
 & \ddots & \ddots & \beta_n \\
 & & \beta_n & \alpha_n
\end{pmatrix}.
$$

We then calculate the eigenvalues of the unreduced matrices D_1 and D_2 by using the most efficient algorithm available. Different from Cuppen's 'Divide and Conquer' method [8], our algorithm conquers the matrix A by the homotopy H : R^n x R x [0, 1] → R^n x R, defined by

$$
H(x, \lambda, t) = (1-t)\begin{pmatrix} \lambda x - Dx \\ \frac{x^T x - 1}{2} \end{pmatrix}
+ t \begin{pmatrix} \lambda x - Ax \\ \frac{x^T x - 1}{2} \end{pmatrix}
= \begin{pmatrix} \lambda x - [(1-t)D + tA]x \\ \frac{x^T x - 1}{2} \end{pmatrix}
= \begin{pmatrix} \lambda x - A(t)x \\ \frac{x^T x - 1}{2} \end{pmatrix} \tag{1.3}
$$

where A(t) = (1 - t)D + tA and D is called an initial matrix. It can be easily seen that the solution set of H(x, λ, t) = 0 in (1.3) consists of disjoint smooth curves, each of which joins an eigenpair of D to one of A [5, 6, 7, 20]. We call each of these curves a homotopy curve or an eigenpath. Thus, by following the eigenpaths emanating from the eigenpairs of D at t = 0, we can reach all the eigenpairs of A at t = 1.

Theorem 1.1 in the next section shows that the eigenvalue component λ(t) of each eigenpath (x(t), λ(t)) is monotonic in t. On the other hand, by the Hoffman-Wielandt theorem [31],

$$
\sum_{i=1}^{n} (\lambda_i - \lambda_i(t))^2 \le \|A(t) - A\|_F^2 = 2(1-t)^2 \beta_{k+1}^2 \quad \text{for any } t \in [0,1] \tag{1.4}
$$

where ||·||_F is the Frobenius norm and where λ_i and λ_i(t) are the i-th eigenvalues of A and A(t) respectively. Therefore, all the eigenvalue curves λ_i(t) can be quite flat if β_{k+1} (= 0 in D) is very small, especially when n is very large. As a consequence, the eigenvalue curves λ_i(t) are very easy to follow. We shall describe our curve following algorithm in Chapter 2.

The search for fast reliable methods for handling symmetric eigenproblems has produced a number of methods, most notably the QR algorithm, the bisection Sturm sequence method with inverse iteration [14, 15] and the 'Divide and Conquer' method [8, 9, 28]. In Chapter 3, we shall present our numerical results, along with comparisons with these methods.
It appears that our algorithm is strongly competitive in terms of speed, accuracy and orthogonality, and leads in speed in almost all the cases. Modern scientific computing is marked by the advent of vector and parallel com- puters and the search for algorithms that are to a large extent parallel in nature. A further advantage of our method is that it is to a larger degree parallel, in the sense that each eigenpath is followed independently of the others. This inherent na- ture of the homotopy method makes the parallel implementation much simpler than the other methods. In Chapter 3, we also show the performance of our parallel al- gorithm and an indirect comparison with both Divide and Conquer (D&C) [9] and Bisection/Multisections (B / M) [23], which are currently considered the only parallel algorithms available for symmetric eigenproblems. The very high efficiency of our method and its natural parallelism make the algorithm an excellent candidate for a variety of architectures. Theoretical aspects of the continuation approach to the eigenvalue problems have been studied in [5, 6, 7, 20]. A first attempt was made in [18] to make the method computationally efficient. Its parallel version appeared in [22]. Evidenced by the numerical results, our algorithms given here provide a considerable improvement over the algorithms in [18, 22]. 1 .2 Preliminary Analysis Let (A)1 denote the lower (n — 1) x (n — l) submatrix of A , (A)1 denote the upper (12 — 1) x (n — l) submatrix of A. By a straightforward verification, one can prove the following lemma. Lemma 1.1 For matrices M and N and a real number 0, r . \ Ml det _ ._... ._ .. _ __ _ =detM-detN—c3det(M)1~det(N)‘. (1.5) Let M be a k x k unreduced symmetric tridiagonal matrix and let { £1 < {2 < < £1. } and { m < 172 < < 771,-1 } be the eigenvalues of M and (M )1 respectively. Then, by Cauchy’s interlacing theorem [24], £1 < 111 < 62 < < 1]];_1 < fig. (1.6) For D1 and D2 in (1.2), let f1 = det(D1 —M), f2 = dct(D2—AI), f3 = det((D1)1 — «\I) and f4 = det((D2)1 — AI). Let (:r(t),A(t)) be an eigenpath of the homotopy H(z,A,t) = 0 in (1.3), then for each 0 S t S 1, A(t) is an eigenvalue of A(t) = (l - t)D + tA in (1.3), where D is of the form in (1.2). 5 Theorem 1.1 If all the eigenvalues of D are distinct then i) Either A(t) is constant for all t in [0,1] or strictly monotonic. ii) A(t)A(t) > o for t small, if in) ¢ 0. Proof: Since | \ D1 —— A(t) I I l tflm det(A(t)—A(t)I)=det — ————— ——— — —.—— ————— — =0, Wk“ 1 K I / from ( 1.5), we have f1(A(t))f3(A(t)) - tzflinfsflallfdiial) = 0- (1-7) If there exists a to in [0,1] for which f3(A(to))f4(A(to)) = 0 then either f3(A(to)) = 0 or f.(A(to)) = 0; say f3(A(to)) = 0. It follows from (1.6), f1(A(to)) 76 0. Hence, f2(A(to)) = 0 in ( 1.7); accordingly, * f1(*(10))f2(1(t0)) - t21913+:f:+(A(to))f4(A(to)) E 0. This implies det(A(t) — A(to)I) = 0. Thus, A(t) = A(to) for all t in [0,1]. Assume f3(A(t))f4(A(t)) 96 0 for any t in [0,1]. Write A(t) = 53A“). Differentiat- ing (1.7) with respect to t yields, dl-A-lf1(/\(t))fg(A(t)) — #193,,f3(A(t))f4(A(t))]A(t) = 2tflt+1fa(«\(t))f4(k(t)) (1-8) (Iii-[fl“(0)506”‘t2fi2+1f3(3(t))f40(t))l '7‘" 0 for any t E (0t1]° We claim that gqxlfn(l\(t))f2(k(t)) - t’fit+1fs(A(t))f4(A(t))] H, 7‘ 0- (1-9) 6 For otherwise, ginsenfzwor twaifaoannmn)![:11 vastness)» .._.o d d =3; we» t=012(x(0))+f1(x(0))5 sou» I,=o=0. Since A(O) is an eigenvalue of D, we have f1(A(0))fg(A(0)) = 0. If f1(A(0)) = 0 then fg(A(0)) at 0 since all eigenvalues of D are distinct. Hence, fi- f1(A(t))|¢=o = 0. 
Consequently, A(0) is a multiple eigenvalue of D1 which contradicts to the fact that D1 is unreduced. The proof of (1.9) for the case f2(A(0)) = 0 follows by the same argument. Now, from (1.8), 2tflt+1fa(k(t))f4(»\(t)) ailmenfaou» — t’fl2+1fa(A(t))f.(A(t))] A(t) = which implies A(t) 75 0 for any t 6 (0,1]. Obviously, A(t) is continuous, hence, A(t) is strictly positive or strictly negative. Namely, A(t) is strictly monotonic. NOW let 9 = fifs , f = fli+1f3f4 811d ’1 = (9 — t’f); = Mtthlla then A = ”N" 1(1) = —41’f2 ‘9 713”)” + 8t’£;;f- + 3,? ' (1.10) Since f’(g — t’f),\;,/h3 and f; f /h2 are continuous in [0,1], they are bounded. By (1.10), A(O) =]1_r.nA(t) =]i_rr.12f/h Hence A(t)A(t) > 0 for t small since A(t) = 2t f /h. Q.E.D. Theorem 1.2 (Hoffman and Wielandt [31]) Let M be an n x n symmetric matrix. Let M’ E M + E where E is a symmetric perturbation of M. Denote the eigenvalues ofM by { £1 5 {3 S S 5,. }, the eigenvalues ofM’ by { £1 5 f; S S 6:, }, and the eigenvalues ofE by { 71 _<_ 72 S S 7“}, then 81.42)“ s 27.?- (1.11) i=1 i=1 Applying the above theorem on M = A and M’ = A(t) = (l — t)D + tA = A + E with f | l 0 l I (t "' llflkn E: _ ___ ___ _ ___ ___. _ (t — ”film I l 0 \ I i then, (1.11) gives in. _ 1.11))2 5 2(1 — 0219;“ for all t in [0,11 (1.12) i=1 where A.- and A,-(t) are the eigenvalues of A and A(t) respectively. From (1.12), together with the conclusion in Theorem 1.1 that each A,-(t) is mono— tonic in t, we see that the smaller 5H1 is the flatter the eigenvalue curves are, especially when n is very large. To make the eigenvalue curves easy to follow we in- tend to choose ,6)...“ as small as we can for k in certain range as described in Chapter 2. Let A(t) = D + Y(t), where A(t), D and Y(t) are real symmetric, and A,-(t), £5 and 11,-(t) be the eigenvalues of A(t), D and Y(t), respectively, with A;(t) S A5.” (t), £i S {ii-1 and Hi“) S Ili+1(t):5= 1:2v-wn " 1 Theorem 1.3 For any i, j satisfying 1 S i + j — 1 S n, and t 6 [0,1], the following inequalities hold: f.“ + MU) S I‘m-10) (1.13) and Ann-64“) S £n+1~i + [Inn-1' (t) (1-14) Proof: Let RB", RH, and Billy R5"l be the subspaces of Ru defined implicitly by sTDs . sTDs E min Mtg-1 1:73 C.- = max min T RP“ 8.1.1?!“ 8 8 8 TY t TY t 11,-(t) = max min 8 T( )8 5 min —-—-—----—s T( )3. RJ-laIRJ-l s s “my-é) s s In fact, Rf," is the span of the eigenvectors corresponding to the i — l smallest eigenvalues of D. Similarly Rial) is the span of the eigenvectors corresponding to the j —l smallest eigenvalues of Y(t). These subspaces may or may not have a nontrivial intersection. Let S be the subspace of the smallest dimension containing both R‘B‘ and Iii-(,1). Write k a dim(S) + 1. Now, k—l:dim(S')S(i—l)+(j—l) __ A,((A)1 + cal) for any i, 1 S i S n —1 i.e., p;—6,~an>0, lSiSn—l. By (1.15) and (1.16), ’\1S6lS/‘ISA2S”'S6n—IS/‘n-lSAno Hence Ai-t-l—AiZI‘i’aizcaa 1SiSn—l and lsInsin (A,-+1 - A ,-)_ > clsmin“l (01,-...1— 0;). Q.E.D. Corollary 1.3 If a] 02 (A): - = (A)1 '- an-1 an then l.<‘r.11 lsr.r1in__1(¢:r.'+1— or). (1.17) 12 Proof: (1.17) follows immediately from Lemma 1.2, since a; — (11 (All — (Ah - = 0. an - Ora—1 Q.E.D. Let A(t) = (1 - t)D + tA, where D is a diagonal matrix consisting of the diagonal elements of A, then Theorem 1.4 15113340144 (t) - A;(t)) _>_ clsrrgi,p_l(a,-+1 — 01') t 6 [0,1]. Proof: (AW — (A(t)). — a1 = «(Ar — (A11 — a1) _ (1 ,8) +(1-t)d¢'ag(az - a1 — 0,03 - a2 — 01"‘1011 — 0.1-1 — a), where a = 615%i3—1(a’+1 - 01,-), 0 < c S 1. 
Clearly, the second term of the right hand side of (1.18) is positive semidefinite and the first term is positive semidefinite by assumption. Hence, (A(t))_1 − (A(t))^1 − αI is positive semidefinite for t ∈ [0,1]. By Lemma 1.2,

$$
\min_{1 \le i \le n-1} \bigl(\lambda_{i+1}(t) - \lambda_i(t)\bigr) \ge c \min_{1 \le i \le n-1} (\alpha_{i+1} - \alpha_i), \qquad t \in [0,1].
$$

Q.E.D.

If A satisfies the conditions in Lemma 1.2, we may choose the initial matrix D as a diagonal matrix consisting of the diagonal elements of A. Then A(t) is an unreduced symmetric tridiagonal matrix and the eigenvalue curves are not only distinct, but also very well separated. There is a lower bound on the gap between any two eigenvalue curves, so the eigenvalue curves are easy to follow.

Example 1.1 A = [1, i, 1], where i = 1, 2, ..., 20. If we let D = diag{1, 2, ..., 20}, then all the eigenvalue curves are very well separated. See Figure 1.2.

Figure 1.2: The eigenvalue curves of the [1, i, 1] matrix with D = [0, i, 0].

Figure 1.3: The eigenvalue curves of a random matrix with α_i < α_{i+1} on the diagonal and with D = [0, α_i, 0].

Example 1.2 A is a real symmetric tridiagonal matrix whose diagonal and subdiagonal elements are random numbers between 0 and 1, and whose diagonal elements satisfy α_i < α_{i+1}, and D is a diagonal matrix consisting of the diagonal elements of A; then all the eigenvalue curves are very well separated. See Figure 1.3.

Chapter 2

The Homotopy Algorithms for Real Symmetric Tridiagonal Eigenproblem

2.1 Introduction

The basic features of the curve-following scheme of our algorithm to follow the eigenpath (x(t), λ(t)) are:

(i) Initiating at t = 0
(ii) Prediction
(iii) Correction
(iv) Checking
(v) Detection of a cluster and subspace iteration
(vi) Step-size selection
(vii) Terminating at t = 1.

We begin by giving an overview of our algorithm, followed by a detailed explanation of these features.

Simple computation shows that Newton's method for the nonlinear problem of n + 1 equations

$$
F(\lambda, x) = \begin{pmatrix} Ax - \lambda x \\ \frac{x^T x - 1}{2} \end{pmatrix} = 0 \tag{2.1}
$$

in the n + 1 variables λ, x_1, x_2, ..., x_n at (λ^{(ν)}, x^{(ν)}) is the inverse iteration

$$
(A - \lambda^{(\nu)} I)\, y = x^{(\nu)}
$$

and

$$
\lambda^{(\nu+1)} = \lambda^{(\nu)} + \frac{(x^{(\nu)})^T x^{(\nu)} + 1}{2\, (x^{(\nu)})^T y}, \qquad
x^{(\nu+1)} = (\lambda^{(\nu+1)} - \lambda^{(\nu)})\, y.
$$

By making the initial matrix D close to A, the eigenpairs of D should be excellent starting points for applying Newton's method on the eigenproblem (2.1). Based on this observation, our algorithm, in simple terms, is designed to use the homotopy continuation method as a backup of Newton's method applied to (2.1). Namely, we first solve for the eigenvalues of the initial matrix D by using the most efficient method available, and apply the inverse iteration on (2.1), using each eigenvalue of D as a shift. Then we switch to Rayleigh quotient iteration (RQI), an inverse iteration with the Rayleigh quotient as the shift, to speed up the convergence. This is mainly equivalent to choosing the starting step size h = 1 in the usual curve following scheme with zero order prediction and Newton correction to follow the eigenpath (x(t), λ(t)) of the homotopy H(x, λ, t) = 0 in (1.3). We check the Sturm sequence at the convergent point, and if this procedure fails to provide the right eigenpair, we shall cut the step size in half.
That is, we repeat the process by applying the inverse iteration on

$$
\lambda x - A(t)x = 0, \qquad \frac{x^T x - 1}{2} = 0
$$

with t = 0.5, where A(t) = (1 − t)D + tA as in (1.3), and then switch to RQI to come back to the right eigenpath (x(t), λ(t)). Assuming that after i steps the approximate value (x(t_i), λ(t_i)) is known, we always choose the step size h = 1 − t_i at (x(t_i), λ(t_i)). In this way, we follow the eigenpath from t = 0 to t = 1.

2.2 Initiating at t = 0

As mentioned in Chapter 1, we intend to choose k for which β_{k+1} is as small as possible. To make the sizes of the blocks D_1 and D_2 roughly the same, we limit the choice of k to the range n/2 − j ≤ k ≤ n/2 + j, where j is roughly equal to n/20, and find the smallest β_{k+1} by local sorting. Once the initial matrix D is decided, different from the homotopy algorithms in [18, 22], where all the eigenvalues and the eigenvectors of D are calculated in order to start following the eigenpaths, our algorithm only calculates the eigenvalues of D_1 and D_2. These eigenvalues are obtained by using the most efficient method available. We require the accuracy to stay only within one-half or even one-third of the working precision. With this strategy, a considerable amount of computing time is saved.

2.3 Prediction

Assume that after i steps the approximate value (x̄(t_i), λ̄(t_i)) on the eigenpath (x(t), λ(t)) at t_i is known and the next step size h is determined; that is, t_{i+1} = t_i + h. We want to find an approximate value (x̄(t_{i+1}), λ̄(t_{i+1})) of (x(t_{i+1}), λ(t_{i+1})) on the eigenpath at t_{i+1}. Notice that (x̄(t_{i+1}), λ̄(t_{i+1})) is an approximate eigenpair of A(t_{i+1}). Since H(x(t), λ(t), t) = 0, we have

$$
A(t)x(t) = \lambda(t)x(t), \qquad x(t)^T x(t) = 1.
$$

Differentiating both equations with respect to t yields

$$
(A - D)x(t) + A(t)\dot{x}(t) = \dot{\lambda}(t)x(t) + \lambda(t)\dot{x}(t), \qquad x(t)^T \dot{x}(t) = 0. \tag{2.2}
$$

For t = t_i, multiplying (2.2) on the left by x^T(t_i) yields

$$
\dot{\lambda}(t_i) = x^T(t_i)(A - D)x(t_i) = 2\beta_{k+1}\, x_k(t_i)\, x_{k+1}(t_i) \tag{2.3}
$$

where x(t_i) = (x_1(t_i), ..., x_n(t_i))^T. In view of (2.3), we use the Euler predictor to predict the eigenvalue at t_{i+1}, namely,

$$
\lambda^0(t_{i+1}) = \lambda(t_i) + \dot{\lambda}(t_i)\, h.
$$

It is easy to see that λ̇(0) = 0 in (2.3). Consequently, λ^0(t_1) always equals λ(0). To predict the eigenvector, we use the inverse power method on A(t_{i+1}) with shift λ^0(t_{i+1}). That is, we solve

$$
\bigl(A(t_{i+1}) - \lambda^0(t_{i+1}) I\bigr)\, y^0(t_{i+1}) = x(t_i)
$$

and let

$$
x^0(t_{i+1}) = \frac{y^0(t_{i+1})}{\|y^0(t_{i+1})\|}.
$$

At t_i = 0, since we skip the calculation of the eigenvectors of D, x(0) is not available. We choose a random vector for x(0).

2.4 Correction

As a corrector, we use the standard RQI, starting with x^0(t_{i+1}). To be more precise, at (x^{j-1}(t_{i+1}), λ^{j-1}(t_{i+1})), j ≥ 1, let

$$
\lambda^j(t_{i+1}) = x^{j-1}(t_{i+1})^T A(t_{i+1})\, x^{j-1}(t_{i+1}),
$$

then solve

$$
\bigl(A(t_{i+1}) - \lambda^j(t_{i+1}) I\bigr)\, y^j(t_{i+1}) = x^{j-1}(t_{i+1})
$$

and let

$$
x^j(t_{i+1}) = \frac{y^j(t_{i+1})}{\|y^j(t_{i+1})\|}.
$$

We repeat the above process to within half of the working precision if single precision is used and one-third of the working precision if double precision is used when t_{i+1} < 1, since precision in determining the curve itself is only of secondary interest. We polish (x^j(t_{i+1}), λ^j(t_{i+1})) at the end of the path (t_{i+1} = 1) by iterating the Rayleigh quotient to machine precision. The stopping point (x^j(t_{i+1}), λ^j(t_{i+1})) of RQI will be taken as an approximate eigenpair (x̄(t_{i+1}), λ̄(t_{i+1})) of A(t_{i+1}). The cubic convergence rate of RQI makes the corrector highly efficient.
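To make the prediction and correction steps of Sections 2.3 and 2.4 concrete, here is a minimal sketch in Python with NumPy. It is illustrative only and not the thesis implementation: dense solves stand in for tridiagonal ones, the precision-dependent stopping rules are reduced to a single tolerance, and the function names are ours.

```python
import numpy as np

def euler_predict(lam, x, beta_k1, k, h):
    """Euler predictor of Section 2.3: lambda_dot(t_i) = 2 beta_{k+1} x_k x_{k+1}."""
    lam_dot = 2.0 * beta_k1 * x[k - 1] * x[k]        # 0-based: x[k-1] = x_k, x[k] = x_{k+1}
    return lam + h * lam_dot

def inverse_power_step(A_t, shift, x_prev):
    """Eigenvector prediction: one inverse power step with the predicted shift."""
    y = np.linalg.solve(A_t - shift * np.eye(A_t.shape[0]), x_prev)
    return y / np.linalg.norm(y)

def rqi(A_t, x0, tol=1e-10, max_iter=20):
    """Rayleigh quotient corrector of Section 2.4 on A(t) = (1 - t) D + t A."""
    x = x0 / np.linalg.norm(x0)
    lam = x @ A_t @ x
    for _ in range(max_iter):
        lam = x @ A_t @ x                            # Rayleigh quotient shift
        try:
            y = np.linalg.solve(A_t - lam * np.eye(len(x)), x)
        except np.linalg.LinAlgError:
            break                                    # shift is (numerically) an eigenvalue
        x = y / np.linalg.norm(y)
        if np.linalg.norm(A_t @ x - lam * x) <= tol * np.linalg.norm(A_t, np.inf):
            break
    return x, lam
```

In the algorithm proper, the first solve uses the Euler-predicted shift λ^0(t_{i+1}) (the inverse power step above) and, for t_{i+1} < 1, the RQI loop is stopped at the reduced precision described in Section 2.4 rather than at machine precision.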
2.5 Checking

When (x̄(t_{i+1}), λ̄(t_{i+1})) is taken as an approximate eigenpair of A(t_{i+1}), the Sturm sequence at λ̄(t_{i+1}) + ε is computed to check that, if we are trying to follow the curve of the i-th eigenvalue, we are still on that curve. Here, ε is chosen as half of the working precision if single precision is used and one-third of the working precision if double precision is used. If the check fails, we reduce the step size to h/2 and repeat the whole process once again, beginning with the eigenvalue prediction in Section 2.3.

2.6 Detection of a cluster and subspace iteration

At t_i = 0, when all the eigenvalues of D, λ_1(0) < λ_2(0) < ... < λ_n(0), are available, we let δ = max(10^{-5}, 10^{-2}(λ_n(0) − λ_1(0))/n) if double precision is used (or δ = max(10^{-3}, 10^{-2}(λ_n(0) − λ_1(0))/n) if single precision is used). Set λ_i and λ_j in the same group if |λ_i(0) − λ_j(0)| < δ. If the number of the eigenvalues in any group is bigger than 1, then a cluster is detected.

At t_i ≠ 0 or 1, when (x̄(t_i), λ̄(t_i)) is taken as an approximate eigenpair of A(t_i), after the checking step in Section 2.5 we compute the Sturm sequences at λ̄(t_i) ± δ for the purpose of finding the number of eigenvalues of A(t_i) in the interval (λ̄(t_i) − δ, λ̄(t_i) + δ). When this number is bigger than 1, a cluster of eigenvalues of A(t_i) is detected.

In those cases, the corresponding eigenvectors are ill-conditioned and this ill-conditioning can cause inefficiency in the algorithm. To remedy this problem, the inverse power method [24, 31] with λ̄(t_i) as shift is used to construct an approximation of the corresponding eigenspace S of dimension m (= the number of the eigenvalues in the cluster) of A(t_i). This approximate eigenspace S is used as an initial subspace of the subspace iterations at t_{i+1} when we approximate the eigenpairs of A(t_{i+1}).

2.7 Step size selection

In the first attempt, we always choose the step size h = 1 − t_i at t_i < 1. If, after the prediction and the correction steps, the checking step fails, we reduce the step size to h/2, as mentioned in Section 2.5. This extremely liberal choice of step size can be justified because of the observations we described in the beginning of this chapter (Section 2.5) as well as the effective checking algorithm. Indeed, since the initial matrix D is chosen to be so close to A, in our experience most of the eigenpairs of A can be reached in one step, i.e., h = 1.

A very small step size can also cause inefficiency in the algorithm. Therefore, we impose a minimum γ on the step size h. If h < γ, we simply give up following the eigenpath and the corresponding eigenpair of A will be calculated at the end of the algorithm by the method of bisection with inverse iterations (see Section 2.8). We usually choose γ = 0.25.

2.8 Terminating at t = 1

At t = 1, when an approximate eigenvalue λ̄(1) is reached, we compute the Sturm sequence at λ̄(1) + ε with ε = machine precision to ensure the correct order. If the checking fails, we have jumped onto a wrong eigenpath. More precisely, suppose we are following the i-th eigenpair and the checking algorithm detects that we have reached the j-th eigenpair instead. In this situation, we save the j-th eigenpair before the step size is cut. By saving the j-th eigenpair, the computation of following the j-th eigenpair is no longer needed. As mentioned in Section 2.7, we may give up following some eigenpaths to avoid adopting a step size that is too small.
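The checks in Sections 2.5, 2.6 and 2.8 all reduce to one primitive: counting, with a Sturm sequence, how many eigenvalues of the symmetric tridiagonal matrix A(t) lie below a given shift. The following is a small illustrative sketch (Python/NumPy, not the thesis code); eigenvalues are indexed in increasing order here, and the zero-pivot guard is a crude stand-in for a proper pivot threshold.

```python
import numpy as np

def sturm_count(alpha, beta, sigma):
    """Number of eigenvalues of the symmetric tridiagonal matrix with diagonal
    alpha[0..n-1] and off-diagonal beta[0..n-2] that are less than sigma
    (= number of negative pivots in the LDL^T factorization of T - sigma*I)."""
    d = alpha[0] - sigma
    count = 1 if d < 0 else 0
    for i in range(1, len(alpha)):
        if d == 0.0:
            d = -np.finfo(float).eps                 # crude guard against a zero pivot
        d = (alpha[i] - sigma) - beta[i - 1] ** 2 / d
        if d < 0:
            count += 1
    return count

def still_on_curve(alpha, beta, lam, i, eps):
    """Checking step (Section 2.5): if lam is the i-th smallest eigenvalue,
    then exactly i eigenvalues of A(t) should lie below lam + eps."""
    return sturm_count(alpha, beta, lam + eps) == i
```

The same count, evaluated at λ̄ ± δ, gives the cluster size of Section 2.6, and bisection on it supplies the fallback eigenvalues mentioned in Sections 2.7 and 2.8.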
Without any extra computation, we know exactly which eigenpairs are lost at t = 1. In order to find these eigenpairs, we first use the bisection to find the eigenvalues up to the half working precision and then use the inverse iteration and the RQI or subspace iteration (if there are some clusters) to find the eigenpairs. 22 Chapter 3 Numerical Results of the Homotopy Algorithms for Real Symmetric Tridiagonal Eigenproblem 3.1 Test Matrices The homotopy continuation algorithm is in its preliminary stage, and much de- velopment and testing are necessary. But the numerical results on the examples we have looked at seem remarkable. Our testing matrices are: (1) The Toeplitz matrix [1, 2, l], i.e., all its diagonal elements are 2 and off- diagonal elements are 1. (2) The random matrix with both diagonal and off-diagonal elements being uni- formly distributed random numbers between 0 and 1. (3) The Wilkinson matrix W,;". i.e., the matrix [l,d;,1], where d,- = abs((n + l)/2 — i), i = 1,2, ...,n with n odd. (4) The matrix [1, 11,-, 1], where p.- = i x 10". (5) The matrix T; : same as matrix [2, 8, 2] except the first diagonal element 23 a] = 4. (6) The glued Wilkinson matrix W}? The matrix consists of j copies of Wilkinson matrices W: along the diagonal and the off-diagonal elements 19.31,.” = 10’“, where i=1,...,j-l. (7) The LAPACK test matrices which include 21 type matrices. 3.2 A serial comparison with the existing meth- ods. For symmetric tridiagonal matrices, the routine TQL2 in EISPACK [27] im- plements explicit QL-iteration to find all the eigenpairs. EISPACK also includes a Sturm sequence with inverse iteration method ( BISECT+ TINIVT) which is much faster than TQL2. However, it may fail to provide more accurate eigenvectors when the corresponding eigenvalues are very close. A new version by Jessup [15] (B / III) has considerably improved the reliability and the accuracy of the inverse iteration. The ‘Divide and Conquer’ method for symmetric tridiagonal matrix was suggested by Cuppen [8] and was implemented, combining with a deflation and a robust root finding technique, by Dongarra and Sorenson [9] (TREEQL) (see also [28]). We shall show the computational results comparing the homotopy continuation method HOMO with those obtained by the methods TQL2, B/III and TREEQL mentioned above. The computations were done on a Sun SPARC station 1. Table 3.1, 3.2 and 3.3, show the comparisons on the first 6 type test matrices listed in Section 3.1 in terms of speed, accuracy and orthogonality respectively. The homotopy method appears to be strongly competitive in every category and leads in speed by a considerable margin in comparison with all other methods in most of the cases. Tables 3.4 to 3.9 show the comparisons with (B/III) i.e., DSTEBZ+ DSTEIN, which is the latest code based on bisection with inverse iteration, in terms of speed, accuracy and orthogonality respectively on the LAPACK test matrices. The LAPACK test matrices of type 1 to type 7 are diagonal matrices. 
Table 3.4 to 3.6 show that 24 Matrix Order Execution Time (second) N HOMO B/III TREEQL TQL2 64 2.03 2.43000 5.90999 17.0600 [1,2,1] 125 5.88 8.58000 37.6199 114.460 256 30.25 35.7000 302.859 904.659 499 100.04 152.920 984.460 2416.14 64 1.13 2.44000 6.09000 17.3200 Random 125 3.79 8.53000 31.4100 115.880 256 14.90 34.4500 158.120 949.939 499 53.19 133.620 235.889 2482.68 65 0.97 1.84000 2.73999 16.1000 w: 125 3.97 6.21000 8.20001 108.890 255‘ 16.40 22.5700 31.8398 879.959 499 57.35 95.5000 57.8300 3869.33 64 1.97 2.43000 5.91000 17.1500 [1,,1.,1] 125 6.78 8.64999 37.6500 115.699 256 30.61 34.1900 303.449 901.810 499 107.04 129.370 984.410 2424.88 64 1.80 2.39000 5.80000 16.6800 T2 125 6.85 8.60000 37.3700 115.000 256 28.24 34.8600 165.250 939.859 499 108.46 174.760 979.738 2506.93 64 1.80 1.73000 2.07999 12.2000 w; 128 8.69 6.07000 6.90000 67.5600 256_ 38.85 22.5100 29.0700 490.779 512 144.05 89.2200 162.040 5303.37 Table 3.1: Execution Time (second) of computed eigenvalues and eigenvectors. 25 Matrix Order [ max; llAza ’, Aatallz/Amu | TQL2 [ N HOMO B/III I TREEQL 64 1.9107015 4.9769016 1.9027015 9.8365015 [1,2,1] 125 1.9519015 8.4237016 3.3617015 1.0949014 256 2.3533015 1.2825015 6.0929015 2.7538014 499 2.2251015 1.7640015 7.5172015 3.7564014 64 1.9319016 3.6661016 2.4581014 6.0589015 Random 125 2.0485016 3.5621016 5.6541014 1.2686014 256 3.9846015 3.6238016 7.1488014 5.3928014 499 1.0330013 9.2496015 4.8200014 5.7588014 65 5.4172016 5.4991015 1.8801013 8.7914014 w: 125 3.3967016 7.1627015 1.0842012 3.4443013 255 7.7880016 1.4301014 1.0868012 6.7159013 499 9.1037016 2.8363014 1.6269011 6.5775012 64 4.9149015 5.0881016 2.4436015 8.8198015 [1,11.,1] 125 3.2254015 7.0876016 3.8144015 1.1196014 256 4.4093015 1.1424015 5.4271015 2.7578014 499 4.9410015 1.7580015 8.4456015 3.7155014 64 7.1616016 1.7737015 5.5986015 1.5815014 T2 125 7.3060016 1.8509015 6.5573015 2.6714014 256 8.1746016 2.2977015 1.3413014 5.0148014 499 8.0805016 3.7362015 1.7603014 8.1773014 64 8.1580016 3.2592015 4.3889014 3.1906014 w; 128 1.2509015 1.3014014 4.3246013 1.1476013 256 2.1321015 1.4015014 3.8488012 6.6732013 512 4.2327015 2.3889014 3.8490012 1.7667012 Table 3.2: The residual of computed eigenvalues and eigenvectors. 26 Matrix Order max-JKXTX —I);J|/Am., N HOMO B/III PTREEQL TQL2 64 1.7133015 9.4041015 1.4836015 5.9952015 [1,2,1] 125 1.3399014 3.5176014 3.6237015 7.5495015 256 1.1554014 2.1010014 1.4941014 1.5987014 499 5.3994014 4.1999014 1.9569014 2.4868014 64 2.3654015 2.6645015 1.0890014 7.3274015 Random 125 8.5588014 5.5839015 4.1540014 1.1324014 256 4.0178013 3.9968015 1.9257013 3.6415014 499 1.0330013 9.2496015 4.8200014 4.2410014 65 4.7465017 2.2204015 1.1102015 7.5495015 w: 125 1.7693017 3.2163015 1.2585015 1.2656014 255 1.2167017 4.3258015 "4.4408015 2.5091014 499 9.3353018 1.2278013 8.2156015 6.9722014 64 3.3790015 7.6699015 1.0852015 4.6629015 [1,15, 1] 125 1.2375014 4.3372014 1.2751014 6.6613015 256 3.9662014 1.6639013 8.4026015 1.3322014 499 1.8082013 1.3981013 6.9169014 2.4424014 64 5.7249016 7.9102015 9.3588016 5.9952015 T2 125 1.8625015 1.3832014 3.7492015 1.3322014 256 3.2107014 1.9904014 1.0485014 2.1760014 499 6.6773014 1.5714014 1.9133014 4.9293014 64 7.9556017 3.5527015 6.6613016 7.9936015 w; 128 6.2610016 1.4708013 1.1366015 9.1038015 256 7.0303017 1.0394013 4.4408015 2.3758014 512 8.4508017 1.4259013 1.1324014 2.6645014 Table 3.3: The orthonormality of computed eigenvectors 27 both algorithms work very well. 
The matrices of type 9, l7 and 21 have a large cluster with dimension around 4n/5, where n is the order of matrices. On these matrices, while HOMO is not as fast as B/III its accuracy is still competitive. The matrices of type 10 and 18 have a even larger cluster with dimension n - 1. Although HOMO works, time consuming is out of comparison. Tables 3.4 to 3.9 clearly show that the homotopy method leads in speed by a considerable margin in comparison with B / III on all other types of the LAPACK test matrices. 3.3 An indirect comparison with the existing par- allel algorithms Scientific and engineering research has become increasingly dependent upon the de- velopment and implementation of efficient parallel algorithms on modern high - performance computers. Developing algorithms for advanced computers suitable for eigenvalue problems has produced several algorithms, such as Divide and Conquer (D&C)[9] and Bisection/Multisection (B/M)[23] for symmetric tridiagonal matrices. The homotopy algorithm is to a large degree parallel since each eigenpath can be fOllowed independently. This inherent nature of the homotopy method makes the parallel implementation much simpler than other methods. In our parallel algorithm, after all the eigenvalues of D are computed and put in the increasing order, we assign each processor to trace roughly 11 / p eigencurves, where n is the dimension of matrix A and p is the number of the processors being used. Let the first processor trace the first n/ P smallest eigencurves from the smallest to the largest and let the second processor trace the second n / p smallest eigencurves, and so on. We present, in this section, the numerical results of the parallel implementation of our algorithm. All examples were executed on BUTTERFLY GP 1000, a shared memory multiprocessor machine. 28 Table 3.4: Execution Time (second) of computed eigenvalues and eigenvectors from type 1 to 7. Matrix Order Execution The (second) 7". 17 110140 31111 11.41.7(B/IIDIHOMO 10 0.00 0.00 . 10.111: 02 0.00 0.00000000 Type 04 2.0000000: 2.0000000: 1 120 7.0007000: 1.0000000: :50 0.020000 0.000017 500 1.17000 1.40000 1.10 10 0.00 1.0000000: 100411: 0: 1.0000200: 0.00000000 7". 04 2.0000000: 2.0000000: 2 120 0.0001000: 0.0000000: :00 0.010000 0.000000 000 1.10000 1.02000 1.12 10 0.00 0.00000000 14.111. 0: 0.00 0.00000000 '1‘". 04 2.0000000: 0.0002000: 0 120 0.0002100: 0.0011000: :50 0.000007 0.040000 000 1.17000 1.00007 1.14 10 0.00 0.00 14.411: 0: 1.0000200: 0.00 r". 04 2.0000000: 2.0000000: 4 120 0.0002100: 0.0000000: :50 0.020000 0.000000 1100 1.17000 1.04000 1.14 10 0.00 0.00 11.411: 0: 0.00 1.0000200: r". 04 2.0000500: 2.0000000: 0 120 7.0000000: 0.0000000: :00 0.000007 0.040000 000 1.10000 1.00000 1.10 10 0.00 0.00 00.411: 0: 1.0000200: 0.00070000 Type 04 2.0000500: 2.0000500: 0 1:0 0.0070000: 0.0000000: :50 0.020000 0.000000 000 1.1000: 1.02007 1.12 10 0.00 0.00000000 10541-1: 02 0.00020000 1.0000200: Type 04 2.0000000: 2.0000000: 7 120 0.0070000: 0.0000000: :00 0.020000 0.000000 1100 1.10000 1.04000 1.14 29 Mom: Otdu mu.- IIM.‘ - Airing/8m I Duij KXTX - ”Lil/x770“ I 32 0.0 0.0 0.0 0.0 Type 04 0.0 0.0 0.0 0.0 3 130 0 0 0.0 0 0 0 0 250 0.0 0.0 0.0 0.0 500 0.0 0.0 0.0 0.0 10 0.0 0.0 0.0 0.0 32 0.0 0.0 0.0 0.0 Type M 0.0 0.0 0.0 0.0 3 120 0 0 0.0 0 0 0 0 250 0.0 0.0 0.0 0.0 000 0.0 0.0 0.0 0.0 10 0.0 0.0 0.0 0.0 33 0.0 0.0 0.0 0.0 Type 04 0.0 0.0 0.0 0.0 d 12. 
0 0 0.0 0 0 0 0 L 250 0 0 0.0 0 0 0 0 500 0.0 0.0 0.0 0.0 )0 0.0 0.0 0.0 0.0 32 0.0 0.0 0.0 0.0 Type 04 0.0 0.0 0.0 0.0 5 130 0 0 0.0 0 0 0 0 350 0.0 0.0 0.0 0.0 000 0.0 0.0 0.0 0.0 10 0.0 0.0 0.0 0.0 32 0.0 0.0 0.0 0.0 Type 04 0.0 0.0 0.0 0.0 0 120 0.0 0.0 0.0 0.0 3“ 0.0 0.0 0.0 0.0 500 0.0 0.0 0.0 0.0 10 0.0 0.0 0.0 0.0 32 0.0 0.0 0.0 0.0 Typo 04 0.0 0.0 0.0 0.0 7 120 0.0 0.0 0.0 0.0 3“ 0.0 0.0 0.0 0.0 m 0.0 0.0 0.0 0.0 Table 3.5: The residual and the orthonormality of computed eigenvalues and eigen- vectors from type 1 to 7. 30 Table 3.6: Execution Time (second) of computed eigenvalues and eigenvectors from type 8 to 14. 1100 1'18 00400 380000100 Time (0000“) Type 11 nouo 31111 11.41.( 10 0.00M0‘02 1 .000003-01 . 1400112 32 0.700000 1 .01000 Type 44 2.47000 4.04000 4 120 11.3300 15.3000 254 33.3000 40.1001 500 142.500 224.000 1 .50 10 0.1 1000 1 .000003-01 ll“ fix 32 1.24000 0.03000 Type 04 10.0100 4.04000 0 120 1 10.000 15.3000 250 037.030 154.510 500 7334.24 000.000 0. 14 10 ‘ 0.000003-02 “51718 32 ‘ 0.50000 1'". 04 ' 0.01000 10 120 " 24.2200 250 ‘ 171.450 500 ‘ 1 103. 1 1 ’ 10 0.0000000: 0.10000 11.1112 02 0.7100'00 1.00000 Type 44 2.44000 3.44000 1 1 124 11 .1400 24.2000 254 40.3500 57.00“ 500 174.370 217.410 1.25 10 1 .00000001 0.1 1000 1400010 32 0.530001 1.07000 Type 04 0.22000 4.17000 12 120 12.2300 14.7000 250 40.0300 00.00“ 500 150.332 220. 100 1 .44 10 0. 1 10000 0.13000 Hun-in 32 0.47000 0.04000 Type 44 2.37000 4.44000 13 120 0.05001 14.00” 250 33.0000 50.5000 5“ 157.010 223.441 1.42 10 5.000003-02 0.12000 1400010 32 0.520000 0.00000 Typo 04 2.40000 0.02000 14 124 4.70004 14.7700 250 41.3700 55.1000 5” 137.410 217.320 1.53 ' The dime-0100 cl 050 0050100000 10 3000000 than 410/5. 31 non-1x 00000 .1086 "As.- - 15:5", [1,... nos,- .1' [1(7): - 1),- .1' mm Typo 11 110140 a/m 110110 111m 10 1.110000." 1.401000.“ 2.0001 10.10 4.4400011.” 02 2.1000011.“ 2.0005011.” 1.0010011.“ 1.0000011." Type 04 1.0000010.“ 2.0000111.“ 0.0000011.” 2.000720.“ 0 120 0.4110211.“ 2.0440013.“ 0.0400011.“ 4.5000013.» 200 0.0100011." 0.0-000013.10 1.0050211.“ 1.0211110.“ soo 0.000740.“ 0.0100411." 2.0200413.“ 2.0010010.“ ‘ 10 0.1120211." 1.0070013." 0.001 10413.10 0.0000013." 02 2.1100111.“ 1.000100.“ 0.020040.“ 4.440000.“ Typ0 04 0.1010111." 1.0071411." 0.101020.“ 4.040240.“ 0 120 1.0004011.“ 1.4002011.“ 0.1101511.“ 1.0110010." 200 1.1470011.“ 2.0002211." 1.1000413.“ 1.2110011.” 1100 1.0000010." 0.0-100013.10 1.7700011.“ 2.100410." 10 - 0.014400." 0.000000.” 02 - 0.5120413." — 5.1500004: 'rypo 04 - 1.001000." - 0.1210013.“ 10 120 - 0.0014011.“ - 0.224100.“ 200 - 0.014700.“ - 1.2002113.” 000 - 0.0001711." - 0.2200011." 10 1.0200113." 1.4420211.“ 1.004000.“ 1.401000." 02 1.0000011.“ 2.0112011." 1.7000211.“ 1.0000011." 'rypo 04 1.0070010.“ 2.0000011.“ 1.011000." 0.007000.” 11 120 0.0000011.“ 2.0002113.“ 2.0001013.“ 2.0124010.“ 200 0.01 17011.10 2.0011210.“ 4.0040111." 1.4404010.” :00 1.0402011." 0.4404411.“ 1.1000213. 14 0007171110 10 2.2000011." 1.0404013.“ 0.200020." 1.012000.“ 02 ”01000.10 2.0127013.“ 4.4100411.“ 0.4400013." 'rypo 04 0.0200011.“ 2.2000011.“ 1.000440.” 4.007000.“ 12 120 4.0000011.“ 2.0101411.“ 4.0200011.» 0.0020011.“ 200 4.001000.“ 2.0020011.“ 0.0000011.“ 0.0400013.“ 000 1.2400111.“ 0.000000.“ 1.400400.“ 0.007010.“ 10 1.040140.“ 0.122200." 1.171000.“ 1.100700." 02 0.1100011.“ 1.0002011.“ . ”202013.10 1.000000." Typo 04 1.010010.“ 2.0000711.“ 1.1000013." 1.0000011.“ 10 120 0.0014010.“ 2.0014013.“ 0.0402013.“ 4.0010211.“ 200 1.0071 115.10 2.2122213." 1.0700010." 1.0100010." 000 0.000020." 
2.4210411.» 1.2010011.” 0.404000." 10 0.4712011." 2.1021011.“ 1.407240." 2.000000." 02 1.0000011." 1.0014111.“ 0.420000." 1.404070.“ 'rypo 04 2.4001711." 1.010000.“ 1.200100." 0.1407011.» 14 100 4.2400011.“ 2.0100413.“ 0.0001413." 2.020400. 10 2110 2.0110011." 2.0000110.“ 0.4702011.” "10010.10 000 1.1101011.“ 2.0100010.“ 1.0142011.“ 2.0100013.“ Table 3.7: The residual and the orthonormality of computed eigenvalues and eigen- ‘ 1'00 0100000100 0! 100 0050100430 10 3000100 0000 401/5. vectors from type 8 to 14. 32 -' Table 3.8: Execution Time (second) of computed eigenvalues and eigenvectors from type 15 to 21. 0000110 00400 Ixmthl Til-0 (0000.1!) rm 17 110000 31111 100110 ((B/III) O 10 0.11000 0.10000 . 10.111: 02 0.00000 1.00000 2‘". 00 0.02000 0.07000 10 120 0.07000 10.0000 000 00.7100 00.0000 000 100.000 220.710 1.00 10 ”00070.02 0.12000 100011; 02 0.000001 1.00000 rm 00 2.00001 0.00000 10 120 0.00007 10.1000 200 02.0100 000001 000 100.000 227.001 1.01 10 0.000000.” 0.10000 00.1010 02 2.00000 0.070001 1”. 00 10.0700 0.00001 17 120 100.000 20.0000 200 000.000 102.210 000 0010.00 000.100 0.11 10 - 0000021002 100111: 02 0 0.000000 Tm 00 - 0.00001 10 120 - 20.2000 200 - 170.000 000 - 1100.00 - 10 0.100000 0.110000 00.111: 02 0.000000 1.02000 Type 00 2.00000 0.01001 10 120 0.71007 10.7000 200 00.7000 07.0000 000 107.070 210.000 1.00 10 ”00070.02 0.110000 10001111 02 0.000007 1.00000 Type 00 0.70000 0.00000 20 120 10.0000 10.0000 200 02.0002 00.2000 000 100.100 201.700 1.00 10 7.000000.02 7.00000502 1000110 02 2.07000 1.01000 Type 04 14.1000 0.00001 21 120 102.200 20.0200 200 1110.00 100.700 000 0702.00 002.001 0.12 ' Th0 dine-.000- 0! the 0050100000 10 3000007 050. 00,5. 33 140001: 07407 1010:.- "As.- -1,-s.-|bllm .uid' KXTX - ”MI/1m 1'”. N 110000 31111 30000 I 31m 10 ”10000.17 0.010771110— m 52 1352.20. 1. 1.545.20- 1. 1.4407404. 7.45.5.04. Type .4 5.4.0070- 1. 2..52..04. 1.5522504. 5......045 1. 12. 1.7551204. 1.22.2.0- 1. 5.7740204. 7.2....04. 2.. ...554004. 3.1552104. ..5.7240- 1. 4.4.4.5044 500 ..5577504. 5.54.5204. 2.745750- 14 2.0.2020. 14 10 ..2..51047 1.215.104. 53505.0. 1. 4.440.204. 52 7.35512047 2.012.104. ....1550- 1. 7.771550. 1. Typo .4 1.132.704. 2.2550204. 1.7211504. 1.0055704. 1. 12. 2.242.2047 2.2125504. 240277045 2.275550. 15 25. 4.575.30- 1. 1.525540. 1. 5.4172.04. 5.5445504. .00 1.145.504. 2.5722704. 1.70.15045 1.24725044 10 5.51.510- 17 2..1022047 1. 1 10220- 1. 5.5505504. 52 ..4.712047 1.575070- 1. 2.5555104. 4.440.20- 1. '1'pr .4 ...1.25047 2.1.52.047 4.215.404. 1.1055704. 17 12. 2.54522047 1.545200. 1. 7.2255004. 2.....00- 1. 35. ..45502047 1.5720204. 1.34544044 5.0025504. .00 2.552210. 1. 1.....50- 1. 2.0.721044 5.2050004. 10 ' 1.515250- 1. 7.45.5404. 52 ' 1.075520. 1. 1 .15.4.0- 15 Type .4 ‘ 0.3624004? 8.055000. 1. 1. 12. ‘ 1.22.2204. ‘ 1 ..0025045 2.. ‘ 4.4755204. ‘ 2.1.2.0045 500 ‘ 5.7144304. ‘ 1.5244404. 10 2.401.30- 1. 5.2.7.30. 1 7 ..472.30- 1. 4.440.20- 1. 53 7.425.40- 17 1.550.204. 4..5.1104. 7.1.”00- 1. Type 00 2.000070." 1.200000. 10 7.100000. 10 0.000000. 10 12 13. 5.1.1500- 1. 1..522204. 1.52.3045 1.144.504. 25. ..271270. 1. 2.12.7504. 5.5.5.2045 4.5754404. 5N 1.20.7504. 2.1277204. 7..5512045 1....75045 10 7.515.0047 1.020270- 1. 2.7522104. ...10550o 1. .2 5.7224704. 1.”.2104. ..4220.0- 1. 5.02.020. 15 Type .4 1.4.22.04. 1.2015404. 7.502.104. 5.1111704. 20 12. 2.5402.0- 1. 1.....10- 1. 1.5555104. 2.4.50.0- 15 2.. 4.4504504. 2.52.0404. 5.0225004. 4.2522504. .00 ..051520- 1. 2.3070404. 2.4.5.50- 1. 5.2555104. 10 5.1.5.2047 ......5047 1.1751204. 5.52.5.04. 
53 5.57254047 1.0077504. 2.4427404. ....2750- 1. Type .4 4.5552.047 1.0125404. 5.225.50- 1. ....15.0- 1. 21 13. 5.5.4.1047 1.1.71.0- 1. 2.5472404. ....1550o 1. 25. ..354200- 17 1.1435704. 1.24544044 ....17.04. 500 ..04..5047 1.14.0504. 2.5.5.7044 1.75.5104. ‘ Th0 dime-0k- “ the 0.5090000 10 3700000 050. 400/5. Table 3.9: The residual and the orthonormality of computed eigenvalues and eigen- vectors from type 15 to 21. 34 The speed-up is defined as execution time using one processor P execution time using p processors and the efliciency is the ratio of the speed-up over p, the number of processors being used. Table 3.10 shows the execution time as well as the speed-up S, and the efficiency SP/p of our algorithm HOMO on matrices [1, 2, l] and T2 by using p processors. For the purpose of comparison with other methods, the speed up of our method over TQL2, 71.811726, on [1, 2, 1] are also listed. Similar results on random matrices and Wilkinson matrices are shown on Table 3.11. We list in Table 3.12, 3.13 and 3.14 some of the results of D&C and B/M. It is somewhat difficult to compare our results with theirs directly since their results were executed on different machines. Using TQL2, an indirect comparison may be obtained. In Table 3.12 (Table 8.3[9]), when eight processors are being used, the speed-up of D&C algorithm(SESUPD) over TQL2 is 9.4 for matrix [1, 2, 1] of order 100 whereas ours is 27.70 for the same matrix of order 125, and the speed-up of D&C algorithm(SESUPD) over TQL2 is 20.00 for order 400 whereas ours is 146.19 for order 499. In Table 3.14 (Table 8.4[9]), when eight processors are being used, the speed-up of D&C algorithm(SESUPD) over TQL2 is 12.1 for random matrix of order 100 whereas ours is 46.16 for the same matrix of order 125 (see Table 3.11), and the speedup of D&C algorithm(SESUPD) over TQL2 is 60.7 for order 400 whereas ours is 220.00 for order 499. Table 3.13 (Table 7b[23]) shows the speed-up of B / M (two versions: TREPSl and TREPS2) and D800 over TQL2 for the matrix [—1,2, —1] of order 500. It indicates that TQL2/SESUPD = 27.1 and TQL2/TREPS2 = 131.5 whereas TQL2/HOMO = 146.19 for the matrix [1,2,1] of order 499 on Table 3.10. This result suggests that the speed-up of the homotopy algorithm is at least as good as TREPSZ. Figure 3.1 shows the efficiency of the matrices [1, 2, 1], T2, Random matrices and W: of the order 499 and Figure 3.2 shows the speed-up of the matrices [1, 2, 1], T2, Random matrices and W: of the order 499. 35 [1, 2, 1] matrix 0.... 0.... HOMO g, TQL2 TQL2 li— (can...) 3, p (lu'l‘tms) (00.11....) ___—T 7.99 .1.0 1.00 27.55 3.44 9.41 1.0 1.00 26.33 65 2 4.51 1.8 0.89 6.10 4.93 1.9 0.95 4 2.77 2.9 0.72 9.93 2.88 3.3 0.82 1 29.18 1.0 1.00 176.7 6.06 34.98 1.0 1.00 177.33 125 2 15.25 1.9 0.96 11.59 17.55 2.0 1.00 4 8.76 3.3 0.83 20.18 9.34 3.7 0.94 8 6.38 4.6 0.57 27.70 6.14 5.7 0.71 1 122.37 1.0 1.00 1457. 11.91 143.46 1.0 1.00 1497. 2 69.93 1.9 0.97 20.85 71.76 2.0 1.00 255 4 33.08 3.7 0.92 44.07 37.2 3.9 0.96 8 20.38 6.0 0.75 71.53 22.4 6.4 0.80 16 14.28 8.6 0.54 102.08 14.85 9.7 0.60 1 477.91 1.0 1.00 10889 22.79 550.90 1.0 1.00 11198 2 242.01 2.0 0.99 45.00 276.56 2.0 1.00 499 4 125.45 3.8 0.95 86.80 140.48 3.9 0.98 8 74.49 6.4 0.80 146.19 81.72 6.7 0.84 16 47.41 10.1 0.63 229.69 52.27 10.5 0.66 Table 3.10: Execution time (second), speed-up and efficiency of HOMO on [1, 2, 1] and T2 matrices. 36 Random matrix ll ' Wilkinson matrix 0.... ...... 110140 5, TQL2 3912 HOMO g, TQL2 N p (lu‘l‘ime) S, L (Bu’l‘imo) HOMO (Bu’l‘imo) S, p (24m...) 
  N    p |  HOMO     S_p   S_p/p   TQL2      TQL2/HOMO |  HOMO     S_p   S_p/p   TQL2
         |                 (Random matrix)             |        (Wilkinson matrix)
  65   1 |    6.66   1.0   1.00     29.48      4.43    |    6.27   1.0   1.00     25.61
       2 |    3.90   1.7   0.85                7.56    |    3.35   1.9   0.94
       4 |    2.44   2.7   0.68               12.08    |    2.03   3.0   0.77
 125   1 |   21.11   1.0   1.00    187.4       8.88    |   22.10   1.0   1.00    172.05
       2 |   11.11   1.9   0.95               16.87    |   11.41   1.9   0.97
       4 |    6.04   3.5   0.87               31.03    |    6.20   3.6   0.89
       8 |    4.06   5.2   0.65               46.16    |    4.26   5.2   0.65
 255   1 |   80.55   1.0   1.00   1490.       18.50    |   86.69   1.0   1.00   1367.
       2 |   41.59   1.9   0.97               35.83    |   44.27   2.0   0.98
       4 |   22.23   3.6   0.91               67.03    |   23.45   3.7   0.92
       8 |   14.35   5.6   0.70              103.83    |   15.19   5.7   0.71
      16 |   10.57   7.6   0.48              140.96    |   10.55   8.2   0.51
 499   1 |  301.78   1.0   1.00  11447.       37.93    |  328.24   1.0   1.00   9949.
       2 |  155.11   1.9   0.97               73.80    |  166.82   2.0   0.98
       4 |   82.47   3.7   0.92              138.80    |   88.89   3.7   0.92
       8 |   52.03   5.8   0.73              220.00    |   56.16   5.8   0.73
      16 |   39.19   7.7   0.48              292.09    |   40.13   8.2   0.51

Table 3.11: Execution time (second), speed-up and efficiency of HOMO on Wilkinson and Random matrices. (HOMO = execution time in seconds, S_p = speed-up on p processors, S_p/p = efficiency, TQL2 = execution time of TQL2, TQL2/HOMO = speed-up over TQL2.)

 N            100    200    300    400
 TQL2/SESUPD  9.4    15.4   17.7   20.00

Table 3.12: A comparison on the Alliant FX/8 for TQL2 vs. TREEQL on [1, 2, 1] matrices.

 i   Algorithm        Time(TQL2 on 1 CE)/Time(Alg. i on 8 CEs)   Time(TQL2 on 8 CEs)/Time(Alg. i on 8 CEs)
 1   TREPS1                        32.9                                            7
 2   TREPS2                       131.5                                           28
 3   TQL2                           4.7                                            1
 4   BISECT+TINVIT                  3.6                                            0.8
 5   SESUPD                        27.1                                            5.8

Table 3.13: Speed-up over TQL2 on the Alliant FX/8 for computing all the eigenpairs of the [-1, 2, -1] matrix of order 500.

 N            100    200    300    400
 TQL2/SESUPD  12.1   19.5   38.8   60.7

Table 3.14: A comparison on the Alliant FX/8 for TQL2 vs. TREEQL on random matrices.

Figure 3.1: Efficiency of the homotopy method on [1, 2, 1], T2, Random matrices and W_n^+ (n = 500; p = number of processors; vertical axis: efficiency S_p/p).

Figure 3.2: Speed-up of the homotopy method on [1, 2, 1], T2, Random matrices and W_n^+ (n = 500; p = number of processors; vertical axis: speed-up S_p).

PART II

The Homotopy Method for Real Symmetric Generalized Eigenproblem

Chapter 4

Introduction

Consider the symmetric positive semidefinite generalized eigenproblem

$$
Ax = \lambda Bx \tag{4.1}
$$

where A and B are n x n real symmetric matrices and B is positive semidefinite. The pair (A, B) is called the pencil of the eigenproblem (4.1), and an eigenpair of (4.1) is called an eigenpair of the pencil (A, B). When B is singular, (4.1) has fewer than n eigenvalues.

Let XΛX^T be a spectral decomposition of B, where X is an orthogonal matrix and Λ is a diagonal matrix. Then (4.1) is equivalent to Ãy = λΛy with Ã = X^T A X and y = X^T x. Hence, we may assume the matrix B in (4.1) is a diagonal matrix with nonnegative diagonal elements. In this case, the MDR reduction [3] can further reduce A to a symmetric tridiagonal matrix and, through the reduction, keep B as a diagonal matrix. Therefore, we will assume hereafter that, in (4.1), A is a symmetric tridiagonal matrix of the form

$$
A = \begin{pmatrix}
\alpha_1 & \beta_2 & & \\
\beta_2 & \alpha_2 & \ddots & \\
 & \ddots & \ddots & \beta_n \\
 & & \beta_n & \alpha_n
\end{pmatrix} \tag{4.2}
$$

and B = diag(b_1, b_2, ..., b_n) with b_i ≥ 0. If β_i = 0 for some i, 2 ≤ i ≤ n, then R^n can clearly be decomposed into two complementary subspaces invariant under A. Thus the generalized eigenproblem Ax = λBx is decomposed in an obvious way into two smaller subproblems. Hence, we will assume that each β_i ≠ 0. That is, A is unreduced. In this part of the work, we shall describe our homotopy algorithm for solving eigenpairs of the pencil (A, B) in (4.1).
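Before turning to the homotopy itself, the first reduction step described above can be illustrated with a minimal Python/NumPy sketch (not the thesis code, and not the MDR reduction): B is factored as B = XΛX^T and the pencil is transformed by congruence, so that (X^T A X, Λ) has the same eigenvalues, with eigenvectors related by y = X^T x. The function name `diagonalize_B` is ours.

```python
import numpy as np

def diagonalize_B(A, B):
    """Congruence transform of the pencil (A, B), B symmetric positive semidefinite:
        A x = lambda B x   <=>   A_tilde y = lambda Lam y,   y = X^T x,
    where B = X diag(lam) X^T is a spectral decomposition of B and
    A_tilde = X^T A X.  Eigenvectors map back via x = X y."""
    lam, X = np.linalg.eigh(B)              # X orthogonal, lam >= 0 up to roundoff
    A_tilde = X.T @ A @ X
    return A_tilde, np.diag(lam), X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 6
    A = rng.standard_normal((n, n)); A = (A + A.T) / 2
    C = rng.standard_normal((n, n - 2)); B = C @ C.T      # positive semidefinite, rank n - 2
    A_tilde, Lam, X = diagonalize_B(A, B)
    # the two pencils are congruent, hence share the same eigenvalues:
    assert np.allclose(X @ A_tilde @ X.T, A)
    assert np.allclose(X @ Lam @ X.T, B)
```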
Let D be an n x n symmetric tridiagonal matrix and consider the homotopy H : Rn x R x [0, 1] —-§ Rn x R, defined by ABa: — D2: /\B:1: — A1: HWJJ) =(1-t) xTBz—l +t zTBz-l 2 2 ABz - [(1 — t)D + tA]a: zTBa: — 1 (4.3) 2 ABa: — A(t)x = $733: — 1 2 where A(t) = (1 — t)D + tA. The pencil (D, B) is called an initial pencil. In Chapter 5, we shall give a reduction which shows that the eigenvalues of the pencil (A, B) are the same eigenvalues of a pencil (A, B) with an unreduced sym- metric tridiagonal matrix A and a positive definite diagonal matrix B. And, from the reduction, one can easily form the eigenpair (:c, A) of the pencil (A, B) from the eigenpair (y, )1) of the pencil (A, B). Therefore, we shall assume the diagonal elements b,- of B are all positive. In such case, we shall prove, in Chapter 6, that the solution set of H (z, A,t) = 0 in (4.3) consists of disjoint smooth curves, each of which joins an eigenpair of the pencil (D, B) to an eigenpair of the pencil (A, B). We call each 42 of these curves a homotopy curve or an eigenpath. Thus, by following the eigenpaths emanating from the eigenpairs of the pencil (D, B) at t = 0, we can reach all the eigenpairs of the pencil (A, B) at t = 1. The algorithm of following an eigenpath will be given in detail in Chapter 7. Some numerical results along with comparison with the QZ- method will be presented in Chapter 8. It appears that our algorithm is strongly competitive in terms of speed, accuracy and orthogonality and leads in speed in all the cases. 43 Chapter 5 Reduction Let A be an unreduced, symmetric tridiagonal matrix in (4.2) and b = diag(bI, b2,...,b,,) with b: > 0 and b,- _>_ 0 for i = 2,3, ...,n. In this chapter, we give the details of the reduction of the pencil (A, B) to a pencil (A, B), where A is still unreduced tridiagonal and B is positive definite and diagonal. Lemma 5.1 Assume A: ————JEmfi}-“‘ andB=(Bl 0): \ where B1 is a positive definite diagonal matrix with dim(B1) = dim(A1) = m1 and 0 is the zero matrix with dim(0) = dim(A‘,’) = n - m1. If det(A‘,’) 96 0, then the pencil (A, B) has the same eigenvalues as the eigenproblem: (A1 + S)y = A313/, where S is a diagonal matrix. Proof: Clearly, with x = (x1, x2, ...,xn)T, Ax = ABx is equivalent to and {$1\ 3m: —l 0 Km) 0 1 6 ( fim+1xm ) +A‘,’ ) Since det(A‘1’) gé 0, (A‘,’)"1 exists. From (5.2), ( mmx+1 ‘ aim; +2 (2..) = 443-1 Let (w1, wg, ..., wn_,,,,)T be the solution of A? rum (44..-... J 10: K flmi +131»: +1 ) / 4...,“ \ 3m1+2 = AB] 31111—1 (4,.) 0 (1) 0 m then from (5.3) and (5.4), me = —wlflm,+1xm,. Let f 0 then (5.1) can be written as: (A1 + S) { Batu-131711 \ I.” Km) (5.1) (5.2) (5.3) (5.4) (5.5) Clearly A1 + S is an unreduced symmetric tridiagonal matrix and the eigenvalues of (5.5) are exactly the eigenvalues of the pencil (A, B). Furthermore, if x = (x1,x2, ...,xm,)T is the eigenvector of (5.5) corresponding to the eigenvalue A, then x = (x1, x3, ..., xm,, —cwl, ..., —cw,,_,,,,)T, where c = 5",”.le , is the eigenvector of the pencil (A, B) corresponding to the same eigenvalue A. Q.E.D. Lemma 5.2 Assume / . K A l _ _ .1- ' €79.11- .. 19m,“ I Bl A = || A(l). : and B = 0 , ----lgmzfl ---- B2 .. firm-H I I K . A2 1 where B,-,i = 1,2, are positive definite diagonal matrices with dim(B1) = dim(A1) = m; and dim(Bg) = dim(A2) = n — mg. 0 is the zero matrix with dim(0) = dim(A?) = mg - m1. If det(A?) 
rf 0, then the pencil (A, B) has the same eigen- values as the eigenproblem: M“ l l (3‘ ) +3 y=A y, (5.6) A2 82 where S is a symmetric matrix and the matrix A1 +S is unreduced, symmetric and tridiagonal. Proof: Clearly, Ax = ABx can be rewritten as: ( 31 A f 0 l f x; A A1 5 “l" 0 =A31 E , (5.7) K 3m / Kflmmzmm ) K 2..., ) 46 ( firm-Hahn: \ +A‘,’ and Since det(A‘1’) 76 0, from (5.8) (zmfll $m1+2 ting-1 K‘m J { zmfi-l \ f 0 A tun-{’2 + 0 3m: ) Kflmz-Hzmfi-l / { 3mg.” \ ( 17mg.” \ 3m?” = A 32 3m2+2 4M0“ K flmz+13M2+l / =0 (5@ 59) (5m) Let w = (wl, ..., mm.m1 )T and v = (v1, ...,vm,-m,)T be the solutions of (1) (0) 0 0 0 E Alw = , and Alv = , i 0 K 0 l K I J then, from (5.10), flm1+lzm1+l = “wlflrznl-I-lzml _ vlflm1+1fimg+lxm2+l flinz'i-lxma = 'wmg—m1flm14-116m2-I-13m1 "" ”Ma—M1fl72n34-lzmTI-1' Since A? is unreduced, clearly v1 7e 0. A? is symmetric, so is its inverse. Thus, v1 = w,,,,..,,.,. Let S be an (n — mg + m1) x (n — mg + m1) matrix of form 47 K 0 l 0 “ml 53., +1 ”vlfimi +1 fimfi-l "(ml)". raw -v1.3m.+iflm2+1 —”m,—m116,2n,+1 . ...(m1+1)"‘row 0 K 0) Then, (5.7) and (5.9) can be rewritten as: f 271 A ( 1'1 \ A1 + S xi": = A Bl 2m; . (5.11) A2 3771344 32 3m2+1 Clearly, the matrix A1 + S is unreduced, symmetric and tridiagonal and the eigenvalues of (5.11) are the same eigenvalues of the pencil (A, B). Furthermore, if (x1, ..., xm, 2m“, ..., xn)T is the eigenvector of (5.11) correspond- ing to the eigenvalue A, then for e = 19",,“ xm,“ and d = flmfilxmv (x x —w — v d —w c — v .d )T l, 00., mx, [C l ’00., m2_m1 gun—m‘ ,3m2+1,...,$n is the eigenvector of the pencil (A, B) corresponding to the same eigenvalue A. Q.E.D. 48 If (B; A (A1 4: A 01 :1: A‘,’ :1: B: B2 , weletA= . (5.12) ASL, :1: If (B; l (A, at A 01 :1: A‘,’ at B: B2 ,weletA= . (5.13) -. A, 4: K 0,) K * 42) where 05’: are zero matrices with dim(0,-) = dim(A?), and B,-’s arepositive definite diagonal matrices with dim(B.-) = dim(A.-), i = 1,2, ...,r. Theorem 5.1 If det(A?) 96 0, i = l,2,...,r, then the pencil (A, B) has the same eigenvalues as the eigenproblem: ((A1 ) ) (B. ) I » A B E 2 + S y = A 2 . y, (5.14) KK Ar) 1 K Br) where S is a symmetric matrix and the matrix on the left hand side is an unreduced symmetric tridiagonal matrix. Proof: By the same line of argument used in Lemma 5.1 and Lemma 5.2, the proof can be immediately achieved by mathematics induction. Q.E.D. In the following discussion, let (A)1 and (A)1 denote the lower (n — 1) x (n - l) submatrix and the upper (n - 1) x (n — 1) submatrix of A respectively, then (A)1 = ((4)1)1 =((A)‘)1~ 49 Lemma 5.3 Assume A: —————-€"2+3-—-- and B: Bl , 0 where B1 is a positive definite diagonal matrix with dim(B1) = dim(A1) : m1 and 0 is a zero matrix with dim(0) : dim(A‘l’) = n — m1. If det(A‘1’) = 0, then the pencil (A, B) has the same eigenvalues as the eigenproblem: (Amy = M3019, Proof: Clearly, Ax : ABx can be rewritten as ( x; A f 0 A r x; A A, 4., + 0 = AB, 3” (5.15) K 3m; j \ fimj-I-lznu-I-l ) K 3m) J and ( flint-Hz?!“ \ ( $731+! \ 0 + A‘,’ ““7 = 0. ~ (5.16) K o J K .. J Since det(A‘1’) : 0 and (5.16) has a solution, / (fling-Hz?!“ \ \ rank 0 ,A‘,’ = ran/4A?) = n — m1 — 1. 50 This implies flm+1xm : 0, since A? is unreduced tridiagonal. Since 5m,” 74 0, xml : 0. Replacing x,,., by 0 in (5.15), we have (x1)(0) (x1) A1 : + E = AB; 3 . (5.17) 3m1—1 0 31m —1 K 0 l K flmi-Hzmi-H ) K 0 ) From the last row of (5.17), flmlxml-l = —flm1+13m1+1- (5.18) Hence, $1 31 (All! 5 = A(Blh 5 . (5.19) xml-l Jinn—l Clearly, the eigenvalues of (5.19) are the same eigenvalues of the pencil (A, B). 
Moreover, from (5.16) and (5.18),

$$A_1^0\Bigl(-\tfrac{\beta_{m_1}}{\beta_{m_1+1}}x_{m_1-1},\,x_{m_1+2},\,\ldots,\,x_n\Bigr)^T = 0. \qquad (5.20)$$

When (5.19) is solved, $x_{m_1-1}$ is known, so (5.20) has a unique solution since $\operatorname{rank}(A_1^0) = n - m_1 - 1$. Hence if $(x_1,\ldots,x_{m_1-1})^T$ is the eigenvector of (5.19) corresponding to the eigenvalue $\lambda$, then for $c = -\beta_{m_1}/\beta_{m_1+1}$,

$$(x_1,\ldots,x_{m_1-1},\,0,\,c\,x_{m_1-1},\,x_{m_1+2},\ldots,x_n)^T$$

is the eigenvector of the pencil $(A, B)$ corresponding to the same eigenvalue $\lambda$. Q.E.D.

Lemma 5.4 Assume

$$A = \begin{pmatrix} A_1 & * & \\ * & A_1^0 & * \\ & * & A_2\end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} B_1 & & \\ & 0 & \\ & & B_2\end{pmatrix},$$

where $B_i$, $i = 1, 2$, are positive definite diagonal matrices with $\dim(B_1) = \dim(A_1) = m_1$ and $\dim(B_2) = \dim(A_2) = n - m_2$, and $0$ is a zero matrix with $\dim(0) = \dim(A_1^0) = m_2 - m_1$. If $\det(A_1^0) = 0$, then the pencil $(A, B)$ has the same eigenvalues as the eigenproblem

$$\left(\begin{pmatrix}(A_1)^1 & & \\ & 0 & \\ & & (A_2)_1\end{pmatrix} + S\right)y = \lambda\left(\begin{pmatrix}(B_1)^1 & & \\ & 0 & \\ & & (B_2)_1\end{pmatrix} + T\right)y, \qquad (5.21)$$

where $S$ is a symmetric matrix and the matrix on the left-hand side in (5.21) is an unreduced symmetric tridiagonal matrix; $T$ is a diagonal matrix and the matrix on the right-hand side is positive definite and diagonal.

Proof: If $m_2 - m_1 = 1$, i.e., $A_1^0$ is a $1\times 1$ matrix, then $A_1^0 = 0$ since $\det(A_1^0) = 0$. Hence $Ax = \lambda Bx$ can be rewritten as

$$A_1(x_1,\ldots,x_{m_1})^T + (0,\ldots,0,\beta_{m_1+1}x_{m_1+1})^T = \lambda B_1(x_1,\ldots,x_{m_1})^T,$$

$$\beta_{m_1+1}x_{m_1} + \beta_{m_2+1}x_{m_2+1} = 0, \qquad (5.22)$$

and

$$(\beta_{m_2+1}x_{m_2},0,\ldots,0)^T + A_2(x_{m_2+1},\ldots,x_n)^T = \lambda B_2(x_{m_2+1},\ldots,x_n)^T.$$

Notice that $x_{m_2} = x_{m_1+1}$, and that $B_1$ and $B_2$ are positive definite. Using (5.22) to eliminate $x_{m_1}$, the equations above may be rewritten as

$$(A_1)^1(x_1,\ldots,x_{m_1-1})^T + \Bigl(0,\ldots,0,\,-\tfrac{\beta_{m_1}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1}\Bigr)^T = \lambda(B_1)^1(x_1,\ldots,x_{m_1-1})^T,$$

$$\beta_{m_1}x_{m_1-1} - \frac{\alpha_{m_1}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1} + \beta_{m_1+1}x_{m_2} = -\lambda\,\frac{b_{m_1}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1}, \qquad (5.23)$$

$$\beta_{m_2+1}x_{m_2} + \alpha_{m_2+1}x_{m_2+1} + \beta_{m_2+2}x_{m_2+2} = \lambda b_{m_2+1}x_{m_2+1}, \qquad (5.24)$$

and

$$(\beta_{m_2+2}x_{m_2+1},0,\ldots,0)^T + (A_2)_1(x_{m_2+2},\ldots,x_n)^T = \lambda(B_2)_1(x_{m_2+2},\ldots,x_n)^T.$$

Solving for $x_{m_2}$ in (5.23) and substituting it into (5.24) yields

$$-\frac{\beta_{m_1}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_1-1} + \Bigl(\alpha_{m_1}\frac{\beta_{m_2+1}^2}{\beta_{m_1+1}^2} + \alpha_{m_2+1}\Bigr)x_{m_2+1} + \beta_{m_2+2}x_{m_2+2} = \lambda\Bigl(b_{m_2+1} + \frac{\beta_{m_2+1}^2}{\beta_{m_1+1}^2}b_{m_1}\Bigr)x_{m_2+1}.$$

Let $S$ and $T$ be $(n - m_2 + m_1 - 1)\times(n - m_2 + m_1 - 1)$ matrices whose only nonzero entries are

$$S_{m_1-1,m_1} = S_{m_1,m_1-1} = -\frac{\beta_{m_1}\beta_{m_2+1}}{\beta_{m_1+1}},\qquad S_{m_1,m_1} = \alpha_{m_1}\frac{\beta_{m_2+1}^2}{\beta_{m_1+1}^2} + \alpha_{m_2+1},\qquad S_{m_1,m_1+1} = S_{m_1+1,m_1} = \beta_{m_2+2},$$

$$T_{m_1,m_1} = b_{m_2+1} + \frac{\beta_{m_2+1}^2}{\beta_{m_1+1}^2}b_{m_1}.$$

Then

$$\left(\begin{pmatrix}(A_1)^1 & & \\ & 0 & \\ & & (A_2)_1\end{pmatrix} + S\right)(x_1,\ldots,x_{m_1-1},x_{m_2+1},\ldots,x_n)^T = \lambda\left(\begin{pmatrix}(B_1)^1 & & \\ & 0 & \\ & & (B_2)_1\end{pmatrix} + T\right)(x_1,\ldots,x_{m_1-1},x_{m_2+1},\ldots,x_n)^T. \qquad (5.25)$$

Clearly,

$$b_{m_2+1} + \frac{\beta_{m_2+1}^2}{\beta_{m_1+1}^2}b_{m_1} > 0,$$

and the matrix on the left-hand side in (5.25) is unreduced symmetric tridiagonal. The eigenvalues of (5.25) are the same as the eigenvalues of the pencil $(A, B)$.

Furthermore, after solving (5.25), $x_{m_1}$ and $x_{m_2}$ can be obtained from (5.22) and from (5.23). Hence if $(x_1,\ldots,x_{m_1-1},x_{m_2+1},\ldots,x_n)^T$ is the eigenvector of (5.25) corresponding to the eigenvalue $\lambda$, then $(x_1,\ldots,x_{m_1-1},x_{m_1},x_{m_2},x_{m_2+1},\ldots,x_n)^T$ is the eigenvector of the pencil $(A, B)$ corresponding to the same eigenvalue $\lambda$.

If $A_1^0$ is $(m_2 - m_1)\times(m_2 - m_1)$ with $m_2 - m_1 \geq 2$, then $Ax = \lambda Bx$ can be written as

$$(A_1)^1(x_1,\ldots,x_{m_1-1})^T + (0,\ldots,0,\beta_{m_1}x_{m_1})^T = \lambda(B_1)^1(x_1,\ldots,x_{m_1-1})^T, \qquad (5.26)$$

$$\beta_{m_1}x_{m_1-1} + \alpha_{m_1}x_{m_1} + \beta_{m_1+1}x_{m_1+1} = \lambda b_{m_1}x_{m_1}, \qquad (5.27)$$

$$(\beta_{m_1+1}x_{m_1},0,\ldots,0)^T + A_1^0(x_{m_1+1},\ldots,x_{m_2})^T + (0,\ldots,0,\beta_{m_2+1}x_{m_2+1})^T = 0, \qquad (5.28)$$

$$\beta_{m_2+1}x_{m_2} + \alpha_{m_2+1}x_{m_2+1} + \beta_{m_2+2}x_{m_2+2} = \lambda b_{m_2+1}x_{m_2+1}, \qquad (5.29)$$

and

$$(\beta_{m_2+2}x_{m_2+1},0,\ldots,0)^T + (A_2)_1(x_{m_2+2},\ldots,x_n)^T = \lambda(B_2)_1(x_{m_2+2},\ldots,x_n)^T. \qquad (5.30)$$

Since $\operatorname{rank}(A_1^0) = m_2 - m_1 - 1$ and $A_1^0$ is unreduced tridiagonal,

$$A_1^0(1,\,y_1,\,\ldots,\,y_{m_2-m_1-1})^T = 0 \qquad (5.31)$$

has a unique solution.
Multiplying $(1, y_1, \ldots, y_{m_2-m_1-1})$ on the left in (5.28) yields

$$\beta_{m_1+1}x_{m_1} = -y_{m_2-m_1-1}\,\beta_{m_2+1}x_{m_2+1}.$$

Now (5.26) and (5.27) can be rewritten as

$$(A_1)^1(x_1,\ldots,x_{m_1-1})^T + \Bigl(0,\ldots,0,\,-\tfrac{y_{m_2-m_1-1}\beta_{m_1}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1}\Bigr)^T = \lambda(B_1)^1(x_1,\ldots,x_{m_1-1})^T \qquad (5.32)$$

and

$$\beta_{m_1}x_{m_1-1} - \frac{y_{m_2-m_1-1}\,\alpha_{m_1}\,\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1} + \beta_{m_1+1}x_{m_1+1} = -\lambda\,\frac{y_{m_2-m_1-1}\,b_{m_1}\,\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1}. \qquad (5.33)$$

Since $\det(A_1^0) = 0$ and $A_1^0$ is unreduced tridiagonal, $\det((A_1^0)_1) \neq 0$. Therefore

$$(A_1^0)_1\, v = (1,0,\ldots,0)^T \quad\text{and}\quad (A_1^0)_1\, w = (0,\ldots,0,1)^T$$

have solutions. Write $v = (v_1,\ldots,v_{m_2-m_1-1})^T$ and $w = (w_1,\ldots,w_{m_2-m_1-1})^T$. $(A_1^0)_1$ is symmetric, so is its inverse; so $w_1 = v_{m_2-m_1-1}$. From (5.28),

$$x_{m_1+2} = -v_1\beta_{m_1+2}x_{m_1+1} - w_1\beta_{m_2+1}x_{m_2+1} \quad\text{and}\quad x_{m_2} = -w_1\beta_{m_1+2}x_{m_1+1} - w_{m_2-m_1-1}\beta_{m_2+1}x_{m_2+1}. \qquad (5.34)$$

It follows from (5.31) that

$$(A_1^0)_1(y_1,\ldots,y_{m_2-m_1-1})^T = -(\beta_{m_1+2},0,\ldots,0)^T,$$

thus

$$y_{m_2-m_1-1} = -\beta_{m_1+2}v_{m_2-m_1-1} = -\beta_{m_1+2}w_1. \qquad (5.35)$$

Substituting (5.34) into (5.29), we have

$$-w_1\beta_{m_1+2}\beta_{m_2+1}x_{m_1+1} + \bigl(\alpha_{m_2+1} - w_{m_2-m_1-1}\beta_{m_2+1}^2\bigr)x_{m_2+1} + \beta_{m_2+2}x_{m_2+2} = \lambda b_{m_2+1}x_{m_2+1}. \qquad (5.36)$$

Similarly, substituting (5.35) into (5.32) and (5.33) yields

$$\beta_{m_1}x_{m_1-1} + \frac{w_1\alpha_{m_1}\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1} + \beta_{m_1+1}x_{m_1+1} = \lambda\,\frac{w_1 b_{m_1}\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1} \qquad (5.37)$$

and

$$(A_1)^1(x_1,\ldots,x_{m_1-1})^T + \Bigl(0,\ldots,0,\,\tfrac{w_1\beta_{m_1}\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_2+1}\Bigr)^T = \lambda(B_1)^1(x_1,\ldots,x_{m_1-1})^T. \qquad (5.38)$$

We may solve for $x_{m_1+1}$ in (5.37) and substitute it into (5.36), to obtain

$$\frac{w_1\beta_{m_1}\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}x_{m_1-1} + \Bigl(\alpha_{m_2+1} - w_{m_2-m_1-1}\beta_{m_2+1}^2 + \alpha_{m_1}\bigl(\tfrac{w_1\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}\bigr)^2\Bigr)x_{m_2+1} + \beta_{m_2+2}x_{m_2+2} = \lambda\, b\, x_{m_2+1}, \qquad (5.39)$$

where

$$b = b_{m_2+1} + b_{m_1}\Bigl(\frac{w_1\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}\Bigr)^2 > 0.$$

Let $S$ and $T$ be $(n - m_2 + m_1 - 1)\times(n - m_2 + m_1 - 1)$ matrices whose only nonzero entries are

$$S_{m_1-1,m_1} = S_{m_1,m_1-1} = s,\qquad S_{m_1,m_1} = t,\qquad S_{m_1,m_1+1} = S_{m_1+1,m_1} = \beta_{m_2+2},\qquad T_{m_1,m_1} = b,$$

where

$$s = \frac{w_1\beta_{m_1}\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}, \qquad t = \alpha_{m_2+1} - w_{m_2-m_1-1}\beta_{m_2+1}^2 + \alpha_{m_1}\Bigl(\frac{w_1\beta_{m_1+2}\beta_{m_2+1}}{\beta_{m_1+1}}\Bigr)^2.$$

From (5.30), (5.38) and (5.39), we have

$$\left(\begin{pmatrix}(A_1)^1 & & \\ & 0 & \\ & & (A_2)_1\end{pmatrix} + S\right)(x_1,\ldots,x_{m_1-1},x_{m_2+1},\ldots,x_n)^T = \lambda\left(\begin{pmatrix}(B_1)^1 & & \\ & 0 & \\ & & (B_2)_1\end{pmatrix} + T\right)(x_1,\ldots,x_{m_1-1},x_{m_2+1},\ldots,x_n)^T. \qquad (5.40)$$

Obviously, the matrix on the left-hand side in (5.40) is unreduced tridiagonal and the matrix on the right-hand side is positive definite and diagonal. The eigenvalues of (5.40) are the same as the eigenvalues of the pencil $(A, B)$.

Furthermore, after solving (5.40), we can calculate $x_{m_1+1}$ from (5.36), $x_{m_1}$ from (5.27) and $x_{m_1+2},\ldots,x_{m_2}$ from (5.28). Hence, the eigenvector of the pencil $(A, B)$ corresponding to the eigenvalue $\lambda$ can be easily formed from the eigenvector of (5.40) corresponding to the same eigenvalue $\lambda$. Q.E.D.

Theorem 5.2 Assume $\det(A_i^0) = 0$ for $i = 1, 2, \ldots, r$.

(i) If $A$ and $B$ are of the same form as in (5.12), then the pencil $(A, B)$ has the same eigenvalues as the eigenproblem

$$\Bigl(\operatorname{diag}\bigl((A_1)^1,\, 0,\, ((A_2)_1)^1,\, 0,\, \ldots,\, (A_r)_1\bigr) + S\Bigr)v = \lambda\Bigl(\operatorname{diag}\bigl((B_1)^1,\, 0,\, ((B_2)_1)^1,\, 0,\, \ldots,\, (B_r)_1\bigr) + T\Bigr)v, \qquad (5.41)$$

where $S$ is a symmetric matrix and the matrix on the left-hand side is unreduced symmetric tridiagonal, and $T$ is a diagonal matrix and the matrix on the right-hand side is positive definite and diagonal.

(ii) If $A$ and $B$ are of the same form as in (5.13), then the pencil $(A, B)$ has the same eigenvalues as the eigenproblem

$$\Bigl(\operatorname{diag}\bigl((A_1)^1,\, 0,\, ((A_2)_1)^1,\, 0,\, \ldots,\, ((A_r)_1)^1\bigr) + S\Bigr)u = \lambda\Bigl(\operatorname{diag}\bigl((B_1)^1,\, 0,\, ((B_2)_1)^1,\, 0,\, \ldots,\, ((B_r)_1)^1\bigr) + T\Bigr)u, \qquad (5.42)$$

where $S$ is a symmetric matrix and the matrix on the left-hand side in (5.42) is unreduced symmetric tridiagonal, and $T$ is a diagonal matrix and the matrix on the right-hand side is positive definite and diagonal.

Proof: By the same arguments used in Lemma 5.3 and Lemma 5.4, the proof follows easily by mathematical induction. Q.E.D.
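As a small illustration of the reduction (this example is added for exposition and does not appear elsewhere in the text), take $n = 3$, $m_1 = 2$ and

$$A = \begin{pmatrix}2 & 1 & 0\\ 1 & 3 & 1\\ 0 & 1 & 0\end{pmatrix}, \qquad B = \operatorname{diag}(1, 1, 0),$$

so that $A_1 = \begin{pmatrix}2 & 1\\ 1 & 3\end{pmatrix}$, $A_1^0 = (0)$ and $\det(A_1^0) = 0$. Lemma 5.3 says the finite eigenvalues of $(A, B)$ are those of $(A_1)^1 y = \lambda (B_1)^1 y$, i.e. $2y = \lambda y$, giving the single finite eigenvalue $\lambda = 2$; direct expansion gives $\det(A - \lambda B) = \lambda - 2$, which confirms this. The count of Theorem 5.4 (stated next, in Chapter 5) agrees: $f_A = \operatorname{rank}(B) - \sum_i\bigl(1 - \operatorname{sign}(|\det(A_i^0)|)\bigr) = 2 - 1 = 1$. The eigenvector is recovered as in Lemma 5.3 with $c = -\beta_{m_1}/\beta_{m_1+1} = -1$: from $y = (1)$ one obtains $x = (1, 0, -1)^T$, and indeed $Ax = (2, 0, 0)^T = 2Bx$.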
Theorem 5.3 If $A$ is an unreduced symmetric tridiagonal matrix and $B$ is a positive semidefinite diagonal matrix, then the pencil $(A, B)$ can be reduced to a positive definite pencil $(\tilde A, \tilde B)$, where $\tilde A$ is still unreduced symmetric tridiagonal and $\tilde B$ is positive definite diagonal.

Proof: The proof follows immediately from Theorem 5.1 and Theorem 5.2. Q.E.D.

Let $f_A$ denote the number of finite eigenvalues of the pencil $(A, B)$. By Theorem 5.1 and Theorem 5.2, we obtain the following result.

Theorem 5.4

$$f_A = \operatorname{rank}(B) - \sum_{i=1}^{r}\bigl(1 - \operatorname{sign}(|\det(A_i^0)|)\bigr).$$

Since $B$ is diagonal, $\operatorname{rank}(B)$ is just the number of nonzero diagonal elements of $B$. If zero is an eigenvalue of $A_i^0$, then $\operatorname{sign}(|\det(A_i^0)|) = 0$; otherwise $\operatorname{sign}(|\det(A_i^0)|) = 1$. Since $A_i^0$ is unreduced, the Sturm sequence [24, 31] can be used to check whether zero is an eigenvalue of $A_i^0$, so $f_A$ can be easily computed without actually solving the generalized eigenvalue problem.

Chapter 6

Preliminary Analysis

Let $A$ be an $n\times n$ real symmetric tridiagonal matrix and $B$ an $n\times n$ positive definite diagonal matrix of the form

$$A = \begin{pmatrix}\alpha_1 & \beta_2 & & & \\ \beta_2 & \alpha_2 & \beta_3 & & \\ & \beta_3 & \ddots & \ddots & \\ & & \ddots & \alpha_{n-1} & \beta_n \\ & & & \beta_n & \alpha_n\end{pmatrix}, \qquad B = \operatorname{diag}(b_1, b_2, \ldots, b_n), \qquad (6.1)$$

where $b_i > 0$ and $\beta_i \neq 0$ for all $i$.

Our algorithm employs the same 'Divide and Conquer' strategy as in Part I. First of all, the matrix $A$ is divided into two blocks by setting one of the $\beta$'s to zero. Namely, for the initial pencil $(D, B)$ in (4.3), we let

$$D = \begin{pmatrix}D_1 & 0\\ 0 & D_2\end{pmatrix}, \qquad (6.2)$$

where

$$D_1 = \begin{pmatrix}\alpha_1 & \beta_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_k\\ & & \beta_k & \alpha_k\end{pmatrix}, \qquad D_2 = \begin{pmatrix}\alpha_{k+1} & \beta_{k+2} & & \\ \beta_{k+2} & \alpha_{k+2} & \ddots & \\ & \ddots & \ddots & \beta_n\\ & & \beta_n & \alpha_n\end{pmatrix},$$

and then let

$$B = \begin{pmatrix}B_1 & 0\\ 0 & B_2\end{pmatrix} \quad\text{with}\quad \dim(B_i) = \dim(D_i),\ i = 1, 2.$$

We calculate the eigenvalues of the pencils $(D_i, B_i)$, $i = 1, 2$, by using the most efficient algorithm available. Then our algorithm conquers the pencil $(A, B)$ by the homotopy in (4.3).

Theorem 6.1 The solution set of (4.3) consists of disjoint smooth curves, and each curve joins one eigenpair of the pencil $(D, B)$ and one eigenpair of the pencil $(A, B)$.

Proof: Differentiating $H(x, \lambda, t)$ with respect to $(x, \lambda)$ in (4.3) yields

$$H_{(x,\lambda)}(x, \lambda, t) = \begin{pmatrix}\lambda B - A(t) & Bx\\ x^T B & 0\end{pmatrix}.$$

We claim that if $(x, \lambda, t)$ satisfies $H(x, \lambda, t) = 0$, then $H_{(x,\lambda)}(x, \lambda, t)$ is nonsingular. To see this, we show that $H_{(x,\lambda)}(x, \lambda, t)y = 0$ has only the trivial solution. Clearly $\dim(\ker(\lambda B - A(t))) = 1$ and $x^T B x \neq 0$, since for each $t$, $\lambda B - A(t)$ is unreduced tridiagonal and $B$ is positive definite. Suppose $y$ satisfies $H_{(x,\lambda)}y = 0$; write $y = (y_1^T, y_2)^T$, where $y_1 \in \mathbb{R}^n$ and $y_2 \in \mathbb{R}$. Then

$$(\lambda B - A(t))y_1 + y_2 Bx = 0, \qquad (6.3)$$
$$x^T B y_1 = 0. \qquad (6.4)$$

Since $x^T B x \neq 0$ and $x^T(\lambda B - A(t)) = 0$, (6.3) implies $y_2 = 0$. Hence

$$(\lambda B - A(t))y_1 = 0, \quad\text{or}\quad B^{\frac12}\bigl(\lambda I - B^{-\frac12}A(t)B^{-\frac12}\bigr)B^{\frac12}y_1 = 0.$$

That is, $B^{\frac12}y_1 \in \ker(\lambda I - \hat A(t))$ with $\hat A(t) = B^{-\frac12}A(t)B^{-\frac12}$. It is easy to see that $\hat A(t)$ is also unreduced, since $B^{-\frac12}$ is a diagonal matrix with positive diagonal elements. Thus $\dim(\ker(\lambda I - \hat A(t))) = 1$. On the other hand,

$$(\lambda I - \hat A(t))B^{\frac12}x = \bigl(\lambda I - B^{-\frac12}A(t)B^{-\frac12}\bigr)B^{\frac12}x = B^{-\frac12}(\lambda B - A(t))x = 0.$$

Thus $B^{\frac12}x \in \ker(\lambda I - \hat A(t))$, and hence $B^{\frac12}x = cB^{\frac12}y_1$ for a certain nonzero constant $c$. From (6.4),

$$x^T B y_1 = (B^{\frac12}x)^T(B^{\frac12}y_1) = c\,(B^{\frac12}y_1)^T(B^{\frac12}y_1) = 0.$$

Therefore $y_1 = 0$, since $B^{\frac12}$ is positive definite. Hence $y = 0$. By a repeated application of the implicit function theorem, the assertion of the theorem follows. Q.E.D.
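The divide step of (6.2) and the homotopy pencil used in Theorem 6.1 are easy to set up numerically. The following Python sketch is purely illustrative (the function names and the use of numpy/scipy are ours); it assumes the homotopy of (4.3) has the form $A(t) = (1-t)D + tA$ with the normalization $x^T B x = 1$, as used in the proof above.

import numpy as np
from scipy.linalg import eigh  # solves the generalized symmetric-definite problem

n = 8
rng = np.random.default_rng(0)
alpha = rng.random(n)                  # diagonal of A
beta = rng.random(n - 1) + 0.1         # off-diagonal of A (nonzero, so A is unreduced)
b = rng.random(n) + 0.1                # B = diag(b) is positive definite

A = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
B = np.diag(b)

# Divide: zero the coupling beta_{k+1} to split A into the two blocks D_1, D_2.
k = n // 2
D = A.copy()
D[k, k - 1] = D[k - 1, k] = 0.0

# Eigenvalues of the pencils (D_1, B_1) and (D_2, B_2) are the starting points at t = 0.
start_vals = np.concatenate([eigh(D[:k, :k], B[:k, :k], eigvals_only=True),
                             eigh(D[k:, k:], B[k:, k:], eigvals_only=True)])

def A_of_t(t):
    """Homotopy matrix A(t) = (1 - t) D + t A (assumed form of (4.3))."""
    return (1.0 - t) * D + t * A

def H(x, lam, t):
    """Residual of the homotopy: (A(t) x - lam B x, x^T B x - 1)."""
    return np.concatenate([A_of_t(t) @ x - lam * (B @ x), [x @ (B @ x) - 1.0]])

print("eigenvalues of (D, B):", np.sort(start_vals))
print("eigenvalues of (A, B):", eigh(A, B, eigvals_only=True))

Each starting eigenvalue at $t = 0$ is then continued to $t = 1$ along its own curve, which is what Theorems 6.2 and 6.3 below make precise.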
Let $\xi_i$ and $\lambda_i(t)$ be the eigenvalues of the pencils $(D, B)$ and $(A(t), B)$, respectively. Then the following theorems follow immediately from the results we proved in Part I.

Theorem 6.2 If the eigenvalues of $(D, B)$ are distinct, then

(i) either $\lambda_i(t)$ is constant for all $t$ in $[0, 1]$ or strictly monotonic;

(ii) $\dot\lambda_i(t)\,\ddot\lambda_i(t) > 0$ for $t$ small, if $\dot\lambda_i(t) \not\equiv 0$.

Theorem 6.3 For any $t \in [0, 1]$,

$$\xi_{i-1} \leq \lambda_i(t) \leq \xi_{i+1}, \quad i = 2, 3, \ldots, n-1, \qquad \lambda_1(t) \leq \xi_2, \qquad \lambda_n(t) \geq \xi_{n-1}.$$

From Theorem 6.2 and Theorem 6.3, every homotopy curve must be one of those in Figure 6.1. Each homotopy curve is bounded by two consecutive dotted lines, and no homotopy curve can cross any dotted line.

Figure 6.1: The generalized homotopy curves ($\lambda$ versus $t$; the curves defined by $H(x, \lambda, t) = 0$ run from the eigenvalues of $(D, B)$ at $t = 0$ to the eigenvalues of $(A, B)$ at $t = 1$).

Chapter 7

The Homotopy Algorithms for Generalized Eigenproblem

Our algorithm for finding all the eigenpairs of the pencil $(A, B)$, with unreduced symmetric tridiagonal matrix $A$ and positive semidefinite diagonal matrix $B$, is based on the following steps:

(i) Reduction
(ii) Initiating at $t = 0$
(iii) Prediction
(iv) Correction
(v) Checking
(vi) Detection of a cluster
(vii) Step-size selection
(viii) Terminating at $t = 1$
(ix) Forming the eigenpairs of the pencil $(A, B)$.

7.1 Reduction

If $B$ is positive semidefinite, the pencil $(A, B)$ will be reduced to a pencil $(\tilde A, \tilde B)$ with $\tilde B$ positive definite by the reduction described in Chapter 5.

7.2 Initiating at t = 0

As we mentioned in Part I, by making the initial matrix $D$ in (6.2) close to $A$, the eigenpairs of the pencil $(D, B)$ become excellent starting points. We intend to choose $k$ for which $\beta_{k+1}$ is as small as possible. To make the sizes of the blocks $D_1$ and $D_2$ roughly the same, we limit the choice of $k$ to the range $n/2 - j \leq k \leq n/2 + j$, where $j$ is roughly equal to $n/20$, and find the smallest $\beta_{k+1}$ by local sorting. When the initial matrix $D$ is decided, we calculate the eigenvalues of the pencils $(D_1, B_1)$ and $(D_2, B_2)$ by using the most efficient method available. We only require the accuracy to stay within one-half or even one-third of the working precision. With this strategy, a considerable amount of computing time is saved.

7.3 Prediction

Assume that after $i$ steps the approximate value $(\tilde x(t_i), \tilde\lambda(t_i))$ on the eigenpath $(x(t), \lambda(t))$ at $t_i$ is known and the next step size $h$ is determined; that is, $t_{i+1} = t_i + h$. We want to find an approximate value $(x^0(t_{i+1}), \lambda^0(t_{i+1}))$ of $(x(t_{i+1}), \lambda(t_{i+1}))$ on the eigenpath at $t_{i+1}$. Notice that $(x^0(t_{i+1}), \lambda^0(t_{i+1}))$ is an approximate eigenpair of the pencil $(A(t_{i+1}), B)$. Since $H(x(t), \lambda(t), t) = 0$, we have

$$A(t)x(t) = \lambda(t)Bx(t), \qquad x(t)^T B x(t) = 1.$$

Differentiating both equations with respect to $t$ yields

$$(A - D)x(t) + A(t)\dot x(t) = \dot\lambda(t)Bx(t) + \lambda(t)B\dot x(t), \qquad x(t)^T B \dot x(t) = 0. \qquad (7.1)$$

For $t = t_i$, multiplying (7.1) on the left by $x^T(t_i)$ yields

$$\dot\lambda(t_i) = x^T(t_i)(A - D)x(t_i) = 2\beta_{k+1}x_k(t_i)x_{k+1}(t_i), \qquad (7.2)$$

where $x(t_i) = (x_1(t_i), \ldots, x_n(t_i))^T$. In view of (7.2), we use the Euler predictor to predict the eigenvalue at $t_{i+1}$, namely

$$\lambda^0(t_{i+1}) = \tilde\lambda(t_i) + \dot\lambda(t_i)\,h.$$

It is easy to see that $\dot\lambda(0) = 0$ in (7.2). Consequently, $\lambda^0(t_1)$ always equals $\tilde\lambda(0)$.

To predict the eigenvector, we use inverse iteration on $(A(t_{i+1}), B)$ with shift $\lambda^0(t_{i+1})$. That is, we solve

$$\bigl(A(t_{i+1}) - \lambda^0(t_{i+1})B\bigr)\,y^0(t_{i+1}) = \tilde x(t_i)$$

and let

$$x^0(t_{i+1}) = \frac{y^0(t_{i+1})}{\bigl(y^0(t_{i+1})^T B\, y^0(t_{i+1})\bigr)^{1/2}}.$$

At $t_i = 0$, since we skip the calculation of the eigenvectors of the pencil $(D, B)$, $x(0)$ is not available; we choose a random vector for $x(0)$.
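A rough sketch of one prediction step follows, in the notation of the sketch after Theorem 6.1 (function and variable names are ours). It assumes $A(t) = (1-t)D + tA$, that $\beta_{k+1} = A_{k+1,k}$ is the coupling zeroed in (6.2), and that vectors are normalized so that $x^T B x = 1$.

import numpy as np

def predict(A, D, B, k, x_prev, lam_prev, t_next, h):
    """One prediction step of Section 7.3 (sketch).

    x_prev, lam_prev : approximate eigenpair of (A(t_i), B) with x_prev^T B x_prev = 1
    t_next = t_i + h : target value of the homotopy parameter
    """
    beta_k1 = A[k, k - 1]                      # beta_{k+1} in 1-based notation
    # Euler predictor, using (7.2): lambda'(t_i) = 2 beta_{k+1} x_k(t_i) x_{k+1}(t_i)
    lam_dot = 2.0 * beta_k1 * x_prev[k - 1] * x_prev[k]
    lam0 = lam_prev + lam_dot * h

    # One step of inverse iteration on (A(t_{i+1}), B) with shift lambda^0
    A_next = (1.0 - t_next) * D + t_next * A
    y = np.linalg.solve(A_next - lam0 * B, x_prev)
    x0 = y / np.sqrt(y @ (B @ y))              # enforce x^T B x = 1
    return x0, lam0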
7.4 Correction

When $B$ is positive definite, the generalized Rayleigh quotient $\rho(u) = (u^T A u)/(u^T B u)$ enjoys the following properties [24]:

(i) Stationarity. Since

$$\operatorname{grad}\rho(u) \equiv \nabla\rho(u) = \frac{2\,(Au - \rho(u)Bu)^T}{u^T B u},$$

$\rho$ is stationary at the eigenvectors of the pencil $(A, B)$.

(ii) Minimum residual.

$$\|(A - \sigma B)u\|_{B^{-1}}^2 \geq \|Au\|_{B^{-1}}^2 - |\rho(u)|^2\,\|Bu\|_{B^{-1}}^2,$$

where $\|x\|_{B^{-1}}^2 = x^T B^{-1} x$, with equality if and only if $\sigma = \rho(u)$.

(iii) Monotonicity.

$$\|(A - \rho_{k+1}B)x_{k+1}\|_{B^{-1}} \leq \|(A - \rho_k B)x_k\|_{B^{-1}} \quad\text{for all } k.$$

(iv) Cubic convergence.

Therefore, we use the generalized RQI as a corrector, starting with $x^0(t_{i+1})$. To be more precise, at $(x^{j-1}(t_{i+1}), \lambda^{j-1}(t_{i+1}))$ ($j \geq 1$) let

$$\lambda^j(t_{i+1}) = x^{j-1}(t_{i+1})^T A(t_{i+1})\, x^{j-1}(t_{i+1}),$$

then solve

$$\bigl(A(t_{i+1}) - \lambda^j(t_{i+1})B\bigr)\,y^j(t_{i+1}) = x^{j-1}(t_{i+1})$$

and let

$$x^j(t_{i+1}) = \frac{y^j(t_{i+1})}{\bigl(y^j(t_{i+1})^T B\, y^j(t_{i+1})\bigr)^{1/2}}.$$

We repeat the above process to within half of the working precision if single precision is used, and one-third of the working precision if double precision is used, when $t_{i+1} < 1$, since precision in determining the curve itself is only of secondary interest. We polish $(x^j(t_{i+1}), \lambda^j(t_{i+1}))$ at the end of the path ($t_{i+1} = 1$) by iterating the Rayleigh quotient to machine precision. The stopping point $(x^j(t_{i+1}), \lambda^j(t_{i+1}))$ of RQI is taken as the approximate eigenpair $(\tilde x(t_{i+1}), \tilde\lambda(t_{i+1}))$ of the pencil $(A(t_{i+1}), B)$. The cubic convergence rate of RQI makes the corrector highly efficient.
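The corrector can be sketched as follows. This is not the thesis code: the fixed tolerance stands in for the "half / one-third of the working precision" rule above, and the normalization $x^T B x = 1$ is assumed throughout.

import numpy as np

def generalized_rqi(A_t, B, x0, tol=1e-8, max_iter=20):
    """Generalized Rayleigh quotient iteration on the pencil (A(t), B) (Section 7.4, sketch)."""
    x = x0 / np.sqrt(x0 @ (B @ x0))          # start with x^T B x = 1
    lam = x @ (A_t @ x)                       # Rayleigh quotient, since x^T B x = 1
    for _ in range(max_iter):
        try:
            y = np.linalg.solve(A_t - lam * B, x)
        except np.linalg.LinAlgError:         # the shift hit an eigenvalue exactly
            break
        x = y / np.sqrt(y @ (B @ y))          # renormalize so that x^T B x = 1
        lam_new = x @ (A_t @ x)
        if abs(lam_new - lam) <= tol * max(1.0, abs(lam_new)):
            lam = lam_new
            break
        lam = lam_new
    return x, lam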
7.5 Checking

For

$$A = \begin{pmatrix}\alpha_1 & \beta_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_n\\ & & \beta_n & \alpha_n\end{pmatrix} \quad\text{and}\quad B = \operatorname{diag}(b_1, \ldots, b_n)$$

with nonzero $\beta$'s and $b_i > 0$, the polynomials defined by

$$p_0(\lambda) = 1,\qquad p_1(\lambda) = \alpha_1 - \lambda b_1,$$
$$p_r(\lambda) = (\alpha_r - \lambda b_r)\,p_{r-1}(\lambda) - \beta_r^2\, p_{r-2}(\lambda), \quad r = 2, 3, \ldots, n,$$

form a generalized Sturm sequence. Thus, the number of eigenvalues of $(A, B)$ strictly greater than $\lambda$ is equal to the number of agreements in sign of consecutive members of the Sturm sequence, with the convention that if $p_r(\lambda) = 0$, then $p_r(\lambda)$ is taken to have the opposite sign of $p_{r-1}(\lambda)$.

When $(\tilde x(t_{i+1}), \tilde\lambda(t_{i+1}))$ is taken as an approximate eigenpair of the pencil $(A(t_{i+1}), B)$, the generalized Sturm sequence at $\tilde\lambda(t_{i+1}) + \epsilon$ is computed to check that, if we are trying to follow the curve corresponding to the $j$-th largest eigenvalue, we are still on that curve. Here, $\epsilon$ is chosen as half of the working precision if single precision is used and one-third of the working precision if double precision is used. If the check fails, we reduce the step size to $h/2$ and repeat the whole process once again, beginning with the eigenvalue prediction in Section 7.3.
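A sketch of the generalized Sturm count used in this checking step (and again in Section 7.8) is given below; it counts agreements in sign of consecutive members of the sequence, with the zero-sign convention above. The recurrence is left unscaled, so a production version would have to guard against overflow of the $p_r$.

def count_greater(alpha, beta, b, lam):
    """Number of eigenvalues of the pencil (A, B) strictly greater than lam (sketch).

    alpha[i] = alpha_{i+1}, beta[i] = beta_{i+2}, b[i] = b_{i+1} in the notation of (6.1).
    """
    n = len(alpha)
    signs = [1]                                    # sign of p_0 = 1
    p_prev, p_curr = 1.0, alpha[0] - lam * b[0]    # p_0, p_1
    signs.append(1 if p_curr > 0 else (-1 if p_curr < 0 else -signs[-1]))
    for r in range(2, n + 1):
        p_next = (alpha[r - 1] - lam * b[r - 1]) * p_curr - beta[r - 2] ** 2 * p_prev
        signs.append(1 if p_next > 0 else (-1 if p_next < 0 else -signs[-1]))
        p_prev, p_curr = p_curr, p_next
    # agreements in sign of consecutive members = eigenvalues strictly greater than lam
    return sum(1 for s0, s1 in zip(signs, signs[1:]) if s0 == s1)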
7.6 Detection of a cluster

At $t_i = 0$, when all the eigenvalues of the pencil $(D, B)$,

$$\lambda_1(0) < \lambda_2(0) < \cdots < \lambda_n(0),$$

are available, we let $\delta = \max(10^{-5}, 10^{-2}(\lambda_n(0) - \lambda_1(0))/n)$ if double precision is used (or $\delta = \max(10^{-3}, 10^{-2}(\lambda_n(0) - \lambda_1(0))/n)$ if single precision is used). Set $\lambda_i$ and $\lambda_j$ in the same group if $|\lambda_i(0) - \lambda_j(0)| < \delta$. If the number of eigenvalues in any group is bigger than 1, then a cluster is detected.

At $t_i \neq 0$ or $1$, when $(\tilde x(t_i), \tilde\lambda(t_i))$ is taken as an approximate eigenpair of the pencil $(A(t_i), B)$, after the checking step in Section 7.5 we compute the Sturm sequences at $\tilde\lambda(t_i) \pm \delta$ to find the number of eigenvalues of $(A(t_i), B)$ in the interval $(\tilde\lambda(t_i) - \delta, \tilde\lambda(t_i) + \delta)$. When this number is bigger than 1, a cluster of eigenvalues of $(A(t_i), B)$ is detected.

In those cases, the corresponding eigenvectors are ill-conditioned, and such ill-conditioning can cause inefficiency of the algorithm. We simply give up following the eigenpath, and the corresponding eigenpair of the pencil $(A, B)$ will be calculated at the end of the algorithm (see Section 7.8).

7.7 Step size selection

In the first attempt, we always choose the step size $h = 1 - t_i$ at $t_i < 1$. If after the prediction and correction steps the checking step fails, we reduce the step size to $h/2$ as mentioned in Section 7.5. Since the initial pencil $(D, B)$ is chosen to be so close to $(A, B)$, in our experience most of the eigenpairs of the pencil $(A, B)$ can be reached in one step, i.e., $h = 1$. A very small step size can also cause inefficiency of the algorithm. Therefore, we impose a minimum $\gamma$ on the step size $h$. If $h < \gamma$, we simply give up following the eigenpath and the corresponding eigenpair of the pencil $(A, B)$ will be calculated at the end of the algorithm (see Section 7.8). We usually choose $\gamma$ as 0.25.

7.8 Terminating at t = 1

At $t = 1$, when an approximate eigenvalue $\tilde\lambda(1)$ is reached, we compute the Sturm sequence at $\tilde\lambda(1) + \epsilon$, with $\epsilon$ equal to the machine precision, to ensure the correct order. If the checking fails, we have jumped onto a wrong eigenpath. More precisely, suppose we are following the $i$-th eigenpair and the checking algorithm detects that we have reached the $j$-th eigenpair instead. In this situation, we will save the $j$-th eigenpair before the step size is cut. By saving the $j$-th eigenpair, the computation of following the $j$-th eigenpath is no longer needed.

As mentioned in Sections 7.6 and 7.7, we may give up following some eigenpaths to avoid adopting a step size that is too small or the situation when a cluster is encountered. Without any extra computation, we know exactly which eigenpairs are lost at $t = 1$. In order to find these eigenpairs, we first use bisection to find the eigenvalues up to half the working precision and then use inverse iteration and the RQI. If there is a cluster, then we do bisection to find the eigenvalues up to machine precision, then use inverse iteration to find the corresponding eigenvectors. In this case, to guarantee orthogonality, we orthonormalize the eigenvectors belonging to the same cluster while we are using the inverse iteration.

7.9 Forming the eigenpairs of (A, B)

If $B$ is positive semidefinite, the pencil $(A, B)$ was reduced to a pencil $(\tilde A, \tilde B)$ with $\tilde B$ positive definite. From Theorem 5.1 and Theorem 5.2, the eigenvalues of the pencil $(\tilde A, \tilde B)$ are the same as the eigenvalues of the pencil $(A, B)$. Although the eigenvectors are different, the eigenvectors of the pencil $(A, B)$ can be easily formed from those of the pencil $(\tilde A, \tilde B)$ with few computations. The formulas are given in Chapter 5.

Chapter 8

Numerical Results of the Homotopy Method for Symmetric Generalized Eigenproblem

We shall show the computational results comparing the homotopy continuation method GHOMO with the QZ method. The computations were done on a Sun SPARCstation 1.

Our testing examples consist of the following types of pencils $(A, B)$:

Type 1. $A$ is an unreduced symmetric tridiagonal matrix with both diagonal and off-diagonal elements being uniformly distributed random numbers between 0 and 1. $B$ is a diagonal matrix with the first $n/2$ diagonal elements being uniformly distributed random numbers between 0 and 1, and the last $n/2$ being zeros.

Type 2. $A$ is an unreduced symmetric tridiagonal matrix with both diagonal and off-diagonal elements being uniformly distributed random numbers between 0 and 1. $B$ is a diagonal matrix with the first $3n/10$ and the last $3n/10$ diagonal elements being uniformly distributed random numbers between 0 and 1, and the rest being zeros.

Type 3. $A$ is the Toeplitz matrix $[1, 2, 1]$. $B$ is a diagonal matrix with the first $n/2$ diagonal elements being 1, and the rest being zeros.

Type 4. $A$ is the Toeplitz matrix $[1, 2, 1]$. $B$ is a diagonal matrix with the first $3n/10$ and the last $3n/10$ diagonal elements being 1, and the rest being zeros.

Type 5. $A$ is an unreduced symmetric tridiagonal matrix with both diagonal and off-diagonal elements being uniformly distributed random numbers between 0 and 1. $B$ is a diagonal matrix with all diagonal elements being random numbers between 0 and 1.
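For concreteness, a Type 1 test pencil of order $n$ could be generated as in the following sketch (the thesis itself does not list code, and the generator name is ours):

import numpy as np

def type1_pencil(n, rng=None):
    """Type 1 test pencil (A, B): random unreduced tridiagonal A; B diagonal with
    the first n/2 entries uniform in (0, 1) and the remaining entries zero."""
    rng = np.random.default_rng() if rng is None else rng
    alpha = rng.random(n)                 # diagonal entries of A
    beta = rng.random(n - 1)              # off-diagonal entries of A (zero has probability 0)
    A = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    b = np.concatenate([rng.random(n // 2), np.zeros(n - n // 2)])
    return A, np.diag(b)

The other types are obtained analogously, replacing the random tridiagonal part by the Toeplitz matrix $[1, 2, 1]$ and changing which diagonal entries of $B$ are zero.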
Table 8.1 shows the comparison in terms of speed with the QZ method. Table 8.2 shows the accuracy and orthogonality of the homotopy method. The homotopy method appears to be strongly competitive and leads in speed by a considerable margin in comparison with the QZ method in all the cases.

Table 8.1: Execution Time (second) of computed eigenpairs of generalized eigenproblems.

Matrix    Order N    GHOMO       QZ        Ratio (QZ/GHOMO)
Type 1       50       0.23       6.49        28.21
            100       0.75      48.20        64.26
            200       2.85     328.19       115.15
            400      12.39    2546.06       205.49
Type 2       50       0.28       6.23        22.25
            100       1.14      45.67        40.06
            200       4.86     321.86        66.22
            400      15.24    1802.32       118.26
Type 3       50       0.31       6.15        19.83
            100       1.60      46.96        29.35
            200       6.46     352.67        54.59
            400      23.68    2796.63       118.10
Type 4       50       0.52       5.82        10.15
            100       2.18      44.20        20.27
            200       8.80     336.50        38.23
            400      34.84    2694.60        77.34
Type 5       50       0.72      11.17        15.51
            100       2.81      74.66        26.56
            200      12.74     488.53        38.34
            400      47.70    3866.14        81.05

Table 8.2: The residual and orthonormality of computed eigenvectors of generalized eigenproblems.

Matrix    Order N    max_i ||Ax_i - lambda_i B x_i||_2 / ||A||    max_{i,j} |(X^T B X - I)_{ij}|
Type 1       50       3.9107905787880D-16       2.8617639677154D-16
            100       3.4241406339558D-16       2.6417938095675D-15
            200       4.3779590340543D-16       2.0037668814815D-15
            400       5.2265929394497D-16       5.1552956553653D-15
Type 2       50       3.4923828496565D-16       1.3678005594614D-15
            100       4.1134132219334D-16       1.0126081452260D-15
            200       5.1246002221589D-16       1.8957582132953D-13
            400       4.9856607322645D-16       1.0934604011632D-14
Type 3       50       1.1454950034247D-16       3.4233801826002D-16
            100       1.6564408035527D-16       1.1470895221836D-15
            200       2.1624971847800D-16       3.6096263986950D-15
            400       2.4734721400459D-16       2.1752300627827D-14
Type 4       50       1.0336086782520D-16       2.4870346559132D-16
            100       2.0867616866469D-16       2.3841690810960D-15
            200       2.1954327917052D-16       6.4657627842233D-15
            400       2.9860202732850D-16       2.6556085482256D-14
Type 5       50       2.1611472844832D-16       2.2100520796971D-15
            100       4.3084863486435D-16       1.7434653812219D-15
            200       4.6005546307787D-16       5.2256957128428D-15
            400       6.0472727089284D-16       2.1235908790871D-14

Bibliography

[1] R. Barlow and D. Evans, A parallel organization of the bisection algorithm, The Computer Journal, 22 (1977), pp. 267-269.

[2] H. Bernstein and M. Goldstein, Parallel implementation of bisection for the calculation of eigenvalues of tridiagonal symmetric matrices, Computing, 37 (1986), pp. 85-91.

[3] A. Bunse-Gerstner, An algorithm for the symmetric generalized eigenvalue problem, Linear Algebra Appl., 58 (1984), pp. 43-68.

[4] H. Bowdler, R. Martin, and J. Wilkinson, The QR and QL algorithms for symmetric matrices, Numer. Math., 11 (1968), pp. 227-240.

[5] M. T. Chu, A simple application of the homotopy method to symmetric eigenvalue problems, Linear Algebra Appl., 59 (1984), pp. 85-90.

[6] M. T. Chu, A note on the homotopy method for linear algebraic eigenvalue problems, Linear Algebra Appl., 105 (1988), pp. 225-236.

[7] M. T. Chu, T. Y. Li and T. Sauer, Homotopy method for general lambda-matrix problems, SIAM J. Matrix Anal. Appl., 9 (1988), pp. 528-536.

[8] J. J. M. Cuppen, A divide and conquer method for the symmetric tridiagonal eigenproblem, Numer. Math., 36 (1981), pp. 177-195.

[9] J. J. Dongarra and D. C. Sorensen, A fully parallel algorithm for symmetric eigenvalue problems, SIAM J. Sci. Stat. Comput., 8 (1987), pp. 139-154.

[10] G. Fix and R. Heiberger, An algorithm for the ill-conditioned generalized eigenvalue problem, SIAM J. Numer. Anal., 9 (1972), pp. 78-88.

[11] F. Rellich, Perturbation Theory of Eigenvalue Problems, Gordon and Breach Science Publishers, New York, NY, 1969.

[12] G. Golub and C. Van Loan, Matrix Computations, The Johns Hopkins Press, Baltimore, MD, 1983.

[13] R. O. Hill, Interlacing of eigenvalues of symmetric, tridiagonal matrices, preprint.

[14] I. Ipsen and E. Jessup, Solving the symmetric tridiagonal eigenvalue problem on the hypercube, SIAM J. Sci. Stat. Comput., 11 (1990), pp. 203-229.

[15] E. Jessup, Parallel Solution of the Symmetric Tridiagonal Eigenproblem, Ph.D. Thesis, Yale University, 1989.

[16] K. Li and T. Y. Li, An algorithm for symmetric tridiagonal eigenproblems -- divide and conquer with homotopy continuation, preprint.

[17] T. Y. Li, The homotopy method for solving real symmetric generalized eigenvalue problem, preprint.

[18] T. Y. Li and N. H. Rhee, Homotopy algorithm for symmetric eigenvalue problems, Numer. Math., 55 (1989), pp. 256-280.

[19] T. Y. Li and T. Sauer, Homotopy method for generalized eigenvalue problems, Linear Algebra Appl., 91 (1987), pp. 65-74.

[20] T. Y. Li, T. Sauer and J. Yorke, Numerical solution of a class of deficient polynomial systems, SIAM J. Numer. Anal., 24 (1987), pp. 435-451.

[21] T. Y. Li, Z. Zeng, and L. Cong, Solving eigenvalue problems of real nonsymmetric matrices with real homotopy, preprint.

[22] T. Y. Li, H. Zhang and X. H. Sun, Parallel homotopy algorithm for symmetric tridiagonal eigenvalue problem, to appear in SIAM J. Sci. Stat. Comput.

[23] S. S. Lo, B. Philippe and A. Sameh, A multiprocessor algorithm for the symmetric tridiagonal eigenvalue problem, SIAM J. Sci. Stat. Comput., 8 (1987), pp. 155-165.

[24] B. N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, Englewood Cliffs, NJ, 1980.

[25] A. Sameh and D. Kuck, A parallel QR algorithm for symmetric tridiagonal matrices, IEEE Trans. Computers, C-26 (1977), pp. 147-153.

[26] G. Peters and J. H. Wilkinson, Ax = lambda Bx and the generalized eigenproblem, SIAM J. Numer. Anal., 7 (1970), pp. 479-492.

[27] B. T. Smith et al., Matrix Eigensystem Routines -- EISPACK Guide, 2nd edition, Springer-Verlag, New York, 1976.

[28] D. C. Sorensen and P. Tang, On the orthogonality of eigenvectors computed by divide-and-conquer techniques, preprint.

[29] G. W. Stewart, Introduction to Matrix Computations, Academic Press, New York, 1973.

[30] J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, Springer-Verlag, 1980.

[31] J. H. Wilkinson, The Algebraic Eigenvalue Problem, Oxford University Press, New York, 1965.

[32] J. H. Wilkinson and C. Reinsch, Handbook for Automatic Computation: Linear Algebra, Springer-Verlag, 1971.