HOMOTOPY-DETERMINANT METHOD FOR SOLVING MATRIX EIGENVALUE PROBLEMS AND ITS PARALLELIZATION

By

Zhonggang Zeng

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Mathematics

1991

ABSTRACT

HOMOTOPY-DETERMINANT METHOD FOR SOLVING MATRIX EIGENVALUE PROBLEMS AND ITS PARALLELIZATION

By Zhonggang Zeng

With an efficient method for determinant evaluation, the homotopy-determinant method for matrix eigenvalue problems is developed in this thesis. Unlike traditional homotopy continuation methods, the homotopy-determinant method calculates eigenvalues without computing their corresponding eigenvectors.

For general real matrices, the homotopy-determinant method follows eigenvalue paths of a real homotopy that lead to all eigenvalues. The regularity of the homotopy is established to the necessary extent. The inevitable bifurcations and possible path jumping are handled by effective processes.

The parallelization of the method is one of its most important features: the method is shown to be suitable for various parallel architectures. For nonsymmetric matrices of order n, the method achieves time complexity O(n) using n² processors. This running time is better than that of any other known parallel algorithm.

Numerical results of the algorithm and a comparison with its counterparts in EISPACK are presented. They clearly demonstrate the efficiency of the method.

To my wife and daughter

ACKNOWLEDGMENTS

I would like to thank Professor Tien-Yien Li, my dissertation advisor, for his constant encouragement and support during my graduate study at Michigan State University. I would also like to thank him for suggesting the problem and for the helpful directions which made this work possible.

I would like to thank Professor Lei Jingan, my advisor at Wuhan University in China during 1982-1984, for his valuable guidance, which laid the necessary foundation for my research career in numerical analysis and scientific computing. I would also like to thank Professor Wang Zeke of Zhongshan University in China for inspiring my interest in homotopy methods and for our cooperation during 1982-1986, which eventually led to this work.

I would like to thank my dissertation committee members, Professor Richard Hill, Professor Charles R. MacCluer, Professor Milan Miklavcic, and Professor David H. Y. Yen, for their valuable suggestions and their time.

I would like to thank my fellow graduate students Luan Cong, Liangjiao Huang, Kaiyuan Li and Xiaoshen Wang for years of friendship and collaboration.
Contents

0 Introduction
1 Homotopy-determinant algorithm
  1.1 The homotopy
  1.2 The choice of the initial matrix D
  1.3 The algorithm
  1.4 Hyman's method
2 Regularity and Bifurcation Analysis
  2.1 Preliminary results
  2.2 Regularity at the initial level t = 0
  2.3 Regularity of the real eigenvalue paths
  2.4 Regularity of the complex eigenvalue paths
  2.5 Bifurcation analysis
3 Evaluation of the eigenvalues
  3.1 Following the real eigenvalue paths
  3.2 Following the complex eigenvalue paths
  3.3 Bifurcation
    3.3.1 Real-to-complex bifurcation
    3.3.2 Complex-to-real bifurcation
  3.4 The control of stepsizes
  3.5 Numerical Results
4 Evaluation of the eigenvectors
  4.1 Eigenvector algorithm
  4.2 Numerical Results
5 Parallelization and Vectorization
  5.1 Parallelization
  5.2 Vectorization
  5.3 Numerical Results
References

List of Tables

1 The execution time comparison between the homotopy-determinant algorithm and HQR in EISPACK
2 The rates of "easy paths" and occurrence of bifurcations
3 The performance of the eigenvector algorithm and the comparison with INVIT in EISPACK
4 The effect of the magnitude of the subdiagonal entries upon the eigenvector evaluation
5 Speed-ups in parallel computations

List of Figures

1 The regularity of real eigenvalue paths
2 The properties of the bifurcation
3 Examples of bifurcations
4 Properties of real eigenvalue paths
5 Corrections on real eigenvalue paths
6 Options for the complex correction
7 Real-to-complex bifurcation algorithm
8 Complex-to-real bifurcation algorithm

0 Introduction

In this work, we deal with the eigenvalue problem of an n × n real matrix A:

$$ Ax - \lambda x = 0, \qquad \lambda \in \mathbf{C}, \quad x \in \mathbf{C}^n. \tag{1} $$

The matrix eigenvalue problem is one of the most fundamental problems in computational mathematics and is currently under intensive study by many numerical analysts. Some successful algorithms, such as the QR algorithm, have been well developed and implemented in standard software packages such as EISPACK and the upcoming LAPACK. The parallelization and vectorization of these methods also attract considerable attention.

Recently, remarkable numerical results have been obtained using homotopy continuation algorithms on eigenvalue problems [5, 8, 11]. The theoretical version of the homotopy method for algebraic eigenvalue problems has been studied in [1, 2, 6, 7]. The idea of the homotopy continuation method is as follows. Let D be an n × n matrix whose eigenvalues are known or easy to find. Then n eigenvalue paths are generated by the parameterized matrix A(t) = (1 − t)D + tA as t runs from 0 to 1. These paths connect the known eigenvalues of D to the target eigenvalues of A (because of the continuity of eigenvalues with respect to the matrix entries). The homotopy algorithm is designed to follow these eigenvalue paths.
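Before turning to the details, the following small numpy script illustrates the continuity idea numerically: the eigenvalues of A(t) drift smoothly from the spectrum of D to the spectrum of A. The matrices and the grid of t-values here are arbitrary choices for illustration only, not part of the method.

```python
import numpy as np

# Track the eigenvalues of A(t) = (1 - t)D + tA on a grid of t-values and
# watch them move continuously from the spectrum of D to the spectrum of A.
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
D = np.triu(A)          # eigenvalues of this D are its diagonal: easy to find

for t in np.linspace(0.0, 1.0, 6):
    At = (1.0 - t) * D + t * A
    eigs = np.sort_complex(np.linalg.eigvals(At))
    print(f"t = {t:.1f}:", np.round(eigs, 3))
```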
The studies of homotopy continuation algorithms mentioned above have a common approach: viewing the eigenvalue problem (1) as a polynomial system in n + 1 variables (λ, x) ∈ C × Cⁿ. Thus the eigenvectors are evaluated along with the associated eigenvalues. This approach may not be suitable in many applications where only eigenvalues are in demand.

In this work, a new approach is brought into the homotopy algorithm: Hyman's method of determinant evaluation. Different from the traditional homotopy methods, our algorithm considers the polynomial equation

$$ p(\lambda) \equiv \det[A - \lambda I] = 0, \qquad \lambda \in \mathbf{C}, $$

where λ is the only variable. When this equation is solved, we obtain all the eigenvalues. If the corresponding eigenvectors are also needed, we may employ inverse iteration after the eigenvalues are determined. To distinguish this method from other homotopy algorithms, we call it the homotopy-determinant algorithm.

Like all homotopy algorithms, the homotopy-determinant method has a great advantage in parallel computation: each eigenvalue path is traced independently of the others. In this respect, homotopy methods stand in contrast to the highly serial QR algorithm. Compared with other homotopy algorithms, our method can achieve even more efficiency and flexibility in parallelization and vectorization. This natural parallelism suggests that the homotopy-determinant algorithm is an excellent candidate for a variety of advanced architectures.

In this work, we consider only the eigenvalue problem of a real nonsymmetric matrix A. Without loss of generality, A is assumed to be in upper Hessenberg form with nonzero subdiagonal entries. Let the matrix D be defined by setting one of the subdiagonal entries of A to zero. The homotopy is then defined as

$$ H(\lambda, t) = \det[A(t) - \lambda I] \tag{2} $$

where A(t) = (1 − t)D + tA. Hyman's method is presented in Section 1.4 for the evaluation of the determinant of A(t) − λI and its partial derivatives.

For the homotopy in (2), the regularity and bifurcation behavior of the eigenvalue paths are established in Chapter 2. The major results are as follows:

- Generically, all real eigenvalue paths are simple and smooth if H(λ, t) in (2) is restricted to a map R × [0,1] → R.

- Generically, all complex eigenvalue paths are simple and smooth except at finitely many points where the eigenvalue paths intersect the real space. Bifurcations occur at these points.

- Generically, bifurcations are simple in the sense that exactly two branches of eigenvalue paths intersect with perpendicular tangents at each bifurcation point.

The details of the algorithm are described in Chapter 3. Properties of real eigenvalue paths are employed to avoid "path jumping", and an efficient bifurcation treatment is also presented.

While the main purpose of our algorithm is to calculate eigenvalues, the evaluation of eigenvectors is also an important development in our algorithm. It was believed that Hyman's method, although remarkably stable in eigenvalue evaluation, is not suitable for the evaluation of eigenvectors. Thus the usual practice is to evaluate every eigenvector by inverse iteration, e.g. using the routine INVIT in EISPACK, after identifying all the eigenvalues.
In Chapter 4, we develop a fast eigenvector algorithm and a criterion for accuracy testing using by-products of Hyman's method of determinant evaluation. With this method, we can substantially reduce the computations in the eigenvector evaluation and obtain most of the eigenvectors to working accuracy with only O(n²) floating point operations. In comparison, INVIT requires O(n³) operations for this purpose.

The parallelization and vectorization of our method are discussed in Chapter 5. We show that, based on a proper parallel model, O(n) time complexity can be achieved with n² processors. This is an important result, because there has been no other parallel algorithm for nonsymmetric eigenvalue problems that achieves a running time less than O(n²).

Numerical results on evaluating eigenvalues of random nonsymmetric matrices on a sequential machine, a SPARCstation 1, are summarized in Section 3.5. Eigenvector evaluation is also tested on the same machine and summarized in Section 4.2. In Section 5.3, we list the numerical results of parallel computation on a BBN GP1000 (Butterfly).

1 Homotopy-determinant algorithm

1.1 The homotopy

Consider the eigenvalue problem Ax = λx, where A is an n × n real nonsymmetric matrix and x = (x₁, ..., xₙ)ᵀ. By an orthogonal transformation, A is similar to an upper Hessenberg matrix. Thus, without loss of generality, we assume throughout this paper that A is an upper Hessenberg matrix:

$$ A = (a_{ij}) = \begin{pmatrix} * & * & \cdots & * & * \\ a_{21} & * & \cdots & * & * \\ & a_{32} & \ddots & \vdots & \vdots \\ & & \ddots & * & * \\ & & & a_{n,n-1} & * \end{pmatrix}. $$

We further assume that A is irreducible, that is, none of the subdiagonal entries a_{j,j−1}, j = 2, ..., n, are zero. Otherwise, A can be broken into blocks and the eigenvalue problem reduces to smaller ones.

For t ∈ [0,1], let A(t) = (a_{ij}(t)) = (1 − t)D + tA, where D = (d_{ij}) is a real upper Hessenberg matrix with known eigenvalues. The matrix D is called the initial matrix. Let H : C × [0,1] → C be defined by

$$ H(\lambda, t) = \det[A(t) - \lambda I]. \tag{3} $$

When DH = (H_λ, H_t), where H_λ and H_t denote the partial derivatives of H with respect to λ and t respectively, is of full rank at (λ₀, t₀) ∈ H⁻¹(0), then locally the solution set of H(λ, t) = 0 consists of a smooth 1-manifold (λ(s), t(s)) passing through (λ₀, t₀) (Implicit Function Theorem). Such a curve (λ(s), t(s)) is called an eigenvalue path. These eigenvalue paths, connecting the eigenvalues of D to those of A, satisfy the initial value problem of an ordinary differential equation

$$ H_\lambda \frac{d\lambda}{ds} + H_t \frac{dt}{ds} = 0, \qquad \lambda(0) = \text{some eigenvalue of } D, \qquad t(0) = 0. $$

Roughly speaking, our algorithm is designed to follow these eigenvalue paths with a special solver for such differential equations. To follow the eigenvalue paths effectively, one must efficiently evaluate H, H_λ and H_t. For this purpose, a stable method, due to M. Hyman [4, 12, 13], for evaluating determinants of Hessenberg matrices and their derivatives will be used. The details are discussed in Section 1.4.

1.2 The choice of the initial matrix D

The initial matrix D is determined according to the following requirements:

1. To maintain the conjugacy of the eigenvalues of A(t) for each t, the initial matrix should be chosen to be real, so that for each t ∈ [0,1], if λ(t) is an eigenvalue of A(t), so is its conjugate λ̄(t). Thus, when a complex eigenvalue path (λ(s), t(s)) is followed, its conjugate eigenvalue path (λ̄(s), t(s)) is obtained as a by-product without further computation.

2. The eigenvalue problem of D must be easy to solve.

3. It is desirable to choose D as close to A as possible, so that the eigenvalue paths are close to straight lines and easy to follow.

Based on the above considerations, the strategy of "divide and conquer" is employed. The initial matrix D = (d_{ij}) is formed by setting one of the subdiagonal entries a_{p+1,p} of A to zero, namely

$$ d_{ij} = \begin{cases} a_{ij}, & (i,j) \neq (p+1,p) \\ 0, & (i,j) = (p+1,p). \end{cases} $$

The eigenvalues of D are obtained by solving the eigenvalue problems of the reduced submatrices of D. The target matrix A is then "conquered" by following the eigenvalue paths of the homotopy in (3).

1.3 The algorithm

The outline of the algorithm is as follows:

1. Solve the reduced eigenvalue problem of the initial matrix D (using either the homotopy method or the QR algorithm).

2. For each initial eigenvalue λ₀ of D, start the path-following process:

   (a) Input t₀ = 0 and (λ₀, 0) as the first point on the eigenvalue path.
   (b) At (λ₀, t₀), evaluate H_t and H_λ.
   (c) Evaluate the unit tangent vector (λ̇, ṫ) of the eigenvalue path at (λ₀, t₀) by solving the linear system

   $$ H_\lambda \dot\lambda + H_t \dot t = 0, \qquad \dot\lambda^2 + \dot t^2 = 1. \tag{4} $$

   (d) Prediction: Let (λ₁⁰, t₁⁰) = (λ₀, t₀) + δ(λ̇, ṫ) with a stepsize δ.
   (e) Correction: Starting from (λ₁⁰, t₁⁰), use Newton iteration to obtain a point (λ₁, t₁) on the eigenvalue path (see the details in Sections 3.1 and 3.2).
   (f) Set (λ₁, t₁) to be the new (λ₀, t₀) and go to step 2b until t₁ = 1, at which point an eigenvalue λ₁ is obtained.

3. We now have all eigenvalues. Evaluate eigenvectors by the eigenvector algorithm in Section 4.1 if they are also in demand.
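The skeleton of step 2 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the thesis's implementation: H, H_lam and H_t are assumed callables (in the thesis they are evaluated by Hyman's method), the path is assumed real and to reach t = 1, and all the stepsize control and path-jumping/bifurcation safeguards of Chapter 3 are omitted. The name follow_path is ours.

```python
import numpy as np

def follow_path(H, H_lam, H_t, lam0, delta=0.5, tol=1e-12):
    """Euler prediction along the unit tangent of H(lam, t) = 0 (step 2d),
    then Newton correction in lam at fixed t (step 2e), repeated until
    t = 1. A bare-bones sketch of steps 2(a)-2(f) for a real path."""
    lam, t = lam0, 0.0
    while t < 1.0:
        hl, ht = H_lam(lam, t), H_t(lam, t)
        v = np.array([-ht, hl], dtype=float)   # solves (4) up to normalization
        v /= np.linalg.norm(v)
        if v[1] < 0:                           # the sign of t-dot is chosen positive
            v = -v
        step = delta if t + delta * v[1] <= 1.0 else (1.0 - t) / v[1]
        lam, t = lam + step * v[0], t + step * v[1]
        for _ in range(20):                    # Newton correction at fixed t
            d = H(lam, t) / H_lam(lam, t)
            lam -= d
            if abs(d) <= tol * max(1.0, abs(lam)):
                break
    return lam
```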
1.4 Hyman's method

In the algorithm above, steps 2b and 2e contain the highest order of floating point operations, O(n²), which occur in the evaluation of H(λ, t) = det[A(t) − λI] and its partial derivatives H_λ and H_t. A stable method for such evaluations is due to Hyman [4, 12] and can be described as follows.

For A(t) = (a_{ij}(t)) = (1 − t)D + tA and x = (x₁, ..., xₙ)ᵀ, the system of equations (A(t) − λI)x = 0 can be written as

$$\begin{aligned}
(a_{11}(t)-\lambda)x_1 + a_{12}(t)x_2 + \cdots + a_{1n}(t)x_n &= 0\\
a_{21}(t)x_1 + (a_{22}(t)-\lambda)x_2 + \cdots + a_{2n}(t)x_n &= 0\\
&\ \ \vdots\\
a_{n,n-1}(t)x_{n-1} + (a_{nn}(t)-\lambda)x_n &= 0.
\end{aligned}\tag{5}$$

For every fixed value of λ, set xₙ = 1; the last n − 1 equations can then be solved recursively for x_{n−1}, ..., x₂, x₁. Those values can be used to evaluate the left-hand side of the first equation:

$$ (a_{11}(t)-\lambda)x_1 + a_{12}(t)x_2 + \cdots + a_{1n}(t)x_n = F(\lambda, t). \tag{6} $$

Obviously, F(λ, t) = 0 if and only if λ is an eigenvalue of A(t). On the other hand, the matrix A(t) − λI has the form

$$\begin{pmatrix}
a_{11}(t)-\lambda & a_{12}(t) & a_{13}(t) & \cdots & a_{1n}(t)\\
a_{21}(t) & a_{22}(t)-\lambda & a_{23}(t) & \cdots & a_{2n}(t)\\
 & a_{32}(t) & a_{33}(t)-\lambda & \cdots & a_{3n}(t)\\
 & & \ddots & \ddots & \vdots\\
 & & & a_{n,n-1}(t) & a_{nn}(t)-\lambda
\end{pmatrix}.$$

For j = 1, 2, ..., n, one can multiply the jth column by x_j as found in the system (5) and add it to the last column, thereby obtaining the matrix

$$\begin{pmatrix}
a_{11}(t)-\lambda & a_{12}(t) & \cdots & a_{1,n-1}(t) & F(\lambda,t)\\
a_{21}(t) & a_{22}(t)-\lambda & \cdots & a_{2,n-1}(t) & 0\\
 & a_{32}(t) & \ddots & a_{3,n-1}(t) & 0\\
 & & \ddots & \vdots & \vdots\\
 & & & a_{n,n-1}(t) & 0
\end{pmatrix}.$$

Then

$$ H(\lambda, t) = \det[A(t) - \lambda I] = (-1)^{n+1} F(\lambda, t) \prod_{i=2}^{n} a_{i,i-1}(t). $$

To compute H_λ it is sufficient to compute ∂F/∂λ. Differentiating (5) with respect to λ, taking into account that x_j, j = 1, ..., n − 1, are functions of (λ, t) and that xₙ = 1, yields

$$\begin{aligned}
(a_{11}(t)-\lambda)\frac{\partial x_1}{\partial\lambda} - x_1 + a_{12}(t)\frac{\partial x_2}{\partial\lambda} + \cdots + a_{1n}(t)\frac{\partial x_n}{\partial\lambda} &= \frac{\partial F}{\partial\lambda}\\
a_{21}(t)\frac{\partial x_1}{\partial\lambda} + (a_{22}(t)-\lambda)\frac{\partial x_2}{\partial\lambda} - x_2 + \cdots + a_{2n}(t)\frac{\partial x_n}{\partial\lambda} &= 0\\
&\ \ \vdots\\
a_{k,k-1}(t)\frac{\partial x_{k-1}}{\partial\lambda} + (a_{kk}(t)-\lambda)\frac{\partial x_k}{\partial\lambda} - x_k + \cdots + a_{kn}(t)\frac{\partial x_n}{\partial\lambda} &= 0\\
&\ \ \vdots\\
a_{n,n-1}(t)\frac{\partial x_{n-1}}{\partial\lambda} + (a_{nn}(t)-\lambda)\frac{\partial x_n}{\partial\lambda} - x_n &= 0.
\end{aligned}\tag{7}$$

With ∂xₙ/∂λ = 0, one can solve (7) successively for ∂x_{n−1}/∂λ, ..., ∂x₁/∂λ using the previously computed values of x₁, ..., xₙ. The value of the left-hand side of the first equation in (7) is then ∂F/∂λ. H_t can be evaluated similarly.

The method described above is virtually Gaussian elimination without pivoting. It might seem that the accuracy of this procedure would be curtailed if some of the a_{j+1,j}(t) are small in magnitude, because of the necessity of dividing by a_{j+1,j}(t) in the recursion used to solve the last n − 1 equations of (5) and (7). However, Wilkinson [12] showed that there is little correlation between the accuracy of the method and the magnitude of the subdiagonal entries of A(t).
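The recursions (5)-(7) translate directly into code. The following numpy sketch (function name and interface are ours, not from the thesis) evaluates F(λ), ∂F/∂λ and the vectors x and ∂x/∂λ for a fixed Hessenberg matrix playing the role of A(t):

```python
import numpy as np

def hyman(B, lam):
    """Hyman's recursion for an upper Hessenberg matrix B: returns
    (F, dF, x, dx) where x solves the last n-1 equations of (5) with
    x[n-1] = 1, dx = dx/dlam solves (7), F is the left-hand side of the
    first equation (6), and dF = dF/dlam. A sketch of Section 1.4."""
    n = B.shape[0]
    M = B - lam * np.eye(n)            # M plays the role of A(t) - lam*I
    x = np.zeros(n, dtype=complex)
    dx = np.zeros(n, dtype=complex)
    x[n - 1] = 1.0                      # x_n = 1, dx_n/dlam = 0
    for k in range(n - 1, 0, -1):       # back-substitute rows n, ..., 2
        x[k - 1] = -np.dot(M[k, k:], x[k:]) / M[k, k - 1]
        dx[k - 1] = (x[k] - np.dot(M[k, k:], dx[k:])) / M[k, k - 1]
    F = np.dot(M[0, :], x)              # first equation of (5): this is (6)
    dF = np.dot(M[0, :], dx) - x[0]     # first equation of (7)
    return F, dF, x, dx
```

Note that the subdiagonal product in H = (−1)^{n+1} F ∏ a_{i,i−1}(t) is constant in λ, so a Newton step H/H_λ equals F/dF and the product never needs to be formed when only eigenvalues are wanted.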
2 Regularity and Bifurcation Analysis

The boundedness as well as the smoothness of the eigenvalue paths are essential properties needed for the homotopy method. In the traditional theory for homotopy methods [1, 2, 6, 7], the smoothness of the homotopy paths is achieved by random perturbations of certain parameters. Here, the eigenvalue paths are obviously bounded in C × [0,1] because of the continuity of the eigenvalues with respect to the entries of the matrix. However, in order to keep the homotopy real for important practical considerations, the perturbations are restricted to real perturbations. Different from the traditional complex perturbations used in [1, 2, 6, 7], real perturbations cannot completely guarantee smoothness of the eigenvalue paths. Bifurcations (which do not occur when the usual complex perturbations are used) are inevitable. In this chapter we shall show that, with perturbations of as few as four entries, we can achieve the necessary smoothness of the eigenvalue paths as well as simplicity of the bifurcation behavior.

2.1 Preliminary results

The following lemmas will be used throughout this chapter.

Lemma 2.1 (Parameterized Sard's Theorem) Let U ⊆ Rᵐ and V ⊆ R^q be open sets and let F : U × V → R^p be C^r with r > max{0, m − p}. Suppose the p × (m + q) matrix DF = (F_u, F_v) is of full rank at every point of F⁻¹(0) ⊆ U × V. Consider F(·, v) : U → R^p. Then for almost every point v ∈ V, DF(·, v) = (F_u) is of full rank at every u ∈ F(·, v)⁻¹(0) ⊆ U.

Definition 2.1 Let f(z) = a₀ + a₁z + ··· + a_k z^k and g(z) = b₀ + b₁z + ··· + b_m z^m be polynomials of degree k and m respectively. The determinant of the (k + m) × (k + m) matrix

$$\begin{pmatrix}
a_0 & a_1 & a_2 & \cdots & a_k & & \\
 & a_0 & a_1 & a_2 & \cdots & a_k & \\
 & & \ddots & & & & \ddots \\
 & & & a_0 & a_1 & \cdots & a_k \\
b_0 & b_1 & b_2 & \cdots & b_m & & \\
 & b_0 & b_1 & b_2 & \cdots & b_m & \\
 & & \ddots & & & & \ddots \\
 & & & b_0 & b_1 & \cdots & b_m
\end{pmatrix}$$

(m shifted rows of the a-coefficients followed by k shifted rows of the b-coefficients) is called the resultant of f and g. The resultant of a polynomial and its derivative is called the discriminant of the polynomial.

The following two lemmas can be found in Van der Waerden [15].

Lemma 2.2 Two polynomials have a common non-constant factor if and only if their resultant is zero.

Lemma 2.3 A polynomial has a multiple root if and only if its discriminant is zero.

The next lemma implies that an n × n irreducible upper Hessenberg matrix has no multiple eigenvalues unless its upper-right corner entry is one of n − 1 specific values depending on the other entries.

Lemma 2.4 For every collection of m real numbers a₁, ..., a_m, there exists a subset C₀ ⊂ R consisting of at most m − 1 elements such that the polynomial f(z) = a_m z^m + ··· + a₁ z + a₀ has no multiple roots unless a₀ ∈ C₀.

Proof: f(z) has a multiple root z₀ if and only if f(z₀) = f′(z₀) = 0. f′(z) is a polynomial of degree m − 1 which is independent of a₀. Let z₁, ..., z_{m−1} be the roots of f′(z), which are all the possible multiple roots of f(z). For each j = 1, ..., m − 1, z_j is a multiple root if and only if a₀ = −(a_m z_j^m + ··· + a₁ z_j). □
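Lemmas 2.3 and 2.4 can be checked symbolically. A small sympy illustration with polynomials of our own choosing (not from the thesis):

```python
from sympy import symbols, discriminant, resultant, diff

z, a0 = symbols('z a0')

# Lemma 2.3: (z - a0)**2 has a multiple root for every a0, so its
# discriminant vanishes identically.
f = (z - a0)**2
print(discriminant(f, z))            # 0

# Lemma 2.4: for g = z**3 + z + a0, g'(z) = 3z**2 + 1 is independent of
# a0, and the discriminant vanishes for at most m - 1 = 2 values of a0.
g = z**3 + z + a0
print(discriminant(g, z))            # -27*a0**2 - 4
print(resultant(g, diff(g, z), z))   # the discriminant equals this resultant
                                     # up to sign and a leading-coefficient factor
```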
Corollary 2.1 Given an irreducible m × m upper Hessenberg matrix B = (b_{ij}), there is an open subset Q ⊆ R, which contains all of R except for at most m − 1 real numbers, such that B has no multiple eigenvalues if b_{1m} ∈ Q.

Proof: Let f(λ) = det[B − λI]. A straightforward verification gives that the constant term of f(λ) can be written as K b_{1m} + c, where K = (−1)^{m+1} ∏_{i=2}^{m} b_{i,i−1} ≠ 0 and c is independent of b_{1m}. The conclusion follows from Lemma 2.4. □

2.2 Regularity at the initial level t = 0

At t = 0,

$$ A(t) = D = \begin{pmatrix} D_1 & * \\ 0 & D_2 \end{pmatrix}, $$

where D₁ is p × p and D₂ is (n − p) × (n − p). D is a reducible upper Hessenberg matrix, so the set of eigenvalues of D consists of the eigenvalues of D₁ and D₂. Note that H⁻¹(0) ∩ {t = 0} is the set of eigenvalues of D. If an eigenvalue λ₀ of D is simple, then H_λ(λ₀, 0) ≠ 0, i.e. DH(λ₀, 0) = (H_λ, H_t) is of full rank. Thus, to establish the regularity of H at t = 0, it is sufficient to show that D has no multiple eigenvalues.

Proposition 2.2 Let D be an upper Hessenberg matrix with d_{p+1,p} = 0 and no other subdiagonal entries equal to zero. Then there exists an open subset Q₁ ⊂ R² with full measure such that D has no multiple eigenvalues if (d_{1,p}, d_{p+1,n}) ∈ Q₁.

Proof: Both D₁ and D₂ are irreducible upper Hessenberg matrices. By Corollary 2.1, there exist open subsets Q′₁, Q′₂ ⊂ R with full measure such that neither D₁ nor D₂ has multiple eigenvalues if the entries d_{1,p} ∈ Q′₁ and d_{p+1,n} ∈ Q′₂. It suffices to show that there exists an open subset Q ⊂ R² with full measure such that if (d_{1,p}, d_{p+1,n}) ∈ Q then D₁ and D₂ have no common eigenvalues.

Let d_{p+1,n} ≡ a ∈ Q′₂ and let λ₁(a), ..., λ_{n−p}(a) be the eigenvalues of D₂ with respect to a. Each λ_j(a) is a continuous function of a. If λ_j(a), j ∈ {1, ..., n − p}, is also an eigenvalue of D₁, then d_{1,p} must make det[D₁ − λ_j(a)I] = 0, and it is clear that only one such value of d_{1,p} ∈ C exists for each λ_j(a) (see the proof of Corollary 2.1). That is, there are n − p values s₁(a), ..., s_{n−p}(a) (possibly repeated) in C for d_{1,p} such that D₁ and D₂ have common eigenvalues. These s_j's are continuous functions of λ_j (again, see the proof of Corollary 2.1), and thus continuous functions of a. Let E be the closure of the set

{(d_{1,p}, d_{p+1,n}) | d_{p+1,n} ∈ Q′₂, d_{1,p} ∈ {s₁(d_{p+1,n}), ..., s_{n−p}(d_{p+1,n})}} ∩ R².

E is a subset of R² with measure zero because it is at most the graph of n − p piecewise continuous functions on the real line. Let Q = R² \ E. If (d_{1,p}, d_{p+1,n}) ∈ Q, then D₁ and D₂ have no common eigenvalues. Let Q₁ = (Q′₁ × Q′₂) ∩ Q. For every (d_{1,p}, d_{p+1,n}) ∈ Q₁, D has no multiple eigenvalues. Q₁ is clearly an open set with full measure. □

2.3 Regularity of the real eigenvalue paths

H⁻¹(0) can consist of both real and complex eigenvalue paths. The real eigenvalue paths can be considered the set of zeros of H restricted to R × [0,1]. The following proposition shows that generically the real eigenvalue paths are smooth in R × [0,1]; this phenomenon is illustrated in Figure 1.

Proposition 2.3 There exists a subset Q₂ ⊆ R with full measure such that if the entry d_{1n} of the initial matrix D is chosen from Q₂ and if H is considered a map from R × (0,1) to R, then the zeros of H are regular; i.e. for every (λ, t) ∈ H⁻¹(0) ∩ [R × (0,1)], the matrix DH = [H_λ, H_t] is of full rank.

Proof: Let U = R × (0,1) and V = R. Consider d_{1n} a variable of H : U × V → R, i.e.

$$ H(\lambda, t, d_{1n}) = C(t)\big[(a_{11}(t) - \lambda)x_1 + a_{12}(t)x_2 + \cdots + (t\,a_{1n} + (1-t)d_{1n})x_n\big], $$

where C(t) = (−1)^{n+1} ∏_{i=2}^{n} a_{i,i−1}(t). Thus

$$ \mathcal{D}H = \big[H_\lambda,\ H_t,\ C(t)(1-t)x_n\big], $$

which is of full rank for t ∈ (0,1), since xₙ = 1. The assertion follows from Lemma 2.1 (Parameterized Sard's Theorem). □

[Figure omitted. Left panel: eigenvalue paths in C × [0,1], where bifurcation may occur. Right panel: when H is restricted to R × [0,1], the eigenvalue paths are smooth and no bifurcation exists.]

Figure 1: The regularity of real eigenvalue paths
2.4 Regularity of the complex eigenvalue paths

Proposition 2.4 There exists a subset Q₃ ⊆ R² with full measure such that if the entries (d_{1,n−1}, d_{1n}) of the initial matrix D are chosen from Q₃, then for every (λ, t) ∈ H⁻¹(0) ∩ [(C \ R) × (0,1)], DH = [H_λ, H_t] is of full rank.

Proof: Let us split H, λ and x into real and imaginary parts:

$$ \lambda = \xi + \eta i, \qquad x = y + zi, \qquad H(\xi + \eta i, t) = F(\xi + \eta i, t) + G(\xi + \eta i, t)\,i. \tag{8} $$

Regard C as R² and consider d_{1,n−1} and d_{1n} as variables; then (8) can be rewritten as

$$ H(\xi+\eta i, t, d_{1,n-1}, d_{1n}) = \begin{pmatrix} F(\xi+\eta i, t, d_{1,n-1}, d_{1n}) \\ G(\xi+\eta i, t, d_{1,n-1}, d_{1n}) \end{pmatrix}, $$

where

$$\begin{aligned}
F &= C(t)\big[\cdots + (t\,a_{1,n-1} + (1-t)d_{1,n-1})\,y_{n-1} + (t\,a_{1n} + (1-t)d_{1n})\,y_n\big],\\
G &= C(t)\big[\cdots + (t\,a_{1,n-1} + (1-t)d_{1,n-1})\,z_{n-1} + (t\,a_{1n} + (1-t)d_{1n})\,z_n\big],
\end{aligned}$$

and the last equation of (5),

$$ a_{n,n-1}(t)(y_{n-1} + z_{n-1}i) + (a_{nn}(t) - \xi - \eta i)(y_n + z_n i) = 0, $$

together with yₙ = 1, zₙ = 0, gives

$$ y_{n-1} = \frac{\xi - a_{nn}(t)}{a_{n,n-1}(t)}, \qquad z_{n-1} = \frac{\eta}{a_{n,n-1}(t)}. $$

Thus

$$ \mathcal{D}H = \begin{pmatrix} F_\lambda & -G_\lambda & F_t & F_{d_{1,n-1}} & F_{d_{1n}} \\ G_\lambda & F_\lambda & G_t & G_{d_{1,n-1}} & G_{d_{1n}} \end{pmatrix} = \begin{pmatrix} F_\lambda & -G_\lambda & F_t & C(t)(1-t)\,y_{n-1} & C(t)(1-t) \\ G_\lambda & F_\lambda & G_t & C(t)(1-t)\,\dfrac{\eta}{a_{n,n-1}(t)} & 0 \end{pmatrix}. $$

Since η ≠ 0, rank[DH] = 2 for t ∈ (0,1). The conclusion of the proposition follows by the Parameterized Sard's Theorem. □

With the above propositions, the following results concerning bifurcations can easily be derived.

Corollary 2.5 If (λ, t) is a bifurcation point, then λ is real, provided that the entries (d_{1,n−1}, d_{1n}) of the initial matrix D are chosen from Q₃.

Proof: (λ, t) is a bifurcation point only if

$$ \mathcal{D}H = \begin{pmatrix} F_\lambda & -G_\lambda & F_t \\ G_\lambda & F_\lambda & G_t \end{pmatrix} $$

is of rank 0 or 1. By Proposition 2.4, rank(DH) = 2 if λ is in C \ R. Thus λ must be real if (λ, t) is a bifurcation point. □

Corollary 2.6 If (λ, t) is a bifurcation point, then H_λ = 0, ṫ = 0 and H_t ≠ 0, provided that (d_{1,n−1}, d_{1n}) is chosen from Q₃ ∩ (R × Q₂).

Proof: By Corollary 2.5, (λ, t) ∈ R × (0,1). If H is viewed as a map from R × (0,1) to R, then DH = [H_λ, H_t] ≠ [0, 0] by Proposition 2.3. If H is viewed as a map from C × (0,1) to C with C = R², then

$$ \mathcal{D}H = \begin{pmatrix} H_\lambda & 0 & H_t \\ 0 & H_\lambda & 0 \end{pmatrix} $$

with rank less than 2. It is clear that H_λ = 0 and H_t ≠ 0. Then from (4), ṫ = 0. □

We conclude from Corollaries 2.5 and 2.6 that generically the eigenvalue paths are smooth and simple except at bifurcation points (λ, t), where λ is real and H_λ = 0; that is, λ is a real multiple eigenvalue of A(t). The following section gives a complete analysis of such bifurcation points.

2.5 Bifurcation analysis

By Corollary 2.5, bifurcation can occur only at (λ, t) where λ is a real multiple eigenvalue of A(t). In this section, we shall prove that generically there are at most finitely many bifurcations and that each of them is simple in the sense that there are exactly two branches of eigenvalue paths with perpendicular tangents.

Proposition 2.7 There exists an open subset Q₄ ⊂ R with a finite complement such that if the entry d_{1n} of the initial matrix D is chosen from Q₄, then the set of bifurcation points E = {(λ, t) ∈ H⁻¹(0) : λ is a real multiple eigenvalue of A(t)} consists of at most finitely many points.
Proof: Let Q₄ ⊂ R be the set such that if d_{1n} ∈ Q₄, then A(1/2) = ½D + ½A has no multiple eigenvalues. Corollary 2.1 guarantees that Q₄ is open with a finite complement. Assume d_{1n} ∈ Q₄. Let d(t) be the discriminant (see Definition 2.1) of det[A(t) − λI], considered a polynomial in λ. By definition, d(t) is a polynomial in t. By Lemma 2.3, d(t) = 0 if and only if A(t) has multiple eigenvalues. Hence d(1/2) ≠ 0, i.e. d(t) is a nontrivial polynomial. Thus d(t) has at most finitely many real roots in [0,1]. By Lemma 2.3, there are finitely many t ∈ [0,1] for which A(t) has multiple eigenvalues. At each such t, A(t) has at most n/2 multiple eigenvalues. □

For every point (λ, t) ∈ H⁻¹(0), the multiplicity of λ as an eigenvalue of A(t) is in fact the number of eigenvalue paths passing through (λ, t). The following proposition shows that generically this number is at most two.

Proposition 2.8 There exists an open subset Q₅ ⊂ R² with full measure such that if the entries (d_{1,n−1}, d_{1n}) of the initial matrix D are chosen from Q₅, then for each t ∈ (0,1), every eigenvalue λ of A(t) has multiplicity less than or equal to 2.

Proof: Consider H to be a polynomial in λ with parameters t, d_{1,n−1} and d_{1n}. Let f(t, d_{1,n−1}, d_{1n}) and g(t, d_{1,n−1}) be the discriminants of H and H_λ respectively. They are both polynomials, and g is independent of d_{1n}. f(t, d_{1,n−1}, d_{1n}) is not identically zero because of Corollary 2.1. g(t, d_{1,n−1}) is not identically zero since, by an argument similar to that of Corollary 2.1, there are only finitely many d_{1,n−1} for which H_λ(λ, 0), as a polynomial in λ, has multiple roots.

Consider f and g polynomials in t with parameters d_{1,n−1}, d_{1n}, and let k(d_{1,n−1}, d_{1n}) be the resultant of f and g. k(d_{1,n−1}, d_{1n}) = 0 if and only if there exists t₀ ∈ C such that f(t₀, d_{1,n−1}, d_{1n}) = g(t₀, d_{1,n−1}) = 0, i.e. there exist λ₁, λ₂ ∈ C for which H(λ₁, t₀) = H_λ(λ₁, t₀) = 0 and H_λ(λ₂, t₀) = H_{λλ}(λ₂, t₀) = 0. In other words, if A(t) has a triple eigenvalue for some t, then k(d_{1,n−1}, d_{1n}) = 0.

Note that k(d_{1,n−1}, d_{1n}) is not identically zero. That is, there exists (d_{1,n−1}, d_{1n}) ∈ R² such that f(t, d_{1,n−1}, d_{1n}) and g(t, d_{1,n−1}) have no common roots in t. In fact, by a proof similar to that of Proposition 2.7 one can show that there is a d_{1,n−1} such that g(t, d_{1,n−1}) has only finitely many roots t₁, ..., t_l. For each fixed t_j, Corollary 2.1 shows that f(t_j, d_{1,n−1}, d_{1n}) = 0 at no more than n − 1 values of d_{1n}. Thus

M = {d_{1n} ∈ R | f(t_j, d_{1,n−1}, d_{1n}) = 0, j = 1, ..., l}

is a finite set. If d_{1n} is chosen from R \ M, then, together with the previously chosen d_{1,n−1}, f(t, d_{1,n−1}, d_{1n}) and g(t, d_{1,n−1}) have no common roots in t.

Thus k(d_{1,n−1}, d_{1n}) is a polynomial in d_{1,n−1}, d_{1n} which is not identically zero, so k⁻¹(0) ∩ R² is a closed one-dimensional algebraic set; i.e. Q₅ = R² \ k⁻¹(0) is an open set with full measure. □

Let Q₂₃₄₅ = (R × Q₂) ∩ Q₃ ∩ (R × Q₄) ∩ Q₅. Then Q₂₃₄₅ is an open subset of R² with full measure.

Corollary 2.9 Suppose the entries (d_{1,n−1}, d_{1n}) of the initial matrix D are chosen from Q₂₃₄₅. If (λ₀, t₀) is a bifurcation point in H⁻¹(0), then:

(i) λ₀ is real.

(ii) λ₀ is a double eigenvalue of A(t₀).

(iii) If H is restricted to R × (0,1), then there is a real, simple and smooth eigenvalue path passing through (λ₀, t₀) with tangent vectors (λ̇, ṫ) = (±1, 0), where ˙ means d/ds.

(iv) On the real eigenvalue path passing through (λ₀, t₀), ẗ = −H_{λλ}/H_t ≠ 0 at (λ₀, t₀).
Hence t = —HM/Ht 75 0. D At the bifurcation point (Ao,t0), there are actually two real eigenvalue paths r. = {(A+(s),t+(8)) e R x [0,1] I s e (—e, 0], (A+(o>,t+(o>> = we} and r- = {(A_(s),t_(s)) E R x [0, 1] I s 6 [0,6), (A_(0),t_(0)) = (Ao,t0)} with tangent vec- tors (1,0) and (—1,0) respectively. They join at (Ao,t0) to form a branch of a real smooth 1-manifold. Since t aé 0 (from (iv)) both RI. and P_ lie on one side of the line t = to. By the continuity of eigenvalues with respect to the entries of matrix, there must be extra eigenvalue paths on the other side of the line t = to touching the bifurcation point (A0, to). The multiplicity of A0 equals two which implies that the number of such extra paths is two. From (iii), the joint curve F = F+ UIL is simple in real space, so the extra eigenvalues must be complex. Corollary 2.10 below shows that these two complex paths join at the eigenvalue point (A0, to) with tangent vectors (i,0) and (—i,0) respectively and thus form a smooth piece of curve. The proof requires a complex bifurcation theorem. Lemma 2.5 (Complex Bifurcation Theorem) Let P : Cm x R —> C” be ana- lytic and (z°,so) E Cm x R be a simple quadratic fold of the equation P(z, s) = 0. That is, (a) P(z°,so) = 0. (b) Pz(z0,so) E Pg is singular with dimN(Pz°) = c0dim’R(P§) = 1. Here N(.) and 7Z(.) denote the null space and range of linear operators respectively. 24 (c) Let P,(z°,so) E Pf. Then ’R.(P,?) ¢ 7?.(Pg), where PE is regarded as a linear operator from C to Cm. (d) Let qt" 6 CW“ and (I) E Cm be nonzero vectors for which \II*P§ = 0 and PEG) = 0. Then W‘szqflb aé 0 where P22 E Pzz(z°, so). Then, {a} (z0,so) is a bifurcation point of P‘1(0). (fl) The bifurcation is simple in the sense that there are exactly two solution paths PI and F” passing through (z°,so). (7) IfG) is the tangent vector of I" at the bifurcation point (z°,so), then i0 is the tangent vector of P” at (z°,so) This lemma can be found in [10]. Corollary 2.10 Under the assumption of Corollary 2.9, there exists a unique real eigenvalue path passing through (A0,t0) with tangent vector (A1,t1) = (i1,0) and a unique complex eigenvalue path passing through the same point with the tangent vector (xi) = (i220) Proof: Obviously H(A, t) : C x R —) C is analytic. (A0, to) is a simple quadratic fold of the equation H(A, t) = 0 because (a) H(A0,t0) = 0. (b) H? E H,\(A0,t0) = 0 since A0 is a double zero of H(-,to) : C ——> C. (c) ’R(H,°) gZ R0113) since H? = 0 and H? aé 0. ((1) \II’“ =1 and (I) = 1 satisfy W‘Hg = 0 and HQQ = 0 and \IVHBAQNI) = Hijx 7t 0. 25 Thus, by Lemma 2.5, there are exactly two eigenvalue paths F1 and F” passing through (A0, to). Part (iii) of Corollary 2.9 indicates that one of the eigenvalue paths, say I”, is a real eigenvalue path which is simple in R x [0,1]. Thus F” must be a complex eigenvalue path with tangent vector i(:l:1, 0) = (:lzi, 0). D The bifurcations can be illustrated by Figure 2. Figure 3 shows some of the bifurcation examples we have encountered. 26 I (I ,0) tangent vector of the real path O O Q ......‘...I......OCCIOOC......-......C—wvw- real eigenvalue path .“ — I fbifurcation point complex eignenvalue path (130) tangent vector of the complex path / The properties of the bifurcation: 1. there are exactly two branches of eigenvalue paths intersect at a bifurcation point one is real, another is complex; 2. every bifurcation point is in Rx[0,1]; 3. the tangents of the two joint eigenvalue paths are ( 1,0) and (i,0) respectively. 
3 Evaluation of the eigenvalues

In the last chapter it was shown that, theoretically, the necessary regularity and the simple bifurcation of the eigenvalue paths can be obtained with probability one by choosing four entries d = (d_{1p}, d_{p+1,n}, d_{1,n−1}, d_{1n}) of the initial matrix D = (d_{ij}) at random. In practice, in order to apply the strategy of "divide and conquer", the initial matrix D = (d_{ij}) is set to be

$$ d_{ij} = \begin{cases} a_{ij}, & (i,j) \neq (p+1,p) \\ 0, & (i,j) = (p+1,p) \end{cases} \tag{9} $$

for a certain p ≈ n/2. The parameters in d are perturbed when nongeneric singularities are encountered (which, in fact, never happened in our intensive numerical experiments).

3.1 Following the real eigenvalue paths

Curve jumping is a serious difficulty in following eigenvalue paths. Namely, when an eigenvalue path Γ₁ is followed, one may inadvertently jump to an eigenvalue path Γ₂ which passes close to Γ₁. Usually this phenomenon occurs only in following real eigenvalue paths. When a complex eigenvalue path is followed, the path lies in C × [0,1], which has one more dimension than R × [0,1], so there is more room for maneuvering. Hence, when one traces a real eigenvalue path, special attention must be paid to prevent curve jumping. In the following, several special features of the homotopy are presented. They are particularly useful in eliminating virtually all possibility of curve jumping among real eigenvalue paths.

For the choice of the initial matrix D in (9),

$$ A(t) = (1-t)D + tA = \begin{pmatrix}
a_{11} & \cdots & & & & a_{1n} \\
a_{21} & a_{22} & & & & \vdots \\
 & a_{32} & \ddots & & & \\
 & & \ddots & a_{pp} & & \\
 & & & t\,a_{p+1,p} & \ddots & \\
 & & & & a_{n,n-1} & a_{nn}
\end{pmatrix}. $$

Notice that t occurs only in the (p + 1, p) entry. Simple observation indicates that the homotopy H(λ, t) = det[A(t) − λI] can be written as

$$ H(\lambda, t) = f_1(\lambda) + t f_2(\lambda) \tag{10} $$

where f₁ and f₂ are polynomials in λ of degree n and n − 2 respectively. The representation of H in (10) leads to the following very useful properties:

(i) Given any λ₀ ∈ R, as long as f₂(λ₀) ≠ 0, there exists a unique t₀ ∈ R such that H(λ₀, t₀) = 0. Since f₂ has at most n − 2 real zeros, any nonconstant component λ(s) of any real eigenvalue path (λ(s), t(s)) must be monotone. Consequently, if ··· < λ⁰_{j−1} < λ⁰_j < λ⁰_{j+1} < ··· are real eigenvalues of D, the eigenvalue path (λ(s), t(s)) emanating from (λ⁰_j, 0) will stay in (λ⁰_j, λ⁰_{j+1}) × (0,1) if λ(s) is monotonically increasing, or will stay in (λ⁰_{j−1}, λ⁰_j) × (0,1) if λ(s) is monotonically decreasing. Because of this property, when a real eigenvalue path is followed, the path may be confined within the proper boundary to prevent jumping.

(ii) Since H(λ, t) is linear in t, for any t₀ ∈ (0,1) and any λ₁ ∈ R with f₂(λ₁) ≠ 0, the one-step Newton iteration gives

$$ t_1 = t_0 - \frac{H(\lambda_1, t_0)}{H_t(\lambda_1, t_0)} \tag{11} $$

for which H(λ₁, t₁) = 0.

(iii) (Degree-preserving property) Let Γ = {(λ(s), t(s)) ∈ R × [0,1] : s ∈ [a, b]} be a part of a real eigenvalue path which contains no bifurcation point. Since H_λ is continuous and equals zero only at bifurcation points, the degree of the zero (λ(s), t(s)),

deg[λ(s), t(s)] ≡ sign[H_λ(λ(s), t(s))] = constant for s ∈ [a, b].

The properties above are illustrated in Figure 4.

[Figure omitted: + and − mark the sign of the degree along the real eigenvalue paths; dashed lines mark the boundaries of the real eigenvalue paths. Properties: 1. real eigenvalue paths are monotone with respect to t; 2. adjacent real eigenvalue paths carry degrees with alternating signs; 3. along an eigenvalue path, the degree changes sign only at bifurcation points; 4. each eigenvalue path stays within its boundary.]

Figure 4: Properties of real eigenvalue paths
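Property (ii) is worth seeing in code: because H is affine in t, a single exact solve in t lands on the path. A minimal sketch, assuming D differs from A only in the (p+1, p) entry as in (9); it recovers f₁ and f₂ from H at t = 0 and t = 1 and uses numpy's determinant for brevity where the thesis uses Hyman's method. The names are ours.

```python
import numpy as np

def H_of(A, D, lam, t):
    """det[A(t) - lam*I] for A(t) = (1 - t)D + tA."""
    n = A.shape[0]
    return np.linalg.det((1 - t) * D + t * A - lam * np.eye(n))

def t_on_path(A, D, lam):
    """Property (ii): H(lam, t) = f1(lam) + t*f2(lam) is affine in t, so
    for a fixed real lam the unique t with H(lam, t) = 0 is exact."""
    f1 = H_of(A, D, lam, 0.0)        # f1(lam) = H(lam, 0)
    f2 = H_of(A, D, lam, 1.0) - f1   # f2(lam) = H(lam, 1) - H(lam, 0)
    return -f1 / f2                   # assumes f2(lam) != 0
```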
From (i) and (iii), when tracing a real eigenvalue path, one may check its degree and confine the eigenvalue path within the proper boundary to stay on the right eigenvalue path.

To follow a real eigenvalue path Γ = (λ(s), t(s)), let (λ₀, t₀) be a point on Γ. Calculating the tangent vector (λ̇, ṫ) at (λ₀, t₀) is the first step in the prediction-correction procedure. When the arclength is used as the parameter of the eigenvalue path, the tangent vector can be obtained by solving the system

$$ H_\lambda \dot\lambda + H_t \dot t = 0, \qquad \dot\lambda^2 + \dot t^2 = 1. $$

As described in Section 1.4, H_λ and H_t can be evaluated efficiently by Hyman's method. The sign of ṫ is always chosen to be positive. With the tangent vector (λ̇, ṫ), the prediction for the next point on the path is evaluated by

(λ₁⁰, t₁⁰) = (λ₀, t₀) + δ(λ̇, ṫ)

with stepsize δ. This prediction will be the initial point of the correction iteration. The correction can be carried out in two ways (see Figure 5):

(i) If t₁⁰ ≠ 1 and |λ̇| is not too small, say |λ̇| > 10⁻⁴, then the slope dλ/dt is not too close to zero; that is, locally the eigenvalue path is not close to horizontal. Hence the horizontal line λ = λ₁⁰ should intersect the eigenvalue path. In this case, the correction is performed on the horizontal line λ = λ₁⁰; namely, formula (11) is used:

$$ \lambda_1 = \lambda_1^0, \qquad t_1 = t_1^0 - \frac{H(\lambda_1^0, t_1^0)}{H_t(\lambda_1^0, t_1^0)}. $$

The advantage of this iteration is that only one Newton step is necessary to obtain t₁ from t₁⁰ for fixed λ₁⁰ such that H(λ₁, t₁) = 0.

(ii) If |λ̇| ≈ 0, the horizontal correction formula in (i) is not applicable, since the horizontal line λ = λ₁⁰ may not intersect the eigenvalue path, which is nearly horizontal. In this case, however, the vertical line t = t₁⁰ should intersect the eigenvalue path. Thus one can correct λ₁⁰ for fixed t = t₁⁰ by using the Newton iteration formula

$$ \lambda_1^{m+1} = \lambda_1^m - \frac{H(\lambda_1^m, t_1^0)}{H_\lambda(\lambda_1^m, t_1^0)}, \qquad m = 0, 1, \ldots $$

The iteration is terminated if |λ₁^{m+1} − λ₁^m| < ε, where ε is a preassigned tolerance; then let (λ₁, t₁) = (λ₁^{m+1}, t₁⁰).

[Figure omitted. Case (i): when the eigenvalue path is not too flat, the prediction is corrected along a horizontal line. Case (ii): when the eigenvalue path is flat, the prediction is corrected along a vertical line.]

Figure 5: Corrections on real eigenvalue paths

After obtaining (λ₁, t₁), one must check:

(a) whether (λ₁, t₁) is in the proper region described in property (i);

(b) whether sign[H_λ(λ₀, t₀)] = sign[H_λ(λ₁, t₁)].

In (a), if (λ₁, t₁) is out of the region, then a path jumping has occurred. Thus (λ₁, t₁) should be discarded, followed by repeating the prediction with one half of the stepsize and then the correction. For (b), if the sign of H_λ changes at (λ₁, t₁), then either a path jumping or a bifurcation has occurred. In this case, we shall try the bifurcation treatment first (see Section 3.3). If there is no bifurcation between (λ₀, t₀) and (λ₁, t₁), apparently curve jumping has taken place; we then cut the stepsize in half and repeat the prediction and correction at (λ₀, t₀). If both tests (a) and (b) pass, i.e. (λ₁, t₁) lies in the proper region and there is no sign change of H_λ at (λ₁, t₁), then (λ₁, t₁) is accepted as a new point on the eigenvalue path.
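The acceptance logic (a)-(b) is compact enough to restate as code. A sketch with hypothetical helpers of our own naming: `bracket` is the interval between neighboring real eigenvalues of D from property (i), and H_lam evaluates H_λ (by Hyman's method in the thesis).

```python
def accept_point(lam0, t0, lam1, t1, H_lam, bracket):
    """Tests (a) and (b): region check from property (i), and the
    degree-preserving check from property (iii)."""
    lo, hi = bracket
    in_region = (lo < lam1 < hi) and (0.0 <= t1 <= 1.0)           # test (a)
    same_degree = (H_lam(lam0, t0) > 0) == (H_lam(lam1, t1) > 0)  # test (b)
    if not in_region:
        return "reject: halve stepsize and re-predict"   # path jumping
    if not same_degree:
        return "try bifurcation treatment (Section 3.3)"  # else path jumping
    return "accept"
```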
3.2 Following the complex eigenvalue paths

Complex eigenvalue paths appear in pairs: if (λ(s), t(s)) is a complex eigenvalue path, so is its conjugate (λ̄(s), t(s)). Only one of them needs to be followed — say, the one with positive imaginary part of λ(s).

Let (λ₀, t₀) be a point on Γ = (λ(s), t(s)) with Im(λ(s)) > 0. The tangent vector (λ̇, ṫ) at (λ₀, t₀) is the solution of the system of equations

$$ H_\lambda \dot\lambda + H_t \dot t = 0, \qquad |\dot\lambda|^2 + \dot t^2 = 1 \quad (\dot t > 0). $$

After finding the tangent vector (λ̇, ṫ) by solving the above equations, we make the prediction

(λ₁⁰, t₁⁰) = (λ₀, t₀) + δ(λ̇, ṫ)

with stepsize δ > 0. Since H(λ, t) = 0 is a system of two real equations in three real variables (counting the real and imaginary parts of λ), in order to apply Newton's method for the correction, an additional equation is attached. First we choose a vector (u, v) ∈ C × R; then we augment H(λ, t) = 0 with the plane through the prediction perpendicular to (u, v):

$$ \mathrm{Re}\big[\bar u(\lambda - \lambda_1^0)\big] + v(t - t_1^0) = 0. \tag{12} $$

There are three options for the choice of the plane in (12) (see Figure 6):

(i) When ṫ ≈ 1, (u, v) is chosen to be (0, 1). With this choice, the correction is executed for fixed t = t₁⁰. The Newton iteration takes the simple form

$$ \lambda_1^{m+1} = \lambda_1^m - \frac{H(\lambda_1^m, t_1^0)}{H_\lambda(\lambda_1^m, t_1^0)}, $$

and the cost is as low as 4n² + O(n) flops per step.

(ii) If ṫ ≈ 0, (u, v) is chosen to be (λ̇, ṫ). The correction in this case is on the plane perpendicular to the tangent vector.

(iii) When a bifurcation is suspected, i.e. Im(λ₁⁰) < 0, choose (u, v) = (i, 0), because (i, 0) is the tangent vector at the bifurcation point. This case will be discussed later.

[Figure omitted. Option (i): correction at fixed t. Option (ii): correction on the plane perpendicular to the tangent. Option (iii): correction with fixed imaginary part.]

Figure 6: Options for the complex correction

When a proper (u, v) is determined, Newton's iteration is performed on the equation H(λ, t) = 0 augmented with (12), starting from the initial point (λ₁⁰, t₁⁰). If the iteration does not converge, the stepsize is cut in half and the prediction-correction is repeated at (λ₀, t₀). If Newton's iteration converges, let (λ₁, t₁) be the convergent point. If Im(λ₁) > 0, then there is no bifurcation between (λ₀, t₀) and (λ₁, t₁), and (λ₁, t₁) is accepted as a new point on Γ. Otherwise, if Im(λ₁) < 0, there is a bifurcation between (λ₀, t₀) and (λ₁, t₁), and the bifurcation treatment discussed in the following section is employed.
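The augmented Newton correction is a real 3 × 3 system in (Re λ, Im λ, t); the Jacobian of H contributes two rows via the Cauchy-Riemann equations, and the plane (12) contributes the third. A sketch under stated assumptions — H, H_lam, H_t are callables, and the function name is ours:

```python
import numpy as np

def complex_correct(lam, t, u, v, H, H_lam, H_t, tol=1e-12, maxit=20):
    """Newton iteration for H(lam, t) = 0 augmented with the plane (12)
    through the prediction (lam, t) perpendicular to (u, v)."""
    lam0, t0 = lam, t
    for _ in range(maxit):
        h = H(lam, t)
        hl, ht = H_lam(lam, t), H_t(lam, t)
        r = np.array([h.real, h.imag,
                      (np.conj(u) * (lam - lam0)).real + v * (t - t0)])
        # rows: d(Re H), d(Im H), d(plane) w.r.t. (Re lam, Im lam, t)
        J = np.array([[hl.real, -hl.imag, ht.real],
                      [hl.imag,  hl.real, ht.imag],
                      [np.real(u), np.imag(u), v]])
        dx = np.linalg.solve(J, -r)
        lam += dx[0] + 1j * dx[1]
        t += dx[2]
        if np.linalg.norm(dx) < tol:
            return lam, t
    raise RuntimeError("correction did not converge; halve the stepsize")
```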
3.3 Bifurcation

Bifurcations cannot be avoided in our homotopy. Therefore, an efficient algorithm is needed to identify the bifurcation point and continuously follow the bifurcation branches. In [11], a method is introduced to detect and pass the bifurcation point for their real homotopy. However, the simplicity of our homotopy and of the associated eigenvalue paths leaves room for a much more efficient method.

3.3.1 Real-to-complex bifurcation

As mentioned before, if (λ₀, t₀) and (λ₁, t₁) are two consecutive points obtained on a real eigenvalue path Γ = (λ(s), t(s)), and the sign of H_λ at (λ₁, t₁) differs from the sign of H_λ at (λ₀, t₀), then a bifurcation point (λ⁰, t⁰) exists between these two points, at which H_λ must vanish. A linear interpolation is used first to approximate λ⁰; that is, let

$$ \tilde\lambda = \lambda_0 - H_\lambda(\lambda_0, t_0)\,\frac{\lambda_0 - \lambda_1}{H_\lambda(\lambda_0, t_0) - H_\lambda(\lambda_1, t_1)}. $$

Then fix λ̃ and solve for t̃ by

$$ \tilde t = t_0 - \frac{H(\tilde\lambda, t_0)}{H_t(\tilde\lambda, t_0)}. \tag{13} $$

The point (λ̃, t̃) is taken as an approximation of the bifurcation point (λ⁰, t⁰). With the "lifting" technique described below, the precision of the approximation (λ̃, t̃) becomes much less important.

From (λ̃, t̃), we make a prediction of the complex bifurcation branch by lifting (λ̃, t̃) into the complex space C × [0,1]: that is, 10⁻¹⁰ i is added to λ̃, and (λ̃ + 10⁻¹⁰ i, t̃) is taken as the new prediction. The correction is then carried out on the plane Im(λ) = 10⁻¹⁰ (option (iii) of the plane in (12)). This procedure is very efficient, and the real-to-complex transition can be completed without computing the tangent vector at (λ̃, t̃). The procedure is shown in Figure 7.

[Figure omitted: 1. a bifurcation is identified by the sign change of the degree along the real eigenvalue path; 2. an approximate bifurcation point is computed; 3. the approximate bifurcation point is lifted along the imaginary axis by 10⁻¹⁰ and corrected onto the complex eigenvalue path, which is then followed further.]

Figure 7: Real-to-complex bifurcation algorithm

Remark: The sign change of H_λ at (λ₁, t₁) can also be caused by curve jumping, since two adjacent real eigenvalue paths carry different degrees, namely +1 and −1. This situation can easily be detected: if t̃ in (13) is outside the interval [0,1], or the correction after "lifting" does not converge, then there is no bifurcation between (λ₀, t₀) and (λ₁, t₁). Apparently, (λ₁, t₁) is on the wrong eigenvalue path and must be discarded. The prediction-correction process is then repeated at (λ₀, t₀) with a shorter stepsize.

3.3.2 Complex-to-real bifurcation

Let (λ₀, t₀) and (λ₁, t₁) be two consecutive points obtained on a complex eigenvalue path. If Im(λ₀) > 0 and Im(λ₁) < 0, then a bifurcation point (λ⁰, t⁰) with Im(λ⁰) = 0 exists. We proceed by essentially reversing the steps of the real-to-complex bifurcation. To approximate (λ⁰, t⁰), let δ* be the solution of

$$ \mathrm{Im}(\lambda_0 + \delta\dot\lambda) = 10^{-10} \tag{14} $$

where (λ̇, ṫ) is the tangent vector at (λ₀, t₀). Making a new prediction at (λ₀, t₀) with tangent vector (λ̇, ṫ) and stepsize δ*, and carrying out the correction on the plane Im(λ) = 10⁻¹⁰ (i.e. option (iii) in (12)), yields an approximation (λ̃, t̃) of the bifurcation point (λ⁰, t⁰). We then project (λ̃, t̃) from C × [0,1] into R × [0,1] by taking (Re(λ̃), t̃) as a prediction of the real bifurcation branch, and the correction is made on the line λ = Re(λ̃) to obtain a point (λ, t) on a real bifurcation branch.

From the bifurcation analysis, there are two real bifurcation branches, with tangent vectors (1, 0) and (−1, 0) respectively. Thus the prediction-correction process can be started with each tangent vector separately to trace both eigenvalue paths. The complex-to-real transition at a bifurcation point is shown in Figure 8.

[Figure omitted: 1. a bifurcation is identified because the imaginary part changes sign along a complex path; 2. a prediction-correction is made on the lifted plane Im(λ) = 10⁻¹⁰ and the result is projected into R × [0,1]; 3. from the projected point, a correction is performed toward the real path.]

Figure 8: Complex-to-real bifurcation algorithm
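The real-to-complex transition of Section 3.3.1 fits in a few lines. A sketch with assumed callables (and the corrector `complex_correct` sketched after Section 3.2); non-convergence of the corrector is treated as "no bifurcation", as in the Remark above. It relies on H_λ carrying opposite signs at the two input points.

```python
EPS = 1e-10  # the lifting height of Sections 3.3.1-3.3.2

def lift_and_correct(lam0, t0, lam1, t1, H, H_lam, H_t, correct):
    """Secant-approximate the zero of H_lam, take one Newton step in t
    as in (13), lift by EPS*i, and correct on the plane Im(lam) = EPS."""
    lam = lam0 - H_lam(lam0, t0) * (lam0 - lam1) \
                 / (H_lam(lam0, t0) - H_lam(lam1, t1))
    t = t0 - H(lam, t0) / H_t(lam, t0)          # equation (13)
    if not 0.0 <= t <= 1.0:
        return None                              # no bifurcation: curve jumping
    try:
        # (u, v) = (i, 0) fixes Im(lam) = EPS: option (iii) of (12)
        return correct(lam + 1j * EPS, t, u=1j, v=0.0,
                       H=H, H_lam=H_lam, H_t=H_t)
    except RuntimeError:
        return None                              # correction failed: curve jumping
```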
3.4 The control of stepsizes

(i) Initial stepsize. In following an eigenvalue path of H(λ, t) = det([(1 − t)D + tA] − λI) = 0 from an initial point (λ₀, 0), where λ₀ is an eigenvalue of D, the first attempt of the algorithm is to choose the initial stepsize δ = 1. The point (λ₀, 1) is then taken as a prediction point, followed by Newton's correction at t = 1. This procedure — 0-order prediction with stepsize 1 followed by Newton's correction — is the same as applying Newton's iteration directly to the equation det(A − λI) = 0 with λ₀ as a starting point. The choice of the initial matrix D in (9) makes the eigenvalues of D very close to the eigenvalues of A. Hence Newton's iteration on det(A − λI) = 0 with starting point λ₀ has a great chance to converge. Indeed, the numerical results indicate that the vast majority (usually more than 90%) of the eigenvalues of A can be obtained this way. If Newton's correction fails to converge, then the stepsize is chosen in the standard way described below. Let (λ̇₀, ṫ₀) be the tangent vector evaluated at (λ₀, 0). If ṫ₀ is close to 1, then |λ̇₀| ≈ 0, since |λ̇₀|² + ṫ₀² = 1. So the eigenvalue path is close to a straight line at (λ₀, 0) and can tolerate a large stepsize; the stepsize is taken to be δ = 1/ṫ₀ in this case. When ṫ₀ ≪ 1, we choose δ = max{0.01, |ṫ₀|⁵}.

(ii) Cutting the stepsize. From a given point (λ₀, t₀) ∈ H⁻¹(0) on an eigenvalue path, let the stepsize in the current prediction be δ. In the correction process, if the iteration does not show a "tendency of convergence", the prediction is repeated at (λ₀, t₀) with stepsize δ/2. The criterion we use for judging the "tendency of convergence" is

$$ |\lambda_1^{k+1} - \lambda_1^{k}| \le \tfrac{1}{3}\,|\lambda_1^{k} - \lambda_1^{k-1}|. $$

(iii) Increasing the stepsize. When the prediction-correction step at (λ₀, t₀) is successful, a new point (λ₁, t₁) on the eigenvalue path is obtained. If the angle between the tangent vectors at these two points is small (say, less than 15°), then the eigenvalue path appears to be quite flat, and a doubled stepsize is used in the next prediction attempt at (λ₁, t₁). Otherwise, the last successful stepsize in achieving (λ₁, t₁) is used for the prediction at (λ₁, t₁).

(iv) Adjusting the stepsize. If the prediction (λ₀, t₀) + δ(λ̇, ṫ) gives t₀ + δṫ > 1, then the stepsize δ is reset to (1 − t₀)/ṫ, making the prediction reach the plane t = 1.

3.5 Numerical Results

The eigenvalues of the initial matrix

$$ D = \begin{pmatrix} D_1 & * \\ 0 & D_2 \end{pmatrix} $$

consist of the eigenvalues of the submatrices D₁ and D₂, both irreducible upper Hessenberg matrices. The strategy of "divide and conquer" can certainly be repeated in finding the eigenvalues of D₁ and D₂, and so on. However, our experience shows that the QR algorithm implemented in EISPACK [14], the HQR subroutine, is normally faster than the homotopy-determinant algorithm for n ≤ 25. Therefore the "divide" procedure is stopped when the sizes of the submatrices are less than 25. The eigenvalues of those submatrices are solved by the QR algorithm, and the "conquer" procedure then proceeds consecutively.

The homotopy-determinant algorithm was tested on n × n upper Hessenberg matrices A = (a_{ij}) whose entries were generated by a random number generator. The computations were done on a SPARCstation 1 in double precision. To obtain all the eigenvalues λ₁, ..., λₙ of A, we require that

$$ \max_{1\le j\le n} \frac{|\det(A - \lambda_j I)|}{\|A\|} \le 10^{-16}, $$

which is the same requirement used in HQR.

For a fixed matrix size n, the algorithm was executed on more than 20 different matrices, consecutive in a preset random number sequence. The results are shown in Table 1. The efficiency of the algorithm is closely related to the number of bifurcations encountered in following the eigenvalue paths; thus, for fixed n, the CPU time of each individual case varies over a relatively wide range. However, the average CPU time is quite encouraging. As a comparison, the results of the counterpart of the homotopy algorithm in EISPACK, the subroutine HQR, are also listed in Table 1. For each fixed matrix order, there are often a few cases in which the homotopy-determinant method runs slower.
Nevertheless, the method holds an edge in average speed (the average over more than 20 different matrices for each n), up to twice as fast as HQR when the matrix order increases to 400.

  Matrix          time (seconds)                    Average time ratio
  order           minimum    maximum    average     HQR / H-D
  20     HQR      0.34       0.49       0.3973
         H-D      0.21       1.51       0.5609      0.708
  25     HQR      0.63       0.87       0.7388
         H-D      0.38       1.58       0.7144      1.034
  50     HQR      4.58       5.68       5.11
         H-D      2.01       7.72       3.96        1.29
  100    HQR      34.05      39.02      36.41
         H-D      11.99      40.72      22.90       1.59
  200    HQR      254.52     281.57     266.30
         H-D      72.28      340.24     142.01      2.26
  300    HQR      643.18     660.50     651.84
         H-D      208.26     579.51     320.88      2.03
  400    HQR      1536.61    1563.24    1549.40
         H-D      510.07     1779.93    689.47      2.25

Table 1: time comparison between HQR and H-D (H-D: homotopy-determinant method)

While the potential of the homotopy method lies in its natural parallelism, in the sense that each eigenvalue path can be followed independently, it is remarkable that the algorithm is strongly competitive on these examples even on serial computers.

As discussed in Chapter 3, the 0-order prediction with stepsize one at an initial point (λ₀, 0), where λ₀ is an eigenvalue of the initial matrix D, followed by Newton's correction at t = 1 — or equivalently, applying Newton's iteration directly to det(A − λI) = 0 with starting point λ₀ — has a great chance to converge successfully. If this procedure succeeds, the corresponding eigenvalue of A is obtained in a single step of path following. Such an eigenvalue path is called an "easy path". Table 2 shows the percentage of "easy paths" obtained in each category. The overwhelming majority of the eigenvalue paths are "easy paths", and the proportion of "non-easy paths", which constitute the major part of the computations, stays relatively invariant as the matrix order increases.

Bifurcations are inevitable when real homotopies are used. Table 2 also shows the bifurcation frequency, that is, the average number of bifurcation points encountered in following all the eigenvalue paths of a single matrix. Apparently, the choice of the special form of the initial matrix D via the "divide and conquer" strategy minimizes the occurrence of bifurcation points.

  Matrix order   Percentage of   Number of "non-easy"   Bifurcation frequency
  n              "easy" paths    paths                  (bifurcation points per matrix)
  25             87.1%           2.51                   0.92
  50             92.3%           2.96                   1.27
  100            93.4%           5.04                   2.08
  200            95.7%           6.5                    2.78
  300            98.1%           4.3                    0.92
  400            96.0%           12.2                   1.56

Table 2: rates of "easy" paths and bifurcations

In all our computations, curve jumping never went undetected before the final results were reached, and it was effectively prevented by adjusting the stepsize.

4 Evaluation of the eigenvectors

4.1 Eigenvector algorithm

An eigenvalue λ* obtained in our algorithm is the result of the Newton iteration

$$ \lambda_{m+1} = \lambda_m - \frac{H(\lambda_m, 1)}{H_\lambda(\lambda_m, 1)} $$

at the level t = 1. The value λ* is accepted as an approximation of an eigenvalue if |λ* − λ_m| is small enough. In the meantime, the values of H(λ_m, 1) and x = (x₁, ..., xₙ)ᵀ in

$$ F(\lambda_m) \equiv (-1)^{n+1}\Big[\prod_{i=2}^{n} a_{i,i-1}\Big]^{-1} H(\lambda_m, 1) = (a_{11} - \lambda_m)x_1 + a_{12}x_2 + \cdots + a_{1n}x_n, $$

where

$$\begin{aligned}
a_{21}x_1 + (a_{22}-\lambda_m)x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= 0\\
a_{32}x_2 + (a_{33}-\lambda_m)x_3 + \cdots + a_{3n}x_n &= 0\\
&\ \ \vdots\\
a_{k,k-1}x_{k-1} + (a_{kk}-\lambda_m)x_k + \cdots + a_{kn}x_n &= 0\\
&\ \ \vdots\\
a_{n,n-1}x_{n-1} + (a_{nn}-\lambda_m)x_n &= 0\\
x_n &= 1,
\end{aligned}\tag{15}$$

as well as H_λ(λ_m, 1) and x′ = (x′₁, ..., x′ₙ) in

$$ F'(\lambda_m) \equiv (-1)^{n+1}\Big[\prod_{i=2}^{n} a_{i,i-1}\Big]^{-1} H_\lambda(\lambda_m, 1) = (a_{11} - \lambda_m)x'_1 - x_1 + a_{12}x'_2 + \cdots + a_{1n}x'_n, $$

where

$$\begin{aligned}
a_{21}x'_1 + (a_{22}-\lambda_m)x'_2 + a_{23}x'_3 + \cdots + a_{2n}x'_n &= x_2\\
a_{32}x'_2 + (a_{33}-\lambda_m)x'_3 + \cdots + a_{3n}x'_n &= x_3\\
&\ \ \vdots\\
a_{k,k-1}x'_{k-1} + (a_{kk}-\lambda_m)x'_k + \cdots + a_{kn}x'_n &= x_k\\
&\ \ \vdots\\
a_{n,n-1}x'_{n-1} + (a_{nn}-\lambda_m)x'_n &= x_n\\
x'_n &= 0,
\end{aligned}\tag{16}$$

have also been calculated (see Section 1.4).
If the eigenvector corresponding to the eigenvalue λ* is also wanted, the above values are useful in reducing the computation. Usually, a vector u ∈ Cⁿ can be accepted as an approximate eigenvector when the relative residual

$$ \mathrm{res}(\lambda, u) = \frac{\|Au - \lambda u\|}{\|A\|\,\|u\|} $$

is small. In our case here, it is clear that

$$ \mathrm{res}(\lambda_m, x) \approx \frac{|F(\lambda_m)|}{\|A\|\,\|x\|}. $$

Thus, if |F(λ_m)|/(‖A‖‖x‖) is small, then x is an acceptable eigenvector associated with λ*. In general, however, the residual |F(λ_m)|/(‖A‖‖x‖) is not small, since |F(λ_m)| might be of large magnitude even if λ_m is very close to λ*. For instance, consider a 100 × 100 upper Hessenberg matrix A with all subdiagonal entries a_{i+1,i} = 1/2, i = 1, ..., n − 1, and other entries of magnitude O(1). Assume that we can evaluate the eigenvalue λ to the usual working accuracy |det[A − λI]| = 10⁻¹⁵‖A‖. We can only expect

$$ \mathrm{res}(\lambda, x) \approx \frac{\|Ax - \lambda x\|}{\|A\|\,\|x\|} = \frac{\big|\big[\prod_{i=2}^{n} a_{i,i-1}\big]^{-1}\det[A - \lambda I]\big|}{\|A\|\,\|x\|} = \frac{2^{99} \times 10^{-15}}{\|x\|} > \frac{6 \times 10^{14}}{\|x\|}, $$

which is not small in general unless ‖x‖ is very large.

From this viewpoint, Hyman's method should not be used in the evaluation of eigenvectors (Wilkinson [13], page 627). However, the following analysis suggests that in ordinary cases x can be refined to an excellent approximation of the eigenvector corresponding to λ*, in a way that requires negligible further computation.

We assume that λ* = λ_m − ε, where ε is smaller than our preassigned tolerance. Note that

F(λ_m) − εF′(λ_m) = 0.

It is easy to see from (15) and (16) that

$$ (A - \lambda_m I)x = \begin{pmatrix} F(\lambda_m) \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \qquad (A - \lambda_m I)x' = \begin{pmatrix} F'(\lambda_m) \\ 0 \\ \vdots \\ 0 \end{pmatrix} + x. $$

Let x̂ = x − εx′. Then

$$ (A - \lambda_m I)\hat x = (A - \lambda_m I)(x - \varepsilon x') = \begin{pmatrix} F(\lambda_m) - \varepsilon F'(\lambda_m) \\ 0 \\ \vdots \\ 0 \end{pmatrix} - \varepsilon x = -\varepsilon x. $$

If λ* is accepted as the eigenvalue, then

$$ (A - \lambda_* I)\hat x = [A - (\lambda_m - \varepsilon)I]\hat x = \varepsilon(-x + \hat x) = -\varepsilon^2 x'. $$

Hence

$$ \mathrm{res}(\lambda_*, \hat x) = \frac{\|(A - \lambda_* I)\hat x\|}{\|A\|\,\|\hat x\|} = \frac{\varepsilon^2\|x'\|}{\|A\|\,\|\hat x\|}. $$

This residual is easily available since the values of x and x′ have been computed. If this residual is less than the preassigned tolerance, then x̂ can be accepted as the eigenvector corresponding to the eigenvalue λ*.

If the above residual is not small enough, then inverse iteration is executed to refine x̂:

$$ (A - \lambda_* I)u^{m+1} = u^m, \qquad u^{m+1} = \frac{u^{m+1}}{\|u^{m+1}\|}, \qquad m = 0, 1, 2, \ldots \tag{17} $$

and x̂ can be used as the initial vector u⁰, because it is more or less an approximation of the eigenvector.

Our eigenvector algorithm can be summarized as follows. Let tol be the preassigned residual tolerance. Suppose in the m-th step of the Newton iteration the values λ_m, x, x′ and F(λ_m) are obtained, and λ* = λ_m − ε is the accepted eigenvalue. Then:

- Test 1: If |F(λ_m)|/(‖A‖‖x‖) ≤ tol, then x is accepted as the eigenvector corresponding to the eigenvalue λ*. Stop. Otherwise, perform Test 2.

- Test 2: If ε²‖x′‖/(‖A‖‖x̂‖) ≤ tol, then x̂ ≡ x − εx′ is accepted as the eigenvector corresponding to the eigenvalue λ*. Stop. Otherwise, perform inverse iteration.

- Inverse iteration: If both Test 1 and Test 2 have failed, execute the inverse iteration (17).
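The whole algorithm uses only quantities already produced by the final Newton step. A sketch (our own naming), taking x and x′ as returned by the hypothetical `hyman()` routine of Section 1.4; the dense LU solve in the fallback stands in for INVIT, which would exploit the Hessenberg structure:

```python
import numpy as np

def refine_eigenvector(A, lam_m, F, x, xp, eps, tol):
    """Tests 1 and 2 of Section 4.1 with fallback to inverse iteration (17).
    F = F(lam_m); x, xp solve (15) and (16); eps = F(lam_m)/F'(lam_m),
    so the accepted eigenvalue is lam_star = lam_m - eps."""
    n = A.shape[0]
    normA = np.linalg.norm(A, np.inf)
    lam_star = lam_m - eps
    if abs(F) / (normA * np.linalg.norm(x)) <= tol:          # Test 1
        return x
    x_hat = x - eps * xp                                      # one cheap update
    if abs(eps)**2 * np.linalg.norm(xp) / (normA * np.linalg.norm(x_hat)) <= tol:
        return x_hat                                          # Test 2
    u = x_hat / np.linalg.norm(x_hat)                         # inverse iteration (17)
    M = A - lam_star * np.eye(n)
    for _ in range(5):
        u = np.linalg.solve(M, u)
        u /= np.linalg.norm(u)
    return u
```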
4.2 Numerical Results

We have tested the eigenvector algorithm described in Section 4.1 against INVIT, the implementation of inverse iteration in EISPACK. In our test, we use a random number generator on the interval [0, 1] to obtain 10 matrices for each order n = 100, 200 and 300 and reduce them to upper Hessenberg form. With the eigenvalues evaluated, we first execute INVIT, then evaluate the residue attained by INVIT and use this residue as the tolerance for our algorithm. Because the residue tolerance is the actual residue obtained from INVIT, our algorithm performs to the same accuracy as INVIT. The results are summarized in Table 3.

Matrix   Time for    Time for the homotopy-   % of e'vectors   % of e'vectors   % of eigenvectors from
order    INVIT       determinant method       by Test 1        by Test 2        inverse iteration
 100     ≥ 12.51     ≤ 0.13                       96.8              3.2                 0
 200     ≥ 98.62     ≤ 0.61                       92.4              7.6                 0
 300     ≥ 332.46    ≤ 0.93                       93.9              6.1                 0

Table 3: Performance of the eigenvector algorithm and comparison with INVIT in EISPACK (times in seconds)

Table 3 clearly indicates that Test 2 is necessary, because not all x can pass Test 1. It is also surprising that no inverse iteration at all was needed for the random matrices we tried.

From our experience, the performance of our eigenvector evaluation is strongly related to the magnitude of the subdiagonal entries a_{i+1,i}, i = 1, ..., n-1, of the matrix A. When these entries are extraordinarily small, our method has less of an advantage. Table 4 shows the effect of the magnitude of the subdiagonal entries on our method. With a fixed 100 × 100 upper Hessenberg matrix, reduced from a full random matrix, we multiply the subdiagonal entries of A by a shrinking factor α before performing our eigenvector algorithm. We can see that more and more eigenvectors fail both tests as the shrinking factor α decreases.

Shrinking   Time ratio:      % of e'vectors   % of e'vectors   % of eigenvectors from
factor α    homotopy/INVIT   by Test 1        by Test 2        inverse iteration
 1.0          0.0               100.0             0                  0
 0.8          0.0                98.2             1.8                0
 0.6          0.011              82.1            17.8                0
 0.4          0.033              44.8            55.2                0
 0.2          0.043              22.0            78.0                0
 0.1          0.107               9.2            83.1                7.7
 0.08         0.196               4.5            79.1               16.4
 0.06         0.227               5.6            73.6               20.8
 0.04         0.466               5.3            47.4               47.4
 0.02         0.743               4.6            25.3               70.1
 0.01         0.831               1.9            17.6               81.3
 0.008        0.830               3.3            15.4               81.3
 0.006        0.855               3.3            13.0               83.7
 0.004        0.905               1.1            10.8               88.2
 0.002        0.923               2.1             8.3               90.4
 0.001        0.930               4.1             7.2               90.7

Table 4: The effect of the magnitude of the subdiagonal entries upon the eigenvector evaluation

Remark: In column 1, the shrinking factor decreases from 1.0 to 0.001. Column 2 shows that the time ratio of the homotopy method over INVIT approaches 1. The sum of columns 3 and 4 represents the percentage of eigenvectors obtained without inverse iteration; this percentage decreases as the subdiagonal entries become smaller than usual.

5 Parallelization and Vectorization

The homotopy-determinant method is not only a successful sequential algorithm for nonsymmetric eigenvalue problems; it is also suitable for various parallel architectures. Based on the CREW PRAM (concurrent-read exclusive-write parallel random access machine) model, a powerful parallel homotopy algorithm is presented in this chapter. It achieves O(n) time complexity using n² processors.

5.1 Parallelization

There are by now several versions of homotopy algorithms. They enjoy the common advantage of natural parallelism, in the sense that following each eigenvalue path is independent of the others. For nonsymmetric eigenvalue problems, each eigenvalue path requires O(n²) flops (floating point operations) to follow. Thus, if each processor of the parallel machine is assigned an eigenvalue path, and if n processors are available, then the time complexity of the eigenvalue evaluation is reduced by a factor of n.
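This coarse level of parallelism needs no shared state at all, so it maps directly onto a process pool; the sketch below (our names, not the thesis code) farms out one path per task using Python's standard library. Here newton_eigenvalue, from the Section 4.1 sketch, is only the one-step "easy path" attempt of Chapter 3; a full implementation would fall back to genuine path following when it returns None. Note that the worker function must live at module top level so the pool can pickle it.

from concurrent.futures import ProcessPoolExecutor

def all_eigenvalues(A, initial_eigenvalues):
    # One independent task per eigenvalue path; the paths share nothing,
    # so they can run in separate processes.  A result of None marks a
    # "non-easy" path that needs the full predictor-corrector of Chapter 3.
    with ProcessPoolExecutor() as pool:
        futures = [pool.submit(newton_eigenvalue, A, lam0)
                   for lam0 in initial_eigenvalues]
        return [f.result() for f in futures]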
The homotopy-determinant algorithm developed here can achieve even more parallelism. In the path-following process, the major computations occur in the evaluation of the determinants and their derivatives by Hyman's method:

    H(\lambda, t) = \Bigl(t \prod_{j=2}^{n} a_{j,j-1}\Bigr)\bigl[(a_{11} - \lambda)x_1 + a_{12}x_2 + \cdots + a_{1n}x_n\bigr],

where

    \begin{aligned}
    a_{21}x_1 + (a_{22} - \lambda)x_2 + \cdots + a_{2n}x_n &= 0 \\
    &\;\;\vdots \\
    t\,a_{p+1,p}x_p + (a_{p+1,p+1} - \lambda)x_{p+1} + \cdots + a_{p+1,n}x_n &= 0 \\
    &\;\;\vdots \\
    a_{n,n-1}x_{n-1} + (a_{nn} - \lambda)x_n &= 0 \\
    x_n &= 1,
    \end{aligned}                                              (18)

and

    H_\lambda(\lambda, t) = \Bigl(t \prod_{j=2}^{n} a_{j,j-1}\Bigr)\bigl[(a_{11} - \lambda)y_1 - x_1 + a_{12}y_2 + \cdots + a_{1n}y_n\bigr],

where

    \begin{aligned}
    a_{21}y_1 + (a_{22} - \lambda)y_2 + \cdots + a_{2n}y_n &= x_2 \\
    &\;\;\vdots \\
    t\,a_{p+1,p}y_p + (a_{p+1,p+1} - \lambda)y_{p+1} + \cdots + a_{p+1,n}y_n &= x_{p+1} \\
    &\;\;\vdots \\
    a_{n,n-1}y_{n-1} + (a_{nn} - \lambda)y_n &= x_n \\
    y_n &= 0,
    \end{aligned}                                              (19)

and x_1, ..., x_n are as in (18). Also,

    H_t(\lambda, t) = \Bigl(\prod_{j=2}^{n} a_{j,j-1}\Bigr)\bigl[(a_{11} - \lambda)x_1 + a_{12}x_2 + \cdots + a_{1n}x_n\bigr] + t\Bigl(\prod_{j=2}^{n} a_{j,j-1}\Bigr)\bigl[(a_{11} - \lambda)z_1 + a_{12}z_2 + \cdots + a_{1p}z_p\bigr],

where z_1, ..., z_p are the solution of the following system (the last equation comes from differentiating the (p+1)-st equation of (18) with respect to t, the derivatives of x_{p+1}, ..., x_n vanishing):

    \begin{aligned}
    a_{21}z_1 + (a_{22} - \lambda)z_2 + \cdots + a_{2p}z_p &= 0 \\
    &\;\;\vdots \\
    a_{p,p-1}z_{p-1} + (a_{pp} - \lambda)z_p &= 0 \\
    z_p &= -x_p/t.
    \end{aligned}                                              (20)

Virtually all of the computation occurs in solving the upper triangular systems (18), (19) and (20), each of the general form Bx = c with B = (b_{ij}) an m × m upper triangular matrix. Such a linear system can be solved by the following parallel procedure:

    For i = m down to 1 do
        x_i ← c_i / b_{ii}
        For j = 1 to i - 1 do in parallel
            c_j ← c_j - b_{ji} x_i
        end for
    end for
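The following NumPy sketch implements this column-sweep back substitution (the function name is ours); the inner "do parallel" loop collapses into a single vector update on c_1, ..., c_{i-1}, which is also exactly the vector operation used by the vector algorithm of Section 5.2.

import numpy as np

def solve_upper_triangular(B, c):
    # Column-sweep back substitution for B x = c, B upper triangular.
    m = B.shape[0]
    x = np.zeros(m, dtype=np.result_type(B, c))
    c = np.array(c, dtype=x.dtype)       # private working copy of c
    for i in range(m - 1, -1, -1):
        x[i] = c[i] / B[i, i]
        # the i-1 updates below are mutually independent: they run
        # concurrently in the parallel procedure, or as one vector
        # operation on a vector machine
        c[:i] -= B[:i, i] * x[i]
    return x

For (18), for instance, the system in the unknowns x_1, ..., x_{n-1}, obtained from rows 2, ..., n of (18) after moving the known x_n = 1 to the right-hand side, is exactly of this form.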
The above two levels of parallelization, parallelization by path and parallelization within the upper-triangular system solver, can be combined into a powerful parallel algorithm:

Parallel Homotopy Algorithm for the Eigenvalue Problem

Input: an n × n matrix A.
Output: the n eigenvalues of A.
Procedure:

1. If n ≤ 2, solve for the eigenvalues directly; stop.

2. (n > 2) Form the initial matrix

    D = \begin{pmatrix} D_1 & D_3 \\ 0 & D_2 \end{pmatrix}

and recursively solve the eigenproblems for D_1 and D_2 in parallel to get the n initial eigenvalues of A(0).

3. For each initial eigenvalue λ_0, do in parallel:

    (a) For k = 0, 1, ..., evaluate H(λ_k, 1) and H_λ(λ_k, 1) in parallel and perform the Newton iteration λ_{k+1} = λ_k - H(λ_k, 1)/H_λ(λ_k, 1). If convergent, go to (k); otherwise, go to the next step.
    (b) Set t_0 = 0.
    (c) Evaluate H_λ(λ_0, t_0) and H_t(λ_0, t_0) in parallel.
    (d) Find the tangent vector (u, v) by solving H_λ u + H_t v = 0.
    (e) Predict with stepsize δ: λ_1^0 = λ_0 + δu, t_1^0 = t_0 + δv. (If t_1^0 > 1, reset δ = (1 - t_0)/v and predict again so that t_1^0 = 1.)
    (f) Set m = 0.
    (g) Evaluate H(λ_1^m, t_1^m), H_λ(λ_1^m, t_1^m) and H_t(λ_1^m, t_1^m) in parallel.
    (h) Correct the error in the prediction by performing one step of Newton iteration using the results of step 3(g), obtaining (λ_1^{m+1}, t_1^{m+1}).
    (i) Set m := m + 1 and go back to step 3(g) until (λ_1^m, t_1^m) is a satisfactory approximation of a point on the eigenvalue path.
    (j) Set (λ_0, t_0) ← (λ_1^m, t_1^m) and go to 3(c) until t_0 = 1.
    (k) Accept the resulting eigenvalue.

Under the CREW PRAM model, suppose n² processors are used, with each set of n processors assigned to one initial eigenvalue in Step 3 to follow one eigenvalue path; these n processors are used within Step 3 to evaluate H, H_λ and H_t for that path. Then O(n) running time suffices to find all eigenvalues of A. Indeed, let T(n) denote the time complexity of the algorithm. Step 1 takes constant time. Step 2 computes the eigenvalues of two n/2 × n/2 matrices in parallel, so it needs T(n/2) time. The n eigenvalue paths are traced simultaneously in Step 3 to obtain the n eigenvalues of A. Steps 3(a), 3(c) and 3(g) can each be done in O(n) time, using the n processors assigned to each eigenvalue path. The remaining substeps of Step 3 require at most O(1) time each. Since the number of steps needed to follow an eigenvalue path is independent of n, the number of iterations within Step 3 is bounded by a constant. Thus T(n) = T(n/2) + O(n), which gives T(n) = O(n). The total cost of this process is n² × O(n) = O(n³), which matches the cost of the best known sequential algorithms. The speed-up of the parallel algorithm is thus very significant in terms of the order of n.
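To make the recursive structure of Steps 1 and 2 concrete, here is a minimal serial skeleton (our names, not the thesis code). It assumes, as the entry t·a_{p+1,p} in (18) indicates, that D agrees with the upper Hessenberg A except that the subdiagonal entry a_{p+1,p} is set to zero, so the initial eigenvalues are those of the two diagonal blocks; np.linalg.eigvals is only a stand-in for the direct 1 × 1 / 2 × 2 solve of Step 1, and newton_eigenvalue again stands in for full path following. In the parallel algorithm the two recursive calls run concurrently.

import numpy as np

def homotopy_eigenvalues(A):
    # Steps 1-3 of the Parallel Homotopy Algorithm, serial sketch.
    n = A.shape[0]
    if n <= 2:
        return list(np.linalg.eigvals(A))        # Step 1: solve directly
    p = n // 2
    # Step 2: the eigenvalues of D = [[D1, D3], [0, D2]] are those of its
    # diagonal blocks D1 and D2, obtained by the same method recursively.
    lams0 = (homotopy_eigenvalues(A[:p, :p]) +
             homotopy_eigenvalues(A[p:, p:]))
    # Step 3: follow one eigenvalue path per initial eigenvalue.  A None
    # result means step 3(a) failed and genuine path following is needed.
    return [newton_eigenvalue(A, lam0) for lam0 in lams0]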
5.2 Vectorization

The homotopy-determinant method is also suitable for vector computers. The reason is that, as described in Section 5.1, the algorithm consists of a sequence of upper-triangular system solves (O(n²) flops each) together with computations requiring fewer than O(n) operations, and every upper-triangular system can be solved by vector operations.

Assume B = (b_1, ..., b_m) is an m × m upper-triangular matrix with column vectors b_i ∈ C^m. The equation Bx = d can be written as

    b_1 x_1 + \cdots + b_m x_m = d, \qquad \text{where } b_i = (b_{1i}, \ldots, b_{mi})^T, \quad d = (d_1, \ldots, d_m)^T.

The solution procedure can then be described by the following vector algorithm:

    For i = m down to 1 do
        x_i ← d_i / b_{ii}
        (d_1, ..., d_{i-1})^T ← (d_1, ..., d_{i-1})^T - x_i (b_{1i}, ..., b_{i-1,i})^T
    end for

It consists of 3m vector operations and m scalar divisions.

5.3 Numerical Results

A preliminary numerical test has been conducted on the parallel machine BBN GP1000 (Butterfly), which has 76 processors available. For each matrix order n = 100, 200, 300, we choose both the worst and the best matrices tested in Section 3.5 and compare with single-processor HQR in EISPACK. We define the speed-up here by

    \text{speed-up} = \frac{\text{time for the 1-processor computation with HQR}}{\text{time for the 76-processor computation with the homotopy algorithm}}.

At present, the computation is executed by assigning each processor an eigenvalue path. Once a processor completes its job, we assign it another eigenvalue path, until all eigenvalue paths have been followed. The speed-ups for the worst and the best cases are given in Table 5.

Matrix order   Speed-up (worst case)   Speed-up (best case)
    100                2.07                   47.41
    200                2.63                   99.97
    300                9.13                  127.59

Table 5: Speed-ups in parallel computations

References

[1] M. T. Chu, A note on the homotopy method for linear algebraic eigenvalue problems, Linear Algebra Appl., Vol. 105, 225-236 (1988).

[2] M. T. Chu, T. Y. Li and T. Sauer, Homotopy method for general λ-matrix problems, SIAM J. Matrix Anal. Appl., Vol. 9, No. 4, 528-536 (1988).

[3] M. E. Henderson and H. B. Keller, Complex bifurcation from real paths, SIAM J. Appl. Math., Vol. 50, 460-482 (1990).

[4] M. Hyman, Eigenvalues and eigenvectors of general matrices, presented at the 12th National Meeting of the Association for Computing Machinery, June 1957, Houston, Texas.

[5] T. Y. Li and N. H. Rhee, Homotopy algorithm for symmetric eigenvalue problems, Numer. Math., Vol. 55, 265-280 (1989).

[6] T. Y. Li and T. Sauer, Homotopy method for generalized eigenvalue problems Ax = λBx, Linear Algebra Appl., Vol. 91, 65-74 (1987).

[7] T. Y. Li, T. Sauer and J. Yorke, Numerical solution of a class of deficient polynomial systems, SIAM J. Numer. Anal., Vol. 24, 435-451 (1987).

[8] T. Y. Li, H. Zhang and X. H. Sun, Parallel homotopy algorithm for symmetric tridiagonal eigenvalue problems, SIAM J. Sci. Stat. Comput., Vol. 12, No. 3, 469-487 (1991).

[9] B. Parlett, Laguerre's method applied to the matrix eigenvalue problem, Math. Comp., Vol. 18, 464-485 (1964).

[10] T. Y. Li, X. Wang and Z. Zeng, On the complex bifurcation theorem, preprint.

[11] T. Y. Li, Z. Zeng and L. Cong, Solving eigenvalue problems of real nonsymmetric matrices with real homotopies, SIAM J. Numer. Anal., to appear.

[12] J. H. Wilkinson, Error analysis of floating-point computation, Numer. Math., Vol. 2, 319-340 (1960).

[13] J. H. Wilkinson, The Algebraic Eigenvalue Problem, Oxford University Press, 1965.

[14] B. T. Smith et al., Matrix Eigensystem Routines: EISPACK Guide, second edition, Springer-Verlag, 1976.

[15] B. L. van der Waerden, Modern Algebra, Vol. 1, Frederick Ungar, New York, 1949.