This is to certify that the dissertation entitled THE HOMOTOPY METHOD FOR THE SYMMETR I C E I GENVALUE PROBLEMS presented by Noah H. Rhee has been accepted towards fulfillment of the requirements for PhD degree in Mathematics 4 ‘ 7 Major plefiessor June 30th, 1987 Date MSU is an Affirmative Action/Equal Opportunity Institution 0-12771 )V1ESI_J RETURNING MATERIALS: Place in book drop to LIBRARJES remove this checkout from .—;—. your record. FINES wiH be charged if book is returned after the date stamped be1ow. THE HOMOTOPY METHOD FOR THE SYMMETRIC EIGEN-VALUE PROBLEMS BY Noah H. Rhee A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Mathematics 1987 ABSTRACT THE HOMOTOPY METHOD FOR THE SYMMETRIC EIGEN-VALUE PROBLEMS BY Noah H. Rhee The homotopy method is applied to solve the linear algebraic eigen-value problems for symmetric matrices. Special homotopy equations for symmetric eigen-value problems Ax = Xx are constructed. It is known that there are n distinct curves connecting trivial solutions to desired eigen—pairs. Each curve is composed of an eigenvector curve in Rn x [0,1] and an eigenvalue curve in R x [0,1]. In this thesis we show that it is enough to consider only an eigenvalue curve in R X [0.1]. The homotopy method for calculating eigen-values and eigen-vectors of a matrix is a serious alternative to the currently most popular approach EISPACK for SIMD machine. The computational results obtained through this thesis are extremely promising. This thesis is the first serious attempt to make the homotopy method for symmetric eigen-value problems efficient by making use of special features of the homotopy method. ACKNOWLEDGMENTS I thank God who gave me the gift of life and the chance to study Mathematics at Michigan State University. I am grateful to Professor Tien-Yien Li, my thesis advisor. His guidance and encouragement made this work possible. TABLE OF CONTENTS INTRODUCTION . . . . . . . . . . . . . . . CHAPTER I: THE EIGENVALUE CURVE . . . . . §1.l: The Existence of n-distinct Eigenvalue Curves. . . . . . . §l.2: The Local Conditioning Factors an Eigenvalue Curve. . . . . . CHAPTER II: THE ALGORITHM . . . . . . . . §2.l: Prediction . . . . . . . . . . (a) Eigenvalue Prediction . . (b) Eigenvector Prediction (c) Step-size Updating. . . . §2.2: Correction . . . . . . . . . . §2.3: Checking . . . . . . . . . . . (a) Prediction Checking . (b) Correction Checking §2.4: Flow Chart . . . . . . . . . . CHAPTER III: NUMERICAL RESULTS. . . . . . §3.l: Examples . . . . . . . . . . . §3.2: Comparison With EISPACK. , . . (a) Execution Time. . . . . (b) Computational Complexity. (c) Storage Consideration . . REFERENCES 0 O O I O I O O C I C I I O O l iv LIST OF TABLES Page Table 3 .l . . . . . . . . . . . . . . . . . . 39 Table 3 .2 . . . . . . . . . . . . . . . . . . 42 Table 3 .3 . . . . . . . . . . . . . . . . . . 44 Table 3 '4 I I I I I I I I I I I I I I I I I I 48 LIST OF FIGURES Page Figure 2.1 . . . . . . . . . . . . . . . . . . 21 vi LIST OF ALGROITHMS Page Algorithm 1: Eigenvalue Prediction . . . . . . . 18 Algorithm 2: Eigenvector Prediction . . . . . . 19 Algorithm 3: Step—Size Updating . . . . . . . . 22 Algorithm 4: ROI-Algorithm . . . . . . . . . . . 23 Algorithm 5: Prediction Checking . . . . . . . . 28 Algorithm 6: Sturm Sequence Algorithm . . . . . 31 Algorithm 7: Correction Checking . . . . . . . . 34 INTRODUCTION Solving an eigenvalue problem Ax = Xx for a symmetric n x n matrix A can be thought of as solving a nonlinear system of polynomial equations. Fcp(x.x) = o. where Fm :Rn XR .. R“ xR is defined by Ax - Ax F (XA) = ‘9 cp(x) and m is a polynomial from Rn to R. For example, if we want eigenvectors to have Euclidean norm 1, we might let m(x) = xi-+x§-+..u+x: —l, where x = (x x x )T 1' zl-oo'n c From this point of view, many well developed methods can then be employed to find zeros of this Fm. We shall assume that all eigenvalues of A are distinct. Then the classical Newton's method and its many improved modifications are applicable for solving F®(x,l) = 0. Unfortunately Newton’s method converges to only one zero at a time. In order to obtain all n—eigenpairs of A, we have to restart the iteration by making n suitable initial guesses, which is difficult in practice. The homotopy method of finding all the isolated solutions of a system of polynomials has attracted considerable attention recently. The basic idea of homotopy continuation is to construct a homotopy from a trivial map to the one of interest. Under suitable conditions, a smooth curve starting from the solution of the trivial map will lead us to the desired solution. The homotopy method for the symmetric eigen— value problems was studied by Chu [3]. For the general eigenvalue problems, a homotopy was given by Li, Sauer and Yorke [4]. They constructed homotopies which give n-disjoint smooth curves in Rn+l)([0,l] (or Cn+l)<[0,l]). And each curve leads from an obvious starting point to an eigenpair (x,k) of the given matrix. In both [3] and [4] a curve in Rn+1)<[0,l] (or Cn+l'x[0,l]) is composed of an eigenvector curve in Rrl x [0,1] (or Cn x [0,1]) and an eigenvalue curve in R x [0,1] (or C x [0,1]). In this thesis we consider only an eigenvalue curve in R x [0,1] by constructing the following homotopy equation: H:Rn x R x [0,1] -* Rn such that (0.1) H(x,)\,t) = (1—t) [Xx-Dx] + t[).x—Ax] = 0, where D is an n x n matrix whose eigenpairs can be computed easily. Under suitable conditions n-distinct smooth eigenvalue curves (in R x [0,1]) of (0.1) exist, and each eigenvalue curve leads from an eigenvalue of D to that of A. Once we find an eigenvalue, the corresponding eigenvector is immediate by Inverse Power Iteration. Each eigenvalue curve can be characterized by the solution curve of a scalar ordinary differential equation with an initial value an eigenvalue of D. Hence each eigenvalue curve can be followed numerically. Furthermore different eigenvalue curves correspond only to different initial values of the same ordinary differential equation. Following one eigenvalue curve is completely independent of following the other eigenvalue curves. Therefore the homotopy algorithm is an excellent candidate for exploiting the advantages of parallel processing. Also the homotopy method maintains the structure of the underlying matrix, if there is any. Chapter I discusses the existence of n—distinct smooth eigenvalue curves and the local conditioning of the eigenvalue curve. Chapter II discusses the curve following algorithm in detail. Chapter III gives several numerical results which show that the homotopy algorithm for eigenvalue problems can be a serious alternative to the QR—algorithm which is currently the most powerful algorithm for eigenvalue problems. CHAPTER I: THE EIGENVALUE CURVE The discussion of the eigenvalue curve is divided into two sections: The Existence of n—distinct Eigenvalue Curves, and The Local Conditioning Factors of an Eigenvalue Curve. §l.1: The Existence of n-Distinct Eigenvalue Curves From (0.1) we have H(x,l,t) = (1 -t)(lx -Dx) + t(lx-—Ax) = Xx - (D+t(A—D))x = Xx — A(t)x where A(t) = D + t(A-D). We call D the initial matrix and A the final matrix. Because A is symmetric, we can tridiagonalize A by a standard tridiagonalization process. Hence we shall assume that A is tridiagonal. We may assume that none of the off-diagonal elements is zero, for otherwise we would subdivide_ A into the direct sum of tridiagonal matrices of lower order and work instead with them. We choose the initial matrix D such that (a) D has distinct eigenvalues and all its eigenpairs are available. (b) A(t) = D + t(A-—D) is tridiagonal with non—zero off-diagonal elements unless t = 0. Remark 1.1: (1) (a) and (b) imply that for each t 6 [0,1], A(t) has n-distinct eigenvalues. For a proof see [8, page 124]. (2) The conditions (a) and (b) are easily satis- fied, for example, by choosing a diagonal matrix D with distinct diagonal elemtns. *** We denote the n—distinct eigenvalues of A(t) by (Xl(t).X2(t),...,kn(t)) and assume that X1(t) < X2(t) <...< Xn(t). And we denote the corres- ponding normalized (with respect to the Euclidean norm) eigenvectors by (Ql(t),xz(t),...,§n(t)). Henceforth A -notation will be used for a unit vector. *** The eigenvalues of a matrix are continuous functions of the elements of the matrix [7]. Hence we have the following proposition. Proposition 1.1: X1(t) is a continuous function of t for i = l,...,n. *** A We shall denote [(X1(t) ,t) : o g t g 1] by ci for i = l,2,...,n. Note that Ci n Cj = ¢ if i ¥ j from Remark 1.1: (l). Remark 1.2: (1) Proposition 1.1 establishes that there are n—distinct continuous eigenvalue curves C1,C2,...,C such that Ci joints the n ith eigenvalue of D and the ith eigen— value of A. We call this property the Order Preserving Property of eigenvalues. (2) The Order Preserving Property of eigenvalues is important, since in application we often need to find few eigenvalues which are either algebraically the largest or smallest. The Order Preserving Property of eigenvalue also provides a valuable checking algorithm as we follow an eigenvalue curve (see Chapter II, section 3.) *** Now we want to show that Ci is, in fact, a Ca- curve for i = l,2,...,n. A For fixed v in [0,1], let (x(v),X(v)) be the ith eigenpair of A(v). Consider the linear functional A w :Rn * R defined by ox = x(v)Tx for x E Rn. Let H : Rn x R x [0,1] * Rn x R be defined by ~ Ax - A(t)x H(x.>\.t) = A x(v)Tx - 1 Lemma 1.2: The Frechet-Derivative ~ A N D(x X)H(x(v),k(v),v) of H with respect to x and l at (§(v),l(v),v) is nonsingular. Proof: See [3]. *** ~ A Notice that H(x(v),l(v),v) = 0. Hence by the Implicit Function Theorem, there exist 6 > O and a on unique C —curve I v [Fv(t) :v -6 < t < v +6} {(X(t) Ix(t) It) : (X(V) IK(V) IV) = (§x(v)I-A(v)]+) = {o} u [XT$%:3" v e o(A(v)), v e A(v)]. Hence H[x(v)1 -A(v)]+H = l/d(i,v). *** We denote (dn/dtn)A(v) by A(n)(v) and (dn/dtn)x(v) by x(n) (v) . And we denote 1(1) (v) and x(1)(v) by A(v) and x(v) respectively. Theorem 1.6: Mk) (v)! < C(k) HA _D:ik/d(i,v)k‘1 11 llx‘k’ (v) H gC(k) HA -D!lk/d(i.v)k for k = 1.2.---, where C(k) is a constant, which depends only on k. Proof: We will use induction to prove this ~ A . . theorem. For m = 1, (d/dt)H(x(v) ,Mv) ,v) = 0 implies that I /\ A A(v) I - A(v) ll x(v) x(v) (A —D)x(v) A ' :--- = —————————— x(v)T I o A(v) o By Proposition 1.5 . + | A A x(v) [A(V)I - A(v)] | x(v) (A -D)x(v) ( _—-— = I A(v) QMT - o o [l(v)I -A(v) ]+(A -D)§(v) A T A x(v) (A -D)X(V) . A Thus x(v) = [A(v) -A(V)]+(A-D)X(V) ”x(v)” = llmv) -A(v)1+(A—n)§2(v)u <_. H [x(v) —A1+HHA -DH = HA -Dll/d (i.v). Hence ”x(v)” 3 ‘IA -Dll/d(i.v) . 12 Also (1.1) i = §(v)T(A -D)§‘<A<(v) u = HA -D||- Hence [A(VH g ”A — DH . Therefore our Theorem is true for m = 1. Suppose that Him) (VH1 S C(m) ”A —D”,m/d(i,v)m-l and ”x(m) W)” i C(m) llA -DHm/d(i,v)m are true for m g k -1. Let m = k (k 2 2) . By repeatedly differentiating H(x(t) ,Mt) ,t) = O with respect to t and using the fact that (d/dt) [A(t)] = (d/dt) [D+t(A —D)] = A -D and A evaluating at (x(v) ,Mv) ,v) , we obtain the following expression: x(v): -A(v) i QM x(k) (v) I A + """" x(v)T : 0 And (v) k(A _D)x(k-1) (V) +gl)‘(1) (V)X(k_1) (V) +921”) (v)x(k—2) (V) + + 9k_lx(k‘” (mam (v) where 91, 92, . . . , gk_1 are constants, which depend only on k. Let 1 g p g k -1. Then by the induction hypothesis we'have 13 1MP) (v)! = C(p) HA -D||p/d(i,v) p'l ”x(P) (v) II = C(p) HA —DHp/d(i,v)P. Hence the upper part of the right hand side is bounded by C(k)HA-DHk/d(i,v)k_l. Let us denote the right hand side by [b.01T. Then HhHgC(k)HA-DHk/d(i,v)k-l. Hence i x‘“ (v) [A(v): - A(V)]+ ; QMi h ——————— = ———-———————————-.l————— -—- A A(k)(v) x(v)T : O O [Mv)I - AM 1*}. Q(V)Th Thus ux‘k) (v) u = u we I we mu : tlmv) -A.,t) =0 (0.1), where (§(t),x(t)) is an eigenpair of A(t). Hence we can locate (§(t),k(t)) as accurate as we want, for example, by correcting a reasonable approximation by the Newton method. Hence in homotopy method, there is no problem of instability. Hence it is desirable to develop a numerical algorithm which makes use of the special structure of homotopy method for the symmetric eigenvalue problems instead of using any available ODE software solvers. The major difficulty in following an eigenvalue curve is that there are n‘-1 other eigenvalue curves. Hence in following an eigenvalue curve, the important question is how we can prevent ourselves from jumping into another eigenvalue curve. As usual, there are two main steps in following the solution curve of an ordinary differential equation, namely the prediction step and the correction step. In this chapter, in addition to these two steps, we also discuss a checking step. A 17 Therefore the description of the algorithm will be divided into three parts: Prediction, Correction and Checking. Then we summerize the algorithm through the Flow Chart. A Suppose that we have found (x(v),l(v)), the ith eigenpair of A(v) and 0 g v g 1. Now we want (Q(v-+h),l(v-+h)), the ith eigenpair of A(v-rh). Throughout this chapter we assume that we are following the ith eigenvalue curve Ci' We shall denote ci by [(A(t),t):0gtg1]. §2.l: Prediction In this section, we want to determine the next step-size h and prediction for A(v-th) and §(v+h) . The description of the prediction is divided into three parts: Eigenvalue Prediction, Eigenvector Prediction and Step-Size Updating. (a) Eigenvalue Prediction For now, let us assume that the next step size h was determined. 18 From (1.1) My) =§(v)T(A—D):’é(v). So A(v) is immediately available. Thus we use therHermite Inter- polation which interpolates [A(u),A(u),A(v),A(v)] to predict A(v-th), where A(u) and A(u) are the eigenvalue and eigenvalue derivative at the previous step respectively. At v = 0 we choose the initial matrix D which has a simple structure so that it is easy to get higher derivatives of A(t) at t = 0. Thus we used third order Taylor method to predict for the next step eigenvalue. Let P(t) be the polynomial which interpolates Mt) at {Mu).i(u).x O is a prescribed accuracy. Since the Hermite Interpolation, which interpolates [A(u),A(u),A(v),A(v)], is used to predict A(v-th), ll(v+h) —>.O(v +11)! A(t) for t E [u,v-+h]. In fact, the error in is related to the 4th derivative of we have the following error formula: Theorem 2.1: Let the real function A be (n-+l) times differentiable on the interval [a,b] and consider (m-tl) support abscissae ti 6 [a,b] such that to < t1 <...< tm. If the polynomial P(t) 11-: n, where n:+1 = 1 i 7!. [‘43 whose degree is at the most 20 interpolates at each support abscissae ti not only the value but also the first ni —l derivatives of x, that is, p(k) (ti) = W" (ti) (k = 0,1,....ni-l) for each i = 0,1,...,m, then to every t 6 [a,b], there exist E, which belongs to the smallest interval containing t, such that (t-t )n0(t—t )n1 (t-t )nmx(n+1’ (E) 0 1 ... m (n-tl)! A(t) - P(t) Proof: See [10]. *** By Theorem 2.1, l(v+h) — xo(v+b) = (v+h-u)2(v+h-v)2l(4)(t)/24 = [h + (v —u) 1213).“) (E) /24, where t E [u,v-+h]. Suppose that we have an estimate Mv for [1(4)(t)l, t E [u,v-+h]. To determine the next step size h, which makes the error [A(v:+h) -AO(v-+h)l less than 6, we set [h-+(v«-u)]2h2Mv/24 = E, that is [h+(v-u)]2h2 = 24E/Mv. Let f(h) = [h+(v-u)]2h2. Then we shall look for the smallest positive solution h of f(h) = 24E/Mv. The function f(h) has two double zeros at h = 0 and h = -v(v -u). So we have the following picture: 21 M ————_—-—— D‘_.__._ —(v-u) 0 Figure 2.1 Clearly there is a unique positive solution h. Furthermore it is easy to see that Newton Iteration for the equation f(h) = 246/MV, starting from the . . _ co preVious step—Size ho — v-u, Will produce [hn]n=o such that hn * h as n * w. To complete the updating of the step-size, we need to update Mv so that we may have an estimate M will be used for the determination v+h' v+h of the next step-size from t = v-th. for M Note that at this stage h is available. By Algorithm 1, we shall have Xo(v-+h). Later by correction we shall have A(v-th). Let ERR = [A(v-th) -Ao(v-+u)l. Again by Theorem 2.1 we have 22 [11+ (v -u) 12h2 ERR = 24 It“) (E) I. where E 6 [u,v-th]. Hence HM) (EH __, 24ERR 2 2 [h + (v -u)] h We set Mv+h = 24ERR . [h+ (v-u)]2h2 Therefore, we have the following step-size updating algorithm: Algorithm 3: Step-Size UpdatinguAlgorithm (1) Set f(h) = [h+ (v-u)]2h2 = 246/Mv. Solve for h by Newton Iteration starting from h = v - u, where E is a prescribed accuracy and Mv is an estimate for [A(4)(t)] near t = v. (2) Wait until Ao(v-+h) and A(v-th) are available. (3) Calculate ERR = (Minn) -lo(v+h) 1. 24ERR (4) Set M = 7-5. v+h [h+(v-u)] h Remark 2.1: MO is found by calculating lk(4)(0)l. Since D has a simple structure, it can be found easily. 23 §2.2: Correction , A For the correction of (xO(w),AO(w)), we use the A Rayleigh Quotient Iteration starting from xo(w). We shall abbreviate Rayleigh Quotient Iteration by R01. The asymptotic rate of convergence of R01 is cubic for a symmetric matrix [8]. Algorithm 4: ROI-Algorithm Let A be an n X n symmetric matrix and (1) (2) (3) (4) (5) T _ x ax x x '. . A Pick a unit vector x0. For j = 0.1.2,..., repeat the following. A Compute Xj+l - pA(xj). A Solve (Aj+II«-A)yj+1 — xj for yj+1. Set 9 = /HY H j+1 yj+1 - j+1" If (Hyj+1H is big enough) THEN A . . . Accept (xj+l'xj+l) is an eigenpair of A. ELSE j = j + 1 Go to (2) END IF. *** Remark 2.2: (1) In our algorithm 90 = xO(w) and A = A(w) . We shall write Q]. and xj by Qjm) and Xj(W) respectively, for j = l,2,°--. *** 24 (2) In step 5, that is big enough means Hyj+lH that the residual is small enough. For A A y.+1 x. l. I-A . = x. I-A = . ( 3+1 )X3+l ( 3+1 ’ Hyj+1H Hyj+lH Hence ”(Aj+11"A)Qj+1H = l/HYj+1H° Now we want to show that for small enough step-size A h, RQIeAlgorithm with starting point x0(v-+h) produces A A xj(v-th) which converges to x(v-th). It suffices to consider diagonal matrices to understand the geometry and dynamics of symmetric matrices under ROI [1]. Thus we shall assume that A(v-th) is diagonal matrix with A1(v-th),...,kn(v-+h) as diagonal elements for each h. Definition 2.2: Let i . i p1(x) = Z xjxi/Z x2. j=1 j j=1 J n n p2(x) = Z 13x2./23 x2. j=i J j=1 J '-l i - n . _ A:L__ilL_ E - {X E R .Xi - 1: 01(X) > 2 xi-+Ai+l and p2(x) < -—jf——-—i where Xj is the jth coordinate of x and A] is the jth eigenvalue of the associated diagonal matrix for j = l,2,...,n. 25 Theorem 2.3: If we start RQI-algorithm from any vector in E, the sequence of vectors by ROI—algorithm converges to e1 = (0,0,...,0,1,0,...,0), which is ith the ith eigenvector of the associated diagonal matrix. Proof: See [1]. *** Remark 2.3: (1) E is not empty since ei E E. (2) E is open set in the x1 = l chart, since p1(x) and p2(x) are continuous functions in the xi = l chart. Theorem 2.4: There exists 5 > 0 such that 0 < h < 3 implies that ROI-algorithm starting from A . x0(v-+h) produces sequence which converges to e1. Proof: For given h > O, we associate Ai-l i E(h) = {X E Rn :Xi = 1'p1(x) > XV+h)2+A (Vi-h) li(v +h) + Ai+1(v +h) Since E(h) is open in xi = l chart, there exists a neighborhood of e1 with radius 6(h) in x. = 1 chart, say N6(h)(ei)’ such that N (ei) i 6(h) is the maximal open ball which is in E(h). Note that 6(h) is a continuous function of h since everything which is involved is continuous with respect to h. 26 Now we fix h > 0. Let 6 = mink 6(h). Then Oghgh 6 > 0, since 6(h) is a positive continuous function and [0,h] is compact. Note that as h * 0, [Ai(v+h) -l0(v+h)l -: o and thus Qo(v+h) -> e1. Hence there exists h such that 0 < h < h and if h < h then Hx0(v-+h)-eiH < 6, where xo(v-+h) is the constant multiple of Q0(v-+h) so that x0(v-+h) may belong to xi = l chart. Thus the Theorem immediately follows. *** Remark 2.4: The convergence in Theorem 2.3 and Theorem 2.4 is the convergence in the real projective space PRn-l. Hence the correction may end up (-§(v+h),Mv+h)) instead of (9(v+h).l(v+h)). However, it has no effect in our algorithm at all since i(v+h) x(v+h)T(A —o):’2(v+h) -§(v+h)T(A—D)(-a’2(v+h)). *** 27 §2.3: Checking A Even though we find (x(w),A(w)) from A (x(v),A(v)) by prediction and correction, where w v-th, A(w) may be on a different eigenvalue curve from what we expect. There are two reasons for this failure. First, A(4)(t) (u g_t g_w) is not'known exactly. Second, the eigenvalue separation of A(w) is not known. Fortunately there is a simple and accuract checking algorithm for tridiagonal eigenvalue problem, which can assure that if A(v) was the ith eigenvalue of A(v), then A(w) is also the ith eigenvalue of A(W) . The description of the checking algorithm is divided into two parts: Prediction Checking and Correction Checking. (a) Prediction Checking Since A(t) = D-+t(A-D) is symmetric for each t 6 [0,1], the eigensystem forms a complete 28 orthonormal system for each t. Hence the eigensystem may be considered as rotating continuously from t = 0 to t = 1. If the rotation of the eigenvector is too big in two consecutive steps, we might have jumped into another eigenvalue curve. A After eigenvector prediction, we have x0(w). A And we hope that the rotation between xo(w) and A x(v) is not too big. So we have the following prediction checking algorithm: Algorithm 5: Prediction Checking Algorithm Let CRI be a positive real number. (1) Calculate IP = x(v)TQO(w). (2) If ([IP] > CRI) THEN Accept Q0(w). ELSE Cut the current step-size by half. Go back to eigenvalue prediction. END IF. *** Remark 2.5: (l) The number CRI actually restricts the rotation angle between the eigenvectors. 29 In our program4'we set CRI = .85. It means that no more than g eigenvector rotation is allowed in two consecutive steps. (2) If IIPlg_CRI, then indeed we might have jumped into another eigenvalue curve. However IIPI> CRI does not necessarily imply that A(w) is on the right eigen- value curve. To be safe, we devise a correction checking algorithm as follows. (b) Correction Checking To state correction checking algorithm, we need some background material. r Cl1 32 1 B2 “2 °'. Let T = '. °. °. 0°. '. Bn L an n where Bi #'0 for i = 2,3,...,n. The characteristic polynominals Pi(X) of the principle minor formed by the first i X i submatrix of the matrix T satisfy the following recursive relations: 3O PO(A) = l Pl(l) = o1 - A _ 2 - for i = 2,3,...,n. Then the roots of Pi strictly separate those of Pi-1(1 g_i g_n-1) [12]. We call this property of ,P .,Pn the Strum Sequence Property. Po 1'“ The Sturm Sequence Property for symmetric tridiagonal matrices with non-zero off diagonal elements forms the basis of the following Theorem. Theorem 2.5: Let G(u) be the number of sign changes of Po(u),...,Pn(u) at location M. Then G(H) is the number of roots of Pn(A) with A < H. Proof: See [12]. *** Remark 2.6: If Pi(u) = 0, we take the sign of Pi(u) as that of Pi_1(u). Note that no two con- secutive Pi(u) can be zero. *** Let Qi(A) = Pi(A) /Pi-1()\) (i = 1,2:oooon) . Then Qi(A) satisfies the following relations: 31 01(A) = o1 - A 2 . _ Qi()\) (Oi -A) - Bi/Qi-1(X) (1 -' Zloooln) and C(u) is given by the number of negative Qi(H). We thus have the Sturm Sequence.Algorithm. Algorithm 6: Strum Sequence Algorithm (1) Set COUNT = O (2) Calculate 01(H) = a1 - u IF i = 2,3,...,n. repeat the following. (3) IF (Oi-1‘“) = 0) Qi_1(u) = 1811 MACH. where MACH is the smallest number for which 1 + MACH > 1 on the computer. (4) Calculate aim) = (oi-u) - ref/eight). (5) If (Qi(H) < O) COUNT = COUNT + 1. an IF (i = n) THEN Set (sun = COUNT ELSE i = i + 1 Go to (3) END IF . *** Remark 2.7: (1) G(u) is the number of eigenvalues of T which is less than H. (2) When Qi_l(u) = 0, replaCing Qi_1(H) by (Bil MACH (note that this is positive: this is essential because if Qi_1(u) = 0, 32 it is treated as positive) is equivalent to replaCing ai-l by oi_1 + [Bi] MACH. Hence this algorithm is stable. (3) For the detailed discussion of this algorithm see [6]. *** Algorithm 6 can be used to check whether A(w) is on the right eigenvalue curve. By the prediction and correction step, we have the sequence {(xj(w),lj(w))]§p1, where ”Yk(W)H is big enough [Algorithm 4] . And the pair (9k (w) ,A.k(w)) is con- sidered as the ith eigenpair of A(w). At this stage there are two possibilities: (l) Qk_1(w)T §k(w) > 0 or (2) Qk_l(W)T Qkhv) < o. A Since Qk_1(w) and xk(w) have almost the same direction, §k_1(w)T Qk(w) cannot be 0. Theorem 2.6: Suppose that Ak(w) and §k_1(w) are sufficiently close to A(w) = 11(w) and A A' x(w) = x1(w), respectively. Then, (1) If 95(_1(w)T Qk(w) > 0, then )‘1<(W) > MW). (2) If §k_1(w)T Qk(w) < 0. then xk(w) < A(w). 33 Proof: It is sufficient to prove (1). Let A {x1(w),xz(w),...,xn(w)] be the set of n orthonormal eigenvectors of A(w). Step (3) in Algorithm 4 with j = k-l can be written as: [ARMI -A(W)] yk(W) = QJPIW) . Let A n A' (w) = Z 9.x3(w). xk-l j=l j where (2.1) gj = Qk-1(w>T Qj(w) (j = 1.2.....n). A A' Since xk_1(w) is close to xl(w) by assumption, gi is not small. Then, yk(W) WW” -A(w)1‘l a.“ (w) n . (x(w): -A(w) 1‘1 2 9. film) i=1 3 n g. = Z) J . Q.(w) j=l 1km) —l3(w) 3 From (2.1), we obtain 2 " (w)T y (w) = 2% ‘5 xk‘l k j=1 Ak(w) -A3(w) 9? ’8 1' . NJ“) -l1(w) for Ak(w) is, by assumption, sufficiently close to li(w), and 9.1 is not small. 34 A Because yk(w) and xk(w) have the same d' t' A ( )T A (w) 0 ' l' A (w)T 0 irec ion, Xk-l w xk > imp ies Xk-l yk > . It follows that x(w) > l1(w) .-. A(w). *** Theorem 2.6 and the Sturm Sequence Property lead to the following correction checking algorithm. Algorithm 7: Correction Checking Algorithm A Suppose that (x(O),X(0)) is the ith eigenpair of the initial matrix D. IF (95(_1(w)'r 42km > 0) THEN IF (G(Ak(w)) = i) THEN Accept (9e<(w).xk(w)). ELSE Cut the current step-size by half Go back to eigenvalue prediction END IF ELSE IF (GO-Rho) = i-1) THEN Accept (Qk(w).1kiw)) ELSE Cut the current step-size by half Go back to eigenvalue prediction END IF END IF. *** 35 Remark 2.8: The usage of Sturm Sequence in Algorithm 7 is quite different from that of finding eigenvalues by the method of bisection. The convergence process of the bisection is linear with Convergence rate 0.5. But the Strum Sequence Algorithm is used only one time in Algorithm 7 to check whether Ak(w) is on the right eigenvalue curve. *** 36 § 2.4: Flow Chart t = O (i) Determine the ith eigenpair of D (ii) Determine the initial step-size ¢. g ‘. Prediction for (x,A) at the I I \ next step by A19. 1 and.Alg. 2 Prediction checking by A19. 5 otherwise checks out L--[Cut the steprsize by half;] [Correction by Alg. 4'] A J. Check the Correction by A19. 6 and A19. 7 otherwise checks out [Check t-valve] ‘L t < 1 ,otherwise Print-out the ith eigenpair (x,A) of A Update the \L step—size (— by A19. 3 STOP . CHAPTER III : NUMERICAL RESULTS The computations described in this chapter were performed on the CDC Cyber 750 at Michigan State University. This chapter is divided into two parts: Examples and Comparison with EISPACK. §3.1: Examples Two examples will be presented in this section. Example 1: r r 7 ol BZ 7 l B o l 2 2 2 . 1 3 . Let A = '. '. 0'5 = . . °. ‘ ° n I 0. I1 '5 G 'l n n n L. s L 3 That is, oi = i (i = l,2,...,n) and B. = 1 (i = 2,3,...,n). This kind of matrices arises in i connection with a study of the John-Teller effect [5,11]. 37 38 The matrix A has the ith eigenvalue nearby i except for few smallest and largest eigenvalues. Hence A has a good eigenvalue separation. The homotopy algorithm was used to find the first eigenpair of A with n = 20,40, and 80. For each n, we used three different initial matrices: fi F ol “2 D1 = o C '1'! J F ‘ a fi GIBZ I I I | B2G2 i ‘ I ._ _ _ {.a_.g T...- F - -.__ l 3 4t ; _ p I ‘ D2 ' * 34 a4. . ._.. _ T___._ 7... _ _. _ _. | I |(III-'1fir1 L ‘ 5 Ian an J V l “1 32 I A 52 “2 B3 1 53 “3 B4) _ 1 D3 ‘ 54 “4' ____ _ __._l_a; _______ I I a6 l O L i ”a n .J 39 The first eigenvalue of D1 is clearly l and the corresponding eigenvector is (l,0,0,...,0)T. The first eigenpair of D2 is easily available since the first eigenpair is essentially the first eigen- pair of the first 2 X 2 submatrix. The first eigenvalue of D2 is .381966 and the corresponding eigenvector is (.850651, -.525731,0,...,0)T. The first eigenpair of D3 is essentially the first eigenpair of upper 4 X 4 matrix and it was found by the EISPACK subroutine IMTQLZ. The first eigenvalue of D is .254719 and the corresponding 3 eigenvector is (.777951, -.579792, .233949, -.0624651, 0,0,...,0)T. Here we are taking advantage of the nice performance of EISPACK with small matrices. We chose the error tolerance 6 = lE-2, and corrected eigenpair until Hyj” 2_€'2 = lE-t4 if t < l, and llyjll 2 1E + 10 if t = 1. (For notation, see Algorithm 3 and Algorithm 4). We obtain the following results. STEPS L-SYSTEMS MAXERR FAILURES D1 4 8 .8643E-2 0 D2 2 4 .9422E-2 0 D3 1 2 .9067E-5 O Table 3.1: n = 20 4O Notations 3.1 STEPS: The total number of steps to find the first eigenpair of A, where a step means a loop of prediction, correction and checking. L-SYSTEMS: The total number of linear systems solved, (see Algorithm 2 and Algorithm 4). MAXERR: The maximum error in eigenvalue predictions. FAILURES: The total number of failures in either prediction checking or correction checking, where a failure means the case in which we need to cut the step-size by half and pre- dict eigenvalue again, (see Algorithm 5 and Algorithm 7). *** Remark 3.1: (l) (2) (3) The error control was completely satisfactory. We keep track of the total number of linear systems solved, since it is the most time consuming process in the homotopy algorithm. For n = 40 and 80, we obtained almost the same results. Hence the growth of n did not make the first eigenvalue curve ill conditioned. 41 (4) D2 gives much better conditioning to the first eigenvalue curve than D1. D3 gives still better conditioning than D2. *** A similar idea can be used to find the other eigenpair. For example, suppose that we want to find the 50th eigenpair of A. Now we choose D4 as follows: P01 1 t T . l I .. l _.-_ [“43‘_.__<_ __ _ __ _ —. _ _ _.__ _ _ “48 549 i i (B49 “49 “so 1 D4 = . “so “so “51 . I l 351 “51 552 . I t _ ~_ __ _1 __ “52 “52 __ __ __ _ _ -1 l _ __ - _. - ( “53 I I ’ ... l ’ °“80 L ' J The 50th eigenpair of D4 is essentially one of the eigenpairs of the middle 5 X 5 submatrix, and it was found by IMTQLZ. The 50th eigenvalue of D4 is 50 and the corresponding eigenvector is '(o,o....,o, .301511, .603023, .301511, -.603023, 48th 49th 50th 51th .301511, 0.....0)T. 52th 42 To find the 50th eigenpair of A, we needed only one step, and the solution of one linear system. Other eigenpairs can be handled in a similar fashion. Remark 3.2: The above discussion shows that we can make each eigenvalue curve in Example 1 have almost the same conditioning by a suitable choice of the initial matrix D. Example 2: The following example is from [11]. di Bi A1 1 .25000 0000 .00000 0000 .06437 9910 2 .76849 1173 .15366 0746 .07359 7119 3 .91955 6756 .46726 0328 .08422 5269 4 .23093 8895 .11925 6498 .09720 9219 5 .13305 3788 .08076 3539 .10321 5761 6 .22254 9575 .03394 7196 .12278 7524 7 .11612 7856 .03609 0904 .14342 2880 8 .12033 9373 .03502 2375 .16632 4602 9 .12371 9912 .02915 7561 .17130 7560 10 .12856 1407 .03745 3705 .17735 6337 11 .10776 8089 .01609 0599 .23163 9484 12 .13703 9203 .02382 6467 .26773 3297 13 .13805 7030 .02946 8449 .46276 6202 14 .10379 6943 .00764 6394 1.33403 4844 Table 3.2 where (1.1 is a diagonal element (i = l,2,...,14) Bi is an off-diagonal element (i = 2,3,... A1 is the ith eigenvalue (i = l,2,...,l4). Remark 3.3: The eigenvalue separation is poor. For example We choose the initial matrix D (1132 i i “2 “2.. . ‘ ° . '.. “5 : °“5 “5 I . ”Has—“TH“— D= (37.07. B ‘ -.°.. 10 I “10'“10 ——'_ — “i.'— - I '—9fi.%3 —'- - ‘ 2512 “12 B13 . ‘ B13 “13 “14 L i i “14 “14 The choice of D was guided by Theorem 1.6. Since 86 is less than BS and B7, and 811 is less than 310 and 812, by the above choice of D, HA-—D” is minimized and D has reasonably sized sub-blocks. We write l5 - 14 = .0060065. 43 *** as follows: 44 and found all eigenpairs of U,M and L by using EISPACK subroutine IMTQLZ. tolerance 6 until llyjll 2 6'2 HyjH 2_1E + 10 if t = l. 1E - 4 1E + 8 if t < 1, We chose the error and corrected the eigenpair and We obtained the following result. i 11(0) STEPS L-SYSTEMS MAXERR FAILUREd 11(1) 1 .06634 0723 2 5 1.0880E-4 0 .06437 9909 2 .07878 7312 2 6 1.4849E-4 0 .07359 7119 3 .08852 7020 4 9 .8085E-4 0 .08422 5268 4 .09408 9982 4 10 .8073E-4 0 .09720 9219 5 .10255 7223 2 5 .7869E-4 0 .10321 5760 6 .12351 1014 2 5 .8523E-4 0 '.12278 7523 7 .14116 5599 2 4 .9097E-4 0 .14342 2879 8 .16767 9729 2 4 .9339E-4 0 .16632 4601 9 .17206 4026 3 7 .7269E-4 0 .17130 7559 10 .17497 7810 3 7 .7432E-4 0 .17735 6336 11 .23472 2400 3 7 .8802E-4 0 .23163 9484 12 .25880 1504 3 6 .8933E-4 0 .26773 3296 13 .46273 7255 1 2 .4505E-7 0 .46276 6202 14 133403 4811 l 1 0 0 1.33403 4842 Table 3.3 45 Remark 3.4: (1) Example 2 was chosen to demonstrate the accurate execution of the homotopy algorithm. As we started from the ith eigenpair of D, we arrived at the ith eigenpair of A for each i = l,2,...,14. There was no FAILURE at all in the checking process. (2) The average number of STEPS is 2.4286 and the average number of L-SYSTEMS is 5.5714, to find one eigenpair of A. Thus we need to solve an average of 2.2941 linear systems in each STEP. This is typical of our homotopy algorithm. *** Example 3: The following example is from [8. page 125]. -7 I] l H O\l4 Let A = '. 0 '. That is, ai = 8«-i (i = l,2,...,8). 46 and 0(A) = {-1.125441522005, .253805837119, .947534612211, 1.789326378193, 2.130221682144, 2.961274130561, 3.043336908165, 4.000000000000, 4.008304183180, 5.038725869439. 5.039166155057, 6.210673621807. 6.210683778125. 7.746194162881. 7.746194203123}. We tried to locate the biggest eigenvalue of A by using homotopy methods starting from the initial matrix F71 47 The biggest eigenvalue of D is 7.745281240174 and the corresponding eigenvector is (.7779570546743, .579791948271. .233949463778, .062465125788. 0.....0)T. We chose the error tolerance E = lE-—4, and corrected eigenpair until Hij 2,6-“ = lE-t8 if t < l, and HyjH 2,1E-+10 if t = 1. However the second largest eigenvalue of A agrees the largest eigenvalue of A up to 7 decimal places. Hence we may expect some failures in correction checking. Indeed 7 failures in correction checking were observed until the homot0py algorithm located the largest eigenvalue of A. Even if the homotopy algorithm could locate the largest eigenvalue of A, many failures made it inefficient. Remark 3.5: Many failures in correction checking 114(1) and l15 occurred due to the fact that (l) are very close. Hence if A has very close algorithm in this thesis can be inefficient. Hence we need to develop another homotopy algorithm for the case in which A has very close eigenvalues. This will be one of the further research topics. 48 §3.2: Comparison With EISPACK In this section we consider Example 1 with the choice of D as the initial matrix. We also found 3 all the eigenpairs of A in Example 1 with n = 20, 40,80 by using IMTQLZ subroutine in EISPACK. IMTQL2 determines the eigenvalues and eigenvectors of a symmetric tridiagonal matrix. IMTQLZ uses the implicit QL method to compute the eigenvalues and accumulates the QL transformations to compute eigenvectors [9]. (a) Execution Time We obtained the following results: 1 2 3 4 5 6 7 (sec) (sec) Egp’ Egg. log Egg 109 Egg, EEC n Hn En Hn En 2 Hn 2 En n‘Hn 10 .000375* .004 -— -——— -—-— -——- 1.0667 20 .00075* .026 '-- 6.5000 -—— 2.7004 1.7333 40 .0015* .196 -- 7.5385 -—- 2.9143 3.2667 80 .003 1.226 -- 6.2551 -- 2.6450 -5.1083 160 .006 7.993 2.0000 6.5196 1 2.7048 8.3260 320 .011 54.988 1.8333 6.8795 .8744 2.7823 15.6216 640 .024 ———- 2.1818 ——— 1.1255 ‘-- ——- 1280 .047 -- 1.9583 -—- .9696 '-— '-- Table .4 49 Notation 3.2 Hn: The execution time in seconds to find the first eigenpair of n x n matrix in Example 1 by the homotopy algorithm with the initial matrix D3. The execution time in seconds to find all m eigenpairs of the n x n matrix in Example 1 by using IMTQLZ. Remark 3.6: (1) In Column 1 * means estimated time. (n = 10,20,40, actual measurement of Hn is not reliable since Hn is too short). (2) In Column 2, E640 and E1280 become too long (that is, too expensive). (3) Column 7 shows that if n 2 20, the homotopy algorithm is better than EISPACK even with one CPU; As n gets bigger the homotopy algorithm becomes increasingly better. (b) Computational Complexity Column 5 in Table 3.4 shows that the complexity of following one eigenvalue curve is 0(n). Hence the complexity of finding all eigenpairs of a matrix by the homotopy algorithm is 0(n2) by assuming that each eigenvalue curve has more or less the same 50 conditioning, which is true in this Example [Remark 3.2]. Therefore, if we adopt parallel processing the com- plexity of the homotopy algorithm is better than 0(n2). If we assume that there are n parallel pro- cessors, then the complexity becomes 0(n). However column 6 in Table 3.4 shows that the complexity of IMTQLZ is worse than 0(n2'6). Remark 3.7: The discussions in (a) and (b) are not general assertions. They are based on Example 1. Hence the discussions in (a) and (b) are true for matrices which have good eigenvalue separations. *** (c) Storage Consideration In IMTQLZ, Z is a real two dimensional variable with row dimension at least n and column dimension at least n. If the eigenvectors of the symmetric tridiagonal matrix are desired, then on input, Z contains the identity matrix of order n, and on output the orthonormal eigenvectors of this tridiagonal matrix. Hence to use IMTQLZ, at least nz-storage is needed [9]. For the homotopy algorithm storage needed equals only a few multiples of n. According to our program the storage requirement of the homotopy algorithm is around 20n. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] REFERENCES Batterson, Steve and Smillie, John, "The Dynamics of Rayleigh Quotient Iteration", Preprint. Blattner, J3W. (1962), "Bordered Matrices", J. Soc. Indust. Appl. Math. 10: p. 528-536. Chu, M.T. (1984), "A simple application of the homotopy method to symmetric eigenvalue problems", Linear Algebra Appl. 59: p.85-90. Li, T.Y., Sauer, T. and Yorke, J., "Numerical solution of a class of deficient polynomial systems", to appear. Longuet-Higgins, H.C., Opik, U., Pryce, M.H.L. and Sack, R.A. (1958), "Studies of the John-Teller effect", Preceedings of the Royal Society Vol. 244: p.1-16. Martin, R.S. and Wilkinson, J.H. (1967), "Calculation of the eigenvalues of a symmetric tridiagonal matrix by the method of bisection", Numerical Math. 9: p.386-393. Ortega, J.M. (1972), "Numerical Analysis: a second course", New York. Academic Press. Parlett (1980), "The Symmetric Eigenvalue Prdblem", Englewood Cliffs, N.J. Prentice-Hall. Smith, B.T., et. a1 (1976), "Matrix Eigensystem Routines, EISPACK Guide“, 2nd Edition, Springer- Verlag, New York. Store, J. and Bulirsch, R. (1980), "Introduction_ to Numerical Analysis", SpringeréVerlag, New York Inc. ° 51 52 [11] Wilkinson, J.H. (1958), "The calculation of the eigenvectors of codiagonal matrices", Computer Journal 1: p.90-96. [12] Wilkinson, J.H. (1965), "The Algebraic Eigenvalue Problem“, Oxford University Press, New York. J l.) Illv Alli. 2| .lal‘ii ]