1|}!leWWW“)WWIWW“MINIMUM

      

12
2

0
N01

'YHESlS
\

llllllllllMIIIIHIWWlllllllllllllllllllllllllllllllllllll

31293 01413 7727

This is to certify that the

dissertation entitled

A SCALABLE ALGORITHM FOR
NON-SYMMETRIC EIGENVALUE PROBLEM

presented by

Xiaozhuo Yang

has been accepted towards fulﬁllment
of the requirements for

 

 

 

Ph . D. degree in Applied Mathematics
7
r—Cvx' \7/:£L- 6’)
Major professor

Date May 2: 1996

MS U is an Afﬁrmative Action/Equal Opportunity Institution 0- 12771

 

LIBRARY

M'Chigan State
’ University

 

 

 

PLACE IN RETURN BOX to remove thle checkout from your record.
TO AVOID FINES return on or before date due.

DATE DUE DATE DUE DATE DUE

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

L_L__|
T—ll—TF—J

usu IeAn Afﬁrmative Action/Ewe! Opportunity lnetltulon
W

 

m1

A SCALABLE ALGORITHM FOR
NON-SYMMETRIC EIGENVALUE PROBLEM

By

Xiaozhuo Yang

A DISSERTATION

Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the Degree of

DOCTOR OF PHILOSOPHY

Department of Mathematics

1996

ABSTRACT

A SCALABLE ALGORITHM FOR NON-SYMMETRIC EIGENVALUE
PROBLEM

BY

Xiaozhuo Yang

This thesis is on iterative methods which find all the zeros of a monic complex

polynomial simultaneously and an application to the algebraic eigenvalue problems.

In Chapter One we investigate the convergence behavior of the Durand-Kerner
method on the real plane R2 and characterize the convergence set completely in C2.
The Durand-Kerner method converges if and only if the initial approximations are in

the convergence set.

Chapter Two presents a modiﬁed Aberth method to ﬁnd the multiple zeros of
a polynomial efﬁciently and accurately. We show that the modiﬁed Aberth method
converges cubically. We also describe the property of the original Aberth method
near a multiple zero and hence propose a scheme to dynamically detect the multiple

zeros and their multiplicities.

In Chapter Three we apply the Aberth method, along with homotopy method, to
the nonsymmetric eigenvalue problem. The algorithm has been implemented on the
Intel Touchstone Delta multicomputers. The result shows this algorithm is competi-
tive to HQR from EISPACK. A dynamical load-balancing technique is developed to
balance the load across the computing nodes. The load-balanced algorithm improves

the performance by up to 15%.

DEDICATION

To my wife, Ling Gao and sons, Pengling and Phillip, for their
understanding, support and patience.

iii

ACKNOWLEDGEMENTS

I am most indebted to my dissertation advisor, Professor T.Y. Li, for his encourage-
ment, support adn advice during my graduate study at Michigan State University.

His deep insight into mathematics and persistence have always inﬂuenced me.

I also would like to thank my dissertation committee members, Professor Michael
Prazier, Professor Richard Hill, Professor John McCarthy, and Professor Zhengfang

Zhou, for their valuable suggestions and precious time.

iv

Contents

LIST OF TABLES . vi
LIST OF FIGURES .............................. viii

1 Dynamics of the Durand-Kerner Method 1
1.1 Introduction ................................ 1
1.2 The Derivation .............................. 2
1.3 Dynamics of the Durand—Kerner Method in R2 ............ 3
1.4 The Complex Case ............................ 15

2 A Modiﬁed Aberth Method of Finding All Roots of a Polynomial

with Multiple Roots 18
2.1 derivation ................................. 18
2.2 Local Convergence ............................ 21
2.3 The Aberth Method at Multiple Zeros ................. 28
2.4 Dynamic Estimation of Multiplicities .................. 37
2.5 A Numerical Example .......................... ' 40
3 Nonsymmetric Eigenvalue Problem 44

3.1 Introduction ................................ 44

3.2 Splitting Procedure and Homotopy ................... 44
3.3 Hyman’s Method ............................. 47
3.4 Initial Implementation .......................... 49
3.5 Load—Balancing .............................. 52
3.6 Conclusion ................................. 56
BIBLIOGRAPHY 57

vi

List of Tables

3.1 Time on i860 with a 64 x 64 random matrix ............... 51
3.2 Time on i860 with a 128 x 128 random matrix. ............ 51
3.3 Time on i860 with a 256 x 256 random matrix. ............ 52
3.4 Unbalanced Load Distribution ..................... 53
3.5 Balanced Load Distribution ....................... 54

vii

List of Figures

 

1.1 The Convergence Set ........................... 6
1.2 The Decomposition of R2 ......................... 8
1.3 The First Iteration ............................ 10
1.4 The Second Iteration ........................... 11
1.5 The Monotone Convergence ....................... 13
2.1 Algorithm MODABERTH ......................... 43
3.1 Algorithm ABERTH ............................ 50
3.2 Unbalanced vs. Balanced Timing .................... 55
3.3 Timing Info for 240 x 240 and 360 x 360 matrix ........... 55

viii

Chapter 1

Dynamics of the Durand-Kerner
Method

1 . 1 Introduction

The Durand-Kerner method is an iterative method which approximates simultane-
ously all the n roots of a complex monic polynomial of degree n. It was ﬁrst proposed
by Durand [11], and later, independently, by Kerner [20]. Since then, more work on

this method has appeared in the literature [17, 18, 21].

With a monic polynomial
19(2) =zn+alz"_l+~-+an_lz+an, (1.1)

the Durand—Kerner iterations are

 

z
zizzk— n Mk) , k=1,2,...,n. (1.2)
I—Ijzl,j¢k(zk _ 23')
Here, {21, 22, . . . , 2n} is a set of approximations to the n zeros of p(z). Empera-
tively, {21, 2;, . . . , 2;} forms a set of better approximations.

The method was shown to have convergence order two when the polynomial has

only simple roots [3, 20].

It was conjectured in [19] that this method converges for almost all initial approx-
imations in C", when the initial approximations are viewed as a point (21, 22, . . . , 2”)
in C". To the best of our knowledge, this conjecture remains open, except for the
case n = 2. It was mentioned in [19] that T. Terano gave a proof [33] for n = 2 in his

Ph.D. thesis.

In this chapter, we investigate the convergence behavior of the Durand-Kerner
method on the real plane 82 and characterize the convergence set completely in C2.
The Durand-Kerner method converges if and only if the initial approximations are in

the convergence set.

After the derivation of the Durand-Kerner method in Section 1.2, we discuss
in Section 1.3 the convergence behavior of the Durand-Kerner method in R2. In
Section 1.4 we show that the Durand-Kerner method converges quadratically in its

convergence set in C 2.

1.2 The Derivation

The Durand-Kerner method can be derived from different approaches [11, 13, 20].
Here, we give a simple derivation of the Durand—Kerner method following [2]. We

also discuss some properties of the Durand-Kerner method.

Suppose that $1, $2, . . . , at” are n simple roots of polynomial p(z) and 21, 22, . . . , 2,,
are n initial approximations. Let 6;, = :51, — 2k. Then, polynomial (1.1) can be written

as

71

10(2) = 11(2 — (2;. + 61.)).

k=l

Expanding the right hand side in the powers of 6;, and dropping all higher powers
of 6;, yield

z)E’H(z—zk) 251,1] z—z,
k=1

k: 1 jyék

For a ﬁxed index k, plugging in 2;, on both sides and solving for 6;, give the ﬁrst

order approximation for 6k,

 

P(Zk)
6;, = — . (1.3)
Hj¢k(zk — 21')
This gives formula (1.2). Notice that the sum of the Durand-Kerner iterates z], 2;, . . . , 25,,

is —al [21].
1.3 Dynamics of the Durand-Kerner Method in R2

In this section, we consider the Durand-Kerner method in R2. All numbers considered
are real numbers except otherwise stated. We assume that the polynomial has degree

two with distinct real roots, having the following form

where a and b are two distinct real numbers.

For simplicity, let A(:L', y) = (:r’, y’) be the Durand-Kerner iteration, where

1:: :

Obviously, the Durand-Kerner method does not converge when the initial points are

 

1:-

(1.4)

chosen from the set D = {(x, y) E R2 : a: = y}, which makes the denominators zero.

Let J = A’1(D). This set can be described in the following theorem.

Theorem 1.3.1 The set J has the following form

J:{(x,y):2xy—(a+b)(:r+y)+2ab=0}.

Moreover, the Durand-Kerner iteration maps all points in J into one single point

(new)
2’2 '

Proof. Suppose (:r,y) is in the set A‘1(D), then

 

 

 

 

_ (Iv-a)(-’L'-b) = _ (y-a)(y-b)
:c—y y—a: '
So,
xz—xy—a:2+(a+b):c—ab_ yz—xy—y2+(a+b)y—ab
x—y _ y—m '
Hence,
2xy—(a+b)(:c+y)+2ab=0. (1.5)

On the other hand, let (:r,y) satisfy condition (1.5). If :1: 7E 3%, then y =

W. Bring it into the Durand-Kerner iteration (1.4), yielding

 

 

 

,_ (:c—a)(a:—b) a+b
(I? ‘- (B _ ]a+b[.1:—2czb : 2 '
a: - 2x—(a+b)
Similarly,
,_ (y-a)(y-b)_a+b
y _ y — ]a+b]y-Zab - 2 '

y — 2y-(a+b)
Now, we show that :1: 52$ 9359 when (x,y) E J.

Ifx = 13:5 and (x,y) E J, then

2a+b

 

b
y—(a+b)(y+a—:—)+2ab=0.

5

Consequently, —L%§L2 + 2ab = 0. So, (a — b)2 = 0. Hence a = b. This contradicts

to our assumption that a and b are distinct.

Let P = ((a + b)/2, (a + b)/2), then the Durand-Kerner iteration A maps J onto
P. I

Let L = {(33, y) : r + y = a + b}, a straight line on the real plane. The two fixed
points P1 = (b,a) and P2 = (a,b) of the Durand-Kerner iteration function (1.4) are

on L (See Figure 1.1).

Theorem 1.3.2 After one Durand-Kerner iteration, (:r',y’) in (1.4) satisﬁes :r’+y’ =
a + b.

Proof. Choose any starting points a: and y such that a: 75 y. By the Durand-

Kerner iteration (1.4) ,

, (x-a)(~'v-b)

a: = :1:—
3‘31

 

—:ry+ (a+b)a: — ab

 

3

 

 

:r—y
and
, — —b
y I y_(y a)(y )
y—a:
_ -:I:y+(a+b)y—ab
_ y—at ,
So,

x'+y'—a—b = a"—a+y'—b
—a:y+(a+b)a:—a(x—y)—ab
:r—y

 

—:cy+(a+b>y—b<y—x>—ab
y—x

+

 

6

—:cy+b:c+ay—ab+:ry—ay—ba:+ab
3"?!

 

 

Pl

 

Lh

 

 

 

 

Figure 1.1: The Convergence Set

Theorem 1.3.1 and Theorem 1.3.2 show that the Durand-Kerner method maps
It2 —- D onto L and maps J onto point P. We will show next that R2 — (J U D) is the
convergence set on the rea1 plane where starting in which the Durand—Kerner method

converges .

For convenience, deﬁne

(a+b):c—2ab
2x—(a+b)'

 

d(a:) =

We assume a > b. The graph of D is a straight line passing through the point
P = ((a + b)/2, (a + b)/2) in R2 while the graph of d(a:), the set J, has two branches

(see Figure 1.1).

The function y = d(:c) has a horizontal asymptote and a vertical asymptote,

Lh= 3! = 99?,
Lu: :1: = gig—Q.

L}, and LU divide the plane into four regions:

II={(x,y):x>mandy> w}.

12={(:r,y)::r<9ﬂandy>a—+b.

I3={(:r,y):.r<mandy<1ﬂ.

I4 = 1193,31)”: 2 “-I-b and y S ELb}-
The graphs of J and D divide 11 and 13 into eight regions:

e 111={(:c,y)€ 11 : y < :1: and y > d(:c)}.

[12 = {(x,y) E 11 : y > a: and y > d(a:)}.

113 = {(x,y) E 11 : y > .r and y < d(a:)}.

[14 = {(a:,y) E II : y < a: and y < d(a:)}.

I31={(:c,y)6 13 : y > :1: and y < d(:r)}.

I32 : {(a:,y) 6 I3 : y < :1: and y < d(.r)}.

133 = {(x,y) E [3 : y < :1: and y > d(:c)}.

134 = {(a',y) E 13 : y > :1: and y > d(:z:)}.

Line y = a and line a: = b divide the regions 12 and 14 into eight regions:

0 [21={(:I:,y)€I4.b<:rS(a+b)/2 anda<y}.

122={(:v,y)€I4zx<banda<y}.

123={(a:,y)€ I4zx<band (a+b)/2Sy<a}.

124 = {($,y) 6 I4 : b < a:(a + b)/2 and (a + b)/2 < y < a}.

I41={(:c,y)€I4:r>aandb<yg(a+b)/2}.

I42={(a:,y)€ I4:(a+b)/2 <1: <aandb<y< (a+b)/2}.

I43={(:r,y)€ I4:(a+b)/2Z:r <aandy<b}.

I44={(:1:,y)6I4zr>aandy<b}.

See Figure 1.2.

 

12 \‘X 11

[21 112

122“.
“ .P' 11 “1

123 124‘ 114

 

 

 

 

I31 132‘ p2 141
13 ‘~
I32

144
I33 “

I43

 

13 I4

 

 

 

 

Figure 1.2: The Decomposition of R2

The behavior of the Durand-Kerner iterations is shown in the following lemmas.

Lemma 1.3.3 A(112)C 14 n L.

Proof. Suppose (x,y) E 112, then y > :r, y > d(a:) and a: > (a + b)/2. For

(x’, y’) = A(:c, y), by Theorem 1.3.2, we only need to show ar’ > (a + b)/2.

 

 

 

 

 

, a+b —:cy+(a+b).r—ab a+b
.r- = -
2 x—y 2
_ 25ry—2(a+b):r+2ab—(a+b)y+(a+b).r
2(y-arr)
_ 2:1:y—(a+b)a:—(a+b)y+2ab
2(y-as) '

Since the denominator is positive by assumption, the numerator also needs to be

positive. Now,

 

 

0 < y — d(.r)
(a + b):c — 2ab
2a: — (a + b)
_ 2xy—(a+b)y—(a+b)x+2ab
— 2:1: — (a + b)
The denominator is positive, so is the numerator. this proves cr’ — @ > 0. I

Lemma 1.3.4 A(114) C [4 O L.

Proof. Suppose the initial point (x,y) 6 114, then y < :L‘, y < d(:c) and :r >
(a + b) / 2. The assertion can be proved by applying the same arguments as in Lemma

1.3.3. I
Similarly, we can show that A(111) C [2 O L and A(113) C 12 (1 L.

By symmetry, similar results hold when the starting points are in [3, namely,
A(I31) C [2 n L, A(I33) C [2 n L, A(I32) C 14 (1 L and A(I34) C 14 O L (see
Figure 1.3).

10

 

 

““.‘12 I]
If _____ r- ---l.!.2_:.i' ..
I, +..
4 ’ 11 Ill
l 114 e
l'131
I34 4 a";
132 I-." “
.‘I .r
...... 133-- .........
‘e
.‘ L
13 I4

 

 

 

 

Figure 1.3: The First Iteration

Next, we show that the Durand-Kerner iterations converge to P2 = (a,b) and
P1 = (b,a) when the starting points are in I4 and 12, respectively. By symmetry, we

only consider the case of 1.; (see Figure 1.4).

Lemma 1.3.5 A([41) C [42 n L.

Proof. Let (x,y) E I41. Notice that y < 3:, a: > a > (a + b)/2 and b < y <

(a + b)/2. For (1", y’) = A(2:,y),

 

 

 

(III—a : _$y+(a+b)x_ab_a($_y)

13—3]

_. —$y+bx+ay—ab

_ 53—3,

_ (x—GXD—u)

_ x—y

< O,

and
I a+b —2a:y+2(a+b):z:—2ab—(a+b)x+(a+b)y

 

 

2 2(x - y)

11

—2xy + (a + b):c + (a + b)y — 2ab

 

2(x - y)

Combining inequalities w > 2ab and a: > a > Lag—bl, yields

—2a:y + (a + b)a: + (a + b)y — 2ab

> —2a:y+(a+b):r+(a+b)y—

(a + b)’
2

 

= 0.5((a + b) — 2y)(2:1: — (a + b))

>0.

So, 93’ — $13 > 0. This proves our assertion.

Similarly, we can show that A(I43) C [42 n L.

 

12 “

11

 

13

 

\
\
‘ \
132,.-.

 

 

I43

 

-‘
¢’
.
-‘
--

 

 

Figure 1.4: The Second Iteration

When the starting points are chosen in 142 we have

Lemma 1.3.6 A(I42) C [44 0 L.

Proof. Let (a3,y) E [42, then y < :r, 91sz < a: g a and b S y < 9:39.

12

For (:r’, y’) = A(:c,y), we need to show 93’ Z a since (:r’,y’) lies on L.

, —:cy+(a+b)a:—ab—a(:r—y)

 

 

 

 

 

 

a: —a =
:1: — y
_ (x - a)(b - y)
33 ‘y
> 0
I
For region 144, we have
Lemma 1.3.7 AU“) C [44 0 L.
Proof. When (2:,y) E 144, cc 2 a > b 2 y. Then,
:c’—a : —:ry+(a+b)at—ab-a(.r—y)
I ”y
= (x —a)(b—y)
‘7" ‘y
2 0.
and
,_b _ -xy+(a+b)y-ab-b(y—w)
y — y —.’1:
= (m—a)(b-—y)
y—a:
S 0
SO ((13,, y’) E I44 (1 L. I

We have shown that after at most two iterations the Durand-Kerner iterates lie

on the line L in either 144 or 122 (see Figure 1.4).

Next, we will show that the Durand-Kerner iterations converge monotonically on
the line L towards the points P2 2 (a,b) and P1 = (b, a) when the initial approxima-

tions are in the region I44 (1 L and [22 {'1 L, respectively (see Figure 1.5).

13

 

‘2 ~. 11

122

“ Pl

 

\4

13 I4

 

 

 

 

Figure 1.5: The Monotone Convergence
Theorem 1.3.8 If the starting point (:r,y) E L F1 144, then the Durand-Kerner iter-

ations converge to (a,b) monotonically. Moreover, it converges quadratically.

Proof. Since (2:,y) lies on L, we have a: + y = a + b. For (x’, y’) = A(:c,y), we

also have ar’ + y’ = a + b. Hence,

 

 

y = a+b—x.
So,
$I_a : ($—a)(b—y)
1'"?!
: (r—a)(:c—a)
2:c—(a+b)
= 2(2) Gib)”; a)

14

 

° x—a
Since 0 < 2(x—“—}9) < 1.

Let (34231”) = (A)“($.y) E A(ﬁnk—1016.11». then

1
0 < a:(k+1)— a < —(;r(k) — a) < (§)k+l(:r — a)

inductively from (1.6). So, .731") converges to a monotonically. At the same time, y”)

converges to b monotonically. Therefore, (2711‘), y‘kl) converges to (a, b) monotonically.

Next, we consider the rate of convergence. Since

 

 

(||(:v’. y’) - (a, b)|l2)2 = (50’ - (1)2 + (y’ - b)2
(”(23.11) - (a,b)II2)4 ((11 - a)2 + (y - WV
2 rug-1311? 2

 

((33 - a02 + (y - (9)2)?

(:13 - a)2(x - a)2

(a + b — 2x)22(:c -— a)4

 

l
(a +b— 2:22)2

 

1

(x - 11-1)?

.hh—i

 

 

 

The Durand-Kerner method converges quadratically. I

Similarly, we can prove the following

Theorem 1.3.9 If the starting point (.r,y) E L (1 122, then the Durand-Kerner iter-

ations converge to (b,a). Moreover, it converges quadratically.

In summary, after at most two iterations the Durand-Kerner method starting at

15

a point in R2 — (J U D) will converge to the solution monotonically along the line L

and the rate of convergence is two.

1.4 The Complex Case

This section discusses the Durand-Kerner method in C 2. Let the two zeros a and b
as well as the starting points :1: and y be complex numbers. Similar to the previous
section, (a, b), (b, a) and (3:, y) are viewed as points in two dimensional complex space
C2. Let D = {(x,y) 6 C2: 1: = y}, J = {(a:,y) 6 C2: 2:1:y—(a+b)(a:+y)+2ab = 0}
and L = {(a:,y) E C'2 : a: + y = a + b}. Notice that Theorem 1.3.1 and Theorem
1.3.2 are still true in complex case. J is the pre—image of D and after one iteration all
points lie on the complex line L. We also notice that both J and D have Lebesgue

measure zero in C 2.

Similar to the Durand-Kerner method on the real plane, we show that the Durand-
Kerner method converges for any starting points in the set C2 — (J U D U S ), where

S is a set of Lebesgue measure zero in C 2.

The line L can be parameterized as follows:

a+b a-b
2+2t

 

a; :
(1.7)
y=%+%t

1

where t is a complex number. When t = 0, —l and 1 the corresponding points are
P = ((a + b)/2,(a + b)/2) , P1 = (b,a) and P2 = (a,b).

Since the Durand—Kerner iterative function (1.4) maps C 2 — (J U D) onto L — P,
we assume, for the moment, that the starting points are chosen from L — P. These

points can be parameterized by (1.7) with complex parameter t # 0. We will see that

Durand-Kerner method does not converge when the initial approximation is chosen

16

by (1.7) with an imaginary number t. Let S be the set of points in 02 corresponding
to imaginary t, that is, .5' = 5,, U 5y, where

a+b a—b
2 + 2

 

ri)}

 

S.={(w.y)€02=(fv-y)$-($-a)($-b)=(iv-y)(

and

S.={(x.y)e C2:(y—x)y-(y—a)(y—b> =(y—x)(“',*b+ bgarm,

 

where r is a real number and i = \/—1. It is easy to see that S is a geometric surface

in C 2 , hence it has Lebesgue measure zero.

Now, the Durand-Kerner iterative function (1.4) maps points in C 2 - (J UDUS) to
points on line L which can be parameterized by (1.7) with non-zero and non-imaginary

t. Let (:r,y) be on L with parametric equation (1.7). For (33’, y’) = A(a:,y), we have

I _ gﬂ [a-b]]1+t2]
a: — 2 + 2 2t

_ a+b (5-02 (1+9)
y' — '2— + 2 2t ‘

Therefore, the Durand—Kerner iterations are determined solely by the iterative

behavior of the complex iterative function

 

It is easy to see that f (z) is just the Newton iterative function for the quadratic

polynomial q(z) = 22—1. That is, f (z) = z—:—,(L:%. The Newton iterations of quadratic

functions are well understood [4]. We list some results in the following

Theorem 1.4.1 The iterations of the function f(z) have the following properties:

1. f"(z) converges to 1 if the real part ofz is positive,

17
2. f"(z) converges to -1 if the real part ofz is negative,

3. f"(z) does not converge ifz is on the imaginary axis,
where f”(2) = f(f“’l(2))-

Therefore, the Durand-Kerner method converges when the starting points are

chosen from 02 — (J U D U 5'). Since Newton’s method converges quadratically, so

does the Durand-Kerner method.

We conclude that the Durand-Kerner method converges if and only if the starting

points are in the set C'2 — (J U D U S).

Chapter 2

A Modiﬁed Aberth Method of
Finding All Roots of a Polynomial
with Multiple Roots

2.1 derivation

The Aberth method is a parallel iterative method for finding all roots of a monic

complex polynomial with simple roots simultaneously. This method was discovered

many times, see Aberth [2], Borsch-Supan [5], Docev [8], and Ehrlich [12].

For a monic complex polynomial

 

p(z)=z”-1-a12""1+---+a,,_lz+an , (2.1)
with n distinct initial approximations zl, 22, ..., zn, the Aberth iterations are
2
zfczzk— p( k) k=1,2,...,n. (2.2)

P’(Zk) - 19121:) zy=1g¢k 11—; ’

It was shown [2] that the Aberth method converges cubically to the roots of p(a:)
when it has only simple roots. It is also observed that this method converges to
the n roots simultaneously for almost all initial approximations. To be more precise,

considering the initial approximations (21,22, . ..,z,.,) as a point in C", the Aberth

18

19

method converges for all initial points in C n — J, where J is a subset of C" with
Lebesgue measure 0. At this stage, no proof for this observation is available.

The Aberth method is not suitable for ﬁnding the roots of a polynomial with

1

 

multiple roots, since the quotient can be very large when two approximations
2k and z, converge to the same multiple root. Overflow would occur. To overcome this
difﬁculty, we propose a modified Aberth method which has the same convergence rate
as the original Aberth method and can ﬁnd all roots of a polynomial with multiple
roots. This method reduces to the original Aberth method when the polynomial has

only simple roots. The modiﬁed method, like the original method, is parallel in nature

and is globally convergent in practice.

In this chapter, we ﬁrst derive the modiﬁed Aberth method and describe its con-
vergence property. In later sections we show the convergence behavior of the original
Aberth method near multiple roots and discuss the issue concerning how to dynami—

cally detect the multiple roots.

The derivation of the modiﬁed Aberth method is based on the theory of electro—

statics. Let the polynomial in (2.1) have the following form

p(z) = (z — 1:1)"1(z — $2)"2...(z — 17m)”"‘ , (2.3)
where 1:1, 1:2,. . . , mm are the m distinct roots of p(z) with multiplicities n1, n2, . . . , nm,
respectively, where n = 22;, nk .

Starting with m distinct initial approximations z1,22, . . . , 2",, the modiﬁed Aberth

iterations are

nkp(zk)
P’(Zk) — 11121:) Z?=1;j¢k Stu—127

 

,k=1,2,...,m. (2.4)

ZLZZ/c—

We utilize the identiﬁcation of complex numbers with vectors in the complex

plane C. If an 12;, unit plus charge is situated at the point 3],, for each It, the resulting

20

electromagnetic vector ﬁeld at a point z is

i (7.42:: = 6%?) (2.5)

k=1 _ 33")

Given a polynomial p(z), we wish to locate a root of p(z) by sampling the ﬁeld
deﬁned by the right hand side of (2.5) at some point z;c with multiplicity nk, and
ﬁnding the point where a single nk unit plus charge would be located if it were
causing this ﬁeld. Sampling at this new point, the cycle then could be repeated.

Calling the new point z]c = z), + wk, and solving for wk in the equation

(z,c — (2+ wk» : (211:3) ’

leads to the modiﬁed Newton’s method

 

 

 

 

I Z nkp(zk)

2k: k— p’(2k) . (2.6)

Now, if we try to locate all roots of p(z) by simultaneously applying the modiﬁed
Newton’s method (2.6) to m different sampling points, some of them may converge
to the same (multiple) root. To avoid this, an nk unit minus charge is assigned at
the sampling point 2),. The idea is that when a sampling point z)c is near a (multiple)
root, the ﬁeld from the minus charge at 2;, should counteract that ﬁeld from the plus
charge at the root, preventing a second sampling point from converging to this root.

After taking conjugates, we have the iteration equation for the k-th sampling point,

 

 

 

n). _ P'fZ) "n1
(2:. — (z). + UM) _ P12) + Z

 

#k 2" — 2"
Solving for wk yields
wk = — nkﬂzk) . . (2.7)

p,(zk) _ 19(2):) 21%]: zk—ZJ

This gives the modiﬁed Aberth iterative formula (2.4).

21

The modiﬁed Aberth iteration correction (2.7) can also be derived algebraically.

Let Rk(z)— —- mfg—aw,— be a rational function, k- — 1,2,. ,m. Then 12;, has the

same root 3:), as p(z). The modiﬁed Newton’s method (2.6) applied to Rk(z) gives

nkRk(Zk)
32121:)

wk: —

 

le n]#k?::k-)zj)n1 (Hj¢k(zk — 2.1.)"J )2

— (II (2 2)"J)’
P’1Zk) IIj¢k(zk ‘ 21')" ’ “P120 ”1::(zZ—z:)n J .

 

 

M10120
P'1Zk) _ 19121:) Eye): 27212;, .

 

2.2 Local Convergence

In this section we discuss the local convergence property of the modiﬁed Aberth
method (2.4). We will Show that it converges cubically. However, when the mul-
tiplicities are estimated incorrectly in practice, only linear convergence rate can be

reached.

For convenience, let

 

and

 

Expanding Fk(Z) in the Taylor series at the point X = (x1, x2, . . . ,xm) yields

 

6Fk(X 82Fk(X
w 2...; 1..— Hi: We..1.._..)...-a.)

22

One can show that all the ﬁrst and second partial derivatives are zero, so the
successive iterates converge cubically. But this involves some tedious calculation of
the second derivatives. Instead, we give in this section another proof of the cubical
convergence. First, we calculate the ﬁrst derivatives which reveal that the modiﬁed
Aberth method is only linearly convergent if the multiplicities are estimated incor-

rectly.

Lemma 2.2.1

 

. P(Zk) _
ill-£31: P’(Zk) —

Proof. For simplicity, let k = 1.

1. P121)

‘33. pa.)

 

21

 

_ 1.... (21 — we. — .2)» ~12. - arm
zl—nl n1(zl — x1)("1“1)(zl — 332)"? - ~ - + (21 — $1)"1n2(zl — x2)("2'1) - - - +

1. (21 — $0121" 11311)"2 ' ' ' (21 — $m)n"'
= 1m
z1—>x1n1(zl— 2:2)"2(21- 9:3)"3 ---+ (21 — $1)n2(21 — $2)("2‘1)~-- + ---

=0.

 

Lemma 2.2.2 Suppose j 7i k, then

 

8Dk(Z) ] n,-
62y Z:X— (it;c — $J)2,
and
6015(2) _ —nk
82]: lZ=X— be; (13k _ 3302
Proof

 

23

 

and
8D,.(Z) _ n1 _ —n1
82" l —X (#1: (2k 21)2 |Z_X— #2]: (17k _ xl)2
Lemma 2.2.3
P12) Pm(2) n1. —1

 

 

 

Proof. Write p(z) = (z — xk)""g(z), where g(:1:k) # 0. Then

11(2) = no - x.)"~-‘g(z) + (z — arm's)
11212) = min. —1)(z — ark-29(2) + no - ark-1912)
+m<z — ark-1912:) + (z — xvmgrwz).
So,

17(2) 191212) | _
P’1Z)P'(Z) ”‘71"

 

: (z — mmge) X

nk(z - mk)("*‘1)912)+(z - $k)""g’(Z)

 

 

[nk1nk -1)(Z - $k)""'2912) + nk(z - $k)"“'lg'(Z)

...(z — wank-19(2) + (z — mime) +

”1:12 - $k)""'19'12)+12 — $k)""9(2)(z

)
nk(z " wk)"*“g(2) + (Z — wk)"*g’(z) ] 12:“

 

= 9(2) X

721.9(2) + (Z - $0912)

 

”k0“: —1)9(Z)+ ”1:12 — $0912) + nk(z — Stick/(Z) +12 - $k)29(2)(z)
n19(3) + (Z - $0912)

 

2:1}

9(xk)nk(nk — 1)g($k)
nk9($k)nk9($k)

 

nk—l

 

nk

24

Theorem 2.2.4 All the ﬁrst derivatives of Fk(Z) with respect to 2) at the point X

are zero, namely,

61712) I _ z 0
321 Z—X a

 

for k,l = 1,2,...,m.
Proof. Notice that p(:1:k) = 0. For 1 # k,

8Fk(Z) _ 0 nkp(zk) [p(2k) 62
32: |z=x _ _(P’(Zk)-P(Zk)Dk 2))212=x

nk [P(zk) ] 2 Ti);

 

A

 

 

 

_ P’(Zk) (am-21)?
_ p121.) 2 |Z=X
[1 _ p'(Zk)Dk(Z)l
= 0.
For I = k,
0F,.(Z) |
82k Z=X

nkP'(Zk) [P'(Zk) — P(Zk)Dk(Z)l - nkP(zk) [P(2)(Zk) - P(zk)a—%%Eﬂ — P'(Zk)Dk(Z)

 

 

 

= 1.. f _
(#121) — P(zk)Dk(Z))2 IZ—X

[1_ p12.) My.) + pot) pa.) 60.42)]
_ 1— Tu: p’(zk) pit“) p’(zk)P'(zk) 32:;
_ 2

[1— new]

(1- m)

= 1 _ nk___’:__
1

= 0.

Let the multiplicities n1, n2, . . . , nm in the iteration formula (2.4) be replaced by

estimated multiplicities N1, N2,. . . , Nm. The modiﬁed Aberth iteration formula has

25

the following form:

NkP(Zk)
p’(zk) “ P(Zk) Z?=1,j¢k fig:

 

ZLZZk—

We then have the following convergence result.

 

 

 

 

For 175 k,
19(2):) 2 N
aFk(Z) I _ _ Nk [pr(zk)] (4-2,) I = 0
62'] Z:X 1_ p 2‘: 2m N (Z) 2 Z=X a
P,(Zk) 3:1,j¢k zit—z)
and
351(2) I
62k =
m N
1 — 1121M + P(zkl 19(2).) 6(21=1.1¢k 3;“3L)
p’(zk) P'(zk) Pl(zk) pl(zk) 62k
= 1 — Nk N 2 ' IZ:X
P 21:
[1— p’(zk) 23:1,j75k f5]
1 _ 2:;1
: 1 ._ NI: 1"):
N
_ 1 _ _£.
nk

By these results, the method converges linearly if the estimated multiplicities Nk

satisfyO <I nk —Nk |< nk, for k=1,2,...,m

When the estimated multiplicities are close (but not equal) to true multiplici-
ties, like many other iterative methods [25], the modiﬁed Aberth method converges

linearly. In this case, it does not converge cubically.

When the estimated mutiplicities are the true multiplicities, we can show the

following theorem.

Theorem 2.2.5 The modiﬁed Aberth method (2.4) converges cubically.

26

Proof. Notice that i855; "L + "I: .Now,

 

 

 

 

 

 

 

P(Zk) 35“" Zk-l‘: Zk-l'k
’ a: - a: m
2k k -— 2k 1: p12,) _ Dk(Z)
P(Zk)
nk
: z“ _ x“ — Z ._"J_ J1:— _ _".z_
J75" zk—xj zk-rk i3“: zk-ZJ
_ _ x _ nk(zk — an.)
— 2k ‘6 n _ z m(zJ—xJ)(Zk-$k)
k 33““ (Zk-$J)(zk_z.1)
Z: '¢k nj(z£xszk-3k)
J 2 —$ 2 -2
= (Zia—33k) (k J)(k J)

_ n‘(z -$ )(zk-xk)
”1: 23¢]: izki$13(zk-zj)

le ,

(z. — at. — z.) [m - :2... "25:23:223’l

 

(2k — $k)(zj — 2:1)(21. — 3:1.) - (2-9)

”I

' n-(z -- )(z -3: )
(Zk—IJ)(Zk—2J)[nk—ZJ¢“ i’kJ-‘Jihkk-zjf

 

It is easy to see that

 

] in the above sum ap-

"1
(2:11 --1:1 )(xk —:rJ' )nk

 

proaches and therefore is bounded by a positive number A as Z ap-

proaches X. Therefore, as I z}, — :ck |< e, k = 1, 2, . . . , n, for a small number 6,

I 2;: —:ck |< A(n—1)c3.

This shows the method converges cubically. I

Let M = A(n — 1) in the above proof. We shall estimate M in more details. Let
D(xk,ek) = {z 6 C :| z — .rk |< ck}, k =1,2,...,m, where (1,62,...,€m are chosen
for which the disks are pairwise disjoint. The separation of the disks is measured by

the minimum distance of those disks, that is,
P = min“ 21: — Zj I, 21. E D($k,€k)a2j E DUB, Cilak 75 3'}-

At the r-th iteration, m disks D(’")(:I:Y),e§:)), k = 1,2, . . . ,m, are generated by the

Aberth iterations (2.4), where 65:) 2' zip—1:1. I and {zfcﬂ} are the new approximations.

27

Let pm be the separation of those disks and 10(0) = p. Intuitively, when the initial

disks are chosen small enough and the initial approximations are in the initial disks,

the generated disks must be nested; that is, D("+1)(:c):+1), 624-1)) C D(’")(:r):), 62)) .

Based on the result of Theorem 2.2.5, we have the following

Theorem 2.2.6 Let T and t be the largest and smallest multiplicities respectively,
(r) . . . . (r)
and 9- be the distance of the J-th root at the r-th approximation 23- to the root xj.

Among which let e“) be the largest one. Let K be an integer greater than or equal to

two. If initial disks satisfy 6 = 6(0) S , “117627139! then

C(k+1) S K(m _1)T(C(k))3 .
tp2

Proof. We ﬁrst estimate the coefﬁcient of (21C — :rk)(z,- — :rj)(zk — xk) in (2.9).

 

Notice that 21%,: "f::1_:’)z::“_:;) Z 21%,: %Z = gm—‘EDE—z. By the initial assump-

- T —1 2
tion, we have _ﬂp2_h < t. 80,

l ”j I
(2" — xi Mz" — 2,) ln" _ Em nili’41iiliiilii)l

 

 

 

S 92(t - 532—4162).
Therefore,
.5." s p,” _Tg;1(:(0,)2)(e<°’)<e§.°’)2
s aggwmxef’)?
= ————f(§\’?_‘11):,§(e<°>)(e§.°))2
s Mil—Tswmei”)?

tp2

IX/(m - 1)T(£(0))3
— tp2 .
Hence, 6(1) _<_ Wkwl)?
Furthermore,
.53) _ ————-—,p2 wows. ’)

K(m — 1)T lpz (0)

<

_ tp2 K(m — 1)TC“
_ 6(0)

__ k ,

 

Hence, each new disk is contained in the previous disk. Therefore,

K(m — 1)T

6("7+1) <

(4")?

by induction. I

2.3 The Aberth Method at Multiple Zeros

Though we proved in the previous section that the modiﬁed Aberth method con-
verges cubically, no information is available in practice whether a given polynomial
has multiple roots and what the multiplicities of multiple roots are. With n initial
approximations for a polynomial of degree n, one can only start with the original
Aberth method. An effective program should be able to detect the multiplicities of
multiple roots of a polynomial in the process of execution. We ﬁrst investigate the
behavior of the original Aberth method near a multiple root. In this section we show
that the approximations to a multiple root become equidistant on the circle centered
at the root. This information will be used to dynamically detect multiple roots and

their multiplicities.

We ﬁrst investigate the convergence behavior of the Aberth iterations applied to

polynomial 17(2) 2 2". Let rk = if, r; = %,k = 2,3,...,n, and r1 = l,r’l = 1.

29

Assume none of the approximations reaches the root of p(z) in the process. Dividing

both sides of (2.2) by 2;, we have

 

 

 

 

 

 

 

 

 

 

 

 

 

 

2k — p(zk) 1
I z; p’(zk)-p(zn=)2?=1,,¢x 3:3
rk — '—,' —
21 21 _ P051) f
P’(z1)-P(21)2?=2 11—”:
51 _ PM)
21 21(P’(Zk)-P(zk)2;'=1,,¢k 371:3)
_ _ 13(21) T
21(P’(zi)-P(ZI)2?=2 3:37)
51 Z?
— —1
21 21012: ‘21? 23:1,];41: ,k—lgl
_ 1 z?
— —T
21(nz;1 —z{‘ :=2 ﬂ)
,.
k
7'1: — 71—1 1: 1
"r1. "'2' 211:1 1w: 3:
l — ;——,,1—1—
- 1:2 l—rJ
1 — 1 1
""k 2;: .nek a:
_ J
_ Tl:

l— ——l—_r_—_
"-21:21—rj

Qk(7‘1, 7‘2, ..., Tn) .

With Q1 2 1, let Q : (Q1,Q2,...,Qn). Forj = 1,2,...,n, let w,- = 62195411 be
the roots of the polynomial P(z) = z" — 1, where i = \/—1. It is easy to see that

(w1,w2, . . . ,wn) is a ﬁxed point of Q.

Theorem 2.3.1 Ifr = (r1,r2, . . . ,rn) with no zero components is a ﬁxed point of Q,

then r = w.

Proof. Under the assumption, rk satisﬁes,

 

 

71-1
”A.
nrn—I_rn 2n: 1
k k ]= 1,);11: 'lc‘rj
rkzr;c ,kzl,2,...,n. (2.10)

1—n'_2:1=121—jr

3O

Canceling 7‘]; on both sides of (2.10) yields

 

__l__
nrk- -rk Z]: 1,J#k 'k—r)

l-n "8:21—11,

Therefore,

1 1

n 1 —' n 1 a
n 7* Zj=m¢k rk—r ” 21:2 l—r
J J

 

 

or

 

 

n 23f221jm QJU
F

1': 1j¢k Tip

for k=1,2,. .,n

We want to show that r1,r2, ..., rn are exactly the n roots of polynomial z" — 1,
which are spaced equally on the unit circle. It is easy to see that the roots of z" — 1
satisfy (2.11), since 2 Zj¢k——— ”_rj — -'—’£,—)(,(,:—;l for any polynomial with all its roots being
simple, where p’ and pm are the ﬁrst and second derivatives of p.

The right hand side of (2.11) is a constant C for any k = l, 2, . . . , n. Summing up

both sides of (2.11) for h from 1 to n, the right hand side becomes nC.

Let the left hand side of (2.11) be fn, then

fn = in. Z

 

 

 

 

— r
k=1 1': —1.j¢k rk J
n—1 n—1 1 n—1 n—1 1
= Th 7' r + T" Z r + Th
- . — r r — r
k=1 J:1,j;ék " 3:1 " J k=1 " "
= ﬂ4+n—l. (2H)

Since the ﬁrst term of the recursive equation is f2 = l, the solution of the recursive

31

relation is fn = M. Therefore, nC = £33411. Cancelling n on both sides gives

2
" 1 — 1
Tk Z = n 2 , (2.13)

7' —T‘
j=1.j¢k " J

 

 

or,

2n, 2 H(rk—r1)=(n—1)H(rk—r,). (2.14)

j=l,j¢k l¢k.j 1'9“:

Let P(z) = n (z — Tj). Then (2.14) becomes

rkP(2)(rk) = (n — 1)P'(rk) (2.15)

for k = 1,2, ..., n. However, the polynomials zP(2)(z) and (n — 1)P'(z) on both sides
of (2.15) have the same degree n — 1 and the same leading coefﬁcient (n — l)(n — 2).
They are also identical at n distinct points r1,r2, . . . ,rn. So those two polynomials

are identical. We have

 

 

P(2)(z) __ n — 1
P’(z) _ 2 °
Integrating both sides, we obtain P'(z) 2 c2:"‘1 for some constant c. Hence,
P(z) = z" —1 since P(z) is monic and r1 = 1. Therefore, r1, r2, . . . , rn are the n roots
of z" — 1 = 0 and w is the only ﬁxed point of Q. I

Based on Theorem 2.3.1, we can show,

Theorem 2.3.2 Applying the Aberth method {2.2) to p(z) = 2", when the iterations
converge to the n-ple root 0 and none of them reaches it, then they are eventually

equally spaced on a circle around 0 and the modulus of each correction is ;1%.

Proof. The ﬁrst part is a direct consequence of Theorem 2.3.1. For the second

part, assume the approximations are equally spaced on a circle around 0. Without

32

loss of generality, let the approximations 21, 22, - - - ,2n be the n roots of z" -— r = 0

for some positive number r. Then

T!
I zk

zk _ 2k — n l l
- n
nzk — zk 2:5“: “—2,

 

 

 

 

 

 

2!:
: zk_ n(n-l)z"—2
" ‘ k—t—
Zk
= 2k — T
n “ T
2
= z 1—
A n+1)
n —1
= z
72 +1 k
So,
IZk—zll=lzk—n_lzkl= 2 Izkl.
n+1 n+1
That is, the modulus of each correction is $. I

Similar result holds for the Aberth method applied to polynomial p(z) = (z — 3:)"

for a complex number 1:.

For a general polynomial, when we analyze the behavior of the approximations
converging to multiple roots, we may assume the approximations which converge to
simple roots are the roots themselves, since these approximations converge to their

targets very fast (we will show this later).

Theorem 2.3.3 For a polynomial p(z) of degree n which has an m-ple root :31
(m < n) and simple roots x2, x3, ..., :rn_m+1, suppose approximations 21, 22, ...,
2m converge to $1 and other approximations 2...“, ..., 2,, take the roots ofp(z) other
than 3:1 as their initial values, then 21, 22, ..., 2,., are eventually equally spaced on a

circle around $1 and the modulus of each correction is #.

33

Proof. Since approximations zm+1,. . . ,2” are roots of p(z): 2:2, 3:3, ..., $n_m+1,
no iterations for them are needed. We want to show that the Aberth iterations for

21, 22, . . . , 2m applied to p(z) are equivalent to the Aberth iterations applied to 2'" —:1:1.

Let p(z) = (z — xl)mg(z), then the Aberth iterations applied to p(z) for 21, 22,

...,zm are

 

zfczzk— p(z).) 1 ,k=1,2,...,m.

P'(Zk) " p(z).) Z?¢k,,-=1 27.3
The denominators in the above iterations are

P'(Zk) - PM) E l

j¢k.j=l

 

zk—zj

= m(2k - ﬁlm-19(4) + (2k ‘ $llm9'(zkl

 

 

 

 

= mac—mm 19(2.)—(z.—x1)mg(zv Z kl,
J¢k,j<m J

m I n-m+l 1 -|

+(zk—1c1) 9(Zk)“g(zk) Z Zk—SF'J

 

 

, (2k - $1)’"g(2k)

zk = 2k —
9(Zk) m(2k — “ﬁlm" — (2k — $1)?“ Enemy 21:2,

 

(2k ‘xllm

m(zlc — lily”.l _ (2k _ 331),“ ngthSm zk-l-zj .

 

34
This is exactly the Aberth method applied to 2'" — x1. I
Corollary 2.3.4 Under the same condition as in the above theorem, let 10):) be the

correction at the r-th iteration for the k-th approximation 2):) which converges to an

m-ple root . Then,

lim T— — .
Proof B Th 233 h "l — J— z" 1’ d — z" 1’. s
. y eorem . . , we ave wk — m+1 2k an z,c+1 — m——+lz o
(r+1)_ (r)_ m- (r-l)
wk — milzk — mil'oTi 1: ° '

The above results also show that the original Aberth method converges linearly if
the polynomial has multiple roots. That is, those approximations which converge to
multiple roots converge to their targets linearly. Next, we show that the approxima-

tions which converge to simple roots converge much faster.

Lemma 2.3.5 Let 2k = a + rkee" ,-k — 0,1,2,” — 1, where i = \/—1, a is a

complex number and rk > 0. Let r 2 rk and z E C with | z — a |> 1‘. Then

 

 

 

 

 

 

 

 

 

 

1 m
k=0 2 — 2;, z — a
with some ﬂ satisfying I ,6 |< (Iz- al)(|r:- _a|_r)
Proof.
1 1 _ 1 1
z-—z;c z—a — z—a—rkeek‘ z—a
”69,4
r(z — a — rkeak')(z — or),
and
ski
rke 1
I IS

r(2—a—rke9k‘)(z—a) (Iz—a|)(|z—a|—r)'

35

 

 

 

 

Let
m—l rkepk,‘
B = 0 3 ,
k=0 r(z — a — rke * )(z — a)
m—l l m m
then, 2,50 z_2k — z_a = 7‘2 and Iﬂ IS (|z-o|)(|z—o|—r)' I

Theorem 2.3.6 Suppose the Aberth approximations applied to p(z) converges and

the approximation starting at z, converges to a simple root x1, then, eventually,

I zlr+1)— :6.- IS 063,

("+1)

where z,- is the approximation at step (r + l), C is a constant and

e=max{|z§r)—xj |,j:1,2,...,n}.

Proof. Suppose p(z) = H2":I(z — xi)” , where x1,x2, ...,xm are m distinct roots

with multiplicities n1, n2, ..., nm. Then

 

 

Let c be a small number such that the distance of every Aberth approximation
to its target is decreasing as soon as all the approximations are within 6 distance
from the roots. Let p be the minimum distance of any two distinct roots x1. and x,.
Let c be far less than p, say 46 < p. For simplicity, let n1 2' 1, 21 approximates x1,
22, . . . ,zn,+n, approximate x2, . . . , Zn1+n2+...+nm_1 , . . . ,zn approximate xm. Suppose

all the approximations are within 6 distance from their targets. From Lemma 2.3.5,

 

 

we have,
"1
1 n-
J
E , = + 7351'
k=1 21 _ zn1+...+n]_1+k 21 _ xj

. n 8n -
Wltl’l r, < e and I 2, |< 17%; = 3;} SlIlCe Zn1+...+n,_1+1,. . -,Zn1+...+n, converge to x,.
4 4

36

 

 

 

 

 

 

So,
, 1
21—31 — Zi—iBl—p.(21)_z,, 1
p(zl) 3:2 7.1—2,
1
= 21 — .751 —
m _"J__+ _
21:2 zl—xJ-l- zl —1:l 2m :2 zl—xJ —l—+ rj/Bj
l
= 21— 171‘
1 m . .
z1-1731
2' 21—1‘1 —

 

1 "’ Z?=2(zl _ lerjﬂj

1

1— 21:2(21— $07351]

_ :71: 2(31 ‘5cllrjﬂj
— (21 — 3:1)1 “ 2m=2(zl“ ‘31 )rjﬂj

Hence, when r,- < e we have

 

= (zl—xl) l—

 

8 n-l
3p 3
z —x < ———e
3p
16
Therefore, | 7.1 — x1 |< 1_L;p;1_1€ 3 when 6 satisﬁes the extra condition 62 < ———33— I

16(n— 1)

This theorem does not imply that the rate of convergence for the approximations
which converge to simple roots is cubic. When 6. is the largest error of all the errors
| 2):) -- 23;. l, k = 1,2, . . . ,m, at step 7‘, the conclusion of the theorem indicates the
error at step r + 1 for approximations converging to simple roots is less than C 63,
for some ﬁxed constant C. Apparently, it is much faster than the convergence rate
of the approximations which converge to multiple roots where the error at step r + 1
is less than C16 with C1 < 1. In practice, the approximations converging to simple
roots converge to the roots very fast while the approximations converging to multiple

roots display a very slow convergence rate.

37

In [29] and [17], ﬁnding multiple roots of polynomials by Durand-Kerner method
is discussed. Like the Aberth method, the Durand-Kerner method can ﬁnd all the
roots of a polynomial with simple roots very efficiently. It converges quadratically in

this case while it converges only linearly when multiple roots exist.

In [29] a result similar to Theorem 2.3.2 for the Durand-Kerner method is proved
without using the modiﬁed Durand-Kerner method to reﬁne the approximations. In
[17] a different technique is used to handle the multiple roots and obtains superlinear
convergence rate. Their numerical results show that both approaches can not ﬁnd

the multiple roots with high accuracy.

2.4 Dynamic Estimation of Multiplicities

For a given polynomial p(z) with no information in advance, one can only use the
original Aberth method (2.2) in the beginning. It is desirable that the multiplicities
of multiple zeros of p(z) can be dynamically detected in the process of execution.
The approximations can then be grouped into different groups according to the data
available concerning whether they converge to the same root. Then, the modiﬁed
Aberth method is applied. For a group of approximations converging to the same
root, the proposed algorithm takes the average value of all the members in the group.
This average value should be closer to the multiple root than any approximation
in this group as the approximations are spaced equally on a circle around the root
by Theorem 2.3.2 and Theorem 2.3.3. This average is used in the modiﬁed Aberth
iterations. The same approach can be found in [17] which uses these properties
without proof. Each 2;. takes a small random perturbation of the averaged value for
the next iteration since the same Durand-Kerner iteration functions are used and

the approximations must be distinct. While this approach provides a supper linear

38

convergence rate, it does not yield a high accuracy.

In [29], the correction
f (2k)
Hj¢k(zk ‘ ZJ‘)

is multiplied by 72],, the estimated multiplicity, to be added to 2],, since the modulus of

 

the correction is 771; for the Durand-Kerner method. In this way, quadratic convergence
is achieved theoretically. Again, this approach can not yield results with high accuracy
since no new formulae are used in the process of iterations. None of the remedies
can avoid the overﬂow problem because the reciprocals of small differences of close
approximations are too large. The modiﬁed Aberth method introduced in this chapter
can avoid the overﬂow problem because the approximations in the same group are
treated as a single value in the new formula. Consequently, the roots of the polynomial

can be obtained with high accuracy.

Our technique of dividing approximations into groups is based on the following

observations:

1. If the approximations to an m-ple root are sufficiently close, m approximations

are situated on a small circle centering around the root by an angle of 2;".

2. Every correction directs toward the center of the circle and has almost the same

magnitude.

3. In comparison with approximations of simple roots, the order of convergence is

slow, i.e., the value of the residual is relatively large.

4. The magnitudes of the residuals for m approximations are almost the same.

After some iterations we can group the approximations into different groups by
the following method. If the k-th and j—th approximations are in the same group,

they must satisfy the following conditions:

39

1. The corrections wk and w, should satisfy

lwkl

1—a<
lel

 

<1+a, (am)

for a small number a.

2. Let 0].,- be the angle between vector 2, — 2;, and vector wk, and 0,]. be the angle

between vector zk — z,- and vector wj. These angles should satisfy

| cosﬂk, — €0.3ij |< B , (2.17)
for a small number ,8. Notice that cosﬁkj = W > 0 and 0050,}. =

(2:;.,—zz ,m,) > 0
lzk‘zjllel .

3. We can count the number of approximations satisfying the above two crite—
ria to ﬁnd the multiplicity. To make sure this group contains all appropriate
approximations we check the criterion:

[W£r+l}]_m—1

— —< 2.1
wa’}| m+1| 7’ ( 8)

for a small number 7.

In our implementation, we trace the rate of convergence of each approximation
by keeping track of each residual wk. At ﬁrst few iterations, wk usually ﬂuctuates

{r-H}

wildly. Then all 11);, start decreasing monotonically. By checking the ratio 35-m—

r
k

which converges to (m — 1) / (m + 1) by Corollary 2.3.4, one can tell which approx—
imations converge to simple roots. By Theorem 2.3.6 the ratios corresponding to
approximations which converge to simple roots converge to zero very fast. We ob-
served that as soon as all residuals start decreasing it takes at most 10 iterations
for those approximations which converge to simple roots to converge to their targets

within the chosen accuracy. Then the program begins to detect the multiplicities

40

and to execute the grouping process. At ﬁrst, the residuals are sorted by their mag-
nitudes and the grouping process is invoked to group the approximations according
to criteria (2.16) and (2.17). The program checks criterion (2.18) for each group.
If it is satisﬁed, the program takes the average of the approximations in the group
and use it in the modiﬁed Aberth method. If criterion (2.18) is not satisﬁed the ap-
proximations in this group are used in the modiﬁed Aberth method instead of their
average. When the grouping process is successful it takes only a few iterations for the
modiﬁed Aberth method to converge since the averages are very close to the roots
and the method converges cubically. In our experiment, it usually takes less than 3
iterations for modiﬁed Aberth method to converge after the grouping. In practice,
the grouping process may include wrong approximations in a group. In this case, the
modiﬁed Aberth method will converge linearly. The program disbands the groups
with linear convergence rate and convert to the beginning with the newly computed
approximations as initial approximations. The ﬂow chart of the program is given in

Figure 2.1.

2.5 A Numerical Example

We demonstrate our method by the following numerical example. Let
p(z) = (z —1.1—1.1z')4(z — 3.2 — 2.3i)2(z — 2.1—1.5).

This polynomial was also used in [29] with which we will compare our result. As
in [29], we choose the initial approximations 2;, = 627”"77, k = 0,1,. . . ,6. After 12

iterations the approximations and the residual | w,- I are

real imaginary residual

1. 2.1000000000000 1.5000000000000 0.

 

In fact, the lst approximation reaches the exact root at the 10—th iteration. The

4—th, 5-th, 6-th and the 7-th are grouped as one group with multiplicity 4. Their

3.2002858826098

3.1997141178405

1.0968596312555

1 .09406242073 72

1.1059206829313

1.1032928553054

average is

41

2.2998138134465

2.3001862562026

1.0941839385686

1.1031845106896

1.0967222422851

1 . 1060241399490

6.8221947921266D—04

6.8252015101818D-04

4.2572547041395D-03

4.4900998706488D-03

4.5441352406740D-03

4.7093618963726D-03

(1.1000338975574, 1.10002287078731).

It is very close to the exact root (1.1, 1.1).

Grouping the 2nd and 3rd in one group gives an average

which is also very close to the exact root (3.2, 2.3). These demonstrate the effective—

(3.2000000002252, 2.3000000348246)

ness of Theorem 2.2.6.

Then the program switches to the modiﬁed Aberth method (2.4) . Only one

iteration gives all the exact roots.

1.

2.

3.

real

1 . 1000000000000

3.2000000000000

2.1000000000000

imaginary

1.1000000000000
2.3000000000000

1.5000000000000

42

In [29], the same groups are formed after 14 iterations. It took another 7 iterations
for the program to stop. The ﬁnal results for the approximations converging to the

root (1.1,1.1) are listed below with accuracy to the 4—th digit.

real imaginary
1. 1.09914601 1.10078329
2. 1.10092914 1.09945021
3. 1.09931441 1.09917266
4. 1.10045233 1.10079471

The result is even far less accurate than our average.

In our experiments the approximations converge linearly until the modiﬁed Aberth
method is applied. Once the multiplicities are detected, the modiﬁed Aberth method

converges extremely fast.

 

43

 

Algorithm MODABERTH

Input: polynomial p(z) and its degree n
the initial approximations 2;
Output: the roots of p(z)
begin
for i = 1 to 2 do
for j = 1 to n do
compute new approximations z;
compute residual ué, save old one as an
for j = 1 to n do
compute the ratio Tj=3Ué/UU
for i = 1 to 2
for j = 1 to n
compute new approximation as before
compute residual as before
compute ratio as before and save old ones
now we have 3 consecutive ratios for each approximation
we want to check if the ratios are all decreasing monotonically
while not decreasing
do one iteration for each approximation
while there is an i such that r5< smalll and w,- > smallg
! if there is an approximation converging to a simple root but
! it has not yet converged.
do one iteration for each approximation
while not done
sort residuals un,um,~-,um.
for i = 1 to n and w; > small;
for j = i+1 to n
check if j should be in the same group as i
for each group, check criterion 3
if all groups satisfy criterion 3 set success = 1
if success = 1
while not all convergent
use modified Aberth method once for each approx
if linear convergence is observed
break up groups and goto begin
done = 1 ! all approximations are convergent
else
use modified Aberth method once for each approx
end MODABERTH

 

 

Figure 2.1: Algorithm MODABERTH

Chapter 3

Nonsymmetric Eigenvalue Problem

3.1 Introduction

This chapter presents a parallel algorithm to solve the eigenvalues of non-symmetric
matrices. Both homotopy and divide-and-conquer strategies are used. The original
matrix is split into two or more submatrices, then a homotopy between the sub-
matrices and the original matrix is established such that the intermediate matrices
have only simple eigenvalues. The Aberth method is then applied to reﬁne all the ap-
proximations simultaneously. Section 3.2 discusses the split-homotopy strategy, while
Section 3.3 introduces the Hyman method for computing the characteristic polyno-
mial and its derivative. In Section 3.4 we make a straightforward implementation on
Intel iPSC/ 860 and Intel Delta multicomputers. However, this initial approach did
not use the processors uniformly so we modify our technique in Section 3.5 to balance

the load for message-passing distributed systems.

3.2 Splitting Procedure and Homotopy

Since any real matrix A can be transformed to a similar matrix in upper Hessenberg
form by an orthogonal transformation, we shall assume throughout this chapter that

matrix A is an upper Hessenberg matrix, namely,

44

45

A = (as) = a3?

0 an—Ln—2 *

\ anm-l

 

 

**
\;_

For convenience we assume n is an even number. We further assume A is irre-
ducible, that is, none of the subdiagonal entries am-“ 2 < j < n, is zero, otherwise
we can consider the reduced matrices. Let k = 71/2. Let D 2 (did) be the matrix

obtaining from A by making the subdiagonal entry ah“). = 0, namely,
_ A11 A12
D _ ( 0 A22 ) .
The eigenvalues of D is the union of the eigenvalues of smaller matrices A11 and

A22. It takes much less time to compute the eigenvalues of A11 and A22 and it can be

done in parallel. The following homotopy is established
H(x\,t)=c(1—t)det(D—AI)+tdet(A—)\I) (3.1)

where c is a random complex number and 0 S t S 1.

For each t, since H (A, t) is a polynomial in A of degree 12, there are n zeros: A1(t),
A2(t), . .. , An(t) . They are called the eigenpaths of system (3.1). It is well known
[23] that the eigenpaths do not meet in the interval (0, 1) for a randomly chosen 0.
It implies that the zeros of H (A,t) at each i, 0 < t < 1, are distinct. Our algorithm
does not depend heavily on this property, though it makes the algorithm easier to

implement and more efﬁcient.

Notice that the zeros of H (A, 0) are the eigenvalues of matrix D while the zeros of

H(/\, 1) are the eigenvalues of matrix A. We choose 0 = to < t1 < t2 < < tm = 1

46

for some positive integer m. The idea is to use the zeros of H (A, t) at t,- as the initial
approximations to the zeros of H (A, t) at ti“. Aberth method or our modiﬁed Aberth
method may be applied to reﬁne the approximations. It is also possible to use the
tangent line at Ak(t,) to predict the initial approximation for Ak(t.-+1). Our numerical
experiments show that the algorithm of the second [approach is slower than that of
the ﬁrst one because of the extra calculations of derivatives in the variable t and the
global convergence property of the Aberth method in practice. Here, we only discuss

the ﬁrst approach.

In the spirit of the Aberth method, approximations at each step must be distinct.
It is possible that some of the zeros are identical at t = 0 that makes the Aberth
method not applicable. In such case, we simply make random perturbations to make
the initial approximations distinct. Notice that our goal is to ﬁnd the zeros at t = 1.
It is unnecessary to approximate the zeros with high accuracy at the intermediate
steps t,- < 1. In our implementation, the algorithm is allowed to run for a ﬁxed
number of iterations at the intermediate steps regardless of the convergence. The
global convergence property of the Aberth method in practice ensures the eventual

convergence at t = 1.

The problem of ﬁnding the zeros at t = 0, i.e., the eigenvalues of both A11 and
A22 are solved by the fastest available serial codes or by applying the same splitting
and homotopy strategy. In the latter case, we keep dividing the matrices until certain
dimension then the fastest serial code is applied. For instance, suppose the dimension
of the original matrix A is n = 2’, the number of processing nodes is k = 2’", and
m < n. One can split matrix A into I: small matrices, all of them have dimension

L = 2”“, the serial algorithm is then applied to those small matrices.

47

3.3 Hyman’s Method

In order to extract the zeros of H (A,t) iteratively by Aberth method, H (A,t) and

8H :r.t
8A

[16] which is remarkably backward stable as pointed out by Wilkinson [34]. Notice

need to be evaluated efficiently. For this purpose, Hyman’s method is used

that det(D — AI) = det(A“ — AI) det(A” — AI). Differentiating equation (3.1) with
respect to A, we have,

6H(A,t)
ax

(9det(A11— AI)
6A

0det(A22 — AI)

= c(1—t) 8A

 

det(A22 — A1) + det(A“ — AI)

 

8det(A — AI)
+t 6A .

 

Since A11 and A22 have the same structure as A, we only need to address the

problem of computing det(A — AI) and N—etaﬂxé—Q in more details.

For A = (a;,-) and J: = ($1,$2,...,xn)T, the system of equations (A — AI):r = 0

can be written as

(all — ”131+ 012132 + ° ° ° + a1n$n = 0
021151 + ((122 - ”$2 + ' ' ° + a2n$n = 0

For given A, letting 23,, = 1 in the last equation, we can solve the last n — 1
equations recursively for a:,._1, . . . , .132, :31. These values are then used to evaluate the

left-hand side of the ﬁrst equation,
F”): (011— ”$1 + 012132 + ° ' ' + a1nl‘n-

It is clear that F(A) = 0 if A is a eigenvalue of A.

48

To computer det(A — AI), we write matrix A — AI as

 

 

(an—A (112 013 aln \
021 (122 — A 023 "' "° a2n
a32 033.. A ... ... as"
0 an—1,n—-2 an—lm-l " A art—1,11
\ an,n—l ann _ A

Forj = 1,2, . . . ,n the jth column is multiplied by 1:,- as found above and added to

the last column, thereby yielding the matrix

 

 

{ 011— A 012 013 al.n—l F(A) \
021 022 - A 023 "' "' C12,n—1 0
(132 033 — A ' ' ' (13.n—1 0
0 an—l,n—2 an—l,n—l — A 0
\ an,n—l 0 )
Therefore
n-l
det(A — AI) = (—1)"‘1F(A)H (1,11,.
i=1

To compute W, we need to compute 5%? Differentiating (3.2) with respect to

A, taking into account that 1:], j = 1,2, . . . ,n — 1, are functions of (A,t), we have

81? 8:1: :1:
(all._.A)_i%._.xl.+.a12_73_+.”..+.al %i?._.0
8x 8x 8x

axk_ 3r 8
ahk-l ‘ (akk-A)3f-$k+~°+akn-§f=0

(3.3)

 

8r"- 8
ann—l—__l 'l' (ann _ A)ﬁl _mn : 0

. . a n_ a n_ . .
With 83”} = 0, we solve (3.3) success1vely for $8, ‘ , I8, 2 , . . . , %, usmg the prev1ously

 

 

computed values of 1:1, 3:2, . . . , at”. The value of the left-hand side of the ﬁrst equation

in (3.3) is then $3.

49

3.4 Initial Implementation

We implemented the algorithm on the Intel iPSC/ 860 gamma multicomputer and

Intel Touchstone Delta multicomputer.

Both iPSC/ 860 and Delta multicomputers are distributed memory MIMD ma-
chines [15]. They consist of a collection of computing nodes each of which has a
processor and a local memory. The nodes are inter-connected by networks. The
iPSC / 860 system uses hypercube topology architecture while the Delta system uses
2 dimensional mesh. The computing nodes for both systems are the Intel 2860 pro-

CCSSOI‘S .

The 128 computing nodes of the iPSC/86O are networked as a 7 dimensional
hypercube. The 512 computing nodes of the Delta system are networked as a 16 by

32 mesh. Each row has 32 computing nodes and each column has 16 nodes.

The communication between processors for both systems are done by a library of
explicit message passing functions. It is the programmer’s responsibility to locate the
synchronization points of the program and specify the communication between the

processors by explicitly passing messages.

It is expected that the Delta multicomputer offers better performance since for
some applications 2—D mesh is a natural topology and the Delta multicomputer uses
faster switching techniques [1, 6]. In our application we have not observed any signif—
icant improvement of the Delta multicomputer over the iPSC / 860 system since our
program doesn’t need extensive communication. On the other hand, the communica—
tion in our program is solely broadcasting in which the nodes broadcast the messages
in turn with no message contention. The special faster switching technique for the

Delta multicomputer lose advantage for our application.

50

 

Algorithm ABERTH

Input: the matrix A

the initial approximations A5, i=1,2,---,n.

the number of processors m, where n is a multiple of m
Output: the eigenvalues of A, A5, i=1,2,---,n.
begin ABERTH

s = n/m ! my share of computation load
! begin computation
iam = mynode() ! system call, returns the id of this node

for i=iam=ks+1 to (iam+1)*s
update Ag! use Aberth method
end for
! end of computation
! begin communication
for £220 to n1-1 ! the node counts from 0 to nz—-1
if (iam = i) then
send Agam.,+1,°'°,/\(gam+1)" to all other nodes.
else
receive Aiam13+1,---,A(,am+1)., from node i.
end if
end for
! end communication
end ABERTH

 

 

 

Figure 3.1: Algorithm ABERTH

Both systems support Fortran 77 with message passing libraries. Their main
differences are their inter-connection topologies and the switching techniques which
are invisible to the users. The programs written for one system are portable to the
other. In our implementation, programs were written for iPSC/ 860 multicomputer

ﬁrst, then ported to the Delta multicomputer without modiﬁcation.

The program has two phases. The ﬁrst phase splits the matrix into two or more
submatrices. Then subroutine H QR in EISPACK is called to ﬁnd the eigenvalues
of each submatrices. Those eigenvalues are used in the second phase as the initial

approximations to the eigenvalues of the target matrix. The Aberth method is then

51

used iteratively to ﬁnd all the eigenvalues in parallel. The basic structure of the

program for one iteration in the second phase is shown in Figure 3.1.

When compared to the fastest available sequential code H QR in EISPACK run-
ning on one node, our parallel program was slower when less than eight nodes were
used, see Table 3.1, Table 3.2 and Table 3.3 for matrices with different dimensions.
For eight or more nodes, the parallel version was faster, by a factor of two on 16
nodes. Execution time improved as the number of nodes increased. As more nodes
are added the improvement ﬂattens off due to the load imbalancing problem and the

communication overhead.

 

 

 

 

 

 

 

 

 

nodes sec

1 2.72

2 1.55

4 0.97

8 0.67

HQR on 1 node 16 0.52
0.75 sec 32 0.47

64 0.60

 

 

 

 

 

 

 

 

Table 3.1: Time on i860 with a 64 x 64 random matrix.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

nodes sec

1 18.96

2 10.61

4 6.63

8 4.23

HQR on 1 node 16 2.83
4.56 sec 32 2.25

64 2.48

128 2.72

 

 

 

Table 3.2: Time on i860 with a 128 x 128 random matrix.

52

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

nodes sec

1 149.45

2 84.54

4 51.22

8 . 30.29

HQR on 1 node 16 17.87
32.24 sec 32 10.81

64 8.21

128 9.82

256 13.38

 

 

Table 3.3: Time on i860 with a 256 x 256 random matrix.

3.5 Load-Balancing

In this section we consider the load-balancing problem. The Aberth method ﬁts
parallel computers naturally. Suppose n nodes are used for the computation of the n
eigenvalues of an n x 71 matrix. Each node is responsible for ﬁnding one eigenvalue.
For direct implementation, every node has a copy of the complete information of
the matrix and all the 11 initial approximations. The nodes execute calculations
simultaneously so that one node updates only one approximation. Since all nodes
have the same amount of computation, they generally complete the calculation at
the same time. Then every node broadcasts the updated approximation to the other
nodes. This process continues until all the nodes ﬁnd the expected eigenvalues. A
natural stopping criterion is, as soon as an approximation is good enough no further

iteration on that approximation is needed.

In Section 3.4 we have shown that the parallel algorithm can beat the best serial
algorithm, H QR in EISPACK, for a sufﬁciently large number of processors using an
Intel Delta multicomputer. However, if the number of nodes is far less than the num-

ber of eigenvalues, each node is responsible for ﬁnding more than one eigenvalue and

53

 

 

 

 

 

 

] Si 82 S3 S4[85 S6 S7 S8 S9 810 811 S12 813 814 S15 S16]
N0 10 10 10 10 9 7 7 5 5 5 3 2 1 1 1 0
N1 10 10 9 8 6 4 3 3 2 2 2 1 1 1 0 0
N2 10 10 10 10 9 9 8 8 8 7 7 6 3 2 1 1
N3 10 10 10 9 9 8 7 5 5 4 3 3 2 1 0 0
N4 10 10 10 9 9 8 7 7 6 6 6 4 4 3 1 1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Table 3.4: Unbalanced Load Distribution

some nodes may complete their jobs earlier. This difference causes load-imbalance.

To illustrate the load-imbalance, let’s compute the eigenvalues of a 50 x 50 matrix
whose eigenvalues and the initial approximations are randomly chosen in the region
{2 E C, [2] < 1}. With 5 nodes, the program assigns the ﬁrst 10 eigenvalues to the ﬁrst
node, the second 10 eigenvalues to the second node, . . . , and the last 10 eigenvalues
to the last node. Since it takes the same number of ﬂoating-point operations for any
node to update one approximation, we will count the amount of time to complete this
number of operations as one unit time. Table 3.4 lists the number of eigenvalues each
node computes at each step—each row represents a node and each column represents

a step. It takes 105 parallel time units for the program to complete.

The key for the load-balancing problem for the Aberth method is the observation
that if every node has a complete copy of the previous approximations, it can update
any approximation. The total number of eigenvalues is sufﬁciently small for each node
to have a copy. Based on the differences between the old approximations and the new
ones it received, each node can decide which approximations are good enough and
need no more updating. Once a node has complete information on which approxima-
tions need to be updated and the number of nodes involved in the computation, it

can determine the distribution of approximations to nodes. The number of approx-

54

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

SI 82 83 84 85 86 S7 88 89 810 S11 812 813 814 815 816]
N0 10 10 10 10 9 8 7 6 6 5 5 4 3 2 1 1
N1 10 10 10 9 9 7 7 6 5 5 4 3 2 2 1 1
N2 10 10 10 9 8 7 6 6 5 5 4 3 2 2 1 0
N3 10 10 10 9 8 7 6 5 5 5 4 3 2 1 0 0
N4 10 10 9 9 8 7 6 5 5 4 4 3 2 1 0 0

 

 

 

Table 3.5: Balanced Load Distribution

imations can be evenly divided among nodes, and that determination is completely
distributed. Note that each node also knows the approximations other nodes are to

update. This information is required when a node broadcasts the updated approxi—

n1
nnodes

 

mations later. Node i updates [ ] + 1‘.- consecutive approximations, where m is

the number of approximations to be updated at this step, nnodes is the number of

n1
nnodes

nodes involved in the computation, r,- = 1 if i < m — [

 

] and r.- = 0 otherwise.
For example, when there are 50 approximations to be updated and there are 5 nodes,
each node gets 10 consecutive approximations. If there are 33 approximations, then
node 0, node 1 and node 2 each gets 7 consecutive approximations while node 3 and
node 4 each gets 2 consecutive approximations. For this balanced program, at each
step every node computes the approximations it is responsible for. Table 3.5 lists the
number of approximations each node computes at each step for the above example.
The program takes 97 parallel time units to complete which represents a distinct
improvement over the 105 time units for the unbalanced example. Figure 3.2 shows

the corresponding timing information.

The unbalanced program uses static scheduling which assigns the i-th chunk of
approximations to the i-th node. The balanced program, on the other hand, assigns

the i-th share of approximations to the i-th node dynamically at each step. For this

NodeO
Nothl

Nodal
Nodc3
Nod¢4

55

 

 

 

 

 

 

 

Nacho
i i I i i ' ' Nab!
. . —h h h h “I Noch
Lp—I—I-I-I-IIIJH New .
' 5* i—n Nodu ,, . . " "or
F*hh :_ h h :p?][ o 10 a no a o n u 10 nu “commun-
. m.___ III-I'l—ii
m 4 l 1 L11; No.0flt‘
0 I0 I! so co ‘9 5| U 7‘ I1 ” ”INIMWI”
(a) Unbalanced Timing (b) Balanced Timing

Figure 3.2: Unbalanced vs. Balanced Timing

seconds

601-

45 ~~

 

30 «—

 

 

 

 

l
1

2 3 4 5 6 No. of Nodes 2 3 4 5 6 No. (I Nodes

(a) Timing for 240 x 240 matrices (b) Timing for 360 x 360 matrices
Figure 3.3: Timing Info for 240 X 240 and 360 x 360 matrix

dynamical scheduling extra computation is needed to determine the shares for all
the nodes at every node. Fortunately, this computation takes only 0(n) operations,
while one updating of an approximation takes 0(n2) operations in Hyman’s method

to evaluate the polynomial and its derivative. This extra computation is negligible.

We implemented the balanced algorithm on a cluster of up to 6 DEC Alfa 3000
workstations running Message-Passing-Interface (MPI) with Fortran 77. MPI is a
widely used standard for writing message-passing programs. It is available for al-
most all existing high performance machines [27]. The test matrices are real upper

Hessenberg matrices whose entries are randomly generated in the range (-10, 10).

56

The matrix is divided in the middle into two smaller real upper Hessenberg matrices.
Subroutine H QB. in EISPACK is called to ﬁnd the eigenvalues for each of the two
matrices. The eigenvalues for the two matrices are merged as the initial approxima-
tions to the original matrix. Aberth method is applied to reﬁne the approximations.
We tested the matrices of degree up to 360 x 360. The results show that the balanced
program saves about 10% for the chosen class of matrices compared to the unbal-

anced program. Figures 3.3 shows the timing information for test matrices of degree

240 x 240 and degree 360 x 360.

3.6 Conclusion

In this chapter we describe a parallel algorithm to solve the eigenvalues of non-
symmetric matrices. The straightforward approach leads to an unbalanced utilization
of processors 80 a dynamically load balanced approach was developed and demon-
strated. At this time, only six processors were available so the gains are small. Since
the computations grow much faster than communication we expect signiﬁcantly bet-

ter results on a larger machine, especially with respect to the fastest serial algorithm,

HQR.

Bibliography

[1] Aad J. wan der Steen, An overview of (almost) available parallel systems,

Publication of the NCF, December 1993.

[2] O. Aberth, Iteration Methods for Finding all Zeros of a Polynomial Simulta-
neously, Math. Comp., Vol.27,1973, pp. 339-344

[3] G. Alefeld and Herzberger, 0n the Convergence speed of Some algorithms
for simultaneous approximation of polynomial roots, SIAM J. Num. Anal,
11(1974), 237-243.

[4] Alan F. Beardon, Iteration of Rational Functions, Springer-Verlag, 1990.

[5] W. Borsch-Supan, A posteriori error bounds fro the zeros of polynomials, Nu-

mer. Math. 6(1963) pp. 380-398.

[6] Lionel M. Ni, Lecture Notes of CPSSQQ, Computer Science Department, Michi-
gan State University, Spring, 1995.

[7] J .Cuppen, A divide and comquer method for the symmetric tridiagonal matri-
ces, Numer. Math., 36(1981), pp. 177-195.

[8] K. Docev, A modiﬁed Newton method for the simultaneous approximation of
all roots of the given algebraic equations, Fiz. - Mat. Spis Bulgar. Akad. Nauk
5 (1962) pp. 136-139 ( in Bulgarian ).

[9] J. Dongarra, M. Sidani, A parallel algorithm for the nonsymmetric eigenvalue
problem, SIAM J. Sci. Comput. 5(1993), pp. 542-569.

[10] J. Dongarra, C. Sorensen, A fully parallel algorithm for the symmetric eigen-
value problem, SIAM J. Sci. Statist. Comput. 8(1978), pp. Sl39—@154.

[11] E. Durand, Solutions Numeriques des Equations Algebriques, T.I. Paris 1960.

[12] L.W. Ehrlich, A modiﬁed Newton method for polynomials, , Comm. ACM 10
(1967) pp. 107-108.

[13] M.R.Farmer,G. Loizou, A CLASS OF ITERATION FUNCTIONS FOR IM-
PROVING,SIMULTANEOUSLY, APPROXIMATIONS TO THE ZEROS OF
A POLYNOMIAL, BIT 15 (1975),pp. 250 - 158.

[14] Hamilton, A Type of Variation on Newton’s Method, Amer. Math. Monthly,
57, 517-522(1950).

57

 

58

[15] Kai Hwang, Advanced Computer Architecture: Parallelism, Scalability, Pro-
grammability, McGraw-Hill Inc. 1993.

[16] M. Hyman, Eigenvalues and Eigenvectors of general matrices, presented at the
12th National Meeting of the Association for Computing Machinary, Houston,
Texas, June 1957.

[17] Pierre Fraigniaud, The Durand-Kerner Polynomials Roots-Finding Method in
case of Multiple Roots, BIT 31(1991),112—123.

[18] T.L.Freeman, Calculating Polynomial zeros on a local memory parallel com-
puter, Paralle Computing 12(1989) 351-358.

[19] M.W. Green, A.J.Korsak and M.C.Pease, Simultaneous iteration towards all
roots of a complex polynomial, SIAM Rev. 18 (1976), pp. 501-502.

[20] I Kerner, Ein Gessamtschrittverfahren zur Berechung der Nullstellen von Poly-
nooem. Num. Math. 8(1966), 290-294.

[21] Gbran Kjellberg, Two Observations on Durand-Kerner’s Root-Finding
Method, BIT 24(1984),556—559.

[22] K. Li and T.Y. Li, An Algorithm for Symmetric Tridiagonal Eigenproblems—
Divide and Conquer with Homotopy Continuation, SIAM J. Sci. Statist. Com—
put., 14(1993), pp. 735-751.

[23] T.Y. Li, T. Sauer, J. Yorke, The random product homotopy and deﬁcient poly-
nomial systems, Numer. Math., 51, (1987), pp. 481-500.

[24] T.Y. Li , Zhonggang Zeng, Homotopy-determinant algorithm for solving non-
symmetric eigenvalue problems, Math. Comp. 10(1992), pp. 483-502.

[25] T.Y. Li and Zhonggang Zeng, The Laguerre Iteration in Solving the Symmetric
Tridiagonal Eigenproblem, Revisited, SIAM J. Sci. Comput. 9(1994), 1145-
1173.

[26] T.Y.Li, Z. Zeng, and L. Cong, Solving eigenvalue Problems of real nonsymmet-
ric matrices with real homotopies, SIAM J. Numer. Anal. 29(1992), 229-248.

[27] Message Passing Interface Forum, MPI: A Message-Passing Interface Stan-
dard, May 5, 1994.

[28] M. Marden, Geometry of Polynomials, Amer. Mathematical Soc., Providence,
RI, 2nd ed., 1966.

[29] Tsuyako MIYAKODA, Iterative method for multiple zeros of a polynomial by
clustering, J. Corn. and App. Math. 28(1989),pp. 315-326.

[30] J .M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations

in Several Variables, Academic Press, 1970.

[31] B.N.Parlett, Laguerre’s Method Applied to Matrix Eigenvalue Problem, Math.
Comp. 18(1965), pp. 464-485.

59

[32] L. Pasquini and D. Trigiante, Il metodo di continuazione e l’approssimazione
simultanea degli zeri dk un polinomio, Monograﬁe di Soft. Matem., N. 30,

Pubbl. dell’IAC, 1984.

[33] T. Terano, On a global algorithm for algebraic equationss, PhD Thesis, Uni-
versity of Tokyo, Information engineering course (1978).

[34] J.H. Wilkinson, Error Analysis of Floating—point Compuation, Numer. Math.
2 (1960), 319-340.

"lllllllllllllllf