} LIBRARY
:00"? Michigan State
University

 

This is to certify that the
dissertation entitled

ALGORITHMS FOR SOLVING POLYNOMIAL SYSTEMS
BY HOMOTOPY CONTINUATION METHOD AND ITS
PARALLELIZATION

presented by

Chih-Hsiung Tsai

has been accepted towards fulﬁllment
of the requirements for the

PhD. degree in Mathematics

 

 

n
I , \-/ I, r \
/&A - / .e___\_ C 1

Major Profe’ssor’s Signature

7/79//08’

Date

 

MSU is an aﬁinnative-action, equal-opportunity employer

.1--.-.-I-I-I-O-l-I-O-I-I-I—O—O-l-I-O-I-I-I-U—l-l-I-

4 -4-4-s-a-I-n-A-I-u-v-l-O-I-n-o-l-I-lh

n.-.-.—v- - _ .

 

 

 

 

PLACE IN RETURN BOX to remove this checkout from your record.
To AVOID FINES return on or before date due.
MAY BE RECALLED with earlier due date if requested.

 

DATE DUE DATE DUE DATE DUE

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

5108 K:IProj/Acc&Pres/ClRC/DateDue‘indd

ALGORITHMS FOR SOLVING POLYNOMIAL SYSTEMS BY
HOMOTOPY CONTINUATION METHOD AND ITS
PARALLELIZATION

By

Chih-Hsiung Tsai

A DISSERTATION
Submitted to
Michigan State University

in partial fulﬁllment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY
Department of Mathematics

2008

ABSTRACT

ALGORITHMS FOR SOLVING POLYNOMIAL SYSTEMS BY
HOMOTOPY CONTINUATION METHOD AND ITS
PARALLELIZATION

By

Chih—Hsiung Tsai

HOM4PS-2.0 is a software package in FORTRAN 90 which implements the polyhedral
homotopy continuation method for solving polynomial systems. It updates its original
version HOM4PS in three key aspects: (1) New method for ﬁnding mixed cells, (2)
New idea of following homotopy paths, (3) New way of dealing with the curve jumping.

The parallel version of HOM4PS-2.0, named HOM4PS-2.0para, parallelizes the
three main stages in HOM4PS-2.0. Excellent scalability in the numerical results shows
that the parallelization of the homotopy method always provides a great amount of
extra computing resources to help solve polynomial systems of larger size which would

be very difficult to deal with otherwise.

To my parents

iii

ACKNOWLEDGMENTS

I would like to give my immense gratitude to my dissertation advisor, Professor
Tien—Yien Li, for his constant encouragement and support during my graduate study
at Michigan State University. His knowledge, insights and enthusiasm are invaluable.

I would also like to express my thank to Professor Chichia Chiu, Professor Jeanne
Wald, Professor Guowei Wei , and Professor Zhengfang Zhou for their time and
valuable advice.

I am grateful to my classmates Jules, Tianran, Alice and Ying for their help and
useful comments. Last, but not least, I want to thank Jamie for her support during

these years.

iv

TABLE OF CONTENTS

List of Tables
Introduction

Polyhedral Homotopy Method
2.1 Bernshtein’s Theorem ...........................
2.2 Polyhedral Homotopy ...........................

HOM4PS-2.0
3.1 Mixed cell computations .........................
3.2 Following homotopy paths ........................
3.2.1 Constructing the polyhedral-linear homotopy ..........
3.2.2 Classical linear homotopy ....................
3.3 On curve jumping .............................
3.4 Miscellaneous aspects of HOM4PS-2.0 .................
3.4.1 Evaluating polynomials and derivatives .............
3.4.2 Scaling of the coefficients .....................
3.4.3 The end game ...........................
3.5 Numerical results .............................
3.5.1 The performance of HOM4PS-2.0 in dealing with curve jump-
ing and diverging paths ......................
3.5.2 HOM4PS-2.0 vs. HOM4PS ...................
3.5.3 HOM4PS—2.0 vs. PHCpack and PHoM ............

HOM4PS-2.0para-Parallelizing HOM4PS—2.0
4.1 Parallelizing the computation of mixed cells ..............
4.2 Parallelizing the curve tracing of polyhedral homotopy ........
4.3 Parallelizing the curve tracing of the classical linear homotopy . . . .
4.4 Parallelizing the checking of possible curve
jumping ..................................
4.5 Numerical results .............................
4.5.1 The performance on large systems ................
4.5.2 Scalability .............................
4.5.3 The multiple precision ......................

Conclusion and future work

Bibliography

vi

0501

19
19
24
24
3O
31
35
35
38
41
43

46
49
51

55
56
58
67

71
77
77
80
82

83

84

3.2

3.3

3.4

3.5

3.6

3.7

3.8

3.9

4.1

4.2

4.3

4.4

4.5

4.6

4.7

LIST OF TABLES

T (7123, M=6) ............................
The polynomial systems .........................
The performance of HOM4PS-2.0 ....................

Comparison of the classical linear homotopy and the polyhedral—linear
homotopy in HOM4PS-2.0 ........................

Comparison of HOM4PS and HOM4PS—2.0 ..............
Comparison of HOM4PS-2.0 and PHCpack ..............
Comparison of HOM4PS-2.0 and PHoM ................

Maximum sizes of polynomial systems that can be solved by PHCpack,
PHoM, and HOM4PS—2.0 within 12 hours of cpu time ........

fT .....................................

Bixlib ...................................

Solving systems by the polyhedral-linear homotopy with 1 master and
7 workers .................................

Solving systems by the classical linear homotopy with 1 master and 7
workers ..................................

The scalability of solving systems by the polyhedral-linear homotopy

The scalability of solving systems by the classical linear homotopy . .

vi

37

45

46

48

64

Chapter 1

Introduction

Let P(x) = 0 be a system of n polynomial equations in n unknowns. Denoting
P(:c) = (p1(:c), . . . ,pn(:r)) with :1: = (3:1, . . . ,xn), we want to ﬁnd all isolated solutions

of

p1(:r:1,...,:cn) = 0,

in C".
The problem of solving systems of polynomial equations arises in many scientific
and engineering applications. Reliable and efficient numerical approaches to solve this

problem have long been desired. One such approach is the homotopy continuation

method proposed in 1977 by Garcia and Zangwill [11] and Drexler [7] independently.

1

Since then, this approach has undergone quite a bit of development. In the beginning,
the employment of linear homotopies was standard. Then in 1995, Huber and Sturm-
fels [15] suggested the use of polyhedral (nonlinear) homotopies based on Bernshtein’s
combinatorial root count [1] yielding drastic improvements over the classical linear
homotopy method for solving sparse polynomial systems. Ultimately this is because
Bernshtein’s root count is a much tighter bound on the number of solutions than the

Bézout number. The software package HOM4PS deveIOped by T. Y. Li and T. Gao

since 1999 implemented this approach.

Huber and Sturmfels [15] also recommended the use of the combined polyhedral-
linear homotopies, but these were impractical due to difﬁculties with efﬁciency and
stability. In 2007, T. Y. Li and C. H. Tsai, utilizing exponential-logarithmic transfor-
mations, were able to successfully implement combined polyhedral-linear homotopies,
and, released with T. L. Lee an update to HOM4PS, called HOM4PS-2.0. Its par-
allel version HOM4PS-2.0para which utilizes multiprocessors soon followed. These
latest two packages show considerable speed-up and increased ability to handle larger

systems.

This thesis presents the details of the strategies used to implement the homotopy
continuation method in the codes HOM4PS-2.0 and HOM4PS-2.0para. A sketch of
the theoretical underpinnings and historical development of the homotopy continua-
tion method will be presented. Techniques used to overcome computational difﬁculties

in implementing the polyhedral continuation method will be outlined.

2

In Chapter 2, we ﬁrst review the basics of the polyhedral homotopy continua-
tion method; this includes, the deﬁnition of the mixed volume of the support of a
polynomial system as well as Bernshtein’s theorem concerning root counts. Next, a
practical method of computing mixed volumes, using the mixed cells of the support,
which leads to the construction of the corresponding polyhedral homotopies, will be
discussed. The details of ﬁnding the mixed cells will be left to Chapter 3.

In Chapter 3, we will present the latest optimizing strategies used in HOM4PS-
2.0, which considerably upgrades its original version HOM4PS. There are three major

steps in HOM4PS-2.0:

0 Mixed cell computations

 

0 Following homotopy paths

 

0 Dealing with ‘Curve Jumping’.

 

Sections 3.1-3.3 take up the treatment of these issues. Other miscellaneous aspects
of HOM4PS-2.0 are listed in Section 3.4. These include evaluating polynomials and
derivatives, scaling coefﬁcients, and determining criteria for path convergence and
divergence, i.e., the end game. Numerical results of HOM4PS-2.0 are shown and
discussed in Section 3.5. These results clearly show that HOM4PS-2.0 is very efﬁcient
and powerful.

As the size of the polynomial system becomes larger, more computing resources
are required. And a natural way to allocate extra computing resources efficiently is

to perform independent tasks simultaneously in parallel. Since each isolated zero is

3

computed independently of all the others in the homotopy continuation method, it
provides a natural environment for the parallelization. In Chapter 4, we present a par-
allel implementation of HOM4PS-2.0, HOM4PS-2.0para, in which each of the three
main stages above are parallelized using multiple processors. A simple master-workers
parallel computation environment is used, in which one master CPU communicates
with all worker CPUS using the Message Passing Interface, MP1 [34], to dynami-
cally allocate tasks among the latter. The ﬁrst four sections of Chapter 4 describe
respectively, the parallel implementations of the mixed cells computation, the curve
tracings of the polyhedral-linear homotopies, the curve tracings of the optional clas-
sical linear homotopy, and the checking for possible curve jumping. In Section 4.5,
numerical results of HOM4PS-2.0para on the bench mark systems, eco—n [30], katsura-
n [3], noon-n [35], reimer-n [37], and cyclic—n [2], are summarized. The near perfect
speed-ups in the results clearly demonstrate that our dynamic allocation algorithms
effectively utilize the processing capacities of a parallel architecture to implement the
polyhedral-linear homotopy continuation method for solving previously unmanage-
able polynomial systems of larger size. Conclusions and directions for future work

are given in Chapter 5.

Chapter 2

Polyhedral Homotopy Method

For sparse polynomial systems, Bernshtein’s combinatorial root count gives a much
tighter bound on the number of isolated solutions than the Bézout number. It thus
provides a starting point for an alternative approach to the classical linear homotopy
continuation method which will reduce the number of homotopy paths to be traced.
In this chapter, we ﬁrst review Bernshtein’s theorem on the exact root count in (0’)"
for a generic system, where C" = C\{O}. Next, we discuss an important extension by
Li and Wang [26] for the root count in C”, followed by the description of the actual
procedure that uses Bernshtein’s root count to ﬁnd all isolated solutions, i.e., Huber

and Sturmfel’s polyhedral homotopy method.

5

2.1 Bernshtein’s Theorem

Let P(:r) = (p1(a:), . . . ,pn(:r)) be a system of n polynomials in n unknowns with
:1: = (1131,...,IL‘n), where
pj(a:) = Z c3103“, j = 1,. . . ,n. (2.1)
aESj
Here as“ = 3:61l1 ~43,” where a = (a1, . . . ,an) with coefﬁcient cia E C" = C\{0}.

The index set Sj, denotes the support of pj(:r), is the set of all monomial powers of
103(33)-

The convex hull K, = conv(.S'J-) in R” of Sj is the Newton polytope of pi (cc) and
S = (31, . . .,Sn) is called the support of the system P(:r). For non-negative real

variables A1, . . . , An, write the Minkowski sum

A1K1+~~+AnKn={A171+---+An~yn|'yj EKj,j=1,...,n}.

The n-dimensional volume Voln()\1K1 + + AnKn) is known to be a homo-
geneous polynomial of degree n in the variables A1, . . . , An. The coefﬁcient of the
monomial A1/\2 - - - An in this homogeneous polynomial is called the mixed volume of
the polytopes K1, . . . , Kn, denoted by M(K1, . . . ,Kn). Alternatively, it is called the
mixed volume of the support of the system P(:1:) = (p1(:r), . . . ,pn(x)), denoted by
M(Sl, . . . , Sn), or just the mixed volume of P(:r) for short.

Now, let the coefﬁcient cjﬂ of each monomial in the original system P(x) =

6

(131(1), . . . , pn(a:)) in (2.1) be considered as a variable in its own right, resulting in

the associated system P(c, z) = (p1(c,:z:), . . . ,pn(c, 23)) where

pj(c,a:) = Z cj,aa:a, j = 1,. . . ,n. (2.2)
aESj

The original system P(zr.) = P(c*, :r) is simply the expanded system P(c, :13) evaluated

at a set of speciﬁed values of coefficients c* = (c; a)'

Recall that the root count of a polynomial system is simply the total number of

isolated zeros, counting multiplicities.

Lemma 2.1 [14] For polynomial systems P(c, 11:) in (2.2), there exists a polynomial
system C(c) = (gl(c), . . . ,gl(c)) in the variables 0 = (Cj,a) for a E Sj andj = 1,. . .,n
such that for those coefﬁcients c* = (Cia) for which G(c*) sé 0, the root count in
(C*)" of the corresponding polynomial system in (2.2) is a ﬁxed number. And the
root count in (0‘)" of any other polynomial system in (2.2) is bounded above by this

number.

Remark 2.2 Since the zeros of the polynomial system C(c) in the above lemma
form an algebraic set, its complement is Open and dense with full measure. Thus, a

polynomial system P(c*, :13) in (2.2) with G(c*) # 0 is said to be in general position.

Theorem 2.3 [1] The root count in (C*)” of a polynomial system P(z) = (p1(:r), . . . ,

pn(z‘)) in general position equals the mixed volume of its support.

7

The above root count is, in practice, a much tighter bound than the Bézout bound
[31] and is also known as the ”BKK bound” after its inventors, Bernshtein [1], Kush-
nirenko [18], and Khovanskii [16]. An extension of the Bernshtein Theorem that

counts the roots in C” as opposed to (C*)” was provided by Li and Wang [26].

Theorem 2.4 The root count in C" of a polynomial system P(x) = (p1(:c), . . . ,pn(a‘))

with support S = (S 1, . . . , Sn) is bounded above by the mixed volume M(31 U{0}, . . .

,Sn U{0})-

In other words, simply augment each polynomial in the original system P(x) by
adding a constant term, if one is missing, namely adding the point 0 = (0,. ..,O)
to each support Sj. Then, the mixed volume of the augmented supports, which is
the root count in (C*)" of the augmented polynomial system in general position,
is the upper bound for the root count in C” of the original system P(x). Hence,
if the constant term were already present in each polynomial in the original system
P(x) = O, i.e., its support 83- = Sj U{0}, then the upper bound for the root count
in ((C*)" would be the same as that in C". Furthermore, for a generic system with

these supports (with all constant terms included),

# of solns in (C*)n = M(Sl, . . .,Sn) = # of solns in C".

So no isolated solution of a system in general position in which all constant terms are

present has a zero component.

2.2 Polyhedral Homotopy

The earliest form of the homotopy continuation method for solving polynomial system
P(z) = (p1(x),...,pn(:r)) = 0 consisted of ﬁrst ﬁnding an easily solved system
Q(:r) = (q1(:r), . . . ,qn(:r)) = 0, next deﬁning a linear homotopy H : C" x [0,1] —» C”
by

H(:c, t) = (1 — t)Q(:r:) + tP(x), (2.3)

and then numerically following the curves in the real variable t which make up the
solution set of H (x, t) = 0. Here the homotopy H (as, t) is linear in t. More generally,
homotopies may be constructed to be nonlinear in t.

In order for the continuation method to work, the homotopy H (x,t) must be

chosen so that the following three properties hold:

0 Property 1 (Triviality) The solution points of the starting system H(:1:,0) =

Q(:1:) = 0 are known.

0 Property 2 (Smoothness) The solution set of H (11:, t) = 0 for 0 S t < 1 consists

of a ﬁnite number of smooth paths, each parameterized by t in [0, 1).

0 Property 3 (Accessibility) Every isolated solution of the original target system
H(:I:, 1) : P(x) = 0 can be reached by some path originating at t = 0; i.e. at

some solution point of the starting system H (2:, 0) = Q(:z:) = 0.

If all three properties hold, then all solutions of the original problem P(x) = 0

can be obtained by tracing the solution paths of H (x,t) = 0, each starting from

9

an initial starting point which is a solution of the known system Q(:r) = 0. Note
that while divergence of a path to inﬁnity in the middle of the interval (as t —>
to < 1) is precluded by the smoothness property, it is still consistent with the above
properties that a solution path of H (x, t) = 0 may diverge to inﬁnity as the parameter
t approaches 1. Thus, some of the paths emanating from the roots of Q(ar), may be
extraneous, and diverge to inﬁnity as t ——> 1. Since in practice it is not known ahead
of time which path is extraneous, each path must be followed until it is clear that it
diverges or a solution point of P(z) = 0 is reached. Divergent paths'thus represent

wasted computation effort.

Now we outline Huber and Sturmfel’s polyhedral homotopy method to ﬁnd all iso-
lated solutions in C" of the given polynomial system P(x) 2 (191(3),. . . , pn(:c)) with
supports S = (51,...,Sn). First, all polynomials without constant term are aug-
mented with the constant monomial 1:0 = 1:9 . - «1:9, = 1, resulting in the correspond-
ing augmented supports S j, 2 S7 U {0}. A generic system Q(a:) = (q1(z‘), . . . , qn(:r))
with the augmented support S I = (51’ , . . . , S71) is constructed with randomly chosen
coefﬁcients. By Theorem 2.4, the root count of Q(;r) in C” is the mixed volume

I I
M(Sl , . . . , Sn) of the augmented supports.

The ﬁrst step in Huber and Sturmfel’s polyhedral homotopy method is to solve
the system Q(:1:) = 0 in general position by means of polyhedral homotopies which
will be discussed below in detail. It will be shown that the number of paths needed

to be followed in this step is M(Sl,, . . . ,S,’,) which is the root count of Q(a:) in C",

10

and no paths diverge.

The second step in the polyhedral homotopy method is to solve the original system
P(z) = 0 by means of the linear homotopy H(:z:, t) = (1 — t) 7Q(a:)+t P(z), t 6 [0,1]
with generically chosen complex 7, also known as the so—called cheater’s homotopy
[27] (or the coefﬁcient—parameter continuation [32]). In this step, the linear homotopy
paths of H (1:, t) = 0 are followed from their starting points (which are the solutions of
Q(.v) = 0 found in the ﬁrst step). It is known that all isolated solutions of P(z) = 0
can be attained, which was our original goal. Here, some paths may diverge. Since the
number of starting points, and hence the number of paths, is the root count of Q($),
the number of paths needed to be followed in this step is also M(Sll , . . . ,Sé). Hence

the total number of paths needed to be followed in both steps is 2M (SI, , . . . , 3,1).

We begin our discussion of how to solve the system Q(:1:) = 0 using polyhedral

homotopies. Writing

a: = E 1:“,
(M ) EGGS; 1,a
QUE) = (2 4)
:1: = 5 a“,
(Inf ) ZaESé 71,0
I
note that all coefﬁcients '6 = (Ejﬂ) for a E S j and j = 1,. . . ,n are chosen at random

to ensure that Q(:r) is in general position.

To construct a nonlinear homotopy with Q(:I:) as the target system, ﬁrst deﬁne

a generic lifting w = (0.21, . . . ,wn) on the support S, = (81],. . .,S4) of Q(:c) where

11

I I
each wj : S j —-> R, j = 1, . . . , n is a random real-valued function on S 3" Each wj lifts

I , .I . I
Sj to its graph Sj = {a = (a,wj(a)) | a E Sj}.

Deﬁne the polyhedral homotopy Q : C" x [0, 1] —+ C” by adding a power of

t given by the above random lifting to each term in Q(:r) to obtain Q(:I:,t) 2

(91(1', t), - . . , Qn(:c,t)) where

é1($,t) = EGGS; Elﬂxatwlm),
Chant): g (2-5)

(jn($,t) = Z I Cn,a$atwn(a).
aESn

In order to apply the continuation method, we must demonstrate that Q(:r,t)
satisﬁes all three of the desired properties given in the beginning of this subsection
(triviality, smoothness, and accessibility). Clearly the target system is Q(a:, 1) =
Q(:I:) = 0. It can be shown that for any ﬁxed to > 0, the system Q(a:, to) is in general
position having all nonzero constant terms present which implies that it has mixed
volume k :2 M(Sli, . . .,STZ) isolated solutions in (C*)”. Then the entire zero set
of the homotopy Q(:1:, t) consists of k distinct homotopy paths :r(1)(t), . . . , 113(k)(t) in
(C*)n ending at the distinct solutions of Q(z) = 0. And again, since each éj(x,t)
has nonzero constant term for all j : 1,...,n, by a standard application of the
generalized Sard’s Theorem [5], each of the k solution curves of Q(:r,t) is smooth

with no bifurcations. Therefore, Property 2 (smoothness) holds.

However, at t = 0, the start system Q(:I;,0) E 0 which means that the starting

12

points z(1)(0), . . . ,z(k)(0) of the corresponding homotopy curves :1:(1)(t), . . . ,a:(k)(t)
can not be identiﬁed. So the polyhedral homotopy Q(:r:,t) does not satisfy Property
1 (triviality) and Property 3 (accessibility). So while every isolated solution of the
target system may be reached by a homotopy curve of Q(1:,t) = 0, it is not clear
where it originates from. This difﬁculty is resolved by the following technique.

We ﬁrst ﬁnd all vectors (3: = (a, 1) E Rn‘l'l matched with a collection of pairs of

, I I I I I .
pornts ({a1,a1}, . . . , {an,an}) from the support S = (81,...,Sn) of Q(x), which

together satisfy the following conditions

a Each pair of points is chosen from each support,
. I I I I I I . .
i.e. {a1,a1} C 51,. . . , {aman} C Sn, so that {a1 —a1, . . . ,an —an} 18 linearly

independent in IR", and

0 Taking (-, ) to be the usual inner product in Euclidean space and a = (a, wj(a)) E
, I I '
R’Hl where a E Sj and wj : S —> IR is the corresponding lifting, for j =

1,...,n,

(a ,d) -— (a ,(3I)
J J (2.6)
(End) > (aj,d), for all a E S; \ {drag}.

Geometrically, d = (a, 1) represents a linear functional, or alternatively, the normal
of parallel hyperplanes in Rm”, which supports the convex hull of the graph S J, of
S J, , at exactly two points {aj, a3} C S J, for j = 1,. . . , n. Such a collection of pairs of
points ({a1,a’1},. . . , {am a;}) satisfying the above conditions is called a mixed cell

13

of S I = (51,, . . . , 37],), and its associated vector 0 is called its inner normal. Finding
all mixed cells and their associated inner normals a is one of the key steps of the
polyhedral homotopy method. We will treat the actual mixed cells computation in
Chapter 3 and Chapter 4.

Having found such a mixed cell ({a1,a,1}, . . . , (an, a;}) together with its asso-
ciated inner normal 0 where a = (0:1, . . . ,an), deﬁne a change of variables between

:r = (3:1, . . . ,l‘n) and y = (311,. . .,yn) by :1: 2 pt“ or alternatively, y = t—ax where

331 = tall/1, 29/1 = F0111,
or 3 (2-7)

IFn = tanyn, yn = t_an-Tn-

With this change of variables, any monomial will be transformed as follows:

a a
3:“ = $11....11nn

= ﬁnial)“1 -~(ynta")a”
gntala1+~+anan

a
: y11"'y

_ yat(a,a).

Thus each (jj(z,t),j = 1,. . . ,n of the homotopy Q(:r, t) in (2.2 5) becomes

14

(lj(yt0‘,t) = Z Ej,ayat(a’alt“’j(a)

a633-
: Z 5]. ayat<(a’wj(a))a(ar1)>
I
aESj
I
aESj

Forj=1,...,n, let

,Bj = min/(61,6!)
aESj

By the conditions deﬁning the mixed cell ({a1,a,1}, . . . , {an,a:,}), we have ﬁj =

(aj,d) = ((13,631) forj=1,...,n. Now deﬁne Ha(y,t) = (h?(y,t),...,h%(y,t)) by

“ factoring ” out the lowest powers of t in each of the (17(th t)-
q : _'Bj A' a Z -' 6,6!) -BJ
h] (y,t) t q,(yt .0 2,1653 Cmy t
(“1,62 --5
= Z: 63’ Cmya + Z 63’ Cmy t< ) J
(éadl—IBJ (d’a>>6]
I
= 53a 3] '7 +0. lyj + Z I CJﬂy “(LOO-ﬂ]
,a . aES-
.7 J
(d,&)>ﬁj

Since exactly two inner products equal the minimum ﬁj for each j, our new homotopy

H 0(y, t) has exactly two terms without t.

This homotopy retains most of the properties of the previous homotopy Q(x, t) =

15

0; in particular Property 2 (smoothness) still holds, because both H “(y, t) and Q(sc, t)
have the same support and are in general position for ﬁxed to > 0. The entire solution
set of Ha(y, t) = 0 consists of k distinct smooth homotopy paths 31(1) (t), . . . , y(k) (t) in
(C*)” which are transformed versions of the solutions :r(1)(t), . . . ,.’E(k) (t) of Q(z, t) =
0. However, by the transformation (2.7), we have x = y at t = 1. Hence the solutions
to the target system Ha(y, 1) = Q(y, 1) = Q(y) = 0 are exactly the same as the
solutions to the target system Q(Lr, 1) = Q(z) = 0. Moreover, we have an easily

solvable start system, known as the binomial system,

I

ha .0: E 0:5 “1+5 “1:0,
1(y. ) Sues/1 Lay l,aly 1,ally
Holt/,0) = , g (2-9)
I
hiﬁy, 0) = Z I Enut/a = Emanyan + E I ya" = 0.
0.63" ' Than

An algorithm to solve binomial systems is laid out in Section 4.3. Thus Property 1
(triviality) is now satisﬁed. Since the homotopy curves originating from the solutions
of (2.9) can be traced to reach a partial set of isolated zeros of Q(y) at t = 1, Property

3 (accessibility) partially holds.

Proposition 2.5 [22] The binomial system

I

61013/01 + 5 [gal = 07
’ 1,al
(2.10)
I
Cn’anyan + En a] ya” = 0.
, n

16

has

I I
kaz=|det(a1— a1, . . . ,an — anll

nonsingular isolated solutions in ((C*)"'.

The number ha is called the volume of the mixed cell ({a1,a,1}, . . . , (an, a;}). The
existence of such mixed cell ({a1,a,1}, . . . , {an,a:,}) along with its corresponding
inner normal 0: E IR” for which condition (2.6) holds and hence for which H a (y, 0) = 0

is a binomial system is guaranteed by the following proposition.

I
Proposition 2.6 For all generically chosen real functions wj : S j —-* IR, 3' = 1,. . . ,n
there exists a 6 IR”, for which the start system Ha(y,0) = 0 of the homotopy
H a(y,t) = 0 in (2.8) is a binomial system having a nonempty set of nonsingular

solutions in (C*)"’, i.e., ka 75 0 in (2.10).

This proposition was proved implicitly in [15] and [22] in the context of combinatorial

geometry. In [15], Huber and Sturmfels proved that

I I
M(s,,...,s,,)=§:ka,
a

i.e., the mixed volume of Q(y), the total number of isolated zeros‘of Q(y), is the
I I

sum of the volumes has of the mixed cells corresponding to all possible a s for which

H “(y, 0) = O is a binomial system. This gives an alternative approach to Bernshtein’s

Theorem, different from the original paper [1].

17

Clearly, different inner normals a 6 IR" induce different homotopies H “(y, t) = 0
in (2.8). Phrthermore, by examining the Puiseux series expansions of the homotopy
solution curves, we conclude that any two sets of isolated zeros of Q(y) are disjoint,
if they are reached by corresponding sets of solution paths belonging to two distinct
homotopies.

Therefore, every isolated zero of Q(y) can be obtained by tracing a solution curve

of a homotopy H “(y, t) = 0 induced by some oz 6 IR” given by Proposition 2.6.

18

Chapter 3

HOM4PS-2.0

Based on the polyhedral homotopies given in Section 2.2, the software package HOM4PS,
developed over the years by a group led by T. Y. Li at Michigan State Univer-
sity, implemented this approach for approximating all the isolated zeros of P(x) =
(p1(.1:), . . . ,pn(:1:)). HOM4PS-2.0 is a new version in FORTRAN 90 which updates
HOM4PS with remarkable speed-ups. In this chapter, we will explain the details of
the three main stages in HOM4PS-2.0 that account for the major improvement over
HOM4PS: mixed cell computation, following homotopy paths, and checking for curve

jumping.

3.1 Mixed cell computations

When the polyhedral homotopy is employed to ﬁnd all isolated zeros of P(IE) =
(p1(:r). . . . , pn(:r)), the process of locating all the mixed cells during the mixed volume

computation plays a crucially important role [22-24]: The mixed volume determines

19

the number of solution paths which need to be traced and the mixed cells provide

starting points of the solution paths.

For polynomial system P(r) : (p1(a:), . . . ,pn(a:)) with support S = (51,. . .,Sn),
let wj : Sj ——> R be a random lifting function on Sj which lifts Sj to its graph
Sj = {a = (a,wj(a)) : a E 53'} C Rm” forj = 1,...,n. Here we assume that
Sj = Sj U{0}, i.e., all polynomials pj (:17) in the system have constant terms. See

Section 2.2.

Recall that a collection of pairs {aha/1} C 31,.. ., {amag} C Sn is called a

mixed cell if there exists d 2 (oz, 1) E Rn'H with OI E R" such that

(a,,a) = (fig-,0?) < (as) for a e sj\ {gag}, j=1,...,,n

I A
and a is called the inner normal of this mixed cell. For 1 S i S n, e = (“,5 } C Si:

is called a lower edge of S,- if there exists a = (a, 1) E Rn+l for which the following

relations hold:

(as) = <a’,a) g (as) v be S,\{a,a’}.

Denote the set of all lower edges of S,- by L(S,-). For k distinct integers {i1, . . . ,ik} C

{1,...,n},

- I .
Ek=(ei1,...,eik),1gkgn, where eij ={az-j,aij} CSij for J =1,...,k,

20

is called a level-k subface of S 2 (S1, . . . , Sn) (or simply “level-k subface”) if there
exists (3: = (a, 1) E Rn+1 such that for each j = 1,. . .,k

A A A, A A A I
a-.,a = a-,,a < a,a V aES-. a-.,a-. .
<._, > a, >_< > .,\[., .3]

A A

For a level-k subface E‘k == (éi1""’éik) of S = (S1,...,Sn) with 1 S k < n
I A

where éij = {a,-J.,a,.,} E L(S,-j) for j = 1,...,k, we say that the lower edge
.7

} E L(S,- for certain ik+1 E {1,2,...,n}\{i1,--- ,ik}

A A AI
ezk+1 _ {a1k+1’aik+1 k+1)

extends the level-k subface Ek if Ek+1 == (éil’ . . is a level-(k + 1) subface

"eik+1)

of S = (81,. . .,Sn). We say Ek is extendible in such situations.

A main strategy for ﬁnding mixed cells is the extension of level-k subfaces El: of
S 2 (S1, . . . , Sn) starting from k = 1 and an extendible Eh when k = n—l yields
mixed cells of S 2 (S1, . . . , Sn) induced by elements in Sin- In [8—10, 25], the order
of this extension i1, i2, . . . is predetermined and ﬁxed, that is, one always starts from
extending lower edge él of S1 to a level-2 subface (é1,é2) with ég E L(S2), then
extend (é1,é2) to a level-3 subface (é1,é2,é3) with (33 E L(S3) etc. Software
package MixedVol [10] for the mixed cells computation that was adopted in the

original HOM4PS was developed along this line of approach.

In [29], the novel idea of dynamic enumeration of all mixed cells emerged where

dynamic means that the order of the subface extension will not stay ﬁxed. To extend

21

a particular level—k subface
. . . . . A] A .
Ek=(e,-1,...,eik), 1$k<n, whereeij={az-j,a,-J_}EL(S,-j) for j=1,...,k,

one searches among M := {S1 : l E {1,...,n}\{i1,...,ik}} for Sik+l that has

minimal number of suitable points where only lower edges consisting of points among
them can possibly extend Ek to a level-(k+ 1) subface. The main strategy suggested

in [29] for ﬁnding such S,- is the removal of those points in each S; E M which

k-l-l

have no chances to be part of a lower edge in MS) that can extend Ek, and select

A

the one with minimal remaining points as Sik+1°

A I A
For level-k subface Bk 2 (fez-1,. . . ,éz-k) where éij = {away} E L(S,-j) for j =
1,...,k, let Q :2 {i1,...,ik} C {1,...,n}. Foraﬁxed l E {1,...,n}\Q, let 6.,

be a particular point in S1, and consider the set of constraints:

(awe) = <a§,,a) v IIEQ

|/\

(a, a) v a e s,,\ {away} (3.1)

(Ema) g (is) v ies,\,[13v}

in terms of unknowns d 2 (01,1) E Rn+1. Apparently, when the set of constraints
in (3.1) is infeasible, then there is no f)” in S] such that {hmhu} can extend E]:
to become a level—(k + 1) subface, and therefore hv can be safely removed from S].

For the feasibility of the set of constraints in (3 1), consider the linear programming

22

(LP) problem

(P) Max (r, a)

Subject to (3.1)

where r E R" is any ﬁxed vector. By the duality theorem, the feasibility of the
inequalities in (3.1) can be determined by the boundedness of the duality of the LP
problem in (P), which can be checked by standard techniques in the textbooks.
The superiority of the resulting software DEMiCs-0.95 in computing mixed
cells with the dynamic enumeration was reported in [29]. Soon after, this idea was
embedded in the original MixedVol and a new code MixedVol—2.0 was developed
which improves the speed of DEMiCs-0.95 by a substantial margin as shown in
[20]. As mentioned before, employing the polyhedral homotopy method for solving
polynomial systems, locating all mixed cells always plays a critically important role
[22—24]. The new adoption of MixedVol-2.0 for the mixed cell computation in
HOM4PS-2.0 is certainly one of the main factors accountable for its considerable

speed-up.

23

3.2 Following homotopy paths

3.2.1 Constructing the polyhedral-linear homotopy

For polynomial system P(z) = (p1(:c), . . . , pn(:r:)) with

pj($): Z Cj,a$at j=1""3n’
a E Sj
let Q(SE) = (q1(:r), . . . ,qn(:z:)) = 0 be a polynomial system having the same mono-

mials as P(sr) but with randomly chosen coefficients, i.e., qj(a:) = 2:065]. Emil?“
where éja are randomly chosen complex numbers. In HOM4PS, this system, as
elaborated in Chapter 2. is ﬁrst solved by using a polyhedral homotopy Q(cr,t) =
(q“1(:r,t), . . . ,q‘n(r, t)), t E [0,1] witharandom lifting given by w = (w1,. . . ,wn), wj :
Sj -> R; i.e., éj(x,t) = 2063]. Ejﬂzatwﬂa) forj = 1,. . .,n. Then a linear homo-
topy H (:r,t) = (1 — t) 7Q(:r) + tP(:c), t E [0, 1] with generically chosen complex 7,
is constructed to reach all isolated solutions of P(z) = 0 by following the smooth so-
lution paths of H (2:, t) = 0 emanating from the solutions to Q(zr) = 0 found above.
The number of paths needed to be followed in each step is the mixed volume of the

polynomial system.

In HOM4PS-2.0, these two steps were combined in one step by considering the

polyhedral-linear homotopy

H(.:c,t) = (h1(a:,t),...,hn(:c,t)), x = (2:1,...,:cn), tE [0, 1]

24

where

fly-(22¢): Z ((1—t)Ej,a+th-,a)xatwj(a), j=1,...,n.
aESj

Note that H(a:, 1) = P(x). For a given mixed cell C = ({a11,a12}, . . . , {an1,an2})
with inner normal a E IR", where {aj1,aj2} C Sj for each j = 1,...,n, af-
ter applying the change of variables a: = yto‘ where y = (y1,...,yn) and 933- =
yjtaj for j : 1,...,n, and keeping the variable a: in place of y, we reach the

homotopy f1(:c,t) = (h1(:r,t), . . . ,hn(z,t)), t E [0,1], where for j = 1,...,n

hj(a:,t) = Z [(1 — t75j,a + tcj,a]a:at(&’d), with a = (a,wj(a)) for a E Sj.
aES-
.7

Letting

ﬁj = min (End) for j=1,...,n

aESj

and “factoring out. the lowest power of t” yields

A A

H(a:,t)=(h1(:r,t),...,hn(:r,t)), where tE[O,1] and

in“): Z [(1—t)5j,a+tcj,a]:cat(<&’d>_ﬁ7), for j=1,...,n. (3.1)
aESj

Note that we still have H (IE, 1) = P(rr), because a: = y at t = 1 In following the
solution paths of 1:1($,t) = 0 by the prediction-correction method, the ﬁrst step of
the predictor at t = 0 cannot be taken if a power of t in H (11:, t) is less than one,

since Ht(a:,t) would then be undeﬁned at t = 0. If the minimum power of t in

25

= tom

(3.1) is, say, tO'Ol, then changing variables with T would solve the immedi-

ate problem. But it would reduce numerical stability and computational efﬁciency if

tI’OOO, were also contained in 170:, t). Then the tangent

large powers of t, such as
vector :I'; 2 ﬁlm—1 * Ht would contain the terms in the order of 100,000 * t99’999
which, if evaluated at any t E [0,1), would give 0. Close to 1, however, the tan-
gent vector would become extremely steep, and step sizes for following the homotopy
path would have to be correspondingly minuscule. Actually, while these sorts of
problems already exist when “the polyhedral step” and “the linear step” are split
as implemented in HOM4PS, they become multiply ampliﬁed when the combined
polyhedral-linear homotopy is used. Ironically, notwithstanding the number of paths
needed to be followed was cut in half by combining the polyhedral and linear steps,
the difference between the computing times of the two approaches is almost negligible
most of the time.

In HOM4PS-2.0, we address this problem by applying the transformation 3 2
Int in (3.1), resulting in the homotopy H(:c,s) = (h1(a:,s),...,hn(x,s)), s E

(—oo,0], where
hj(.v,s) = Z [(1— e3)5j,a + escjaa]xaes*(<&’é)_ﬁ7) for j = 1,. . . ,n.

aESj

Recall that mixed cell C = ({a11,a12}, {a21,a22}, . . . , {an1,an2}) with inner nor-

26

mal a E R" satisﬁes the relations

(Ema) = <aj2,d>,

<éj1,d> < (&,d) V&ESj\{&j1,&j2}, j=1,...,n.

So, in each hj(:r, 3), there are exactly two powers of 6 equal to 0, namely, for each

j = 1,. . .,n,
-=mm End = a-,(3I = d-,d.
:8] aESj< > < 31 > < ]2 .
Therefore at s 2 —oo, H (1:, —oo) becomes a binomial system,
Ellxall + 512$a12 = 0,
Enlzanl + 57,2270"? = 0,
having [det(a11 — a12,...,an1—— an2)| number of nonsingular isolated solutions

which provide the starting points for following the solution paths of H(x,s) =
0 from s 2 —00 to 0. We will detail the standard procedure for solving binomial
systems in Section 4.3. As we can see here, the number of paths associated to the
cell is the volume of the cell. If we repeat the same process with all the mixed cells,

then the total number of the homotopy paths needed to be followed equals

Vol (C) = mixed volume of the system .
C :mixed cell

27

Since H(x,s) =H(:L',e3), thus for H(a:(s),s) = 0 we have

dH(:r: (s), s): H(x, es)- -— 1733—: + Htes = 0. (3.2)

d
It follows that 32—: I 32—00 = 0 and the values of 23(3) stay close to invariant for large
5
negative 3. Thus, to keep H(x, 3) z 0, we choose 80 so that terms e 80*(<a’a>_ﬁj) for
a E Sj\{aj1,aj2} are negligible for all j = 1, . . . , n, say on the order of 10‘s, as our
starting 8 value for following the solution paths of H (:6, s) = 0. Since 5 E (—00, 0]

the dominant or largest term with base 6 and exponent s * ((6nd) — ﬁj) in the

polynomial

:__(S)CZ[1_6 _j,sj,a+ec axet]a'9*((\dd)—ﬁj)
aESj

for all j: 1,...,n is given by

p:=exp 3* min ((a, a)— B) , forj=1,...,n.
aESj\{aj1 “32} J

Setting II = 10—8 and solving for 3, yields

—81n 10
n1ina€5j\{aﬂ,aj2}(<a,a) -— ﬂj)
jE{1,2,...,n}

 

Tracing solution paths :r(s) of H (3,3) = 0 from 30 will reach isolated zeros of P(r)
when 3 reaches 0. As this is quite different from following paths in [0,1], one must

28

move more aggressively, especially when the magnitude of so is very large. From

(3-2),

d A . .
332—1151 Htes, with H(z,t) as in (3.1) and t=es.

d
80 d_:r is small for large negative so, which justiﬁes the adoption of a large step
3

size at so, say do = ———38—9, making 31 = so + (So. It is then followed by a standard
prediction—correction algorithm at 31 for tracing homotopy paths. In general, at sk,
we choose initial step size 6k = min{6k_1, —Tsk}, namely, the step size remains the
same as that of the previous stage, but not excessively large - not over a third of
the remaining distance. When two consecutive points on the path are available along
with their tangents to the path, we use the cubic Hermite interpolation rather than
the Euler method as our predictor, followed by the Newton corrector. As usual, step
sizes are adjusted by the chosen tolerance parameters. For instance, normally it takes
no more than three iterations for the Newton corrector to converge within the desired
accuracy. Therefore if the number of Newton’s iterations for the correction is > 3,
the step size will be cut in half to minimize the possibility that the beginning predictor
estimate was too far away from the curve. On the other hand, if in two consecutive
stages the step sizes were not cut, we assume the curve is ﬂat at this moment and

will take the initial step size to be the minimum of doubling the previous step size

and a third of the remaining distance to 0.

As mentioned before, combining linear and nonlinear homotopies to reduce the

number of solution paths needed to be followed in the polyhedral homotopy method

29

by half was originally suggested back in 1995 [15]. However, this idea was not suc-
cessfully implemented in HOM4PS because of the involved stability and efﬁciency
problems. Addressing those difﬁculties by the transformation 3 = lnt and para-
meterizing the solution paths by s E (—oo,0] in HOM4PS-2.0 as shown above, a
substantial improvement in algorithmic efﬁciency and stability has been achieved as
evidenced by the results of intensive numerical experiments. This combination strat-
egy is particularly important when the polyhedral homotopies are used to solve large

problems where mixed volumes of the systems are more than millions.

3.2.2 Classical linear homotopy

For some polynomial systems, such as katsura-n [3], noon-n [35] and reimer—n [37],
the mixed volume of each system is the same or nearly the same as its total degree (or
the Bézout number). In such situations, instead of employing polyhedral homotopies,
they should be solved directly by following the total degree number of solution paths
of the classical linear homotopy H(z‘,t) = (1 — t)7 Q(cc) + tP(:z:) = 0 with generic

'7 E C where Q(x) = (q1(:z:), . . . ,qn(:r:)) with cc 2 ($131,” .,xn),

‘11 = 0113111 _ b1, (11 = degree of 191(23),
(1n = 071ng _. bn, dn = degree of Pn(33)

30

and randomly chosen (a1,...,an,b1,....bn) E C2" . In this way, by avoiding
polyhedral homotopies, very costly mixed cell computations for large systems are
averted. Moreover, the s = lnt transformation for the powers of t used in the
polyhedral-linear combination is unnecessary for a homotopy straightly linear in the
parameter t. For large systems, this can also reduce a considerable amount of cpu
time. We also implemented the algorithm of the classical linear homotopy for solving

polynomial systems and made it an option in HOM4PS-2.0.

3.3 On curve jumping

After following all the homotopy paths. we need to check if there are paths accidentally
jumping to other paths during the process of paths tracing. This may result in missing
zeros. Following two very close homotopy paths may increase the chance of curve
jumping that can occur at each stage of the prediction and correction. When Euler’s
method, or the cubic Hermite interpolation, is applied to predict a beginning estimate
for the Newton correction of one homotopy path, the resulting point may become too
close to the other homotopy path. Thus the correction sequence will converge to a
point that is not on the desired path. Subsequently, continued with the prediction-
correction procedure, we are in effect following the second path which will also be
traversed independently beginning with its own starting point. From the numerical
results, it looks as though two different curves each with different starting points

both reach the same solution. On the other hand, reaching the same solution may

31

not indicate the occurrence of curve jumping because of possible singular solutions.
For example, when solving cyclic-8 [2] and cyclic—9 [2] systems by the homotopy
method, both have more than one curve leading to the same singular solution that

lies on a positive dimensional solution component.

Before dealing with the curve jumping, we wish, in the ﬁrst place, to minimize
the chance for curve jumping to happen during the curve tracing procedure. In
HOM4PS-2.0, a more sophisticated selection for the tolerance parameters is designed
for dynamically determining step size during the curve following. For instance, for

Newton correction at s = sk, we consider any two consecutive iteration points x(m)

N am“) — wt") II
II 2:0”) n

 

and :c(m+1) too far apart from each other if the relative error
10%. In this situation, we will repeat the prediction-correction process with the step
size being cut in half. In addition, as mentioned before, if more than 3 steps of
Newton’s iterations were required to converge within the desired accuracy, we yet
again cut the step size in half to minimize the possibility that the beginning predictor
estimate was too far away from the curve. As shown in Table 3.4 in Section 3.5.1,
these collective treatments greatly decrease the occurrences of curve jumping. In fact,

they never appear in most of the systems we solved.

To check if curve jumping occurs, we must verify all the attained solutions to see
if there are two solutions that are very close to each other. We may, for instance,
consider two solutions to be numerically identical if the relative error of these two

solutions is less than a chosen parameter co > 0. In HOM4PS, this task was done in a

32

quite straightforward manner, essentially each pair of solution points were compared.
This will naturally become very costly when the number of solutions is big, say
100,000. Then there are 100, 000 * 99, 999/2 solution pairs, and the relative error of

each pairs must all be computed.

In HOM4PS-2.0, we divide all the solutions into different groups and only check
solution pairs within the same group for closeness. For each isolated solution point
z = (21, . . . , zn) E C" we will focus on the imaginary part of its ﬁrst component
21 = A1 + Bli where A1, Bl E R. The decimal representation of 81 always has
a positive or negative sign associated with it and the solutions will be divided into
groups that are characterized by this sign as well as the chosen and ﬁxed k-th digit
and (k +1)-th digit after the decimal point of the decimal representation of Bl. Each
digit place has ten possibilities {0, 1, 2, . . . ,9}. So in total, the solution set is divided
into 200 groups, and each group of solution points is characterized by having the same
sign, the same k-th digit bk and (k +1)-th digit bk+1 of the decimal representation
of the imaginary part of its ﬁrst component. Obviously, to locate solution pairs that

are numerically identical, one only needs to compare solution pairs within the same

group.

Along the same line, the number of groups can certainly be increased if one wishes
to deal with even bigger solution set. Our numerical results show that this technique
for the curve jumping detection chopped off a big amount of computing time of

HOM4PS especially when solving big systems.

33

Now, after two (or more) numerically identical solutions are detected, call it 5:,
we ﬁrst check the smallest singular value a of the Jacobian matrix Px(:c) evaluated
at 51?. When a is bigger than a chosen threshold 61 > 0, then the solution 513 will be
considered isolated and nonsingular. The curve jumping clearly occurs. In such cases,
we retrace the two associated curves with smaller step sizes. When a < 61, and the
solution :1": is isolated, we will infer that the two curves reach the same solution 513
and curve jumping does not occur. However, we can not rule out the possibility of the
curve jumping if the solution it is a nonsingular point lying on a higher dimensional

solution component Z. A point z E Z is called nonsingular if

 

Weir-”1912) _ _ .
TankC6($1,...,$n)(Z)—n dth. (3.1)

To differentiate those cases, the algorithm developed in [19] is used to determine the
dimension of the solution component to which the solution 5: belongs ﬁrst, followed
by checking the rank condition of :2 in (3.1). For the rank revealing of the Jacobian,
we use the scheme developed in [28] rather than calculating the whole SVD (Singular
Value Decomposition) of the matrix. When 5: is singular, it commonly attracts more
than one different solution curves as in solving cyclic-4 and cyclic-8 systems [19].

Curve jumping is only allowed to exist if :E is nonsingular.

34

3.4 Miscellaneous aspects of HOM4PS-2.0

3.4.1 Evaluating polynomials and derivatives

The prediction-correction process for following the homotopy paths of H (:13, s) =

(h1(:1:,s),...,hn(x,s))= 0 where for j = 1,. . . ,n,

hj(;1:,s) = Z [(1 — es)5j,a + escjaakcaesqlé’m—Hj) and :13“ = will 33%",
aESj
requires the computation of H (z,s),Hs(:r:,s), and the matrix Hx(:c,s) for ﬁxed
3. What is essentially involved in evaluating H (13,3), H3(:c,s) and Hx(:1:,s) at a
given point a: = (2:1, . . . ,.vn) are the evaluations of multivariate polynomials and
their partial derivatives. In HOM4PS, a multivariate polynomial g(:1:1, . . . ,zn) was
evaluated via Horner’s rule for univariate polynomials. By factoring out a variable,
say 1:1, g(.v1, . . . ,a‘n) becomes a polynomial in 3:1 with coefﬁcients in C[a:2, . . . ,xn].
By the same principle, those coefﬁcients, as polynomials in one less variable, were
evaluated by factoring out another variable. This may continue until the variables

are exhausted.

If multivariate polynomials are evaluated in this manner, the repeated compu-

tation of the same powers of some variables seems inevitable. For instance, for the

35

System P($1,1‘2,$3) = (P1031,1'2,$3),p2($1,$2,$3),P3($1,$2,$3)) where

p1 2*w?+3*2:g+5*2;g—1

p2 = 3*$?*z3+2*2:(15*2:g+4*23‘21*xg—5 (3-1)

p3 = 5*x?*x%*x§—7,

23?, 2:3, and 2:3 appear in all pl, 122, and p3. When the above rule is applied, those

quantities must repeatedly be computed.

Forj=1,...,n, let

MaxDegU) = max{aj|(a1, . . . ,an) E S1 U - - - U Sn}
— maximum power of the variable :rj appearing in the entire

polynomial system

and

M = .max MaxDeg(j)

j=1,...,n

= the largest power of all the variables in the entire polynomial system.

In HOM4PS-2.0, a table T of size n X M is established to store all possible powers

of 233', j .—_—. 1,...,n which may appear in H(z‘,s), H3(2:,s), and Hx(2:,s).

The ﬁrst row of T stores the monomials 2:1, 2%, . . . , witfaxDeg(1) , the second row
stores the monomials 2:2, 23%, . . . ,233/1 ”Deg (2), and similarly, the last row stores the

36

 

 

 

 

 

 

 

2:1 23:12 xf/IaxDegU)
$2 23523 JIQ/IamDegQ)
a, $3, $.13“deng
Table 3.1: T
monomials rm, 2%,. ..,:r,1[1axDeg(n). Since Ha;(:r,s) contains the partial derivatives

of x“ = 236111 - - 2%" that appear in H (27,3), and since HS(23,s) contains the same

$0.

= 23;ll m3," as H (23,5), table T stores all possible powers of 23j appearing
in all of H(23,s), Hs(2:,s), and Hx(:r,s). Those powers that are involved in any

monomial evaluations, no matter how often they appear, will never be repeatedly

computed.

For the system in (3.1), we have MaxDeg(l) = 6, Ma23De9(2) = 4, MaxDeg(3) =

5 and hence M = 6. The 3 x 6 table T is given as follows:

 

 

$2 $2 $2 $2
$3 $3 $3 $3 $3

 

 

 

 

 

 

 

 

 

Table 3.2: T (n = 3, M = 6)

The value of a monomial evaluated at a point 23 = ($1, . . .,23n) can easily be
obtained from table T. For example, for n = 3, the quantity of £23203 is T(1, 2) *

T(2, 3) * T(3,1).

37

3.4.2 Scaling of the coefﬁcients

Certain polynomial systems, such as cohn 2 and cohn 3 [6], have large coefﬁcients.
Large magnitudes in coefﬁcients will result in large magnitudes in tangent vectors of
the homotopy paths. This will affect the efﬁciency of the curve tracing because smaller
step sizes need to be taken to follow the homotopy paths. The idea of scaling the
system to reduce the magnitudes of the coefﬁcients of the polynomials ﬁrst appeared

in [30]. We shall illustrate our scaling method by way of an example.

Example Consider the following system of two equations in two unknowns

8000x§x§—2000x1+1 = 0 (3.2)

500021232—30 = 0. (3.3)

To scale the variables, let 231 = 100121 and $2 = 1002 22, and to scale the equations,

multiply (3.2) by 1063 and multiply (3.3) by 1064. This gives

1003(8000 * 102CI+202 2.sz - 2000 >i= 1001.21 +1) = 0

1064(5000 * 1001+02 2122 — 30) = 0.

10Elz¥z§—1052z1+1053 = 0

101342122 — 1055 = 0

38

where

E1 2 2C} + 2c2 + C3 + loglo(8000)
E2 2 c1 + C3 + loglo(2000)

E3 = C3

E4 = c1 + c2 + C4 + loglo(5000)

E5 = C4+log10(30).

To have the numerical stability afforded by coefﬁcients centered about unity, we want
each E, to be close to 0. Furthermore, to reduce variability among the magnitude
of the coefﬁcients in each equation, we want the difference between each pair of Egs

in an equation to be close to 0. Thus, setting

r13 E¥+E§+E§+E§+E§

1‘2 E [(131 - E2)2 + (E2 - E3)2 + (E1 - Ea)2l + [(134 - E5)2],

we wish to minimize r = r1 + r2. More explicitly,

r = (2c1 + 2C2 + 63 + log(8000))2 + (c1 + C3 +10g(2000))2 + c3 (3.4)
+(c1 + c2 + C4 + log(5000))2 + (C4 + log(30))2
+(c1 + 2C2 + log(8000) — log(2000))2 + (2c1 + 202 + log(8000))2

+(c1 + log(2000))2 + (Cl + c2 + log(5000) — log(30))2.

39

While in [30] r is considered as a second degree polynomial in four unknowns

C1, 62, C3, C4 and is minimized by the solution of

 

 

 

 

 

 

 

 

 

 

 

2i=0 fori=1,2,3,4,
862'
we rewrite r in (3.4) as
2
(2 2 1 0 ( —log(8000) \
1 0 1 0 —log(2000)
Ii
0 0 1 0 0
(C1\
1 1 0 1 —log(5000)
C2
T Z 0 0 0 1 ‘ -—log(30)
C3
1 2 0 0 ‘ log(2000)—log(8000)
W
2 2 0 0 T —log(8000)
1 0 0 0 —log(2000)
(1 1 0 0 ) ( log(30)—log(5000) )
\ I M v J
A b 2

= "Ax—bug.

and consider its minimization as a linear least squares problem. With the solution

of this least squares problem C1 = —3.3437, c2 = 1.3495, C3 = 0.0427, and C4 =

40

—1.5909, the original equations then become

0.9064zfz3—z1+1.1033 = 0

1.29962122—0.7695 = 0.

Clearly, the new system has coefﬁcients with magnitudes smaller than those of the
original one. They are closer to unity and to each other. When solutions z = (21, .22)
of the new system are located, the solutions 23 = (231, 222) can be attained by applying

the transformation 231 = 1001231 and 232 = 106223.

3.4.3 The end game

Some of the homotopy paths 23(3) of H (2,3) = 0 for s E (—oo,0] may diverge,
or go to 00, as s approaches 0. (In fact, solving systems like reimer-n [37] by the
homotopy continuation method, the majority of homotopy paths diverge.) Tracing
divergent paths usually consumes a multiple computing time. Typically near the end
of the solution path, very small step sizes must be taken to determine the convergence

of the path. For a more efﬁcient method, write

“31' 0° 1'
233(3) 2 aJ-(l — e8)m (1+ Eat-3'0 — es)m), j = 1,. . . ,n (3.5)
i=1
where m, called the cyclic number, is a positive integer and w = (1.21, . . . ,wn) E Z".

It was shown in Morgan et al. [33] that such expansions exist for each path in the

41

neighborhood of 3 = 0. Taking the logarithm of the absolute value of both sides of

equation (3.5) yields

00

.. _ . fl _ s .. _ 82'
log |;L,(s)| —log|aJ|+ m log(1 e )+Zb,,(1 e ) . (3.6)

i=1

. i
Here 2931 b,j(1 — e3)-7 is the Taylor expansion of log(1+ 2:31 aij(1 — es)'n_1).
During the process of homotOpy continuation, a sequence of points 23(3):), —00 <
so < 31 < S 0 were generated. Choose two consecutive 3k and 310 +1 close to

0, say 1 — esk < 10-6. By equation (3.6),

logic-3 —108$‘3 w-
l ]( 5)] l .7( ks'l'l“ = J+O(1—esk).
logll—ekl—logll‘ekHl m

 

w-

Therefore —2— may be estimated by the left hand side of the above equation when
m

3k is close to 0.

In HOM4PS, so is in PHCpack [38], the convergence (or divergence) of a path

23(3) 2 (1‘1(S),...,In(8)) when 3 —I 0 is determined by

w'
1. If wj < 0 for certain j E {1,...,n}, then ~51]— < 0, and the path component

233(3) diverges. Hence the path 113(8) diverges when 3 —1 0.

w-
2. If wj Z 0 Vj=1,...,n, then a] 2 0 Vj and all path components 2.3-(3)

converge. Hence the path 2(3) converges when 3 —’ 0.

w-
Newly inserted in our HOM4PS-2.0 is the observation that when 0 < a] < l for

42

d$j(8)

S

 

 

certain j then lim 2 00 which implies the divergence of the component

s-—}

233(3), hence the divergence of the path 23(3) when 3 —> 0. We thus split the second

item above as follows:

02-
2.1 If wj = 0 Vj = 1,...,n, , then —-1 = 0 Vj and all path components 233(3)
m

converge. Hence the path 23(3) converges when 3 —> 0.

w-
2.2 If 0 < a] < 1 for certain j E {1, . . . ,n}, then the path component 233- diverges

and hence the path 23(3) diverges when 3 ——> 0.

2.3 If

3 l3?

_>_ 1 V j = 1,...,n, then all path components 233(3) converge and

hence the path 23(3) converges when 3 —> 0.

Our numerical results show that this splitting provides a much accurate judgement

in many situations.

3.5 Numerical results

To demonstrate the performance of HOM4PS~2.0 and to compare it with the existing
packages HOM4PS, PHCpack [38] and PHoM [12] which implemented the polyhe-
dral homotopy method, we will focus on those size-expandable bench mark systems
listed in Table 3.3. All the computations in this section were carried out on a Dell
PC with a Pentium 4 CPU of 2.2GHz, 1GB of memory. Results presented are mainly

restricted to those systems that can be solved within 12 hours of cpu time. The

43

package HOM4PS-2.0 is written in FORTRAN 90. The binary codes as well as its

Matlab interface are available at: http://www.math.msu. edu/~li/Software.htm

44

 

eco—n [30] Total degree = 2371-2

 

(231+ 23le + - - - + 23n_223n_1)23n — 1 = 0
($2 + 23123 + - - - + 23n_323n_1)23n — 2 = 0

23n_123n —- (n — 1) = 0

$1+$2+-“+$n_1+1=0

 

 

noon-n [35] Total degree = 3"

 

2:1(z%+232+---+2:,2,—1.1)+1=0
22(a:§+z§+-~+x%—1.1)+1=0

xn(a:%+:c§+~-+:c,2,_1—1.1)+1=0

 

 

cyclic-n [2] Total degree = n!

 

231+232+---+2:n=0
$1232 + 332233 + - - - + $n_1$n + $17,121 = 0
231232233 + 2322:3234 + - - - + 23n_12:n231 + 2371231232 = 0

2:1232-~-xn—1=0

 

 

 

katsura—n [3] Total degree = 2"

 

223n+1+223n+---+2232+231—1=0

2xﬁ+1+2xﬁ+m+2$§+x§—xl =0
223nmn+1 + 23311—11371 + - . - + 2232233 + 2231232 — 232 = 0
223n_123n+1+ 223n_223n + - - - + 223123 + 23% — 233 = 0

22:323n+1 + 2231237; + 223223,,_1 + ~ - - + 23n/2$(n+2)/2 — 2m 2 0 (if n is even)

2$2$n+1+ 2231231; + 223223714 + - - - + xfn'l'llﬂ — an = 0 (if n is odd)

 

 

reimer—n [37] Total degree = (n + 1)!

 

211% — 2x3 + - - - + (—1)"+12z¥, — 1 = 0

223113— 2233 + - - - -l- (—1)"+1223% — 1 = 0

 

22:?T1— 22334—1+---+(—1)"+122:'T]+1 — 1 : 0

 

Table 3.3: The polynomial systems

45

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

System CPU time Mixed volume # Of curve # Of isolated
jumpings solutions
eco-16 6m35s 16,384 - 16,384
eco-17 22m23s 32,768 - 32,768
eco-18 1h51m303 65,536 - 65,536
noon-10 1m27s 59,029 - 59,029
noon—11 5m32s 177,125 2 177,125
noon-12 27m298 531,417 2 531,417
noon-13 3h7m108 1,594,297 10 1,594,297
katsura- 17 40m483 131,072 — 131,072
katsura-18 1h35m47s 262,144 - 262,144
katsura-19 3h50m483 524,288 4 524,288
katsura-20 8h58m00s 1,048,576 4 1,048,576
cyclic-9 44s 11,016 - 6,642
cyclic-10 2m47s 35,940 - 34,940
cyclic-11 19m408 184,756 - 184,756
cyclic-12 1h36m40s 500,352 - 367,488
reimer-7 1m583 40,320 - 2,880
reimer-8 30m43s 362,880 - 14,400
reimer-9 7h52m40s 3,628,800 14 86,400

 

 

 

 

 

 

 

 

Table 3.4: The performance of HOM4PS-2.0

3.5.1 The performance of HOM4PS-2.0 in dealing with curve

jumping and diverging paths

Listed in Table 3.4 is the performance of HOM4PS-2.0 on solving all those bench
mark systems in Table 3.3. From the 4th column in the table, one can see that
the occurrences of curve jumping have been mostly diminished for HOM4PS-2.0.
We must note here that curve jumping never appears for systems in each system

category with sizes smaller than those listed in the table. On the other hand, we

46

only need to retrace 10 paths (among 1,594,297 paths) for noon-13, 4 paths (among
1,048,576 paths) for katsura-20 and 14 paths (among 3,628,800) for reimer-9 due to
curve jumping. Moreover, when retracing was necessary, no homotopy paths need
to be retraced more than once, while, as reported in [12], multiple retracings were

required very often for the software package PHoM [12] to deal with curve jumping.

As shown in the table, when solving reimer-n systems, most homotopy paths
diverged. For example, the mixed volume, or the total number of paths we must
follow, of the reimer-8 system is 362,880, but the number of its isolated solutions is
just 14,400. While it’s normally costly in tracing diverging paths, HOM4PS-2.0 is
capable of determining those divergent homotopy paths very efﬁciently with the end

game criteria discussed in Section 3.4.3 as the cpu times in the table show.

As mentioned before, we may solve katsura—n and reimer-n systems by the clas-
sical linear homotopies because the mixed volume of each system agrees with its total
degree. As a comparison, we list in Table 3.5 the results of solving those systems by
both the classical linear homotopy option and the polyhedral-linear homotopy option
in HOM4PS-2.0. While the proof is not available at this moment, by the observation
on a collective numerical data from intensive experiments on noon-n systems, the

total degree of each noon-n system and its mixed volume satisfy:

total degree 2 3n = mixed volume + 2 n.

So, when n becomes large, the difference between them becomes very slim relatively.

47

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

System Total Degree CPU time # of isolated
Linear [Poly-Linear solutions

noon-10 59,029 + 20 1m27s 5m12s 59,029
noon-11 177,125 + 22 5m32s 23m27s 177,125
noon-12 531,417 + 24 27m293 1h28m003 531,417
noon-13 1,594,297 + 26 3h7m103 7h02m103 1,594,297
katsura—15 32,768 7m03s 1h50m26s 32,768
katsura-16 65,536 l6m25s - 65,536
katsura—17 131,072 40m48s - 131,072
katsura-18 262,144 1h35m47s - 262,144
katsura-19 524,288 3h50m483 - 524,288
katsura—20 1,048,576 8h58m003 - 1,048,576
reimer-7 40,320 1m58s 2m49s 2,880
reimer-8 362,880 30m43s 36m43s 14,400
reimer-9 3,628,800 7h52m403 8h47m42s 86,400

 

 

 

 

 

Table 3.5: Comparison of the classical linear homotopy and the polyhedral—linear
homotopy in HOM4PS-2.0 ‘

Therefore we also include them in the table. Apparently, as it shows, if the closeness
of the mixed volume and the total degree of the system can be revealed beforehand,
sometimes HOM4PS-2.0 can handle much bigger systems within tolerable time.

For reimer—n systems, the cpu time for ﬁnding mixed cells is very minimal (less
than 1 second most of the times). While tracing the same number of curves, the
differences in the cpu times of the Classical linear homotopy and the polyhedral-linear
homotopy in the table indicate that nonlinear homotopies can be costly for large

systems.

48

3.5.2 HOM4PS-2.0 vs. HOM4PS

Table 3.6 lists the numerical results that compare HOM4PS-2.0 with HOM4PS.
Since the classical linear homotopy method was not implemented in HOM4PS, the
table only displays the results that used the polyhedral homotopy method uniformly
on all the systems.

As it shows, HOM4PS-2.0 is considerably faster than HOM4PS, and the speed—
up ratio increases by quite a bit as the mixed volume of the polynomial system
increases. Recall that for a speciﬁc system, HOM4PS-2.0 only needs to trace the
mixed volume number of homotopy paths, while twice this number of paths needs to
be traced in HOM4PS. Moreover, HOM4PS-2.0 is much more powerful in dealing
with larger systems. For instance, originally HOM4PS can not solve noon-13 system
within tolerable time, whereas HOM4PS-2.0 followed over 1.5 million curves in 7

hours.

49

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

System Mixed Volume CPU time Speed—up
(# of paths) HOM4PS HOM4P S-2.0 ratio
eco—15 8,192 33m25s 2m25s 13.8
eco—16 16,384 2h55m12s 6m35s 26.6
eco—17 32,768 - 22m23s —
eco- 18 65,536 - 1h51m303 -
noon-9 19,665 21m4ls 1m15s 17.3
noon— 10 59,029 3h20m45s 5m12s 38.6
noon-11 177,125 - 23m27s -
noon-12 531 ,417 - 1h28m00s -
noon— 13 1,594,297 — 7h02m10s —
katsura- 12 4,096 43m54s 1m42s 25.8
katsura- 13 8,192 3h40m54s 4m56s 44.8
katsura— 14 16,384 - 25m15s -
katsura— 15 32,768 - 1h50m26s -
cyclic-9 1 1,016 8m37s 44s 1 1.8
cyclic-10 35,940 58m02s 2m47s 20.9
cyclic-11 184,756 - 19m403 -
cyclic-12 500,352 - 1h36m403 -
reimer-7 40,320 7m47s 2m49s 2.8
reimer—8 362,880 1h44m18s 36m43s 2.8
reimer-9 3,628,800 - 8h47m42s —

 

 

Table 3.6: Comparison of HOM4PS and HOM4PS-2.0

50

 

 

3.5.3 HOM4PS-2.0 vs. PHCpack and PHoM

Listed in Table 3.7 is the comparison of the performance of HOM4PS-2.0 and
PHCpack [38]. The implementation of solving polynomial systems by the Classical
linear homotOpies is also available in PHCpack. Therefore the comparisons listed in
Table 3.7 on noon—n, katsura-n and reimer-n systems whose mixed volume and total
degree of each system are the same (or almost the same) are the results by using the
classical linear homotopy option in each package. Our powerful code MixedVol—2.0
[20] for computing mixed cells has no place in this computation because no mixed
cell computations are required. Nonetheless, as it stands, HOM4PS-2.0 still leads
PHCpack in speed by a big margin in those situations.

For eco—n and cyclic-n systems, there is a considerable difference, sometimes huge,
between the mixed volume and the total degree of the system. So, we must employ the
polyhedral homotopy here. When the PHCpack was tested, we used its fastest option,

as indicated in the package, which utilizes MixedVol [10] for mixed cell computations.

51

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

System Total degree CPU time Speed-up #isolated
PHCpack HOM4PS-2.0 ratio solutions

eco— 14 1,062,882 1h26m04s 52.93 97.6 4,096
eco— 15 3,188,646 3h55m238 2m25s 97.4 8,192
eco- 17 28,697,814 - 22m23s — 32,768
eco— 18 86,093,442 - 1h51m303 - 65,536
noon-9 19,683 33m283 22.2s 90.5 19,665
noon-10 59,049 2h33m27s 1m27s 105.8 59,029
noon-11 177,147 - 5m32s — 177,125
noon-13 1,594,323 - 3h7mlOs - 1,594,297
katsura-14 16,384 2h49m003 2m52s 59.0 16,384
katsura- 15 32,768 8h22m45s 7m03s 71.3 32,768
katsura-16 65,536 - 16m25s - 65,536
katsura—20 1 ,048,576 - 8h58m008 - 1 ,048,576
cyclic-9 362,880 3h50m483 443 314.7 6,642
cyclic- 10 3,628,800 11h00m23s 2m47s 237.2 34,940
cyclic-11 39,916,800 - 19m403 - 184,756
cyclic— 12 479,001,600 - lh36m40s - 367,488
reimer—6 5,040 15m088 9.63 94.5 576
reimer-7 40,320 3h45m43s 1m583 1 14.7 2,880
reimer-8 362,880 - 30m433 - 14,400
reimer-9 3,628,800 - 7h52m408 - 86,400

 

 

 

 

 

 

 

Table 3.7: Comparison of HOM4PS-2.0 and PHCpack

52

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

System Total degree CPU time Speed-up #isolated
PHoM HOM4PS-2.0 ratio solutions

eco—13 354,294 2h39m31s 19s 503.7 2,048
eco— 14 1,062,882 9h57m15s 52.98 677.4 4,096
eco— 15 3,188,646 - 2m25s - 8,192
eco- 18 86,093,442 - 1h51m30s - 65,536
noon-8 6,661 54m185 19s 171.5 6,645
noon-9 19,683 5h01m06s 1m15s 240.9 19,665
noon— 10 59,049 - 5m123 - 59.029
noon— 13 1 ,594,323 - 7h02m10s - 1,594,297
katsura-l 1 2,048 lh2lml3s 283 174.0 2,048
katsura-12 4,096 4h00m09s 1m42s 141.3 4,096
katsura-13 8,192 — 4m56s - 8,192
katsura—15 32,768 - 1h50m263 - 32,768
cyclic-8 40,320 32m32s 6. 8s 287.0 1 , 152
cyclic-9 362,880 - 44s - 6,642
cyclic- 12 479,001 ,600 - lh36m408 - 367,488
reimer-6 5,040 lh14m503 12.1s 371.0 576
reimer—7 40,320 - 2m49s — 2,880
reimer-9 3,628 ,800 - 8h47m42s - 86,400

 

 

 

 

 

 

 

 

Table 3.8: Comparison of HOM4PS-2.0 and PHoM

Table 3.8 compares the performance of HOM4PS-2.0 and PHoM [12]. The im-
plementation of the classical linear homotopy for solving polynomial systems is not
available in PHoM. Therefore all the results listed in Table 3.8 used the polyhedral

homotopy on all the systems.

53

 

Maximum size

PHoM ] PHCpack [ HOM4PS-2.0
eco- 14 (1,062,882) 15 (3,188,646) 18 (86,093,442)
noon- 9 (19,683) 10 (59,049) 13 (1,594,323)

katsura- 12 (2,048) 15 (32,768) 20 (1,048,576)

cyclic- 8 (40,320) 10 (3,628,800) 12 (479,001,600)

reimer- 6 (5,040) 7 (40,320) 9 (3,628,800)

System

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Table 3.9: Maximum sizes of polynomial systems that can be solved by PHCpack,
PHoM, and HOM4PS-2.0 within 12 hours of cpu time

Table 3.9 provides the maximum sizes of the systems that can be solved by
PHCpack, PHoM, and HOM4PS-2.0 within 12 hours of cpu time. The total de-
gree of the system is given in the parenthesis. As it shows, HOM4PS-2.0 can solve

systems of much larger size than the other two packages.

54

Chapter 4

HOM4PS-2.0para—Parallelizing

HOM4PS-2.0

Each of the three main steps in HOM4PS-2.0, ﬁnding mixed cells, following homotopy
paths and checking for possible curve jumping, may all be parallelized. HOM4PS-
2.0para is the parallel version of HOM4PS—2.0 which parallelizes the three main steps
in HOM4PS-2.0. A common feature of those three stages, as mentioned before, is that
each important task on a single CPU can be successively subdivided into multiple sub—
tasks, which can be processed independently from the others without communicating
with them. We therefore adopt a simple master-workers parallel computation envi-
ronment where one master CPU communicates with all worker CPUs, and workers
do not communicate among themselves. We use MP1 [34], which provides a message
passing library for this type of communication between master and workers, in our

implementation of the parallelization. In this chapter, we will address the details of

55

parallelizing HOM4PS-2.0 in each step.

4.1 Parallelizing the computation of mixed cells

Recall that a main strategy for ﬁnding mixed cells is the extension of level-k subfaces
El: 2 (éi1,...,éik) of the lifted support S = (S1,...,Sn) starting from k = 1
and when k = n — 1 an extensible Ek yields mixed cells of S = (51,. . . ,3”) in-
duced by elements in Sin- In our parallelization in computing mixed cells we ﬁrst
choose i1 to be the smallest integer in {1, . . . ,n} such that ]L(S,;1)| 2 |L(Sj)] for all
j E {1, . . . ,n} \ {i1} . Each lower edge in L(S,-l) can be extended to subfaces of the
next and consecutive levels independently by employing the idea of dynamic enumer-
ation of all mixed cells [29]. Thus each worker processor will be assigned to extend a

lower edge of L(S,-1), and the work is dynamically distributed in the following man-

ner.

Algorithm for the master

Choose i1 to be the smallest integer in {1, . . . ,n} such that ]L(S,-l)| _>_ |L(Sj)| for all
2'6 {1a---,n}\{i1}

N = the # of lower edges in L(Sz-1)

14311) = {é111---1é1N}

p = the # of processors (i.e., 1 master and p — 1 workers)

dos=1,p—1

56

send job 3 to worker 3 (i.e., processor 3 works on extending edge €213)

end do

(Comment: The time required for each job is different. When worker j ﬁnishes and
is ready for a new job, it sends the result to the master, and receives the next open

job in the queue.)

do 3 = p, N + p — 1
receive results from worker j (i.e., worker j ﬁnishes its current job)

send job 3 to worker j (i.e, worker j works on extending another edge éls)

end do

(Comment: For 3 > N, job 3 is an “empty job”, an ending signal to worker proces-

sors.)

Algorithm for workers
Choose i1 to be the smallest integer in {1, . . . ,n} such that |L(S,'l)| 2 |L(Sj)| for all
j E {1,...,n}\{i1}
N = the # of lower edges in L(S,-1)
L(311)={é11w~1é1N}
Step 0: Receive job 3 from the master
If (s > N) then stop. (i.e., there are only N edges)
Otherwise, set F1 := {éls},k :2 1.
Step 1: If k = 0, send all mixed cells to the master. Stop.

If Ek = (I), set It 2: k — 1, and go to Step 1. Otherwise go to Step 2.

57

Step 2:

Step 3:

Step 4:

4.2

Select éik E Ek, and set Ek :2 Ek\{éik}.

Let Eli: = (éi1,. . . ,éik), where éi1= élsr and

A A A, A 0

e4]. = {oz-fag} E L(S,-j) for j = 2,. . .,k, and go to Step 3.
To extend level-k subface Eh, search among

M:={Sl:lE{1,...,n}\{i1,...,ik}} for 84+, that has minimal

number of suitable points where only lower edges among them can possibly
extend Ek to a level—(k + 1) subface.

Find 5(Ek) (2 The collection of all lower edges in L(S,~ that extends

k+1)

A

Ek.)
If 5(Ek) = (0, go to Step 1. Otherwise set Fk+1 = 6(Ek), k := k + 1, then
go to Step 4.

If k = n, all C = ((a,-1, . . . , e 1 with 5,7, e in are mixed cells.

2‘71—] ’ éln’
Collect all these mixed cells, set k := k — 1, and go to Step 1. Otherwise go
to Step 2.

Parallelizing the curve tracing of polyhedral

homotopy

The fact that each homotopy path can be followed independently seemingly pro—

vides a perfect environment for the parallelization. Nonetheless when the polyhedral

homotopy method is used to solve polynomial systems, different mixed cells induce

different polyhedral-linear homotopies with different binomial systems to solve. Ini-

58

tially, it would seem natural to distribute the workload by sending one mixed cell to
each worker, and when each worker receives a mixed cell, it induces the corresponding
polyhedral-linear homotopy, solves the associate binomial system for starting points,
and follows the resulting homotopy paths. However, the volume of each mixed cell
may differ, hence the number of paths followed by each worker would differ. More—
over, there may be more workers than cells. For an extreme case: one can show that
reimer-n systems [37] permit only one mixed cell regardless of what sort of liftings are
used. In this situation, only one worker traces homotopy paths while the rest of the
workers all remain idle. Therefore, we propose to distribute the workload consisting
of a suitable amount of homotopy paths rather than sending one mixed cell to each

worker.

To decide how many paths should be assigned to each worker each time, let the
chunk size, denoted CS, be the number of paths given to each worker to follow in
each job. If CS is too small compared to the mixed volume, then too much time
would be spent on sending and receiving messages between workers and the master.
Since all workers ﬁnish their tasks quickly and report to the master at about the
same time, trafﬁc jams will result and furthermore it will increase the communication
time between the master and workers. On the other hand, choosing relatively large
CS may reduce the communication time, but it may also result in imbalance of
the workload in certain situations. Tracing different paths may consume different

amount of time, for instance, following divergent paths is about 3 to 5 times slower

59

 

than following convergent ones. Therefore it may not take the same cpu time for all
workers each to ﬁnish tracing the same number of homotopy paths. Those workers
who ﬁnish all their jobs must wait for those who have not done their jobs, and
the amount of waiting time depends largely on the size of CS. In the following, a
method on choosing an appropriate CS to make the work time more balanced among

processors is proposed.

Let M V represent the mixed volume of the system, which is known after all mixed
cells are found. Let p be the number of processors (1 master and p - 1 workers), t be
the average cpu time to follow one path, and I be the average number of jobs each

worker should ﬁnish. Then

MV

CS*l*(p—1)%MV=>CS=z*—(PT:—l—),

and once we choose l, chunk size CS is determined. Each worker will then follow

_ MV
—l*@-U

worker to follow CS paths is about CS :1: t. The computation time for each worker to

CS homotopy paths per job. The average computation time for each
ﬁnish its 1 jobs is about t2 = CS :1: t :1: l. The workers that ﬁnish all their jobs ﬁrst
must wait for those that have not done their jobs, and the waiting time is at most
t1 = CS * t. To balance the workload distribution, the ratio between the worst case
waiting time and total time required for each worker to ﬁnish all its 1 jobs needs to

1
be small, say 0.01. In other words, we want :—1 = I z 0.01. So 1 is about 100, yield-
2
M V

mg CSZrmi—i)

. This means each worker will be assigned about 100 jobs, and

60

 

each job consists of following homotopy paths. The aim of parallelizing

M
100(p — 1)
our software package HOM4PS-2.0 is to increase the computing resources for solving

larger systems, which inevitably requires tracing a big amount of homotopy paths.

Therefore, in the following, we always take I = 100 for simplicity.

To distribute the jobs dynamically, we ﬁrst number all mixed cells, and for each
mixed cell ({a11,a12}, . . . , {an1,an2}), we number all the isolated solutions of its

associated binomial system

Cllxall + C12$a12 = 0,
(4.1)
cnlccanl + 9,22%? = 0,
in the following manner. Equivalent to (4.1), write
mall—(112 : b1,
(4.2)
zanI—an2 2 bn,

c.
with bj = ——12 forj = 1, 2, . . . ,n. When solving (4.2) [22, 24], the integer matrix
c.
31

A = [011- 012,-uaan1- 0112]

61

is upper triangularized ﬁrst. That is, ﬁnding an integer matrix B with [det B] = 1

 

 

such that
v11 ”12 ”In
0 1122 v2n
BA 2
0 0 Um,
Recall that if y = (y1,...,yn) E C" and m1,...,mn are columns of an n x n

integer matrix M, then yM := (ym1,...,ym"). With these notations, if we ﬁnd

2 = (z1,. . .,zn) E C" satisfying

zBA = (51,. . . ,bn), (4.3)

then 23 : 23 is a solution to (4.2). Written out explicitly, (4.3) becomes

’U
211412,,322 : b2,
(4.4)
lelln H,Z;Jlnn : bn-
Without loss of generality, we may assume 'Ujj > 0 for all j = 1, . . . ,n. Now, v11

62

solutions of 21 in (4.4) can be expressed as

arg (b1) + 27rk1i
7Jll

 

2(k1)_lb Ill—1E ( ) f k _
1 — 1| 23p or 1—0,...,v11—1

where arg (b1) is the principal argument of b1. And, by forward substitutions in

(4.4), for j: 2,...,n

 

 

1 — .
k- _ —.—. or b' +2rrlc-i
2(3):|b-|UJJE23p 9(3) J for k-=0,...,v~—1
J J 0.. J J]
7.7

— bj — . . . -

where bj = zvlj zvj-lj and org (bj) IS the prmcrpal argument of bj. So the system
1 j—l

in (4.4) has ”11 x x vnn(=|detA|——- volume of the mixed cell ({a11,a12}, . . .,

{an1, an2})) isolated solutions in total, and each solution can be written in the form
(zgkl),...,z1(zknl) where 0 S kj < '03") for j = 1,...,n.

We order those isolated solutions as follows: For 1 S l S v11 x- - ~xvnn, consecutively
applying “division algorithm” on I — 1 yields a unique decomposition of l - 1 in the

form

l—1=u1(v22x~-xvnn)+u2(v33X-~xvnn)+~-+un_1(vnn)+un.

Apparently, 0 g uj < Ujj for j = 1,. . . ,n, and we thereby label (z[u1), . . . , Alum)

as the lm solution of the system in (4.4). For the corresponding solution 23 = 23

63

to the system in (4.1), we label it with the same order index of the solution 2 to the

system in (4.4).

The polyhedral-linear homotopy paths can now be ordered by taking the same
indices of their starting points of the associated binomial systems together with the
number indices of the corresponding mixed cells. For this purpose, a table fT is
established in which each entry in the ﬁrst row represents the index of a mixed
cell and the entry in the row below which denotes the index of the one of the so-
lutions of the associated binomial system of the mixed cell. For example, if there
are 30 mixed cells Cl,C2,C3,...,C3o and Volume of Cl = 8, Volume of C2 =

12,...,Volume of C30 = 8, then the table fT is: Suppose the mixed volume

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

cell 1 cell 2 . .. cell 30
Cell# 1 11 2 30 30
Path# 1 8[1 12 1 8
Table4.1:fT

M V of the system is 1500 and the number of processors p = 4. Then CS =

1500

m = 5 and the number of total jobs = 300 where each job consists of

following 5 paths. Based on the ﬁrst row fT(1,j) in table f T, we associate to each
job i = 1, . . .,300 atable fT(i) with 3 rows and fT(1,i*5)-fT(1,(i—1)*5+1)+1
columns. Entries in the 1‘“ row are the indices of the mixed cells. And entries in the
2nd row and 3rd row list the beginning indices and the ending indices of the chunk

respectively. For instance, associated to job 1 is the table fT(l) of size 3 x 1 :

64

1 , which means the worker that receives job 1 will follow paths emanating from

 

 

éﬁlfitions number 1 through number 5 of the binomial system associated with cell

 

1 2

 

1. Similarly, associated to job 2 is the table flT(2) of size 3 x 2 : 6 1 . So

 

8 2

the worker that receives job 2 will trace paths number 6 through number 8 of the

 

 

 

 

binomial system associated with cell 1 as well as paths 1 and 2 of the binomial

system associated with cell 2, making it 5 in total.

We summarize the above in the next two algorithms.

Algorithm for the master

Establish Table f T of two rows in which each entry in the ﬁrst row represents the
index of a mixed cell and the entry below which in the second row denotes the index
of the solution of the associated binomial system of the mixed cell

M V

N umO f J obs = CS (M V is the mixed volume and CS is the chunk size.)

p= # of processors (1 master and p — 1 workers)
do i = 1 to p — 1

send job i (along with table fT(i)) to worker i
end do

do i 2p to NumOfJobs+p—1

65

receive results from worker j
send job i (along with table fT(i)) to worker j

end do

Algorithm for worker

receive job i from the master
if (i > NumOfJobs), then
stop
else
receive table fT(i) with 3 rows and fT(1,i =1: CS) — fT(1,(i — 1) >1: CS + 1) +1
columns
entries in 1“ row are the indices of the mixed cells
entries in 2nd row are the beginning indices of the chunk
entries in 3rd row are the ending indices of the chunk
worker solves the corresponding binomial systems and follows CS paths based on
f T0)
send results to the master

end if

When a worker receives job i along with table f T(i) from the master, it solves the
requisite binomial systems for picking out the appropriate paths to follow. However,

as mentioned before, some systems only allow very few mixed cells. Solving the

66

same binomial system repeatedly will certainly effect the scalability of the dynamic
distribution. So if there are only a few mixed cells, we ﬁrst solve all the binomial
systems corresponding to those mixed cells respectively, and save all solutions of the
binomial systems. When homotopy paths are distributed to workers, we only send
CS binomial system solutions as starting points without having workers solve the

same binomial systems over and over again.

4.3 Parallelizing the curve tracing of the classical

linear homotopy

For systems whose mixed volumes are the same or almost the same as their total
degree (or the Bézout number), the polyhedral-linear homotopy method provides no
advantages when it is used for solving them. As mentioned in Section 3.2.2, those
systems P(23) = (p1(23), . . . ,pn(23)) with 23 = (231, . . .,23n) should be solved directly
by following the total degree number of solution paths of the classical linear homotopy
H(2:,t) = (1 — t)')Q(:r) + tP(23) = 0 for generic 'y E C where a typical choice of

Q(I) = (210?).- - ~1Qn($)) is

(11 = a11361ll - 171, d1 = degree of P1013),
(4.1)
on = anzgn — bn, dn = degree of pn(2‘),

67

with randomly chosen (a1, . . . ,an, b1, . . . , bn) E C2" [24]. In this way, since no poly-
hedral homotopies are involved, no costly mixed cell computations are required. We

now describe the parallel algorithm in our HOM4PS-2.0para for this homotopy.

While the starting system Q(23) in (4.1) is a special binomial system as in (4.1),
the situation is much simpler here. Let TotalDeg 2 (11 x d2 x x dn be the total
degree of the system P(23), the total number of paths needed to be followed for the
classical linear homotopy. Let p be the number of processors, among which there are

1 master and p — 1 workers. Let CS be the number of homotopy paths that should

Total Deg

be followed in each job. To dynamically assign the jobs, we choose CS = _—100(p __ 1)

as discussed in the last subsection. For workers to decide which CS paths to follow,
we ﬁrst number all the isolated zeros of the starting system Q(23) as follows: Each

isolated zero of Q(23) can be written in the form

(xgkl),...,x$lkn)) where OSkj<dj for j=1,...,n

and

d—l

 

1 - .
3‘. ar -+2rrk.-i
|JEIEp< 9(3) 3) for kj=0,...,

_ b- _ _
with bj = a—J and arg(bj) being the principal argument of bj. Again, for 1 S
j

l S TotalDeg, consecutively applying “division algorithm” on I — 1 will result in a

68

unique decomposition of l — 1 in the form
l-1=u1(d2 x x dn)+u2(d3 x xdn)+---+un_1(dn)+un (4.2)

with 0 S uj < dj for j = 1,. . . ,n. So, as before, (regal), . . . 25:71)) will be the lm
solution of the starting system Q(23) = 0, and homotopy paths will be ordered by

taking the same indices of their starting points, respectively, as solutions of Q(23) = 0.

TotalDeg
CS

With this order arrangement, the worker that receives job i for 1 S i S
will follow homotopy paths from number (i — 1)CS + 1 to number iCS, which is CS
of them in total.

On the other hand, this order arrangement can also help to avoid saving all the
solutions of the system Q(23) = 0 in the main memory by constructing the table
Bixlib of size n x d where d = max {d1, . . . ,dn}, in which the j-tﬁ row stores the

(0) (dj"1)

d.
. ‘ . . . ]_ ._ .2
(1] solutions 2:]. ,..., 2:]. of the equation a] 2:]. b] —0 for J 1,...,n. As

an example, for n = 3, d1 = 8, d2 = 5 and d3 = 10, the 3 x 10 Bixlib table is:

 

(6) (7)

(01 $511 (21 431 ,.[41 ,.[51 $1 $1

(01 (11 (21 (31 (41
0 1 2 4 4 5 6 7 8 9
(1 (1 (1 (1 (1414141 (1 <1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Table 4.2: Bixlib

When table Bixlib is established, one can easily identify the l-t—ll solution
(agul), . . . Jan“) of Q(23) = 0 as long as 111,. . .,un in the decomposition of l— 1

in (4.2) are determined, and there is no need to save TotalDeg number of solutions

69

 

of Q(rr) = 0 in the memory.

We now list the algorithms for the master and the workers below.

Algorithm for the master
Create table Bixlib

p: # of processors (1 master and p — 1 workers)

Total Deg

NumOfJobs= CS

(Total Deg is the total degree and CS is the chunk size)

do i = 1 to p — 1
send job i to worker i

end do

do i = p to NumOfJobs +1) ——1
receive results from some worker j
send job i to worker j

end do

Algorithm for worker

receive job i from the master

if (i > NumOfJobs), then
stop

else
receive 2 integers (i — 1)CS + 1 and iCS, indicating the starting and ending
indices of paths

for each integer between the starting and ending indices, ﬁnd the corresponding

70

starting point from table Bixlib to follow the path
send results to the master

end if

4.4 Parallelizing the checking of possible curve

jumping

Once all the solutions are attained after following all homotopy paths, we proceed
to check for possible curve jumping. First of all, the master will divide the solutions
into 200 groups, as we explained in Section 3.3, where each group of solutions is
characterized by having the same sign of the imaginary part of the solutions’ ﬁrst
component as well as the same k-th digit bk and (k + 1)-th digit bk+1 after the
decimal point of its decimal representation. Note that solutions may not be evenly
distributed among those groups, therefore for systems having large number of isolated
solutions, some of those 200 groups may still have a big amount of solutions. In
HOM4PS—2.0para, if any group has more than say 1,000 solutions, then this speciﬁc
group will be divided into 200 subgroups again based on the imaginary part of the
second component of each solution in the group. A table ng is established in
which each entry in the ﬁrst row represents the index of a group and the entry in
the row below denotes the index of subgroup associated to the group in the ﬁrst
row. We use 0 in the second row to indicate that if the group in the ﬁrst row

does not have more than 1,000 elements, hence no further division of the group is

71

 

necessary. For example, let 91,. . .,9200 denote the original 200 groups. Suppose
|g2| > 1, 000, |g3| > 1, 000, and |g4| > 1, 000 and the rest of the groups have no more
than 1000 elements. Then 92, 93, and 94 will all be divided into 200 subgroups based
on the imaginary part of the second component of each solution in those groups.

LQt Q(2,1), . . . , 9(2, 200),g(3,1), . . . ,g(3, 200), 9(4, 1) . . . , 9(4’ 200) be the subgroups 0f

92, 93, and 94 respectively. The table ng becomes: In total, there are (200 — 3) +

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

91 92 93 .94 - - - 9200
Group# 1 2 2 3 3 4 4 200
Subgp # 0 1 200 1 200 1 200 0

 

 

 

 

 

 

 

 

 

 

 

Table 4.3: ng

200 * 3 = 797 groups (or subgroups) which need to be checked. We dynamically
distribute the jobs in a similar manner as what was done before as discussed in
Section 4.2 and 4.3. Suppose that there are 4 processors (1 master and 3 workers).

Let CS (=chunk size) be the number of groups needed to be checked in each job. Here,

in our case, CS = Zf—wﬁw z 3 . This means each worker would have to check 3

groups (or subgroups) in each job, and the total number of jobs is 1:1 = 266. Based
on the ﬁrst row ng(1, j) in table ng, to each job i = 1,. . .,266, we associate a

table ng(i) with 3 rows and ng(1,i * 3) — ng(1, (i — 1) =1: 3 + 1) + 1 columns.

lst 2nd

Entries in the row are the indices of the original groups. And entries in the
row and 3rd row list the beginning indices and the ending indices of the subgroups

associated to the group index in the ﬁrst row respectively. For instance, associated

72

 

 

 

to job 1 is a table ng (1) of size 3 x 2 : 0 1 , which means the worker that

 

0 2
receives job 1 will check group 91 and also subgroups 9(211) and g(2,2) of the group

 

 

 

 

92. Here 0 in the second and third row of table ng(1) means 91 has no more than

1,000 elements. There is no need to diﬂe 91 into 200 subgroups. Associated to

2

F—d

job 2 is a table ng(2) of size 3 x 1 : 3 . So the worker that receives job 2 will

r—-——1

5
check subgroups g ,g and g L(ﬁﬁgroup 9 making it 3 in total. Similarly,
(2.3) (2,4) (2.5) 2

T

2
L—

 

 

associated to job 3 is a table ng(3) of size 3 x 1 : . So the worker that receives

6
g—
8

 

 

job 3 will check subgroups 9(2,6)’ 9(237) and 9(2,8) oft—he group 92.

The dynamic distribution algorithms for the master and workers for checking the

curve jumping described in Section 3.3 are listed below.

Algorithm for the master
p = number of processors (1 master and p — 1 workers)
Divide the solution set into 200 groups 91,. . .,9200 based on the imaginary part of

the ﬁrst component of each solution.

Establish Table ng of two rows in which each entry in the ﬁrst row represents the

73

index of a group among 91, . . . , 9200 and the entry below which in the second row
denotes the index of subgroup associated to the group in the ﬁrst row.
Note: entries 0 in the second row of ng means the corresponding group in the ﬁrst

row has less than or equal to 1,000 elements.

Let N = #{i6 {1,2...,200}| lg,|>1,000}

Totales = 200 * N + (200 — N) : total number of groups (or subgroups)

CS 2 % : number of groups (or subgroups) in each job
Totales
N b = _—
umO f J 0 3 CS

do i = 1 to p — 1
send job i (along with ng(i)) to worker 2'

end do
do i = p to NumOfJobs +13 —1
receive computation results from some worker j
send job i (along with ng(i)) to worker j
end do
(Comment: For i = 1, . ..,Numo f Jobs, job i consists of checking solutions in
groups (or subgroups) from table ng(i). Job Numo f Jobs +1, job N umo f Jobs + 2,
. . , job NumOfJobs + p — 1 are empty jobs.)

Algorithm for the workers

receive job i from the master

74

if (i > NumOfJobs) stop
if (i S NumOfJobs) then
receive table ng(i) with 3 rows and ng(1, i =1: CS) - ng(1, (i — 1) * CS+ 1) +1
columns
entries in 1‘“ row are the indices of among the groups 91,. . . ,9200
entries in 2nd row are the beginning indices of the subgroups in the chunk
entries in 3rd row are the ending indices of the subgroups in the chunk
for any group (or subgroup) g = {$1, . . . ,xlgl} E ng
dol=1to|g|—1

dom=l+1to|g|

ll 131—33m II

II $1 M

if the minimum singular value of the Jacobian matrix Px(:rl) is bigger

if < a chosen parameter 60, then
than a chosen threshold 61, then retrace the two paths with smaller
step sizes
send the updated solutions to the master
else
if :13) is a nonsingular point on a positive dimensional component
curve jumping occurs
retrace the two paths with smaller step sizes
end if

end if

75

end if
end do
end do

end for

end if

After two (or more) numerically identical solutions are detected and the occur-
rence of curve jumping is determined, we need to identify the associated paths and
retrace them with smaller step sizes. For this purpose, the order index of the path
must be saved when it reaches a zero of P(x) = (191(1):), . . . ,pn(a:)) at the end of
the path. When the polyhedral-linear homotopy is used to solve the system, each
path corresponds to a unique mixed cell along with a unique path number in that
cell as elaborated in Section 3.2. We save these two numbers for the path so that
identiﬁcation of the mixed cell provides the associated binomial system, yielding the
starting point of a particularly numbered path and so that retracing of the paths
may then proceed. For the classical linear homotopy, the situation is simpler. The
solution paths of the homotopy are numbered according to the order indices of their
starting points as solutions of the starting system. Hence storing the path number is

sufﬁcient for retracing the path.

Remark: In the very beginning, when the master receives the solutions of P(x) = 0
in C" , they are saved in a 2-dimensional array of complex numbers in HOM4PS-

2.0para. The size of the 2-dimensional array is n x M V for the polyhedral homotopy

76

and n x T otalDeg for the classical linear homotopy (n complex components of the
solution and M V or TotalDeg number of solutions). As before, M V is the mixed
volume of P(x) and TotalDeg is the total degree of P(z). Different from the paral-
lel version of PHCpack [39], in which the master will write the solutions of P(x) = 0
to a ﬁle when the solutions are received from the workers, we save all the solutions in
an array for the purpose of grouping the solutions for the detection of curve jumping.
Having the solutions stored in an array of complex numbers is convenient for retriev-
ing the necessary information to retrace the homotopy paths, and for updating the

new solutions.

4.5 Numerical results

To demonstrate the performance of our parallel software package HOM4PS-2.0para
we will focus on those size-expandable bench mark systems listed in Table 3.3. All
the computations were carried out on a cluster 8 AMD dual 2.2 GHz cpus. Again,

we only list those systems that can be solved within 12 hours cpu time.

4.5.1 The performance on large systems

Table 4.4 shows the results of solving eco-n and cyclic-n systems using 1 master and 7
workers. For eco—n and cyclic—n systems, the total degree of each individual system is
much greater than its mixed volume, especially when 17. becomes larger. We therefore

use the polyhedral-linear homotopy to solve those systems, and the total number of

77

homotopy paths that need to be followed is the mixed volume of the system. As it
shows, curve jumping rarely occurs. On the other hand, when retracing was necessary,

no homotOpy paths need to be retraced more than once.

 

 

 

 

 

 

 

 

 

 

 

 

System CPU time Total degree Mixed volume # of curve # of isolated
(=# of paths) jumping solutions

eco— 17 2ml ls 28,697,814 32,768 - 32,768
eco— 18 6m308 86,093,442 65,536 - 65,536
eco—19 26m26s 258,280,326 131,072 - 131,072
eco—20 1h29m295 774,840,978 262,144 - 262,144
eco—21 10h08m555 2,324,522,934 524,288 1 524,288
cyclic-11 3m34s 39,916,800 184,756 - 184,756
cyclic-12 14m07s 479,001,600 500,352 - 367,488
cyclic-l3 lh39m103 6,227,020,800 2,704,156 - 2,704,156
cyclic-14 7h32m42s 87,178,291,200 8,795,976 4 8,464,307

 

 

 

 

 

 

 

 

Table 4.4: Solving systems by the polyhedral-linear homotopy with 1 master and 7
workers

78

Table 4.5 lists the results of solving systems noon-n, katsura-n, and reimer-n with
again 1 master and 7 workers. For katsura-n and reimer-n systems, total degree of
each system is equal (or almost equal) to the mixed volume of the system. Therefore
we solved these systems by using the classical linear homotopy to avoid the costly
mixed cell computations. Indeed, it is very costly to ﬁnd all the mixed cells of
katsura—n system when n > 15. For noon-n systems, the difference between the total
degree of each system and its mixed volume is 277. [21], which becomes much slimmer
relatively as 72 becomes larger. Hence we also solved noon-n systems by the classical

linear homotopy. Note that while curve jumping occurs in solving very big systems,

the number of occurrences is very small relatively. Again, we only need to retrace the

 

homotopy paths once when curve jumping was detected.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

System CPU time Total degree # of curve # of isolated
(=# of paths) jumping solutions

noon-12 2m23s 531,417+24 - 531,417
noon-13 7m48s 1,594,297+26 - 1,594,297
noon-14 38m12s 4,782,941+28 - 4,782,941
noon-15 4h14m33s 14,348,877+30 - 14,348,877
katsura—18 9m46s 262,144 - 262,144
katsura-19 23m36s 524,288 2 524,288
katsura-20 55mlOs 1,048,576 4 1,048,576
katsura—21 2h08m42s 2,097,152 8 2,097,152
katsura-22 4h52m013 4,194,304 20 4,194,304
katsura-23 11h17m408 8,388,608 52 8,388,608
reimer—S 2m36s 362,880 — 14,400
reimer-9 28m04s 3,628,800 8 86,400
reimer—lO 8h40m46s 39,916,800 20 518,400

 

 

 

 

 

Table 4.5: Solving systems by the classical linear homotopy with 1 master and 7

workers

79

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

System # of Total time to Time to ﬁnd Time to Time to

workers solve system mixed cells trace curve check solutions

k cpu(s) ] ratio cpu(s) 1 ratio cpu(s) I ratio cpu(s) I ratio

eco—16 1 445.32 1.00 120.02 1.00 325.00 1.00 0.30 1.00
2 223.49 1.99 60.66 1.98 162.58 2.00 0.25 1.20

3 150.69 2.96 40.94 2.93 109.53 2.97 0.22 1.36

5 91.31 4.88 25.22 4.76 65.89 4.93 0.20 1.59

7 68.70 6.48 19.99 6.00 48.58 6.69 0.13 2.31

cyclic-11 1 1475.39 1.00 38.15 1.00 1436.49 1.00 0.75 1.00
2 734.96 2.00 19.10 2.00 715.41 2.00 0.45 1.67

3 494.19 2.99 12.94 2.95 480.86 2.99 0.39 1.92

5 295.90 4.99 8.05 4.74 287.47 5.00 0.38 1.97

7 212.87 6.93 6.47 6.00 206.06 6.97 0.34 2.21

 

 

 

 

 

 

 

 

 

 

 

 

Table 4.6: The scalability of solving systems by the polyhedral-linear homotopy

4.5.2 Scalability

To test the scalability of our HOM4PS-2.0para, we solved eco-16, noon-12, cyclic-11,
katsura-17 and reimer—8 systems with varying numbers, k = 1, 2,3, 5, 7, of workers.
Table 4.6 shows the scalability of solving eco—16 and cyclic-11 by the polyhedral-
linear homotopy and Table 4.7 exhibits the scalability of solving reimer-8, noon-12
and katsura-17 systems by the classical linear homotopy. The ratios in both tables
stand for the cpu time for one worker to ﬁnish the task divides by the cpu time of
k workers for the same task. The ratios for curve tracing in both tables illustrate
an excellent scalability (nearly perfect) for each system, showing evidently that each
path can be traced independently. The cpu time to check the solutions on both
tables includes the cpu times for the detection of curve jumping as well as retracing

the associated paths when curve jumping occurs. The ratios in this category do not

show an excellent scalability because the cpu time for checking curve jumping is

80

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

System # of Total time to Time to Time to

workers solve system trace curve check solutions

k cpu(s) lratio cpu(s)J ratio cpu(s) [ ratio

noon-12 1 1003.32 1.00 980.72 1.00 22.50 1.00
2 501.75 2.00 490.33 2.00 11.42 1.97

3 335.18 2.99 326.68 3.00 8.50 2.65

5 201.27 4.98 195.30 5.00 5.97 3.77

7 143.22 7.00 138.88 7.00 4.34 5.18

reimer—8 1 1088.95 1.00 1087.74 1.00 1.21 1.00
2 545.08 2.00 543.96 2.00 1.12 1.08

3 363.89 2.99 362.81 3.00 1.08 1.12

5 218.69 4.98 217.97 4.99 0.72 1.68

7 156.54 6.96 155.86 6.98 0.68 1.78

katsura—17 1 1964.22 1.00 1963.12 1.00 1.10 1.00
2 982.38 2.00 981.35 2.00 1.03 1.07

3 654.17 3.00 653.22 3.00 0.95 1.16

5 394.02 4.99 393.18 4.99 0.84 1.31

7 280.56 7.00 279.85 7.00 0.71 1.55

 

 

 

 

 

 

 

 

 

 

Table 4.7: The scalability of solving systems by the classical linear homotopy

very fast, as one can see, and does not change much for various number of workers.
As expected, when the classical linear homotopy is used the scalabilities in Ta-
ble 4.7 are very close to being perfect. Notably illuminating is the almost perfect
scalability in solving the whole system of cyclic-11 in Table 4.6. This shows that
when the polyhedral-linear homotopy is used our strategy for the dynamic workload

distribution is quite balanced.

81

4.5.3 The multiple precision

As in general, we use Newton’s method for the “corrector” in our prediction-correction
scheme in following homotopy paths. Namely, when the predictor (we use the cubic
Hermite interpolation as our predictor after the ﬁrst step which uses the Euler pre-
dictor) provides a prediction point (2:0, to), Newton’s iteration is applied to H (:13, t0)
initiated at (1:0, to), until the magnitude of “H(x, to)” is less than a predetermined
accuracy 6 > 0. Realistically, this accuracy parameter should vary according to the
size of the Jacobian “Hm“ as well as the size of ”as” , and in HOM4PS-2.0, we set 6
to be the minimum of “Hz“ * ||23|| * 10‘8 and 0.1.

Now, during the computation, if the machine precision is set to be double precision,
it allows 15 digits accuracy. So when the condition number of the Jacobian matrix
H$(x,t0) is greater than 1015, then Newton’s method can hardly converge within
the desired accuracy. The step size will then be cut in half, followed by repeating
the prediction-correction scheme. If the step size is less than the minimum step size
allowable, say 10‘“, before reaching the stage of the end game, then the procedure
for tracing this path will cease to continue. We then retrace this homotopy path
by increasing the precision from double precision to quad precision which is about
10 times slower in tracing paths. Fortunately, those situations seldom occur, and
therefore causing only very minor effect on the scalability and efﬁciency. For instance,
only about 7 (among 65, 536) paths for eco-18, 20 (among 131, 072) paths for eco—19,

and 18 (among 262,144) paths for eco—20 need to use quad precision to retrace paths.

82

Chapter 5

Conclusion and future work

Polyhedral-linear and classical linear homotopies can be used to solve large polynomial
systems efﬁciently. The sequential version HOM4PS-2.0 leads the existing software by
huge margins. With multi—processors, HOM4PS-2.0para—parallel version of HOM4PS-
2.0 can solve systems that have not been solved in the past within tolerable time.
As we can see from Table 4.6, the ratios of computing mixed cells do not exhibit
near perfect scalabilities as the ratios of tracing curves show. At this time, while
the master sends the lower edges of L(S,;1) to the workers dynamically, some lower
edges may take much more time for the extension than the others. Therefore at the
very end, workers who ﬁnish their lower edge extensions will need to wait for those
workers who have not ﬁnished theirs. This will result in an imbalance of workload
distribution. Apparently a more sophisticated method needs to be developed for the

workload distribution for mixed cell computations.

83

Bibliography

[ll

[2]

[3]

[4]

[5]

[6]

l7]

[8]

D. N. Bernshtein (1975), “The number of roots of a system of equations”, Func-
tional Analysis and Appl, 9(3), 183-185.

C. Bjork and R. Roberg (1991), “A faster way to count the solutions of in-
homogeneous systems of algebraic equations”, J. Symbolic Computaion, 12(3),
329-336.

W. Boege, R. Gebauer, and H. Kredel (1986), “Some examples for solving sys-
tems of algebraic equations by calculating Groebner bases”, J. Symbolic Com-
putation, 2, 83-98.

J. Canny and J. M. Rojas (1991), “An optional condition for determining the
exact number of roots of a polynomial system”, IN Proceedins of the 19.91 Inter-
national Symposium on Symbolic and Algebraic Computaion,ACM,83-98.

S. N. Chow, J. Mallet-Paret, and J. A. Yorke (1978), “Finding zeros of maps:
homotopy methods that are constructive with probability one”, Math. Comput,
32, 887-899.

H. Cohn (1982) “An explicit modular equation in two variables and Hilbert’s
twelfth problem”, Math. of Comp., 38, 227-236.

F. J. Drexler (1977) “Eine Methode zur Berechnung samtlicher Losungen von
Polynongleichungssystemen”, Numer. Math., 29, 45-58.

T. Gao and T. Y. Li (2000), “Mixed volume computation via linear program-
ming”, Taiwan J. of Math., 4, 599-619.

[9] T. Cao and T. Y. Li (2003), “Mixed volume computation for semi-mixed sys-

tems”, Discrete Comput. Geom, 29(2), 257-277.

84

[10] T. Gao, T. Y. Li and M. Wu (2005), “Algorithm 846: MixedVol: A software
package for mixed volume computation”, ACM Transactions on Math. Software,
31(4), 555-560.

[11] C. B. Garcia and W. I. Zangwill (1979), “Findind all solutions to polynomial
systems and other systems of equations”, Math. Programming, 16, 159-176.

[12] T. Gunji, S. Kim, M. Kojima, A. Takeda, K. Fujisawa and T. Mizutani (2004),
“PHoM - A polyhedral homotopy continuation method”, Computing, 73, 53-57.

[13] T. Gunji, S. Kim, K. Fujisawa and M. Kojima (2006), “PHoMpara - Parallel
implementation of the polyhedral homotopy continuation method for polynomial
systems”, Computing, 77, 387-411.

[14] B. Huber (1996), “Solving sparse polynomial systems”, Ph.D. dissertation, Cor-
nell University.

[15] B. Huber and B. Sturmfels (1995), “A Polyhedral method for solving sparse
polynomial systems”, Math. Comp., 64, 1541-1555.

[16] A. G. Khovanski (1978), “Neton polyhedral and the genus of complete intersec-
tions”, Functional Analysis and Appl., 12(1), 38—46.

[17] S. Kim and M. Kojima (2004), “Numerical stability of path tracing in polyhedral
homotopy continuation methods”, Computing, 73, 329-348.

[18] A. G. Kushnirenko (1976), “Newton Plytopes and the Bezout Theorem”, Func-
tional Analysis and Appl, 10(3), 233-235.

[19] Y. C. Kuo and T. Y. Li (2008), “Determine whether a numerical solution of a
polynomial system is isolated”, J. Math. Anal. Appl, 338(2), 840-851.

[20] T. Lee and T. Y. Li,“Mixed volume computation, A Revisit”,(Submitted).

[21] T. Lee, T. Y. Li and C. Tsai,“ HOM4PS-2.0: A software package for solving
polynomial systems by the polyhedral homotopy continuation method”,
(Submitted).

[22] T. Y. Li (1997), “Numerical solution of multivariate polynomial systems by ho-
motopy continuation methods”, ACTA Numerica, 399-436.

[23] T. Y. Li (1999), “Solving polynomial systems by polyhedral homotopies”, Taiwan
J. of Math., 3, 251-279.

[24] T. Y. Li (2003), “Solving polynomial systems by the homotopy continuation

method”, Handbook of numerical analysis, Vol. XI, Edited by P. G. Ciarlet,
North-Holland, Amsterdam.

[25] T. Y. Li and X. Li (2001), “Finding mixed cells in the mixed volume computa-
tion”, Foundation of Computational Mathematics, 1, 161-181.

[26] T. Y. Li and X. Wang (1996), “The BKK root count in CT”, Math. Comp., 65,
no. 216, 161-181.

[27] T. Y. Li, T. Sauer and J. A. Yorke (1989), “The cheaters homotopy: an efﬁcient
procedure for solving systems of polynomial equations”, SIAM J. Numer. Anal,
26, 1241-1251.

[28] T. Y. Li and Z. Zeng (2005), “A rank-revealing method with updating, down-
dating, and applications”, SIAM J. Matrix Anal. Appl., 26, 918-946.

[29] T. Mizutani, A. Takeda and M. Kojima (2007), “Dynamic enumeration of all
mixed cells”, Discrete Comput. Geom, 37, 351-367.

[30] A. MORGAN, (1987), Solving polynomial systems using continuation for engi-
neering and scientiﬁc problems, Prentice-Hall, New Jersey.

[31] A. P. Morgan and A. J. Sommese (1987), “A homotpy for solving gen-
eral polynomial systems that respect m-homogeneous structures”, Appl. Math.
Comput.,24,101-113.

[32] A. P. Morgan and A. J. Sommese (1989), “Coefﬁcient-parameter polynomial
continuation”, Appl. Math. Comput., 29, 123-160. Errata: Appl. Math. Comput.,
51, 207 (1992).

[33] A. P. Morgan, A. J. Sommerse, and C. W. Wampler (1992), “A power series
method for computing singular solutions to nonlinear analytic systems”, Numer.
Math., 63(3), 1779-1792.

[34] MP1: http: //www.mpi-forum.org.

[35] V. W. Noonburg (1989), “A neural network modeled by an adaptive Lotka-
Volterra system”, SIAM J. Appl. Math., 49, 1779-1792.

[36] J. M. Rojas (1994), “A convex geometric approach to counting the roots of a
polynomial system”, Theoret. Comput. Sci., 49, 105—140.

[37] C. Traverso, The PoSSo test suite examples, Available at:
www.inria.fr/saga/POL

86

[38] J. Verschelde (1999), “Algorithm 795: PHCpack: A general-purpose solver for
polynomial systems by homotopy continuation”, ACM Trans. Math. Softw, 25,
251-276.

[39] Y. Zhuang (2007), “Parallel Implementation of Polyhedral Homotopy Methods”,
Ph.D. thesis, Univesity of Illinois at Chicago.

87

   

11]]1][]1]][[1[1[1]]:

...1,