} LIBRARY :00"? Michigan State University This is to certify that the dissertation entitled ALGORITHMS FOR SOLVING POLYNOMIAL SYSTEMS BY HOMOTOPY CONTINUATION METHOD AND ITS PARALLELIZATION presented by Chih-Hsiung Tsai has been accepted towards fulfillment of the requirements for the PhD. degree in Mathematics n I , \-/ I, r \ /&A - / .e___\_ C 1 Major Profe’ssor’s Signature 7/79//08’ Date MSU is an afiinnative-action, equal-opportunity employer .1--.-.-I-I-I-O-l-I-O-I-I-I—O—O-l-I-O-I-I-I-U—l-l-I- 4 -4-4-s-a-I-n-A-I-u-v-l-O-I-n-o-l-I-lh n.-.-.—v- - _ . PLACE IN RETURN BOX to remove this checkout from your record. To AVOID FINES return on or before date due. MAY BE RECALLED with earlier due date if requested. DATE DUE DATE DUE DATE DUE 5108 K:IProj/Acc&Pres/ClRC/DateDue‘indd ALGORITHMS FOR SOLVING POLYNOMIAL SYSTEMS BY HOMOTOPY CONTINUATION METHOD AND ITS PARALLELIZATION By Chih-Hsiung Tsai A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Mathematics 2008 ABSTRACT ALGORITHMS FOR SOLVING POLYNOMIAL SYSTEMS BY HOMOTOPY CONTINUATION METHOD AND ITS PARALLELIZATION By Chih—Hsiung Tsai HOM4PS-2.0 is a software package in FORTRAN 90 which implements the polyhedral homotopy continuation method for solving polynomial systems. It updates its original version HOM4PS in three key aspects: (1) New method for finding mixed cells, (2) New idea of following homotopy paths, (3) New way of dealing with the curve jumping. The parallel version of HOM4PS-2.0, named HOM4PS-2.0para, parallelizes the three main stages in HOM4PS-2.0. Excellent scalability in the numerical results shows that the parallelization of the homotopy method always provides a great amount of extra computing resources to help solve polynomial systems of larger size which would be very difficult to deal with otherwise. To my parents iii ACKNOWLEDGMENTS I would like to give my immense gratitude to my dissertation advisor, Professor Tien—Yien Li, for his constant encouragement and support during my graduate study at Michigan State University. His knowledge, insights and enthusiasm are invaluable. I would also like to express my thank to Professor Chichia Chiu, Professor Jeanne Wald, Professor Guowei Wei , and Professor Zhengfang Zhou for their time and valuable advice. I am grateful to my classmates Jules, Tianran, Alice and Ying for their help and useful comments. Last, but not least, I want to thank Jamie for her support during these years. iv TABLE OF CONTENTS List of Tables Introduction Polyhedral Homotopy Method 2.1 Bernshtein’s Theorem ........................... 2.2 Polyhedral Homotopy ........................... HOM4PS-2.0 3.1 Mixed cell computations ......................... 3.2 Following homotopy paths ........................ 3.2.1 Constructing the polyhedral-linear homotopy .......... 3.2.2 Classical linear homotopy .................... 3.3 On curve jumping ............................. 3.4 Miscellaneous aspects of HOM4PS-2.0 ................. 3.4.1 Evaluating polynomials and derivatives ............. 3.4.2 Scaling of the coefficients ..................... 3.4.3 The end game ........................... 3.5 Numerical results ............................. 3.5.1 The performance of HOM4PS-2.0 in dealing with curve jump- ing and diverging paths ...................... 3.5.2 HOM4PS-2.0 vs. HOM4PS ................... 3.5.3 HOM4PS—2.0 vs. PHCpack and PHoM ............ HOM4PS-2.0para-Parallelizing HOM4PS—2.0 4.1 Parallelizing the computation of mixed cells .............. 4.2 Parallelizing the curve tracing of polyhedral homotopy ........ 4.3 Parallelizing the curve tracing of the classical linear homotopy . . . . 4.4 Parallelizing the checking of possible curve jumping .................................. 4.5 Numerical results ............................. 4.5.1 The performance on large systems ................ 4.5.2 Scalability ............................. 4.5.3 The multiple precision ...................... Conclusion and future work Bibliography vi 0501 19 19 24 24 3O 31 35 35 38 41 43 46 49 51 55 56 58 67 71 77 77 80 82 83 84 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4.1 4.2 4.3 4.4 4.5 4.6 4.7 LIST OF TABLES T (7123, M=6) ............................ The polynomial systems ......................... The performance of HOM4PS-2.0 .................... Comparison of the classical linear homotopy and the polyhedral—linear homotopy in HOM4PS-2.0 ........................ Comparison of HOM4PS and HOM4PS—2.0 .............. Comparison of HOM4PS-2.0 and PHCpack .............. Comparison of HOM4PS-2.0 and PHoM ................ Maximum sizes of polynomial systems that can be solved by PHCpack, PHoM, and HOM4PS—2.0 within 12 hours of cpu time ........ fT ..................................... Bixlib ................................... Solving systems by the polyhedral-linear homotopy with 1 master and 7 workers ................................. Solving systems by the classical linear homotopy with 1 master and 7 workers .................................. The scalability of solving systems by the polyhedral-linear homotopy The scalability of solving systems by the classical linear homotopy . . vi 37 45 46 48 64 Chapter 1 Introduction Let P(x) = 0 be a system of n polynomial equations in n unknowns. Denoting P(:c) = (p1(:c), . . . ,pn(:r)) with :1: = (3:1, . . . ,xn), we want to find all isolated solutions of p1(:r:1,...,:cn) = 0, in C". The problem of solving systems of polynomial equations arises in many scientific and engineering applications. Reliable and efficient numerical approaches to solve this problem have long been desired. One such approach is the homotopy continuation method proposed in 1977 by Garcia and Zangwill [11] and Drexler [7] independently. 1 Since then, this approach has undergone quite a bit of development. In the beginning, the employment of linear homotopies was standard. Then in 1995, Huber and Sturm- fels [15] suggested the use of polyhedral (nonlinear) homotopies based on Bernshtein’s combinatorial root count [1] yielding drastic improvements over the classical linear homotopy method for solving sparse polynomial systems. Ultimately this is because Bernshtein’s root count is a much tighter bound on the number of solutions than the Bézout number. The software package HOM4PS deveIOped by T. Y. Li and T. Gao since 1999 implemented this approach. Huber and Sturmfels [15] also recommended the use of the combined polyhedral- linear homotopies, but these were impractical due to difficulties with efficiency and stability. In 2007, T. Y. Li and C. H. Tsai, utilizing exponential-logarithmic transfor- mations, were able to successfully implement combined polyhedral-linear homotopies, and, released with T. L. Lee an update to HOM4PS, called HOM4PS-2.0. Its par- allel version HOM4PS-2.0para which utilizes multiprocessors soon followed. These latest two packages show considerable speed-up and increased ability to handle larger systems. This thesis presents the details of the strategies used to implement the homotopy continuation method in the codes HOM4PS-2.0 and HOM4PS-2.0para. A sketch of the theoretical underpinnings and historical development of the homotopy continua- tion method will be presented. Techniques used to overcome computational difficulties in implementing the polyhedral continuation method will be outlined. 2 In Chapter 2, we first review the basics of the polyhedral homotopy continua- tion method; this includes, the definition of the mixed volume of the support of a polynomial system as well as Bernshtein’s theorem concerning root counts. Next, a practical method of computing mixed volumes, using the mixed cells of the support, which leads to the construction of the corresponding polyhedral homotopies, will be discussed. The details of finding the mixed cells will be left to Chapter 3. In Chapter 3, we will present the latest optimizing strategies used in HOM4PS- 2.0, which considerably upgrades its original version HOM4PS. There are three major steps in HOM4PS-2.0: 0 Mixed cell computations 0 Following homotopy paths 0 Dealing with ‘Curve Jumping’. Sections 3.1-3.3 take up the treatment of these issues. Other miscellaneous aspects of HOM4PS-2.0 are listed in Section 3.4. These include evaluating polynomials and derivatives, scaling coefficients, and determining criteria for path convergence and divergence, i.e., the end game. Numerical results of HOM4PS-2.0 are shown and discussed in Section 3.5. These results clearly show that HOM4PS-2.0 is very efficient and powerful. As the size of the polynomial system becomes larger, more computing resources are required. And a natural way to allocate extra computing resources efficiently is to perform independent tasks simultaneously in parallel. Since each isolated zero is 3 computed independently of all the others in the homotopy continuation method, it provides a natural environment for the parallelization. In Chapter 4, we present a par- allel implementation of HOM4PS-2.0, HOM4PS-2.0para, in which each of the three main stages above are parallelized using multiple processors. A simple master-workers parallel computation environment is used, in which one master CPU communicates with all worker CPUS using the Message Passing Interface, MP1 [34], to dynami- cally allocate tasks among the latter. The first four sections of Chapter 4 describe respectively, the parallel implementations of the mixed cells computation, the curve tracings of the polyhedral-linear homotopies, the curve tracings of the optional clas- sical linear homotopy, and the checking for possible curve jumping. In Section 4.5, numerical results of HOM4PS-2.0para on the bench mark systems, eco—n [30], katsura- n [3], noon-n [35], reimer-n [37], and cyclic—n [2], are summarized. The near perfect speed-ups in the results clearly demonstrate that our dynamic allocation algorithms effectively utilize the processing capacities of a parallel architecture to implement the polyhedral-linear homotopy continuation method for solving previously unmanage- able polynomial systems of larger size. Conclusions and directions for future work are given in Chapter 5. Chapter 2 Polyhedral Homotopy Method For sparse polynomial systems, Bernshtein’s combinatorial root count gives a much tighter bound on the number of isolated solutions than the Bézout number. It thus provides a starting point for an alternative approach to the classical linear homotopy continuation method which will reduce the number of homotopy paths to be traced. In this chapter, we first review Bernshtein’s theorem on the exact root count in (0’)" for a generic system, where C" = C\{O}. Next, we discuss an important extension by Li and Wang [26] for the root count in C”, followed by the description of the actual procedure that uses Bernshtein’s root count to find all isolated solutions, i.e., Huber and Sturmfel’s polyhedral homotopy method. 5 2.1 Bernshtein’s Theorem Let P(:r) = (p1(a:), . . . ,pn(:r)) be a system of n polynomials in n unknowns with :1: = (1131,...,IL‘n), where pj(a:) = Z c3103“, j = 1,. . . ,n. (2.1) aESj Here as“ = 3:61l1 ~43,” where a = (a1, . . . ,an) with coefficient cia E C" = C\{0}. The index set Sj, denotes the support of pj(:r), is the set of all monomial powers of 103(33)- The convex hull K, = conv(.S'J-) in R” of Sj is the Newton polytope of pi (cc) and S = (31, . . .,Sn) is called the support of the system P(:r). For non-negative real variables A1, . . . , An, write the Minkowski sum A1K1+~~+AnKn={A171+---+An~yn|'yj EKj,j=1,...,n}. The n-dimensional volume Voln()\1K1 + + AnKn) is known to be a homo- geneous polynomial of degree n in the variables A1, . . . , An. The coefficient of the monomial A1/\2 - - - An in this homogeneous polynomial is called the mixed volume of the polytopes K1, . . . , Kn, denoted by M(K1, . . . ,Kn). Alternatively, it is called the mixed volume of the support of the system P(:1:) = (p1(:r), . . . ,pn(x)), denoted by M(Sl, . . . , Sn), or just the mixed volume of P(:r) for short. Now, let the coefficient cjfl of each monomial in the original system P(x) = 6 (131(1), . . . , pn(a:)) in (2.1) be considered as a variable in its own right, resulting in the associated system P(c, z) = (p1(c,:z:), . . . ,pn(c, 23)) where pj(c,a:) = Z cj,aa:a, j = 1,. . . ,n. (2.2) aESj The original system P(zr.) = P(c*, :r) is simply the expanded system P(c, :13) evaluated at a set of specified values of coefficients c* = (c; a)' Recall that the root count of a polynomial system is simply the total number of isolated zeros, counting multiplicities. Lemma 2.1 [14] For polynomial systems P(c, 11:) in (2.2), there exists a polynomial system C(c) = (gl(c), . . . ,gl(c)) in the variables 0 = (Cj,a) for a E Sj andj = 1,. . .,n such that for those coefficients c* = (Cia) for which G(c*) sé 0, the root count in (C*)" of the corresponding polynomial system in (2.2) is a fixed number. And the root count in (0‘)" of any other polynomial system in (2.2) is bounded above by this number. Remark 2.2 Since the zeros of the polynomial system C(c) in the above lemma form an algebraic set, its complement is Open and dense with full measure. Thus, a polynomial system P(c*, :13) in (2.2) with G(c*) # 0 is said to be in general position. Theorem 2.3 [1] The root count in (C*)” of a polynomial system P(z) = (p1(:r), . . . , pn(z‘)) in general position equals the mixed volume of its support. 7 The above root count is, in practice, a much tighter bound than the Bézout bound [31] and is also known as the ”BKK bound” after its inventors, Bernshtein [1], Kush- nirenko [18], and Khovanskii [16]. An extension of the Bernshtein Theorem that counts the roots in C” as opposed to (C*)” was provided by Li and Wang [26]. Theorem 2.4 The root count in C" of a polynomial system P(x) = (p1(:c), . . . ,pn(a‘)) with support S = (S 1, . . . , Sn) is bounded above by the mixed volume M(31 U{0}, . . . ,Sn U{0})- In other words, simply augment each polynomial in the original system P(x) by adding a constant term, if one is missing, namely adding the point 0 = (0,. ..,O) to each support Sj. Then, the mixed volume of the augmented supports, which is the root count in (C*)" of the augmented polynomial system in general position, is the upper bound for the root count in C” of the original system P(x). Hence, if the constant term were already present in each polynomial in the original system P(x) = O, i.e., its support 83- = Sj U{0}, then the upper bound for the root count in ((C*)" would be the same as that in C". Furthermore, for a generic system with these supports (with all constant terms included), # of solns in (C*)n = M(Sl, . . .,Sn) = # of solns in C". So no isolated solution of a system in general position in which all constant terms are present has a zero component. 2.2 Polyhedral Homotopy The earliest form of the homotopy continuation method for solving polynomial system P(z) = (p1(x),...,pn(:r)) = 0 consisted of first finding an easily solved system Q(:r) = (q1(:r), . . . ,qn(:r)) = 0, next defining a linear homotopy H : C" x [0,1] —» C” by H(:c, t) = (1 — t)Q(:r:) + tP(x), (2.3) and then numerically following the curves in the real variable t which make up the solution set of H (x, t) = 0. Here the homotopy H (as, t) is linear in t. More generally, homotopies may be constructed to be nonlinear in t. In order for the continuation method to work, the homotopy H (x,t) must be chosen so that the following three properties hold: 0 Property 1 (Triviality) The solution points of the starting system H(:1:,0) = Q(:1:) = 0 are known. 0 Property 2 (Smoothness) The solution set of H (11:, t) = 0 for 0 S t < 1 consists of a finite number of smooth paths, each parameterized by t in [0, 1). 0 Property 3 (Accessibility) Every isolated solution of the original target system H(:I:, 1) : P(x) = 0 can be reached by some path originating at t = 0; i.e. at some solution point of the starting system H (2:, 0) = Q(:z:) = 0. If all three properties hold, then all solutions of the original problem P(x) = 0 can be obtained by tracing the solution paths of H (x,t) = 0, each starting from 9 an initial starting point which is a solution of the known system Q(:r) = 0. Note that while divergence of a path to infinity in the middle of the interval (as t —> to < 1) is precluded by the smoothness property, it is still consistent with the above properties that a solution path of H (x, t) = 0 may diverge to infinity as the parameter t approaches 1. Thus, some of the paths emanating from the roots of Q(ar), may be extraneous, and diverge to infinity as t ——> 1. Since in practice it is not known ahead of time which path is extraneous, each path must be followed until it is clear that it diverges or a solution point of P(z) = 0 is reached. Divergent paths'thus represent wasted computation effort. Now we outline Huber and Sturmfel’s polyhedral homotopy method to find all iso- lated solutions in C" of the given polynomial system P(x) 2 (191(3),. . . , pn(:c)) with supports S = (51,...,Sn). First, all polynomials without constant term are aug- mented with the constant monomial 1:0 = 1:9 . - «1:9, = 1, resulting in the correspond- ing augmented supports S j, 2 S7 U {0}. A generic system Q(a:) = (q1(z‘), . . . , qn(:r)) with the augmented support S I = (51’ , . . . , S71) is constructed with randomly chosen coefficients. By Theorem 2.4, the root count of Q(;r) in C” is the mixed volume I I M(Sl , . . . , Sn) of the augmented supports. The first step in Huber and Sturmfel’s polyhedral homotopy method is to solve the system Q(:1:) = 0 in general position by means of polyhedral homotopies which will be discussed below in detail. It will be shown that the number of paths needed to be followed in this step is M(Sl,, . . . ,S,’,) which is the root count of Q(a:) in C", 10 and no paths diverge. The second step in the polyhedral homotopy method is to solve the original system P(z) = 0 by means of the linear homotopy H(:z:, t) = (1 — t) 7Q(a:)+t P(z), t 6 [0,1] with generically chosen complex 7, also known as the so—called cheater’s homotopy [27] (or the coefficient—parameter continuation [32]). In this step, the linear homotopy paths of H (1:, t) = 0 are followed from their starting points (which are the solutions of Q(.v) = 0 found in the first step). It is known that all isolated solutions of P(z) = 0 can be attained, which was our original goal. Here, some paths may diverge. Since the number of starting points, and hence the number of paths, is the root count of Q($), the number of paths needed to be followed in this step is also M(Sll , . . . ,Sé). Hence the total number of paths needed to be followed in both steps is 2M (SI, , . . . , 3,1). We begin our discussion of how to solve the system Q(:1:) = 0 using polyhedral homotopies. Writing a: = E 1:“, (M ) EGGS; 1,a QUE) = (2 4) :1: = 5 a“, (Inf ) ZaESé 71,0 I note that all coefficients '6 = (Ejfl) for a E S j and j = 1,. . . ,n are chosen at random to ensure that Q(:r) is in general position. To construct a nonlinear homotopy with Q(:I:) as the target system, first define a generic lifting w = (0.21, . . . ,wn) on the support S, = (81],. . .,S4) of Q(:c) where 11 I I each wj : S j —-> R, j = 1, . . . , n is a random real-valued function on S 3" Each wj lifts I , .I . I Sj to its graph Sj = {a = (a,wj(a)) | a E Sj}. Define the polyhedral homotopy Q : C" x [0, 1] —+ C” by adding a power of t given by the above random lifting to each term in Q(:r) to obtain Q(:I:,t) 2 (91(1', t), - . . , Qn(:c,t)) where é1($,t) = EGGS; Elflxatwlm), Chant): g (2-5) (jn($,t) = Z I Cn,a$atwn(a). aESn In order to apply the continuation method, we must demonstrate that Q(:r,t) satisfies all three of the desired properties given in the beginning of this subsection (triviality, smoothness, and accessibility). Clearly the target system is Q(a:, 1) = Q(:I:) = 0. It can be shown that for any fixed to > 0, the system Q(a:, to) is in general position having all nonzero constant terms present which implies that it has mixed volume k :2 M(Sli, . . .,STZ) isolated solutions in (C*)”. Then the entire zero set of the homotopy Q(:1:, t) consists of k distinct homotopy paths :r(1)(t), . . . , 113(k)(t) in (C*)n ending at the distinct solutions of Q(z) = 0. And again, since each éj(x,t) has nonzero constant term for all j : 1,...,n, by a standard application of the generalized Sard’s Theorem [5], each of the k solution curves of Q(:r,t) is smooth with no bifurcations. Therefore, Property 2 (smoothness) holds. However, at t = 0, the start system Q(:I;,0) E 0 which means that the starting 12 points z(1)(0), . . . ,z(k)(0) of the corresponding homotopy curves :1:(1)(t), . . . ,a:(k)(t) can not be identified. So the polyhedral homotopy Q(:r:,t) does not satisfy Property 1 (triviality) and Property 3 (accessibility). So while every isolated solution of the target system may be reached by a homotopy curve of Q(1:,t) = 0, it is not clear where it originates from. This difficulty is resolved by the following technique. We first find all vectors (3: = (a, 1) E Rn‘l'l matched with a collection of pairs of , I I I I I . pornts ({a1,a1}, . . . , {an,an}) from the support S = (81,...,Sn) of Q(x), which together satisfy the following conditions a Each pair of points is chosen from each support, . I I I I I I . . i.e. {a1,a1} C 51,. . . , {aman} C Sn, so that {a1 —a1, . . . ,an —an} 18 linearly independent in IR", and 0 Taking (-, ) to be the usual inner product in Euclidean space and a = (a, wj(a)) E , I I ' R’Hl where a E Sj and wj : S —> IR is the corresponding lifting, for j = 1,...,n, (a ,d) -— (a ,(3I) J J (2.6) (End) > (aj,d), for all a E S; \ {drag}. Geometrically, d = (a, 1) represents a linear functional, or alternatively, the normal of parallel hyperplanes in Rm”, which supports the convex hull of the graph S J, of S J, , at exactly two points {aj, a3} C S J, for j = 1,. . . , n. Such a collection of pairs of points ({a1,a’1},. . . , {am a;}) satisfying the above conditions is called a mixed cell 13 of S I = (51,, . . . , 37],), and its associated vector 0 is called its inner normal. Finding all mixed cells and their associated inner normals a is one of the key steps of the polyhedral homotopy method. We will treat the actual mixed cells computation in Chapter 3 and Chapter 4. Having found such a mixed cell ({a1,a,1}, . . . , (an, a;}) together with its asso- ciated inner normal 0 where a = (0:1, . . . ,an), define a change of variables between :r = (3:1, . . . ,l‘n) and y = (311,. . .,yn) by :1: 2 pt“ or alternatively, y = t—ax where 331 = tall/1, 29/1 = F0111, or 3 (2-7) IFn = tanyn, yn = t_an-Tn- With this change of variables, any monomial will be transformed as follows: a a 3:“ = $11....11nn = finial)“1 -~(ynta")a” gntala1+~+anan a : y11"'y _ yat(a,a). Thus each (jj(z,t),j = 1,. . . ,n of the homotopy Q(:r, t) in (2.2 5) becomes 14 (lj(yt0‘,t) = Z Ej,ayat(a’alt“’j(a) a633- : Z 5]. ayat<(a’wj(a))a(ar1)> I aESj I aESj Forj=1,...,n, let ,Bj = min/(61,6!) aESj By the conditions defining the mixed cell ({a1,a,1}, . . . , {an,a:,}), we have fij = (aj,d) = ((13,631) forj=1,...,n. Now define Ha(y,t) = (h?(y,t),...,h%(y,t)) by “ factoring ” out the lowest powers of t in each of the (17(th t)- q : _'Bj A' a Z -' 6,6!) -BJ h] (y,t) t q,(yt .0 2,1653 Cmy t (“1,62 --5 = Z: 63’ Cmya + Z 63’ Cmy t< ) J (éadl—IBJ (d’a>>6] I = 53a 3] '7 +0. lyj + Z I CJfly “(LOO-fl] ,a . aES- .7 J (d,&)>fij Since exactly two inner products equal the minimum fij for each j, our new homotopy H 0(y, t) has exactly two terms without t. This homotopy retains most of the properties of the previous homotopy Q(x, t) = 15 0; in particular Property 2 (smoothness) still holds, because both H “(y, t) and Q(sc, t) have the same support and are in general position for fixed to > 0. The entire solution set of Ha(y, t) = 0 consists of k distinct smooth homotopy paths 31(1) (t), . . . , y(k) (t) in (C*)” which are transformed versions of the solutions :r(1)(t), . . . ,.’E(k) (t) of Q(z, t) = 0. However, by the transformation (2.7), we have x = y at t = 1. Hence the solutions to the target system Ha(y, 1) = Q(y, 1) = Q(y) = 0 are exactly the same as the solutions to the target system Q(Lr, 1) = Q(z) = 0. Moreover, we have an easily solvable start system, known as the binomial system, I ha .0: E 0:5 “1+5 “1:0, 1(y. ) Sues/1 Lay l,aly 1,ally Holt/,0) = , g (2-9) I hifiy, 0) = Z I Enut/a = Emanyan + E I ya" = 0. 0.63" ' Than An algorithm to solve binomial systems is laid out in Section 4.3. Thus Property 1 (triviality) is now satisfied. Since the homotopy curves originating from the solutions of (2.9) can be traced to reach a partial set of isolated zeros of Q(y) at t = 1, Property 3 (accessibility) partially holds. Proposition 2.5 [22] The binomial system I 61013/01 + 5 [gal = 07 ’ 1,al (2.10) I Cn’anyan + En a] ya” = 0. , n 16 has I I kaz=|det(a1— a1, . . . ,an — anll nonsingular isolated solutions in ((C*)"'. The number ha is called the volume of the mixed cell ({a1,a,1}, . . . , (an, a;}). The existence of such mixed cell ({a1,a,1}, . . . , {an,a:,}) along with its corresponding inner normal 0: E IR” for which condition (2.6) holds and hence for which H a (y, 0) = 0 is a binomial system is guaranteed by the following proposition. I Proposition 2.6 For all generically chosen real functions wj : S j —-* IR, 3' = 1,. . . ,n there exists a 6 IR”, for which the start system Ha(y,0) = 0 of the homotopy H a(y,t) = 0 in (2.8) is a binomial system having a nonempty set of nonsingular solutions in (C*)"’, i.e., ka 75 0 in (2.10). This proposition was proved implicitly in [15] and [22] in the context of combinatorial geometry. In [15], Huber and Sturmfels proved that I I M(s,,...,s,,)=§:ka, a i.e., the mixed volume of Q(y), the total number of isolated zeros‘of Q(y), is the I I sum of the volumes has of the mixed cells corresponding to all possible a s for which H “(y, 0) = O is a binomial system. This gives an alternative approach to Bernshtein’s Theorem, different from the original paper [1]. 17 Clearly, different inner normals a 6 IR" induce different homotopies H “(y, t) = 0 in (2.8). Phrthermore, by examining the Puiseux series expansions of the homotopy solution curves, we conclude that any two sets of isolated zeros of Q(y) are disjoint, if they are reached by corresponding sets of solution paths belonging to two distinct homotopies. Therefore, every isolated zero of Q(y) can be obtained by tracing a solution curve of a homotopy H “(y, t) = 0 induced by some oz 6 IR” given by Proposition 2.6. 18 Chapter 3 HOM4PS-2.0 Based on the polyhedral homotopies given in Section 2.2, the software package HOM4PS, developed over the years by a group led by T. Y. Li at Michigan State Univer- sity, implemented this approach for approximating all the isolated zeros of P(x) = (p1(.1:), . . . ,pn(:1:)). HOM4PS-2.0 is a new version in FORTRAN 90 which updates HOM4PS with remarkable speed-ups. In this chapter, we will explain the details of the three main stages in HOM4PS-2.0 that account for the major improvement over HOM4PS: mixed cell computation, following homotopy paths, and checking for curve jumping. 3.1 Mixed cell computations When the polyhedral homotopy is employed to find all isolated zeros of P(IE) = (p1(:r). . . . , pn(:r)), the process of locating all the mixed cells during the mixed volume computation plays a crucially important role [22-24]: The mixed volume determines 19 the number of solution paths which need to be traced and the mixed cells provide starting points of the solution paths. For polynomial system P(r) : (p1(a:), . . . ,pn(a:)) with support S = (51,. . .,Sn), let wj : Sj ——> R be a random lifting function on Sj which lifts Sj to its graph Sj = {a = (a,wj(a)) : a E 53'} C Rm” forj = 1,...,n. Here we assume that Sj = Sj U{0}, i.e., all polynomials pj (:17) in the system have constant terms. See Section 2.2. Recall that a collection of pairs {aha/1} C 31,.. ., {amag} C Sn is called a mixed cell if there exists d 2 (oz, 1) E Rn'H with OI E R" such that (a,,a) = (fig-,0?) < (as) for a e sj\ {gag}, j=1,...,,n I A and a is called the inner normal of this mixed cell. For 1 S i S n, e = (“,5 } C Si: is called a lower edge of S,- if there exists a = (a, 1) E Rn+l for which the following relations hold: (as) = a, >_< > .,\[., .3] A A For a level-k subface E‘k == (éi1""’éik) of S = (S1,...,Sn) with 1 S k < n I A where éij = {a,-J.,a,.,} E L(S,-j) for j = 1,...,k, we say that the lower edge .7 } E L(S,- for certain ik+1 E {1,2,...,n}\{i1,--- ,ik} A A AI ezk+1 _ {a1k+1’aik+1 k+1) extends the level-k subface Ek if Ek+1 == (éil’ . . is a level-(k + 1) subface "eik+1) of S = (81,. . .,Sn). We say Ek is extendible in such situations. A main strategy for finding mixed cells is the extension of level-k subfaces El: of S 2 (S1, . . . , Sn) starting from k = 1 and an extendible Eh when k = n—l yields mixed cells of S 2 (S1, . . . , Sn) induced by elements in Sin- In [8—10, 25], the order of this extension i1, i2, . . . is predetermined and fixed, that is, one always starts from extending lower edge él of S1 to a level-2 subface (é1,é2) with ég E L(S2), then extend (é1,é2) to a level-3 subface (é1,é2,é3) with (33 E L(S3) etc. Software package MixedVol [10] for the mixed cells computation that was adopted in the original HOM4PS was developed along this line of approach. In [29], the novel idea of dynamic enumeration of all mixed cells emerged where dynamic means that the order of the subface extension will not stay fixed. To extend 21 a particular level—k subface . . . . . A] A . Ek=(e,-1,...,eik), 1$k R; i.e., éj(x,t) = 2063]. Ejflzatwfla) forj = 1,. . .,n. Then a linear homo- topy H (:r,t) = (1 — t) 7Q(:r) + tP(:c), t E [0, 1] with generically chosen complex 7, is constructed to reach all isolated solutions of P(z) = 0 by following the smooth so- lution paths of H (2:, t) = 0 emanating from the solutions to Q(zr) = 0 found above. The number of paths needed to be followed in each step is the mixed volume of the polynomial system. In HOM4PS-2.0, these two steps were combined in one step by considering the polyhedral-linear homotopy H(.:c,t) = (h1(a:,t),...,hn(:c,t)), x = (2:1,...,:cn), tE [0, 1] 24 where fly-(22¢): Z ((1—t)Ej,a+th-,a)xatwj(a), j=1,...,n. aESj Note that H(a:, 1) = P(x). For a given mixed cell C = ({a11,a12}, . . . , {an1,an2}) with inner normal a E IR", where {aj1,aj2} C Sj for each j = 1,...,n, af- ter applying the change of variables a: = yto‘ where y = (y1,...,yn) and 933- = yjtaj for j : 1,...,n, and keeping the variable a: in place of y, we reach the homotopy f1(:c,t) = (h1(:r,t), . . . ,hn(z,t)), t E [0,1], where for j = 1,...,n hj(a:,t) = Z [(1 — t75j,a + tcj,a]a:at(&’d), with a = (a,wj(a)) for a E Sj. aES- .7 Letting fij = min (End) for j=1,...,n aESj and “factoring out. the lowest power of t” yields A A H(a:,t)=(h1(:r,t),...,hn(:r,t)), where tE[O,1] and in“): Z [(1—t)5j,a+tcj,a]:cat(<&’d>_fi7), for j=1,...,n. (3.1) aESj Note that we still have H (IE, 1) = P(rr), because a: = y at t = 1 In following the solution paths of 1:1($,t) = 0 by the prediction-correction method, the first step of the predictor at t = 0 cannot be taken if a power of t in H (11:, t) is less than one, since Ht(a:,t) would then be undefined at t = 0. If the minimum power of t in 25 = tom (3.1) is, say, tO'Ol, then changing variables with T would solve the immedi- ate problem. But it would reduce numerical stability and computational efficiency if tI’OOO, were also contained in 170:, t). Then the tangent large powers of t, such as vector :I'; 2 film—1 * Ht would contain the terms in the order of 100,000 * t99’999 which, if evaluated at any t E [0,1), would give 0. Close to 1, however, the tan- gent vector would become extremely steep, and step sizes for following the homotopy path would have to be correspondingly minuscule. Actually, while these sorts of problems already exist when “the polyhedral step” and “the linear step” are split as implemented in HOM4PS, they become multiply amplified when the combined polyhedral-linear homotopy is used. Ironically, notwithstanding the number of paths needed to be followed was cut in half by combining the polyhedral and linear steps, the difference between the computing times of the two approaches is almost negligible most of the time. In HOM4PS-2.0, we address this problem by applying the transformation 3 2 Int in (3.1), resulting in the homotopy H(:c,s) = (h1(a:,s),...,hn(x,s)), s E (—oo,0], where hj(.v,s) = Z [(1— e3)5j,a + escjaa]xaes*(<&’é)_fi7) for j = 1,. . . ,n. aESj Recall that mixed cell C = ({a11,a12}, {a21,a22}, . . . , {an1,an2}) with inner nor- 26 mal a E R" satisfies the relations (Ema) = , <éj1,d> < (&,d) V&ESj\{&j1,&j2}, j=1,...,n. So, in each hj(:r, 3), there are exactly two powers of 6 equal to 0, namely, for each j = 1,. . .,n, -=mm End = a-,(3I = d-,d. :8] aESj< > < 31 > < ]2 . Therefore at s 2 —oo, H (1:, —oo) becomes a binomial system, Ellxall + 512$a12 = 0, Enlzanl + 57,2270"? = 0, having [det(a11 — a12,...,an1—— an2)| number of nonsingular isolated solutions which provide the starting points for following the solution paths of H(x,s) = 0 from s 2 —00 to 0. We will detail the standard procedure for solving binomial systems in Section 4.3. As we can see here, the number of paths associated to the cell is the volume of the cell. If we repeat the same process with all the mixed cells, then the total number of the homotopy paths needed to be followed equals Vol (C) = mixed volume of the system . C :mixed cell 27 Since H(x,s) =H(:L',e3), thus for H(a:(s),s) = 0 we have dH(:r: (s), s): H(x, es)- -— 1733—: + Htes = 0. (3.2) d It follows that 32—: I 32—00 = 0 and the values of 23(3) stay close to invariant for large 5 negative 3. Thus, to keep H(x, 3) z 0, we choose 80 so that terms e 80*(_fij) for a E Sj\{aj1,aj2} are negligible for all j = 1, . . . , n, say on the order of 10‘s, as our starting 8 value for following the solution paths of H (:6, s) = 0. Since 5 E (—00, 0] the dominant or largest term with base 6 and exponent s * ((6nd) — fij) in the polynomial :__(S)CZ[1_6 _j,sj,a+ec axet]a'9*((\dd)—fij) aESj for all j: 1,...,n is given by p:=exp 3* min ((a, a)— B) , forj=1,...,n. aESj\{aj1 “32} J Setting II = 10—8 and solving for 3, yields —81n 10 n1ina€5j\{afl,aj2}( 3, the step size will be cut in half to minimize the possibility that the beginning predictor estimate was too far away from the curve. On the other hand, if in two consecutive stages the step sizes were not cut, we assume the curve is flat at this moment and will take the initial step size to be the minimum of doubling the previous step size and a third of the remaining distance to 0. As mentioned before, combining linear and nonlinear homotopies to reduce the number of solution paths needed to be followed in the polyhedral homotopy method 29 by half was originally suggested back in 1995 [15]. However, this idea was not suc- cessfully implemented in HOM4PS because of the involved stability and efficiency problems. Addressing those difficulties by the transformation 3 = lnt and para- meterizing the solution paths by s E (—oo,0] in HOM4PS-2.0 as shown above, a substantial improvement in algorithmic efficiency and stability has been achieved as evidenced by the results of intensive numerical experiments. This combination strat- egy is particularly important when the polyhedral homotopies are used to solve large problems where mixed volumes of the systems are more than millions. 3.2.2 Classical linear homotopy For some polynomial systems, such as katsura-n [3], noon-n [35] and reimer—n [37], the mixed volume of each system is the same or nearly the same as its total degree (or the Bézout number). In such situations, instead of employing polyhedral homotopies, they should be solved directly by following the total degree number of solution paths of the classical linear homotopy H(z‘,t) = (1 — t)7 Q(cc) + tP(:z:) = 0 with generic '7 E C where Q(x) = (q1(:z:), . . . ,qn(:r:)) with cc 2 ($131,” .,xn), ‘11 = 0113111 _ b1, (11 = degree of 191(23), (1n = 071ng _. bn, dn = degree of Pn(33) 30 and randomly chosen (a1,...,an,b1,....bn) E C2" . In this way, by avoiding polyhedral homotopies, very costly mixed cell computations for large systems are averted. Moreover, the s = lnt transformation for the powers of t used in the polyhedral-linear combination is unnecessary for a homotopy straightly linear in the parameter t. For large systems, this can also reduce a considerable amount of cpu time. We also implemented the algorithm of the classical linear homotopy for solving polynomial systems and made it an option in HOM4PS-2.0. 3.3 On curve jumping After following all the homotopy paths. we need to check if there are paths accidentally jumping to other paths during the process of paths tracing. This may result in missing zeros. Following two very close homotopy paths may increase the chance of curve jumping that can occur at each stage of the prediction and correction. When Euler’s method, or the cubic Hermite interpolation, is applied to predict a beginning estimate for the Newton correction of one homotopy path, the resulting point may become too close to the other homotopy path. Thus the correction sequence will converge to a point that is not on the desired path. Subsequently, continued with the prediction- correction procedure, we are in effect following the second path which will also be traversed independently beginning with its own starting point. From the numerical results, it looks as though two different curves each with different starting points both reach the same solution. On the other hand, reaching the same solution may 31 not indicate the occurrence of curve jumping because of possible singular solutions. For example, when solving cyclic-8 [2] and cyclic—9 [2] systems by the homotopy method, both have more than one curve leading to the same singular solution that lies on a positive dimensional solution component. Before dealing with the curve jumping, we wish, in the first place, to minimize the chance for curve jumping to happen during the curve tracing procedure. In HOM4PS-2.0, a more sophisticated selection for the tolerance parameters is designed for dynamically determining step size during the curve following. For instance, for Newton correction at s = sk, we consider any two consecutive iteration points x(m) N am“) — wt") II II 2:0”) n and :c(m+1) too far apart from each other if the relative error 10%. In this situation, we will repeat the prediction-correction process with the step size being cut in half. In addition, as mentioned before, if more than 3 steps of Newton’s iterations were required to converge within the desired accuracy, we yet again cut the step size in half to minimize the possibility that the beginning predictor estimate was too far away from the curve. As shown in Table 3.4 in Section 3.5.1, these collective treatments greatly decrease the occurrences of curve jumping. In fact, they never appear in most of the systems we solved. To check if curve jumping occurs, we must verify all the attained solutions to see if there are two solutions that are very close to each other. We may, for instance, consider two solutions to be numerically identical if the relative error of these two solutions is less than a chosen parameter co > 0. In HOM4PS, this task was done in a 32 quite straightforward manner, essentially each pair of solution points were compared. This will naturally become very costly when the number of solutions is big, say 100,000. Then there are 100, 000 * 99, 999/2 solution pairs, and the relative error of each pairs must all be computed. In HOM4PS-2.0, we divide all the solutions into different groups and only check solution pairs within the same group for closeness. For each isolated solution point z = (21, . . . , zn) E C" we will focus on the imaginary part of its first component 21 = A1 + Bli where A1, Bl E R. The decimal representation of 81 always has a positive or negative sign associated with it and the solutions will be divided into groups that are characterized by this sign as well as the chosen and fixed k-th digit and (k +1)-th digit after the decimal point of the decimal representation of Bl. Each digit place has ten possibilities {0, 1, 2, . . . ,9}. So in total, the solution set is divided into 200 groups, and each group of solution points is characterized by having the same sign, the same k-th digit bk and (k +1)-th digit bk+1 of the decimal representation of the imaginary part of its first component. Obviously, to locate solution pairs that are numerically identical, one only needs to compare solution pairs within the same group. Along the same line, the number of groups can certainly be increased if one wishes to deal with even bigger solution set. Our numerical results show that this technique for the curve jumping detection chopped off a big amount of computing time of HOM4PS especially when solving big systems. 33 Now, after two (or more) numerically identical solutions are detected, call it 5:, we first check the smallest singular value a of the Jacobian matrix Px(:c) evaluated at 51?. When a is bigger than a chosen threshold 61 > 0, then the solution 513 will be considered isolated and nonsingular. The curve jumping clearly occurs. In such cases, we retrace the two associated curves with smaller step sizes. When a < 61, and the solution :1": is isolated, we will infer that the two curves reach the same solution 513 and curve jumping does not occur. However, we can not rule out the possibility of the curve jumping if the solution it is a nonsingular point lying on a higher dimensional solution component Z. A point z E Z is called nonsingular if Weir-”1912) _ _ . TankC6($1,...,$n)(Z)—n dth. (3.1) To differentiate those cases, the algorithm developed in [19] is used to determine the dimension of the solution component to which the solution 5: belongs first, followed by checking the rank condition of :2 in (3.1). For the rank revealing of the Jacobian, we use the scheme developed in [28] rather than calculating the whole SVD (Singular Value Decomposition) of the matrix. When 5: is singular, it commonly attracts more than one different solution curves as in solving cyclic-4 and cyclic-8 systems [19]. Curve jumping is only allowed to exist if :E is nonsingular. 34 3.4 Miscellaneous aspects of HOM4PS-2.0 3.4.1 Evaluating polynomials and derivatives The prediction-correction process for following the homotopy paths of H (:13, s) = (h1(:1:,s),...,hn(x,s))= 0 where for j = 1,. . . ,n, hj(;1:,s) = Z [(1 — es)5j,a + escjaakcaesqlé’m—Hj) and :13“ = will 33%", aESj requires the computation of H (z,s),Hs(:r:,s), and the matrix Hx(:c,s) for fixed 3. What is essentially involved in evaluating H (13,3), H3(:c,s) and Hx(:1:,s) at a given point a: = (2:1, . . . ,.vn) are the evaluations of multivariate polynomials and their partial derivatives. In HOM4PS, a multivariate polynomial g(:1:1, . . . ,zn) was evaluated via Horner’s rule for univariate polynomials. By factoring out a variable, say 1:1, g(.v1, . . . ,a‘n) becomes a polynomial in 3:1 with coefficients in C[a:2, . . . ,xn]. By the same principle, those coefficients, as polynomials in one less variable, were evaluated by factoring out another variable. This may continue until the variables are exhausted. If multivariate polynomials are evaluated in this manner, the repeated compu- tation of the same powers of some variables seems inevitable. For instance, for the 35 System P($1,1‘2,$3) = (P1031,1'2,$3),p2($1,$2,$3),P3($1,$2,$3)) where p1 2*w?+3*2:g+5*2;g—1 p2 = 3*$?*z3+2*2:(15*2:g+4*23‘21*xg—5 (3-1) p3 = 5*x?*x%*x§—7, 23?, 2:3, and 2:3 appear in all pl, 122, and p3. When the above rule is applied, those quantities must repeatedly be computed. Forj=1,...,n, let MaxDegU) = max{aj|(a1, . . . ,an) E S1 U - - - U Sn} — maximum power of the variable :rj appearing in the entire polynomial system and M = .max MaxDeg(j) j=1,...,n = the largest power of all the variables in the entire polynomial system. In HOM4PS-2.0, a table T of size n X M is established to store all possible powers of 233', j .—_—. 1,...,n which may appear in H(z‘,s), H3(2:,s), and Hx(2:,s). The first row of T stores the monomials 2:1, 2%, . . . , witfaxDeg(1) , the second row stores the monomials 2:2, 23%, . . . ,233/1 ”Deg (2), and similarly, the last row stores the 36 2:1 23:12 xf/IaxDegU) $2 23523 JIQ/IamDegQ) a, $3, $.13“deng Table 3.1: T monomials rm, 2%,. ..,:r,1[1axDeg(n). Since Ha;(:r,s) contains the partial derivatives of x“ = 236111 - - 2%" that appear in H (27,3), and since HS(23,s) contains the same $0. = 23;ll m3," as H (23,5), table T stores all possible powers of 23j appearing in all of H(23,s), Hs(2:,s), and Hx(:r,s). Those powers that are involved in any monomial evaluations, no matter how often they appear, will never be repeatedly computed. For the system in (3.1), we have MaxDeg(l) = 6, Ma23De9(2) = 4, MaxDeg(3) = 5 and hence M = 6. The 3 x 6 table T is given as follows: $2 $2 $2 $2 $3 $3 $3 $3 $3 Table 3.2: T (n = 3, M = 6) The value of a monomial evaluated at a point 23 = ($1, . . .,23n) can easily be obtained from table T. For example, for n = 3, the quantity of £23203 is T(1, 2) * T(2, 3) * T(3,1). 37 3.4.2 Scaling of the coefficients Certain polynomial systems, such as cohn 2 and cohn 3 [6], have large coefficients. Large magnitudes in coefficients will result in large magnitudes in tangent vectors of the homotopy paths. This will affect the efficiency of the curve tracing because smaller step sizes need to be taken to follow the homotopy paths. The idea of scaling the system to reduce the magnitudes of the coefficients of the polynomials first appeared in [30]. We shall illustrate our scaling method by way of an example. Example Consider the following system of two equations in two unknowns 8000x§x§—2000x1+1 = 0 (3.2) 500021232—30 = 0. (3.3) To scale the variables, let 231 = 100121 and $2 = 1002 22, and to scale the equations, multiply (3.2) by 1063 and multiply (3.3) by 1064. This gives 1003(8000 * 102CI+202 2.sz - 2000 >i= 1001.21 +1) = 0 1064(5000 * 1001+02 2122 — 30) = 0. 10Elz¥z§—1052z1+1053 = 0 101342122 — 1055 = 0 38 where E1 2 2C} + 2c2 + C3 + loglo(8000) E2 2 c1 + C3 + loglo(2000) E3 = C3 E4 = c1 + c2 + C4 + loglo(5000) E5 = C4+log10(30). To have the numerical stability afforded by coefficients centered about unity, we want each E, to be close to 0. Furthermore, to reduce variability among the magnitude of the coefficients in each equation, we want the difference between each pair of Egs in an equation to be close to 0. Thus, setting r13 E¥+E§+E§+E§+E§ 1‘2 E [(131 - E2)2 + (E2 - E3)2 + (E1 - Ea)2l + [(134 - E5)2], we wish to minimize r = r1 + r2. More explicitly, r = (2c1 + 2C2 + 63 + log(8000))2 + (c1 + C3 +10g(2000))2 + c3 (3.4) +(c1 + c2 + C4 + log(5000))2 + (C4 + log(30))2 +(c1 + 2C2 + log(8000) — log(2000))2 + (2c1 + 202 + log(8000))2 +(c1 + log(2000))2 + (Cl + c2 + log(5000) — log(30))2. 39 While in [30] r is considered as a second degree polynomial in four unknowns C1, 62, C3, C4 and is minimized by the solution of 2i=0 fori=1,2,3,4, 862' we rewrite r in (3.4) as 2 (2 2 1 0 ( —log(8000) \ 1 0 1 0 —log(2000) Ii 0 0 1 0 0 (C1\ 1 1 0 1 —log(5000) C2 T Z 0 0 0 1 ‘ -—log(30) C3 1 2 0 0 ‘ log(2000)—log(8000) W 2 2 0 0 T —log(8000) 1 0 0 0 —log(2000) (1 1 0 0 ) ( log(30)—log(5000) ) \ I M v J A b 2 = "Ax—bug. and consider its minimization as a linear least squares problem. With the solution of this least squares problem C1 = —3.3437, c2 = 1.3495, C3 = 0.0427, and C4 = 40 —1.5909, the original equations then become 0.9064zfz3—z1+1.1033 = 0 1.29962122—0.7695 = 0. Clearly, the new system has coefficients with magnitudes smaller than those of the original one. They are closer to unity and to each other. When solutions z = (21, .22) of the new system are located, the solutions 23 = (231, 222) can be attained by applying the transformation 231 = 1001231 and 232 = 106223. 3.4.3 The end game Some of the homotopy paths 23(3) of H (2,3) = 0 for s E (—oo,0] may diverge, or go to 00, as s approaches 0. (In fact, solving systems like reimer-n [37] by the homotopy continuation method, the majority of homotopy paths diverge.) Tracing divergent paths usually consumes a multiple computing time. Typically near the end of the solution path, very small step sizes must be taken to determine the convergence of the path. For a more efficient method, write “31' 0° 1' 233(3) 2 aJ-(l — e8)m (1+ Eat-3'0 — es)m), j = 1,. . . ,n (3.5) i=1 where m, called the cyclic number, is a positive integer and w = (1.21, . . . ,wn) E Z". It was shown in Morgan et al. [33] that such expansions exist for each path in the 41 neighborhood of 3 = 0. Taking the logarithm of the absolute value of both sides of equation (3.5) yields 00 .. _ . fl _ s .. _ 82' log |;L,(s)| —log|aJ|+ m log(1 e )+Zb,,(1 e ) . (3.6) i=1 . i Here 2931 b,j(1 — e3)-7 is the Taylor expansion of log(1+ 2:31 aij(1 — es)'n_1). During the process of homotOpy continuation, a sequence of points 23(3):), —00 < so < 31 < S 0 were generated. Choose two consecutive 3k and 310 +1 close to 0, say 1 — esk < 10-6. By equation (3.6), logic-3 —108$‘3 w- l ]( 5)] l .7( ks'l'l“ = J+O(1—esk). logll—ekl—logll‘ekHl m w- Therefore —2— may be estimated by the left hand side of the above equation when m 3k is close to 0. In HOM4PS, so is in PHCpack [38], the convergence (or divergence) of a path 23(3) 2 (1‘1(S),...,In(8)) when 3 —I 0 is determined by w' 1. If wj < 0 for certain j E {1,...,n}, then ~51]— < 0, and the path component 233(3) diverges. Hence the path 113(8) diverges when 3 —1 0. w- 2. If wj Z 0 Vj=1,...,n, then a] 2 0 Vj and all path components 2.3-(3) converge. Hence the path 2(3) converges when 3 —’ 0. w- Newly inserted in our HOM4PS-2.0 is the observation that when 0 < a] < l for 42 d$j(8) S certain j then lim 2 00 which implies the divergence of the component s-—} 233(3), hence the divergence of the path 23(3) when 3 —> 0. We thus split the second item above as follows: 02- 2.1 If wj = 0 Vj = 1,...,n, , then —-1 = 0 Vj and all path components 233(3) m converge. Hence the path 23(3) converges when 3 —> 0. w- 2.2 If 0 < a] < 1 for certain j E {1, . . . ,n}, then the path component 233- diverges and hence the path 23(3) diverges when 3 ——> 0. 2.3 If 3 l3? _>_ 1 V j = 1,...,n, then all path components 233(3) converge and hence the path 23(3) converges when 3 —> 0. Our numerical results show that this splitting provides a much accurate judgement in many situations. 3.5 Numerical results To demonstrate the performance of HOM4PS~2.0 and to compare it with the existing packages HOM4PS, PHCpack [38] and PHoM [12] which implemented the polyhe- dral homotopy method, we will focus on those size-expandable bench mark systems listed in Table 3.3. All the computations in this section were carried out on a Dell PC with a Pentium 4 CPU of 2.2GHz, 1GB of memory. Results presented are mainly restricted to those systems that can be solved within 12 hours of cpu time. The 43 package HOM4PS-2.0 is written in FORTRAN 90. The binary codes as well as its Matlab interface are available at: http://www.math.msu. edu/~li/Software.htm 44 eco—n [30] Total degree = 2371-2 (231+ 23le + - - - + 23n_223n_1)23n — 1 = 0 ($2 + 23123 + - - - + 23n_323n_1)23n — 2 = 0 23n_123n —- (n — 1) = 0 $1+$2+-“+$n_1+1=0 noon-n [35] Total degree = 3" 2:1(z%+232+---+2:,2,—1.1)+1=0 22(a:§+z§+-~+x%—1.1)+1=0 xn(a:%+:c§+~-+:c,2,_1—1.1)+1=0 cyclic-n [2] Total degree = n! 231+232+---+2:n=0 $1232 + 332233 + - - - + $n_1$n + $17,121 = 0 231232233 + 2322:3234 + - - - + 23n_12:n231 + 2371231232 = 0 2:1232-~-xn—1=0 katsura—n [3] Total degree = 2" 223n+1+223n+---+2232+231—1=0 2xfi+1+2xfi+m+2$§+x§—xl =0 223nmn+1 + 23311—11371 + - . - + 2232233 + 2231232 — 232 = 0 223n_123n+1+ 223n_223n + - - - + 223123 + 23% — 233 = 0 22:323n+1 + 2231237; + 223223,,_1 + ~ - - + 23n/2$(n+2)/2 — 2m 2 0 (if n is even) 2$2$n+1+ 2231231; + 223223714 + - - - + xfn'l'llfl — an = 0 (if n is odd) reimer—n [37] Total degree = (n + 1)! 211% — 2x3 + - - - + (—1)"+12z¥, — 1 = 0 223113— 2233 + - - - -l- (—1)"+1223% — 1 = 0 22:?T1— 22334—1+---+(—1)"+122:'T]+1 — 1 : 0 Table 3.3: The polynomial systems 45 System CPU time Mixed volume # Of curve # Of isolated jumpings solutions eco-16 6m35s 16,384 - 16,384 eco-17 22m23s 32,768 - 32,768 eco-18 1h51m303 65,536 - 65,536 noon-10 1m27s 59,029 - 59,029 noon—11 5m32s 177,125 2 177,125 noon-12 27m298 531,417 2 531,417 noon-13 3h7m108 1,594,297 10 1,594,297 katsura- 17 40m483 131,072 — 131,072 katsura-18 1h35m47s 262,144 - 262,144 katsura-19 3h50m483 524,288 4 524,288 katsura-20 8h58m00s 1,048,576 4 1,048,576 cyclic-9 44s 11,016 - 6,642 cyclic-10 2m47s 35,940 - 34,940 cyclic-11 19m408 184,756 - 184,756 cyclic-12 1h36m40s 500,352 - 367,488 reimer-7 1m583 40,320 - 2,880 reimer-8 30m43s 362,880 - 14,400 reimer-9 7h52m40s 3,628,800 14 86,400 Table 3.4: The performance of HOM4PS-2.0 3.5.1 The performance of HOM4PS-2.0 in dealing with curve jumping and diverging paths Listed in Table 3.4 is the performance of HOM4PS-2.0 on solving all those bench mark systems in Table 3.3. From the 4th column in the table, one can see that the occurrences of curve jumping have been mostly diminished for HOM4PS-2.0. We must note here that curve jumping never appears for systems in each system category with sizes smaller than those listed in the table. On the other hand, we 46 only need to retrace 10 paths (among 1,594,297 paths) for noon-13, 4 paths (among 1,048,576 paths) for katsura-20 and 14 paths (among 3,628,800) for reimer-9 due to curve jumping. Moreover, when retracing was necessary, no homotopy paths need to be retraced more than once, while, as reported in [12], multiple retracings were required very often for the software package PHoM [12] to deal with curve jumping. As shown in the table, when solving reimer-n systems, most homotopy paths diverged. For example, the mixed volume, or the total number of paths we must follow, of the reimer-8 system is 362,880, but the number of its isolated solutions is just 14,400. While it’s normally costly in tracing diverging paths, HOM4PS-2.0 is capable of determining those divergent homotopy paths very efficiently with the end game criteria discussed in Section 3.4.3 as the cpu times in the table show. As mentioned before, we may solve katsura—n and reimer-n systems by the clas- sical linear homotopies because the mixed volume of each system agrees with its total degree. As a comparison, we list in Table 3.5 the results of solving those systems by both the classical linear homotopy option and the polyhedral-linear homotopy option in HOM4PS-2.0. While the proof is not available at this moment, by the observation on a collective numerical data from intensive experiments on noon-n systems, the total degree of each noon-n system and its mixed volume satisfy: total degree 2 3n = mixed volume + 2 n. So, when n becomes large, the difference between them becomes very slim relatively. 47 System Total Degree CPU time # of isolated Linear [Poly-Linear solutions noon-10 59,029 + 20 1m27s 5m12s 59,029 noon-11 177,125 + 22 5m32s 23m27s 177,125 noon-12 531,417 + 24 27m293 1h28m003 531,417 noon-13 1,594,297 + 26 3h7m103 7h02m103 1,594,297 katsura—15 32,768 7m03s 1h50m26s 32,768 katsura-16 65,536 l6m25s - 65,536 katsura—17 131,072 40m48s - 131,072 katsura-18 262,144 1h35m47s - 262,144 katsura-19 524,288 3h50m483 - 524,288 katsura—20 1,048,576 8h58m003 - 1,048,576 reimer-7 40,320 1m58s 2m49s 2,880 reimer-8 362,880 30m43s 36m43s 14,400 reimer-9 3,628,800 7h52m403 8h47m42s 86,400 Table 3.5: Comparison of the classical linear homotopy and the polyhedral—linear homotopy in HOM4PS-2.0 ‘ Therefore we also include them in the table. Apparently, as it shows, if the closeness of the mixed volume and the total degree of the system can be revealed beforehand, sometimes HOM4PS-2.0 can handle much bigger systems within tolerable time. For reimer—n systems, the cpu time for finding mixed cells is very minimal (less than 1 second most of the times). While tracing the same number of curves, the differences in the cpu times of the Classical linear homotopy and the polyhedral-linear homotopy in the table indicate that nonlinear homotopies can be costly for large systems. 48 3.5.2 HOM4PS-2.0 vs. HOM4PS Table 3.6 lists the numerical results that compare HOM4PS-2.0 with HOM4PS. Since the classical linear homotopy method was not implemented in HOM4PS, the table only displays the results that used the polyhedral homotopy method uniformly on all the systems. As it shows, HOM4PS-2.0 is considerably faster than HOM4PS, and the speed— up ratio increases by quite a bit as the mixed volume of the polynomial system increases. Recall that for a specific system, HOM4PS-2.0 only needs to trace the mixed volume number of homotopy paths, while twice this number of paths needs to be traced in HOM4PS. Moreover, HOM4PS-2.0 is much more powerful in dealing with larger systems. For instance, originally HOM4PS can not solve noon-13 system within tolerable time, whereas HOM4PS-2.0 followed over 1.5 million curves in 7 hours. 49 System Mixed Volume CPU time Speed—up (# of paths) HOM4PS HOM4P S-2.0 ratio eco—15 8,192 33m25s 2m25s 13.8 eco—16 16,384 2h55m12s 6m35s 26.6 eco—17 32,768 - 22m23s — eco- 18 65,536 - 1h51m303 - noon-9 19,665 21m4ls 1m15s 17.3 noon— 10 59,029 3h20m45s 5m12s 38.6 noon-11 177,125 - 23m27s - noon-12 531 ,417 - 1h28m00s - noon— 13 1,594,297 — 7h02m10s — katsura- 12 4,096 43m54s 1m42s 25.8 katsura- 13 8,192 3h40m54s 4m56s 44.8 katsura— 14 16,384 - 25m15s - katsura— 15 32,768 - 1h50m26s - cyclic-9 1 1,016 8m37s 44s 1 1.8 cyclic-10 35,940 58m02s 2m47s 20.9 cyclic-11 184,756 - 19m403 - cyclic-12 500,352 - 1h36m403 - reimer-7 40,320 7m47s 2m49s 2.8 reimer—8 362,880 1h44m18s 36m43s 2.8 reimer-9 3,628,800 - 8h47m42s — Table 3.6: Comparison of HOM4PS and HOM4PS-2.0 50 3.5.3 HOM4PS-2.0 vs. PHCpack and PHoM Listed in Table 3.7 is the comparison of the performance of HOM4PS-2.0 and PHCpack [38]. The implementation of solving polynomial systems by the Classical linear homotOpies is also available in PHCpack. Therefore the comparisons listed in Table 3.7 on noon—n, katsura-n and reimer-n systems whose mixed volume and total degree of each system are the same (or almost the same) are the results by using the classical linear homotopy option in each package. Our powerful code MixedVol—2.0 [20] for computing mixed cells has no place in this computation because no mixed cell computations are required. Nonetheless, as it stands, HOM4PS-2.0 still leads PHCpack in speed by a big margin in those situations. For eco—n and cyclic-n systems, there is a considerable difference, sometimes huge, between the mixed volume and the total degree of the system. So, we must employ the polyhedral homotopy here. When the PHCpack was tested, we used its fastest option, as indicated in the package, which utilizes MixedVol [10] for mixed cell computations. 51 System Total degree CPU time Speed-up #isolated PHCpack HOM4PS-2.0 ratio solutions eco— 14 1,062,882 1h26m04s 52.93 97.6 4,096 eco— 15 3,188,646 3h55m238 2m25s 97.4 8,192 eco- 17 28,697,814 - 22m23s — 32,768 eco— 18 86,093,442 - 1h51m303 - 65,536 noon-9 19,683 33m283 22.2s 90.5 19,665 noon-10 59,049 2h33m27s 1m27s 105.8 59,029 noon-11 177,147 - 5m32s — 177,125 noon-13 1,594,323 - 3h7mlOs - 1,594,297 katsura-14 16,384 2h49m003 2m52s 59.0 16,384 katsura- 15 32,768 8h22m45s 7m03s 71.3 32,768 katsura-16 65,536 - 16m25s - 65,536 katsura—20 1 ,048,576 - 8h58m008 - 1 ,048,576 cyclic-9 362,880 3h50m483 443 314.7 6,642 cyclic- 10 3,628,800 11h00m23s 2m47s 237.2 34,940 cyclic-11 39,916,800 - 19m403 - 184,756 cyclic— 12 479,001,600 - lh36m40s - 367,488 reimer—6 5,040 15m088 9.63 94.5 576 reimer-7 40,320 3h45m43s 1m583 1 14.7 2,880 reimer-8 362,880 - 30m433 - 14,400 reimer-9 3,628,800 - 7h52m408 - 86,400 Table 3.7: Comparison of HOM4PS-2.0 and PHCpack 52 System Total degree CPU time Speed-up #isolated PHoM HOM4PS-2.0 ratio solutions eco—13 354,294 2h39m31s 19s 503.7 2,048 eco— 14 1,062,882 9h57m15s 52.98 677.4 4,096 eco— 15 3,188,646 - 2m25s - 8,192 eco- 18 86,093,442 - 1h51m30s - 65,536 noon-8 6,661 54m185 19s 171.5 6,645 noon-9 19,683 5h01m06s 1m15s 240.9 19,665 noon— 10 59,049 - 5m123 - 59.029 noon— 13 1 ,594,323 - 7h02m10s - 1,594,297 katsura-l 1 2,048 lh2lml3s 283 174.0 2,048 katsura-12 4,096 4h00m09s 1m42s 141.3 4,096 katsura-13 8,192 — 4m56s - 8,192 katsura—15 32,768 - 1h50m263 - 32,768 cyclic-8 40,320 32m32s 6. 8s 287.0 1 , 152 cyclic-9 362,880 - 44s - 6,642 cyclic- 12 479,001 ,600 - lh36m408 - 367,488 reimer-6 5,040 lh14m503 12.1s 371.0 576 reimer—7 40,320 - 2m49s — 2,880 reimer-9 3,628 ,800 - 8h47m42s - 86,400 Table 3.8: Comparison of HOM4PS-2.0 and PHoM Table 3.8 compares the performance of HOM4PS-2.0 and PHoM [12]. The im- plementation of the classical linear homotopy for solving polynomial systems is not available in PHoM. Therefore all the results listed in Table 3.8 used the polyhedral homotopy on all the systems. 53 Maximum size PHoM ] PHCpack [ HOM4PS-2.0 eco- 14 (1,062,882) 15 (3,188,646) 18 (86,093,442) noon- 9 (19,683) 10 (59,049) 13 (1,594,323) katsura- 12 (2,048) 15 (32,768) 20 (1,048,576) cyclic- 8 (40,320) 10 (3,628,800) 12 (479,001,600) reimer- 6 (5,040) 7 (40,320) 9 (3,628,800) System Table 3.9: Maximum sizes of polynomial systems that can be solved by PHCpack, PHoM, and HOM4PS-2.0 within 12 hours of cpu time Table 3.9 provides the maximum sizes of the systems that can be solved by PHCpack, PHoM, and HOM4PS-2.0 within 12 hours of cpu time. The total de- gree of the system is given in the parenthesis. As it shows, HOM4PS-2.0 can solve systems of much larger size than the other two packages. 54 Chapter 4 HOM4PS-2.0para—Parallelizing HOM4PS-2.0 Each of the three main steps in HOM4PS-2.0, finding mixed cells, following homotopy paths and checking for possible curve jumping, may all be parallelized. HOM4PS- 2.0para is the parallel version of HOM4PS—2.0 which parallelizes the three main steps in HOM4PS-2.0. A common feature of those three stages, as mentioned before, is that each important task on a single CPU can be successively subdivided into multiple sub— tasks, which can be processed independently from the others without communicating with them. We therefore adopt a simple master-workers parallel computation envi- ronment where one master CPU communicates with all worker CPUs, and workers do not communicate among themselves. We use MP1 [34], which provides a message passing library for this type of communication between master and workers, in our implementation of the parallelization. In this chapter, we will address the details of 55 parallelizing HOM4PS-2.0 in each step. 4.1 Parallelizing the computation of mixed cells Recall that a main strategy for finding mixed cells is the extension of level-k subfaces El: 2 (éi1,...,éik) of the lifted support S = (S1,...,Sn) starting from k = 1 and when k = n — 1 an extensible Ek yields mixed cells of S = (51,. . . ,3”) in- duced by elements in Sin- In our parallelization in computing mixed cells we first choose i1 to be the smallest integer in {1, . . . ,n} such that ]L(S,;1)| 2 |L(Sj)] for all j E {1, . . . ,n} \ {i1} . Each lower edge in L(S,-l) can be extended to subfaces of the next and consecutive levels independently by employing the idea of dynamic enumer- ation of all mixed cells [29]. Thus each worker processor will be assigned to extend a lower edge of L(S,-1), and the work is dynamically distributed in the following man- ner. Algorithm for the master Choose i1 to be the smallest integer in {1, . . . ,n} such that ]L(S,-l)| _>_ |L(Sj)| for all 2'6 {1a---,n}\{i1} N = the # of lower edges in L(Sz-1) 14311) = {é111---1é1N} p = the # of processors (i.e., 1 master and p — 1 workers) dos=1,p—1 56 send job 3 to worker 3 (i.e., processor 3 works on extending edge €213) end do (Comment: The time required for each job is different. When worker j finishes and is ready for a new job, it sends the result to the master, and receives the next open job in the queue.) do 3 = p, N + p — 1 receive results from worker j (i.e., worker j finishes its current job) send job 3 to worker j (i.e, worker j works on extending another edge éls) end do (Comment: For 3 > N, job 3 is an “empty job”, an ending signal to worker proces- sors.) Algorithm for workers Choose i1 to be the smallest integer in {1, . . . ,n} such that |L(S,'l)| 2 |L(Sj)| for all j E {1,...,n}\{i1} N = the # of lower edges in L(S,-1) L(311)={é11w~1é1N} Step 0: Receive job 3 from the master If (s > N) then stop. (i.e., there are only N edges) Otherwise, set F1 := {éls},k :2 1. Step 1: If k = 0, send all mixed cells to the master. Stop. If Ek = (I), set It 2: k — 1, and go to Step 1. Otherwise go to Step 2. 57 Step 2: Step 3: Step 4: 4.2 Select éik E Ek, and set Ek :2 Ek\{éik}. Let Eli: = (éi1,. . . ,éik), where éi1= élsr and A A A, A 0 e4]. = {oz-fag} E L(S,-j) for j = 2,. . .,k, and go to Step 3. To extend level-k subface Eh, search among M:={Sl:lE{1,...,n}\{i1,...,ik}} for 84+, that has minimal number of suitable points where only lower edges among them can possibly extend Ek to a level—(k + 1) subface. Find 5(Ek) (2 The collection of all lower edges in L(S,~ that extends k+1) A Ek.) If 5(Ek) = (0, go to Step 1. Otherwise set Fk+1 = 6(Ek), k := k + 1, then go to Step 4. If k = n, all C = ((a,-1, . . . , e 1 with 5,7, e in are mixed cells. 2‘71—] ’ éln’ Collect all these mixed cells, set k := k — 1, and go to Step 1. Otherwise go to Step 2. Parallelizing the curve tracing of polyhedral homotopy The fact that each homotopy path can be followed independently seemingly pro— vides a perfect environment for the parallelization. Nonetheless when the polyhedral homotopy method is used to solve polynomial systems, different mixed cells induce different polyhedral-linear homotopies with different binomial systems to solve. Ini- 58 tially, it would seem natural to distribute the workload by sending one mixed cell to each worker, and when each worker receives a mixed cell, it induces the corresponding polyhedral-linear homotopy, solves the associate binomial system for starting points, and follows the resulting homotopy paths. However, the volume of each mixed cell may differ, hence the number of paths followed by each worker would differ. More— over, there may be more workers than cells. For an extreme case: one can show that reimer-n systems [37] permit only one mixed cell regardless of what sort of liftings are used. In this situation, only one worker traces homotopy paths while the rest of the workers all remain idle. Therefore, we propose to distribute the workload consisting of a suitable amount of homotopy paths rather than sending one mixed cell to each worker. To decide how many paths should be assigned to each worker each time, let the chunk size, denoted CS, be the number of paths given to each worker to follow in each job. If CS is too small compared to the mixed volume, then too much time would be spent on sending and receiving messages between workers and the master. Since all workers finish their tasks quickly and report to the master at about the same time, traffic jams will result and furthermore it will increase the communication time between the master and workers. On the other hand, choosing relatively large CS may reduce the communication time, but it may also result in imbalance of the workload in certain situations. Tracing different paths may consume different amount of time, for instance, following divergent paths is about 3 to 5 times slower 59 than following convergent ones. Therefore it may not take the same cpu time for all workers each to finish tracing the same number of homotopy paths. Those workers who finish all their jobs must wait for those who have not done their jobs, and the amount of waiting time depends largely on the size of CS. In the following, a method on choosing an appropriate CS to make the work time more balanced among processors is proposed. Let M V represent the mixed volume of the system, which is known after all mixed cells are found. Let p be the number of processors (1 master and p - 1 workers), t be the average cpu time to follow one path, and I be the average number of jobs each worker should finish. Then MV CS*l*(p—1)%MV=>CS=z*—(PT:—l—), and once we choose l, chunk size CS is determined. Each worker will then follow _ MV —l*@-U worker to follow CS paths is about CS :1: t. The computation time for each worker to CS homotopy paths per job. The average computation time for each finish its 1 jobs is about t2 = CS :1: t :1: l. The workers that finish all their jobs first must wait for those that have not done their jobs, and the waiting time is at most t1 = CS * t. To balance the workload distribution, the ratio between the worst case waiting time and total time required for each worker to finish all its 1 jobs needs to 1 be small, say 0.01. In other words, we want :—1 = I z 0.01. So 1 is about 100, yield- 2 M V mg CSZrmi—i) . This means each worker will be assigned about 100 jobs, and 60 each job consists of following homotopy paths. The aim of parallelizing M 100(p — 1) our software package HOM4PS-2.0 is to increase the computing resources for solving larger systems, which inevitably requires tracing a big amount of homotopy paths. Therefore, in the following, we always take I = 100 for simplicity. To distribute the jobs dynamically, we first number all mixed cells, and for each mixed cell ({a11,a12}, . . . , {an1,an2}), we number all the isolated solutions of its associated binomial system Cllxall + C12$a12 = 0, (4.1) cnlccanl + 9,22%? = 0, in the following manner. Equivalent to (4.1), write mall—(112 : b1, (4.2) zanI—an2 2 bn, c. with bj = ——12 forj = 1, 2, . . . ,n. When solving (4.2) [22, 24], the integer matrix c. 31 A = [011- 012,-uaan1- 0112] 61 is upper triangularized first. That is, finding an integer matrix B with [det B] = 1 such that v11 ”12 ”In 0 1122 v2n BA 2 0 0 Um, Recall that if y = (y1,...,yn) E C" and m1,...,mn are columns of an n x n integer matrix M, then yM := (ym1,...,ym"). With these notations, if we find 2 = (z1,. . .,zn) E C" satisfying zBA = (51,. . . ,bn), (4.3) then 23 : 23 is a solution to (4.2). Written out explicitly, (4.3) becomes ’U 211412,,322 : b2, (4.4) lelln H,Z;Jlnn : bn- Without loss of generality, we may assume 'Ujj > 0 for all j = 1, . . . ,n. Now, v11 62 solutions of 21 in (4.4) can be expressed as arg (b1) + 27rk1i 7Jll 2(k1)_lb Ill—1E ( ) f k _ 1 — 1| 23p or 1—0,...,v11—1 where arg (b1) is the principal argument of b1. And, by forward substitutions in (4.4), for j: 2,...,n 1 — . k- _ —.—. or b' +2rrlc-i 2(3):|b-|UJJE23p 9(3) J for k-=0,...,v~—1 J J 0.. J J] 7.7 — bj — . . . - where bj = zvlj zvj-lj and org (bj) IS the prmcrpal argument of bj. So the system 1 j—l in (4.4) has ”11 x x vnn(=|detA|——- volume of the mixed cell ({a11,a12}, . . ., {an1, an2})) isolated solutions in total, and each solution can be written in the form (zgkl),...,z1(zknl) where 0 S kj < '03") for j = 1,...,n. We order those isolated solutions as follows: For 1 S l S v11 x- - ~xvnn, consecutively applying “division algorithm” on I — 1 yields a unique decomposition of l - 1 in the form l—1=u1(v22x~-xvnn)+u2(v33X-~xvnn)+~-+un_1(vnn)+un. Apparently, 0 g uj < Ujj for j = 1,. . . ,n, and we thereby label (z[u1), . . . , Alum) as the lm solution of the system in (4.4). For the corresponding solution 23 = 23 63 to the system in (4.1), we label it with the same order index of the solution 2 to the system in (4.4). The polyhedral-linear homotopy paths can now be ordered by taking the same indices of their starting points of the associated binomial systems together with the number indices of the corresponding mixed cells. For this purpose, a table fT is established in which each entry in the first row represents the index of a mixed cell and the entry in the row below which denotes the index of the one of the so- lutions of the associated binomial system of the mixed cell. For example, if there are 30 mixed cells Cl,C2,C3,...,C3o and Volume of Cl = 8, Volume of C2 = 12,...,Volume of C30 = 8, then the table fT is: Suppose the mixed volume cell 1 cell 2 . .. cell 30 Cell# 1 11 2 30 30 Path# 1 8[1 12 1 8 Table4.1:fT M V of the system is 1500 and the number of processors p = 4. Then CS = 1500 m = 5 and the number of total jobs = 300 where each job consists of following 5 paths. Based on the first row fT(1,j) in table f T, we associate to each job i = 1, . . .,300 atable fT(i) with 3 rows and fT(1,i*5)-fT(1,(i—1)*5+1)+1 columns. Entries in the 1‘“ row are the indices of the mixed cells. And entries in the 2nd row and 3rd row list the beginning indices and the ending indices of the chunk respectively. For instance, associated to job 1 is the table fT(l) of size 3 x 1 : 64 1 , which means the worker that receives job 1 will follow paths emanating from éfilfitions number 1 through number 5 of the binomial system associated with cell 1 2 1. Similarly, associated to job 2 is the table flT(2) of size 3 x 2 : 6 1 . So 8 2 the worker that receives job 2 will trace paths number 6 through number 8 of the binomial system associated with cell 1 as well as paths 1 and 2 of the binomial system associated with cell 2, making it 5 in total. We summarize the above in the next two algorithms. Algorithm for the master Establish Table f T of two rows in which each entry in the first row represents the index of a mixed cell and the entry below which in the second row denotes the index of the solution of the associated binomial system of the mixed cell M V N umO f J obs = CS (M V is the mixed volume and CS is the chunk size.) p= # of processors (1 master and p — 1 workers) do i = 1 to p — 1 send job i (along with table fT(i)) to worker i end do do i 2p to NumOfJobs+p—1 65 receive results from worker j send job i (along with table fT(i)) to worker j end do Algorithm for worker receive job i from the master if (i > NumOfJobs), then stop else receive table fT(i) with 3 rows and fT(1,i =1: CS) — fT(1,(i — 1) >1: CS + 1) +1 columns entries in 1“ row are the indices of the mixed cells entries in 2nd row are the beginning indices of the chunk entries in 3rd row are the ending indices of the chunk worker solves the corresponding binomial systems and follows CS paths based on f T0) send results to the master end if When a worker receives job i along with table f T(i) from the master, it solves the requisite binomial systems for picking out the appropriate paths to follow. However, as mentioned before, some systems only allow very few mixed cells. Solving the 66 same binomial system repeatedly will certainly effect the scalability of the dynamic distribution. So if there are only a few mixed cells, we first solve all the binomial systems corresponding to those mixed cells respectively, and save all solutions of the binomial systems. When homotopy paths are distributed to workers, we only send CS binomial system solutions as starting points without having workers solve the same binomial systems over and over again. 4.3 Parallelizing the curve tracing of the classical linear homotopy For systems whose mixed volumes are the same or almost the same as their total degree (or the Bézout number), the polyhedral-linear homotopy method provides no advantages when it is used for solving them. As mentioned in Section 3.2.2, those systems P(23) = (p1(23), . . . ,pn(23)) with 23 = (231, . . .,23n) should be solved directly by following the total degree number of solution paths of the classical linear homotopy H(2:,t) = (1 — t)')Q(:r) + tP(23) = 0 for generic 'y E C where a typical choice of Q(I) = (210?).- - ~1Qn($)) is (11 = a11361ll - 171, d1 = degree of P1013), (4.1) on = anzgn — bn, dn = degree of pn(2‘), 67 with randomly chosen (a1, . . . ,an, b1, . . . , bn) E C2" [24]. In this way, since no poly- hedral homotopies are involved, no costly mixed cell computations are required. We now describe the parallel algorithm in our HOM4PS-2.0para for this homotopy. While the starting system Q(23) in (4.1) is a special binomial system as in (4.1), the situation is much simpler here. Let TotalDeg 2 (11 x d2 x x dn be the total degree of the system P(23), the total number of paths needed to be followed for the classical linear homotopy. Let p be the number of processors, among which there are 1 master and p — 1 workers. Let CS be the number of homotopy paths that should Total Deg be followed in each job. To dynamically assign the jobs, we choose CS = _—100(p __ 1) as discussed in the last subsection. For workers to decide which CS paths to follow, we first number all the isolated zeros of the starting system Q(23) as follows: Each isolated zero of Q(23) can be written in the form (xgkl),...,x$lkn)) where OSkj NumOfJobs), then stop else receive 2 integers (i — 1)CS + 1 and iCS, indicating the starting and ending indices of paths for each integer between the starting and ending indices, find the corresponding 70 starting point from table Bixlib to follow the path send results to the master end if 4.4 Parallelizing the checking of possible curve jumping Once all the solutions are attained after following all homotopy paths, we proceed to check for possible curve jumping. First of all, the master will divide the solutions into 200 groups, as we explained in Section 3.3, where each group of solutions is characterized by having the same sign of the imaginary part of the solutions’ first component as well as the same k-th digit bk and (k + 1)-th digit bk+1 after the decimal point of its decimal representation. Note that solutions may not be evenly distributed among those groups, therefore for systems having large number of isolated solutions, some of those 200 groups may still have a big amount of solutions. In HOM4PS—2.0para, if any group has more than say 1,000 solutions, then this specific group will be divided into 200 subgroups again based on the imaginary part of the second component of each solution in the group. A table ng is established in which each entry in the first row represents the index of a group and the entry in the row below denotes the index of subgroup associated to the group in the first row. We use 0 in the second row to indicate that if the group in the first row does not have more than 1,000 elements, hence no further division of the group is 71 necessary. For example, let 91,. . .,9200 denote the original 200 groups. Suppose |g2| > 1, 000, |g3| > 1, 000, and |g4| > 1, 000 and the rest of the groups have no more than 1000 elements. Then 92, 93, and 94 will all be divided into 200 subgroups based on the imaginary part of the second component of each solution in those groups. LQt Q(2,1), . . . , 9(2, 200),g(3,1), . . . ,g(3, 200), 9(4, 1) . . . , 9(4’ 200) be the subgroups 0f 92, 93, and 94 respectively. The table ng becomes: In total, there are (200 — 3) + 91 92 93 .94 - - - 9200 Group# 1 2 2 3 3 4 4 200 Subgp # 0 1 200 1 200 1 200 0 Table 4.3: ng 200 * 3 = 797 groups (or subgroups) which need to be checked. We dynamically distribute the jobs in a similar manner as what was done before as discussed in Section 4.2 and 4.3. Suppose that there are 4 processors (1 master and 3 workers). Let CS (=chunk size) be the number of groups needed to be checked in each job. Here, in our case, CS = Zf—wfiw z 3 . This means each worker would have to check 3 groups (or subgroups) in each job, and the total number of jobs is 1:1 = 266. Based on the first row ng(1, j) in table ng, to each job i = 1,. . .,266, we associate a table ng(i) with 3 rows and ng(1,i * 3) — ng(1, (i — 1) =1: 3 + 1) + 1 columns. lst 2nd Entries in the row are the indices of the original groups. And entries in the row and 3rd row list the beginning indices and the ending indices of the subgroups associated to the group index in the first row respectively. For instance, associated 72 to job 1 is a table ng (1) of size 3 x 2 : 0 1 , which means the worker that 0 2 receives job 1 will check group 91 and also subgroups 9(211) and g(2,2) of the group 92. Here 0 in the second and third row of table ng(1) means 91 has no more than 1,000 elements. There is no need to difle 91 into 200 subgroups. Associated to 2 F—d job 2 is a table ng(2) of size 3 x 1 : 3 . So the worker that receives job 2 will r—-——1 5 check subgroups g ,g and g L(fifigroup 9 making it 3 in total. Similarly, (2.3) (2,4) (2.5) 2 T 2 L— associated to job 3 is a table ng(3) of size 3 x 1 : . So the worker that receives 6 g— 8 job 3 will check subgroups 9(2,6)’ 9(237) and 9(2,8) oft—he group 92. The dynamic distribution algorithms for the master and workers for checking the curve jumping described in Section 3.3 are listed below. Algorithm for the master p = number of processors (1 master and p — 1 workers) Divide the solution set into 200 groups 91,. . .,9200 based on the imaginary part of the first component of each solution. Establish Table ng of two rows in which each entry in the first row represents the 73 index of a group among 91, . . . , 9200 and the entry below which in the second row denotes the index of subgroup associated to the group in the first row. Note: entries 0 in the second row of ng means the corresponding group in the first row has less than or equal to 1,000 elements. Let N = #{i6 {1,2...,200}| lg,|>1,000} Totales = 200 * N + (200 — N) : total number of groups (or subgroups) CS 2 % : number of groups (or subgroups) in each job Totales N b = _— umO f J 0 3 CS do i = 1 to p — 1 send job i (along with ng(i)) to worker 2' end do do i = p to NumOfJobs +13 —1 receive computation results from some worker j send job i (along with ng(i)) to worker j end do (Comment: For i = 1, . ..,Numo f Jobs, job i consists of checking solutions in groups (or subgroups) from table ng(i). Job Numo f Jobs +1, job N umo f Jobs + 2, . . , job NumOfJobs + p — 1 are empty jobs.) Algorithm for the workers receive job i from the master 74 if (i > NumOfJobs) stop if (i S NumOfJobs) then receive table ng(i) with 3 rows and ng(1, i =1: CS) - ng(1, (i — 1) * CS+ 1) +1 columns entries in 1‘“ row are the indices of among the groups 91,. . . ,9200 entries in 2nd row are the beginning indices of the subgroups in the chunk entries in 3rd row are the ending indices of the subgroups in the chunk for any group (or subgroup) g = {$1, . . . ,xlgl} E ng dol=1to|g|—1 dom=l+1to|g| ll 131—33m II II $1 M if the minimum singular value of the Jacobian matrix Px(:rl) is bigger if < a chosen parameter 60, then than a chosen threshold 61, then retrace the two paths with smaller step sizes send the updated solutions to the master else if :13) is a nonsingular point on a positive dimensional component curve jumping occurs retrace the two paths with smaller step sizes end if end if 75 end if end do end do end for end if After two (or more) numerically identical solutions are detected and the occur- rence of curve jumping is determined, we need to identify the associated paths and retrace them with smaller step sizes. For this purpose, the order index of the path must be saved when it reaches a zero of P(x) = (191(1):), . . . ,pn(a:)) at the end of the path. When the polyhedral-linear homotopy is used to solve the system, each path corresponds to a unique mixed cell along with a unique path number in that cell as elaborated in Section 3.2. We save these two numbers for the path so that identification of the mixed cell provides the associated binomial system, yielding the starting point of a particularly numbered path and so that retracing of the paths may then proceed. For the classical linear homotopy, the situation is simpler. The solution paths of the homotopy are numbered according to the order indices of their starting points as solutions of the starting system. Hence storing the path number is sufficient for retracing the path. Remark: In the very beginning, when the master receives the solutions of P(x) = 0 in C" , they are saved in a 2-dimensional array of complex numbers in HOM4PS- 2.0para. The size of the 2-dimensional array is n x M V for the polyhedral homotopy 76 and n x T otalDeg for the classical linear homotopy (n complex components of the solution and M V or TotalDeg number of solutions). As before, M V is the mixed volume of P(x) and TotalDeg is the total degree of P(z). Different from the paral- lel version of PHCpack [39], in which the master will write the solutions of P(x) = 0 to a file when the solutions are received from the workers, we save all the solutions in an array for the purpose of grouping the solutions for the detection of curve jumping. Having the solutions stored in an array of complex numbers is convenient for retriev- ing the necessary information to retrace the homotopy paths, and for updating the new solutions. 4.5 Numerical results To demonstrate the performance of our parallel software package HOM4PS-2.0para we will focus on those size-expandable bench mark systems listed in Table 3.3. All the computations were carried out on a cluster 8 AMD dual 2.2 GHz cpus. Again, we only list those systems that can be solved within 12 hours cpu time. 4.5.1 The performance on large systems Table 4.4 shows the results of solving eco-n and cyclic-n systems using 1 master and 7 workers. For eco—n and cyclic—n systems, the total degree of each individual system is much greater than its mixed volume, especially when 17. becomes larger. We therefore use the polyhedral-linear homotopy to solve those systems, and the total number of 77 homotopy paths that need to be followed is the mixed volume of the system. As it shows, curve jumping rarely occurs. On the other hand, when retracing was necessary, no homotOpy paths need to be retraced more than once. System CPU time Total degree Mixed volume # of curve # of isolated (=# of paths) jumping solutions eco— 17 2ml ls 28,697,814 32,768 - 32,768 eco— 18 6m308 86,093,442 65,536 - 65,536 eco—19 26m26s 258,280,326 131,072 - 131,072 eco—20 1h29m295 774,840,978 262,144 - 262,144 eco—21 10h08m555 2,324,522,934 524,288 1 524,288 cyclic-11 3m34s 39,916,800 184,756 - 184,756 cyclic-12 14m07s 479,001,600 500,352 - 367,488 cyclic-l3 lh39m103 6,227,020,800 2,704,156 - 2,704,156 cyclic-14 7h32m42s 87,178,291,200 8,795,976 4 8,464,307 Table 4.4: Solving systems by the polyhedral-linear homotopy with 1 master and 7 workers 78 Table 4.5 lists the results of solving systems noon-n, katsura-n, and reimer-n with again 1 master and 7 workers. For katsura-n and reimer-n systems, total degree of each system is equal (or almost equal) to the mixed volume of the system. Therefore we solved these systems by using the classical linear homotopy to avoid the costly mixed cell computations. Indeed, it is very costly to find all the mixed cells of katsura—n system when n > 15. For noon-n systems, the difference between the total degree of each system and its mixed volume is 277. [21], which becomes much slimmer relatively as 72 becomes larger. Hence we also solved noon-n systems by the classical linear homotopy. Note that while curve jumping occurs in solving very big systems, the number of occurrences is very small relatively. Again, we only need to retrace the homotopy paths once when curve jumping was detected. System CPU time Total degree # of curve # of isolated (=# of paths) jumping solutions noon-12 2m23s 531,417+24 - 531,417 noon-13 7m48s 1,594,297+26 - 1,594,297 noon-14 38m12s 4,782,941+28 - 4,782,941 noon-15 4h14m33s 14,348,877+30 - 14,348,877 katsura—18 9m46s 262,144 - 262,144 katsura-19 23m36s 524,288 2 524,288 katsura-20 55mlOs 1,048,576 4 1,048,576 katsura—21 2h08m42s 2,097,152 8 2,097,152 katsura-22 4h52m013 4,194,304 20 4,194,304 katsura-23 11h17m408 8,388,608 52 8,388,608 reimer—S 2m36s 362,880 — 14,400 reimer-9 28m04s 3,628,800 8 86,400 reimer—lO 8h40m46s 39,916,800 20 518,400 Table 4.5: Solving systems by the classical linear homotopy with 1 master and 7 workers 79 System # of Total time to Time to find Time to Time to workers solve system mixed cells trace curve check solutions k cpu(s) ] ratio cpu(s) 1 ratio cpu(s) I ratio cpu(s) I ratio eco—16 1 445.32 1.00 120.02 1.00 325.00 1.00 0.30 1.00 2 223.49 1.99 60.66 1.98 162.58 2.00 0.25 1.20 3 150.69 2.96 40.94 2.93 109.53 2.97 0.22 1.36 5 91.31 4.88 25.22 4.76 65.89 4.93 0.20 1.59 7 68.70 6.48 19.99 6.00 48.58 6.69 0.13 2.31 cyclic-11 1 1475.39 1.00 38.15 1.00 1436.49 1.00 0.75 1.00 2 734.96 2.00 19.10 2.00 715.41 2.00 0.45 1.67 3 494.19 2.99 12.94 2.95 480.86 2.99 0.39 1.92 5 295.90 4.99 8.05 4.74 287.47 5.00 0.38 1.97 7 212.87 6.93 6.47 6.00 206.06 6.97 0.34 2.21 Table 4.6: The scalability of solving systems by the polyhedral-linear homotopy 4.5.2 Scalability To test the scalability of our HOM4PS-2.0para, we solved eco-16, noon-12, cyclic-11, katsura-17 and reimer—8 systems with varying numbers, k = 1, 2,3, 5, 7, of workers. Table 4.6 shows the scalability of solving eco—16 and cyclic-11 by the polyhedral- linear homotopy and Table 4.7 exhibits the scalability of solving reimer-8, noon-12 and katsura-17 systems by the classical linear homotopy. The ratios in both tables stand for the cpu time for one worker to finish the task divides by the cpu time of k workers for the same task. The ratios for curve tracing in both tables illustrate an excellent scalability (nearly perfect) for each system, showing evidently that each path can be traced independently. The cpu time to check the solutions on both tables includes the cpu times for the detection of curve jumping as well as retracing the associated paths when curve jumping occurs. The ratios in this category do not show an excellent scalability because the cpu time for checking curve jumping is 80 System # of Total time to Time to Time to workers solve system trace curve check solutions k cpu(s) lratio cpu(s)J ratio cpu(s) [ ratio noon-12 1 1003.32 1.00 980.72 1.00 22.50 1.00 2 501.75 2.00 490.33 2.00 11.42 1.97 3 335.18 2.99 326.68 3.00 8.50 2.65 5 201.27 4.98 195.30 5.00 5.97 3.77 7 143.22 7.00 138.88 7.00 4.34 5.18 reimer—8 1 1088.95 1.00 1087.74 1.00 1.21 1.00 2 545.08 2.00 543.96 2.00 1.12 1.08 3 363.89 2.99 362.81 3.00 1.08 1.12 5 218.69 4.98 217.97 4.99 0.72 1.68 7 156.54 6.96 155.86 6.98 0.68 1.78 katsura—17 1 1964.22 1.00 1963.12 1.00 1.10 1.00 2 982.38 2.00 981.35 2.00 1.03 1.07 3 654.17 3.00 653.22 3.00 0.95 1.16 5 394.02 4.99 393.18 4.99 0.84 1.31 7 280.56 7.00 279.85 7.00 0.71 1.55 Table 4.7: The scalability of solving systems by the classical linear homotopy very fast, as one can see, and does not change much for various number of workers. As expected, when the classical linear homotopy is used the scalabilities in Ta- ble 4.7 are very close to being perfect. Notably illuminating is the almost perfect scalability in solving the whole system of cyclic-11 in Table 4.6. This shows that when the polyhedral-linear homotopy is used our strategy for the dynamic workload distribution is quite balanced. 81 4.5.3 The multiple precision As in general, we use Newton’s method for the “corrector” in our prediction-correction scheme in following homotopy paths. Namely, when the predictor (we use the cubic Hermite interpolation as our predictor after the first step which uses the Euler pre- dictor) provides a prediction point (2:0, to), Newton’s iteration is applied to H (:13, t0) initiated at (1:0, to), until the magnitude of “H(x, to)” is less than a predetermined accuracy 6 > 0. Realistically, this accuracy parameter should vary according to the size of the Jacobian “Hm“ as well as the size of ”as” , and in HOM4PS-2.0, we set 6 to be the minimum of “Hz“ * ||23|| * 10‘8 and 0.1. Now, during the computation, if the machine precision is set to be double precision, it allows 15 digits accuracy. So when the condition number of the Jacobian matrix H$(x,t0) is greater than 1015, then Newton’s method can hardly converge within the desired accuracy. The step size will then be cut in half, followed by repeating the prediction-correction scheme. If the step size is less than the minimum step size allowable, say 10‘“, before reaching the stage of the end game, then the procedure for tracing this path will cease to continue. We then retrace this homotopy path by increasing the precision from double precision to quad precision which is about 10 times slower in tracing paths. Fortunately, those situations seldom occur, and therefore causing only very minor effect on the scalability and efficiency. For instance, only about 7 (among 65, 536) paths for eco-18, 20 (among 131, 072) paths for eco—19, and 18 (among 262,144) paths for eco—20 need to use quad precision to retrace paths. 82 Chapter 5 Conclusion and future work Polyhedral-linear and classical linear homotopies can be used to solve large polynomial systems efficiently. The sequential version HOM4PS-2.0 leads the existing software by huge margins. With multi—processors, HOM4PS-2.0para—parallel version of HOM4PS- 2.0 can solve systems that have not been solved in the past within tolerable time. As we can see from Table 4.6, the ratios of computing mixed cells do not exhibit near perfect scalabilities as the ratios of tracing curves show. At this time, while the master sends the lower edges of L(S,;1) to the workers dynamically, some lower edges may take much more time for the extension than the others. Therefore at the very end, workers who finish their lower edge extensions will need to wait for those workers who have not finished theirs. This will result in an imbalance of workload distribution. Apparently a more sophisticated method needs to be developed for the workload distribution for mixed cell computations. 83 Bibliography [ll [2] [3] [4] [5] [6] l7] [8] D. N. Bernshtein (1975), “The number of roots of a system of equations”, Func- tional Analysis and Appl, 9(3), 183-185. C. Bjork and R. Roberg (1991), “A faster way to count the solutions of in- homogeneous systems of algebraic equations”, J. Symbolic Computaion, 12(3), 329-336. W. Boege, R. Gebauer, and H. Kredel (1986), “Some examples for solving sys- tems of algebraic equations by calculating Groebner bases”, J. Symbolic Com- putation, 2, 83-98. J. Canny and J. M. Rojas (1991), “An optional condition for determining the exact number of roots of a polynomial system”, IN Proceedins of the 19.91 Inter- national Symposium on Symbolic and Algebraic Computaion,ACM,83-98. S. N. Chow, J. Mallet-Paret, and J. A. Yorke (1978), “Finding zeros of maps: homotopy methods that are constructive with probability one”, Math. Comput, 32, 887-899. H. Cohn (1982) “An explicit modular equation in two variables and Hilbert’s twelfth problem”, Math. of Comp., 38, 227-236. F. J. Drexler (1977) “Eine Methode zur Berechnung samtlicher Losungen von Polynongleichungssystemen”, Numer. Math., 29, 45-58. T. Gao and T. Y. Li (2000), “Mixed volume computation via linear program- ming”, Taiwan J. of Math., 4, 599-619. [9] T. Cao and T. Y. Li (2003), “Mixed volume computation for semi-mixed sys- tems”, Discrete Comput. Geom, 29(2), 257-277. 84 [10] T. Gao, T. Y. Li and M. Wu (2005), “Algorithm 846: MixedVol: A software package for mixed volume computation”, ACM Transactions on Math. Software, 31(4), 555-560. [11] C. B. Garcia and W. I. Zangwill (1979), “Findind all solutions to polynomial systems and other systems of equations”, Math. Programming, 16, 159-176. [12] T. Gunji, S. Kim, M. Kojima, A. Takeda, K. Fujisawa and T. Mizutani (2004), “PHoM - A polyhedral homotopy continuation method”, Computing, 73, 53-57. [13] T. Gunji, S. Kim, K. Fujisawa and M. Kojima (2006), “PHoMpara - Parallel implementation of the polyhedral homotopy continuation method for polynomial systems”, Computing, 77, 387-411. [14] B. Huber (1996), “Solving sparse polynomial systems”, Ph.D. dissertation, Cor- nell University. [15] B. Huber and B. Sturmfels (1995), “A Polyhedral method for solving sparse polynomial systems”, Math. Comp., 64, 1541-1555. [16] A. G. Khovanski (1978), “Neton polyhedral and the genus of complete intersec- tions”, Functional Analysis and Appl., 12(1), 38—46. [17] S. Kim and M. Kojima (2004), “Numerical stability of path tracing in polyhedral homotopy continuation methods”, Computing, 73, 329-348. [18] A. G. Kushnirenko (1976), “Newton Plytopes and the Bezout Theorem”, Func- tional Analysis and Appl, 10(3), 233-235. [19] Y. C. Kuo and T. Y. Li (2008), “Determine whether a numerical solution of a polynomial system is isolated”, J. Math. Anal. Appl, 338(2), 840-851. [20] T. Lee and T. Y. Li,“Mixed volume computation, A Revisit”,(Submitted). [21] T. Lee, T. Y. Li and C. Tsai,“ HOM4PS-2.0: A software package for solving polynomial systems by the polyhedral homotopy continuation method”, (Submitted). [22] T. Y. Li (1997), “Numerical solution of multivariate polynomial systems by ho- motopy continuation methods”, ACTA Numerica, 399-436. [23] T. Y. Li (1999), “Solving polynomial systems by polyhedral homotopies”, Taiwan J. of Math., 3, 251-279. [24] T. Y. Li (2003), “Solving polynomial systems by the homotopy continuation method”, Handbook of numerical analysis, Vol. XI, Edited by P. G. Ciarlet, North-Holland, Amsterdam. [25] T. Y. Li and X. Li (2001), “Finding mixed cells in the mixed volume computa- tion”, Foundation of Computational Mathematics, 1, 161-181. [26] T. Y. Li and X. Wang (1996), “The BKK root count in CT”, Math. Comp., 65, no. 216, 161-181. [27] T. Y. Li, T. Sauer and J. A. Yorke (1989), “The cheaters homotopy: an efficient procedure for solving systems of polynomial equations”, SIAM J. Numer. Anal, 26, 1241-1251. [28] T. Y. Li and Z. Zeng (2005), “A rank-revealing method with updating, down- dating, and applications”, SIAM J. Matrix Anal. Appl., 26, 918-946. [29] T. Mizutani, A. Takeda and M. Kojima (2007), “Dynamic enumeration of all mixed cells”, Discrete Comput. Geom, 37, 351-367. [30] A. MORGAN, (1987), Solving polynomial systems using continuation for engi- neering and scientific problems, Prentice-Hall, New Jersey. [31] A. P. Morgan and A. J. Sommese (1987), “A homotpy for solving gen- eral polynomial systems that respect m-homogeneous structures”, Appl. Math. Comput.,24,101-113. [32] A. P. Morgan and A. J. Sommese (1989), “Coefficient-parameter polynomial continuation”, Appl. Math. Comput., 29, 123-160. Errata: Appl. Math. Comput., 51, 207 (1992). [33] A. P. Morgan, A. J. Sommerse, and C. W. Wampler (1992), “A power series method for computing singular solutions to nonlinear analytic systems”, Numer. Math., 63(3), 1779-1792. [34] MP1: http: //www.mpi-forum.org. [35] V. W. Noonburg (1989), “A neural network modeled by an adaptive Lotka- Volterra system”, SIAM J. Appl. Math., 49, 1779-1792. [36] J. M. Rojas (1994), “A convex geometric approach to counting the roots of a polynomial system”, Theoret. Comput. Sci., 49, 105—140. [37] C. Traverso, The PoSSo test suite examples, Available at: www.inria.fr/saga/POL 86 [38] J. Verschelde (1999), “Algorithm 795: PHCpack: A general-purpose solver for polynomial systems by homotopy continuation”, ACM Trans. Math. Softw, 25, 251-276. [39] Y. Zhuang (2007), “Parallel Implementation of Polyhedral Homotopy Methods”, Ph.D. thesis, Univesity of Illinois at Chicago. 87 11]]1][]1]][[1[1[1]]: ...1,