PROJECTIVE PATH TRACKING FOR HOMOTOPY CONTINUATION METHOD

By Tianran Chen

A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY, Applied Mathematics, 2012

ABSTRACT

Solving systems of polynomial equations is an important problem in mathematics with a wide range of applications in many fields. Homotopy continuation methods form a large class of reliable and efficient numerical methods for solving systems of polynomial equations. An essential component of the homotopy continuation method is the path tracking algorithm for tracking smooth paths of one real dimension. "Divergent paths" pose a tough challenge to path tracking algorithms, as tracking such paths is generally impossible. The existence of such paths is caused, in part, by the non-compactness of C^n, the space in which homotopy methods usually operate. A well-known remedy is to operate inside the complex projective space instead. Path tracking inside the complex projective space is the focus of this article. An existing method based on the use of generic "affine charts" of the complex projective space is widely used. While it works well in theory, we will point out, via an analysis of "path condition", the unnecessary numerical instability it can create. This article then proposes a numerically superior approach to projective path tracking, developed from the point of view of the Riemannian geometry of the complex projective space.

Copyright by TIANRAN CHEN 2012

This dissertation is dedicated in loving memory of my dear A'po.

ACKNOWLEDGMENTS

This dissertation would not have been possible without the help of many people around me, only some of whom it is possible to mention here.
Above all, I would like to thank my advisor Professor Tien-Yien Li for his guidance, advice, understanding, support, and patience, not to mention his unsurpassed knowledge of every aspect of the polyhedral homotopy continuation method. More importantly, his deep appreciation of an unusually wide range of mathematical subjects and his instinct of constantly asking why will certainly benefit me for years to come. I am very grateful to Professor Zhonggang Zeng and Professor Tsung-Lin Lee for discussions on various topics in numerical analysis which have greatly influenced my views on many technical aspects of homotopy continuation methods. I would also like to thank my fellow graduate students Chaitanya Senapathi and Faramarz Vafaee for their help during the project. Nick Ovenhouse, a research assistant in our group, provided great help with many aspects of the implementation of the projective path tracking algorithms as well as with the tools for creating the graphs included in this article, for which I am very grateful. I would like to acknowledge the financial and technical support of Michigan State University. Last, but by no means least, I thank my family for their love and support, for which my mere expression of thanks does not suffice.

TABLE OF CONTENTS

List of Tables
List of Figures

Chapter 1 Introduction
1.1 The basics of homotopy continuation methods for solving polynomial systems
1.2 Basic affine path tracking algorithms
1.3 Numerical condition of paths
1.3.1 Why path condition matters
1.3.2 Ill-conditioned paths
1.3.2.1 Divergent paths
1.3.2.2 Paths with large coordinates
1.3.2.3 Paths approaching singular points in the middle
1.3.2.4 Paths approaching singular end points
1.3.2.5 Summary of ill-conditioned paths

Chapter 2 Projective Path Tracking
2.1 Background: working in projective space
2.1.1 Projective path tracking using affine charts
2.1.2 Numerical analysis of the homotopy path
2.2 Geometry of the complex projective space
2.2.1 Identification between C and R^2
2.2.2 Notation and terminology
2.2.3 Realizing CP^n as S^{2n+1}/U
2.2.4 Tangent formula for CP^n
2.2.5 Distance formula for CP^n
2.2.5.1 Riemannian geodesics
2.2.5.2 Distance on S^1 and S^{2n+1}
2.2.5.3 Closest representatives of two points in CP^n
2.2.5.4 Geodesics in CP^n
2.2.5.5 Summary: the distance formula in CP^n
2.3 Projective predictors
2.3.1 Projective Euler's method
2.4 Projective correctors
2.4.1 Projective Newton's method
2.4.1.1 Convergence property
2.4.1.2 Stopping criteria
2.4.2 Projective Newton's method with damping factor
2.5 Projective Newton's homotopy
2.6 Numerical analysis of the algorithms
2.7 Numerical determination of residual
2.7.1 Affine residual
2.7.2 Tangential residual
2.7.3 Spherical residual
2.7.4 Summary of residuals
2.8 Special conditioning
2.8.1 The curious case of barry
2.8.2 The general problem
2.8.3 Dynamic row-scaling
2.9 The path tracking algorithm
2.10 Numerical results

Chapter 3 Endgame
3.1 Local theory of homotopy paths
3.1.1 Background: local theory of holomorphic varieties
3.1.2 Local normal form of affine homotopy paths
3.1.3 Local normal form of projective homotopy paths
3.2 Projective endgame based on Cauchy integral
3.2.1 Stopping criteria based on Riemannian distance
3.2.2 Stopping criteria based on tangent vector
3.2.3 Consistency tests
Bibliography

LIST OF TABLES

Table 2.1: Average percentage, rounded to the second digit after the decimal point, of paths with large maximum path condition number over 10 different runs when the homotopy Ĥ_L = 0 of Equation (2.2) is used to solve different polynomial systems (all divergent paths and those with singular end points are ignored). At first glance the percentages seem rather small; however, one must keep in mind that hundreds of thousands of paths are tracked in solving some of the systems, so even a very small percentage translates to a large absolute number of paths with very high maximum path condition.

Table 3.1: The scales of |x_0| at end points of some paths that are known to be "at infinity". These paths come from solving the systems listed in the first column using the projective path tracking we have discussed.

Table 3.2: Comparison of the magnitude of |x_0|, in log10 scale, with and without the Cauchy integral method. The second column shows the winding number of the loop used for evaluating the Cauchy integral. The third column shows the approximate order of magnitude of |x_0| at the end point using the projective path tracking alone, while the last column shows the same measure but with the Cauchy integral producing the end point.

LIST OF FIGURES

Figure 2.1: The path condition number, in log10 scale, of a single path defined by Equation (2.2), Ĥ_L = 0, for solving the cyclic7 problem: the path condition can be as large as 10^16 "in the middle" of the path tracking.
Figure 2.2: The path condition numbers, in log10 scale, of a path defined by Equation (2.2), Ĥ_L = 0 (shown in solid lines), and that of the affine associate of the same projective path (shown in dashed line) defined by the original homotopy H = 0, tracked for solving the cyclic7 problem. The two paths represent the same projective path in CP^n × [0, 1] defined by Ĥ = 0, but the difference in their respective path condition numbers is striking!

Figure 2.3: The histogram of maximum path condition in log10 scale for all paths defined by the original homotopy H = 0 (in striped bars) and those defined by the homotopy Ĥ_L = 0 of Equation (2.2) (in solid bars) for solving the cyclic7 problem. The height of each bar indicates the total number of paths having the corresponding maximum path condition. For example, the leftmost striped bar has a height of 852 and occupies a horizontal interval near 2; this bar signifies that there are 852 paths defined by the original H = 0 having maximum path condition approximately in the scale of 10^2. This graph represents the average results of 10 different runs. While the two approaches are essentially tracking different projections of the same set of projective paths defined by Ĥ = 0, the difference in maximum path condition distribution is indeed striking!

Figure 2.4: Path condition, in log10 scale, along a single path tracked for solving the barry problem using projective path tracking algorithms on S^{2n+1}/U. The maximum path condition exceeds 10^9.

Figure 2.5: The 2-norm of rows, in log10 scale, of the Jacobian matrix along a single path defined by Ĥ = 0, tracked using projective path tracking algorithms on S^{2n+1}/U for solving the barry problem.

Figure 2.6: The comparison between path condition of a single path tracked in solving the barry problem using projective path tracking with (dashed) and without (solid) the dynamic row-scaling.
Figure 2.7: The flowchart summarizes the overall projective path tracking algorithm.

Figure 2.8: The histogram shows the distribution of maximum path conditions along all paths tracked for solving the cyclic7 problem using projective path tracking algorithms on S^{2n+1}/U together with the dynamic row-scaling technique.

Figure 2.9: The histogram shows the distribution of maximum path conditions along all paths tracked for solving the cyclic7 problem with the two different methods. The height of each solid bar represents the number of paths having the corresponding maximum path condition when the traditional method with affine charts described in Equation (2.2) is used. The striped bars show the same information when the projective path tracking algorithms on S^{2n+1}/U developed in this chapter are used. As with previous cases, this graph represents the average result of 10 different runs.

Chapter 1 Introduction

1.1 The basics of homotopy continuation methods for solving polynomial systems

Solving systems of polynomial equations is an important problem in mathematics. It has a wide range of applications in many fields of mathematics, science, and engineering. In this article we restrict our attention to systems of n polynomial equations in n variables with complex coefficients of the form

P(x) = \begin{cases}
p_1(x_1, \dots, x_n) = \sum_{k_1, \dots, k_n} c^{(1)}_{k_1, \dots, k_n} x_1^{k_1} \cdots x_n^{k_n} = 0 \\
\quad \vdots \\
p_n(x_1, \dots, x_n) = \sum_{k_1, \dots, k_n} c^{(n)}_{k_1, \dots, k_n} x_1^{k_1} \cdots x_n^{k_n} = 0
\end{cases}

which will simply be called polynomial systems. By Abel's impossibility theorem and Galois theory, explicit formulae for solutions to such systems by radicals are unlikely to exist. So numerical computation arises naturally in the solution of such systems.
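For concreteness, a system in the coefficient form above can be represented directly as a map from exponent tuples (k_1, ..., k_n) to coefficients. The following is a minimal illustrative sketch, not code from this work; the helper names eval_poly and eval_system are hypothetical:

```python
import numpy as np

def eval_poly(coeffs, x):
    """Evaluate a polynomial stored as {(k1,...,kn): c} at the point x."""
    x = np.asarray(x, dtype=complex)
    return sum(c * np.prod(x ** np.array(ks)) for ks, c in coeffs.items())

def eval_system(polys, x):
    """Evaluate a system P = (p1, ..., pn) of such polynomials."""
    return np.array([eval_poly(p, x) for p in polys])

# Example: p1 = x1^2 + x2^2 - 1, p2 = x1 - x2 (a circle and a line).
p1 = {(2, 0): 1.0, (0, 2): 1.0, (0, 0): -1.0}
p2 = {(1, 0): 1.0, (0, 1): -1.0}
s = 1 / np.sqrt(2)
print(eval_system([p1, p2], [s, s]))  # both residuals vanish at (1/sqrt(2), 1/sqrt(2))
```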
Homotopy continuation methods represent a major class of numerical methods for solving polynomial systems. Instead of attacking a polynomial system P(x) = 0 head on, homotopy continuation methods consider it as a member of a family of closely related polynomial systems parametrized by a single real parameter. One member Q(x) = 0 of this family should be trivial to solve, and the solutions of this trivial system should be connected via smooth solution paths to all isolated solutions of the target system P(x) = 0. More precisely, we construct a homotopy H : C^n × [0, 1] → C^n between the given polynomial system P and some chosen system Q: H is a continuous map from the product space C^n × [0, 1] to C^n such that H(x, 0) ≡ Q(x) and H(x, 1) ≡ P(x). Throughout this article we require H(x, t) to be algebraic in x, although the method is widely applicable to nonlinear homotopies. For the purpose of the endgame analysis in Chapter 3, we also require H(x, t) to be holomorphic in t on a domain containing the unit interval (0, 1) ⊂ R ⊂ C when t is considered as a complex variable. We further require the construction of H(x, t) to have the following three properties [17]:

1. (Triviality) The solutions of H(x, 0) = Q(x) = 0 are known.
2. (Smoothness) The solution set of H(x, t) = 0 for t ∈ (0, 1) consists of a finite number of smooth paths in C^n × (0, 1), each parametrized by t.
3. (Accessibility) Every isolated solution of H(x, 1) = P(x) = 0 is reached by some path originating from t = 0, at a solution of H(x, 0) = Q(x) = 0.

When the three properties hold, the solution paths defined by H(x, t) = 0 can be traced from the initial points, the solutions of Q(x) = 0, at t = 0 to all solutions of the target problem P(x) = 0 using standard numerical techniques.
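A common concrete construction satisfying the triviality property is the linear blend H(x, t) = (1 - t) γ Q(x) + t P(x), where Q has known roots and γ is a random complex constant (the usual device for achieving smoothness with probability one). The following is a minimal illustrative sketch, not taken from this work; note that here H(x, 0) equals Q(x) only up to the constant factor γ, which does not change the zero set:

```python
import numpy as np

def make_linear_homotopy(P, Q, gamma):
    """H(x, t) = (1 - t) * gamma * Q(x) + t * P(x): H(., 0) has Q's zeros, H(., 1) = P."""
    def H(x, t):
        return (1 - t) * gamma * Q(x) + t * P(x)
    return H

# A toy target system P and a start system Q with obvious roots.
P = lambda x: np.array([x[0]**2 + x[1] - 3, x[0] * x[1] - 1])
Q = lambda x: np.array([x[0]**2 - 1, x[1]**3 - 1])
gamma = np.exp(1j * 0.7)          # a "generic" complex constant
H = make_linear_homotopy(P, Q, gamma)

x = np.array([1.0 + 0j, 1.0 + 0j])  # a known root of Q
print(np.abs(H(x, 0.0)))            # all ~0: start points solve H(., 0) = 0
print(H(x, 1.0))                    # equals P(x)
```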
During the last three decades, homotopy continuation methods have seen extremely active development and grown into a large class containing a wide range of different homotopy constructions, including the Total Degree Homotopy [16], the m-Homogeneous Homotopy [22], the Random Product Homotopy [18], the Cheater's Homotopy [19], the Polyhedral Homotopy [10], and many more. The homotopy continuation method has proved to be a reliable and efficient class of numerical methods for approximating all isolated zeros of polynomial systems. Furthermore, as the field has matured, it is now used as a basic building block for other numerical methods, such as numerical irreducible decomposition algorithms [25], opening up new possibilities.

1.2 Basic affine path tracking algorithms

Fix any path γ ⊂ C^n × (0, 1) defined by the homotopy H(x, t) = 0. By the smoothness assumption, γ can be parametrized by the t-variable, and we can write x as a smooth function x(t) of t with H(x(t), t) = 0. Then it is easy to see that the function x(t) must satisfy the system of ordinary differential equations

H_x(x(t), t) · dx/dt + H_t(x(t), t) = 0,  (1.1)

or simply H_x dx/dt = -H_t, commonly known as the Davidenko differential equation [1]. This forms the basis of the numerical path tracking algorithms with which one can trace a solution path from its starting point. While any numerical solver of ordinary differential equations can, in principle, be applied to Equation (1.1) and thus used for path tracking, the special class of predictor-corrector methods is generally preferred. In this scheme, an efficient but potentially inaccurate "predictor" is responsible for producing a rough estimate of the next point on the path using the information of known points on the path. Then a series of Newton-like "corrector" iterations is employed to bring the point approximately back to the path.
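The tangent vector prescribed by the Davidenko equation can be computed at any point of the path by a single linear solve. A minimal illustrative sketch, with a toy one-equation homotopy whose path is known in closed form:

```python
import numpy as np

def path_tangent(Hx, Ht, x, t):
    """Solve the Davidenko equation H_x * dx/dt = -H_t for the path tangent."""
    return np.linalg.solve(Hx(x, t), -Ht(x, t))

# Toy homotopy H(x, t) = x^2 - t (n = 1): the solution path is x(t) = sqrt(t),
# so the exact tangent at t is 1 / (2 * sqrt(t)).
Hx = lambda x, t: np.array([[2 * x[0]]])   # partial derivative in x
Ht = lambda x, t: np.array([-1.0])         # partial derivative in t
t = 0.25
x = np.array([np.sqrt(t)])
v = path_tangent(Hx, Ht, x, t)
print(v)  # ~[1.0], matching d/dt sqrt(t) at t = 0.25
```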
One of the most basic predictor-corrector configurations is the duet of Euler's method and Newton's iteration, in which the prediction x̃(t_0 + Δt) for the value of x at t_1 = t_0 + Δt is given by

x̃(t_0 + Δt) = x(t_0) + Δt · ẋ(t_0) = x(t_0) - Δt · H_x(x(t_0), t_0)^{-1} · H_t(x(t_0), t_0).

This prediction step is followed by a series of Newton's iterations: at t_1 = t_0 + Δt, the equation H(x, t_1) = 0 becomes a system of n equations in n unknowns, so Newton's iterations can be used to refine the prediction x̃(t_1) with

x_{k+1} = x_k - H_x(x_k, t_1)^{-1} H(x_k, t_1)

for k = 0, 1, ..., where x_0 = x̃(t_1) is the start point. Higher order predictors such as Hermite interpolation, Adams-Bashforth methods, and Runge-Kutta methods may be used to provide better predictions. More sophisticated correctors which can improve the convergence property of Newton's method also exist. It is the goal of this article to extend the path tracking algorithms to the complex projective space.

1.3 Numerical condition of paths

In the general context of numerical computation, it is a fair question to ask: What does a numerical algorithm do? While it is generally assumed that numerical algorithms find approximate solutions of a given problem, it was pointed out by Wilkinson [31] that they actually compute exact solutions to a nearby problem. The distance between the two problems is known as the backward error. Whether or not this solution is close to the solution of the original problem depends, greatly, on the sensitivity of the solution under perturbation of the problem. The condition number is a numerical measurement of this kind of sensitivity. For instance, in the case of a system of linear equations of the form Ax = b, the condition number with respect to some matrix norm ‖·‖ is given by cond(A) = ‖A‖ · ‖A^{-1}‖, or ∞ when A is singular. Simply put, the condition number and errors are related by the inequality

Forward error ≲ (Condition number) · (Backward error).  (1.2)
We refer to [26] for more precise statements. It is important to note that while the backward error is controlled by the algorithm and computing devices used, the condition number is a property of the problem formulation itself. When the condition number is sufficiently large, the problem is said to be ill-conditioned, and according to inequality (1.2), one cannot provably control the forward error whenever any backward error is present. In other words, when the question is bad, there is no good answer.

Here we would like to assign such a condition number to the path tracking problems homotopy continuation methods intend to solve. We find it unlikely that a single number can characterize the condition of such a complex problem. In this section, we will define a weaker concept, the condition of a homotopy path at a point, in terms of a specific linear equation. We shall then justify the usefulness of such a concept.

Fix a path γ ⊂ C^n × (0, 1) defined by H(x, t) = 0. By the smoothness condition of the homotopy construction, γ can be parametrized by t as a smooth function x(t) on (0, 1). We have stated in Equation (1.1) that this smooth function must satisfy the equation H_x dx/dt = -H_t. Thus, the tangent vector v of the smooth function x(t) at a fixed point (x, t) on the path is a solution to the linear system

H_x v = -H_t.  (1.3)

We will call this linear system the tangent vector problem and define the condition number of the path γ at the point (x, t) to be the condition number of the matrix H_x on the left hand side of the above equation. If a path has a sufficiently large condition number at a point with respect to some given threshold, we say the path is ill-conditioned at that point. In general, a path is said to be ill-conditioned if it is ill-conditioned at any point; otherwise it is said to be well-conditioned.
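As an illustration of this definition, the path condition number can be computed pointwise as the condition number of H_x. The toy homotopy below (illustrative, not from this work) has a path approaching a singular point as t → 0, and the condition blows up accordingly:

```python
import numpy as np

def path_condition(Hx, x, t):
    """Condition number of the Jacobian H_x at a point (x, t) on the path."""
    return np.linalg.cond(Hx(x, t))

# Homotopy H(x, t) = (x1^2 - t, x2 - 1): the path x(t) = (sqrt(t), 1)
# approaches the singular point x1 = 0 as t -> 0, so cond(H_x) blows up.
Hx = lambda x, t: np.array([[2 * x[0], 0.0], [0.0, 1.0]])
for t in (0.5, 1e-4, 1e-8):
    x = np.array([np.sqrt(t), 1.0])
    print(t, path_condition(Hx, x, t))
```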
The threshold, of course, depends on many factors such as the precision of the floating point arithmetic, the desired accuracy of solutions, and the nature of the problem or its application.

1.3.1 Why path condition matters

It is worth reiterating that the path condition at a point, as defined above, is a property of the tangent vector problem (1.3) at that point, determined by the homotopy used; it is independent of the algorithm with which Equation (1.3) is solved and of the precision of the floating point arithmetic in which the computation is carried out. Also note that error is almost always present: even if exact arithmetic could be used, discretization error would still exist.

A direct justification of the meaning of the condition number of (1.3) is that, as an experimentally verified rule of thumb, if the condition number is 10^d, then Gaussian elimination will generally lose d decimal places worth of accuracy to rounding error [27, p.269]. Thus it is a direct measure of the sensitivity of the solution of the tangent vector problem with respect to perturbation when Gaussian elimination is used. Its effect on other methods varies.

While a large path condition number does not make the path impossible to track, it will adversely affect any path tracking method that depends on the direct or indirect solution of Equation (1.3). From the user's point of view, the effect of the path condition is twofold. First, it is a general experience that bad (large) path condition leads to very slow convergence for many numerical algorithms used for path tracking, and vice versa. A quantitative discussion of the computational complexity of path tracking in relation to the condition number can be found in [4]. Second, when the path condition number is sufficiently large, one cannot obtain approximations of the path tangent vector with any reasonable accuracy (using any algorithm in any arithmetic).
This casts serious doubt on the validity of the final solutions obtained by the overall homotopy continuation method. In short, the tracking of ill-conditioned paths is slower and less trustworthy.

1.3.2 Ill-conditioned paths

A path can become ill-conditioned for a number of reasons. Here we will discuss four of the major causes. There are also other numerical causes, which we will analyze in more detail in later sections.

1.3.2.1 Divergent paths

By the smoothness condition, along each path we can consider x as a smooth function x(t) of t for t ∈ (0, 1). While the hope is that x(t) converges to a solution of the target system P(x) = 0 in C^n as t → 1, it can happen that x(t) does not converge to any point in C^n and instead becomes unbounded as t → 1. In this case, we say the path x(t) diverges, and such a path is called a divergent path. Divergent paths have infinite arc length and are ill-conditioned; tracking such paths directly is generally impossible. Since the genesis of general homotopy continuation methods for solving polynomial systems, much effort has been devoted to constructing the homotopy so that the number of divergent paths is minimized. Despite the tremendous progress made in recent years, the handling of divergent paths remains an important problem.

1.3.2.2 Paths with large coordinates

Even if a path does not diverge, it is still possible for it to contain points with arbitrarily large coordinates, since points in C^n × [0, 1] in general have unbounded coordinates. Notice that since H is algebraic in x, the magnitude of all non-constant entries of H_x grows unboundedly as x → ∞. In this situation, if the entries of H_x grow at different rates, the path will likely become ill-conditioned when x is large. Geometrically speaking, as the path x(t) escapes toward infinity in C^n × [0, 1], the norm of its tangent vector ẋ(t) = dx/dt grows unboundedly, forcing H_x to become closer and closer to singular.
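This growth is easy to observe numerically. In the illustrative sketch below (a toy example, not from this work), the Jacobian rows of a system with mixed degrees scale at different rates, and the condition number of H_x grows with the magnitude of the coordinates:

```python
import numpy as np

# For the system (x1^3 + x2 - 1, x1 + x2 - 1), the two Jacobian rows scale at
# different rates as ||x|| grows, driving the condition number of H_x upward.
Hx = lambda x: np.array([[3 * x[0]**2, 1.0],
                         [1.0,         1.0]])
for r in (1.0, 1e3, 1e6):
    x = np.array([r, r])
    print(r, np.linalg.cond(Hx(x)))  # grows roughly like 3 * r^2
```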
1.3.2.3 Paths approaching singular points in the middle

In this context, a singular point of a path is a point at which the smoothness condition fails. More precisely, it is a point where the path cannot be parametrized smoothly by the variable t. By assumption, when the homotopy is properly constructed, such points should never exist on the path away from the end point at t = 1. However, it is possible for a path to have a close encounter with a singularity, and when the path is close to a singularity, the path condition number can also be large.

1.3.2.4 Paths approaching singular end points

The smoothness condition requires that the paths be parametrized smoothly by t for t ∈ (0, 1). However, it is also consistent with this assumption that a path hits a singular end point at t = 1, in the sense that x is not a smooth function of t at t = 1. The analysis of the structure of the path near such a singular end point is the topic of Chapter 3.

1.3.2.5 Summary of ill-conditioned paths

Firstly, divergent paths are caused by the non-compactness of C^n as well as the choice of the homotopy. Secondly, paths with large coordinates exist largely because points in C^n have unbounded coordinates. Thirdly, unfortunate encounters with singularities in the middle are to be blamed on the formulation of the homotopy itself. Finally, singular end points are a feature of the target system. In this article we develop a projective path tracking method that eliminates the first three causes of ill-conditioned paths while partially dealing with the last cause.

Chapter 2 Projective Path Tracking

As we mentioned in Chapter 1, from the point of view of path condition, divergent paths are ill-conditioned, and tracking such paths is generally impossible. Divergent paths exist because C^n is not compact as a topological space.
If we replace C^n with a compact topological space W in which C^n is embedded as a dense subset, then one can show that all homotopy paths, now in W × [0, 1], must converge to points inside W at t = 1 and have finite arc length [17]. Such a compact space W is called a compactification of C^n. One of the most commonly used compactifications in the context of algebraic geometry is the complex projective space CP^n. In this chapter, we will develop the tools for homotopy path tracking in CP^n.

First, we will explore existing methods for path tracking in complex projective space and discuss their need for improvement. In the second portion of this chapter we will briefly outline the Riemannian geometry of the complex projective space. While most of the standard constructions and concepts can be found in sufficiently advanced textbooks on Riemannian geometry and complex geometry, for completeness we shall define all the concepts that are likely beyond an average first-year graduate course on differential geometry. Besides the nuts and bolts of the basic geometry, the end results of this purely theoretical discussion include three computational tools: the tangent space formula, the distance formula, and the geodesic definition on CP^n. Guided by the theory, we will derive methods for projective path tracking in the third portion. Then, in Section 2.6, through numerical analysis of the algorithms, we justify theoretically that the proposed projective path tracking algorithms do not pollute path conditions. In the sections that follow, we demonstrate some useful numerical techniques for projective path tracking. Finally, we shall show, via computational results, that the geometrically motivated algorithms are also numerically sound.

2.1 Background: working in projective space

One definition of the complex projective space is the set of equivalence classes

CP^n = ( C^{n+1} \ {(0, ..., 0)} ) / ∼  (2.1)

where x ∼ y for x, y ∈ C^{n+1} if there exists a λ ∈ C \ {0} such that x = λy.
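Computationally, a point of CP^n can be handled by picking a canonical representative of its homogeneous coordinates. One common choice, sketched below purely for illustration (this is not the method developed later in this chapter), scales a representative onto the unit sphere S^{2n+1} and fixes the remaining phase freedom:

```python
import numpy as np

def normalize(x):
    """Canonical representative of [x0 : ... : xn]: unit norm, with the first
    nonzero coordinate rotated to be real and positive."""
    x = np.asarray(x, dtype=complex)
    x = x / np.linalg.norm(x)
    pivot = x[np.argmax(np.abs(x) > 1e-14)]  # first (numerically) nonzero entry
    return x * (pivot.conjugate() / abs(pivot))

a = normalize([1 + 1j, 2.0, -3j])
b = normalize(np.array([1 + 1j, 2.0, -3j]) * (2 - 5j))  # same projective point
print(np.allclose(a, b))  # True: equivalent coordinates map to one representative
```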
In other words, points of CP^n are one-dimensional linear subspaces of C^{n+1} with the "origin" deleted. It is common to use the notation [x_0 : ⋯ : x_n] for the homogeneous coordinates of a point in CP^n, with [x_0 : ⋯ : x_n] being equivalent to [λx_0 : ⋯ : λx_n] for any λ ∈ C \ {0}. With such coordinates, CP^n, as a set, can be covered by the subsets U_j = { [x_0 : ⋯ : x_n] | x_j ≠ 0 } for j = 0, ..., n, called standard charts. Each standard chart U_j is equivalent to C^n, as a set, via the identification

[x_0 : ⋯ : x_n] → ( x_0/x_j, ..., x_{j-1}/x_j, x_{j+1}/x_j, ..., x_n/x_j ).

These charts equip the set CP^n with the structure of a complex manifold [28].

The zero sets of polynomials in CP^n are not well defined in general, since each point in CP^n has infinitely many different coordinates. However, the zero set of a homogeneous polynomial p ∈ C[x_0, ..., x_n] is indeed well defined. More generally, for a system P = (p_1, ..., p_n) of homogeneous polynomials, the common zero set of P in CP^n is also well defined. So given any polynomial f ∈ C[x_1, ..., x_n] of degree d, we can define its homogenization to be

f̂(x_0, ..., x_n) = x_0^d f( x_1/x_0, ..., x_n/x_0 )

which is clearly homogeneous of degree d, i.e.,

f̂(λx_0, ..., λx_n) = λ^d f̂(x_0, ..., x_n),

so its zero set is well defined in CP^n. Yet f̂ is still closely related to f in the sense that f̂(1, x_1, ..., x_n) = f(x_1, ..., x_n), i.e., f̂ and f are equivalent on the chart U_0 defined by x_0 ≠ 0. This common construction allows us to "lift" a problem into the complex projective space. Detailed discussions of this technique can be found in any textbook on basic algebraic geometry.

To apply this to the homotopy continuation method, given a homotopy H(x_1, ..., x_n, t) = (h_1, ..., h_n) defined on C^n × [0, 1] that is algebraic in the variables x_1, ..., x_n, we shall consider the homogenizations with respect to (x_1, ..., x_n)

ĥ_j(x_0, ..., x_n, t) = x_0^{d_j} h_j( x_1/x_0, ..., x_n/x_0, t )

for each j = 1, ..., n, where d_j = deg h_j, and the new homotopy Ĥ(x_0, x_1, ..., x_n, t) = (ĥ_1, ..., ĥ_n), which is now defined on C^{n+1} × [0, 1]. Then for any fixed t ∈ [0, 1] the common zero set of Ĥ(x_0, ..., x_n, t) in CP^n is indeed well defined. We hence consider the equation Ĥ = 0 to define paths in CP^n × [0, 1], which we shall call projective paths. To avoid confusion, the original paths defined by H = 0 in C^n × [0, 1] will be called affine paths. Clearly, for any path γ ⊂ C^n × [0, 1] defined by H = 0,

γ̂ = { ([1 : x_1 : ⋯ : x_n], t) | (x_1, ..., x_n, t) ∈ γ }

gives us a projective path defined by Ĥ = 0. In this case, we call γ the affine associate of γ̂. One key advantage of working in CP^n is that it is compact as a topological space; thus all projective paths defined by Ĥ = 0 must converge and be of finite length. In the following subsections, we will discuss existing path tracking methods in the projective space CP^n.

One important theorem that is used many times in our discussion is Euler's theorem, which we state here for easy reference.

Theorem 1 (Euler's identity for homogeneous functions). Given a smooth homogeneous function f(x_0, ..., x_n) of order d,

∑_{j=0}^{n} x_j ∂f/∂x_j = d · f(x_0, ..., x_n).

[29, Theorem 10.2] contains a simple proof of this theorem. From this we can deduce the following useful fact:

Corollary 1. Let F = (F_1, ..., F_n) with each F_j ∈ C[x_0, ..., x_n] homogeneous, and let (x_0, ..., x_n) ∈ C^{n+1} be a point such that F(x_0, ..., x_n) = 0. Then

DF_{(x_0, ..., x_n)} · (x_0, ..., x_n)^T = (0, ..., 0)^T.

In other words, a common zero x ∈ C^{n+1} of a system of homogeneous polynomials is also a right zero vector of the Jacobian matrix of the system at x.
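Both the homogenization and Euler's identity are easy to check numerically. The following sketch (illustrative, with a hand-homogenized example polynomial) verifies the identity x · ∇f̂(x) = d · f̂(x) and the reduction f̂(1, x_1, ..., x_n) = f(x_1, ..., x_n) on the chart x_0 = 1:

```python
import numpy as np

# f(x1, x2) = x1^2 + x2 - 3 has degree d = 2; its homogenization is
# fh(x0, x1, x2) = x0^2 * f(x1/x0, x2/x0) = x1^2 + x2*x0 - 3*x0^2.
fh = lambda x: x[1]**2 + x[2] * x[0] - 3 * x[0]**2
grad_fh = lambda x: np.array([x[2] - 6 * x[0], 2 * x[1], x[0]])

x = np.array([0.5, -1.2, 2.0])
# Euler's identity: x . grad(fh)(x) = d * fh(x) with d = 2.
print(np.dot(x, grad_fh(x)), 2 * fh(x))  # the two values agree

# On the chart x0 = 1, fh reduces to the original f.
f = lambda x1, x2: x1**2 + x2 - 3
print(fh([1.0, 1.3, 0.4]), f(1.3, 0.4))  # again equal
```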
2.1.1 Projective path tracking using affine charts

The major difficulty in working with $\hat{H}$, defined on $\mathbb{CP}^n \times [0,1]$, is that for a fixed $t \in [0,1]$ a solution to $\hat{H} = 0$, being a point in $\mathbb{CP}^n$, has infinitely many different but equivalent homogeneous coordinates. When it is considered as a system of equations in $\mathbb{C}^{n+1}$ for fixed $t$, $\hat{H} = 0$ has $n+1$ unknowns but only $n$ equations and hence has no isolated solutions. Thus traditional path tracking algorithms cannot be directly applied to the problem.

In theory, this problem can be resolved by simply transferring the problem $\hat{H} = 0$ back to the affine space $\mathbb{C}^n$. As we have just stated, $\mathbb{CP}^n$ is covered by the standard charts $U_j = \{ [x_0 : \cdots : x_n] \mid x_j \neq 0 \}$ for $j = 0, \ldots, n$, each identified with a copy of $\mathbb{C}^n$. So one could simply work in one such chart. On the chart defined by $x_j \neq 0$, after scaling, we can assume $x_j = 1$. Then on this chart $\hat{H} = 0$ can be represented by the new system of equations

$$\begin{cases} \hat{H}(x_0, \ldots, x_n, t) = 0 \\ x_j - 1 = 0. \end{cases}$$

In particular, using $j = 0$ we obtain exactly the original problem we started with, since $\hat{H}(1, x_1, \ldots, x_n, t) = H(x_1, \ldots, x_n, t)$. The above is a system in $\mathbb{C}^{n+1} \times [0,1]$, and existing affine path tracking algorithms for $\mathbb{C}^{n+1}$ can be used.

This method can be further generalized. A geometric interpretation of the last equation $x_j - 1 = 0$ in the above system is that it defines an affine subspace of (complex) codimension one, i.e., a hyperplane $L$ of $\mathbb{C}^{n+1}$. Then the above system can be interpreted as the projection of the projective homotopy paths defined by $\hat{H} = 0$ to the hyperplane $L$. With this interpretation in mind, there is no reason to restrict ourselves to the hyperplanes defined by $x_j - 1 = 0$. We can generalize the method by using a hyperplane $L$ of $\mathbb{C}^{n+1}$ defined by a linear equation $a_0 x_0 + \cdots + a_n x_n - 1 = 0$; i.e., we can consider the system induced by $L$,

$$\hat{H}^L(x, t) = \begin{pmatrix} \hat{H}(x_0, \ldots, x_n, t) \\ a_0 x_0 + \cdots + a_n x_n - 1 \end{pmatrix} = 0. \tag{2.2}$$

For later reference, we state the following lemma, which was shown in [21] and [25]:

Lemma 1. As long as the original homotopy $H$ satisfies the smoothness condition, given any fixed $t \in (0,1)$, for almost all $L$, the projection of the zero set of $\hat{H} = 0$ to $L$ consists of smooth points.

Thus one can simply choose a generic hyperplane $L$ and apply affine path tracking algorithms to the homotopy $\hat{H}^L = 0$ induced by $L$. Methods based on this idea are widely used and are adopted by software packages such as Bertini and Hom4ps-3.0 with useful results.

However, this method is not completely without flaws. First of all, while $\mathbb{CP}^n$ itself is compact, its projection onto the hyperplane $L$ in $\mathbb{C}^{n+1}$ is not. So during the path tracking process the magnitude of the components $x_j$ for $j = 0, \ldots, n$ can still grow without bound, which is indeed the original problem we set out to solve. But most importantly, for numerical path tracking algorithms to work well in the real world, it is crucially important that the paths not only satisfy the theoretical smoothness condition but also possess good numerical path condition. To illustrate the problem in this regard, let us consider an actual computational result where the homotopy method based on Equation (2.2) is used to solve the cyclic7 [3] problem. A randomly chosen $a = (a_0, \ldots, a_n)$ is used to represent the generic hyperplane $L$. Figure 2.1 shows the path condition number along a single path. Clearly visible in the graph is the fact that the path condition can reach close to $10^{16}$.

Figure 2.1: The path condition number, in log10 scale, of a single path defined by Equation (2.2) $\hat{H}^L = 0$ for solving the cyclic7 problem: the path condition can be as large as $10^{16}$ "in the middle" of the path tracking.

It is
tempting to conclude that we had a close encounter with a singularity and hence the original homotopy $H$ or even the target system (cyclic7) is to be blamed. However, when compared with the path condition of the affine associate of exactly the same projective path, shown in Figure 2.2, the difference is quite striking: while the path condition number of the affine associate was kept well under 100 throughout the entire path tracking process, the projection of the same projective path to $L$ has path condition close to $10^{16}$ at certain points. So the projection to the hyperplane $L$, in this particular case, has polluted the path condition.

Figure 2.2: The path condition numbers, in log10 scale, of a path defined by Equation (2.2) $\hat{H}^L = 0$ (shown in solid lines) and that of the affine associate of the same projective path (shown in dashed lines) defined by the original homotopy $H = 0$, tracked for solving the cyclic7 problem. The two paths represent the same projective path in $\mathbb{CP}^n \times [0,1]$ defined by $\hat{H} = 0$, but the difference in their respective path condition numbers is striking!

To understand the cause, we shall now analyze the path condition when the homotopy defined in Equation (2.2) is used.

2.1.2 Numerical analysis of the homotopy path

As defined in Chapter 1, at a fixed $(x, t) = (x_0, \ldots, x_n, t) \in \mathbb{C}^{n+1} \times [0,1]$ on a path defined by Equation (2.2), the path condition is the condition number of the Jacobian matrix

$$J = \begin{pmatrix} \frac{\partial \hat{H}_1}{\partial x_0} & \cdots & \frac{\partial \hat{H}_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial \hat{H}_n}{\partial x_0} & \cdots & \frac{\partial \hat{H}_n}{\partial x_n} \\ a_0 & \cdots & a_n \end{pmatrix}$$

of $\hat{H}^L$ with respect to $x = (x_0, \ldots, x_n)$, evaluated at $(x, t)$. In light of Corollary 1, we can see that

$$J \cdot \frac{x}{\|x\|_2} = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ \frac{a_0 x_0 + \cdots + a_n x_n}{\|x\|_2} \end{pmatrix} = \alpha \cdot e_{n+1},$$

where $\alpha := (a_0 x_0 + \cdots + a_n x_n) / \|x\|_2$ and $e_{n+1} = (0, \ldots, 0, 1) \in \mathbb{C}^{n+1}$. By assumption $\alpha \neq 0$, hence we have $J^{-1} e_{n+1} = \frac{x}{\alpha \|x\|_2}$, so for the unit vector $e_{n+1}$,

$$\| J^{-1} e_{n+1} \|_2 = \frac{\|x\|_2}{|\alpha| \, \|x\|_2} = \frac{1}{|\alpha|}.$$

Therefore $\| J^{-1} \|_2 \geq \frac{1}{|\alpha|}$, and the path condition at $(x, t)$ is at least $\frac{1}{|\alpha|}$. When the absolute value of $\alpha$ is small, the path becomes ill-conditioned. Given any threshold $\mu$ for the path condition, the set of $a = (a_0, \ldots, a_n) \in \mathbb{C}^{n+1}$ that cause the path condition at $(x, t)$ to be greater than $\mu$ contains at least the set

$$\left\{ a \in \mathbb{C}^{n+1} : |a_0 x_0 + \cdots + a_n x_n| / \|x\|_2 < 1/\mu \right\},$$

even if the original homotopy $H(x, t)$ is well-conditioned. Furthermore, unless the chart is to be changed at every step, the same chart must be used on some $t$-interval $[t_1, t_2]$; then the set of $a$ that will cause the path condition to be greater than $\mu$ on that $t$-interval contains at least

$$\left\{ a \in \mathbb{C}^{n+1} : |a_0 x_0(t) + \cdots + a_n x_n(t)| / \|x(t)\|_2 < 1/\mu \text{ for some } t \in [t_1, t_2] \right\},$$

which could potentially be of rather large Lebesgue measure in $\mathbb{C}^{n+1}$. From a probability point of view, while for a fixed $(x, t)$ on a path the set of choices of $a \in \mathbb{C}^{n+1}$ making the path singular is small enough to be ignored, the set of choices of $a$ making the path ill-conditioned may not be. In other words, while "wrong" choices are rare, bad choices are plenty. In this case it can be hard to avoid them.

To have a quantitative idea of the probability of making such bad choices, we shall look at a distribution of maximum path condition numbers. The cyclic7 problem computation is repeated 10 times using both homotopies $H = 0$ and $\hat{H}^L = 0$. The maximum path condition along each path is collected. The distributions of maximum path condition, in log10 scale, resulting from the two different approaches are shown in Figure 2.3. The two approaches are essentially tracking the same paths: the only difference is the hyperplanes the paths are projected to. However, the difference in the distributions of the maximum path condition is striking.
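The lower bound $1/|\alpha|$ derived above can be observed directly in a small numerical experiment. A minimal sketch (assuming NumPy; the homogeneous system and its zero are hypothetical illustrations of ours, not from the text) builds the augmented Jacobian $J$ of Equation (2.2) for a random $a$ and checks that $\| J^{-1} e_{n+1} \|_2 = 1/|\alpha|$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Jacobian of a hypothetical homogeneous system (F1 = x0*x2 - x1^2, F2 = x0^2 + x1*x2)
def DF(x):
    x0, x1, x2 = x
    return np.array([[x2, -2 * x1, x0],
                     [2 * x0, x2, x1]])

x = np.array([-1.0, 1.0, -1.0])                    # a common zero of the system
a = rng.normal(size=3) + 1j * rng.normal(size=3)   # random hyperplane coefficients

J = np.vstack([DF(x).astype(complex), a])          # augmented Jacobian of H^L
alpha = (a @ x) / np.linalg.norm(x)
e3 = np.array([0.0, 0.0, 1.0])

# Corollary 1 gives J (x/|x|) = alpha * e_{n+1} ...
assert np.allclose(J @ (x / np.linalg.norm(x)), alpha * e3)
# ... hence |J^{-1} e_{n+1}|_2 = 1/|alpha|, so cond(J) >= 1/|alpha| here
assert np.isclose(np.linalg.norm(np.linalg.solve(J, e3)), 1 / abs(alpha))
assert np.linalg.cond(J) >= 1 / abs(alpha)
```

Rescaling $a$ so that $|a_0 x_0 + \cdots + a_n x_n| / \|x\|_2$ shrinks drives the condition number of $J$ up accordingly, even though the underlying system is unchanged.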
Clearly shown in the histogram is the significant fraction of ill-conditioned paths defined by $\hat{H}^L = 0$. While double-precision arithmetic can still track all the paths and obtain the correct answers, in light of Inequality (1.2), floating point arithmetic with much higher precision must be used if one intends to obtain any provable accuracy. As expected, the bad condition has also made the path tracking process much slower: in this particular experiment, while the number of paths with maximum path condition greater than $10^{10}$ is less than 5% of the total number of paths, the amount of time the path tracking algorithm spent on them is actually more than what it spent on the remaining 95% of the paths!

The cyclic7 problem is not alone in revealing the numerical problem with the method based on affine charts. As we can see in Table 2.1, the same numerical problem exists in the path tracking of many systems.

System       | Percentage with condition > 10^10 | Percentage with condition > 10^15
cyclic5 [3]  | 1.67%                             | 0.83%
cyclic7 [3]  | 4.73%                             | 0.8%
cyclic11 [3] | 4.97%                             | 1.03%
eco11 [21]   | 0.91%                             | 0.91%
eco15 [21]   | 0.99%                             | 0.71%
9-point [30] | 0.99%                             | 0.12%

Table 2.1: Average percentage, rounded to the second digit after the decimal point, of paths with large maximum path condition number over 10 different runs when the homotopy $\hat{H}^L = 0$ of Equation (2.2) is used to solve different polynomial systems (all divergent paths and those with singular end points are ignored).

At first glance, the percentages seem rather small; however, one must keep in mind that hundreds of thousands of paths are tracked in solving some of these systems, so even a very small percentage translates to a large absolute number of paths with very high maximum path condition. Clearly, a more numerically stable method for path tracking in the complex projective space is needed. In the remaining part of this chapter, we shall develop a projective path tracking method from the point of view of the Riemannian geometry of $\mathbb{CP}^n$.
Figure 2.3: The histogram of maximum path condition in log10 scale for all paths defined by the original homotopy $H = 0$ (in striped bars) and those defined by the homotopy $\hat{H}^L = 0$ of Equation (2.2) (in solid bars) for solving the cyclic7 problem. The height of each bar indicates the total number of paths having the corresponding maximum path condition. For example, the leftmost striped bar has a height of 852 and occupies a horizontal interval near 2. This bar signifies that there are 852 paths defined by the original $H = 0$ having maximum path condition approximately in the scale of $10^2$. This graph represents the average results of 10 different runs. While the two approaches are essentially tracking different projections of the same set of projective paths defined by $\hat{H} = 0$, the difference in maximum path condition distribution is indeed striking!

2.2 Geometry of the complex projective space

First, we shall review the Riemannian geometry of $\mathbb{CP}^n$. The definition $\mathbb{CP}^n = (\mathbb{C}^{n+1} \setminus \{(0, \ldots, 0)\}) / \sim$ merely describes $\mathbb{CP}^n$ as a set and carries no obvious geometric structure. While the standard charts $U_0, \ldots, U_n$ used above equip $\mathbb{CP}^n$ with the structure of a complex manifold, here we shall take another classical route: $\mathbb{CP}^n$ can be realized as the quotient manifold of $S^{2n+1}$ by the action of a compact Lie group, the circle group. This point of view provides us with the "natural" choices for the smooth and Riemannian structures of $\mathbb{CP}^n$. With these additional geometric structures, we can derive tools for projective path tracking algorithms.

2.2.1 Identification between $\mathbb{C}$ and $\mathbb{R}^2$

We identify $\mathbb{C}$ with $\mathbb{R}^2$ via the map $x + iy \mapsto (x, y)$. Under such an identification, multiplication by a complex number $a + bi$ can be considered as an invertible linear transformation on $\mathbb{R}^2$.
It is easy to check that it is represented by the $2 \times 2$ real matrix

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}.$$

We can of course extend this and identify $\mathbb{C}^m$ with $\mathbb{R}^{2m}$ via $(z_1, \ldots, z_m) \mapsto (\operatorname{Re} z_1, \operatorname{Im} z_1, \ldots, \operatorname{Re} z_m, \operatorname{Im} z_m)$. Then scalar multiplication of vectors in $\mathbb{C}^m$ by the complex number $a + bi$, under such an identification, can be viewed as the linear transformation of $\mathbb{R}^{2m}$ given by the $2m \times 2m$ block diagonal matrix

$$\begin{pmatrix} a & -b & & & \\ b & a & & & \\ & & \ddots & & \\ & & & a & -b \\ & & & b & a \end{pmatrix}.$$

In the context of complex geometry, one immediately recognizes that we are simply describing the standard complex structure on $\mathbb{R}^{2m}$. One such scalar multiplication is of special interest to us: multiplication by a complex number $e^{i\theta} = \cos\theta + i\sin\theta$ on the unit circle $U = \{ u \in \mathbb{C} : |u| = 1 \}$ of $\mathbb{C}$, which preserves length. As a linear transformation on $\mathbb{R}^{2m}$, we shall use the same notation

$$e^{i\theta} = \begin{pmatrix} \cos\theta & -\sin\theta & & & \\ \sin\theta & \cos\theta & & & \\ & & \ddots & & \\ & & & \cos\theta & -\sin\theta \\ & & & \sin\theta & \cos\theta \end{pmatrix} \tag{2.3}$$

to represent the block diagonal matrix on the right hand side, which is clearly orthogonal. One can easily check that the familiar statements $(e^{i\theta})^{-1} = e^{i(-\theta)}$ and $e^{i\theta_1} \cdot e^{i\theta_2} = e^{i(\theta_1 + \theta_2)}$ are still true as statements about linear transformations, and that $1 = e^{i0}$ is the identity matrix. Simply put, there is a natural isomorphism between the multiplicative group of the unit circle $U$ of $\mathbb{C}$ and the group of orthogonal matrices of the above form. To go one step further, for a fixed point $v \in \mathbb{R}^{2n}$ the statement

$$\frac{d}{d\theta} e^{i\theta} v = \begin{pmatrix} -\sin\theta & -\cos\theta & & & \\ \cos\theta & -\sin\theta & & & \\ & & \ddots & & \\ & & & -\sin\theta & -\cos\theta \\ & & & \cos\theta & -\sin\theta \end{pmatrix} \cdot v = i e^{i\theta} v$$

still holds in this context.

For $u, v \in \mathbb{R}^{2n}$, the notation $\langle u, v \rangle_{\mathbb{R}} = u^\top v$ always denotes the standard dot product in the Euclidean space. We will use the same symbols $u$ and $v$ for the two corresponding vectors in $\mathbb{C}^n$ and the notation $\langle u, v \rangle_{\mathbb{C}} := u^H v$ for the complex inner product, where $u^H$ is the conjugate transpose of the complex vector $u \in \mathbb{C}^n$.
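The identification and the matrix form of $e^{i\theta}$ are easy to verify numerically. A minimal sketch (assuming NumPy; the helper names `as_real` and `rot` are ours) checks orthogonality, the group law, the correspondence with complex multiplication, and the fact that the real dot product of the identified vectors recovers the real part of the complex inner product:

```python
import numpy as np

def as_real(z):
    """(z1,...,zm) in C^m -> (Re z1, Im z1, ..., Re zm, Im zm) in R^{2m}."""
    return np.column_stack([z.real, z.imag]).ravel()

def rot(theta, m):
    """The 2m x 2m block diagonal matrix representing e^{i theta} on R^{2m}."""
    block = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
    return np.kron(np.eye(m), block)

m = 3
z = np.array([1 + 2j, -0.5j, 3.0 + 0j])
t1, t2 = 0.7, -1.3

R1 = rot(t1, m)
assert np.allclose(R1.T @ R1, np.eye(2 * m))           # e^{i theta} is orthogonal
assert np.allclose(R1 @ rot(t2, m), rot(t1 + t2, m))   # e^{i t1} e^{i t2} = e^{i(t1+t2)}
assert np.allclose(R1 @ as_real(z), as_real(np.exp(1j * t1) * z))  # matches complex mult.

# real dot product vs. complex inner product of the identified vectors
u = np.array([1.0 - 1j, 2j, 0.5 + 0.5j])
assert np.isclose(as_real(u) @ as_real(z), (np.conj(u) @ z).real)
assert np.isclose(as_real(1j * u) @ as_real(z), (np.conj(u) @ z).imag)
```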
There are other possible definitions of the complex inner product. This one is chosen so that the two inner products have an obvious connection:

$$\langle u, v \rangle_{\mathbb{R}} = \operatorname{Re} \langle u, v \rangle_{\mathbb{C}} \qquad \text{and} \qquad \langle iu, v \rangle_{\mathbb{R}} = \operatorname{Im} \langle u, v \rangle_{\mathbb{C}}. \tag{2.4}$$

It is also a convenient fact that the length of a vector satisfies $\|u\|_2^2 = \langle u, u \rangle_{\mathbb{R}} = \langle u, u \rangle_{\mathbb{C}}$ regardless of which inner product is used.

2.2.2 Notation and terminology

In the rest of this chapter the term smooth always means infinitely differentiable, while a manifold is always a connected smooth manifold that is Hausdorff and second-countable. A smooth curve is by assumption a smoothly parametrized smooth submanifold of (real) dimension 1. At a point $x$ of a manifold $M$, we use the notation $T_x M$ for the tangent space of $M$ at $x$. The tangent bundle of $M$ is denoted by $TM$, while the space of its smooth vector fields is denoted by $\mathfrak{X}(M)$. When considering a certain group action on $M$, we use $[x]$ for the orbit of $x \in M$ under that action. Since both inner products $\langle \cdot, \cdot \rangle_{\mathbb{R}}$ and $\langle \cdot, \cdot \rangle_{\mathbb{C}}$ will come into our discussion, we must use the concept of orthogonality with care. The word orthogonal should always mean the relationship of two (real) vectors having their real inner product equal to zero. The term orthogonal complement should carry the same meaning over the reals.

2.2.3 Realizing $\mathbb{CP}^n$ as $S^{2n+1}/U$

Let $S^{2n+1} = \{ x \in \mathbb{R}^{2n+2} : \|x\|_2 = 1 \} = \{ z \in \mathbb{C}^{n+1} : \|z\|_2 = 1 \}$ be the $(2n+1)$-dimensional sphere. It is a standard construction to view $\mathbb{CP}^n$ as the quotient of $S^{2n+1}$ under a group action. This construction will provide us the smooth and Riemannian structures for $\mathbb{CP}^n$. It is clear that each point $(z_0, \ldots, z_n) \in S^{2n+1}$ represents a point $[z_0 : \cdots : z_n]$ in $\mathbb{CP}^n$, and each point in $\mathbb{CP}^n$ has such a representative. More formally, let $\pi : S^{2n+1} \to \mathbb{CP}^n$ be the map given by $(z_0, \ldots, z_n) \mapsto [z_0 : \cdots : z_n]$; then $\pi$ is clearly onto. We would like to use $S^{2n+1}$ as our model for $\mathbb{CP}^n$ via the map $\pi$. It has the obvious advantage that all coordinates of all points are bounded. Indeed, $\|x\|_2 = 1$ for all $x \in S^{2n+1}$.
However, the representative of a point in $\mathbb{CP}^n$ is not unique, i.e., $\pi$ is not 1-to-1, as $\pi(x) = \pi(\lambda x)$ for any $\lambda \in \mathbb{C}^* = \mathbb{C} \setminus \{0\}$. But for the scaling to leave $S^{2n+1}$ invariant, we must have $\lambda = e^{i\theta}$. Multiplication by $e^{i\theta}$, viewed as a transformation on $\mathbb{R}^{2n+2}$, is given by the orthogonal matrix shown in Equation (2.3). It is easy to check that for $x \in S^{2n+1}$, the points of the form $e^{i\theta} x$ with $\theta \in \mathbb{R}$ are exactly those that represent the same point as $x$ itself. Formally speaking,

$$\pi^{-1}(\pi(x)) = \{ e^{i\theta} x \mid \theta \in \mathbb{R} \}.$$

Therefore, as a set, we can identify $\mathbb{CP}^n$ with the set of equivalence classes $\{ [x] : x \in S^{2n+1} \}$ where $[x] := \{ e^{i\theta} x \mid \theta \in \mathbb{R} \}$.

Next we shall show that this identification is also geometric. Let $U = \{ e^{i\theta} \mid \theta \in \mathbb{R} \}$ be the unit circle, which is a compact smooth submanifold of $\mathbb{C} \approx \mathbb{R}^2$. $U$ clearly forms a group under multiplication, which is itself a smooth operation. In other words, $U$ is a compact Lie group. From this point of view each orbit $Ux = [x]$ is exactly the set of points representing the same point in $\mathbb{CP}^n$ via the map $\pi$. Namely, we can consider $\mathbb{CP}^n$ as the quotient $S^{2n+1}/U$ of $S^{2n+1}$ under the group action of $U$. It is easy to check that $U$ acts on $S^{2n+1}$ smoothly and freely. Because $U \subset \mathbb{C}$ is compact, the action is also proper. By the Quotient Manifold Theorem [13, Theorem 9.16], $\mathbb{CP}^n = S^{2n+1}/U$ is a smooth manifold in its own right, and it has the unique smooth structure such that $\pi$ is a smooth submersion; at each point $x \in S^{2n+1}$ we have the decomposition

$$T_x S^{2n+1} = T_{[x]} \mathbb{CP}^n \oplus T_x [x]$$

of the tangent space. Note, however, that the decomposition alone does not explicitly tell us how to choose $T_{[x]} \mathbb{CP}^n$, since any complement of $T_x[x]$ in $T_x S^{2n+1}$ is algebraically isomorphic to $T_{[x]} \mathbb{CP}^n$. Nonetheless, if we take the Riemannian structure of $S^{2n+1}$ into account, then there is essentially only one "natural" choice of $T_{[x]} \mathbb{CP}^n$ and one "natural" Riemannian structure on $\mathbb{CP}^n$. We shall make this statement precise.

As a submanifold of $\mathbb{R}^{2n+2}$, $S^{2n+1}$ inherits a natural Riemannian metric $g_{S^{2n+1}}$, which is a smooth assignment of inner products on the tangent bundle $TS^{2n+1}$. More formally, $g_{S^{2n+1}}$ is a symmetric, positive definite 2-tensor field, simply given by

$$g_{S^{2n+1}}(u, v) = \langle u, v \rangle_{\mathbb{R}}$$

for $u, v \in T_x S^{2n+1}$ at any $x \in S^{2n+1}$. Using this metric, we can decompose the tangent space at $x$ into $T_x S^{2n+1} = V_x \oplus H_x$ where

$$V_x = T_x [x] \qquad \text{and} \qquad H_x = \{ h \in T_x S^{2n+1} \mid g_{S^{2n+1}}(h, v) = 0 \ \text{for all} \ v \in V_x \}.$$

That is, we can decompose $T_x S^{2n+1}$ into the direct sum of two subspaces that are orthogonal with respect to the inner product given by $g_{S^{2n+1}}$ at $x$. In this case, we can simply define $T_{[x]} \mathbb{CP}^n := H_x$, the so-called horizontal space at $x$ with respect to the action of $U$. As stated above, the canonical projection $\pi : S^{2n+1} \to S^{2n+1}/U \approx \mathbb{CP}^n$ is a smooth submersion in terms of the smooth structure given by the Quotient Manifold Theorem, i.e., the pushforward $\pi_* : T_x S^{2n+1} \to T_{[x]} \mathbb{CP}^n$ is surjective, as a linear map, at each point $x \in S^{2n+1}$. There is a unique Riemannian metric $g := g_{\mathbb{CP}^n}$ for $\mathbb{CP}^n$ with respect to which $\pi$ is a Riemannian submersion: a smooth submersion that is also an isometry on the horizontal space $H_x$ over each point $x$, i.e.,

$$g(\pi_*(h_1), \pi_*(h_2)) = g_{S^{2n+1}}(h_1, h_2)$$

for all $h_1, h_2 \in H_x$, where $\pi_*$ is the pushforward of the smooth map $\pi$. In the rest of this chapter the notation $\mathbb{CP}^n$ should always mean the $n$-dimensional complex projective space realized as the quotient manifold $S^{2n+1}/U$, carrying the smooth structure determined by the quotient and the Riemannian metric $g$. Actually, $\mathbb{CP}^n$ has much richer geometric structures than we shall make use of in the following discussion. The Hermitian structure (a continuous assignment of Hermitian products on the tangent bundle $T\mathbb{CP}^n$) and the complex structure, to name a few, are not explicitly used here but are naturally quite important in the study of $\mathbb{CP}^n$. We refer to [4] for a more detailed discussion.
2.2.4 Tangent formula for $\mathbb{CP}^n$

Following the notation introduced above, at any point $x \in S^{2n+1}$, the tangent space $T_{[x]} \mathbb{CP}^n = H_x$, viewed as the horizontal space of $T_x S^{2n+1}$ with respect to the action of $U$, consists of all the vectors that are orthogonal to the vertical space $V_x = T_x[x]$. At $x$ the tangent space $T_x S^{2n+1}$ of $S^{2n+1}$, as an embedded submanifold of $\mathbb{R}^{2n+2}$, is simply the linear subspace $\{x\}^\perp \subseteq T_x \mathbb{R}^{2n+2} \cong \mathbb{R}^{2n+2}$. As for $T_x[x]$, it is clear that near $x$ the one (real) dimensional submanifold $[x]$ is parametrized by $\gamma(\theta) = e^{i\theta} x$ with $\gamma(0) = x$, so

$$\dot{\gamma}(0) = i e^{i0} x = ix.$$

As a vector in $T_x \mathbb{R}^{2n+2}$, it is the generator of the one-dimensional vector space $T_x[x]$. So far, we have

$$T_x[x] = \operatorname{span}\{ix\} \qquad \text{and} \qquad T_x S^{2n+1} = \{x\}^\perp.$$

As we just stated, $T_{[x]} \mathbb{CP}^n$ is the linear subspace of $T_x S^{2n+1} = \{x\}^\perp$ that is orthogonal to $T_x[x] = \operatorname{span}\{ix\}$. So it is simply $\{ix\}^\perp \cap \{x\}^\perp$ in $T_x \mathbb{R}^{2n+2} \cong \mathbb{R}^{2n+2}$, i.e., it is the set of vectors $v$ such that

$$\langle ix, v \rangle_{\mathbb{R}} = 0 \qquad \text{and} \qquad \langle x, v \rangle_{\mathbb{R}} = 0,$$

which is equivalent to the single complex equation $\langle x, v \rangle_{\mathbb{C}} = 0$ based on the observation from Equation (2.4). Therefore the tangent space can be characterized as

$$T_{[x]} \mathbb{CP}^n = \left\{ v \in T_x \mathbb{R}^{2n+2} : \langle x, v \rangle_{\mathbb{C}} = 0 \right\}.$$

In particular, for a smooth curve $x(t) : [0,1] \to \mathbb{CP}^n$, its tangent vector $\dot{x}(t) \in T_{[x]} \mathbb{CP}^n$ has the property that

$$\langle x(t), \dot{x}(t) \rangle_{\mathbb{C}} = 0$$

when both vectors are considered as vectors in $\mathbb{C}^{n+1}$. Notice that when we restrict ourselves to the point of view of Riemannian geometry, we must consider the above equation as a system of two equations, $\langle x(t), \dot{x}(t) \rangle_{\mathbb{R}} = 0$ and $\langle ix(t), \dot{x}(t) \rangle_{\mathbb{R}} = 0$, in $\mathbb{R}^{2n+2}$. The first one states that $\dot{x}$ must lie flat in the tangent space of $S^{2n+1}$, while the second one requires the tangent vector to be inside the horizontal space that is orthogonal to the orbit of $x$ under the group action of $U$. The same equation can also be derived from the point of view of the Hermitian structure on $\mathbb{CP}^n$, for which we again refer to [4].
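In computations, this characterization gives a simple way to project an arbitrary ambient vector onto the tangent space $T_{[x]}\mathbb{CP}^n$: subtract its complex component along $x$. A minimal sketch (assuming NumPy; all names are ours, and $x$ is taken to be a unit vector):

```python
import numpy as np

rng = np.random.default_rng(1)

n = 3
x = rng.normal(size=n + 1) + 1j * rng.normal(size=n + 1)
x /= np.linalg.norm(x)        # a representative of [x] on S^{2n+1}

w = rng.normal(size=n + 1) + 1j * rng.normal(size=n + 1)  # arbitrary ambient vector
h = w - (np.conj(x) @ w) * x  # horizontal projection (valid since |x|_2 = 1)

# single complex condition <x,h>_C = 0 ...
assert np.isclose(np.conj(x) @ h, 0)
# ... which packages the two real conditions <x,h>_R = 0 and <ix,h>_R = 0
assert np.isclose((np.conj(x) @ h).real, 0)
assert np.isclose((np.conj(1j * x) @ h).real, 0)
```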
2.2.5 Distance formula for $\mathbb{CP}^n$

The concept of distance on $\mathbb{CP}^n$ plays a number of important roles in the numerical algorithms we will discuss later in this article, ranging from detecting the convergence of projective Newton's iterations to testing the sameness of end points in the projective Cauchy integral method. A "natural" concept of distance can be derived from the Riemannian metric on $\mathbb{CP}^n$. Recall that the Riemannian metric $g$ of a Riemannian manifold $M$ is a smooth assignment of inner products on the tangent bundle $TM$. It also induces a smooth assignment of lengths to vectors in $TM$ (a Finsler structure) given by

$$\ell(v) = \sqrt{g(v, v)}.$$

With this, for a smooth curve $\gamma : [0,1] \to M$, we can measure its length by

$$\int_0^1 \ell(\dot{\gamma}(t)) \, dt,$$

which is indeed invariant under reparametrization [13, Proposition 11.15] and hence a property of the curve itself. So it is meaningful to assign a length $\ell(\gamma)$ to the smooth curve $\gamma$ as defined above. This definition can easily be extended to piecewise smooth curves by summing the lengths of the individual segments. Armed with this tool for measuring the lengths of piecewise smooth curves, we can then define the distance between two points in $\mathbb{CP}^n$ to be the infimum of the lengths of all piecewise smooth curves connecting the two. While quite elegant, this definition does not offer a computable formula. A more direct approach requires the concept of geodesics.

2.2.5.1 Riemannian geodesics

In the search for the shortest paths between two points, a.k.a. the minimizing curves, our intuition would be of great help. In the Euclidean space, the shortest path between two points is simply the straight line segment joining the two, which also happens to be the "most straight" path. This concept of "most straight paths" can be generalized to Riemannian manifolds, and this generalization is known as geodesics.
The link between "straightness" and "shortness" is preserved to a certain extent: as we will state later, all minimizing curves are geodesics as long as they are parametrized in the "right way", and all geodesics are locally minimizing. Intuitively, geodesics are curves that are "as straight as possible", much like the straight line segments in the Euclidean space. The "straightness" of straight lines can be characterized precisely: a curve with constant speed parametrization is straight if and only if its acceleration is always zero. We simply have to generalize this notion of zero acceleration to curves in Riemannian manifolds. While it is tempting to define the acceleration of a curve $\gamma : [0,1] \to M$ on a manifold $M$ to be the second derivative "$\ddot{\gamma}$", one immediately runs into trouble with this approach: the tangent vectors at different points actually live in different tangent spaces. So to define the concept of acceleration, we need a geometrically meaningful way to capture the notion of the acceleration "$\ddot{\gamma} = \frac{d}{dt} \dot{\gamma}$" along the curve $\gamma$. More formally, we need an operator that gives us a coordinate independent method for differentiating a vector field along a curve. Being the analog of $\frac{d}{dt}$, it obviously has to be linear (over $\mathbb{R}$) and satisfy the product rule. However, these requirements do not uniquely determine our desired concept of "acceleration". Fortunately, there is an important theorem, lying deep in the heart of Riemannian geometry, which basically states that there is one "right" definition, with which we have the link between curves with zero acceleration and those with shortest length. In the following paragraphs, we shall state these ideas rigorously.

A linear connection [12, Lemma 4.9] on $M$ is a bilinear map $\nabla : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$, with the notation $\nabla_X Y := \nabla(X, Y)$, such that for $f_1, f_2 \in C^\infty(M)$ we have

$$\nabla_{f_1 X_1 + f_2 X_2} Y = f_1 \nabla_{X_1} Y + f_2 \nabla_{X_2} Y,$$
$$\nabla_X (f_1 Y) = f_1 \nabla_X Y + (X f_1) Y.$$

We also say $\nabla$ is symmetric if it satisfies

$$\nabla_X Y - \nabla_Y X = [X, Y],$$

where $[X, Y]$ is the Lie bracket of the vector fields $X$ and $Y$. Let $\mathfrak{X}(\gamma)$ be the space of all vector fields along a smooth curve $\gamma$ parametrized by the variable $t$; then with respect to a given linear connection $\nabla$, there exists a unique linear operator [12, Lemma 4.9] $D_t : \mathfrak{X}(\gamma) \to \mathfrak{X}(\gamma)$, where $t$ denotes the parametrization variable, such that

$$D_t (f V) = \dot{f} V + f D_t V$$

for any smooth function $f$ defined on $[0,1]$, and

$$D_t V(t) = \nabla_{\dot{\gamma}(t)} \tilde{V}$$

for any extension $\tilde{V}$ of $V$ on $M$. $D_t$ is called the covariant derivative operator along $\gamma$, and in general it will play the role of "$\frac{d}{dt}$" along curves in a Riemannian manifold. We define the acceleration of $\gamma$ to be the vector field $D_t \dot{\gamma}$ along $\gamma$, and $\gamma$ is called a geodesic with respect to $\nabla$ if its acceleration is zero, i.e.,

$$D_t \dot{\gamma} \equiv 0$$

on the interval $(0,1)$. This equation is called the geodesic equation.

Such a definition of geodesics depends on the connection we use, so in order to have a concept of geodesics we must first choose a connection. It is a useful fact that every smooth manifold admits at least one linear connection. For example, embedded smooth submanifolds of Euclidean spaces have the tangential connection [12, p.66]. For a Riemannian manifold $M$ with metric $g$, however, there is a special one. A linear connection $\nabla$ is said to be compatible with $g$ if for all vector fields $X, Y, Z$

$$X g(Y, Z) = g(\nabla_X Y, Z) + g(Y, \nabla_X Z).$$

Theorem 2. (Fundamental Lemma of Riemannian Geometry) Let $M$ be a Riemannian manifold with Riemannian metric $g$; there exists a unique linear connection $\nabla$ on $M$ that is compatible with $g$ and symmetric.

We refer to [12, Theorem 5.4] for the proof. This connection is called the Riemannian connection (a.k.a. the Levi-Civita connection) of $g$. Geodesics with respect to the Riemannian connection are called Riemannian geodesics. As we have anticipated, such geodesics have a close link to minimizing curves.

Theorem 3.
Every minimizing curve is a Riemannian geodesic when it is given a unit speed parametrization.

We again refer to [12, Theorem 6.6] for a proof that uses variational calculus. A simpler, albeit longer, proof can be found in [6]. The converse is almost true:

Theorem 4. Every Riemannian geodesic is locally minimizing.

See [12, Theorem 6.12] for the proof and a detailed discussion. Geodesics that happen to be minimizing are called minimizing geodesics. With this concept, we can see that the distance between two points is simply the length of a minimizing geodesic joining them. The distance function defined this way will be called the Riemannian distance. Since we will not use any other distance, the Riemannian distance will simply be called distance. Next, we shall review the distance formulas on $S^1$ and $S^{2n+1}$, and then use them to construct the geodesic and the distance formula on $\mathbb{CP}^n$. The distances on $S^1$ and $S^{2n+1}$ are denoted by $d_{S^1}$ and $d_{S^{2n+1}}$ respectively, while our final goal, the distance on $\mathbb{CP}^n$, is simply denoted by $d$.

2.2.5.2 Distance on $S^1$ and $S^{2n+1}$

It is clear that the distance between two points $x = (a, b)$ and $x' = (1, 0)$ on the unit circle $S^1 \subset \mathbb{R}^2$ has to be the length of the shorter arc between the two points on the unit circle. This length is given by the angle between $x = (a, b)$ and the horizontal axis on which $x'$ lies:

$$d_{S^1}(x, x') = \cos^{-1}(a) = \cos^{-1}\left( \begin{pmatrix} a \\ b \end{pmatrix}^\top \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) = \cos^{-1} \langle x, x' \rangle_{\mathbb{R}}.$$

The same formula works in general for $S^{2n+1}$. For two distinct points $x, x' \in S^{2n+1} \subset \mathbb{R}^{2n+2}$, there exists a unique 2-dimensional linear subspace of $\mathbb{R}^{2n+2}$ that contains both $x$ and $x'$. This subspace intersects $S^{2n+1}$ in a circle, called the great circle through $x$ and $x'$. It is intuitively clear that the distance between them has to be the length of the shorter arc joining the two on the great circle (a geodesic) that passes through both of them. Surprisingly, this seemingly obvious fact is not completely trivial to check.
Here we refer to [12, Proposition 5.13] for an indirect proof that uses the fact that $S^{2n+1}$ is homogeneous and isotropic. There exists an orthogonal change of coordinates after which the two points $x$ and $x'$, together with the great circle passing through them, lie flat in $\mathbb{R}^2 \subset \mathbb{R}^{2n+2}$. Indeed, this change of coordinates is given explicitly by the QR decomposition: there exists a $(2n+2) \times (2n+2)$ real orthogonal matrix $Q$ such that

$$Q \begin{pmatrix} x & x' \end{pmatrix} = \begin{pmatrix} 1 & a \\ 0 & b \\ 0 & 0 \\ \vdots & \vdots \\ 0 & 0 \end{pmatrix}$$

for some $a, b \in \mathbb{R}$. So the orthogonal transformation $Q$ maps the great circle through those two points to the unit circle $S^1 \subset \mathbb{R}^2 \subset \mathbb{R}^{2n+2}$, and we can thus compute the distance as we did in the previous case. Since $Q$, being orthogonal, preserves the dot product, we get

$$d_{S^{2n+1}}(x, x') = \cos^{-1}(a) = \cos^{-1}\left( (Qx)^\top (Qx') \right) = \cos^{-1} \langle x, x' \rangle_{\mathbb{R}}.$$

In this case, the distance is still given by the arccosine of the real inner product of the two points as vectors in $\mathbb{R}^{2n+2}$.

2.2.5.3 Closest representatives of two points in $\mathbb{CP}^n$

For two points $[x], [x'] \in \mathbb{CP}^n \approx S^{2n+1}/U$, each has an infinite number of representatives in $S^{2n+1}$, of the form $e^{i\theta} x$ and $e^{i\theta'} x'$ respectively. We first need to find a pair with the minimum distance on $S^{2n+1}$. That is, we want to minimize the function

$$d_{S^{2n+1}}(e^{i\theta} x, e^{i\theta'} x'),$$

which is given by

$$\cos^{-1}\left( \langle e^{i\theta} x, e^{i\theta'} x' \rangle_{\mathbb{R}} \right) = \cos^{-1}\left( \operatorname{Re} \langle e^{i\theta} x, e^{i\theta'} x' \rangle_{\mathbb{C}} \right) = \cos^{-1}\left( \operatorname{Re} \langle x, e^{i(\theta' - \theta)} x' \rangle_{\mathbb{C}} \right),$$

which, in turn, is simply $d_{S^{2n+1}}(x, e^{i(\theta' - \theta)} x')$. So it is enough to fix the representative $x$ of $[x]$ and minimize the distance over all the representatives of the other orbit $[x']$. Hence, without loss of generality, we can focus on minimizing the real-valued function

$$d_{S^{2n+1}}(x, e^{i\theta} x') = \cos^{-1} \langle x, e^{i\theta} x' \rangle_{\mathbb{R}}$$

over $\theta \in \mathbb{R}$. Let $r$ and $\phi$ be the real numbers such that $r e^{i\phi} = \langle x, x' \rangle_{\mathbb{C}}$, i.e.,

$$r = \left| \langle x, x' \rangle_{\mathbb{C}} \right| \qquad \text{and} \qquad \phi = \arg \langle x, x' \rangle_{\mathbb{C}}$$

for some suitable branch of the complex argument function.
Then $\langle x, e^{i\theta} x' \rangle_{\mathbb{R}}$ is simply

$$\operatorname{Re} \langle x, e^{i\theta} x' \rangle_{\mathbb{C}} = \operatorname{Re}\left( e^{i\theta} \langle x, x' \rangle_{\mathbb{C}} \right) = \operatorname{Re}\left( r e^{i(\phi + \theta)} \right) = r \cos(\phi + \theta).$$

Therefore we want to minimize

$$d_{S^{2n+1}}(x, e^{i\theta} x') = \cos^{-1} \langle x, e^{i\theta} x' \rangle_{\mathbb{R}} = \cos^{-1}(r \cos(\phi + \theta))$$

over $\theta \in \mathbb{R}$. First of all, it is clear that if $r = 0$, then the above function is constant, and any $\theta$ would give us the "minimum value". For $r = 1$, the above function is simply

$$\cos^{-1}(r \cos(\phi + \theta)) = \cos^{-1}(\cos(\phi + \theta)) = |\phi + \theta| \quad \text{for } \phi + \theta \in [-\pi, \pi],$$

which is minimized at $\theta = -\phi$. Finally, by the Cauchy-Schwarz inequality, $r = |\langle x, x' \rangle_{\mathbb{C}}| \leq \|x\|_2 \cdot \|x'\|_2 = 1$, so we only need to consider the case $0 < r < 1$. In this case, using basic calculus, to find the critical points we solve

$$\frac{d}{d\theta} \cos^{-1}(r \cos(\phi + \theta)) = \frac{r \sin(\phi + \theta)}{\sqrt{1 - (r \cos(\phi + \theta))^2}} = 0,$$

which gives $\sin(\phi + \theta) = 0$. Hence $\theta \equiv -\phi \pmod{\pi}$. But since

$$\frac{d^2}{d\theta^2} \cos^{-1}(r \cos(\phi + \theta)) = \begin{cases} \dfrac{r}{\sqrt{1 - r^2}} & \text{if } \theta \equiv -\phi \pmod{2\pi} \\[2ex] \dfrac{-r}{\sqrt{1 - r^2}} & \text{if } \theta \equiv -\phi + \pi \pmod{2\pi}, \end{cases}$$

the minimum is attained at $\theta \equiv -\phi \pmod{2\pi}$. We thus conclude that among all representatives $e^{i\theta} x'$ of $[x']$ in $S^{2n+1}$, $e^{-i\phi} x'$ is the closest to $x$ in terms of the distance given by $d_{S^{2n+1}}$. Furthermore, among all representatives of $[x]$ and $[x']$, $(x, e^{-i\phi} x')$ forms a pair with the minimum distance. For later reference, let us define

$$\eta_x(x') := e^{-i\phi} x' \qquad \text{where } \phi = \arg \langle x, x' \rangle_{\mathbb{C}},$$

with a suitable branch of the complex argument function, to be the nearest representative of $[x']$ to $x$ on $S^{2n+1}$. It is easy to see the important property

$$\langle x, \eta_x(x') \rangle_{\mathbb{C}} = \langle x, e^{-i\phi} x' \rangle_{\mathbb{C}} = e^{-i\phi} \langle x, x' \rangle_{\mathbb{C}} = \left| \langle x, x' \rangle_{\mathbb{C}} \right|. \tag{2.5}$$

Based on this, we can easily derive

$$d_{S^{2n+1}}(x, \eta_x(x')) = \cos^{-1} \operatorname{Re} \langle x, \eta_x(x') \rangle_{\mathbb{C}} = \cos^{-1} \left| \langle x, x' \rangle_{\mathbb{C}} \right|. \tag{2.6}$$

Next, we shall show that this distance also gives us the distance between $[x]$ and $[x']$ in $\mathbb{CP}^n = S^{2n+1}/U$.

2.2.5.4 Geodesics in $\mathbb{CP}^n$

We start by finding the geodesic between $[x]$ and $[x']$ in $\mathbb{CP}^n$. We have just shown that among the infinitely many representatives in $S^{2n+1}$, the two points $x$ and $\eta_x(x')$ form a closest pair in terms of their distance in $S^{2n+1}$.
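The nearest representative $\eta_x(x')$ and the resulting distance $\cos^{-1} |\langle x, x' \rangle_{\mathbb{C}}|$ translate directly into code. A minimal sketch (assuming NumPy; the function names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)

def nearest_rep(x, xp):
    """eta_x(x'): the representative of [x'] closest to x (unit vectors assumed)."""
    phi = np.angle(np.conj(x) @ xp)      # phi = arg <x, x'>_C
    return np.exp(-1j * phi) * xp

def dist(x, xp):
    """d([x], [x']) = arccos |<x, x'>_C| on CP^n = S^{2n+1}/U."""
    return np.arccos(np.clip(abs(np.conj(x) @ xp), 0.0, 1.0))

n = 4
x = rng.normal(size=n + 1) + 1j * rng.normal(size=n + 1); x /= np.linalg.norm(x)
xp = rng.normal(size=n + 1) + 1j * rng.normal(size=n + 1); xp /= np.linalg.norm(xp)

eta = nearest_rep(x, xp)
# property (2.5): <x, eta_x(x')>_C equals the nonnegative real number |<x,x'>_C|
assert np.isclose(np.conj(x) @ eta, abs(np.conj(x) @ xp))
# the distance does not depend on the representatives chosen
assert np.isclose(dist(x, xp), dist(np.exp(0.9j) * x, np.exp(-0.4j) * xp))
# eta minimizes the spherical distance over sampled representatives e^{i theta} x'
d0 = np.arccos(np.clip((np.conj(x) @ eta).real, -1, 1))
for theta in np.linspace(0.0, 2 * np.pi, 60):
    d_t = np.arccos(np.clip((np.conj(x) @ (np.exp(1j * theta) * xp)).real, -1, 1))
    assert d0 <= d_t + 1e-12
```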
2.2.5.4 Geodesics in $\mathbb{CP}^n$

We start by finding the geodesic between $[x]$ and $[x']$ in $\mathbb{CP}^n$. We have just shown that, among the infinitely many representatives in $S^{2n+1}$, the two points $x$ and $\eta_x(x')$ form a closest pair in terms of their distance in $S^{2n+1}$. Let $\gamma : [0,1] \to S^{2n+1}$ be the minimizing Riemannian geodesic (the shorter arc of a great circle) in $S^{2n+1}$ joining $x$ and $\eta_x(x')$, with $\gamma(0) = x$ and $\gamma(1) = \eta_x(x')$. We wish to show that the smooth curve $\hat\gamma(t) := [\gamma(t)]$ parametrizes the minimizing Riemannian geodesic in $\mathbb{CP}^n$ joining $[x]$ and $[x']$, and that the two curves have the same length. This can be accomplished in three steps. First, we shall prove that the tangent vector of $\gamma$ is always in the horizontal subspace $H_{\gamma(t)}$ of $T_{\gamma(t)} S^{2n+1}$; in this case, $\gamma$ is said to be horizontal. With this, we will then show that $\hat\gamma$ is indeed a Riemannian geodesic in $\mathbb{CP}^n$. Finally, we need to establish the minimizing property of the geodesic $\hat\gamma$.

By construction, at each point $\gamma(t) \in S^{2n+1}$ of the curve, the tangent space can be decomposed into orthogonal subspaces
\[
T_{\gamma(t)} S^{2n+1} = V_{\gamma(t)} \oplus H_{\gamma(t)}
\quad \text{where} \quad
V_{\gamma(t)} = T_{\gamma(t)} [\gamma(t)] = \operatorname{span}\{ i \gamma(t) \}.
\]
In particular, at $t = 1$ we have
\[
V_{\gamma(1)} = \operatorname{span}\{ i \gamma(1) \} = \operatorname{span}\{ i \eta_x(x') \}
\]
which gives us the direction of the orbit of $\eta_x(x')$. Also by definition, the great circle passing through $x$ and $\eta_x(x')$ is, as a set, the intersection between $S^{2n+1}$ and the unique 2-dimensional linear subspace $L = \operatorname{span}\{ x, \eta_x(x') \}$ of $\mathbb{R}^{2n+2}$ that contains both $x$ and $\eta_x(x')$. Then $\gamma$, being a segment of this great circle, must also lie flat inside the subspace $L$, and hence $L$ must contain the tangent vector $\dot\gamma(1)$. So to prove $\dot\gamma(1)$ is orthogonal to $V_{\gamma(1)}$, it is enough to show that the vertical space $V_{\gamma(1)}$ is orthogonal to the basis $\{x, \eta_x(x')\}$ of the subspace $L$ which contains $\dot\gamma(1)$. Notice that
\[
\langle x, i \eta_x(x') \rangle_{\mathbb{R}}
= \operatorname{Re} \langle x, i \eta_x(x') \rangle_{\mathbb{C}}
= \operatorname{Re}\left( i \langle x, \eta_x(x') \rangle_{\mathbb{C}} \right)
= \operatorname{Re}\left( i \left| \langle x, x' \rangle_{\mathbb{C}} \right| \right)
= 0
\]
by Equation (2.5), and similarly,
\[
\langle \eta_x(x'), i \eta_x(x') \rangle_{\mathbb{R}}
= \operatorname{Re} \langle \eta_x(x'), i \eta_x(x') \rangle_{\mathbb{C}}
= \operatorname{Re}\left( i \langle \eta_x(x'), \eta_x(x') \rangle_{\mathbb{C}} \right)
= \operatorname{Re}(i)
= 0.
\]
Therefore the generator $i\gamma(1) = i\eta_x(x')$ of $V_{\gamma(1)}$ is orthogonal to both $x$ and $\eta_x(x')$, which form a basis of the subspace $L$. This implies $V_{\gamma(1)}$ is orthogonal to the entire subspace $L = \operatorname{span}\{x, \eta_x(x')\}$; in particular, it is orthogonal to $\dot\gamma(1) \in L$.
Therefore $\dot\gamma(1)$ lies in the horizontal subspace $H_{\gamma(1)}$. Repeating the same argument for any other $t \in [0,1]$ shows that $\dot\gamma(t) \in H_{\gamma(t)}$ for all $t \in [0,1]$. With this observation, we can now apply the following theorem [6, Proposition 2.109]:

Theorem 5.
i. If $\gamma$ is a Riemannian geodesic of $S^{2n+1}$ and $\dot\gamma$ is horizontal, then the curve $\hat\gamma = [\gamma(t)]$ is a Riemannian geodesic of $S^{2n+1}/U$ of the same length as $\gamma$.
ii. Conversely, if $\hat\gamma$ is a Riemannian geodesic of $S^{2n+1}/U$ with $\hat\gamma(0) = [x]$, then there exists a unique Riemannian geodesic $\gamma$ in $S^{2n+1}$ such that $\gamma(0) = x$, $[\gamma(t)] = \hat\gamma(t)$, and $\gamma$ is horizontal in the sense that $\dot\gamma(t) \in H_{\gamma(t)}$. In this case, $\gamma$ is called the horizontal lift of $\hat\gamma$.

Hence $\hat\gamma$ is a Riemannian geodesic of $\mathbb{CP}^n = S^{2n+1}/U$ joining the two points $[x]$ and $[x']$. To show this geodesic is indeed minimizing, suppose there is a shorter minimizing curve $\hat\gamma_2 : [0,1] \to \mathbb{CP}^n$ joining $[x]$ and $[x']$. Without loss of generality, let us assume it has a constant speed parametrization. By Theorem 3, $\hat\gamma_2$ is a Riemannian geodesic. Then the second part of Theorem 5 implies that there is a unique Riemannian geodesic $\gamma_2 : [0,1] \to S^{2n+1}$, the horizontal lift of $\hat\gamma_2$, such that $\gamma_2(0) = x$, $\hat\gamma_2 \equiv [\gamma_2]$, and $\dot\gamma_2(t) \in H_{\gamma_2(t)}$ for $t \in [0,1]$. Let $x'' := \gamma_2(1)$; then $[x''] = [x']$ by construction. Let us write $\ell_{\mathbb{CP}^n}$ and $\ell_{S^{2n+1}}$ for the operators that assign lengths to smooth curves. Since $\gamma_2$ is horizontal, we necessarily have $\ell_{S^{2n+1}}(\gamma_2) = \ell_{\mathbb{CP}^n}(\hat\gamma_2)$ because the quotient map $\pi : S^{2n+1} \to S^{2n+1}/U \approx \mathbb{CP}^n$ is a Riemannian submersion; thus
\[
d_{S^{2n+1}}(x, x'')
= \ell_{S^{2n+1}}(\gamma_2)
= \ell_{\mathbb{CP}^n}(\hat\gamma_2)
< \ell_{\mathbb{CP}^n}(\hat\gamma)
= \ell_{S^{2n+1}}(\gamma)
= d_{S^{2n+1}}(x, \eta_x(x')).
\]
This provides another representative $x''$ of $[x']$ that is closer to $x$ than $\eta_x(x')$, a contradiction. We conclude that such a shorter minimizing curve joining $[x]$ and $[x']$ cannot exist, and thus $\hat\gamma$, as constructed, is the true minimizing Riemannian geodesic joining $[x]$ and $[x']$, with which the distance between them can be defined.
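The orthogonality computations above are easy to check numerically. The following small sketch (a toy $n = 1$ pair of unit vectors, chosen purely for illustration) verifies that the vertical direction $i\,\eta_x(x')$ is orthogonal, in the real inner product, to both $x$ and $\eta_x(x')$, hence to the whole plane $L$:

```python
import cmath
import math

# toy example (assumed): unit vectors in C^2, identified with R^4
x  = [1.0 + 0j, 0j]
xp = [1j / math.sqrt(2), 1 / math.sqrt(2) + 0j]

ip  = sum(a.conjugate() * b for a, b in zip(x, xp))   # <x, x'>_C
phi = cmath.phase(ip)
eta = [cmath.exp(-1j * phi) * a for a in xp]          # eta_x(x')

def re_ip(u, v):
    """Real inner product <u, v>_R = Re <u, v>_C."""
    return sum(a.conjugate() * b for a, b in zip(u, v)).real

i_eta = [1j * a for a in eta]   # generator of the vertical space V
```

Both `re_ip(x, i_eta)` and `re_ip(eta, i_eta)` vanish to machine precision, as Equation (2.5) predicts.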
2.2.5.5 Summary: the distance formula in $\mathbb{CP}^n$

The distance $d([x],[x'])$ between the points $[x], [x'] \in \mathbb{CP}^n$ is simply the length of the minimizing Riemannian geodesic $\hat\gamma$ joining the two which, as we have just derived, is the same as the length of $\gamma$:
\[
d([x],[x']) = \ell_{\mathbb{CP}^n}(\hat\gamma) = \ell_{S^{2n+1}}(\gamma) = d_{S^{2n+1}}(x, \eta_x(x')).
\]
Therefore we have the distance formula
\[
d([x],[x']) = \cos^{-1} \left| \langle x, x' \rangle_{\mathbb{C}} \right|. \tag{2.7}
\]

2.3 Projective predictors

For some fixed $t_0 \in [0,1]$, given a representative $(x, t_0)$ of a point with $[x] \in \mathbb{CP}^n$ on a projective homotopy path, i.e., $\hat H(x, t_0) = 0$, the job of a predictor is to produce a prediction $x'$ so that the point $(x', t_0 + \Delta t)$, for some given step size $\Delta t > 0$, represents a point that is close to being on that projective path.

The simplest possible predictor is the zero predictor, which actually does nothing. More precisely, the zero predictor produces $(x, t_0 + \Delta t)$ as the prediction from $(x, t_0)$. [4] has demonstrated the effectiveness of such a predictor. However, such a method is in general too inefficient to be used in practice.

2.3.1 Projective Euler's method

The projective Euler predictor produces the prediction $x'$ by moving $x$ one step in the tangent direction of $[x(t)]$, viewed as a smooth curve in $\mathbb{CP}^n$. This requires three steps. First, let $v$ be the solution of the system of equations
\[
\begin{aligned}
\hat H_x(x,t)\, v &= -\hat H_t(x,t) \\
\langle \sigma x, v \rangle_{\mathbb{C}} &= 0
\end{aligned}
\]
where $\sigma \in [\sigma_n, \sigma_1]$, called the conditioning factor, is an arbitrary real number chosen between $\sigma_n$ and $\sigma_1$, the $n$-th and the first singular values of the matrix $\hat H_x(x,t)$. The solvability of the above linear system and the role $\sigma$ plays here will be discussed in detail later; we shall simply state here that under the smoothness assumption of the homotopy this system is numerically solvable. Next, we let
\[
\tilde x = x + v \cdot \Delta t
\]
where $\Delta t$ is the step size. Finally, we take $x' = \tilde x / \|\tilde x\|_2$ to be the prediction, which clearly lies on $S^{2n+1}$.
To express this method more formally, let us define the map $R : \mathbb{C}^{n+1} \setminus \{0\} \to S^{2n+1}$ given by
\[
R(x) = \frac{x}{\|x\|_2}.
\]
Then $R$ is a retraction of $\mathbb{C}^{n+1}$, taking away the origin, onto the subset $S^{2n+1}$, and the prediction $E(x, \Delta t) := x'$ of the projective Euler's method can be expressed as
\[
E(x, \Delta t) = R\left( x - \Delta t
\begin{pmatrix} \hat H_x(x,t) \\ \sigma x^H \end{pmatrix}^{-1}
\begin{pmatrix} \hat H_t(x,t) \\ 0 \end{pmatrix}
\right).
\]

2.4 Projective correctors

The prediction $(x', t_0 + \Delta t)$ produced by a predictor may not be exactly on, or even very close to, the path defined by $\hat H = 0$. If the next prediction step is to start from such a poor approximation, the error can quickly build up to an unacceptable level. To curb such error accumulation, a corrector is needed to return the prediction to the path. If we now fix $t_1 = t_0 + \Delta t$, the equation $\hat H = 0$ becomes a system of $n$ homogeneous polynomial equations in $n+1$ unknowns. The job of a corrector is to produce a refinement $x''$ of the approximate solution $x'$ of $\hat H = 0$ at $t = t_1$. For correctors, failure is always an option. When a corrector fails to bring the prediction back to the path quickly and reliably, it is usually the case that the step size $\Delta t$ used in the prediction step is too large, and the prediction should be performed again with a smaller step size.

A natural choice of corrector is an extension of Newton's iteration to the complex projective space. In the following subsections, we will outline this method, developed in [24], and two related methods.

2.4.1 Projective Newton's method

In [24] Michael Shub and Steven Smale proposed the projective Newton's method given by
\[
N(x) = R\left( x -
\begin{pmatrix} \hat H_x(x,t) \\ \sigma x^H \end{pmatrix}^{-1}
\begin{pmatrix} \hat H(x,t) \\ 0 \end{pmatrix}
\right).
\]
Notice that a conditioning factor $\sigma$, identical to the one used in the projective Euler's method, is introduced to improve the numerical stability of the method. The role it plays here will be discussed in Section 2.6.
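Both $E$ and $N$ reduce to solving the same bordered linear system and retracting back to the sphere. A minimal numpy sketch of one projective Newton step (the toy homogeneous system and the choice $\sigma = 1$ below are assumptions for illustration, not part of the text):

```python
import numpy as np

def retract(x):
    """R(x) = x / ||x||_2, the retraction onto S^{2n+1}."""
    return x / np.linalg.norm(x)

def newton_step(H, Hx, x, t, sigma):
    """One projective Newton step N(x): solve the bordered system
    [Hx(x,t); sigma * x^H] v = [H(x,t); 0], then retract x - v."""
    M = np.vstack([Hx(x, t), sigma * x.conj().reshape(1, -1)])
    rhs = np.concatenate([H(x, t), [0.0]])
    v = np.linalg.solve(M, rhs)
    return retract(x - v)

# toy homogeneous system (assumed example): f(x0, x1) = x1^2 - x0^2,
# whose projective solutions are [1 : 1] and [1 : -1]
H  = lambda x, t: np.array([x[1]**2 - x[0]**2])
Hx = lambda x, t: np.array([[-2 * x[0], 2 * x[1]]])

x = retract(np.array([1.0, 1.1], dtype=complex))
for _ in range(6):
    x = newton_step(H, Hx, x, t=0.0, sigma=1.0)
```

Starting from the nearby point $[1 : 1.1]$, a few iterations drive the ratio $x_1/x_0$ to $1$, i.e., the iterates converge to the projective solution $[1:1]$.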
Using the starting point $x_0 = x'$ produced by a predictor, we can perform projective Newton's iterations
\[
x_k = N(x_{k-1}) \quad \text{for } k = 1, 2, \ldots
\]
until certain stopping criteria are met. One can interpret a single step of projective Newton's iteration geometrically as a single step of the regular Newton's iteration restricted to the hyperplane of $\mathbb{C}^{n+1}$ defined by the tangent space of $\mathbb{CP}^n$ at that point.

2.4.1.1 Convergence property

It is more convenient to discuss the convergence property if we use another measure of the "distance" as defined in [24],
\[
d_T([x],[x']) := \tan d([x],[x']),
\]
which, for the unit representatives $x$ and $\eta_x(x')$, equals $\left\| \eta_x(x') - \langle x, \eta_x(x') \rangle_{\mathbb{C}}\, x \right\|_2 / \langle x, \eta_x(x') \rangle_{\mathbb{C}}$. With this, it has been shown in [4] that for an exact regular isolated solution $[\zeta]$ of $F([\zeta]) = \hat H(\zeta, t) = 0$ there is a neighborhood $U \subseteq \mathbb{CP}^n$ containing $[\zeta]$ such that if $[x_0] \in U$ then
\[
d_T([\zeta],[x_k]) \le \left( \frac{1}{2} \right)^{2^k - 1} d_T([\zeta],[x_0])
\]
for $k = 1, 2, \ldots$. So with $d_T$ used as the "distance", the projective Newton's method can be considered to have the quadratic convergence property when the above condition is satisfied. We refer to [4] for the proof as well as the exact statement. When the actual Riemannian distance $d = d_{\mathbb{CP}^n}$ is used, the above formula becomes much more complicated. Nevertheless, we can still conclude, qualitatively, that as long as the starting point $[x_0]$ is sufficiently close to the solution $[\zeta]$, the projective Newton's method converges very quickly.

2.4.1.2 Stopping criteria

Newton's method, being an iterative method, can be repeated indefinitely. Stopping criteria are used to mark the termination of the algorithm. Following the point of view that a projective Newton's iteration is simply a regular Newton's iteration restricted to certain hyperplanes, we expect the existing stopping criteria developed for the Newton corrector in the context of affine path tracking to be usable with minimal modifications:

• First of all, a small residual is an obvious stopping criterion.
The residual, a numerical measure of the closeness of a point to being a solution, is the topic of Section 2.7. If the residual at $[x_k]$ is sufficiently small, the correction is considered to have succeeded. This threshold depends on both the path condition and the machine epsilon.

• If the Riemannian distance $d([x_{k+1}], [x_k])$ between consecutive points $[x_{k+1}]$ and $[x_k]$ is greater than a certain threshold, then the correction is considered to have failed. This criterion is used to prevent a phenomenon commonly known as "curve jumping", in which the path tracking algorithm accidentally switches paths. We refer to [15] for the discussion of this rare but potentially dangerous phenomenon. The threshold used here is proportional to the quotient between the volume of $S^{2n+1}$ and the number of paths defined by the homotopy.

• Finally, if the sequence of points $x_1, x_2, \ldots$ in $\mathbb{CP}^n$ does not stabilize, in the sense of the Riemannian distance $d$, within a few iterations, we consider the correction to have failed. In our actual implementation the limit on the number of iterations ranges from 3 to 7 depending on the path condition.

We refer to [15] for the complete list of the stopping criteria as well as their detailed descriptions. Our preliminary implementation, equipped with these stopping criteria, has shown competitive performance, as we shall see in the last section of this chapter.

2.4.2 Projective Newton's method with damping factor

The regular Newton's method is often modified with a damping factor to improve its convergence property. The same modification can be applied to the projective Newton's method: given a damping factor $\lambda \in [0,1]$, it is given by
\[
N_\lambda(x) = R\left( x - \lambda
\begin{pmatrix} \hat H_x(x,t) \\ \sigma x^H \end{pmatrix}^{-1}
\begin{pmatrix} \hat H(x,t) \\ 0 \end{pmatrix}
\right).
\]
Clearly, the projective Newton's method can be considered the special case with $\lambda = 1$, i.e., $N = N_1$.
Just like Newton's method, the dampened version is repeated with the starting point $x_0 = x'$ produced by a predictor:
\[
x_k = N_\lambda(x_{k-1}) \quad \text{for } k = 1, 2, \ldots
\]
potentially with a non-constant damping factor $\lambda$, until certain stopping criteria are met. Our preliminary implementation adopts the same stopping criteria used for the projective Newton corrector, and the damping factor ranges from 0.5 to 1.0 depending on the number of consecutive failed corrections. We refer to [26] for the general theory and convergence analysis.

2.5 Projective Newton's homotopy

Another corrector is the projective Newton's homotopy method. In this method one solves the correction problem via another homotopy continuation method. Since in the correction problem we fix $t_1 = t_0 + \Delta t$, we can define $\hat F(x) = \hat H(x, t_1)$, which is a nonlinear system of $n$ equations in $n+1$ unknowns. The prediction $x'$ is assumed, or at least expected, to be close to a true solution $[\zeta]$ of $\hat F(x) = 0$ in the complex projective space which the corrector intends to locate. In order to do so, we can consider the homotopy $\hat H_{t_1} : \mathbb{C}^{n+1} \times [0,1] \to \mathbb{C}^{n+1}$ given by
\[
\hat H_{t_1}(x, s) =
\begin{pmatrix}
\hat F(x) - (1-s)\, \hat F(x') \\
\langle x, x' \rangle_{\mathbb{C}} - 1
\end{pmatrix}
=
\begin{pmatrix}
\hat H(x, t_1) - (1-s)\, \hat H(x', t_1) \\
\langle x, x' \rangle_{\mathbb{C}} - 1
\end{pmatrix}.
\]
Clearly, at $s = 0$, $\hat H_{t_1}(x', 0) = 0$, and at $s = 1$, the solution set of $\hat H_{t_1}(x, 1) = 0$ may contain some representatives of $[\zeta]$. Standard path tracking techniques can be used to track the solution path from $s = 0$ at $x'$ to $s = 1$. Just like the Newton iteration, the projective Newton's homotopy may or may not converge. The convergence property and the cost analysis can be found in [1]. While generally much more expensive, as an experimentally verified rule of thumb, the projective Newton's homotopy generally offers a much larger region of fast convergence than the projective Newton's method. Hence we suspect that this method has its own niche among the correctors.
2.6 Numerical analysis of the algorithms

Theoretically, for the above algorithms to work, the square linear system of equations in $v \in \mathbb{C}^{n+1}$
\[
\begin{aligned}
\hat H_x(x,t) \cdot v &= b \\
\langle \sigma x, v \rangle_{\mathbb{C}} &= 0
\end{aligned}
\]
must be solvable for any vector $b \in \mathbb{C}^n$ and any pair $(x,t)$ representing a point $([x], t)$ on a projective path defined by $\hat H = 0$. Moreover, since these are intended to be numerical algorithms, we must also show the above system is well conditioned. For this fixed $t$, define $F(x) = \hat H(x,t) = (f_1, \ldots, f_n)$; then the above equation can be written as
\[
\begin{pmatrix} DF(x) \\ \sigma x^H \end{pmatrix} v = \begin{pmatrix} b \\ 0 \end{pmatrix}
\]
where $DF(x)$ is the Jacobian matrix of $F$ with respect to $x$. Since $F(x) = (f_1, \ldots, f_n)$ is a system of homogeneous polynomials and $F(x) = 0$, by Euler's theorem (1) for homogeneous functions,
\[
\sum_{j=0}^{n} x_j \frac{\partial f_i}{\partial x_j} = d_i \cdot f_i(x) = 0 \quad \text{for each } i = 1, \ldots, n
\]
where $d_i = \deg f_i$. Letting $A = DF(x)$, we have $Ax = 0$; in other words, $x \in \ker A$. Then we can find $n$ right singular vectors $\{v_1, \ldots, v_n\}$ such that $\{v_1, \ldots, v_n, x\}$ forms an orthonormal basis of $\mathbb{C}^{n+1}$ with respect to the complex inner product, and $A$ has a singular value decomposition of the form
\[
U^H A \begin{pmatrix} v_1 & \cdots & v_n & x \end{pmatrix}
=
\begin{pmatrix}
\sigma_1 & & & 0 \\
& \ddots & & \vdots \\
& & \sigma_n & 0
\end{pmatrix}
\]
for some unitary $n \times n$ matrix $U$, where $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n$ are the first $n$ singular values of $A$. It follows that
\[
\begin{pmatrix} U^H & \\ & 1 \end{pmatrix}
\begin{pmatrix} A \\ \sigma x^H \end{pmatrix}
\begin{pmatrix} v_1 & \cdots & v_n & x \end{pmatrix}
=
\begin{pmatrix}
\sigma_1 & & & \\
& \ddots & & \\
& & \sigma_n & \\
& & & \sigma
\end{pmatrix}.
\]
Simply put, the matrix $\begin{pmatrix} DF(x) \\ \sigma x^H \end{pmatrix} = \begin{pmatrix} A \\ \sigma x^H \end{pmatrix}$ has singular values $\sigma_1, \ldots, \sigma_n$, and $\sigma$, which is within the interval $[\sigma_n, \sigma_1]$ by our construction. Thus the maximum and the minimum singular values of $\begin{pmatrix} DF(x) \\ \sigma x^H \end{pmatrix}$ are still $\sigma_1$ and $\sigma_n$ respectively, and its condition number is
\[
\operatorname{cond} \begin{pmatrix} DF(x) \\ \sigma x^H \end{pmatrix} = \frac{\sigma_1}{\sigma_n}.
\]
Theoretically, at least, by the smoothness condition of the homotopy construction, $DF(x) = \hat H_x(x,t)$ must be of full row rank and thus $\sigma_n \ne 0$.
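The singular-value claim above can be verified numerically. A sketch in numpy (the random matrix below, built so that $Ax = 0$, is an assumed test case, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# random unit vector x, and random n x (n+1) matrix A with A x = 0,
# obtained by projecting a generic B onto the complement of x
x = rng.standard_normal(n + 1) + 1j * rng.standard_normal(n + 1)
x /= np.linalg.norm(x)
B = rng.standard_normal((n, n + 1)) + 1j * rng.standard_normal((n, n + 1))
A = B @ (np.eye(n + 1) - np.outer(x, x.conj()))

s = np.linalg.svd(A, compute_uv=False)   # sigma_1 >= ... >= sigma_n
sigma = 0.5 * (s[0] + s[-1])             # any value in [sigma_n, sigma_1]

# bordered matrix [A; sigma * x^H]
M = np.vstack([A, sigma * x.conj().reshape(1, -1)])
sM = np.linalg.svd(M, compute_uv=False)
```

The singular values of `M` are exactly $\{\sigma_1, \ldots, \sigma_n, \sigma\}$, so the extreme singular values, and hence the condition number $\sigma_1/\sigma_n$, are unchanged by the bordering row.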
This implies that the matrix $\begin{pmatrix} DF(x) \\ \sigma x^H \end{pmatrix}$ is nonsingular. But more importantly, its condition number is exactly $\sigma_1 / \sigma_n$, which is determined by the original homotopy construction. Since the path condition at $(x,t)$ is defined to be the condition number of this matrix, we can conclude that our projective path tracking algorithms do not pollute the path condition.

2.7 Numerical determination of residual

In the affine path tracking process, one tracks a smooth path defined by $H(x,t) = 0$ where $H : \mathbb{C}^n \times [0,1] \to \mathbb{C}^n$. However, it is usually not possible to stay exactly on that path due to truncation error from the numerical algorithms used or rounding error caused by floating point computation. The best one can hope for is that, during the path tracking process, $H(x,t) \approx 0$ at each step. In this context, the magnitude of the norm $\|H(x,t)\|$ is generally a good measure of how far the point $(x,t) \in \mathbb{C}^n \times [0,1]$ is from the path, in the sense that when $\|H(x,t)\| = 0$ the point is definitely on the path, and when $\|H(x,t)\|$ is very large the point is very unlikely to be close to the path. Hence this norm $\|H(x,t)\|$, called the residual, is usually a good numerical indicator of the closeness of a point to the paths defined by $H(x,t) = 0$. Throughout the path tracking process, one must make sure the residual is small. In this section, the notion of the residual for projective path tracking will be developed.

Just like in the affine case, for a fixed $t$, the condition $\hat H(x,t) = 0 \in \mathbb{C}^n$ is a clear indication that $[x] \in \mathbb{CP}^n$ is indeed a solution, in $\mathbb{CP}^n$, of the homogeneous system $\hat H(x,t) = 0$. Unfortunately, $\hat H(x,t) \approx 0$ does not imply that $(x,t)$ is close to being a solution of $\hat H(x,t) = 0$. To see this, note that the norm $\|\hat H(x,t)\|$ is not generally meaningful: the values of homogeneous polynomials are not well defined on projective space, since for a homogeneous function $f : \mathbb{C}^{n+1} \to \mathbb{C}$ of degree $d$, $f(\lambda x) = \lambda^d f(x)$, while $x$ and $\lambda x$ in $\mathbb{C}^{n+1}$ actually represent the same point in $\mathbb{CP}^n = (\mathbb{C}^{n+1} \setminus \{0\}) / \sim$.
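The scale-dependence of the raw norm is easy to see on a toy example (the polynomial below is an assumption for illustration only):

```python
# f(x0, x1) = x1^3 - 2 x0^2 x1 is homogeneous of degree 3, so
# f(lambda * x) = lambda^3 * f(x): the raw residual |f| depends on
# which representative of [x] we happen to hold.
f = lambda x0, x1: x1**3 - 2 * x0**2 * x1

x0, x1, lam = 1.0, 3.0, 10.0
```

Here $f(\lambda x_0, \lambda x_1) = \lambda^3 f(x_0, x_1)$, so rescaling the representative by $\lambda = 10$ inflates the "residual" a thousandfold without moving the projective point at all.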
2.7.1 Affine residual

In many cases, the original residual $\|H(x,t)\|$ is what the users really care about. By definition, $H(x_1, \ldots, x_n, t) = \hat H(1, x_1, \ldots, x_n, t)$. A point $[x] \in \mathbb{CP}^n = (\mathbb{C}^{n+1} \setminus \{0\})/\sim$ with homogeneous coordinates $x = [x_0 : \cdots : x_n]$, $x_0 \ne 0$, is equivalently written as $\left[ 1 : \frac{x_1}{x_0} : \cdots : \frac{x_n}{x_0} \right]$, so the affine residual is
\[
\rho_A(x) := \left\| H\!\left( \frac{x_1}{x_0}, \ldots, \frac{x_n}{x_0}, t \right) \right\|_2
= \left\| \hat H\!\left( 1, \frac{x_1}{x_0}, \ldots, \frac{x_n}{x_0}, t \right) \right\|_2
= \left\|
\begin{pmatrix}
\frac{1}{x_0^{d_1}} \hat h_1(x_0, \ldots, x_n, t) \\
\vdots \\
\frac{1}{x_0^{d_n}} \hat h_n(x_0, \ldots, x_n, t)
\end{pmatrix}
\right\|_2
\]
which is equivalent to a weighted norm $\| \cdot \|_{W(x)}$ with weights $W(x) = (1/|x_0|^{d_1}, \ldots, 1/|x_0|^{d_n})$, given by
\[
\| \hat H(x,t) \|_{W(x)} := \sqrt{ \sum_{k=1}^{n} \left| \hat h_k(x,t) \right|^2 \big/ |x_0|^{2 d_k} }.
\]
Thus we can use $\| \hat H(x,t) \|_{W(x)}$ as a measure of how close $x$ is to being a solution of $\hat H = 0$ at the fixed $t$-value whenever $x_0$ is not close to zero. Similarly, the relative residual is given by
\[
\frac{ \| \hat H(x,t) \|_{W(x)} }{ \left\| (x_1, \ldots, x_n)/x_0 \right\|_2 }
= \frac{ \sqrt{ \sum_{k=1}^{n} \left| \hat h_k(x,t) \right|^2 \big/ |x_0|^{2 d_k} } }{ \| (x_1, \ldots, x_n) \|_2 \,/\, |x_0| }
= \frac{ \sqrt{ \sum_{k=1}^{n} \left| \hat h_k(x,t) \right|^2 \big/ |x_0|^{2(d_k - 1)} } }{ \| (x_1, \ldots, x_n) \|_2 }.
\]
The obvious downside of this affine residual is that it is undefined for any point with $x_0 = 0$; for any point with $x_0$ close to zero, the computation of the affine residual as defined above is numerically unstable.

2.7.2 Tangential residual

In many cases it is useful to restrict ourselves to the projection of $\mathbb{CP}^n = (\mathbb{C}^{n+1} \setminus \{0\})/\sim$ onto a hyperplane of $\mathbb{C}^{n+1}$, as we did when using (2.2). For a fixed $t$ and $[x_0] \in \mathbb{CP}^n$ that is known to be close to the actual solution of $\hat H(x,t) = 0$ one is seeking, the hyperplane $x_0 + T_{[x_0]} \mathbb{CP}^n$ determined by the tangent space at this point can be used as the hyperplane onto which we project $\mathbb{CP}^n$, and we define the tangential residual
\[
\rho_T(x) := \left\|
\begin{pmatrix}
\hat H(x,t) \\
\langle x_0, x \rangle_{\mathbb{C}} / \| x_0 \|_2^2 - 1
\end{pmatrix}
\right\|_2
\]
which implicitly depends on the choice of the point $x_0$. Unlike the affine residual $\rho_A$, the tangential residual is defined for any $[x]$ sufficiently close to $[x_0]$ in terms of the Riemannian distance.
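As a sketch, the tangential residual can be computed directly from its definition (the degree-2 system and base point below are assumed toy examples, and the helper name `rho_T` is ours):

```python
import math

# toy homogeneous system (assumed): a single h of degree 2
h = lambda x0, x1: x1**2 - 2 * x0**2

def rho_T(x, x_base):
    """Tangential residual: stack the system values with the affine
    constraint <x_base, x>_C / ||x_base||^2 - 1."""
    ip = sum(a.conjugate() * b for a, b in zip(x_base, x))
    nrm2 = sum(abs(a)**2 for a in x_base)
    c = ip / nrm2 - 1.0
    return math.sqrt(abs(h(*x))**2 + abs(c)**2)
```

At an exact solution used as its own base point, e.g. $x = (1, \sqrt{2})$, both stacked entries vanish and $\rho_T = 0$; perturbing $x$ makes it positive.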
An apparent downside is that different representatives $\lambda x$ of the point $[x]$ will produce different residuals. In our experiments, however, we found this almost never causes problems.

2.7.3 Spherical residual

Another residual that stays true to our point of view of the complex projective space $\mathbb{CP}^n \approx S^{2n+1}/U$ is what we shall call the spherical residual $\rho_S$. While $\| \hat H(x,t) \|_2$ is not well defined for $[x] \in \mathbb{CP}^n = (\mathbb{C}^{n+1} \setminus \{0\})/\sim$, the value
\[
\rho_S(x) = \| \hat H(x,t) \|_2
\]
is, however, well defined on $S^{2n+1}/U$. This is because $[x] := \{ e^{i\theta} x \mid \theta \in \mathbb{R} \}$ in $S^{2n+1}/U$, and $\rho_S$ is clearly invariant under this action: $|\hat h_k(e^{i\theta} x, t)| = |e^{i\theta d_k}| \cdot |\hat h_k(x,t)| = |\hat h_k(x,t)|$ for each $k$.

2.7.4 Summary of residuals

Each of the definitions for residuals has different properties and its own pros and cons:

• The affine residual $\rho_A$ is usually what the users actually care about; however, it is undefined for any point with $x_0 = 0$.

• The tangential residual $\rho_T$ is always computable, but it is, in general, not well defined.

• The spherical residual $\rho_S$ is always well defined on $S^{2n+1}/U$, but it has no connection to the affine residual the users care about.

In our preliminary implementation, a combination of these residuals is used. Inside Newton-like correctors the tangential residual $\rho_T$ is used, with the role of $x_0$ played by the most recent $x$-value produced by the iteration. In all other places during the path tracking process, the spherical residual $\rho_S$ is used. In the final refinement stage, in which the solutions are "refined" to a higher accuracy, if we are confident that a solution $[x_0 : \cdots : x_n]$ is in the affine part of $\mathbb{CP}^n$, i.e., $x_0 \ne 0$, then $\rho_A$ is preferred.

2.8 Special conditioning

2.8.1 The curious case of barry

Figure 2.4 shows the path condition along a single path when our projective path tracking algorithm on $S^{2n+1}/U$ is used to solve the barry problem from the PoSSo [2] test suite. With only 3 equations and 3 unknowns, barry is one of the smallest systems in the PoSSo test suite. We thus expect it to be very easy to solve.
[Figure 2.4: Path condition, in log10 scale, along a single path tracked for solving the barry problem using projective path tracking algorithms on $S^{2n+1}/U$. The maximum path condition exceeds $10^9$.]

The maximum path condition exceeds $10^9$. While our path tracking algorithm with double-precision floating point arithmetic had no trouble tracking this path, considering the simplicity of the system, this result is certainly surprising and, to a certain extent, unsettling. It turns out the bad numerical path condition is caused by an unusually large difference in the scales of the rows of the Jacobian matrix. Figure 2.5 shows the large difference in the scales of the rows of the Jacobian matrix along a single path when projective path tracking is used. Clearly, this would naturally cause terrible path condition.

[Figure 2.5: The 2-norm of the rows, in log10 scale, of the Jacobian matrix along a single path defined by $\hat H = 0$, tracked using projective path tracking algorithms on $S^{2n+1}/U$ for solving the barry problem.]

2.8.2 The general problem

Our experiments have shown that the problem of large differences in the scales of the rows of the Jacobian matrix revealed by the above observation is widespread when projective path tracking is in use. Indeed, this problem has plagued not only the path tracking algorithms on $S^{2n+1}/U$ we have proposed here but also any path tracking algorithm involving the homogenization of the homotopy, including the method using affine charts described in Equation (2.2). To see why, consider a homogeneous polynomial $f \in \mathbb{C}[x_0, \ldots, x_n]$ of degree $d$; then each $\frac{\partial f}{\partial x_j}$ for $j = 0, \ldots, n$ is homogeneous of degree $d-1$, unless $f$ is constant with respect to $x_j$. In either case, we have
\[
\frac{\partial f}{\partial x_j}(\lambda x_0, \ldots, \lambda x_n) = \lambda^{d-1} \frac{\partial f}{\partial x_j}(x_0, \ldots, x_n).
\]
For a fixed $t$, if we write the Jacobian matrix of $F(x) = \hat H(x,t)$ with respect to $x$ at a point $x$ as the row matrix
\[
J(x) = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix},
\]
then, since $F = (f_1, \ldots, f_n)$ is homogeneous, we have
\[
J(\lambda x) = \begin{pmatrix} \lambda^{d_1 - 1} v_1 \\ \vdots \\ \lambda^{d_n - 1} v_n \end{pmatrix},
\]
where $d_i = \deg f_i$. Namely, when the given point $x$ is scaled by a fixed factor, the rows of $J$ are scaled by different factors determined by the degrees of the corresponding polynomials. Hence, when the original system contains polynomials of very different degrees, the difference in the scales of the rows of $J(x)$ can be very sensitive to the scaling $x \mapsto \lambda x$.

2.8.3 Dynamic row-scaling

The problem stated above can be solved quite simply via row scaling. Clearly, the system of linear equations $Jv = b$ is equivalent to $AJv = Ab$ whenever $A$ is invertible. So to bring all the rows to more or less the same scale, we can take $A$ to be the diagonal matrix
\[
A = \begin{pmatrix}
1/\nu_1 & & & \\
& \ddots & & \\
& & 1/\nu_n & \\
& & & 1
\end{pmatrix}
\]
where $\nu_j = \|J_j\|$ is the norm of the $j$-th row of the Jacobian matrix $J$. In principle any norm can be used here; our actual implementation uses the $\infty$-norm, for ease of computation. Figure 2.6 shows the path condition of the same path tracked for solving the barry problem with and without the dynamic row-scaling technique. The difference is night and day. In solving a large number of polynomial systems with projective path tracking, this simple technique is helpful and, in certain cases, essential in improving the path conditions.
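In numpy, this dynamic row-scaling is one line per solve. A sketch (the badly scaled $2 \times 2$ matrix below is an assumed example):

```python
import numpy as np

def row_scale(J, b):
    """Dynamic row-scaling: divide each row of J (and the matching
    entry of b) by its infinity-norm, so all rows share a common scale."""
    nu = np.max(np.abs(J), axis=1)   # infinity-norm of each row
    return J / nu[:, None], b / nu

# rows at wildly different scales (assumed example)
J = np.array([[1e-8, 2e-8],
              [3e4,  5e4 ]])
b = np.array([1e-8, 1e4])

Js, bs = row_scale(J, b)
```

The scaled system has the same solution as the original but a far smaller condition number, which is exactly the effect sought above.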
However, it is important to note that while row-scaling is useful in improving the path condition, this transformation may conceal the fact that we are near a true singularity of a path, and thus it must be used with caution. Indeed, this technique should only be used when one is confident that the path is away from a true singularity. In our actual implementation, this procedure is only active away from the endpoint at $t = 1$, where a singularity may appear. Near the endpoint, the so-called "endgame" techniques are used. The projective endgame is the topic of the next chapter.

[Figure 2.6: The comparison between the path condition of a single path tracked in solving the barry problem using projective path tracking with (dashed) and without (solid) the dynamic row-scaling.]

2.9 The path tracking algorithm

The overall projective path tracking algorithm is summarized by the flowchart in Figure 2.7. In this flowchart, only the projective Newton's method is shown as the corrector, since it is indeed the main corrector we use in the path tracking algorithm. We should note, however, that the other two projective correctors, the dampened projective Newton's method and the projective Newton's homotopy, can also replace or supplement the projective Newton's method. In this flowchart the operators $E$ and $N$ represent the projective Euler's method and the projective Newton's method respectively. $\epsilon_N$ is the threshold for testing the convergence of the projective Newton's method: when the tangential residual is less than this threshold, the projective Newton's method is considered to have succeeded. The real number $\epsilon_d$ is another threshold that plays a similar role: when the Riemannian distance moved by a single Newton iteration is less than this threshold, the projective Newton's method is also considered to have succeeded.
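The control flow just described (predict, correct, halve the step size on failure, advance $t$ on success) can be sketched as follows. This is a simplified sketch, not the actual implementation: the predictor `E`, corrector `N`, residual `rho_T`, and distance `dist` are assumed to be supplied as callables by the surrounding code, and the iteration cap plays the role of $k_{\max}$:

```python
def track_path(x0, E, N, rho_T, dist, eps_N, eps_d, k_max=5, dt0=0.05):
    """Predictor-corrector loop with step-size halving on corrector
    failure (a sketch of the flowchart logic)."""
    t, dt, x = 0.0, dt0, x0
    while t < 1.0:
        dt = min(dt, 1.0 - t)        # never overshoot t = 1
        xp = E(x, t, dt)             # predictor
        for _ in range(k_max):       # corrector iterations
            xn = N(xp, t + dt)
            if rho_T(xn, t + dt) < eps_N or dist(xn, xp) < eps_d:
                break                # corrector succeeded
            xp = xn
        else:                        # corrector failed: smaller step
            dt *= 0.5
            continue
        x, t = xn, t + dt            # accept the corrected point
    return x
```

With trivial stand-in callables the loop marches $t$ from 0 to 1 and terminates, illustrating the step-control skeleton only; in the real algorithm `E` and `N` are the projective Euler and Newton operators above.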
$k_{\max}$ is the maximum number of Newton iterations the corrector is allowed to use.

[Figure 2.7: The flowchart summarizing the overall projective path tracking algorithm: starting from $x \leftarrow x_0$, each cycle predicts $x' \leftarrow E(x, \Delta t)$, applies corrector iterations $x'' \leftarrow N(x')$ until $\rho_T(x'') < \epsilon_N$ or $d(x', x'') < \epsilon_d$, halves $\Delta t$ if the corrector fails within $k_{\max}$ iterations, and otherwise advances $t \leftarrow t + \Delta t$ with $\Delta t \leftarrow \min(\Delta t, 1 - t)$, until $t = 1$.]

2.10 Numerical results

In this section we present some numerical results obtained when the projective path tracking algorithms on $S^{2n+1}/U$ are used. We mainly focus on the improvements in path conditions in contrast to the traditional approach of using generic hyperplane charts, described in Equation (2.2). Here we chose a well known problem, the cyclic7 problem; it is, however, representative of similar results we obtained from a large set of problems. Figure 2.8 shows the distribution of the maximum path conditions along each path tracked in solving the cyclic7 problem using the projective path tracking algorithms on $S^{2n+1}/U$ with the dynamic row-scaling technique. Compared with the maximum path conditions when the method defined in Equation (2.2) is used, as shown in Figure 2.3, the paths here appear much more tame, with maximum condition numbers less than $10^4$, which is well within the reach of double-precision floating point arithmetic.

[Figure 2.8: The histogram shows the distribution of maximum path conditions along all paths tracked for solving the cyclic7 problem using projective path tracking algorithms on $S^{2n+1}/U$ together with the dynamic row-scaling technique.]

The difference is more obvious when we plot the distributions of maximum path conditions obtained from the different path tracking methods. In Figure 2.9 we can clearly see the difference.
[Figure 2.9: The histogram shows the distribution of maximum path conditions along all paths tracked for solving the cyclic7 problem with the two different methods. The height of each solid bar represents the number of paths having the corresponding maximum path condition when the traditional method with affine charts described in Equation (2.2) is used. The striped bars show the same information when the projective path tracking algorithms on $S^{2n+1}/U$ developed in this chapter are used. As with previous cases, this graph represents the average result of 10 different runs.]

Chapter 3 Endgame

Since $\mathbb{CP}^n$ is compact, a projective path $\gamma \subset \mathbb{CP}^n \times [0,1]$ defined by $\hat H(x,t) = 0$ necessarily has an end point in $\mathbb{CP}^n$ at $t = 1$. However, as discussed in the previous chapters, it is possible that the end point lies outside $U_0 = \{ [x_0 : \cdots : x_n] \mid x_0 \ne 0 \}$, which we identify with $\mathbb{C}^n$. We generally use the expression "solutions at infinity" to describe such end points, and these solutions are usually considered extraneous.

To identify these extraneous "end points at infinity", it seems that one may simply mark all those with $x_0 = 0$ or sufficiently close to 0. However, Table 3.1 shows a very different reality. Contained in that table is the scale of $|x_0|$ at the end points of a few paths from different systems that are known to be "at infinity". While we expect these values of $|x_0|$ to be exactly 0, or at least within machine epsilon of 0, we can see the scale of $|x_0|$ ranges from $10^{-4}$ to as large as $10^{-2}$. Thus it is clearly impossible to classify them as end points "at infinity" based on those numbers alone. The reason, as it turns out, is that these end points happen to be singular in the sense that the smoothness condition fails at those points.
It is well known that if the end point is singular, ordinary predictor-corrector methods for path tracking, as described in the previous chapters, are unable to locate the end points accurately. The job of the endgame is twofold. First, it should identify paths whose end points are "solutions at infinity" of the target system. Second, it has to obtain the end point accurately even if a path has a singular end point. In the case the end point is indeed singular, a bonus feature for the endgame would be to discover additional information that describes the geometric structure around the singular end point.

System        log10 |x0| of the end point
noon3 [23]    −4
reimer2 [2]   −4
reimer3 [2]   −3
reimer4 [2]   −2
cyclic5 [3]   −4
cyclic5 [3]   −2

Table 3.1: The scales of |x0| at end points of some paths that are known to be "at infinity": these paths come from solving the systems listed in the first column using the projective path tracking we have discussed.

In this chapter, we shall develop the proper language in which the endgame can be discussed rigorously as well as the numerical techniques with which the endgame problem can be solved in practice.

3.1 Local theory of homotopy paths

In this section, for completeness, we shall briefly outline some relevant concepts and theorems in the local theory of holomorphic varieties (a.k.a. analytic sets). Detailed discussions can be found in [7], [8], and [9]. The theory provides us the proper language to discuss and describe the endgame techniques.

3.1.1 Background: local theory of holomorphic varieties

In this section we shall briefly review the basic theory of holomorphic functions and holomorphic varieties. The study of the function theory on Cn starts from its topology. Open polydiscs will be used as the basis of the topology. An open polydisc in Cn is a subset Δ(w; r) ⊂ Cn of the form

Δ(w; r) = {(z1, ..., zn) ∈ Cn : |zj − wj| < rj for j = 1, ..., n}

where w = (w1, ..., wn) ∈ Cn is the center and r = (r1, ..., rn) ∈ Rn is the polyradius. In this chapter a "neighborhood" of a point is always an open polydisc centered at that point.

Definition 1. A complex-valued function f defined on an open subset D ⊂ Cn is called holomorphic in D if each point w = (w1, ..., wn) ∈ D has an open neighborhood U in D such that the function f has a power series expansion

f(z) = Σ_{v1,...,vn=0}^∞ a_{v1,...,vn} (z1 − w1)^{v1} · · · (zn − wn)^{vn}

which converges for all z = (z1, ..., zn) ∈ U.

In this chapter, we will focus on the local properties of holomorphic functions. We thus need the language of function germs.

Definition 2. Two C-valued functions f : U → C and g : V → C are called equivalent at the point p ∈ U ∩ V if there is an open neighborhood W of p such that W ⊆ U ∩ V and f ≡ g on W. This is an equivalence relation, and an equivalence class is called a germ of C-valued function at the point p.

The set of all such germs at a point p ∈ Cn naturally forms a ring under pointwise addition and multiplication. The field of complex numbers C sits inside this ring in the form of constant functions, with 1 ∈ C being the unity of the ring. Therefore the set of germs at p has a natural C-algebra structure. Within this algebra, the set of germs given by holomorphic functions forms a sub-C-algebra called the germs of holomorphic functions, for which we will use the notation nOp. It is clear that for any two points p, q ∈ Cn, nOp and nOq are isomorphic as C-algebras. Therefore the local theory of germs of holomorphic functions is the same at any point. It is easy to check that nOp is an integral domain and hence has a well defined field of quotients nMp, the field of germs of meromorphic functions. It is also easy to see that nOp is a local ring with the non-units forming the unique maximal ideal.
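As a toy numerical illustration of Definition 1 (a hypothetical example, not from the text): the function f(z1, z2) = 1/((1 − z1)(1 − z2)) is holomorphic on the open polydisc Δ(0; (1, 1)), and its power series expansion there has all coefficients a_{v1,v2} = 1; truncated partial sums converge rapidly at any point strictly inside the polydisc.

```python
# Toy check of Definition 1: f(z1, z2) = 1 / ((1 - z1)(1 - z2)) on the
# polydisc Delta(0; (1, 1)) has power series coefficients a_{v1, v2} = 1.
def f(z1, z2):
    return 1.0 / ((1.0 - z1) * (1.0 - z2))

def partial_sum(z1, z2, N):
    # Truncated power series: sum of z1^v1 * z2^v2 over 0 <= v1, v2 < N.
    return sum(z1**v1 * z2**v2 for v1 in range(N) for v2 in range(N))

z1, z2 = 0.3 + 0.2j, -0.4 + 0.1j   # a point inside the polydisc
assert abs(partial_sum(z1, z2, 60) - f(z1, z2)) < 1e-12
```

Outside the polydisc (say |z1| > 1) the same series diverges, which is why the neighborhood U in Definition 1 matters.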
The Weierstrass Preparation Theorem [9, p.68, Theorem 2] and the Weierstrass Division Theorem [9, p.70, Theorem 3] are two of the important theorems that reveal the structure of the embeddings n−1Op ⊂ n−1Op[zn] ⊂ nOp. They form the stepping stones of many induction proofs about the structure of nOp. In particular, they are used to establish the fact that nOp is Noetherian [8, p.7, Theorem 2], which is an extension of the Hilbert basis theorem.

We shall now turn our attention to the zero sets of holomorphic functions.

Definition 3. Let B be an open subset of Cn. A holomorphic subvariety of B is a subset V of B such that for each p ∈ V there exist a neighborhood U of p and functions f1, ..., fk holomorphic in U such that V ∩ U = {z : f1(z) = · · · = fk(z) = 0}.

Just like the situation for holomorphic functions, the language of germs facilitates our study of the local behavior of holomorphic varieties:

Definition 4. For open sets B1, B2 in Cn, if V1 and V2 are holomorphic subvarieties of B1, B2 that contain p respectively, we say V1 is equivalent to V2 at p if there exists a neighborhood U of p on which V1 ∩ U = V2 ∩ U.

It is easy to check that this indeed defines an equivalence relation, and hence we can talk about the equivalence classes of holomorphic subvarieties.

Definition 5. A germ at p of a holomorphic subvariety is an equivalence class of holomorphic subvarieties under the above equivalence relation. We say a germ of holomorphic subvariety is reducible if it is the union of two proper germs of holomorphic subvarieties, and irreducible otherwise.

Now we have two different kinds of objects. On the algebraic side, we have germs of holomorphic functions at a point. On the geometric side, we see germs of holomorphic subvarieties at a point. The relationship between these two kinds mimics that between polynomials and varieties that is central to algebraic geometry. For f1, ..., fk ∈ nOp, we use the standard notation V(f1, ..., fk) to denote the germ of holomorphic subvariety defined by f1, ..., fk. To simplify the notation, here we generalize the notion of holomorphic subvariety to allow the empty set ∅ to be considered as a germ of holomorphic subvariety: V(f1, ..., fk) = ∅ if fj(p) ≠ 0 for some j.

We can extend this notation to any ideal I ⊆ nOp. Since nOp is Noetherian, I is generated by finitely many germs of holomorphic functions represented by f1, ..., fk with some sufficiently small common domain U containing p; their common zero set V = {z ∈ U : f1(z) = · · · = fk(z) = 0} then represents a germ of holomorphic subvariety at p. We define this germ to be V(I).

From the opposite direction, given a germ V of a holomorphic subvariety at p ∈ Cn, we use the notation I(V) to denote the set of germs of holomorphic functions f ∈ nOp that vanish on V. One can show that I(V) is an ideal, commonly known as the ideal of the germ V. Indeed, I(V) is radical, i.e., √I(V) = I(V). The well-definedness of both definitions is actually not completely trivial to verify. We refer to [5, p.77] for the proof and discussion of the properties of the two operators.

An important consequence is that any germ V of a holomorphic subvariety can be written as a finite union of irreducible germs. This union, called the irreducible decomposition of the germ V, is actually unique up to a permutation. We can show this by noticing that the ideal I(V) of the germ V is an ideal in nOp, which is Noetherian. The complete proof can be found in [8, p.15, Theorem 7].

We shall now focus on irreducible germs of holomorphic subvarieties. Our goal is to have a canonical form for all such germs. Recall that a mapping between two topological spaces is called finite if the inverse image of each point is a finite set, and is called proper if the inverse image of any compact set is also compact.
Furthermore, a covering map is a surjective continuous map from a locally path-connected topological space to another topological space such that around each point in its image, there exists an open neighborhood whose inverse image under this map is a disjoint union of connected open subsets in the domain [14, p.278].

Definition 6. A continuous, finite, proper, surjective map π : V → W between two second-countable Hausdorff spaces is called a finite branched covering if there is a dense open subset W0 ⊆ W such that V0 = π^{−1}(W0) is dense in V and the restriction π0 : V0 → W0 of π to V0 is a covering map. Here the space W is called the base, while V is called the cover. The restriction π0 : V0 → W0 of π to V0 is called a regular part of the finite branched covering π : V → W.

So far, this construction is purely topological. What we need is an analytic version of this concept:

Definition 7. A finite branched covering π : V → W between two holomorphic varieties, as topological spaces, is a finite branched holomorphic covering if there is a regular part π0 : V0 → W0 of π for which W \ W0 is a holomorphic subvariety of W and π0 is a locally biholomorphic mapping.

Theorem 6 (Local parametrization theorem). [8, p.48, Theorem 10] For an irreducible germ V at p ∈ Cn of a holomorphic subvariety, with a suitable nonsingular linear change of coordinates of Cn, there exists an integer d ≤ n such that the natural projection π : Cn = Cd × Cn−d → Cd, restricted to V, is a finite branched holomorphic covering over some open set in Cd.

We call the smallest such integer d the Weierstrass dimension, or simply the dimension, of the irreducible germ V. Note that in this context, the regular part of this branched holomorphic covering π|V over some open set in Cd is a connected complex manifold of complex dimension d.
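The simplest concrete model of Definitions 6 and 7 (a hypothetical toy example, not from the text) is the map π(z) = z^m on C: away from the branch point w = 0 every fiber consists of exactly m points (the m-th roots of w), so the regular part is an m-sheeted covering map, while the fiber over the branch point collapses to a single point.

```python
import cmath

# Toy model of a finite branched covering: pi(z) = z^m over C.
# Away from the branch point w = 0, the fiber pi^{-1}(w) has exactly
# m points; over the branch point it collapses to {0}.
def fiber(w, m):
    if w == 0:
        return [0.0]
    r, theta = abs(w) ** (1.0 / m), cmath.phase(w) / m
    return [r * cmath.exp(1j * (theta + 2 * cmath.pi * k / m)) for k in range(m)]

m, w = 3, 1 + 1j
pts = fiber(w, m)
assert len(pts) == m                              # m sheets over a regular point
assert all(abs(z**m - w) < 1e-12 for z in pts)    # each point really lies in the fiber
assert fiber(0, m) == [0.0]                       # the fiber over the branch point
```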
For the rest of the discussion, we will only focus on the case when the dimension is 1, and in this case we have a particularly simple and useful description of the cover. Notice that Theorem 6 is essentially a local theory, so we should be able to easily extend it into the projective space; after all, CPn is locally affine.

3.1.2 Local normal form of affine homotopy paths

Now we would like to derive the canonical form of a path γ ⊂ Cn × [0, 1] defined by

H(x, t) = 0

that converges to some end point ζ ∈ Cn at t = 1. If we consider H = (h1, ..., hn) as a map H : Cn+1 → Cn holomorphic in (x, t) = (x1, ..., xn, t), then the zero set {(x, t) ∈ Cn+1 | H(x, t) = 0} is a holomorphic subvariety of some domain in Cn+1. We can then consider the germ of the holomorphic subvariety V = V(h1, ..., hn) at the point (ζ, 1) ∈ Cn+1.

In this context the endgame in path tracking for the homotopy continuation method can be understood as the geometric characterization of the germ V at the end point (ζ, 1) as well as the computation of an accurate estimate of the end point (ζ, 1) using this geometric information. To outline the basic idea, we shall start with a few nontrivial observations: Firstly, in the irreducible decomposition V = V1 ∪ · · · ∪ Vℓ, over a sufficiently small t-interval the path γ must lie in exactly one such irreducible germ. Secondly, by Theorem 6 this irreducible germ can be realized as a finite branched holomorphic covering over some domain in Cd for which the regular part is a complex manifold of dimension d; but by the smoothness condition, we necessarily have d = 1. Thirdly, topologically speaking, the finite branched covering must be isomorphic to the standard finite branched covering given by z ↦ z^m, where m is the number of sheets. Finally, the role of the special variable that serves as the base space can be played by none other than the path parameter t. These observations are made precise by the following important theorem:

Theorem 7.
With the notation used above, at t = 1, the path γ has the convergent power series expansion of the form

x_j = Σ_{k=0}^∞ a_{jk} s^k,    t = 1 − s^m

for j = 1, ..., n, where m ∈ Z+.

We refer to [25] for the proof of this theorem based on the Local Parametrization Theorem 6. A number of other approaches can be used to prove this theorem; in particular, one can also consider it as the Local Normal Form theorem for holomorphic maps between Riemann surfaces, which can be found in [20]. A variety of endgames were developed based on this theorem.

3.1.3 Local normal form of projective homotopy paths

The goal of this section is to derive series expansions for projective paths similar to those provided by Theorem 7. These results are well developed; we refer to [20] for a more in-depth discussion.

Let γ̂ ⊂ CPn × [0, 1] be a path defined by Ĥ(x, t) = 0 that has an affine associate γ ⊂ Cn × [0, 1] which is defined for all t ∈ (0, 1), i.e., x0 ≠ 0 along γ̂ for t ∈ (0, 1). By the smoothness condition, they are smooth paths parametrized by t for t ∈ (0, 1) in CPn and Cn respectively. Because CPn is compact, γ̂ necessarily converges to some point [ζ] = [ζ0 : · · · : ζn] in CPn at t = 1. By the previous analysis, the projection of the germ represented by γ̂ at ζ onto the hyperplane

L(x) := ⟨ζ, x⟩_C − 1 = 0,    where x = (x0, ..., xn) ∈ Cn+1,

is also smooth. This additional equation defines γ̂ as an affine homotopy path to which Theorem 7 may be applied. Algebraically, there exists a punctured disk D = Δ(1; ε) \ {1} such that for each t ∈ D there is a representative (x0, ..., xn) ∈ Cn+1 of γ̂(t) that is a nonsingular solution of the system

Ĥ(x0, ..., xn, t) = 0
L(x0, ..., xn) = 0.

By Theorem 7, t = 1 − s^m and x_j = Σ_{k=0}^∞ a_{jk} s^k for j = 0, ..., n. By assumption γ̂ has an affine associate γ ⊂ Cn × [0, 1] which is a smooth path in Cn parametrized by t for t ∈ (0, 1). So it is necessary that x0 ≢ 0 in 1O0, i.e., it is not the zero power series.
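Theorem 7 can be seen at work on a hypothetical toy homotopy (an illustration, not from the text): h(x, t) = x^2 − (1 − t) has a path ending at the singular point x = 0 at t = 1, and this path admits the convergent expansion x = s, t = 1 − s^m with m = 2 (i.e., a_{10} = 0, a_{11} = 1, and all other coefficients zero).

```python
# Toy instance of Theorem 7: the path of h(x, t) = x^2 - (1 - t) ending
# at the singular point (0, 1) is parametrized by x = s, t = 1 - s^2,
# so the expansion holds with m = 2.
def h(x, t):
    return x**2 - (1 - t)

for s in [0.5, 0.1 + 0.2j, -0.01j]:
    x, t = s, 1 - s**2          # the parametrization from Theorem 7
    assert abs(h(x, t)) < 1e-15
```

Note that x is not a single-valued function of t near t = 1 (both x = ±√(1 − t) solve h = 0), which is exactly why the auxiliary variable s with t = 1 − s^m is needed.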
To summarize, near t = 1, the projective path γ̂ can be parametrized by a holomorphic map in a variable s of the form

x_0(s) = Σ_{k=0}^∞ a_{0k} s^k ≢ 0
x_1(s) = Σ_{k=0}^∞ a_{1k} s^k
  ⋮
x_n(s) = Σ_{k=0}^∞ a_{nk} s^k
t(s) = 1 − s^m                                    (3.1)

which satisfies the additional equations a_{00} = ζ0, ..., a_{n0} = ζn and ⟨ζ, x(s)⟩ = 1.

Remark 1. In light of Lemma 1 and the analysis in Section 2.1.2, the role of the true end point ζ can be played by some sufficiently close ζ̃ in this power series expansion. Of course, it is generally impossible to determine, a priori, how close is sufficiently close.

Note that such a power series expansion exists both for paths that converge to points in Cn and for paths that converge to points outside Cn (points at "infinity"). While it is dependent on the hyperplane defined by L(x) = ⟨ζ, x⟩ − 1 = 0, we can easily remove this artificial dependency by rewriting the series to obtain a convenient alternative expansion. Since x0(s) is not the zero power series in s, it must be of finite order. Let d be its order; after reindexing the coefficients we may write

x_0 = s^d Σ_{k=0}^∞ a_{0k} s^k    with a_{00} ≠ 0.

The series Σ_{k=0}^∞ a_{0k} s^k has order 0 and is hence invertible in the ring 1O0. Let b ∈ 1O0 be its inverse; then in 1M0 the inverse of x0 has the form s^{−d} b. Since the affine associate γ of γ̂ is given by (x1/x0, ..., xn/x0, t) for ([x0 : · · · : xn], t) ∈ γ̂, each coordinate xj/x0 can be formally written as

xj/x0 = s^{−d} b Σ_{k=0}^∞ a_{jk} s^k = s^{−d} Σ_{k=0}^∞ ã_{jk} s^k = Σ_{k=−d}^∞ ã_{jk} s^k,

which are well defined germs of meromorphic functions in 1M0. Therefore in affine space we have Laurent series expansions of the form

x_j = Σ_{k=−d}^∞ a_{jk} s^k,    t = 1 − s^m

for some finite d ∈ Z with j = 1, ..., n. By "factoring" out the lowest-order term from the series expansion of each xj, we can write xj as the product of a_j s^{ω_j} ∈ 1M0 and another series in 1O0. After relabeling, we obtain an equivalent form in the affine space:

x_1 = a_1 s^{ω_1} (1 + Σ_{k=1}^∞ a_{1k} s^k)
  ⋮
x_n = a_n s^{ω_n} (1 + Σ_{k=1}^∞ a_{nk} s^k)
t(s) = 1 − s^m                                    (3.2)

where ω_1, ..., ω_n ∈ Z are the orders of the power series expansions.

The power series expansions (3.1) and (3.2) allow us to perform endgame classification via well developed power series techniques (a.k.a. Puiseux series techniques) described in [11], [15], [17], and [25] in conjunction with projective path tracking. These techniques have been used in our actual implementation, and they are efficient and effective in many cases. However, as explicitly pointed out by [25, p.186] and [11], their usefulness diminishes as the integer m increases. Indeed, our experiments have shown that there is generally little hope of applying these techniques to paths with m greater than three or four using standard double-precision floating-point arithmetic alone. Thus the power series techniques are only used as the first line of defense in our implementation. A more powerful, albeit expensive, technique via the Cauchy integral is preferred.

3.2 Projective endgame based on Cauchy integral

Let us continue to use the notations γ̂ and γ for the projective path and its affine associate, with ([ζ], 1) being the end point of γ̂. With (3.1), the endgame based on the Cauchy integral described in [25] can be easily extended to CPn. Let (x0, t0) be a point on γ̂ sufficiently close to ([ζ], 1). By (3.1) and Lemma 1, the coordinates of the projection of the path γ̂ to the hyperplane defined by L(x) = ⟨x0, x⟩_C − 1 = 0 are parametrized by holomorphic maps x_j(s) = Σ_{k=0}^∞ a_{jk} s^k for each j = 0, ..., n. By the Cauchy integral formula,

x_j(0) = (1 / 2πi) ∮_Γ (x_j(s) / s) ds

for each j = 0, ..., n and any loop Γ around s = 0, which is the only possible singularity of the integrand x_j(s)/s in the interior of Γ. More generally, written in vector notation, we have

x(0) = (1 / 2πi) ∮_Γ (x(s) / s) ds.                (3.3)

In other words, the value of x at the end point of the path γ̂ can be computed via an integral, since [ζ] = [x(0)] as points in CPn.
This is the basic idea behind the projective Cauchy integral endgame. Unfortunately, this integral cannot be computed directly since the relationship between t and s is not known. First, we need to know the value of m in t = 1 − s^m. Second, we need to collect numerical samples of the value of x along the loop Γ so that the above integral can be evaluated. Conveniently, the two pieces of information can be obtained from the very same process, killing two birds with one stone.

Recall that t = 1 − s^m, which may not have a holomorphic inverse in a disk centered at s = 0. However, we certainly have a holomorphic branch of the "inverse", given by s = e^{(1/m) log(1−t)} with the principal branch of the logarithm function. Then along this branch, the value of x can be expressed as a function of t given by x(e^{(1/m) log(1−t)}). If we parametrize the small circle of radius r = 1 − t0 centered at 1 in t-space as t = 1 − re^{iθ} using the angle θ, the value of x, now as a function of θ, can be expressed as

φ(θ) := x(e^{(1/m) log(re^{iθ})}) = x(r^{1/m} e^{iθ/m}).

Then it is easy to check that m is the smallest natural number among all k ∈ Z+ such that φ(2kπ) = φ(0). That is, the values Φ = {φ(θ) | θ ∈ R} ⊂ CPn form a (closed) loop parametrized by θ ∈ [0, 2mπ]. In the projective Cauchy integral endgame, we consider the loop Φ as the projection onto the x-space of a homotopy path defined by the equation Ĥ(x, 1 − re^{iθ}) ≡ 0. Hence the techniques described in the previous chapter can be used to track the path Φ. The strategy is then to track the value of φ(θ) as θ increases in discrete steps until the first occurrence of φ(2kπ) = φ(0), at which point we will obtain two things: the value k, which gives us m, and the set of sample values of φ(θ), with which the above integral can be approximated numerically, which in turn gives us the estimate of the end point ([ζ], 1). More precisely, the process has three stages: the sampling stage, the integration stage, and the verification stage.
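Before describing each stage in detail, the whole idea can be sketched on a hypothetical toy homotopy (a minimal sketch, not the dissertation's implementation): for h(x, t) = x^2 − (1 − t), track x along the circle t = 1 − re^{iθ} by simple Newton correction, detect the winding number m as the smallest k with φ(2kπ) ≈ φ(0), and average the samples over one full loop to approximate the Cauchy integral (3.3).

```python
import cmath

# Sketch of the sampling + integration stages on the toy homotopy
# h(x, t) = x^2 - (1 - t): the loop closes after m = 2 turns, and the
# average of the samples approximates the end point x(0) = 0.
def track_loop(r=1e-2, steps_per_turn=64, max_turns=8):
    x = cmath.sqrt(r)                 # phi(0): the solution at t0 = 1 - r
    samples = [x]
    dtheta = 2 * cmath.pi / steps_per_turn
    for j in range(1, steps_per_turn * max_turns + 1):
        t = 1 - r * cmath.exp(1j * dtheta * j)
        for _ in range(20):           # Newton correction on h(x, t) = 0
            x = x - (x**2 - (1 - t)) / (2 * x)
        samples.append(x)
        # Loop-closing test after each full turn: phi(2*k*pi) ≈ phi(0)?
        if j % steps_per_turn == 0 and abs(x - samples[0]) < 1e-8 * abs(samples[0]):
            return j // steps_per_turn, samples[:-1]   # winding number m, one full loop
    raise RuntimeError("loop did not close")

m, samples = track_loop()
assert m == 2                         # t = 1 - s^2 on this path
zeta = sum(samples) / len(samples)    # trapezoid rule for the integral (3.3)
assert abs(zeta) < 1e-8               # the singular end point x(0) = 0 is recovered
```

For equally spaced samples on a circle, the trapezoid rule for (1/2πi)∮ x(s)/s ds reduces to a plain average of the samples, which is why the estimate above is just the mean.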
In the sampling stage, we perform the projective path tracking algorithm to track the path φ(θ) in prescribed and equally spaced steps θ1 = Δθ, θ2 = 2Δθ, ... starting from θ0 = 0 and the initial point φ(0) = x0. During this path tracking process, the model CPn = S^{2n+1}/U is used, so all the sample points φ(θj) have unit norm. The sample points φ(θ1), φ(θ2), ... are then scaled via

φ̂(θj) := φ(θj) / ⟨x0, φ(θj)⟩_C

so that they lie on the hyperplane defined by L(x) = ⟨x0, x⟩_C − 1 = 0. This process continues until we have determined that φ(θj) = φ(0) for some θj = 2kπ, where k ∈ Z+ is the winding number of the loop. This stopping criterion is, of course, an ill-posed question. We will discuss the stopping criteria in the following subsections. Next, the samples are used to approximate the Cauchy integral (3.3) using the trapezoid method, which for equally spaced samples on the circle reduces to the average

ζ̃ = (1/N) Σ_{j=0}^{N−1} φ̂(θj)

where N is the number of samples collected in one full loop.

Finally, in the verification stage the residual ρ(ζ̃) is computed. Since we expect ζ̃ to be a close approximation of a solution to Ĥ(x, 1) = 0, the residual ρ(ζ̃) should be relatively small. If ρ(ζ̃) is greater than a certain threshold, the result is discarded, and we consider the Cauchy integral endgame to have failed. From our experience, this criterion is generally sufficient to verify that the projective Cauchy integral endgame has worked correctly. However, in Section 3.2.3, we will discuss a stronger verification process.

It is experimentally verified that when the projective Cauchy integral endgame works correctly, it provides much better accuracy than what can be obtained from the projective path tracking algorithm alone for singular end points. Using the Total Degree Homotopy [16] with the projective path tracking developed in the previous chapter, all paths of noon3, reimer2, reimer3, cyclic4, and cyclic5 can be tracked to their end points. Unfortunately, as shown in Figure 3.1, in most cases the scale of |x0| makes a poor indicator of the path going to "infinity".
With the projective Cauchy integral endgame, however, we can improve the estimate of x0 greatly: the aforementioned paths now have |x0| at their end points close to machine epsilon using double-precision floating-point arithmetic, making it a good indication that the path end point is "at infinity".

System     Winding number    log10 |x0| with path tracking    with Cauchy integral
noon3      2                 −4                               −17
reimer2    2                 −4                               −16
reimer3    3                 −3                               −15
reimer4    4                 −2                               −14
cyclic5    2                 −4                               −15
cyclic5    5                 −2                               −15

Table 3.2: Comparison of the magnitude of |x0|, in log10 scale, with and without the Cauchy integral method. The second column shows the winding number of the loop used for evaluating the Cauchy integral. The third column shows the approximate order of magnitude of |x0| at the end point using the projective path tracking alone, while the last column shows the same measure but with the Cauchy integral producing the end point.

3.2.1 Stopping criteria based on Riemannian distance

When do we stop collecting samples φ(θj)? Namely, how many samples {φ(θ0), φ(θ1), ...} should we collect? In theory, we should stop at the smallest natural number k such that [φ(2kπ)] = [φ(0)] as points in CPn, and for this purpose the Riemannian distance function d = d_{CPn} on CPn can be used; i.e., we require

d([φ(2kπ)], [φ(0)]) = 0.

However, with numerical error, this criterion is unlikely to ever be satisfied exactly. So we must answer the question: how small is small enough? From our experiments, we found it doubtful that there is a single threshold one can use for all systems. In general the distance d([φ(2kπ)], [φ(0)]) must be compared to the size of the loop Φ. So we can keep track of the maximum distance from [φ(0)] to the points [φ(θ1)], [φ(θ2)], ...:

d_max := max_j d([φ(θj)], [φ(0)])

which should provide us a good estimate of the size of the loop Φ. Then the closing of the loop can be numerically determined by the condition

d([φ(2kπ)], [φ(0)]) < ε · d_max

for a certain threshold ε.
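This relative criterion can be sketched as follows (a minimal sketch, assuming unit-norm representatives as produced by the S^{2n+1}/U model; the sample loop below is a hypothetical toy in CP^1, not data from the text). For unit vectors, the Riemannian (Fubini-Study) distance is d([x], [y]) = arccos(|⟨x, y⟩_C|).

```python
import numpy as np

# Sketch of the relative loop-closing test d(phi(2k*pi), phi(0)) < eps * d_max,
# using the Fubini-Study distance on CP^n for unit-norm representatives.
def dist_cpn(x, y):
    # d([x], [y]) = arccos(|<x, y>_C|) for unit vectors x, y.
    return np.arccos(min(1.0, abs(np.vdot(x, y))))

def loop_closed(samples, candidate, eps=1e-3):
    d_max = max(dist_cpn(s, samples[0]) for s in samples[1:])
    return dist_cpn(candidate, samples[0]) < eps * d_max

# A toy loop of unit vectors: a small circle around [1 : 0] in CP^1.
loop = [np.array([np.cos(0.1), np.sin(0.1) * np.exp(1j * a)])
        for a in np.linspace(0.0, 2 * np.pi, 32, endpoint=False)]
assert loop_closed(loop, loop[0])        # returning to the start closes the loop
assert not loop_closed(loop, loop[16])   # a far-away sample does not
```

Comparing against d_max rather than an absolute threshold makes the test scale with the size of the loop Φ, which is the point of the criterion above.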
In our preliminary implementation, the value ε = 10^{−3} is used, and it is successful as a stopping criterion for almost all paths with a winding number of 12 or less.

3.2.2 Stopping criteria based on tangent vector

A potentially more robust stopping criterion comes from the observation that if Φ does indeed form a loop with k being the smallest natural number such that φ(2kπ) = φ(0), then it is, in addition, a smooth loop, which means that as tangent vectors in T_{[φ(0)]}CPn we must have φ̇(2kπ) = φ̇(0). Therefore the angle between the two vectors,

cos^{−1}( ⟨φ̇(2kπ), φ̇(0)⟩_R / (‖φ̇(2kπ)‖_2 · ‖φ̇(0)‖_2) ),

gives us another indicator for whether or not we have collected a full loop. Together with the stopping criterion based on the Riemannian distance, this should provide a more robust criterion, in theory at least. In our numerical experiments, we found this technique helpful in preventing premature termination of the sampling stage in certain cases.

3.2.3 Consistency tests

As a final line of defense, the projective Cauchy integral endgame is computed with smaller and smaller radii r = 1 − t0, closer and closer to the end point at t = 1. If each iteration worked correctly, we would expect:

• the winding numbers m1, m2, ... to remain constant;
• the variation among the end point estimates ζ̃1, ζ̃2, ... to be no greater than twice the square root of the machine epsilon.

Thus the stabilizing of the results of successive projective Cauchy integral endgames gives us a very strong indication that the projective Cauchy integral has worked correctly and provided an accurate approximation ζ̃ ≈ ζ of the end point.

BIBLIOGRAPHY

[1] E.L. Allgower and K. Georg. Introduction to numerical continuation methods, volume 45. Society for Industrial and Applied Mathematics, 2003.

[2] G. Attardi and C. Traverso. The PoSSo library for polynomial system solving. Proc. of AIHENP95, 1995.

[3] G. Bjorck and R. Froberg.
A faster way to count the solutions of inhomogeneous systems of algebraic equations, with applications to cyclic n-roots. Journal of Symbolic Computation, 12(3):329–336, 1991.

[4] L. Blum, F. Cucker, M. Shub, and S. Smale. Complexity and real computation. Springer-Verlag, 1998.

[5] W. Ebeling. Functions of several complex variables and their singularities, volume 83. American Mathematical Society, 2007.

[6] S. Gallot, D. Hulin, and J. Lafontaine. Riemannian geometry. Springer-Verlag, 2004.

[7] R.C. Gunning. Introduction to holomorphic functions of several variables: Function theory, volume 1. Brooks/Cole Publishing Company, 1990.

[8] R.C. Gunning. Introduction to holomorphic functions of several variables: Local theory, volume 2. Brooks/Cole Publishing Company, 1990.

[9] R.C. Gunning and H. Rossi. Analytic functions of several complex variables. AMS Chelsea Publishing, 2009.

[10] B. Huber and B. Sturmfels. A polyhedral method for solving sparse polynomial systems. Mathematics of Computation, 64(212):1541–1555, 1995.

[11] B. Huber and J. Verschelde. Polyhedral end games for polynomial continuation. Numerical Algorithms, 18(1):91–108, 1998.

[12] J.M. Lee. Riemannian manifolds: An introduction to curvature, volume 176. Springer-Verlag, 1997.

[13] J.M. Lee. Introduction to smooth manifolds, volume 218. Springer-Verlag, 2003.

[14] J.M. Lee. Introduction to topological manifolds, volume 202. Springer-Verlag, 2011.

[15] T.L. Lee, T.Y. Li, and C.H. Tsai. HOM4PS-2.0: a software package for solving polynomial systems by the polyhedral homotopy continuation method. Computing, 83(2):109–133, 2008.

[16] T.Y. Li. On Chow, Mallet-Paret and Yorke homotopy for solving systems of polynomials. Bulletin of the Institute of Mathematics, Acad. Sinica, pages 433–437, 1983.

[17] T.Y. Li. Numerical solution of polynomial systems by homotopy continuation methods. Handbook of Numerical Analysis, 11:209–304, 2003.

[18] T.Y. Li, T. Sauer, and J.A. Yorke.
The random product homotopy and deficient polynomial systems. Numerische Mathematik, 51(5):481–500, 1987.

[19] T.Y. Li, T. Sauer, and J.A. Yorke. The cheater's homotopy: an efficient procedure for solving systems of polynomial equations. SIAM Journal on Numerical Analysis, pages 1241–1251, 1989.

[20] R. Miranda. Algebraic curves and Riemann surfaces, volume 5. American Mathematical Society, 1995.

[21] A.P. Morgan. Solving polynomial systems using continuation for engineering and scientific problems, volume 57 of Classics in Applied Mathematics. Society for Industrial and Applied Mathematics, 2009.

[22] A.P. Morgan and A.J. Sommese. A homotopy for solving general polynomial systems that respect m-homogeneous structures. Applied Mathematics and Computation, 24(2):101–113, 1987.

[23] V.W. Noonburg. A neural network modeled by an adaptive Lotka-Volterra system. SIAM Journal on Applied Mathematics, pages 1779–1792, 1989.

[24] M. Shub and S. Smale. On the complexity of Bézout's theorem I: geometric aspects. Journal of the AMS, 6(2), 1993.

[25] A.J. Sommese and C.W. Wampler. The numerical solution of systems of polynomials arising in engineering and science. World Scientific, 2005.

[26] J. Stoer and R. Bulirsch. Introduction to numerical analysis, volume 12. Springer-Verlag, 2002.

[27] G. Strang. Linear algebra and its applications. Academic Press, 1976.

[28] D. Varolin. Riemann surfaces by way of complex analytic geometry, volume 125. American Mathematical Society, 2011.

[29] R.J. Walker. Algebraic curves. Springer-Verlag, 1978.

[30] C.W. Wampler, A.P. Morgan, and A.J. Sommese. Complete solution of the nine-point path synthesis problem for four-bar linkages. Journal of Mechanical Design, 114:153, 1992.

[31] J.H. Wilkinson. The algebraic eigenvalue problem. Oxford University Press, 1988.