PERSISTENT TOPOLOGICAL LAPLACIANS AND THEIR APPLICATIONS

By

Xiaoqi Wei

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

Mathematics—Doctor of Philosophy

2024

ABSTRACT

Topological and geometrical methods are known for their capability to reduce noise and

have achieved significant success in analyzing complex biological data. A key method in

topological data analysis is persistent homology, which leverages a filtration of simplicial

complexes to extract multiscale spatial information. To integrate non-spatial information,

specially tailored persistent homology approaches, such as element-specific persistent ho-

mology, have been proposed and have shown significant success in predictive modeling of

molecular structures.

Recently, it was discovered that persistent Laplacians can be defined for a filtration, and

the nullity of a persistent Laplacian is equal to the corresponding persistent Betti number,

suggesting that the spectra of persistent Laplacians offer additional information beyond

traditional persistent homology. Spectra of persistent Laplacians can be used in combination

with persistent homology to enhance the featurization of raw biological data. Inspired by the

theory of cellular sheaves, the theory of persistent sheaf Laplacians was proposed; spectra

of persistent sheaf Laplacians encode both spatial and non-spatial information of a labeled

point cloud. The theory of persistent sheaf Laplacians provides an elegant method for fusing

different types of data and holds significant potential for future development.

The construction of persistent Laplacians can also be easily generalized to other settings,

such as digraphs and hypergraphs. These generalizations are important, as they offer various

ways to integrate different types of biological information.

In this thesis, we introduce

persistent Laplacians and some generalizations, such as persistent sheaf Laplacians, and

discuss their applications in biology.

Copyright by
XIAOQI WEI
2024

ACKNOWLEDGEMENTS

First and foremost, I would like to express my deepest gratitude to my PhD supervisor,

Guo-Wei Wei, for his invaluable guidance, continuous support, and for leading me onto the

path of applied mathematics. His mentorship has been fundamental to my growth as a

researcher.

I would also like to extend my thanks to my committee members, Ekaterina

Rapinchuk, Moxun Tang, and Yiying Tong, for their support and guidance throughout my

PhD study.

I am also grateful to my research group members, in particular Dong Chen, Jiahui Chen,

Kaifu Gao, Shen Li, Rui Wang, and Menglun Wang, for their valuable discussions, insights,

and assistance throughout my PhD journey. Their collaboration has been an essential part

of my research experience.

I wish to acknowledge the professors who have greatly influenced my academic journey.

During my master’s studies, I would like to thank Qingtao Chen for his guidance and support

in my research. I am also grateful to Dong Tian, Bingzhe Hou, and Wenjie Gao for their

guidance during my undergraduate years.

My heartfelt thanks go to my friends from Changchun, Zurich, and East Lansing—particularly

Wenxian, Zheng, Hantian, Haozhi, Honghao, and Lin-Da—who have made my life brighter

and more colorful throughout my studies.

A special acknowledgment goes to Mr. Wei Wang, who sparked my interest in mathe-

matics and nurtured my passion for the subject from a young age.

Lastly, I would like to extend my deepest appreciation to my family. In particular, I am

profoundly grateful to my parents and my grandmother for their endless encouragement and

love.

iv

TABLE OF CONTENTS

CHAPTER 1

INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . .

CHAPTER 2

PERSISTENT LAPLACIANS . . . . . . . . . . . . . . . . . . . .

CHAPTER 3

(CO)SHEAVES AND PERSISTENT SHEAF LAPLACIANS . . .

CHAPTER 4

OTHER PERSISTENT TOPOLOGICAL LAPLACIANS . . . . .

CHAPTER 5

UNDERSTANDING DOMINANT VARIANTS OF SARS-COV-2
USING LAPLACIANS . . . . . . . . . . . . . . . . . . . . . . . .

CHAPTER 6

THESIS CONTRIBUTION AND FUTURE WORK . . . . . . . .

BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

APPENDIX A

STATISTICS OF NONZERO EIGENVALUES OF PERSISTENT
LAPLACIANS . . . . . . . . . . . . . . . . . . . . . . . . . . . .

APPENDIX B

METHOD OF HOMOTOPY CONTINUATION . . . . . . . . .

APPENDIX C

APPENDIX OF THE SHEAF CHAPTER . . . . . . . . . . . . .

APPENDIX D

APPENDIX OF THE SARS-COV-2 CHAPTER . . . . . . . . .

1

4

27

44

52

71

73

84

88

91

96

v

CHAPTER 1

INTRODUCTION

In recent years, advancements in experimental techniques in biological research have gen-

erated vast amounts of data requiring extensive analysis. A biological object can often

be viewed as a geometrical object in various ways; for instance, a protein’s atoms form a

point cloud, its molecular surface can be seen as a manifold, and its polypeptides may be

treated as knots or links. Since certain types of biological data possess spatial informa-

tion, topological and geometrical methods have shown great potential in analyzing them

[17, 40, 48, 78, 137, 146].

One prime example of applying topological methods in biology is the use of persistent

homology in predictive modeling of molecular structures [16, 18, 19, 20, 144, 145]. A fun-

damental principle in molecular biology is that structure dictates function. Since critical

structural information often varies depending on the problem and dataset, using machine

learning to capture this information is more effective. When employing supervised learning

methods, such as gradient boosting trees, the challenge lies in featurization, i.e., mapping a

high-dimensional raw molecular structure to a low-dimensional space while preserving suf-

ficient structural information. As molecules can be naturally viewed as point cloud data,

persistent homology provides a solution to this challenge. The basic idea of persistent ho-

mology is to create a multiscale family of simplicial complexes (referred to as a filtration)

from the molecule and describe the evolution of these simplicial complexes. For any two

simplicial complexes X ⊂ Y in the filtration, the persistent homology group Hi(X, Y ) can

be calculated, and the collection of all persistent homology groups describes the shape evo-

lution of the simplicial complexes and provides a multiscale and low-dimensional topological

characterization of the point cloud.

It is important to remember that biological data are not purely geometrical, and successful

applications of topological or geometrical methods must incorporate non-spatial information.

A good mathematical representation of biological data should capture the spatial information

1

and the non-spatial information at the same time. For example, when studying molecular

structures, to extract finer structural information, we can use only atoms of certain element

types or of certain roles when performing persistent homology analysis (this approach is

referred to as the element specific persistent homology [20]).

If Vietoris-Rips filtration is

employed, we can also modify the distance matrix to emphasize specific types of interactions

between atoms.

The major theme of this thesis is the persistent Laplacian approach, proposed to com-

plement persistent homology and its generalizations that integrate various non-spatial infor-

mation. Laplacians are ubiquitous in science and engineering and are deeply connected to

homology theories. In graph theory, it is known that the nullity of the graph Laplacian is

equal to the number of connected components, and the smallest nonzero eigenvalue, called

the Fiedler value, reflects the graph’s connectivity. For simplicial complexes, the combina-

torial Laplacian [43] is defined for each dimension on a simplicial complex, and we can prove

that the kernel of a combinatorial Laplacian is isomorphic to the corresponding simplicial

homology group. On a differentiable manifold, de Rham-Hodge theory states that the kernel

of a Hodge Laplacian is isomorphic to the corresponding de Rham cohomology group. Thus,

combinatorial Laplacians are discretizations of Hodge Laplacians. Another discretization of

Hodge Laplacians can be achieved through discrete exterior calculus [42], and a multiscale

formulation of Hodge Laplacians on manifolds was introduced by Chen et al. [32] to study

manifold-type data. For knot-type data, multiscale Laplacians also exist in the context of

Khovanov homology [120].

Persistent Laplacians [84, 133] are the counterparts of combinatorial Laplacians in the

context of persistent homology. It has been suggested that the spectra of persistent Lapla-

cians not only retain information from persistent homology but also provide additional spatial

information1. In the most general sense, any method that utilizes multiscale Laplacians to

quantitatively analyze data can be referred to as a persistent Laplacian approach. The the-

1However, beyond simple cases, making explicit statements about the relationship between shape and

spectrum is often challenging.

2

ory of persistent Laplacians has been studied extensively in [63, 88, 96]. Algorithms have

been developed for computing persistent Laplacians [96, 135], and these Laplacians have

been applied to protein-ligand binding prediction [97], SARS-CoV-2 research [28, 139], and

protein engineering [105]. To integrate non-spatial information, persistent Laplacians have

been extended to various mathematical settings, such as cellular sheaves [141], flag complexes

[75], digraphs [134], hypergraphs [90], and hyperdigraphs [24].

This thesis is organized as follows. In Chapter 2, we introduce the theory of persistent

homology and persistent Laplacians. In Chapter 3, we develop the theory of persistent sheaf

Laplacians, which can be applied to analyze labeled point clouds. In Chapter 4, we review

some recent developments concerning persistent Laplacians. In Chapter 5, we demonstrate

the application of persistent Laplacians in COVID-19 research.

3

CHAPTER 2

PERSISTENT LAPLACIANS

In this chapter we first present a very brief introduction to persistent homology theory and

then give the definition of a persistent Laplacian. [99] is a very good reference on simplicial

complexes. [44] contains an introduction to basic concepts of topology from a computational

perspective.

Simplicial complexes and combinatorial Laplacians

Definition 2.0.1. A q-simplex, denoted as σq = {u0, . . . , uq}, is the convex hull of q + 1

affinely independent points u0, . . . , uq in Rn. The orientation of σq is determined by the

ordering of the vertices and two orderings define the same orientation if and only if they

differ by an even permutation. The dimension of σq = {u0, . . . , uq} is defined as q. For

0 ≤ i ≤ n, {u0, . . . , ˆui, . . . , uq} is said to be a face of σq, where the hat indicates the omission

of the vertex ui.

Definition 2.0.2. A finite set of simplices, X, is a simplicial complex if the following con-

ditions are satisfied: (1) all faces of any simplex in X are also in X; (2) the non-empty

intersection of any two simplices in X is a common face of the two simplices. The dimension

of a simplicial complex X is defined as the maximal dimension of its simplices.

Definition 2.0.3. Given a finite set V , an abstract simplicial complex X is a collection of

subsets of V , such that if a set σ is in X, then any subset of σ is also in X. A set σ that

consists of q + 1 elements is referred to as a q-simplex. If σ is a subset of τ , then we say that

σ is a face of τ and denote the face relation by σ ⩽ τ . If X and Y are abstract simplicial

complexes and X ⊂ Y , then X is referred to as a subcomplex of Y .

(a) 0-simplex

(b) 1-simplex

(c) 2-simplex

(d) 3-simplex

Figure 2.1 Illustrations of simplices.

4

The concepts of abstract simplicial complexes and simplicial complexes are closely re-

lated. One may build a simplicial complex from an abstract simplicial complex or vice versa

[44]. They contain exactly the same combinatorial information. From now on we will not

distinguish abstract simplicial complexes from simplicial complexes.

Example 2.0.1. A simple graph G = (V, E) can be seen as a simplicial complex, since each

edge {vi, vj} ∈ E is a subset of the vertex set V .

Example 2.0.2. Suppose V is a finite set of points in Rn. Given a real number d, a

Rips complex Xd can be defined as follows. A set σ = {va0, . . . , vaq } ∈ Xd if and only if the
Euclidean distance (cid:13)
. We can see that Xs ⊂ Xt

(cid:13)
(cid:13) ≤ d for any pair of points vai

and vaj

(cid:13)vai, vaj

if s ≤ t.

Example 2.0.3. Given a finite set of points V in Rn, we can also build an Alpha complex.

First we define the Voronoi cell. The Voronoi cell of a point u in V is

Vu = {x ∈ Rn | ∥x − u∥ ≤ ∥x − v∥ , v ∈ V }.

Let Bu(r) be the closed ball with center u and radius r. Denote the intersection Bu(r) ∩ Vu

by Ru(r). Then the Alpha complex Alpha(r) is defined by

{σ ⊂ V |

(cid:92)

u∈σ

Ru(r) ̸= ∅}.

In other words, Alpha(r) is the nerve of cover {Ru(r), u ∈ V }. It is also true that Alpha(r1) ⊂

Alpha(r2) if r1 ≤ r2.

Each simplicial complex has an associated algebraic structure called the simplicial chain

complex. For the sake of simplicity, we will designate a fixed global ordering of vertices in

a simplicial complex, and require that vertices of any simplices should be ordered according

to the fixed ordering1. For example, suppose we use the natural ordering 0 < 1 < 2 for

the simplicial complex {{0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}} (Figure 2.2b), then we must not

1A fixed ordering is not necessary. The reader can find more information in [99].

5

write the simplex {0, 1} as {1, 0}. To emphasize that a simplex {v0, . . . , vq} is ordered, we

will use notation [v0, . . . , vq] or v0 . . . vq.

2

2

2

0

1

0

1

0

1

(a)

(b)

(c)

Figure 2.2 (a) The simplicial complex {{0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}}. (b) The
oriented simplicial complex {0, 1, 2, 01, 02, 12}. Arrows emphasize that vertices are ordered.
(c) The oriented simplicial complex {0, 1, 2, 01, 12}.

Definition 2.0.4. A simplicial chain complex consists of a sequence of real vector space

Cq(X) and a sequence of linear homomorphisms ∂q between them. It is usually written out

as follows:

· · ·

∂3

C2(X)

∂2

C1(X)

∂1

C0(X)

0

where the real vector space Cq(X) is generated by q-simplices. An element of Cq(X) is called

a q-chain, and we can actually represent a q-chain by a function fq whose domain is the set

of q-simplices. The boundary operator ∂q is a linear map such that

∂q[va0, . . . , vaq ] =

(cid:88)

(−1)i[va0, . . . , ˆvai, . . . , vaq ].

i

Here the symbol ˆvai
boundary operator is well-defined.

means that ˆvai

is deleted. The ordering of vertices ensures that the

It is well known that ∂q∂q+1 = 0, so the q-th homology group Hq = ker ∂q/ im ∂q+1

is well-defined. The dimension of the homology group Hq is referred to as the q-th Betti

number.

Example 2.0.4. At least for some simple simplicial complexes, the q-th Betti number counts

the number of q-dimensional holes. The simplicial complex X = {0, 1, 2, 01, 02, 12} (Figure

6

2.2b) has only two chain groups C0 and C1, and one boundary map ∂1 which can be repre-

sented by the matrix

01

12

02

0 −1


0 −1







1

2

0

1 −1

1










.

0

1

Recall that we can identify a q-chain (cid:80)

σ cσσ with a function fq such that fq(σ) = cσ. The
operator ∂1 maps a real-valued function f1 : {01, 12, 02} → R to a function f0 : {0, 1, 2} → R

where f0(0) = −f1(01) − f1(02), f0(1) = f1(01) − f1(12), f0(2) = f1(12) + f1(02). Since

C2 = 0, the homology group H1 is ker ∂1 and we can verify that f1 ∈ H1(X) implies

f1(01) = −f1(02) = f1(12). In other words, the 1-th Betti number is 1.

For the simplicial complex Y = {0, 1, 2, 01, 12} (Figure 2.2c), the matrix representation

of ∂1 is

01

12


0 −1







1

2

0










0

1

1 −1

and we can verify that the only f1 ∈ C1(Y ) that satisfies ∂1f1 = 0 is the zero function.

The intuition behind the difference of H1(X) and H1(Y ) is that, in X the edges {01, 12, 02}

constitute a close path, while in Y there are no close paths.

Generally, a sequence of abelian groups and group morphisms

∂q+2

· · ·

Aq+1

∂q+1

Aq

∂q

Aq−1

∂q−1

· · ·

where ∂q∂q+1 = 0 holds, is referred to as a chain complex. The q-th homology group Hq =

ker ∂q/ im ∂q+1 is readily defined. If each chain group Aq is an inner product space, the q-th

7

combinatorial Laplacian or Hodge Laplacian ∆q : Aq → Aq [43] is defined by

∂q+1∂†

q+1 + ∂†

q∂q,

where ∂†
q

denotes the adjoint of ∂q. The q-th combinatorial Laplacian ∆q is a positive semi-

definite symmetric operator and therefore only has non-negative eigenvalues. We often call

∂q+1∂†

q+1

the q-th up Laplacian and denote it by ∆q,+, and ∂†

q∂q the q-th down Laplacian and

denote it by ∆q,−. A sequence of abelian groups and group morphisms

· · ·

∂q−2

Aq−1

∂q−1

Aq

∂q

Aq+1

∂q+1

· · ·

where ∂q∂q−1 = 0, is called a cochain complex. The q-th cohomology group is ker ∂q/ im ∂q−1,

and the q-th combinatorial Laplacian for a cochain complex is defined analogously when each

cochain group is an inner product space.

Example 2.0.5. The graph Laplacian L of a graph G = (V, E) is usually defined element-

wise by

Lij =






deg vi,

if i = j

−1,

0,

if i ̸= j and vi is adjacent to vj

otherwise

The graph Laplacian L actually coincides with the matrix representation of 0-th combina-

torial Laplacian ∆0. Take the graph {[a, b], [b, c], [c, a]} as an example.

b

a

c

8

Its graph Laplacian is

a

b

c



2 −1 −1

2 −1

a




b −1




c −1 −1

2

The matrix representation of ∂1 is

ab

bc

ac










.










,


a −1







b

c

0

0 −1

1 −1

0

1

1

1

0









=









and









−1

0 −1

−1

2 −1 −1

1 −1

0

1

0

1

0 −1 1

−1

2 −1

−1

0

1

−1 −1

2

























.

Example 2.0.6. Consider the 2-simplex shown below. We compute its ∆1.

c

a

b

ab

bc

ac

The matrix representation of ∂1 is


a −1







b

c

0

0 −1

1 −1










,

0

1

1

9

and the matrix representation of ∂2 is

abc



1

ab

bc







ac −1

1










.

The matrix representation of ∆1 = ∂2∂†

2 + ∂†









1

1









(cid:18)

(cid:19)

1 1 −1

+









−1

−1

1

0

−1

0 −1

3 0 0

1∂1 is















0 −1 1

−1

0

1

1 −1

0

1

0

1

















=









0 3 0

.

0 0 3

σi

Definition 2.0.5. Two q-simplices σi and σj are said to be lower adjacent, denoted by
L∼ σj, if they share a common (q − 1)-face. They are said to be upper adjacent, denoted
U∼ σj, if they both are faces of a (q + 1)-simplex. The lower degree degL(σ) of a q-
simplex σ is q + 1, the number of its (q − 1)-faces. The upper degree degU (σ) of a q-simplex
σ is defined as the number of (q + 1)-simplices in K of which σ is a face. The degree of

by σi

q-simplex σ is defined by

degU (σ) + degL(σ) = degU (σ) + q + 1.

Now suppose σi

U∼ σj with a common upper (q + 1)-simplex τ . Let’s examine the signs of

the coefficients of σi and σj in the boundary ∂τ of τ . We say that σi and σj are similarly

oriented if the signs of the coefficients of σi and σj in ∂τ are the same; They are dissimilarly

oriented if the signs are different.

Next suppose σi

L∼ σj with common lower (q − 1)-simplex η. Let’s examine the signs of

the coefficients of η in ∂σi and ∂σj. We say that η is a similar common lower simplex of

σi and σj if the signs are the same; η is a dissimilar common lower simplex if the signs are

different.

The following proposition gives an explicit formula for Lq, the matrix representation of

∆q.

10

Proposition 2.0.1. [52, Theorem. 3.3.4] Suppose we have a finite simplicial complex and

its set of q-simplices is {σ1, . . . , σn}.

(1) When q = 0,

degU σi,






−1,

0,

if i = j

if i ̸= j and σi

U∼ σj

if i ̸= j and σi

U≁ σj

Lij =

(2) If q > 0, then

Lij =






degU σi + q + 1,

if i = j

1,

−1,

0,

if i ̸= j, σi

U≁ σj and have a similar

common lower simplex

if i ̸= j, σi

U≁ σj and have a dissimilar

common lower simplex

if i ̸= j and either σi and σj are upper

adjacent or are not lower adjacent

Next we are going to show that the kernel of ∆q is isomorphic to the q-th homology

group. We first need the following lemma.

Proposition 2.0.2. [85] If U, V, W are finite-dimensional inner product spaces and f : U →

V , g : V → W are two linear morphisms such that gf = 0,

U

f

V

g

W

then ker(g†g + f f †) ∼= ker g/ im f , and V = im g† ⊕ ker(g†g + f f †) ⊕ im f .

Proof. We give the reader an outline of the proof. Since ⟨g†g+f f †v, v⟩ = ⟨gv, gv⟩+⟨f †v, f †v⟩,

v ∈ ker(g†g + f f †) is equivalent to v ∈ ker g ∩ ker f †. Since ker f † is orthogonal to im f , and

im f ⊂ ker g, we know that ker(g†g + f f †) is isomorphic to the orthogonal complement of

im f in ker g.

11

The condition that v⊥ im f implies that ⟨v, v⟩ ≤ ⟨v + f u, v + f u⟩ for any u ∈ U .

Conversely,

if ⟨v + f u, v + f u⟩ − ⟨v, v⟩ = 2⟨v, f u⟩ + ⟨f u, f u⟩ >= 0 for any u ∈ U ,

⟨v, f u⟩ must be 0, otherwise we can always multiply a coeffcient k to u to make the sum

2k⟨v, f u⟩ + k2⟨f u, f u⟩ < 0. Therefore, for a equivalent class v′ + im f in ker g/ im f , its

corresponding element v in ker(g†g + f f †) is the element that minimizes its ‘size’ ⟨v, v⟩.

Each chain group of a simplicial chain complex has a canonical inner product structure.

We can just let ⟨σ, τ ⟩ = 0 if σ ̸= τ , and ⟨σ, τ ⟩ = 1 if σ = τ . Therefore the q-th combinatorial

Laplacian is readily defined for a simplicial complex. Apply Proposition 2.0.2 to a simplicial

chain complex, we know that the kernel of the q-th combinatorial Laplacian ∆q is isomorphic

to the q-th homology group Hq [43] (sometimes called the Hodge theorem), and Aq admits

a Hodge decomposition

Cq(X) = im ∂†

q ⊕ ker ∆q ⊕ im ∂q+1.

Proposition 2.0.3. The nonzero spectrum of ∆q is the union of the nonzero spectra of ∆q,+

and ∆q,−.

Proof. This is derived from the Hodge decomposition. Since ∆q|im ∂†
and ∆q|im ∂q+1 = ∆q,−|im ∂q+1 ⊂ im ∂q+1, ∆q is indeed the orthogonal direct sum

= ∆q,+|im ∂†

q

q

⊂ im ∂†
q

0|ker ∆q ⊕ ∆q,+|im ∂†

q

⊕ ∆q,−|im ∂q+1.

The operator ∆0 = ∂1∂†
1

is more commonly known as the graph Laplacian, and there is

a vast amount of work regarding the relation between the spectrum of graph Laplacian and

the shape of a graph [37]. For a connected graph, it is well known that the minimal nonzero

eigenvalue of the graph Laplacian reflects the graph connectivity [47]. Note that graphs that

have the same homology groups may have different graph Laplacians (Figure 2.3).

The intuition behind the Hodge theorem when q = 0 is as follows. For a simple graph

(V, E), let f0 be a function that sends every vertex to a real number. If we view the simple

12

(a)

(b)

(c)

Figure 2.3 Simplicial homology can distinguish (a) from (b) and (c), but cannot distinguish
between (b) and (c). The graph Laplacian can distinguish among all of them. Indeed for a
cycle graph with n vertices, the spectrum of the graph Laplacian is {2 − 2 cos(2kπ/n) | k =
1, . . . , n}.

graph as a simplicial complex, ∂†
1

maps f0 to a real valued function whose domain is E. The

sum

(cid:88)

ij∈E

|f0(i) − f0(j)|2 = ⟨∂†

1f0, ∂†

1f0⟩ = ⟨f0, ∂1∂†

1f0⟩

is called the Dirichlet energy of f0, and it measures how f0 varies over V . Any f0 ∈ ker ∆0 =

ker ∂1∂†
1

is a function with zero Dirichlet energy. For a connected graph, if f0 has zero

Dirichlet energy then f0(a) = f0(b) for any two vertices a and b, because there is always

a path that starts from a and ends at b. In other words, f0 is a constant function. If a

graph has more than one connected components, f0 only needs to be locally constant on any

connected components. In other words, the dimension of ker ∆0 is equal to the number of

connected subgraphs, which is also the dimension of H0.

Persistent homology

Now suppose X is a subcomplex of Y . Then the q-th chain group of X is a subspace of

the q-th chain group of Y , as shown in the following diagram

∂q+2

· · ·

Cq+1(X)

∂q+1

Cq(X)

∂q+2

· · ·

Cq+1(Y )

∂q+1

Cq(Y )

∂q

∂q

Cq−1(X)

∂q−1

· · ·

Cq−1(Y )

∂q−1

· · ·

where hooked dashed arrows represent inclusion maps ι : Cq(X) (cid:44)→ Cq(Y ). The inclusion ι

is a chain map and induces a map ι• : Hq(X) → Hq(Y ). Intuitively, ι• sends a cycle, i.e., an

element of ker ∂q(X), to itself in ker ∂q(Y ). The q-th persistent homology for the pair (X, Y )

13

is the image

Proposition 2.0.4.

ι•(Hq(X)).

ι•(Hq(X)) ∼=

ker ∂q(X)
ker ∂q(X) ∩ im ∂q+1(Y )

.

Proof. For any α ∈ ker ∂q(X), α + im ∂q+1(X) ∈ ker ι• is equivalent to α ∈ im ∂q+1(Y ).

α ∈ ker ∂q(X) ∩ im ∂q+1(Y ), is equivalent to

α + im ∂q+1(X) ∈

ker ∂q(X) ∩ im ∂q+1(Y )
im ∂q+1(X)

.

ker ι• =

ker ∂q(X) ∩ im ∂q+1(Y )
im ∂q+1(X)

,

Therefore,

and

ι•(Hq(X)) ∼=

ker ∂q(X)/ im ∂q+1(X)
ker ∂q(X) ∩ im ∂q+1(Y )/ im ∂q+1(X)

∼=

ker ∂q(X)
ker ∂q(X) ∩ im ∂q+1(Y )

.

The quotient space ker ∂q(X)/(ker ∂q(X) ∩ im ∂q+1(Y )) has an intuitive interpretation.

The space ker ∂q(X) ∩ im ∂q+1(Y ) actually corresponds to all the (q + 1)-simplices in Y

whose boundaries are in X. When X evolves into Y (more simplices are added), some

topological features of X will be lost (a cycle maybe filled). We can say that the quotient

space ker ∂q(X)/(ker ∂q(X) ∩ im ∂q+1(Y )) captures the persistent topological features of X.

Persistence modules

Given a point cloud, a filtration of simplicial complexes can be constructed in various

ways. Recall that a filtration is a sequence of simplicial complexes {Xt}, where Xs ⊂ Xt if

s ≤ t.

14

Xt. When t =
√

t goes from

Example 2.0.7. One popular construction is called the Rips filtration, where Xt is the Rips

complex with t as the threshold: a simplex is in Xt if and only if the distance between any pair

of its vertices is at most t. Because there are only finitely many possible pairwise distances,

Xt will change only for a finite number of times. Consider the point cloud {x = (1, 0), y =

(0, 1), z = (−1, 0), w = (0, −1)} ⊂ R2 shown in 2.4a. When t = 0, there are no edges in

√

2, Xt changes for the first time and becomes {x, y, z, w, xy, yz, zw, xw}. If

2 to 2, X2 = X√

2 ∪ {xz, yw, yzw, xzw, xyw, xyz, xyzw}. As t becomes bigger,

Xt contains more and more simplices.

(a)
Figure 2.4 (a) X0 = {x, y, z, w}. (b) X√
{xz, yw, yzw, xzw, xyw, xyz, xyzw}.

(b)

(c)

2 = {x, y, z, w, xy, yz, zw, xw}. (c) X2 = X√

2 ∪

After a filtration is constructed, each inclusion map Xs ⊂ Xt

induces a map ι•
s,t

:

Hq(Xs) → Hq(Xt) for each q ≤ 0. These homology groups and maps form a sequence

· · ·

Hq(Xti)

ι•
ti,ti+1

Hq(Xti+1)

ι•
ti+1,ti+2

Hq(Xti+2)

· · · .

Note that ι•

ti+1,ti+2

ι•
ti,ti+1

= ι•

ti,ti+2

. Using the language of category theory, we can say that such

a sequence is a functor from a totally order set {ti} (ti ≤ ti+1 is thought of as a morphism

ti → ti+1) to the category of vector spaces. Such a functor is called a persistence module.

This viewpoint will pave the way for further generalizations.

It is known that a persistence module has a unique decomposition into fundamental build-

ing blocks [151]. We usually view the filtration {Xt} as a temporal evolution, so homology

class are ‘born’ and ‘killed’ at certain timestamps. Each fundamental building block in the

decomposition of a persistence module will be interpreted as the life of a homology class.

This unique decomposition is often represented as a persistence diagram or barcodes. The

15

number of homology classes that are born and killed at certain timestamps can actually be

calculated from persistent Betti numbers [44].

Persistent Laplacians

It is possible to construct a symmetric semi-definite operator whose kernel is isomorphic

to a given persistent homology group. In this section we slightly generalize the notion of

persistent homology to the setting of differential graded inner product spaces 2, and give the

definition of a persistent Laplacian.

Definition 2.0.6. A differential graded inner product space (V, dV ) is just a chain complex

dV
q+2

· · ·

Vq+1

dV
q+1

Vq

dV
q

Vq−1

dV
q−1

· · ·

whose chain groups are inner product spaces. One can think of V as the direct sum of all

Vq. When we say (V, dV ) is a subspace of (W, dW ), we mean that Vq is a subspace of Wq for

any q, and the inner space structure of Vq and boundary operator d of (V, dV ) are inherited

from (W, dW ).

For a pair of differential graded inner product spaces (V, dV ) ⊂ (W, dW ), the q-th persis-

tent homology group is defined analogously by

ι•(Hq(V )) ∼=

ker dV
q
q ∩ im dW
q+1

ker dV

.

Observe that ker dV

q ∩ im dW

q+1 = Vq ∩ im dW
q+1

. The preimage of Vq ∩ im dW
q+1

under dW
q+1

is

just (dW

q+1)−1(Vq) = {w ∈ Wq+1 | dW

q+1w ∈ Vq}. Hence, ker dV

q ∩ im dW
q+1

is the image of

πdW

q+1|(dW

denote πdW

q+1)−1(Vq) : (dW
q+1|(dW

q+1)−1(Vq) → Vq, where π = ι†, the projection map from W to V . We
. These maps are shown in the

, and (dW

q+1)−1(Vq) by ΘV,W

q+1

by dV,W
q+1

q+1)−1(Vq)

2The extension of persistent Laplacians to the setting of differential graded inner product spaces first

appeared in [88].

16

following diagram

dV
q

(dV

q )†

Vq−1

Vq+1

dV
q+1

dV,W
q+1

ΘV,W
q+1

dW
q+1

Wq+1

Vq

(dV,W

q+1 )†

Wq

where hooked dashed arrows represent inclusion maps. We define the q-th persistent Lapla-

cian ∆V,W

q

: Vq → Vq by

(dV

q )†dV

q + dV,W

q+1 (dV,W

q+1 )†.

Because dV

q dV,W

q+1 = 0, by Proposition 2.0.2 we see that

ker ∆V,W

q

∼=

ker dV
q
q ∩ im dW
q+1

ker dV

.

This relation is sometimes referred to as the persistent Hodge theorem. The operators

q+1 )† and (dq)†dq are sometimes referred to as the up persistent Laplacian and the

q+1 (dV,W
dV,W
down persistent Laplacian, respectively.

If there is an inner product preserving chain map f : (V, dV ) → (W, dW ), the q-th per-

sistent Laplacian can also be defined [88]. The concept of a persistent Laplacian was first

introduced by André Lieutier in 2014 [84]. Later in 2020, it was rediscovered independently

by Wang et al [133]. Persistent Laplacians are originally defined for simplicial chain com-

plexes (C•(X), ∂) and (C•(Y ), ∂) where X is a subcomplex of Y .

The matrix representation of a persistent Laplacian

In this section we will explain the calculation of the matrix representation of a persistent

Laplacian with simple examples.

Example 2.0.8. We compute the persistent Laplacian ∆X,Y

1

for X and Y in Figure 2.5.

Since each chain group has a canonical orthonormal basis, the matrix representation of ∂† is

17

1

2

0

3

1

2

0

3

Figure 2.5 X = {0, 1, 2, 3, 01, 12, 23, 03} and Y = {0, 1, 2, 3, 01, 12, 23, 03, 02, 012, 023}.

(a) X

(b) Y

the transpose of the matrix representation of ∂. The matrix representation of ∂X
1

is

01

12

23

03

and we easily get the matrix representation of (∂X

1 )†

1


0 −1











0

2

3

0

0

0 −1

1 −1

0

1 −1

0

1

1

1

2

0

1

0 −1

0

12


01 −1










03 −1

23

0

0

0

1

3

0

0














,














.

0 −1 1

0

0

1

18

The matrix representation of ∂Y
2

is

012 023



1

01

1

12

23















02 −1

03

0

0

0

0

1

−1

1


















.

Our goal is to find a basis of the subspace ΘX,Y

2

. We first try to make the submatrix

012 023

(cid:18)
02 −1

(cid:19)

1

in column echelon form. We apply one column reduction and get

012 023 + 012



1

01

1

12

23















02 −1

03

0

0

1

1

1

−1

0










.








Therefore, ΘX,Y

2 = span(023 + 012) and one matrix representation of ∂X,Y

2

is














.

023 + 012














01

12

23

03

1

1

1

−1

19

Generally, for f : V → W , if we choose arbitrary bases of V and W and take a matrix

representation [f ] of f , then the matrix representation [f †] of f † is P −1[f ]T Q, where P and

Q are inner product matrices of V and W respectively. If we use {023 + 012} as the basis

of ΘX,Y
2

, then the inner product matrix of ΘX,Y

2

is 2 (the square of the norm of 023 + 012).

The corresponding matrix representation of (∂X,Y

2

)† is

(cid:18)

(cid:19)

1 1 1 −1

1
2

and the matrix representation of the up Laplacian is

01

12

23

03



1/2

1/2

1/2 −1/2

1/2

1/2

1/2 −1/2

1/2

1/2

1/2 −1/2

01

12

23












03 −1/2 −1/2 −1/2

1/2














.

We can also find an orthonormal basis for ΘX,Y

2

at first and just take the transpose. After

the calculation of the up persistent Laplacian and the down persistent Laplacian, we only

need to add them to get the persistent Laplacian.

One of the main contributions of [63, 96] is that the up persistent Laplacian can be

calculated via the Schur complement. Let’s first recall the definition.

Definition 2.0.7. For a square matrix M =






A B

C D



, the Schur complement [23] of D in




M , denoted by M/D, is given by A − BD−1C where D−1 is the Moore-Penrose generalized

inverse. We can view M as an operator on Rn, so M/D is a way of ‘restricting’ M to

the subspace Rm corresponding to A. Now suppose L : V → V is a linear operator on a

finite-dimensional real inner product space V , and W is a subspace, we can define the Schur

restriction of L onto W as follows. We first choose bases for W and W ⊥. With respect

20

to the chosen bases, the matrix representation of L is a block matrix

W

W W ⊥

A

B








,




W ⊥ C D

and the Schur restriction Sch(L, W ) : W → W is naturally defined by the linear operator

represented by A − BD−1C.

Gulen et al.

[63] showed that if L is self-adjoint positive semi-definite, the Schur re-

striction Sch(L, W ) is well defined, i.e., independent of the choice of bases of W and W ⊥.

Moreover, they proved the following proposition.

Proposition 2.0.5. Let f : ˆV → V be a linear morphism between two finite-dimensional real

inner product space, and L = f f † : V → V . For any subspace W ⊂ V , let fW : f −1W → W

be the restriction of f on f −1W (its codomain is also restricted to W ). Then the Schur

restriction of L onto W is fW (fW )†.

This proposition ensures that we can first compute ∂Y

q+1(∂Y

q+1)† and then compute the

Schur complement. In the above example, the matrix representation of ∂Y

q+1(∂Y

q+1)† is

01

12

23

03

02



1

1

0

0

1

1

0

0

0

0 −1

0 −1

1 −1

1

0 −1

1 −1

01

12

23

03
















02 −1 −1

1 −1

2


















.

Note that Cq(X) = span{01, 12, 23, 03}. We treat this matrix as a block matrix






A B

C D






21

where A is

01 12

23

03














01

12

23

03

1

1

0

0














.

1

1

0

0

0

0

0

1 −1

0 −1

1

The Schur complement A − BD−1C is

01

12

23

03



1/2

1/2

1/2 −1/2

1/2

1/2

1/2 −1/2

1/2

1/2

1/2 −1/2

01

12

23












03 −1/2 −1/2 −1/2

1/2








.






Example 2.0.9. Consider two complexes X = {0, 1, 2, 3, 4, 02, 04, 14, 23, 34} and Y = X ∪

{024, 234}. Let’s first calculate the matrix representation of ∆X,Y

1

. The matrix representation

of ∂X
1

is

0




1 −1












0

0

2

3

4

1

14

02

23

04

34



0 −1

0 −1

0

0

0










.








0

0

1 −1

0

0

0

0

1

0

0 −1

1

1

22

The matrix representation of ∂Y
2

is

024 234



0

14

1

0

02

23












04 −1







34

0

24

1

0

0

1

0

1

−1





















.

Note that 24 /∈ C1(X). It is easy to see that span{024 + 234} = ΘX,Y

2

. So ∂X,Y

2

is equal to

024 + 234


















14

02

23

04

34

0

1

1

−1

1


















and P is equal to ⟨024 + 234, 024 + 234⟩ = 2. So ∆X,Y

1

is equal to

2

0















0

0

1

1



2.5 −0.5

0.5

0.5

0 −0.5

2.5 −0.5 −0.5

1

1

0.5 −0.5

2.5

0.5 −0.5

0.5

0.5

2.5

.













Example 2.0.10. Consider the complex Y

1

3

4

2

23

and let X = {1, 2}. We compute ∆X,Y

0

. The matrix representation of ∂X
1

is

13

34

24

2


1 −1











0

3

4

0

0

0

0 −1

1 −1

1

0

1








.






After a few steps of Gauss elimination we get

13

13 + 34 −13 − 34 + 24

2


1 −1











1

3

0

4

0

−1

0

0

1














.

1

−1

0

0

It is clear that ΘX,Y

1 = span{−13 − 34 + 24}, P = 3 and the matrix representation of ∂X,Y

1

is

−13 − 34 + 24






1

2

1

−1




.

Then the matrix representation of ∆X,Y

0

is






Its spectrum is {0, 2/3}.

1/3 −1/3

−1/3

1/3




 .

Eigenvalues and eigenvectors of a Laplacian

There are already some results concerning the relation between spectra of Laplacians and

the shape of a simplicial complex [52, 72]. How do we interpret eigenvectors of a Laplacian?

24

For an eigenvector of a q-th combinatorial Laplacian, we can look at the shape of q-simplices

where the eigenvector has support (signs are arbitrary because they are affected by the fixed

ordering of vertices). Empirical observations [79, 98, 138] suggest that:

(a) harmonic eigenvectors (eigenvectors of zero eigenvalues) have support near q-

dimensional “holes” (or vertices in a connected component when q = 0).

(b) non-harmonic eigenvectors (eigenvectors of nonzero eigenvalues) have support near

“clusters” of q-simplices. As to persistent Laplacians, very little is known about the geomet-

rical/topological interpretation of eigenvalues and eigenvectors.

The workflow of the (persistent) Laplacian approach

Given a point cloud, the usual workflow of topological data analysis is to first generate a

filtration {Xt} (t is often associated with distance) and then compute persistent homology

(often in the form of barcodes or a persistence diagram). If we want to employ Laplacians, we

need to select some pairs of Xt and Xs in the filtration and compute persistent Laplacians.

After we compute some Laplacians, we have to featurize Laplacians. Here featurization of

Laplacians is the process of transforming a set of Laplacians to a vector of a fixed size. As

the spectrum of a Laplacian is not affected by the global ordering of points in the point

cloud, most featurization methods focus on the spectrum. Some featurization methods are

summarized in [97]. So far the choices of pairs (Xt, Xs) and featurization methods require

some domain-knowledge and experience about the specific problem, and we wonder if any

data-driven (or self-learning) approach is possible.

Homotopy continuation and persistent Laplacians

Seeking new ways to calculate the spectrum of an operator is an active research topic

[3]. In addition to the traditional methods of numerical linear algebra, one may alternatively

calculate the spectrum by finding the roots of the characteristic polynomial associated with

the operator.

Homotopy continuation is a method for solving a single polynomial or systems of polyno-

mial equations. The essential idea is to build a homotopy between the system to be solved

25

(called the target system) and an easier system with known roots (called the start system)

and track down the known roots of the start system to the roots of the target system. As

systems of polynomial equations arise in mathematics, science, and engineering, homotopy

continuation methods have found applications in various areas, such as algebraic geometry

[69, 81], robot kinematics [130], optimal control [6], differential equations [1, 67], and biology

[62, 66, 113]. Several software packages implement homotopy continuation methods, such as

Bertini [7], HomotopyContinuation.jl [12], Hom4PS-3 [33], and PHCpack [128].

[140] ver-

ified that at least for some simple polytopes and small molecules in the three-dimensional

space, the minimal nonzero eigenvalues of persistent Laplacians calculated by homotopy

continuation are very close to the result from HERMES.

26

CHAPTER 3

(CO)SHEAVES AND PERSISTENT SHEAF LAPLACIANS

The goal of this chapter is to introduce the theory of cellular (co)sheaves and the extension

of persistent Laplacians to the setting of cellular sheaves. We will first introduce weighted

simplicial complexes, which can be viewed as cellular cosheaves.

Weighted simplicial complexes

Generally speaking, any simplicial complex whose simplices have weights can be called

a weighted simplicial complex. The weights can be geometrical, such as angles between

simplices, volumes of simplices, or non-geometrical such as numbers of scientific papers

coauthored by groups of people. Many theories and models involving weighted simplicial

complexes exist (e.g., [5, 9, 38, 103, 119]). Here we focus on the theory of weighted simplicial

complexes proposed by Robert J. MacG. Dawson [41], and developed in [14, 15, 83, 109, 111,

142, 143].

Definition 3.0.1. A weighted simplicial complex is a simplicial complex where each simplex

σ has a weight w(σ) valued in an integral domain R, such that if σ is a face of τ , then w(τ )

is divisible by w(σ).

The weighted chain complex of a weighted simplicial complex X is defined as follows.

Let the q-th chain group Cq(X, w) be the set of formal sums of nonzero weighted q-simplices

with coefficients in R. For σ = [va0, . . . , vaq ], we denote the face [va0, . . . , ˆvai, . . . , vaq ] by diσ.
The weighted boundary operator ∂ is defined by

∂σ =

q
(cid:88)

i=0

w(σ)
w(diσ)

(−1)idiσ.

We still have ∂2 = 0, because for 0 ≤ i < j ≤ q,

w(σ)
w(diσ)

w(diσ)
w(dj−1diσ)

=

w(σ)
w(djσ)

w(djσ)
w(didjσ)

=

w(σ)
w(didjσ)

;

hence, weighted homology groups can be defined analogously. Wu et al.

[142] pointed out

that in the proof of ∂2 = 0, what really matters is the quotient of weights.

If we write

27

w(τ )/w(σ) as ϕ(σ, τ ), then the equality

w(σ)
w(diσ)

w(diσ)
w(dj−1diσ)

=

w(σ)
w(djσ)

w(djσ)
w(didjσ)

becomes

ϕ(diσ, dj−1diσ)ϕ(σ, diσ) = ϕ(djσ, didjσ)ϕ(σ, djσ),

which means that any ϕ : X × X → R satisfying this equality induces a (ϕ-weighted)

boundary operator

∂qσ =

q
(cid:88)

i=0

(−1)iϕ(σ, diσ)diσ.

such that ∂2 = 0. A simplicial complex paired with such a generalized weight function ϕ is

called a ϕ-weighted simplicial complex.

Example 3.0.1. [142] A weighted polygon is a polygon with ϕ({vi, vj}, vi) = αi ∈ Z (Figure

3.1). The matrix representation of ∂1 is

α2

α3

α1

α4

α0

Figure 3.1 A weighted polygon.

v0v1

v0v4

v1v2

v2v3

v3v4

v0 −α0 −α0


















v1

v2

v3

v4

α1

0

0

0

0

0

0

α4

0

−α1

0

0

α2 −α2

0

0

0

0

0

α3 −α3

0

α4


















28

and the weighted H0 is dependent on αi. The weighted homology of weighted polygons might

be useful for studying ring structures in biomolecules.

We have emphasized that a point cloud can be studied by building a filtration of simplicial

complexes. If we want to distinguish some points from other points, we can assign weights and

building a filtration of weighted simplicial complexes [111]. We may also consider weighted

versions of combinatorial and persistent Laplacians [142].

Example 3.0.2. Suppose each point v in a point cloud has weight w(v). We can associate

any simplex {va0, . . . , vaq } the product weight [111]

q
(cid:89)

i=0

w(vai).

Since the weighted boundary map can be formally given by

∂(σ) =

q
(cid:88)

i=0

w(vai)(−1)idiσ,

we can just define the q-th chain groups as the space generated by all q-simplices without

worrying about zero weights.

Example 3.0.3. Suppose a poind cloud contains two types of points {A, B}. We can assign

weights {0, 1} to {A, B}, and compute weighted homology and Laplacians using product

weighting. At least when a point cloud is simple, weighted combinatorial Laplacian can be

used to differentiate among different patterns of distribution of A and B. For a point cloud

of four points {(0, 0), (1, 0), (1, 1), (0, 1)} there are five configurations (shown in Figure 3.2)

that include at least one point whose weight is 1. The weighted Laplacian results are shown

in Figure 3.3.

(a)

(b)

(c)

(d)

(e)

Figure 3.2 Different patterns of A and B.

29

(a) BBBB

(b) ABBB

(c) AABB

(d) ABAB

(e) AAAB

Figure 3.3 Results of weighted homology and Laplacians.

Cellular (co)sheaves

In a ϕ-weighted simplicial complex, we can imagine that a copy of R resides each simplex

and ϕ(τ, σ) represents a scalar multiplication from the copy on τ to the copy on σ [65]. If

we associate each simplex with a vector space and designate a linear morphism for every

face relation, we will get a cellular (co)sheaf. The theory of cellular (co)sheaves was first

introduced in [122] and later gained attention for its application potential [39, 64, 150, 115,

116].

Definition 3.0.2. 1A cellular sheaf S on a simplicial complex X consists of the following

data: an assignment to each simplex σ of X a (finite-dimensional) vector space S(σ) and to

each face relation σ ⩽ τ a linear morphism of vector spaces denoted by Sσ⩽τ or S(σ ⩽ τ ) :

S(σ) → S(τ ), satisfying the rule

ρ ⩽ σ ⩽ τ ⇒ Sρ⩽τ = Sσ⩽τ ◦ Sρ⩽σ

1For ease of exposition we simplify the definition of a cellular (co)sheaf.

30

and Sσ⩽σ is the identity map.

The vector space S(σ) is referred to as the stalk of S over σ, and the linear morphism

Sσ⩽τ is referred to as the restriction map of the face relation σ ⩽ τ . A cellular cosheaf is very

similar to a cellular sheaf, and the only difference is that in a cosheaf Sσ⩽τ is a morphism

from S(τ ) to S(σ), referred to as the extension map of the face relation σ ⩽ τ .

Example 3.0.4. Let X be a finite simplicial complex. We attach to every simplex of X a

fixed vector space V and let every restriction map be the identity map. This sheaf is referred

to as the constant sheaf V on X.

Example 3.0.5. Let X be a finite simplicial complex. We attach to every simplex of X the

zero space, except a simplex σ, the stalk of which is a vector space V . All restriction maps

have to be zero except the identity map required by the definition. This sheaf is called a

skyscraper sheaf.

Definition 3.0.3. Suppose that f : X → Y is a simplicial map [99] and that S is a cellular

sheaf on Y. The pullback sheaf f ∗S on X is given by

(f ∗S)(σ) = S(f (σ)),

and for the face relation σ ⩽ τ of X,

(f ∗S)σ⩽τ = Sf (σ)⩽f (τ ).

The pullback of a cosheaf is defined analogously.

Example 3.0.6. Suppose that X is a subcomplex of Y , and S is a sheaf on Y . We can

define a sheaf T on X using the data of Y . For σ ∈ X, let T(σ) = S(σ). For the face relation

σ ⩽ τ in X, let T(σ ⩽ τ ) = S(σ ⩽ τ ). The sheaf T is a pullback of S.

Definition 3.0.4. A global section s of a sheaf S is an assignment to each simplex σ an

element sσ ∈ S(σ) such that Sσ⩽τ (sσ) = sτ for any face relation σ ⩽ τ . The set of global

sections is denoted by Γ(X; S).

31

It is possible to construct the sheaf cochain complex of a sheaf or cosheaf chain complex

of a cosheaf. For a sheaf S on a finite simplicial complex X, let the q-th cochain group

C q(X; S) be the direct sum of S(σ) over all q-simplices σ. To define the coboundary map d,

we need a signed incidence relation [39].

Definition 3.0.5. A signed incidence relation is an assignment to every face relation σ ⩽ τ

an integer [σ : τ ] satisfying the following conditions: (1) if dim τ −dim σ > 1, then [σ : τ ] = 0;
and (2) if γ ⩽ τ and dim τ − dim γ = 2, the sum (cid:80)

σ[γ : σ][σ : τ ] = 0.

Once a signed incidence relation is given, the coboundary map dq

: C q(X; S) →

C q+1(X; S) is given by

dq|S(σ) =

(cid:88)

[σ : τ ]Sσ⩽τ .

σ⩽τ

Since dq is a linear morphism, its action on each stalk S(σ) determines itself. We can verify

that dqdq−1 = 0 [39, Lemma 6.2.2], so there is the sheaf cochain complex

0

C 0(X; S)

d

C 1(X; S)

d

C 2(X; S)

d

· · · .

The q-th sheaf cohomology group H q(X; S) is defined as ker dq/ im dq−1.

A natural signed incidence relation exists for every oriented simplicial complex. Recall

that the orientation of a simplex is determined by the ordering of its vertices. For an oriented

simplex τ = [v0, v1, . . . , vn] and its oriented face σ = [v0, . . . , ˆvi, . . . , vn], we let [σ : τ ] = (−1)i.

If σ or τ is oriented alternatively, we let [σ : τ ] = (−1)i+1. This signed incidence relation

is used throughout this paper. We remind the reader that we do not need orientation

information to define a sheaf.

Dually, given a cellular cosheaf S on a simplicial complex X, the q-th cosheaf chain group

is the direct sum of all stalks over q-simplices, and the cosheaf boundary map is given by

dq|S(τ ) =

(cid:88)

σ⩽τ

[σ : τ ]Sσ⩽τ .

32

(Co)sheaf Laplacians

Recall that if cochain groups of a cochain complex

· · ·

dq−2

Aq−1

dq−1

Aq

dq

Aq+1

dq+1

· · ·

are all finite-dimensional inner product spaces, the q-th combinatorial Laplacian ∆q : Aq →

Aq is defined by

∆q = (dq)†dq + dq−1(dq−1)†,

where (dq)† is the adjoint of dq, and it is well-known that the kernel of ∆q is isomorphic to

the q-th cohomology group H q. Hansen and Ghrist [65] applied this construction to sheaf

cochain complexes and the resulting new combinatorial Laplacian is referred to as the sheaf

Laplacian. If every stalk of a sheaf S is a finite-dimensional inner product space, we can

equip an inner product structure on every C q(X; S) such that S(σ) and S(σ′) are orthogonal

if σ ̸= σ′.

Example 3.0.7. Suppose there is a sheaf F over the simplicial complex {0, 1, 2, 01, 02, 12},

then the sheaf coboundary map d0 is represented by the block matrix

F0

F1

F01 −F0⩽01

F1⩽01

F02 −F0⩽02

0

F2⩽02

F12

0

−F1⩽12 F2⩽12

F2

0










.










The 0-th sheaf Laplacian is (d0)†d0, represented by the block matrix

F0

F1

F0⩽01 + F∗

0⩽02

F0⩽02

−F∗

0⩽01

F1⩽01

F∗

1⩽01

F1⩽01 + F∗

1⩽12

F1⩽12

0⩽01


F0 F∗







F1

F2

−F∗

1⩽01

F0⩽01

−F∗

2⩽02

F0⩽02

F2

−F∗

0⩽02

F2⩽02

−F∗

1⩽12

F2⩽12










.

−F∗

2⩽12

F1⩽12

F∗

2⩽02

F2⩽02 + F∗

2⩽12

F2⩽12

33

Persistent (co)sheaf Laplacians

Persistent (co)sheaf (co)homology is known to experts [114, 149] and a systematical treat-

ment can be found in [117]. We first show how to construct persistent cosheaf Laplacians.

Suppose Y is an oriented simplicial complex and X is a subcomplex of Y whose orientation

is identical to Y . If G is a cosheaf on Y , let F be the pullback cosheaf on X. We have the

following commutative diagram

· · ·

· · ·

d

d

Cq+1(X; F)

Cq+1(Y ; G)

d

d

Cq(X; F)

Cq(Y ; G)

d

d

Cq−1(X; F)

Cq−1(Y ; G)

d

d

· · ·

· · ·

where hooked dashed arrows represent inclusion maps ι : Cq(X; F) (cid:44)→ Cq(Y ; G). The q-th

persistent cosheaf homology group is defined by

Consider the following diagram

ι∗Hq(X; F).

Cq+1(X; F)

dF
q+1

d

F,G
q+1

Cq(X; F)

dF
q

(dF

q )†

Cq−1(X; F)

Θ

F,G
q+1

d

G
q+1

(d

F,G
q+1)†

Cq(Y ; G)

Cq+1(Y ; G)

where Θ

F,G
q+1 = {x ∈ Cq+1(Y ; G) | dG

q+1x ∈ Cq(X; F)}. Let d

F,G
q+1 : Θ

(π is the adjoint of ι), the q-th persistent cosheaf Laplacian ∆F,G

F,G
q+1 → Cq(X; F) be
: Cq(X; F) →

q

F,G
q+1

πqdY

q+1|Θ
Cq(X; F) is

(dF

q )†dF

q + d

F,G
q+1(d

F,G
q+1)†.

As shown in section 2, the persistent Hodge theorem is still true.

We show next that the spectrum of a persistent cosheaf Laplacian doesn’t rely on the

choice of the orientation of the complex Y . This is important, since a point cloud may not

have a canonical ordering.

34

Proposition 3.0.1. The spectrum of the q-th persistent cosheaf Laplacian is independent of

the orientation of Y .

Proof. Fixing a choice of orientation for each simplex of Y , it suffices to show that the

spectrum of ∆F,G

q

is unchanged if the orientation of one simplex of Y is alternated. We first

fix some notations. Suppose we change the orientation of a simplex σ, then every morphism

defined with respect to this new orientation will have a bar. We also sometimes denote Θ

F,G
q+1

by Θ. We define a linear map Iσ,− such that Iσ,−|G(σ) = −I and Iσ,−|G(σ′) = I if σ′

̸= σ.

The adjoint of Iσ,− is itself. Depending on context, the domain of Iσ,− will be understood as

Cq−1(X; F), Cq(Y ; G) or Cq+1(Y ; G). The proof is divided into cases.

(dF

Case I. If σ ∈ Cq−1(X; F), then dG
q )†dF
q )†Iσ,−Iσ,−dF
Case II. If σ ∈ C q(Y ; G), then dF

q+1 = dG
. In other words, ∆ = ∆.

q = (dF

q

q = dF

q πIσ,−|Cq(X;F) and dG

q+1 = Iσ,−dG

q+1

. As

and dF

q = Iσ,−dF

q

.

So (dF

q )†dF

q =

q+1

(dG

q+1)Θ = Iσ,−(dG

q+1)Θ ⊂ Iσ,−Cq(X; F) = Cq(X; F),

we see that Θ ⊂ Θ. Similarly

(dG

q+1)Θ = Iσ,−Iσ,−(dG

q+1)Θ = Iσ,−(dG

q+1)Θ ⊂ Iσ,−Cq(X; F) = Cq(X; F),

we see that Θ ⊃ Θ, so Θ = Θ.

Then

πdG

q+1|Θ = πIσ,−|Cq(X;F)πdG

q+1|Θ.

So

†

∆ = dF
q

(dF

q ) + πdG

q+1|Θ

(cid:0)πdG

q+1|Θ

(cid:1)†

= πIσ,−|Cq(X;F)∆(πIσ,−|Cq(X;F))†.

Case III. If σ ∈ Cq+1(Y ; G), then dF

q = dF

q

and dG

q+1 = dG

q+1Iσ,−. As Cq(X; F) ⊃ dG

q+1Θ =

dG
q+1Iσ,−Θ, we see that Iσ,−Θ ⊂ Θ. Similarly Iσ,−Θ ⊂ Θ, so Iσ,−Θ = Θ. Denote by I Θ

σ,− :

35

Θ → Θ the restriction of Iσ,− on Θ. We have πdG

q+1|Θ = πdG

q+1Iσ,−|Θ = πdG

q+1|ΘI Θ
σ,−

. Then

∆ = dF
q

†

q + πdG
dF

q+1|Θ

(cid:0)πdG

q+1|Θ)†

= (dF

q )†dF

q + πdG

q+1|ΘI Θ

σ,−)†(cid:0)πdG

q+1|Θ)†

= (dF

q )†dF

q + πdG

q+1|Θ

σ,−(I Θ
(cid:0)πdG

q+1|Θ)†

= ∆.

The next corollary is the consequence of the persistent Hodge theorem.

Corollary 1. Persistent cosheaf Betti numbers are independent of the orientation of Y

Now let’s discuss sheaves. Given two oriented simplicial complexes X, Y , if X ⊂ Y and

the orientation of X is identical to Y , let sheaf F on X be the pullback of the sheaf G on Y ,

then we have the following commutative diagram

· · ·

· · ·

d

d

C q−1(X; F)

π

C q−1(Y ; G)

d

d

C q(X; F)

π

C q(Y ; G)

d

d

C q+1(X; F)

π

C q+1(Y ; G)

d

d

· · ·

· · ·

where π : C q(Y ; G) → C q(X; F) is a projection map such that π |G(σ) is the identity map if

σ ∈ X, and π |G(σ)= 0 otherwise. Since π is a cochain map, it induces a map π• between

sheaf cohomology groups of F and G, and the q-th persistent sheaf group is defined by

π•(H q(Y ; G))

whose dimension is the q-th persistent sheaf Betti number. We can dualize the above diagram

(i.e., reverse all arrows) and define the persistent sheaf Laplacian by the persistent cosheaf

36

Laplacian of the dualized diagram. More specifically, we have the following diagram

C q−1(X; F)

dq−1
F

(dq−1

F )†

C q(X; F)

dq
F,G

(dq

F,G)†

C q(Y ; G)

Θq+1
F,G

dq
G

(dq

G)†

C q+1(Y ; G)

(note that an inner product space is self-dual) where Θq+1

F,G = {x ∈ C q+1(Y ; G) | (dq

G)†(x) ∈

C q(X; F)} and dq

F,G is the adjoint of π(dq

G)†|Θq+1

F,G

: Θq+1

F,G → C q(X; F). We define the q-th

persistent sheaf Laplacian ∆F,G

q

2 by

∆F,G

q = (dq

F,G)†dq

F,G + dq−1

F (dq−1

F )†.

The nullity of ∆F,G

q

is equal to the q-th persistent Betti number of the dualized diagram

(since we dualize everything, the two cochain complexes become chain complexes). By the

universal coefficient theorem for cohomology, π• and ι• have the same rank (here • means

the induced map between cohomology or homology groups). So the q-th persistent Betti

number of the dualized diagram is equal to the q-th persistent sheaf Betti number. In other

words, we have

ker ∆F,G

q

∼= π•(H q(Y ; G)).

More concretely, given a sheaf cochain complex

· · ·

V q−1

d

V q

d

V q+1

· · · ,

we are actually working with the complex

· · ·

V q−1

d†

V q

d†

V q+1

· · · .

The universal coefficient theorem relates the above chain complex to the following cochain

complex

· · ·

(V q−1)∗

(d†)∗

(V q)∗

(d†)∗

(V q+1)∗

· · · ,

2Sometimes we may use the notation ∆X,Y

q

.

37

which can be identified as the original sheaf cochain complex

· · ·

V q−1

d

V q

d

V q+1

· · · ,

since there is a natural isomorphism between the dual functor and the adjoint functor.

Example 3.0.8. Consider the 1-dimensional simplicial complex Y

1

3

4

2

and the constant sheaf R over Y . We compute ∆X,Y

0

when X = {1, 2}. The matrix repre-

sentation of (d0

Y )† is

13

34

24

2


1 −1











0

3

4

0

0

0

0 −1

1 −1

1

0

1








.






According to the definition of Θ0

X,Y

, we want to find all elements of C 1(Y ; R) that are sent

to C 0(X; R) = span{1, 2} by (d0

Y )†. After a few steps of column Gauss elimination we get a

new matrix representation of (d0

Y )†

13

13 + 34 −13 − 34 + 24

2


1 −1











1

3

0

4

0

−1

0

0

1














.

1

−1

0

0

From this representation, we see that for any vector v = a13 + b(13 + 34) + c(−13 − 34 + 24)

(a, b, and c are coefficients), (d0

Y )†v ∈ span{1, 2} if and only if a and b are both zero. In

38

other words, Θ0

X,Y = span{−13 − 34 + 24}, P = 3, and the matrix representation of (d0

X,Y )†

is

−13 − 34 + 24






1

2

1

−1




.

Then, the matrix representation of ∆X,Y
0


is






and its spectrum is {0, 2/3}.

1/3 −1/3

−1/3

1/3




Cellular sheaves on a labeled simplicial complex

In this section we construct a class of sheaves on a so called labeled simplicial complex.

A labeled simplicial complex is a simplicial complex where each vertex is associated with a

real number. We pay attention to label simplicial complexes because in application we often

have a point cloud where each point is associated with some kind of quantity. For example,

the atoms of a molecule can be seen as a point cloud, and each atom has its partial charge.

If we build a Rips filtration from a labeled point cloud, then each complex in the filtration

will be a label simplicial complex.

We first give a simple example. Suppose that there is a 1-dimensional labeled simplicial

complex X where each vertex vi is associated with a quantity qi ∈ R. Denote the edge

connecting vi and vj by eij. We can define a sheaf S on X where each stalk is R, and for

the face relation vi ⩽ eij, the morphism Svi⩽eij
rij is the length of eij. The assignment qi → vi and qiqj/rij → eij is a global section, since

is the scalar multiplication by qj/rij where

Svi⩽eij (qi) = Svj ⩽eij (qj) = qiqj/rij. If we think of qi and qj as partial charges on atoms, then
the quantity qiqj/rij is the potential energy.

The above sheaf can be generalized to high-dimensional labeled simplicial complexes. Let

F : X → R be a nowhere zero function. We can define a sheaf where each stalk is R, and

39

for the face relation [v0, . . . , vn] ⩽ [v0, . . . , vn, vn+1 . . . , vm] (here orientation is not relevant),

the linear morphism S([v0, . . . , vn] ⩽ [v0, . . . , vn, vn+1 . . . , vm]) is the scalar multiplication by

F ([v0, . . . , vn])qn+1 · · · qm
F ([v0, . . . , vn, vn+1, . . . , vm])

.

This is indeed a sheaf since if we have [v0, . . . , vn] ⩽ [v0, . . . , vm] ⩽ [v0, . . . , vl], then

F ([v0, . . . , vm])qm+1 · · · ql
F ([v0, . . . , vl])

F ([v0, . . . , vn])qn+1 · · · qm
F ([v0, . . . , vm])

=

F ([v0, . . . , vn])qn+1 . . . ql
F ([v0, . . . , vl])

.

The assignment qi0 · · · qin/F ([vi0, . . . , vin]) → [vi0, . . . , vin] is a nontrivial global section.

Example 3.0.9. Consider the oriented simplicial complex {v0, v1, v2, v0v1, v1v2, v0v2, v0v1v2}

(v2, q2)

(v0, q0)

(v1, q1)

where qi ∈ R is associated to vi. Let r01, r12, r02 be the lengths of e01, e12, e02. We can define

the above sheaf on this complex where F maps every vertex to 1, every edge eij to its length

rij, and the 2-simplex [v0, v1, v2] to r01r12r02. The matrix representation of d0 is

v0

v1



v0v1 −q1/r01

q0/r01

v2

0








v1v2

0

−q2/r12

q1/r12

v0v2 −q2/r02

0

q0/r02










,

and the matrix representation of d1 is

v0v1

v0v2

v1v2

(cid:18)

v0v1v2

q2
r02r12

−q1
r01r12

q0
r01r02

(cid:19)
.

Note that many alternative sheaf constructions are available by appropriate choices of F .

40

Example 3.0.10. We consider the F defined in the same way as in Example 3.0.9 to evaluate

the spectra of sheaf Laplacians. Consider the 2-simplex

(v2, q2)

(v0, q0)

(v1, q1)

whose edges are all of length 1. The ∆0 is









1 + q2
q2

2 −q0q1 −q0q2

−q0q1

0 + q2
q2

2 −q1q2

−q0q2 −q1q2

0 + q2
q2
1









and its eigenvalues are {q2

0 + q2

1 + q2

2, q2

0 + q2

1 + q2

2, 0} and the corresponding eigenvectors are

(−q1/q0, 1, 0)T , (−q2/q0, 0, 1)T , and (q0/q2, q1/q2, 1)T . Moreover, ∆1 is

0 + q2
q2

1 + q2
2

0

0 + q2
q2

1 + q2
2









0

0

0

0









0

0 + q2
q2

1 + q2
2

and its only eigenvalue is q2

0 + q2

1 + q2
2

. This example shows that the eigenvalues of ∆0 and

∆1 are dependent on the amplitude of qi, which allows the embedding of non-geometric

information in practical applications. However, they are not sensitive to the sign of qi.

Therefore, a (persistent) sheaf Dirac theory as an extension of recent Dirac formulation or

quantum persistent homology [2] may enable us to further eliminate the sign degeneracy.

Experiments of persistent sheaf Laplacians

Given a labeled point cloud P (i.e., a point cloud with a nonzero quantity qi associated

with each point vi), we can build a Rips or Alpha filtration from it and construct a sheaf

St on each Xt consistently as described in section 3 provided a suitable global F : 2P → R

is chosen. If we take Ft to be the restriction of F : 2P → R on Xt and construct St, then

we ensure that the stalks of σ and τ and the restriction map between σ and τ remain the

41

same for any St containing σ ⩽ τ . Since St is the pullback of St+p for any t and p, we get a

persistent module of sheaf cochain complexes and can compute the spectra of persistent sheaf

Laplacians. In this section, we calculate the spectra of persistent sheaf Laplacians for a few

examples of point clouds in this way. Some examples are the vertices of simple geometrical

shapes and some are the coordinates of atoms of molecules. We assign the quantities qi

to simple geometrical shapes, and take the partial charges as qi for molecules. An Alpha

filtration {Xr} is built for each labeled point cloud, parametrized by radius r. We choose

F such that F maps every vertex vi to 1, every edge vivj to the length of itself rij, and
every 2-cell vivjvk to the product of lengths of its edges rijrikrjk. The spectrum of ∆Xr,Xr+p

d

for d = 0, 1 and selected r, p will be calculated. The radius r will be a multiple of 0.01

or 0.01Å. Many information can be extracted from a spectrum, but for simplicity here we

only plot the minimal nonzero eigenvalue and the nullity against the radius r. The minimal

nonzero eigenvalue of the persistent sheaf Laplacian ∆Xr,Xr+p

d

is denoted by λr,p
d

and the d-th

persistent sheaf Betti number of the pair (Xr, Xr+p) (i.e., the nullity of ∆Xr,Xr+p

d

) is denoted

by βr,p
d

.

The examples of simple shapes are the vertex sets of a 2-dimensional square or a 2-

dimensional trapezoid as shown in Figure C.1 with different choices of local property qi.

More specifically, we consider four labeled point clouds and two point clouds in R2. The two

point clouds are {(0, 0), (1, 0), (1, 1), (0, 1)} and {(0, 0), (1, 0), (3/4,

√

15/4), (1/4,

√

15/4)}.

For {(0, 0), (1, 0), (1, 1), (0, 1)} we assign q = ±1 to (0, 1) and q = 1 to the rest. For

{(0, 0), (1, 0), (3/4,

√

15/4), (1/4,

√

15/4)} we assign q = ±1 to (1/4,

√

15/4) and q = 1 to

the rest. For the two point clouds we construct the constant sheaf and compute persistent

sheaf Laplacians, whose spectra coincide with persistent Laplacians. For the four labeled

point clouds we construct filtrations of sheaves in the way described earlier in this section.

The results are shown in Figures C.2 and C.3. The first thing we can infer is that, the mini-

mal nonzero eigenvalue and the nullity usually change significantly when the topology of Xr

changes (when p is nonzero, both of them change at r − p rather than r). When we consider

42

labeled point clouds instead of point clouds, sometimes the minimal nonzero eigenvalue or

the nullity show less change (compare (a)(c) and (b)(d) in Figure C.2 and Figure C.3). If

we compare the results of the square and the trapezoid, we see more changes in the results

of the trapezoid. This is because the filtration constructed from the vertex set of trapezoid

contains more different complexes. We also observe that the change of signs of qi does not

affect the minimal nonzero eigenvalue and the nullity, though the eigenspaces of Laplacians

are different.

Next we study the molecule CB8 [118] shown in Figure C.4. We associate each atom

with the corresponding partial charge (obtained using [107]). The results for CB8 are shown

in Figures C.5 and C.6. Due to complexity of the molecule, it is very difficult to explain

the spectral details of the system. However, this information can be very useful for machine

learning analysis.

Finally, to demonstrate our method for practical problems, we study a small protein

called bacteriocin AS-48 (PDB ID: 1E68) [54]. We select the model 1 of AS-48 and compute

the pqr file by PDB2PQR with the Amber force field [77]. For the sake of faster computation,

we only use the coordinates of carbon atoms as the point cloud. Results are shown in Figures

C.8 and C.9.

43

CHAPTER 4

OTHER PERSISTENT TOPOLOGICAL LAPLACIANS

Persistent Laplacians for simplicial maps

The classical filtration of simplicial complexes only represents one type of shape evolution.

We also need tools to study more general shape evolution, such as sparsification of a simplicial

complex. This requires us to consider general simplicial maps rather than inclusion maps.

Gülen et al. [63] developed a theory of persistent Laplacians for a simplicial map f : X → Y

(the chain map induced by a simplicial map is not necessarily injective). Suppose f : X → Y

is a simplicial map,

· · ·

· · ·

Cq+1(X)

∂X
q+1

Cq(X)

fq+1

fq

Cq+1(Y )

∂Y
q+1

Cq(Y )

∂X
q

∂Y
q

Cq−1(X)

fq−1

Cq−1(Y )

· · ·

· · ·

where fq : Cq(X) → Cq(Y ) is induced by f . Different from the original q-th persistent

Laplacian, we need to define two subspaces

Cq+1(Y ) ⊃ ΘY ←X

q+1 = {c ∈ Cq+1(Y ) | ∂Y

q+1(c) ∈ fq(ker ∂X

q )}

and

Cq−1(X) ⊃ ΘX→Y

q−1 = {c ∈ Cq−1(X) | (∂X

q )∗(c) ∈ (ker fq)⊥},

and then use the restrictions of ∂Y

q )∗ to them to construct the q-th persistent
Laplacian for f . The q-th persistent Laplacian for a simplicial map has a more symmetric

and (∂X

q+1

expression, and the proof of the persistent Hodge theorem is more complicated.

Digraphs and path homology

The motivation behind path homology is to construct a homology theory of digraphs such

that directional information of edges is encoded and higher dimensional homology groups

are less likely to be non-trivial. Path homology was proposed by Grigor’yan et al.

[57]

and developed in [58, 59, 60, 61, 86]. A summary of recent advances in path homology

44

of digraphs can be found in [55]. There are also other (co)homology theories of digraphs

[22, 92, 95, 108]. Recall that a digraph is a pair G = (V, E) where E is a set of ordered

pairs of vertices. We only consider digraphs without self-loops. In path homology, the focus

is on the paths in a digraph. An allowed q-path is an ordered finite sequence of vertices

{x0, . . . , xq} such that (xi, xi+1) ∈ E for all i = 0, . . . , q − 1. The length of an allowed path

is its natural dimension, so we can try to define the q-th chain group as the vector space

consisting of formal linear combinations of allowed q-paths with coefficients in R, denoted

by Aq, and define the boundary map ∂q by

∂q{x0, . . . , xq} =

q
(cid:88)

i=0

(−1)i{x0, . . . , ˆxi, . . . , xq}

then formally we can show that ∂2 = 0. However, ∂q{x0, . . . , xq} may include paths that are

not allowed paths. To solve this problem, we need to introduce some general concepts first.

Definition 4.0.1. Suppose X is a finite set. An elementary p-path is a sequence [x0, . . . , xp]

of p + 1 elements of X. The space generated by all elementary p-paths with coefficient in R

is denoted by Λp(X). The q-th non-regular boundary map is given by

∂nr
q [x0, . . . , xq] =

p
(cid:88)

i=0

[x0, . . . , ˆxi, . . . , xq].

One can prove this is a chain complex. Among all the paths, a path that lingers at a

vertex (for some i, xi = xi+1) is considered a degenerate path since we are not interested in

self-loops.

Definition 4.0.2. A path [x0, . . . , xq] over X where xi ̸= xi+1 for each i is called regular.

The space generated by all regular q-paths is denoted by Rq.

We define a new boundary operator ∂q between regular paths. When computing

∂q([x0, . . . , xq]), we first compute ∂nr

q ([x0, . . . , xq]) and treat all irregular paths arising from

it as zeros. One can still verify that ∂2 = 0 [56].

45

Now given a digraph G = (V, E), every Aq is a subspace of Rq.

∂q+2

. . .

Rq+1

∂q+1

Rq

∂q

Rq−1

∂q−1

. . .

Aq+1

Aq

Aq−1

One way to make ∂q : Aq → Aq−1 well defined is to restrict ∂q to the subspace Aq ∩ ∂−1

We have to verify that ∂q(Aq ∩ ∂−1

q Aq−1) ⊂ Aq−1 ∩ ∂−1

true by definition, and ∂q(Aq ∩ ∂−1

q Aq−1) ⊂ ∂−1

q Aq−1.
q Aq−1) ⊂ Aq−1 is
q−1Aq−2 is true because of ∂2 = 0. Therefore,

q−1Aq−2. ∂q(Aq ∩ ∂−1

we have the chain complex

. . .

Aq+1 ∩ ∂−1

q+1Aq

∂q+1

Aq ∩ ∂−1

q Aq−1

∂q

Aq ∩ ∂−1

q Aq−1

. . .

and the definition of a path homology group is straightforward. The q-th chain group

Aq ∩ ∂−1

q Aq−1 is called the space of ∂-invariant q-paths on G, denoted by Ωq

1.

As to the geometrical interpretation of path homology, we only know for sure that non-

reduced H0 is the number of connected components of the underlying undirected graph. It is

not easy to relate higher dimensional path homology groups to features of the digraph. Some

characterizations of path homologies of certain families of small digraphs were obtained by

Chowdhury et al [34]. Since directional information of edges is encoded in path homology,

path homology can be used to distinguish network motifs [35] and isomers in molecular and

materials sciences [25]. We can also quantify the significance of a node in a network by

observing the changes of path homology after the deletion of the node [25].

Since Ωq inherits the inner product structure from Aq, the so-called path Laplacian 2

can be defined. We can use path Laplacians [53, 55, 134] to distinguish among digraphs

that path homology cannot. For example, according to [57, Theorem 5.4], the following two

digraphs GL and GR (see Figure 4.1) have the same path homology. But the spectrum of

the 0-th path Laplacian of GL is {0, 3, 3} and that of GR is {0, 2, 4, 4}.

1If a digraph is not simple, there will be two choices of ∂q [57] that might be suitable for different

problems [74].

2Another type of path Laplacians was proposed by Estrada [46], and was generalized and applied in

molecular biology in [89].

46

(a) GL

(b) GR

Figure 4.1 Two graphs that have the same path homology.

Persistent path homology was proposed by Chowdhury and Mémoli [35] to study a di-

graph where each edge e has a weight w(e). A filtration of digraphs {Gd} is constructed

such that e ∈ Gd if and only if w(e) ≤ d. Wang and Wei [134] introduced persistent path

Laplacians based on Chowdhury and Mémoli’s work. They suggested that persistent path

Laplacians can be applied to study molecules, where much information can be encoded in a

digraph.

Example 4.0.1. Given a weighted digraph, we can build a filtration {Gd} such that e ∈ Gd

iff w(e) ≤ d. Two weighted graphs whose path Betti numbers are the same for every Gd may

have different path Laplacians. This is shown in Figure 4.2.

Besides path homology, there is another homology theory of digraphs based on clique

complexes. A q-clique of a digraph G = (V, E) is an ordered subset of vertices σ =

(x0, x1, . . . , xq−1) such that for i < j we have (xi, xj) ∈ E. The directed flag complex

dFl(G) is the simplicial complex on V with q-simplices the (q + 1)-cliques of G. A persistent

Laplacian theory of directed flag complexes was proposed by Jones and Wei [75].

Hypergraphs and hyperdigraphs

Hypergraphs can be thought of as a generalization of simplicial complexes. A hypergraph

is a pair (V, E) where E is a subset of the power set of V . A element of E is called a

hyperedge, and a hyperedge consisting of q + 1 elements is called a q-hyperedge. To define

a chain complex for hypergraphs, the problem here is identical to what we encounter in

the study of digraphs. If we define the q-th chain group to be the vector space generated

by q-hyperedges, the boundary map is not well-defined. One solution is to consider the

associated simplicial complex (simplicial closure) of a hypergraph [102], i.e., the minimal

47

3

0

0

1

1

0

0

2

2

(a)

1

1

0

0

2

2

(b)

(c) Results of the left graph

(d) Results of the right graph

Figure 4.2 The x axis represents the weight. As usual, λ and β represent the minimal
eigenvalues and Betti numbers.

simplicial complex that contains the given hypergraph. Another solution inspired by the

path homology is the embedded homology [13].

If we look at the chain complex of the

associated simplicial complex, each simplicial chain group Cq contains Dq, the vector space

generated by hyperedges. We only need to restrict the domain of the simplicial boundary

operator to

Infq = Dq ∩ ∂−1

q (Dq),

and then the boundary operator is well-defined.

A hyperdigraph is a hypergraph where each hyperedge is ordered3, the embedded homol-

ogy of which can be defined similarly [24]. Persistent homology and persistent Laplacians of

hypergraphs and hyperdigraphs are studied in [13, 24, 110, 112]. Other approaches regarding

the homology and Laplacian of hypergraph includes [36, 45, 73, 76, 100, 101].

3There are other definitions of a hyperdigraph [4, 127].

48

∂q+2

. . .

Cq+1

∂q+1

Dq+1

Cq

Dq

∂q

Cq−1

∂q−1

. . .

Dq−1

Dq+1 ∩ ∂−1

q+1(Dq)

Dq ∩ ∂−1

q (Dq−1)

Dq−1 ∩ ∂−1

q−1(Dq−2)

Figure 4.3 Cq is the q-th chain group of the associated simplicial complex of H, and Dq is
the vector space generated by q-hyperedges.

Mayer homology

We can also define persistent homology and persistent Laplacians for the so-called N -

chain complexes [121]. An N -chain complex is a sequence of abelian groups and group

morphisms (V, d) where dN = 0. A simplicial complex actually gives rise to a N -chain

complex. Recall that in a simplicial chain complex the boundary operator is given by

∂[va0, . . . , vaq ] =

(cid:88)

i

(−1)i[va0, . . . , ˆvai, . . . , vaq ].

For a prime number N , let ξ = e2πi/N , we can define another boundary operator d by

d[va0, . . . , vaq ] =

(cid:88)

i

ξi[va0, . . . , ˆvai, . . . , vaq ]

and prove that dN = 0. Even though N -chain complex is not a chain complex, observe that

CN +q(X; C)

dn

CN +q−n(X; C)

dN −n

CN (X; C)

resembles a part of chain complex, we can define the Mayer homology group HN +q−n,N −n(X)

by

HN +q−n,N −n(X) = ker dN −n/ im dn.

Therefore, it is not surprising that the theory of persistent homology and persistent Lapla-

cians can be extended to the setting of N -chain complexes. One advantage of using N −chain

complexes is that the number of Betti numbers and Laplacians is much larger than tradi-

tional persistent homology and persistent Laplacians, and we can fine tune N for a specific

problem.

49

Persistent Dirac operators

Besides Laplacians, Dirac operators on chain complexes have also been studied [2, 10,

126, 136]. Our definitions follow [126]. For a chain complex (V, d)

· · ·

d3

V2

d2

V1

d1

V0

0

where each chain group Vq is a finite-dimensional inner product space, the q-th Dirac operator

Dq is represented by the block matrix

V0

V1

V2

· · ·

Vq

Vq+1





















V0

V1

V2
...

Vq

Vq+1

0

[d1]

0

· · ·

[d∗
1]

0

[d2]

· · ·

0
...

0

0

[d∗
2]
...

0

0

0
...

0

0

· · ·
. . .

· · ·

· · ·





















0

0

0
...

0

0

0

0
...

[dq+1]

[d∗

q+1]

0

where [] denotes a matrix representation of a linear morphism.

Proposition 4.0.1. If λ is an eigenvalue of Dq, then −λ is also an eigenvalue of Dq.

Proof. Let Qq be



















Idim V0

0

0

0
...

0

0

−Idim V1

0
...

0

0

0

0

Idim V2
...

0

0

· · ·

· · ·

· · ·
...

· · ·

· · ·

0

0

0
. . .

(−1)qIdim Vq

0

0

0
...

0

0

(−1)q+1Idim Vq+1



















...

.

We can verify that DqQq = −QqDq. Suppose Dqv = λv, then DqQqv = −QqDqv = −λQqv.

50

Dirac operators are closely related to combinatorial Laplacians. If we think of all com-

binatorial Laplacians as a single operator dd∗ + d∗d = (d + d∗)2 on V , then the q-th Dirac

operator is the restriction of the square root d + d∗ on V0 ⊕ · · · ⊕ Vq+1. We can also see this

by direct computation. The square of Dq is

V0

[∆0]

V1

0

0

0
...

0

0

[∆1]

0
...

0

0





















V0

V1

V2
...

Vq

Vq+1

V2

· · ·

Vq

Vq+1

· · ·

· · ·

· · ·
. . .

0

0

0
...

· · ·

[∆q]

0

0

0
...

0

· · ·

0

[∆q+1,−]





















0

0

[∆2]
...

0

0

where ∆q is the q-th combinatorial Laplacian. Therefore, the square of any eigenvalue λ of

a Dirac operator must be an eigenvalue of a combinatorial Laplacian.

Recall that to define persistent Laplacians, we construct an auxiliary subspace ΘX,Y
q+1

of

Cq+1(Y ) and a map ∂X,Y

q+1 : ΘX,Y

q+1 → Cq(X). Since Cq(X) is actually a subspace of ΘX,Y

q

, we

have a boundary map ι ◦ ∂X,Y

q

chain complex

: ΘX,Y

q+1 → ΘX,Y

q

. All ΘX,Y

q

and ι∂X,Y

q

constitute an auxiliary

∂X,Y
q+1

C X

q+1

ι

ΘX,Y
q+1

∂X,Y
q

C X
q

ι

ΘX,Y
q

C X

q−1

ι

ΘX,Y
q−1

· · · .

∂X,Y
q−1

· · ·

The q-th persistent Dirac operator of simplicial complexes X ⊂ Y is defined by the q-th Dirac

operator on this auxiliary complex, and it is easy to generalize persistent Dirac operators to

other settings such as digraphs and hyperdigraphs [126]. The square of a persistent Dirac

operator is not necessarily a block matrix consisting of persistent Laplacians.

51

CHAPTER 5

UNDERSTANDING DOMINANT VARIANTS OF SARS-COV-2 USING
LAPLACIANS

The aim of this chapter is to demonstrate the applications of Laplacians in biological sciences

through the study on dominant variants of SARS-CoV-2. SARS-CoV-2 enters the host cell

via either endosomes or plasma membrane fusion. In both ways, the S protein of SARS-CoV-

2 first attaches to the host cell-surface protein, angiotensin converting enzyme 2 (ACE2).

The receptor binding domain (RBD) of the S protein is essential for the entry. This is why

the binding free energy (BFE) of the RBD-ACE2 complex is a measure of viral infectivity.

A mutation of viral RBD induces BFE changes. A positive (resp. negative) BFE change

indicates the strengthening (resp. weakening) of the protein-protein binding. If we know

how to predict BFE changes, we can predict infectivity of new variants. TDA methods such

as persistent homology and persistent Laplacian are useful because they are able to encode

viral RBD structural information.

Biological background

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of the ongo-

ing global coronavirus disease 2019 (COVID-19) pandemic. Its evolution and future direction

are of major concern. It was well established that the emergence of SARS-CoV-2 new vari-

ants is dictated by mutation-induced infectivity strengthening [30] and antibody resistance

(or vaccine breakthrough) [132], two molecular mechanisms that determined the natural se-

lection at the population scale. More specifically, the binding of the viral spike protein, par-

ticularly the receptor-binding domain (RBD), to the human receptor angiotensin-converting

enzyme 2 (ACE2) facilitates the entry of the virus into host cells [71, 129]. In early 2020,

it was hypothesized that natural selection favors those SARS-CoV-2 RBD mutations that

strengthen the RBD-ACE2 binding, which leads to higher viral infectivity [30]. The hypoth-

esis was initially supported by the frequency analysis of 89 single RBD mutations found from

the genotyping of 15,140 complete SARS-CoV-2 genome samples [30] and later confirmed

52

beyond doubt by the evolution pattern of 651 RBD mutations found from the genotyping of

506,768 SARS-CoV-2 genomes extracted from COVID-19 patients up to early 2021 [131].

The vaccine breakthrough mechanism was not discovered until vaccines became widely

available in industrialized countries in the summer of 2021.

It was found that an RBD

mutation that weakens the viral infectivity had an unusually high observed frequency in

2,298,349 complete SARS-CoV-2 genomes isolated from patients. This abnormal statistics

was found to strongly correlate with the vaccination rates in a few industrialized countries,

including Denmark, the United Kingdom, France, Bulgaria, the United States, etc [132].

To understand this correlation, the mutational impact of a set of 130 antibodies extracted

from Covid patients that targets the RBD was studied.

It was found that the abnormal

mutation on the RBD has a very strong ability to disrupt the binding of most antibody-

RBD complexes, which gives rise to antibody resistance (or vaccine breakthrough) at the

population scale [132].

As discussed above, the reveal of the natural selection mechanisms of SARS-CoV-2 evo-

lution is a typical example of a data-driven discovery that cannot be achieved by individual

experimental laboratories.

In fact, the discovery utilized results from tens of thousands

of experimental laboratories around the world [30, 132]. Machine learning, including deep

learning and data-driven approach, played an essential role in the discovery. Deep learning

methods can offer some of the most accurate predictions of biomolecular properties, including

the binding affinity of protein-protein interactions (PPIs). This approach becomes partic-

ularly advantageous and outperforms other methods when good-quality experimental data

are available. However, structure-based machine learning, including deep learning methods

encounter difficulties in PPI predictions due to their intricate structural complexity and high

dimensionality.

Although sequence-based approaches offer good predictions of mutational impacts on

proteins, structure-based methods outperform other approaches [106]. In machine-learning-

assisted directed evolution and protein engineering and machine-learning-based PPI and

53

protein folding stability predictions, mutant structures are typically not available and are

conventionally created by computational means for the machine learning predictions [19, 16,

26, 27, 91], which is a source of errors. It is interesting and important to quantify such errors.

Fortunately, since SARS-COV-2 variants are some of the most studied subjects, some of their

three-dimensional (3D) structures are available in the literature, which offers an opportunity

for in-depth analysis and comparison.

PTL analysis of structural changes

We are interested in both the structural changes of the wild type RBD induced by mu-

tations and the structural changes of the wild type RBD or mutant RBDs induced by their

binding to ACE2. To quantify structural changes we first perform alignment of structures

and calculate the distances between corresponding atoms (e.g., Cα). Then, we compute

PTLs of different structures to further characterize their structural changes.

5.0.1 PTL analysis of RBD structural changes induced by mutations

Figure 5.1 Sequence alignment of RBDs of the wild type, Alpha, Beta, Gamma, BA.1,
and BA.2. Alpha has one RBD mutation N501Y. Beta has three RBD mutations K417N,
E484K, and N501Y. Gamma has three RBD mutations K417T, E484K, and N501Y. BA.1 has
15 RBD mutations G339D, S371L, S373P, S375F, K417N, N440K, G446S, S477N, T478K,
E484A, Q493R, G496S, Q498R, N501Y, and Y505H. BA.2 has 16 RBD mutations G339D,
S371F, S373P, S375F, T376A, D405N, R408S, K417N, N440K, S477N, T478K, E484A,
Q493R, Q498R, N501Y, and Y505H.

54

Figure 5.2 (a) Wild type RBD-ACE2 complex. The RBD is colored by light grey and
mutated residues in Alpha, Beta, Gamma, BA.1 and BA.2 are marked. (b, c, d, e, f) Atoms
of the wild type RBD are colored by their distances to corresponding atoms in a mutant
RBD. Subfigures (a), (b), (c), (b), and (f) corresponds to the Alpha, Beta, Gamma, BA.1,
and BA.2 variants, respectively. Pink and red corresponds to 0Å and 14.32Å respectively.
For each mutant we record the residues that have at least one atom whose distance to the
corresponding atom in the wild type RBD is larger than 7.16Å. In Alpha, Beta, and Gamma,
such residue is R346. In BA.1, such residue is K386. In BA.2, such residues are N370, A372,
K378, and K386. These residues are marked in (g). (Plots generated by ChimeraX [104].)

To understand the structural differences of RBD between the wild type and mutants

in RBD-ACE2 complex, we align the RBDs of SARS-CoV-2 variants Alpha (PDB ID:

8DLK[93]), Beta (PDB ID: 8DLN[93]), Gamma (PDB ID: 8DLQ[93]), BA.1 (PDB ID:

7T9L[94]), and BA.2 (PDB ID: 7XB0[82]) along with the wild type RBD (PDB ID: 6M0J[80])

in Figures 5.2 and 5.3. For Alpha, Beta, Gamma, BA.1, and BA.2, the maximal distances

between corresponding atoms of mutant RBDs and the wild-type RBD are 9.14Å, 9.33Å,

9.87Å, 7.44Å, and 14.32Å respectively. For each mutant, the residues are recorded if they

have at least one atom whose distance to the corresponding atom in wild-type RBD is more

than 7.16Å, which is half of the maximal distance, 14.32Å. For variants Alpha, Beta, and

55

Figure 5.3 Atoms of the wild type RBD are colored by their distances to corresponding atoms
in a mutant RBD. Subfigures (a), (b), (c), and (b) corresponds to the Alpha, Beta, Gamma,
and BA.1, variants, respectively. Each alignment has its own color range. For each mutant,
we record the residues that have at least one atom whose distance to the corresponding
atom in the wild type RBD is more than half of the maximal distance (4.57Å, 4.67Å, 4.94Å,
and 3.72Å) between corresponding atoms. In Alpha, such residues are T333, R346, K378,
K386, R408, and N450. In Beta and Gamma, such residues are T333, R346, K378, K386,
and R408. In BA.1, such residues are T333, N334, E340, R346, N360, D364, Y369, K378,
K386, F392, R408, K424, N450, K462, and H519. These residues are marked in (e). (Plots
generated by ChimeraX [104].)

56

Gamma, such a residue is R346, while in BA.1 such residue is K386. BA.2 has most such

residues, which are N370, A372, K378, and K386, containing atoms deviating from the wild

type. However, these residues are not in the receptor-binding motif (RBM, residues 438-506)

that interacts directly with ACE2.

Alternatively, for Alpha, Beta, Gamma, and BA.1 variants, we can change the threshold

from 7.16Å to the half of maximal distance (4.57Å, 4.67Å, 4.94Å, and 3.72Å, respectively).

Then in the Alpha variant, such residues are T333, R346, K378, K386, R408, and N450.

In Beta and Gamma variants, such residues are T333, R346, K378, K386, and R408. In

BA.1 such residues are T333, N334, E340, R346, N360, D364, Y369, K378, K386, F392,

R408, K424, N450, K462, and H519. Also most large Cα structural changes occur at the

coil regions of the RBD. For the BA.2 variant, the half of maximal distance is 7.16Å and we

have recorded such residues that have at least one atom whose distance to the corresponding

atom in the wild-type RBD is more than 7.16Å.

To quantify the total structural differences between the wild type and mutants, we calcu-

late the sum of squares of distances between corresponding Cα atoms. The results of Alpha,

Beta, Gamma, BA.1, and BA.2 are 69 Å2, 70 Å2, 67Å2, 93Å2, and 255Å2, respectively as

shown in Figure 5.4. The large values for BA.1 and BA.2 are consistent with fact that BA.1

and BA.2 are strongly antibody disruptive [29, 31]. The large structural changes induced by

BA.2 mutations create significant mismatch between antibodies and antigens, making BA.2

one of the most antibody resistant variants [31]. Arguably, the amount of mutation-induced

structural changes in RBD-ACE2 complexes also strongly correlates with viral infectivity

changes.

Now we turn to the topological characterization of the mutation-induced conformational

changes. To this end, we employ persistent Laplacians (PL) and persistent sheaf Lapla-

cians (PSL) to examine the local RBD structural changes induced by the mutation N501Y

(a common mutation that exists in Alpha, Beta, Gamma, BA.1, and BA.2). For the wild

type and mutants, the residue 501 mutation site is defined as the set of neighborhood heavy

57

Figure 5.4 The total structural changes of RBD between the wild type and mutants in RBD-
ACE2 complex. Given an alignment of a mutant RBD to the wild type RBD, the total
structural changes is defined to be the sum of squares of distances between corresponding
Cα atoms in RBD.

Figure 5.5 Illustration of persistent (sheaf) Betti numbers of element nonspecific persistent
Laplacian (PL) and persistent sheaf Laplacian (PSL) of the residue 501 mutation site at
different filtration values, i.e., radii (unit: Å). The wild type (PDB ID: 6M0J) and Alpha
(PDB ID: 8DLK) are given in the first row. The Beta (PDB ID: 8DLN) and Gamma (PDB
ID: 8DLQ) are given in the second row. BA.1 (PDB ID: 7T9L) and BA.2 (PDB ID: 7XB0)
are given the third row.

58

Figure 5.6 Illustration of the first nonzero eigenvalues of element nonspecific persistent Lapla-
cian (PL) and persistent sheaf Laplacian (PSL) of the residue 501 mutation site at different
filtration values, i.e., radii (unit: Å). The wild type (PDB ID: 6M0J) and Alpha (PDB ID:
8DLK) are given in the first row. The Beta (PDB ID: 8DLN) and Gamma (PDB ID: 8DLQ)
are given in the second row. BA.1 (PDB ID: 7T9L) and BA.2 (PDB ID: 7XB0) are given
the third row.

Figure 5.7 Illustration of persistent Betti numbers (red line) and the first nonzero eigenvalues
(blue line) of element nonspecific persistent Laplacians of the wild type N501 mutation site at
different filtration values, i.e., radii (unit: Å). Alpha filtration is used. The graphs from top
to bottom represent the results of dimension-0, dimension-1, and dimension-2 Laplacians.

59

atoms (C, N, and O) in RBD such that the distance of any atom in the set to the residue

501 Cα is smaller than 10Å. We calculate persistent Laplacians and persistent sheaf Lapla-

cians for mutation sites of the wild type and variants and compare the persistent (sheaf)

Betti numbers and the smallest nonzero eigenvalues of spectra at different filtration values.

Persistent Laplacians and persistent sheaf Laplacians can be calculated as either element

non-specifically or element specifically (i.e., considering carbon, nitrogen, and oxygen atoms

separately). We first employ the element non-specific approach and compare the results of

the wild type and variants. The results of persistent Laplacians and persistent sheaf Lapla-

cians are shown in Figures 5.5 and 5.6. The x axis represents the filtration values of Rips

filtration, such that at a filtration value r the Rips complex is constructed by considering

balls of radius r. The sudden changes of persistent (sheaf) Betti numbers and the first

nonzero eigenvalues near r = 0.65Å reflect the fact that most neighboring atoms are about

1.3Å away from each other. In Figure 5.5, The number of atoms is reflected in the initial

0-th Betti numbers. The 0-th Betti number dramatically decreases around 0.65 Å because

covalent bond distances are about 1.5Å. The 0-th Betti number decreases further from 1.2Å

to 1.7Å due to other many non-covalent bonds.

In Figure 5.6, the results of the wild type and mutants almost coincide, except that the

first nonzero eigenvalues of persistent sheaf Laplacians of BA.1 and BA.2 near r = 0.65Å

have very different values. The results of persistent Laplacians are quite different from

those of persistent sheaf Laplacians at large filtration values. The significant changes around

r = 0.65Å are due to the topological changes.

We are also interested in understanding whether higher dimensional persistent Laplacians

can offer an additional characterization of biomolecules. Figure 5.7 presents the higher

dimensional persistent Laplacian analysis of the wide type RBD near the N501 residue.

Obviously, higher dimensional persistent Laplacian offers significant structural information

about the distributions of circles and cavities of the macromolecule. Most dimension-1 circles

occur in the range of 1.5-2.4Å, whereas most 2-dimensional cavities locate around 1.8-2.8Å.

60

2-dimensional cavities are short-lived in the filtration, indicating the lack of multiple large

cavities in the structure (at most one large cavity in the structure). This distribution can

be used to understand interaction forces. For example, the length of hydrogen bonds ranges

from 2-3.6Å(corresponding to 1-1.8 Å in the filtration radii). This information is valuable for

the design of machine learning representations, including the selection of the set of filtration

intervals. We also note that the peak of λr,0
2

is at the left of βr,0
2

. It’s possible that when r

is in the range of 1.2Å-1.5Å, many 2-simplices are born but no 2-cycles are formed yet.

Figure 5.8 Illustration of the first nonzero eigenvalues of element-specific persistent Laplacian
of the residue 501 mutation site at different filtration values, i.e., radii (unit: Å). The wild
type (PDB ID: 6M0J) and Alpha (PDB ID: 8DLK) are given in the first row. The Beta
(PDB ID: 8DLN) and Gamma (PDB ID: 8DLQ) are given in the second row. BA.1 (PDB
ID: 7T9L) and BA.2 (PDB ID: 7XB0) are given the third row.

The element-specific results of the residue 501 mutation site of the wild type, and variants

Alpha, Beta, Gamma, BA.1, and BA.2 are shown in Figures 5.8 and 5.9, as well as in

Figures D.1 and D.2 in the Appendix. We observe that the difference between the first

nonzero eigenvalues is much more obvious. For instance, in Figure 5.8 there is a higher

spike near 0.7Å in the graph of Alpha carbon atoms, and two spikes near 1.3Å and 1.7Å

disappear in the graph of the Alpha variant’s oxygen atoms. In Figure 5.8, all results of

carbon atoms have similar shapes, implying a relatively stable RBD carbon atom structure.

In the results of nitrogen atoms, we notice that the results of Alpha, Beta, and Gamma

variants resemble each other, and the same can be said of the results of BA.1 and BA.2

61

Figure 5.9 Illustration of the first nonzero eigenvalues of element-specific persistent sheaf
Laplacian of the residue 501 mutation site at different filtration values, i.e., radii (unit: Å).
The wild type (PDB ID: 6M0J) and Alpha (PDB ID: 8DLK) are given in the first row. The
Beta (PDB ID: 8DLN) and Gamma (PDB ID: 8DLQ) are given in the second row. BA.1
(PDB ID: 7T9L) and BA.2 (PDB ID: 7XB0) are given the third row.

variants.

In the results of oxygen atoms, the results of Alpha, Beta, and Gamma still

resemble each other, but the results of BA.1 and BA.2 are quite different. The results of

the wild type are unique in the sense that it has one or two spikes near 1.3Å or 1.7Å. These

results indicate that element-specific persistent Laplacians and element-specific persistent

sheaf Laplacians are better approaches in characterizing SARS-CoV-2 variants than element-

non-specific approaches. We know that nitrogen and oxygen atoms are sparser in a protein,

so if we use element nonspecific approach, nitrogen atoms and oxygen atoms will first form

edges with neighboring carbon atoms, and we are not able to infer distances between nitrogen

atoms or oxygen atoms. This explains why element specific approach outperforms element

nonspecific approach.

5.0.2 PTL analysis of RBD structural changes induced by its binding to ACE2

We investigate how binding to ACE2 changes the spike protein RBD structure from the

closed state to the open state for the wild type, Alpha, Beta, BA.1, and BA.2 variants.

The PDB IDs of the spike protein of wild type, Alpha, Beta, BA.1 and BA.2 used in this

section are 7DF3 [148], 7LWS [51], 7LYM [51], 7TF8 [50] and 7XIX [21]. The analysis of the

Gamma variant is eliminated due to the lack of experimental structure. We first align each

62

of the three RBDs in the closed-state spike protein to the RBD in the RBD-ACE2 complex.

The maximal distances between corresponding atoms in the RBM of the three alignments

of BA.1 are 8.76Å, 13.49Å, and 9.44Å, which are larger than those of alignments of the wild

type and other mutants. For each alignment, we record the RBM residues that have at least

one atom whose distance to the corresponding atom is larger than 5.28Å, i.e., half of the

mean maximal distances between corresponding atoms in RBM of the three alignments of

BA.1. In wild-type RBD, such residues are K444 and K458. In Alpha there are no such

residues; In Beta, chains A and B have K458; chain C has T478 and P479. In BA.1, each

chain has different such residues: chain A has K440, Y453, K458, K478, and F486; chain B

has K440, Y453, R457, K458, R466, Y473, Q474, K478, F486, F490, R493; and chain C has

K440, Y453, Y473, K478, F486. In BA.2 such residues are E465, K478, and G482.

Figure 5.10 The total structural changes of the RBM between the closed state RBD and the
open state RBD induced by ACE2 binding. Here the total structural changes are defined to
be the sum of squares of distances between Cα atoms in the RBM.

We also calculate the total structural changes of the RBM between the closed state

RBD and the open state RBD induced by its binding to the human ACE2. Here, the total

structural changes are defined to be the sum of squares of distances between Cα atoms in

the RBM. Since spike protein is a trimer, we calculate the total structural changes for each

chain and report the average (see Figure 5.10). It turns out that the average total structural

changes induced by binding to ACE2 do not increase too much with respect to the number

of RBD mutations.

63

Figure 5.11 Illustration of persistent Betti numbers (red line) and the first nonzero eigenvalues
(blue line) of persistent Laplacian of the RBD binding site of the wild type RBD-ACE2
complex (PDB ID: 6M0J) and closed state spike protein (PDB ID: 7DF3, Chain ID: A) at
different filtration values, i.e., radii (unit: Å). The graphs from top to bottom represent the
results of carbon atoms, nitrogen atoms, and oxygen atoms, respectively.

Now, we calculate persistent Laplacians and persistent sheaf Laplacians for the RBD

binding site in the closed state spike protein and the RBD-ACE2 complex. For the wild

type and mutants, we define the RBD binding site as the set of RBD residues whose Cαs are

within 10Å from the Cαs of ACE2 residues. We choose 10Å as the cutoff distance, because

if we used 11Å then the RBD binding site would include non-RBM residues. Spike protein

as a trimer has three chains. In the results of alignments, the recorded residues of the wild

type, Alpha, and BA.2 are the same for the three chains. Therefore, for the wild type, Alpha

and BA.2 we only use chain A, and for Beta and BA.1, we use all three chains. The study

was carried out in an element-specific manner for carbon atoms, nitrogen atoms, and oxygen

atoms. The results of the wild type are shown in Figure 5.11. We noted that persistent

Betti numbers cannot distinguish two structures. However, the first nonzero eigenvalues of

the persistent Laplacians capture the difference, demonstrating the advantage of persistent

Laplacians over persistent homology in protein structure analysis.

Additional analysis is presented in Figures D.3, D.4, D.5, D.6, D.7, D.8, D.9, D.10, D.11,

and D.12 in the Appendix. In Figure D.3, the results of the wild type, Alpha, Beta, BA.1,

64

and BA.2 RBD binding sites are quite similar except that the wild type RBD binding site

has relatively lower first nonzero eigenvalues near r = 0.7Å. A peak appears or disappears

in the graph of the nitrogen atoms, whereas for BA.1 and BA.2, the results of the nitrogen

atoms resemble each other, sometimes even coincide.

The results of persistent Laplacians and persistent sheaf Laplacians are similar in this

work. However, this similarity is due to the specific implementation of persistent sheaf

Laplacians. In general, persistent sheaf Laplacians enable the embedding of non-geometric

chemical and physical information of biomeolecules in topological and spectral representa-

tions.

TDA assisted supervised learning

The workflow of TDA-assisted supervised learning is shown as below. Suppose our dataset

includes molecules (e.g., PDB files) and corresponding labels (a numerical number represent-

ing a biochemical property), one can use TDA methods to obtain a representation (a feature

vector) of each molecule and input all representation and labels to train a machine learn-

ing model. Persistent homology was first employed in TDA-assisted supervised learning.

For example, [20] used persistent homology to predict protein-ligand binding affinity and

state-of-the-art results were achieved.

Molecular
structures

TDA repre-
sentations

Supervised
machine
learning

Trained models

labels of
molecular
structures

Figure 5.12 The general workflow of TDA-assisted supervised learning.

TDA-assisted supervised learning has also been used to predict the impact of mutation

on protein-protein interaction [18, 28]. Suppose a wild type protein-protein complex struc-

ture and a dataset including single amino acid mutations (chain ID, residue ID, mutant

residue, etc.) and corresponding binding free energy (BFE) changes caused by mutations

65

are available1. Mutant structures can be computationally generated based on experimentally

determined structures of the wild-type antibody-antigen complexes and mutation informa-

tion. Since we are interested in the prediction of binding free energy changes caused by

mutation, we input both the TDA representations (i.e., feature vectors) of wild type struc-

ture and mutant structures to a machine learning model, so that the model will learn the

structural change induced by a mutation2. Laplacian representations have been employed in

TDA-assisted supervised learning [28]. Their performance on benchmark datasets are better

than previous models. The only difference is that TopLapGBT uses gradient boosting trees

whereas TopLapNet employs neural networks.

the single
mutation
information

the wildtype
structure

the mutant
structure

TDA rep-
resentation

TDA rep-
resentation

the BFE change

Supervised
machine
learning

Figure 5.13 The workflow to study the impact of mutation on a wildtype structure.

To do prediction with a trained TopLapGBT or TopLapNet, one has to transform the

wildtype structure and the mutant structure to TDA representations and input them to the

trained model.

When we apply persistent homology and persistent Laplacians to the study of protein-

protein interactions, we always extract the atoms within a certain cutoff distance r of the

binding site3 and construct a distance matrix such that if two atoms are in the same protein

then the distance between them is an extremely large constant number (to ensure that

1For instance, the AB-Bind S645 dataset [123] includes 645 mutants with experimentally determined

BFE changes across 29 antibody-antigen complexes.

2We also input the difference of the feature vectors of the wild type and the mutant. This trick enhances

the performance.

3Suppose an atom a is in protein A. If the distance of atom a to the other protein in the complex is not

larger than r, then we say atom a is within distance r of the binding site.

66

the single
mutation
information

the wild type
structure

the mutant
structure

TDA rep-
resentation

TDA rep-
resentation

TopLapNet (or
TopLapGBT)

the prediction

Figure 5.14 How to predict using TopLapGBT or TopLapNet.

atomic interaction within a single protein is ignored). To further characterize the interaction

between atoms of certain elements E1 and E2, we can consider the point cloud formed by

the atoms of an element E1 of protein A within r of the binding site, and the atoms of

element E2 of protein B within r of the binding site. After the calculation of persistent

homology and persistent Laplacians, the next step is to transform the barcodes of persistent

homology or spectra of persistent Laplacians into vector representations of fixed lengths. For

barcodes, there are at least two ways: either we divide the interval [0, r] into bins of even

length and count the occurrence of bars, birth values, and death values in each bin, or we

simply compute statistics such as sum, maximum, minimum, mean, and standard deviation

for bar lengths, birth values, and death values. The former method is often applied to 0-

dimensional barcodes and the latter to 1-dimensional and 2-dimensional barcodes. For the

spectrum of a persistent Laplacian, we separate zero eigenvalues (harmonic spectra) and

nonzero eigenvalues (non-harmonic spectra). We use the number of zero eigenvalues, the

sum, the minimum, the maximum, the mean, the standard deviation, the variance, and the

sum of squares of nonzero eigenvalues.

Impacts of computationally generated structures on PTL-assisted machine learn-
ing predictions

We mentioned earlier that a wild type structure is needed in TopLapGBT and TopLap-

Net. It is natural to ask if structural perturbation to the wild type structure will impact the

two models. To explore this, we use a SARS-CoV-2 BA.2 RBD deep mutational scanning

67

dataset which involves the systematical mutations of each residue on the BA.2 RBD to 19

other residues and records corresponding binding affinity changes [125]. The deep mutational

scanning covers the RBD residues from 333 to 527. In order to apply machine learning mod-

els, such as TopLapGBT and TopLapNet [28], to this dataset, BA.2 RBD mutants need to

be computationally generated based on a BA.2 RBD structure and the choice of the BA.2

RBD structure can affect the performance of machine learning models. We can choose either

an experimentally determined BA.2 RBD-ACE2 complex structure or a BA.2 RBD-ACE2

complex structure computationally generated based on an experimentally determined BA.1

RBD-ACE2 complex structure. When the given BA.2 RBD structure is experimentally de-

termined (PDB ID: 7XB0), the resulting models are referred to as ExpTopLapGBT (experi-

mental TopLapGBT) and ExpTopLapNet (experimental TopLapNet). When the BA.2 RBD

structure is computationally generated from BA.1 RBD (PDB ID: 7T9L) by Jackal [147], the

resulting model is referred to as ComTopLapGBT (computational TopLapGBT) or ComTo-

pLapNet (computational TopLapGBT). The distances of corresponding atoms between the

experimentally determined RBD (PDB ID: 7XB0) and the RBD generated computationally

from BA.1 RBD (PDB ID: 7T9L) is shown in Figure 5.15.

Figure 5.15 Atoms of BA.2 RBD (PDB ID: 7XB0) are colored by their distances to corre-
sponding atoms in the computationally generated structure. We record the residues that
have at least one atom whose distance to the corresponding atom in wild type RBD is more
than 7.57Å. Such residues are 370, 375, 378, 386, 387, and 519.

We compare the results of ExpTopLapGBT and ComTopLapGBT, on the predictions

of the RBD deep mutational scanning dataset. We split the dataset into 10 folds, and for

each fold, we use the other 9 folds as the training set to build a machine learning model,

which is used to predict ACE2-binding affinity changes for the fold. Therefore, for a given

68

10-fold splitting we get the ExpTopLapGBT and ComTopLapGBT predictions of RBD-

ACE2 binding affinity changes for the deep mutational scanning dataset. We denote by

Rp(Exp, T rue) the Pearson correlation coefficient between ExpTopLapGBT predicted bind-

ing affinity changes and experimental binding affinity changes. Similarly, Rp(Com, T rue) (or

Rp(Exp, Com)) is the Pearson correlation coefficient between ComTopLapGBT predicted

binding affinity changes and experimental binding affinity changes (or ExpTopLapGBT pre-

dicted binding affinity changes). We also do the same analysis for TopLapNet

Method

Rp(Exp, T rue) Rp(Com, T rue) Rp(Exp, Com)

TopLapGBT

TopLapNet

0.901

0.879

0.898

0.849

0.990

0.925

Table 5.1 Rp(Exp, T rue) is the correlation coefficient between predictions of ExpTo-

pLapGBT (or ExpTopLapNet) and true affinity changes. Here, Rp(Com, T rue) is the cor-

relation coefficient between predictions of ComTopLapGBT (or ComTopLapNet) and true

affinity changes. Rp(Exp, Com) is Pearson the correlation coefficient between the predic-

tions of ExpTopLapGBT and ComTopLapGBT (or between ExpTopLapNet and ComTo-

pLapNet). A random state affects the 10-fold splitting and the training of GBT and neural

networks.

The results of TopLapGBT and TopLapNet are shown in Table 5.1. Generally, the

performance of models using experimentally determined structures is better than that of

models using the computationally generated structure. This is not surprising since the

computationally generated structure is an approximation of the experimental structure. The

performance of ExpTopLapGBT and ComTopLapGBT are extremely close, whereas the

performance of ComTopLapNet differs very much from that of ExpTopLapNet. We also see

that ExpTopLapGBT outperforms ExpTopLapNet.

In this study, we use scikit-learn to build a gradient boosting tree whose parameters

are n_estimators=20000, learning_rate = 0.005, max_features = ‘sqrt’, max_depth = 9,

69

min_samples_split = 3, subsample = 0.4, and n_iter_no_change=500. Additionally, we

use PyTorch to build a neural network with 7 hidden layers where each layer has 8000

neurons.

70

CHAPTER 6

THESIS CONTRIBUTION AND FUTURE WORK

The main contributions of this dissertation are listed as follows:

• In section 2, we propose to calculate of spectra of persistent Laplacians using homotopy

continuation methods.

• In chapter 3, we extend persistent Laplacians to the setting of cellular sheaves and

discuss how persistent sheaf Laplacians can be applied to analyze biomolecules.

• In chapter 4, we review the recent generalizations of persistent Laplacians.

• In chapter 5, we perform analysis of RBD structural changes induced by mutations

and stability of persistent Laplacian assisted machine learning models.

The contents of this dissertation are mostly adopted from the following publications and

preprints:

• X. Wei and G.-W. Wei. Homotopy continuation for the spectra of persistent Laplacians.

Foundations of Data Science, 3(4):677, 2021.

• X. Wei and G.-W. Wei. Persistent sheaf Laplacians. Foundations of Data Science,

2024.

• X. Wei and G.-W. Wei. Persistent Topological Laplacians–a Survey. arXiv preprint

arXiv:2312.07563, 2023.

• X. Wei, J. Chen, and G.-W. Wei. Persistent topological Laplacian analysis of SARS-

CoV-2 variants. Journal of computational biophysics and chemistry, 22(5):569, 2023.

Many future directions are available, including:

• It is challenging to understand the relationship between the geometry/topology of the

data and PTLs. The understanding of this relationship is crucial for the application

of PTLs to real world problems.

71

• To a certain extent, the success of persistent homology can be attributed to its inte-

gration with machine learning, particularly with the introduction of topological deep

learning [19]. The featurization of Laplacians typically requires domain knowledge

and experience. Since self learning representations of persistent diagrams have been

proposed [70], we wonder if self learning representations of (persistent) Laplacians are

possible. It is also interesting to featurize the eigenvectors of Laplacians.

• Despite efforts in software development [96, 135], the computation of PTLs remains

slow, particularly for problems involving large datasets. Since the primary value of

TDA lies in its ability to analyze data, one of the most pressing needs will be the

development of efficient and robust PTL software packages. The development of finite

field PTLs will be also valuable.

• The invention of cellular sheaves for different scenarios is crucial for successful appli-

cations of persistent sheaf Laplacians.

• One can also extend PTLs to settings such as the Hochschild complex [49], quantum

homology [11], multiparameter persistent homology [68], and interaction homotopy

and interaction homology [87].

• As discussed in [126], persistent Dirac operators can be formulated for flag complexes,

digraphs, hyperdigraphs, etc. It is possible that a persistent sheaf Dirac operator can

be devised to distinguish certain point clouds.

• It will be interesting to generalize various PTLs on point clouds to the manifold and

knot-type data settings.

• Persistent Mayer homology and persistent Mayer Laplacians have been introduced

on N -chain complexes [121]. These formulations encompass persistent homology and

persistent Laplacians as special cases. The potential for future developments on these

subjects is widely open.

72

[1]

[2]

E. L. Allgower, D. J. Bates, A. J. Sommese, and C. W. Wampler. Solution of polynomial
systems derived from differential equations. Computing, 76:1–10, 2006.

BIBLIOGRAPHY

B. Ameneyro, V. Maroulas, and G. Siopsis. Quantum persistent homology. Journal of
Applied and Computational Topology, pages 1–20, 2024.

[3] D. N. Arnold, G. David, M. Filoche, D. Jerison, and S. Mayboroda. Computing
spectra without solving eigenvalue problems. SIAM Journal on Scientific Computing,
41(1):B69–B92, 2019.

[4] G. Ausiello and L. Laura. Directed hypergraphs: Introduction and fundamental algo-

rithms—a survey. Theoretical Computer Science, 658:293–306, 2017.

[5]

F. Baccini, F. Geraci, and G. Bianconi. Weighted simplicial complexes and their
representation power of higher-order network data and topology. Physical Review E,
106(3):034319, 2022.

[6] D. J. Bates, I. A. Fotiou, and P. Rostalski. A numerical algebraic geometry approach
to nonlinear constrained optimal control. In 2007 46th IEEE Conference on Decision
and Control, pages 6256–6261. IEEE, 2007.

[7] D. J. Bates, J. D. Hauenstein, A. J. Sommese, and C. W. Wampler. Bertini: Software
for numerical algebraic geometry. Available at bertini.nd.edu with permanent doi:
dx.doi.org/10.7274/R0H41PB5.

[8] D. J. Bates, A. J. Sommese, J. D. Hauenstein, and C. W. Wampler. Numerically Solving
Polynomial Systems with Bertini. Society for Industrial and Applied Mathematics,
Philadelphia, PA, 2013.

[9]

C. Battiloro, S. Sardellitti, S. Barbarossa, and P. Di Lorenzo. Topological signal
processing over weighted simplicial complexes. arXiv preprint arXiv:2302.08561, 2023.

[10] G. Bianconi. The topological dirac equation of networks and simplicial complexes.

Journal of Physics: Complexity, 2(3):035022, 2021.

[11] P. Biran and O. Cornea. A lagrangian quantum homology. New perspectives and

challenges in symplectic field theory, 49:1–44, 2009.

[12] P. Breiding and S. Timme. HomotopyContinuation.jl: A Package for Homotopy Con-
tinuation in Julia. In International Congress on Mathematical Software, pages 458–465.
Springer, 2018.

[13] S. Bressan, J. Li, S. Ren, and J. Wu. The embedded homology of hypergraphs and

applications. Asian Journal of Mathematics, 23(3):479–500, 2019.

[14] A. Bura, Q. He, and C. Reidys. Weighted homology of bi-structures over certain

discrete valuation rings. Mathematics, 9(7):744, 2021.

73

[15] A. C. Bura, N. S. Dutta, T. J. Li, and C. M. Reidys. A computational framework for

weighted simplicial homology. arXiv preprint arXiv:2206.04612, 2022.

[16] Z. Cang, L. Mu, and G.-W. Wei. Representability of algebraic topology for biomolecules
in machine learning based scoring and virtual screening. PLoS computational biology,
14(1):e1005929, 2018.

[17] Z. Cang, L. Mu, K. Wu, K. Opron, K. Xia, and G.-W. Wei. A topological approach
for protein classification. Computational and Mathematical Biophysics, 3(1), 2015.

[18] Z. Cang and G.-W. Wei. Analysis and prediction of protein folding energy changes
upon mutation by element specific persistent homology. Bioinformatics, 33(22):3549–
3557, 2017.

[19] Z. Cang and G.-W. Wei. Topologynet: Topology based deep convolutional and multi-
task neural networks for biomolecular property predictions. PLoS computational biol-
ogy, 13(7):e1005690, 2017.

[20] Z. Cang and G.-W. Wei.

Integration of element specific persistent homology and
machine learning for protein-ligand binding affinity prediction. International journal
for numerical methods in biomedical engineering, 34(2):e2914, 2018.

[21] Y. Cao, A. Yisimayi, F. Jian, W. Song, T. Xiao, L. Wang, S. Du, J. Wang, Q. Li,
X. Chen, et al. BA. 2.12. 1, BA. 4 and BA. 5 escape antibodies elicited by Omicron
infection. Nature, 608(7923):593–602, 2022.

[22] L. Caputi and H. Riihimäki. Hochschild homology, and a persistent approach via
connectivity digraphs. Journal of Applied and Computational Topology, pages 1–50,
2023.

[23] D. Carlson, E. Haynsworth, and T. Markham. A generalization of the Schur comple-
ment by means of the Moore–Penrose inverse. SIAM Journal on Applied Mathematics,
26(1):169–175, 1974.

[24] D. Chen, J. Liu, J. Wu, and G.-W. Wei. Persistent hyperdigraph homology and
persistent hyperdigraph Laplacians. Foundations of Data Science, 5(4):558–588, 2023.

[25] D. Chen, J. Liu, J. Wu, G.-W. Wei, F. Pan, and S.-T. Yau. Path topology in molecular
and materials sciences. The Journal of Physical Chemistry Letters, 14(4):954–964,
2023.

[26] J. Chen, K. Gao, R. Wang, and G.-W. Wei. Prediction and mitigation of mutation
threats to COVID-19 vaccines and antibody therapies. Chemical Science, 12(20):6929–
6948, 2021.

[27] J. Chen, K. Gao, R. Wang, and G.-W. Wei. Revealing the threat of emerging SARS-
CoV-2 mutations to antibody therapies. Journal of Molecular Biology, 433(7744),
2021.

74

[28] J. Chen, Y. Qiu, R. Wang, and G.-W. Wei. Persistent Laplacian projected Omicron
BA. 4 and BA. 5 to become new dominating variants. Computers in Biology and
Medicine, 151:106262, 2022.

[29] J. Chen, R. Wang, N. B. Gilby, and G.-W. Wei. Omicron variant (B. 1.1. 529): infec-
tivity, vaccine breakthrough, and antibody resistance. Journal of chemical information
and modeling, 62(2):412–422, 2022.

[30] J. Chen, R. Wang, M. Wang, and G.-W. Wei. Mutations strengthened SARS-CoV-2

infectivity. Journal of molecular biology, 432(19):5212–5226, 2020.

[31] J. Chen and G.-W. Wei. Omicron BA. 2 (B. 1.1. 529.2): high potential for becoming
the next dominant variant. The journal of physical chemistry letters, 13(17):3840–3849,
2022.

[32] J. Chen, R. Zhao, Y. Tong, and G.-W. Wei. Evolutionary de rham-Hodge method.

Discrete and continuous dynamical systems. Series B, 26(7):3785, 2021.

[33] T. Chen, T.-L. Lee, and T.-Y. Li. Hom4ps-3: A parallel numerical solver for systems
In
of polynomial equations based on polyhedral homotopy continuation methods.
H. Hong and C. Yap, editors, Mathematical Software – ICMS 2014, pages 183–190,
Berlin, Heidelberg, 2014. Springer Berlin Heidelberg.

[34] S. Chowdhury, S. Huntsman, and M. Yutin. Path homologies of motifs and temporal

network representations. Applied Network Science, 7(1):4, 2022.

[35] S. Chowdhury and F. Mémoli. Persistent path homology of directed networks. In Pro-
ceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms,
pages 1152–1169. SIAM, 2018.

[36] F. R. Chung. The Laplacian of a hypergraph. In Expanding graphs, pages 21–36, 1992.

[37] F. R. Chung. Spectral graph theory, volume 92. American Mathematical Soc., 1997.

[38] O. T. Courtney and G. Bianconi. Weighted growing simplicial complexes. Physical

Review E, 95(6):062301, 2017.

[39] J. Curry. Sheaves, cosheaves and applications. PhD thesis, University of Pennsylvania,

2014.

[40] Y. Dabaghian, F. Mémoli, L. Frank, and G. Carlsson. A topological paradigm for

hippocampal spatial map formation using persistent homology. 2012.

[41] R. J. M. Dawson. Homology of weighted simplicial complexes. Cahiers de Topologie

et Géométrie Différentielle Catégoriques, 31(3):229–243, 1990.

[42] M. Desbrun, E. Kanso, and Y. Tong. Discrete differential forms for computational

modeling. In ACM SIGGRAPH 2006 Courses, pages 39–54. 2006.

75

[43] B. Eckmann. Harmonische funktionen und randwertaufgaben in einem komplex. Com-

mentarii Mathematici Helvetici, 17(1):240–255, 1944.

[44] H. Edelsbrunner and J. Harer. Computational Topology: An Introduction. American

Mathematical Society, 2010.

[45] E. Emtander. Betti numbers of hypergraphs. Communications in algebra, 37(5):1545–

1571, 2009.

[46] E. Estrada. Path laplacian matrices: introduction and application to the analysis of

consensus in networks. Linear algebra and its applications, 436(9):3373–3391, 2012.

[47] M. Fiedler. Algebraic connectivity of graphs. Czechoslovak mathematical journal,

23(2):298–305, 1973.

[48] M. Gameiro, Y. Hiraoka, S. Izumi, M. Kramar, K. Mischaikow, and V. Nanda. A
topological measurement of protein compressibility. Japan Journal of Industrial and
Applied Mathematics, 32(1):1–17, 2015.

[49] M. Gerstenhaber and A. Voronov. Higher operations on the hochschild complex. Funct.

Anal. Appl, 29(1):3, 1995.

[50] S. M.-C. Gobeil, R. Henderson, V. Stalls, K. Janowska, X. Huang, A. May, M. Speak-
man, E. Beaudoin, K. Manne, D. Li, et al. Structural diversity of the SARS-CoV-2
Omicron spike. Molecular cell, 2022.

[51] S. M.-C. Gobeil, K. Janowska, S. McDowell, K. Mansouri, R. Parks, V. Stalls, M. F.
Kopp, K. Manne, D. Li, K. Wiehe, et al. Effect of natural mutations of SARS-CoV-2
on spike structure, conformation, and antigenicity. Science, 373(6555):eabi6226, 2021.

[52] T. E. Goldberg. Combinatorial Laplacians of simplicial complexes. Senior Thesis,

Bard College, 2002.

[53] A. Gomes and D. Miranda. Path cohomology of locally finite digraphs, Hodge’s theo-

rem and the p-lazy random walk. arXiv preprint arXiv:1906.04781, 2019.

[54] C. González, G. M. Langdon, M. Bruix, A. Gálvez, E. Valdivia, M. Maqueda, and
M. Rico. Bacteriocin as-48, a microbial cyclic polypeptide structurally and functionally
related to mammalian nk-lysin. Proceedings of the National Academy of Sciences,
97(21):11221–11226, 2000.

[55] A. Grigor’yan. Advances in path homology theory of digraphs. 2022.

[56] A. Grigor’yan, R. Jimenez, Y. Muranov, and S.-T. Yau. Homology of path complexes

and hypergraphs. Topology and its Applications, 267:106877, 2019.

[57] A. Grigor’yan, Y. Lin, Y. Muranov, and S.-T. Yau. Homologies of path complexes and

digraphs. arXiv preprint arXiv:1207.2834, 2012.

76

[58] A. Grigor’yan, Y. Lin, Y. Muranov, and S.-T. Yau. Homotopy theory for digraphs.

Pure and Applied Mathematics Quarterly, 10(4):619–674, 2014.

[59] A. Grigor’yan, Y. Lin, Y. Muranov, and S.-T. Yau. Cohomology of digraphs and

(undirected) graphs. Asian Journal of Mathematics, 19(5):887–932, 2015.

[60] A. Grigor’yan, Y. Lin, Y. V. Muranov, and S.-T. Yau. Path complexes and their

homologies. Journal of Mathematical Sciences, 248:564–599, 2020.

[61] A. Grigor’yan, Y. Muranov, and S.-T. Yau. Homologies of digraphs and künneth

formulas. Communications in Analysis and Geometry, 25(5):969–1018, 2017.

[62] E. Gross, B. Davis, K. L. Ho, D. J. Bates, and H. A. Harrington. Numerical algebraic
geometry for model selection and its application to the life sciences. Journal of The
Royal Society Interface, 13(123):20160256, 2016.

[63] A. B. Gülen, F. Mémoli, Z. Wan, and Y. Wang. A generalization of the persistent
Laplacian to simplicial maps. In E. W. Chambers and J. Gudmundsson, editors, 39th
International Symposium on Computational Geometry (SoCG 2023), volume 258 of
Leibniz International Proceedings in Informatics (LIPIcs), pages 37:1–37:17, Dagstuhl,
Germany, 2023. Schloss Dagstuhl – Leibniz-Zentrum für Informatik.

[64] J. Hansen. Laplacians of Cellular Sheaves: Theory and Applications. PhD thesis,

University of Pennsylvania, 2020.

[65] J. Hansen and R. Ghrist. Toward a spectral theory of cellular sheaves. Journal of

Applied and Computational Topology, 3(4):315–358, 2019.

[66] W. Hao, J. D. Hauenstein, B. Hu, Y. Liu, A. J. Sommese, and Y.-T. Zhang. Multiple
stable steady states of a reaction-diffusion model on zebrafish dorsal-ventral patterning.
Discrete & Continuous Dynamical Systems - S, 4:1413, 2011.

[67] W. Hao, B. Hu, and A. J. Sommese. Numerical algebraic geometry and differential
equations. In Future Vision and Trends on Shapes, Geometry and Algebra, pages 39–53.
Springer, 2014.

[68] H. A. Harrington, N. Otter, H. Schenck, and U. Tillmann. Stratifying multiparameter
persistent homology. SIAM Journal on Applied Algebra and Geometry, 3(3):439–471,
2019.

[69] J. Hauenstein, J. Rodriguez, and B. Sturmfels. Maximum likelihood for matrices with

rank constraints, 2013.

[70] C. D. Hofer, R. Kwitt, and M. Niethammer. Learning representations of persistence

barcodes. J. Mach. Learn. Res., 20(126):1–45, 2019.

[71] M. Hoffmann, H. Kleine-Weber, S. Schroeder, N. Krüger, T. Herrler, S. Erichsen, T. S.
Schiergens, G. Herrler, N.-H. Wu, A. Nitsche, et al. SARS-CoV-2 cell entry depends
on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell,
181(2):271–280, 2020.

77

[72] D. Horak and J. Jost. Spectra of combinatorial laplace operators on simplicial com-

plexes. Advances in Mathematics, 244:303–336, 2013.

[73] S. Hu and L. Qi. The Laplacian of a uniform hypergraph. Journal of Combinatorial

Optimization, 29(2):331–366, 2015.

[74] S. Huntsman. Path homology as a stronger analogue of cyclomatic complexity. arXiv

preprint arXiv:2003.00944, 2020.

[75] B. Jones and G.-W. Wei. Persistent flag Laplacians. arXiv preprint arXiv:2203.12965,

2023.

[76] J. Jost and R. Mulas. Hypergraph Laplace operators for chemical reaction networks.

Advances in mathematics, 351:870–896, 2019.

[77] E. Jurrus, D. Engel, K. Star, K. Monson, J. Brandi, L. E. Felberg, D. H. Brookes,
L. Wilson, J. Chen, K. Liles, M. Chun, P. Li, D. W. Gohara, T. Dolinsky, R. Konecny,
D. R. Koes, J. E. Nielsen, T. Head-Gordon, W. Geng, R. Krasny, G.-W. Wei, M. J.
Holst, J. A. McCammon, and N. A. Baker.
Improvements to the apbs biomolecu-
lar solvation software suite. Protein science : a publication of the Protein Society,
27(1):112–128, Jan 2018. 28836357[pmid].

[78] P. M. Kasson, A. Zomorodian, S. Park, N. Singhal, L. J. Guibas, and V. S. Pande.
Persistent voids: a new structural metric for membrane fusion. Bioinformatics,
23(14):1753–1759, 2007.

[79] S. Krishnagopal and G. Bianconi. Spectral detection of simplicial communities via

Hodge laplacians. Physical Review E, 104(6):064303, 2021.

[80] J. Lan, J. Ge, J. Yu, S. Shan, H. Zhou, S. Fan, Q. Zhang, X. Shi, Q. Wang, L. Zhang,
et al. Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2
receptor. Nature, 581(7807):215–220, 2020.

[81] A. Leykin and F. Sottile. Galois groups of schubert problems via homotopy computa-

tion. Mathematics of Computation, 78(267):1749–1765, 2009.

[82] L. Li, H. Liao, Y. Meng, W. Li, P. Han, K. Liu, Q. Wang, D. Li, Y. Zhang, L. Wang,
et al. Structural basis of human ACE2 higher binding affinity to currently circulating
Omicron SARS-CoV-2 sub-variants BA. 2 and BA. 1.1. Cell, 185(16):2952–2960, 2022.

[83] T. J. Li and C. M. Reidys. On weighted simplicial homology.

arXiv preprint

arXiv:2205.03435, 2022.

[84] A. Lieutier. Talk: Persistent harmonic forms, 2014.

[85] L.-H. Lim. Hodge Laplacians on graphs. Siam Review, 62(3):685–715, 2020.

[86] Y. Lin, S. Ren, C. Wang, and J. Wu. Weighted path homology of weighted digraphs

and persistence. arXiv preprint arXiv:1910.09891, 2019.

78

[87] J. Liu, D. Chen, and G.-W. Wei.

Interaction homotopy and interaction homology.

arXiv preprint arXiv:2311.16322, 2023.

[88] J. Liu, J. Li, and J. Wu. The algebraic stability for persistent Laplacians. arXiv

preprint arXiv:2302.03902, 2023.

[89] R. Liu, X. Liu, and J. Wu. Persistent path-spectral (pps) based machine learning
for protein–ligand binding affinity prediction. Journal of Chemical Information and
Modeling, 2023.

[90] X. Liu, H. Feng, J. Wu, and K. Xia. Persistent spectral hypergraph based machine
learning (psh-ml) for protein-ligand binding affinity prediction. Briefings in Bioinfor-
matics, 22(5):bbab127, 2021.

[91] X. Liu, H. Feng, J. Wu, and K. Xia. Hom-complex-based machine learning (hcml) for
the prediction of protein–protein binding affinity changes upon mutation. Journal of
chemical information and modeling, 62(17):3961–3969, 2022.

[92] D. Lütgehetmann, D. Govc, J. P. Smith, and R. Levi. Computing persistent homology

of directed flag complexes. Algorithms, 13(1):19, 2020.

[93] D. Mannar, J. W. Saville, Z. Sun, X. Zhu, M. M. Marti, S. S. Srivastava, A. M.
Berezuk, S. Zhou, K. S. Tuttle, M. D. Sobolewski, et al. SARS-CoV-2 variants of
concern: spike protein mutational analysis and epitope for broad neutralization. Nature
Communications, 13(1):1–12, 2022.

[94] D. Mannar, J. W. Saville, X. Zhu, S. S. Srivastava, A. M. Berezuk, K. S. Tuttle,
A. C. Marquez, I. Sekirov, and S. Subramaniam. SARS-CoV-2 Omicron variant:
Antibody evasion and cryo-EM structure of spike protein–ACE2 complex. Science,
375(6582):760–764, 2022.

[95] P. Masulli and A. E. Villa. The topology of the directed clique complex as a network

invariant. SpringerPlus, 5:1–12, 2016.

[96] F. Mémoli, Z. Wan, and Y. Wang. Persistent Laplacians: Properties, algorithms and
implications. SIAM Journal on Mathematics of Data Science, 4(2):858–884, 2022.

[97] Z. Meng and K. Xia. Persistent spectral–based machine learning (perspect ml) for

protein-ligand binding affinity prediction. Science Advances, 7(19):eabc5329, 2021.

[98] A. Muhammad and M. Egerstedt. Control using higher order Laplacians in network
In Proc. of 17th International Symposium on Mathematical Theory of

topologies.
Networks and Systems, pages 1024–1038. Citeseer, 2006.

[99] J. R. Munkres. Elements of Algebraic Topology. Addison-Wesley Publishing Company,

Inc., 1984.

[100] Y. Muranov, A. Szczepkowska, and V. Vershinin. Path homology of directed hyper-

graphs. Homology, Homotopy and Applications, 24(2):347–363, 2022.

79

[101] A. Myers, C. Joslyn, B. Kay, E. Purvine, G. Roek, and M. Shapiro. Topological analysis
of temporal hypergraphs. In Algorithms and Models for the Web Graph: 18th Interna-
tional Workshop, WAW 2023, Toronto, ON, Canada, May 23–26, 2023, Proceedings,
pages 127–146. Springer, 2023.

[102] A. D. Parks and S. L. Lipscomb. Homology and hypergraph acyclicity: a combinatorial
invariant for hypergraphs. Technical report, NAVAL SURFACE WARFARE CENTER
DAHLGREN VA, 1991.

[103] G. Petri, M. Scolamiero, I. Donato, and F. Vaccarino. Topological strata of weighted

complex networks. PloS one, 8(6):e66506, 2013.

[104] E. F. Pettersen, T. D. Goddard, C. C. Huang, E. C. Meng, G. S. Couch, T. I. Croll,
J. H. Morris, and T. E. Ferrin. Ucsf chimerax: Structure visualization for researchers,
educators, and developers. Protein Science, 30(1):70–82, 2021.

[105] Y. Qiu and G.-W. Wei. Persistent spectral theory-guided protein engineering. Nature

Computational Science, 3(2):149–163, 2023.

[106] Y. Qiu and G.-W. Wei. Persistent spectral theory-guided protein engineering. Nature

Computational Science, 2023.

[107] T. Raček, O. Schindler, D. Toušek, V. Horsk`y, K. Berka, J. Koča, and R. Svo-
bodová. Atomic charge calculator II: web-based tool for the calculation of partial
atomic charges. Nucleic acids research, 48(W1):W591–W596, 2020.

[108] M. W. Reimann, M. Nolte, M. Scolamiero, K. Turner, R. Perin, G. Chindemi,
P. Dłotko, R. Levi, K. Hess, and H. Markram. Cliques of neurons bound into cav-
ities provide a missing link between structure and function. Frontiers in computational
neuroscience, page 48, 2017.

[109] S. Ren and C. Wu. Weighted simplicial complexes and weighted analytic torsions.

arXiv preprint arXiv:2103.04252, 2021.

[110] S. Ren, C. Wu, and J. Wu. Hodge decompositions for weighted hypergraphs. arXiv

preprint arXiv:1805.11331, 2018.

[111] S. Ren, C. Wu, and J. Wu. Weighted persistent homology. Rocky Mountain Journal

of Mathematics, 48(8):2661 – 2687, 2018.

[112] S. Ren and J. Wu. Stability of persistent homology for hypergraphs. arXiv preprint

arXiv:2002.02237, 2020.

[113] Y. Ren, J. W. Martini, and J. Torres. Decoupled molecules with binding polynomials

of bidegree (n, 2). Journal of mathematical biology, 78:879–898, 2019.

[114] M. Robinson. How do we deal with noisy data?

[115] M. Robinson. Topological signal processing, volume 81. Springer.

80

[116] M. Robinson. Sheaves are the canonical data structure for sensor integration. Infor-

mation Fusion, 36:208–224, 2017.

[117] F. Russold. Persistent sheaf cohomology. arXiv preprint arXiv:2204.13446, 2022.

[118] Samplchallenges. Sampl6/cb8.mol2 at master · samplchallenges/sampl6.

[119] A. Sharma, T. J. Moore, A. Swami, and J. Srivastava. Weighted simplicial complex:
In Advances in Knowledge
A novel approach for predicting small group evolution.
Discovery and Data Mining: 21st Pacific-Asia Conference, PAKDD 2017, Jeju, South
Korea, May 23-26, 2017, Proceedings, Part I 21, pages 511–523. Springer, 2017.

[120] L. Shen, J. Liu, and G.-W. Wei. Evolutionary Khovanov homology. AIMS Mathemat-

ics, 9(9):26139–26165, 2024.

[121] L. Shen, J. Liu, and G.-W. Wei. Persistent Mayer homology and persistent Mayer

Laplacian. Foundations of Data Science, 6(4):584–612, 2024.

[122] A. D. Shepard. A cellular description of the derived category of a stratified space. PhD

thesis, Brown University, 1985.

[123] S. Sirin, J. R. Apgar, E. M. Bennett, and A. E. Keating. AB-Bind: antibody binding
mutational database for computational affinity predictions. Protein Science, 25(2):393–
409, 2016.

[124] A. J. Sommese and C. W. Wampler. The Numerical Solution of Systems of Polynomials

Arising in Engineering and Science. WORLD SCIENTIFIC, 2005.

[125] T. N. Starr, A. J. Greaney, C. M. Stewart, A. C. Walls, W. W. Hannon, D. Veesler, and
J. D. Bloom. Deep mutational scans for ACE2 binding, RBD expression, and antibody
escape in the SARS-CoV-2 Omicron BA. 1 and BA. 2 receptor-binding domains. PLoS
pathogens, 18(11):e1010951, 2022.

[126] F. Suwayyid and G.-W. Wei. Persistent Dirac of paths on digraphs and hypergraphs.

Foundations of Data Science, 6(2):124–153, 2024.

[127] M. Thakur and R. Tripathi. Linear connectivity problems in directed hypergraphs.

Theoretical Computer Science, 410(27-29):2592–2618, 2009.

[128] J. Verschelde. Algorithm 795: Phcpack: A general-purpose solver for polynomial
systems by homotopy continuation. ACM Trans. Math. Softw., 25(2):251–276, June
1999.

[129] A. C. Walls, Y.-J. Park, M. A. Tortorici, A. Wall, A. T. McGuire, and D. Veesler.
Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein. Cell,
181(2):281–292, 2020.

[130] C. W. Wampler and A. J. Sommese. Numerical algebraic geometry and algebraic

kinematics. Acta Numerica, 20:469–567, 2011.

81

[131] R. Wang, J. Chen, K. Gao, and G.-W. Wei. Vaccine-escape and fast-growing mutations
in the United Kingdom, the United States, Singapore, Spain, India, and other COVID-
19-devastated countries. Genomics, 113(4):2158–2170, 2021.

[132] R. Wang, J. Chen, and G.-W. Wei. Mechanisms of SARS-CoV-2 evolution revealing
vaccine-resistant mutations in Europe and America. The Journal of Physical Chemistry
Letters, 12:11850–11857, 2021.

[133] R. Wang, D. D. Nguyen, and G.-W. Wei. Persistent spectral graph. International
Journal for Numerical Methods in Biomedical Engineering, 36(9):e3376, 2020.

[134] R. Wang and G.-W. Wei. Persistent path Laplacian. Foundations of Data Science,

5(1):26–55, 2023.

[135] R. Wang, R. Zhao, E. Ribando-Gros, J. Chen, Y. Tong, and G.-W. Wei. Hermes:

Persistent spectral graph software. Foundations of Data Science, 3(1):67–97, 2020.

[136] J. Wee, G. Bianconi, and K. Xia. Persistent Dirac for molecular representation. Sci-

entific Reports, 13(1):11183, 2023.

[137] J. Wee and K. Xia. Ollivier persistent Ricci curvature-based machine learning for
the protein–ligand binding affinity prediction. Journal of Chemical Information and
Modeling, 61(4):1617–1626, 2021.

[138] R. K. J. Wei, J. Wee, V. E. Laurent, and K. Xia. Hodge theory-based biomolecular

data analysis. Scientific Reports, 12(1):9699, 2022.

[139] X. Wei, J. Chen, and G.-W. Wei. Persistent topological Laplacian analysis of SARS-
CoV-2 variants. Journal of computational biophysics and chemistry, 22(5):569, 2023.

[140] X. Wei and G.-W. Wei. Homotopy continuation for the spectra of persistent Laplacians.

Foundations of Data Science, 3(4):677, 2021.

[141] X. Wei and G.-W. Wei. Persistent sheaf Laplacians. Foundations of Data Science,

2024.

[142] C. Wu, S. Ren, J. Wu, and K. Xia. Weighted (co)homology and weighted Laplacian.

arXiv preprint arXiv:1804.06990, 2018.

[143] C. Wu, S. Ren, J. Wu, and K. Xia. Discrete morse theory for weighted simplicial

complexes. Topology and its Applications, 270:107038, 2020.

[144] K. Xia, X. Feng, Y. Tong, and G. W. Wei. Persistent homology for the quantitative
prediction of fullerene stability. Journal of computational chemistry, 36(6):408–422,
2015.

[145] K. Xia and G.-W. Wei. Persistent homology analysis of protein structure, flexibility,
and folding. International journal for numerical methods in biomedical engineering,
30(8):814–844, 2014.

82

[146] K. Xia and G.-W. Wei. Persistent topology for cryo-EM data analysis. International

Journal for Numerical Methods in Biomedical Engineering, 31(8), 2015.

[147] Z. Xiang and B. Honig. Extending the accuracy limits of prediction for side-chain

conformations. Journal of molecular biology, 311(2):421–430, 2001.

[148] C. Xu, Y. Wang, C. Liu, C. Zhang, W. Han, X. Hong, Y. Wang, Q. Hong, S. Wang,
Q. Zhao, et al. Conformational dynamics of SARS-CoV-2 trimeric spike glycoprotein
in complex with receptor ACE2 revealed by cryo-EM. Science advances, 7(1):eabe5575,
2021.

[149] K. Yegnesh. Persistence and sheaves. arXiv preprint arXiv:1612.03522, 2016.

[150] H. R. Yoon. Cellular sheaves and cosheaves for distributed topological data analysis.

PhD thesis, University of Pennsylvania, 2018.

[151] A. Zomorodian and G. Carlsson. Computing persistent homology. In Proceedings of

the twentieth annual symposium on Computational geometry, pages 347–356, 2004.

83

APPENDIX A

STATISTICS OF NONZERO EIGENVALUES OF PERSISTENT
LAPLACIANS

Here we present some statistics of nonzero eigenvalues of persistent Laplacians. An Alpha

filtration is constructed from the atoms of molecule CB8 [118]. The spectrum of ∆Xr,Xr+p

d

for d = 0, 1 and selected r, p will be calculated. The radius r will be a multiple of 0.01Å.

Figure A.1 The minimal of non-zero eigenvalues when p = 0.

Figure A.2 The minimal of non-zero eigenvalues when p = 0.2.

84

Figure A.3 The mean of non-zero eigenvalues when p = 0.

Figure A.4 The mean of non-zero eigenvalues when p = 0.2.

Figure A.5 The median of non-zero eigenvalues when p = 0.

85

Figure A.6 The median of non-zero eigenvalues when p = 0.2.

Figure A.7 The maximal of non-zero eigenvalues when p = 0.

Figure A.8 The maximal of non-zero eigenvalues when p = 0.2.

86

Figure A.9 The standard variation of non-zero eigenvalues when p = 0.

Figure A.10 The standard variation of non-zero eigenvalues when p = 0.2.

87

APPENDIX B

METHOD OF HOMOTOPY CONTINUATION

Our exposition of homotopy continuation follows [8]. More theoretical treatment of this topic

can be found elsewhere [124].

Solving a system of polynomial equations f by homotopy continuation basically consists

of three steps: 1) build a start system g such that g can be solved easily; 2) build a homotopy

between two systems f and g; 3) track the roots of g to the roots of f .

We first look at a simple example. Let us say we wish to solve the following polynomial

in one complex variable

f (z) = −2z3 − 5z2 + 4z + 1.

We take a similar and simpler polynomial g(z) and deform the roots of g(z) to f (z). For

instance we may take g(z) = z3 + 1 and construct a linear homotopy

h(z, s) = sg(z) + (1 − s)f (z)

where s is a complex variable. Though the second parameter of h is a complex variable, we

still call h a homotopy between f and g for convenience. Then we parametrize s by a curve

s(t) =

γt
γt + (1 − t)

, t ∈ [0, 1], γ ∈ C\R

(this is called the gamma trick [8, Section 6.1] and there are technical reasons behind such

choice of parametrization). We substitute s(t) in h(z, s) and clear denominators, then obtain

a usual homotopy

H(z, t) = γtg(z) + (1 − t)f (z), t ∈ [0, 1].

For each t0 ∈ [0, 1], H(z, t0) is a polynomial. Once we know how to numerically solve

H(z, t0 − ∆t) = 0 from the known roots of H(z, t0), we can pick a grid of [0, 1] and track the

known roots of g step by step all the way to the solutions of f . This process is called the path

88

tracking. Now suppose H(z(t), t) = 0 for any t ∈ (0, 1] with z(1) a root of g. Differentiate

H(z(t), t) with respect to t, we have the Davidenko differential equation

∂H
∂t

(z(t), t) +

∂H
∂z

(z(t), t)

dz(t)
dt

= 0.

If ∂H

∂z (z(t), t) is nonzero, the Davidenko differential equation can be rewritten as

dz(t)
dt

= −

(cid:16) ∂H
∂z

(z(t), t)

(cid:17)−1 ∂H
∂t

(z(t), t)

As we know the value of z(t) at t0, we have indeed transformed our original problem of

tracking roots to the classical initial value problem of ordinary differential equation (ODE).

One may use any ODE method to predict z(t0 − ∆t) (The default ODE solver employed by

Bertini is RKF45). For instance we can apply the simplest Euler’s method and get

z(t0 − ∆t) = z(t0) −

(cid:16)∂H
∂z

(z(t0), t0)

(cid:17)−1 ∂H
∂t

(z(t0), t0)∆t.

Since we also know that H(z(t0 − ∆t), t0 − ∆t) should be zero, we can apply several

iterations of Newton’s method to update z(t0 − ∆t). Such combination of an ODE predictor

with Newton’s method is called a predictor-corrector method.

Now after the path tracking from t = 1 to t = 0, we get a sequence {z(ti)}. If the limit

lim
ti→0

z(ti) exists and is finite, we think of lim
ti→0

z(ti) as a solution of f .

Example B.0.1. The reader may wonder why we do not just use

H(z, t) = tg(z) + (1 − t)f (z), t ∈ [0, 1].

Consider the example H(z, t) = t(z2 − 1) + (1 − t)(5 − z2). When t = 1/2, H(z, 1/2) = 2

has no roots. When t = 5/6, H(z, 5/6) = 2/3z2 has a singular root 0, and the derivative of

it at z = 0 is zero.

Example B.0.2. Though usually we are only interested in real roots of the target system,

we should also track complex roots. Consider the homotopy

h(z, t) = z4 − e2πi(1−t) + 0.25 = 0, t ∈ [0, 1].

89

√
At t = 1, h(z, 1) has two real roots ± 4

√
0.75 and two imaginary roots ± 4

0.75i. As t goes

from 1 to 0, e2πi(1−t) travels around the unit circle in the complex plane counterclockwise;

The two real roots will be deformed to the two imaginary ones and vice versa.

90

APPENDIX C

APPENDIX OF THE SHEAF CHAPTER

Figure C.1 A square and a trapezoid. Coordinates of vertices are shown.

(a)

(b)

(c)

(d)

(e)
Figure C.2 The results of the square. We consider pairs (Xr, Xr) or (Xr, Xr+0.2) in a filtration.
The results of the labeled point cloud {((0, 0), 1), ((1, 0), 1), ((1, 1), 1), ((0, 1), 1)} are shown
in (a)(c). The results of the point cloud {(0, 0), (1, 0), (1, 1), (0, 1)} are shown in (b)(d). The
results of the labeled point cloud {((0, 0), 1), ((1, 0), 1), ((1, 1), 1), ((0, 1), −1)} are shown in
(e)(f).

(f)

91

(a)

(b)

(c)

(d)

(e)

(f)

a

in

the

results

trapezoid.

of
filtration.
√

Figure C.3 The
or
(Xr, Xr+0.2)
{((0, 0), 1), ((1, 0), 1), ((3/4,
the point cloud {(0, 0), (1, 0), (3/4,
of the labeled point cloud {((0, 0), 1), ((1, 0), 1), ((3/4,
shown in (e)(f).

15/4), (1/4,

results

The

√

pairs

We
of

consider

the

labeled

(Xr, Xr)
cloud
The results of
15/4)} are shown in (b)(d). The results
15/4), −1)} are

15/4), 1), ((1/4,

point

√

√

15)/4, 1), ((0, 1), 1)} are shown in (a)(c).
√

Figure C.4 The structure of CB8.

92

Figure C.5 The results of the CB8 when p = 0.

Figure C.6 The results of the CB8 when p = 0.2.

93

Figure C.7 Illustration of the structure of bacteriocin AS-48.

Figure C.8 The results of AS-48 when p = 0.

94

Figure C.9 Results of AS-48 when p = 0.4.

95

APPENDIX D

APPENDIX OF THE SARS-COV-2 CHAPTER

Here we provide the information of PDB structures we used and additional topological

analysis using persistent Laplacian and persistent sheaf Laplacian. The information of PDB

structures is given in Table D.1. PTL results are given in Figures D.1, D.2, D.3, D.4, D.5,

D.6, D.7, D.8, D.9, D.10, D.11, and D.12.

Specifically, Figures D.1, and D.2 are element-specific analysis of the wide type, Alpha,

Beta, Gamma, BA.1, and BA.2 using LP and PSL, respectively. Figure D.3 presents carbon

specific analysis of the wide type, Alpha, Beta, Gamma, BA.1, and BA.2. Figures D.4, D.5,

D.6, and D.7 demonstrate the PL analysis of Alpha, Beta, BA.2, and BA.2, respectively. The

spectral analysis of three major types of elements, namely carbon atoms, nitrogen atoms,

and oxygen atoms, is presented in these figures. Finally, D.8, D.9, D.10, D.11, and D.12

illustrate the PSL analysis of the wide type, Alpha, Beta, BA.2, and BA.2, respectively.

These figures display the spectral analysis of three major types of elements, namely carbon

atoms, nitrogen atoms, and oxygen atoms.

96

PDB ID

Method

Resolution (unit: Å)

Description

6M0J[80]

X-ray diffraction

8DLK[93] Electron microscopy

8DLN[93] Electron microscopy

8DLQ[93] Electron microscopy

7T9L[94] Electron microscopy

7XB0[82]

X-ray diffraction

7DF3[148] Electron microscopy

7LWS[51] Electron microscopy

7LYM[51] Electron microscopy

7TF8[50] Electron microscopy

7XIX[21] Electron microscopy

2.45

3.04

3.04

2.77

2.66

2.90

2.70

3.22

3.57

3.36

3.25

wild type RBD-ACE2

Alpha RBD-ACE2

Beta RBD-ACE2

Gamma RBD-ACE2

BA.1 RBD-ACE2

BA.2 RBD-ACE2

wild type spike

Alpha spike

Beta spike

BA.1 spike

BA.2 spike

Table D.1 Information of PDB 3D structures used in this work.

Figure D.1 Illustration of persistent Betti numbers of element specific persistent Laplacian
of the residue 501 mutation site at different filtration values, i.e., radii (unit: Å). The wild
type (PDB ID: 6M0J) and Alpha (PDB ID: 8DLK) are given in the first row. The Beta
(PDB ID: 8DLN) and Gamma (PDB ID: 8DLQ) are given in the second row. The BA.1
(PDB ID: 7T9L) and BA.2 (PDB ID: 7XB0) are given the third row.

97

Figure D.2 Illustration of persistent Betti numbers of element specific persistent sheaf Lapla-
cian of the residue 501 mutation site at different filtration values, i.e., radii (unit: Å). The
wild type (PDB ID: 6M0J) and Alpha (PDB ID: 8DLK) are given in the first row. The Beta
(PDB ID: 8DLN) and Gamma (PDB ID: 8DLQ) are given in the second row. The BA.1
(PDB ID: 7T9L) and BA.2 (PDB ID: 7XB0) are given the third row.

98

Figure D.3 Illustration of persistent Betti numbers (red line) and the first nonzero eigenvalues
(blue line) of persistent Laplacian of carbon atoms of the RBD binding site in RBD-ACE2
complex of wild type (PDB ID: 6M0J), Alpha (PDB ID: 8DLK), Beta (PDB ID: 8DLN),
BA.1 (PDB ID: 7T9L), and BA.2 (PDB ID: 7XB0) at different filtration values, i.e., radii
(unit: Å). The graphs from top to bottom represent the results of the wild type, Alpha,
Beta, BA.1, and BA.2 variants, respectively.

99

Figure D.4 Illustration of persistent Betti numbers (red line) and the first nonzero eigenvalues
(blue line) of persistent Laplacian of the RBD binding site of Alpha RBD-ACE2 complex
(PDB ID: 8DLK) and closed state spike protein (PDB ID: 7LWS, Chain ID: A) at different
filtration values, i.e., radii (unit: Å). The graphs from top to bottom represent the results
of carbon atoms, nitrogen atoms, and oxygen atoms, respectively.

Figure D.5 Illustration of persistent Betti numbers (red line) and the first nonzero eigenvalues
(blue line) of persistent Laplacian of the RBD binding site of Beta RBD-ACE2 complex (PDB
ID: 8DLN) and closed state spike protein (PDB ID: 7LYM, Chain ID: A, B, C) at different
filtration values, i.e., radii (unit: Å). The graphs from top to bottom represent the results
of carbon atoms, nitrogen atoms, and oxygen atoms, respectively.

100

Figure D.6 Illustration of persistent Betti numbers (red line) and the first nonzero eigenvalues
(blue line) of persistent Laplacian of the RBD binding site of BA.1 RBD-ACE2 complex
(PDB ID: 7T9L) and closed state spike protein (PDB ID: 7TF8, Chain ID: A, B, C) at
different filtration values, i.e., radii (unit: Å). The graphs from top to bottom represent the
results of carbon atoms, nitrogen atoms, and oxygen atoms, respectively.

Figure D.7 Illustration of persistent Betti numbers (red line) and the first nonzero eigenvalues
(blue line) of persistent Laplacian of the RBD binding site of BA.2 RBD-ACE2 complex
(PDB ID: 7XB0) and closed state spike protein (PDB ID: 7XIX, Chain ID: A) at different
filtration values, i.e., radii (unit: Å). The graphs from top to bottom represent the results
of carbon atoms, nitrogen atoms, and oxygen atoms, respectively.

101

Figure D.8 Illustration of persistent sheaf Betti numbers (red line) and the first nonzero
eigenvalues (blue line) of persistent sheaf Laplacian of the RBD binding site of wild type
RBD-ACE2 complex (PDB ID: 6M0J) and closed state spike protein (PDB ID: 7DF3, Chain
ID: A) at different filtration values, i.e., radii (unit: Å). The graphs from top to bottom
represent the results of carbon atoms, nitrogen atoms, and oxygen atoms respectively.

Figure D.9 Illustration of persistent sheaf Betti numbers (red line) and the first nonzero
eigenvalues (blue line) of persistent sheaf Laplacian of the RBD binding site of Alpha RBD-
ACE2 complex (PDB ID: 8DLK) and closed state spike protein (PDB ID: 7LWS, Chain ID:
A) at different filtration values, i.e., radii (unit: Å). The graphs from top to bottom represent
the results of carbon atoms, nitrogen atoms, and oxygen atoms, respectively.

102

Figure D.10 Illustration of persistent sheaf Betti numbers (red line) and the first nonzero
eigenvalues (blue line) of persistent sheaf Laplacian of the RBD binding site of Beta RBD-
ACE2 complex (PDB ID: 8DLN) and closed state spike protein (PDB ID: 7LYM, Chain ID:
A, B, C) at different filtration values, i.e., radii (unit: Å). The graphs from top to bottom
represent the results of carbon atoms, nitrogen atoms, and oxygen atoms, respectively.

Figure D.11 Illustration of persistent sheaf Betti numbers (red line) and the first nonzero
eigenvalues (blue line) of persistent sheaf Laplacian of the RBD binding site of BA.1 RBD-
ACE2 complex (PDB ID: 7T9L) and closed state spike protein (PDB ID: 7TF8, Chain ID:
A, B, C) at different filtration values, i.e., radii (unit: Å). The graphs from top to bottom
represent the results of carbon atoms, nitrogen atoms, and oxygen atoms, respectively.

103

Figure D.12 Illustration of persistent sheaf Betti numbers (red line) and the first nonzero
eigenvalues (blue line) of persistent sheaf Laplacian of the RBD binding site of BA.2 RBD-
ACE2 complex (PDB ID: 7XB0) and closed state spike protein (PDB ID: 7XIX, Chain ID:
A) at different filtration values, i.e., radii (unit: Å). The graphs from top to bottom represent
the results of carbon atoms, nitrogen atoms, and oxygen atoms, respectively.

104