APPLICATIONS OF PERSISTENT COHOMOLOGY TO DIMENSIONALITY REDUCTION AND CLASSIFICATION PROBLEMS

By

Luis G. Polanco Contreras

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Computational Mathematics, Science and Engineering – Doctor of Philosophy
Mathematics – Dual Major

2022

ABSTRACT

APPLICATIONS OF PERSISTENT COHOMOLOGY TO DIMENSIONALITY REDUCTION AND CLASSIFICATION PROBLEMS

By

Luis G. Polanco Contreras

Many natural phenomena are characterized by their underlying geometry and topological invariants. Part of understanding such processes is being able to differentiate and classify them through their topological and geometrical signatures. Many advances have been made which use topological data analysis to this end. In this work we present multiple machine learning tools, aided by topological data analysis, to classify and understand said phenomena.

First, feature extraction from persistence diagrams, as a tool to enrich machine learning techniques, has received increasing attention in recent years. In this work we explore an adaptive methodology to localize features in persistence diagrams, which are then used in learning tasks. Specifically, we investigate three algorithms, CDER, GMM and HDBSCAN, to obtain adaptive template functions/features. Said features are evaluated in three classification experiments with persistence diagrams; namely, manifold, human shape and protein classification. Our main conclusion in this area is that adaptive template systems, as a feature extraction technique, yield competitive and often superior results in the studied examples. Moreover, of the adaptive algorithms studied here, CDER consistently provides the most reliable and robust adaptive featurization.

Furthermore, we introduce a framework to construct coordinates in finite Lens spaces for data with nontrivial 1-dimensional $\mathbb{Z}_q := \mathbb{Z}/q\mathbb{Z}$ persistent cohomology, for $q > 2$ prime. Said coordinates are defined on an open neighborhood of the data, yet constructed with only a small subset of landmarks. We also introduce a dimensionality reduction scheme in $S^{2n-1}/\mathbb{Z}_q$ (Lens-PCA: LPCA) and demonstrate the efficacy of the pipeline $\mathbb{Z}_q$-persistent cohomology $\Rightarrow$ $S^{2n-1}/\mathbb{Z}_q$ coordinates $\Rightarrow$ LPCA for nonlinear (topological) dimensionality reduction. This methodology allows us to capture and preserve geometrical and topological information through a very efficient dimensionality reduction algorithm.

Finally, to make use of some of the most powerful tools in algebraic topology, we improve on methodologies that use 2-dimensional persistent homology to obtain quasiperiodicity scores that indicate the degree of periodicity or quasiperiodicity of a signal. This approach carries a significant computational disadvantage, since it requires the often expensive computation of 2-dimensional persistent homology. Our contribution in this area uses the algebraic structure of the cohomology ring to obtain classes in the 2-dimensional persistence diagram using only classes in dimension 1, saving valuable computational time and obtaining more reliable quasiperiodicity scores. We develop an algorithm that allows us to effectively compute the cohomological death and birth of a persistent cup product expression. This allows us to define a quasiperiodicity score that reliably separates periodic from quasiperiodic time series.

To my mother and my brother.

ACKNOWLEDGEMENTS

First and foremost, I would like to express my deepest gratitude to Dr. Jose A. Perea for the opportunity to pursue a Ph.D. at Michigan State University.
The advice I received during my time as his student helped me develop a deeper understanding and appreciation for topological data analysis and mathematics in general. He spent countless hours helping me develop and polish ideas, papers, and talks. I will be forever in your debt for your patience and guidance through many difficult times and hurdles.

I would like to thank my other committee members, Dr. Teena Gerhardt, Dr. Matthew Hedden, Dr. Mark Iwen, and Dr. Elizabeth Munch. I gained valuable knowledge through lectures and casual discussions, and I cherish the passion for research and teaching you all have shown me.

I would like to offer my special thanks to Dr. Hitesh Gakhar, Dr. Joshua L. Mike and Danielle Barns for their support with writing, coding, and research discussions that helped me navigate many difficult problems and challenges.

I would like to express my gratitude to Dr. Andres Angel Cardenas for guiding me through my undergraduate and master's studies. His inspiration and passion helped me forge my path. I can only strive to inspire others as he has inspired me.

I want to thank my friends Omar, Nicolas, Santiago, Mateo, Felipe, Sebastian, Sergio, Angelica, Dylan, and Jessica for always offering words of advice and putting a smile on my face all these years.

Last but not least, I would like to thank my mother and brother for their endless support and understanding. If anything, this work is a tribute to their unconditional love and sacrifices. Without your support, I would never have made it. Thank you!

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ALGORITHMS
CHAPTER 1 INTRODUCTION
CHAPTER 2 BACKGROUND
  2.1 Persistent homology
  2.2 Space of persistence diagrams
CHAPTER 3 ADAPTIVE TEMPLATE SYSTEMS
  3.1 Approximating Functions on Persistence Diagrams
    3.1.1 Template functions
  3.2 Adaptive template systems
    3.2.1 Cover-Tree Entropy Reduction - CDER
    3.2.2 Gaussian Mixture Models - GMM
    3.2.3 Hierarchical Density-Based Spatial Clustering of Applications with Noise - HDBSCAN
  3.3 Experimental Results
    3.3.1 Manifolds
    3.3.2 Shape data
    3.3.3 Protein classification
  3.4 Discussion
CHAPTER 4 PERSISTENT CUP PRODUCTS AND QUASI-PERIODICITY DETECTION
  4.1 Persistent cohomology
    4.1.1 Computing persistent cohomology
  4.2 Persistent cup products
  4.3 Algorithm
    4.3.1 Cohomological death
    4.3.2 Complexity analysis
  4.4 Examples
    4.4.1 Torus
    4.4.2 Periodic signal with nontrivial 2-dimensional persistence
  4.5 Quasi-periodicity detection
    4.5.1 ROC for quasi-periodicity detection
    4.5.2 Vocal folds
  4.6 Discussion
CHAPTER 5 LENS COORDINATES AND DATA COORDINATIZATION
  5.1 Preliminaries
    5.1.1 Persistent Cohomology
    5.1.2 Lens Spaces
      5.1.2.1 A Fundamental domain for $L^2_q(1, p)$
    5.1.3 Principal Bundles
  5.2 Main Theorem: Explicit Classifying Maps for $L^\infty_q$
  5.3 Lens coordinates for data
    5.3.1 Landmark selection
    5.3.2 A Partition of Unity subordinated to $\mathcal{B}_\epsilon$
    5.3.3 From Rips to Čech to Rips
  5.4 Dimensionality Reduction in $L^n_q$ via Principal Lens Components
    5.4.1 Inductive construction of LPCA
    5.4.2 Choosing a target dimension
    5.4.3 Independence of the cocycle representative
    5.4.4 Visualization map for $L^2_3$
  5.5 Examples
    5.5.1 The Circle $S^1$
    5.5.2 The Moore space $M(\mathbb{Z}_3, 1)$
    5.5.3 The Lens space $L^2_3 = S^3/\mathbb{Z}_3$
    5.5.4 Isomap dimensionality reduction
BIBLIOGRAPHY

LIST OF TABLES

Table 3.1 Manifold classification: Each row corresponds to the number of samples taken from each manifold. The first two columns show the best results reported in [5].
Table 3.2 Shape classification: Portion, out of the 10 problems, for which each adaptive template system yields the best classification results. The cells with a dash (-) indicate that no computations were carried out.
Table 3.3 Shape classification: Classification accuracy for adaptive templates (ours), tent functions and interpolating polynomials [5].
Table 3.4 Protein classification: Average classification accuracy for the 55 tasks in the data set PCB00019.
Table 4.1 Quasi-periodicity score (QPS) and cup quasi-periodicity score (CQPS) across the 7 vocal fold videos.
Table 5.1 Percentage of recovered variance in $L^n_3$.
Table 5.2 In green we highlight the fraction that indicates which method better identifies the topological features.

LIST OF FIGURES

Figure 1.1 Left: Periodic signal with underlying frequencies $\{1, 5\}$. Right: Quasi-periodic signal with underlying frequencies $\{5, 5\sqrt{3}\}$.
Figure 3.1 Left: Collection of persistence diagrams colored by class. Right: Mesh covering the collection on the left.
Figure 3.2 Left: Collection of persistence diagrams colored by class. Right: Collection of open balls as supports for template functions.
Figure 3.3 $\Omega' \subset \Omega$ but $S(\Omega) < S(\Omega')$.
Figure 3.4 Example of the 6 manifolds; from top left to bottom right: annulus, cube, 3 clusters, 3 clusters of 3 clusters, $S^2$ (projected on the xy-plane) and torus (projected on the xy-plane).
Figure 3.5 Examples of shapes and poses in the SHREC synthetic data set.
Figure 3.6 Examples of data points in PCB00019 and their corresponding persistence diagrams. Top row: Protein domains from [30]. Bottom row: Persistence diagrams in dimensions 0 and 1.
Figure 4.1 Simplicial complex structure on the torus.
Figure 4.2 Filtration of a simplicial complex representation of a torus in $\mathbb{R}^3$.
Figure 4.3 Cellular complex on the torus.
Figure 4.4 Time series $f(t) = \cos(5t) + \cos(5\sqrt{3}\,t)$.
Figure 4.5 Persistence diagram of $PH^*(SW_{d,\tau}(f); \mathbb{Z}_2)$ and persistence of the cup product of the 2 most persistent classes in $PH^1(SW_{d,\tau}(f); \mathbb{Z}_2)$.
Figure 4.6 Time series.
Figure 4.7 Persistence $PH^1(SW_{d,\tau}(f))$.
Figure 4.8 ROC curves for the original quasi-periodicity score in [21] and the cup quasi-periodicity score.
Figure 5.1 Left: Sample $X$, with landmark set $L \subset X$ in black. Right: $PH^i(\mathcal{R}(L); \mathbb{Z}_3)$ for $i = 0, 1, 2$.
Figure 5.2 Percentage of recovered variance.
Figure 5.3 Visualization $P_2(f(X)) \subset L^2_3$.
Figure 5.4 Left: $X \subset M(\mathbb{Z}_3, 1)$ with landmarks in black. Right: $PH^i(\mathcal{R}(L); \mathbb{Z}_3)$ for $i = 0, 1$.
Figure 5.5 $PH^i(\mathcal{R}(L); \mathbb{F})$ for $i = 0, 1$ and $\mathbb{F} = \mathbb{Z}_2, \mathbb{Z}_3$.
Figure 5.6 Percentage of recovered variance.
Figure 5.7 Visualization of the resulting $P_2(f(X)) \subset L^2_3$.
Figure 5.8 $PH^i(\mathcal{R}(L); \mathbb{F})$ for $i = 0, 1$ and $\mathbb{F} = \mathbb{Z}_2, \mathbb{Z}_3$.
Figure 5.9 Left: $X \subset L^2_3$. Right: Lens coordinates.
Figure 5.10 Percentage of recovered variance.

LIST OF ALGORITHMS

Algorithm 4.1 Reduction algorithm (pHcol)
Algorithm 4.2 Cohomological death of a cochain

CHAPTER 1

INTRODUCTION

The main objective of this work is to examine multiple problems in the field of data analysis for which the usual tools do not provide sufficiently satisfactory solutions. We develop different approaches to solve these problems by leveraging the inherent advantages of geometric and topological tools.

In Chapter 3 we tackle the question of how to take advantage of the information present in persistence diagrams and persistent homology to solve machine learning problems. To this end we develop a featurization method that allows us to extract meaningful feature vectors from persistence diagrams, which can then be fed into classical machine learning pipelines.

One of the central questions in Topological Data Analysis (TDA) is how to leverage topological information, like persistence diagrams [1], for machine learning purposes. This idea has been explored, for instance, in [2–5] and [6]. In particular, [5] establishes theoretical and computational tools to translate supervised machine learning tasks (e.g., classification and regression) with topological features into the problem of approximating continuous real-valued functions on the space of persistence diagrams, $\mathcal{D}$, endowed with the bottleneck distance. The main concept is that of templates. These are continuous real-valued compactly supported functions on $\mathcal{W} := \{(x_1, x_2) \in \mathbb{R}^2 \mid 0 \leq x_1 < x_2\}$, which (by integration against persistence measures, whereby a diagram is replaced by a sum of Dirac deltas) yield continuous functions on $\mathcal{D}$. The same work shows that one can construct countable families of template functions (a template system), which in turn give rise to dense subsets of $C(\mathcal{D}, \mathbb{R})$ with respect to the compact-open topology. Theorem 3.1.6 below indicates how a template system can be utilized to generate (polynomial) features for supervised machine learning problems on persistence diagrams.

In this work we address the question of producing template systems that are attuned (adaptive) to the input data set and the supervised classification problem at hand. We explore and compare different strategies to assemble adaptive template systems; namely, Cover-Tree Entropy Reduction (CDER) [7], Gaussian Mixture Models (GMM) [8] and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) [9]. The conclusion is that CDER is the most consistently successful strategy out of the ones explored.

We present three different examples where we use adaptive template functions to extract features from persistence diagrams for supervised classification tasks. First, we explore a 6-class classification problem presented in [5]. In this problem, several random samples are taken from each of 6 manifolds, and persistence diagrams are computed as descriptors in each case. We then use our adaptive template functions and compare to the results provided in [5]. The average classification accuracy of adaptive templates for both the training and testing sets is comparable to that of [5].
On the other hand, the standard deviation of our results is much smaller, making our methodology more stable compared to the template systems proposed in [5].

We then report results on the SHREC 2014 synthetic data set [10], which involves a 15-class supervised learning problem. Each class in this data set corresponds to a human body in five different poses and three different shapes: male, female and child. The data points in this data set are 3D meshes. In [4] a heat kernel signature is computed for each mesh and for 10 different kernel parameter values. This defines 10 different classification tasks, each of which uses the corresponding heat sub-level set persistence diagrams as inputs. The results we obtain using adaptive template functions are contrasted with [5]. The results from this experiment highlight how CDER provides a more reliable method for obtaining adaptive templates when compared to GMM and HDBSCAN. Furthermore, when we select the heat kernel signature corresponding to the 6th frequency, CDER adaptive templates generate a classification model with accuracy on par with the best results in [5] and [4]. When compared to the non-adaptive (tent) templates of [5], our classification results are superior.

Finally, we present results for a protein classification problem on the publicly available Protein Classification Benchmark data set PCB00019 [11]. This data set contains spatial information for 1,357 proteins, as well as 55 distinct supervised classification tasks. The results in [3] are used as a benchmark, since they also use persistence diagrams, but the extracted features are hand-crafted to reflect chemical/physical properties of interest. In this experiment, adaptive template functions improve the average classification accuracy reported in [3] from 82% to around 98%.

In Chapter 5 we develop a dimensionality reduction algorithm that is consistent with standard methods in topological data analysis. We present a dimensionality reduction algorithm that preserves topological invariants; moreover, the methodology aims to preserve specific topological features of the persistent homology across the dimensionality reduction process.

Another central question in Topological Data Analysis (TDA) is how to use topological signatures like persistent (co)homology [1] to infer spaces parametrizing a given data set [12–14]. This is relevant in nonlinear dimensionality reduction, since the presence of nontrivial topology—e.g., loops, voids, non-orientability, torsion, etc.—can prevent accurate descriptions with low-dimensional Euclidean coordinates.

Here we seek to address this problem motivated by two facts. The first: If $G$ is a topological abelian group, then one can associate to it a contractible space, $EG$, equipped with a free right $G$-action. For instance, if $G = \mathbb{Z}$, then $\mathbb{R}$ is a model for $E\mathbb{Z}$, with right $\mathbb{Z}$-action $\mathbb{R} \times \mathbb{Z} \ni (r, n) \mapsto r + n \in \mathbb{R}$. The quotient $BG := EG/G$ is called the classifying space of $G$ [15]. In particular $B\mathbb{Z} \simeq S^1$, $B\mathbb{Z}_2 \simeq \mathbb{RP}^\infty$, $BS^1 \simeq \mathbb{CP}^\infty$ and $B\mathbb{Z}_q \simeq S^\infty/\mathbb{Z}_q$; here $\simeq$ denotes homotopy equivalence. The second fact: If $B$ is a topological space and $\mathcal{C}_G$ is the sheaf over $B$ (defined in [16]) sending each open $U \subset B$ to the abelian group of continuous maps from $U$ to $G$, then $\check{H}^1(B; \mathcal{C}_G)$—the first Čech cohomology group of $B$ with coefficients in $\mathcal{C}_G$—is in bijective correspondence with $[B, BG]$—the set of homotopy classes of continuous maps from $B$ to the classifying space $BG$.
This bijection is a manifestation of the Brown representability theorem [17], and implies, in not so many words, that Čech cohomology classes can be represented as coordinates with values in a classifying space (like $S^1$ or $S^\infty/\mathbb{Z}_q$).

For point cloud data—i.e., for a finite subset $X$ of an ambient metric space $(\mathbb{M}, d)$—one does not compute Čech cohomology, but rather persistent cohomology. Specifically, the persistent cohomology of the Rips filtration on the data set $X$ (or a subset of landmarks $L$). The first main result of this work contends that steps one through three below mimic the bijection $\check{H}^1(B; \mathcal{C}_{\mathbb{Z}_q}) \cong [B, S^\infty/\mathbb{Z}_q]$ for $B \subset \mathbb{M}$ an open neighborhood of $X$:

1. Let $(\mathbb{M}, d)$ be a metric space and let $L \subset X \subset \mathbb{M}$ be finite. $X$ is the data and $L$ is a set of landmarks.

2. For a prime $q > 2$, compute $PH^1(\mathcal{R}(L); \mathbb{Z}_q)$, the 1-dimensional $\mathbb{Z}_q$-persistent cohomology of the Rips filtration on $L$. If the corresponding persistence diagram $\mathrm{dgm}(L)$ has an element $(a, b)$ so that $2a < b$, then let $a \leq \epsilon < b/2$ and choose a representative cocycle $\eta \in Z^1(R_{2\epsilon}(L); \mathbb{Z}_q)$ whose cohomology class has $(a, b)$ as its birth-death pair.

3. Let $B_\epsilon(l)$ be the open ball in $\mathbb{M}$ of radius $\epsilon$ centered at $l \in L = \{l_1, \ldots, l_n\}$, and let $\varphi = \{\varphi_l\}_{l \in L}$ be a partition of unity subordinate to $\mathcal{B} = \{B_\epsilon(l)\}_{l \in L}$. If $\zeta_q \neq 1$ is a $q$-th root of unity, then the cocycle $\eta$ yields a map $f : \bigcup \mathcal{B} \longrightarrow L^n_q$ to the Lens space $L^n_q = S^{2n-1}/\mathbb{Z}_q$, defined as the quotient by the free $\mathbb{Z}_q$-action generated by $(z_1, \ldots, z_n) \mapsto (e^{2\pi i/q} \cdot z_1, \ldots, e^{2\pi i/q} \cdot z_n)$. The map $f$ can be expressed in homogeneous coordinates by the formula
$$B_\epsilon(\ell_j) \ni b \;\mapsto\; f(b) = \left[\sqrt{\varphi_1(b)}\,\zeta_q^{\eta_{j1}} : \cdots : \sqrt{\varphi_n(b)}\,\zeta_q^{\eta_{jn}}\right],$$
where $\eta_{jk} \in \mathbb{Z}_q$ is the value of the cocycle $\eta$ on the edge $\{l_j, l_k\} \in R_{2\epsilon}(L)$.

If $X \subset \bigcup \mathcal{B}$, then $f(X) = Y \subset L^n_q$ is the representation of the data—in a potentially high-dimensional Lens space—corresponding to the cocycle $\eta$. The second contribution of this work is a dimensionality reduction procedure in $L^n_q$ akin to Principal Component Analysis, called LPCA. This allows us to produce from $Y$ a family of point clouds $P_k(Y) \subset L^k_q$, $1 \leq k \leq n$, $P_n(Y) = Y$, minimizing an appropriate notion of distortion. These are the Lens coordinates of $X$ induced by the cocycle $\eta$.

This work, combined with [18, 19], should be seen as one of the final steps in completing the program of using the classifying space $BG$, for $G$ abelian and finitely generated, to produce coordinates for data with nontrivial underlying first cohomology. Indeed, this follows from the fact that $B(G \oplus G') \simeq BG \times BG'$, and that if $G$ is finitely generated and abelian, then it is isomorphic to $\mathbb{Z}^n \oplus \mathbb{Z}_{n_1} \oplus \cdots \oplus \mathbb{Z}_{n_r}$ for unique integers $n, n_1, \ldots, n_r \geq 0$.

Part of understanding periodic processes is being able to differentiate them from quasi-periodic occurrences. These phenomena are characterized by collections of underlying frequencies. In the case of periodic signals, the underlying frequencies have a common fundamental frequency, while quasi-periodic processes are determined by linearly independent frequencies.

Figure 1.1 Left: Periodic signal with underlying frequencies $\{1, 5\}$. Right: Quasi-periodic signal with underlying frequencies $\{5, 5\sqrt{3}\}$.

Many tools have been developed to leverage topological data analysis to classify and understand quasi-periodic phenomena; some of the most recent approaches come from [20] and [21]. In the latter, the authors developed a framework which allows them to extract topological invariants, such as persistent cohomology, to construct different scores that quantify the degree of periodicity vs. quasi-periodicity of a given signal. This framework makes ample use of [22], where the authors show how the sliding window embedding produces a point cloud which accumulates around a torus of dimension equal to the number of independent frequencies in the signal.
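For concreteness, the sliding window (time delay) embedding of a signal $f$ with parameters $d$ and $\tau$ is $SW_{d,\tau} f(t) = (f(t), f(t+\tau), \ldots, f(t+d\tau))$. Below is a minimal sketch of this construction; the quasi-periodic signal is the one from Figure 1.1, but the values of $d$, $\tau$ and the sample grid are illustrative assumptions, not the tuned parameters used in later chapters.

```python
import numpy as np

def sliding_window(f, d, tau, t_vals):
    # SW_{d,tau} f(t) = (f(t), f(t + tau), ..., f(t + d*tau)) for each t in t_vals
    return np.array([f(t + tau * np.arange(d + 1)) for t in t_vals])

# Quasi-periodic example: incommensurate frequencies 5 and 5*sqrt(3)
f = lambda t: np.cos(5 * t) + np.cos(5 * np.sqrt(3) * t)

# Point cloud in R^(d+1); it accumulates around a 2-dimensional torus
X = sliding_window(f, d=20, tau=0.1, t_vals=np.linspace(0, 20, 500))
print(X.shape)  # (500, 21)
```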
With this point cloud in place it is possible to compute its persistent cohomology, using the Ripser algorithm [23], to recover topological signatures corresponding to such toroidal spaces. In particular, the methodology presented in [21] relies heavily on computing persistent cohomology up to dimension 2, and on the corresponding persistence diagrams, to differentiate between periodic and quasi-periodic signals.

However, it is possible to generate periodic signals whose sliding window embedding does not recover a 2-dimensional torus, while still producing persistence diagrams that suggest this toroidal structure. We construct in Section 4.4.2 one such example, using the tools exhibited in [22]. More specifically, we produce a periodic signal whose time delay embedding clusters around a space we refer to as a "bunny"; in other words, a space with topological invariants similar to those of a sphere ($S^2$) with two loops ($S^1$) attached to it, similar to the ears of a bunny, hence the name.

One of the main goals of this work is to refine this methodology in order to overcome this type of false positive. To achieve this we use the richer algebraic structure of the cohomology ring in the form of the cup product [24]. This algebraic tool allows us to enhance persistence diagrams so as to further study the interaction between loops and voids in the topological structure of the spaces we analyze.

In Section 4.2 and Section 4.3 we present an algorithm to obtain the persistence of cup chains, that is, persistent cochains that arise from the cup product of two classes in the persistence diagram. This algorithm allows us to effectively compute the cohomological death and birth of a persistent cup product expression by taking advantage of some of the efficiencies baked into the Ripser algorithm [23]. We use this approach to concretely obtain the persistence of classes in the 2-dimensional persistent cohomology while only using classes in dimension 1. In Section 4.5 we present the improved quasi-periodicity scores of [21] using the persistent cup product, and showcase their performance on both synthetic and real data sets.

In addition to this, we exhibit in Section 4.3 how the proposed algorithm provides computational improvements, since it allows us to obtain the persistence of classes in the 2-dimensional persistence diagram by only computing generators for classes in the 1-dimensional persistence diagram. This procedure eliminates the need to enumerate the 3-simplices in the Rips complex or to make any computation/reduction using the 2-coboundary operator. This results in improved computational efficiency when obtaining scores to separate periodic and quasi-periodic signals.

CHAPTER 2

BACKGROUND

Persistent homology has emerged as an increasingly useful tool in the areas of machine learning and data science from the perspective of Topological Data Analysis.
One of the main contributions it has provided to the field comes in the form of methodologies aimed at capturing the geometrical and topological features to be used for machine learning tasks and dimensionality reduction problems.

We first provide some basic definitions as the building blocks for the work presented here. Since we will be working with persistent homology, we need to introduce some basic notions from algebraic topology; most of the definitions and results in this section are taken from [25] and [26].

Definition 2.0.1. Given a set $X$, an abstract simplicial complex $K$ is a collection of subsets of $X$, i.e. $K \subset P(X)$, such that

1. every $\sigma \in K$ is finite,
2. for any $\sigma \in K$, if $\tau \subset \sigma$ then $\tau \in K$.

We will use the following common notation from this point on. The sets in $K$ are called simplices, and the dimension of a simplex is defined as $\dim(\sigma) = |\sigma| - 1$; since by definition each simplex is finite, we define $\dim(K) = \max\{\dim(\sigma)\}$. A face of $\sigma$ is any $\emptyset \neq \tau \subsetneq \sigma$. The $n$-th skeleton of $K$ is denoted by $K^{(n)} := \{\sigma \in K : \dim(\sigma) \leq n\}$; $K^{(0)}$ is also called the set of vertices of $K$. A subcomplex $L$ of $K$ is an abstract simplicial complex such that $L^{(n)} \subset K^{(n)}$ for all $n \in \mathbb{Z}$.

Definition 2.0.2. A simplicial map $f : K \to L$ between simplicial complexes is a map such that $f(K^{(0)}) \subset L^{(0)}$.

Now we define the algebraic structures required to define homology. For a given simplicial complex $K$ and abelian group $G$ we have the following.

Definition 2.0.3. The $n$-th chain group $C_n(K; G)$ of $K$ with coefficients in the group $G$ is
$$C_n(K; G) := \left\{ \sum_{i \in I} g_i \sigma_i \;:\; \sigma_i \in K^{(n)} \setminus K^{(n-1)},\; g_i \in G, \text{ and } g_i = 0 \in G \text{ for all but finitely many } i \in I \right\}.$$

We will refer to each $\tau \in C_n(K; G)$ as a chain, and to each $\sigma \in K^{(n)}$ as a generator of $C_n(K; G)$. We will use the notation $C_n(K)$ if there is no ambiguity about the group $G$ being used. Among the more common choices for $G$ are $\mathbb{Z}$ and $\mathbb{Z}_q = \mathbb{Z}/q\mathbb{Z}$ for a prime number $q \in \mathbb{Z}$; for the computations and examples presented in this work we will use $G = \mathbb{Z}_2$ unless we indicate otherwise.

Definition 2.0.4. Let $K$ be a simplicial complex and $G$ a group. The $n$-th boundary map $\partial_n$ is the group homomorphism $\partial_n : C_n(K; G) \to C_{n-1}(K; G)$ defined for any $\sigma = [v_0, \ldots, v_n] \in K^{(n)}$ as
$$\partial_n([v_0, \ldots, v_n]) := \sum_{i=0}^{n} (-1)^i [v_0, \ldots, \hat{v}_i, \ldots, v_n], \tag{2.1}$$
where $[v_0, \ldots, \hat{v}_i, \ldots, v_n]$ denotes the $i$-th face of $\sigma$, obtained by deleting the vertex $v_i$ from the set $\{v_0, \ldots, v_n\}$.

Definition 2.0.5. A chain complex is a collection $C_* = \{C_i, f_i\}_{i \in I}$ of groups $C_i$ and morphisms $f_i : C_i \to C_{i-1}$ such that $f_{i-1} f_i = 0$. This condition implies that $\mathrm{img}(f_i) \subset \ker(f_{i-1})$ for all $i \in I$.

Example 1. The collection $\{C_n(K), \partial_n\}$ of chain groups of a simplicial complex $K$, together with the boundary maps defined above, forms a chain complex. A proof of this is completely analogous to Lemma 2.1 in [25].

Definition 2.0.6. The $i$-th homology group of a chain complex $C_*$ with coefficients in a group $G$ is defined as
$$H_i(C_*; G) := \frac{\ker(f_i)}{\mathrm{img}(f_{i+1})}. \tag{2.2}$$

Example 2. The homology groups of the chain complex $\{C_i(K; G), \partial_i\}_{i \in \mathbb{N}}$ are simply denoted by $H_i(K; G)$.
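Over $G = \mathbb{Z}_2$ these definitions are directly computable: boundary maps become binary matrices and $\dim H_i = \dim\ker(\partial_i) - \mathrm{rank}(\partial_{i+1})$. A minimal illustrative sketch follows; the hollow-triangle complex is a hypothetical example chosen for brevity, not one from the text.

```python
import numpy as np

def rank_gf2(M):
    # Gaussian elimination over Z/2Z; returns the rank of a binary matrix
    M = M.copy() % 2
    rank = 0
    for col in range(M.shape[1]):
        pivots = np.nonzero(M[rank:, col])[0]
        if pivots.size == 0:
            continue
        M[[rank, pivots[0] + rank]] = M[[pivots[0] + rank, rank]]  # swap pivot up
        for r in np.nonzero(M[:, col])[0]:
            if r != rank:
                M[r] = (M[r] + M[rank]) % 2                        # clear column
        rank += 1
    return rank

# Hollow triangle: vertices {0,1,2}, edges {01, 02, 12}, no 2-simplex.
# Rows of d1 index vertices, columns index edges; entry 1 if the vertex is a face.
d1 = np.array([[1, 1, 0],
               [1, 0, 1],
               [0, 1, 1]])
betti_0 = d1.shape[0] - rank_gf2(d1)        # dim ker(d0) - rank(d1), d0 = 0
betti_1 = d1.shape[1] - rank_gf2(d1) - 0    # dim ker(d1) - rank(d2), d2 = 0
print(betti_0, betti_1)                     # 1 1: the homology of a circle
```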
Since we will be working with data sets, we need a method to obtain simplicial complexes out of them. To achieve this we can view a data set as a finite metric space. For example, if the data set is a subset of $\mathbb{R}^n$ it has a natural metric associated to it; when this is not the case, the user can always define a suitable metric for the problem at hand. Given a finite metric space $(X, d)$ we can construct multiple simplicial complexes; we will focus on two particular constructions.

Definition 2.0.7. Given a finite metric space $(X, d)$ and a real number $\epsilon > 0$, we define the Rips complex $R_\epsilon(X)$ to be the simplicial complex with vertex set $X$ and $n$-simplices
$$\left\{ (x_0, \ldots, x_n) : d(x_i, x_j) < 2\epsilon,\; 0 \leq i, j \leq n \right\}.$$

Definition 2.0.8. Given $\epsilon > 0$, the Čech complex $C_\epsilon(X)$ on $(X, d)$ is the simplicial complex with vertex set $X$ and such that $(x_0, \ldots, x_n) \in C_\epsilon(X)$ if and only if
$$\bigcap_{j=0}^{n} B_\epsilon(x_j) \neq \emptyset,$$
where $B_\epsilon(x_j) = \{y \in X : d(x_j, y) < \epsilon\}$.

There are some important relations between the Vietoris-Rips and Čech complexes of a metric space. First of all, $C_\epsilon(X) \subset R_\epsilon(X)$, since the Vietoris-Rips complex contains every simplex warranted by the given edges, which is not true for the Čech complex, thus making the Vietoris-Rips complex larger in general. But one can also prove that $R_\epsilon(X) \subset C_{2\epsilon}(X)$ (see Section III.3 in [26]), giving us in total
$$C_\epsilon(X) \subset R_\epsilon(X) \subset C_{2\epsilon}(X). \tag{2.3}$$

This is useful to us since the metric spaces we will consider are random samples of manifolds $M$. In that case, if the sample $X \subset M$ is large enough, we can find $\epsilon > 0$ so that the open covering $\{B_\epsilon(x)\}_{x \in X}$ is in fact a good covering of $M$, and therefore recovers the topological information of the underlying $M$. This suggests that to recover the topological information we are forced to use the Čech complex, which is computationally expensive for large vertex sets. But (2.3) provides a method to approximate the Čech complex using the Vietoris-Rips complex, which is much faster to compute since we do not need to verify intersections of multiple open balls. Thus, from this point on and for all the computations presented, we will use the Vietoris-Rips complex, knowing that it provides a faithful approximation of the topological features we will be working with.

2.1 Persistent homology

Definition 2.1.1. A filtered complex $K$ is a collection of simplicial complexes $\{K_i\}_{i=0}^{n}$ such that
$$K_0 \subset K_1 \subset K_2 \subset \cdots \subset K_n.$$

Definition 2.1.2. A persistence complex is a family of chain complexes $\{C_*^i\}_i$ together with chain maps $f^i : C_*^i \to C_*^{i+1}$.

A filtered complex $K$ produces a persistence complex in a natural way, using the chain maps induced by the inclusions:
$$\begin{array}{ccccccc}
\cdots \longrightarrow & C_2(K_2) & \xrightarrow{\;\partial_2\;} & C_1(K_2) & \xrightarrow{\;\partial_1\;} & C_0(K_2) & \longrightarrow 0 \\
 & \uparrow{\scriptstyle i^1} & & \uparrow{\scriptstyle i^1} & & \uparrow{\scriptstyle i^1} & \\
\cdots \longrightarrow & C_2(K_1) & \xrightarrow{\;\partial_2\;} & C_1(K_1) & \xrightarrow{\;\partial_1\;} & C_0(K_1) & \longrightarrow 0 \\
 & \uparrow{\scriptstyle i^0} & & \uparrow{\scriptstyle i^0} & & \uparrow{\scriptstyle i^0} & \\
\cdots \longrightarrow & C_2(K_0) & \xrightarrow{\;\partial_2\;} & C_1(K_0) & \xrightarrow{\;\partial_1\;} & C_0(K_0) & \longrightarrow 0
\end{array}$$

Now we give the definition of the main object that we will use throughout this work.

Definition 2.1.3. Let $R$ be a ring and $I$ a totally ordered index set; a persistence module is a family of $R$-modules $M^i$ together with homomorphisms $\phi_{i,j} : M^i \to M^j$ for $i \leq j \in I$. Some of the most commonly used index sets are $I = \mathbb{Z}$, $\mathbb{R}_+$, and $[a, b]$ with $a, b \in \mathbb{R}$.

The homology groups of the persistence complex obtained from a filtered complex form a persistence module, where the maps are the ones induced by the inclusions:
$$PH_i(K; \mathbb{Z}) : \; H_i(K_0; \mathbb{Z}) \xrightarrow{\;i^0_*\;} H_i(K_1; \mathbb{Z}) \xrightarrow{\;i^1_*\;} H_i(K_2; \mathbb{Z}) \xrightarrow{\;i^2_*\;} \cdots$$

More precisely, one has the following definition.

Definition 2.1.4. The $i$-th persistent homology groups are the images of the homomorphisms induced by the inclusions, $H_i^{r,s} = \mathrm{img}(f_i^{r,s})$, where $f_i^{r,s} : H_i(K_r) \to H_i(K_s)$. The corresponding $i$-th persistent Betti numbers are the ranks of these groups, namely $\beta_i^{r,s} = \mathrm{rank}(H_i^{r,s})$.
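In practice these groups are computed directly from a point cloud. The following minimal sketch uses the ripser Python package; the noisy-circle sample and parameter values are illustrative assumptions.

```python
import numpy as np
from ripser import ripser

# Sample 200 noisy points from a circle
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
X = np.column_stack([np.cos(theta), np.sin(theta)]) + 0.05 * rng.normal(size=(200, 2))

# Persistent homology of the Vietoris-Rips filtration with Z/2Z coefficients
dgms = ripser(X, maxdim=1, coeff=2)['dgms']
print(dgms[1])  # dimension-1 (birth, death) pairs; one long interval for the circle
```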
Since these algebraic objects are in general difficult to understand, we would like to have a decomposition into simpler pieces. The atomic or basic pieces used for this decomposition are interval modules.

Definition 2.1.5. An interval module is a persistence module $\mathbb{I}_{[r,s)} = \{I_a, i_a^b\}$ where $I_a = G$ if $a \in [r, s)$ and $0$ otherwise, while $i_a^b : I_a \to I_b$ corresponds to the identity whenever $r \leq a \leq b < s$.

Under adequate conditions one can show (see [27]) that each persistence module obtained by computing persistent homology can be decomposed as
$$PH_i(K) = \bigoplus_{[r,s) \in A} \mathbb{I}_{[r,s)}. \tag{2.4}$$

2.2 Space of persistence diagrams

We start by defining a persistence diagram $S_\mu$ as the pair $(S, \mu)$ where

1. $S \subset \mathcal{W} := \{(x, y) \in \mathbb{R}^2 \mid 0 \leq x < y\}$ is such that for any $\epsilon > 0$ the set $u_\epsilon = \{(x, y) \in S \mid y - x > \epsilon\}$ is finite, and
2. $\mu : S \to \mathbb{N}$.

We also define the persistence of a point $(x, y) \in S$ as $\mathrm{pers}((x, y)) := y - x$. If we denote by $\mathcal{D}$ the set of persistence diagrams, we can consider the bottleneck distance $d_B : \mathcal{D} \times \mathcal{D} \to \mathbb{R}_+$ to make $(\mathcal{D}, d_B)$ a complete metric space.

Definition 2.2.1. A partial matching $M$ between persistence diagrams $S_\mu$ and $T_\alpha$ is a bijection between subsets of $S_\mu$ and subsets of $T_\alpha$, i.e. $M : S'_\mu \to T'_\alpha$ is a bijection where $S'_\mu \subset S_\mu$ and $T'_\alpha \subset T_\alpha$. If $(y, n) = M(x, m)$ we say that $(x, m)$ is matched with $(y, n)$. If $(z, k) \in S_\mu \setminus S'_\mu$ or $(z, k) \in T_\alpha \setminus T'_\alpha$, we call it unmatched.

Definition 2.2.2. Given $\delta > 0$ and a partial matching $M$, we say $M$ is a $\delta$-matching if:

• whenever $(x, m) \in S'_\mu$ is matched with $(y, n)$, then $\|x - y\|_\infty < \delta$; and
• whenever $(z, m) \in S_\mu \cup T_\alpha$ is unmatched, then $\mathrm{pers}(z) < 2\delta$.

Definition 2.2.3. The bottleneck distance $d_B : \mathcal{D} \times \mathcal{D} \to \mathbb{R}_+$ is defined as
$$d_B(D_1, D_2) = \inf\{\delta > 0 : \text{there is a } \delta\text{-matching between } D_1 \text{ and } D_2\}.$$

In [28] the authors prove that $d_B$ defines a metric on $\mathcal{D}$, and that $\mathcal{D}$ is the metric completion under the bottleneck distance of $\mathcal{D}_0 := \{(S, \mu) \in \mathcal{D} \mid S \text{ is finite}\}$.

CHAPTER 3

ADAPTIVE TEMPLATE SYSTEMS

3.1 Approximating Functions on Persistence Diagrams

The goal of this section is to provide the theoretical framework in which template functions are used as a means of approximating continuous functions on the space of persistence diagrams.

Definition 3.1.1. The bottleneck distance $d_B : \mathcal{D} \times \mathcal{D} \to \mathbb{R}_+$ is defined as
$$d_B(D_1, D_2) = \inf\{\delta > 0 : M : D_1 \to D_2 \text{ is a } \delta\text{-matching}\}.$$

In [28] it is shown that $d_B$ defines a metric on $\mathcal{D}$ and that $\mathcal{D}$ is the metric completion of $\mathcal{D}_0 := \{(S, \mu) \in \mathcal{D} \mid S \text{ is finite}\}$ with respect to $d_B$.

In [5] the authors present a complete characterization of (relatively) compact subsets of $(\mathcal{D}, d_B)$. In particular, one can prove that compact subsets of $(\mathcal{D}, d_B)$ have empty interiors (see Theorem 13 in [5]). This implies that $(\mathcal{D}, d_B)$ is not locally compact, and therefore the compact-open topology on $C(\mathcal{D}, \mathbb{R})$, the space of real-valued continuous functions on $\mathcal{D}$, is not metrizable.

3.1.1 Template functions

Lemma 3.1.2. Let $C_c(\mathcal{W})$ denote the set of real-valued continuous functions on $\mathcal{W}$ with compact support. For each $f \in C_c(\mathcal{W})$, the function $\nu_f : \mathcal{D} \to \mathbb{R}$ defined as
$$\nu_f(S_\mu) = \sum_{x \in S} \mu(x) f(x)$$
is continuous.

Proof. See Lemma 23 in [5].

Definition 3.1.3. A coordinate system for $\mathcal{D}$ is a collection $\mathcal{F} \subset C(\mathcal{D}, \mathbb{R})$ with the following property: for any two distinct $D, D' \in \mathcal{D}$, there exists $F \in \mathcal{F}$ such that $F(D) \neq F(D')$.

Remark 3.1.4. $\mathcal{F} = C(\mathcal{D}, \mathbb{R})$ is itself a coordinate system but, at least for computational purposes, it is too large to be of algorithmic use.

Definition 3.1.5. A template system for $\mathcal{D}$ is a set $\mathcal{T} \subset C_c(\mathcal{W})$ such that $\{\nu_f : f \in \mathcal{T}\}$ is a coordinate system for $\mathcal{D}$. The elements of $\mathcal{T}$ are called template functions.
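The feature map $\nu_f$ of Lemma 3.1.2 is straightforward to evaluate on a finite diagram. A minimal sketch follows; the truncated bump template and the two-point diagram are hypothetical illustrations, not the template systems used later.

```python
import numpy as np

def nu(f, diagram, multiplicities=None):
    # nu_f(S_mu) = sum over diagram points x of mu(x) * f(x)
    if multiplicities is None:
        multiplicities = np.ones(len(diagram))
    return sum(m * f(x) for x, m in zip(diagram, multiplicities))

# A compactly supported template: a truncated bump centered at (0.5, 1.0)
f = lambda x: max(0.0, 1.0 - 10 * ((x[0] - 0.5)**2 + (x[1] - 1.0)**2))

dgm = np.array([[0.4, 1.1], [0.0, 0.2]])   # two (birth, death) pairs
print(nu(f, dgm))                           # a scalar feature for this diagram
```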
The main utility of template systems is that they can be used to construct dense subsets of $C(\mathcal{D}, \mathbb{R})$ with respect to the compact-open topology. In this topology, which is not metrizable as mentioned above, two functions are deemed nearby if their values on compact sets are similar. Since the space of persistence diagrams is rather large and complicated, such comparisons (weaker than $L^2$ or $\|\cdot\|_\infty$) are desirable.

Theorem 3.1.6. Let $\mathcal{T}$ be a template system for $\mathcal{D}$, let $\mathcal{C} \subset \mathcal{D}$ be compact, and let $F : \mathcal{C} \to \mathbb{R}$ be continuous. Then, for any $\epsilon > 0$ there exist $N \in \mathbb{N}$, a polynomial $p \in \mathbb{R}[t_1, \ldots, t_N]$, and template functions $f_1, \ldots, f_N \in \mathcal{T}$ so that
$$|p(\nu_{f_1}(D), \ldots, \nu_{f_N}(D)) - F(D)| < \epsilon \tag{3.1}$$
for all $D \in \mathcal{C}$.

Proof. See Theorem 29 in [5].

Corollary 3.1.7. Let $\mathcal{T} \subset C_c(\mathcal{W})$ be a template system for $\mathcal{D}$. Then the collection of functions of the form
$$\mathcal{D} \longrightarrow \mathbb{R}, \qquad D \mapsto p(\nu_{f_1}(D), \ldots, \nu_{f_N}(D)),$$
where $N \in \mathbb{N}$, $p \in \mathbb{R}[t_1, \ldots, t_N]$, and $f_n \in \mathcal{T}$, is dense in $C(\mathcal{D}, \mathbb{R})$ with respect to the compact-open topology.

The problem of constructing template systems of reasonable size (e.g., countable) is addressed by the following theorem.

Theorem 3.1.8. Let $f \in C_c(\mathcal{W})$, $n \in \mathbb{N}$, $\mathbf{m} \in \mathbb{Z}^2$ and define
$$f_{n,\mathbf{m}}(x) = f\!\left(nx + \frac{\mathbf{m}}{n}\right).$$
If $f$ is nonzero, then $\mathcal{T} = \{f_{n,\mathbf{m}} \mid n \in \mathbb{N}, \mathbf{m} \in \mathbb{Z}^2\} \cap C_c(\mathcal{W})$ is a template system for $\mathcal{D}$.

Proof. See Theorem 30 in [5].

The goal of this chapter is to investigate and identify data-driven methodologies for selecting the support of an initial template function $f$, as well as its most relevant rescaled translates $f_{n,\mathbf{m}}$, so that the scalar features $\nu_{f_{n,\mathbf{m}}}$ can be used successfully in learning problems on persistence diagrams.

3.2 Adaptive template systems

In [5] two different template systems are suggested: tent functions and interpolating polynomials. Specifically, let
$$\widetilde{\mathcal{W}} = \{(x, y) \mid x \in \mathbb{R},\, y \in \mathbb{R}_{>0}\}$$
be the conversion of $\mathcal{W}$ to the birth-lifetime plane. This conversion is given by $\mathcal{W} \ni (a, b) \mapsto (a, b - a) \in \widetilde{\mathcal{W}}$. The template system of tent functions in the birth-lifetime plane is defined as follows. Let $\mathbf{a} = (a, b) \in \widetilde{\mathcal{W}}$ and $0 < \delta < b$; then
$$g_{\mathbf{a},\delta}(x, y) = \max\left\{1 - \frac{1}{\delta}\max\{|x - a|, |y - b|\},\; 0\right\}.$$

In a similar manner one can define a template system of interpolating polynomials. Given $\{(a_i, b_j)\}_{i,j} \subset \widetilde{\mathcal{W}}$ and $\{c_{i,j}\}_{i,j} \subset \mathbb{R}$, one can use Lagrange interpolating polynomials to construct a function $f$ such that $f(a_i, b_j) = c_{i,j}$.

In general these two approaches require the user to input the meshes used to define the template systems. By construction, such meshes define the supports of the template functions. One shortcoming of this procedure, when applied to Theorem 3.1.6, is that without prior knowledge about the compact set $\mathcal{C} \subset \mathcal{D}$, the number of template functions that carry no information relevant to the problem can be high. This drawback is illustrated in Figure 3.1.

Figure 3.1 Left: Collection of persistence diagrams colored by class. Right: Mesh covering the collection on the left.

The main goal of this chapter is to present a methodology for defining the template system used in Theorem 3.1.6 that incorporates the prior information we have about the particular learning task. Such a methodology is what we refer to as adaptive template functions. Our approach to defining adaptive template functions is to first identify a collection of open ellipses in $\widetilde{\mathcal{W}}$ or $\mathcal{W}$, as in Figure 3.2. Each ellipse in this collection will be the support of a template function defined in the following manner. Let $A \in M_{2\times 2}(\mathbb{R})$ represent the quadratic form in two variables corresponding to an ellipse in the collection, and let $x \in \mathcal{W}$ be its center. Then, the associated template function is
$$f_A(z) = \begin{cases} 1 - (z - x)^* A (z - x) & \text{if } (z - x)^* A (z - x) < 1, \\ 0 & \text{if } (z - x)^* A (z - x) \geq 1. \end{cases}$$

Figure 3.2 Left: Collection of persistence diagrams colored by class. Right: Collection of open balls as supports for template functions.

To obtain the collection of ellipses $\{A\}$ mentioned above, we will use and compare three different approaches; namely, Cover-Tree Entropy Reduction (CDER) [7], Gaussian Mixture Models (GMM) and Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN).
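For instance, ellipse supports can be read off from the covariance matrices of a fitted Gaussian mixture. The sketch below is one plausible realization, not our exact implementation: the function names and the choice of a 3-standard-deviation support are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def ellipse_featurizer(birth_lifetime_points, n_components=5, scale=3.0):
    # Fit a GMM; each component gives a center x and a quadratic form A
    gmm = GaussianMixture(n_components=n_components).fit(birth_lifetime_points)
    # A = precision / scale^2, so the support of f_A is the `scale`-sigma ellipse
    forms = [np.linalg.inv(cov) / scale**2 for cov in gmm.covariances_]
    centers = gmm.means_

    def f_A(z, A, x):
        q = (z - x) @ A @ (z - x)
        return 1.0 - q if q < 1.0 else 0.0

    def featurize(D):
        # One scalar per ellipse: nu_{f_A}(D) for a diagram D in birth-lifetime form
        return np.array([sum(f_A(z, A, x) for z in D)
                         for A, x in zip(forms, centers)])
    return featurize

# Usage: fit on all training diagrams stacked together, then featurize each one
# featurize = ellipse_featurizer(np.vstack(train_dgms)); v = featurize(train_dgms[0])
```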
We now provide a brief description of each method.

3.2.1 Cover-Tree Entropy Reduction - CDER

The main objective of this algorithm is to find a partial cover tree for a collection of labeled point clouds in $\mathbb{R}^n$. CDER searches for the cover tree in convex regions that are likely to have minimum local entropy. In this section we explain the notion of entropy used by CDER, as it is relevant to explaining some of the results in Section 3.3.

Let $\chi = \{X_1, \ldots, X_N\}$ be a collection of point clouds $X_i \subset \mathbb{R}^d$—which in our case will be persistence diagrams—and define $\overline{\chi} := \bigsqcup_{i=1}^{N} X_i$. We also have a labeling map at the level of point clouds, $\lambda : \chi \to \{1, \ldots, L\} =: \mathcal{L}$. Notice that we have a natural map $\mathrm{ind} : \overline{\chi} \to \chi$ given by $\mathrm{ind}(x) = X_i$ if $x \in X_i$. This allows us to define
$$\overline{\chi}_l := (\lambda \circ \mathrm{ind})^{-1}(l),$$
the set of all points in a point cloud labeled $l \in \mathcal{L}$. Now we assign weights to each point $x \in \overline{\chi}$ in the following manner:

1. Each label is equally likely among the data, and $\overline{\chi}$ has a total weight of 1. Thus each label $l \in \mathcal{L}$ has an allocated weight of $1/L$.

2. Now consider $\lambda^{-1}(l) = \{\text{all point clouds with label } l\}$; we assume again that each point cloud in $\lambda^{-1}(l)$ is equally likely, so each $X_i \in \lambda^{-1}(l)$ has an allocated weight of $1/(N_l L)$, where $N_l = |\lambda^{-1}(l)|$.

3. Finally, each point $x \in X_i$ is equally likely, and thus
$$w(x) = \frac{1}{|X_i| \, N_l \, L}.$$

Definition 3.2.1. Let $\chi = \{X_1, \ldots, X_N \mid X_i \subset \mathbb{R}^d\}$ be a collection of point clouds together with a label map $\lambda : \chi \to \{1, \ldots, L\} =: \mathcal{L}$ and a weight function $w : \overline{\chi} \to \mathbb{R}$. For any convex and compact set $\Omega \subset \mathbb{R}^d$ we define the following quantities:

1. the total weight of $l \in \mathcal{L}$ in $\Omega$,
$$w_l(\Omega) = \sum \left\{ w(x) \mid x \in \Omega \cap \overline{\chi}_l \right\};$$

2. the total weight of $\Omega$,
$$W(\Omega) = \sum_{l \in \mathcal{L}} w_l(\Omega).$$

Using the weights assigned above, we can interpret $\frac{w_l(\Omega)}{W(\Omega)}$ as the probability that a point in $\Omega$ has label $l$. This leads to the following definition.

Definition 3.2.2. Let $\chi$, $\lambda$, and $w$ be as before. For any convex and compact set $\Omega \subset \mathbb{R}^d$, its entropy is defined by
$$S(\Omega) = -\sum_{l \in \mathcal{L}} \frac{w_l(\Omega)}{W(\Omega)} \log_L\!\left(\frac{w_l(\Omega)}{W(\Omega)}\right).$$

This definition is borrowed from information theory. Notice that if all $w_l(\Omega)$ are roughly the same, then $S(\Omega) \approx 1$. But if, for example, $\Omega$ only contains points with a single label, then we must have $S(\Omega) = 0$.

Remark 3.2.3. Generally, we would expect the entropy to decrease as we select smaller subsets of $\Omega$. However, this is not always the case.

Figure 3.3 $\Omega' \subset \Omega$ but $S(\Omega) < S(\Omega')$.

For instance, consider point clouds $X_1$ and $X_2$ such that $|X_1| = 10$ and $|X_2| = 20$, each point cloud with a different label in $\{0, 1\}$, and compact sets $\Omega, \Omega'$ as shown in Figure 3.3. We can easily see that $\omega_0(\Omega') = \omega_1(\Omega') = 7/(20 \cdot 20 \cdot 2)$, so $W(\Omega') = 7/(20 \cdot 20 \cdot 2) + 7/(20 \cdot 20 \cdot 2) = 7/400$, and finally $S(\Omega') = 1$. A similar calculation will show that $S(\Omega) \approx 0.9182$, and thus $\Omega' \subset \Omega$ but $S(\Omega) < S(\Omega')$.
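Definition 3.2.2 transcribes directly into code. A minimal sketch follows; the point weights and labels below are hypothetical values for a region $\Omega$, not taken from the example above.

```python
import numpy as np

def entropy(weights, labels, num_labels):
    # S(Omega) = - sum_l (w_l / W) log_L (w_l / W), over the points inside Omega
    W = weights.sum()
    S = 0.0
    for l in range(num_labels):
        w_l = weights[labels == l].sum()
        if w_l > 0:
            p = w_l / W
            S -= p * np.log(p) / np.log(num_labels)  # log base L
    return S

# Equal total weight per label gives maximal entropy S = 1
w = np.array([0.1, 0.1, 0.2])           # weights of the points inside Omega
lab = np.array([0, 0, 1])               # their labels
print(entropy(w, lab, num_labels=2))    # 1.0
```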
3.2.2 Gaussian Mixture Models - GMM

This algorithm is an implementation of the Expectation-Maximization (EM) algorithm to fit Gaussian models to a collection of points. Recall that an EM algorithm is an iterative method to solve the maximum likelihood estimation of the parameters of a given model; in our case, Gaussian models. The EM algorithm iterates over two steps: an expectation step and a maximization step. The first step defines the expected value of the log-likelihood function using a given set of parameters for the model. The maximization step finds a new set of parameters that maximizes the previously described expected value.

3.2.3 Hierarchical Density-Based Spatial Clustering of Applications with Noise - HDBSCAN

HDBSCAN [9] is a hierarchical clustering algorithm extending DBSCAN. The latter finds clusters as the connected components of a graph. The vertices of this graph are the elements of the data set after removing "noise points", and the adjacency is defined by a user-provided parameter $\epsilon$. HDBSCAN constructs a hierarchical collection of DBSCAN solutions by varying the parameter used to compute the adjacency in the DBSCAN algorithm. Once this hierarchy of solutions is obtained, the algorithm extracts from its dendrogram a summarized collection of significant clusters.

3.3 Experimental Results

Thus far we have developed a methodology for deriving adaptive template systems from labeled persistence diagrams. The main idea is to use algorithms such as CDER, GMM and HDBSCAN to identify the supports of the functions in the template system. We will use these adaptive templates to produce feature vectors in order to solve supervised classification problems. In this section we present three examples of supervised learning where adaptive template functions yield featurizations that improve the classification accuracy or robustness of the classification model. Our implementation of adaptive template systems with CDER, GMM and HDBSCAN, as well as the scripts to replicate all the results in this section, can be found in the associated GitHub repository: https://github.com/lucho8908/adaptive_template_systems.git

3.3.1 Manifolds

Figure 3.4 Example of the 6 manifolds; from top left to bottom right: annulus, cube, 3 clusters, 3 clusters of 3 clusters, $S^2$ (projected on the xy-plane) and torus (projected on the xy-plane).

For this example we revisit an experiment presented in [5] and [6]. We generated point clouds sampled from different manifolds in $\mathbb{R}^2$ or $\mathbb{R}^3$. Each point cloud has 200 points, and the manifolds considered are the following: an annulus with inner radius 1 and outer radius 2 centered at $(0, 0)$; 3 clusters of points drawn from normal distributions with means $(0, 0)$, $(0, 2)$ and $(2, 0)$, all with standard deviation 0.05; 3 clusters of 3 clusters of points drawn from normal distributions with standard deviation 0.05 and means $(0, 0)$, $(0, 1.5)$, $(1.5, 0)$, $(0, 4)$, $(1, 3)$, $(1, 5)$, $(3, 4)$, $(3, 5.5)$ and $(4.5, 4)$; a cube, defined as $[0, 1]^2 \subset \mathbb{R}^2$; a torus, obtained by rotating a circle of radius 1 centered at $(2, 0)$ in the xz-plane around the z-axis; and a sphere $S^2 \subset \mathbb{R}^3$ with uniform noise in $[-0.05, 0.05]$ in the normal direction.

We used CDER, GMM and HDBSCAN to generate the supports of the functions that form our template systems. We reserved 33% of the data for testing, and trained a kernel ridge regression model on the remaining data (67%). In addition, we investigate the effect of increasing the number of point clouds sampled from each manifold.
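A sketch of the experimental loop just described; this is one plausible reading of "kernel ridge regression model" for classification (one-vs-all regression on one-hot labels), not necessarily the exact pipeline in our repository, and the regularization value is an arbitrary placeholder.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.kernel_ridge import KernelRidge

def run_experiment(features, labels, seed=0):
    # features: (n_diagrams, n_templates) matrix from an adaptive template system
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, labels, test_size=0.33, random_state=seed)
    Y_tr = np.eye(labels.max() + 1)[y_tr]        # one-hot regression targets
    model = KernelRidge(kernel='rbf', alpha=1.0).fit(X_tr, Y_tr)
    pred = model.predict(X_te).argmax(axis=1)    # class = largest regressed score
    return (pred == y_te).mean()                 # test accuracy
```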
|     | Best in [5] Train | Best in [5] Test | CDER Train | CDER Test | GMM Train | GMM Test | HDBSCAN Train | HDBSCAN Test |
|-----|-------------------|------------------|------------|-----------|-----------|----------|---------------|--------------|
| 10  | 0.99 ± 0.9 | 0.96 ± 3.2 | 0.99 ± 0.001 | 0.98 ± 0.034 | 0.99 ± 0.001 | 0.91 ± 0.075 | 1.00 ± 0.000 | 0.90 ± 0.092 |
| 25  | 0.99 ± 0.3 | 0.99 ± 1.0 | 0.99 ± 0.001 | 0.99 ± 0.001 | 0.99 ± 0.004 | 0.99 ± 0.009 | 1.00 ± 0.000 | 0.89 ± 0.031 |
| 50  | 1.00 ± 0.0 | 0.99 ± 0.9 | 0.99 ± 0.001 | 0.99 ± 0.001 | 0.99 ± 0.002 | 0.99 ± 0.008 | 1.00 ± 0.000 | 0.95 ± 0.003 |
| 100 | 0.99 ± 0.1 | 0.99 ± 0.4 | 0.99 ± 0.001 | 0.99 ± 0.001 | 0.99 ± 0.004 | 0.99 ± 0.005 | 1.00 ± 0.000 | 0.97 ± 0.011 |
| 200 | 0.99 ± 0.1 | 0.99 ± 0.3 | 0.99 ± 0.002 | 0.99 ± 0.005 | 0.99 ± 0.002 | 0.99 ± 0.003 | 1.00 ± 0.000 | 0.98 ± 0.005 |

Table 3.1 Manifold classification: Each row corresponds to the number of samples taken from each manifold. The first two columns show the best results reported in [5].

In Table 3.1 we present the mean accuracy of the model on both the training and testing data, after averaging the results over 10 experiments for each sampling size. Furthermore, Table 3.1 contains the results obtained from using CDER, GMM and HDBSCAN to find the adaptive templates, as well as the best results presented in [5] on the same classification problems.

The first important feature to highlight is that for all the different sampling sizes, the accuracy results of the adaptive templates show a smaller standard deviation than the results reported in [5]. At the same time, the mean accuracy of our methodology is comparable with the state-of-the-art results (in [5]).

It is worth mentioning that, across all the different methods used in this work to obtain adaptive template systems, CDER provides the most stable results. This is especially significant for the smaller sample sizes (the first row in Table 3.1).
3.2 shows the portion, out of the 10 problems, for which a given adaptive template system yields the best classification results. For instance, the entry corresponding to ridge regression with a polynomial kernel (first row and first column). shows that the CDER template system yields the best classification accuracy in 6 our of 10 problems, GMM is the best for 3 out of 10 and HDBSCAN is superior in 1 out of the 10 problems. With this interpretation of 3.2 in mind, we can see that CDER adaptive template sys- tems yield more accurate classification results than GMM or HDBSCAN templates. This holds true for all but one, of the kernel and kernel method combinations explored in this experiment. Global features Local features Interpolating Polynomials Tent functions Adaptive templates Freq. Train Test Train Test Train Test 1 0.99 ± 0.3 0.90 ± 5.3 0.08±0.30 0.03±0.5 0.79±0.01 0.73±0.02 GMM Kernel Ridge 2 1.00 ± 0.0 0.95 ± 2.4 0.08±0.40 0.03±1.00 0.84±0.00 0.8±0.03 GMM 3 0.99 ± 0.5 0.90 ± 2.0 0.80±1.3 0.44±4.3 0.7±0.01 0.66±0.03 HDBSCAN 4 0.98 ± 0.9 0.84 ± 3.9 0.89±1.5 0.69±4.9 0.7±0.01 0.67±0.03 GMM 5 0.99 ± 0.4 0.93 ± 2.2 0.76±2.7 0.58±7.9 0.78±0.04 0.76±0.08 CDER 6 0.98 ± 0.5 0.92 ± 1.8 0.96±0.67 0.89±1.7 0.97±0.01 0.92±0.03 CDER SVM 7 0.99 ± 0.4 0.95 ± 1.4 0.98±0.60 0.94±2.5 0.96±0.02 0.94±0.05 CDER 8 0.99 ± 0.4 0.94 ± 2.2 0.91±1.2 0.89±3.3 1±0.00 0.88±0.05 CDER Random Forest 9 0.98 ± 1.3 0.92 ± 2.1 0.64±2.3 0.53±3.8 1±0.00 0.88±0.03 CDER 10 0.97 ± 1.1 0.89 ± 4.6 0.27±3.4 0.18±5.6 1±0.00 0.9±0.08 CDER Table 3.3 Shape classification: Classification accuracy for adaptive templates (ours), tent functions and interpolating polynomials [5] Having stablished how CDER, GMM and HDBSCAN compare across different kernel methods and kernels, next we compare our classification results with the state of the art. 3.3 contains our classification accuracy on the training and testing shape data, as well as the results obtained using the tent functions and interpolating polynomials from [5]. The first important feature to remark is that for 7 out of the 10 problems, the model ob- tained from adaptive coordinates has less overfitting than interpolating polynomials. Mean- 26 ing that for most of the classification problems studied in this example, adaptive template systems provide a more robust classification model. An additional argument in favor of the robustness of our approach is that the standard deviation of all the models using adap- tive templates is smaller than those presented in [5]. Moreover, when comparing adaptive template systems with tent functions (see 3.3) we attain better classification results on the testing set across all the problems presented. This highlights the benefits of adaptive versus non-adaptive local template systems. It is pertinent to mention that the results in 3.3 correspond to a specific selection of parameters for each kernel and regularization in each classification method. In fact, from our methodology we can find a combination of kernel parameters and regularization that yields higher classification accuracy in the training set. But, for such models the overfitting issues are more noticeable. Finally, since the end goal of this problem is to solve the multiclass classification problem in the synthetic SHREC 2014 data set. We can select the problem corresponding to the frequency 6 in the heat kernel signature (row 6 in 3.3) as the one that gives us the best classification accuracy while minimizing the overfiting concern. 
Such result is obtained using a CDER template system and a regularized SVM method. 3.3.3 Protein classification We consider next the data set PCB00019 from the Protein Classification Benchmark Collec- tion [11]. This problem set contains 1, 357 proteins and 55 classification tasks. Our results will be compared to those reported in [3]. To compare our results with the ones in [3], we report the average classification accuracy over the 55 classification tasks in the data set PCB00019. 3.4 shows the average classification accuracy and standard deviation for each of the classification tasks from PCB00019. Here we used a regularized kernel ridge regression with polynomial, RBF and sigmoid kernels. The last row in 3.4 displays the average classification 27 Figure 3.6 Examples of data points in PCB00019 and their corresponding persistent diagrams. Top row: Protein domains from [30]. Bottom row: Persistent diagrams in dimensions 0and 1. Train Test Polynomial 0.90 ± 0.07 0.98 ± 0.02 CDER RBF 0.91 ± 0.06 0.97 ± 0.02 Sigmoid 0.90 ± 0.07 0.98 ± 0.02 Topological features in [3] - 0.82 ± —- Table 3.4 Protein classification: average classification accuracy for the 55 tasks in the data set PCB00019. accuracy for the testing set reported in [3]. We note that the authors do not report standard deviation or average classification accuracy for the testing set. It is meaningful to remark that in [3] topological features are used to solve the classifi- cation problems. Those features were constructed using persistence diagrams as to reflect relevant properties of the proteins in the given data set. 28 3.4 Discussion This paper investigates the viability of utilizing data-driven methodologies to localize features in persistence diagrams. These features are used in subsequent supervised learning tasks for classification problems where shape is an important feature. We examined three different algorithms, CDER, GMM and HBDSCAN to produce adaptive template functions. Through extensive testing with real and synthetic data sets, we demonstrate that CDER provides a more robust collection of adaptive features while maintaining classification accuracy on par with the state of the art. In terms of time complexity, CDER also outperforms GMM and HDBSCAN for the problems here considered. 29 CHAPTER 4 PERSISTENT CUP PRODUCTS AND QUIASI-PERIODICITY DETECTION 4.1 Persistent cohomology Given and abstract simplicial complex K any finite set σ in K is called a simplex of dimension |σ| − 1, for example a simplex with one element is called a 0-simplex, a simplex with two elements is a 1-simplex. We will denote a simplex as [v0 , . . . , vn ] and the points vi are called the vertices of the simplex. Using this notation we will say that [v0 , . . . , vi−1 , vi+1 , . . . , vn ] is a face of [v0 , . . . , vn ] and [v0 , . . . , vn ] is a coface of [v0 , . . . , vi−1 , vi+1 , . . . , vn ]. Example 3. This first example will provide the basic example that we will use throughout this paper. Figure 4.1 Simplicial complex structure on the torus In this example we present a simplicial complex that represents a torus by letting the base 30 set S = {0, . . . , 8}. 
We define the 1-simplices to be

{[0,1], [1,2], [0,2], [3,4], [4,5], [3,5], [6,7], [7,8], [6,8], [0,3], [3,6], [0,6], [1,4], [4,7], [1,7], [2,5], [5,8], [2,8], [2,4], [1,3], [0,5], [5,7], [4,6], [3,8], [1,8], [0,7], [2,6]},

and the 2-simplices to be

{[0,1,3], [1,3,4], [1,2,4], [2,4,5], [0,2,5], [0,3,5], [4,6,7], [3,4,6], [4,5,7], [5,7,8], [3,5,8], [3,6,8], [0,6,7], [0,1,7], [1,7,8], [1,2,8], [2,6,8], [0,2,6]}.

A family {K_i}_{i∈I} of simplicial complexes, indexed by a totally ordered set I and such that K_i ⊂ K_j is a subcomplex whenever i ≤ j, is called a filtered complex. A special case of filtered complexes arises whenever K_i = K_{i−1} ∪ {σ_i}; we call this type of filtration a simplexwise filtration. In general any filtered complex can be made into a simplexwise filtration by refining and reindexing maps as presented in [23]; therefore, from this point on we will consider any filtered complex to be a simplexwise filtration, and we will refer to it just as a filtration.
In computational topology one of the main objects of study are Vietoris-Rips filtrations, due to the computational properties they hold (see [26]). We will make use of these filtrations as well, since they naturally arise as part of many computational tools such as Ripser (see [23]). The Vietoris-Rips filtration is usually applied to obtain persistent diagrams from data sets seen as finite metric spaces; in other words, if X is a finite subset of a metric space (M, d), we define the Vietoris-Rips filtration of X to be the family of simplicial complexes

R_ϵ(X) = {S ⊂ X : S ≠ ∅ and diam(S) ≤ ϵ},

together with the natural inclusions R_ϵ(X) ↪ R_{ϵ′}(X) whenever ϵ < ϵ′.

Figure 4.2 Filtration of a simplicial complex representation of a torus in R^3.

Example 4. Here we have a filtered complex of a torus (see Section 2.1 of [25]). The filtered complex is given by the coloration of the edges and vertices, meaning the edges in green enter the filtration sooner than those in blue, while the faces in blue enter the filtration sooner than those in red.

Given a simplicial complex K and a field F, we can define the group of p-cochains, denoted by C^p(K), to be the collection of homomorphisms from C_p(K) to F. The p-coboundary of an element σ ∈ C^p(K), denoted δ^p(σ) ∈ C^{p+1}(K), is defined by the values it takes on the chains in C_{p+1}(K), namely

(δ^p(σ))([v_0, …, v_{p+1}]) := σ(∂_{p+1}([v_0, …, v_{p+1}])) = Σ_{j=0}^{p+1} (−1)^j σ([v_0, …, v̂_j, …, v_{p+1}]),

where ∂_p is as defined in Chapter 2.1. With this map in mind we define the p-cocycles of K as Z^p(K) = ker(δ^p), and the p-coboundaries of K to be B^p(K) = img(δ^{p−1}). Using these notions, the p-th cohomology of K is defined as

H^p(K; F) = Z^p(K)/B^p(K) = ker(δ^p)/img(δ^{p−1}).    (4.1)

Example 5. Consider one of the complexes in the filtration of Example 4, and fix F = Z_2.

Figure 4.3 Cellular complex on the Torus.

This simplicial complex contains 9 vertices, {v_0, …, v_8}, 27 edges denoted e_i for i = 0, …, 26, and finally 18 faces similarly labeled f_j for j = 0, …, 17. It is easy to verify that this simplicial complex has the cohomology of a torus, namely H^0(K; Z_2) = Z_2, H^1(K; Z_2) = Z_2 × Z_2 and H^2(K; Z_2) = Z_2.
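As a sanity check, this claim can be verified directly from the simplex lists above. The following minimal sketch (plain Python, no external libraries; the helper name gf2_rank is ours) computes the ranks of the boundary matrices over Z_2, which coincide with the ranks of the coboundary matrices, so that dim H^k = dim C^k − rank(δ^k) − rank(δ^{k−1}):

```python
# Sanity check for Example 5: Betti numbers of the 9-vertex torus complex over Z2.
edges = [(0,1),(1,2),(0,2),(3,4),(4,5),(3,5),(6,7),(7,8),(6,8),
         (0,3),(3,6),(0,6),(1,4),(4,7),(1,7),(2,5),(5,8),(2,8),
         (2,4),(1,3),(0,5),(5,7),(4,6),(3,8),(1,8),(0,7),(2,6)]
faces = [(0,1,3),(1,3,4),(1,2,4),(2,4,5),(0,2,5),(0,3,5),
         (4,6,7),(3,4,6),(4,5,7),(5,7,8),(3,5,8),(3,6,8),
         (0,6,7),(0,1,7),(1,7,8),(1,2,8),(2,6,8),(0,2,6)]

def gf2_rank(mat):
    """Rank of a 0/1 matrix over Z2, by Gaussian elimination."""
    m, rank = [row[:] for row in mat], 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                m[r] = [a ^ b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

d0 = [[int(v in e) for e in edges] for v in range(9)]           # vertex-edge incidences
d1 = [[int(set(e) <= set(f)) for f in faces] for e in edges]    # edge-face incidences
r0, r1 = gf2_rank(d0), gf2_rank(d1)
print(9 - r0, 27 - r0 - r1, 18 - r1)   # prints 1 2 1: the torus Betti numbers over Z2
```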
Given a filtration K = {K_i}_{i∈I}, the inclusion maps K_i ↪ K_j for i < j induce homomorphisms f^p_{i,j} : H^p(K_j; F) → H^p(K_i; F), giving place to a sequence

PH^p(K; F) := H^p(K_1; F) ← H^p(K_2; F) ← ⋯ ← H^p(K_m; F).    (4.2)

The p-th persistent cohomology groups of K are defined as H^p_{i,j}(K; F) = img(f^p_{i,j}). In particular, if i = j we recover the standard cohomology groups for each element in the filtration, i.e. H^p_{i,i}(K; F) = H^p(K_i; F) for any i ∈ I.
Moreover, the sequence of cohomology groups in (4.2) forms a pointwise finite-dimensional persistent vector space, meaning that each cohomology group is a finite-dimensional vector space over the field F. Once this condition is met, it is shown in [31] that any pointwise finite-dimensional persistence module is a direct sum of interval modules.

Algorithm 4.1 Reduction algorithm (pHcol)
Require: D, the (m × n) filtration boundary matrix.
Ensure: R is reduced, V is invertible upper triangular, and R = DV.
Initialize: (R, V) = (D, I)
for j = 1, …, n do
    while ∃ i < j such that low_R(i) = low_R(j) do
        λ = R[low_R(i), j] / R[low_R(i), i]
        R[:, j] −= λ R[:, i]
        V[:, j] −= λ V[:, i]
return (R, V)

Each interval module is defined over an interval I ⊂ R, and the corresponding interval module V is given by letting V_t = F for any t ∈ I, V_t = 0 for any t ∉ I, with the maps ρ^s_t : V_s → V_t corresponding to the identity F → F. This decomposition of persistent cohomology in terms of interval modules, presented in [31], allows us to represent the persistent cohomology as a barcode or persistence diagram. These representations are further explained and explored in Example 6, and furthermore in [32].

4.1.1 Computing persistent cohomology

The decomposition into interval modules (see [31]) can be computed by the standard Ripser algorithm, as shown in [23] and Algorithm 4.1. The output of such an algorithm is a matrix decomposition R = DV, where the matrix D is assembled from the boundary maps, V is an invertible upper triangular matrix, and R is used to encode the persistence diagram of PH^∗(K; F). First, given a filtration K, we assemble the coboundaries δ(σ) as columns, ordered according to the filtration order, into the filtration boundary matrix D. If the maximum dimension on K is d, then we can compute all persistent cohomology up to dimension d − 1.
The stopping condition of Algorithm 4.1 is defined in terms of the matrix R being reduced, as defined in [33]. More specifically, R is reduced if for any pair of non-zero columns i and j we have low_R(i) ≠ low_R(j), where low_R(i) represents the index of the lowest row with a non-zero entry in the i-th column of R. In [33] the authors prove that, given a decomposition of the form R = DV where V is invertible and upper triangular and R is in reduced form, the pairings defined by

• (i, j) if low_R(j) = i,
• (i, ∞) if the i-th column of R is zero and there is no j such that low_R(j) = i,

are unique in general and define the persistence diagram or barcode of the filtration K. Notice that this result does not imply uniqueness of the matrices R and V.
One of the optimizations presented in [34] for the use of Algorithm 4.1 consists of replacing the matrix D by D^⊥, which is defined entry-wise as D^⊥[i, j] = D[n + 1 − i, n + 1 − j]. We will consider this matrix for decomposition purposes, since this coincides with the exact implementation presented in [23].
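A minimal Python sketch of Algorithm 4.1 is given below, assuming numpy and coefficients in Z_q for a prime q; the helper names are ours and this is an illustration, not the Ripser implementation itself. Applied to the matrix D^⊥ described above, the pairs (low_R(j), j) of the reduced output read off the persistence diagram.

```python
import numpy as np

def low(col):
    """Index of the lowest non-zero entry of a column, or -1 if the column is zero."""
    nz = np.flatnonzero(col)
    return int(nz[-1]) if nz.size else -1

def phcol(D, q):
    """Column reduction of Algorithm 4.1 over Z_q (q prime).
    Returns (R, V) with R reduced, V invertible upper triangular, R = D V (mod q)."""
    R = D.copy() % q
    V = np.eye(R.shape[1], dtype=int)
    for j in range(R.shape[1]):
        while True:
            lj = low(R[:, j])
            # an earlier column whose lowest non-zero entry sits in the same row
            i = next((k for k in range(j) if low(R[:, k]) == lj != -1), None)
            if i is None:
                break
            lam = (int(R[lj, j]) * pow(int(R[lj, i]), -1, q)) % q  # R[low,j] / R[low,i]
            R[:, j] = (R[:, j] - lam * R[:, i]) % q
            V[:, j] = (V[:, j] - lam * V[:, i]) % q
    return R, V
```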
An important note worth highlighting is the fact that, given a pair (i, j) in the persistence diagram of PH^∗(K; F), we will say that i is the cohomological death of the class, while j is its cohomological birth.

Example 6. In this example we take the filtration of Example 4 and compute its persistent cohomology using a decomposition of D^⊥. Since showcasing the entirety of the matrix D^⊥ would be too cumbersome, we illustrate the process on a smaller submatrix. In particular, we use the submatrix that contains the 0-coboundaries; in other words, each column of this matrix contains the coboundary of a vertex v_i in the filtration of Example 4. By using the reduction Algorithm 4.1 we obtain a reduced matrix R^⊥ and a matrix of elementary operations V^⊥. From this submatrix alone it is impossible to read the complete persistent cohomology of the filtered complex of Example 4, but applying the same reduction algorithm to the complete matrix D^⊥ we can easily find that the pairs in the persistent cohomology are given by following the rules stated in [34].

(The 27 × 9 submatrices D^⊥ and R^⊥, with rows indexed by the edges e_0, …, e_26 and columns by the vertices v_8, …, v_0, are displayed here side by side.)

PH^0(K; Z_2) = {[v_0, ∞), [v_8, e_16), [v_5, e_13), [v_2, e_12), [v_6, e_7), [v_7, e_5), [v_3, e_3), [v_4, e_1), [v_1, e_0)}

PH^1(K; Z_2) = {[e_9, ∞), [e_26, f_17), [e_25, f_16), [e_24, f_15), [e_23, f_14), [e_22, f_13), [e_21, f_19), [e_20, f_11), [e_19, f_12), [e_18, f_10), [e_17, f_9), [e_15, f_8), [e_14, f_7), [e_11, f_6), [e_10, f_5), [e_8, f_4), [e_6, f_3), [e_4, f_2), [e_2, f_1)}

PH^2(K; Z_2) = {[f_18, ∞)}

(The 9 × 9 upper triangular matrix V^⊥, with rows and columns indexed by v_8, …, v_0, is displayed here.)

4.2 Persistent cup products

Given a simplicial complex K and a field F, if we take a k-cochain ϕ ∈ C^k(K; F) and an l-cochain ψ ∈ C^l(K; F), then the cup product of ϕ and ψ is a (k + l)-cochain ϕ ⌣ ψ ∈ C^{k+l}(K; F). To define this cochain we need to define the values it takes on any (k + l)-simplex σ = [v_0, …, v_{k+l}]:

(ϕ ⌣ ψ)(σ) := ϕ([v_0, …, v_k]) ψ([v_k, …, v_{k+l}]).

We can easily see that this product is bilinear, associative and graded commutative (see [24]). Since this cup product produces a map on cochains, we need to do some extra work to define a cup product at the level of cohomology. To achieve such a goal the following result is required (see Lemma 3.6 in [24]).

Lemma 4.2.1. Let ϕ ∈ C^k(K; F) and ψ ∈ C^l(K; F); then δ(ϕ ⌣ ψ) = δϕ ⌣ ψ + (−1)^k ϕ ⌣ δψ.

We can now define a cup product at the cohomology level,

⌣ : H^k(K; F) × H^l(K; F) → H^{k+l}(K; F).    (4.3)
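At the cochain level the formula above is elementary to implement. The following minimal sketch (plain Python, our own helper, with Z_2 coefficients and 1-cochains encoded as sets of increasingly ordered edges) computes the formal cup product of two 1-cochains; it reproduces the computation carried out in Example 7 below.

```python
from itertools import product

def cup_z2(phi, psi):
    """Formal cup product of two Z2 1-cochains (sets of ordered edges):
    (phi ⌣ psi)([u0,u1,u2]) = phi([u0,u1]) * psi([u1,u2]), so each pair of edges
    (a,b) in phi and (b,c) in psi with a < b < c contributes the indicator 1_[a,b,c]."""
    out = set()
    for (a, b), (c, d) in product(phi, psi):
        if b == c and a < b < d:
            out ^= {(a, b, d)}           # symmetric difference = addition mod 2
    return out

# Generator cochains of the classes [e9, ∞) and [e21, f19) used in Example 7:
phi = {(0, 8), (2, 8), (2, 7), (0, 6), (1, 7), (1, 6)}
psi = {(0, 8), (6, 8), (5, 6), (3, 5), (2, 3), (0, 2)}
print(cup_z2(phi, psi))   # {(0, 6, 8), (1, 6, 8)}, i.e. 1_[0,6,8] + 1_[1,6,8]
```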
This cup product has many useful properties, among which we have that it is

• associative,
• distributive,
• anti-commutative, in other words ϕ ⌣ ψ = (−1)^{kl} ψ ⌣ ϕ,
• and natural with respect to continuous functions, i.e. given a continuous map f : K → L, the induced map f^∗ : H^k(L; F) → H^k(K; F) satisfies f^∗(ϕ ⌣ ψ) = f^∗(ϕ) ⌣ f^∗(ψ).

Example 7. Consider the same complex K as in Example 5, with the persistent cohomology computed in Example 6; we can obtain the generators from the matrix V^⊥. For this example we take the persistent classes [e_9, ∞) and [e_21, f_19), which are generated by the cochains

1_[0,8] + 1_[2,8] + 1_[2,7] + 1_[0,6] + 1_[1,7] + 1_[1,6]   and   1_[0,8] + 1_[6,8] + 1_[5,6] + 1_[3,5] + 1_[2,3] + 1_[0,2]

respectively, where 1_[0,8], for example, represents the indicator function that takes the value 1 on the simplex [0, 8] and 0 otherwise. With this notation in mind, and the definition at the beginning of this section, one can compute

(1_[v_0,v_1] ⌣ 1_[v_2,v_3])([v_4, v_5, v_6]) = 1_[v_0,v_1]([v_4, v_5]) 1_[v_2,v_3]([v_5, v_6]),

which is non-zero if and only if v_1 = v_5 = v_2 (with [v_4, v_5] = [v_0, v_1] and [v_5, v_6] = [v_2, v_3]); therefore

1_[v_0,v_1] ⌣ 1_[v_2,v_3] = 1_[v_0,v_1,v_3].

It is then easy to verify that the cup product of the two generators chosen above results in 1_[0,6,8] + 1_[1,6,8].

To extend the notion of cup product to persistent cohomology we need to make use of the property that cup products are natural with respect to continuous functions. With this in mind, and using the inclusions in a given filtration K = {K_i}, we define a cup product on PH^∗(K; F) by looking at the following diagram, in which the vertical maps are the entry-wise cup products:

PH^k(K; F)     := H^k(K_1; F)     ← H^k(K_2; F)     ← ⋯ ← H^k(K_m; F)
PH^l(K; F)     := H^l(K_1; F)     ← H^l(K_2; F)     ← ⋯ ← H^l(K_m; F)
PH^{k+l}(K; F) := H^{k+l}(K_1; F) ← H^{k+l}(K_2; F) ← ⋯ ← H^{k+l}(K_m; F)    (4.4)

Meaning that, given persistent classes in PH^k(K; F) and PH^l(K; F), we can compute the cup product entry-wise and obtain a persistent cochain in PH^{k+l}(K; F). Notice that from this definition we do not know the persistence of this cup product expression. Since the persistence of a class is defined in terms of its cohomological birth and death, we need to define such notions for a general cochain in PH^{k+l}(K; F). The cohomological birth of a persistent cochain is defined to be the largest point in the filtration where the given cochain is a cocycle, while the cohomological death is the largest point in the filtration where the cochain is a coboundary. The goal now, and our contribution, is to develop an algorithm that allows us to compute the cohomological birth and death of a persistent cochain in PH^{k+l}(K; F).

4.3 Algorithm

The goal of this section is to present an algorithm to compute the persistence of a cochain of the form α ⌣ β for any two classes. To achieve this goal we split this section into two main steps: one dedicated to computing the cohomological death, while the other deals with the cohomological birth.

4.3.1 Cohomological death

From this point on we make heavy use of ideas in [34] and [35] to compute persistent cohomology using a decomposition (D^{n−1})^⊥ = R^⊥ V^⊥, where R^⊥ is reduced and V^⊥ is invertible and upper-triangular. For any matrix A, define low_A(j) as the index of the lowest non-zero entry in the j-th column of A; it is undefined if the column is zero. We say that the matrix A is reduced if low_A is injective over its domain of definition.
Let X^{(n)}_l := X^{(n−1)} ∪ σ_1 ∪ ⋯ ∪ σ_l for any 1 ≤ l ≤ s.
Then the cohomological death of a cochain γ ∈ C^n(X; Z_q) coincides with the smallest 1 ≤ l ≤ s such that

1. i^∗_l ∘ δ^{n−1}(x) = i^∗_l(γ) has no solution x ∈ C^{n−1}(X; Z_q), and
2. i^∗_{l−1} ∘ δ^{n−1}(x) = i^∗_{l−1}(γ) can be solved,

where i_l : X^{(n)}_l ↪ X^{(n)} is the natural inclusion.
First of all, notice that when written in the standard basis, the composition i^∗_l ∘ δ^{n−1} corresponds to the l × t submatrix of (D^{(n−1)})^⊥ composed of the last l rows of (D^{(n−1)})^⊥, namely (D^{(n−1)})^⊥[s + 1 − l : s, 1 : t]. Therefore, the conditions above can be restated as follows:

1. (D^{(n−1)})^⊥[s + 1 − l : s, 1 : t] x[s + 1 − l : s] = γ[s + 1 − l : s] has no solution in C^{n−1}(X^{(n)}_l; Z_q), and
2. (D^{(n−1)})^⊥[s + 1 − (l − 1) : s, 1 : t] x[s + 1 − (l − 1) : s] = γ[s + 1 − (l − 1) : s] has a solution.

Furthermore, these conditions can be transformed in terms of R^⊥. Let x̂ := V^⊥ x and γ̂ := V^⊥ γ, resulting in:

1. R^⊥[s + 1 − l : s, 1 : t] x̂[s + 1 − l : s] = γ̂[s + 1 − l : s] has no solution in C^{n−1}(X^{(n)}_l; Z_q), and
2. R^⊥[s + 1 − (l − 1) : s, 1 : t] x̂[s + 1 − (l − 1) : s] = γ̂[s + 1 − (l − 1) : s] has a solution.

Algorithm 4.2 Cohomological death of a cochain
Precompute: R^⊥ and low_{R^⊥}
Initialize: x̂ = (0, …, 0) ∈ Z_q^t and death = 0
for l = s, …, 1 do
    if l ∈ img(low_{R^⊥}) then
        let j be such that low_{R^⊥}(j) = l
        x̂[j] = (γ̂[l] − Σ_{k≠j} x̂[k] R^⊥[l, k]) / R^⊥[l, j]
    else if Σ_k x̂[k] R^⊥[l, k] ≠ γ̂[l] then
        death = l
        break
return death

Using the fact that the matrix R^⊥ is reduced, low_{R^⊥} is injective on its domain; thus we can use Algorithm 4.2 to compute the cohomological death of a cochain γ ∈ C^n(X; Z_q).

Example 8. Let us consider the filtration K from Example 4. Algorithm 4.2 requires a decomposition R^⊥ = D^⊥ V^⊥, which was already computed in Example 6. Consider now the generators associated to the persistent classes used in Example 7, namely [e_9, ∞) and [e_21, f_19). We are interested in computing the cohomological death of the persistent cochain [e_9, ∞) ⌣ [e_21, f_19) = 1_[0,6,8] + 1_[1,6,8] using Algorithm 4.2. Furthermore, since the 2-simplex [1, 6, 8] is not part of the filtered complex of Example 4, we have [e_9, ∞) ⌣ [e_21, f_19) = 1_[0,6,8]. So we are interested in solving the corresponding system of equations with right-hand side given by 1_[0,6,8], displayed below. A careful evaluation of Algorithm 4.2 returns 1, which means that the cohomological death of this cochain happens whenever f_17 enters the filtration.

(The linear system, with rows indexed by f_18, …, f_0 and right-hand side the coordinate vector of 1_[0,6,8], is displayed here.)
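A direct implementation of Algorithm 4.2 is short. The following is a minimal numpy sketch under our own conventions (0-indexed rows, with the reduced matrix R = R^⊥ and the transformed cochain γ̂ given as arrays); it is an illustration, not the exact implementation used in the experiments below.

```python
import numpy as np

def cohomological_death(R, gamma_hat, q):
    """Sketch of Algorithm 4.2 over Z_q (q prime): back-substitute through the
    reduced matrix R (s x t) from the bottom row up, and report the first row
    at which the system R x = gamma_hat becomes inconsistent."""
    s, t = R.shape
    x = np.zeros(t, dtype=int)
    lows = {}                                   # row index -> column with that lowest entry
    for j in range(t):
        nz = np.flatnonzero(R[:, j])
        if nz.size:
            lows[int(nz[-1])] = j
    for l in range(s - 1, -1, -1):              # bottom row first
        if l in lows:
            j = lows[l]
            rhs = (int(gamma_hat[l]) - sum(int(x[k] * R[l, k]) for k in range(t) if k != j)) % q
            x[j] = rhs * pow(int(R[l, j]), -1, q) % q
        elif (sum(int(x[k] * R[l, k]) for k in range(t)) - int(gamma_hat[l])) % q != 0:
            return l + 1                        # 1-based index of the breaking row
    return 0                                    # consistent through the whole filtration
```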
4.3.2 Complexity analysis

To study the complexity of our implementation of the cohomological death computation we focus on Algorithm 4.2. First of all, it is important to highlight that this algorithm closely resembles the standard backward substitution algorithm. The main difference resides in the fact that, instead of looping over the entire matrix, we only need to perform the corresponding substitutions for the rows that appear as lowest entries of columns of R^⊥, i.e. the rows in the image of low_{R^⊥}. Recall that in general δ^n : C^n → C^{n+1}; in this context this implies that each inner product takes at most as many operations as there are generators in C^n. Furthermore, since the matrix representation D^⊥ of δ^n is more often than not non-square, its reduced form R^⊥ will have at most min{dim(C^n), dim(C^{n+1})} pivots. This fact implies that the outermost loop in Algorithm 4.2 iterates at most dim(C^n) times. With these considerations in mind, we can see that the number of operations in Algorithm 4.2 is O(dim(C^n)^2).
To understand this result even better, let us consider a Rips filtration on a set with n points; here dim(C^k) grows quickly with the number of points in the 0-skeleton, and in particular we can expect dim(C^k) ∼ n^k. Furthermore, if we use δ^1 to compute the cohomological death of a cochain in C^2, the complexity of Algorithm 4.2 becomes O(n^2).

4.4 Examples

In this section we present different examples of persistent cup product computations. The first two examples correspond to sliding window embeddings of time series that result in topological spaces homeomorphic to the torus T^2 and to the wedge S^2 ∨ S^1 ∨ S^1. These first examples allow us to illustrate the power of the persistent cup product presented in this paper, as it helps us differentiate persistence diagrams coming from samplings of these two spaces.

4.4.1 Torus

For this example we take advantage of the fact that the sliding window embedding of a quasi-periodic function is homeomorphic to a 2-dimensional torus (see [20]). For this particular example we use the function in Figure 4.4.

Figure 4.4 Time series f(t) = cos(5t) + cos(5√3 t).

Once we have the point cloud obtained from the sliding window embedding, we use the Vietoris-Rips filtration of the point cloud to compute its persistent cohomology. This computation is carried out using the Python implementation of Ripser [23] provided as part of the package scikit-tda [36]. The persistence diagram corresponding to this computation is presented in Figure 4.5.

Figure 4.5 Persistence diagram of PH^∗(SW_{d,τ}(f); Z_2) and persistence of the cup product of the 2 most persistent classes in PH^1(SW_{d,τ}(f); Z_2).

4.4.2 Periodic signal with non-trivial 2-dimensional persistence

The main goal of this example is to showcase a periodic signal whose sliding window embedding (see [37]) produces a point cloud such that its persistence diagram has 2 highly persistent classes in dimension 1 and 1 persistent class in dimension 2. One way in which this goal can be attained is if the point cloud (obtained from the sliding window embedding of a signal) has the same topology as gluing 2 circles to a sphere at different points. More specifically, to obtain such a point cloud we first construct a time series, using the results in [22], that recovers the desired geometric properties. To do so we need to choose an observation function as required by the methodology in [22]; in this specific example we use the Euclidean distance to a single random point in R^n as our observation function. By applying the results in [22] we obtain the time series in Figure 4.6.

Figure 4.6 Time series.

Once this time series is at hand, we can compute its sliding window embedding after fixing an adequate embedding dimension and window size for the embedding to recover the desired geometry. As suggested by [37] we take the dimension to be d = 5, and using this we define the sliding window size as ω = 2πd/(d + 1). Once the sliding window point cloud is computed, we use Ripser to obtain the persistence diagram in Figure 4.7.
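The data-generation step common to both examples can be sketched as follows; this is a minimal illustration assuming numpy and the ripser function of the scikit-tda package (the helper sliding_window and the sampling choices are ours), shown here on the signal of Figure 4.4.

```python
import numpy as np
from ripser import ripser   # Python Ripser, as distributed with scikit-tda

def sliding_window(f, d, tau, n_points=400, t_max=12 * np.pi):
    """Point cloud of sliding-window vectors SW_{d,tau} f(t) = (f(t), ..., f(t + d*tau))."""
    t = np.linspace(0, t_max - d * tau, n_points)
    return np.stack([f(t + k * tau) for k in range(d + 1)], axis=1)

f = lambda t: np.cos(5 * t) + np.cos(5 * np.sqrt(3) * t)   # quasi-periodic signal
d = 5
omega = 2 * np.pi * d / (d + 1)                  # window size; per-step delay tau = omega/d
X = sliding_window(f, d, tau=omega / d)
dgms = ripser(X, maxdim=2, coeff=2)['dgms']      # persistence diagrams in dims 0, 1, 2
```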
First let us remark that this persistence diagram has the topological features we required at the beginning of this example: namely, it was obtained from a periodic signal (see Figure 4.6) while presenting 2 clearly persistent classes in PH^1 and 1 persistent class in dimension 2.

Figure 4.7 Persistence diagram of PH^1(SW_{d,τ}(f)).

4.5 Quasi-periodicity detection

Following the work in [21] we define a quasi-periodicity score based on the geometry of the sliding window embedding of a signal. To do so we let dgm_n be the n-dimensional persistence diagram for the Rips filtration on the sliding window embedding of the signal, and we define mp_i(dgm_n) as the i-th largest difference b − a over the points (a, b) ∈ dgm_n; in other words, the persistence of the i-th most persistent class in dgm_n. Furthermore, we define pcup_{i,j} to be the persistence of the cup product of the i-th and j-th most persistent cohomology classes in dgm_n. With these definitions in mind we define the Quasi-periodicity Cup Score as

CQPS = √( mp_2(dgm_1) · pcup_{1,2} / 3 ),    (4.5)

which is designed to differentiate quasi-periodic signals, like the ones in Section 4.4.1, from periodic signals with non-trivial 2-dimensional cohomology, as in Section 4.4.2. This score is based on the second largest 1-dimensional persistence times the persistence of the cup product between the largest and second largest classes in dgm_1.
Throughout this section we compare this score with the Quasi-periodicity score defined in [21] as

QPS = √( mp_2(dgm_1) · mp_1(dgm_2) / 3 ),    (4.6)

which makes use of the most persistent class in the 2-dimensional cohomology of the filtration.

4.5.1 ROC for quasi-periodicity detection

The goal of this example is to provide an evaluation method for the quasi-periodicity score in Equation (4.5). To achieve this we produced a data set of 600 time series, split as: 150 periodic signals; 150 quasi-periodic signals, as in Section 4.4.1; 150 periodic signals whose sliding window embedding has non-trivial 1- and 2-dimensional cohomology, as in Section 4.4.2; and finally 150 linear signals with Gaussian noise.

Figure 4.8 ROC curves for the original Quasi-periodicity score in [21] and the Cup quasi-periodicity score.

After computing the Quasi-periodicity score and the Cup Quasi-periodicity score for each signal, we solved the binary classification problem of quasi-periodic signals vs. the rest. Figure 4.8 shows the ROC curve for each score, where we observe an improvement when using our Cup Quasi-periodicity Score over the classification results obtained with the basic QPS in Equation (4.6). Furthermore, this improvement, as measured by the area under the curve (AUC), corresponds to an increase from 0.8052 to 0.8755, where the theoretical maximum is 1.

4.5.2 Vocal folds

Finally we explore an example with real data, in the form of quasi-periodicity detection in video data. More specifically, we consider the data set of high-speed vocal fold videos studied in [21], used to separate periodicity from biphonation phenomena. In particular, we apply our cup product pipeline to distinguish periodicity from quasi-periodicity in the videos. In biological terms, the biphonation phenomenon occurs when the vocal fold oscillation bifurcates between two different frequencies; this behaviour results in a sliding window embedding that resembles a torus, as in Section 4.4.1. This data set consists of 7 high-speed videos, 2 of which correspond to "normal" periodic vocal fold oscillation, 3 of which represent biphonation, and the last 2 of which portray irregular motion.
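Before turning to the results, note that both scores reduce to a few lines once the diagrams (and, for CQPS, the cup-product persistence from Algorithm 4.2) are available. A minimal sketch, assuming numpy and diagrams given as arrays of finite (birth, death) rows with infinite bars removed:

```python
import numpy as np

def mp(dgm, i):
    """Persistence of the i-th most persistent point of a diagram (i = 1, 2, ...)."""
    pers = np.sort(dgm[:, 1] - dgm[:, 0])[::-1]
    return pers[i - 1] if i <= len(pers) else 0.0

def qps(dgm1, dgm2):
    """Quasi-periodicity score of [21], Equation (4.6)."""
    return np.sqrt(mp(dgm1, 2) * mp(dgm2, 1) / 3)

def cqps(dgm1, pcup12):
    """Cup quasi-periodicity score, Equation (4.5); pcup12 is the persistence of the
    cup product of the two most persistent classes of dgm1 (via Algorithm 4.2)."""
    return np.sqrt(mp(dgm1, 2) * pcup12 / 3)
```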
Here we replicated the preprocessing and persistence diagram computation carried out in [21].

Video Name | QPS | CQPS
HerbstPeriodic | 0.00448 | 0.0003570
NormalPeriodic | 0.0216 | 0.0000371
APBiphonation | 0.298 | 0.2828325
APBiphonation2 | 0.116 | 0.0985476
ClinicalAsymmetry | 0.404 | 0.4025851
MucusPerturbedPeriodic | 0.00447 | 0.0002006
HerbstIrregular | 0.0398 | 0.0345596

Table 4.1 Quasi-periodicity score (QPS) and Quasi-periodicity Cup score (CQPS) across the 7 vocal fold videos.

In Table 4.1 we observe that our proposed Quasi-periodicity Cup Score (CQPS) maintains high values on the quasi-periodic signals, as we would expect from the results in Section 4.5: for a quasi-periodic signal we expect a 2-dimensional class in cohomology with persistence comparable to that of the cup product of the 2 most persistent classes in dimension 1. More importantly, when we look at the scores for the periodic signals, the CQPS score reduces its value, compared to the original QPS score, by between 1 and 3 orders of magnitude. This increased separation between the scores for periodic and quasi-periodic signals makes the two phenomena easier to tell apart. This behaviour of our CQPS score is reinforced by the results in Section 4.5.1, where we see an improvement in the classification power of CQPS when compared to QPS.

4.6 Discussion

The main contribution of this paper is in Section 4.3, where we present an algorithm that allows us to compute the persistence of persistent cochains in general, allowing us to reach the two main goals we set for this work:

• to compute the persistence of cup product expressions and enrich persistence diagrams with such information, while providing scores to enhance quasi-periodicity detection, and
• to reduce the computational burden of obtaining said scores, by allowing the computation of persistence for 2-cochains while only computing generators in 1-dimensional persistent cohomology.

We show the performance of this methodology in separating periodic from quasi-periodic signals on a synthetic data set to be comparable to that of the scores proposed in [21], as presented in Figure 4.8, as well as on a real-life data set in Section 4.5.2, for which we clearly see a stronger separation between periodic and quasi-periodic signals when compared to the results in the original work [21], while displaying comparable results in the edge scenarios of signals that are in general hard to identify in Table 4.1, like the "ClinicalAsymmetry" case.
It is important to point out that the computations presented in this work do not reflect the persistent cup product as a submodule of the persistent cohomology. We are computing the persistence of a specific persistent cochain that arises from the cup product, at the cochain level, of generators of persistent classes in the persistent cohomology. Nevertheless, investigating the possibility of extending some of these tools to aid in the computation of the cup product as a persistent submodule is an interesting research avenue for future projects.
Furthermore, the cup product persistence algorithm presented here can achieve higher theoretical performance compared to the version presented in [21], which depends on computing 2-dimensional persistent cohomology, as explained in Section 4.3. Our specific implementation was made in Python 3.7 and therefore carries some of the inefficiencies inherent to the language.
Further work will focus on refining our implementation to take further advantage of the computational benefits one can achieve with this algorithm, as well as on exploring the performance of this classification methodology on other quasi-periodic phenomena.

CHAPTER 5
LENS COORDINATES AND DATA COORDINATIZATION

5.1 Preliminaries

5.1.1 Persistent Cohomology

A family K = {K_α}_{α∈R} of simplicial complexes is called a filtration if K_α ⊂ K_{α′} whenever α ≤ α′. If F is a field and i ≥ 0 is an integer, then the direct sum

PH^i(K; F) := ⊕_α H^i(K_α; F)

of cohomology groups is called the i-th dimensional F-persistent cohomology of K. A theorem of Crawley-Boevey [38] contends that if H^i(K_α; F) is finite dimensional for each α, then the isomorphism type of PH^i(K; F), as a persistence module, is uniquely determined by a multiset (i.e., a set whose elements may appear with repetitions)

dgm ⊂ {(α, α′) ∈ [−∞, ∞]^2 : α ≤ α′}

called the persistence diagram of PH^i(K; F). Pairs (α, α′) with large persistence α′ − α are indicative of stable topological features throughout the filtration K.
Persistent cohomology is used in TDA to quantify the topology underlying a data set. There are two widely used filtrations associated to a subset X of a metric space (M, d): the Rips filtration R(X) = {R_α(X)}_α and the Čech filtration Č(X) = {Č_α(X)}_α. Specifically, R_α(X) is the set of nonempty finite subsets of X with diameter less than α, and Č_α(X) is the nerve of the collection B_α of open balls B_α(x) ⊂ M of radius α, centered at the points x ∈ X; in other words, Č_α(X) = N(B_α). Generally R(X) is more easily computable, but Č(X) has better theoretical properties (e.g., the Nerve theorem [39, 4G.3]). Their relative weaknesses are ameliorated by noticing that

R_α(X) ⊂ N(B_α) ⊂ R_{2α}(X)

for all α, and using both filtrations in analyses: Rips for computations, and Čech for theoretical inference.

5.1.2 Lens Spaces

Let q ∈ N and let ζ_q ∈ C be a primitive q-th root of unity. Fix n ∈ N and let q_1, …, q_n ∈ N be relatively prime to q. We define the Lens space L^n_q(q_1, …, q_n) as the quotient of S^{2n−1} ⊂ C^n by the right Z_q action

[z_1, …, z_n] · g := [z_1 ζ_q^{q_1 g}, …, z_n ζ_q^{q_n g}],

with the simplified notation L^n_q := L^n_q(1, …, 1). Notice that when q = 2 and q_1 = ⋯ = q_n = 1, the right action described above is the antipodal map of S^{2n−1}, and therefore L^n_2 = RP^{2n−1}. Similarly, the infinite Lens space L^∞_q = L^∞_q(1, 1, …) is defined as the quotient of the infinite unit sphere S^∞ ⊂ C^∞ by the action of Z_q induced by scalar-vector multiplication by powers of ζ_q.

5.1.2.1 A Fundamental domain for L^2_q(1, p)

In what follows we describe a convenient model for both L^2_q(1, p) and a fundamental domain thereof. This model will allow us to provide visualizations in Lens spaces towards the end of the paper. Let D^3 be the set of points x ∈ R^3 with ∥x∥ ≤ 1, and let D_+ (D_−) be the upper (lower) hemisphere of ∂D^3, including the equator. Let r_{p/q} : D_+ → D_+ be the counterclockwise rotation by 2πp/q radians around the z-axis, and let ρ : D_+ → D_− be the reflection ρ(x, y, z) = (x, y, −z). Then L^2_q(1, p) is homeomorphic to D^3/∼, where x ∼ y if and only if x ∈ D_+ and y = ρ ∘ r_{p/q}(x).

5.1.3 Principal Bundles

Let B be a topological space with base point b_0 ∈ B. One of the most transparent methods for producing an explicit bijection between Ȟ^1(B; C_{Z_q}) and [B, L^∞_q] is via the theory of principal bundles.
We present a terse introduction here, but direct the interested reader to [40] for details.
A continuous map π : P → B is said to be a fiber bundle with fiber F = π^{−1}(b_0) and total space P if π is surjective, and every b ∈ B has an open neighborhood U ⊂ B as well as a homeomorphism ρ_U : U × F → π^{−1}(U), so that π ∘ ρ_U(x, e) = x for every (x, e) ∈ U × F.
Let (G, +) be an abelian topological group. A fiber bundle π : P → B is said to be a principal G-bundle over B if P comes equipped with a free right G-action P × G ∋ (e, g) ↦ e · g ∈ P which is transitive on π^{−1}(b) for every b ∈ B. Moreover, two principal G-bundles π : P → B and π′ : P′ → B are isomorphic if there exists a homeomorphism Φ : P → P′ with π′ ∘ Φ = π and so that Φ(e · g) = Φ(e) · g for all (e, g) ∈ P × G.
Given an open cover U = {U_j}_{j∈J} of B, a Čech cocycle η = {η_{jk}} ∈ Ž^1(U; C_G) is a collection of continuous maps η_{jk} : U_j ∩ U_k → G so that η_{jk}(b) + η_{kl}(b) = η_{jl}(b) for every b ∈ U_j ∩ U_k ∩ U_l. Given such a cocycle, one can construct a principal G-bundle over B with total space

P_η = ( ⊔_{j∈J} U_j × {j} × G ) / ∼,

where (b, j, g) ∼ (b, k, g + η_{jk}(b)) for every b ∈ U_j ∩ U_k, and π : P_η → B sends the class of (b, j, g) to b ∈ B.

Theorem 5.1.1. If Prin_G(B) denotes the set of isomorphism classes of principal G-bundles over B, then

Ȟ^1(B; C_G) → Prin_G(B), [η] ↦ [P_η]

is a bijection.

Proof. See 2.4 and 2.5 in [18].

Now, let us describe the relation between principal G-bundles over B and maps from B to the classifying space BG. Indeed, let ȷ : EG → BG = EG/G be the quotient map. Given h : B → BG continuous, the pullback h^∗EG is the principal G-bundle over B with total space {(b, e) ∈ B × EG : h(b) = ȷ(e)} and projection map (b, e) ↦ b. Moreover,

Theorem 5.1.2. Let [B, BG] denote the set of homotopy classes of maps from B to the classifying space BG. Then the function

[B, BG] → Prin_G(B), [h] ↦ [h^∗EG]

is a bijection.

Proof. See [40], Chapter 4: Theorems 12.2 and 12.4.

In summary, given a principal G-bundle π : P → B, or its corresponding Čech cocycle η, there exists a continuous map h : B → BG so that h^∗EG is isomorphic to (π, P), and the choice of h is unique up to homotopy. Any such choice is called a classifying map for π : P → B.

5.2 Main Theorem: Explicit Classifying Maps for L^∞_q

The goal of this section is to show how one can go from a singular cocycle η ∈ Z^1(N(U); Z_q) to an explicit map f : ∪_j U_j → L^∞_q. Let J = {1, …, n}, let U = {U_j}_{j∈J} be an open cover for B, and let {φ_j}_{j∈J} be a partition of unity dominated by U. If η ∈ Z^1(N(U); Z_q) and ζ_q is a primitive q-th root of unity, let f_j : U_j × {j} × Z_q → S^{2n−1} ⊂ C^n be

f_j(b, j, g) = ( √(φ_1(b)) ζ_q^{g+η_{j1}}, …, √(φ_n(b)) ζ_q^{g+η_{jn}} ).

If b ∈ U_j ∩ U_k, then f_j(b, j, g) = f_k(b, k, g + η_{jk}), and we get an induced map Φ : P_η → S^{2n−1} ⊂ S^∞ taking the class of (b, j, g) in the quotient P_η to f_j(b, j, g).

Proposition 5.2.1. Φ is well defined and Z_q-equivariant.

Proof. Take [b, j, g] ∈ P_η and consider a different representative of the class, namely an element (b, k, g + η_{jk}) such that b ∈ U_j ∩ U_k. By definition of Φ we have Φ([b, j, g]) = f_j(b, j, g) and Φ([b, k, g + η_{jk}]) = f_k(b, k, g + η_{jk}); and since f_j(b, j, g) = f_k(b, k, g + η_{jk}), we have that Φ([b, j, g]) = Φ([b, k, g + η_{jk}]), which shows that Φ is well defined. To see that Φ is Z_q-equivariant, take m ∈ Z_q, m = 0, …, q − 1, and compute
Φ([b, j, g]) · m = ( √(φ_1(b)) ζ_q^{g+m+η_{j1}}, …, √(φ_n(b)) ζ_q^{g+m+η_{jn}} ) = f_j(b, j, g + m) = Φ([b, j, g + m]) = Φ([b, j, g] · m).

Let p : S^{2n−1} → L^n_q be the quotient map. Since Φ : P_η → S^{2n−1} ⊂ S^∞ is Z_q-equivariant, it induces a map f : B → L^n_q ⊂ L^∞_q such that p ∘ Φ = f ∘ π. By construction of π : P_η → B, f(π([b, j, g])) = f(b) for any g ∈ Z_q; in particular, taking 0 ∈ Z_q,

U_j ∋ b ↦ f(b) = [ √(φ_1(b)) ζ_q^{η_{j1}} : ⋯ : √(φ_n(b)) ζ_q^{η_{jn}} ].    (5.1)

Remark 5.2.2. The notation [a_1 : ⋯ : a_n] corresponds to homogeneous coordinates in the quotient S^{2n−1}/Z_q. In other words, [a_1 : ⋯ : a_n] = {[a_1 · α, …, a_n · α] ∈ S^{2n−1} : α ∈ Z_q}.

Theorem 5.2.3. The map f classifies the Z_q-principal bundle P_η associated to the cocycle η ∈ Z^1(N(U); Z_q).

Proof. First we need to see that f is well defined. Let b ∈ U_j ∩ U_k; then

p(Φ([b, j, 0])) = [ √(φ_1(b)) ζ_q^{η_{j1}} : ⋯ : √(φ_n(b)) ζ_q^{η_{jn}} ] = p(Φ([b, k, 0])).

This shows that f(b) is independent of the open set containing b. Hence (Φ, f) : (P_η, π, B) → (S^{2n−1}, π, L^n_q) is a morphism of principal Z_q-bundles, and by [40], Chapter 4: Theorem 4.2, we conclude that P_η and f^∗(S^{2n−1}) are isomorphic principal Z_q-bundles over B.

5.3 Lens coordinates for data

Let (M, d) be a metric space and let L ⊂ M be a finite subset. We will use the following notation from now on: B_ϵ(l) = {y ∈ M : d(y, l) < ϵ}, B_ϵ = {B_ϵ(l)}_{l∈L}, and L_ϵ = ∪B_ϵ. Given a data set X ⊂ M, our goal is to choose landmarks L ⊂ X, a suitable ϵ such that X ⊂ L_ϵ, and a cocycle η ∈ Z^1(N(B_ϵ); Z_q). Equation (5.1) then yields a map f : L_ϵ → L^∞_q defined for every point in X, yet constructed from a much smaller subset of landmarks. Next we describe the details of this construction.

5.3.1 Landmark selection

We select the landmark set L ⊂ X either at random or through maxmin sampling. The latter proceeds inductively as follows: fix n ≤ |X| and let l_1 ∈ X be chosen at random; given l_1, …, l_j ∈ X for j < n, we let

l_{j+1} = argmax_{x∈X} min{d(x, l_1), …, d(x, l_j)}.

5.3.2 A Partition of Unity subordinated to B_ϵ

Defining f requires a partition of unity subordinated to B_ϵ. Since B_ϵ is an open cover composed of metric balls, we can provide an explicit construction. Indeed, for r ∈ R let |r|_+ := max{r, 0}; then

φ_l(x) := |ϵ − d(x, l)|_+ / Σ_{l′∈L} |ϵ − d(x, l′)|_+    (5.2)

is a partition of unity subordinated to B_ϵ.

5.3.3 From Rips to Čech to Rips

As we alluded to in the introduction, a persistent cohomology calculation is an appropriate vehicle to select a scale ϵ and a candidate cocycle η. That said, determining η ∈ Z^1(N(B_ϵ); Z_q) would require computing N(B_ϵ) for all ϵ, which in general is an expensive procedure. Instead we will use the homomorphisms

H^1(R_{2ϵ}(L)) --i^∗--> H^1(N(B_ϵ)) --ι--> H^1(R_ϵ(L))

induced by the appropriate inclusions. Indeed, let η̃ ∈ Z^1(R_{2ϵ}(L); Z_q) be such that [η̃] ∉ ker(ι ∘ i^∗); this is where we use the persistent cohomology of R(L). Since the previous diagram commutes, [η̃] ∉ ker(i^∗), so i^∗([η̃]) ≠ 0 in H^1(N(B_ϵ); Z_q). We let [η] = i^∗([η̃]) be the class that we use in Theorem 5.2.3. However,

Proposition 5.3.1. If b ∈ B_ϵ(l_j) and 1 ≤ k ≤ n, then

√(φ_k(b)) ζ_q^{η_{jk}} = √(φ_k(b)) ζ_q^{η̃_{jk}}.

That is, we can compute Lens coordinates using only the Rips filtration on the landmark set.

Proof. First of all, R_{2ϵ}(L)^{(0)} = N(B_ϵ)^{(0)} = L. If b ∉ B_ϵ(l_k), then φ_k(b) = 0 and therefore the equality holds. If, on the other hand, b ∈ B_ϵ(l_k) ∩ B_ϵ(l_j), then {j, k} ∈ N(B_ϵ)^{(1)} ⊂ R_{2ϵ}(L)^{(1)}, in which case, by definition of i^∗, we have η̃_{jk} = η_{jk}.
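The construction in this section is straightforward to prototype. Below is a minimal numpy sketch; the helper names maxmin_landmarks and lens_coordinates are ours, and in practice the cocycle η would be obtained from a Ripser persistent-cohomology computation as described above.

```python
import numpy as np

def maxmin_landmarks(X, n, seed=0):
    """Greedy maxmin sampling on an (N, D) point cloud; returns n landmark indices."""
    rng = np.random.default_rng(seed)
    idx = [int(rng.integers(len(X)))]
    d = np.linalg.norm(X - X[idx[0]], axis=1)
    for _ in range(n - 1):
        idx.append(int(np.argmax(d)))        # farthest point from the current landmarks
        d = np.minimum(d, np.linalg.norm(X - X[idx[-1]], axis=1))
    return np.array(idx)

def lens_coordinates(X, L, eps, eta, q):
    """Lens coordinates of Equation (5.1): L is the (n, D) landmark array, eta an
    (n, n) integer matrix with eta[j, k] the cocycle value on the edge {l_j, l_k}.
    Each output row is a representative in S^{2n-1} of f(x) in L^n_q."""
    dists = np.linalg.norm(X[:, None, :] - L[None, :, :], axis=2)   # (N, n)
    bump = np.maximum(eps - dists, 0.0)                             # |eps - d|_+
    phi = bump / bump.sum(axis=1, keepdims=True)                    # Equation (5.2)
    jx = np.argmin(dists, axis=1)         # for each x, an index j with x in B_eps(l_j)
    zeta = np.exp(2 * np.pi * 1j / q)     # primitive q-th root of unity
    return np.sqrt(phi) * zeta ** eta[jx, :]                        # rows have unit norm
```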
5.4 Dimensionality Reduction in L^n_q via Principal Lens Components

Equation (5.1) gives an explicit formula for the classifying map f : B → L^n_q. By construction, the dimension of L^n_q depends on the number n of landmarks selected, which in general can be large. The main goal of this section is to construct a dimensionality reduction procedure in L^n_q to address this shortcoming. To this end, we define the distance d_L : L^n_q × L^n_q → [0, ∞) as

d_L([x], [y]) := d_H(x · Z_q, y · Z_q),    (5.3)

where d_H is the Hausdorff distance for subsets of S^{2n−1}.

Proposition 5.4.1. Let [x], [y] ∈ L^n_q; then d_L([x], [y]) = d(x, Z_q y) = min_{g∈Z_q} d(x, gy).

Proof. For x, y ∈ C^n let ⟨x, y⟩_R := real(⟨x, y⟩_C). By definition of the Hausdorff distance we have

d_L([x], [y]) = max{ max_{g∈Z_q} min_{h∈Z_q} arccos(⟨x · g, y · h⟩_R), max_{h∈Z_q} min_{g∈Z_q} arccos(⟨x · g, y · h⟩_R) }.

Notice that

⟨x · g, y · h⟩_R = real(⟨ζ_q^g x, ζ_q^h y⟩_C) = real(⟨x, ζ_q^{(h−g)} y⟩_C) = ⟨x, y · (h − g)⟩_R,

and since Z_q is abelian,

max_{h} min_{g} arccos(⟨x · g, y · h⟩_R) = max_{h} min_{g} arccos(⟨x · (g − h), y⟩_R) = max_{h} min_{g} arccos(⟨x · (−h), y · (−g)⟩_R) = max_{h′} min_{g′} arccos(⟨x · h′, y · g′⟩_R).

Thus d_L([x], [y]) = max_{g} min_{h} arccos(⟨x · g, y · h⟩_R). Furthermore,

d_L([x], [y]) = max_{g} d(x · g, y · Z_q) = max_{g} d(x, y · (−g)Z_q),

and since y · (−g)Z_q = y · Z_q for any g ∈ Z_q, we obtain d_L([x], [y]) = d(x, y · Z_q) = min_{h} d(x, y · h).

We now describe a notion of projection in L^n_q onto lower-dimensional Lens spaces. Let u ∈ S^{2n−1}. Since ζ_q^k w ∈ span_C(u)^⊥ for any k ∈ Z_q and w ∈ span_C(u)^⊥, the set

L^{n−1}_q(u) := (span_C(u)^⊥ ∩ S^{2n−1}) / Z_q

is isometric to L^{n−1}_q. Let P_u^⊥(v) = v − ⟨v, u⟩_C u for v ∈ C^n, and for v ∉ span_C(u) let

P_u([v]) := [ P_u^⊥(v) / ∥P_u^⊥(v)∥ ] ∈ L^{n−1}_q(u).

It readily follows that P_u is well defined, and that

Lemma 5.4.2. For u ∈ S^{2n−1} and v ∉ span_C(u), we have

d_L([v], P_u([v])) = d( v, P_u^⊥(v)/∥P_u^⊥(v)∥ ),

where d is the distance on S^{2n−1}. Furthermore, P_u([v]) is the point in L^{n−1}_q(u) closest to [v] with respect to d_L.

Proof. From Proposition 5.4.1 we know that

d_L([v], P_u([v])) = min_{g∈Z_q} d( v, (P_u^⊥(v)/∥P_u^⊥(v)∥) · g ).

Let g^∗ := argmin_{g∈Z_q} d(v, (P_u^⊥(v)/∥P_u^⊥(v)∥) · g), so that

d_L([v], P_u([v])) = arccos( ⟨ v, (P_u^⊥(v)/∥P_u^⊥(v)∥) · g^∗ ⟩_R ).

The argument of the arccos can be simplified as follows:

⟨ v, (P_u^⊥(v)/∥P_u^⊥(v)∥) · g^∗ ⟩_R = ⟨ ⟨v, u⟩_C u, (P_u^⊥(v)/∥P_u^⊥(v)∥) · g^∗ ⟩_R + ⟨ P_u^⊥(v), (P_u^⊥(v)/∥P_u^⊥(v)∥) · g^∗ ⟩_R.

Since u and P_u^⊥(v) are orthogonal in C^n, they are also orthogonal in R^{2n}, making the first summand on the right-hand side equal to zero. Additionally, since arccos, as a real-valued function, is monotonically decreasing, we have

g^∗ = argmax_{g∈Z_q} (1/∥P_u^⊥(v)∥) ⟨ P_u^⊥(v), P_u^⊥(v) · g ⟩_R.

Using the fact that the action of Z_q is an isometry (and therefore an operator of norm one), as well as the Cauchy-Schwarz inequality, we obtain

(1/∥P_u^⊥(v)∥) ⟨ P_u^⊥(v), P_u^⊥(v) · g ⟩_R ≤ (1/∥P_u^⊥(v)∥) ∥P_u^⊥(v)∥ ∥P_u^⊥(v) · g∥ = ∥P_u^⊥(v) · g∥ = ∥P_u^⊥(v)∥,

and the equality holds whenever g = e ∈ Z_q, so we must have g^∗ = e.
Now let [w] ∈ L^{n−1}_q(u), so that w ∈ span_C(u)^⊥; this implies that for any h ∈ Z_q,

⟨u, w · h⟩_C = Σ_k u_k (ζ_q^h w_k)^* = ζ_q^{−h} Σ_k u_k w_k^* = ζ_q^{−h} ⟨u, w⟩_C = 0.

In other words, w · h ∈ span_C(u)^⊥ for any h ∈ Z_q.
Thus, by the Cauchy-Schwarz inequality,

⟨v, w · h⟩_R = ⟨ ⟨v, u⟩_C u + P_u^⊥(v), w · h ⟩_R = ⟨ P_u^⊥(v), w · h ⟩_R ≤ |⟨ P_u^⊥(v), w · h ⟩_R| ≤ ∥P_u^⊥(v)∥ ∥w · h∥ = ∥P_u^⊥(v)∥ ∥w∥ = ∥P_u^⊥(v)∥,

since the action of Z_q is an isometry and w ∈ S^{2n−1}. Finally, since arccos is decreasing,

d_L([v], P_u([v])) = arccos(∥P_u^⊥(v)∥) ≤ arccos(⟨v, w · h⟩_R)

for all h ∈ Z_q; thus d_L([v], P_u([v])) ≤ d_L([v], [w]).

This last result suggests that a PCA-like approach is possible for dimensionality reduction in Lens spaces. Specifically, for Y = {[y_1], …, [y_N]} ⊂ L^n_q, the goal is to find u ∈ S^{2n−1} such that L^{n−1}_q(u) is the best (n − 1)-Lens space approximation to Y, then project Y onto L^{n−1}_q(u) using P_u, and repeat the process, iteratively reducing the dimension by 1 each time. At each stage the appropriate constrained optimization problem is

u^∗ = argmin_{u∈C^n, ∥u∥=1} Σ_{j=1}^{N} d_L([y_j], P_u([y_j]))^2 = argmin_{u∈C^n, ∥u∥=1} Σ_{j=1}^{N} ( π/2 − arccos(|⟨y_j, u⟩|) )^2,

which can be linearized using the Taylor series expansion of arccos(θ) around 0. Indeed, |π/2 − arccos(θ)| ≈ |θ| to third order, and thus

u^∗ ≈ argmin_{u∈C^n, ∥u∥=1} Σ_{j=1}^{N} |⟨y_j, u⟩|^2.

This approximation is a linear least squares problem, whose solution is given by the eigenvector corresponding to the smallest eigenvalue of the covariance matrix

Cov(y_1, …, y_N) = [y_1 ⋯ y_N] [y_1 ⋯ y_N]^†,

where [y_1 ⋯ y_N] is the n × N matrix with columns y_j, and † denotes the conjugate transpose. Moreover, for any α_1, …, α_N ∈ S^1 ⊂ C we have that Cov(α_1 y_1, …, α_N y_N) = Cov(y_1, …, y_N), so Cov(Y) is well defined for Y ⊂ L^n_q.

5.4.1 Inductive construction of LPCA

Let v_n = LastLensComp(Y) be the eigenvector of Cov(Y) corresponding to the smallest eigenvalue. Assume that we have constructed v_{k+1}, …, v_n ∈ S^{2n−1} for 1 < k < n, and let {u_1, …, u_k} be an orthonormal basis for span_C(v_{k+1}, …, v_n)^⊥. Let U_k ∈ C^{n×k} be the matrix with columns u_1, …, u_k, and let U_k^† be its conjugate transpose. We define the k-th Lens Principal component of Y as the vector

v_k := U_k · LastLensComp( U_k^† y_1 / ∥U_k^† y_1∥, …, U_k^† y_N / ∥U_k^† y_N∥ ).

This inductive procedure yields a collection [v_2], …, [v_n] ∈ L^n_q, and we let v_1 ∈ S^{2n−1} be such that span_C{v_1} = span_C{v_2, …, v_n}^⊥. Finally, LPCA(Y) := {[v_1], …, [v_n]} are the Lens Principal Components of Y.
Let V_k ∈ C^{n×k} be the n-by-k matrix with columns v_1, …, v_k, and let P_k(Y) ⊂ L^k_q be the set of classes [ V_k^† y_j / ∥V_k^† y_j∥ ], 1 ≤ j ≤ N. The point clouds P_k(Y), k = 1, …, n, are the Lens Principal Coordinates of Y.

5.4.2 Choosing a target dimension

The variance recovered by the first k Lens Principal Components [v_1], …, [v_k] ∈ L^n_q is defined as

var_k(Y) := (1/N) Σ_{l=2}^{k} Σ_{j=1}^{N} d_L( [V_l^† y_j / ∥V_l^† y_j∥], L^{l−1}_q(e_{l−1}) )^2,

where V_l is the n-by-l matrix with columns v_1, …, v_l, 1 < l ≤ k, and e_{l−1} ∈ C^l is the vector [0, …, 0, 1, 0]. Therefore, the percentage of cumulative variance, p.var(k) := var_k(Y)/var_n(Y), can be interpreted as the portion of the total variance of Y along LPCA(Y) explained by the first k components.
Thus we can select the target dimension as the smallest k for which p.var(k) is greater than a predetermined value; in other words, we select the dimension that recovers a significant portion of the total variance. Another possible guideline for choosing the target dimension is the minimum value of k for which p.var(k + 1) − p.var(k) < γ for a small γ > 0.
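A compact numpy prototype of the inductive construction above is sketched below, under the assumption that Y is an n × N complex matrix whose unit-norm columns represent the classes [y_j]; the function names are ours.

```python
import numpy as np

def last_lens_comp(Y):
    """Eigenvector of Cov(Y) = Y Y^dagger with the smallest eigenvalue."""
    w, v = np.linalg.eigh(Y @ Y.conj().T)     # eigh: Hermitian, ascending eigenvalues
    return v[:, 0]

def lpca(Y):
    """Inductive LPCA (Section 5.4.1): returns v_1, ..., v_n as columns of an array.
    Assumes no data column lies entirely in the span of the removed components."""
    n = Y.shape[0]
    comps = [last_lens_comp(Y)]               # v_n
    for k in range(n - 1, 1, -1):             # build v_k for k = n-1, ..., 2
        V = np.stack(comps, axis=1)           # v_{k+1}, ..., v_n (orthonormal columns)
        u = np.linalg.svd(V)[0]               # left singular vectors of V
        Uk = u[:, V.shape[1]:]                # orthonormal basis of span(V)^perp
        Z = Uk.conj().T @ Y                   # project data to the complement...
        Z = Z / np.linalg.norm(Z, axis=0)     # ...and renormalize each column
        comps.append(Uk @ last_lens_comp(Z))  # v_k
    V = np.stack(comps, axis=1)
    comps.append(np.linalg.svd(V)[0][:, -1])  # v_1 spans the remaining complement
    return np.stack(list(reversed(comps)), axis=1)
```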
5.4.3 Independence of the cocycle representative

Let η ∈ Z^1(N(B_ϵ); Z_q) be such that [η] ≠ 0 in H^1(N(B_ϵ); Z_q), and let η′ = η + δ^0(α) with α ∈ C^0(N(B_ϵ); Z_q). If b ∈ U_j, then

f_{η′}(b) = [ √(φ_1(b)) ζ_q^{η_{j1}+α_1} : ⋯ : √(φ_n(b)) ζ_q^{η_{jn}+α_n} ].

If Z_α is the square diagonal matrix with entries ζ_q^{α_1}, ζ_q^{α_2}, …, ζ_q^{α_n}, then f_{η′}(b) = Z_α · f(b); moreover, after taking classes in L^n_q, this implies that f_{η′}(X) = Z_α · f(X). Since Cov(Z_α · f(X)) = Z_α Cov(f(X)) Z_α^† and Z_α is unitary, if v is an eigenvector of Cov(f(X)) with eigenvalue σ, then Z_α v is an eigenvector of Cov(Z_α · f(X)) with the same eigenvalue. Therefore LastLensComp(f_{η′}(X)) = Z_α LastLensComp(f(X)). Since each component in LPCA is obtained in the same manner, we must have LPCA(f_{η′}(X)) = Z_α LPCA(f(X)). Thus, the Lens coordinates from two cohomologous cocycles η and η + δ^0(α) (i.e., representing the same cohomology class) only differ by the isometry of L^n_q induced by the linear map Z_α.

5.4.4 Visualization map for L^2_3

Given v_1, …, v_n ∈ S^{2n−1}, representatives for the classes in LPCA(Y), we want to visualize P_2(Y) ⊂ L^2_3 in the fundamental domain described in Section 5.1.2.1. Let

P_2(Y) = { (⟨y_i, v_1⟩_C, ⟨y_i, v_2⟩_C) ∈ S^3 ⊂ C^2 : [y_i] ∈ Y },

and define G on P_2(Y) ⊂ S^3 ⊂ C^2 by

G(z, w) := ( ζ_3^{−k} z, (arg(w) − π/3) √(1 − ∥z∥^2) ),    (5.4)

where arg(w) ∈ [0, 2π/3), and k is the integer such that arg(z) = k(2π/3) + θ, with θ the remainder after division by 2π/3.

Metric on the Moore space M(Z_3, 1). For x, y ∈ C with |x|, |y| ≤ 1, we let

d(x, y) = √(|⟨x, y⟩_R|)                       if ∥x∥, ∥y∥ < 1,
d(x, y) = min_{ζ∈Z_3} √(|⟨x, ζy⟩_R|)          if ∥x∥ = 1 or ∥y∥ = 1,    (5.5)
d(x, y) = min_{ζ∈Z_3} arccos(|⟨x, ζy⟩_R|)     if ∥x∥ = 1 and ∥y∥ = 1.

5.5 Examples

5.5.1 The Circle S^1

Let S^1 ⊂ C be the unit circle, and let X be a random sample around S^1 with 10,000 points and Gaussian noise in the normal direction. L ⊂ X is a landmark set with 10 points, obtained as described in Section 5.3.1. Let a be the cohomological death of the most persistent class in PH^1(R(L); Z_3). For ϵ := a + 10^{−5} and η = i^∗(η′) ∈ Z^1(N(B_ϵ); Z_3), we define the map f : L_ϵ → L^{10}_3 as in Equation (5.1). After computing LPCA for f(X) ⊂ L^{10}_3 and the percentage of cumulative variance p.var(k), we obtain the row in Table 5.1 with label S^1 (see Figure 5.2 for more details). We see that dimension 1 recovers ∼60% of the variance. Moreover, Figure 5.3 shows P_2(f(X)) ⊂ L^2_3 in the fundamental domain described in Section 5.1.2.1, through the map in Equation (5.4).

Figure 5.1 Left: Sample X; in black, the landmark set L ⊂ X. Right: PH^i(R(L); Z_3) for i = 0, 1, 2.

Figure 5.2 Percentage of recovered variance.

Dim. (n)  | 1    | 2    | 3    | 4    | 5
S^1       | 0.62 | 0.75 | 0.81 | 0.86 | 0.89
M(Z_3, 1) | 0.56 | 0.7  | 0.76 | 0.8  | 0.83
L^2_3     | 0.47 | 0.62 | 0.67 | 0.71 | 0.73

Table 5.1 Percentage of recovered variance in L^n_3.

Figure 5.3 Visualization of P_2(f(X)) ⊂ L^2_3.

One key aspect of LC (Lens coordinates) is that it is designed to highlight the cohomology class η used in Equation (5.1). This is easily observed in this example: we selected the most persistent class in PH^1(R(L); Z_3), and as a consequence, in Figure 5.3 we see how this class is preserved while all the information in the normal direction is lost in the process.

5.5.2 The Moore space M(Z_3, 1)

Let G be an abelian group and n ∈ N. The Moore space M(G, n) is a CW-complex such that H_n(M(G, n); Z) = G and H̃_i(M(G, n); Z) = 0 for all i ≠ n. A well-known construction for M(Z_3, 1) can be found in [39].
Equation (5.5) defines a metric on M(Z_3, 1). Figure 5.4, on the left, shows a sample X ⊂ M(Z_3, 1) with |X| = 15,000 and 70 landmarks. The landmarks were obtained by maxmin sampling, after feeding the algorithm an initial set of 10 points on the boundary of the disc. Figure 5.5 shows the persistent cohomology of R(L) with coefficients in Z_2 and Z_3 side by side.
We compute f : M(Z_3, 1) → L^{70}_3 analogously to the previous example and obtain a point cloud f(X) ⊂ L^{70}_3. The profile of recovered variance is shown in Table 5.1. Dimension 2 provides a low-dimensional representation of f(X) inside L^2_3 with 70% of recovered variance (see Figure 5.6).

Figure 5.4 Left: X ⊂ M(Z_3, 1) with landmarks in black. Right: PH^i(R(L); Z_3) for i = 0, 1.

Figure 5.5 PH^i(R(L); F) for i = 0, 1 and F = Z_2, Z_3.

Since f classifies the principal Z_3-bundle P_η over M(Z_3, 1), f must be homotopic to the inclusion of M(Z_3, 1) in L^∞_3. Figure 5.7 shows X ⊂ M(Z_3, 1) mapped by f into L^2_3. Notice that the identifications on X are handled by the identification on S^1 × {0} ⊂ D^3 from the fundamental domain in Section 5.1.2.1. See https://youtu.be/_Ic730_xFkw for a more complete visualization.

Figure 5.6 Percentage of recovered variance.

Figure 5.7 Visualization of the resulting P_2(f(X)) ⊂ L^2_3.

5.5.3 The Lens space L^2_3 = S^3/Z_3

We use the metric d_L defined in Equation (5.3) on L^2_3 and randomly sample 15,000 points to create X ⊂ L^2_3. Figure 5.9 (left) shows the sample set using the fundamental domain from Section 5.1.2.1.
We can use PH^i(R(X); Z_2) and PH^i(R(X); Z_3) to verify that the sampled metric space has the expected topological features; Figure 5.8 contains the corresponding persistence diagrams.

Figure 5.8 PH^i(R(L); F) for i = 0, 1 and F = Z_2, Z_3.

Just as in the previous examples, we define f : L^2_3 → L^∞_3 using the most persistent class in PH^1(R(L); Z_3). The homotopy class of f must be the same as that of the inclusion L^2_3 ⊂ L^∞_3, since f classifies the Z_3-principal bundle P_η; thus we expect L^2_3 to be preserved up to homotopy under LPCA. Figure 5.9 offers a side and a top view of P_2(f(X)) ⊂ L^2_3. Here we clearly see how the original data set X is transformed while preserving the identifications on the boundary of the fundamental domain. Finally, in Figure 5.10 we show the variance profile for the dimensionality reduction problem. We see that in dimension 4 we have recovered more than 70% of the total variance, as seen in Table 5.1.

5.5.4 Isomap dimensionality reduction

We conclude this section by providing evidence that Lens coordinates (LC) preserve topological features when compared to other dimensionality reduction algorithms. For this purpose we use Isomap [41] as our point of comparison.

Figure 5.9 Left: X ⊂ L^2_3. Right: Lens coordinates.

Figure 5.10 Percentage of recovered variance.

The Isomap algorithm consists of three main steps: the first determines neighborhoods of each point using k-nearest neighbors; the second estimates the geodesic distance between every pair of points using shortest paths in the neighborhood graph; and the final step applies classical MDS to the matrix of graph distances.
Let dgm be a persistence diagram. Define per_1 to be the largest persistence of an element in dgm, and let per_2 be the second largest persistence of an element in dgm.
          | Method | per_1/per_2 (Z_2) | per_1/per_2 (Z_3)
M(Z_3, 1) | Isomap | 1.0105 | 1.0105
          | LC     | 1.7171 | 3.6789
L^2_3     | Isomap | 1.0080 | 1.0080
          | LC     | 1.1592 | 2.8072

Table 5.2 In green we highlight the fraction that indicates which method better identifies the topological features.

For both M(Z_3, 1) and L^2_3 it is clear that the Isomap projection fails to preserve the difference between the cohomology groups with coefficients in Z_2 and Z_3. On the other hand, the LC projection maintains this difference in both examples.

BIBLIOGRAPHY

[1] J. A. Perea. "A Brief History of Persistence". In: preprint arXiv:1809.03624 (2018). url: https://arxiv.org/abs/1809.03624.

[2] P. Bubenik and P. Dlotko. "A persistence landscapes toolbox for topological statistics". In: arXiv e-prints (2014). arXiv: 1501.00179.

[3] Z. Cang et al. "A topological approach for protein classification". In: Computational and Mathematical Biophysics 3 (2015).

[4] J. Reininghaus et al. "A Stable Multi-Scale Kernel for Topological Machine Learning". In: arXiv e-prints (2014). arXiv: 1412.6821.

[5] J. A. Perea, E. Munch, and F. A. Khasawneh. "Approximating Continuous Functions on Persistence Diagrams Using Template Functions". In: CoRR abs/1902.07190 (2019). arXiv: 1902.07190. url: http://arxiv.org/abs/1902.07190.

[6] H. Adams et al. "Persistence Images: A Stable Vector Representation of Persistent Homology". In: Journal of Machine Learning Research 18.8 (2017), pp. 1-35. url: http://jmlr.org/papers/v18/16-337.html.

[7] A. Smith et al. "Supervised Learning of Labeled Pointcloud Differences via Cover-Tree Entropy Reduction". In: arXiv e-prints (2017). arXiv: 1702.07959.

[8] D. Reynolds. "Gaussian Mixture Models". In: Encyclopedia of Biometrics. Ed. by Stan Z. Li and Anil Jain. Boston, MA: Springer US, 2009, pp. 659-663. isbn: 978-0-387-73003-5. doi: 10.1007/978-0-387-73003-5_196.

[9] R. J. G. B. Campello, D. Moulavi, and J. Sander. "Density-Based Clustering Based on Hierarchical Density Estimates". In: Advances in Knowledge Discovery and Data Mining. Ed. by J. Pei et al. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013, pp. 160-172. isbn: 978-3-642-37456-2.

[10] D. Pickup et al. "Shape Retrieval of Non-rigid 3D Human Models". In: Proceedings of the 7th Eurographics Workshop on 3D Object Retrieval. 3DOR '14. Strasbourg, France: Eurographics Association, 2014, pp. 101-110. isbn: 978-3-905674-58-3. doi: 10.2312/3dor.20141056.

[11] P. Sonego et al. "A Protein Classification Benchmark collection for machine learning". In: Nucleic Acids Research 35 (Feb. 2007), pp. D232-236. doi: 10.1093/nar/gkl812.

[12] G. Carlsson. "Topological pattern recognition for point cloud data". In: Acta Numerica 23 (2014), pp. 289-368. doi: 10.1017/S0962492914000051.

[13] J. A. Perea and G. Carlsson. "A Klein-Bottle-Based Dictionary for Texture Representation". In: International Journal of Computer Vision 107 (Mar. 2014), pp. 75-97. doi: 10.1007/s11263-013-0676-2.

[14] G. Carlsson et al. "On the Local Behavior of Spaces of Natural Images". In: Int. J. Comput. Vision 76.1 (Jan. 2008), pp. 1-12. issn: 0920-5691. doi: 10.1007/s11263-007-0056-x.

[15] J. Milnor. "Construction of universal bundles, II". In: Annals of Mathematics (1956), pp. 430-436.

[16] R. Miranda. Algebraic curves and Riemann surfaces. Vol. 5. American Mathematical Soc., 1995.

[17] E. H. Brown. "Cohomology theories". In: Annals of Mathematics (1962), pp. 467-484.
[18] J. A. Perea. "Sparse Circular Coordinates via Principal Z-Bundles". In: arXiv e-prints (2018). arXiv: 1809.09269 [math.AT].

[19] J. A. Perea. "Multiscale projective coordinates via persistent cohomology of sparse filtrations". In: Discrete & Computational Geometry 59.1 (2018), pp. 175-225.

[20] H. Gakhar and J. A. Perea. Sliding Window Persistence of Quasiperiodic Functions. 2021. arXiv: 2103.04540.

[21] C. J. Tralie and J. A. Perea. "(Quasi)Periodicity Quantification in Video Data, Using Topology". In: SIAM Journal on Imaging Sciences 11.2 (2018), pp. 1049-1077. doi: 10.1137/17M1150736.

[22] B. Xu et al. "Twisty Takens: a geometric characterization of good observations on dense trajectories". In: Journal of Applied and Computational Topology 3 (2019), pp. 285-313. doi: 10.1007/s41468-019-00036-9.

[23] U. Bauer. Ripser: efficient computation of Vietoris-Rips persistence barcodes. 2021. arXiv: 1908.02518.

[24] J. R. Munkres. Elements of Algebraic Topology. Addison Wesley Publishing Company, 1984. isbn: 0201045869.

[25] A. Hatcher. Algebraic topology. Cambridge: Cambridge Univ. Press, 2000. url: https://cds.cern.ch/record/478079.

[26] H. Edelsbrunner and J. Harer. Computational Topology: An Introduction. Applied Mathematics. American Mathematical Society, 2010. isbn: 9780821849255.

[27] F. Chazal et al. The Structure and Stability of Persistence Modules. SpringerBriefs in Mathematics. Springer Verlag, 2016, pp. VII, 116. url: https://hal.inria.fr/hal-01330678.

[28] D. Cohen-Steiner, H. Edelsbrunner, and J. Harer. "Stability of Persistence Diagrams". In: Discrete & Computational Geometry 37.1 (2007), pp. 103-120. issn: 1432-0444. doi: 10.1007/s00454-006-1276-5.

[29] J. Sun, M. Ovsjanikov, and L. Guibas. "A Concise and Provably Informative Multi-Scale Signature Based on Heat Diffusion". In: Computer Graphics Forum (2009). issn: 1467-8659. doi: 10.1111/j.1467-8659.2009.01515.x.

[30] N. Fox, S. Brenner, and J. Chandonia. "SCOPe: Structural Classification of Proteins - Extended, integrating SCOP and ASTRAL data and classification of new structures". In: Nucleic Acids Research 42 (Dec. 2013). doi: 10.1093/nar/gkt1240.

[31] W. Crawley-Boevey. "Decomposition of pointwise finite-dimensional persistence modules". In: Journal of Algebra and its Applications 14.05 (2015), p. 1550066.

[32] H. Edelsbrunner and J. Harer. Computational Topology - an Introduction. American Mathematical Society, 2010, pp. I-XII, 1-241. isbn: 978-0-8218-4925-5.

[33] A. Zomorodian and G. Carlsson. "Computing Persistent Homology". In: Discrete & Computational Geometry 33 (2005), pp. 249-274. doi: 10.1007/s00454-004-1146-y.

[34] V. de Silva, D. Morozov, and M. Vejdemo-Johansson. "Dualities in persistent (co)homology". In: Inverse Problems 27.12 (2011), p. 124003. doi: 10.1088/0266-5611/27/12/124003. arXiv: 1107.5665.

[35] D. Cohen-Steiner, H. Edelsbrunner, and D. Morozov. "Vines and vineyards by updating persistence in linear time". In: Symposium on Computational Geometry. Ed. by Nina Amenta and Otfried Cheong. ACM, 2006, pp. 119-126. isbn: 1-59593-340-9.

[36] N. Saul and C. J. Tralie. Scikit-TDA: Topological Data Analysis for Python. 2019. doi: 10.5281/zenodo.2533369. url: https://doi.org/10.5281/zenodo.2533369.
[37] J. A. Perea and J. Harer. "Sliding Windows and Persistence: An Application of Topological Methods to Signal Analysis". In: Foundations of Computational Mathematics 15 (2015), pp. 799-838. doi: 10.1007/s10208-014-9206-z.

[38] W. Crawley-Boevey. "Decomposition of pointwise finite-dimensional persistence modules". In: Journal of Algebra and its Applications 14.05 (2015), p. 1550066.

[39] A. Hatcher. Algebraic topology. Cambridge University Press, 2002.

[40] D. Husemöller. Fibre Bundles. Graduate Texts in Mathematics. Springer, 1994. isbn: 9780387940878. url: https://books.google.com/books?id=DPr_BSH89cAC.

[41] J. B. Tenenbaum, V. de Silva, and J. C. Langford. "A Global Geometric Framework for Nonlinear Dimensionality Reduction". In: Science 290.5500 (2000), p. 2319.