Search results
(1 - 7 of 7)
- Title
- Dynamical Systems Analysis Using Topological Signal Processing
- Creator
- Myers, Audun
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Topological Signal Processing (TSP) is the study of time series data through the lens of Topological Data Analysis (TDA), a process of analyzing data through its shape. This work focuses on developing novel TSP tools for the analysis of dynamical systems. The term dynamical system refers broadly to a system whose state changes in time. These systems are formally assumed to be a continuum of states whose values are real numbers. However, real-life measurements of these systems only provide finite information from which the underlying dynamics must be gleaned. This necessitates drawing conclusions about the continuous structure of a dynamical system using noisy finite samples or time series. The interest often lies in capturing qualitative changes in the system’s behavior, known as bifurcations, through changes in the shape of the state space as one or more of the system parameters vary. Current literature on time series analysis aims to study this structure by searching for a lower-dimensional representation; however, the need for user-defined inputs, the sensitivity of these inputs to noise, and the expensive computational effort limit the usability of available knowledge, especially for in-situ signal processing.
This research aims to use and develop TSP tools to extract useful information about the underlying dynamical system's structure. The first research direction investigates the use of sublevel set persistence, a form of persistent homology from TDA, for signal processing, with applications including parameter estimation of a damped oscillator and signal complexity measures to detect bifurcations. The second research direction applies TDA to complex networks to investigate how the topology of such complex networks corresponds to the state space structure. We show how TSP applied to complex networks can be used to detect changes in signal complexity, including chaotic compared to periodic dynamics, in a noise-contaminated signal.
The last research direction focuses on the topological analysis of dynamical networks. A dynamical network is a graph whose vertices and edges have state values driven by a highly interconnected dynamical system. We show how zigzag persistence, a modification of persistent homology, can be used to understand the changing structure of such dynamical networks.
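As an illustration of the sublevel set persistence mentioned in this abstract, the following is a minimal sketch (not the dissertation's implementation) of 0-dimensional sublevel set persistence for a sampled 1-D signal. It assumes the signal is a plain list of numbers, treats neighboring samples as connected, and uses a union-find structure with the elder rule; the function name `sublevel_persistence` is hypothetical.

```python
def sublevel_persistence(signal):
    """Return sorted (birth, death) pairs of 0-dimensional sublevel set
    persistence for a 1-D signal. Illustrative sketch only."""
    n = len(signal)
    order = sorted(range(n), key=lambda i: signal[i])  # sweep values upward
    parent = [None] * n  # None means the sample has not entered the sublevel set
    birth = {}           # component root -> value at which the component was born

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later dies at the merge value.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if birth[young] < signal[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], signal[i]))
                parent[young] = old
    # The component containing the global minimum never dies.
    pairs.append((birth[find(order[0])], float("inf")))
    return sorted(pairs)


if __name__ == "__main__":
    print(sublevel_persistence([0, 2, 1, 3, 0.5, 4]))
```

Each finite pair matches a local minimum with the local maximum at which its valley merges into a deeper one, so short-lived pairs are a natural candidate for noise.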
- Title
- Topological Data Analysis and Machine Learning Framework for Studying Time Series and Image Data
- Creator
- Yesilli, Melih Can
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Recent advancements in signal acquisition and data mining have revealed the importance of data-driven tools for analyzing signals and images. The availability of large and complex data has also highlighted the need for investigative tools that are autonomous, robust to noise, and able to efficiently utilize data collected from different settings but pertaining to the same phenomenon. State-of-the-art approaches include using tools such as Fourier analysis, wavelets, and Empirical Mode Decomposition for extracting informative features from the data. These features can then be combined with machine learning for clustering, classification, and inference. However, these tools typically require human intervention for feature extraction, and they are sensitive to the input parameters that the user chooses during the laborious but often necessary manual data pre-processing. Therefore, this dissertation was motivated by the need for automatic, adaptive, and noise-robust methods for efficiently leveraging machine learning for studying images as well as time series of dynamical systems. Specifically, this work investigates three application areas: chatter detection in manufacturing processes, image analysis of manufactured surfaces, and tool wear detection during titanium alloy machining. This work’s novel investigations are enabled by combining machine learning with methods from Topological Data Analysis (TDA), a relatively recent field of applied topology that encompasses a variety of mature tools for quantifying the shape of data.
First, this study experimentally shows for the first time that persistent homology (or persistence) from TDA can be used for chatter classification with accuracies that rival existing detection methods. Further, the efficient use of chatter data sets from different sources is formulated and studied as a transfer learning problem using experimental turning and milling vibration signals. Classification results are shown using comparisons between the TDA pipeline developed in this dissertation and prominent methods for chatter detection.
Second, this work describes how to utilize TDA tools for extracting descriptive features from simulated samples generated using different Hurst roughness exponents. The efficiency of the feature extraction is tested by classifying the surfaces according to their roughness level. The resulting accuracies show that TDA can outperform several traditional feature extraction approaches in surface texture analysis. Further, as part of this work, adaptive threshold selection algorithms are developed for the Discrete Cosine Transform and the Discrete Wavelet Transform to bypass the need for subjective operator input during surface roughness analysis. Both experimental and synthetic data sets are used to test the effectiveness of these two algorithms. This study also discusses a TDA-based framework that can potentially provide a feasible approach for building an automatic surface finish monitoring system.
Finally, this work shows that persistence can be used for tool condition monitoring during titanium alloy machining. Since, in these processes, the cutting tools typically fracture catastrophically before the gradual tool wear reaches the maximum tool life criteria, the industry uses very conservative criteria for replacing the tools. An extensive experiment is described for relating wear markers in various sensor signals to the tool condition at different stages of the tool life. This work shows how, in this setting, TDA provides significant advantages in terms of robustness to noise and alleviating the need for an expert user to extract the informative features. The obtained TDA-based features are compared to existing state-of-the-art featurization tools using feature-level data fusion. The temporal location of the most representative tool condition features is also studied in the signals by considering a variety of window lengths preceding tool wear milestones.
- Title
- APPLICATIONS OF PERSISTENT COHOMOLOGY TO DIMENSIONALITY REDUCTION AND CLASSIFICATION PROBLEMS
- Creator
- Polanco Contreras, Luis G.
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Many natural phenomena are characterized by their underlying geometry and topological invariants. Part of understanding such processes is being able to differentiate them and classify them through their topological and geometrical signatures. Many advances have been made which use topological data analysis to such end. In this work we present multiple machine learning tools aided by topological data analysis to classify and understand said phenomena.
First, feature extraction from persistence diagrams, as a tool to enrich machine learning techniques, has received increasing attention in recent years. In this paper we explore an adaptive methodology to localize features in persistence diagrams, which are then used in learning tasks. Specifically, we investigate three algorithms, CDER, GMM and HDBSCAN, to obtain adaptive template functions/features. Said features are evaluated in three classification experiments with persistence diagrams, namely manifold, human shape, and protein classification. In this area, our main conclusion is that adaptive template systems, as a feature extraction technique, yield competitive and often superior results in the studied examples. Moreover, among the adaptive algorithms studied here, CDER consistently provides the most reliable and robust adaptive featurization.
Furthermore, we introduce a framework to construct coordinates in finite Lens spaces for data with nontrivial 1-dimensional $\mathbb{Z}_q := \mathbb{Z} / q\mathbb{Z}$ persistent cohomology, for $q > 2$ prime. Said coordinates are defined on an open neighborhood of the data, yet constructed with only a small subset of landmarks. We also introduce a dimensionality reduction scheme in $S^{2n-1}/\mathbb{Z}_q$ (Lens-PCA: LPCA) and demonstrate the efficacy of the pipeline $\mathbb{Z}_q$-persistent cohomology $\Rightarrow$ $S^{2n-1}/\mathbb{Z}_q$ coordinates $\Rightarrow$ LPCA for nonlinear (topological) dimensionality reduction. This methodology allows us to capture and preserve geometrical and topological information through a very efficient dimensionality reduction algorithm.
Finally, to make use of some of the most powerful tools in algebraic topology, we improve on methodologies that use persistent 2-dimensional homology to obtain quasiperiodic scores that indicate the degree of periodicity or quasiperiodicity of a signal. There is a significant computational disadvantage in this approach, since it requires the often expensive computation of 2-dimensional persistent homology. Our contribution in this area uses the algebraic structure of the cohomology ring to obtain classes in the 2-dimensional persistence diagram by only using classes in dimension 1, thereby saving valuable computational time and obtaining more reliable quasiperiodicity scores. We develop an algorithm that allows us to effectively compute the cohomological death and birth of a persistent cup product expression. This allows us to define a quasiperiodic score that reliably separates periodic from quasiperiodic time series.
- Title
- Algebraic topology and machine learning for biomolecular modeling
- Creator
- Cang, Zixuan
- Date
- 2018
- Collection
- Electronic Theses & Dissertations
- Description
-
Data is expanding at an unprecedented speed in both quantity and size. Topological data analysis provides excellent tools for analyzing high-dimensional and highly complex data. Inspired by the ability of topological data analysis to characterize data robustly and at multiple scales, and motivated by the demand for practical predictive tools in computational biology and biomedical research, this dissertation extends the capability of persistent homology toward quantitative and predictive data analysis tools with an emphasis on biomolecular systems. Although persistent homology is almost parameter free, careful treatment is still needed to obtain practically useful prediction models for realistic systems. This dissertation carefully assesses the representability of persistent homology for biomolecular systems and introduces a collection of characterization tools for both macromolecules and small molecules focusing on intra- and inter-molecular interactions, chemical complexities, electrostatics, and geometry. The representations are then coupled with deep learning and machine learning methods for several problems in drug design and biophysical research.
In real-world applications, data often come with heterogeneous dimensions and components. For example, in addition to location, atoms of biomolecules can also be labeled with chemical types, partial charges, and atomic radii. While persistent homology is powerful in analyzing the geometry of data, it lacks the ability to handle non-geometric information. Based on cohomology, we introduce a method that attaches the non-geometric information to the topological invariants in persistent homology analysis. This method is not only useful for handling biomolecules but can also be applied to general situations where the data carries both geometric and non-geometric information.
In addition to describing biomolecular systems as a static frame, we are often interested in the dynamics of the systems. An efficient way to study them is to assign an oscillator to each atom and study the coupled dynamical system induced by atomic interactions. To this end, we propose a persistent homology based method for the analysis of the resulting trajectories from the coupled dynamical system. The methods developed in this dissertation have been applied to several problems, namely, prediction of protein stability change upon mutations, protein-ligand binding affinity prediction, virtual screening, and protein flexibility analysis. The tools have shown top performance in both commonly used validation benchmarks and community-wide blind prediction challenges in drug design.
- Title
- A topological study of toroidal dynamics
- Creator
- Gakhar, Hitesh
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
-
This dissertation focuses on developing theoretical tools in the field of Topological Data Analysis and, more specifically, in the study of toroidal dynamical systems. We make contributions to the development of persistent homology by proving Künneth-type theorems, to topological time series analysis by further developing the theory of sliding window embeddings, and to multiscale data coordinatization in topological spaces by proving stability theorems.
First, in classical algebraic topology, the Künneth theorem relates the homology of two topological spaces with that of their product. We prove Künneth theorems for the persistent homology of the categorical and tensor product of filtered spaces. That is, we describe the persistent homology of these product filtrations in terms of that of the filtered components. Using these theorems, we also develop novel methods for algorithmic and abstract computations of persistent homology. One of the direct applications of these results is the abstract computation of Rips persistent homology of the N-dimensional torus.
Next, we develop the general theory of sliding window embeddings of quasiperiodic functions and their persistent homology. We show that the sliding window embeddings of quasiperiodic functions, under appropriate choices of the embedding dimension and time delay, are dense in higher dimensional tori. We also explicitly provide methods to choose these parameters. Furthermore, we prove lower bounds on Rips persistent homology of these embeddings. Using one of the persistent Künneth formulae, we provide an alternate algorithm to compute the Rips persistent homology of the sliding window embedding, which outperforms the traditional methods of landmark sampling in both accuracy and time. We also apply our theory to music, where, using sliding windows and persistent homology, we characterize dissonant sounds as quasiperiodic in nature.
Finally, we prove stability results for sparse multiscale circular coordinates. These coordinates on a data set were first created to aid non-linear dimensionality reduction analysis. The algorithm identifies a significant integer persistent cohomology class in the Rips filtration on a landmark set and solves a linear least squares optimization problem to construct a circle-valued function on the data set. However, these coordinates depend on the choice of the landmarks. We show that these coordinates are stable under Wasserstein noise on the landmark set.
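The sliding window embedding discussed in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's code: the signal is a uniformly sampled list, `tau` is a delay measured in samples, and `dim` counts the number of coordinates in each window (other references index the embedding dimension differently). The function name `sliding_window` and the example frequencies are hypothetical.

```python
import math


def sliding_window(signal, dim, tau):
    """Map a sampled signal to the point cloud of delay vectors
    (f[t], f[t + tau], ..., f[t + (dim - 1) * tau])."""
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_windows = len(signal) - (dim - 1) * tau
    return [
        tuple(signal[t + k * tau] for k in range(dim))
        for t in range(max(n_windows, 0))
    ]


if __name__ == "__main__":
    # A quasiperiodic test signal: two cosines with incommensurate frequencies.
    samples = [math.cos(0.1 * t) + math.cos(0.1 * math.sqrt(3) * t)
               for t in range(500)]
    cloud = sliding_window(samples, dim=5, tau=7)
    print(len(cloud), len(cloud[0]))
```

The resulting point cloud is what a persistent homology computation would take as input; choosing `dim` and `tau` well is exactly the parameter question the abstract refers to.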
- Title
- The Past, Present, and Future of Graduate Admissions in Physics
- Creator
- Young, Nicholas T.
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
-
While graduate admissions in physics directly affects only a small number of people on an annual basis, the number of people indirectly affected is orders of magnitude greater. Those who complete graduate degrees in physics will go on to become leaders in industry, government, and academia, with the latter educating the next generation of leaders in science and engineering. Given the possibly enormous consequences of our decisions in physics graduate admissions, care should be taken to ensure that the process is working effectively. The evidence, however, suggests it is not. Many inequities exist in the admissions process, unfairly keeping potentially great scientists from pursuing graduate studies. This dissertation seeks to understand what those inequities might be and how we might address them.
First, I study the admissions process in the physics department at a Midwestern public university using the random forest algorithm, a machine learning method, to understand what drives their process. After finding that test scores and grades drive the process, I show that one of those tests, the physics GRE, does not give applicants the outsized advantage that it is claimed to provide. Given that the components that drove the admissions process contain inequities, the second half of the dissertation explores whether a rubric-based holistic admissions process might be able to address those inequities. Preliminary evidence suggests that it does. Finally, to ensure that the methods used in the previous chapters were appropriate, the dissertation concludes with a simulation study, finding that the methods used might lead to false negatives in conclusions.
Overall, this dissertation suggests that the current graduate admissions process in physics contains inequities and that rubric-based admissions might be able to address them. By addressing those inequities, everyone can be given a fair shot in the admissions process and physics as a discipline can work toward becoming more representative of the population. Failure to act only perpetuates the inequities that have kept, and will continue to keep, people out of physics.
- Title
- On Permutation Patterns, Pinnacle Sets, and Backbones of Bipartite Projections
- Creator
- Domagalski, Rachel
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
-
This dissertation encompasses the study of two different fields, one regarding permutations, including pattern containment and pinnacle sets, and the other on weighted networks, specifically bipartite projections and their backbones.
The study of pattern containment and avoidance for linear permutations is a well-established area of enumerative combinatorics. A cyclic permutation is the set of all rotations of a linear permutation. Callan initiated the study of permutation avoidance in cyclic permutations and characterized the avoidance classes for all single permutations of length 4. We continue this work. In particular, we establish a cyclic variant of the Erdős-Szekeres Theorem that any linear permutation of length mn+1 must contain either the increasing pattern of length m+1 or the decreasing pattern of length n+1. We then derive results about avoidance of multiple patterns of length 4. We also determine generating functions for the cyclic descent statistic on these classes.
We then study the pinnacle set, which is the value analogue of a well-studied permutation statistic, the peak set. Let pi = pi_1 pi_2 ... pi_n be a permutation in the symmetric group S_n written in one-line notation. The pinnacle set of pi, denoted Pin pi, is the set of all pi_i such that pi_{i-1} < pi_i > pi_{i+1}. The classic peak set statistic consists of the positions of these values. The pinnacle set was introduced by Davis, Nelson, Petersen, and Tenner, who showed that it has many interesting properties. In particular, they proved that the number of subsets of [n] = {1, 2, ..., n} which can be the pinnacle set of some permutation is a binomial coefficient. Their proof involved a bijection with lattice paths and was somewhat involved. We give a simpler demonstration of this result which does not need lattice paths. Moreover, we show that our map and theirs are different descriptions of the same function. Davis et al. also studied the number of pinnacle sets with maximum m and cardinality d, which they denoted by p(m,d). We show that these integers are the well-known ballot numbers and give two proofs of this fact: one using finite differences and one bijective. Diaz-Lopez, Harris, Huang, Insko, and Nilsen found a summation formula for calculating the number of permutations in S_n having a given pinnacle set. We derive a new expression for this number which is faster to calculate in many cases. We also show how this method can be adapted to find the number of orderings of a pinnacle set which can be realized by some pi in S_n. This concludes our research on permutations.
Bipartite projections are used in a wide range of network contexts including politics (bill co-sponsorship), geography (firm co-location), genetics (gene co-expression), economics (executive board co-membership), and innovation (patent co-authorship). However, because bipartite projections are always weighted graphs, which are inherently challenging to analyze and visualize, it is often useful to examine the 'backbone,' an unweighted subgraph containing only the most significant edges. We introduce the R package backbone for extracting the backbone of weighted bipartite projections, and use two empirical datasets to demonstrate its functionality: bill sponsorship data from the 114th session of the United States Senate and a Globalization and World Cities data set regarding firm locations in 2000. After introducing and demonstrating five different models for backbone extraction, the fixed fill model (FFM), fixed row model (FRM), fixed column model (FCM), fixed degree sequence model (FDSM), and stochastic degree sequence model (SDSM), we compare them in terms of accuracy, speed, statistical power, similarity, and community detection. Here, we aim to find which models perform similarly to FDSM, since the FDSM controls for both degree sequences exactly. We find that the computationally fast SDSM offers a statistically conservative but close approximation of the computationally impractical FDSM under a wide range of conditions, and that it correctly recovers a known community structure even when the signal is weak. Therefore, although each backbone model may have particular applications, we recommend SDSM for extracting the backbone of most bipartite projections.
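To make the pinnacle set and peak set definitions in this abstract concrete, here is a small sketch. It assumes a permutation is given as a Python sequence in one-line notation; the function names are hypothetical, and the brute-force enumeration is only for illustration on small n (it is not the dissertation's method).

```python
from itertools import permutations


def pinnacle_set(pi):
    """Values pi_i with pi_{i-1} < pi_i > pi_{i+1} (interior entries only)."""
    return {pi[i] for i in range(1, len(pi) - 1) if pi[i - 1] < pi[i] > pi[i + 1]}


def peak_set(pi):
    """1-based positions i at which pi_{i-1} < pi_i > pi_{i+1}."""
    return {i + 1 for i in range(1, len(pi) - 1) if pi[i - 1] < pi[i] > pi[i + 1]}


def admissible_pinnacle_sets(n):
    """All subsets of {1, ..., n} that occur as the pinnacle set of some
    permutation in S_n, found by exhaustive search (small n only)."""
    return {frozenset(pinnacle_set(p)) for p in permutations(range(1, n + 1))}


if __name__ == "__main__":
    pi = [1, 3, 2, 5, 4]
    print(pinnacle_set(pi), peak_set(pi))
    print(sorted(sorted(s) for s in admissible_pinnacle_sets(4)))
```

For pi = 1 3 2 5 4 the pinnacles are the values 3 and 5, sitting at positions 2 and 4, which shows the value-versus-position distinction between the two statistics.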