Search results
(1 - 20 of 81)
- Title
- Iris Recognition : Enhancing Security and Improving Performance
- Creator
- Sharma, Renu
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Biometric systems recognize individuals based on their physical or behavioral traits, viz., face, iris, and voice. The iris (the colored annular region around the pupil) is one of the most popular biometric traits due to its uniqueness, accuracy, and stability. However, its widespread usage raises security concerns against various adversarial attacks. Another challenge is to match iris images with other compatible biometric modalities (i.e., face) to increase the scope of human identification. Therefore, the focus of this thesis is two-fold: firstly, to enhance the security of the iris recognition system by detecting adversarial attacks, and secondly, to improve its performance in iris-face matching. To enhance the security of the iris biometric system, we address two types of adversarial attacks: presentation and morph attacks. A presentation attack (PA) occurs when an adversary presents a fake or altered biometric sample (plastic eye, cosmetic contact lens, etc.) to a biometric system to obfuscate their own identity or impersonate another identity. We propose three deep learning-based iris PA detection frameworks corresponding to three different imaging modalities, namely NIR spectrum, visible spectrum, and Optical Coherence Tomography (OCT) imaging, which take as input a NIR image, a visible-spectrum video, and a cross-sectional OCT image, respectively. The techniques detect known iris PAs effectively and generalize well across unseen attacks, unseen sensors, and multiple datasets. We also present the explainability and interpretability of the results from the techniques. Our other focuses are robustness analysis and continuous update (retraining) of the trained iris PA detection models. Another burgeoning security threat to biometric systems is morph attacks. A morph attack entails the generation of an image (morphed image) that embodies multiple different identities. Typically, a biometric image is associated with a single identity.
In this work, we first demonstrate the vulnerability of iris recognition techniques to morph attacks and then develop techniques to detect the morphed iris images. The second focus of the thesis is to improve the performance of a cross-modal system where iris images are matched against face images. Cross-modality matching involves various challenges, such as cross-spectral, cross-resolution, cross-pose, and cross-temporal variations. To address these challenges, we extract common features present in both images using a multi-channel convolutional network and also generate synthetic data to augment insufficient training data using a dual-variational autoencoder framework. The two focus areas of this thesis improve the acceptance and widespread usage of the iris biometric system.
- Title
- EFFICIENT AND PORTABLE SPARSE SOLVERS FOR HETEROGENEOUS HIGH PERFORMANCE COMPUTING SYSTEMS
- Creator
- Rabbi, Md Fazlay
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Sparse matrix computations arise in the form of the solution of systems of linear equations, matrix factorization, linear least-squares problems, and eigenvalue problems in numerous computational disciplines ranging from quantum many-body problems and computational fluid dynamics to machine learning and graph analytics. The scale of problems in these scientific applications typically necessitates execution on massively parallel architectures. Moreover, due to the irregular data access patterns and low arithmetic intensities of sparse matrix computations, achieving high performance and scalability is very difficult. These challenges are further exacerbated by the increasingly complex deep memory hierarchies of modern architectures, which typically integrate several layers of memory storage. Data movement is an important bottleneck for efficiency and energy consumption in large-scale sparse matrix computations. Minimizing data movement across layers of the memory and overlapping data movement with computations are keys to achieving high performance in sparse matrix computations. My thesis work contributes towards systematically identifying algorithmic challenges of the sparse solvers and providing optimized and high-performing solutions for both shared memory architectures and heterogeneous architectures by minimizing data movement between different memory layers. For this purpose, we first introduce a shared memory task-parallel framework focusing on optimizing the entire solvers rather than a specific kernel. As most of the recent (or upcoming) supercomputers are equipped with Graphics Processing Units (GPUs), we decided to evaluate the efficacy of the directive-based programming models (i.e., OpenMP and OpenACC) in offloading computations to GPUs to achieve performance portability.
Inspired by the promising results of this work, we port our shared memory task-parallel framework to GPU-accelerated systems and optimize it to execute problem sizes that exceed device memory.
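As a rough illustration of the irregular data access and low arithmetic intensity mentioned in this abstract, the following minimal sketch multiplies a sparse matrix in compressed sparse row (CSR) form by a dense vector. It is not taken from the thesis; the function name `csr_matvec` and the small example matrix are assumptions made for illustration only.

```python
def csr_matvec(values, col_indices, row_ptr, x):
    """Multiply a CSR-format sparse matrix by a dense vector x.

    values      -- nonzero entries, stored row by row
    col_indices -- column index of each entry in `values`
    row_ptr     -- row_ptr[i]:row_ptr[i+1] is the slice of `values` for row i
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is read through col_indices[k]: an indirect, data-dependent
            # access that is hard to predict and cache. Each nonzero costs
            # only one multiply and one add but needs three memory reads.
            y[row] += values[k] * x[col_indices[k]]
    return y


# Hypothetical 3x3 example: [[4, 0, 1], [0, 3, 0], [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_indices = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
result = csr_matvec(values, col_indices, row_ptr, [1.0, 2.0, 3.0])
```

With only two floating-point operations per stored nonzero, a kernel of this shape tends to be limited by memory traffic rather than arithmetic, which is why the abstract emphasizes minimizing data movement.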
- Title
- PALETTEVIZ : A METHOD FOR VISUALIZATION OF HIGH-DIMENSIONAL PARETO-OPTIMAL FRONT AND ITS APPLICATIONS TO MULTI-CRITERIA DECISION MAKING AND ANALYSIS
- Creator
- Talukder, AKM Khaled Ahsan
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Visual representation of a many-objective Pareto-optimal front in four or more dimensional objective space requires a large number of data points. Moreover, choosing a single point from a large set even with certain preference information is problematic, as it imposes a large cognitive burden on the decision-makers. Therefore, many-objective optimization and decision-making practitioners have been interested in effective visualization methods to enable them to filter down a large set to a few critical points for further analysis. Most existing visualization methods are borrowed from other data analytics domains and they are too generic to be effective for many-criterion decision making. In this dissertation, we propose a visualization method, using star-coordinate and radial visualization plots, for effectively visualizing many-objective trade-off solutions. The proposed method respects some basic topological, geometric and functional decision-making properties of high-dimensional trade-off points mapped to a three-dimensional space. We call this method Palette Visualization (PaletteViz). We demonstrate the use of PaletteViz on a number of large-dimensional multi-objective optimization test problems and three real-world multi-objective problems, one of which has 10 objective and 16 constraint functions. We also show the uses of NIMBUS and Pareto-Race concepts from canonical multi-criterion decision making and analysis literature and introduce them into PaletteViz to demonstrate the ease and advantage of the proposed method.
- Title
- Towards Robust and Reliable Communication for Millimeter Wave Networks
- Creator
- Zarifneshat, Masoud
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
The future generations of wireless networks benefit significantly from millimeter wave (mmW) technology with frequencies ranging from about 30 GHz to 300 GHz. Specifically, the fifth generation of wireless networks has already implemented the mmW technology and the capacity requirements defined in 6G will also benefit from the mmW spectrum. Despite the attractions of the mmW technology, the mmW spectrum has some inherent propagation properties that introduce challenges. The first is that free space pathloss in mmW is more severe than that in the sub-6 GHz band. To make the mmW signal travel farther, communication systems need to use phased array antennas to concentrate the signal power in a limited direction in space at each given time. Directional communication can incur high overhead on the system because it needs to probe the space to find signal paths. To have efficient communication in the mmW spectrum, the transmitter and the receiver should align their beams on strong signal paths, which is a high overhead task. The second is the low diffraction of the mmW spectrum. The low diffraction causes almost any object, including the human body, to easily block the mmW signal, degrading the mmW link quality. Avoiding and recovering from blockage in mmW communications, especially in dynamic environments, is particularly challenging because of the fast changes of the mmW channel. Due to the unique characteristics of mmW propagation, the traditional user association methods perform poorly in the mmW spectrum. Therefore, we propose user association methods that consider the inherent propagation characteristics of the mmW signal. We first propose a method that collects the history of blockage incidents throughout the network and exploits the historical blockage incidents to associate user equipment to the base station with lower blockage possibility.
The simulation results show that our proposed algorithm performs better in terms of improving the quality of the links and blockage rate in the network. User association based only on one objective may deteriorate other objectives. Therefore, we formulate a biobjective optimization problem to consider two objectives of load balance and blockage possibility in the network. We conduct Lagrangian dual analysis to decrease time complexity. The results show that our solution to the biobjective optimization problem has a better outcome compared to optimizing each objective alone. After we investigate the user association problem, we further look into the problem of maintaining a robust link between a transmitter and a receiver. The directional propagation of the mmW signal creates the opportunity to exploit multipath for a robust link. The main reasons for the link quality degradation are blockage and link movement. We devise a learning-based prediction framework to classify link blockage and link movement efficiently and quickly using diffraction values for taking appropriate mitigating actions. The simulations show that the prediction framework can predict blockage with close to 90% accuracy. The prediction framework will eliminate the need for time-consuming methods to discriminate between link movement and link blockage. After detecting the reason for the link degradation, the system needs to do the beam alignment on the updated mmW signal paths. The beam alignment on the signal paths is a high overhead task. We propose using signaling in another frequency band to discover the paths surrounding a receiver working in the mmW spectrum. In this way, the receiver does not have to do an expensive beam scan in the mmW band. Our experiments with off-the-shelf devices show that we can use a non-mmW frequency band's paths to align the beams in mmW frequency. In this dissertation, we provide solutions to the fundamental problems in mmW communication. 
We propose a user association method that is designed for mmW networks and considers the challenges of the mmW signal. A closed-form solution for a biobjective optimization problem to optimize both blockage and load balance of the network is also provided. Moreover, we show that we can efficiently use the out-of-band signal to exploit multipath created in mmW communication. The future research direction includes investigating the methods proposed in this dissertation to solve some of the classic problems in wireless networks that exist in the mmW spectrum.
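The abstract's remark that free space pathloss is more severe at mmW frequencies than in the sub-6 GHz band can be illustrated with the standard Friis free-space path loss formula for isotropic antennas. The sketch below is not from the dissertation; the function name and the example carrier frequencies (28 GHz and 2.8 GHz) are assumptions chosen for illustration.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Friis free-space path loss in dB between two isotropic antennas."""
    if distance_m <= 0 or frequency_hz <= 0:
        raise ValueError("distance and frequency must be positive")
    ratio = 4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT_M_S
    return 20.0 * math.log10(ratio)


# Hypothetical comparison at the same 100 m distance.
loss_mmw_db = free_space_path_loss_db(100.0, 28e9)    # example mmW carrier
loss_sub6_db = free_space_path_loss_db(100.0, 2.8e9)  # example sub-6 GHz carrier
extra_loss_db = loss_mmw_db - loss_sub6_db
```

Because the loss grows with 20·log10 of the frequency, a tenfold increase in carrier frequency adds 20 dB of path loss at any fixed distance under this model, which is the gap that directional phased-array beams are meant to recover.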
- Title
- Variational Bayes inference of Ising models and their applications
- Creator
- Kim, Minwoo
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Ising models, which originated in statistical physics, have been widely used in modeling spatial data and computer vision problems. However, statistical inference of this model and its application to many practical fields remain challenging due to the intractable nature of the normalizing constant in the likelihood. This dissertation consists of two main themes, (1) parameter estimation of the Ising model and (2) structured variable selection based on the Ising model using variational Bayes (VB). In Chapter 1, we review the background, research questions and development of the Ising model, variational Bayes, and other statistical concepts. An Ising model basically deals with a binary random vector in which each component is dependent on its neighbors. There exist various versions of the Ising model depending on parameterization and neighboring structure. In Chapter 2, with the two-parameter Ising model, we describe a novel procedure for parameter estimation based on VB which is computationally efficient and accurate compared to existing methods. The traditional pseudo maximum likelihood estimate (PMLE) can provide accurate results only for a smaller number of neighbors. A Bayesian approach based on Markov chain Monte Carlo (MCMC) performs better even with a large number of neighbors. Computational costs of MCMC, however, are quite expensive in terms of time. Accordingly, we propose a VB method with two variational families, the mean-field (MF) Gaussian family and the bivariate normal (BN) family. Extensive simulation studies validate the efficacy of the families. Using our VB methods, computing times are remarkably decreased without deterioration in performance accuracy, and in some scenarios we get much more accurate output. In addition, we demonstrate theoretical properties of the proposed VB method under the MF family.
The main theoretical contribution of our work lies in establishing the consistency of the variational posterior for the Ising model with the true likelihood replaced by the pseudo-likelihood. Under certain conditions, we first derive the rates at which the true posterior based on the pseudo-likelihood concentrates around the εn-shrinking neighborhoods of the true parameters. With a suitable bound on the Kullback-Leibler distance between the true and the variational posterior, we next establish the rate of contraction for the variational posterior and demonstrate that the variational posterior also concentrates around εn-shrinking neighborhoods of the true parameter. In Chapter 3, we propose a Bayesian variable selection technique for a regression setup in which the regression coefficients hold structural dependency. We employ spike and slab priors on the regression coefficients as follows: (i) In order to capture the intrinsic structure, we first consider an Ising prior on latent binary variables. If a latent variable takes the value one, the corresponding regression coefficient is active; otherwise, it is inactive. (ii) Employing the spike and slab prior, we put Gaussian priors (slab) on the active coefficients, and inactive coefficients will be zero with probability one (spike).
- Title
- Optimizing and Improving the Fidelity of Reactive, Polarizable Molecular Dynamics Simulations on Modern High Performance Computing Architectures
- Creator
- O'Hearn, Kurt A.
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Reactive, polarizable molecular dynamics simulations are a crucial tool for the high-fidelity study of large systems with chemical reactions. In support of this, several approaches have been employed with varying degrees of computational cost and physical accuracy. One of the more successful approaches in recent years, the reactive force field (ReaxFF) model, was designed to fill the gap between traditional classical models and quantum mechanical models by incorporating a dynamic bond order potential term. When coupling ReaxFF with dynamic global charge models for electrostatics, special considerations are necessary for obtaining highly performant implementations, especially on modern high-performance computing architectures. In this work, we detail the performance optimization of the PuReMD (PuReMD Reactive Molecular Dynamics) software package, an open-source, GPLv3-licensed implementation of ReaxFF coupled with dynamic charge models. We begin by exploring the tuning of the iterative Krylov linear solvers underpinning the global charge models in a shared-memory parallel context using OpenMP, with the explicit goal of minimizing the mean combined preconditioner and solver time. We found that with appropriate solver tuning, significant speedups and scalability improvements were observed. Following these successes, we extend these approaches to the solvers in the distributed-memory MPI implementation of PuReMD, as well as broaden the scope of optimization to other portions of the ReaxFF potential such as the bond order computations. Here again we find that sizable performance gains were achieved for large simulations numbering in the hundreds of thousands of atoms. With these performance improvements in hand, we next change focus to another important use of PuReMD: the development of ReaxFF force fields for new materials.
The high fidelity inherent in ReaxFF simulations for different chemistries oftentimes comes at the expense of a steep learning curve for parameter optimization, due in part to complexities in the high dimensional parameter space and due in part to the necessity of deep domain knowledge of how to adequately control the ReaxFF functional forms. To diagnose and combat these issues, a study was undertaken to optimize parameters for Li-O systems using the OGOLEM genetic algorithms framework coupled with a modified shared-memory version of PuReMD. We found that with careful training set design, sufficient optimization control with tuned genetic algorithms, and improved polarizability through enhanced charge model use, higher accuracy was achieved in simulations involving ductile fracture behavior, a phenomenon that has heretofore been difficult to model correctly. Finally, we return to performance optimization for the GPU-accelerated distributed-memory PuReMD codebase. Modern supercomputers have recently achieved exascale levels of peak arithmetic rates due in large part to the design decision to incorporate massive numbers of GPUs. In order to take advantage of such computing systems, the MPI+CUDA version of PuReMD was re-designed and benchmarked on modern NVIDIA Tesla GPUs. Performance was on par with or exceeded that of LAMMPS Kokkos, a ReaxFF implementation developed at Sandia National Laboratories, with PuReMD typically outperforming LAMMPS Kokkos at larger scales.
- Title
- VISIONING THE AGRICULTURE BLOCKCHAIN : THE ROLE AND RISE OF BLOCKCHAIN IN THE COMMERCIAL POULTRY INDUSTRY
- Creator
- Fennell, Chris
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Blockchain is an emerging technology that is being explored by technologists and industry leaders as a way to revolutionize the agriculture supply chain. The problem is that human and ecological insights are needed to understand the complexities of how blockchain could fulfill these visions. In this work, I assert how the blockchain's promising vision of traceability, immutability and distributed properties presents advancements and challenges to rural farming. This work wrestles with the more subtle ways the blockchain technology would be integrated into the existing infrastructure. Through interviews and participatory design workshops, I talked with an expansive set of stakeholders including Amish farmers, contract growers, senior leadership and field supervisors. This research illuminates that commercial poultry farming is such a complex and diffuse system that any overhaul of its core infrastructure will be difficult to “roll back” once blockchain is “rolled out.” Through an HCI and sociotechnical system perspective, drawing particular insights from Science and Technology Studies theories of infrastructure and breakdown, this dissertation asserts three main concerns. First, this dissertation uncovers the dominant narratives on the farm around revision and “roll back” of blockchain, connecting to theories of version control from computer science. Second, this work uncovers that a core concern of the poultry supply chain is death and I reveal the sociotechnical and material implications for the integration of blockchain. Finally, this dissertation discusses the meaning of “security” for the poultry supply chain in which biosecurity is prioritized over cybersecurity and how blockchain impacts these concerns. Together these findings point to significant implications for designers of blockchain infrastructure and how rural workers will integrate the technology into the supply chain.
- Title
- Predicting the Properties of Ligands Using Molecular Dynamics and Machine Learning
- Creator
- Donyapour, Nazanin
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
The discovery and design of new drugs requires extensive experimental assays that are usually very expensive and time-consuming. To cut down the cost and time of the drug development process and help design effective drugs more efficiently, various computational methods have been developed that are referred to collectively as in silico drug design. These in silico methods can be used not only to determine compounds that can bind to a target receptor but also to determine whether compounds show ideal drug-like properties. I have provided solutions to these problems by developing novel methods for molecular simulation and molecular property prediction. Firstly, we have developed a new enhanced sampling MD algorithm called Resampling of Ensembles by Variation Optimization or “REVO” that can generate binding and unbinding pathways of ligand-target interactions. These pathways are useful for calculating transition rates and Residence Times (RT) of protein-ligand complexes. This can be particularly useful for drug design as studies for some systems show that the drug efficacy correlates more with RT than the binding affinity. This method is generally useful for generating long-timescale transitions in complex systems, including alternate ligand binding poses and protein conformational changes. Secondly, we have developed a technique we refer to as “ClassicalGSG” to predict the partition coefficient (log P) of small molecules. log P is one of the main factors in determining the drug likeness of a compound, as it helps determine bioavailability, solubility, and membrane permeability. This method has been very successful compared to other methods in the literature. Finally, we have developed a method called “Flexible Topology” that we hope can eventually be used to screen a database of potential ligands while considering ligand-induced conformational changes.
After discovering molecules with drug-like properties in the drug design pipeline, Virtual Screening (VS) methods are employed to perform an extensive search on drug databases with hundreds of millions of compounds to find candidates that bind tightly to a molecular target. However, in order for this to be computationally tractable, typically, only static snapshots of the target are used, which cannot respond to the presence of the drug compound. To efficiently capture drug-target interactions during screening, we have developed a machine-learning algorithm that employs Molecular Dynamics (MD) simulations with a protein of interest and a set of atoms called “Ghost Particles”. During the simulation, the Flexible Topology method induces forces that constantly modify the ghost particles and optimizes them toward drug-like molecules that are compatible with the molecular target.
- Title
- UNDERSTANDING THE GENETIC BASIS OF HUMAN DISEASES BY COMPUTATIONALLY MODELING THE LARGE-SCALE GENE REGULATORY NETWORKS
- Creator
- Wang, Hao
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Many severe diseases are known to be caused by the genetic disorder of the human genome, including breast cancer and Alzheimer's disease. Understanding the genetic basis of human diseases plays a vital role in personalized medicine and precision therapy. However, the pervasive spatial correlations between the disease-associated SNPs have hindered the ability of traditional GWAS studies to discover causal SNPs and obscured the underlying mechanisms of disease-associated SNPs. Recently, diverse biological datasets generated by large data consortia provide a unique opportunity to fill the gap between genotypes and phenotypes using biological networks, representing the complex interplay between genes, enhancers, and transcription factors (TF) in the 3D space. The comprehensive delineation of the regulatory landscape calls for highly scalable computational algorithms to reconstruct the 3D chromosome structures and mechanistically predict the enhancer-gene links. In this dissertation, I first developed two algorithms, FLAMINGO and tFLAMINGO, to reconstruct the high-resolution 3D chromosome structures. The algorithmic advancements of FLAMINGO and tFLAMINGO lead to the reconstruction of the 3D chromosome structures at an unprecedented resolution from the highly sparse chromatin contact maps. I further developed two integrative algorithms, ComMUTE and ProTECT, to mechanistically predict the long-range enhancer-gene links by modeling the TF profiles. Based on the extensive evaluations, these two algorithms demonstrate superior performance in predicting enhancer-gene links and decoding TF regulatory grammars over existing algorithms. The successful application of ComMUTE and ProTECT in 127 cell types not only provides a rich resource of gene regulatory networks but also sheds light on the mechanistic understanding of QTLs, disease-associated genetic variants, and high-order chromatin interactions.
- Title
- Robust Learning of Deep Neural Networks under Data Corruption
- Creator
- Liu, Boyang
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Training deep neural networks in the presence of corrupted data is challenging as the corrupted data points may significantly impact generalization performance of the models. Unfortunately, the data corruption issue widely exists in many application domains, including but not limited to, healthcare, environmental sciences, autonomous driving, and social media analytics. Although there have been some previous studies that aim to enhance the robustness of machine learning models against data corruption, most of them either lack theoretical robustness guarantees or are unable to scale to the millions of model parameters governing deep neural networks. The goal of this thesis is to design robust machine learning algorithms that 1) effectively deal with different types of data corruption, 2) have sound theoretical guarantees on robustness, and 3) are scalable to the large number of parameters in deep neural networks. There are two general approaches to enhance model robustness against data corruption. The first approach is to detect and remove the corrupted data while the second approach is to design robust learning algorithms that can tolerate some fraction of corrupted data. In this thesis, I developed two robust unsupervised anomaly detection algorithms and two robust supervised learning algorithms for corrupted supervision and backdoor attacks. Specifically, in Chapter 2, I proposed the Robust Collaborative Autoencoder (RCA) approach to enhance the robustness of vanilla autoencoder methods against natural corruption. In Chapter 3, I developed Robust RealNVP, a robust density estimation technique for unsupervised anomaly detection tasks given concentrated anomalies. Chapter 4 presents the Provable Robust Learning (PRL) approach, which is a robust algorithm against agnostic corrupted supervision. In Chapter 5, a meta-algorithm to defend against backdoor attacks is proposed by exploring the connection between label corruption and backdoor data poisoning attacks.
Extensive experiments on multiple benchmark datasets have demonstrated the robustness of the proposed algorithms under different types of corruption.
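The second approach named in this abstract, a learning rule that tolerates some fraction of corrupted data, can be illustrated with a classical robust statistic. The sketch below is a plain trimmed mean; it is not RCA, Robust RealNVP, or PRL, and the function name `trimmed_mean` and its `trim_fraction` parameter are assumptions made only for illustration.

```python
def trimmed_mean(values, trim_fraction):
    """Average `values` after dropping the smallest and largest
    `trim_fraction` of the points (hypothetical helper, for illustration)."""
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(trim_fraction * len(ordered))  # points dropped from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# One corrupted reading (1000) among eight measurements.
data = [1, 2, 3, 4, 5, 6, 7, 1000]
plain = sum(data) / len(data)      # ordinary mean, pulled off by the outlier
robust = trimmed_mean(data, 0.25)  # drops two points from each end
```

The ordinary mean is dominated by the single corrupted point, while the trimmed estimate stays near the bulk of the data as long as the corrupted fraction is below the trimming level.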
- Title
- ASSURING THE ROBUSTNESS AND RESILIENCY OF LEARNING-ENABLED AUTONOMOUS SYSTEMS
- Creator
- Langford, Michael Austin
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
As Learning-Enabled Systems (LESs) have become more prevalent in safety-critical applications, addressing the assurance of LESs has become increasingly important. Because machine learning models in LESs are not explicitly programmed like traditional software, developers typically have less direct control over the inferences learned by LESs, relying instead on semantically valid and complete patterns to be extracted from the system’s exposure to the environment. As such, the behavior of an LES is strongly dependent on the quality of its training experience. However, run-time environments are often noisy or not well-defined. Uncertainty in the behavior of an LES can arise when there is inadequate coverage of relevant training/test cases (e.g., corner cases). It is challenging to assure safety-critical LESs will perform as expected when exposed to run-time conditions that have never been experienced during training or validation. This doctoral research contributes automated methods to improve the robustness and resilience of an LES. For this work, a robust LES is less sensitive to noise in the environment, and a resilient LES is able to self-adapt to adverse run-time contexts in order to mitigate system failure. The proposed methods harness diversity-driven evolution-based methods, machine learning, and software assurance cases to train robust LESs, uncover robust system configurations, and foster resiliency through self-adaptation and predictive behavior modeling. This doctoral work demonstrates these capabilities by applying the proposed framework to deep learning and autonomous cyber-physical systems.
- Title
- Towards Accurate Ranging and Versatile Authentication for Smart Mobile Devices
- Creator
- Li, Lingkun
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
The Internet of Things (IoT) has developed rapidly in recent years. Smart devices, such as smartphones, smartwatches, and smart assistants, which are equipped with smart chips and sensors, provide users with many easy-to-use functions and make their lives more convenient. In this dissertation, we carefully study the birefringence of transparent tape, the nonlinear effects of microphones, and the phase characteristics of reflected ultrasound, and use these effects to design three systems, RainbowLight, Patronus, and BreathPass, which provide users with accurate localization, privacy protection, and authentication, respectively.
RainbowLight leverages the observation-direction-dependent spectrum generated by polarized light passing through a birefringent material, i.e., transparent tape, to provide a localization service. We characterize the relationship between observation direction, light interference, and the resulting spectrum, and use it to calculate the direction to a chip from a photo containing the chip. With multiple chips, RainbowLight uses a direction-intersection-based method to derive the location. In this dissertation, we build the theoretical basis for using polarized light and the birefringence phenomenon to perform localization. Based on the theoretical model, we design and implement RainbowLight on a mobile device and evaluate the performance of the system. The evaluation results show that RainbowLight achieves a median error of 1.68 cm along the X-axis, 2 cm along the Y-axis, 5.74 cm along the Z-axis, and 7.04 cm across all dimensions combined. It is the first system that can perform visible light positioning using only the reflected light in the space. Patronus prevents unauthorized speech recording by leveraging the nonlinear effects of commercial off-the-shelf microphones.
The inaudible ultrasonic scramble interferes with recording on unauthorized devices and can be canceled on authorized devices through an adaptive filter. In this dissertation, we carefully study the nonlinear effects of ultrasound on commercial microphones. Based on this study, we propose an optimized configuration to generate the scramble, which provides privacy protection against unauthorized recordings without disturbing normal conversations. We design and implement a system that includes hardware and software components. Experimental results show that only 19.7% of words protected by Patronus' scramble can be recognized by unauthorized devices. Furthermore, authorized recordings have a 1.6x higher perceptual evaluation of speech quality (PESQ) score and, on average, 50% lower speech recognition error rates than unauthorized recordings. BreathPass uses speakers to emit ultrasound signals. The signals are reflected off the chest wall and abdomen and travel back to the microphone, which records the reflected signals. The system then extracts fingerprints from the breathing pattern and uses these fingerprints to perform authentication. In this dissertation, we characterize the challenges of conducting authentication with the breathing pattern. After addressing these challenges, we design such a system and implement a proof-of-concept application on the Android platform. We also conduct comprehensive experiments to evaluate the performance under different scenarios. According to the performance evaluation results, BreathPass achieves an overall accuracy of 83%, a true positive rate of 73%, and a false positive rate of 5%. In general, this dissertation provides enhanced ranging and versatile authentication systems for the Internet of Things.
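The direction-intersection idea mentioned for RainbowLight can be illustrated with a small geometric sketch. The version below is a 2D simplification under stated assumptions: two chips at known positions, each with a known direction toward the device, and a location taken as the point where the two rays meet. The function name `intersect_directions` and its arguments are hypothetical, and this is not the dissertation's actual method, which works in three dimensions from measured spectra.

```python
def intersect_directions(p1, d1, p2, d2):
    """Return the point where the line through p1 along d1 meets the
    line through p2 along d2 (2D simplification, for illustration)."""
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(cross) < 1e-12:
        raise ValueError("directions are parallel; no unique intersection")
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Solve p1 + t1 * d1 = p2 + t2 * d2 for t1 using 2D cross products.
    t1 = (dx * d2[1] - dy * d2[0]) / cross
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])


# Two chips on the x-axis, each "seeing" the device along a diagonal.
location = intersect_directions((0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (-1.0, 1.0))
```

With more than two chips, the same idea is usually posed as a least-squares problem, since noisy directions rarely intersect at a single point.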
- Title
- Investigating the Role of Sensor Based Technologies to Support Domestic Activities in Sub-Saharan Africa
- Creator
- Chidziwisano, George Hope
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
In sub-Saharan Africa (SSA), homes face various challenges including insecurity, unreliable power supply, and extreme weather conditions. While the use of sensor-based technologies is increasing in industrialized countries, it is unclear how they can be used to support domestic activities in SSA. The availability of low-cost sensors and the widespread adoption of mobile phones present an opportunity to collect real-time data and utilize proactive methods to monitor these challenges. This dissertation presents three studies that build upon each other to explore the role of sensor-based technologies in SSA. I used a technology probes method to develop three sensor-based systems that support domestic security (M-Kulinda), power blackout monitoring (GridAlert), and poultry farming (NkhukuApp). I deployed M-Kulinda in 20 Kenyan homes, GridAlert in 18 Kenyan homes, and NkhukuProbe in 15 Malawian home-based chicken coops for one month. I used interview, observation, diary, and data logging methods to understand participants’ experiences using the probes. Findings from these studies suggest that people in Kenya and Malawi want to incorporate sensor-based technologies into their everyday activities, and they quickly find unexpected ways to use them. Participants’ interactions with the probes prompted detailed reflections about how they would integrate sensor-based technologies in their homes (e.g., monitoring non-digital tools). These reflections are useful for motivating new design concepts in HCI. I use these findings to motivate a discussion about unexplored areas that could benefit from sensor-based technologies. Further, I discuss recommendations for designing sensor-based technologies that support activities in some Kenyan and Malawian homes.
This research contributes to HCI by providing design implications for sensor-based applications in Kenyan and Malawian homes, employing a technology probes method in a non-traditional context, and developing prototypes of three novel systems.
- Title
- Efficient and Secure Message Passing for Machine Learning
- Creator
- Liu, Xiaorui
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Machine learning (ML) techniques have had a revolutionary impact on human society, and they will continue to drive technological innovation in the future. To broaden their impact, it is urgent to solve the emerging and critical challenges in machine learning, such as efficiency and security issues. On the one hand, ML models have become increasingly powerful thanks to big data and big models, but this also brings tremendous challenges in designing efficient optimization algorithms to train big ML models on big data. The most effective approach to large-scale ML is to parallelize the computation across distributed systems composed of many computational devices. However, in practice, the scalability and efficiency of these systems are greatly limited by information synchronization, since message passing between the devices dominates the total running time. In other words, the major bottleneck lies in the high communication cost between devices, especially when the scale of the system and the models grows while the communication bandwidth remains relatively limited. This communication bottleneck often limits the practical speedup of distributed ML systems. On the other hand, recent research has revealed that many ML models suffer from security vulnerabilities. In particular, deep learning models can be easily deceived by unnoticeable perturbations in data. Meanwhile, graphs are a prevalent data structure for real-world data, encoding pairwise relations between entities in settings such as social networks, transportation networks, and chemical molecules. Graph neural networks (GNNs) generalize and extend the representation learning power of traditional deep neural networks (DNNs) from regular grids, such as images, video, and text, to irregular graph-structured data through message passing frameworks.
Therefore, many important applications on these data can be treated as computational tasks on graphs, such as recommender systems, social network analysis, and traffic prediction. Unfortunately, the vulnerability of deep learning models also translates to GNNs, which raises significant concerns about their applications, especially in safety-critical areas. Therefore, it is critical to design intrinsically secure ML models for graph-structured data.
The primary objective of this dissertation is to develop solutions to these challenges via innovative research and principled methods. In particular, we propose multiple distributed optimization algorithms with efficient message passing to mitigate the communication bottleneck and speed up ML model training in distributed ML systems. We also propose multiple secure message passing schemes as the building blocks of graph neural networks, aiming to significantly improve the security and robustness of ML models.
- Title
- Efficient Distributed Algorithms : Better Theory and Communication Compression
- Creator
- LI, YAO
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Large-scale machine learning models are often trained by distributed algorithms over either centralized or decentralized networks. The former uses a central server to aggregate the information of local computing agents and broadcast the averaged parameters in a master-slave architecture. The latter considers a connected network formed by all agents, in which information can only be exchanged with accessible neighbors, with a mixing matrix of communication weights encoding the network's topology. Compared with centralized optimization, decentralization facilitates data privacy and reduces the communication burden that model synchronization places on a single central agent, but the connectivity of the communication network weakens the theoretical convergence complexity of decentralized algorithms. Therefore, there are still gaps between decentralized and centralized algorithms in terms of convergence conditions and rates. In the first part of this dissertation, we consider two decentralized algorithms, EXTRA and NIDS, both of which converge linearly for strongly convex objective functions, and answer two questions about them: What are the optimal upper bounds for their stepsizes? Do decentralized algorithms require more properties of the functions for linear convergence than centralized ones? More specifically, we relax the required conditions for linear convergence of both algorithms. For EXTRA, we show that the stepsize is comparable to that of centralized algorithms. For NIDS, the upper bound of the stepsize is shown to be exactly the same as that of centralized algorithms. In addition, we relax the requirements on the objective functions and the mixing matrices. We provide linear convergence results for both algorithms under the weakest conditions.
As the number of computing agents and the dimension of the model increase, the communication cost of parameter synchronization becomes the major obstacle to efficient learning.
Communication compression techniques have shown great potential for accelerating distributed machine learning by mitigating the communication bottleneck. In the rest of the dissertation, we propose compressed residual communication frameworks for both centralized and decentralized optimization and design different algorithms to achieve efficient communication. For centralized optimization, we propose DORE, a modified parallel stochastic gradient descent method with bidirectional residual compression, which reduces overall communication by over 95%. Our theoretical analysis demonstrates that the proposed strategy has superior convergence properties for both strongly convex and nonconvex objective functions. For decentralized optimization, existing works mainly focus on smooth problems and on compressing DGD-type algorithms. The restriction to smooth objective functions and the sublinear convergence rate under relatively strong assumptions limit these algorithms' application and practical performance. Motivated by primal-dual algorithms, we propose Prox-LEAD, a linearly convergent decentralized algorithm with compression, to tackle strongly convex problems with a nonsmooth regularizer. Our theory describes the coupled dynamics of the inexact primal and dual updates as well as the compression error without assuming bounded gradients. The superiority of the proposed algorithm is demonstrated through comparison with state-of-the-art algorithms in terms of convergence complexities and numerical experiments. Our algorithmic framework also sheds light on applying compressed communication to other primal-dual algorithms by reducing the impact of inexact iterations.
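The general idea of compressing a residual rather than the full vector can be sketched in a few lines. The example below is a generic illustration under stated assumptions, not DORE or Prox-LEAD: sender and receiver are assumed to hold the same `state`, the compressor is a simple top-k sparsifier, and the names `top_k` and `send_compressed` are hypothetical.

```python
def top_k(vec, k):
    """Keep the k largest-magnitude entries of vec and zero the rest."""
    keep = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k]
    out = [0.0] * len(vec)
    for i in keep:
        out[i] = vec[i]
    return out


def send_compressed(value, state, k):
    """Compress the residual (value - state) and advance the shared state.

    Both sides apply the same sparse message to their copy of `state`,
    so only the k retained entries would need to be transmitted.
    """
    residual = [v - s for v, s in zip(value, state)]
    message = top_k(residual, k)
    new_state = [s + m for s, m in zip(state, message)]
    return message, new_state


value = [1.0, -3.0, 0.5, 2.0]
state = [0.0, 0.0, 0.0, 0.0]
msg1, state = send_compressed(value, state, k=2)  # sends the two largest entries
msg2, state = send_compressed(value, state, k=2)  # sends what was left behind
```

Because each round transmits what earlier rounds left behind, the shared state catches up with the true vector even though every individual message is sparse.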
- Title
- Sparse Large-Scale Multi-Objective Optimization for Climate-Smart Agricultural Innovation
- Creator
- Kropp, Ian Meyer
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
The challenge of our generation is to produce enough food to feed the present and future global population. This is no simple task, as the world population is expanding and becoming more affluent, and conventional agriculture often degrades the environment. Without a healthy and functional environment, agriculture as we know it will fail. Therefore, we must equally balance our broad goals of sustainability and food production as a single system. Multi-objective optimization, a class of algorithms that search for solutions to complex problems with conflicting objectives, is an effective tool for balancing these two goals. In this dissertation, we apply multi-objective optimization to find optimal management practices for irrigating and fertilizing corn. There are two areas for improvement in multi-objective optimization of corn management: existing methods are burdensomely slow and do not account for the uncertainty of weather. Improving run-time and optimizing in the face of weather uncertainty are the two goals of this dissertation. We address these goals with four novel methodologies that advance the fields of biosystems & agricultural engineering, as well as computer science engineering. In the first study, we address the first goal by drastically improving the performance of evolutionary multi-objective algorithms for sparse large-scale optimization problems. Sparse optimization problems, such as irrigation and nutrient management, are problems whose optimal solutions are mostly zero. Our novel algorithm, called sparse population sampling (SPS), integrates with and improves all population-based algorithms over almost all test scenarios. SPS, when used with NSGA-II, was able to outperform the existing state-of-the-art algorithms on the most complex sparse large-scale optimization problems (i.e., 2,500 or more decision variables).
The second study addressed the second goal by optimizing common management practices in a study site in Cass County, Michigan, for all climate scenarios. This methodology, which relied on SPS from the first goal, implements the concept of innovization in agriculture. In our innovization framework, 30 years of management practices were optimized against observed weather data, which in turn was compared to common practices in Cass County, Michigan. The differences between the optimal solutions and common practices were transformed into simple recommendations for farmers to apply during future growing seasons. Our recommendations drastically increased yields under 420 validation scenarios with no impact on nitrogen leaching. The third study further improves the performance of sparse large-scale optimization. Where SPS was a single component of a population-based algorithm, our proposed method, S-NSGA-II, is a novel and complete evolutionary algorithm for sparse large-scale optimization problems. Our algorithm outperforms or performs as well as other contemporary sparse large-scale optimization algorithms, especially in problems with more than 800 decision variables. This enhanced convergence will further improve multi-objective optimization in agriculture. Our final study, which addresses the second goal, takes a different approach to optimizing agricultural systems in the face of climate uncertainty. In this study, we use stochastic weather to quantify risk in optimization. In this way, farmers can choose between optimal management decisions with full understanding of the risks involved in every management decision.
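To make the notion of a sparse problem concrete, the sketch below seeds an evolutionary population with mostly-zero decision vectors. The abstract does not describe how SPS works internally, so this is only a generic illustration of sparse initialization and not the author's algorithm; `sparse_population` and all of its parameters are hypothetical names.

```python
import random


def sparse_population(pop_size, n_vars, density, low, high, seed=None):
    """Build `pop_size` decision vectors of length `n_vars` in which only
    about `density * n_vars` randomly chosen entries are nonzero."""
    rng = random.Random(seed)
    n_nonzero = max(1, round(density * n_vars))
    population = []
    for _ in range(pop_size):
        individual = [0.0] * n_vars
        # Choose which decision variables are "switched on" for this individual.
        for idx in rng.sample(range(n_vars), n_nonzero):
            individual[idx] = rng.uniform(low, high)
        population.append(individual)
    return population


# Example: 4 individuals, 50 variables each, roughly 10% nonzero.
pop = sparse_population(pop_size=4, n_vars=50, density=0.1, low=1.0, high=5.0, seed=0)
```

In an irrigation setting, a nonzero entry could stand for water applied on a given day, so most days in a candidate schedule start with no application.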
- Title
- IMPROVED DETECTION AND MANAGEMENT OF PHYTOPHTHORA SOJAE
- Creator
- McCoy, Austin Glenn
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Phytophthora spp. cause root and stem rots, leaf blights, and fruit rots on agricultural and economically important plant species. Symptoms of Phytophthora-infected plants, particularly root rots, can be difficult to distinguish from those caused by other oomycete and fungal pathogens and often result in devastating losses. Phytophthora spp. can lie dormant for many years in the oospore stage, making long-term management of these diseases difficult. Phytophthora sojae is an important and prevalent pathogen of soybean (Glycine max L.) worldwide, causing Phytophthora stem and root rot (PRR). PRR disease management during the growing season relies on an integrated pest management approach that combines host resistance, chemical compounds (fungicides; oomicides), and cultural practices. Therefore, this dissertation research focuses on improving the detection of, and management recommendations for, Phytophthora sojae. In Chapter 1, I provide background and a review of the current literature on Phytophthora sojae management, including genetic resistance, chemical control compounds (fungicides; oomicides), and cultural practices used to mitigate losses to PRR. In Chapter 2, I validate the sensitivity and specificity of a preformulated Recombinase Polymerase Amplification assay for Phytophthora spp. This assay needs no refrigeration, does not require extensive DNA isolation, can be used in the field, and, on different qPCR platforms, could reliably detect down to 3.3-330.0 pg of Phytophthora spp. DNA within plant tissue in under 30 minutes. Based on the limited reagents needed, ease of use, and reliability, this assay would be of benefit to diagnostic labs and inspectors monitoring regulated and non-regulated Phytophthora spp.
Next, I transitioned the Habgood-Gilmour Spreadsheet (‘HaGiS’) from Microsoft Excel format to the R package ‘hagis’ and improved upon the analyses readily available to compare pathotypes from different populations of P. sojae (Chapter 3; ‘hagis’ beta-diversity). I then used the R package ‘hagis’ in my own P. sojae pathotype and fungicide sensitivity survey in the state of Michigan, identifying effective resistance genes and seed treatment compounds for the management of PRR. This study identified a loss of Rps1c and Rps1k, the two most widely planted Phytophthora sojae resistance genes, as viable management tools in Michigan and an increase in pathotype complexity compared to a survey conducted twenty years ago in Michigan (Chapter 4). In Chapter 5, I led a multi-state integrated pest management field trial performed in Michigan, Indiana, and Minnesota to study the effects of partial resistance and seed treatments with or without ethaboxam and metalaxyl on soybean stand, plant dry weights, and final yields under P. sojae pressure. This study found that oomicide-treated seed protects stand across three locations in the Midwest, but the response of soybean varieties to seed treatment was variety and year specific. Significant yield benefits from using oomicide-treated seed were observed in only one location and year. The effects of partial resistance were inconclusive and highlighted the need for a more informative and reliable rating system for soybean varieties' partial resistance to P. sojae. Finally, in Chapter 6, I present the conclusions and impacts of the studies presented in this dissertation. Overall, the studies presented provide an improvement to the detection, virulence data analysis, and integrated pest management recommendations for Phytophthora sojae.
- Title
- Novel Depth Representations for Depth Completion with Application in 3D Object Detection
- Creator
- Imran, Saif Muhammad
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Depth completion refers to interpolating a dense, regular depth grid from sparse and irregularly sampled depth values, often guided by high-resolution color imagery. The primary goal of depth completion is to estimate depth. In practice, methods are trained by minimizing an error between predicted dense depth and ground-truth depth, and are evaluated by how well they minimize this error. Here we identify a second goal, which is to avoid smearing depth across depth discontinuities. This second goal is important because it can improve downstream applications of depth completion such as object detection and pose estimation. However, we also show that the goal of minimizing error can conflict with the goal of eliminating depth smearing.
In this thesis, we propose two novel representations of depth that can encode depth discontinuities across object surfaces by allowing multiple depth estimates in the spatial domain. In order to learn these new representations, we propose carefully designed loss functions and show their effectiveness in deep neural network learning. We show how our representations can avoid inter-object depth mixing and also beat state-of-the-art metrics for depth completion. The quality of ground-truth depth in real-world depth completion problems is another key challenge for learning and for accurate evaluation of methods. Ground-truth depth created by semi-automatic methods suffers from sparse sampling and errors at object boundaries. We show that the combination of these errors and the commonly used evaluation measure has promoted solutions that mix depths across boundaries in current methods. The thesis proposes alternative depth completion performance measures that reduce the preference for mixed depths and promote sharp boundaries.
The thesis also investigates whether additional points from depth completion methods can help in a challenging, high-level perception problem: 3D object detection.
It shows the effect on detection performance of different types of depth noise originating from depth estimates, and proposes some effective ways to reduce noise in the estimates and overcome architecture limitations. The method is demonstrated on both real-world and synthetic datasets.
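The conflict this abstract describes between minimizing error and avoiding depth smearing can be seen in a small worked example. The numbers below are illustrative assumptions, not data from the thesis: a pixel on an object boundary is taken to lie on a foreground surface at 2 m or a background surface at 10 m with equal probability, and `expected_squared_error` is a hypothetical helper.

```python
def expected_squared_error(estimate, depths, probs):
    """Expected squared error of a single depth estimate when the true
    depth takes each value in `depths` with the matching probability."""
    return sum(p * (estimate - d) ** 2 for d, p in zip(depths, probs))


depths = [2.0, 10.0]  # foreground and background surfaces, in meters
probs = [0.5, 0.5]    # the boundary pixel is equally likely to be on either

mean_depth = sum(p * d for d, p in zip(depths, probs))
errors = {
    estimate: expected_squared_error(estimate, depths, probs)
    for estimate in (2.0, mean_depth, 10.0)
}
```

Under these assumptions the error-minimizing estimate is the mean, 6 m, which lies on neither surface, while committing to either real surface doubles the expected squared error. A representation that allows more than one depth value per pixel avoids having to make that trade.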
- Title
- Detecting and Mitigating Bias in Natural Languages
- Creator
- Liu, Haochen
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Natural language processing (NLP) is an increasingly prominent subfield of artificial intelligence (AI). NLP techniques enable intelligent machines to understand and analyze natural languages and make it possible for humans and machines to communicate through natural languages. However, more and more evidence indicates that NLP applications show human-like discriminatory bias or make unfair decisions. As NLP algorithms play an increasingly irreplaceable role in promoting the automation of people's lives, bias in NLP is closely related to users' vital interests and demands considerable attention.
While there are a growing number of studies related to bias in natural languages, the research on this topic is far from complete. In this thesis, we propose several studies to fill the gaps in the area of bias in NLP from three perspectives. First, existing studies are mainly confined to traditional and relatively mature NLP tasks, but for certain newly emerging tasks such as dialogue generation, research on how to define, detect, and mitigate bias is still absent. We conduct pioneering studies on bias in dialogue models to answer these questions. Second, previous studies mostly focus on explicit bias in NLP algorithms but overlook implicit bias. We investigate implicit bias in text classification tasks, where we propose novel methods to detect, explain, and mitigate it. Third, existing research on bias in NLP focuses more on in-processing and post-processing bias mitigation strategies, but rarely considers how to avoid bias being introduced when the training data are generated, especially in the data annotation phase. To this end, we investigate annotator bias in crowdsourced data for NLP tasks and its group effect.
We verify the existence of annotator group bias, develop a novel probabilistic graphical framework to capture it, and propose an algorithm to eliminate its negative impact on NLP model learning.
- Title
- Computational methods to investigate connectivity in evolvable systems
- Creator
- Ackles, Acacia Lee
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Evolution sheds light on all of biology, and evolutionary dynamics underlie some of the most pressing issues we face today. If we can deepen our understanding of evolution, we can better respond to these various challenges. However, studying such processes directly can be difficult; biological data is naturally messy, easily confounded, and often limited. Fortunately, we can use computational modeling to help simplify and systematically untangle complex evolutionary processes. The aim of this...
Evolution sheds light on all of biology, and evolutionary dynamics underlie some of the most pressing issues we face today. If we can deepen our understanding of evolution, we can better respond to these various challenges. However, studying such processes directly can be difficult; biological data is naturally messy, easily confounded, and often limited. Fortunately, we can use computational modeling to help simplify and systematically untangle complex evolutionary processes. The aim of this dissertation is therefore to develop innovative computational frameworks to describe, quantify, and build intuition about evolutionary phenomena, with a focus on connectivity within evolvable systems. Here I introduce three such computational frameworks which address the importance of connectivity in systems across scales. First, I introduce rank epistasis, a model of epistasis that does not rely on baseline assumptions of genetic interactions. Rank epistasis borrows rank-based comparison testing from parametric statistics to quantify mutational landscapes around a target locus and identify how much that landscape is perturbed by mutation at that locus. This model is able to correctly identify lack of epistasis where existing models fail, thereby providing better insight into connectivity at the genome level. Next, I describe the comparative hybrid method, an approach to piecewise study of complex phenotypes. This model creates hybridized structures of well-known cognitive substrates in order to address what facilitates the evolution of learning. The comparative hybrid model allowed us to identify both connectivity and discretization as important components to the evolution of cognition, as well as demonstrate how both these components interact in different cognitive structures.
This approach highlights the importance of recognizing connected components at the level of the phenotype. Finally, I provide an engineering point of view for Tessevolve, a virtual reality enabled system for viewing fitness landscapes in multiple dimensions. While traditional methods have only allowed for 2D visualization, Tessevolve allows the user to view fitness landscapes scaled across 2D, 3D, and 4D. Visualizing these landscapes in multiple dimensions in an intuitive VR-based system allowed us to identify how landscape traversal changes as dimensions increase, demonstrating the way that connections between points across fitness landscapes are affected by dimensionality. As a whole, this dissertation looks at connectivity in computational structures across a broad range of biological scales. These methods and metrics therefore expand our computational toolkit for studying evolution in multiple systems of interest: genotypic, phenotypic, and at the whole landscape level.