Search results
(21 - 35 of 35)
- Title
- Iris Recognition : Enhancing Security and Improving Performance
- Creator
- Sharma, Renu
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Biometric systems recognize individuals based on their physical or behavioral traits, such as face, iris, and voice. The iris (the colored annular region around the pupil) is one of the most popular biometric traits due to its uniqueness, accuracy, and stability. However, its widespread usage raises security concerns about various adversarial attacks. Another challenge is to match iris images against other compatible biometric modalities (e.g., face) to increase the scope of human identification. Therefore, the focus of this thesis is two-fold: first, to enhance the security of the iris recognition system by detecting adversarial attacks, and second, to improve its performance in iris-face matching.

To enhance the security of the iris biometric system, we address two types of adversarial attacks: presentation attacks and morph attacks. A presentation attack (PA) occurs when an adversary presents a fake or altered biometric sample (a plastic eye, a cosmetic contact lens, etc.) to a biometric system to obfuscate their own identity or impersonate another identity. We propose three deep learning-based iris PA detection frameworks corresponding to three different imaging modalities, namely the NIR spectrum, the visible spectrum, and Optical Coherence Tomography (OCT), taking as input a NIR image, a visible-spectrum video, and a cross-sectional OCT image, respectively. The techniques detect known iris PAs effectively and generalize well across unseen attacks, unseen sensors, and multiple datasets. We also present the explainability and interpretability of the results from these techniques. Our other focuses are robustness analysis and continuous updating (retraining) of the trained iris PA detection models. Another burgeoning security threat to biometric systems is the morph attack, which entails generating an image (a morphed image) that embodies multiple different identities, whereas a biometric image is typically associated with a single identity. In this work, we first demonstrate the vulnerability of iris recognition techniques to morph attacks and then develop techniques to detect morphed iris images.

The second focus of the thesis is to improve the performance of a cross-modal system in which iris images are matched against face images. Cross-modality matching involves various challenges, such as cross-spectral, cross-resolution, cross-pose, and cross-temporal variations. To address these challenges, we extract common features present in both images using a multi-channel convolutional network, and we generate synthetic data to augment insufficient training data using a dual-variational autoencoder framework. Together, the two focus areas of this thesis improve the acceptance and widespread usage of the iris biometric system.
- Title
- PALETTEVIZ : A METHOD FOR VISUALIZATION OF HIGH-DIMENSIONAL PARETO-OPTIMAL FRONT AND ITS APPLICATIONS TO MULTI-CRITERIA DECISION MAKING AND ANALYSIS
- Creator
- Talukder, AKM Khaled Ahsan
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Visual representation of a many-objective Pareto-optimal front in a four- or higher-dimensional objective space requires a large number of data points. Moreover, choosing a single point from a large set, even with certain preference information, is problematic, as it imposes a large cognitive burden on decision-makers. Therefore, many-objective optimization and decision-making practitioners have been interested in effective visualization methods that enable them to filter a large set down to a few critical points for further analysis. Most existing visualization methods are borrowed from other data analytics domains and are too generic to be effective for many-criterion decision making. In this dissertation, we propose a visualization method, using star-coordinate and radial visualization plots, for effectively visualizing many-objective trade-off solutions. The proposed method respects basic topological, geometric, and functional decision-making properties of high-dimensional trade-off points mapped to a three-dimensional space. We call this method Palette Visualization (PaletteViz). We demonstrate the use of PaletteViz on a number of large-dimensional multi-objective optimization test problems and three real-world multi-objective problems, one of which has 10 objective and 16 constraint functions. We also adopt the NIMBUS and Pareto-Race concepts from the canonical multi-criterion decision making and analysis literature and introduce them into PaletteViz to demonstrate the ease and advantage of the proposed method.
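The abstract above builds on radial visualization, where each objective is assigned an anchor on the unit circle and a high-dimensional point is placed at the anchor positions' weighted average. The sketch below illustrates only that basic RadViz placement rule, not the actual PaletteViz layering; the function name and normalization convention are assumptions for illustration.

```python
import math

def radviz_2d(point, eps=1e-12):
    """Map an M-dimensional point (components scaled to [0, 1]) onto the
    2-D plane using radial visualization (RadViz) anchors.

    Objective i gets an anchor evenly spaced on the unit circle; the point
    lands at the average of the anchors weighted by its objective values.
    Illustrative sketch only, not the PaletteViz method itself.
    """
    m = len(point)
    total = sum(point)
    if total < eps:               # an all-zero point sits at the origin
        return (0.0, 0.0)
    x = y = 0.0
    for i, v in enumerate(point):
        theta = 2.0 * math.pi * i / m          # anchor angle for objective i
        x += v * math.cos(theta)
        y += v * math.sin(theta)
    return (x / total, y / total)

# A point loading entirely on the first objective lands on that anchor:
print(radviz_2d([1.0, 0.0, 0.0, 0.0]))   # → (1.0, 0.0)
```

A point with equal values in every objective maps to (approximately) the origin, which is what makes the layout useful for spotting trade-off structure at a glance.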
- Title
- Optimizing and Improving the Fidelity of Reactive, Polarizable Molecular Dynamics Simulations on Modern High Performance Computing Architectures
- Creator
- O'Hearn, Kurt A.
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Reactive, polarizable molecular dynamics simulations are a crucial tool for the high-fidelity study of large systems with chemical reactions. Several approaches to such simulations have been employed, with varying degrees of computational cost and physical accuracy. One of the more successful approaches in recent years, the reactive force field (ReaxFF) model, was designed to fill the gap between traditional classical models and quantum mechanical models by incorporating a dynamic bond order potential term. When coupling ReaxFF with dynamic global charge models for electrostatics, special considerations are necessary to obtain highly performant implementations, especially on modern high-performance computing architectures.

In this work, we detail the performance optimization of the PuReMD (Purdue Reactive Molecular Dynamics) software package, an open-source, GPLv3-licensed implementation of ReaxFF coupled with dynamic charge models. We begin by exploring the tuning of the iterative Krylov linear solvers underpinning the global charge models in a shared-memory parallel context using OpenMP, with the explicit goal of minimizing the mean combined preconditioner and solver time. We found that with appropriate solver tuning, significant speedups and scalability improvements can be achieved. Following these successes, we extend these approaches to the solvers in the distributed-memory MPI implementation of PuReMD and broaden the scope of optimization to other portions of the ReaxFF potential, such as the bond order computations. Here again, sizable performance gains were achieved for large simulations numbering in the hundreds of thousands of atoms.

With these performance improvements in hand, we next turn to another important use of PuReMD: the development of ReaxFF force fields for new materials. The high fidelity inherent in ReaxFF simulations for different chemistries oftentimes comes at the expense of a steep learning curve for parameter optimization, due in part to the complexities of the high-dimensional parameter space and in part to the deep domain knowledge needed to adequately control the ReaxFF functional forms. To diagnose and combat these issues, a study was undertaken to optimize parameters for Li-O systems using the OGOLEM genetic algorithms framework coupled with a modified shared-memory version of PuReMD. We found that with careful training set design, sufficient optimization control with tuned genetic algorithms, and improved polarizability through enhanced charge model use, higher accuracy was achieved in simulations involving ductile fracture behavior, a phenomenon that has hitherto been difficult to model correctly.

Finally, we return to performance optimization for the GPU-accelerated distributed-memory PuReMD codebase. Modern supercomputers have recently achieved exascale levels of peak arithmetic rates, due in large part to the design decision to incorporate massive numbers of GPUs. To take advantage of such computing systems, the MPI+CUDA version of PuReMD was re-designed and benchmarked on modern NVIDIA Tesla GPUs. Performance was on par with or exceeded that of LAMMPS Kokkos, a ReaxFF implementation developed at Sandia National Laboratories, with PuReMD typically outperforming LAMMPS Kokkos at larger scales.
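The solver tuning described above centers on preconditioned Krylov methods: the dynamic charge models reduce to repeated symmetric positive-definite linear solves, and the preconditioner/solver pairing determines the combined time the abstract sets out to minimize. The toy sketch below shows the simplest such pairing, conjugate gradient with a Jacobi (diagonal) preconditioner; it is a generic illustration in NumPy, not PuReMD code.

```python
import numpy as np

def jacobi_pcg(A, b, tol=1e-10, max_iter=200):
    """Conjugate gradient with a Jacobi (diagonal) preconditioner for an
    SPD system A x = b. Illustrative of the preconditioner/solver pairing
    tuned in ReaxFF charge solvers; not the PuReMD implementation.
    """
    Minv = 1.0 / np.diag(A)            # Jacobi preconditioner: M = diag(A)
    x = np.zeros_like(b)
    r = b - A @ x                      # initial residual
    z = Minv * r                       # preconditioned residual
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:    # converged
            break
        z = Minv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p      # new search direction
        rz = rz_new
    return x

# Solve a small SPD system and check the residual:
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = jacobi_pcg(A, b)
print(np.allclose(A @ x, b))   # → True
```

In practice the tuning space includes the choice of preconditioner, its refresh frequency, and the solver tolerance, each trading setup cost against iteration count.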
- Title
- VISIONING THE AGRICULTURE BLOCKCHAIN : THE ROLE AND RISE OF BLOCKCHAIN IN THE COMMERCIAL POULTRY INDUSTRY
- Creator
- Fennell, Chris
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Blockchain is an emerging technology that technologists and industry leaders are exploring as a way to revolutionize the agriculture supply chain. The problem is that human and ecological insights are needed to understand the complexities of how blockchain could fulfill these visions. In this work, I examine how blockchain's promising vision of traceability, immutability, and distributed properties presents both advancements and challenges to rural farming. This work wrestles with the more subtle ways blockchain technology would be integrated into existing infrastructure. Through interviews and participatory design workshops, I talked with an expansive set of stakeholders, including Amish farmers, contract growers, senior leadership, and field supervisors. This research illuminates that commercial poultry farming is such a complex and diffuse system that any overhaul of its core infrastructure will be difficult to "roll back" once blockchain is "rolled out." Through an HCI and sociotechnical systems perspective, drawing particular insights from Science and Technology Studies theories of infrastructure and breakdown, this dissertation asserts three main concerns. First, it uncovers the dominant narratives on the farm around revision and "roll back" of blockchain, connecting to theories of version control from computer science. Second, it uncovers that a core concern of the poultry supply chain is death, and I reveal the sociotechnical and material implications for the integration of blockchain. Finally, it discusses the meaning of "security" for the poultry supply chain, in which biosecurity is prioritized over cybersecurity, and how blockchain impacts these concerns. Together these findings point to significant implications for designers of blockchain infrastructure and for how rural workers will integrate the technology into the supply chain.
- Title
- UNDERSTANDING THE GENETIC BASIS OF HUMAN DISEASES BY COMPUTATIONALLY MODELING THE LARGE-SCALE GENE REGULATORY NETWORKS
- Creator
- Wang, Hao
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Many severe diseases, including breast cancer and Alzheimer's disease, are known to be caused by genetic disorders of the human genome. Understanding the genetic basis of human diseases plays a vital role in personalized medicine and precision therapy. However, pervasive spatial correlations between disease-associated SNPs have hindered the ability of traditional GWAS studies to discover causal SNPs and have obscured the underlying mechanisms of disease-associated SNPs. Recently, diverse biological datasets generated by large data consortia have provided a unique opportunity to fill the gap between genotypes and phenotypes using biological networks, which represent the complex interplay between genes, enhancers, and transcription factors (TFs) in 3D space. The comprehensive delineation of the regulatory landscape calls for highly scalable computational algorithms to reconstruct 3D chromosome structures and mechanistically predict enhancer-gene links. In this dissertation, I first developed two algorithms, FLAMINGO and tFLAMINGO, to reconstruct high-resolution 3D chromosome structures. The algorithmic advancements of FLAMINGO and tFLAMINGO enable the reconstruction of 3D chromosome structures at an unprecedented resolution from highly sparse chromatin contact maps. I further developed two integrative algorithms, ComMUTE and ProTECT, to mechanistically predict long-range enhancer-gene links by modeling TF profiles. In extensive evaluations, these two algorithms demonstrate superior performance over existing algorithms in predicting enhancer-gene links and decoding TF regulatory grammars. The successful application of ComMUTE and ProTECT across 127 cell types not only provides a rich resource of gene regulatory networks but also sheds light on the mechanistic understanding of QTLs, disease-associated genetic variants, and higher-order chromatin interactions.
- Title
- ASSURING THE ROBUSTNESS AND RESILIENCY OF LEARNING-ENABLED AUTONOMOUS SYSTEMS
- Creator
- Langford, Michael Austin
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
As Learning-Enabled Systems (LESs) have become more prevalent in safety-critical applications, addressing their assurance has become increasingly important. Because machine learning models in LESs are not explicitly programmed like traditional software, developers typically have less direct control over the inferences learned by an LES, relying instead on semantically valid and complete patterns being extracted from the system's exposure to the environment. As such, the behavior of an LES is strongly dependent on the quality of its training experience. However, run-time environments are often noisy or not well-defined, and uncertainty in the behavior of an LES can arise when there is inadequate coverage of relevant training/test cases (e.g., corner cases). It is challenging to assure that safety-critical LESs will perform as expected when exposed to run-time conditions never experienced during training or validation. This doctoral research contributes automated methods to improve the robustness and resilience of an LES. In this work, a robust LES is less sensitive to noise in the environment, and a resilient LES is able to self-adapt to adverse run-time contexts in order to mitigate system failure. The proposed methods harness diversity-driven evolution-based methods, machine learning, and software assurance cases to train robust LESs, uncover robust system configurations, and foster resiliency through self-adaptation and predictive behavior modeling. This doctoral work demonstrates these capabilities by applying the proposed framework to deep learning and autonomous cyber-physical systems.
- Title
- Efficient and Secure Message Passing for Machine Learning
- Creator
- Liu, Xiaorui
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Machine learning (ML) techniques have had a revolutionary impact on human society, and they will continue to act as technological innovators in the future. To broaden this impact, it is urgent to solve emerging and critical challenges in machine learning, such as efficiency and security issues. On the one hand, ML models have become increasingly powerful due to big data and big models, but this brings tremendous challenges in designing efficient optimization algorithms to train big ML models from big data. The most effective way to scale ML is to parallelize the computation across distributed systems composed of many computational devices. In practice, however, the scalability and efficiency of such systems are greatly limited by information synchronization, since message passing between devices dominates the total running time. In other words, the major bottleneck lies in the high communication cost between devices, especially as the scale of the system and the models grows while the communication bandwidth remains relatively limited. This communication bottleneck often limits the practical speedup of distributed ML systems. On the other hand, recent research has revealed that many ML models suffer from security vulnerabilities. In particular, deep learning models can be easily deceived by unnoticeable perturbations in the data. Meanwhile, graphs are a prevalent data structure for many real-world datasets that encode pairwise relations between entities, such as social networks, transportation networks, and chemical molecules. Graph neural networks (GNNs) generalize and extend the representation learning power of traditional deep neural networks (DNNs) from regular grids, such as images, video, and text, to irregular graph-structured data through message passing frameworks. Many important applications on these data can therefore be treated as computational tasks on graphs, such as recommender systems, social network analysis, and traffic prediction. Unfortunately, the vulnerability of deep learning models also carries over to GNNs, which raises significant concerns about their applications, especially in safety-critical areas. It is therefore critical to design intrinsically secure ML models for graph-structured data.

The primary objective of this dissertation is to solve these challenges through innovative research and principled methods. In particular, we propose multiple distributed optimization algorithms with efficient message passing to mitigate the communication bottleneck and speed up ML model training in distributed ML systems. We also propose multiple secure message passing schemes as building blocks of graph neural networks, aiming to significantly improve the security and robustness of ML models.
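The message passing framework the abstract refers to can be made concrete with one GCN-style aggregation round: each node mixes its neighbors' (and its own) features through a degree-normalized adjacency matrix, then applies a linear map and a nonlinearity. The NumPy sketch below shows the generic scheme only, not any specific secure variant proposed in the dissertation; the function name is an assumption for illustration.

```python
import numpy as np

def gcn_message_pass(A, H, W):
    """One round of neighborhood message passing (GCN-style aggregation).

    A: (n, n) adjacency matrix, H: (n, d) node features, W: (d, k) weights.
    Aggregates features over the symmetrically degree-normalized adjacency
    (with self-loops), then applies a linear transform and ReLU.
    """
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    deg = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt    # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)      # aggregate, transform, ReLU

# Two connected nodes end up mixing each other's one-hot features evenly:
A = np.array([[0.0, 1.0], [1.0, 0.0]])
H = np.array([[1.0, 0.0], [0.0, 1.0]])
W = np.eye(2)
print(gcn_message_pass(A, H, W))
```

Secure message passing schemes of the kind the dissertation proposes typically replace this fixed averaging with aggregations that are robust to adversarially perturbed edges or features.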
- Title
- Efficient Distributed Algorithms : Better Theory and Communication Compression
- Creator
- LI, YAO
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Large-scale machine learning models are often trained by distributed algorithms over either centralized or decentralized networks. The former uses a central server to aggregate the information of local computing agents and broadcast the averaged parameters in a master-slave architecture. The latter considers a connected network formed by all agents, where information can only be exchanged with accessible neighbors via a mixing matrix of communication weights that encodes the network's topology. Compared with centralized optimization, decentralization facilitates data privacy and reduces the communication burden that model synchronization places on the single central agent, but the connectivity of the communication network weakens the theoretical convergence complexity of decentralized algorithms. There are therefore still gaps between decentralized and centralized algorithms in terms of convergence conditions and rates. In the first part of this dissertation, we consider two decentralized algorithms, EXTRA and NIDS, both of which converge linearly with strongly convex objective functions, and answer two questions about them: What are the optimal upper bounds for their stepsizes? Do decentralized algorithms require more properties of the objective functions for linear convergence than centralized ones? More specifically, we relax the conditions required for linear convergence of both algorithms. For EXTRA, we show that the stepsize is comparable to that of centralized algorithms. For NIDS, the upper bound of the stepsize is shown to be exactly the same as the centralized one. In addition, we relax the requirements on the objective functions and the mixing matrices, and we provide linear convergence results for both algorithms under the weakest conditions.

As the number of computing agents and the dimension of the model increase, the communication cost of parameter synchronization becomes the major obstacle to efficient learning. Communication compression techniques have exhibited great potential as an antidote, accelerating distributed machine learning by mitigating the communication bottleneck. In the rest of the dissertation, we propose compressed residual communication frameworks for both centralized and decentralized optimization and design different algorithms to achieve efficient communication. For centralized optimization, we propose DORE, a modified parallel stochastic gradient descent method with bidirectional residual compression, which reduces over 95% of the overall communication. Our theoretical analysis demonstrates that the proposed strategy has superior convergence properties for both strongly convex and nonconvex objective functions. Existing works on decentralized optimization mainly focus on smooth problems and on compressing DGD-type algorithms; the restriction to smooth objective functions and the sublinear convergence rates under relatively strong assumptions limit these algorithms' applicability and practical performance. Motivated by primal-dual algorithms, we propose Prox-LEAD, a linearly convergent decentralized algorithm with compression, to tackle strongly convex problems with a nonsmooth regularizer. Our theory describes the coupled dynamics of the inexact primal and dual updates as well as the compression error without assuming bounded gradients. The superiority of the proposed algorithm is demonstrated through comparison with state-of-the-art algorithms in terms of convergence complexity and through numerical experiments. Our algorithmic framework also sheds light on compressed communication in other primal-dual algorithms by reducing the impact of inexact iterations.
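The residual-compression idea behind methods like DORE can be illustrated with the standard error-feedback trick: each round, the worker compresses the gradient plus the accumulated compression error, transmits the compressed message, and keeps the leftover locally so nothing is lost across iterations. The sketch below uses top-k sparsification as the compressor; it illustrates the error-feedback mechanism only and is not the actual DORE algorithm (which uses bidirectional compression of model and gradient residuals).

```python
import numpy as np

def topk_compress(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def compressed_step(grad, state, k=1):
    """One worker-side round of residual (error-feedback) compression.

    Compresses grad + accumulated error, sends the sparse result, and
    keeps the compression error for the next round. Illustrative sketch
    of the error-feedback idea, not the published DORE method.
    """
    corrected = grad + state["error"]
    msg = topk_compress(corrected, k)       # what actually goes on the wire
    state["error"] = corrected - msg        # residual kept locally
    return msg

state = {"error": np.zeros(3)}
m1 = compressed_step(np.array([3.0, 1.0, 0.5]), state)   # sends [3, 0, 0]
m2 = compressed_step(np.array([0.0, 1.0, 0.5]), state)   # leftover re-enters
print(m1, m2, state["error"])
```

Because the residual re-enters the next round, the compressed messages sum toward the true gradient sum over time, which is what preserves convergence despite sending only a fraction of the coordinates.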
- Title
- Sparse Large-Scale Multi-Objective Optimization for Climate-Smart Agricultural Innovation
- Creator
- Kropp, Ian Meyer
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
The challenge of our generation is to produce enough food to feed the present and future global population. This is no simple task, as the world population is expanding and becoming more affluent, and conventional agriculture often degrades the environment. Without a healthy and functional environment, agriculture as we know it will fail. Therefore, we must balance our broad goals of sustainability and food production as a single system. Multi-objective optimization, a family of algorithms that search for solutions to complex problems with conflicting objectives, is an effective tool for balancing these two goals. In this dissertation, we apply multi-objective optimization to find optimal management practices for irrigating and fertilizing corn. There are two areas for improvement in the multi-objective optimization of corn management: existing methods run burdensomely slowly, and they do not account for the uncertainty of weather. Improving run time and optimizing in the face of weather uncertainty are the two goals of this dissertation. We address these goals with four novel methodologies that advance the fields of biosystems and agricultural engineering as well as computer science and engineering.

In the first study, we address the first goal by drastically improving the performance of evolutionary multi-objective algorithms for sparse large-scale optimization problems. Sparse optimization problems, such as irrigation and nutrient management, are problems whose optimal solutions are mostly zero. Our novel algorithm, called sparse population sampling (SPS), integrates with and improves all population-based algorithms over almost all test scenarios. SPS, when used with NSGA-II, outperformed the existing state-of-the-art algorithms on the most complex sparse large-scale optimization problems (i.e., those with 2,500 or more decision variables). The second study addresses the second goal by optimizing common management practices at a study site in Cass County, Michigan, for all climate scenarios. This methodology, which relies on SPS from the first study, implements the concept of innovization in agriculture. In our innovization framework, 30 years of management practices were optimized against observed weather data and then compared to common practices in Cass County, Michigan. The differences between the optimal solutions and common practices were transformed into simple recommendations for farmers to apply during future growing seasons. Our recommendations drastically increased yields under 420 validation scenarios with no impact on nitrogen leaching. The third study further improves the performance of sparse large-scale optimization. Where SPS was a single component of a population-based algorithm, our proposed method, S-NSGA-II, is a complete, novel evolutionary algorithm for sparse large-scale optimization problems. Our algorithm outperforms or performs as well as other contemporary sparse large-scale optimization algorithms, especially on problems with more than 800 decision variables. This enhanced convergence will further improve multi-objective optimization in agriculture. Our final study, which addresses the second goal, takes a different approach to optimizing agricultural systems in the face of climate uncertainty: it uses stochastic weather to quantify risk during optimization. In this way, farmers can choose between optimal management decisions with a full understanding of the risks involved in every management decision.
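The premise behind sparse population sampling is that when optimal solutions are mostly zero (e.g., irrigate on only a few days of a season), an evolutionary algorithm converges faster if its population is biased toward mostly-zero candidates from the start. The sketch below shows that premise as a simple sparse initializer; the function names and the fixed zero-probability are illustrative assumptions, not the published SPS operator.

```python
import random

def sparse_individual(n_vars, sparsity, lo=0.0, hi=1.0, rng=random):
    """Sample one mostly-zero candidate solution.

    Each decision variable is zero with probability `sparsity`; the rest
    draw uniformly from [lo, hi]. Illustrative of the mostly-zero-optima
    premise behind sparse population sampling, not the SPS operator itself.
    """
    return [0.0 if rng.random() < sparsity else rng.uniform(lo, hi)
            for _ in range(n_vars)]

def sparse_population(pop_size, n_vars, sparsity=0.9, seed=42):
    """Build an initial population biased toward sparse solutions."""
    rng = random.Random(seed)
    return [sparse_individual(n_vars, sparsity, rng=rng)
            for _ in range(pop_size)]

# A 20-member population over 2,500 decision variables, ~90% zeros:
pop = sparse_population(pop_size=20, n_vars=2500, sparsity=0.9)
zero_frac = sum(v == 0.0 for ind in pop for v in ind) / (20 * 2500)
print(round(zero_frac, 2))   # close to 0.9
```

In a full algorithm such an initializer is paired with variation operators that preserve sparsity, so the search spends its budget on the few nonzero variables that matter.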
- Title
- IMPROVED DETECTION AND MANAGEMENT OF PHYTOPHTHORA SOJAE
- Creator
- McCoy, Austin Glenn
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Phytophthora spp. cause root and stem rots, leaf blights and fruit rots on agricultural and economically important plant species. Symptoms of Phytophthora infected plants, particularly root rots, can be difficult to distinguish from other oomycete and fungal pathogens and often result in devastating losses. Phytophthora spp. can lie dormant for many years in the oospore stage, making long-term management of these diseases difficult. Phytophthora sojae is an important and prevalent pathogen of...
Show morePhytophthora spp. cause root and stem rots, leaf blights and fruit rots on agricultural and economically important plant species. Symptoms of Phytophthora infected plants, particularly root rots, can be difficult to distinguish from other oomycete and fungal pathogens and often result in devastating losses. Phytophthora spp. can lie dormant for many years in the oospore stage, making long-term management of these diseases difficult. Phytophthora sojae is an important and prevalent pathogen of soybean (Glycine max L.) worldwide, causing Phytophthora stem and root rot (PRR). PRR disease management during the growing season relies on an integrated pest management approach using a combination of host resistance, chemical compounds (fungicides; oomicides) and cultural practices for successful management. Therefore, this dissertation research focuses on improving the detection and management recommendations for Phytophthora sojae. In Chapter 1 I provide background and a review of the current literature on Phytophthora sojae management, including genetic resistance, chemical control compounds (fungicides; oomicides) and cultural practices used to mitigate losses to PRR. In my second chapter I validate the sensitivity and specificity of a preformulated Recombinase Polymerase Amplification assay for Phytophthora spp. This assay needs no refrigeration, does not require extensive DNA isolation, can be used in the field, and different qPCR platforms could reliably detect down to 3.3-330.0 pg of Phytophthora spp. DNA within plant tissue in under 30 minutes. Based on the limited reagents needed, ease of use, and reliability, this assay would be of benefit to diagnostic labs and inspectors monitoring regulated and non-regulated Phytophthora spp. 
Next, I transitioned the Habgood-Gilmour Spreadsheet (‘HaGiS’) from Microsoft Excel format to the R package ‘hagis’ and improved upon the analyses readily available to compare pathotypes from different populations of P. sojae (Chapter 3; ‘hagis’ beta-diversity). I then applied the R package ‘hagis’ in my own P. sojae pathotype and fungicide sensitivity survey in the state of Michigan, identifying effective resistance genes and seed treatment compounds for the management of PRR. This study identified a loss of Rps1c and Rps1k, the two most widely planted Phytophthora sojae resistance genes, as viable management tools in Michigan, and an increase in pathotype complexity compared to a survey conducted in Michigan twenty years ago (Chapter 4). In Chapter 5, I led a multi-state integrated pest management field trial in Michigan, Indiana, and Minnesota studying the effects of partial resistance and seed treatments with or without ethaboxam and metalaxyl on soybean stand, plant dry weights, and final yields under P. sojae pressure. This study found that oomicide-treated seed protects stand across three locations in the Midwest, but the response of soybean varieties to seed treatment was variety- and year-specific. Significant yield benefits from oomicide-treated seed were observed in only one location and year. The effects of partial resistance were inconclusive and highlighted the need for a more informative and reliable rating system for soybean varieties' partial resistance to P. sojae. Finally, in Chapter 6, I present conclusions and impacts of the studies presented in this dissertation. Overall, these studies improve the detection, virulence data analysis, and integrated pest management recommendations for Phytophthora sojae.
- Title
- Novel Depth Representations for Depth Completion with Application in 3D Object Detection
- Creator
- Imran, Saif Muhammad
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Depth completion refers to interpolating a dense, regular depth grid from sparse and irregularly sampled depth values, often guided by high-resolution color imagery. The primary goal of depth completion is to estimate depth; in practice, methods are trained by minimizing an error between predicted dense depth and ground-truth depth, and are evaluated by how well they minimize this error. Here we identify a second goal: to avoid smearing depth across depth discontinuities. This second goal is important because it can improve downstream applications of depth completion such as object detection and pose estimation. However, we also show that the goal of minimizing error can conflict with the goal of eliminating depth smearing.

In this thesis, we propose two novel representations of depth that can encode depth discontinuities across object surfaces by allowing multiple depth estimates in the spatial domain. To learn these new representations, we propose carefully designed loss functions and show their effectiveness in deep neural network learning. We show how our representations can avoid inter-object depth mixing and also beat state-of-the-art metrics for depth completion.

The quality of ground-truth depth in real-world depth completion problems is another key challenge for learning and accurate evaluation of methods. Ground-truth depth created by semi-automatic methods suffers from sparse sampling and errors at object boundaries. We show that the combination of these errors and the commonly used evaluation measure has promoted solutions that mix depths across boundaries in current methods. The thesis proposes alternate depth completion performance measures that reduce the preference for mixed depths and promote sharp boundaries.

The thesis also investigates whether additional points from depth completion methods can help in a challenging, high-level perception problem: 3D object detection.
It shows the effect of different kinds of depth noise originating from depth estimates on detection performance and proposes effective ways to reduce noise in the estimates and overcome architecture limitations. The method is demonstrated on both real-world and synthetic datasets.
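The conflict between error minimization and depth smearing described above can be seen in a toy numeric example. The following sketch is illustrative only (the sample depths and the 50/50 split are my own, not from the thesis): a pixel at an object boundary is equally likely to belong to a near surface at 2 m or a far surface at 10 m, and a predictor trained to minimize mean squared error is rewarded for outputting a depth that lies on neither surface.

```python
import numpy as np

# Depths of sparse samples around a boundary pixel: half from a near
# surface (2 m), half from a far one (10 m).
samples = np.array([2.0] * 50 + [10.0] * 50)

def mse(d):
    """Mean squared error of predicting a single depth d for this pixel."""
    return float(np.mean((samples - d) ** 2))

print(mse(6.0))   # 16.0 -- the mean of the two surfaces minimizes MSE...
print(mse(2.0))   # 32.0 -- ...while committing to either real surface
print(mse(10.0))  # 32.0 --  costs more, so error-driven training rewards
                  #          a "smeared" depth that lies on no surface.
```

This is why a measure that only averages per-pixel error can prefer mixed depths at boundaries, motivating both the multi-depth representations and the alternate performance measures proposed in the thesis.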
- Title
- Detecting and Mitigating Bias in Natural Languages
- Creator
- Liu, Haochen
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Natural language processing (NLP) is an increasingly prominent subfield of artificial intelligence (AI). NLP techniques enable intelligent machines to understand and analyze natural languages and make it possible for humans and machines to communicate through natural language. However, growing evidence indicates that NLP applications exhibit human-like discriminatory bias or make unfair decisions. As NLP algorithms play an increasingly irreplaceable role in automating people's lives, bias in NLP is closely related to users' vital interests and demands considerable attention.

While there is a growing number of studies related to bias in natural language, research on this topic is far from complete. In this thesis, we propose several studies to fill gaps in the area of bias in NLP from three perspectives. First, existing studies are mainly confined to traditional and relatively mature NLP tasks; for certain newly emerging tasks such as dialogue generation, research on how to define, detect, and mitigate bias is still absent. We conduct pioneering studies on bias in dialogue models to answer these questions. Second, previous studies focus mainly on explicit bias in NLP algorithms but overlook implicit bias. We investigate implicit bias in text classification tasks, proposing novel methods to detect, explain, and mitigate it. Third, existing research on bias in NLP focuses more on in-processing and post-processing bias mitigation strategies and rarely considers how to avoid bias being produced during the generation of the training data, especially in the data annotation phase. To this end, we investigate annotator bias in crowdsourced data for NLP tasks and its group effect.
We verify the existence of annotator group bias, develop a novel probabilistic graphical framework to capture it, and propose an algorithm to eliminate its negative impact on NLP model learning.
- Title
- Computational methods to investigate connectivity in evolvable systems
- Creator
- Ackles, Acacia Lee
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Evolution sheds light on all of biology, and evolutionary dynamics underlie some of the most pressing issues we face today. If we can deepen our understanding of evolution, we can better respond to these challenges. However, studying such processes directly can be difficult: biological data is naturally messy, easily confounded, and often limited. Fortunately, we can use computational modeling to help simplify and systematically untangle complex evolutionary processes. The aim of this dissertation is therefore to develop innovative computational frameworks to describe, quantify, and build intuition about evolutionary phenomena, with a focus on connectivity within evolvable systems. Here I introduce three such computational frameworks, which address the importance of connectivity in systems across scales.

First, I introduce rank epistasis, a model of epistasis that does not rely on baseline assumptions about genetic interactions. Rank epistasis borrows rank-based comparison testing from nonparametric statistics to quantify the mutational landscape around a target locus and identify how much that landscape is perturbed by mutation at that locus. This model correctly identifies a lack of epistasis where existing models fail, thereby providing better insight into connectivity at the genome level.

Next, I describe the comparative hybrid method, an approach to the piecewise study of complex phenotypes. This model creates hybridized structures of well-known cognitive substrates in order to address what facilitates the evolution of learning. The comparative hybrid model allowed us to identify both connectivity and discretization as important components of the evolution of cognition, as well as demonstrate how these components interact in different cognitive structures.
This approach highlights the importance of recognizing connected components at the level of the phenotype.

Finally, I provide an engineering point of view on Tessevolve, a virtual-reality-enabled system for viewing fitness landscapes in multiple dimensions. While traditional methods have only allowed 2D visualization, Tessevolve lets the user view fitness landscapes in 2D, 3D, and 4D. Visualizing these landscapes in multiple dimensions in an intuitive VR-based system allowed us to identify how landscape traversal changes as dimensionality increases, demonstrating how connections between points across fitness landscapes are affected by dimensionality.

As a whole, this dissertation examines connectivity in computational structures across a broad range of biological scales. These methods and metrics therefore expand our computational toolkit for studying evolution in multiple systems of interest: genotypic, phenotypic, and at the whole-landscape level.
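The rank-epistasis idea sketched above, comparing the ranked effects of a panel of mutations in two genetic backgrounds to measure how much a focal mutation perturbs the surrounding mutational landscape, can be illustrated with a minimal sketch. The dissertation defines its own metric; the function names, the use of total rank displacement, and the normalization below are my own illustrative choices, not the published definition:

```python
def ranks(values):
    """Rank values from smallest to largest (0-based; ties broken by input order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def rank_perturbation(effects_wt, effects_mut):
    """Normalized rank displacement of a mutation panel between two backgrounds.

    effects_wt[i] and effects_mut[i] are the fitness effects of mutation i
    measured without and with the focal mutation. Returns 0.0 when the focal
    mutation leaves the ordering of the landscape unchanged (no epistasis
    signal) and 1.0 when the ordering is completely reversed.
    """
    rw, rm = ranks(effects_wt), ranks(effects_mut)
    n = len(rw)
    displacement = sum(abs(a - b) for a, b in zip(rw, rm))
    max_disp = n * n // 2  # displacement of a fully reversed ranking
    return displacement / max_disp

# Same ordering in both backgrounds -> no perturbation; reversed -> maximal.
print(rank_perturbation([0.1, 0.3, 0.5, 0.9], [0.2, 0.4, 0.6, 1.0]))  # 0.0
print(rank_perturbation([0.1, 0.3, 0.5, 0.9], [0.9, 0.5, 0.3, 0.1]))  # 1.0
```

Because only orderings are compared, no parametric assumption about the scale or distribution of fitness effects is needed, which is the appeal of a rank-based approach.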
- Title
- Memory-efficient emulation of physical tabular data using quadtree decomposition
- Creator
- Carlson, Jared
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Computationally expensive functions are sometimes replaced in simulations with an emulator that approximates the true function (e.g., equations of state, wavelength-dependent opacity, or composition-dependent material properties). For functions that have a constrained domain of interest, this can be done by discretizing the domain and performing a local interpolation on the tabulated function values of each subdomain. For these so-called tabular data methods, the way the domain is discretized and the input space is mapped to each subdomain can drastically influence the memory and computational costs of the emulator. This is especially true for functions that vary sharply in some regions. We present a method for domain discretization and mapping that utilizes quadtrees, yielding significant reductions in the size of the emulator with minimal increases in computational cost or loss of global accuracy. We apply our method to the electron-positron Helmholtz free energy equation of state and show over an order of magnitude reduction in memory costs for reasonable levels of numerical accuracy.
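The core idea, refining the quadtree only where local interpolation of the tabulated function is inaccurate, so that sharply varying regions get small cells and smooth regions get large ones, can be sketched as follows. This is a minimal illustration under my own assumptions (a scalar 2D function, bilinear interpolation from cell corners, and a center-point refinement test); the dissertation's emulator and its Helmholtz equation-of-state application are more involved:

```python
import numpy as np

def build_quadtree(f, x0, x1, y0, y1, tol=1e-3, max_depth=8, depth=0):
    """Recursively subdivide [x0,x1]x[y0,y1] until bilinear interpolation
    from the cell corners matches f at the cell center within tol."""
    xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
    corners = [f(x0, y0), f(x1, y0), f(x0, y1), f(x1, y1)]
    # The bilinear estimate at the cell center is the mean of the corners.
    est = 0.25 * sum(corners)
    if depth >= max_depth or abs(est - f(xm, ym)) <= tol:
        return {"bounds": (x0, x1, y0, y1), "corners": corners}  # leaf
    return {"bounds": (x0, x1, y0, y1),
            "children": [build_quadtree(f, x0, xm, y0, ym, tol, max_depth, depth + 1),
                         build_quadtree(f, xm, x1, y0, ym, tol, max_depth, depth + 1),
                         build_quadtree(f, x0, xm, ym, y1, tol, max_depth, depth + 1),
                         build_quadtree(f, xm, x1, ym, y1, tol, max_depth, depth + 1)]}

def evaluate(node, x, y):
    """Descend to the leaf containing (x, y), then bilinearly interpolate."""
    while "children" in node:
        x0, x1, y0, y1 = node["bounds"]
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        node = node["children"][(x >= xm) + 2 * (y >= ym)]
    x0, x1, y0, y1 = node["bounds"]
    tx, ty = (x - x0) / (x1 - x0), (y - y0) / (y1 - y0)
    c00, c10, c01, c11 = node["corners"]
    return (1-tx)*(1-ty)*c00 + tx*(1-ty)*c10 + (1-tx)*ty*c01 + tx*ty*c11

# Illustrative function with a steep ring at r = 0.5: the tree refines near
# the ring and stays coarse elsewhere, so memory tracks local complexity.
f = lambda x, y: float(np.tanh(20 * (x*x + y*y - 0.25)))
tree = build_quadtree(f, -1.0, 1.0, -1.0, 1.0, tol=1e-2)
err = abs(evaluate(tree, 0.3, 0.4) - f(0.3, 0.4))  # small on the steep ring
```

Only cells that fail the local accuracy test are subdivided, which is why such a tree can be dramatically smaller than a uniform grid fine enough for the steepest region.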
- Title
- Efficient Transfer Learning for Heterogeneous Machine Learning Domains
- Creator
- Zhu, Zhuangdi
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Recent advances in deep machine learning hinge on large amounts of labeled data. Such heavy dependence on supervision data impedes the broader application of deep learning in more practical scenarios, where data annotation and labeling can be expensive (e.g., high-frequency trading) or even dangerous (e.g., training autonomous-driving models). Transfer Learning (TL), equivalently referred to as knowledge transfer, is an effective strategy for confronting such challenges. TL, by definition, distills external knowledge from relevant domains into the target learning domain, hence requiring fewer supervision resources than learning from scratch. TL is beneficial for learning tasks in which supervision data is limited or even unavailable, and it is an essential property for realizing Generalized Artificial Intelligence.

In this thesis, we propose sample-efficient TL approaches that use limited, sometimes unreliable, resources. We take a deep look into the settings of Reinforcement Learning (RL) and Supervised Learning and derive solutions for the two domains respectively. In particular, for RL we focus on a problem setting called imitation learning, where supervision from the environment is either unavailable or scarce, and the learning agent must transfer knowledge from exterior resources, such as demonstrations of a previously trained expert, to learn a good policy. For supervised learning, we consider a distributed machine learning scheme called Federated Learning (FL), which is more challenging than traditional machine learning since the training data is distributed and non-sharable during the learning process. Under this distributed setting, it is imperative to enable TL among distributed learning clients to reach satisfactory generalization performance.
We show, with both theoretical support and extensive experiments, that our proposed algorithms facilitate the machine learning process with knowledge transfer to achieve higher asymptotic performance, in a principled and more efficient manner than prior art.