Search results
(1 - 20 of 20)
- Title
- AN EVOLUTIONARY MULTI-OBJECTIVE APPROACH TO SUSTAINABLE AGRICULTURAL WATER AND NUTRIENT OPTIMIZATION
- Creator
- Kropp, Ian Meyer
- Date
- 2018
- Collection
- Electronic Theses & Dissertations
- Description
One of the main problems that society is facing in the 21st century is that agricultural production must keep pace with a rapidly increasing global population in an environmentally sustainable manner. One of the solutions to this global problem is a systems approach that applies optimization techniques to manage farm operations. However, unlike existing agricultural optimization research, this work seeks to optimize multiple agricultural objectives at once via multi-objective optimization techniques. Specifically, the Unified Non-dominated Sorting Genetic Algorithm III (U-NSGA-III) searched for irrigation and nutrient management practices that minimized combinations of environmental objectives (e.g., total irrigation applied, total nitrogen leached) while maximizing crop yield for maize. During optimization, the Decision Support System for Agrotechnology Transfer (DSSAT) crop model calculated the yield and nitrogen leaching for each candidate management practice. This study also developed a novel bi-level optimization framework to improve the performance of the optimization algorithm, employing U-NSGA-III on the upper level and Monte Carlo optimization on the lower level. The multi-objective optimization framework resulted in groups of equally optimal solutions, known as Pareto fronts, each offering a unique trade-off among the objectives, from which producers can choose the solution that best addresses their needs. In addition, the bi-level optimization framework further improved the number, performance, and diversity of solutions within the Pareto fronts.
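As an illustration of the kind of search U-NSGA-III performs, here is a minimal sketch using the pymoo library. The saturating yield function is a toy stand-in for DSSAT, and the variable bounds and coefficients are invented for illustration.

```python
import numpy as np
from pymoo.core.problem import ElementwiseProblem
from pymoo.algorithms.moo.unsga3 import UNSGA3
from pymoo.util.ref_dirs import get_reference_directions
from pymoo.optimize import minimize

class IrrigationNutrientProblem(ElementwiseProblem):
    """Toy surrogate: minimize total inputs while maximizing a saturating yield."""
    def __init__(self):
        super().__init__(n_var=2, n_obj=2,
                         xl=np.array([0.0, 0.0]),      # irrigation (mm), nitrogen (kg/ha)
                         xu=np.array([500.0, 250.0]))

    def _evaluate(self, x, out, *args, **kwargs):
        water, nitrogen = x
        # a real study would call the DSSAT crop model here
        yield_t_ha = 12.0 * (1 - np.exp(-water / 200.0)) * (1 - np.exp(-nitrogen / 100.0))
        out["F"] = [water + nitrogen, -yield_t_ha]     # minimize inputs, maximize yield

ref_dirs = get_reference_directions("das-dennis", 2, n_partitions=12)
res = minimize(IrrigationNutrientProblem(),
               UNSGA3(ref_dirs=ref_dirs, pop_size=52),
               ("n_gen", 100), seed=1, verbose=False)
print(res.F[:5])  # each row is one Pareto-optimal trade-off
```

Each row of res.F is one point on the Pareto front, mirroring the groups of equally optimal solutions described above.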
- Title
- Achieving reliable distributed systems : through efficient run-time monitoring and predicate detection
- Creator
- Tekken Valapil, Vidhya
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
Runtime monitoring of distributed systems to perform predicate detection is a critical as well as challenging task. It is critical because it ensures the reliability of the system by detecting all possible violations of system requirements. It is challenging because guaranteeing the absence of violations requires analyzing every possible ordering of system events, which is expensive. In this report, we focus on ordering events in a system run using Hybrid Logical Clock (HLC) timestamps, which are O(1)-sized timestamps, and present efficient algorithms to perform predicate detection using HLC. Since, with HLC, the runtime monitor cannot find all possible orderings of system events, we present a new type of clock called Biased Hybrid Logical Clocks (BHLC), which are capable of finding more possible orderings than HLC. Thus we show that BHLC-based predicate detection can find more violations than HLC-based predicate detection. Since predicate detection based on both HLC and BHLC does not guarantee detection of all possible violations in a system run, we present an SMT (Satisfiability Modulo Theories) solver-based predicate detection approach that guarantees the detection of all possible violations in a system run. While a runtime monitor that performs predicate detection using SMT solvers is accurate, the time taken by the solver to detect the presence or absence of a violation can be high. To reduce the time taken by the runtime monitor, we propose an efficient two-layered monitoring approach, where the first layer of the monitor is efficient but less accurate and the second layer is accurate but less efficient. Together they drastically reduce the overall time taken to perform predicate detection while still guaranteeing detection of all possible violations.
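A minimal sketch of how an HLC timestamp is maintained, following the published HLC update rules; the millisecond physical clock and the class layout are illustrative choices, not the thesis's implementation.

```python
import time

class HLC:
    """Hybrid Logical Clock: an O(1)-sized timestamp (l, c) per node."""
    def __init__(self):
        self.l = 0   # largest physical time seen so far
        self.c = 0   # counter breaking ties within the same l

    def _pt(self):
        return int(time.time() * 1000)   # physical clock, milliseconds

    def send_or_local(self):
        pt = self._pt()
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            self.c += 1
        return (self.l, self.c)

    def receive(self, l_m, c_m):
        pt = self._pt()
        l_new = max(self.l, l_m, pt)
        if l_new == self.l and l_new == l_m:
            self.c = max(self.c, c_m) + 1
        elif l_new == self.l:
            self.c += 1
        elif l_new == l_m:
            self.c = c_m + 1
        else:
            self.c = 0
        self.l = l_new
        return (self.l, self.c)

# If event e happened-before f, then HLC(e) < HLC(f) lexicographically,
# so a monitor can conservatively order events from timestamps alone.
```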
- Title
- Adaptive and Automated Deep Recommender Systems
- Creator
- Zhao, Xiangyu
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
Recommender systems are intelligent information retrieval applications that have been leveraged in numerous domains such as e-commerce, movies, music, books, and points of interest. They play a crucial role in users' information-seeking process and overcome the information overload issue by recommending personalized items (products, services, or information) that best match users' needs and preferences. Driven by recent advances in machine learning theories and the prevalence of deep learning techniques, there has been tremendous interest in developing deep learning based recommender systems. These systems have greatly advanced the effectiveness of mining non-linear user-item relationships and learning feature representations from massive datasets, producing substantial improvements in recommendations in both the academic and industry communities. Despite the prominence of existing deep recommender systems, their adaptiveness and automation remain under-explored. Thus, in this dissertation, we study the problem of adaptive and automated deep recommender systems. Specifically, we present our efforts devoted to building adaptive deep recommender systems that continuously update recommendation strategies according to the dynamic nature of user preference, maximizing the cumulative reward from users in practical streaming recommendation scenarios. In addition, we propose a group of automated and systematic approaches that design deep recommender system frameworks effectively and efficiently in a data-driven manner. More importantly, we apply our proposed models to a variety of real-world recommendation platforms and have achieved promising social and economic benefits.
- Title
- Advanced Operators for Graph Neural Networks
- Creator
- Ma, Yao
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
Graphs, which encode pairwise relations between entities, are a kind of universal data structure for many real-world data, including social networks, transportation networks, and chemical molecules. Many important applications on these data can be treated as computational tasks on graphs. For example, friend recommendation in social networks can be regarded as a link prediction task, and predicting properties of chemical compounds can be treated as a graph classification task. An essential step in facilitating these tasks is to learn vector representations either for nodes or for entire graphs. Given its great success in representation learning on images and text, deep learning offers great promise for graphs. However, compared to images and text, deep learning on graphs faces immense challenges. Graphs are irregular: nodes are unordered, and each node can have a distinct number of neighbors. Thus, traditional deep learning models cannot be directly applied to graphs, which calls for dedicated efforts to design novel deep graph models. To help meet this pressing demand, we developed and investigated novel graph neural network (GNN) algorithms to generalize deep learning techniques to graph-structured data. Two key operations in GNNs are the graph filtering operation, which aims to refine node representations, and the graph pooling operation, which aims to summarize node representations to obtain a graph representation. In this thesis, we provide deeper understanding of, and develop novel algorithms for, these two operations from new perspectives. For graph filtering operations, we propose a unified framework from the perspective of graph signal denoising, which demonstrates that most existing graph filtering operations are conducting feature smoothing. Then, we further investigate what information typical graph filtering operations can capture and how they can be understood beyond feature smoothing. For graph pooling operations, we study the procedure of pooling from the perspective of graph spectral theory and present a novel graph pooling operation. We also propose a technique to downsample nodes considering both node importance and representativeness, which leads to another novel graph pooling operation.
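To make the "graph filtering as feature smoothing" view concrete, here is a minimal sketch of one symmetric-normalized filtering step in the common GCN style; the three-node toy graph is invented for illustration.

```python
import numpy as np

def graph_filter(A, H):
    """One smoothing-style filtering step: H' = D^{-1/2} (A + I) D^{-1/2} H."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt @ H  # each node averages its neighborhood

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)        # a 3-node path graph
H = np.array([[1.0], [0.0], [1.0]])           # one feature per node
print(graph_filter(A, H))                     # features move toward neighborhood means
```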
- Title
- DIGITAL IMAGE FORENSICS IN THE CONTEXT OF BIOMETRICS
- Creator
- Banerjee, Sudipta
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
Digital image forensics entails the deduction of the origin, history, and authenticity of a digital image. While a number of powerful techniques have been developed for this purpose, much of the focus has been on images depicting natural scenes and generic objects. In this thesis, we direct our focus to biometric images, viz., iris, ocular, and face images. Firstly, we assess the viability of using existing sensor identification schemes, developed for visible-spectrum images, on near-infrared (NIR) iris and ocular images. These schemes are based on estimating the multiplicative sensor noise that is embedded in an input image. Further, we conduct a study analyzing the impact of photometric modifications on the robustness of these schemes. Secondly, we develop a method for sensor de-identification, where the sensor noise in an image is suppressed but its biometric utility is retained. This enhances privacy by unlinking an image from its camera sensor and, subsequently, from the owner of the camera. Thirdly, we develop methods for constructing an image phylogeny tree from a set of near-duplicate images. An image phylogeny tree captures the relationship between subtly modified images by computing a directed acyclic graph that depicts the sequence in which the images were modified. Our primary contributions in this regard are the use of complex basis functions to model any arbitrary transformation between a pair of images and the design of a likelihood-ratio-based framework for determining the original and modified image in the pair. We are currently integrating a graph-based deep learning approach with sensor-specific information to refine and improve the performance of the proposed image phylogeny algorithm.
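A simplified sketch of the sensor-noise idea behind such identification schemes: estimate a noise residual per image, average residuals into a camera fingerprint, and attribute a query by correlation. Real PRNU pipelines use wavelet denoising and maximum-likelihood estimation rather than the crude Gaussian residual used here, and the simulated data is invented.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(img):
    """Crude sensor-noise residual: image minus a denoised copy of itself."""
    return img - gaussian_filter(img, sigma=1.0)

def camera_fingerprint(images):
    """Average the residuals of many images taken by one camera."""
    return np.mean([noise_residual(im) for im in images], axis=0)

def correlation(a, b):
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

rng = np.random.default_rng(0)
prnu = 0.05 * rng.standard_normal((64, 64))          # simulated multiplicative pattern noise
shots = [gaussian_filter(rng.random((64, 64)), 2) * (1 + prnu) for _ in range(20)]
query = gaussian_filter(rng.random((64, 64)), 2) * (1 + prnu)
print(correlation(noise_residual(query), camera_fingerprint(shots)))  # high => same sensor
```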
- Title
- Example-Based Parameterization of Linear Blend Skinning for Skinning Decomposition (EP-LBS)
- Creator
- Hopkins, Kayra M.
- Date
- 2017
- Collection
- Electronic Theses & Dissertations
- Description
This thesis presents Example-based Parameterization of Linear Blend Skinning for Skinning Decomposition (EP-LBS), a unified and robust method for using example data to simplify and improve the development and parameterization of high-quality 3D models for animation. Animation and three-dimensional (3D) computer graphics have quickly become a popular medium for education, entertainment, and scientific simulation. In addition to film, gaming, and research applications, recent advancements in augmented reality (AR) and virtual reality (VR) are driving additional demand for 3D content. However, the success of graphics in these arenas depends greatly on the efficiency of model creation and the realism of the animation or 3D image. A common method for figure animation is skeletal animation using linear blend skinning (LBS), in which vertices are deformed based on a weighted sum of displacements due to an embedded skeleton. This research addresses the problem that computing LBS animation parameters, including determining the rig (the skeletal structure), identifying influence bones (which bones influence which vertices), and assigning skinning weights (the amount of influence a bone has on a vertex), is a tedious process that is difficult to get right. Even the most skilled animators must work tirelessly to design an effective character model and often find themselves repeatedly correcting flaws in the parameterization. Significant research, including the use of example data, has focused on simplifying and automating individual components of the LBS deformation process and increasing the quality of resulting animations. However, constraints on LBS animation parameters make automated analytic computation of their values as challenging as traditional 3D animation methods. Skinning decomposition is one such method of computing LBS animation parameters from example data; its challenges include constraint adherence and computationally efficient determination of the parameters. The EP-LBS method presented in this thesis utilizes example data as input to a least-squares non-linear optimization process. Given a model as a set of example poses captured from scan data or created manually, EP-LBS institutes a single optimization equation that allows simultaneous computation of all animation parameters for the model. An iterative clustering methodology is used to construct an initial parameterization estimate for the model, which is then subjected to non-linear optimization to improve the fit to the example data. Simultaneous optimization of weights and joint transformations is complicated by a wide range of differing constraints and parameter interdependencies. To address interdependent and conflicting constraints, parameter mapping solutions are presented that map the constraints to an alternative domain more suitable for nonlinear minimization. The presented research is a comprehensive, data-driven solution for automatically determining skeletal structure, influence bones, and skinning weights from a set of example data. Results are presented for a range of models that demonstrate the effectiveness of the method.
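The LBS deformation itself is compact: each vertex is a weight-blended sum of bone transforms, v_i' = sum_j w_ij (R_j v_i + t_j). A minimal sketch follows; the two-bone toy rig is invented, and rows of the weight matrix sum to 1, a standard LBS constraint of the kind the optimization must respect.

```python
import numpy as np

def lbs(vertices, weights, rotations, translations):
    """Linear blend skinning: v_i' = sum_j w_ij (R_j v_i + t_j)."""
    V, J = weights.shape
    out = np.zeros_like(vertices)
    for j in range(J):
        transformed = vertices @ rotations[j].T + translations[j]  # bone j applied to all vertices
        out += weights[:, j:j+1] * transformed                     # blend by per-vertex weights
    return out

verts = np.array([[0.0, 1.0, 0.0], [0.0, 2.0, 0.0]])  # two vertices
w = np.array([[1.0, 0.0], [0.5, 0.5]])                # two bones; rows sum to 1
R = [np.eye(3), np.eye(3)]
t = [np.zeros(3), np.array([1.0, 0.0, 0.0])]
print(lbs(verts, w, R, t))   # second vertex is pulled halfway toward bone 2's offset
```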
- Title
- Face Anti-Spoofing : Detection, Generalization, and Visualization
- Creator
- Liu, Yaojie
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
Face anti-spoofing is the process of distinguishing genuine faces from face presentation attacks, in which attackers present spoof faces (e.g., a photograph, digital screen, or mask) to the face recognition system and attempt to be authenticated as the genuine user. In recent years, face anti-spoofing has attracted increasing attention in the vision community, as it is a crucial step in protecting face recognition systems from security breaches. Previous approaches formulate face anti-spoofing as a binary classification problem, and many of them struggle to generalize to different conditions (such as pose, lighting, expressions, camera sensors, and unknown spoof types). Moreover, those methods work as a black box and cannot provide interpretation or visualization of their decisions. To address those challenges, we investigate face anti-spoofing in three stages: detection, generalization, and visualization. In the detection stage, we learn a CNN-RNN model to estimate the auxiliary tasks of face depth and rPPG signal estimation, which bring additional knowledge for spoof detection. In the generalization stage, we investigate the detection of unknown spoof attacks and propose a novel Deep Tree Network (DTN) to represent unknown spoof attacks well. In the visualization stage, we find that the “spoof trace,” the subtle image pattern in spoof faces (e.g., color distortion, 3D mask edges, and Moiré patterns), is effective in explaining why a spoof is a spoof. We provide a proper physical modeling of the spoof traces and design a generative model to disentangle them from input faces. In addition, we show that proper physical modeling can benefit other face problems, such as face shadow detection and removal: a proper shadow model can not only detect the shadow region effectively but also remove the shadow in a visually plausible manner.
- Title
- Fast edit distance calculation methods for NGS sequence similarity
- Creator
- Islam, A. K. M. Tauhidul
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
Sequence fragments generated from targeted regions of phylogenetic marker genes provide valuable insight for identifying and classifying organisms and inferring taxonomic hierarchies. In recent years, significant development in targeted gene fragment sequencing through Next Generation Sequencing (NGS) technologies has increased the need for efficient sequence similarity computation methods for very large numbers of pairs of NGS sequences. The edit distance has been widely used to determine the dissimilarity between pairs of strings. All known methods for edit distance calculation run in near-quadratic time with respect to string length, and it may take days or weeks to compute distances between such large numbers of pairs of NGS sequences. To address this performance bottleneck, faster edit distance approximation and bounded edit distance calculation methods have been proposed. Despite these efforts, existing edit distance calculation methods are not fast enough when computing large numbers of pairs of NGS sequences. To further reduce computation time, many NGS sequence similarity methods based on matching k-mers have been proposed. These methods extract all possible k-mers from NGS sequences and compare the similarity between pairs of sequences based on their shared k-mers. However, these methods reduce computation time at the cost of accuracy. In this dissertation, our goal is to compute NGS sequence similarity using edit distance based methods while reducing computation time. We propose several edit distance prediction methods that use dataset-independent reference sequences that are distant from each other. These reference sequences convert the sequences in a dataset into feature vectors by computing the edit distance between each sequence and each reference sequence. Given sequences A and B and a reference sequence r, the edit distance satisfies ed(A, B) ≥ |ed(A, r) − ed(B, r)|. Since the reference sequences are significantly different from each other, with a sufficiently large number of reference sequences and a high similarity threshold, the differences of the edit distances of A and B with respect to the reference sequences are close to ed(A, B). Using this property, we predict edit distances in the vector space based on Euclidean and Chebyshev distances. Further, we develop a small set of deterministically generated reference sequences with maximum distance between each pair to predict higher edit distances more efficiently. This method predicts edit distances between corresponding sub-sequences separately and then merges the partial distances to predict the edit distance between the entire sequences; its computational complexity is linear with respect to sequence length. The proposed edit distance prediction methods are significantly faster while achieving very good accuracy for high similarity thresholds. We also show the effectiveness of these methods for agglomerative hierarchical clustering. Finally, we propose an efficient bounded exact edit distance calculation method using the trace [1]. For a given edit distance threshold d, only letters up to d positions apart can be part of an edit operation. Hence, we generate pairs of sub-sequences with length difference at most d so that no edit operation spills over to adjacent pairs of sub-sequences, and then compute the trace cost so that the number of matching letters between the sub-sequences is maximized. This technique does not guarantee locally optimal edit distances; however, it guarantees the globally optimal edit distance between the entire sequences for distances up to d. The bounded exact edit distance calculation method is an order of magnitude faster than dynamic programming edit distance calculation.
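A small sketch of the reference-sequence idea: by the reverse triangle inequality, the Chebyshev distance between feature vectors is a lower bound on ed(A, B). The toy reference sequences below are invented for illustration.

```python
import numpy as np

def edit_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic-programming edit distance."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                    # deletion
                        dp[j - 1] + 1,                # insertion
                        prev + (a[i - 1] != b[j - 1]))  # substitution/match
            prev = cur
    return dp[n]

refs = ["ACGTACGTAC", "TTTTTTTTTT", "GGGGGCCCCC"]   # toy reference sequences

def features(s):
    return np.array([edit_distance(s, r) for r in refs])

A, B = "ACGTTCGTAC", "ACGAACGTAC"
lower_bound = np.max(np.abs(features(A) - features(B)))  # Chebyshev distance
print(lower_bound, "<=", edit_distance(A, B))  # ed(A,B) >= max_r |ed(A,r) - ed(B,r)|
```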
- Title
- GENERATIVE SIGNAL PROCESSING THROUGH MULTILAYER MULTISCALE WAVELET MODELS
- Creator
- He, Jieqian
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
Wavelet analysis and deep learning are two popular fields in signal processing. The scattering transform from wavelet analysis is a recently proposed mathematical model of convolutional neural networks. Signals with repeated patterns can be analyzed using the statistics of such models; in particular, signals from certain classes can be recovered from the related statistics. We first focus on recovering 1D deterministic Dirac signals from multiscale statistics. We prove that a Dirac signal can be recovered from multiscale statistics up to translation and reflection. We then switch to a stochastic version, modeled using Poisson point processes, and prove that wavelet statistics at small scales capture the intensity parameter of the Poisson point process. We also design a scattering generative adversarial network (GAN) to generate new Poisson point samples from the statistics of multiple given samples. Next we consider texture images: we successfully synthesize new textures given one sample from the texture class through multiscale, multilayer wavelet models. Finally, we analyze and prove why the multiscale, multilayer model is essential for signal recovery, especially for natural texture images.
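Illustrative of the flavor of such statistics (not the thesis's exact scattering construction), here is a sketch computing first-order multiscale wavelet statistics of a sparse spike train with PyWavelets; the wavelet choice and spike positions are invented.

```python
import numpy as np
import pywt

def multiscale_stats(x, wavelet="db2", levels=4):
    """First-order statistics: mean absolute wavelet coefficient per scale."""
    coeffs = pywt.wavedec(x, wavelet, level=levels)
    return [float(np.mean(np.abs(c))) for c in coeffs[1:]]   # detail bands

x = np.zeros(256)
x[[40, 90, 200]] = 1.0            # a sparse spike train ("Dirac"-type signal)
print(multiscale_stats(x))        # scale-wise signatures of the spike pattern
```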
- Title
- IMPROVING THE PREDICTABILITY OF HYDROLOGIC INDICES IN ECOHYDROLOGICAL APPLICATIONS
- Creator
- Hernandez Suarez, Juan Sebastian
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
Monitoring freshwater ecosystems allows us to better understand their overall ecohydrological condition within large and diverse watersheds. Due to the significant costs associated with biological monitoring, hydrological modeling is widely used to calculate ecologically relevant hydrologic indices (ERHIs) for stream health characterization in locations lacking data. However, the reliability and applicability of these models within ecohydrological frameworks are major concerns. In particular, hydrologic models' ability to predict ERHIs is limited, especially when calibrating models by optimizing a single objective function or selecting a single optimal solution. The goal of this research was to develop model calibration strategies based on multi-objective optimization and Bayesian parameter estimation to improve the predictability of ERHIs and the overall representation of the streamflow regime. The research objectives were to (1) evaluate predictions of ERHIs using different calibration techniques based on widely used performance metrics, (2) develop performance-based and signature-based calibration strategies explicitly constraining or targeting ERHIs, and (3) quantify the modeling uncertainty of ERHIs using the results of multi-objective model calibration and Bayesian inference. The developed strategies were tested in an agriculture-dominated watershed in Michigan, US, using the Unified Non-dominated Sorting Genetic Algorithm III (U-NSGA-III) for multi-objective calibration and the Soil and Water Assessment Tool (SWAT) for hydrological modeling. Performance-based calibration used objective functions based on metrics calculated on streamflow time series, whereas signature-based calibration used ERHI values to formulate the objective functions. For uncertainty quantification, a lumped error model accounting for heteroscedasticity and autocorrelation was considered, and the multiple-try Differential Evolution Adaptive Metropolis (MT-DREAM(ZS)) algorithm was implemented for Markov Chain Monte Carlo (MCMC) sampling. Regarding the first objective, the results showed that using different sets of solutions instead of a single optimal solution introduces more flexibility in the predictability of various ERHIs. Regarding the second objective, both performance-based and signature-based calibration strategies succeeded in representing most of the selected ERHIs within a +/-30% relative error acceptability threshold while yielding consistent runoff predictions. The performance-based strategy was preferred, since it showed lower dispersion of near-optimal Pareto solutions when representing the selected indices and other hydrologic signatures based on water balance and flow duration curve characteristics. Finally, regarding the third objective, using near-optimal Pareto parameter distributions as prior knowledge in Bayesian calibration generally reduced both the bias and the variability ranges in ERHI prediction. In addition, there was no significant loss in the reliability of streamflow predictions when targeting ERHIs, while precision improved and bias was reduced. Moreover, parametric uncertainty shrank drastically when linking multi-objective calibration and Bayesian parameter estimation. Still, the representation of low-flow magnitude and timing, rate of change, and duration and frequency of extreme flows remained limited. These limitations, expressed in terms of bias and interannual variability, were mainly attributed to structural inadequacies of the hydrological model. Therefore, future research should involve revising hydrological models to better describe the ecohydrological characteristics of riverine systems.
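The +/-30% acceptability threshold reduces to a simple relative-error test per index; a minimal sketch, with invented ERHI values:

```python
import numpy as np

def within_acceptability(simulated, observed, tol=0.30):
    """True for each ERHI whose relative error lies within the +/-tol band."""
    simulated, observed = np.asarray(simulated), np.asarray(observed)
    return np.abs((simulated - observed) / observed) <= tol

print(within_acceptability([10.5, 3.0, 80.0], [9.0, 5.0, 70.0]))  # [True, False, True]
```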
- Title
- INTERPRETABLE ARTIFICIAL INTELLIGENCE USING NONLINEAR DECISION TREES
- Creator
- Dhebar, Yashesh Deepakkumar
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
Recent years have seen massive application of artificial intelligence (AI) to automate tasks across various domains. The back-end mechanism through which automation occurs is generally a black box. Popular black-box AI methods used to solve automation tasks include decision trees (DT), support vector machines (SVM), and artificial neural networks (ANN). In the past several years, these black-box AI methods have shown promising performance and have been widely applied and researched across industry and academia. While black-box AI models have been shown to achieve high performance, the mechanism by which a decision is made is hard to comprehend. This lack of interpretability and transparency makes black-box AI methods less trustworthy. In addition, black-box AI models lack the ability to provide valuable insights about the task at hand. Following these limitations, a natural research direction of developing interpretable and explainable AI models has emerged and has gained active attention in the machine learning and AI community in the past three years. In this dissertation, we focus on interpretable AI solutions currently being developed at the Computational Optimization and Innovation Laboratory (COIN Lab) at Michigan State University. We propose a nonlinear decision tree (NLDT) based framework to produce transparent AI solutions for automation tasks related to classification and control. Recent advancements in non-linear optimization enable us to efficiently derive interpretable AI solutions for various automation tasks. The interpretable and transparent AI models induced using customized optimization techniques show similar or better performance compared to complex black-box AI models across most benchmarks. The results are promising and provide directions for future studies in developing efficient transparent AI models.
- Title
- Learning to Detect Language Markers
- Creator
- Tang, Fengyi
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
In the world of medical informatics, biomarkers play a pivotal role in determining the physical state of human beings, distinguishing the pathologic from the clinically normal. In recent years, behavioral markers, due to their availability and low cost, have attracted a lot of attention as a potential supplement to biomarkers. “Language markers” such as spoken words and lexical preference have been shown to be both cost-effective and predictive of complex diseases such as mild cognitive impairment (MCI). However, language markers, although universal, do not possess many of the favorable properties that characterize traditional biomarkers. For example, different people may exhibit similar use of language under certain conversational contexts (non-uniqueness), and a person's lexical preferences may change over time (non-stationarity). As a result, it is unclear whether any set of language markers can be measured in a consistent manner. My thesis projects provide solutions to some of these limitations: (1) We formalize the problem of learning a dialog policy to measure language markers as an optimization problem that we call persona authentication, and provide a learning algorithm for finding a dialog policy that generalizes to unseen personalities. (2) We apply our dialog policy framework to real-world data for MCI prediction and show that the proposed pipeline improves prediction over supervised learning baselines. (3) To address non-stationarity, we introduce an effective way to do temporally dependent and non-i.i.d. feature selection through an adversarial learning framework that we call precision sensing. (4) Finally, on the prediction side, we propose a method for improving the sample efficiency of classifiers by retaining privileged information (auxiliary features available only at training time).
- Title
- Object Detection from 2D to 3D
- Creator
- Brazil, Garrick
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
Monocular camera-based object detection plays a critical role in widespread applications including robotics, security, self-driving cars, augmented reality, and many more. Increased relevancy is often given to the detection and tracking of safety-critical objects such as pedestrians, cyclists, and cars, which are often in motion and in close association with people. Compared to other generic objects (such as animals, tools, or food), safety-critical objects in urban scenes tend to have unique challenges. First, such objects usually have a wide range of detection scales, appearing anywhere from 5 to 50+ meters from the camera. Safety-critical objects also tend to have a high variety of textures and shapes, exemplified by the clothing of people and the variability of vehicle models. Moreover, the high density of objects in urban scenes leads to increased levels of self-occlusion compared to general objects in the wild. Off-the-shelf object detectors do not always work effectively due to these traits, and hence special attention is needed for accurate detection. Moreover, even successful detection of safety-critical objects is not inherently practical for applications designed to function in the real 3D world without integrating expensive depth sensors. To remedy this, in this thesis we aim to improve the performance of 2D object detection and to extend boxes into 3D, while using only monocular camera-based sensors. We first explore how pedestrian detection can be augmented using an efficient simultaneous detection and segmentation technique that notably requires no additional data or annotations. We then propose a multi-phased autoregressive network that progressively improves pedestrian detection precision for difficult samples while critically maintaining an efficient runtime. We additionally propose a single-stage region proposal network for 3D object detection in urban scenes, which is both more efficient and up to 3x more accurate than comparable state-of-the-art methods. We stabilize our 3D object detector using a highly tailored 3D Kalman filter, which both improves localization accuracy and provides useful byproducts such as ego-motion and per-object velocity. Lastly, we utilize differentiable rendering to discover the underlying 3D structure of objects beyond the cuboids used in detection, without relying on expensive sensors or 3D supervision. For each method, we provide comprehensive experiments to demonstrate effectiveness, impact, and runtime efficiency.
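The role of the Kalman filter can be illustrated with a one-dimensional constant-velocity filter over a box-center coordinate; the noise parameters and measurements below are invented, and the thesis's filter is tailored to full 3D boxes rather than this toy.

```python
import numpy as np

class KalmanCV:
    """Constant-velocity Kalman filter for one coordinate of a 3D box center."""
    def __init__(self, q=1e-2, r=1e-1):
        self.x = np.zeros(2)                          # state: [position, velocity]
        self.P = np.eye(2)
        self.F = np.array([[1.0, 1.0], [0.0, 1.0]])   # dt = 1 frame
        self.H = np.array([[1.0, 0.0]])
        self.Q, self.R = q * np.eye(2), np.array([[r]])

    def step(self, z):
        # predict
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # update with the detector's noisy position measurement z
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + (K @ (z - self.H @ self.x)).ravel()
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x   # smoothed position; x[1] is the per-object velocity byproduct

kf = KalmanCV()
for z in [0.0, 1.1, 1.9, 3.2]:
    print(kf.step(np.array([z])))
```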
- Title
- PRECISION DIAGNOSTICS AND INNOVATIONS FOR PLANT BREEDING RESEARCH
- Creator
- Hugghis, Eli
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
Major technological advances are necessary to reach the goal of feeding our world's growing population. To do this, there is increasing demand within the agricultural field for rapid diagnostic tools that improve the efficiency of current methods in plant disease and DNA identification. The use of gold nanoparticles has emerged as a promising technology for a range of applications, from smart agrochemical delivery systems to pathogen detection. In addition, advances in image classification analyses have made machine learning approaches more accessible to the agricultural field. Here we present the use of gold nanoparticles (AuNPs) for the detection of transgenic gene sequences in maize and the use of machine learning algorithms for the identification and classification of Fusarium spp.-infected wheat seed. AuNPs show promise in their ability to diagnose the presence of transgenic insertions in DNA samples within 10 minutes through a colorimetric response. Image-based analysis using logistic regression, support vector machines, and k-nearest neighbors was able to accurately identify and differentiate healthy and diseased wheat kernels within the testing set at an accuracy of 95-98.8%. These technologies act as rapid tools for plant breeders and pathologists, improving their ability to make selection decisions efficiently and objectively.
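A minimal sketch of the classification setup with scikit-learn, using synthetic features as stand-ins for the color/texture features a real pipeline would extract from kernel images:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Synthetic two-class data standing in for healthy vs. diseased kernel features.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
print(accuracy_score(y_te, clf.predict(X_te)))   # held-out accuracy
```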
- Title
- Quantitative methods for calibrated spatial measurements of laryngeal phonatory mechanisms
- Creator
- Ghasemzadeh, Hamzeh
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
The ability to perform measurements is an important cornerstone and prerequisite of any quantitative research. Measurements allow us to quantify the inputs and outputs of a system and then to express their relationships using concise mathematical expressions and models. Those models in turn enable us to understand how a target system works and to predict its output for changes in the system parameters. Conversely, models enable us to determine the proper parameters of a system for achieving a certain output. In the context of voice science research, variations in the parameters of the phonatory system could be attributed to individual differences. Thus, accurate models would enable us to account for individual differences during diagnosis and to make reliable predictions about the likely outcome of different treatment options. Analysis of the vibration of the vocal folds using high-speed videoendoscopy (HSV) is an ideal candidate for constructing such computational models. However, conventional images are not spatially calibrated and cannot be used for absolute spatial measurements. This dissertation is focused on developing the methodologies required for calibrated spatial measurements from in-vivo HSV recordings. Specifically, two different approaches for calibrated horizontal measurements of HSV images are presented. The first, indirect, approach is based on registering a specific attribute of a common object (e.g., the size of a lesion) from a calibrated intraoperative still image to its corresponding non-calibrated in-vivo HSV recording. This approach does not require specialized instruments and can be implemented in many clinical settings; however, its validity depends on a couple of assumptions, and violating those assumptions could lead to significant measurement errors. The second, direct, approach is based on a laser-projection flexible fiberoptic endoscope and enables accurate calibrated spatial measurements. This dissertation evaluates the accuracy of the first approach indirectly, by studying its underlying fundamental assumptions, whereas the accuracy of the second approach is evaluated directly, using benchtop experiments with different surfaces, different working distances, and different imaging angles. The main contributions of this dissertation are the following: (1) a formal treatment of indirect horizontal calibration is presented, the assumptions governing its validity and reliability are discussed, and a battery of tests is presented that can indirectly assess the validity of those assumptions in laryngeal imaging applications; (2) recordings from before and after surgery of patients with vocal fold mass lesions are used as a testbench for the developed indirect calibration approach, for which a full solution is developed for measuring the calibrated velocity of the vocal folds and then used to investigate post-surgery changes in their closing velocity; (3) a method for calibrated vertical measurement from a laser-projection fiberoptic flexible endoscope is developed and evaluated at different working distances, different imaging angles, and on a 3D surface; (4) a detailed analysis of the non-linear image distortion of a fiberoptic flexible endoscope is presented, and the effect of the imaging angle and the spatial location of an object on the magnitude of that distortion is studied and quantified; (5) a method for calibrated horizontal measurement from a laser-projection fiberoptic flexible endoscope is developed and evaluated at different working distances, different imaging angles, and on a 3D surface.
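At its core, the indirect approach transfers a scale factor from the calibrated still image to the uncalibrated HSV recording; a minimal sketch with invented numbers:

```python
# A lesion of known size is registered from a calibrated intraoperative image
lesion_mm = 3.2            # size measured in the calibrated still image (mm)
lesion_px = 48.0           # same lesion measured in the HSV frame (pixels)

scale = lesion_mm / lesion_px       # mm per pixel for this HSV recording
glottal_width_px = 130.0
print(glottal_width_px * scale)     # calibrated width in mm (about 8.67)
```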
- Title
- SOCIAL MECHANISMS OF LEADERSHIP EMERGENCE : A COMPUTATIONAL EVALUATION OF LEADERSHIP NETWORK STRUCTURES
- Creator
- Griffin, Daniel Jacob
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
Leadership emergence is a topic of immense interest in the organizational sciences. One promising recent development in the leadership literature focuses on the development and impact of informal leadership structures in a shared leadership paradigm. Despite its theoretical importance, the network perspective on leadership emergence is still underdeveloped, largely due to the complexity of studying and theorizing about network-level phenomena. Using computational modeling techniques, I evaluate the network-level implications of two existing theories that broadly represent social theories of leadership emergence. I derive formal representations of both foundational theories and expand on them to develop a synthesis theory describing how the two processes work in parallel. Results from simulated experiments indicate that group homogeneity is associated with vastly different leadership network structures depending on which theoretical process mechanisms are in play. This thesis contributes to the literature by 1) advancing a network-based approach to leadership emergence research, 2) testing the implications of existing theory, 3) developing new theory, and 4) providing a strong foundation and toolkit for future leadership network emergence research.
- Title
- Semi-Adversarial Networks for Imparting Demographic Privacy to Face Images
- Creator
- Mirjalili, Vahid
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
Face recognition systems are being widely used in a number of applications, ranging from user authentication in hand-held devices to identifying people of interest in surveillance videos. In several such applications, face images are stored in a central database, and it is necessary to ensure that the stored face images are used for the stated purpose and not for any other purpose. For example, advanced machine learning methods can be used to automatically extract age, gender, race, and other cues from the stored face images; these cues are often referred to as demographic attributes. When such attributes are extracted without the consent of individuals, it can lead to potential violations of privacy. Indeed, the European Union's General Data Protection Regulation (GDPR) requires the primary purpose of data collection to be declared to individuals prior to data collection and strictly prohibits the use of the data for any purpose beyond what was stated. In this thesis, we consider this type of regulation and develop methods for enhancing the privacy accorded to face images with respect to the automatic extraction of demographic attributes. In particular, we design algorithms that modify input face images such that certain specified demographic attributes cannot be reliably extracted from them. At the same time, the biometric utility of the images is retained, i.e., the modified face images can still be used for matching purposes. The primary objective of this research is not necessarily to fool human observers, but rather to prevent machine learning methods from automatically extracting such information. The contributions of this thesis are as follows. First, we design a convolutional autoencoder known as a semi-adversarial neural network, or SAN, that perturbs input face images such that they are adversarial with respect to an attribute classifier (e.g., a gender classifier) while still retaining their utility with respect to a face matcher. Second, we develop techniques to ensure that the adversarial outputs produced by the SAN generalize across multiple attribute classifiers, including those that may not have been used during the training phase. Third, we extend the SAN architecture into a neural network known as PrivacyNet, which can be used to impart multi-attribute privacy to face images. Fourth, we conduct extensive experimental analysis using several face image datasets to evaluate the performance of the proposed methods and to visualize the perturbations they induce. Results suggest the benefits of using semi-adversarial networks to impart privacy to face images while still retaining the biometric utility of the ensuing face images.
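A toy sketch of the semi-adversarial objective in PyTorch, operating on feature vectors rather than images: keep the matcher's embeddings close while pushing a frozen attribute classifier toward the wrong answer. The network sizes, loss weighting, and data are invented; the actual SAN is a convolutional autoencoder trained on face images.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
D = 64
perturber = nn.Sequential(nn.Linear(D, 32), nn.ReLU(), nn.Linear(32, D))  # toy "SAN"
attr_clf = nn.Linear(D, 1)    # frozen, pre-trained attribute (e.g., gender) classifier
matcher = nn.Linear(D, 16)    # frozen face matcher producing embeddings
for p in list(attr_clf.parameters()) + list(matcher.parameters()):
    p.requires_grad_(False)

x = torch.randn(8, D)                          # stand-in "face images"
y_attr = torch.randint(0, 2, (8, 1)).float()   # their attribute labels
opt = torch.optim.Adam(perturber.parameters(), lr=1e-3)

for _ in range(200):
    x_p = perturber(x)
    utility = (matcher(x_p) - matcher(x)).pow(2).mean()      # retain biometric utility
    privacy = nn.functional.binary_cross_entropy_with_logits(
        attr_clf(x_p), 1.0 - y_attr)                         # confound the attribute
    loss = utility + privacy
    opt.zero_grad(); loss.backward(); opt.step()
```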
- Title
- Sequence learning with side information : modeling and applications
- Creator
- Wang, Zhiwei
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
Sequential data is ubiquitous, and modeling sequential data has been one of the most long-standing problems in computer science. The goal of sequence modeling is to represent a sequence with a low-dimensional dense vector that incorporates as much information as possible. A fundamental type of information contained in sequences is the sequential dependency, and a large body of research has been devoted to designing effective ways to capture it. Recently, sequence learning models such as recurrent neural networks (RNNs), temporal convolutional networks, and Transformers have gained tremendous popularity in modeling sequential data. Equipped with effective structures such as gating mechanisms, large receptive fields, and attention mechanisms, these models have achieved great success in many applications across a wide range of fields. However, besides the sequential dependency, sequences also exhibit side information that remains under-explored. Thus, in this thesis, we study the problem of sequence learning with side information. Specifically, we present our efforts devoted to building sequence learning models that effectively and efficiently capture the side information commonly seen in sequential data. In addition, we show that side information can play an important role in sequence learning tasks, as it provides rich information complementary to the sequential dependency. More importantly, we apply our proposed models to various real-world applications and have achieved promising results.
- Title
- Towards a Robust Unconstrained Face Recognition Pipeline with Deep Neural Networks
- Creator
- Shi, Yichun
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
Face recognition is a classic problem in the field of computer vision and pattern recognition due to its wide applications in real-world problems such as access control, identity verification, physical security, and surveillance. Recent progress in deep learning techniques and access to large-scale face databases has led to a significant improvement of face recognition accuracy under constrained and semi-constrained scenarios. Deep neural networks are shown to surpass human performance on Labeled Faces in the Wild (LFW), which consists of celebrity photos captured in the wild. However, in many applications, e.g., surveillance videos, where we cannot assume that the presented face is under controlled variations, the performance of current DNN-based methods drops significantly. The main challenges in such unconstrained face recognition problems include, but are not limited to, lack of labeled data, robust face normalization, discriminative representation learning, and the ambiguity of facial features caused by information loss. In this thesis, we propose a set of methods that address these challenges in unconstrained face recognition systems. Starting from a classic deep face recognition pipeline, we review how each step in the pipeline can fail on low-quality, uncontrolled input faces and what kinds of solutions have been studied before, and then introduce our proposed methods. The various methods proposed in this thesis are independent but compatible with each other. Experiments on several challenging benchmarks, e.g., IJB-C and IJB-S, show that the proposed methods are able to improve the robustness and reliability of deep unconstrained face recognition systems. Our solution achieves state-of-the-art performance, i.e., 95.0% TAR @ FAR = 0.001% on the IJB-C dataset and a 61.98% Rank-1 retrieval rate on the surveillance-to-booking protocol of the IJB-S dataset.
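For reference, TAR @ FAR is computed by thresholding impostor similarity scores at the target false-accept rate and measuring how many genuine pairs clear that threshold; a minimal sketch with synthetic score distributions (the Gaussian scores are invented):

```python
import numpy as np

def tar_at_far(genuine, impostor, far_target=1e-5):
    """True Accept Rate at a fixed False Accept Rate, from similarity scores."""
    impostor = np.sort(impostor)[::-1]
    k = max(int(far_target * len(impostor)), 1)
    threshold = impostor[k - 1]            # score admitting ~far_target of impostors
    return float(np.mean(genuine >= threshold))

rng = np.random.default_rng(0)
gen = rng.normal(0.7, 0.1, 10_000)         # toy genuine-pair similarity scores
imp = rng.normal(0.3, 0.1, 1_000_000)      # toy impostor-pair similarity scores
print(tar_at_far(gen, imp, 1e-5))
```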
- Title
- Using Eventual Consistency to Improve the Performance of Distributed Graph Computation In Key-Value Stores
- Creator
- Nguyen, Duong Ngoc
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
Key-value stores have gained increasing popularity due to their fast performance and simple data model. A key-value store usually consists of multiple replicas located in different geographical regions to provide higher availability and fault tolerance. Consequently, a protocol is employed to ensure that data are consistent across the replicas. The CAP theorem states the impossibility of simultaneously achieving three desirable properties in a distributed system: consistency, availability, and network partition tolerance. Since failures are the norm in distributed systems, and since the capability to maintain service at an acceptable level in the presence of failures is a critical dependability and business requirement of any system, partition tolerance is a necessity. Consequently, the trade-off between consistency and availability (performance) is inevitable: strong consistency is attained at the cost of slow performance, and fast performance is attained at the cost of weak consistency, resulting in a spectrum of consistency models suitable for different needs. Among these, sequential consistency and eventual consistency are two common models. The former is easier to program with but suffers from poor performance, whereas the latter provides higher performance while suffering from potential data anomalies. In this dissertation, we focus on the problem of what a designer should do if asked to solve a problem on a key-value store that provides eventual consistency. Specifically, we are interested in approaches that allow the designer to run applications on an eventually consistent key-value store and handle data anomalies if they occur during the computation. To that end, we investigate two options: (1) a detect-rollback approach, and (2) a stabilization approach. In the first option, the designer identifies a correctness predicate, say Φ, and continues to run the application as if it were running on sequential consistency, while our system monitors Φ. If Φ is violated (because the underlying key-value store provides eventual consistency), the system rolls back to a state where Φ holds and resumes the computation from there. In the second option, data anomalies are treated as state perturbations and handled by the convergence property of stabilizing algorithms. We choose LinkedIn's Voldemort key-value store as the example key-value store for our study and run experiments with several graph-based applications on the Amazon AWS platform to evaluate the benefits of the two approaches. From the experimental results, we observe that, overall, both approaches benefit the applications when compared to running them on sequential consistency. However, stabilization provides higher benefits, especially in the aggressive stabilization mode, which trades more perturbations for no locking overhead. The results suggest that while there is some cost associated with making an algorithm stabilizing, there may be a substantial benefit in revising an existing algorithm for the problem at hand to make it stabilizing and thereby reduce the overall runtime under eventual consistency. There are several directions for extension. For the detect-rollback approach, we are working to develop a more general rollback mechanism for the applications and to improve the efficiency and accuracy of the monitors. For the stabilization approach, we are working to develop an analytical model of the benefits of eventual consistency in stabilizing programs. Our current work focuses on silent stabilization, and we plan to extend our approach to other variations of stabilization.
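A schematic of the detect-rollback option: checkpoint while the correctness predicate Φ holds, and restore the last good state when eventual consistency violates it. The helper names app_step and predicate are hypothetical, and a real monitor checkpoints distributed key-value state rather than a local dict.

```python
import copy

def run_with_detect_rollback(app_step, predicate, state, max_steps=100):
    """Detect-rollback sketch: advance a checkpoint while predicate (Φ) holds;
    roll back to the last good state when a violation is detected."""
    checkpoint = copy.deepcopy(state)
    for _ in range(max_steps):
        app_step(state)                             # one application step on the store
        if predicate(state):
            checkpoint = copy.deepcopy(state)       # Φ holds: advance the checkpoint
        else:
            state.clear()                           # Φ violated: roll back and resume
            state.update(copy.deepcopy(checkpoint))
    return state
```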