Search results
(1 - 20 of 30)
- Title
- Supervised Dimension Reduction Techniques for High-Dimensional Data
- Creator
- Molho, Dylan
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
The data sets arising in modern science and engineering are often extremely large, befitting the era of big data. But these data sets are not only large in the number of samples they have, they may also have a large number of features, placing each data point in a high-dimensional space. However, unique problems arise when the dimension of the data has the same or even greater order than the sample size. This scenario in statistics is known as the High Dimension, Low Sample Size problem (HDLSS). In this paradigm, many standard statistical estimators are shown to perform sub-optimally and in some cases cannot be computed at all. To overcome the barriers found in HDLSS scenarios, one must make additional assumptions on the data, either with explicit formulations or with implicit beliefs about the behavior of the data. The first type of research leads to structural assumptions placed on the probability model that generates the data, which allow for alterations to classical methods to yield theoretically optimal estimators for the chosen well-defined tasks. The second type of research, in contrast, makes general assumptions usually based on the causal nature of the chosen real-world data application, where the data is assumed to have dependencies between the parameters. This dissertation develops two novel algorithms that successfully operate in the paradigm of HDLSS. We first propose the Generalized Eigenvalue (GEV) estimator, a unified sparse projection regression framework for estimating generalized eigenvector problems. Unlike existing work, we reformulate a sequence of computationally intractable non-convex generalized Rayleigh quotient optimization problems into a computationally efficient simultaneous linear regression problem, padded with a sparse penalty to deal with high-dimensional predictors.
We showcase the applications of our method by considering three iconic problems in statistics: the sliced inverse regression (SIR), linear discriminant analysis (LDA), and canonical correlation analysis (CCA). We show the reformulated linear regression problem is able to recover the same projection space obtained by the original generalized eigenvalue problem. Statistically, we establish the nonasymptotic error bounds for the proposed estimator in the applications of SIR and LDA, and prove these rates are minimax optimal. We present how the GEV is applied to the CCA problem, and adapt the method for a robust Huber-loss based formulation for noisy data. We test our framework on both synthetic and real datasets and demonstrate its superior performance compared with other state-of-the-art methods in high dimensional statistics. The second algorithm is the scJEGNN, a graphical neural network (GNN) tailored to the task of data integration for HDLSS single-cell sequencing data. We show that with its unique model, the GNN is able to leverage structural information of the biological data relations in order to perform a joint embedding of multiple modalities of single-cell gene expression data. The model is applied to data from the NeurIPS 2021 competition for Open Problems in Single-Cell Analysis, and we demonstrate that our model is able to outperform top teams from the joint embedding task.
- Title
- Learning Fair Representations without Demographics
- Creator
- Wang, Xiaoxue
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Because sensitive attributes are often hard to access, real-world adoption of fair representation learning algorithms must proceed without prior knowledge of the sensitive attributes with respect to which we wish to be fair. To address the challenge of fairness without explicit demographics, our solution is based on the idea of maximally randomizing the representation while being as informative as possible about the target task. We operationalize this goal through the concept of maximizing the entropy of the learned representation. For this purpose, we propose two new avenues for entropy maximization in the absence of demographic information: intra-class and inter-class entropy maximization. 1) Intra-class entropy maximization maximizes the entropy of the non-target class predictions (excluding the probability of the ground truth class label for classification problems), thus encouraging the model to discard spurious correlations between the different target classes. 2) Inter-class entropy maximization maximizes the entropy of the representation conditioned on the target label, thus encouraging randomization of the samples within each target class label and minimizing the leakage of potential demographic information in the representation. Quantitative and qualitative results of our Maximum Entropy method (MaxEnt) on the COMPAS and UCI Adult datasets show that 1) our method can outperform the state-of-the-art (SOTA) Adversarially Reweighted Learning (ARL) method and makes it harder to extract sensitive demographic information from the representation without prior demographic knowledge, and 2) our method reaches a good trade-off between utility and fairness.
- Title
- Machine Learning on Drug Discovery : Algorithms and Applications
- Creator
- Sun, Mengying
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
Drug development is an expensive and time-consuming process in which thousands of chemical compounds are tested and experiments are conducted in order to find drugs that are safe and effective. Modern drug development aims to speed up the intermediate steps and reduce cost by leveraging machine learning techniques, typically at the drug discovery and preclinical research stages. Better identification of promising candidates can significantly reduce the load of later processes, e.g., clinical trials, saving tons of resources as well as time. In this dissertation, we explored and proposed novel machine learning algorithms for drug discovery from the aspects of robustness, knowledge transfer, molecular generation and optimization. First of all, labels from high-throughput experiments (e.g., biological profiling and chemical screening) often contain inevitable noise due to technical and biological variations. We proposed a method that leverages both disagreement and agreement among deep neural networks to mitigate the negative effect brought by noisy labels and better predict drug responses. Secondly, graph neural networks (GNNs) have become popular for modeling graph-structured data (e.g., molecules). Graph contrastive learning, by maximizing the mutual information between paired graph augmentations, has been shown to be an effective strategy for pretraining GNNs. However, the existing graph contrastive learning methods have intrinsic limitations when adopted for molecular tasks. Therefore, we proposed a method that utilizes domain knowledge at both the local and global levels to assist representation learning. The local-level domain knowledge guides the augmentation process such that variation is introduced without changing graph semantics. The global-level knowledge encodes the similarity information between graphs in the entire dataset and helps to learn representations with richer semantics.
Last but not least, we proposed a search-based approach for multi-objective molecular generation and optimization. We show that given proper design and sufficient information, search-based methods can achieve performance comparable to or even better than deep learning methods while being computationally efficient. Specifically, the proposed method starts with existing molecules and uses a two-stage search strategy to gradually modify them into new ones, based on transformation rules derived from large compound libraries. We demonstrate all the proposed methods with extensive experiments.
- Title
- ASSURING THE ROBUSTNESS AND RESILIENCY OF LEARNING-ENABLED AUTONOMOUS SYSTEMS
- Creator
- Langford, Michael Austin
- Date
- 2022
- Collection
- Electronic Theses & Dissertations
- Description
-
As Learning-Enabled Systems (LESs) have become more prevalent in safety-critical applications, addressing the assurance of LESs has become increasingly important. Because machine learning models in LESs are not explicitly programmed like traditional software, developers typically have less direct control over the inferences learned by LESs, relying instead on semantically valid and complete patterns to be extracted from the system’s exposure to the environment. As such, the behavior of an LES is strongly dependent on the quality of its training experience. However, run-time environments are often noisy or not well-defined. Uncertainty in the behavior of an LES can arise when there is inadequate coverage of relevant training/test cases (e.g., corner cases). It is challenging to assure safety-critical LESs will perform as expected when exposed to run-time conditions that have never been experienced during training or validation. This doctoral research contributes automated methods to improve the robustness and resilience of an LES. For this work, a robust LES is less sensitive to noise in the environment, and a resilient LES is able to self-adapt to adverse run-time contexts in order to mitigate system failure. The proposed methods harness diversity-driven evolution-based methods, machine learning, and software assurance cases to train robust LESs, uncover robust system configurations, and foster resiliency through self-adaptation and predictive behavior modeling. This doctoral work demonstrates these capabilities by applying the proposed framework to deep learning and autonomous cyber-physical systems.
- Title
- Modeling physical causality of action verbs for grounded language understanding
- Creator
- Gao, Qiaozi
- Date
- 2019
- Collection
- Electronic Theses & Dissertations
- Description
-
Building systems that can understand and communicate through human natural language is one of the ultimate goals in AI. Decades of natural language processing research have mainly focused on learning from large amounts of language corpora. However, human communication relies on a significant amount of unverbalized information, which is often referred to as commonsense knowledge. This type of knowledge allows us to understand each other's intention, to connect language with concepts in the world, and to make inferences based on what we hear or read. Commonsense knowledge is generally shared among cognitively capable individuals, thus it is rarely stated in human language. This makes it very difficult for artificial agents to acquire commonsense knowledge from language corpora. To address this problem, this dissertation investigates the acquisition of commonsense knowledge, especially knowledge related to basic actions upon the physical world and how that influences language processing and grounding. Linguistics studies have shown that action verbs often denote some change of state (CoS) as the result of an action. For example, the result of "slice a pizza" is that the state of the object (pizza) changes from one big piece to several smaller pieces. However, the causality of action verbs and its potential connection with the physical world has not been systematically explored. Artificial agents often do not have this kind of basic commonsense causality knowledge, which makes it difficult for these agents to work with humans and to reason, learn, and perform actions. To address this problem, this dissertation models dimensions of physical causality associated with common action verbs. Based on such modeling, several approaches are developed to incorporate causality knowledge into language grounding, visual causality reasoning, and commonsense story comprehension.
- Title
- An Analytical and experimental investigation of the elastodynamic response of a class of intelligent machinery
- Creator
- Sunappan, Vasudivan
- Date
- 1987
- Collection
- Electronic Theses & Dissertations
- Title
- Living real experience in virtual network environments in Pierre Teilhard de Chardin
- Creator
- Santos, Gildasio Mendes dos
- Date
- 2000
- Collection
- Electronic Theses & Dissertations
- Title
- Cortex-inspired goal-directed recurrent networks for developmental visual attention and recognition with complex backgrounds
- Creator
- Luciw, Matthew
- Date
- 2010
- Collection
- Electronic Theses & Dissertations
- Title
- Real time robot control over the internet with force reflection
- Creator
- Elhajj, Imad Hanna
- Date
- 1999
- Collection
- Electronic Theses & Dissertations
- Title
- CMOS VLSI implementations of a new feedback neural network architecture
- Creator
- Wang, Yiwen
- Date
- 1991
- Collection
- Electronic Theses & Dissertations
- Title
- Developmental learning with applications to attention, task transfer and user presence detection
- Creator
- Huang, Xiao
- Date
- 2005
- Collection
- Electronic Theses & Dissertations
- Title
- Nonparametric procedures for learning with an imperfect teacher
- Creator
- Richter, Ronald Joseph, 1945-
- Date
- 1972
- Collection
- Electronic Theses & Dissertations
- Title
- Representation and processes of pedagogic knowledge
- Creator
- Grossman, Harold Charles
- Date
- 1978
- Collection
- Electronic Theses & Dissertations
- Title
- On the application of relevance measures in mechanical deduction
- Creator
- Soddy, James Stephen
- Date
- 1982
- Collection
- Electronic Theses & Dissertations
- Title
- IPCA : an intelligent control architecture based on the generic task approach to knowledge-based systems
- Creator
- Decker, David Bruce
- Date
- 1995
- Collection
- Electronic Theses & Dissertations
- Title
- 'Almost' real-time diagnosis and correction of manufacturing scrap using an expert system
- Creator
- Chesney, David Raymond
- Date
- 1987
- Collection
- Electronic Theses & Dissertations
- Title
- Cortex-inspired developmental learning for vision-based navigation, attention and recognition
- Creator
- Ji, Zhengping
- Date
- 2009
- Collection
- Electronic Theses & Dissertations
- Title
- Adaptive and Automated Deep Recommender Systems
- Creator
- Zhao, Xiangyu
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
-
Recommender systems are intelligent information retrieval applications, and have been leveraged in numerous domains such as e-commerce, movies, music, books, and points of interest. They play a crucial role in the users' information-seeking process, and overcome the information overload issue by recommending personalized items (products, services, or information) that best match users' needs and preferences. Driven by the recent advances in machine learning theories and the prevalence of deep learning techniques, there has been tremendous interest in developing deep learning based recommender systems. They have unprecedentedly advanced the effectiveness of mining the non-linear user-item relationships and learning the feature representations from massive datasets, which has brought great vitality and improvements in recommendations from both academic and industry communities. Despite the prominence of existing deep recommender systems, their adaptiveness and automation still remain under-explored. Thus, in this dissertation, we study the problem of adaptive and automated deep recommender systems. Specifically, we present our efforts devoted to building adaptive deep recommender systems that continuously update recommendation strategies according to the dynamic nature of user preference, which maximizes the cumulative reward from users in practical streaming recommendation scenarios. In addition, we propose a group of automated and systematic approaches that design deep recommender system frameworks effectively and efficiently in a data-driven manner. More importantly, we apply our proposed models to a variety of real-world recommendation platforms and have achieved promising enhancements of social and economic benefits.
- Title
- Control of MSU jumping robot
- Creator
- Alsaedi, Emad
- Date
- 2014
- Collection
- Electronic Theses & Dissertations
- Description
-
The idea of miniature jumping robots is taken directly from the animals living around us. A gecko is a perfect example for a jumping, balancing robot: in an experiment, it was observed how a gecko can balance its landing using the moment of its tail. Another example is a falling cat righting itself in midair. These miniature jumping robots are becoming widely used in real life. It is hard to overstate how useful such a robot can be in situations where human presence is dangerous, as in wars or natural disasters (such as earthquakes, or nuclear disasters like the Fukushima leaks in Japan). The MSU jumper robot is unique in terms of weight and size; however, there are control problems in the landing procedure, as well as in self-righting and maneuvering in midair. Designing a controller for the MSU jumping robot is challenging: the controller has to respond in half a second, as the jumping period is close to 2 seconds. That short period makes it almost impossible for the robot to resist uncertainties or unmodeled dynamics, as well as changes in the mass moment of inertia of the body due to changes of body shape. We managed to add mini wings to the robot to prolong the jumping period and stabilize the landing procedure, as well as to enable the robot to estimate the mass moment of inertia of the body, all so that the controller at the tail can force the body to land on the desired edge. The MSU jumper robot has attracted wide attention in robotics media and industry due to its small size and light weight. A weight that does not exceed 28 g and a maximum size of 6.5 cm are what make the robot special among jumping robots.
- Title
- Searle's Chinese box : the Chinese room argument and artificial intelligence
- Creator
- Hauser, Larry Steven
- Date
- 1993
- Collection
- Electronic Theses & Dissertations