Search results
(1 - 15 of 15)
- Title
- Robust multi-task learning algorithms for predictive modeling of spatial and temporal data
- Creator
- Liu, Xi (Graduate of Michigan State University)
- Date
- 2019
- Collection
- Electronic Theses & Dissertations
- Description
-
"Recent years have witnessed the significant growth of spatial and temporal data generated from various disciplines, including geophysical sciences, neuroscience, economics, criminology, and epidemiology. Such data have been extensively used to train spatial and temporal models that can make predictions either at multiple locations simultaneously or along multiple forecasting horizons (lead times). However, training an accurate prediction model in these domains can be challenging especially when there are significant noise and missing values or limited training examples available. The goal of this thesis is to develop novel multi-task learning frameworks that can exploit the spatial and/or temporal dependencies of the data to ensure robust predictions in spite of the data quality and scarcity problems. The first framework developed in this dissertation is designed for multi-task classification of time series data. Specifically, the prediction task here is to continuously classify activities of a human subject based on the multi-modal sensor data collected in a smart home environment. As the classes exhibit strong spatial and temporal dependencies, this makes it an ideal setting for applying a multi-task learning approach. Nevertheless, since the types of sensors deployed often vary from one room (location) to another, this introduces a structured missing value problem, in which blocks of sensor data could be missing when a subject moves from one room to another. To address this challenge, a probabilistic multi-task classification framework is developed to jointly model the activity recognition tasks from all the rooms, taking into account the block-missing value problem. The framework also learns the transitional dependencies between classes to improve its overall prediction accuracy. The second framework is developed for the multi-location time series forecasting problem. 
Although multi-task learning has been successfully applied to many time series forecasting applications such as climate prediction, conventional approaches aim to minimize only the point-wise residual error of their predictions instead of considering how well their models fit the overall distribution of the response variable. As a result, their predicted distribution may not fully capture the true distribution of the data. In this thesis, a novel distribution-preserving multi-task learning framework is proposed for the multi-location time series forecasting problem. The framework uses a non-parametric density estimation approach to fit the distribution of the response variable and employs an L2-distance function to minimize the divergence between the predicted and true distributions. The third framework proposed in this dissertation is for the multi-step-ahead (long-range) time series prediction problem with application to ensemble forecasting of sea surface temperature. Specifically, our goal is to effectively combine the forecasts generated by various numerical models at different lead times to obtain more precise predictions. Towards this end, a multi-task deep learning framework based on a hierarchical LSTM architecture is proposed to jointly model the ensemble forecasts of different models, taking into account the temporal dependencies between forecasts at different lead times. Experiments performed on 29-year sea surface temperature data from North American Multi-Model Ensemble (NAMME) demonstrate that the proposed architecture significantly outperforms standard LSTM and other MTL approaches."--Pages ii-iii.
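The distribution-preserving idea in the abstract above can be illustrated with a small sketch. It is not the thesis's implementation: the Gaussian kernel, the bandwidth `h = 0.3`, the integration range, and all sample values are assumptions chosen for illustration. The sketch estimates two densities with a kernel density estimator and approximates the squared L2 distance between them, showing that predictions collapsed toward the mean can sit far from the observed distribution even when their average is similar.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function x -> estimated density (Gaussian kernel, assumed choice)."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density

def l2_distance(p, q, lo, hi, steps=2000):
    """Approximate the integral of (p(x) - q(x))^2 over [lo, hi] (midpoint rule)."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += (p(x) - q(x)) ** 2
    return total * width

# Hypothetical observations and two hypothetical sets of predictions.
observed = [0.1, 0.4, 0.5, 0.9, 1.3, 1.7]
predicted_a = [0.2, 0.4, 0.6, 0.9, 1.2, 1.6]  # similar spread to the observations
predicted_b = [0.8, 0.8, 0.8, 0.8, 0.8, 0.9]  # collapsed toward the mean

h = 0.3  # hypothetical bandwidth
p_obs = gaussian_kde(observed, h)
d_a = l2_distance(p_obs, gaussian_kde(predicted_a, h), -2.0, 4.0)
d_b = l2_distance(p_obs, gaussian_kde(predicted_b, h), -2.0, 4.0)
print("distance for spread-preserving predictions:", d_a)
print("distance for collapsed predictions:", d_b)
```

A training objective in this spirit would add such a distance as a penalty next to the usual point-wise error; how the thesis weights or optimizes the two terms is not stated in the abstract.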
- Title
- Novel learning algorithms for mining geospatial data
- Creator
- Yuan, Shuai (Software engineer)
- Date
- 2017
- Collection
- Electronic Theses & Dissertations
- Description
-
Geospatial data have a wide range of applicability in many disciplines, including environmental science, urban planning, healthcare, and public administration. The proliferation of such data in recent years has presented opportunities to develop novel data mining algorithms for modeling and extracting useful patterns from the data. However, many practical issues remain that must be addressed before the algorithms can be successfully applied to real-world problems. First, the algorithms must be able to incorporate spatial relationships and other domain constraints defined by the problem. Second, the algorithms must be able to handle missing values, which are common in many geospatial data sets. In particular, the models constructed by the algorithms may need to be extrapolated to locations with no observation data. Another challenge is to adequately capture the nonlinear relationship between the predictor and response variables of the geospatial data. Accurate modeling of such relationships is not only a challenge but also computationally expensive. Finally, the variables may interact at different spatial scales, making it necessary to develop models that can handle multi-scale relationships present in the geospatial data. This thesis presents the novel algorithms I have developed to overcome the practical challenges of applying data mining to geospatial datasets. Specifically, the algorithms will be applied to both supervised and unsupervised learning problems such as cluster analysis and spatial prediction. While the algorithms are mostly evaluated on datasets from the ecology domain, they are generally applicable to other geospatial datasets with similar characteristics. First, a spatially constrained spectral clustering algorithm is developed for geospatial data. 
The algorithm provides a flexible way to incorporate spatial constraints into the spectral clustering formulation in order to create regions that are spatially contiguous and homogeneous. It can also be extended to a hierarchical clustering setting, enabling the creation of fine-scale regions that are nested wholly within broader-scale regions. Experimental results suggest that the nested regions created using the proposed approach are more balanced in terms of their sizes compared to the regions found using traditional hierarchical clustering methods. Second, a supervised hash-based feature learning algorithm is proposed for modeling nonlinear relationships in incomplete geospatial data. The proposed algorithm can simultaneously infer missing values while learning a small set of discriminative, nonlinear features of the geospatial data. The efficacy of the algorithm is demonstrated using synthetic and real-world datasets. Empirical results show that the algorithm is more effective than the standard approach of imputing the missing values before applying nonlinear feature learning in more than 75% of the datasets evaluated in the study. Third, a multi-task learning framework is developed for modeling multiple response variables in geospatial data. Instead of training the local models independently for each response variable at each location, the framework simultaneously fits the local models for all response variables by optimizing a joint objective function with trace-norm regularization. The framework also leverages the spatial autocorrelation between locations as well as the inherent correlation between response variables to improve prediction accuracy. Finally, a multi-level, multi-task learning framework is proposed to effectively train predictive models from nested geospatial data containing predictor variables measured at multiple spatial scales. 
The framework enables distinct models to be developed for each coarse-scale region using both its fine-level and coarse-level features. It also allows information to be shared among the models through a common set of latent features. Empirical results show that such information sharing helps to create more robust models, especially for regions with limited or no training data. Another advantage of using the multi-level, multi-task learning framework is that it can automatically identify potential cross-scale interactions between the regional and local variables.
- Title
- Smartphone-based sensing systems for data-intensive applications
- Creator
- Moazzami, Mohammad-Mahdi
- Date
- 2017
- Collection
- Electronic Theses & Dissertations
- Description
-
"Supported by advanced sensing capabilities, increasing computational resources and the advances in Artificial Intelligence, smartphones have become our virtual companions in our daily life. An average modern smartphone is capable of handling a wide range of tasks including navigation, advanced image processing, speech processing, cross-app data processing, etc. The key facet that is common in all of these applications is the data-intensive computation. In this dissertation we have taken steps towards the realization of the vision that makes the smartphone truly a platform for data-intensive computations by proposing frameworks, applications and algorithmic solutions. We followed a data-driven approach to the system design. To this end, several challenges must be addressed before smartphones can be used as a system platform for data-intensive applications. The major challenges addressed in this dissertation include high power consumption, high computation cost in advanced machine learning algorithms, lack of real-time functionalities, lack of embedded programming support, heterogeneity in the apps, communication interfaces and lack of customized data processing libraries. The contribution of this dissertation can be summarized as follows. We present the design, implementation and evaluation of the ORBIT framework, which represents the first system that combines the design requirements of a machine learning system and sensing system together at the same time. We ported for the first time off-the-shelf machine learning algorithms for real-time sensor data processing to smartphone devices. We highlighted how machine learning on smartphones comes with severe costs that need to be mitigated in order to make smartphones capable of real-time data-intensive processing. From an application perspective we present SPOT. SPOT aims to address some of the challenges discovered in mobile-based smart-home systems. 
These challenges prevent us from achieving the promises of smart homes due to heterogeneity in different aspects of smart devices and the underlying systems. We face the following major heterogeneities in building smart homes: (i) diverse appliance control apps, (ii) communication interface, (iii) programming abstraction. SPOT makes the heterogeneous characteristics of smart appliances transparent, and by that it minimizes the burden of home automation application developers and the efforts of users who would otherwise have to deal with appliance-specific apps and control interfaces. From an algorithmic perspective we introduce two systems in the smartphone-based deep learning area: Deep-Crowd-Label and Deep-Partition. Deep neural models are both computationally and memory intensive, making them difficult to deploy on mobile applications with limited hardware resources. On the other hand, they are the most advanced machine learning algorithms suitable for real-time sensing applications used in the wild. Deep-Partition is an optimization-based partitioning meta-algorithm featuring a tiered architecture for smartphone and the back-end cloud. Deep-Partition provides a profile-based model partitioning allowing it to intelligently execute the Deep Learning algorithms among the tiers to minimize the smartphone power consumption by minimizing the deep models' feed-forward latency. Deep-Crowd-Label is prototyped for semantically labeling a user's location. It is a crowd-assisted algorithm that uses crowd-sourcing in both training and inference time. It builds deep convolutional neural models using crowd-sensed images to detect the context (label) of indoor locations. It features domain adaptation and model extension via transfer learning to efficiently build deep models for image labeling. 
The work presented in this dissertation covers three major facets of data-driven and compute-intensive smartphone-based systems: platforms, applications and algorithms; and helps to spur new areas of research and open up new directions in mobile computing research."--Pages ii-iii.
- Title
- Performance analysis and privacy protection of network data
- Creator
- Ahmed, Faraz (Research engineer)
- Date
- 2018
- Collection
- Electronic Theses & Dissertations
- Description
-
"The goal of this thesis is to address network management research challenges faced by operational networks - with specific focus on cellular networks, content delivery networks, and online social networks. Next, I give an overview of my research on network management of these networks. Cellular networks utilize existing service quality management systems for detecting performance degradation issues inside the network; however, under certain conditions degradation in End-to-End (E2E) performance may go undetected. These conditions may arise due to problems in the mobile device hardware, smartphone applications, and content providers. In this thesis, I present a system for detecting and localizing E2E performance degradation at cellular service providers across four administrative domains: cellular network, content providers, device manufacturers, and smartphone applications. Cellular networks also need systems that can prioritize performance degradation issues according to the number of customers impacted. Cell tower outages are performance degradation issues that directly impact connectivity of cellular network users. In this thesis, we design and evaluate a cell tower outage monitoring system that analyzes and estimates device level impact during cell tower outages. Content delivery networks (CDNs) maintain multiple transit routes from content distribution servers to eyeball ISP networks which provide Internet connectivity to end users. Two major considerations for CDNs are transit prices and performance dynamics of delivering content to end users. The dynamic nature of transit pricing and performance makes it challenging to optimize the cost and performance tradeoff. There are thousands of eyeball ISPs which are reachable via different transit routes and different geographical locations. 
Each choice of transit route for a particular eyeball ISP and geographical location has distinct cost and performance characteristics, which makes the problem of developing a transit routing strategy challenging. In this thesis, I present a measurement approach to actively collect client perceived network performance and then use these measurements towards optimal transit route selection for CDNs. Online Social Networks (OSNs) often refuse to publish their social network graphs due to privacy concerns. Differential privacy has been a widely accepted criterion for privacy preserving data publishing. In this thesis, I present a random matrix approach to OSN graph publishing, which achieves storage and computational efficiency by reducing dimensions of adjacency matrices and achieves differential privacy by adding a small amount of noise."--Pages ii-iii.
- Title
- Multi-task learning and its application to geospatio-temporal data
- Creator
- Xu, Jianpeng
- Date
- 2017
- Collection
- Electronic Theses & Dissertations
- Description
-
Multi-task learning (MTL) is a data mining and machine learning approach for modeling multiple prediction tasks simultaneously by exploiting the relatedness among the tasks. MTL has been successfully applied to various domains, including computer vision, healthcare, genomics, recommender systems, and natural language processing. The goals of this thesis are: (1) to investigate the feasibility of applying MTL to geospatio-temporal prediction problems, particularly those encountered in the climate and environmental science domains and (2) to develop novel MTL frameworks that address the challenges of building effective predictive models from geospatio-temporal data. The first contribution of this thesis is to develop an online temporal MTL framework called ORION for ensemble forecasting problems. Ensemble forecasting uses a numerical method to simulate the evolution of nonlinear dynamic systems, such as climate and hydrological systems. ORION aims to effectively aggregate the forecasts generated by different ensemble members for a future time window, where each forecast is obtained by perturbing the starting condition of the computer model or using a different model representation. ORION considers the prediction for each time point in the forecast window as a distinct prediction task, where the task relatedness is achieved by imposing temporal smoothness and mean regularization constraints. A novel, online update with restart strategy is proposed to handle missing observations in the training data. ORION can also be optimized for different objectives, such as ε-insensitive and quantile loss functions. The second contribution of this thesis is to propose an MTL framework named GSpartan that can perform inferences at multiple locations simultaneously while allowing the local models for different locations to be jointly trained. 
GSpartan assumes that the local models share a common, low-rank representation and employs a graph Laplacian regularization to enforce constraints due to the inherent spatial autocorrelation of the data. Sparsity and non-negativity constraints are also incorporated into the formulation to ensure interpretability of the models. GSpartan is an MTL framework that considers only the spatial autocorrelation of the data. It is also a batch learning algorithm, which makes it difficult to scale up to global-scale data. To address these limitations, a new framework called WISDOM is proposed, which can incorporate the task relatedness across both space and time. WISDOM encodes the geospatio-temporal data as a tensor and performs supervised tensor decomposition to identify the latent factors that capture the inherent spatial and temporal variabilities of the data as well as the relationship between the predictor and target variables. The framework is unique in that it trains distinct spatial and temporal prediction models from the latent factors of the decomposed tensor and aggregates the outputs of these models to obtain the final prediction. WISDOM also employs an incremental learning algorithm that can systematically update the models when training examples are available for a new time period or for a new location. Finally, the geospatio-temporal data for many scientific applications are often available at varying spatial scales. For example, they can be generated by computer models simulated at different grid resolutions (e.g., the global and regional models used in climate modeling). A simple way to handle the predictor variables generated from the multi-scale data is to concatenate them into a single feature vector and train WISDOM using the concatenated vectors. However, this strategy may not be effective as it ignores the inherent dependencies between variables at different scales. 
To overcome this limitation, this thesis presents an extension of WISDOM called MUSCAT for handling multi-scale geospatio-temporal data. MUSCAT considers the consistency of the latent factors extracted from the spatio-temporal tensors at different scales while inheriting the benefits of WISDOM. Given the massive size of the multi-scale spatio-temporal tensors, a novel, supervised, incremental multi-tensor decomposition algorithm is developed to efficiently learn the model parameters.
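The abstract above mentions that ORION can be optimized for ε-insensitive and quantile loss functions. As a brief illustration, the sketch below gives the standard textbook definitions of both losses for a single prediction. It does not reproduce ORION's objective or its regularization terms, and the function names and example values are hypothetical.

```python
def quantile_loss(y_true, y_pred, tau):
    """Pinball loss for a quantile level tau in (0, 1).

    Under-prediction (positive residual) is weighted by tau,
    over-prediction by (1 - tau).
    """
    residual = y_true - y_pred
    if residual >= 0:
        return tau * residual
    return (tau - 1.0) * residual

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero while |residual| <= epsilon, growing linearly beyond that."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Hypothetical values: with tau = 0.9, missing low costs more than missing high.
print(quantile_loss(10.0, 8.0, 0.9))   # forecast too low
print(quantile_loss(8.0, 10.0, 0.9))   # forecast too high
print(epsilon_insensitive_loss(5.0, 5.3, 0.5))  # inside the tolerance tube
print(epsilon_insensitive_loss(5.0, 6.0, 0.5))  # outside the tolerance tube
```

A high quantile level suits forecasts where under-prediction is costly, while the ε-insensitive loss ignores errors smaller than a chosen tolerance.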
- Title
- Occupant behavior prediction model based on energy consumption using machine learning approaches
- Creator
- Mo, Yunjeong
- Date
- 2018
- Collection
- Electronic Theses & Dissertations
- Description
-
This research will have an impact on residential occupant behavior by helping occupants better understand their own behaviors' effects on energy usage, and detect what changes would improve energy efficiency in their homes. The findings will be beneficial to energy-related indust[...] integrated with research in other fields.
- Title
- Design and deployment of low-cost wireless sensor networks for real-time event detection and monitoring
- Creator
- Phillips, Dennis Edward
- Date
- 2018
- Collection
- Electronic Theses & Dissertations
- Description
-
As sensor network technologies become more mature, they are increasingly being applied to a wide variety of environmental monitoring applications, ranging from agricultural sensing to habitat monitoring, oceanic and volcanic monitoring. In this dissertation two wireless sensor networks (WSNs) are presented: one for monitoring residential power usage and another for producing an image of a volcano's internal structure. The two WSNs presented address several common challenges facing modern sensor networks. The first is in-network processing and assigning the processing tasks across a heterogeneous network architecture. By efficiently utilizing in-network processing, power consumption can be reduced and operational lifetime of the network can be extended. As nodes are embedded into various environments, sensing accuracy is intrinsically affected by physical noise. The second challenge relates to how to deal with this noise in a way which increases sensing accuracy. The third challenge is ease of deployment. As WSNs become more commonplace, they will be installed by non-experts. As a key technology of home area networks in smart grids, fine-grained power usage monitoring may help conserve electricity. Smart homes outfitted with network connected appliances will provide this capability in the future. Until smart appliances have wide adoption, there is a serious gap in capabilities. To fill this gap, an easy-to-deploy monitoring system is needed. Several existing systems achieve the goal of fine-grained power monitoring by exploiting appliances' power usage signatures utilizing labor-intensive in situ training processes. Recent work shows that autonomous power usage monitoring can be achieved by supplementing a smart meter with distributed sensors that detect the working states of appliances. However, sensors must be carefully installed for each appliance, resulting in high installation cost. 
Supero is the first ad hoc sensor system that can monitor appliance power usage without supervised training. By exploiting multi-sensor fusion and unsupervised machine learning algorithms, Supero can classify the appliance events of interest and autonomously associate measured power usage with the respective appliances. Extensive evaluation in five real homes shows that Supero can estimate the energy consumption with errors less than 7.5%. Moreover, non-professional users can quickly deploy Supero with considerable flexibility. There are a number of active volcanoes around the world with large population areas located nearby. An eruption poses a significant threat to the adjacent population. During times of increased activity, being able to obtain real-time images of the interior would allow seismologists to better understand volcanic dynamics. Volcano tomography can provide this valuable information concerning the internal structure of a volcano. The second sensor network presented in this dissertation is a seismic monitoring sensor network featuring in-network processing of the seismic signals with the capability to perform volcano tomography in real-time. The design challenges, analysis of processing/network processing times in the information processing pipeline, the system designed to meet these challenges and the results from deploying a prototype network on two volcanoes in Ecuador and Chile are presented. The study shows that it is possible to achieve in-network seismic event detection and real-time tomography using a sensor network that is 2 orders of magnitude less expensive than traditional seismic equipment.
- Title
- Novel computational methods for improving functional analysis for long noisy reads
- Creator
- Du, Nan
- Date
- 2019
- Collection
- Electronic Theses & Dissertations
- Description
-
"Single-molecule, real-time sequencing (SMRT) developed by Pacific Biosciences (PacBio) and Nanopore sequencing developed by Oxford Nanopore Technologies (Nanopore) produce longer reads than second-generation sequencing technologies such as Illumina. The increased read length enables PacBio sequencing to close gaps in genome assembly, reveal structural variations, and characterize the intra-species variations. It also holds the promise to decipher the community structure in complex microbial communities because long reads help metagenomic assembly. However, compared with data produced by popular short read sequencing technologies (such as Illumina), PacBio and Nanopore data have a higher sequencing error rate and lower coverage. Therefore, new algorithms are needed to take full advantage of third-generation sequencing technologies. For example, during an alignment-based homology search, insertion or deletion errors in genes will cause frameshifts, which may lead to marginal alignment scores and short alignments. In this case, it is hard to distinguish correct alignments from random alignments, and the ambiguity will incur errors in the structural and functional annotation. Existing frameshift correction tools are designed for data with a much lower error rate, and they are not optimized for PacBio data. As an increasing number of groups are using SMRT, there is an urgent need for dedicated homology search tools for PacBio and Nanopore data. Another example is overlap detection. For both PacBio reads and Nanopore reads, there is still a need to improve the sensitivity of detecting small overlaps or overlaps with high error rates. Addressing this need will enable better assembly for metagenomic data produced by the third-generation sequencing technologies. In this article, we are going to discuss possible methods for homology search and overlap detection for third-generation sequencing. 
For overlap detection, we designed and implemented an overlap detection program named GroupK. GroupK takes a group of short kmer hits, which satisfy statistically derived distance constraints to increase the sensitivity of small overlap detection. For the homology search, we designed and implemented a profile homology search tool named Frame-Pro based on the profile hidden Markov model (pHMM) and consensus sequences finding method. However, Frame-Pro still relies on multiple sequence alignment, so we implemented DeepFrame, a deep learning model that predicts the corresponding protein function for third-generation sequencing reads. In the experiment on simulated reads of protein-coding sequences and real reads from the human genome, our model outperforms pHMM-based methods and the deep learning based method. Our model can also reject unrelated DNA reads and achieves higher recall with the precision comparable to the state-of-the-art method."--Pages ii-iii.
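The overlap-detection idea in the abstract above (collect short k-mer hits between two reads, then keep groups of hits that are mutually consistent) can be sketched as follows. This is not GroupK: the fixed `max_diagonal_shift` tolerance is a hypothetical stand-in for the statistically derived distance constraints, and the reads are made-up toy strings.

```python
from collections import defaultdict

def kmer_index(seq, k):
    """Map each k-mer in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index

def kmer_hits(read_a, read_b, k):
    """All (pos_a, pos_b) pairs where the two reads share an exact k-mer."""
    index = kmer_index(read_a, k)
    hits = []
    for j in range(len(read_b) - k + 1):
        for i in index.get(read_b[j:j + k], []):
            hits.append((i, j))
    return sorted(hits)

def group_hits(hits, max_diagonal_shift):
    """Group hits by diagonal (pos_a - pos_b).

    A hit joins the current group when its diagonal is within
    max_diagonal_shift (a hypothetical tolerance) of the group's first hit;
    small shifts are what insertion or deletion errors would produce.
    """
    groups = []
    for hit in sorted(hits, key=lambda h: h[0] - h[1]):
        diag = hit[0] - hit[1]
        if groups and diag - groups[-1]["diag"] <= max_diagonal_shift:
            groups[-1]["hits"].append(hit)
        else:
            groups.append({"diag": diag, "hits": [hit]})
    return [g["hits"] for g in groups]

# Toy reads: the end of read_b overlaps the middle of read_a.
read_a = "ACGTACGGTCA"
read_b = "TTACGGTC"
hits = kmer_hits(read_a, read_b, 4)
print(hits)
print(group_hits(hits, 1))
```

Larger groups of consistent hits are stronger evidence of a true overlap than isolated hits, which is why grouping can help with short or error-rich overlaps.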
- Title
- Contributions to machine learning in biomedical informatics
- Creator
- Baytas, Inci Meliha
- Date
- 2019
- Collection
- Electronic Theses & Dissertations
- Description
-
"With innovations in digital data acquisition devices and increased memory capacity, virtually all commercial and scientific domains have been witnessing an exponential growth in the amount of data they can collect. For instance, healthcare is experiencing a tremendous growth in digital patient information due to the high adoption rate of electronic health record systems in hospitals. The abundance of data offers many opportunities to develop robust and versatile systems, as long as the underlying salient information in data can be captured. On the other hand, today's data, often named big data, is challenging to analyze due to its large scale and high complexity. For this reason, efficient data-driven techniques are necessary to extract and utilize the valuable information in the data. The field of machine learning essentially develops such techniques to learn effective models directly from the data. Machine learning models have been successfully employed to solve complicated real world problems. However, the big data concept has numerous properties that pose additional challenges in algorithm development. Namely, high dimensionality, class membership imbalance, non-linearity, distributed data, heterogeneity, and temporal nature are some of the big data characteristics that machine learning must address. Biomedical informatics is an interdisciplinary domain where machine learning techniques are used to analyze electronic health records (EHRs). EHR comprises digital patient data with various modalities and depicts an instance of big data. For this reason, analysis of digital patient data is quite challenging although it provides a rich source for clinical research. While the scale of EHR data used in clinical research might not be huge compared to the other domains, such as social media, it is still not feasible for physicians to analyze and interpret longitudinal and heterogeneous data of thousands of patients. 
Therefore, computational approaches and graphical tools to assist physicians in summarizing the underlying clinical patterns of the EHRs are necessary. The field of biomedical informatics employs machine learning and data mining approaches to provide the essential computational techniques to analyze and interpret complex healthcare data to assist physicians in patient diagnosis and treatment. In this thesis, we propose and develop machine learning algorithms, motivated by prevalent biomedical informatics tasks, to analyze the EHRs. Specifically, we make the following contributions: (i) A convex sparse principal component analysis approach along with variance reduced stochastic proximal gradient descent is proposed for the patient phenotyping task, which is defined as finding clinical representations for patient groups sharing the same set of diseases. (ii) An asynchronous distributed multi-task learning method is introduced to learn predictive models for distributed EHRs. (iii) A modified long-short term memory (LSTM) architecture is designed for the patient subtyping task, where the goal is to cluster patients based on similar progression pathways. The proposed LSTM architecture, T-LSTM, performs a subspace decomposition on the cell memory such that the short term effect in the previous memory is discounted based on the length of the time gap. (iv) An alternative approach to T-LSTM model is proposed with a decoupled memory to capture the short and long term changes. The proposed model, decoupled memory gated recurrent network (DM-GRN), is designed to learn two types of memories focusing on different components of the time series data. In this study, in addition to the healthcare applications, behavior of the proposed model is investigated for traffic speed prediction problem to illustrate its generalization ability. 
In summary, the aforementioned machine learning approaches have been developed to address complex characteristics of electronic health records in routine biomedical informatics tasks such as computational patient phenotyping and patient subtyping. Proposed models are also applicable to different domains with similar data characteristics as EHRs."--Pages ii-iii.
- Title
- Learning 3D model from 2D in-the-wild images
- Creator
- Tran, Luan Quoc
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
-
Understanding the 3D world is one of computer vision's fundamental problems. While a human has no difficulty understanding the 3D structure of an object upon seeing its 2D image, such a 3D inference task remains extremely challenging for computer vision systems. To better handle the ambiguity in this inverse problem, one must rely on additional prior assumptions such as constraining faces to lie in a restricted subspace from a 3D model. Conventional 3D models are learned from a set of 3D scans or computer-aided design (CAD) models, and represented by two sets of PCA basis functions. Due to the type and amount of training data, as well as the linear bases, the representation power of these models can be limited. To address these problems, this thesis proposes an innovative framework to learn a nonlinear 3D model from a large collection of in-the-wild images, without collecting 3D scans. Specifically, given an input image (of a face or an object), a network encoder estimates the projection, lighting, shape and albedo parameters. Two decoders serve as the nonlinear model to map from the shape and albedo parameters to the 3D shape and albedo, respectively. With the projection parameter, lighting, 3D shape, and albedo, a novel analytically differentiable rendering layer is designed to reconstruct the original input. The entire network is end-to-end trainable with only weak supervision. We demonstrate the superior representation power of our models on different domains (face, generic objects), and their contribution to many other applications on facial analysis and monocular 3D object reconstruction.
- Title
- Online Learning Algorithms for Mining Trajectory data and their Applications
- Creator
- Wang, Ding
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
-
Trajectories are spatio-temporal data that represent traces of moving objects, such as humans, migrating animals, vehicles, and tropical cyclones. In addition to the geo-location information, trajectory data often contain other (non-spatial) features describing the states of the moving objects. The time-varying geo-location and state information would collectively characterize a trajectory dataset, which can be harnessed to understand the dynamics of the moving objects. This thesis focuses on the development of efficient and accurate machine learning algorithms for forecasting the future trajectory path and state of a moving object. Although many methods have been developed in recent years, there are still numerous challenges that have not been sufficiently addressed by existing methods, which hamper their effectiveness when applied to critical applications such as hurricane prediction. These challenges include their difficulties in terms of handling concept drifts, error propagation in long-term forecasts, missing values, and nonlinearities in the data. In this thesis, I present a family of online learning algorithms to address these challenges. Online learning is an effective approach as it can efficiently fit new observations while adapting to concept drifts present in the data. First, I proposed an online learning framework called OMuLeT for long-term forecasting of the trajectory paths of moving objects. OMuLeT employs an online learning with restart strategy to incrementally update the weights of its predictive model as new observation data become available. It can also handle missing values in the data using a novel weight renormalization strategy. Second, I introduced the OOR framework to predict the future state of the moving object. Since the state can be represented by ordinal values, OOR employs a novel ordinal loss function to train its model. 
In addition, the framework was extended to OOQR to accommodate a quantile loss function to improve its prediction accuracy for larger values on the ordinal scale. Furthermore, I also developed the OOR-ε and OOQR-ε frameworks to generate real-valued state predictions using the ε insensitivity loss function. Third, I developed an online learning framework called JOHAN that simultaneously predicts the location and state of the moving object. JOHAN generates its predictions by leveraging the relationship between the state and location information. JOHAN utilizes a quantile loss function to bias the algorithm towards predicting large categorical values of the state of the moving object more accurately, say, for a high intensity hurricane. Finally, I present a deep learning framework to capture non-linear relationships in trajectory data. The proposed DTP framework employs a TDM approach for imputing missing values, coupled with an LSTM architecture for dynamic path prediction. In addition, the framework was extended to ODTP, which applied an online learning setting to address concept drifts present in the trajectory data. As proof of concept, the proposed algorithms were applied to the hurricane prediction task. Both OMuLeT and ODTP were used to predict the future trajectory path of a hurricane up to 48 hours lead time. Experimental results showed that OMuLeT and ODTP outperformed various baseline methods, including the official forecasts produced by the U.S. National Hurricane Center. OOR was applied to predict the intensity of a hurricane up to 48 hours in advance. Experimental results showed that OOR outperformed various state-of-the-art online learning methods and can generate predictions close to the NHC official forecasts. Since hurricane intensity prediction is a notoriously hard problem, JOHAN was applied to improve its prediction accuracy by leveraging the trajectory information, particularly for high intensity hurricanes that are near landfall.
- Title
- SIGN LANGUAGE RECOGNIZER FRAMEWORK BASED ON DEEP LEARNING ALGORITHMS
- Creator
- Akandeh, Atra
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
-
According to the World Health Organization (WHO, 2017), 5% of the world’s population have hearing loss. Most people with hearing disabilities communicate via sign language, which hearing people find extremely difficult to understand. To facilitate communication of deaf and hard of hearing people, developing an efficient communication system is a necessity. There are many challenges associated with the Sign Language Recognition (SLR) task, namely, lighting conditions, complex background, signee body postures, camera position, occlusion, complexity and large variations in hand posture, no word alignment, coarticulation, etc. Sign Language Recognition has been an active domain of research since the early 90s. However, due to computational resources and sensing technology constraints, limited advancement has been achieved over the years. Existing sign language translation systems mostly can translate a single sign at a time, which makes them less effective in daily-life interaction. This work develops a novel sign language recognition framework using deep neural networks, which directly maps videos of sign language sentences to sequences of gloss labels by emphasizing critical characteristics of the signs and injecting domain-specific expert knowledge into the system. The proposed model also allows for combining data from variant sources and hence combating limited data resources in the SLR field.
- Title
- Adaptive and Automated Deep Recommender Systems
- Creator
- Zhao, Xiangyu
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
-
Recommender systems are intelligent information retrieval applications, and have been leveraged in numerous domains such as e-commerce, movies, music, books, and points of interest. They play a crucial role in the users' information-seeking process, and overcome the information overload issue by recommending personalized items (products, services, or information) that best match users' needs and preferences. Driven by the recent advances in machine learning theories and the prevalence of deep learning techniques, there has been tremendous interest in developing deep learning based recommender systems. They have advanced the effectiveness of mining the non-linear user-item relationships and learning the feature representations from massive datasets to an unprecedented degree, which has produced great vitality and improvements in recommendations from both academic and industry communities. Despite the prominence of existing deep recommender systems, their adaptiveness and automation still remain under-explored. Thus, in this dissertation, we study the problem of adaptive and automated deep recommender systems. Specifically, we present our efforts devoted to building adaptive deep recommender systems to continuously update recommendation strategies according to the dynamic nature of user preference, which maximizes the cumulative reward from users in the practical streaming recommendation scenarios. In addition, we propose a group of automated and systematic approaches that design deep recommender system frameworks effectively and efficiently in a data-driven manner. More importantly, we apply our proposed models to a variety of real-world recommendation platforms and have achieved promising enhancements of social and economic benefits.
- Title
- Deep Convolutional Networks for Modeling Geo-Spatio-Temporal Relationships and Extremes
- Creator
- Wilson, Tyler
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
-
Geo-spatio-temporal data are valuable for a broad range of applications including traffic forecasting, weather prediction, detection of epidemic outbreaks, and crime monitoring. Data driven approaches to these problems must address several fundamental challenges such as handling the geo-spatio-temporal relationships and extreme events. Another recent technological shift has been the success of deep learning especially in applications such as computer vision, speech recognition, and natural language processing. In this work, we argue that deep learning is a promising approach for many geo-spatio-temporal problems and highlight how it can be used to address the challenges of modeling geo-spatio-temporal relationships and extremes. Though previous research has established techniques for modeling spatio-temporal relationships, these approaches are often limited to gridded spatial data with fixed-length feature vectors and considered only spatial relationships among the features, while ignoring the relationships among model parameters. We begin by describing how the spatial and temporal relationships for non-gridded spatial data can be modeled simultaneously by coupling the graph convolutional network with a long short-term memory (LSTM) network. Unlike previous research, our framework treats the adjacency matrix associated with the spatial data as a model parameter that can be learned from data, with constraints on its sparsity and rank to reduce the number of estimated parameters. Further, we show that the learned adjacency matrix may reveal useful information about the dominant spatial relationships that exist within the data. Second, we explore the varieties of spatial relationships that may exist in a geo-spatial prediction task. Specifically, we distinguish between spatial relationships among predictors and the spatial relationships among model parameters at different locations. 
We demonstrate an approach for modeling spatial dependencies among model parameters using graph convolution and provide guidance on when convolution of each type can be effectively applied. We evaluate our proposed approach on climate downscaling and weather prediction tasks. Next, we introduce DeepGPD, a novel deep learning framework for predicting the distribution of geo-spatio-temporal extreme events. We draw on research in extreme value theory and use the generalized Pareto distribution (GPD) to model the distribution of excesses over a threshold. The GPD is integrated into our deep learning framework to learn the distribution of future excess values while incorporating the geo-spatio-temporal relationships present in the data. This requires a novel reparameterization of the GPD to ensure that its constraints are satisfied by the outputs of the neural network. We demonstrate the effectiveness of our proposed approach on a real-world precipitation data set. DeepGPD also employs a deep set architecture to handle the variable-sized feature sets corresponding to excess values from previous time steps as its predictors. Finally, we extend the DeepGPD formulation to simultaneously predict the distribution of extreme events and accurately infer their point estimates. Doing so requires modeling the full distribution of the data, not just its extreme values. We propose DEMM, a deep mixture model for modeling the distribution of both excess and non-excess values. To ensure the point estimation of DEMM is a feasible value, new constraints on the output of the neural network are introduced, which requires a new reparameterization of the model parameters of the GPD. We conclude by discussing possibilities for further research at the intersection of deep learning and geo-spatio-temporal data.
- Title
- MICROBLOG GUIDED CRYPTOCURRENCY TRADING AND FRAMING ANALYSIS
- Creator
- Pawlicka Maule, Anna Paula
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
-
With 56 million people actively trading and investing in cryptocurrency online and globally, there is an increasing need for an automatic social media analysis tool to help understand trading discourse and behavior. Previous works have shown the usefulness of modeling microblog discourse for the prediction of trading stocks and their price fluctuations, as well as content framing. In this work, I present a natural language modeling pipeline that leverages language and social network behaviors for the prediction of cryptocurrency day trading actions and their associated framing patterns. Specifically, I present two modeling approaches. The first determines if the tweets of a 24-hour period can be used to guide day trading behavior, specifically if a cryptocurrency investor should buy, sell, or hold their cryptocurrencies in order to make a trading profit. The second is an unsupervised deep clustering approach to automatically detect framing patterns. My contributions include the modeling pipeline for this novel task, a new dataset of cryptocurrency-related tweets from influential accounts, and a transaction volume dataset. The experiments executed show that this weakly-supervised trading pipeline achieves an 88.78% accuracy for day trading behavior predictions and reveals framing fluctuations prior to and during the COVID-19 pandemic that could be used to guide investment actions.