Robust learning of deep neural networks under data corruption
Training deep neural networks in the presence of corrupted data is challenging, as corrupted data points may significantly degrade the generalization performance of the models. Unfortunately, data corruption is widespread in many application domains, including but not limited to healthcare, environmental sciences, autonomous driving, and social media analytics. Although previous studies have aimed to enhance the robustness of machine learning models against data corruption, most of them either lack theoretical robustness guarantees or are unable to scale to the millions of model parameters governing deep neural networks. The goal of this thesis is to design robust machine learning algorithms that 1) effectively deal with different types of data corruption, 2) have sound theoretical guarantees on robustness, and 3) scale to the large number of parameters in deep neural networks.

There are two general approaches to enhancing model robustness against data corruption. The first is to detect and remove the corrupted data, while the second is to design robust learning algorithms that can tolerate some fraction of corrupted data. In this thesis, I developed two robust unsupervised anomaly detection algorithms and two robust supervised learning algorithms for corrupted supervision and backdoor attacks. Specifically, in Chapter 2, I proposed the Robust Collaborative Autoencoder (RCA) approach to enhance the robustness of vanilla autoencoder methods against natural corruption. In Chapter 3, I developed Robust RealNVP, a robust density estimation technique for unsupervised anomaly detection tasks given concentrated anomalies. Chapter 4 presents the Provable Robust Learning (PRL) approach, a robust algorithm against agnostic corrupted supervision. In Chapter 5, a meta-algorithm for defending against backdoor attacks is proposed by exploiting the connection between label corruption and backdoor data poisoning attacks.
Extensive experiments on multiple benchmark datasets have demonstrated the robustness of the proposed algorithms under different types of corruption.
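To give a concrete flavor of the second general approach described above (tolerating a fraction of corrupted data during training), the sketch below applies trimmed-loss gradient descent to a linear model with corrupted labels: at every step, the fraction of samples with the largest loss is excluded from the gradient. This is a generic toy illustration under stated assumptions, not the RCA, Robust RealNVP, or PRL algorithms from the thesis; all names and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data; a 20% fraction of labels is corrupted
# by a large additive shift (a crude stand-in for corrupted supervision).
n, d, eps = 200, 5, 0.2
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.01 * rng.normal(size=n)
y[: int(eps * n)] += 10.0  # corrupted labels

def trimmed_gd(X, y, trim_frac, steps=500, lr=0.1):
    """Gradient descent that, at each step, ignores the trim_frac fraction
    of samples with the largest squared loss (the suspected corruptions)."""
    w = np.zeros(X.shape[1])
    keep = len(y) - int(trim_frac * len(y))
    for _ in range(steps):
        losses = (X @ w - y) ** 2
        idx = np.argsort(losses)[:keep]  # keep the lowest-loss samples
        Xk, yk = X[idx], y[idx]
        grad = 2 * Xk.T @ (Xk @ w - yk) / keep
        w -= lr * grad
    return w

w_robust = trimmed_gd(X, y, trim_frac=eps)   # tolerates the corrupted fraction
w_naive = trimmed_gd(X, y, trim_frac=0.0)    # trains on all samples

print("robust error:", np.linalg.norm(w_robust - w_true))
print("naive error: ", np.linalg.norm(w_naive - w_true))
```

On this synthetic setup the trimmed variant recovers the true weights far more accurately than naive training, which is the basic intuition behind robustness guarantees that hold up to some corrupted fraction of the data.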
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Liu, Boyang
- Thesis Advisors: Tan, Pang-Ning; Zhou, Jiayu
- Committee Members: Tan, Pang-Ning; Zhou, Jiayu; Tang, Jiliang; Cheruvelil, Kendra
- Date Published: 2022
- Subjects: Computer science; Neural networks (Computer science); Anomaly detection (Computer security); Computer security; Deep learning (Machine learning)
- Program of Study: Computer Science - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: xiv, 121 pages
- ISBN: 9798834063872
- Permalink: https://doi.org/doi:10.25335/ywk2-8j42