Detecting and mitigating bias in natural languages
Natural language processing (NLP) is an increasingly prominent subfield of artificial intelligence (AI). NLP techniques enable intelligent machines to understand and analyze natural languages and make it possible for humans and machines to communicate through natural languages. However, a growing body of evidence indicates that NLP applications exhibit human-like discriminatory bias or make unfair decisions. As NLP algorithms play an increasingly irreplaceable role in automating people's daily lives, bias in NLP directly affects users' vital interests and demands considerable attention.

While there is a growing number of studies related to bias in natural languages, research on this topic is far from complete. In this thesis, we propose several studies to fill the gaps in the area of bias in NLP from three perspectives. First, existing studies are mainly confined to traditional and relatively mature NLP tasks, but for certain newly emerging tasks such as dialogue generation, research on how to define, detect, and mitigate bias is still absent. We conduct pioneering studies on bias in dialogue models to answer these questions. Second, previous studies mainly focus on explicit bias in NLP algorithms but overlook implicit bias. We investigate implicit bias in text classification tasks and propose novel methods to detect, explain, and mitigate it. Third, existing research on bias in NLP focuses on in-processing and post-processing bias mitigation strategies, but rarely considers how to prevent bias from being introduced while the training data are generated, especially during the data annotation phase. To this end, we investigate annotator bias in crowdsourced data for NLP tasks and its group effect. We verify the existence of annotator group bias, develop a novel probabilistic graphical framework to capture it, and propose an algorithm to eliminate its negative impact on NLP model learning.
- In Collections: Electronic Theses & Dissertations
- Copyright Status: Attribution-NonCommercial-ShareAlike 4.0 International
- Material Type: Theses
- Authors: Liu, Haochen
- Thesis Advisors: Tang, Jiliang
- Committee Members: Tang, Jiliang; Liu, Hui; Tan, Pang-Ning; Mollaoglu, Sinem
- Date Published: 2022
- Subjects: Computer science; Artificial intelligence; Natural language processing (Computer science); Discrimination; Psycholinguistics
- Program of Study: Computer Science - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: xi, 95 pages
- ISBN: 9798841798132
- Permalink: https://doi.org/doi:10.25335/pfhn-j307