Fairness and Accessibility in Multimodal AI Recruitment: Assessing and Mitigating Bias Across Text, Video, and Speech
Artificial intelligence (AI) is rapidly reshaping how people work, with up to 30% of work activities projected to be automatable by 2030. As of 2024, 42% of enterprise organizations have adopted AI in their operations, and half of surveyed companies now incorporate AI into their hiring processes—a figure projected to rise to 68% within the next year. These systems are now used to automate resume screening, conduct chatbot-based interviews, and evaluate candidate personality traits and skills through video analysis. While they offer efficiency and scalability, their growing role in high-stakes employment decisions has raised serious concerns about fairness, transparency, and accessibility. In particular, AI-driven recruitment systems have demonstrated tendencies to reinforce gender biases and to disadvantage individuals with speech disfluencies, such as stuttering.

This dissertation addresses these challenges by examining the fairness and effectiveness of AI systems across three key modalities—text, video, and speech—and proposes novel strategies to mitigate bias across all three. Specifically, it makes three core contributions spanning adaptive assessments, automated video interview scoring, and accessibility in speech-based technologies.

First, we develop a multi-objective optimization framework for computerized adaptive testing that balances precision, test length, content coverage, item exposure, and fairness. By integrating evolutionary algorithms and large language models, our approach reduces assessment time without sacrificing accuracy, while also examining fairness trade-offs related to item exposure, memorization, and shortened assessments.

Second, we introduce a multimodal, multi-task deep neural network that predicts OCEAN personality traits from video interviews more efficiently than existing models.
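The adaptive-testing contribution above balances several competing objectives when choosing the next test item. As a simplified illustration only—the dissertation uses evolutionary algorithms, whereas this sketch scalarizes the objectives into one weighted score over a hypothetical item pool (all weights, items, and parameters below are invented for illustration):

```python
import math

# Hypothetical 2PL item pool: discrimination a, difficulty b,
# historical exposure rate, and content area.
ITEMS = [
    {"id": 1, "a": 1.2, "b": 0.0, "exposure": 0.35, "area": "math"},
    {"id": 2, "a": 0.8, "b": -0.5, "exposure": 0.10, "area": "verbal"},
    {"id": 3, "a": 1.5, "b": 0.3, "exposure": 0.60, "area": "math"},
    {"id": 4, "a": 1.0, "b": 0.1, "exposure": 0.05, "area": "verbal"},
]

def fisher_info(item, theta):
    """Fisher information of a 2PL item at ability estimate theta."""
    p = 1.0 / (1.0 + math.exp(-item["a"] * (theta - item["b"])))
    return item["a"] ** 2 * p * (1.0 - p)

def select_item(theta, administered_areas, w_exposure=0.5):
    """Pick the item with the best trade-off between measurement
    precision (information), over-exposure, and content coverage—a
    crude scalarization of the multi-objective selection problem."""
    def score(item):
        s = fisher_info(item, theta) - w_exposure * item["exposure"]
        if item["area"] not in administered_areas:
            s += 0.1  # small bonus for covering a new content area
        return s
    return max(ITEMS, key=score)

best = select_item(theta=0.2, administered_areas={"math"})
print(best["id"])  # item with the highest trade-off score
```

Here the exposure penalty keeps frequently seen items out of rotation (a fairness and test-security concern), even when they are the most informative at the current ability estimate.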
To measure and mitigate bias, we implement a counterfactual fairness framework using generative adversarial networks, enabling us to evaluate how changes to protected attributes (e.g., gender, age) affect model predictions. This ensures fairer treatment across demographic groups in personality-based scoring systems.

Third, we conduct a first-of-its-kind analysis of bias in automatic speech recognition (ASR) systems against individuals who stutter—a group historically underserved and disadvantaged by voice-based technologies. Using both synthetic and real-world data, we identify significant accuracy disparities in commercial and open-source ASRs. To address these, we propose a two-stage bias mitigation strategy: (1) fine-tuning pre-trained ASRs on disfluency-augmented data using parameter-efficient methods, and (2) correcting ASR transcript errors using large language models with retrieval-augmented generation. To support this work, we developed a novel cloud-based data collection platform to build one of the largest datasets of stuttered speech to date, collected across diverse speakers and application contexts.

Overall, this dissertation provides a comprehensive roadmap for improving fairness, accessibility, and efficiency in AI-based recruitment, offering actionable strategies for organizations aiming to build fair, inclusive, and effective hiring pipelines in an increasingly AI-driven world.
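The core idea of the counterfactual fairness check described above—flip a protected attribute, hold everything else fixed, and measure how much the prediction moves—can be sketched minimally. In the dissertation, counterfactual video inputs are generated with GANs; in this toy version a dictionary flip and a deliberately biased scoring function stand in for both (all names and values are hypothetical):

```python
def prediction_gap(model, examples, flip):
    """Mean absolute change in model output when the protected
    attribute is counterfactually flipped. Zero for a
    counterfactually fair model."""
    gaps = [abs(model(x) - model(flip(x))) for x in examples]
    return sum(gaps) / len(gaps)

def biased_model(x):
    # Toy scorer that (unfairly) rewards one gender directly.
    return 0.6 * x["skill"] + 0.2 * (1 if x["gender"] == "male" else 0)

def flip_gender(x):
    y = dict(x)  # copy so the original example is untouched
    y["gender"] = "female" if x["gender"] == "male" else "male"
    return y

data = [
    {"skill": 0.8, "gender": "male"},
    {"skill": 0.5, "gender": "female"},
]
print(prediction_gap(biased_model, data, flip_gender))
```

A nonzero gap flags a dependence on the protected attribute itself, which is exactly the disparity the GAN-based framework is designed to surface in video-based personality scoring.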
- In Collections: Electronic Theses & Dissertations
- Copyright Status: Attribution 4.0 International
- Material Type: Theses
- Authors: Mujtaba, Dena Freshta
- Thesis Advisors: Mahapatra, Nihar
- Committee Members: Deb, Kalyanmoy; Tang, Jiliang; Nye, Christopher; Mahapatra, Nihar
- Date Published: 2025
- Subjects: Ethics; Artificial intelligence
- Program of Study: Electrical and Computer Engineering - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: 268
- Embargo End Date: May 15th, 2026
- Permalink: https://doi.org/doi:10.25335/075h-4790
By request of the author, access to this document is currently restricted. Access will be restored May 16th, 2026.