Towards Post-Hoc Human-Interpretability of Multimodal Neural Networks for Healthcare Applications

The use of artificial intelligence (AI) in healthcare has expanded rapidly in recent years. Multimodal neural networks (MNNs) that analyze diverse data types, such as images, lab reports, and genomic data, often outperform unimodal approaches in healthcare applications. However, owing to their complex architectures, the decision-making logic of these large AI models is often opaque. This raises serious concerns about model reliability, accountability, patient autonomy, and bias, and the black-box nature of these models often makes them unsuitable for high-risk healthcare applications. Research in explainable AI (XAI) is therefore critical for their safe implementation. This dissertation develops a two-phase approach to improve explainability and reliability assessment of MNNs in healthcare.
Phase 1 - Explainability via feature importance: we develop a unified framework that quantifies the relative importance of multimodal inputs using post-hoc, model-agnostic methods. The estimated importances are validated through simulations in which the true importance is known exactly and through agreement among multiple attribution methods. Experiments with multimodal breast tumor and cardiomegaly classifiers demonstrate that the technique explains model behavior across diverse data types, with high agreement scores and alignment with expert intuition.
Phase 2 - Quantifying prediction reliability: we use multimodal input importance to predict the impact of missing inputs on MNN performance. This impact is reported using interpretable performance metrics, including accuracy reduction, which are closely tied to model reliability. We also propose extending average model reliability to finer-grained, patient-specific reliability estimates using reliability calibration curves.
The methods developed in this dissertation offer promising approaches to improve interpretability and quantify reliability of complex MNNs, potentially facilitating their safe adoption in high-risk clinical settings.
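
As an illustration of the general idea only, the sketch below estimates modality-level importance for a toy two-modality classifier by permuting one modality at a time across samples and measuring the resulting accuracy reduction. It is not the framework developed in the dissertation: permutation importance is just one common post-hoc, model-agnostic method, and the function `modality_permutation_importance`, the toy model `toy_predict`, the modality names `image` and `labs`, and the synthetic data are all hypothetical. The toy labels are constructed so that the image modality dominates, which loosely mirrors a simulation where the true importance ordering is known in advance.

```python
import numpy as np


def modality_permutation_importance(predict, inputs, labels, n_repeats=10, seed=0):
    """Estimate each modality's relative importance via permutation.

    predict: callable mapping {modality_name: array} to predicted class labels.
    inputs:  dict {modality_name: array of shape (n_samples, ...)}.
    Returns (baseline accuracy, per-modality accuracy drop, normalized importance).
    """
    rng = np.random.default_rng(seed)
    baseline = float(np.mean(predict(inputs) == labels))
    drops = {}
    for name, values in inputs.items():
        accuracies = []
        for _ in range(n_repeats):
            shuffled = dict(inputs)  # shallow copy; other modalities stay intact
            # Break the link between this modality and the labels by
            # reassigning it across samples.
            shuffled[name] = values[rng.permutation(len(values))]
            accuracies.append(np.mean(predict(shuffled) == labels))
        drops[name] = max(baseline - float(np.mean(accuracies)), 0.0)
    total = sum(drops.values())
    importance = {k: (v / total if total > 0 else 0.0) for k, v in drops.items()}
    return baseline, drops, importance


# Hypothetical synthetic data: the label depends strongly on the first image
# feature and only weakly on the first lab feature.
data_rng = np.random.default_rng(42)
n_samples = 2000
image_features = data_rng.normal(size=(n_samples, 4))
lab_features = data_rng.normal(size=(n_samples, 3))
labels = (3.0 * image_features[:, 0] + 0.5 * lab_features[:, 0] > 0).astype(int)


def toy_predict(x):
    """Stand-in for a trained multimodal model (assumption, not a real MNN)."""
    return (3.0 * x["image"][:, 0] + 0.5 * x["labs"][:, 0] > 0).astype(int)


baseline, drops, importance = modality_permutation_importance(
    toy_predict, {"image": image_features, "labs": lab_features}, labels
)
print("baseline accuracy:", baseline)
print("accuracy drop per modality:", drops)
print("relative importance:", importance)
```

In this toy setting, the accuracy drop for a permuted modality can also be read as a rough proxy for the performance lost when that modality is missing or uninformative, which is the kind of interpretable quantity Phase 2 refers to. How the dissertation actually maps importance to predicted accuracy reduction is not specified in this abstract.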
