Toward practical human-interpretable explanations

Abstract
Model-agnostic feature attribution techniques are used to explain the decisions of complex machine learning (ML) models, including ensemble models and deep neural networks (DNNs). However, because complex ML models perform best when trained on low-level features, the explanations generated by these algorithms are often not interpretable or usable by humans. Recently proposed model-agnostic methods that support the generation of human-interpretable explanations are impractical because they require a fully invertible transformation function that maps the model's input features to human-interpretable features. While some practical human-interpretable explainability methods exist (e.g., concept-based methods), they typically require direct access to the model and are therefore not fully model-agnostic. In this paper, we introduce Latent SHAP, a model-agnostic black-box feature attribution framework that provides human-interpretable explanations without requiring a fully invertible transformation function. We validate the fidelity of Latent SHAP's explanations through quantitative faithfulness assessments on two controlled datasets: a self-generated artificial dataset and the dSprites dataset. Furthermore, we showcase the practical utility of Latent SHAP in real-world scenarios across domains such as computer vision, natural language processing, and cybersecurity. Each domain involves complex models (ensembles, DNNs, and LLMs) for which invertible transformation functions are not available.
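As background to the abstract, the model-agnostic Shapley-value attribution that SHAP-style methods (including Latent SHAP) build on can be sketched in a few lines. This is a generic, exact-enumeration illustration, not the paper's algorithm; the function names and the toy linear model are hypothetical, and the exponential enumeration is for illustration only (practical methods approximate this sum).

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley attributions for f at point x against a baseline input.

    Model-agnostic: only queries f on mixtures of x and baseline
    (no gradients, no access to model internals). Exponential in the
    number of features, so usable only for tiny illustrative models.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley kernel weight for coalitions of size k
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for S in combinations(others, k):
                # Inputs with/without feature i added to coalition S;
                # absent features take their baseline values.
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in S else baseline[j] for j in range(n)]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi

# Toy linear model (hypothetical): for a linear f, the Shapley value of
# feature i reduces analytically to w_i * (x_i - baseline_i).
w = [2.0, -1.0, 0.5]
f = lambda v: sum(wi * vi for wi, vi in zip(w, v))
phi = shapley_values(f, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
# phi == [2.0, -2.0, 1.5], and sum(phi) == f(x) - f(baseline)
```

The attributions here live in the model's own input space; the abstract's point is that when those inputs are low-level features, such attributions are hard for humans to read, which motivates attributing in a human-interpretable feature space instead.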
| Original language | English |
|---|---|
| Article number | 209 |
| Journal | Machine Learning |
| Volume | 114 |
| Issue number | 9 |
| DOIs | |
| State | Published - 1 Sep 2025 |
Keywords
- Explainability
- Explainable ML
- Machine learning
- XAI algorithms
ASJC Scopus subject areas
- Software
- Artificial Intelligence