Shahaf Bassan, Yizhak Yisrael Elboher, Tobias Ladner, Volkan Şahin, Jan Kretinsky, Matthias Althoff, Guy Katz
This paper presents an efficient algorithm that generates provably minimal explanations for Neural Additive Models, a more interpretable class of neural networks, by reducing the computational complexity of the task.
Explaining how a neural network reaches a decision is challenging, especially when the goal is to identify the smallest set of input features that determines a prediction. For many neural networks, this task is computationally hard. This research focuses on Neural Additive Models (NAMs), a type of neural network that is easier to interpret. The authors develop an algorithm that efficiently finds the smallest set of features sufficient for a prediction, with a provable guarantee of minimality. The method is faster and produces more concise explanations than previous approaches, and it offers insights that traditional methods cannot provide.
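
To make the idea of a smallest sufficient feature set concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the authors' algorithm. It assumes a binary NAM whose score is a bias plus a sum of per-feature contributions, a positive prediction whenever the score is nonnegative, and that the lowest possible contribution of each feature over its allowed range is already known. The function name and all parameters are hypothetical.

```python
from typing import List, Sequence


def smallest_sufficient_subset(
    contribs: Sequence[float],
    worst_case: Sequence[float],
    bias: float = 0.0,
) -> List[int]:
    """Return the fewest feature indices that, once fixed, keep the score >= 0.

    contribs[i]   : contribution of feature i at the input being explained
    worst_case[i] : assumed lowest contribution feature i can take if left free
    """
    if bias + sum(contribs) < 0:
        raise ValueError("this sketch only handles a positive prediction")

    # Fixing feature i raises the worst-case score by this nonnegative amount.
    gains = [c - w for c, w in zip(contribs, worst_case)]
    order = sorted(range(len(contribs)), key=lambda i: gains[i], reverse=True)

    # Start with every feature free, i.e. at its worst-case contribution.
    score = bias + sum(worst_case)
    chosen: List[int] = []
    for i in order:
        if score >= 0:
            break  # prediction can no longer flip; the subset is sufficient
        chosen.append(i)
        score += gains[i]
    return sorted(chosen)


# Toy example with three features: fixing feature 0 alone is enough.
explanation = smallest_sufficient_subset(
    contribs=[2.0, 0.5, -0.3],
    worst_case=[-1.0, -0.2, -0.5],
)
```

Because the score is additive, the worst-case score of any subset is the all-free baseline plus the gains of the fixed features, so picking the largest gains first reaches a nonnegative worst case with the fewest features. Obtaining trustworthy worst-case bounds for each feature is the hard part in practice, and this sketch simply takes them as given.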