Explainable Artificial Intelligence
(Autumn 2021)

General Concepts / Surveys

  • Hidden neuron analysis methods:

    • Visualize, map back to input space, or label the features learned by individual hidden neurons

    • GAN-based visualization of a neuron's preferred inputs

    • Activation maximization: synthesize an input that maximally activates a chosen neuron

    • Limitation: qualitatively analyzing every neuron yields little actionable, quantitative insight into the overall decision mechanism of the whole model
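The activation-maximization idea above can be sketched with a toy numpy example: gradient ascent on the input (not the weights) of a single tanh neuron, climbing toward whatever input pattern the neuron responds to most. The one-neuron "network" and its weights are invented here purely for illustration.

```python
import numpy as np

# Toy hidden neuron: a = tanh(w . x + b). Weights are made up for illustration.
rng = np.random.default_rng(0)
w = rng.normal(size=8)
b = 0.1

def activation(x):
    return np.tanh(w @ x + b)

def grad_activation(x):
    a = activation(x)
    return (1.0 - a ** 2) * w        # d tanh(w.x + b) / dx = (1 - a^2) * w

# Activation maximization: gradient ascent on the INPUT vector.
x = np.zeros(8)
for _ in range(200):
    x += 0.1 * grad_activation(x)
    x = np.clip(x, -1.0, 1.0)        # keep the synthesized input in a bounded range

print(activation(x))
```

In a real deep network the gradient comes from backpropagation and the synthesized input is regularized to stay natural-looking; the clipping step here is a crude stand-in for that.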

  • Model mimicking methods: imitate the classification function of a target model with a transparent model that is easy to interpret yet retains high accuracy

    • Limitation: because the mimic model has lower complexity, there is no guarantee that a deep model with a large VC dimension can be faithfully imitated by a simpler, shallower model
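A minimal mimicking sketch: query a "black-box" teacher on sampled inputs and fit a transparent linear student to the teacher's outputs, then measure fidelity on held-out inputs. The teacher function below is invented for illustration; its nonlinearity is exactly what the linear student cannot fully capture, echoing the VC-dimension caveat above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in "black-box" teacher model (made up for illustration).
def teacher(X):
    return np.tanh(2.0 * X[:, 0] - X[:, 1]) + 0.1 * np.sin(5 * X[:, 2])

# Sample inputs, query the teacher, and fit a transparent linear student
# on the teacher's outputs (soft labels) rather than the ground-truth labels.
X = rng.uniform(-1, 1, size=(5000, 3))
y_teacher = teacher(X)

A = np.hstack([X, np.ones((len(X), 1))])    # add an intercept column
coef, *_ = np.linalg.lstsq(A, y_teacher, rcond=None)

# Fidelity: how closely the student reproduces the teacher on fresh inputs.
X_test = rng.uniform(-1, 1, size=(1000, 3))
A_test = np.hstack([X_test, np.ones((len(X_test), 1))])
fidelity_err = float(np.mean((A_test @ coef - teacher(X_test)) ** 2))
print(coef, fidelity_err)
```

The residual fidelity error stays nonzero because the student's hypothesis class is strictly smaller than the teacher's, which is the point of the limitation stated above.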

  • Local interpretation methods:

    • Compute and visualize the features most important to the prediction for a single input instance by analyzing how predictions change under local perturbations of that instance

    • LIME

    • Saliency maps

    • Limitation: perceptually indistinguishable instances may receive inconsistent explanations, i.e., local explanations can be unstable under small input changes
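The local-perturbation recipe can be sketched in LIME's style: sample perturbations around the instance, weight them by proximity, and fit a weighted linear surrogate whose coefficients act as local feature importances. The black-box function, instance, and kernel width below are all invented for illustration; the real LIME also uses interpretable binary features and feature selection, which are omitted here.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in black-box model to be explained (made up for illustration).
def black_box(X):
    return 1.0 / (1.0 + np.exp(-(3 * X[:, 0] * X[:, 1] + X[:, 2])))

x0 = np.array([0.5, -0.2, 0.1])     # the instance whose prediction we explain

# 1) Perturb the instance locally, 2) query the black box,
# 3) weight samples by proximity to x0, 4) fit a weighted linear surrogate.
Z = x0 + rng.normal(scale=0.1, size=(2000, 3))
y = black_box(Z)
weights = np.exp(-np.sum((Z - x0) ** 2, axis=1) / (2 * 0.1 ** 2))

A = np.hstack([Z, np.ones((len(Z), 1))])
sw = np.sqrt(weights)
coef, *_ = np.linalg.lstsq(sw[:, None] * A, sw * y, rcond=None)

importances = coef[:3]              # local feature importances at x0
print(importances)
```

Because the surrogate is refit from random perturbations, two nearly identical instances can yield different coefficient estimates, which is one concrete source of the inconsistency noted in the last bullet.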