Grad-CAM++: Generalized Gradient-Based Visual Explanations for Deep Convolutional Networks

Abstract

Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision problems. However, these deep models are perceived as "black box" methods considering the lack of understanding of their internal functioning. There has been a significant recent interest in developing explainable deep learning models, and this paper is an effort in this direction. Building on a recently proposed method called Grad-CAM, we propose a generalized method called Grad-CAM++ that can provide better visual explanations of CNN model predictions, in terms of better object localization as well as explaining occurrences of multiple object instances in a single image, when compared to state-of-the-art. We provide a mathematical derivation for the proposed method, which uses a weighted combination of the positive partial derivatives of the last convolutional layer feature maps with respect to a specific class score as weights to generate a visual explanation for the corresponding class label. Our extensive experiments and evaluations, both subjective and objective, on standard datasets showed that Grad-CAM++ provides promising human-interpretable visual explanations for a given CNN architecture across multiple tasks including classification, image caption generation and 3D action recognition; as well as in new settings such as knowledge distillation.
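The weighting described in the abstract can be illustrated with a short NumPy sketch. The abstract states the idea only in words, so the closed form used here is an assumption: it is the commonly used formulation in which the class score is taken to pass through an exponential, so that higher-order derivatives reduce to powers of the first-order gradient. The function name `grad_cam_pp`, its arguments, and the `eps` guard are hypothetical, and the feature maps and gradients are assumed to have been extracted from the network beforehand.

```python
import numpy as np


def grad_cam_pp(activations, gradients, eps=1e-8):
    """Sketch of a Grad-CAM++-style saliency map (not the authors' code).

    activations: last-conv-layer feature maps, shape (K, H, W).
    gradients:   d(class score)/d(activations), same shape.
    Returns an (H, W) map scaled to [0, 1].
    """
    A = np.asarray(activations, dtype=np.float64)
    g = np.asarray(gradients, dtype=np.float64)
    g2 = g ** 2
    g3 = g2 * g

    # Sum of each feature map over its spatial positions, shape (K, 1, 1).
    a_sum = A.sum(axis=(1, 2), keepdims=True)

    # Per-pixel coefficients alpha, assuming the exponential-score closed form.
    denom = 2.0 * g2 + a_sum * g3
    # Guard against division by (near) zero; this guard is an assumption.
    denom = np.where(np.abs(denom) > eps, denom, 1.0)
    alpha = g2 / denom

    # Channel weights: alpha-weighted sum of the positive gradients only.
    weights = (alpha * np.maximum(g, 0.0)).sum(axis=(1, 2))  # shape (K,)

    # Weighted combination of feature maps, keeping positive evidence.
    cam = np.maximum(np.tensordot(weights, A, axes=1), 0.0)  # shape (H, W)

    peak = cam.max()
    return cam / peak if peak > 0 else cam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.random((4, 7, 7))          # stand-in feature maps
    grads = rng.standard_normal((4, 7, 7))  # stand-in gradients
    print(grad_cam_pp(feats, grads).shape)
```

In practice the resulting map would be upsampled to the input resolution and overlaid on the image; that step is omitted here.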

Keywords

Convolutional neural network, Artificial intelligence, Computer science, Feature (linguistics), Class (philosophy), Image (mathematics), Object (grammar), Deep learning, Pattern recognition (psychology), Feature extraction, Contextual image classification, Machine learning

Related Publications

Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks

Haofan Wang , Zifan Wang , Mengnan Du +5 more

Recently, increasing attention has been drawn to the internal mechanisms of convolutional neural networks, and the reason why the network makes specific decisions. In this paper...

2020 1186 citations

Object Detection Networks on Convolutional Feature Maps

Shaoqing Ren , Kaiming He , Ross Girshick +2 more

Most object detectors contain two important components: a feature extractor and an object classifier. The feature extractor has rapidly evolved with significant research efforts...

2015 arXiv (Cornell University) 35 citations

Fast R-CNN

Ross Girshick

This paper proposes a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection. Fast R-CNN builds on previous work to efficiently classify object proposa...

2015 2015 IEEE International Conference on... 26511 citations

Fast R-CNN

Ross Girshick

This paper proposes a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection. Fast R-CNN builds on previous work to efficiently classify object proposa...

2015 arXiv (Cornell University) 1766 citations

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Kaiming He , Xiangyu Zhang , Shaoqing Ren +1 more

Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224 × 224) input image. This requirement is "artificial" and may reduce the recognition accuracy f...

2015 IEEE Transactions on Pattern Analysis... 10916 citations

Publication Info

Year: 2018
Type: preprint
Citations: 2688
Access: Closed

Citation Metrics

OpenAlex: 2688
Influential: 287
CrossRef: 2158

Cite This

APA Style

Aditya Chattopadhay, Anirban Sarkar, Prantik Howlader et al. (2018). Grad-CAM++: Generalized Gradient-Based Visual Explanations for Deep Convolutional Networks. 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). https://doi.org/10.1109/wacv.2018.00097

Identifiers

DOI: 10.1109/wacv.2018.00097
arXiv: 1710.11063
