Layer-Wise Relevance Propagation with Conservation Property for ResNet

Seitaro Otsuki¹, Tsumugi Iida¹, Félix Doublet¹, Tsubasa Hirakawa², Takayoshi Yamashita², Hironobu Fujiyoshi², Komei Sugiura¹

¹Keio University ²Chubu University



Accepted at ECCV 2024.

Abstract

Transparent formulation of explanation methods is essential for elucidating the predictions of neural network models, which are commonly of a black-box nature. Layer-wise Relevance Propagation (LRP) stands out as a well-established method that transparently traces the flow of a model's prediction backward through its architecture by backpropagating relevance scores. However, LRP does not fully account for skip connections, and its application to the widely used ResNet architecture has not been thoroughly explored.

In this study, we extend LRP to ResNet models by introducing relevance splitting at the point where the output of a skip connection converges with that of a residual block. Moreover, our formulation guarantees the conservation property throughout the process, thereby preserving the integrity of the generated explanations.

To evaluate the effectiveness of our approach, we conducted experiments on the ImageNet and Caltech-UCSD Birds-200-2011 datasets. Our method demonstrated superior performance compared with baseline methods on standard evaluation metrics, such as the Insertion-Deletion score, while maintaining its conservation property.


Fig. 1: We propose LRP for ResNet. By formulating Relevance Splitting at a point where the output from a skip connection converges with that from a residual block, we extend LRP---originally designed for propagating relevance between two consecutive layers---to the ResNet architecture while guaranteeing its conservation property, thereby preserving the integrity of the explanation process.

Overview


Fig. 2: LRP for ResNet. Top: LRP propagates the relevance score backward to generate an attribution map corresponding to the input image. We focus on the relevance score propagation through the Bottleneck module, which incorporates a residual connection. Bottom: Architecture of the Bottleneck module. The D-Bottleneck employs a linear projection, implemented by a \( 1 \times 1 \) convolution, in its skip connection for dimension matching. ReLU activation functions and batch normalization layers are omitted for simplicity.

Relevance Splitting

LRP originally defines a propagation rule between two consecutive layers. However, this rule does not account for skip connections, which bridge nonconsecutive layers.
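For reference, the generic consecutive-layer rule (the widely used LRP-\( \varepsilon \) variant) can be sketched for a single dense layer as follows. This is a minimal NumPy illustration, not the paper's implementation; the function name and argument conventions are ours.

    import numpy as np

    def lrp_epsilon(a, W, b, R_out, eps=1e-6):
        # Standard LRP-epsilon rule for one dense layer z = W a + b.
        # a: (n_in,) input activations; R_out: (n_out,) relevance from the layer above.
        z = W @ a + b                           # forward pre-activations
        z = np.where(z >= 0, z + eps, z - eps)  # stabilizer avoids division by zero
        s = R_out / z                           # relevance per unit of pre-activation
        return a * (W.T @ s)                    # redistribute in proportion to a_j * w_kj

Conservation holds up to the small leakage introduced by the stabilizer and the bias term.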
How should we propagate relevance scores at the point where the output of the skip connection converges with that of the residual block? We propose Relevance Splitting to address this question.

As a preliminary step, we divide the relevance score \( \boldsymbol R^{(l+1)} \) into two parts: \( \boldsymbol R_s \), propagated through the skip connection, and \( \boldsymbol R_m \), propagated through the mainstream of the residual block. To adhere to the conservation property, we impose the following constraint: \[ \boldsymbol R^{(l)} = \boldsymbol R^{(l+1)} = \boldsymbol R_s + \boldsymbol R_m. \]
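Conceptually, relevance propagation through a block with a skip connection then takes the following shape. This is a schematic sketch with our own naming: split_fn is one of the two splitting rules defined below, and the two propagate arguments stand in for ordinary LRP rules applied branch-wise.

    def relevance_through_bottleneck(R_next, h_s, h_m, split_fn,
                                     propagate_skip, propagate_residual):
        # Enforce R^(l) = R_s + R_m = R^(l+1) across the block.
        R_s, R_m = split_fn(R_next, h_s, h_m)    # 1. split at the merge point
        R_from_skip = propagate_skip(R_s)        # 2a. identity or 1x1-conv projection
        R_from_main = propagate_residual(R_m)    # 2b. conv stack of the residual block
        return R_from_skip + R_from_main         # 3. branches re-merge at the block input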


Fig. 3: Architecture of the Bottleneck block in ResNet and our Relevance Splitting approach. We introduce Relevance Splitting to consider the existence of skip connections in the relevance propagation of LRP. We formulate two splitting approaches and discuss their application to two distinct types of skip connections.


We formulate the following two splitting approaches.


Symmetric Splitting

Symmetric Splitting divides \( \boldsymbol R^{(l+1)} \) equally as follows: \[ (R_s)_i = (R_m)_i = \frac{R^{(l+1)}_i}{2}. \]
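As a one-line sketch (array-based; the unused h_s and h_m arguments only keep the interface uniform with the ratio-based rule below):

    def symmetric_split(R_next, h_s=None, h_m=None):
        # each branch receives exactly half, so R_s + R_m = R_next holds trivially
        return R_next / 2, R_next / 2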

Ratio-Based Splitting

Ratio-Based Splitting is a finer-grained approach that allocates relevance to each branch in proportion to the magnitude of its output. Let \( \boldsymbol h_s \) and \( \boldsymbol h_m \) denote the outputs of the skip connection and the residual block, respectively. We divide \( \boldsymbol R^{(l+1)} \) so as to satisfy the following conditions: \[ (R_s)_i = \frac{R^{(l+1)}_i \cdot |(h_s)_i|}{|(h_m)_i| + |(h_s)_i|},\quad (R_m)_i = \frac{R^{(l+1)}_i \cdot |(h_m)_i|}{|(h_m)_i| + |(h_s)_i|}. \]
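A direct transcription of these conditions, with a small stabilizer of our own to guard against positions where both branch outputs vanish, followed by a numerical check of the conservation constraint:

    import numpy as np

    def ratio_based_split(R_next, h_s, h_m, eps=1e-12):
        # weight each branch by the magnitude of its forward contribution;
        # eps is our addition to avoid division by zero
        denom = np.abs(h_s) + np.abs(h_m) + eps
        return R_next * np.abs(h_s) / denom, R_next * np.abs(h_m) / denom

    # quick check of R_s + R_m = R^(l+1)
    R_next = np.random.randn(8)
    h_s, h_m = np.random.randn(8), np.random.randn(8)
    R_s, R_m = ratio_based_split(R_next, h_s, h_m)
    assert np.allclose(R_s + R_m, R_next)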




Results & Empirical Analysis

We conducted experiments on the ImageNet and Caltech-UCSD Birds-200-2011 (CUB) datasets to evaluate the effectiveness of our approach.


Qualitative Results


Fig. 4: Qualitative results: attribution maps produced by each explanation method for the predictions of ResNet50 with respect to the ground-truth classes (top to bottom): "Brandt Cormorant", "Savannah Sparrow", "Sock", "Bustard", and "Bee."


Quantitative Results

Table 1: Quantitative results on ImageNet and the CUB dataset. IG and Guided BP denote Integrated Gradients and Guided BackPropagation, respectively; Ins. and Del. denote the Insertion and Deletion scores, respectively. The best results are marked in bold.
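For context, the Deletion score removes pixels in order of decreasing attribution and integrates the model's class probability as pixels disappear (lower is better); the Insertion score is the complementary procedure starting from an uninformative image (higher is better). Below is a rough NumPy sketch of Deletion under our own simplifications (scalar baseline, fixed step size), not the exact evaluation protocol of the paper.

    import numpy as np

    def deletion_score(model, image, attribution, step=0.01, baseline=0.0):
        # image: (C, H, W); attribution: (H, W);
        # model maps an image array to the target-class probability
        C, H, W = image.shape
        order = np.argsort(attribution.ravel())[::-1]   # most relevant pixels first
        x = image.copy()
        probs = [model(x)]
        n = max(1, int(step * H * W))
        for start in range(0, H * W, n):
            rows, cols = np.unravel_index(order[start:start + n], (H, W))
            x[:, rows, cols] = baseline                 # "delete" by overwriting
            probs.append(model(x))
        return np.trapz(probs, dx=1.0 / (len(probs) - 1))  # normalized AUC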



Empirical Analysis of Conservation Property


Fig. 5: Visualization of the sum of relevance scores backpropagated to three critical points within ResNet50: (a) the input of the entire network, (b) the input of the first Bottleneck block, and (c) the input of the last Bottleneck block. Each point in the scatter plots corresponds to the sum of the relevance scores plotted against the model's output for one sample. Points closer to the diagonal indicate higher adherence to the conservation property.
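The analysis behind Fig. 5 can be reproduced in spirit as follows (a hedged sketch; how the relevance maps are obtained is left to any LRP implementation):

    import numpy as np
    import matplotlib.pyplot as plt

    def conservation_scatter(model_outputs, relevance_maps):
        # model_outputs: (N,) target-class scores, one per sample
        # relevance_maps: list of N arrays backpropagated to a chosen point
        sums = np.array([r.sum() for r in relevance_maps])
        plt.scatter(model_outputs, sums, s=8)
        lim = [min(model_outputs.min(), sums.min()),
               max(model_outputs.max(), sums.max())]
        plt.plot(lim, lim, "k--", label="perfect conservation")
        plt.xlabel("model output")
        plt.ylabel("sum of relevance scores")
        plt.legend()
        plt.show()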



Ablation Study of Relevance Propagation Rules for Bottleneck Modules

Table 2: Comparison of propagation rules for the Bottleneck modules. "Include Identical" denotes the condition in which the relevance score for skip connections with identity mapping is not set to 0. The Insertion, Deletion, and ID scores were calculated on ImageNet. The highest scores are marked in bold.




Fig. 6: Qualitative results from the ablation study of propagation rules for the Bottleneck modules. Methods (i) and (iv) refer to the methods described in Table 2. Method (iv) exhibits a more concentrated attribution towards relevant objects than Method (i).



BibTeX


    @article{otsuki2024layer,
      title={{Layer-Wise Relevance Propagation with Conservation Property for ResNet}},
      author={Otsuki, Seitaro and Iida, Tsumugi and Doublet, F\'elix and Hirakawa, Tsubasa and Yamashita, Takayoshi and Fujiyoshi, Hironobu and Sugiura, Komei},
      journal={arXiv preprint arXiv:2407.09115},
      year={2024},
    }