SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness
1 University of Munich   2 The University of Hong Kong   3 Torr Vision Group, University of Oxford
1 Introduction
Due to their vulnerability to small artificial perturbations, the adversarial robustness of deep neural networks has received great attention [40,13]. A large number of attack and defense strategies have been proposed for classification in past years [5,37,46,47,54,57,43,1,52,14,39,34]. As an extension of classification, semantic segmentation also suffers from adversarial examples [50,2]. Segmentation models applied in real-world safety-critical applications face potential threats as well, e.g., in self-driving systems [32,25,19,33,4,36] and in medical image analysis [10,35,12,30]. Hence, the adversarial robustness of segmentation has also attracted great attention recently [50,2,51,49,20,42,27,24,8,38,53].
In terms of attack methods, segmentation differs from classification in that the attack goal is to fool all pixel classifications at the same time. An effective adversarial example for a segmentation model is expected to fool as many pixel classifications as possible, which requires a larger number of attack iterations [50,15]. This observation makes both robustness evaluation and adversarial training of segmentation models challenging. In this work, we propose an effective and efficient segmentation attack method, dubbed SegPGD. Besides, we provide a convergence analysis to show why the proposed SegPGD can create more effective adversarial examples than PGD under the same number of attack iterations.
A proper evaluation of model robustness is an important step towards building robust models. Evaluation with weak or inappropriate attack methods can give a false sense of robustness [3]. Recent work [51] evaluates the robustness of segmentation models under a setting similar to the one used in classification. This can be problematic, given that a large number of attack iterations is required to create effective adversarial examples for segmentation [50]. We evaluate the adversarially trained segmentation models of previous work under a strong attack setting, namely with a large number of attack iterations, and find that their robustness can be significantly reduced. Our SegPGD reduces the mIoU score even further. For example, the mIoU of an adversarially trained PSPNet [56] on the Cityscapes dataset [9] can be reduced to near zero under 100 attack iterations.
As one of the most effective defense strategies, adversarial training was proposed to address the vulnerability of classification models: adversarial examples are created and injected into the training data during training [13,29]. One promising way to boost segmentation robustness is to apply adversarial training to segmentation models. However, creating effective segmentation adversarial examples during training can be time-consuming. In this work, we demonstrate that our effective and efficient SegPGD can mitigate this challenge. Since it creates effective adversarial examples, applying SegPGD as the underlying attack of adversarial training can effectively boost the robustness of segmentation models. It is worth noting that many adversarial training strategies with single-step attacks have been proposed to improve the efficiency of adversarial training in classification [37,47,57,43,1]. However, they do not work well on segmentation models since the adversarial examples created by single-step attacks are not effective enough to fool segmentation models.
The contributions of our work can be summarised as follows:
2 Related Work
One way to address the dilemma is to use advanced single-step attacks [41,44,46,55,47,21,22], which can address the label-leaking problem and avoid the gradient-masking phenomenon. Although this boosts the robustness of classification models, single-step attack-based adversarial training does not work well on segmentation models because it is difficult to create effective segmentation adversarial examples with a single-step attack. Another way to address the dilemma is to simulate the robustness performance of multi-step attack-based adversarial training in an efficient way [37,57,5]. However, it is not clear how well the methods above generalize to segmentation.
We reformulate the loss function into two parts in Equation 3. The first term
therein is the loss of the correctly classified pixels, while the second one is formed
by the wrongly classified pixels.
$$L(f_{seg}(X^{adv_t}), Y) = \frac{1}{H \times W} \sum_{j \in P^{T}} L_j + \frac{1}{H \times W} \sum_{k \in P^{F}} L_k, \qquad (3)$$
The two loss terms are then weighted with 1 − λ and λ, respectively:

$$L(f_{seg}(X^{adv_t}), Y) = \frac{1-\lambda}{H \times W} \sum_{j \in P^{T}} L_j + \frac{\lambda}{H \times W} \sum_{k \in P^{F}} L_k. \qquad (4)$$

Note that the selection of λ is non-trivial. Simply setting λ = 0, where only correctly classified pixels are considered, does not work well: previously misclassified pixels can become benign again after a few attack iterations since they are ignored when the perturbation is updated. This is consistent with the previous observation [48,45] that adversarial perturbations are themselves sensitive to small noise. Furthermore, setting λ to a fixed value in [0, 0.5] does not always lead to better attack performance, for a similar reason: once most pixel classifications are fooled after a few attack iterations, putting less weight on the wrongly classified pixels can make some of them benign again.
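For concreteness, the re-weighted loss can be sketched in a few lines of PyTorch. This is a minimal illustration under our own assumptions: the function name, the tensor shapes (logits of shape (B, C, H, W) and integer labels of shape (B, H, W)), and the use of per-pixel cross-entropy as the per-pixel loss L_j are ours, not code from the paper.

```python
import torch
import torch.nn.functional as F

def weighted_seg_loss(logits: torch.Tensor, labels: torch.Tensor, lam: float) -> torch.Tensor:
    """Per-pixel cross-entropy split into correctly (P^T) and wrongly (P^F)
    classified pixels, weighted by (1 - lam) and lam as in Eqs. 3 and 4."""
    pixel_loss = F.cross_entropy(logits, labels, reduction="none")  # (B, H, W)
    correct = (logits.argmax(dim=1) == labels).float()              # mask of P^T
    wrong = 1.0 - correct                                           # mask of P^F
    norm = labels[0].numel()  # H * W; the scale does not affect the gradient sign
    loss_t = (pixel_loss * correct).sum() / norm
    loss_f = (pixel_loss * wrong).sum() / norm
    return (1.0 - lam) * loss_t + lam * loss_f
```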
In this work, instead of manually specifying a fixed value for λ, we propose to set λ dynamically with the number of attack iterations. The intuition behind the dynamic schedule is that we mainly focus on fooling correct pixel classifications in the first few attack iterations and then treat the wrong pixel classifications
quasi-equally in the last few iterations. By doing this, our SegPGD can achieve similar attack effectiveness with fewer iterations. We list some instances of our dynamic schedule as follows:
$$\lambda(t) = \frac{t-1}{2T}, \qquad \lambda(t) = \frac{1}{2} \log_2\Big(1 + \frac{t-1}{T}\Big), \qquad \lambda(t) = \frac{1}{2}\Big(2^{(t-1)/T} - 1\Big), \qquad (5)$$
where t is the index of the current attack iteration and T is the total number of attack iterations. Our experiments show that all the proposed instances are similarly effective. In this work, we mainly use the first, simple linear schedule. The pseudo code of our SegPGD with the proposed schedule is shown in Algorithm 1. Further discussion of the schedules for dynamically setting λ is given in Sec. 4.2.
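Algorithm 1 is not reproduced here; the snippet below is a hedged sketch of how the attack loop with the linear schedule could look in PyTorch, reusing the hypothetical weighted_seg_loss from above. The random start, the default step sizes, and the clamping of images to [0, 1] are common PGD conventions and our assumptions, not details taken from the paper.

```python
def segpgd_attack(model, x, y, eps=8/255, alpha=2/255, num_iters=10,
                  lam_fn=lambda t, T: (t - 1) / (2 * T)):
    """Sketch of SegPGD: PGD on the lambda-weighted loss with a dynamic schedule."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for t in range(1, num_iters + 1):
        x_adv.requires_grad_(True)
        loss = weighted_seg_loss(model(x_adv), y, lam=lam_fn(t, num_iters))
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()       # ascend the weighted loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project onto the eps-ball
            x_adv = x_adv.clamp(0, 1)                 # stay a valid image
    return x_adv.detach()
```

With the linear schedule, λ(1) = 0 and λ(T) = (T − 1)/(2T) approaches 1/2, which matches the intuition of focusing on correctly classified pixels first and treating both groups quasi-equally later.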
Similarly, the loss function in Equation 4 can also be applied to a single-step adversarial attack, e.g., FGSM [13]. In the resulting SegFGSM, only the correctly classified pixels are considered under the proposed λ schedule. Since only one update step is taken, the wrongly classified pixels are less likely to become benign. Hence, SegFGSM with the proposed λ schedule (i.e., λ = 0) also shows superior attack performance to FGSM.
In this subsection, we propose a fast segmentation attack method, i.e., SegPGD.
It can be applied to evaluate the adversarial robustness of segmentation mod-
els in an efficient way. Besides, SegPGD can also be applied to accelerate the
adversarial training on segmentation models.
where g_i(X) = −L(X, Y_i). The variable is constrained to a convex region since both constraints are linear.
A Projected Gradient Descent-based optimization method is often applied to solve the constrained minimization problem above [29]. The method first takes a step in the negative gradient direction to obtain a new point while ignoring the constraint, and then corrects the new point by projecting it back onto the constraint set.
The gradient-descent step of the PGD attack is

$$X^{t+1} = X^{t} - \alpha \cdot \mathrm{sign}\Big(\nabla \sum_{i=1}^{H \times W} g_i(X^{t})\Big), \qquad (8)$$
Fig. 1: Convergence Analysis. SegPGD, marked with blue solid lines, achieves a higher MisRatio than PGD under the same number of attack iterations. The loss of falsely classified pixels (FLoss), marked with downward triangles, dominates the overall loss (red lines without markers) during the attacks. Compared to PGD, the FLoss in SegPGD makes up a smaller portion of the overall loss since SegPGD mainly focuses on correctly classified pixels in the first few attack iterations.
where α is the step size. The initial point is the original clean example X^clean or a random initialization X^clean + U(−ϵ, +ϵ).
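For comparison with the SegPGD sketch above, the baseline PGD step of Eq. 8 weights all pixel losses equally. The following is again a minimal sketch under the same assumptions (names, random start, image clamping), not the paper's code; ascending the mean cross-entropy is equivalent to descending the sum of g_i = −L_i.

```python
def pgd_attack(model, x, y, eps=8/255, alpha=2/255, num_iters=10):
    """Baseline PGD: ascend the mean cross-entropy over all pixels (cf. Eq. 8)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(num_iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)   # uniform weight on every pixel
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x + (x_adv + alpha * grad.sign() - x).clamp(-eps, eps)
            x_adv = x_adv.clamp(0, 1)             # project back to valid images
    return x_adv.detach()
```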
Convergence Criterion. In the classification task, the loss is directly correlated with the attack goal: the larger the loss, the more likely the input is to be misclassified. However, this does not hold in the segmentation task. A large segmentation loss does not necessarily lead to more pixel misclassifications, since the loss is the sum of the losses of all pixel classifications. Once a pixel is misclassified, further increasing the loss on that pixel does not bring more adversarial effect. Hence, we propose a new convergence criterion for segmentation, dubbed MisRatio, which is defined as the ratio of misclassified pixels to all input pixels.
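A small helper for this criterion, under the same assumed tensor shapes as in the sketches above:

```python
def mis_ratio(logits: torch.Tensor, labels: torch.Tensor) -> float:
    """MisRatio: fraction of pixels whose predicted class differs from the label."""
    preds = logits.argmax(dim=1)  # (B, H, W) predicted classes
    return (preds != labels).float().mean().item()
```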
Convergence Analysis. In the first step of updating the adversarial example, the update rule of our SegPGD can be simplified as

$$X^{1} = X^{0} + \alpha \cdot \mathrm{sign}\Big(\sum_{j \in P^{T}} \nabla g_j(X^{0})\Big), \qquad (10)$$
In all intermediate steps, both SegPGD and PGD leverage the gradients of all pixel classification losses to update the adversarial example. The difference is that our SegPGD assigns more weight to the losses of correctly classified pixels, where the assigned value depends on the update iteration t. Our SegPGD focuses more on fooling correctly classified pixels in the first few iterations and then treats both groups quasi-equally. By doing this, our SegPGD can achieve a higher MisRatio than PGD under the same number of attack iterations.
In Fig. 1, we show the pixel classification loss and the PosiRatio (= 1 − MisRatio) at each attack iteration. Fig. 1a shows the attack on the adversarially trained PSPNet on VOC (see the experimental section for details). SegPGD, marked with blue solid lines, achieves a higher MisRatio than PGD under the same number of attack iterations. The loss of falsely classified pixels (FLoss), marked with downward triangles, dominates the overall loss (red lines without markers) during the attacks. Compared to PGD, the FLoss in SegPGD makes up a smaller portion of the overall loss since SegPGD mainly focuses on correctly classified pixels in the first few attack iterations. Note that the scale of the loss does not matter since only the signs of the input gradients are leveraged to create adversarial examples.
4 Experiment
In this section, we first introduce the experimental setting. Then, we show the
effectiveness of SegPGD. Specifically, we show SegPGD can achieve similar at-
tack effect with less attack iterations than PGD on both standard models and
adversarially trained models. In the last part, we show that adversarial training
with SegPGD can achieve more adversarially robust segmentation models.
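Before turning to the results, we sketch how SegPGD slots into adversarial training. The snippet below shows one Madry-style training step that reuses the hypothetical segpgd_attack sketched earlier; the function names, the optimizer handling, and the choice to train only on adversarial inputs are our assumptions rather than the paper's exact recipe. Setting attack_iters=3 mirrors the SegPGD3-AT naming used in the tables below.

```python
def segpgd_at_step(model, optimizer, x, y, attack_iters=3):
    """One adversarial-training step: craft SegPGD examples, then train on them."""
    x_adv = segpgd_attack(model, x, y, num_iters=attack_iters)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)  # segmentation loss on adversarial inputs
    loss.backward()
    optimizer.step()
    return loss.item()
```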
[Fig. 2 panels: mIoU (in %) vs. the number of attack iterations (3 to 100); (e) Standard PSPNet, (f) Standard DeepLabV3, (g) AT PSPNet, (h) AT DeepLabV3.]
Fig. 2: SegPGD is more effective and efficient than PGD. SegPGD creates more effective adversarial examples with the same number of attack iterations and converges to a better minimum than PGD. Subfigures (a-d) show the segmentation mIoUs on VOC, while the scores on Cityscapes are reported in subfigures (e-h). AT PSPNet stands for the adversarially trained PSPNet.
DAG can be viewed as a special case of our SegPGD, up to other minor differences: it only considers the correctly classified pixels in each attack iteration, which is equivalent to setting λ = 0 in our SegPGD. To further improve attack effectiveness, the work [16] proposes the multi-layer attack (MLAttack), in which the losses in the feature spaces of multiple intermediate layers are combined with the loss at the final output layer to create adversarial examples. SegPGD outperforms both DAG and MLAttack in terms of both efficiency and effectiveness, as shown in Appendix A; the code sketch below illustrates the relation to DAG.
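In terms of the hypothetical segpgd_attack sketch from earlier, the DAG-like behaviour corresponds to fixing the weight at λ = 0 for every iteration:

```python
# DAG-like variant: only correctly classified pixels contribute at every step.
x_adv = segpgd_attack(model, x, y, num_iters=100, lam_fn=lambda t, T: 0.0)
```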
Single-Step Attack. When a single attack iteration is applied, SegPGD degenerates to SegFGSM. In SegFGSM, only the loss of correctly classified pixels is considered under the proposed λ schedule. We compare FGSM and SegFGSM and report the mIoU. Our SegFGSM outperforms FGSM on both standard models and adversarially trained models. See Appendix B for details.
[Fig. 4 plots the mIoU (in %) under different weighting strategies: baseline, only-correct, weight-mis-0.1, weight-mis-0.2, weight-mis-0.3, weight-dyn-exp, weight-dyn-log, and weight-dyn-linear.]
Fig. 4: Schedules for weighting misclassified pixels. SegPGD with our weight-dyn-linear weighting schedule reduces the mIoU further and achieves better attack effectiveness than the baseline schedules.
Table 2: Adversarial Training on the Cityscapes Dataset (mIoU in %). This table shows that the boosting effect of adversarial training with our SegPGD also clearly holds on a different dataset. Besides, we show that the robustness of the previous adversarially trained baseline model can be reduced to near zero under a strong attack, i.e., PGD3-AT PSPNet under the PGD100 attack. Our SegPGD improves the robustness significantly.
PSPNet Clean CW DeepFool BIMl2 PGD10 PGD20 PGD40 PGD100
Standard 73.98 5.94 12.68 12.36 0.96 0.61 0.42 0.27
DDCAT [51] 71.86 53.19 54.08 48.86 24.40 20.90 17.93 12.97
PGD3-AT 71.28 35.21 36.84 32.22 28.79 17.3 9.29 3.95
SegPGD3-AT 71.01 36.30 38.27 35.34 33.52 25.23 19.22 13.04
PGD7-AT 69.85 27.78 28.44 27.87 26.00 24.75 23.86 22.8
SegPGD7-AT 70.21 29.59 30.68 32.55 27.13 25.56 24.29 23.13
Models trained with different methods are tested. The model trained with our SegPGD-AT shows the best performance against transfer-based black-box attacks. The claim also holds when different attack methods are applied to create the adversarial examples.
5 Conclusions
A large number of attack iterations are required to create effective segmentation
adversarial examples. The requirement makes both robustness evaluation and
adversarial training on segmentation challenging. In this work, we propose an
effective and efficient segmentation-specific attack method, dubbed SegPGD. We
first show SegPGD can converge better and faster than the baseline PGD. The ef-
fectiveness and efficiency of SegPGD are verified with comprehensive experiments
on different segmentation architectures and popular datasets. Besides the evalu-
ation, we also demonstrate how to boost the robustness of segmentation models
with SegPGD. Specifically, we apply SegPGD to create segmentation adversar-
ial examples for adversarial training. Given the high effectiveness of the created
adversarial examples, the adversarial training with SegPGD improves the seg-
mentation robustness significantly and achieves the state of the art. However,
there is still much room for improvement in terms of the effectiveness and efficiency
of segmentation adversarial training. We hope this work can serve as a solid
baseline and inspire more work to improve segmentation robustness.
Acknowledgement This work is supported by the UKRI grant: Turing AI Fel-
lowship EP/W002981/1, EPSRC/MURI grant: EP/N019474/1, HKU Startup
Fund, and HKU Seed Fund for Basic Research. We would also like to thank the
Royal Academy of Engineering and FiveAI.
References
1. Andriushchenko, M., Flammarion, N.: Understanding and improving fast adver-
sarial training. NeurIPS (2020)
2. Arnab, A., Miksik, O., Torr, P.H.: On the robustness of semantic segmentation
models to adversarial attacks. In: CVPR (2018)
3. Athalye, A., Carlini, N., Wagner, D.: Obfuscated gradients give a false sense of
security: Circumventing defenses to adversarial examples. In: ICML (2018)
4. Bar, A., Lohdefink, J., Kapoor, N., Varghese, S.J., Huger, F., Schlicht, P., Fin-
gscheidt, T.: The vulnerability of semantic segmentation networks to adversarial
attacks in autonomous driving: Enhancing extensive environment sensing. IEEE
Signal Processing Magazine 38(1), 42–52 (2020)
5. Cai, Q.Z., Du, M., Liu, C., Song, D.: Curriculum adversarial training. IJCAI (2018)
6. Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In:
2017 ieee symposium on security and privacy (sp). pp. 39–57. IEEE (2017)
7. Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017)
8. Cho, S., Jun, T.J., Oh, B., Kim, D.: Dapas: Denoising autoencoder to prevent ad-
versarial attack in semantic segmentation. In: 2020 International Joint Conference
on Neural Networks (IJCNN). pp. 1–8. IEEE (2020)
9. Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R.,
Franke, U., Roth, S., Schiele, B.: The cityscapes dataset for semantic urban scene
understanding. In: CVPR (2016)
10. Daza, L., Pérez, J.C., Arbeláez, P.: Towards robust general medical image segmen-
tation. In: International Conference on Medical Image Computing and Computer-
Assisted Intervention. pp. 3–13. Springer (2021)
11. Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The
pascal visual object classes (voc) challenge. International journal of computer vision
(IJCV) (2010)
12. Full, P.M., Isensee, F., Jäger, P.F., Maier-Hein, K.: Studying robustness of semantic
segmentation under domain shift in cardiac mri. In: International Workshop on
Statistical Atlases and Computational Models of the Heart. pp. 238–249. Springer
(2020)
13. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial
examples. In: ICLR (2015)
14. Gu, J., Wu, B., Tresp, V.: Effective and efficient vote attack on capsule networks.
arXiv preprint arXiv:2102.10055 (2021)
15. Gu, J., Zhao, H., Tresp, V., Torr, P.: Adversarial examples on segmentation models
can be easy to transfer. arXiv preprint arXiv:2111.11368 (2021)
16. Gupta, P., Rahtu, E.: Mlattack: Fooling semantic segmentation networks by
multi-layer attacks. In: German Conference on Pattern Recognition. pp. 401–413.
Springer (2019)
17. Hariharan, B., Arbeláez, P., Girshick, R., Malik, J.: Hypercolumns for object seg-
mentation and fine-grained localization. In: Proceedings of the IEEE conference
on computer vision and pattern recognition. pp. 447–456 (2015)
18. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition.
In: CVPR (2016)
19. He, X., Yang, S., Li, G., Li, H., Chang, H., Yu, Y.: Non-local context encoder:
Robust biomedical image segmentation against adversarial attacks. In: Proceedings
of the AAAI Conference on Artificial Intelligence. vol. 33, pp. 8417–8424 (2019)
20. Hendrik Metzen, J., Chaithanya Kumar, M., Brox, T., Fischer, V.: Universal ad-
versarial perturbations against semantic image segmentation. In: ICCV (2017)
21. Jia, X., Zhang, Y., Wu, B., Ma, K., Wang, J., Cao, X.: Las-at: Adversarial training
with learnable attack strategy. In: Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. pp. 13398–13408 (2022)
22. Jia, X., Zhang, Y., Wu, B., Wang, J., Cao, X.: Boosting fast adversarial training
with learnable adversarial initialization. IEEE Transactions on Image Processing
(2022)
23. Kang, X., Song, B., Du, X., Guizani, M.: Adversarial attacks for image segmenta-
tion on multiple lightweight models. IEEE Access 8, 31359–31370 (2020)
24. Kapoor, N., Bär, A., Varghese, S., Schneider, J.D., Hüger, F., Schlicht, P., Fin-
gscheidt, T.: From a fourier-domain perspective on adversarial examples to a wiener
filter defense for semantic segmentation. In: 2021 International Joint Conference
on Neural Networks (IJCNN). pp. 1–8. IEEE (2021)
25. Klingner, M., Bar, A., Fingscheidt, T.: Improved noise and attack robustness for
semantic segmentation by using multi-task training with self-supervised depth es-
timation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition Workshops. pp. 320–321 (2020)
26. Kurakin, A., Goodfellow, I., Bengio, S., et al.: Adversarial examples in the physical
world. In: ICLR (2016)
27. Lee, H.J., Ro, Y.M.: Adversarially robust multi-sensor fusion model training via
random feature fusion for semantic segmentation. In: 2021 IEEE International
Conference on Image Processing (ICIP). pp. 339–343. IEEE (2021)
28. Li, Y., Li, Y., Lv, Y., Jiang, Y., Xia, S.T.: Hidden backdoor attack against semantic
segmentation models. arXiv preprint arXiv:2103.04038 (2021)
29. Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning
models resistant to adversarial attacks. In: ICLR (2018)
30. Milletari, F., Navab, N., Ahmadi, S.A.: V-net: Fully convolutional neural networks
for volumetric medical image segmentation. In: 3DV (2016)
31. Moosavi-Dezfooli, S.M., Fawzi, A., Frossard, P.: Deepfool: a simple and accurate
method to fool deep neural networks. In: Proceedings of the IEEE conference on
computer vision and pattern recognition. pp. 2574–2582 (2016)
32. Nakka, K.K., Salzmann, M.: Indirect local attacks for context-aware semantic seg-
mentation networks. In: European Conference on Computer Vision. pp. 611–628.
Springer (2020)
33. Nesti, F., Rossolini, G., Nair, S., Biondi, A., Buttazzo, G.: Evaluating the ro-
bustness of semantic segmentation for autonomous driving against real-world ad-
versarial patch attacks. In: Proceedings of the IEEE/CVF Winter Conference on
Applications of Computer Vision. pp. 2280–2289 (2022)
34. Park, G.Y., Lee, S.W.: Reliably fast adversarial training via latent adversarial
perturbation. ICCV (2021)
35. Paschali, M., Conjeti, S., Navarro, F., Navab, N.: Generalizability vs. robustness:
investigating medical imaging networks using adversarial examples. In: Interna-
tional Conference on Medical Image Computing and Computer-Assisted Interven-
tion. pp. 493–501. Springer (2018)
36. Rossolini, G., Nesti, F., D’Amico, G., Nair, S., Biondi, A., Buttazzo, G.: On the
real-world adversarial robustness of real-time semantic segmentation models for
autonomous driving. arXiv preprint arXiv:2201.01850 (2022)
37. Shafahi, A., Najibi, M., Ghiasi, A., Xu, Z., Dickerson, J., Studer, C., Davis, L.S.,
Taylor, G., Goldstein, T.: Adversarial training for free! NeurIPS (2019)
38. Shen, G., Mao, C., Yang, J., Ray, B.: Advspade: Realistic unrestricted attacks for
semantic segmentation. arXiv preprint arXiv:1910.02354 (2019)
39. Sriramanan, G., Addepalli, S., Baburaj, A., et al.: Towards efficient and effective
adversarial training. NeurIPS (2021)
40. Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I.,
Fergus, R.: Intriguing properties of neural networks. In: ICLR (2014)
41. Tramèr, F., Kurakin, A., Papernot, N., Goodfellow, I., Boneh, D., McDaniel, P.:
Ensemble adversarial training: Attacks and defenses. ICLR (2018)
42. Tran, H.D., Pal, N., Musau, P., Lopez, D.M., Hamilton, N., Yang, X., Bak, S.,
Johnson, T.T.: Robustness verification of semantic segmentation neural networks
using relaxed reachability. In: International Conference on Computer Aided Veri-
fication. pp. 263–286. Springer (2021)
43. Vivek, B., Babu, R.V.: Single-step adversarial training with dropout scheduling.
In: CVPR (2020)
44. Vivek, B., Mopuri, K.R., Babu, R.V.: Gray-box adversarial training. In: ECCV.
pp. 203–218 (2018)
45. Wang, D., Ju, A., Shelhamer, E., Wagner, D., Darrell, T.: Fighting gradients
with gradients: Dynamic defenses against adversarial attacks. arXiv preprint
arXiv:2105.08714 (2021)
46. Wang, J., Zhang, H.: Bilateral adversarial training: Towards fast training of more
robust models against adversarial attacks. In: ICCV (2019)
47. Wong, E., Rice, L., Kolter, J.Z.: Fast is better than free: Revisiting adversarial
training. ICLR (2020)
48. Wu, B., Pan, H., Shen, L., Gu, J., Zhao, S., Li, Z., Cai, D., He, X., Liu, W.:
Attacking adversarial attacks as a defense. arXiv preprint arXiv:2106.04938 (2021)
49. Xiao, C., Deng, R., Li, B., Yu, F., Liu, M., Song, D.: Characterizing adversarial
examples based on spatial consistency information for semantic segmentation. In:
Proceedings of the European Conference on Computer Vision (ECCV). pp. 217–
234 (2018)
50. Xie, C., Wang, J., Zhang, Z., Zhou, Y., Xie, L., Yuille, A.: Adversarial examples
for semantic segmentation and object detection. In: ICCV (2017)
51. Xu, X., Zhao, H., Jia, J.: Dynamic divide-and-conquer adversarial training for
robust semantic segmentation. In: ICCV (2021)
52. Ye, N., Li, Q., Zhou, X.Y., Zhu, Z.: Amata: An annealing mechanism for adversarial
training acceleration. AAAI (2021)
53. Yu, Y., Lee, H.J., Kim, B.C., Kim, J.U., Ro, Y.M.: Towards robust training of
multi-sensor data fusion network against adversarial examples in semantic seg-
mentation. In: ICASSP 2021-2021 IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP). pp. 4710–4714. IEEE (2021)
54. Zhang, D., Zhang, T., Lu, Y., Zhu, Z., Dong, B.: You only propagate once: Accel-
erating adversarial training via maximal principle. NeurIPS (2019)
55. Zhang, H., Wang, J.: Defense against adversarial attacks using feature scattering-
based adversarial training. NeurIPS (2019)
56. Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In:
CVPR (2017)
57. Zheng, H., Zhang, Z., Gu, J., Lee, H., Prakash, A.: Efficient adversarial training
with transferable adversarial examples. In: CVPR (2020)
Supplementary Material
[Appendix figure: mIoU (in %) vs. the number of attack iterations (3 to 100) for PGD, SegPGD (ours), MLAttack, and DAG. (a) PSPNet trained with PGD3-AT on VOC; (b) DeepLabV3 trained with PGD3-AT on VOC.]
AT on VOC (mIoU in %)
Attack Method   PGD3-AT   SegPGD3-AT   PGD7-AT   SegPGD7-AT
PGD100          13.89     14.49        16.97     19.23
SegPGD100        9.67     10.34        16.20     17.03

AT on Cityscapes (mIoU in %)
Attack Method   PGD3-AT   SegPGD3-AT   PGD7-AT   SegPGD7-AT
PGD100           3.95     13.04        22.80     23.13
SegPGD100        1.91      8.86        17.03     22.54
We also compare our SegPGD-AT with the recently proposed segmentation adversarial training method DDCAT. We load the pre-trained DDCAT models from their released codebase and evaluate them under strong attacks. We find that their models defend poorly against strong attacks. For a fair comparison, we compare the scores of our SegPGD3-AT with those of their models, since three steps are applied to generate the adversarial examples in both cases. As shown in Tab. 5, our model trained with SegPGD3-AT outperforms DDCAT by a large margin under strong attacks.
We train PSPNet and DeepLabV3 on the same dataset. Then, we create adversarial examples on PSPNet with PGD100 or SegPGD100 and test the robustness of DeepLabV3 on these adversarial examples. The results are reported in Tab. 6. We test DeepLabV3 models trained with different methods. The model trained with our SegPGD3-AT shows the best performance against the transfer-based black-box attacks. The claim also holds when different attack methods are applied to create the adversarial examples.
Table 6: Evaluation under Black-box Attacks. The model trained with our SegPGD3-based adversarial training is more robust than models trained with other methods on different datasets and under different attacks.