Adversarial Attack Estimation and Mitigation in Semantic Segmentation
Project Based Learning - III
INTRODUCTION AND BACKGROUND
Deep Learning Vulnerabilities: Deep neural networks (DNNs) have become powerful tools for tasks such as semantic segmentation, object detection, and face parsing. Semantic segmentation assigns a label to each pixel in an image, making it essential in fields such as autonomous driving and medical imaging.
These models are, however, vulnerable to small, often imperceptible changes in the input data, termed
adversarial perturbations. These perturbations trick the model into making incorrect predictions without
affecting human perception of the image.
Since segmentation is critical for applications like autonomous driving, where safety is paramount, it’s crucial
to make models resistant to these attacks. The study explores and develops new adversarial attack
algorithms and proposes a detection mechanism to address these vulnerabilities.
PROBLEM DEFINITION AND OBJECTIVE
The problem focuses on the potential security risks due to adversarial attacks in segmentation tasks. The
objective is to design a novel adversarial attack for semantic segmentation and propose a defense
mechanism that mitigates the effects of such attacks.
The study aims to create a new attack algorithm, extend it to face parsing tasks, evaluate its performance, and propose a min-max-based attack-detection mechanism.
LITERATURE SURVEY
Semantic Segmentation Techniques: Early machine learning methods like Support Vector Machines (SVM) and
clustering were used, but deep learning has since introduced advanced architectures like U-Net, Mask R-CNN, and
DeepLab for more accurate segmentation.
Face Parsing: A specialized type of segmentation where each pixel of a face image is classified, which is used in
applications like facial analysis and medical imaging.
Adversarial Attacks: Existing methods of adversarial attacks on image classifiers are adapted here to semantic
segmentation. The paper discusses black-box and white-box attacks, targeted and untargeted attacks, and one-shot vs. iterative attacks.
Defense Mechanisms: Several methods for defending against adversarial attacks exist, like denoising
autoencoders that remove noise from adversarial examples, but fewer methods are designed specifically for
segmentation models.
PROPOSED METHODOLOGY
Internal Wasserstein Distance (IWD): Measures the similarity between the original and perturbed images, aiming to keep the adversarial image close to the original. IWD captures the internal distribution of image patches and computes a Wasserstein distance between these distributions, giving a spatial measure of similarity.
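For reference, this term can be expressed with the Wasserstein-1 (earth mover's) distance between the internal patch distributions of the clean image x and the perturbed image x_adv; the patch size and the space in which patches are compared are not fixed here:

    \mathcal{L}_{\mathrm{IWD}}(x, x_{\mathrm{adv}}) = W_1\big(P_{\mathrm{patch}}(x),\, P_{\mathrm{patch}}(x_{\mathrm{adv}})\big),
    \qquad W_1(\mu, \nu) = \inf_{\gamma \in \Pi(\mu, \nu)} \mathbb{E}_{(u, v) \sim \gamma}\, \lVert u - v \rVert

where P_patch(.) denotes the empirical distribution of patches drawn from an image and Pi(mu, nu) the set of couplings whose marginals are mu and nu.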
Triplet Loss (TL): Triplet Loss operates on a set of images termed an anchor, positive, and negative. Here, the
original image is the anchor, the adversarial example is the positive, and a dissimilar image (based on a low IoU
score) is the negative. This loss maximizes the similarity between anchor and positive and minimizes it between
anchor and negative.
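A minimal PyTorch sketch of this term is given below, assuming the anchor, positive, and negative are feature embeddings produced by some encoder; the margin value and the use of the library's built-in triplet margin loss are assumptions:

    import torch.nn.functional as F

    def triplet_loss(feat_orig, feat_adv, feat_neg, margin=1.0):
        # feat_orig: features of the original image (anchor)
        # feat_adv:  features of the adversarial example (positive)
        # feat_neg:  features of a dissimilar, low-IoU image (negative)
        # Computes max(d(anchor, positive) - d(anchor, negative) + margin, 0),
        # pulling the adversarial example toward the original and pushing it
        # away from the dissimilar image.
        return F.triplet_margin_loss(feat_orig, feat_adv, feat_neg, margin=margin)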
Note - Attack Process: By combining IWD and TL with cross-entropy loss, the IWD-TL attack algorithm generates adversarial examples that fool the segmentation model while remaining visually indistinguishable from the original images.
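A minimal PGD-style sketch of how such a combined objective could drive an iterative attack is shown below; the step size, perturbation budget, loss weights, and the exact way the three terms are combined are assumptions, not the study's precise formulation:

    import torch
    import torch.nn.functional as F

    def iwd_tl_attack(model, x, y, iwd_loss, tl_loss, x_neg,
                      eps=8/255, alpha=2/255, steps=10, lam_iwd=1.0, lam_tl=1.0):
        # model:    segmentation network returning per-pixel class logits
        # x, y:     clean image batch and ground-truth label map
        # iwd_loss: callable implementing the Internal Wasserstein Distance term
        # tl_loss:  callable implementing the triplet term (anchor, positive, negative)
        # x_neg:    dissimilar image batch used as the triplet negative
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            logits = model(x_adv)
            # Maximize segmentation error while keeping x_adv close to x.
            loss = (F.cross_entropy(logits, y)
                    - lam_iwd * iwd_loss(x, x_adv)
                    - lam_tl * tl_loss(x, x_adv, x_neg))
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv.detach() + alpha * grad.sign()
            # Project back into the epsilon-ball around x and the valid pixel range.
            x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0.0, 1.0)
        return x_adv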
MIN-MAX ATTACK-DETECTION MECHANISM
The defense mechanism employs a game-theoretic approach where the attacker tries to maximize the
classification loss (misleading the model), and the detector minimizes the distance between original and
perturbed images.
The detector architecture uses an encoder that extracts features from the input. These features are
compared between the original and perturbed images, helping the detector identify adversarial
samples.
This min-max game aims to make the detector robust by minimizing the effect of adversarial examples.
INTERNAL WASSERSTEIN DISTANCE AND TRIPLET
LOSS ALGORITHM
The IWD-TL Attack Algorithm is designed to create adversarial examples specifically for semantic
segmentation models. It uses a combination of Internal Wasserstein Distance (IWD), Triplet Loss (TL),
and Cross-Entropy Loss to create perturbations that are visually imperceptible but significantly affect
model predictions.
Min-Max Attack-Detection Mechanism
Attacker Objective
The attacker's objective includes a loss term on the similarity between the original and adversarial features, chosen so that the detector cannot distinguish between the two.
Detector Objective
The detector uses an encoder network to extract feature representations from the input images and minimizes the Internal Wasserstein Distance and Triplet Loss on these features. It attempts to distinguish original images from adversarial ones by analyzing differences in their feature representations.
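A minimal sketch of such a detector is given below; the encoder architecture, feature dimension, and decision threshold are illustrative assumptions, and in the study the encoder would be trained with the IWD and Triplet losses described above:

    import torch
    import torch.nn as nn

    class AdversarialDetector(nn.Module):
        # Encodes a reference image and a test image and flags the test image
        # as adversarial when their feature representations differ too much.
        def __init__(self, feat_dim=128):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, feat_dim),
            )

        def forward(self, x_ref, x_test, threshold=0.5):
            f_ref, f_test = self.encoder(x_ref), self.encoder(x_test)
            dist = torch.norm(f_ref - f_test, dim=1)
            return dist > threshold  # True -> flagged as adversarial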
Min-Max Game
1. The attacker maximizes classification error and minimizes detection accuracy.
2. The detector minimizes the adversarial impact on the model by learning to distinguish adversarial examples; one way to formalize this game is sketched below.
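A compact way to write this game is the following; the l-infinity budget epsilon, the weight lambda, and the exact form of the detection term are assumptions rather than the study's stated objective:

    \min_{\phi}\; \max_{\lVert \delta \rVert_\infty \le \epsilon}\;
    \Big[\, \mathcal{L}_{\mathrm{CE}}\big(f(x + \delta),\, y\big) \;-\; \lambda\, \mathcal{L}_{\mathrm{det}}\big(E_{\phi}(x),\, E_{\phi}(x + \delta)\big) \Big]

Here f is the segmentation model, E_phi the detector's encoder, delta the adversarial perturbation, and L_det a detection loss built from the Internal Wasserstein Distance and Triplet Loss on the encoder features: the inner maximization is the attacker (raising classification error while evading detection), and the outer minimization trains the detector.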
EXPERIMENTAL SET-UP
Datasets
Drone Dataset: Used for urban scene understanding in drone-based applications.
iSAID Dataset: A large-scale dataset of aerial images for semantic segmentation and object detection.
Face Parsing Datasets: LaPa and iBugMask, both annotated for face parsing tasks.
Architectures
UNet, UNet++, and MAnet architectures are chosen for the experiments due to their success in
segmentation tasks.
Results and Analysis
Note
Mean Intersection over Union (mIoU): Measures the accuracy of predicted segmentation against the ground truth,
a key metric in segmentation tasks.
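A reference computation of the metric is sketched below; it follows the common convention of skipping classes absent from both the prediction and the ground truth, which may differ from the study's exact protocol:

    import numpy as np

    def mean_iou(pred, target, num_classes):
        # pred, target: integer label maps of the same shape
        ious = []
        for c in range(num_classes):
            pred_c, target_c = (pred == c), (target == c)
            union = np.logical_or(pred_c, target_c).sum()
            if union == 0:  # class absent from both -> skip
                continue
            intersection = np.logical_and(pred_c, target_c).sum()
            ious.append(intersection / union)
        return float(np.mean(ious))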
The study presents results showing that the IWD-TL attack substantially lowers mIoU and pixel accuracy across the datasets, indicating that it effectively causes the models to mislabel the images.
For example, on the Drone dataset using the UNet architecture, mIoU dropped from 0.377 to 0.043 as noise was
increased. Pixel accuracy also dropped from 86.5% to 18.8%.