TY - JOUR
T1 - Mitigating the impact of adversarial attacks in very deep networks
AU - Hassanin, Mohammed
AU - Radwan, Ibrahim
AU - Moustafa, Nour
AU - Tahtali, Murat
AU - Kumar, Neeraj
N1 - Publisher Copyright:
© 2021 Elsevier B.V.
PY - 2021/7
Y1 - 2021/7
N2 - Deep Neural Network (DNN) models are vulnerable to security attacks, with adversaries usually employing sophisticated techniques to exploit their structures. Data poisoning-enabled perturbation attacks are complex adversarial attacks that inject false data into models. They negatively impact the learning process, with no protection afforded by deeper networks, as they degrade a model’s accuracy and convergence rate. In this paper, we propose an attack-agnostic defense method for mitigating their influence. In it, a Defensive Feature Layer (DFL) is integrated with a well-known DNN architecture to help neutralize the effects of illegitimate perturbation samples in the feature space. To boost the robustness and trustworthiness of this method in correctly classifying attacked input samples, we regularize the hidden space of a trained model with a discriminative loss function called Polarized Contrastive Loss (PCL), which improves discrimination among samples of different classes while maintaining the resemblance of those within the same class. We also integrate the DFL and PCL in a compact model for defending against data poisoning attacks. The method is trained and tested on the CIFAR-10 and MNIST datasets under data poisoning-enabled perturbation attacks, with the experimental results showing that it outperforms recent peer techniques.
AB - Deep Neural Network (DNN) models are vulnerable to security attacks, with adversaries usually employing sophisticated techniques to exploit their structures. Data poisoning-enabled perturbation attacks are complex adversarial attacks that inject false data into models. They negatively impact the learning process, with no protection afforded by deeper networks, as they degrade a model’s accuracy and convergence rate. In this paper, we propose an attack-agnostic defense method for mitigating their influence. In it, a Defensive Feature Layer (DFL) is integrated with a well-known DNN architecture to help neutralize the effects of illegitimate perturbation samples in the feature space. To boost the robustness and trustworthiness of this method in correctly classifying attacked input samples, we regularize the hidden space of a trained model with a discriminative loss function called Polarized Contrastive Loss (PCL), which improves discrimination among samples of different classes while maintaining the resemblance of those within the same class. We also integrate the DFL and PCL in a compact model for defending against data poisoning attacks. The method is trained and tested on the CIFAR-10 and MNIST datasets under data poisoning-enabled perturbation attacks, with the experimental results showing that it outperforms recent peer techniques.
KW - Deep neural network
KW - Adversarial attack
KW - Data poisoning
KW - Defensive feature layer
KW - Polarized contrastive loss
KW - Regularizing hidden space
UR - http://www.scopus.com/inward/record.url?scp=85102969720&partnerID=8YFLogxK
U2 - 10.1016/j.asoc.2021.107231
DO - 10.1016/j.asoc.2021.107231
M3 - Article
SN - 1568-4946
VL - 105
SP - 1
EP - 8
JO - Applied Soft Computing
JF - Applied Soft Computing
M1 - 107231
ER -