Paper ID | MLR-APPL-IP-5.3
Paper Title | FABRICATE-VANISH: AN EFFECTIVE AND TRANSFERABLE BLACK-BOX ADVERSARIAL ATTACK INCORPORATING FEATURE DISTORTION
Authors | Yantao Lu, Syracuse University, United States; Xueying Du, Bingkun Sun, Northwestern Polytechnical University, China; Haining Ren, Purdue University, United States; Senem Velipasalar, Syracuse University, United States
Session | MLR-APPL-IP-5: Machine learning for image processing 5
Location | Area E
Session Time | Tuesday, 21 September, 13:30 - 15:00
Presentation Time | Tuesday, 21 September, 13:30 - 15:00
Presentation | Poster
Topic | Applications of Machine Learning: Machine learning for image processing
IEEE Xplore Open Preview | Available in IEEE Xplore
Abstract | Adversarial examples have emerged as an increasingly severe threat to deep neural networks. Recent works have revealed that these malicious samples can transfer across different neural networks and effectively attack other models. State-of-the-art methodologies leverage the Fast Gradient Sign Method (FGSM) to generate obstructing textures that cause neural networks to make incorrect inferences. However, over-reliance on task-specific loss functions makes the resulting adversarial examples less transferable across networks. Moreover, recent de-noising-based adaptive defences perform well against the aforementioned attacks. To achieve better transferability and attack effectiveness, we therefore propose a novel attack, referred to as the Fabricate-Vanish (FV) attack, which erases benign representations and generates obstructing textures simultaneously. The proposed FV attack treats adversarial example transferability as a latent contribution of each layer of a deep neural network, and maximizes attack performance by balancing transferability against the task-specific loss. Our experimental results on ImageNet show that the proposed FV attack achieves the best attack performance and better transferability, degrading classifier accuracy by 3.8% more on average than state-of-the-art attacks.
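The Fast Gradient Sign Method mentioned in the abstract perturbs an input in the direction of the sign of the loss gradient with respect to that input. Below is a minimal sketch of this baseline idea only (not the paper's FV attack), using a toy logistic-regression "classifier" so the gradient can be written in closed form; all names and the toy model are illustrative assumptions, not the authors' setup.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(x, y, w, b):
    """Binary cross-entropy loss of the toy model on a single example."""
    p = sigmoid(w @ x + b)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def fgsm_perturb(x, y, w, b, eps):
    """FGSM step: x_adv = x + eps * sign(dL/dx).

    For BCE with a sigmoid output, dL/dx = (p - y) * w in closed form,
    so no autodiff framework is needed for this toy sketch.
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
w = rng.normal(size=8)          # toy "network" weights (assumption)
b = 0.1
x = rng.normal(size=8)          # clean input
y = 1.0                         # true label

x_adv = fgsm_perturb(x, y, w, b, eps=0.1)
```

Because the perturbation follows the sign of the loss gradient under an L-infinity budget of `eps`, the adversarial point incurs a strictly higher loss than the clean input while staying within `eps` of it per coordinate.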