Paper ID | MLR-APPL-IP-5.5 | ||
Paper Title | SIMTROJAN: STEALTHY BACKDOOR ATTACK | ||
Authors | Yankun Ren, Longfei Li, Jun Zhou, Ant Group, China | ||
Session | MLR-APPL-IP-5: Machine learning for image processing 5 | ||
Location | Area E | ||
Session Time: | Tuesday, 21 September, 13:30 - 15:00 | ||
Presentation Time: | Tuesday, 21 September, 13:30 - 15:00 | ||
Presentation | Poster | ||
Topic | Applications of Machine Learning: Machine learning for image processing | ||
IEEE Xplore Open Preview | Click here to view in IEEE Xplore | ||
Abstract | Recent researches indicate deep learning models are vulnerable to adversarial attacks. Backdoor attack, also called trojan attack, is a variant of adversarial attacks. An malicious attacker can inject backdoor to models in training phase. As a result, the backdoor model performs normally on clean samples and can be triggered by a backdoor pattern to recognize backdoor samples as a wrong target label specified by the attacker. However, the vanilla backdoor attack method causes a measurable difference between clean and backdoor samples in latent space. Several state-of-the-art defense methods utilize this to identify backdoor samples. In this paper, we propose a novel backdoor attack method called SimTrojan, which aims to inject backdoor in models stealthily. Specifically, SimTrojan makes clean and backdoor samples have indistinguishable representations in latent space to evade current defense methods. Experiments demonstrate that SimTrojan achieves a high attack success rate and is undetectable by state-of-the-art defense methods. The study suggests the urgency of building more effective defense methods. |