Enhancing adversarial example transferability with an intermediate level attack
Research output: Contribution to journal › Conference article › Research › peer-review
Standard
Enhancing adversarial example transferability with an intermediate level attack. / Huang, Qian; Katsman, Isay; Gu, Zeqi; He, Horace; Belongie, Serge; Lim, Ser Nam.
In: Proceedings of the IEEE International Conference on Computer Vision, 10.2019, p. 4732-4741.
RIS
TY - GEN
T1 - Enhancing adversarial example transferability with an intermediate level attack
AU - Huang, Qian
AU - Katsman, Isay
AU - Gu, Zeqi
AU - He, Horace
AU - Belongie, Serge
AU - Lim, Ser Nam
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
N2 - Neural networks are vulnerable to adversarial examples, malicious inputs crafted to fool trained models. Adversarial examples often exhibit black-box transfer, meaning that adversarial examples for one model can fool another model. However, adversarial examples are typically overfit to exploit the particular architecture and feature representation of a source model, resulting in sub-optimal black-box transfer attacks to other target models. We introduce the Intermediate Level Attack (ILA), which attempts to fine-tune an existing adversarial example for greater black-box transferability by increasing its perturbation on a pre-specified layer of the source model, improving upon state-of-the-art methods. We show that we can select a layer of the source model to perturb without any knowledge of the target models while achieving high transferability. Additionally, we provide some explanatory insights regarding our method and the effect of optimizing for adversarial examples using intermediate feature maps.
AB - Neural networks are vulnerable to adversarial examples, malicious inputs crafted to fool trained models. Adversarial examples often exhibit black-box transfer, meaning that adversarial examples for one model can fool another model. However, adversarial examples are typically overfit to exploit the particular architecture and feature representation of a source model, resulting in sub-optimal black-box transfer attacks to other target models. We introduce the Intermediate Level Attack (ILA), which attempts to fine-tune an existing adversarial example for greater black-box transferability by increasing its perturbation on a pre-specified layer of the source model, improving upon state-of-the-art methods. We show that we can select a layer of the source model to perturb without any knowledge of the target models while achieving high transferability. Additionally, we provide some explanatory insights regarding our method and the effect of optimizing for adversarial examples using intermediate feature maps.
UR - http://www.scopus.com/inward/record.url?scp=85081908467&partnerID=8YFLogxK
U2 - 10.1109/ICCV.2019.00483
DO - 10.1109/ICCV.2019.00483
M3 - Conference article
AN - SCOPUS:85081908467
SP - 4732
EP - 4741
JO - Proceedings of the IEEE International Conference on Computer Vision
JF - Proceedings of the IEEE International Conference on Computer Vision
SN - 1550-5499
T2 - 17th IEEE/CVF International Conference on Computer Vision, ICCV 2019
Y2 - 27 October 2019 through 2 November 2019
ER -
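The abstract outlines a concrete procedure: take an adversarial example produced by an existing attack and fine-tune it so that the perturbation it induces at a pre-specified intermediate layer of the source model grows along the direction the existing attack established. Below is a minimal sketch of that projection objective in PyTorch. The function name ila_projection_attack, the hook-based feature capture, and the hyperparameter defaults are illustrative assumptions, not the authors' released code.

import torch

def ila_projection_attack(model, layer, x, x_adv, eps=0.03,
                          step_size=0.008, num_steps=10):
    # Capture the output of the chosen intermediate layer via a forward hook.
    feats = {}
    handle = layer.register_forward_hook(
        lambda _m, _i, out: feats.__setitem__("out", out))

    # Reference direction: the mid-layer perturbation of the existing
    # adversarial example x_adv relative to the clean input x.
    with torch.no_grad():
        model(x)
        f_clean = feats["out"].clone()
        model(x_adv)
        f_ref = feats["out"].clone()
    direction = (f_ref - f_clean).flatten(1)

    # Fine-tune: maximize the projection of the current mid-layer
    # perturbation onto the reference direction, while staying inside the
    # L-infinity ball of radius eps around the clean input x.
    x_new = x_adv.clone().detach()
    for _ in range(num_steps):
        x_new.requires_grad_(True)
        model(x_new)
        delta = (feats["out"] - f_clean).flatten(1)
        loss = (delta * direction).sum()
        grad, = torch.autograd.grad(loss, x_new)
        x_new = x_new.detach() + step_size * grad.sign()
        x_new = torch.max(torch.min(x_new, x + eps), x - eps).clamp(0, 1)

    handle.remove()
    return x_new

In use, x_adv would come from a baseline attack such as I-FGSM on the source model, and layer would be picked once per source architecture; the paper's claim is that this layer can be selected without any knowledge of the target models.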