在 Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey[14] 这篇论文中总结了 12 种攻击方法,如下图所示:
经过我的调研,在论文 Adversarial Examples for Semantic Segmentation and Object Detection[4] 的启发下,我认为,既然对抗攻击是对抗样本的生成任务,而生成任务又是现在发展非常迅速的一个领域,我们可以把一些生成模型迁移到这个任务上来。
比如,现在非常热门的对抗生成网络 GAN 是生成任务最有效的模型之一,我认为可以借用这种对抗的思想生成对抗样本: 一个专门向原数据中加噪声的网络和一个试图根据对抗样本完成分类任务的网络,两个网络就像 GAN 里面的生成器和鉴别器一样对抗学习,最后会收敛于加噪声的网络生成的对抗样本足以迷惑分类网络,这样生成的对抗样本或许会比前文所述的方法效果更好。
[1] N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, A.Swami, The Limitations of Deep Learning in Adversarial Settings, In Proceedings of IEEE European Symposium on Security and Privacy, 2016.
[2] J. Su, D. V. Vargas, S. Kouichi, One pixel attack for fooling deep neural networks, arXiv preprint arXiv:1710.08864, 2017.
[3] S. Moosavi-Dezfooli, A. Fawzi, P. Frossard, DeepFool: a simple and accurate method to fool deep neural networks, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2574-2582, 2016.
[4] C. Xie, J. Wang, Z. Zhang, Y. Zhou, L. Xie, and A. Yuille, Adversarial Examples for Semantic Segmentation and Object Detection, arXiv preprint arXiv:1703.08603, 2017.
[5] Dai, Hanjun, Hui Li, Tian Tian, Xin Huang, Lin Wang, Jun Zhu, and Le Song. "Adversarial Attack on Graph Structured Data." In International Conference on Machine Learning (ICML), vol. 2018. 2018.
[6] Zu?gner, Daniel, Amir Akbarnejad, and Stephan Gu?nnemann. "Adversarial attacks on neural networks for graph data." In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2847-2856. ACM, 2018.
[7] Ying R, You J, Morris C, et al. Hierarchical graph representation learning with differentiable pooling[J]. CoRR, 2018
[8] Jia R, Liang P. Adversarial examples for evaluating reading comprehension systems[J]. arXiv preprint arXiv:1707.07328, 2017.
[9] Y. Lin, Z. Hong, Y. Liao, M. Shih, M. Liu, and M. Sun, Tactics of Adversarial Attack on Deep Reinforcement Learning Agents, arXiv preprint arXiv:1703.06748, 2017.
[10] Papernot N, McDaniel P, Swami A, et al. Crafting adversarial input sequences for recurrent neural networks[C]//Military Communications Conference, MILCOM 2016-2016 IEEE. IEEE, 2016:49-54
[11] Carlini N, Wagner D. Audio adversarial examples: Targeted attacks on speech-to-text[J]. arXiv preprint arXiv:1801.01944, 2018.
[12] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, R. Fergus, Intriguing properties of neural networks, arXiv preprint arXiv:1312.6199, 2014.
[13] I. J. Goodfellow, J. Shlens, C. Szegedy, Explaining and Harnessing Adversarial Examples, arXiv preprint arXiv:1412.6572, 2015.
[14] Akhtar N, Mian A. Threat of adversarial attacks on deep learning in computer vision: A survey[J]. arXiv preprint arXiv:1801.00553, 2018
[15] Lu S, Yu L, Zhang W, et al. CoT: Cooperative Training for Generative Modeling[J]. arXiv preprint arXiv:1804.03782, 2018.
[16] Kingma D P, Dhariwal P. Glow: Generative flow with invertible 1x1 convolutions[J]. arXiv preprint arXiv:1807.03039, 2018.