Computer and Modernization ›› 2021, Vol. 0 ›› Issue (04): 117-121.

• Information Security •

Black Box Adversarial Attack Algorithm Based on Deep Reinforcement Learning

  1. (College of Computer and Information, Hohai University, Nanjing 211100, Jiangsu, China)
  • Online: 2021-04-22  Published: 2021-04-25
  • About the authors: LI Meng (1995—), male, born in Chongqing, master's student; research interests: machine learning, adversarial attacks; e-mail: limeng1995hhu@163.com. Corresponding author: HAN Lixin (1967—), male, born in Nanjing, Jiangsu, professor, doctoral supervisor, Ph.D.; research interests: information retrieval, data mining, pattern recognition; e-mail: lixinhan2002@aliyun.com.


Abstract: To address the problem of black-box adversarial attacks in the field of image recognition, a black-box adversarial attack algorithm is proposed based on the DDQN framework and the Dueling network structure from reinforcement learning. The agent generates adversarial examples by imitating the way a human adjusts an image, interacts with the attacked model to obtain misclassification results, and receives a reward computed from the structural similarity between the clean sample and the adversarial sample. During the attack, only the label output of the attacked model is used. Experimental results show that the success rate of attacking four deep neural network models trained on the CIFAR10 and CIFAR100 datasets exceeds 90%, and that the quality of the generated adversarial examples is close to that of the white-box attack algorithm FGSM, with a higher success rate.
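The abstract names two reusable building blocks: the Dueling Q-value aggregation and a reward that couples misclassification with the structural similarity (SSIM) between the clean and adversarial samples. A minimal NumPy sketch of both is given below; the function names, the single-window global SSIM, and the zero reward on a failed attack are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def ssim_global(x, y, data_range=1.0):
    """Simplified whole-image SSIM (single window). Standard SSIM is
    computed over local windows and averaged; this global variant keeps
    the sketch self-contained."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def dueling_q(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    return state_value + advantages - advantages.mean()

def attack_reward(clean, adv, true_label, predicted_label):
    """Reward the agent only when the attacked model mislabels the
    adversarial example; higher structural similarity to the clean
    sample earns a higher reward."""
    if predicted_label != true_label:
        return float(ssim_global(clean, adv))
    return 0.0  # no reward while the model still classifies correctly
```

In the paper's setting, `advantages` would come from the Dueling network's advantage head and `predicted_label` from querying the attacked model, whose label output is the only information the black-box attack uses.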

Key words: adversarial examples, black-box attacks, deep learning, reinforcement learning