Computer and Modernization (计算机与现代化) ›› 2023, Vol. 0 ›› Issue (02): 40-49.

• Artificial Intelligence •

A Review of Deep Neural Networks Combined with Attention Mechanism

  1. (College of Energy and Electric Engineering, Hohai University, Nanjing 211100, Jiangsu, China)
  • Online: 2023-04-10  Published: 2023-04-10
  • About the authors: HUANGFU Xiaoying (1997—), female, born in Anyang, Henan, M.S. candidate; research interests: human action recognition, machine vision, deep learning; E-mail: 201306060013@hhu.edu.cn. Corresponding author: QIAN Huimin (1980—), female, born in Yixing, Jiangsu, associate professor, Ph.D.; research interests: computer vision, machine learning; E-mail: qhmin0316@163.com. HUANG Min (1998—), female, born in Qidong, Jiangsu, M.S. candidate; research interests: deep learning, human action recognition in video; E-mail: 2458237010@qq.com.
  • Funding:
    Supported by the National Natural Science Foundation of China (61573001)


Abstract: The attention mechanism has become one of the research hotspots for improving the learning ability of deep neural networks. In view of the wide interest it has attracted, this paper gives a comprehensive analysis of attention mechanisms in deep neural networks from three aspects: the classification of attention mechanisms, the ways they are combined with deep neural networks, and their specific applications in natural language processing and computer vision. Specifically, the advantages and disadvantages of three mechanisms, soft attention, hard attention, and self-attention, are analyzed and compared; the common ways of incorporating attention into recurrent neural networks and convolutional neural networks are discussed, together with representative model structures for each; the applications of attention are then illustrated with examples from natural language processing and computer vision; finally, development trends of the attention mechanism are analyzed, in the hope of providing clues and directions for future research.
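To make the distinction concrete: in soft attention the output is a weighted average over all inputs (weights sum to 1 and are differentiable), whereas hard attention selects a single input, and self-attention computes the weights from query-key similarity within one sequence. The following is a minimal, pure-Python sketch of scaled dot-product self-attention in the style of soft attention; it is an illustration only, not a model from the surveyed papers, and all function names are our own.

```python
import math

def softmax(xs):
    """Numerically stable softmax: turns scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(queries, keys, values):
    """Scaled dot-product attention: each output vector is a convex
    combination of the value vectors, weighted by query-key similarity."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity scores, scaled by sqrt(d_k) to keep magnitudes stable
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        # Soft attention: differentiable weights over *all* positions
        weights = softmax(scores)
        out = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        outputs.append(out)
    return outputs
```

Hard attention would instead pick the single value with the largest score (e.g. `values[scores.index(max(scores))]`), which is non-differentiable and is therefore usually trained with sampling-based methods rather than plain backpropagation.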

Key words: attention mechanisms, deep learning, neural networks, attention models
