Computer and Modernization, 2023, Vol. 0, Issue (02): 40-49.
Online: 2023-04-10
Published: 2023-04-10
HUANGFU Xiao-ying, QIAN Hui-min, HUANG Min. A Review of Deep Neural Networks Combined with Attention Mechanism[J]. Computer and Modernization, 2023, 0(02): 40-49.
[1] HE Sida, CHEN Pinghua. Intent-based Lightweight Self-Attention Network for Sequential Recommendation[J]. Computer and Modernization, 2024, 0(12): 1-9.
[2] LIU Baobao, YANG Jingjing, TAO Lu, WANG Heying. DSMSC Based on Attention Mechanism for Remote Sensing Image Scene Classification[J]. Computer and Modernization, 2024, 0(12): 72-77.
[3] QI Xian, LIU Daming, CHANG Jiaxin. Multi-view 3D Reconstruction Based on Improved Self-attention Mechanism[J]. Computer and Modernization, 2024, 0(11): 106-112.
[4] CHEN Kai, LI Yiting, QUAN Huafeng. A River Discarded Bottles Detection Method Based on Improved YOLOv8[J]. Computer and Modernization, 2024, 0(11): 113-120.
[5] YANG Jun, HU Wei, ZHU Wenfu. Visual SLAM Loop Closure Detection Algorithm Based on Improved MobileNetV3[J]. Computer and Modernization, 2024, 0(10): 21-26.
[6] WANG Yingying, HAO Xiao. Fine-grained Image Classification Based on Res2Net and Recursive Gated Convolution[J]. Computer and Modernization, 2024, 0(10): 74-79.
[7] SHI Xingyu, LI Qiang, ZHUANG Li, LIANG Yi, WANG Qiulin, CHEN Kai, WU Chenzhou, CHANG Sheng. Object Detection Models Distillation Technique for Industrial Deployment[J]. Computer and Modernization, 2024, 0(10): 93-99.
[8] ZHANG Ze, ZHANG Jianquan, ZHOU Guopeng. Camera Module Defect Detection Based on Improved YOLOv8s[J]. Computer and Modernization, 2024, 0(09): 107-113.
[9] CHENG Yazi, LEI Liang, CHEN Han, ZHAO Yiran. Multi-scale Depth Fusion Monocular Depth Estimation Based on Transposed Attention[J]. Computer and Modernization, 2024, 0(09): 121-126.
[10] CHENG Meng, LI Hao. Improved Deciduous Tree Nest Detection Method Based on YOLOv5s[J]. Computer and Modernization, 2024, 0(08): 24-29.
[11] WANG Mengxi, LI Jun. Review of Fall Detection Technologies for Elderly[J]. Computer and Modernization, 2024, 0(08): 30-36.
[12] SHI Xianwei, FAN Xin. Semantic Segmentation of Video Frame Scene Based on Lightweight[J]. Computer and Modernization, 2024, 0(08): 49-53.
[13] XU Xin'ai, LI Gang. An Image Generation Method of Classroom Expression Images[J]. Computer and Modernization, 2024, 0(08): 88-91.
[14] GAO Shuaipeng, WANG Yifan. Survey on Group-level Emotion Recognition in Images[J]. Computer and Modernization, 2024, 0(08): 98-107.
[15] HUANG Wendong, WANG Yifan. Survey on Multimodal Information Processing and Fusion Based on Modal Categories[J]. Computer and Modernization, 2024, 0(07): 47-62.