Asymmetric Deep Supervised Hashing with Attention Mechanism

Abstract

Abstract: With the arrival of the big-data era, the volume of information on the Internet is growing exponentially. Images account for a large share of this data, so accurate and efficient retrieval from massive image collections has become an important research topic. The features extracted by most current methods contain a large amount of redundant information, so retrieval cannot focus effectively on the key regions of an image, which leads to poor retrieval performance and low accuracy. To address these shortcomings, this paper proposes an asymmetric deep hashing algorithm that integrates an attention mechanism. Built on a convolutional neural network, the method improves an existing hybrid attention mechanism guided by semantic features and embeds it in the network, so that the hashing network analyzes global and local semantic information jointly. A new quantization function is also designed to reduce quantization error and thereby strengthen the representational power of the hash codes. Using mAP as the evaluation metric, the method is compared with other hashing methods on the CIFAR-10 and NUS-WIDE datasets. The results show that the proposed network model combines global and local feature information well and improves image retrieval performance.

Keywords: image retrieval, attention mechanism, deep hashing, convolutional neural network, feature extraction
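
As a rough illustration of the evaluation protocol named in the abstract, the sketch below ranks database items by Hamming distance to a query code and computes mAP over the ranking. It is not the authors' code. The function names, the sign-based binarization, and the single-label relevance rule (equal class labels, as on CIFAR-10) are assumptions made for illustration. The abstract does not specify the paper's own quantization function, and NUS-WIDE is multi-label, where relevance is commonly defined as sharing at least one label.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> List[int]:
    """Quantize real-valued outputs to a {-1, +1} code with sign (assumed scheme)."""
    return [1 if x >= 0 else -1 for x in features]


def quantization_error(features: Sequence[float]) -> float:
    """Squared distance between the real-valued output and its binary code."""
    code = binarize(features)
    return sum((x - b) ** 2 for x, b in zip(features, code))


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where two equal-length codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def average_precision(query_code: Sequence[int], query_label: int,
                      db_codes: Sequence[Sequence[int]],
                      db_labels: Sequence[int]) -> float:
    """AP of one query over the full Hamming ranking of the database."""
    # sorted() is stable, so ties in distance keep database order.
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming_distance(query_code, db_codes[i]))
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if db_labels[i] == query_label:  # assumed single-label relevance
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(query_codes: Sequence[Sequence[int]],
                           query_labels: Sequence[int],
                           db_codes: Sequence[Sequence[int]],
                           db_labels: Sequence[int]) -> float:
    """Mean of the per-query average precision values."""
    aps = [average_precision(q, l, db_codes, db_labels)
           for q, l in zip(query_codes, query_labels)]
    return sum(aps) / len(aps)
```

A lower `quantization_error` means the real-valued network output already lies close to its binary code, which is the quantity a quantization term in a hashing loss tries to shrink.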

WANG Xin-yi, YIN Si-qing, HONG Jun. Asymmetric Deep Supervised Hashing with Attention Mechanism[J]. Computer and Modernization, 2023(05): 26-31.

