计算机与现代化 (Computer and Modernization) ›› 2023, Vol. 0 ›› Issue (08): 44-53. doi: 10.3969/j.issn.1006-2475.2023.08.008

• Artificial Intelligence •

Cross-Modal Hash Retrieval Based on Attention Mechanism and Semantic Similarity

  WANG Hong, GE Hong

  1. (School of Computer Science, South China Normal University, Guangzhou 510631, Guangdong, China)
  • Online: 2023-08-30  Published: 2023-09-13
  • About the authors: WANG Hong (b. 1996), male, from Ji'an, Jiangxi; master's student; research interest: cross-modal retrieval; E-mail: 1090463612@qq.com. GE Hong (b. 1969), female, from Xiangyang, Hubei; associate professor, Ph.D.; research interests: intelligent information processing, machine learning, deep learning; E-mail: gehong@scnu.edu.cn.
  • Funding: National Natural Science Foundation of China (62177015)

Cross-Modal Hash Retrieval Based on Attention Mechanism and Semantic Similarity

  1. (School of Computer Science, South China Normal University, Guangzhou 510631, China)
  • Online: 2023-08-30  Published: 2023-09-13

Abstract: Nowadays, cross-modal hash retrieval is widely and successfully applied in multimedia similarity search. To further improve retrieval performance, this paper addresses two main problems in existing deep hash retrieval methods: 1) how to measure the similarity between different modalities so that inter-modal similarity is represented more precisely; 2) how to fuse the features of multiple modalities to obtain richer feature representations, avoiding the information loss caused by processing each modality separately and ignoring the connections between them. We therefore propose a cross-modal hash retrieval method based on an attention mechanism and semantic similarity (ASSH). The model defines a new multi-label similarity measure that distinguishes the importance of different labels, so that the similarity between modalities is expressed more faithfully. An attention-based fusion module is designed to fuse the features of different modalities during feature learning, strengthening cross-modal interaction and capturing locally important information in each modality. Experiments on the widely used image-text datasets MIR-Flickr25K, IAPR TC-12, and NUS-WIDE show that the proposed method outperforms previous methods in every retrieval mode; with a hash code length of 16 bits, the mean average precision (mAP) improves by 1.1% and 0.63% over the best existing methods. Ablation experiments further confirm the effectiveness of the proposed method.
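The abstract describes a multi-label similarity measure that weights labels by importance, but does not give its formula. Purely as an illustrative sketch (the function name, the IDF-style rarity weighting, and the weighted-cosine form below are assumptions, not the authors' definition), one common way to make rare shared labels count for more than common ones is:

```python
import numpy as np

def weighted_label_similarity(labels):
    """Pairwise semantic similarity from a binary multi-label matrix.

    labels: (n, c) array, labels[i, k] = 1 if sample i carries label k.
    Rarer labels receive larger weights (IDF-style), so sharing a rare
    label contributes more similarity than sharing a ubiquitous one.
    """
    n = labels.shape[0]
    freq = labels.sum(axis=0)                     # per-label frequency
    weights = np.log((n + 1) / (freq + 1)) + 1.0  # rare label -> large weight
    w = labels * weights                          # importance-weighted labels
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                       # guard unlabeled samples
    wn = w / norms
    return wn @ wn.T                              # weighted cosine, in [0, 1]
```

Under such a measure, two samples sharing only a rare label can be scored as more similar than two samples sharing only a very common one, unlike the binary "any shared label" similarity used by many earlier hashing methods.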

Key words: cross-modal retrieval, attention mechanism, semantic similarity, hash retrieval, feature fusion

Abstract: Nowadays, cross-modal hash retrieval has been widely and successfully used in multimedia similarity search applications. Deep hash retrieval methods face two challenging questions: 1) how to measure the similarity between modalities more accurately; 2) how to fuse the features of multiple modalities to obtain richer feature representations and avoid the loss of key information. To solve these two problems, we propose a novel cross-modal hashing method, a cross-modal hash retrieval model based on attention mechanism and semantic similarity (ASSH), which defines a new multi-label similarity measure to distinguish the importance of different labels, and designs an attention fusion module to fuse features and enhance the interaction between modalities. Experimental results demonstrate that the proposed method outperforms previous methods in all retrieval modes on the three common datasets MIRFLICKR-25K, NUS-WIDE, and IAPR TC-12. Compared with the state-of-the-art methods, when the hash code length is 16 bits, the mean average precision (mAP) improves by 1.1% and 0.63%. The ablation experiments also fully prove the effectiveness of the method.
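The abstract does not detail the internals of the attention fusion module. As a hedged sketch only (the function names, the scaled dot-product form, and the residual design are illustrative assumptions, not the paper's exact architecture), cross-modal feature fusion is often realized by letting each modality attend to the other and adding the attended features back residually:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(img_feat, txt_feat):
    """Fuse paired image/text features with cross-modal attention.

    img_feat, txt_feat: (n, d) feature matrices for n image-text pairs.
    Each modality attends over the other; the residual connection keeps
    the original modality-specific signal, so fusion adds cross-modal
    information rather than replacing it.
    """
    d = img_feat.shape[1]
    # (n, n) affinities, scaled as in dot-product attention
    attn_i2t = softmax(img_feat @ txt_feat.T / np.sqrt(d), axis=1)
    attn_t2i = softmax(txt_feat @ img_feat.T / np.sqrt(d), axis=1)
    fused_img = img_feat + attn_i2t @ txt_feat  # image enriched by text
    fused_txt = txt_feat + attn_t2i @ img_feat  # text enriched by image
    return fused_img, fused_txt
```

The fused representations would then feed the hashing layers of each branch, so the learned binary codes reflect both modalities rather than each modality in isolation.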

Key words: cross-modal retrieval, attention mechanism, semantic similarity, hash retrieval, feature fusion

CLC number: