Few-shot Object Detection via Learnable Memory Feature Pyramid Network

doi:10.3969/j.issn.1006-2475.2023.12.002

Computer and Modernization, 2023, Vol. 0, Issue (12): 7-13.

(1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China;
2. School of Automation, Shanghai Jiao Tong University, Shanghai 200030, China;
3. Concordia University Wisconsin, Mequon WI 53097, USA)

Online: 2023-12-24 Published: 2024-01-24
About the authors: XIA Qian-han (b. 1998), female, from Siping, Jilin, master's student; research interests: deep learning, machine vision; E-mail: 541949442@qq.com. HE Sheng-huang (b. 1992), male, from Longyan, Fujian, postdoctoral researcher; research interests: deep learning, integration of medicine and engineering, multi-agent systems; E-mail: shhesjtu@sjtu.edu.cn. Corresponding author: WU Yuan-qing (b. 1985), male, from Guangzhou, Guangdong, professor; research interests: formation control of unmanned intelligent vehicles, machine vision processing; E-mail: yqwuzju@163.com. ZHAO Le-le (b. 1984), female, from Changchun, Jilin, master's student; research interests: mechanical design; E-mail: lelezhaochina@163.com.
Supported by:
National Natural Science Foundation of China (U22A2065, 62003100, 62276074); National Key Development Program (2022YFB4701300); Guangdong Basic and Applied Basic Research Foundation (2021B15120058)

Abstract

In some industrial application scenarios, data are hard to obtain, and the resulting few-shot problem has become a major factor limiting the adoption of deep learning. This paper uses a few-shot approach to improve model performance when data are scarce and to reduce the model's dependence on data. It proposes a learnable memory feature pyramid network that retains cleaner multi-scale feature information for classifier prediction. An adaptive feature fusion module lets the network itself choose the relative emphasis among features at different levels, so that as much discriminative feature information as possible is retained at each scale. A retrospective feature alignment module is also added to alleviate the feature confusion introduced when feature layers are stacked. Experimental results show that overcoming sample dependence effectively improves model performance, and that the improved model outperforms other existing models of the same type on the COCO and VOC datasets. In particular, on the VOC dataset with the prior parameter k set to 5, nAP50 increases by 4.8 to 44.7; on the COCO dataset with k set to 30, nAP50 increases by 4.0 to 29.4.

Key words: few shot, adaptive fusion, feature alignment, feature pyramid network
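The abstract's idea of letting the network choose how much emphasis each pyramid level receives can be illustrated with a softmax-weighted sum. The sketch below is not the paper's module, whose design is not given on this page. It assumes each level has already been resized to a common shape and flattened to a list of floats, and that `level_logits` stands in for parameters a real network would learn by backpropagation. The names `softmax`, `fuse_levels`, and `level_logits` are hypothetical.

```python
import math


def softmax(logits):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(level_features, level_logits):
    """Fuse same-length feature vectors, one per pyramid level.

    level_features: list of lists of floats (assumed already resized).
    level_logits:   one score per level (assumed learnable elsewhere).
    Returns the fused vector and the weights that were applied.
    """
    if len(level_features) != len(level_logits):
        raise ValueError("need exactly one logit per feature level")
    length = len(level_features[0])
    weights = softmax(level_logits)
    fused = [0.0] * length
    for weight, feature in zip(weights, level_features):
        if len(feature) != length:
            raise ValueError("all levels must share the same length")
        for i, value in enumerate(feature):
            fused[i] += weight * value
    return fused, weights


if __name__ == "__main__":
    levels = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    fused, weights = fuse_levels(levels, [0.0, 0.0, 0.0])
    print(weights, fused)
```

With equal logits every level contributes equally and the result is the plain average of the levels; raising one logit shifts the fused feature toward that level, which is the kind of per-level emphasis the abstract describes.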

CLC Number: TP391

XIA Qian-han, HE Sheng-huang, WU Yuan-qing, ZHAO Le-le. Few-shot Object Detection via Learnable Memory Feature Pyramid Network[J]. Computer and Modernization, 2023, 0(12): 7-13.


