An Improved Algorithm of Faster R-CNN Based on Variable Weight Loss Function and OHEM

Computer and Modernization, 2020, Issue (08): 56-62. doi: 10.3969/j.issn.1006-2475.2020.08.009

(1. College of Energy and Electrical Engineering, Hohai University, Nanjing 211100, China;
2. Zhejiang Huayun Information Technology Co. Ltd., Hangzhou 310008, China;
3. College of Automation, Nanjing University of Science & Technology, Nanjing 210098, China)

Received: 2020-01-02 Online: 2020-08-17 Published: 2020-08-17
About the authors: SHI Fei (b. 1996), male, from Nanjing, Jiangsu; master's student; research interests: pattern recognition, object detection; E-mail: 1426557795@qq.com. QIU Zhen (b. 1985), female, from Quzhou, Zhejiang; bachelor's degree; research interest: integration of intelligent image recognition applications; E-mail: loachqz@126.com. HAN Qin (b. 1982), male, from Hangzhou, Zhejiang; assistant engineer, bachelor's degree; research interests: information system architecture, software design; E-mail: martinhanqin@gmail.com. LI Jin-geng (b. 1986), male, from Taizhou, Zhejiang; bachelor's degree; research interests: data analysis, data mining; E-mail: lijingengdba@qq.com. Corresponding author: QIAN Hui-min (b. 1980), female, from Yixing, Jiangsu; associate professor, Ph.D.; research interests: computer vision, machine learning, human action recognition in video, object detection; E-mail: qhmin0316@163.com. XIANG Wen-bo (b. 1976), male; lecturer, master's degree; research interests: image and video processing; E-mail: xiang_wb@163.com.
Funding: Natural Science Foundation of Jiangsu Province (20145051211); Fundamental Research Funds for the Central Universities, Hohai University (261220182018B15514)

Abstract: Object detection algorithms based on deep convolutional neural networks have become a research hotspot in object detection. They comprise two-stage algorithms based on region proposals and one-stage algorithms based on position regression. Faster R-CNN is a typical two-stage algorithm, but its detection accuracy suffers from two imbalances in the training set: the imbalance between the numbers of easy and hard examples, and the inter-class imbalance of the sample data. This paper proposes an improved Faster R-CNN algorithm based on a variable-weight loss function, Focal Loss, and an online hard example mining (OHEM) module. Specifically, Focal Loss is introduced into the classification branch of the network, where its weighting adjusts for the inter-class imbalance and mitigates the imbalance between easy and hard examples. In addition, the network structure is modified to include a hard example mining module, which further balances the numbers of easy and hard examples and improves detection performance. The algorithm is evaluated on different data sets with different backbone networks. With VGG-16 as the backbone, it improves the mAP over the original algorithm by 0.9 percentage points on the Pascal VOC 2007 data set and by 1.7 percentage points on the Pascal VOC 07+12 data set. With Res-101 as the backbone, it improves the mAP by 1.3 percentage points on Pascal VOC 2007 and by 1.5 percentage points on Pascal VOC 07+12.

Key words: deep learning, object detection, Focal Loss, online hard example mining
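The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the values `gamma=2.0` and `alpha=0.25`, and the sample probabilities are illustrative assumptions, the single `alpha` stands in for a per-class weight, and the selection step simply keeps the highest-loss RoIs without the non-maximum suppression that a full OHEM pipeline would normally apply first.

```python
import math


def cross_entropy(p_true):
    """Standard cross-entropy for one sample, given the predicted
    probability of its ground-truth class."""
    p = min(max(p_true, 1e-12), 1.0)  # clamp to avoid log(0)
    return -math.log(p)


def focal_loss(p_true, gamma=2.0, alpha=0.25):
    """Focal Loss for one sample: -alpha * (1 - p)^gamma * log(p).

    gamma and alpha are illustrative values (assumptions), not the
    settings used in the paper. With gamma=0 and alpha=1 this reduces
    to plain cross-entropy.
    """
    p = min(max(p_true, 1e-12), 1.0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


def select_hard_examples(losses, keep):
    """OHEM-style selection (simplified): return the indices of the
    `keep` RoIs with the largest loss, in ascending index order.
    Only these RoIs would contribute to the backward pass."""
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(order[:keep])


if __name__ == "__main__":
    # Hypothetical predicted probabilities of the true class for five RoIs.
    probs = [0.95, 0.90, 0.60, 0.20, 0.10]
    losses = [focal_loss(p) for p in probs]
    print(select_hard_examples(losses, keep=2))
```

Under these assumptions, a well-classified RoI (probability 0.9) keeps only 0.25 × 0.1² = 0.25% of its cross-entropy loss, while a badly classified one (probability 0.1) keeps about 20%, so hard examples dominate the classification loss. The selection step then restricts training to the hardest RoIs, which is the balancing effect the abstract attributes to the hard example mining module.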

CLC number: TP391.4

SHI Fei, QIU Zhen, HAN Qin, LI Jin-geng, QIAN Hui-min, XIANG Wen-bo. An Improved Algorithm of Faster R-CNN Based on Variable Weight Loss Function and OHEM[J]. Computer and Modernization, 2020(08): 56-62.

