Computer and Modernization ›› 2020, Vol. 0 ›› Issue (08): 56-62. doi: 10.3969/j.issn.1006-2475.2020.08.009

• Algorithm Analysis and Design •

An Improved Faster R-CNN Algorithm Based on a Variable-Weight Loss Function and a Hard Example Mining Module

  

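  1. (1. College of Energy and Electrical Engineering, Hohai University, Nanjing, Jiangsu 211100; 2. Zhejiang Huayun Information Technology Co., Ltd., Hangzhou, Zhejiang 310008;
    3. College of Automation, Nanjing University of Science and Technology, Nanjing, Jiangsu 210098)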
  • Received: 2020-01-02 Published: 2020-08-17 Released: 2020-08-17
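  • About the authors: Shi Fei (施非, 1996-), male, from Nanjing, Jiangsu, master's student; research interests: pattern recognition, object detection; E-mail: 1426557795@qq.com. Qiu Zhen (邱臻, 1985-), female, from Quzhou, Zhejiang, bachelor's degree; research interests: integration of intelligent image recognition applications; E-mail: loachqz@126.com. Han Qin (韩勤, 1982-), male, from Hangzhou, Zhejiang, assistant engineer, bachelor's degree; research interests: information system architecture, software design; E-mail: martinhanqin@gmail.com. Li Jingeng (李金耿, 1986-), male, from Taizhou, Zhejiang, bachelor's degree; research interests: data analysis, data mining; E-mail: lijingengdba@qq.com. Corresponding author: Qian Huimin (钱惠敏, 1980-), female, from Yixing, Jiangsu, associate professor, Ph.D.; research interests: computer vision, machine learning, human action recognition in video, object detection; E-mail: qhmin0316@163.com. Xiang Wenbo (项文波, 1976-), male, lecturer, master's degree; research interests: image and video processing; E-mail: xiang_wb@163.com.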
  • Supported by:
    Natural Science Foundation of Jiangsu Province (20145051211); Fundamental Research Funds for the Central Universities, Hohai University (261220182018B15514)

An Improved Algorithm of Faster R-CNN Based on Variable Weight Loss Function and OHEM

  1. (1. College of Energy and Electrical Engineering, Hohai University, Nanjing 211100, China;
    2. Zhejiang Huayun Information Technology Co. Ltd., Hangzhou 310008, China;
    3. College of Automation, Nanjing University of Science & Technology, Nanjing 210098, China)
  • Received: 2020-01-02 Online: 2020-08-17 Published: 2020-08-17

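Abstract: Object detection algorithms based on deep convolutional neural networks have become a research hotspot in object detection. They comprise two-stage algorithms based on region proposals and one-stage algorithms based on position regression. Faster R-CNN is a typical two-stage detector, but two properties of the training data limit its detection accuracy: the imbalance between the numbers of easy and hard examples, and the imbalance between classes. This paper proposes an improved Faster R-CNN algorithm based on the variable-weight loss function Focal Loss and a hard example mining module. Specifically, Focal Loss is introduced into the classification part of the network, where its weights adjust for the inter-class imbalance and reduce the imbalance between easy and hard examples. In addition, the network structure is modified to include a hard example mining module, which further balances the numbers of easy and hard examples and improves detection performance. The proposed algorithm is evaluated on different datasets with different base networks. Experimental results show that, with the VGG-16 base network, the proposed algorithm improves mean average precision over the original algorithm by 0.9 percentage points on Pascal VOC 2007 and by 1.7 percentage points on Pascal VOC 07+12; with the Res-101 base network, it improves mean average precision by 1.3 percentage points on Pascal VOC 2007 and by 1.5 percentage points on Pascal VOC 07+12.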
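
For reference, Focal Loss is commonly written in the form below. This is the standard formulation from the Focal Loss literature; the abstract does not state the exact variant or hyperparameter values used in this paper, so the symbols $\alpha_t$ and $\gamma$ and the multi-class reading of $p_t$ are assumptions here.

```latex
\mathrm{FL}(p_t) = -\alpha_t \,(1 - p_t)^{\gamma} \,\log(p_t)
```

Here $p_t$ is the predicted probability of the ground-truth class, $\alpha_t$ is a class-dependent weight that addresses inter-class imbalance, and $\gamma \ge 0$ is the focusing parameter. With $\gamma = 0$ and $\alpha_t = 1$ the expression reduces to ordinary cross-entropy; for $\gamma > 0$ the factor $(1 - p_t)^{\gamma}$ shrinks the loss of easy examples ($p_t$ close to 1), so hard examples contribute relatively more to the total loss.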

Keywords: deep learning, object detection, focal loss, hard example mining

Abstract: Object detection algorithms based on deep convolutional neural networks have become a research hotspot in the field of object detection. They include two-stage algorithms based on region proposal and one-stage algorithms based on position regression. Faster R-CNN is one of the typical two-stage algorithms. However, the imbalance between simple examples and hard examples in the training data set and the inter-class imbalance of the sample data are important factors that limit its detection accuracy. In this paper, an improved Faster R-CNN algorithm based on a variable weight loss function and OHEM is proposed. Specifically, the Focal Loss function is introduced into the classification part of the network; by adjusting weights, it compensates for the inter-class imbalance of the sample data and reduces the imbalance between the numbers of simple and hard examples. At the same time, the network structure is modified and online hard example mining (OHEM) is introduced to further balance the numbers of simple and hard examples, improving the detection performance of the network. To verify the proposed algorithm, experiments are conducted on different data sets and with different base networks. The results show that with the VGG-16 base network, the proposed algorithm improves the mAP over the original algorithm by 0.9 percentage points on the Pascal VOC 2007 data set and by 1.7 percentage points on the Pascal VOC 07+12 data set. With the Res-101 base network, the mAP is 1.3 percentage points higher than that of the original algorithm on Pascal VOC 2007 and 1.5 percentage points higher on Pascal VOC 07+12.
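
The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `focal_loss` and `select_hard_examples`, the default `gamma=2.0`, the optional per-class weight vector `alpha`, and the plain top-k selection are all assumptions made for illustration. In particular, the published OHEM procedure also applies non-maximum suppression to RoIs before selection, which is omitted here.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def focal_loss(logits, labels, gamma=2.0, alpha=None):
    """Per-example multi-class focal loss (illustrative sketch).

    logits: (N, K) raw class scores for N RoIs and K classes.
    labels: (N,) integer ground-truth class indices.
    gamma:  focusing parameter; 0 gives plain cross-entropy (assumed default 2.0).
    alpha:  optional (K,) per-class weights (hypothetical; None means all ones).
    """
    probs = softmax(logits)
    # Probability assigned to the ground-truth class of each example.
    p_t = probs[np.arange(len(labels)), labels]
    p_t = np.clip(p_t, 1e-12, 1.0)  # avoid log(0)
    weights = np.ones(len(labels)) if alpha is None else np.asarray(alpha)[labels]
    # Easy examples (p_t near 1) are down-weighted by (1 - p_t) ** gamma.
    return -weights * (1.0 - p_t) ** gamma * np.log(p_t)


def select_hard_examples(losses, keep):
    """Return indices of the `keep` highest-loss examples, hardest first."""
    order = np.argsort(-losses)
    return order[:keep]


if __name__ == "__main__":
    # Three RoIs whose true class is 0: one easy, one hard, one in between.
    logits = np.array([[5.0, 0.0, 0.0],
                       [0.0, 5.0, 0.0],
                       [1.0, 0.0, 0.0]])
    labels = np.array([0, 0, 0])
    losses = focal_loss(logits, labels)
    hard = select_hard_examples(losses, keep=2)
    print("focal losses:", losses)
    print("RoIs kept for the backward pass:", hard)
```

In a detector, only the selected RoIs would contribute to the gradient, so the easy majority no longer dominates each training step.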

Key words: deep learning, object detection, Focal Loss, online hard example mining

CLC Number: