Received:
2019-06-26
Online:
2020-05-20
Published:
2020-05-21
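About the authors:
CAO Yan (b. 1993), female, from Guang'an, Sichuan; master's student; research interests: deep learning, image processing; E-mail: 726377694@qq.com. LI Huan (b. 1995), male, from Hengyang, Hunan; master's student; research interests: deep learning, image processing; E-mail: 1603420591@qq.com. Corresponding author: WANG Tian-bao (b. 1967), male, from Jiange, Sichuan; professor, master's degree; research interests: wireless communication technology and applications, network communication and information security; E-mail: wangtianbao@cuit.edu.cn.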
Funding:
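Abstract: Traditional object detection algorithms rely mainly on hand-crafted features to detect objects. Hand-crafted features are usually designed for specific targets: some suit edge detection, others suit texture detection, so they do not generalize. In recent years, deep learning has developed rapidly and has brought major advances in computer vision tasks such as image classification, object detection, and semantic image segmentation. As a feature-learning method, deep learning automatically learns useful features of the target, which avoids manual feature extraction while maintaining good detection performance. This paper first reviews research progress on deep-learning-based object detection algorithms, then summarizes common difficulties in object detection together with their solutions, and finally discusses possible directions for future development.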
CLC number:
CAO Yan, LI Huan, WANG Tian-bao. A Survey of Research on Target Detection Algorithms Based on Deep Learning[J]. Computer and Modernization, doi: 10.3969/j.issn.1006-2475.2020.05.011.