Computer and Modernization, 2020, Vol. 0, Issue (05): 63-. doi: 10.3969/j.issn.1006-2475.2020.05.011
Received: 2019-06-26
Online: 2020-05-20
Published: 2020-05-21
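About the authors:
CAO Yan (b. 1993), female, from Guang'an, Sichuan; master's student; research interests: deep learning, image processing; E-mail: 726377694@qq.com. LI Huan (b. 1995), male, from Hengyang, Hunan; master's student; research interests: deep learning, image processing; E-mail: 1603420591@qq.com. Corresponding author: WANG Tian-bao (b. 1967), male, from Jiange, Sichuan; professor, master's degree; research interests: wireless communication technology and applications, network communication and information security; E-mail: wangtianbao@cuit.edu.cn.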
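Abstract: Traditional object detection algorithms rely mainly on hand-crafted features to detect objects. Hand-crafted features are usually designed for specific kinds of targets: some are suited to edge detection and others to texture detection, so they do not generalize. In recent years, deep learning has developed rapidly and has brought major progress in computer vision tasks such as image classification, object detection, and semantic image segmentation. As a feature learning method, deep learning can automatically learn useful features of the target, which avoids manual feature extraction while still maintaining good detection performance. This paper first reviews research progress on deep-learning-based object detection algorithms, then summarizes common difficulties in object detection and the measures taken to address them, and finally discusses possible directions for the future development of object detection algorithms.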
Cite this article: CAO Yan, LI Huan, WANG Tian-bao. A Survey of Research on Target Detection Algorithms Based on Deep Learning[J]. Computer and Modernization, 2020, 0(05): 63-.