A Survey of Research on Target Detection Algorithms Based on Deep Learning

doi:10.3969/j.issn.1006-2475.2020.05.011

Abstract

Traditional target detection algorithms rely mainly on hand-crafted features. Such features are usually designed for particular kinds of objects: some suit edge detection and others suit texture detection, so they do not generalize well. In recent years deep learning has flourished, bringing significant progress to computer vision tasks such as image classification, target detection, and semantic image segmentation. As a feature learning method, deep learning learns useful features of the target automatically, which avoids manual feature extraction while still delivering good detection results. This paper first reviews the research progress of deep-learning-based target detection algorithms, then summarizes common problems in target detection and their solutions, and finally discusses possible directions for future development.

Key words: target detection, deep learning, computer vision

CLC Number: TP183

CAO Yan, LI Huan, WANG Tian-bao. A Survey of Research on Target Detection Algorithms Based on Deep Learning[J]. Computer and Modernization, 2020, 0(05): 63-.
