Survey of Fruit Object Detection Algorithms in Computer Vision

Abstract

Fruit object detection and recognition based on computer vision is an important interdisciplinary research topic spanning object detection, computer vision, and agricultural robotics. It has both theoretical significance and practical value in smart agriculture, agricultural modernization, and automatic picking robots. As deep learning has been widely applied to image processing with good results, fruit detection and recognition algorithms that combine computer vision with deep learning have gradually become the mainstream approach. This paper introduces the tasks, difficulties, and current state of development of computer-vision-based fruit object detection and recognition, and reviews two classes of deep-learning-based fruit detection and recognition algorithms. It then presents the public datasets used to train the models and the evaluation metrics used to assess their performance, and finally discusses open problems in fruit object detection and recognition and possible future research directions.

Key words: computer vision, deep learning, fruit detection, object detection

LI Wei-qiang, WANG Dong, NING Zheng-tong, LU Ming-liang, QIN Peng-fei. Survey of Fruit Object Detection Algorithms in Computer Vision[J]. Computer and Modernization, 2022, 0(06): 87-95.

