Road Pothole Detection Algorithm Based on Improved YOLOv5s

doi:10.3969/j.issn.1006-2475.2023.06.012

Computer and Modernization, 2023, Vol. 0, Issue (06): 69-75.

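BAI Rui¹, XU Yang¹,², WANG Bin¹, ZHANG Wen-wen¹

1. College of Big Data & Information Engineering, Guizhou University, Guiyang 550025, Guizhou, China;
2. Guiyang Aluminum-magnesium Design and Research Institute Co. LTD, Guiyang 550025, Guizhou, China

Received: 2022-07-25 Revised: 2022-08-21 Online: 2023-06-28 Published: 2023-06-28
Corresponding author: XU Yang (born 1980), male, from Guiyang, Guizhou, associate professor, Ph.D. Research interests: machine learning, deep learning and data acquisition. E-mail: xuy@gzu.edu.cn.
About the authors: BAI Rui (born 1997), male, from Tongren, Guizhou, master's student. Research interests: computer vision, deep learning. E-mail: 286732320@qq.com; WANG Bin (born 1997), male, from Luzhou, Sichuan, master's student. Research interests: computer vision, deep learning. E-mail: 1060431874@qq.com; ZHANG Wen-wen (born 1997), female, from Tongren, Guizhou, master's student. Research interests: computer vision, deep learning. E-mail: 1098462621@qq.com.
Funding:
Guizhou Provincial Science and Technology Program Project (Qiankehe Support [2021] General 176)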

Abstract

Abstract: To address the difficulty that existing object detection algorithms have in detecting road potholes accurately, as well as their slow detection speed, a road pothole detection algorithm based on improved YOLOv5s is proposed. First, a Coordinate Attention (CA) module is integrated into the YOLOv5s backbone network, so that the model captures not only cross-channel information but also direction-aware and position-sensitive information, which helps it locate and identify detected objects more accurately. Then, SoftPool replaces the max pooling operation in the Spatial Pyramid Pooling (SPP) module to retain more detailed feature information. In the feature fusion stage, Content-Aware ReAssembly of FEatures (CARAFE) is used to improve the upsampling step of multi-scale feature fusion; it dynamically generates adaptive kernels and can aggregate context information within a large receptive field. Finally, Alpha-IoU is used to improve the loss function and raise bounding-box regression accuracy. Experimental results show that the average precision of the improved YOLOv5s algorithm is 4.6 percentage points higher than that of the original network, and that its detection accuracy is considerably higher than that of other mainstream algorithms such as SSD, Faster R-CNN, YOLOv3, YOLOv3-tiny and YOLOv4-tiny.
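To make the SoftPool step in the abstract concrete, the following is a minimal sketch of the general SoftPool idea: each activation in a pooling window is weighted by its softmax probability, and the output is the weighted sum. It is not the authors' implementation. The function names, the list-of-lists feature map, and the non-overlapping 2x2 window without padding are assumptions made for illustration only; they are not the SPP configuration used in the paper.

```python
import math


def soft_pool_region(values):
    """SoftPool over one window: softmax-weighted sum of the activations."""
    m = max(values)  # subtract the max so exp() stays numerically stable
    weights = [math.exp(v - m) for v in values]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def soft_pool_2d(feature_map, kernel=2, stride=2):
    """Apply SoftPool to a 2D feature map given as a list of rows.

    Hypothetical helper: no padding, so trailing rows/columns that do not
    fill a whole window are ignored.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    out = []
    for r in range(0, rows - kernel + 1, stride):
        out_row = []
        for c in range(0, cols - kernel + 1, stride):
            region = [feature_map[r + i][c + j]
                      for i in range(kernel) for j in range(kernel)]
            out_row.append(soft_pool_region(region))
        out.append(out_row)
    return out


if __name__ == "__main__":
    fmap = [[0.0, 0.0, 1.0, 1.0],
            [0.0, 4.0, 1.0, 1.0],
            [2.0, 1.0, 0.5, 0.5],
            [1.0, 2.0, 0.5, 3.0]]
    print(soft_pool_2d(fmap))
```

Unlike max pooling, which keeps only the largest value in each window, every activation contributes to the result, so the pooled value lies between the window's mean and its maximum. This is the sense in which more detailed feature information is retained.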

Key words: deep learning, pothole detection, coordinate attention, maximum pooling
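The Alpha-IoU loss mentioned in the abstract can be illustrated with its basic form, 1 - IoU^alpha, sketched below under stated assumptions. The abstract does not say which Alpha-IoU variant or which alpha value the authors used, so the default `alpha=3.0`, the `(x1, y1, x2, y2)` box format, and the function names are assumptions for illustration, not the paper's configuration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic power-IoU loss: 1 - IoU ** alpha (alpha is an assumed default)."""
    return 1.0 - iou(pred, target) ** alpha


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)
    target = (1.0, 0.0, 3.0, 2.0)
    print(alpha_iou_loss(pred, target, alpha=1.0))  # ordinary IoU loss
    print(alpha_iou_loss(pred, target))             # power form, alpha = 3
```

With `alpha = 1` the expression reduces to the ordinary IoU loss, 1 - IoU. Larger exponents change how strongly boxes at different overlap levels are penalized.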

CLC number: TP391

BAI Rui, XU Yang, WANG Bin, ZHANG Wen-wen. Road Pothole Detection Algorithm Based on Improved YOLOv5s[J]. Computer and Modernization, 2023, 0(06): 69-75.
