Computer and Modernization ›› 2023, Vol. 0 ›› Issue (06): 69-75. doi: 10.3969/j.issn.1006-2475.2023.06.012

• Image Processing •

基于改进YOLOv5s的道路坑洼检测算法

白芮1, 徐杨1,2, 王彬1, 张雯雯1   

  1. 贵州大学大数据与信息工程学院,贵州 贵阳 550025;
  2. 贵阳铝美设计研究院有限公司,贵州 贵阳 550025
  • 收稿日期:2022-07-25 修回日期:2022-08-21 出版日期:2023-06-28 发布日期:2023-06-28
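  • Corresponding author: XU Yang (born 1980), male, from Guiyang, Guizhou; associate professor, Ph.D.; research interests: machine learning, deep learning, and data acquisition. E-mail: xuy@gzu.edu.cn.
  • About the authors: BAI Rui (born 1997), male, from Tongren, Guizhou; master's student; research interests: computer vision and deep learning. E-mail: 286732320@qq.com; WANG Bin (born 1997), male, from Luzhou, Sichuan; master's student; research interests: computer vision and deep learning. E-mail: 1060431874@qq.com; ZHANG Wen-wen (born 1997), female, from Tongren, Guizhou; master's student; research interests: computer vision and deep learning. E-mail: 1098462621@qq.com.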
  • Funding:
    Guizhou Provincial Science and Technology Program (Qiankehe Support [2021] General 176)

Road Pothole Detection Algorithm Based on Improved YOLOv5s

BAI Rui1, XU Yang1,2, WANG Bin1, ZHANG Wen-wen1   

  1. College of Big Data & Information Engineering, Guizhou University, Guiyang 550025, China;
  2. Guiyang Aluminum-magnesium Design and Research Institute Co., Ltd., Guiyang 550025, China
  • Received:2022-07-25 Revised:2022-08-21 Online:2023-06-28 Published:2023-06-28

摘要: 针对现有目标检测算法难以对道路坑洼进行精准检测、检测速度慢等问题,提出一种基于改进YOLOv5s的道路坑洼检测算法。首先在YOLOv5s主干网络中融入坐标注意力(Coordinate Attention, CA)模块,使模型不仅捕获跨通道信息,还捕获方向和位置敏感信息,有助于模型更准确地定位和识别检测对象;然后在空间金字塔池化(Spatial Pyramid Pool, SPP)模块中采用软池化SoftPool改进最大池化操作,保留更详细的特征信息;在特征融合阶段,使用基于内容的功能重组 (Content-Aware ReAssembly of Features, CARAFE)对多尺度特征融合中上采样进行改进,动态生成自适应内核,可以在一个大的感受野内聚集上下文信息;最后,使用Alpha-IoU对损失函数进行改进,提高边框回归精度。实验结果表明,改进的YOLOv5s算法在平均精度上较原始网络提高了4.6个百分点,与其他主流算法SSD、Faster R-CNN、YOLOv3、YOLOv3-tiny、YOLOv4-tiny相比检测精度有较大提升。

关键词: 深度学习, 坑洼检测, 坐标注意力, 最大池化

Abstract: Existing object detection algorithms struggle to detect road potholes accurately and run slowly. To address these problems, a road pothole detection algorithm based on an improved YOLOv5s is proposed. First, a coordinate attention (CA) module is integrated into the YOLOv5s backbone network so that the model captures not only cross-channel information but also direction-aware and position-sensitive information, which helps it locate and identify objects more accurately. Second, SoftPool is adopted in the spatial pyramid pooling (SPP) module to improve on the max pooling operation and retain more detailed feature information. Third, in the feature fusion stage, Content-Aware ReAssembly of FEatures (CARAFE) is used to improve the upsampling step of multi-scale feature fusion; it dynamically generates adaptive kernels and aggregates contextual information within a large receptive field. Finally, Alpha-IoU is used to improve the loss function and raise bounding-box regression accuracy. Experimental results show that the improved YOLOv5s algorithm raises average precision by 4.6 percentage points over the original network and achieves considerably higher detection accuracy than other mainstream algorithms, including SSD, Faster R-CNN, YOLOv3, YOLOv3-tiny, and YOLOv4-tiny.
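As an illustration of two of the components named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the commonly published definitions: SoftPool as a softmax-weighted sum of the activations in a pooling window, and the basic Alpha-IoU loss as 1 - IoU^alpha. The function names, the (x1, y1, x2, y2) box layout, and the default alpha = 3 are assumptions made for this example; the abstract does not state which Alpha-IoU variant or alpha value the paper uses.

```python
import math


def soft_pool(window):
    """SoftPool over one pooling window (assumed definition):
    each activation is weighted by its softmax weight, then summed."""
    m = max(window)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in window]
    total = sum(exps)
    return sum((e / total) * a for e, a in zip(exps, window))


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic Alpha-IoU loss (assumed form): 1 - IoU ** alpha.
    alpha = 3.0 is an assumed default, not a value taken from the paper."""
    return 1.0 - iou(pred, target) ** alpha


if __name__ == "__main__":
    window = [0.0, 0.0, 0.0, 4.0]
    # SoftPool lands between the window mean and the window max.
    print(soft_pool(window), sum(window) / len(window), max(window))
    print(alpha_iou_loss((0, 0, 2, 2), (0, 0, 2, 1)))
```

Unlike max pooling, which keeps only the largest activation, the softmax weighting lets every activation in the window contribute to the output, which matches the abstract's stated aim of retaining more detailed feature information. With alpha greater than 1, the power on IoU changes how strongly well-overlapping and poorly-overlapping boxes are weighted relative to the plain IoU loss.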

Key words: deep learning, pothole detection, coordinate attention, max pooling
