Semantic Segmentation Algorithm for Rainy Road Scene and Its Mobile Deployment

doi:10.3969/j.issn.1006-2475.2024.10.002

Abstract

Existing semantic segmentation models are easily disturbed by raindrop occlusion and perform poorly on rainy road scene datasets. They also pay no particular attention to vehicles and pedestrians, two of the more important categories in road scenes. To address these two problems, this paper designs a semantic segmentation algorithm for rainy road scenes and deploys it on a mobile device, with the aim of promoting the development of autonomous driving technology. A Fast Fusion Pyramid Pooling Module (FFPPM) is proposed so that the feature map fuses rich global semantic information with local detail information, which allows raindrop-occluded scenes to be segmented effectively. A Multiple Attention Fusion Module (MAFM) is proposed, and the class weights of vehicles and pedestrians in the loss function are increased, to strengthen the model's attention to these two classes. The model is deployed to the mobile device with the Android Studio platform, and ONNX Runtime is used for forward inference; the segmentation results are consistent with those on the computer. The model is compared with five recent models on the Rainy WCity dataset. Its segmentation accuracy is the same on the computer and on the mobile device: PA and mIoU are 95.25% and 72.96%, vehicle PA and IoU are 84.04% and 74.15%, and pedestrian PA and IoU are 34.91% and 26.37%, all higher than those of the other five models. In addition, the model runs at 45.46 FPS on the computer and 1.26 FPS on the mobile device, a relatively fast segmentation speed. The proposed model can effectively segment road scene images occluded by rain on a mobile device, and it segments vehicles and pedestrians more accurately.

Key words: semantic segmentation; road scene; raindrop masking; class weight; mobile deployment

CLC number: TP391.41

ZHOU Anda, TANG Chaoying. Semantic Segmentation Algorithm for Rainy Road Scene and Its Mobile Deployment[J]. Computer and Modernization, 2024(10): 7-13.
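
The abstract mentions two mechanisms that a small example may make more concrete: raising the class weights of vehicles and pedestrians in the loss function, and reporting PA, per-class IoU, and mIoU. The following is a minimal sketch in plain Python, assuming a standard class-weighted cross-entropy and the usual confusion-matrix definitions of the metrics. The class indices, the weight values, and all function names are hypothetical and are not taken from the paper, which may define its loss and weights differently.

```python
import math


def weighted_cross_entropy(probs, label, class_weights):
    """Cross-entropy for one pixel, scaled by the weight of its true class.

    probs: predicted class probabilities for the pixel (sums to 1).
    label: index of the true class.
    class_weights: one weight per class (hypothetical values in the demo).
    """
    return -class_weights[label] * math.log(probs[label])


def confusion_matrix(truth, pred, num_classes):
    """cm[t][p] counts pixels with true class t that were predicted as p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (overall PA, per-class PA, per-class IoU, mIoU)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    overall_pa = sum(cm[i][i] for i in range(n)) / total
    class_pa, class_iou = [], []
    for i in range(n):
        tp = cm[i][i]
        gt = sum(cm[i])                              # pixels truly of class i
        predicted = sum(cm[j][i] for j in range(n))  # pixels predicted as i
        union = gt + predicted - tp
        class_pa.append(tp / gt if gt else float("nan"))
        class_iou.append(tp / union if union else float("nan"))
    valid = [v for v in class_iou if not math.isnan(v)]
    miou = sum(valid) / len(valid)                   # mean over classes present
    return overall_pa, class_pa, class_iou, miou


if __name__ == "__main__":
    # Hypothetical classes: 0 = road, 1 = vehicle, 2 = pedestrian.
    weights = [1.0, 2.0, 3.0]  # assumed values, for illustration only
    print(weighted_cross_entropy([0.7, 0.2, 0.1], 2, weights))
    truth = [0, 0, 0, 0, 1, 1, 1, 2, 2, 0]
    pred = [0, 0, 0, 1, 1, 1, 0, 2, 0, 0]
    print(segmentation_metrics(confusion_matrix(truth, pred, 3)))
```

With a larger weight on a class, a misclassified pixel of that class contributes proportionally more to the loss, which is one common way to push a model toward rare but important classes such as pedestrians.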

