Semantic Segmentation of Street Scenes Based on Double Attention Mechanism

Abstract

High-performance semantic segmentation algorithms have high latency and therefore cannot perceive road conditions quickly. This paper proposes a dual-path network model based on attention mechanisms. The model replaces a complex encoder structure with a lightweight local contour information extraction module and a semantic information extraction module. Feature optimization modules based on self-attention and channel attention are designed to suit the characteristics of the feature maps in each path, which effectively improves the ability of the lightweight network to represent detailed features. The resulting segmentation network processes images at 25 fps while maintaining a mean intersection over union (mIoU) of 73.9%. Verification on a physical platform shows that the algorithm runs in real time and has some practical application value.

Key words: semantic segmentation, dual-path convolutional neural network, autonomous driving, embedded platform
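
The abstract names self-attention and channel attention as the basis of the feature optimization modules but does not describe their internals. The NumPy sketch below shows generic forms of the two mechanisms: a squeeze-and-excitation-style channel gate and a scaled dot-product self-attention over spatial positions with a residual connection. It is not the authors' design. The function names, the weight matrices `w1`, `w2`, `wq`, `wk`, `wv`, the tensor shapes, and the bottleneck size are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def channel_attention(feat, w1, w2):
    """SE-style channel attention on a (C, H, W) feature map.

    w1 (C // r, C) and w2 (C, C // r) are hypothetical bottleneck weights.
    """
    squeezed = feat.mean(axis=(1, 2))              # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)        # ReLU bottleneck -> (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gates -> (C,)
    return feat * gates[:, None, None]             # rescale each channel


def self_attention(feat, wq, wk, wv):
    """Scaled dot-product self-attention over the H*W positions of (C, H, W).

    wq, wk (D, C) and wv (C, C) are hypothetical projection weights.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)                     # (C, N) with N = H * W
    q, k, v = wq @ x, wk @ x, wv @ x               # (D, N), (D, N), (C, N)
    attn = softmax(q.T @ k / np.sqrt(q.shape[0]))  # (N, N), each row sums to 1
    out = v @ attn.T                               # (C, N): weighted sum of values
    return feat + out.reshape(c, h, w)             # residual connection


# Usage with random weights (illustrative shapes only).
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 5))
w1, w2 = rng.standard_normal((2, 8)), rng.standard_normal((8, 2))
wq, wk, wv = (rng.standard_normal(s) for s in [(4, 8), (4, 8), (8, 8)])
refined = self_attention(channel_attention(feat, w1, w2), wq, wk, wv)
print(refined.shape)  # (8, 4, 5)
```

The spatial attention matrix has N × N entries for N = H × W positions, so its cost grows quadratically with resolution. Lightweight designs therefore tend to apply it to downsampled feature maps, while the channel gate costs only a pooling step and two small matrix products.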

TANG Shu-fang, WANG Zhi-sheng. Semantic Segmentation of Street Scenes Based on Double Attention Mechanism[J]. Computer and Modernization, 2021, 0(10): 69-74.
