计算机与现代化 ›› 2025, Vol. 0 ›› Issue (11): 58-64. doi: 10.3969/j.issn.1006-2475.2025.11.007

• Image Processing •

基于改进P2PNet的铁路车站人群数量估计算法

  


  1. (1.北京世纪瑞尔技术股份有限公司,北京 100085; 2.北京交通大学计算机科学与技术学院,北京 100044;
    3.北京智芯微电子科技有限公司,北京 102200)
  • 出版日期:2025-11-20 发布日期:2025-11-24
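  • About the authors: WAN Chengkai (born 1976), male, from Huangshan, Anhui; engineer, Ph.D.; research interests: computer vision and artificial intelligence; E-mail: wanchengkai@163.com. AN Gaoyun (born 1980), male, from Laizhou, Shandong; professor, Ph.D.; research interests: computer vision, deep learning, and related areas; E-mail: gyan@bjtu.edu.cn. CUI Lan (born 1980), female, from Changchun, Jilin; engineer, M.S.; research interest: artificial intelligence; E-mail: cuilancn@163.com.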
  • Supported by: National Natural Science Foundation of China (62072028)
       

Crowd Counting Estimation Algorithm of Railway Stations Based on Improved P2PNet


  1. (1. Beijing Century Real Technology Co.,Ltd., Beijing 100085, China; 2. School of Computer Science and Technology, Beijing Jiaotong University, Beijing 100044, China; 3. Beijing Smartchip Microelectronics Technology Co., Ltd., Beijing 102200, China)
  • Online:2025-11-20 Published:2025-11-24

摘要:本文针对铁路车站等场景下人群数量估计问题,提出一种基于改进P2PNet的铁路车站人群数量估计算法。该算法对传统的P2PNet算法进行较大的改动和优化。首先,算法模型采用BiFPN的路径聚合来增强网络对不同尺度特征图的融合能力,解决图像中人员大小尺度相差较大的情况;其次,网络在低层特征图引入A-SPP结构以提高网络的感受野范围,增强网络提取多尺度特征的能力;再次,网络中的OutLayer之前采用了CSAM注意力动态地调整特征图中各个通道和空间位置的重要性,更有效地回归图像中人的位置和分类;最后,在损失函数中,Focal Loss代替了传统的交叉熵,来解决正负样本不均衡和难易样本不均衡的问题。在公开数据集和自有数据集上进行对比实验的结果表明,该算法在平均绝对误差上优于当前同类先进算法。在实际的车站视频场景下,该算法能够准确估计人群数量。


关键词:人群数量估计, P2PNet, 多尺度特征, 注意力机制, 特征融合

Abstract: To estimate the number of people in scenes such as railway stations, this paper proposes a crowd counting algorithm based on an improved P2PNet, which substantially modifies and optimizes the original P2PNet. First, the model adopts BiFPN path aggregation to strengthen the fusion of feature maps at different scales, addressing the large variation in the apparent size of people within an image. Second, an A-SPP structure is introduced on the low-level feature maps to enlarge the receptive field and improve multi-scale feature extraction. Third, CSAM attention is applied before the OutLayer to dynamically reweight the channels and spatial positions of the feature map, so that the positions of people are regressed and classified more effectively. Finally, Focal Loss replaces the conventional cross-entropy loss to mitigate the imbalance between positive and negative samples and between easy and hard samples. Comparative experiments on public and proprietary datasets show that the algorithm outperforms comparable advanced algorithms in mean absolute error. In real station video scenes, the algorithm accurately estimates the number of people.
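To make the last modification concrete, the following is a minimal sketch of a binary focal loss for a single proposal point (person vs. background), compared with plain cross-entropy. It is an illustration only, not the authors' implementation: the function names are hypothetical, and the values `alpha=0.25` and `gamma=2.0` are the commonly used defaults from the original Focal Loss formulation, since the abstract does not state the settings used in the paper.

```python
import math


def cross_entropy(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and label y (1 = person, 0 = background)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p   # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    alpha and gamma are assumed values, not taken from the paper.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) samples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (p = 0.95) versus a hard positive (p = 0.30).
easy_ce, hard_ce = cross_entropy(0.95, 1), cross_entropy(0.30, 1)
easy_fl, hard_fl = focal_loss(0.95, 1), focal_loss(0.30, 1)
print("cross-entropy hard/easy ratio:", hard_ce / easy_ce)
print("focal loss    hard/easy ratio:", hard_fl / easy_fl)
```

The hard-to-easy loss ratio is far larger under the focal loss than under cross-entropy, which is how the modulating factor shifts training toward hard samples, while `alpha_t` handles the positive/negative imbalance.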

Key words: crowd counting estimation; P2PNet; multi-scale features; attention mechanism; feature fusion
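As an illustration of the feature-fusion keyword, the sketch below shows the fast normalized weighted fusion that BiFPN, as introduced in EfficientDet, uses to combine feature maps from different scales. Whether the paper uses exactly this fusion rule is an assumption, because the abstract only says that BiFPN path aggregation is adopted. The function name is hypothetical, and feature maps are simplified to flat lists that are assumed to be already resized to a common resolution.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights.

    features: list of equal-length lists (flattened feature maps, already resized)
    weights:  one learnable scalar per input feature map
    """
    # ReLU keeps every weight non-negative, so the normalized weights lie in [0, 1]
    w = [max(0.0, wi) for wi in weights]
    total = sum(w) + eps  # eps guards against division by zero
    fused = [0.0] * len(features[0])
    for feat, wi in zip(features, w):
        for k, value in enumerate(feat):
            fused[k] += (wi / total) * value
    return fused


# Two toy "feature maps" from different pyramid levels, fused with equal weights.
p_low = [1.0, 1.0, 1.0]
p_high = [3.0, 3.0, 3.0]
print(fast_normalized_fusion([p_low, p_high], [1.0, 1.0]))
```

In a full BiFPN layer the fused map would then pass through a convolution, and the weights would be learned during training; both steps are omitted here to keep the sketch short.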
