Computer and Modernization ›› 2021, Vol. 0 ›› Issue (12): 27-36.

• Algorithm Design and Analysis •

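Human Pose Estimation Algorithm Based on Channel Splitting

  1. (1. School of Zhang Jian, Nantong University, Nantong 226019, China;
    2. School of Transportation and Civil Engineering, Nantong University, Nantong 226019, China)
  • Online: 2021-12-24 Published: 2021-12-24
  • About the authors: Zhou Kunyang (周昆阳, born 2000), male, from Yancheng, Jiangsu, undergraduate student, research interest: computer vision, E-mail: 1752465993@qq.com; Zhao Mengting (赵梦婷, born 2001), female, from Shuyang, Jiangsu, undergraduate student, research interest: image processing, E-mail: 3248463196@qq.com; Zhang Haichao (张海潮, born 2001), female, from Deyang, Sichuan, undergraduate student, research interest: image processing, E-mail: 1908966460@qq.com; corresponding author: Shao Yeqin (邵叶秦, born 1978), male, from Haining, Zhejiang, associate professor, Ph.D., research interest: computer vision, E-mail: hnsyk@ntu.edu.cn.
  • Supported by:
    National Natural Science Foundation of China, General Program (61671255); Jiangsu Provincial Undergraduate Innovation Training Program (201910304158H, 202010304180H, 202010304122Y)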

Abstract: To improve both the accuracy and the speed of human pose estimation, this paper proposes a channel-split-based algorithm, Channel-Split Residual Steps Network (Channel-Split RSN). First, a channel-split block splits the feature channels, extracts features from each split with convolutions, and fuses the results to obtain a richer feature representation. Second, a feature enhancement block further divides the feature channels into groups and processes each group with a different strategy, which reduces similar features within the channels. Finally, building on an improved spatial attention mechanism, a pose refine machine based on the spatial correlation of features, named Context-PRM, is proposed to produce more accurate pose estimates. On the COCO test-dev dataset, the method reaches 75.9% AP at 55.36 FPS, with a model size (Params) of only 18.3 M. Compared with the original RSN18 and RSN50, AP improves by 5 and 3.4 percentage points, respectively, and the model runs 12.08 FPS faster than RSN50. On the more challenging CrowdPose dataset, the method achieves 66.9% AP at 19.16 FPS, an AP gain of 4.6 percentage points over RSN18. These results indicate that the method improves the accuracy of human pose estimation while keeping a fast recognition speed. The source code is available at https://github.com/qdd1234/Channel-Split-RSN.

Keywords: Channel-Split RSN, human pose estimation, channel-split block, feature enhancement block, Context-PRM
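
The channel-split idea described in the abstract (split the feature channels, apply a convolution to each split, then fuse) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `channel_split_fuse`, the use of a 1x1 convolution per group, the ReLU activation, and fusion by concatenation are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np

def channel_split_fuse(x, num_groups, weights):
    """Hypothetical channel-split step: split, transform each group, fuse.

    x:        feature map of shape (C, H, W); C must be divisible by num_groups
    weights:  one (C_g, C_g) matrix per group, standing in for a 1x1 convolution
    """
    # Split the channel axis into equally sized groups.
    groups = np.split(x, num_groups, axis=0)
    outputs = []
    for group, w in zip(groups, weights):
        # A 1x1 convolution is a per-pixel linear map over the channel axis.
        y = np.einsum("oc,chw->ohw", w, group)
        outputs.append(np.maximum(y, 0.0))  # ReLU (assumed activation)
    # Fuse the per-group results by concatenating along the channel axis
    # (assumed fusion rule).
    return np.concatenate(outputs, axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))                          # 8 channels, 4x4 map
weights = [rng.standard_normal((2, 2)) for _ in range(4)]   # 4 groups of 2 channels
out = channel_split_fuse(x, 4, weights)
print(out.shape)
```

Because each group is processed independently, the per-group weights are much smaller than a single full-width convolution over all channels, which is one plausible reason a channel-split design can keep the parameter count low.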