Computer and Modernization

• Pattern Recognition

基于改进型C3D神经网络的动作识别技术

  

  1. (华北计算技术研究所系统八部,北京100083)
     
  • 收稿日期:2018-08-27 出版日期:2019-04-08 发布日期:2019-04-10
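  • About the authors: LIAO Xiaodong (1995-), male, born in Ganzhou, Jiangxi, master's student; research interests: computer vision, action recognition, image processing; E-mail: 1457667862@qq.com. JIA Xiaoxia (1976-), female, born in Yuanping, Shanxi, professor-level senior engineer; research interest: information services.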

Action Recognition Technology Based on Improved C3D Neural Network

  1. (Eighth System Department, North China Institute of Computing Technology, Beijing 100083, China)
  • Received:2018-08-27 Online:2019-04-08 Published:2019-04-10

摘要: Facebook提出的C3D三维卷积神经网络虽然能达到良好的视频动作识别准确率,但是在速度方面还有很大的改进余地,而且训练得到的模型过大,不便于移动设备使用。本文利用小型卷积核能够减少参数的特点,对已有网络结构进行优化,提出一种新的动作识别方案,将原C3D神经网络常用的3×3×3卷积核分解成深度卷积和点卷积(1×1×1卷积核),并且在UCF101数据集和ActivityNet数据集训练测试。结果表明,与原C3D网络进行对比:改进后的C3D网络准确率比C3D提升了2.4%,在速度方面比C3D提升了12.9%,模型大小压缩到原来的25.8%。

关键词: 动作识别, 卷积分解, 识别速度, 模型压缩

Abstract: Although the C3D three-dimensional convolutional neural network proposed by Facebook achieves good accuracy in video action recognition, it leaves considerable room for improvement in speed, and the trained model is too large for convenient use on mobile devices. Exploiting the fact that small convolution kernels reduce the number of parameters, this paper optimizes the existing network structure and proposes a new action recognition scheme: the 3×3×3 convolution kernels commonly used in the original C3D network are decomposed into a depthwise convolution and a pointwise convolution (1×1×1 kernel). The network is trained and tested on the UCF101 and ActivityNet datasets. Compared with the original C3D network, the improved network is 2.4% more accurate and 12.9% faster, and the model is compressed to 25.8% of its original size.
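To illustrate why splitting a 3×3×3 kernel into a per-channel (depthwise) convolution followed by a 1×1×1 pointwise convolution shrinks a layer, the following minimal sketch counts the weights of one layer in both forms. The function names and the channel widths (64 in, 128 out) are illustrative assumptions, not values taken from the paper, and biases are ignored. The result is a per-layer ratio only; it does not reproduce the whole-model figure of 25.8% reported in the abstract.

```python
def conv3d_params(c_in, c_out, k=3):
    """Weight count of a standard k×k×k 3D convolution (bias ignored)."""
    return c_in * c_out * k ** 3


def separable_conv3d_params(c_in, c_out, k=3):
    """Weight count of a depthwise k×k×k conv plus a 1×1×1 pointwise conv."""
    depthwise = c_in * k ** 3   # one k×k×k filter per input channel
    pointwise = c_in * c_out    # 1×1×1 kernels that mix channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical channel widths chosen only for illustration.
    c_in, c_out = 64, 128
    standard = conv3d_params(c_in, c_out)
    separable = separable_conv3d_params(c_in, c_out)
    print(f"standard:  {standard} weights")
    print(f"separable: {separable} weights")
    print(f"ratio:     {separable / standard:.4f}")
```

For a layer with kernel size k and c_out output channels, the ratio equals 1/c_out + 1/k³, so the saving grows with the number of output channels.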

Key words: action recognition, convolution decomposition, recognition speed, model compression

CLC number: