Computer and Modernization

• Image Processing •

Multi-modal Action Recognition Based on Deep Learning Framework

  HAN Min-jie

  1. (南京理工大学计算机科学与工程学院,江苏南京210094)
  • 收稿日期:2016-11-14 出版日期:2017-07-20 发布日期:2017-07-20
  • 作者简介:韩敏捷(1990-),男,江苏南京人,南京理工大学计算机科学与工程学院硕士研究生,研究方向:计算机视觉,动作行为识别。
  • 基金资助:
    国家自然科学基金资助项目(61672285)

Abstract: This paper proposes a multi-modal action recognition approach based on deep neural networks. According to the characteristics of each modality, a different deep network is applied to each kind of video information, and the networks are combined to exploit multi-modal features for action recognition. We mainly consider the static and dynamic modalities of human action: with a Microsoft Kinect multi-sensor camera, the conventional RGB video and the corresponding depth skeleton data can be captured simultaneously. A Convolutional Neural Network is used for the static RGB information, and a Recurrent Neural Network for the dynamic skeleton information. Finally, the features extracted by the two networks are fused to recognize and classify actions. Experimental results on the MSR 3D action dataset show that the proposed method achieves good classification performance.
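
A minimal PyTorch sketch of the two-stream design described in the abstract (not code from the paper): a small CNN encodes a static RGB frame, an LSTM is used as the recurrent network over flattened 3-D joint coordinates, and the two feature vectors are concatenated before a linear classifier. The class name, layer sizes, input resolution, and joint count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiModalActionNet(nn.Module):
    """Sketch of CNN (static RGB) + RNN (dynamic skeleton) feature fusion.
    All architectural choices here are assumptions, not the paper's exact model."""

    def __init__(self, num_joints=20, num_classes=20):
        super().__init__()
        # Static stream: small CNN over one RGB frame (3 x 112 x 112 assumed).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),          # -> (N, 64)
        )
        # Dynamic stream: LSTM over a sequence of flattened 3-D joint coordinates.
        self.rnn = nn.LSTM(input_size=num_joints * 3, hidden_size=128,
                           batch_first=True)
        # Fusion: concatenate the two modality features, then classify.
        self.classifier = nn.Linear(64 + 128, num_classes)

    def forward(self, rgb, skeleton):
        # rgb: (N, 3, H, W); skeleton: (N, T, num_joints * 3)
        static_feat = self.cnn(rgb)
        _, (h_n, _) = self.rnn(skeleton)
        dynamic_feat = h_n[-1]                               # last hidden state
        return self.classifier(torch.cat([static_feat, dynamic_feat], dim=1))

# Example forward pass with random tensors shaped like one Kinect-style batch.
model = MultiModalActionNet()
logits = model(torch.randn(4, 3, 112, 112), torch.randn(4, 30, 60))
print(logits.shape)  # torch.Size([4, 20])
```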

Key words: deep learning, multi-modality, action recognition

CLC Number: