Computer and Modernization ›› 2017, Vol. 0 ›› Issue (7): 48-52.doi: 10.3969/j.issn.1006-2475.2017.07.009

Previous Articles     Next Articles

Multi-modal Action Recognition Based on Deep Learning Framework

  

  1. (School of Computer Science and Engineering, Nanjing University of Science & Technology, Nanjing 210094, China)
  • Received:2016-11-14 Online:2017-07-20 Published:2017-07-20

Abstract: This paper proposes an approach for multi-modal action recognition based on deep neural networks. In order to process different modal video information, different artificial networks are utilized and combined to exploit the multi-modal features. We mainly consider the static and dynamic modalities of human action. With the assistance of Microsoft Kinect sensor camera, the visual and depth skeleton data of video can be captured simultaneously. For the static RGB information, we implement Convolutional Neural Networks, while for the dynamic information we use Recurrent Neural Networks. Finally, we combine the extraction features through these two networks and train the action classifier. The experiment results on the MSR 3D datasets show the effectiveness of our method.

Key words: deep learning, multi-modality, action recognition

CLC Number: