Dynamic Gesture Recognition Based on 3D Convolutional Neural Networks

doi:10.3969/j.issn.1006-2475.2019.11.015

Abstract

In video recognition, traditional 2D convolutional neural networks tend to lose feature information along the temporal dimension, which reduces recognition accuracy. This paper uses a 3D convolutional neural network as the base framework, with 3D convolution kernels extracting the spatiotemporal features of videos, and proposes an ensemble of multiple 3D convolutional neural network models to recognize dynamic gestures. To improve convergence speed and training stability, the network is optimized with Batch Normalization (BN), which shortens training time. Experimental results show that the proposed method performs well on dynamic gesture recognition, reaching a recognition accuracy of 98.06% on the Sheffield Kinect Gesture (SKIG) dataset. The recognition rate is higher than that obtained with RGB information alone, with depth information alone, or with a traditional 2D CNN, which verifies the feasibility and effectiveness of the proposed method.
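The three ingredients named in the abstract (3D convolution, batch normalization, and a multi-model ensemble) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the toy clip, the temporal-difference kernel, and the per-model probabilities are all hypothetical, and averaging class probabilities (soft voting) is an assumed fusion rule, since the abstract does not state how the models are combined.

```python
from itertools import product
from statistics import fmean, pvariance


def conv3d_valid(clip, kernel):
    """3D cross-correlation (stride 1, no padding) of a T x H x W clip."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = [[[0.0] * (W - kw + 1) for _ in range(H - kh + 1)]
           for _ in range(T - kt + 1)]
    for t, y, x in product(range(T - kt + 1), range(H - kh + 1), range(W - kw + 1)):
        # The kernel spans several frames, so each output mixes time and space.
        out[t][y][x] = sum(
            clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
            for dt, dy, dx in product(range(kt), range(kh), range(kw))
        )
    return out


def batch_norm(values, eps=1e-5):
    """Normalize a batch of activations to zero mean and unit variance.

    The learned scale and shift of full batch normalization are omitted.
    """
    mean = fmean(values)
    var = pvariance(values, mean)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


def ensemble_predict(prob_lists):
    """Soft voting (assumed): average class probabilities, then take argmax."""
    n_classes = len(prob_lists[0])
    avg = [fmean(p[c] for p in prob_lists) for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__), avg


# Hypothetical 3-frame, 2x2 clip in which a bright pixel appears and spreads.
clip = [
    [[0, 0], [0, 0]],
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
]
# A 2x1x1 kernel that subtracts the previous frame from the current one.
kernel = [[[-1.0]], [[1.0]]]
motion = conv3d_valid(clip, kernel)

# Hypothetical class probabilities from three separately trained models.
label, avg = ensemble_predict([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
])
```

Because the kernel extends over two frames, `motion` responds only where pixels change between frames, which is the temporal information a per-frame 2D kernel cannot see. In practice a deep learning framework would supply these operations; the loops here only make the arithmetic explicit.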

Key words: 3D Convolutional Neural Network (3D CNN), optical flow, ensemble learning, deep learning, dynamic gesture recognition

CLC Number: TP391.41

GU Chen-nan, ZENG Xiao-qin. Dynamic Gesture Recognition Based on 3D Convolutional Neural Networks[J]. Computer and Modernization, doi: 10.3969/j.issn.1006-2475.2019.11.015.

