Computer and Modernization

• Image Processing

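Gesture Recognition Method Based on Deformable Convolution Neural Network

  1. (College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China)
  • Online: 2018-04-28 Published: 2018-05-02
  • About the authors: Su Junxiong (苏军雄, born 1995), male, from Zhongshan, Guangdong, undergraduate student at the College of Mathematics and Informatics, South China Agricultural University, research interest: image processing; Jian Xueting (见雪婷, born 1995), female, undergraduate student, research interest: image processing; Liu Wei (刘玮, born 1995), female, from Guangzhou, Guangdong, undergraduate student, research interest: image processing; Hua Junda (华俊达, born 1999), male, from Maoming, Guangdong, undergraduate student, research interest: image processing; corresponding author: Zhang Shengxiang (张胜祥, born 1969), male, associate professor, Ph.D., research interest: nonlinear systems theory.
  • Funding:
    2016 Provincial Undergraduate Innovation Training Program (201610564356); Guangzhou Science and Technology Program (201707010031)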

Abstract: Convolutional neural networks (CNNs) have strong feature representation and learning capabilities, but the geometric transformations their modules can model are fixed. To address this, deformable convolution kernels are introduced into the VGG16 architecture to build a network named DCVGG for gesture recognition. The method feeds RGB image data directly into the network and, across different datasets, reaches an average gesture recognition rate above 97%. The deformable kernels improve network performance, increase the network's tolerance to variation and diversity in the sample objects, and enrich its feature representation. Compared with the traditional LeNet5 and VGG16 structures and with traditional hand-crafted feature extraction algorithms, DCVGG is deeper, more robust, and achieves a higher recognition rate. The method can serve as a reference for recognizing gestures against complex backgrounds and has some capacity for extension.

Key words: gesture recognition, deformable convolution, convolution neural network (CNN), convolution kernel, bilinear interpolation
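
The abstract names deformable convolution and bilinear interpolation without showing how they fit together. The following is a minimal NumPy sketch of the general idea, not the authors' DCVGG implementation: each tap of a 3×3 kernel is shifted by a fractional offset, and the feature map is read at that fractional position by bilinear interpolation. The function names `bilinear_sample` and `deformable_conv_at` are hypothetical, and the offsets are passed in by hand here, whereas a deformable convolution layer would normally learn them from the input.

```python
import numpy as np


def bilinear_sample(feature, y, x):
    """Read a 2-D feature map at a fractional (y, x) position.

    Positions outside the map contribute 0 (zero padding, an assumption).
    """
    h, w = feature.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    value = 0.0
    # Weighted sum over the four integer neighbours of (y, x).
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * feature[yi, xi]
    return value


def deformable_conv_at(feature, kernel, center, offsets):
    """Response of one 3x3 deformable kernel at a single output location.

    offsets has shape (3, 3, 2) and holds a (dy, dx) shift for each kernel
    tap; all-zero offsets reduce this to an ordinary 3x3 convolution tap sum.
    """
    cy, cx = center
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i, j]
            # Regular grid position (i - 1, j - 1) plus the per-tap offset.
            out += kernel[i, j] * bilinear_sample(
                feature, cy + (i - 1) + dy, cx + (j - 1) + dx
            )
    return out


# Illustrative data only: a 5x5 ramp and an averaging kernel.
feature = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.full((3, 3), 1.0 / 9.0)
zero_offsets = np.zeros((3, 3, 2))
half_offsets = np.full((3, 3, 2), 0.5)
regular = deformable_conv_at(feature, kernel, (2, 2), zero_offsets)
shifted = deformable_conv_at(feature, kernel, (2, 2), half_offsets)
```

With all offsets at zero the sampling grid is the usual rigid 3×3 window; non-zero offsets let the same kernel weights read from a deformed window, which is the extra geometric flexibility the abstract refers to.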

CLC Number: