Computer and Modernization

• Image Processing

Somatosensory Interaction Method Based on Deep Learning

  

  1. (College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China)
  • Received: 2018-07-12 Online: 2019-02-25 Published: 2019-02-26
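  • About the authors: TANG Hui (b. 1992), male, from Xinyang, Henan, master's student, research interests: human-computer interaction and deep learning, E-mail: 201923668@qq.com; WANG Qing (b. 1977), female, from Beijing, associate professor, Ph.D., research interests: human-computer interaction and virtual reality, E-mail: wangqingait@cau.edu.cn; CHEN Hong (b. 1976), male, from Beijing, associate professor, Ph.D., research interests: human-computer interaction and virtual reality, E-mail: norman_chen@263.net; GUO Hao (b. 1985), male, from Xianyang, Shaanxi, lecturer, Ph.D., research interests: virtual reality technology and 3D reconstruction, E-mail: guohaolys@cau.edu.cn.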
  • Funding:
    National Key Technology R&D Program of China (2015BAH28F01)

Somatosensory Interaction Method Based on Deep Learning

  1. (College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China)
  • Received: 2018-07-12 Online: 2019-02-25 Published: 2019-02-26

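Abstract: Since Microsoft announced in October 2017 that it was permanently discontinuing the Kinect, the field of somatosensory interaction has urgently needed a replacement. This paper reads a video stream in real time from an ordinary monocular camera and uses a Faster-RCNN network to detect the human body and mark it with a bounding box. The non-maximum suppression algorithm is improved by introducing a linear weighting function that lowers the scores of detection boxes whose IOU exceeds the threshold instead of setting them to zero. Next, the resulting detection boxes are fed into the CPM human key point detection network, which outputs the coordinates of the whole-body skeleton points; Center Loss is introduced to increase the intra-class cohesion and the inter-class separation of the key point features. Finally, control instructions for somatosensory interaction are generated from the recognition results by template matching. The method reduces the dependence on specialized equipment and simplifies somatosensory interaction, which is valuable for popularizing somatosensory interaction and broadening the use of human-computer interaction.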

Keywords: somatosensory interaction, Faster-RCNN, CPM, non-maximum suppression, Center Loss
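The improved non-maximum suppression step described in the abstract can be sketched as follows. The abstract does not state the weighting function, so this sketch assumes the linear Soft-NMS form, in which a box that overlaps the currently selected box with IoU above the threshold has its score multiplied by (1 - IoU). The function names, the box format (x1, y1, x2, y2), and the two threshold values are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def linear_soft_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_thresh: float = 0.3,     # assumed value, not from the paper
    score_thresh: float = 0.001  # assumed value, not from the paper
) -> List[Tuple[Box, float]]:
    """Return (box, score) pairs kept after linearly weighted NMS."""
    candidates = list(zip(boxes, scores))
    kept: List[Tuple[Box, float]] = []
    while candidates:
        # Select the remaining box with the highest score.
        best_idx = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best_idx)
        kept.append((best_box, best_score))
        rescored = []
        for box, score in candidates:
            overlap = iou(best_box, box)
            if overlap > iou_thresh:
                # Decay the score linearly instead of setting it to zero.
                score *= 1.0 - overlap
            if score >= score_thresh:
                rescored.append((box, score))
        candidates = rescored
    return kept
```

With classical NMS the overlapping box would be discarded outright; here it survives with a reduced score, which can help when two people stand close together in the frame.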

Abstract: With Microsoft's announcement in October 2017 that Kinect would be permanently discontinued, the field of somatosensory interaction urgently needs a replacement. This paper uses an ordinary monocular camera to read a video stream in real time. A Faster-RCNN network detects the human body and marks it with a bounding box. The non-maximum suppression algorithm is improved: a linear weighting function reduces the score of any detection box whose IOU exceeds the threshold instead of setting it to zero. The detection boxes are then passed to the CPM human key point detection network, which outputs the coordinates of the whole-body skeleton points, and Center Loss is introduced to increase the intra-class cohesion and the inter-class separation of the key point features. Finally, a template matching method generates the control instructions for somatosensory interaction from the recognition results. The method reduces the dependence on specialized equipment and simplifies somatosensory interaction, and it is valuable for popularizing somatosensory interaction and broadening the scope of human-computer interaction.

Key words: somatosensory interaction, Faster-RCNN, CPM, non-maximum suppression, center loss
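The Center Loss term named in the abstract can be illustrated with a small sketch. The abstract does not say how the loss is attached to the CPM network, so the code below only shows the commonly used formulation: half the sum of squared distances between each feature vector and the center of its class, with each center moved toward its class members by a rate alpha. Treating each key point type as a class, and the names `center_loss`, `update_centers`, and `alpha`, are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def center_loss(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    centers: Dict[int, List[float]],
) -> float:
    """Half the sum of squared distances from each feature to its class center."""
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return 0.5 * total


def update_centers(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    centers: Dict[int, List[float]],
    alpha: float = 0.5,  # assumed center learning rate
) -> Dict[int, List[float]]:
    """Return new centers, each moved toward the samples of its class."""
    new_centers = {k: list(v) for k, v in centers.items()}
    for j, c in centers.items():
        members = [x for x, y in zip(features, labels) if y == j]
        if not members:
            continue  # no sample of this class in the batch
        for d in range(len(c)):
            # Averaged offset; the +1 keeps the step bounded for small classes.
            delta = sum(c[d] - x[d] for x in members) / (1 + len(members))
            new_centers[j][d] = c[d] - alpha * delta
    return new_centers
```

Minimizing this term pulls features of the same key point type together, while the main detection loss keeps different types apart; in training the two are typically combined as a weighted sum.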

CLC number: