Somatosensory Interaction Method Based on Deep Learning

doi:10.3969/j.issn.1006-2475.2019.02.002

Computer and Modernization, 2019, Vol. 0, Issue (02): 7-.

(College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China)

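Received: 2018-07-12 Online: 2019-02-25 Published: 2019-02-26
About the authors: TANG Hui (b. 1992), male, from Xinyang, Henan, master's student, research interests: human-computer interaction and deep learning, E-mail: 201923668@qq.com; WANG Qing (b. 1977), female, from Beijing, associate professor, Ph.D., research interests: human-computer interaction and virtual reality, E-mail: wangqingait@cau.edu.cn; CHEN Hong (b. 1976), male, from Beijing, associate professor, Ph.D., research interests: human-computer interaction and virtual reality, E-mail: norman_chen@263.net; GUO Hao (b. 1985), male, from Xianyang, Shaanxi, lecturer, Ph.D., research interests: virtual reality technology and 3D reconstruction, E-mail: guohaolys@cau.edu.cn.
Supported by:
National Key Technology R&D Program of China (2015BAH28F01)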


Abstract

Abstract: Microsoft announced in October 2017 that it was permanently discontinuing the Kinect, leaving the field of somatosensory interaction in urgent need of a replacement. The method proposed here reads a video stream in real time from an ordinary monocular camera. First, a Faster-RCNN network detects the position of the human body and draws a bounding box around it. The non-maximum suppression (NMS) algorithm is modified: a linear weighting function lowers the score of any detection box whose IOU with the currently selected box exceeds the threshold, instead of setting that score to zero. Second, the resulting detection boxes are fed into the CPM (Convolutional Pose Machines) human keypoint detection network, which outputs the coordinates of the person's full-body skeleton points; Center Loss is introduced to increase the intra-class compactness and inter-class separation of the keypoint features. Finally, template matching on the recognition results generates the control commands for somatosensory interaction. The method reduces dependence on specialized equipment, simplifies somatosensory interaction, and is valuable both for promoting the adoption of somatosensory interaction and for broadening the scope of human-computer interaction.

Keywords: somatosensory interaction, Faster-RCNN, CPM, non-maximum suppression, Center Loss
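
The modified suppression step described in the abstract (lowering the score of an overlapping box with a linear weight rather than zeroing it) corresponds to what is commonly called linear Soft-NMS. The following is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the function names `iou` and `linear_soft_nms`, the IOU threshold of 0.3, and the score floor of 0.001 are all illustrative choices that do not come from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def linear_soft_nms(
    boxes: List[Box],
    scores: List[float],
    iou_threshold: float = 0.3,   # illustrative value, not from the paper
    score_floor: float = 0.001,   # illustrative value, not from the paper
) -> List[Tuple[Box, float]]:
    """Return (box, score) pairs in the order they were selected."""
    remaining = list(zip(boxes, scores))
    kept: List[Tuple[Box, float]] = []
    while remaining:
        # Select the highest-scoring box that is still in play.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best_idx)
        kept.append((best_box, best_score))

        rescored = []
        for box, score in remaining:
            overlap = iou(best_box, box)
            if overlap > iou_threshold:
                # Linear weighting: decay the score instead of zeroing it.
                score *= 1.0 - overlap
            if score >= score_floor:
                rescored.append((box, score))
        remaining = rescored
    return kept


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    for box, score in linear_soft_nms(demo_boxes, demo_scores):
        print(box, round(score, 4))
```

In the demo, the second box overlaps the first heavily, so classic NMS would discard it; here it survives with a reduced score and is ranked after the non-overlapping third box. This is the behaviour that helps when two people stand close together in the camera frame.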

CLC number: TP391

Cite this article: TANG Hui, WANG Qing, CHEN Hong, GUO Hao. Somatosensory Interaction Method Based on Deep Learning[J]. Computer and Modernization, 2019, 0(02): 7-.

