Computer and Modernization ›› 2024, Vol. 0 ›› Issue (12): 34-39. doi: 10.3969/j.issn.1006-2475.2024.12.005

• Algorithm Design and Analysis •

  • Funding: National Natural Science Foundation of China (61872385, 61972417); Natural Science Foundation of Shandong Province (ZR2020MF005)

Gesture Recognition Method Based on WiFi and Prototypical Network

  1. (1. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China;
    2. College of Oceanography and Space Informatics, China University of Petroleum (East China), Qingdao 266580, China)
  • Online:2024-12-31 Published:2024-12-31



Abstract: WiFi-based gesture recognition plays an important role in touchless human-computer interaction. However, existing WiFi-based gesture recognition systems face the challenges of small data volume and poor cross-domain performance. To address these problems, the captured raw WiFi channel state information (CSI) is denoised with the CSI ratio, its phase is extracted and converted into CSI images, recasting gesture recognition as an image classification problem. The resulting images are then fed into a prototypical network (PN) for few-shot cross-domain gesture recognition, and an enhanced Convolutional Block Attention Module (CSI-CBAM) is added to the PN feature extraction network to improve gesture representation learning. Extensive experiments are conducted on the Widar3.0 dataset. The results show that when each class in the support set has four labeled samples, the system achieves average recognition accuracies of 93.54%, 91.28%, 91.99%, and 89.16% for cross-environment, cross-user, cross-location, and cross-orientation settings, respectively. With an average cross-domain accuracy above 90%, the proposed method needs only a small number of labeled samples to achieve high-accuracy cross-domain recognition.
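The few-shot classification step of a prototypical network can be sketched as follows. This is a generic NumPy illustration of the standard prototypical-network rule (class prototype = mean of support embeddings; queries assigned to the nearest prototype), not the authors' implementation; the embedding backbone and the CSI-CBAM attention module described in the abstract are omitted, and the function names are placeholders.

```python
import numpy as np

def prototypes(support_x, support_y):
    """Compute one prototype per class as the mean of its support embeddings.

    support_x: (N, D) array of embedded support samples.
    support_y: (N,) array of integer class labels.
    Returns (classes, protos) where protos[i] is the prototype of classes[i].
    """
    classes = np.unique(support_y)
    protos = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    return classes, protos

def classify(query_x, classes, protos):
    """Assign each query embedding to the class of its nearest prototype
    under squared Euclidean distance."""
    # (Q, 1, D) - (1, C, D) -> (Q, C) distance matrix via broadcasting
    d = ((query_x[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[d.argmin(axis=1)]
```

In the paper's setting, `support_x` would hold the embeddings of the four labeled CSI-image samples per gesture class from the target domain, and `query_x` the embeddings of unlabeled test samples.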

Key words: gesture recognition, channel state information, human-computer interaction, image classification, attention mechanism
