Computer and Modernization ›› 2024, Vol. 0 ›› Issue (01): 80-86. doi: 10.3969/j.issn.1006-2475.2024.01.013

• Artificial Intelligence •

A DNN Compression Method for Environmental Sound Classification on Microcontroller Unit

  1. (School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)
  • Online: 2024-01-23 Published: 2024-02-26
  • About the authors: Meng Na (孟娜, born 1997), female, from Hengshui, Hebei, master's student; research interest: edge computing; E-mail: 20120397@bjtu.edu.cn. Corresponding author: Fang Weiwei (方维维, born 1981), male, from Wuhu, Anhui, associate professor, Ph.D.; research interests: edge computing, cloud computing, and the Internet of Things; E-mail: fangww@bjtu.edu.cn. Lu Hongying (路红英, born 1963), female, from Anyang, Henan, senior engineer, bachelor's degree; research interest: edge computing; E-mail: hylu@bjtu.edu.cn.

Abstract: Environmental sound classification (ESC) is one of the most important topics in non-speech audio classification. In recent years, deep neural networks (DNNs) have driven considerable progress in ESC. However, DNNs are compute- and memory-intensive and cannot be deployed directly on IoT devices built around microcontroller units (MCUs). To address this problem, this paper proposes a DNN compression method for highly resource-constrained devices. Because the parameter count of a DNN model is too large for direct deployment, pruning is first used to compress the model substantially. To recover the accuracy lost through pruning, a knowledge distillation method based on the feature information of the model's intermediate layers is then designed. Tests on the public datasets UrbanSound8K and ESC-50 using an STM32F746ZG device show that the proposed method achieves a compression rate of up to 97% while maintaining good inference accuracy and speed.

Key words: environmental sound classification, edge computing, microcontroller unit, pruning, knowledge distillation, quantization
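
The abstract names two ingredients, pruning and distillation on intermediate-layer features, without giving their exact form. The sketch below illustrates the general idea only and should not be read as the paper's method. It assumes unstructured magnitude pruning and a mean-squared-error feature-matching loss, and the function names `magnitude_prune`, `compression_rate`, and `feature_distillation_loss` are hypothetical. Quantization, which appears in the keywords, is not covered.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    Assumption: unstructured magnitude pruning; the paper's actual
    pruning criterion is not stated in the abstract.
    """
    n_prune = round(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def compression_rate(weights):
    """Fraction of weights that are exactly zero."""
    zeros = sum(1 for w in weights if w == 0.0)
    return zeros / len(weights)


def feature_distillation_loss(teacher_feats, student_feats):
    """Mean squared error between matched intermediate features.

    Each argument is a list of per-layer feature vectors; the per-layer
    MSE is averaged over layers. Assumption: teacher and student features
    already have matching shapes at every selected layer.
    """
    total = 0.0
    for t, s in zip(teacher_feats, student_feats):
        total += sum((a - b) ** 2 for a, b in zip(t, s)) / len(t)
    return total / len(teacher_feats)


if __name__ == "__main__":
    weights = [float(i + 1) for i in range(100)]
    pruned = magnitude_prune(weights, 0.97)
    print(compression_rate(pruned))  # 97 of 100 weights removed
    print(feature_distillation_loss([[1.0, 2.0], [0.0]],
                                    [[1.0, 0.0], [1.0]]))
```

In a training loop, a loss of this kind would typically be added to the ordinary classification loss so that the pruned (student) network is pulled toward the unpruned (teacher) network's internal representations. How the two terms are weighted is a design choice the abstract does not specify.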

CLC Number: