Computer and Modernization ›› 2021, Vol. 0 ›› Issue (09): 83-89.

• Artificial Intelligence •

Compression Method of CNN Model for Parameter Reduction

  ZHU Xuechen, CHEN Sanlin, CAI Gang, HUANG Zhihong

  (1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China;
    2. University of Chinese Academy of Sciences, Beijing 100049, China)
  • Online: 2021-09-14  Published: 2021-09-14
  • About the authors: ZHU Xuechen (b. 1996), female, from Xinxiang, Henan, master's student; research interest: FPGA neural network accelerator design; E-mail: 18703862561@163.com. CHEN Sanlin (b. 1996), male, master's student; research interest: digital integrated circuit design; E-mail: chensanlin19@mails.ucas.ac.cn. Corresponding author: CAI Gang (b. 1980), male, senior engineer, Ph.D.; research interests: large-scale integrated circuit design, artificial intelligence; E-mail: caig@mail.ie.ac.cn. HUANG Zhihong (b. 1984), male, senior engineer, Ph.D.; research interests: programmable chip design technology, FPGA neural network accelerator design; E-mail: huangzhihong@mail.ie.ac.cn.
  • Funding:
    National Natural Science Foundation of China (61704173)

Abstract: The ever-growing parameter scale of convolutional neural network models makes them difficult to deploy at scale on embedded devices with limited computing and storage resources. To address this problem, a compression method that reduces the parameter scale of convolutional neural network models is proposed. Analysis shows that the number of parameters in a convolution layer is determined by the number of input and output feature maps and by the kernel size, while the fully connected layers hold a large number of parameters that are difficult to reduce substantially. The method therefore reduces the number of input and output feature maps through grouped convolution, reduces the kernel size through convolution splitting, and replaces the fully connected layers with global average pooling layers to eliminate their large parameter counts. These techniques are applied to LeNet5 and AlexNet. Experimental results show that, under the combined compression method at its maximum setting, the parameter scale of LeNet5 is reduced by 97% while the recognition accuracy drops by less than 2 percentage points, and the parameter scale of the compressed AlexNet is reduced by 95% while the recognition accuracy improves by 6.72 percentage points. The method can therefore greatly reduce the number of model parameters while preserving the accuracy of the convolutional neural network.
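For concreteness, the three techniques named in the abstract can be sketched in a few lines of PyTorch. A convolution layer with C_in input maps, C_out output maps, a K×K kernel, and g groups holds C_in·C_out·K²/g weights plus C_out biases, which is why shrinking the feature-map counts, the kernel size, or the fully connected head each cuts the parameter scale. The sketch below is illustrative only: the channel width (128), group count (4), and 10-class head are assumed values, not the configurations reported in the paper.

    import torch.nn as nn

    def count_params(m):
        # Sum of all trainable weights and biases in a module.
        return sum(p.numel() for p in m.parameters() if p.requires_grad)

    # Baseline: standard 5x5 convolution, 128 input and 128 output feature maps.
    baseline = nn.Conv2d(128, 128, kernel_size=5, padding=2)  # 128*128*25 + 128 params

    # Grouped convolution: with groups=4 each filter sees only 128/4 = 32 input maps,
    # so the weight count shrinks by roughly the group factor.
    grouped = nn.Conv2d(128, 128, kernel_size=5, padding=2, groups=4)

    # Convolution splitting: two stacked 3x3 convolutions cover the same 5x5
    # receptive field with 2*9 = 18 instead of 25 weights per channel pair.
    split = nn.Sequential(
        nn.Conv2d(128, 128, kernel_size=3, padding=1),
        nn.Conv2d(128, 128, kernel_size=3, padding=1),
    )

    # Global average pooling head: averaging each feature map to one value replaces
    # the flatten + fully-connected stage, leaving only a small linear classifier.
    gap_head = nn.Sequential(
        nn.AdaptiveAvgPool2d(1),  # N x 128 x H x W -> N x 128 x 1 x 1, no parameters
        nn.Flatten(),
        nn.Linear(128, 10),       # 128*10 + 10 = 1290 parameters
    )

    for name, m in [("5x5 baseline", baseline), ("grouped, g=4", grouped),
                    ("two 3x3 (split)", split), ("GAP head", gap_head)]:
        print(f"{name}: {count_params(m):,} parameters")

Under these assumed sizes the script prints roughly 409,728, 102,528, 295,168, and 1,290 parameters respectively, showing how each measure attacks a different source of parameters: grouping the channel connections, splitting the kernel, and removing the fully connected head.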

Key words: convolutional neural networks, parameter scale, grouped convolution, convolution splitting, global average pooling