Computer and Modernization ›› 2022, Vol. 0 ›› Issue (09): 51-59.
Published:
2022-09-22
Online:
2022-09-22
About the authors:
LI Yi (b. 1984), male, from Binzhou, Shandong; postdoctoral researcher, Ph.D.; research interests: image processing, computer vision; E-mail: yili6251@163.com. WEI Jian-guo (b. 1971), male; professor, doctoral supervisor, Ph.D.; research interests: intelligent human-computer interaction, voiceprint recognition, speech recognition; E-mail: jianguo@tju.edu.cn. LIU Guan-wei (b. 1983), male; senior engineer, M.S.; research interests: technical models for cash-handling equipment, image processing; E-mail: liugw@cashwaytech.com.
Abstract: Model pruning algorithms trim redundant neurons from deep neural networks according to various criteria, compressing a model as much as possible without loss of accuracy, thereby reducing storage and increasing inference speed. This paper first summarizes and categorizes the current state and main directions of model-pruning research, covering the granularity of pruning, methods for evaluating the importance of pruned elements, pruning sparsity, the theoretical foundations of pruning, and pruning for different tasks. Representative pruning algorithms from recent years are then described in detail. Finally, prospects for future research in this field are discussed.
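As a concrete illustration of the importance criteria the abstract mentions (not taken from the paper itself), the simplest and most common one is weight magnitude: weights whose absolute value falls below a data-dependent threshold are zeroed out. A minimal, dependency-free sketch:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    `weights` is a flat list of floats; returns a new list of the same length.
    This is an illustrative sketch of the magnitude criterion, not any specific
    algorithm from the survey.
    """
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Threshold = k-th smallest absolute value; everything at or below it is cut.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [w if abs(w) > threshold else 0.0 for w in weights]

w = [0.9, -0.05, 0.3, -0.7, 0.01, 0.6, -0.2, 0.1]
pruned = magnitude_prune(w, 0.5)
print(pruned)  # prints [0.9, 0.0, 0.3, -0.7, 0.0, 0.6, 0.0, 0.0]
```

In practice the surviving weights are then fine-tuned (or the prune/retrain cycle is iterated), and structured variants apply the same idea at the level of whole filters or channels rather than individual weights.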
LI Yi, WEI Jian-guo, LIU Guan-wei. Survey of Model Pruning Algorithms[J]. Computer and Modernization, 2022, 0(09): 51-59.