Computer and Modernization ›› 2021, Vol. 0 ›› Issue (07): 29-37.
About the authors:
JIA Peng-tao (b. 1977), female, a native of Pucheng, Shaanxi; professor, Ph.D.; research interests: deep learning, data mining, and coal mine safety visualization; e-mail: jiapengtao@xust.edu.cn. SUN Wei (b. 1995), female, master's student; research interest: deep learning; e-mail: 153365088@qq.com.
Online: 2021-08-02
Published: 2021-08-02
Abstract: With the continued growth of the Internet, the volume of text data online is increasing rapidly. Classifying these data effectively makes it easier to mine valuable information from them, so the management and integration of text data is of great importance. Text classification is a fundamental task in natural language processing, applied mainly in areas such as public opinion monitoring and news text classification, with the goal of organizing and categorizing text resources. Deep-learning-based text classification has shown strong performance on text data. This paper surveys in detail the deep learning algorithms used for text classification, groups them by algorithm family, analyzes the characteristics of each, and concludes with future research directions for deep learning in text classification.
JIA Peng-tao, SUN Wei. A Survey of Text Classification Based on Deep Learning[J]. Computer and Modernization, 2021, 0(07): 29-37.
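As a concrete illustration of the kind of model this survey covers, the sketch below implements a minimal convolutional text classifier in the spirit of Kim's TextCNN, one of the classic deep learning approaches to text classification. This is an illustrative sketch, not the authors' implementation: PyTorch is assumed, and the vocabulary size, embedding dimension, filter sizes, and class count are placeholder hyperparameters.

# Minimal TextCNN-style classifier sketch (illustrative assumptions only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128,
                 num_filters=100, filter_sizes=(3, 4, 5), num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # One 1-D convolution per filter size; each slides over the
        # sequence of word embeddings to extract n-gram features.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in filter_sizes
        )
        self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, token_ids):              # (batch, seq_len)
        x = self.embedding(token_ids)          # (batch, seq_len, embed_dim)
        x = x.transpose(1, 2)                  # Conv1d expects (batch, channels, seq_len)
        # Max-over-time pooling keeps the strongest feature per filter.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))  # (batch, num_classes)

# Usage example: classify a batch of two 20-token sequences.
logits = TextCNN()(torch.randint(0, 10000, (2, 20)))
print(logits.shape)  # torch.Size([2, 2])

Max-over-time pooling is what lets the fixed-size classifier head handle variable-length text: each filter reports only its strongest activation anywhere in the sequence.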