Computer and Modernization ›› 2021, Vol. 0 ›› Issue (07): 29-37.
Online:
2021-08-02
Published:
2021-08-02
About the authors:
JIA Peng-tao (b. 1977), female, from Pucheng, Shaanxi; professor, Ph.D.; research interests: deep learning, data mining, coal-mine safety visualization; e-mail: jiapengtao@xust.edu.cn. SUN Wei (b. 1995), female, master's student; research interest: deep learning; e-mail: 153365088@qq.com.
Funding:
Abstract: As the Internet continues to grow, the volume of text data online keeps increasing. Classifying these data effectively makes it easier to mine valuable information from them, so the management and integration of text data are of great importance. Text classification is a fundamental task in natural language processing, applied mainly in areas such as public-opinion monitoring and news categorization, with the goal of organizing and sorting text resources. Deep-learning-based text classification has shown strong performance on text data. This paper reviews the deep learning algorithms used for text classification in detail, groups them by algorithm family, analyzes the characteristics of each, and concludes with future research directions for deep learning in text classification.
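To make the surveyed model families concrete, the sketch below shows the simplest deep-learning-adjacent text classifier in the survey's scope: a fastText-style bag-of-words model with a single softmax layer trained by gradient descent (cf. Joulin et al., "Bag of tricks for efficient text classification"). The toy corpus, class labels, and hyperparameters are illustrative assumptions, not taken from the paper:

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

class BowClassifier:
    """Minimal bag-of-words softmax text classifier (no hidden layer)."""

    def __init__(self, vocab, n_classes, lr=0.5):
        self.vocab = {w: i for i, w in enumerate(vocab)}
        self.n_classes = n_classes
        self.lr = lr
        # One weight per (word, class) pair, plus a bias per class.
        self.w = [[0.0] * n_classes for _ in vocab]
        self.b = list([0.0] * n_classes)

    def _probs(self, tokens):
        # Sum the weight rows of the document's words (bag-of-words score).
        z = list(self.b)
        for t in tokens:
            i = self.vocab.get(t)
            if i is not None:
                for c in range(self.n_classes):
                    z[c] += self.w[i][c]
        return softmax(z)

    def train(self, data, epochs=50):
        # Plain SGD on cross-entropy: gradient of softmax is (p - one_hot).
        for _ in range(epochs):
            for tokens, label in data:
                p = self._probs(tokens)
                for c in range(self.n_classes):
                    grad = p[c] - (1.0 if c == label else 0.0)
                    self.b[c] -= self.lr * grad
                    for t in tokens:
                        i = self.vocab.get(t)
                        if i is not None:
                            self.w[i][c] -= self.lr * grad

    def predict(self, tokens):
        p = self._probs(tokens)
        return max(range(self.n_classes), key=lambda c: p[c])

# Toy usage: class 0 = "sports", class 1 = "tech" (hypothetical corpus).
data = [
    (["game", "team", "win"], 0),
    (["score", "match", "team"], 0),
    (["code", "software", "bug"], 1),
    (["network", "model", "code"], 1),
]
vocab = sorted({w for toks, _ in data for w in toks})
clf = BowClassifier(vocab, 2)
clf.train(data)
```

The deep models the survey covers (CNN, RNN/LSTM, attention, BERT) replace the bag-of-words scoring step with learned contextual representations, but the training loop (softmax output, cross-entropy gradient) stays the same in spirit.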
JIA Peng-tao, SUN Wei. A Survey of Text Classification Based on Deep Learning[J]. Computer and Modernization, 2021, 0(07): 29-37.