Computer and Modernization ›› 2021, Vol. 0 ›› Issue (07): 29-37.
About the authors:
JIA Peng-tao (b. 1977), female, a native of Pucheng, Shaanxi; professor, Ph.D.; research interests: deep learning, data mining, and coal mine safety visualization; e-mail: jiapengtao@xust.edu.cn. SUN Wei (b. 1995), female, master's student; research interest: deep learning; e-mail: 153365088@qq.com.
Online: 2021-08-02
Published: 2021-08-02
Abstract: With the continued growth of the Internet, the volume of text data online is increasing rapidly. Classifying these data effectively makes it easier to mine valuable information from them, so the management and integration of text data is of great importance. Text classification is a fundamental task in natural language processing, applied mainly in areas such as public opinion monitoring and news text classification, with the goal of organizing and categorizing text resources. Deep-learning-based text classification has shown strong performance on text data. This paper surveys in detail the deep learning algorithms used for text classification, groups them by algorithm family, analyzes the characteristics of each, and concludes with future research directions for deep learning in text classification.
JIA Peng-tao, SUN Wei. A Survey of Text Classification Based on Deep Learning[J]. Computer and Modernization, 2021, 0(07): 29-37.
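As a concrete illustration of the kind of model this survey covers, the sketch below implements a minimal convolutional text classifier in the spirit of Kim's TextCNN, one of the classic deep learning approaches to text classification. This is an illustrative sketch, not the authors' implementation: PyTorch is assumed, and the vocabulary size, embedding dimension, filter sizes, and class count are placeholder hyperparameters.

# Minimal TextCNN-style classifier sketch (illustrative assumptions only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128,
                 num_filters=100, filter_sizes=(3, 4, 5), num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # One 1-D convolution per filter size; each slides over the
        # sequence of word embeddings to extract n-gram features.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in filter_sizes
        )
        self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, token_ids):              # (batch, seq_len)
        x = self.embedding(token_ids)          # (batch, seq_len, embed_dim)
        x = x.transpose(1, 2)                  # Conv1d expects (batch, channels, seq_len)
        # Max-over-time pooling keeps the strongest feature per filter.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))  # (batch, num_classes)

# Usage example: classify a batch of two 20-token sequences.
logits = TextCNN()(torch.randint(0, 10000, (2, 20)))
print(logits.shape)  # torch.Size([2, 2])

Max-over-time pooling is what lets the fixed-size classifier head handle variable-length text: each filter reports only its strongest activation anywhere in the sequence.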