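News Label Classification Based on BERT and Deep Equal Length Convolution

Computer and Modernization ›› 2021, Vol. 0 ›› Issue (08): 94-99.

(School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 511400, China)

Online: 2021-08-19 Published: 2021-08-19
About the authors: YANG Wen-hao (born 1997), male, from Shaoguan, Guangdong, master's student, research interests: deep learning, natural language processing, E-mail: 495839152@qq.com; LIU Guang-cong (born 1970), male, from Shaoguan, Guangdong, associate professor, master's supervisor, research interests: machine learning, Internet of Things, big data, E-mail: liugc@gdut.edu.cn; LUO Ke-jing (born 1996), male, from Yunfu, Guangdong, master's student, research interests: deep learning, recommender systems, E-mail: 953390179@qq.com.
Funding:
General Program of the National Natural Science Foundation of China (61672007)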

Abstract

Abstract: For the Chinese news label classification task on THUCNews, this paper proposes a news label classification model (DPCNN-BERT) that builds on the BERT pre-trained language model and combines multi-layer equal-length convolution with residual connections. First, each character in the news text is converted into a vector by looking it up in the Chinese vector table, and the vectors are fed into BERT to capture the full-text context. Then, an initial semantic extraction layer and deep equal-length convolutions capture the local context within the text. Finally, a single-layer fully connected neural network produces the predicted label for the whole news text. The model is compared experimentally with a convolutional neural network classifier (TextCNN), a recurrent neural network classifier (TextRNN), and other models. The results show that the proposed model reaches a prediction accuracy of 94.68% and an F1 score of 94.67%, outperforming the comparison models and validating its performance.

Key words: label classification, equal-length convolution, residual connection, BERT
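
The pipeline described in the abstract (contextual vectors, equal-length convolutions with residual connections, then a single fully connected layer) can be illustrated with a small dependency-free sketch. It is not the authors' implementation. The BERT output is replaced by a hand-written list of vectors, and the helper names (`equal_length_conv`, `residual_block`, `classify`), the ReLU placement, and the max pooling before the classifier are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]
Sequence = List[Vector]          # one vector per character position
Kernel = List[List[List[float]]]  # kernel[offset][in_channel][out_channel]


def equal_length_conv(x: Sequence, kernel: Kernel) -> Sequence:
    """1D convolution with symmetric zero padding.

    The padding is chosen so the output has exactly as many positions
    as the input, which is what "equal-length" refers to here.
    """
    width = len(kernel)
    assert width % 2 == 1, "odd kernel width keeps the padding symmetric"
    pad = width // 2
    in_ch = len(x[0])
    out_ch = len(kernel[0][0])
    zero = [0.0] * in_ch
    padded = [zero] * pad + x + [zero] * pad
    out = []
    for t in range(len(x)):
        vec = [0.0] * out_ch
        for k in range(width):
            for i in range(in_ch):
                for o in range(out_ch):
                    vec[o] += padded[t + k][i] * kernel[k][i][o]
        out.append(vec)
    return out


def relu(x: Sequence) -> Sequence:
    return [[max(0.0, v) for v in vec] for vec in x]


def residual_block(x: Sequence, kernel_a: Kernel, kernel_b: Kernel) -> Sequence:
    """Two equal-length convolutions plus a shortcut connection.

    Assumption: activation is applied before each convolution. The
    shortcut works only because both convolutions preserve the length
    and the channel count of x.
    """
    h = equal_length_conv(relu(x), kernel_a)
    h = equal_length_conv(relu(h), kernel_b)
    return [[a + b for a, b in zip(xv, hv)] for xv, hv in zip(x, h)]


def classify(x: Sequence, weights: List[Vector], bias: Vector) -> int:
    """Max-pool over positions (assumption), then one fully connected layer.

    weights[c] holds the weights of class c; the index of the largest
    logit is returned as the predicted label.
    """
    pooled = [max(channel) for channel in zip(*x)]
    logits = [sum(p * w for p, w in zip(pooled, row)) + b
              for row, b in zip(weights, bias)]
    return max(range(len(logits)), key=logits.__getitem__)


# Stand-in for BERT output: 4 character positions, 2 channels each.
features = [[1.0, -2.0], [0.5, 0.0], [3.0, 1.0], [-1.0, 2.0]]
# Width-3 kernel mapping 2 channels to 2 channels (arbitrary values).
kernel = [[[0.1, 0.0], [0.0, 0.1]],
          [[0.5, 0.0], [0.0, 0.5]],
          [[0.1, 0.0], [0.0, 0.1]]]
hidden = residual_block(features, kernel, kernel)
label = classify(hidden, weights=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.0])
print(len(features), len(hidden), label)
```

Because every convolution preserves the sequence length, blocks of this kind can be stacked to any depth while the shortcut addition stays well defined, which is the property the residual connections rely on.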

YANG Wen-hao, LIU Guang-cong, LUO Ke-jing. News Label Classification Based on BERT and Deep Equal Length Convolution [J]. Computer and Modernization, 2021, 0(08): 94-99.

