[1] KALCHBRENNER N, GREFENSTETTE E, BLUNSOM P. A convolutional neural network for modelling sentences[C]// Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. 2014:655-665.
[2] SHEN D H, ZHANG Y Z, HENAO R, et al. Deconvolutional latent-variable model for text sequence matching[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 2018.
[3] WU H Y, LIU Y, SHI S Y. Modularized syntactic neural networks for sentence classification[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020. DOI: 10.18653/v1/2020.emnlp-main.222.
[4] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998,86(11):2278-2324.
[5] 朱雪晨,陈三林,蔡刚,等. 降低参数规模的卷积神经网络模型压缩方法[J]. 计算机与现代化, 2021(9):83-89.
[6] 刘奇旭,刘心宇,罗成,等. 基于双向循环神经网络的安卓浏览器指纹识别方法[J]. 计算机研究与发展, 2020,57(11):2294-2311.
[7] 夏瑜潞. 循环神经网络的发展综述[J]. 电脑知识与技术, 2019,15(21):182-184.
[8] 石磊,王明宇,宋哲理,等. 自注意力机制和BiGRU相结合的文本分类研究[J/OL]. 小型微型计算机系统:1-10[2021-11-18]. https://kns-cnki-net.webvpn.ecut.edu.cn/kcms/detail/21.1106.TP.20211102.1155.010.html.
[9] 罗嘉,王乐豪,涂姗姗,等. 基于LSTM-BLS的突发气象灾害事件中公众情感倾向分析[J/OL]. 南京信息工程大学学报(自然科学版):1-13[2021-06-30]. https://kns-cnki-net.webvpn.ecut.edu.cn/kcms/detail/32.1801.N.20210628.1426.002.html.
[10] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997,9(8):1735-1780.
[11] DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv:1810.04805, 2018.
[12] SUN C, QIU X P, XU Y G, et al. How to fine-tune BERT for text classification?[C]// 2019 China National Conference on Chinese Computational Linguistics. 2019:194-206.
[13] JIAO X Q, YIN Y C, SHANG L F, et al. TinyBERT: Distilling BERT for natural language understanding[J]. arXiv preprint arXiv:1909.10351, 2019.
[14] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[C]// The 3rd International Conference on Learning Representations. 2015.
[15] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017:6000-6010.
[16] 马月梅,陈海英,刘国军. 彩色图像质量评价的广义平均池化策略[J]. 激光与光电子学进展, 2018,55(2):206-213.
[17] 刘国军,高丽霞,陈丽奇. 广义平均的全参考型图像质量评价池化策略[J]. 光学精密工程, 2017,25(3):742-748.
[18] 王静. 基于最大池化的图双注意力网络研究及应用[D]. 石家庄:河北师范大学, 2020.
[19] SHU B, REN F J, BAO Y W. Investigating LSTM with K-Max pooling for text classification[C]// 2018 11th International Conference on Intelligent Computation Technology and Automation (ICICTA). 2018:31-34.
[20] ZHOU P, QI Z Y, ZHENG S C, et al. Text classification improved by integrating bidirectional LSTM with two-dimensional max pooling[J]. arXiv preprint arXiv:1611.06639, 2016.
[21] CONNEAU A, SCHWENK H, BARRAULT L, et al. Very deep convolutional networks for text classification[J]. arXiv preprint arXiv:1606.01781, 2016.
[22] SRIVASTAVA N, HINTON G, KRIZHEVSKY A, et al. Dropout: A simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014,15(1):1929-1958.
[23] KINGMA D P, BA J. Adam: A method for stochastic optimization[J]. arXiv preprint arXiv:1412.6980, 2014.
[24] LIU Y H, OTT M, GOYAL N, et al. RoBERTa: A robustly optimized BERT pretraining approach[J]. arXiv preprint arXiv:1907.11692, 2019.
[25] LAN Z Z, CHEN M D, GOODMAN S, et al. ALBERT: A lite BERT for self-supervised learning of language representations[J]. arXiv preprint arXiv:1909.11942, 2019.
[26] JOSHI M, CHEN D Q, LIU Y H, et al. SpanBERT: Improving pre-training by representing and predicting spans[J]. Transactions of the Association for Computational Linguistics, 2020,8:64-77.
[27] SUN Y, WANG S H, LI Y K, et al. ERNIE: Enhanced representation through knowledge integration[J]. arXiv preprint arXiv:1904.09223, 2019.