计算机与现代化 (Computer and Modernization) ›› 2022, Vol. 0 ›› Issue (10): 8-12.

• Artificial Intelligence •

Text Classification Based on ALBERT Combined with Bidirectional Network


  1. (College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China)
  • Online: 2022-10-20  Published: 2022-10-20
  • About the authors: HUANG Zhongxiang (b. 1994), male, from Qinzhou, Guangxi, master's student; research interests: deep learning, natural language processing; E-mail: huang-zhongx@qq.com. LI Ming (b. 1966), male, from Dazhu, Sichuan, professor; research interests: artificial intelligence, big data and e-commerce; E-mail: 55613163@qq.com.
  • Funding: National Natural Science Foundation of China (61877051); Chongqing Key Project of Graduate Education Reform (yjg182022); Chongqing Normal University Graduate Project (xyjg16009); Chongqing Normal University Education Reform Project (02020310-0420)

Abstract: To address the inability of current multi-label text classification algorithms to exploit deep textual information effectively, we propose the ABAT model: the ALBERT model extracts features capturing deep textual information, a bidirectional LSTM network is trained on these features, and an attention mechanism strengthens the classification. Experiments are carried out on the DuEE1.0 dataset released by Baidu. Compared with each baseline model, the proposed model achieves the best performance on every metric: Micro-Precision reaches 0.9625, Micro-F1 reaches 0.9033, and the Hamming loss drops to 0.0023. The experimental results show that the improved ABAT model performs the multi-label text classification task well.
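As a rough illustration of the pipeline the abstract describes (BiLSTM hidden states pooled by an attention mechanism, then independent per-label sigmoid outputs for multi-label prediction), the following is a minimal pure-Python sketch. The function names, the scoring vector `w`, and the label weight matrix are illustrative assumptions, not the paper's implementation:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_pool(hidden, w):
    """Pool a sequence of hidden vectors (e.g. BiLSTM outputs) into one context vector.

    hidden: list of T vectors, one per token; w: illustrative scoring vector.
    """
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in hidden]
    alpha = softmax(scores)  # attention weights, sum to 1
    dim = len(hidden[0])
    return [sum(a * h[i] for a, h in zip(alpha, hidden)) for i in range(dim)]

def predict_labels(context, label_weights, threshold=0.5):
    # Multi-label head: an independent sigmoid per label, thresholded
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    return [1 if sigmoid(sum(w * c for w, c in zip(wl, context))) > threshold else 0
            for wl in label_weights]
```

With identical hidden states the attention weights are uniform, so the pooled context equals the shared state; in the actual model the weights come from trained parameters and ALBERT-encoded inputs.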

Key words: multi-label, ALBERT pre-training, bidirectional network, attention mechanism
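The three metrics reported in the abstract follow their standard multi-label definitions: micro-averaging pools true/false positives and false negatives across all instance-label pairs, and Hamming loss is the fraction of label slots predicted incorrectly. A minimal reference implementation of these standard definitions, independent of the paper's code:

```python
def _micro_counts(y_true, y_pred):
    # Pool true positives, false positives, false negatives over every label slot
    tp = fp = fn = 0
    for row_t, row_p in zip(y_true, y_pred):
        for t, p in zip(row_t, row_p):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    return tp, fp, fn

def micro_precision(y_true, y_pred):
    tp, fp, _ = _micro_counts(y_true, y_pred)
    return tp / (tp + fp)

def micro_f1(y_true, y_pred):
    tp, fp, fn = _micro_counts(y_true, y_pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def hamming_loss(y_true, y_pred):
    # Fraction of instance-label pairs predicted incorrectly
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    return wrong / total
```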