Computer and Modernization (计算机与现代化) ›› 2024, Vol. 0 ›› Issue (09): 101-106. doi: 10.3969/j.issn.1006-2475.2024.09.017

• Database and Data Mining •

Short Text Classification Combining Attention Mechanism and Mengzi Model

  1. (1. School of Electrical and Information Engineering, Northeast Petroleum University, Daqing 163318, China;
    2. School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, China)
  • Online: 2024-09-27 Published: 2024-09-29
  • Supported by:
    National Natural Science Foundation of China (61402099, 61702093)

Abstract: Mining useful information from text with short text classification techniques is an active research direction. To address the sparsity of feature information in short texts and the difficulty of extracting it, a short text classification model named Mengzi-ADCBU is proposed. The model uses the Mengzi pre-trained model to convert the input text into the corresponding text representations. The resulting text vectors are then fed separately into an improved deep pyramid convolutional neural network and a bidirectional gated unit integrated with a multi-head attention mechanism to extract text feature information. The features extracted by the two branches are fused and passed to a fully connected layer and a Softmax function to complete the short text classification. Multiple groups of model comparison experiments are carried out on the public short text datasets THUCNews and SougouCS. The experimental results show that the proposed Mengzi-ADCBU model outperforms current mainstream models in accuracy, precision, recall, and F1 score, indicating good short text classification ability.
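As a rough illustration of the last stage described in the abstract (fusing the two branch outputs, then applying a fully connected layer and Softmax), here is a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not say how the features are fused, so concatenation is an assumption, and the function names, feature values, and weights are all hypothetical toy values.

```python
import math
from typing import List, Sequence, Tuple


def fuse(cnn_features: Sequence[float], rnn_features: Sequence[float]) -> List[float]:
    """Fuse the two branch outputs by concatenation.

    Assumption: the abstract only says the features are "fused";
    concatenation is one common choice, not necessarily the paper's.
    """
    return list(cnn_features) + list(rnn_features)


def fully_connected(x: Sequence[float],
                    weights: Sequence[Sequence[float]],
                    bias: Sequence[float]) -> List[float]:
    """Compute logits = W x + b for a single input vector."""
    if len(weights) != len(bias) or any(len(row) != len(x) for row in weights):
        raise ValueError("weight/bias shapes do not match the input vector")
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max logit before exponentiating."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(cnn_features: Sequence[float],
             rnn_features: Sequence[float],
             weights: Sequence[Sequence[float]],
             bias: Sequence[float]) -> Tuple[int, List[float]]:
    """Return (predicted class index, class probabilities)."""
    probs = softmax(fully_connected(fuse(cnn_features, rnn_features), weights, bias))
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


# Hypothetical toy inputs: 2 features per branch, 3 output classes.
toy_cnn = [1.0, 0.0]   # stands in for the pyramid-CNN branch output
toy_rnn = [0.0, 2.0]   # stands in for the attention + bidirectional gated branch output
toy_weights = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
]
toy_bias = [0.0, 0.0, 0.0]

label, probs = classify(toy_cnn, toy_rnn, toy_weights, toy_bias)
```

In a real model the weights would be learned during training and the two feature vectors would come from the network branches; only the data flow of the final stage is shown here.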

Key words: short text, multi-head attention, deep pyramid convolutional neural network, bidirectional gated unit

CLC Number: