Computer and Modernization ›› 2023, Vol. 0 ›› Issue (02): 66-71.

• Artificial Intelligence •

Text Summarization Generation Model Based on PGN-CL

  LIU Yaqing, ZHANG Haijun, LIANG Kejin, ZHANG Yu, WANG Yueyang

  1. (College of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054, China)
  • Online: 2023-04-10  Published: 2023-04-10
  • About the authors: LIU Yaqing (b. 1997), female, from Dalian, Liaoning, M.S., research interest: natural language processing, E-mail: 1109701435@qq.com; corresponding author ZHANG Haijun (b. 1973), male, from Siping, Jilin, professor, Ph.D., research interests: natural language processing, affective computing, artificial intelligence, E-mail: ustczhj@qq.com; LIANG Kejin (b. 1995), male, from Jincheng, Shanxi, master's student, research interest: natural language processing; ZHANG Yu (b. 1995), female, from Shangluo, Shaanxi, master's student, research interest: natural language processing; WANG Yueyang (b. 1995), male, from Cangzhou, Hebei, master's student, research interest: natural language processing.
  • Funding:
    Key Project of the National Natural Science Foundation of China-Xinjiang Joint Fund (U1703261)

Abstract: Abstractive text summarization models based on the Seq2Seq framework have achieved promising results, but most of them suffer from out-of-vocabulary words, repetition in the generated text, and exposure bias. To address these problems, this paper proposes PGN-CL, a pointer-generator network based on adversarial perturbation contrastive learning, to model the text summarization process. The model adopts the pointer-generator network (PGN) as its basic architecture to handle out-of-vocabulary words and repetition, and uses adversarial perturbation contrastive learning as a new training scheme to alleviate exposure bias. During training, perturbations are added to the target sequence and a contrastive loss is constructed to generate adversarial positive and negative samples: a negative sample is close to the target sequence in the embedding space but semantically very different from it, while a positive sample is semantically close to the target sequence but far from it in the embedding space. These hard-to-distinguish positive and negative samples guide the PGN model to learn more discriminative features in the feature space and to obtain more accurate summary representations. Experimental results on the LCSTS dataset show that the proposed model outperforms the comparison baselines on the ROUGE metrics, demonstrating that combining the pointer-generator network with adversarial perturbation contrastive learning effectively improves summary quality.

Key words: text summarization, pointer generator network, adversarial perturbation, contrastive learning
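
To make the training procedure described in the abstract concrete, the sketch below shows one way the adversarial positive and negative samples and the contrastive objective could be combined with the PGN likelihood loss. It is a minimal PyTorch sketch under stated assumptions: the wrapper pgn_model, its methods embed, neg_log_likelihood and summary_repr, and all hyperparameter values are hypothetical placeholders rather than the authors' published implementation.

    # Hypothetical sketch of the PGN-CL training objective: PGN negative
    # log-likelihood plus a contrastive loss over adversarial samples built by
    # perturbing the target-sequence embeddings.
    import torch
    import torch.nn.functional as F

    def pgn_cl_loss(pgn_model, src, tgt_ids, eps_neg=1.0, eps_pos=5.0, tau=0.1):
        """One training step's loss: PGN NLL + adversarial contrastive term."""
        # Embed the target sequence and track gradients w.r.t. the embeddings.
        tgt_embeds = pgn_model.embed(tgt_ids).detach().requires_grad_(True)

        # 1) Standard negative log-likelihood of the pointer-generator network,
        #    which already handles out-of-vocabulary words via its copy mechanism.
        nll = pgn_model.neg_log_likelihood(src, tgt_ids, tgt_embeds)

        # 2) Gradient of the NLL w.r.t. the target embeddings: the direction in
        #    which the loss grows fastest.
        grad = torch.autograd.grad(nll, tgt_embeds, retain_graph=True)[0]
        grad_dir = F.normalize(grad, dim=-1)

        # 3) Negative sample: a small step that increases the loss, so it stays
        #    close to the target in embedding space but drifts away semantically.
        neg_embeds = (tgt_embeds + eps_neg * grad_dir).detach()

        # 4) Positive sample: a large step in the loss-decreasing direction, so it
        #    moves far away in embedding space while roughly keeping the semantics.
        pos_embeds = (tgt_embeds - eps_pos * grad_dir).detach()

        # 5) Sequence-level summary representations (e.g. pooled decoder states).
        h_tgt = pgn_model.summary_repr(src, tgt_embeds)
        h_pos = pgn_model.summary_repr(src, pos_embeds)
        h_neg = pgn_model.summary_repr(src, neg_embeds)

        # 6) InfoNCE-style contrastive term: pull the positive representation
        #    toward the target and push the negative away.
        sim_pos = F.cosine_similarity(h_tgt, h_pos, dim=-1) / tau
        sim_neg = F.cosine_similarity(h_tgt, h_neg, dim=-1) / tau
        contrastive = -torch.log(
            torch.exp(sim_pos) / (torch.exp(sim_pos) + torch.exp(sim_neg))
        ).mean()

        return nll + contrastive

The perturbation radii eps_neg and eps_pos control how adversarial the negative sample is and how distant the positive sample is; both loss terms are backpropagated together in each training step.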