Computer and Modernization ›› 2023, Vol. 0 ›› Issue (02): 66-71.


Text Summarization Generation Model Based on PGN-CL

  

1. College of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054, China
• Online: 2023-04-10  Published: 2023-04-10

Abstract: Abstractive text summarization models based on the Seq2Seq framework have achieved remarkable results. However, most of these models suffer from out-of-vocabulary words, repetition in the generated text, and exposure bias. To tackle these problems, we propose a pointer-generator network based on adversarial perturbation contrastive learning (PGN-CL) to model the text summarization generation process. In this model, PGN serves as the basic structure to address out-of-vocabulary words and repetition, while adversarial perturbation contrastive learning is introduced as a new training method to alleviate exposure bias. During training, we add perturbations to the target sequence and build a contrastive loss function to generate adversarial positive and negative samples. In this way, negative samples are close to the target sequence in the embedding space but differ greatly in the semantic space, while positive samples are close to the target sequence in the semantic space but differ greatly in the embedding space. These hard-to-distinguish positive and negative samples guide the model to better learn their distinguishing features in the feature space and to obtain more accurate summary representations. Experimental results on the LCSTS dataset show that the proposed model outperforms the comparative baselines on the ROUGE evaluation metrics, demonstrating its effectiveness in improving summary quality.

Key words: text summarization, pointer generator network, adversarial perturbation, contrastive learning
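
The following is a minimal, illustrative PyTorch sketch of the adversarial-perturbation contrastive learning scheme described in the abstract; it is not the authors' published implementation. It assumes a seq2seq/PGN-style model object exposing a hypothetical helper nll_from_embeds that returns a scalar teacher-forcing negative log-likelihood computed from target-side embeddings, and assumes pooled sequence representations are available for the contrastive term. The step sizes eps_neg, eps_pos and the temperature tau are placeholder hyper-parameters, not values from the paper.

import torch
import torch.nn.functional as F


def build_adversarial_samples(model, src, tgt_embeds, tgt_ids,
                              eps_neg=1.0, eps_pos=5.0):
    """Build one adversarial negative and one adversarial positive sample.

    The negative stays close to the target in embedding space but its
    semantics change (a small step that increases the NLL); the positive
    moves far in embedding space while preserving semantics (a large step
    that decreases the NLL). `model.nll_from_embeds` is an assumed helper
    returning a scalar teacher-forcing NLL computed from embeddings.
    """
    tgt_embeds = tgt_embeds.detach().requires_grad_(True)
    nll = model.nll_from_embeds(src, tgt_embeds, tgt_ids)
    grad, = torch.autograd.grad(nll, tgt_embeds)
    direction = F.normalize(grad, dim=-1)

    neg_embeds = (tgt_embeds + eps_neg * direction).detach()  # nearby in embedding space, new meaning
    pos_embeds = (tgt_embeds - eps_pos * direction).detach()  # distant in embedding space, same meaning
    return pos_embeds, neg_embeds


def contrastive_loss(anchor, positive, negative, tau=0.1):
    """InfoNCE-style loss over pooled sequence representations (batch, dim)."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negative = F.normalize(negative, dim=-1)
    pos_sim = torch.exp((anchor * positive).sum(dim=-1) / tau)
    neg_sim = torch.exp((anchor * negative).sum(dim=-1) / tau)
    return -torch.log(pos_sim / (pos_sim + neg_sim)).mean()

In training, the contrastive term would typically be added to the standard pointer-generator NLL objective with a weighting coefficient, so the model learns both to generate the reference summary and to separate the adversarial positive and negative samples in the representation space.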