Computer and Modernization ›› 2022, Vol. 0 ›› Issue (06): 56-66.
Online: 2022-06-23
Published: 2022-06-23
ZHANG Zi-yun, WANG Wen-fa, MA Le-rong, DING Cang-feng. Research Progress of Text Summarization Model[J]. Computer and Modernization, 2022, 0(06): 56-66.
[1] ZHU Yong-qing, ZHAO Peng, ZHAO Fei-fei, et al. Survey of abstractive text summarization technology based on deep learning[J]. Computer Engineering, 2021,47(11):11-21.
[2] CHENG J P, LAPATA M. Neural summarization by extracting sentences and words[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 2016:484-494.
[3] NALLAPATI R, ZHAI F F, ZHOU B. SummaRuNNer: A recurrent neural network based sequence model for extractive summarization of documents[C]// Proceedings of the 31st AAAI Conference on Artificial Intelligence. 2017:3075-3081.
[4] PAULUS R, XIONG C M, SOCHER R. A deep reinforced model for abstractive summarization[C]// Proceedings of the 6th International Conference on Learning Representations. 2017.
[5] WU Y X, HU B T. Learning to extract coherent summary via deep reinforcement learning[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 2018.
[6] HAGHIGHI A, VANDERWENDE L. Exploring content models for multi-document summarization[C]// Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics. 2009:362-370.
[7] CHEUNG J C K, PENN G. Probabilistic domain modelling with contextualized distributional semantic vectors[C]// Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics. 2013:392-401.
[8] CAO Z Q, LI W J, LI S J, et al. Improving multi-document summarization via text classification[C]// Proceedings of the 31st AAAI Conference on Artificial Intelligence. 2017.
[9] ISONUMA M, FUJINO T, MORI J, et al. Extractive summarization using multi-task learning with document classification[C]// Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017:2101-2110.
[10] WEN Wen-bo, DU Wei. Overview of ant colony algorithms[J]. Automation in Petro-Chemical Industry, 2002(1):19-22.
[11] LI Jin-peng, ZHANG Chuang, CHEN Xiao-jun, et al. Survey of automatic text summarization research[J]. Journal of Computer Research and Development, 2021,58(1):1-21.
[12] PAPADIMITRIOU C H, RAGHAVAN P, TAMAKI H, et al. Latent semantic indexing: A probabilistic analysis[J]. Journal of Computer and System Sciences, 2000,61(2):217-235.
[13] ZHENG B, MCLEAN D C, LU X H. Identifying biological concepts from a protein-related corpus with a probabilistic topic model[J]. BMC Bioinformatics, 2006. DOI: 10.1186/1471-2105-7-58.
[14] GUO Ji-feng, FEI Yu-xiao, SUN Wen-bo, et al. A topic-fused PGN-GAN text summarization model[J/OL]. [2021-12-06]. http://kns.cnki.net/kcms/detail/21.1106.TP.20211115.1055.002.html.
[15] BLEI D M, NG A Y, JORDAN M I. Latent Dirichlet allocation[J]. Journal of Machine Learning Research, 2003,3:993-1022.
[16] DEERWESTER S, DUMAIS S T, FURNAS G W, et al. Indexing by latent semantic analysis[J]. Journal of the American Society for Information Science, 1990,41(6):391-407.
[17] TARDAN P P, ERWIN A, ENG K I, et al. Automatic text summarization based on semantic analysis approach for documents in Indonesian language[C]// 2013 International Conference on Information Technology and Electrical Engineering (ICITEE). 2013:47-52.
[18] JAGADEESH J J, PINGALI P, VARMA V. Sentence extraction based single document summarization[R]. Workshop on Document Summarization, 2005.
[19] ZHOU L, HOVY E. Template-filtered headline summarization[C]// Proceedings of the ACL Workshop on Text Summarization. 2004.
[20] CAO Z Q, LI W J, LI S J, et al. Retrieve, rerank and rewrite: Soft template based neural summarization[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 2018:152-161.
[21] WANG K, QUAN X J, WANG R. BiSET: Bi-directional selective encoding with template for abstractive summarization[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019:2153-2162.
[22] PAGE L, BRIN S, MOTWANI R, et al. The PageRank citation ranking: Bringing order to the Web[R]. Stanford Digital Libraries Working Paper, 1998.
[23] MIHALCEA R, TARAU P. TextRank: Bringing order into texts[C]// 2004 Conference on Empirical Methods in Natural Language Processing. 2004.
[24] WANG Xu-xiang, HAN Bin, GAO Rui, et al. Automatic text summarization extraction based on improved TextRank[J]. Computer Applications and Software, 2021,38(6):155-160.
[25] SEHGAL S, KUMAR B, RAMPAL L, et al. A modification to graph based approach for extraction based automatic text summarization[M]// Progress in Advanced Computing and Intelligent Engineering. 2018:373-378.
[26] PEYRARD M. A simple theoretical model of importance for summarization[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019. DOI: 10.18653/v1/P19-1101.
[27] WEST P, HOLTZMAN A, BUYS J, et al. BottleSum: Unsupervised and self-supervised sentence summarization using the information bottleneck principle[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019:3750-3759.
[28] LEV G, SHMUELI-SCHEUER M, HERZIG J, et al. TalkSumm: A dataset and scalable annotation method for scientific paper summarization based on conference talks[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019:2125-2131.
[29] PALASKAR S, LIBOVICKY J, GELLA S, et al. Multimodal abstractive summarization for How2 videos[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019:6587-6596.
[30] RUSH A M, CHOPRA S, WESTON J. A neural attention model for abstractive sentence summarization[J]. arXiv preprint arXiv:1509.00685, 2015.
[31] CHOPRA S, AULI M, RUSH A M. Abstractive sentence summarization with attentive recurrent neural networks[C]// Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016. DOI: 10.18653/v1/N16-1012.
[32] NALLAPATI R, ZHOU B, SANTOS C N D, et al. Abstractive text summarization using sequence-to-sequence RNNs and beyond[C]// Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning. 2016. DOI: 10.18653/v1/K16-1028.
[33] SEE A, LIU P J, MANNING C D. Get to the point: Summarization with pointer-generator networks[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. 2017:1073-1083.
[34] GU J T, LU Z D, LI H, et al. Incorporating copying mechanism in sequence-to-sequence learning[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 2016:1631-1640.
[35] ZENG W Y, LUO W J, FIDLER S, et al. Efficient summarization with read-again and copy mechanism[J]. arXiv preprint arXiv:1611.03382, 2016.
[36] HSU W T, LIN C K, LEE M Y, et al. A unified model for extractive and abstractive summarization using inconsistency loss[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 2018:132-141.
[37] LI C L, XU W R, LI S, et al. Guiding generation for abstractive text summarization based on key information guide network[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2018:55-60.
[38] GEHRMANN S, DENG Y T, RUSH A M. Bottom-up abstractive summarization[C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018:4098-4109.
[39] CHEN Y C, BANSAL M. Fast abstractive summarization with reinforce-selected sentence rewriting[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 2018:675-686.
[40] PETERS M E, NEUMANN M, IYYER M, et al. Deep contextualized word representations[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2018:2227-2237.
[41] RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training[R]. OpenAI Technical Report, 2018.
[42] DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2019:4171-4186.
[43] MIKOLOV T, CHEN K, CORRADO G, et al. Efficient estimation of word representations in vector space[J]. arXiv preprint arXiv:1301.3781, 2013.
[44] PENNINGTON J, SOCHER R, MANNING C D. GloVe: Global vectors for word representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014. DOI: 10.3115/v1/D14-1162.
[45] BOJANOWSKI P, GRAVE E, JOULIN A, et al. Enriching word vectors with subword information[J]. Transactions of the Association for Computational Linguistics, 2017,5:135-146. DOI: 10.1162/tacl_a_00051.
[46] ZHANG X X, WEI F R, ZHOU M. HIBERT: Document level pre-training of hierarchical bidirectional transformers for document summarization[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019:5059-5069.
[47] LIU L Q, LU Y, YANG M, et al. Generative adversarial network for abstractive text summarization[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 2018:8109-8110.
[48] BAE S, KIM T, KIM J, et al. Summary level training of sentence rewriting for abstractive summarization[J]. arXiv preprint arXiv:1909.08752, 2019.
[49] SHARMA E, HUANG L Y, HU Z, et al. An entity-driven framework for abstractive summarization[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019:3278-3289.
[50] ZHANG H Y, GONG Y Y, YAN Y, et al. Pretraining-based natural language generation for text summarization[C]// Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL). 2019:789-797.
[51] SONG K Q, WANG B Q, FENG Z, et al. Controlling the amount of verbatim copying in abstractive summarization[C]// Proceedings of the AAAI Conference on Artificial Intelligence. 2020,34(5):8902-8909.
[52] ZHANG J Q, ZHAO Y, SALEH M, et al. PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization[C]// Proceedings of the 37th International Conference on Machine Learning. 2020:11328-11339.
[53] SONG K T, TAN X, QIN T, et al. MASS: Masked sequence to sequence pre-training for language generation[J]. arXiv preprint arXiv:1905.02450, 2019.
[54] LIU Y. Fine-tune BERT for extractive summarization[J]. arXiv preprint arXiv:1903.10318, 2019.
[55] ZHENG H, LAPATA M. Sentence centrality revisited for unsupervised summarization[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019:6236-6247.
[56] WANG D Q, LIU P F, ZHONG M, et al. Exploring domain shift in extractive text summarization[J]. arXiv preprint arXiv:1908.11664, 2019.
[57] CHO S, LI C, YU D, et al. Multi-document summarization with determinantal point processes and contextualized representations[C]// Proceedings of the 2nd Workshop on New Frontiers in Summarization. 2019. DOI: 10.18653/v1/D19-5412.
[58] LIU Y, LAPATA M. Text summarization with pretrained encoders[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019:3728-3738.
[59] KHANDELWAL U, CLARK K, JURAFSKY D, et al. Sample efficient text summarization using a single pre-trained transformer[J]. arXiv preprint arXiv:1905.08836, 2019.
[60] DONG L, YANG N, WANG W H, et al. Unified language model pre-training for natural language understanding and generation[J]. arXiv preprint arXiv:1905.03197, 2019.
[61] LEWIS M, LIU Y H, GOYAL N, et al. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020. DOI: 10.18653/v1/2020.acl-main.703.
[62] BELTAGY I, LO K, COHAN A. SciBERT: A pretrained language model for scientific text[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019:3613-3618.
[63] SUN S, NENKOVA A. The feasibility of embedding based automatic evaluation for single document summarization[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019:1216-1221.
[64] GRAFF D, CIERI C. English Gigaword[DB]. Linguistic Data Consortium, Philadelphia, 2003. DOI: 10.35111/0z6y-q265.
[65] RUSH A M, CHOPRA S, WESTON J. A neural attention model for abstractive sentence summarization[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2015:379-389.
[66] HERMANN K M, KOVCISKY T, GREFENSTETTE E, et al. Teaching machines to read and comprehend[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems. 2015:1693-1701.
[67] SEE A, LIU P J, MANNING C D. Get to the point: Summarization with pointer-generator networks[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. 2017:1073-1083.
[68] SANDHAUS E. The New York Times Annotated Corpus Overview[DB/OL]. [2021-09-02]. https://catalog.ldc.upeen.edu/docs/LDC2008T19/new_york_times.annotated.corpus.pdf.
[69] DURRETT G, BERG-KIRKPATRICK T, KLEIN D. Learning-based single-document summarization with compression and anaphoricity constraints[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. 2016.
[70] NARAYAN S, COHEN S B, LAPATA M. Don't give me the details, just the summary! Topic-aware convolutional neural networks for extreme summarization[C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018:1797-1807.
[71] GRUSKY M, NAAMAN M, ARTZI Y. Newsroom: A dataset of 1.3 million summaries with diverse extractive strategies[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2018:708-719.
[72] HU B T, CHEN Q C, ZHU F Z. LCSTS: A large scale Chinese short text summarization dataset[J]. arXiv preprint arXiv:1506.05865, 2015.
[73] SHARMA E, LI C, WANG L. BIGPATENT: A large-scale dataset for abstractive and coherent summarization[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019:2204-2213.
[74] FABBRI A R, LI I, SHE T W, et al. Multi-News: A large-scale multi-document summarization dataset and abstractive hierarchical model[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019:1074-1084.
[75] LIN C Y. ROUGE: A package for automatic evaluation of summaries[C]// Text Summarization Branches Out. 2004:74-81.
[76] PAPINENI K, ROUKOS S, WARD T, et al. BLEU: A method for automatic evaluation of machine translation[C]// Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. 2002:311-318.
[77] BANERJEE S, LAVIE A. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments[C]// Proceedings of the 2005 Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization. 2005:65-72.
[78] DENKOWSKI M, LAVIE A. Meteor Universal: Language specific translation evaluation for any target language[C]// Proceedings of the 9th Workshop on Statistical Machine Translation. 2014:376-380.