Computer and Modernization ›› 2023, Vol. 0 ›› Issue (01): 49-57.
Published:
2023-03-02
Online:
2023-03-02
About the authors:
WANG Hao-chang (b. 1974), female, from Daqing, Heilongjiang; professor, Ph.D.; research interests: artificial intelligence, natural language processing, and data mining; E-mail: kinghaosing@gmail.com. LIU Ru-yi (b. 1995), male, from Ganzhou, Jiangxi; master's student; research interest: entity and relation extraction; E-mail: jsaslry@163.com.
Abstract: In recent years, with continual advances in deep learning, pre-trained models have been applied ever more widely in natural language processing, and relation extraction no longer relies solely on the traditional pipeline approach. The development of pre-trained language models has greatly propelled research on relation extraction, surpassing traditional methods in many domains. This paper first briefly introduces the development of relation extraction and the classic pre-trained models; it then summarizes the datasets and evaluation methods in common use and analyzes how models perform on each dataset; finally, it discusses the challenges facing relation extraction and future research trends.
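A recurring preprocessing step in the marker-based relation-classification models surveyed here (e.g., refs [50]-[51]) is to wrap the two entity mentions in special marker tokens before feeding the sentence to the pre-trained encoder. A minimal sketch of that step is shown below; the `[E1]`/`[/E1]` marker names and the `mark_entities` helper are illustrative conventions, not an API from any specific paper.

```python
def mark_entities(tokens, head_span, tail_span):
    """Insert entity-marker tokens around the head and tail mentions.

    head_span / tail_span are (start, end) token indices, end exclusive.
    The [E1]...[/E1] / [E2]...[/E2] convention follows marker-based
    relation-classification models; exact marker names vary by paper.
    """
    (h_start, h_end), (t_start, t_end) = head_span, tail_span
    out = []
    for i, tok in enumerate(tokens):
        if i == h_start:
            out.append("[E1]")
        if i == t_start:
            out.append("[E2]")
        out.append(tok)
        if i == h_end - 1:
            out.append("[/E1]")
        if i == t_end - 1:
            out.append("[/E2]")
    return out

# Head entity "Bill Gates" (tokens 0-1), tail entity "Microsoft" (token 3).
print(mark_entities("Bill Gates founded Microsoft".split(), (0, 2), (3, 4)))
# → ['[E1]', 'Bill', 'Gates', '[/E1]', 'founded', '[E2]', 'Microsoft', '[/E2]']
```

In the marker-based models, the marked sequence is then tokenized and encoded, and the hidden states at the marker positions (or the pooled entity spans) are concatenated and passed to a classification head over the relation labels.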
WANG Hao-chang, LIU Ru-yi. Review of Relation Extraction Based on Pre-training Language Model[J]. Computer and Modernization, 2023, 0(01): 49-57.
[1] CHINCHOR N, MARSH E. MUC-7 information extraction task definition[C]// Proceedings of the 7th Message Understanding Conference. 1998:359-367.
[2] NIST Website. Automatic Content Extraction[EB/OL].[2007-05-28]. http://www.nist.gov/speech/tests/ace/.
[3] HENDRICKX I, KIM S N, KOZAREVA Z, et al. SemEval-2010 task 8: Multi-way classification of semantic relations between pairs of nominals[C]// Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions. 2009:94-99.
[4] AONE C, HALVERSON L, HAMPTON T, et al. SRA: Description of the IE2 system used for MUC-7[C]// Proceedings of the 7th Message Understanding Conference. 1998.
[5] MNIH A, HINTON G E. A scalable hierarchical distributed language model[C]// Proceedings of the 21st International Conference on Neural Information Processing Systems. 2008:1081-1088.
[6] MIKOLOV T, SUTSKEVER I, CHEN K, et al. Distributed representations of words and phrases and their compositionality[C]// Proceedings of the 26th International Conference on Neural Information Processing Systems. 2013:3111-3119.
[7] PENNINGTON J, SOCHER R, MANNING C. GloVe: Global vectors for word representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014:1532-1543.
[8] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[C]// Proceedings of the 3rd International Conference on Learning Representations. 2015.
[9] PETERS M E, NEUMANN M, IYYER M, et al. Deep contextualized word representations[C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2018:2227-2237.
[10] RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training[R]. OpenAI Technical Report, 2018.
[11] DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2019:4171-4186.
[12] YANG Z L, DAI Z H, YANG Y M, et al. XLNet: Generalized autoregressive pretraining for language understanding[C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems. 2019:5753-5763.
[13] DAI Z H, YANG Z L, YANG Y M, et al. Transformer-XL: Attentive language models beyond a fixed-length context[J]. arXiv preprint arXiv:1901.02860, 2019.
[14] ZHONG Z X, CHEN D Q. A frustratingly easy approach for entity and relation extraction[C]// Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2021:50-61.
[15] XIE C H, LIANG J Q, LIU J P, et al. Revisiting the negative data of distantly supervised relation extraction[C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 2021:3572-3581.
[16] SHANG Y M, HUANG H Y, MAO X L. OneRel: Joint entity and relation extraction with one module in one step[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2022,36(10):11285-11293.
[17] SHI X J, CHEN Z R, WANG H, et al. Convolutional LSTM network: A machine learning approach for precipitation nowcasting[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems. 2015:802-810.
[18] JOZEFOWICZ R, VINYALS O, SCHUSTER M, et al. Exploring the limits of language modeling[J]. arXiv preprint arXiv:1602.02410, 2016.
[19] LECUN Y, BOSER B, DENKER J S, et al. Handwritten digit recognition with a back-propagation network[C]// Proceedings of the 2nd International Conference on Neural Information Processing Systems. 1989:396-404.
[20] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017:6000-6010.
[21] HOWARD J, RUDER S. Universal language model fine-tuning for text classification[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 2018:328-339.
[22] LIU P J, SALEH M, POT E, et al. Generating Wikipedia by summarizing long sequences[J]. arXiv preprint arXiv:1801.10198, 2018.
[23] JIANG X Z, LIANG Y B, CHEN W Z, et al. XLM-K: Improving cross-lingual language model pre-training with multilingual knowledge[J]. arXiv preprint arXiv:2109.12573, 2021.
[24] ZHANG Z Y, HAN X, LIU Z Y, et al. ERNIE: Enhanced language representation with informative entities[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019:1441-1451.
[25] SUN Y, WANG S H, LI Y K, et al. ERNIE: Enhanced representation through knowledge integration[J]. arXiv preprint arXiv:1904.09223, 2019.
[26] LIU X D, HE P C, CHEN W Z, et al. Multi-task deep neural networks for natural language understanding[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019:4487-4496.
[27] SUN Y, WANG S H, LI Y K, et al. ERNIE 2.0: A continual pre-training framework for language understanding[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020,34(5):8968-8975.
[28] CUI Y M, CHE W X, LIU T, et al. Pre-training with whole word masking for Chinese BERT[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2021,29:3504-3514.
[29] LIU Y H, OTT M, GOYAL N, et al. RoBERTa: A robustly optimized BERT pretraining approach[J]. arXiv preprint arXiv:1907.11692, 2019.
[30] JOSHI M, CHEN D Q, LIU Y H, et al. SpanBERT: Improving pre-training by representing and predicting spans[J]. Transactions of the Association for Computational Linguistics, 2020,8:64-77.
[31] SONG K T, TAN X, QIN T, et al. MASS: Masked sequence to sequence pre-training for language generation[C]// Proceedings of the 36th International Conference on Machine Learning. 2019:5926-5936.
[32] LAN Z Z, CHEN M D, GOODMAN S, et al. ALBERT: A lite BERT for self-supervised learning of language representations[J]. arXiv preprint arXiv:1909.11942, 2019.
[33] JIAO X Q, YIN Y C, SHANG L F, et al. TinyBERT: Distilling BERT for natural language understanding[C]// Findings of the Association for Computational Linguistics: EMNLP 2020. 2020:4163-4174.
[34] SANH V, DEBUT L, CHAUMOND J, et al. DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter[J]. arXiv preprint arXiv:1910.01108, 2019.
[35] WEI J Q, REN X Z, LI X G, et al. NEZHA: Neural contextualized representation for Chinese language understanding[J]. arXiv preprint arXiv:1909.00204, 2019.
[36] PETERS M E, NEUMANN M, LOGAN R, et al. Knowledge enhanced contextual word representations[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. 2019:43-54.
[37] WANG J, LU W. Two are better than one: Joint entity and relation extraction with table-sequence encoders[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. 2020:1706-1721.
[38] WADDEN D, WENNBERG U, LUAN Y, et al. Entity, relation, and event extraction with contextualized span representations[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. 2019:5784-5789.
[39] LUAN Y, WADDEN D, HE L H, et al. A general framework for information extraction using dynamic span graphs[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2019:3036-3046.
[40] COHEN A D, ROSENMAN S, GOLDBERG Y. Relation classification as two-way span-prediction[J]. arXiv preprint arXiv:2010.04829, 2020.
[41] LI C, TIAN Y. Downstream model design of pre-trained language model for relation extraction task[J]. arXiv preprint arXiv:2004.03786, 2020.
[42] TAO Q X, LUO X H, WANG H, et al. Enhancing relation extraction using syntactic indicators and sentential contexts[C]// 2019 IEEE 31st International Conference on Tools with Artificial Intelligence. 2019:1574-1580.
[43] WEI Z P, SU J L, WANG Y, et al. A novel cascade binary tagging framework for relational triple extraction[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020:1476-1488.
[44] YE H B, ZHANG N Y, DENG S M, et al. Contrastive triple extraction with generative transformer[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021,35(16):14257-14265.
[45] ZHOU W X, HUANG K, MA T Y, et al. Document-level relation extraction with adaptive thresholding and localized context pooling[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021,35(16):14612-14620.
[46] ZENG S, XU R X, CHANG B B, et al. Double graph based reasoning for document-level relation extraction[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. 2020:1630-1640.
[47] DIXIT K, AL-ONAIZAN Y. Span-level model for relation extraction[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019:5308-5314.
[48] SANH V, WOLF T, RUDER S. A hierarchical multi-task approach for learning embeddings from semantic tasks[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2019,33(1):6949-6956.
[49] ZHAO Y, WAN H Y, GAO J W, et al. Improving relation classification by entity pair graph[C]// Proceedings of the 11th Asian Conference on Machine Learning. 2019:1156-1171.
[50] SOARES L B, FITZGERALD N, LING J, et al. Matching the blanks: Distributional similarity for relation learning[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019:2895-2905.
[51] WU S C, HE Y F. Enriching pre-trained language model with entity information for relation classification[C]// Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 2019:2361-2364.
[52] WANG H Y, TAN M, YU M, et al. Extracting multiple-relations in one-pass with pre-trained transformers[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019:1371-1377.
[53] ALT C, HÜBNER M, HENNIG L. Improving relation extraction by pre-trained language representations[C]// Automated Knowledge Base Construction (AKBC). 2019.
[54] LUO X K, LIU W J, MA M, et al. BiTT: Bidirectional tree tagging for joint extraction of overlapping entities and relations[J]. arXiv preprint arXiv:2008.13339, 2020.
[55] SUN K, ZHANG R C, MENSAH S, et al. Recurrent interaction network for jointly extracting entities and classifying relations[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. 2020:3722-3732.
[56] YAMADA I, ASAI A, SHINDO H, et al. LUKE: Deep contextualized entity representations with entity-aware self-attention[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. 2020:6442-6454.
[57] YANG S M, YOO S Y, JEONG O R. DeNERT-KG: Named entity and relation extraction model using DQN, knowledge graph, and BERT[J]. Applied Sciences, 2020,10(18):6429.
[58] WANG R Z, TANG D Y, DUAN N, et al. K-Adapter: Infusing knowledge into pre-trained models with adapters[C]// Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. 2021:1405-1418.
[59] WANG X Z, GAO T Y, ZHU Z C, et al. KEPLER: A unified model for knowledge embedding and pre-trained language representation[J]. Transactions of the Association for Computational Linguistics, 2021,9:176-194.
[60] CHEN J, HOEHNDORF R, ELHOSEINY M, et al. Efficient long-distance relation extraction with DG-SpanBERT[J]. arXiv preprint arXiv:2004.03636, 2020.
[61] XUE F Z, SUN A X, ZHANG H, et al. GDPNet: Refining latent multi-view graph for relation extraction[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021,35(16):14194-14202.
[62] PENG H, GAO T Y, HAN X, et al. Learning from context or names? An empirical study on neural relation extraction[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. 2020:3661-3672.
[63] SHI P, LIN J. Simple BERT models for relation extraction and semantic role labeling[J]. arXiv preprint arXiv:1904.05255, 2019.
[64] HUANG K, WANG G T, MA T Y, et al. Entity and evidence guided relation extraction for DocRED[J]. arXiv preprint arXiv:2008.12283, 2020.
[65] YE D M, LIN Y K, DU J J, et al. Coreferential reasoning learning for language representation[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. 2020:7170-7186.
[66] NAN G, GUO Z, SEKULIĆ I, et al. Reasoning with latent structure refinement for document-level relation extraction[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020:1546-1557.
[67] JUNG W, SHIM K. Dual supervision framework for relation extraction with distant supervision and human annotation[C]// Proceedings of the 28th International Conference on Computational Linguistics. 2020:6411-6423.
[68] TANG H Z, CAO Y N, ZHANG Z Y, et al. HIN: Hierarchical inference network for document-level relation extraction[C]// Proceedings of the 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining. 2020:197-209.
[69] WANG H, FOCKE C, SYLVESTER R, et al. Fine-tune BERT for DocRED with two-step process[J]. arXiv preprint arXiv:1909.11898, 2019.
[70] WANG Y C, YU B W, ZHANG Y Y, et al. TPLinker: Single-stage joint extraction of entities and relations through token pair linking[C]// Proceedings of the 28th International Conference on Computational Linguistics. 2020:1572-1582.