[1] RAU L F. Extracting company names from text[C]// Proceedings of the Seventh IEEE Conference on Artificial Intelligence Application. 1991:29-32.
[2] ZHOU G, SU J. Named entity recognition using an HMM-based chunk tagger[C]// Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. 2002:473-480.
[3] MALOUF R. Markov models for language-independent named entity recognition[C]// Proceedings of the 6th Conference on Natural Language Learning - Volume 20. 2002:1-4.
[4] NADEAU D, TURNEY P D, MATWIN S. Unsupervised named-entity recognition: Generating gazetteers and resolving ambiguity[C]// Proceedings of the 19th International Conference on Advances in Artificial Intelligence: Canadian Society for Computational Studies of Intelligence. 2006:266-277.
[5] LI Y, BONTCHEVA K, CUNNINGHAM H. SVM based learning system for information extraction[C]// Proceedings of the First International Conference on Deterministic and Statistical Methods in Machine Learning. 2004:319-339.
[6] LIU S Y, TANG B Z, CHEN Q C, et al. Effects of semantic features on machine learning-based drug name recognition systems: Word embeddings vs. manually constructed dictionaries[J]. Information, 2015,6(4):848-865.
[7] SEGURA-BEDMAR I, MARTINEZ P, HERRERO-ZAZO M. SemEval-2013 task 9: Extraction of drug-drug interactions from biomedical texts (DDIExtraction 2013)[C]// Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013). 2013:341-350.
[8] BENGIO Y, DUCHARME R, VINCENT P. A neural probabilistic language model[C]// Proceedings of the 13th International Conference on Neural Information Processing Systems. 2000:893-899.
[9] PETERS M E, NEUMANN M, IYYER M, et al. Deep contextualized word representations[J]. arXiv preprint arXiv:1802.05365, 2018.
[10] SARZYNSKA-WAWER J, WAWER A, PAWLAK A, et al. Detecting formal thought disorder by deep contextualized word representations[J]. Psychiatry Research, 2021,304. DOI:10.1016/j.psychres.2021.114135.
[11] MAO J, XU W, YANG Y, et al. Deep captioning with multimodal recurrent neural networks (m-RNN)[J]. arXiv preprint arXiv:1412.6632, 2014.
[12] CHO K, VAN MERRIËNBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[J]. arXiv preprint arXiv:1406.1078, 2014.
[13] SHERSTINSKY A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network[J]. Physica D: Nonlinear Phenomena, 2020,404. DOI:10.1016/j.physd.2019.132306.
[14] MIAO Y, GOWAYYED M, METZE F. EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding[C]// 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). 2015:167-174.
[15] WILLIAMS G, BAXTER R, HE H, et al. A comparative study of RNN for outlier detection in data mining[C]// Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002). 2002:709-712.
[16] JADERBERG M, SIMONYAN K, ZISSERMAN A, et al. Spatial transformer networks[J]. arXiv preprint arXiv:1506.02025, 2015.
[17] KITAEV N, KAISER Ł, LEVSKAYA A. Reformer: The efficient transformer[J]. arXiv preprint arXiv:2001.04451, 2020.
[18] RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving language understanding by generative pre-training[EB/OL]. [2022-05-17]. https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf.
[19] ZHANG Y Z, SUN S Q, GALLEY M, et al. DialoGPT: Large-scale generative pre-training for conversational response generation[J]. arXiv preprint arXiv:1911.00536, 2019.
[20] ETHAYARAJH K. How contextual are contextualized word representations? Comparing the geometry of BERT, ELMo, and GPT-2 embeddings[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019:55-65.
[21] FLORIDI L, CHIRIATTI M. GPT-3: Its nature, scope, limits, and consequences[J]. Minds and Machines, 2020,30(4):681-694.
[22] DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv:1810.04805, 2018.
[23] JOVCIC D. Bidirectional, high-power DC transformer[J]. IEEE Transactions on Power Delivery, 2009,24(4):2276-2283.
[24] SUN F, LIU J, WU J, et al. BERT4Rec: Sequential recommendation with bidirectional encoder representations from transformer[C]// Proceedings of the 28th ACM International Conference on Information and Knowledge Management. 2019:1441-1450.
[25] ZHANG X X, WEI F R, ZHOU M. HIBERT: Document level pre-training of hierarchical bidirectional transformers for document summarization[J]. arXiv preprint arXiv:1905.06566, 2019.
[26] ZHAN F, ZHU Y H, LIANG W T, et al. Short text entity linking method based on multi-task learning[J]. Computer Engineering, 2022,48(3):315-320.