Computer and Modernization ›› 2023, Vol. 0 ›› Issue (02): 58-61.

• Artificial Intelligence •

A Text Entity Linking Method Based on BERT


  1. (The 15th Research Institute of China Electronics Technology Group Corporation, Beijing 100083, China)
  • Online: 2023-04-10  Published: 2023-04-10
  • About the authors: XIE Shichao (b. 1996), male, from Cangzhou, Hebei; master's student; research interest: natural language processing; E-mail: 867660674@qq.com. HUANG Wei (b. 1972), female; research fellow, master's degree; research interests: big data processing, integration, and mining analysis; E-mail: huangw@nci.ac.cn. REN Xianghui (b. 1979), male; research fellow, master's degree; research interests: system architecture, big data analysis; E-mail: 13691446610@163.com.
  • Funding:
    National Key Research and Development Program of China (2018YFC0831206)



Abstract: Entity linking is an important means of disambiguating entity mentions in text and a key technology for constructing knowledge graphs, and it plays an important role in fields such as intelligent question answering and information retrieval. However, because Chinese text exhibits synonymy (multiple words sharing one meaning) and polysemy (one word carrying multiple meanings), existing text entity linking methods suffer from low accuracy. To address these problems, this paper proposes a text entity linking method based on BERT (Bidirectional Encoder Representations from Transformers), named the STELM model. Each pair consisting of a mention context and the corresponding candidate entity description is fed into a BERT model; the resulting outputs are concatenated and passed through a fully connected layer, and the highest-scoring candidate entity is taken as the final result. Experimental results on the CCKS2020 (2020 China Conference on Knowledge Graph and Semantic Computing) dataset show that the proposed model improves on other models, reaching an accuracy of 0.9175.

Key words: entity linking, BERT, fully connected layer, model concatenation
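The candidate-scoring pipeline described in the abstract (encode the mention context and each candidate entity description, concatenate the representations, score through a fully connected layer, pick the argmax) can be sketched as follows. This is a minimal, hypothetical sketch under stated assumptions, not the authors' implementation: a toy mean-pooling encoder stands in for pretrained BERT so the code is self-contained, and the names `STELMScorer` and `link_entity` and all dimensions are illustrative.

```python
import torch
import torch.nn as nn

HIDDEN = 32  # illustrative; BERT-base would use 768


class ToyEncoder(nn.Module):
    """Stand-in for BERT: maps a token-id sequence to a [CLS]-like vector.
    A real implementation would use a pretrained BERT encoder here."""

    def __init__(self, vocab_size=100, hidden=HIDDEN):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden)

    def forward(self, token_ids):
        # Mean-pool token embeddings as a crude sentence representation.
        return self.emb(token_ids).mean(dim=1)  # (batch, hidden)


class STELMScorer(nn.Module):
    """Scores one (mention context, candidate entity description) pair."""

    def __init__(self, hidden=HIDDEN):
        super().__init__()
        self.encoder = ToyEncoder(hidden=hidden)
        # Concatenated mention + candidate vectors -> single linking score.
        self.fc = nn.Linear(2 * hidden, 1)

    def forward(self, mention_ids, candidate_ids):
        m = self.encoder(mention_ids)    # (batch, hidden)
        c = self.encoder(candidate_ids)  # (batch, hidden)
        return self.fc(torch.cat([m, c], dim=-1)).squeeze(-1)  # (batch,)


def link_entity(scorer, mention_ids, candidate_ids_list):
    """Return, per batch element, the index of the highest-scoring candidate."""
    scores = torch.stack(
        [scorer(mention_ids, cand) for cand in candidate_ids_list]
    )  # (num_candidates, batch)
    return scores.argmax(dim=0)  # (batch,)


torch.manual_seed(0)
scorer = STELMScorer()
mention = torch.randint(0, 100, (1, 12))                     # fake token ids
candidates = [torch.randint(0, 100, (1, 20)) for _ in range(3)]
best = link_entity(scorer, mention, candidates)
print(best.item())  # index of the selected candidate entity
```

In a full version of this scheme, the untrained `fc` layer would be learned from labeled mention-entity pairs, and the encoder would be a fine-tuned BERT model rather than the stand-in above.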