Computer and Modernization (计算机与现代化)

• Artificial Intelligence •


  

  • Received: 2018-03-15 Online: 2018-09-29 Published: 2018-09-30
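  • About the authors: Shen Cun (申存, born 1992), male, from Guang'an, Sichuan, is a master's student at the School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, and at the Institute of Electronics, Chinese Academy of Sciences; research interests: natural language processing and knowledge graphs. Huang Tinglei (黄廷磊, born 1971), male, Ph.D., is a research professor and doctoral supervisor; research interests: data mining and big data analysis. Liang Xiao (梁霄, born 1981), male, Ph.D., is an assistant research fellow; research interests: data organization and management, and knowledge graphs.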
  • Supported by:
    National Natural Science Foundation of China (61725105, 61331017)

Knowledge Graph Question Answering Based on Multi-granularity Feature Representation

  1. School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
  2. Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China
  3. Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2018-03-15 Online: 2018-09-29 Published: 2018-09-30



Abstract: In recent years, question answering over knowledge graphs has become a focus of research and application in both academia and industry. Traditional methods, however, are often inefficient and do not make full use of the available data. To address these problems, this paper divides Chinese knowledge graph question answering into two subtasks: entity extraction and property selection. A bidirectional long short-term memory network with a conditional random field layer (Bi-LSTM-CRF) is used to recognize entities, and a property selection model based on multi-granularity feature representation is proposed. The model embeds both questions and properties at the character level and at the word level and encodes the embeddings with an encoder; for properties, it additionally introduces one-hot encoding information. By combining the text representations of different granularities and computing the similarity between questions and properties, the system achieves an F1 score of 73.96% on the NLPCC-ICCPOL 2016 KBQA data set, showing that it handles the knowledge graph question answering task well.

Key words: knowledge graph, question answering system, entity extraction, property selection
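
The property selection idea in the abstract (score each candidate property against the question at two granularities, then combine the scores) can be illustrated with a minimal Python sketch. This is not the authors' model: the learned embeddings and encoder are replaced here by plain bag-of-token count vectors, cosine similarity is an assumed choice of similarity measure, the one-hot property feature is omitted, and every name (`char_tokens`, `word_tokens`, `score`, `select_property`, `char_weight`) is hypothetical. Whitespace splitting stands in for a real word segmenter, so the example uses an English question.

```python
import math
from collections import Counter


def char_tokens(text):
    """Character-level granularity: every non-space character is a token."""
    return [ch for ch in text if not ch.isspace()]


def word_tokens(text):
    """Word-level granularity: whitespace split stands in for a segmenter."""
    return text.split()


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-token count vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(question, prop, char_weight=0.5):
    """Weighted sum of character-level and word-level similarity.

    char_weight is an illustrative parameter, not a value from the paper.
    """
    char_sim = cosine(char_tokens(question), char_tokens(prop))
    word_sim = cosine(word_tokens(question), word_tokens(prop))
    return char_weight * char_sim + (1.0 - char_weight) * word_sim


def select_property(question, entity, candidate_props):
    """Drop the entity mention from the question, then pick the candidate
    property with the highest combined similarity score."""
    pattern = question.replace(entity, " ")
    return max(candidate_props, key=lambda p: score(pattern, p))


if __name__ == "__main__":
    best = select_property(
        "what is the birth place of Marie Curie",
        "Marie Curie",
        ["birth place", "birth date", "occupation"],
    )
    print(best)
```

In this toy setting the entity mention is assumed to be already known (the paper obtains it with the Bi-LSTM-CRF tagger). Combining the two granularities means a property can still rank well when it overlaps with the question only at the character level, which is one plausible motivation for mixing granularities in Chinese text, where word segmentation is error-prone.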

CLC number: