Computer and Modernization (计算机与现代化) ›› 2018, Vol. 0 ›› Issue (09): 17-. doi: 10.3969/j.issn.1006-2475.2018.09.004

• Artificial Intelligence •

基于卷积神经网络的中文人物关系抽取方法

  

  1. (1.中国电建集团华中电力设计研究院有限公司,河南郑州450007;2.华北电力大学电子与通信工程系,河北保定071003)
  • 收稿日期:2018-04-19 出版日期:2018-09-29 发布日期:2018-09-30
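  • About the authors: SI Wenhao (司文豪, b. 1968), female, from Zhengzhou, Henan, senior engineer at Central China Electric Power Engineering Corporation Limited, Power China, bachelor's degree, research interests: electric power information and communication technology; JIA Leiping (贾雷萍, b. 1991), female, master's student in the Department of Electronic and Communication Engineering, North China Electric Power University, research interest: information processing; QI Yincheng (戚银城, b. 1968), male, professor, Ph.D., research interest: electric power information technology.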

Chinese Personal Relation Extraction Method Based on Convolutional Neural Network

  1. (1. Central China Electric Power Engineering Corporation Limited, Power China, Zhengzhou 450007, China;
    2. Department of Electronic and Communication Engineering, North China Electric Power University, Baoding 071003, China)
  • Received:2018-04-19 Online:2018-09-29 Published:2018-09-30

摘要: 针对基于机器学习的人物关系抽取需要人工选取特征的问题,提出一种基于卷积神经网络的中文人物关系抽取方法。采用搜狗实验室公开的中文全网新闻语料库来训练Word2vec模型,得到基于分布式表示的词向量表达,并完成了对百度百科数据集的词向量转化工作。设计一种基于经典CNN模型的中文人物关系抽取系统方案,用CNN模型自动提取特征并进行人物关系的分类,实现了5类常见人物关系的提取,准确率达到92.87%,平均召回率达到86.92%。实验结果表明,该方法无需人工构建复杂特征即可得到较好的人物关系抽取效果。

关键词: 文本挖掘, 人物关系抽取, 卷积神经网络, 分类, 词向量特征

Abstract: To address the problem that machine-learning-based personal relation extraction requires manually selected features, a Chinese personal relation extraction method based on convolutional neural networks (CNN) is proposed. A Word2vec model is trained on the Chinese whole-web news corpus released by Sogou Lab to obtain distributed word-vector representations, which are then used to convert the Baidu encyclopedia data set into word vectors. A Chinese personal relation extraction system based on the classic CNN model is designed, in which the CNN automatically extracts features and classifies the personal relations. For 5 common types of personal relation, the accuracy reaches 92.87% and the average recall reaches 86.92%. Experimental results show that the method achieves good personal relation extraction without manually constructing complex features.
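
To make the pipeline in the abstract concrete, the following is a minimal sketch of a generic text-CNN forward pass: word-vector lookup, one convolution layer with ReLU, max-over-time pooling, and a softmax over 5 relation classes. It is not the authors' implementation. The abstract does not give the filter sizes, the number of filters, or the relation labels, so the dimensions, the random weights, the example tokens, and all function names below are illustrative assumptions. In practice the word vectors would come from a trained Word2vec model rather than a random table.

```python
import math
import random


def conv1d_max_pool(embedded, filt, bias):
    """Slide one filter over the sentence, apply ReLU, then max-pool over time.

    embedded: list of word vectors, one per token.
    filt: list of weight vectors, one per position in the filter window.
    Returns 0.0 if the sentence is shorter than the filter window.
    """
    window = len(filt)
    best = 0.0  # ReLU outputs are never below 0
    for start in range(len(embedded) - window + 1):
        total = bias
        for offset in range(window):
            total += sum(w * x for w, x in zip(filt[offset], embedded[start + offset]))
        best = max(best, max(0.0, total))
    return best


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def classify(tokens, vectors, filters, biases, out_weights, out_bias):
    """Return class probabilities for one segmented sentence."""
    dim = len(next(iter(vectors.values())))
    # Unknown tokens map to a zero vector (an assumption of this sketch).
    embedded = [vectors.get(tok, [0.0] * dim) for tok in tokens]
    features = [conv1d_max_pool(embedded, f, b) for f, b in zip(filters, biases)]
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(out_weights, out_bias)
    ]
    return softmax(scores)


# Toy setup with hypothetical sizes and untrained random weights.
random.seed(0)
DIM, NUM_FILTERS, WINDOW, NUM_CLASSES = 4, 3, 2, 5
tokens = ["张三", "的", "妻子", "是", "李四"]  # made-up segmented sentence
vectors = {tok: [random.uniform(-1, 1) for _ in range(DIM)] for tok in tokens}
filters = [
    [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(WINDOW)]
    for _ in range(NUM_FILTERS)
]
biases = [0.0] * NUM_FILTERS
out_weights = [
    [random.uniform(-1, 1) for _ in range(NUM_FILTERS)] for _ in range(NUM_CLASSES)
]
out_bias = [0.0] * NUM_CLASSES

probs = classify(tokens, vectors, filters, biases, out_weights, out_bias)
print(probs)  # 5 probabilities, one per relation class, summing to 1
```

Because the weights here are random, the printed probabilities carry no meaning; the sketch only shows how features are derived from word vectors without hand-built feature templates.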

Key words: text mining, personal relation extraction, convolutional neural network, classification, word vector feature
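
For readers who want to reproduce figures of the kind quoted in the abstract, here is a small sketch of how accuracy and an average recall can be computed for a 5-class problem. The abstract does not define either figure, so this assumes accuracy is the fraction of correctly classified samples and "average recall" is the unweighted (macro) mean of per-class recall. The labels and predictions are made-up toy data, not results from the paper.

```python
def accuracy(gold, predicted):
    """Fraction of samples whose predicted label equals the gold label."""
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold)


def macro_recall(gold, predicted):
    """Unweighted mean of per-class recall over the classes present in gold."""
    recalls = []
    for label in sorted(set(gold)):
        support = sum(1 for g in gold if g == label)
        hits = sum(1 for g, p in zip(gold, predicted) if g == label and p == label)
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


# Toy data: class indices 0..4 stand in for 5 relation types (hypothetical).
gold = [0, 0, 1, 1, 2, 2, 3, 4]
predicted = [0, 1, 1, 1, 2, 0, 3, 4]

print(accuracy(gold, predicted))      # 0.75
print(macro_recall(gold, predicted))  # 0.8
```

The two numbers differ because macro recall weights every class equally, whereas accuracy weights every sample equally.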
