Computer and Modernization (计算机与现代化) ›› 2016, Vol. 0 ›› Issue (4): 16-20. doi: 10.3969/j.issn.1006-2475.2016.04.004

• Database and Data Mining •

An Opinion Target Extraction Method Combining Part-of-Speech Rules and Dependency Parsing

  

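  1. (1. Experimental Teaching Center, Guangxi University of Finance and Economics, Nanning, Guangxi 530003; 2. Sealand Securities Co., Ltd., Nanning, Guangxi 530028)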

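  • Received: 2016-01-19 Online: 2016-04-14 Published: 2018-09-30
  • About the authors: Zhang Jianhua (张建华, b. 1986), female, from Huixian, Henan; assistant engineer at the Experimental Teaching Center, Guangxi University of Finance and Economics; master's degree; research interests: text processing and data mining. Xiao Zhongzheng (肖中正, b. 1988), male, from Guilin, Guangxi; assistant engineer at Sealand Securities Co., Ltd.; master's degree; research interests: distributed computing and data analysis.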

Opinion Target Extraction Method with Speech Rules and Dependency Parsing

  1. (1. The Experimental Teaching Center, Guangxi University of Finance and Economics, Nanning 530003, China; 2. Sealand Securities Co., Ltd., Nanning 530028, China)
  • Received: 2016-01-19 Online: 2016-04-14 Published: 2018-09-30

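Abstract:

Extracting opinion targets benefits both users and merchants: merchants learn from opinion targets which product features users care about and can improve product quality, while users rely on opinion targets to make purchasing decisions. Because online reviews are an unusual setting, extracting opinion targets from them is more complex than traditional information processing. Building on earlier work, this paper proposes an extraction method that combines part-of-speech rules with dependency parsing. First, the method uses part-of-speech rules to define noun-phrase extraction templates and obtain candidate opinion targets, which are filtered a first time according to how evaluation terms modify them. Second, eight dependency relations are used to filter the opinion targets a second time. Finally, the two filtering results are combined to obtain the final opinion targets. Experimental results show that the method is effective to some degree on three types of datasets.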

Key words:

opinion target; evaluation term; information extraction; part-of-speech rules; dependency parsing

Abstract:

Extracting opinion targets benefits both users and businesses: businesses learn which product features users care about and can then improve product quality, while users can base purchasing decisions on these targets. Because Web reviews are an unusual setting, extracting opinion targets from them is more complex than conventional information processing. Building on earlier research, this paper proposes an extraction method that combines part-of-speech rules with dependency parsing. First, part-of-speech rules are used to define noun-phrase extraction templates, which yield candidate opinion targets; these candidates are filtered a first time according to how evaluation terms modify them. Second, eight kinds of dependency relations are used to filter the opinion targets a second time. Finally, the two filtering results are combined to give the final opinion targets. Experimental results show that the method is effective to some degree on three types of datasets.

Key words: opinion target, evaluation term, information extraction, speech rules, dependency parsing
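
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not list the noun-phrase templates, the evaluation-term lexicon, or the eight dependency relations, so `NP_TEMPLATES`, `OPINION_WORDS`, and `TARGET_RELATIONS` are hypothetical placeholders. The first filter is approximated by token proximity, and "combined" is read here as a set union. Input tokens are assumed to be already POS-tagged and parsed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int   # 1-based position in the sentence
    text: str
    pos: str   # coarse POS tag, e.g. "n" noun, "a" adjective
    head: int  # idx of the syntactic head, 0 for the root
    rel: str   # dependency relation between this token and its head


# All three constants are illustrative placeholders, not the paper's.
NP_TEMPLATES = [("n", "n"), ("n",)]   # longest template first
OPINION_WORDS = {"clear", "short"}
TARGET_RELATIONS = {"SBV", "ATT"}     # stand-ins for the eight relations


def candidates(tokens):
    """Step 1: match POS templates left to right, longest first."""
    spans, i = [], 0
    while i < len(tokens):
        for tpl in NP_TEMPLATES:
            window = tokens[i:i + len(tpl)]
            if tuple(t.pos for t in window) == tpl:
                spans.append(tuple(window))
                i += len(tpl)
                break
        else:
            i += 1
    return spans


def near_opinion_word(span, tokens, window=3):
    """First filter (assumed): an opinion word lies within `window` tokens."""
    lo, hi = span[0].idx - window, span[-1].idx + window
    return any(t.text in OPINION_WORDS and lo <= t.idx <= hi for t in tokens)


def linked_to_opinion_word(span, tokens):
    """Second filter: a whitelisted arc joins the span and an opinion word."""
    by_idx = {t.idx: t for t in tokens}
    span_ids = {t.idx for t in span}
    for t in tokens:
        if t.rel not in TARGET_RELATIONS or t.head == 0:
            continue
        head = by_idx[t.head]
        if t.idx in span_ids and head.text in OPINION_WORDS:
            return True
        if head.idx in span_ids and t.text in OPINION_WORDS:
            return True
    return False


def extract_targets(tokens):
    spans = candidates(tokens)
    first = {s for s in spans if near_opinion_word(s, tokens)}
    second = {s for s in spans if linked_to_opinion_word(s, tokens)}
    # Assumption: "combined" is read as a union of the two filtered sets.
    return sorted(" ".join(t.text for t in s) for s in first | second)


# "screen resolution is very clear", hand-annotated for the example.
SENTENCE = [
    Token(1, "screen", "n", 2, "ATT"),
    Token(2, "resolution", "n", 5, "SBV"),
    Token(3, "is", "v", 5, "COP"),
    Token(4, "very", "d", 5, "ADV"),
    Token(5, "clear", "a", 0, "HED"),
]
# "I bought phone": a noun candidate with no opinion word nearby.
NO_OPINION = [
    Token(1, "I", "r", 2, "SBV"),
    Token(2, "bought", "v", 0, "HED"),
    Token(3, "phone", "n", 2, "VOB"),
]
```

Calling `extract_targets(SENTENCE)` keeps the noun phrase "screen resolution", while `extract_targets(NO_OPINION)` returns an empty list because the candidate "phone" passes neither filter. Swapping the union for an intersection would trade recall for precision; the abstract does not say which the authors chose.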

CLC Number: