计算机与现代化

• Network and Communication

基于词重要性的Markov网络查询扩展模型

  

  1. (江西师范大学计算机信息工程学院,江西 南昌 330022)
  • 收稿日期:2017-03-13 出版日期:2017-11-21 发布日期:2017-11-21
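  • About the authors: WANG Qianqian (王千千, b. 1988), female, from Nanchang, Jiangxi, master's student at the School of Computer Information Engineering, Jiangxi Normal University; research interests: information retrieval, data mining. LUO Wenbing (罗文兵, b. 1984), male, experimentalist; research interests: information retrieval, natural language processing.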

Markov Network Query Expansion Model Based on Term Importance

  1. (School of Computer Information Engineering, Jiangxi Normal University, Nanchang 330022, China)
  • Received: 2017-03-13 Online: 2017-11-21 Published: 2017-11-21

摘要: 词项权重已经广泛应用于信息检索模型中,针对传统的词项独立性假设的词袋模型的问题,本文将基于词重要性的词项权重的计算方法应用于Markov网络查询扩展模型中。该词项权重的计算方法须先建立文档的词项图,然后根据词项图得到词项的共现矩阵和词项间的概率转移矩阵,最后利用Markov链的计算方法得到词的权重。将得到的词项权重代入Markov网络扩展模型中,在5个标准数据集上的实验结果表明,采用基于词重要性的Markov网络查询扩展模型的检索结果优于传统的基于词袋的检索结果。

关键词: 词项图, Markov网络, 查询扩展

Abstract: Term weighting is widely used in information retrieval models. To address the term-independence assumption of the traditional bag-of-words model, this paper applies a term-weighting method based on term importance to the Markov network query expansion model. The method first builds a graph-of-word for each document, then derives the term co-occurrence matrix and the inter-term transition probability matrix from that graph, and finally computes the term weights with a Markov chain. With these weights substituted into the Markov network query expansion model, experiments on 5 standard datasets show that the model based on term importance retrieves better results than the traditional bag-of-words approach.

Key words: graph-of-word, Markov network, query expansion
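
The weighting pipeline summarized in the abstract (graph-of-word, co-occurrence matrix, transition probability matrix, Markov chain) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sliding-window size, the damping factor, and the use of damped power iteration to approximate the chain's stationary distribution are all assumptions made for illustration, since the abstract does not specify these details.

```python
from collections import defaultdict


def cooccurrence_matrix(tokens, window=3):
    """Count co-occurrences of distinct terms inside a sliding window.

    The window size is an assumption; the abstract does not state one.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for i, term in enumerate(tokens):
        # Terms that follow `term` within the same window.
        for other in tokens[i + 1 : i + window]:
            if other != term:
                counts[term][other] += 1.0
                counts[other][term] += 1.0
    return counts


def transition_matrix(counts):
    """Row-normalize co-occurrence counts into transition probabilities."""
    return {
        term: {nb: c / sum(row.values()) for nb, c in row.items()}
        for term, row in counts.items()
    }


def term_weights(tokens, window=3, damping=0.85, iterations=100):
    """Approximate the stationary distribution of the term Markov chain.

    Damped power iteration is an assumed choice that guarantees
    convergence; the paper's exact computation may differ.
    """
    vocab = sorted(set(tokens))
    n = len(vocab)
    trans = transition_matrix(cooccurrence_matrix(tokens, window))
    weights = {t: 1.0 / n for t in vocab}
    for _ in range(iterations):
        nxt = {t: (1.0 - damping) / n for t in vocab}
        for term in vocab:
            row = trans.get(term)
            if not row:
                # Isolated term: spread its probability mass uniformly.
                for t in vocab:
                    nxt[t] += damping * weights[term] / n
                continue
            for nb, p in row.items():
                nxt[nb] += damping * weights[term] * p
        weights = nxt
    return weights


if __name__ == "__main__":
    doc = "information retrieval query expansion information retrieval model"
    for term, w in sorted(term_weights(doc.split()).items(), key=lambda kv: -kv[1]):
        print(f"{term}\t{w:.4f}")
```

In this sketch a term's weight grows with how strongly it is connected to other terms in the document graph, which is the sense in which the weight departs from the independence assumption of a bag-of-words count.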

CLC number: