Computer and Modernization ›› 2011, Vol. 7 ›› Issue (7): 96-98,1.

• Algorithm Analysis and Design •

融合VSM技术的PageRank算法研究与应用

李卫东,陆 玲   

Research and Application of PageRank Algorithm Combined with VSM Technique

LI Wei-dong, LU Ling   

  1. College of Information Engineering, East China Institute of Technology, Nanchang 330013, China
  • Received: 2011-05-03  Online: 2011-07-15  Published: 2011-07-15

Abstract: To address the "topic drift" problem of the PageRank algorithm, this paper proposes an improved method that incorporates the vector space model (VSM). The method first computes each Web page's PageRank value from the hyperlink structure, then builds a content feature vector space for the pages and computes their topic content similarity. Finally, it combines the two values according to a weight coefficient to produce a new PageRank value. Comparative experiments show that the improved PageRank algorithm reduces the number of irrelevant Web pages and gives search engines better ranking results.

Key words: PageRank algorithm, hyperlink analysis, vector space model, search engine
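
The three steps named in the abstract (link-based PageRank, VSM content similarity, weighted fusion) can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the damping factor of 0.85, raw term-frequency vectors with cosine similarity, normalizing PageRank by its maximum, and the fusion weight `alpha`. The function names are hypothetical, and every link target is assumed to appear as a key in `links`.

```python
import math
from collections import Counter


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            for q in links[p]:
                new_rank[q] += damping * rank[p] / len(links[p])
        rank = new_rank
    return rank


def cosine_similarity(text_a, text_b):
    """Cosine similarity between raw term-frequency vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def fused_scores(links, texts, topic, alpha=0.5):
    """Blend link score and topic similarity (alpha is an assumed weight)."""
    pr = pagerank(links)
    top = max(pr.values())  # scale PageRank into [0, 1] before blending
    return {
        p: alpha * pr[p] / top + (1.0 - alpha) * cosine_similarity(texts[p], topic)
        for p in links
    }


# Toy data, invented for this sketch only.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
texts = {
    "a": "football match results",
    "b": "cooking pasta recipes",
    "c": "football league table",
}
scores = fused_scores(links, texts, topic="football league")
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph, page `b` shares no terms with the topic, so its similarity term is zero and only its link score contributes. That is the kind of off-topic page the fusion step is meant to push down the ranking.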