Computer and Modernization (计算机与现代化)

• Algorithm Analysis and Design

An Improved LSH/MinHash Collaborative Filtering Algorithm

BIAN Yi-jie (卞艺杰), CHEN Chao (陈超), MA Ling-ling (马玲玲), CHEN Yuan-lei (陈远磊)

  1. Business School, Hohai University, Nanjing 210098, Jiangsu, China
  • Received: 2013-07-08 Online: 2013-12-18 Published: 2013-12-18

Abstract: In recent years, many recommender systems based on collaborative filtering have been applied successfully. As the numbers of users and items in a system keep growing, however, the cost of similarity computation rises sharply, and the scalability of collaborative filtering recommender systems becomes an increasingly prominent problem. This paper proposes an improved LSH/MinHash algorithm based on approximate nearest neighbor search and applies it to the clustering of library resources. The aim is to cluster high-dimensional, large-volume data within reasonable time complexity, reducing the amount of similarity computation and improving the scalability of the algorithm. Experiments show that the algorithm achieves relatively high efficiency and accuracy.

Key words: library, personalized recommendation, collaborative filtering, LSH
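
The abstract does not spell out the improved algorithm, so the following is only a minimal sketch of the standard MinHash signature plus LSH banding scheme that such methods build on, not the authors' variant. It shows how candidate pairs of similar users can be found without comparing every pair. All names here (`make_hash_params`, `minhash_signature`, `lsh_candidate_pairs`, the user ids and the borrowed-book ids) are hypothetical and chosen for illustration.

```python
import random

# Mersenne prime 2^31 - 1; item ids are assumed to be integers in [0, PRIME).
PRIME = 2_147_483_647


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(items, params):
    """MinHash signature of a non-empty set of integer item ids."""
    return [min((a * x + b) % PRIME for x in items) for a, b in params]


def estimate_jaccard(sig1, sig2):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    return sum(u == v for u, v in zip(sig1, sig2)) / len(sig1)


def lsh_candidate_pairs(signatures, bands, rows):
    """Banding step: users whose signatures agree on a whole band become candidates.

    signatures maps a user id to a signature of length bands * rows.
    """
    candidates = set()
    for band in range(bands):
        buckets = {}
        for user, sig in signatures.items():
            key = tuple(sig[band * rows:(band + 1) * rows])
            buckets.setdefault(key, []).append(user)
        for users in buckets.values():
            for i in range(len(users)):
                for j in range(i + 1, len(users)):
                    candidates.add(tuple(sorted((users[i], users[j]))))
    return candidates


if __name__ == "__main__":
    # Hypothetical borrowing records: user id -> set of borrowed book ids.
    borrowed = {
        "u1": {1, 2, 3, 4},
        "u2": {1, 2, 3, 4},
        "u3": {10, 11, 12},
    }
    params = make_hash_params(num_hashes=20, seed=42)
    sigs = {u: minhash_signature(books, params) for u, books in borrowed.items()}
    print(lsh_candidate_pairs(sigs, bands=5, rows=4))  # {('u1', 'u2')}
```

Only the candidate pairs returned by the banding step need an exact similarity computation, which is where the saving over all-pairs comparison comes from. The choice of `bands` and `rows` trades recall against the number of candidates and would have to be tuned for real data.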