Recommendation Algorithm Based on User Information Vector Clustering and Improved SAMME

Computer and Modernization ›› 2021, Vol. 0 ›› Issue (07): 23-28.

(College of Information Science and Technology (College of Internet Security), Chengdu University of Technology, Chengdu 610051, China)

Online: 2021-08-02 Published: 2021-08-02
About the authors: WANG Shan-wen (1997-), male, born in Suining, Sichuan, master's student; research interests: recommender systems, deep learning; E-mail: 302618914@qq.com. OU Ou (1977-), male, professor, Ph.D.; research interests: intelligent computing, spatial information technology. MA Wan-min (1998-), male, master's student; research interest: machine learning. CHEN Jian-lin (1995-), male, master's student; research interest: recommender systems.
Supported by:
National Key Research and Development Program of China (2018YFF01013304)

Abstract

Mainstream recommendation algorithms often capture incomplete user information and take too long to produce recommendations. To address both problems, this paper proposes a recommendation algorithm based on user information vector clustering and an improved SAMME. The algorithm first analyzes basic user information (region, time, interests, tags, etc.) to extract user information keywords. It then weights these keywords with the TF-IDF method to build a user information vector for each user, clusters the users with the K-means algorithm, and uses the clustering results as the training sample set for the improved SAMME. Finally, the predictions of the improved SAMME are used to recommend friends to users; the model is saved during training, which greatly reduces recommendation time. The algorithm is evaluated on a real Weibo user dataset and compared with other mainstream algorithms. The results show that it achieves good accuracy, recall, and F-value.

Key words: recommendation system, SAMME algorithm, user information, cluster analysis
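The pipeline described in the abstract (keyword weighting, user clustering, boosting on the cluster labels) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses the standard SAMME of Zhu et al. rather than the paper's improved variant, whose details the abstract does not give. The toy users, the IDF formula `ln(N / df) + 1`, the farthest-point seeding, and the decision-stump weak learner are all assumptions made for illustration.

```python
import math
from collections import Counter

def fit_tfidf(docs):
    """Learn a sorted vocabulary and IDF weights from keyword lists."""
    vocab = sorted({w for d in docs for w in d})
    df = Counter(w for d in docs for w in set(d))
    # Assumption: idf = ln(N / df) + 1; the paper's exact variant is not given.
    return vocab, {w: math.log(len(docs) / df[w]) + 1.0 for w in vocab}

def vectorize(keywords, vocab, idf):
    """Build one user information vector: term frequency times IDF."""
    tf = Counter(keywords)
    return [tf[w] / len(keywords) * idf[w] for w in vocab]

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point seeding (assumed)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sqdist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def fit_stump(X, y, w, classes):
    """Weighted decision stump: one feature, one threshold, one class per side."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            sides = {True: Counter(), False: Counter()}
            for x, yi, wi in zip(X, y, w):
                sides[x[j] <= t][yi] += wi
            pred = {s: max(classes, key=lambda c: sides[s][c]) for s in sides}
            err = sum(wi for x, yi, wi in zip(X, y, w) if pred[x[j] <= t] != yi)
            if best is None or err < best[0]:
                best = (err, j, t, pred)
    return best

def samme_fit(X, y, rounds=10):
    """Standard SAMME; assumes y contains at least two distinct classes."""
    classes = sorted(set(y))
    K = len(classes)
    w = [1.0 / len(X)] * len(X)
    model = []
    for _ in range(rounds):
        err, j, t, pred = fit_stump(X, y, w, classes)
        err = max(err, 1e-10)              # avoid log(0) for a perfect stump
        if err >= 1 - 1 / K:               # no better than random guessing
            break
        alpha = math.log((1 - err) / err) + math.log(K - 1)
        model.append((alpha, j, t, pred))
        # Up-weight misclassified samples, then renormalize.
        w = [wi * math.exp(alpha) if pred[x[j] <= t] != yi else wi
             for x, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return classes, model

def samme_predict(classes, model, x):
    """Return the class with the largest alpha-weighted vote."""
    votes = Counter()
    for alpha, j, t, pred in model:
        votes[pred[x[j] <= t]] += alpha
    return max(classes, key=lambda c: votes[c])

# Hypothetical users and keywords, for illustration only.
users = {
    "u1": ["chengdu", "music", "travel"],
    "u2": ["chengdu", "music", "film"],
    "u3": ["beijing", "coding", "ai"],
    "u4": ["beijing", "coding", "games"],
}
names = list(users)
vocab, idf = fit_tfidf(list(users.values()))
X = [vectorize(users[n], vocab, idf) for n in names]
labels = kmeans(X, k=2)                    # cluster ids become training labels
classes, model = samme_fit(X, labels)
new_user = vectorize(["beijing", "ai", "music"], vocab, idf)
cluster = samme_predict(classes, model, new_user)
recommended = [n for n, l in zip(names, labels) if l == cluster]
print(labels, cluster, recommended)
```

In this sketch the trained `model` is a plain list of tuples, so it could be serialized once and reloaded at recommendation time, which is one way to read the abstract's point about saving the model to cut recommendation time.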

WANG Shan-wen, OU Ou, MA Wan-min, CHEN Jian-lin. Recommendation Algorithm Based on User Information Vector Clustering and Improved SAMME[J]. Computer and Modernization, 2021, 0(07): 23-28.

