Computer and Modernization ›› 2022, Vol. 0 ›› Issue (10): 1-7.

• Artificial Intelligence •

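FOCoR: A Course Recommendation Approach Based on Feature Selection Optimization

  1. (College of Computer Science and Technology, Guizhou University, Guiyang 550025, China)
  • Online: 2022-10-20 Published: 2022-10-20
  • About the authors: WANG Yang (b. 1996), male, from Guiyang, Guizhou, master's student, research interests: large-scale data processing and analysis, recommender systems, E-mail: 1459422465@qq.com; CHEN Mei (b. 1964), female, from Duyun, Guizhou, professor, research interests: large-scale data processing, data mining technology and applications, E-mail: 1004225928@qq.com; Corresponding author: LI Hui (b. 1982), male, from Changsha, Hunan, professor, Ph.D., research interests: large-scale data management and analysis, artificial intelligence technology and applications, high-performance databases, cloud services, E-mail: cse.HuiLi@gzu.edu.cn.
  • Funding:
    National Natural Science Foundation of China (62162010, 62162011); Guizhou Province High-Level Innovative Talents Project (Qian Cai Jiao [2018] 190)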

Abstract: To address the cold-start problem of recommendation models built on behavior logs from online education platforms, we design FOCoR, a course recommendation method that integrates university course-selection data. First, we propose FSBGA (Feature Selection Based on Genetic Algorithm), a feature selection technique; the selected features then serve as input to a recommendation model built on LightGBM, a gradient boosting tree framework. Specifically, FSBGA uses a fitness function that combines model loss with the number of features, and searches the feature-subset space of the university course-selection data for the subset that best balances the two. Compared with feature selection methods based on mutual information and the F-test, the course-selection model trained on the subset chosen by FSBGA performs better on all three metrics: AUC, F1 score, and log loss. To verify the effectiveness of this work, we evaluate FOCoR against LightGBM, XGBoost, decision tree, random forest, logistic regression, and other algorithms on a real-world dataset; the results show that FOCoR achieves the best F1 score.

Key words: course recommendation, cold start, feature selection, genetic algorithm
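
The abstract states that FSBGA searches feature subsets with a fitness function combining model loss and feature count, but it does not give the formula or the genetic operators. The following is a minimal sketch under stated assumptions, not the authors' implementation: the weighted-sum fitness, the weight `alpha`, tournament selection, single-point crossover, bit-flip mutation, and elitism are all illustrative choices. The `loss_fn` callback is hypothetical; in a FOCoR-like setting it would presumably return the validation log loss of a LightGBM model trained on the selected columns. A toy loss stands in here so the sketch runs with the standard library only.

```python
import random
from typing import Callable, Dict, Tuple

Mask = Tuple[int, ...]  # 1 = feature kept, 0 = feature dropped


def fitness(mask: Mask, loss_fn: Callable[[Mask], float], alpha: float) -> float:
    """Lower is better: model loss plus a penalty on the fraction of features kept.

    The weighted-sum form and `alpha` are assumptions, not the paper's formula.
    """
    if not any(mask):
        return float("inf")  # an empty subset cannot train a model
    return loss_fn(mask) + alpha * sum(mask) / len(mask)


def ga_select(
    n_features: int,  # must be at least 2 for single-point crossover
    loss_fn: Callable[[Mask], float],
    alpha: float = 0.1,
    pop_size: int = 30,
    generations: int = 40,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Mask:
    rng = random.Random(seed)
    cache: Dict[Mask, float] = {}  # loss_fn may be expensive (model training)

    def score(mask: Mask) -> float:
        if mask not in cache:
            cache[mask] = fitness(mask, loss_fn, alpha)
        return cache[mask]

    # Start from the "use every feature" baseline plus random subsets.
    pop = [(1,) * n_features] + [
        tuple(rng.randint(0, 1) for _ in range(n_features))
        for _ in range(pop_size - 1)
    ]

    def tournament() -> Mask:
        a, b = rng.sample(pop, 2)
        return a if score(a) <= score(b) else b

    best = min(pop, key=score)
    for _ in range(generations):
        children = [best]  # elitism: the best subset so far always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = tuple(
                bit ^ 1 if rng.random() < mutation_rate else bit for bit in child
            )  # bit-flip mutation
            children.append(child)
        pop = children
        best = min(pop, key=score)
    return best


# Toy stand-in for a model loss: features 0, 2 and 5 are "informative".
INFORMATIVE = {0, 2, 5}


def toy_loss(mask: Mask) -> float:
    hits = sum(mask[i] for i in INFORMATIVE)
    return 1.0 - 0.3 * hits


best = ga_select(8, toy_loss, alpha=0.1, seed=1)
print(best, fitness(best, toy_loss, 0.1))
```

Because the all-features baseline is in the initial population and the best subset is carried over each generation, the returned subset never scores worse than using every feature. Raising `alpha` pushes the search toward smaller subsets, at the cost of tolerating a higher loss.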