Computer and Modernization (计算机与现代化), 2023, Vol. 0, Issue (03): 90-95.

• Databases and Data Mining •

Adj-LightGBM Person-Post Matching Algorithm Based on SMOTE and Bayesian Optimization

  

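  1. (School of Mathematics and Computational Sciences, Wuyi University, Jiangmen, Guangdong 529020, China)
  • Publication date: 2023-04-17 Release date: 2023-04-17
  • About the authors: 刘付谦 (b. 2000), male, from Dongguan, Guangdong, undergraduate student, research interest: data mining, E-mail: liufuqian2000@163.com; 秦华妮 (b. 1977), female, from Changde, Hunan, associate professor, Ph.D., research interests: intelligent information processing and data mining, E-mail: qhn2010@126.com; 赖惠慧 (b. 2001), female, from Huizhou, Guangdong, undergraduate student, research interest: data mining, E-mail: 1303843032@qq.com.
  • Funding:
    National Natural Science Foundation of China (11871379); 2021 Guangdong Province College Students' Innovation and Entrepreneurship Training Program, Key Support Area Project (202111349071)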

Person-post Matching Adj-LightGBM Algorithm Based on SMOTE and Bayesian Optimization

  1. (School of Mathematics and Computational Sciences, Wuyi University, Jiangmen 529020, China)
  • Online:2023-04-17 Published:2023-04-17

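Abstract: Over the past two years, the COVID-19 pandemic has hit every industry hard and made traditional recruitment difficult to carry out: employers face large talent gaps, while job seekers cannot apply in person. Online recruitment has brought some convenience to both job seekers and employers, but person-post matching remains inefficient and imbalanced, so completing it accurately and quickly has become an urgent problem. To address this problem, a person-post matching algorithm, Adj-LightGBM, based on SMOTE and Bayesian optimization is proposed. First, the person-post dataset is preprocessed. Second, the SMOTE algorithm is used to oversample the successfully matched samples, so that the ratio of positive to negative samples after processing is 1:3. Then, Bayesian optimization is used on the validation set to search for the optimal LightGBM model. Finally, the model is tested and evaluated, yielding an F1-score of 0.974 and an AUC of 0.971. Comparison with support vector machine, random forest, and XGBoost shows that the proposed Adj-LightGBM algorithm not only predicts person-post matches more accurately but also has a significant advantage in model training efficiency.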

Keywords: person-post matching, imbalanced data, oversampling technique, Bayesian optimization, light gradient boosting machine

Abstract: COVID-19 has had a significant impact on all walks of life over the last two years, and traditional recruitment methods have become difficult to put into practice. On the one hand, employers face a large talent gap; on the other hand, job seekers cannot apply for jobs in person. The emergence of online recruitment has brought some convenience to job seekers and recruiters, but issues remain, such as low efficiency and imbalanced matching between persons and posts. How to perform person-post matching accurately and swiftly has become an urgent issue that needs to be addressed. To solve this problem, a person-post matching algorithm, Adj-LightGBM, based on SMOTE and Bayesian optimization is proposed. Firstly, the person-post dataset is preprocessed. Secondly, the SMOTE algorithm is used to oversample the successfully matched samples, giving a positive-to-negative sample ratio of 1:3. Then, Bayesian optimization is used to find the optimal LightGBM model on the validation set. Finally, the model is tested and evaluated. The optimal AUC and F1-score of the model are 0.974 and 0.970, respectively. Compared with the support vector machine, random forest, and XGBoost algorithms, the proposed algorithm not only achieves higher accuracy in person-post matching prediction but also offers a substantial advantage in model training efficiency.

Key words: person-post matching, unbalanced data, SMOTE, Bayesian optimization, LightGBM
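
The oversampling step described in the abstract (SMOTE applied to the successfully matched samples until positives and negatives stand at 1:3) can be illustrated with a small standard-library sketch. This is not the authors' implementation, which the abstract does not show. The function name `smote_to_ratio`, the parameters `neg_per_pos`, `k`, and `seed`, the default `k=5`, and the toy data are all assumptions made for illustration. The sketch follows the general SMOTE idea: each synthetic point is a random interpolation between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote_to_ratio(minority, majority_count, neg_per_pos=3, k=5, seed=0):
    """Generate synthetic minority points until positives:negatives is
    about 1:neg_per_pos. All names and defaults here are illustrative
    assumptions, not taken from the paper."""
    rng = random.Random(seed)
    target = -(-majority_count // neg_per_pos)  # ceiling division
    needed = max(0, target - len(minority))
    if needed and len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    k = min(k, len(minority) - 1)

    synthetic = []
    for _ in range(needed):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy example: 3 matched (positive) samples against 30 unmatched (negative) ones.
positives = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_to_ratio(positives, majority_count=30)
print(len(positives) + len(new_points), "positives vs 30 negatives")
```

In the toy example, 7 synthetic positives are added to the 3 original ones, giving 10 positives against 30 negatives, which is the 1:3 ratio. Synthetic samples of this kind would normally be added only to the training split, so that validation and test scores reflect real samples.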