Computer and Modernization ›› 2023, Vol. 0 ›› Issue (03): 29-37.

• Artificial Intelligence •

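CTR Prediction Model Combining Attention Mechanism and Graph Neural Network

  1. (School of Computer and Information, Anhui Normal University, Wuhu 241002, China)
  • Online: 2023-04-17 Published: 2023-04-17
  • About the authors:
    XIA Yichun (夏义春, b. 1996), male, from Feixi, Anhui; master's student; research interests: recommender systems, computational advertising, deep learning; E-mail: 2865067549@qq.com
    LI Wanggen (李汪根, b. 1973), male, from Taihu, Anhui; professor, Ph.D.; research interests: biological computing, intelligent computing; E-mail: 1346858911@qq.com
    LI Doudou (李豆豆, b. 1996), male, from Huaibei, Anhui; master's student; research interests: image processing, deep learning; E-mail: 1346858911@qq.com
    GE Yingkui (葛英奎, b. 1997), male, from Ma'an, Anhui; master's student; research interests: recommender systems, deep learning; E-mail: 1013366766@qq.com
    WANG Zhige (王志格, b. 1997), male, from Xuancheng, Anhui; master's student; research interests: recommender systems, deep learning; E-mail: 1076797329@qq.com
  • Funding:
    University Leading Talent Introduction and Cultivation Program (051619)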

Abstract: Most CTR prediction algorithms initialize all feature embeddings to a single fixed dimension, ignoring the low popularity of long-tail item features. Giving these features embedding vectors of the same length as those of head items leads to unbalanced model training and degrades the final prediction results. To address this, this paper first adopts an end-to-end differentiable framework that automatically selects different embedding dimensions according to feature popularity. Second, it introduces a squeeze-and-excitation network mechanism and a multi-head self-attention mechanism with residual connections, which respectively learn feature importance dynamically and identify important feature combinations from different perspectives; a graph neural network then replaces the traditional inner product and Hadamard product to explicitly model second-order feature interactions. Finally, to further improve performance, the DNN component is combined with the shallow model to form a deep model, and a Bayesian optimization algorithm selects a set of hyperparameters for the deep model, avoiding a complex tuning process. Experiments on two benchmark datasets verify the effectiveness of the model.

Key words: CTR prediction, automatic embedding search, squeeze-and-excitation network, multi-head self-attention mechanism, graph neural network, Bayesian optimization
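
The squeeze-and-excitation step named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. It assumes the common formulation in which each field embedding is squeezed to its mean, passed through two small fully connected layers (ReLU, then sigmoid), and the resulting weight rescales that field's embedding. The function names `squeeze_excite` and `random_matrix`, the layer sizes, and the random weights are all hypothetical stand-ins for trained parameters.

```python
import math
import random


def squeeze_excite(embeddings, w1, w2):
    """Reweight field embeddings with a squeeze-and-excitation step.

    embeddings: list of f field vectors (lists of floats, lengths may differ).
    w1: r x f matrix (reduction layer); w2: f x r matrix (expansion layer).
    Returns (weights, reweighted_embeddings).
    """
    # Squeeze: summarize each field embedding by its mean value.
    z = [sum(e) / len(e) for e in embeddings]
    # Excitation, layer 1: reduce f statistics to r hidden units with ReLU.
    h = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w1]
    # Excitation, layer 2: expand back to f weights; sigmoid keeps them in (0, 1).
    a = [1.0 / (1.0 + math.exp(-sum(w * hi for w, hi in zip(row, h))))
         for row in w2]
    # Reweight: scale every field embedding by its importance weight.
    scaled = [[ai * x for x in e] for ai, e in zip(a, embeddings)]
    return a, scaled


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for trained weights (assumption, not from the paper)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    rng = random.Random(0)
    fields, dim, reduced = 4, 3, 2  # illustrative sizes only
    emb = random_matrix(fields, dim, rng)
    weights, out = squeeze_excite(
        emb,
        random_matrix(reduced, fields, rng),
        random_matrix(fields, reduced, rng),
    )
    print("field importance weights:", weights)
```

Because the squeeze reduces each field to a single statistic, the sketch also accepts field embeddings of different lengths, which is the situation that arises when embedding dimensions are chosen per feature.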