Computer and Modernization (计算机与现代化) ›› 2025, Vol. 0 ›› Issue (07): 119-126. doi: 10.3969/j.issn.1006-2475.2025.07.017

• Algorithm Design and Analysis •


  


• About the authors: LAI Zhiyong (1997—), male, from Ganzhou, Jiangxi, master's student; research interests: machine learning and data mining; E-mail: jonezylai@163.com. Corresponding author: WANG Tinghua (1977—), male, from Yushan, Jiangxi, professor, Ph.D.; research interests: artificial intelligence and machine learning; E-mail: wthpku@163.com. ZHANG Xin (1999—), male, from Ganzhou, Jiangxi, master's student; research interests: machine learning and data mining; E-mail: zhxblbj@163.com.
  • Funding:
     Key Project of the Jiangxi Provincial Natural Science Foundation (20242BAB26024); Jiangxi Province Graduate Innovation Special Fund Project (YC2023-S865); Jiangxi Province Degree and Graduate Education Teaching Reform Research Project (JXYJG-2022-172)

Feature Weighted Support Vector Machine Based on HSIC Lasso


  1. (School of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, China)
  • Online:2025-07-22 Published:2025-07-22




Abstract: Support vector machine (SVM) has been successfully applied to data classification by using kernel functions to transform the original low-dimensional problem into a linear problem in a high-dimensional kernel space. However, the classical SVM algorithm treats all features equally, ignoring the fact that different features contribute differently to the model's output, so the constructed kernel space may not be entirely reasonable. This paper proposes a feature-weighted SVM algorithm based on the Hilbert-Schmidt independence criterion (HSIC) Lasso, named HSIC Lasso-FWSVM. The algorithm first uses the HSIC, an effective measure of the dependence between two random variables, to compute the correlations among features and between features and labels in the feature space, taking these correlations as the weights of the corresponding features. It then applies Lasso regression, whose sparsity constraint re-estimates the feature weights and shrinks the weights of some irrelevant features to zero. Finally, the resulting feature weights are incorporated into the computation of the SVM kernel function, preventing weakly relevant or irrelevant features from interfering with the kernel computation. Simulation experiments on nine UCI datasets comparing the proposed algorithm with the classical SVM and several recent feature-weighted SVM algorithms show that HSIC Lasso-FWSVM achieves better generalization ability and robustness.
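The pipeline described in the abstract (HSIC dependence measurement, Lasso-sparsified feature weights, feature-weighted SVM kernel) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the functions `centered_gram` and `hsic_lasso_weights`, the Gaussian kernel applied to the labels (a delta kernel is common for classification labels in HSIC Lasso), and all hyperparameter values are illustrative choices. It uses NumPy and scikit-learn.

```python
# Sketch of the HSIC Lasso-FWSVM idea (illustrative, not the paper's code):
# 1) estimate sparse, non-negative feature weights by regressing the
#    centered label Gram matrix on the per-feature centered Gram matrices
#    (the HSIC Lasso objective), then
# 2) rescale each feature by the square root of its weight before fitting
#    an RBF-kernel SVM, which realizes a feature-weighted kernel.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.svm import SVC

def centered_gram(v, gamma=1.0):
    """Centered Gaussian Gram matrix of a single variable: (n,) -> (n, n)."""
    d2 = (v[:, None] - v[None, :]) ** 2
    K = np.exp(-gamma * d2)
    n = len(v)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return H @ K @ H

def hsic_lasso_weights(X, y, alpha=0.01):
    """Non-negative Lasso over vectorized centered Gram matrices:
    min_w ||vec(L_bar) - sum_d w_d vec(K_bar_d)||^2 + alpha * ||w||_1, w >= 0."""
    n, d = X.shape
    L_bar = centered_gram(y.astype(float)).ravel()
    Phi = np.column_stack([centered_gram(X[:, j]).ravel() for j in range(d)])
    lasso = Lasso(alpha=alpha, positive=True, fit_intercept=False, max_iter=10000)
    lasso.fit(Phi, L_bar)
    return lasso.coef_                    # sparse, non-negative feature weights

# Toy data: only the first two of five features carry the label signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

w = hsic_lasso_weights(X, y)
Xw = X * np.sqrt(np.maximum(w, 0.0))      # feature-weighted representation
clf = SVC(kernel="rbf").fit(Xw, y)        # weighted-kernel SVM via rescaling
```

Rescaling feature d by sqrt(w_d) before the standard RBF kernel exp(-γ‖x − z‖²) is equivalent to evaluating the feature-weighted kernel exp(-γ Σ_d w_d (x_d − z_d)²), so weakly relevant features (those whose Lasso weight shrinks to zero) no longer influence the kernel computation.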

Key words: support vector machine (SVM), HSIC Lasso, feature weighting, kernel methods, machine learning

CLC number: