Computer and Modernization (计算机与现代化) ›› 2018, Vol. 0 ›› Issue (08): 61-. doi: 10.3969/j.issn.1006-2475.2018.08.012

• Database and Data Mining •

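Coupling Sample Prior Distribution Weighted Extreme Learning Machine

  1. (School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212003, China)
  • Online: 2018-09-11 Published: 2018-09-11
  • About the authors: XI Xiaoyan (席晓燕) (1989-), female, born in Ruzhou, Henan; master's student at the School of Computer, Jiangsu University of Science and Technology; research interests: machine learning, data mining. YU Hualong (于化龙) (1982-), male, born in Harbin, Heilongjiang; associate professor, Ph.D.; research interests: machine learning, data mining.
  • Funding:
    National Natural Science Foundation of China (61305058, 61572242); Natural Science Foundation of Jiangsu Province (BK20130471); China Postdoctoral Science Foundation Special Funding (2015T80481); China Postdoctoral Science Foundation (2013M540404); Jiangsu Province Postdoctoral Foundation (1401037B)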

Abstract: Extreme learning machines (ELMs) are widely used for classification, clustering, and regression. For class-imbalanced classification, however, previous work has not fully considered how the prior distribution of the samples affects classification performance. To address this, the paper proposes the Coupling sample Prior distribution Weighted Extreme Learning Machine (CPWELM) algorithm. Building on the weighted extreme learning machine, the algorithm examines how important each sample is given the distribution it comes from, uses these importances to construct the cost matrix, and thereby improves classifier performance. Experiments on 12 imbalanced datasets verify the feasibility and effectiveness of CPWELM. The results indicate that CPWELM generally outperforms comparable algorithms.

Key words: class imbalance, extreme learning machine, cost-sensitive learning, sample prior information
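
The abstract does not give the CPWELM weighting rule, so the following is only a minimal sketch of the weighted extreme learning machine that the algorithm builds on, not the authors' method. It assumes the commonly used closed form beta = (I/C + H^T W H)^-1 H^T W T, where W is a diagonal matrix of per-sample costs. The per-sample costs here are plain inverse class frequencies, used as a stand-in for the prior-distribution-based costs the paper proposes. All function names and parameters (`class_balanced_weights`, `train_weighted_elm`, `predict`, `n_hidden`, `C`) are hypothetical.

```python
import numpy as np


def class_balanced_weights(y):
    """Stand-in cost per sample: 1 / (number of samples in its class)."""
    classes, counts = np.unique(y, return_counts=True)
    size_of = dict(zip(classes.tolist(), counts.tolist()))
    return np.array([1.0 / size_of[label] for label in y.tolist()])


def train_weighted_elm(X, y, weights, n_hidden=20, C=100.0, seed=0):
    """Fit a single-hidden-layer weighted ELM with a sigmoid hidden layer."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    # One column per class, +1 for the true class and -1 otherwise.
    T = np.where(y[:, None] == classes[None, :], 1.0, -1.0)
    # Hidden-layer parameters are drawn at random and never trained.
    A = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))
    # H^T W without forming the N x N diagonal matrix W explicitly.
    HtW = H.T * weights
    # Output weights: beta = (I/C + H^T W H)^-1 H^T W T.
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ T)
    return A, b, beta, classes


def predict(model, X):
    """Assign each row of X to the class with the largest output."""
    A, b, beta, classes = model
    H = 1.0 / (1.0 + np.exp(-(X @ A + b)))
    return classes[np.argmax(H @ beta, axis=1)]


# Toy imbalanced problem: 95 majority samples, 5 minority samples.
demo_rng = np.random.default_rng(42)
X_demo = np.vstack([
    demo_rng.normal(loc=-2.0, scale=0.5, size=(95, 2)),
    demo_rng.normal(loc=2.0, scale=0.5, size=(5, 2)),
])
y_demo = np.array([0] * 95 + [1] * 5)
demo_model = train_weighted_elm(X_demo, y_demo, class_balanced_weights(y_demo))
demo_pred = predict(demo_model, X_demo)
```

Any other per-sample cost vector of length N can be passed as `weights`, which is where a prior-distribution-based scheme such as the one the abstract describes would plug in.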
