Computer and Modernization ›› 2020, Vol. 0 ›› Issue (06): 120-.


Improvement of Fuzzy C-Means Clustering Algorithm Based on Self-paced Data Reconstruction Regularization

  1. (1. Xi’an Aeronautical University, Xi’an 710077, China;
    2. China Special Equipment Inspection and Research Institute, Beijing 100029, China;
    3. School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an 710049, China)
  • Received: 2019-10-22 Online: 2020-06-24 Published: 2020-06-28
  • About the authors: Chen Yijun (陈怡君, b. 1984), female, from Guyuan, Ningxia, master's student; research interests: digital libraries, data mining technology and applications; E-mail: xiaoyifighting@stu.xjtu.edu.cn. Cao Luowei (曹逻炜, b. 1985), male, senior engineer, Ph.D.; research interests: risk control of pressure equipment, structural integrity assessment, damage prediction; E-mail: lwcao_1794@126.com. Du Yuqian (杜玉倩, b. 1994), female, master's student; research interests: big data processing and analysis methods; E-mail: 809009017@qq.com.

Abstract: To reduce the sensitivity of the fuzzy C-means algorithm to outliers and noise points, this paper proposes a fuzzy C-means clustering algorithm with self-paced data reconstruction regularization. The traditional algorithm obtains a fuzzy partition of the data by introducing a weighting parameter into the objective function of C-means clustering. The proposed method instead obtains fuzzy memberships by regularizing the hard C-means objective function through data reconstruction, and it clusters the data points gradually in a self-paced learning manner. Experimental results on simulated data, real data, and image segmentation show that the algorithm significantly reduces sensitivity to outliers and noisy data, and that its clustering is more accurate and efficient.

Key words: fuzzy C-means, clustering partition, self-paced learning, data reconstruction regularization
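
For orientation, the traditional algorithm that the abstract contrasts against (fuzzy C-means with a weighting exponent in the objective) can be sketched as follows. This is only the classical baseline in its textbook form, written with the Python standard library. It is not the self-paced, reconstruction-regularized method proposed in the paper, whose formulas are not given on this page. The function name `fuzzy_c_means`, its parameters, and the toy data are illustrative assumptions.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Classical fuzzy C-means (baseline only, not the paper's variant).

    points: list of equal-length tuples of floats.
    m: weighting (fuzzifier) exponent in the objective, must be > 1.
    Returns (centers, memberships); each membership row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalized so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / w_sum
                for d in range(dim)
            ))

        # Membership update from squared distances to the centers.
        shift = 0.0
        for i, x in enumerate(points):
            d2 = [sum((x[d] - c[d]) ** 2 for d in range(dim)) for c in centers]
            if min(d2) == 0.0:
                # Point sits exactly on a center: share membership among those.
                zeros = [k for k, v in enumerate(d2) if v == 0.0]
                new_row = [1.0 / len(zeros) if k in zeros else 0.0
                           for k in range(n_clusters)]
            else:
                inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
                total = sum(inv)
                new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centers, u


if __name__ == "__main__":
    # Hypothetical toy data: two tight groups of three points each.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, memberships = fuzzy_c_means(data, n_clusters=2)
    print(centers)
    print([row.index(max(row)) for row in memberships])
```

Because every point receives a nonzero membership in every cluster and contributes to every center, a distant outlier can pull the centers toward itself, which is the sensitivity the abstract says the proposed method is designed to reduce.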

CLC number: