Computer and Modernization ›› 2021, Vol. 0 ›› Issue (11): 89-94.

• Algorithm Design and Analysis •

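Improved FCM Algorithm Based on Entropy and Neighborhood Constraint

  1. (School of Science, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China)

  • Online: 2021-12-13 Published: 2021-12-13
  • About the authors: FENG Junqi (冯俊淇), born 1997, male, from Shenyang, Liaoning, master's student, research interest: data mining, E-mail: 1193875868@qq.com; Corresponding author: ZHANG Zhengjun (张正军), born 1965, male, from Funing, Jiangsu, associate professor, master's supervisor, Ph.D., research interest: data mining, E-mail: zjzhang@njust.edu.cn; ZHANG Man (章曼), born 1998, female, from Anqing, Anhui, master's student, research interest: data mining, E-mail: 1277167538@qq.com; YAN Tao (严涛), born 1977, male, from Taixing, Jiangsu, associate professor, master's supervisor, Ph.D., research interest: optimization theory and algorithms, E-mail: tyan@njust.edu.cn.
  • Supported by:
    National Natural Science Foundation of China (11671205, 61773014)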

Abstract: The fuzzy C-means (FCM) clustering algorithm does not account for the differing importance of sample attributes or for neighborhood information. To address these problems, an FCM algorithm based on entropy and neighborhood constraints is proposed. First, the entropy of each sample attribute is computed and used to assign a weight to that attribute, and the attribute weights are incorporated into an improved distance measure. Next, neighborhood membership weights are computed from the distances between the neighborhood samples and the center sample, and the neighborhood membership is obtained by weighting. The neighborhood membership is then used to constrain the objective function and to modify the iterative membership update, with the aim of improving the performance of FCM clustering. Theoretical analysis and experimental results on artificial data sets and several UCI data sets show that the improved algorithm outperforms the traditional FCM, PCM, KFCM, KPCM, and DSFCM algorithms in both clustering quality and robustness, which demonstrates the effectiveness of the proposed algorithm.

Key words: fuzzy C-means algorithm, clustering algorithm, neighborhood information, entropy weight method
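
The abstract names two ingredients, entropy-based attribute weights and an attribute-weighted distance inside the FCM membership update, without giving their formulas. The Python sketch below illustrates them under stated assumptions: it uses the textbook entropy weight method and the standard FCM membership formula, which may differ from the paper's exact definitions. The neighborhood membership constraint is left out because the abstract does not specify its form. The function names (`entropy_weights`, `weighted_sq_dist`, `fcm_memberships`), the fuzzifier `m = 2`, the fixed cluster centers, and the toy data are all illustrative assumptions.

```python
import numpy as np

def entropy_weights(X):
    """Textbook entropy weight method: one weight per attribute (column).

    Assumes at least two samples. Not necessarily the paper's exact formula.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Min-max scale each column to [0, 1]; guard against constant columns.
    span = X.max(axis=0) - X.min(axis=0)
    Z = (X - X.min(axis=0)) / np.where(span > 0, span, 1.0)
    # Turn each column into a distribution over samples.
    # A constant column is treated as uniform (maximum entropy, zero weight).
    col_sum = Z.sum(axis=0)
    P = np.where(col_sum > 0, Z / np.where(col_sum > 0, col_sum, 1.0), 1.0 / n)
    # Normalized Shannon entropy per column, with 0 * log(0) taken as 0.
    log_p = np.log(np.where(P > 0, P, 1.0))
    E = -(P * log_p).sum(axis=0) / np.log(n)
    # Lower entropy means a more discriminating attribute, hence a larger weight.
    d = np.clip(1.0 - E, 0.0, None)
    if d.sum() == 0:
        return np.full(X.shape[1], 1.0 / X.shape[1])
    return d / d.sum()

def weighted_sq_dist(X, V, w):
    """Squared attribute-weighted Euclidean distance, shape (n_samples, n_clusters)."""
    diff = X[:, None, :] - V[None, :, :]
    return (w * diff ** 2).sum(axis=2)

def fcm_memberships(D2, m=2.0):
    """Standard FCM membership update from squared distances (no neighborhood term)."""
    D2 = np.maximum(D2, 1e-12)  # avoid division by zero when a sample sits on a center
    inv = D2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

# Toy data: the first attribute separates two groups, the second barely varies.
X = np.array([[1.0, 5.0], [1.2, 5.1], [8.0, 4.9], [8.3, 5.0]])
V = np.array([[1.1, 5.0], [8.1, 5.0]])  # hypothetical fixed cluster centers

w = entropy_weights(X)
U = fcm_memberships(weighted_sq_dist(X, V, w))
print("attribute weights:", w)
print("hard labels:", U.argmax(axis=1))
```

In a full FCM loop, the centers `V` would be re-estimated from `U` after each membership update until convergence. Only a single update step is shown here to keep the sketch short.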