Computer and Modernization ›› 2022, Vol. 0 ›› Issue (11): 17-21.

• Algorithm Design and Analysis •

Research and Application of Hesitant Fuzzy Canopy-K-means Clustering Algorithm

  1. (1. School of Statistics and Data Science, Qufu Normal University, Jining 273165, China;
    2. College of Economics and Management, Shandong Agricultural University, Taian 271018, China)
  • Online: 2022-11-30 Published: 2022-11-30
  • About the authors: Zhang Zixuan (张子璇; born 2001), female, from Linyi, Shandong, undergraduate student; research interests: decision analysis, data prediction; E-mail: 15376096928@163.com. Corresponding author: Sha Xiuyan (沙秀艳; born 1977), female, from Zaozhuang, Shandong, associate professor, Ph.D.; research interests: data prediction, decision analysis; E-mail: shaxiuyan@sina.com.
  • Supported by:
    National Natural Science Foundation of China (12071251); National Statistical Science Research Project (2019LY47); Shandong Province College Students' Innovation and Entrepreneurship Training Program (S202010446020)

Abstract: The traditional K-means clustering algorithm is sensitive to its initial values and easily falls into local extrema, which leads to unsatisfactory classification results. To address this problem, this paper proposes a hesitant fuzzy Canopy-K-means clustering algorithm. First, the Canopy algorithm performs a preliminary classification of the original data, forming several Canopy sets with overlapping data whose centers serve as the initial cluster centers for the K-means algorithm. Then the K-means clustering algorithm is applied to obtain the final clustering result. Finally, a case study uses evaluation data on enterprises that resumed work and production after the epidemic, assessing the development of 5 such enterprises from 6 aspects. The proposed algorithm is compared with a K-means clustering algorithm based on hierarchical analysis. The results show that the proposed method greatly reduces the number of iterations and yields more reasonable, stable, and effective clustering results.

Key words: hesitant fuzzy set, clustering analysis, K-means clustering, Canopy algorithm
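
The two-stage idea in the abstract (Canopy to pick initial centers, then K-means) can be illustrated with a minimal Python sketch. It is not the paper's implementation. Several things here are assumptions: each object is a list of attributes, each attribute is a hesitant fuzzy element stored as a sorted, equal-length list of membership degrees; the distance is a normalized hesitant Euclidean distance; cluster centers are updated by position-wise averaging; and the thresholds `t1`, `t2`, the function names, and the toy data are all hypothetical.

```python
import math

def hf_distance(a, b):
    """Normalized Euclidean distance between two hesitant fuzzy objects
    (assumed form: mean squared gap per element, averaged over attributes)."""
    total = 0.0
    for ha, hb in zip(a, b):
        total += sum((x - y) ** 2 for x, y in zip(ha, hb)) / len(ha)
    return math.sqrt(total / len(a))

def canopy(data, t1, t2):
    """Canopy pass with loose threshold t1 and tight threshold t2 (t1 > t2 >= 0).
    Returns the canopy centers and the (possibly overlapping) canopies."""
    assert t1 > t2 >= 0
    remaining = list(range(len(data)))
    centers, canopies = [], []
    while remaining:
        c = remaining[0]  # take the next unassigned object as a canopy center
        centers.append(data[c])
        # every object within t1 joins this canopy, so canopies may overlap
        canopies.append([i for i in range(len(data))
                         if hf_distance(data[c], data[i]) <= t1])
        # objects within t2 can no longer become centers
        remaining = [i for i in remaining
                     if hf_distance(data[c], data[i]) > t2]
    return centers, canopies

def mean_object(members):
    """Position-wise average of hesitant fuzzy objects (an assumed center update)."""
    n = len(members)
    return [[sum(obj[j][k] for obj in members) / n
             for k in range(len(members[0][j]))]
            for j in range(len(members[0]))]

def kmeans(data, init_centers, max_iter=100):
    """K-means started from the given centers; returns labels, centers, iterations."""
    centers = list(init_centers)
    labels = None
    iteration = 0
    for iteration in range(1, max_iter + 1):
        new_labels = [min(range(len(centers)),
                          key=lambda c: hf_distance(x, centers[c]))
                      for x in data]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [x for x, lab in zip(data, labels) if lab == c]
            if members:
                centers[c] = mean_object(members)
    return labels, centers, iteration

# Toy data (made up for illustration, not the paper's enterprise data):
# 4 objects, 2 attributes, 2 membership degrees per attribute.
data = [
    [[0.10, 0.20], [0.20, 0.30]],
    [[0.15, 0.25], [0.20, 0.30]],
    [[0.80, 0.90], [0.70, 0.90]],
    [[0.85, 0.90], [0.75, 0.85]],
]
init_centers, canopies = canopy(data, t1=0.5, t2=0.3)
labels, centers, n_iter = kmeans(data, init_centers)
print(len(init_centers), labels, n_iter)
```

In this sketch the number of clusters is not chosen by hand: it equals the number of canopy centers, so the choice of `t2` controls how many initial centers K-means receives.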