Computer and Modernization (计算机与现代化)

• Algorithm Design and Analysis

An Improved FCM Algorithm Based on Data Field

  1. School of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
  • Received: 2013-11-11 Online: 2014-06-13 Published: 2014-06-25
  • About the authors: WANG Lihong (born 1989), female, from Ningde, Fujian, master's student at the School of Information Engineering, Zhejiang University of Technology; research interest: data mining. HE Xiongxiong (born 1965), male, professor; research interests: robotics, control theory and applications, intelligent systems, and signal processing.

Abstract: The fuzzy c-means (FCM) algorithm is sensitive to the choice of initial cluster centers and to noise and outliers, and its results are easily affected by the distribution of the data. The improved algorithm introduces data field theory from physics: a potential function describes the distribution of the data, and the potential values of points in the data field are used to identify noise points and to optimize the initial cluster centers. The algorithm also adopts redundant cluster centers (a multi-center approach), in which a large cluster is split into several small clusters that are then merged using a separation measure as the evaluation function. Simulation experiments show that the improved algorithm overcomes some of the defects of FCM and clusters data sets with irregular distributions effectively.

Key words: clustering, fuzzy c-means (FCM) algorithm, data field, initial cluster centers
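The initialization idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the Gaussian-like potential commonly used in data field theory, phi(x_i) = sum_j exp(-(||x_i - x_j|| / sigma)^2) with unit masses, and a simple heuristic (hypothetical here) that picks high-potential points at least `min_dist` apart as initial centers before running standard FCM updates. The function names, `sigma`, and `min_dist` are illustrative assumptions. Noise detection, cluster splitting, and separation-based merging are not shown.

```python
import numpy as np


def potential(X, sigma=1.0):
    """Potential of each point: sum_j exp(-(||x_i - x_j|| / sigma)^2), unit masses assumed."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma**2).sum(axis=1)


def init_centers(X, c, sigma=1.0, min_dist=None):
    """Pick c high-potential points that lie at least min_dist apart (assumed heuristic)."""
    if min_dist is None:
        min_dist = 2.0 * sigma
    order = np.argsort(-potential(X, sigma))  # indices by descending potential
    chosen = []
    for i in order:
        if all(np.linalg.norm(X[i] - X[j]) >= min_dist for j in chosen):
            chosen.append(i)
        if len(chosen) == c:
            break
    if len(chosen) < c:
        raise ValueError("fewer than c well-separated candidates; reduce min_dist")
    return X[chosen]


def fcm(X, centers, m=2.0, tol=1e-5, max_iter=100):
    """Standard FCM iterations starting from the given centers."""
    V = np.asarray(centers, dtype=float).copy()
    for _ in range(max_iter):
        # Distances from every point to every center, shape (n, c)
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)  # memberships, rows sum to 1
        Um = U ** m
        V_new = (Um.T @ X) / Um.sum(axis=0)[:, None]  # weighted center update
        shift = np.linalg.norm(V_new - V)
        V = V_new
        if shift < tol:
            break
    return U, V
```

A call such as `U, V = fcm(X, init_centers(X, c=2))` returns the membership matrix and the final centers. Seeding from high-potential points places the initial centers in dense regions, which is the motivation the abstract gives for using the potential function.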