Computer and Modernization

• Database and Data Mining

 A Grid-Based Multi-Density Incremental Clustering Algorithm

  

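  1. Department of Fundamental Courses, Chengdu Vocational College of Agricultural Science and Technology, Chengdu, Sichuan 611130, China;
  2. Department of Transportation, Chengdu Industrial Vocational Technical College, Chengdu, Sichuan 610081, China
  • Received: 2014-09-01 Online: 2014-11-27 Published: 2014-12-10
  • About the authors: LI Guangxing (b. 1956), male, from Renshou, Sichuan, associate professor at Chengdu Vocational College of Agricultural Science and Technology; research interests: data mining and artificial intelligence. YIN Jichuan (b. 1960), male, from Chengdu, Sichuan, associate professor at Chengdu Industrial Vocational Technical College; research interest: data mining.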

Multi-density Incremental Clustering Algorithm Based on Grid

  1. Department of Fundamental Courses, Chengdu Vocational College of Agricultural Science and Technology, Chengdu 611130, China;
  2. Department of Transportation, Chengdu Industrial Vocational Technical College, Chengdu 610081, China
  • Received: 2014-09-01 Online: 2014-11-27 Published: 2014-12-10

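Abstract:

 A grid-based multi-density incremental clustering algorithm, MICG, is proposed, together with a discriminant function defined on the relative density of grid cells and the distance between their centers of gravity. When part of the data set changes, the data need not be re-clustered in full: only the relationship between the cells whose data changed and their neighboring cells is analyzed, and the new clusters are formed by combining this with the existing clustering result, which improves the efficiency of cluster analysis. Time complexity and space complexity are linear in the size of the data set and in the number of attributes. Experimental results show that MICG can handle clusters of arbitrary shape and differing density, and that it effectively solves the incremental clustering problem that arises when data are updated.

Key words: grid clustering, incremental clustering, multi-density, cell, discriminant function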

Abstract:

This paper presents MICG, a grid-based multi-density incremental clustering algorithm, and introduces a discriminant function
that takes into account the relative density of grid cells and the distance between their centers of gravity. When part of a data set
changes, the algorithm does not re-cluster all the data. It forms the new clusters from the original clustering result by examining only
the relationship between the cells whose data changed and their neighboring cells, which improves the efficiency of cluster analysis.
Time complexity and space complexity are linear in the size of the data set and in the number of attributes. Experimental results show that
MICG can handle clusters of arbitrary shape and differing density, and that it effectively solves the incremental clustering problem when data are updated.
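
The abstract does not give the discriminant function or the update procedure, so the following Python sketch is only an illustration of the general idea and not the MICG algorithm itself. It assumes a hypothetical discriminant that calls two adjacent cells similar when the ratio of their point counts is at least `min_ratio` and their centroids lie within `max_dist` cell widths of each other. The class name, both thresholds, and the use of a union-find structure are assumptions made for this example. The sketch handles insertions only: it never splits a cluster, so its result depends on insertion order, and it enumerates all 3^d neighboring cells, which is practical only in low dimension.

```python
import itertools
import math
from collections import defaultdict


class IncrementalGridClusterer:
    """Insert-only grid clustering sketch (hypothetical, not the paper's MICG)."""

    def __init__(self, width, min_ratio=0.5, max_dist=1.5):
        self.width = width            # side length of a grid cell
        self.min_ratio = min_ratio    # assumed relative-density threshold
        self.max_dist = max_dist      # assumed centroid-distance threshold, in cell widths
        self.count = defaultdict(int)  # cell -> number of points
        self.sums = {}                 # cell -> per-dimension coordinate sums
        self.parent = {}               # union-find forest over non-empty cells

    def _cell(self, point):
        return tuple(math.floor(x / self.width) for x in point)

    def _find(self, c):
        while self.parent[c] != c:
            self.parent[c] = self.parent[self.parent[c]]  # path halving
            c = self.parent[c]
        return c

    def _centroid(self, c):
        return [s / self.count[c] for s in self.sums[c]]

    def _similar(self, a, b):
        # Assumed discriminant: similar density AND nearby centroids.
        na, nb = self.count[a], self.count[b]
        ratio = min(na, nb) / max(na, nb)
        dist = math.dist(self._centroid(a), self._centroid(b))
        return ratio >= self.min_ratio and dist <= self.max_dist * self.width

    def _neighbors(self, c):
        # Non-empty cells that differ from c by at most one step per dimension.
        for offset in itertools.product((-1, 0, 1), repeat=len(c)):
            n = tuple(ci + oi for ci, oi in zip(c, offset))
            if n != c and n in self.count:
                yield n

    def insert(self, point):
        """Add one point; only its cell and that cell's neighbors are re-examined."""
        c = self._cell(point)
        self.count[c] += 1
        sums = self.sums.setdefault(c, [0.0] * len(point))
        for i, x in enumerate(point):
            sums[i] += x
        self.parent.setdefault(c, c)
        for n in self._neighbors(c):
            if self._similar(c, n):
                self.parent[self._find(c)] = self._find(n)
        return c

    def label(self, point):
        """Cluster representative of the point's cell, or None for an empty cell."""
        c = self._cell(point)
        return self._find(c) if c in self.parent else None


clusterer = IncrementalGridClusterer(width=1.0)
for p in [(0.2, 0.2), (0.8, 0.5), (1.2, 0.4), (1.7, 0.6), (5.1, 5.1), (5.5, 5.4)]:
    clusterer.insert(p)
same = clusterer.label((0.5, 0.5)) == clusterer.label((1.5, 0.5))
apart = clusterer.label((0.5, 0.5)) != clusterer.label((5.2, 5.2))
```

Each call to `insert` touches one cell and its immediate neighbors, which mirrors the abstract's point that an update does not require re-clustering the whole data set. Supporting deletions or density changes that should split a cluster would need a structure other than plain union-find.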

Key words: grid clustering, incremental clustering, multi-density, cell, discriminant function