Computer and Modernization ›› 2013, Vol. 12 ›› Issue (12): 27-29. doi: 10.3969/j.issn.1006-2475.2013.12.007
• Algorithm Analysis and Design •
王秀华
Abstract: Traditional K-means clustering scales poorly to large datasets. To address this problem, this paper presents an accelerated K-means clustering method based on random sampling, called the Kmeans_RS clustering algorithm. A working set is selected from the original dataset by random sampling, and traditional K-means clustering is executed on this working set. The center and radius of every resulting cluster are then computed to obtain the sampling result. The final clustering of the whole dataset is obtained by measuring the relationship between the sampling result and the remaining data and assigning the remaining data to clusters accordingly. Because random sampling reduces the size of the problem on which K-means is run, clustering efficiency improves substantially, and the method can be applied to large-scale clustering problems. Simulation results demonstrate that this parallel accelerated K-means method achieves excellent clustering efficiency.
Key words: K-means clustering, random sampling, center, radius, working set, efficiency
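The procedure described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the author's implementation. The abstract does not say how the relationship between the sampling result and the remaining data is measured, so nearest-center assignment is assumed here, and the radius is assumed to be the largest center-to-member distance within the working set. All names (`kmeans_rs`, `sample_size`, `seed`, and the helper functions) are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda j: dist(p, centers[j]))


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd-style K-means; returns (centers, labels)."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the previous center if a cluster ends up empty.
            updated.append(mean(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return centers, [nearest(p, centers) for p in points]


def kmeans_rs(data, k, sample_size, seed=0):
    """Sampling-based K-means sketch: cluster a random working set,
    then extend the result to the whole dataset."""
    if not k <= sample_size <= len(data):
        raise ValueError("need k <= sample_size <= len(data)")
    rng = random.Random(seed)

    # Step 1: draw the working set by random sampling without replacement.
    working = rng.sample(data, sample_size)

    # Step 2: run ordinary K-means on the working set only.
    centers, working_labels = kmeans(working, k, rng)

    # Step 3: compute a radius per cluster.
    # ASSUMPTION: radius = largest center-to-member distance in the working set.
    radii = [0.0] * k
    for p, lab in zip(working, working_labels):
        radii[lab] = max(radii[lab], dist(p, centers[lab]))

    # Step 4: label every point in the full dataset.
    # ASSUMPTION: each point is assigned to its nearest center.
    labels = [nearest(p, centers) for p in data]
    return centers, radii, labels
```

Calling `kmeans_rs(data, k=2, sample_size=40)` on a list of coordinate tuples returns the cluster centers, one radius per cluster, and a label for every point. In this sketch the iterative K-means loop touches only `sample_size` points, while the full dataset is scanned once in the final assignment step, which is where the efficiency gain claimed in the abstract would come from.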
Cite this article: WANG Xiu-hua (王秀华). A Speeding K-means Clustering Method Based on Sampling (基于随机抽样的加速K-均值聚类方法)[J]. Computer and Modernization, 2013, 12(12): 27-29.
Link to this article: http://www.c-a-m.org.cn/CN/10.3969/j.issn.1006-2475.2013.12.007
http://www.c-a-m.org.cn/CN/Y2013/V12/I12/27