Computer and Modernization
• Databases and Data Mining •
Abstract: The second-order difference method for objective functions uses the change in the gradient of the objective function value with respect to the number of classes as its decision criterion. By working directly with the relationship between the objective function value and the number of clusters, it automatically obtains the correct number of clusters on different data sets. Computing the optimal number of clusters takes time, however, and when the total number of samples is large the computational cost of the method becomes very high. To address this problem, this paper proposes a method for determining the number of clusters by second-order difference based on uniform sampling: an improved uniform sampling design is applied first, and the second-order difference procedure is then carried out on the resulting data subset. Experimental results show that the method reduces the amount of computation while still reaching the expected correct determination.
Key words: second-order difference, optimal number of clusters, uniform sampling design
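The abstract gives only an outline of the procedure, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. It assumes the objective function is the k-means within-cluster sum of squared errors J(k), and that the criterion selects the k that maximizes J(k-1) - 2J(k) + J(k+1). It also substitutes simple random sampling without replacement for the paper's improved uniform sampling design, whose details are not stated here. All function names and parameters are hypothetical.

```python
import numpy as np


def farthest_first_init(X, k, rng):
    """Pick k starting centers: one random point, then repeatedly the point
    farthest from the centers chosen so far (an assumed initialization)."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    return np.array(centers)


def kmeans_sse(X, k, n_iter=100, seed=0):
    """Run Lloyd's k-means once and return the sum of squared errors J(k)."""
    rng = np.random.default_rng(seed)
    centers = farthest_first_init(X, k, rng)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()


def uniform_subsample(X, m, seed=0):
    """Draw m rows uniformly at random without replacement.
    Assumption: stands in for the paper's improved uniform sampling design."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(m, len(X)), replace=False)
    return X[idx]


def estimate_k(X, k_max=8, sample_size=300, seed=0):
    """Estimate the number of clusters on a subsample (requires k_max >= 3)."""
    S = uniform_subsample(X, sample_size, seed)
    ks = np.arange(1, k_max + 1)
    J = np.array([kmeans_sse(S, k, seed=seed) for k in ks])
    # Second-order difference of J, defined for k = 2 .. k_max - 1.
    second_diff = J[:-2] - 2 * J[1:-1] + J[2:]
    return int(ks[1:-1][second_diff.argmax()])
```

Calling `estimate_k(X)` on an array `X` of shape `(n_samples, n_features)` runs k-means only on the subsample, so the cost of each candidate k depends on `sample_size` rather than on the full data set size, which is the saving the abstract describes.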
CHEN Yan 1, CHEN Guang 1, YI Ye-qing 2, LIU Qiang 1. A Method for Determining Two Order Difference Cluster Number Based on Uniform Sampling[J]. Computer and Modernization, doi: 10.3969/j.issn.1006-2475.2017.10.010.
Link to this article: http://www.c-a-m.org.cn/CN/10.3969/j.issn.1006-2475.2017.10.010
http://www.c-a-m.org.cn/CN/Y2017/V0/I10/49