Computer and Modernization, 2021, Vol. 0, Issue (10): 15-22.

• Artificial Intelligence •

Outlier Detection Based on Improved Cuckoo Search k-means Algorithm

  

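  1. (School of Science, Shenyang University of Technology, Shenyang 110870, China)
  • Online: 2021-10-14 Published: 2021-10-14
  • About the authors: Zhuang Lili (b. 1995), female, from Tongliao, Inner Mongolia, master's student, research interests: intelligent optimization algorithms and data mining, E-mail: 1324656735@qq.com; Shi Hongyan (b. 1962), female, from Huludao, Liaoning, professor, Ph.D., research interests: intelligent optimization algorithms and data mining, E-mail: shy620317@163.com.
  • Supported by:
    National Natural Science Foundation of China (61074005)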



Abstract: Outlier detection based on the k-means algorithm is sensitive to the initial cluster centers and therefore tends to fall into local optima. To address this problem, this paper proposes an outlier detection method based on a k-means algorithm with improved cuckoo search. First, adaptive strategies are introduced for the discovery probability and the Lévy flight step size of the original cuckoo search algorithm, and the improvement is evaluated in simulation experiments. Second, the convergence of the improved cuckoo search algorithm is discussed. Finally, the improved cuckoo search algorithm is combined with k-means-based outlier detection to form a new outlier detection algorithm. Simulation experiments on UCI data sets show that the proposed algorithm has a clear advantage in accuracy and converges faster on all three data sets. It effectively mitigates the tendency of k-means-based outlier detection to fall into local optima and shortens the running time.

Key words: outlier detection, k-means algorithm, cuckoo search algorithm, convergence
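
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the adaptive formulas, so the linear schedules for the discovery probability `pa` and the step size `alpha` are assumptions, as are the outlier score (distance to the nearest refined center) and all function names, which are hypothetical.

```python
import math
import numpy as np

def sq_dists(X, centers):
    """Squared distance from every point to every center, shape (n, k)."""
    return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

def sse(X, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sq_dists(X, centers).min(axis=1).sum()

def levy_step(rng, shape, beta=1.5):
    """Heavy-tailed Levy-like steps drawn with Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, shape)
    v = rng.normal(0.0, 1.0, shape)
    return u / np.abs(v) ** (1 / beta)

def cuckoo_centers(X, k, n_nests=15, n_iter=100, seed=0):
    """Search for k initial centers; each nest encodes one set of k centers."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    d = X.shape[1]
    nests = rng.uniform(lo, hi, size=(n_nests, k, d))
    fit = np.array([sse(X, n) for n in nests])
    for t in range(n_iter):
        frac = t / max(n_iter - 1, 1)
        pa = 0.5 - 0.4 * frac              # ASSUMED schedule: explore early, exploit late
        alpha = 0.5 * (1.0 - frac) + 0.01  # ASSUMED schedule: shrinking step size
        best = nests[fit.argmin()].copy()
        # Levy flights relative to the current best nest, kept only if they improve.
        cand = np.clip(nests + alpha * levy_step(rng, nests.shape) * (nests - best), lo, hi)
        cand_fit = np.array([sse(X, c) for c in cand])
        better = cand_fit < fit
        nests[better], fit[better] = cand[better], cand_fit[better]
        # Abandon each nest except the best with probability pa (simplified: random restart).
        keep = fit.argmin()
        for i in np.flatnonzero(rng.random(n_nests) < pa):
            if i != keep:
                nests[i] = rng.uniform(lo, hi, size=(k, d))
                fit[i] = sse(X, nests[i])
    return nests[fit.argmin()].copy()

def kmeans_refine(X, centers, n_iter=50):
    """Plain Lloyd iterations starting from the given centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        labels = sq_dists(X, centers).argmin(axis=1)
        for j in range(len(centers)):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def detect_outliers(X, k, n_outliers, seed=0):
    """Return the sorted indices of the n_outliers points farthest from any center."""
    centers = kmeans_refine(X, cuckoo_centers(X, k, seed=seed))
    scores = np.sqrt(sq_dists(X, centers).min(axis=1))  # ASSUMED outlier score
    return np.sort(np.argsort(scores)[-n_outliers:])
```

Calling `detect_outliers(X, k, n_outliers)` on an `(n, d)` array returns the indices of the points with the largest scores. Because the cuckoo search optimizes the k-means objective globally before Lloyd refinement starts, the refinement is less likely to begin in a poor basin, which is the effect the abstract attributes to the combined method.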