Computer and Modernization ›› 2021, Vol. 0 ›› Issue (01): 94-99.

• Database and Data Mining •

Semi-supervised Clustering Ensemble with Pairwise Constraints Based on PCA Dimension Reduction

  1. (1. School of Information Engineering, Yancheng Institute of Technology, Yancheng 224002, China;
    2. School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang 212003, China)
  • Online: 2021-01-28 Published: 2021-01-29
  • About the authors: Huang Xinchen (黄欣辰, born 1994), male, from Yancheng, Jiangsu, master's student; research interests: data mining, artificial intelligence; E-mail: hxc1912@foxmail.com. Gao Jun (皋军, born 1971), male, from Yancheng, Jiangsu, professor, Ph.D.; research interests: data mining, artificial intelligence; E-mail: gj0104211@163.com. Huang Haojie (黄豪杰, born 1994), male, from Nantong, Jiangsu, master's student; research interests: pattern recognition, artificial intelligence; E-mail: 1356175704@qq.com.
  • Supported by:
    National Natural Science Foundation of China (61772198)

Abstract: Most existing clustering ensemble algorithms are unsupervised and handle high-dimensional data poorly. To address both problems, this paper proposes SSCEDR, a semi-supervised clustering ensemble algorithm with pairwise constraints based on PCA dimension reduction. SSCEDR first applies principal component analysis (PCA) to reduce the dimensionality of the original data. It then uses semi-supervised clustering ensemble techniques to incorporate prior knowledge, such as pairwise constraints, into the ensemble process in the reduced space. Experiments on multiple data sets verify the effectiveness of the algorithm.

Key words: clustering ensemble, dimension reduction, pairwise constraints, semi-supervised, principal component analysis
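
The abstract does not give the algorithmic details of SSCEDR, so the following is only a minimal sketch of the general pipeline it describes (PCA reduction, several base clusterings, pairwise constraints injected into the ensemble step), not the authors' method. Everything specific here is an assumption made for illustration: the function name `sscedr_sketch`, plain k-means as the base clusterer, a co-association matrix as the ensemble representation, hard overwriting of constrained entries, and average-linkage merging as the consensus step.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components (SVD of the centered data)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd's k-means (assumed base clusterer); returns one label per row."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center unchanged
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def co_association(labelings):
    """Fraction of base clusterings that put each pair of points together."""
    n = len(labelings[0])
    C = np.zeros((n, n))
    for labels in labelings:
        C += labels[:, None] == labels[None, :]
    return C / len(labelings)

def apply_constraints(C, must_link, cannot_link):
    """Assumed constraint handling: overwrite constrained pairs with 1 or 0."""
    C = C.copy()
    for i, j in must_link:
        C[i, j] = C[j, i] = 1.0
    for i, j in cannot_link:
        C[i, j] = C[j, i] = 0.0
    return C

def consensus(C, k):
    """Average-linkage merging on the similarity matrix until k clusters remain."""
    clusters = [[i] for i in range(len(C))]
    while len(clusters) > k:
        best, pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = C[np.ix_(clusters[a], clusters[b])].mean()
                if sim > best:
                    best, pair = sim, (a, b)
        a, b = pair
        merged = clusters.pop(b)  # b > a, so index a is still valid
        clusters[a].extend(merged)
    labels = np.empty(len(C), dtype=int)
    for c, members in enumerate(clusters):
        labels[members] = c
    return labels

def sscedr_sketch(X, k, must_link=(), cannot_link=(),
                  n_components=2, n_base=10, seed=0):
    """Hypothetical pipeline: PCA -> base clusterings -> constrained consensus."""
    rng = np.random.default_rng(seed)
    Z = pca_reduce(X, n_components)
    labelings = [kmeans(Z, k, rng) for _ in range(n_base)]
    C = apply_constraints(co_association(labelings), must_link, cannot_link)
    return consensus(C, k)

# Toy data (not from the paper): two well-separated blobs in 10 dimensions.
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0, 1, (20, 10)),
               data_rng.normal(10, 1, (20, 10))])
labels = sscedr_sketch(X, k=2, must_link=[(0, 1)], cannot_link=[(0, 39)])
print(labels)
```

In this sketch the constraints act only on the co-association matrix, so they bias the consensus step rather than guarantee that every constraint is satisfied. A constrained base clusterer would be another reasonable design, and the abstract does not say which the paper uses.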