Computer and Modernization ›› 2023, Vol. 0 ›› Issue (02): 83-88.

• Artificial Intelligence •

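A Semi-supervised Model with Non-negative Matrix Factorization for Multiplex Network Clustering

  

  1. (Information Center, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400016, China)
  • Online: 2023-04-10 Published: 2023-04-10
  • About the authors: LIU Xingjian (刘兴建, born 1994), male, from Enshi, Hubei, assistant engineer, M.S., research interests: data mining and analysis, network clustering, E-mail: 2638666276@qq.com; Corresponding author: YANG Xiaofu (杨晓夫, born 1989), male, from Dianjiang, Chongqing, assistant engineer, M.S., research interests: medical big data mining, software design, E-mail: 1273133998@qq.com; HU Lei (胡磊, born 1980), male, from Yuzhong, Chongqing, senior engineer, M.S., research interests: medical big data mining, medical management, E-mail: 753245152@qq.com.
  • Supported by:
    Management Research Fund of The First Affiliated Hospital of Chongqing Medical University (GLJJ2020-10); Chongqing Science and Health Joint Medical Research Project (2021MSXM147)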

Abstract: Real-world multiplex networks are multi-dimensional and highly complex, so clustering algorithms that rely only on network topology often fail to recover the common community structure accurately. To address this problem, this paper proposes a semi-supervised model with non-negative matrix factorization (SeNMF). First, the model uses a greedy search method based on the PageRank algorithm to obtain consensus prior information for the network, which is used to enhance the topology of each network layer and reduce network noise. Then, an overall non-negative matrix factorization fuses the low-dimensional representations of all network layers on the Grassmannian manifold to obtain a better common low-dimensional representation matrix. Finally, K-means clustering is applied to obtain the common community structure of the network. Experiments show that SeNMF holds a certain advantage over competing algorithms in multiplex network clustering, both as the number of network layers increases and as inter-layer noise grows.

Key words: multiplex network clustering, non-negative matrix factorization, semi-supervised model, consensus prior information, common community structure
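
The factorize, fuse, and cluster stages described in the abstract can be illustrated with a small NumPy sketch. This is not the SeNMF objective from the paper, which the abstract does not spell out. It makes three assumptions: a damped multiplicative-update symmetric NMF stands in for the per-layer factorization; layers are fused by summing the projection matrices of the per-layer subspaces and keeping the top eigenvectors, one common way to average points on a Grassmannian manifold; and the PageRank-based prior step is omitted entirely. All function names (`symmetric_nmf`, `fuse_on_grassmann`, `kmeans`, `planted_layer`) are hypothetical.

```python
import numpy as np

def symmetric_nmf(A, k, iters=300, seed=0):
    """Approximate A by H @ H.T with H >= 0 (damped multiplicative updates).
    Assumption: a stand-in for the paper's per-layer factorization."""
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], k)) + 0.1
    for _ in range(iters):
        numer = A @ H
        denom = H @ (H.T @ H) + 1e-12   # small constant avoids division by zero
        H *= 0.5 + 0.5 * numer / denom  # damping factor 0.5 keeps H non-negative
    return H

def fuse_on_grassmann(factors, k):
    """Common subspace: top-k eigenvectors of the summed projection matrices.
    Assumption: projection-mean fusion, not necessarily the paper's scheme."""
    n = factors[0].shape[0]
    P = np.zeros((n, n))
    for H in factors:
        Q, _ = np.linalg.qr(H)          # orthonormal basis of span(H)
        P += Q @ Q.T
    _, vecs = np.linalg.eigh(P)         # eigenvalues come back in ascending order
    return vecs[:, -k:]

def kmeans(X, k, iters=50):
    """Plain Lloyd iterations with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    return labels

def planted_layer(n_per_block, cross_edges):
    """Toy layer: two cliques plus a few noisy cross-community edges."""
    n = 2 * n_per_block
    A = np.zeros((n, n))
    A[:n_per_block, :n_per_block] = 1.0
    A[n_per_block:, n_per_block:] = 1.0
    np.fill_diagonal(A, 0.0)
    for i, j in cross_edges:
        A[i, j] = A[j, i] = 1.0
    return A

# Three layers sharing the same two communities, with different noise edges.
layers = [
    planted_layer(6, []),
    planted_layer(6, [(0, 6)]),
    planted_layer(6, [(2, 9), (5, 11)]),
]
factors = [symmetric_nmf(A, k=2, seed=s) for s, A in enumerate(layers)]
U = fuse_on_grassmann(factors, k=2)     # common low-dimensional representation
labels = kmeans(U, k=2)                 # common community assignment
print(labels)
```

On this toy input the first six nodes and the last six nodes are expected to land in different clusters. A faithful implementation would first strengthen each layer's adjacency matrix with the consensus prior before factorizing, as the abstract describes.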