Computer and Modernization

• Chinese Information Processing Technology

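Overlapping Community Detection Algorithm Based on LeaderRank

  1. (College of Computer and Information, Hohai University, Nanjing 211100, China)
  • Received: 2018-09-15 Online: 2019-04-08 Published: 2019-04-10
  • About the authors: ZHU Shuai (朱帅, b. 1992), male, from Yangzhou, Jiangsu, master's student; research interests: social networks, data management; E-mail: 1213501782@qq.com. XU Guoyan (许国艳, b. 1971), female, from Chifeng, Inner Mongolia, associate professor, Ph.D.; research interests: big data, data provenance, data management. LI Minjia (李敏佳, b. 1994), female, from Zhoukou, Henan, master's student; research interests: social networks, data management. ZHANG Wangjuan (张网娟, b. 1992), female, from Yancheng, Jiangsu, master's student; research interests: big data, data management.
  • Funding:
    Jiangsu Provincial Water Conservancy Science and Technology Research Project (2017065, 2016023, 2015001); Fundamental Research Funds for the Central Universities (2017B42214)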

Abstract: Communities in real social networks often overlap, and detecting these overlapping communities helps in studying network characteristics and reflects the actual structure of the network. COPRA, a multi-label propagation algorithm for overlapping community detection, involves randomness that makes its results unstable. To address this problem, this paper proposes a label propagation algorithm that incorporates node importance. The algorithm first uses LeaderRank to compute the importance of each node in the network, then expands cliques from high-importance nodes as a preprocessing step for label initialization. Labels are updated in an order chosen so that the preprocessing work is not undone, and a contribution degree is later introduced to reduce the randomness of the label selection stage. Experimental results on benchmark networks and real networks show that the algorithm improves the quality of the detected communities.

Key words: social network, overlapping communities, label propagation, LeaderRank, contribution degree
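
The node-importance step named in the abstract can be illustrated with a minimal sketch of LeaderRank in its commonly described form: a ground node is linked in both directions to every node, each ordinary node starts with one unit of score, scores are repeatedly split evenly over outgoing links, and the ground node's final score is shared equally among all nodes. This is not the authors' implementation. The function name `leader_rank`, the dictionary input format, and the `tol` and `max_iter` parameters are assumptions made for illustration, and the clique expansion and contribution-degree steps of the paper are not covered.

```python
def leader_rank(out_links, tol=1e-10, max_iter=1000):
    """Sketch of LeaderRank (hypothetical helper, not the paper's code).

    out_links maps each node to the nodes it points to. For an undirected
    network, list every edge in both directions.
    """
    # Collect every node that appears as a source or as a target.
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    if not nodes:
        return {}
    n = len(nodes)

    # Ground node: linked to and from every ordinary node.
    ground = object()
    adj = {v: (set(out_links.get(v, ())) - {v}) | {ground} for v in nodes}
    adj[ground] = set(nodes)

    # Ordinary nodes start with score 1, the ground node with 0.
    score = {v: 1.0 for v in nodes}
    score[ground] = 0.0

    for _ in range(max_iter):
        new = dict.fromkeys(adj, 0.0)
        for u, targets in adj.items():
            share = score[u] / len(targets)  # split evenly over out-links
            for v in targets:
                new[v] += share
        delta = sum(abs(new[v] - score[v]) for v in adj)
        score = new
        if delta < tol:
            break

    # Share the ground node's score equally among the ordinary nodes.
    bonus = score.pop(ground) / n
    return {v: s + bonus for v, s in score.items()}


# Small undirected star network: "a" is linked to "b", "c" and "d".
graph = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a"], "d": ["a"]}
scores = leader_rank(graph)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Nodes at the top of `ranking` would be the candidates for seeding the preprocessing stage. The total score is conserved, so the returned values always sum to the number of nodes.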

CLC Number: