计算机与现代化 ›› 2020, Vol. 0 ›› Issue (08): 63-68.doi: 10.3969/j.issn.1006-2475.2020.08.010

• 算法设计与分析 • 上一篇    下一篇

具有对偶约束的半监督重叠社区发现方法

  

  1. (1.江苏开放大学信息工程学院,江苏南京210017;2.中国矿业大学计算机科学与技术学院,江苏徐州221116)
  • 收稿日期:2020-01-06 出版日期:2020-08-17 发布日期:2020-08-17
  • 作者简介:许小媛(1980-),女,江苏淮安人,副教授,硕士,研究方向:云计算,大数据应用,E-mail: 626028672@qq.com; 李海波(1982-),男,江苏淮安人,讲师,研究方向:数据挖掘; 于本成(1981-),男,辽宁大连人,副教授,CCF会员,博士研究生,研究方向:网络应用算法; 刘芳(1976-),女,安徽六安人,副教授,研究方向:复杂网络算法。
  • 基金资助:
    江苏省高等学校自然科学研究项目(19KJB520026); 江苏省高校哲学社会科学研究一般项目(2019SJA0674); 江苏高校“青蓝工程”项目

Semi-supervised Overlapping Community Finding with Pairwise Constraints

  1. (1. School of Information and Engineering, Jiangsu Open University, Nanjing 210017, China;
    2. School of Computer Science and Technology, China University of Minging and Technology, Xuzhou 221116, China)
  • Received:2020-01-06 Online:2020-08-17 Published:2020-08-17

摘要: 在复杂网络重叠社区发现方法的研究中,提高算法准确度的方法之一是利用额外的背景信息(例如来自领域专家的)作为约束的来源来指导社区检测过程。本文研究探索半监督策略的潜力,用以改善在网络中寻找重叠的社区的准确性。在进程的初始化阶段和子社区扩展过程中引入必须链接和不可能链接的约束,提出一种使用有限数量的成对约束、结合贪心策略来寻找重叠社区的方法PC-GCE(Pairwise Constrained Greedy Clique Expansion)。对模拟网络数据与当前无约束的局部扩展重叠社区发现算法(GCE)进行对比实验,结果表明PC-GCE方法在发现重叠社区的性能上优于无约束的算法,并且随着成对约束数量的增加,发现重叠社区的性能越好。

关键词: 复杂网络, 重叠社区发现, 半监督, 对偶约束, PC-GCE

Abstract: In the research of finding overlapping community in complex networks, one of the ways to improve accuracy is by harnessing additional background information (e.g. from domain experts), which can be used as a source of constraints to guide the community detection process. In this paper, the potential of semi-supervised strategies to improve algorithms for finding overlapping communities in networks is explored. We introduce constraints that must link and cannot link into the initialization phase of the process and in the subcommunity extension process, and propose an approach named PC-GCE(Pairwise Constrained Greedy Clique Expansion) for finding overlapping communities by using a limited number of pairwise constraints and combing greedy strategies. A comparative experiment is conducted between the simulated network data and the current unconstrained locally extended overlapping community discovery algorithm (GCE), experimental results show that the PC-GCE can achieve better performance than GCE on finding overlapping communities, and as the increasing numbers of pairwise constraints, PC-GCE shows greater performance in the finding accuracy.

Key words: complex network, overlapping community finding, semi-supervise, pairwise constraints, pairwise constrained greedy clique expansion

中图分类号: