Computer and Modernization (计算机与现代化) ›› 2014, Vol. 0 ›› Issue (8): 21-25+29. doi: 10.3969/j.issn.1006-2475.2014.08.005

• Database and Data Mining •

基于LDA模型的社交网络主题社区挖掘

  

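  1. (1. School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou 510665, Guangdong, China; 2. School of Psychology, Jiangxi Normal University, Nanchang 330027, Jiangxi, China)
  • Received: 2014-05-04 Online: 2014-08-15 Published: 2014-08-19
  • About the authors: Ou Wei (欧卫, b. 1985), male, from Yongzhou, Hunan, master's student at the School of Computer Science, Guangdong Polytechnic Normal University; research interests: machine learning, signal processing. Xie Zanfu (谢赞福, b. 1956), male, from Danzhou, Hainan, professor; research interests: cloud computing and big data processing. Xie Binbin (谢彬彬, b. 1988), male, from Heyuan, Guangdong, master's student; research interests: machine learning, video retrieval. Ou Binyi (欧缤忆, b. 1980), female, from Yongzhou, Hunan, master's student at the School of Psychology, Jiangxi Normal University; research interest: positive psychotherapy.
  • Supported by:
    Science and Technology Innovation Project of Guangdong Higher Education Institutions (2013KJCX0117)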

Mining of Topic Communities in Social Networks Based on LDA Model

  1. (1. School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou 510665, China; 2. School of Psychology, Jiangxi Normal University, Nanchang 330027, China)
  • Received: 2014-05-04 Online: 2014-08-15 Published: 2014-08-19

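Abstract (translated from Chinese): Social networks, typified by microblogs, have become strategic ground for public opinion. Discovering the latent topic communities in a social network is of considerable value for commercial promotion and public opinion monitoring. In recent years, the probabilistic generative topic model LDA (Latent Dirichlet Allocation) has been widely applied in data mining. In general, however, LDA is suited to text and digital signal data and cannot reasonably be used to process the relational data of social network users. This paper modifies LDA and proposes Tri-LDA, a model suited to user relationship data, to mine topic communities in social networks. Experimental results show that the results learned with Tri-LDA largely reflect the real distribution of topic communities on the social network.

Keywords (translated from Chinese): LDA, social networks, topic community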

Abstract: Social networks, microblogs in particular, have gained huge popularity in recent years. The discovery of latent topic communities in social networks is highly valuable for commercial promotion, public opinion monitoring, and similar applications. In recent years, the probabilistic generative topic model Latent Dirichlet Allocation (LDA) has been widely applied in the field of data mining. LDA is generally suited to text or digital signal data; without modification, however, it cannot properly process the relational data between users in a social network. By modifying the original LDA model, this paper proposes a new model, Tri-LDA, and applies it to mine the hidden topic communities in a social network. The experimental results show that the topic communities found by Tri-LDA are basically consistent with the real topic communities hand-labeled by the authors.

Key words: LDA, social networks, topic community
