Computer and Modernization

• Algorithm Design and Analysis

Distributed AP Clustering Algorithm Based on MapReduce

  1. College of Information Science and Technology, Bohai University, Jinzhou 121001, Liaoning, China
  • Received: 2014-07-22 Online: 2014-11-03 Published: 2014-11-05
  • About the author: Leng Yonglin (冷泳林, born 1978), female, from Yingkou, Liaoning; lecturer at the College of Information Science and Technology, Bohai University, and doctoral candidate. Research interests: data mining and big data processing.
  • Funding:
    Natural Science Foundation of Liaoning Province (2013020014)

Abstract: With the widespread use of networks, the volume of data they generate is growing rapidly, and large-scale data processing faces severe challenges. Building on a study of the AP clustering algorithm, this paper adapts the algorithm to the MapReduce programming model and designs a distributed AP clustering algorithm that runs on the Hadoop cloud platform. Clustering tests on graph data of different sizes show that the distributed AP clustering algorithm achieves good time efficiency and speedup.

Key words: MapReduce model, distributed AP clustering algorithm, Hadoop
