Computer and Modernization

Dynamic Replicas Strategy Based on Predicted Popularity

  

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China;
  2. Unit 94860 of PLA, Nanjing 210018, China
  • Received: 2014-11-12 Online: 2015-02-28 Published: 2015-03-06

Abstract:

To improve data availability and cluster performance, HDFS currently adopts a uniform data replication policy. However, different files differ in popularity, sometimes
enormously, and concurrent access to highly popular data can hurt job performance. To address this problem, a dynamic replica strategy based on predicted popularity is
proposed. Using recent data popularity as input, a grey prediction model produces an initial forecast, and a Markov prediction model then corrects the deviation caused by burst
access and shifting access, yielding a more accurate predicted popularity for each file. A finite-channel service model based on the predicted popularity is then established to
calculate the minimum number of replicas that meets user demand. Experimental results show that, compared with the default data replication, the strategy avoids contention more
effectively, reduces job execution time, and alleviates network traffic.
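
The two modelling steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the model details, so the sketch assumes a textbook GM(1,1) grey model for the popularity forecast, omits the Markov correction entirely, and reads the "finite channel service model" as an Erlang B loss system (a birth-death process) in which each replica is treated as one service channel. The function names `gm11_forecast` and `min_replicas`, and the parameters `arrival_rate`, `service_rate`, and `max_blocking`, are hypothetical.

```python
import math


def gm11_forecast(history):
    """Forecast the next value of a positive series with a textbook GM(1,1) model.

    Assumption: `history` holds recent access counts of one file, oldest first.
    """
    n = len(history)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted on at least 4 points")
    if any(v <= 0 for v in history):
        raise ValueError("GM(1,1) expects a strictly positive series")

    # Accumulated generating operation: running sums of the raw series.
    cumulative = []
    total = 0.0
    for value in history:
        total += value
        cumulative.append(total)

    # Background values: means of consecutive accumulated values.
    z = [0.5 * (cumulative[k] + cumulative[k - 1]) for k in range(1, n)]
    y = [float(v) for v in history[1:]]

    # Least-squares fit of y = -a * z + b (ordinary linear regression).
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    intercept = (sy - slope * sz) / m
    a, b = -slope, intercept

    if abs(a) < 1e-12:
        # Degenerate case: no growth or decay, the series is roughly constant.
        return b

    def accumulated_estimate(k):
        # Time response of the fitted model for the accumulated series.
        return (history[0] - b / a) * math.exp(-a * k) + b / a

    # Difference of accumulated estimates restores the raw-series forecast.
    return accumulated_estimate(n) - accumulated_estimate(n - 1)


def min_replicas(arrival_rate, service_rate, max_blocking):
    """Smallest channel count whose Erlang B blocking probability is acceptable.

    Assumption: one replica equals one service channel, and requests that
    find all channels busy are lost (M/M/c/c birth-death process).
    """
    if service_rate <= 0 or not 0 < max_blocking < 1:
        raise ValueError("need service_rate > 0 and 0 < max_blocking < 1")
    offered_load = arrival_rate / service_rate
    replicas = 0
    blocking = 1.0  # with zero channels every request is blocked
    while blocking > max_blocking:
        replicas += 1
        # Standard Erlang B recurrence on the number of channels.
        blocking = offered_load * blocking / (replicas + offered_load * blocking)
    return replicas
```

In use, the forecast from `gm11_forecast` would play the role of the predicted arrival rate passed to `min_replicas`; how the paper actually maps popularity to arrival and service rates is not stated in the abstract.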

Key words: highly popular data, replica management, cloud computing, Hadoop, grey prediction, birth-death process

CLC Number: