Computer and Modernization


Realization of Load Balancing Mechanism in Storm Streaming Processing Platform

  

  1. (Information Service Laboratory, The Thirty-second Institute of Chinese Electronic Technology Group Corporation, Shanghai 201808, China)
  • Received: 2017-04-14 Online: 2017-12-25 Published: 2017-12-26

Abstract: Compared with Hadoop, Storm has the advantage of real-time data stream processing and provides an efficient, fast framework for processing multi-source heterogeneous data. However, worker assignment in a Storm cluster considers only the ordering of available Slots across nodes and ignores each node's current load, so it may fail to meet load-balancing requirements when more than one topology is running in the cluster. To improve the efficiency of real-time stream processing and achieve load balancing, this paper proposes a scheduling algorithm for a Storm-based distributed stream processing system that ranks nodes by a weighted combination of available Slots and node load, thereby reducing load imbalance. In addition, through a suitable data structure design, the HBase rowkeys are generated randomly and distributed evenly, which balances the load across the RegionServers, improves the utilization of cluster resources, and greatly increases the data write speed. Comparison experiments against the original Storm system show that the improved algorithm and the optimized mechanism ensure fast data writes and improve the utilization of cluster resources. The improved system shows clear advantages in practicality and efficiency.
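
The weighted ranking of available Slots and node load described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual weights, load metric, or scoring formula, so everything here is an assumption made for illustration: the `Node` fields, the linear `score` function, the equal default weights, and the fixed `load_per_worker` estimate are all hypothetical, and the code does not use the Storm scheduler API.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of a supervisor node (not a Storm API type)."""
    name: str
    free_slots: int
    total_slots: int
    load: float  # assumed normalized load in [0, 1]; 1.0 means fully loaded


def score(node, w_slot=0.5, w_load=0.5):
    """Assumed linear score: more free Slots and lower load rank higher."""
    slot_ratio = node.free_slots / node.total_slots if node.total_slots else 0.0
    return w_slot * slot_ratio + w_load * (1.0 - node.load)


def assign_workers(nodes, num_workers, w_slot=0.5, w_load=0.5,
                   load_per_worker=0.1):
    """Greedily place each worker on the currently best-scoring node.

    load_per_worker is an assumed estimate of the load one worker adds.
    Returns the chosen node names in placement order; stops early when
    no free Slot remains. The input nodes are not modified.
    """
    # Work on copies so the caller's node objects stay unchanged.
    pool = [Node(n.name, n.free_slots, n.total_slots, n.load) for n in nodes]
    placement = []
    for _ in range(num_workers):
        candidates = [n for n in pool if n.free_slots > 0]
        if not candidates:
            break
        best = max(candidates, key=lambda n: score(n, w_slot, w_load))
        best.free_slots -= 1
        best.load = min(1.0, best.load + load_per_worker)
        placement.append(best.name)
    return placement


if __name__ == "__main__":
    cluster = [
        Node("A", free_slots=4, total_slots=4, load=0.9),  # many Slots, busy
        Node("B", free_slots=2, total_slots=4, load=0.1),  # fewer Slots, idle
    ]
    print(assign_workers(cluster, num_workers=4))
```

In this sketch, a ranking based on free Slots alone would place the first worker on node A, while the weighted score prefers the lightly loaded node B. Re-scoring after each placement is one possible design choice; it keeps a single node from absorbing every worker of a topology.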

Key words: Storm, streaming processing, distributed computing, batch processing, load balancing

CLC Number: