Computer and Modernization (计算机与现代化), 2011, Vol. 193, Issue (9): 201-204. doi: 10.3969/j.issn.1006-2475.2011.09.054

• Applications and Development •

Research on Strategy of Load Balancing in Distributed ETL

ZHANG Liang 1, XIA Xiu-feng 2

  1. Vocational and Technical School of Shenyang Pharmaceutical University, Shenyang 110026, Liaoning, China; 2. School of Computer Science and Engineering, Shenyang Aerospace University, Shenyang 110136, Liaoning, China
  • Received: 2011-05-03 Revised: 1900-01-01 Online: 2011-09-22 Published: 2011-09-22

Abstract: Building on an analysis of why load balancing matters in distributed ETL, and targeting the low efficiency of traditional ETL when it is applied to distributed data warehouses, this paper proposes a strategy that partitions the data extracted by distributed ETL nodes according to the type of data each node extracts. It also presents a new load-balancing model, the R-CN model, which combines the chain network model with Routers. On this basis, a load scheduling and balancing strategy for distributed ETL nodes is proposed that combines ETL data partitioning with the R-CN model. The strategy greatly improves the data-processing capacity of ETL nodes and effectively increases the efficiency of distributed ETL.

Key words: distributed data warehouse, ETL, data partitioning, load balancing

CLC Number: