Computer and Modernization (计算机与现代化)

• Software Engineering •

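A Resource Scheduling Mechanism of Hadoop YARN

  1. (Information Service Laboratory, the 32nd Research Institute of China Electronics Technology Group Corporation, Shanghai 201808, China)
  • Received: 2017-05-31 Online: 2017-11-21 Published: 2017-11-21
  • About the authors: LI Cheng (李程, b. 1993), male, from Liuyang, Hunan, is a master's student at the Information Service Laboratory of the 32nd Research Institute of China Electronics Technology Group Corporation; research interests: big data and cloud computing. CHAI Xiaoli (柴小丽, b. 1968), female, is deputy chief engineer of the institute and a professor-level senior engineer; research interests: computer architecture, embedded computers, and domestically developed computers.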

Abstract: YARN is the resource management system widely used in Hadoop. It supports multiple computing frameworks, including MapReduce, Spark, and Storm, and has become a core component of the big data ecosystem. However, the existing resource schedulers in Hadoop YARN rely on a resource guarantee mechanism based on resource reservation, which produces resource fragmentation and wastes resources. To improve cluster resource utilization and throughput, this paper proposes a resource allocation mechanism based on reservation and backfill. The mechanism decides whether to reserve resources for a job according to the job's priority, and it introduces a backfill strategy that puts resources to use through backfilling without affecting the execution of jobs that hold reservations. Experiments show that the reservation-and-backfill scheduling mechanism effectively improves the resource utilization and throughput of a Hadoop YARN cluster.

Key words: Hadoop YARN, big data, resource scheduling, reservation backfill
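
The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a priority-gated reservation with backfill could work on a single resource pool. It borrows the EASY-style backfilling rule known from HPC batch schedulers, which is a swapped-in technique and not necessarily the authors' method. The names `Job`, `schedule_once`, and `reserve_priority`, as well as the per-job runtime estimate, are hypothetical and are not part of any YARN API.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    demand: int    # resource units requested, e.g. GB of memory
    runtime: int   # estimated runtime (assumed input, not provided by YARN)
    priority: int  # larger value = more important


def schedule_once(capacity, running, pending, now, reserve_priority):
    """One scheduling pass over a single resource pool.

    running: list of (Job, end_time) pairs that already hold resources.
    Returns (names of jobs started now, reservation or None), where the
    reservation is (job name, guaranteed start time).
    """
    active = [(job.demand, end) for job, end in running]
    free = capacity - sum(demand for demand, _ in active)
    started = []
    reservation = None
    spare = 0  # units left over at the reserved start time

    # Highest priority first; ties keep submission order (stable sort).
    for job in sorted(pending, key=lambda j: -j.priority):
        if job.demand > free:
            # Only the first blocked job at or above the threshold
            # receives a reservation; other blocked jobs simply wait.
            if reservation is None and job.priority >= reserve_priority:
                avail = free
                for demand, end in sorted(active, key=lambda a: a[1]):
                    avail += demand
                    if avail >= job.demand:
                        reservation = (job.name, end)
                        spare = avail - job.demand
                        break
            continue
        if reservation is not None:
            # Backfill rule: the job must either finish before the
            # reserved start time or fit into the spare units.
            if now + job.runtime > reservation[1]:
                if job.demand > spare:
                    continue
                spare -= job.demand
        started.append(job.name)
        active.append((job.demand, now + job.runtime))
        free -= job.demand
    return started, reservation
```

In this sketch, a blocked job whose priority is below `reserve_priority` gets no reservation, so smaller jobs may keep overtaking it. A job at or above the threshold gets a guaranteed start time, and later jobs are admitted only if they cannot delay that start.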

CLC number: