Design and Implementation of Task Scheduling Middleware for GPU Cluster

doi:10.3969/j.issn.1006-2475.2013.02.032

Computer and Modernization, 2013, Vol. 1, Issue (2): 130-133. doi: 10.3969/j.issn.1006-2475.2013.02.032

CHEN Chun-lei

School of Automation, Northwestern Polytechnical University, Xi’an 710072, China

Received: 2012-12-21; Online: 2013-02-27; Published: 2013-02-27

Abstract: Because GPUs work as co-processors, a static task scheduling policy in a GPU cluster can allocate computing resources unevenly. To address this problem, a weight-based dynamic scheduling policy is proposed and implemented as middleware so that it can be applied to GPU clusters. The policy treats all GPUs in the cluster as a single pool, without explicitly distinguishing local GPUs from remote ones, yet it requires no global information: each cluster node decides whether to use a local or a remote GPU solely from GPU weights that it maintains locally. As the carrier of the policy, the middleware keeps scheduling transparent to users. It consists of three parts: an API library, a resource management daemon, and a GPU execution daemon. In experiments on a two-node validation platform, the weight-based policy achieves a GPU utilization rate 16% higher than the static policy and 45% higher than another dynamic policy that relies on global information (the global-queue-based policy).

Key words: GPU cluster, dynamic task scheduling, middleware
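
The abstract does not define how GPU weights are computed or updated, so the following is only a minimal sketch of the general idea: a node chooses between local and remote GPUs from a table it maintains itself, with no global queue. Every name here (`GpuEntry`, `NodeScheduler`, `remote_penalty`) and the meaning of a weight (estimated pending work, lower is better) are assumptions made for illustration, not the paper's design.

```python
from dataclasses import dataclass, field


@dataclass
class GpuEntry:
    """One row of a node's locally maintained GPU table (hypothetical layout)."""
    gpu_id: str
    is_local: bool
    weight: float = 0.0  # assumed meaning: estimated pending work, lower is better


@dataclass
class NodeScheduler:
    """Picks a GPU using only the weights stored on this node."""
    gpus: dict = field(default_factory=dict)
    remote_penalty: float = 0.5  # assumed extra cost of sending a task to a remote GPU

    def add_gpu(self, gpu_id, is_local):
        self.gpus[gpu_id] = GpuEntry(gpu_id, is_local)

    def effective_weight(self, entry):
        # Remote GPUs are penalized so that local ones win when loads are equal.
        return entry.weight + (0.0 if entry.is_local else self.remote_penalty)

    def pick(self, task_cost):
        # Lowest effective weight wins; ties go to a local GPU, then to the smaller id.
        best = min(
            self.gpus.values(),
            key=lambda e: (self.effective_weight(e), not e.is_local, e.gpu_id),
        )
        best.weight += task_cost  # local bookkeeping only; no global state is consulted
        return best.gpu_id

    def complete(self, gpu_id, task_cost):
        # Called when a task finishes; the weight never drops below zero.
        entry = self.gpus[gpu_id]
        entry.weight = max(0.0, entry.weight - task_cost)
```

Under these assumptions, a node with one local GPU (`local0`) and one remote GPU (`remote0`) that submits three unit-cost tasks in a row would place them on `local0`, `remote0`, and `local0`: the local GPU is preferred until its accumulated weight exceeds the remote GPU's weight plus the penalty.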

CHEN Chun-lei. Design and Implementation of Task Scheduling Middleware for GPU Cluster[J]. Computer and Modernization, 2013, 1(2): 130-133.
