Computer and Modernization, 2022, Vol. 0, Issue (11): 75-80.

• Artificial Intelligence •

Deep Q-learning Based Task Offloading in Power IoT

  DING Zhonglin, LI Yang, CAO Wei, TAN Yuhao, XU Bo

  (1. State Grid Electric Power Research Institute, Nari Group Corporation, Nanjing 211000, China; 2. College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China)
  • Online: 2022-11-30  Published: 2022-11-30
  • About the authors: DING Zhonglin (1982—), male, born in Nantong, Jiangsu, engineer, research interest: power wireless communication technology, E-mail: dingzhonglin@sgepri.sgcc.com.cn; LI Yang (1975—), male, born in Guanyun, Jiangsu, senior engineer, bachelor's degree, research interest: power wireless communication technology, E-mail: liyang2@sgepri.sgcc.com.cn; CAO Wei (1993—), male, born in Nanjing, Jiangsu, engineer, research interest: power wireless communication technology, E-mail: caowei1@sgepri.sgcc.com.cn; corresponding author: TAN Yuhao (1998—), male, born in Xiaogan, Hubei, master's student, research interest: distributed learning, E-mail: 1136851629@qq.com; XU Bo (1995—), male, born in Nanjing, Jiangsu, Ph.D. student, research interest: wireless communication, E-mail: 10108010321@njupt.edu.cn.
  • Funding: Supported by the Management Science and Technology Project of the Headquarters of State Grid Corporation of China (SGZJXT00JSJS2000455)

Abstract: With the increasing demand for electricity in modern cities and industrial production, the power Internet of Things (PIoT) has attracted extensive attention as a solution that can significantly improve the efficiency of power systems. To provide effective access, power equipment is now often fitted with 5G modules carrying lightweight built-in AI. However, because the computing and communication capabilities of these modules are limited, the massive data generated by the equipment are difficult to process and analyze in real time. This paper therefore studies the task offloading problem in PIoT systems: by jointly optimizing the offloading decisions and the computing resource allocation of the edge servers, the weighted sum of latency and energy consumption is reduced. We propose a task offloading algorithm based on deep reinforcement learning. First, task execution on each edge server is modeled as a queuing system. Then, the local computing resource allocation is optimized based on convex optimization theory. Finally, a deep Q-learning algorithm is used to optimize the task offloading decisions. Simulation results show that the proposed algorithm significantly reduces the weighted sum of system latency and energy consumption.

Key words: power internet of things, edge offloading, resource allocation, deep reinforcement learning, 5G modules
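
The abstract states the optimization objective only in words (minimize a weighted sum of latency and energy over the offloading decisions and the edge servers' computing resources). As a minimal sketch, assuming a binary offloading variable, a per-device edge CPU allocation, and a single latency/energy weight that are not given on this page, such a problem is typically written as:

```latex
% Illustrative formulation only: the variables x_n, f_n, the functions T_n, E_n,
% the weight \omega and the capacity F_edge are assumptions, not taken from the paper.
\begin{align}
  \min_{\{x_n\},\,\{f_n\}} \quad & \sum_{n=1}^{N} \Big[ \omega\, T_n(x_n, f_n) + (1-\omega)\, E_n(x_n, f_n) \Big] \\
  \text{s.t.} \quad & x_n \in \{0, 1\}, \quad n = 1, \dots, N, \\
                    & \sum_{n=1}^{N} x_n f_n \le F_{\mathrm{edge}},
\end{align}
```

where x_n is the offloading decision of device n (1 means offload), f_n the edge computing resource allocated to it, T_n and E_n its latency and energy consumption, F_edge the edge server's total computing capacity, and the weight ω in [0, 1] trades latency against energy.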
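Only the outline of the algorithm appears on this page (queuing model of edge execution, convex optimization for the local computing resources, deep Q-learning for the offloading decisions). Below is a minimal, self-contained PyTorch sketch of the deep Q-learning component alone; the state features, action set, network size, reward weights, and the random stand-in environment are all assumptions for illustration, not the paper's model.

```python
# Minimal DQN sketch for binary offloading decisions (illustrative only).
# State/action sizes, network width, and the reward are assumptions; the paper's
# queuing model and convex local resource allocation are not reproduced here.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

STATE_DIM = 4      # e.g. task size, queue length, channel gain, local CPU load (assumed)
N_ACTIONS = 2      # 0 = compute locally, 1 = offload to the edge server
GAMMA = 0.95
EPSILON = 0.1
BATCH_SIZE = 32

class QNet(nn.Module):
    """Small fully connected Q-network mapping a state to one Q-value per action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, N_ACTIONS),
        )

    def forward(self, x):
        return self.net(x)

q_net = QNet()
target_net = QNet()
target_net.load_state_dict(q_net.state_dict())
optimizer = optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)

def select_action(state):
    """Epsilon-greedy offloading decision for one task."""
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(torch.tensor(state).float().unsqueeze(0)).argmax().item())

def reward(latency, energy, omega=0.5):
    """Negative weighted sum of latency and energy (weight assumed)."""
    return -(omega * latency + (1.0 - omega) * energy)

def train_step():
    """One gradient step on a random minibatch from the replay buffer."""
    if len(replay) < BATCH_SIZE:
        return
    s, a, r, s2 = zip(*random.sample(replay, BATCH_SIZE))
    s = torch.tensor(s).float()
    a = torch.tensor(a).long().unsqueeze(1)
    r = torch.tensor(r).float()
    s2 = torch.tensor(s2).float()
    q = q_net(s).gather(1, a).squeeze(1)                        # Q(s, a)
    with torch.no_grad():
        q_target = r + GAMMA * target_net(s2).max(dim=1).values  # bootstrap target
    loss = F.mse_loss(q, q_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Toy interaction loop with a random stand-in environment (not the PIoT simulator).
for step in range(1000):
    state = [random.random() for _ in range(STATE_DIM)]
    action = select_action(state)
    latency, energy = random.random(), random.random()  # placeholders for the system model
    next_state = [random.random() for _ in range(STATE_DIM)]
    replay.append((state, action, reward(latency, energy), next_state))
    train_step()
    if step % 100 == 0:
        target_net.load_state_dict(q_net.state_dict())
```

In the paper's setting the reward would be computed from the queuing and local-computation model rather than from random placeholders, and the exploration rate would typically be annealed during training.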