Computer and Modernization

• Algorithm Design and Analysis •

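A Rank-based Q-routing Algorithm

  (1. Suzhou Power Supply Branch, State Grid Jiangsu Electric Power Limited Company, Suzhou 215004, China;
   2. School of Computer Science and Technology, Soochow University, Suzhou 215006, China)
  • Received: 2018-04-03 Online: 2018-10-26 Published: 2018-10-26
  • About the authors: WANG Yuejuan (王月娟, 1981-), female, from Suzhou, Jiangsu, engineer at the Suzhou Power Supply Branch of State Grid Jiangsu Electric Power Limited Company, master's degree; research interests: intelligent information systems, machine learning, intelligent network communication. ZHANG Suning (张苏宁, 1973-), female, from Nantong, Jiangsu, senior engineer; research interests: intelligent network communication, computer applications, machine learning. WU Shuiming (吴水明, 1970-), male, from Suzhou, Jiangsu, engineer; research interests: intelligent network communication, intelligent information systems, machine learning. ZHU Fei (朱斐, 1978-), male, from Suzhou, Jiangsu, associate professor and master's supervisor at the School of Computer Science and Technology, Soochow University, Ph.D.; research interests: machine learning, artificial intelligence, intelligent information systems.
  • Funding:
    National Natural Science Foundation of China (61303108, 61373094); Major Program of the Natural Science Research of Jiangsu Higher Education Institutions (17KJA520004); Provincial Key Laboratory Project for Higher Education Institutions, Soochow University (KJS1524)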

Abstract: Efficient routing in dynamically changing, complex networks is an active research topic. Q-learning is a widely used reinforcement learning algorithm that solves optimal control problems in unknown environments through continual interaction with the environment, and it handles online learning tasks effectively. This paper proposes a rank-based Q-routing (RQ routing) algorithm. Working within the Q-learning framework, RQ routing retains the efficiency of the Q-routing algorithm and introduces a dynamically computed rank function that represents the priority of the current state in a given scenario. The rank is used when solving for the optimal routing choice, which avoids overly long waiting queues, reduces network congestion, and increases transmission speed. The rank function is flexible: substituting different rank functions lets the algorithm meet the needs of different scenarios, which gives it better generalization ability and overcomes the restriction of traditional Q-routing to a single application scenario. Experiments verify the effectiveness of the proposed algorithm.

Key words: reinforcement learning, Q-learning, Q-routing, QoS routing, computer network
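
The idea summarized in the abstract can be illustrated with a minimal Python sketch. The `update` method follows the standard Q-routing rule, Q_x(d, y) ← Q_x(d, y) + α(q + s + min_z Q_y(d, z) − Q_x(d, y)), where q is the queueing delay and s the transmission delay. The abstract does not give the rank function or say how it enters the computation, so the `rank` argument, the `weight` parameter, and the use of neighbor queue length as a rank are all hypothetical choices made here for illustration, not the authors' method.

```python
from collections import defaultdict


class QRouter:
    """Per-node table: q[dest][neighbor] estimates delivery time via neighbor."""

    def __init__(self, neighbors, alpha=0.5):
        self.neighbors = list(neighbors)
        self.alpha = alpha  # learning rate (assumed value)
        self.q = defaultdict(lambda: {n: 0.0 for n in self.neighbors})

    def best_estimate(self, dest):
        # Smallest estimated remaining delivery time to dest over all neighbors.
        return min(self.q[dest].values())

    def choose(self, dest, rank=None, weight=1.0):
        # Plain Q-routing when rank is None. Otherwise add a hypothetical
        # rank penalty (for example, the neighbor's queue length).
        def score(n):
            penalty = weight * rank(n) if rank else 0.0
            return self.q[dest][n] + penalty
        return min(self.neighbors, key=score)

    def update(self, dest, neighbor, queue_delay, transmit_delay, neighbor_estimate):
        # Standard Q-routing update toward q + s + min_z Q_y(d, z).
        old = self.q[dest][neighbor]
        target = queue_delay + transmit_delay + neighbor_estimate
        self.q[dest][neighbor] = old + self.alpha * (target - old)
        return self.q[dest][neighbor]


router = QRouter(neighbors=["B", "C"])
router.update("D", "B", queue_delay=2.0, transmit_delay=1.0, neighbor_estimate=4.0)
router.update("D", "C", queue_delay=1.0, transmit_delay=1.0, neighbor_estimate=6.0)
queue_len = {"B": 3, "C": 0}  # hypothetical congestion snapshot
print(router.choose("D"))                      # plain Q-routing choice
print(router.choose("D", rank=queue_len.get))  # rank-adjusted choice
```

In this toy setting the plain choice prefers the neighbor with the lower learned estimate, while the rank-adjusted choice can steer traffic away from a congested neighbor. Swapping in a different `rank` callable changes the prioritization without touching the learning rule, which is the kind of flexibility the abstract attributes to the rank function.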

CLC Number: