Computer and Modernization

• Network and Communication •

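Single Intersection Traffic Signal Coordination Control Based on Q-learning

  1. (College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China)
  • Received: 2019-12-27 Online: 2020-05-20 Published: 2020-05-21
  • About the authors: HU Yu (1999-), male, from Xinxiang, Henan, undergraduate student, research interest: smart city transportation, E-mail: yolo0810@163.com; LIU Meiling (1981-), female, from Harbin, Heilongjiang, associate professor, Ph.D., research interests: social media data mining, smart city transportation, artificial intelligence, E-mail: lmling2008@163.com; ZHOU Ziang (1999-), male, from Wenzhou, Zhejiang, undergraduate student, research interest: smart city transportation, E-mail: zza.1999@163.com; ZHANG Min (1999-), female, from Zuoyun, Shanxi, undergraduate student, research interest: smart city transportation, E-mail: 954549092@qq.com.
  • Supported by:
    National Natural Science Foundation of China (61702091); Fundamental Research Funds for the Central Universities (2572018BH06); National Undergraduate Training Program for Innovation and Entrepreneurship (201910225191)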

Abstract: Q-learning achieves adaptive traffic signal control at a single intersection by interacting with the external environment. To alleviate increasingly severe urban traffic congestion, this paper proposes a Q-learning algorithm that incorporates the green split optimization method of the SCOOT system. Specifically, time factors, such as the average vehicle delay rate and the number of stops, are combined with economic factors to build a new mathematical model that serves as the cost function of the algorithm, and a continuous reward and punishment function is constructed. On this basis, the operation of the Q-learning algorithm at a single intersection is described in detail. A side-by-side comparison with the Webster delay rate and with Q-learning based on the minimum average vehicle delay rate shows that the proposed algorithm outperforms both fixed-time control and Q-learning based on average vehicle delay, and is therefore better suited to green split optimization at a single intersection.

Key words: traffic signal control, Q-learning, single intersection, agent
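
As a rough illustration of the kind of loop the abstract describes, the following is a minimal sketch of tabular Q-learning with a weighted delay-and-stops cost and a continuous reward. It is not the authors' model: the state discretization, the action set, the weights `w_delay` and `w_stops`, and the reward formula are all hypothetical choices made for this example, since the abstract does not give them.

```python
import random

# Hypothetical action set: change applied to the green split of one phase.
ACTIONS = (-0.05, 0.0, 0.05)
# Hypothetical number of discretized congestion levels used as states.
N_STATES = 5


def cost(delay, stops, w_delay=1.0, w_stops=0.5):
    """Illustrative weighted cost of average delay and stops per vehicle."""
    return w_delay * delay + w_stops * stops


def reward(prev_cost, new_cost):
    """Continuous reward: positive when the cost falls, negative when it rises."""
    return (prev_cost - new_cost) / max(prev_cost, 1e-9)


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection over the Q-table row for `state`."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    row = q[state]
    return row.index(max(row))


def q_update(q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update rule."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (r + gamma * best_next - q[state][action])


def new_q_table():
    """Q-table initialized to zero for every state-action pair."""
    return [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
```

In use, an agent would observe the state, pick an action with `choose_action`, apply the green split change in a traffic simulator, measure delay and stops, and feed `reward(prev_cost, new_cost)` into `q_update`. The simulator itself is outside the scope of this sketch.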

CLC number: