[1] 顾鸿儒,孙连坤. 基于层次颜色Petri网的交通紧急调度算法与建模[J]. 计算机工程与应用, 2016,52(16):261-270.
[2] 李敏. 交通堵塞车流调度单点信号嵌入式控制仿真[J]. 计算机仿真, 2017,34(2):189-192.
[3] GRONDMAN I, BUSONIU L, LOPES G A D, et al. A survey of actor-critic reinforcement learning: Standard and natural policy gradients[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2012,42(6):1291-1307.
[4] DANN C, NEUMANN G, PETERS J. Policy evaluation with temporal differences: A survey and comparison[J]. Journal of Machine Learning Research, 2014,15:809-883.
[5] 万里鹏,兰旭光,张翰博,等. 深度强化学习理论及其应用综述[J]. 模式识别与人工智能, 2019,32(1):67-81.
[6] ZHANG D X, HAN X Q, DENG C Y. Review on the research and practice of deep learning and reinforcement learning in smart grids[J]. CSEE Journal of Power and Energy Systems, 2018,4(3):362-370.
[7] 王竹晓,张彭彭,李为,等. 基于深度Q网络的电力工控网络异常检测系统[J]. 计算机与现代化, 2019(12):114-118.
[8] 袁雯,刘惠义. 基于深度Q网络的仿人机器人步态优化[J]. 计算机与现代化, 2019(4):47-51.
[9] 彭琛,韩立新. 基于深度强化学习的计步方法[J]. 计算机与现代化, 2019(1):63-68.
[10] GAO P, ZHANG Q Q, WANG F, et al. Learning reinforced attentional representation for end-to-end visual tracking[J]. Information Sciences, 2020,517:52-67.
[11] YAN S Y, CHEN C Y, WU C C. Solution methods for the taxi pooling problem[J]. Transportation, 2012,39(3):723-748.
[12] QI X, XIONG J, XU G Q, et al. Taxi-pooling scheduling model and algorithm based on many-to-many pickup and delivery problems[C]// The 16th COTA International Conference on Transportation Professionals. 2016:89-98.
[13] 欧先锋,罗百通,向灿群,等. 一种出租车合乘业务方案设计[J]. 成都工业学院学报, 2017,20(2):43-49.
[14] 曾伟良,吴淼森,孙为军,等. 自动驾驶出租车调度系统研究综述[J/OL]. 计算机科学, (2019-12-25)[2020-02-13]. http://kns.cnki.net/kcms/detail/50.1075.TP.20191225.0909.006.html.
[15] 谢榕,潘维,柴崎亮介. 基于人工鱼群算法的出租车智能调度[J]. 系统工程理论与实践, 2017,37(11):2938-2947.
[16] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with deep reinforcement learning[DB/OL]. (2013-12-19)[2020-02-13]. https://arxiv.org/pdf/1312.5602.pdf.
[17] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015,518:529-533.
[18] VAN HASSELT H, GUEZ A, SILVER D, et al. Deep reinforcement learning with double Q-learning[C]// The 30th AAAI Conference on Artificial Intelligence. 2016:2094-2100.
[19] WANG Z Y, SCHAUL T, HESSEL M, et al. Dueling network architectures for deep reinforcement learning[C]// Proceedings of the 33rd International Conference on Machine Learning. 2016:1995-2003.
[20] SCHAUL T, QUAN J, ANTONOGLOU I, et al. Prioritized experience replay[DB/OL]. (2016-02-25)[2020-02-13]. https://arxiv.org/pdf/1511.05952.pdf.
[21] MNIH V, BADIA A P, MIRZA M, et al. Asynchronous methods for deep reinforcement learning[C]// Proceedings of the 33rd International Conference on Machine Learning. 2016:1928-1937.
[22] BELLEMARE M G, DABNEY W, MUNOS R, et al. A distributional perspective on reinforcement learning[C]// Proceedings of the 34th International Conference on Machine Learning. 2017:449-458.
[23] 周建频,张姝柳. 基于深度强化学习的动态库存路径优化[J]. 系统仿真学报, 2019,31(10):2155-2163.
[24] 王云鹏,郭戈. 基于深度强化学习的有轨电车信号优先控制[J]. 自动化学报, 2019,45(12):2366-2377.