[1] 顾鸿儒, 孙连坤. Traffic emergency dispatching algorithm and modeling based on hierarchical colored Petri nets[J]. Computer Engineering and Applications, 2016,52(16):261-270.
[2] 李敏. Embedded control simulation of single-point signals for traffic-jam flow scheduling[J]. Computer Simulation, 2017,34(2):189-192.
[3] GRONDMAN I, BUSONIU L, LOPES G A D, et al. A survey of actor-critic reinforcement learning: Standard and natural policy gradients[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2012,42(6):1291-1307.
[4] DANN C, NEUMANN G, PETERS J. Policy evaluation with temporal differences: A survey and comparison[J]. Journal of Machine Learning Research, 2014,15:809-883.
[5] 万里鹏, 兰旭光, 张翰博, et al. A survey of deep reinforcement learning theory and its applications[J]. Pattern Recognition and Artificial Intelligence, 2019,32(1):67-81.
[6] ZHANG D X, HAN X Q, DENG C Y. Review on the research and practice of deep learning and reinforcement learning in smart grids[J]. CSEE Journal of Power and Energy Systems, 2018,4(3):362-370.
[7] 王竹晓, 张彭彭, 李为, et al. Anomaly detection system for power industrial control networks based on deep Q-network[J]. Computer and Modernization, 2019(12):114-118.
[8] 袁雯, 刘惠义. Gait optimization for humanoid robots based on deep Q-network[J]. Computer and Modernization, 2019(4):47-51.
[9] 彭琛, 韩立新. A step-counting method based on deep reinforcement learning[J]. Computer and Modernization, 2019(1):63-68.
[10] GAO P, ZHANG Q Q, WANG F, et al. Learning reinforced attentional representation for end-to-end visual tracking[J]. Information Sciences, 2020,517:52-67.
[11] YAN S Y, CHEN C Y, WU C C. Solution methods for the taxi pooling problem[J]. Transportation, 2012,39(3):723-748.
[12] QI X, XIONG J, XU G Q, et al. Taxi-pooling scheduling model and algorithm based on many-to-many pickup and delivery problems[C]// The 16th COTA International Conference on Transportation Professionals. 2016:89-98.
[13] 欧先锋, 罗百通, 向灿群, et al. Design of a taxi ride-sharing service scheme[J]. Journal of Chengdu Technological University, 2017,20(2):43-49.
[14] 曾伟良, 吴淼森, 孙为军, et al. A survey of research on autonomous taxi dispatching systems[J/OL]. Computer Science, (2019-12-25)[2020-02-13]. http://kns.cnki.net/kcms/detail/50.1075.TP.20191225.0909.006.html.
[15] 谢榕, 潘维, 柴崎亮介. Intelligent taxi dispatching based on the artificial fish swarm algorithm[J]. Systems Engineering - Theory & Practice, 2017,37(11):2938-2947.
[16] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with deep reinforcement learning[DB/OL]. (2013-12-19)[2020-02-13]. https://arxiv.org/pdf/1312.5602.pdf.
[17] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015,518:529-533.
[18] VAN HASSELT H, GUEZ A, SILVER D. Deep reinforcement learning with double Q-learning[C]// The 30th AAAI Conference on Artificial Intelligence. 2016:2094-2100.
[19] WANG Z Y, SCHAUL T, HESSEL M, et al. Dueling network architectures for deep reinforcement learning[C]// Proceedings of the 33rd International Conference on Machine Learning. 2016:1995-2003.
[20] SCHAUL T, QUAN J, ANTONOGLOU I, et al. Prioritized experience replay[DB/OL]. (2016-02-25)[2020-02-13]. https://arxiv.org/pdf/1511.05952.pdf.
[21] MNIH V, BADIA A P, MIRZA M, et al. Asynchronous methods for deep reinforcement learning[C]// Proceedings of the 33rd International Conference on Machine Learning. 2016:1928-1937.
[22] BELLEMARE M G, DABNEY W, MUNOS R. A distributional perspective on reinforcement learning[C]// Proceedings of the 34th International Conference on Machine Learning. 2017:449-458.
[23] 周建频, 张姝柳. Dynamic inventory routing optimization based on deep reinforcement learning[J]. Journal of System Simulation, 2019,31(10):2155-2163.
[24] 王云鹏, 郭戈. Signal priority control for trams based on deep reinforcement learning[J]. Acta Automatica Sinica, 2019,45(12):2366-2377.