【2022】【Reinforcement learning in urban network traffic signal control: A systematic literature review】

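This review covers reinforcement-learning-based traffic signal control in road networks of two or more intersections, drawing on more than 160 articles from 20 countries published between 1994 and 2022. Highlights: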

  • A review on Reinforcement Learning in the network-scale Traffic Signal Control area.

  • Presents a comprehensive systematic literature review of 160 included articles.

  • Consolidates and characterizes the existing research on the defined area.

  • Explores the methods, applications, domains, and first events in the defined scope.

  • Identifies past and present trends and directions for further research in the area.

ABSTRACT

  • (i) publication and authors’ data,
  • (ii) method identification and analysis,
  • (iii) environment attributes and traffic simulation,
  • (iv) application domains of RL-NTSC,
  • (v) major first events of RL-NTSC and authors’ key statements,
  • (vi) code availability, and
  • (vii) evaluation.

Authors:

[Image]

1. Introduction

The approaches proposed to address the shortcomings of existing signal control methods fall into three categories:
These methods include

Next, the paper states the advantage of RL over other methods for traffic signal control:
An advantage of RL over conventional methods, e.g. traffic theory based and heuristic methods, is that RL can learn from the interaction with the environment via trial and error to take appropriate actions based on the feedback it receives from the environment, rather than relying on pre-defined rules which are often used in conventional methods.

It then states the aim of the paper:
Due to the rising popularity of RL in TSC recently, specifically in NTSC, we aim to thoroughly characterize the existing research in the area of urban traffic networks where RL is applied and to provide a complete account of what has already been explored.

Finally, it sets out the inclusion criteria for articles and the scope of the review:
[Image]

2. Background

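In TSC, a single agent usually controls a single intersection, so a network of intersections leads naturally to Multi-Agent Reinforcement Learning (MARL). This section also explains TSC concepts such as signal phases.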

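In the subsection "reinforcement learning fundamentals in traffic signal control", the paper first covers the RL background and describes the complete RL loop as applied to TSC. Citing the literature, it then notes that Q-learning is one of the most frequently used and successful methods in TSC (One of the most frequently used and successful RL methods in traffic signal control is Q-learning (Reinforcement Learning: An Introduction, Sutton & Barto, 2018), which was first investigated in 1989.). It also gives a complete description of the Q-learning algorithm.
[Image]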
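To make the Q-learning update concrete, here is a minimal tabular sketch in Python. It is not the paper's implementation: the state labels, the reward value, and the `alpha`/`gamma` settings are assumptions chosen only for illustration. The update itself is the standard rule Q(s,a) ← Q(s,a) + α[r + γ max Q(s',a') − Q(s,a)].

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the new Q(state, action)."""
    # Greedy value of the next state (off-policy target).
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: two signal phases (actions 0 and 1) at one intersection.
q_table = defaultdict(float)          # unseen (state, action) pairs start at 0.0
phases = [0, 1]
# Assumed reward: negative queue length (4 vehicles waiting after the action).
new_value = q_update(q_table, "long_queue_NS", 0, -4.0, "short_queue_NS", phases)
```

In a multi-intersection setting, one simple arrangement (again an assumption, not something this post attributes to the paper) is to give each intersection its own `q_table` and call `q_update` independently per agent.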

3. Review method

(Omitted.)

4. Results

Analysis of the included articles shows that the most common choices are:
RL method: Q-learning, RL, Actor-Critic
Approximation method: Deep NN, RNN, LSTM, GCN
Action selection: ε-greedy, softmax
Action type: 7, 6, 5
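The two action-selection rules named above can be sketched as follows. This is a generic illustration using only the Python standard library; the function names and the `temperature` parameter are assumptions for this example, not notation taken from the paper.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_probs(q_values, temperature=1.0):
    """Boltzmann distribution over actions; higher Q-values get more probability."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_select(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

ε-greedy explores uniformly at random, while softmax explores in proportion to the estimated values, so clearly bad phases are tried less often. A lower `temperature` makes softmax behave more greedily.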

4.2.5 RL method's components and types

Top 5

  • state: queue size, phase state, number of vehicles, position of vehicles, speed
  • reward: queue size, delay, waiting time, number of vehicles, number of vehicles that passed the intersection
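As a rough illustration of how the two most common components could fit together, the sketch below builds a state from queue sizes plus the active phase and uses the negative total queue as the reward. The class, field names, and approach labels ("N", "E", "S", "W") are hypothetical; real studies obtain these quantities from a traffic simulator or from detectors.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class IntersectionObservation:
    queue_sizes: Dict[str, int]  # vehicles queued on each incoming approach
    phase: int                   # index of the currently active signal phase


def build_state(obs: IntersectionObservation, approaches: List[str]) -> Tuple[int, ...]:
    """State = queue size per approach (in a fixed order) followed by the phase index."""
    return tuple(obs.queue_sizes.get(a, 0) for a in approaches) + (obs.phase,)


def queue_reward(obs: IntersectionObservation) -> float:
    """Reward = negative total queue, so shorter queues yield a higher reward."""
    return -float(sum(obs.queue_sizes.values()))


# Hypothetical observation at a four-approach intersection.
obs = IntersectionObservation(queue_sizes={"N": 3, "S": 1, "E": 0, "W": 4}, phase=2)
state = build_state(obs, ["N", "E", "S", "W"])
reward = queue_reward(obs)
```

The tuple returned by `build_state` is hashable, so it can be used directly as a key in a tabular method. With a neural approximator it would typically be converted to a numeric vector instead.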