Challenges in Reward Design for Reinforcement Learning-based Traffic Signal Control

Video Link: https://www.youtube.com/watch?v=33klFWORcWs
Duration: 20:33
Deep Reinforcement Learning (DRL) is a promising data-driven approach for traffic signal control, especially because DRL can learn to adapt to varying traffic demands. To do so, a DRL agent maximizes a scalar reward by interacting with an environment. However, one needs to formulate a suitable reward that aligns agent behavior with user objectives, which is an open research problem. We investigate this problem in the context of traffic signal control with the objective of minimizing CO2 emissions at intersections. Because CO2 emissions can be affected by multiple factors outside the agent’s control, it is unclear whether an emission-based metric works well as a reward or whether a proxy reward is needed. To obtain a suitable reward, we evaluate various rewards and combinations of rewards. For each reward, we train a Deep Q-Network (DQN) on homogeneous and heterogeneous traffic scenarios. We use the SUMO (Simulation of Urban MObility) simulator and its default emission model to monitor the agent’s performance on the specified rewards and on CO2 emissions. Our experiments show that a CO2 emission-based reward is inefficient for training a DQN, that the agent’s performance is sensitive to variations in the parameters of combined rewards, and that some reward formulations do not work equally well across scenarios. Based on these results, we identify desirable reward properties that have implications for reward design in reinforcement learning-based traffic signal control.
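To make the reward formulations concrete, the sketch below shows how such rewards could be queried from a running simulation through SUMO's standard TraCI Python API (the calls used, such as traci.vehicle.getCO2Emission and traci.lane.getLastStepHaltingNumber, are part of that API). The traffic light ID "tls0", the weights ALPHA and BETA, and the config file "scenario.sumocfg" are illustrative assumptions, not details from the talk, and the weighted sum is one plausible form of a combined reward, not the authors' exact formulation.

import traci

TLS_ID = "tls0"   # hypothetical traffic light ID
ALPHA = 1e-4      # hypothetical weight on the CO2 term of the combined reward
BETA = 1.0        # hypothetical weight on the queue term

def incoming_lanes(tls_id):
    # Lanes controlled by the traffic light (set removes duplicates).
    return set(traci.trafficlight.getControlledLanes(tls_id))

def reward_co2(tls_id):
    # Negative sum of CO2 emissions (mg/s) of vehicles on incoming lanes.
    # Directly encodes the objective, but is noisy: emissions also depend on
    # vehicle types and driving dynamics outside the agent's control.
    co2 = 0.0
    for lane in incoming_lanes(tls_id):
        for veh in traci.lane.getLastStepVehicleIDs(lane):
            co2 += traci.vehicle.getCO2Emission(veh)
    return -co2

def reward_queue(tls_id):
    # Negative count of halting vehicles: a common proxy reward.
    return -sum(traci.lane.getLastStepHaltingNumber(lane)
                for lane in incoming_lanes(tls_id))

def reward_combined(tls_id, alpha=ALPHA, beta=BETA):
    # Weighted combination of the two signals above.
    return alpha * reward_co2(tls_id) + beta * reward_queue(tls_id)

traci.start(["sumo", "-c", "scenario.sumocfg"])  # hypothetical scenario file
while traci.simulation.getMinExpectedNumber() > 0:
    traci.simulationStep()
    r = reward_combined(TLS_ID)  # this scalar would drive the DQN update

traci.close()

Shifting ALPHA and BETA changes which term dominates the learning signal, which illustrates one concrete way the parameter sensitivity of combined rewards reported in the abstract can arise.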

--

Title: Challenges in Reward Design for Reinforcement Learning-based Traffic Signal Control: An Investigation using a CO2 Emission Objective

Presenters: Christian Medeiros Adriano and Max Schumacher

Authors: Max Schumacher, Christian Medeiros Adriano and Holger Giese




Other Videos By Eclipse Foundation


2023-05-29  Comparing Measured Driver Behavior Distributions to Results from CF Models using SUMO and ...
2023-05-29  Development, calibration, and validation of a large-scale traffic simulation model: Belgium network
2023-05-29  Keynote: e-bike-city: An answer to our transport dead-end?
2023-05-29  The state of Bicycle Modeling in SUMO
2023-05-29  A Framework for Simulating Cyclists in SUMO
2023-05-29  The Effects of Route Randomization on Urban Emissions
2023-05-29  Evaluating the benefits of promoting intermodality and active modes in urban transport …
2023-05-29  Analysis and Modelling of Road Traffic Using SUMO to Optimize the Arrival Time of Emergency Vehicles
2023-05-29  SUMO Simulations for Federated Learning in Communicating Autonomous Vehicles
2023-05-29  Sensor-based Flow Optimization on connected real-world intersections via a SUMO Feature Gap Analysis
2023-05-29  Challenges in Reward Design for Reinforcement Learning-based Traffic Signal Control
2023-05-19  Virtual IoT & Edge Days - Day 2
2023-05-19  Virtual IoT & Edge Days - Day 1
2023-05-02  Eclipse IDE Working Group Community Call Recording - April 26 2023
2023-04-24  vECM - Someone reports a security issue in my project! Now what?
2023-04-13  SDV Community Day - Lisbon 2023
2023-03-30  Webinar: Come SLSA with us! With Chainguard, OpenSSF, Eclipse Foundation, and Rust Foundation
2023-03-23  Power Skills Bootcamp
2023-03-15  EMBEDDED WORLD 2023 | DAY 2 RECAP
2023-03-14  EMBEDDED WORLD 2023 | RECAP DAY #1
2023-03-06  The strategic significance of open source in Europe