Causal inference multi-agent reinforcement learning for traffic signal control

Publication: Contribution to journal › Journal article › Research › Peer-reviewed

Standard

Causal inference multi-agent reinforcement learning for traffic signal control. / Yang, Shantian; Yang, Bo; Zeng, Zheng; Kang, Zhongfeng.

In: Information Fusion, Vol. 94, 2023, pp. 243-256.


Harvard

Yang, S, Yang, B, Zeng, Z & Kang, Z 2023, 'Causal inference multi-agent reinforcement learning for traffic signal control', Information Fusion, vol. 94, pp. 243-256. https://doi.org/10.1016/j.inffus.2023.02.009

APA

Yang, S., Yang, B., Zeng, Z., & Kang, Z. (2023). Causal inference multi-agent reinforcement learning for traffic signal control. Information Fusion, 94, 243-256. https://doi.org/10.1016/j.inffus.2023.02.009

Vancouver

Yang S, Yang B, Zeng Z, Kang Z. Causal inference multi-agent reinforcement learning for traffic signal control. Information Fusion. 2023;94:243-256. https://doi.org/10.1016/j.inffus.2023.02.009

Author

Yang, Shantian ; Yang, Bo ; Zeng, Zheng ; Kang, Zhongfeng. / Causal inference multi-agent reinforcement learning for traffic signal control. In: Information Fusion. 2023 ; Vol. 94. pp. 243-256.

Bibtex

@article{c38f2477d08a44b2905ba0aab0c1d01d,
title = "Causal inference multi-agent reinforcement learning for traffic signal control",
abstract = "A primary challenge in multi-agent reinforcement learning for traffic signal control is to produce effective cooperative traffic-signal policies in non-stationary multi-agent traffic environments. However, each agent suffers from its local non-stationary traffic environment caused by the time-varying traffic-signal policies of adjacent agents; At the same time, different agents also produce time-varying traffic-signal policies, which further results in the non-stationarity of the whole traffic environment, so these produced traffic-signal policies may be ineffective. In this work, we propose a Causal Inference Multi-Agent reinforcement learning (CI-MA) algorithm, which can alleviate the non-stationarity of multi-agent traffic environments from both feature representation and optimization, eventually helps to produce effective cooperative traffic-signal policies. Specifically, a Causal-Inference (CI) model is first designed to reason about and tackle the non-stationarity of multi-agent traffic environments by both acquiring feature representation distributions and deriving variational lower bounds (i.e., objective functions); And then, based on the designed CI model, we propose a CI-MA algorithm, in which the feature representations are acquired from the non-stationarity of multi-agent traffic environments at both task level and timestep level, the acquired feature representations are used to produce cooperative traffic-signal policies and Q-values for multiple agents; Finally the corresponding objective functions optimize the whole algorithm from both causal inference and multi-agent reinforcement learning. Experiments are conducted in different non-stationary multi-agent traffic environments. 
Results show that CI-MA algorithm outperforms other state-of-the-art algorithms, and demonstrate that the proposed algorithm trained in synthetic-traffic environments can be effectively transferred to both synthetic- and real-traffic environments with non-stationarity.",
keywords = "Causal inference, Deep reinforcement learning, Graph model, Multi-agent learning, Traffic signal control",
author = "Shantian Yang and Bo Yang and Zheng Zeng and Zhongfeng Kang",
note = "Publisher Copyright: {\textcopyright} 2023 Elsevier B.V.",
year = "2023",
doi = "10.1016/j.inffus.2023.02.009",
language = "English",
volume = "94",
pages = "243--256",
journal = "Information Fusion",
issn = "1566-2535",
publisher = "Elsevier",

}

RIS

TY - JOUR

T1 - Causal inference multi-agent reinforcement learning for traffic signal control

AU - Yang, Shantian

AU - Yang, Bo

AU - Zeng, Zheng

AU - Kang, Zhongfeng

N1 - Publisher Copyright: © 2023 Elsevier B.V.

PY - 2023

Y1 - 2023

N2 - A primary challenge in multi-agent reinforcement learning for traffic signal control is to produce effective cooperative traffic-signal policies in non-stationary multi-agent traffic environments. However, each agent suffers from its local non-stationary traffic environment caused by the time-varying traffic-signal policies of adjacent agents; At the same time, different agents also produce time-varying traffic-signal policies, which further results in the non-stationarity of the whole traffic environment, so these produced traffic-signal policies may be ineffective. In this work, we propose a Causal Inference Multi-Agent reinforcement learning (CI-MA) algorithm, which can alleviate the non-stationarity of multi-agent traffic environments from both feature representation and optimization, eventually helps to produce effective cooperative traffic-signal policies. Specifically, a Causal-Inference (CI) model is first designed to reason about and tackle the non-stationarity of multi-agent traffic environments by both acquiring feature representation distributions and deriving variational lower bounds (i.e., objective functions); And then, based on the designed CI model, we propose a CI-MA algorithm, in which the feature representations are acquired from the non-stationarity of multi-agent traffic environments at both task level and timestep level, the acquired feature representations are used to produce cooperative traffic-signal policies and Q-values for multiple agents; Finally the corresponding objective functions optimize the whole algorithm from both causal inference and multi-agent reinforcement learning. Experiments are conducted in different non-stationary multi-agent traffic environments. Results show that CI-MA algorithm outperforms other state-of-the-art algorithms, and demonstrate that the proposed algorithm trained in synthetic-traffic environments can be effectively transferred to both synthetic- and real-traffic environments with non-stationarity.

AB - A primary challenge in multi-agent reinforcement learning for traffic signal control is to produce effective cooperative traffic-signal policies in non-stationary multi-agent traffic environments. However, each agent suffers from its local non-stationary traffic environment caused by the time-varying traffic-signal policies of adjacent agents; At the same time, different agents also produce time-varying traffic-signal policies, which further results in the non-stationarity of the whole traffic environment, so these produced traffic-signal policies may be ineffective. In this work, we propose a Causal Inference Multi-Agent reinforcement learning (CI-MA) algorithm, which can alleviate the non-stationarity of multi-agent traffic environments from both feature representation and optimization, eventually helps to produce effective cooperative traffic-signal policies. Specifically, a Causal-Inference (CI) model is first designed to reason about and tackle the non-stationarity of multi-agent traffic environments by both acquiring feature representation distributions and deriving variational lower bounds (i.e., objective functions); And then, based on the designed CI model, we propose a CI-MA algorithm, in which the feature representations are acquired from the non-stationarity of multi-agent traffic environments at both task level and timestep level, the acquired feature representations are used to produce cooperative traffic-signal policies and Q-values for multiple agents; Finally the corresponding objective functions optimize the whole algorithm from both causal inference and multi-agent reinforcement learning. Experiments are conducted in different non-stationary multi-agent traffic environments. Results show that CI-MA algorithm outperforms other state-of-the-art algorithms, and demonstrate that the proposed algorithm trained in synthetic-traffic environments can be effectively transferred to both synthetic- and real-traffic environments with non-stationarity.

KW - Causal inference

KW - Deep reinforcement learning

KW - Graph model

KW - Multi-agent learning

KW - Traffic signal control

UR - http://www.scopus.com/inward/record.url?scp=85147849073&partnerID=8YFLogxK

U2 - 10.1016/j.inffus.2023.02.009

DO - 10.1016/j.inffus.2023.02.009

M3 - Journal article

AN - SCOPUS:85147849073

VL - 94

SP - 243

EP - 256

JO - Information Fusion

JF - Information Fusion

SN - 1566-2535

ER -
